XLSTAT - K Nearest Neighbors (KNN)

What is K Nearest Neighbors (KNN) machine learning?

The K Nearest Neighbors (KNN) method classifies query points of unknown class according to their distances to the points of a learning set, whose classes are known a priori. It is one of the most popular supervised machine learning tools.

KNN can be regarded as an extension of the nearest neighbor (NN) method: NN is the special case of KNN with k = 1.

The KNN classification approach assumes that each example in the learning set is a random vector in R^n. Each point is described as x = <a1(x), a2(x), a3(x), ..., an(x)>, where ar(x) denotes the value of the r-th attribute. ar(x) can be either a quantitative or a qualitative variable.

To determine the class of the query point xq, each of the k points x1, …, xk nearest to xq casts a vote for its own class. The class assigned to xq is the one that receives the majority of the votes.
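
The voting rule can be illustrated with a minimal Python sketch. It is not XLSTAT's implementation: the function name `knn_classify`, the toy data, and the use of Euclidean distance are assumptions made for illustration only.

```python
import math
from collections import Counter

def knn_classify(query, points, labels, k=3):
    """Return the majority class among the k training points nearest to query."""
    # Euclidean distance from the query to every point of the learning set
    distances = [math.dist(query, p) for p in points]
    # Indices of the k smallest distances, nearest first
    nearest = sorted(range(len(points)), key=lambda i: distances[i])[:k]
    # Each neighbor votes for its own class; on a tie, the class that was
    # encountered first (i.e. closest to the query) is returned
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical learning set with two classes
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9)]
train_labels = ["A", "A", "A", "B", "B"]

predicted = knn_classify((1.1, 1.0), train_points, train_labels, k=3)
```

With k = 1 the same function reduces to the plain nearest neighbor method.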

K Nearest Neighbors in XLSTAT: options

Distances: Several distance metrics can be used in XLSTAT to compute similarities in the K Nearest Neighbors algorithm. Options vary according to the type of variables characterizing the observations (qualitative or quantitative).

  • Distances available for quantitative data (metrics): Euclidean, Minkowski, Manhattan, Chebyshev, Canberra
  • Distances available for quantitative data (kernels): linear, sigmoid, logarithmic, power, Gaussian, Laplacian
  • Distances available for qualitative data: Overlap Metric (OM), Value Difference Metric (VDM)
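
For reference, the textbook definitions of the quantitative metrics and of the overlap metric can be sketched in Python as follows. This is an illustrative sketch only: XLSTAT's exact formulas (for example any normalization) may differ, the function names are hypothetical, and the kernel-based distances and the Value Difference Metric are not covered here.

```python
def minkowski(x, y, p=2.0):
    """Minkowski distance of order p (p=1: Manhattan, p=2: Euclidean)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def euclidean(x, y):
    return minkowski(x, y, p=2.0)

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    """Largest absolute difference over all attributes."""
    return max(abs(a - b) for a, b in zip(x, y))

def canberra(x, y):
    """Sum of |a - b| / (|a| + |b|); terms with a zero denominator are skipped."""
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom:
            total += abs(a - b) / denom
    return total

def overlap(x, y):
    """Overlap metric for qualitative data: number of attributes that differ."""
    return sum(a != b for a, b in zip(x, y))
```

For example, `euclidean((0, 0), (3, 4))` is 5, `manhattan` gives 7 and `chebyshev` gives 4 for the same pair, and `overlap(("red", "small"), ("red", "large"))` is 1.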

Validation: XLSTAT offers K-fold cross-validation to quantify the quality of the classifier. The data are partitioned into k subsamples of equal size. One of the k subsamples is retained as validation data to test the model, and the remaining k − 1 subsamples are used as training data. The process is repeated k times so that each subsample serves exactly once as validation data.
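
The partitioning step of K-fold cross-validation can be sketched as below. This is a minimal illustration under stated assumptions, not XLSTAT's procedure: the helper `k_fold_indices` is hypothetical, the observations are split in their original order (in practice they are usually shuffled first), and when the sample size is not a multiple of the number of folds the first folds receive one extra observation.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, validation_indices) for each of the n_folds rounds."""
    indices = list(range(n_samples))
    # Fold sizes differ by at most one when n_samples is not divisible by n_folds
    base, extra = divmod(n_samples, n_folds)
    fold_sizes = [base + (1 if i < extra else 0) for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        # Everything outside the current fold is used for training
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

# Hypothetical example: 10 observations, 5 folds
folds = list(k_fold_indices(10, 5))
```

Each round trains the classifier on `train`, evaluates it on `validation`, and the k error rates are then averaged.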

Other options available in the XLSTAT K Nearest Neighbors feature include observation tracking and vote weighting.
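
The page does not say which weighting scheme XLSTAT applies. One common choice, shown here purely as an assumption, is to weight each neighbor's vote by the inverse of its distance to the query point, so that closer neighbors count more. The function name and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def weighted_knn_classify(query, points, labels, k=3, eps=1e-12):
    """Classify query with inverse-distance weighted votes (assumed scheme)."""
    # Keep the k (distance, label) pairs closest to the query
    neighbors = sorted(
        (math.dist(query, p), label) for p, label in zip(points, labels)
    )[:k]
    scores = defaultdict(float)
    for distance, label in neighbors:
        # eps avoids a division by zero when the query coincides with a point
        scores[label] += 1.0 / (distance + eps)
    return max(scores, key=scores.get)

# One very close "A" point against two more distant "B" points
points = [(0.0, 0.0), (2.0, 0.0), (2.1, 0.0)]
labels = ["A", "B", "B"]
weighted = weighted_knn_classify((0.5, 0.0), points, labels, k=3)
```

In this toy example an unweighted majority vote over the three neighbors would pick "B", whereas the weighted vote favors the much closer "A" point.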

K Nearest Neighbors in XLSTAT: results

The K Nearest Neighbors feature in XLSTAT can display results by class or by object (observation).

XLSTAT

This analysis is available in the XLStat-Base add-in for Microsoft Excel.

About KCS

Kovach Computing Services (KCS) was founded in 1993 by Dr. Warren Kovach. The company specializes in the development and marketing of inexpensive and easy-to-use statistical software for scientists, as well as in data analysis consulting.

Get in Touch

  • Email:
    sales@kovcomp.com
  • Address:
    85 Nant y Felin
    Pentraeth, Isle of Anglesey
    LL75 8UY
    United Kingdom
  • Phone:
    (UK): 01248-450414
    (Intl.): +44-1248-450414