
K-nearest Neighbor Algorithm




In pattern recognition, the ''k''-nearest neighbour algorithm (k-NN) is a method for classifying phenomena based upon observable features, similar to the nearest neighbour classification method.

The difference is that, rather than assigning the class of the single nearest neighbour (normally found using a distance measure such as the Euclidean distance), the algorithm selects the set of the ''k'' nearest neighbours and assigns to the new data point the most numerous class within that set. The best choice of ''k'' depends upon the data; generally, larger values of ''k'' reduce the effect of noise on the classification, but make boundaries between classes less distinct.
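
The voting rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (`euclidean`, `knn_classify`), the toy data, and the tie-breaking rule (ties go to the label of the nearer neighbour) are all assumptions made for the example. The last training point is deliberately mislabelled to show how a larger ''k'' can suppress noise.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(train_points, train_labels, query, k=3):
    """Return the most numerous label among the k training points nearest to query."""
    # Rank training indices by distance to the query point.
    order = sorted(
        range(len(train_points)),
        key=lambda i: euclidean(train_points[i], query),
    )
    nearest_labels = [train_labels[i] for i in order[:k]]
    # Counter.most_common keeps first-seen order for equal counts,
    # so a tie goes to the label of the nearer neighbour.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: a cluster of "a" near (1, 1), a cluster of "b"
# near (5, 5), and one noisy "b" sitting inside the "a" cluster.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (1.4, 1.3)]
labels = ["a", "a", "a", "b", "b", "b"]

query = (1.35, 1.25)
print(knn_classify(points, labels, query, k=1))  # "b": only the noisy point votes
print(knn_classify(points, labels, query, k=3))  # "a": the noisy point is outvoted
```

With ''k'' = 1 the sketch reduces to plain nearest neighbour classification, which is why the single mislabelled point decides the outcome in the first call.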

The accuracy of the k-NN algorithm can be severely degraded by the presence of noisy or irrelevant features, or if the feature scales are not consistent with their relevance. Much research effort has gone into selecting or scaling features to improve classification. A particularly popular approach is the use of genetic algorithms to optimize feature scaling. Another popular approach is to scale features by the mutual information of the training data with the training classes.
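
As a rough illustration of the mutual information approach, the sketch below estimates the mutual information between each feature and the class labels, then multiplies each feature by that value before running k-NN. Several choices here are assumptions rather than anything the text prescribes: the equal-width two-bin estimator, multiplying features directly by the estimate, the helper names (`mutual_information`, `scale`, `knn_classify`), and the toy data, in which the second feature has a large scale but shows no class information under this coarse estimate.

```python
import math
from collections import Counter


def mutual_information(values, labels, bins=2):
    """Estimate I(feature; class) in bits after equal-width binning (an assumed estimator)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature
    binned = [min(int((v - lo) / width), bins - 1) for v in values]
    n = len(values)
    joint = Counter(zip(binned, labels))
    bin_counts = Counter(binned)
    class_counts = Counter(labels)
    return sum(
        (c / n) * math.log2((c / n) / ((bin_counts[b] / n) * (class_counts[y] / n)))
        for (b, y), c in joint.items()
    )


def scale(point, weights):
    """Multiply each feature by its weight."""
    return tuple(w * x for w, x in zip(weights, point))


def knn_classify(points, labels, query, k):
    """Majority vote among the k nearest points (Euclidean distance)."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    return Counter(labels[i] for i in order[:k]).most_common(1)[0][0]


# Hypothetical data: feature 0 separates the classes; feature 1 is large-scale noise.
train = [(0.1, 100.0), (0.2, 300.0), (0.3, 600.0), (0.4, 900.0),
         (0.9, 200.0), (1.0, 400.0), (1.1, 700.0), (1.2, 800.0)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
query = (0.25, 750.0)  # feature 0 places it among the "a" points

# One weight per feature: its estimated mutual information with the classes.
weights = [mutual_information([p[i] for p in train], labels) for i in range(2)]

plain = knn_classify(train, labels, query, k=3)
weighted = knn_classify([scale(p, weights) for p in train], labels,
                        scale(query, weights), k=3)
print(weights)   # [1.0, 0.0]
print(plain)     # "b": the noisy large-scale feature dominates the distance
print(weighted)  # "a": the uninformative feature is scaled away
```

On real data the estimate depends heavily on the binning, and a weight of exactly zero discards a feature entirely, so this should be read as a demonstration of the idea rather than a recommended procedure.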

