1-NN Rule: Given an unknown sample X decide if
for
That is, assign X to category if the closest neighbor of X is from category i.
i
),(),( jlik XXdXXd ikjl
i
X
k-NN rule: instead of looking at the closest sample, we look at k nearest neighbors to X and we take a vote. The largest vote wins. k is usually taken as an odd number so that no ties occur.
Distance Measures(Metrices)Non-negativity
Reflexivity when only x = y
Symmetry
Triangle inequality
Euclidian distance satisfies.Not always a meaningful measure.Consider 2 features with different units scaling
problem.
0),( yxD
0),( yxD
),(),( xyDyxD
),(),(),( yxDyzDzxD
)(2
hight
X
)(1
weight
X
Scale of changed to half.1X
Minkowski MetricA general definition
norm – City block (Manhattan) distance norm – Euclidian
Scaling: Normalize the data by re-scaling (Reduce the range to 0-1) (equivalent to changing the metric)
kd
i
k
iik yxyxL/1
1
),(
1L2L
Y
b
a
X
baL 1
References
R.O. Duda, P.E. Hart, and D.G. Stork, Pattern Classification, New York: John Wiley, 2001. N. Yalabık, “Pattern Classification with Biomedical Applications Course Material”, 2010.