ENN: Extended Nearest Neighbor Method for Pattern Recognition

These lecture notes are based on the following paper: B. Tang and H. He, "ENN: Extended Nearest Neighbor Method for Pattern Recognition," IEEE Computational Intelligence Magazine, vol. 10, no. 3, pp. 52-60, Aug. 2015.

Prof. Haibo He
Electrical Engineering
University of Rhode Island, Kingston, RI 02881
Computational Intelligence and Self-Adaptive Systems (CISA) Laboratory
http://www.ele.uri.edu/faculty/he/
Email: [email protected]




Extended Nearest Neighbor for Pattern Recognition

1. Limitations of K-Nearest Neighbors (KNN)

2. “Two-way communication”: Extended Nearest Neighbors (ENN)

3. Experimental Analysis

4. Conclusion


Pattern Recognition

Parametric Classifier: class-wise density estimation, including naive Bayes, Gaussian mixture, etc.

Non-Parametric Classifier:

• Nearest Neighbors
• Neural Network
• Support Vector Machine

Strengths of Nearest Neighbors:

• Nonparametric nature
• Easy implementation
• Powerfulness
• Robustness
• Consistency


Limitations of traditional KNN

Scale-Sensitive Problem: The class 1 samples dominate their neighborhood with higher density (i.e., a more concentrated distribution), while the class 2 samples lie in regions of lower density (i.e., a more spread-out distribution).

As a result, class 2 samples that lie close to the class 1 region are easily misclassified.
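The scale-sensitive problem can be reproduced with a very small example. The sketch below is only an illustration: the 1-D data set, the helper name `knn_predict`, and the choice k = 3 are assumptions made here, not material from the slides.

```python
import math
from collections import Counter

def knn_predict(X, y, z, k):
    """Majority vote among the k training samples closest to z."""
    order = sorted(range(len(X)), key=lambda i: math.dist(z, X[i]))
    votes = Counter(y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: class 1 is dense, class 2 is spread out.
X = [(v,) for v in (0, 1, 2, 3, 4, 5)] + [(v,) for v in (10, 20, 30, 40, 50, 60)]
y = [1] * 6 + [2] * 6

# z lies outside the class 1 range [0, 5]; its single nearest sample is 10 (class 2).
z = (9,)
prediction = knn_predict(X, y, z, k=3)
print(prediction)  # 1: the 3 nearest samples are 10 (class 2), 5 and 4 (class 1)
```

With k = 3 the dense class outvotes the sparse one, even though the closest training sample belongs to class 2.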


ENN: A New Approach

Define a generalized class-wise statistic T_i for each class i:

S_i denotes the set of samples in class i, and NN_r(x, S) denotes the r-th nearest neighbor of x in S.
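The equation for the statistic was an image on the slide and is missing from this transcript. A form consistent with the notation here and with the stated range of the statistic is given below as a reconstruction, not a quotation. The symbols n_i (the number of samples in class i), k (the number of neighbors considered), and the indicator I_r are assumptions introduced for this sketch, with S the union of all classes.

```latex
T_i = \frac{1}{n_i\,k} \sum_{x \in S_i} \sum_{r=1}^{k} I_r(x, S),
\qquad
I_r(x, S) =
\begin{cases}
1, & \text{if } x \in S_i \text{ and } \mathrm{NN}_r(x, S) \in S_i, \\
0, & \text{otherwise.}
\end{cases}
```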

T_i measures the coherence of data from the same class: 0 ≤ T_i ≤ 1, with T_i = 1 when all the nearest neighbors of class i data are also from class i, and T_i = 0 when all the nearest neighbors are from other classes.
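The two boundary cases can be checked numerically. The following is a minimal sketch that assumes T_i is the fraction of the k nearest-neighbor slots of class i samples that are filled by class i samples; the function name `class_statistics` and both toy data sets are hypothetical.

```python
import math
from collections import Counter

def class_statistics(X, y, k):
    """Return {class: T_i}, the share of same-class samples among the
    k nearest neighbors of every sample in that class."""
    hits = Counter()
    sizes = Counter(y)
    for a in range(len(X)):
        # Neighbors of sample a, nearest first, excluding a itself.
        others = sorted((b for b in range(len(X)) if b != a),
                        key=lambda b: math.dist(X[a], X[b]))
        hits[y[a]] += sum(1 for b in others[:k] if y[b] == y[a])
    return {c: hits[c] / (sizes[c] * k) for c in sizes}

# Two well-separated clusters: every neighbor shares the class, so T_i = 1.
separated = class_statistics([(0,), (1,), (2,), (10,), (11,), (12,)],
                             [0, 0, 0, 1, 1, 1], k=2)

# Strictly alternating classes: every nearest neighbor is foreign, so T_i = 0.
interleaved = class_statistics([(0,), (1,), (2,), (3,)], [0, 1, 0, 1], k=1)

print(separated, interleaved)
```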


Intra-class coherence:

Given an unknown sample Z to be classified, we tentatively assign it to class 1 and to class 2 in turn, obtaining two new sets of generalized class-wise statistics T_i^j, where j = 1, 2 is the class assumed for Z. Then, the sample Z is classified according to:

ENN Classification Rule: Maximum Gain of Intra-class Coherence.

For N-class classification:
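The formulas on this slide were images and are missing from the transcript. Reading "maximum gain of intra-class coherence" as choosing the tentative class that maximizes the sum of the class-wise statistics, the rule can be written as below. This is a reconstruction; the labels \Theta^j and f_ENN are assumptions introduced here, and the two-class case is N = 2.

```latex
\Theta^{j} = \sum_{i=1}^{N} T_i^{j},
\qquad
f_{\mathrm{ENN}}(Z) = \arg\max_{j \in \{1, \dots, N\}} \Theta^{j}
```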


To avoid recalculating the generalized class-wise statistics in the testing stage, an equivalent version of ENN is proposed:
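The equation for the equivalent version is missing from this transcript. The form below is offered as a sketch: it follows from writing T_i as the fraction of same-class samples among the k nearest neighbors of class i. The symbols are assumptions introduced here: n_i is the size of class i, k_j is the number of the k nearest neighbors of Z that belong to class j, \Delta n_j^j counts class j training samples that take Z as a new nearest neighbor in place of a sample from another class, and \Delta n_i^j (for i ≠ j) counts class i training samples that take Z in place of a class i sample.

```latex
f_{\mathrm{ENN}}(Z) = \arg\max_{j \in \{1, \dots, N\}}
\left\{
\frac{\Delta n_j^{j} + k_j - k\,T_j}{(n_j + 1)\,k}
- \sum_{i \neq j} \frac{\Delta n_i^{j}}{n_i\,k}
\right\}
```

The first term is the change in T_j when Z joins class j, and each term in the sum is the drop in T_i for the other classes. Because the statistics computed before Z is added do not depend on j, maximizing this total change selects the same class as maximizing the sum of the recomputed statistics.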


The equivalent version gives the same result as the original one, but avoids the recalculation of T_i^j.
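One way to avoid the recalculation is to store, for each training sample, the distance to its k-th nearest neighbor and that neighbor's class, and then to count only the samples whose neighbor list Z would enter. The sketch below follows that idea under stated assumptions: the names `fit` and `predict`, the dictionary layout, and the toy data are hypothetical, and ties at the k-th neighbor distance are assumed not to occur.

```python
import math

def fit(X, y, k):
    """Store T_i plus, per training sample, the distance to and class of
    its k-th nearest neighbor (the one Z could push out)."""
    size = {c: y.count(c) for c in set(y)}
    same = {c: 0 for c in size}
    kth_dist, kth_class = [], []
    for a in range(len(X)):
        order = sorted((b for b in range(len(X)) if b != a),
                       key=lambda b: math.dist(X[a], X[b]))[:k]
        same[y[a]] += sum(1 for b in order if y[b] == y[a])
        kth_dist.append(math.dist(X[a], X[order[-1]]))
        kth_class.append(y[order[-1]])
    T = {c: same[c] / (size[c] * k) for c in size}
    return {"X": X, "y": y, "k": k, "T": T, "size": size,
            "kth_dist": kth_dist, "kth_class": kth_class}

def predict(model, z):
    X, y, k = model["X"], model["y"], model["k"]
    T, size = model["T"], model["size"]
    d = [math.dist(z, x) for x in X]
    nearest = sorted(range(len(X)), key=d.__getitem__)[:k]
    k_in = {c: sum(1 for b in nearest if y[b] == c) for c in size}
    # Samples that would adopt z as a neighbor, split by whether the
    # neighbor pushed out shared the sample's own class.
    pushed_same = {c: 0 for c in size}
    pushed_other = {c: 0 for c in size}
    for a in range(len(X)):
        if d[a] < model["kth_dist"][a]:
            if model["kth_class"][a] == y[a]:
                pushed_same[y[a]] += 1
            else:
                pushed_other[y[a]] += 1
    gain = {}
    for j in size:
        # Change in T_j if z joins class j, minus the losses of other classes.
        g = (pushed_other[j] + k_in[j] - k * T[j]) / ((size[j] + 1) * k)
        g -= sum(pushed_same[i] / (size[i] * k) for i in size if i != j)
        gain[j] = g
    return max(gain, key=gain.get), gain

# Hypothetical toy data: dense class 1, spread-out class 2.
X = [(v,) for v in (0, 1, 2, 3, 4, 5)] + [(v,) for v in (10, 20, 30, 40, 50, 60)]
y = [1] * 6 + [2] * 6
model = fit(X, y, k=3)
label, gain = predict(model, (9,))
print(label)  # 2
```

At test time only distances from Z to the training samples are needed; no class-wise statistic is recomputed.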


How this simple rule works better than KNN

The ENN method makes a prediction in a “two-way communication” style: it considers not only which samples are the nearest neighbors of the test sample, but also which samples count the test sample among their own nearest neighbors.
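The effect of the second direction can be seen in a brute-force sketch that tentatively adds the test sample to each class, recomputes the class-wise statistics, and keeps the class with the largest sum. The helper names, the toy data, and k = 3 are assumptions made for illustration; this is the straightforward form of the rule, not an optimized implementation.

```python
import math
from collections import Counter

def class_statistics(X, y, k):
    """T_i per class: share of same-class samples among the k nearest
    neighbors of the samples in that class."""
    hits, sizes = Counter(), Counter(y)
    for a in range(len(X)):
        others = sorted((b for b in range(len(X)) if b != a),
                        key=lambda b: math.dist(X[a], X[b]))
        hits[y[a]] += sum(1 for b in others[:k] if y[b] == y[a])
    return {c: hits[c] / (sizes[c] * k) for c in sizes}

def knn_predict(X, y, z, k):
    """One-way rule: vote among the k nearest neighbors of z."""
    order = sorted(range(len(X)), key=lambda b: math.dist(z, X[b]))
    return Counter(y[b] for b in order[:k]).most_common(1)[0][0]

def enn_predict(X, y, z, k):
    """Two-way rule: pick the class whose tentative assignment of z
    gives the largest sum of class-wise statistics."""
    scores = {}
    for c in sorted(set(y)):
        T = class_statistics(X + [z], y + [c], k)
        scores[c] = sum(T.values())
    return max(scores, key=scores.get)

# Hypothetical toy data: dense class 1, spread-out class 2.
X = [(v,) for v in (0, 1, 2, 3, 4, 5)] + [(v,) for v in (10, 20, 30, 40, 50, 60)]
y = [1] * 6 + [2] * 6
z = (9,)

knn_label = knn_predict(X, y, z, k=3)
enn_label = enn_predict(X, y, z, k=3)
print(knn_label, enn_label)  # 1 2
```

Here z is a near neighbor of the sparse class 2 samples at 10 and 20 but of no class 1 sample, so assigning it to class 2 raises the overall coherence, while KNN is outvoted by the dense class.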


Experimental Results and Analysis


Sampling methods

Synthetic Data Set: 3-dimensional Gaussian data with 3 classes.

Four models are considered; the error rates (%) of KNN and ENN for Model 2 and Model 3 are:

Model 2 (σ_1^2 = 5, σ_2^2 = 20, σ_3^2 = 5)

          Class 1         Class 2         Class 3
          KNN     ENN     KNN     ENN     KNN     ENN
k = 3     32.0    31.9    39.3    34.4    31.4    30.5
k = 5     31.2    29.7    40.5    33.7    28.6    26.7
k = 7     28.5    28.3    40.8    33.6    25.0    24.3

Model 3 (σ_1^2 = 5, σ_2^2 = 5, σ_3^2 = 20)

          Class 1         Class 2         Class 3
          KNN     ENN     KNN     ENN     KNN     ENN
k = 3     33.2    31.0    27.0    26.8    38.8    33.7
k = 5     30.3    27.3    24.0    23.2    40.2    33.5
k = 7     26.7    25.1    20.8    20.8    40.6    33.0
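To see where ENN helps most, the error rates reported above can be averaged over k for each class. The snippet below is plain arithmetic on those reported numbers; the dictionary layout and variable names are assumptions made here.

```python
# Error rates (%) from the slide: errors[model][class] = [(KNN, ENN) for k = 3, 5, 7]
errors = {
    "Model 2": {
        "Class 1": [(32.0, 31.9), (31.2, 29.7), (28.5, 28.3)],
        "Class 2": [(39.3, 34.4), (40.5, 33.7), (40.8, 33.6)],
        "Class 3": [(31.4, 30.5), (28.6, 26.7), (25.0, 24.3)],
    },
    "Model 3": {
        "Class 1": [(33.2, 31.0), (30.3, 27.3), (26.7, 25.1)],
        "Class 2": [(27.0, 26.8), (24.0, 23.2), (20.8, 20.8)],
        "Class 3": [(38.8, 33.7), (40.2, 33.5), (40.6, 33.0)],
    },
}

# Mean reduction in error rate (KNN minus ENN), in percentage points.
mean_gain = {
    model: {cls: sum(knn - enn for knn, enn in rows) / len(rows)
            for cls, rows in classes.items()}
    for model, classes in errors.items()
}

# Class with the largest mean reduction in each model.
best = {model: max(gains, key=gains.get) for model, gains in mean_gain.items()}

for model, gains in mean_gain.items():
    print(model, {cls: round(g, 2) for cls, g in gains.items()}, "largest:", best[model])
```

In both models ENN is never worse than KNN in the reported rows, and the largest reduction falls on a single class (Class 2 in Model 2, Class 3 in Model 3).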


Real-life Data Sets:

• MNIST Handwritten Digit Recognition

[Figure: Data Examples]


Real-life Data Sets:

• 20 data sets from UCI Machine Learning Repository

A t-test shows that ENN significantly improves classification performance over KNN on 17 of the 20 data sets.


Summary: Three versions of ENN

ENN


ENN.V1


ENN.V2


Online Resources

Supplementary materials and a Matlab source code implementation are available at:

http://www.ele.uri.edu/faculty/he/research/ENN/ENN.html


Conclusion

1. A new ENN classification methodology based on the maximum gain of intra-class coherence.

2. “Two-way communication”: ENN considers not only which samples are the nearest neighbors of the test sample, but also which samples count the test sample among their own nearest neighbors.

3. The idea is important and useful for many other machine learning and data mining problems, such as density estimation, clustering, and regression, among others.
