Feature Selection for Image Retrieval
By Karina Zapién Arreola
January 21st, 2005
Introduction
Variable and feature selection have become the focus of much research in application areas where datasets with many variables are available:
Text processing
Gene expression
Combinatorial chemistry
Motivation
The objective of feature selection is threefold:
Improving the prediction performance of the predictors
Providing faster and more cost-effective predictors
Providing a better understanding of the underlying process that generated the data
Why use feature selection in CBIR (content-based image retrieval)?
Different users may need different features for image retrieval
From each selected sample, a specific feature set can be chosen
Boosting
A method for improving the accuracy of any learning algorithm
Uses “weak learners” that implement single rules
Weights the weak learners
Combines the weak rules into a strong learning algorithm
Adaboost Algorithm
AdaBoost is an iterative boosting algorithm.

Notation
Samples (x1,y1),…,(xn,yn), where yi ∈ {−1, 1}; there are m positive samples and l negative samples (n = m + l)
Weak classifiers h
For iteration t, the weighted error is defined as:
εt = minh ½ Σi ωi |h(xi) − yi|
where ωi is the weight of sample xi.
Adaboost Algorithm
Given samples (x1,y1),…,(xn,yn), where yi ∈ {−1, 1}
Initialize ω1,i = 1/(2m) if yi = 1, and ω1,i = 1/(2l) if yi = −1
For t = 1,…,T:
  Normalize ωt,i ← ωt,i / Σj ωt,j
  Train a base learner for each feature using the distribution ωt
  Choose the ht that minimizes εt, with per-sample errors ei (ei = 0 if xi is classified correctly, 1 otherwise)
  Set βt = εt / (1 − εt) and αt = log(1/βt)
  Update ωt+1,i = ωt,i · βt^(1−ei)
Output the final classifier H(x) = sign( Σt αt ht(x) )
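The loop above can be sketched in Python. This is a minimal illustration, not the speaker's implementation: it assumes NumPy, uses the nearest-mean weak learner described on a later slide, and clamps εt away from 0 so that αt stays finite; all function names are our own.

```python
import numpy as np

def adaboost_train(X, y, T=30):
    """Discrete AdaBoost with one nearest-mean weak classifier per feature.

    X: (n_samples, n_features) array; y: labels in {-1, +1}.
    Returns a list of (feature index, alpha, pos_mean, neg_mean) tuples.
    """
    n, d = X.shape
    m = np.sum(y == 1)            # number of positive samples
    l = np.sum(y == -1)           # number of negative samples
    # Initialize weights: 1/(2m) for positives, 1/(2l) for negatives
    w = np.where(y == 1, 1.0 / (2 * m), 1.0 / (2 * l))
    ensemble = []
    for t in range(T):
        w = w / w.sum()           # normalize weights into a distribution
        best = None
        for j in range(d):        # train one weak learner per feature
            pos_mean = X[y == 1, j].mean()
            neg_mean = X[y == -1, j].mean()
            # classify +1 if the value is closer to the positive mean
            h = np.where(np.abs(X[:, j] - pos_mean)
                         <= np.abs(X[:, j] - neg_mean), 1, -1)
            e = (h != y).astype(float)   # per-sample errors e_i
            eps = float(np.dot(w, e))    # weighted error ε_t
            if best is None or eps < best[0]:
                best = (eps, j, pos_mean, neg_mean, e)
        eps, j, pos_mean, neg_mean, e = best
        eps = max(eps, 1e-10)            # clamp: keep alpha finite
        beta = eps / (1 - eps)
        alpha = np.log(1 / beta)
        w = w * beta ** (1 - e)          # down-weight correct samples
        ensemble.append((j, alpha, pos_mean, neg_mean))
    return ensemble

def adaboost_predict(ensemble, X):
    """H(x) = sign(sum_t alpha_t * h_t(x))."""
    score = np.zeros(X.shape[0])
    for j, alpha, pos_mean, neg_mean in ensemble:
        h = np.where(np.abs(X[:, j] - pos_mean)
                     <= np.abs(X[:, j] - neg_mean), 1, -1)
        score += alpha * h
    return np.sign(score)
```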
Adaboost Application
Searching for similar groups:
A particular image class is chosen
A positive sample from this group is drawn randomly
A negative sample from the remaining images is drawn randomly
Check list Feature Selection
Domain knowledge
Commensurate features
Interdependence of features
Pruning of input variables
Assess features individually
Dirty data
Predictor – linear predictor
Comparison
Stable solution
Domain knowledge: features used
colordb_sum
RGB_entropy_d1
col_gpd_hsv
col_gpd_lab
col_gpd_rgb
col_hu_hsv2
col_hu_lab2
col_hu_lab
col_hu_rgb2
col_hu_rgb
col_hu_seg2_hsv
col_hu_seg2_lab
col_hu_seg2_rgb
col_hu_seg_hsv
col_hu_seg_lab
col_hu_seg_rgb
col_hu_yiq
col_ngcm_rgb
col_sm_hsv
col_sm_lab
col_sm_rgb
col_sm_yiq
text_gabor
text_tamura
edgeDB
waveletDB
hist_phc_hsv
hist_phc_rgb
hist_Grad_RGB
haar_RGB
haar_HSV
haar_rgb
haar_hmmd
Check list Feature Selection
Domain knowledge
Commensurate features
Normalize features into an appropriate range
AdaBoost treats each feature independently, so normalization is not necessary here
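For predictors that do need commensurate features, a simple min-max rescaling would suffice. This sketch is illustrative and not from the talk; the function name is our own:

```python
import numpy as np

def min_max_scale(X, lo=0.0, hi=1.0):
    """Rescale each feature column of X into [lo, hi].

    Constant columns (max == min) are mapped to lo rather than
    dividing by zero.
    """
    X = np.asarray(X, dtype=float)
    mn = X.min(axis=0)
    rng = X.max(axis=0) - mn
    rng[rng == 0] = 1.0   # constant features: avoid division by zero
    return lo + (hi - lo) * (X - mn) / rng
```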
Feature construction and space dimensionality reduction
Clustering
Correlation coefficient
Supervised feature selection
Filters
Check list Feature Selection
Domain knowledge
Commensurate features
Interdependence of features
Pruning of input variables
Features with the same value for all samples (variance = 0) were eliminated:
of 4912 linear features, 3583 were selected
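The zero-variance pruning described here is straightforward to express. A minimal sketch, assuming NumPy; the function name is our own:

```python
import numpy as np

def prune_constant_features(X):
    """Drop features whose value is identical for all samples (variance == 0).

    Returns the reduced matrix and the indices of the kept columns.
    """
    X = np.asarray(X, dtype=float)
    keep = X.var(axis=0) > 0       # True for columns that vary
    return X[:, keep], np.flatnonzero(keep)
```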
Check list Feature Selection
Domain knowledge
Commensurate features
Interdependence of features
Pruning of input variables
Assess features individually

When there is no assessment method, use the variable ranking method. In AdaBoost this is not necessary
Variable Ranking
Preprocessing step
Independent of the choice of the predictor
Correlation criteria: can only detect linear dependencies
Single variable classifiers
Variable Ranking
Noise reduction and better classification may be obtained by adding variables that are presumably redundant
Perfectly correlated variables are truly redundant in the sense that no additional information is gained by adding them; this does not imply an absence of variable complementarity
Two variables that are useless by themselves can be useful together
Weak classifier
Each weak classifier hi is defined as follows:
hi.pos_mean – mean feature value over the positive samples
hi.neg_mean – mean feature value over the negative samples
A sample is classified as 1 if its feature value is closer to hi.pos_mean, and as −1 if it is closer to hi.neg_mean
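This nearest-mean rule on a single feature can be sketched as follows (an illustrative helper, assuming NumPy; the class name is our own):

```python
import numpy as np

class MeanWeakClassifier:
    """Single-feature weak classifier: label by the nearer class mean."""

    def fit(self, x, y):
        # x: 1-D array of one feature's values; y: labels in {-1, +1}
        self.pos_mean = x[y == 1].mean()
        self.neg_mean = x[y == -1].mean()
        return self

    def predict(self, x):
        # +1 if the value is closer to the positive mean, else -1
        return np.where(np.abs(x - self.pos_mean)
                        <= np.abs(x - self.neg_mean), 1, -1)
```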
A linear classifier was used
[Figure: samples along one feature, with hi.neg_mean and hi.pos_mean marked]
Adaboost experiments and results
Results are compared for 4 positive samples and for 10 positive samples
Few positive samples: use of 4 positive samples
More positive samples: use of 10 positive samples (a false positive appears in the results)
Use of 10 positive samples: training data vs. test data (a false negative appears in the results)
Changing number of Training Iterations
The number of iterations tested ranged from 5 to 50; iterations = 30 was chosen
Changing Sample Size
Positive sample sizes tested: 5, 10, 15, 20, 25, 30, and 35
Few negative samples: use of 15 negative samples
More negative samples: use of 75 negative samples
Check list Feature Selection
Domain knowledge
Commensurate features
Interdependence of features
Pruning of input variables
Assess features individually
Dirty data
Predictor – linear predictor
Comparison (ideas, time, computational resources, examples)
Stable solution
Stable solution
For AdaBoost it is important to have a representative sample
Chosen parameters:
Positive samples: 15
Negative samples: 100
Iteration number: 30
Stable solution with more samples and iterations
Beaches
Dinosaurs
Mountains
Elephants
Buildings
Humans
Roses
Buses
Horses
Food
Stable solution for Dinosaurs
Use of:
• 15 positive samples
• 100 negative samples
• 30 iterations
Stable solution for Roses
Use of:
• 15 positive samples
• 100 negative samples
• 30 iterations
Stable solution for Buses
Use of:
• 15 positive samples
• 100 negative samples
• 30 iterations
Stable solution for Beaches
Use of:
• 15 positive samples
• 100 negative samples
• 30 iterations
Stable solution for Food
Use of:
• 15 positive samples
• 100 negative samples
• 30 iterations
Unstable Solution
Unstable solution for Roses
Use of:
• 5 positive samples
• 10 negative samples
• 30 iterations
Best features for classification
Humans, Beaches, Buildings, Buses, Dinosaurs, Elephants, Roses, Horses, Mountains, Food
And the winner is…
Feature frequency
[Bar chart “Feature's Frequency”: appearance times of each feature, vertical axis from 0 to 0.2. Features shown: Haar_rgb_norm, haar_hmmd, hist_Grad_RGB, haar_RGB, haar_HSV, hist_phc_hsv, hist_phc_rgb, col_gpd_hsv, col_sm_yiq, col_hu_yiq, col_hu_seg_hsv, col_sm_lab, text_tamura, edgeDB, col_sm_rgb, col_gpd_rgb, col_gpd_lab, waveletDB, col_hu_lab, col_hu_seg_rgb, col_ngcm_rgb, col_hu_seg_lab, text_gabor]
Extensions
Searching for similar images:
Pairs of images are built
The difference for each feature is calculated
Each difference is classified as 1 if both images belong to the same class, and as −1 if they belong to different classes
Multiclass AdaBoost
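The pair-construction step described above could be sketched as follows (illustrative only; the function name and the use of absolute differences are our assumptions):

```python
import numpy as np

def pairwise_difference_samples(X, labels):
    """Build one training sample per image pair.

    Each sample is the feature-wise absolute difference |f(a) - f(b)|,
    labeled +1 if both images share a class and -1 otherwise.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    diffs, y = [], []
    for a in range(n):
        for b in range(a + 1, n):          # each unordered pair once
            diffs.append(np.abs(X[a] - X[b]))
            y.append(1 if labels[a] == labels[b] else -1)
    return np.array(diffs), np.array(y)
```

The resulting difference vectors can be fed to the same AdaBoost learner as before, turning image retrieval into a binary same-class/different-class problem.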
Extensions
Use of another weak classifier:
Design weak classifiers using multiple features → classifier fusion
Use a different weak classifier such as an SVM, a NN, a threshold function, etc.
Use a different feature selection method: SVM
Discussion
It is important to add feature selection to image retrieval
A good methodology for selecting features should be used
AdaBoost is a learning algorithm → data dependent
It is important to have representative samples
Adaboost can help to improve the classification potential of simple algorithms
Thank you!