Download - A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant
![Page 1: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/1.jpg)
A Validity Measure for Hard and Fuzzy Clustering Derived
from Fisher’s Linear Discriminant
Cláudia R. de FrancoLeonardo da S. VidalAdriano J. de O. Cruz
May 2002
![Page 2: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/2.jpg)
Topics
Validity Measures Finding the number and the distribution of
clusters Pattern Recognition
Identify and classify patterns
![Page 3: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/3.jpg)
Índice
Estudo Realizado Categorização Classificação Validação de Categorias
PropostasEFLD ICCSistema ICC-KNN
![Page 4: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/4.jpg)
Estudo Realizado
Categorização Classificação Validação de Categorias
![Page 5: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/5.jpg)
Categorização
Processo de particionar um conjunto de amostras em subconjuntos (categorias)
Dados similares entre si por suas característicasDisposição EspacialCategoria definida pela proximidade das
amostras – Distância Partições Rígidas e Nebulosas
![Page 6: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/6.jpg)
Classificação
Técnica que associa amostras a classes previamente conhecidas
Rígida e Nebulosa Supervisionados
MLP treinamento Não supervisionados
K-NN e K-NN nebuloso sem treinamento
![Page 7: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/7.jpg)
Reconhecimento de Padrões
Reconhecimento de Padrões + Categorização Sistema Estatístico Não paramétrico de Reconhecimento de Padrões
Estatístico avalia a similaridade dos dados através de medidas matemáticas
Não-Paramétrico sem conhecimento prévio da distribuição das amostras
![Page 8: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/8.jpg)
Denominação de Características
Extração de Características
Identificação de Características
Categorização
Validação de Categorias
Classificador
Dados de Treinamento
Dados de Teste
Taxa de erro
Sistema Estatístico Não-Paramétrico de Reconhecimento de Padrões
![Page 9: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/9.jpg)
Métodos de Categorização
Não-HierárquicosDados distribuídos pelo número de categorias
pré-definidoCritério é otimizado
Minimização da variação interna das categorias
![Page 10: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/10.jpg)
Métodos de Categorização
Hierárquico 1ª AbordagemCada ponto é um centro de categoriaCada 2 pontos mais próximos são fundidos
em uma categoriaNúmero de categorias desejado é atingido
Hierárquico 2ª AbordagemUma categoria contém todas as amostrasCritério é utilizado para dividí-la no número de
categorias desejado
![Page 11: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/11.jpg)
Métodos de Categorização
RígidosCada amostra pertence a uma única
categoria Nebulosos
Cada amostra pertence a todos os agrupamentos com diferentes graus de afinidade
Grau de inclusão
![Page 12: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/12.jpg)
Métodos de Categorização
k-Means K-NN e K-NN nebulosoFCM FKCNGGGK
![Page 13: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/13.jpg)
Métodos de Categorização
K-Means e FCMDistância Euclidiana Hiperesferas
Gustafson-KesselDistância de Mahalanobis Hiperelipsóides
Gath-GevaDistância de Gauss superfícies convexas
de formato indeterminado
![Page 14: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/14.jpg)
Rede Kohonen de Categorização Nebulosa FKCN
Método de Categorização Nebuloso não supervisionado
Distância Euclidiana Categorias hiperesféricas Converge mais rápido que FCM Forte tendência a convergir para mínimos
locaisCategorias pouco representam as classes
![Page 15: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/15.jpg)
K-NN e K-NN nebuloso
Métodos de Classificação Classes identificadas por padrões Classifica pelos k vizinhos mais próximos Conhecimento a priori das classes do
problema Não se restringe à uma distribuição
específica das amostras
![Page 16: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/16.jpg)
K-NN Rígido
w1
w2 w3
w4
w13
w10
w9 w14
w5
w8
w12
w11
w6w7
Classe 1Classe 2
Classe 3
Classe 4 Classe 5
![Page 17: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/17.jpg)
K-NN Nebuloso
w1
w2
w3
w4
w13
w10
w9 w14
w5
w8
w12
w11
w6w7
Classe 1Classe 2
Classe 3
Classe 4 Classe 5
![Page 18: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/18.jpg)
Medidas de Validação
![Page 19: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/19.jpg)
Validity Measures
Used to find the ideal number of clusters that represent the sample space. Number of classes unknownNumber of classes Number of clusters
![Page 20: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/20.jpg)
Validity Measures
Applied to the partitions generated by the clustering algorithm
Measure the quality of the partitions
Crisp or Fuzzy
![Page 21: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/21.jpg)
Coeficiente de Partição – F
Medida de Validação Nebulosa Maximizar – 1/c F 1 Diretamente influenciada pelo
Número de categorias e Sobreposição das classes
nFc
i
n
jij
1 1
2
![Page 22: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/22.jpg)
Compacidade e Separação – CS
Medida de Validação Nebulosa Minimizar – 0 CS Avalia diferentes funções objetivo
2mindn
JCS m
![Page 23: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/23.jpg)
Compacidade e Separação – CS
Mede:O grau de separação entre as categoriasA compacidade das categoriasNão sofre influência da sobreposição das
categorias
Maior taxa de acertos dentre as medidas de validação estudadas
![Page 24: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/24.jpg)
Discriminante Linear de Fisher - FLD
Crisp Validity Measure
Measures the compactness and separation of the partitions produced by crisp clustering techniques
Between-Class Scatter Matrix – SB
Within-Class Scatter Matrix Scatter – SW
![Page 25: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/25.jpg)
Discriminante Linear de Fisher - FLD
Critério J – Maximizado
W
B
S
SJ
W
B
Strace
StraceJ
![Page 26: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/26.jpg)
Indicadores de Validade
Calculam o grau de separação entre as categorias
Menor a sobreposição das categorias melhor a categorização obtida
MinRF, MaxRF e MinNMMcard
![Page 27: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/27.jpg)
Propostas
EFLD ICC Sistema ICC-KNN
![Page 28: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/28.jpg)
EFLD
![Page 29: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/29.jpg)
EFLD
Extended Fisher Linear Discriminant
Capable of validate crisp and fuzzy clusters
![Page 30: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/30.jpg)
EFLD Extended between-classes scatter matrix
mei is the centroid of cluster i
c
i
Tn
jijBeS
1 1
mmmm eiei
i
j
μ
x
n
jij
eim1
n
jij
1
iμand
Tc
iiB
mmnS
ii mm1
![Page 31: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/31.jpg)
EFLD
Extended within-class scatter matrix
Extended total scatter matrix
c
i
Tn
jWS
1 1ijij mxmx
c
i
Tn
jijWeS
1 1eijeij mxmx
SSS BeWeTeSSS BWT
![Page 32: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/32.jpg)
EFLD
It can be proved that if the sum of all membership values of any element is equal to one then the total scattering is independent of the partition
1,1
c
iijj
SSS T
n
j
T
BeWe
1
mxmx jj
![Page 33: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/33.jpg)
EFLD Extended Fisher Linear Discriminant
Determinants impose limits on the minimum number of points of each cluster
Trace - fasterNo limitations due to the number of points
We
Bee Strace
StraceJ
We
Bee S
SJ
![Page 34: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/34.jpg)
EFLD – Otimização
Matrix traces are the product of a column vector by its transpose
Trace is equal to the square of the module of this vector
2
1 1
)(
c
i
n
jijBeBe
Strace mmeis2
1 1
)(
c
i
n
jijWeWe
Strace eij mxs
![Page 35: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/35.jpg)
EFLD – Improving
Sum of both traces (SBe and Swe) is constant
sT is evaluated only once
Calculating sBe is faster than sWe
2
1
)(
n
jTTStraces mx j
![Page 36: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/36.jpg)
EFLD – Improving
So EFLD can be rewritten as
Faster to evaluate Find the maximum value of Je
BeT
BeeJ ss
s
We
Bee Strace
StraceJ
![Page 37: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/37.jpg)
EFLD – testing
Three classes, 500 point each X1 – (1,1), (6,1), (3,5, 7) with Std 0,3 X2 – (1,5, 2,5), (4,5, 2,5), (3,5, 4,5) with Std 0,7 Apply FCM to m = 2 and c = 2 ...6
![Page 38: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/38.jpg)
EFLD – Aplication
EFLDNumber of Clusters
2 3 4 5 6
X1 4,6815 4,9136 0,2943 0,2559 0,3157
X2 0,3271 0,8589 0,8757 0,9608 1,0674
For superposed classes, Je, like J (FLD), is not a good measure
Behaviour similar to FLD
![Page 39: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/39.jpg)
EFLD – Aplication
Alocação errônea dos centros
Mínimo local = Ponto médio do conjunto de pontos
Je extremamente pequeno = 9,8010 x 10-5
![Page 40: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/40.jpg)
ICC
![Page 41: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/41.jpg)
ICC – Inter Class Contrast
EFLDIncreases as the number of clusters
rises. Increases when classes have high
degree of overlapping.
Reaches maximum for a wrong number of clusters.
![Page 42: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/42.jpg)
ICC
Evaluates a crisp and fuzzy clustering algorithms
Measures:Partition Compactness Partition Separation
ICC must be Maximized
![Page 43: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/43.jpg)
ICC
sBe – estimates the quality of the placement of the centres.
1/n – scale factorCompensates the influence of the number of
points in sBe
cDn
ICC Be min
s
![Page 44: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/44.jpg)
ICCcD
n
sICC Be min
Dmin – minimum Euclidian distance between all pairs of centres
Neutralizes the tendency of sBe to grow, avoiding the maximum being reached for a number of clusters greater than the ideal value.When 2 or more clusters represent a class –
Dmin decreases abruptly
![Page 45: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/45.jpg)
ICC
c
cDn
sICC Be min
– square root of the number of clusters
Avoids the maximum being reached for a number of clusters below the ideal. When 1 cluster represents two or more
classes - Dmin increases
![Page 46: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/46.jpg)
ICC – Fuzzy Application
Five classes with 500 points each No class overlapping X1 – (1,2), (6,2), (1, 6), (6,6), (3,5, 9) Std 0,3 Apply FCM for m = 2 and c = 2 ...10
![Page 47: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/47.jpg)
MeasuresNumber of clusters
2 3 4 5
ICC M 7,596 41,99 51,92 96,70
ICCTra M 7,596 41,99 51,92 96,70
ICCDet M IND 154685 259791 673637
EFLD M 0.185 0.986 1.877 13.65
EFLDTra M 0,185 0,986 1,877 13,65
EFLDDet M IND 0,955 3,960 182,70
CS m 0,350 0,096 0,070 0,011
F M 0,705 0,713 0,795 0,943
MinHT M 0,647 0,572 2,124 1,994
MeanHT M 0,519 0,496 1,327 1,887
MinRF 0 0,100 0,316 0 0
![Page 48: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/48.jpg)
TimeNumber of Categories
2 3 4 5
ICC 0,0061 0,0069 0,0082 0,00914
ICCTra 0,0078 0,0060 0,0088 0,0110
ICCDet 0,0110 0,0088 0,0110 0,0132
EFLD 0.0053 0.0071 0.0063 0.0080
EFLDTra 0,7678 1,0870 1,4780 1,8982
EFLDDet 0,7800 1,1392 1,5510 2,0160
CS 0,0226 0,0261 0,0382 0,0476
NFI 0,0061 0,0056 0,0058 0,00603
F 0,0044 0,0045 0,0049 0,00491
FPI 0,0061 0,0045 0,0049 0,00532
![Page 49: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/49.jpg)
ICC – Fuzzy Application
Five classes with 500 points each High cluster overlapping X1 – (1,2), (6,2), (1, 6), (6,6), (3,5, 9) Std 0,3 Apply FCM for m = 2 and c = 2 ...10
![Page 50: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/50.jpg)
Measures 2 3 4 5 10
ICC M 5,065 4,938 6,191 7,829 5,69
ICCTra M 5,065 4,938 6,191 7,829 5,69
ICCDet M IND 715,19 3572 7048 6024
EFLD M 0.450 0.585 0.839 1.095 1.344
EFLDTra M 0,450 0,585 0,839 1,095 1,344
EFLDDet M IND 0,049 0,315 0,743 1,200
CS m 0,164 0,225 0,191 0,122 0,223
F M 0,754 0,621 0,591 0,586 0,439
MeanHT M 0,632 0,485 0,550 0,597 0,429
MinRF 0 0,170 0,294 0,194 0,210 0,402
MPE m 0,568 0,601 0,561 0,525 0,565
![Page 51: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/51.jpg)
TimeNumber of Clusters
2 3 4 5
ICC 0,0060 0,0064 0,0077 0,00881
ICCTra 0,0066 0,0060 0,0098 0,0110
ICCDet 0,0110 0,0078 0,0110 0,0120
EFLD 0.0063 0.0088 0.0096 0.0110
EFLDTra 0,7930 2,1038 1,7598 2,2584
EFLDDet 0,9720 1,2580 1,6090 1,8450
CS 0,0220 0,0283 0,0362 0,05903
F 0,0112 0,0121 0,0061 0,0164
MPE 0,0167 0,0271 0,0319 0,03972
![Page 52: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/52.jpg)
Medidas 4 5 6 7 8
ICC M 81,8485 105,4463 15,0987 14,8891 13,4127
DLF M 5,9021 67,262 72,354 77,413 79,549
CS m 0,1195 0,0121 0,6593 0,7413 16,1588
Tempos 4 5 6 7 8
ICC 0,0074 0,00801 0,0085 0,0093 0,0102
DLF 1,3216 1,6784 2,0324 2,3002 2,6140
CS 0,0308 0,03772 0,0437 0,0502 0,0569
![Page 53: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/53.jpg)
ICC – Aplicação Rígida
Medidas 4 5 6 7 8
ICC M 15,5823 18,1940 13,4461 13,3913 14,9289
DLF M 2,9176 4,8258 5,4257 6,0781 6,8428
CS m 0,2488 0,1898 0,3928 0,4338 0,3717
Tempos 4 5 6 7 8
ICC 0,0074 0,00991 0,0102 0,0115 0,0135
DLF 1,3258 1,6534 1,9850 2,3288 2,6166
CS 0,0321 0,03822 0,0454 0,0516 0,0582
![Page 54: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/54.jpg)
ICC – Conclusões
Rápida e Eficiente Analisa partições Nebulosas e Rígidas Eficiente com alta sobreposição das
classes Alta taxa de acertos
![Page 55: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/55.jpg)
ICC-KNN
![Page 56: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/56.jpg)
Sistema ICC-KNN
Sistema Estatístico Não-Paramétrico de Reconhecimento de Padrões
Associa FCM, KNN nebuloso e ICC Avaliar dados dispostos em diversos
formatos de classes
![Page 57: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/57.jpg)
Sistema ICC-KNN
Módulo de ClassificaçãoEstabelecer estruturas nos dados
Primeira Fase de TreinamentoAvalia a melhor distribuição de padrões para o
K-NN nebuloso FCM – Aplicado para cada classe ICC – Encontra o melhor número de categorias que
representa cada classe
![Page 58: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/58.jpg)
Sistema ICC-KNN
Segunda Fase de Treinamento Avalia a melhor constante nebulosa e o
melhor número de vizinhos para o K-NN – maior performanceVaria-se m e kEscolhe-se m e k para a maior taxa de
Acertos Rígidos
![Page 59: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/59.jpg)
Sistema ICC-KNN
Módulo de Reconhecimento de PadrõesAtribuir os dados às classes definidas
Utiliza os padrões, m e k para classificar os dados
![Page 60: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/60.jpg)
Sistema ICC-KNN
Classe 1
Classe s
FCM
FCM
ICC
ICC
K-NNnebuloso
m k
W, Uw
W Uw
w1
ws
U1cmin
U1cmáx
UScmin
UScmáx
K-NNnebuloso
Módulo de Classificação
Módulo de Reconhecimento de
Padrões
Dados não classificados
![Page 61: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/61.jpg)
Sistema ICC-KNN - Algoritmo Módulo de Classificação
Primeira fase do Treinamento Passo 1. Fixar m Passo 2. Fixar cmin e cmáx Passo 3. Para cada classe s conhecida
Gerar o conjunto Rs com os pontos de R pertencentes à classe sPara cada categoria c no intervalo [cmin , cmáx]
Executar FCM para c e o conjunto Rs gerando Usc e Vsc
Calcular a ICC para Rs e UscFimDefinir os padrões ws da classe s como a matriz Vsc que maximiza a ICC
Passo 4. Gerar o conjunto W = {w1, ..., ws}
![Page 62: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/62.jpg)
Sistema ICC-KNN - Algoritmo Segunda fase do Treinamento Passo 5. Fixar mmin e mmáx Passo 6. Fixar kmin e kmáx
Para cada m do intervalo [mmin , mmáx] Para cada k do intervalo [kmin , kmáx]
Executar o K-NN nebuloso para os padrões do conjunto W, gerando Umk Calcular os acertos rígidos para Umk
Passo 7. Escolher o m e k que obtêm a maior taxa de acertos rígidos Passo 8. Se houver empate
Se os k são diferentesEscolher o menor k
SenãoEscolher o menor m
![Page 63: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/63.jpg)
Sistema ICC-KNN - Algoritmo
Módulo de Reconhecimento de Padrões
Passo 9. Aplicar o K-NN nebuloso com os padrões do conjunto W e os parâmetros m e k escolhidos aos dados a serem classificados
![Page 64: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/64.jpg)
Sistema ICC-KNN - Avaliação
2000 amostras, 4 classes, 500 amostras em cada classe Classe 1 e 4 – classes côncavas Classes 2 e 3 – classes convexas com formato elíptico
![Page 65: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/65.jpg)
Sistema ICC-KNN - Avaliação
Primeira Fase de Treinamento FCM aplicado a cada classe
Dados de treinamento 80% 400 amostrasc = 3..7 e m = 1,25
ICC aplicada aos resultadosClasses 1 e 4 4 categoriasClasses 2 e 3 3 categorias
![Page 66: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/66.jpg)
Sistema ICC-KNN - Avaliação
Segunda Fase de Treinamento Execução do K-NN Nebuloso
Padrões da PFTPadrões Aleatóriosk = 3 a 7 vizinhosm = {1,1; 1,25; 1,5; 2}
![Page 67: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/67.jpg)
Sistema ICC-KNN - Avaliação
Conclusão: K-NN é mais estável em relação ao valor de m para os
padrões da PFT
![Page 68: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/68.jpg)
Sistema ICC-KNN - Avaliação
Dados de Treinamento
ClassesPadrões da PFT Padrões Aleatórios
1 2 3 4 1 2 3 4
1 388 10 0 2 213 66 0 121
2 14 379 0 7 19 380 0 1
3 0 0 376 24 3 0 324 73
4 0 1 2 397 4 46 1 349
Dados de Treinamento Linhas classes Colunas classificação m = 1,5 e k = 3 96,25% m = 1,1 e k = 3 79,13% (padrões aleatórios)
![Page 69: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/69.jpg)
Sistema ICC-KNN - Avaliação
Dados de Teste Módulo de Reconhecimento de padrões Execução do K-NN nebuloso nos dados de teste Pad. PFT – 94,75% Pad. Aleat – 79%
Dados de Testes
ClassesPadrões da PFT Padrões Aleatórios
1 2 3 4 1 2 3 4
1 97 2 0 1 53 27 0 20
2 4 93 0 3 4 96 0 0
3 0 0 90 10 0 0 82 18
4 0 0 1 99 0 15 0 85
![Page 70: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/70.jpg)
Sistema ICC-KNN - Avaliação
Tempos de Execução Padrões da PFT 36,5 s
PFT FCM + ICC= 15,5 sSFT 21,04 sTotal 36,5 s
Aleatório 23,11s
![Page 71: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/71.jpg)
Sistema ICC-KNN - Avaliação
Acerto Nebuloso grau de inclusão > 1/k
![Page 72: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/72.jpg)
ICC-KNN x Mét. de Categorização
FCM, FKCN, GG e GK Fase de Treinamento (FTr)
Dados de treinamentoc = 4 e m = {1,1; 1,25; 1,5; 2}Associar as categorias às classes
Critério do somatório dos graus de inclusão Cálculo do somatório dos graus de inclusão dos pontos
de cada classe em cada categoria Uma classe pode ser representada por mais de uma
categoria
![Page 73: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/73.jpg)
ICC-KNN x Mét. de Categorização
Fase de TesteDados de Teste Inicialização dos métodos com os centros da FTrCalcula o grau de inclusão dos pontos em cada
categoria Classe representada por mais de 1 categoria
Grau de inclusão = soma dos graus de inclusão dos pontos nas categorias que representam a classe
![Page 74: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/74.jpg)
GK para m = 2 84% FCM e FKCN 66% para m = 1,1 e m = 1,25 GG-FCM 69% para m = 1,1 e 1,25 GG Aleatório 57,75% para m = 1,1 e 25% para m = 1,5
ICC-KNNKNN A.
FCM FKCN GG GK
R 94,75% 79% 66% 66% 69% 84%
N95,75%
83%70,75% 70,75% 69% 89,5%
T36,5s
23,11s2,91s 2,59s 22,66s 18,14s
ICC-KNN x Mét. de Categorização
![Page 75: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/75.jpg)
FCM GG-FCM
GK
![Page 76: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/76.jpg)
Reconhecimento de Dígitos Manuscritos
![Page 77: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/77.jpg)
Problema
Dígitos manuscritos extraídos de formulários
Escaneados imagens do tipo Tiff Algoritmo de Afinamento
Esqueleto da imagem Extração de características
Método do Polígono 122 características 4077 dígitos 3266 e 811 amostras
![Page 78: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/78.jpg)
Aplicação do ICC-KNN
PFT FCM m = 1,25 e c = 2..30
0 1 2 3 4 5 6 7 8 9
22 29 12 25 15 26 25 23 10 30
SFT K-NN neb. Padrões da PFT e Aleatóriosk = 3..7 e m ={1,1; 1,25; 1,5; 2}
![Page 79: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/79.jpg)
Acertos e Tempos
Métodos ICC-KNN K-NN Neb. Alea.
Acertos Ríg. 87,8% 72,4%
Acertos Neb. 94,53% 85,63%
Tempos 7166 s 1224,3 s
Dados de Teste m = 1,25 e k = 7 87,8% 21,3% superior
![Page 80: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/80.jpg)
ICC-KNN x Mét. De Categorização
Comparação com os Mét. De CategorizaçãoFCM, FKCN, GG, GK
122 19 característicasPCA – Principal Components AnalysisVariância preservada 82,6%p(p-1)/2
![Page 81: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/81.jpg)
Acertos e Tempos
ICC-KNN K-NN A. FCM FKCN GG GK
86,7% 75,22% 57% 55% 51% 49%
93,8% 85,66% 60% 54% 39,5% 39,8%
1784 s 260 s 30,38 s 32,79 s 108,15 s 711,77 s
Dados de Teste ICC-KNN 86,7% param = 1,25 e k = 6 FCM 57% para m = 1,25 52% de ganho do ICC-KNN sobre o FCM
![Page 82: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/82.jpg)
Acertos Rígidos
Pouco estável em relação à m
![Page 83: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/83.jpg)
Conclusões EFLD
Estendeu eficientemente as funcionalidades do FLD partições rígidas e nebulosas
Maior velocidade ICC
Eficiente e rápidaSuporta alta sobreposição das classesAvalia a compacidade e a separação das
classesAlto grau de acertos
![Page 84: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/84.jpg)
Conclusões
Sistema ICC-KNNMaior eficiência sobre sistemas que usam
métodos de categorizaçãoMelhor classificação dos dadosFacilidade de implementaçãoNão oferece restrições ao conjunto de amostrasTaxas superiores no problema de
reconhecimento de dígitos manuscritos
![Page 85: A Validity Measure for Hard and Fuzzy Clustering Derived from Fisher’s Linear Discriminant](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681478e550346895db4bf51/html5/thumbnails/85.jpg)
Trabalhos Futuros
ICC-KNN com outros métodos de categorização
Variar a constante nebulosa na PFT Empregar redes MLP para avaliar os
graus de inclusão gerados pelo ICC-KNNAvaliar as amostras em um espaço
dimensional menor