barrett poster - michigan state universityamezqui3/demat/barrett_poster.pdf · 2018. 12. 3. · c i...

1
Efficient object classification using Euler Characteristic Erik Amézquita a , Antonio Rieser b , Mario Canul Ku c , Rogelio Hasimoto c , Salvador Ruiz Correa d , Diego Jiménez Badillo e a Math Department, Universidad de Guanajuato, b CONACYT - Centro de Investigación en Matemáticas, c Centro de Investigación en Matemáticas, d Instituto Potosino de Investigación Científica y Tecnológica, e Instituto Nacional de Antropología e Historia [email protected] Q UESTION Can topology tell us some- thing about the origins of pre-Columbian masks? G ENERAL METHODOLOGY The classification algorithm is based on computing the Euler Characteristic Graph of each object as suggested in [1]. We introduced the extended ECG, concatenating several ECGs from the same artifact. Consider a n-dimensional object X =( V 0 , V 1 ,..., V n ) and its Euler Characteristic (EC): χ = n k=0 (-1) k | V k | No. of k -dimensional cells Then fix a vertex filtration function g and extend it to the rest of k -cells: g k ({v 0 , v 1 ,..., v k } )= min 0ik { g(v i ) } g k : V k [a, b] A k -cell A fixed function g : V 0 [a, b] V 0 set of vertices; [a, b] any interval The interval [a, b] is divided into T equally-spaced thresholds a = t 0 < t 1 < t 2 <...< t T = b. Consider the EC at i-th threshold: χ i = n k=0 (-1) k | V (i) k |. No. of k -cells c k such that g k (c k ) t i The EC Graph (ECG) is obtained by comparing χ i vs. t i . O NE PARTICULAR PROBLEM The goal was to specifically assort 128 masks into 9 different groups. Different filtrations and ECGs were considered. Heatmaps under different filtration functions. Using the principal cur- vature values we filtered according to minimum curvature, mean curva- ture and shape index. Each mask is embedded in the [-1, 1] 3 cube, with its mass centered at origin. The projections of each vertex to each x = 0, y = 0, z = 0 planes were considered as filtrations. Three different filtrations yield three different ECGs. The three of them are concatenated to get one extended graph for each mask. All the ECGs for all the masks were computed and plotted on the same plotspace, from which we can infer poor performance based on curvature values. -200 -150 -100 -50 0 50 100 150 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Almost every family Mean curvature based ECGs for 256 thresholds -800 -700 -600 -500 -400 -300 -200 -100 0 100 200 300 0 0.05 0.1 0.15 0.2 0.25 Shape index based ECGs for 256 thresholds However, our extended ECG obtained by concatenated projections’ ECGs gives a more attractive scenario to run a classification algorithm such as Support Vector Machine (SVM.) -20 -10 0 10 20 30 0 20 40 60 80 100 120 140 160 180 R ESULTS SVM was run using half of the mask’s dataset as training set and the rest of it was used for testing. A new assortment was obtained. The number of specimens per family was homogenized, since only two out of nine families possess less than 10 items. Out of the remaining seven families, 8 items were chosen randomly from each and their projection based ECGs were plotted. Different colors refer to different items. Family 2 -10 -8 -6 -4 -2 0 2 4 6 8 0 20 40 60 80 100 120 140 160 180 200 10 11 18 20 24 31 43 27 Family 3 -20 -15 -10 -5 0 5 10 15 20 0 20 40 60 80 100 120 140 160 180 200 027 036 085 099 007 026 087 092 Family 5 -25 -20 -15 -10 -5 0 5 10 0 20 40 60 80 100 120 140 160 180 200 047 084 095 113 180 0228 198 186 Family 9 -5 0 5 10 15 20 25 30 0 20 40 60 80 100 120 140 160 180 200 137 131 112 111 110 083 058 057 Removing the outlying masks and plotting the same four families simultaneously: -10 -8 -6 -4 -2 0 2 4 6 8 10 0 20 40 60 80 100 120 140 160 180 200 Familias 2,3,5,8 Families 2,3,5,9 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 0 20 40 60 80 100 120 140 160 180 200 New classification C ONCLUSIONS Answer to the original question: Yes it can! Archaeologists have manifested some approval towards the first proposed assortment. The computation of the ECG associated to a given object, especially if it is based on projections, is a simple algorithm of linear complexity and memory. A larger database, with more specimens per family might solve the lack of character- ization problem faced throughout the project. Outliers suggest that there might not be just nine families in total but more. It would be an interesting problem to determine the appropriate number of families of masks in first place. ACKNOWLEDGMENTS The authors would like to thank both Instituto Nacional de Antropología e Historia through its project Desarrollo de Aplicaciones de Computación en Arqueología and Red Temática CONACYT de Tecnologías Digitales para la Difusión del Patrimonio Cultural for providing the 3D scanned pre-Columbian masks which made possible this project. 47 TH BARRETT M EMORIAL L ECTURES —U NIVERSITY OF T ENNESSEE ,K NOXVILLE —M AY 2017 R EFERENCES [1] E. Richardson, M. Weirman Efficient classification using the Euler Characteristic In Pattern Recognition Letters Vol.49 pp.99-106, 2014. [2] C. Chang, C. Lin LIBSVM: A library for support vector machines In ACM Transactions on Intelligent Systems and Technology Vol.2, No.3, 2011.

Upload: others

Post on 05-Aug-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Barrett Poster - Michigan State Universityamezqui3/demat/barrett_poster.pdf · 2018. 12. 3. · c i = n å k=0 ( 1)k jV (i) k j: No. of k-cells c k such that g k (c k) t i The EC

Efficient object classification using Euler CharacteristicErik Amézquitaa, Antonio Rieserb, Mario Canul Kuc, Rogelio Hasimotoc,

Salvador Ruiz Corread , Diego Jiménez Badilloea Math Department, Universidad de Guanajuato, b CONACYT - Centro de Investigación en Matemáticas,

c Centro de Investigación en Matemáticas, d Instituto Potosino de Investigación Científica y Tecnológica,e Instituto Nacional de Antropología e [email protected]

QUESTION

Can topology tell us some-thing about the origins ofpre-Columbian masks?

GENERAL METHODOLOGY• The classification algorithm is based on computing the Euler Characteristic Graph

of each object as suggested in [1].• We introduced the extended ECG, concatenating several ECGs from the same artifact.

Consider a n-dimensional object X = (V0,V1, . . . ,Vn) and its Euler Characteristic (EC):

χ =n

∑k=0

(−1)k|Vk|

No. of k-dimensional cellsThen fix a vertex filtration function g and extend it to the rest of k-cells:

gk({v0,v1, . . . ,vk}) = min0≤i≤k

{g(vi)}

gk : Vk→ [a,b] A k-cellA fixed function g : V0 → [a,b]

V0 set of vertices; [a,b] any interval

The interval [a,b] is divided into T equally-spaced thresholds a= t0 < t1 < t2 < .. . < tT = b.Consider the EC at i-th threshold:

χi =n

∑k=0

(−1)k |V (i)k|.

No. of k-cells ck such that gk(ck)≥ ti

The EC Graph (ECG) is obtained by comparing χi vs. ti.

ONE PARTICULAR PROBLEMThe goal was to specifically assort 128 masks into 9 different groups. Different filtrationsand ECGs were considered.

Heatmaps under differentfiltration functions.Using the principal cur-vature values we filteredaccording to minimumcurvature, mean curva-ture and shape index.

Each mask is embedded in the [−1,1]3 cube, with its mass centered at origin. Theprojections of each vertex to each x = 0, y = 0, z = 0 planes were considered as filtrations.

Three different filtrationsyield three differentECGs. The three of themare concatenated to getone extended graph foreach mask.

All the ECGs for all the masks were computed and plotted on the same plotspace, fromwhich we can infer poor performance based on curvature values.

-200

-150

-100

-50

0

50

100

150

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

Almosteveryfamily

Mean curvature based ECGs for 256 thresholds

-800

-700

-600

-500

-400

-300

-200

-100

0

100

200

300

0 0.05 0.1 0.15 0.2 0.25

Shape index based ECGs for 256 thresholds

However, our extended ECG obtained by concatenated projections’ ECGs gives a moreattractive scenario to run a classification algorithm such as Support Vector Machine (SVM.)

-20

-10

0

10

20

30

0 20 40 60 80 100 120 140 160 180

RESULTS• SVM was run using half of the mask’s dataset as training set and the rest of it was

used for testing. A new assortment was obtained.• The number of specimens per family was homogenized, since only two out of nine

families possess less than 10 items.• Out of the remaining seven families, 8 items were chosen randomly from each and

their projection based ECGs were plotted.• Different colors refer to different items.

Fam

ily2

-10

-8

-6

-4

-2

0

2

4

6

8

0 20 40 60 80 100 120 140 160 180 200

1011182024314327

Fam

ily3

-20

-15

-10

-5

0

5

10

15

20

0 20 40 60 80 100 120 140 160 180 200

027036085099007026087092

Fam

ily5

-25

-20

-15

-10

-5

0

5

10

0 20 40 60 80 100 120 140 160 180 200

047084095113180

0228198186

Fam

ily9

-5

0

5

10

15

20

25

30

0 20 40 60 80 100 120 140 160 180 200

137131112111110083058057

Removing the outlying masks and plotting the same four families simultaneously:

-10

-8

-6

-4

-2

0

2

4

6

8

10

0 20 40 60 80 100 120 140 160 180 200

Familias2,3,5,8

Families 2,3,5,9

-25

-20

-15

-10

-5

0

5

10

15

20

25

30

0 20 40 60 80 100 120 140 160 180 200

New classification

CONCLUSIONS• Answer to the original question: Yes it can!• Archaeologists have manifested some approval towards the first proposed assortment.

• The computation of the ECG associated to a given object, especially if it is based onprojections, is a simple algorithm of linear complexity and memory.

• A larger database, with more specimens per family might solve the lack of character-ization problem faced throughout the project.

• Outliers suggest that there might not be just nine families in total but more. It wouldbe an interesting problem to determine the appropriate number of families of masksin first place.

ACKNOWLEDGMENTSThe authors would like to thank both Instituto Nacional de Antropología e Historia throughits project Desarrollo de Aplicaciones de Computación en Arqueología and Red TemáticaCONACYT de Tecnologías Digitales para la Difusión del Patrimonio Cultural for providingthe 3D scanned pre-Columbian masks which made possible this project.

47TH BARRETT MEMORIAL LECTURES — UNIVERSITY OF TENNESSEE, KNOXVILLE — MAY 2017

REFERENCES[1] E. Richardson, M. Weirman Efficient classification using the Euler Characteristic In Pattern Recognition

Letters Vol.49 pp.99-106, 2014.[2] C. Chang, C. Lin LIBSVM: A library for support vector machines In ACM Transactions on Intelligent

Systems and Technology Vol.2, No.3, 2011.