Amy Langville, Associate Professor of Mathematics, The College of Charleston in South Carolina at...
DESCRIPTION
My talk will cover four ranking and clustering projects that I consulted on this past year. The projects range from ranking Olympic athletes, mixed martial arts fighters, and cell phone carriers to clustering sentences to rank individuals by how much humility they evidence in their written language. For each project, I will address the particular data challenges and the solutions and techniques we proposed.
TRANSCRIPT
![Page 1: Amy Langville, Associate Professor of Mathematics, The College of Charleston in South Carolina at MLconf ATL](https://reader033.vdocuments.site/reader033/viewer/2022061111/5454e67daf79590b088b45ed/html5/thumbnails/1.jpg)
4 Consulting Projects from this past year
September 19, 2014
Machine Learning 2014
Amy Langville
Mathematics Department
College of Charleston
Tyler Perini
Mathematics Department
College of Charleston
Outline
2 Books generate questions
US Olympic Projects
CageRank
Ranking Cell Phone Carriers
The Humility Project
2 Books generate questions
1232-1315
Chapter 7 talks about . . . but I need to . . . Any advice?
I really enjoyed your book, but my problem is . . ., which you don’t mention. How do I solve it?
Project 1: from U.S. Olympic Committee
Problem 1: Your book talks a lot about ranking in head-to-head contests (and that was helpful), but we need to rank multi-competitor sports like downhill skiing and gymnastics.
Solution 1: TRUESKILL
μ = average skill
σ = uncertainty
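TrueSkill describes each competitor with the two numbers above, μ and σ. One common way to collapse them into a single leaderboard score is the conservative estimate μ − kσ, with k = 3 as the usual choice. The sketch below assumes per-athlete (μ, σ) values are already available; it does not perform the TrueSkill update itself, and the athlete names and numbers are made up for illustration.

```python
from typing import Dict, List, Tuple


def conservative_ranking(
    ratings: Dict[str, Tuple[float, float]], k: float = 3.0
) -> List[Tuple[str, float]]:
    """Rank competitors by mu - k * sigma, highest score first."""
    scored = [(name, mu - k * sigma) for name, (mu, sigma) in ratings.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical (mu, sigma) pairs; not taken from the talk.
ratings = {
    "skier_a": (30.0, 1.5),       # well observed: 30 - 3 * 1.5 = 25.5
    "skier_b": (32.0, 4.0),       # higher mean, more uncertain: 32 - 12 = 20.0
    "skier_c": (25.0, 25.0 / 3),  # newcomer at the usual default prior: about 0
}

ranking = conservative_ranking(ratings)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Under this scoring, an athlete with a higher mean but few observed results can sit below a well-observed athlete, which is often the desired behavior when data is thin.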
[Figure: positions labeled 1st, 2nd, and 3rd]
Problem 2: Your book talks a lot about ranking in head-to-head contests where there are multiple matches between competitors, but our data is sparse. Any advice?
Project 2: CageRank
Problem: You talk a lot about ranking head-to-head contests, like ours [MMA fights], but our data is really sparse. How do we deal with that?
Solution: FIND SIMILAR FIGHTERS to densify the graph
UFC 163
Phil Davis and Lyoto Machida had never fought each other
College football vs. UFC
UFC 163: Phil Davis vs. Lyoto Machida
Find 10 most similar fighters to each
Similar by? Fightmetric stats, SVD SIGNS
Phil Davis: 1 Rashad Evans, 2 Ryan Bader, 3 Alexander Gustafsson, 4 Antonio Rogerio Nogueira, 5 Quinton “Rampage” Jackson, 6 Chael Sonnen, 7 Matt Hamill, 8 James Te-Huna, 9 Dan Henderson, 10 Vladimir Matyushenko
Lyoto Machida: 1 Ricardo Arona, 2 Jason Brilz, 3 Ryan Bader, 4 Stephan Bonnar, 5 Randy Couture, 6 Trevor Prangley, 7 Tito Ortiz, 8 Mark Coleman, 9 Ovince St. Preux, 10 Chael Sonnen
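The "find the most similar fighters" step can be sketched as a nearest-neighbor lookup over per-fighter stat vectors. The slide names Fightmetric stats and SVD signs as the basis for similarity; the sketch below swaps in plain cosine similarity as a stand-in, and the fighter labels, stat columns, and numbers are all hypothetical.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(target: str, stats: Dict[str, List[float]], k: int = 10) -> List[str]:
    """Return the k fighters whose stat vectors are closest to the target's."""
    scores = [
        (other, cosine(stats[target], vec))
        for other, vec in stats.items()
        if other != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scores[:k]]


# Hypothetical per-fighter stats, e.g. strikes, takedowns, submission attempts.
stats = {
    "fighter_a": [4.0, 1.0, 0.5],
    "fighter_b": [3.9, 1.1, 0.4],
    "fighter_c": [1.0, 4.0, 2.0],
    "fighter_d": [0.9, 3.8, 2.2],
}

neighbors = most_similar("fighter_a", stats, k=2)
print(neighbors)
```

Once each fighter has a neighbor list, past results against the other fighter's neighbors can serve as proxy matchups, which is one way to add edges to a sparse fight graph.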
12
6
Question: is the goal to predict the winner or generate buzz?
Project 3: Ranking Cell Phone Carriers
Problem: Rather than individual games between carriers, we have a distribution of game scores for each carrier. How do we use this data to rank carriers?
Solution: SIMULATE HEAD-TO-HEAD GAMES BY RANDOM DRAWS FROM DATA, then rank aggregate by BORDA COUNT (# carriers each carrier outranks).
New Problem: data is loaded with ties!
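A minimal sketch of the simulate-then-Borda idea follows. It assumes higher scores are better, draws one score per carrier in each simulated round, and awards a Borda point for every carrier outranked. Counting a tie as half a point is an assumption made here for illustration, not something the slides specify; the carrier names and scores are made up.

```python
import random
from typing import Dict, List


def simulate_borda(
    scores: Dict[str, List[float]], rounds: int = 1000, seed: int = 0
) -> Dict[str, float]:
    """Average Borda points per simulated round for each carrier."""
    rng = random.Random(seed)
    names = list(scores)
    totals = {name: 0.0 for name in names}
    for _ in range(rounds):
        # One random draw per carrier acts as a simulated game.
        draw = {name: rng.choice(scores[name]) for name in names}
        for a in names:
            for b in names:
                if a == b:
                    continue
                if draw[a] > draw[b]:
                    totals[a] += 1.0   # a outranks b in this round
                elif draw[a] == draw[b]:
                    totals[a] += 0.5   # assumption: a tie is worth half a point
    return {name: totals[name] / rounds for name in names}


# Hypothetical observed scores per carrier.
scores = {
    "carrier_x": [5.0, 5.0, 4.0],
    "carrier_y": [3.0, 4.0, 3.0],
    "carrier_z": [1.0, 2.0, 2.0],
}

averages = simulate_borda(scores)
order = sorted(averages, key=averages.get, reverse=True)
print(order)
```

With heavily tied data, the choice of tie rule (half point, zero points, or redraw) can change the final order, so it is worth checking how sensitive the ranking is to that choice.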
Project 3: Ranking Cell Phone Carriers
MARKOV CHAIN
Question: what makes a model good?
Stability in the face of small data changes
Explainability to public
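The slides name a Markov chain without giving its construction. One standard construction, sketched here as an assumption rather than as the method used in the project, lets each loser pass its rating to the carriers that beat it and takes the stationary vector as the ranking. The damping term (as in PageRank) and the win counts are illustrative choices.

```python
from typing import Dict


def markov_ranking(
    wins: Dict[str, Dict[str, float]], damping: float = 0.85, iterations: int = 200
) -> Dict[str, float]:
    """Stationary ratings of a 'loser votes for winner' Markov chain.

    wins[a][b] is the number of times a beat b. Every competitor must
    appear as a key, even with an empty dict.
    """
    names = sorted(wins)
    n = len(names)
    rating = {name: 1.0 / n for name in names}
    for _ in range(iterations):
        new = {name: (1.0 - damping) / n for name in names}
        for loser in names:
            total_losses = sum(wins[w].get(loser, 0.0) for w in names)
            for winner in names:
                if total_losses > 0:
                    share = wins[winner].get(loser, 0.0) / total_losses
                else:
                    share = 1.0 / n  # undefeated: spread rating evenly
                new[winner] += damping * rating[loser] * share
        rating = new
    return rating


# Hypothetical simulated win counts.
wins = {
    "carrier_x": {"carrier_y": 2, "carrier_z": 2},
    "carrier_y": {"carrier_z": 2},
    "carrier_z": {},
}

rating = markov_ranking(wins)
order = sorted(rating, key=rating.get, reverse=True)
print(order)
```

A quick way to probe the stability criterion is to perturb a few win counts and compare the resulting orders.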
Project 4: Humility Project
Problem: We’re trying to analyze a person’s writing to predict his/her humility, but we lost our data guy. Can you help us?
Solution: NON-NEGATIVE MATRIX FACTORIZATION (NMF) to find hidden clusters in text.
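To make the NMF step concrete, here is a small pure-Python sketch using the Lee and Seung multiplicative updates for the factorization V ≈ WH. It assumes the rows of V are sentences and the columns are term counts, and it assigns each sentence to the factor with the largest weight in W. The matrix, the rank, and the iteration count are illustrative assumptions, not values from the project.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def nmf(
    v: Matrix, rank: int, iterations: int = 500, seed: int = 0, eps: float = 1e-9
) -> Tuple[Matrix, Matrix]:
    """Factor a non-negative matrix v into w (rows x rank) and h (rank x cols)."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(rows)]
    h = [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(cols)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(rows)]
    return w, h


# Hypothetical sentence-by-term counts: two sentences per topic.
v = [
    [2.0, 2.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 2.0, 2.0],
]

w, h = nmf(v, rank=2)
# Cluster label for each sentence: index of its largest weight in w.
labels = [max(range(2), key=lambda k: row[k]) for row in w]
print(labels)
```

Because every entry of W and H stays non-negative, each factor can be read as an additive "topic": the large entries in a row of H show which terms define the cluster.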
Conclusions
We need you. You open our eyes to problems we never would have thought about.
Iterative Collaboration
Many GREAT ALGORITHMS exist. Some just need tweaking.
Future Work. . . (you tell me)