Data Analysis of Tennis Matches Fatih Çalışır

Post on 22-Feb-2016


TRANSCRIPT

Page 1: Data Analysis of Tennis Matches

Data Analysis of Tennis Matches

Fatih Çalışır

Page 2: Data Analysis of Tennis Matches

Domain of the Data

4 Types of Tennis Tournaments

1. ATP World Tour 250: ATP 250 Brisbane, ATP 250 Sydney, ...

2. ATP World Tour 500: ATP 500 Memphis, ATP 500 Dubai

Page 3: Data Analysis of Tennis Matches

Domain of the Data

3. ATP World Tour 1000: ATP 1000 Paris, ATP 1000 Shanghai, ...

4. Grand Slams: Australian Open, Roland Garros, Wimbledon, US Open

Page 4: Data Analysis of Tennis Matches

Domain of the Data

• Men’s Singles
• Year 2010
• 11 ATP 500 tournaments
• 9 ATP 1000 tournaments
• 4 Grand Slams

Page 5: Data Analysis of Tennis Matches

Source of Data

• Internet
• Official websites of the players
• ATP (Association of Tennis Professionals) home page
• 2010 Result Archive

Page 6: Data Analysis of Tennis Matches

Data Construction

• From different tables
• Each table from a different website
• Tables combine easily

Page 7: Data Analysis of Tennis Matches

Data Construction

Players Table

Page 8: Data Analysis of Tennis Matches

Data Construction

Tournament Results Table

Page 9: Data Analysis of Tennis Matches

Data Construction

Tournament Info Table

Page 10: Data Analysis of Tennis Matches

Data Construction

Final Data Table

• 29 features
• 1453 instances

Page 11: Data Analysis of Tennis Matches

Aim of the Project

• Classification
• Finding weights for attributes

Page 12: Data Analysis of Tennis Matches

Missing Values

• Players’ Height
• Players’ Weight
• Players’ BMI
• Players’ Date of Turning Professional

Page 13: Data Analysis of Tennis Matches

Missing Values

• Players’ Height: consider players with the same weight and take the average of their heights
• Players’ Weight: consider players with the same height and take the average of their weights

Page 14: Data Analysis of Tennis Matches

Missing Values

• Players’ Height and Weight: if both are missing, remove the row
• Players’ Date of Turning Professional: consider players of the same age and take the average
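The missing-value rules on these slides can be sketched in a few lines of Python. This is only an illustration under assumptions: the field names (`height`, `weight`, `age`, `turned_pro`) and the sample rows are hypothetical, and the slides do not say which tool was actually used for this step.

```python
from statistics import mean

def impute_by_group(rows, target, key):
    """Fill missing `target` values with the mean over rows that share the same `key`."""
    groups = {}
    for row in rows:
        if row[target] is not None and row[key] is not None:
            groups.setdefault(row[key], []).append(row[target])
    filled = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        if row[target] is None and row[key] in groups:
            row[target] = mean(groups[row[key]])
        filled.append(row)
    return filled

def clean_players(rows):
    # Rule 1: remove rows where both height and weight are missing.
    kept = [r for r in rows if not (r["height"] is None and r["weight"] is None)]
    # Rule 2: missing height <- average height of players with the same weight.
    kept = impute_by_group(kept, "height", "weight")
    # Rule 3: missing weight <- average weight of players with the same height.
    kept = impute_by_group(kept, "weight", "height")
    # Rule 4: missing turned-pro year <- average over players of the same age.
    kept = impute_by_group(kept, "turned_pro", "age")
    return kept

# Hypothetical sample rows, not taken from the presentation's data set.
players = [
    {"name": "A", "height": 185, "weight": 80, "age": 24, "turned_pro": 2004},
    {"name": "B", "height": 191, "weight": 80, "age": 24, "turned_pro": 2006},
    {"name": "C", "height": None, "weight": 80, "age": 24, "turned_pro": None},
    {"name": "D", "height": None, "weight": None, "age": 30, "turned_pro": 1999},
]
cleaned = clean_players(players)
```

A value stays missing when no other player shares the grouping key; the slides do not say how that case was handled.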

Page 15: Data Analysis of Tennis Matches

Data Understanding

Min, max, median, and average values for numeric attributes
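As a rough sketch of this step, the four summary values can be computed per numeric attribute with Python's standard library. The `summarize` helper and the sample heights are made up for illustration and are not values from the presentation.

```python
from statistics import mean, median

def summarize(values):
    """Return min, max, median, and average, ignoring missing (None) entries."""
    present = [v for v in values if v is not None]
    return {
        "min": min(present),
        "max": max(present),
        "median": median(present),
        "average": mean(present),
    }

# Hypothetical player heights in centimetres.
heights_cm = [175, 183, None, 185, 188, 198]
summary = summarize(heights_cm)
```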

Page 16: Data Analysis of Tennis Matches

Data Understanding

Occurrence table for categorical and numeric attributes

Page 17: Data Analysis of Tennis Matches

Data Understanding

Histogram for numeric attributes

Page 18: Data Analysis of Tennis Matches

Data Understanding

Box plot for the main characteristics of numerical attributes

Page 19: Data Analysis of Tennis Matches

Data Understanding

Scatter plot to relate two attributes

Page 20: Data Analysis of Tennis Matches

Feature Selection

Linear Correlation
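Linear correlation between two numeric attributes is usually measured with the Pearson coefficient. The sketch below computes it and flags strongly correlated attribute pairs. The 0.9 threshold, the column names, and the sample values are assumptions for illustration, since the slide does not state how the coefficients were used.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column has no linear relationship to report
    return cov / (spread_x * spread_y)

def correlated_pairs(columns, threshold=0.9):
    """List attribute pairs whose absolute correlation exceeds the threshold."""
    return [
        (a, b)
        for a, b in combinations(columns, 2)
        if abs(pearson(columns[a], columns[b])) > threshold
    ]

# Hypothetical columns: weight tracks height closely, ranking does not.
columns = {
    "height": [175, 180, 185, 190, 195],
    "weight": [70, 75, 80, 85, 90],
    "ranking": [12, 3, 40, 7, 25],
}
redundant = correlated_pairs(columns)
```

A pair flagged this way carries nearly the same information twice, which makes one of the two attributes a candidate for removal.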

Page 21: Data Analysis of Tennis Matches

Feature Selection

• Backward Elimination
• Naive Bayes for Ranking
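Backward elimination starts from the full attribute set and repeatedly drops the attribute whose removal hurts the evaluation score least. The sketch below assumes that "Naive Bayes for Ranking" means candidate subsets are scored by a naive Bayes classifier; that reading, the use of training-set accuracy as the score, and the toy attributes are all assumptions, not details given on the slide.

```python
from collections import Counter, defaultdict
from math import log

def nb_accuracy(rows, labels, features):
    """Training-set accuracy of a categorical naive Bayes model (Laplace smoothing)."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature, class) -> counts of each value
    domains = defaultdict(set)           # feature -> set of observed values
    for row, label in zip(rows, labels):
        for f in features:
            value_counts[(f, label)][row[f]] += 1
            domains[f].add(row[f])
    correct = 0
    for row, label in zip(rows, labels):
        best_class, best_score = None, float("-inf")
        for c, n_c in class_counts.items():
            score = log(n_c / len(labels))  # log prior
            for f in features:
                count = value_counts[(f, c)][row[f]]
                score += log((count + 1) / (n_c + len(domains[f])))
            if score > best_score:
                best_class, best_score = c, score
        correct += best_class == label
    return correct / len(labels)

def backward_elimination(rows, labels, features, score=nb_accuracy):
    """Drop one attribute at a time while the score does not get worse."""
    selected = list(features)
    best = score(rows, labels, selected)
    while len(selected) > 1:
        candidates = [
            (score(rows, labels, [f for f in selected if f != drop]), drop)
            for drop in selected
        ]
        cand_score, drop = max(candidates)
        if cand_score < best:
            break  # every possible removal makes the model worse
        best = cand_score
        selected.remove(drop)
    return selected, best

# Hypothetical toy data: "higher_ranked" predicts the outcome, "surface" does not.
rows = [
    {"higher_ranked": "yes", "surface": "hard"},
    {"higher_ranked": "yes", "surface": "clay"},
    {"higher_ranked": "no", "surface": "hard"},
    {"higher_ranked": "no", "surface": "clay"},
]
labels = ["win", "win", "loss", "loss"]
selected, best = backward_elimination(rows, labels, ["higher_ranked", "surface"])
```

A real run would score each subset on held-out data or with cross-validation rather than on the training set.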

Page 22: Data Analysis of Tennis Matches

Feature Selection

• 28 attributes reduced to 19 attributes
• Attributes are meaningful

Page 23: Data Analysis of Tennis Matches

Weight of Attributes

RIMARC to find weights

Page 24: Data Analysis of Tennis Matches

Classification

• KNIME
• Decision Tree – C4.5
• Gain Ratio Quality Measure
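Gain ratio is the split criterion associated with C4.5: the information gain of a split divided by the entropy of the split itself, which penalizes attributes with many distinct values. The sketch below computes it for one categorical attribute. The function names and sample data are illustrative only, and this is not KNIME's implementation.

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (in bits) of the value distribution in `items`."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())

def gain_ratio(values, labels):
    """Information gain of splitting on `values`, divided by the split information."""
    total = len(labels)
    partitions = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical data: the first attribute separates the classes, the second does not.
labels = ["win", "win", "loss", "loss"]
informative = gain_ratio(["yes", "yes", "no", "no"], labels)
uninformative = gain_ratio(["hard", "clay", "hard", "clay"], labels)
```

The tree learner evaluates this quantity for every candidate attribute at a node and splits on the one with the highest value.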

Page 25: Data Analysis of Tennis Matches

Classification

• 1017 instances for training
• 436 instances for testing
• 842 positive instances
• 611 negative instances
• Training and test data are randomly selected
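A random split with these sizes (1017 + 436 = 1453 instances, roughly 70% for training and 30% for testing) could be reproduced as sketched below. The seed and the `random_split` helper are assumptions, and the presentation's split was presumably done inside KNIME rather than in code like this.

```python
import random

def random_split(instances, n_train, seed=0):
    """Shuffle a copy of the instances and cut it into training and test parts."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    return shuffled[:n_train], shuffled[n_train:]

# Stand-in class labels with the counts from the slide: 842 positive, 611 negative.
instances = [1] * 842 + [0] * 611
train, test = random_split(instances, n_train=1017)
```

Because the split is purely random, the share of positive instances in each part only approximates the overall 842 to 611 ratio.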

Page 26: Data Analysis of Tennis Matches

Classification

Decision Tree

Page 27: Data Analysis of Tennis Matches

Classification

Page 28: Data Analysis of Tennis Matches

Classification

Confusion Matrix

Page 29: Data Analysis of Tennis Matches

Classification

Confusion Matrix

Page 30: Data Analysis of Tennis Matches

Classification

Accuracy Statistics
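The confusion matrices and accuracy tables were images on the slides and are not preserved in this transcript. As a reminder of how such statistics follow from a confusion matrix, here is a minimal sketch for a binary classifier. The label lists are toy values and not the presentation's results.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def accuracy_statistics(tp, fp, tn, fn):
    """Derive accuracy, precision, recall, and F-measure from the four counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }

# Toy labels (1 = positive class), purely illustrative.
actual = [1, 1, 1, 0, 0, 1, 0, 0]
predicted = [1, 1, 0, 0, 1, 1, 0, 0]
counts = confusion_counts(actual, predicted)
stats = accuracy_statistics(*counts)
```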

Page 31: Data Analysis of Tennis Matches

Classification

Naive Bayes Classifier

Confusion Matrix

Page 32: Data Analysis of Tennis Matches

Classification

Confusion Matrix

Page 33: Data Analysis of Tennis Matches

Classification

Accuracy Statistics

Page 34: Data Analysis of Tennis Matches

Classification

C4.5 vs Naive Bayes

Decision Tree (C4.5) Naive Bayes

Page 35: Data Analysis of Tennis Matches

Classification

C4.5 vs Naive Bayes

Decision Tree (C4.5)

Naive Bayes