

NFL Play Predictions Will Burton, NCSU Industrial Engineering 2015

Michael Dickey, NCSU Statistics 2015


Motivation

If an offensive play call can be predicted, then the defense can adjust their formation to potentially shut down the offense.

Method

1) Research classification methods
2) Gather and clean NFL play-by-play data
3) Exploratory analysis and significance testing
4) Model building
5) Implement models in R Shiny application


Data Characteristics

● Source: ArmchairAnalysis.com
● NFL plays from 2000 – 2014 seasons
● 132,220 observations (trained using only 2011-2014 seasons)

Final Variables Used For Modeling

+ Down
+ Field Position
+ Yards to Go
+ Cumulative Number of Interceptions
+ Cumulative Number of Fumbles
+ Cumulative Number of Sacks
+ Down * Yards to Go
+ Number of Offensive Points
+ Number of Defensive Points
+ Point Differential
+ Offensive Timeouts Remaining
+ Defensive Timeouts Remaining
+ Minutes Remaining in Quarter
+ Previous Play Call
+ Quarter
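Two of the variables in the list above are derived from the others: Down * Yards to Go and Point Differential. A minimal sketch of how such a feature row might be assembled is shown here. The field names, the field-position convention, and the sign of the differential (offense minus defense) are assumptions for illustration and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class PlayState:
    """Game situation just before the snap (field names are hypothetical)."""
    down: int
    yards_to_go: int
    field_position: int       # assumed: yards from the offense's own goal line
    offense_points: int
    defense_points: int
    quarter: int
    minutes_remaining: float
    previous_play: str        # "pass" or "run"


def build_features(state: PlayState) -> dict:
    """Return model inputs, including the two derived variables."""
    return {
        "down": state.down,
        "yards_to_go": state.yards_to_go,
        "field_position": state.field_position,
        # Interaction term: Down * Yards to Go
        "down_x_yards_to_go": state.down * state.yards_to_go,
        # Assumed sign convention: offense minus defense
        "point_differential": state.offense_points - state.defense_points,
        "quarter": state.quarter,
        "minutes_remaining": state.minutes_remaining,
        "previous_play_pass": 1 if state.previous_play == "pass" else 0,
    }


# 3rd and 8, offense trailing 14-17 in the 4th quarter
features = build_features(PlayState(3, 8, 45, 14, 17, 4, 6.5, "pass"))
print(features["down_x_yards_to_go"], features["point_differential"])
```

The turnover, sack, and timeout counts would be added to the same dictionary in the same way.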


Logistic Regression

Final Model Created For:

1st Quarter
2nd Quarter
3rd Quarter
4th Quarter Winning
4th Quarter Losing
4th Quarter Tied

Additional Steps to Final Model:

1) Adjust for non-linearity
2) Perform forward selection
3) Reduce overfitting with Lasso Penalty
4) Cross-Validation
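Since a separate model is fitted for each of the six situations listed above, a prediction first has to be routed to the right model. The sketch below shows one way to do that routing. It assumes that "winning" and "losing" are judged from the offense's point of view, and it does not handle overtime, which the slides do not mention.

```python
def select_model(quarter: int, point_differential: int) -> str:
    """Return the name of the situation-specific model for a play.

    point_differential is assumed to be offense points minus defense points.
    """
    early_quarters = {1: "1st Quarter", 2: "2nd Quarter", 3: "3rd Quarter"}
    if quarter in early_quarters:
        return early_quarters[quarter]
    if quarter == 4:
        if point_differential > 0:
            return "4th Quarter Winning"
        if point_differential < 0:
            return "4th Quarter Losing"
        return "4th Quarter Tied"
    # Overtime and invalid quarters are outside the six models listed.
    raise ValueError(f"no model for quarter {quarter}")


print(select_model(2, 7))    # 2nd Quarter
print(select_model(4, -3))   # 4th Quarter Losing
```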


Non-Linearity of Odds
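Logistic regression assumes that the log-odds of a pass are linear in each predictor, so one common check is to bin a predictor and look at the empirical log-odds per bin. The sketch below illustrates that check on made-up data. It is not the authors' procedure, and the bin width and the 0.5 smoothing constant are arbitrary choices.

```python
import math
from collections import defaultdict


def empirical_log_odds(values, is_pass, bin_width):
    """Map each bin start to log(passes / runs) for the plays in that bin."""
    counts = defaultdict(lambda: [0, 0])   # bin start -> [passes, runs]
    for value, label in zip(values, is_pass):
        start = (value // bin_width) * bin_width
        counts[start][0 if label else 1] += 1
    log_odds = {}
    for start in sorted(counts):
        passes, runs = counts[start]
        # Add 0.5 to each count so an empty cell does not give log(0).
        log_odds[start] = math.log((passes + 0.5) / (runs + 0.5))
    return log_odds


# Made-up plays: yards to go and whether the play was a pass (1) or run (0)
yards_to_go = [1, 1, 2, 2, 9, 9, 10, 10]
was_pass = [0, 0, 0, 1, 1, 1, 1, 0]
by_bin = empirical_log_odds(yards_to_go, was_pass, bin_width=5)
for start, value in by_bin.items():
    print(f"{start}-{start + 4} yards: log-odds {value:+.2f}")
```

If the per-bin values do not fall roughly on a straight line, a transformed or piecewise term for that predictor may be worth trying.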


Lasso and Cross-Validation

Figure: Lasso C-V for 1st Quarter model
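To make the idea concrete, the sketch below fits an L1-penalized (lasso) logistic regression by proximal gradient descent and scores a few penalty values with k-fold cross-validation. It uses plain Python on made-up data. The slides do not say which R package or settings the authors used, so the optimizer, step size, fold scheme, and penalty grid here are all assumptions.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    # Proximal step for the L1 penalty: shrink toward zero, clip at zero.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def predict(x, w, b):
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_lasso_logistic(X, y, lam, step=0.1, iters=500):
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iters):
        errs = [predict(xi, w, b) - yi for xi, yi in zip(X, y)]
        grad_b = sum(errs) / n
        grad_w = [sum(e * xi[j] for e, xi in zip(errs, X)) / n for j in range(p)]
        b -= step * grad_b                  # intercept is not penalized
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def log_loss(X, y, w, b):
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict(xi, w, b), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(y)


def cv_scores(X, y, lams, k=4):
    """Mean held-out log-loss for each penalty value, using interleaved folds."""
    scores = {}
    for lam in lams:
        losses = []
        for fold in range(k):
            held_out = set(range(fold, len(X), k))
            X_tr = [x for i, x in enumerate(X) if i not in held_out]
            y_tr = [t for i, t in enumerate(y) if i not in held_out]
            X_te = [X[i] for i in sorted(held_out)]
            y_te = [y[i] for i in sorted(held_out)]
            w, b = fit_lasso_logistic(X_tr, y_tr, lam)
            losses.append(log_loss(X_te, y_te, w, b))
        scores[lam] = sum(losses) / k
    return scores


# Made-up data: the first column is informative, the second is mostly noise.
X = [[1, 0.2], [2, -0.1], [3, 0.3], [-1, 0.1],
     [-2, -0.2], [-3, 0.0], [0.5, -0.3], [-0.5, 0.2]]
y = [1, 1, 1, 0, 0, 0, 0, 1]

w_heavy, _ = fit_lasso_logistic(X, y, lam=10.0)   # strong penalty
w_light, _ = fit_lasso_logistic(X, y, lam=0.05)   # weak penalty
scores = cv_scores(X, y, [0.01, 0.05, 0.5])
best_lam = min(scores, key=scores.get)
print(w_heavy, w_light, best_lam)
```

A strong penalty drives every coefficient to exactly zero, which is how the lasso removes variables; cross-validation is then used to pick a penalty between the two extremes.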

Results

Tested on 20 randomly selected games between 2011-2014:

Mean Accuracy Rate: 75.4% Correct
Highest Accuracy Rate: 91.4% Correct

79.3% Correct

            Actual Pass   Actual Run
Pred Pass   59            13
Pred Run    12            37

91.4% Correct

            Actual Pass   Actual Run
Pred Pass   63            8
Pred Run    2             46
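The per-game accuracy is the share of plays on the diagonal of each confusion matrix. The short sketch below recomputes it from the counts shown above. The first table reproduces the 79.3% figure; the second table's counts give about 91.6%, slightly different from the 91.4% on the slide, so either a count or the percentage may have been transcribed imprecisely.

```python
def accuracy(pred_pass_actual_pass, pred_pass_actual_run,
             pred_run_actual_pass, pred_run_actual_run):
    """Fraction of plays where the predicted call matched the actual call."""
    correct = pred_pass_actual_pass + pred_run_actual_run
    total = (pred_pass_actual_pass + pred_pass_actual_run
             + pred_run_actual_pass + pred_run_actual_run)
    return correct / total


game_a = accuracy(59, 13, 12, 37)   # first table
game_b = accuracy(63, 8, 2, 46)     # second table
print(f"{game_a:.1%}  {game_b:.1%}")
```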


Random Forest

Tuning Parameters:

1) Number of candidate splitting variables
2) Maximum number of nodes
3) Cutoff threshold

Variable Selection

● Assess importance of variables with Gini Index

● Informal process of trial and error

Variable importance expressed as the mean decrease in Gini index
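As a rough illustration of what this measure is built from, the sketch below computes the Gini impurity of a node and the decrease produced by a single split, for a two-class (pass/run) problem. In a random forest these decreases are accumulated over every split that uses a given variable and averaged over the trees; the toy labels here are made up.

```python
def gini(labels):
    """Gini impurity of a node with binary labels (1 = pass, 0 = run)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)    # equals 1 - p^2 - (1 - p)^2 for two classes


def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted


parent = [1, 1, 1, 0, 0, 0]
perfect = gini_decrease(parent, [1, 1, 1], [0, 0, 0])   # split separates classes
weak = gini_decrease(parent, [1, 1, 0], [1, 0, 0])      # split barely helps
print(perfect, round(weak, 4))
```

A variable whose splits tend to look like the first case accumulates a large mean decrease and ranks as important.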

Tuning Parameters

Candidate Splitting Variables:
- Compare OOB error rates
- tuneRF: {randomForest}

Maximum Nodes:
- Trees grown to maximum depth
- Determined via experimentation
- Trees are overgrown to promote variability!

Cutoff Threshold:
- Use Youden index to maximize (Specificity + Sensitivity)
- coords: {pROC}
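The cutoff step can be sketched in a few lines: for each candidate threshold, compute sensitivity and specificity and keep the threshold with the largest Youden index J = Sensitivity + Specificity - 1. The code below is a plain-Python illustration on made-up probabilities, not a reimplementation of the pROC function named above, which may choose its candidate thresholds differently.

```python
def youden_threshold(probs, is_pass):
    """Return (threshold, J) maximizing sensitivity + specificity - 1.

    A play is predicted as a pass when its probability is >= threshold.
    """
    positives = sum(is_pass)
    negatives = len(is_pass) - positives
    best_t, best_j = None, -1.0
    for t in sorted(set(probs)):
        true_pos = sum(1 for p, y in zip(probs, is_pass) if p >= t and y == 1)
        true_neg = sum(1 for p, y in zip(probs, is_pass) if p < t and y == 0)
        j = true_pos / positives + true_neg / negatives - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Made-up predicted pass probabilities and actual outcomes (1 = pass)
probs = [0.2, 0.3, 0.4, 0.55, 0.6, 0.7, 0.8, 0.9]
actual = [0, 0, 1, 0, 1, 1, 1, 1]
best_t, best_j = youden_threshold(probs, actual)
print(best_t, round(best_j, 3))
```

Because J weights both error types equally, the chosen cutoff need not be 0.5 when passes and runs are unevenly represented.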

Results

Tested on 20 randomly selected games between 2011-2014:

Mean Accuracy Rate: 74.5% Correct
Highest Accuracy Rate: 89.1% Correct

89.1% Correct

            Actual Pass   Actual Run
Pred Pass   59            7
Pred Run    6             47

78.5% Correct

            Actual Pass   Actual Run
Pred Pass   60            15
Pred Run    11            35

Visualizing Predictions

• R Shiny visualization demo

• Plotting the predictions within one new game (try for Super Bowl 2015)


Results and Conclusions

• You can predict NFL plays with high accuracy (about 75% on average across the test games) using simple variables!

• The two model-building processes differ considerably but achieve similar accuracy rates

Further Research

• Compare with more complex machine learning techniques (e.g., neural networks)

• Develop a loss function to account for difference in consequences for each misclassification
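One simple form such a loss function could take is a pair of costs, one per kind of misclassification, which shifts the probability cutoff away from the accuracy-oriented one. The sketch below shows that idea; the cost values are hypothetical placeholders, and choosing realistic ones is exactly the open question raised here.

```python
def cost_threshold(cost_false_pass, cost_false_run):
    """Pass-probability cutoff that minimizes expected cost.

    cost_false_pass: cost of predicting pass when the play is a run.
    cost_false_run:  cost of predicting run when the play is a pass.
    Predicting pass is cheaper when (1 - p) * cost_false_pass < p * cost_false_run,
    which rearranges to p > cost_false_pass / (cost_false_pass + cost_false_run).
    """
    return cost_false_pass / (cost_false_pass + cost_false_run)


def decide(prob_pass, cost_false_pass, cost_false_run):
    cutoff = cost_threshold(cost_false_pass, cost_false_run)
    return "pass" if prob_pass >= cutoff else "run"


# Hypothetical costs: missing a pass is three times as costly as missing a run.
print(cost_threshold(1, 3))   # 0.25
print(decide(0.4, 1, 3))      # pass
print(decide(0.4, 3, 1))      # run
```

With equal costs the cutoff is 0.5, so this reduces to the usual majority-probability rule.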