exploring nfl data with nflscraprryurko/files/user_nflscrapr.pdf · brief history of football...

15
Exploring NFL data with nflscrapR Ron Yurko (@Stat_Ron) Department of Statistics & Data Science Carnegie Mellon University 1

Upload: others

Post on 15-Aug-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Exploring NFL data with nflscrapRryurko/files/useR_nflscrapR.pdf · Brief history of football analytics 4. NOT ALL YARDS ARE CREATED EQUAL! 5. Recent work in football analytics is

Exploring NFL data with nflscrapR

Ron Yurko (@Stat_Ron)

Department of Statistics & Data Science

Carnegie Mellon University

1

Page 2: Exploring NFL data with nflscrapRryurko/files/useR_nflscrapR.pdf · Brief history of football analytics 4. NOT ALL YARDS ARE CREATED EQUAL! 5. Recent work in football analytics is

Suppose it’s 4th down with 4 yards to go from the 40 yard line…

You have three options:

1. Punt - sacrifice possession, but gain (some) field position

2. Attempt a field goal - sacrifice possession but (possibly) gain three points

3. GO FOR IT - try to advance the ball four yards and maintain possession

… what do you do???

2

Page 3: Exploring NFL data with nflscrapRryurko/files/useR_nflscrapR.pdf · Brief history of football analytics 4. NOT ALL YARDS ARE CREATED EQUAL! 5. Recent work in football analytics is

• Expected Points - historically how many points have teams scored in similar situations?

• Win Probability - have teams in similar situations won the game?

• Expected points added (EPA) & Win probability added (WPA)

How do we value a play?

3

Page 4: Exploring NFL data with nflscrapRryurko/files/useR_nflscrapR.pdf · Brief history of football analytics 4. NOT ALL YARDS ARE CREATED EQUAL! 5. Recent work in football analytics is

• Operations Research on Football (Carter and Machol, 1971)

• The Hidden Game of Football (Carroll, et al 1988)

• Aaron Schatz, Football Outsiders

• Brian Burke, Advanced Football Analytics

• numberFire, Pro Football Focus, ESPN, etc.

Brief history of football analytics

4

Page 5: Exploring NFL data with nflscrapRryurko/files/useR_nflscrapR.pdf · Brief history of football analytics 4. NOT ALL YARDS ARE CREATED EQUAL! 5. Recent work in football analytics is

NOT ALL YARDS ARE CREATED EQUAL!

5

Page 6: Exploring NFL data with nflscrapRryurko/files/useR_nflscrapR.pdf · Brief history of football analytics 4. NOT ALL YARDS ARE CREATED EQUAL! 5. Recent work in football analytics is

Recent work in football analytics is not easily reproducible:

• Reliance on proprietary & costly data sources

• Data quality potentially biased by human judgement

Prevents the growth of a football analytics community!

Where’s the data???

6

Page 7: Exploring NFL data with nflscrapRryurko/files/useR_nflscrapR.pdf · Brief history of football analytics 4. NOT ALL YARDS ARE CREATED EQUAL! 5. Recent work in football analytics is

• There is publicly available play-by-play data!

• NFL API structures its data using JSON for games going back to 2009

• In python, can use `nflgame’ and `nfldb’ by the unsung hero ‘burntsushi` (no longer maintained…)

• nflscrapR retrieves and organizes the API data into easy-to-use data frames, with more!

Enter nflscrapR

7

Page 8: Exploring NFL data with nflscrapRryurko/files/useR_nflscrapR.pdf · Brief history of football analytics 4. NOT ALL YARDS ARE CREATED EQUAL! 5. Recent work in football analytics is

Expected Points

8

Page 9: Exploring NFL data with nflscrapRryurko/files/useR_nflscrapR.pdf · Brief history of football analytics 4. NOT ALL YARDS ARE CREATED EQUAL! 5. Recent work in football analytics is

Win Probability

9

Page 10: Exploring NFL data with nflscrapRryurko/files/useR_nflscrapR.pdf · Brief history of football analytics 4. NOT ALL YARDS ARE CREATED EQUAL! 5. Recent work in football analytics is

air and YAC EPA/WPA

10

Page 11: Exploring NFL data with nflscrapRryurko/files/useR_nflscrapR.pdf · Brief history of football analytics 4. NOT ALL YARDS ARE CREATED EQUAL! 5. Recent work in football analytics is

11

Page 12: Exploring NFL data with nflscrapRryurko/files/useR_nflscrapR.pdf · Brief history of football analytics 4. NOT ALL YARDS ARE CREATED EQUAL! 5. Recent work in football analytics is

12

IT’S WORKING

Page 13: Exploring NFL data with nflscrapRryurko/files/useR_nflscrapR.pdf · Brief history of football analytics 4. NOT ALL YARDS ARE CREATED EQUAL! 5. Recent work in football analytics is

IT’S WORKING

13

Page 14: Exploring NFL data with nflscrapRryurko/files/useR_nflscrapR.pdf · Brief history of football analytics 4. NOT ALL YARDS ARE CREATED EQUAL! 5. Recent work in football analytics is

Let’s look at some data(ggplot2 football field code courtesy of Michael Lopez

https://github.com/statsbylopez/BlogPosts/blob/master/fball_field.R)

14

Page 15: Exploring NFL data with nflscrapRryurko/files/useR_nflscrapR.pdf · Brief history of football analytics 4. NOT ALL YARDS ARE CREATED EQUAL! 5. Recent work in football analytics is

Oct 19th: Football Analytics Workshop Featuring a Q&A session with

Michael Lopez, NFL Director of Data & Analytics Oct 20th: Carnegie Mellon Sports Analytics Conference

https://cmusportsanalytics.com/conference2018.html #CMSAC18

Thanks - join us on October 19-20th!

15