a multiresolution stochastic process model for basketball ...dcervone.com/slides/epv_jsm.pdfnba...

52
A Multiresolution Stochastic Process Model for Basketball Possession Outcomes Dan Cervone, Alex D’Amour, Luke Bornn, Kirk Goldsberry Harvard Statistics Department August 11, 2015 Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Upload: others

Post on 07-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

A Multiresolution Stochastic Process Model forBasketball Possession Outcomes

Dan Cervone, Alex D’Amour, Luke Bornn, Kirk Goldsberry

Harvard Statistics Department

August 11, 2015

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 2: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

NBA optical tracking data

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 3: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

NBA optical tracking data

(x , y) locations for all 10 players (5 on each team) at 25Hz.

(x , y , z) locations for the ball at 25Hz.

Event annotations (shots, passes, fouls, etc.).

1230 games from 2013-14 NBA, each 48 minutes, featuring 461 players intotal.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 4: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

NBA optical tracking data

(x , y) locations for all 10 players (5 on each team) at 25Hz.

(x , y , z) locations for the ball at 25Hz.

Event annotations (shots, passes, fouls, etc.).

1230 games from 2013-14 NBA, each 48 minutes, featuring 461 players intotal.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 5: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Expected Possession Value (EPV)

JamesshotNORRIS COLE POSSESSION

RASHARDLEWISPOSS.

LEBRON JAMES POSSESSION

EPV constant whilepass is en route

PassPass

Slight dip in EPV aftercrossing 3 point line

Slight dip in EPV aftercrossing 3 point line

Acceleratestowards basket

Runs behindbasket and

defenders close

Acceleratesinto the paint

Splits the defense;clear path to

basket

0.8

1.0

1.2

1.4

1.6

time

EP

V

0 3 6 9 12 15 18

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 6: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

EPV definition

Let Ω be the space of all possible basketball possessions. For ω ∈ Ω

X (ω) ∈ 0, 2, 3: point value of possession ω.

T (ω): possession length

Zt(ω), 0 ≤ t ≤ T (ω): time series of optical tracking data.

F (Z)t = σ(Z−1

s : 0 ≤ s ≤ t): natural filtration.

Definition

The expected possession value (EPV) at time t ≥ 0 during a possession is

νt = E[X |F (Z)t ].

EPV provides an instantaneous snapshot of the possession’s value, given itsfull spatiotemporal history.

νt is a Martingale: E[νt+h|F (Z)t ] = νt for all h > 0.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 7: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

EPV definition

Let Ω be the space of all possible basketball possessions. For ω ∈ Ω

X (ω) ∈ 0, 2, 3: point value of possession ω.

T (ω): possession length

Zt(ω), 0 ≤ t ≤ T (ω): time series of optical tracking data.

F (Z)t = σ(Z−1

s : 0 ≤ s ≤ t): natural filtration.

Definition

The expected possession value (EPV) at time t ≥ 0 during a possession is

νt = E[X |F (Z)t ].

EPV provides an instantaneous snapshot of the possession’s value, given itsfull spatiotemporal history.

νt is a Martingale: E[νt+h|F (Z)t ] = νt for all h > 0.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 8: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

EPV definition

Let Ω be the space of all possible basketball possessions. For ω ∈ Ω

X (ω) ∈ 0, 2, 3: point value of possession ω.

T (ω): possession length

Zt(ω), 0 ≤ t ≤ T (ω): time series of optical tracking data.

F (Z)t = σ(Z−1

s : 0 ≤ s ≤ t): natural filtration.

Definition

The expected possession value (EPV) at time t ≥ 0 during a possession is

νt = E[X |F (Z)t ].

EPV provides an instantaneous snapshot of the possession’s value, given itsfull spatiotemporal history.

νt is a Martingale: E[νt+h|F (Z)t ] = νt for all h > 0.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 9: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

EPV definition

Let Ω be the space of all possible basketball possessions. For ω ∈ Ω

X (ω) ∈ 0, 2, 3: point value of possession ω.

T (ω): possession length

Zt(ω), 0 ≤ t ≤ T (ω): time series of optical tracking data.

F (Z)t = σ(Z−1

s : 0 ≤ s ≤ t): natural filtration.

Definition

The expected possession value (EPV) at time t ≥ 0 during a possession is

νt = E[X |F (Z)t ].

EPV provides an instantaneous snapshot of the possession’s value, given itsfull spatiotemporal history.

νt is a Martingale: E[νt+h|F (Z)t ] = νt for all h > 0.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 10: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

EPV definition

Let Ω be the space of all possible basketball possessions. For ω ∈ Ω

X (ω) ∈ 0, 2, 3: point value of possession ω.

T (ω): possession length

Zt(ω), 0 ≤ t ≤ T (ω): time series of optical tracking data.

F (Z)t = σ(Z−1

s : 0 ≤ s ≤ t): natural filtration.

Definition

The expected possession value (EPV) at time t ≥ 0 during a possession is

νt = E[X |F (Z)t ].

EPV provides an instantaneous snapshot of the possession’s value, given itsfull spatiotemporal history.

νt is a Martingale: E[νt+h|F (Z)t ] = νt for all h > 0.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 11: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

EPV definition

Let Ω be the space of all possible basketball possessions. For ω ∈ Ω

X (ω) ∈ 0, 2, 3: point value of possession ω.

T (ω): possession length

Zt(ω), 0 ≤ t ≤ T (ω): time series of optical tracking data.

F (Z)t = σ(Z−1

s : 0 ≤ s ≤ t): natural filtration.

Definition

The expected possession value (EPV) at time t ≥ 0 during a possession is

νt = E[X |F (Z)t ].

EPV provides an instantaneous snapshot of the possession’s value, given itsfull spatiotemporal history.

νt is a Martingale: E[νt+h|F (Z)t ] = νt for all h > 0.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 12: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

EPV definition

Let Ω be the space of all possible basketball possessions. For ω ∈ Ω

X (ω) ∈ 0, 2, 3: point value of possession ω.

T (ω): possession length

Zt(ω), 0 ≤ t ≤ T (ω): time series of optical tracking data.

F (Z)t = σ(Z−1

s : 0 ≤ s ≤ t): natural filtration.

Definition

The expected possession value (EPV) at time t ≥ 0 during a possession is

νt = E[X |F (Z)t ].

EPV provides an instantaneous snapshot of the possession’s value, given itsfull spatiotemporal history.

νt is a Martingale: E[νt+h|F (Z)t ] = νt for all h > 0.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 13: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Calculating EPVνt = E[X |F (Z)

t ]

Regression-type prediction methods:

– Data are not traditional input/output pairs.

– No guarantee of stochastic consistency.

Markov chains:

+ Stochastically consistent.

– Information is lost through discretization.

– Many rare transitions.

Brute force, “God model” for basketball.

+ Allows Monte Carlo calculation of νt by simulating future possession paths.

– Zt is high dimensional and includes discrete events (passes, shots, turnovers).

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 14: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Calculating EPVνt = E[X |F (Z)

t ]

Regression-type prediction methods:

– Data are not traditional input/output pairs.

– No guarantee of stochastic consistency.

Markov chains:

+ Stochastically consistent.

– Information is lost through discretization.

– Many rare transitions.

Brute force, “God model” for basketball.

+ Allows Monte Carlo calculation of νt by simulating future possession paths.

– Zt is high dimensional and includes discrete events (passes, shots, turnovers).

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 15: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Calculating EPVνt = E[X |F (Z)

t ]

Regression-type prediction methods:

– Data are not traditional input/output pairs.

– No guarantee of stochastic consistency.

Markov chains:

+ Stochastically consistent.

– Information is lost through discretization.

– Many rare transitions.

Brute force, “God model” for basketball.

+ Allows Monte Carlo calculation of νt by simulating future possession paths.

– Zt is high dimensional and includes discrete events (passes, shots, turnovers).

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 16: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

A coarsened process

Finite collection of states C = Cposs ∪ Cend ∪ Ctrans.

Cposs: Ball possession states

player × region × defender within 5 feet

Cend: End states

made 2, made 3, turnover

Ctrans: Transition states

pass linking c , c ′ ∈ Cposs,shot attempt from c ∈ Cposs,

turnover in progress,rebound in progress .

Ct ∈ C: state of the possession at time t.

C (0),C (1), . . . ,C (K): discrete sequence of distinct states.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 17: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

A coarsened process

Finite collection of states C = Cposs ∪ Cend ∪ Ctrans.

Cposs: Ball possession states

player × region × defender within 5 feetCend: End states

made 2, made 3, turnover

Ctrans: Transition states

pass linking c , c ′ ∈ Cposs,shot attempt from c ∈ Cposs,

turnover in progress,rebound in progress .

Ct ∈ C: state of the possession at time t.

C (0),C (1), . . . ,C (K): discrete sequence of distinct states.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 18: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

A coarsened process

Finite collection of states C = Cposs ∪ Cend ∪ Ctrans.

Cposs: Ball possession states

player × region × defender within 5 feetCend: End states

made 2, made 3, turnover

Ctrans: Transition states

pass linking c , c ′ ∈ Cposs,shot attempt from c ∈ Cposs,

turnover in progress,rebound in progress .

Ct ∈ C: state of the possession at time t.

C (0),C (1), . . . ,C (K): discrete sequence of distinct states.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 19: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

A coarsened process

Finite collection of states C = Cposs ∪ Cend ∪ Ctrans.

Cposs: Ball possession states

player × region × defender within 5 feetCend: End states

made 2, made 3, turnover

Ctrans: Transition states

pass linking c , c ′ ∈ Cposs,shot attempt from c ∈ Cposs,

turnover in progress,rebound in progress .

Ct ∈ C: state of the possession at time t.

C (0),C (1), . . . ,C (K): discrete sequence of distinct states.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 20: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Possible paths for Ct

Cposs

Player 1

Passto

P2 Player 2

Passto

P3 Player 3

Passto

P4 Player 4

Passto

P5 Player 5

Ctrans Turnover Shot Rebound

Cend Made 2pt Made 3pt End of possession

Figure 1: Schematic of the coarsened possession process C, with states (rectangles) and possiblestate transitions (arrows) shown. The unshaded states in the first row compose Cposs. Here, statescorresponding to distinct ballhandlers are grouped together (Player 1 through 5), and the discretizedcourt in each group represents the player’s coarsened position and defended state. The gray shadedrectangles are transition states, Ctrans, while the rectangles in the third row represent the end states,Cend. Blue arrows represent possible macrotransition entrances (and red arrows, macrotransitionexits) when Player 1 has the ball; these terms are introduced in Section ??.

1

τt =

mins : s > t,Cs ∈ Ctrans if Ct ∈ Cposs

t if Ct 6∈ Cposs

δt = mins : s ≥ τt ,Cs 6∈ Ctrans.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 21: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Possible paths for Ct

Cposs

Player 1

Passto

P2 Player 2

Passto

P3 Player 3

Passto

P4 Player 4

Passto

P5 Player 5

Ctrans Turnover Shot Rebound

Cend Made 2pt Made 3pt End of possession

Figure 1: Schematic of the coarsened possession process C, with states (rectangles) and possiblestate transitions (arrows) shown. The unshaded states in the first row compose Cposs. Here, statescorresponding to distinct ballhandlers are grouped together (Player 1 through 5), and the discretizedcourt in each group represents the player’s coarsened position and defended state. The gray shadedrectangles are transition states, Ctrans, while the rectangles in the third row represent the end states,Cend. Blue arrows represent possible macrotransition entrances (and red arrows, macrotransitionexits) when Player 1 has the ball; these terms are introduced in Section ??.

1

τt =

mins : s > t,Cs ∈ Ctrans if Ct ∈ Cposs

t if Ct 6∈ Cposs

δt = mins : s ≥ τt ,Cs 6∈ Ctrans.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 22: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Stopping times for switching resolutions

τt =

mins : s > t,Cs ∈ Ctrans if Ct ∈ Cposs

t if Ct 6∈ Cposs

δt = mins : s ≥ τt ,Cs 6∈ Ctrans.

Key assumptions:

A1 C is marginally semi-Markov.

A2 For all s > δt , P(Cs |Cδt ,F (Z)t ) = P(Cs |Cδt ).

Theorem

Assume (A1)–(A2), then for all 0 ≤ t < T,

νt =∑

c∈Ctrans∪Cend

E[X |Cδt = c]P(Cδt = c |F (Z)t ).

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 23: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Stopping times for switching resolutions

τt =

mins : s > t,Cs ∈ Ctrans if Ct ∈ Cposs

t if Ct 6∈ Cposs

δt = mins : s ≥ τt ,Cs 6∈ Ctrans.

Key assumptions:

A1 C is marginally semi-Markov.

A2 For all s > δt , P(Cs |Cδt ,F (Z)t ) = P(Cs |Cδt ).

Theorem

Assume (A1)–(A2), then for all 0 ≤ t < T,

νt =∑

c∈Ctrans∪Cend

E[X |Cδt = c]P(Cδt = c |F (Z)t ).

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 24: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Stopping times for switching resolutions

τt =

mins : s > t,Cs ∈ Ctrans if Ct ∈ Cposs

t if Ct 6∈ Cposs

δt = mins : s ≥ τt ,Cs 6∈ Ctrans.

Key assumptions:

A1 C is marginally semi-Markov.

A2 For all s > δt , P(Cs |Cδt ,F (Z)t ) = P(Cs |Cδt ).

Theorem

Assume (A1)–(A2), then for all 0 ≤ t < T,

νt =∑

c∈Ctrans∪Cend

E[X |Cδt = c]P(Cδt = c |F (Z)t ).

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 25: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Stopping times for switching resolutions

τt =

mins : s > t,Cs ∈ Ctrans if Ct ∈ Cposs

t if Ct 6∈ Cposs

δt = mins : s ≥ τt ,Cs 6∈ Ctrans.

Key assumptions:

A1 C is marginally semi-Markov.

A2 For all s > δt , P(Cs |Cδt ,F (Z)t ) = P(Cs |Cδt ).

Theorem

Assume (A1)–(A2), then for all 0 ≤ t < T,

νt =∑

c∈Ctrans∪Cend

E[X |Cδt = c]P(Cδt = c |F (Z)t ).

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 26: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Multiresolution models

EPV:νt =

∑c∈Ctrans∪Cend

E[X |Cδt = c]P(Cδt = c |F (Z)t ).

Let M(t) be the event τt ≤ t + ε.M1 P(Zt+ε|M(t)c ,F (Z)

t ): the microtransition model.

M2 P(M(t)|F (Z)t ): the macrotransition entry model.

M3 P(Cδt |M(t),F (Z)t ): the macrotransition exit model.

M4 P, with Pqr = P(C (n+1) = cr |C (n) = cq): the Markov transition probabilitymatrix.

Monte Carlo computation of νt :

Draw Cδt |F (Z)t using (M1)–(M3).

Calculate E[X |Cδt ] using (M4).

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 27: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Multiresolution models

EPV:νt =

∑c∈Ctrans∪Cend

E[X |Cδt = c]P(Cδt = c |F (Z)t ).

Let M(t) be the event τt ≤ t + ε.M1 P(Zt+ε|M(t)c ,F (Z)

t ): the microtransition model.

M2 P(M(t)|F (Z)t ): the macrotransition entry model.

M3 P(Cδt |M(t),F (Z)t ): the macrotransition exit model.

M4 P, with Pqr = P(C (n+1) = cr |C (n) = cq): the Markov transition probabilitymatrix.

Monte Carlo computation of νt :

Draw Cδt |F (Z)t using (M1)–(M3).

Calculate E[X |Cδt ] using (M4).

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 28: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Multiresolution models

EPV:νt =

∑c∈Ctrans∪Cend

E[X |Cδt = c]P(Cδt = c |F (Z)t ).

Let M(t) be the event τt ≤ t + ε.M1 P(Zt+ε|M(t)c ,F (Z)

t ): the microtransition model.

M2 P(M(t)|F (Z)t ): the macrotransition entry model.

M3 P(Cδt |M(t),F (Z)t ): the macrotransition exit model.

M4 P, with Pqr = P(C (n+1) = cr |C (n) = cq): the Markov transition probabilitymatrix.

Monte Carlo computation of νt :

Draw Cδt |F (Z)t using (M1)–(M3).

Calculate E[X |Cδt ] using (M4).

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 29: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Microtransition model

Player `’s position at time t is z`(t) = (x`(t), y `(t)).

x`(t + ε) = x`(t) + α`x [x`(t)− x`(t − ε)] + η`x(t)

η`x(t) ∼ N (µ`x(z`(t)), (σ`x)2).µx has Gaussian Process prior.y `(t) modeled analogously (and independently).

Defensive microtransition model based on defensive matchups[Franks et al., 2015].

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 30: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Microtransition model

Player `’s position at time t is z`(t) = (x`(t), y `(t)).

x`(t + ε) = x`(t) + α`x [x`(t)− x`(t − ε)] + η`x(t)

η`x(t) ∼ N (µ`x(z`(t)), (σ`x)2).µx has Gaussian Process prior.y `(t) modeled analogously (and independently).

TONY PARKER WITH BALL DWIGHT HOWARD WITH BALL

Defensive microtransition model based on defensive matchups[Franks et al., 2015].

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 31: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Microtransition model

Player `’s position at time t is z`(t) = (x`(t), y `(t)).

x`(t + ε) = x`(t) + α`x [x`(t)− x`(t − ε)] + η`x(t)

η`x(t) ∼ N (µ`x(z`(t)), (σ`x)2).µx has Gaussian Process prior.y `(t) modeled analogously (and independently).

TONY PARKER WITHOUT BALL DWIGHT HOWARD WITHOUT BALL

Defensive microtransition model based on defensive matchups[Franks et al., 2015].

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 32: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Microtransition model

Player `’s position at time t is z`(t) = (x`(t), y `(t)).

x`(t + ε) = x`(t) + α`x [x`(t)− x`(t − ε)] + η`x(t)

η`x(t) ∼ N (µ`x(z`(t)), (σ`x)2).µx has Gaussian Process prior.y `(t) modeled analogously (and independently).

TONY PARKER WITHOUT BALL DWIGHT HOWARD WITHOUT BALL

Defensive microtransition model based on defensive matchups[Franks et al., 2015].

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 33: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Macrotransition entry model

Recall M(t) = τt ≤ t + ε:Six different “types”, based on entry state Cτt ,∪6

j=1Mj(t) = M(t).

Hazards: λj(t) = limε→0P(Mj (t)|F (Z)

t )ε .

log(λj(t)) = [W`j (t)]′β`j + ξ`j

(z`(t)

)

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 34: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Macrotransition entry model

Recall M(t) = τt ≤ t + ε:Six different “types”, based on entry state Cτt ,∪6

j=1Mj(t) = M(t).

Hazards: λj(t) = limε→0P(Mj (t)|F (Z)

t )ε .

log(λj(t)) = [W`j (t)]′β`j + ξ`j

(z`(t)

)

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 35: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Macrotransition entry model

Recall M(t) = τt ≤ t + ε:Six different “types”, based on entry state Cτt ,∪6

j=1Mj(t) = M(t).

Hazards: λj(t) = limε→0P(Mj (t)|F (Z)

t )ε .

log(λj(t)) = [W`j (t)]′β`j + ξ`j

(z`(t)

)+ ξ`j (zj(t)) 1[j ≤ 4]

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 36: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Hierarchical modeling

Dwight Howard’s shot chart:

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 37: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Hierarchical modeling

Shrinkage needed:

Across space.

Across different players.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 38: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Basis representation of spatial effects

Spatial effects ξ`j`: ballcarrier identity.

j : macrotransition type (pass, shot, turnover).

Functional basis representation

ξ`j (z) = [w`j ]′φj(z).

φj = (φji . . . φjd)′: d spatial basis functions.

w`j : weights/loadings.

Information sharing

φj allows for non-stationarity, correlations between disjoint regions[Higdon, 2002].

w`j : weights across players follow a CAR model [Besag, 1974] based on playersimilarity graph H.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 39: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Basis representation of spatial effects

Spatial effects ξ`j`: ballcarrier identity.

j : macrotransition type (pass, shot, turnover).

Functional basis representation

ξ`j (z) = [w`j ]′φj(z).

φj = (φji . . . φjd)′: d spatial basis functions.

w`j : weights/loadings.

Information sharing

φj allows for non-stationarity, correlations between disjoint regions[Higdon, 2002].

w`j : weights across players follow a CAR model [Besag, 1974] based on playersimilarity graph H.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 40: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Basis representation of spatial effects

Spatial effects ξ`j`: ballcarrier identity.

j : macrotransition type (pass, shot, turnover).

Functional basis representation

ξ`j (z) = [w`j ]′φj(z).

φj = (φji . . . φjd)′: d spatial basis functions.

w`j : weights/loadings.

Information sharing

φj allows for non-stationarity, correlations between disjoint regions[Higdon, 2002].

w`j : weights across players follow a CAR model [Besag, 1974] based on playersimilarity graph H.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 41: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Basis representation of spatial effects

Basis functions φj learned in pre-processing step:

Graph H learned from players’ court occupancy distribution:

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 42: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Basis representation of spatial effects

Basis functions φj learned in pre-processing step:

Graph H learned from players’ court occupancy distribution:

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 43: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Inference

“Partially Bayes” estimation of all model parameters:

Multiresolution transition models provide partial likelihood factorization[Cox, 1975].

All model parameters estimated using R-INLA software[Rue et al., 2009, Lindgren et al., 2011].

Distributed computing implementation:

Preprocessing involves low-resource, highly parallelizable tasks.

Parameter estimation involves several CPU- and memory-intensive tasks.

Calculating EPV from parameter estimates involves low-resource, highlyparallelizable tasks.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 44: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Inference

“Partially Bayes” estimation of all model parameters:

Multiresolution transition models provide partial likelihood factorization[Cox, 1975].

All model parameters estimated using R-INLA software[Rue et al., 2009, Lindgren et al., 2011].

Distributed computing implementation:

Preprocessing involves low-resource, highly parallelizable tasks.

Parameter estimation involves several CPU- and memory-intensive tasks.

Calculating EPV from parameter estimates involves low-resource, highlyparallelizable tasks.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 45: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Inference

“Partially Bayes” estimation of all model parameters:

Multiresolution transition models provide partial likelihood factorization[Cox, 1975].

All model parameters estimated using R-INLA software[Rue et al., 2009, Lindgren et al., 2011].

Distributed computing implementation:

Preprocessing involves low-resource, highly parallelizable tasks.

Parameter estimation involves several CPU- and memory-intensive tasks.

Calculating EPV from parameter estimates involves low-resource, highlyparallelizable tasks.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 46: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

New insights from basketball possessions

JamesshotNORRIS COLE POSSESSION

RASHARDLEWISPOSS.

LEBRON JAMES POSSESSION

EPV constant whilepass is en route

PassPass

Slight dip in EPV aftercrossing 3 point line

Slight dip in EPV aftercrossing 3 point line

Acceleratestowards basket

Runs behindbasket and

defenders close

Acceleratesinto the paint

Splits the defense;clear path to

basket

0.8

1.0

1.2

1.4

1.6

time

EP

V

0 3 6 9 12 15 18

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 47: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

New insights from basketball possessions0.

81.

01.

21.

41.

6

time

EP

V

0 3 6 9 12 15 18

A

B

C

D

1

2 3

4

5

1

2

34

51

2 3

4

5

1

2

34

51

2 3

4

5

1

2

34

5

A

V 1.42P 0.83

V 1.09P 0.00

V 0.98P 0.04

V 1.01P 0.06

V 1.03P 0.04

TURNOVERV 0.00P 0.02

OTHERV 0.99P 0.02

LEGENDMOVEMENT

Macro History Micro

MACROTRANSITION VALUE (V)

High Medium Low

MACROTRANSITION PROBABILITY (P)High Medium Low

OFFENSE1

2

3

4

5

Norris ColeRay AllenRashard LewisLeBron JamesChris Bosh

DEFENSE1

2

3

4

5

Deron WilliamsJason Terry

Joe JohnsonAndray Blatche

Brook Lopez

1

23

4

5

1

2

3

451

23

4

5

1

2

3

451

23

4

5

1

2

3

45

B

V 1.17P 0.27

V 1.02P 0.03 V 0.98

P 0.04

V 1.01P 0.23

V 1.03P 0.12

TURNOVERV 0.00P 0.16

OTHERV 1.00P 0.16

1

23

4

5

1

23

45

1

23

4

5

1

23

45

1

23

4

5

1

23

45

C

V 1.01P 0.05

V 1.00P 0.13

V 1.01P 0.04

V 1.01P 0.61

V 1.05P 0.04 TURNOVER

V 0.00P 0.01

OTHERV 0.97P 0.13

12

3

4

5

12

3

4

5

12

3

4

5

12

3

4

5

12

3

4

5

12

3

4

5

D

V 1.50P 0.83

V 1.01P 0.01

V 1.09P 0.01

V 0.98P 0.04

V 1.03P 0.00

TURNOVERV 0.00P 0.04

OTHERV 1.06P 0.06

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 48: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

New metrics for player performance

EPV-added:

Rank Player EPVA

1 Kevin Durant 3.262 LeBron James 2.963 Jose Calderon 2.794 Dirk Nowitzki 2.695 Stephen Curry 2.506 Kyle Korver 2.017 Serge Ibaka 1.708 Channing Frye 1.659 Al Horford 1.55

10 Goran Dragic 1.54

Rank Player EPVA

277 Zaza Pachulia -1.55278 DeMarcus Cousins -1.59279 Gordon Hayward -1.61280 Jimmy Butler -1.61281 Rodney Stuckey -1.63282 Ersan Ilyasova -1.89283 DeMar DeRozan -2.03284 Rajon Rondo -2.27285 Ricky Rubio -2.36286 Rudy Gay -2.59

Table : Top 10 and bottom 10 players by EPV-added (EPVA) per game in 2013-14,minimum 500 touches during season.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 49: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

New metrics for player performance

Shot satisfaction:

Rank Player Satis.

1 Mason Plumlee 0.352 Pablo Prigioni 0.313 Mike Miller 0.274 Andre Drummond 0.265 Brandan Wright 0.246 DeAndre Jordan 0.247 Kyle Korver 0.248 Jose Calderon 0.229 Jodie Meeks 0.22

10 Anthony Tolliver 0.22

Rank Player Satis.

277 Garrett Temple -0.02278 Kevin Garnett -0.02279 Shane Larkin -0.02280 Tayshaun Prince -0.03281 Dennis Schroder -0.04282 LaMarcus Aldridge -0.04283 Ricky Rubio -0.04284 Roy Hibbert -0.05285 Will Bynum -0.05286 Darrell Arthur -0.05

Table : Top 10 and bottom 10 players by shot satisfaction in 2013-14, minimum 500touches during season.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 50: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Acknowledgements and future work

Our EPV framework can be extended to better incorporate unique basketballstrategies:

Additional macrotransitions can be defined, such as pick and rolls, screens,and other set plays.

Use more information in defensive matchups (only defender locations, notidentities, are currently used).

Summarize and aggregate EPV estimates into useful player- or team-specificmetrics.

Thanks to:

Co-authors: Alex D’Amour, Luke Bornn, Kirk Goldsberry.

Colleagues: Alex Franks, Andrew Miller.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 51: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

Acknowledgements and future work

Our EPV framework can be extended to better incorporate unique basketballstrategies:

Additional macrotransitions can be defined, such as pick and rolls, screens,and other set plays.

Use more information in defensive matchups (only defender locations, notidentities, are currently used).

Summarize and aggregate EPV estimates into useful player- or team-specificmetrics.

Thanks to:

Co-authors: Alex D’Amour, Luke Bornn, Kirk Goldsberry.

Colleagues: Alex Franks, Andrew Miller.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015

Page 52: A Multiresolution Stochastic Process Model for Basketball ...dcervone.com/slides/epv_jsm.pdfNBA optical tracking data (x;y) locations for all 10 players (5 on each team) at 25Hz. (x;y;z)

References I

[0] Besag, J. (1974).Spatial interaction and the statistical analysis of lattice systems.Journal of the Royal Statistical Society: Series B (Methodological), 36(2):192–236.

[0] Cox, D. R. (1975).A note on partially bayes inference and the linear model.Biometrika, 62(3):651–654.

[0] Franks, A., Miller, A., Bornn, L., and Goldsberry, K. (2015).Characterizing the spatial structure of defensive skill in professional basketball.Annals of Applied Statistics.

[0] Higdon, D. (2002).Space and space-time modeling using process convolutions.In Quantitative Methods for Current Environmental Issues, pages 37–56. Springer, New York, NY.

[0] Lindgren, F., Rue, H., and Lindstrom, J. (2011).An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach.Journal of the Royal Statistical Society: Series B (Methodological), 73(4):423–498.

[0] Rue, H., Martino, S., and Chopin, N. (2009).Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations.Journal of the Royal Statistical Society: Series B (Methodological), 71(2):319–392.

Dan Cervone (Harvard) Multiresolution Basketball Modeling August 11, 2015