Massachusetts Institute of Technology
Stochastic Systems Group
Nonparametric Bayesian Learning of
Switching Dynamical Processes
Emily Fox, Erik Sudderth, Michael Jordan, and Alan Willsky
Nonparametric Bayes Workshop 2008
Helsinki, Finland
Laboratory for Information and Decision Systems
Applications
Priors on Modes
• Switching linear dynamical processes are useful for describing nonlinear phenomena
• Goal: allow uncertainty in the number of dynamical modes
• Utilize a hierarchical Dirichlet process (HDP) prior; cluster based on dynamics
Switching Dynamical Processes
[Diagram: switching dynamical process, with $\theta_k$ = set of dynamic parameters for mode $k$]
Outline
• Background: switching dynamical processes (SLDS, VAR); prior on dynamic parameters; sticky HDP-HMM
• HDP-AR-HMM and HDP-SLDS
• Sampling Techniques
• Results: synthetic data, IBOVESPA stock index, dancing honey bee
Linear Dynamical Systems
• State space LTI model: $x_t = A x_{t-1} + e_t$, $y_t = C x_t + w_t$, with $e_t \sim \mathcal{N}(0, \Sigma)$ and $w_t \sim \mathcal{N}(0, R)$
• Vector autoregressive (VAR) process of order $r$: $y_t = \sum_{i=1}^{r} A_i\, y_{t-i} + e_t$
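As a concrete illustration, here is a minimal numpy sketch simulating both models forward; all parameter values are illustrative assumptions, not values from the talk.

import numpy as np

rng = np.random.default_rng(0)
T = 100

# State space LTI model: x_t = A x_{t-1} + e_t,  y_t = C x_t + w_t
A = np.array([[0.9, 0.1], [0.0, 0.8]])     # illustrative dynamics matrix
C = np.array([[1.0, 0.0]])                 # illustrative observation matrix
Sigma, R = 0.1 * np.eye(2), 0.05 * np.eye(1)
x, y = np.zeros((T, 2)), np.zeros((T, 1))
for t in range(1, T):
    x[t] = A @ x[t - 1] + rng.multivariate_normal(np.zeros(2), Sigma)
    y[t] = C @ x[t] + rng.multivariate_normal(np.zeros(1), R)

# VAR(2) process: y_t = A_1 y_{t-1} + A_2 y_{t-2} + e_t  (scalar example)
a1, a2 = 0.5, 0.3                          # illustrative lag coefficients
v = np.zeros(T)
for t in range(2, T):
    v[t] = a1 * v[t - 1] + a2 * v[t - 2] + 0.1 * rng.standard_normal()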
[Diagram relating state space models and VAR processes]
Switching Dynamical Systems
• Switching linear dynamical system (SLDS): $z_t \sim \pi_{z_{t-1}}$, $x_t = A^{(z_t)} x_{t-1} + e_t(z_t)$, $y_t = C x_t + w_t$
• Switching VAR process: $z_t \sim \pi_{z_{t-1}}$, $y_t = \sum_{i=1}^{r} A_i^{(z_t)}\, y_{t-i} + e_t(z_t)$
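A minimal sketch of simulating a switching VAR(1) process: the discrete mode $z_t$ evolves as a Markov chain and selects the dynamic parameters at each step. The transition matrix and mode-specific dynamics below are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)

# Mode transition distributions pi (row j = pi_j) -- illustrative
pi = np.array([[0.95, 0.05],
               [0.05, 0.95]])
# Mode-specific VAR(1) dynamics A^{(k)} and noise scales -- illustrative
A_k = [np.array([[0.9, 0.0], [0.0, 0.9]]),
       np.array([[0.0, -0.9], [0.9, 0.0]])]
sigma_k = [0.05, 0.05]

T = 200
z = np.zeros(T, dtype=int)
y = np.zeros((T, 2))
for t in range(1, T):
    z[t] = rng.choice(2, p=pi[z[t - 1]])          # z_t ~ pi_{z_{t-1}}
    y[t] = A_k[z[t]] @ y[t - 1] + sigma_k[z[t]] * rng.standard_normal(2)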
Prior on Dynamic Parameters
• Group all observations assigned to mode $k$ and define the mode-specific matrices $Y^{(k)}$ (stacked observations) and $\bar{Y}^{(k)}$ (stacked lagged observations)
• Rewrite the VAR process in matrix form: $Y^{(k)} = A^{(k)} \bar{Y}^{(k)} + E^{(k)}$
• Place a matrix-normal inverse-Wishart prior on $\{A^{(k)}, \Sigma^{(k)}\}$
• Results in $K$ decoupled linear regression problems
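To make the decoupling concrete, here is a sketch that stacks the mode-specific matrices and fits each $A^{(k)}$ by least squares. The talk instead samples $\{A^{(k)}, \Sigma^{(k)}\}$ from the MNIW posterior, which is built from the same sufficient statistics $Y^{(k)}\bar{Y}^{(k)T}$ and $\bar{Y}^{(k)}\bar{Y}^{(k)T}$.

import numpy as np

def mode_specific_regressions(y, z, K):
    """Fit Y^{(k)} = A^{(k)} Ybar^{(k)} + E^{(k)} separately for each mode k.
    y: T x d observations; z: length-T mode assignments."""
    A_hat = []
    for k in range(K):
        idx = np.where(z[1:] == k)[0] + 1   # times t >= 1 assigned to mode k
        Y = y[idx].T                        # d x n_k matrix of observations
        Ybar = y[idx - 1].T                 # d x n_k matrix of lagged observations
        # Least-squares solution; the MNIW posterior uses Y Ybar^T, Ybar Ybar^T
        A_hat.append(Y @ Ybar.T @ np.linalg.pinv(Ybar @ Ybar.T))
    return A_hat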
Sticky HDP-HMM
• Dirichlet process (DP): mode space of unbounded size; model complexity adapts to observations
• Hierarchical: ties mode transition distributions; shared sparsity
• Sticky: self-transition bias parameter
[Figure: inferred mode sequence over time]
Infinite HMM: Beal et al., NIPS 2002; HDP-HMM: Teh et al., JASA 2006; sticky HDP-HMM: Fox et al., ICML 2008
Sticky HDP-HMM
• Global transition distribution: $\beta \mid \gamma \sim \mathrm{GEM}(\gamma)$
• Mode-specific transition distributions: $\pi_k \mid \alpha, \kappa, \beta \sim \mathrm{DP}\big(\alpha + \kappa,\ \frac{\alpha\beta + \kappa\delta_k}{\alpha + \kappa}\big)$
The sparsity of $\beta$ is shared across modes, and $\kappa$ gives increased probability of self-transition.
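Under a finite weak-limit approximation with truncation level $L$ (introduced later in the sampler), a draw from the sticky prior reduces to ordinary Dirichlet draws. A sketch; the function name is my own:

import numpy as np

rng = np.random.default_rng(2)

def sample_sticky_prior(gamma, alpha, kappa, L):
    # Global transition distribution: beta ~ Dir(gamma/L, ..., gamma/L)
    beta = rng.dirichlet(np.full(L, gamma / L))
    # Mode-specific rows: pi_k ~ Dir(alpha*beta + kappa*delta_k)
    pi = np.zeros((L, L))
    for k in range(L):
        conc = alpha * beta
        conc[k] += kappa                    # sticky self-transition bias
        pi[k] = rng.dirichlet(conc)
    return beta, pi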
HDP-AR-HMM and HDP-SLDS
[Graphical models: HDP-AR-HMM and HDP-SLDS]
Blocked Gibbs Sampler
Sample parameters
• Approximate the HDP by truncating the stick-breaking construction (weak limit approximation): $\beta \mid \gamma \sim \mathrm{Dir}(\gamma/L, \ldots, \gamma/L)$
• Sample the transition distributions $\pi_k$ from their Dirichlet posteriors given the mode-transition counts
• Sample the dynamic parameters from their MNIW posteriors, using the state sequence as VAR(1) pseudo-observations
Fox et al., ICML 2008
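Given the current mode sequence, each transition row is conjugate, so its posterior just adds the observed transition counts to the sticky Dirichlet prior. A sketch (the resampling of the global $\beta$, which requires auxiliary variables, is omitted):

import numpy as np

rng = np.random.default_rng(3)

def resample_transitions(z, beta, alpha, kappa, L):
    # Transition counts n[j, k] = #{t : z_{t-1} = j, z_t = k}
    n = np.zeros((L, L))
    for j, k in zip(z[:-1], z[1:]):
        n[j, k] += 1
    # pi_j | z, beta ~ Dir(alpha*beta + kappa*delta_j + n_j)
    pi = np.zeros((L, L))
    for j in range(L):
        conc = alpha * beta + n[j]
        conc[j] += kappa
        pi[j] = rng.dirichlet(conc)
    return pi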
Blocked Gibbs Sampler
Sample mode sequence
• Use the state sequence $x_{1:T}$ as pseudo-observations of an HMM
• Compute backwards messages: $m_{t+1,t}(z_t) \propto \sum_{z_{t+1}} \pi_{z_t}(z_{t+1})\,\mathcal{N}\big(x_{t+1};\, A^{(z_{t+1})} x_t,\, \Sigma^{(z_{t+1})}\big)\, m_{t+2,t+1}(z_{t+1})$
• Block sample the mode sequence forwards: $z_t \sim p(z_t \mid z_{t-1}, x_{1:T})$
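A sketch of this blocked update for a generic discrete mode sequence. The likelihoods lik[t, k] = $\mathcal{N}(x_t; A^{(k)} x_{t-1}, \Sigma^{(k)})$ are assumed precomputed, and a uniform initial mode distribution is an assumption:

import numpy as np

rng = np.random.default_rng(4)

def block_sample_modes(lik, pi):
    """lik: T x K pseudo-observation likelihoods; pi: K x K transition matrix."""
    T, K = lik.shape
    # Backwards messages m[t, k] proportional to p(pseudo-obs after t | z_t = k)
    m = np.ones((T, K))
    for t in range(T - 2, -1, -1):
        m[t] = pi @ (lik[t + 1] * m[t + 1])
        m[t] /= m[t].sum()                   # normalize for numerical stability
    # Forward block sampling: z_t ~ p(z_t | z_{t-1}, x_{1:T})
    z = np.zeros(T, dtype=int)
    p = lik[0] * m[0]                        # uniform initial distribution assumed
    z[0] = rng.choice(K, p=p / p.sum())
    for t in range(1, T):
        p = pi[z[t - 1]] * lik[t] * m[t]
        z[t] = rng.choice(K, p=p / p.sum())
    return z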
Blocked Gibbs Sampler
Sample state sequence
• Conditioned on the mode sequence, the model is equivalent to an LDS with time-varying dynamic parameters
• Compute backwards messages with a backwards information filter
• Block sample the state sequence forwards: $x_t \sim p(x_t \mid x_{t-1}, y_{1:T}, z_{1:T})$
• All of these distributions are Gaussian
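The slides compute backwards messages with an information filter and then sample forwards; an equivalent and perhaps more familiar route is a forward Kalman filter with backward sampling (Carter and Kohn). A sketch for the time-varying LDS induced by the sampled mode sequence; the diffuse initial prior is an assumption:

import numpy as np

rng = np.random.default_rng(5)

def ffbs_states(y, z, A_k, Sigma_k, C, R):
    """Block sample x_{1:T} for an LDS whose dynamics A^{(z_t)}, Sigma^{(z_t)}
    switch with the sampled mode sequence z.  y: T x obs_dim observations."""
    T, d = len(y), C.shape[1]
    m, P = np.zeros((T, d)), np.zeros((T, d, d))
    mp, Pp = np.zeros(d), 10.0 * np.eye(d)          # diffuse initial prior
    for t in range(T):                              # forward Kalman filter
        if t > 0:
            A, S = A_k[z[t]], Sigma_k[z[t]]
            mp, Pp = A @ m[t - 1], A @ P[t - 1] @ A.T + S
        G = Pp @ C.T @ np.linalg.inv(C @ Pp @ C.T + R)   # Kalman gain
        m[t] = mp + G @ (y[t] - C @ mp)
        P[t] = Pp - G @ C @ Pp
    x = np.zeros((T, d))                            # backward sampling
    x[T - 1] = rng.multivariate_normal(m[T - 1], P[T - 1])
    for t in range(T - 2, -1, -1):
        A, S = A_k[z[t + 1]], Sigma_k[z[t + 1]]
        J = P[t] @ A.T @ np.linalg.inv(A @ P[t] @ A.T + S)
        mean = m[t] + J @ (x[t + 1] - A @ m[t])
        cov = P[t] - J @ A @ P[t]
        x[t] = rng.multivariate_normal(mean, 0.5 * (cov + cov.T))
    return x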
Hyperparameters
• Place priors on the hyperparameters and learn them from the data
• Weakly informative priors suffice; the hyperparameters can be set using the data
• All results use the same settings
Results: Synthetic VAR(1)
[Figure: performance on 5-mode VAR(1) data for the HDP-VAR(1)-HMM, HDP-VAR(2)-HMM, HDP-SLDS, and HDP-HMM]
Results: Synthetic AR(2)
[Figure: performance on 3-mode AR(2) data for the HDP-VAR(1)-HMM, HDP-VAR(2)-HMM, HDP-SLDS, and HDP-HMM]
Results: Synthetic SLDS
[Figure: performance on 3-mode SLDS data for the HDP-VAR(1)-HMM, HDP-VAR(2)-HMM, HDP-SLDS, and HDP-HMM]
Results: IBOVESPA
• Data: daily returns of the São Paulo IBOVESPA stock index
• Goal: detect changes in volatility
• Compare inferred change-points to 10 cited world events
[Figure: daily returns with inferred change-points; ROC curves comparing the sticky and non-sticky HDP-SLDS]
Carvalho and Lopes, Comp. Stat. & Data Anal., 2006
Results: Dancing Honey Bee
• Six bee dance sequences with expert-labeled dances: turn right (green), waggle (red), turn left (blue)
[Figure: six expert-labeled bee dance sequences; Oh et al., IJCV, 2007]
• Observation vector: head angle (cos, sin) and x-y body position
Movie: Sequence 6
Results: Dancing Honey Bee
Nonparametric approach:
• Model: HDP-VAR(1)-HMM
• Set hyperparameters
• Unsupervised training from each sequence
• Infer: number of modes, dynamic parameters, and mode sequence
Supervised approach [Oh:07]:
• Model: SLDS
• Set number of modes to 3
• Leave-one-out training: fixed label sequences on 5 of 6 sequences
• Data-driven MCMC: use learned cues (e.g., head angle) to propose mode sequences
Oh et al., IJCV, 2007
Results: Dancing Honey Bee
Sequence 4: HDP-AR-HMM 83.2% | SLDS [Oh] 93.4%
Sequence 5: HDP-AR-HMM 93.2% | SLDS [Oh] 90.2%
Sequence 6: HDP-AR-HMM 88.7% | SLDS [Oh] 90.4%
Results: Dancing Honey Bee
Sequence 1: HDP-AR-HMM 46.5% | SLDS [Oh] 74.0%
Sequence 2: HDP-AR-HMM 44.1% | SLDS [Oh] 86.1%
Sequence 3: HDP-AR-HMM 45.6% | SLDS [Oh] 81.3%
Conclusion
• Examined the HDP as a prior for nonparametric Bayesian learning of SLDS and switching VAR processes
• Presented an efficient blocked Gibbs sampler
• Demonstrated utility on simulated and real datasets