fast variational bayesian linear state-space...

Fast Variational Bayesian

Linear State-Space Model

Jaakko Luttinen

Aalto University

Department of Information and Computer Science

jaakko.luttinen@aalto.�

September 25, 2013

Introduction

I Model: Gaussian linear state-space model

I Problem: Variational Bayesian (VB) estimation is slow

I Solution: Use parameter expansion

Linear state-space model

I Observations:

yn = Cxn + noise .

I Dynamics in the latent space:

xn = Axn−1 + noise .

I xn, A and C are unknown and estimated using the data.

Variational Bayesian learning

I The variables are given Gaussian priors (hyperparameters are

ignored for simplicity)

I Approximate the posterior as

q(x1, . . . , xN) · q(A) · q(C)

I Update the variables one at a time

Simple demonstration

I 1-dimensional model y = cx + noise

I Approximate VB posterior:

p(c , x |y) ≈ q(c) · q(x)

Initialization

Iteration 1 - Update q(c)

Iteration 1 - Update q(x)

Problem: Slow convergence

I How to avoid zigzagging?

I Does the convergence matter in practice?

Solution: Parameter expansion

I Find a parameterization related to the coupling

I Joint optimize the variables using the parameterization

I Known as parameter expansion

I The states can be rotated by compensating it in the loadings:

yn = Cxn = CR−1Rxn ,

thus rotate as C→ CR−1 and xn → Rxn.

I Keep the dynamics of the latent states una�ected:

Rxn = RAR−1Rxn−1 ,

thus rotate as A→ RAR−1.

I The states can be rotated by compensating it in the loadings:

yn = Cxn = CR−1Rxn ,

thus rotate as C→ CR−1 and xn → Rxn.

I Keep the dynamics of the latent states una�ected:

Rxn = RAR−1Rxn−1 ,

thus rotate as A→ RAR−1.

Simple demonstration

I 1-dimensional model y = cx + noise

I Approximate VB posterior:

p(c , x |y) ≈ q(c) · q(x)

I Rotate as x → Rx and c → c/R

Initialization

crotation curvex→Rx and c→c/R

Iteration 1 - Rotation

Arti�cial experiment

I 400 observations with 30 dimensions

I 8-dimensional latent space

I Performance as a function of VB iterations (log-scale):

100 101 102 103 104

6.51e3

BaselineRotations

100 101 102 103 104

4.4 BaselineRotations

(a) VB lower bound (b) Test RMSE

Weather data experiment

I 89202 observations with 66 dimensions

I 10-dimensional latent space

I Performance as a function of VB iterations (log-scale):

100 101 102 1034.0

BaselineRotations

100 101 102 103

10 BaselineRotations

(a) VB lower bound (b) Test RMSE

Summary

Rotation speeds up VB learning by orders of magnitude

Code and data available at

http://users.ics.aalto.fi/jluttine/ecml2013

fast variational bayesian linear state-space...

Documents

variational algorithms for approximate bayesian inference

life after the em algorithm: the variational approximation...

variational bayesian line spectral estimation with multiple...

variational bayesian learning of ica with missing data

bayesian learning and variational inference · for...

parallelization of variational inference for bayesian

variational algorithms for approximate bayesian...

variational bayesian image processing on stochastic ...

variational bayesian mixed-effects inference for ... ·...

variational bayesian inference for parametric and...

skew-normal variational approximations for bayesian...

efﬁcient variational bayesian approximation method based...

variational bayesian phd ﬁlter with deep learning network...

variational bayesian inference - kay brodersen ·...

variational algorithms for approximate bayesian...

skew-normal variational approximations for bayesian...

the variational approximation for bayesian...

vine: a variational inference-based bayesian neural ... ·...

regularized variational bayesian learning of echo state...

, number 1, pp. 1–44 variational bayesian learning of...