Discriminative Training of Kalman Filters. P. Abbeel, A. Coates, M. Montemerlo, A. Y. Ng, S. Thrun


Upload: rudolf

Post on 25-Jan-2016



TRANSCRIPT

Page 1: Discriminative Training of Kalman Filters P. Abbeel, A. Coates, M. Montemerlo, A. Y. Ng, S. Thrun

Discriminative Training of Kalman Filters
P. Abbeel, A. Coates, M. Montemerlo, A. Y. Ng, S. Thrun

• Kalman filters estimate the state of a dynamical system from inputs and measurements.

• The Kalman filter’s parameters (e.g., state and observation variances) significantly affect its accuracy, and are difficult to choose.

• Current practice: hand-engineer the parameters to get the best possible result.

• We propose to collect ground-truth data and then learn the Kalman filter parameters automatically from the data using discriminative training.

Page 2:

[Figure: estimated trajectories: ground truth, hand-engineered filter, and learned filter.]

Page 3:

Discriminative Training of Kalman Filters

Pieter Abbeel, Adam Coates, Mike Montemerlo, Andrew Y. Ng, Sebastian Thrun

Stanford University

Page 4:

Motivation

• Extended Kalman filters (EKFs) estimate the state of a dynamical system from inputs and measurements.

• For fixed inputs and measurements, the estimated state sequence depends on:

Next-state function.

Measurement function.

Noise model.

Page 5:

Motivation (2)

• Noise terms typically result from a number of different effects:

Mis-modeled system dynamics and measurement dynamics.

The existence of hidden state in the environment not modeled by the EKF.

The discretization of time.

The algorithmic approximations of the EKF itself, such as the Taylor approximation commonly used for linearization.

• In Kalman filters the noise is assumed independent over time; in practice it is often highly correlated.

Page 6:

Motivation (3)

• Common practice: careful hand-engineering of the Kalman filter’s parameters to optimize its performance, which can be very time consuming.

• Proposed solution: automatic learning of the Kalman filter’s parameters from data where (part of) the state-sequence is observed (or very accurately measured).

• In this work: we focus on learning the state and measurement variances, although the principles are more generally applicable.

Page 7:

Example

• Problem: estimate the variance of a GPS unit used to estimate the fixed position x of a robot.

• Standard model: x_measured = x_true + ε, ε ~ N(0, σ²) (Gaussian with mean 0, variance σ²).

• Assuming the noise is independent over time, the variance of the position estimate after n measurements is σ²/n.

• However, if the noise is perfectly correlated over time, the true variance is σ², which is much larger than σ²/n.

• Practical implication: wrongly assuming independence leads to overconfidence in the GPS sensor. This matters not only for the variance, but also for the state estimate when information from multiple sensors is combined.
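The gap between σ²/n and σ² is easy to verify numerically. The sketch below (our own illustration, not from the paper) simulates both noise models for a fixed position:

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma, trials = 100, 3.0, 20000

# Independent noise: each of the n measurements gets a fresh draw,
# so the variance of the averaged estimate shrinks like sigma^2 / n.
indep_means = rng.normal(0.0, sigma, size=(trials, n)).mean(axis=1)

# Perfectly correlated noise: one draw is shared by all n measurements,
# so averaging does not help and the variance stays near sigma^2.
shared = rng.normal(0.0, sigma, size=(trials, 1))
corr_means = np.repeat(shared, n, axis=1).mean(axis=1)

print(indep_means.var())  # close to sigma**2 / n = 0.09
print(corr_means.var())   # close to sigma**2 = 9.0
```

A filter that assumes independence would report the first variance while the second is the truth, which is exactly the overconfidence described above.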

Page 8:

Extended Kalman filter

• State transition equation: x_t = f(x_{t-1}, u_t) + ε_t, ε_t ~ N(0, R) (Gaussian with mean zero, covariance R).

• Measurement equation: z_t = g(x_t) + δ_t, δ_t ~ N(0, Q) (Gaussian with mean zero, covariance Q).

• The extended Kalman filter linearizes the non-linear functions f, g through their first-order Taylor approximations:

f(x_{t-1}, u_t) ≈ f(μ_{t-1}, u_t) + F_t (x_{t-1} - μ_{t-1})

g(x_t) ≈ g(μ_t) + G_t (x_t - μ_t)

Here F_t and G_t are the Jacobian matrices of f and g respectively, evaluated at the filter estimate μ.
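As a quick check of the linearization, the sketch below (illustrative only; the function and values are made up) compares a toy non-linear measurement function against its first-order Taylor approximation, with the Jacobian obtained by finite differences:

```python
import numpy as np

def numeric_jacobian(f, x, eps=1e-6):
    """Finite-difference Jacobian of f evaluated at x."""
    fx = np.atleast_1d(f(x))
    J = np.zeros((fx.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (np.atleast_1d(f(x + dx)) - fx) / eps
    return J

# Toy non-linear measurement function (hypothetical).
g = lambda x: np.array([x[0] ** 2 + np.sin(x[1])])

mu = np.array([1.0, 2.0])    # linearization point (filter estimate)
G = numeric_jacobian(g, mu)  # plays the role of G_t

# Taylor approximation g(x) ≈ g(mu) + G (x - mu), accurate near mu.
x = mu + np.array([0.01, -0.02])
approx = g(mu) + G @ (x - mu)
```

Close to mu the approximation error is second order in ||x - mu||, which is why the EKF's linearized updates work well when the filter estimate tracks the true state.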

Page 9:

Extended Kalman Filter (2)

• For linear systems, the (standard) Kalman filter produces exact updates of expected state and covariance. The extended Kalman filter applies the same updates to the linearized system.

• Prediction update step:

μ̄_t = f(μ_{t-1}, u_t)

Σ̄_t = F_t Σ_{t-1} F_t^T + R

• Measurement update step:

K_t = Σ̄_t G_t^T (G_t Σ̄_t G_t^T + Q)^{-1}

μ_t = μ̄_t + K_t (z_t - g(μ̄_t))

Σ_t = (I - K_t G_t) Σ̄_t
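A minimal sketch of one EKF cycle (our own NumPy illustration, not the authors' implementation):

```python
import numpy as np

def ekf_step(mu, Sigma, u, z, f, g, F, G, R, Q):
    """One EKF cycle: prediction update, then measurement update.

    f, g are the next-state and measurement functions; F, G return
    their Jacobians; R, Q are the noise covariances to be learned."""
    # Prediction update.
    mu_bar = f(mu, u)
    Ft = F(mu, u)
    Sigma_bar = Ft @ Sigma @ Ft.T + R
    # Measurement update.
    Gt = G(mu_bar)
    S = Gt @ Sigma_bar @ Gt.T + Q             # innovation covariance
    K = Sigma_bar @ Gt.T @ np.linalg.inv(S)   # Kalman gain
    mu_new = mu_bar + K @ (z - g(mu_bar))
    Sigma_new = (np.eye(mu.size) - K @ Gt) @ Sigma_bar
    return mu_new, Sigma_new
```

For linear f and g the Jacobians are constant and this reduces to the exact Kalman filter update.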

Page 10:

Discriminative training

• Let y be a subset of the state variables x for which we obtain ground-truth data. E.g., the positions obtained with a very high-end GPS receiver that is not part of the robot system when deployed.

• We use h(.) to denote the projection from x to y:

y = h(x).

• Discriminative training: Given (u_{1:T}, z_{1:T}, y_{1:T}), find the filter parameters that predict y_{1:T} most accurately.

[Note: it is actually sufficient that y is a highly accurate estimate of x. See the paper for details.]

Page 11:

Three discriminative training criteria

• Minimizing the residual prediction error: Σ_t ||y_t - h(μ_t)||².

• Maximizing the prediction likelihood: Σ_t log p(y_t | z_{1:t}, u_{1:t}).

• Maximizing the measurement likelihood: Σ_t log p(z_t | z_{1:t-1}, u_{1:t}).

(The last criterion does not require access to y_{1:T}.)
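The first criterion lends itself to a very simple optimizer. The sketch below (a toy scalar model of our own, not the paper's robot setup) chooses the state and measurement variances by grid search, minimizing residual prediction error against ground truth:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate a scalar random walk with noisy measurements (toy stand-in
# for the robot; true_r, true_q are the unknown noise variances).
T, true_r, true_q = 500, 0.1, 1.0
x = np.cumsum(rng.normal(0.0, np.sqrt(true_r), T))  # ground truth y_t
z = x + rng.normal(0.0, np.sqrt(true_q), T)         # measurements z_t

def filter_rmse(r, q):
    """Run a scalar Kalman filter with variances (r, q); return the
    residual prediction error against the ground truth x."""
    mu, sigma2, err = 0.0, 1.0, 0.0
    for t in range(T):
        sigma2 += r                    # prediction update
        k = sigma2 / (sigma2 + q)      # Kalman gain
        mu += k * (z[t] - mu)          # measurement update
        sigma2 *= (1.0 - k)
        err += (mu - x[t]) ** 2
    return np.sqrt(err / T)

# Grid search over candidate variances; keep the pair with lowest error.
grid = [0.01, 0.03, 0.1, 0.3, 1.0, 3.0]
best = min((filter_rmse(r, q), r, q) for r in grid for q in grid)
```

Swapping the squared residual for the prediction or measurement log-likelihood gives the other two criteria.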

Page 12:

Evaluating the training criteria

• The extended Kalman filter computes p(x_t | z_{1:t}, u_{1:t}) = N(μ_t, Σ_t) for all times t ∈ {1, …, T}.

• Residual prediction error and prediction likelihood can be evaluated directly from the filter’s output.

• Measurement likelihood: by the chain rule, p(z_{1:T} | u_{1:T}) = Π_t p(z_t | z_{1:t-1}, u_{1:t}), and each factor is a Gaussian over the innovation z_t - g(μ̄_t), so this criterion can also be evaluated from the filter's intermediate quantities.

Page 13:

Robot testbed

Page 14:

The robot’s state, inputs and measurements

• State: x, y: position coordinates. θ: heading. v: velocity. b: gyro bias.

• Control inputs: r: rotational velocity. a: forward acceleration.

• Measurements: Optical wheel encoders measure v. A (cheap) GPS unit measures x, y (1 Hz, 3 m accuracy). A magnetic compass measures θ.

Page 15:

System Equations

Page 16:

Experimental setup

• We collected two data sets (100 seconds each) by driving the vehicle around on a grass field. One data set is used to discriminatively learn the parameters, the other data set is used to evaluate the performance of the different algorithms.

• A Novatel RT2 differential GPS unit (10 Hz, 2 cm accuracy) was used to obtain ground-truth position data. Note that the hardware on which our algorithms are evaluated does not include this more accurate GPS.

Page 17:

Experimental results (1)

• Evaluation metrics:

RMS error (on position): sqrt( (1/T) Σ_t ||y_t - h(μ_t)||² )

Prediction log-loss (on position): -(1/T) Σ_t log p(y_t | z_{1:t}, u_{1:t})

Method    | Residual Error | Prediction Likelihood | Measurement Likelihood | CMU hand-tuned
RMS error | 0.26           | 0.29                  | 0.26                   | 0.50
log-loss  | -0.23          | 0.048                 | 40                     | 0.75

Page 18:

Experimental results (2)

[Figure: estimated trajectories: ground truth, CMU hand-engineered, and learned (minimizing residual prediction error). The boxed area is zoomed in on the next figure.]

Page 19:

Experimental results (3)

[Figure (zoomed): ground truth, CMU hand-engineered, and learned (minimizing residual prediction error).]

Page 20:

Experimental Results (4)

[Figure: estimated trajectories: ground truth, CMU hand-engineered, learned (minimizing residual prediction error), learned (maximizing prediction likelihood), learned (maximizing measurement likelihood), and raw (cheap) GPS fixes (x). The boxed area is zoomed in on the next figure.]

Page 21:

Experimental Results (5)

[Figure (zoomed): ground truth, CMU hand-engineered, learned (minimizing residual prediction error), learned (maximizing prediction likelihood), learned (maximizing measurement likelihood), and raw (cheap) GPS fixes (x).]

Page 22:

Related Work

• Conditional Random Fields (J. Lafferty, A. McCallum, F. Pereira, 2001). The optimization criterion (translated to our setting) is to maximize the conditional likelihood of the state sequence given the measurements and inputs.

• Issues for our setting: The criterion does not use filter estimates (it uses smoother estimates instead). The criterion assumes the state is fully observed.

Page 23:

Discussion and conclusion

• In practice Kalman filters often require time-consuming hand-engineering of the noise parameters to get optimal performance.

• We presented a family of algorithms that use a discriminative criterion to learn the noise parameters automatically from data.

• Advantages: Eliminates hand-engineering step.

More accurate state estimation than even a carefully hand-engineered filter.
