francesco borrelli email: fborrelli@berkeley.edu ... · berkeley autonomous 1/10 race car project...

Post on 19-Jul-2020

4 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 1

Iterative Learning Model Predictive Control

Francesco BorrelliEmail: fborrelli@berkeley.edu

University of CaliforniaBerkeley, USA

www.mpc.berkeley.edu

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 2

Acknowledgements

Learning MPC: Ugo Rosolia

Adaptive MPC: Monimoy Bujarbaruah, George Xiaojing Zhang

Autonomous Drift: Edo Jelavick, George Xiaojing Zhang

Analog Optimization: Sergey Vicky

Autonomous Drift: Edo Jelavick, Yuri Glauthier

Analog Optimization: Sergey Vicky

Connected Cars: Jacopo Guanetti, Jongsang Suh, Roya Firoozi, Yeojun Kim , Eric Choi

BARC: Jon Gonzales, Tony Zeng, Charlott Vallon

Research Sponsors:

Hyundai Corporation

Ford Research Labs, Siemens, Mobis, Komatzu

National Science Foundation

Office of Naval Research

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 3

Iterative Learning Model Predictive Control

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 4

Iterative Learning Model Predictive Control

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 5

Now Available on Amazon

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 6

Constrained Infinite-Time Optimal Control

“Solved” as..

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 7

Repeated Solution of Constrained Finite Time Optimal Control

Predictive Controller:

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 8

Repeated Solution of Constrained Finite Time Optimal Control

Approximates the `tail' of the cost Approximates the `tail' of the constraints

N constrained by computation and forecast uncertaintyRobust and stochastic versions subject of current research

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 9

Repeated Solution of Constrained Finite Time Optimal Control

Predictive Controller:

Predictive Control: Theory & Computation

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 10

Repeated Solution of Constrained Finite Time Optimal Control

Predictive Controller:

Predictive Control Classical Theory

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 11

Terminal cost: Control Lyapunov function

Terminal constraint set: Control Invariant set

Predictive Control Theory: Sufficient conditions to guarantee

Convergence to the desired equilibrium point/regionConstraint satisfaction at all times

Control Invariant Set

Control Lyapunov Function

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 12

Repeated Solution of Constrained Finite Time Optimal Control

Predictive Controller:

Predictive Control Computation

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 13

Offline π(∙) and Online π(x) Computation

Option 1 (Offline Based): “Complex” Offline, “Simple” Online

π0(∙) often piecewise constant or affine disturbance feedback

Dynamic Programing is one choice

Sampling model reduction/aggregation required for n>5

Option 2 (Online Based): “Simple” Offline, “Complex” Online

Compute on-line π 0(x(t)) with a “sophisticated” algorithm

Interior point method solver is one choice

Convexification required for real-time embedded control

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 14

Online Based

Excellent, (non-) convex open-source solvers

Tailored solvers for embedded linear and nonlinear MPC

Offline Based

For linear and piecewise linear systems: explicit MPC

Mixing pre-computation and online-optimization

Suboptimal MPC

Fast Online Implementation on embedded FPGA, GPU

Analog MPC: microsecond sampling time

Major effort over the past 20 years for enlarging MPC application domain A very biased story

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 15

Iterative Learning Model Predictive Control

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 16

Three Forms of Learning1 - Skill acquisition

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 17

Three Forms of Learning2 - Performance Improvement

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 18

Three Forms of Learning3 - Computation Load Reduction

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 19

Three Forms of Learning.Practice in order to:

Learning from demonstrationTransfer learning Learning from simulations

Iterative Learning Computational load reduction of control policy

Acquire a Skill Improve Performance Reduce Computational Load

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 20

Three Forms of Learning.Practice in order to:

Learning from demonstrationTransfer learning Learning from simulations

Iterative Learning Computational reduction of control policy

Acquire a Skill Improve Performance Reduce Computational Load

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 21

Learning MPC Applied to Robo-Cars (instead of robo-soccer players..)

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 22

Autonomous Cars @MPC Lab

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 23

Autonomous Vehicles- Motion Control Through:

Acceleration, Braking, SteeringAlso:

4 braking torquesGear RatioEngine torque + front and rear distribution4 dampers for active suspensions

Hyundai Genesis G70

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 24

Static Nonlinearities: Tires

Inequality Constraints: Safety region

Uncertain Tire Model, Road Friction, Obstacles

Nonlinear Dynamical System

Useful Model Abstraction

and

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 25

Tires and RoadSimplified Nonlinear Model

Slip Angle

Lateral ForceDry Asphalt

Snow

Ice

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 28

Berkeley Autonomous 1/10 Race Car Projectwww.barc-project.com

Complete Open SourceUbuntu, RoS, OpenCV, Julia, IPOPTCamera, IMU, Ultrasounds, LIDARCloud-Based

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 30

Three Forms of Learning1 - Skill acquisition

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 31

Three Forms of Learning2 - Performance Improvement

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 32

Three Forms of Learning3 - Computation Load Reduction

Average CPU Load at each iteration

Lap Time at each iteration

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 34

Three Forms of Learning

How we do this?

Model Predictive ControlA Simple Idea (which exploits the iterative nature of the tasks)

Important Design Steps

Acquire a Skill Improve Performance Reduce Computational Load

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 35

Iterative Learning Model Predictive Control

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 36

One task execution referred to as “iteration” or “episode”Same initial and terminal state at each iterationNotation:

Iterative Tasks - Problem Setup

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 37

Iterative Tasks - Problem Setup

One task execution referred to as “iteration” or “episode”Same initial and terminal state at each iterationNotation:

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 38

Iterative Tasks - Problem Setup

One task execution referred to as “iteration” or “episode”Same initial and terminal state at each iterationNotation:

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 39

Goal

Safety guarantees:

Constraint satisfaction at iteration j → satisfaction at iteration j+1

Performance improvement guarantees:

Closed loop cost at iteration j+1 ≤cost at iteration j

Learned fromdata

Iterative Learning MPCIncorporating data in advanced model based controller

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 40

Learned fromdata

Learning MPCIncorporating data in advance model based controller

Simplification (general case later)

Known/nominal model

Infinite Horizon Task

Uncertainty and model adaptation later (and at this conference)

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 43

Learning Model Predictive Control (LMPC)

• Recursive feasibility• Iterative feasibility

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 44

Iteration 0

Assume at iteration 0 the closed-loop trajectory is feasible

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 45

Iteration 0

Assume at iteration 0 the closed-loop trajectory is feasible

Fact

is a control invariant

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 46

Iteration 1, Step 0

Use SS0 as terminal set at Iteration 1

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 47

Iteration 1, Step 0

Use SS0 as terminal set at Iteration 1

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 48

Iteration 1, Step 1

Use SS0 as terminal set at Iteration 1

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 49

Iteration 1, Step 1

Use SS0 as terminal set at Iteration 1

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 50

Iteration 1, Step 2

Use SS0 as terminal set at Iteration 1

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 51

Iteration 1, Step 2

Use SS0 as terminal set at Iteration 1

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 52

Iteration 1, Step 3

Use SS0 as terminal set at Iteration 1

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 53

Iteration 1, Step 4

Use SS0 as terminal set at Iteration 1

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 54

Iteration 2 Safe Set

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 55

Constructing the terminal set

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 56

Terminal Set : Convex all of Sample Safe Set

for Constrained Linear Dynamical Systems is a Control Invariant Set

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 57

Learning Model Predictive Control (LMPC)

• Convergence• Performance

improvement• Local optimality

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 58

Terminal Cost at Iteration 0

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 59

Terminal Cost at Iteration 0

A control Lyapunov “function”

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 60

Terminal Cost at the j-th iteration

Define

Compute terminal cost as

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 61

Terminal Cost: Barycentric Approximation of Q()

Control Lyapunov Function

(for Constrained Linear Dynamical Systems)

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 62

ILMPC Summary

MPC strategy:

Optimize over inputs and lambdasFor constrained linear systems

Safety guarantees:- Constraint satisfaction at iteration j=> satisfaction at iteration j+1

Performance improvement guarantees:- Closed loop cost at iteration j >= cost at iteration j+1

Convergence to global optimal solutionConstraint qualification conditions required for cost decrease

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 63

Performance Improvement Proof

Conjecture

Notation

Closed-loop state and input trajectory at iteration j

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 64

Step 1:

Performance Improvement Proof

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 65

Performance Improvement Proof

Step 1:

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 66

Performance Improvement Proof

Step 1:

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 67

Performance Improvement Proof

Step 1:

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 68

Performance Improvement Proof

Step 1:

Step 2:

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 69

Performance Improvement Proof

Step 1:

Step 2:

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 70

Performance Improvement Proof

Step 1:

Step 2:

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 71

Performance Improvement Proof

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 72

Iterative Learning MPC

Optimize over inputs and lambdas

Simple proofs

For constrained linear systems

Safety and Performance improvement guarantees

Convergence to global optimal solution (for linear

Constraint qualification conditions required for cost decrease

If full column rank, improvement cannot be obtained

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 73

Control objective

System dynamicsSystem constraints

Starting Position

Constrained LQR Example

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 74

Control objective

System dynamicsSystem constraints

Terminal Constraint

Initial Condition

Iterative LMPC with horizon N=2

Will not work!

Will work if one sets N=3

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 75

Comparison with R.L.??

RL term too broadTwo good references:

Bertsekas paper connecting MPC and ADP*Lewis and Vrabile survey on CSM**Recht survey (section 6): https://arxiv.org/abs/1806.09460

ILMPC highlightsContinuous state formulationConstraints satisfaction and Sampled Safe SetsQ-function constructed (learned) locally based on cost/model

driven exploration and past trailsQ-function at stored state is “exact” and lowerbounds property

at intermediate points (for convex problems)*Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC**Reinforcement Learning and Adaptive Dynamic Programming for Feedback Control

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 76

About Model Learning in Racing

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 77

Autonomous Racing Control Problem

Start & end position

System dynamicsSystem constraints

Obstacle avoidance

Control objective

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 78

Learning Model Predictive Control (LMPC)

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 80

Learning Process

The lap time decreases until the LMPC converges to a set of trajectories

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 82

Learning Model Predictive Control (LMPC)

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 83

Nonlinear Dynamical System

Useful Vehicle Model Abstraction

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 84

Nonlinear Dynamical System

Useful Vehicle Model Abstraction

Kinematic Equations

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 85

Nonlinear Dynamical System

Useful Vehicle Model Abstraction

Kinematic Equations

Identifying the Dynamical System

Linearization around predicted trajectory

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 86

Nonlinear Dynamical System

Useful Vehicle Model Abstraction

Kinematic Equations

Dynamic Equations

Identifying the Dynamical SystemLocal Linear Regression

Linearization around predicted trajectory

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 87

Useful Vehicle Model Abstraction

Identifying the Dynamical System

Important Design Steps1. Compute trajectory to linearize around uses previous optimal inputs and

inputs in the safe set2. Enforce model-based sparsity in local linear regression

Local Linear Regression

Linearization around predicted trajectory

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 88

Nonlinear Dynamical System

Useful Vehicle Model Abstraction

The velocity update is not affected by Position and Acceleration command

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 89

Useful Vehicle Model Abstraction

Identifying the Dynamical System

Important Design Steps1. Compute trajectory to linearize aroundpusing previous optimal inputs

and inputs in the safe set2. Enforce model-based sparsity in local linear regression3. Use data close to current state trajectory for parameter ID4. Use kernel K() to weight differently data as a function of distance to

linearized trajectory

Local Linear Regression

Linearization around predicted trajectory

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 90

Accelerations

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 91

Results

Gain from steering to lateral velocity

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 95

About Model Learning Ball in Cup

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 96

Ball in a Cup System with MuJoCo

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 97

Ball in a Cup Control Problem

Start & end position

System dynamicsSystem constraints

Obstacle avoidance

Control objective

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 98

Learning Model Predictive Control (LMPC)

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 99

Useful Mujoco Model Abstraction

Identifying the Dynamical SystemLocal Linear Regression

Linearization around predicted trajectory

Important Design Steps1. Compute trajectory to linearize aroundpusing previous optimal inputs

and inputs in the safe set2. Enforce model-based sparsity in local linear regression3. Use data close to current state trajectory for parameter ID4. Use kernel K() to weight differently data as a function of distance to

linearized trajectory

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 100

Ball in a Cup System

At iteration 0 find a sequence by sampling parametrized inputs profiles (takes 5mins)

Use ILMPC: At iteration 1, time reduced of 10%, cup height movement reduced of 35%

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 101

Back to our main chart..

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 102

Three Forms of Learning

Skill acquisition Performance improvement

How we do this?

Model Predictive Control +A Simple Idea +Good Practices

Reduce load forRoutine Execution

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 103

Offline π(∙) and Online π(x) Computation

Option 1 (Offline Based): “Complex” Offline, “Simple” Online

π(∙) often Piecewise Constant (except special classes)

Dynamic Programing is one choice

Basic rule: n>5 impossible

Option 2 (Online Based): “Simple” Offline, “Complex” Online

Compute on-line π(x) with a “sophisticated” algorithm

Interior point method solver is one choice

Basic Rule: avoid use `home-made’ solvers

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 104

One Simple Way: Data-Based Policy for π(∙)

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 105

One Simple Way: Data-Based Policy for π(∙)

Historical dataof converged iterations

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 107

Three Forms of Learning3 - Computation Load Reduction

Average CPU Load at each iteration

Lap Time at each iteration

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 108

Experimental Results

Factor of 10

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 109

Data Based Policy: Alternatives

Nearest NeighborTrain ReLU Neural NetworkLocal Explicit MPC

All Continuous Piecewise Affine Policies

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 110

Learned fromdata

Learning MPCIncorporating data in advance model based controller

In Practice

Noise and model uncertainty: Robust case

What about noise and model uncertainty?

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 111

ILMPC – Robust and Adaptive design

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 112

At Iteration 0

Linear System

Terminal Goal Set

Successful Iteration

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 113

At Iteration 1

CVX hull is not a robust invariant!

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 114

ILMPC – Robust and Adaptive design

Robust invariants

“Robustify” Q-function (and dualize for computational efficiency)

Chance constraints

See my group papers at this conference if interested..

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 115

For Iterative tasks I discussed

By using Iterative learning MPC, i.e.

Model Predictive ControlA Simple Idea (which exploits the iterative nature of the tasks)

A Few Important Design Steps

How to obtain performance improvement and reduced computational load while satisfying constraints

Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 116

The End

top related