francesco borrelli email: [email protected] ... · berkeley autonomous 1/10 race car project...
TRANSCRIPT
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 1
Iterative Learning Model Predictive Control
Francesco BorrelliEmail: [email protected]
University of CaliforniaBerkeley, USA
www.mpc.berkeley.edu
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 2
Acknowledgements
Learning MPC: Ugo Rosolia
Adaptive MPC: Monimoy Bujarbaruah, George Xiaojing Zhang
Autonomous Drift: Edo Jelavick, George Xiaojing Zhang
Analog Optimization: Sergey Vicky
Autonomous Drift: Edo Jelavick, Yuri Glauthier
Analog Optimization: Sergey Vicky
Connected Cars: Jacopo Guanetti, Jongsang Suh, Roya Firoozi, Yeojun Kim , Eric Choi
BARC: Jon Gonzales, Tony Zeng, Charlott Vallon
Research Sponsors:
Hyundai Corporation
Ford Research Labs, Siemens, Mobis, Komatzu
National Science Foundation
Office of Naval Research
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 3
Iterative Learning Model Predictive Control
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 4
Iterative Learning Model Predictive Control
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 5
Now Available on Amazon
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 6
Constrained Infinite-Time Optimal Control
“Solved” as..
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 7
Repeated Solution of Constrained Finite Time Optimal Control
Predictive Controller:
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 8
Repeated Solution of Constrained Finite Time Optimal Control
Approximates the `tail' of the cost Approximates the `tail' of the constraints
N constrained by computation and forecast uncertaintyRobust and stochastic versions subject of current research
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 9
Repeated Solution of Constrained Finite Time Optimal Control
Predictive Controller:
Predictive Control: Theory & Computation
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 10
Repeated Solution of Constrained Finite Time Optimal Control
Predictive Controller:
Predictive Control Classical Theory
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 11
Terminal cost: Control Lyapunov function
Terminal constraint set: Control Invariant set
Predictive Control Theory: Sufficient conditions to guarantee
Convergence to the desired equilibrium point/regionConstraint satisfaction at all times
Control Invariant Set
Control Lyapunov Function
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 12
Repeated Solution of Constrained Finite Time Optimal Control
Predictive Controller:
Predictive Control Computation
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 13
Offline π(∙) and Online π(x) Computation
Option 1 (Offline Based): “Complex” Offline, “Simple” Online
π0(∙) often piecewise constant or affine disturbance feedback
Dynamic Programing is one choice
Sampling model reduction/aggregation required for n>5
Option 2 (Online Based): “Simple” Offline, “Complex” Online
Compute on-line π 0(x(t)) with a “sophisticated” algorithm
Interior point method solver is one choice
Convexification required for real-time embedded control
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 14
Online Based
Excellent, (non-) convex open-source solvers
Tailored solvers for embedded linear and nonlinear MPC
Offline Based
For linear and piecewise linear systems: explicit MPC
Mixing pre-computation and online-optimization
Suboptimal MPC
Fast Online Implementation on embedded FPGA, GPU
Analog MPC: microsecond sampling time
Major effort over the past 20 years for enlarging MPC application domain A very biased story
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 15
Iterative Learning Model Predictive Control
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 16
Three Forms of Learning1 - Skill acquisition
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 17
Three Forms of Learning2 - Performance Improvement
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 18
Three Forms of Learning3 - Computation Load Reduction
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 19
Three Forms of Learning.Practice in order to:
Learning from demonstrationTransfer learning Learning from simulations
Iterative Learning Computational load reduction of control policy
Acquire a Skill Improve Performance Reduce Computational Load
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 20
Three Forms of Learning.Practice in order to:
Learning from demonstrationTransfer learning Learning from simulations
Iterative Learning Computational reduction of control policy
Acquire a Skill Improve Performance Reduce Computational Load
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 21
Learning MPC Applied to Robo-Cars (instead of robo-soccer players..)
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 22
Autonomous Cars @MPC Lab
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 23
Autonomous Vehicles- Motion Control Through:
Acceleration, Braking, SteeringAlso:
4 braking torquesGear RatioEngine torque + front and rear distribution4 dampers for active suspensions
Hyundai Genesis G70
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 24
Static Nonlinearities: Tires
Inequality Constraints: Safety region
Uncertain Tire Model, Road Friction, Obstacles
Nonlinear Dynamical System
Useful Model Abstraction
and
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 25
Tires and RoadSimplified Nonlinear Model
Slip Angle
Lateral ForceDry Asphalt
Snow
Ice
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 28
Berkeley Autonomous 1/10 Race Car Projectwww.barc-project.com
Complete Open SourceUbuntu, RoS, OpenCV, Julia, IPOPTCamera, IMU, Ultrasounds, LIDARCloud-Based
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 30
Three Forms of Learning1 - Skill acquisition
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 31
Three Forms of Learning2 - Performance Improvement
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 32
Three Forms of Learning3 - Computation Load Reduction
Average CPU Load at each iteration
Lap Time at each iteration
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 34
Three Forms of Learning
How we do this?
Model Predictive ControlA Simple Idea (which exploits the iterative nature of the tasks)
Important Design Steps
Acquire a Skill Improve Performance Reduce Computational Load
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 35
Iterative Learning Model Predictive Control
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 36
One task execution referred to as “iteration” or “episode”Same initial and terminal state at each iterationNotation:
Iterative Tasks - Problem Setup
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 37
Iterative Tasks - Problem Setup
One task execution referred to as “iteration” or “episode”Same initial and terminal state at each iterationNotation:
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 38
Iterative Tasks - Problem Setup
One task execution referred to as “iteration” or “episode”Same initial and terminal state at each iterationNotation:
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 39
Goal
Safety guarantees:
Constraint satisfaction at iteration j → satisfaction at iteration j+1
Performance improvement guarantees:
Closed loop cost at iteration j+1 ≤cost at iteration j
Learned fromdata
Iterative Learning MPCIncorporating data in advanced model based controller
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 40
Learned fromdata
Learning MPCIncorporating data in advance model based controller
Simplification (general case later)
Known/nominal model
Infinite Horizon Task
Uncertainty and model adaptation later (and at this conference)
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 43
Learning Model Predictive Control (LMPC)
• Recursive feasibility• Iterative feasibility
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 44
Iteration 0
Assume at iteration 0 the closed-loop trajectory is feasible
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 45
Iteration 0
Assume at iteration 0 the closed-loop trajectory is feasible
Fact
is a control invariant
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 46
Iteration 1, Step 0
Use SS0 as terminal set at Iteration 1
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 47
Iteration 1, Step 0
Use SS0 as terminal set at Iteration 1
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 48
Iteration 1, Step 1
Use SS0 as terminal set at Iteration 1
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 49
Iteration 1, Step 1
Use SS0 as terminal set at Iteration 1
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 50
Iteration 1, Step 2
Use SS0 as terminal set at Iteration 1
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 51
Iteration 1, Step 2
Use SS0 as terminal set at Iteration 1
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 52
Iteration 1, Step 3
Use SS0 as terminal set at Iteration 1
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 53
Iteration 1, Step 4
Use SS0 as terminal set at Iteration 1
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 54
Iteration 2 Safe Set
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 55
Constructing the terminal set
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 56
Terminal Set : Convex all of Sample Safe Set
for Constrained Linear Dynamical Systems is a Control Invariant Set
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 57
Learning Model Predictive Control (LMPC)
• Convergence• Performance
improvement• Local optimality
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 58
Terminal Cost at Iteration 0
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 59
Terminal Cost at Iteration 0
A control Lyapunov “function”
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 60
Terminal Cost at the j-th iteration
≡
Define
Compute terminal cost as
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 61
Terminal Cost: Barycentric Approximation of Q()
Control Lyapunov Function
(for Constrained Linear Dynamical Systems)
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 62
ILMPC Summary
MPC strategy:
Optimize over inputs and lambdasFor constrained linear systems
Safety guarantees:- Constraint satisfaction at iteration j=> satisfaction at iteration j+1
Performance improvement guarantees:- Closed loop cost at iteration j >= cost at iteration j+1
Convergence to global optimal solutionConstraint qualification conditions required for cost decrease
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 63
Performance Improvement Proof
Conjecture
Notation
Closed-loop state and input trajectory at iteration j
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 64
Step 1:
Performance Improvement Proof
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 65
Performance Improvement Proof
Step 1:
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 66
Performance Improvement Proof
Step 1:
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 67
Performance Improvement Proof
Step 1:
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 68
Performance Improvement Proof
Step 1:
Step 2:
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 69
Performance Improvement Proof
Step 1:
Step 2:
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 70
Performance Improvement Proof
Step 1:
Step 2:
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 71
Performance Improvement Proof
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 72
Iterative Learning MPC
Optimize over inputs and lambdas
Simple proofs
For constrained linear systems
Safety and Performance improvement guarantees
Convergence to global optimal solution (for linear
Constraint qualification conditions required for cost decrease
If full column rank, improvement cannot be obtained
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 73
Control objective
System dynamicsSystem constraints
Starting Position
Constrained LQR Example
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 74
Control objective
System dynamicsSystem constraints
Terminal Constraint
Initial Condition
Iterative LMPC with horizon N=2
Will not work!
Will work if one sets N=3
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 75
Comparison with R.L.??
RL term too broadTwo good references:
Bertsekas paper connecting MPC and ADP*Lewis and Vrabile survey on CSM**Recht survey (section 6): https://arxiv.org/abs/1806.09460
ILMPC highlightsContinuous state formulationConstraints satisfaction and Sampled Safe SetsQ-function constructed (learned) locally based on cost/model
driven exploration and past trailsQ-function at stored state is “exact” and lowerbounds property
at intermediate points (for convex problems)*Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC**Reinforcement Learning and Adaptive Dynamic Programming for Feedback Control
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 76
About Model Learning in Racing
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 77
Autonomous Racing Control Problem
Start & end position
System dynamicsSystem constraints
Obstacle avoidance
Control objective
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 78
Learning Model Predictive Control (LMPC)
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 80
Learning Process
The lap time decreases until the LMPC converges to a set of trajectories
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 82
Learning Model Predictive Control (LMPC)
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 83
Nonlinear Dynamical System
Useful Vehicle Model Abstraction
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 84
Nonlinear Dynamical System
Useful Vehicle Model Abstraction
Kinematic Equations
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 85
Nonlinear Dynamical System
Useful Vehicle Model Abstraction
Kinematic Equations
Identifying the Dynamical System
Linearization around predicted trajectory
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 86
Nonlinear Dynamical System
Useful Vehicle Model Abstraction
Kinematic Equations
Dynamic Equations
Identifying the Dynamical SystemLocal Linear Regression
Linearization around predicted trajectory
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 87
Useful Vehicle Model Abstraction
Identifying the Dynamical System
Important Design Steps1. Compute trajectory to linearize around uses previous optimal inputs and
inputs in the safe set2. Enforce model-based sparsity in local linear regression
Local Linear Regression
Linearization around predicted trajectory
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 88
Nonlinear Dynamical System
Useful Vehicle Model Abstraction
The velocity update is not affected by Position and Acceleration command
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 89
Useful Vehicle Model Abstraction
Identifying the Dynamical System
Important Design Steps1. Compute trajectory to linearize aroundpusing previous optimal inputs
and inputs in the safe set2. Enforce model-based sparsity in local linear regression3. Use data close to current state trajectory for parameter ID4. Use kernel K() to weight differently data as a function of distance to
linearized trajectory
Local Linear Regression
Linearization around predicted trajectory
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 90
Accelerations
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 91
Results
Gain from steering to lateral velocity
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 95
About Model Learning Ball in Cup
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 96
Ball in a Cup System with MuJoCo
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 97
Ball in a Cup Control Problem
Start & end position
System dynamicsSystem constraints
Obstacle avoidance
Control objective
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 98
Learning Model Predictive Control (LMPC)
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 99
Useful Mujoco Model Abstraction
Identifying the Dynamical SystemLocal Linear Regression
Linearization around predicted trajectory
Important Design Steps1. Compute trajectory to linearize aroundpusing previous optimal inputs
and inputs in the safe set2. Enforce model-based sparsity in local linear regression3. Use data close to current state trajectory for parameter ID4. Use kernel K() to weight differently data as a function of distance to
linearized trajectory
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 100
Ball in a Cup System
At iteration 0 find a sequence by sampling parametrized inputs profiles (takes 5mins)
Use ILMPC: At iteration 1, time reduced of 10%, cup height movement reduced of 35%
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 101
Back to our main chart..
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 102
Three Forms of Learning
Skill acquisition Performance improvement
How we do this?
Model Predictive Control +A Simple Idea +Good Practices
Reduce load forRoutine Execution
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 103
Offline π(∙) and Online π(x) Computation
Option 1 (Offline Based): “Complex” Offline, “Simple” Online
π(∙) often Piecewise Constant (except special classes)
Dynamic Programing is one choice
Basic rule: n>5 impossible
Option 2 (Online Based): “Simple” Offline, “Complex” Online
Compute on-line π(x) with a “sophisticated” algorithm
Interior point method solver is one choice
Basic Rule: avoid use `home-made’ solvers
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 104
One Simple Way: Data-Based Policy for π(∙)
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 105
One Simple Way: Data-Based Policy for π(∙)
Historical dataof converged iterations
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 107
Three Forms of Learning3 - Computation Load Reduction
Average CPU Load at each iteration
Lap Time at each iteration
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 108
Experimental Results
Factor of 10
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 109
Data Based Policy: Alternatives
Nearest NeighborTrain ReLU Neural NetworkLocal Explicit MPC
All Continuous Piecewise Affine Policies
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 110
Learned fromdata
Learning MPCIncorporating data in advance model based controller
In Practice
Noise and model uncertainty: Robust case
What about noise and model uncertainty?
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 111
ILMPC – Robust and Adaptive design
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 112
At Iteration 0
Linear System
Terminal Goal Set
Successful Iteration
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 113
At Iteration 1
CVX hull is not a robust invariant!
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 114
ILMPC – Robust and Adaptive design
Robust invariants
“Robustify” Q-function (and dualize for computational efficiency)
Chance constraints
See my group papers at this conference if interested..
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 115
For Iterative tasks I discussed
By using Iterative learning MPC, i.e.
Model Predictive ControlA Simple Idea (which exploits the iterative nature of the tasks)
A Few Important Design Steps
How to obtain performance improvement and reduced computational load while satisfying constraints
Borrelli (UC Berkeley) Iterative Learning MPC 2018 CDC– Slide 116
The End