TRANSCRIPT
CS 188: Artificial Intelligence
Spring 2010
Advanced Applications:
Robotics
Pieter Abbeel – UC Berkeley
A few slides from Sebastian Thrun, Dan Klein
Announcements
• Project 5 due Thursday: Classification!
• Contest!
  • Tournaments every night.
  • Final tournament: we will use submissions received by Thursday, May 6, 11pm.
Estimation: Laplace Smoothing
• Laplace's estimate (extended): pretend you saw every outcome k extra times
  P_LAP,k(x) = (c(x) + k) / (N + k|X|)
• What's Laplace with k = 0?
• k is the strength of the prior
• Laplace for conditionals: smooth each conditional independently:
  P_LAP,k(x|y) = (c(x, y) + k) / (c(y) + k|X|)
Example observations: H H T
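The estimate the slide describes can be written out directly. Below is a minimal Python sketch of Laplace smoothing on the coin example (H, H, T); the function name and interface are ours, not CS 188 project code:

```python
from collections import Counter

def laplace_estimate(samples, domain, k=1):
    """Laplace-smoothed estimate: pretend each outcome was seen k extra times."""
    counts = Counter(samples)
    n = len(samples)
    return {x: (counts[x] + k) / (n + k * len(domain)) for x in domain}

# Coin example from the slide: observations H, H, T
probs = laplace_estimate(["H", "H", "T"], domain=["H", "T"], k=1)

# k = 0 recovers the (unsmoothed) maximum-likelihood estimate
mle = laplace_estimate(["H", "H", "T"], domain=["H", "T"], k=0)
```

With k = 1, P(H) = (2 + 1)/(3 + 2) = 3/5; larger k pulls the estimate toward uniform, which is why k acts as the strength of the prior.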
So Far: Foundational Methods
Now: Advanced Applications
Robotic Control Tasks
• Perception / tracking
  • Where exactly am I?
  • What's around me?
• Low-level control
  • How to move the robot and/or objects from position A to position B?
• High-level control
  • What are my goals?
  • What are the optimal high-level actions?
Robot folds towels
• [pile of 5 video]
[Maitin-Shepard, Cusumano-Towner, Lei & Abbeel, 2010]
Low-Level Planning
• Low-level: move from configuration A to configuration B
A Simple Robot Arm
• Configuration space
  • What are the natural coordinates for specifying the robot's configuration?
  • These are the configuration space coordinates
  • Can't necessarily control all degrees of freedom directly
• Work space
  • What are the natural coordinates for specifying the effector tip's position?
  • These are the work space coordinates
Coordinate Systems
• Workspace
  • The world's (x, y) system
  • Obstacles specified here
• Configuration space
  • The robot's state
  • Planning happens here
  • Obstacles can be projected to here
Obstacles in C-Space
• What / where are the obstacles?
• Remaining space is free space
Two-link manipulator
[Figure: two-link manipulator with joint angles α1 and α2, link lengths d1 and d2, and end effector at (x, y) in the X-Y workspace]
Example Obstacles in C-Space
Two-link manipulator
• Demo: http://www-inst.eecs.berkeley.edu/~cs188/fa08/demos/robot.html
Probabilistic Roadmaps
• Idea: sample random points as nodes in a visibility graph
• This gives probabilistic roadmaps (PRMs)
• Very successful in practice
• Lets you add points where you need them
• With insufficient points, paths can be incomplete or weird
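The PRM idea above fits in a few dozen lines. Here is a toy Python sketch in the unit square with one disc obstacle; the sampling count, connection radius, and obstacle are illustrative assumptions, not part of the lecture. It returns the length of the shortest roadmap path, or None when too few samples leave the roadmap disconnected, the failure mode the last bullet warns about:

```python
import math, random, heapq

def prm_plan(start, goal, n_samples=200, radius=0.3, seed=0,
             obstacle=((0.5, 0.5), 0.2)):
    """Minimal probabilistic roadmap in the unit square with one disc obstacle.
    Assumes start and goal are themselves collision-free."""
    rng = random.Random(seed)
    (ox, oy), orad = obstacle

    def free(p):
        return math.hypot(p[0] - ox, p[1] - oy) > orad

    def visible(p, q):
        # sample the straight segment p-q and check it against the obstacle
        return all(free((p[0] + t/20 * (q[0]-p[0]), p[1] + t/20 * (q[1]-p[1])))
                   for t in range(21))

    # nodes: start (index 0), goal (index 1), then random free samples
    nodes = [start, goal] + [(rng.random(), rng.random()) for _ in range(n_samples)]
    nodes = [p for p in nodes if free(p)]
    edges = {i: [] for i in range(len(nodes))}
    for i, p in enumerate(nodes):
        for j in range(i + 1, len(nodes)):
            q = nodes[j]
            d = math.hypot(p[0] - q[0], p[1] - q[1])
            if d < radius and visible(p, q):
                edges[i].append((j, d))
                edges[j].append((i, d))

    # Dijkstra from start (node 0) to goal (node 1)
    dist = {0: 0.0}
    frontier = [(0.0, 0)]
    while frontier:
        d, u = heapq.heappop(frontier)
        if u == 1:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, w in edges[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(frontier, (d + w, v))
    return None  # roadmap disconnected: the "insufficient points" case

length = prm_plan((0.1, 0.1), (0.9, 0.9))
```

The returned path must be longer than the straight-line distance here, since the straight line passes through the obstacle.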
Perception
1. Find a point seen in two camera views
2. Find its 3D coordinates as the intersection of the two rays
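Step 2 can be sketched as the midpoint of closest approach between the two back-projected rays (in practice the rays rarely intersect exactly due to noise). This assumes calibrated cameras with known centers and ray directions; the calibration itself is glossed over, as a later slide notes:

```python
import numpy as np

def triangulate(c1, d1, c2, d2):
    """Midpoint of closest approach between two camera rays.
    c1, c2: camera centers; d1, d2: ray directions (need not be unit length)."""
    c1, d1 = np.asarray(c1, float), np.asarray(d1, float)
    c2, d2 = np.asarray(c2, float), np.asarray(d2, float)
    # find scalars t1, t2 minimizing |(c1 + t1*d1) - (c2 + t2*d2)|^2
    A = np.stack([d1, -d2], axis=1)            # 3x2 system A @ [t1, t2] = c2 - c1
    t1, t2 = np.linalg.lstsq(A, c2 - c1, rcond=None)[0]
    return 0.5 * ((c1 + t1 * d1) + (c2 + t2 * d2))

# Two cameras 1 m apart, both seeing a point at (0.5, 0, 2)
p = triangulate([0, 0, 0], [0.5, 0, 2], [1, 0, 0], [-0.5, 0, 2])
```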
Glanced over
• Calibration of camera and robot
• Recognition of corners
• More generally: visual feedback during all manipulations
• How should we move the corners such that we obtain the desired result?
Now: Advanced Applications
Motivating Example
• How do we specify a task like this?
[demo: autorotate / tictoc]
Autonomous Helicopter Flight
• Control inputs:
  • a_lon: main rotor longitudinal cyclic pitch control (affects pitch rate)
  • a_lat: main rotor lateral cyclic pitch control (affects roll rate)
  • a_coll: main rotor collective pitch (affects main rotor thrust)
  • a_rud: tail rotor collective pitch (affects tail rotor thrust)
Autonomous Helicopter Setup
[Setup diagram: on-board inertial measurement unit (IMU); position measurements; controls sent out to the helicopter]
HMM for Tracking the Helicopter
• The tracking model is an HMM over the helicopter's continuous state; with linear-Gaussian dynamics and observations, this is a Kalman filter (KF).
Helicopter MDP
• State:
• Actions (control inputs):
  • a_lon: main rotor longitudinal cyclic pitch control (affects pitch rate)
  • a_lat: main rotor lateral cyclic pitch control (affects roll rate)
  • a_coll: main rotor collective pitch (affects main rotor thrust)
  • a_rud: tail rotor collective pitch (affects tail rotor thrust)
• Transitions (dynamics): s_{t+1} = f(s_t, a_t) + w_t
  [f encodes the helicopter dynamics; w_t is a probabilistic noise model]
• Can we solve the MDP yet?
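A rollout of the transition model s_{t+1} = f(s_t, a_t) + w_t can be sketched as below; `f`, `policy`, and the Gaussian noise model are placeholders for illustration, not the learned helicopter dynamics:

```python
import random

def simulate(f, policy, s0, horizon, noise_std=0.01, seed=0):
    """Roll out s_{t+1} = f(s_t, a_t) + w_t with Gaussian process noise w_t.
    f maps (state, action) to a next-state estimate; policy maps state to action."""
    rng = random.Random(seed)
    s, trajectory = list(s0), [list(s0)]
    for _ in range(horizon):
        a = policy(s)
        # apply the dynamics, then add per-dimension noise
        s = [fi + rng.gauss(0.0, noise_std) for fi in f(s, a)]
        trajectory.append(list(s))
    return trajectory

# Toy 1-D example: state decays toward the action value (noise turned off)
traj = simulate(lambda s, a: [0.9 * s[0] + 0.1 * a[0]],
                lambda s: [0.0], [1.0], horizon=50, noise_std=0.0)
```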
Problem: What’s the Reward?
• Rewards for hovering:
• Rewards for "Tic-Toc"?
  • Problem: what's the target trajectory?
  • Just write it down by hand?
[demo: hover]
[demo: bad]
Helicopter Apprenticeship?
[demo: unaligned]
Probabilistic Alignment
• The intended trajectory satisfies the dynamics.
• Each expert observation is a noisy observation of one of the hidden states.
• But we don't know exactly which one.
[Figure: hidden intended trajectory; expert demonstrations as noisy observations; time indices linking observations to hidden states]
[Coates, Abbeel & Ng, 2008]
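Coates, Abbeel & Ng use a probabilistic (EM-style) alignment; a simpler stand-in that conveys the idea of matching unknown time indices is dynamic time warping. The sketch below is ours, not the paper's method:

```python
def align_cost(ref, obs):
    """Dynamic-time-warping cost: best monotone matching of obs onto ref.
    A simple stand-in for the probabilistic alignment in the paper."""
    n, m = len(ref), len(obs)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = abs(ref[i - 1] - obs[j - 1])
            # each obs index maps to a ref index, monotonically in time
            D[i][j] = c + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

# A time-shifted copy of the same maneuver aligns at zero cost
ref = [0, 1, 2, 3, 2, 1, 0]
obs = [0, 0, 1, 2, 3, 2, 1, 0]
cost = align_cost(ref, obs)
```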
Alignment of Samples
• Result: the inferred sequence is much cleaner!
[demo: alignment]
Final Behavior
[demo: airshow]
Now: Advanced Applications
Quadruped
[Kolter, Abbeel & Ng, 2008]
• Reward function trades off 25 features: R(x) = w⊤φ(x)
Find the Reward Function for Foot Placements
• R(x) = w⊤φ(x)
• Find the reward function that makes the demonstrations better than all other paths by some margin:
min over w, ξ:   ‖w‖₂² + Σ_i ξ^(i)
s.t. ∀i, ∀x:   Σ_t w⊤φ(x_t^(i)) ≥ Σ_t w⊤φ(x_t) + 1 − ξ^(i)
• Demonstrate a path across the "training terrain"
• Run apprenticeship learning to find a set of weights w
• Receive a "testing terrain" (a height map)
• Find a policy for crossing the testing terrain
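The max-margin program above can be approximated with a plain subgradient method on the hinged constraints. This toy Python sketch uses made-up two-dimensional terrain features and only a handful of alternative paths; it is an illustration, not the solver from the paper:

```python
def learn_reward_weights(demo_features, alt_features, iters=500, lr=0.1, lam=0.01):
    """Max-margin apprenticeship sketch: find w so the demonstration's
    cumulative features beat each alternative's by a margin of 1."""
    dim = len(demo_features)
    w = [0.0] * dim
    for _ in range(iters):
        # subgradient of  lam*||w||^2 + sum_i max(0, 1 + w.phi(alt_i) - w.phi(demo))
        grad = [2 * lam * wi for wi in w]
        for phi in alt_features:
            violation = sum(wi * (p - d) for wi, p, d in zip(w, phi, demo_features))
            if violation + 1 > 0:  # constraint not yet satisfied by margin 1
                for k in range(dim):
                    grad[k] += phi[k] - demo_features[k]
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Toy summed features per path: [total height variation, path length]
demo = [0.2, 1.0]                 # demonstration: smooth and short
alts = [[0.9, 1.1], [0.5, 1.8]]   # alternatives: rougher or longer
w = learn_reward_weights(demo, alts)
```

The learned weights are negative on both features, so the demonstrated path scores higher than either alternative, as the margin constraints require.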
High-Level Control
Without learning
With learned reward function
Now: Advanced Applications
Autonomous Vehicles
Autonomous vehicle slides adapted from Sebastian Thrun
Grand Challenge: Barstow, CA, to Primm, NV
• 150-mile off-road robot race across the Mojave desert
• Natural and man-made hazards
• No driver, no remote control
• No dynamic passing
Inside an Autonomous Car
[Labeled hardware: camera, 5 lasers, radar, GPS, GPS compass, IMU, E-stop, 6 computers, steering motor, control screen]
Sensors: Laser Readings
Readings: No Obstacles
30
Readings: Obstacles
[Figure: height difference ΔZ between nearby laser returns]
Raw measurements: 12.6% false positives
Obstacle Detection
Trigger if |z_i − z_j| > 15 cm for nearby beams z_i, z_j
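The trigger rule is a one-line comparison over nearby beams. A minimal Python sketch (the window size and the example readings are illustrative):

```python
def detect_obstacles(z, threshold=0.15, window=1):
    """Flag beam i if some nearby beam j has |z_i - z_j| > threshold (15 cm).
    z: vertical height readings in meters, ordered by beam index;
    'window' is an illustrative notion of 'nearby'."""
    flagged = set()
    for i in range(len(z)):
        for j in range(max(0, i - window), min(len(z), i + window + 1)):
            if i != j and abs(z[i] - z[j]) > threshold:
                flagged.add(i)
    return sorted(flagged)

# Flat ground with a 0.4 m step between beams 2 and 3
hits = detect_obstacles([0.0, 0.01, 0.02, 0.42, 0.43])
```

On noisy real data this raw test fires spuriously, which is the 12.6% false-positive rate quoted above.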
31
HMMs for Detection
[Figure: HMM with hidden states x_t, x_{t+1}, x_{t+2} and observations z_t, z_{t+1}, z_{t+2}; GPS and IMU readings at each time step]
• Probabilistic error model
• Raw measurements: 12.6% false positives
• HMM inference: 0.02% false positives
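Smoothing the raw triggers over time is what buys the drop from 12.6% to 0.02% false positives: an isolated spurious trigger barely moves the belief, while a persistent signature does. Here is a forward-algorithm sketch over a two-state HMM (free vs. obstacle); all probability values are illustrative assumptions, not tuned values from the race vehicle:

```python
def hmm_filter(flags, p_stay=0.99, p_hit=0.9, p_fa=0.126, prior=0.01):
    """Forward algorithm over a two-state HMM (free / obstacle).
    flags: raw per-timestep triggers (0/1); returns P(obstacle | evidence so far).
    Parameter values are illustrative assumptions."""
    b_free, b_obs = 1.0 - prior, prior
    posteriors = []
    for f in flags:
        # time update: states mostly persist
        pred_free = p_stay * b_free + (1 - p_stay) * b_obs
        pred_obs = p_stay * b_obs + (1 - p_stay) * b_free
        # measurement update: obstacles trigger with p_hit, free space with p_fa
        b_free = pred_free * (p_fa if f else 1 - p_fa)
        b_obs = pred_obs * (p_hit if f else 1 - p_hit)
        norm = b_free + b_obs
        b_free, b_obs = b_free / norm, b_obs / norm
        posteriors.append(b_obs)
    return posteriors

spurious = hmm_filter([0, 0, 1, 0, 0])    # one isolated false alarm
persistent = hmm_filter([1, 1, 1, 1, 1])  # a sustained obstacle signature
```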