Robust Belief-based Execution of Manipulation Programs
Kaijen Hsiao, Tomás Lozano-Pérez, Leslie Pack Kaelbling
MIT CSAIL
Achieving Goals under Uncertainty
Two kinds of uncertainty:
• current state:
  • need to plan in information space
• results of future actions:
  • search branches on outcomes as well as actions
Choice of action must be dependent on current information state
Discrete POMDP Formulation
• states
• actions
• observations
• transition model
• observation model
• reward
[Figure: block diagram of the Controller, containing a state estimator (SE), and the Environment; labels: belief, action, sensing]
POMDP Controller
• State estimation is discrete Bayesian filter
• Policy maps belief states to actions
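The state-estimation step can be illustrated with a minimal sketch of one discrete Bayes filter update. The function name `bayes_update`, the nested-list model layout, and the two-state numbers are assumptions made for illustration; they are not taken from the talk.

```python
def bayes_update(belief, action, observation, transition, obs_model):
    """One step of a discrete Bayes filter (illustrative layout, not the authors' code).

    belief:      list of probabilities over states
    transition:  transition[action][s][s2] = P(s2 | s, action)
    obs_model:   obs_model[action][s2][observation] = P(observation | s2, action)
    """
    n = len(belief)
    # Prediction: push the belief through the transition model.
    predicted = [
        sum(belief[s] * transition[action][s][s2] for s in range(n))
        for s2 in range(n)
    ]
    # Correction: weight each state by the likelihood of the observation.
    unnormalized = [
        predicted[s2] * obs_model[action][s2][observation] for s2 in range(n)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [p / total for p in unnormalized]


# Hypothetical two-state example: probing does not move the object.
transition = {"probe": [[1.0, 0.0], [0.0, 1.0]]}
obs_model = {"probe": [{"contact": 0.9, "none": 0.1},
                       {"contact": 0.2, "none": 0.8}]}
posterior = bayes_update([0.5, 0.5], "probe", "contact", transition, obs_model)
print(posterior)
```

A policy in this setting would take `posterior` as its input and return the next action.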
Action selection in POMDPs
• Off-line optimal policy generation
  • Intractable for large spaces
• On-line search: finite-depth expansion of belief-space tree from current belief state to select single action
  • Tractable in broad subclass of problems
Challenges for action selection
• Continuous state spaces
• Requirement to select action for any belief state
• Long horizon
• Action branching factor
• Outcome branching factor
• Computationally complex observation and transition models
Grasping in uncluttered environments
Points of leverage:
• Robot pose is approximately observable
• Robot dynamics are nearly deterministic
• Bounded uncertainty over unobserved object parameters
• Room to maneuver
Online belief-space search
Continuous state space: discretize object state space
Discretize object configuration space
[Figure panels: workspace, configuration space, belief state]
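As a rough sketch of what discretizing the object configuration space could look like, the snippet below builds a regular grid over object pose (x, y, theta) and an initial belief over its cells. The grid extents, step sizes, the independent-Gaussian prior, and the helper names `make_grid` and `gaussian_belief` are all assumptions for illustration; the talk does not specify them.

```python
import itertools
import math


def make_grid(half_width, step):
    """Evenly spaced values from -half_width to +half_width (hypothetical helper)."""
    n = int(round(half_width / step))
    return [i * step for i in range(-n, n + 1)]


def gaussian_belief(cells, stds):
    """Normalized zero-mean Gaussian weight for each grid cell (assumed prior shape)."""
    weights = [
        math.exp(-0.5 * sum((c / s) ** 2 for c, s in zip(cell, stds)))
        for cell in cells
    ]
    total = sum(weights)
    return [w / total for w in weights]


xs = make_grid(5.0, 1.0)        # cm, hypothetical range and resolution
ys = make_grid(5.0, 1.0)        # cm
thetas = make_grid(30.0, 10.0)  # deg
cells = list(itertools.product(xs, ys, thetas))
belief = gaussian_belief(cells, stds=(5.0, 5.0, 30.0))

most_likely = cells[belief.index(max(belief))]
print(len(cells), most_likely)
```

The belief state is then simply one probability per grid cell, which is the form a discrete Bayesian filter operates on.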
Online belief-space search
Continuous state space: discretize object state space
Action for any belief: search forward from current belief state
Search forward from current belief
• Low entropy belief states enable reliable grasp
• Use entropy as static evaluation function at leaves
• Actions can be useful for information gathering
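To make the entropy-based leaf evaluation concrete, here is a minimal sketch that scores an action by the expected entropy of the belief after one observation. It assumes the object does not move during the action, and the likelihood tables are made-up numbers; neither the function names nor the simplification come from the talk.

```python
import math


def entropy(belief):
    """Shannon entropy (bits) of a discrete belief."""
    return -sum(p * math.log2(p) for p in belief if p > 0.0)


def expected_entropy(belief, obs_likelihood):
    """Expected posterior entropy after one action.

    obs_likelihood[s][o] = P(o | state s) for the action being evaluated.
    Assumes a static object, which is a simplification.
    """
    total = 0.0
    for o in range(len(obs_likelihood[0])):
        joint = [belief[s] * obs_likelihood[s][o] for s in range(len(belief))]
        p_o = sum(joint)
        if p_o > 0.0:
            posterior = [j / p_o for j in joint]
            total += p_o * entropy(posterior)
    return total


belief = [0.25, 0.25, 0.25, 0.25]
# Hypothetical actions: one splits the states in half, one tells us nothing.
informative = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
uninformative = [[0.5, 0.5]] * 4

print(entropy(belief))
print(expected_entropy(belief, informative))
print(expected_entropy(belief, uninformative))
```

An action with lower expected entropy is a better information-gathering move under this evaluation, even if it does not itself achieve the grasp.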
Online belief-space search
Continuous state space: discretize object state space
Action for any belief: search forward from current belief state
Long horizon: use temporally extended actions
Use temporally extended actions
• Primitive actions become entire trajectories
• Reduces horizon
• Observations at end of trajectory
Online belief-space search
Continuous state space: discretize object state space
Action for any belief: search forward from current belief state
Long horizon: use temporally extended actions
Large action branching factor: parameterize small set of action types by current belief
Parameterize actions with belief
Actions are entire world-relative trajectories
In current belief state,
• execute with respect to most likely object configuration
• terminate on contact or end of trajectory
Online belief-space search
Continuous state space: discretize object state space
Action for any belief: search forward from current belief state
Long horizon: use temporally extended actions
Large action branching factor: parameterize small set of action types by current belief
Computationally complex observation and transition models: precompute models
Precompute models
Execute WRT (world-relative trajectory)
• with respect to estimated state e
• in world state w
Expected observation, transition
Based on geometric simulation
Online belief-space search
Continuous state space: discretize object state space
Action for any belief: search forward from current belief state
Long horizon: use temporally extended actions
Large action branching factor: parameterize small set of action types by current belief
Computationally complex observation and transition models: precompute models
Large observation branching factor: canonicalize observations for each discrete state and action
Canonicalize observations
Any (e, w) pair with same relative transformation has same world-relative outcomes and observations
• Only sample for one e with w varying within initial range of uncertainty
Cluster observations and represent each bin of object configurations by a single representative one
• Only branch on canonical observations
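The claim that only the relative transformation between e and w matters can be sketched for planar poses (x, y, theta). The helpers below are an illustration under that planar assumption; `canonical_key` and its bin resolution are hypothetical, meant only to show how a precomputed model might be indexed.

```python
import math


def relative_pose(e, w):
    """Pose of w expressed in the frame of e; poses are (x, y, theta in radians)."""
    ex, ey, et = e
    wx, wy, wt = w
    dx, dy = wx - ex, wy - ey
    c, s = math.cos(et), math.sin(et)
    # Rotate the world-frame offset into e's frame and wrap the angle difference.
    return (c * dx + s * dy,
            -s * dx + c * dy,
            math.atan2(math.sin(wt - et), math.cos(wt - et)))


def compose(a, b):
    """World-frame pose obtained by applying pose b, given in a's frame, to a."""
    ax, ay, at = a
    bx, by, bt = b
    c, s = math.cos(at), math.sin(at)
    return (ax + c * bx - s * by, ay + s * bx + c * by, at + bt)


def canonical_key(e, w, resolution=(0.5, 0.5, math.radians(5))):
    """Hypothetical lookup key: the relative pose binned at an assumed resolution."""
    return tuple(round(v / r) for v, r in zip(relative_pose(e, w), resolution))


offset = (1.0, 0.5, 0.2)        # same relative transformation for both pairs
e1 = (0.0, 0.0, 0.0)
e2 = (3.0, -2.0, 0.7)
w1 = compose(e1, offset)
w2 = compose(e2, offset)

rel1 = relative_pose(e1, w1)
rel2 = relative_pose(e2, w2)
print(rel1, rel2)
print(canonical_key(e1, w1) == canonical_key(e2, w2))
```

Because both pairs map to the same key, a model sampled for one estimated state e (with w varying) could be reused for every other e.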
Algorithm
Off-line:
• plan WRTs for grasping and info gathering
• compute models
On-line:
• while current belief state doesn’t satisfy goal
  • compute expected info gain of each WRT
  • execute best WRT until termination
  • use observation to update current belief
  • return to initial pose
• execute final grasp trajectory
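The on-line loop above might be sketched as follows. This is a toy version under several assumptions: the goal test is an entropy threshold standing in for the real goal condition, each WRT is reduced to a fixed observation-likelihood table rather than being re-parameterized by the most likely state, and `run_online`, `execute`, and the two probe actions are hypothetical names.

```python
import math


def entropy(belief):
    """Shannon entropy (bits) of a discrete belief."""
    return -sum(p * math.log2(p) for p in belief if p > 0.0)


def update(belief, likelihood, obs):
    """Bayes update for a static object; likelihood[s][obs] = P(obs | s)."""
    joint = [b * likelihood[s][obs] for s, b in enumerate(belief)]
    total = sum(joint)
    return [j / total for j in joint]


def expected_info_gain(belief, likelihood):
    """Current entropy minus expected posterior entropy (1-step lookahead)."""
    gain = entropy(belief)
    for obs in range(len(likelihood[0])):
        p_obs = sum(b * likelihood[s][obs] for s, b in enumerate(belief))
        if p_obs > 0.0:
            gain -= p_obs * entropy(update(belief, likelihood, obs))
    return gain


def run_online(belief, wrts, execute, goal_entropy=0.1, max_steps=10):
    """Repeat: pick the most informative WRT, execute it, update the belief."""
    steps = []
    while entropy(belief) > goal_entropy and len(steps) < max_steps:
        name = max(wrts, key=lambda k: expected_info_gain(belief, wrts[k]))
        obs = execute(name)              # runs until contact or end of trajectory
        belief = update(belief, wrts[name], obs)
        steps.append(name)               # a real robot would return to its initial pose here
    return belief, steps                 # the final grasp trajectory would be executed next


# Hypothetical 4-state world with two noise-free probing trajectories.
wrts = {
    "probe_x": [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]],
    "probe_y": [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]],
}
true_state = 2
final_belief, steps = run_online(
    [0.25] * 4, wrts, execute=lambda name: wrts[name][true_state].index(1.0)
)
print(steps, final_belief)
```

In this toy run the loop needs one probe along each axis before the belief collapses onto the true state.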
Application to grasping with simulated robot arm
Initial conditions (ultimately from vision)
• Object shape is roughly known (contacted vertices should be within ~1 cm of actual positions)
• Object is on table and pose (x, y, rotation) is roughly known (center of mass std ~5 cm, 30 deg)
Achieve specific grasp of object
Observations
Fingertips: 6-axis force/torque sensors
• position
• normal
Additional contact sensors:
• just contact
Swept non-colliding path rules out poses that would have generated contact
Grasping a Box
Most likely robot-relative position
Where it actually is
Initial belief state
Summed over theta
Tried to move down; finger hit corner
Probability of contact observation at each location
Updated belief
Re-centered
Trying again, with new belief
Back up
Try again
Final state and observation
Grasp
Observation probabilities
Updated belief state: Success!
Goal: variance < 1 cm x, 15 cm y, 6 deg theta
What if Y coord of grasp matters?
Need explicit information gathering
Simulation Experiments
Methods tested:
• Single open-loop execution of goal-achieving WRT with respect to the most likely state
• Repeated execution of goal-achieving WRT with respect to the most likely state
• Online selection of information-gathering and goal-achieving grasps (1-step lookahead)
Box experiments
Allowed variation in goal grasp: 1 cm, 1 cm, 5 deg
Initial uncertainty: 5 cm, 5 cm, 30 deg
[Bar chart: percent grasped correctly (0 to 100) for open loop, repeated WRT, and repeated WRT with info-grasp]
Cup experiments
Goal: 1 cm x, 1 cm y, rotation doesn’t matter (no info-grasps used)
Start uncertainty: 30 deg theta (x, y varies)
[Chart: percent grasped correctly (0 to 100) vs. uncertainty std in x, y (cm) at 1, 3, and 5, with uncertainty increasing along the axis; series: open loop, repeated WRT]
Grasping a Brita Pitcher
Target grasp:
Put one finger through the handle and grasp
Brita Pitcher experiments
Brita Pitcher results
[Bar chart: percent grasped correctly (0 to 100) vs. uncertainty standard dev (cm, deg), with increasing uncertainty across the groups loc 1, rot 3; loc 3, rot 9; loc 5, rot 15; loc 5, rot 30. Series: open loop with perfect info, repeated WRT, hand-generated guarded moves, open loop with imperfect info, repeated WRT with info-grasps]
Other recent probabilistic approaches to manipulation
Off-line POMDP solution for grasping (Hsiao et al. 2007)
Bayesian state estimation using tactile sensors to locate object before grasping (Petrovskaya et al. 2006)
Finding a fixed trajectory that is most likely to succeed under uncertainty (Alterovitz et al. 2007, Burns and Brock 2007)
The End.
Timing For Brita Pitcher
(2.16 GHz processor, 3.24 GB RAM running Python, times in seconds)
                                        1 cm, 3 deg   3 cm, 9 deg   5 cm, 15 deg   5 cm, 30 deg
Grid size                               5733          16337         14415          24025
Computing observation matrix (1 traj)   12            33            29             51
1st belief-state update                 4             10            10             19
Choosing 1st info-grasp                 10            9             17             30
Number of Actions Used
                                    1 cm, 3 deg   3 cm, 9 deg   5 cm, 15 deg   5 cm, 30 deg
Robust execution of target          1.9           2.5           3.3            3.5
Robust execution with info-grasps   not run       4.4           4.1            4.2
Creating Information-gain Trajectories
Trajectory generation
• Generate endpoints, use randomized planner (such as OpenRAVE) to find nominal collision-free path
• Sweep through entire workspace
Choose a small set based on information gain from start uncertainty