Beyond Geometric Path Planning: When Context Matters


Post on 25-Feb-2016


DESCRIPTION

Beyond Geometric Path Planning: When Context Matters. Ashesh Jain, Shikhar Sharma, Thorsten Joachims and Ashutosh Saxena. Outline: Motivation; Approach (context-based score, feedback mechanism, learning algorithm); Results. Structured to unstructured environments: Beam, Kiva, Kuka arm.

TRANSCRIPT


Beyond Geometric Path Planning: When Context Matters
Ashesh Jain, Shikhar Sharma, Thorsten Joachims and Ashutosh Saxena

Outline
Motivation
Approach
    Context-based score
    Feedback mechanism
    Learning algorithm
Results

Jain, Sharma, Joachims, Saxena

Structured To Unstructured Environments

[Images from Google]

Kuka arm, Kiva, Beam, Baxter, PR2, Robot Nurse

Path Planning
High DoF manipulators
Continuous high-dimensional space
Obstacles
[Figure labels: Base, 7 DoF arm, Joint, Link, End-Effector, A, B]

Geometric criteria
Collision free
Shortest path
Least time
Minimum energy

Kavraki et al. PRM; LaValle et al. RRT; Ratliff et al. CHOMP; Karaman et al. RRT*; Schulman et al. TrajOpt

Speaker notes: A necessary capability for every robot is to plan paths. Consider a 7 DoF arm: it has links, joints and an end-effector with which it interacts with the world.

Context Rich Environment

Video [14 sec to 18 sec]: http://www.youtube.com/watch?v=uLktpkd7ojA

What went wrong?
Robot not modeling the context
Does not understand the preferences

Speaker notes: What we just saw is an example of a robot that simply follows a geometric approach to path planning. It does not understand the preferences because it is not modeling the context and the related preferences.

Does Existing Work Address This?
Inverse Reinforcement Learning (Kober and Peters 2011, Abbeel et al. 2010, Ziebart et al. 2008, Ratliff et al. 2006)

Context is not important; focuses on a specific trajectory
Modeling human navigation patterns: Kitani et al. ECCV 2012
Optimal demonstrations: requires an expert

[Image captions: Abbeel et al., Ratliff et al., Kober et al.]

Speaker notes: So the next question is, can existing work help? The answer is, apparently not. Previous work on learning good trajectories has focused on specific trajectories for a specific task. Learning these trajectories is hard, but there is no context and no notion of user preferences, because in most of the examples there is a single trajectory which is right. Hence, for learning, they rely on a demonstration from an expert. In the case of a helicopter, an expert pilot will fly it for you. This limits the applicability of these approaches in a household setting, where the user is typically a non-expert, and as many studies in HRI have shown, giving such expert demonstrations is challenging on high DoF robots.

Our Goal

Model context

Generate multiple trajectories for a task

User preferences

Learn from non-experts


Learning Setting
[Diagram labels: User, Robot]
Online learning system
Learns from user feedback
Sub-optimal feedback
Goal: Learn user preferences

Example of Preferences

Move a glass of water
[Figure labels: Upright, Context, Contorted arm]
Preferences vary with users, tasks and environments

Score function
[Diagram labels: Context, Trajectory]
Terms: object-object interactions; robot configuration and environment interactions

Score function: object-object interactions
Trajectory graph: connecting waypoints to neighboring objects
Object attributes: {electronic, fragile, sharp, liquid, hot, ...}
E.g. Laptop: {electronic, fragile}, Knife: {sharp}, ...
Hermans et al. ICRA w/s 2011; Koppula et al. NIPS 2011
Distance features

Score function: robot configuration and environment interactions
[Figure labels: Bad, Good]
Features:
Spectrogram
Object's distance from horizontal and vertical surfaces
Object's angle with vertical axis
Robot's wrist and elbow configuration in cylindrical coordinates (Cakmak et al. IROS 2011)
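A minimal sketch of how a score with the two terms named on the Score function slides might be computed. It assumes a linear form, score = w_obj · phi_obj + w_env · phi_env. The linear form, the feature definitions and every number below are illustrative assumptions; the slides only name the two feature groups.

```python
from typing import Sequence


def dot(w: Sequence[float], phi: Sequence[float]) -> float:
    """Inner product of a weight vector and a feature vector."""
    if len(w) != len(phi):
        raise ValueError("weight and feature vectors must have equal length")
    return sum(wi * pi for wi, pi in zip(w, phi))


def score(w_obj, phi_obj, w_env, phi_env) -> float:
    """Assumed linear score: object-object term plus robot/environment term."""
    return dot(w_obj, phi_obj) + dot(w_env, phi_env)


# Hypothetical features for two candidate trajectories moving a glass of water.
# phi_obj: [passes over an electronic object (0/1), min distance to a fragile object (m)]
# phi_env: [tilt of the glass from the vertical axis (rad), arm contortion measure]
w_obj = [-2.0, 0.5]   # hypothetical weights: penalize passing over electronics
w_env = [-3.0, -1.0]  # hypothetical weights: penalize tilt and contortion

upright = score(w_obj, [0.0, 0.6], w_env, [0.05, 0.2])
tilted_over_laptop = score(w_obj, [1.0, 0.1], w_env, [0.8, 0.2])
```

With these made-up weights, the upright trajectory that stays away from the laptop scores higher than the tilted one that passes over it, which is the kind of ordering the context-based score is meant to capture.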

Speaker notes: Now let's see how the user can interact with the robot.

User Feedback

Intuitive feedback mechanisms

Re-rank, Interactive, Zero-G

Speaker notes: In our setting we want the feedback to be really easy. As feedback, the user slightly improves the trajectory proposed by the robot.

1. Re-rank
Robot ranks trajectories and user selects one
[Figure labels: Top three trajectories; User feedback; User observing top three trajectories]
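The re-rank step could be sketched roughly as follows: rank candidates by the current score, show the top three, and record which one the user picks. The function names and the choice to return a (robot's top pick, user's pick) pair are assumptions made for illustration, not the authors' interface.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def top_k(candidates: Sequence[T], score_fn: Callable[[T], float], k: int = 3) -> List[T]:
    """Return the k highest-scoring candidates, best first."""
    return sorted(candidates, key=score_fn, reverse=True)[:k]


def rerank_feedback(shown: Sequence[T], chosen_index: int) -> Optional[Tuple[T, T]]:
    """Pair the robot's top choice with the user's pick; None if they agree."""
    if not 0 <= chosen_index < len(shown):
        raise IndexError("chosen_index must refer to a displayed trajectory")
    if chosen_index == 0:
        return None
    return shown[0], shown[chosen_index]


# Hypothetical trajectories identified by name, with made-up scores.
scores = {"a": 0.1, "b": 0.9, "c": 0.5, "d": 0.7}
shown = top_k(list(scores), scores.get, k=3)
pair = rerank_feedback(shown, chosen_index=2)  # user prefers the third one shown
```

The returned pair says "the user prefers the second element over the first", which is the only information a learner needs from this kind of feedback.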

2. Zero-G
User corrects trajectory waypoints
Bad waypoint in red
Holding wrist activates zero-G mode

3. Interactive
Not all robots support zero-G feedback


Coactive Learning
[Diagram labels: User, Robot]
Goal: Learn user preferences
Learn from sub-optimal feedback
Shivaswamy & Joachims, ICML 2012

Trajectory Preference Perceptron
Regret bound (Shivaswamy & Joachims, ICML 2012)
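A minimal sketch of a preference-perceptron style update, in the spirit of the coactive learning setting cited above: the robot proposes its best-scoring trajectory, the user returns a slightly better one, and the weights move toward the feedback's features, w ← w + φ(feedback) − φ(proposed). The candidate feature vectors, the hidden preference vector `w_true` and the simulated user are hypothetical, and the update actually used in the talk may differ in detail.

```python
from typing import List, Optional, Sequence


def linear_score(w: Sequence[float], phi: Sequence[float]) -> float:
    return sum(wi * pi for wi, pi in zip(w, phi))


def perceptron_update(w, phi_proposed, phi_feedback) -> List[float]:
    """w <- w + phi(feedback) - phi(proposed)."""
    return [wi + fb - pr for wi, pr, fb in zip(w, phi_proposed, phi_feedback)]


def slightly_better(candidates, proposed, w_true) -> Optional[List[float]]:
    """Simulated non-expert: the next-best candidate above `proposed`, or None."""
    ranked = sorted(candidates, key=lambda phi: linear_score(w_true, phi), reverse=True)
    i = ranked.index(proposed)
    return ranked[i - 1] if i > 0 else None


def learn(candidates, w_true, rounds: int = 5) -> List[float]:
    w = [0.0] * len(w_true)
    for _ in range(rounds):
        proposed = max(candidates, key=lambda phi: linear_score(w, phi))
        feedback = slightly_better(candidates, proposed, w_true)
        if feedback is not None:  # user is not yet satisfied
            w = perceptron_update(w, proposed, feedback)
    return w


# Each candidate trajectory is represented only by a made-up feature vector.
candidates = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
w_true = [0.0, 1.0]  # hidden user preference, used only by the simulated user
w = learn(candidates, w_true)
best = max(candidates, key=lambda phi: linear_score(w, phi))
```

Note that the simulated user never supplies the optimal trajectory directly, only an improvement over the proposal, yet the learned weights end up ranking the user's true favorite first in this toy run.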

Experimental Setup
Two robots: Baxter and PR2
35 tasks in household setting: 2100 expert-labeled trajectories
16 tasks in grocery store checkout setting: 1300 expert-labeled trajectories
14 objects: Bowl, Knife, Laptop, Metal box, Fruits, Egg cartons etc.
7 users

Experimental Setting 1: Household environment on PR2
Pouring, Cleaning the table, Setting up table
35 tasks
Variation in objects and environment
Experts label 2100 trajectories on a scale of 1 to 5

Experimental Setting 2: Grocery store checkout on Baxter
Cereal box, Egg carton, Knife in human vicinity
16 tasks
Variations in objects and their placement
Experts label 1300 trajectories on a scale of 1 to 5

Generalization
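The generalization results are reported as nDCG@3. As a reminder of what that metric measures, here is a small sketch using one common definition (linear gain, log2 position discount). Which nDCG variant the authors used is not stated in the slides, so the exact formula here is an assumption, and the labels in the example are made up.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 3) -> float:
    """Discounted cumulative gain over the first k ranked items."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 3) -> float:
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical expert labels (scale 1 to 5) listed in the order the robot ranked them.
perfect = ndcg_at_k([5, 4, 3, 2, 1])
imperfect = ndcg_at_k([3, 5, 4, 1, 2])
```

A ranking that puts the highest-labeled trajectories first gets a value of 1; placing a lower-labeled trajectory at the top pulls the value below 1.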

[Plot: nDCG@3 versus #Feedback; household setting, testing on a new environment; curves: Ours w/o pre-training, Ours pre-trained, SVM-rank, MMP-online]
Higher nDCG w/o feedback
SVM-rank trained on expert labels
MMP-online is an IRL technique

User Study
10 tasks per user
7 users
Total 7 hours' worth of robot interaction
Users interact until satisfied

User Study
[Plot: Time (min) and #Feedback versus Task No. (increasing difficulty); grocery setting, Baxter]
Re-rank popular for easier tasks

Increase in zero-G for hard tasks
[Plot legend: #Feedback, Time, Re-rank, Zero-G]

User Study

User   # Re-rank   # Zero-G    Time (min)   SelfScore   CrossScore
1      5.4         3.3         7.8          3.8         4.0
2      1.8         1.7         4.6          4.3         3.6
3      2.9         2.0         5.0          4.4         3.2
4      3.2         1.5         5.3          3.0         3.7
5      3.6         1.9         5.0          3.5         3.3
6      3.1         2.4         -            3.5         3.6
7      2.3         1.8         -            4.1         4.1
Avg.   3.2 (1.1)   2.1 (0.6)   5.5 (1.3)    3.8 (0.5)   3.6 (0.3)

5 Feedback: 3 Re-rank, 2 Zero-G
5 to 6 min. per task
Similar preferences
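As a quick check on the Avg. row of the User Study table, the per-user values can be averaged directly. The sketch below assumes the parenthesized numbers are sample standard deviations (the slides do not say) and that "-" entries are simply left out of the average.

```python
from statistics import mean, stdev

# Per-user values copied from the User Study table; None marks a "-" entry.
table = {
    "# Re-rank":  [5.4, 1.8, 2.9, 3.2, 3.6, 3.1, 2.3],
    "# Zero-G":   [3.3, 1.7, 2.0, 1.5, 1.9, 2.4, 1.8],
    "Time (min)": [7.8, 4.6, 5.0, 5.3, 5.0, None, None],
    "SelfScore":  [3.8, 4.3, 4.4, 3.0, 3.5, 3.5, 4.1],
    "CrossScore": [4.0, 3.6, 3.2, 3.7, 3.3, 3.6, 4.1],
}


def summarize(values):
    """Mean and sample standard deviation, rounded to one decimal place."""
    present = [v for v in values if v is not None]
    return round(mean(present), 1), round(stdev(present), 1)


summary = {name: summarize(column) for name, column in table.items()}
for name, (avg, sd) in summary.items():
    print(f"{name}: {avg} ({sd})")
```

Under that assumption the computed values match the Avg. row: roughly 3.2 re-rank plus 2.1 zero-G feedbacks per task, which is where the "5 Feedback" callout comes from, and about 5.5 minutes per task.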

Robot Demonstration
Video [full video]: http://www.youtube.com/watch?v=uLktpkd7ojA

Conclusion

Challenges of Unstructured Environment

Geometric approaches are not enough

Modeling context is crucial

Learning from users and not experts

Thank You
For more details visit http://pr.cs.cornell.edu/coactive
