
Preliminary

Toward Gesture-Based Programming:

Agent-Based Haptic Skill Acquisition and Interpretation

Richard M. Voyles

CMU-RI-TR-97-36

Submitted in partial fulfillment of the requirements for the degree of

Doctor of Philosophy in Robotics

The Robotics Institute
Carnegie Mellon University

Pittsburgh, Pennsylvania 15213-3890

August 1997

© 1997 by Richard M. Voyles. All rights reserved.

This research was sponsored in part by the Department of Defense National Defense Science and Engineering Graduate Fellowship program, by the Department of Energy Integrated Manufacturing Predoctoral Fellowship program, and by Sandia National Laboratories. The views and conclusions contained in this document are those of the author and should not be interpreted as representing official policies or endorsements, either expressed or implied, of the United States Government.


Abstract

Programming by human demonstration is a new paradigm for the development of robotic applications that focuses on the needs of task experts rather than programming experts. The traditional text-based programming paradigm demands the user be an expert in a particular programming language and further demands that the user can translate the task into this foreign language. This level of programming expertise generally precludes the user from having detailed task expertise because his/her time is devoted to the practice of programming, not the practice of the task. The goal of programming by demonstration is to eliminate both the programming language expertise and, more importantly, the expertise required to translate the task into the language.

Gesture-Based Programming is a new form of programming by human demonstration that views the demonstration as a series of inexact “gestures” that convey the “intention” of the task strategy, not the details of the strategy itself. This is analogous to the type of “programming” that occurs between human teacher and student and is more intuitive for both. However, it requires a “shared ontology” between teacher and student -- in the form of a common skill database -- to abstract the observed gestures to meaningful intentions that can be mapped onto previous experiences and previously-acquired skills.

This thesis investigates several key components required for a Gesture-Based Programming environment that revolve around a common, though seemingly unrelated theme: sensor calibration. A novel approach to multi-axis sensor calibration based on shape and motion decomposition was developed as a companion to the development of some novel, fluid-based, wearable fingertip sensors for observing contact gestures during demonstration. “Shape from Motion Calibration” does not require explicit references for each and every measurement. For force sensors, unknown, randomly-applied loads result in an accurate calibration matrix. The intrinsic “shape” of the input/output mapping is extracted from the random “motion” of the applied load through the sensing space. This ability to extract intrinsic structure led to a convenient eigenspace learning mechanism that provides three necessary pieces of the task interpretation and abstraction process: sensorimotor primitive acquisition (populating the skill database), primitive identification (relating gestures to skills in the database), and primitive transformation (“skill morphing”). This thesis demonstrates the technique for learning, identifying, and morphing simple manipulative primitives on a PUMA robot and interpreting the gestures of a human demonstrator in order to program a robot to perform the same task.
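To make the core idea concrete, the following is a minimal illustrative sketch in Matlab, assuming a simulated two-channel force sensor; it is not the procedure developed in Chapter 4. Constant-magnitude loads applied in random directions trace out an ellipse in raw-signal space, and a singular value decomposition of the raw readings separates the load directions (the “motion”) from the gain structure (the “shape”), which determines the calibration up to an ambiguity that a few known reference loads can resolve.

    % Illustrative sketch only, not the thesis implementation: a simulated
    % two-channel force sensor with an unknown gain matrix, loaded by
    % unit-magnitude forces in random directions.
    n = 500;                           % number of random load applications
    theta = 2*pi*rand(n,1);
    F = [cos(theta) sin(theta)];       % unknown unit loads (the "motion")
    C = [1.7 0.3; -0.2 2.4];           % unknown gain matrix (raw counts per N)
    R = F*C + 0.01*randn(n,2);         % raw, uncalibrated sensor readings
    [U,S,V] = svd(R, 'econ');          % readings have (noisy) rank 2
    motion = U(:,1:2);                 % recovered load directions and
    shape  = S(1:2,1:2)*V(:,1:2)';     % recovered gain structure, each up to
                                       % an ambiguity fixed by reference loads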


Acknowledgments

I would like to take the unusual step of first acknowledging my wife, Kathleen, for her love and support throughout my education, which spanned nineteen years, three universities in three different states, and four children. Enough said.

Of course, thanks go out to my advisor, Pradeep Khosla, and thesis committee members, Katsushi Ikeuchi, Gary Fedder, and Ken Salisbury. I hope I have not only learned from their collective wisdom, but am able to apply that wisdom as I move forth into the ranks of engineering faculty. Pradeep, in particular, provided an environment in the Advanced Mechatronics Lab of outstanding facilities, incomparable minds, and unparalleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received as a National Defense Science and Engineering Graduate Fellow and as a Department of Energy Integrated Manufacturing Predoctoral Fellow.

Despite the traditional labors of graduate student life, my time spent at Carnegie Mellon was quite enjoyable. This was due largely to the intellectual and social camaraderie I enjoyed with the many friends I made there. Officemates always top the list, and Brad Nelson and Dan Morrow provided many amusing (and often “sublime”) opportunities for fun and games, as well as good old fashioned hard work. Without them this thesis would not have been as complete or as fun as it was. Although Brad was clearly our inferior when it came to political discussion, he kept us both focused on what was important intellectually.

No mention of officemates would be complete without including Harrrrrrrrrry! Shum, who unfortunately succumbed to the Dark Side after graduation, and Nina Zumel, who is just a great person. I think Nina surprised us all by moving directly from a high school-equivalent (and a weenie one, at that!) to a first-class Ph.D. program. I still chuckle when I think of good-natured John Krumm biting unsuspectingly into my kids’ “April Fools Donuts”! Ben Brown was only an “incidental officemate” but he’s been involved in so many cool projects anyone would be a fool not to tap his expertise.

So many people made the Robotics Institute such a cool place that I can’t possibly acknowledge them all. Ian Davis organized the Viking Death Rats for all sorts of athletic endeavors including softball, basketball, and outdoor and indoor soccer. (Intramural champs! Not bad for a bunch of geeks!) He came up with some cool uniforms and even had a way cool band! The NP Completions was another championship intramural team on the football field that helped me get my mind off research every now and then.


Chris Paredis, Anne Murray, Sing Bing Kang, and Dave Stewart all provided help and intellect around the AML. Though none of them have received the kind of immortality that results from new terminology, such as the word “Gertzian,” I will always remember them. As an added bonus, Chris and Sing Bing both introduced me to fruit in one form or another: Chris to awesome Belgian fruit beers and Sing Bing to the dreaded durian, which literally forced the evacuation of part of our building.

Finally, outside recognition goes to professors Oussama Khatib and H. Marshal Dixon, with special thanks to Mark Cutkosky for lending me the sensors I built for him and for letting me use his lab and his weather. Last, but not least, to Mom and Nelson, for paying for some of those nineteen years and for everything else.


Table of Contents

Abstract
Acknowledgments

Chapter 1: Introduction
    Making Robot Programming Easier
    Prior Work on Learning by Observation
    Prior Example: Robot Instruction by Human Demonstration
    Gesture-Based Programming Overview
    What is This Thesis About?
    Thesis Organization

Chapter 2: Gestures and Interpreting Gestural Intent
    What Are Gestures?
        Symbolic Gestures
        Tactile Gestures
        Hand Motion Gestures
    Gestural Alphabet
    Cognitive Architectures
    Tropism System Cognitive Architecture
    Fine-Grained Tropism Architecture
        Fine-grained implementation
    Gesture Recognition Agents
    What is an Agent?
        Input/output relationship
        Autonomy
        Persistence
        Non-parental influence
        Cognitive behavior
    Example of Gesture Interpretation for Instruction
        Relevant Gestures and Psychological Basis for Interpretation
        The Multi-Agent Network
            Agent topology
            Tropism-based interpretation agents
            Contact
            Confusion
            Width
            Rectangle
        Experimental trials
    Conclusions

Chapter 3: Observing Tactile Gestures
    Meeting the Sensory Needs of Gesture-Based Programming
        Intrinsic and Extrinsic Tactile Sensing
    Prior Work
    Design Concept
    Intrinsic Tactile Sensor Design
    Extrinsic Tactile Sensor Design
    Tactile Actuator Design
    Other Applications

Chapter 4: Shape from Motion Calibration
    Introduction
    The Calibration Problem
        Least Squares Calibration Solution
        Shape from Motion Calibration
        Shape from Motion Derivation
        Proper Rank
    Applications of Shape from Motion
        Force-Only Sensors
            2-DOF Calibration
            3-DOF Force Sensor
        Force/Torque Sensors
            2-D Force/Torque Sensor: 2 Forces, 1 Torque
            3-D Force/Torque Sensor: 3 Forces and 3 Torques
    Simulation Studies
    Experimental Comparison to Least Squares
        Fingertip Results
        Lord F/T Results
    Sensor Bias
        Bias and Least Squares
        Bias and Shape from Motion
    Greater Autonomy: Collaborative Calibration
        Collaborative Calibration Results
    Conclusions

Chapter 5: Shape from Motion Primordial Learning
    From Sensor Calibration to Sensorimotor Primitive Acquisition
        The Gesture-Based Programming Skill Base
    Prior Work
    Acquisition of Sensorimotor Primitives
        Mobile Robot Primitives
        PUMA Manipulator Primitives
            Guarded Move
            Edge-to-Surface Alignment
    Identification of Previously Learned Primitives
        Guarded Move
        Edge-to-Surface Alignment
    Transformation of Primitives
        Surface-to-Surface Alignment
    Limitations
    Conclusions

Chapter 6: Gesture-Based Programming Demonstrations
    Introduction
    Robotic Cable Harnessing
        Task Description
        Teaching the Trajectory by Human Demonstration
        Robot Practice and Fine Tuning
        Robot Execution
        Agent Descriptions
        Experiments
        Conclusions
    Low-Tolerance Peg-in-Hole Task
        Two Peg-in-Hole Demonstrations
        A Variation on the Peg-in-Hole Demonstration

Chapter 7: Contributions
    Summary
    Conclusions
    Contributions
        Gesture-Based Programming Paradigm
        Shape from Motion Calibration
        Shape from Motion Primordial Learning
        Fine-Grained Tropism-System Cognitive Architecture
        Wearable Fluid-Based Tactile Sensor/Actuator
    Future Research
        Fine-Grained Tropism-System Cognitive Architecture
        Shape from Motion Primordial Learning
        Gel-Based Tactile Sensor and Actuator

Chapter 8: References

Appendix A: Shape from Motion Code
    Basic Shape from Motion Calibration Mathematica Code
        File: calibrate.math
        File: motion.basic
        File: newrot.math
        File: motion.math
        File: calib.fing.math
    Shape from Motion Calibration with Offsets
        File: calib.off.math
        File: motion.off.basic
        File: motion.off.math
        Analytic Solutions for Finding the Affine Transform
    Shape from Motion Primordial Learning Matlab Code
        File: pumapcat.m
        File: pcatrain.m
        File: rm_zero_eig.m
        File: ratiosv.m
        File: pumapcar.m
        File: pcarun.m
        File: pumapcai2.m
    Shape from Motion Primordial Learning Chimera Code
        File: skill.c
        File: zguard.rmod
        File: xroll.rmod

Appendix B: Demonstration Code
    Chimera Configuration Files
        File: demo2.fsm
        File: demo3.fsm

List of Figures

    Robot Instruction by Human Demonstration System Overview
    Gesture-Based Programming System Overview
    Example Symbolic Hand gestures used in a cable harnessing application
    Standard Tropism System Cognitive Architecture
    Static Fine-Grained Tropism System Cognitive Architecture for gesture recognition
    Structure of the Gesture Recognition Agent
    Gestural User Interface Employing Tactile Gestures
    The shapes of the trajectory families
    Hints of a triangle
    Edges of polygons suggest relationships more than shape
    The tropism system instantiation
    To increase the width, either apply force A or alternately apply forces B and C
    Graphical representation of the internal model of the rectangle agent
    Annotated temporal and spatial plots of planar trajectory modification
    Second-trial temporal and spatial plots of the trajectory modification system
    Cross-section of ER sensor and tactor
    Cross-section of ER sensor and strain gauge sensor
    Modular Tips for Single-Beam Intrinsic Sensor
    Cutaway view of the original prototype fingertip sensor with plated capacitors
    Extrinsic Sensor Response to 3mm Cylinder Perpendicular to Sensor Axis
    Extrinsic Sensor Response to 3mm Cylinder 60 degrees from Perpendicular
    Cross-section of ER sensor and tactor
    The sensor and calibration function mappings
    Flow chart for the Shape from Motion Calibration procedure
    Single Cantilever Beam-Based Extrinsic Tactile Sensor
    Cross section of the cantilever beam with pure force applied
    Test jig for the 2-D fingertip experiment
    Recovered Motion Matrix from Planar Shape from Motion Calibration Experiment
    Recovered motion of the Lord F/T sensor
    Single-image random-dot stereogram of recovered motion data
    Simulated Comparison of Shape from Motion and Least Squares as Orientation Changes
    Calibration Fixture for 6-Axis Force/Torque Sensors
    Comparison of Shape from Motion and Least Squares as Orientation Changes
    Angular error of Least Squares and Shape from Motion at specific known loads
    Motion of the force vector around both sensors during collaborative calibration
    Comparison of Human and Shape from Motion Commands on a Teleoperated Dataset
    Singular values associated with the first 14 eigenvectors of the guarded move training data
    Autonomous operation of the learned, one-dimensional guarded move
    Velocity commands of “yroll” and “zguard” primitives working cooperatively
    Measured Forces with “yroll” and “zguard” primitives working cooperatively
    Cartesian velocity commands recorded during random teleoperated motion
    Force components recorded during random teleoperated motion and interactions with objects
    Goodness of match determined by automatic skill identification algorithm
    Recognition of zguard primitive from human demonstration
    Force in z-axis as zguard, xroll, and yroll cooperate to press one surface against another
    Torque around x- and y-axes as zguard, xroll, and yroll cooperate
    Wire Harness Jig
    Cable Harnessing training network
    Cable Harnessing practice network
    Cable Harnessing application network
    Segmented and raw trajectories from training
    Another segmented trajectory and raw trajectory from training
    Trained trajectory (dotted) and practice trajectory (solid)
    Demonstration, fine-tuning, and execution of the Cable Harnessing Task
    Demonstration of the Low-Tolerance Peg-in-Hole Task
    Identification of Guarded Moves in Three Separate, Consecutive Task Demonstrations
    SPI Graphical display of program resulting from Peg-in-Hole demonstration
    Robotic Execution of the Low-Tolerance Peg-in-Hole Task
    Variation on Demonstration of Low-Tolerance Peg-in-Hole Task
    SPI Graphical display of program resulting from second Peg-in-Hole demonstration
    Robotic Execution of Second Low-Tolerance Peg-in-Hole Task
    Micromachined differential capacitor pressure sensor (before encapsulation)


Chapter 1

Introduction

Motivation for Gesture-Based Programming

Programming robotic systems is difficult for novices and experts alike. Planning operation sequences and collision-free paths while maintaining realistic cycle times can be hard enough, but add in the complexities of robot/environment contact, uncertainty, and exception handling and the expertise required becomes daunting. As a result, the impact of robots on most manufacturing domains has been minimal¹, particularly as product life cycles and batch sizes decrease. Ironically, these are the conditions under which robots were originally touted as being most effective [Hartley, 1983].

Once implemented, many robotic applications fail to meet expectations or prove insufficiently robust for the desired level of autonomy. A significant reason for this is that programming difficulties maintain a layer of insulation between the system’s use and its development. Because the programming requires so much expertise, unskilled users, by definition, cannot be programmers and experienced programmers are too “valuable” to be users. As a result, there is a constant gap between personnel needs and expertise.

1. In terms of the number of robots employed for a particular job function compared to the number of humans employed for that same function across the industry.


1.1 MAKING ROBOT PROGRAMMING EASIER

Numerous attempts have been made to ease the discomfort of robot programming and improve relevant skill transfer with varying degrees of success. From the birth of robotics, researchers have built special robot programming languages and extensions to existing languages in the form of “robocentric” subroutines and libraries. Ernst was probably the first to create a robot-specific programming language in 1961 [Ernst, 1961]. MHI (Mechanical Hand Interpreter) was a simple interpreted language that included a “move” command for his mechanical hand. A few more familiar examples that are still around today include VAL [Unimation, 1984], AML [IBM, 1983], and RCCL [Hayward and Paul, 1987].

Robot-specific languages add commands that parametrize and encapsulate complex behavior, reducing the breadth of expertise required for application programmers. For example, a simple “move” command encapsulates low-level joint control, error bound checking, safety systems, and trajectory generation, all in one simple function call. No longer must the programmer understand classical control theory to make the robot move. Furthermore, the move command is parametrized so the programmer can select straight-line motion, goal location, and speed without being burdened by other important but irrelevant parameters such as PID gains and interpolation constants.
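As a purely hypothetical illustration (the function name, arguments, and numbers below are invented for this sketch and are not taken from VAL, AML, or RCCL), such a parametrized call might hide interpolation, limit checking, and low-level control behind a few task-level choices:

    function move_demo
    % Hypothetical sketch of the encapsulation a robot-language "move"
    % command provides; names, arguments, and values are illustrative only.
    goal = [0.45 0.10 0.30];                 % Cartesian goal position (m)
    move(goal, 'straight', 0.25);            % straight-line path at 25% speed
    end

    function move(goal, pathtype, speed)
    % Stub: interpolate a path, check workspace limits, and hand each
    % setpoint to an (imaginary) low-level joint controller.
    start = [0.40 0.00 0.20];
    steps = 10;
    for k = 1:steps
        p = start + (k/steps)*(goal - start);    % straight-line interpolation
        assert(all(abs(p) < 1.0), 'setpoint outside workspace');
        fprintf('%s step %d: x=%.3f y=%.3f z=%.3f (speed %.2f)\n', ...
                pathtype, k, p(1), p(2), p(3), speed);
    end
    end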

Still, command libraries only go so far because they are generally sequential in nature and the order of execution must be explicitly chosen by the programmer. Ernst, eerily foreshadowing current research trends including sections of this thesis, also experimented with THI (Teleological Hand Interpreter), an attempt at task-level programming for his mechanical hand [Ernst, 1961]. THI incorporated “outside-in” controllers driven by external events. Today, behavior-based and multi-agent-based systems [Brooks, 1986][Connell, 1989][Wooldridge and Jennings, 1995][Mataric, 1997] encapsulate “functionality” that responds directly to the outside world, executing in parallel and, more importantly, possessing some degree of autonomy. Like special languages, the goal is to make programming easier for programming experts, but the difference is that the agents themselves can help decide when to execute. This concept of decentralization has led to orders-of-magnitude improvements in the capabilities and performance of robots in the real world [Connell, 1989] in comparison to their deliberative predecessors [Nilsson, 1984].

Visual programming environments like Chimera/Onika [Stewart, et al, 1992][Gertz, et al, 1993] and commercial packages such as MatrixX/SystemBuild, Matlab/Simulink, and LabView capitalize on modular, reconfigurable software blocks by iconifying them within “software assembly” environments [Stewart, et al, 1993]. These visual environments allow programming at a much higher level, hiding many details of the particular implementation and lessening the burden of programming expertise. Contrary to the other approaches outlined thus far, visual programming is aimed at novice programmers and factory-floor workers who know what needs to be done, but may not possess detailed engineering expertise.

Despite individual successes of these varied approaches, a fairly high level of expertise is still required to program and interact with robots. For example, Onika, a visual programming environment, allows novice users to easily build simple pick-and-place robotic applications in a matter of minutes [Gertz, 1994]. But it is imperative that the user understands and consciously considers the importance of via points, collision avoidance, grip selection, and the dynamic effects of transport in order for the application to be successful. And that’s just for a simple pick-and-place task! Visual programming does little for the bulk of potential robotic applications that involve substantive contact with the environment, uncertainty, and complex sensing.

Since these characteristics are inherent in all manipulative tasks and are adeptly mastered by humans, they are usually handled unconsciously. Hence the distinction between task experts and programming experts. Programming experts are trained to bring these subconscious requirements up to the conscious level in order to transcribe them as required by the particular programming environment.

With this observation in mind, it is instructive to look at the inroads robots have made into manufacturing to see how these problems can be mitigated. Two of the most common industrial applications are spray painting and spot welding. Both of these use natural programming methods such as “lead-through teaching” and point-teaching, respectively [Todd, 1986], that are easily mastered by semi-skilled workers. Lead-through teaching, in particular, allows anyone who can perform a “valid” task to program the robot to perform the task by demonstrating it. As such, programming changes can be made reactively or proactively by a task expert, not by calling in a robot programming expert. Also, tasks such as spray painting involve skill transfer from the teacher (i.e., programmer) to the robot. Lead-through teaching provides an intuitive mechanism for a task expert to accomplish this.

Unfortunately, lead-through teaching is limited to simple tasks because of its reliance on pure kinematic replay. But the lesson is clear, and it is only natural that robot programming move more in this direction. Hence the emergence of the expanding field of learning by observation (LBO) [Kuniyoshi, et al, 1994][Ikeuchi and Suehiro, 1994][Kang and Ikeuchi, 1996]. By forcing the robot to observe a human interacting with the world, rather than forcing the human to interact with a textual representation of the robot interacting with the world, a more natural, rich, “anthropocentric” environment results. But the systems developed to date are mostly kinematically based, as well, and operate off-line; if the robot misinterprets the desired trajectory or the environment changes slightly, the whole sequence must be re-taught.


1.2 PRIOR WORK ON LEARNING BY OBSERVATION

Ikeuchi has been involved in a variety of investigations of learning by observation. His work with Kang [Kang, 1994] is described in detail in the following section. He also worked extensively with others on an approach called “Assembly Plan from Observation” (APO) [Ikeuchi and Suehiro, 1994]. APO assumes an assembly subtask consists of an initial state, a period of human interaction, and a final state. APO uses a light-stripe 3-D vision system to match objects in the world to known geometric object models in order to extract state information. A task is temporally segmented by simply looking for changes in brightness levels, which are assumed to be caused by human interaction. The APO system reasons about subtasks by hypothesizing geometric contact conditions between recognized objects and using an assembly taxonomy to deduce the assembly operation that the human used to assemble the objects. Based on the known initial state of blocks in the robot’s workspace, Robot World is used to build the assembly using position control.

The APO system differs from GBP in several ways. It is purely kinematically based, as it uses no force feedback during training or execution. Although the authors do mention the use of “skills” for execution, these skills are specific to the initial poses of the parts involved and are hand-coded, providing no opportunity for direct learning. Finally, no information is extracted from the demonstration on grasp type or via points; this information is gleaned from geometric planning.

Kuniyoshi et al [1994] performed work similar to APO. It, too, operated in a “blocks world” and used binocular stereo to determine initial and final states of objects in the workspace. However, they crudely tracked the motion of the hand during training to extract trajectories and rough grasp points. The system deduced logical relationships between objects in the scene based on geometric constraints in order to build an executable program.

Kuniyoshi extended parts of this work to “cooperation by observation” for multi-robot systems [Kuniyoshi, 1996]. In this related work, observation of other robots performing tasks results in strategies for cooperation to complete the task.

Brunner et al [1995] discuss learning by showing in the context of the ROTEX project [Hirzinger et al, 1993]. ROTEX is a space-based robot that operated in the Space Shuttle and has a number of autonomous and semi-autonomous modes. A key aspect of this work is augmented teleoperation in the presence of time delays. “Elemental moves” are employed in a manner similar to skills during both learning by showing and teleoperation with time delays, but they are kinematics-based.


Kaiser and Dillman [1996] use human demonstration to build elemental robotic skills and have an extensive body of work that, in many cases, touches on other aspects of the programming by human demonstration problem, including the inconsistency of humans. Delson and West [1996] also explore the difficulties of extracting robotic skills from demonstration in the presence of human inconsistency.

There has also been significant interest in programming by human demonstration in the user interface community, as reflected in the book published by Cypher et al [1993].

1.3 PRIOR EXAMPLE: ROBOT INSTRUCTION BY HUMAN DEMONSTRATION

This thesis was partially motivated by our own contributions to the work of Sing Bing Kang and Katsushi Ikeuchi on “Robot Instruction by Human Demonstration” [Kang, 1994]. However, it is not a direct extension of this work because the driving philosophy is quite different. Kang was driven by a top-down philosophy in which, given the problem of robot instruction by demonstration, he methodically decomposed the problem and recursively solved the resulting sub-problems. To paraphrase his methodology, in order to be instructed by demonstration, the system must be able to:

• observe the demonstration

• segment the demonstration into meaningful subtasks

• recognize various instances of possible subtasks

• translate the subtasks to the robot

• recompose the subtasks into the complete task

• execute the task with the robot

The resulting system is depicted diagrammatically in Figure 1.1. A CyberGlove with Polhemus and a multi-baseline stereo system [Kang et al, 1994] were used to sense the human and the task, respectively. Using CAD models of the parts, their motions through the workspace were tracked independent of the hand motions, despite occlusions. This provided the kinematic trajectories for the parts. Separate kinematic trajectories for the hand were provided by the Polhemus. The CyberGlove provided hand pose information during grasps that suggested grasp type through the use of a grasp taxonomy.

In assisting Kang with the implementation of the final step, “execute on the robot” (the lower right corner of the figure), it became obvious that a vitally important component was missing: there is no feedback path during autonomous execution. The arrow from the execution agents to the task is truly indicative of unidirectional flow of information.

It was clear that integrating feedback into a learning by observation (LBO) system could significantly enhance the state of the art. So the philosophy driving this research was explicitly bottom-up. We wanted to address the issue of incorporating force-based “skills” into an LBO paradigm to permit the true programming of contact-intensive tasks, rather than the instruction of a preprogrammed, open-loop system. Hence, the focus of this thesis is not on creating a complete system (though a prototype did result and is described in Chapter 6), but on the study of prospective methods of incorporating autonomous execution primitives at run-time.

FIGURE 1.1 ROBOT INSTRUCTION BY HUMAN DEMONSTRATION SYSTEM OVERVIEW. [Block diagram not reproduced; it links the task, the vision sensors, the observer, temporal task segmentation, hand and kinematic trajectory optimization, and the execution agents, with separate paths marking training, execution, input/output, and physical interaction.]


1.4 GESTURE-BASED PROGRAMMING OVERVIEW

If one views programming as a form of teaching, textual programming is clearly unnatural for humans. Despite a human’s ability to master some task -- from high-level planning of sequence and trajectory, to grasp selection, to robust, low-level execution -- it remains virtually impossible for that human to describe the task in syntactically correct prose in a foreign (programming) language for all but the simplest of cases. (As evidenced by the relatively poor infiltration of robots into contact-intensive applications.) Humans are more effective at teaching by demonstration followed by practice, and it has been proven empirically to be the most successful technique between humans [Patrick, 1992]. But teaching by demonstration is not always successful between humans, and one of the reasons is that it assumes a basic set of a priori capabilities. Imagine trying to teach calculus to someone who doesn’t know algebra or trying to demonstrate the assembly of a carburetor to someone who doesn’t know how to insert and tighten a screw. Without the appropriate a priori capabilities or skills, teaching by demonstration is very difficult.

Gesture-Based Programming (GBP) addresses these issues by providing a more natural, demonstration-based programming paradigm that allows both demonstration and practice phases and relies on an underlying skill base of robust, semi-autonomous primitives from which new tasks can be assembled.

GBP begins by observing a human demonstrate the task to be programmed. (See the stick figure on the left of Figure 1.2.) Observation of the human’s hand and fingertips is achieved through a sensorized glove. The modular glove system senses hand pose, finger joint angles, and fingertip contact conditions. Objects in the environment are sensed with computer vision while a speech recognition system extracts “articulatory gestures.” (Gestures will be described later.) Primitive gesture classes are extracted from the raw sensor information and passed on to a gesture interpretation network. These autonomous multi-agent networks extract the demonstrator’s intentions with respect to the system’s skill base to create an abstraction of the task. In other words, the system is not merely remembering everything the human does, but is trying to understand -- within its scope of expertise -- the subtasks the human is performing (“gesturing”). These primitive capabilities in the skill base take the form of encapsulated expertise agents -- semi-autonomous agents that encode sensorimotor dexterity.

The output of the GBP system is the executable program for performing the demonstrated task on the target hardware. This program consists of a network of encapsulated expertise agents of two flavors. The primary agents implement the primitives required to perform the task and come from the pool of primitives represented in the skill base. The secondary set of agents includes many of the same gesture recognition and interpretation agents used during the demonstration. These agents perform on-line observation of the human to allow supervised practicing of the task, if desired. (Stick figure on far right of Figure 1.2.)

As mentioned above, the human model for teaching by demonstration most often involves a practice phase. The reason for this is that passive observation of a task rarely provides accurate parametrization for the trainee’s deduced task “model” (in this case, the model is represented by the collection of primary encapsulated expertise agents) and sometimes the deduced model is wrong (e.g. missing or incorrect encapsulated expertise agents). Incorporating gesture recognition and interpretation agents into the executable provides an intuitive way for the demonstrator, or another user, to fine tune the operation of the program without having to demonstrate the entire task over again. Because all our agents are implemented as autonomous software modules, these observation agents can easily be disabled without recompiling the program to prevent unauthorized modification.

FIGURE 1.2 GESTURE-BASED PROGRAMMING SYSTEM OVERVIEW [Block diagram not reproduced; it links the task, the vision and hand/finger sensors, the observer, the GBP task abstraction, the expertise database, and the resulting collection of encapsulated expertise (execution) agents, with separate paths marking training, practice and execution, input/output, and physical interaction.]

In the real world, it will not be possible to represent most useful tasks with one network of encapsulated expertise agents. Therefore, it is necessary to segment the demonstration into a series of discrete subtasks (e.g. a grasping subtask followed by a manipulation subtask). Each subtask will be embodied by a network as described above. In this case, the executable program will consist of a sequence of networks rather than one all-encompassing network.
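As a purely structural illustration (the subtask and agent names below are placeholders chosen for this sketch, not the actual implementation modules), such an executable program can be pictured as a sequence of subtask networks, each pairing primary skill agents drawn from the skill base with optional observation agents that support supervised practice:

    % Illustrative sketch of a GBP executable as a sequence of subtask
    % networks; the names are placeholders, not the thesis implementation.
    program(1).name     = 'approach';
    program(1).primary  = {'guarded_move'};              % encapsulated expertise
    program(1).observer = {'contact_gesture'};           % enables supervised practice
    program(2).name     = 'insert';
    program(2).primary  = {'guarded_move','xroll','yroll'};
    program(2).observer = {};                            % observation disabled
    for k = 1:numel(program)
        fprintf('subtask %d (%s): %d skill agents, %d observer agents\n', ...
                k, program(k).name, numel(program(k).primary), ...
                numel(program(k).observer));
    end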

1.5 WHAT IS THIS THESIS ABOUT?

Gesture-Based Programming is a very complex issue with a great opportunity for impact and a great many possibilities for research. As mentioned, this thesis takes a bottom-up perspective with emphasis on incorporating agent-based, contact-intensive skills into the interpretation and execution phases.

In order to use human demonstrations for the domain of contact-intensive tasks, it is necessary to sense contact at the human fingertip. Kang used vision and kinematics to guess about the presence or absence of contact, but was unable to extract any detailed information without local sensors. It may also be necessary to filter the tactile sensations the human perceives, to prevent the human from using cues unobservable to the robot. This thesis includes sections on the preliminary design and testing of a novel, modular, inside-out symmetric tactile sensor/actuator capable of such sensing and filtering, and a novel calibration scheme that makes calibrating large numbers of sensors easier.

This thesis also addresses the issue of a single representation that provides the acquisition of sensorimotor primitives from human demonstration through teleoperation, the execution of primitives on an autonomous robotic system in an agent-based environment, identification of previously learned primitives performed both via teleoperation and direct manipulation, and the transformation of those primitives to autonomously perform previously unobserved tasks. It turns out the calibration technique used on the sensors above suggests a useful integrated representation.

Finally, humans are inherently qualitative in their actions and communication. Interpreting the intent of a demonstration requires scalability and qualitative calculations. We chose to implement our work with multi-agent networks to achieve scalability and present an extension of the Tropism System Cognitive Architecture for qualitative decision making.


Although this thesis presents prospective solutions to some of the most important problems of GBP and convincingly demonstrates their effectiveness, Gesture-Based Programming is not a "done deal." Many issues remain for further study to create a system usable by semi-skilled workers.

1.6 THESIS ORGANIZATION

The chapters of this thesis are not organized in order of the importance of the contribution, nor are they in chronological order with respect to their development. Instead they are arranged in a way that we believe will make them most understandable as a whole. In fact, the key foundational component is presented in Chapter 4, which details Shape from Motion Calibration. This technique anchors our work on skill-base construction, human gesture interpretation, and multi-agent networks as well as the design and construction of multi-axis sensors. All these areas are profoundly relevant to our approach to GBP for contact-intensive tasks. Despite its importance, it is reserved for the midpoint of the thesis because its relevance becomes clearer after "setting the stage" and the rest of the work naturally flows from that point.

Chapter 2 presents qualitative definitions of gestures and agents and a new twist on the agent-based Tropism System Cognitive Architecture. Reconfigurable multi-agent systems form a strong undercurrent to this research and provide the philosophical underpinnings to every aspect. We use this modified architecture for implementing multi-agent networks for interpreting human intention through gestures as an example.

Chapter 3 highlights the development of a novel, modular, reconfigurable tactile sensor for observing contact conditions during human demonstration and robotic execution. Preliminary work on an inside-out-symmetric tactile actuator is presented to demonstrate the feasibility of a wearable, multi-element tactor.

As previously mentioned, Chapter 4 describes Shape from Motion Calibration, which looks at calibration in a new light. This technique was motivated by the sensor development described in Chapter 3 and the potential multiplicity of sensors for human and robot hands, but quickly yielded other fruit.

Chapter 5 details a prospective learning approach for the acquisition of sensorimotor primitives that is derived from Shape from Motion Calibration. It is an important component of GBP in that it conveniently provides acquisition, identification, and transformation ("morphing") of sensorimotor primitives in one representation.

Chapter 6 describes some primitive demonstrations of the GBP paradigm using the components described, including a cable harnessing task and a peg-in-hole task. The cable harnessing task is an extension of the gesture interpretation experiment described in Chapter 2. It adds a human demonstration component and multi-modal gestural interpretation. The peg-in-hole task is true GBP. A program consisting of a sequence of parametrized primitives results entirely from a human demonstration of the task.

Finally, a summary of research and the contributions of the thesis are presented in Chapter 7.


Chapter 2

Gestures and Interpreting Gestural Intent

Multi-Agent Networks for Qualitative Tasks

2.1 WHAT ARE GESTURES?

The phrase "robot programming by human demonstration" elicits clear mental images of its meaning. A human performs a task; a robot observes the task; the robot performs the task. "Gesture-based programming," on the other hand, is rather vague and confusing. This is the very nature of gestures themselves. Gestures are not a language, yet they facilitate communication. They are context-dependent, yet they are used in multiple domains. They are not universal, yet they convey intention across cultural boundaries, although not precisely.

Gesture-based programming is a form of programming by human demonstration, but the phrase better captures the focus of this research. We want to interpret the intention behind fleeting and sometimes ambiguous human motions during a demonstration, rather than just recording what is observed. Different demonstrated motions, though appearing to be similar, may have different meanings under different conditions -- like gestures. Gestures may hold the same high-level meaning to different observers, yet elicit different low-level responses from each one -- like skills. Finally, we want to emphasize real-time performance so interaction can occur not only in the demonstration phase, but also in the practice phase.

Gestures, within the scope of this work, are defined as "imprecise events that convey the intentions of the gesturer." While this is a broad definition that encompasses many gestural modalities, we are interested in a small subset of four manual and articulatory gestures. These include symbolic gestures, such as the pre-shape of a dextrous grasp, motion gestures, such as transporting or manipulating an object, tactile gestures, such as fingertip contact force, and articulatory gestures, which are utterances such as "oops!" when an error is made or geometric model-based information such as "tighten the screw." (The whole phrase is irrelevant; only the keyword "screw" is important.) Other gestural modes exist but this is the subset we consider for our GBP system. Because gestures are imprecise events, they are context dependent; in isolation they convey little information. Yet, when combined with the state of the system and the environment at the time the gesture occurred, perhaps augmented with some past history of gestures, the intention becomes interpretable. The key point is that the interpretation of non-symbolic gestures develops over a series of gestures rather than instantaneously.

2.1.1 Symbolic Gestures

Symbolic gestures differ from most other types of gestures in that they generally have one-to-one mappings of intention. Sign language, for example, is implemented with symbolic gestures; each gesture has one and only one meaning. (Although they can be misinterpreted.)

Symbolic gestures are encoded entirely by the kinematic configuration of the hand so they require no force or tactile information to be observed. They can be observed entirely by a glove such as the CyberGlove or, perhaps, with a camera as in the "DigitEyes" system [Rehg and Kanade, 1995]. Their uses include the implementation of a menu (as in Figure 2.1) for robot instruction or for indexing into a taxonomy of grasp types for task demonstration. These are the most specific form of gesticulation and they require the least interpretation time, so they respond most quickly during the run-time phase of operation.

FIGURE 2.1 EXAMPLE SYMBOLIC HAND GESTURES USED TO IMPLEMENT A MENU USING A DATAGLOVE IN A CABLE HARNESSING APPLICATION.


2.1.2 Tactile Gestures

Tactile gestures involve force impulses or force signatures that result from physical contact between the gesturer and the robot or the gesturer and the environment. There are two types of tactile gestures we consider for this thesis, delineated by sensing mechanism: end-of-arm gestures sensed by a wrist force/torque sensor and fingertip gestures sensed locally at the fingertip. End-of-arm gestures involve the user physically interacting with the robot at the robot's end effector, as in nudging to fine tune a behavior. Fingertip gestures can be as simple as indicating that a particular fingertip is in contact with a surface or as complicated as a force signature characteristic of a dextrous manipulation. Sensing these gestures will be discussed in greater detail in Chapter 3.

2.1.3 Hand Motion Gestures

Hand motion gestures are much more varied than tactile gestures and are more difficult to classify. They can be deliberative, such as tracking a curve to indicate a desired trajectory, or flippant, as in motioning a little farther along. Deliberative gestures are easier to recognize and are better suited to training, so we focus on them for GBP.

We consider three types of hand motion gestures: straight-line translations, free-form translations, and reorientations. These three types of gestures provide redundant information on the segments of the trajectories and their boundaries during training. Redundancy is important to achieve high interpretation accuracy because the recognition accuracy of individual gestures is low by comparison.

2.2 GESTURAL ALPHABET

Using a linguistic analogy, the raw gestures and state information form a gestural alphabet. The gestural alphabet is made up of characters representing the different gestural modes and the different types of relevant state information. A gestural word is assembled from the raw gesture and its associated context by a gesture recognizer. Gesture interpretation examines these words within gestural sentences that are strung together by the gesturer.

The gestural alphabet is like a linguistic alphabet in the sense that characters are parametrized. For instance, in English, the letter "a" has many sounds including the long "a" as in "save," the short "a" as in "sad," and the silent "a" as in "read." This is an implicit parametrization. In contrast, the gestural characters in our representation are explicitly parametrized by either a single integer or floating-point argument.


For example, consider a "nudge" -- an end-of-arm tactile gesture. Let's call it character "N" for nudge. (The alphabetic labels are arbitrary.) N is discretely parametrized into one of three types: brief, elongated or continuation; so N takes an integer parameter. The reason for these three types of nudges is to allow for gestures that are drawn out in time. This notation permits the recognition of the start of a gesture in progress.

The raw gesture is just a force impulse so it has some average magnitude and direction associated with it as well as its duration. The magnitude and direction can be represented with force components so, for planar forces, we add the alphabetic characters X and Y with their floating-point parametric values. For a particular gestural event the raw gesture becomes "Nn Xx Yy," where n, x and y are the instantiated values associated with the event.

To form a complete gestural word, state information must be added to facilitate interpretation. Imagine a fictitious application that requires only the cartesian position of the end-of-arm. We can call the components of the robot's cartesian position P, Q and R, which would have floating-point parametric values, also. The resulting application-specific gestural word would be "Nn Xx Yy Pp Qq Rr."
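To make the notation concrete, the sketch below shows one way the characters and words described above could be represented in software. It is a minimal illustration only; the character labels N, X, Y, P, Q, R follow the example in the text, but the class names, structure, and the integer coding of the nudge types are hypothetical and are not taken from the actual implementation.

    # Minimal sketch of gestural characters, words, and sentences (hypothetical names).
    from dataclasses import dataclass
    from typing import List, Union

    @dataclass
    class GesturalCharacter:
        label: str                    # e.g. "N", "X", "Y", "P", "Q", "R"
        value: Union[int, float]      # single integer or floating-point parameter

    # A gestural word is a raw gesture plus its application-specific state context.
    GesturalWord = List[GesturalCharacter]

    # The nudge example from the text: "Nn Xx Yy Pp Qq Rr"
    word: GesturalWord = [
        GesturalCharacter("N", 1),     # 1 = brief, 2 = elongated, 3 = continuation (illustrative coding)
        GesturalCharacter("X", 4.2),   # average planar force components of the impulse
        GesturalCharacter("Y", -1.3),
        GesturalCharacter("P", 0.52),  # cartesian end-of-arm position appended as state context
        GesturalCharacter("Q", 0.10),
        GesturalCharacter("R", 0.85),
    ]

    # A gestural sentence is simply a time-ordered sequence of such words.
    sentence: List[GesturalWord] = [word]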

2.3 COGNITIVE ARCHITECTURES

Observing gestures to extract human intention is a complex and inherently qualitative task. There are no specific analytical models that can be applied to a sensor stream to output an optimum gestural interpretation. Some form of reasoning must be performed to arrive at a probable interpretation.

Deliberative cognitive architectures have been the mainstay in robotics since its inception. Shakey [Nilsson, 1984] is a classic example of a hierarchical, monolithic, sense-think-act architecture from the late 1960s and early 1970s. In the past decade, there has been a strong push toward multi-agent systems as opposed to monolithic ones. These tend to incorporate reactive, sense-act behaviors that avoid the latencies of deliberative world modelers, but they still rely on hierarchical structures for goal management. For example, the subsumption architecture [Brooks, 1986][Connell, 1989] is the prototypical behavior-based architecture that relies on the ability of higher-level layers to subsume the functions of lower-level layers.

While monolithic deliberative architectures are appropriate for many types of goal-directed tasks, they are not a good model for many biological systems, such as insect colonies, which have demonstrated themselves to be successful and robust in complex environments (e.g. the natural world). The Tropism System Cognitive Architecture [Agah and Bekey, 1996b] was proposed as a basis for the study of the emergent and collaborative behavior of robot colonies (multi-agent systems) with limited inter-agent communication. Tropisms are the positive and negative responses of an agent to specific stimuli -- essentially the agent's likes and dislikes [Walter, 1953]. Each agent's behavior is determined by a set of evolving tropisms. A key benefit of this like/dislike representation is that it is easily understood by system designers.

Due to the highly qualitative nature of gestures and the need to reconfigure the recognition software for different applications in both training and demonstration, we decided a multi-agent approach would be more appropriate than a deliberative architecture. In this chapter a modified version of the tropism architecture is described and it serves as the multi-agent infrastructure for the implementations of this entire thesis. It will be shown here that a multi-agent system equipped with the Fine-Grained Tropism System Cognitive Architecture [Voyles et al, 1997] can be utilized to recognize and interpret human operators' gestures for robot instruction. Tropism-based cognition not only allows for simple construction of the control agents, but also yields a system that can function well both in reactive and proactive modes.

2.4 TROPISM SYSTEM COGNITIVE ARCHITECTURE

The Tropism-based cognitive architecture system proposed by Agah and Bekey [1996b] is depicted diagrammatically in Figure 2.2. The architecture is based on the idea that the behavior of an agent is dependent on the likes and dislikes of the agent. Biological systems tend to do things they like and avoid those which they dislike. When an agent encounters a situation, the likelihood of a response toward the more likable entity is higher. The tropism-based cognition enables an agent to behave in such a manner. It should be noted that the agent does not necessarily take the action with the highest tropism, but it is more likely to take an action that has a higher tropism.

The agent is embedded in a world that is populated by entities, εi, each in state σi, which can be sensed. Based on this sensed information, an agent has preference τj to take action αj. Thus, a tropism element can be represented as a 4-tuple:

(ε, σ, α, τ)

The behavior of an agent is thus determined by a set of such tropism elements, Ξ. As this set can change dynamically, it is a function of time:

Ξ(t) = {(ε, σ, α, τ)}


After sensing the surrounding world, each agent matches sensed entity/state pairs with tropism elements in its tropism set. From this subset of matched tropism elements the agent probabilistically selects actions based on its own preferences. (See [Agah and Bekey, 1996b] for more detail.)
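The sketch below illustrates this sense-match-select cycle. It is not the original implementation; the data structures and the particular probabilistic selection rule (tropism strengths used as selection weights) are assumptions made only to show the shape of the loop.

    # Illustrative sketch of the standard tropism cycle (assumed structures, not the original code).
    import random
    from dataclasses import dataclass
    from typing import Callable, List, Tuple

    @dataclass
    class TropismElement:
        entity: str        # epsilon: an entity the agent can sense
        state: str         # sigma:   the sensed state of that entity
        action: Callable   # alpha:   the action to take
        tropism: float     # tau:     strength of the preference (assumed non-negative here)

    def tropism_step(tropism_set: List[TropismElement],
                     sensed: List[Tuple[str, str]]) -> None:
        # Match sensed (entity, state) pairs against the agent's tropism set.
        matched = [t for t in tropism_set if (t.entity, t.state) in sensed]
        weights = [t.tropism for t in matched]
        if not matched or sum(weights) <= 0:
            return
        # Probabilistic selection: a higher tropism is more likely to fire,
        # but the highest is not guaranteed to be chosen.
        chosen = random.choices(matched, weights=weights, k=1)[0]
        chosen.action()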

Previous applications of this architecture have focused primarily on colonies of robotic agents, each equipped with the tropism architecture and their own sets of tropisms (e.g. [Agah and Bekey, 1996a][Agah and Bekey, 1996b]). These colonies consisted of many individual robotic agents with discrete sets of states and actions, embodied in worlds with a known set of entities. Typical worlds included hunter/gatherer agents and predator agents.

For the interpretation of human gestures, we are not interested in colonies of little mobile robots running around searching for hockey pucks. Instead, we are interested in a heterogeneous colony of software agents with a high degree of agent speciality. In fact, each agent will be responsible for only one "action" in the user's world but the agent's world will consist of continuous actions and states (although we discretize the states using fuzzy sets).

FIGURE 2.2 STANDARD TROPISM SYSTEM COGNITIVE ARCHITECTURE


Only static tropism systems are examined in this work. We do not address the issues of dynamic learning of individual tropism sets and group evolution (ontogenetic and phylogenetic learning), which are explained in [Agah and Bekey, 1996a]. These remain important topics for future research.

2.5 FINE-GRAINED TROPISM ARCHITECTURE

As set forth, the tropism system cognitive architecture works well for simple embodied agents in a known world. However, for creating highly complex entities with competing complex behaviors, it is rather unwieldy. Tropism sets consist of very simple competing likes and dislikes from which more complex behavior emerges. Embodying highly complex entities suggests each entity should be constructed of multiple agents, each dictating some behavior. This implies a hierarchical structure in which fine-grained agents are networked and encapsulated to create still higher-level agents. The Fine-Grained Tropism System Cognitive Architecture is a modification of the basic architecture to better support this hierarchical structure for gesture interpretation.

An underlying motivator for the fine-grained, multi-agent architecture is our interest in the development of rapidly-deployable systems. Our approach to rapidly-deployable systems engineering involves primitive software and hardware agents that are modular, reconfigurable, simple, and reusable [Morrow and Khosla, 1995]. In this light, we restrict each agent to a single interpretation or hypothesis, resulting in a fine-grained agent decomposition, based on the tropism cognition architecture.

From the world's (user's) perspective (referring to Figure 2.2), each agent can take only one action corresponding to the hypothesis it is trying to prove or disprove. From the agent's perspective, however, it has a number of actions to take that either increase or decrease its confidence in the sole hypothesis.

2.5.1 Fine-grained implementation

Assume an embodied agent is composed of a set of fine-grained agents which we will refer to as behavior agents. Behavior agent Bk is responsible for determining when behavior βk is appropriate to exhibit. It determines this by observing gestures that strengthen or refute its hypothesis that βk is appropriate. Bk's confidence in βk is represented by the scalar, cβk, which is integrated over time.

Because the fine-grained architecture limits each fine-grained agent to the single goal of confirming or refuting its hypothesis over a series of inputs (a "gestural sentence," in this case), the actions, α, involve the increase or decrease of the agent's confidence in its hypothesis. Hence, actions become scalar functions that are integrated over time.

In the context of gesture interpretation, the entities, ε, are sensors (or virtual sensors) for different gestural classes. Sensor εi will take fuzzy value σi with membership σmi. Bk can take actions to adjust its confidence in βk by amounts αj(σmi). So the action, αj, is proportional to the membership of εi in σi. The tuple (ε, σ) is derived from a "gestural word" from a virtual sensor, or gesture recognition agent, and a fuzzification process dictated by the discrete tropism elements. These elements can be represented as:

(ε, σ, α(σm), τ)

Since α(σm) is a scalar function proportional to membership value it includes the notion of preference. In operation, there is a high degree of variability in the gestural words generated by the operator. As a result, α(σm) represents a probabilistic distribution when integrated over several samples (a gestural sentence) so we set τ to 1 without loss of generality:

(ε, σ, α(σm), 1)
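A sketch of how a fine-grained behavior agent might turn such elements into a running confidence follows. Only the overall structure (a fuzzy membership scaling a scalar action that is integrated over a gestural sentence, with an accommodation decay) follows the text; the fuzzy_membership lookup, the gains, the decay constant, and the threshold are placeholders chosen for illustration.

    # Sketch of a fine-grained behavior agent B_k (illustrative placeholders only).
    class BehaviorAgent:
        def __init__(self, hypothesis, tropisms, threshold=1.0, decay=0.95):
            self.hypothesis = hypothesis      # the single behavior beta_k
            self.tropisms = tropisms          # list of (entity, state, gain) triples
            self.confidence = 0.0             # c_beta_k, integrated over time
            self.threshold = threshold
            self.decay = decay                # accommodation: recent gestures count more

        def observe(self, gestural_word):
            # Accommodation: old evidence fades before new evidence is added.
            self.confidence *= self.decay
            for entity, state, gain in self.tropisms:
                # Hypothetical helper: fuzzy membership of the sensed value in this state.
                membership = gestural_word.fuzzy_membership(entity, state)
                # Action alpha(sigma_m): scalar adjustment proportional to membership.
                self.confidence += gain * membership

        def vote(self):
            # "Fires" its single user-level action when confidence crosses the threshold.
            return self.confidence >= self.threshold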

Adapting Figure 2.2 to this specific implementation results in the tropism system cognition for gesture recognition, as depicted in Figure 2.3, where "GRA" stands for Gesture Recognition Agent and "GIA" stands for Gesture Interpretation Agent.

2.6 GESTURE RECOGNITION AGENTS

Because gestures are context dependent, state information must be associated with each gesture. This state information is generally application specific. We modularize the gesture recognition agents by separating the application-specific state specification agent from the raw gesture agent.

Gesture recognition is handled by individual GRAs for each gestural mode. Each recognition agent is made up of two subordinate agents, one general, one domain-specific. The general agent is the "gesture agent" of Figure 2.4. It recognizes the instance of a gesture and the type of gesture within the gesture class (mode). The domain-specific agent (the "state agent") grabs the relevant state information associated with the gesture type for the given domain and appends it to the gestural word.

The Gesture Interpretation Agents are task specific and must be described within the context of a particular application.


FIGURE 2.3 STATIC FINE-GRAINED TROPISM SYSTEM COGNITIVE ARCHITECTURE AS IMPLEMENTED FOR GESTURE RECOGNITION.

FIGURE 2.4 STRUCTURE OF THE GESTURE RECOGNITION AGENT


2.7 WHAT IS AN AGENT?

I have referred to agents often in this chapter without providing a definition. The term "agent" is used in many contexts and often the definition is left to the reader's own beliefs. Wooldridge and Jennings [1995] have gathered the most commonly accepted properties of agents into what they call a "weak notion of agency":

• autonomy: agents have control over their own actions and internal state;

• social ability: agents interact with other agents (including humans);

• reactivity: agents perceive their environment and respond in a timely fashion;

• pro-activeness: agents exhibit goal-directed behavior.

Most researchers agree that these are the general properties that an agent must possess. However, Wooldridge and Jennings also acknowledge a "stronger notion of agency." Many researchers prefer to add to the above the stipulation that agents are implemented or conceptualized using concepts that are more usually applied to humans. These researchers use mentalistic notions such as knowledge, intention, belief, and obligation [Shoham, 1993].

Although we do not have a concise, quantitative definition of an agent, we view agents as hardware or software information processors and feel they should embody five qualitative properties that are similar to the weak notion of agency:

• Input/output relationship

• Autonomy

• Persistence

• Non-parental influence

• Cognitive behavior

Input/output relationship. Agents must do something within their world, whether that world is real or simulated. They must also be responsive to that world or, at least, parametrizable. Our model of port-based objects [Stewart and Khosla, 1996] allows agents to possess both "input/output ports" and "resource ports." Input/output ports are considered variables during run-time, while resource ports are considered variables during initialization. In either case the ports are inputs and outputs through which the agent responds to or affects its world. The key difference is the dynamism during run-time. This is similar to reactivity and social ability.
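A toy sketch of the distinction between the two kinds of ports follows. The port-based object framework of Stewart and Khosla is a real-time framework in its own right; the class below is only an analogy for resource ports versus input/output ports, not its actual API.

    # Analogy for the two kinds of ports (not the actual port-based object API).
    class PortBasedAgent:
        def __init__(self, **resource_ports):
            # Resource ports: variables bound once, at initialization/configuration time.
            self.config = dict(resource_ports)

        def cycle(self, in_ports: dict) -> dict:
            # Input/output ports: variables exchanged with other agents every cycle
            # at run-time; this is where reactivity and "social ability" live.
            raise NotImplementedError

    class GainAgent(PortBasedAgent):
        # Trivial example: scales a run-time input by a configuration-time gain.
        def cycle(self, in_ports):
            return {"out": self.config["gain"] * in_ports["in"]}

    agent = GainAgent(gain=2.0)          # resource port set at initialization
    print(agent.cycle({"in": 1.5}))      # input/output ports exchanged at run-time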


Autonomy. Agents must be able to function on their own. It is not necessary that they be able to fully achieve their goals on their own, nor must they be able to survive on their own. Instead, teams of agents may be needed for certain tasks, but each agent must have some level of autonomy in processing its inputs and outputs. This is slightly different from the autonomy of Wooldridge and Jennings; see below.

Persistence. There is a need to distinguish a subroutine from a child process. To this point, we cannot rule out a subroutine because it can have resource ports (arguments and return value) and input/output ports and is autonomous for the duration of the call. Of course, a subroutine is wholly dependent on the main execution thread so, at best, it can be considered a piecewise autonomous agent. Yet, persistence is the key idea that differentiates a child process from a subroutine. Our combination of autonomy and persistence is similar to the autonomy of Wooldridge and Jennings.

Non-parental influence. To be truly independent, an agent must be able to influence agents other than its parent. This helps distinguish levels of decomposition but does not exclude hierarchies in which a collection of agents can be considered an agent in its own right. This property also requires that the environment be considered an agent.

Cognitive behavior. Cognitive behavior, or, perhaps more appropriately, nonlinear behavior, is the most controversial property because it seems to be the most arbitrary. Nonetheless, we feel a need to exclude such trivial things as parametrizable constants. In essence, agents should possess some non-trivial behavior, but it is difficult to quantify or define "non-trivial."

2.8 EXAMPLE OF GESTURE INTERPRETATION FOR INSTRUCTION

The example presented in this chapter is a gesture-based user interface for a robotic manipulator that provides an intuitive mechanism for modifying its periodic trajectory. The paradigm of interaction is that the user can "nudge" the robot into the desired parametrized trajectory by physically pushing the end effector, as in Figure 2.5. The trajectory shape is not free-form, but one of a small number of polygonal trajectory families. This implies the gesture recognition cannot be free-form, either, relying on the user to close the loop. Instead, the intention must be abstracted relative to the known allowed operational modes of the system.

For this particular implementation, each family of trajectories is constrained to the same vertical plane and the set of trajectory shapes consists of: a cross, a rectangle, a right triangle, and a pick-and-place path (Figure 2.6). There is also a "null" trajectory that allows the operator to pause the robot. Note that the rectangle and pick-and-place trajectories were deliberately chosen to be very similar.


All shapes have variable width (w) and height (h) and the cross can also vary in thickness (t). These parameters all vary independently. Nothing but these parameters is allowed to vary; the orientation and center point are fixed (although appropriate agents could be developed). No provision has been made at this time for non-polygonal trajectories. Tactile gestures, in the form of "nudges" on the end-effector, are the only means of selecting a trajectory and varying its parameters.

FIGURE 2.5 GESTURAL USER INTERFACE EMPLOYING TACTILE GESTURES

FIGURE 2.6 THE SHAPES OF THE TRAJECTORY FAMILIES.

As information is conveyed to the agents through the operators' delivery of nudges (tactile gestures) to the robot, appropriate action is selected by the agents utilizing tropism-based cognition. The hand contacts convey the intentions of the robot instructor in much the same way that mouse movements convey the intentions of a computer user of a graphical user interface (GUI). By clicking and dragging corners or sides of a window, the user's intentions of changing the size of the window are communicated to the user interface. The GUI will then make changes according to the state of the system. Analogously, the robot operator delivers tactile gestures to a rectangular model of a trajectory, intending to change the shape or the size of the rectangle. The difference is the GUI application uses a set of table look-up gestures that have one-to-one mappings between gesture and intention, so interpretation is trivial.

2.8.1 Relevant Gestures and Psychological Basis for Interpretation

End-of-arm tactile gestures are used to select and fine-tune the robot's polygonal trajectories. Much of the work on end-of-arm tactile gestures that is used for this research has been reported in [Voyles and Khosla, 1995a]. To summarize, end-of-arm tactile gestures are force impulses that have associated with them state information that includes:

• average force components

• force magnitude

• parallelism between force and the X-axis

• parallelism between force and robot velocity

• parallelism between force and radial vector

• quadrant location of gesture with respect to centroid

• closeness to a vertex of the polygon

Except for the force magnitude variables, the state information is represented as fuzzy variables. For example, parallelism between the force and the robot's velocity has a membership function defined by the dot product. This state value is used to determine if the gesture is aiding the motion (parallel), opposing it (anti-parallel) or trying to alter its direction (perpendicular).
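As an illustration, such parallelism memberships could be computed from a normalized dot product along the lines of the sketch below. The exact membership shapes used in the actual system are not given in the text, so the particular formulas here are assumptions.

    # Illustrative fuzzy "parallelism" memberships from a dot product (assumed shapes).
    import numpy as np

    def parallelism_memberships(force, velocity):
        f = np.asarray(force, float)
        v = np.asarray(velocity, float)
        f = f / (np.linalg.norm(f) + 1e-9)
        v = v / (np.linalg.norm(v) + 1e-9)
        c = float(np.dot(f, v))                 # cosine of the angle between them
        return {
            "parallel":      max(c, 0.0),       # aiding the motion
            "anti_parallel": max(-c, 0.0),      # opposing the motion
            "perpendicular": 1.0 - abs(c),      # trying to alter its direction
        }

    # Example: a nudge mostly aligned with the current velocity.
    print(parallelism_memberships([1.0, 0.1], [1.0, 0.0]))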

The reasons for these selections come from the psychology literature [Hebb, 1949; Pearson, 1961; Hoffman and Richards, 1984] and our own observations. The dominant features distinguishing different polygons are generally the corners. Looking at Figure 2.7, it is obvious that those are the corners of a triangle. Even though the sides are substantially missing, the shape is still apparent. It seems natural that if one wants to "massage" one polygon into another polygon, he/she would tug at the corners. Conversely, if one wanted merely to change a parameter of the given polygon - such as elongating it - he/she would tug on the sides rather than the corners (also suggested by the literature). This is evidenced by Figure 2.8. The lines do not suggest "rectangle" so much as they suggest some relationship such as separation, width, or height. It is interesting to note that this logical separation for influencing motion runs parallel to recent work in perceiving motion [Rubin and Richards, 1985].

Furthermore, changing the shape of the trajectory requires the corroboration of several of these "force features" (as perceived by the agents) in much the same way as several corners are required to visually suggest a shape. As a result of waiting for corroboration, response is delayed. There are parameter modifications - elongation, for example - for which the user's intention can be inferred almost instantaneously. Response to these input features can be immediate.

These observations provide the basis for all of the selection agents. Forces applied at the corners are the dominant features for the shape-determining agents while forces applied along the sides are the dominant features for the parameter-determining agents. Of course, there are some exceptions and special cases that particular agents employ, but in general, these two basic rules dominate the generation of the agents' tropism sets.

FIGURE 2.7 HINTS OF A TRIANGLE.

FIGURE 2.8 EDGES OF POLYGONS SUGGEST RELATIONSHIPS MORE THAN SHAPE.

Pinching gestures are also tactile gestures because they involve touch, but they are quite different from end-of-arm gestures. For this application, we are only sensing pinch between the index finger and thumb to determine start and stop points for trajectories during training. Although we maintain a fuzzy membership value based on the strength of the pinch above a threshold, the pinch gesture is used as a binary flag for this application.

2.8.2 The Multi-Agent Network

2.8.2.1 Agent topology

The agent network for interpreting tactile gestures is illustrated graphically in Figure 2.9 (additional controller and utility agents are not shown but are described in [Voyles and Khosla, 1995a]). The gray boxes represent gesture recognition agents and the open boxes represent gesture interpretation agents. Confusion is a virtual sensor agent, like the recognition agents, but it provides proprioceptive sense of the network's inability to reach consensus rather than sensing gestures from the operator. The CyberGlove GRA is for detecting hand motion gestures and symbolic gestures and is used in conjunction with another application described in Chapter 6.

FIGURE 2.9 THE TROPISM SYSTEM INSTANTIATION.


The GRA's detect elemental gestures and attach application-specific state information, creating "gestural words." The GIA's (height, width, etc.) interpret the gestures, collectively arbitrate the most probable interpretation, and modify the execution parameters of the robot agent and its cartesian controller.

2.8.2.2 Tropism-based interpretation agents

The GIA's consist of a two-stage combination of characteristics from both fuzzy and neural systems to represent the tropism set. State information associated with each gesture is represented by fuzzy variables but the final confidence value is computed by a biological neuron-like accumulator of tropism "firings."

All interpretation agents are independent from one another and are responsible for evaluating the user's intentions with respect to one particular parameter. (i.e. rect is responsible for determining if the user is trying to switch to the rectangle trajectory.) These agents utilize the application-specific context appended to the gestural word by the GRA's, which includes:

• average force components

• force magnitude

• perpendicularity of force and vertical (perp-vert)

• perpendicularity of force and robot velocity (perp-vel)

• parallelism of force and radial vector (points-out)

• quadrant location of gesture with respect to center

• closeness to a corner (corner)

To illustrate elements of the fine-grained tropism set, several representative agent descriptions appear below. Agents that are not described here operate similarly and are described in [Voyles and Khosla, 1994].

Contact. Contact is a special purpose agent that allows for predetermined physical interaction with the world. If the robot makes contact with the world, the agents could confuse that information for some type of gesture. Instead of programming agents to beware of such "sensor noise," the contact agent instructs the recognizers to filter out specific impulse vectors that are task-related.

Confusion. This agent serves two very important purposes: it helps us maintain agent independence and it allows for future development. Before explaining how it accomplishes these two things, we'll first describe it.


Confusion acts as a virtual sensor by determining the "confusion" of all the shape agents and calculating a measure of their indecision based only on the value and rate of change of the runner-up confidence. This value is available to all other agents' input ports if their internal models dictate its use. This is useful to any particular agent because it provides information on the "confidence" of the network as a whole, rather than an individual peer.

By examining the confusion measure, any agent can determine a rough upper bound for the confidence of all other agents without being aware of their individual goals or identities. This helps to ensure agent independence and modularity. Also, the confusion agent does not violate modularity since its calculation is based only on the runner-up confidence value without dependence on the particular agents that are posting. Finally, the failure or removal of this agent merely results in a defective virtual sensor and the other agents continue operating.
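A minimal sketch of a confusion measure of this kind appears below. The text specifies only that the measure depends on the value and rate of change of the runner-up confidence; the particular way they are combined here is an assumption for illustration.

    # Illustrative confusion measure based only on the runner-up confidence (assumed form).
    class ConfusionAgent:
        def __init__(self):
            self.prev_runner_up = 0.0

        def update(self, shape_confidences, dt=0.1):
            # Sort descending; the runner-up is the second-highest confidence.
            ranked = sorted(shape_confidences, reverse=True)
            runner_up = ranked[1] if len(ranked) > 1 else 0.0
            rate = (runner_up - self.prev_runner_up) / dt
            self.prev_runner_up = runner_up
            # A high and rising runner-up confidence means the network cannot reach consensus.
            return runner_up + max(rate, 0.0)

    confusion = ConfusionAgent()
    print(confusion.update([0.8, 0.7, 0.1, 0.05]))   # close race: high confusion
    print(confusion.update([0.9, 0.2, 0.1, 0.05]))   # clear winner: lower confusion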

In future work, we want the network to be self-modifying to effect gesture-based programming of multi-agent systems. For this type of programming paradigm, it will be necessary for the network to have a proprioceptive sense of its competence so it can modify its own topology to better achieve its goals. The confusion agent provides one candidate measure of competence.

Width. The width and height agents are identical in operation. Only the direction of the force features and the parameter on which they act are different. The actions of width are determined by a very simple internal model that encodes a small number of distinct features. Positive (reinforcing) gestures are: {{{perp-vert AND not-perp-vel} OR {perp-vert AND perp-vel}} AND not-corner AND confusion} while negative gestures are {not-perp-vert}. (Width is horizontal, hence the importance of perp-vert.)

From an intuitive standpoint, the model works like this: If the end effector is moving along a side of the polygon and I push it further along the side (force A, Figure 2.10), I must want to elongate that side if the shape confusion is high. Likewise for shortening. On the other hand, if I pull out on a side of the polygon, it's not clear what I'm trying to do. I could be trying to change the thickness, the width, or even the shape. But if I repeatedly pull out on opposite sides of the polygon, I must be trying to increase the width (forces B and C, Figure 2.10).

FIGURE 2.10 TO INCREASE THE WIDTH, EITHER APPLY FORCE A OR ALTERNATELY APPLY FORCES B AND C.

Rectangle. The rectangle model is based on gestures that tug out at the corners - or what would be the corners if the current shape was a rectangle. Therefore, outward forces at any corner have high rectangle membership ("excitatory" in the graphical representation of Figure 2.11). Inward forces anywhere have high membership in "not rectangle" ("inhibitory"). If the gesture is applied along a side, it is excitatory if it is outward in the direction of the corners of a very large square. If the force is oblique with respect to the velocity vector, it is slightly inhibitory, as are attempts at reversals in direction.

The agents use fuzzy logic to classify gestures and match them against tropism elements, and then use algebraic equations to calculate actions that raise or lower the confidence in the agent's hypothesis.

For example, triangle maintains the sole hypothesis that the user intends to select the triangle trajectory. Its likes include nudges (tactile gestures) that tug the trajectory radially outward at the right-angle corner and along the hypotenuse at the other two corners. It also "likes" nudges that push the trajectory along the hypotenuse when away from a trajectory corner. Dislikes include nudges that tug out at the corners close to the hypotenuse, tug straight-line motions away from the horizontal or vertical except near the hypotenuse, and nudges that impede motion. Each of these likes and dislikes becomes a tropism element.

FIGURE 2.11 GRAPHICAL REPRESENTATION OF THE INTERNAL MODEL OF THE RECTANGLE AGENT.

For the first "like," triangle forms the fuzzy membership of the feature {corner AND points-out AND 4th-quadrant}. If membership is sufficiently high, a match is made with the tropism element and an action that supports the hypothesis is taken. This action (the amount of increase in the agent's confidence) is calculated algebraically using the respective membership values and the magnitude of the nudge. The "neural characteristics" referred to above are these dynamic weights on the inputs and the accumulation of these actions and subsequent "firing" of the vote when an internal threshold is reached. (There is also an accommodation function that weights recent gestures more heavily than past gestures.)
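The sketch below illustrates the flavor of such a tropism firing for the triangle agent's first "like": a fuzzy conjunction of the appended context features, an algebraic confidence increment scaled by the nudge magnitude, and accumulation with accommodation. The min-operator conjunction, the matching threshold, and the gains are assumptions, not the equations actually used in the system.

    # Illustrative triangle-agent tropism firing (assumed conjunction, threshold, and gains).
    def fuzzy_and(*memberships):
        return min(memberships)       # one common t-norm; the actual choice may differ

    def triangle_like_update(confidence, word, gain=0.2, decay=0.9):
        # 'word' is assumed to carry fuzzy memberships for the context features appended
        # by the GRA (corner, points-out, 4th-quadrant) plus the nudge magnitude.
        m = fuzzy_and(word["corner"], word["points_out"], word["quadrant4"])
        confidence *= decay                              # accommodation: recent gestures dominate
        if m > 0.5:                                      # match with the tropism element
            confidence += gain * m * word["magnitude"]   # action supporting the hypothesis
        return confidence

    c = 0.0
    c = triangle_like_update(c, {"corner": 0.9, "points_out": 0.8,
                                 "quadrant4": 0.7, "magnitude": 3.0})
    print(c)   # confidence in the triangle hypothesis after one supporting nudge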

The final vote of what the operator's intentions are and what user-level action to take is a multi-step process described in [Voyles and Khosla, 1995a], the details of which have little bearing on this discussion. Essentially, the highest confidence wins.

2.8.3 Experimental trials

Several trials were executed with a trained operator using both a desk-mounted trackball and a gripper-mounted force/torque sensor as input devices. The agents were tested in approximately ten "significant" network configurations, where significant is defined as consisting of a minimum of ten of the roughly fifteen available agents (including controller agents).

With all agents operating and a trained operator, Figure 2.12 illustrates a trial of the system switching between the triangle and rectangle trajectories using the gripper-mounted F/T sensor. The top strip chart shows time histories of the X and Z positions of the robot (top line is X-position, bottom line is Z-position) with the corresponding end effector forces superimposed. The bottom two boxes are spatial (X-Z) representations of one actual cycle of each trajectory family with force vectors showing points of application of the impulses. The top strip chart and bottom two plots are different representations of the same data; the former temporal, the latter spatial.

In the first phase of the strip chart in Figure 2.12 (0 to 13 seconds), the robot is executing a triangular trajectory which is depicted spatially in the bottom left box. Gestures f1 and f2 (visible as positive blips in the strip chart data) are applied to the hypotenuse, pulling out on the side to create another corner. This causes the confidence of the rectangle agent to dominate and the trajectory switches. Phase 2 (13 to 40 seconds) is rectangular, but gestures that push the end effector along the hypotenuse of the intended triangle (f3 and f4) and one that collapses the corner of the rectangle cause a switch back to the triangular trajectory.

These two transitions occurred after only two and three gestures, respectively. On average, it takes about four gestures to properly discern the user's intentions. This particular trial violated an assumption on which the system is based: changes to the trajectory are rather few and far between. As a result, the mode changes happen somewhat more rapidly than they otherwise would due to the decay time of the confidence value.

Figure 2.13 contains data taken from another trial. In the first phase of this plot (200 to 213 seconds), the robot is executing a narrow rectangular trajectory which is depicted spatially in the bottom left box. Gestures g1 and g2 push the rectangle wider. This effect is apparent in the top strip chart as the X envelope gradually grows with more gestures. Once the rectangle has been widened, nudges that tug in on the corners (g4 - g7 in the center picture) collapse the rectangle to the cross, depicted spatially in the bottom right box. This takes a total of five gestures. g9 and g10 further widen the cross.

Width and height adjustments can occur after a single gesture, as evidenced by g3, which further increased the width after g1 and g2 caused the initial increase. As mentioned previously, most trajectory family changes occur after three to five gestures.

"Pick" is consistently the most difficult because of its similarity to "rectangle" and the long path length between the distinguishing features (the end points).

FIGURE 2.12 ANNOTATED TEMPORAL PLOT (TOP) OF THE ROBOT'S POSITION IN THE PLANE SUPERIMPOSED WITH FORCE IMPULSES. SPATIAL PLOTS (BOTTOM) SHOW PLANAR TRAJECTORY INTERVALS WITH CORRESPONDING ANNOTATIONS.

2.9 CONCLUSIONS

The Fine-Grained Tropism System Cognitive Architecture provides a useful and intuitive formalism for describing and implementing colonies of highly specialized software agents for tactile and motion gesture interpretation. The gesture interpretation systems described here were easy to implement, easy to reconfigure, and could be assembled in many topologies for different interpretations.

FIGURE 2.13 SECOND-TRIAL TEMPORAL PLOT (TOP) OF THE ROBOT'S POSITION IN THE PLANE SUPERIMPOSED WITH FORCE IMPULSES. SPATIAL PLOTS (BOTTOM) SHOW PLANAR TRAJECTORY INTERVALS WITH CORRESPONDING ANNOTATIONS.

The fine-grained nature of the agents, implemented in the port-based object framework, allows for easy decomposition into reconfigurable and reusable elemental software agents. Of course, the decompositions described were done manually, which points out the need for an intuitive architectural representation.


Chapter 3

Observing Tactile Gestures

Realizing the Sense of Touch

3.1 MEETING THE SENSORY NEEDS OF GESTURE-BASED PROGRAMMING

Many researchers (such as Kang, Ikeuchi, and Kuniyoshi, for example) believe it is only necessary to observe the motions and poses of the hand to achieve programming by human demonstration, even for contact-intensive tasks. This belief is based on the human analogy; when one human trains another, the trainee generally only observes with the eyes (and, perhaps, the ears). There is rarely any explicit force (contact) information conveyed during training. Parametrization of force-based primitives is done by selecting nominal values based on experience and then refining them during trainee practice.

While this is plausible given the wealth of experience possessed by a human, we feel it requires too many assumptions for implementable robotic systems in the near term. Our approach, for the time being, is to replace many of the assumptions and much of the guesswork with sensors. Using fingertip tactile sensors we can actually measure the contact parameters and help discriminate between closely related sensorimotor primitives.

But there is another reason to consider tactile sensing. GBP assumes a skill base of previously-acquired sensorimotor primitives upon which task execution and demonstration interpretation will be based. This skill base comprises the shared ontology between trainer and trainee. A significant portion of this thesis deals with a candidate method for populating the skill base. Even the sensorimotor primitives can be learned from human demonstration and teleoperation, given the right set of sensory information.


As described in Section 1.4 and illustrated in Figure 1.2, much of the incoming information for our system comes from a sensorized glove. The basis for our sensorized glove is the CyberGlove. But this glove only provides kinesthetic information and has no provisions for measuring fingertip contact conditions. In fact, no commercial glove has such capability. In order to observe the contact information so critical to interpreting the gestures associated with the contact tasks we're trying to program, we must construct wearable tactile sensors that can be integrated with the CyberGlove. To minimize transformations between the human and robot tasks, it is desirable that these sensors be modular so that they can be used both by the human and the robot.

This chapter provides a cursory description of our work to create such a modular tactile sensor system, including preliminary work on an inside-out symmetric tactile actuator. Though somewhat peripheral to the central concept of GBP, it provides a starting point for the data acquisition and analysis employed throughout the work and motivates the following chapter on calibration (a key foundational component of the thesis).

3.1.1 Intrinsic and Extrinsic Tactile Sensing

The human haptic sensing system consists of a wide variety of transducers and sensing modalities. Perhaps the seminal description of this system from the physiological perspective is by Valbo and Johansson [1984]. Summaries of work in physiology and neurobiology on the human haptic system of interest to robotic tactile sensing and actuation can be found in [Nicolson, 1994][Kontarinis, 1995][Cholewiak and Collins, 1992].

We focus our efforts on force and displacement sensing because these are most relevant to the types of manipulation primitives we are trying to acquire. Within this domain, there are two general classes of contact sensors: intrinsic and extrinsic [Bicchi and Dario, 1988]. Intrinsic tactile sensing uses pure force and torque measurements plus geometric calculations to deduce contact conditions at the finger/object interface. Salisbury [1984] first proposed this approach for determining local surface orientation. The major drawbacks to this approach are the inability to handle multiple points of contact and errors induced by moments imparted by soft fingertip contact. Its benefits include theoretically infinite spatial resolution, high bandwidth, and linear, nonhysteretic force response.
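To make the intrinsic approach concrete, the sketch below recovers a single contact location from a force/torque measurement in the style of Salisbury's method, assuming a spherical fingertip, a single contact point, and negligible spin moment about the contact normal. It is a simplified illustration under those assumptions, not the sensor processing actually used in this work.

    # Intrinsic contact sensing sketch: locate one contact on a spherical fingertip
    # from a force/torque measurement (assumes a single contact and no spin moment).
    import numpy as np

    def contact_point_on_sphere(f, tau, center, radius):
        f = np.asarray(f, float)
        tau = np.asarray(tau, float)
        # Points on the line of action of the contact force: p = (f x tau)/|f|^2 + s*f
        p0 = np.cross(f, tau) / np.dot(f, f)
        # Intersect that line with the fingertip sphere |p - center| = radius.
        d = p0 - np.asarray(center, float)
        a, b, c = np.dot(f, f), 2.0 * np.dot(d, f), np.dot(d, d) - radius ** 2
        disc = b * b - 4.0 * a * c
        if disc < 0:
            return None                        # no consistent contact on the surface
        s = (-b - np.sqrt(disc)) / (2.0 * a)   # entry point: the force presses into the fingertip
        return p0 + s * f

    # Example: 1 N force pressing straight down on top of a 10 mm-radius fingertip at the origin.
    print(contact_point_on_sphere(f=[0, 0, -1], tau=[0, 0, 0],
                                  center=[0, 0, 0], radius=0.01))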

Extrinsic tactile sensing uses arrays of 1-dimensional pressure sensors distributed over the sensor surface to extract a tactile image of the finger/object interface. This is where most of the research on tactile sensing has focused. The advantages of extrinsic sensing include its inherently compliant finger/object interface, its ability to handle multiple contacts, and the possibility of determining object geometric properties without moving or regrasping [Fearing, 1987]. Drawbacks include the inherently limited spatial resolution, generally inaccurate force measurement (although it is improving dramatically), and difficulty in construction.

3.2 PRIOR WORK

Dozens of tactile sensors have been developed over the past twenty years or so. Seigel [1986] and Nichols and Lee [1989] provided good early surveys of tactile sensing technologies while Howe's [1994] is more up-to-date. Some of these sensors over the years have gone so far as to become commercially available or are being produced within corporate R&D labs "for distribution." Still, none have been successful at widespread use or even consistently applied to a particular task. The main reason for this has been the lack of effective, generic strategies for using the tactile feedback. Several researchers have used tactile feedback for determining object shape [Salisbury et al, 1986], [Fearing, 1987], [Nicolson, 1994] but few have achieved robust manipulation. Nicolson [1994] achieved robust grasping of unknown planar ellipses and rectangles. Berger and Khosla [1991] achieved real-time edge tracking with a direct-drive arm, but did not actually manipulate parts. Son [1996] used hand-coded primitives to manipulate objects with a capacitive extrinsic tactile sensor of similar design to Fearing and Nicolson.

The vast majority of prior designs have relied on rubber and foam. While these have desirable design properties, it has been conjectured that they provide inferior contact "feel." Shimoga and Goldenberg [1992] and Akella [Akella and Cutkosky, 1989], [Akella et al, 1991] have shown hard-surface and foam-surface sensors have inferior contact properties compared to gels and powders. In particular, impact and conformal properties seem to be worse in pure Hooke's Law materials. Active gels, in particular, hold promise for providing not just useful contact properties, but an additional degree of freedom in modulating the grasp [Kenaley and Cutkosky, 1989]. Electrorheological (ER) fluids, which can change viscosity in the presence of an electric field, exhibit such active behavior and can be gelled [Voyles et al, 1989].

Fewer tactile actuators have been developed, in comparison to the volume of work on tactile sensors. Work on actuators (tactors) has focused primarily on virtual reality applications and sensory substitution. Shimoga [1993] provides a comprehensive survey of actuators and technologies while the work of Kontarinis, et al [1995] is more representative of the current state of the art. The primary technology used to-date has been vibrotactile stimulation using piezoelectrics or blunt pins. A few researchers have tried electrotactile stimulation -- direct electrical stimulation of the skin. This is an attractive option because of its simplicity of physical implementation, but it has proven difficult to toe the line between useful stimulation and painful electric shock. Shimoga et al [1995] also investigated the use of commercially available shape memory alloy actuators for providing binary contact information during grasping tasks.

In and of itself, this is not sufficient reason to embark on "yet another tactile sensor design." There is a more basic motivation that springs from the application: the observability problem. Extracting meaning from the tactile gestures the human employs is difficult because much of the information content is unobservable to the sensors. The human demonstrator can make use of sensory information beyond the capabilities of the instrumented clothing, such as alternative sensing modalities as well as superior resolution and dynamic range. Since tactile sensing technology will not soon approach the capabilities of the human fingertip, we endeavor to reduce the human sensory capacity to that of the robot; in effect, to interject a filter.

What do we gain by doing this? In effect, a shared ontology, or "common view of reality," while still capitalizing on the respective strengths of the human and robot. Paradigms for programming complex hand/arm systems by human demonstration rely fundamentally on the human for planning, high-level learning, and exception handling. We focus on systems with many degrees-of-freedom (as opposed to simple grippers) because planners for such systems are difficult to construct, so there is a "big win" in incorporating human ability. Low-level tuning and repeatable execution are the contributions of the robot. With this in mind, it makes sense to "cripple" the human's capabilities and force him/her to re-learn basic skills at the robot's level because that is exactly what the human is good at doing.

With the goal of employing not just a tactile sensor, but a tactile filter as well, it is desirable to have a "symmetric technology" that can be used on both the sensing and actuation ends. A tactile sensor and actuator based on an electrorheological or magnetorheological gel addresses all these issues to some degree and provides a novel design approach with many rich directions for analysis and experimentation.

3.3 DESIGN CONCEPT

The development of this system spans many years and many miles. The original design of our coarsely-foveated ER tactile sensor with grasp actuation capability was completed in Mark Cutkosky's lab at Stanford [Voyles, et al, 1989]. Subsequently, several prototype intrinsic/extrinsic sensor combinations have been fabricated at Carnegie Mellon with crude prototypes of the end-to-end human/robot system with this modular ER sensor/actuator.

The modular sensor/actuator system is based on three interchangeable components: an extrinsic sensor array, an intrinsic force/torque sensor, and a conformal tactor array.


The tactor array is for end-of-human use only, either in conjunction with the extrinsic sensor array for gesturing or in conjunction with an external force reflecting device for virtual reality (VR) applications (more on this later). The intrinsic sensor is for either human or robot and is used in conjunction with the extrinsic array or a variety of passive tips during autonomous manipulation or telemanipulation. The extrinsic sensor can be used either on the robot or on the human, both with and without underlying components for intrinsic sensing.

For sensing human contact gestures, the extrinsic sensor array and the tactor array are similar to a pair of thimbles. The wearable tactor will be somewhat smaller than the wearable extrinsic sensor so it can nest inside the sensor (Figure 3.1). Clearance will be provided so an instrumented glove can be captured between the "thimbles" for finger joint sensing.

The design of the two components will be very similar in that the tactor is an "inside-out" version of the sensor. Both use the electrorheological gel developed by Voyles, et al [1989], or the similar magnetorheological gel under development, to provide "fleshy" properties. The sensor will use it primarily as a dielectric for capacitive sensing and as a structural material. The tactor will use it primarily for its rheological properties.

For sensing robotic contact during manipulation, the extrinsic and intrinsic sensors will be nested to provide both contact force distribution and net force/torque. Two designs are being explored for the intrinsic sensor. The first design is based on a cylindrical cantilever beam similar in concept to that of Bicchi and Dario [1988] and is depicted in Figure 3.2.

FIGURE 3.1 CROSS-SECTION OF ER SENSOR AND TACTOR


Not surprisingly, ER fluids have been proposed and implemented as haptic devices for sensing and actuation in limited ways in the past. Cutkosky has been involved with ER fingers [Akella et al, 1991], [Kenaley and Cutkosky, 1989], [Voyles et al, 1989] for a long time, both as sensors and "passive actuators." Monkman and Sano have both been working on ER display mechanisms recently. Monkman proposes a true tactile display [Monkman, 1992], similar in concept to that proposed here, but only simulates its effect in a planar configuration. It is also geared more toward reconstructing the world locally, rather than reconstructing local tactile stimulation from the world. Sano constructed a prototype haptic interface [Sano, 1995] somewhat like a joystick with controllable damping and proposes extending it to all five fingers.

Electrorheological fluids are attractive because they are controlled electrically, which is convenient, they require little power (although voltages are very high), there are no moving parts, and they can be made very compact. In fact, the smaller the dimensions, the higher the field strengths and the stronger the ER effect.

These characteristics make ER fluids attractive to the haptic interface community. Tactile sensing and actuation are probably most limited by packaging issues, so any technology that promises to address the packaging issues associated with the resolution, dynamic range and diversity of the sensing and actuation at the human fingertips, even in a primitive way, will get some consideration.

FIGURE 3.2 CROSS-SECTION OF ER SENSOR AND STRAIN GAUGE SENSOR


3.4 INTRINSIC TACTILE SENSOR DESIGN

The design of the intrinsic tactile sensor is fairly straightforward and has been studied by many. Bicchi and Dario [1988] present an optimized design process. Space and packaging are the biggest issues in our modular, multi-component system, but we will attempt to optimize performance as best we can.

Our first two intrinsic prototypes used plastic as the structural material, but we are investigating various metals for other implementations. Bonding strain gages to some types of plastic can be extremely difficult.

Calibration results for this component can be found in Chapter 4.

3.5 EXTRINSIC TACTILE SENSOR DESIGN

The extrinsic sensor shell is made of fiberglass for rapid fabrication. (The prototypes shown in Figure 3.3 are made of delrin.) The body of the sensor must be insulating to hold the individual capacitive electrodes. In the original cylindrical prototype, we plated copper onto an acrylic core. We then patterned an array of individual, direct-wired electrodes to form one half of the sensing and actuation capacitor. The outer rubber membrane was conductive and grounded and formed the other half of all the sensing and actuation capacitors. As in that prototype, there will be no attempt to multiplex the sense elements because that may preclude future use of the actuation potential of the fluid. (In general, although not always, multiplexing the ER excitation voltage reduces the ER effect.)

FIGURE 3.3 MODULAR TIPS FOR SINGLE-BEAM INTRINSIC SENSOR.

FIGURE 3.4 CUTAWAY VIEW OF THE ORIGINAL PROTOTYPE FINGERTIP SENSOR WITH PLATED CAPACITORS.

FIGURE 3.5 EXTRINSIC SENSOR RESPONSE TO 3MM CYLINDER PERPENDICULAR TO SENSOR AXIS.


There are several problems with sensing this way. First of all, it is a displacement sensing technique rather than a pressure or force sensing mechanism. This, of course, is a common approach, but we don't have a well-behaved medium to accurately relate displacement to force. Instead the fluid (or gel) can change from a Newtonian fluid to a Bingham plastic under excitation.

The next problem with using the outer membrane as a sensor element is the unknown rest state. Because the fluid or gel has some freedom to move (the self-healing gel partially solves this problem) and does not return completely to any known or deducible state, it would be necessary to take constant offset measurements to zero the sensor. Gravity loading is also a problem. The fluid has a tendency to sag in a gravity field.

The use of this type of local pressure sensor precludes the use of a Newtonian fluid. If that were the case, all sensors would see essentially the same pressure. This is another advantage of the ER gel: it maintains pressure distributions throughout its volume; however, one of the unsolved challenges is modeling this behavior.

Over the fingertip mandrel (the "bone") is the ER gel, which is held in place by a conductive rubber membrane that also acts as a ground plane. Even though we are using the micromachined sensors for pressure measurement, it may still be necessary to measure the volume of fluid over the sensors to adequately model the fluid behavior or to determine local shape of the object in contact. If this proves necessary, we can plate electrodes between the silicon sensors (or over them with a flexible membrane) to determine outer membrane displacement capacitively.
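As a rough illustration of the capacitive displacement idea, a parallel-plate approximation relates measured capacitance to the electrode-to-membrane gap; the numbers below (electrode area, gel permittivity) are hypothetical, and the real curved geometry would require a better model.

    EPS0 = 8.854e-12   # vacuum permittivity, F/m

    def gap_from_capacitance(C, area, eps_r):
        """Parallel-plate approximation: electrode-to-membrane gap implied by a measured
        capacitance C (F), electrode area (m^2), and relative permittivity of the gel."""
        return EPS0 * eps_r * area / C

    # hypothetical numbers: 2 mm x 2 mm electrode, relative permittivity of 5 for the gel
    gap = gap_from_capacitance(C=2.0e-12, area=4.0e-6, eps_r=5.0)   # gap in meters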

FIGURE 3.6 EXTRINSIC SENSOR RESPONSE TO 3MM CYLINDER 60 DEGREES FROM PERPENDICULAR.


A final note must be made regarding structural integrity of the sensing medium. With rubber and foam-based sensors, there is no question as to where the structural material is. However, with fluids and gels, significant emphasis must be placed on keeping the structure of the sensing medium intact. This includes the pliability of the outer membrane, but it can also be greatly enhanced with structural members such as "fingernails." On the backside of the sensor will be a fluid exclusion dam to help maintain the shape of the outer membrane and to keep the gel reasonably contained in the area of the sensitive sites.

3.6 TACTILE ACTUATOR DESIGN

Because tactile actuation research is in its infancy [Shimoga, 1993], this is a difficult task. The devices of Kontarinis, et al [1995] are generally considered the best shape display devices to-date, but these are not nearly human-wearable in size or robustness. We have come up with the novel idea of creating not only a modular sensor/actuator system, but a sensor/actuator pair that are built on the same technology and are "inside-out symmetric."

The tactile actuator (tactor) has already been described as an "inside-out sensor," so we won't discuss the physical design issues too deeply since they are very similar to the extrinsic sensor. The key difference will be the lack of micromachined pressure sensors. Not only are they unnecessary for the tactor, but their survivability within such strong electric fields would be difficult to guarantee.

It may still be necessary to sense the membrane displacement to estimate fluid volume and electric field strength, but that would be done by the capacitive displacement arrangement rather than the localized pressure sensors. Akella, et al [1991] did not use the outer flexible membrane as one electrode. Instead, they interlaced pairs of fixed electrodes perpendicular to the membrane surface. Fluid near the membrane surface was relatively unaffected by the electric fields, but upon depressing the membrane, the fluid was forced to pass between the plates, through the high field concentrations. The advantage to this is the plate geometry does not change as the membrane is deformed.

We should pause, briefly, and examine the nature of the "actuation." An electrorheological fluid only has the ability to change viscosity and even develop yield stress. It has no capacity to actively exert forces on stationary objects. The only way the user can detect a change is by exerting a force on the tactor. For our application, observing a human demonstration, this is not a problem because the user is actually performing a contact task and is exerting forces. Keep in mind we intend to use the sensor/actuator pair as an active filter to control the feedback the human receives.


Even then, some form of sensory substitution must come into play because edges and very rigid surfaces cannot be duplicated with an ER fluid. Instead the user must respond to changes in damping and yield stress.

3.7 OTHER APPLICATIONS

Although our application is in observing human demonstrations, we have considered how our devices might be useful in other domains. The sensors are fairly generic and have general applicability, but the tactor, being a "passive actuator," is more limited. For example, VR applications do not include physical contact with a real environment, as does human demonstration.

This does not preclude the use of our tactor in VR. Quite to the contrary, our tactor can be used in conjunction with existing net force haptic interfaces such as Salisbury's Phantom [Salisbury, 1993] or Hollis' Magic Wrist [Berkelman and Hollis, 1995].

The ER fluid is used in the extrinsic sensor mostly for convenience and symmetry. We can achieve the same dielectric and physical properties without the ER effect. However, maintaining ER capability in the sensor allows another avenue of future work in controlling the properties of the grasp at the contact boundary. Preliminary steps toward this examination have already been taken toward impact control [Akella et al, 1991] and improved gearing traction for grasping delicate objects [Kenaley and Cutkosky, 1989], but no one has attempted active grasp control, to our knowledge.

FIGURE 3.7 CROSS-SECTION OF ER SENSOR AND TACTOR


Chapter 4

Shape from Motion Calibration

Nearly Autonomous Sensor Calibration

4.1 INTRODUCTION

Because we are interested in instrumenting all fingers of both the five-fingered human hand and the four-fingered Utah/MIT hand with the sensors described in the previous chapter, force sensor calibration became an issue just from the sheer number of sensors and the time required to calibrate the entire sensor suite. The standard technique of sensor calibration -- reduction by least squares [Watson and Drake, 1975; Shimano and Roth, 1977; Nakamura et al, 1988; Uchiyama et al, 1991] -- is very time consuming because it requires a priori knowledge of both inputs and outputs. The outputs are simple, consisting only of the sampled measurements of the vector of raw sense element values (e.g. strain gauge values). The inputs, on the other hand, consist of a redundant spanning set of accurately known forces and torques. It is the careful and accurate application of these forces and torques that is so tedious and time consuming.

Although effective, the Least Squares (LS) approach is cumbersome because of the relatively large number of accurate loads that must be applied to the sensor to reduce noise introduced by measurement error; often 12 to 30 or more are required to produce a reasonably accurate calibration matrix for a 6-degree-of-freedom (DOF) sensor, even though six are theoretically sufficient [Watson and Drake, 1975; Shimano and Roth, 1977]. For example, Lord Corporation allowed up to 60 loads for their user calibration function [Lord Corporation, 1986]. When ATI took over Lord's force/torque sensor business, they removed the user calibration function from the product because of its difficulty.


Despite this burden, there has been little incentive to find alternative calibration methods. The least squares technique is mathematically sound (excepting its explicit assumption of zero-error load vectors, discussed later); it is accurate, given good experiment design and careful attention to implementation details; and, perhaps most important to its widespread use, it is intuitive to casual users. Because the relatively high cost of least squares calibration is incurred only occasionally during the life of any given sensor, few researchers have pursued alternative calibration techniques.

Several researchers have tried to alleviate the burden of the least squares approach by designing specialized calibration fixtures and procedures. Watson and Drake [1975] created a calibration table on which the sensor was mounted. The table had two movable pulleys that could be quickly and accurately positioned so that hanging weights applied accurately known forces and torques for least squares analysis. Uchiyama, et al [1991] created a similar but inverted apparatus. Rather than a table, they built a frame from which the sensor was suspended with an accurately adjustable "moment bar." Weights were suspended from the moment bar to create the forces and torques for least squares analysis. Shimano and Roth [1977] used the robot and gripper and known objects in the workspace as calibration fixtures for in situ calibration. The wrist was reoriented through a fixed set of discrete poses, some with the gripper empty, some while gripping a known object. The accuracy of this procedure is limited by the accuracy of the kinematics of the arm and grip points and the accuracy of the torques the arm can exert, however.

Assuming the calibration experiment has been properly designed, every calibration method benefits from more data in the presence of noise. The problem with the least squares technique, as previously mentioned, is that each new piece of data has a high cost. Each piece of data has two parts: the raw output of the sensor and the carefully applied load that produced that output. While the sensor output is easily collected, carefully applying the load is much more time consuming.

I propose a new approach to force sensor calibration based on "shape and motion decomposition" techniques from computer vision [Tomasi and Kanade, 1991]. Unlike the least squares technique, Shape from Motion Calibration does not require knowledge of all applied loads. Instead, calibration is performed with a very large number of unknown loads and only a few known loads. Imagine calibrating a force sensor by attaching a mass and then just waving it around randomly in a gravity field. The gravitational force of the mass provides the applied load while rotating the sensor in space, if done in a reasonable fashion, ensures spanning the sensing space. Because the loads need not be known a priori, calibration is much quicker, yet the robustness and noise rejection of a large, redundant data set is preserved.


4.2 THE CALIBRATION PROBLEM

A sensor converts an applied load, m, into a measurement vector, z (Figure 4.1). In the case of a force sensor, for example, the applied load is a vector of forces and torques and the measurement is a vector of strain gauge readings. The purpose of the calibration function is to invert this transformation so that, given a measurement vector, z, we can estimate the load which generated it.

For a linear sensor, to which we restrict our attention [Bayo and Stubbe, 1989], the calibration function is a constant matrix that transforms z into m:

Cz = m, \quad \text{or} \quad z^T C^T = m^T    (4.1)

The calibration problem is to recover C, the calibration matrix (not to be confused with the compliance matrix of Uchiyama et al [1991] and Nakamura et al [1988]), in the presence of two types of noise: errors in the applied load vector, m, and measurement noise in the measurement vector, z. Because of these sources of random noise, an accurate calibration cannot be achieved without redundant data; more redundant data results in a better signal to noise ratio for calibration recovery.

Equation (4.1) assumes zero bias. If the sensor has some constant non-zero bias, this must be determined as part of the experimental procedure and subtracted from every reading. Again, careful experiment design, which is not a component of study of this thesis, is required to determine a procedure to accurately extract the bias for any calibration technique based on (4.1).

For force/torque sensors with symmetric internal mass distributions, sensor bias can be determined with a simple procedure in which the z axis is aligned with the gravity vector. The average of measurements in the parallel and anti-parallel orientations yields the bias vector. However, there are some conditions under which the determination of bias is difficult. An extension to the shape from motion technique to include bias determination is explored in Section 4.6.
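For the symmetric-mass case just described, the bias extraction reduces to a single average; a minimal illustrative sketch (assuming the raw readings have already been averaged over time in each orientation):

    import numpy as np

    def estimate_bias(z_parallel, z_antiparallel):
        """Average raw readings taken with the sensor z axis parallel and then anti-parallel
        to gravity; the equal-and-opposite gravity responses cancel, leaving the bias
        (assumes a symmetric internal mass distribution and a linear sensor)."""
        return 0.5 * (np.asarray(z_parallel) + np.asarray(z_antiparallel))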

FIGURE 4.1 THE SENSOR AND CALIBRATION FUNCTION MAPPINGS.


4.2.1 Least Squares Calibration Solution

The standard technique for solving calibration problems is the least squares (LS) or pseudoinverse method. (Least squares and pseudoinverse are equivalent [Shimano and Roth, 1977], [Strang, 1988].) This requires the application of several known loads, m_i, and the measurement of the corresponding sensor vectors, z_i. These data form two matrices that plug into (4.1):

\begin{bmatrix} z_1^T \\ \vdots \\ z_n^T \end{bmatrix} C^T = \begin{bmatrix} m_1^T \\ \vdots \\ m_n^T \end{bmatrix}    (4.2)

from which the calibration matrix can be computed using the pseudoinverse of the measurement matrix, Z:

C^T = Z^{+} M.    (4.3)
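As a concrete reference point, here is a minimal NumPy sketch (not from this work) of the least squares calibration of (4.2)-(4.3); the function name is illustrative and the bias is assumed to have already been removed from the readings.

    import numpy as np

    def least_squares_calibration(Z, M):
        """Classical calibration per (4.2)-(4.3).
        Z: n x p matrix of raw (bias-removed) sensor readings.
        M: n x m matrix of accurately known applied loads.
        Returns C (m x p) such that C @ z estimates the load for a new reading z."""
        C_T, *_ = np.linalg.lstsq(Z, M, rcond=None)   # solves Z C^T = M, i.e. C^T = Z^+ M
        return C_T.T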

Remember there are two types of noise: measurement noise and applied load error. Measurement noise occurs in the transducer, the signal conditioning electronics and the data acquisition system and appears as artifacts in the Z matrix. Applied load error, on the other hand, is the result of the experimenter's inability to precisely apply the load she intends to apply. As such, there is error in the M matrix, too. The effect of measurement noise can easily be reduced by taking many measurements for each known load. Applied load error can only be reduced by carefully applying many different known loads.

The accurate application of all load vectors, m_i, makes the least squares calibration process tedious. The difference between the true applied load and the intended applied load (the "applied load error") must be minimized for an accurate calibration because this error manifests itself directly in the calibration matrix, C. Most statistical analyses of least squares (i.e. [Chatterjee and Hadi, 1988]) assume M contains no errors. This is the same as assuming the applied load error is zero and leads to the result that Z^+ M is an unbiased estimate of C. In fact, applied load errors can lead to a biased estimate of C [Beaton et al, 1976], which leads to inaccuracy in the calibration result even if redundant loads are applied.

This error can only be minimized by exercising extreme care when applying each and every load used in the LS calibration. Thus, incorporating a large number of redundant applied loads in the LS calibration procedure is very expensive (in time) due to this applied load error minimization requirement. In practice, this limits the number of redundant loads. In the Advanced Mechatronics Lab, for a 6-DOF sensor, users typically lose patience after only 6 to 14 redundant loads for a total of 12 to 20 known applied loads (although 30 - 40 would be more desirable).

4.2.2 Shape from Motion Calibration

With the shape from motion calibration approach it is not necessary to know all the applied loads, m_i, but only a constraint which relates them. This feature allows us to apply many redundant load vectors and use the resulting raw measurements to determine calibration without knowing exactly what the loads were that caused them. Literally hundreds of data points can be acquired in the shape from motion calibration procedure in a fraction of the time that a dozen data points in the LS procedure require. A small number of known applied loads are required to establish the desired reference frame, but none of the redundant data requires accurate applied load knowledge for the shape from motion calibration procedure. The ability to economically collect and apply massive amounts of redundant data in shape from motion calibration accounts for its advantage over least squares.

Why is it called shape from motion? The calibration matrix encodes the mechanical structure of the sensor, including the placement of sensing elements and the properties of the material from which it is made. These are what defines the sensor's intrinsic "shape." The motion refers to the movement of the applied load around the sensor. Shape from motion refers to the fact that we can recover the shape of the sensor by knowing the theoretical rank of the shape (the "proper rank" or rank in the absence of noise, defined later) and applying arbitrary motion to the load.

4.2.3 Shape from Motion Derivation

In this section, shape from motion calibration will be derived in an abstract sense. Later, in Section 4.3, it will be applied to specific sensors and sensing spaces. The derivation begins with a representation of the sensor function which maps a load onto a measurement:

z_i^T = m_i^T S,    (4.4)

where z_i^T is a 1 x p measurement vector, m_i^T is a 1 x m load vector, and S is the m x p shape matrix. There are p sense elements and m DOF. Note, from (4.1), that the calibration matrix, C, is easily computed from the shape matrix, S, as



C = (S^T)^{+}.    (4.5)

If we apply n loads and collect the measurements, we can express (4.4) as the matrix equation (similar to (4.2))

Z = MS,    (4.6)

where Z is the n x p matrix of measurements and M is the n x m matrix of applied loads. Note that the shape matrix, S, is unchanged. M is our motion matrix which encodes the applied loads to the sensor.

In traditional calibration techniques (i.e. least squares), both Z and M are known. Z contains the output signals of the sensor while M is constructed from careful external measurements of the applied forces that correspond to each vector in Z. These external measurements generally involve scales, bubble levels, and protractors and can be very time consuming. Our technique eliminates the need to know M a priori by simultaneously determining M and S given only Z. We achieve this by performing a singular value decomposition (SVD) [Klema and Laub, 1980] on Z.

SVD produces the following unique decomposition of any n x p matrix, Z:

Z = U \Sigma V^T    (4.7)

where U is an n x n orthogonal matrix, \Sigma is an n x p "diagonal" matrix (padded with zeroes as needed) of the singular values of Z in descending order, and V is a p x p orthogonal matrix.

Assuming we know that the "proper rank" of Z is r, it can be shown [Strang, 1988] that the best projection of Z onto an r-dimensional space (for r ≤ p) is

Z^* = U^* \Sigma^* V^{*T},    (4.8)

where U^* consists of the first r columns of U, \Sigma^* is a diagonal matrix of the first r singular values, and V^{*T} consists of the first r rows of V^T. (Z^* is not the same as Z^+ in equation (4.3).)

We have assumed Z should have rank r (elaborated on below), but measurement noise provides additional independent information. Equation (4.8) gives us the best possible [Forsythe et al, 1977] rank-r representation of Z in the presence of that noise, so, combining (4.6) and (4.8) yields



Z^* = U^* \Sigma^* V^{*T} = \hat{M} \hat{S},    (4.9)

from which we can estimate M and S:

\hat{M} = U^* (\Sigma^*)^{1/2}, \qquad \hat{S} = (\Sigma^*)^{1/2} V^{*T}    (4.10)

Unfortunately, M̂ and Ŝ are not yet the true motion and shape matrices but are only initial estimates. They are indeterminate by an affine transformation. Given any invertible r x r matrix, A (an affine transform),

\hat{M}\hat{S} = (\hat{M} A^{-1})(A \hat{S}),    (4.11)

so we must find an appropriate matrix, A, such that

M = \hat{M} A^{-1}, \qquad S = A \hat{S}.    (4.12)

We indirectly find A by applying a geometric constraint to the individual vectors of the motion matrix (which we call the "motion constraint" and describe in Section 4.3.1) to solve for A^{-1}. Knowing A^{-1}, finding A is trivial and we solve for S using (4.12) and C using (4.5). Finally, we introduce a few precise measurements (z_i, m_i both known) in order to orient the calibration matrix with respect to the desired reference frame and to scale the result to the desired engineering units. Figure 4.2 illustrates this procedure in flow chart form. (We present examples in Section 4.3 for clarification.) There are strong mathematical similarities between LS and shape from motion, but the essence of shape from motion is that we have replaced the LS requirement of knowing all individual loads with the shape from motion requirement of a geometric constraint on M with no assumptions of smoothness of motion.

4.2.4 Proper Rank

The "proper rank" of the matrix of output vectors is the rank of the matrix in the absence of noise. For a given sensor configuration, we must determine the proper rank before we can apply shape from motion.



We know the rank of the product of two matrices cannot exceed the rank of either individual matrix. (The product is both a subspace of the column space of one and a subspace of the row space of the other.) From equation (4.6), which does not include noise, we know the rank of Z is limited by the rank of the "motion" and "shape" matrices. From these we can deduce the proper rank of Z for any sensor.

We know little about the form of the shape matrix, but we can assume good sensor design will produce maximum rank. If not, the sensor will, in fact, be degenerate. The motion matrix, on the other hand, has a well-defined form. The motion matrix describes the motion of the force vector through Euclidean space. Therefore, the rank will be either 2 or 3 for planar and 3-space sensors, respectively. From this, we know the proper rank of Z.

Note that the rank is deduced from forces only. A 3-space wrist sensor can have up to six degrees-of-freedom, but the force vector remains embedded in 3-space. Torques are a linear combination of forces, so they do not increase the rank of the motion matrix. We will see that this complicates the shape from motion procedure, but does not cause failure.
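In practice, the assumed proper rank can be sanity-checked by inspecting the singular value spectrum of the measurement matrix; a small illustrative sketch (not from this work):

    import numpy as np

    def check_proper_rank(Z, expected_rank):
        """Sanity check: the singular values of Z should drop sharply after the proper rank
        (2 for planar sensors, 3 for 3-space sensors); anything beyond that is noise."""
        s = np.linalg.svd(Z, compute_uv=False)
        gap = s[expected_rank - 1] / max(s[expected_rank], 1e-12)
        print("singular values:", np.round(s, 4))
        print(f"drop after rank {expected_rank}: {gap:.1f}x")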

FIGURE 4.2 FLOW CHART FOR THE SHAPE FROM MOTION CALIBRATION PROCEDURE


4.3 APPLICATIONS OF SHAPE FROM MOTION

4.3.1 Force-Only Sensors

This section considers sensors that measure only pure forces. Torques provide an additional complication that will be addressed in Section 4.3.2.

4.3.1.1 2-DOF Calibration

As an example of the shape from motion calibration procedure, we first consider a 2-DOF subset of a fingertip sensor for a dextrous hand (Figure 4.3) described in Chapter 3 that is similar, in concept, to that of Bicchi and Dario [1988]. Force sensing results from the deformation of a single cantilever beam. If the beam is instrumented with strain gauges, the voltage response, z_j, from the jth gauge shown in Figure 4.4 is

z_j = g_j \beta_j F \cos(\theta - \alpha_j)    (4.13)

where g_j is the electrical gain of the system, \beta_j is the mechanical gain, F is the radial applied force, and the angles are as shown in the figure.

The two gain variables are lumped parameters. The electrical gain, g_j, includes the amplifier gain, excitation voltage, gauge factor, and voltage divider or bridge network. The mechanical gain, \beta_j, includes the structure of the sensing beams (i.e. cantilever beam, maltese cross, etc.), the radius and length of the beam, the modulus of elasticity of the material, the axial placement of the strain gage along the beam, and the alignment of the strain gage axis with respect to the cylinder's axis. Formulae relating these parameters can be found in Bicchi and Dario [1988].

FIGURE 4.3 SINGLE CANTILEVER BEAM-BASED EXTRINSIC TACTILE SENSOR

We limit our consideration to four strain gauges that respond to forces in the plane; thus, the measurement vector for this experiment has four elements and the load vector has two elements (x and y directions). To collect measurements, a constant-magnitude force is applied in random directions in the plane using the setup of Figure 4.5.

Using trigonometric identities, equation (4.13) becomes

z_j = g_j \beta_j F \cos\alpha_j \cos\theta + g_j \beta_j F \sin\alpha_j \sin\theta.    (4.14)

FIGURE 4.4 CROSS SECTION OF THE CANTILEVER BEAM WITH PURE FORCE APPLIED.

FIGURE 4.5 TEST JIG FOR THE 2-D FINGERTIP EXPERIMENT.


Lumping together the constant terms, g_j, \beta_j, F, \cos\alpha_j, and \sin\alpha_j, equation (4.14) can be written in the form of (4.4) for all four gauges at the ith sample:

\begin{bmatrix} z_{i1} & z_{i2} & z_{i3} & z_{i4} \end{bmatrix} = \begin{bmatrix} \cos\theta_i & \sin\theta_i \end{bmatrix} \begin{bmatrix} s_{11} & s_{12} & s_{13} & s_{14} \\ s_{21} & s_{22} & s_{23} & s_{24} \end{bmatrix}    (4.15)

which, over n samples, produces the Z, M, and S matrices of equation (4.6). But Z decomposes into M̂ and Ŝ, so to recover M and S we must find the appropriate motion constraint that yields A to satisfy equation (4.12).

The constant magnitude of the force in equation (4.15) has been arbitrarily set to one unit, leaving only \cos\theta and \sin\theta in the motion matrix. (Correct engineering units will be selected in the final step.) This allows us to use \cos^2\theta + \sin^2\theta = 1, the sum of the squares of the columns of M, as our motion constraint. Furthermore, the rank of the motion matrix is 2, so the proper rank of our n x p measurement matrix, Z, is also 2. To find M, we denote the columns of A^{-1} by a_1 and a_2, and the ith row of M̂ by m̂_i^T.

Summing the squares of the columns of M̂A^{-1} yields

1 = (\hat{m}_i^T a_1)^2 + (\hat{m}_i^T a_2)^2    (4.16)

which, if we denote the elements of A^{-1} by a_{11}, a_{12}, a_{21}, and a_{22}, and the ith row of M̂ by m̂_{i1} and m̂_{i2}, becomes

1 = \hat{m}_{i1}^2 (a_{11}^2 + a_{12}^2) + 2 \hat{m}_{i1} \hat{m}_{i2} (a_{11} a_{21} + a_{12} a_{22}) + \hat{m}_{i2}^2 (a_{21}^2 + a_{22}^2)    (4.17)

We solve for (a_{11}^2 + a_{12}^2), (a_{11}a_{21} + a_{12}a_{22}), and (a_{21}^2 + a_{22}^2) in (4.17) in the least squares sense and then numerically solve for the individual a_{ij} values. (In fact, an analytic solution exists for the a_{ij}'s in both the 2- and 3-DOF cases.) Because there are three equations and four unknowns, one variable is left free. Although any solution is acceptable, forcing A to be upper triangular or symmetric ensures invertibility. (Symmetric is preferred for good conditioning.) Having computed A^{-1}, one can solve for the shape, S, using (4.12) and the calibration matrix, C, using (4.5).

A plot of the motion matrix after extraction is shown in Figure 4.6. This shows every force vector (magnitude and direction) applied during the calibration procedure. This information was unknown during the experiment and was recovered automatically by the Shape from Motion approach.



Unfortunately, the resulting calibration matrix is not oriented in any particular direction. To align it with our desired reference frame, we introduce one precise load (a z, m pair, both vectors known) to rotate and scale it appropriately. To do this, simply apply C to z and use the resulting magnitude and angle to rotate and scale the matrix accordingly:

C_o = \mathrm{Rot}(\phi) \, \frac{\|m\|}{\|Cz\|} \, C,    (4.18)

where C_o is the oriented calibration matrix, \phi is the angular difference between m and Cz, and Rot(\phi) is the 2 x 2 rotation matrix.
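A corresponding sketch of the orientation step (4.18), again illustrative rather than the code used in this work:

    import numpy as np

    def orient_planar_calibration(C, z_known, m_known):
        """Apply (4.18): rotate and scale the recovered calibration matrix so that one
        accurately known (z, m) pair maps correctly onto the desired reference frame."""
        est = C @ z_known
        scale = np.linalg.norm(m_known) / np.linalg.norm(est)
        phi = np.arctan2(m_known[1], m_known[0]) - np.arctan2(est[1], est[0])
        Rot = np.array([[np.cos(phi), -np.sin(phi)],
                        [np.sin(phi),  np.cos(phi)]])
        return Rot @ (scale * C)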

We use Mathematica to perform the SVD and to solve the nonlinear equations to extract A, S, and finally, C. Figure 4.6 shows a polar plot of the recovered motion of one calibration trial. Despite significant noise, the motion displays excellent average circularity (quantitative assessments of precision appear in Section 4.5) and we calibrated this 2-DOF sensor with only one known load.

FIGURE 4.6 RECOVERED MOTION MATRIX FROM PLANAR SHAPE FROM MOTION CALIBRATION EXPERIMENT. NOTE HIGH DEGREE OF CIRCULARITY DESPITE NOISE.

4.3.1.2 3-DOF Force Sensor

An extension of the 2-D fingertip force sensor to 3-D was performed using a standard 6-DOF Lord force/torque sensor with a compact mass mounted on the face plate. In reality there is a small moment arm associated with this arrangement, but we will ignore that for now.

The resulting motion matrix consists of vectors of the form [\cos\theta\sin\psi \;\; \sin\theta\sin\psi \;\; \cos\psi], so M is now rank 3. Likewise, S acquires another row, becoming 3 x p. The derivation of motion and shape is the same as in the 2-DOF case with the necessary modifications to the motion constraint (constant magnitude force application). This relation is easily derived so we will not repeat it here. (It is similar in form to equation (4.17).) Plots of the recovered motion of the applied force vector during an actual calibration trial of the Lord force/torque sensor appear in Figure 4.7 (analogous to the planar plot in Figure 4.6). Although it is difficult to judge by eye, it is an accurate sphere. (Again, quantitative results appear in Section 4.5.) This 3-DOF sensor requires only two known loads to fully scale and orient the matrix.

To aid visualization, the sphere of Figure 4.7 is plotted as a single-image random-dot stereogram in Figure 4.8. To view, hold the image 50 cm away and focus at infinity.

FIGURE 4.7 RECOVERED MOTION OF THE LORD F/T SENSOR.

FIGURE 4.8 SINGLE-IMAGE RANDOM-DOT STEREOGRAM OF RECOVERED MOTION DATA.


4.3.2 Force/Torque Sensors

The previous section showed the application of the shape from motion calibration method to 2-DOF and 3-DOF force-only sensors. This section will develop the shape from motion technique for force/torque sensors. We will see that the introduction of torque measurement introduces a new problem: the moment arm of the applied force is embedded in the shape matrix.

4.3.2.1 2-D Force/Torque Sensor: 2 Forces, 1 Torque

Consider a 2-D force/torque sensor which responds to forces and torques in the plane (3-DOF). Our motion vector has 3 elements: two forces and one torque, so the true shape, S̄, must have 3 rows. However, if we generate the load by applying a force to moment arm, r, then the torque value is linearly related to the force values,

\begin{bmatrix} f_x & f_y & \tau \end{bmatrix} = \begin{bmatrix} f_x & f_y \end{bmatrix} \begin{bmatrix} 1 & 0 & -r_y \\ 0 & 1 & r_x \end{bmatrix}.    (4.19)

Because the torque, \tau, is a linear combination of the first two columns of the motion matrix, it does not increase the rank. Therefore, given a constant moment arm, r, the motion matrix has rank 2 (not rank 3, as we would hope). This means a planar system with three DOF has a proper rank of 2, the same as the 2-DOF force-only case. Post-multiplying equation (4.19) by the true shape, S̄, yields



\bar{M}\bar{S} = M \begin{bmatrix} 1 & 0 & -r_y \\ 0 & 1 & r_x \end{bmatrix} \bar{S} = MS,    (4.20)

where M̄ consists of vectors of the form [f_x f_y \tau] and M consists of vectors of the form [f_x f_y] (identical to the 2-DOF case). The moment arm becomes embedded in the shape matrix, S (rank 2), which we can recover using shape from motion:

S = \begin{bmatrix} I & X^T \end{bmatrix} \bar{S}    (4.21)

where I is the 2 x 2 identity matrix and X is the planar equivalent of the cross product matrix, [-r_y \; r_x]. (This vector produces the magnitude of the cross product of r and any other vector in the x-y plane.) We call the matrix [I \; X^T] the rank squashing matrix (or squashing matrix) because the constant moment arm "squashes" down the proper rank of S.

From the 2-DOF force-only case, we have seen how to recover the M and S matrices of equation (4.20). However, we are interested in recovering the true shape matrix, S̄, which is the same for any load with any moment arm. We cannot simply pseudo-invert the squashing matrix because it has rank of only 2 and so will not completely specify the true shape matrix, which is rank 3. To overcome this, we need to perform shape from motion calibration twice using two different moment arms and combine the results. As we have pointed out before, the two recovered shape matrices are not aligned with the desired reference frame, nor are they aligned with each other. So we must introduce a precise data point to orient each squashed shape matrix to a consistent frame of reference before we can combine them to extract the true shape matrix. This is accomplished using the same technique that culminated in equation (4.18) for the 2-DOF case.

Once the squashed shapes are "co-oriented," we find S̄ from (4.21) by conglomerating the shape and squashing matrices from both trials such that

\begin{bmatrix} S_1 \\ S_2 \end{bmatrix} = \begin{bmatrix} I & X_1^T \\ I & X_2^T \end{bmatrix} \bar{S}    (4.22)

or



\bar{S} = \begin{bmatrix} I & X_1^T \\ I & X_2^T \end{bmatrix}^{+} \begin{bmatrix} S_1 \\ S_2 \end{bmatrix},    (4.23)

where the subscripts 1 and 2 refer to the two calibration trials with different moment arms.

In summary, for the 3-DOF planar force/torque sensor, we run the calibration procedure twice with the same constant force but different moment arms and then combine the results. This provides three independent columns of information so S̄ (which is rank 3) can be recovered.

4.3.2.2 3-D Force/Torque Sensor: 3 Forces and 3 Torques

The 6-DOF force/torque sensor shape from motion calibration procedure follows a similar development to the 3-DOF force/torque procedure. Now our load vector has six elements: three forces and three torques. Again, if we apply a constant magnitude force with a particular moment arm, r, the torques are linear combinations of the forces so the rank of the motion matrix is 3 (not 6).

\begin{bmatrix} f_x & f_y & f_z & \tau_x & \tau_y & \tau_z \end{bmatrix} = \begin{bmatrix} f_x & f_y & f_z \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 & r_z & -r_y \\ 0 & 1 & 0 & -r_z & 0 & r_x \\ 0 & 0 & 1 & r_y & -r_x & 0 \end{bmatrix}    (4.24)

Again, the squashing matrix consists of the identity matrix and the transpose of the cross product matrix, but now these submatrices are represented in 3-space (4.24).

To reconstruct the full 6-DOF shape matrix (which is rank 6), three trials are required with three different moment arms. Three trials are required because it is impossible to choose two squashing matrices that yield rank greater than 5 when combined. We solve for S̄ in a manner similar to (4.23) using co-oriented shape matrices:



\bar{S} = \begin{bmatrix} I & X_1^T \\ I & X_2^T \\ I & X_3^T \end{bmatrix}^{+} \begin{bmatrix} S_1 \\ S_2 \\ S_3 \end{bmatrix},    (4.25)

where [I \; X_i^T] is as it appears in (4.24). Note that a total of 6 precise (z, m) points are required -- two per moment arm -- in order to consistently orient the S_i's.
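The 3-D case differs from the planar sketch above only in the squashing matrices; a brief illustrative sketch (again assuming co-oriented shapes):

    import numpy as np

    def skew(r):
        rx, ry, rz = r
        return np.array([[0.0, -rz,  ry],
                         [ rz, 0.0, -rx],
                         [-ry,  rx, 0.0]])                 # cross-product matrix of r

    def combine_3d_trials(shapes, arms):
        """Per (4.24)-(4.25): each squashing matrix is [I  X_i^T] with X_i the cross-product
        matrix of moment arm r_i; stack three co-oriented trials and pseudo-invert to
        recover the rank-6 shape matrix."""
        G = np.vstack([np.hstack([np.eye(3), skew(r).T]) for r in arms])   # 9 x 6
        return np.linalg.pinv(G) @ np.vstack(shapes)                        # 6 x p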

4.4 SIMULATION STUDIES

I conducted dozens of simulations comparing least squares to shape from motion to verify the performance of the technique. For these simulations, I assumed a decoupled sensor with known calibration matrix:

C = \begin{bmatrix} 0.03 & 0 & 0 & 0 & -0.03 & 0 & 0 & 0 \\ 0 & 0 & 0.03 & 0 & 0 & 0 & -0.03 & 0 \\ 0 & 0.02 & 0 & 0.02 & 0 & 0.02 & 0 & 0.02 \end{bmatrix}

Simulated data was generated with various levels of independent, uniformly distributed noise in the sensor outputs and in the components of the load vector input. When both techniques were presented with the same large data sets of 210 loads (least squares using both load inputs and raw outputs, shape from motion using only the outputs), they performed nearly equivalently with a slight advantage to shape from motion. We expect the performance to be nearly the same under identical conditions because the underlying mathematics are closely related (both compute a least squares fit). Still, shape from motion prevails because it is unaffected by the noise appearing in the input matrix (applied load error).

However, we ran the greatest number of simulations around an operating point similar to the real sensor experiments: 12 load vectors for least squares and 210 load vectors for shape from motion. Under these conditions, shape from motion demonstrated a clear advantage. With peak sensor noise at 1 percent of peak sensor output and an applied load error ball with a peak radius of 3 degrees, the variation in resolved force magnitude from the two calibration procedures is displayed in Figure 4.9. These plots are analogous to those with real data in Figure 4.11. The first line of Table 4.1 lists the mean and standard deviation data analogous to Table 2. The extra column for "load error" is the peak error for each component of the load vector used in the least squares technique.

These data were generated by simulating 110 raw sensor vectors with no noise and then applying the calibration matrices resulting from each calibration method to the raw vectors. Like the simulated data used to generate the calibration matrices and the real data of Section 4.5, this data was generated by moving the force vector along the longitudinal lines of an imaginary sphere. Although these data are from only two simulation trials, they are representative of all simulation trials we ran.
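For readers who want to reproduce a comparable simulation, the following sketch generates loads along the longitudinal lines of a sphere and the corresponding noisy readings; the parameters and function name are illustrative, not the exact values used in this work.

    import numpy as np

    def simulate_readings(S, n_lines=10, pts_per_line=11, force_mag=1.0, noise=0.01, seed=0):
        """Sweep a constant-magnitude force along the longitudinal lines of an imaginary
        sphere, form noise-free readings per (4.6), and add uniform sensor noise scaled to
        a fraction of peak output.  S is an assumed 3 x p shape matrix."""
        rng = np.random.default_rng(seed)
        dirs = []
        for lon in np.linspace(0.0, 2.0 * np.pi, n_lines, endpoint=False):
            for lat in np.linspace(0.0, np.pi, pts_per_line):
                dirs.append([np.cos(lon) * np.sin(lat),
                             np.sin(lon) * np.sin(lat),
                             np.cos(lat)])
        M = force_mag * np.array(dirs)                      # applied loads (motion matrix)
        Z = M @ S                                           # noise-free readings, Z = M S
        Z = Z + noise * np.abs(Z).max() * rng.uniform(-1.0, 1.0, Z.shape)
        return M, Z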

TABLE 4.1 SIMULATION COMPARISON

                                   Least Squares                  Shape from Motion
  load mag. (N)  load error (N)    mean (N)   error (%)  std dev  mean (N)   error (%)  std dev
  1.0            .017              .9971      .29        .0071    1.00001    .001       .0013
  1.0            0.0               1.0003     .03        .0041    .9998      .02        .0016

FIGURE 4.9 SIMULATED COMPARISON OF SHAPE FROM MOTION (TOP) AND LEAST SQUARES (BOTTOM) AS ORIENTATION CHANGES.

The second line of Table 4.1 shows a simulation run in which no errors were injected into applied loads for the least squares procedure. The error in the measurements stayed the same at 1 percent. In this case, shape from motion is still superior because of the large number of data points used (210 compared to 12). More data reduces the effects of noise in the measurements, as expected.

4.5 EXPERIMENTAL COMPARISON TO LEAST SQUARES

The point of any sensor calibration is to provide an accurate characterization of the sensor. But desired accuracy is almost always traded off with the time required to perform the calibration procedure. Usually, more accurate calibration measurements will produce a more accurate calibration, but they will take more time. Our motion and shape approach is exceptional in that it allows the collection of much more data in less time as compared to the least squares approach. This produces a more accurate and precise calibration, for a given accuracy in calibration measurement.

The reason it takes less time is obvious -- fewer accurately applied loads are required. For a given accuracy of each measurement, fewer of them results in less time, despite the fact that we collect many more data points.

The reasons for greater accuracy are threefold. First of all, more data can be gathered in less time. Second, the applied loads are unknown, so there is no applied load error. Finally, the least squares approach takes all the data available and attempts to minimize the total error. It does this at the expense of the shape because, from equation (4.3), C is the only free variable. As such, both the measurement errors and the applied load errors distort the shape. On the other hand, the shape from motion approach has two free variables: the shape and the motion. Errors are distributed between the two because shape from motion relies on a known physical constraint of the motion and minimizes error with respect to that constraint to maintain the shape. It is important to realize that this is a true constraint based on geometry, not a constraint based on some sensor model that may be wrong. Greater accuracy results from this hard physical constraint.

To demonstrate these assertions, we performed both shape from motion and least squares calibrations on our 2-DOF fingertip sensor and a 6-DOF Lord force/torque sensor. In both cases, the least squares method was aided by averaging each force measurement over time to minimize noise.

4.5.1 Fingertip Results

For the shape from motion approach, we chose a mass in the linear region of the sensor, near the upper extreme of what we consider “reasonable,” as defined by the expected forces resulting from the tasks for which the sensor was designed. We picked the upper extreme to improve signal-to-noise, but we stayed within the “reasonable range” to reduce any effects of minor nonlinearities.


This is the extent to which we considered experiment design because we assumed the sensor is highly linear. Subsequent testing showed this assumption is valid. The chosen test mass weighed 111.4 grams, and the calibration procedure took only 1.5 minutes while collecting a total of 96 unknown data points and 1 known load.

For the least squares approach, we chose five different masses, each of which was applied at 40-degree increments around the circle for a total of 45 accurate loads. For each load we gathered several measurements and averaged them to reduce the effects of measurement noise. We chose a large number of loads to reduce the effects of applied load error. The same criteria were used in selecting the calibration masses, which were: 60.4, 78.3, 95.2, 111.4, and 122.8 grams. The procedure consisted of adjusting the angle of the rotating stage to within 0.2 degrees, sequentially applying the five masses, and then incrementing the angle. The entire procedure took 17 minutes to collect all 45 known data points (including redundancy at each data point).

Because of the nearly exhaustive nature of the least squares calibration, the time difference is more than 10:1. But even cutting the procedure down to 15 measurements results in nearly a 4:1 time advantage for shape from motion, not to mention the loss in accuracy for least squares due to lesser noise rejection. The large number of least squares measurements gathered should produce nearly the best possible least squares calibration.

To assess the calibration results quantitatively, we hung each of the five masses used for the least squares calibration on the string and gathered raw data as the sensor was rotated 360 degrees. We then calculated the force magnitude at each sample point and performed two calculations. The first determined the average magnitude to assess absolute accuracy. The second smoothed the magnitude signal to eliminate noise and determined the standard deviation of the smoothed signal. This assesses the precision of the measurement across all orientations. The best and worst of these data are tabulated in Table 4.2.

TABLE 4.2 FINGERTIP SENSOR COMPARISON

                  Least Squares                                  Shape from Motion
load     mean     error    smoothed    time          mean     error    smoothed    time
(g)      (g)      (%)      std dev     (min)         (g)      (%)      std dev     (min)
60.4     60.1     0.5      1.20        17            60.5     0.2      1.13        1.5
111.4    110.2    1.1      0.83                      111.6    0.2      0.60


4.5.2 Lord F/T Results

For the 6-DOF sensor we built a special calibration fixture (Figure 4.10) to help us make quick, accurate measurements. It consists of a 3-inch aluminum cube (see exact dimensions in Table 4.3) and two 1-inch diameter brass bars. The cube has one threaded hole in the center of each face into which the bars can screw.

FIGURE 4.10 CALIBRATION FIXTURE FOR 6-AXIS FORCE/TORQUE SENSORS


This provides 16 unique force/torque combinations that can be quickly and precisely selected during calibration. The simple design is easy to machine, yet it dramatically reduces the time required for the least squares approach because the flat faces provide an accurate reference surface for leveling the device in various orientations. It is quite convenient for the shape from motion approach as well.

Despite using the calibration fixture and collecting only 13 load vectors (force magnitudes of 12.09 N, 22.9 N, and 33.71 N with different moments), the least squares approach required 69 minutes to collect data and compute moment vectors. (Without the calibration fixture, using our own version of the “torque bar” as in Uchiyama, et al, 1991, took even longer.) This compares to only 34 minutes for shape from motion to collect up to 500 measurement vectors using the single 33.71 N weight and compute the moment vectors. Although the time difference is not as dramatic as in the 2-DOF case, it is significant and grows as the number of least squares force vectors increases. Also, these times are aided by the use of the calibration fixture, which makes applying precise loads easier, primarily benefitting least squares.

Gathering the raw data for the shape from motion trials involved sampling the sensor at 2 Hz while it slowly moved up and down the longitudinal lines of an imaginary sphere. Each trial varied slightly in length so the total volume of data was different each time, but the raw data collection time was a small percentage (10-15%) of the total. The majority of time was consumed in gathering the precise loads used to orient the calibration matrix with the desired reference frame. This was done to the same precision as the least squares loads.

To compare precision, we performed the same experiments as in the previous subsection but randomly moved the sensor in 3-D rather than just in the plane.

TABLE 4.3 DIMENSIONS OF CALIBRATION FIXTURE ELEMENTS

                               Mass (kg)     Length (in)
Cube                           1.18673       3.005
Cube with Mounting Plate       1.23285       3.005
Bar A                          1.10150       10.010
Bar B                          1.10240       10.008


The extremes of these data are tabulated in Table 4.4. Again, shape from motion accuracy (mean) is as good or better, while its imprecision (standard deviation) is always lower.

Figure 4.11 shows the magnitude of the linear forces of the one-bar experiment from Table 4.4. Ideally, this should be a straight line with a value of 25.24 N including internal mass. We chose to illustrate this data set not because it’s the best (it’s not), but because the mass was different from that used during the shape from motion approach but the same as in four of the least squares vectors.

TABLE 4.4 LORD F/T SENSOR COMPARISON

                  Least Squares                                  Shape from Motion
load     mean     error    smoothed    time          mean     error    smoothed    time
(N)      (N)      (%)      std dev     (min)         (N)      (%)      std dev     (min)
9.26     9.09     1.8      .084        69            9.09     1.8      .058        34
25.24    25.53    1.7      .157                      25.41    0.7      .062

FIGURE 4.11 COMPARISON OF SHAPE FROM MOTION (TOP) AND LEAST SQUARES (BOTTOM) AS ORIENTATION CHANGES. [Force magnitude (Newtons, approximately 24.5 to 26.5) versus time (seconds).]


This avoids giving the shape from motion approach an unfair advantage. The two plots result from applying the calibration matrices from both techniques to the same batch of data. It is clear that the shape from motion plot has less variation, which indicates greater precision. This is confirmed by the smoothed standard deviation measurements in Table 4.4.

To verify the absolute accuracy of both techniques, we assembled the calibration fixture in two configurations that were not used in either calibration procedure and gathered data in ten different orientations. The accuracy of each applied load in orientation and force is 1 milliradian and 0.02 Newtons, respectively. Table 4.5 shows average errors in force magnitude and direction across all ten trials. Figure 4.12 shows the angular error of both techniques for all ten trials.

TABLE 4.5 AVERAGE LORD FORCE ERRORS

method                 magnitude error (N)     angle error (rad)
Shape from Motion      0.080                   0.0025
Least Squares          0.144                   0.0046

FIGURE 4.12 ANGULAR ERROR OF LEAST SQUARES (SOLID) AND SHAPE FROM MOTION (DOTTED) AT SPECIFIC KNOWN LOADS. [Angle error (radians, order 10^-3) versus sample number, 1 through 10.]


4.6 SENSOR BIAS

Sensor bias consists of a constant vector of offsets that is added to each and every sample. From equation (4.1) we see that both the least squares method and shape from motion, as described, assume zero bias.

In practice, this does not cause a problem because sensor bias is usually easy to determine experimentally. For a robotic wrist force/torque sensor, it is usually sufficient to take readings with the sensor pointing straight up and straight down and average them (i.e. [Shimano and Roth, 1977]).
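For concreteness, a minimal sketch of that bias estimate (NumPy; the function name and array layout are our own illustrative assumptions, not part of the published procedure):

```python
import numpy as np

def estimate_bias(readings_up, readings_down):
    """Estimate the constant bias vector from two opposing poses.

    readings_up, readings_down: (k, n) arrays of raw sensor readings taken
    with the sensor axis pointing straight up and straight down.  The
    gravity-induced load reverses sign between the two poses, so averaging
    the two mean readings cancels it and leaves only the bias.
    """
    return 0.5 * (readings_up.mean(axis=0) + readings_down.mean(axis=0))
```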

In certain situations, for example an asymmetrical mass distribution (a gripper) that is inconvenient to remove, it would be convenient to determine the sensor bias and the calibration matrix simultaneously. We show how LS can be augmented to include bias identification, and then present such an extension to the shape from motion technique that is similar to Tomasi and Kanade’s original computer vision derivation [1991].

4.6.1 Bias and Least Squares

Rewriting equation (4.1) in terms of the sensor function of Figure 4.1 yields:

\[ z = C^{+} m + z_0 \qquad\qquad (4.26) \]

where z0 is the constant bias vector and C+ is the pseudoinverse of C. For every applied force, m, the sensor response, z, includes the additive offset, z0. By augmenting C+ and m, we can write (4.26) as a strictly linear function:

\[ z = \begin{bmatrix} C^{+} & z_0 \end{bmatrix} \begin{bmatrix} m \\ 1 \end{bmatrix} \qquad\qquad (4.27) \]

This requires two LS (pseudoinverse) procedures, however. The first extracts C+ and z0 while the second finds C.
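A minimal sketch of this two-stage pseudoinverse procedure under our own naming assumptions (known loads stacked as rows of M, raw readings as rows of Z):

```python
import numpy as np

def calibrate_with_bias(M, Z):
    """Least squares calibration with simultaneous bias identification (eq. 4.27).

    M: (k, 3) applied load vectors, one per row.
    Z: (k, n) raw sensor readings, one per row.
    Each reading satisfies z = C+ m + z0, so regressing Z on [m, 1] recovers
    the augmented matrix [C+ | z0]; a second pseudoinverse then yields C.
    """
    M_aug = np.hstack([M, np.ones((M.shape[0], 1))])   # (k, 4)
    X, *_ = np.linalg.lstsq(M_aug, Z, rcond=None)       # (4, n)
    C_plus = X[:3, :].T                                 # (n, 3): the augmented part C+
    z0 = X[3, :]                                        # (n,)  bias vector
    C = np.linalg.pinv(C_plus)                          # (3, n) calibration matrix
    return C, z0
```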

4.6.2 Bias and Shape from Motion

The idea of adding bias extraction to the shape from motion technique is analogous to the least squares case. For the single strain gage as in equation (4.14), adding a non-zero offset becomes:



\[ z_{ij} = F s_{1j}\cos\theta_i\sin\psi_i + F s_{2j}\sin\theta_i\sin\psi_i + F s_{3j}\cos\psi_i + z_{0j} \qquad\qquad (4.28) \]

which can be rewritten in accordance with equations (4.15) and (4.27):

\[
\begin{bmatrix} z_{i1} & z_{i2} & \cdots & z_{i8} \end{bmatrix}
=
\begin{bmatrix} \cos\theta_i\sin\psi_i & \sin\theta_i\sin\psi_i & \cos\psi_i & 1 \end{bmatrix}
\begin{bmatrix}
s_{11} & s_{12} & \cdots & s_{18} \\
s_{21} & s_{22} & \cdots & s_{28} \\
s_{31} & s_{32} & \cdots & s_{38} \\
z_{01} & z_{02} & \cdots & z_{08}
\end{bmatrix}
\qquad (4.29)
\]

The effect of this is to increase the rank of the motion matrix by one and subsequently to increase the “proper” rank of Z by one. It is true that, except for specific pathological cases (such as all offsets equal to zero), increasing the columns of any rank deficient matrix by constant offsets will increase the rank of the matrix by exactly one.

Here we have the first caveat: Z must be rank deficient. In other words, there must be a minimum of n+1 sense elements for an n-space sensor. In general, this does not present a problem because the critical n is not the degrees-of-freedom of the sensor but the dimensionality of the pure force vector. For example, the three-beam “modified maltese cross” commercial products from ATI [Little, 1992], which have only 6 sense elements for a 6-DOF sensor, will work fine because the dimensionality of the force vector is 3 (not 6). In fact, any force/torque sensor will have the required number of sense elements. Only pure force sensors can potentially fall short of this requirement.

Armed with the new proper rank of Z, the shape from motion procedure progresses in similar fashion as reported in Section 4.3 until the determination of A^-1 from the motion constraint. Clearly, the motion constraint has changed, but the shape of A^-1 has also changed; it is now 4 x 4.

Using the same nomenclature for A^-1 and M̂ as in Section 4.3, we find we have two decoupled constraints:

\[ (m_i^T a_1)^2 + (m_i^T a_2)^2 + (m_i^T a_3)^2 = 1, \qquad m_i^T a_4 = 1 \qquad\qquad (4.30) \]

The second constraint is just the description of a plane and is easy to solve. The first constraint is trickier, though, and can be rewritten as



\[ m^T B^T B\, m = 1 \qquad\qquad (4.31) \]

where B^T is a 4 x 3 submatrix consisting of the first three columns of A^-1 and m^T is a row of M̂. This is a difficult nonlinear problem to solve that involves the fitting of data to a cylinder. We solve this problem by successively refining the solution numerically.

First, we note the similarity between (4.31) and the equation for an ellipse:

\[ m^T Q\, m = 1 \qquad\qquad (4.32) \]

where Q is symmetric. Intuitively, an ellipse should give us a good estimate for the cylinder, and we can use it as an initial starting point for the numerical refinement of the cylindrical fit.

Finding Q is a linear problem that we must solve in the least squares sense over all the rows of the matrix M̂. Given Q, we decompose it with SVD and set the smallest singular value to zero. This gives us a 4 x 3 matrix representing a cylinder that best fits the ellipse. Using this as a starting point, we numerically improve the solution using gradient descent over the mean squared error.
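The sketch below follows that recipe under our own assumptions (NumPy; M_hat holds the recovered augmented motion rows, and the iteration count and step size are illustrative): a linear least squares fit of the symmetric Q of equation (4.32), an SVD truncation to rank three as the cylinder's starting point, and plain gradient descent on the mean squared constraint error.

```python
import numpy as np

def fit_cylinder(M_hat, iters=2000, lr=1e-3):
    """Fit the rank-deficient quadric m^T B^T B m = 1 (eq. 4.31) to rows of M_hat.

    M_hat: (k, 4) recovered augmented motion rows.
    Returns B (3 x 4) such that B^T B is the 4 x 4 rank-3 cylinder matrix.
    """
    k = M_hat.shape[0]
    # Step 1: linear LS fit of a symmetric Q (the ellipse of eq. 4.32).
    idx = [(i, j) for i in range(4) for j in range(i, 4)]
    W = np.array([[m[i] * m[j] if i == j else 2.0 * m[i] * m[j]
                   for (i, j) in idx] for m in M_hat])      # (k, 10) regressors
    q, *_ = np.linalg.lstsq(W, np.ones(k), rcond=None)
    Q = np.zeros((4, 4))
    for (i, j), v in zip(idx, q):
        Q[i, j] = Q[j, i] = v
    # Step 2: zero the smallest singular value to force rank 3 (a cylinder).
    U, s, Vt = np.linalg.svd(Q)
    B = np.sqrt(np.clip(s[:3], 0.0, None))[:, None] * Vt[:3, :]   # (3, 4) start point
    # Step 3: refine B by gradient descent on the mean squared constraint error.
    for _ in range(iters):
        r = np.einsum('ki,ij,kj->k', M_hat, B.T @ B, M_hat) - 1.0  # residuals
        grad = (4.0 / k) * (B @ ((M_hat.T * r) @ M_hat))           # d(mean r^2)/dB
        B -= lr * grad
    return B
```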

Concatenating the solution from the plane constraint to the solution from the cylindrical constraint yields a 4 x 4 matrix, A^-1, that we can verify is invertible and well conditioned. With equation (4.12), we can reconstruct the augmented motion and shape matrices from equation (4.29) and strip off the offset vector from the shape matrix before taking the pseudoinverse that yields the calibration matrix.

4.7 GREATER AUTONOMY: COLLABORATIVE CALIBRATION

The shape from motion approach to sensor calibration provides unprecedented autonomy while maintaining the rigor and accuracy of the least squares technique. However, it is not fully autonomous because it does require the attachment of a constant mass distribution and the application of a small set of known forces and torques for reference. What can be done when the known references are eliminated is discussed in the next chapter on Shape from Motion Primordial Learning. Eliminating the fixed mass to provide greater autonomy is the subject of this section.

Imagine now that instead of a constant mass in a gravity field, there are two sensors pressing against each other, such as on the fingers of a dextrous hand.



The reason for the constant mass stemmed from equation (4.15), in which we assumed the magnitude of the force was unity. The important thing, however, is not that the magnitude be constant, but that there exists a constraint for the solution of the least squares problem of equation (4.17).

The constant magnitude adds a constraint to the solution that is required to make the problem solvable. It manifests itself in equation (4.16), where the left-hand side is always equal to a constant (assumed to be 1 unit). To solve (4.16) it is not necessary that the right-hand side be constant, only that it be constrained. Given two force sensors pressing against each other, the two magnitudes are constrained to be equal to each other. This equality constraint can be used to replace the constant equality constraint of equation (4.16):

\[ (m_{1i}^T a_{11})^2 + (m_{1i}^T a_{12})^2 + (m_{1i}^T a_{13})^2 = (m_{2i}^T a_{21})^2 + (m_{2i}^T a_{22})^2 + (m_{2i}^T a_{23})^2 \qquad (4.33) \]

The vectors m1i and m2i result from two separate but concurrent executions of the shape from motion procedure outlined in Section 4.2.3, resulting in:

\[ M_1 = \hat{M}_1 A_1^{-1}, \quad S_1 = A_1 \hat{S}_1, \qquad M_2 = \hat{M}_2 A_2^{-1}, \quad S_2 = A_2 \hat{S}_2 \qquad\qquad (4.34) \]

The bilinear problem of equation (4.33) can be solved iteratively by guessing a solution to the right-hand side, holding it fixed, then solving for the left-hand side. Next, using the solution just found, hold the left-hand side constant and solve for the right-hand side, and so on. Because zeroes on both sides is the only exact solution, one must take care to re-normalize the results periodically to prevent relaxing to the trivial answer.
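A minimal sketch of that alternation, again under our own naming assumptions (M1 and M2 are the two recovered motion matrices; each side's symmetric quadric is re-fit linearly while the other side is held fixed, and both sides are rescaled every pass so the iteration cannot relax to the trivial zero solution):

```python
import numpy as np

def _fit_quadric(M, targets):
    """Least squares fit of a symmetric 3x3 Q with m_i^T Q m_i = targets[i]."""
    idx = [(i, j) for i in range(3) for j in range(i, 3)]
    W = np.array([[m[i] * m[j] if i == j else 2.0 * m[i] * m[j]
                   for (i, j) in idx] for m in M])
    q, *_ = np.linalg.lstsq(W, targets, rcond=None)
    Q = np.zeros((3, 3))
    for (i, j), v in zip(idx, q):
        Q[i, j] = Q[j, i] = v
    return Q

def collaborative_constraint(M1, M2, iters=50):
    """Alternating solution of the bilinear constraint of equation (4.33).

    M1, M2: (k, 3) motion rows recovered for the two sensors.
    Returns the two symmetric quadrics whose factorizations give the
    column vectors a_1j and a_2j appearing in equation (4.33).
    """
    rhs = np.ones(M1.shape[0])            # initial guess: unit magnitudes
    Q1 = Q2 = np.eye(3)
    for _ in range(iters):
        Q1 = _fit_quadric(M1, rhs)        # solve left side with right side fixed
        lhs = np.einsum('ki,ij,kj->k', M1, Q1, M1)
        Q2 = _fit_quadric(M2, lhs)        # then solve right side with left fixed
        rhs = np.einsum('ki,ij,kj->k', M2, Q2, M2)
        scale = rhs.mean()                # periodic re-normalization guards
        Q1, Q2, rhs = Q1 / scale, Q2 / scale, rhs / scale   # against the trivial answer
    return Q1, Q2
```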

4.7.1 Collaborative Calibration Results

To test this new result in shape from motion calibration, we physically attached two force sensors together with a universal joint. One sensor was firmly affixed to a stationary robot while the other was tugged around manually to exert a wide range of force magnitudes that sufficiently excited the sensing spaces of both sensors.



(Sufficiently exciting the sensors is an issue of design of experiments and will not be discussed here. As a simple example, if the applied vector is constrained to a plane for a 3-D sensor, the result will be poor. Refer to any standard text on the subject, such as Diamond [1981].)

The raw output data from both sensors was sampled synchronously at 10 Hz. After the processing, the extracted motion matrices are plotted in Figure 4.13. Obviously, the applied force was highly random, but segments of the motion through space are easy to pick out. After applying two known forces to scale and orient the force sensor reference frame, analysis of the accuracy of the calibration matrix as performed in Section 4.5 produced similar results, as expected.

The benefits of this extension to the basic shape from motion technique are twofold. First, it allows fully autonomous calibration of sensors when an external reference frame is not pre-specified, hence there is no need for applying any known forces. (These cases will be explored in the next chapter.)

FIGURE 4.13 MOTION OF THE FORCE VECTOR AROUND BOTH SENSORS DURING COLLABORATIVE CALIBRATION.


Second, and more important from a theoretical standpoint, it allows calibration across a wide range of force magnitudes. This is an important aspect of the design of experiments because it allows the best linear fit for sensors that have slight nonlinearities.

4.8 CONCLUSIONS

We have described an improved calibration technique for linear force/torque sensors based on shape from motion decomposition that provides a slightly more accurate calibration matrix with much less effort from the user. The intrinsic “shape” of the sensor is extracted by randomly moving the calibration force through a spanning range of motion. Both the motion and the pseudoinverse of the calibration matrix are simultaneously recovered from the raw sensor values.

The shape from motion approach yields a calibration matrix that is at least as precise and accurate as that of the corresponding least squares technique, which takes several times as long, because more data can be collected with fewer error sources. It has been successfully applied to force-only and force/torque sensors from 2 to 6 degrees of freedom with a single batch file written for Mathematica. While it is possible to achieve equivalent precision and accuracy using least squares, the additional time required can be prohibitive for sensor manufacturers or researchers who must calibrate sensors often.

Although we determined bias vectors as part of the experimental procedure, we show that it is possible to augment the technique to determine these automatically as well, provided there are at least n+1 sensing elements for a sensor in n-space (not to be confused with n-DOF).

Of course, this technique has limitations. It linearizes the response around a single force magnitude, whereas the least squares technique finds the best linear representation across a range of force magnitudes. In practice, we have found our force sensors have sufficient linearity that shape from motion still prevails. However, this raises the important issue of design of experiments. We briefly discussed our rationale for selection of applied loads in Section 4.5, but we leave the important topic of experiment design to the classical textbooks in the field (e.g. [Diamond, 1981]).

Another potential experimental flaw is induced by gathering the raw measurements while the sensor is in motion. This permits the quickest data collection with very dense coverage of the sensing space, but is subject to dynamic effects. Care must be taken to move the sensor with minimal acceleration so measurements are quasistatic. Of course, static “move-stop-measure” data collection works, too, and remains faster than least squares.


In closing, we note that eliminating the final step, orienting the shape with respect to an external reference frame, still provides useful information about the sensor. Because known loads are not required to derive this, it is possible to use shape from motion for primordial learning in autonomous agents [Voyles et al, 1995].


Chapter 5

Shape from Motion Primordial Learning

Sensor Calibration as a Form of Learning

5.1 FROM SENSOR CALIBRATION TO SENSORIMOTOR PRIMITIVE ACQUISITION

The previous chapter introduced Shape from Motion Calibration as a rigorous and reliable method for calibrating sensors in much less time than the traditional least squares technique. But the real beauty of the method is that it looks at calibration in a fundamentally new and unique way. Prior approaches to calibration assume an input reference for each and every output measurement. The goal of these approaches is to determine the mapping between two sets of known measurements. The Shape from Motion technique does not do this. Shape from Motion extracts the intrinsic structure of a single raw data set without knowing the correct answer beforehand.

For the calibration experiments in the previous chapter, it was necessary to apply a few known loads to scale the calibration matrix to the desired engineering units and orient it with respect to the desired reference frame. It is important to note that this step was taken after the calibration matrix had been fully determined. The intrinsic structure of the data is not dependent on the scale factor or orientation. Useful information is “learned” without any known loads being applied.

The focus of this chapter is shape from motion as a learning paradigm. We will examine how the extraction of sensorimotor primitives can be cast within the shape from motion framework. These primitives will be used to populate the “skill base” used for interpretation of a human demonstration and subsequent robotic execution.


5.1.1 The Gesture-Based Programming Skill Base

For the Gesture-Based Programming system, the origin of this skill base is not particularly important. It is only important that the primitives exist and that they can be related to observations of the demonstration. An additional desirable property is that they can be related to each other. This implies the existence of a similarity measure and a mechanism by which primitives can be combined or transformed to produce new primitives.

There are three basic mechanisms by which these primitives can come into existence. The first is “clairvoyance.” The user (or automatic planner), acting as a god, hand-codes a primitive that he/she knows is useful and composable. An example of this is a guarded move. The second mechanism is through assembly of lower-level primitives. Using the calculus analogy, algebraic primitives may form the basis for calculus, but arithmetic primitives form the basis for algebra. Finally, there is learning.

In this chapter, we present a learning approach for extracting sensorimotor primitives from teleoperated manipulations that is a natural extension of Shape from Motion Calibration. Calibration is, after all, a form of learning. One wants to “learn” the transformation from the sensor’s input space to the output space. Shape from motion capitalizes on the fact that the structure of this transformation is latent in the raw data. Likewise, the structure of human manipulative skills is latent in the data obtainable during teleoperation. Extracting this structure is equivalent to extracting the skill.

Shape from Motion is based on finding the eigenspace representation of the input/output mapping. It uses singular value decomposition to derive principal components. In the case of calibration, the principal components analysis (PCA) is constraint-directed, so model-based information can be incorporated into the learning process. Because the underlying representation (eigenvectors) is linear, it is limited to learning linear, or at least piece-wise linear, interactions. This is a drawback because it complicates the segmentation of the observations into linear, learnable subtasks. The advantage is that the representation provides not only a mechanism to learn the primitives, but natural mechanisms for identifying the primitives in subsequent observations and combining and transforming existing primitives into new primitives. These are abilities many other learning approaches have difficulty demonstrating.

5.2 PRIOR WORK

We are interested in developing a primitive representation which supports acquisition, integration, and transformation of sensorimotor primitives. Previous work in learning robot skills is relevant to this effort because it provides potential primitive acquisition techniques and because primitives learned by techniques other than primordial learning may also be used in the execution phase of our GBP system.


The idea is to relate primitives to recurring subgoals of complex tasks and to reuse primitives in the construction of complex task strategies, whatever the origin of the primitives may be.

Speeter [1991] developed kinematic primitives for the control of the Utah/MIT Dextrous Hand. These primitives encapsulated useful, coordinated finger trajectories which were used to construct more complex task strategies, but they did not employ any type of feedback. His primitive basis set was developed empirically based on experience in coding applications, and the actual primitives were coded by hand. Michelman and Allen [1994] also developed complex task strategies from task primitives. Paetsch and von Wichert [1993] have developed a set of behaviors which can be combined to perform a peg insertion with a multi-fingered hand. These primitives focused more on kinematics and the coordination of many degrees of freedom. Our focus is more on relating sensor feedback signals to the motor commands.

Yang et al [1994] have applied hidden Markov models (HMM) to skill modeling and learning from telerobotics. The skills learned are manipulator positioning tasks without force or vision feedback. Pook and Ballard [1993] have used a combination of HMMs and Learning Vector Quantization to recognize manipulations in a teleoperated sequence using a PUMA robot and a Utah/MIT dextrous hand. They do not actually learn the motion primitives and they require, a priori, an HMM of the known task to segment the manipulations, but they have successfully recognized the compound manipulations for removing an egg from a pan with a spatula.

Some researchers are applying supervised learning approaches to recover strategies for contact tasks like deburring. Liu and Asada [1992] use a neural network to recover an associative mapping representing the human skill in performing a deburring task. The task is performed using a direct-drive robot with low friction and a force sensor integrated with the workpiece. This allows the human to perform the task during the training phase with little interference from the robot while much relevant information is measured by the robot sensors. Shimokura and Liu [1994] extend this approach with burr measurement information.

Along these lines of skill acquisition by supervised learning is ALVINN [Pomerleau, 1992], an artificial neural network for vehicle steering. ALVINN has demonstrated robust mastery over a wide range of vehicles and road types by observing a human drive the vehicle on the target road type. For the purpose of comparison, Hancock and Thorpe [1995] developed ELVIS, a PCA-based vehicle steerer to operate on the same vehicles. The success of ELVIS, although not quite equivalent to ALVINN, provided additional impetus to complete this work on Shape from Motion Primordial Learning.


Several researchers have applied reinforcement learning methods to the recovery of a peg insertion skill. Simons et al [1982] learn how to interpret a force feedback vector to generate corrective actions for a peg insertion which has an aligned insertion axis. The output motion commands are restricted to the plane normal to the insertion axis. Changes in force are used to reinforce (penalize) the situation/action pair. Gullapalli et al [1992] learned close tolerance peg insertion using a neural network and a critic function consisting of the task error plus a penalty term for excessive force. The input to the network was the position vector and force vector and the output was a new position vector. About 400 training trials are required to recover a good strategy. However, the learned “skill” is specific to the peg geometry on which it is trained and is specific to the location of the peg in the workspace because the absolute peg position is produced as output. If the peg location were moved in the workspace, this skill would probably fail because it would be very difficult to accurately (relative to the insertion clearance) specify the relative transformation between the new location and the training location. A few more training trials would probably suffice for learning the new skill, but one does not want to learn and store a different skill for every location in the robot’s workspace.

Vaaler and Seering [1991] have applied reinforcement learning to recover production rules (condition-action pairs) for performing a peg insertion task. The critic function is a measure of the forces produced from the last move increment; higher forces are penalized. The termination conditions are an absolute Z position (Z is the insertion axis) and a Z force large enough not to be caused by 1 or 2 point contact (common during insertion). Ahn et al [1991] learn to associate pre-defined corrective actions with particular sensor readings during iterative training and store these mappings in a binary database. Again, the critic function penalizes moves which increase the measured force. Kinematic analysis of the task can be used to “seed” the database with a priori knowledge, but this is not necessary for the method to succeed. Lee and Kim [1988] propose a learning expert system for the recovery of fine motion skills. Skills are represented as sets of production rules and expert a priori knowledge is used. The critic function is the distance between the current state and the goal state, but does not explicitly include an excessive force penalty. The method is tested on a simulated 2D peg insertion task.

5.3 ACQUISITION OF SENSORIMOTOR PRIMITIVES

Primordial Learning refers more to a way of thinking than to a specific algorithm; it refers to learning fundamental interactions, or mappings, with no prior knowledge. It’s like an infant that learns gross motor control by flailing. The infant is aware of “sensor and actuator ports,” but has no comprehension of their connection to the body or to the outside world. Some might say primordial learning is non-parametric learning.


Some might call it unsupervised learning. The implementations in this chapter can be called principal components analysis. Yet, all these terms, broad and narrow, are, in one way or another, inaccurate in describing our work in its entirety, from autonomous sensor calibration to mobile robot behaviors.

5.3.1 Mobile Robot Primitives

The infant analogy provided not just the impetus for the name, but the impetus for the first application. Somewhat as a lark, the challenge was made to create a mobile robot that could learn to “crawl” before Meredith, my newborn daughter at the time, could do so. The term “crawl” was interpreted loosely to mean “purposefully motivate” so as not to exclude the creation of a wheeled vehicle as opposed to a much more mechanically complex legged vehicle.

This sort of primordial robot learning has been suggested for mobile robots before. Pierce [1991] taught a simulated robot to navigate around obstacles and even to home in on a goal without any explicit knowledge of what the sensor and actuator data meant. Likewise, Maes and Brooks [1990] developed a subsumption network that allowed a physical legged robot to learn to walk. Of course, both of these examples employed a specific objective function that guided the learning process to the desired outcome.

In our implementation, we applied the shape from motion primordial learning technique to a small mobile robot that has no explicit knowledge of its limited set of actuators and sensors. All the robot is “aware of” is the streams of data that come from or go to the sensors and actuators. The shape from motion technique allows the robot to develop an internal representation of teleoperated interaction between sensors and actuators to produce “meaningful” externalized behavior. In this sense, the robot is primordial, or infant-like, because it must learn the most basic input/output relationships from a teacher, fuse them as required to mimic the behavior, and ignore superfluous data. We do not use an explicit objective function or provide a reinforcement signal, but the learned result is explicitly dependent on the teleoperated interactions presented by the teacher.

The shape from motion technique as applied to this problem uses the standard principal components analysis of Hancock and Thorpe [1995] and Pierce [1991]. The standard principal components analysis does not include a model-based constraint as in Shape from Motion Calibration. For primordial learning, we do not wish to constrain the problem. Although some constraints can be useful to achieve more sophisticated results (as in [Pierce, 1991]), they impose potential limitations which may not be expressive enough for the situation at hand.


The subject of this experiment is the MK-V, a three-wheeled, non-holonomic mobile robot with a single drive wheel and an unpowered steering wheel. Sensors include wheel encoders, a compass, bump sensors and motor current sensors. The drive and steering motors have three-valued -- forward-off-reverse -- commands with no closed-loop controllers on velocity or position. For this initial experiment there are no redundant groups of very similar and correlated sense elements such as a visual retina (as in [Hancock and Thorpe, 1995]) or sonar ring (as in [Pierce, 1991]).

With no algorithmic structure imposed at all on the fusion process, the result learned is dependent on what the robot is allowed to observe. For training, we teleoperated the robot in a cluttered environment, allowing it to wander while bumping into obstacles and jamming the wheels.

The training data consists of a matrix of input/output vectors sampled periodically during teleoperation. The input/output vector is composed of the actuator commands concatenated to the sensor data. The mean of this vector over the entire sequence was subtracted out to normalize the data values. Next, the matrix was batch-processed using SVD to extract the eigenvectors, and the largest n eigenvectors were selected using the largest ratio of adjacent singular values as the threshold for n. This test assumes there is a group of “significant” eigenvectors with similar singular values and then a bunch of noise with small singular values. Examining the ratios of adjacent, ordered singular values indicates the dividing line between the groups. For run-time operation, the new sensor image is projected onto the eigenspace as described in [Hancock and Thorpe, 1995]. To summarize, sequentially take the dot product of the new sensor vector and the sensor part of each and every eigenvector. Each dot product produces a scale factor that is applied to its eigenvector, and the scaled eigenvectors are summed:

\[ v = a + \sum_{i=1}^{n} \bigl( (z - \mathrm{sensor}(a)) \cdot \mathrm{sensor}(e_i) \bigr)\, e_i \qquad\qquad (5.1) \]

This can also be written in a form closer to the calibration matrices of Chapter 4:

\[ m^T = a_m + \bigl[ (z^T - a_z)\, C_z^T \bigr] C_m \qquad\qquad (5.2) \]

where m is the motor command, a_m is the motor part of the average vector (equivalent to motor(a)), C_m consists of the eigenvectors corresponding to the motors, a_z is the sensor part of the average vector (equivalent to sensor(a)), and C_z consists of the eigenvectors corresponding to the sensors.
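A compact sketch of this train-and-playback loop under our assumptions (NumPy; the motor elements are assumed to occupy the leading positions of each input/output vector, and all names are illustrative):

```python
import numpy as np

def train_primitive(samples):
    """Extract an eigenspace primitive from teleoperation data.

    samples: (T, p) rows of concatenated [motor, sensor] vectors.
    Returns the average vector and the n most significant eigenvectors (rows),
    with n chosen at the largest ratio of adjacent singular values.
    """
    a = samples.mean(axis=0)
    _, s, Vt = np.linalg.svd(samples - a, full_matrices=False)
    n = int(np.argmax(s[:-1] / s[1:])) + 1   # split 'significant' from 'noise'
    return a, Vt[:n]

def run_primitive(a, E, z, n_motor):
    """Equations (5.1)/(5.2): project a new sensor vector onto the eigenspace
    and read back the corresponding motor command.

    a: (p,) average vector; E: (n, p) eigenvectors as rows;
    z: new sensor reading; n_motor: number of motor elements at the front.
    """
    a_m, a_z = a[:n_motor], a[n_motor:]
    E_m, E_z = E[:, :n_motor], E[:, n_motor:]
    coeffs = E_z @ (z - a_z)          # dot products with the sensor parts
    return a_m + coeffs @ E_m         # weighted sum of the motor parts
```

In this framing, training runs once on the logged teleoperation matrix and the projection runs at every control cycle to produce the motor command.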



Using this technique, the real robot successfully achieved both wandering behavior and escape from stalls and collisions without highly redundant or correlated sensors and with no algorithmic structure imposed on the learned result. Figure 5.1 shows a teleoperated run of the MK-V with comparisons of what the human did and what the MK-V would have done on its own with eigenvectors learned on a prior training set. The drive motor commands of the Shape from Motion technique closely match the commands of the human. Furthermore, the steering motor learned when to steer, but the steering direction, which is random and superfluous to getting un-stuck, does not match. This fused result is appropriate; superfluous information and, in fact, entire superfluous sensors were ignored to achieve the proper trained behavior.

5.3.2 PUMA Manipulator Primitives

While the application on the MK-V mobile robot started out as a lark, learning sensorimotor primitives for robot manipulation skills is more serious. Automatic generation of usable primitives is a necessary technology demonstration for the credibility of Gesture-Based Programming.

FIGURE 5.1 COMPARISON OF HUMAN AND SHAPE FROM MOTION COMMANDS ON A TELEOPERATED DATASET. [Panels: Shape from Motion drive motor command, Human drive motor command, Shape from Motion steering motor command, Human steering motor command.]


Because the skill base can incorporate primitives from a variety of sources, it is not necessary to generate all primitives automatically. Nonetheless, it is important to show proof-of-concept of the end-to-end system.

5.3.2.1 Guarded Move

As a first demonstration, we attempted to learn a one-dimensional guarded move. Although this is a fairly trivial primitive that can easily be hand-coded, it is not quite as obvious how to robustly identify it during human demonstration. This is the strength of an integrated approach like shape from motion primordial learning; it provides primitive identification and transformation as well as the basic learning of the primitive.

To learn “zguard” -- a guarded move along the z-axis -- we teleoperated a PUMA robot with a 6-DoF force/torque sensor (calibrated with the shape from motion paradigm, incidentally) so that it came into contact with a table while moving along the approach (z) axis of the end effector. We repeated this one-dimensional guarded move ten times, logging data only during the guarded move, not during the retraction phase when the end effector was moved away from the surface. Average approach height was about 40 mm for this first training dataset. The robot was controlled by a cartesian velocity controller, teleoperation input was provided by a 6-DoF trackball, and operator feedback was visual as well as a graphical display of the real-time force/torque components.

The input/output vector from which the data matrix for training was generated consisted of the useful data -- 6 measured force components, the total force and torque magnitudes, and 6 commanded cartesian velocities -- plus some irrelevant data -- 3 cartesian position elements and 9 cartesian orientation elements. We used the same algorithm for extraction of the eigenvectors as used on the MK-V, with one exception. We modified the singular value thresholding to trigger on either of two criteria: the peak ratio of adjacent singular values, or an adjacent singular value ratio of 20 or greater. These two criteria together give us an absolute and a relative measure of significance of the eigenvectors. The singular values that resulted from the training data are plotted in Figure 5.2.
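A small sketch of that modified test (our own reading of how the two criteria combine; the tie-breaking is an assumption):

```python
import numpy as np

def select_significant(singular_values, ratio_cutoff=20.0):
    """Choose how many eigenvectors to keep from ordered singular values.

    The cut is placed at the earlier of (a) the single largest ratio of
    adjacent singular values and (b) the first adjacent ratio of 20 or more.
    """
    s = np.asarray(singular_values, dtype=float)
    ratios = s[:-1] / s[1:]
    peak = int(np.argmax(ratios))                    # relative criterion
    big = np.flatnonzero(ratios >= ratio_cutoff)     # absolute criterion
    cut = min(peak, int(big[0])) if big.size else peak
    return cut + 1                                   # number of vectors to retain
```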

The output of the training was a primitive of one eigenvector that performed very well. Figure 5.3 shows the measured force components as the guarded move primitive autonomously acquires a hard surface. 20 N was the target force threshold applied during training, and this can be varied by scaling the output portion of the extracted eigenvector. The primitive worked equivalently in all trials performed regardless of region of workspace, orientation of the end effector, compliance of the surface, or external perturbations. Analysis of the components of the eigenvector supports this. The primitive is looking at the z-component of force as well as the total magnitude of the force.


5.3.2.2 Edge-to-Surface Alignment

The guarded move is a “move until force threshold” operation. Movement is an explicit part of the goal. Alignment operations, on the other hand, only move in order to accommodate. Since no explicit motion is part of the primitive, it is difficult to teleoperate.

FIGURE 5.2 SINGULAR VALUES ASSOCIATED WITH THE FIRST 14 EIGENVECTORS OF THE GUARDED MOVE TRAINING DATA. [Singular value magnitude (order 10^5) versus index.]

FIGURE 5.3 AUTONOMOUS OPERATION OF THE LEARNED, ONE-DIMENSIONAL GUARDED MOVE. [Fx, Fz, and Fmag traces, force (N) versus time (sec).]


What we’d like to do is use the previously learned guarded move to bring the edge and surface together while we teleoperate only the alignment primitive. This makes the training task much easier for the human.

Learning the “yroll” alignment task -- alignment of an edge to a surface by accommodating rotationally around the y-axis -- was accomplished in the same manner as the zguard primitive, with the exception that zguard was running during the training phase. This complicates the eigenvector extraction because we expect the zguard eigenvector to appear in the new set of trained vectors. In order to extract a “clean” version of yroll, zguard must first be removed. Assume, for the moment, this is possible; we will describe how in the next section.

Removing the zguard eigenvector produces a set of four eigenvectors that very nicely implement the desired accommodation operation. Figure 5.4 shows the commanded velocities for an experimental run with both the zguard and yroll primitives superposed. The corresponding force plots (Figure 5.5) show the desired result of moving in z to press the edge flat against the table was achieved.

5.4 IDENTIFICATION OF PREVIOUSLY LEARNED PRIMITIVES

The identification of previously learned primitives is an important component of GBP. We must be able to recognize and interpret segments of the demonstration in order to match them to equivalent skills the robot has in its knowledge base.

FIGURE 5.4 VELOCITY COMMANDS OF “YROLL” AND “ZGUARD” PRIMITIVES WORKING COOPERATIVELY TO BRING AN EDGE INTO CONTACT WITH A SURFACE. [RotVelY (rad/s) and TransVelZ (mm/s) versus time.]


We will only investigate the identification of known primitives in the context of manipulation tasks.

5.4.1 Guarded Move

To identify instances of the zguard primitive from demonstration data, we run the training algorithm on windowed batches of run-time data and examine the parallelism between the extracted eigenspace and the eigenspace representation of the primitive. The dot product gives a quantified measure of eigenvector parallelism that we compare to a high threshold for presence or absence of the primitive.
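A sketch of that windowed matcher under our assumptions (the window length and the way per-vector dot products are combined into a single score are illustrative choices):

```python
import numpy as np

def match_primitive(stream, primitive_vecs, window=50):
    """Score the parallelism between run-time data and a stored primitive.

    stream: (T, p) run-time input/output vectors.
    primitive_vecs: (n, p) unit eigenvectors of the stored primitive (rows).
    Returns one goodness-of-fit value per window position; values near 1 mean
    the extracted eigenspace is parallel to the primitive's eigenspace.
    """
    scores = []
    for start in range(stream.shape[0] - window + 1):
        batch = stream[start:start + window]
        _, _, Vt = np.linalg.svd(batch - batch.mean(axis=0), full_matrices=False)
        E = Vt[:primitive_vecs.shape[0]]                    # same number of vectors
        dots = np.abs(np.sum(E * primitive_vecs, axis=1))   # per-vector parallelism
        scores.append(dots.min())
    return np.array(scores)
```

A detection is then declared wherever the returned score exceeds a high threshold; for the single-eigenvector zguard primitive this reduces to one dot product per window.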

To test the recognizer, we randomly moved the robot around the workspace, occasionally coming into contact with surfaces. The commanded velocities appear in Figure 5.6 while the corresponding force measurements are plotted in Figure 5.7. The resulting goodness-of-fit is plotted in Figure 5.8.

As evidenced by the figures, the goodness-of-fit measure performed just as it should, picking out every instance of a guarded move in z and rejecting every instance of random motion or random contact, including guarded moves along other axes. The occasional dropouts to zero are inconsequential as this fitness measure is but one input to the gesture interpretation network of the Gesture-Based Programming system.

For gesture-based programming we want to be able to identify primitives from natural hand motions rather than teleoperation using a trackball.

FIGURE 5.5 MEASURED FORCES WITH “YROLL” AND “ZGUARD” PRIMITIVES WORKING COOPERATIVELY TO BRING AN EDGE INTO CONTACT WITH A SURFACE. [Fz, Fmag (N) and Tx, Ty, Tmag (Nm) versus time.]


FIGURE 5.6 CARTESIAN VELOCITY COMMANDS RECORDED DURING RANDOM TELEOPERATED MOTION AND INTERACTIONS WITH OBJECTS. [Velocity (m/s) versus time (sec).]

FIGURE 5.7 FORCE COMPONENTS RECORDED DURING RANDOM TELEOPERATED MOTION AND INTERACTIONS WITH OBJECTS. [Fx, Fy, Fz, Fmag traces, force (N) versus time (sec).]

FIGURE 5.8 GOODNESS OF MATCH DETERMINED BY AUTOMATIC SKILL IDENTIFICATION ALGORITHM DURING RANDOM TELEOPERATED MOTION AND INTERACTION WITH OBJECTS. [Normalized match versus time (sec).]


Although learning primitives can be done either way, identification must occur in a more natural environment. To do this, we must instrument the human’s hand with force and velocity sensors. For our initial experiments, we used a standard 6-axis force sensor held in the human’s hand to provide force information. Differentiating and filtering the output of a Polhemus device from the CyberGlove provided cartesian position and velocity.
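As one way to obtain that velocity signal (a sketch only; the filter and its constant are illustrative, not necessarily the filter actually used):

```python
import numpy as np

def velocity_from_positions(positions, dt, alpha=0.2):
    """Differentiate and low-pass filter a stream of Cartesian positions.

    positions: (T, 3) tracker samples; dt: sample period in seconds.
    A first difference gives a raw velocity estimate; a simple first-order
    exponential filter then suppresses the differentiation noise.
    """
    raw = np.diff(positions, axis=0) / dt
    smoothed = np.empty_like(raw)
    smoothed[0] = raw[0]
    for k in range(1, raw.shape[0]):
        smoothed[k] = alpha * raw[k] + (1.0 - alpha) * smoothed[k - 1]
    return smoothed
```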

The recognition results for the guarded move are shown in Figure 5.9. The bold line indicates the confidence of match between the human motions and the previously learned zguard primitive. As above, it is the parallelism between primitive eigenvectors and the principal components of the new data. For reference, z-force and x- and y-force are plotted on the same axes (scaled down by a factor of 25). The matcher correctly picked out all three instances of zguard and correctly rejected all instances of yguard and xguard as well as all retractions from the table. The indication of match precedes the onset of the force because a fixed-width moving window is used to perform the comparison.

FIGURE 5.9 RECOGNITION OF ZGUARD PRIMITIVE FROM HUMAN DEMONSTRATION. [Degree of match (bold) with FZ and FX,Y traces overlaid, versus sample number.]


As the tail-end of the window begins to overlap the region in which the primitive occurs, it indicates a positive response for the entire window.

5.4.2 Edge-to-Surface Alignment

Remember that when we were learning the xroll and yroll primitives we used the zguard primitive to make training easier. The problem is that the learned result includes both primitive operations, since there is no way to separate the sensor stream into “guarded move” and “alignment” portions. However, the eigenspace representation provides a convenient mechanism for separating the two after learning, if they are truly elementary and decoupled.

Performing the standard learning experiment we detailed for the guarded move in Section 5.3.2.1 with the standard ratio-of-eigenvalues test, only one eigenvector is extracted during training, and its dot product with zguard is 0.9993, indicating it is the zguard primitive. The alignment primitive has been lost (or overlooked) completely.

To extract the alignment primitive, we must look to the next significant group of eigenvectors. We have already identified the eigenvectors corresponding to zguard. All we have to do is remove them from the set of extracted eigenvectors and re-perform the standard ratio-of-eigenvalues test on the remaining eigenvectors. That is how we produced the set of four eigenvectors identified in Section 5.3.2.2.
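A sketch of that removal-and-retest step under our assumptions (the parallelism threshold used to decide that an extracted eigenvector matches zguard is illustrative):

```python
import numpy as np

def remove_known_primitive(eigvecs, sing_vals, known_vecs, parallel_thresh=0.9):
    """Discard eigenvectors belonging to an already-identified primitive and
    re-apply the adjacent-singular-value ratio test to what remains.

    eigvecs: (m, p) extracted eigenvectors (rows), ordered by singular value.
    sing_vals: (m,) the corresponding singular values.
    known_vecs: (n, p) eigenvectors of the known primitive (e.g. zguard).
    """
    keep = [i for i, v in enumerate(eigvecs)
            if np.max(np.abs(known_vecs @ v)) < parallel_thresh]
    rest_vecs = eigvecs[keep]
    rest_vals = sing_vals[keep]
    n = int(np.argmax(rest_vals[:-1] / rest_vals[1:])) + 1
    return rest_vecs[:n]
```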

5.5 TRANSFORMATION OF PRIMITIVES

In fact, we have already demonstrated the superposition of eigenvectors for the decomposition and recomposition of primitives, at least in this case of decoupled operations. Section 5.3.2.2 and Section 5.4.2 involved both the decomposition and recomposition of the zguard and yroll primitives. However, this was in the case of a primitive that had been previously taught to the system. This representation also allows for the creation of novel primitives that the system has not seen. This is what we refer to as “skill morphing.”

5.5.1 Surface-to-Surface Alignment

The above decoupled primitives can be combined to produce more complex behavior. Combining zguard, xroll, and yroll produces a “skill” that allows two surfaces to be pressed together, which is a new emergent behavior. An experimental run of these three primitives is shown doing just that in Figure 5.10 and Figure 5.11. Figure 5.10 shows the force stabilizing at the desired threshold, while Figure 5.11 shows the torques stabilizing at zero despite initial misalignments of approximately 30 degrees along each axis.


Figure 5.10 shows the force stabilizing at the desired threshold, while Figure 5.11 shows the torques stabilizing at zero despite initial misalignments of approximately 30 degrees along each axis.

5.6 LIMITATIONS

There are several limitations to shape from motion primordial learning. Most obvious, perhaps, is the fact that it results in a linear representation. Although I use this as a benefit in combining and morphing primitives to produce new primitives through superposition, it does limit the complexity of learnable primitives. However, I do not think it places a strong limitation on the types of tasks that can ultimately be handled by the technique. Linearization is a common process in solving engineering problems, even for complex non-linear systems.

For example, gain scheduling is a common technique in adaptive control for applying linear control methods to highly non-linear systems. It segments the state space into linearized subregions and switches linear controllers in and out at appropriate times. In like fashion, a set of linear primitives can be used to approximate non-linear behavior if sequenced properly.

FIGURE 5.10 FORCE IN Z-AXIS AS ZGUARD, XROLL, AND YROLL COOPERATE TO PRESS ONE SURFACE AGAINST ANOTHER. (Force (N) versus time (sec); annotations mark initial contact and the force stabilizing near the desired threshold.)


The problem is one of segmentation and then recognizing when to switch from one primitive to the next. These are not trivial problems and they are further complicated by what may happen during the transition. In the adaptive control regime, instability can result at the boundaries of gain scheduled solutions, and there may be an analogous difficulty when switching primitives in certain situations.
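As a sketch of the analogy only (not an implementation from the thesis), switching among linear primitives looks much like a gain-scheduled controller; the region predicates and the fallback behavior below are assumptions.

    import numpy as np

    def scheduled_command(state, regions, primitives):
        # Pick the first linear primitive whose region contains the current
        # state; each primitive is a matrix mapping sensed state to a command.
        for in_region, K in zip(regions, primitives):
            if in_region(state):
                return K @ state
        # No region matched: command zero motion.  The boundary/transition
        # behavior is exactly where the difficulties discussed above arise.
        return np.zeros(primitives[0].shape[0])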

Another potential problem with the shape from motion technique is the range of singular values associated with the eigenvectors defining the primitive. If important eigenvectors are associated with small singular values, they may be excluded from the primitive or may overlap with eigenvectors associated with noise. Attempts must be made to normalize the singular values so they will be grouped appropriately.

Closely related to the problem of singular value normalization is the problem of eigenvector grouping. The idea behind extracting sensorimotor primitives by SVD is that the principal structure inherent in the data will be attributed to the primitive while all other content will be noise or, at least, non-essential. The result is an assumed partitioning of the eigenvectors into two distinct groups. However, there is no analytical way to determine the dividing line between groups, and proving that all non-essential eigenvectors are excludable requires unrealistic assumptions. So the issue of finding the groups of "significant eigenvectors" requires heuristics based on sensor signal-to-noise ratios and qualitative assumptions about "pureness" of the training sets.

FIGURE 5.11 TORQUE AROUND X- AND Y-AXES AS ZGUARD, XROLL, AND YROLL COOPERATE TO PRESS ONE SURFACE AGAINST ANOTHER. (Torque (N-m) versus time (sec); traces Tx and Ty; the torques stabilize near zero.)


Finally, the training sets must be "pure" and "complete" or the learned primitives may be corrupt or incomplete. These requirements are necessary for any learning paradigm unless a specific a priori model is provided to exclude or "patch" specific information. This is exactly how the Shape from Motion paradigm works in "calibration mode," with the inclusion of geometric model information, as described in the preceding chapter. Pomerleau [1992] extensively studied these requirements for his work on ALVINN, a robotic driver. "Completeness" demanded that the vehicle be trained to return to the roadway if it departed, but "pureness" prevented the human trainer from departing the roadway during training because departures would become an undesirable part of the learned result.

5.7 CONCLUSIONS

I have described shape from motion primordial learning, a learning technique based on eigenspace decomposition that has been successfully applied to extracting, identifying, and morphing robotic sensorimotor primitives from teleoperated actions. It is closely related to the shape from motion calibration described in the previous chapter, but is "primordial" in the sense that it is given no a priori models of the phenomena it observes, nor is it given any interpretation of its sensors and actuators.

What makes it significant among the other skill learning approaches surveyed is that it compactly provides the "holy grail" for programming by human demonstration -- the triumvirate of acquisition, identification, and transformation ("morphing"). GBP requires a previously acquired knowledge base of robotic primitives to provide a shared ontology between human and robot for the abstraction of intention. Given this basis set of primitives, the GBP system must be able to identify primitives as they are utilized by the human teacher during the demonstration of higher-level skills and tasks. Finally, GBP requires a method of parametrizing primitives for a specific instance of a task and of combining primitives to create new functionality, examples of which the system might never have seen before.

Shape from Motion Primordial Learning provides a solution to all three of these requirements in an intuitive, learning-from-demonstration format. It also adds the benefit of a linear representation that can be easily analyzed by an expert to determine "what has been learned." Although the technique has strong limitations, they are the types of limitations engineers are accustomed to dealing with. Linearization and segmentation are among the most commonly used tools for dealing with complex problems.

Furthermore, consider the tasks that were learned successfully -- the guarded move in particular. This is considered a non-linear function, yet it was extracted successfully.


Upon closer inspection, the reason is obvious. The extracted matrix is a damping force controller. Zguard actually servos the force to the threshold. Likewise, the roll primitives consist of damping force controllers.
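For illustration only, a damping force controller of the kind the learned matrix implements can be sketched as below; the gains, thresholds, and sign conventions are assumptions, not the learned matrix entries.

    def zguard_step(f_z, f_desired=-2.0, gain=0.01, v_max=0.02):
        # Damping force control: command a velocity proportional to the force
        # error, so motion toward the surface stops as the sensed z-force
        # approaches the desired contact threshold.
        v_z = gain * (f_desired - f_z)
        # Limit the approach speed in free space.
        return max(v_z, -v_max)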

Finally, the general technique of eigenspace decomposition is not new to the learning community. Most notably, the work of Hancock and Thorpe [1995] and of Pierce [1991] employs similar analyses. A key difference is that both of these efforts (and others with similar applications) used homogeneous sensor arrays: either a visual retina or a sonar ring. I have demonstrated that the approach is applicable to heterogeneous sensor arrays in more diverse domains and less structured learning environments.


Chapter 6

Gesture-Based Programming Demonstrations

Kinematic and Contact Tasks

6.1 INTRODUCTION

Although the emphasis of this thesis was not to create a fully functional GBP system, it is necessary to demonstrate some level of functionality beyond the individual demonstrations of constituent parts in previous chapters. Toward that aim, this chapter presents two example gesture-based interpretation systems for end-use applications.

The first, robotic cable harnessing, appears to be GBP but, in fact, is an example of gesture-based instruction. Like the work of Kang [1994], the robotic executable is compiled beforehand and does not change for each instantiation of the same or even different tasks. The core of the executable is merely a trajectory follower. Unlike Kang, however, the executable includes an auxiliary set of agents for the interpretation of symbolic and tactile gestures that select and fine-tune the trajectory during practice and execution.

The second example implements true gesture-based programming in duplicating Kang's low-tolerance peg-in-hole task. In this example, the output of the system is a program of both sequential and parallel agents, based on a set of a priori sensorimotor primitives and support modules. Two slightly different applications are demonstrated to show that the resulting programs are, in fact, different.


6.2 ROBOTIC CABLE HARNESSING

Robotic cable harnessing is a task that naturally follows from the trajectory modification example in Chapter 2. Both involve the robot following polygonal trajectories and tactile "nudge" gestures for fine tuning the trajectory. The fine tuning gestures are relevant to the human demonstration because the Polhemus sensor used during demonstration is not accurate enough to use the trajectory as-is, particularly around iron or steel objects (as are often found near a robot's workspace).

6.2.1 Task Description

Cable harnessing involves the bundling of individual wires into a wire "rope" with appropriate taps1 to connect to various electrical accessories along its path. The wires are normally routed around a jig that maintains the appropriate spacing between taps. For our application, we use a reconfigurable jig consisting of a set of moveable pegs around which the wires are strung (Figure 6.1). Because the resulting harness is flexible and the robot's workspace is limited, we will assume all wires can be laid out in a single plane with the end point near the start point to keep cycle times low. We also make the more restrictive assumption that all wires in a harness follow one of a small number of paths: straight through or one of a few tapped patterns.

1. A “tap” is a small loop for connecting to an external device.

FIGURE 6.1 WIRE HARNESS JIG. (Legend: routing pegs, straight-through wire, tapped wire; taps are indicated.)


Since the wires are strung point-to-point around pegs, their shapes are polygonal and the task is strictly one of tracking a trajectory. Using a CyberGlove, a Polhemus sensor, and the human-wearable tactile sensor of Chapter 3 on the index finger, the user grasps a wire and traces out the trajectory the robot must follow for each wire type. This is part of the teaching phase. To fine-tune the trajectory, simple "dictionary" symbolic hand gestures start and stop the robot's movement through the cyclic trajectory. As it's moving, tactile gestures nudge the robot into the precise trajectory. (Precise has a loose interpretation since it merely has to avoid collisions with the pegs.)

Each harness contains a variable number of each type of wire and the robot runs the same type of wire until instructed to change. Changing from one type of wire to another is accomplished in a menu style with symbolic hand gestures. These gestures are interpreted with a multi-agent network based on the fine-grained tropism system cognitive architecture. Symbolic, sign-language-like gestures were used to implement a menu because they can be interpreted instantaneously, making the entire interface gesture-based.

Training, practice, and execution are all performed in real-time by networks of independent agents. These networks are described in detail in the following section.

6.2.2 Teaching the Trajectory by Human Demonstration

Training the wire runs is the first step in the application. The trainer dons a CyberGlove and puts the tactile sensor on the index finger, over the glove, and trains each wire trajectory by grasping a wire between the fingers and tracing it through the route the wire takes around the jig. The act of grasping the wire signals the starting point of the trajectory while prolonged ungrasping signals the end. Although trivial in this application, these pinching gestures are recognized by the recognition agent, pinch (see Figure 6.2), which monitors the tactile fingertip.

Each trajectory is assumed to be made up of a series of line segments (polygonal). Two additional recognition agents, rotate and straight, attempt to identify straight-line and rotating gross motion gestures of the hand. These agents operate independently so it is possible for straight-line and rotating gestures to be recognized simultaneously.

The interpretation of the gestures, identifying whether any given motion is part of a trajectory segment, an inter-segment reorientation, or is extraneous, is performed by the interpretation agent, segment. Segment forms the fuzzy result of not_straight AND rotating AND grasped_wire. If this is true, the segment is stored in a file for later recovery by the execution agents. Segment keeps track of the start and stop of each trajectory and files them separately.
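A one-line sketch of that rule follows, assuming min/complement fuzzy operators; the thesis does not state the exact operators used.

    def segment_activation(straightness, rotating, grasped_wire):
        # Fuzzy NOT straight AND rotating AND grasped_wire, each input in [0, 1].
        return min(1.0 - straightness, rotating, grasped_wire)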


To provide a small level of immunity to the noisy output of the Polhemus, segment takes a parameter that specifies the minimum step-size of the grid of pegs. It then rounds the positions contained in the gestural word up or down if the rotation was to the left or right, respectively (assuming a counter-clockwise motion).
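A sketch of that rounding step, with the turn-direction convention taken from the text; the helper name and per-coordinate treatment are hypothetical.

    import math

    def snap_to_grid(position, step, turned_left):
        # Round a segment endpoint onto the peg grid: up after a left turn,
        # down after a right turn (counter-clockwise traversal assumed).
        rounder = math.ceil if turned_left else math.floor
        return tuple(rounder(coord / step) * step for coord in position)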

6.2.3 Robot Practice and Fine Tuning

The practice session is when the robot moves through the various trajectories and the trainer observes the motions to see if they are as desired. End-of-arm tactile gestures, similar to those described in Chapter 2, are used to fine-tune the trajectory while symbolic hand gestures coupled with hand motion gestures are used to switch between trajectories. These are illustrated in Figure 6.3.

6.2.4 Robot Execution

The execution phase is fairly trivial because the trajectories are all prepared; we just want to provide a simple menu to select between them. This menu is implemented as a set of symbolic gestures of the raised hand indicating the trajectory number by the number of vertically extended fingers.

FIGURE 6.2 CABLE HARNESSING TRAINING NETWORK. (Inputs: Polhemus and fingertip; recognition agents: Straight, Rotate, and Pinch; interpretation agent: Segment.)


FIGURE 6.3 CABLE HARNESSING PRACTICE NETWORK. (Inputs: Polhemus, CyberGlove, and F/T sensor; recognition and interpretation agents include Straight, Symbol, Nudge, Stop, Numbers, Thick, Radial, and Scale; the robot agent comprises CableTraj, PID, FwdKin, InvKin, GravComp, and Robot.)


6.2.5 Agent Descriptions

The network topology, voting mechanism, and end-of-arm tactile gesture recognizers are described in detail in Chapter 2. However, the "numbers" agent is new. It is actually a wild-card in the topology that symbolizes a group of agents for recognizing numeric sign language gestures. Each number agent stores a template of a symbolic gesture and computes a closeness of fit.

FIGURE 6.4 CABLE HARNESSING APPLICATION NETWORK. (Input: CyberGlove; recognition and interpretation agents include Symbol, Stop, and Numbers; the robot agent comprises CableTraj, PID, FwdKin, InvKin, GravComp, and Robot.)


Since static pattern matching for recognizing sign language is non-robust, the gesture recognizer is fairly liberal while the interpreter is biased to look closely at negative stimuli such as hand motion gestures and end-of-arm tactile gestures. (It is assumed that symbolic gestures are usually made close to the body, in a roughly vertical position, with the hand approximately still.)
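As an illustration of what a single numbers agent might compute, the sketch below scores a CyberGlove joint vector against a stored template; the Gaussian form and width are assumptions, since the thesis only states that a closeness of fit is computed.

    import numpy as np

    def number_agent_fit(joint_angles, template, width=0.35):
        # Closeness of fit in [0, 1]: 1 for a perfect match, falling off with
        # distance between the glove reading and the stored gesture template.
        d = np.linalg.norm(np.asarray(joint_angles) - np.asarray(template))
        return float(np.exp(-(d / width) ** 2))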

6.2.6 Experiments

The most important step in operating the system is training the trajectories. Figure 6.5 and Figure 6.6 indicate two segmented trajectories and the raw Polhemus data that resulted from the training phase in an iron-free environment. Noise in the Polhemus data can be significant and it is particularly disturbed by ferromagnetic materials. After the practice session (Figure 6.7), however, it squares up fairly well. Of course, the only criterion for a good trajectory is that it doesn't hit any of the pegs of the jig.

FIGURE 6.5 SEGMENTED AND RAW TRAJECTORIES FROM TRAINING. (Extracted segments overlaid on the raw Polhemus data.)


Snapshots from the cable harnessing system in action appear in Figure 6.8. Image (a) shows the demonstration of the routing of one of the wires. The gel-based fingertip tactile sensor described in Chapter 3 is used to detect the tactile "pinch" gesture indicating the start and stop points of the demonstration while the Polhemus senses motion gestures. A separate demonstration is performed for each path that a wire can take. Since the information from the Polhemus is rather noisy and prone to inaccuracies near metal objects (as are common in a robotic workcell), the trajectory must be fine-tuned to prevent collisions with the pegs. This is shown in (b) where the user is "nudging" the robot with tactile gestures into the correct trajectory. These tactile gestures are sensed with the robot's wrist force/torque sensor while the robot practices above the pegs.

Once the wire trajectories have been demonstrated and fine-tuned during practice, the robot is ready to route wires.

FIGURE 6.6 ANOTHER SEGMENTED TRAJECTORY AND RAW TRAJECTORY FROM TRAINING. (Extracted segments overlaid on the raw Polhemus data.)


In execution mode, the vertical offset inserted for practice is removed and the robot engages the pegs at the normal working height. Image (c) shows the robot routing the first wire. Trajectories are chosen manually with symbolic gestures (d) implementing a menu. The robot starts a new wire trajectory (e) based on this symbolic menu command. Note that this prototype demonstration does not have any clinching mechanism to capture and hold the routed wires in position. A previously routed bundle was laid in position to indicate the concept in Figure 6.8 (e).

Finally, image (f) shows the CyberGlove with gel-based extrinsic tactile sensor.

6.2.7 Conclusions

Although gestural input may not be the most efficient method of commanding a cable harnessing robot, it seems to work satisfactorily. For this particular task, a teach pendant would work at least as well and probably much better.

FIGURE 6.7 TRAINED TRAJECTORY (DOTTED) AND PRACTICE TRAJECTORY (SOLID)


However, teach pendants have been around for a long time and they have little hope of making robots any more intuitive to program for complex tasks. To the contrary, gesture-based programming has the potential for tremendous impact.

FIGURE 6.8 DEMONSTRATION, FINE-TUNING, AND EXECUTION OF THE CABLE HARNESSING TASK (A-E). CYBERGLOVE WITH GEL-BASED TACTILE SENSOR FOR OBSERVING GESTURES (F). (a) demonstration using pinch and motion gestures; (b) fine-tuning with nudge gestures; (c) routing a wire; (d) symbolic gesture to select new trajectory; (e) routing a new wire; (f) glove for sensing motion, symbolic, and pinch gestures.


6.3 LOW-TOLERANCE PEG-IN-HOLE TASK

This section describes a deliberate replication of the centerpiece demonstration of Kang's thesis on "Robotic Instruction by Human Demonstration" [Kang, 1994]. As mentioned in the introduction to this thesis, the support role we played in Kang's research led directly to many of the insights and philosophical underpinnings of the work reported here. Some of these Kang, himself, shared; others he did not; but it is fitting and proper that we use a re-implementation of his demo as the central unifying application.

6.3.1 Two Peg-in-Hole Demonstrations

We actually performed two slightly different types of demonstrations to show that, in fact, different programs resulted. The first is exactly like Kang's demo and is illustrated in Figure 6.9. First, the hole is grasped, transported, and placed on the table. Then the peg is grasped, transported, and inserted into the hole. In both cases of placing the hole and the peg, a guarded move is used. In effect, the hole is pressed onto the table (as opposed to being dropped) and the peg is pressed into the hole. Contact force is used as the terminating condition rather than reaching a location in space.

To abstract the task, a multi-agent network, based on the fine-grained tropism system cognitive architecture presented in Chapter 2, interprets the demonstrator's "intention" during the various task phases. (Although the agents are designed to execute in real time, some of the task abstraction processing was, in fact, performed off-line for this implementation.)


FIGURE 6.9 DEMONSTRATION OF THE LOW-TOLERANCE PEG-IN-HOLE TASK. (a) start; (b) preshape & via; (c) grasp; (d) via point; (e) guarded move; (f) ungrasp; (g) preshape & via; (h) grasp; (i) via point; (j) guarded move; (k) ungrasp; (l) via point.


Figure 6.10 illustrates some of the data from the abstraction process during three consecutive demonstrations of this task. The top plot is the thresholded, binarized output of the extrinsic tactile sensor indicating the presence/absence of an object in the hand. The next plot shows the actual force of the grasp from the intrinsic tactile sensor. These agents aid the volume sweep rate agent in temporally segmenting the gross phases of the task. The next plot shows the vertical component of the force vector from the intrinsic tactile sensor. This plot, along with the bottom plot, which shows the vertical height of the hand, helps us visually pinpoint when the guarded moves are being demonstrated. The fourth plot from the top indicates the output of the zguard agent, which uses the method described in Chapter 5 to identify the presence of guarded moves.

Note the "blips" in the data around the time equal to 100 seconds. These resulted from extraneous motion to untwist some wires around the demonstrator's wrist.

FIGURE 6.10 IDENTIFICATION OF GUARDED MOVES IN THREE SEPARATE, CONSECUTIVE TASK DEMONSTRATIONS. (Plot labels include tactor, F of contact, F of grasp, zguard, and z of hand, versus time.)


They were appropriately rejected by the volume sweep rate agent and tactile agents as not being recognizable pregrasp and manipulation phases of the task.

The program that results is displayed in Figure 6.11 as a finite state machine (FSM) using Morrow's Skill Programming Interface (SPI) [Morrow, 1997]. Each bubble in the FSM represents a node that consists of several agents executing in parallel to achieve the desired action. (The full, automatically-generated configuration files can be found in Appendix B.)
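The structure of such a program can be sketched as a list of nodes, each naming the agents it runs in parallel; the node and agent names below follow Figure 6.11, but the data structure and runner are hypothetical, not the SPI's actual format.

    # Each node runs its agents in parallel until the node's termination
    # condition fires, then the FSM advances to the next node.
    program = [
        ("Preshape&Move1", ["preshape", "move_to_via"]),
        ("Grasp1",         ["grasp"]),
        ("ViaMove1",       ["move_to_via"]),
        ("GuardedMove1",   ["zguard"]),
        ("Ungrasp1",       ["ungrasp"]),
        # ... the second grasp-transport-insert sequence continues similarly
    ]

    def run(program):
        for node, agents in program:
            print("node %s: spawning agents %s in parallel" % (node, agents))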

The autonomous execution of the above program on a PUMA robot with a Utah/MIT hand is shown in Figure 6.12.

6.3.2 A Variation on the Peg-in-Hole Demonstration

This second task is a slight variation on Kang’s demo. Instead of grasp-transport-place-grasp-transport-insert, as in the previous demo, this demonstration is grasp-transport-drop-grasp-transport-insert.

As in the previous task, the hole is first grasped and transported, which corresponds to frames (a) through (d) in Figure 6.13.

FIGURE 6.11 SPI GRAPHICAL DISPLAY OF PROGRAM RESULTING FROM PEG-IN-HOLE DEMONSTRATION. (FSM nodes: start, initHand1, Preshape&Move1, Grasp1, ViaMove1, GuardedMove1, Ungrasp1, initHand2, Preshape&Move2, Grasp2, ViaMove2, GuardedMove2, Ungrasp2, ViaMove3, end; control signals: create, destroy, reset, halt.)


But instead of placing the hole on the table using a guarded move, the operator moves the hole above the table and drops it (frames (e) and (f)). The rest of the task is the same as the peg is grasped, transported, and inserted into the hole using a guarded move. The point of performing this second task is to show that the guarded move is not automatically inserted.

FIGURE 6.12 ROBOTIC EXECUTION OF THE LOW-TOLERANCE PEG-IN-HOLE TASK. (a) start; (b) preshape & via; (c) grasp; (d) via point; (e) guarded move; (f) ungrasp; (g) preshape & via; (h) grasp; (i) via point; (j) guarded move; (k) ungrasp; (l) via point.


The system analyzes the sensory information during the demonstration and recognizes the presence or absence of previously learned primitives using the method described in Chapter 5.

The same multi-agent network analyzes the demonstration and the resulting real-time program is displayed in Figure 6.14.

FIGURE 6.13 VARIATION ON DEMONSTRATION OF LOW-TOLERANCE PEG-IN-HOLE TASK. (a) start; (b) preshape & via; (c) grasp; (d) via point; (e) via point; (f) ungrasp & drop; (g) preshape & via; (h) grasp; (i) via point; (j) guarded move; (k) ungrasp; (l) via point.


Note the absence of the guarded move during manipulation of the hole. Autonomous execution of this program is shown in Figure 6.15.

A flip-book movie of this trial can be found in the lower margin of the even-numbered pages of this thesis. To view, look at the lower left corner of the back cover and flip the pages toward the front of the thesis.

FIGURE 6.14 SPI GRAPHICAL DISPLAY OF PROGRAM RESULTING FROM SECOND PEG-IN-HOLE DEMONSTRATION. (FSM nodes: start, initHand1, Preshape&Move1, Grasp1, ViaMove1, Ungrasp1, initHand2, Preshape&Move2, Grasp2, ViaMove2, GuardedMove1, Ungrasp2, ViaMove3, end; control signals: create, destroy, reset, halt.)


FIGURE 6.15 ROBOTIC EXECUTION OF SECOND LOW-TOLERANCE PEG-IN-HOLE TASK. (a) start; (b) preshape & via; (c) grasp; (d) via point; (e) via point; (f) ungrasp & drop; (g) preshape & via; (h) grasp; (i) via point; (j) guarded move; (k) ungrasp; (l) via point.


Chapter 7

Contributions

Thesis Summary and Conclusions

7.1 SUMMARY

This thesis integrated a variety of fundamental but disparate works into the overall goal of Gesture-Based Programming. Gesture-Based Programming is a form of programming by human demonstration that relies on a knowledge base of encapsulated sensorimotor primitives ("robotic skills") that aids the interpretation of the motions and contact conditions (which we call "gestures") used by the human during the demonstration of a task. The skill base is the key to successful robotic execution, but it is the abstraction of the human's actions -- the interpretation of the user's intentions with respect to the robot's own capabilities -- that is key to the successful programming of the robot.

Although a complete example of a GBP system was demonstrated, Gesture-Based Programming is far from solved as a result of this thesis. Instead, this research investigated new avenues for a number of supporting components required for the ultimate gesture-based programming system.

Specifically, a modification of the Tropism-System Cognitive Architecture (TSCA) [Agah and Bekey, 1996b] was presented that more easily supports the creation of agents from hierarchical collections of sub-agents for inherently qualitative tasks. Tropisms can loosely be thought of as encoding the likes and dislikes of a behavioral agent, making the definition of agents based on mentalistic notions more intuitive for flat networks of multiple peer agents.


A new modular, reconfigurable tactile sensor was developed with contact properties more closely matching those of the human fingertip. The sensor is based on an electrorheological gel rather than silicone or foam rubber. This gives the contact interface superior conformal and impact absorption properties, according to other researchers. The sensor is modular and reconfigurable in the sense that both intrinsic and extrinsic sensing modules exist and can be interchanged, permitting various combinations of the same sensors to be worn by a human, the Utah/MIT dextrous hand, or the Stanford/JPL hand. Finally, due to the active nature of electrorheological and magnetorheological fluids, an inside-out-symmetric version of the sensor results in a compact, wearable, multi-element tactile actuator. Planar prototypes of this actuator have been constructed but there remain some unsolved engineering issues with the miniaturization of the actuator to create a wearable version.

As a result of the number of sensors required to instrument all the fingers of a robotic hand and the human hand, a fundamentally new approach to multi-axis sensor calibration was developed that is nearly reference-less. Shape from Motion Calibration results in a complete calibration matrix without a priori knowledge of any of the applied loads. A small set of known applied loads is required to orient and scale the calibration matrix to the desired reference frame, but this is done after the "shape" of the calibration matrix has been fixed. Collaborative Calibration even allows two uncalibrated sensors to mutually calibrate each other "by fumbling."
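A rough NumPy sketch of the idea (not the actual procedure of Chapter 4): the principal subspace of unlabelled gauge readings fixes the "shape," and the handful of known loads solve only for a small alignment matrix that orients and scales it.

    import numpy as np

    def shape_from_motion_calibration(raw, known_raw, known_loads, n_axes=6):
        # "Shape": the n_axes-dimensional principal subspace of unlabelled data.
        centered = raw - raw.mean(axis=0)
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        shape = vt[:n_axes]                     # n_axes x n_gauges
        # Orient and scale: fit a small n_axes x n_axes matrix A so that
        # known_loads ~= (A.T @ shape) @ known_raw.T for the few known loads.
        projected = (shape @ known_raw.T).T     # n_known x n_axes
        A, *_ = np.linalg.lstsq(projected, known_loads, rcond=None)
        return A.T @ shape                      # full calibration matrix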

Shape from Motion Calibration led to a convenient paradigm for building the knowledge base of sensorimotor primitives called Shape from Motion Primordial Learning. It is "primordial" in the sense that it is given no a priori information about sensors and actuators, only that sensors and actuators exist. It is a non-parametric learning technique based on principal component analysis that is particularly useful for gesture-based programming. It is well-suited to GBP because it not only provides a mechanism for extracting sensorimotor primitives from human demonstration, but it simultaneously provides a method for recognizing those primitives in subsequent demonstrations and for "morphing" known primitives into new primitives. This results from the linear eigenspace representation that forms the basis of the decomposition.

7.2 CONCLUSIONS

The TSCA presents a convenient formalism for the construction of teams of robotic agents that are ruled by inherently qualitative behaviors or behaviors that are difficult to quantify (e.g. likes and dislikes). This architecture is not well suited, in our belief, to systems that can be controlled algorithmically, nor is it well suited to qualitative agents that have different "personalities."


The fine-grained tropism system cognitive architecture presented here better provides for "multiple personalities" by considering a single entity as a collection of independent agents with their own tropisms. We found the specification of agents searching for "triangle-ness" or width modification was easy using tropism sets encoded with fuzzy rules -- much easier than our unsuccessful attempts at algorithmic approaches using "snakes" [Kass, et al., 1987] or grids with a vision or navigation analogy.

We must acknowledge the difficulty in implementing peer agents with pure ignorance of other agents. For example, although the rectangle agent and the pick-and-place agent described in the trajectory experiment of Chapter 2 operate in complete isolation, their tropism specification was not completely isolated. The way the output of the proprioceptive agent confusion is used by rect and pick is designed to help discriminate ambiguous gestures by knowing something about how the other agent uses this information.

The sensor requires much deeper characterization to truly determine if the claims made by other researchers regarding improved grasping properties are valid. Qualitatively, our experience wearing the sensor during manipulation seems to support these prior conclusions. But verification could be a complete thesis in itself and requires the implementation of repeatable fine manipulation primitives for a dextrous hand to be complete. The sensor work presented in this thesis is critical to our philosophy, to the final demonstration, and to the introduction of our novel work on sensor calibration, but we do not consider it closed.

Shape from Motion Calibration is the strongest and most complete contribution of this thesis. Although it does not have the potential for broad impact equivalent to a paradigm shift to gesture-based programming or programming by human demonstration, it is a solid contribution within its niche. The potential savings for commercial sensor manufacturers is profound.

It is not without drawbacks, however. We do not recommend using Shape from Motion to extract sensor bias. Although we have demonstrated its effectiveness, if the offsets are small it is possible for them to be overwhelmed by other vectors in the eigenspace, producing erroneous results.

Shape from Motion Primordial Learning, closely related to calibration in the algorithmic sense, is promising. We discuss its limitations in Chapter 5, but generally focus on its strength: a compact representation for the acquisition, identification, and transformation of sensorimotor primitives. But, as with any learning method, it should not be relied on as the exclusive learning paradigm. In fact, our agent-based philosophy promotes the combination of different approaches for more robust behavior. It is acceptable to use our approach to identify a guarded move, for instance, but use a hand-coded execution routine on the real-time system.


7.3 CONTRIBUTIONS

Gesture-Based Programming Paradigm. The incorporation of gestures and robotic skills into a framework of programming by human demonstration is novel and significant in itself. It produces a system that appears, to an outside observer, to interpret the qualitative intentions of the user from sparse gestures. It opens opportunities for further advancements.

Shape from Motion Calibration. A fundamentally new way to approach multi-axis sensor calibration.

Shape from Motion Primordial Learning. Significant and unique in its ability to learn, identify, and "morph" sensorimotor primitives.

Fine-Grained Tropism-System Cognitive Architecture. Provides an intuitive vehicle for the creation of hierarchical multi-agent systems for inherently mentalistic tasks.

Wearable Fluid-Based Tactile Sensor/Actuator. Possesses a unique and desirable grasping interface, minimizes mapping difficulties from human to robot, and provides the only actuation mechanism of which we are aware with realizable potential for a wearable array.

7.4 FUTURE RESEARCH

Although I have demonstrated a primitive system for GBP, much work remains for a fully functional and useful system.

7.4.1 Fine-Grained Tropism-System Cognitive Architecture

The components of phylogenetic and ontogenetic learning from the tropism system architecture have not been incorporated into the fine-grained architecture. Currently, tropism sets are developed and coded by hand. Learning to recognize a particular user's gestures and learning to better generalize across users would improve performance and reduce latencies.

7.4.2 Shape from Motion Primordial Learning

This area is rich with opportunities for further research. The obvious include application to more nonlinear primitives through task linearization or task segmentation techniques. There is also the problem of creating "pure" and "complete" training sets.


This might be achievable automatically or, at least, the completeness and pureness may be measurable. Automatic abstract parametrization of learned primitives and demonstrations would prove useful. Finally, analytical predictions on the results of morphing primitives and methods to determine how to morph primitives for a desired result would be theoretically challenging but seem achievable. Of course, there's also the issue of incorporating other learning techniques since no single technique is optimal for all situations.

7.4.3 Gel-Based Tactile Sensor and Actuator

Further characterization is clearly needed for the sensor, with real applications to dextrous manipulation primitives. This should include better modeling of the gel as both a structural member and as a dielectric. There is a clear limitation to the resolution of the sensor as-designed due to the dominance of fringing fields. There is also the problem of extracting force information from a displacement mechanism with a highly complex transfer function.

One solution to both of these problems that we began to investigate is the use of micromachined silicon pressure sensors bonded to the plastic core. These miniature pressure sensors are based on the same idea of capacitive displacement sensing, but their size and construction eliminate the problem of fringing fields. Secondly, they do respond as a Hooke's Law device, so there is a well-defined displacement/pressure relationship. (Although the complex transmission of the pressures through the gel complicates things significantly.) They also have a well-defined rest state that they return to in the absence of an applied pressure. Finally, the miniature pressure sensors are designed to work with differential capacitors (note the different springs on the two plates in Figure 7.1) to improve sensitivity and minimize drift, noise, offset, and other non-signal effects.

FIGURE 7.1 MICROMACHINED DIFFERENTIAL CAPACITOR PRESSURE SENSOR (BEFORE ENCAPSULATION).
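As an idealized parallel-plate sketch (plate area A, permittivity epsilon, rest gap d_0, diaphragm displacement delta -- an illustrative model, not a characterization of the actual device), the differential pair yields a ratiometric output that is linear in displacement:

    C_1 = \frac{\varepsilon A}{d_0 - \delta}, \qquad
    C_2 = \frac{\varepsilon A}{d_0 + \delta}, \qquad
    \frac{C_1 - C_2}{C_1 + C_2} = \frac{\delta}{d_0}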


For convenience and signal integrity, local processing electronics can be bonded near, or perhaps integrated with, the pressure sensors for signal conditioning.

As a bonus, the differential configuration allows different capacitor configurations capable of detecting both compressive and shear pressures. In future work this may allow some level of slip detection.

The modeling and construction of a wearable tactile array actuator should be undertaken. For useful application to VR research, appropriate collaborative actuation schemes must be studied. This involves the use of micro-motion to simulate macro-motion, similar to techniques used in flight simulators.


Chapter 8

References

[1] Agah, A. and G.A. Bekey, 1996a, "Phylogenetic and Ontogenetic Learning in a Colony of Interacting Robots," Autonomous Robots, to appear.

[2] Agah, A. and G.A. Bekey, 1996b, "A Genetic Algorithm-Based Controller for Decentralized Multi-Agent Robotic Systems," in Proc. of the 1996 IEEE International Conf. on Evolutionary Computation, Nagoya, Japan, May, pp. 431-436.

[3] Ahn, D.S., H.S. Cho, K. Ide, F. Miyazaki, and S. Arimoto, 1991, "Strategy Generation and Skill Acquisition for Automated Robotic Assembly Task," Proceedings of the IEEE International Symposium on Intelligent Control, Arlington, VA, 13-15 August, pp. 128-133.

[4] Akella, P. and M. Cutkosky, 1989, "Manipulating with Soft Fingers: Modeling Contacts and Dynamics," in Proceedings of the 1989 IEEE International Conference on Robotics and Automation, v. 2, pp. 764-769.

[5] Akella, P., R. Siegwart, and M.R. Cutkosky, 1991, "Manipulation with Soft Fingers: Contact Force Control," in Proceedings of the 1991 IEEE International Conference on Robotics and Automation, v. 2, pp. 652-657.

[6] Bayo, E. and Stubbe, J.R., 1989, "Six-Axis Force Sensor Evaluation and a New Type of Optimal Frame Truss Design for Robotic Applications," Journal of Robotic Systems, v. 6, n. 2, pp. 191-208.

[7] Beaton, A.E., D.B. Rubin, and J.L. Barone, 1976, "The Acceptability of Regression Solutions: Another Look at Computational Accuracy," Journal of the American Statistical Association, v. 71, pp. 158-168.


[8] Berger, A.D. and P.K. Khosla, 1991, "Using Tactile Data for Real-Time Feedback," International Journal of Robotics Research, v. 10, n. 2, April, pp. 88-102.

[9] Berkelman, P. and R. Hollis, 1995, "Interacting with Virtual Environments Using a Magnetic Levitation Haptic Interface," Proceedings of the 1995 IEEE/RSJ Intelligent Robots and Systems Conference, Pittsburgh, PA, Aug.

[10] Bicchi, A. and P. Dario, 1988, "Intrinsic Tactile Sensing for Artificial Hands," Robotics Research: The 4th International Symposium, R.C. Bolles and B. Roth, editors, MIT Press, Cambridge, MA, pp. 83-90.

[11] Bicchi, A., J.K. Salisbury and D.L. Brock, 1993, "Contact Sensing from Force Measurements," International Journal of Robotics Research, v. 12, n. 3, June, pp. 249-262.

[12] Brooks, R.A., 1986, "A Robust Layered Control System for a Mobile Robot," IEEE Journal of Robotics and Automation, v. RA-2, n. 1, March, pp. 14-23.

[13] Brunner, B., K. Arbter, G. Hirzinger, and R. Koeppe, 1995, "Programming Robots Via Learning by Showing in a Virtual Environment," in Virtual Reality World 95, Stuttgart, Germany, February.

[14] Chatterjee, S. and A.S. Hadi, 1988, Sensitivity Analysis in Linear Regression, John Wiley & Sons, New York, NY.

[15] Cholewiak, R.W. and A.A. Collins, 1992, "Sensory and Physiological Bases of Touch," in M.A. Heller & W.R. Schiff (Eds.), The Psychology of Touch, Hillsdale, NJ: Lawrence Erlbaum Associates, pp. 23-60.

[16] Connell, J.H., 1989, "A Colony Architecture for an Artificial Creature," MIT AI Technical Report no. 1151.

[17] Cypher, A., D.C. Halbert, D. Kurlander, H. Lieberman, D. Maulsby, B.A. Myers and A. Turransky, eds., 1993, Watch What I Do: Programming by Demonstration, Cambridge, MA: MIT Press.

[18] Diamond, W., 1981, Practical Experiment Designs for Engineers and Scientists, Lifetime Learning Pub., Belmont, CA.

[19] Delson, N. and H. West, 1996, "Robot Programming by Human Demonstration: Adaptation and Inconsistency in Constrained Motion," in Proceedings of the IEEE International Conference on Robotics and Automation, v. 1, pp. 30-36.

[20] Ernst, H.A., 1961, MH-1, A Computer-Operated Mechanical Hand, Ph.D. Thesis, MIT.

[21] Fearing, R.S., 1987, Tactile Sensing, Perception and Shape Interpretation, Ph.D. Thesis, Dept. of Electrical Engineering, Stanford University, Stanford, CA, December.


[22] Forsythe, G.E., M. Malcolm, and C.B. Moler, 1977, Computer Methods for Mathematical Computations, Prentice Hall, Englewood Cliffs, NJ.

[23] Gertz, M.W., D.B. Stewart and P.K. Khosla, 1993, "A Software Architecture-Based Human-Machine Interface for Reconfigurable Sensor-Based Control Systems," in Proc. of the 8th IEEE Symp. on Intelligent Control, Chicago, IL, Aug.

[24] Gertz, M.W., 1994, An Iconic Programming Environment for Real-Time Control Systems, Ph.D. Thesis, Electrical and Computer Engineering, Carnegie Mellon University, Nov.

[25] Grupen, R.A., T.C. Henderson, and I.D. McCammon, 1989, "A Survey of General-Purpose Manipulation," International Journal of Robotics Research, v. 8, n. 1, February, pp. 38-62.

[26] Gullapalli, V., R. Grupen, and A. Barto, 1992, "Learning Reactive Admittance Control," in Proceedings of IEEE International Conference on Robotics and Automation, Nice, France, v. 2, pp. 1475-1480.

[27] Hancock, J. and C. Thorpe, 1995, "ELVIS: Eigenvectors for Land Vehicle Image System," in Proc. of 1995 IEEE/RSJ International Conf. on Intelligent Robots and Systems, Pittsburgh, PA, Aug., pp. 35-40.

[28] Hartley, J., 1983, Robots at Work, IFS Publications, Ltd., Kempston, Bedford, UK, chapters 5, 8.

[29] Hayward, V. and R.P. Paul, 1987, "Robot Manipulator Control under Unix: RCCL, a Robot Control 'C' Library," International Journal of Robotics Research, v. 5, n. 4, pp. 94-111.

[30] Hebb, D.O., 1949, The Organization of Behavior; a Neuropsychological Theory, New York, Wiley.

[31] Hirzinger, G., B. Brunner, J. Dietrich, and J. Heindl, 1993, "Sensor-Based Robotics - ROTEX and its Telerobotic Features," IEEE Transactions on Robotics and Automation, v. 9, n. 5, pp. 649-663.

[32] Hoffman, D.D. and W.A. Richards, 1984, "Parts of Recognition," Cognition, v. 18, pp. 65-96.

[33] Howe, R.D., 1994, "Tactile Sensing and Control of Robotic Manipulation," in Journal of Advanced Robotics, v. 8, n. 3, pp. 245-261.

[34] IBM, 1983, A Manufacturing Language Concepts and User's Guide, IBM Corporation, 2nd Edition.

[35] Ikeuchi, K. and T. Suehiro, 1994, "Towards Assembly Plan from Observation, Part I: Task Recognition with Polyhedral Objects," IEEE Transactions on Robotics and Automation, v. 10, n. 3, pp. 368-385.


[36] Kaiser, M. and R. Dillman, 1996, "Building Elementary Robot Skills from Human Demonstration," in Proceedings of the IEEE International Conference on Robotics and Automation, v. 3, pp. 2700-2705.

[37] Kang, S.B., 1994, Robot Instruction by Human Demonstration, Ph.D. Thesis, The Robotics Institute, Carnegie Mellon University, CMU-RI-TR-94-33, November.

[38] Kang, S.B. and K. Ikeuchi, 1994, "Robot Task Programming by Human Demonstration: Mapping Human Grasps to Manipulator Grasps," in Proc. of IEEE/RSJ/GI International Conference on Intelligent Robots and Systems, Munchen, Germany, pp. 97-104.

[39] Kang, S.B. and K. Ikeuchi, 1993, "Toward Automatic Robot Instruction from Perception -- Recognizing a Grasp from Observation," in IEEE Transactions on Robotics and Automation, v. 9, n. 4, August, pp. 432-443.

[40] Kang, S.B. and K. Ikeuchi, 1996, "Toward Automatic Robot Instruction from Perception -- Temporal Segmentation of Tasks from Human Hand Motion," in IEEE Transactions on Robotics and Automation, v. 11, n. 5, October, pp. 670-681.

[41] Kang, S.B., J. Webb, C.L. Zitnick, and T. Kanade, 1994, "An Active Multibaseline Stereo System with Real-Time Image Acquisition," in Proc. of Image Understanding Workshop, Monterey, CA, Nov.

[42] Kass, M., A. Witkin, and D. Terzopolous, 1987, "Snakes: Active Contour Models," International Journal of Computer Vision, v. 1, n. 4, pp. 321-331.

[43] Kenaley, G.L. and M.R. Cutkosky, 1989, "Electrorheological Fluid-Based Robotic Fingers with Tactile Sensing," in Proceedings of the IEEE International Conference on Robotics and Automation, v. 1, May, pp. 132-136.

[44] Klema, V.C. and A.J. Laub, 1980, "The Singular Value Decomposition: Its Computation and Some Applications," IEEE Trans. on Automatic Control, v. 25, n. 2, pp. 164-176.

[45] Kontarinis, D.A., J.S. Son, W. Peine, and R.D. Howe, 1995, "A Tactile Shape Sensing and Display System for Teleoperated Manipulation," in Proceedings of the 1995 IEEE International Conference on Robotics and Automation, v. 1, May, pp. 641-646.

[46] Kontarinis, D.A., 1995, Tactile Displays for Dextrous Telemanipulation, Ph.D. Thesis, Division of Applied Sciences, Harvard University, May.

[47] Kuniyoshi, Y., M. Inaba, and H. Inoue, 1994, "Learning by Watching: Extracting Reusable Task Knowledge from Visual Observation of Human Performance," IEEE Transactions on Robotics and Automation, v. 10, n. 6, Dec., pp. 799-822.


[48] Kuniyoshi, Y., 1996, "Behavior Matching by Observation for Multi-Robot Cooperation," pp. 343-352.

[49] Lee, S. and M.H. Kim, 1988, "Learning Expert Systems for Robot Fine Motion Control," Proceedings of the IEEE International Symposium on Intelligent Control, Arlington, VA, pp. 534-544.

[50] Little, R., 1992, "Force/Torque Sensing in Robotic Manufacturing," Sensors, v. 9, n. 11.

[51] Liu, S. and H. Asada, 1992, "Transferring Manipulative Skills to Robots: Representation and Acquisition of Tool Manipulative Skills Using a Process Dynamics Model," ASME Journal of Dynamic Systems, Measurement, and Control, Vol. 114, No. 2, June, pp. 200-228.

[52] Lord Corporation, 1986, Installation and Operations Manual for F/T Series Force/Torque Sensing Systems, Lord Corporation, Cary, NC, Appendix A.

[53] Maes, P. and R. Brooks, 1990, "Learning to Coordinate Behaviors," AAAI-90, Boston, MA, pp. 796-802.

[54] Mataric, M., 1997, "Behavior-Based Control: Examples from Navigation, Learning, and Group Behavior," Journal of Experimental and Theoretical Artificial Intelligence, v. 9, n. 2-3, pp. 323-336.

[55] Michelman, P. and P. Allen, 1994, "Forming Complex Dextrous Manipulations from Task Primitives," Proceedings of IEEE International Conference on Robotics and Automation, San Diego, CA, pp. 3383-3388.

[56] Monkman, G.J., 1992, "An Electrorheological Tactile Display," Presence, v. 1, n. 2, Spring, pp. 219-228.

[57] Morrow, J.D. and P.K. Khosla, 1995, "Sensorimotor Primitives for Robotic Assembly Skills," in Proceedings of the 1995 IEEE International Conference on Robotics and Automation, v. 2, May, pp. 1894-1899.

[58] Morrow, J.D., 1997, Sensorimotor Primitives for Programming Robotic Assembly Skills, Ph.D. Thesis, Robotics Institute, Carnegie Mellon University, May.

[59] Nakamura, Y., T. Yoshikawa, and I. Futamata, 1988, "Design and Signal Processing of Six-Axis Force Sensors," Robotics Research: The 4th International Symposium, R.C. Bolles and B. Roth, editors, MIT Press, Cambridge, MA, pp. 75-81.

[60] Nicholls, H.R. and M.H. Lee, 1989, "A Survey of Robot Tactile Sensing Technology," International Journal of Robotics Research, v. 8, n. 3, June, pp. 3-30.

[61] Nicolson, E.J., 1994, Tactile Sensing and Control of a Planar Manipulator, Ph.D. Thesis, Electrical Engineering and Computer Science, University of California at Berkeley.


[62] Nilsson, N., 1984, "Shakey the Robot," SRI AI Center tech. note 323, April.

[63] Paetsch, W. and G. von Wichert, 1993, "Solving Insertion Tasks with a Multifingered Gripper by Fumbling," Proceedings of IEEE International Conference on Robotics and Automation, Atlanta, GA, pp. 173-179.

[64] Patrick, J., 1992, Training: Research and Practice, Academic Press, San Diego, CA.

[65] Pearson, R.G., 1961, Judgment of Volume from Two-Dimensional Representations of Complex, Irregular Shapes, Ph.D. Thesis, Department of Psychology, Carnegie Mellon University, Oct.

[66] Pierce, D., 1991, "Learning a Set of Primitive Actions with an Uninterpreted Sensorimotor Apparatus," 8th International Workshop on Machine Learning, Evanston, IL, pp. 338-342.

[67] Pomerleau, D., 1992, Neural Network Perception for Mobile Robot Guidance, Ph.D. Thesis, Carnegie Mellon University, Pittsburgh, PA.

[68] Pook, P.K. and D.H. Ballard, 1993, "Recognizing Teleoperated Manipulations," Proceedings of the IEEE International Conference on Robotics and Automation, v. 2, May, pp. 578-585.

[69] Rehg, J. and T. Kanade, 1995, "Model-based Tracking of Self-occluding Articulated Objects," Proceedings of Fifth Intl. Conference on Computer Vision, Boston, MA, June, pp. 612-617.

[70] Rubin, J.M. and W.A. Richards, 1985, "Boundaries of Visual Motion," MIT AI Lab technical report 835, April.

[71] Salisbury, J.K., 1984, "Interpretation of Contact Geometries from Force Measurements," Robotics Research 1, Faugeras, O.D. and G. Giralt, eds., MIT Press, pp. ???.

[72] Salisbury, K., D. Brock and S. Chiu, 1986, "Integrated Language, Sensing, and Control for a Robot Hand," in Robotics Research 3, Faugeras, O.D. and G. Giralt, eds., MIT Press, pp. 389-396.

[73] Salisbury, K., 1993, "Mechanisms for Remote and Virtual Grasping: Hands and the Haptic Interface," International Symposium on Robotics Research, Hidden Valley, PA, Oct.

[74] Sano, A., 1995, "Haptic Interface Device Using Electro-rheological Fluid," in 1995 ICRA University Laboratory Tour, Nagoya Institute of Technology, May.

[75] Shimano, B. and Roth, B., 1977, "On Force Sensing Information and its Use in Controlling Manipulators," Proceedings of the IFAC International Symposium on Information-Control Problems in Manufacturing Technology, pp. 119-126.

[76] Shimoga, K.B. and A.A. Goldenberg, 1992, "Soft Materials for Robotic Fingers," in Proceedings of the IEEE International Conference on Robotics and Automation, Nice, France, v. 2, pp. 1300-1305.

[77] Shimoga, K.B., 1993, "A Survey of Perceptual Feedback Issues in Dextrous Telemanipulation: Part II. Finger Touch Feedback," in Proceedings of IEEE Virtual Reality Annual International Symposium, Seattle, WA, Sept.

[78] Shimoga, K.B., A.M. Murray, and P.K. Khosla, 1995, "A Touch Reflection System for Interaction with Remote and Virtual Environments," in Proceedings of the 1995 IEEE/RSJ International Conference on Intelligent Robots and Systems, Pittsburgh, PA, vol. 1, pp. 123-128.

[79] Shimokura, K. and S. Liu, 1994, "Programming Deburring Robots Based on Human Demonstration with Direct Burr Size Measurement," Proceedings of IEEE International Conference on Robotics and Automation, San Diego, CA, pp. 572-577.

[80] Shoham, Y., 1993, "Agent Oriented Programming," Artificial Intelligence, v. 60, n. 1, pp. 51-92.

[81] Siegel, D.M., 1986, Contact Sensors for Dextrous Robotic Hands, Tech. Report AI-TR-900, MIT AI Lab, Cambridge, MA, June.

[82] Simons, J., H. Van Brussel, J. De Schutter, and J. Verhaert, 1982, "A Self-Learning Automaton with Variable Resolution for High Precision Assembly by Industrial Robots," IEEE Transactions on Automatic Control, Vol. AC-27, No. 5, October.

[83] Son, J., 1996, Integration of Tactile Sensing and Robot Hand Control, Ph.D. Thesis, Division of Applied Sciences, Harvard University, May.

[84] Speeter, T.H., 1991, "Primitive-Based Control of the Utah/MIT Dextrous Hand," Proceedings of the IEEE International Conference on Robotics and Automation, Sacramento, CA, April, pp. 866-877.

[85] Stewart, D.B., D.E. Schmitz and P.K. Khosla, 1992, "The Chimera II Real-Time Operating System for Advanced Sensor-Based Robotic Applications," IEEE Transactions on Systems, Man, and Cybernetics, vol. 22, no. 6, pp. 1282-1295, November/December.

[86] Stewart, D.B. and P.K. Khosla, 1996, "The Chimera Methodology: Designing Dynamically Reconfigurable and Reusable Real-Time Software using Port-Based Objects," International Journal of Software Engineering and Knowledge Engineering, v. 6, n. 2, pp. 249-277, June.

[87] Strang, G., 1988, Linear Algebra and Its Applications, third edition, Harcourt Brace Jovanovich Publishers, San Diego, CA, Appendix A.

[88] Todd, D.J., 1986, Fundamentals of Robot Technology, John Wiley and Sons, chapters 3, 7.

[89] Tomasi, C. and T. Kanade, 1991, "Shape and Motion from Image Streams: a Factorization Method," Tech. Report CMU-CS-91-172, Computer Science, Carnegie Mellon University, Pittsburgh, PA.

[90] Uchiyama, M., E. Bayo, and E. Palma-Villalon, 1991, "A Systematic Design Procedure to Minimize a Performance Index for Robot Force Sensors," Journal of Dynamic Systems, Measurement, and Control, v. 113, n. 3, pp. 388-394.

[91] Unimation, 1984, User's Guide to VAL II, Unimation, Inc., version 1.1, August.

[92] Vaaler, E.G. and W.P. Seering, 1991, "A Machine Learning Algorithm for Automated Assembly," Proceedings of IEEE International Conference on Robotics and Automation, v. 3, pp. 2231-2237.

[93] Vallbo, A.B. and R.S. Johansson, 1984, "Properties of Cutaneous Mechanoreceptors in the Human Hand Related to Touch Sensation," Human Neurobiology, v. 3, n. 1, pp. 3-14.

[94] Van Brussel, H., H. Belien, and H. Thielemans, 1986, "Force Sensing for Advanced Robot Control," Robotics, v. 2, n. 2, pp. 139-148.

[95] Voyles, R.M. and P.K. Khosla, 1994, "Multi-Agent Perception for Human/Robot Interaction: A Framework for Intuitive Trajectory Modification," Tech. report CMU-RI-TR-94-33, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, Sept.

[96] Voyles, R.M. and P.K. Khosla, 1995a, "Tactile Gestures for Human/Robot Interaction," Proc. of IEEE/RSJ Conf. on Intelligent Robots and Systems, Pittsburgh, PA, v. 3, pp. 7-13.

[97] Voyles, R.M. and P.K. Khosla, 1995b, "Multi-Agent Gesture Interpretation for Robotic Cable Harnessing," Proceedings of IEEE Conf. on Systems, Man, and Cybernetics, Vancouver, B.C., pp. 1113-1118.

[98] Voyles, R.M., J.D. Morrow, and P.K. Khosla, 1995, "Shape from Motion Decomposition as a Learning Approach for Autonomous Agents," Proceedings of the IEEE International Conf. on Systems, Man, and Cybernetics, Vancouver, BC.

[99] Voyles, R.M., J.D. Morrow, and P.K. Khosla, 1997, "The Shape from Motion Approach to Rapid and Precise Force/Torque Sensor Calibration," Journal of Dynamic Systems, Measurement, and Control, v. 119, n. 2, pp. 229-235, June.

[100] Voyles, R.M., B.L. Stavnheim, and B. Yap, 1989, "Practical Electrorheological Fluid-Based Fingertips for Robotic Applications," Proceedings of IASTED International Symposium on Robotics and Manufacturing, Santa Barbara, CA, Nov., pp. 280-283.

[101] Walter, W.G., 1953, The Living Brain, W.W. Norton & Company, Inc., New York.

[102] Watson, P.C. and S.H. Drake, 1975, "Pedestal and Wrist Force Sensors for Automatic Assembly," Proceedings of the 5th International Symposium on Industrial Robots, pp. 501-511.

[103] Wooldridge, M. and N.R. Jennings, 1995, "Intelligent Agents: Theory and Practice," Knowledge Engineering Review, v. 10, n. 2, pp. 115-152, June.

[104] Yang, J., Y. Xu, and C.S. Chen, 1994, "Hidden Markov Model Approach to Skill Learning and Its Application to Telerobotics," IEEE Transactions on Robotics and Automation, Vol. 10, No. 5, October, pp. 621-631.

Appendix A

Shape from Motion Code

Mathematica, Matlab, and C Implementations

A.1 BASIC SHAPE FROM MOTION CALIBRATION MATHEMATICA CODE

A.1.1 File: calibrate.math

This is the "user interface" to the shape from motion code. It reads the ASCII data files (names defined within) and calls all the appropriate routines.

(* calibrate.math *)
(* This is the top level mathematica file for Shape from Motion
   Calibration of the Lord F/T sensor. *)

(* <<Graphics`Animation` *)

(* specify dimension of motion vector *)
w=3
(* gages = the number of sensing elements the Lord sensor has *)
gages=8

<<motion.basic
<<newrot.math

(* load or generate data file with raw strain gauge data *)
datafile="block2.asc"
(* read in data *)
motiondata=ReadList[datafile,Table[Number,{gages}]];

(* perform the calculation *)
<<motion.math

(* combine results using squashing matrices *)
s1=rotShape[shape,fy,uyr2];
s2=rotShape[s1,-fx,uxr2];
shape2 = s2;
nc2=-Transpose[PseudoInverse[s2]];
Print[nc2.uxr2]
Print[-fx]
Print[nc2.uyr2]
Print[fy]
Print[nc2.uzr2]
Print[fz]
Print["**********************"]

datafile="block3.asc"
motiondata=ReadList[datafile,Table[Number,{gages}]]; (* read in data *)
<<motion.math
s2=rotShape[s1,-fz,uzr3];
s2={-s2[[1]],s2[[2]],s2[[3]]};
shape3 = s2;
nc3=-Transpose[PseudoInverse[s2]];
Print[nc3.uxr3]
Print[fx]
Print[nc3.uyr3]
Print[fy]
Print[nc3.uzr3]
Print[-fz]
Print["**********************"]

datafile="block4.asc"
motiondata=ReadList[datafile,Table[Number,{gages}]]; (* read in data *)
<<motion.math
s1=rotShape[shape,-fy,uyr4];
s2=rotShape[s1,-fz,uzr4];
s2={-s2[[1]],s2[[2]],s2[[3]]};
shape4 = s2;
nc4=-Transpose[PseudoInverse[s2]];
Print[nc4.uxr4]
Print[fx]
Print[nc4.uyr4]
Print[-fy]
Print[nc4.uzr4]
Print[-fz]
Print["**********************"]

(* combine the shape matrices *)
shapeStack = stackMat[shape2,shape3,shape4];
rotStack = stackMat[torqMat[rt2],torqMat[rt3],torqMat[rt4]];
shape = PseudoInverse[rotStack].shapeStack
calib = -Transpose[PseudoInverse[shape]];

A.1.2 File: motion.basic

This is a file of utilities for both simulation and real data.

(* motion.basic *)

mf[x_] := MatrixForm[x]

(* mls_row[] generates a Least Squares motion matrix row from a rawmotion row *)

mlsrow[m_] := Module[{v={},w=Length[m]},Do[ Do[ If[p==r,v=Append[v,m[[p]]^2],v=Append[v,2*m[[p]]*m[[r]]]], {r,p,w}],{p,1,w}];v]

(* to see symbolic form of LS motion matrix row *)(* mlsrow[Array[m,w]] *)

(* this function generates the symbolic groups corresponding to theleast squares solutions. a[[i,j]] is the i,j element of affinetransformation matrix. Note that there will be w(w-1)/2 freeelements of this matrix and they must be chosen such that thematrix is not made singular *)

agroups[w_] := Module[{g={}},(Do[ Do[ g = Append[g,Sum[a[p,j]*a[r,j],{j,1,w}]], {r,p,w}],{p,1,w}]; g)]

(* vector of a[i,j] values to set equal to zero *)freeas[w_] := Module[{g={}},(Do[ (* Do a bogus assignment to avoid a warning when we Unset. *) Do[ a[p,r]=0; a[p,r]=.; AppendTo[g,a[p,r]], {r,1,p-1}],{p,2,w}]; g)]

(* vector of a[i,j] values to solve for *)

solvas[w_] := Module[{g={}},(Do[ (* Do a bogus assignment to avoid a warning when we Unset. *) Do[ a[p,r]=1; a[p,r]=.; AppendTo[g,a[p,r]], {r,p,w}],{p,1,w}]; g)]

(* biNoise[] produces an n-vector of uniform noise in the interval -1to 1. *)

biNoise[n_] := Module[{outv={}},(Do[AppendTo[outv,Random[Real, {-1,1}]], {i,1,n}];outv)];

(* This function generates approximately “nm” vectors of force sensordata and the forces that produced them. The number of vectorsproduced will be >= nm. It returns an array of 2 elements. Thefirst element is a matrix of the raw data while the second elementis a matrix of the corresponding forces. *)

genData[f_,cal_,nm_] := Module[{dims=Dimensions[cal], data={},shp={}, num={} },

(shp = N[Transpose[PseudoInverse[cal]]]; num = nm; If[dims[[1]] == 2, (Do[AppendTo[data, N[f*{Cos[t],Sin[t]}]], {t, 0.0, 2.0*Pi, 2.0*Pi/num}])]; If[dims[[1]] == 3, (num = Sqrt[num]; Do[ Do[AppendTo[data, N[f*{Cos[t]Sin[t2],

Sin[t]Sin[t2],Cos[t2]}]], {t2, Pi/2.0/num, Pi, Pi/num}], {t, 0.0, 2.0*Pi, 2.0*Pi/num}])]; Append[ {data.shp},data])];

(* This is a dummy call so previous versions of batch files don’tbreak. *)

simData[f_,cal_,offs_,noiz_] := Module[{data = {}}, (data = simDataN[f,cal,offs,noiz,200];data[[1]])];

(* This function generates simulated data given a constant forcemagnitude, the calibration matrix of the sensor, an offsetvector and a noise component. It generates a variable number ofvectors determined by “num”. The noise component is fraction offull scale for each individual sense element. If the offset is ascalar value, it is also a fraction of full scale for eachindividual sense element. If it is a vector, it’s length must

equal the number of sense elements and it will be applied directlyto the sensor values. It does not, in the vector case, apply asa fraction of full scale. *)

simDataN[f_,cal_,offs_,noiz_,num_] := Module[{dims=Dimensions[cal],data={}, forc={}, offm={}, maxs={}, noizv={}, noizpeak={},noizm={}},

(data = genData[f,cal,num]; forc = data[[2]]; data = data[[1]];

(* This calculation gives noise as a percent of maximum totalsignal magnitude.

maxs = (Transpose[data][[1]])^2; Do[maxs = maxs + (Transpose[data][[i]])^2, {i,2,dims[[2]]}]; maxs = Sqrt[Max[maxs]]; noizv = noiz * maxs; Do[AppendTo[noizm,noizv*biNoise[dims[[2]]]], {i,1,Dimensions[data][[1]]}]; If[!VectorQ[offs], offm = Table[offs*maxs, {dims[[2]]}], offm=offs];

*)

(* This calculation finds the max value for each sense element. *) Do[AppendTo[maxs, Max[Abs[Transpose[data][[i]]]]], {i,1,dims[[2]]}];

If[!VectorQ[noiz], noizpeak = noiz*maxs, noizpeak=noiz]; Do[(noizv = {}; Do[AppendTo[noizv, noizpeak[[i]] * Random[Real, {-1,1}]], {i,1,dims[[2]]}]; AppendTo[noizm,noizv]), {j,1,Dimensions[data][[1]]}];

If[!VectorQ[offs], offm = offs*maxs, offm=offs]; Print[“offsets = “,offm]; Print[“noizpeak = “,noizpeak]; Print[“noise mean = “, Mean[Transpose[noizm]]]; If[dims[[2]] != Dimensions[offm][[1]], Print[“Offset vector length

mismatch”]]; Append[{data + Table[offm,{Dimensions[data][[1]]}] + noizm},

forc])];

(* This function generates simulated least squares data given aconstant force magnitude, the calibration matrix of the sensor, anoffset vector and noise components for both the sense elements andthe force vector. *)

(* The raw sensor noise component, “noiz,” is fraction of fullscale for each individual sense element. If the offset is a scalar

value, it is also a fraction of full scale for each individualsense element. If it is a vector, it’s length must equal thenumber of sense elements and it will be applied directly to thesensor values. It does not, in the vector case, apply as afraction of full scale. *)

simLSData[f_,cal_,offs_,noiz_,noizf_,num_] :=Module[{dims=Dimensions[cal], data={}, force={}, noizv={},noizpeak={}, noizm={}},

(data = simDataN[f,cal,offs,noiz,num]; force = data[[2]]; data = data[[1]];

If[!VectorQ[noiz], noizpeak = noizf*Table[1,{i,dims[[1]]}],noizpeak=noizf];

Do[(noizv = {}; Do[AppendTo[noizv, noizpeak[[i]] * Random[Real, {-1,1}]], {i,1,dims[[1]]}]; AppendTo[noizm,noizv]), {j,1,Dimensions[force][[1]]}]; Append[{data},force+noizm])];

(* This function converts raw sensor data into force values, giventhe calibration matrix and offset vector of the sensor. *)

evalData[data_,cal_,offs_] := Module[{dims=Dimensions[data],calt={},offm={}},

(calt = N[Transpose[cal]]; If[!VectorQ[offs], offm = Table[offs, {dims[[2]]}], offm=offs]; (data-Table[offm, {dims[[1]]}]).calt)];

(* This function finds the mean of a vector or the rows of amatrix. *)

Mean[data_] := Module[{dims=Dimensions[data], sumv=0, sumvect={}},(If[ Length[data] == 0, (Print[“Vector is NULL”]; Return[])]; If[ VectorQ[data], (Do[sumv = sumv+data[[i]], {i,1,Length[data]}]; Return[sumv/Length[data]];), Do[(Do[sumv = sumv+data[[j,i]], {i,1,dims[[2]]}]; AppendTo[sumvect,sumv]; sumv = 0;), {j,1,dims[[1]]}]]; sumvect / dims[[2]])];

(* This function finds the standard deviation of a vector. *)StandardDev[data_] := Module[{dims=Dimensions[data], sumv=0,

meanv={}},(If[ Length[data] == 0, (Print[“Vector is NULL”]; Return[])]; If[ !VectorQ[data], (Print[“StandardDev only works with vectors”];

Return[])]; meanv = Mean[data];

Do[sumv = sumv+(data[[i]] - meanv)^2, {i,1,Length[data]}]; Sqrt[sumv / Length[data]])];

simcal = {{1,0,0,1,0,0},{0,1,0,0,1,0},{0,0,1,0,0,1}};

SideBySideMat[A_,B_] := Module[ {dims=Dimensions[B][[2]], C, D}, (C = Transpose[A];D = Transpose[B];Print[dims];Do[ AppendTo[C,D[[i]]], {i,1,dims}];Transpose[C])];

A.1.3 File: newrot.math

This file contains some utility functions for rotating and combining multiple "squashed" trials. It also contains experiment-specific constants such as the sensor offsets, masses used during the trials, and vectors for the known applied loads used for orientation and scaling.

(* newrot.math *)(* code to rotate the shape matrix given a “magic” point *)

mf[x_]:=MatrixForm[x]

(* RMV -- negated this *)xprodmat[r_]:={{0,-r[[3]],r[[2]]},{r[[3]],0,-r[[1]]},{-

r[[2]],r[[1]],0}}

(* returns a 3x3 rotation matrix for rotation about k axis. *)rotKmat[t_,k_]:= Module[{kx=k[[1]],ky=k[[2]],kz=k[[3]],

vt=N[1.-Cos[t]],st=Sin[t],ct=Cos[t]},{{kx^2 vt + ct, kx ky vt - kz st, kx kz vt + ky st}, {kx ky vt + kz st, ky^2 vt + ct, ky kz vt - kx st}, {kx kz vt - ky st, ky kz vt + kx st, kz^2 vt + ct}}];

mag[x_] := Sqrt[x.x];

(* This function rotates the shape matrix so that u produces a vector that corresponds to f. f can have arbitrary magnitude.*)(* 09/18/95 RMV fixed sign inversions due to ArcSin. *)rotShape[shape_,f_,u_]:=Module[{s,v,t,fa,par},s=PseudoInverse[shape];s=Transpose[s];fa=s.u;If[Dimensions[f][[1]] == 2,

(t=ArcTan[f[[1]],f[[2]]]; t=t-ArcTan[fa[[1]],fa[[2]]]; s = {{Cos[t], -Sin[t]},{Sin[t], Cos[t]}}.s; par = (f.fa)/(Sqrt[f.f]*Sqrt[fa.fa]); If[Abs[par] < 0.95, Print[“ERROR: Parallelism = “,par, “ fa = “,fa, “ f = “, f]; ]; Return[PseudoInverse[Transpose[s]]])];v={{0,-f[[3]],f[[2]]},{f[[3]],0,-f[[1]]},{-f[[2]],f[[1]],0}}.fa;t=Sqrt[v.v];v=v/t;t=Re[ArcSin[t/(Sqrt[f.f]*Sqrt[fa.fa])]];r=rotKmat[t,v];s=r.s;fa=s.u;par = (f.fa)/(Sqrt[f.f]*Sqrt[fa.fa]);If[Abs[par] < 0.95, Print[“ERROR: Parallelism = “,par, “ fa = “,fa, “ f = “, f];];If[par < 0, s=rotKmat[Pi,v].s];PseudoInverse[Transpose[s]]];

findCog[ra_,rb_,rc_]:=Module[{x,y,z},x = (ra[[1]]*Ma + rb[[1]]*Mb + rc[[1]]*Mc)/Mtot;y = (ra[[2]]*Ma + rb[[2]]*Mb + rc[[2]]*Mc)/Mtot;z = (ra[[3]]*Ma + rb[[3]]*Mb + rc[[3]]*Mc)/Mtot;{x,y,z}];

torqMat[r_]:={{1,0,0,0,-r[[3]],r[[2]]},{0,1,0,r[[3]],0,-r[[1]]},{0,0,1,-r[[2]],r[[1]],0}}

stackMat[m1_,m2_,m3_]:= {m1[[1]],m1[[2]],m1[[3]],m2[[1]],m2[[2]],m2[[3]],m3[[1]],m3[[2]],m3[[3]]};

fy=N[{0,1,0}];fx=N[{1,0,0}];fz=N[{0,0,1}];

(* offset values for the F/T sensor with calibration fixtureremoved *)

(* First set is sensor pointing down, second set is sensor pointingup *)

offset2 = 0.5*({-25,-2,6,-16,47,-1,51,-44}+{-54,-2,-24,-15,18,-1,22,-44});

offset3 = 0.5*({-41,-5,18,-16,38,-6,63,-43}+{-71,-5,-12,-14,6,-5,33,-42});

offset4 = 0.5*({-36,-3,18,-17,39,-1,58,-45}+{-66,-4,-11,-16,8,-2,28,-43});

(* Masses of the objects in kg *)Ma = 1.1015;Mb = 1.1024;Mc = 1.23285;Mtot = Ma + Mb + Mc;

(* Measured values with fixture in known orientation *)(* Run 2 aligned gravity with -X, +Y, +Z *)uxr2={-68,-684,-726,5,32,673,768,-66}-offset2;uyr2={-784,-866,-4,-216,739,-878,59,-1583}-offset2;uzr2={160,-6,-676,1,247,-4,1166,-69}-offset2;(* position of CoG relative to origin of F/T sensor *)ra2={0.1647/Sqrt[2], -0.1647/Sqrt[2], 0.05989};rb2={0.1647/Sqrt[2], 0.1647/Sqrt[2], 0.05989};rc2={0, 0, 0.05839};rt2=findCog[ra2,rb2,rc2];

(* Run 3 aligned gravity with +X, +Y, -Z *)uxr3={-40,675,736,-42,6,-686,-673,-25} - offset3;uyr3={-760,873,88,1553,757,855,38,137} - offset3;uzr3={-286,-4,-1128,7,-193,-7,723,-59} - offset3;(* position of CoG relative to origin of F/T sensor *)ra3={-0.1647/Sqrt[2], 0.1647/Sqrt[2], 0.05989};rb3={-0.1647/Sqrt[2], -0.1647/Sqrt[2], 0.05989};rc3={0, 0, 0.05839};rt3=findCog[ra3,rb3,rc3];

(* Run 4 aligned gravity with +X, -Y, -Z *)uxr4=N[{-46,243,1354,-495,0,-1114,-1294,-443} - offset4];uyr4=N[{1279,-445,-52,-1141,-1353,-436,59,219} - offset4];uzr4=N[{170,-3,-671,-4,-655,-5,275,-43} - offset4];(* position of CoG relative to origin of F/T sensor *)ra4={0, 0, 0.22305};rb4={-0.1647/Sqrt[2], -0.1647/Sqrt[2], 0.05989};rc4={0, 0, 0.05839};rt4=findCog[ra4,rb4,rc4];

(* os={-62,-7,15,-9,4.5,-23,61.5,-34.5} *)

A.1.4 File: motion.math

This is the main code for performing shape from motion calibration.

(* motion.math *)

(* 3-24-94 JDM & RMV
   This file contains code to apply shape from motion
   techniques to force sensor calibration *)

(* Flag *)
MotionOffsets = 0;

(* set up the least squares problem *)
(* Mathematica defines the SVD as: svd = uT diag[w] v *)
(* Our formulation (Strang's) is: svd = u diag[w] vT *)
(* All comments referring to rows and columns are in our
   formulation *)
svd=SingularValues[N[motiondata]];
uT=svd[[1]];
(* Keep only the first w singular values *)
(* w is the rank of the motion matrix and is defined in
   calibrate.math *)
s=DiagonalMatrix[Sqrt[Take[svd[[2]],w]]];
vT=svd[[3]];

(* take first w columns of u *)
u = Transpose[Take[uT,w]];
(* take first w rows of v *)
vT = Take[vT,w];

(* define motion and shape matrices from svd results *)
(* These motion and shape matrices are off by an affine
   transformation *)
motionHat = u . s ;
shapeHat = s . vT;

(* create the LS motion matrix *)
mls = Map[mlsrow,motionHat];

(* set up the right side of LS equation *)
(* construct a vector of 1's with same number of rows as in
   motiondata *)
f = Table[1,{Dimensions[motiondata][[1]]}];

(* solve LS for group values: alpha *)
(* f = mls . alpha *)
alpha = PseudoInverse[mls] . f;

(* define symbolic groups vector *)
agps = agroups[w];

(* set free parameters in a's *)
fas = freeas[w];

Do[Evaluate[fas[[i]]]=0,{i,Length[fas]}];

(* make list of a's to solve for *)
asol=solvas[w];

(* construct list of nonlinear equations from agps and alpha *)
eqns=Module[{g={}},
  Do[g=Append[g,agps[[i]]==alpha[[i]]],{i,Length[agps]}];
  g];

(* solve nonlinear coupled equations for a's *)
solns = NSolve[eqns,asol];

(* show us the affine transformation *)
at = Array[a,{w,w}] /. solns[[2]];

(* apply the affine transformation to the SVD motion and shape
   matrices *)
motion = motionHat . at;
shape = Inverse[at] . shapeHat;
offset = Table[0, {Dimensions[shape][[2]]}];
cal = Transpose[PseudoInverse[shape]];

(* display a movie of the resulting motion matrix *)
(* <<plotmotion.math *)

A.1.5 File: calib.fing.math

This is the "user interface" to the shape from motion code for the fingertip experiments. Only constants specific to the experiment have changed.

(* calib.fing.math *)
(* This is the top level mathematica file for Shape from Motion
   Calibration of the planar fingertip. *)

(* <<Graphics`Animation` *)

(* specify dimension of motion vector *)
w=2
(* gages = the number of sensing elements the sensor has *)
gages=4

<<motion.basic
<<newrot.math

(* load or generate data file with raw strain gauge data *)
datafile="finger.asc"
(* read in data *)

motiondata=ReadList[datafile,Table[Number,{gages}]];

<<motion.math
s1=rotShape[shape,fy,uyr2];
s2=rotShape[s1,-fx,uxr2];
shape2 = s2;
nc2=-Transpose[PseudoInverse[s2]];
Print[nc2.uxr2]
Print[-fx]
Print[nc2.uyr2]
Print[fy]
Print[nc2.uzr2]
Print[fz]
Print["**********************"]

A.2 SHAPE FROM MOTION CALIBRATION WITH OFFSETS

A.2.1 File: calib.off.math

This is the "user interface" to the shape from motion code that calculates sensor offsets. It reads the ASCII data files (names defined within) and calls all the appropriate routines.

(* calib.off.math *)
(* <<Graphics`Animation` *)

(* specify dimension of space of motion vector (2 or 3) *)
w=3
(* gages = the number of sensing elements the sensor has *)
gages=8

<<motion.basic
<<motion.off.basic
<<newrot.math

(* load or generate data file with raw strain gauge data *)
datafile="block2.asc"
(* read in data *)
motiondata=ReadList[datafile,Table[Number,{gages}]];

(* create a bogus offset for testing with zero-bias data *)
offset =
  1.Table[1,{i,1,Dimensions[motiondata][[1]]},{j,1,Dimensions[motiondata][[2]]}];

motiondata = motiondata+offset;

<<motion.off.math
s1=rotShape[shape,fy,uyr2];
s2=rotShape[s1,-fx,uxr2];
shape2 = s2;
nc2=-Transpose[PseudoInverse[s2]];
Print[nc2.uxr2]
Print[-fx]
Print[nc2.uyr2]
Print[fy]
Print[nc2.uzr2]
Print[fz]
Print["**********************"]

datafile="block3.asc"
motiondata=ReadList[datafile,Table[Number,{gages}]]; (* read in data *)
<<motion.off.math
s1=rotShape[shape,fy,uyr3];
s2=rotShape[s1,fx,uxr3];
nc3=-Transpose[PseudoInverse[s2]];
Print[nc2.uxr3]
Print[fx]
Print[nc2.uyr3]
Print[fy]
Print[nc2.uzr3]
Print[-fz]
Print["*****Error! ^why^**********"]
s2=rotShape[s1,-fz,uzr3];
s2={-s2[[1]],s2[[2]],s2[[3]]};
shape3 = s2;
nc3=-Transpose[PseudoInverse[s2]];
Print[nc3.uxr3]
Print[fx]
Print[nc3.uyr3]
Print[fy]
Print[nc3.uzr3]
Print[-fz]
Print["**********************"]

datafile="block4.asc"
motiondata=ReadList[datafile,Table[Number,{gages}]]; (* read in data *)
<<motion.off.math
s1=rotShape[shape,-fy,uyr4];
s2=rotShape[s1,-fz,uzr4];
s2={-s2[[1]],s2[[2]],s2[[3]]};
shape4 = s2;
nc4=-Transpose[PseudoInverse[s2]];
Print[nc4.uxr4]

Print[fx]
Print[nc4.uyr4]
Print[-fy]
Print[nc4.uzr4]
Print[-fz]
Print["**********************"]

(* combine the shape matrices *)
shapeStack = stackMat[shape2,shape3,shape4];
rotStack = stackMat[torqMat[rt2],torqMat[rt3],torqMat[rt4]];
shape = PseudoInverse[rotStack].shapeStack
calib = -Transpose[PseudoInverse[shape]];

A.2.2 File: motion.off.basic

This is a file of utilities for both simulation and real data.

(* motion.off.basic *)

mf[x_] := MatrixForm[x]

(* Finding the A matrix consists of solving two independentconstraints *)

(* Since the constraints are decoupled, we solve them separately. *)(* A = [A1 A2] where A1 is (w+1) by w and A2 is (w+1) by 1 *)

(* First set up for solving A1 *)

(* mls_row[] generates a Least Squares motion matrix row from a rawmotion row *)

mlsrow1[m_] := Module[{v={},ww=Length[m]},Do[ Do[ If[p==r,v=Append[v,m[[p]]^2],v=Append[v,2*m[[p]]*m[[r]]]], {r,p,ww}],{p,1,ww}];v]

mlsrow1b[m_] := Module[{v={},ww=Length[m]},Do[ Do[ If[p==r,v=Append[v,m[[p]]^2],v=Append[v,2*m[[p]]*m[[r]]]], {r,p,ww}],{p,1,ww}];Do[AppendTo[v,2*m[[p]]],

{p,1,ww}];v]

(* to see symbolic form of LS motion matrix row *)(* mlsrow[Array[m,w]] *)

(* this function generates the symbolic groups corresponding to theleast squares solutions a[[i,j]] is the i,j element of affinetransformation matrix. Note that there will be no free elements ofthis matrix. *)

a1groups[w_] := Module[{g={}},(Do[ Do[ g = Append[g,Sum[a[p,j]*a[r,j],{j,1,w}]], {r,p,w+1}],{p,1,w+1}]; g)]

A1groups[w_,a] := Module[{g={}},(Do[ Do[ g = Append[g,Sum[a[[p,j]]*a[[r,j]],{j,1,w}]], {r,p,w+1}],{p,1,w+1}]; g)]

(* this generates the “a” groups for the technique on page 51 *)a1bgroups[w_] := Module[{g={}},(Do[ Do[ AppendTo[g,Sum[a[j,p]*a[j,r],{j,1,w}]], {r,p,w}],{p,1,w}];Do[ AppendTo[g,If[ w==2, a[1,p], a[1,p] + a[3,p]]],{p,1,w}];g)]

A1bgroups[w_] := Module[{g={}},(Do[ Do[ AppendTo[g,Sum[a[[j,p]]*a[[j,r]],{j,1,w}]], {r,p,w}],{p,1,w}];Do[ AppendTo[g,If[ w==2, a[[1,p]], a[[1,p]] + a[[3,p]]]],{p,1,w}];g)]

(* This builds a symmetric matrix out of a special vector of theupper diagonal elements *)

symmetricMatrix1b[qq_] := Module[{g={}, row={}, ll=Length[qq], i=0,lll=0},

(ll = (Sqrt[1+8*ll] - 1)/2; g = Table[(i-1)*ll+j,{i,ll},{j,ll}];

lll=ll; ll=ll-1; Do[ Do[ i=i+1; g[[p,r]] = qq[[i]], {r,p,ll}], {p,1,ll}]; Do[i=i+1; g[[p,lll]] = qq[[i]], {p,1,lll}]; g = Transpose[g]; i=0; Do[ Do[ i=i+1; g[[p,r]] = qq[[i]], {r,p,ll}], {p,1,ll}]; Do[i=i+1; g[[p,lll]] = qq[[i]], {p,1,lll}]; g)]

refineCyl1b[alpha_, A_]:=Module[{t=1}, (Clear[a11];Clear[a12];Clear[a13];Clear[a21];Clear[a22];Clear[a23];Clear[a31];Clear[a32];Clear[a33]; fxn:=Sqrt[t*(alpha[[1]]-a11^2-a21^2-a31^2)^2]

+Sqrt[t*(alpha[[2]]-a11*a12-a21*a22-a31*a32)^2]+Sqrt[t*(alpha[[3]]-a11*a13-a21*a23-a31*a33)^2]+Sqrt[t*(alpha[[4]]-a12^2-a22^2-a32^2)^2]+Sqrt[t*(alpha[[5]]-a12*a13-a22*a23-a32*a33)^2]+Sqrt[t*(alpha[[6]]-a13^2-a23^2-a33^2)^2]+Sqrt[t*(alpha[[7]]-a11-a31)^2]+Sqrt[t*(alpha[[8]]-a12-a32)^2]+Sqrt[t*(alpha[[9]]-a13-a33)^2];

g=FindMinimum[fxn,{a11,A[[1,1]]},{a12,A[[1,2]]},{a13,A[[1,3]]},{a21,A[[2,1]]},{a22,A[[2,2]]},{a23,A[[2,3]]},{a31,A[[3,1]]},{a32,A[[3,2]]},{a33,A[[3,3]]},AccuracyGoal->0.000000001, MaxIterations->100];

{{a11, a12, a13},{a21,a22,a23},{a31,a32,a33}} /. g[[2]])]

(* This builds a symmetric matrix out of a vector of the upperdiagonal elements *)

symmetricMatrix[qq_] := Module[{g={}, row={}, ll=Length[qq], i=0},(ll = (Sqrt[1+8*ll] - 1)/2; g = Table[(i-1)*ll+j,{i,ll},{j,ll}];

Do[ Do[ i=i+1; g[[p,r]] = qq[[i]], {r,p,ll}], {p,1,ll}]; g = Transpose[g]; i=0; Do[ Do[ i=i+1; g[[p,r]] = qq[[i]], {r,p,ll}], {p,1,ll}]; g)]

(* This truncates the rank of an ellipse (symmetric matrix) to find a close-fit cylinder. An n x n matrix, qq, yields an n x n-1

matrix, g. *)ell2cyl[qq_] := Module[{g={}, svd={}, s, u, ll=Length[qq]},(svd = SingularValues[qq]; ll = ll-1; u=svd[[1]]; s=DiagonalMatrix[Sqrt[Take[svd[[2]],ll]]]; u = Transpose[Take[u,ll]]; g = u . s ; g)]

(*refineCyl[alpha_,A_]:=Module[{a11,a12,a13,a21,a22,a23,a31,a32,a33,a41,a42,a43},*)

refineCyl[alpha_, A_]:=Module[{t=1}, (Clear[a11];Clear[a12];Clear[a13];Clear[a21];Clear[a22];Clear[a23];Clear[a31];Clear[a32];Clear[a33];Clear[a41];Clear[a42];Clear[a43]; fxn:=Sqrt[t*(alpha[[1]]-a11^2-a12^2-a13^2)^2]

+Sqrt[t*(alpha[[2]]-a11*a21-a12*a22-a13*a23)^2]+Sqrt[t*(alpha[[3]]-a11*a31-a12*a32-a13*a33)^2]+Sqrt[t*(alpha[[4]]-a11*a41-a12*a42-a13*a43)^2]+Sqrt[t*(alpha[[5]]-a21^2-a22^2-a23^2)^2]+Sqrt[t*(alpha[[6]]-a21*a31-a22*a32-a23*a33)^2]+Sqrt[t*(alpha[[7]]-a21*a41-a22*a42-a23*a43)^2]+Sqrt[t*(alpha[[8]]-a31^2-a32^2-a33^2)^2]+Sqrt[t*(alpha[[9]]-a31*a41-a32*a42-a33*a43)^2]+Sqrt[t*(alpha[[10]]-a41^2-a42^2-a43^2)^2];

g=FindMinimum[fxn,{a11,A[[1,1]]},{a12,A[[1,2]]},{a13,A[[1,3]]},{a21,A[[2,1]]},{a22,A[[2,2]]},

{a23,A[[2,3]]},{a31,A[[3,1]]},{a32,A[[3,2]]},{a33,A[[3,3]]},{a41,A[[4,1]]},{a42,A[[4,2]]},{a43,A[[4,3]]},AccuracyGoal->0.0000000001, MaxIterations->50];

{{a11, a12, a13},{a21,a22,a23},{a31,a32,a33},{a41,a42,a43}} /. g[[2]])]

refineCyl2[grpGen_, alpha_, A_]:=Module[{}, (Clear[rC];eqns=grpGen[Dimensions[A][[2]],rC];Print[fxn:=Sum[(alpha[[i]] - eqns[[i]])^2, {i,Length[eqns]}]];rC=A;

Do[Do[Unprotect[rC[[i,j]]],{j,Dimensions[rC][[2]]}],{i,Dimensions[rC][[1]]}];

Print[FindMinimum[fxn,{rC[[1,1]],A[[1,1]]},{rC[[1,2]],A[[1,2]]},{rC[[1,3]],A[[1,3]]},{rC[[2,1]],A[[2,1]]},{rC[[2,2]],A[[2,2]]},{rC[[2,3]],A[[2,3]]},{rC[[3,1]],A[[3,1]]},{rC[[3,2]],A[[3,2]]},{rC[[3,3]],A[[3,3]]},{rC[[4,1]],A[[4,1]]},{rC[[4,2]],A[[4,2]]},{rC[[4,3]],A[[4,3]]}]];

)]

(* vector of a[i,j] values to set equal to zero *)freea1s[w_] := Module[{g={}},(g)]

(* vector of a[i,j] values to solve for *)solva1s[w_] := Module[{g={}},(Do[ (* Do a bogus assignment to avoid a warning when we Unset. *) Do[ a[p,r]=1; a[p,r]=.; AppendTo[g,a[p,r]], {r,1,w}],{p,1,w+1}]; g)]

(* vector of a[i,j] values to solve for *)solva1bs[w_] := Module[{g={}},(Do[ (* Do a bogus assignment to avoid a warning when we Unset. *) Do[ a[p,r]=1; a[p,r]=.; AppendTo[g,a[p,r]], {r,1,w}],{p,1,w}]; g)]

(* Now set up for solving A2 *)

(* mls_row[] generates a Least Squares motion matrix row from a rawmotion row *)

mlsrow2[m_] := Module[{v={},ww=Length[m]},Do[

v=Append[v,m[[p]]],{p,1,ww}];v]

(* to see symbolic form of LS motion matrix row *)(* mlsrow[Array[m,w]] *)

(* this function generates the symbolic groups corresponding to theleast squares solutions. a[[i,j]] is the i,j element of affinetransformation matrix. Note that there will be no free elements ofthis matrix. *)

a2groups[w_] := Module[{g={}},(Do[ g = Append[g,a[p,w+1]],{p,1,w+1}]; g)]

(* vector of a[i,j] values to set equal to zero *)freea2s[w_] := Module[{g={}},(g)]

(* vector of a[i,j] values to solve for *)solva2s[w_] := Module[{g={}},(Do[ (* Do a bogus assignment to avoid a warning when we Unset. *) a[p,3]=1; a[p,3]=.; AppendTo[g,a[p,3]],{p,1,w+1}]; g)]

filterStd[data_,filter_] := Module[{l=Length[data], fsum = 0, filt,g={}},

(If[ !VectorQ[filter], filt = {0.1, 0.2, 0.4, 0.8, 1.0, 0.8,0.4, 0.2, 0.1},

filt=filter];Do[fsum = fsum + filt[[i]],{i,Length[filt]}];filt = filt/fsum;n = (Length[filt]+1)/2;l = l-n;Do[AppendTo[g,Sum[filt[[j]]*data[[i-n+j]], {j,1,Length[filt]}]],{i,n,l}]ListPlot[g,PlotJoined->True];{StandardDev[g],g})]

A.2.3 File: motion.off.math

This is the main code for performing shape from motion calibration with offsets.

(* motion.off.math *)(* 9-13-95 RMV (based on motion.math by JDM) This file contains code to apply shape from motion techniques to force sensor calibration with offsets *)

(* Flag *)MotionOffsets = 1;

(* set up the least squares problem *)(* Mathematica defines the SVD as: svd = uT diag[w] v *)(* Our formulation (Strang’s) is: svd = u diag[w] vT *)(* All comments referring to rows and columns are in our

formulation *)svd=SingularValues[N[motiondata]];uT=svd[[1]];(* Keep only the first w+1 singular values *)(* w+1 is the rank of the motion matrix and is defined in

motion.basic *)s=DiagonalMatrix[Sqrt[Take[svd[[2]],w+1]]];vT=svd[[3]];

(* take first w+1 columns of u *)u = Transpose[Take[uT,w+1]];(* take first w+1 rows of v *)vT = Take[vT,w+1];

(* define motion and shape matrices from svd results *)(* These motion and shape matrices are off by an affine

transformation *)motionHat = u . s ;shapeHat = s . vT;

(* The affine transformation must be found in two steps, one for each of the two constraints. A = [A1 A2] *)

(* FINDING A1 *)

(* create the LS motion matrix *)mls = Map[mlsrow1,motionHat];

(* set up the right side of LS equation *)(* construct a vector of 1’s with same number of rows as in

motiondata *)f = Table[1,{Dimensions[motiondata][[1]]}];

(* solve LS for group values: alpha *)(* f = mls . alpha *)alpha1 = PseudoInverse[mls] . f;

Q = symmetricMatrix[alpha1];

A2 = ell2cyl[Q];A1 = refineCyl[alpha1,A2];

(* FINDING A2 *)

(* create the LS motion matrix *)mls = Map[mlsrow2,motionHat];

(* set up the right side of LS equation *)(* construct a vector of 1’s with same number of rows as in

motiondata *)f = Table[1,{Dimensions[motiondata][[1]]}];

(* solve LS for group values: alpha *)(* f = mls . alpha *)alpha = PseudoInverse[mls] . f;

(* show us the affine transformation *)

at = Append[Transpose[A1],alpha];at = Transpose[at];

mf[at]

(* apply the affine transformation to the SVD motion and shapematrices *)

motion = motionHat . at;shape = Inverse[at] . shapeHat;offset = Take[shape,-1][[1]];shape = Take[shape,w];cal = Transpose[PseudoInverse[shape]];

(* <<plotmotion.math *)

A.2.4 Analytic Solutions for Finding the Affine Transform

{{b12 -> ((a2*F^2 - a1*a2*m11^2 - a2^2*m11*m12 - a1*a3*m11*m12 - a2*a3*m12^2 - (F*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2))/

m11)* ((-2*a2^2*F^2*m11^2 + a1*a3*F^2*m11^2 + a1*a2^2*m11^4 - 2*a2*a3*F^2*m11*m12 + 2*a2^3*m11^3*m12 + 2*a1*a2*a3*m11^3*m12 - a3^2*F^2*m12^2 + 5*a2^2*a3*m11^2*m12^2 + a1*a3^2*m11^2*m12^2 + 4*a2*a3^2*m11*m12^3 + a3^3*m12^4 - 2*a2*F*m11*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2) - 2*a3*F*m12*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2))/

(a1*m11^2 + 2*a2*m11*m12 + a3*m12^2)^2)^(1/2))/ (-(a3*F^2) + a2^2*m11^2 + 2*a2*a3*m11*m12 + a3^2*m12^2), b22 -> -((-2*a2^2*F^2*m11^2 + a1*a3*F^2*m11^2 + a1*a2^2*m11^4 - 2*a2*a3*F^2*m11*m12 + 2*a2^3*m11^3*m12 +

2*a1*a2*a3*m11^3*m12 - a3^2*F^2*m12^2 + 5*a2^2*a3*m11^2*m12^2 +

a1*a3^2*m11^2*m12^2 + 4*a2*a3^2*m11*m12^3 + a3^3*m12^4 - 2*a2*F*m11*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2) - 2*a3*F*m12*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2))/ (a1*m11^2 + 2*a2*m11*m12 + a3*m12^2)^2)^(1/2), b11 -> (a1*F*m11^2 + a2*F*m11*m12 - m12*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2))/ (a1*m11^3 + 2*a2*m11^2*m12 + a3*m11*m12^2), b21 -> (a2*F*m11 + a3*F*m12 + ((a2^2 - a1*a3)*m11^2*(F^2 - a1*m11^2 - 2*a2*m11*m12 -

a3*m12^2))^ (1/2))/(a1*m11^2 + 2*a2*m11*m12 + a3*m12^2)}, {b12 -> ((-(a2*F^2) + a1*a2*m11^2 + a2^2*m11*m12 + a1*a3*m11*m12 + a2*a3*m12^2 - (F*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2))/

m11)* ((-2*a2^2*F^2*m11^2 + a1*a3*F^2*m11^2 + a1*a2^2*m11^4 - 2*a2*a3*F^2*m11*m12 + 2*a2^3*m11^3*m12 + 2*a1*a2*a3*m11^3*m12 - a3^2*F^2*m12^2 + 5*a2^2*a3*m11^2*m12^2 + a1*a3^2*m11^2*m12^2 + 4*a2*a3^2*m11*m12^3 + a3^3*m12^4 + 2*a2*F*m11*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2) + 2*a3*F*m12*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2))/ (a1*m11^2 + 2*a2*m11*m12 + a3*m12^2)^2)^(1/2))/ (-(a3*F^2) + a2^2*m11^2 + 2*a2*a3*m11*m12 + a3^2*m12^2), b22 -> ((-2*a2^2*F^2*m11^2 + a1*a3*F^2*m11^2 + a1*a2^2*m11^4 - 2*a2*a3*F^2*m11*m12 + 2*a2^3*m11^3*m12 +

2*a1*a2*a3*m11^3*m12 - a3^2*F^2*m12^2 + 5*a2^2*a3*m11^2*m12^2 + a1*a3^2*m11^2*m12^2

+ 4*a2*a3^2*m11*m12^3 + a3^3*m12^4 + 2*a2*F*m11*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2) + 2*a3*F*m12*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2))/ (a1*m11^2 + 2*a2*m11*m12 + a3*m12^2)^2)^(1/2), b11 -> (a1*F*m11^2 + a2*F*m11*m12 + m12*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2))/

(a1*m11^3 + 2*a2*m11^2*m12 + a3*m11*m12^2), b21 -> (a2*F*m11 + a3*F*m12 - ((a2^2 - a1*a3)*m11^2*(F^2 - a1*m11^2 - 2*a2*m11*m12 -

a3*m12^2))^ (1/2))/(a1*m11^2 + 2*a2*m11*m12 + a3*m12^2)}, {b12 -> ((a2*F^2 - a1*a2*m11^2 - a2^2*m11*m12 - a1*a3*m11*m12 - a2*a3*m12^2 + (F*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2))/

m11)* ((-2*a2^2*F^2*m11^2 + a1*a3*F^2*m11^2 + a1*a2^2*m11^4 - 2*a2*a3*F^2*m11*m12 + 2*a2^3*m11^3*m12 + 2*a1*a2*a3*m11^3*m12 - a3^2*F^2*m12^2 + 5*a2^2*a3*m11^2*m12^2 + a1*a3^2*m11^2*m12^2 + 4*a2*a3^2*m11*m12^3 + a3^3*m12^4 + 2*a2*F*m11*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2) + 2*a3*F*m12*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2))/ (a1*m11^2 + 2*a2*m11*m12 + a3*m12^2)^2)^(1/2))/ (-(a3*F^2) + a2^2*m11^2 + 2*a2*a3*m11*m12 + a3^2*m12^2), b22 -> -((-2*a2^2*F^2*m11^2 + a1*a3*F^2*m11^2 + a1*a2^2*m11^4 - 2*a2*a3*F^2*m11*m12 + 2*a2^3*m11^3*m12 +

2*a1*a2*a3*m11^3*m12 - a3^2*F^2*m12^2 + 5*a2^2*a3*m11^2*m12^2 +

a1*a3^2*m11^2*m12^2 + 4*a2*a3^2*m11*m12^3 + a3^3*m12^4 + 2*a2*F*m11*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2) + 2*a3*F*m12*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2))/ (a1*m11^2 + 2*a2*m11*m12 + a3*m12^2)^2)^(1/2), b11 -> (a1*F*m11^2 + a2*F*m11*m12 + m12*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2))/ (a1*m11^3 + 2*a2*m11^2*m12 + a3*m11*m12^2), b21 -> (a2*F*m11 + a3*F*m12 - ((a2^2 - a1*a3)*m11^2*(F^2 - a1*m11^2 - 2*a2*m11*m12 -

a3*m12^2))^ (1/2))/(a1*m11^2 + 2*a2*m11*m12 + a3*m12^2)}, {b12 -> ((-(a2*F^2) + a1*a2*m11^2 + a2^2*m11*m12 + a1*a3*m11*m12 + a2*a3*m12^2 + (F*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2))/

m11)* ((-2*a2^2*F^2*m11^2 + a1*a3*F^2*m11^2 + a1*a2^2*m11^4 - 2*a2*a3*F^2*m11*m12 + 2*a2^3*m11^3*m12 + 2*a1*a2*a3*m11^3*m12 - a3^2*F^2*m12^2 + 5*a2^2*a3*m11^2*m12^2 + a1*a3^2*m11^2*m12^2 + 4*a2*a3^2*m11*m12^3 + a3^3*m12^4 - 2*a2*F*m11*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2) -

2*a3*F*m12*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2))/ (a1*m11^2 + 2*a2*m11*m12 + a3*m12^2)^2)^(1/2))/ (-(a3*F^2) + a2^2*m11^2 + 2*a2*a3*m11*m12 + a3^2*m12^2), b22 -> ((-2*a2^2*F^2*m11^2 + a1*a3*F^2*m11^2 + a1*a2^2*m11^4 - 2*a2*a3*F^2*m11*m12 + 2*a2^3*m11^3*m12 +

2*a1*a2*a3*m11^3*m12 - a3^2*F^2*m12^2 + 5*a2^2*a3*m11^2*m12^2 + a1*a3^2*m11^2*m12^2

+ 4*a2*a3^2*m11*m12^3 + a3^3*m12^4 - 2*a2*F*m11*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2) - 2*a3*F*m12*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2))/ (a1*m11^2 + 2*a2*m11*m12 + a3*m12^2)^2)^(1/2), b11 -> (a1*F*m11^2 + a2*F*m11*m12 - m12*((a2^2 - a1*a3)*m11^2* (F^2 - a1*m11^2 - 2*a2*m11*m12 - a3*m12^2))^(1/2))/ (a1*m11^3 + 2*a2*m11^2*m12 + a3*m11*m12^2), b21 -> (a2*F*m11 + a3*F*m12 + ((a2^2 - a1*a3)*m11^2*(F^2 - a1*m11^2 - 2*a2*m11*m12 -

a3*m12^2))^ (1/2))/(a1*m11^2 + 2*a2*m11*m12 + a3*m12^2)}}

A.3 SHAPE FROM MOTION PRIMORDIAL LEARNING MATLAB CODE

A.3.1 File: pumapcat.m

This is the "user interface" to the Shape from Motion Primordial Learning routines for training the PUMA manipulation primitives. The matrix "data" consists of all the sensor and actuator data gathered during training via teleoperation.

function [U,sig,ave,Uall]=pumapcat(data,pi,mi)
% [U,sig,ave,Uall] = pumapcat(data,pi,mi)
%
%Training mode for PUMA skill acquisition using Shape from Motion
%Primordial Learning
%
%pi (optional) Specifies pi-th group of eigenvectors to extract
%   default is 1st group
%mi (optional) Specifies total number of eigenvectors to extract

% argument pi was added 12/19/96 RMV

if nargin < 2, pi = 1; end;

head = 1;

[Uall,sig,ave]=pcatrain(data);

% remove all eigenvectors associated with zero eigenvalues
[Uall,sig] = rm_zero_eig(Uall,sig);

[x,y]=size(Uall);

% If mi is not specified, find it automatically
if nargin ~= 3
  % find the max ratio of eigenvalues
  rats = ratiosv(sig);

  % search for pi-th group of eigenvectors
  mi = 0;
  prevmi = 1;
  tail = size(rats);
  for j=1:pi
    head = mi+1;
    [mv,mi] = max(rats(head:tail));
    mi = mi + head -1;

    % see if any ratios before the max are > 20 for 12 bits
    % see if any ratios before the max are > 12 for 8 bits
    for i=mi:-1:prevmi
      if rats(i) > 12, mi = i; end;
    end;
    prevmi = mi+1;
  end;
end;

U = Uall(:,(y-mi+1):(y-head+1));

% save pca.mat U sig ave Uall
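A typical training call might look as follows (illustrative only; the file name guarded.mat and the variable data are hypothetical, not names from the experiments):

% illustrative usage -- 'guarded.mat' and 'data' are hypothetical names
load guarded.mat                      % logged sensor+actuator matrix 'data'
[U,sig,ave] = pumapcat(data);         % extract the dominant eigenvector group
[U2,sig2,ave2] = pumapcat(data,2);    % or extract the second group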

A.3.2 File: pcatrain.m

This is the “workhorse” training routine for Shape from Motion Primordial Learning.

function [U,sigma,ave]=pcatrain(Z,flag)
% [U,sigma,ave] = pcatrain(Z,flag)
%This function is similar to John Hancock's PCA-style learning

%algorithm for the MK-V mobile robot and PUMA skill acquisition.
%The matrix Z is the transpose of Hancock's data matrix A.
%Eigenvalues are in ascending order except for possible zeros.
%( Use rm_zero_eig() )
%
%Z is the matrix of training data that includes all variables
%   both sensor and actuator in columns, sensor first, actuator appended.
%   (For the MK-V, include the status bits:
%   [bump steering_motor drive_motor].)
%flag (optional) If set to one, take median of Z instead of mean.
%
%U is the full matrix of eigenvectors
%sigma is the vector of eigenvalues (may include zeros)
%ave is the average values of Z

if (nargin == 1)
  ave = mean(Z);
elseif (flag == 1)
  ave = median(Z);
else
  ave = mean(Z);
end;

T = mat_offset(Z,ave);
C = T'*T;
[U,S] = eig(C);
sigma = diag(S);
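The helper mat_offset() is not listed in this appendix; from its use above it presumably subtracts the average row from every row of Z before the covariance is formed. A minimal in-line equivalent (an assumption about that helper, not the original code) would be:

% illustrative equivalent of the mat_offset() call above (assumed behavior)
T = Z - repmat(ave, size(Z,1), 1);   % subtract the column means from each row
C = T'*T;                            % unnormalized covariance of the training data
[U,S] = eig(C);                      % principal directions and their eigenvalues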

A.3.3 File: rm_zero_eig.m

function [U2,sigma2]=rm_zero_eig(U1,sigma1)
% [U2,sigma2] = rm_zero_eig(U1,sigma1)
%This function searches for eigenvalues equal to zero and removes them
%and their associated eigenvectors.
%
%U is the matrix of eigenvectors
%sigma is the vector of eigenvalues

[row,col] = size(sigma1);

if (row ~= 1) & (col ~= 1)
  'eigenvalues must be a single vector';
  return;
end;

cnt = max([row col]);

U2=[];
sigma2 = [];

for i=1:cnt,
  if sigma1(i) ~= 0
    U2 = [U2 U1(:,i)];
    sigma2 = [sigma2; sigma1(i)];
  end;
end;

A.3.4 File: ratiosv.m

function Y=ratiosv(X)
% Y=ratiosv(X)
% This function sorts the list, X, and computes the
% ratios of adjacent values.

[nrow ncol] = size(X);
if ncol > nrow, X=X'; end;
[nrow ncol] = size(X);

rr = flipud(sort(X));

Y=[];
ok = 1;

for i=2:nrow,
  rat = rr(i-1,:)/rr(i,:);
  if abs(rat) > 1e6,
    Y = [Y;0];
    ok = 0;
  elseif ok == 1,
    Y = [Y;rat];
  else
    Y = [Y;0];
  end;
end;

A.3.5 File: pumapcar.m

This is the "user interface" to the Shape from Motion Primordial Learning routines for executing the PUMA manipulation primitives in simulation. The matrix "data" consists of all the sensor and actuator data gathered during teleoperation, but the actuator data is ignored. pcarun() calculates autonomous actuator commands based only on the sensor data, which can be compared to the teleoperation data.

For execution in real time, see the C code in Section A.4.

function [sens,mot,tsens,tmot]=pumapcar(data,U,ave,S)
% [sens,mot,truesens,truemot] = pumapcar(data,U,ave,S)
% Performs Shape from Motion (PCA) execution of
% acquired PUMA skills. (see pumapcat for training.)
% Assumes the number of mcols is 6 (xd_ref)
% truesens and truemot are optional
% S is an optional scaling matrix on data

[row,col] = size(data);

if nargin == 4,
  data = data * S;
end;

[sens,mot] = pcarun(data,U,ave,6);

if nargout == 2, return; end;

tsens = data(:,1:(col-6));
tmot = data(:,(col-5):col);
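A minimal comparison sketch (illustrative only; it assumes U and ave came from an earlier pumapcat run on a training log and that data holds a separate demonstration log):

% illustrative usage -- compare autonomous commands to the teleoperated ones
[sens,mot,tsens,tmot] = pumapcar(data,U,ave);
err = mot - tmot;               % per-sample command error (N x 6)
rmserr = sqrt(mean(err.^2));    % RMS error for each of the 6 command axes
plot([mot(:,1) tmot(:,1)]);     % overlay the first command axis for inspection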

A.3.6 File: pcarun.m

This is the "workhorse" execution simulator for Shape from Motion Primordial Learning.

function [sens,mot]=pcarun(Z,U,ave,mcols)
% [sens,mot] = pcarun(Z,U,ave,mcols)
% This function is similar to John Hancock's PCA-style run-time
% algorithm for the MK-V mobile robot or PUMA skill acquisition.
%
%Z is the matrix of sensor data that includes all variables.
%   (For the MK-V, include the bump status bit.) If actuator
%   data is included in the matrix, all sensor data is grouped together
%   in the first columns, all actuator data is appended
%   in the last columns and "mcols" must be set appropriately.
%U is the reduced matrix of principal eigenvectors
%ave is the vector of average training values
%mcols (optional) number of trailing columns of actuator data included
%   in "Z" that must be stripped off (2 is the default)
%
%sens is the matrix of predicted sensor values
%mot is the matrix of motor commands

[zrow,zcol]=size(Z);
if nargin < 4, mcols = 2; end;

% Strip off the actuator data column vectors
zc = zcol - mcols;
Z = Z(:,1:zc);

%%% This is the sensor eigenmatrix
Us = U(1:zc,:);
%%% These are the means of the sensor values
aves = ave(1:zc);

[row,n] = size(U);
[m,col] = size(Z);

V=[];
for i = 1:m,
  Zoff = Z(i,:) - aves;
  a = (Zoff * Us(:,1)) * U(:,1);

  for j = 2:n,
    a = a + (Zoff * Us(:,j)) * U(:,j);
  end;

  V = [V; a' + ave];
end;

sens = V(:,1:zc);
mot = V(:,(zc+1):zcol);

A.3.7 File: pumapcai2.m

This is the Shape from Motion Primordial Learning routine for identifying previously learned manipulation primitives from teleoperation or human demonstration.

function [match,sv,Uf]=pumapcai2(data,U,ave,pi,S)
% [match,sv,Uf] = pumapcai2(data,U,ave,pi,S)
% Identifies Shape from Motion (PCA) PUMA skills
% Assumes the number of actuator elements is 6 (xd_ref)
%
% data   - raw data matrix; sensors first, actuators appended
% U, ave - eigenvectors and average of the previously acquired skill
% pi     - (optional) index of group of eigenvectors to test
% S      - (optional) scaling matrix on input data
% match  - goodness of fit (0 to 1)
% sv     - matrix of ratios of 7 largest singular values
% Uf     - (optional) all significant eigenvectors at each sample

%
% Calls pumapcat() which calls pcatrain().

[row,col] = size(data);

if nargin < 4, pi = 1; end;

if nargin == 5,
  data = data * S;
end;

sv = [];
match = [];

if nargout == 3
  Uf = [0];
  ufsize = 1;
end;

Usize = size(U);

% Train on short subsets and see if the eigenvectors are parallel
for i=1:5:(row-35)
  [Us,sigs,aves,Ualls] = pumapcat(data(i:(i+34),:),pi);

  % save the recovered eigenvectors if requested
  if nargout == 3
    ussize = size(Us);
    % ignore if there are more than 7 eigenvectors for convenience
    if ussize(2) <= 7
      if ussize(2) > ufsize
        tmpsize = size(Uf);
        Uf = [Uf zeros(tmpsize(1),ussize(2)-ufsize)];
        ufsize = ussize(2);
        Uf = [Uf; zeros(1,ufsize);Us];
      elseif ussize(2) < ufsize
        Uf = [Uf; zeros(1,ufsize);[Us zeros(ussize(1),ufsize-ussize(2))]];
      else
        Uf = [Uf; zeros(1,ufsize);Us];
      end;
    end;
  end;

  ttmp = ratiosv(sigs);
  sv = [sv ttmp(1:7)];
  if size(Us) == Usize
    % find the product of the dot products of the eigenvectors
    tmp = det(diag(diag(abs(Us'*U))));

Page 173: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

SHAPE FROM MOTION PRIMORDIAL LEARNING CHIMERA CODE

Advanced Mechatronics Lab - Carnegie Mellon University 161

% add number of points to match the step size so total isconstant

for j=1:5 match = [match; tmp]; end; else for j=1:5 match = [match; 0.0]; end; end;end;

sv = sv’;

A.4 SHAPE FROM MOTION PRIMORDIAL LEARNING CHIMERA CODE

A.4.1 File: skill.c

This is C code for a port-based object running under the Chimera real-time operatingsystem that executes the extracted sensorimotor primitive in real time. Chimera allowsmultiple instantiations of this piece of code with different configuration files so thatmultiple primitives can execute concurrently.

/***************************************************************** * skill.c * * Created: 05-16-96 RMV Richard M. Voyles, The Robotics

Institute * * Modified: * * --------------------------------------------------------------- * * General skill extracted by Shape from Motion Primordial Learning. * * This module reads eigenvectors that determine the eigenspace of a * learned skill from the rmod file and performs a projection of * the INVARs onto the space. See Hancock & Thorpe, IROS 95. * * The INVARs must be ordered correctly to match the training set. * *State variable table: * INCONST: none * OUTCONST:none * INVAR: any float state variables in correct order * OUTVAR: one float state variable

Page 174: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

SHAPE FROM MOTION PRIMORDIAL LEARNING CHIMERA CODE

162 Toward Gesture-Based Programming: Agent-Based Haptic Skill Acquisition and Interpretation

* *****************************************************************/

/* ***************************************************************//*include files *//* ***************************************************************/

#include <chimera.h>#include <sbs.h>#include <cmdi.h>#include <string.h>

#include <strings.h>#include <vtypes.h>

/* ***************************************************************//*defines *//* ***************************************************************/#define DUMMY_CODE 1

typedef struct { char name[MAXNAMELEN+1]; int *value; /* value of trigger svar */ int ix; /* svar index */ int trigval; /* trigger value (on equal) */} trigger_t;

/* ***************************************************************//*module ‘Local_t’ definition as required by Chimera*//* ***************************************************************/

typedef struct { int nInvars ; /* number of INVARs */ int nOutvars ; /* number of OUTVARs (must be 1) */ int nEigenvect ; /* number of eigenvectors */ double *eigenvect[16]; double average[64]; double evScale; double offval[32]; int off_flag; int invarsNelem[64]; /* array of number of elements of each

INVAR */ int totInNelem; /* Total number of invar elements */ int totOutNelem; /* Total number of outvar elements */ int totNelem; /* Length of the eigenvectors

(invars + outvars) */} skillLocal_t ;

SBS_MODULE(skill);

Page 175: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

SHAPE FROM MOTION PRIMORDIAL LEARNING CHIMERA CODE

Advanced Mechatronics Lab - Carnegie Mellon University 163

/* ***************************************************************//*skillInit Initialize the module.*//* ***************************************************************/

intskillInit(cinfo, local, stask)cfigInfo_t*cinfo;skillLocal_t*local;sbsTask_t*stask;{ sbsSvar_t*svar = &stask->svar; svarTable_t *vart=svar->vartable ; int *invarList=svar->invar; int *outvarList=svar->outvar; int *nvars; int i, j, n;

/* count the number of outvars by searching until a -1 is found */ nvars = &(local->nOutvars) ; *nvars=0; local->totOutNelem = 0; while ( (i=outvarList[*nvars]) != -1 ) { /* total up the length of the OUTVARs */ local->totOutNelem += svarNelemIx(vart, outvarList[*nvars]);

/* Only support FLOATs at this time. */ if(svarTypeIx(vart, invarList[*nvars]) != VT_FLOAT) errInvoke(stask->errmod,”OUTVARS must be floats”,DUMMY_CODE); if(++(*nvars)>1) errInvoke(stask->errmod,”Too many

OUTVARs”,DUMMY_CODE); }

nvars = &(local->nInvars) ; *nvars=0; local->totInNelem = 0; while ( (i=invarList[*nvars]) != -1 ) { /* total up the length of the INVARs */ local->totInNelem += svarNelemIx(vart, invarList[*nvars]);

/* Only support FLOATs at this time. */ if(svarTypeIx(vart, invarList[*nvars]) != VT_FLOAT) errInvoke(stask->errmod,”INVARS must be floats”,DUMMY_CODE); if(++(*nvars)>64) errInvoke(stask->errmod,”Too many

INVARs”,DUMMY_CODE); }

/* Allow a scale factor on the eigenvectors */ local->evScale = 1.0; cfigOptional(cinfo,”EVECTOR_SCALE”,&local->evScale,CFIG_DOUBLE,1);

Page 176: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

SHAPE FROM MOTION PRIMORDIAL LEARNING CHIMERA CODE

164 Toward Gesture-Based Programming: Agent-Based Haptic Skill Acquisition and Interpretation

n = local->totNelem = local->totInNelem + local->totOutNelem;

i=0; local->eigenvect[0] = (double *)cfigAlloc(cinfo, sizeof(double) *

n); while (cfigOptional(cinfo,”EIGENVECTOR”,local-

>eigenvect[i],CFIG_DOUBLE,n) == I_OK){ for (j=0; j<n; j++) local->eigenvect[i][j] *= local->evScale; local->eigenvect[++i] = (double *)cfigAlloc(cinfo, sizeof(double)

* n); } /* endwhile */

local->nEigenvect = i; if (i<1) errInvoke(stask->errmod,”No Eigenvectors

found”,DUMMY_CODE);

cfigCompulsory(cinfo,”AVERAGE”,local->average,CFIG_DOUBLE,n);

local->off_flag = cfigOptional(cinfo,”OFF_OUTVAR”,local->offval,CFIG_DOUBLE,

local->totOutNelem);

/* Return from initialization. */ return (int) local;}

/* ***************************************************************//*skillOn Start up the module.*//* ***************************************************************/

intskillOn(local, stask)skillLocal_t*local;sbsTask_t*stask;{

return I_OK;}

/* ***************************************************************//*skillCycle Process module information.*//* ***************************************************************/

intskillCycle(local, stask)skillLocal_t*local;sbsTask_t*stask;{

Page 177: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

SHAPE FROM MOTION PRIMORDIAL LEARNING CHIMERA CODE

Advanced Mechatronics Lab - Carnegie Mellon University 165

sbsSvar_t*svar = &stask->svar; svarTable_t *vart=svar->vartable ; int *invarList=svar->invar; int *outvarList=svar->outvar; int i, j, k, w, nEv, nIn, nEl; double factor; double *vect, *ave = local->average; float *invar, *outvar;

/* step through the OUTVAR elements to zero them */ nEl = svarNelemIx(vart,outvarList[0]); outvar = svarValueIx(vart,outvarList[0],float); for (k=0; k<nEl; k++){ outvar[k] = 0.0; } /* endfor */

/* step through the INVARs to offset them */ w = 0; nIn = local->nInvars; /* number of INVARs */

for (j=0; j<nIn; j++){ invar = svarValueIx(vart,invarList[j],float); nEl = svarNelemIx(vart,invarList[j]);

/* step through the elements of the INVARs, subtracting theaverage */

for (k=0; k<nEl; k++){ invar[k] -= ave[w++]; } /* endfor */ } /* endfor */

/* Start the eigenspace projection */

nEv = local->nEigenvect; /* number of eigenvectors */ nIn = local->nInvars; /* number of INVARs */

/* step through the eigenvectors */ for (i=0; i<nEv; i++){ w = 0; factor = 0.0; /* scale factor (dot product

result) */ vect = local->eigenvect[i]; /* pointer to current

eigenvector */

/* step through the INVARs */ for (j=0; j<nIn; j++){ invar = svarValueIx(vart,invarList[j],float); nEl = svarNelemIx(vart,invarList[j]);

Page 178: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

SHAPE FROM MOTION PRIMORDIAL LEARNING CHIMERA CODE

166 Toward Gesture-Based Programming: Agent-Based Haptic Skill Acquisition and Interpretation

/* step through the elements of the INVARs, forming the dotproduct with

the sensor part of the eigenvector */ for (k=0; k<nEl; k++){factor += invar[k] * vect[w++]; } /* endfor */ } /* endfor */

if(local->totInNelem != w) errInvoke(stask->errmod,”INVAR nums dont match”,DUMMY_CODE);

/* The dot product is a scale factor on the contribution ofthis eigenvector

to the output. Scale the actuator part of the eigenvectorand accumulate. */

/* step through the OUTVAR elements */ nEl = svarNelemIx(vart,outvarList[0]); outvar = svarValueIx(vart,outvarList[0],float); for (k=0; k<nEl; k++){ outvar[k] += factor * vect[w+k]; } /* endfor */

} /* endfor */

/* Add the average to the output vector */ nEl = svarNelemIx(vart,outvarList[0]); outvar = svarValueIx(vart,outvarList[0],float); for (k=0; k<nEl; k++){ outvar[k] += ave[w+k]; } /* endfor */

return I_OK;}

/* ***************************************************************//*skillOff Stop the module. *//* ***************************************************************/

intskillOff(local, stask)skillLocal_t*local;sbsTask_t*stask;{ sbsSvar_t*svar = &stask->svar; svarTable_t *vart=svar->vartable ; int *outvarList=svar->outvar; int k, nEl; float *outvar;

Page 179: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

SHAPE FROM MOTION PRIMORDIAL LEARNING CHIMERA CODE

Advanced Mechatronics Lab - Carnegie Mellon University 167

/* If there are valid off values, write them to the global table.*/

if (local->off_flag == I_OK){ nEl = svarNelemIx(vart,outvarList[0]); outvar = svarValueIx(vart,outvarList[0],float); for (k=0; k<nEl; k++){ outvar[k] = local->offval[k]; } /* endfor */ svarWrite(svarVarptrIx(vart,outvarList[0])); } /* endif */

/* Indicate that the module is off and return.*/ kprintf(“%s: OFF\n”,stask->rmod); return I_OK;}

/* ***************************************************************//*skillKill Clean up after the module.*//* ***************************************************************/

intskillKill(local, stask)skillLocal_t*local;sbsTask_t*stask;{

/* Indicate that the module is finished and return.*/ kprintf(“%s: KILL\n”,stask->rmod); return I_OK;}

/* ***************************************************************//*skillError Attempt automatic error recovery.*//* ***************************************************************/

intskillError(local, stask, mptr, errmsg, errcode)skillLocal_t*local;sbsTask_t*stask;errModule_t*mptr;char *errmsg;int errcode;{ /* Return after not correcting error.*/ return SBS_ERROR;}

/* ***************************************************************//*skillClear Clear error state of the module.*/

Page 180: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

SHAPE FROM MOTION PRIMORDIAL LEARNING CHIMERA CODE

168 Toward Gesture-Based Programming: Agent-Based Haptic Skill Acquisition and Interpretation

/* ***************************************************************/

intskillClear(local, stask, mptr, errmsg, errcode)skillLocal_t*local;sbsTask_t*stask;errModule_t*mptr;char *errmsg;int errcode;{

switch(errcode) {

case DUMMY_CODE: return SBS_OFF;

default: sbsNewError(stask, “Clear not defined, still in error state”,

errcode); }

return SBS_ERROR;}

/* ***************************************************************//*skillSet Set module parameters.*//* ***************************************************************/

intskillSet(local, stask)skillLocal_t*local;sbsTask_t*stask;{ return I_OK;}

/* ***************************************************************//*skillGet Get module parameters.*//* ***************************************************************/

intskillGet(local, stask)skillLocal_t*local;sbsTask_t*stask;{ return I_OK;}

/* ***************************************************************//*skillSync Get module parameters.*//* ***************************************************************/

Page 181: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

SHAPE FROM MOTION PRIMORDIAL LEARNING CHIMERA CODE

Advanced Mechatronics Lab - Carnegie Mellon University 169

intskillSync(local, stask)skillLocal_t*local;sbsTask_t*stask;{ return I_OK;}

/* ***************************************************************//*skillReinit Get module parameters.*//* ***************************************************************/

intskillReinit(local, stask)skillLocal_t*local;sbsTask_t*stask;{ return I_OK;}

A.4.2 File: zguard.rmod

This is the Chimera configuration file for autonomously executing the guarded movealong the z axis with a PUMA robot.

MODULE skillDESC Primordial Learning primitive - guarded move in ZSVARALIAS X^_CMD=X^_REFINCONST noneOUTCONST noneINVAR F_MEZ P_TrBOUTVAR X^_CMDTASKTYPE periodicFREQ 50

LOCAL

EVECTOR_SCALE 2.0

EIGENVECTOR -0.0360 0.0162 0.7035 -0.0015 -0.0032 0 -0.7096 -0.0066\0.0001 0.0001 0.0005 0.0 0.0 0.000126 0.0 0.0 0.0

AVERAGE -0.3334 -0.0255 -7.132 -0.0011 -0.0195 0.0005 7.2455 0.0711 \-0.7006 0.1956 -0.5464 0.0 0.0 0.0026 0.0 0.0 0.0

OFF_OUTVAR 0.0 0.0 0.0 0.0 0.0 0.0

EOF

Page 182: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

SHAPE FROM MOTION PRIMORDIAL LEARNING CHIMERA CODE

170 Toward Gesture-Based Programming: Agent-Based Haptic Skill Acquisition and Interpretation

A.4.3 File: xroll.rmod

This is the Chimera configuration file for autonomously executing the rotational acco-modation around the x axis with a PUMA robot.

MODULE skillDESC Primordial Learning primitive executiveINCONST noneOUTCONST noneINVAR F_MEZ P_TrBOUTVAR X^_CMD_SK2TASKTYPE periodicFREQ 20

LOCAL

EIGENVECTOR -0.4831 -0.8656 0.017 -0.1269 -0.0078 0.0068 -0.0104 -0.0266 \

-0.0001 -0.0006 0.0 0.0 0.0 0.0 0.0009 0.0 0.0

EIGENVECTOR -0.8218 0.4107 0.0264 0.2716 -0.0366 -0.0297 -0.0020.2815 \

-0.0001 -0.0003 0.0006 0.0 0.0 0.0 -0.0014 0.0 0.0

EIGENVECTOR -0.2489 0.2770 -0.0014 -0.8869 0.0288 0.0568 0.0183 -0.2651 \

0.0001 -0.0011 0.0003 0.0 0.0 0.0 0.0108 0.0 0.0

EIGENVECTOR -0.1653 0.0705 0.0135 0.3417 -0.019 -0.0277 0.0198 -0.9215 \

0.0003 0.0 0.0004 0.0 0.0 0.0 -0.0101 0.0 0.0

AVERAGE -0.417 -0.2955 -14.2702 0.1551 -0.0410 -0.0129 14.4296 \0.4705 -0.6 0.2663 -0.4847 0.0 0.0 0.0 0.0009 0.0 0.0

OFF_OUTVAR 0.0 0.0 0.0 0.0 0.0 0.0

EOF

Page 183: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

171

Appendix B

Demonstration Code

Chimera Implementations

B.1 CHIMERA CONFIGURATION FILES

B.1.1 File: demo2.fsm

This is the finite state machine configuration file for controlling Morrow’s agent-levelprogramming interface for Chimera [Morrow, 1997] for the first demonstrationdescribed in Section 6.3. This program contains two guarded moves.

# **********************STATE 0 create 50.0 30.0COMMAND sbs spawn puma_pidgAcontrolCOMMAND sbs spawn grav_compsecondCOMMAND sbs spawn moveq secondCOMMAND sbs spawn invkin thirdCOMMAND sbs spawn fwdkin thirdCOMMAND sbs spawn umhand fourthCOMMAND sbs spawn dhmerge fourthCOMMAND sbs spawn dhmove fourthCOMMAND sbs spawn dhgrasp fourthCOMMAND sbs spawn trjcsvar secondCOMMAND sbs spawn gbptraj fourthCOMMAND sbs spawn pumaxformcontrolCOMMAND sbs spawn ijac fourthCOMMAND sbs spawn cartcntl thirdCOMMAND sbs spawn zguard thirdCOMMAND sbs spawn aftC fourth

Page 184: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

CHIMERA CONFIGURATION FILES

172 Toward Gesture-Based Programming: Agent-Based Haptic Skill Acquisition and Interpretation

# **********************STATE 1 destroy 50.0 100.0COMMAND sbs kill puma_pidgACOMMAND sbs kill grav_compCOMMAND sbs kill moveqCOMMAND sbs kill invkinCOMMAND sbs kill fwdkinCOMMAND sbs kill trjcsvarCOMMAND sbs kill gbptrajCOMMAND sbs kill umhandCOMMAND sbs kill dhmoveCOMMAND sbs kill dhgraspCOMMAND sbs kill dhmergeCOMMAND sbs kill pumaxformCOMMAND sbs kill ijacCOMMAND sbs kill cartcntlCOMMAND sbs kill zguardCOMMAND sbs kill aftC# **********************STATE 2 reset 50.0 170.0COMMAND sbs reinit gbptraj# **********************STATE 3 halt 50.0 240.0COMMAND sbs off moveqCOMMAND sbs off trjcsvarCOMMAND sbs off gbptrajCOMMAND sbs off dhmerge# **********************STATE 4 start 170.0 30.0COMMAND sbs on puma_pidgACOMMAND sbs on grav_compCOMMAND sbs on moveq umdhAstartCOMMAND sbs on umhandCOMMAND sbs on dhmergeCOMMAND sbs on fwdkinEVENT sbs moveq 20B 5EVENT sbs moveq 10B 5# **********************STATE 5 initHand1290.0 30.0COMMAND sbs on invkinCOMMAND sbs on trjcsvarCOMMAND sbs on dhmove 0.0 -0.244 0.0 0.0 0.0 -

1.047 0.0 0.0 0.0 -1.047 0.0 0.0 0.0 -1.047 0.0 0.0 50EVENT sbs dhmove 20B 6EVENT sbs dhmove 10B 6# **********************STATE 6 Preshape&Move1290.0100.0COMMAND sbs on dhmove -0.3559 0.2227 0.5317-

0.8383 0.0000 -1.4487 0.6566 0.2496 0.2610 -1.24570.6388 0.2391 0.0287 -0.5342 0.5396 0.1849 50

Page 185: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

CHIMERA CONFIGURATION FILES

Advanced Mechatronics Lab - Carnegie Mellon University 173

COMMAND sbs on gbptrajEVENT sbs gbptraj 20B 7EVENT sbs gbptraj 10B 7# **********************STATE 7 Grasp1 290.0 170.0COMMAND sbs on dhgrasp 21EVENT sbs dhgrasp 20B 8EVENT sbs dhgrasp 10B 8# **********************STATE 8 ViaMove1 290.0 240.0COMMAND sbs on gbptrajEVENT sbs gbptraj 20B 9EVENT sbs gbptraj 10B 9# **********************STATE 9 GuardedMove1290.0 310.0COMMAND sbs off invkinCOMMAND sbs off trjcsvarCOMMAND sbs on pumaxformCOMMAND sbs on ijacCOMMAND sbs on cartcntlCOMMAND sbs on aftCCOMMAND sbs on zguardEVENT sbs zguard 20B 10# **********************STATE 10 Ungrasp1 410.0 30.0COMMAND sbs on dhmove 0.0 -0.244 0.0 0.0 0.0 -

1.047 0.0 0.0 0.0 -1.047 0.0 0.0 0.0 -1.047 0.0 0.0 50EVENT sbs dhmove 20B 11EVENT sbs dhmove 10B 11# **********************STATE 11 initHand2410.0 100.0COMMAND sbs on dhmove 0.0 -0.244 0.0 0.0 0.0 -

1.047 0.0 0.0 0.0 -1.047 0.0 0.0 0.0 -1.047 0.0 0.0 50EVENT sbs dhmove 20B 12EVENT sbs dhmove 10B 12# **********************STATE 12 Preshape&Move2410.0170.0COMMAND sbs on dhmove -0.3765 0.1982 0.4703-

0.9265 0.0000 -1.3007 0.4221 0.1288 0.2520 -1.28990.4745 0.1527 0.0135 -0.6119 0.1484 0.0323 50

COMMAND sbs off pumaxformCOMMAND sbs off ijacCOMMAND sbs off cartcntlCOMMAND sbs off aftCCOMMAND sbs on invkinCOMMAND sbs on trjcsvarCOMMAND sbs on gbptrajEVENT sbs gbptraj 20B 13EVENT sbs gbptraj 10B 13# **********************

Page 186: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

CHIMERA CONFIGURATION FILES

174 Toward Gesture-Based Programming: Agent-Based Haptic Skill Acquisition and Interpretation

STATE 13 Grasp2 410.0 240.0COMMAND sbs on dhgrasp 21EVENT sbs dhgrasp 20B 14EVENT sbs dhgrasp 10B 14# **********************STATE 14 ViaMove2 410.0 310.0COMMAND sbs on gbptrajEVENT sbs gbptraj 20B 15EVENT sbs gbptraj 10B 15# **********************STATE 15 GuardedMove2530.0 30.0COMMAND sbs off invkinCOMMAND sbs off trjcsvarCOMMAND sbs on pumaxformCOMMAND sbs on ijacCOMMAND sbs on cartcntlCOMMAND sbs on aftCCOMMAND sbs on zguardEVENT sbs zguard 20B 16# **********************STATE 16 Ungrasp2 530.0 100.0COMMAND sbs on dhmove 0.0 -0.244 0.0 0.0 0.0 -

1.047 0.0 0.0 0.0 -1.047 0.0 0.0 0.0 -1.047 0.0 0.0 50EVENT sbs dhmove 20B 17EVENT sbs dhmove 10B 17# **********************STATE 17 ViaMove3 530.0 170.0COMMAND sbs off pumaxformCOMMAND sbs off ijacCOMMAND sbs off cartcntlCOMMAND sbs off aftCCOMMAND sbs on invkinCOMMAND sbs on trjcsvarCOMMAND sbs on gbptrajEVENT sbs gbptraj 20B 18EVENT sbs gbptraj 10B 18# **********************STATE 18 end 530.0 240.0COMMAND sbs on moveq umdhAstartEVENT sbs moveq 20B 3EVENT sbs moveq 10B 3

EOF

Page 187: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

CHIMERA CONFIGURATION FILES

Advanced Mechatronics Lab - Carnegie Mellon University 175

B.1.2 File: demo3.fsm

This is the finite state machine configuration file for controlling Morrow’s agent-levelprogramming interface for Chimera [Morrow, 1997] for the second demonstrationdescribed in Section 6.3. This program contains only one guarded move.

# **********************STATE 0 create 50.0 30.0COMMAND sbs spawn puma_pidgAcontrolCOMMAND sbs spawn grav_compsecondCOMMAND sbs spawn moveq secondCOMMAND sbs spawn invkin thirdCOMMAND sbs spawn fwdkin thirdCOMMAND sbs spawn umhand fourthCOMMAND sbs spawn dhmerge fourthCOMMAND sbs spawn dhmove fourthCOMMAND sbs spawn dhgrasp fourthCOMMAND sbs spawn trjcsvar secondCOMMAND sbs spawn gbptraj fourthCOMMAND sbs spawn pumaxformcontrolCOMMAND sbs spawn ijac fourthCOMMAND sbs spawn cartcntl thirdCOMMAND sbs spawn zguard thirdCOMMAND sbs spawn aftC fourth# **********************STATE 1 destroy 50.0 100.0COMMAND sbs kill puma_pidgACOMMAND sbs kill grav_compCOMMAND sbs kill moveqCOMMAND sbs kill invkinCOMMAND sbs kill fwdkinCOMMAND sbs kill trjcsvarCOMMAND sbs kill gbptrajCOMMAND sbs kill umhandCOMMAND sbs kill dhmoveCOMMAND sbs kill dhgraspCOMMAND sbs kill dhmergeCOMMAND sbs kill pumaxformCOMMAND sbs kill ijacCOMMAND sbs kill cartcntlCOMMAND sbs kill zguardCOMMAND sbs kill aftC# **********************STATE 2 reset 50.0 170.0COMMAND sbs reinit gbptraj# **********************STATE 3 halt 50.0 240.0COMMAND sbs off moveqCOMMAND sbs off trjcsvar

Page 188: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

CHIMERA CONFIGURATION FILES

176 Toward Gesture-Based Programming: Agent-Based Haptic Skill Acquisition and Interpretation

COMMAND sbs off gbptrajCOMMAND sbs off dhmerge# **********************STATE 4 start 170.0 30.0COMMAND sbs on puma_pidgACOMMAND sbs on grav_compCOMMAND sbs on moveq umdhAstartCOMMAND sbs on umhandCOMMAND sbs on dhmergeCOMMAND sbs on fwdkinEVENT sbs moveq 20B 5EVENT sbs moveq 10B 5# **********************STATE 5 initHand1290.0 30.0COMMAND sbs on invkinCOMMAND sbs on trjcsvarCOMMAND sbs on dhmove 0.0 -0.244 0.0 0.0 0.0 -

1.047 0.0 0.0 0.0 -1.047 0.0 0.0 0.0 -1.047 0.0 0.0 50EVENT sbs dhmove 20B 6EVENT sbs dhmove 10B 6# **********************STATE 6 Preshape&Move1290.0100.0COMMAND sbs on dhmove -0.5419 0.1859 0.6135-

0.6177 0.0000 -1.2161 0.6331 0.2359 0.3330 -0.89240.6753 0.2607 0.0203 -0.3632 0.2698 0.0692 50

COMMAND sbs on gbptrajEVENT sbs gbptraj 20B 7EVENT sbs gbptraj 10B 7# **********************STATE 7 Grasp1 290.0 170.0COMMAND sbs on dhgrasp 21EVENT sbs dhgrasp 20B 8EVENT sbs dhgrasp 10B 8# **********************STATE 8 ViaMove1 290.0 240.0COMMAND sbs on gbptrajEVENT sbs gbptraj 20B 9EVENT sbs gbptraj 10B 9# **********************STATE 9 Ungrasp1 290.0 310.0COMMAND sbs on dhmove 0.0 -0.244 0.0 0.0 0.0 -

1.047 0.0 0.0 0.0 -1.047 0.0 0.0 0.0 -1.047 0.0 0.0 50EVENT sbs dhmove 20B 10EVENT sbs dhmove 10B 10# **********************STATE 10 initHand2410.0 30.0COMMAND sbs on dhmove 0.0 -0.244 0.0 0.0 0.0 -

1.047 0.0 0.0 0.0 -1.047 0.0 0.0 0.0 -1.047 0.0 0.0 50EVENT sbs dhmove 20B 11EVENT sbs dhmove 10B 11

Page 189: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

CHIMERA CONFIGURATION FILES

Advanced Mechatronics Lab - Carnegie Mellon University 177

# **********************STATE 11 Preshape&Move2410.0100.0COMMAND sbs on dhmove -0.4386 0.2104 0.4294-

0.9706 0.0000 -1.4064 0.4456 0.1393 0.2430 -1.06910.4562 0.1442 0.0118 -0.5342 0.1349 0.0288 50

COMMAND sbs on gbptrajEVENT sbs gbptraj 20B 12EVENT sbs gbptraj 10B 12# **********************STATE 12 Grasp2 410.0 170.0COMMAND sbs on dhgrasp 21EVENT sbs dhgrasp 20B 13EVENT sbs dhgrasp 10B 13# **********************STATE 13 ViaMove2 410.0 240.0COMMAND sbs on gbptrajEVENT sbs gbptraj 20B 14EVENT sbs gbptraj 10B 14# **********************STATE 14 GuardedMove1410.0 310.0COMMAND sbs off invkinCOMMAND sbs off trjcsvarCOMMAND sbs on pumaxformCOMMAND sbs on ijacCOMMAND sbs on cartcntlCOMMAND sbs on aftCCOMMAND sbs on zguardEVENT sbs zguard 20B 15# **********************STATE 15 Ungrasp2 530.0 30.0COMMAND sbs on dhmove 0.0 -0.244 0.0 0.0 0.0 -

1.047 0.0 0.0 0.0 -1.047 0.0 0.0 0.0 -1.047 0.0 0.0 50EVENT sbs dhmove 20B 16EVENT sbs dhmove 10B 16# **********************STATE 16 ViaMove3 530.0 100.0COMMAND sbs off pumaxformCOMMAND sbs off ijacCOMMAND sbs off cartcntlCOMMAND sbs off aftCCOMMAND sbs on invkinCOMMAND sbs on trjcsvarCOMMAND sbs on gbptrajEVENT sbs gbptraj 20B 17EVENT sbs gbptraj 10B 17# **********************STATE 17 end 530.0 170.0COMMAND sbs on moveq umdhAstartEVENT sbs moveq 20B 3EVENT sbs moveq 10B 3

Page 190: Toward Gesture-Based Programming - Robotics Institute · alleled freedom within which I could pursue my own research objectives. I also owe much gratitude for the support I received

CHIMERA CONFIGURATION FILES

178 Toward Gesture-Based Programming: Agent-Based Haptic Skill Acquisition and Interpretation

EOF