Team IHMC’s Lessons Learned from the DARPA Robotics Challenge: Finding Data in the Rubble


Matthew Johnson, Brandon Shrewsbury, Sylvain Bertrand, Duncan Calvert, Tingfan Wu, Daniel Duran, Douglas Stephen, Nathan Mertins, John Carff, William Rifenburgh, Jesper Smith, Chris Schmidt-Wetekam, Davide Faconti, Alex Graber-Tilton, Nicolas Eyssette, Tobias Meier, Igor Kalkov, Travis Craig, Nick Payton, Stephen McCrory, Georg Wiedebach, Brooke Layton, Peter Neuhaus and Jerry Pratt

Received 9 October 2015; revised 22 April 2016; accepted 7 July 2016

This article presents a retrospective analysis of Team IHMC’s experience throughout the DARPA Robotics Challenge (DRC), where we took first or second place overall in each of the three phases. As an extremely demanding challenge typical of DARPA, the DRC required rapid research and development to push the boundaries of robotics and set a new benchmark for complex robotic behavior. We present how we addressed each of the eight tasks of the DRC and review our performance in the Finals. While the ambitious competition schedule limited extensive experimentation, we will review the data we collected during the approximately three years of our participation. We discuss some of the significant lessons learned that contributed to our success in the DRC. These include hardware lessons, software lessons, and human-robot integration lessons. We describe refinements to the coactive design methodology that helped our designers connect human–machine interaction theory to both implementation and empirical data. This approach helped our team focus our limited resources on the issues most critical to success. In addition to helping readers understand our experiences in developing on a Boston Dynamics Atlas robot for the DRC, we hope this article will provide insights that apply more widely to robotics development and design of human–machine systems. © 2016 Wiley Periodicals, Inc.

1. INTRODUCTION

In the heat of competition, it is often difficult to stop and analyze. The DARPA Robotics Challenge (DRC) was a particularly demanding competition that required advances in mobility, manipulation, and human interfaces — all of which had to be integrated into a single unified system that could perform eight significantly different tasks as one continuous operation in an outdoor and untethered environment with limited communications. Having successfully completed the challenge, Team IHMC can now look back and retrospectively analyze the nearly three-year journey and attempt to understand what factors contributed to our success and what lessons may be beneficial to future robotics work.

Competitive challenges like the DRC do not lend themselves well to traditional experimental research and are often at odds with such work. The work is fast paced, and the analysis is necessarily abbreviated. Agile development cycles of rapidly prototyped functionality were evaluated using the Atlas robot on mock-ups of the anticipated tasks to quickly separate viable solutions from less-promising alternatives. Competitions can be beneficial in that they spur innovation and force evaluation outside the control of the researcher. For the DRC, all teams had to leave the security of their labs and were required to compete outside on a course they were not fully privy to until the day before the contest. In addition, the competition imposed network restrictions and a surprise task, which were also withheld until the night before each day of the competition. This uncertainty forced a level of robustness and generality that would not otherwise be necessary for experimental research but which is certainly essential for robotics to move from laboratories to fielded systems in the real world. What is often sacrificed in competitions is the scientific rigor of experimental research. In this article, we strive to bring the competitive advances into balance with more traditional research by piecing together the fragments of data we have collected throughout the entire process to identify key elements of performance and provide empirical support for our lessons learned.

2. BACKGROUND

2.1. The Challenge Tasks

The primary goal of the DRC was to develop robots capable of assisting humans in responding to natural and man-made disasters. The robots were expected to use standard tools and equipment to accomplish the mission. The DRC comprised three separate competitions. The first event was called the Virtual Robotics Challenge (VRC), which


Figure 1. Images of the actual DRC Finals tasks (all images are pulled from DRC videos on the DARPA YouTube channel, https://www.youtube.com/user/DARPAtv/videos, accessed August 18, 2015).

was held in June 2013. The top performers1 in this phase were each provided an Atlas humanoid robot, made by Boston Dynamics Inc. (BDI), and funding to continue research and development for the next event. The second event was the DRC Trials, which was held in December 2013, at the Homestead-Miami Speedway in Florida. Sixteen teams competed at the trials: The top eight teams of the VRC used the provided Atlas robot, and the rest competed using a robotics platform that they purchased or built on their own. The top teams of the trials were awarded funding to compete in the finals.2 The DRC culminated in the final competition held in June 2015 at the Pomona Fairplex in California. Twenty-three teams from around the world competed for $3.5 million in prize money. Teams had to have their robot drive through an obstacle course, get out of the car and enter a building through a door, turn a valve, cut a designated hole in a wall using a power tool, perform a surprise manipulation task, either walk over rubble or through debris, and then climb some stairs (see Figure 1). Communications were degraded by latency and periodic communications blackouts as soon as the robot entered the building. More information about the DRC can be found at www.theroboticschallenge.org.

1 http://spectrum.ieee.org/automaton/robotics/humanoids/darpa-vrc-challenge-results-heres-who-gets-an-atlas-humanoid (accessed September 28, 2015).
2 http://spectrum.ieee.org/automaton/robotics/humanoids/darpa-robotics-challenge-trials-results (accessed September 28, 2015).

2.2. The Robot

We used the upgraded version of Atlas, a humanoid robot from BDI (Figure 2, left). This version was 75% different3 from the version used during the DRC Trials. It had an onboard battery pack that permitted untethered operation for over an hour. This version of Atlas is a hydraulically powered humanoid robot that weighs 175 kg and stands 1.9 m tall. A Carnegie Robotics MultiSense-SL head (Figure 2, right) provides two forward-facing cameras, an axially rotating Hokuyo light detection and ranging (LIDAR) sensor, and two wide-angle cameras intended to compensate for the robot not being able to yaw its head. Our robot used Robotiq 3-Finger grippers for hands.4 Detailed information about our system architecture and control algorithms can be found in previous publications (Johnson et al., 2015; Koolen et al., 2013).

3. PERFORMANCE REVIEW

IHMC placed first in the VRC, second in the trials, and second in the finals. Table I shows an unofficial compilation of the results across all phases of the competition. All told, approximately 42 international teams competed.5 The top teams in each phase advanced and were

3 https://www.youtube.com/watch?v=27HkxMo6qK0 (accessed September 25, 2015).
4 http://robotiq.com/products/industrial-robot-hand (accessed September 25, 2015).
5 Some teams combined in later phases, making the exact count difficult to assess.


Figure 2. Atlas robot (left) provided by Boston Dynamics and a close-up of the Carnegie Robotics MultiSense-SL head (right).

Table I. Unofficial scores and rankings across all DRC competitions.

VRC | | Trials | | Finals Day 1 | | | Finals Day 2 | | | Overall Final
Team | Score | Team | Score | Team | Score | Time | Team | Score | Time |
IHMC | 52 | SHAFT | 27 | TARTAN | 8 | 55:15 | KAIST | 8 | 44:28 | KAIST
WPI (WRECS) | 39 | IHMC | 20 | Nimbro | 7 | 34:00 | IHMC | 8 | 50:26 | IHMC
MIT | 34 | TARTAN | 18 | JPL | 7 | 47:59 | WPI-CMU | 7 | 56:06 | TARTAN
TRACLABS | 30 | MIT | 16 | MIT | 7 | 50:25 | JPL | 6 | 54:45 | Nimbro
JPL | 29 | JPL | 14 | IHMC | 7 | 56:04 | MIT | 6 | 58:01 | JPL
TORC/Va.Tech. | 27 | TRACLABS | 11 | KAIST | 7 | 58:21 | TARTAN | 5 | 37:14 | MIT
TEAM K | 25 | WPI (WRECS) | 11 | WPI-CMU | 7 | 59:35 | Nimbro | 5 | 50:24 | WPI-CMU
TROOPER | 24 | TROOPER | 9 | UNLV-HUBO | 6 | 57:41 | AIST NEDO | 5 | 52:30 | UNLV-HUBO
Case | 23 | THOR | 8 | TRACLABS | 5 | 49:00 | UNLV-HUBO | 4 | 05:14 | TRACLABS
CMU-Steel | 22 | ViGIR | 8 | SNU | 3 | 25:20 | NEDO JSK | 4 | 58:39 | AIST NEDO
Anonymous | 21 | KAIST | 8 | THOR | 3 | 47:23 | SNU | 4 | 59:33 | NEDO JSK
Br Robotics | 16 | HKU | 3 | ViGIR | 3 | 48:49 | THOR | 3 | 27:47 | SNU
OU | 9 | UNLV | 3 | Robotis | 2 | 19:18 | HRP2 Tokyo | 3 | 30:06 | THOR
ROBIL | 8 | CHIRON | 0 | NEDO JSK | 2 | 23:C2 | Robotis | 3 | 30:23 | HRP2
Robotnicy | 7 | JSC | 0 | Walkman | 2 | 36:35 | TROOPER | 2 | 42:32 | Robotis
RE2 | 6 | Mojavaton | 0 | TROOPER | 2 | 56:04 | ViGIR | 2 | 49:52 | ViGIR
SARBOT | 6 | | | AIST NEDO | 1 | 01:41 | Walkman | 1 | 04:24 | Walkman
Washington | 4 | | | Hector | 1 | 02:44 | Hector | 1 | 17:55 | TROOPER
Mimesis | 4 | | | HRP2 Tokyo | 1 | 16:22 | TRACLABS | 1 | 28:59 | Hector
Gold | 3 | | | HKU | 0 | 00:00 | HKU | 0 | 00:00 | HKU
Intel. Tech. | 2 | | | Aero | 0 | 00:00 | Aero | 0 | 00:00 |
King’s College | 1 | | | Grit | 0 | 00:00 | Grit | 0 | 00:00 |
| | | | Valor | 0 | 00:00 | Valor | 0 | 00:00 |

joined by several new teams in each competition. Table I shows the performance of teams competing in multiple phases.

It is difficult to compare our own team’s performance across the three phases because of task differences, but Figure 3 shows our individual task performance from the DRC Trials, our practice leading up to the finals, and the two runs performed during the final competition. The VRC tasks were too different to permit comparison. For the trials, the drill task was very similar, but other tasks were not exactly the same; however, we included aspects we felt could be compared. For example, we used the time it took to turn one valve, to get through one door, and to cross the last section of the terrain. We were approximately twice as fast for the finals. In addition to improving those tasks, we needed to develop the capability to drive, egress the vehicle, climb stairs, and handle a surprise task for the finals. During the month prior to the finals, we performed several mock finals under circumstances designed to represent what we expected during the finals. Figure 3 shows that our practice capability was very close to our final performance levels. Day 2 was a bit slower than Day 1 mainly because of additional caution in


Figure 3. DRC task time from the trials in 2013, the practice leading up to the finals, and the two different runs performed during the finals.

response to our falls on Day 1, particularly for the terrain task. The surprise task on Day 2 was more difficult than Day 1 and took around twice as long (an additional 3:42 min).6

Our pre-finals practice, which we called mock finals, was not normal development and testing. For the mock finals, we would stop development and emulate the finals conditions as accurately as possible (e.g., remote operation, communications limitations, noise). The reliability results for task completion are shown in Figure 4. We varied the surprise task for each mock trial to challenge the operator. Example surprise tasks included pulling a shower handle, flipping a large power switch, disconnecting and connecting a hose, opening a box, and pushing a button. The number in each bar is the number of attempts. Not all tasks were attempted in a mock final because of development progress and hardware readiness.

We needed to increase task speed because we had 30 min per task during the trials, but in the finals, we would only have an hour for all tasks. Although we did speed up robot motion, this was only one part of the answer to increasing speed. In the trials, approximately 20% of our operation time was robot motion. This is similar to results of 26% motion reported by MIT (Fallon et al., 2015). The rest of the time the operator tried to assess the situation, decide on what to do, make a plan, and convey that plan to the robot. For the finals, we increased our percentage of time

6 On Day 1, we fell on the terrain but were able to recover and complete the task. We pulled the fall and recovery out of the Day 1 terrain time for more direct comparison. This extra time is labeled “Fall & Recovery.” Also, the stair task on Day 1 was not successful, so it is not an accurate comparison.

in which the robot was in motion to 35%, which was a 75% increase over our percentage during the DRC Trials. Our robot operation is still disproportionally idle, as shown by our Day 2 data in Figure 5. Driving has no idle time because our approach was aided teleoperation. The rest of the tasks have significant idle time. Indoor tasks had the additional DARPA-imposed burden of limited bandwidth and data delays. The drill task had the most idle time. This task was quite complex and involved fine manipulation and a fair amount of judgment from the operator to ensure success.

During our first run of the finals, we fell twice. The first fall occurred as we crossed the terrain obstacle. It is easy to look at the fall and say it was “human error” or “machine error”; however, a deeper analysis will reveal that the issue is rarely so clear-cut. Our design methodology is based on the interdependence between the human and the machine, and we have discussed the importance of designing the algorithms and interfaces to support that interdependence (Johnson, Bradshaw, Hoffman, Feltovich, & Woods, 2014; Johnson et al., 2015). Human error certainly played a role, because it was a human operator that commanded the bad step—a decision that was confirmed by other observers as well. A contributing factor is that we expected much more challenging terrain. This may seem like a strange factor, but our anticipation of difficulty made us overconfident with what we perceived to be a “much easier” task. However, we knew early on that the responsibility for maintaining balance while walking needed to rest with the robot (Johnson et al., 2014) because communication limitations would limit human intervention. Therefore, we could assign blame to the walking algorithm for not handling the command. This is also not the full story, because we knew we could command steps that were unachievable. In fact, our interface alerts us if we have steps that are out of reach, but it is only reliable for flat ground. For this reason, we could say that the interface should have prevented commanding a bad step. We actually implemented this early in the competition, but operators found it too conservative and it was disabled. In addition, it was difficult to develop an algorithm that would properly handle complex situations, which is when it is most needed. Even if we don’t prevent the step, we could have at least alerted the operator that the step was risky. As such, we could say the interface provided insufficient predictability to see that the step was bad. Each of these factors—man, machine, and interface—played a role in the fall. As we analyze failures, we need to ensure we are diligent in considering the interdependencies throughout the system in order to design better systems in the future.
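To make the flat-ground limitation concrete, the sketch below shows the kind of naive reachability check described above: a step is validated purely by its horizontal offset from the stance foot, so it says nothing about steps onto raised or angled terrain. The class name and thresholds are illustrative assumptions, not our actual implementation.

```java
/**
 * Naive footstep reachability warning of the kind described above. Hypothetical sketch:
 * names and limits are illustrative, not IHMC's actual code.
 */
public final class FlatGroundStepChecker
{
   private static final double MAX_STEP_LENGTH = 0.5;  // meters, assumed forward/backward limit
   private static final double MIN_STEP_WIDTH = 0.15;  // meters, assumed minimum lateral spacing
   private static final double MAX_STEP_WIDTH = 0.4;   // meters, assumed maximum lateral spacing

   /**
    * Checks a candidate step expressed in the stance-foot frame (x forward, y toward the
    * swing side). Because height is ignored, the check is only trustworthy on flat ground,
    * which is exactly the limitation discussed above.
    */
   public boolean isStepReachable(double stepForward, double stepToSide)
   {
      boolean lengthOk = Math.abs(stepForward) <= MAX_STEP_LENGTH;
      boolean widthOk = stepToSide >= MIN_STEP_WIDTH && stepToSide <= MAX_STEP_WIDTH;
      return lengthOk && widthOk;
   }
}
```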

The second fall occurred on the stairs. This case is much simpler to analyze. We were close to the time limit and rushing to complete the last task. Our interface clearly indicated that we had not stepped accurately (the red bar misaligned with the step in Figure 8), so it did its job. In fact, we had the same issue on Day 2, but we approached with more caution and restepped until the interface indicated successful


Figure 4. Task reliability during our mock finals practice. The number in each bar represents the number of attempts. (Because of development progress and hardware readiness, we were not able to do all tasks each mock trial.)

Figure 5. Amount of time robot was actively moving versus when it was idle during the Day 2 run in which we scored 8 points.

alignment. However, on Day 1, we were almost out of time. The controller was struggling to maintain balance, and there was clear oscillation in the robot, indicating we were on the edge of stability. Although wobbly, the controller kept the robot standing, and it was not until we commanded another step that we fell. Both the interface and the controller were telling us there was a problem. Faced with the final moments of allowable time, we chose to ignore these indications and take our chances in an attempt to beat the clock. The result was our second fall.

4. WHAT CHANGED FROM TRIALS TO FINALS

The major changes for the finals included adding the capability to drive, to egress the vehicle, and to climb stairs. This included both algorithm and interface development. We also needed to support the surprise task, but our testing indicated our interface and algorithms used during the trials were sufficient to handle a wide variety of potential surprise tasks. Our general approach and architecture remained unchanged for the finals. We did improve our algorithms, as described in Section 6.2.1. We also needed to adapt to both a new version of Atlas and DARPA’s new communication constraints for the finals.

Our interface remained the same as we moved toward the finals, though we needed to make a few additions. First, we needed to support driving. The main addition required was observability and predictability about the driving path (Figure 6). We provided a predictive path based on a simple turning circle (turning circle radius = track/2 + wheelbase/sin(average steer angle)) and a steering ratio of 14 determined by empirical testing with the Polaris vehicle. We provided two paths: a green path based on the operator-commanded steering angle and a red path based on the current steering angle. This was needed to help the operator account for latency in the robot’s arm turning the


Figure 6. Operator interface showing predictive indicators for driving. Green is the operator’s commanded turn rate, and red is the actual turn rate. The difference between the two is due to latency as the robot arm physically turns the steering wheel.

steering wheel. Our interface was simply the arrow keys on the keyboard, much like video games in the 1980s—simple but effective.
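For reference, the sketch below generates a predictive arc of the kind shown in Figure 6 from a commanded steering-wheel angle, using the turning-circle formula above and the empirically determined steering ratio of 14. The class name, vehicle dimensions, and sampling scheme are illustrative assumptions rather than our actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the predictive driving-path computation described above (illustrative values). */
public final class DrivePathPredictor
{
   private static final double STEERING_RATIO = 14.0; // steering-wheel angle : road-wheel angle (from the text)
   private static final double TRACK = 1.3;           // meters, assumed for the Polaris
   private static final double WHEELBASE = 1.9;       // meters, assumed

   /** Sample points along the predicted arc in the vehicle frame (x forward, y left). */
   public static List<double[]> predictedArc(double steeringWheelAngleRad, double arcLength, int samples)
   {
      double roadWheelAngle = steeringWheelAngleRad / STEERING_RATIO;
      boolean straight = Math.abs(roadWheelAngle) < 1e-3;
      // Turning circle radius = track/2 + wheelbase / sin(average steer angle), as in the text.
      double radius = straight ? 0.0 : TRACK / 2.0 + WHEELBASE / Math.sin(roadWheelAngle);

      List<double[]> points = new ArrayList<>();
      for (int i = 1; i <= samples; i++)
      {
         double s = arcLength * i / samples; // distance traveled along the path
         if (straight)
            points.add(new double[] {s, 0.0});
         else
            points.add(new double[] {radius * Math.sin(s / radius), radius * (1.0 - Math.cos(s / radius))});
      }
      return points;
   }
}
```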

Other changes involved supporting more observability, predictability, and directability for the operator. This involved developing behavior generation tools, discussed in Section 6.3.3. It also involved developing an instantaneous capture point (ICP) display (Figure 7) and a warning that alerted the operator, visually and with an audible tone, of unintentional contact with the environment or when pushing or pulling forces were reaching control authority limits. The ICP is based on the center of mass and the center of mass velocity. The distance between the ICP and the edge of the support polygon provides a good indication of stability (Koolen, de Boer, Rebula, Goswami, & Pratt, 2012). We also added some visual aids to help the operator ensure the feet were properly aligned as the robot ascended the stairs (Figure 8). These visual aids automatically display based on context.
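For readers unfamiliar with the quantity, the ICP has a simple closed form in the linear inverted pendulum model used in the capturability literature (Koolen et al., 2012); we state it here for reference:

```latex
\xi = x_{\mathrm{CoM}} + \frac{\dot{x}_{\mathrm{CoM}}}{\omega_0},
\qquad
\omega_0 = \sqrt{\frac{g}{z_{\mathrm{CoM}}}}
```

The operator-facing warning described above is driven by the margin between the ICP and the support polygon boundary: the smaller that margin, the closer the robot is to needing a step (or external support) to avoid falling.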

5. HOW WE TACKLED THE DARPA ROBOTICS CHALLENGE

There were eight major tasks for the DRC, as described in Section 2. These needed to be completed sequentially, for the most part, although some variability in order was permitted for the “indoor” tasks. Our approach was to develop a single system capable of handling the variety of tasks required by the DRC.

Our approach to driving was to teleoperate the robot aided by an intuitive interface. We provided the operator simple keyboard controls common to older computer games (AWSD or arrow keys, and the space bar). Our algorithms handled the transformation of these commands to achieve manipulation of the steering wheel and gas pedal. We had a hydraulic gas pedal repeater that permitted our robot to sit in the passenger side of the vehicle, because Atlas was too big to sit behind the wheel and effectively drive. We developed interface elements (Figure 6), as described in Section 4, that provided observability and predictability of both the car and the robot. This simple approach was easy for several people in our lab to use—not to mention fun—and highly effective. Our operator could easily drive as fast as DARPA would permit.

Egress was quite challenging. It took several weeks of trial and error to figure out how to sit in the car, how to stand up, and how to get out of the car. We anticipated using our arms to pull ourselves up, but in the end, it was mainly body and leg positioning to stand and our normal walking algorithm once standing. The egress sequence was scripted but consistent with our overall coactive design approach. We provided the operator observability, predictability, and directability throughout the egress process to deal with the variability of each egress.

All of the “indoor” tasks (valve, wall, and surprise) as well as the door task were accomplished using our interactive tool paradigm. Our original approach was to work toward an automated script for each behavior, but we knew from experience that scripting is a brittle way to operate. We next developed interface elements to support interdependence and leverage the operator to mitigate algorithmic brittleness. The next stage of development was to


Figure 7. ICP indicator during the drill task of the finals showing good tracking (left) and bad tracking (right). Bad tracking occurs when there are large unexpected contact forces between the drill and the wall during cutting.

Figure 8. Image from Atlas during Day 2. The virtual red bar is displayed at the proper foot position to indicate the part of the foot that should be aligned with the edge of the step. The misalignment between the stair edge and the bar tells the operator that the foot is not sufficiently on the stair and that it is poorly aligned. We fell on Day 1, but on Day 2, we restepped several times to obtain proper alignment and were successful.

generalize the problem and design interactive tools for each task. These tools enabled the operator to command the robot via interaction with the three-dimensional (3D) graphic tool rather than specifying robot motion or joint positions directly. An example of one of these tools is shown in Figure 7, and further discussion can be found in Sections 4 and 6.3.3. These tools evolved over the project to add capability and flexibility.

We opted to walk over the rubble for the seventh task, and both the rubble and the stairs used the same approach. With DARPA-imposed latency, operator intervention in low-level walking was not possible, so reliable walking is critical. We relied heavily on our capturability-based walking algorithm discussed in Section 6.2.1. Guidance of this algorithm still depended on the operator. The operator used our interactive walking tool that allowed specification of a goal location, walking direction, and a variety of other options. In addition, the operator could adjust individual footsteps if desired. We had a terrain-snapping algorithm that used least squares–based plane fitting on a subset of the LIDAR point cloud located around the footstep center. This algorithm was used to adjust the footstep height, pitch, and roll to conform to the ground. It was imperfect, and the rubble task required operator adjustment for correct footstep location determination. The stairs task was similar, except the automated footstep height was more reliable, and we added visual cues for verifying accurate footstep placement (see Figure 8). The cue was a virtual red bar attached to the foot. This indicator let the operator quickly assess whether the foot was sufficiently on the step and whether it was adequately aligned with the step. Our walking algorithm also needed to support partial footholds because the need to bend the forward-facing knee limited how much of the foot could be placed on the step and still permit a follow-on step without the support shin hitting the upcoming step. Coactive design guided development of our walking interface tool that permitted our operator to effectively employ our capturability-based walking algorithm.
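The sketch below illustrates the terrain-snapping idea in its simplest form: fit a plane z = ax + by + c to the LIDAR points around the footstep center by least squares, then read the footstep height, pitch, and roll from the fitted plane. Class and method names, the solver, and the sign conventions are illustrative assumptions, not our actual implementation.

```java
/**
 * Sketch of a terrain-snapping step: fit a plane z = a*x + b*y + c to LIDAR points near the
 * footstep center (least squares), then read off height, pitch, and roll for the footstep.
 * Illustrative only; not IHMC's actual implementation.
 */
public final class FootstepTerrainSnapper
{
   /** points: N x 3 array of (x, y, z) LIDAR points around the footstep center. */
   public static double[] fitPlane(double[][] points)
   {
      // Accumulate the normal equations A^T A w = A^T z for w = (a, b, c).
      double sxx = 0, sxy = 0, sx = 0, syy = 0, sy = 0, sxz = 0, syz = 0, sz = 0;
      double n = points.length;
      for (double[] p : points)
      {
         double x = p[0], y = p[1], z = p[2];
         sxx += x * x; sxy += x * y; sx += x;
         syy += y * y; sy += y;
         sxz += x * z; syz += y * z; sz += z;
      }
      double[][] ata = {{sxx, sxy, sx}, {sxy, syy, sy}, {sx, sy, n}};
      double[] atz = {sxz, syz, sz};
      return solve3x3(ata, atz); // returns (a, b, c)
   }

   /** Snap a footstep at (x, y): returns {height, pitch, roll} read from the fitted plane. */
   public static double[] snapFootstep(double x, double y, double[][] nearbyPoints)
   {
      double[] plane = fitPlane(nearbyPoints);
      double a = plane[0], b = plane[1], c = plane[2];
      double height = a * x + b * y + c;
      double pitch = Math.atan(a);  // slope along x; sign convention depends on the chosen frame
      double roll = -Math.atan(b);  // slope along y; sign convention depends on the chosen frame
      return new double[] {height, pitch, roll};
   }

   /** Solve a 3x3 linear system by Cramer's rule (adequate for this illustration). */
   private static double[] solve3x3(double[][] m, double[] v)
   {
      double det = det3(m);
      double[] out = new double[3];
      for (int i = 0; i < 3; i++)
      {
         double[][] mi = {{m[0][0], m[0][1], m[0][2]}, {m[1][0], m[1][1], m[1][2]}, {m[2][0], m[2][1], m[2][2]}};
         for (int r = 0; r < 3; r++) mi[r][i] = v[r];
         out[i] = det3(mi) / det;
      }
      return out;
   }

   private static double det3(double[][] m)
   {
      return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
           - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
           + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
   }
}
```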

Our team used very little automated perception and no mapping. The only perception we used was to align the car to the robot automatically. We had algorithms to recognize the valve, drill, and stairs, but none were reliable enough or accurate enough to use during the competition. In addition, these simple tasks were trivial for the operator and took an almost insignificant amount of time. Mapping was not necessary, and even localization ended up being unnecessary for the competition (see Section 6.5.1).

It should be clear that our approach involved both the human operator and the autonomous capabilities of the robot. This was not a surprise to our team. In fact, it was a guiding principle from the beginning. The coactive design method (Johnson et al., 2014) guided the design of the autonomous capabilities and the interface elements to work with the operator. This approach combined with reliable


bipedal walking on rough terrain were the cornerstones of our approach.

6. LESSONS LEARNED

As a follow-up to a previous anecdotal report of lessons learned in earlier phases of the DRC (Johnson et al., 2015), we elaborate and refine these lessons based on experiences in the last 18 months of the project—this time with data to support them. The lessons are grouped into the following categories: hardware, robot control, human–robot integration, software practices, and design choices. Remote humanoid robot operation for disaster response is a very broad task, and, as such, our lessons cover a wide variety of topics. Some lessons are validations of well-known principles, like those about real-time control and software practices. Other lessons address aspects of robotics often overlooked, such as the importance of the operator. The highlights of the lessons learned, in no particular order, are as follows:

• Robot uptime is essential
• Battery power is surprisingly effective for large hydraulic robots for up to 1.5 hr
• Robot hand capability is still limited for heavy work
• Capturability-based control was key to our walking success
• Real-time control is critical for bipedal robots
• Don’t ignore the operator when building autonomous systems
• Coactive design helped identify critical operator–robot interdependence
• Dynamic behavior generation interface tools are an effective strategy for leveraging autonomous capabilities
• Solid software practices are worth the cost
• Sometimes optimality is not needed and sufficiency will do
• Instead of focusing on how a robot can do a task for a human, consider how each can benefit the other (combine to succeed instead of divide and conquer)

6.1. Hardware Lessons

6.1.1. Solid, Reliable Hardware Is Essential

Figure 9 shows our hardware readiness for each phase of the competition. Green, yellow, red, and blue indicate full functionality, partial functionality, nonfunctionality, and nonavailability, respectively. There were many reasons for a partially functional status. Atlas V4 had some hydraulic leak issues initially. The new arms were also more prone to breaking because of gearbox limitations. The hands were also frequently broken.

For the VRC, the simulation environment was made available in October 2012, but the task environments were not available until March.

It was about two months after the VRC when we received the Atlas robot from BDI. We intensively tested and developed on the Atlas for a period of five months with few maintenance issues and a remarkable 92% uptime. Continuous uptime allowed us to make rapid progress.

After the trials, we continued using the same version of Atlas (V3) until it was returned to BDI in November for an upgrade. Since the finals required untethered operation, BDI needed to modify Atlas for onboard power. They also made other changes, such as upgrading the arms to be hydraulic/electric and adding an additional degree of freedom in the wrist. The new version of Atlas (V4) was delivered back to IHMC in mid-February 2015 and was 75% new. In addition, there were some growing pains as BDI worked out some issues with the new version of Atlas. Overall, BDI made a tremendous effort to provide a solid, reliable robot. Though we had much less uptime on the upgraded version used during the finals, their efforts ensured that our Atlas could not only work for both runs of the finals but also survive two falls and still manage to put up a second-place run afterward.

6.1.2. Battery Performance

Our hydraulically powered Atlas was powered using a tether for our development, but for the competition, it needed to work untethered. BDI provided a battery (3.7 kWh Li-ion, 165 VDC, and 60 lbs), but we did not get to try it until we arrived at the competition. We had one 2-hr session to test with the battery before competing. Our experience surprised us. We saw no noticeable performance difference between tethered and battery-powered operation. This is based on only three usages, but all three were consistent. We were able to get about one and a half hours of operation during our test session. Each of the two final runs was under an hour, though slightly longer if you consider the setup and wait time prior to each run. Not only was the power consistent with tethered operation, but the weight estimates we had been using for center of gravity calculations were quite accurate. Overall, we were extremely pleased, and even a bit surprised, at how effectively battery operation performed.
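As a rough sanity check on those figures (assuming essentially the full nominal pack capacity was usable), the observed runtime implies an average electrical draw on the order of a few kilowatts:

```latex
\bar{P} \;\approx\; \frac{3.7\ \mathrm{kWh}}{1.5\ \mathrm{h}} \;\approx\; 2.5\ \mathrm{kW}
```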

6.1.3. There Is a Need for Effective Hand Technology for Field Robotics

Effective hand technology continued to be an issue for our team. For the trials, we used the iRobot hand and ended up using a basic hook for many tasks because of hardware reliability and robustness. For the finals, we switched to the Robotiq 3-Finger hand because we expected the need for greater strength in the hands to support tasks such as vehicle egress and stair climbing. As we developed techniques for these tasks, it became clear that even the Robotiq hands


Figure 9. Hardware readiness for each of the DRC phases. Green indicates a fully functioning robot or simulation. Yellow indicates partial functionality. Red indicates nonfunctional, and blue indicates nonavailable. Version numbers (V3, V4, V5) are Boston Dynamics release numbers. The table at the bottom shows the duration of each phase, the percentage of uptime, and the robot hours for each phase.

would not have the strength and reliability for these tasks. We experienced numerous hand failures that contributed to our overall downtime, because we could only work on nonmanipulation tasks until the hand was repaired. It also diluted our manpower because our team had to develop improvements and perform the repairs themselves. We ended up developing techniques that minimized the use of and risk to the hands. In addition, the hands were not effective at fine manipulation such as pushing the recessed button on the rotary tool.7 We developed a 3D printed adapter that we applied to the hand to aid in turning on the tool. It seems clear that developing hands that can provide fine-grained manipulation capability, while being rugged enough and strong enough to work on robots like the Atlas, still remains an open challenge.

7 Dewalt rotary tool used in competition: http://www.dewalt.com/tools/cordless-specialty-cordless-cut-out-tools-dc550ka.aspx (accessed on September 18, 2014).

6.2. Robot Control Lessons

6.2.1. Capturability-Based Analysis and Control Is an Effective Approach to Walking

The walking challenge for the finals was actually simplified compared to the trials. In fact, competitors had a choice of either walking over the terrain or going through some debris. For the trials, only 2 of the 16 teams successfully completed the entire terrain task without requiring an intervention. For the finals, only 9 of the 22 teams reached that terrain, and, of those, only 4 chose to attempt the terrain instead of debris.

Inside of our lab, we ran mock trials to simulate what we anticipated for the finals. Our terrain task was 34% higher and 40% longer than the finals course (Figure 10).8 It was also more complex and had no flat areas. During the last weeks of testing, we ran 13 mock trials with three

8 A video of a mock final test: https://www.youtube.com/watch?v=EpE6MhaxwOM (accessed September 25, 2015).


Figure 10. Atlas robot tackling IHMC’s mock final terrain. It was 34% higher and 40% longer than the finals course. It was also more complex and had no flat areas.

different operators under conditions that we expected during the finals, including communications limitations. As shown in Figure 4, we had a 100% success rate for these mock trial runs.

Figure 11. Overhead view of CoM and ICP trajectories during the final stage of car egress. Each swing state takes about 1.2 s, and each transfer state takes about 0.8 s. The walking gait being rather conservative, the ICP and CoM paths are highly similar.

For the finals, we used the same capturability-based control scheme (Johnson et al., 2015) to plan and control the momentum of the center of mass (CoM) under the same assumption of the CoM being at a constant height. The main improvements were in streamlining the implementation and tuning the controller parameters to improve overall performance.
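For context, under the constant-height assumption the ICP obeys a simple first-order dynamic driven by the center of pressure, and a proportional ICP tracking law of the standard form found in the capturability literature (Koolen et al., 2012) can be written as follows. This is shown for reference only; the controller we actually used differs in detail.

```latex
\dot{\xi} = \omega_0\,(\xi - r_{\mathrm{CoP}}),
\qquad
r_{\mathrm{CoP}} = \xi - \frac{1}{\omega_0}\Big(\dot{\xi}_{\mathrm{ref}} + k\,(\xi_{\mathrm{ref}} - \xi)\Big)
```

Substituting the second expression into the first gives \(\dot{\xi} = \dot{\xi}_{\mathrm{ref}} + k\,(\xi_{\mathrm{ref}} - \xi)\), so the ICP tracking error decays exponentially as long as the commanded center of pressure remains inside the support polygon.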

Figure 11 through Figure 18 show the overhead view of CoM and ICP trajectories and tracking performance on the

Figure 12. State evolution, ICP and CoM trajectories, and ICP tracking errors during the final stage of car egress.


Figure 13. Overhead view of CoM and ICP trajectories when going through the door. Each swing state takes about 1.2 s, and each transfer state takes about 0.8 s.

ICP during the car egress, the door, the terrain, and the stairs tasks of our second official run. During the DRC Finals, the walking gait was conservative, taking approximately 2 s per step. This is about four times slower than a fast human-walking pace. The robot was one-step capturable (Koolen et al., 2012) for 25% of the swing time and zero-step capturable the rest of the time. Although we were able to achieve faster walking gaits, we observed that faster gaits are less robust

Figure 15. Overhead view of CoM and ICP trajectories during the terrain task on Day 2 of the DRC Finals. Each swing state takes about 1.2 s, and each transfer state takes about 0.8 s. The walking gait being rather conservative, the ICP and CoM paths are highly similar.

to external disturbances, especially to inaccurate footsteps planned using an inaccurate height map of the environment.

Besides improving the feedback control on the capture point, we implemented additional safety features based on the capture point to prevent a fall due to larger disturbances.

Figure 14. State evolution, ICP and CoM trajectories, and ICP tracking errors when going through the door.


Figure 16. State evolution, ICP and CoM trajectories, and ICP tracking errors during the terrain task.

Figure 17. Overhead view of CoM and ICP trajectories during the stairs task. Each swing state takes about 1.2 s, and each transfer state takes about 0.8 s. Note the partial footholds for the right foot when stepping onto the next step of the stairs.

a. The capture point information is fed back to the operator both visually and through an audible beep that increases in frequency as the ICP error increases. Therefore, the operator can evaluate the stability status of the robot. Accordingly, any command that is estimated to be unsafe can be manually aborted. When there is no audible tone, the operator knows that the robot is in a safe state to pursue the next action.

b. Based on the capture point control error, an automatic manipulation abort procedure was implemented. When the tracking error on the ICP passes a certain threshold (we used 4 cm), both arms are commanded to hold their current configuration immediately, canceling any command they were performing. The operator is notified that the current arm motion is aborted, and any new input is ignored for a short time. This safety feature permitted the robot to avoid a few potential falls, especially during manipulation tasks. When the robot hit an object during a manipulation task, the ICP would start moving as soon as the CoM velocity increased, making the automatic abort very responsive.

c. The robot could be commanded to stay in single support with one foot in the air. The operator can then move the foot that is in the air or put it back on the ground as needed. This command was used during the car egress. In a static single-support state, the robot is highly


Figure 18. State evolution, ICP and CoM trajectories, and ICP tracking errors for the stairs task.

sensitive to external disturbances, as the support polygon is reduced to the single foot on the ground. Therefore, we implemented an automatic recovery from single support. If the ICP leaves the support foot polygon and is inside the convex hull of both feet, the foot in the air was quickly moved toward the ground. As soon as touchdown was detected, the robot entered the double support state, extending the support polygon and making the robot more likely to recover.

d. Finally, to improve robustness during walking, we enabled changes in swing time. Because footstep accuracy can be critical in clutter, we decided that only swing time, and not swing position, could be modulated to recover from disturbances. When in single support, if the ICP was ahead of the ICP plan and the error was greater than a certain threshold (we used 5 cm), the swing time remaining was estimated based on the current ICP location. The swing trajectory was then sped up using the estimated swing time remaining. This safety feature was really useful during our second run at the DRC Finals. Because of the two falls from the day before, the robot had incurred some physical damage to its structure, making walking less robust. During the second run, seven swings had to be sped up to prevent potential falls. (A simplified sketch of the checks in items b and d follows this list.)
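The sketch below condenses the two ICP-error-based checks from items b and d into code. The thresholds (4 cm and 5 cm) come from the text; the class, the method names, and the way the adjusted swing time is chosen are illustrative assumptions rather than our actual implementation.

```java
/**
 * Sketch of the ICP-error-based safety checks described in items (b) and (d) above.
 * Thresholds come from the text; everything else is illustrative only.
 */
public final class CapturePointSafetyMonitor
{
   private static final double MANIPULATION_ABORT_THRESHOLD = 0.04; // meters (item b)
   private static final double SWING_SPEED_UP_THRESHOLD = 0.05;     // meters (item d)

   /** Item (b): freeze the arms if the ICP tracking error grows too large during manipulation. */
   public boolean shouldAbortManipulation(double icpTrackingError)
   {
      return icpTrackingError > MANIPULATION_ABORT_THRESHOLD;
   }

   /**
    * Item (d): if the measured ICP is ahead of the plan by more than the threshold while in
    * single support, re-estimate the swing time remaining from the current ICP location and
    * speed up the swing accordingly. Returns the swing time remaining to use.
    */
   public double adjustedSwingTimeRemaining(double icpErrorAlongPlan, double nominalTimeRemaining,
                                            double estimatedTimeRemainingFromIcp)
   {
      if (icpErrorAlongPlan > SWING_SPEED_UP_THRESHOLD)
         return Math.min(nominalTimeRemaining, estimatedTimeRemainingFromIcp);
      return nominalTimeRemaining;
   }
}
```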

As during the DRC Trials, we showed during the DRC Finals that capturability-based control is viable and can be used to achieve reliable walking in a human environment. In addition to the planning and feedback control on the capture point, we showed that simple capturability-based analyses can be performed online, allowing the original plan to be adapted when under disturbances.

6.2.2. Real-Time Control Is Critical and Existing Real-Time Java Support Is Insufficient

It is difficult to provide specific data, but we hold that the need for robust real-time support remains important, especially for robotic systems that are inherently unstable, like a bipedal robot. Our team has brought POSIX real-time


Figure 19. Interdependence analysis table for the terrain task. This shows the potential interdependencies between the operator and the robot. We have extended the original table by the addition of empirical data and by associating capacities with specific algorithmic and interface elements. This allows this one tool to connect theory, implementation, and data. (See Johnson et al., 2014, for a full explanation of the color coding.)

threads to the OpenJDK using a JNI library. We have released the IHMCRealtime library created for this project under the Apache license. Full details can be found in our previous publication (Smith, Stephen, Lesman, & Pratt, 2014).
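As a simple illustration of why this matters (and not a use of the IHMCRealtime API), the sketch below measures the scheduling jitter of a periodic loop running on an ordinary, non-real-time JVM thread; the 2 ms period is an arbitrary example. On a stock JVM and desktop OS, the worst-case lateness observed this way can easily exceed an entire control period, which is unacceptable when the loop is balancing an inherently unstable robot.

```java
/**
 * Minimal jitter probe for a periodic loop on a plain (non-real-time) JVM thread.
 * Illustrative only; it does not use the IHMCRealtime API described in the text.
 */
public final class LoopJitterProbe
{
   public static void main(String[] args) throws InterruptedException
   {
      final long periodNanos = 2_000_000L; // 2 ms example period
      long next = System.nanoTime() + periodNanos;
      long worstJitterNanos = 0;

      for (int i = 0; i < 5000; i++)
      {
         long sleepNanos = next - System.nanoTime();
         if (sleepNanos > 0)
            Thread.sleep(sleepNanos / 1_000_000L, (int) (sleepNanos % 1_000_000L));

         long jitterNanos = Math.abs(System.nanoTime() - next); // how late this cycle actually started
         worstJitterNanos = Math.max(worstJitterNanos, jitterNanos);
         next += periodNanos;
      }

      System.out.printf("worst-case jitter over 5000 cycles: %.3f ms%n", worstJitterNanos / 1.0e6);
   }
}
```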

6.3. Human–Robot Integration Lessons

6.3.1. Fielding Robotics Requires More Than the Engineering Solution

Our previous claim that the operator is a key part of the system is substantiated by the amount of operator involvement, not just on our team, but across all teams. As with the trials, no team completed any of the eight tasks fully autonomously. Figure 5 shows that 65% of our performance time was fully dedicated to the human operator. Robust hardware, sound algorithms, and reliable control theory are essential, but it is clear that as we strive to address more sophisticated and complex work, we need to take a more principled approach to building not just machines but also human–machine systems. Through our experience with the DRC, we have provided a principled methodology (Johnson et al., 2014), as well as guiding principles to help avoid common pitfalls (Bradshaw, Hoffman, Johnson, & Woods, 2013) and address the new opportunities afforded by embracing the virtues of human–machine teamwork (Johnson et al., 2014).

6.3.2. Coactive Design Is an Effective Method for Designing for Human Involvement

Coactive design has been the driving force behind our human–machine system design from the beginning of the DRC (Koolen et al., 2013). Our consistent performance, shown in Table I, is due to many factors, but our design approach is arguably one of the most critical. For a full explanation of the coactive design method and the interdependence analysis (IA) tool, see our previous work (Johnson et al., 2014), which explains the features of Figure 19. For the final phase of the DRC, we extended our approach to connect human–machine system theory to both implementation and empirical data.

The data we collected during our testing and evaluation of the system were associated with specific capacities in the IA table, as shown in Figure 19. The reliability column (labeled “Reli.” in the figure) tracked our success rate during testing at a much finer level of detail than the overall reliability of Figure 4. This was critical to helping our engineers identify which specific aspects of the task were culpable for failures. After achieving high reliability, the next challenge was to increase speed, and the time column helped engineers focus on the areas where improvement could have the greatest impact. For the terrain task in Figure 19, adjusting the location of the footsteps to ensure they were adequately on the terrain and did not catch a toe or heel was the most time-consuming part (44% of time). The actual execution of each swing was only 31% of the overall time. These results are consistent


with the overall percentages of robot motion in Figure 5. The inclusion of data in our theoretical analysis allowed us to make principled design choices that were also evidence based. This is the key to keeping an eager team of engineers focused in the shiny world of robotics.

Once we knew what was needed (i.e., the theory) and how we were performing (i.e., the data), we needed a way to connect it to what we have (i.e., the implementation) and what we plan to have. To achieve this, we associated capacities with particular algorithms, interface elements, or human abilities. Across the top of Figure 19, there are column headings for each of the relevant pieces of our system used to accomplish the terrain task. Below each heading is a black dot to indicate where in the activity that particular component has a role. We then connect the dots with arrows to indicate potential workflows to accomplish the goal. The resulting graph structure is a visual description of the flexibility in the system. The coactive design method and the IA table provide a unique understanding of the system that is a significant change from talking about modes (Stentz et al., 2015), levels of operation (DeDonato et al., 2015), and task allocation (Fallon et al., 2015; Hebert et al., 2015). The graph makes it clear that discrete task allocation is not what is happening, because the human is informed by automation and display elements, and automation can be assisted by the human, as indicated by the numerous horizontal and diagonal lines shown in Figure 19.

The IA table was a valuable part of our rapid iterative development cycle. The table in Figure 19 evolved to its current state as we tested and developed. After we had developed our walking algorithm to be robust enough to yield reliable performance, we began to attack speed. We identified that planning the footsteps was the primary area where time was spent. Our planning was human driven but aided by algorithms to sequence footsteps and align them properly to the terrain. We decomposed terrain alignment into two subtasks. One was making sure each footstep was properly at ground height and had a reasonable orientation. We refer to this as our terrain-snapping algorithm, and it was effective but not 100% reliable. We also worked on automatically positioning the footsteps on the terrain such that they were not too close to an edge and would not cause a toe or heel collision. This turned out to be too big a challenge in the limited time frame and was not complete. We also added in some simple reachability warnings, but these algorithms were limited to simple cases and could not handle complex terrain. From our IA table, the most time-consuming aspect of the terrain task for our team is clear. More importantly, the riskiest aspect of the terrain task is also clear. Though we did not address the issue well enough to prevent a fall on Day 1, our approach made us keenly aware of the situation.

The interdependence analysis table is a unique tool in that its focus is on how to fit the human and machine together as a team. Its purpose is not to determine the optimal design or even to optimize performance. Optimality can be difficult to define in complex work, and frequently, it is unnecessary in human–machine systems. People’s needs can often be met with sufficiency (see Section 6.5.1). Instead of determining the optimal human–robot role, the IA table provides a road map of opportunities for a given system. Using the same IA table shown in Figure 19, we show three examples of different human–robot teaming options in Figure 20. These are represented by three viable workflow pathways that reach the goal. Each option has a different function allocation of roles and responsibilities. Example A on the left of Figure 20 is highly automated, while Example B in the middle of Figure 20 is highly manual. Example C on the far right of Figure 20 involves significant interplay between the automation and the human operator. Any pathway along the graph is a valid option, so there are more than three alternatives, but each choice has a different nominal performance and risk associated with it. This analysis afforded our engineers the ability to design for a variety of system workflows, thus providing flexibility and resilience (see Section 6.5.2). The IA table is also valuable because it reminds engineers that allocating a task to a robot is rarely a clean task offloading and typically involves generating new interdependencies with the person relieved of the task. It also makes explicitly clear the need to develop algorithms and interfaces together. The result of our process is a human–robot system that “coactively” completed all eight tasks of the DARPA Robotics Challenge. While all DRC teams involved both a robot and at least one operator, our team can provide a specific design method for how to design for human involvement and a design tool that depicts how that human–robot team can work together for every task in the challenge.

6.3.3. After Scripted Behaviors Comes Tools for Dynamic Behavior Generation

For the VRC and the DRC Trials, we used scripted behaviors to automate various aspects of the tasks. We have previously discussed how automation can be more resilient by enabling the human to participate in the activity in a collaborative manner (Johnson et al., 2014). For the finals, we took this a step further. As we developed different scripted behaviors to address the variability in each task, we were able to identify the set of things that were frequently adjusted when using the automated behaviors. This led to the development of graphical tools associated with our interactable objects (Johnson et al., 2015). These tools allowed us to modify different aspects of the behavior at run-time to adjust for context. For example, the drill tool allowed adjustment of the approach orientation, the location to start cutting, and modification of the cutout pattern. This approach is counter to the “more autonomy” approach, because it is actually a step backward from a scripted behavior. There is no behavior until the operator builds it at


Figure 20. Interdependence analysis table example showing three distinct alternative pathways to achieve a goal, each with different function allocation of roles and responsibilities. Example A is highly automated (left), B is highly manual (center), and C is a combination of automation and human input (right). Any pathway along the graph is a valid option, so there are more than three alternatives, but each choice has a different nominal performance and risk associated with it.

Table II. Lines of code and unit tests developed for the DRC.

DRC Event    Lines of Code    Net Change    Number of Unit Tests
VRC          1.5 M            500 K         1,300
Trials       1.8 M            300 K         1,500
Finals       2.0 M            200 K         3,300

However, it is much more flexible than scripting and provides more directability to the operator while maintaining the observability and predictability of our previous approach.
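As a rough illustration of what "building the behavior at run-time" means, the sketch below models the operator-adjustable parameters mentioned for the drill tool (approach orientation, cut start point, cutout pattern) and expands them into hand waypoints. All names and numbers are hypothetical; this is not our interface code, only the shape of the idea.

```java
// Hypothetical sketch of operator-adjustable behavior parameters for the
// wall-cutting task. No behavior exists until the operator fills these in.
import java.util.ArrayList;
import java.util.List;

public final class DrillCutBehaviorSketch {
    double approachYawRadians;                          // approach orientation to the wall
    double[] cutStart = new double[3];                  // where the bit first touches (x, y, z)
    List<double[]> cutoutPattern = new ArrayList<>();   // cut offsets relative to the start

    /** Expand the operator's parameters into a concrete list of hand waypoints. */
    List<double[]> buildHandTrajectory() {
        List<double[]> waypoints = new ArrayList<>();
        waypoints.add(cutStart.clone());
        double c = Math.cos(approachYawRadians), s = Math.sin(approachYawRadians);
        for (double[] p : cutoutPattern) {
            double wx = cutStart[0] + c * p[0] - s * p[1];   // rotate pattern by approach yaw
            double wy = cutStart[1] + s * p[0] + c * p[1];
            double wz = cutStart[2] + p[2];
            waypoints.add(new double[] { wx, wy, wz });
        }
        return waypoints;
    }

    public static void main(String[] args) {
        DrillCutBehaviorSketch behavior = new DrillCutBehaviorSketch();
        behavior.approachYawRadians = Math.toRadians(10.0);       // operator tweaks approach
        behavior.cutStart = new double[] { 0.0, 0.30, 1.20 };     // operator picks start point
        behavior.cutoutPattern.add(new double[] { 0.0, 0.20, 0.0 });   // operator edits the
        behavior.cutoutPattern.add(new double[] { 0.0, 0.20, -0.25 }); // pattern: a rough
        behavior.cutoutPattern.add(new double[] { 0.0, 0.0, -0.25 });  // rectangle in this case
        behavior.cutoutPattern.add(new double[] { 0.0, 0.0, 0.0 });
        System.out.println("waypoints: " + behavior.buildHandTrajectory().size());
    }
}
```

The key point is that each field is exposed in the interface, so the operator can adapt the behavior to the context observed at run-time instead of selecting among pre-baked scripts.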

6.4. Software Practice Lessons

Seven teams used exactly the same hardware for the Finals. The scores of these teams spanned the full range, indicating the importance of software. It is not just the quality of the control algorithm that matters; undoubtedly more than one team, including ourselves, suffered from a software bug that directly impacted their performance. Solid software practices are worth the extra time. This is particularly true of large projects. Table II lists the amount of code used, the amount developed, and the number of unit tests developed for each phase of the competition. Figure 21 shows the development ramp-up for each phase as well as the code freeze. Without solid software practices, useful debugging tools, and a suite of unit tests, we would not have had the confidence we needed to develop as ambitiously as we did.

We used several tools and techniques for improving software quality. We have several thousand low-level unit tests that test the functionality of individual classes. We have dozens of end-to-end tests in which full humanoid behaviors, such as walking over a pile of cinder blocks, are tested in simulation. We used an Atlassian Bamboo automatic build server with a server farm to run all of these tests each time new code is committed to our core repository. We also used many of the practices advocated by the agile programming community.
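The end-to-end tests referred to above have roughly the following shape: a scripted behavior is executed in simulation and the test asserts the outcome, for example that the robot crosses the cinder blocks without falling. The sketch below uses a trivial stand-in simulator and hypothetical names; the real tests run full dynamics in our simulation environment, and only the structure is representative.

```java
// Hedged sketch of the *shape* of an end-to-end regression test. The simulator
// here is a trivial stand-in, not the IHMC simulation stack; all names
// (StubWalkingSimulation, executeFootsteps) are invented for illustration.
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;

class WalkOverCinderBlocksEndToEndTest {
    /** Minimal stand-in for a physics simulation of the walking controller. */
    static final class StubWalkingSimulation {
        double pelvisX = 0.0;
        boolean fallen = false;

        void executeFootsteps(int numberOfSteps, double stepLength) {
            for (int i = 0; i < numberOfSteps; i++)
                pelvisX += stepLength;      // a real test would integrate full dynamics
        }
    }

    @Test
    void robotCrossesCinderBlockFieldWithoutFalling() {
        StubWalkingSimulation sim = new StubWalkingSimulation();
        sim.executeFootsteps(10, 0.35);     // scripted traversal of the block field

        assertFalse(sim.fallen, "robot should remain standing after the traversal");
        assertEquals(3.5, sim.pelvisX, 0.1, "pelvis should reach the far side of the field");
    }
}
```

Tests of this form ran on the build server for every commit, which is what allowed ambitious changes to land without silently breaking whole-robot behaviors.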

6.5. Design Choice Lessons

6.5.1. Know When to Design for Sufficiency

We have claimed that it is important to differentiate when a criterion should be minimized and when only sufficiency is required. Bandwidth constraints were one example and are an interesting case because they varied across all three phases of the competition. We used good compression algorithms and carefully designed communication protocols to ensure we stayed under the limits (i.e., sufficiency), but our design goal in all phases was to provide the operator with as much relevant information as possible, because the operator is viewed as a critical teammate. Very early in development, we replicated the communications expected at the Finals and verified that the operator would have no trouble using our existing approach.
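A minimal sketch of the sufficiency idea for bandwidth is shown below: compress each outgoing message and send it only if it still fits under the per-second budget. The budget value, message content, and class name are assumptions for illustration; they are not the DRC network figures or our actual protocol code.

```java
// Hedged illustration of designing for "sufficiency" on a constrained link:
// compress an outgoing sensor message and defer it if sending would exceed
// the per-second budget. Values are assumed, not the DRC limits.
import java.util.zip.Deflater;

public final class BandwidthBudgetSketch {
    static final int BYTES_PER_SECOND_BUDGET = 9600;   // assumed constrained-link budget
    int bytesSentThisSecond = 0;                        // reset once per second by the caller

    static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(raw);
        deflater.finish();
        byte[] out = new byte[raw.length + 64];         // generous buffer for this sketch
        int n = deflater.deflate(out);
        deflater.end();
        byte[] trimmed = new byte[n];
        System.arraycopy(out, 0, trimmed, 0, n);
        return trimmed;
    }

    /** Send only if the compressed message still fits under this second's budget. */
    boolean trySend(byte[] rawMessage) {
        byte[] compressed = compress(rawMessage);
        if (bytesSentThisSecond + compressed.length > BYTES_PER_SECOND_BUDGET)
            return false;                               // defer; operator keeps the last update
        bytesSentThisSecond += compressed.length;
        return true;
    }

    public static void main(String[] args) {
        BandwidthBudgetSketch link = new BandwidthBudgetSketch();
        byte[] fakePointCloud = new byte[20000];        // highly compressible placeholder data
        System.out.println("sent: " + link.trySend(fakePointCloud));
    }
}
```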


Figure 21. Lines of code over the duration of the DRC. Each color reflects a different module of our code base.


An additional example is how we localized our robot, or, more accurately, how we did not. Traditionally in robotics, there is some state estimation, which feeds into a localization algorithm, which is combined with sensor readings to estimate position. Often this can be extended to include map building, resulting in simultaneous localization and mapping (SLAM). We integrated an open-source localization algorithm from ETH9 (Pomerleau, Colas, Siegwart, & Magnenat, 2013). It worked and allowed us, for example, to walk back and forth over a known set of cinder blocks 10 times in a row.10 However, in the end, we had improved our state estimator to a level that provided sufficient accuracy to perform all tasks required for the DRC Finals without the need for additional localization.

9 http://wiki.ros.org/ethzasl_icp_mapping (accessed September 25, 2015)
10 https://www.youtube.com/watch?v=ZG2gGIbtkAk (accessed September 25, 2015)

The state estimator used for the DRC Finals was developed by IHMC for the DRC Trials (Johnson et al., 2015). It relies only on joint encoders to estimate the joint positions and velocities and only on the IMU to estimate the pelvis orientation and angular velocity. The pelvis position is estimated from the leg kinematics and accelerometer measurements using a Kalman filter. At the time of the DRC Trials, the state estimator had a drift of about 2 to 3 cm per step for the pelvis horizontal position, and about 5 mm per step for the pelvis vertical position. At the time of the DRC Finals, the state estimator drift was down to about 1 cm per every three steps for the pelvis horizontal position, and about 5 mm per every nine steps for the pelvis vertical position. In the end, the drift was reduced by a factor of about 10. This was due to several factors:

• The redesign of Atlas came with a significant reduction of backlash in the leg joints, improving measurements using kinematics.

• The state estimator accuracy partially depends on the robot gait generated by the controller, which was improved by reducing the amount of foot slipping and bouncing.

• Besides Atlas, the controller and state estimator were extensively tested on other robots. With these additional testing platforms, we were able to easily identify and correct design flaws in both the controller and the state estimator.
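For readers unfamiliar with the estimator structure described above, the following is a rough, one-axis sketch of fusing integrated IMU acceleration with a leg-kinematics position measurement in a scalar Kalman filter. The noise values, update rates, and the one-dimensional simplification are assumptions for illustration and are not the actual IHMC state estimator.

```java
// Hypothetical one-axis sketch of the fusion idea: predict pelvis motion by
// integrating the accelerometer, correct with the position implied by the
// stance-leg kinematics. Noise constants and rates are assumed.
public final class PelvisPositionFusionSketch {
    double position = 0.0, velocity = 0.0;   // state: pelvis position/velocity along one axis
    double pVariance = 1e-4;                 // position variance
    static final double ACCEL_NOISE = 1e-3;  // process noise from accelerometer integration
    static final double KIN_NOISE   = 4e-4;  // measurement noise of leg kinematics (assumed)

    /** Prediction step: integrate the measured acceleration over one control tick. */
    void predict(double accel, double dt) {
        position += velocity * dt + 0.5 * accel * dt * dt;
        velocity += accel * dt;
        pVariance += ACCEL_NOISE * dt;
    }

    /** Update step: correct with the pelvis position implied by the stance-leg kinematics. */
    void correctWithKinematics(double kinematicPosition) {
        double gain = pVariance / (pVariance + KIN_NOISE);
        position += gain * (kinematicPosition - position);
        pVariance *= (1.0 - gain);
    }

    public static void main(String[] args) {
        PelvisPositionFusionSketch filter = new PelvisPositionFusionSketch();
        for (int tick = 0; tick < 1000; tick++) {            // 1 s at an assumed 1 kHz rate
            filter.predict(0.02, 0.001);                     // small forward acceleration
            if (tick % 10 == 0)
                filter.correctWithKinematics(0.0001 * tick); // slowly drifting kinematic estimate
        }
        System.out.printf("fused pelvis position: %.4f m%n", filter.position);
    }
}
```

The drift improvements listed above came from better measurements feeding such a filter (less backlash, less foot slip), not from adding an external localization source.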

We are not suggesting external localization or SLAM would not be useful, just that they were not necessary for the DRC Finals. When we integrated localization into our system, there was a slight benefit, but there was also a cost. Occasional localization errors could cause problems. Because our simpler approach was sufficient, we opted to avoid rather than address the localization issue.

6.5.2. Human–Machine Teaming Is a Viable Path to System Resilience

Resilience is not about optimal behavior; it is about survival and mission completion. The two essential components of resilience are recognition of problems and flexible alternatives to address them. By focusing on observability, predictability, and directability—the core interdependence relationships of coactive design—our team was able to effectively build a resilient system.



Examples of resilient behavior were more obvious in the VRC, where we were required to perform each task multiple times (see Johnson et al., 2014), but there were also instances during the DRC Trials (e.g., the wind closing doors during the door task). This is further supported by the human–robot interaction (HRI) study of the Trials (Yanco, Norton, & Ober, 2015), which shows our team had fewer critical incidents than any other team in the study. We feel this was due to our ability to recognize potential problems and adapt to avoid them. In both cases, our success was not because we performed flawlessly but instead reflects our system's resilience to recover from errors and unanticipated circumstances and adapt to overcome them.

The Finals were no different. Our performance was far from flawless and included falling on the first day. The fall was a result of not having adequate recognition of a potential problem. In this case, our operator was uncertain about the danger of the upcoming step. Recognition of problems is a common issue in robotics. The HRI study of the Trials pointed it out. In addition to our falls on Day 1, there were also examples at the Finals from other teams. Though we cannot state it as a certainty, it is likely that one challenge Team KAIST faced on Day 1 was that they did not recognize their robot had slipped, which resulted in the robot driving up the wall after the drill task. In addition to recognition, the system must be flexible enough to adapt to a problem. During the Finals, the NASA JPL team clearly recognized their failure to cut the wall but seemed to struggle with alternatives to adapt their cutting approach, until finally choosing to punch out the wall.

Recovering from obvious failures is important, as demonstrated by CHIMP when it righted itself after a fall. Equally important is deftly avoiding the numerous small problems that could grow into the next big issue. Our analysis of the runs during the Finals is shown in Table III. We had 19 errors that could have potentially prevented our second-place finish, issues such as an arm unintentionally striking an object, pushing on something beyond the robot's control authority, or relying on incorrect sensor data.

The first lesson from this is that both machines and people make mistakes. Most people would readily admit that people make mistakes; yet machines are often viewed as perfect and repeatable. This may be true in controlled environments, but the machine side of our system contributed to the potential errors. Machine errors came from physical damage, not accounting for obstacles, not recognizing estimation error, reaching the limits of control authority, and uncertainty. As we field robots in the real world, they will need to do a better job of recognizing their own errors and limitations.

The best solution is often a combination of the human and the machine—teamwork. For example, the machine is often unable to decide whether a hand "flip" (360-degree rotation) is warranted and permissible in a given context. However, it is very capable of detecting when a plan will trigger one. The human, on the other hand, is extremely bad at detecting such situations ahead of time but can easily determine whether the flip is appropriate. By providing an automated warning, we enabled the human to avoid the problem. Arm motions in tight spaces have similar issues. We used an automated preview capability that helped the operator identify and avoid two problems. Another challenge for automation is when it is being pushed to its limits. Often human operators are unaware that a system is approaching its limits until it is too late. By providing observability into when our control was approaching its authority limits, as in Figure 7, we avoided several problems. Last, sometimes both machines and people have insufficient information to achieve certainty. By leveraging both parties' capabilities, that uncertainty can often be reduced through redundant, independent confirmation.
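The hand-flip warning mentioned above can be illustrated with a small check over a planned wrist trajectory: if executing the plan would accumulate close to a full revolution of the wrist, warn the operator and let them decide. The threshold, angle convention, and class name below are assumptions, not our production check.

```java
// Hypothetical sketch of an automated "hand flip" warning: scan a planned
// sequence of wrist roll angles and warn if the plan accumulates close to a
// full revolution. The operator, not the machine, decides what to do about it.
public final class HandFlipWarningSketch {
    static final double FLIP_THRESHOLD = Math.toRadians(300.0); // warn before a full 360

    /** Total signed rotation accumulated along the planned wrist roll trajectory. */
    static double accumulatedRotation(double[] plannedWristRoll) {
        double total = 0.0;
        for (int i = 1; i < plannedWristRoll.length; i++)
            total += plannedWristRoll[i] - plannedWristRoll[i - 1];
        return total;
    }

    static boolean wouldFlip(double[] plannedWristRoll) {
        return Math.abs(accumulatedRotation(plannedWristRoll)) > FLIP_THRESHOLD;
    }

    public static void main(String[] args) {
        // A plan that slowly winds the wrist most of the way around, e.g. turning a valve.
        double[] plan = new double[72];
        for (int i = 0; i < plan.length; i++) plan[i] = Math.toRadians(5.0 * i); // 0..355 deg
        System.out.println("warn operator of possible hand flip: " + wouldFlip(plan));
    }
}
```

This division of labor, machine detects, human judges, is exactly the teamwork pattern described in the surrounding text.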

Teamwork is essential to building appropriate trust in a system. Trust in the system comes from sound engineering, extensive testing, and lots of practice, but also from the ability to evaluate the system at run-time using observability and predictability and to adapt the plan using directability as necessary based on circumstances. Although we practiced a lot, we still used the preview constantly to verify our expectations. The algorithms underlying our interface can help enhance the human's interpretation by providing signals and warnings that might otherwise go unnoticed, such as the ICP alert and the possible hand-flip warning. It is through these mechanisms that we were able to build appropriate trust. While confidence in our fully functional Atlas on Day 1 was appropriate, after the fall the operator was able to assess the damage and adjust his trust accordingly. The resulting operation was slower, as shown in Figure 3, but more appropriate for the situation and ultimately yielded better performance.

6.6. Assessment of Legged Robots for Disaster Response Environments

It is very tempting to make broad generalizations about the relative merits of legs, tracks, wheels, and other mobility platforms based on the results of the DARPA Robotics Challenge. However, one must be careful not to overgeneralize or draw too many conclusions from this one data point. The eight tasks of the DARPA Robotics Challenge were selected and designed based on a number of criteria and are only a small sample of potential tasks required for disaster response. One of the main concerns of DARPA was to make sure that the DRC was fair to all of the teams. Therefore, any obstacle or task that would have made it impossible for a subset of the teams to perform well in the Finals was avoided. However, there are many situations in real disaster response scenarios that would rule out a given platform.


Table III. Potential errors experienced during the DRC Finals.

Day 1
Potential Error                        Cause                                               Solution
Possible hand flip                     Machine specified a poor solution                   Machine warned / human adjusted
Possible hand flip                     Human chose a poor solution                         Machine warned / human adjusted
Arm striking door frame                Machine did not recognize obstacle                  Machine preview / human adjusted
Estimation drift                       Machine estimation drift                            Observable in interface / human corrected
ICP approaching limits                 Machine exceeding capability                        Machine warned / human adjusted
Footsteps below LIDAR                  Machine failed to recognize ground                  Observable in interface / human aborted and replanned
Is the drill on?                       Human and machine uncertainty                       Concurrence of automated detection and human observation
Is that step too far?                  Human and machine error                             None (fall)
Sensor data is wrong                   Hardware damage from fall                           Human judgment
Exceeding control authority            Failure to recognize approaching stability limit    None (fall)

Day 2
Potential Error                        Cause                                               Solution
Autonomous behavior failed             Door latch was stiffer than expected                Human adjusted
Sensor error                           Hardware damage from fall                           Human judgment
ICP approaching limits                 Machine exceeding capability                        Machine warned / human adjusted
Is the drill on?                       Human and machine uncertainty                       Concurrence of automated detection and human observation
Missed attempt at surprise task        Human judgment                                      Human retry
Poor hardware performance              Hardware damage from fall                           Human judgment and caution
Linguistic confusion                   Human spoke incorrectly                             Other people corrected
Arm striking hand rail of stairs       Machine did not recognize obstacle                  Machine preview / human adjusted
Footstep did not land where commanded  Controls error possibly due to damage               Observable in interface / human adjusted and restepped

For example, a 1-m-long gap to cross, a Jersey barrier, a 20-cm narrow passageway, or a standard ladder would rule out many tracked or wheeled platforms. Legged humanoid platforms could have been ruled out, for example, by having flat roads requiring over 20 mph traversal speeds, tasks requiring transporting twice the weight of the platform, tasks requiring getting over a 20-foot wall, or tasks requiring getting through a 20-cm-diameter pipe. Each of these tasks could be performed by other robot morphologies, and each is something that would likely be encountered in a disaster response scenario.

In addition, luck played a part in the ordering of the top teams in the DARPA Robotics Challenge. For example, using Table I, it can be seen that each of the top six teams was among the top three teams on one of the days of the Finals. If the DRC had taken place over several more days, it is likely that the top-10 ordering would have been significantly different on each of the days. Therefore, it is prudent to avoid generalizations of the form "Robot X beat robot Y, proving that robot characteristic S is better than T."

Instead, we believe that the lessons to be learned are (1) it is plausible for remotely operated robots to perform tasks in disaster response environments, (2) multiple robotic platforms will be suitable for different capabilities in such environments, (3) these robots are getting close to adding potential value but still need to be faster and more reliable, and (4) to be useful, robots will need to be able to survive and recover from falls and other mishaps that will inevitably take place.

We believe that the main value of a legged humanoid platform is its potential to get to the same places that a human can get to. In disaster response environments, the places that dismounted humans can get to are numerous, which is one of the reasons why sending in real humans, when available, will always be the first choice of response. Humans are incredibly mobile because of their morphology and dimensions. Figure 22 shows some typical examples of where the humanoid form excels at mobility.


Figure 22. Some of the mobility tasks that humans excel at due to their morphology and dimensions. Humanoid robots have the potential to achieve human-level mobility, including being able to accomplish these mobility tasks, and thus are well suited for operations in the complex, cluttered environments found in disaster scenarios as well as everyday human environments. Many of these tasks are difficult for wheeled and tracked vehicles that do not have leglike appendages.

Because humans have very narrow feet and legs, on the order of 10 cm wide, a human can maneuver over rough terrain that has intermittently placed narrow contact patches and squeeze through very narrow passages. Because humans have long legs, on the order of 1 m long, they can easily get over high obstacles, such as Jersey barriers, and take long steps over large gaps. Because humans have a high height and long arms, they can reach objects on high shelves or use their arms to climb over walls on the order of 2.5 m high. Because humans are both bipeds and quadrupeds, they can switch between modes when necessary, allowing for crawling under objects and climbing ladders. As humanoid robots progress, we believe that they will approach these capabilities of humans and eventually exceed them when component technologies, such as actuators, exceed the capabilities of their biological counterparts, such as muscle.

The humanoid form is poor at flying, running at incredible speeds, slithering through extremely small pipes and cracks, carrying very heavy loads, and several other mobility capabilities that may be important for disaster response and other scenarios in complex environments. For these tasks, flying, wheeled, snake, and other robots are more appropriate. However, we believe that the human form provides better general-purpose mobility across the widest range of land-surface environments found in nature, as well as built environments, including damaged built environments found after a disaster, than any other morphology found in nature or currently found or proposed in technology. Note that we consider all primates as having a "humanoid" form and that we consider hybrid morphologies that have a humanoid-like mode as also being humanoid. In fact, humans plus their various wearable technologies combine in hybrid morphologies allowing us to cross continents in hours, dive to the bottom of the ocean, and step on the moon. Likewise, we believe that hybrid humanoid robots will be more capable than "pure" humanoids. Yet it is likely that, in many cases, similarly to humans, they will remove their hybrid attachments when entering buildings and other environments more suited for pure legged locomotion. In any case, we believe that humanoid robots will be an integral part of any highly capable robotic solution for scenarios in which it is desired to project a human presence but which are dangerous or costly enough that it is appropriate to send robots, either by themselves or as part of human–robot teams.

7. CONCLUSION

The DRC was an amazing opportunity to advance robotics, but it also provided a fantastic opportunity to analyze the design process of a team while tackling such challenges. Understanding how a team approaches complex challenges, prioritizes activity, and measures improvement is as important as understanding what algorithms or interface elements were employed. The DRC provided three separate looks at rapid development of robotics capabilities by the same team. Though similar, each phase also provided slight distinctions in constraints and requirements. Our success across all phases of the competition was due to many factors. These include a combination of capable robot hardware, innovative walking control algorithms, effective human–robot teaming using the coactive design method, diligent software development practices, maximizing our limited resources using appropriate design choices, and a fair amount of luck. Failure in any one of these areas would likely have altered the outcome. The technological advancement of algorithms and interfaces from not just our team but all the competitors is certainly beneficial to the robotics community as a whole. We would argue that the lessons learned and experience gained by all participants are equally valuable, raising the caliber of robotics professionals across all teams. This foundation will likely play a significant role in robotics innovation in the near future.



ACKNOWLEDGMENTS

We would like to thank DARPA for sponsoring the Robotics Challenge and encouraging the advancement of robotics capabilities. We would also like to thank DARPA for the funding provided to IHMC to compete in the competition. We also thank Boston Dynamics for providing Atlas, which has been a solid and reliable robotic platform. Last, we would like to acknowledge our sponsors, Atlassian and Amazon. Atlassian also provided an embedded engineer to ensure our agile practices were effectively applied using their Atlassian software tools.

REFERENCES

Bradshaw, J. M., Hoffman, R. R., Johnson, M., & Woods, D. D. (2013). The seven deadly myths of "autonomous systems." IEEE Intelligent Systems, 28(3), 54–61. http://doi.org/10.1109/MIS.2013.70

DeDonato, M., Dimitrov, V., Du, R., Giovacchini, R., Knoedler, K., Long, X., & Atkeson, C. G. (2015). Human-in-the-loop control of a humanoid robot for disaster response: A report from the DARPA Robotics Challenge Trials. Journal of Field Robotics, 32(2), 275–292. http://doi.org/10.1002/rob.21567

Fallon, M., Kuindersma, S., Karumanchi, S., Antone, M., Schneider, T., Dai, H., & Teller, S. (2015). An architecture for online affordance-based perception and whole-body planning. Journal of Field Robotics, 32(2), 229–254. http://doi.org/10.1002/rob.21546

Hebert, P., Bajracharya, M., Ma, J., Hudson, N., Aydemir, A., Reid, J., & Burdick, J. (2015). Mobile manipulation and mobility as manipulation—design and algorithms of RoboSimian. Journal of Field Robotics, 32(2), 255–274. http://doi.org/10.1002/rob.21566

Johnson, M., Bradshaw, J. M., Feltovich, P. J., Jonker, C. M., van Riemsdijk, B. M., & Sierhuis, M. (2014). Coactive design: Designing support for interdependence in joint activity. Journal of Human-Robot Interaction, 3(1), 43–69.

Johnson, M., Bradshaw, J. M., Hoffman, R. R., Feltovich, P. J., & Woods, D. D. (2014). Seven cardinal virtues of human-machine teamwork: Examples from the DARPA Robotic Challenge. IEEE Intelligent Systems, 29(6), 74–80. http://doi.org/10.1109/MIS.2014.100

Johnson, M., Shrewsbury, B., Bertrand, S., Wu, T., Duran, D., Floyd, M., & Pratt, J. (2015). Team IHMC's lessons learned from the DARPA Robotics Challenge Trials. Journal of Field Robotics, 32(2), 192–208. http://doi.org/10.1002/rob.21571

Koolen, T., de Boer, T., Rebula, J., Goswami, A., & Pratt, J. (2012). Capturability-based analysis and control of legged locomotion: Part 1. Theory and application to three simple gait models. International Journal of Robotics Research, 31(9), 1094–1113. http://doi.org/10.1177/0278364912452673

Koolen, T., Smith, J., Thomas, G., Bertrand, S., Carff, J., Mertins, N., & Pratt, J. (2013). Summary of team IHMC's virtual robotics challenge entry. In Proceedings of the IEEE-RAS International Conference on Humanoid Robots, October 15–17, 2013, Atlanta, GA.

Pomerleau, F., Colas, F., Siegwart, R., & Magnenat, S. (2013). Comparing ICP variants on real-world data sets. Autonomous Robots, 34(3), 133–148. http://doi.org/10.1007/s10514-013-9327-2

Smith, J., Stephen, D., Lesman, A., & Pratt, J. (2014). Real-time control of humanoid robots using OpenJDK. In Proceedings of the 12th International Workshop on Java Technologies for Real-Time and Embedded Systems. New York, NY: ACM. http://dl.acm.org/citation.cfm?id=2661027

Stentz, A., Herman, H., Kelly, A., Meyhofer, E., Haynes, G. C., Stager, D., & Wellington, C. (2015). CHIMP, the CMU Highly Intelligent Mobile Platform. Journal of Field Robotics, 32(2), 209–228. http://doi.org/10.1002/rob.21569

Yanco, H., Norton, A., & Ober, W. (2015). Analysis of human-robot interaction at the DARPA Robotics Challenge Trials. Journal of Field Robotics, 32(3), 420–444. http://onlinelibrary.wiley.com/doi/10.1002/rob.21568/pdf
