
Page 1: Robot Adaptation to Unstructured Terrains by Joint ...rss2019.informatik.uni-freiburg.de/papers/0111_FI.pdfessential for effective and safe robot navigation. In this paper, we introduce

Robot Adaptation to Unstructured Terrains by Joint Representation and Apprenticeship Learning

Sriram Siva¹, Maggie Wigness², John Rogers², and Hao Zhang¹
¹Colorado School of Mines, ²U.S. Army Research Laboratory

[email protected], [email protected], [email protected], [email protected]

Abstract—When a mobile robot is deployed in a field environment, e.g., during a disaster response application, the capability of adapting its navigational behaviors to unstructured terrains is essential for effective and safe robot navigation. In this paper, we introduce a novel joint terrain representation and apprenticeship learning approach to implement robot adaptation to unstructured terrains. Different from conventional learning-based adaptation techniques, our approach provides a unified problem formulation that integrates representation and apprenticeship learning under a unified regularized optimization framework, instead of treating them as separate and independent procedures. Our approach also has the capability to automatically identify discriminative feature modalities, which can improve the robustness of robot adaptation. In addition, we implement a new optimization algorithm to solve the formulated problem, which provides a theoretical guarantee to converge to the global optimal solution. In the experiments, we extensively evaluate the proposed approach in real-world scenarios, in which a mobile robot navigates on familiar and unfamiliar unstructured terrains. Experimental results have shown that the proposed approach is able to transfer human expertise to robots with small errors, achieve superior performance compared with previous and baseline methods, and provide intuitive insights on the importance of terrain feature modalities.

I. INTRODUCTION

Over the past several years, autonomous mobile robots have been more commonly used in unstructured field environments to address real-world applications, including disaster response, infrastructure inspection, homeland defense, and subterranean and planetary exploration [1, 2]. When deployed in an outdoor field environment, mobile robots need the essential capability of efficiently navigating through different types of terrains with a wide variety of characteristics. For example, as illustrated in Figure 1, when mobile robots perform disaster response tasks in a post-disaster situation, they are likely to encounter terrains such as grass, rock, pebble, mud, snow, and a mixture of these terrain types (e.g., a rocky terrain partially covered by mud and snow). In addition, the same type of terrain typically exhibits a variety of characteristics (e.g., different slope and softness). Such terrains, whose types and characteristics cannot be fully modeled or determined before robot deployment, are referred to as unstructured terrains [3, 4]. The capability of generating desired navigation behaviors (e.g., velocity control) that adapt to unstructured terrains is essential for an autonomous mobile robot to operate in complex field environments.

Given the importance of robot navigation over unstructured terrains, a large number of terrain classification and adaptation techniques were implemented [5, 6, 7, 8]. In particular, learning-based techniques have been attracting increasing attention over


Figure 1. Motivating scenarios of robot adaptation to unstructured terrains. When mobile robots operate in a field environment during disaster response, the robots need the essential capability of navigating over a variety of terrains, such as grass, rock, snow, pebble, and mud, which often cannot be accurately structured before onsite robot deployment. In this paper, we propose the TRAL method to integrate representation learning and apprenticeship learning under a unified mathematical framework and estimate terrain feature importance to enable robust robot adaptation to unstructured terrains.

the last few years. For example, unsupervised learning-based techniques were designed to perform semantic segmentation of terrains, and then navigational decisions were constructed by a robot through estimating the traversability of unstructured terrains [9, 10, 11]. In addition, methods based upon terrain classification were implemented for mobile robots to reason about the possibility of traversing uneven terrains [12, 13]. Moreover, to teach navigational behaviors to robots, apprenticeship learning (a.k.a. learning from demonstration or imitation learning) was widely investigated [3, 11, 14] to transfer human expertise in navigation control to autonomous robots and generate adaptive behaviors according to given terrains (i.e., terrain adaptation).

Despite the promising performance of the previous methods, several challenges in robot adaptation to unstructured terrains have not yet been well addressed. Most existing learning-based methods focused on either terrain classification or navigation behavior generation only [15, 16, 17, 18, 19], or treated them as two separate components in the processing pipeline without a unified integration [14]. Moreover, previous adaptation techniques either used a single feature to represent terrains [20], or combined a number of features by simple concatenation [21]. The question of how to automatically estimate the importance of heterogeneous terrain features (e.g., from exteroceptive and proprioceptive sensors) and fuse them together for robust adaptation has not been well answered.

In this paper, we propose a novel joint terrain representation and apprenticeship learning (TRAL) approach to enable robot


adaptation to unstructured terrains. Instead of treating terrain classification and navigational behavior generation as separate tasks, our approach integrates them into a unified mathematical formulation under the regularized optimization paradigm. Our approach learns representations of unstructured terrains, which are jointly used to classify terrains and perform apprenticeship learning to generate navigational behaviors (e.g., velocity control) in the unified formulation. Furthermore, our approach is able to automatically learn weights of terrain features to create a multi-modal representation of terrains, by designing sparsity-induced norms as regularizers in the regularized optimization paradigm. These weights encode the importance of the terrain features, thus offering insight into which terrain features are most critical for terrain classification and navigation behavior generation.

The key theoretical novelty of the paper is twofold:

• We propose a novel TRAL method that is able to integrate representation learning and apprenticeship learning under a unified mathematical framework, automatically estimate terrain feature importance, and incorporate heterogeneous terrain features to enable robot adaptation to unstructured terrains.

• We develop a new optimization algorithm that efficiently solves the formulated joint representation and apprenticeship learning problem, which holds a theoretical guarantee to converge to the optimal solution.

The remainder of the paper is organized as follows. Related work is reviewed in Section II. We discuss the formulation of TRAL for robot adaptation to unstructured terrains in Section III. The new optimization algorithm is described in Section IV. Experiments are presented in Section V. Finally, we conclude the paper in Section VI.

II. RELATED WORK

This section provides a review of related research on terrain classification and robot adaptation, and identifies key research challenges to be addressed in this paper.

A. Terrain Classification and Characterization

Terrain classification and characterization aims at extracting terrain features and associating them with terrain categories. Many earlier methods were designed for larger vehicles, and the characterization process was typically manual. There was a body of research on terramechanics [22], the guidance of autonomous vehicles through rough terrains. Similarly, pre-planning was used for speed selection [23], e.g., for vehicles used in the DARPA Grand Challenge, to address the problem of unstructured terrain navigation by fixing the optimal speed for each terrain type after collecting data for months. [24] generated digital terrain maps and then fitted a Gaussian model to classify terrains. [25] employed terrain cohesion and internal friction angle obtained from a variety of terrains to perform terrain characterization.

In learning-based methods, terrain classification was usually performed by learning from terrain data in a regressive manner. For example, [26] designed a feed-forward neural network on terrain images to classify terrain types. Motivated by the need to recognize visual terrain features that impact mobility, [27] used Hidden Markov Models to learn terrain features and an SVM classifier to identify terrain types. [15] used a series of sensors to measure robot vibrations on terrains to successfully classify terrains. [28] used color-based classification to label obstacles based on terrain classes and carried out appropriate navigational behaviors.

Previous methods either used a single feature modality or a combination of features to represent terrains, and the problem of estimating the importance of feature modalities has not yet been well addressed, especially in the context of apprenticeship learning for robot terrain adaptation. Also, previous methods generally considered terrain classification and characterization as a preprocessing procedure for subsequent reasoning such as navigation planning.

B. Robot Adaptation

In the area of robot adaptation, high-level behavioral models were implemented to address the problem of robot adaptation in general [29, 30, 31]. Similarly, case-based reasoning [32] techniques were developed to enable robots to adapt in dynamic environments [33, 34]. Many model-based approaches aimed at learning a single global model using function approximators, such as Gaussian processes [35, 36] and neural networks [37]. Probabilistic Gaussian methods [38] were implemented to learn terrain models and update the models efficiently using sparse approximation. One of the key challenges of deploying such models is the difficulty of learning a global model that is accurate for the entire state space. Several approaches [39, 40] were developed to learn accurate local models instead of a global model. These techniques have acceptable local performance and allow for iterative local improvements of the policy, but they may not be feasible to apply to an unfamiliar domain. Recently, online learning approaches [41, 42] were also used to iteratively update model parameters during task execution. Although this online learning paradigm is widely used with robots operating in evolving environments, the effectiveness of these methods in the case of sudden changes of terrain features is questionable.

For the research problem of robot adaptation to terrains, several approaches attempted to generate autonomously-adaptable dynamic maneuvering using a neural system model in simulations [43] and on operational robots [44, 45]. Recently, methods based upon self-supervised learning [46, 9] were implemented to navigate in a minimal-collision manner by maintaining minimal traversal and bumper history evaluations. [47] applied an array of inertial and ultrasonic sensors to calculate features of soil properties and carry out terrain adaptation. Motivated by the need for human-like operation and navigation [48], [11] used visual perception data and inverse optimal control trained with human supervision to learn to imitate expert navigation behaviors. Considering speed as an important factor for robot navigation, [49] presented a method for trading off progress and velocity with respect to environment characteristics.


Despite the promise of previous learning-based techniques for robot adaptation to terrains, almost all existing approaches treated behavior generation and terrain classification as separate procedures, with no principled integration. Different from previous methods, our proposed approach explicitly addresses this challenge by integrating representation learning and apprenticeship learning under the same theoretical framework to perform joint terrain classification and behavior generation for terrain adaptation.

III. APPROACH

In this section, we introduce the novel TRAL approach that formulates robot adaptation to unstructured terrains as a joint representation and apprenticeship learning problem, from the unified regularized optimization perspective. An illustration of the TRAL approach is shown in Figure 2.

Notations: Matrices are denoted as boldface capital letters, and vectors as boldface lowercase letters. Given a matrix M = {m_{ij}} ∈ R^{m×n}, we refer to its i-th row and j-th column as m^i and m_j, respectively. The Frobenius norm of M is defined as ‖M‖_F = √(Σ_{i=1}^m Σ_{j=1}^n m_{ij}²). The ℓ₁-norm of a vector v ∈ R^n is computed by ‖v‖₁ = Σ_{i=1}^n |v_i|. The ℓ₂-norm of v is computed by ‖v‖₂ = √(v⊤v).

A. Representation Learning for Terrain Classification

We denote a collection of n data instances of terrain features that are acquired by a robot when traversing a terrain as X = [x₁, …, x_n] ∈ R^{d×n}, where x_i ∈ R^d denotes a feature vector of length d. We represent the terrain types associated with X as Z = [z₁, …, z_n] ∈ R^{c×n}, where z_i ∈ {0, 1}^c is an indicator vector with each element z_{ij} ∈ {0, 1} indicating whether the i-th data instance of features x_i has the j-th terrain type, and c is the number of terrain types. Given X and Z, representation learning for terrain classification can be simply formulated as an optimization problem [50, 51]:

min_V ‖V⊤X − Z‖²_F    (1)

where V ∈ R^{d×c} is a weight matrix with v_{ij} representing the importance of the i-th element in a feature vector with respect to the j-th terrain type. The loss function parameterized by V in Eq. (1) models the squared error of utilizing the weighted features to represent the terrain types.
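As a concrete illustration, the unregularized problem in Eq. (1) is an ordinary least-squares fit, which can be solved column-wise; the sketch below uses random data and NumPy, and is not part of the paper's implementation:

```python
import numpy as np

# Sketch of Eq. (1) as least squares (illustrative data only):
# the minimizer of ||V^T X - Z||_F^2 satisfies X^T V ≈ Z^T.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 50))               # d = 4 features, n = 50 instances
Z = (rng.random((3, 50)) > 0.5).astype(float)  # c = 3 terrain-type indicators
V = np.linalg.lstsq(X.T, Z.T, rcond=None)[0]   # V has shape (d, c) = (4, 3)
```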

When the terrain features are computed by different feature extraction techniques and/or from different sensors, the feature vector contains multiple feature modalities, with each modality including features computed by one technique from one sensor. Then, each feature vector x_i exhibits a multi-modal structure x_i = [(x_i¹)⊤, …, (x_i^m)⊤]⊤ ∈ R^d, where m is the number of feature modalities in the vector. If d_j is the length of the j-th feature modality, we have d = Σ_{j=1}^m d_j.

Because different terrain features typically capture different

characteristics of unstructured terrains (e.g., color, smoothness, and roughness) and often contribute differently toward terrain classification, it is essential to identify the most descriptive feature modalities when multiple modalities are used together. In order to enable this capability, we develop a norm to regularize the weight matrix V [52]:

‖V‖_R = Σ_{i=1}^c Σ_{j=1}^m ‖v_i^j‖₂ = Σ_{i=1}^c Σ_{j=1}^m √(Σ_{k=1}^{d_j} (v_{ik}^j)²)    (2)

where v_i^j ∈ R^{d_j} represents the weight vector of terrain features in the j-th modality with respect to the i-th terrain type. Since ‖V‖_R uses the ℓ₂-norm within each modality and the ℓ₁-norm among modalities with respect to each terrain type, it enforces sparsity between modalities to identify the most discriminative feature modalities. That is, if a modality is not discriminative for representing the terrains in terrain classification, the weights of terrain features in this modality are assigned zeros (in the ideal case; usually they are small values close to 0); otherwise, the weights have large values. The effect of the norm ‖V‖_R is illustrated in Figure 2.
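The norm in Eq. (2) can be computed directly from the weight matrix once the modality boundaries are known; a minimal sketch (the function name and interface are our own, not from the paper):

```python
import numpy as np

def norm_R(V, modality_sizes):
    """Group-sparsity norm ||V||_R from Eq. (2): an l2-norm within each
    feature modality (blocks of rows of V) and a plain sum (l1) across
    modalities and terrain types."""
    total, start = 0.0, 0
    for d_j in modality_sizes:               # rows belonging to modality j
        block = V[start:start + d_j, :]      # shape (d_j, c)
        total += np.linalg.norm(block, axis=0).sum()  # sum_i ||v_i^j||_2
        start += d_j
    return total
```

Weights concentrated in a few row blocks yield a small ‖V‖_R, which is exactly what the regularizer rewards.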

Then, representation learning from multi-modal features for terrain classification is formulated as the following regularized optimization problem:

min_V ‖V⊤X − Z‖²_F + λ₁‖V‖_R    (3)

where λ₁ ≥ 0 is a trade-off hyper-parameter applied to balance the loss function and the regularization term.

Figure 2. Illustration of the proposed TRAL approach for joint representation and apprenticeship learning under the unified regularized optimization framework, with the regularization terms ‖V‖_R and ‖VU‖_A to identify the discriminative feature modalities.

B. Formulation of Joint Representation and Apprenticeship Learning for Unstructured Terrain Adaptation

Apprenticeship learning [48] is one of the most widely used learning-based paradigms to transfer human expertise to robots directly through expert demonstration. It can be viewed as learning a projection from the observation or feature space into the behavior space, e.g., by probabilistic [53] or optimization-based [14, 54] approaches.

Our goal in this paper is to achieve apprenticeship learning while retaining the advantage of discriminative features offered by representation learning. The theoretical novelty of the paper is that we introduce a principled integration of representation and apprenticeship learning in a unified formulation under the


mathematical framework of regularized optimization, which enables joint terrain classification and behavior generation.

Formally, given expert demonstrations on behavior controls Y = [y₁, …, y_n] ∈ R^{r×n} (e.g., velocity, motor torque, and consumed power) that are associated with terrain features X during robot navigation, where r is the number of controls we want to learn for behavior generation, we formulate joint representation and apprenticeship learning for terrain adaptation as a unified regularized optimization problem:

min_{V,U} ‖V⊤X − Z‖²_F + ‖(VU)⊤X − Y‖²_F + λ₁‖V‖_R + λ₂‖VU‖_A    (4)

where λ₂ is a trade-off hyper-parameter. The new loss function (i.e., the second term in Eq. (4)) encodes the difference between the learning model and the expert demonstrations, in order to learn the projection from the multi-modal feature space to the robot behavior space. This loss function is parameterized by VU ∈ R^{d×r}, which first projects the input multi-modal terrain features X into a more discriminative space V⊤X (parameterized by V from representation learning), and further projects the representation space into the robot behavior space U⊤V⊤X (parameterized by U ∈ R^{c×r} for apprenticeship learning).

Because the output of the proposed joint learning approach is to generate robot behaviors, we also want our representation learning to learn discriminative terrain features that can better decide robot navigation behaviors. Therefore, we design a new norm for VU, which is mathematically defined as:

‖VU‖_A = Σ_{j=1}^m ‖V_j U‖_F    (5)

where V_j ∈ R^{d_j×c} is the block of V that includes the weights of terrain features belonging to the j-th modality. This norm can be directly integrated into our objective function in Eq. (4) as a regularization term over VU, as demonstrated in Figure 2. This regularization enforces sparsity among feature modalities with respect to all robot behaviors, thus facilitating the identification of discriminative features for robot behavior generation.

After solving the formulated regularized optimization problem in Eq. (4) using Algorithm 1, we obtain the optimal V* and U*. When a new multi-modal terrain feature vector x_o ∈ R^d is acquired by a robot during execution, the robot's navigation behaviors can be determined by:

y_o = (V*U*)⊤x_o    (6)

which incorporates considerations of both representation learning and apprenticeship learning.

Different from conventional learning-based robot adaptation techniques [11, 12, 14], TRAL provides a unified formulation that integrates representation and apprenticeship learning under the same theoretical framework, instead of treating them as separate and independent procedures. Moreover, different from most existing apprenticeship learning methods that apply a single feature modality only [54] or simple feature concatenation [14], one of TRAL's advantages is its capability to automatically identify discriminative feature modalities, which can improve the robustness of robot adaptation to unstructured and unfamiliar terrains. Also, the weights learned by TRAL provide insights for intuitive analysis and future integration of heterogeneous terrain features for robot adaptation to unstructured terrains.

Algorithm 1: An iterative algorithm to solve the formulated regularized optimization problem in Eq. (4)

Input: X, Y, Z
1: Let t = 1. Initialize V(t) by solving min_V ‖V⊤X − Z‖²_F
2: Initialize U(t) by solving min_U ‖(VU)⊤X − Y‖²_F
3: while not converged do
4:   Calculate the block diagonal matrix D(t+1), where the i-th diagonal block of D(t+1) is 1/(2‖v^i(t)‖₂) I_i
5:   Calculate the block diagonal matrix D̃(t+1), where the j-th diagonal block is 1/(2‖V_j U(t)‖_F) I_j
6:   For each v_i (1 ≤ i ≤ c): v_i = (XX⊤ + λ₁D + λ₂D̃)⁻¹X(z^i)⊤
7:   For each u_i (1 ≤ i ≤ r): u_i = (V⊤XX⊤V + λ₂V⊤V)⁻¹V⊤X(y^i)⊤
8:   t = t + 1
Output: V = V(t) ∈ R^{d×c} and U = U(t) ∈ R^{c×r}
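A rough numerical sketch of the alternating scheme in Algorithm 1 is given below. Caveat: for brevity, the reweighting matrix D here uses a row-wise ℓ₂ form in the style of Nie et al., whereas the paper's D is blocked per modality and terrain type; all names, the ε guard, and the fixed iteration budget are our own assumptions, not the authors' implementation:

```python
import numpy as np

def tral_solve(X, Y, Z, lam1, lam2, modality_sizes, iters=50, eps=1e-8):
    """Sketch of Algorithm 1: alternately recompute the reweighting
    matrices D, D_tilde and solve the closed-form updates for V and U.
    X: (d, n) features, Z: (c, n) terrain labels, Y: (r, n) behaviors."""
    d, n = X.shape
    # Steps 1-2: least-squares initialization of V (d, c) and U (c, r)
    V = np.linalg.lstsq(X @ X.T, X @ Z.T, rcond=None)[0]
    U = np.linalg.lstsq(V.T @ X @ X.T @ V, V.T @ X @ Y.T, rcond=None)[0]
    blocks = np.repeat(np.arange(len(modality_sizes)), modality_sizes)
    for _ in range(iters):
        # Step 4 (simplified): D with diagonal entries 1 / (2 ||v^i||_2)
        D = np.diag(1.0 / (2.0 * np.linalg.norm(V, axis=1) + eps))
        # Step 5: D_tilde, block j weighted by 1 / (2 ||V_j U||_F)
        w = np.array([1.0 / (2.0 * np.linalg.norm(V[blocks == j] @ U) + eps)
                      for j in range(len(modality_sizes))])
        D_t = np.diag(w[blocks])
        # Steps 6-7: closed-form updates for all columns of V, then U
        V = np.linalg.solve(X @ X.T + lam1 * D + lam2 * D_t, X @ Z.T)
        U = np.linalg.solve(V.T @ X @ X.T @ V + lam2 * V.T @ V, V.T @ X @ Y.T)
    return V, U
```

The two `solve` calls correspond to Eqs. (8) and (10), applied to all columns at once.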

IV. OPTIMIZATION ALGORITHM

The second theoretical novelty of this paper is the design of an optimization algorithm to solve the formulated regularized optimization problem in Eq. (4). Although the problem in Eq. (4) is convex, it is challenging to solve using traditional gradient-descent-based solvers, due to the non-smooth regularization terms and the coupling of the two loss functions.

To solve the formulated problem, we compute the derivative of Eq. (4) with respect to the columns of V and set it to zero, while assuming U to have a fixed value:

XX⊤v_i − X(z^i)⊤ + λ₁Dv_i + λ₂D̃v_i = 0    (7)

where D is a block diagonal matrix with the i-th diagonal block equal to 1/(2‖v^i‖₂) I_i, and D̃ is a block diagonal matrix with the j-th diagonal block equal to 1/(2‖V_j U‖_F) I_j. Then, we obtain:

v_i = (XX⊤ + λ₁D + λ₂D̃)⁻¹X(z^i)⊤    (8)

Fixing the value of V, to compute U, we take the derivative of Eq. (4) with respect to the columns of U and set it to zero:

V⊤XX⊤Vu_i − V⊤X(y^i)⊤ + λ₂V⊤Vu_i = 0    (9)

Then we obtain:

u_i = (V⊤XX⊤V + λ₂V⊤V)⁻¹V⊤X(y^i)⊤    (10)

Since both D and D̃ are dependent on V, they are unknown parameters that we need to estimate. Accordingly, we develop


an iterative algorithm to solve the formulated problem, which is described in Algorithm 1.

In the following, we provide a theoretical analysis of Algorithm 1. We first show a lemma cited from Nie et al. [55]:

Lemma 1: For any two given vectors v and ṽ, the following inequality relation holds:

‖v‖₂ − ‖v‖₂² / (2‖ṽ‖₂) ≤ ‖ṽ‖₂ − ‖ṽ‖₂² / (2‖ṽ‖₂)

From Lemma 1, we can derive the following corollary:

Corollary 1: For given matrices M and M̃, the following inequality relation holds:

‖M‖_F − ‖M‖_F² / (2‖M̃‖_F) ≤ ‖M̃‖_F − ‖M̃‖_F² / (2‖M̃‖_F)

Then, we prove our algorithm's convergence as follows:

Theorem 1: Algorithm 1 iteratively decreases the value of the objective function in Eq. (4) and converges to the global optimal solution.

Proof: According to Step 4 of Algorithm 1, we obtain:

V(t+1) = argmin_V ‖V⊤X − Z‖²_F + λ₁Tr V⊤D(t+1)V + λ₂Tr V⊤D̃(t+1)V    (11)

Also, from Step 5 of Algorithm 1, we obtain:

U(t+1) = argmin_U ‖(VU)⊤X − Y‖²_F + λ₂Tr U⊤V⊤(t+1)V(t+1)U    (12)

Then, we can derive that

J(t+1) + F(t+1) + λ₁Tr V⊤(t+1)D(t+1)V(t+1) + λ₂Tr U⊤(t+1)D̃(t+1)U(t+1)
≤ J(t) + F(t) + λ₁Tr V⊤(t)D(t+1)V(t) + λ₂Tr U⊤(t)D̃(t+1)U(t),    (13)

where J(t) is given as ‖V⊤(t)X − Z‖²_F and F(t) as ‖(V(t)U(t))⊤X − Y‖²_F + λ₂Tr U⊤(t)V⊤(t)V(t)U(t).

After substituting the definitions of D and D̃, we obtain

J(t+1) + F(t+1) + λ₁ Σ_{i=1}^c Σ_{j=1}^m ‖v_i^j(t+1)‖₂² / (2‖v_i^j(t)‖₂) + λ₂ ‖V(t+1)U(t+1)‖_F² / (2‖V(t)U(t)‖_F)
≤ J(t) + F(t) + λ₁ Σ_{i=1}^c Σ_{j=1}^m ‖v_i^j(t)‖₂² / (2‖v_i^j(t)‖₂) + λ₂ ‖V(t)U(t)‖_F² / (2‖V(t)U(t)‖_F)    (14)

From Lemma 1, we can derive:

Σ_{i=1}^c Σ_{j=1}^m ‖v_i^j(t+1)‖₂ − Σ_{i=1}^c Σ_{j=1}^m ‖v_i^j(t+1)‖₂² / (2‖v_i^j(t)‖₂)
≤ Σ_{i=1}^c Σ_{j=1}^m ‖v_i^j(t)‖₂ − Σ_{i=1}^c Σ_{j=1}^m ‖v_i^j(t)‖₂² / (2‖v_i^j(t)‖₂),    (15)

and from Corollary 1, we can derive:

‖V(t+1)U(t+1)‖_F − ‖V(t+1)U(t+1)‖_F² / (2‖V(t)U(t)‖_F)
≤ ‖V(t)U(t)‖_F − ‖V(t)U(t)‖_F² / (2‖V(t)U(t)‖_F).    (16)

Adding Eqs. (13)-(16) on both sides, we obtain:

J(t+1) + F(t+1) + λ₁ Σ_{i=1}^c Σ_{j=1}^m ‖v_i^j(t+1)‖₂ + λ₂ ‖V(t+1)U(t+1)‖_A
≤ J(t) + F(t) + λ₁ Σ_{i=1}^c Σ_{j=1}^m ‖v_i^j(t)‖₂ + λ₂ ‖V(t)U(t)‖_A    (17)

Thus, by Eq. (17), the value of the objective function decreases in each iteration. Because our objective function is convex, Algorithm 1 converges to the global optimal solution of the formulated regularized optimization problem in Eq. (4).

Complexity analysis. Since the objective function in Eq. (4) is convex, Algorithm 1 converges very fast. In each iteration of Algorithm 1, Steps 4 and 5 are trivial to compute. Steps 6 and 7 can be computed by solving a system of linear equations, with quadratic complexity.

V. EXPERIMENTS

This section discusses the experiment setup, presents implementation details, and provides an analysis of our experimental results and a comparison with other methods.

A. Experiment Setup and Dataset

In the experiments, a Clearpath Jackal mobile robot is used to navigate over a variety of unstructured terrains. In order to collect heterogeneous multi-sensory data that can be utilized to improve terrain representations, the mobile robot is equipped with multiple diverse proprioceptive and exteroceptive sensors. A structured-light camera is used as an exteroceptive sensor to capture 3D colored point clouds of the unstructured terrains in front of the robot at a frame rate of 30 Hz. Proprioceptive sensors are also installed on the robot to measure its internal information, including wheel odometry, motor speed, inertial measurement unit (IMU) measurements, power consumption, and battery status. The data frame rate from the proprioceptive sensors varied from 5–100 Hz. Linear interpolation based on data time-stamps is performed to ensure a consistent data frame rate of 30 Hz. All of these data sources are used together as the input to the terrain adaptation approaches for evaluation and comparison.
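The timestamp-based linear interpolation used to align the sensor streams at a common 30 Hz rate can be sketched as follows (function and variable names are assumed, not from the paper):

```python
import numpy as np

def resample_to_rate(t_src, x_src, rate_hz=30.0):
    """Linearly interpolate a 1-D proprioceptive signal onto a fixed-rate
    time grid, so streams sampled at 5-100 Hz share one 30 Hz clock.
    t_src: sample timestamps (s, increasing), x_src: signal values."""
    t_grid = np.arange(t_src[0], t_src[-1], 1.0 / rate_hz)
    return t_grid, np.interp(t_grid, t_src, x_src)
```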

In addition, expert controls of the mobile robot are recorded as demonstrations of robot behaviors, with the goal of navigating the mobile robot over unstructured terrains in straight lines as quickly as possible while maintaining safety (e.g., no flipping or crashing). The recorded demonstrations are utilized as the input for training apprenticeship learning-based methods.

In the experiments, five types of unstructured terrains with various characteristics are used (shown in Figure 3), including:

[Robot Adaptation to Unstructured Terrains by Joint Representation and Apprenticeship Learning, rss2019.informatik.uni-freiburg.de/papers/0111_FI.pdf]

• Concrete: The concrete surface exhibits smooth and hard characteristics. The mobile robot navigates over concrete floors in an indoor corridor.

• Grass: The grass terrain is less smooth compared to the concrete terrain, and is characterized by low ruggedness. The grass terrain often causes small wheel slips when the mobile robot navigates in places with a greater slope.

• Mud: This type of unstructured terrain has special characteristics in terms of roughness. The muddy ground surface is wet and typically soft, and can produce a lot of slip in robot motion. Mud terrains may also contain weathered leaves and wood chips.

• Pebble: The pebble terrain is rough and characterized by small rocks, which we refer to as pebbles. The roughness requires a robot to navigate slowly and also results in relatively high power consumption. During demonstration, the robot is operated by human experts in a way that avoids sudden jerks, which is desirable in scenarios where a robot operates with fragile equipment.

• Rock: The rock terrain consists of larger rocks. This is the most challenging terrain type used for robot navigation in our experiments. The experts need to carefully control the robot and make detours to ensure the safety of the mobile robot.

Figure 3. Instances of the five terrain types used in the experiments (Concrete, Grass, Mud, Pebble, Rock). The color images are observed by the camera installed in front of the mobile robot when navigating over these unstructured terrains.

Experiments of robot navigation are performed twenty times on each of the five unstructured terrain types. Each experiment lasts for a period of 10–25 seconds, during which an expert operates the mobile robot to navigate over a particular terrain type. In total, the experiments consist of 4000 data instances. All instances are organized into a dataset, which contains the heterogeneous multi-sensory data obtained by the robot during navigation on unstructured terrains, as well as the associated demonstrations.

B. Implementation

From the multi-sensory data obtained by the robot's sensors, commonly adopted terrain features are extracted. For exteroceptive data from depth and color images, real-time feature extraction techniques, including Histograms of Oriented Gradients (HOG) [56], color descriptors, and Local Binary Patterns (LBP) [57], are used to compute visual features of the terrains from both color and depth images. The extracted HOG, color, and LBP features capture different visual characteristics of the terrains. Additionally, the point cloud data collected by the color-depth sensor is used to compute grid-based elevation maps [58] that discretely represent the grid-wise elevation and variance of the unstructured terrain. 3D point clouds together with transformation and wheel odometry data are utilized to form the grid-based local elevation map. Normal vector maps and variance maps are also calculated based upon the generated elevation map of unstructured terrains. The proprioceptive sensor data, including IMU and wheel odometry, are used together with the visual and elevation features to form a feature vector that is used as the input to our TRAL approach.
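A minimal, numpy-only sketch of two such visual descriptors is given below. The paper uses standard HOG [56] and LBP [57] implementations; these simplified functions (names are ours) only illustrate the kind of features that enter the terrain feature vector.

```python
import numpy as np

def color_histogram(rgb, bins=8):
    """Per-channel color histogram, a simple stand-in for the color
    descriptors used alongside HOG and LBP."""
    feats = [np.histogram(rgb[..., c], bins=bins, range=(0, 256), density=True)[0]
             for c in range(3)]
    return np.concatenate(feats)

def lbp_histogram(gray, bins=256):
    """Minimal 8-neighbor Local Binary Pattern descriptor, a simplified
    version of LBP [57]; production code would use an optimized library."""
    g = np.asarray(gray, dtype=np.int32)
    h, w = g.shape
    center = g[1:-1, 1:-1]
    codes = np.zeros_like(center)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        # compare each of the 8 neighbors against the center pixel
        neighbor = g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbor >= center).astype(np.int32) << bit
    return np.histogram(codes, bins=bins, range=(0, 256), density=True)[0]
```

Concatenating such descriptors with the elevation and proprioceptive features yields the per-frame feature vector described above.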

We implement four versions of the TRAL approach. First, we set λ1 and λ2 in Eq. (4) to 0, which performs joint terrain classification and robot behavior generation but cannot identify discriminative features. Then, we set λ1 = 0 and λ2 = 0.1 to identify discriminative features useful only for terrain classification. In addition, we assign λ1 = 0.1 and λ2 = 0 to identify discriminative features useful only for behavior generation. Finally, the full version of the proposed TRAL approach is implemented with λ1 = λ2 = 0.1, which identifies features that contribute to joint terrain classification and behavior generation. We observe that TRAL shows stable performance when λ1, λ2 ∈ (0.05, 5). The TRAL approach is implemented in Linux on the Jackal robot's onboard computer with an i5 2.5 GHz CPU and 8 GB of memory.

We also compare the TRAL approach with previous apprenticeship learning methods used for robot adaptation, including Learning from Demonstration (LfD) based upon probabilistic graphical models [59], Sequence-based Multi-modal Apprenticeship Learning (SMAL) [14], which implements robot adaptation based upon optimization-based state perception and MDP-based robot decision making, and an imitation learning method using optimization-based multiclass classification [54] with no representation learning (NoChar).

C. Results on Familiar Terrains

In this set of experiments, we evaluate the proposed TRAL approach's performance when the mobile robot navigates over familiar unstructured terrains, i.e., terrain types that have been experienced by the robot during the training period. The robot is trained using terrain features and expert demonstrations collected on all five types of unstructured terrains, and then tested using terrain features only from these five types of terrains. In order to evaluate how well human expertise is transferred to a mobile robot in terms of ground speed, which is viewed as one of the most critical performance metrics indicating how fast a navigation task can be completed by the robot [23, 60], we use the Root Mean Square Error (RMSE) as the metric to measure the magnitude of deviation of the navigation behaviors generated by apprenticeship learning methods from the expert controls. Although RMSE provides the magnitude of deviation, it cannot present the ratio of deviation compared with the expert control. Therefore, we also use the error ratio (ER), defined as the average of the ratio of the deviation magnitude over the control magnitude, in order to measure the percentage


[Figure 4 boxplots: RMSE (m/s) of LfD, SMAL, NoChar, λ1,2 = 0, λ1 = 0, λ2 = 0, and TRAL on (a) Concrete, (b) Grass, (c) Mud, (d) Pebble, and (e) Rock.]

Figure 4. Quantitative results of RMSE on ground velocity obtained by TRAL in the scenarios of navigation over familiar unstructured terrains. Comparison with previous apprenticeship learning methods and our baseline methods is also presented. The results of each method over each type of terrain are presented as a boxplot that depicts the distribution of the results from all testing instances based on a five-number summary (i.e., minimum, first quartile, median, third quartile, and maximum).

Table I
MEAN AND VARIANCE OF RMSE AND ER ON GROUND VELOCITY OBTAINED BY TRAL AND PREVIOUS/BASELINE METHODS OVER FAMILIAR UNSTRUCTURED TERRAINS.

Approach      RMSE (m/s)     ER (%)
LfD [59]      0.39 (±0.20)   21.35 (±10.16)
SMAL [14]     0.34 (±0.18)   12.50 (±10.10)
NoChar [54]   0.17 (±0.08)   10.75 (±4.12)
λ1,2 = 0      0.13 (±0.05)    7.48 (±4.96)
λ1 = 0        0.11 (±0.04)    5.33 (±3.40)
λ2 = 0        0.08 (±0.04)    3.40 (±2.41)
TRAL          0.05 (±0.03)    2.41 (±1.76)

of deviation from the expert controls. For both RMSE and ER, a smaller value indicates better performance.
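The two metrics can be written compactly as follows; the small `eps` guard in `error_ratio` is our own addition to avoid division by zero, and the function names are illustrative.

```python
import numpy as np

def rmse(generated, expert):
    """Root Mean Square Error between generated and expert ground velocities."""
    g, e = np.asarray(generated, dtype=float), np.asarray(expert, dtype=float)
    return np.sqrt(np.mean((g - e) ** 2))

def error_ratio(generated, expert, eps=1e-8):
    """Error Ratio (ER): average of |deviation| / |expert control|, in percent."""
    g, e = np.asarray(generated, dtype=float), np.asarray(expert, dtype=float)
    return 100.0 * np.mean(np.abs(g - e) / (np.abs(e) + eps))
```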

The quantitative results obtained by our proposed approach in the scenario of navigation over each of the familiar unstructured terrains are demonstrated in Figure 4, using boxplots that display the distribution of the results from all testing instances based on a five-number summary (i.e., minimum, first quartile, median, third quartile, and maximum). The overall mean and variance of RMSE and ER on ground velocity obtained by our TRAL approach are summarized in Table I. TRAL achieves an RMSE of 0.05 ± 0.03 m/s, which shows that the magnitude of deviation is 5 cm/s from the expert control on average. The error ratio (ER) demonstrates that the difference between the generated control and the expert control is 2.41% with a variance of 1.76% for robots navigating over familiar terrains.

Quantitative comparisons with several previous approaches and baseline methods are demonstrated in Figure 4 using the boxplot graphs, and comparisons of the overall performance are summarized in Table I. It is observed that the previous apprenticeship learning methods tested in our experiment, including LfD, SMAL, and NoChar, obtain a larger deviation magnitude and percentage. A likely reason explaining this phenomenon is that they cannot perform representation learning to identify discriminative terrain features. The raw feature vectors directly computed from the sensor measurements are noisy, which can decrease the performance of navigational behavior generation. Since our approach combines two loss functions to perform a joint optimization for both representation and apprenticeship learning, even without applying the two regularization norms, the version of our approach with λ1,2 = 0 is able to build good representations that improve the performance of behavior generation. When regularization terms are used to promote modality sparsity (i.e., with non-zero λ1 or λ2), these versions of our approach can find more discriminative features toward terrain classification or behavior generation, which can further reduce the deviation. When both regularization norms are used to learn terrain representations toward both terrain classification and navigational behavior generation, the complete TRAL approach obtains superior performance on both RMSE and ER over the previous methods.

D. Results on Unfamiliar Unstructured Terrains

In this set of experiments, we evaluate TRAL's performance when the mobile robot navigates over unfamiliar unstructured terrains, in which the terrain type that the robot navigates over has not been experienced during training. The robot is trained using terrain features and expert demonstrations collected on four categories of the unstructured terrains, and then evaluated on the fifth terrain type that has not been used in the training period. For example, Figure 5(c) demonstrates the results from the scenario in which the robot is trained using data from the concrete, grass, pebble, and rock terrain types, and evaluated on the mud terrain that is not experienced by the robot in training.
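This leave-one-terrain-out protocol can be sketched as follows; the dataset layout (a dict of per-terrain instance lists) is an assumed organization, not the paper's actual data format.

```python
TERRAINS = ["concrete", "grass", "mud", "pebble", "rock"]

def leave_one_terrain_out(dataset):
    """dataset: dict mapping terrain name -> list of (features, demonstration)
    instances. Yields (train_set, test_set, held_out_terrain) for each of the
    five splits: train on four terrain types, test on the held-out fifth."""
    for held_out in TERRAINS:
        train = [x for t in TERRAINS if t != held_out for x in dataset[t]]
        test = list(dataset[held_out])
        yield train, test, held_out
```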

The RMSE results obtained by TRAL and comparisons with previous and baseline techniques on each type of terrain that is not familiar to the robot are demonstrated in Figure 5. The overall performance of TRAL and the other methods is presented in Table II. When the robot navigates over unfamiliar terrains, we observe phenomena that are similar to the results obtained when it navigates on familiar terrains: apprenticeship learning approaches without representation learning do not work well; integrating representation and apprenticeship learning decreases the errors; and the proposed regularization terms can further increase the performance. The complete TRAL method still obtains the best performance over the other methods, mostly because of its capability of identifying the most discriminative terrain features that generalize well to unfamiliar scenarios. On the other hand, since unfamiliar terrains are not seen in


[Figure 5 boxplots: RMSE (m/s) of LfD, SMAL, NoChar, λ1,2 = 0, λ1 = 0, λ2 = 0, and TRAL on (a) Concrete, (b) Grass, (c) Mud, (d) Pebble, and (e) Rock.]

Figure 5. Quantitative results of RMSE on ground velocity obtained by TRAL in the scenarios of navigation over unfamiliar unstructured terrains. Comparison with previous apprenticeship learning methods and our baseline methods is also presented.

Table II
MEAN AND VARIANCE OF RMSE AND ER ON GROUND VELOCITY OBTAINED BY TRAL AND PREVIOUS/BASELINE METHODS OVER UNFAMILIAR UNSTRUCTURED TERRAINS.

Approach      RMSE (m/s)     ER (%)
LfD [59]      0.35 (±0.21)   33.15 (±19.59)
SMAL [14]     0.29 (±0.19)   29.35 (±16.80)
NoChar [54]   0.23 (±0.07)   20.73 (±6.26)
λ1,2 = 0      0.16 (±0.05)   22.82 (±6.97)
λ1 = 0        0.11 (±0.03)   14.22 (±7.52)
λ2 = 0        0.08 (±0.04)   13.14 (±4.56)
TRAL          0.06 (±0.03)    7.45 (±3.89)

the training period, all approaches evaluated in this case exhibit higher errors (i.e., higher RMSE and ER values) compared to the results from the situations when robots navigate on familiar unstructured terrains.

E. Discussion

The TRAL approach offers the desirable ability to automatically estimate the importance of feature modalities for terrain classification and behavior generation. The results of modality importance, averaged over both the familiar and unfamiliar experimental scenarios, are demonstrated in Figure 6. It is observed that two feature modalities, i.e., IMU and terrain variance, are the most discriminative in determining both terrain types and robot navigational behaviors. Intuitively, both feature modalities can be considered to contain complementary information, because the IMU data is greatly affected by and correlated with terrain variances. Moreover, it is interesting to note that HOG features are discriminative for classifying the terrain types, but not for learning robot navigation behaviors. LBP features, on the other hand, exhibit the opposite impact. When comparing all visual features, we observe that features extracted from color images are generally more discriminative than the features from depth images. Due to TRAL's ability to integrate representation and apprenticeship learning in the unified framework, and due to the efficiency of our convex optimization formulation, TRAL is able to achieve high-speed processing on the robot's onboard computer. Including the time spent on feature extraction, our approach obtains a processing rate of 20 Hz. Without feature extraction, our approach itself achieves a processing speed of 80 Hz on average. This result indicates the runtime advantage

(a) Terrain classification (b) Behavior generation

Figure 6. Normalized importance of terrain feature modalities with respect to terrain classification and behavior generation.

of our TRAL approach for real-time robotics applications and its potential to facilitate high-speed mobile robot navigation.
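One plausible way to obtain such normalized importance scores, assuming the learned weight matrix groups its rows by feature modality (the layout and function name below are our assumptions), is the grouped ℓ2-norm of each modality's rows:

```python
import numpy as np

def modality_importance(V, modality_slices):
    """Estimate normalized feature-modality importance from a learned weight
    matrix V (rows = feature dimensions), in the spirit of Figure 6: the
    l2-norm of the rows belonging to each modality, normalized to sum to one.
    `modality_slices` maps modality name -> row slice of the feature vector."""
    raw = {name: np.linalg.norm(V[rows, :]) for name, rows in modality_slices.items()}
    total = sum(raw.values()) or 1.0   # guard against an all-zero matrix
    return {name: w / total for name, w in raw.items()}
```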

VI. CONCLUSION

In this paper, we present the TRAL approach to address the research problem of robot adaptation to unstructured terrains. As a theoretical novelty, TRAL formulates representation and apprenticeship learning in a unified regularized optimization framework to perform joint robot learning for terrain classification and behavior generation, which also automatically estimates the importance of terrain feature modalities. Our second theoretical novelty is the design of an optimization algorithm to solve the formulated problem, which possesses a theoretical convergence guarantee. The proposed approach is extensively evaluated using two real-world scenarios in the experiments, in which a mobile robot navigates over familiar or unfamiliar unstructured terrains. Experimental results have demonstrated that TRAL is able to transfer human expertise to mobile robots with small errors, obtain superior performance compared with previous and baseline methods, and offer intuitive insights on the importance of terrain feature modalities.

ACKNOWLEDGMENTS

This research was supported by NSF awards IIS-1849348, IIS-1849359, and CNS-1823245, DOE grant DE-FE0031650, and ARO grant W911NF-17-1-0447. The authors would also like to thank Christopher Reardon for valuable discussions and the anonymous reviewers for their constructive comments.


REFERENCES

[1] Shinji Kawatsuma, Mineo Fukushima, and Takashi Okada. Emergency response by robots to Fukushima-Daiichi accident: summary and lessons learned. Industrial Robot: An International Journal, 39(5):428–435, 2012.

[2] Kristopher Toussaint, Nicolas Pouliot, and Serge Montambault. Transmission line maintenance robots capable of crossing obstacles: State-of-the-art review and challenges ahead. JFR, 26(5):477–499, 2009.

[3] David Silver, J Andrew Bagnell, and Anthony Stentz. Learning from demonstration for autonomous navigation in complex unstructured terrain. IJRR, 29(12):1565–1592, 2010.

[4] Ayse Naz Erkan, Raia Hadsell, Pierre Sermanet, Jan Ben, Urs Muller, and Yann LeCun. Adaptive long range vision in unstructured terrain. In IROS, 2007.

[5] Christopher A Brooks and Karl Iagnemma. Vibration-based terrain classification for planetary exploration rovers. TRO, 21(6):1185–1191, 2005.

[6] Ayanna Howard and Homayoun Seraji. Vision-based terrain characterization and traversability assessment. Journal of Robotic Systems, 18(10):577–587, 2001.

[7] Sang-Ho Hyon. Compliant terrain adaptation for biped humanoids without measuring ground surface and contact forces. TRO, 25(1):171–178, 2009.

[8] Yoon-Gu Kim, Jeong-Hwan Kwak, Dae-Han Hong, In-Huck Kim, Dong-Hwan Shin, and Jinung An. Autonomous terrain adaptation and user-friendly tele-operation of wheel-track hybrid mobile robot. International Journal of Precision Engineering and Manufacturing, 13(10):1781–1788, 2012.

[9] Michael Shneier, Tommy Chang, Tsai Hong, Will Shackleford, Roger Bostelman, and James S Albus. Learning traversability models for autonomous mobile vehicles. AuRo, 24(1):69–86, 2008.

[10] Panagiotis Papadakis. Terrain traversability analysis methods for unmanned ground vehicles: A survey. Engineering Applications of Artificial Intelligence, 26(4):1373–1385, 2013.

[11] Maggie Wigness, John G Rogers, and Luis E Navarro-Serment. Robot navigation from human demonstration: learning control behaviors. In ICRA, 2018.

[12] Fernando L Garcia Bermudez, Ryan C Julian, Duncan W Haldane, Pieter Abbeel, and Ronald S Fearing. Performance analysis and terrain classification for a legged robot over rough terrain. In IROS, 2012.

[13] Dylan Williamson, Navinda Kottege, and Peyman Moghadam. Terrain characterisation and gait adaptation by a hexapod robot. In Australasian Conference on Robotics and Automation (ACRA), 2016.

[14] Fei Han, Xue Yang, Yu Zhang, and Hao Zhang. Sequence-based multimodal apprenticeship learning for robot perception and decision making. In ICRA, 2017.

[15] Christian Weiss, Holger Frohlich, and Andreas Zell. Vibration-based terrain classification using support vector machines. In IROS, 2006.

[16] Michael Happold, Mark Ollis, and Nikolas Johnson. Enhancing supervised terrain classification with predictive unsupervised learning. In RSS, 2006.

[17] Thomas M Howard and Alonzo Kelly. Optimal rough terrain trajectory generation for wheeled mobile robots. IJRR, 26(2):141–166, 2007.

[18] Anelia Angelova, Larry Matthies, Daniel Helmick, Gabe Sibley, and Pietro Perona. Learning to predict slip for ground robots. In ICRA, 2006.

[19] Sebastian Thrun, Michael Montemerlo, and Andrei Aron. Probabilistic terrain analysis for high-speed desert driving. In RSS, 2006.

[20] Paul Filitchkin and Katie Byl. Feature-based terrain classification for LittleDog. In IROS, 2012.

[21] Stefan Laible, Yasir Niaz Khan, and Andreas Zell. Terrain classification with conditional random fields on fused 3D LIDAR and camera data. In European Conference on Mobile Robots, 2013.

[22] Mieczyslaw Gregory Bekker. Introduction to terrain-vehicle systems. University of Michigan Press, 1969.

[23] Chris Urmson, Charlie Ragusa, David Ray, Joshua Anhalt, Daniel Bartz, Tugrul Galatali, Alexander Gutierrez, Josh Johnston, Sam Harbaugh, Hiroki "Yu" Kato, et al. A robust approach to high-speed navigation for unrehearsed desert terrain. JFR, 23(8):467–508, 2006.

[24] Cheng Wang and Nancy F Glenn. Integrating LiDAR intensity and elevation data for terrain characterization in a forested area. IEEE Geoscience and Remote Sensing Letters, 6(3):463–466, 2009.

[25] Karl Iagnemma, Shinwoo Kang, Hassan Shibly, and Steven Dubowsky. Online terrain parameter estimation for wheeled mobile robots with application to planetary rovers. TRO, 20(5):921–927, 2004.

[26] Lauro Ojeda, Johann Borenstein, Gary Witus, and Robert Karlsen. Terrain characterization and classification with a mobile robot. JFR, 23(2):103–122, 2006.

[27] Eric Trautmann and Laura Ray. Mobility characterization for autonomous mobile robots using machine learning. AuRo, 30(4):369–383, 2011.

[28] Roberto Manduchi, Andres Castano, Ashit Talukder, and Larry Matthies. Obstacle detection and terrain classification for autonomous off-road navigation. AuRo, 18(1):81–102, 2005.

[29] Lynne E Parker. Task-oriented multi-robot learning in behavior-based systems. In IROS, 1996.

[30] Lynne E Parker. Case study for life-long learning and adaptation in cooperative robot teams. In Sensor Fusion and Decentralized Control in Robotic Systems, volume 3839, pages 92–102, 1999.

[31] Lynne E Parker. Lifelong adaptation in heterogeneous multi-robot teams: Response to continual variation in individual robot performance. AuRo, 8(3):239–267, 2000.

[32] Ian Watson and Farhi Marir. Case-based reasoning: A review. Knowledge Engineering Review, 9(4):327–354, 1994.

[33] Michael W Floyd, Michael Drinkwater, and David W Aha. Trust-guided behavior adaptation using case-based reasoning. Technical report, 2015.

[34] Qiqian Zhang, Hui Qian, and Miaoliang Zhu. Parameter adaptation by case-based mission-planning of outdoor autonomous mobile robot. In Intelligent Vehicles Symposium, 2005.

[35] Jonathan Ko and Dieter Fox. GP-BayesFilters: Bayesian filtering using Gaussian process prediction and observation models. AuRo, 27(1):75–90, 2009.

[36] Marc Deisenroth and Carl E Rasmussen. PILCO: A model-based and data-efficient approach to policy search. In ICML, pages 465–472, 2011.

[37] Ian Lenz, Ross A Knepper, and Ashutosh Saxena. DeepMPC: Learning deep latent features for model predictive control. In RSS, 2015.

[38] Christian Plagemann, Sebastian Mischke, Sam Prentice, Kristian Kersting, Nicholas Roy, and Wolfram Burgard. Learning predictive terrain models for legged robot locomotion. In IROS, 2008.

[39] Sergey Levine and Pieter Abbeel. Learning neural network policies with guided policy search under unknown dynamics. In NIPS, 2014.

[40] Austin D Buchan, Duncan W Haldane, and Ronald S Fearing. Automatic identification of dynamic piecewise affine models for a running robot. In IROS, 2013.

[41] Alexander Kleiner, Markus Dietl, and Bernhard Nebel. Towards a life-long learning soccer agent. In Robot Soccer World Cup, pages 126–134, 2002.

[42] Sebastian Thrun and Tom M Mitchell. Lifelong robot learning. In The Biology and Technology of Intelligent Autonomous Agents, pages 165–196. 1995.

[43] Gentaro Taga. A model of the neuro-musculo-skeletal system for human locomotion. Biological Cybernetics, 73(2):97–111, 1995.

[44] Hiroshi Kimura, Yasuhiro Fukuoka, and Hiroyuki Nakamura. Biologically inspired adaptive dynamic walking of the quadruped on irregular terrain. In Robotics Research, pages 329–336. 2000.

[45] Hiroshi Kimura, Seiichi Akiyama, and Kazuaki Sakurama. Realization of dynamic walking and running of the quadruped using neural oscillator. AuRo, 7(3):247–258, 1999.

[46] Andrew Lookingbill, John Rogers, David Lieb, J Curry, and Sebastian Thrun. Reverse optical flow for self-supervised adaptive autonomous robot navigation. IJCV, 74(3):287–302, 2007.

[47] Samir Nabulsi, M Armada, and Hector Montes. Multiple terrain adaptation approach using ultrasonic sensors for legged robots. In Climbing and Walking Robots. 2006.

[48] Brenna D Argall, Sonia Chernova, Manuela Veloso, and Brett Browning. A survey of robot learning from demonstration. Robotics and Autonomous Systems, 57(5):469–483, 2009.

[49] Dieter Fox, Wolfram Burgard, and Sebastian Thrun. The dynamic window approach to collision avoidance. RAM, 4(1):23–33, 1997.

[50] Zheng Zhang, Yong Xu, Jian Yang, Xuelong Li, and David Zhang. A survey of sparse representation: algorithms and applications. IEEE Access, 3:490–530, 2015.

[51] Hao Zhang, Fei Han, and Hua Wang. Robust multimodal sequence-based loop closure detection via structured sparsity. In RSS, 2016.

[52] Fei Han, Xue Yang, Yiming Deng, Mark Rentschler, Dejun Yang, and Hao Zhang. SRAL: Shared representative appearance learning for long-term visual place recognition. RA-L, 2(2):1172–1179, 2017.

[53] Sylvain Calinon and Aude Billard. Incremental learning of gestures by imitation in a humanoid robot. In HRI, 2007.

[54] Nathan Ratliff, J Andrew Bagnell, and Siddhartha S Srinivasa. Imitation learning for locomotion and manipulation. In Humanoids, 2007.

[55] Feiping Nie, Heng Huang, Xiao Cai, and Chris H Ding. Efficient and robust feature selection via joint ℓ2,1-norms minimization. In NIPS, 2010.

[56] Navneet Dalal and Bill Triggs. Histograms of oriented gradients for human detection. In CVPR, 2005.

[57] Timo Ahonen, Abdenour Hadid, and Matti Pietikainen. Face description with local binary patterns: Application to face recognition. PAMI, 28(12):2037–2041, 2006.

[58] Peter Fankhauser, Michael Bloesch, Christian Gehring, Marco Hutter, and Roland Siegwart. Robot-centric elevation mapping with uncertainty estimates. In Mobile Service Robotics, pages 433–440. 2014.

[59] Fenglu Ge, Wayne Moore, and Michael Antolovich. Learning from demonstration using GMM, CHMM and DHMM: A comparison. In Australasian Joint Conference on Artificial Intelligence, 2015.

[60] Shingo Shimoda, Yoji Kuroda, and Karl Iagnemma. Potential field navigation of high speed unmanned ground vehicles on uneven terrain. In ICRA, 2005.