Qualitative Induction for Behavioral Cloning Dorian Šuc and Ivan Bratko AI Lab Faculty of Computer and Information Sc. University of Ljubljana, Slovenia


Page 1: Qualitative Induction for Behavioral Cloning

Qualitative Induction for Behavioral Cloning

Dorian Šuc and Ivan Bratko

AI Lab, Faculty of Computer and Information Science

University of Ljubljana, Slovenia

Page 2: Qualitative Induction for Behavioral Cloning

Qualitative learning in behavioral cloning

Dorian Šuc and Ivan Bratko

Page 3: Qualitative Induction for Behavioral Cloning

Behavioral cloning

Operator

Dynamic system: crane, aircraft, acrobot...

Control trace

Machine learning

Operator's double ("clone")

Page 4: Qualitative Induction for Behavioral Cloning

Approaches to cloning

A trace is a sequence:

(State1, Action1), (State2, Action2), ...

"Direct controller": induce a mapping

States → Actions, Action = f(State)

"Indirect controller": two learning problems

• learning the operator's trajectory

• learning the system dynamics
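A minimal sketch of the direct-controller idea, Action = f(State), induced from a trace. The 1-nearest-neighbour lookup and the toy trace are assumptions chosen for brevity; the slides do not say which learner was used for direct controllers.

```python
from typing import Callable, List, Tuple

State = Tuple[float, ...]
Trace = List[Tuple[State, float]]  # (State_i, Action_i) pairs


def induce_direct_controller(trace: Trace) -> Callable[[State], float]:
    """Return Action = f(State) as a 1-nearest-neighbour lookup over the trace."""
    def f(state: State) -> float:
        def squared_distance(pair):
            recorded_state, _ = pair
            return sum((a - b) ** 2 for a, b in zip(recorded_state, state))
        _, action = min(trace, key=squared_distance)
        return action
    return f


# Hypothetical toy trace: state = (position, velocity), action = applied force.
trace = [((0.0, 0.0), 1.0), ((30.0, 2.0), 0.0), ((60.0, 0.5), -1.0)]
controller = induce_direct_controller(trace)
print(controller((1.0, 0.1)))    # copies the action recorded near the start
print(controller((58.0, 0.4)))   # copies the action recorded near the goal
```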

Page 5: Qualitative Induction for Behavioral Cloning

Using indirect controllers

Indirect controller = "generalized trajectory" + system dynamics

1. Compute Error = diff(CurrentState, GeneralTrajectory)

2. Using the dynamics, determine the next action Action such that Action reduces Error
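The two steps can be sketched as follows. The generalized trajectory, the unit-mass dynamics, and the candidate action set are all hypothetical stand-ins; only the two-step structure (compute Error, then pick an Action that reduces it) comes from the slide.

```python
def general_trajectory(x: float) -> float:
    """Hypothetical generalized trajectory: desired velocity as a function of position."""
    return 2.0 if x < 30.0 else max(0.0, (60.0 - x) / 15.0)


def dynamics(x: float, v: float, action: float, dt: float = 0.1):
    """Hypothetical system dynamics: a unit mass pushed by the force `action`."""
    v_next = v + action * dt
    return x + v_next * dt, v_next


def diff(state, trajectory) -> float:
    """Step 1: Error = desired velocity at the current position minus actual velocity."""
    x, v = state
    return trajectory(x) - v


def choose_action(state, candidates=(-1.0, 0.0, 1.0)) -> float:
    """Step 2: pick the action whose predicted next state has the smallest |Error|."""
    x, v = state
    return min(candidates,
               key=lambda a: abs(diff(dynamics(x, v, a), general_trajectory)))


print(choose_action((0.0, 0.0)))    # too slow at the start: accelerate
print(choose_action((59.0, 1.0)))   # too fast near the goal: brake
```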

Page 6: Qualitative Induction for Behavioral Cloning

Comparison of direct and indirect controllers

Experimental findings:

Indirect controllers:

- are more robust

- allow the skill to be explained in terms of the operator's subgoals

- give better insight into the operator's subconscious skill

Page 7: Qualitative Induction for Behavioral Cloning

This paper

• Induction of indirect controllers

• Qualitative learning of trajectories

• QUIN: a program for inducing qualitative trees from numerical data

• Application to crane control

Page 8: Qualitative Induction for Behavioral Cloning

Example of a qualitative relation

Behavior of a gas

Quantitative law:

Pressure * Volume / Temperature = const.

Qualitative law:

Pressure = M^{+,-}(Temperature, Volume)
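A small numerical check of what the qualitative law says, using the quantitative law solved for pressure. The constant and the sample values are arbitrary assumptions. When temperature and volume both increase, M^{+,-} leaves the direction of the pressure change open, which the last two calls illustrate.

```python
def pressure(temperature: float, volume: float, const: float = 1.0) -> float:
    """Quantitative law solved for pressure: P = const * T / V."""
    return const * temperature / volume


def direction(before: float, after: float) -> int:
    """Qualitative direction of change: +1, 0 or -1."""
    return (after > before) - (after < before)


base = pressure(300.0, 2.0)
print(direction(base, pressure(310.0, 2.0)))  # temperature up, volume fixed
print(direction(base, pressure(300.0, 2.5)))  # volume up, temperature fixed
print(direction(base, pressure(330.0, 2.1)))  # both up: pressure happens to rise
print(direction(base, pressure(303.0, 3.0)))  # both up: pressure happens to fall
```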

Page 9: Qualitative Induction for Behavioral Cloning

The QUIN program: QUalitative INduction

Numerical examples → QUIN → Qualitative tree

Qualitative tree: similar to a decision tree, but with qualitative constraints in the leaves

Page 10: Qualitative Induction for Behavioral Cloning

Example problem for QUIN

Noisy examples: z = x^2 - y^2 + noise (std. dev. 50)
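A sketch of how such a noisy example set could be generated. The sampling range for x and y, the number of examples, and the seed are assumptions; the slide gives only the target function and the noise standard deviation.

```python
import random


def make_examples(n=200, noise_sd=50.0, lo=-20.0, hi=20.0, seed=0):
    """Return n examples (x, y, z) with z = x^2 - y^2 + Gaussian noise."""
    rng = random.Random(seed)  # fixed seed so the set is reproducible
    examples = []
    for _ in range(n):
        x = rng.uniform(lo, hi)
        y = rng.uniform(lo, hi)
        z = x * x - y * y + rng.gauss(0.0, noise_sd)
        examples.append((x, y, z))
    return examples


examples = make_examples()
print(len(examples), examples[0])
```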

Page 11: Qualitative Induction for Behavioral Cloning

Qualitative patterns in the data

[Figure: z = x*x - y*y, noisy examples (std. dev. = 50); labels x = 0 and y = 0]

x > 0 & y > 0 => z = M^{+,-}(x, y)

Page 12: Qualitative Induction for Behavioral Cloning

Induced qualitative tree for z = x^2 - y^2

x ≤ 0:
    y ≤ 0: z = M^{-,+}(x, y)
    y > 0: z = M^{-,-}(x, y)
x > 0:
    y ≤ 0: z = M^{+,+}(x, y)
    y > 0: z = M^{+,-}(x, y)

z = M^{+,-}(x, y): z is monotonically increasing with x and monotonically decreasing with y
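A sketch of a qualitative tree as a data structure, assuming the tree for z = x^2 - y^2 splits on x at 0 and then on y at 0. The nested-tuple encoding and the function names are illustrative choices, not QUIN's internal representation. Each leaf is compared with the signs of the partial derivatives dz/dx = 2x and dz/dy = -2y.

```python
# Internal node: (attribute index, threshold, subtree if value <= threshold, subtree otherwise).
# Leaf: pair of signs (s_x, s_y) standing for z = M^{s_x,s_y}(x, y).
TREE = (0, 0.0,
        (1, 0.0, ("-", "+"), ("-", "-")),
        (1, 0.0, ("+", "+"), ("+", "-")))


def leaf_for(tree, point):
    """Walk from the root to the leaf that covers `point` = (x, y)."""
    while len(tree) == 4:  # internal nodes have four fields, leaves have two
        attr, threshold, low, high = tree
        tree = low if point[attr] <= threshold else high
    return tree


def gradient_signs(x, y):
    """Signs of dz/dx = 2x and dz/dy = -2y for z = x^2 - y^2."""
    return ("+" if 2 * x > 0 else "-", "+" if -2 * y > 0 else "-")


for point in [(3.0, 4.0), (-3.0, 4.0), (3.0, -4.0), (-3.0, -4.0)]:
    print(point, leaf_for(TREE, point), gradient_signs(*point))
```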

Page 13: Qualitative Induction for Behavioral Cloning

Qualitatively Constrained Functions, QCF

M^{s_1, ..., s_m}: R^m → R, s_i = + or -

Signs s_i indicate directions of change:

If s_i = +, the function monotonically increases in the i-th attribute: the function is "positively related" to the i-th attribute

If s_i = -, the function is "negatively related" to the i-th attribute
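A sketch of a numerical spot check of this definition: nudge one attribute at a time and compare the direction of change of the function with the sign s_i. The helper name, the step size, and the sample points are assumptions, and passing the check on a few points is only a necessary condition, not a proof of monotonicity.

```python
def satisfies_qcf(f, signs, points, step=1e-3):
    """Check on sample points that f moves in direction signs[i] when attribute i grows."""
    for p in points:
        for i, s in enumerate(signs):
            moved = list(p)
            moved[i] += step
            delta = f(*moved) - f(*p)
            if delta * s <= 0:  # moved the wrong way, or did not move at all
                return False
    return True


def z(x, y):
    return x * x - y * y


quadrant = [(1.0, 1.0), (2.0, 5.0), (10.0, 3.0)]  # points with x > 0 and y > 0
print(satisfies_qcf(z, (+1, -1), quadrant))  # M^{+,-}(x, y)
print(satisfies_qcf(z, (+1, +1), quadrant))  # M^{+,+}(x, y)
```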

Page 14: Qualitative Induction for Behavioral Cloning

QCF consistency with examples

• Each pair of examples (e, f) defines a qualitative change vector q with respect to a no-change threshold

• A QCF is consistent with (e, f) if the QCF permits q
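A sketch of how a pair of numerical examples could be turned into a qualitative change vector. Using one no-change threshold per component, and treating the last component as the class, are assumptions made for this illustration.

```python
def qualitative_change(e, f, thresholds):
    """Qualitative change vector from example e to example f.

    Each component is +1 (increase), -1 (decrease) or 0 (change within the
    no-change threshold). The last component is the class.
    """
    q = []
    for a, b, t in zip(e, f, thresholds):
        d = b - a
        q.append(0 if abs(d) <= t else (1 if d > 0 else -1))
    return tuple(q)


e = (1.0, 5.0, 10.0)   # hypothetical example (x, y, z)
f = (3.0, 5.05, 2.0)
q = qualitative_change(e, f, thresholds=(0.1, 0.1, 1.0))
print(q)  # x increased, y unchanged within threshold, z decreased
```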

Page 15: Qualitative Induction for Behavioral Cloning

QCF ambiguity

• A QCF may be consistent with a qualitative change vector q and still be ambiguous w.r.t. q

• A QCF is ambiguous w.r.t. q if it also permits qualitative changes in the class other than the one in q
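A sketch that sorts a qualitative change vector into consistent, ambiguous, or inconsistent for a given QCF. The rule used here is a simplified reading of the slides: if all changed attributes push the class the same way, only that change is permitted; if they push in opposite directions, any class change is permitted. Treating "no attribute changed" as permitting only "no class change" is an assumption.

```python
def permitted_class_changes(signs, attr_changes):
    """Set of class changes (+1, 0, -1) that the QCF permits for these attribute changes."""
    effects = {s * q for s, q in zip(signs, attr_changes) if q != 0}
    if len(effects) == 1:
        return effects           # all changed attributes agree on one direction
    if not effects:
        return {0}               # assumption: nothing changed, so the class should not change
    return {-1, 0, 1}            # opposing influences: the QCF cannot decide


def judge(signs, q):
    """Classify q = (attribute changes..., class change) against the QCF."""
    *attr_changes, class_change = q
    permitted = permitted_class_changes(signs, attr_changes)
    if class_change not in permitted:
        return "inconsistent"
    return "consistent" if len(permitted) == 1 else "ambiguous"


m_plus_minus = (+1, -1)                  # M^{+,-}(x, y)
print(judge(m_plus_minus, (1, -1, 1)))   # x up, y down, z up
print(judge(m_plus_minus, (1, 1, 1)))    # x up, y up, z up
print(judge(m_plus_minus, (1, -1, -1)))  # x up, y down, z down
```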

Page 16: Qualitative Induction for Behavioral Cloning

Error-cost of QCF

• The error-cost of a QCF w.r.t. an example set is defined as a weighted encoding length

• The error-cost of a QCF considers: the encoding of the QCF + the encoding of inconsistent predictions by the QCF + the encoding of ambiguous predictions by the QCF

• Predictions are weighted by the proximity of the examples concerned
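A toy stand-in for the shape of such a cost, not QUIN's actual encoding-length formula, which the slides do not give. The QCF term, the penalty constants, and the exponential proximity weight are all assumptions; only the three ingredients and the proximity weighting come from the slide.

```python
import math


def error_cost(num_attributes, judged_pairs, scale=1.0,
               inconsistent_penalty=1.0, ambiguous_penalty=0.5):
    """Toy error-cost: QCF size plus proximity-weighted penalties.

    judged_pairs: list of (verdict, distance) for example pairs, where verdict
    is "consistent", "ambiguous" or "inconsistent" and distance is the
    distance between the two examples in attribute space.
    """
    cost = float(num_attributes)                 # stand-in for encoding the QCF itself
    for verdict, distance in judged_pairs:
        weight = math.exp(-distance / scale)     # closer pairs weigh more
        if verdict == "inconsistent":
            cost += weight * inconsistent_penalty
        elif verdict == "ambiguous":
            cost += weight * ambiguous_penalty
    return cost


print(error_cost(2, [("inconsistent", 0.0), ("ambiguous", 0.0)]))
print(error_cost(2, [("inconsistent", 0.1)]), error_cost(2, [("inconsistent", 5.0)]))
```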

Page 17: Qualitative Induction for Behavioral Cloning

The QUIN algorithm

• Top-down greedy algorithm that induces qualitative trees

• For every possible split (node), find the "most consistent" QCF (minimum cost) for each subset of examples

• Find the best attribute (best split) according to MDL
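A heavily simplified sketch of one greedy split step. It is not the QUIN implementation: the cost is a plain count over example pairs (1 per inconsistent pair, 0.5 per ambiguous pair) rather than an MDL encoding length, every QCF uses all attributes, there is no proximity weighting, and recursion into the subsets is omitted. All function names are hypothetical. The usage example is a deterministic one-attribute set z = |x|, for which the split at 0 separates a decreasing from an increasing region.

```python
import itertools


def sign(d):
    return (d > 0) - (d < 0)


def qcf_cost(signs, examples):
    """Toy cost of a QCF on a set of examples (attributes..., class)."""
    cost = 0.0
    for e, f in itertools.combinations(examples, 2):
        effects = {s * sign(b - a) for s, a, b in zip(signs, e[:-1], f[:-1])} - {0}
        class_change = sign(f[-1] - e[-1])
        if len(effects) == 1 and class_change not in effects:
            cost += 1.0      # inconsistent prediction
        elif len(effects) == 2:
            cost += 0.5      # ambiguous prediction
    return cost


def best_qcf(examples, n_attrs):
    """Try every sign combination and return (cost, signs) of the cheapest QCF."""
    candidates = itertools.product((1, -1), repeat=n_attrs)
    return min((qcf_cost(s, examples), s) for s in candidates)


def best_split(examples, n_attrs, thresholds):
    """Return (cost, attribute, threshold, signs_low, signs_high) of the cheapest split."""
    best = None
    for attr in range(n_attrs):
        for t in thresholds:
            low = [e for e in examples if e[attr] <= t]
            high = [e for e in examples if e[attr] > t]
            if len(low) < 2 or len(high) < 2:
                continue
            (cost_low, s_low), (cost_high, s_high) = best_qcf(low, n_attrs), best_qcf(high, n_attrs)
            candidate = (cost_low + cost_high, attr, t, s_low, s_high)
            if best is None or candidate[0] < best[0]:
                best = candidate
    return best


v_shape = [(x, abs(x)) for x in range(-5, 6)]     # one attribute x, class z = |x|
print(best_split(v_shape, n_attrs=1, thresholds=(-2, 0, 2)))
```

The printed tuple reads: cost, attribute index, threshold, QCF signs for the low side, QCF signs for the high side.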

Page 18: Qualitative Induction for Behavioral Cloning

Experimental evaluation

• On a set of artificial domains:

– QUIN performs well on noisy data

– QUIN finds qualitative relations that correspond to intuition

• QUIN in behavioral cloning:

– QUIN was used to learn the operator's control strategy

– Experiments in the crane domain

Page 19: Qualitative Induction for Behavioral Cloning

Application to behavioral cloning

• Domain: crane control

• Goal of cloning: successful and comprehensible clones

Page 20: Qualitative Induction for Behavioral Cloning

Container crane

[Figure: container crane with trolley and load; trolley position X, rope length L; start X0 = 0, L0 = 20; goal Xg = 60, Lg = 32]

Control forces: Fx, FL

State: X, dX, Φ, dΦ, L, dL

Based on previous work by T. Urbancic (94)

Control task: transfer the load from the start position to the goal position

Page 21: Qualitative Induction for Behavioral Cloning

QUIN in skill modeling, crane domain

• Qualitative trees induced for trolley control and rope control

• Traces of two operators with very different control styles

Page 22: Qualitative Induction for Behavioral Cloning

Trolley control, operator S

desired_velocity = f(X, Φ, dΦ)

X < 20.7?
    yes: M^{+}(X)
    no: X < 60.1?
        yes: M^{-}(X)
        no: M^{+}(Φ)

M^{+}(X): First the trolley velocity is increasing

M^{-}(X): From about the middle distance from the goal, the trolley velocity is decreasing

M^{+}(Φ): At the goal, reduce the swing of the rope (by accelerating the trolley when the rope angle increases)
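A sketch of operator S's tree as a lookup from trolley position to the active qualitative constraint on desired_velocity, assuming the tree reads: X < 20.7 gives M+(X); otherwise X < 60.1 gives M-(X); otherwise M+(Φ). The function name and the dictionary encoding (variable name to sign) are illustrative choices.

```python
def operator_s_constraint(x: float) -> dict:
    """Return the QCF active at trolley position x as {variable: sign}."""
    if x < 20.7:
        return {"X": +1}      # desired velocity increases with X
    if x < 60.1:
        return {"X": -1}      # desired velocity decreases with X
    return {"Phi": +1}        # desired velocity increases with the rope angle


for x in (5.0, 40.0, 60.5):
    print(x, operator_s_constraint(x))
```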

Page 23: Qualitative Induction for Behavioral Cloning

Crane control: comparison of operators

Comparison of differences in control style

Operator S:

X < 20.7?
    yes: M^{+}(X)
    no: X < 60.1?
        yes: M^{-}(X)
        no: M^{+}(Φ)

Operator L:

[Figure: qualitative tree with split tests X < 29.3 and dΦ < -0.02 and leaves M^{-}(X), M^{-,+}(X, Φ), M^{+,+,-}(X, Φ, dΦ)]

Page 24: Qualitative Induction for Behavioral Cloning

Transforming a qualitative strategy into a quantitative one

• By concretizing QCFs into real-valued functions

M^{+}(X) → f+(X): a randomly generated function that satisfies the QCF

• Domain knowledge can be used:

– maximal and minimal values of the state variables

– the trolley must start moving at the beginning

– the trolley must stop at the goal
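A sketch of one way to draw a random function that satisfies M+(X): a piecewise-linear function with random positive increments, pinned to minimum and maximum values taken from domain knowledge. The piecewise-linear form, the number of knots, and the numeric bounds are assumptions; the slides do not specify how the random functions were generated.

```python
import random


def random_increasing_function(x_min, x_max, y_min, y_max, knots=5, seed=0):
    """Random piecewise-linear function rising strictly from y_min to y_max."""
    rng = random.Random(seed)
    steps = [rng.uniform(0.1, 1.0) for _ in range(knots)]   # positive increments
    total = sum(steps)
    ys = [y_min]
    for s in steps:
        ys.append(ys[-1] + (y_max - y_min) * s / total)
    xs = [x_min + (x_max - x_min) * i / knots for i in range(knots + 1)]

    def f(x):
        x = min(max(x, x_min), x_max)                       # clamp to the known range
        for i in range(knots):
            if x <= xs[i + 1]:
                frac = (x - xs[i]) / (xs[i + 1] - xs[i])
                return ys[i] + frac * (ys[i + 1] - ys[i])
        return ys[-1]

    return f


# Hypothetical concretization of M+(X) on 0 <= X <= 20.7 with velocities in [0, 3].
f_plus = random_increasing_function(0.0, 20.7, 0.0, 3.0)
samples = [f_plus(0.5 * i) for i in range(42)]
print(all(a < b for a, b in zip(samples, samples[1:])))
```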

Page 25: Qualitative Induction for Behavioral Cloning

QUIN in skill modeling

Induced control strategies:

• Comprehensible and very successful

• Provide insight into the differences between individual control styles

QUIN is able to detect very hidden aspects of human subconscious skill (aspects that were not known before this application of QUIN)

Page 26: Qualitative Induction for Behavioral Cloning

Related work in qualitative reasoning

• In qualitative reasoning: our QCFs are inspired by qualitative proportionalities (Q+) in QPT (Forbus) and monotonicity relations (M+) in QSIM (Kuipers)

• In learning qualitative models of dynamic systems: Mozetic; Coiera; Bratko et al.; Varsek; Richards et al.; Dzeroski, Todorovski

• Distinguishing features of QUIN: models of static systems, qualitative trees, takes numerical examples directly