nw computational intelligence laboratory 1 designing a contextually discerning controller by lars...

NW Computational Intelligence Laboratory

Designing A Contextually Discerning Controller

By Lars Holmstrom

Main Points

• To Describe The Architecture And Training Methodology

• To Discuss What Is Meant By Context And Contextual Discernment

• To Demonstrate This Methodology Using The Pole-Cart System

Architecture Of A Contextually Discerning Controller

[Block diagram: a Contextual Discerner and a Contextually Aware Controller, with signals R(t), CD(t), and u(t)]

What Is Meant By Context?

• Conceptual Definition
– A Set Of Parameters Capable Of Representing Changes In The Dynamics Of A Plant

• Mathematical Definition
– A Point On A Manifold Of Functions That Can Be Indexed By A Specific Point In An Associated Continuous Coordinate Space

The Set Of Sine Functions Specified By Coordinates Corresponding To Amplitude, Frequency, And DC-Offset

0.5*sin(0.7*x)+0.3
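The sine example can be written out as a small sketch: each coordinate triple (amplitude, frequency, DC offset) picks out one function on the manifold. The names `SineContext` and `sine_on_manifold` are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, NamedTuple


class SineContext(NamedTuple):
    """Coordinates of one point on the manifold of sine functions."""
    amplitude: float
    frequency: float
    dc_offset: float


def sine_on_manifold(c: SineContext) -> Callable[[float], float]:
    """Return the function indexed by the context coordinates c."""
    def f(x: float) -> float:
        return c.amplitude * math.sin(c.frequency * x) + c.dc_offset
    return f


# The function from the slide: 0.5*sin(0.7*x)+0.3
example = sine_on_manifold(SineContext(0.5, 0.7, 0.3))
```

Changing any one coordinate moves to a different function on the same manifold, which is the sense in which the coordinates act as a context.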

Contextual Discernment

• To Discern The True Coordinates Of A Function That We Know Lies On A Specified Manifold But Has Unknown Manifold Coordinates

[Figure: the Current Discerned Context CD approaches the Unknown Actual Context CA through successive estimates CD0, CD1, CD2, CD3, with updates ∆CD0, ∆CD1, ∆CD2 between them]

Contextual Discernment System

[Block diagram. Blocks: Function On Manifold At Location CA (output yA(t)), Function On Manifold At Location CD (output yD(t)), CDN (Context Discerning Network), unit delay z^-1. Signals: x(t), CD(t), ∆CD(t), D(t) = yA(t) - yD(t)]
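A minimal sketch of the discernment loop, using the sine manifold as the function family. The update CD(t+1) = CD(t) + ∆CD(t) and the error D(t) = yA(t) - yD(t) follow the diagram; everything else is an assumption made for illustration. In particular, a plain gradient step on the squared error stands in for the trained CDN, the error is averaged over a fixed batch of x values rather than one sample per time step, and all function names are hypothetical.

```python
import numpy as np


def manifold(c, x):
    """Sine function at manifold coordinates c = (amplitude, frequency, offset)."""
    a, f, b = c
    return a * np.sin(f * x) + b


def jacobian(c, x):
    """Derivative of the manifold output with respect to each coordinate."""
    a, f, _ = c
    return np.stack([np.sin(f * x), a * x * np.cos(f * x), np.ones_like(x)])


def delta_cd(cd, x, y_a, rate=0.1):
    """Stand-in for the CDN: one gradient step on the mean squared error."""
    d = y_a - manifold(cd, x)  # D(t) = yA(t) - yD(t)
    return 2.0 * rate * (jacobian(cd, x) * d).mean(axis=1)


def discern(cd0, c_a, steps=2000):
    """Iterate CD(t+1) = CD(t) + dCD(t) from the initial guess cd0."""
    x = np.linspace(-3.0, 3.0, 61)  # assumed input samples
    y_a = manifold(c_a, x)          # output of the function at the actual context
    cd = np.asarray(cd0, dtype=float)
    for _ in range(steps):
        cd = cd + delta_cd(cd, x, y_a)
    return cd


c_a = np.array([0.5, 0.7, 0.3])            # unknown actual context CA
cd_final = discern([0.7, 0.8, 0.1], c_a)   # start from CD0
```

The initial guess here is deliberately close to CA; a gradient step can stall in a local minimum when the starting frequency is far off, which is one reason a learned update rule may be preferable.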

Training The Contextual Discernment System

[Block diagram. The discernment loop is labeled "Plant" and contains the CDN (Context Discerning Network), a unit delay z^-1, and the signals x(t), CD(t), ∆CD(t), yA(t), yD(t), D(t) = yA(t) - yD(t). A Critic is added, with signals CD(t), x(t), and λ(t). CD(t+1) carries the label "Used To Train Critic"]

• ∆CD(t) = u(t)

• CD(t) = R(t)

• CD(t) + ∆CD(t) = R(t+1)

• U(t) = (yA(t) - yD(t))^2

Training The Critic and CDN With DHP

• The Approach Is Similar To That For Training A Controller With DHP, But With The Substitutions:
– R(t) = CD(t)
– u(t) = ∆CD(t)
– R(t+1) = CD(t) + ∆CD(t)

• Action Training Signal:
– λ(t+1)

• Critic Training Signal:
– λ(t) - dU(t)/dCD(t) + λ(t+1)(I + d∆CD/dCD)
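The factor (I + d∆CD/dCD) in the critic training signal follows directly from the substitutions listed on this slide. A short worked step, with C_D standing for CD and ∆C_D for ∆CD:

```latex
\[
R(t+1) = C_D(t) + \Delta C_D(t)
\quad\Longrightarrow\quad
\frac{\partial R(t+1)}{\partial R(t)}
= \frac{\partial\,\bigl[\,C_D(t) + \Delta C_D(t)\,\bigr]}{\partial C_D(t)}
= I + \frac{\partial\,\Delta C_D(t)}{\partial C_D(t)}
\]
```

The identity term appears because the "plant" here is an accumulator: the next discerned context is the current one plus the CDN's correction.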

Training The CDN To Discern Plant Context

[Block diagram. Blocks: Plant (With Context Parameters CA), Plant Model (With Context Parameters CD), CDN (Context Discerning Network), Critic, unit delay z^-1. Signals: RA(t), u(t), x(t), CD(t), ∆CD(t), λ(t), yA(t), yD(t), RA(t+1), RD(t+1). RA(t+1) and RD(t+1) carry the label "Used To Train Critic"]

• D(t) = yA(t) - yD(t)

• U(t) = (yA(t) - yD(t))^2

• D(t) = RA(t+1) - RD(t+1)

• U(t) = (RA(t+1) - RD(t+1))^2

Context And State Variables For Training CDN Using An Analytic Pole-Cart Plant

• C(t)
– Mass Of Cart Ranged From [3, 5] kg
– Length Of Pole Ranged From [0.5, 3] m

• R(t)
– Position Of Cart Ranged From [-2, 2] m
– Velocity Of Cart Ranged From [-2, 2] m/s
– Angle Of Pole Ranged From [-π/2, π/2] rad
– Angular Velocity Ranged From [-2, 2] rad/s

• u(t)
– Control Ranged From [-5, 5] N
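The ranges above can be collected into a small sketch that draws one training condition. The slide gives only the ranges; the uniform sampling, the dictionary keys, and the helper name `sample` are assumptions made for illustration.

```python
import math
import random

# Ranges taken from the slide; key names are hypothetical.
CONTEXT_RANGES = {
    "cart_mass_kg": (3.0, 5.0),
    "pole_length_m": (0.5, 3.0),
}
STATE_RANGES = {
    "cart_position_m": (-2.0, 2.0),
    "cart_velocity_m_per_s": (-2.0, 2.0),
    "pole_angle_rad": (-math.pi / 2, math.pi / 2),
    "pole_angular_velocity_rad_per_s": (-2.0, 2.0),
}
CONTROL_RANGE_N = (-5.0, 5.0)


def sample(ranges, rng):
    """Draw one value per variable, uniformly within its range (an assumption)."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}


rng = random.Random(0)
context = sample(CONTEXT_RANGES, rng)   # C(t)
state = sample(STATE_RANGES, rng)       # R(t)
control = rng.uniform(*CONTROL_RANGE_N) # u(t)
```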

Results Of Contextual Discernment Testing

[Block diagram: a Contextual Discerner and a Contextually Aware Controller, with signals R(t), C(t), and u(t)]

Designing A Contextually Aware Controller

• Train A Controller Using DHP In The Standard Fashion Except…

• Add The Contextual Variables For The Plant To The State Vector Used In The Training Process

• Vary These Contextual Variables Through The Training Process
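A minimal sketch of the state augmentation described above, assuming the pole-cart context variables (cart mass and pole length) and their ranges from the earlier slide. The DHP training itself is not shown; the function names and the choice to redraw the context once per episode are assumptions made for illustration.

```python
import numpy as np


def augment_state(r, c):
    """Append the context variables C(t) to the plant state R(t)."""
    return np.concatenate([np.asarray(r, dtype=float), np.asarray(c, dtype=float)])


def sample_context(rng):
    """Draw cart mass [3, 5] kg and pole length [0.5, 3] m (uniform draw assumed)."""
    return np.array([rng.uniform(3.0, 5.0), rng.uniform(0.5, 3.0)])


rng = np.random.default_rng(0)
r = np.zeros(4)  # position, velocity, angle, angular velocity

augmented_states = []
for episode in range(3):
    c = sample_context(rng)                       # vary the context during training
    augmented_states.append(augment_state(r, c))  # 6-element vector seen by the controller
```

With this layout the controller and critic see the context as ordinary extra inputs, so the standard training loop needs no structural change.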

[Block diagram: a Contextual Discerner and a Contextually Aware Controller, with signals RA(t), CD(t), u(t), and u(t-1), and two unit delays z^-1]

Future Tasks

• Improve Contextual Discerner For The Pole-Cart System

• Train A Contextually Aware Controller For The Pole-Cart System

• Test The Ability Of The Contextually Discerning Controller In A Simulated “Real-Time” Environment
