nw computational intelligence laboratory 1 designing a contextually discerning controller by lars...

16
1 NW Computational Intelligence NW Computational Intelligence Laboratory Laboratory Designing A Contextually Designing A Contextually Discerning Controller Discerning Controller By Lars Holmstrom

Upload: kerry-jennings

Post on 13-Dec-2015

214 views

Category:

Documents


1 download

TRANSCRIPT

1

NW Computational Intelligence LaboratoryNW Computational Intelligence Laboratory

Designing A ContextuallyDesigning A ContextuallyDiscerning ControllerDiscerning Controller

By Lars Holmstrom

2

NW Computational Intelligence LaboratoryNW Computational Intelligence Laboratory

Main PointsMain Points

• To Describe The Architecture And Training Methodology

• To Discuss What Is Mean By Context And Contextual Discernment

• To Demonstrate This Methodology Using The Pole-Cart System

3

NW Computational Intelligence LaboratoryNW Computational Intelligence Laboratory

Architecture Of A Contextually Discerning ControllerArchitecture Of A Contextually Discerning Controller

ContextuallyAware

Controller

ContextualDiscerner

R(t)

CD(t)

u(t)

4

NW Computational Intelligence LaboratoryNW Computational Intelligence Laboratory

What Is Meant By Context?What Is Meant By Context?

• Conceptual Definition– A Set Of Parameters Capable Of Representing Changes

In The Dynamics Of A Plant

• Mathematical Definition– A Point On A Manifold Of Functions That Can Be

Indexed By A Specific Point In An Associated Continuous Coordinate Space

5

NW Computational Intelligence LaboratoryNW Computational Intelligence Laboratory

The Set Of Sine Functions Specified By CoordinatesThe Set Of Sine Functions Specified By Coordinates

Corresponding To Amplitude, Frequency, And DC-OffsetCorresponding To Amplitude, Frequency, And DC-Offset

0.5*sin(0.7*x)+0.3

6

NW Computational Intelligence LaboratoryNW Computational Intelligence Laboratory

Contextual DiscernmentContextual Discernment

• To Discern The True Coordinates Of A Function That We Know Lies On A Specified Manifold But Has Unknown Manifold Coordinates

CD

CA

Current Discerned Context

Unknown Actual Context

∆CD0CD1

∆CD1

CD2

∆CD2

CD3

CD0

7

NW Computational Intelligence LaboratoryNW Computational Intelligence Laboratory

Contextual Discernment SystemContextual Discernment System

Function OnManifold AtLocation CA

CDN(Context

DiscerningNetwork)

D(t)=yA(t)-yD(t)

x(t)

CD(t)

z-1

+∆CD(t)

Function OnManifold AtLocation CD

yD(t)

yA(t)

8

NW Computational Intelligence LaboratoryNW Computational Intelligence Laboratory

Training The Contextual Discernment SystemTraining The Contextual Discernment System

Function OnManifold AtLocation CA

CDN(Context

DiscerningNetwork)

D(t)=yA(t)-yD(t)

x(t)

CD(t)

z-1

+∆CD(t)

Function OnManifold AtLocation CD

yD(t)

yA(t)

∆CD(t) = u(t)

CD(t) = R(t)

CD(t) + ∆CD(t) = R(t+1)U(t)=(yA(t)-yD(t))2

Critic

CD(t)

x(t)λ(t)

Used To Train Critic

CD(t+1)

“Plant”

9

NW Computational Intelligence LaboratoryNW Computational Intelligence Laboratory

Training The Critic and CDN With DHPTraining The Critic and CDN With DHP• The Approach Is Similar To That For Training A

Controller With DHP, But With The Substitutions:• R(t) = CD(t)

• u(t) = ∆CD(t)

• R(t+1) = CD(t)+∆CD(t)

• Action Training Signal:•λ(t+1)

• Critic Training Signal:•λ(t) - dU(t)/dCD(t) + λ(t+1)(I+d∆CD/dCD)

10

NW Computational Intelligence LaboratoryNW Computational Intelligence Laboratory

RA(t)

Training The CDN To Discern Plant ContextTraining The CDN To Discern Plant Context

Function OnManifold AtLocation CA

Function OnManifold AtLocation CD

CDN(Context

DiscerningNetwork)

D(t)=yA(t)-yD(t)x(t)

CD(t)

z-1

+∆CD(t)

yD(t)

yA(t)

x(t)

U(t)=(yA(t)-yD(t))2

Critic

CD(t)

λ(t)

Plant(With Context

Parameters CA)

Plant Model(With Context

Parameters CD)

u(t)

RA(t)D(t)=RA(t+1)-RD(t+1)

U(t)=(RA(t+1)-RD(t+1))2

RD(t+1)

RA(t+1)Used To Train Critic

11

NW Computational Intelligence LaboratoryNW Computational Intelligence Laboratory

Context And State Variables For TrainingContext And State Variables For TrainingCDN Using An Analytic Pole-Cart PlantCDN Using An Analytic Pole-Cart Plant

• C(t)– Mass Of Cart Ranged From [3 5]kg– Length Of Pole Ranged From [0.5 3]m

• R(t)– Position Of Cart Ranged From [-2 2]m– Velocity Of Cart Ranged From [-2 2]m/s– Angle Of Pole Ranged From [-π/2 π/2]rads– Angular Velocity Ranged From [-2 2]rads/s

• u(t)– Control Ranged From [-5 5]Newtons

12

NW Computational Intelligence LaboratoryNW Computational Intelligence Laboratory

ResultsResults Of Contextual Discernment TestingOf Contextual Discernment Testing

13

NW Computational Intelligence LaboratoryNW Computational Intelligence Laboratory

Architecture Of A Contextually Discerning ControllerArchitecture Of A Contextually Discerning Controller

ContextualDiscerner

ContextuallyAware

ControllerR(t)

C(t)

u(t)

14

NW Computational Intelligence LaboratoryNW Computational Intelligence Laboratory

Designing A Contextually Aware ControllerDesigning A Contextually Aware Controller

• Train A Controller Using DHP In The Standard Fashion Except…

• Add The Contextual Variables For The Plant To The State Vector Used In The Training Process

• Vary These Contextual Variables Through The Training Process

15

NW Computational Intelligence LaboratoryNW Computational Intelligence Laboratory

Architecture Of A Contextually Discerning ControllerArchitecture Of A Contextually Discerning Controller

ContextualDiscerner

ContextuallyAware

ControllerRA(t)

CD(t)

u(t)

z-1

z-1u(t-1)

16

NW Computational Intelligence LaboratoryNW Computational Intelligence Laboratory

Future TasksFuture Tasks

• Improve Contextual Discerner For The Pole-Cart System

• Train A Contextually Aware Controller For The Pole-Cart System

• Test The Ability Of The Contextually Discerning Controller In A Simulated “Real-Time” Environment