An Architecture for Empathic Agents

Upload: amberlynn-tyler

Post on 29-Dec-2015

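Abstract Architecture

[Figure: abstract architecture of an agent in the world. Components shown inside the Agent Mind: Sensors and Filters; a World and Agent model (a Model of self with Emotions, plus a Model of others, with "memories"); Appraisal, with Concerns and Reactions over objects, agents, actions and properties, producing Action tendencies and Emotional Signals; Planning + Coping, producing Deliberated Actions; and Effectors (Body, Speech, Facial expressions).]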

Planning and Coping Module

The Planning and Coping module incorporates the action-selection mechanism of the agent

Conventional approaches require the programmer to anticipate every possible context and state and tune the mechanism to produce the right action

To overcome the problem complexity, we can adopt a learning approach

Planning and Coping Module

Hybrid approaches make use of both neural network and symbolic structures to learn sensory-motor correlations and abstract concepts through experience

We propose one way to deal with action sequencing, viewed as a type of motor reasoning, in a fully neural architecture

Basic Mechanisms - Bistables

[Figure, repeated over four slides: a bistable unit with inputs A and B and two states, on and off, with plots of the activity of A, B and the bistable over time t.]
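A minimal sketch of a bistable unit may help fix the idea. The slides do not say which input switches the unit on and which switches it off, so the mapping below (A sets, B resets) and the names `Bistable` and `run` are assumptions made only for illustration.

```python
class Bistable:
    """Two-state unit that keeps its state until an input changes it.

    Assumption: input A switches the unit on and input B switches it off.
    """

    def __init__(self):
        self.on = False

    def step(self, a: bool, b: bool) -> bool:
        if a and not b:
            self.on = True
        elif b and not a:
            self.on = False
        # With no input (or both inputs at once) the state is kept.
        return self.on


def run(inputs):
    """Feed a sequence of (a, b) pairs to a fresh unit and record its state."""
    unit = Bistable()
    return [unit.step(a, b) for a, b in inputs]
```

The point of the sketch is the memory: a brief pulse on one input changes the state, and the state persists after the pulse ends.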

[Figure sequence: nodes 1F, 11F, 12F, 121F, 122F, 13F and 2F, drawn as a hierarchy and mapped onto the Frontal Cortex and the Associative Cortex. Successive slides build a "Stack" by adding 1F, 11F, 12F, 121F and 122F in turn, followed by "Pop" steps and the addition of 13F.]

Schemas

Schemas are functional units (intermediate between overall behavior and neural function) for analysis of cooperative competition in the brain

A perceptual schema embodies the process whereby the system determines whether a given domain of interaction is present in the environment.

Current plans are made up of motor schemas.

Simplified Architecture

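[Figure: simplified architecture. The WORLD exchanges perceptions and actions with the Planning and Coping Network, which receives External Perceptions from the world and Internal Perceptions from the Motivational System.]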

Neural Architecture

The planning network is composed of nodes and links:

Environmental states are expressed through the activation of state nodes

The agent's needs are reflected in the activation of drive nodes

The agent's actions are determined by the activation of action nodes

Links are one-way communication channels between nodes
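The node and link vocabulary above can be sketched as a small data structure. This is only an illustration: the class names, the weighted links and the `incoming` helper are assumptions, not part of the presented architecture.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeKind(Enum):
    STATE = "state"    # environmental states
    DRIVE = "drive"    # the agent's needs
    ACTION = "action"  # the agent's actions


@dataclass
class Node:
    name: str
    kind: NodeKind
    activation: float = 0.0


@dataclass
class Network:
    nodes: dict = field(default_factory=dict)
    links: dict = field(default_factory=dict)  # (source, target) -> weight

    def add_node(self, name: str, kind: NodeKind) -> Node:
        self.nodes[name] = Node(name, kind)
        return self.nodes[name]

    def add_link(self, source: str, target: str, weight: float = 1.0) -> None:
        # Links are one-way: only source -> target is recorded.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both ends of a link must be existing nodes")
        self.links[(source, target)] = weight

    def incoming(self, target: str) -> float:
        """Weighted sum of the activations arriving at `target` (assumed rule)."""
        return sum(self.nodes[s].activation * w
                   for (s, t), w in self.links.items() if t == target)
```

Because links are stored as ordered pairs, a link from a state node to an action node contributes nothing in the opposite direction.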

Splitting mechanism

[Figure: nodes A, B, C, F and H shown in three successive stages of the splitting mechanism, for the sequences A→B→F and C→B→H.]

Internal State

Internal perceptions are defined by a set of internal variables that evolve in time.

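A general internal variable $v_i$ lies in the range $[v^{\min}, v^{\max}]$ and evolves in time according to a piecewise equation for $dv_i/dt$.

[Equation: $dv_i/dt$ is given by two conditional cases, tied to the bounds $v^{\max}$ and $v^{\min}$, and is 0 otherwise. The expression contains terms indexed over $i$ and $j$; its coefficients did not survive in the transcript.]

One term of the equation is a variation of $v_i$ produced by external causes.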
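As a rough illustration of a bounded internal variable, the sketch below takes one Euler step and keeps the result inside the allowed range. It is not the equation from the slide: the function name, the split into an intrinsic `rate` and an `external` variation, and the clamping rule are all assumptions.

```python
def step_variable(v: float, rate: float, external: float, dt: float,
                  v_min: float, v_max: float) -> float:
    """One Euler step of an internal variable kept inside [v_min, v_max].

    `rate` is an assumed intrinsic rate of change and `external` an assumed
    variation caused by external events.
    """
    v_next = v + (rate + external) * dt
    # Once a bound is reached, the variable stops changing in that direction.
    return min(v_max, max(v_min, v_next))
```

For example, a variable at 0.9 pushed upward by a strong external variation stays at the upper bound instead of overshooting it.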

Drives

Internal variables are homeostatic variables.

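Each internal variable $v_i$ has a comfort zone, bounded by the thresholds $th_i^{\min}$ and $th_i^{\max}$, and two drives associated with it, whose activity measures the need for an increase or a decrease.

Drive for increase: $d_i^{excit}$, for $v_i$ below $th_i^{\min}$

Drive for decrease: $d_i^{inib}$, for $v_i$ above $th_i^{\max}$

[Equations: each drive is an exponential expression in the distance between $v_i$ and the corresponding threshold ($th_i^{\min} - v_i$ for $d_i^{excit}$, $v_i - th_i^{\max}$ for $d_i^{inib}$), scaled by a constant $k$. The full expressions did not survive in the transcript.]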
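The comfort-zone idea can be sketched as follows. The exact drive formulas are not recoverable here, so the exponential growth via `math.expm1`, the single gain `k` and the function name `drives` are assumptions chosen only to show the shape of the behaviour: no drive inside the comfort zone, and a drive that grows with the distance from the violated threshold.

```python
import math


def drives(v: float, th_min: float, th_max: float, k: float = 1.0):
    """Return (d_excit, d_inib) for one homeostatic variable.

    d_excit is the need to increase v (active below th_min);
    d_inib is the need to decrease v (active above th_max).
    The exponential form is an assumption, not the original formula.
    """
    d_excit = k * math.expm1(th_min - v) if v < th_min else 0.0
    d_inib = k * math.expm1(v - th_max) if v > th_max else 0.0
    return d_excit, d_inib
```

Using `expm1` makes each drive start at zero exactly at its threshold, so there is no jump when the variable leaves the comfort zone.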

Global Drive and Reward

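All the base nodes receive a global drive signal whose intensity is a function of the most pressing need:

$$d = \max_{1 \le i \le n} \left( d_i^{inib},\; d_i^{excit} \right)$$

All the base nodes also receive a global reward signal corresponding to the satisfaction of one of the needs:

$$r = \max_{1 \le i \le n} \left( f_i^{inib}(t),\; f_i^{excit}(t),\; 0 \right)$$

$$f_i^{j}(t) = \begin{cases} \dfrac{d_i^{j}(t-1) - d_i^{j}(t)}{d_i^{j}(t-1)}, & \text{if } d_i^{j}(t-1) \neq 0 \\[2ex] 0, & \text{if } d_i^{j}(t-1) = 0 \end{cases} \qquad j \in \{inib, excit\}$$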
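A small sketch of the two global signals, under stated assumptions: the global drive is taken as the largest individual drive, and the global reward as the largest relative decrease of any drive between two consecutive time steps, floored at zero. The function names and the list-of-pairs layout are hypothetical.

```python
def global_drive(drive_pairs):
    """Most pressing need: the largest drive over all variables.

    `drive_pairs` is a list of (d_inib, d_excit) tuples, one per variable.
    """
    return max(max(pair) for pair in drive_pairs)


def relative_drop(prev: float, curr: float) -> float:
    """Relative decrease of one drive between t-1 and t (0 if it was inactive)."""
    return (prev - curr) / prev if prev != 0 else 0.0


def global_reward(prev_pairs, curr_pairs) -> float:
    """Largest relative drop of any drive, never below zero (assumed reading)."""
    best = 0.0
    for prev, curr in zip(prev_pairs, curr_pairs):
        for p, c in zip(prev, curr):
            best = max(best, relative_drop(p, c))
    return best
```

With this reading, a drive that falls from 1.0 to 0.5 yields a reward of 0.5, while a rising drive yields no reward at all.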

Learning - Detection of Goals

A base action node learns the correlation between the success of the corresponding command and the global reward

When such correlation is strong enough, the node splits, producing a specialized node that plays the role of a goal

Action success + reward = GOAL
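Goal detection can be sketched as a base action node that tracks how well its successes line up with reward and splits once the link is strong enough. The slides do not give the correlation measure, so the running average of the reward seen on successful executions, the learning `rate`, the `threshold` and the class names are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class GoalNode:
    """Specialized node created by a split; it plays the role of a goal."""
    parent: "BaseActionNode"


class BaseActionNode:
    def __init__(self, name: str, rate: float = 0.5, threshold: float = 0.8):
        self.name = name
        self.rate = rate
        self.threshold = threshold
        self.correlation = 0.0  # assumed proxy: average reward on success
        self.goal = None

    def observe(self, success: bool, reward: float):
        """Update the estimate after one command and split if it is strong enough."""
        if success:
            self.correlation += self.rate * (reward - self.correlation)
        if self.goal is None and self.correlation >= self.threshold:
            self.goal = GoalNode(parent=self)
        return self.goal
```

With the default settings, three rewarded successes in a row raise the estimate to 0.875 and trigger the split; failed commands leave it unchanged.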

Context Learning

Specialized nodes compute the context of certain events

The context is defined by the ensemble configuration of the activities of the group of nodes linked to one node

The context is learned on the occurrence of certain events

Event good is an event we wish to be able to predict

Event bad corresponds to the situation where the event good does not occur, contrary to what was expected

The events good and bad are associated with two distinct weights that are used to compute the context value

Base Nodes Functioning

Excitation of a node codes the detection of a perceptive event

Action nodes can also have a call activity that will trigger the associated action

The call activity starts at random unless it is regulated by the specialized nodes, and only a limited number of action nodes can call spontaneously at the same time

The call activity is maintained during a random period of time

Specialized Nodes Functioning

Specialized nodes are used primarily to organize the calls sent to the motor module

These nodes can be seen as schemas that can be chained in a plan

The bistable activity of specialized nodes implements a stack mechanism

The specialized nodes make use of local notions of drive and reward

The local drive corresponds to the necessity of using the associated command

The local reward corresponds to the successful execution of that command

Specialized Nodes Functioning

A Competition mechanism determines the nodes that can perform a call

The behaviour of a specialized node depends on its bistable activity

If the node is off, the call is transmitted to the parent base node, where it triggers action execution

If the node is on, the local drive is stored and transmitted to the node's subgoals

The on state is triggered by the event bad

The off transition happens in two situations: after a given period of time, or when the right context can be obtained
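The on/off behaviour described on this slide can be sketched as a small state machine. The tick-based timeout, the `context_ok` flag and the `("parent", drive)` return convention are assumptions made for illustration; competition between nodes is left out.

```python
class SpecializedNode:
    def __init__(self, name: str, timeout: int, subgoals=()):
        self.name = name
        self.timeout = timeout          # assumed: number of ticks before giving up
        self.subgoals = list(subgoals)
        self.on = False
        self.stored_drive = 0.0
        self.elapsed = 0

    def event_bad(self) -> None:
        """The call did not lead to reward: switch on and defer to subgoals."""
        self.on = True
        self.elapsed = 0

    def step(self, local_drive: float, context_ok: bool):
        """Return the (target, drive) calls issued during this tick."""
        if self.on:
            self.elapsed += 1
            # Switch off after the timeout or once the right context is available.
            if context_ok or self.elapsed >= self.timeout:
                self.on = False
        if not self.on:
            # Off: the call goes to the parent base node, which executes the action.
            return [("parent", local_drive)]
        # On: the local drive is stored and passed down to the subgoals.
        self.stored_drive = local_drive
        return [(sub, local_drive) for sub in self.subgoals]
```

A node that stays on while its subgoals run, then switches off and calls its own command, behaves like an entry pushed onto and later popped from a stack.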

Specialized Nodes Functioning

Context learning is triggered when the events good or bad occur

The event good corresponds to the occurrence of reward (successful execution of the action) in response to a call that was not stored

The event bad corresponds to the case where the node's call fails to lead to reward

Specialized nodes also learn the time needed to execute the associated command

The occurrence of the event good means that this time can be reduced

The occurrence of the event bad when the context is favourable means that this time should increase
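The timing rule can be sketched as a simple multiplicative update. The slides state only the direction of each adjustment, so the step size, the lower bound and the function name are assumptions.

```python
def update_expected_time(expected: float, event: str, context_favourable: bool,
                         step: float = 0.25, minimum: float = 1.0) -> float:
    """Adjust the learned execution time of a command after an event.

    good                     -> the time can be reduced
    bad, favourable context  -> the time should increase
    anything else            -> unchanged
    """
    if event == "good":
        return max(minimum, expected * (1.0 - step))
    if event == "bad" and context_favourable:
        return expected * (1.0 + step)
    return expected
```

A bad event in an unfavourable context leaves the estimate alone, since the failure is then attributed to the context rather than to an overly short waiting time.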

Contexts

There are two types of context

Excitation context evaluates the excitation of context nodes

Call context is computed only by action nodes and evaluates the call activity of the other action nodes

Excitation context leads to the reinforcement of the weights corresponding to base nodes whose detection activity is predictive of the success of the command generated by the specialized node

When a base node is strongly implicated in the context, it splits to create a subgoal of the specialized node

Contention Scheduling

The local drive determines the dominant schemas and its computation favours the nodes whose excitation context is more coherent with the current activity

The reward obtained by the subgoals during plan execution is propagated upwards in the hierarchy, in the direction of the goal

A dominant specialized node inhibits the base nodes whose commands are incompatible with its own command according to the learned call context

Conclusions

Several aspects of the simplified architecture have been tested successfully in a text-like world:

Goal creation

Creation and chaining of subgoals

Inhibition mechanism

Conclusions

The integration of the agents in a 3D world has raised some technical problems

Goal creation was tested; the other aspects need more work

We have to design the internal state of the agents carefully and do some bootstrapping if we want the characters to exhibit the right behaviour

Future Work

Representation of states, both external and internal (abstract space)

Consider complex and high-level actions, like producing speech

External representation of the agent’s state (body and facial expressions, speech)