two critical issuesmotion.me.ucsb.edu/talks/2011w-taskrelease+surveillance-2x2.pdf · vaibhav...

Adaptive Information Management Strategiesin Mixed Human-Robot Teams

Vaibhav Srivastava

Center for Control, Dynamical Systems & Computation

University of California at Santa Barbara

http://motion.me.ucsb.edu/∼vaibhav

January 19, 2012

Advisor: Francesco BulloCoAuthors: R. Carli, C. Langbort, K. Plarre

Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 1 / 23

Big Picture: Human-robot decision dynamics

Uncertain environment surveyed by human-UAV team(Courtesy: Prof. Kristi Morgansen)

UCSB Camera Network

A Surveillance Operator (Courtesy: http://www.modsim.org/)

Data Center Operator

– How to handle information overload?

– What are optimal information management strategies?Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 2 / 23

Two Critical Issues

Photo courtesy: The Wall Street Journal

Optimal information aggregation

Which source to observe?

Efficient search and detection

Routing for evidence collection

Optimal information processing

Optimal time allocation?

Optimal streaming rate?

Optimal number of operators?

Decision support system to optimize human-robot team objective?


Problem Setup

Support System (SS) collects information from the sensor network

Sensor Network ≡ regions surveyed,cold storages, different buildings, etc

SS streams collected information to the human operator

SS specifies the time, the operator should spend of each feed

Based on the operator’s decisions, the SS collects informationfrom the most pertinent source


Operator Models I

General vehicle team and human operator interaction modelNehme et al ’08


Operator Models II

DDM for Human Decision MakingBogacz et al ’06

Degradation of detection probabilityBertuccelli et al ’10

Evolution of probability of detectionPew ’68

Yerkes Dodson effectYerkes-Dodson 1908


Literature Review I

Task Release Control

The service time on a task isa function of utilization ratio (UR)

Yerkes-Dodson(Y-D) law determinesthe expected service time

Task release controller releases atask if UR is below a threshold

The maximally stabilizing arrival ratedepends on Y-D law and UR dynamics

Limitation: Does not incorporateerror rate in the policies

Task Release Setup

Task Release Controller

Max Arrival RateSavla et al ’11


Literature Review II

Resource allocation for human operator

Problem: How to optimally allocate operator attention to a batch oftasks or to an incoming stream of tasks

– Static queue: serve N tasks in time T

– Dynamic queue: tasks arrive continuously at some known rate

– Optimal design of queue: What is an optimal arrival rate

Srivastava et al ’11Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 8 / 23

Literature Review III

Time Constrained Static Queue

maximize f1(t1) + · · ·+ fN(tN)

subject to t1 + · · ·+ tN = T

t` ≥ 0, ` ∈ {1, . . . ,N} Optimal Allocations

Dynamic Queue with Latency Penalty

maxt1,t2,t3...

limL→∞

1

L

L∑

`=1

(fγ`(t`)− c̄E[n`]t` −

c̄λt2`

2

)

where expected queue length

E[n`] = n1−`+1+λ−̀1∑

j=1

tj

Optimal Allocations

Expected Queue Length

Limitation: Does not incorporate SA models in policies


Literature Review III

Time Constrained Static Queue

maximize f1(t1) + · · ·+ fN(tN)

subject to t1 + · · ·+ tN = T

t` ≥ 0, ` ∈ {1, . . . ,N} Optimal Allocations

Dynamic Queue with Latency Penalty

maxt1,t2,t3...

limL→∞

1

L

L∑

`=1

(fγ`(t`)− c̄E[n`]t` −

c̄λt2`

2

)

where expected queue length

E[n`] = n1−`+1+λ−̀1∑

j=1

tj

Optimal Allocations

Expected Queue Length

Limitation: Does not incorporate SA models in policiesVaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 9 / 23

Time Constrained Static Queue with Situational Awareness

Objective:

Serve N decision making tasks in time T

Maximize expected number of correctdecisions

Keep utilization ratio optimal

Each processed task should be allocatedmore time than the natural allocation

Probability of Detection

Yerkes Dodson effect


Dynamic Programming Formulation

Stage Cost

g` = z`(w`f`(t`) + β(t` − S(x`))), ` ∈ {1, . . . ,N},

System Dynamics

Allocation: a`+1 = a` + t` + r`, a1 = 0, a` ∈ [0,T ]

Utilization: x`+1 = (1− e−t`z`/τ + x(`)e−t`z`/τ )e−r`z`/τ , x` ∈ [xmin, xmax]

w` = weight, t` = allocation, r` = rest timeS = Y-D curve τ = operator sensitivity β = cost,f` = performance func. z` = process / don’t process

Optimal Solution


Dynamic Programming Formulation

Stage Cost

g` = z`(w`f`(t`) + β(t` − S(x`))), ` ∈ {1, . . . ,N},

System Dynamics

Allocation: a`+1 = a` + t` + r`, a1 = 0, a` ∈ [0,T ]

Utilization: x`+1 = (1− e−t`z`/τ + x(`)e−t`z`/τ )e−r`z`/τ , x` ∈ [xmin, xmax]

w` = weight, t` = allocation, r` = rest timeS = Y-D curve τ = operator sensitivity β = cost,f` = performance func. z` = process / don’t process

Optimal SolutionVaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 11 / 23

Dynamic Queue with Penalty and Situational Awareness

Tasks arrive as a Poisson process with rate λ

Tasks sampled from a distribution p : Γ→ R≥0

Unit reward for each correct decision

Latency penalty per unit-time cγ , for task γ ∈ Γ, and c̄ = Ep[cγ ]

Stage Cost

maxt1,t2,t3...

limL→∞

1

L

L∑

`=1

z`

(w`fγ`(t`)− c̄E[n`]t` −

c̄λt2`

2+ β(t` − S(x`))

)

Certainty Equivalent Solution







Stage Cost

maxt1,t2,t3...

limL→∞

1

L

L∑

`=1

z`

(w`fγ`(t`)− c̄E[n`]t` −

c̄λt2`

2+ β(t` − S(x`))

)








Stage Cost

maxt1,t2,t3...

limL→∞

1

L

L∑

`=1

z`

(w`fγ`(t`)− c̄E[n`]t` −

c̄λt2`

2+ β(t` − S(x`))

)



Illustrative Example I

Optimal Allocations and Rest Time

Receding Horizon Policy


Illustrative Example II

Reward versus Arrival Rate

Benefit per unit task Benefit rate

Optimal arrival rate

Switching occurs when operator is expected to be always non-idle

Designer may pick desired accuracy on each task to design arrival rate


Illustrative Example III

Handling Mandatory Tasks

No Mandatory Tasks Mandatory Tasks Present


Problem Setup

Support System (SS) collects information from the sensor network

Sensor Network ≡ regions surveyed,cold storages, different buildings, etc

SS streams collected information to the human operator

SS specifies the time, the operator should spend of each feed

Based on the operator’s decisions, the SS collects informationfrom the most pertinent source


Quickest Spatial Detection

N region to be surveyed

any number of anomalous regions

an ensemble of CUSUM algorithms

collection+transmission+processingtime at region ` is T` > 0

distance between region i and j : dij UCSB Campus Map

Spatial Quickest Detection (Srivastava & Bullo ’11)

1 at iteration τ , pick a region ` from stationary distribution q

2 go to region ` and collect evidence yτ3 update CUSUM statistic for region `

Λ` = (Λ`−1 + log(f 1` (yτ )/f 0

` (yτ ))+

4 declare an anomaly at region ` if Λ` > η


Quickest Spatial Detection

N region to be surveyed

any number of anomalous regions

an ensemble of CUSUM algorithms

collection+transmission+processingtime at region ` is T` > 0

distance between region i and j : dij UCSB Campus Map

Spatial Quickest Detection (Srivastava & Bullo ’11)

1 at iteration τ , pick a region ` from stationary distribution q

2 go to region ` and collect evidence yτ3 update CUSUM statistic for region `

Λ` = (Λ`−1 + log(f 1` (yτ )/f 0

` (yτ ))+

4 declare an anomaly at region ` if Λ` > η


Spatial Quickest Detection: Detection Delay

Expected detection delay at region `

E[T `d ] =

e−η + η − 1

q`D(f 1` , f

0` )

(q · T + q · Dq)

Two stage quickest detection strategy

1 pick optimal q∗ = argmin∑N

`=1 π1`E[T `

d ]

2 adapt q∗ with the evidence collected at each stage

Region Selection Probability Likelihood of Anamoly




E[T `d ] =

e−η + η − 1

q`D(f 1` , f

0` )

(q · T + q · Dq)



`=1 π1`E[T `

d ]






E[T `d ] =

e−η + η − 1

q`D(f 1` , f

0` )

(q · T + q · Dq)



`=1 π1`E[T `

d ]




Spatial Quickest Detection with Human Input

human operator allocates time t to an evidenceand decides on presence/absence of anomaly

probability of correct decision at region ` evolves as sigmoid function

{f 1` (t), if an anomaly is present,

f 0` (t), if no anomaly is present.

support system runs spatial quickest detection algorithmwith the decisions of the operator

Critical Issue:– human decisions are not i.i.d.– detection delay expressions can not be used


Expected Detection Delay: Heuristic Approximation

For a drift diffusion model

Expected decision time = threshold/D(f 1, f 0) = t inf

Expected delay minimization

minimizeq∈∆N−1

N∑

`=1

π1` t

inf`

qs(q · T + q · Dq)

detection delay proportional to

likelihood of anomaly

difficulty of task

inverse of region selection probability

processing time and average distance of the region from other regions


Simultaneous Information Aggregation and Processing I

At each iteration

the SS determines the optimal region selection policy

the region selection policy determinesthe distribution of incoming tasks

the performance on an incoming taskfrom region ` is π1

` f1` (t) + (1− π1

` )f 0` (t)

the SS determines the optimal allocation to each task,based on current reward and penalty


Simultaneous Information Aggregation and Processing II

Optimal Policies


Conclusions & Future Directions

Conclusions

novel simultaneous information aggregation and processing framework

incorporation of situational awareness models

incorporation of human decisions in sensor management strategies

an adaptive strategy that collects evidence from regions with highlikelihood of anomalies and optimally processes it

Future Directions

re-queuing of tasks and preemptive queues

validation with experiments

dynamic anomalies


Conclusions & Future Directions

Conclusions

novel simultaneous information aggregation and processing framework

incorporation of situational awareness models

incorporation of human decisions in sensor management strategies

an adaptive strategy that collects evidence from regions with highlikelihood of anomalies and optimally processes it

Future Directions

re-queuing of tasks and preemptive queues

validation with experiments

dynamic anomalies


two critical issuesmotion.me.ucsb.edu/talks/2011w-taskrelease+surveillance-2x2.pdf · vaibhav...

Documents