
Adaptive Information Management Strategies in Mixed Human-Robot Teams

Vaibhav Srivastava

Center for Control, Dynamical Systems & Computation

University of California at Santa Barbara

http://motion.me.ucsb.edu/∼vaibhav

January 19, 2012

Advisor: Francesco Bullo

Coauthors: R. Carli, C. Langbort, K. Plarre

Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 1 / 23

Big Picture: Human-robot decision dynamics

Uncertain environment surveyed by human-UAV team (Courtesy: Prof. Kristi Morgansen)

UCSB Camera Network

A Surveillance Operator (Courtesy: http://www.modsim.org/)

Data Center Operator

– How to handle information overload?

– What are optimal information management strategies?

Two Critical Issues

Photo courtesy: The Wall Street Journal

Optimal information aggregation

Which source to observe?

Efficient search and detection

Routing for evidence collection

Optimal information processing

Optimal time allocation?

Optimal streaming rate?

Optimal number of operators?

Decision support system to optimize human-robot team objective?


Problem Setup

Support System (SS) collects information from the sensor network

Sensor Network ≡ regions surveyed, cold storages, different buildings, etc.

SS streams collected information to the human operator

SS specifies the time the operator should spend on each feed

Based on the operator’s decisions, the SS collects information from the most pertinent source


Operator Models I

General vehicle team and human operator interaction model (Nehme et al ’08)


Operator Models II

DDM for Human Decision Making (Bogacz et al ’06)

Degradation of detection probability (Bertuccelli et al ’10)

Evolution of probability of detection (Pew ’68)

Yerkes-Dodson effect (Yerkes-Dodson 1908)


Literature Review I

Task Release Control

The service time on a task is a function of utilization ratio (UR)

Yerkes-Dodson (Y-D) law determines the expected service time

Task release controller releases a task if UR is below a threshold

The maximally stabilizing arrival rate depends on Y-D law and UR dynamics

Limitation: Does not incorporate error rate in the policies

Task Release Setup

Task Release Controller

Max Arrival Rate (Savla et al ’11)
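The threshold rule described on this slide can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the controller of Savla et al: the U-shaped `service_time` curve, the first-order utilization dynamics with time constant `tau`, the step size `dt`, and the assumption that a task is always waiting are all hypothetical choices made for illustration.

```python
import math


def service_time(x):
    """Hypothetical U-shaped Yerkes-Dodson curve: fastest service near x = 0.5."""
    return 1.0 + 4.0 * (x - 0.5) ** 2


def simulate(threshold, tau=5.0, horizon=100.0, dt=0.1, x0=0.0):
    """Release a task whenever the operator is idle and UR is below the threshold.

    Assumes a task is always waiting. Utilization rises toward 1 while the
    operator is busy and decays toward 0 while idle (assumed first-order model).
    Returns the number of released tasks and the utilization trace.
    """
    x = x0
    busy_until = 0.0
    released = 0
    trace = []
    decay = math.exp(-dt / tau)
    for k in range(int(round(horizon / dt))):
        now = k * dt
        busy = now < busy_until
        if not busy and x < threshold:
            # Threshold rule: release only when utilization is low enough.
            busy_until = now + service_time(x)
            released += 1
            busy = True
        x = 1.0 - (1.0 - x) * decay if busy else x * decay
        trace.append(x)
    return released, trace
```

Sweeping `threshold` and comparing the number of released tasks gives a rough feel for how the release rule interacts with the assumed service-time curve.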


Literature Review II

Resource allocation for human operator

Problem: How to optimally allocate operator attention to a batch of tasks or to an incoming stream of tasks

– Static queue: serve N tasks in time T

– Dynamic queue: tasks arrive continuously at some known rate

– Optimal design of queue: What is an optimal arrival rate

Srivastava et al ’11

Literature Review III

Time Constrained Static Queue

maximize $f_1(t_1) + \cdots + f_N(t_N)$

subject to $t_1 + \cdots + t_N = T$,

$t_\ell \ge 0$, $\ell \in \{1, \dots, N\}$

Optimal Allocations
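A minimal sketch of the static-queue problem above: it discretizes the horizon T into a grid and solves the allocation by dynamic programming over tasks. The sigmoid performance function, its parameters, and the grid resolution `steps` are assumptions for illustration; the talk does not specify how the optimal allocations were computed.

```python
import math


def sigmoid(t, rate=1.0, midpoint=5.0):
    """Hypothetical performance function: P(correct decision) after time t."""
    return 1.0 / (1.0 + math.exp(-rate * (t - midpoint)))


def allocate(perf_funcs, total_time, steps=100):
    """Maximize sum f_l(t_l) subject to sum t_l = T, t_l >= 0, on a time grid."""
    dt = total_time / steps
    neg_inf = float("-inf")
    # best[u] = best reward of the tasks seen so far using exactly u grid steps
    best = [neg_inf] * (steps + 1)
    best[0] = 0.0
    choices = []
    for f in perf_funcs:
        reward = [f(k * dt) for k in range(steps + 1)]
        new_best = [neg_inf] * (steps + 1)
        arg = [0] * (steps + 1)
        for used in range(steps + 1):
            for k in range(used + 1):
                prev = best[used - k]
                if prev == neg_inf:
                    continue
                val = prev + reward[k]
                if val > new_best[used]:
                    new_best[used] = val
                    arg[used] = k
        best = new_best
        choices.append(arg)
    # Backtrack from "all steps used" to enforce the equality constraint.
    used = steps
    alloc = [0.0] * len(perf_funcs)
    for i in range(len(perf_funcs) - 1, -1, -1):
        k = choices[i][used]
        alloc[i] = k * dt
        used -= k
    return alloc, best[steps]


if __name__ == "__main__":
    funcs = [lambda t, c=c: sigmoid(t, midpoint=c) for c in (2.0, 3.0, 4.0, 8.0)]
    print(allocate(funcs, total_time=10.0))
```

Because sigmoid rewards are not concave, a grid search of this kind avoids relying on first-order conditions alone; the price is a resolution limited by `steps`.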

Dynamic Queue with Latency Penalty

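$$\max_{t_1, t_2, t_3, \dots} \; \lim_{L \to \infty} \frac{1}{L} \sum_{\ell=1}^{L} \left( f_{\gamma_\ell}(t_\ell) - \bar{c}\, E[n_\ell]\, t_\ell - \frac{\bar{c} \lambda t_\ell^2}{2} \right)$$

where expected queue length

$$E[n_\ell] = n_1 - \ell + 1 + \lambda \sum_{j=1}^{\ell-1} t_j$$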

Optimal Allocations

Expected Queue Length

Limitation: Does not incorporate SA models in policies
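To make the dynamic-queue objective concrete, the sketch below evaluates the expected queue length and the finite-horizon average of the bracketed reward for a given allocation sequence. The function names and the use of a single performance function `perf` for every task are illustrative assumptions; the slide takes a limit as L grows, whereas this code simply averages over the allocations supplied.

```python
def expected_queue_lengths(n1, lam, allocations):
    """E[n_l] = n_1 - l + 1 + lam * (t_1 + ... + t_{l-1}) for l = 1, 2, ..."""
    lengths = []
    elapsed = 0.0
    for l, t in enumerate(allocations, start=1):
        lengths.append(n1 - l + 1 + lam * elapsed)
        elapsed += t
    return lengths


def average_reward(perf, allocations, n1, lam, c_bar):
    """Average of f(t_l) - c_bar * E[n_l] * t_l - c_bar * lam * t_l**2 / 2."""
    lengths = expected_queue_lengths(n1, lam, allocations)
    total = 0.0
    for t, n in zip(allocations, lengths):
        total += perf(t) - c_bar * n * t - c_bar * lam * t * t / 2.0
    return total / len(allocations)
```

For example, with `n1 = 5`, `lam = 0.5` and allocations of 2 time units each, the queue length stays at 5 because one task leaves and one task is expected to arrive per service period.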


Time Constrained Static Queue with Situational Awareness

Objective:

Serve N decision making tasks in time T

Maximize expected number of correct decisions

Keep utilization ratio optimal

Each processed task should be allocated more time than the natural allocation

Probability of Detection

Yerkes Dodson effect


Dynamic Programming Formulation

Stage Cost

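$g_\ell = z_\ell \big( w_\ell f_\ell(t_\ell) + \beta (t_\ell - S(x_\ell)) \big)$, $\ell \in \{1, \dots, N\}$

System Dynamics

Allocation: $a_{\ell+1} = a_\ell + t_\ell + r_\ell$, $a_1 = 0$, $a_\ell \in [0, T]$

Utilization: $x_{\ell+1} = \big(1 - e^{-t_\ell z_\ell/\tau} + x_\ell e^{-t_\ell z_\ell/\tau}\big) e^{-r_\ell z_\ell/\tau}$, $x_\ell \in [x_{\min}, x_{\max}]$

$w_\ell$ = weight, $t_\ell$ = allocation, $r_\ell$ = rest time, $S$ = Y-D curve, $\tau$ = operator sensitivity, $\beta$ = cost, $f_\ell$ = performance function, $z_\ell$ = process / don’t process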

Optimal Solution
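The stage cost and the two state updates above translate directly into code. The sketch below transcribes them as written on the slide; the function names are hypothetical, and the performance function `f` and Y-D curve `S` are passed in as callables because the talk does not give their closed forms.

```python
import math


def next_allocation_time(a, t, r):
    """Elapsed-time state: a_{l+1} = a_l + t_l + r_l."""
    return a + t + r


def next_utilization(x, t, r, z, tau):
    """Utilization after working t and resting r on a task with flag z in {0, 1}."""
    work = math.exp(-t * z / tau)
    rest = math.exp(-r * z / tau)
    return (1.0 - work + x * work) * rest


def stage_cost(z, w, f, t, beta, S, x):
    """g_l = z_l * (w_l * f_l(t_l) + beta * (t_l - S(x_l)))."""
    return z * (w * f(t) + beta * (t - S(x)))
```

Working on a task pushes the utilization toward 1, resting pulls it back toward 0, and a skipped task (`z = 0`) leaves both the cost and the utilization unchanged.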


Dynamic Queue with Penalty and Situational Awareness

Tasks arrive as a Poisson process with rate λ

Tasks sampled from a distribution $p : \Gamma \to \mathbb{R}_{\ge 0}$

Unit reward for each correct decision

Latency penalty per unit time $c_\gamma$ for task $\gamma \in \Gamma$, and $\bar{c} = E_p[c_\gamma]$

Stage Cost

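$$\max_{t_1, t_2, t_3, \dots} \; \lim_{L \to \infty} \frac{1}{L} \sum_{\ell=1}^{L} z_\ell \left( w_\ell f_{\gamma_\ell}(t_\ell) - \bar{c}\, E[n_\ell]\, t_\ell - \frac{\bar{c} \lambda t_\ell^2}{2} + \beta (t_\ell - S(x_\ell)) \right)$$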

Certainty Equivalent Solution


Illustrative Example I

Optimal Allocations and Rest Time

Receding Horizon Policy


Illustrative Example II

Reward versus Arrival Rate

Benefit per unit task Benefit rate

Optimal arrival rate

Switching occurs when operator is expected to be always non-idle

Designer may pick desired accuracy on each task to design arrival rate


Illustrative Example III

Handling Mandatory Tasks

No Mandatory Tasks Mandatory Tasks Present


Problem Setup

Support System (SS) collects information from the sensor network

Sensor Network ≡ regions surveyed, cold storages, different buildings, etc.

SS streams collected information to the human operator

SS specifies the time the operator should spend on each feed

Based on the operator’s decisions, the SS collects information from the most pertinent source


Quickest Spatial Detection

$N$ regions to be surveyed

any number of anomalous regions

an ensemble of CUSUM algorithms

collection + transmission + processing time at region $\ell$ is $T_\ell > 0$

distance between regions $i$ and $j$: $d_{ij}$

UCSB Campus Map

Spatial Quickest Detection (Srivastava & Bullo ’11)

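1. at iteration $\tau$, pick a region $\ell$ from stationary distribution $q$

2. go to region $\ell$ and collect evidence $y_\tau$

3. update CUSUM statistic for region $\ell$

$$\Lambda^\ell_\tau = \left( \Lambda^\ell_{\tau-1} + \log \frac{f^1_\ell(y_\tau)}{f^0_\ell(y_\tau)} \right)^+$$

4. declare an anomaly at region $\ell$ if $\Lambda^\ell_\tau > \eta$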
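The four steps above can be sketched as follows. The Gaussian observation model in `gaussian_llr`, the class and function names, and the `sample` and `llr` callables are assumptions for illustration; the talk only specifies the region-selection distribution q, the CUSUM recursion, and the threshold η.

```python
import random


def gaussian_llr(y, mu1, mu0=0.0, sigma=1.0):
    """log f1(y)/f0(y) for two Gaussians with common sigma (assumed model)."""
    return ((mu1 - mu0) * y - 0.5 * (mu1 ** 2 - mu0 ** 2)) / sigma ** 2


class EnsembleCusum:
    """One CUSUM statistic per region; only the visited region is updated."""

    def __init__(self, n_regions, eta):
        self.stats = [0.0] * n_regions
        self.eta = eta

    def update(self, region, llr):
        # (x)^+ = max(x, 0): the statistic is clipped at zero.
        self.stats[region] = max(0.0, self.stats[region] + llr)
        return self.stats[region] > self.eta


def survey(q, sample, llr, eta, max_iter=10000, rng=None):
    """Pick regions i.i.d. from q until some region's statistic exceeds eta."""
    rng = rng or random.Random(0)
    detector = EnsembleCusum(len(q), eta)
    regions = list(range(len(q)))
    for iteration in range(1, max_iter + 1):
        region = rng.choices(regions, weights=q)[0]   # step 1
        y = sample(region, rng)                       # step 2
        if detector.update(region, llr(region, y)):   # steps 3 and 4
            return region, iteration
    return None, max_iter
```

A call such as `survey([0.2, 0.6, 0.2], sample, llr, eta=4.0)` returns the declared region and the iteration count, where `sample(region, rng)` draws an observation and `llr(region, y)` returns its log-likelihood ratio.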


Spatial Quickest Detection: Detection Delay

Expected detection delay at region `

$$E[T_d^\ell] = \frac{e^{-\eta} + \eta - 1}{q_\ell\, D(f^1_\ell, f^0_\ell)} \, (q \cdot T + q \cdot D q)$$

Two stage quickest detection strategy

1. pick optimal $q^* = \operatorname{argmin} \sum_{\ell=1}^{N} \pi^1_\ell\, E[T_d^\ell]$

2. adapt $q^*$ with the evidence collected at each stage

Region Selection Probability

Likelihood of Anomaly
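The delay expression and the first stage of the strategy can be explored numerically. In the sketch below, `kl[l]` stands for $D(f^1_\ell, f^0_\ell)$ (interpreted here as a divergence between the two densities, which is an assumption), `D` is taken to be the matrix of distances $d_{ij}$, and the coarse grid search over the simplex is a stand-in for whatever optimizer was actually used.

```python
import math
from itertools import product


def mean_cycle_time(q, T, D):
    """q . T + q . D q: expected processing plus travel time per iteration."""
    n = len(q)
    processing = sum(q[i] * T[i] for i in range(n))
    travel = sum(q[i] * D[i][j] * q[j] for i in range(n) for j in range(n))
    return processing + travel


def expected_delay(q, ell, eta, kl, T, D):
    """Expected detection delay at region ell for selection probabilities q."""
    factor = (math.exp(-eta) + eta - 1.0) / (q[ell] * kl[ell])
    return factor * mean_cycle_time(q, T, D)


def weighted_delay(q, prior, eta, kl, T, D):
    """Objective of stage 1: sum over regions of prior[l] * E[T_d^l]."""
    return sum(prior[l] * expected_delay(q, l, eta, kl, T, D)
               for l in range(len(q)))


def grid_search(prior, eta, kl, T, D, steps=20):
    """Coarse search over strictly positive q on a grid of the simplex."""
    n = len(prior)
    best_q, best_val = None, float("inf")
    for ks in product(range(1, steps), repeat=n - 1):
        last = steps - sum(ks)
        if last < 1:
            continue
        q = [k / steps for k in ks] + [last / steps]
        val = weighted_delay(q, prior, eta, kl, T, D)
        if val < best_val:
            best_q, best_val = q, val
    return best_q, best_val
```

The grid search grows exponentially with the number of regions, so it is only practical for a handful of regions; it is meant to show the trade-off in the objective, not to scale.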


Spatial Quickest Detection with Human Input

human operator allocates time $t$ to an evidence and decides on presence/absence of anomaly

probability of correct decision at region $\ell$ evolves as sigmoid function

$$\begin{cases} f^1_\ell(t), & \text{if an anomaly is present}, \\ f^0_\ell(t), & \text{if no anomaly is present}. \end{cases}$$

support system runs spatial quickest detection algorithm with the decisions of the operator

Critical Issue:

– human decisions are not i.i.d.

– detection delay expressions cannot be used


Expected Detection Delay: Heuristic Approximation

For a drift diffusion model

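Expected decision time $= \text{threshold} / D(f^1, f^0) = t^{\mathrm{inf}}$

Expected delay minimization

$$\operatorname*{minimize}_{q \in \Delta_{N-1}} \; \sum_{\ell=1}^{N} \frac{\pi^1_\ell\, t^{\mathrm{inf}}_\ell}{q_\ell} \, (q \cdot T + q \cdot D q)$$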

detection delay proportional to

likelihood of anomaly

difficulty of task

inverse of region selection probability

processing time and average distance of the region from other regions


Simultaneous Information Aggregation and Processing I

At each iteration

the SS determines the optimal region selection policy

the region selection policy determines the distribution of incoming tasks

the performance on an incoming task from region $\ell$ is $\pi^1_\ell f^1_\ell(t) + (1 - \pi^1_\ell) f^0_\ell(t)$

the SS determines the optimal allocation to each task, based on current reward and penalty
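The performance expression for a task from region ℓ is a prior-weighted mixture of the two sigmoid curves. A minimal sketch, assuming a logistic form for the sigmoids (the talk does not give their parameters, so `rate` and `midpoint` are hypothetical):

```python
import math


def logistic(t, rate, midpoint):
    """Hypothetical sigmoid for the probability of a correct decision."""
    return 1.0 / (1.0 + math.exp(-rate * (t - midpoint)))


def task_performance(t, prior_anomaly, f1, f0):
    """pi1 * f1(t) + (1 - pi1) * f0(t) for a task from one region."""
    return prior_anomaly * f1(t) + (1.0 - prior_anomaly) * f0(t)
```

As the likelihood of an anomaly in a region is updated, the mixture weight shifts between the two curves, which in turn changes the allocation the SS would compute for tasks from that region.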


Simultaneous Information Aggregation and Processing II

Optimal Policies


Conclusions & Future Directions

Conclusions

novel simultaneous information aggregation and processing framework

incorporation of situational awareness models

incorporation of human decisions in sensor management strategies

an adaptive strategy that collects evidence from regions with high likelihood of anomalies and optimally processes it

Future Directions

re-queuing of tasks and preemptive queues

validation with experiments

dynamic anomalies
