two critical issuesmotion.me.ucsb.edu/talks/2011w-taskrelease+surveillance-2x2.pdf · vaibhav...
TRANSCRIPT
Adaptive Information Management Strategiesin Mixed Human-Robot Teams
Vaibhav Srivastava
Center for Control, Dynamical Systems & Computation
University of California at Santa Barbara
http://motion.me.ucsb.edu/∼vaibhav
January 19, 2012
Advisor: Francesco BulloCoAuthors: R. Carli, C. Langbort, K. Plarre
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 1 / 23
Big Picture: Human-robot decision dynamics
Uncertain environment surveyed by human-UAV team(Courtesy: Prof. Kristi Morgansen)
UCSB Camera Network
A Surveillance Operator (Courtesy: http://www.modsim.org/)
Data Center Operator
– How to handle information overload?
– What are optimal information management strategies?Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 2 / 23
Two Critical Issues
Photo courtesy: The Wall Street Journal
Optimal information aggregation
Which source to observe?
Efficient search and detection
Routing for evidence collection
Optimal information processing
Optimal time allocation?
Optimal streaming rate?
Optimal number of operators?
Decision support system to optimize human-robot team objective?
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 3 / 23
Problem Setup
Support System (SS) collects information from the sensor network
Sensor Network ≡ regions surveyed,cold storages, different buildings, etc
SS streams collected information to the human operator
SS specifies the time, the operator should spend of each feed
Based on the operator’s decisions, the SS collects informationfrom the most pertinent source
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 4 / 23
Operator Models I
General vehicle team and human operator interaction modelNehme et al ’08
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 5 / 23
Operator Models II
DDM for Human Decision MakingBogacz et al ’06
Degradation of detection probabilityBertuccelli et al ’10
Evolution of probability of detectionPew ’68
Yerkes Dodson effectYerkes-Dodson 1908
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 6 / 23
Literature Review I
Task Release Control
The service time on a task isa function of utilization ratio (UR)
Yerkes-Dodson(Y-D) law determinesthe expected service time
Task release controller releases atask if UR is below a threshold
The maximally stabilizing arrival ratedepends on Y-D law and UR dynamics
Limitation: Does not incorporateerror rate in the policies
Task Release Setup
Task Release Controller
Max Arrival RateSavla et al ’11
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 7 / 23
Literature Review II
Resource allocation for human operator
Problem: How to optimally allocate operator attention to a batch oftasks or to an incoming stream of tasks
– Static queue: serve N tasks in time T
– Dynamic queue: tasks arrive continuously at some known rate
– Optimal design of queue: What is an optimal arrival rate
Srivastava et al ’11Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 8 / 23
Literature Review III
Time Constrained Static Queue
maximize f1(t1) + · · ·+ fN(tN)
subject to t1 + · · ·+ tN = T
t` ≥ 0, ` ∈ {1, . . . ,N} Optimal Allocations
Dynamic Queue with Latency Penalty
maxt1,t2,t3...
limL→∞
1
L
L∑
`=1
(fγ`(t`)− c̄E[n`]t` −
c̄λt2`
2
)
where expected queue length
E[n`] = n1−`+1+λ−̀1∑
j=1
tj
Optimal Allocations
Expected Queue Length
Limitation: Does not incorporate SA models in policies
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 9 / 23
Literature Review III
Time Constrained Static Queue
maximize f1(t1) + · · ·+ fN(tN)
subject to t1 + · · ·+ tN = T
t` ≥ 0, ` ∈ {1, . . . ,N} Optimal Allocations
Dynamic Queue with Latency Penalty
maxt1,t2,t3...
limL→∞
1
L
L∑
`=1
(fγ`(t`)− c̄E[n`]t` −
c̄λt2`
2
)
where expected queue length
E[n`] = n1−`+1+λ−̀1∑
j=1
tj
Optimal Allocations
Expected Queue Length
Limitation: Does not incorporate SA models in policiesVaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 9 / 23
Time Constrained Static Queue with Situational Awareness
Objective:
Serve N decision making tasks in time T
Maximize expected number of correctdecisions
Keep utilization ratio optimal
Each processed task should be allocatedmore time than the natural allocation
Probability of Detection
Yerkes Dodson effect
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 10 / 23
Dynamic Programming Formulation
Stage Cost
g` = z`(w`f`(t`) + β(t` − S(x`))), ` ∈ {1, . . . ,N},
System Dynamics
Allocation: a`+1 = a` + t` + r`, a1 = 0, a` ∈ [0,T ]
Utilization: x`+1 = (1− e−t`z`/τ + x(`)e−t`z`/τ )e−r`z`/τ , x` ∈ [xmin, xmax]
w` = weight, t` = allocation, r` = rest timeS = Y-D curve τ = operator sensitivity β = cost,f` = performance func. z` = process / don’t process
Optimal Solution
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 11 / 23
Dynamic Programming Formulation
Stage Cost
g` = z`(w`f`(t`) + β(t` − S(x`))), ` ∈ {1, . . . ,N},
System Dynamics
Allocation: a`+1 = a` + t` + r`, a1 = 0, a` ∈ [0,T ]
Utilization: x`+1 = (1− e−t`z`/τ + x(`)e−t`z`/τ )e−r`z`/τ , x` ∈ [xmin, xmax]
w` = weight, t` = allocation, r` = rest timeS = Y-D curve τ = operator sensitivity β = cost,f` = performance func. z` = process / don’t process
Optimal SolutionVaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 11 / 23
Dynamic Queue with Penalty and Situational Awareness
Tasks arrive as a Poisson process with rate λ
Tasks sampled from a distribution p : Γ→ R≥0
Unit reward for each correct decision
Latency penalty per unit-time cγ , for task γ ∈ Γ, and c̄ = Ep[cγ ]
Stage Cost
maxt1,t2,t3...
limL→∞
1
L
L∑
`=1
z`
(w`fγ`(t`)− c̄E[n`]t` −
c̄λt2`
2+ β(t` − S(x`))
)
Certainty Equivalent Solution
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 12 / 23
Dynamic Queue with Penalty and Situational Awareness
Tasks arrive as a Poisson process with rate λ
Tasks sampled from a distribution p : Γ→ R≥0
Unit reward for each correct decision
Latency penalty per unit-time cγ , for task γ ∈ Γ, and c̄ = Ep[cγ ]
Stage Cost
maxt1,t2,t3...
limL→∞
1
L
L∑
`=1
z`
(w`fγ`(t`)− c̄E[n`]t` −
c̄λt2`
2+ β(t` − S(x`))
)
Certainty Equivalent Solution
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 12 / 23
Dynamic Queue with Penalty and Situational Awareness
Tasks arrive as a Poisson process with rate λ
Tasks sampled from a distribution p : Γ→ R≥0
Unit reward for each correct decision
Latency penalty per unit-time cγ , for task γ ∈ Γ, and c̄ = Ep[cγ ]
Stage Cost
maxt1,t2,t3...
limL→∞
1
L
L∑
`=1
z`
(w`fγ`(t`)− c̄E[n`]t` −
c̄λt2`
2+ β(t` − S(x`))
)
Certainty Equivalent Solution
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 12 / 23
Illustrative Example I
Optimal Allocations and Rest Time
Receding Horizon Policy
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 13 / 23
Illustrative Example II
Reward versus Arrival Rate
Benefit per unit task Benefit rate
Optimal arrival rate
Switching occurs when operator is expected to be always non-idle
Designer may pick desired accuracy on each task to design arrival rate
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 14 / 23
Illustrative Example III
Handling Mandatory Tasks
No Mandatory Tasks Mandatory Tasks Present
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 15 / 23
Problem Setup
Support System (SS) collects information from the sensor network
Sensor Network ≡ regions surveyed,cold storages, different buildings, etc
SS streams collected information to the human operator
SS specifies the time, the operator should spend of each feed
Based on the operator’s decisions, the SS collects informationfrom the most pertinent source
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 16 / 23
Quickest Spatial Detection
N region to be surveyed
any number of anomalous regions
an ensemble of CUSUM algorithms
collection+transmission+processingtime at region ` is T` > 0
distance between region i and j : dij UCSB Campus Map
Spatial Quickest Detection (Srivastava & Bullo ’11)
1 at iteration τ , pick a region ` from stationary distribution q
2 go to region ` and collect evidence yτ3 update CUSUM statistic for region `
Λ` = (Λ`−1 + log(f 1` (yτ )/f 0
` (yτ ))+
4 declare an anomaly at region ` if Λ` > η
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 17 / 23
Quickest Spatial Detection
N region to be surveyed
any number of anomalous regions
an ensemble of CUSUM algorithms
collection+transmission+processingtime at region ` is T` > 0
distance between region i and j : dij UCSB Campus Map
Spatial Quickest Detection (Srivastava & Bullo ’11)
1 at iteration τ , pick a region ` from stationary distribution q
2 go to region ` and collect evidence yτ3 update CUSUM statistic for region `
Λ` = (Λ`−1 + log(f 1` (yτ )/f 0
` (yτ ))+
4 declare an anomaly at region ` if Λ` > η
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 17 / 23
Spatial Quickest Detection: Detection Delay
Expected detection delay at region `
E[T `d ] =
e−η + η − 1
q`D(f 1` , f
0` )
(q · T + q · Dq)
Two stage quickest detection strategy
1 pick optimal q∗ = argmin∑N
`=1 π1`E[T `
d ]
2 adapt q∗ with the evidence collected at each stage
Region Selection Probability Likelihood of Anamoly
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 18 / 23
Spatial Quickest Detection: Detection Delay
Expected detection delay at region `
E[T `d ] =
e−η + η − 1
q`D(f 1` , f
0` )
(q · T + q · Dq)
Two stage quickest detection strategy
1 pick optimal q∗ = argmin∑N
`=1 π1`E[T `
d ]
2 adapt q∗ with the evidence collected at each stage
Region Selection Probability Likelihood of Anamoly
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 18 / 23
Spatial Quickest Detection: Detection Delay
Expected detection delay at region `
E[T `d ] =
e−η + η − 1
q`D(f 1` , f
0` )
(q · T + q · Dq)
Two stage quickest detection strategy
1 pick optimal q∗ = argmin∑N
`=1 π1`E[T `
d ]
2 adapt q∗ with the evidence collected at each stage
Region Selection Probability Likelihood of Anamoly
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 18 / 23
Spatial Quickest Detection with Human Input
human operator allocates time t to an evidenceand decides on presence/absence of anomaly
probability of correct decision at region ` evolves as sigmoid function
{f 1` (t), if an anomaly is present,
f 0` (t), if no anomaly is present.
support system runs spatial quickest detection algorithmwith the decisions of the operator
Critical Issue:– human decisions are not i.i.d.– detection delay expressions can not be used
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 19 / 23
Expected Detection Delay: Heuristic Approximation
For a drift diffusion model
Expected decision time = threshold/D(f 1, f 0) = t inf
Expected delay minimization
minimizeq∈∆N−1
N∑
`=1
π1` t
inf`
qs(q · T + q · Dq)
detection delay proportional to
likelihood of anomaly
difficulty of task
inverse of region selection probability
processing time and average distance of the region from other regions
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 20 / 23
Simultaneous Information Aggregation and Processing I
At each iteration
the SS determines the optimal region selection policy
the region selection policy determinesthe distribution of incoming tasks
the performance on an incoming taskfrom region ` is π1
` f1` (t) + (1− π1
` )f 0` (t)
the SS determines the optimal allocation to each task,based on current reward and penalty
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 21 / 23
Simultaneous Information Aggregation and Processing II
Optimal Policies
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 22 / 23
Conclusions & Future Directions
Conclusions
novel simultaneous information aggregation and processing framework
incorporation of situational awareness models
incorporation of human decisions in sensor management strategies
an adaptive strategy that collects evidence from regions with highlikelihood of anomalies and optimally processes it
Future Directions
re-queuing of tasks and preemptive queues
validation with experiments
dynamic anomalies
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 23 / 23
Conclusions & Future Directions
Conclusions
novel simultaneous information aggregation and processing framework
incorporation of situational awareness models
incorporation of human decisions in sensor management strategies
an adaptive strategy that collects evidence from regions with highlikelihood of anomalies and optimally processes it
Future Directions
re-queuing of tasks and preemptive queues
validation with experiments
dynamic anomalies
Vaibhav Srivastava (UCSB) Adaptive Information Management January 19, 2012 23 / 23