uncertain observation times

Uncertain Observation Times Shaunak Chatterjee & Stuart Russell Computer Science Division University of California, Berkeley


Post on 23-Feb-2016


Page 1: Uncertain Observation Times

Uncertain Observation Times

Shaunak Chatterjee & Stuart Russell

Computer Science Division

University of California, Berkeley

Page 2: Uncertain Observation Times

Overview

• Why uncertain observation times matter
• Scenarios considered:
  1. Each event is observed
     – Efficient DP algorithm
  2. Missing and false events
     – Practical approximation algorithm
  3. Multiple asynchronous observation streams

Page 3: Uncertain Observation Times
Page 4: Uncertain Observation Times

Motivation

• Two types of data streams
  – Automatically time-stamped data traces
  – Human annotations for temporal events
• Many essential facts cannot be recorded automatically
• Human-generated timestamps are often wrong
• Assuming the correctness of timestamps can lead to nonsense results

Page 5: Uncertain Observation Times

Example: at 16.30, nurse enters “gave phenylephrine at 16.00”

Data entry time

Event timestamp

Page 6: Uncertain Observation Times

Example: at 16.30, nurse enters “gave phenylephrine at 16.00”

Data entry time

Actual event time

Page 7: Uncertain Observation Times

Ubiquity of uncertain observation times

• Nurse monitoring a patient in the ICU
  – Hundreds of events recorded by the nurse
    • Usually recorded after the event, sometimes before
• Manual recording of events
  – Science experiments
    • Biology, chemistry, physics
  – Industrial plants
• Multiple observation traces
  – Various historians’ accounts of a period
  – Only one underlying truth

Page 8: Uncertain Observation Times

Sample trace generated from model

• Correct chronological ordering of time stamps

Figure (timeline) labels:
  – Actual time of event (a_i): nurse gives medicine at 10:23 a.m.
  – Recording time of event (d_i): nurse records event at 11:00 a.m.
  – Recorded time of event (m_i): nurse records time of event as 10:30 a.m.
  – Previous event’s time stamp: 10:15 a.m.

Page 9: Uncertain Observation Times

Dynamic Bayesian networks

• DBNs are discrete-time multivariate stochastic process models (include HMMs and KFs)

• DBNs facilitate modeling of complex systems with sensor noise etc.

Large-scale physiological models pursued since the 1960s, but little attention paid to nature of real data

Page 10: Uncertain Observation Times

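Simple DBN representation

Figure: DBN unrolled over eight time steps, with state nodes X1 to X8, observation nodes Y1 to Y8, a Boolean event indicator per time step (three true, five false), and per-event nodes a1 to a3, m1 to m3 and d1 to d3. Value rows shown in the figure: 2 5 7; 2 4 8; 6 6 8.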

Page 11: Uncertain Observation Times

Objective

• To design a graphical model that allows for uncertainty in observation times

• Derive efficient inference algorithms
  – Naïve algorithm has O(M^T) complexity
  – Reduce to O(MT)
    • Ordering constraints
    • Dynamic programming
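The effect of combining an ordering constraint with dynamic programming can be sketched in a few lines. This is only an illustration, not the algorithm from the talk: it assumes K events, each with a small window of candidate times, and a hypothetical local weight `score(k, t)` (for example, the probability of the recorded time given actual time t). Neither name appears in the slides. The DP sums the weight of all strictly increasing time assignments, and a brute-force enumeration is included for comparison.

```python
from itertools import product


def ordered_sum_dp(windows, score):
    """Sum, over strictly increasing assignments a_0 < a_1 < ..., of the
    product of score(k, a_k). windows[k] lists candidate times for event k.

    Cost is O(K * M^2) for K events and windows of size M (the inner sum
    is linear in M); a running prefix sum would bring it down to O(K * M).
    """
    prev = {}  # prev[t]: total weight of valid prefixes whose last event is at t
    for k, window in enumerate(windows):
        cur = {}
        for t in window:
            if k == 0:
                mass = 1.0
            else:
                # Ordering constraint: the previous event must be strictly earlier.
                mass = sum(w for s, w in prev.items() if s < t)
            cur[t] = score(k, t) * mass
        prev = cur
    return sum(prev.values())


def ordered_sum_brute(windows, score):
    """Same quantity by enumerating every combination (exponential in K)."""
    total = 0.0
    for times in product(*windows):
        if all(a < b for a, b in zip(times, times[1:])):
            weight = 1.0
            for k, t in enumerate(times):
                weight *= score(k, t)
            total += weight
    return total
```

With a constant score of 1.0 the function simply counts the assignments that respect the ordering, which is a convenient sanity check.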

Page 12: Uncertain Observation Times

Key constraint assumption

• Person recording events gets the order right
• For all i, j: m_i > m_j => a_i > a_j

Figure: two timelines pairing the recorded time of each event (m_i) with its actual time (a_i), one showing a valid association and one showing an invalid association.
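The constraint can be checked directly on a candidate association. The helper below is a minimal sketch; the function name and the list-based representation are assumptions made for illustration.

```python
def is_valid_association(recorded, actual):
    """Return True if, for all i, j, recorded[i] > recorded[j]
    implies actual[i] > actual[j] (the ordering constraint)."""
    n = len(recorded)
    return all(
        actual[i] > actual[j]
        for i in range(n)
        for j in range(n)
        if recorded[i] > recorded[j]
    )
```

For example, recorded times (2, 4, 8) with actual times (2, 5, 7) satisfy the constraint, while actual times (5, 2, 7) do not, because the second recorded event would precede the first.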

Page 13: Uncertain Observation Times

Pre-computation step

• Likelihood of the data segment between the current event time stamp (a_k) and the next hypothesized event time stamp (a_{k+1})

• Pre-compute for all k, and all possible values of a_k and a_{k+1}
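A rough sketch of such a pre-computation follows. It assumes a plain HMM with a fixed transition matrix `trans` and per-step emission likelihoods `emit[t][x] = P(y_t | X_t = x)`, and it ignores any event-dependent dynamics. These names and simplifications are assumptions, not taken from the slides. For each consecutive pair of events it stores one matrix per candidate pair (s, t), whose entry [x][y] is the likelihood of the data in steps s+1..t jointly with ending in state y, given state x at time s.

```python
def step_matrix(trans, emit_t):
    """Entry [x][y] = P(X_t = y | X_{t-1} = x) * P(y_t | X_t = y)."""
    n = len(trans)
    return [[trans[x][y] * emit_t[y] for y in range(n)] for x in range(n)]


def matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def precompute_segments(trans, emit, windows):
    """table[k][(s, t)] for s in windows[k], t in windows[k + 1], s < t."""
    n = len(trans)
    table = []
    for k in range(len(windows) - 1):
        entries = {}
        targets = set(windows[k + 1])
        last = max(windows[k + 1])
        for s in windows[k]:
            # Start from the identity and extend the segment one step at a time,
            # so every end point t reuses the product built for t - 1.
            seg = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
            for t in range(s + 1, last + 1):
                seg = matmul(seg, step_matrix(trans, emit[t]))
                if t in targets:
                    entries[(s, t)] = seg
        table.append(entries)
    return table
```

With K events and windows of size M, the table holds on the order of K·M² entries, one per (k, a_k, a_{k+1}) combination.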

Page 14: Uncertain Observation Times

Modified Baum-Welch algorithm

Page 15: Uncertain Observation Times

Complexity

• Modified time complexity O(M S^2 T)
  – M: maximum size of the time window of uncertainty
  – S: number of states in the system
  – T: number of time steps
• Space complexity
  – O(K M^2) – storing the pre-computed segment likelihoods
  – O(KM) – storing α, β and γ

Page 16: Uncertain Observation Times

Simulation results – Increased likelihood of evidence

Window of uncertainty

Page 17: Uncertain Observation Times

Simulation results – General accuracy of inference

Page 18: Uncertain Observation Times

Simulation results – Computation time vs size of uncertainty window

Page 19: Uncertain Observation Times

Unreported events, false reports

• Not all events are reported
  – Unobserved
  – Negligence
• Not all reports are true
  – Double entry of a single data point
  – Misinterpretation of information
  – Intended actions reported but not carried out

Page 20: Uncertain Observation Times

Missing and false reports

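Figure: four hypothesized events (a1 to a4, with report indicators θ1 to θ4) and three reports (m1 to m3, with association indices φ1 to φ3).

Legend:
  – Actual time of event (a_i)
  – Recorded time of event (m_j)
  – Event i reported? (θ_i)
  – Index of event corresponding to report j (φ_j)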

Page 21: Uncertain Observation Times

Modified DP and complexity

• The previous algorithm was compact because of the one-to-one correspondence between events and reports
  – Now have to consider all possibilities
    • Unless there are constraints (more on this later)
• Chronological mapping of events’ time stamps still holds
  – This again leads to an efficient dynamic program
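The chronological mapping means that event-to-report associations never cross, which is the same structure exploited by sequence-alignment dynamic programs. The sketch below is a simplified illustration under strong assumptions: event times are treated as known, the penalties `log_p_miss` and `log_p_false` and the `match_logp` function are hypothetical, and only the best-scoring association is computed rather than a full posterior. None of this is taken from the slides.

```python
def align_reports(event_times, report_times, log_p_miss, log_p_false, match_logp):
    """Best log-score of an order-preserving association between events and reports.

    Each event is either matched to one report or left unreported;
    each report is either matched to one event or treated as false.
    """
    n_events, n_reports = len(event_times), len(report_times)
    neg_inf = float("-inf")
    best = [[neg_inf] * (n_reports + 1) for _ in range(n_events + 1)]
    best[0][0] = 0.0
    for i in range(n_events + 1):
        for j in range(n_reports + 1):
            if i > 0 and j > 0:  # event i-1 is reported as report j-1
                cand = best[i - 1][j - 1] + match_logp(event_times[i - 1],
                                                       report_times[j - 1])
                best[i][j] = max(best[i][j], cand)
            if i > 0:  # event i-1 went unreported
                best[i][j] = max(best[i][j], best[i - 1][j] + log_p_miss)
            if j > 0:  # report j-1 is a false report
                best[i][j] = max(best[i][j], best[i][j - 1] + log_p_false)
    return best[n_events][n_reports]
```

The table has (I + 1) × (J + 1) cells; the additional factors in the complexity quoted in the talk come from also reasoning over event times, which this sketch leaves out.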

Page 22: Uncertain Observation Times

Computational complexity

• In the general case, uncertainty windows are no longer limited, since event i can be associated with any report j
• O(I J T^2)
  – I is the number of hypothesized events
  – J is the number of reports
  – T is the length of the temporal sequence

Page 23: Uncertain Observation Times

Practical assumptions – I

• Data entries are made in blocks
  – All reports in a given block (e.g., the night shift) must be for events that occurred (really or otherwise) in that block
  – Computational complexity is linear in T if blocks are of constant size

Page 24: Uncertain Observation Times

Practical assumptions – II

• When unobserved events and false reports are both rare events
  – We can perform approximate inference by NOT considering all possible (a_i, m_j) associations
  – The posterior distribution is highly concentrated along the “skewed diagonal” corresponding to a small number of errors
  – Assuming a bounded number of errors gives time complexity proportional to T
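The saving from staying near the diagonal can be made concrete by counting cells of an event-by-report table. In an order-preserving association, the difference between the number of events and reports consumed so far equals the number of unreported events minus the number of false reports, so any association with at most `max_errors` errors stays within |i − j| ≤ `max_errors`. The function below is a small sketch; `max_errors` is a hypothetical parameter name.

```python
def band_cells(n_events, n_reports, max_errors):
    """Number of (i, j) table cells with |i - j| <= max_errors,
    for i in 0..n_events and j in 0..n_reports."""
    return sum(
        1
        for i in range(n_events + 1)
        for j in range(n_reports + 1)
        if abs(i - j) <= max_errors
    )
```

For 100 events and 100 reports, a band of width 2 covers 499 cells, compared with 10201 cells in the full table, and the band grows linearly rather than quadratically as the sequences get longer.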

Page 25: Uncertain Observation Times

Simulation results – Posterior is peaked around the skewed diagonal

Page 26: Uncertain Observation Times

Simulation results – Hypothesizing more events leads to better recall

Page 27: Uncertain Observation Times

Effect of varying c

Page 28: Uncertain Observation Times

Multiple observation sequences

• Formulation
  – Several “sources” reporting on the same events
  – Key assumption
    • Individual report sequences are independent given the actual truth (the X chain)

Page 29: Uncertain Observation Times

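Figure: a latent trajectory (a_i, a_{i+1}, ..., a_I with report indicators θ_i, θ_{i+1}, ..., θ_I) and R evidence trajectories. Evidence trajectory r (r = 1, ..., R) consists of reports m_j^(r), m_{j+1}^(r), ..., m_J^(r) and association indices φ_j^(r), φ_{j+1}^(r), ..., φ_J^(r).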

Page 30: Uncertain Observation Times

Multiple observation sequences

• Formulation
  – Several “sources” reporting on the same events
  – Key assumption
    • Individual report sequences are independent given the actual truth (the X chain)
• Inference
  – Similar DP algorithms apply, given the assumptions of ordering constraints, blocks, etc.
  – Complexity increases linearly with the number of report sequences
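Because the report sequences are conditionally independent given the truth, their log-likelihoods simply add. The sketch below illustrates this under simplifying assumptions that are not from the slides: every source reports every event exactly once, and `report_logp(a, m)` is a hypothetical per-report log-likelihood.

```python
def combined_log_likelihood(actual_times, report_streams, report_logp):
    """Sum of per-source log-likelihoods for one hypothesis about the actual times.

    The outer loop visits each source once, so the cost grows linearly
    with the number of report sequences.
    """
    total = 0.0
    for stream in report_streams:
        for a, m in zip(actual_times, stream):
            total += report_logp(a, m)
    return total


def best_hypothesis(candidates, report_streams, report_logp):
    """Pick the candidate set of actual times with the highest combined score."""
    return max(
        candidates,
        key=lambda c: combined_log_likelihood(c, report_streams, report_logp),
    )
```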

Page 31: Uncertain Observation Times

Conclusions

• Handling uncertainty in observation times is critical for correct modeling and inference
• Assumptions about qualitative accuracy (e.g., order of events) can be very helpful
• Given such assumptions, the computational complexity of inference remains unchanged (modulo some constant factors) while handling the following cases
  – Noisy observation times
  – Missing and false reports
  – Multiple report sequences

Page 32: Uncertain Observation Times

Questions?

Thank you!