Monitoring Health by Detecting Drifts and Outliers for a Smart Environment Inhabitant
June 27-28, 2006 Vikramaditya Jakkula
Monitoring Health by Detecting Drifts and Outliers for a Smart Environment Inhabitant
Gaurav Jain, Diane J. Cook, Vikramaditya Jakkula
MavHome
• UTA project
• Unique
– Focus on entire home
• House perceives and acts
– Sensors
– Controllers for devices
– Connections to the mobile user and Internet
• Unified project incorporating varied AI techniques, cross-disciplinary with mobile computing, databases, multimedia, and others
MavHome Goals
The goals of an intelligent environment control system should be to
1. Maximize the safety and security of the inhabitant(s).
2. Maximize the comfort of the inhabitant(s) by automating their environment to the fullest and most desirable extent possible.
3. Minimize the consumption of natural resources in an effort to reduce costs and maximize environment efficiency.
Environment
MavHome environments: MavDen, MavKitchen, MavPad
Environment (continued)
Overview
Core Technologies
• Minimal Sequential Patterns Using “ED”
Given an input stream S of event occurrences O, ED:
1. Partitions S into Maximal Episodes, Pmax.
2. Creates Itemsets, I, from the Maximal Episodes.
3. Creates a Candidate Significant Episode, C, for each Itemset I, and computes one or more Significance Values, V, for each Candidate.
4. Identifies Significant Episodes by evaluating the Significance Values of the candidates.
Core Technologies
• Decision Making using ProPHeT
ProPHeT is the main controlling component of the system. It uses data filtered through Episode Discovery (ED) to create a Hierarchical Hidden Markov Model (HHMM).
The HHMM represents a user model that includes all of the episodes (e.g., entering a room, watching TV, sitting in a chair and listening to music, and so forth) that a person performs in the environment.
Core Technologies
Need for Health Monitoring
Problem
• The elderly, people with disabilities, and the chronically ill need health care.
• Personal preference
• Increased care cost
• Inadequate infrastructure
Solution
• Low-cost automated health monitoring system at home
Lanspery & Hyde state: “For most of us, the word ‘home’ evokes powerful emotions [and is] a refuge.”
Drift Detection Algorithm
• Diurnal algorithm
• Uses autocorrelation plots
• Three steps
– Update history
– Detect drifts
– Report generation
Input: history h, frequency sets, action list and their criticalities
Output: report file
• update h with the frequency sets
• for each action a loop
– find the drift type d in action a’s history
– send the drift d for action a to the report manager
• the report manager generates the final report based on the criticality of each action, the current drift parameters and previous drift parameters.
Autocorrelation at lag k, for n observations x_1, ..., x_n with mean μ and variance σ²:
$$ r_k = \frac{\sum_{i=1}^{n-k} (x_i - \mu)(x_{i+k} - \mu)}{(n-k)\,\sigma^2} $$
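A minimal Python sketch of the lag-k autocorrelation used for drift detection, assuming μ and σ² are the mean and population variance of the whole window and that the second factor is the observation k steps later. The function names and the default maximum lag are illustrative, not taken from the slides.

```python
from statistics import fmean, pvariance


def autocorrelation(x, k):
    """Lag-k autocorrelation r_k of the sequence x."""
    n = len(x)
    if not 0 < k < n:
        raise ValueError("lag k must satisfy 0 < k < len(x)")
    mu = fmean(x)
    var = pvariance(x, mu)  # population variance, sigma^2
    if var == 0:
        # Assumption: a perfectly constant series is reported as uncorrelated.
        return 0.0
    total = sum((x[i] - mu) * (x[i + k] - mu) for i in range(n - k))
    return total / ((n - k) * var)


def autocorrelation_plot(x, max_lag=None):
    """Values r_1 .. r_max_lag; index 0 of the result holds lag 1."""
    if max_lag is None:
        max_lag = len(x) // 2  # illustrative default, not from the slides
    return [autocorrelation(x, k) for k in range(1, max_lag + 1)]
```

For a noise-free series with period three, such as `[1, 2, 3]` repeated, the value at lag 3 is close to 1 while the value at lag 1 is negative.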
Update History
• Maintains six-hourly, daily, weekly history queues.
• Input is four six-hourly frequency sets.
• Different window sizes are possible
• Large window vs. small window
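The history update for a single action could look roughly like the sketch below. The class name, the queue lengths, and the rule that a weekly entry is the sum of seven daily totals are assumptions made for illustration; the slides only state that six-hourly, daily and weekly queues are kept and that the input is four six-hourly frequency sets.

```python
from collections import deque


class ActionHistory:
    """Six-hourly, daily and weekly frequency queues for one action."""

    def __init__(self, six_hourly_len=28, daily_len=28, weekly_len=8):
        # Queue lengths play the role of window sizes (hypothetical defaults).
        self.six_hourly = deque(maxlen=six_hourly_len)
        self.daily = deque(maxlen=daily_len)
        self.weekly = deque(maxlen=weekly_len)
        self._days_seen = 0
        self._week_total = 0

    def update(self, four_counts):
        """Add one day of data: four six-hourly frequency counts."""
        if len(four_counts) != 4:
            raise ValueError("expected four six-hourly counts for one day")
        self.six_hourly.extend(four_counts)
        day_total = sum(four_counts)
        self.daily.append(day_total)
        self._week_total += day_total
        self._days_seen += 1
        if self._days_seen % 7 == 0:
            # Assumption: close a week after every seventh day.
            self.weekly.append(self._week_total)
            self._week_total = 0
```

A larger queue length smooths out short-lived changes, while a smaller one reacts faster, which is the large-window versus small-window trade-off.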
Detecting Drifts
• Input: action a, history h, reporter r
• Output: drift type d and its parameters p
• check if action a has drift type d == no drift
• if yes then
– send the drift type and its parameters to the reporter
– return to the calling function
• check if action a has drift type d == cyclic or increasing
• if yes then
– send the drift type and its parameters to the reporter
– return to the calling function
• otherwise send the drift type as chaotic to the reporter
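The classification order above can be sketched as a small dispatcher. The three test functions are passed in as parameters and are hypothetical placeholders, and the label "sloping" stands in for the slide's "increasing" case. Returning the result instead of sending it to a reporter is also a simplification.

```python
def detect_drift(series, is_no_drift, find_cycle, find_slope):
    """Classify one action's history as no drift, cyclic, sloping or chaotic.

    is_no_drift(series) -> bool
    find_cycle(series)  -> cycle period, or None if not cyclic
    find_slope(series)  -> slope length, or None if not sloping
    """
    if is_no_drift(series):
        return "no drift", {}
    period = find_cycle(series)
    if period is not None:
        return "cyclic", {"period": period}
    slope_length = find_slope(series)
    if slope_length is not None:
        return "sloping", {"slope_length": slope_length}
    # Anything that no earlier test claims is classified as chaotic.
    return "chaotic", {}
```

Because the tests run in a fixed order, a series is labeled chaotic only after every other classification has been ruled out.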
Test for no-drift
• No-drift?
– constant for a significant period of time, and
– may have random noise.
• Only the top half of the autocorrelation plot is used. Why?
• Test:
– autocorrelation plot values < threshold. Why?
– Less than 10% of these values should lie outside the (m – 2s, m + 2s) range. Why?
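One possible reading of the no-drift test is sketched below. Several points are assumptions because the slide leaves them open: "top half" is taken to mean the positive autocorrelation values, the 0.5 threshold is a placeholder, m and s are taken to be the mean and standard deviation of the frequency values, and the 10% rule is applied to those frequency values.

```python
from statistics import fmean, pstdev


def is_no_drift(series, acf, threshold=0.5, max_outside=0.10):
    """Return True if the series looks constant apart from random noise.

    series: frequency values in the window
    acf:    autocorrelation values for lags 1, 2, ...
    """
    # Assumption: "top half of the plot" means the positive values only.
    positive = [r for r in acf if r > 0]
    if any(r >= threshold for r in positive):
        return False
    m = fmean(series)
    s = pstdev(series)
    if s == 0:
        return True  # perfectly constant series
    outside = sum(1 for v in series if not (m - 2 * s < v < m + 2 * s))
    return outside < max_outside * len(series)
```

The autocorrelation values are passed in rather than computed here so that the same plot can be reused by the other drift tests.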
Test for Cyclic
• cyclic trend shows high upward peaks in autocorrelation graph
[Figure: autocorrelation values (y-axis, -1.5 to 1.5) vs. time lag (x-axis, lags 1 to 27).]
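A rough sketch of a cyclic test that looks for the first high upward peak in the autocorrelation values and reports its lag as the cycle period. The peak threshold of 0.5 and the local-maximum rule are assumptions; the slide does not say how "high" a peak must be.

```python
def find_cycle_period(acf, peak_threshold=0.5):
    """Return the lag of the first high upward peak, or None.

    acf[0] holds the value for lag 1, acf[1] for lag 2, and so on.
    """
    for i in range(1, len(acf) - 1):
        is_high = acf[i] >= peak_threshold
        is_peak = acf[i] > acf[i - 1] and acf[i] >= acf[i + 1]
        if is_high and is_peak:
            return i + 1  # convert list index to lag
    return None
```

A steadily decreasing plot never produces a local maximum, so a sloping trend is not mistaken for a cycle by this check.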
Test for Sloping
• A high degree of autocorrelation exists between adjacent and near-adjacent observations.
• High value at lag one
• Value decreases with increase in lag
• Slope length is the smallest lag at which the values stop decreasing.
• Note: Random noise is suppressed by the autocorrelation plot
[Figure: autocorrelation values (y-axis, -3 to 1.5) vs. time lag (x-axis, lags 1 to 27).]
[Figure: frequency (y-axis, 0 to 20) vs. time (x-axis, 1 to 28).]
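The sloping test can be sketched as follows, assuming a placeholder threshold of 0.5 for a "high" value at lag one and taking the slope length to be the last lag of the initial decreasing run. Both choices are interpretations, not values given on the slide.

```python
def find_slope_length(acf, high=0.5):
    """Return the slope length for a sloping trend, or None.

    acf[0] holds the value for lag 1, acf[1] for lag 2, and so on.
    """
    if not acf or acf[0] < high:
        return None  # no high autocorrelation at lag one
    for i in range(1, len(acf)):
        if acf[i] >= acf[i - 1]:
            # acf[i - 1] is the value at lag i, where the decrease ends.
            return i
    # Still decreasing at the largest lag examined.
    return len(acf)
```

For example, values `[0.9, 0.7, 0.5, 0.6]` decrease through lag 3 and rise at lag 4, so the sketch reports a slope length of 3.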
Test for Chaotic
• No test for chaotic
• Anything not yet classified ends up as chaotic
• Causes:
– large number of irregular changes
– heavy non-random noise in data in all the windows
– sudden large changes in the distribution
– seen for a short period of time when drift type changes
• Reporting of drifts will be discussed after presentation of the outlier detection algorithms
Outlier Detection Algorithms
• Two types of outliers
– Extremely high or low value in periodic frequency
– Occurrence of an unexpected action in an ordered sequence of actions.
• Separate algorithms for each
– Autocorrelation-based outlier detection
  • Uses drift detection method
  • Outlier if last data point lies outside (m – 3s, m + 3s)
  • Tested for all window sizes.
  • If an outlier is found, then drift detection is not done.
– Prediction-based outlier detection
• Why two methods?
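The (m – 3s, m + 3s) check on the last data point might be sketched like this. It assumes m and s are the mean and standard deviation of the earlier points in the window, excluding the point under test, and the window sizes shown are placeholders.

```python
from statistics import fmean, pstdev


def is_outlier(window):
    """True if the last value lies outside (m - 3s, m + 3s)."""
    if len(window) < 3:
        return False  # too little history to judge
    *history, last = window
    m = fmean(history)
    s = pstdev(history)
    if s == 0:
        return last != m  # constant history: any change is an outlier
    return not (m - 3 * s < last < m + 3 * s)


def outlier_in_any_window(series, window_sizes=(7, 14, 28)):
    """Apply the check to the most recent points for each window size."""
    return any(
        is_outlier(series[-w:]) for w in window_sizes if len(series) >= w
    )
```

When this check flags the newest point, the slides indicate that drift detection is skipped for that update.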
Prediction-based Outlier Detection
• Live-monitoring method
• Uses Active LeZi (ALZ) [2] to find the expected pattern in the data.
• ALZ uses data compression to predict the next action in a sequence. It determines the probability distribution for each action at any point of time.
• When an action occurs this probability distribution is used to determine if the action is an outlier or not.
Prediction-based Outlier Detection
• To determine if an action x is an outlier we calculate the anomaly measure n(x).
• Two methods are used to calculate anomaly measure. Why?
• To determine the importance of an outlier we calculate the urgency factor u(x)
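Anomaly measure n1(x):
$$ n_1(x) = \begin{cases} 1, & \text{if } \rho(x) \cdot 100 < 1 \\ \dfrac{1}{\rho(x) \cdot 100}, & \text{otherwise} \end{cases} $$
Anomaly measure n2(x):
$$ n_2(x) = \begin{cases} 1, & \text{if } \rho(x) \cdot 100 \le \rho(y) \\ \dfrac{\rho(y)}{\rho(x) \cdot 100}, & \text{otherwise} \end{cases} $$
Urgency factor:
$$ u(x) = n(x) \cdot c(x) $$
Report if u(x) >= 0.1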
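A small sketch of the two anomaly measures and the urgency factor. The slide does not define y or c(x); here ρ(y) is assumed to be the probability of a reference action (for example the action the predictor considers most likely) and c(x) is assumed to be the criticality of action x. Function names are illustrative.

```python
def anomaly_n1(p_x):
    """First anomaly measure, from the probability p_x of action x."""
    pct = p_x * 100
    return 1.0 if pct < 1 else 1.0 / pct


def anomaly_n2(p_x, p_y):
    """Second anomaly measure, relative to a reference probability p_y."""
    pct = p_x * 100
    # The division is only reached when pct > p_y >= 0, so pct is non-zero.
    return 1.0 if pct <= p_y else p_y / pct


def urgency(n_x, c_x):
    """Urgency factor u(x) = n(x) * c(x)."""
    return n_x * c_x


def should_report(n_x, c_x, threshold=0.1):
    """Report the outlier when the urgency reaches the 0.1 threshold."""
    return urgency(n_x, c_x) >= threshold
```

An action with predicted probability 0.5 gets n1 = 0.02, so it is not reported even at criticality 1, whereas an action with probability below 1% gets n1 = 1 and is reported whenever its criticality is at least 0.1.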
Report generation for Autocorrelation-based algorithms
• Which drift or outlier is important to report?
• Uses
– the current classification,
– the previous classification,
– the criticality of the action, and
– other parameters (confidence, length of drifts etc.)
• Three levels
– Level one: Critical drifts and outliers
– Level two: Important drifts and outliers
– Level three: All drifts and outliers
Report generation
• Level one
– Action criticality is above medium, and either the classification changed from the previous one or the cycle period changed.
• Level two
– All outliers
– Criticality is above medium
– Previous classification changes
– Cycle period changes
– Confidence changes by some amount
• Level three
– Classification of each action.
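The report levels can be sketched as a function that returns the most critical level at which a finding first appears. The numeric criticality scale (0 to 1 with medium at 0.5), the 0.2 confidence-change cutoff, the field names, and the reading of the level-two items as alternatives are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    criticality: float            # assumed scale 0..1, "medium" = 0.5
    classification: str           # current drift classification
    previous_classification: str
    cycle_period_changed: bool = False
    is_outlier: bool = False
    confidence_change: float = 0.0


def report_level(f, medium=0.5, min_confidence_change=0.2):
    """Return 1 (critical), 2 (important) or 3 (everything else)."""
    changed = f.classification != f.previous_classification
    critical = f.criticality > medium
    if critical and (changed or f.cycle_period_changed):
        return 1
    if (
        f.is_outlier
        or critical
        or changed
        or f.cycle_period_changed
        or abs(f.confidence_change) >= min_confidence_change
    ):
        return 2
    return 3
```

A level-two report would then include every finding with level 1 or 2, and a level-three report would list the classification of every action.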
Experiments
• HMS was tested using both synthetic and real data (activity and health).
• Five sets
– Synthetic set one
– Synthetic set two
– Real set one
– Real set two
– Health set
• Step 1: verify algorithms using synthetic sets
• Step 2: analyze how the algorithms work on real and health sets
Nature of data
• Synthetic set one
– To test autocorrelation-based algorithm
– Hundred days, nine actions
– 10639 data points
– Random criticalities
• Synthetic set two
– To test prediction-based algorithm
– 100 data points
– Four actions
Table: Synthetic set one
Action name | Description
PerfectCyclic3_On | Cyclic; period of three days; no noise.
DailyConstant_On | A daily constant; no noise.
PerfectIncreasing_On | Increasing; no noise.
NoisyIncresing_On | Increasing with noise.
NoisyCyclic7_On | Noisy cyclic with a period of a week.
Cyclic3ToNoisyCylic7_On | Cyclic (period 3) changing to cyclic (period 7) on day 41.
NoisyIncrToNoisyDesc_On | Noisy increasing changing to noisy decreasing on day 41.
NoisyDecrToCyclic3_On | Noisy decreasing drift changing to cyclic with period 3.
Changing_On | A drift constantly changing between constant, increasing and decreasing, with outliers in the data.
Nature of data
• Real data
– Activity data from MavPad
– Seven weeks
• Real set one – tested on prediction-based algorithm
– Electrical outlet usage, light usage and overhead fan usage.
– 2163 data points; 79 actions
• Real set two – tested on autocorrelation-based algorithm
– Real set one data plus motion sensor data
– 334935 data points; 157 actions
• Health data – tested on autocorrelation-based algorithm
– Systolic, diastolic and heart rate are taken as actions
– 2 months; one value each per day
– Each action is associated with its value instead of frequency
– Most missing values were added manually
Experiments using Autocorrelation-based method
• For Health Data
– sensitive to sudden large changes
– could detect drifts due to long term trends even with small amounts of noise.
Figure: Line graph for graph confidence & systolic vs. number of days for health set.
Figure: Line graph for graph confidence & diastolic vs. number of days for health set.
Figure: Line graph for graph confidence & heart rate (bpm) vs. number of days for health set.
Reminder Assistance System
• Automation assistance is beneficial when activities are difficult to perform.
• Such a reminder service would benefit individuals suffering from dementia.
• Reminders are triggered in two situations:
– when the user queries for the next routine activity
– when a critical anomaly is detected
Conclusion
• HMS helps us gain information about different types of drifts and outliers that are part of the inhabitant’s lifestyle.
• Detects anomalies in the inhabitant’s health.
• Gives information about sudden changes observed in the inhabitant’s health.
• Successful demonstration that the MavHome software architecture can monitor and provide automated assistance for inhabitants.
Future Work
We are currently collecting health-specific data in the MavHome sites.
We will be testing in the living environments of recruited residents at the C.C. Young Retirement Community in Dallas, Texas.
Lifestyle trends and patterns of inhabitants will be analyzed over a period of time.
Thank You