Internet of Things Data Science
internet of things data science architecture
real time analytics
introduction: data streams
Data Streams
• Sequence is potentially infinite
• High amount of data: sublinear space
• High speed of arrival: sublinear time per example
• Once an element from a data stream has been processed, it is discarded or archived

Example
Puzzle: Finding Missing Numbers
• Let $\pi$ be a permutation of $\{1, \ldots, n\}$.
• Let $\pi_{-1}$ be $\pi$ with one element missing.
• $\pi_{-1}[i]$ arrives in increasing order
Task: Determine the missing number

Naive approach: use an n-bit vector to memorize all the numbers ($O(n)$ space).
Data streams approach: $O(\log(n))$ space. Store
$$\frac{n(n+1)}{2} - \sum_{j \le i} \pi_{-1}[j].$$
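The logarithmic-space idea can be sketched in a few lines of Python. This is a minimal illustration, assuming the stream is any iterable of integers and that exactly one value of {1, ..., n} is absent; the function name `find_missing` is hypothetical.

```python
from typing import Iterable


def find_missing(stream: Iterable[int], n: int) -> int:
    """Return the single value of {1, ..., n} that never appears in the stream."""
    # Start from the sum of 1..n and subtract each arriving element.
    # Only one counter is kept, which needs O(log n) bits.
    remaining = n * (n + 1) // 2
    for value in stream:
        remaining -= value  # after i items: n(n+1)/2 - sum_{j<=i} pi_{-1}[j]
    return remaining


print(find_missing([1, 2, 3, 5, 6], n=6))  # 4
```

The running value is exactly the stored quantity from the slide, so once the stream ends it equals the missing number.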
data streams
Tools:
• approximation
• randomization, sampling
• sketching
Approximation algorithms
• Small error rate with high probability
• An algorithm $(\varepsilon, \delta)$-approximates $F$ if it outputs $\tilde{F}$ for which $\Pr[|\tilde{F} - F| > \varepsilon F] < \delta$.
data streams approximation algorithms
[Figure: a bit stream in which a window of the 7 most recent bits slides one position per step, e.g. 1011000111 | 1010101, then 10110001111 | 0101011]
Sliding Window
We can maintain simple statistics over sliding windows using $O(\frac{1}{\varepsilon} \log^2 N)$ space, where
• N is the length of the sliding window
• ε is the accuracy parameter
M. Datar, A. Gionis, P. Indyk, and R. Motwani. Maintaining stream statistics over sliding windows. 2002
Classification
classification
Definition
Given $n_C$ different classes, a classifier algorithm builds a model that predicts, for every unlabelled instance $I$, the class $C$ to which it belongs with accuracy.
Example
A spam filter
Example
Twitter sentiment analysis: analyze tweets with positive or negative feelings
data stream classification cycle
1. Process an example at a time, and inspect it only once (at most)
2. Use a limited amount of memory
3. Work in a limited amount of time
4. Be ready to predict at any point
classification
Data set that describes e-mail features for deciding if it is spam.
Example

| Contains “Money” | Domain type | Has attach. | Time received | spam |
| --- | --- | --- | --- | --- |
| yes | com | yes | night | yes |
| yes | edu | no | night | yes |
| no | com | yes | night | yes |
| no | edu | no | day | no |
| no | com | no | day | no |
| yes | cat | no | day | yes |

Assume we have to classify the following new instance:

| Contains “Money” | Domain type | Has attach. | Time received | spam |
| --- | --- | --- | --- | --- |
| yes | edu | yes | day | ? |
bayes classifiers
Naïve Bayes
• Based on Bayes' theorem:
$$P(c|d) = \frac{P(c)\,P(d|c)}{P(d)}$$
$$\text{posterior} = \frac{\text{prior} \times \text{likelihood}}{\text{evidence}}$$
• Estimates the probability of observing attribute $a$ and the prior probability $P(c)$
• Probability of class $c$ given an instance $d$:
$$P(c|d) = \frac{P(c)\prod_{a \in d} P(a|c)}{P(d)}$$
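As a worked illustration, the numerator $P(c)\prod_{a \in d} P(a|c)$ can be computed for the spam data set shown earlier and the new instance (yes, edu, yes, day). This is a sketch, not a production classifier: it uses raw frequency estimates without smoothing (an assumption; practical implementations usually add Laplace smoothing so that a single zero count does not wipe out a class), and the names `TRAINING` and `naive_bayes_scores` are hypothetical.

```python
from collections import Counter
from fractions import Fraction

# Rows of the spam table: (contains "Money", domain type, has attach., time received), spam label
TRAINING = [
    (("yes", "com", "yes", "night"), "yes"),
    (("yes", "edu", "no", "night"), "yes"),
    (("no", "com", "yes", "night"), "yes"),
    (("no", "edu", "no", "day"), "no"),
    (("no", "com", "no", "day"), "no"),
    (("yes", "cat", "no", "day"), "yes"),
]


def naive_bayes_scores(training, instance):
    """Return P(c) * prod_a P(a|c) for every class c (the unnormalised posterior)."""
    class_counts = Counter(label for _, label in training)
    scores = {}
    for label, n_c in class_counts.items():
        score = Fraction(n_c, len(training))  # prior P(c)
        for position, value in enumerate(instance):
            matches = sum(
                1 for features, y in training
                if y == label and features[position] == value
            )
            score *= Fraction(matches, n_c)  # frequency estimate of P(a|c)
        scores[label] = score
    return scores


scores = naive_bayes_scores(TRAINING, ("yes", "edu", "yes", "day"))
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

For this instance the score of class "yes" is 4/6 · 3/4 · 1/4 · 2/4 · 1/4 = 1/64, while class "no" scores 0 because no non-spam row contains "Money", so the instance is labelled spam. The evidence $P(d)$ is the same for both classes and can be ignored when only the argmax is needed.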
Multinomial Naïve Bayes
• Considers a document as a bag-of-words.
• Estimates the probability of observing word $w$ and the prior probability $P(c)$
• Probability of class $c$ given a test document $d$:
$$P(c|d) = \frac{P(c)\prod_{w \in d} P(w|c)^{n_{wd}}}{P(d)}$$
perceptron
[Figure: perceptron with inputs Attribute 1 to Attribute 5, weights $w_1, \ldots, w_5$, and output $h_{\vec{w}}(\vec{x}_i)$]
• Data stream: $\langle \vec{x}_i, y_i \rangle$
• Classical perceptron: $h_{\vec{w}}(\vec{x}_i) = \mathrm{sgn}(\vec{w}^T \vec{x}_i)$
• Minimize mean-square error: $J(\vec{w}) = \frac{1}{2} \sum (y_i - h_{\vec{w}}(\vec{x}_i))^2$
• We use the sigmoid function $h_{\vec{w}} = \sigma(\vec{w}^T \vec{x})$, where
$$\sigma(x) = 1/(1+e^{-x})$$
$$\sigma'(x) = \sigma(x)(1-\sigma(x))$$
• Stochastic Gradient Descent: $\vec{w} = \vec{w} - \eta \nabla J \vec{x}_i$
• Gradient of the error function:
$$\nabla J = -\sum_i (y_i - h_{\vec{w}}(\vec{x}_i)) \nabla h_{\vec{w}}(\vec{x}_i)$$
$$\nabla h_{\vec{w}}(\vec{x}_i) = h_{\vec{w}}(\vec{x}_i)(1 - h_{\vec{w}}(\vec{x}_i))$$
• Weight update rule
$$\vec{w} = \vec{w} + \eta \sum_i (y_i - h_{\vec{w}}(\vec{x}_i))\, h_{\vec{w}}(\vec{x}_i)(1 - h_{\vec{w}}(\vec{x}_i))\, \vec{x}_i$$
Perceptron Learning(Stream, η)
    for each class
        do Perceptron Learning(Stream, class, η)

Perceptron Learning(Stream, class, η)
    ▷ Let w0 and w be randomly initialized
    for each example (x, y) in Stream
        do if class = y
               then δ = (1 − h_w(x)) · h_w(x) · (1 − h_w(x))
               else δ = (0 − h_w(x)) · h_w(x) · (1 − h_w(x))
           w = w + η · δ · x

Perceptron Prediction(x)
    return argmax_class h_{w_class}(x)
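The pseudocode above can be sketched in Python as a one-vs-rest set of sigmoid units, one weight vector per class. The class name `OneVsRestPerceptron`, the default learning rate, the `init_scale` and `seed` parameters, and the choice to model w0 as a weight on a constant input of 1 are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class OneVsRestPerceptron:
    def __init__(self, classes, n_features, eta=0.5, init_scale=0.05, seed=0):
        rng = random.Random(seed)
        self.eta = eta
        # One randomly initialized weight vector per class; the last entry acts as w0 (bias).
        self.weights = {
            c: [rng.uniform(-init_scale, init_scale) for _ in range(n_features + 1)]
            for c in classes
        }

    def output(self, c, x):
        """h_w(x) for the unit of class c, with a constant 1.0 appended as bias input."""
        w = self.weights[c]
        return sigmoid(sum(wi * xi for wi, xi in zip(w, list(x) + [1.0])))

    def learn(self, x, y):
        """Process one stream example: target is 1 for the true class, 0 otherwise."""
        xb = list(x) + [1.0]
        for c, w in self.weights.items():
            h = self.output(c, x)
            target = 1.0 if c == y else 0.0
            delta = (target - h) * h * (1.0 - h)
            for j, xj in enumerate(xb):
                w[j] += self.eta * delta * xj  # w = w + eta * delta * x

    def predict(self, x):
        return max(self.weights, key=lambda c: self.output(c, x))


model = OneVsRestPerceptron(["spam", "ham"], n_features=2)
for _ in range(200):
    model.learn([1.0, 0.0], "spam")
    model.learn([0.0, 1.0], "ham")
print(model.predict([1.0, 0.0]), model.predict([0.0, 1.0]))
```

Each example is seen once and then discarded, so memory use is fixed by the number of classes and attributes. For very large weighted sums `math.exp` can overflow; a numerically safer sigmoid would be needed outside toy settings.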
classification
• Assume we have to classify the following new instance:

| Contains “Money” | Domain type | Has attach. | Time received | spam |
| --- | --- | --- | --- | --- |
| yes | edu | yes | day | ? |

[Figure: decision tree for the spam data set]
Time
├─ Night: YES
└─ Day: Contains “Money”
    ├─ Yes: YES
    └─ No: NO
decision trees
Basic induction strategy:
• A ← the “best” decision attribute for next node
• Assign A as decision attribute for node
• For each value of A, create new descendant of node
• Sort training examples to leaf nodes
• If training examples perfectly classified, then STOP, else iterate over new leaf nodes
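The slides do not say how the “best” attribute is scored. One common choice is information gain, which is assumed here purely for illustration; the sketch below evaluates it for each attribute of the spam data set. The names `ROWS`, `LABELS`, `entropy` and `information_gain` are hypothetical.

```python
import math
from collections import Counter, defaultdict

COLUMNS = ("money", "domain", "attach", "time")
ROWS = [
    ("yes", "com", "yes", "night"),
    ("yes", "edu", "no", "night"),
    ("no", "com", "yes", "night"),
    ("no", "edu", "no", "day"),
    ("no", "com", "no", "day"),
    ("yes", "cat", "no", "day"),
]
LABELS = ["yes", "yes", "yes", "no", "no", "yes"]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, column):
    """Entropy reduction obtained by splitting on the attribute at index `column`."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[column]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


gains = {name: information_gain(ROWS, LABELS, i) for i, name in enumerate(COLUMNS)}
for name, gain in sorted(gains.items(), key=lambda item: -item[1]):
    print(f"{name}: {gain:.4f}")
```

On these six rows, Time received and Contains “Money” obtain the same gain, ahead of Has attach. and Domain type, which is consistent with a tree that tests Time and then Contains “Money”.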
hoeffding trees
Hoeffding Tree: VFDT
Pedro Domingos and Geoff Hulten. Mining high-speed data streams. 2000
• With high probability, constructs a model identical to the one that a traditional (greedy) method would learn
• With theoretical guarantees on the error rate
[Figure: the spam decision tree: Time at the root; Night leads to YES; Day leads to Contains “Money”, where Yes leads to YES and No leads to NO]
hoeffding bound inequality
Bounds the probability that a random variable deviates from its expected value.
Let $X = \sum_i X_i$ where $X_1, \ldots, X_n$ are independent and identically distributed in $[0,1]$. Then
1. Chernoff: for each $\varepsilon < 1$
$$\Pr[X > (1+\varepsilon)E[X]] \le \exp\left(-\frac{\varepsilon^2}{3}E[X]\right)$$
2. Hoeffding: for each $t > 0$
$$\Pr[X > E[X] + t] \le \exp\left(-2t^2/n\right)$$
3. Bernstein: let $\sigma^2 = \sum_i \sigma_i^2$ be the variance of $X$. If $X_i - E[X_i] \le b$ for each $i \in [n]$, then for each $t > 0$
$$\Pr[X > E[X] + t] \le \exp\left(-\frac{t^2}{2\sigma^2 + \frac{2}{3}bt}\right)$$
hoeffding tree or vfdt
HT(Stream, δ)
    ▷ Let HT be a tree with a single leaf (root)
    ▷ Init counts n_ijk at root
    for each example (x, y) in Stream
        do HTGrow((x, y), HT, δ)

HTGrow((x, y), HT, δ)
    ▷ Sort (x, y) to leaf l using HT
    ▷ Update counts n_ijk at leaf l
    if examples seen so far at l are not all of the same class
        then ▷ Compute G for each attribute
             if G(Best Attr.) − G(2nd best) > sqrt(R² ln(1/δ) / (2n))
                 then ▷ Split leaf on best attribute
                      for each branch
                          do ▷ Start new leaf and initialize counts
hoeffding trees
HT features
• With high probability, constructs a model identical to the one that a traditional (greedy) method would learn
• Ties: when two attributes have similar G, split if
$$G(\text{Best Attr.}) - G(\text{2nd best}) < \sqrt{\frac{R^2 \ln(1/\delta)}{2n}} < \tau$$
• Compute G every $n_{min}$ instances
• Memory: deactivate least promising nodes with lower $p_l \times e_l$
  • $p_l$ is the probability to reach leaf $l$
  • $e_l$ is the error in the node
hoeffding naive bayes tree
Hoeffding Tree
Majority Class learner at leaves
Hoeffding Naive Bayes Tree
G. Holmes, R. Kirkby, and B. Pfahringer. Stress-testing Hoeffding trees, 2005.
• monitors accuracy of a Majority Class learner
• monitors accuracy of a Naive Bayes learner
• predicts using the most accurate method
bagging
Example
Dataset of 4 instances: A, B, C, D
Classifier 1: B, A, C, B
Classifier 2: D, B, A, D
Classifier 3: B, A, C, B
Classifier 4: B, C, B, B
Classifier 5: D, C, A, C
Bagging builds a set of M base models, each trained on a bootstrap sample created by drawing random samples with replacement.
The same samples, sorted, with the number of copies of each instance:
Classifier 1: A, B, B, C: A(1) B(2) C(1) D(0)
Classifier 2: A, B, D, D: A(1) B(1) C(0) D(2)
Classifier 3: A, B, B, C: A(1) B(2) C(1) D(0)
Classifier 4: B, B, B, C: A(0) B(3) C(1) D(0)
Classifier 5: A, C, C, D: A(1) B(0) C(2) D(1)
Each base model’s training set contains each of the original training examples K times, where P(K = k) follows a binomial distribution.
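The link between this binomial distribution and Poisson(1) can be made explicit. Assuming the bootstrap sample has the same size $n$ as the original data set (as in the example, where 4 instances are drawn from 4), each draw selects a given example with probability $1/n$, so

$$P(K = k) = \binom{n}{k}\left(\frac{1}{n}\right)^{k}\left(1 - \frac{1}{n}\right)^{n-k} \;\xrightarrow{\;n \to \infty\;}\; \frac{e^{-1}}{k!},$$

which is the Poisson(1) probability mass function. For the 4-instance example, $P(K = 0) = (3/4)^4 = 81/256 \approx 0.316$, already close to the limiting value $e^{-1} \approx 0.368$. This is what allows an online learner to replace resampling by an independent Poisson(1) weight per example.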
Figure 1: Poisson(1) Distribution.
oza and russell’s online bagging for m models
Initialize base models h_m for all m ∈ {1, 2, ..., M}
for all training examples do
    for m = 1, 2, ..., M do
        Set w = Poisson(1)
        Update h_m with the current example with weight w
anytime output:
    return hypothesis: h_fin(x) = argmax_{y ∈ Y} ∑_{m=1}^{M} I(h_m(x) = y)
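A minimal Python sketch of this procedure follows. The base learner interface (`learn(x, y, weight)` and `predict(x)`), the toy `MajorityClass` learner, and the use of Knuth's multiplication method to draw Poisson(1) weights are assumptions chosen to keep the example self-contained; any incremental classifier that accepts example weights could be plugged in.

```python
import math
import random
from collections import Counter


def poisson(lam, rng):
    """Draw from Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


class MajorityClass:
    """Toy incremental base learner: predicts the heaviest class seen so far."""

    def __init__(self):
        self.counts = Counter()

    def learn(self, x, y, weight):
        self.counts[y] += weight

    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None


class OnlineBagging:
    def __init__(self, make_model, n_models, seed=0):
        self.rng = random.Random(seed)
        self.models = [make_model() for _ in range(n_models)]

    def learn(self, x, y):
        for model in self.models:
            w = poisson(1.0, self.rng)  # number of copies of this example for this model
            if w > 0:
                model.learn(x, y, w)

    def predict(self, x):
        """Anytime output: plain majority vote over the base models."""
        votes = Counter(m.predict(x) for m in self.models)
        votes.pop(None, None)  # ignore models that have not seen any example yet
        return votes.most_common(1)[0][0] if votes else None


bag = OnlineBagging(MajorityClass, n_models=10)
for label in ["spam"] * 50 + ["ham"] * 5:
    bag.learn(None, label)
print(bag.predict(None))
```

Each example is processed once and never stored, so the ensemble satisfies the stream constraints as long as the base learners do.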
Evolving Stream Classification
data mining algorithms with concept drift
[Figure: block diagrams of a data mining (DM) algorithm between input and output. No Concept Drift: the DM Algorithm keeps Counter1 to Counter5. Concept Drift, first design: the DM Algorithm keeps a Static Model and is paired with a Change Detect. block. Concept Drift, second design: Counter1 to Counter5 are replaced by Estimator1 to Estimator5.]
optimal change detector and predictor
• High accuracy
• Low false positives and false negatives ratios
• Theoretical guarantees
• Fast detection of change
• Low computational cost: minimum space and time needed
• No parameters needed
algorithm adaptive sliding window
Example
W = 101010110111111
Every split of W into W = W0 · W1 is examined:
W0 = 1, W1 = 01010110111111
W0 = 10, W1 = 1010110111111
W0 = 101, W1 = 010110111111
W0 = 1010, W1 = 10110111111
W0 = 10101, W1 = 0110111111
W0 = 101010, W1 = 110111111
W0 = 1010101, W1 = 10111111
W0 = 10101011, W1 = 0111111
W0 = 101010110, W1 = 111111: |µ̂_W0 − µ̂_W1| ≥ εc, CHANGE DET.!
Drop elements from the tail of W: W = 01010110111111

ADWIN: Adaptive Windowing Algorithm
Initialize Window W
for each t > 0
    do W ← W ∪ {xt} (i.e., add xt to the head of W)
       repeat Drop elements from the tail of W
       until |µ̂_W0 − µ̂_W1| < εc holds for every split of W into W = W0 · W1
       Output µ̂_W
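The loop above can be sketched directly in Python with an explicit window. This is a naive illustration that stores every element and checks every split, so it costs O(W) time per check rather than the logarithmic bounds of the bucket-based version. The slides do not give a formula for εc; the sketch assumes the threshold from the ADWIN paper, εc = sqrt(ln(4/δ') / (2m)) with m the harmonic mean term 1/(1/n0 + 1/n1) and δ' = δ/n. The class name `SimpleAdwin` is hypothetical.

```python
import math
from collections import deque


class SimpleAdwin:
    def __init__(self, delta=0.01):
        self.delta = delta
        self.window = deque()  # oldest element on the left, newest on the right

    def _change_detected(self):
        """True if some split W = W0 . W1 has |mean(W0) - mean(W1)| >= eps_cut."""
        n = len(self.window)
        if n < 2:
            return False
        total = sum(self.window)
        delta_prime = self.delta / n
        head_sum = 0.0
        for n0, x in enumerate(self.window, start=1):
            if n0 == n:
                break  # W1 must be non-empty
            head_sum += x
            n1 = n - n0
            m = 1.0 / (1.0 / n0 + 1.0 / n1)
            eps_cut = math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))
            if abs(head_sum / n0 - (total - head_sum) / n1) >= eps_cut:
                return True
        return False

    def add(self, x):
        """Insert x, then drop the oldest elements while any split signals change."""
        self.window.append(x)
        shrunk = False
        while self._change_detected():
            self.window.popleft()
            shrunk = True
        return shrunk

    def mean(self):
        return sum(self.window) / len(self.window)


detector = SimpleAdwin(delta=0.01)
for bit in [0] * 200 + [1] * 200:
    detector.add(bit)
print(len(detector.window), round(detector.mean(), 3))
```

After the abrupt change from 0s to 1s the window shrinks until it holds almost only recent items, so the reported mean tracks the new distribution.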
Theorem
At every time step we have:
1. (False positive rate bound). If $\mu_t$ remains constant within W, the probability that ADWIN shrinks the window at this step is at most δ.
2. (False negative rate bound). Suppose that for some partition of W in two parts W0W1 (where W1 contains the most recent items) we have $|\mu_{W_0} - \mu_{W_1}| > 2\varepsilon_c$. Then with probability 1 − δ ADWIN shrinks W to W1, or shorter.
ADWIN tunes itself to the data stream at hand, with no need for the user to hardwire or precompute parameters.
ADWIN using a Data Stream Sliding Window Model,
• can provide the exact counts of 1’s in $O(1)$ time per point.
• tries $O(\log W)$ cutpoints
• uses $O(\frac{1}{\varepsilon} \log W)$ memory words
• the processing time per example is $O(\log W)$ (amortized and worst-case).
Sliding Window Model

| Bucket | 1010101 | 101 | 11 | 1 | 1 |
| --- | --- | --- | --- | --- | --- |
| Content | 4 | 2 | 2 | 1 | 1 |
| Capacity | 7 | 3 | 2 | 1 | 1 |
vfdt / cvfdt
Concept-adapting Very Fast Decision Trees: CVFDT
G. Hulten, L. Spencer, and P. Domingos. Mining time-changing data streams. 2001
• It keeps its model consistent with a sliding window of examples
• Construct “alternative branches” as preparation for changes
• If the alternative branch becomes more accurate, a switch of tree branches occurs
[Figure: the spam decision tree: Time at the root; Night leads to YES; Day leads to Contains “Money”, where Yes leads to YES and No leads to NO]
decision trees: cvfdt
No theoretical guarantees on the error rate of CVFDT
CVFDT parameters:
1. W: the example window size.
2. T0: number of examples used to check at each node if the splitting attribute is still the best.
3. T1: number of examples used to build the alternate tree.
4. T2: number of examples used to test the accuracy of the alternate tree.
decision trees: hoeffding adaptive tree
Hoeffding Adaptive Tree:
• replace frequency statistics counters by estimators
• don’t need a window to store examples, because the statistics needed are maintained by the estimators
• change the way of checking the substitution of alternate subtrees, using a change detector with theoretical guarantees
Advantages over CVFDT:
1. Theoretical guarantees
2. No parameters
adwin bagging (kdd’09)
ADWIN
An adaptive sliding window whose size is recomputed online according to the rate of change observed.
ADWIN has rigorous guarantees (theorems)
• On ratio of false positives and negatives
• On the relation of the size of the current window and change rates
ADWIN Bagging
When a change is detected, the worst classifier is removed and a new classifier is added.
Randomization as a powerful tool to increase accuracy and diversity
There are three ways of using randomization:
• Manipulating the input data
• Manipulating the classifier algorithms
• Manipulating the output targets
leveraging bagging for evolving data streams
Leveraging Bagging
• Using Poisson(λ)
Leveraging Bagging MC
• Using Poisson(λ) and Random Output Codes
Fast Leveraging Bagging ME
• if an instance is misclassified: weight = 1
• if not: weight = $e_T/(1-e_T)$
empirical evaluation

| | Accuracy | RAM-Hours |
| --- | --- | --- |
| Hoeffding Tree | 74.03 | 0.01 |
| Online Bagging | 77.15 | 2.98 |
| ADWIN Bagging | 79.24 | 1.48 |
| Leveraging Bagging | 85.54 | 20.17 |
| Leveraging Bagging MC | 85.37 | 22.04 |
| Leveraging Bagging ME | 80.77 | 0.87 |

Leveraging Bagging
• Leveraging Bagging
  • Using Poisson(λ)
• Leveraging Bagging MC
  • Using Poisson(λ) and Random Output Codes
• Leveraging Bagging ME
  • Using weight 1 if misclassified, otherwise $e_T/(1-e_T)$
Clustering
clustering
Definition
Clustering is the distribution of a set of instances (examples) into previously unknown groups according to some common relations or affinities.
Example
Market segmentation of customers
Example
Social network communities
Definition
Given
• a set of instances $I$
• a number of clusters $K$
• an objective function $cost(I)$
a clustering algorithm computes an assignment of a cluster for each instance
$$f : I \to \{1, \ldots, K\}$$
that minimizes the objective function $cost(I)$
Definition
Given
• a set of instances $I$
• a number of clusters $K$
• an objective function $cost(C, I)$
a clustering algorithm computes a set $C$ of instances with $|C| = K$ that minimizes the objective function
$$cost(C, I) = \sum_{x \in I} d^2(x, C)$$
where
• $d(x, c)$: distance function between $x$ and $c$
• $d^2(x, C) = \min_{c \in C} d^2(x, c)$: squared distance from $x$ to the nearest point in $C$
k-means
1. Choose K initial centers C = {c_1, …, c_K}
2. While the stopping criterion has not been met:
  • For i = 1, …, N
    • find the closest center c_k ∈ C to instance p_i
    • assign instance p_i to cluster C_k
  • For k = 1, …, K
    • set c_k to be the center of mass of all points in C_k
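The two alternating steps can be sketched as below. This is a minimal version under stated assumptions: Euclidean distance, initial centers passed in by the caller, and "assignments stopped changing" (or `max_iter` reached) as the stopping criterion, which the slide leaves open.

```python
def sq_dist(p, c):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, c))

def kmeans(points, centers, max_iter=100):
    """Lloyd-style k-means; returns (centers, assignment)."""
    centers = list(centers)
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its closest center.
        new_assignment = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:  # assumed stopping criterion
            break
        assignment = new_assignment
        # Update step: move each center to the center of mass of its cluster.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assignment

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```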
42
![Page 72: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/72.jpg)
k-means++
1. Choose an initial center c_1
  • For k = 2, …, K
    • select c_k = p ∈ I with probability d^2(p, C)/cost(C, I)
2. While the stopping criterion has not been met:
  • For i = 1, …, N
    • find the closest center c_k ∈ C to instance p_i
    • assign instance p_i to cluster C_k
  • For k = 1, …, K
    • set c_k to be the center of mass of all points in C_k
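Only step 1 differs from plain k-means. A sketch of that seeding step follows; it assumes Euclidean distance, draws the first center uniformly at random (the slide does not say how c_1 is chosen), and requires at least k distinct points. The name `kmeanspp_seeds` is illustrative.

```python
import random

def sq_dist(p, c):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, c))

def kmeanspp_seeds(points, k, rng):
    """Pick k initial centers; later ones are sampled proportionally to d^2(p, C)."""
    centers = [rng.choice(points)]  # assumption: uniform choice of c_1
    while len(centers) < k:
        # d^2(p, C): squared distance from p to its nearest chosen center.
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        # random.choices normalizes the weights, i.e. divides by cost(C, I).
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers

points = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (11.0, 10.0)]
print(kmeanspp_seeds(points, 2, random.Random(0)))
```

Points that are already centers have weight 0, so far-away points are favored as the next seed.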
43
![Page 73: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/73.jpg)
performance measures
Internal Measures
• Sum of squared distances
• Dunn index: $D = \frac{d_{\min}}{d_{\max}}$
• C-Index: $C = \frac{S - S_{\min}}{S_{\max} - S_{\min}}$
External Measures
• Rand Measure
• F Measure
• Jaccard
• Purity
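Two of the external measures are easy to compute from a cluster labelling and a ground-truth class labelling. The sketch below uses the usual textbook definitions of purity and the Rand index, which the slide does not spell out; the function names are illustrative.

```python
from collections import Counter
from itertools import combinations

def purity(clusters, classes):
    """Fraction of instances that belong to the majority class of their cluster."""
    by_cluster = {}
    for c, y in zip(clusters, classes):
        by_cluster.setdefault(c, Counter())[y] += 1
    return sum(max(cnt.values()) for cnt in by_cluster.values()) / len(clusters)

def rand_index(clusters, classes):
    """Fraction of instance pairs on which clustering and classes agree."""
    pairs = list(combinations(range(len(clusters)), 2))
    agree = sum(
        (clusters[i] == clusters[j]) == (classes[i] == classes[j])
        for i, j in pairs
    )
    return agree / len(pairs)

clusters = [0, 0, 0, 1, 1, 1]
classes = ["a", "a", "b", "b", "b", "b"]
print(purity(clusters, classes))      # 5 of 6 instances match their cluster majority
print(rand_index(clusters, classes))  # 10 of 15 pairs agree
```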
44
![Page 74: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/74.jpg)
birch
Balanced Iterative Reducing and Clustering using Hierarchies
• Clustering Features CF = (N, LS, SS)
  • N: number of data points
  • LS: linear sum of the N data points
  • SS: square sum of the N data points
  • Properties:
    • Additivity: CF_1 + CF_2 = (N_1+N_2, LS_1+LS_2, SS_1+SS_2)
    • Easy to compute: average inter-cluster distance and average intra-cluster distance
• Uses CF tree
  • Height-balanced tree with two parameters
    • B: branching factor
    • T: radius leaf threshold
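The additivity property is what makes clustering features cheap to maintain. The sketch below is an illustration with assumed conventions: SS is stored as a single scalar (the sum of squared norms), and the `centroid` and `radius` helpers are derived from (N, LS, SS) in the standard way; the class layout is not taken from the slides.

```python
import math
from dataclasses import dataclass

@dataclass
class CF:
    n: int      # N: number of data points
    ls: tuple   # LS: per-dimension linear sum
    ss: float   # SS: sum of squared norms (scalar form, an assumption)

    @classmethod
    def from_point(cls, p):
        return cls(1, tuple(p), sum(x * x for x in p))

    def __add__(self, other):
        # Additivity: CF_1 + CF_2 = (N_1+N_2, LS_1+LS_2, SS_1+SS_2)
        return CF(self.n + other.n,
                  tuple(a + b for a, b in zip(self.ls, other.ls)),
                  self.ss + other.ss)

    def centroid(self):
        return tuple(s / self.n for s in self.ls)

    def radius(self):
        # Root mean squared distance to the centroid: sqrt(SS/N - ||LS/N||^2)
        c2 = sum(c * c for c in self.centroid())
        return math.sqrt(max(self.ss / self.n - c2, 0.0))

merged = CF.from_point((0.0, 0.0)) + CF.from_point((2.0, 0.0))
print(merged, merged.centroid(), merged.radius())
```

A leaf entry can absorb a point only if the merged radius stays below the threshold T, which is why the radius must be computable from the summary alone.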
45
![Page 75: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/75.jpg)
birch
Balanced Iterative Reducing and Clustering using Hierarchies
Phase 1: Scan all data and build an initial in-memory CF tree
Phase 2: Condense into desirable range by building a smaller CF tree (optional)
Phase 3: Global clustering
Phase 4: Cluster refining (optional and off-line, as it requires more passes)
46
![Page 76: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/76.jpg)
clu-stream
Clu-Stream
• Uses micro-clusters to store statistics on-line
• Clustering Features CF = (N, LS, SS, LT, ST)
  • N: number of data points
  • LS: linear sum of the N data points
  • SS: square sum of the N data points
  • LT: linear sum of the time stamps
  • ST: square sum of the time stamps
• Uses pyramidal time frame
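The two temporal components let a micro-cluster report how recent its points are without storing them. A minimal sketch follows, assuming numeric time stamps and a scalar SS; the class and method names are hypothetical, not the MOA API.

```python
import math
from dataclasses import dataclass

@dataclass
class MicroCluster:
    n: int = 0        # N
    ls: tuple = ()    # LS, per dimension
    ss: float = 0.0   # SS, scalar form (an assumption)
    lt: float = 0.0   # LT: linear sum of time stamps
    st: float = 0.0   # ST: square sum of time stamps

    def absorb(self, point, timestamp):
        """Update all five statistics with one new point."""
        if self.n == 0:
            self.ls = tuple(0.0 for _ in point)
        self.n += 1
        self.ls = tuple(a + b for a, b in zip(self.ls, point))
        self.ss += sum(x * x for x in point)
        self.lt += timestamp
        self.st += timestamp * timestamp

    def mean_time(self):
        return self.lt / self.n

    def std_time(self):
        var = self.st / self.n - self.mean_time() ** 2
        return math.sqrt(max(var, 0.0))

mc = MicroCluster()
mc.absorb((1.0, 2.0), timestamp=10)
mc.absorb((3.0, 4.0), timestamp=20)
print(mc.mean_time(), mc.std_time())  # 15.0 5.0
```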
47
![Page 77: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/77.jpg)
clu-stream
On-line Phase
• For each new point that arrives, either
  • the point is absorbed by a micro-cluster, or
  • the point starts a new micro-cluster of its own; to make room, either
    • delete the oldest micro-cluster, or
    • merge two of the oldest micro-clusters
Off-line Phase
• Apply k-means using microclusters as points
48
![Page 78: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/78.jpg)
streamkm++: coresets
Coreset of a set P with respect to some problem: a small subset that approximates the original set P.
• Solving the problem for the coreset provides an approximate solution for the problem on P.
(k, ε)-coreset: A (k, ε)-coreset S of P is a subset of P such that, for each C of size k,
$$(1-\varepsilon)\,\mathrm{cost}(P, C) \le \mathrm{cost}_w(S, C) \le (1+\varepsilon)\,\mathrm{cost}(P, C)$$
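The inequality can be checked numerically for one candidate set of centers. The sketch below assumes Euclidean distance and takes cost_w to be the weighted sum of squared distances; note that it tests a single C, whereas the definition requires the bound for every C of size k.

```python
def sq_dist(x, c):
    return sum((a - b) ** 2 for a, b in zip(x, c))

def cost(points, centers):
    """cost(P, C): sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def weighted_cost(coreset, weights, centers):
    """cost_w(S, C): same as cost, with each coreset point counted w times."""
    return sum(w * min(sq_dist(s, c) for c in centers)
               for s, w in zip(coreset, weights))

def satisfies_bound(points, coreset, weights, centers, eps):
    full = cost(points, centers)
    approx = weighted_cost(coreset, weights, centers)
    return (1 - eps) * full <= approx <= (1 + eps) * full

P = [(0.0,), (0.0,), (4.0,), (4.0,)]
S = [(0.0,), (4.0,)]
print(satisfies_bound(P, S, [2, 2], centers=[(1.0,)], eps=0.0))  # True
print(satisfies_bound(P, S, [1, 1], centers=[(1.0,)], eps=0.1))  # False
```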
49
![Page 79: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/79.jpg)
streamkm++: coresets
Coreset Tree
• Choose a leaf node l at random
• Choose a new sample point, denoted by q_{t+1}, from P_l according to d^2
• Based on q_l and q_{t+1}, split P_l into two subclusters and create two child nodes
StreamKM++
• Maintain $L = \lceil \log_2(n/m) + 2 \rceil$ buckets B_0, B_1, …, B_{L−1}
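As a quick worked example of the bucket count, the sketch below evaluates the formula directly. It assumes n is the number of stream points seen and m is the coreset (bucket) size, which the slide does not define; `num_buckets` is an illustrative name.

```python
import math

def num_buckets(n, m):
    """L = ceil(log2(n / m) + 2)."""
    return math.ceil(math.log2(n / m) + 2)

print(num_buckets(1_000_000, 1_000))  # log2(1000) is about 9.97, so L = 12
print(num_buckets(1024, 1))           # log2(1024) = 10, so L = 12
```

The number of buckets grows only logarithmically with the stream length, which is what keeps the memory footprint small.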
50
![Page 80: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/80.jpg)
Frequent Pattern Mining
51
![Page 81: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/81.jpg)
frequent patterns
Suppose D is a dataset of patterns, t ∈ D, and min_sup is a constant.
Definition: Support(t): number of patterns in D that are superpatterns of t.
Definition: Pattern t is frequent if Support(t) ≥ min_sup.
Frequent Subpattern Problem: Given D and min_sup, find all frequent subpatterns of patterns in D.
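For itemsets, "superpattern" simply means superset, so support is a counting loop. The sketch below uses the six-document dataset from the itemset mining example in this deck; the function names are illustrative.

```python
def support(pattern, dataset):
    """Number of patterns in the dataset that are superpatterns (supersets) of `pattern`."""
    items = set(pattern)
    return sum(1 for doc in dataset if items <= set(doc))

def is_frequent(pattern, dataset, min_sup):
    return support(pattern, dataset) >= min_sup

dataset = ["abce", "cde", "abce", "acde", "abcde", "bcd"]  # d1..d6
print(support("ce", dataset))         # 5
print(is_frequent("ab", dataset, 3))  # True  (support 3)
print(is_frequent("ad", dataset, 3))  # False (support 2)
```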
52
![Page 85: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/85.jpg)
pattern mining
Dataset Example

| Document | Patterns |
|---|---|
| d1 | abce |
| d2 | cde |
| d3 | abce |
| d4 | acde |
| d5 | abcde |
| d6 | bcd |
53
![Page 86: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/86.jpg)
itemset mining
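d1: abce, d2: cde, d3: abce, d4: acde, d5: abcde, d6: bcd

| Support | Frequent |
|---|---|
| d1, d2, d3, d4, d5, d6 | c |
| d1, d2, d3, d4, d5 | e, ce |
| d1, d3, d4, d5 | a, ac, ae, ace |
| d1, d3, d5, d6 | b, bc |
| d2, d4, d5, d6 | d, cd |
| d1, d3, d5 | ab, abc, abe, be, bce, abce |
| d2, d4, d5 | de, cde |

minimum support = 3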
54
![Page 89: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/89.jpg)
itemset mining
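d1: abce, d2: cde, d3: abce, d4: acde, d5: abcde, d6: bcd

| Support | Frequent | Gen | Closed | Max |
|---|---|---|---|---|
| 6 | c | c | c | |
| 5 | e, ce | e | ce | |
| 4 | a, ac, ae, ace | a | ace | |
| 4 | b, bc | b | bc | |
| 4 | d, cd | d | cd | |
| 3 | ab, abc, abe, be, bce, abce | ab, be | abce | abce |
| 3 | de, cde | de | cde | cde |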
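The Frequent, Closed, and Max columns can be reproduced by brute force on this small dataset. The sketch below enumerates every itemset, which is exponential in the number of items and only meant as an illustration; the function name `mine` is hypothetical.

```python
from itertools import combinations

def mine(dataset, min_sup):
    """Return (frequent itemsets with support, closed itemsets, maximal itemsets)."""
    transactions = [frozenset(d) for d in dataset]
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            s = frozenset(combo)
            sup = sum(1 for t in transactions if s <= t)
            if sup >= min_sup:
                frequent[s] = sup
    # Closed: no proper superset has the same support.
    closed = {s for s, sup in frequent.items()
              if not any(s < t and frequent[t] == sup for t in frequent)}
    # Maximal: no proper superset is frequent.
    maximal = {s for s in frequent if not any(s < t for t in frequent)}
    return frequent, closed, maximal

def fmt(itemsets):
    return sorted("".join(sorted(s)) for s in itemsets)

dataset = ["abce", "cde", "abce", "acde", "abcde", "bcd"]  # d1..d6
frequent, closed, maximal = mine(dataset, min_sup=3)
print(len(frequent))  # 19
print(fmt(closed))    # ['abce', 'ace', 'bc', 'c', 'cd', 'cde', 'ce']
print(fmt(maximal))   # ['abce', 'cde']
```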
55
e → ce: the generator e and its closed itemset ce have the same support (5).
a → ace: the generator a and its closed itemset ace have the same support (4).
![Page 96: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/96.jpg)
closed patterns
Usually, there are too many frequent patterns. We can compute a smaller set, while keeping the same information.
Example: A set of 1000 items has $2^{1000} \approx 10^{301}$ subsets, which is more than the number of atoms in the universe ($\approx 10^{79}$).
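The order of magnitude is easy to confirm with exact integer arithmetic; the snippet below is only a sanity check of the figure quoted above.

```python
import math

subsets = 2 ** 1000
digits = len(str(subsets))       # 302 digits, so 2^1000 is roughly 10^301
exponent = 1000 * math.log10(2)  # about 301.03

print(digits, round(exponent, 2))
print(subsets > 10 ** 79)        # True
```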
59
![Page 97: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/97.jpg)
closed patterns
A priori property: If t′ is a subpattern of t, then Support(t′) ≥ Support(t).
Definition: A frequent pattern t is closed if none of its proper superpatterns has the same support as it has.
Frequent subpatterns and their supports can be generated from closed patterns.
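One way to see why closed patterns lose no information: the support of any frequent pattern equals the largest support among the closed patterns that contain it. The sketch below applies this to the closed itemsets of the d1..d6 example (minimum support 3); the function name is illustrative.

```python
# Closed itemsets and their supports from the itemset mining example.
closed = {"c": 6, "ce": 5, "ace": 4, "bc": 4, "cd": 4, "abce": 3, "cde": 3}

def support_from_closed(pattern, closed):
    """Support of `pattern`, or None if no closed superpattern exists (not frequent)."""
    items = set(pattern)
    supports = [sup for c, sup in closed.items() if items <= set(c)]
    return max(supports) if supports else None

print(support_from_closed("a", closed))   # 4, from ace
print(support_from_closed("be", closed))  # 3, from abce
print(support_from_closed("bd", closed))  # None: bd is not frequent
```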
59
![Page 98: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/98.jpg)
maximal patterns
Definition: A frequent pattern t is maximal if none of its proper superpatterns is frequent.
Frequent subpatterns can be generated from maximal patterns, but not with their support.
All maximal patterns are closed, but not all closed patterns are maximal.
60
![Page 99: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/99.jpg)
non streaming frequent itemset miners
Representation:
• Horizontal layout
  T1: a, b, c
  T2: b, c, e
  T3: b, d, e
• Vertical layout
  a: 1 0 0
  b: 1 1 1
  c: 1 1 0
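Converting between the two layouts is mechanical. The sketch below builds one bit vector per item from the three transactions above and counts support by intersecting columns; the function names are illustrative.

```python
def to_vertical(transactions):
    """Map each item to a 0/1 vector with one position per transaction."""
    items = sorted({i for t in transactions for i in t})
    return {item: [1 if item in t else 0 for t in transactions] for item in items}

def support(itemset, vertical):
    """Count transactions whose bit is set in every column of the itemset."""
    columns = [vertical[i] for i in itemset]
    return sum(all(bits) for bits in zip(*columns))

transactions = [["a", "b", "c"], ["b", "c", "e"], ["b", "d", "e"]]  # T1..T3
vertical = to_vertical(transactions)
print(vertical["a"], vertical["b"], vertical["c"])  # [1, 0, 0] [1, 1, 1] [1, 1, 0]
print(support(["b", "c"], vertical))                # 2
```

The vertical layout suits depth-first miners, since the support of an extended itemset is the intersection of columns that are already in memory.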
Search:
• Breadth-first (levelwise): Apriori
• Depth-first: Eclat, FP-Growth
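The levelwise search can be sketched as follows. This is a compact illustration of the Apriori idea (join frequent (k−1)-itemsets, prune candidates with an infrequent subset, then count), not an optimized implementation; it is run on the d1..d6 dataset from the itemset mining example.

```python
from itertools import combinations

def apriori(dataset, min_sup):
    """Return a dict mapping each frequent itemset (frozenset) to its support."""
    transactions = [frozenset(t) for t in dataset]

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    def keep_frequent(candidates):
        counted = {c: support(c) for c in candidates}
        return {c: n for c, n in counted.items() if n >= min_sup}

    level = keep_frequent({frozenset([i]) for t in transactions for i in t})
    frequent = {}
    k = 1
    while level:
        frequent.update(level)
        k += 1
        prev = list(level)
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (a priori property): every (k-1)-subset must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
        level = keep_frequent(candidates)
    return frequent

dataset = ["abce", "cde", "abce", "acde", "abcde", "bcd"]  # d1..d6
result = apriori(dataset, min_sup=3)
print(len(result))                # 19
print(result[frozenset("abce")])  # 3
```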
61
![Page 100: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/100.jpg)
mining patterns over data streams
Requirements: fast, memory-efficient, and adaptive
• Type:
  • Exact
  • Approximate
• Per batch, per transaction
• Incremental, Sliding Window, Adaptive
• Frequent, Closed, Maximal patterns
62
![Page 101: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/101.jpg)
moment
• Computes closed frequent itemsets in a sliding window
• Uses a Closed Enumeration Tree
• Uses 4 types of nodes:
  • Closed Nodes
  • Intermediate Nodes
  • Unpromising Gateway Nodes
  • Infrequent Gateway Nodes
• Adding transactions: closed itemsets remain closed
• Removing transactions: infrequent itemsets remain infrequent
63
![Page 102: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/102.jpg)
fp-stream
• Mining Frequent Itemsets at Multiple Time Granularities
• Based on FP-Growth
• Maintains
  • pattern tree
  • tilted-time window
• Allows answering time-sensitive queries
• Places greater emphasis on recent data
• Drawback: time and memory complexity
64
![Page 103: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/103.jpg)
tree and graph mining: dealing with time changes
• Keep a window on recent stream elements
  • Actually, just its lattice of closed sets!
• Keep track of the number of closed patterns in the lattice, N
• Use some change detector on N
• When change is detected:
  • Drop the stale part of the window
  • Update the lattice to reflect this deletion, using the deletion rule
Alternatively, use a sliding window of some fixed size
65
![Page 104: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/104.jpg)
Summary
66
![Page 105: Internet of Things Data Science](https://reader031.vdocuments.site/reader031/viewer/2022030315/587ff5c91a28ab3a1e8b4f27/html5/thumbnails/105.jpg)
overview of big data science
Short Course Summary
1 Introduction to Big Data
2 Big Data Science
3 Real Time Big Data Management
4 Internet of Things Data Science
Open Source Software
1 MOA: http://moa.cms.waikato.ac.nz/
2 SAMOA: http://samoa-project.net/
67