advances in deep learning with applications in text and ......c. alippi, g. boracchi and m. roveri,...
TRANSCRIPT
• 𝐾𝑡
• 𝜙𝑡(𝒙, 𝑦)
[Figure, panels a) and b): observations (class ω, class ω, T*); "Classification error as a function of time", Classification Error (%) versus T (1000 to 9000), comparing the JIT classifier, Continuous Update Classifier, Sliding Window Classifier, and Bayes error; Dataset 1, 2]
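$C_i = (Z_i, F_i, D_i)$
• $Z_i = \{(\boldsymbol{x}_0, y_0), \dots, (\boldsymbol{x}_n, y_n)\}$: supervised samples of the $i$th concept
• $F_i$: features describing $p(\boldsymbol{x})$ in the $i$th concept
  • $M(\cdot)$
  • $V(\cdot)$
• $D_i$: features for detecting concept drift
  • $M(\cdot)$
  • $V(\cdot)$
  • $p_t(\cdot)$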
$C = (Z, F, D)$
• $Z$
• $F$
• $D$
[Figure: $C_0$ and $TR$ along the time axis $t$]
$C_0$:
• $\phi(\boldsymbol{x})$, $\phi(y \mid \boldsymbol{x})$
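• $\mathcal{D}$: concept-drift detection
• $\Upsilon$: concept split
• $\mathcal{E}$: concept equivalence
• $\mathcal{U}$: concept update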
• $F_i$
• $Z_i$
[Figure: $C_0$ and $TR$ along the time axis $t$]
$C_0$:
• $Z_0$: $p(y \mid \boldsymbol{x})$
• $F_0$: $p(\boldsymbol{x})$
• $D_0$
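$\mathcal{D}$, $D$
• $\mathcal{D}$
• $\phi(y \mid \boldsymbol{x})$, $\phi(\boldsymbol{x})$
• $T$
$\mathcal{D}(C_0) = 1$
$\mathcal{D}(C_i) \in \{0, 1\}$
$D_i$
$\mathcal{X}$
$\tau$, $\hat{\tau}$, $T$
[Figure: time axis $t$ with $T$ and $\hat{\tau}$; $C_0$, $C_1$]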
$\Upsilon(C_0) = (C_0, C_1)$
Test on $F_i$ ($T$, $\tau$):
$$\begin{cases} H_0: \text{``}F_i \text{ contains i.i.d. samples''} \\ H_1: \text{``}F_i \text{ contains a change point''} \end{cases}$$
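$C_j$
• $F$: $\phi(\boldsymbol{x})$, $C_m$, $C_n$
• $C_m$, $C_n$: $\phi(y \mid \boldsymbol{x})$
[Figure: time axis $t$ with $T$ and $\hat{\tau}$; $C_n$, $C_m$]
$\mathcal{E}(C_m, C_n) = 1$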
P. Domingos and G. Hulten, “Mining high-speed data streams”, in Proc. of the Sixth ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, pp. 71–80, 2000.
G. Hulten, L. Spencer, and P. Domingos, “Mining time-changing data streams” in Proc. of Conference on Knowledge Discovery in
Data, pp. 97–106, 2001.
L. Cohen, G. Avrahami-Bakish, M. Last, A. Kandel, and O. Kipersztok, "Real-time data mining of non-stationary data streams from
sensor networks", Information Fusion, vol. 9, no. 3, pp. 344–353, 2008.
Y. Ye, S. Squartini, and F. Piazza, "Online sequential extreme learning machine in nonstationary environments", Neurocomputing, vol.
116, no. 20, pp. 94–101, 2013
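$\mathcal{H} = \{h_0, \dots, h_N\}$
$h_i,\ i = 1, \dots, N$
$$\mathcal{H}(\boldsymbol{x}_t) = \operatorname*{argmax}_{\omega \in \Lambda} \sum_{h_i \in \mathcal{H}} \alpha_i \, [h_i(\boldsymbol{x}_t) = \omega]$$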
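The weighted vote in the ensemble rule (argmax over labels ω of the summed weights α_i of the classifiers voting for ω) can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `ensemble_predict`, the representation of each h_i as a plain callable, and the toy classifiers are hypothetical and not taken from the slides.

```python
from collections import defaultdict
from typing import Callable, Hashable, Sequence

# Assumption: each base classifier h_i is a callable mapping a sample to a label.
Classifier = Callable[[Sequence[float]], Hashable]


def ensemble_predict(
    classifiers: Sequence[Classifier],
    weights: Sequence[float],
    x: Sequence[float],
) -> Hashable:
    """Return the label whose supporting classifiers have the largest total weight."""
    if not classifiers:
        raise ValueError("the ensemble is empty")
    if len(classifiers) != len(weights):
        raise ValueError("one weight alpha_i is needed per classifier h_i")
    scores = defaultdict(float)
    for h, alpha in zip(classifiers, weights):
        # The indicator [h_i(x) = omega] adds alpha_i only to the voted label.
        scores[h(x)] += alpha
    # Labels that received no vote have score 0 and are not listed in `scores`.
    return max(scores, key=scores.get)


# Toy classifiers (hypothetical) for illustration.
def always_a(x):
    return "a"


def always_b(x):
    return "b"


def sign_rule(x):
    return "b" if x[0] > 0 else "a"
```

With weights 0.5, 0.3, 0.3 the two lighter classifiers can outvote the heavier one when they agree, which is the effect the weights α_i are meant to control.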
• $h_t$
• $h_{t-1}$
W. N. Street and Y. Kim, "A streaming ensemble algorithm (SEA) for large scale classification", in Proceedings of the 7th ACM SIGKDD
International Conference on Knowledge Discovery & Data Mining, pp. 377–382, 2001.
$S = \{(\boldsymbol{x}_0^t, y_0^t), (\boldsymbol{x}_1^t, y_1^t), \dots, (\boldsymbol{x}_B^t, y_B^t)\}$
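• Train $h_t$ on $S$
• Test $h_{t-1}$ on $S$
• If $\#\mathcal{H} < N$, insert $h_{t-1}$ in $\mathcal{H}$
• Otherwise, test each $h_i \in \mathcal{H}$ on $S$ and replace the worst one with $h_{t-1}$ if $h_{t-1}$ performs better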
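The batch-wise ensemble update above (insert $h_{t-1}$ while fewer than $N$ members exist, otherwise compare members on $S$) might look roughly like the following Python sketch. It is a simplification under stated assumptions: plain accuracy on the batch stands in for the quality measure used by SEA, and the names `accuracy` and `sea_style_update` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]
Model = Callable[[Sequence[float]], int]


def accuracy(model: Model, batch: Sequence[Sample]) -> float:
    """Fraction of samples in the batch that the model labels correctly."""
    return sum(model(x) == y for x, y in batch) / len(batch)


def sea_style_update(
    ensemble: List[Model],
    candidate: Model,
    batch: Sequence[Sample],
    max_size: int,
) -> List[Model]:
    """Insert `candidate` (playing the role of h_{t-1}) using the batch S.

    Assumption: quality is plain accuracy on S, a stand-in for SEA's measure.
    The list is modified in place and also returned.
    """
    if len(ensemble) < max_size:
        # The ensemble is not full yet: simply add the candidate.
        ensemble.append(candidate)
        return ensemble
    scores = [accuracy(h, batch) for h in ensemble]
    worst = min(range(len(ensemble)), key=scores.__getitem__)
    # Replace the weakest member only if the candidate does better on S.
    if accuracy(candidate, batch) > scores[worst]:
        ensemble[worst] = candidate
    return ensemble


# Toy constant models (hypothetical) for illustration.
def always_zero(x):
    return 0


def always_one(x):
    return 1
```

Evaluating the candidate on a batch it was not trained on is what makes the comparison with the existing members meaningful, since every model is then scored on data it has not seen.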
Kolter, J. and Maloof, M. "Dynamic weighted majority: An ensemble method for drifting concepts". Journal of Machine Learning
Research 8, 2755–2790. 2007
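Q-statistic between $h_i$ and $h_k$:
$$Q_{i,k} = \frac{N_{11} N_{00} - N_{01} N_{10}}{N_{11} N_{00} + N_{01} N_{10}}$$
where $N_{ab} = \#\{\boldsymbol{x} : h_i(\boldsymbol{x}) = a \text{ and } h_k(\boldsymbol{x}) = b\}$, with $a, b \in \{0, 1\}$.
If $h_i$ and $h_k$ agree on every sample (and both outputs occur), $Q_{i,k} = 1$; in general $Q_{i,k} \in [-1, 1]$.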
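As a worked illustration of the Q-statistic, the sketch below counts the four joint outcomes of two classifiers and applies the formula. It assumes the outputs of $h_i$ and $h_k$ are already available as 0/1 sequences over the same samples; the function name `q_statistic` is hypothetical, and what the values 0 and 1 encode is left to the caller.

```python
from typing import Sequence


def q_statistic(out_i: Sequence[int], out_k: Sequence[int]) -> float:
    """Q-statistic of two 0/1 output sequences computed on the same samples."""
    if len(out_i) != len(out_k):
        raise ValueError("both classifiers must be evaluated on the same samples")
    # n[(a, b)] counts samples with h_i(x) = a and h_k(x) = b.
    n = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for a, b in zip(out_i, out_k):
        n[(a, b)] += 1  # raises KeyError if an output is not 0 or 1
    agree = n[(1, 1)] * n[(0, 0)]
    disagree = n[(0, 1)] * n[(1, 0)]
    if agree + disagree == 0:
        raise ValueError("Q is undefined when the denominator is zero")
    return (agree - disagree) / (agree + disagree)
```

Identical output sequences give 1, fully opposite ones give -1, and sequences with all four joint outcomes equally frequent give 0.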
Minku, L. L.; Yao, X. "DDD: A New Ensemble Approach For Dealing With Concept Drift", IEEE Transactions on Knowledge and Data
Engineering, IEEE, v. 24, n. 4, p. 619-633, April 2012.
Initially Labeled Data
Receive Unlabeled Data
Classify Using SSL
Construct a Boundary
Compact the Boundary
Extract Core Set
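$$\mathrm{perf}^{(t)} = \begin{cases} \mathrm{perf}_{ex}^{(t)}, & \text{if } t = 1 \\ \dfrac{(t-1)\,\mathrm{perf}^{(t-1)} + \mathrm{perf}_{ex}^{(t)}}{t}, & \text{otherwise} \end{cases}$$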
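The recursion for perf(t) is a running mean of the per-example scores perf_ex(t). A minimal Python sketch follows, assuming the scores arrive as a sequence of numbers (for example 1 for a correct prediction and 0 for an error); the function name `running_performance` is hypothetical.

```python
from typing import Iterable, List


def running_performance(perf_ex: Iterable[float]) -> List[float]:
    """Return perf(1), perf(2), ... computed with the running-mean recursion."""
    history: List[float] = []
    current = 0.0
    for t, score in enumerate(perf_ex, start=1):
        if t == 1:
            current = score
        else:
            # perf(t) = ((t - 1) * perf(t - 1) + perf_ex(t)) / t
            current = ((t - 1) * current + score) / t
        history.append(current)
    return history
```

Because every past example keeps the same weight, this estimate reacts more and more slowly as t grows.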
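$$\mathrm{perf}^{(t)} = \begin{cases} \mathrm{perf}_{ex}^{(t)}, & \text{if } t = 1 \\ \eta \cdot \mathrm{perf}^{(t-1)} + (1 - \eta) \cdot \mathrm{perf}_{ex}^{(t)}, & \text{otherwise} \end{cases}$$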
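The η-weighted recursion can be sketched the same way. The example below assumes η lies in [0, 1] and that the per-example scores perf_ex(t) are given as a sequence; the function name `fading_performance` is hypothetical.

```python
from typing import Iterable, List


def fading_performance(perf_ex: Iterable[float], eta: float) -> List[float]:
    """Return perf(1), perf(2), ... where old scores are discounted by eta."""
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta is assumed to lie in [0, 1]")
    history: List[float] = []
    current = 0.0
    for t, score in enumerate(perf_ex, start=1):
        if t == 1:
            current = score
        else:
            # perf(t) = eta * perf(t - 1) + (1 - eta) * perf_ex(t)
            current = eta * current + (1.0 - eta) * score
        history.append(current)
    return history
```

With η = 0 the estimate equals the latest score, and values of η closer to 1 give more weight to the past, so η sets how quickly the estimate follows a change in the stream.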
D. M. Hawkins, P. Qiu, and C. W. Kang, “The changepoint model for statistical process control”, Journal of Quality Technology, 2003.
C. Alippi, G. Boracchi, and M. Roveri, “Just In Time Classifiers for Recurrent Concepts”, IEEE Transactions on Neural Networks and
Learning Systems, vol. 24, no. 4, pp. 620–634, 2013.