event-based stream classification framework · by timo reuter event-based stream classification...

Post on 10-Sep-2019


by Timo REUTER

EVENT-BASED STREAM CLASSIFICATION FRAMEWORK

A SUPERVISED CLUSTERING APPROACH FOR SOCIAL MEDIA APPLICATIONS

DISSERTATION

on 4th February 2015

Printed on ageing-resistant paper in accordance with ISO 9706

Universität Bielefeld, Technische Fakultät

Event-based Stream Classification Framework
A Supervised Clustering Approach for Social Media Applications

Dissertation for the degree of
Doctor rerum naturalium (Dr. rer. nat.)

submitted by
Timo REUTER

First reviewer: Prof. Dr. Philipp CIMIANO, Technische Fakultät, Universität Bielefeld

Second reviewer: Prof. Dr. Dr. Lars SCHMIDT-THIEME, Wirtschaftsinformatik und Maschinelles Lernen, Universität Hildesheim

Examination committee: Prof. Dr. Barbara HAMMER, Dr.-Ing. Sebastian WREDE

Bielefeld, February

Dedicated to my grandparents

Abstract

Acknowledgments

Contents

I. Introduction
  1. Introduction
  2. Fundamentals of This Work
  3. Foundations and Related Work
  4. Event Clustering Dataset

II. Supervised Single-Pass Clustering with the Event-based Stream Classification Framework
  5. System Description of the Stream Classification Framework for a Single-Pass Setting
  6. Experimental Setup and Results of the Supervised Single-Pass Classification

III. Multi-pass Stream Clustering
  7. System Description of the Stream Classification Framework for a Multi-Pass Setting
  8. Experimental Setup and Results of Supervised Multi-Pass Clustering

IV. Concluding Remarks
  9. Remarks and Comparison of Clustering Approaches
  10. Conclusion

V. Appendix
  Glossary
  Acronyms
  Bibliography

List of Figures

List of Tables

PART I
Introduction

Chapter 1 Introduction

1.1 Motivating Use Cases

1.2 Goal and Challenges

1.2.1 Clustering of Large Datasets

1.2.2 Clustering of Continuous Data Streams

Notation used: x_1, x_2, ...; k; x_k ∈ C.

1.2.3 Classifying of Concept Drifting Time Series Data

1.2.4 Noisy Data

1.3 Research Contributions of this Dissertation

Contribution

Evaluation of Event Identification

New Event Detection

Clustering

1.4 Structure and Outline of this Thesis

CHAPTER 2
Fundamentals of This Work

2.1 From the Categorization Idea to Event Clustering: Definition and Development

2.1.1 Categorization in Philosophy - The Classical View

2.1.2 Categorization in Cognitive Psychology - The Prototype View

2.1.3 Event Clustering Characterization

Definition (Clustering)

2.2 Characterization of an Event

2.2.1 Event Definition in Philosophy

Kim's Property-Exemplification Account of Events

An event is written [x, P, t], with an object x, a property P, and a time t. Two events are identical under the condition

\[ [x, P, t] = [y, Q, t'] \iff x = y,\ P = Q,\ t = t' \]

Davidson's and Lemmon's Theories of Events

Lewis' Theory

Quine's Theory

2.2.2 Event Definition in Cognition and Psychology

Contrasting Juxtaposition of Events and Objects

Segmenting and Perceiving Events

Remembering and Representing Events - Autobiographical Knowledge Base

2.2.3 Events in Recent Literature of Machine Learning and Information Retrieval

Event Detection in News and Stories

Information Extraction

Event Detection in Multimedia

Event Detection and Identification in Social Media

Notation used: e, T_e, D_e.

2.2.4 Discussion and Definition

Event Definition

Definition (Event)

Event Identity Condition

Definition (Event Identity)

Event Granularity and Hierarchy

Definition

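CHAPTER 3
Foundations and Related Work

Notation: a hypothesis \( h(x) : X \to Y \) over an input space X and an output space Y with \( y \in Y \); inputs \( \{x^{(1)}, \dots, x^{(m)}\} \) with labels \( y^{(i)} \); labeled pairs \( (x^{(i)}, y^{(i)}) \).

Figure: Difference between unsupervised and supervised learning strategies according to Kuncheva [Kun]. Boxes shown: Real-world Problem; Data Collection, Feature Nomination; Feature Selection and Extraction. Supervised branch: Selection of a Classifier Model, Training, Testing, Result OK? Unsupervised branch: Selection of a Clustering Method, Clustering the Data, Result OK?; Clustering Solution.

3.1 Classification

A classifier is a function f from X to Y:

\[ f : X \to Y \]

learned from pairs \( (x^{(i)}, y^{(i)}) \). For inputs \( x^{(n)} \in \mathbb{R}^d \) this becomes \( f : \mathbb{R}^d \to Y \). In the error measures, \( \hat{f} \) denotes the learned approximation of f.

Squared error over n examples:

\[ \mathrm{err}(X) = \frac{1}{n} \sum_{i=1}^{n} \left( f(x_i) - \hat{f}(x_i) \right)^2 \]

Zero-one error for a single example:

\[ \mathrm{err}(x) = \begin{cases} 0, & f(x) = \hat{f}(x) \\ 1, & \text{otherwise} \end{cases} \]

Linear model:

\[ f(x) = w_1 \cdot x_1 + w_2 \cdot x_2 + \dots + w_n \cdot x_n = w^T x \]

3.2 Clustering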

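Given n observations \( x_1, \dots, x_n \) and k clusters \( C = \{C_1, \dots, C_k\} \), the objective is

\[ \arg\min_{C} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2 \]

where \( S_i \) is the set of observations assigned to cluster i and \( \mu_i \) is its mean.

Starting from k initial centroids \( c_1, \dots, c_k \), iteration t alternates two steps.

Assignment step:

\[ S_i^{(t)} = \left\{ x_p : \lVert x_p - c_i^{(t)} \rVert^2 \le \lVert x_p - c_j^{(t)} \rVert^2 \ \ \forall j,\ 1 \le j \le k \right\} \]

Update step:

\[ c_i^{(t+1)} = \frac{1}{|S_i^{(t)}|} \sum_{x_j \in S_i^{(t)}} x_j \]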
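The assignment and update steps of k-means can be sketched in a few lines of Python. This is a minimal illustration, not code from the thesis: the function name `kmeans`, the random initialisation, and the fixed iteration cap are assumptions made for the example.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: random initial centroids
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: each centroid becomes the mean of its cluster.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: the assignment can no longer change
        centroids = new_centroids
    return centroids, clusters
```

On two well-separated groups of points the routine recovers one centroid per group. Note that k has to be fixed in advance.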

3.3 Distance Functions

A distance function d assigns a distance d(a, b) to two objects a and b.

3.4 Knowledge-based Clustering

3.5 Large-scale Processing and Scalability

3.5.1 Task-based Techniques

3.5.2 Data-based Techniques

3.5.3 Candidate Retrieval

3.5.4 Stream Data

3.6 New Event Detection

3.6.1 Statistical Approaches

Notation used: χ².

3.6.2 Unsupervised Approaches

Notation used: p, k, λ.

3.6.3 Supervised Approaches

3.7 Event Identification and Detection

CHAPTER 4
Event Clustering Dataset

4.1 Creation and Collection of the Dataset

4.1.1 Fetching of Metadata

4.1.2 Fetching of Uploader Information

4.1.3 Fetching of Picture Files

4.2 Labeling of the Data - Creation of the Gold Standard

4.2.1 Usage of Social Event Calendars for Data Labeling

Figure: Example from last.fm

4.2.2 Fetching of Event Information from Upcoming and Last.fm

4.2.3 Labeling Process

Table: Available information from two social event calendars: last.fm and Upcoming

4.3 Dataset Statistics

Table: Availability of features

4.3.1 Data Quality

Table: Use of license

Table: Distribution per year

4.3.2 License Constraints

4.3.3 Data Point Distribution

Figure: Distribution of pictures per event. x-axis: Amount of Pictures per Event (1 to 40); y-axis: Number of Events in ReSEED Dataset (0 to 4 000).

4.3.4 Dataset Representation Format and Schema

ReSEED_PICTURES
  flickr_picture_id   BIGINT(20)     NOT NULL
  url                 VARCHAR(255)   NOT NULL
  username            VARCHAR(100)   NOT NULL
  datetaken           DATETIME       NOT NULL
  dateupload          DATETIME       NOT NULL
  title               TEXT           NULL
  description         TEXT           NULL
  latitude            FLOAT          NULL
  longitude           FLOAT          NULL
  views               INT(11)        NOT NULL

ReSEED_PICTURES_TAGS
  flickr_picture_id   BIGINT(20)     NOT NULL
  tag                 VARCHAR(255)   NOT NULL

ReSEED_PICTURES_EVENTS
  flickr_picture_id   BIGINT(20)     NOT NULL
  event_id            INT(11)        NOT NULL

ReSEED_EVENTS_UPCOMING
  upcoming_event_id   INT(11)        NOT NULL
  title               VARCHAR(255)   NOT NULL
  tags                VARCHAR(255)   NULL
  description         VARCHAR(255)   NULL
  venue_id            INT(11)        NULL
  venue_name          VARCHAR(255)   NULL
  venue_street        VARCHAR(255)   NULL
  venue_city          VARCHAR(128)   NULL
  venue_postcode      VARCHAR(16)    NULL
  venue_country       VARCHAR(100)   NULL
  venue_longitude     FLOAT          NULL
  venue_latitude      FLOAT          NULL
  venue_url           VARCHAR(255)   NULL
  startdate           INT(11)        NOT NULL
  enddate             INT(11)        NOT NULL

Event mapping (table name not shown)
  upcoming_event_id   INT(11)        NOT NULL
  lastfm_event_id     INT(11)        NOT NULL

ReSEED_EVENTS_LASTFM
  lastfm_id           INT(11)        NOT NULL
  title               VARCHAR(255)   NOT NULL
  venue_id            INT(11)        NULL
  venue_name          VARCHAR(255)   NULL
  venue_street        VARCHAR(255)   NULL
  venue_postcode      VARCHAR(16)    NULL
  venue_city          VARCHAR(128)   NULL
  venue_country       VARCHAR(100)   NULL
  venue_longitude     FLOAT          NULL
  venue_latitude      FLOAT          NULL
  venue_url           VARCHAR(255)   NOT NULL
  lastfm_url          VARCHAR(255)   NOT NULL
  startdate           INT(11)        NULL
  attendance          INT(11)        NOT NULL
  reviews             INT(11)        NOT NULL

Figure: Database schema for the ReSEED dataset

4.4 Applications of the Dataset

4.4.1 MediaEval

4.4.2 Further Applications

4.4.3 Evaluation Proposal for Comparability

Dataset Splits for Training and Test

Evaluation Measures

PART II
Supervised Single-Pass Clustering with the Event-based Stream Classification Framework

CHAPTER 5
System Description of the Stream Classification Framework for a Single-Pass Setting

5.1 Problem Statement

At time t, an assignment function \( a_t \) maps each document \( d \in D \) to an event in the current event set \( E_t \):

\[ a_t : D \to E_t \]

Definition (Assignment function)

\[ a_t(d) = \arg\max_{e \in E_t} \mathrm{sim}(d, e), \qquad \mathrm{sim} : D \times E_t \to [0..1] \]

Definition (Similarity function)

\[ \mathrm{sim}(d, e) = w^T v_{sim}(d, e) \]

where w is a weight vector and \( v_{sim}(d, e) \) is the vector of pairwise similarities between d and e.

Auxiliary functions: event(d_i), the event of document d_i; centroid : E → R^n; ext : E → P(D), the set of documents belonging to an event e ∈ E.

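Figure: Overview of the event clustering framework system. Components shown: Documents D, Candidate Retrieval, Pairwise Feature Extraction, Scoring and Ranking, New Event Decision, Event Classification (new event / no new event), Centroid (Re)calculation, Event Database, Events E.

5.2 Overview of the Clustering Framework

For each incoming document d ∈ D, the k most promising candidate events are retrieved from E. Each candidate e ∈ E is scored with P(e|d), and the top-ranked candidate is e_max = e_1. For d, the probabilities P_new(d) and P_belongs to top cand(d) sum to 1. If P_new(d) > θ_n for the threshold θ_n, a new event e′ := {d} is created; otherwise d is assigned to e_max.

Algorithm (single-pass clustering)

Input: documents D
for each d ∈ D:
    Top_k(d) ← candidate events retrieved for d
    for each e ∈ Top_k(d):
        compute P(e|d) for d and e
    e_max ← argmax over e′ ∈ Top_k(d) of P(e′|d)
    compute P_new(d) for d
    if P_new(d) > θ_n:
        create a new event e′
        e′ = {d}
    else:
        e_max = e_max ∪ {d}
        recompute the centroid of e_max

The behaviour is governed by the number of candidates k, the scoring model P(e|d), the new event model P_new(d), and the threshold θ_n.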
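The single-pass loop (retrieve candidates, score them, then either open a new event or extend the best candidate) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the thesis implementation: the callables `retrieve_top_k`, `p_event` and `p_new` are hypothetical stand-ins for the candidate retrieval, scoring and new event detection components, and the toy helpers treat documents as plain timestamps.

```python
def single_pass_clustering(documents, retrieve_top_k, p_event, p_new, theta_n):
    """Assign each document to an existing event or open a new one."""
    events = []  # every event is the list of documents assigned to it
    for d in documents:
        candidates = retrieve_top_k(d, events)
        scores = [p_event(e, d) for e in candidates]
        if not candidates or p_new(scores) > theta_n:
            events.append([d])  # new event e' := {d}
            continue
        e_max = candidates[scores.index(max(scores))]
        e_max.append(d)  # e_max := e_max ∪ {d}
    return events


# Hypothetical toy components: a document is just a timestamp.
def centroid(event):
    # Recomputed on demand, so it always reflects the latest assignment.
    return sum(event) / len(event)


def nearest_events(d, events, k=2):
    # Stand-in for candidate retrieval: the k events with the closest centroid.
    return sorted(events, key=lambda e: abs(centroid(e) - d))[:k]


def closeness(event, d):
    # Stand-in for P(e|d): decays with the distance to the event centroid.
    return 1.0 / (1.0 + abs(centroid(event) - d))


def novelty(scores):
    # Stand-in for P_new(d): high when even the best candidate scores low.
    return 1.0 - max(scores)
```

Calling `single_pass_clustering([1.0, 1.2, 10.0, 10.1, 0.9], nearest_events, closeness, novelty, 0.5)` groups the three early timestamps into one event and the two late ones into another. Each document is seen exactly once, which is what makes the procedure suitable for streams.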

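5.3 Candidate Retrieval Strategies

Candidate retrieval strategies are denoted c_i.

5.3.1 Measurements for Performance and Effectiveness

Definition (Effectiveness of a candidate retrieval strategy)

\[ \mathrm{effectiveness}(c_i, k) = \frac{\left|\{\, d \mid \text{the correct event of } d \text{ is among the top-}k \text{ candidates retrieved with } c_i \,\}\right|}{|D|} \]

That is, the effectiveness of strategy c_i at cutoff k is the fraction of documents d whose correct event is contained in the k candidates that c_i retrieves for d.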
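The effectiveness measure is a simple hit rate, which the following sketch makes concrete. The names `effectiveness`, `correct_event` and `retrieve_top_k` are assumptions for illustration; the retrieval function is passed in so that any strategy c_i can be plugged in.

```python
def effectiveness(documents, correct_event, retrieve_top_k, k):
    """Fraction of documents whose correct event is among the top-k candidates."""
    hits = sum(1 for d in documents if correct_event[d] in retrieve_top_k(d, k))
    return hits / len(documents)


# Hypothetical toy data: a fixed candidate ranking per document.
RANKINGS = {
    "d1": ["e1", "e2", "e3"],
    "d2": ["e3", "e2", "e1"],
    "d3": ["e2", "e3", "e1"],
    "d4": ["e1", "e3", "e2"],
}
GOLD = {"d1": "e1", "d2": "e2", "d3": "e1", "d4": "e1"}


def ranked_prefix(d, k):
    return RANKINGS[d][:k]
```

With this toy data the effectiveness rises from 0.5 at k = 1 to 0.75 at k = 2 and 1.0 at k = 3, which illustrates the trade-off between a small candidate set and the risk of missing the correct event.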

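5.3.2 Candidate Retrieval Strategies

Six candidate retrieval strategies c_1, ..., c_6 are considered. A strategy retrieves the k events e_i with the smallest difference ∆ between the respective feature value of d and that of e_i. For geographic retrieval, a window of 1.0 × 1.0 around the document is used: an event e qualifies if its latitude and its longitude each lie within ±1 of the corresponding value of d, i.e. value(d) − 1 < value(e) < value(d) + 1.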

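5.4 Pairwise Feature Extraction

5.4.1 Temporal Features

With the timestamps time(d) and time(e), the time difference is

\[ \Delta_{time} = |time(d) - time(e)| \]

A linear similarity over a window of one year:

\[ \mathrm{sim}_{time}(d, e) = \begin{cases} 1.0 - \dfrac{\Delta_{time}}{\Delta_{year}} & \Delta_{time} \le \Delta_{year} \\ 0.0 & \text{otherwise} \end{cases} \]

The value lies in [0..1]: 1.0 for identical timestamps and 0.0 for timestamps more than ∆year apart.

Definition (Similarity of two timestamps)

\[ \mathrm{sim}_{time}(d, e) = \begin{cases} 1.0 - \dfrac{\log(\Delta_{time})}{\log(\Delta_{year})} = 1.0 - \dfrac{\log(|time(d) - time(e)|)}{\log(y)} & \Delta_{time} \le \Delta_{year} \\ 0.0 & \text{otherwise} \end{cases} \]

where time(d) and time(e) are the timestamps and y is the length of one year in the same unit.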
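Both the linear and the logarithmic time similarity can be sketched directly. The sketch rests on two assumptions that are not stated in the surviving text: timestamps are measured in minutes, and a time difference of zero (where the logarithm is undefined) is mapped to a similarity of 1.0.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: timestamps are given in minutes


def sim_time_linear(t_d, t_e, year=MINUTES_PER_YEAR):
    """Linear decay from 1.0 (same time) to 0.0 (one year apart or more)."""
    delta = abs(t_d - t_e)
    return 1.0 - delta / year if delta <= year else 0.0


def sim_time_log(t_d, t_e, year=MINUTES_PER_YEAR):
    """Logarithmic decay: small gaps cost much more than in the linear form."""
    delta = abs(t_d - t_e)
    if delta == 0:
        return 1.0  # assumption: log(0) is undefined, identical times count as 1.0
    if delta > year:
        return 0.0
    return 1.0 - math.log(delta) / math.log(year)
```

For a gap of one day, `sim_time_linear` still returns about 0.997, while `sim_time_log` drops to roughly 0.45. The logarithmic form therefore discriminates much more sharply between pictures taken minutes apart and pictures taken days apart.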

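5.4.2 Geographical Features

The distance H(L_1, L_2) between two locations L_1 and L_2, where L_n has the coordinates lat_n and lon_n, is computed with the Haversine formula.

Definition (Haversine formula)

\[ H(L_1, L_2) = 2 \cdot \operatorname{atan2}\left( \sqrt{\varphi}, \sqrt{1 - \varphi} \right) \]

\[ \varphi = \sin^2\left( \frac{\Delta_{lat}}{2} \right) + \cos(lat_1) \cdot \cos(lat_2) \cdot \sin^2\left( \frac{\Delta_{lon}}{2} \right) \]

\[ \Delta_{lat} = |lat_2 - lat_1|, \qquad \Delta_{lon} = |lon_2 - lon_1| \]

The similarity is expressed in the range [0..1], from 0 to 1.

Definition (Similarity of two geographic locations)

\[ \mathrm{sim}_{geo}(d, e) = 1.0 - H(L_1, L_2) \]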
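The Haversine distance and the derived similarity can be sketched as below. The function names are assumptions. The sketch applies the formula as written, with H as the central angle in radians on a unit sphere; how the source normalises H so that the similarity stays within [0..1] is not recoverable here, so the result can become negative for locations more than one radian apart.

```python
import math


def haversine_angle(lat1, lon1, lat2, lon2):
    """Central angle (radians) between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    d_lat = abs(lat2 - lat1)
    d_lon = abs(lon2 - lon1)
    phi = (
        math.sin(d_lat / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin(d_lon / 2) ** 2
    )
    phi = min(1.0, phi)  # guard against rounding slightly above 1
    return 2 * math.atan2(math.sqrt(phi), math.sqrt(1 - phi))


def sim_geo(lat1, lon1, lat2, lon2):
    """1.0 for identical locations, decreasing with the central angle."""
    return 1.0 - haversine_angle(lat1, lon1, lat2, lon2)
```

Multiplying `haversine_angle` by the Earth's radius (about 6371 km) would give the great-circle distance in kilometres.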

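5.4.3 Textual Features

Textual features of d and e are compared using TF-IDF weights, a function D × T → R that assigns a weight to a term t in a document d.

Definition (TF-IDF)

\[ \mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \times \mathrm{idf}(t) \]

where tf(t, d) is the frequency of term t in d and idf(t) is the inverse document frequency.

Definition (Inverse Document Frequency (IDF))

\[ \mathrm{idf}(t) = \log \frac{n}{\mathrm{df}(t)} \]

where n is the number of documents and df(t) is the number of documents containing t.

The cosine of the angle θ between two vectors v_d and v_e follows from

\[ v_d \cdot v_e = \lVert v_d \rVert \cdot \lVert v_e \rVert \cdot \cos(\theta) \]

\[ \cos(\theta) = \frac{v_d \cdot v_e}{\lVert v_d \rVert \cdot \lVert v_e \rVert} = \frac{\sum_{i=1}^{n} v_d^{(i)} \times v_e^{(i)}}{\sqrt{\sum_{i=1}^{n} \left( v_d^{(i)} \right)^2} \times \sqrt{\sum_{i=1}^{n} \left( v_e^{(i)} \right)^2}} \]

The three textual similarities each range from 0.0 to 1.0.

Definition (Similarity of two documents regarding their text tokens)

\[ \mathrm{sim}_{text}(d, e) = \frac{\sum_{t} \mathrm{tfidf}(t, d) \times \mathrm{tfidf}(t, e)}{\sqrt{\sum_{t} \mathrm{tfidf}(t, d)^2} \times \sqrt{\sum_{t} \mathrm{tfidf}(t, e)^2}} \]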
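The text similarity is the cosine between two TF-IDF weighted term vectors, which the following sketch illustrates. The function names, the raw-count term frequency, and the representation of the corpus as a list of token lists are assumptions made for the example.

```python
import math
from collections import Counter


def tfidf(tokens, corpus):
    """Sparse TF-IDF vector (term -> weight) for one token list."""
    n = len(corpus)
    weights = {}
    for term, count in Counter(tokens).items():
        df = sum(1 for doc in corpus if term in doc)  # document frequency
        if df:
            weights[term] = count * math.log(n / df)
    return weights


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zero."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


def sim_text(tokens_d, tokens_e, corpus):
    return cosine(tfidf(tokens_d, corpus), tfidf(tokens_e, corpus))
```

A term that occurs in every document receives idf = log(1) = 0 and therefore contributes nothing to the similarity, which is the intended effect of the IDF weighting.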

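5.4.4 Document-Event Similarity Vector

The similarities between a document d and an event e are collected in one vector with six components:

\[ v_{sim}(d, e) = \begin{pmatrix} \mathrm{sim}_1(d, e) \\ \mathrm{sim}_2(d, e) \\ \mathrm{sim}_3(d, e) \\ \mathrm{sim}_4(d, e) \\ \mathrm{sim}_5(d, e) \\ \mathrm{sim}_6(d, e) \end{pmatrix} \]

5.5 Scoring and Ranking - Learning Similarity Functions

For a document d, P(e|d) is the probability that d belongs to event e. It is computed for a pair (d, e) from the similarity vector:

\[ P(e \mid d) = P(d \text{ belongs to } e \mid v_{sim}(e, d)) \]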

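5.5.1 Problem Formulation using a Support Vector Machine

Separating hyperplane and margins:

\[ \langle w, v_{sim}(d, e) \rangle + b = 0 \]

\[ \langle w, v_{sim}(d, e) \rangle + b = 1 \]

\[ \langle w, v_{sim}(d, e) \rangle + b = -1 \]

Soft-margin optimization problem:

\[ \min_{w, \xi} \ \frac{1}{2} \lVert w \rVert^2 + C \sum_{i=1}^{\ell} \xi_i \]

\[ \text{subject to} \quad y_i \left( \langle w, \Phi(x_i) \rangle + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0 \]

where Φ(x_i) is the feature mapping, C is the penalty parameter, ξ_i are the slack variables, and y_i ∈ {−1, 1} is the label of training example x_i.

Training examples:

\[ x_i = \begin{cases} ((d_i, e), -1) & e \ne \mathrm{event}(d_i) \\ ((d_i, e), +1) & \text{otherwise} \end{cases} \]

Constraints on w:

\[ \forall d_i: \ \langle w, \Phi(d_i, \mathrm{event}(d_i)) \rangle > 1 \ (-\xi_i) \]

\[ \forall d_i \ \forall e' \ne \mathrm{event}(d_i): \ \langle w, \Phi(d_i, e') \rangle < -1 \ (+\xi_i) \]

A new event e′ created at time t extends the event set:

\[ E^{(t+1)} = E^{(t)} \cup \{e'\} \]

Problem Formulation as a Ranking SVM Problem

\[ \forall d_i \ \forall e' \ne \mathrm{event}(d_i): \ \mathrm{sim}(d_i, \mathrm{event}(d_i)) > \mathrm{sim}(d_i, e') \]

\[ \forall d_i \ \forall e' \ne \mathrm{event}(d_i): \ \mathrm{sim}(d_i, \mathrm{event}(d_i)) - \mathrm{sim}(d_i, e') > 0 \]

Since sim is linear in w:

\[ \forall d_i \ \forall e' \ne \mathrm{event}(d_i): \ w \left( v_{sim}(d_i, \mathrm{event}(d_i)) - v_{sim}(d_i, e') \right) > 0 \]

With slack variables:

\[ \forall d_i \ \forall e' \ne \mathrm{event}(d_i): \ w \left( v_{sim}(d_i, \mathrm{event}(d_i)) - v_{sim}(d_i, e') \right) > 1 - \xi_i \]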

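Calculation of Probabilities for Support Vector Machines

An SVM yields a decision value f(x); the predicted label for x is sign(f(x)) with labels y_i. The posterior P(y = 1|x) is approximated by a sigmoid:

\[ P(y = 1 \mid x) \approx P_{A,B}(f) \equiv \frac{1}{1 + e^{Af + B}} \]

The parameters A and B are fitted on the training pairs (f_i, y_i) by minimizing

\[ -\sum_{i} \left( t_i \log(p_i) + (1 - t_i) \log(1 - p_i) \right) \]

where \( p_i = P_{A,B}(f_i) \) and the targets t_i are derived from the labels y_i:

\[ t_{+} = \frac{N_{+} + 1}{N_{+} + 2}, \qquad t_{-} = \frac{1}{N_{-} + 2} \]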
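The sigmoid mapping, the smoothed targets, and the loss that A and B are fitted against can be written down compactly. This is a sketch of the formulas only: the function names are assumptions, and the actual fitting of A and B (an optimisation over the loss) is left out.

```python
import math


def platt_probability(f, a, b):
    """Map an SVM decision value f to a probability with the sigmoid P_{A,B}."""
    return 1.0 / (1.0 + math.exp(a * f + b))


def platt_targets(n_pos, n_neg):
    """Smoothed target values for positive and negative training examples."""
    return (n_pos + 1) / (n_pos + 2), 1 / (n_neg + 2)


def platt_loss(decision_values, labels, a, b):
    """Negative log-likelihood that is minimised over A and B."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    t_pos, t_neg = platt_targets(n_pos, n_neg)
    loss = 0.0
    for f, y in zip(decision_values, labels):
        t = t_pos if y == 1 else t_neg
        p = platt_probability(f, a, b)
        loss -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return loss
```

A negative A makes larger decision values map to larger probabilities. The smoothed targets keep the fitted sigmoid from saturating at exactly 0 or 1 when the training data is perfectly separable.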

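5.5.2 Problem Formulation as a Decision Tree Classification Problem

At a node m with data S, a candidate split c = (sim_j(d, e), Θ_m) consists of a feature sim_j(d, e) from v_sim(d, e) and a threshold Θ_m. It partitions S into

\[ S_{left}(c) = \{ (x, y) \mid x_{sim_j(d,e)} \le \Theta_m \} \]

\[ S_{right}(c) = S \setminus S_{left}(c) = \{ (x, y) \mid x_{sim_j(d,e)} > \Theta_m \} \]

The Gini impurity I_G(f_m), where f_{mi} is the fraction of examples of class i at node m:

\[ I_G(f_m) = 1 - \left( \sum_{i \in \{-1, 1\}} f_{mi}^2 \right) \]

The gain G(S, c) of a split:

\[ G(S, c) = I_G(S) - \left( \frac{n_{left}}{N_m} \cdot I_G(S_{left}(c)) + \frac{n_{right}}{N_m} \cdot I_G(S_{right}(c)) \right) \]

where N_m is the number of examples at node m, and n_left and n_right are the sizes of the two partitions.

The best split c* maximizes the gain:

\[ c^{*} = \arg\max_{c} G(S, c) = \arg\min_{c} \left( \frac{n_{left}}{N_m} \cdot I_G(S_{left}(c)) + \frac{n_{right}}{N_m} \cdot I_G(S_{right}(c)) \right) \]

The procedure recurses on S_left(c*) and S_right(c*) until N_m = 1 or N_m falls below a minimum number of examples.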
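The Gini impurity, the gain of a split, and the search for the best split at one node can be sketched as follows. The names and the choice of candidate thresholds (every observed feature value) are assumptions for illustration. A sample is a pair of a similarity vector and a label in {-1, 1}.

```python
def gini(labels):
    """Gini impurity 1 - sum of squared class fractions."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def split_gain(samples, feature, threshold):
    """Impurity reduction for splitting on x[feature] <= threshold."""
    left = [y for x, y in samples if x[feature] <= threshold]
    right = [y for x, y in samples if x[feature] > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty gains nothing
    n = len(samples)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini([y for _, y in samples]) - weighted


def best_split(samples):
    """Return ((feature, threshold), gain) with the highest gain at this node."""
    best, best_gain = None, 0.0
    for feature in range(len(samples[0][0])):
        for x, _ in samples:
            gain = split_gain(samples, feature, x[feature])
            if gain > best_gain:
                best, best_gain = (feature, x[feature]), gain
    return best, best_gain
```

A full tree would apply `best_split` recursively to the two partitions until a stopping criterion is met. Library implementations such as scikit-learn's `DecisionTreeClassifier` do this far more efficiently.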

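5.6 New Event Detection

Whether a document d opens a new event is decided from features of its k ranked candidates. The recoverable features are: the probability P(e_1|d) of the top-ranked candidate for d; the probability P(e_k|d) of the k-th candidate; the average over the k candidates,

\[ \frac{1}{k} \sum_{i=1}^{k} P(e_i \mid d) \]

and the time similarities sim(d, e_1) between d and the top-ranked candidate e_1. These features are collected in the vector v_new(d) for d.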
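A feature vector for the new event decision could be assembled as in the sketch below. The feature labels max, min, avg, stddev, maxcap and maxupl are the ones that appear in the experimental figures. Their exact definitions are assumptions here: in particular, the use of the population standard deviation and the reading of maxcap and maxupl as the capture-time and upload-time similarity to the top-ranked candidate are guesses, not statements from the source.

```python
import statistics


def new_event_features(probabilities, sim_capture_top, sim_upload_top):
    """Summarise the candidate probabilities P(e_i|d) of one document."""
    ranked = sorted(probabilities, reverse=True)
    return {
        "max": ranked[0],                      # P(e_1|d), top-ranked candidate
        "min": ranked[-1],                     # P(e_k|d), k-th candidate
        "avg": sum(ranked) / len(ranked),      # mean over the k candidates
        "stddev": statistics.pstdev(ranked),   # assumption: population std. dev.
        "maxcap": sim_capture_top,             # assumption: capture-time sim to e_1
        "maxupl": sim_upload_top,              # assumption: upload-time sim to e_1
    }
```

A document whose best candidate scores low, with little spread among the candidates, is a plausible new event. A classifier trained on such vectors learns this decision instead of relying on a hand-set threshold.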

The probability that d opens a new event is computed from this vector:

\[ P_{new}(d) = P(d \text{ is a new event} \mid v_{new}(d)) \]

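CHAPTER 6
Experimental Setup and Results of the Supervised Single-Pass Classification

6.1 Definition of Evaluation Measures

The evaluation uses the F-measure, computed from precision P and recall R.

Definition (Precision and Recall)

\[ P = \sum_{d \in D} \frac{1}{|D|} \cdot \frac{|\mathrm{cluster}(d) \cap \mathrm{gold}(d)|}{|\mathrm{cluster}(d)|} \]

\[ R = \sum_{d \in D} \frac{1}{|D|} \cdot \frac{|\mathrm{cluster}(d) \cap \mathrm{gold}(d)|}{|\mathrm{gold}(d)|} \]

where cluster(d) is the set of documents in the cluster that d was assigned to and gold(d) is the set of documents in the gold-standard event of d.

Definition (General F-Measure)

\[ F_\beta = \frac{(\beta^2 + 1) \cdot P \cdot R}{(\beta^2 \cdot P) + R} \]

The parameter β weights recall against precision. With β = 1 this gives the F_1-measure.

Definition (F1-Measure)

\[ F_1 = 2 \cdot \frac{P \cdot R}{P + R} \]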
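The per-document precision and recall and the resulting F-measure can be computed as in the following sketch. The names are assumptions. A clustering and the gold standard are each given as a mapping from document to cluster or event identifier.

```python
def precision_recall(assigned, gold):
    """Per-document precision and recall, averaged over all documents."""
    docs = list(assigned)
    p = r = 0.0
    for d in docs:
        cluster = {x for x in docs if assigned[x] == assigned[d]}
        truth = {x for x in docs if gold[x] == gold[d]}
        overlap = len(cluster & truth)
        p += overlap / len(cluster)
        r += overlap / len(truth)
    return p / len(docs), r / len(docs)


def f_measure(p, r, beta=1.0):
    """General F-measure; beta = 1 gives the harmonic mean of P and R."""
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta * beta
    return (b2 + 1) * p * r / (b2 * p + r)
```

For three documents of one gold event that are split into clusters of size two and one, precision is 1.0 (no cluster mixes events) while recall is 5/9, so F_1 is 5/7. The example shows that over-splitting is penalised through recall only.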

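6.2 Optimizing Candidate Retrieval

This section evaluates the candidate retrieval strategies c_i.

6.2.1 Experimental Settings

Algorithm (evaluation of the candidate retrieval strategies c_i)

Input: documents D, events E, candidate retrieval strategies C
for each d ∈ D:
    for each c ∈ C:
        Rank_c(d) ← result of applying strategy c to d
    e_gold ← the event in E with d ∈ e according to the gold standard
    if no such event exists yet:
        create a new event e′
        e′ = {d}
    else:
        e_gold = e_gold ∪ {d}
        recompute the centroid of e_gold

From the recorded ranks, the effectiveness of each strategy c_i is computed for every k.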

6.2.2 Results

Effectiveness of Candidate Retrieval

The six strategies c_1, ..., c_6 are compared over k.

Figure: Effectiveness of different (single-strategy) candidate retrieval strategies. x-axis: k (1 to 1000, logarithmic); y-axis: effectiveness (0 to 1); series: Upload Time, Capture Time, Geo, Tags, Description, Title.

Table: Needed number of k to reach x % effectiveness

Figure: Retrieval time and effectiveness for the optimal combination of single candidate retrieval strategies over k. x-axis: k (0 to 60); y-axes: Effectiveness (0.8 to 1) and Retrieval time (0.01 to 0.04).

Scalability and Processing Time

Figure: Average processing time for one document over different k based on the Upload Time candidate retrieval strategy. x-axis: k (10 to 90); y-axis: milliseconds (5 to 55); series: Average Processing Time per Document, Linear Correlation.

6.2.3 Conclusion

6.3 Learning Similarity Functions

6.3.1 Experimental Settings

Data Splits and Sampling Strategies

Positive training examples are the vectors v_sim(d, e) of the pairs (d_i, e(d_i)), where e(d_i) is the event of document d_i. For a positive example d_i^+, a negative example neg(d_i^+) is chosen among the documents d_i^- outside ext(e(d_i^+)), based on the summed similarity

\[ \sum_{j=1}^{6} \mathrm{sim}_j(d_i^{-}, e(d_i^{+})) \]

Training Examples for Classifier Training Strategies

Classifiers are trained with m examples. Pairs (d_i, e(d_i)) are labeled 1 and pairs (d_i, e′) are labeled −1.

Classifier Settings

The SVM is trained with C = 1.0, with output in [0..1].

Definition (Linear and Radial Basis Function (RBF) kernel)

\[ \Phi(d, e) = \langle d, e \rangle \]

\[ \Phi(d, e) = \exp\left( -\gamma \cdot \lVert d - e \rVert^2 \right), \qquad \gamma = \frac{1}{6} \]

For the ranking SVM (rank), the values 0.01 and 1.0 are given for C, with output in [0..1].

Leave-One-Out Strategy

The leave-one-out evaluation uses k = 1000.

6.3.2 Results

Table: Average assignment rate using a standard SVM

Table: Average assignment rate using a ranking SVM

Impact of Classifier and Kernel Influence

Table: Average assignment rate using a Decision Tree

Impact of Sampling

Figure: Feature analysis for computation of P(e|d) of accuracy using a standard SVM with linear kernel. x-axis: Features (combinations of Time, Geo, Tags, Title, and Desc.); y-axis: Accuracy for P(e|d) (0.8 to 1). The bar values shown range from 0.859 to 0.986.

Feature Analysis

P(e|d) is analyzed for training set sizes m ∈ {1000, 2000, 4000, 8000, 16000}.

6.3.3 Conclusion

6.4 New Event Detection

6.4.1 Experimental Settings

The experiments use the feature vector v_new(d).

Data Creation for Classifier Training

Classifier Training Strategy

Classifiers are trained on v_new(d) with m examples, for m ∈ {2, 10, 50, 100, 200, 300, 400, 500, 1000, 2000, 4000, 8000}. Examples are labeled 1 and −1.

Classifier Settings

The SVM is trained with C = 1.0, with output in [0..1].

Evaluation Strategy

6.4.2 Results

Influence of Training Examples

Feature Analysis

Table: Accuracy of new event detection for different number of training examples

Classifier vs. Threshold

Figure: Greedy search for optimal features for the new event detection task using a standard SVM. Node labels and values in extraction order: maxcap 86.141; min 80.942; avg 82.241; stddev 83.844; maxcap 80.953; maxupl 82.937; min 83.601; avg 85.048; maxcap 84.030; maxupl 85.121; min 85.258; avg 86.220; maxcap 85.868; max 81.873; min 62.733; avg 69.182; stddev 79.558; maxcap 64.730; maxupl 74.724; min 86.006.

6.4.3 Conclusion

6.5 Framework as a Whole - Results and Comparison

6.5.1 Training and Optimization of the System Parts

Candidate Retrieval

Classification Performance

Table: Comparison of several candidate retrieval strategies with k = 18 to no candidate retrieval

Figure: Performance comparison for different candidate retrieval strategies over k. Panels: F-Measure, Recall, Precision; x-axis: k (10 to 90); series: Upload Time, Capture Time, Uniform Combination, Optimal Combination.

Processing Time

Figure: Computing time for document processing using candidate retrieval in comparison to no candidate retrieval. x-axis: Documents (0 to 130000); y-axis: milliseconds (0 to 700); series: without Candidate Retrieval, with Candidate Retrieval.

Summary

Scoring and Ranking Step

The SVM uses C = 1.0.

New Event Detection Step

\[ \Phi(d, e) = e^{-\gamma \cdot \lVert d - e \rVert^2}, \qquad \gamma = \frac{1}{4} \]

The new event decision uses the threshold θ_n.

6.5.2 Baselines

Table: Results of our approach in comparison with different baselines using the ReSEED test set

6.5.3 Overall System Performance

6.6 Conclusions

PART III
Multi-pass Stream Clustering

CHAPTER 7
System Description of the Stream Classification Framework for a Multi-Pass Setting

7.1 Problem Statement

A first pass over the documents D produces intermediate clusters C_ι. In a further pass, an assignment function a_t maps each cluster c ∈ C_ι to an event in E_ν at time t:

\[ a_t : C_\iota \to E_{\nu}^{t} \]

The similarity function is defined on clusters and events:

\[ \mathrm{sim} : C_\iota \times E_\nu \to [0..1] \]

Definition (Assignment function)

\[ a_t(c) = \arg\max_{e_\nu \in E_\nu(t)} \mathrm{sim}(c, e_\nu) \]

with c ∈ C_ι and e_ν ∈ E_ν.

Definition (Similarity function)

\[ \mathrm{sim}(c, e_\nu) = w \cdot v_{sim}(c, e_\nu) \]

Figure: Overview of the event clustering framework system using multiple passes. Two chained passes are shown. First pass: Documents D, Candidate Retrieval, Pairwise Feature Extraction, Scoring and Ranking, New Event Decision, Event Classification (new event / no new event), Centroid (Re)calculation, Event Database, Events E. Second pass: event clusters from the first pass as input, followed by the same steps with its own Event Database and Events E_ν.

7.2 System Overview

In the first pass, each document d ∈ D is processed, starting from an empty event set E = ∅. The resulting clusters C_ι are the input of the next pass, where each c ∈ C_ι is processed in turn.

7.3 Multi-pass Requirements and Challenges

7.3.1 Number of Passes

7.3.2 Influence on Framework Settings

7.4 Multi-pass Strategies

Candidate Retrieval

As in the single-pass setting, the k events e_i with the smallest difference ∆ to the input are retrieved.

Scoring and Ranking / New Event Detection

P(e_ν|c) is computed for a cluster c and an event e_ν, and the new event decision uses the feature vector v_new(c) for c.

CHAPTER 8
Experimental Setup and Results of Supervised Multi-Pass Clustering

8.1 Analysis of First-Pass Strategies

Table: Results for intermediate clustering using different strategies for the first pass

8.2 Gold Standard Preparation for the Second Pass

8.2.1 Quality Issues in the Preparation Process

Figure: Analysis of final event clusters contained in an intermediate cluster. First chart, categories: 1 final event cluster, 2 final event clusters, 3 or more clusters; values shown: 0.4 %, 2.0 %, 97.6 %. Second chart, categories: 2 clusters are of equal size, 1 cluster has majority; values shown: 75.6 %, 24.4 %.

8.2.2 Creation of the Gold Standard for the Second Pass

8.3 Optimization of the Classification Framework Steps for the Second Pass

8.3.1 Candidate Retrieval

Experimental Settings

Results

Figure: Effectiveness of different candidate retrieval strategies used for a second pass. x-axis: k (1 to 1000, logarithmic); y-axis: Effectiveness (0 to 1); series: Capture Time, Upload Time, Tags, Geo.

Table: Needed number of k to reach x % effectiveness

8.3.2 Features for Similarity Function Learning and New Event Detection

Similarity Function Learning

Creation of the Training Data

For a cluster c and an event e_ν, positive training examples are the vectors v_sim(c, e) of the pairs (c_i, e(c_i)), where e(c_i) is the event of cluster c_i. For a positive example c_i^+, a negative example neg(c_i^+) is chosen among the clusters c_i^- outside ext(e(c_i^+)), based on the summed similarity

\[ \sum_{j=1}^{6} \mathrm{sim}_j(c_i^{-}, e(c_i^{+})) \]

Classifier Settings

The SVM is trained with C = 1.0.

New Event Detection

The new event decision uses the feature vector v_new(c).

Creation of Training Data

The training data is created with k = 10.

Strategy for Classifier Training and Classifier Settings

The classifier is configured through the parameter C, with output in [0..1].

8.4 Clustering Framework in Two-Pass Mode - Optimization

8.4.1 Exhaustive Search for Optimal Features in Scoring, Ranking, and New Event Detection

The search covers combinations of five similarity features for scoring and ranking and of the six new event detection features max, min, avg, stddev, maxcap, and maxupl.

Results grid of the exhaustive feature search. Rows: combinations of the scoring and ranking features timecap, timeupl, geo, tags, and title. Columns: combinations of the new event detection features max, min, avg, stddev, maxcap, and maxupl. The cell values that survive extraction lie between 8.61E-001 and 9.10E-001; their assignment to individual rows and columns is not recoverable.

8,71E-0018,61E-0018,71E-0018,61E-0018,71E-0018,61E-0018,70E-0018,61E-0018,71E-0018,61E-0018,70E-0018,61E-0018,71E-0018,61E-0018,70E-0018,61E-0018,71E-0018,61E-0018,69E-0018,61E-0018,70E-0018,61E-0018,70E-0018,61E-0018,70E-0018,61E-0018,70E-0018,61E-0018,70E-0018,61E-0018,72E-0018,61E-0018,66E-0018,62E-0018,67E-0018,62E-0018,61E-0018,62E-0018,68E-0018,61E-0018,65E-0018,62E-0018,67E-0018,61E-0018,66E-0018,62E-0018,67E-0018,61E-0018,61E-0018,62E-0018,68E-0018,61E-0018,68E-0018,62E-0018,67E-0018,61E-0018,68E-0018,62E-0018,68E-0018,62E-0018,61E-0018,62E-0018,67E-001

8,61E-0018,61E-0018,72E-0018,62E-0018,74E-0018,63E-0018,72E-0018,61E-0018,73E-0018,66E-0018,61E-0018,64E-0018,73E-0018,65E-0018,73E-0018,61E-0018,61E-0018,61E-0018,73E-0018,63E-0018,71E-0018,63E-0018,73E-0018,61E-0018,71E-0018,66E-0018,73E-0018,61E-0018,72E-0018,64E-0018,61E-0018,61E-0018,70E-0018,61E-0018,72E-0018,62E-0018,61E-0018,62E-0018,72E-0018,63E-0018,72E-0018,61E-0018,73E-0018,64E-0018,74E-0018,64E-0018,73E-0018,61E-0018,70E-0018,61E-0018,73E-0018,62E-0018,72E-0018,61E-0018,72E-0018,63E-0018,61E-0018,64E-0018,61E-0018,63E-0018,73E-0018,64E-0018,73E-001

8,68E-0018,61E-0018,61E-0018,61E-0018,70E-0018,61E-0018,70E-0018,62E-0018,70E-0018,61E-0018,70E-0018,62E-0018,70E-0018,63E-0018,61E-0018,61E-0018,70E-0018,61E-0018,70E-0018,61E-0018,70E-0018,61E-0018,70E-0018,65E-0018,70E-0018,66E-0018,70E-0018,64E-0018,61E-0018,64E-0018,70E-0018,61E-0018,66E-0018,62E-0018,61E-0018,62E-0018,66E-0018,61E-0018,65E-0018,61E-0018,61E-0018,62E-0018,68E-0018,61E-0018,63E-0018,61E-0018,64E-0018,61E-0018,67E-0018,62E-0018,69E-0018,61E-0018,66E-0018,62E-0018,65E-0018,61E-0018,68E-0018,61E-0018,68E-0018,61E-0018,65E-0018,62E-0018,65E-001

8,69E-0018,62E-0018,62E-0018,61E-0018,70E-0018,62E-0018,70E-0018,69E-0018,61E-0018,69E-0018,70E-0018,68E-0018,70E-0018,68E-0018,61E-0018,61E-0018,71E-0018,62E-0018,70E-0018,61E-0018,70E-0018,63E-0018,70E-0018,69E-0018,61E-0018,68E-0018,70E-0018,68E-0018,70E-0018,61E-0018,70E-0018,61E-0018,61E-0018,62E-0018,61E-0018,63E-0018,69E-0018,63E-0018,62E-0018,61E-0018,61E-0018,67E-0018,62E-0018,68E-0018,61E-0018,68E-0018,62E-0018,61E-0018,61E-0018,62E-0018,69E-0018,61E-0018,69E-0018,63E-0018,62E-0018,66E-0018,61E-0018,61E-0018,62E-0018,67E-0018,62E-0018,61E-0018,62E-001

8,70E-0018,61E-0018,70E-0018,61E-0018,61E-0018,61E-0018,70E-0018,61E-0018,70E-0018,62E-0018,71E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,70E-0018,61E-0018,70E-0018,61E-0018,70E-0018,61E-0018,70E-0018,61E-0018,61E-0018,63E-0018,70E-0018,61E-0018,70E-0018,63E-0018,61E-0018,61E-0018,66E-0018,62E-0018,66E-0018,61E-0018,63E-0018,61E-0018,64E-0018,61E-0018,66E-0018,62E-0018,64E-0018,61E-0018,63E-0018,62E-0018,63E-0018,61E-0018,61E-0018,62E-0018,66E-0018,61E-0018,64E-0018,62E-0018,64E-0018,61E-0018,64E-0018,61E-0018,65E-0018,61E-0018,61E-0018,61E-0018,64E-001

8,65E-0018,61E-0018,65E-0018,63E-0018,65E-0018,64E-0018,65E-0018,64E-0018,64E-0018,64E-0018,65E-0018,64E-0018,65E-0018,64E-0018,65E-0018,61E-0018,61E-0018,61E-0018,65E-0018,63E-0018,61E-0018,64E-0018,65E-0018,61E-0018,65E-0018,64E-0018,65E-0018,64E-0018,65E-0018,64E-0018,65E-0018,61E-0018,61E-0018,62E-0018,64E-0018,63E-0018,64E-0018,61E-0018,65E-0018,64E-0018,61E-0018,64E-0018,64E-0018,61E-0018,64E-0018,64E-0018,65E-0018,61E-0018,65E-0018,61E-0018,65E-0018,62E-0018,65E-0018,63E-0018,64E-0018,64E-0018,65E-0018,64E-0018,65E-0018,64E-0018,61E-0018,64E-0018,65E-001

9,07E-0018,61E-0019,07E-0019,07E-0019,07E-0018,80E-0019,07E-0019,04E-0019,07E-0019,04E-0019,08E-0019,04E-0019,07E-0019,04E-0019,07E-0018,61E-0019,07E-0018,61E-0019,07E-0018,80E-0019,03E-0018,82E-0019,07E-0019,04E-0019,07E-0019,05E-0019,07E-0019,05E-0019,07E-0019,04E-0019,05E-0018,61E-0019,07E-0018,62E-0019,07E-0018,69E-0019,07E-0018,74E-0019,07E-0019,04E-0019,06E-0019,05E-0019,07E-0019,04E-0019,07E-0019,04E-0019,07E-0018,61E-0019,08E-0018,62E-0019,06E-0019,05E-0019,07E-0018,78E-0019,07E-0019,04E-0019,07E-0019,08E-0019,07E-0019,04E-0019,08E-0019,08E-0019,07E-001

9,10E-0018,61E-0019,09E-0018,97E-0019,09E-0018,98E-0019,04E-0019,08E-0019,10E-0019,05E-0019,09E-0019,07E-0019,09E-0019,04E-0019,09E-0018,61E-0019,10E-0018,61E-0019,03E-0018,96E-0019,09E-0018,75E-0019,09E-0019,08E-0019,10E-0019,07E-0019,09E-0019,05E-0019,09E-0019,07E-0019,05E-0018,61E-0019,07E-0018,62E-0019,09E-0018,95E-0019,09E-0018,75E-0019,09E-0019,08E-0019,09E-0019,08E-0019,09E-0019,07E-0019,05E-0019,07E-0019,09E-0018,61E-0019,09E-0018,62E-0019,06E-0018,94E-0019,08E-0018,75E-0019,09E-0019,07E-0019,08E-0019,07E-0019,09E-0019,08E-0019,09E-0019,07E-0019,09E-001

9,07E-0018,61E-0019,07E-0018,73E-0019,04E-0018,85E-0019,04E-0019,04E-0019,07E-0019,04E-0019,07E-0019,05E-0019,07E-0019,05E-0019,07E-0018,61E-0019,08E-0018,61E-0019,07E-0018,83E-0019,07E-0018,84E-0019,05E-0019,04E-0019,07E-0019,04E-0019,07E-0019,04E-0019,05E-0019,05E-0019,07E-0018,61E-0019,06E-0018,61E-0019,07E-0018,69E-0019,06E-0019,04E-0019,06E-0019,04E-0019,07E-0019,04E-0019,05E-0019,04E-0019,06E-0019,05E-0019,06E-0018,61E-0019,04E-0018,62E-0019,06E-0018,75E-0019,06E-0018,78E-0019,06E-0019,08E-0019,07E-0019,04E-0019,07E-0019,04E-0019,07E-0019,08E-0019,06E-001

9,08E-0018,61E-0019,08E-0019,03E-0019,08E-0019,04E-0019,08E-0019,08E-0019,08E-0019,05E-0019,08E-0019,08E-0019,05E-0019,08E-0019,08E-0018,61E-0019,08E-0018,61E-0019,03E-0019,03E-0019,08E-0019,04E-0019,08E-0019,08E-0019,08E-0019,08E-0019,08E-0019,08E-0019,08E-0019,07E-0019,08E-0018,61E-0019,08E-0018,62E-0019,08E-0019,02E-0019,08E-0019,04E-0019,08E-0019,08E-0019,04E-0019,07E-0019,08E-0019,08E-0019,08E-0019,05E-0019,08E-0018,61E-0019,08E-0018,62E-0019,08E-0019,02E-0019,08E-0019,04E-0019,08E-0019,08E-0019,08E-0019,08E-0019,08E-0019,08E-0019,08E-0019,07E-0019,08E-001

9,08E-0018,61E-0019,08E-0018,71E-0019,07E-0018,90E-0019,07E-0019,05E-0019,08E-0019,05E-0019,08E-0019,05E-0019,08E-0019,05E-0019,08E-0018,61E-0019,08E-0018,61E-0019,08E-0018,84E-0019,08E-0018,87E-0019,07E-0019,05E-0019,08E-0019,05E-0019,08E-0019,05E-0019,08E-0019,05E-0019,08E-0018,61E-0019,07E-0018,62E-0019,08E-0018,69E-0019,08E-0018,77E-0019,07E-0019,05E-0019,07E-0019,06E-0019,07E-0019,05E-0019,08E-0019,06E-0019,08E-0018,61E-0019,08E-0018,62E-0019,08E-0018,61E-0019,08E-0018,79E-0019,07E-0019,05E-0019,08E-0019,06E-0019,08E-0019,05E-0019,08E-0019,06E-0019,08E-001

9,09E-0018,61E-0019,09E-0019,03E-0019,08E-0018,79E-0019,08E-0019,08E-0019,09E-0019,07E-0019,09E-0019,07E-0019,08E-0019,07E-0019,08E-0018,61E-0019,09E-0018,61E-0019,09E-0018,60E-0019,08E-0018,97E-0019,08E-0019,08E-0019,09E-0019,05E-0019,09E-0019,08E-0019,09E-0019,08E-0019,08E-0018,61E-0019,09E-0018,62E-0019,09E-0018,97E-0019,09E-0018,97E-0019,08E-0019,08E-0019,09E-0019,08E-0019,09E-0019,08E-0019,08E-0019,07E-0019,08E-0018,61E-0019,09E-0018,61E-0019,09E-0019,09E-0018,78E-0019,09E-0019,08E-0019,09E-0019,08E-0019,09E-0019,08E-0019,09E-0019,07E-0019,09E-001

9,07E-0018,61E-0019,07E-0018,71E-0019,07E-0018,91E-0019,07E-0019,05E-0019,05E-0019,05E-0019,08E-0019,05E-0019,05E-0019,05E-0019,07E-0018,61E-0019,07E-0018,61E-0019,08E-0018,84E-0019,07E-0018,87E-0019,07E-0019,05E-0019,05E-0019,05E-0019,07E-0019,05E-0019,07E-0019,05E-0019,07E-0018,61E-0019,07E-0018,62E-0019,07E-0019,04E-0019,07E-0018,78E-0019,07E-0019,05E-0019,07E-0019,05E-0019,07E-0019,04E-0019,07E-0019,05E-0019,05E-0018,61E-0019,07E-0018,62E-0019,07E-0018,60E-0019,05E-0018,80E-0019,07E-0019,05E-0019,08E-0019,05E-0019,07E-0019,05E-0019,08E-0019,05E-0019,07E-001

9,07E-0018,61E-0019,08E-0019,03E-0019,08E-0019,04E-0019,04E-0019,08E-0019,04E-0019,08E-0019,08E-0019,07E-0019,08E-0019,04E-0019,08E-0018,61E-0019,08E-0018,61E-0019,03E-0019,03E-0019,07E-0019,03E-0019,08E-0019,08E-0019,08E-0019,07E-0019,08E-0019,07E-0019,08E-0019,05E-0019,08E-0018,61E-0019,08E-0018,61E-0019,08E-0019,04E-0019,08E-0019,04E-0019,08E-0019,07E-0019,04E-0019,07E-0019,04E-0019,07E-0019,08E-0019,08E-0019,09E-0018,61E-0019,08E-0018,61E-0019,08E-0019,03E-0019,08E-0018,78E-0019,08E-0019,08E-0019,08E-0019,08E-0019,08E-0019,07E-0019,08E-0019,07E-0019,09E-001

8,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-001

8,76E-0018,61E-0018,75E-0018,61E-0018,75E-0018,61E-0018,76E-0018,62E-0018,77E-0018,62E-0018,61E-0018,61E-0018,77E-0018,62E-0018,71E-0018,61E-0018,61E-0018,61E-0018,76E-0018,61E-0018,67E-0018,61E-0018,61E-0018,61E-0018,77E-0018,62E-0018,68E-0018,61E-0018,68E-0018,61E-0018,68E-0018,61E-0018,70E-0018,62E-0018,77E-0018,61E-0018,67E-0018,62E-0018,68E-0018,61E-0018,76E-0018,63E-0018,61E-0018,63E-0018,68E-0018,64E-0018,68E-0018,61E-0018,61E-0018,62E-0018,68E-0018,62E-0018,67E-0018,62E-0018,61E-0018,61E-0018,67E-0018,63E-0018,68E-0018,63E-0018,61E-0018,63E-0018,68E-001

8,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-001

8,77E-0018,61E-0018,78E-0018,61E-0018,78E-0018,63E-0018,83E-0018,61E-0018,78E-0018,69E-0018,61E-0018,69E-0018,81E-0018,69E-0018,75E-0018,61E-0018,78E-0018,61E-0018,81E-0018,62E-0018,82E-0018,61E-0018,75E-0018,68E-0018,61E-0018,68E-0018,75E-0018,68E-0018,75E-0018,68E-0018,75E-0018,62E-0018,61E-0018,62E-0018,79E-0018,62E-0018,79E-0018,63E-0018,75E-0018,68E-0018,79E-0018,68E-0018,61E-0018,68E-0018,81E-0018,61E-0018,75E-0018,61E-0018,79E-0018,62E-0018,77E-0018,62E-0018,61E-0018,64E-0018,75E-0018,61E-0018,81E-0018,68E-0018,75E-0018,68E-0018,75E-0018,68E-0018,61E-001

8,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,62E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,62E-0018,61E-0018,61E-0018,62E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,62E-0018,61E-0018,61E-0018,61E-0018,62E-0018,61E-0018,62E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-001

8,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,62E-0018,62E-0018,62E-0018,61E-0018,62E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,62E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,62E-0018,62E-0018,62E-0018,61E-0018,62E-0018,61E-0018,62E-0018,61E-0018,61E-0018,62E-0018,62E-0018,62E-0018,62E-0018,61E-0018,62E-0018,61E-0018,61E-0018,62E-0018,62E-0018,61E-0018,62E-0018,62E-0018,62E-0018,61E-0018,62E-0018,62E-0018,62E-0018,62E-0018,62E-0018,61E-0018,62E-0018,61E-0018,61E-0018,62E-0018,61E-0018,62E-0018,62E-0018,61E-0018,62E-001

8,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,62E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,62E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-0018,62E-0018,61E-0018,61E-0018,61E-0018,61E-0018,61E-001

> 0.91

0.909—0.91

0.904—0.909

0.899—0.904

0.894—0.899

0.889—0.894

0.884—0.889

0.879—0.884

0.874—0.879

0.869—0.874

0.864—0.869

< 0.864

Simila

rity F

eatu

res

New Event Features

F-Measure

)LJXUH ����� +HDW PDS VKRZLQJ RYHUDOO SHUIRUPDQFH RI WKH V\VWHP XVLQJ GLIIHUHQW VLPLODULW\ DQG QHZ HYHQW IHDWXUHVXVLQJ EHVW WKUHVKROG IRU QHZ HYHQW GHWHFWLRQ �690�

k = 10


[Figure: heat map with one cell per pairing of a similarity feature combination with a new event feature combination. The tick labels enumerate the non-empty combinations of the similarity features timecap, timeupl, geo, tags and title, and combinations of the new event features max, min, avg, stddev, maxcap and maxupl. Cell values range from roughly 0.86 to 0.91.]

> 0.91

0.909—0.91

0.904—0.909

0.899—0.904

0.894—0.899

0.889—0.894

0.884—0.889

0.879—0.884

0.874—0.879

0.869—0.874

0.864—0.869

< 0.864

Simila

rity F

eatu

res

New Event Features

F-Measure

)LJXUH ����� +HDW PDS VKRZLQJ RYHUDOO SHUIRUPDQFH RI WKH V\VWHP XVLQJ GLIIHUHQW VLPLODULW\ DQG QHZ HYHQW IHDWXUHVXVLQJ EHVW WKUHVKROG IRU QHZ HYHQW GHWHFWLRQ �'HFLVLRQ 7UHH�


Chapter: Experimental Setup and Results of Supervised Multi-Pass Clustering

Table: Best performance on overall task on training set using different similarity features and similarity feature combinations

Clustering Framework in Two-Pass Mode - Optimization

Figure: Learned decision boundary of SVM using combination of simtags or simtitle. Axes: simtags and simtitle, with tick marks in steps of 0.05 starting at 0.0.


Figure: Decision Tree of the similarity decision using tag feature only (best performing on overall task). The tree is drawn to depth 5 and splits on simtags thresholds. Each node is annotated with its Gini impurity and sample count, in some nodes split by class (for example "Gini: 0.5 Samples: 5|5" and "Gini: 0.334 Samples: 25|93"). The largest node holds 12158 samples with Gini 0.5.
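The Gini values attached to the tree nodes can be checked by hand. The following is a minimal sketch, assuming the standard Gini impurity (one minus the sum of squared class proportions) and reading "Samples: a|b" as the per-class sample counts of a node. The function name gini_impurity is chosen here for illustration and does not come from the thesis.

```python
def gini_impurity(class_counts):
    """Gini impurity 1 - sum(p_i^2) for a node with the given per-class sample counts."""
    total = sum(class_counts)
    if total == 0:
        raise ValueError("node has no samples")
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


# Class counts read off three nodes of the figure ("Samples: a|b").
for counts in [(5, 5), (9, 1), (25, 93)]:
    print(counts, round(gini_impurity(counts), 3))
```

Under these assumptions the three nodes give 0.5, 0.18 and 0.334, which matches the values printed in the figure for the same sample counts.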


Table: Best performance on overall task on training set using different new event detection features

Optimization of Candidate Retrieval Strategy


Figure: Performance of the clustering framework in a two-pass paradigm using different numbers of candidates retrieved (using train and test set). Horizontal axis: k, from 1 to 39. Vertical axis: 0.7 to 1. Series: Precision (Train), Recall (Train), F-Measure (Train), F-Measure (Test).


Results of the Clustering Framework used in Two-Pass Mode

Table: Results for clustering using different strategies for the first pass and an optimized second pass


Conclusions


PART IV Concluding Remarks

CHAPTER: Remarks and Comparison of Clustering Approaches

Prerequisites for Event Clustering


Reflection on Multi-Pass Clustering in a Stream-based Setting


Comparison with Other Approaches

Table: Comparison of different clustering strategies of MediaEval SED with our classification approaches


CHAPTER: Conclusion


Outlook

PART V Appendix

Glossary

Acronyms

Bibliography
