
Point Processes
Adapted from Gomez-Rodriguez [4, Gomez-Rodriguez]
Knowledge Discovery and Data Mining 2 (VU) (707.004)

Tiago Santos

Institute for Interactive Systems and Data Science, TU Graz

2019-12-05

Tiago Santos (ISDS, TU Graz) Point Processes 2019-12-05 1 / 37

Section 1

Motivation and Applications


Example 1: assessing source trustworthiness

Timeline of edits to a Wikipedia article

Refutation probabilities by topic and source


Paper: [9, Tabibian et al.]


Example 2: seismology models

Interactions between different kinds of earthquakes


Paper: [7, Ogata 1983]

Generalized problem formulation

Suppose:

1. Discrete event stream of timestamps
   - Irrespective of application scenario [3, Daley and Vere-Jones], [1, Bacry et al.], [5, Kurashima et al.]
2. Non-trivial temporal dynamics and dependencies:
   - Dependence on own event history
   - Dependence on other event histories

When facing such a problem, consider Hawkes processes!


Section 2

Univariate Point Processes and Hawkes Processes


Temporal point processes

Definition: A random process whose realization consists of discrete events localized in time.

Formally,

$$N(t) = \int_0^t dN(s), \qquad dN(t) = \sum_{t_i \in H(t)} \delta(t - t_i)\,dt,$$

where $dN(t) \in \{0, 1\}$ and $\delta$ is the Dirac delta.
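To make the counting-process notation concrete, here is a minimal Python sketch that evaluates N(t) for a list of timestamps. The function name `counting_process` and the sample timestamps are illustrative assumptions, not part of the original slides.

```python
from bisect import bisect_right


def counting_process(event_times, t):
    """Return N(t): the number of events with timestamp <= t."""
    # Sorting makes the function safe for unsorted input;
    # bisect_right counts how many timestamps are <= t.
    return bisect_right(sorted(event_times), t)


# Hypothetical timeline with four events.
events = [0.5, 1.2, 1.3, 4.0]
counts = [counting_process(events, t) for t in (0.0, 1.25, 5.0)]
print(counts)
```

N(t) is a step function: it stays constant between events and jumps by one at each timestamp.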


Intensity function

Since it is cumbersome to model event timelines directly, we model event intensity over time:

$$\lambda^*(t)\,dt = E[dN(t) \mid H(t)]$$

$\lambda^*(t)\,dt$ is the expected value of the (infinitesimal) change in event count over time, given the event history.
→ $\lambda^*(t)$ is an event rate (i.e., number of events per time unit), and this changes over time!


Poisson process

Intensity of a Poisson process:

$$\lambda^*(t) = \mu$$

Note:
1. Intensity independent of history
2. Events occur uniformly at random
3. Exponential inter-event time distribution
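The third note suggests a direct way to simulate a homogeneous Poisson process: draw exponential waiting times with rate µ and accumulate them. The sketch below assumes this standard construction; the function name `sample_poisson`, the seed handling, and the parameter values are illustrative choices rather than anything taken from the slides.

```python
import random


def sample_poisson(mu, T, seed=0):
    """Sample event times of a homogeneous Poisson process with rate mu on [0, T]."""
    rng = random.Random(seed)
    t, events = 0.0, []
    while True:
        # Inter-event times are exponentially distributed with rate mu.
        t += rng.expovariate(mu)
        if t > T:
            return events
        events.append(t)


# Hypothetical example: rate of 2 events per time unit over 1000 time units.
events = sample_poisson(mu=2.0, T=1000.0)
print(len(events) / 1000.0)  # empirical rate, should be close to mu
```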


Inhomogeneous Poisson process

Intensity of an inhomogeneous Poisson process:

$$\lambda^*(t) = g(t) \geq 0$$

Note:
1. Intensity independent of history
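One common way to sample from such a process is thinning (rejection sampling): propose candidates from a homogeneous Poisson process with a rate that dominates g, then keep each candidate with probability g(t) divided by that rate. The sketch below assumes a known upper bound `g_max` on g over [0, T]; the rate function `example_rate` and all names are hypothetical.

```python
import math
import random


def example_rate(t):
    """Hypothetical intensity g(t) = 1 + sin(t), bounded above by 2."""
    return 1.0 + math.sin(t)


def sample_inhomogeneous_poisson(g, g_max, T, seed=0):
    """Sample event times on [0, T] by thinning; requires g(t) <= g_max on [0, T]."""
    rng = random.Random(seed)
    t, events = 0.0, []
    while True:
        # Candidate from a homogeneous Poisson process with rate g_max.
        t += rng.expovariate(g_max)
        if t > T:
            return events
        # Accept the candidate with probability g(t) / g_max.
        if rng.random() * g_max <= g(t):
            events.append(t)


events = sample_inhomogeneous_poisson(example_rate, g_max=2.0, T=200.0)
print(len(events))  # expected count is the integral of g over [0, T]
```

The tighter the bound `g_max`, the fewer candidates are rejected.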


Survival (or terminating) process

Intensity of a survival (or terminating) process:

$$\lambda^*(t) = g^*(t)\,(1 - N(t)) \geq 0$$

Note:
1. Limited number of occurrences


Hawkes (or self-exciting) process

Intensity of a Hawkes (or self-exciting) process:

$$\lambda^*(t) = \mu + \sum_{t_i \in H(t)} \alpha\,\kappa_\beta(t - t_i)$$

Note:
1. Clustered (or bursty) occurrence of events
2. Intensity is stochastic and history-dependent


Hawkes (or self-exciting) process

Typical choices for the kernel function $\kappa_\beta(t)$ include the power-law and the exponential kernel:

$$\kappa_\beta(t) = e^{-\beta t}$$

Hence we get:

$$\lambda^*(t) = \mu + \sum_{t_i < t} \alpha\,e^{-\beta (t - t_i)}$$

What can we do with these models?

- Fit models to real data by maximizing the log-likelihood
- Sample from the fitted process via Ogata thinning [6, Ogata 1981]
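As a rough illustration of both points, the sketch below evaluates the exponential-kernel intensity and draws a sample with a thinning scheme in the spirit of Ogata's method. It is a minimal, quadratic-time sketch under stated assumptions: the function names, the seed, and the parameter values (µ = 0.5, α = 0.8, β = 1.2) are hypothetical and not taken from the slides.

```python
import math
import random


def hawkes_intensity(t, events, mu, alpha, beta):
    """lambda*(t) = mu + sum over t_i < t of alpha * exp(-beta * (t - t_i))."""
    return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events if ti < t)


def sample_hawkes(mu, alpha, beta, T, seed=0):
    """Sample a univariate exponential-kernel Hawkes process on [0, T] by thinning."""
    rng = random.Random(seed)
    events, t = [], 0.0
    while True:
        # All stored events are <= t and the kernel only decays, so the
        # right-limit of the intensity at t is an upper bound until the next event.
        lam_bar = mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events)
        t += rng.expovariate(lam_bar)
        if t > T:
            return events
        # Accept the candidate with probability lambda*(t) / lam_bar.
        if rng.random() * lam_bar <= hawkes_intensity(t, events, mu, alpha, beta):
            events.append(t)


# Hypothetical parameters with alpha / beta < 1.
events = sample_hawkes(mu=0.5, alpha=0.8, beta=1.2, T=200.0)
print(len(events))
```

Each accepted event raises the intensity by α, which is what produces the bursty clusters mentioned earlier.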


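Fitting temporal point processes: Poisson

Likelihood of a historical timeline of length $T$:

$$\lambda^*(t_1)\,\lambda^*(t_2)\,\lambda^*(t_3)\,\exp\left(-\int_0^T \lambda^*(\tau)\,d\tau\right) = \mu^3 \exp(-\mu T)$$

Maximizing the log-likelihood:

$$\mu^* = \arg\max_\mu \; \left(3 \log \mu - \mu T\right) = \frac{3}{T}$$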
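The closed form follows from a first-order condition. As a worked step, written here for a general event count n (the slide's example has n = 3 events):

```latex
\begin{align*}
\log L(\mu) &= n \log \mu - \mu T, \\
\frac{d}{d\mu} \log L(\mu) &= \frac{n}{\mu} - T = 0
  \quad\Longrightarrow\quad \mu^* = \frac{n}{T}, \\
\frac{d^2}{d\mu^2} \log L(\mu) &= -\frac{n}{\mu^2} < 0 .
\end{align*}
```

The negative second derivative confirms a maximum. With n = 3 this gives µ* = 3/T, so the estimate is simply the observed number of events per unit time.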


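Fitting temporal point processes: Hawkes

Likelihood of a historical timeline of length $T$:

$$\lambda^*(t_1)\,\lambda^*(t_2)\,\lambda^*(t_3) \cdots \lambda^*(t_n)\,\exp\left(-\int_0^T \lambda^*(\tau)\,d\tau\right)$$

Set $\lambda^*(t) = \mu + \sum_{t_i \in H(t)} \alpha\,\kappa_\beta(t - t_i)$ and maximize the log-likelihood:

$$\max_{\mu, \alpha} \; \sum_{i=1}^{n} \log \lambda^*(t_i) - \int_0^T \lambda^*(\tau)\,d\tau$$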
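For the exponential kernel, this objective can be evaluated in closed form, because the integral of the intensity is µT plus one term (α/β)(1 − e^{−β(T − t_i)}) per event. The sketch below assumes that kernel and sorted timestamps; the function name and the example values are hypothetical, and the recursion over events is a standard trick to avoid a double loop.

```python
import math


def hawkes_log_likelihood(events, T, mu, alpha, beta):
    """Log-likelihood of sorted event times on [0, T] for the kernel alpha * exp(-beta * t)."""
    log_sum = 0.0
    decay = 0.0  # sum over earlier events t_j of exp(-beta * (t_i - t_j))
    prev = None
    for t in events:
        if prev is not None:
            # Recursive update: shift all earlier contributions and add the previous event.
            decay = math.exp(-beta * (t - prev)) * (decay + 1.0)
        log_sum += math.log(mu + alpha * decay)
        prev = t
    # Closed-form integral of the intensity over [0, T].
    compensator = mu * T + sum(
        (alpha / beta) * (1.0 - math.exp(-beta * (T - t))) for t in events
    )
    return log_sum - compensator


# Hypothetical timeline and parameters.
value = hawkes_log_likelihood([1.0, 1.5, 3.0], T=4.0, mu=0.5, alpha=0.8, beta=1.2)
print(value)
```

Passing this function (negated) to a numerical optimizer over µ and α is one way to carry out the maximization; with α = 0 it reduces to the Poisson log-likelihood n log µ − µT.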


Section 3

Multivariate Hawkes Processes


Mutually exciting process

Intensity of a mutually exciting (or cross-exciting) Hawkes process:

$$\lambda^*(t) = \mu + \sum_{t_i \in H_b(t)} \alpha\,\kappa_\beta(t - t_i) + \sum_{t_i \in H_c(t)} \gamma\,\kappa_\beta(t - t_i)$$

Note:
1. Superposition of processes
2. Clustered occurrence of events affected by neighbors


Multivariate Hawkes process

M-variate Hawkes process with exponential kernel:

$$\lambda^*_m(t) = \mu_m + \sum_{n=1}^{M} \sum_{t^n_i < t} \alpha_{mn}\,e^{-\beta_{mn}(t - t^n_i)}$$
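The double sum translates directly into code. The sketch below evaluates all M intensities at a time t, reading α_mn and β_mn as the effect of dimension n on dimension m. The parameter values are the ones quoted for the 2-variate example in this deck; the function name and the two event lists are hypothetical.

```python
import math


def multivariate_intensity(t, events, mu, alpha, beta):
    """Return [lambda*_1(t), ..., lambda*_M(t)] for exponential kernels.

    events[n] holds the timestamps of dimension n; alpha[m][n] and beta[m][n]
    describe how events in dimension n excite dimension m.
    """
    M = len(mu)
    intensities = []
    for m in range(M):
        value = mu[m]
        for n in range(M):
            value += sum(
                alpha[m][n] * math.exp(-beta[m][n] * (t - ti))
                for ti in events[n]
                if ti < t
            )
        intensities.append(value)
    return intensities


mu = [0.1, 0.5]
alpha = [[0.1, 0.7], [0.5, 0.2]]
beta = [[1.2, 1.0], [0.8, 0.6]]
# Hypothetical history: one event in each dimension.
events = [[1.0], [0.5]]
print(multivariate_intensity(2.0, events, mu, alpha, beta))
```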


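Fitting and sampling multivariate Hawkes

Sampling and fitting multivariate Hawkes processes works as previously.
Example 2-variate Hawkes process sample for T = 8:

[Figure: intensities λ1 and λ2, one curve per dimension (y-axis: Intensity, 0.00 to 1.00; x-axis: Time, 0 to 8)]

Parameter values:

$$\mu = \begin{pmatrix} 0.1 \\ 0.5 \end{pmatrix}, \quad \alpha = \begin{pmatrix} 0.1 & 0.7 \\ 0.5 & 0.2 \end{pmatrix}, \quad \beta = \begin{pmatrix} 1.2 & 1.0 \\ 0.8 & 0.6 \end{pmatrix}$$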


Section 4

A few words of caution!


Pitfalls & Counter-Measures

Ensure stationarity of the multivariate Hawkes process; otherwise the expected number of events can grow without bound.

- Stationarity test: spectral radius ρ < 1
- Fitting β: EM, L-BFGS, hyperparameter optimization, ...
- Fit quality: measure with a Q-Q plot

Alternative approaches:
- Information theory (e.g. transfer entropy)
- Dynamical systems (e.g. branching processes)
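The stationarity test can be sketched in a few lines. The slide does not say which matrix the spectral radius refers to; the sketch assumes the usual choice for the kernel α e^{−βt}, namely the matrix of kernel integrals with entries α_mn / β_mn. The power iteration, the function name, and that matrix choice are assumptions, and the parameter values are the ones quoted for the 2-variate example in this deck.

```python
def spectral_radius(matrix, iterations=1000):
    """Approximate the spectral radius of a square matrix by power iteration.

    Assumes strictly positive entries (as in the example below), so that the
    iteration converges to the dominant eigenvalue.
    """
    size = len(matrix)
    vector = [1.0] * size
    radius = 0.0
    for _ in range(iterations):
        product = [
            sum(matrix[i][j] * vector[j] for j in range(size)) for i in range(size)
        ]
        radius = max(abs(x) for x in product)
        if radius == 0.0:
            return 0.0
        # Renormalize so the largest component has magnitude one.
        vector = [x / radius for x in product]
    return radius


alpha = [[0.1, 0.7], [0.5, 0.2]]
beta = [[1.2, 1.0], [0.8, 0.6]]
# Assumed branching matrix: integral of alpha * exp(-beta * t) over t >= 0.
branching = [[alpha[m][n] / beta[m][n] for n in range(2)] for m in range(2)]
rho = spectral_radius(branching)
print(rho, rho < 1.0)
```

For these values the spectral radius is roughly 0.88, so under the stated assumption the example process passes the test.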


Section 5

Example Application: Understanding Q&A Community Development


Motivation

How and why do some online communities grow and others do not?

How do users become active, and how does their activity evolve over time?

We aim to understand the role of user excitation in the activity levels of Stack Exchange Q&A forums.

→ This will help community managers guide and encourage activity.


Fitting Multivariate Hawkes

Ensuring stationarity:
- Fit only stationary segments of event streams
- Estimate stationary segments via the algorithm of Zeileis et al. [10, Zeileis et al.]

Fitting $\beta_{m,n}$:
- Assume $\beta_{m,n} = \beta$ for all $1 \leq m, n \leq M$
- Algorithm: Bayesian hyperparameter optimization [2, Bergstra et al.]


Dataset

Stack Exchange: 159 Q&A communities from 2008 to 2017 with 22 million events

| Dataset Group | Communities | # | Activity total | Age (years) | Growth (%) |
|---|---|---|---|---|---|
| Growing | electronics (757.62%), ru (736.42%), codegolf (510.06%), chemistry, sharepoint, academia, puzzling, tex, codereview, blender, unix, money, gis, ux, crypto, security, stats, salesforce, dba, wordpress (182.28%), opendata (174.69%), askubuntu (169.29%) | 22 | [7987, 1489384] | [3.08, 7.83] | [169.29, 757.62] |
| Declining | boardgames (−28.53%), fitness (−34.56%), sound (−35.01%), productivity, tridion, parenting, pets, craftcms, webapps, spanish, cooking, ham, bricks, gardening, cstheory, expressionengine, pm, skeptics, sustainability, genealogy (−80.26%), ebooks (−81.52%), stackapps (−82.7%) | 22 | [3301, 117474] | [3, 7.75] | [−82.7, −28.53] |
| STEM | electronics (757.62%), chemistry (473.48%), stats (199.18%), biology, datascience, physics, astronomy, cs, space, cogsci, earthscience, engineering, reverseengineering (0.00%), softwareengineering (−21.28%), sound (−35.01%) | 15 | [15759, 745674] | [2.41, 8.75] | [−35.01, 757.61] |
| Humanities | philosophy (122.45%), english (117.76%), chinese (23.17%), music, german, mythology, portuguese, christianity, esperanto, arabic, russian, writers, buddhism (−26.62%), french (−27.91%), spanish (−50.10%) | 15 | [87, 896631] | [0.17, 6.83] | [−50.10, 127.47] |


Experimental Setup

Longitudinal comparison:
- We compare groups of datasets across 3 years...
- ...by fitting a Hawkes process every 3 months

Group comparisons:
- Growing vs. declining
- STEM vs. humanities

Mapping event streams to Hawkes processes:
- Every dataset group is a multivariate process, every community a process realization
- 4 process dimensions distinguish common activity and user types:
  - Questions by Power Users (QP)
  - Questions by Casual Users (QC)
  - Answers by Power Users (AP)
  - Answers by Casual Users (AC)


Growing vs. Declining: Baseline Excitation

Low baseline intensities:

[Figure: four panels of baseline intensity (y-axis: Intensity, 0.0 to 2.0; x-axis: Time (quarter), 1 to 12): (a) Baseline of Answers by Power Users, (b) Baseline of Questions by Power Users, (c) Baseline of Answers by Casual Users, (d) Baseline of Questions by Casual Users]


Growing vs. Declining: Self- and Cross-Excitation

Early power user excitation, late casual user excitation and late self-excitation:

[Figure: sixteen panels of excitation over time (y-axis: Intensity, 0.0 to 2.0; x-axis: Time (quarter), 1 to 12). Panels, with the in-plot annotation in parentheses where one is present:
(a) Self of AP (Late Stage Self-Excitation)
(b) Cross of QP on AP (Early Power User Cross-Excitation)
(c) Cross of AC on AP
(d) Cross of QC on AP (Early Power User Cross-Excitation)
(e) Cross of AP on QP
(f) Self of QP (Late Stage Self-Excitation)
(g) Cross of AC on QP
(h) Cross of QC on QP
(i) Excitation of AP on AC
(j) Cross of QP on AC
(k) Self of AC (Late Stage Self-Excitation)
(l) Cross of QC on AC (Late Casual User Cross-Excitation)
(m) Cross of AP on QC
(n) Cross of QP on QC
(o) Cross of AC on QC
(p) Self of QC (Late Stage Self-Excitation)]


STEM vs. Humanities: Self- and Cross-Excitation

Importance of casual users for STEM communities, and of power users for Humanities:

[Figure: two panels (y-axis: Intensity, 0.0 to 2.0; x-axis: Time (quarter), 1 to 12): (a) Self of AC, annotated "Casual User Self-Excitation"; (b) Cross of QC on AP, annotated "Power User Cross-Excitation"]


Effect Evaluation: “Sanity Checks”

High self-excitation of casual users in STEM is not due to growth (K-S two-sample test)

Permutation tests confirm the effects do not arise at random:

[Figure: four panels of permuted excitation (y-axis: Intensity, 0.0 to 2.0; x-axis: Time (quarter), 1 to 12). Panels, with the in-plot annotation in parentheses:
(a) Growing vs. declining comparison: Permuted Cross-Excitation of Questions by Power Users on Answers by Power Users (Early Power User Cross-Excitation)
(b) Growing vs. declining comparison: Permuted Cross-Excitation of Questions by Casual Users on Answers by Casual Users (Late Casual User Cross-Excitation)
(c) Growing vs. declining comparison: Permuted Self-Excitation of Answers by Casual Users (Late Stage Self-Excitation)
(d) Humanities vs. STEM comparison: Permuted Cross-Excitation of Questions by Casual Users on Answers by Power Users (Power User Cross-Excitation)]


Effect Evaluation: Predictive Impact

Prediction setup:
- We fit a quarter and predict the next, over 3 years
- We measure prediction K-S distance and RMSE
- We compare 3 models in the Growing-vs-Declining setting:
  - Baseline
  - Excitation Effects Removed
  - Full

Excitation effects matter for prediction:
- Best performance by the Full model
- Quarters where the Excitation Effects Removed model performs worse allow for ranking effects w.r.t. predictive importance:
  1. Late Stage Self-Excitation
  2. Early Power User Excitation
  3. Late Casual User Excitation
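The two error measures named in the prediction setup can be sketched as follows. The slides do not state which quantities are compared (for example event counts or inter-event times), so the inputs below are placeholder numbers, and the function names are hypothetical. The K-S distance is computed as the largest gap between the two empirical distribution functions.

```python
import math
from bisect import bisect_right


def rmse(predicted, observed):
    """Root mean squared error between two equally long sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: maximum gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    # The empirical CDFs are step functions, so checking every sample point suffices.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Placeholder data, for illustration only.
predicted = [1.0, 2.0, 3.0]
observed = [1.0, 2.0, 5.0]
print(rmse(predicted, observed), ks_distance(predicted, observed))
```

Lower values of either measure indicate that predictions are closer to the held-out quarter.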


Limitations

Tested result robustness only to slight changes in thresholds
- Extend Hawkes to include time-varying parameters

High-dimensional Hawkes process may be more realistic

Pinpointing exact transition dates beyond scope of this work

No claim of causality


Conclusions

Leveraging Hawkes processes, we uncovered user excitation effects in comparisons of growing-vs-declining and STEM-vs-humanities Stack Exchange communities

Impact:
- Importance of timing in rotating user mix
- Excitation effects may serve as development indicator
- Adjust community management according to communities’ topical focus

Future work:
- Generalize to other Q&A platforms
- Extend methodological approach to other domains (e.g. different activities or platforms altogether)


Source: [8, Santos et al.]


Section 6

Further Resources


Code Resources

Python package: Tick
https://github.com/X-DataInitiative/tick

C++ package: PtPack
https://github.com/dunan/MultiVariatePointProcess

Hawkes network inference: Pyhawkes
https://github.com/slinderman/pyhawkes

Models from papers:
- Distilling Information Reliability and Source Trustworthiness from Digital Traces
  http://btabibian.com/projects/reliability/
- Modeling Interdependent and Periodic Real-World Action Sequences
  http://snap.stanford.edu/tipas/


References I

[1] E. Bacry, I. Mastromatteo, and J.-F. Muzy. Hawkes processes in finance. Market Microstructure and Liquidity, 1(01):1550005, 2015.

[2] J. Bergstra, D. Yamins, and D. Cox. Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures. In Proceedings of the 30th International Conference on Machine Learning (ICML’13), pages 115–123, 2013.

[3] D. J. Daley and D. Vere-Jones. An Introduction to the Theory of Point Processes: Volume I: Elementary Theory and Methods. Springer Science & Business Media, 2003.

[4] M. Gomez-Rodriguez. Machine learning for dynamic social network analysis seminar. http://learning.mpi-sws.org/uc3m-seminar/, 2017. Accessed: 2018-02-10.

[5] T. Kurashima, T. Althoff, and J. Leskovec. Modeling interdependent and periodic real-world action sequences. In Proceedings of the 2018 World Wide Web Conference, pages 803–812. International World Wide Web Conferences Steering Committee, 2018.

[6] Y. Ogata. On Lewis’ simulation method for point processes. IEEE Transactions on Information Theory, 27(1):23–31, 1981.


References II

[7] Y. Ogata. Likelihood analysis of point processes and its applications to seismological data. Bulletin of the International Statistical Institute, 50:943–961, 1983.

[8] T. Santos, S. Walk, R. Kern, M. Strohmaier, and D. Helic. Self- and cross-excitation in Stack Exchange question & answers communities. In WWW, 2019.

[9] B. Tabibian, I. Valera, M. Farajtabar, L. Song, B. Schölkopf, and M. Gomez-Rodriguez. Distilling information reliability and source trustworthiness from digital traces. In Proceedings of the 26th International Conference on World Wide Web, pages 847–855. International World Wide Web Conferences Steering Committee, 2017.

[10] A. Zeileis, C. Kleiber, W. Krämer, and K. Hornik. Testing and dating of structural changes in practice. Computational Statistics & Data Analysis, 44:109–123, 2003.
