
University of Copenhagen

Faculty of Science

Multivariate sparse dynamic process modeling and inference

Niels Richard Hansen, Department of Mathematical Sciences

June 8, 2011

Slide 1/29

University of Copenhagen, Department of Mathematical Sciences

Spike tracks from turtle motor neurons

Turtle motor neurons are used as models for neuronal activity.

The (dead) turtle is stimulated by scratching, and recordings from one or more electrodes register the spikes.

Slide 2/29 — Niels Richard Hansen — Multivariate sparse dynamic process modeling and inference — June 8, 2011


Data

[Figure: raster plot of spike trains from five recorded units over time 5–35 s.]

Data from stimulation period

[Figure: raster plot of the same five units restricted to the stimulation period, time 10–18 s.]

Point process modeling via intensities

We consider a filtered probability space (Ω, F, (F_t), P) and a parametrized family (λ_t(θ))_{t≥0} of positive, predictable processes for θ ∈ Θ.

The minus-log-likelihood is

    l_t(\theta) = \int_0^t \lambda_s(\theta)\,ds - \int_0^t \log \lambda_s(\theta)\,N(ds)

We will study penalized maximum-likelihood estimation of the parameter θ.
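As a minimal numerical sketch (not from the slides), take the simplest case of a constant intensity λ_s(θ) = θ: the minus-log-likelihood reduces to l_t(θ) = θt − N_t log θ, which is minimized at θ̂ = N_t / t. The spike times below are hypothetical.

```python
import numpy as np

def minus_log_likelihood(theta, spikes, t):
    """l_t(theta) = ∫_0^t theta ds - ∫_0^t log(theta) N(ds) for a
    constant intensity: theta * t - N_t * log(theta)."""
    return theta * t - len(spikes) * np.log(theta)

# Hypothetical spike times on [0, 10].
spikes = [0.5, 1.7, 2.2, 4.9, 6.1, 8.3]
t = 10.0
thetas = np.linspace(0.1, 2.0, 1901)
ells = [minus_log_likelihood(th, spikes, t) for th in thetas]
theta_hat = thetas[int(np.argmin(ells))]
print(round(theta_hat, 3))  # the grid minimizer sits at N_t / t = 0.6
```

The grid search only illustrates the shape of l_t; for this toy case the minimizer is available in closed form.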


What are the goals?

λ_s(θ) = ?

• Use ensembles of spike patterns for decoding of external stimuli signals¹.
• Learn the connectivity graph for an ensemble of neurons from data².
• Learn from data how signals propagate at the spike level among connected neurons (the functional forms).
• Learn if the network and/or the functional forms are adaptive to stimuli.

¹ Pillow et al. Spatio-temporal correlations and visual signalling in a complete neuronal population. Nature, 454, 2008.
² Masud et al. Statistical technique for analysing functional connectivity of multiple spike trains. J. Neuroscience Meth., 196, 2011.


A self-exciting model

With (N_t)_{t≥0} the counting process for the spike times and τ_1, ..., τ_{N_t} the jumps, consider the model

    \lambda_t(g) = \phi\Big(\sum_{j : \tau_j < t} g(t - \tau_j)\Big) = \phi\Big(\int_0^{t-} g(t - s)\,N(ds)\Big)

For a multivariate counting process (N^i_t)_{t≥0, i=1,\dots,K},

    \lambda^k_t(g) = \phi\Big(\sum_{i=1}^K \sum_{j : \tau^i_j < t} g^{ik}(t - \tau^i_j)\Big),

which is the non-linear Hawkes process³. With φ(x) = x + d we get the linear Hawkes process.

³ Brémaud, P. and Massoulié, L. Stability of nonlinear Hawkes processes. Ann. Probab. 24(3), 1996.
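A small sketch of the linear Hawkes intensity, assuming an exponential kernel g(s) = α e^{−βs} (a common but here purely illustrative choice) together with φ(x) = x + d; the spike history and parameter values are hypothetical.

```python
import math

def hawkes_intensity(t, spikes, alpha=0.8, beta=2.0, d=0.5):
    """Linear Hawkes intensity λ_t = φ(Σ_{τ_j < t} g(t − τ_j)) with
    φ(x) = x + d and the (assumed) exponential kernel g(s) = α e^{−β s}."""
    excitation = sum(alpha * math.exp(-beta * (t - tau))
                     for tau in spikes if tau < t)
    return excitation + d

# Hypothetical spike history.
spikes = [1.0, 2.5, 3.0]
print(hawkes_intensity(0.5, spikes))  # no spikes yet: baseline d = 0.5
print(hawkes_intensity(3.1, spikes) > hawkes_intensity(6.0, spikes))  # True: excitation decays
```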


Intensities

[Figure: estimated linear filters for v13.1 and v5.1 and the resulting log-intensity over time 12.4–13.2 s.]

Example of estimated linear filters and log intensity with φ = exp.

Local independence

We say that N^k_t is locally independent of N^i_t if the F_t-intensity is σ((N^{-i}_s)_{s∈[0,t)})-adapted. We write i ↛ k, and this defines the oriented local independence graph G = ({1, ..., K}, E).

For the non-linear Hawkes process, (i, k) ∈ E if and only if g^{ik} ≠ 0.

The concept was introduced by Tore Schweder⁴ for finite state space Markov processes and extended to general point processes by Vanessa Didelez⁵.

The local independence graph has a causal connotation related to Granger causality, and it finds applications in the literature on causal inference.

⁴ Schweder, T. Composable Markov Processes. J. Appl. Prob. 7(2), 1970.
⁵ Didelez, V. Graphical models for marked point processes based on local independence. J. R. Statist. Soc. B 70(1), 2008.
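The edge criterion (i, k) ∈ E ⇔ g^{ik} ≠ 0 can be turned into a small graph-construction sketch. The filters below are hypothetical sampled values, and testing against a tolerance is an assumption one would tune when the filters are estimated rather than exact.

```python
def local_independence_graph(filters, tol=1e-8):
    """Edge set E of the oriented local independence graph: (i, k) is an
    edge iff the filter g^{ik} is not identically zero (here: some sampled
    value exceeds tol in absolute value)."""
    return {(i, k) for (i, k), g_vals in filters.items()
            if max(abs(v) for v in g_vals) > tol}

# Hypothetical sampled filters g^{ik} for K = 3 units.
filters = {
    (1, 2): [0.0, 0.4, 0.1],    # unit 1 excites unit 2
    (2, 1): [0.0, 0.0, 0.0],    # zero filter: no edge 2 -> 1
    (3, 2): [-0.2, -0.1, 0.0],  # an inhibitory filter still gives an edge
}
print(sorted(local_independence_graph(filters)))  # [(1, 2), (3, 2)]
```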


Joint likelihood

If λ^i(θ_i) is parametrized by θ_i ∈ Θ_i, the joint minus-log-likelihood is

    \sum_{i=1}^K \underbrace{\int_0^t \lambda^i_s(\theta_i)\,ds - \int_0^t \log \lambda^i_s(\theta_i)\,N^i(ds)}_{l^i_t(\theta_i)},

hence if we have variation independence of θ_1, ..., θ_K we need to minimize each l^i_t(θ_i) separately.

We can think of the i'th term as a likelihood for a (conditional) model of the i'th counting process, and we consider only such models.

Which class of models to consider?


Generalized linear point process models

(X_t)_{t≥0} is a predictable, cadlag process with values in V*, the dual of the vector space V, and

    \Theta(D) = \{\beta \in V \mid X_{s-}\beta \in D \text{ for all } s \in [0, t]\ P\text{-a.s.}\}.

φ : D → [0, ∞) and (Y_t)_{t≥0} is a predictable, cadlag process with values in [0, ∞).

Definition
A generalized linear point process model on [0, t] is the statistical model with parameter space Θ(D) such that for β ∈ Θ(D) the point process on [0, t] has intensity

    \lambda_s = Y_s \phi(X_{s-}\beta)

for s ∈ [0, t].


Stochastic integrals as linear functionals

If g : [0, ∞) → R is a measurable, locally bounded function and (Z_t)_{t≥0} a semi-martingale, we can define the linear filter

    X_t g = \int_0^t g(t - s)\,dZ_s

The function g ↦ X_t g is an ω-wise continuous linear functional on the Sobolev space W^{m,2}([0, t]) for m ≥ 1.

Proof by integration by parts:

    \int_0^t h(s)\,dZ_s = h(t) Z_t - h(0) Z_0 - \int_0^t Z_{s-} h'(s)\,ds
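Since a stochastic integral against a counting process is just a sum over its jump times, the integration-by-parts identity above can be checked numerically; the jump times and the test function h = sin are hypothetical.

```python
import math

# Numerical check of the identity
#   ∫_0^t h(s) dZ_s = h(t) Z_t − h(0) Z_0 − ∫_0^t Z_{s−} h'(s) ds
# when Z is a counting process: the left-hand side is just Σ_j h(σ_j).
h, h_prime = math.sin, math.cos
jumps = [0.3, 1.1, 2.4]  # hypothetical jump times σ_j of Z on [0, t]
t = 3.0

lhs = sum(h(sigma) for sigma in jumps)

def Z_minus(s):
    """Left limit Z_{s−}: number of jumps strictly before s."""
    return sum(1 for sigma in jumps if sigma < s)

# Midpoint-rule approximation of ∫_0^t Z_{s−} h'(s) ds.
n = 100_000
ds = t / n
riemann = sum(Z_minus((k + 0.5) * ds) * h_prime((k + 0.5) * ds) * ds
              for k in range(n))
rhs = h(t) * len(jumps) - h(0) * 0 - riemann
print(abs(lhs - rhs) < 1e-3)  # True
```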


Penalized maximum likelihood estimation

As a function of g ∈ W^{m,2}([0, t]) the minus-log-likelihood function reads

    l_t(g) = \int_0^t Y_s \phi\Big(\int_0^{s-} g(s - u)\,dZ_u\Big)\,ds - \int_0^t \log\Big(Y_s \phi\Big(\int_0^{s-} g(s - u)\,dZ_u\Big)\Big)\,N(ds)

We are optimizing the penalized minus-log-likelihood

    l_t(g) + \lambda \int_0^t (D^m g(s))^2\,ds

over W^{m,2}([0, t]).
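The roughness penalty λ ∫ (D^m g(s))² ds can be approximated from sampled values of g. A sketch for m = 2 via central second differences, checked against the closed form for the (hypothetical) choice g(s) = s³ on [0, 1], where D²g = 6s and ∫_0^1 (6s)² ds = 12.

```python
def roughness_penalty(g_vals, ds):
    """Approximate ∫ (D^2 g)(s)^2 ds (the m = 2 penalty) from equally
    spaced samples of g via central second differences."""
    second = [(g_vals[k - 1] - 2 * g_vals[k] + g_vals[k + 1]) / ds ** 2
              for k in range(1, len(g_vals) - 1)]
    return sum(d * d for d in second) * ds

t, n = 1.0, 2000
ds = t / n
g_vals = [(k * ds) ** 3 for k in range(n + 1)]
# For g(s) = s^3 on [0, 1] the exact value is 12.
print(round(roughness_penalty(g_vals, ds), 2))  # close to 12
```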


Estimates

[Figure: 28 × 28 sparsity pattern of the estimated connectivity matrix, with rows and columns labeled by units (v1.2, v2.1, ..., v16.1), together with estimated filters, e.g. v5.1 → v13.1 and v8.2, v4.2 → v13.1, plotted against time lag 0–0.5 s.]

A theorem

Let τ_1, ..., τ_{N_t} denote the jump times for N.

Theorem
If φ(x) = x + d with domain (−d, ∞), then a minimizer of the penalized minus-log-likelihood function over Θ((−d, ∞)) belongs to the finite-dimensional subspace of W^{m,2}([0, t]) spanned by the functions φ_1, ..., φ_m, the functions

    h_i(r) = \int_0^{\tau_i-} R_1(\tau_i - u, r)\,dZ_u

for i = 1, ..., N_t, together with the function

    f(r) = \int_0^t Y_s \int_0^s R_1(s - u, r)\,dZ_u\,ds = \int_0^t \int_u^t Y_s R_1(s - u, r)\,ds\,dZ_u.

Here

    R_1(s, r) = \int_0^{s \wedge r} \frac{(s - u)^{m-1}(r - u)^{m-1}}{((m-1)!)^2}\,du, \qquad \phi_k(t) = t^{k-1}/(k-1)!

Counting process integrals

If (Z_s)_{0≤s≤t} is a counting process with jumps σ_1, ..., σ_{Z_t}, the h_i basis functions are order 2m splines with knots in

    \{\tau_i - \sigma_j \mid i = 1, \dots, N_t,\ j : \sigma_j < \tau_i\}.
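The knot set above is cheap to compute directly; the jump times below are hypothetical.

```python
def spline_knots(taus, sigmas):
    """Knots {τ_i − σ_j : σ_j < τ_i} of the order-2m spline basis
    functions h_i, for jump times τ_i of N and σ_j of Z."""
    return sorted({tau - sigma for tau in taus
                   for sigma in sigmas if sigma < tau})

# Hypothetical jump times.
taus = [1.0, 2.0]    # jumps of N
sigmas = [0.5, 1.5]  # jumps of Z
print(spline_knots(taus, sigmas))  # [0.5, 1.5]
```

Note that distinct pairs can produce coinciding knots (here 1.0 − 0.5 and 2.0 − 1.5), so the knot set can be smaller than N_t × Z_t.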

Another theorem

Theorem
Suppose φ is continuously differentiable, and let

    \eta_i(r) = \int_0^{\tau_i-} R(\tau_i - u, r)\,dZ_u

and

    f_g(r) = \int_0^t \int_u^t Y_s \phi'\Big(\int_0^{s-} g(s - v)\,dZ_v\Big) R_1(s - u, r)\,ds\,dZ_u.

Then the gradient of l_t at g ∈ Θ(D) is

    \nabla l_t(g) = f_g - \sum_{i=1}^{N_t} \frac{\phi'\big(\int_0^{\tau_i-} g(\tau_i - u)\,dZ_u\big)}{\phi\big(\int_0^{\tau_i-} g(\tau_i - u)\,dZ_u\big)}\, \eta_i.

The problem with explosion

Given a predictable (candidate) intensity process (λ_t)_{t≥0}, does it define a point process? Yes, but the likelihood process

    L_t = \exp\Big(t - \int_0^t \lambda_s\,ds + \int_0^t \log \lambda_s\,N(ds)\Big)

may not be a martingale.

E_P(L_t) = 1 if and only if the intensity defines a point process that does not explode in [0, t].

We only have a dominated statistical model if we restrict our attention to combinations of φ and processes (X_t)_{t≥0} such that the likelihood process is a martingale.

For non-exploding data, L_t(θ) is, however, always sensible for relative comparisons of models.


Generalization of local independence

There are abstract generalizations of local independence to semi-martingales; of particular interest are the solutions to the SDE

    dX_t = G(X_t)\,dt + D\,dB_t

where (B_t)_{t≥0} is p-dimensional Brownian motion, D is diagonal and G : R^p → R^p.

We say that X^k_t is locally independent of X^i_t if G_k does not depend upon the i'th coordinate.
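A simple Euler–Maruyama sketch of the SDE dX_t = G(X_t) dt + D dB_t. The drift below is a hypothetical linear G(x) = Bx with lower-triangular B, so G_1 does not depend on the second coordinate, i.e. 2 ↛ 1 in the sense above; with D = 0 the scheme can be checked against the deterministic solution.

```python
import random

def euler_maruyama(G, D_diag, x0, T, n, seed=0):
    """Euler-Maruyama scheme for dX_t = G(X_t) dt + D dB_t with a
    diagonal diffusion matrix D (given by its diagonal)."""
    rng = random.Random(seed)
    dt = T / n
    x = list(x0)
    for _ in range(n):
        noise = [d * rng.gauss(0.0, 1.0) * dt ** 0.5 for d in D_diag]
        drift = G(x)
        x = [xi + gi * dt + ni for xi, gi, ni in zip(x, drift, noise)]
    return x

# Hypothetical 2-d linear drift G(x) = Bx with lower-triangular B:
B = [[-1.0, 0.0], [0.5, -1.0]]
G = lambda x: [sum(B[i][j] * x[j] for j in range(2)) for i in range(2)]
x_T = euler_maruyama(G, D_diag=[0.0, 0.0], x0=[1.0, 0.0], T=1.0, n=1000)
# With D = 0 the first coordinate solves dx = -x dt, so x_T[0] ≈ e^{-1}.
print(abs(x_T[0] - 2.718281828459045 ** -1) < 1e-2)  # True
```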


A Gaussian process

With G(x) = B(x − A) the solution to the SDE

dX_t = B(X_t − A)dt + D dB_t

becomes a Gaussian process with normal discrete time transitions

X_t ∼ N(ξ(x_0, t), Σ(t))

where

ξ(x, t) = A + e^{tB}(x − A)    (1)

Σ(t) = ∫_0^t e^{sB} D^2 e^{sB^T} ds    (2)

The minus-log-likelihood for discrete observations is

l_x(A, B, D) = Σ_{i=2}^n [ (x_{t_i} − ξ(x_{t_{i−1}}, Δ_i))^T Σ(Δ_i)^{−1} (x_{t_i} − ξ(x_{t_{i−1}}, Δ_i)) + log det Σ(Δ_i) ].

The pseudo minus-log-likelihood for discrete observations is

l_x(A, B) = Σ_{i=2}^n (x_{t_i} − ξ(x_{t_{i−1}}, Δ_i))^T (x_{t_i} − ξ(x_{t_{i−1}}, Δ_i)) = Σ_{i=2}^n ||x_{t_i} − A − e^{Δ_i B}(x_{t_{i−1}} − A)||^2.

The ℓ1-penalized pseudo minus-log-likelihood for discrete observations is

l_x(A, B) = Σ_{i=2}^n ||x_{t_i} − A − e^{Δ_i B}(x_{t_{i−1}} − A)||^2 + λ Σ_{i,j} |B_{ij}|.
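The transition mean ξ and the penalized pseudo-loss above can be sketched in a few lines. This is a minimal pure-Python illustration, not the talk's implementation; the helper names (`mat_exp`, `xi`, `penalized_pseudo_loss`) and the tiny p = 2 example are my own, and the matrix exponential is a truncated power series that is only adequate for small ||tB||.

```python
def mat_exp(B, terms=25):
    """Truncated power series exp(B) = sum_k B^k / k! (fine for small ||B||)."""
    n = len(B)
    E = [[float(i == j) for j in range(n)] for i in range(n)]  # starts at I
    P = [row[:] for row in E]                                  # running power B^k
    fact = 1.0
    for k in range(1, terms):
        P = [[sum(P[i][m] * B[m][j] for m in range(n)) for j in range(n)]
             for i in range(n)]
        fact *= k
        for i in range(n):
            for j in range(n):
                E[i][j] += P[i][j] / fact
    return E

def xi(A, B, x, t):
    """Conditional mean xi(x, t) = A + e^{tB}(x - A)."""
    E = mat_exp([[t * b for b in row] for row in B])
    d = [xk - ak for xk, ak in zip(x, A)]
    return [ak + sum(E[i][j] * d[j] for j in range(len(d)))
            for i, ak in enumerate(A)]

def penalized_pseudo_loss(A, B, xs, dts, lam):
    """l_x(A, B) = sum_i ||x_i - xi(x_{i-1}, dt_i)||^2 + lam * sum |B_ij|."""
    loss = 0.0
    for i in range(1, len(xs)):
        m = xi(A, B, xs[i - 1], dts[i - 1])
        loss += sum((xs[i][k] - m[k]) ** 2 for k in range(len(m)))
    return loss + lam * sum(abs(b) for row in B for b in row)

# Toy check: with A = 0 and B = -I the conditional mean decays towards A.
print(xi([0.0, 0.0], [[-1.0, 0.0], [0.0, -1.0]], [1.0, 1.0], 1.0))
```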

The non-linear penalized least squares problem

Minimization of

l_x(B) = Σ_{i=2}^n ||x_{t_i} − e^{Δ_i B} x_{t_{i−1}}||^2 + λ Σ_{i,j} |B_{ij}|

is a non-linear least squares problem.

A coordinate-wise Gauss-Newton method iteratively optimizes one coordinate at a time in the linearized least squares problem

l_x(B_0 + B) ≃ Σ_{i=2}^n ||r_i(B_0) − D_B e^{Δ_i B_0}(B) x_{t_{i−1}}||^2 + λ Σ_{i,j} |B_{ij}|

with residuals r_i(B) = x_{t_i} − e^{Δ_i B} x_{t_{i−1}}. The computations of e^{Δ_i B} and the derivative D_B e^{Δ_i B} dominate, and the problem does not "decouple" into regression problems for each coordinate.
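To make the coordinate-wise idea concrete, here is the scalar subproblem such a method solves on the linearized residuals: an ℓ1-penalized one-dimensional least squares fit, which has a closed-form soft-thresholding solution. This is an illustrative sketch with hypothetical helper names, not the talk's actual algorithm; in the talk's setting b would play the role of one entry B_{ij}, r the residuals r_i(B_0), and c the corresponding column of the linearized design.

```python
def soft_threshold(z, t):
    """Shrink z towards 0 by t: the prox operator of t * |.|."""
    return (z - t) if z > t else (z + t) if z < -t else 0.0

def coordinate_update(r, c, lam):
    """Minimize sum_i (r_i - c_i * b)^2 + lam * |b| over the scalar b.

    Setting the subgradient to zero gives
    b = soft_threshold(sum_i c_i r_i, lam / 2) / sum_i c_i^2.
    """
    cc = sum(ci * ci for ci in c)
    cr = sum(ci * ri for ci, ri in zip(c, r))
    return soft_threshold(cr, lam / 2.0) / cc

# Small penalty: shrunken least squares coefficient; large penalty: exact zero.
print(coordinate_update([1.0, 2.0], [1.0, 1.0], 1.0))
print(coordinate_update([1.0, 2.0], [1.0, 1.0], 10.0))
```

This closed form is what makes the ℓ1 penalty produce exact zeros coordinate by coordinate, even though, as noted above, the full problem does not decouple.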

Toy example: p = 20, A = 0, D = I

[Figure: the true 20 × 20 matrix B shown as a heatmap (values in [−1, 1]), its eigenvalues (real and imaginary parts), and the 400 entries of B plotted by index.]

Toy example, Lasso estimation with N = 100

[Figure: a sequence of heatmaps of the estimated 20 × 20 matrix (values in [−1, 1]) along the lasso regularization path.]

Toy example, Lasso estimation with N = 100

[Figure: a sequence of plots of the 400 estimated parameters by index (values in [−1, 1]) along the lasso regularization path.]

Is the discrete time AR-process not enough?

For equidistant observations the process is a vector AR(1)-process (with correlated noise)

X_i − X_{i−1} = (e^{ΔB} − I) X_{i−1} + ε_i

Is sparseness of B not related to sparseness of

e^{ΔB} − I = ΔB + O(Δ^2)?
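The expansion only holds to first order: the matrix exponential of a sparse B is generally dense, so e^{ΔB} − I acquires non-zero entries outside the sparsity pattern of B once Δ is not small. A minimal pure-Python check on a hypothetical 3 × 3 chain (my own toy example, not data from the talk), using a truncated power series for the exponential:

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(B, terms=20):
    """Truncated power series exp(B) = sum_k B^k / k!."""
    n = len(B)
    E = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    P = [row[:] for row in E]
    fact = 1.0
    for k in range(1, terms):
        P = mat_mul(P, B)
        fact *= k
        for i in range(n):
            for j in range(n):
                E[i][j] += P[i][j] / fact
    return E

# A sparse chain 1 -> 2 -> 3 with a stabilizing diagonal.
delta = 1.0
B = [[-1.0, 0.0, 0.0],
     [1.0, -1.0, 0.0],
     [0.0, 1.0, -1.0]]
E = mat_exp([[delta * b for b in row] for row in B])

# B[2][0] = 0, yet (e^{dB} - I)[2][0] = e^{-1}/2 != 0: the zero pattern
# of B is not preserved by the exponential.
print(E[2][0])
```

This is the motivation for penalizing B itself rather than the AR(1) coefficient matrix e^{ΔB} − I.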

Is the discrete time AR-process not enough?

[Figure: a sequence of plots of the 400 estimated parameters by index (values in [−1, 1]) for Δ = 0.5, 1, 1.5, …, 5.]

What about the (asymptotic) variance?

When the spectrum of B has no eigenvalues with positive real part,

Σ(t) → Γ = ∫_0^∞ e^{sB} D^2 e^{sB^T} ds

for t → ∞.

The variance Γ of the invariant distribution solves

(B ⊗ I + I ⊗ B) vec(Γ) = −vec(D^2).

If BB^T = B^T B and BD = DB then BΓ = ΓB and

Γ = −(B + B^T)^{−1} D^2.
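The commuting-case formula is easy to verify numerically. A small sketch with a hypothetical symmetric 2 × 2 drift and D = I (my own toy example, not from the talk), checking that Γ = −(B + B^T)^{−1} D^2 solves the Lyapunov equation BΓ + ΓB^T = −D^2:

```python
def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

B = [[-1.0, 0.5], [0.5, -1.0]]   # symmetric, eigenvalues -0.5 and -1.5 < 0
BpBT = [[2 * B[i][j] for j in range(2)] for i in range(2)]   # B + B^T = 2B
Gamma = [[-x for x in row] for row in inv2(BpBT)]            # -(B + B^T)^{-1}, D = I

# Lyapunov residual B Gamma + Gamma B^T; here B and Gamma are symmetric,
# so Gamma B^T = (B Gamma)^T.
BG = mul2(B, Gamma)
lyap = [[BG[i][j] + BG[j][i] for j in range(2)] for i in range(2)]
print(lyap)  # numerically -I, i.e. -D^2
```

For a general stable B one would instead solve the vectorized system (B ⊗ I + I ⊗ B) vec(Γ) = −vec(D^2) directly.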

Toy example asymptotic variance

[Figure: heatmap of the 20 × 20 asymptotic variance Γ for the toy example, values in [−0.03, 0.03].]

Acknowledgements

• Rune Berg and Susanne Ditlevsen, University of Copenhagen, for collaboration on the neuron data analysis.

• Lisbeth Carstensen, Albin Sandelin and Ole Winther, Bioinformatics Centre, University of Copenhagen, for collaboration on the use of Hawkes processes for modeling regulatory sites on genomes.

• Alexander Sokol, University of Copenhagen, who currently works on estimation of sparse SDEs.

• Patricia Reynaud-Bouret, Nice Sophia-Antipolis University, and Vincent Rivoirard, Université Paris-Sud (Orsay), with whom I work on oracle inequalities for the multivariate Hawkes process.
