University of Copenhagen
Faculty of Science

Multivariate sparse dynamic process modeling and inference

Niels Richard Hansen, Department of Mathematical Sciences
June 8, 2011

Slide 1/29
Spike tracks from turtle motor neurons

Turtle motor neurons are used as models for neuronal activity.

The (dead) turtle is stimulated by scratching, and recordings from one or more electrodes register the spikes.

Slide 2/29 — Niels Richard Hansen — Multivariate sparse dynamic process modeling and inference — June 8, 2011
Data

[Figure: raster plot of the recorded spike times for the five units over the interval 5–35 s.]

Slide 3/29
Data from stimulation period

[Figure: raster plot of the spike times for the five units during the stimulation period, 10–18 s.]

Slide 4/29
Point process modeling via intensities

We consider a filtered probability space (Ω, F, F_t, P) and a parametrized family (λ_t(θ))_{t≥0} of positive, predictable processes for θ ∈ Θ.

The minus-log-likelihood is

    l_t(θ) = ∫_0^t λ_s(θ) ds − ∫_0^t log λ_s(θ) N(ds)

We will study penalized maximum-likelihood estimation of the parameter θ.

Slide 5/29
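The minus-log-likelihood above can be evaluated directly once an intensity is specified. A minimal numerical sketch, assuming the simplest possible case of a constant intensity λ_s(θ) = θ (a homogeneous Poisson process) and hypothetical spike times; the grid minimizer recovers the closed-form MLE N_t/t:

```python
import numpy as np

def neg_log_lik(theta, spikes, t):
    """l_t(theta) for the constant intensity lambda_s(theta) = theta:
    the integral term theta * t minus log-intensity summed over the
    observed jump times (each jump contributes log(theta))."""
    return theta * t - len(spikes) * np.log(theta)

# Hypothetical spike train on [0, 10].
spikes = np.array([0.7, 1.9, 3.2, 4.1, 6.5, 8.8])
t = 10.0

# The grid minimizer should match the closed form N_t / t = 6 / 10.
grid = np.linspace(0.05, 3.0, 10000)
theta_hat = grid[np.argmin([neg_log_lik(th, spikes, t) for th in grid])]
```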
What are the goals?

λ_s(θ) = ?

• Use ensembles of spike patterns for decoding of external stimuli signals¹.
• Learn the connectivity graph for an ensemble of neurons from data².
• Learn from data how signals propagate at the spike level among connected neurons (the functional forms).
• Learn if the network and/or the functional forms are adaptive to stimuli.

¹ Pillow et al. Spatio-temporal correlations and visual signalling in a complete neuronal population. Nature, 454, 2008.
² Masud et al. Statistical technique for analysing functional connectivity of multiple spike trains. J. Neuroscience Meth., 196, 2011.

Slide 6/29
A self-exciting model

With (N_t)_{t≥0} the counting process for the spike times and τ_1, ..., τ_{N_t} the jumps, consider the model

    λ_t(g) = φ( ∑_{j: τ_j < t} g(t − τ_j) ) = φ( ∫_0^{t−} g(t − s) N(ds) )

For a multivariate counting process (N^i_t)_{t≥0, i=1,...,K},

    λ^k_t(g) = φ( ∑_{i=1}^K ∑_{j: τ^i_j < t} g^{ik}(t − τ^i_j) ),

which is the non-linear Hawkes process³. With φ(x) = x + d we get the linear Hawkes process.

³ Brémaud, P. and Massoulié, L. Stability of nonlinear Hawkes processes. Ann. Probab. 24(3), 1996.

Slide 7/29
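The multivariate intensity above is straightforward to evaluate at a fixed time point. A sketch, assuming hypothetical exponential filters g^{ik} and φ = exp; the filters, spike times, and rates are illustrative, not estimates from the talk:

```python
import numpy as np

def hawkes_intensity(t, spike_trains, g, phi):
    """Evaluate lambda^k_t = phi(sum_i sum_{j: tau^i_j < t} g[i][k](t - tau^i_j))
    for every component k; g[i][k] is the filter from unit i to unit k."""
    K = len(spike_trains)
    out = np.empty(K)
    for k in range(K):
        total = 0.0
        for i, taus in enumerate(spike_trains):
            past = taus[taus < t]          # only jumps strictly before t
            total += g[i][k](t - past).sum()
        out[k] = phi(total)
    return out

# Hypothetical two-unit network: unit 1 excites both, unit 2 inhibits itself.
g = [[lambda x: 0.5 * np.exp(-2 * x), lambda x: 0.2 * np.exp(-x)],
     [lambda x: np.zeros_like(x),     lambda x: -0.3 * np.exp(-x)]]
trains = [np.array([0.1, 0.5]), np.array([0.3])]
lam = hawkes_intensity(1.0, trains, g, np.exp)
```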
Intensities

[Figure: estimated linear filters for v13.1 and v5.1 and the resulting log-intensity over the interval 12.4–13.2 s.]

Example of estimated linear filters and log intensity with φ = exp.

Slide 8/29
Local independence

We say that N^k_t is locally independent of N^i_t if the F_t-intensity is σ((N^{−i}_s)_{s∈[0,t)})-adapted. We write i ↛ k, and this defines the oriented local independence graph G = ({1, ..., K}, E).

For the non-linear Hawkes process, (i, k) ∈ E if and only if g^{ik} ≠ 0.

The concept was introduced by Tore Schweder⁴ for finite state space Markov processes and extended to general point processes by Vanessa Didelez⁵.

The local independence graph has a causal connotation related to Granger causality, and it finds applications in the literature on causal inference.

⁴ Schweder, T. Composable Markov Processes. J. Appl. Prob. 7(2), 1970.
⁵ Didelez, V. Graphical models for marked point processes based on local independence. J. R. Statist. Soc. B 70(1), 2008.

Slide 9/29
Joint likelihood

If λ^i(θ_i) is parametrized by θ_i ∈ Θ_i, the joint minus-log-likelihood is

    ∑_{i=1}^K [ ∫_0^t λ^i_s(θ_i) ds − ∫_0^t log λ^i_s(θ_i) N^i(ds) ],

where the i'th bracketed term is l^i_t(θ_i); hence if we have variation independence of θ_1, ..., θ_K we need to minimize each l^i_t(θ_i) separately.

We can think of the i'th term as a likelihood for a (conditional) model of the i'th counting process, and we consider only such models.

Which class of models to consider?

Slide 10/29
Generalized linear point process models

(X_t)_{t≥0} is a predictable, cadlag process with values in V* (the dual of the vector space V), and

    Θ(D) = {β ∈ V | X_{s−}β ∈ D for all s ∈ [0, t] P-a.s.}.

φ: D → [0, ∞), and (Y_t)_{t≥0} is a predictable, cadlag process with values in [0, ∞).

Definition
A generalized linear point process model on [0, t] is the statistical model with parameter space Θ(D) such that for β ∈ Θ(D) the point process on [0, t] has intensity

    λ_s = Y_s φ(X_{s−}β)

for s ∈ [0, t].

Slide 11/29
Stochastic integrals as linear functionals

If g: [0, ∞) → R is a measurable, locally bounded function and (Z_t)_{t≥0} a semi-martingale, we can define the linear filter

    X_t g = ∫_0^t g(t − s) dZ_s

The function g ↦ X_t g is an ω-wise continuous linear functional on the Sobolev space W^{m,2}([0, t]) for m ≥ 1.

Proof by integration by parts:

    ∫_0^t h(s) dZ_s = h(t)Z_t − h(0)Z_0 − ∫_0^t Z_{s−} h'(s) ds

Slide 12/29
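The integration-by-parts identity can be checked numerically when Z is a counting process, in which case ∫_0^t h(s) dZ_s is just the sum of h over the jump times. A sketch with hypothetical jump times and a smooth test function h:

```python
import numpy as np

# Check: int_0^t h(s) dZ_s = h(t) Z_t - h(0) Z_0 - int_0^t Z_{s-} h'(s) ds
# for a counting process Z with unit jumps at hypothetical times sigma_j.
sigma = np.array([0.2, 0.9, 1.4, 2.7])   # jump times of Z on [0, t]
t = 3.0
h = lambda s: np.sin(s) + 2.0            # smooth test function
dh = lambda s: np.cos(s)                 # its derivative

# Left side: dZ_s puts unit mass at each jump time.
lhs = h(sigma).sum()

# Right side: Z_0 = 0, so the h(0) Z_0 term vanishes; Z_{s-} is the
# number of jumps strictly before s, and the integral is a trapezoid sum.
s_grid = np.linspace(0.0, t, 300001)
ds = s_grid[1] - s_grid[0]
Z_left = np.searchsorted(sigma, s_grid, side="left")   # #{j: sigma_j < s}
f = Z_left * dh(s_grid)
integral = np.sum((f[1:] + f[:-1]) / 2) * ds
rhs = h(t) * len(sigma) - integral
```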
Penalized maximum likelihood estimation

As a function of g ∈ W^{m,2}([0, t]) the minus-log-likelihood function reads

    l_t(g) = ∫_0^t Y_s φ( ∫_0^{s−} g(s − u) dZ_u ) ds − ∫_0^t log( Y_s φ( ∫_0^{s−} g(s − u) dZ_u ) ) N(ds)

We are optimizing the penalized minus-log-likelihood

    l_t(g) + λ ∫_0^t (D^m g(s))² ds

over W^{m,2}([0, t]).

Slide 13/29
Estimates

[Figure: 28 × 28 image plot of the estimated connectivity among the recorded units (v1.2, v2.1, ..., v16.1), together with estimated filter functions for selected pairs (e.g. v5.1 → v13.1; v8.2, v4.2 → v13.1) plotted against time lag 0–0.5 s.]

Slide 14/29
A theorem

Let τ_1, ..., τ_{N_t} denote the jump times for N.

Theorem
If φ(x) = x + d with domain (−d, ∞), then a minimizer of the penalized minus-log-likelihood function over Θ((−d, ∞)) belongs to the finite dimensional subspace of W^{m,2}([0, t]) spanned by the functions φ_1, ..., φ_m, the functions

    h_i(r) = ∫_0^{τ_i−} R_1(τ_i − u, r) dZ_u

for i = 1, ..., N_t, together with the function

    f(r) = ∫_0^t Y_s ∫_0^s R_1(s − u, r) dZ_u ds = ∫_0^t ∫_u^t Y_s R_1(s − u, r) ds dZ_u.

Here

    R_1(s, r) = ∫_0^{s∧r} (s − u)^{m−1} (r − u)^{m−1} / ((m − 1)!)² du,    φ_k(t) = t^{k−1}/(k − 1)!

Slide 15/29
Counting process integrals

If (Z_s)_{0≤s≤t} is a counting process with jumps σ_1, ..., σ_{Z_t}, the h_i basis functions are order 2m splines with knots in

    {τ_i − σ_j | i = 1, ..., N_t, j: σ_j < τ_i}.

Slide 16/29
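The knot set is simple to enumerate. A sketch with hypothetical jump times τ (of N) and σ (of Z); the rounding only guards against floating-point noise in the differences:

```python
import numpy as np

# Knots {tau_i - sigma_j | i = 1..N_t, j: sigma_j < tau_i} for
# hypothetical jump times of N (tau) and of Z (sigma).
tau = np.array([0.5, 1.2, 2.0])
sigma = np.array([0.3, 0.9, 1.7])

knots = sorted({round(ti - sj, 12)
                for ti in tau for sj in sigma if sj < ti})
```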
Another theorem

Theorem
If φ is continuously differentiable,

    η_i(r) = ∫_0^{τ_i−} R_1(τ_i − u, r) dZ_u

and

    f_g(r) = ∫_0^t ∫_u^t Y_s φ'( ∫_0^{s−} g(s − v) dZ_v ) R_1(s − u, r) ds dZ_u,

then the gradient of l_t at g ∈ Θ(D) is

    ∇l_t(g) = f_g − ∑_{i=1}^{N_t} [ φ'( ∫_0^{τ_i−} g(τ_i − u) dZ_u ) / φ( ∫_0^{τ_i−} g(τ_i − u) dZ_u ) ] η_i.

Slide 17/29
The problem with explosion

Given a predictable (candidate) intensity process (λ_t)_{t≥0}, does it define a point process?

Yes, but the likelihood process

    L_t = exp( t − ∫_0^t λ_s ds + ∫_0^t log λ_s N(ds) )

may not be a martingale.

E_P(L_t) = 1 if and only if the intensity defines a point process that does not explode in [0, t].

We only have a dominated statistical model if we restrict our attention to combinations of φ and processes (X_t)_{t≥0} such that the likelihood process is a martingale.

For non-exploding data, L_t(θ) is, however, always sensible for relative comparisons of models.

Slide 18/29
Generalization of local independence

There are abstract generalizations of local independence to semi-martingales; of particular interest are the solutions to the SDE

    dX_t = G(X_t) dt + D dB_t

where (B_t)_{t≥0} is a p-dimensional Brownian motion, D is diagonal and G: R^p → R^p.

We say that X^k_t is locally independent of X^i_t if G_k does not depend upon the i'th coordinate.

Slide 19/29
A Gaussian process

With G(x) = B(x − A), the solution to the SDE

    dX_t = B(X_t − A)dt + D dB_t

is a Gaussian process with normal discrete time transitions

    X_t ∼ N(ξ(x_0, t), Σ(t)),

where

    ξ(x, t) = A + e^{tB}(x − A)                 (1)
    Σ(t) = ∫_0^t e^{sB} D² e^{sB^T} ds          (2)

The minus-log-likelihood for discrete observations is

    l_x(A, B, D) = ∑_{i=2}^n [ (x_{t_i} − ξ(x_{t_{i−1}}, Δ_i))^T Σ(Δ_i)^{−1} (x_{t_i} − ξ(x_{t_{i−1}}, Δ_i)) + log det Σ(Δ_i) ].

The pseudo minus-log-likelihood for discrete observations is

    l_x(A, B) = ∑_{i=2}^n (x_{t_i} − ξ(x_{t_{i−1}}, Δ_i))^T (x_{t_i} − ξ(x_{t_{i−1}}, Δ_i))
              = ∑_{i=2}^n ||x_{t_i} − A − e^{Δ_i B}(x_{t_{i−1}} − A)||².

The ℓ1-penalized pseudo minus-log-likelihood for discrete observations is

    l_x(A, B) = ∑_{i=2}^n ||x_{t_i} − A − e^{Δ_i B}(x_{t_{i−1}} − A)||² + λ ∑_{i,j} |B_{ij}|.

Slide 20/29 — Niels Richard Hansen — Multivariate sparse dynamic process modeling and inference — June 8, 2011
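The transition moments (1) and (2) can be evaluated numerically with the matrix exponential. The following Python sketch is my own illustration, not from the slides: it computes ξ(x_0, t) directly and Σ(t) with Van Loan's block-matrix trick, assuming SciPy is available.

```python
import numpy as np
from scipy.linalg import expm

def transition_moments(A, B, D, x0, t):
    """Mean xi(x0, t) and covariance Sigma(t) of X_t given X_0 = x0
    for the SDE dX_t = B(X_t - A)dt + D dB_t."""
    p = B.shape[0]
    mean = A + expm(t * B) @ (x0 - A)        # xi(x0, t) = A + e^{tB}(x0 - A)
    # Van Loan's trick: Sigma(t) = int_0^t e^{sB} D^2 e^{sB^T} ds is read off
    # the exponential of the block matrix [[-B, D^2], [0, B^T]].
    M = np.block([[-B, D @ D], [np.zeros((p, p)), B.T]])
    E = expm(t * M)
    cov = E[p:, p:].T @ E[:p, p:]            # e^{tB} @ (e^{-tB} Sigma(t))
    return mean, cov
```

In the scalar case B = −1, A = 0, D = 1, this reproduces the textbook Ornstein-Uhlenbeck variance (1 − e^{−2t})/2.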
The non-linear penalized least squares problem

Minimization of

    l_x(B) = ∑_{i=2}^n ||x_{t_i} − e^{Δ_i B} x_{t_{i−1}}||² + λ ∑_{i,j} |B_{ij}|

is a non-linear least squares problem.

A coordinate-wise Gauss-Newton method iteratively optimizes one coordinate at a time in the linearized least squares problem

    l_x(B_0 + B) ≃ ∑_{i=2}^n ||r_i(B_0) − D_B e^{Δ_i B_0}(B) x_{t_{i−1}}||² + λ ∑_{i,j} |B_{ij}|,

where r_i(B) = x_{t_i} − e^{Δ_i B} x_{t_{i−1}}. The computations of e^{Δ_i B} and D_B e^{Δ_i B} dominate, and the problem does not “decouple” into regression problems for each coordinate.

Slide 21/29 — Niels Richard Hansen — Multivariate sparse dynamic process modeling and inference — June 8, 2011
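The two dominating quantities, e^{Δ_i B_0} and the directional derivative D_B e^{Δ_i B_0}(V), can be obtained together. A minimal Python sketch of one linearized residual, assuming SciPy's expm_frechet (the function and variable names are mine):

```python
import numpy as np
from scipy.linalg import expm_frechet

def linearized_residual(B0, V, x_prev, x_next, delta):
    """Residual of x_next - e^{delta(B0 + V)} x_prev, linearized around B0:
    e^{delta(B0 + V)} is approximated by e^{delta B0} + D_B e^{delta B0}(V)."""
    # One call returns e^{delta B0} and its Frechet derivative in
    # direction delta * V.
    E, dE = expm_frechet(delta * B0, delta * V)
    return x_next - (E + dE) @ x_prev
```

In the coordinate-wise scheme, V would range over scalar multiples of the coordinate directions e_i e_j^T.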
Toy example: p = 20, A = 0, D = I

[Figure: the true 20 × 20 matrix B as a heat map (values from −1.0 to 1.0), its eigenvalues (real and imaginary parts, ranging over −1.5 to 1.5) plotted by index, and the 400 parameters of B plotted by index.]

Slide 22/29 — Niels Richard Hansen — Multivariate sparse dynamic process modeling and inference — June 8, 2011
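A minimal simulation of such a toy setup (my own sketch, not from the slides; the sparsity pattern, the diagonal stabilization, and the step sizes are ad hoc choices, and Euler-Maruyama is used in place of exact discretization):

```python
import numpy as np

rng = np.random.default_rng(0)

p, sparsity = 20, 0.1
# Sparse entries, then a strongly negative diagonal: by Gershgorin's theorem
# the resulting B has all eigenvalues with negative real part (stable).
B = np.where(rng.random((p, p)) < sparsity, rng.uniform(-1, 1, (p, p)), 0.0)
B -= (np.abs(B).sum(axis=1) + 0.5) * np.eye(p)

def simulate(B, n_obs=100, dt=0.1, n_sub=50):
    """Euler-Maruyama for dX_t = B X_t dt + dB_t on a fine grid,
    recorded every n_sub steps."""
    p = B.shape[0]
    h = dt / n_sub
    x = np.zeros(p)
    out = [x.copy()]
    for _ in range(n_obs):
        for _ in range(n_sub):
            x = x + h * (B @ x) + np.sqrt(h) * rng.standard_normal(p)
        out.append(x.copy())
    return np.array(out)

X = simulate(B)   # (n_obs + 1) x p array of discrete observations
```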
Toy example, Lasso estimation with N = 100

[Figure: a sequence of heat maps of the 20 × 20 Lasso estimates of B (values from −1.0 to 1.0).]

Slide 23/29 — Niels Richard Hansen — Multivariate sparse dynamic process modeling and inference — June 8, 2011
Toy example, Lasso estimation with N = 100

[Figure: a sequence of plots of the 400 estimated parameters (values from −1.0 to 1.0) by index.]

Slide 24/29 — Niels Richard Hansen — Multivariate sparse dynamic process modeling and inference — June 8, 2011
Is the discrete time AR-process not enough?

For equidistant observations the process is a vector AR(1)-process (with correlated noise):

    X_i − X_{i−1} = (e^{ΔB} − I)X_{i−1} + ε_i.

Is sparseness of B not related to sparseness of

    e^{ΔB} − I = ΔB + O(Δ²)?

Slide 25/29 — Niels Richard Hansen — Multivariate sparse dynamic process modeling and inference — June 8, 2011
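A small numerical illustration of the issue (my own example, not from the slides): for a sparse chain-structured B, the entries of e^{ΔB} − I fill in along all paths in B, and the extra entries vanish only at rate O(Δ²) and faster as Δ → 0.

```python
import numpy as np
from scipy.linalg import expm

# A sparse drift: a chain 0 -> 1 -> 2 -> 3 with a stabilizing diagonal.
B = np.zeros((4, 4))
B[0, 1] = B[1, 2] = B[2, 3] = 1.0
np.fill_diagonal(B, -1.0)

for delta in (1.0, 0.1, 0.01):
    T = expm(delta * B) - np.eye(4)
    # B[0, 2] = B[0, 3] = 0, but the two-step and three-step paths fill in:
    print(delta, T[0, 2], T[0, 3])
```

The fill-in entries T[0, 2] and T[0, 3] are of order Δ²/2 and Δ³/6, so thresholding e^{ΔB} − I recovers the sparsity pattern of B only for small Δ.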
Is the discrete time AR-process not enough?

[Figure: a sequence of plots of the 400 estimated parameters (values from −1.0 to 1.0) by index, with panel labels 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5.]

Slide 26/29 — Niels Richard Hansen — Multivariate sparse dynamic process modeling and inference — June 8, 2011
What about the (asymptotic) variance?

When all eigenvalues of B have negative real part,

    Σ(t) → Γ = ∫_0^∞ e^{sB} D² e^{sB^T} ds    for t → ∞.

The variance Γ of the invariant distribution solves the Lyapunov equation

    BΓ + ΓB^T = −D²,    or in vectorized form    (I ⊗ B + B ⊗ I) vec(Γ) = −vec(D²).

If BB^T = B^T B and BD = DB, then BΓ = ΓB and

    Γ = −(B + B^T)^{−1} D².

Slide 27/29 — Niels Richard Hansen — Multivariate sparse dynamic process modeling and inference — June 8, 2011
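In practice Γ is obtained by solving the Lyapunov equation rather than evaluating the integral. A sketch assuming SciPy's solve_continuous_lyapunov (the example matrices are mine):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

B = np.array([[-1.0, 0.5],
              [0.0, -2.0]])
D = np.eye(2)
# Gamma solves B Gamma + Gamma B^T = -D^2
# (solve_continuous_lyapunov solves a x + x a^H = q).
Gamma = solve_continuous_lyapunov(B, -D @ D)
residual = B @ Gamma + Gamma @ B.T + D @ D   # should be ~0
```

For normal B commuting with D, e.g. B diagonal, this agrees with the closed form Γ = −(B + B^T)^{−1}D².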
Toy example asymptotic variance

[Figure: the 20 × 20 asymptotic variance Γ as a heat map (values from −0.03 to 0.03).]

Slide 28/29 — Niels Richard Hansen — Multivariate sparse dynamic process modeling and inference — June 8, 2011
Acknowledgements

• Rune Berg and Susanne Ditlevsen, University of Copenhagen, for collaboration on the neuron data analysis.

• Lisbeth Carstensen, Albin Sandelin and Ole Winther, Bioinformatics Centre, University of Copenhagen, for collaboration on the use of Hawkes processes for modeling regulatory sites on genomes.

• Alexander Sokol, University of Copenhagen, who currently works on estimation of sparse SDEs.

• Patricia Reynaud-Bouret, Nice Sophia-Antipolis University, and Vincent Rivoirard, Université Paris-Sud, Orsay, with whom I work on oracle inequalities for the multivariate Hawkes process.

Slide 29/29 — Niels Richard Hansen — Multivariate sparse dynamic process modeling and inference — June 8, 2011