
Supplementary material to “Modeling, simulation and inference for multivariate time series of counts using trawl processes”

Almut E. D. Veraart

Department of Mathematics, Imperial College London

180 Queen’s Gate, London, SW7 2AZ, [email protected]

Abstract

Here we present supplementary material to the article Modeling, simulation and inference for multivariate time series of counts using trawl processes.

Keywords: Count data, continuous time modeling of multivariate time series, trawl processes, infinitely divisible, Poisson mixtures, multivariate negative binomial law, limit order book

Mathematics Subject Classification: 60G10, 60G55, 60E07, 62M10, 62P05

Contents

1 Introduction and outline
2 Proofs
3 Additions to Section 3 (parametric specifications)
    3.1 Parametric specifications of the trawl function
    3.2 Modeling the cross-sectional dependence
4 Additions to the empirical and simulation study
    4.1 Empirical results
        4.1.1 Estimating the trawl functions
        4.1.2 Estimating the bivariate negative binomial law
    4.2 Details regarding the simulation study
        4.2.1 Simulating from the bivariate logarithmic series distribution
        4.2.2 Simulating the bivariate trawl process
        4.2.3 Simulation results
5 Likelihood inference in the Poisson mixture model
    5.1 Finding the relation between $L^{(k)}(A)$, $L^{(\ell)}(A)$ and $X_k$, $X_\ell$
6 Extension of the Poisson mixture model to allow for bivariate interactions
    6.1 Negative binomial marginal law
    6.2 Pairwise likelihood
        6.2.1 Computing the pairwise probabilities
        6.2.2 The final pairwise likelihood function
    6.3 Likelihood computations in the univariate case

Preprint submitted to Journal of Multivariate Analysis August 24, 2018

1. Introduction and outline

This supplementary material provides additional details about the material covered in the main article. The outline is as follows. First, Section 2 contains the proofs of the theoretical results from the main article. Next, Section 3 gives additional details on possible parametric specifications of the multivariate trawl process. In particular, it presents another interesting specification of the trawl function, which we call a supIG-trawl. It also describes how a multivariate Poisson trawl process can be constructed with different types of dependence structures ranging from a common factor to pairwise-interaction terms. Section 4 contains many additional graphics from the empirical study, where we fitted six different multivariate trawl processes and compared their goodness of fit (only the best fit is presented in the main article). It also gives a detailed description of the simulation design and explains the parametric bootstrap used to construct the 95% confidence bounds in the main article. Section 5 contains the derivation of the pairwise likelihood for the Poisson mixture model, which could be used as an alternative inference method to the (generalized) method of moments proposed in the main article. Finally, Section 6 extends the Poisson mixture model from the common factor construction to a model which allows for pairwise interaction terms. As such it is particularly relevant for multivariate applications beyond a bivariate setting. We derive the theoretical details (in particular the compound Poisson representation which is needed for simulations) and also present the corresponding pairwise likelihood.

2. Proofs

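Proof of Proposition 1. Using the properties of the Lévy basis, we immediately obtain that

\[
E\{\exp(i\theta^\top Y_t)\} = \exp\left( \int_{\mathbb{R}^n \times [0,1] \times \mathbb{R}} \left\{ \exp\left( i \sum_{j=1}^{n} \theta_j I_{A^{(j)}}(x, s-t)\, y_j \right) - 1 \right\} \nu(dy)\, dx\, ds \right).
\]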

The expression for the characteristic function can be further simplified by using a partition $P = \{S_1, \dots, S_{n_P}\}$ of $A := \cup_{i=1}^{n} A^{(i)}$, as defined in the main text of the article. Then

\[
\theta^\top Y_t = \sum_{j=1}^{n} \theta_j L^{(j)}(A^{(j)}_t) = \sum_{j=1}^{n} \theta_j \sum_{k:\, S_k \subset A^{(j)}} L^{(j)}(S_k) = \sum_{k=1}^{n_P} \sum_{1 \le j \le n:\, A^{(j)} \supset S_k} \theta_j L^{(j)}(S_k). \tag{1}
\]

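Finally, combining (1) with the fact that a Lévy basis is independently scattered, we obtain that

\[
\begin{aligned}
E(\exp(i\theta^\top Y_t)) &= E\left\{ \exp\left( i \sum_{k=1}^{n_P} \sum_{1 \le j \le n:\, A^{(j)} \supset S_k} \theta_j L^{(j)}(S_k) \right) \right\} \\
&= \exp\left( \sum_{k=1}^{n_P} \log E\left\{ \exp\left( i \sum_{1 \le j \le n:\, A^{(j)} \supset S_k} \theta_j L^{(j)}(S_k) \right) \right\} \right) \\
&= \exp\left( \sum_{k=1}^{n_P} \mathrm{Leb}(S_k)\, C_{(L'^{(j)})_{j \in J(k)}}\bigl((\theta_j)_{j \in J(k)}\bigr) \right).
\end{aligned}
\]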

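Proof of Proposition 4. The joint law is given by

\[
\begin{aligned}
&\Pr(X_1 = x_1, \dots, X_n = x_n) \\
&= \int_{(0,\infty)^{n+1}} \Pr(X_1 = x_1, \dots, X_n = x_n \mid U = u, V_1 = v_1, \dots, V_n = v_n)\, f_U(u) f_{V_1}(v_1) \cdots f_{V_n}(v_n)\, du\, dv_1 \cdots dv_n \\
&= \int_{(0,\infty)^{n+1}} f_U(u) \prod_{i=1}^{n} e^{-(\alpha_i u + v_i)} \frac{(\alpha_i u + v_i)^{x_i}}{x_i!} f_{V_i}(v_i)\, du\, dv_1 \cdots dv_n \\
&= \int_{(0,\infty)^{n+1}} f_U(u) \prod_{i=1}^{n} e^{-(\alpha_i u + v_i)} \frac{1}{x_i!} \sum_{j_i=0}^{x_i} \binom{x_i}{j_i} \alpha_i^{j_i} u^{j_i} v_i^{x_i - j_i} f_{V_i}(v_i)\, du\, dv_1 \cdots dv_n \\
&= \frac{1}{x_1! \cdots x_n!} \sum_{j_1=0}^{x_1} \cdots \sum_{j_n=0}^{x_n} \binom{x_1}{j_1} \cdots \binom{x_n}{j_n} \alpha_1^{j_1} \cdots \alpha_n^{j_n}\, E\bigl(U^{j_1 + \cdots + j_n} e^{-(\alpha_1 + \cdots + \alpha_n) U}\bigr) \prod_{k=1}^{n} E\bigl(V_k^{x_k - j_k} e^{-V_k}\bigr).
\end{aligned}
\]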
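The finite-sum representation above can be evaluated directly once the mixed moments of $U$ and $V_k$ are available in closed form. The following is a minimal sketch, assuming gamma laws $U \sim \Gamma(\kappa, 1)$ and $V_i \sim \Gamma(\kappa_i, 1/\alpha_i)$ in a shape/rate parametrization (the specification used for the empirical models later in this supplement); the function names `tilted_gamma_moment`, `joint_pmf` and `nb_pmf` are hypothetical and chosen for illustration only.

```python
import itertools
import math


def tilted_gamma_moment(j, a, shape, rate):
    """Return E[W**j * exp(-a * W)] for W ~ Gamma(shape, rate)."""
    log_value = (
        shape * math.log(rate)
        + math.lgamma(shape + j)
        - math.lgamma(shape)
        - (shape + j) * math.log(rate + a)
    )
    return math.exp(log_value)


def joint_pmf(x, alpha, kappa, kappa_v):
    """Evaluate Pr(X_1 = x_1, ..., X_n = x_n) via the finite sum of Proposition 4.

    Assumption: U ~ Gamma(kappa, rate 1), V_i ~ Gamma(kappa_v[i], rate 1/alpha[i]).
    """
    alpha_sum = sum(alpha)
    total = 0.0
    # Loop over all index vectors (j_1, ..., j_n) with 0 <= j_i <= x_i.
    for js in itertools.product(*(range(xi + 1) for xi in x)):
        term = tilted_gamma_moment(sum(js), alpha_sum, kappa, 1.0)
        for xi, ji, ai, kvi in zip(x, js, alpha, kappa_v):
            term *= math.comb(xi, ji) * ai ** ji
            term *= tilted_gamma_moment(xi - ji, 1.0, kvi, 1.0 / ai)
        total += term
    return total / math.prod(math.factorial(xi) for xi in x)


def nb_pmf(x, m, theta):
    """Negative binomial pmf with mean m * theta / (1 - theta)."""
    log_value = (
        math.lgamma(m + x) - math.lgamma(m) - math.lgamma(x + 1)
        + x * math.log(theta) + m * math.log(1.0 - theta)
    )
    return math.exp(log_value)
```

As a sanity check under these assumptions, summing `joint_pmf((x1, x2), ...)` over `x2` should reproduce the marginal law NB($\kappa + \kappa_1$, $\alpha_1/(1+\alpha_1)$) of the first component.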

Proof of Proposition 5. Let $M_U$ and $M_{V_i}$ denote the moment generating functions of $U$ and $V_i$, respectively. According to [2, equation (5.1)], the probability generating function of $(X_1, \dots, X_n)$ is given by

\[
G(t_1, \dots, t_n) = E\bigl(t_1^{X_1} \cdots t_n^{X_n}\bigr) = M_U\left( \sum_{i=1}^{n} \alpha_i (t_i - 1) \right) \prod_{i=1}^{n} M_{V_i}(t_i - 1),
\]

for $t_1, \dots, t_n \in \mathbb{R}$ with $\max_{1 \le i \le n} |t_i| < 1$. Hence, the corresponding Laplace transform for positive $\theta$ is given by

\[
L(\theta_1, \dots, \theta_n) = G(e^{-\theta_1}, \dots, e^{-\theta_n}) = M_U\left( \sum_{i=1}^{n} \alpha_i (e^{-\theta_i} - 1) \right) \prod_{i=1}^{n} M_{V_i}(e^{-\theta_i} - 1). \tag{2}
\]

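The aim is to find $v$ and $L_C(\theta)$ by equating equation (2) from the main article and (2) above. Using the relation between the Laplace and the moment generating function, we deduce that

\[
\begin{aligned}
L(\theta_1, \dots, \theta_n) &= M_U\left( \sum_{i=1}^{n} \alpha_i (e^{-\theta_i} - 1) \right) \prod_{i=1}^{n} M_{V_i}(e^{-\theta_i} - 1) \\
&= L_U\left( \sum_{i=1}^{n} \alpha_i (1 - e^{-\theta_i}) \right) \prod_{i=1}^{n} L_{V_i}(1 - e^{-\theta_i}) \\
&= \exp\left\{ \ln L_U\left( \sum_{i=1}^{n} \alpha_i (1 - e^{-\theta_i}) \right) + \sum_{i=1}^{n} \ln L_{V_i}(1 - e^{-\theta_i}) \right\}.
\end{aligned}
\]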

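We use the notation $K = \ln L$ for the so-called kumulant function. Since $U$ is a subordinator without drift, we have that

\[
\begin{aligned}
K_U\left( \sum_{i=1}^{n} \alpha_i (1 - e^{-\theta_i}) \right) &= \int_{\mathbb{R}} \left\{ e^{-\sum_{i=1}^{n} \alpha_i (1 - e^{-\theta_i}) x} - 1 \right\} \nu_U(dx) \\
&= \int_{\mathbb{R}} \left\{ e^{-\sum_{i=1}^{n} \alpha_i x} - e^{-\sum_{i=1}^{n} \alpha_i x} + e^{-\sum_{i=1}^{n} \alpha_i (1 - e^{-\theta_i}) x} - 1 \right\} \nu_U(dx) \\
&= \int_{\mathbb{R}} \left( e^{-\sum_{i=1}^{n} \alpha_i x} - 1 \right) \nu_U(dx) + \int_{\mathbb{R}} e^{-\sum_{i=1}^{n} \alpha_i x} \left( e^{\sum_{i=1}^{n} \alpha_i e^{-\theta_i} x} - 1 \right) \nu_U(dx).
\end{aligned}
\]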

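Note that

\[
e^{\sum_{i=1}^{n} \alpha_i e^{-\theta_i} x} - 1 = \sum_{k=1}^{\infty} \frac{1}{k!} \left( \sum_{i=1}^{n} \alpha_i e^{-\theta_i} x \right)^{k} = \sum_{k=1}^{\infty} \frac{1}{k!} \left( \sum_{i=1}^{n} \alpha_i e^{-\theta_i} \right)^{k} x^k.
\]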

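We set $\alpha := \sum_{i=1}^{n} \alpha_i$. Then

\[
\begin{aligned}
\int_{\mathbb{R}} e^{-\sum_{i=1}^{n} \alpha_i x} \left( e^{\sum_{i=1}^{n} \alpha_i e^{-\theta_i} x} - 1 \right) \nu_U(dx) &= \int_{\mathbb{R}} e^{-\alpha x} \sum_{k=1}^{\infty} \frac{1}{k!} \left( \sum_{i=1}^{n} \alpha_i e^{-\theta_i} \right)^{k} x^k\, \nu_U(dx) \\
&= \sum_{k=1}^{\infty} \left( \sum_{i=1}^{n} \alpha_i e^{-\theta_i} \right)^{k} \underbrace{\frac{1}{k!} \int_{\mathbb{R}} e^{-\alpha x} x^k\, \nu_U(dx)}_{=:\, q_k^{(U)}}.
\end{aligned}
\]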

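I.e.,

\[
\begin{aligned}
K_U\left( \sum_{i=1}^{n} \alpha_i (1 - e^{-\theta_i}) \right) &= \int_{\mathbb{R}} \left( e^{-\alpha x} - 1 \right) \nu_U(dx) + \sum_{k=1}^{\infty} \left( \sum_{i=1}^{n} \alpha_i e^{-\theta_i} \right)^{k} q_k^{(U)} \\
&= K_U(\alpha) + \sum_{k=1}^{\infty} \left( \sum_{i=1}^{n} \alpha_i e^{-\theta_i} \right)^{k} q_k^{(U)}.
\end{aligned}
\]

Similarly,

\[
\sum_{i=1}^{n} K_{V_i}(1 - e^{-\theta_i}) = \sum_{i=1}^{n} \left( K_{V_i}(1) + \sum_{k=1}^{\infty} e^{-\theta_i k} q_k^{(V_i)} \right), \quad \text{where } q_k^{(V_i)} = \int_{\mathbb{R}} \frac{x^k}{k!} e^{-x}\, \nu_{V_i}(dx).
\]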

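So, overall we have

\[
\begin{aligned}
K_X(\theta) &= \ln L(\theta_1, \dots, \theta_n) \\
&= \left( K_U(\alpha) + \sum_{i=1}^{n} K_{V_i}(1) \right) + \sum_{k=1}^{\infty} \left( \sum_{i=1}^{n} \alpha_i e^{-\theta_i} \right)^{k} q_k^{(U)} + \sum_{i=1}^{n} \sum_{k=1}^{\infty} e^{-\theta_i k} q_k^{(V_i)} \\
&= -v + v L_C(\theta),
\end{aligned}
\]

if and only if

\[
v = -\left( K_U(\alpha) + \sum_{i=1}^{n} K_{V_i}(1) \right), \qquad
L_C(\theta) = \frac{1}{v} \left\{ \sum_{k=1}^{\infty} \left( \sum_{i=1}^{n} \alpha_i e^{-\theta_i} \right)^{k} q_k^{(U)} + \sum_{i=1}^{n} \sum_{k=1}^{\infty} e^{-\theta_i k} q_k^{(V_i)} \right\}.
\]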
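As a worked illustration of these formulas, consider the fully dependent case without independent factors ($V_i \equiv 0$) and assume that $U \sim \Gamma(\kappa, 1)$, whose Lévy measure is $\nu_U(dx) = \kappa x^{-1} e^{-x}\,dx$ on $(0, \infty)$. The gamma choice and the abbreviation $p_i := \alpha_i/(1+\alpha)$ are assumptions made for this example only; under them the intensity $v$ and the jump size law can be written in closed form.

```latex
\begin{align*}
q_k^{(U)} &= \frac{1}{k!} \int_0^\infty e^{-\alpha x} x^k \, \kappa x^{-1} e^{-x} \, dx
           = \frac{\kappa \, (k-1)!}{k! \, (1+\alpha)^k}
           = \frac{\kappa}{k \, (1+\alpha)^k}, \\
K_U(\alpha) &= \int_0^\infty \bigl(e^{-\alpha x} - 1\bigr) \kappa x^{-1} e^{-x} \, dx
           = -\kappa \ln(1+\alpha),
           \qquad\text{so } v = \kappa \ln(1+\alpha), \\
L_C(\theta) &= \frac{1}{v} \sum_{k=1}^\infty \frac{\kappa}{k}
             \Bigl( \sum_{i=1}^n p_i e^{-\theta_i} \Bigr)^{k}
           = \frac{\ln\bigl(1 - \sum_{i=1}^n p_i e^{-\theta_i}\bigr)}
                  {\ln\bigl(1 - \sum_{i=1}^n p_i\bigr)},
           \qquad p_i := \frac{\alpha_i}{1+\alpha}.
\end{align*}
```

The last equality uses $1 - \sum_{i} p_i = 1/(1+\alpha)$. The resulting $L_C$ is the Laplace transform of a multivariate logarithmic series distribution with parameters $p_1, \dots, p_n$, and $L_C(0) = 1$ as required for a probability law.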

Proof of Proposition 6. The requirement that $\mathrm{Leb}(A^{(i)}) < \infty$ implies that $\mathrm{Leb}(\{(x, s) : s \le 0,\ 0 \le x \le g^{(i)}(s - t)\}) \to 0$ as $t \to \infty$. Since a Lévy basis is countably additive (in the sense that for any sequence $A_n \downarrow \emptyset$ of Borel sets with bounded Lebesgue measure, $L^{(i)}(A_n) \to 0$ in probability as $n \to \infty$, see [1]), we can deduce that $X^{(i)}_{0,t} \to 0$ in probability as $t \to \infty$.

3. Additions to Section 3 (parametric specifications)

3.1. Parametric specifications of the trawl function

In the main article we discussed exponential trawls, trawls consisting of weighted sums of exponentials and their generalizations to superpositions of exponential trawls. A particularly interesting example of the latter is the long memory trawl which we present in the main article. The long memory trawl is in fact a special case of a superposition-type trawl obtained by using a generalized inverse Gaussian (GIG) density $f_\pi$ in the construction of the superposition-type trawl:

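Example 1. Suppose that $f_\pi$ is the density of the GIG distribution, i.e.,

\[
f_\pi(x) = \frac{(\gamma/\delta)^{\nu}}{2 K_\nu(\delta\gamma)}\, x^{\nu - 1} \exp\left\{ -\frac{1}{2} \left( \delta^2 x^{-1} + \gamma^2 x \right) \right\},
\]

where $\nu \in \mathbb{R}$ and $\gamma$ and $\delta$ are both nonnegative and not simultaneously equal to zero. Here we denote by $K_\nu(\cdot)$ the modified Bessel function of the third kind. Straightforward computations show that the corresponding trawl function is given by

\[
g(s) = \left( 1 - \frac{2s}{\gamma^2} \right)^{-\nu/2} \frac{K_\nu\left( \delta\gamma \sqrt{1 - \frac{2s}{\gamma^2}} \right)}{K_\nu(\delta\gamma)}, \quad \text{for } s \le 0,
\]

and the corresponding size of the trawl set equals

\[
\mathrm{Leb}(A) = \frac{(\gamma/\delta)\, K_{\nu-1}(\delta\gamma)}{K_\nu(\delta\gamma)}.
\]

Moreover, the autocorrelation function is given by

\[
r(h) = \frac{K_{\nu-1}\left( \delta \sqrt{\gamma^2 + 2h} \right)}{K_{\nu-1}(\delta\gamma)} \left( 1 + \frac{2h}{\gamma^2} \right)^{\frac{1}{2}(1 - \nu)}, \quad \text{for } h \ge 0.
\]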

A special case of the GIG distribution is the inverse Gaussian distribution which we study next.

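Example 2. Suppose we choose an inverse Gaussian (IG) density function for $f_\pi$. Then we obtain the so-called sup-IG trawl function, which can be written as

\[
g(s) = \left( 1 - \frac{2s}{\gamma^2} \right)^{-1/2} \exp\left\{ \delta\gamma \left( 1 - \sqrt{1 - \frac{2s}{\gamma^2}} \right) \right\}, \quad \text{for } s \le 0,
\]

for nonnegative parameters $\delta, \gamma$ which are assumed not to be simultaneously equal to zero. Then we have that $\mathrm{Leb}(A) = \gamma/\delta$ and the corresponding autocorrelation function is given by

\[
r(h) = \exp\left\{ \delta\gamma \left( 1 - \sqrt{1 + \frac{2h}{\gamma^2}} \right) \right\}, \quad \text{for } h \ge 0.
\]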

Note that if we use the Gamma density as a special case of the GIG density we obtain the long-memory model presented in the main article.
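The closed-form expressions for the sup-IG trawl can be checked numerically. The sketch below assumes the standard identity for trawl processes that the autocorrelation at lag $h$ equals the integral of $g$ over $(-\infty, -h]$ divided by $\mathrm{Leb}(A)$, which is the integral of $g$ over $(-\infty, 0]$. The function names, the truncation point `lower` and the midpoint rule are illustrative choices, not part of the article.

```python
import math


def sup_ig_trawl(s, delta, gamma):
    """sup-IG trawl function g(s) for s <= 0."""
    root = math.sqrt(1.0 - 2.0 * s / gamma ** 2)
    return math.exp(delta * gamma * (1.0 - root)) / root


def sup_ig_acf(h, delta, gamma):
    """Closed-form autocorrelation r(h) for h >= 0."""
    root = math.sqrt(1.0 + 2.0 * h / gamma ** 2)
    return math.exp(delta * gamma * (1.0 - root))


def integrate_trawl(upper, delta, gamma, lower=-500.0, steps=200_000):
    """Midpoint-rule approximation of the integral of g over [lower, upper].

    The lower limit truncates the infinite integration range; it must be
    chosen far enough out that the neglected tail is negligible.
    """
    width = (upper - lower) / steps
    return width * sum(
        sup_ig_trawl(lower + (k + 0.5) * width, delta, gamma) for k in range(steps)
    )


delta, gamma = 1.0, 1.5
leb_numeric = integrate_trawl(0.0, delta, gamma)                 # compare with gamma / delta
acf_numeric = integrate_trawl(-2.0, delta, gamma) / leb_numeric  # compare with r(2)
```

With these parameter values `leb_numeric` should agree with $\gamma/\delta = 1.5$ and `acf_numeric` with `sup_ig_acf(2.0, delta, gamma)` up to the quadrature error.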

3.2. Modeling the cross-sectional dependence through a multivariate Poisson distribution

In the main article, we study Poisson-mixture type distributions for the multivariate Lévy seed $L'$. However, the most common starting point would be to consider a multivariate Poisson distribution, which clearly falls into the framework of discrete compound Poisson distributions we consider throughout the article.

So, let us denote by $L' = (L'^{(1)}, \dots, L'^{(n)})^\top$ the Lévy seed; we will now present a multivariate Poisson law for it. In order to introduce dependence between the Poisson random variables, one typically uses a so-called common factor approach, which we outline in the following, see, e.g., [4, 5].

Suppose that we have $m \in \mathbb{N}$ independent random variables $X^{(i)} \sim \mathrm{Poi}(\theta_i)$ for $i = 1, \dots, m$, and set $X = (X^{(1)}, \dots, X^{(m)})^\top$.

Let $A$ denote an $n \times m$ matrix (for $n \in \mathbb{N}$) with 0-1 entries and having no duplicate columns. We then set $L' = AX$, which clearly follows a multivariate Poisson distribution. The corresponding mean and variance can be easily computed and are given by $E(L') = AM$ and $\mathrm{var}(L') = A \Sigma A^\top$, respectively, where $M = E(X)$ and $\Sigma = \mathrm{var}(X)$. Since the components $X^{(i)}$ are independent, we have $\Sigma = \mathrm{diag}(\theta_1, \dots, \theta_m)$ and $M^\top = (\theta_1, \dots, \theta_m)$. The above construction implies that $L'^{(i)} \sim \mathrm{Poi}(v_i)$, where $v_i = \sum_{k=1}^{m} a_{ik} \theta_k$. Also, for $i \ne j$ we have that

\[
\mathrm{cor}(L'^{(i)}, L'^{(j)}) = \frac{\sum_{k=1}^{m} a_{ik} \theta_k a_{jk}}{\sqrt{\sum_{k=1}^{m} a_{ik}^2 \theta_k \sum_{k=1}^{m} a_{jk}^2 \theta_k}}.
\]

Let us study some relevant examples within this modeling framework.

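Example 3. An $n$-dimensional model with one common factor between all components can be obtained by choosing $m = n + 1$, and

\[
A = \begin{pmatrix}
1 & 0 & \cdots & 0 & 1 \\
0 & 1 & \cdots & 0 & 1 \\
\vdots & & \ddots & \vdots & \vdots \\
0 & \cdots & 0 & 1 & 1
\end{pmatrix}, \qquad
X = \begin{pmatrix} X^{(1)} \\ \vdots \\ X^{(n)} \\ X^{(0)} \end{pmatrix},
\]

with independent Poisson random variables $X^{(i)} \sim \mathrm{Poi}(\theta_i)$, for $i = 0, \dots, n$. Then we have

\[
L'^{(1)} = X^{(1)} + X^{(0)}, \quad L'^{(2)} = X^{(2)} + X^{(0)}, \quad \dots, \quad L'^{(n)} = X^{(n)} + X^{(0)}.
\]

Here each component has marginal Poisson distribution, i.e., $L'^{(i)} \sim \mathrm{Poi}(\theta_i + \theta_0)$ and, for $i \ne j$, we have that $\mathrm{cov}(L'^{(i)}, L'^{(j)}) = \theta_0$.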

Beyond the bivariate case, the example above presents a rather restrictive model for applications since it only allows for one common factor. A less sparse choice of $A$ would allow for more flexible model specifications. Let us consider a more realistic example in the trivariate case next.

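Example 4. Consider a model of the type

\[
\begin{aligned}
L'^{(1)} &= X^{(1)} + X^{(12)} + X^{(13)} + X^{(123)}, \\
L'^{(2)} &= X^{(2)} + X^{(12)} + X^{(23)} + X^{(123)}, \\
L'^{(3)} &= X^{(3)} + X^{(13)} + X^{(23)} + X^{(123)},
\end{aligned}
\]

for independent Poisson random variables $X^{(i)}$ with parameters $\theta_i$, for $i \in \{\{1\}, \{2\}, \{3\}, \{12\}, \{13\}, \{23\}, \{123\}\}$. Such a model specification corresponds to the choice of

\[
A = \begin{pmatrix}
1 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 1 & 0 & 1 & 0 & 1 & 1 \\
0 & 0 & 1 & 0 & 1 & 1 & 1
\end{pmatrix}, \qquad
X = \bigl( X^{(1)}, X^{(2)}, X^{(3)}, X^{(12)}, X^{(13)}, X^{(23)}, X^{(123)} \bigr)^\top.
\]

Here we have that $L'^{(1)} \sim \mathrm{Poi}(\theta_1 + \theta_{12} + \theta_{13} + \theta_{123})$, $L'^{(2)} \sim \mathrm{Poi}(\theta_2 + \theta_{12} + \theta_{23} + \theta_{123})$ and $L'^{(3)} \sim \mathrm{Poi}(\theta_3 + \theta_{13} + \theta_{23} + \theta_{123})$.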

The above example treats a very general case which allows for all possible bivariate covariation effects as well as a trivariate covariation effect. A slightly simpler specification is given in the next example, which only considers pairwise interaction terms.

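Example 5. Choosing

\[
A = \begin{pmatrix}
1 & 0 & 0 & 1 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 1 \\
0 & 0 & 1 & 0 & 1 & 1
\end{pmatrix}, \qquad
X = \bigl( X^{(1)}, X^{(2)}, X^{(3)}, X^{(12)}, X^{(13)}, X^{(23)} \bigr)^\top,
\]

results in a trivariate model of the form

\[
L'^{(1)} = X^{(1)} + X^{(12)} + X^{(13)}, \quad L'^{(2)} = X^{(2)} + X^{(12)} + X^{(23)}, \quad L'^{(3)} = X^{(3)} + X^{(13)} + X^{(23)},
\]

for independent Poisson random variables $X^{(i)}$ with parameters $\theta_i$, for $i \in \{\{1\}, \{2\}, \{3\}, \{12\}, \{13\}, \{23\}\}$. Then we have that $L'^{(1)} \sim \mathrm{Poi}(\theta_1 + \theta_{12} + \theta_{13})$, $L'^{(2)} \sim \mathrm{Poi}(\theta_2 + \theta_{12} + \theta_{23})$ and $L'^{(3)} \sim \mathrm{Poi}(\theta_3 + \theta_{13} + \theta_{23})$; also,

\[
\mathrm{var}(L') = \begin{pmatrix}
\theta_1 + \theta_{12} + \theta_{13} & \theta_{12} & \theta_{13} \\
\theta_{12} & \theta_2 + \theta_{12} + \theta_{23} & \theta_{23} \\
\theta_{13} & \theta_{23} & \theta_3 + \theta_{13} + \theta_{23}
\end{pmatrix}.
\]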
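The moment formulas $E(L') = AM$ and $\mathrm{var}(L') = A\Sigma A^\top$ are easy to evaluate for any 0-1 matrix $A$. The following is a small sketch for the pairwise-interaction matrix of Example 5; the numerical values of the $\theta$ parameters and the function names are hypothetical and serve only to illustrate the computation.

```python
import math


def poisson_factor_moments(A, theta):
    """Return (means, covariance) of L' = A X for independent X_k ~ Poi(theta[k])."""
    n, m = len(A), len(theta)
    means = [sum(A[i][k] * theta[k] for k in range(m)) for i in range(n)]
    cov = [
        [sum(A[i][k] * theta[k] * A[j][k] for k in range(m)) for j in range(n)]
        for i in range(n)
    ]
    return means, cov


def correlation(cov, i, j):
    """Correlation between components i and j implied by a covariance matrix."""
    return cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])


# Matrix A of Example 5 (pairwise interaction terms only).
A_PAIRWISE = [
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]
# Illustrative values for theta_1, theta_2, theta_3, theta_12, theta_13, theta_23.
theta = [1.0, 2.0, 3.0, 0.5, 0.25, 0.75]

means, cov = poisson_factor_moments(A_PAIRWISE, theta)
```

For a Poisson vector the diagonal of `cov` coincides with `means`, and the off-diagonal entries are exactly the shared intensities $\theta_{12}$, $\theta_{13}$ and $\theta_{23}$, matching the matrix displayed in Example 5.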

4. Additions to the empirical and simulation study

4.1. Empirical results

Recall that in our empirical study on limit order book data from Apple we focus on 720 five-second intervals between 11am and 12 noon on 8th August 2017. We count the number of newly submitted limit orders (time series 1) and the number of cancelled limit orders (time series 2) in each interval on the sell-side. We fit and compare six model specifications:

1. Model 1: Exponential trawl with fully dependent negative binomial marginal law:

• The trawl functions are given by $g^{(i)}(s) = \exp(\lambda^{(i)} s)$, for $\lambda^{(i)} > 0$, $s \le 0$ and $i = 1, 2$.

• The Lévy seed is modeled as in Example 8 in the main article, i.e., $L' = (L'^{(1)}, L'^{(2)})^\top = (X_1, X_2)^\top$ for random variables $X_1, X_2$. Then $(X_1, X_2) \mid (Z_1 = z_1, Z_2 = z_2)$ are independent and Poisson distributed with means given by $\{z_1, z_2\}$. The $\{Z_1, Z_2\}$ are modeled by the so-called additive effect model as follows: $Z_i = \alpha_i U$, for $i = 1, 2$ and $\alpha_1, \alpha_2 > 0$; also $U \sim \Gamma(\kappa, 1)$, for $\kappa > 0$. Then $X_i \sim \mathrm{NB}(\kappa, \alpha_i/(1 + \alpha_i))$, for $i = 1, 2$.

2. Model 2: Double exponential trawl with fully dependent negative binomial marginal law:

• The trawl functions are given by $g^{(i)}(s) = w^{(i)} \exp(\lambda_1^{(i)} s) + (1 - w^{(i)}) \exp(\lambda_2^{(i)} s)$, for $\lambda_1^{(i)}, \lambda_2^{(i)} > 0$, $s \le 0$ and $i = 1, 2$.

3. Model 3: Long memory trawl with fully dependent negative binomial marginal law:

• The trawl functions are given by $g^{(i)}(s) = (1 - s/a^{(i)})^{-H^{(i)}}$, for $a^{(i)} > 0$, $H^{(i)} > 1$, $s \le 0$ and $i = 1, 2$.

4. Model 4: Exponential trawl with dependent negative binomial marginal law (including independent factors):

• The trawl functions are given by $g^{(i)}(s) = \exp(\lambda^{(i)} s)$, for $\lambda^{(i)} > 0$, $s \le 0$ and $i = 1, 2$.

• The Lévy seed is modeled as in Example 9 in the main article, i.e., $L' = (L'^{(1)}, L'^{(2)})^\top = (X_1, X_2)^\top$ for random variables $X_1, X_2$. Then $(X_1, X_2) \mid (Z_1 = z_1, Z_2 = z_2)$ are independent and Poisson distributed with means given by $\{z_1, z_2\}$. The $\{Z_1, Z_2\}$ are modeled by the so-called additive effect model as follows: $Z_i = \alpha_i U + V_i$, for $i = 1, 2$ and $\alpha_1, \alpha_2 > 0$; also $U \sim \Gamma(\kappa, 1)$ and $V_i \sim \Gamma(\kappa_i, 1/\alpha_i)$ are independent and $\kappa, \kappa_i > 0$. Then $X_i \sim \mathrm{NB}(\kappa + \kappa_i, \alpha_i/(1 + \alpha_i))$ for $i = 1, 2$.

5. Model 5: Double exponential trawl with dependent negative binomial marginal law (including independent factors):

• The trawl functions are given by $g^{(i)}(s) = w^{(i)} \exp(\lambda_1^{(i)} s) + (1 - w^{(i)}) \exp(\lambda_2^{(i)} s)$, for $\lambda_1^{(i)}, \lambda_2^{(i)} > 0$, $s \le 0$ and $i = 1, 2$.

6. Model 6: Long memory trawl with dependent negative binomial marginal law (including independent factors):

• The trawl functions are given by $g^{(i)}(s) = (1 - s/a^{(i)})^{-H^{(i)}}$, for $a^{(i)} > 0$, $H^{(i)} > 1$, $s \le 0$ and $i = 1, 2$.



4.1.1. Estimating the trawl functions

The exponential trawl appearing in Models 1 and 4 is estimated via a method of moments, whereas the double exponential trawl appearing in Models 2 and 5 and the long memory trawl appearing in Models 3 and 6 are estimated by a least squares minimisation of the distance between the empirical and theoretical autocorrelation functions over the first five lags. The empirical autocorrelation functions and the corresponding fitted trawl functions for the two time series are depicted in Figure 1. We observe that the exponential trawl functions fitted to the submissions and cancellations decay too rapidly, resulting in a poor fit when comparing the empirical and the estimated autocorrelation functions. In the case of a weighted sum of two exponential functions, which we refer to as the double exponential trawl, we notice that the fit to the autocorrelation function is much better. This is due to the fact that one factor (the first one, say) allows for a slow decay indicated by a ‘small’ value for $\lambda_1^{(i)}$, whereas the second factor allows for a rather quick initial decay indicated by a ‘big’ value for $\lambda_2^{(i)}$.

Finally, we fitted a long memory trawl to both time series. Unsurprisingly, the decay of the fitted autocorrelation function is much slower than in the exponential and double exponential cases. However, we know from work on univariate trawl processes reported in [3] that typically a rather large sample size (much bigger than our sample size of 720) is required for the empirical autocorrelation function to approach the theoretical one. Hence we need to be very careful when interpreting the fit of a long memory model in our particular example. Since, as we will see in the following, the parameter uncertainty in the estimation was rather high in the case of a long memory trawl, we choose the double exponential trawl for both the submissions and the cancellations as the best model choice and report it in the main article.
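The least squares step can be sketched as follows. The theoretical autocorrelation functions below are obtained from the three trawl functions of Models 1 to 6 under the assumption that $r(h)$ equals the integral of $g$ over $(-\infty, -h]$ divided by the integral of $g$ over $(-\infty, 0]$. Lags are measured in units of one observation interval, and all function names are hypothetical; the optimiser used in the study is not specified here, so only the objective is shown.

```python
import math


def acf_exponential(h, lam):
    """ACF implied by g(s) = exp(lam * s)."""
    return math.exp(-lam * h)


def acf_double_exponential(h, w, lam1, lam2):
    """ACF implied by g(s) = w * exp(lam1 * s) + (1 - w) * exp(lam2 * s)."""
    numerator = w * math.exp(-lam1 * h) / lam1 + (1.0 - w) * math.exp(-lam2 * h) / lam2
    denominator = w / lam1 + (1.0 - w) / lam2
    return numerator / denominator


def acf_long_memory(h, a, H):
    """ACF implied by g(s) = (1 - s / a) ** (-H) with H > 1."""
    return (1.0 + h / a) ** (1.0 - H)


def least_squares_objective(empirical_acf, theoretical_acf, n_lags=5):
    """Sum of squared differences over lags 1, ..., n_lags.

    empirical_acf[k - 1] holds the sample autocorrelation at lag k and
    theoretical_acf is a function of the lag.
    """
    return sum(
        (empirical_acf[k - 1] - theoretical_acf(k)) ** 2
        for k in range(1, n_lags + 1)
    )


# Illustration with a synthetic "empirical" ACF generated from known parameters.
synthetic_acf = [acf_double_exponential(k, 0.3, 0.05, 1.2) for k in range(1, 6)]
loss_true = least_squares_objective(
    synthetic_acf, lambda h: acf_double_exponential(h, 0.3, 0.05, 1.2)
)
loss_exponential = least_squares_objective(
    synthetic_acf, lambda h: acf_exponential(h, 0.5)
)
```

In this synthetic illustration the objective vanishes at the generating parameters and is strictly positive for the single exponential candidate; in practice the objective would be passed to a numerical optimiser.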

4.1.2. Estimating the bivariate negative binomial law

After the parameters of the trawl functions have been estimated (corresponding to Step 1 a) in our estimation procedure), we obtain estimates for $\mathrm{Leb}(A^{(1)})$, $\mathrm{Leb}(A^{(2)})$, $\mathrm{Leb}(A^{(1)} \cap A^{(2)})$, which we typically denote by using the hat-notation.

We now proceed to Step 1 b) where we estimate the parameters of the negative binomial law. Recall that in the case of Example 8 we have that $Y^{(i)}_t \sim \mathrm{NB}(\mathrm{Leb}(A^{(i)})\kappa, 1/\alpha_i)$ and in the case of Example 9 we have that $Y^{(i)}_t \sim \mathrm{NB}(\mathrm{Leb}(A^{(i)})(\kappa + \kappa_i), 1/\alpha_i)$. Consider the general case when $L'^{(i)} \sim \mathrm{NB}(m_i, \theta_i)$, $m_i \in \mathbb{N}$, $\theta_i \in (0, 1)$. Then $E(L'^{(i)}) = m_i \theta_i (1 - \theta_i)^{-1}$ and $\mathrm{Var}(L'^{(i)}) = m_i \theta_i (1 - \theta_i)^{-2}$. Hence we can estimate $\theta_i$ and $m_i$

by

\[
\hat{\theta}_i = 1 - \frac{\hat{E}(Y^{(i)})}{\widehat{\mathrm{Var}}(Y^{(i)})}, \qquad
\hat{m}_i = \frac{\hat{E}(Y^{(i)})\,(1 - \hat{\theta}_i)}{\widehat{\mathrm{Leb}}(A^{(i)})\, \hat{\theta}_i}.
\]

[Figure 1, six panels, each showing the ACF against lags 0 to 100: (a) Submissions: Fitted ACF of exponential trawl; (b) Cancellations: Fitted ACF of exponential trawl; (c) Submissions: Fitted ACF of double exponential trawl; (d) Cancellations: Fitted ACF of double exponential trawl; (e) Submissions: Fitted ACF of long memory trawl; (f) Cancellations: Fitted ACF of long memory trawl.]

Figure 1: Empirical autocorrelation function (ACF) of the number of newly submitted and fully deleted limit orders, respectively. The solid red lines show the estimated trawl functions in both cases using an exponential, double exponential and long memory trawl. The black dotted lines indicate the 95% confidence bounds for the estimated autocorrelation function based on a parametric bootstrap with 1000 replications.

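Note that $\hat{E}$ and $\widehat{\mathrm{Var}}$ denote the sample mean and sample variance. Note that then $\hat{\alpha}_i = 1/\hat{\theta}_i$ and either $\hat{m}_i = \hat{\kappa}$ (in the case of Example 8) or $\hat{m}_i = \hat{\kappa} + \hat{\kappa}_i$ (in the case of Example 9).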

After the marginal parameters have been identified, we move on to Step 2 and estimate the dependenceparameter. There is actually not a unique way of doing this within the method of moments set-up. Hencewe will now describe the two different procedures which have been implemented in both the empirical andlater also in the simulation study.

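In the main article, we write that we can estimate the covariation parameter

\[
\kappa_{1,2} = \alpha_1 \alpha_2 \kappa = \frac{\rho^{e}_{12}(0)}{R_{12}(0)},
\]

where $\rho^{e}_{12}(0)$ denotes the empirical cross-covariance function between the two components evaluated at lag 0 and $R_{12}(0) = \mathrm{Leb}(A^{(1)} \cap A^{(2)})$. Given that the $\alpha$s have already been estimated, we obtain that

\[
\hat{\kappa} = \frac{\rho^{e}_{12}(0)}{\hat{R}_{12}(0)\, \hat{\alpha}_1 \hat{\alpha}_2}.
\]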

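Now we have in fact found various estimators of $\kappa$ and we need to decide which one to choose in the application.

Method 1: Estimate $\kappa$ using the minimum of the marginal estimators and the correlation estimator, i.e., set

\[
\hat{\kappa} = \min\left\{ \frac{\rho^{e}_{12}(0)}{\hat{R}_{12}(0)\, \hat{\alpha}_1 \hat{\alpha}_2},\ \hat{m}_1,\ \hat{m}_2 \right\}.
\]

Then estimate $\kappa_1$ and $\kappa_2$ by

\[
\hat{\kappa}_i = \max\{\hat{m}_i - \hat{\kappa}, 0\}, \quad \text{for } i = 1, 2.
\]

Method 2: Estimate $\kappa$ based on the empirical cross-covariance first and possibly make adjustments to the already estimated marginal parameters as follows. Set

\[
\hat{\kappa} = \frac{\rho^{e}_{12}(0)}{\hat{R}_{12}(0)\, \hat{\alpha}_1 \hat{\alpha}_2}.
\]

Then estimate $\kappa_1$ and $\kappa_2$ by

\[
\hat{\kappa}_i = \max\{\hat{m}_i - \hat{\kappa}, 0\}, \quad \text{for } i = 1, 2.
\]

In the case when $\hat{\kappa}_i = 0$, we then re-set the estimate $\hat{m}_i$ to

\[
\hat{m}_i = \hat{\kappa}, \quad \text{for } i = 1, 2.
\]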
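The two procedures can be summarised in a few lines of code. The sketch below takes the sample moments, the estimated trawl set sizes and the empirical cross-covariance at lag 0 as given inputs; the function names and the numerical values in the illustration are hypothetical. The marginal step requires the sample variance to exceed the sample mean (overdispersion).

```python
def marginal_estimates(sample_mean, sample_var, leb):
    """Moment estimates (theta_i, m_i) for one component, given Leb(A^(i))."""
    theta = 1.0 - sample_mean / sample_var
    m = sample_mean * (1.0 - theta) / (leb * theta)
    return theta, m


def kappa_from_cross_covariance(cross_cov0, leb_intersection, alpha1, alpha2):
    """Estimate of kappa based on the empirical cross-covariance at lag 0."""
    return cross_cov0 / (leb_intersection * alpha1 * alpha2)


def method_1(kappa_cov, m1, m2):
    """Method 1: cap kappa by the marginal estimates; m1 and m2 stay unchanged."""
    kappa = min(kappa_cov, m1, m2)
    return kappa, max(m1 - kappa, 0.0), max(m2 - kappa, 0.0), m1, m2


def method_2(kappa_cov, m1, m2):
    """Method 2: keep the cross-covariance estimate and re-set m_i if kappa_i = 0."""
    kappa = kappa_cov
    kappa1 = max(m1 - kappa, 0.0)
    kappa2 = max(m2 - kappa, 0.0)
    m1_adjusted = kappa if kappa1 == 0.0 else m1
    m2_adjusted = kappa if kappa2 == 0.0 else m2
    return kappa, kappa1, kappa2, m1_adjusted, m2_adjusted


# Illustrative inputs: sample mean 2, sample variance 4, Leb(A^(i)) = 1.
theta_hat, m_hat = marginal_estimates(2.0, 4.0, 1.0)
alpha_hat = 1.0 / theta_hat  # relation between alpha_i and theta_i as stated in the text
kappa_cov = kappa_from_cross_covariance(12.0, 1.0, alpha_hat, alpha_hat)
result_1 = method_1(kappa_cov, m_hat, m_hat)
result_2 = method_2(kappa_cov, m_hat, m_hat)
```

With these illustrative inputs the cross-covariance estimate of $\kappa$ exceeds both marginal estimates, so Method 1 truncates it while Method 2 keeps it and overrides the marginal estimates, which is the trade-off between marginal and bivariate fit that the two methods embody.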

While Method 1 typically supports a model including both a common factor and independent factors, see Example 9, Method 2 in our application leads to estimates supporting a fully dependent model, see Example 8. The main difference between the two methods lies in the fact that in Method 1, the marginal fit is not affected by the estimation of the dependence parameter and hence we expect the marginal fit for each univariate time series to be better. The dependence parameter $\kappa$ might be underestimated (through the choice of the minimum) and hence one needs to explore whether the bivariate fit gets worse at the expense of a good marginal univariate fit. Indeed, we find that the marginal fit using Method 1, which is analyzed using histograms and quantile-quantile plots of the negative binomial law, is generally very good for any of the three trawl functions and for both the submissions and cancellations, see Figure 2.

In Method 2, we prioritise the estimation of the dependence parameter $\kappa$ and hence expect that the description of the dependence could be more accurate compared to Method 1. However, since we possibly need to adjust the first parameter in the negative binomial law (rather than taking the optimal one from the marginal fit), we expect to find that the marginal fit of the univariate negative binomial law to the univariate time series is possibly worse than in the case of Method 1. This is indeed confirmed when looking at the corresponding marginal histograms and quantile-quantile plots in Figure 3. While for exponential and double exponential trawls the marginal fit is good, in the case of a long-memory trawl the marginal fit appears to be rather poor when using Method 2.


[Figure 2, six panels, each showing a density plot and a quantile-quantile plot against the negative binomial law: (a) Submissions: Negative binomial marginal fit for exponential trawl; (b) Cancellations: Negative binomial marginal fit for exponential trawl; (c) Submissions: Negative binomial marginal fit for double exponential trawl; (d) Cancellations: Negative binomial marginal fit for double exponential trawl; (e) Submissions: Negative binomial marginal fit for long memory trawl; (f) Cancellations: Negative binomial marginal fit for long memory trawl.]

Figure 2: Empirical and fitted densities and quantile-quantile plots of the negative binomial marginal law for the new submissions (left) and the full cancellations (right) for three different trawl functions. Method 1 was used in the estimation of the dependence parameter, which leads to a model specification of the dependent case with additional independent factors, see Example 9.

[Figure 3, six panels, each showing a density plot and a quantile-quantile plot against the negative binomial law: (a) Submissions: Negative binomial marginal fit for exponential trawl; (b) Cancellations: Negative binomial marginal fit for exponential trawl; (c) Submissions: Negative binomial marginal fit for double exponential trawl; (d) Cancellations: Negative binomial marginal fit for double exponential trawl; (e) Submissions: Negative binomial marginal fit for long memory trawl; (f) Cancellations: Negative binomial marginal fit for long memory trawl.]

Figure 3: Empirical and fitted densities and quantile-quantile plots of the negative binomial marginal law for the new submissions (left) and the full cancellations (right) for three different trawl functions. Method 2 was used in the estimation of the dependence parameter, which leads to a model specification of the fully dependent case, see Example 8.

Let us now turn to assessing the bivariate fit of the bivariate negative binomial distributions for the various models we considered. Goodness-of-fit considerations are rather challenging, in particular in a very high-dimensional setting. In this article we carried out the check of the bivariate goodness-of-fit by analyzing the difference between the empirical bivariate histogram and the bivariate histogram obtained from averaging over the histograms from 1000 simulations from the various models.

When we used the estimation Method 1, we ended up with a model of the type presented in Example 9 (common factor and independent factors), and when we used Method 2, we ended up with the fully dependent model as discussed in Example 8. Recall that we already mentioned that Method 1 might lead to an underestimation of the dependence parameters, so we want to investigate whether the bivariate fit appears worse in the cases of Models 1-3.

We have already described in the main article how exactly the bivariate histograms describing the difference between the empirical histogram and the histogram of the simulated data have been computed. Note that we hope to find very small differences, i.e., values close to zero, with no clear patterns. Overall, we find that all six bivariate histograms look rather good in the sense that the differences are indeed very close to 0.
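The exact construction of the difference histograms is given in the main article and is not reproduced here. As a rough illustration only, the following sketch bins pairs of counts into square cells and subtracts the bin counts averaged over simulated paths from the empirical bin counts. The bin width of 5 is read off the axis labels of the figures; the function names and the toy data are hypothetical.

```python
from collections import Counter


def bivariate_histogram(pairs, bin_width=5):
    """Count observations (x1, x2) per two-dimensional bin of side bin_width."""
    return Counter((x1 // bin_width, x2 // bin_width) for x1, x2 in pairs)


def histogram_difference(empirical_pairs, simulated_paths, bin_width=5):
    """Empirical bin counts minus the bin counts averaged over simulated paths."""
    diff = {
        cell: float(count)
        for cell, count in bivariate_histogram(empirical_pairs, bin_width).items()
    }
    n_paths = len(simulated_paths)
    for path in simulated_paths:
        for cell, count in bivariate_histogram(path, bin_width).items():
            diff[cell] = diff.get(cell, 0.0) - count / n_paths
    return diff


# Toy data: one observed path and two simulated paths of (submissions, cancellations).
observed = [(1, 2), (7, 3), (7, 4)]
simulated = [
    [(1, 2), (1, 1), (8, 0)],
    [(6, 2), (6, 3), (12, 12)],
]
difference = histogram_difference(observed, simulated)
```

Cells with values close to zero indicate agreement between the data and the model; systematic positive or negative regions point to a misfit in the dependence structure.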

Let us consider Figure 4 first. When focusing only on the noticeable deviations from 0, we observe that in the case of an exponential trawl (Models 1 and 4), the simulated data seem to exhibit a slightly higher cross-correlation than the empirical ones for bigger values of the time series and the picture is reversed for time series values close to 0. There does not seem to be a noticeable difference between the settings of Example 8 and Example 9 and the bivariate fit seems to be of similar quality.

In the case of a double exponential trawl, see Figure 5, we find that the bivariate fit appears good and very similar for both Model 2 and Model 5.

Finally, when using a long memory trawl corresponding to Models 3 and 6, see Figure 6, the two histograms differ more noticeably. In the case of Model 3 (the fully dependent case), the histogram exhibits a split into approximately two triangles, where the differences are close to 0 in the upper triangle, but negative in the lower triangle. This feature disappears in the more general setting of Model 6. We also note that along the diagonal the empirical data seem to have slightly stronger dependence than the simulated data.

Overall, we can say that based on the graphical comparison the bivariate fit when using a long memory trawl is the worst one and the double exponential case leads to reasonable results and is hence presented in the main article.


[Figure 4, two panels: (a) Model 1; (b) Model 4. Each panel shows the difference ("diff") per bin, with Cancellations on the horizontal axis and Submissions on the vertical axis, in bins of width 5 from [0,5) to [285,290).]

Figure 4: Difference between the bivariate histograms corresponding to the empirical data and the simulated data for Model 1 (top) and Model 4 (bottom)

[Figure 5, two panels: (a) Model 2; (b) Model 5. Each panel shows the difference ("diff") per bin, with Cancellations on the horizontal axis and Submissions on the vertical axis, in bins of width 5 from [0,5) to [285,290).]

[Figure 6, two panels: (a) Model 3; (b) Model 6. Each panel shows the difference ("diff") per bin, with Cancellations on the horizontal axis and Submissions on the vertical axis, in bins of width 5 from [0,5) to [285,290).]

4.2. Details regarding the simulation study

Let us now describe in detail how a bivariate trawl process with negative binomial marginal law can be simulated. We focus on the fully dependent case, see Example 8, since the more general case described in Example 9 can be easily obtained by adding independent factors. Recall that when simulating the trawl process, we work with the compound-Poisson-type representation (3) in the main article and specify the jump size distribution as the bivariate logarithmic series distribution (BLSD) as in Example 10 in the main article.

4.2.1. Simulating from the bivariate logarithmic series distribution

First of all, we describe how we can generate random samples $C = (C_1, C_2)^\top$ from the BLSD with parameters $p_1, p_2$. The algorithm is based on the idea that we can simulate $C_1$ from the modified logarithmic series distribution (MLSD) with parameters $\tilde{p}_1 = p_1/(1 - p_2)$ and $\delta_1 = \ln(1 - p_2)/\ln(1 - p_1 - p_2)$ in a first step, and then $C_2$ can be simulated from the conditional distribution, given $C_1$, see, e.g., [6]. We note here that if $C_1 = 0$, then $C_2 \mid C_1$ follows the logarithmic distribution (with parameter $p_2$), and when $C_1 > 0$, then $C_2 \mid C_1$ follows the negative binomial distribution with parameters $C_1$ and $p_2$, see, e.g., [7]. We describe the simulation algorithm for the BLSD using pseudocode tailored to the R language. Throughout the section we use the abbreviation rv for random variable. Also, recall that NB stands for the negative binomial distribution, LSD for the logarithmic series distribution, MLSD for the modified logarithmic series distribution, B for the Bernoulli distribution, and POI for the Poisson distribution.

Algorithm 1 (Simulation from the bivariate logarithmic series distribution).

    library(Runuran)  # provides urlogarithmic()

    # Draw N i.i.d. samples (C1, C2) from the BLSD with parameters p1, p2.
    Sim_BLSD <- function(N, p1, p2) {
      p1_tilde <- p1 / (1 - p2)                     # parameters of the MLSD
      delta1   <- log(1 - p2) / log(1 - p1 - p2)    # uses the original p1
      L  <- urlogarithmic(N, p1_tilde)              # N i.i.d. LSD(p1_tilde) rvs
      B  <- rbinom(N, 1, 1 - delta1)                # N i.i.d. Bernoulli(1 - delta1) rvs
      C1 <- L * B                                   # N i.i.d. MLSD(p1_tilde, delta1) rvs
      C2 <- numeric(N)
      for (i in seq_len(N)) {
        c1 <- C1[i]
        if (c1 == 0) {
          C2[i] <- urlogarithmic(1, p2)                   # an LSD(p2) rv
        } else {
          C2[i] <- rnbinom(1, size = c1, prob = 1 - p2)   # an NB(c1, p2) rv
        }
      }
      cbind(C1, C2)                                 # N x 2 matrix of the components
    }
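The decomposition behind Algorithm 1 can be checked numerically. The following Python sketch is not part of the original study. It assumes the standard BLSD probability mass function Pr(C1 = c1, C2 = c2) = Γ(c1 + c2) p1^c1 p2^c2 / (c1! c2! {−ln(1 − p1 − p2)}) for (c1, c2) ≠ (0, 0), which is not restated in this supplement, and compares its C1-marginal with the ModLSD(p̃1, δ1) law used in the first step. The function names are hypothetical.

```python
import math


def blsd_pmf(c1, c2, p1, p2):
    """BLSD pmf (assumed standard form); zero mass at the origin."""
    if c1 == 0 and c2 == 0:
        return 0.0
    log_p = (math.lgamma(c1 + c2) - math.lgamma(c1 + 1) - math.lgamma(c2 + 1)
             + c1 * math.log(p1) + c2 * math.log(p2))
    return math.exp(log_p) / (-math.log(1.0 - p1 - p2))


def modlsd_pmf(c1, p1, p2):
    """ModLSD pmf with p1_tilde = p1/(1-p2) and delta1 = ln(1-p2)/ln(1-p1-p2)."""
    p1_tilde = p1 / (1.0 - p2)
    delta1 = math.log(1.0 - p2) / math.log(1.0 - p1 - p2)
    if c1 == 0:
        return delta1
    lsd = p1_tilde ** c1 / (-c1 * math.log(1.0 - p1_tilde))
    return (1.0 - delta1) * lsd


def blsd_marginal_c1(c1, p1, p2, c2_max=200):
    """Marginal pmf of C1, summing the BLSD pmf over c2 = 0, ..., c2_max."""
    return sum(blsd_pmf(c1, c2, p1, p2) for c2 in range(c2_max + 1))


p1, p2 = 0.3, 0.4  # illustrative values with p1 + p2 < 1
max_gap = max(abs(blsd_marginal_c1(c, p1, p2) - modlsd_pmf(c, p1, p2))
              for c in range(6))
print(max_gap)  # should be at the level of floating-point error
```

If the two marginals agree up to rounding, the first step of the algorithm (an LSD draw multiplied by a Bernoulli draw) targets the correct law for C1 under the assumed pmf.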

4.2.2. Simulating the bivariate trawl process

Next, we provide the pseudocode tailored to the R language which has been used to simulate the bivariate trawl process with exponential trawl function and bivariate negative binomial law (as in Example 8). Here we are using the same notation as in the general description of Algorithm 6 in the main article. In addition, we denote by bi the length of the burn-in period. I.e., we simulate the process over the time interval [0, t] for t = T + bi and then remove the initial burn-in period, i.e., we return the paths over the interval (bi, bi + T].

Algorithm 2 (Simulation from the bivariate trawl process).

    library(Runuran)  # also requires Sim_BLSD from Algorithm 1

    # Exponential trawl function, evaluated at x <= 0.
    Expfct <- function(x, lambda) exp(lambda * x)

    Sim_Trawl <- function(Delta, T, bi, lambda1, lambda2, alpha1, alpha2, kappa) {
      t  <- T + bi                                  # total horizon including burn-in
      v  <- kappa * log(1 + alpha1 + alpha2)        # intensity of the driving Poisson process
      p1 <- alpha1 / (alpha1 + alpha2 + 1)          # parameters in the bivariate LSD
      p2 <- alpha2 / (alpha1 + alpha2 + 1)
      Nt  <- rpois(1, v * t)                        # number of jumps in [0, t], from POI(vt)
      tau <- sort(runif(Nt, min = 0, max = t))      # ordered uniform jump times on [0, t]
      h   <- runif(Nt, min = 0, max = 1)            # jump heights of the abstract spatial parameter
      m   <- Sim_BLSD(Nt, p1, p2)                   # jump marks from the bivariate LSD
      C   <- list(m[, 1], m[, 2])                   # marks of the two components
      lambda <- c(lambda1, lambda2)

      # V[k]: number of jumps up to grid point k * Delta.
      K <- floor(t / Delta)
      V <- cumsum(as.integer(table(cut(tau, seq(0, t, Delta), include.lowest = TRUE))))

      TP <- matrix(0, nrow = K, ncol = 2)           # column i holds the i-th trawl process
      for (i in 1:2) {
        for (k in 1:K) {
          Nk <- V[k]                                # number of jumps until time k * Delta
          if (Nk > 0) {
            d <- k * Delta - tau[1:Nk]              # time differences to each earlier jump
            # 1 if the point lies in the trawl, 0 otherwise
            in_trawl <- 1 - ceiling(h[1:Nk] - Expfct(-d, lambda[i]))
            TP[k, i] <- sum(in_trawl * C[[i]][1:Nk])  # sum up the marks in the trawl
          }
        }
      }

      b1 <- bi / Delta
      b2 <- bi / Delta + T / Delta
      TP[(b1 + 1):b2, , drop = FALSE]               # cut off the burn-in period
    }
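As a plausibility check on the parameter mapping in Algorithm 2, the Python sketch below (an illustration, not the code used in the study) verifies that the compound Poisson intensity v = κ ln(1 + α1 + α2) combined with BLSD marks with p1 = α1/(1 + α1 + α2) reproduces the mean κα1 of the first component of the Poisson mixture with mixing variable α1U, U ∼ Γ(κ, 1). The expression used for E(C1) follows from the standard BLSD pmf, which is an assumption here since the pmf is not restated in this supplement; the function names are hypothetical.

```python
import math


def compound_poisson_parameters(alpha1, alpha2, kappa):
    """Intensity v and BLSD parameters p1, p2 as set in Algorithm 2."""
    total = 1.0 + alpha1 + alpha2
    v = kappa * math.log(total)
    return v, alpha1 / total, alpha2 / total


def blsd_mean_c1(p1, p2):
    """E(C1) under the (assumed) standard BLSD: p1 / {(1-p1-p2)(-ln(1-p1-p2))}."""
    rest = 1.0 - p1 - p2
    return p1 / (rest * (-math.log(rest)))


alpha1, alpha2, kappa = 2.0, 3.0, 1.5  # illustrative values
v, p1, p2 = compound_poisson_parameters(alpha1, alpha2, kappa)
mean_first_component = v * blsd_mean_c1(p1, p2)
print(mean_first_component, kappa * alpha1)  # the two numbers should agree
```

The agreement holds for any positive α1, α2, κ, because p1/(1 − p1 − p2) = α1 and −ln(1 − p1 − p2) = ln(1 + α1 + α2).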

4.2.3. Simulation results

We carried out a detailed simulation study to assess the finite-sample performance of the proposed estimation methodology. Clearly there are many different ways in which a simulation study can be set up. Here we followed the guide from the empirical study and mimicked the setting suggested by the empirical data. I.e., we simulate a bivariate time series of length 720 using six different model specifications (Models 1-6) as described above, where the parameter values are set to their empirical counterparts. The negative binomial parameters of Models 1-3 have been estimated using Method 2 and those of Models 4-6 using Method 1. The simulation/estimation exercise has been carried out for 1000 samples (we call them bootstrap replications in the main article).

For each model we provide boxplots of the parameter estimates; the corresponding 95% confidence bounds are given underneath each boxplot, and the true values are indicated by a red horizontal line.

Exponential trawl: In the case when we estimate an exponential trawl function, see Models 1 and 4, i.e., Figures 7 and 10, respectively, we observe that we can estimate the trawl parameters very accurately and that the uncertainty associated with the parameter estimates is small.

Double exponential trawl: In the case when we estimate a double exponential trawl function, see Models 2 and 5, i.e., Figures 8 and 11, respectively, let us first focus on the correct centering of the parameter estimates: We observe that we can estimate the smaller memory parameter (indicating a slow decay) more precisely than the bigger memory parameter (indicating a fast decay). Also, the corresponding weight can be estimated fairly well. In a next step, we consider the corresponding 95% confidence bounds and we observe that they are rather wide, meaning that the uncertainty associated with the parameter estimates of the double exponential trawl is rather large. As already mentioned in the main article, this is due to the fact that fitting a sum of exponentials to a curve (in our case the empirical autocorrelation function) by least squares is often considered rather challenging, see [8]. That said, even a wide range in the parameter estimates results in very similar fitted autocorrelation functions. I.e., the goodness-of-fit does not necessarily suffer from the increased uncertainty in the parameter estimates.

Long memory trawl: Next consider the case of a long memory trawl function, see Models 3 and 6, i.e., Figures 9 and 12, respectively. When looking at the centering of the parameter estimates in the boxplots, we observe that the long memory parameter seems to be overestimated in the simulation study. Also, the uncertainty revealed by the 95% confidence bounds (in particular for a2) is rather high. In principle, there are two sources of error which can affect the simulation results: the simulation error could be rather high, and/or the estimation method could perform poorly. Here our interpretation is that the former is certainly the case: We know from earlier work by [3] that when simulating a long memory trawl process with a sample size of 720, the resulting empirical autocorrelation function widely underestimates the theoretical one (this is later confirmed in Figures 15 and 18). Hence it is not surprising that, if the autocorrelation function from the simulated data is already inaccurate, the parameter estimates derived from it cannot be very reliable. The only way to remedy this problem is to consider a bigger sample size when simulating long memory trawl processes.

Fully dependent model (Example 8): Let us next check the accuracy of the estimates of the bivariate negative binomial parameters. In the case of a fully dependent model, Models 1-3, see Figures 7, 8 and 9, the bivariate negative binomial law features three parameters (α1, α2, κ). We note that the centering of the estimates of α1 and α2 is generally good and the uncertainty associated with these parameters is relatively small, which is very encouraging. When it comes to estimating the dependence parameter κ, we note that the estimation accuracy is good in the case of an exponential trawl, gets worse in the case of a double exponential trawl and, as far as parameter uncertainty is concerned, is unacceptable in the case of a long memory trawl (where the 95% confidence bounds were so large that we only report them as ‘big’). As mentioned before, in the estimation of the fully dependent model we allow for the possibility of having additional independent factors (corresponding to additional parameters κ1, κ2). They were generally estimated as being close to zero (their theoretical value). While the uncertainty associated with the estimation of κ1, κ2 is rather low for both an exponential and a double exponential trawl, it becomes unacceptably large in the case of a long memory trawl.

Dependent model with independent factors (Example 9): In the case of a dependent model with independent factors, Models 4-6, see Figures 10, 11 and 12, the bivariate negative binomial law features five parameters (α1, α2, κ, κ1, κ2). As in the fully dependent case, we observe that the centering of the estimates of α1 and α2 is generally good and the uncertainty associated with these parameters is relatively small. The quality of the estimates of κ, κ1, κ2 deteriorates when passing from an exponential (good centering, small uncertainty) to a double exponential (underestimation of κ, but fairly good centering for κ1, κ2 and fairly wide confidence bounds) and to a long memory trawl (poor centering and very wide confidence bounds).

In addition to the boxplots, we also provide time series plots of the simulated data: For each of the six models, we choose the first simulated bivariate sample path and plot the bivariate time series, its autocorrelations and its cross-correlation. These plots are given in Figures 13-18. When comparing these plots with the empirical time series plots we note that the simulated path using double exponential trawls replicates the empirical counterpart best, both in terms of the auto- and the cross-correlation. In the case of the long memory trawl, we note that the simulation error is quite significant in the sense that the autocorrelation function of the simulated path does not get anywhere close to the theoretical counterpart. This is due to the fact that the sample size of 720 appears to be too small in the long memory case.

Lessons learned from the simulation study: One clear advantage of estimating the multivariate trawl parameters by a (generalized) method of moments is its speed: The negative binomial parameters can be estimated without using any optimisation routine and are hence available almost instantaneously. The same is true for the estimation of the exponential trawl parameter. In the case of a double exponential or long memory trawl we run a least squares minimisation, but it also turns out to be very fast (with results obtained within seconds). A competing method, which would allow all parameters to be estimated simultaneously, is composite likelihood estimation, which we describe in detail in Section 5.

When it comes to simulation accuracy, we remark that the simulation procedure seems to work well for our sample size of 720 for an exponential and double exponential trawl. However, for a long memory trawl one would need to consider bigger sample sizes, starting from a length of ca. 5000 observations, to ensure that the simulated sample paths have a similar autocorrelation as the theoretical model suggests.

Figure 7: Model 1: Boxplots of the estimated five parameters ($\lambda^{(1)}, \lambda^{(2)}, \alpha_1, \alpha_2, \kappa$) from a bivariate trawl model with exponential trawl function and negative binomial Lévy seed, see Example 8. Note that we also estimated the parameters $\kappa_1, \kappa_2$, allowing possibly for independent factors, but their estimates were close to 0. The 95% confidence bounds shown beneath the boxplots are: $\lambda^{(1)}$: [0.99, 1.65]; $\lambda^{(2)}$: [1.18, 2.05]; $\alpha_1$: [22.93, 32.7]; $\alpha_2$: [29.85, 41.91]; $\kappa$: [1.75, 3.48]; $\kappa_1$: [0, 0.35]; $\kappa_2$: [0, 0.18].

Figure 8: Model 2: Boxplots of the estimated nine parameters ($\lambda^{(1)}_1, \lambda^{(1)}_2, w^{(1)}, \lambda^{(2)}_1, \lambda^{(2)}_2, w^{(2)}, \alpha_1, \alpha_2, \kappa$) from a bivariate trawl model with double exponential trawl function and negative binomial Lévy seed, see Example 8. Note that we also estimated the parameters $\kappa_1, \kappa_2$, allowing possibly for independent factors, but their estimates were close to 0. The 95% confidence bounds shown beneath the boxplots are: $\lambda^{(1)}_1$: [0, 2.22]; $\lambda^{(1)}_2$: [0.73, 36.35]; $w^{(1)}$: [0, 0.5]; $\lambda^{(2)}_1$: [0, 0.63]; $\lambda^{(2)}_2$: [1.71, 39.35]; $w^{(2)}$: [0, 0.43]; $\mathrm{Leb}(A^{(1)})$: [0.04, 0.89]; $\mathrm{Leb}(A^{(2)})$: [0.04, 0.67]; $R_{12}(0)$: [0.03, 0.56]; $\alpha_1$: [22.08, 35.33]; $\alpha_2$: [28.49, 45.01]; $\kappa$: [2.78, 56.02]; $\kappa_1$: [0, 4.4]; $\kappa_2$: [0, 0.82].

Figure 9: Model 3: Boxplots of the estimated seven parameters ($a^{(1)}, H^{(1)}, a^{(2)}, H^{(2)}, \alpha_1, \alpha_2, \kappa$) from a bivariate trawl model with long memory trawl function and negative binomial Lévy seed, see Example 8. Note that we also estimated the parameters $\kappa_1, \kappa_2$, allowing possibly for independent factors, but their estimates were close to 0. The 95% confidence bounds shown beneath the boxplots are: $a_1$: [0, 1.07]; $H_1$: [1.12, 3.42]; $a_2$: [0, 35.03]; $H_2$: [1.06, 100]; $\mathrm{Leb}(A^{(1)})$: [0, 0.53]; $\mathrm{Leb}(A^{(2)})$: [0, 0.49]; $R_{12}(0)$: [0, 0.37]; $\alpha_1$: [22.48, 32.95]; $\alpha_2$: [28.18, 43.66]; $\kappa$: big; $\kappa_1$: [0, 275.27]; $\kappa_2$: big.

Figure 10: Model 4: Boxplots of the estimated seven parameters ($\lambda^{(1)}, \lambda^{(2)}, \alpha_1, \alpha_2, \kappa, \kappa_1, \kappa_2$) from a bivariate trawl model with exponential trawl function and negative binomial Lévy seed, see Example 9. The 95% confidence bounds shown beneath the boxplots are: $\lambda^{(1)}$: [0.99, 1.69]; $\lambda^{(2)}$: [1.19, 2.11]; $\alpha_1$: [22.94, 32.3]; $\alpha_2$: [29.82, 41.54]; $\kappa$: [1.51, 2.95]; $\kappa_1$: [0, 0.39]; $\kappa_2$: [0.02, 0.76].

Figure 11: Model 5: Boxplots of the estimated eleven parameters ($\lambda^{(1)}_1, \lambda^{(1)}_2, w^{(1)}, \lambda^{(2)}_1, \lambda^{(2)}_2, w^{(2)}, \alpha_1, \alpha_2, \kappa, \kappa_1, \kappa_2$) from a bivariate trawl model with double exponential trawl function and negative binomial Lévy seed, see Example 9. The 95% confidence bounds shown beneath the boxplots are: $\lambda^{(1)}_1$: [0, 1.86]; $\lambda^{(1)}_2$: [1.01, 34.58]; $w^{(1)}$: [0, 0.5]; $\lambda^{(2)}_1$: [0, 0.57]; $\lambda^{(2)}_2$: [1.27, 34.18]; $w^{(2)}$: [0, 0.5]; $\mathrm{Leb}(A^{(1)})$: [0.04, 0.81]; $\mathrm{Leb}(A^{(2)})$: [0.04, 0.69]; $R_{12}(0)$: [0.03, 0.52]; $\alpha_1$: [22.13, 34.86]; $\alpha_2$: [28.98, 44.67]; $\kappa$: [1.61, 27.92]; $\kappa_1$: [0, 27.62]; $\kappa_2$: [0, 19.12].

Figure 12: Model 6: Boxplots of the estimated nine parameters ($a^{(1)}, H^{(1)}, a^{(2)}, H^{(2)}, \alpha_1, \alpha_2, \kappa, \kappa_1, \kappa_2$) from a bivariate trawl model with long memory trawl function and negative binomial Lévy seed, see Example 9. The 95% confidence bounds shown beneath the boxplots are: $a_1$: [0, 3.06]; $H_1$: [1.14, 6.61]; $a_2$: [0, 37.03]; $H_2$: [1.05, 100]; $\mathrm{Leb}(A^{(1)})$: [0, 0.56]; $\mathrm{Leb}(A^{(2)})$: [0, 0.5]; $R_{12}(0)$: [0, 0.35]; $\alpha_1$: [21.39, 33.98]; $\alpha_2$: [28.3, 44.57]; $\kappa$: [1.7, 159.67]; $\kappa_1$: [0, 152.63]; $\kappa_2$: big.

Figure 13: Time series, autocorrelation and cross-correlation plots for a simulated bivariate sample path from Model 1. (Panels: two time series plots of Counts against Interval, two autocorrelation plots of ACF against Lag, and one cross-correlation plot over lags −100 to 100.)

[Figures 14-18: Time series, autocorrelation and cross-correlation plots for simulated bivariate sample paths from Models 2-6, respectively. Each figure shows two time series plots of Counts against Interval, two autocorrelation plots of ACF against Lag, and one cross-correlation plot over lags −100 to 100.]

5. Likelihood inference in the Poisson mixture model

An alternative inference method to the (generalized) method of moments presented in the main article is a composite likelihood approach, more specifically a pairwise likelihood approach.

We will now derive the composite (pairwise) likelihood for the Poisson mixture model defined in Section 3.2.1 of the main article. To this end, consider two components of the model and let $k \neq \ell$. Then

\[
\Pr(X_k = x_k, X_\ell = x_\ell)
= \int_{(0,\infty)^3} \Pr(X_k = x_k, X_\ell = x_\ell \mid U = u, V_k = v_k, V_\ell = v_\ell)\, f_U(u)\, f_{V_k}(v_k)\, f_{V_\ell}(v_\ell)\, du\, dv_k\, dv_\ell.
\]

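Note that since $X_j \mid Z_j \sim \mathrm{POI}(Z_j)$ for $Z_j = \alpha_j U + V_j$ for $j \in \{k, \ell\}$, we have

\[
\Pr(X_k = x_k, X_\ell = x_\ell \mid U = u, V_k = v_k, V_\ell = v_\ell)
= \exp\{-(\alpha_k u + v_k)\}\, \frac{1}{x_k!} (\alpha_k u + v_k)^{x_k}\, \exp\{-(\alpha_\ell u + v_\ell)\}\, \frac{1}{x_\ell!} (\alpha_\ell u + v_\ell)^{x_\ell}.
\]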

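Using the binomial theorem, we can write

\[
(\alpha_k u + v_k)^{x_k} = \sum_{w_1=0}^{x_k} \binom{x_k}{w_1} \alpha_k^{w_1} u^{w_1} v_k^{x_k - w_1},
\qquad
(\alpha_\ell u + v_\ell)^{x_\ell} = \sum_{w_1'=0}^{x_\ell} \binom{x_\ell}{w_1'} \alpha_\ell^{w_1'} u^{w_1'} v_\ell^{x_\ell - w_1'}.
\]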

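Hence, carrying along the factor $1/(x_k!\, x_\ell!)$ from the two Poisson probabilities, we have

\[
\Pr(X_k = x_k, X_\ell = x_\ell) = \frac{1}{x_k!\, x_\ell!} \sum_{w_1=0}^{x_k} \binom{x_k}{w_1} \alpha_k^{w_1} \sum_{w_1'=0}^{x_\ell} \binom{x_\ell}{w_1'} \alpha_\ell^{w_1'}\, \Gamma_1 \Gamma_2 \Gamma_3,
\]

where

\[
\begin{aligned}
\Gamma_1 &:= \Gamma_1(k, \ell, w_1, w_1') := \int_{(0,\infty)} u^{w_1 + w_1'} \exp\{-(\alpha_k + \alpha_\ell) u\}\, f_U(u)\, du = E\big(U^{w_1 + w_1'} e^{-(\alpha_k + \alpha_\ell) U}\big), \\
\Gamma_2 &:= \Gamma_2(k, \ell, w_1, w_1') := \int_{(0,\infty)} v_k^{x_k - w_1} \exp(-v_k)\, f_{V_k}(v_k)\, dv_k = E\big(V_k^{x_k - w_1} e^{-V_k}\big), \\
\Gamma_3 &:= \Gamma_3(k, \ell, w_1, w_1') := \int_{(0,\infty)} v_\ell^{x_\ell - w_1'} \exp(-v_\ell)\, f_{V_\ell}(v_\ell)\, dv_\ell = E\big(V_\ell^{x_\ell - w_1'} e^{-V_\ell}\big).
\end{aligned}
\]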

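In the case when $U \sim \Gamma(\kappa, 1)$ and $V_j \sim \Gamma(\kappa_j, 1/\alpha_j)$ for $j = 1, \dots, n$, we get

\[
\begin{aligned}
\Gamma_1 &= \frac{\Gamma(w_1 + w_1' + \kappa)}{\Gamma(\kappa)\, (\alpha_k + \alpha_\ell + 1)^{w_1 + w_1' + \kappa}}, \\
\Gamma_2 &= \frac{\Gamma(x_k - w_1 + \kappa_k)\, \alpha_k^{x_k - w_1}}{\Gamma(\kappa_k)\, (\alpha_k + 1)^{x_k - w_1 + \kappa_k}}, \\
\Gamma_3 &= \frac{\Gamma(x_\ell - w_1' + \kappa_\ell)\, \alpha_\ell^{x_\ell - w_1'}}{\Gamma(\kappa_\ell)\, (\alpha_\ell + 1)^{x_\ell - w_1' + \kappa_\ell}}.
\end{aligned}
\]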
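The double sum with the gamma closed forms is straightforward to evaluate numerically. The Python sketch below is an illustration under stated assumptions rather than the implementation used in the study: it evaluates Pr(X_k = x_k, X_ℓ = x_ℓ) including the factor 1/(x_k! x_ℓ!) from the two conditional Poisson probabilities, assumes κ, κ_k, κ_ℓ > 0, and uses hypothetical function names. As a sanity check it sums the probabilities over a finite grid and compares the mean of X_k with α_k(κ + κ_k), the mean of the mixing variable Z_k = α_k U + V_k.

```python
import math


def gamma_1(m, kappa, alpha_k, alpha_l):
    """E(U^m exp{-(alpha_k + alpha_l) U}) for U ~ Gamma(kappa, 1)."""
    return math.exp(math.lgamma(m + kappa) - math.lgamma(kappa)
                    - (m + kappa) * math.log(alpha_k + alpha_l + 1.0))


def gamma_23(m, kappa_j, alpha_j):
    """E(V^m exp{-V}) for V ~ Gamma(kappa_j, 1/alpha_j), i.e. rate 1/alpha_j."""
    return math.exp(math.lgamma(m + kappa_j) - math.lgamma(kappa_j)
                    + m * math.log(alpha_j)
                    - (m + kappa_j) * math.log(alpha_j + 1.0))


def pair_pmf(x_k, x_l, alpha_k, alpha_l, kappa, kappa_k, kappa_l):
    """Joint pmf of (X_k, X_l) in the Poisson mixture model with gamma mixing."""
    total = 0.0
    for w in range(x_k + 1):
        for w_prime in range(x_l + 1):
            total += (math.comb(x_k, w) * alpha_k ** w
                      * math.comb(x_l, w_prime) * alpha_l ** w_prime
                      * gamma_1(w + w_prime, kappa, alpha_k, alpha_l)
                      * gamma_23(x_k - w, kappa_k, alpha_k)
                      * gamma_23(x_l - w_prime, kappa_l, alpha_l))
    return total / (math.factorial(x_k) * math.factorial(x_l))


params = (0.3, 0.4, 1.5, 0.7, 0.4)  # alpha_k, alpha_l, kappa, kappa_k, kappa_l (illustrative)
grid = range(21)
total_mass = sum(pair_pmf(a, b, *params) for a in grid for b in grid)
mean_k = sum(a * pair_pmf(a, b, *params) for a in grid for b in grid)
print(total_mass, mean_k)  # expect values close to 1 and to 0.3 * (1.5 + 0.7)
```

The cost of one evaluation grows like (x_k + 1)(x_ℓ + 1), which gives a feel for why the full pairwise likelihood becomes expensive for large counts.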

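5.1. Finding the relation between $L^{(k)}(A)$, $L^{(\ell)}(A)$ and $X_k$, $X_\ell$

We will now need to link the $X$-variables to the corresponding trawl process. We know that

\[
E\big[\exp\{i(\theta_k, \theta_\ell)(L^{(k)}(A), L^{(\ell)}(A))^\top\}\big]
= \exp\{\mathrm{Leb}(A)\, C_{(L^{(k)\prime}, L^{(\ell)\prime})}(\theta_k, \theta_\ell)\}
= \exp\{\mathrm{Leb}(A)\, C_{(X_k, X_\ell)}(\theta_k, \theta_\ell)\}.
\]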

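When looking at the characteristic function, we observe the following: When comparing $(L^{(k)}(A), L^{(\ell)}(A))$ and $(X_k, X_\ell)$, the joint distribution of the former can be obtained by multiplying the intensity of the compound Poisson process underlying the latter by $\mathrm{Leb}(A)$. Hence we have

\[
\Pr(L^{(k)}(A) = x_k, L^{(\ell)}(A) = x_\ell) = \frac{1}{x_k!\, x_\ell!} \sum_{w_1=0}^{x_k} \binom{x_k}{w_1} \mathrm{Leb}(A)^{w_1} \alpha_k^{w_1} \sum_{w_1'=0}^{x_\ell} \binom{x_\ell}{w_1'} \mathrm{Leb}(A)^{w_1'} \alpha_\ell^{w_1'}\, \Gamma_1 \Gamma_2 \Gamma_3,
\]

where

\[
\begin{aligned}
\Gamma_1 := \Gamma_1(k, \ell, w_1, w_1') &:= \int_{(0,\infty)} u^{w_1 + w_1'} \exp\{-\mathrm{Leb}(A)(\alpha_k + \alpha_\ell) u\}\, f_U(u)\, du \\
&= \frac{\Gamma(w_1 + w_1' + \kappa)}{\Gamma(\kappa)\, \{\mathrm{Leb}(A)(\alpha_k + \alpha_\ell) + 1\}^{w_1 + w_1' + \kappa}}, \\
\Gamma_2 := \Gamma_2(k, \ell, w_1, w_1') &:= \int_{(0,\infty)} v_k^{x_k - w_1} \exp\{-\mathrm{Leb}(A) v_k\}\, f_{V_k}(v_k)\, dv_k \\
&= \frac{\Gamma(x_k - w_1 + \kappa_k)\, \{\mathrm{Leb}(A) \alpha_k\}^{x_k - w_1}}{\Gamma(\kappa_k)\, \{\mathrm{Leb}(A) \alpha_k + 1\}^{x_k - w_1 + \kappa_k}}, \\
\Gamma_3 := \Gamma_3(k, \ell, w_1, w_1') &:= \int_{(0,\infty)} v_\ell^{x_\ell - w_1'} \exp\{-\mathrm{Leb}(A) v_\ell\}\, f_{V_\ell}(v_\ell)\, dv_\ell \\
&= \frac{\Gamma(x_\ell - w_1' + \kappa_\ell)\, \{\mathrm{Leb}(A) \alpha_\ell\}^{x_\ell - w_1'}}{\Gamma(\kappa_\ell)\, \{\mathrm{Leb}(A) \alpha_\ell + 1\}^{x_\ell - w_1' + \kappa_\ell}}.
\end{aligned}
\]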

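I.e.,

\[
\begin{aligned}
\Pr(L^{(k)}(A) = x_k, L^{(\ell)}(A) = x_\ell) = \frac{\mathrm{Leb}(A)^{x_k + x_\ell}}{x_k!\, x_\ell!} \sum_{w_1=0}^{x_k} \binom{x_k}{w_1} \alpha_k^{w_1} \sum_{w_1'=0}^{x_\ell} \binom{x_\ell}{w_1'} \alpha_\ell^{w_1'}
&\, \frac{\Gamma(w_1 + w_1' + \kappa)}{\Gamma(\kappa)\, \{\mathrm{Leb}(A)(\alpha_k + \alpha_\ell) + 1\}^{w_1 + w_1' + \kappa}} \\
&\cdot \frac{\Gamma(x_k - w_1 + \kappa_k)\, \alpha_k^{x_k - w_1}}{\Gamma(\kappa_k)\, \{\mathrm{Leb}(A) \alpha_k + 1\}^{x_k - w_1 + \kappa_k}} \\
&\cdot \frac{\Gamma(x_\ell - w_1' + \kappa_\ell)\, \alpha_\ell^{x_\ell - w_1'}}{\Gamma(\kappa_\ell)\, \{\mathrm{Leb}(A) \alpha_\ell + 1\}^{x_\ell - w_1' + \kappa_\ell}}.
\end{aligned}
\]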

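Suppose we have already estimated the trawl parameters and hence know $\mathrm{Leb}(A^{(k)} \setminus A^{(\ell)})$ and $\mathrm{Leb}(A^{(k)} \cap A^{(\ell)})$. Then the parameters of the multivariate negative binomial distribution can be estimated using the pairwise likelihood for fixed trawl parameters given by

\[
l(\theta, y) = \sum_{k=1}^{n} \sum_{i=0}^{\lfloor t/\Delta_m \rfloor} \ln\{\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i)\}
+ \sum_{k=1}^{n-1} \sum_{\ell=k+1}^{n} \sum_{i=0}^{\lfloor t/\Delta_m \rfloor} \ln\{\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(\ell)}_{i\Delta_m} = y^{(\ell)}_i)\}.
\]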

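We note that

\[
\begin{aligned}
\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(\ell)}_{i\Delta_m} = y^{(\ell)}_i)
= \sum_{c_1=0}^{y^{(k)}_i} \sum_{c_2=0}^{y^{(\ell)}_i}
&\Pr\{L^{(k)}(A^{(k)} \setminus A^{(\ell)}) = y^{(k)}_i - c_1\}\, \Pr\{L^{(\ell)}(A^{(\ell)} \setminus A^{(k)}) = y^{(\ell)}_i - c_2\} \\
&\cdot \Pr\{L^{(k)}(A^{(k)} \cap A^{(\ell)}) = c_1, L^{(\ell)}(A^{(k)} \cap A^{(\ell)}) = c_2\}.
\end{aligned}
\]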
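The convolution over (c1, c2) can be sketched generically in Python. In the illustration below the three probability mass functions are passed in as callables; the Poisson stand-ins are a toy assumption chosen only because the answer is then known in closed form (sums of independent Poisson variables are Poisson), and are not the negative binomial laws of the model. All function names are hypothetical.

```python
import math


def poisson_pmf(x, mean):
    """Poisson pmf, used here only as a toy stand-in for the set-wise laws."""
    return math.exp(-mean + x * math.log(mean) - math.lgamma(x + 1))


def pair_probability(y_k, y_l, pmf_k_only, pmf_l_only, pmf_common):
    """Sum over the counts (c1, c2) contributed by the intersection of the trawls.

    pmf_k_only(x): law of L^(k) on A^(k) minus A^(l)
    pmf_l_only(x): law of L^(l) on A^(l) minus A^(k)
    pmf_common(c1, c2): joint law of (L^(k), L^(l)) on the intersection
    """
    total = 0.0
    for c1 in range(y_k + 1):
        for c2 in range(y_l + 1):
            total += (pmf_k_only(y_k - c1) * pmf_l_only(y_l - c2)
                      * pmf_common(c1, c2))
    return total


# Toy check: with independent Poisson pieces, Y^(k) ~ POI(0.7 + 0.5) and
# Y^(l) ~ POI(1.1 + 0.9) are independent, so the result must factorise.
value = pair_probability(
    3, 2,
    lambda x: poisson_pmf(x, 0.7),
    lambda x: poisson_pmf(x, 1.1),
    lambda c1, c2: poisson_pmf(c1, 0.5) * poisson_pmf(c2, 0.9),
)
print(value, poisson_pmf(3, 1.2) * poisson_pmf(2, 2.0))
```

In the model itself, `pmf_common` would be the bivariate probability mass function of Section 5.1 and the other two arguments the corresponding univariate negative binomial laws.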

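If one would like to estimate both the trawl parameters and the parameters of the multivariate negative binomial distribution simultaneously via a pairwise likelihood approach, then the above likelihood can be extended to

\[
\begin{aligned}
l(\theta, y) &= \sum_{k=1}^{n} \sum_{i=0}^{\lfloor t/\Delta_m \rfloor} \sum_{h=-i,\, h \neq 0}^{\lfloor t/\Delta_m \rfloor - i} \ln\{\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(k)}_{(i+h)\Delta_m} = y^{(k)}_{i+h})\} \\
&\quad + \sum_{k=1}^{n} \sum_{i=0}^{\lfloor t/\Delta_m \rfloor} \ln\{\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i)\} \\
&\quad + \sum_{k=1}^{n-1} \sum_{\ell=k+1}^{n} \sum_{i=0}^{\lfloor t/\Delta_m \rfloor} \ln\{\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(\ell)}_{i\Delta_m} = y^{(\ell)}_i)\}.
\end{aligned}
\]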

In simulation studies (not reported here) we found that in the univariate case inference via the pairwise likelihood method works well. However, already in the bivariate case (and for rather small samples), the evaluation of the pairwise likelihood (consisting of the various sums stated above) is rather time consuming and does not appear feasible yet. It will be worthwhile to investigate in future research whether the pairwise likelihood can be implemented in an efficient way so that bivariate and hopefully higher dimensional trawl processes can be estimated by likelihood methods. Since the asymptotic theory for composite likelihood methods is available in general, one could then use these results for the inference rather than relying on a parametric bootstrap which (depending on the model specification) can be computationally intense.

6. Extension of the Poisson mixture model to allow for bivariate interactions

In our empirical application we focused on a bivariate example. For a higher dimensional application one might wonder whether the dependence structure generated by the random additive effect model consisting of one common factor is rich enough. Hence, in this section we extend the Poisson mixture model presented in Section 3.2.1 in the main article so that dependence between the various components enters through pairwise interaction terms. This is in the spirit of Example 5. We illustrate this idea first in a four-variate example and describe the general n-dimensional setting next.

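Example 6. Consider a four-variate setting with four random variables $\{Z_1, \dots, Z_4\}$ defined as

\[
\begin{aligned}
Z_1 &= \alpha_1 (U_{12} + U_{13} + U_{14}) + V_1, \\
Z_2 &= \alpha_2 (U_{12} + U_{23} + U_{24}) + V_2, \\
Z_3 &= \alpha_3 (U_{13} + U_{23} + U_{34}) + V_3, \\
Z_4 &= \alpha_4 (U_{14} + U_{24} + U_{34}) + V_4,
\end{aligned}
\]

where $U_{ij} \sim \Gamma(\kappa_{ij}, 1)$, $V_i \sim \Gamma(\kappa_i, 1/\alpha_i)$, and we assume that all the $U$s and $V$s are independent. Then

\[
\begin{aligned}
Z_1 &\sim \Gamma(\kappa_{12} + \kappa_{13} + \kappa_{14} + \kappa_1, 1/\alpha_1), \\
Z_2 &\sim \Gamma(\kappa_{12} + \kappa_{23} + \kappa_{24} + \kappa_2, 1/\alpha_2), \\
Z_3 &\sim \Gamma(\kappa_{13} + \kappa_{23} + \kappa_{34} + \kappa_3, 1/\alpha_3), \\
Z_4 &\sim \Gamma(\kappa_{14} + \kappa_{24} + \kappa_{34} + \kappa_4, 1/\alpha_4).
\end{aligned}
\]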

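Recall that the $Z$s feature as the corresponding means in the Poisson mixture construction. There are 14 parameters which need to be estimated: $\alpha_1, \alpha_2, \alpha_3, \alpha_4$ and $\kappa_{12}, \kappa_{13}, \kappa_{14}, \kappa_{23}, \kappa_{24}, \kappa_{34}$ and $\kappa_1, \kappa_2, \kappa_3, \kappa_4$.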
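The bookkeeping of Example 6 extends to general n. The short Python sketch below uses hypothetical names and illustrative parameter values: it computes the gamma shape parameter of each Z_i from the pairwise and individual κ-parameters and counts the parameters of the model as n scale parameters α_i, n(n − 1)/2 pairwise parameters κ_ij and n individual parameters κ_i.

```python
def z_shape_parameters(n, kappa_pair, kappa_own):
    """Shape parameter of Z_i: sum of kappa_ij over j != i, plus kappa_i.

    kappa_pair is keyed by ordered pairs (i, j) with i < j; indices run from 1 to n.
    """
    shapes = {}
    for i in range(1, n + 1):
        shared = sum(kappa_pair[(min(i, j), max(i, j))]
                     for j in range(1, n + 1) if j != i)
        shapes[i] = shared + kappa_own[i]
    return shapes


def parameter_count(n):
    """n alphas, n(n-1)/2 pairwise kappas and n individual kappas."""
    return n + n * (n - 1) // 2 + n


# Illustrative values for the four-variate example.
kappa_pair = {(1, 2): 0.5, (1, 3): 0.2, (1, 4): 0.1,
              (2, 3): 0.4, (2, 4): 0.3, (3, 4): 0.6}
kappa_own = {1: 1.0, 2: 0.8, 3: 0.6, 4: 0.4}
print(z_shape_parameters(4, kappa_pair, kappa_own))
print(parameter_count(4))
```

The count grows quadratically in n, which is the price paid for the richer pairwise dependence structure.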

Let us now focus on the general construction. Consider random variables $X_1, \dots, X_n$ and positive random variables $Z_1, \dots, Z_n$ and assume that $(X_1, \dots, X_n) \mid (Z_1 = z_1, \dots, Z_n = z_n)$ are independent and Poisson distributed with means given by $\{z_1, \dots, z_n\}$. We then model the joint distribution of $\{Z_1, \dots, Z_n\}$ by an additive effect model with pairwise interaction terms:

\[
Z_i = \alpha_i \sum_{j=1,\, j \neq i}^{n} U_{ij} + V_i,
\]

where $U_{ij} = U_{ji}$ for all $i \neq j$. We assume that all the $U$s and $V$s are independent nonnegative random variables. Using the notation $a \wedge b = \min(a, b)$ and $a \vee b = \max(a, b)$, we sometimes write

\[
Z_i = \alpha_i \sum_{j=1,\, j \neq i}^{n} U_{(i \wedge j)(i \vee j)} + V_i.
\]

Proposition 1. The probability generating function is given by

\[
G(t_1, \dots, t_n) = E(t_1^{X_1} \cdots t_n^{X_n})
= \prod_{j=1}^{n} \prod_{k=j+1}^{n} M_{U_{jk}}\{\alpha_j (t_j - 1) + \alpha_k (t_k - 1)\} \prod_{j=1}^{n} M_{V_j}(t_j - 1),
\]

for $t_1, \dots, t_n \in \mathbb{R}$ with $\max_{1 \le i \le n} |t_i| < 1$, where $M_{U_{jk}}, M_{V_j}$ denote the moment generating functions of $U_{jk}$ and $V_j$, respectively.

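Proof of Proposition 1.

\[
\begin{aligned}
G(t_1, \dots, t_n) &= E(t_1^{X_1} \cdots t_n^{X_n})
= E\{E(t_1^{X_1} \cdots t_n^{X_n} \mid Z_1, \dots, Z_n)\}
= E\Big\{\prod_{j=1}^{n} E(t_j^{X_j} \mid Z_j)\Big\} \\
&= E\Big[\prod_{j=1}^{n} \exp\{-Z_j (1 - t_j)\}\Big]
= E\Big[\exp\Big\{-\sum_{j=1}^{n} Z_j (1 - t_j)\Big\}\Big] \\
&= E\Big[\exp\Big\{-\sum_{j=1}^{n} \Big(\alpha_j \sum_{k=1,\, k \neq j}^{n} U_{jk} + V_j\Big)(1 - t_j)\Big\}\Big] \\
&= E\Big[\exp\Big\{-\sum_{j=1}^{n} \alpha_j \sum_{k=1,\, k \neq j}^{n} U_{jk} (1 - t_j)\Big\}\Big]\, E\Big[\exp\Big\{-\sum_{j=1}^{n} V_j (1 - t_j)\Big\}\Big] \\
&= E\Big[\exp\Big\{\sum_{j=1}^{n} \alpha_j (t_j - 1) \sum_{k=1,\, k \neq j}^{n} U_{jk}\Big\}\Big]\, E\Big[\exp\Big\{\sum_{j=1}^{n} V_j (t_j - 1)\Big\}\Big] \\
&= \prod_{j=1}^{n} \prod_{k=j+1}^{n} E\big(\exp[U_{jk}\{\alpha_j (t_j - 1) + \alpha_k (t_k - 1)\}]\big)\, E\Big[\exp\Big\{\sum_{j=1}^{n} V_j (t_j - 1)\Big\}\Big] \\
&= \prod_{j=1}^{n} \prod_{k=j+1}^{n} M_{U_{jk}}\{\alpha_j (t_j - 1) + \alpha_k (t_k - 1)\} \prod_{j=1}^{n} M_{V_j}(t_j - 1),
\end{aligned}
\]

where $M_{U_{jk}}, M_{V_j}$ denote the moment generating functions of $U_{jk}$ and $V_j$, respectively.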

Proposition 2. The Poisson mixture model of random-additive-effect type with pairwise interaction terms can be represented as a discrete compound Poisson distribution with rate

\[
v = -\Big\{\sum_{j=1}^{n} \sum_{k=j+1}^{n} K_{U_{jk}}(\alpha_j + \alpha_k) + \sum_{j=1}^{n} K_{V_j}(1)\Big\},
\]

where $K_{U_{jk}}, K_{V_j}$ denote the corresponding kumulant functions of $U_{jk}$ and $V_j$, respectively (see the proof below), and the joint jump sizes having a Laplace transform (for positive $\theta$) given by

\[
\mathcal{L}_C(\theta) = \frac{1}{v} \Big\{\sum_{j=1}^{n} \sum_{k=j+1}^{n} \sum_{\ell=1}^{\infty} \big(\alpha_j e^{-\theta_j} + \alpha_k e^{-\theta_k}\big)^{\ell} q^{(U_{jk})}_{\ell} + \sum_{j=1}^{n} \sum_{\ell=1}^{\infty} e^{-\theta_j \ell} q^{(V_j)}_{\ell}\Big\},
\]

where the definitions of $q^{(U_{jk})}_{\ell}, q^{(V_j)}_{\ell}$ are given in the proof below.

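Proof of Proposition 2. We write $\mathcal{L}_{U_{jk}}, \mathcal{L}_{V_j}$ for the corresponding Laplace transforms and set $K = \ln \mathcal{L}$ for the so-called kumulant function. We can now deduce the Laplace transform for positive $\theta$ as

\[
\begin{aligned}
\mathcal{L}(\theta_1, \dots, \theta_n) &= G(e^{-\theta_1}, \dots, e^{-\theta_n}) \\
&= \prod_{j=1}^{n} \prod_{k=j+1}^{n} M_{U_{jk}}\big(\alpha_j (e^{-\theta_j} - 1) + \alpha_k (e^{-\theta_k} - 1)\big) \prod_{j=1}^{n} M_{V_j}(e^{-\theta_j} - 1) \\
&= \prod_{j=1}^{n} \prod_{k=j+1}^{n} \mathcal{L}_{U_{jk}}\big(\alpha_j (1 - e^{-\theta_j}) + \alpha_k (1 - e^{-\theta_k})\big) \prod_{j=1}^{n} \mathcal{L}_{V_j}(1 - e^{-\theta_j}) \\
&= \exp\Big\{\sum_{j=1}^{n} \sum_{k=j+1}^{n} \ln \mathcal{L}_{U_{jk}}\big(\alpha_j (1 - e^{-\theta_j}) + \alpha_k (1 - e^{-\theta_k})\big) + \sum_{j=1}^{n} \ln \mathcal{L}_{V_j}(1 - e^{-\theta_j})\Big\} \\
&= \exp\Big\{\sum_{j=1}^{n} \sum_{k=j+1}^{n} K_{U_{jk}}\big(\alpha_j (1 - e^{-\theta_j}) + \alpha_k (1 - e^{-\theta_k})\big) + \sum_{j=1}^{n} K_{V_j}(1 - e^{-\theta_j})\Big\}.
\end{aligned}
\]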

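Since $U_{jk}$ is a subordinator without drift, we know that

\[
K_{U_{jk}}(a) = \int_{\mathbb{R}} (e^{-ax} - 1)\, \nu_{U_{jk}}(dx).
\]

For $a = \alpha_j (1 - e^{-\theta_j}) + \alpha_k (1 - e^{-\theta_k})$, we get

\[
K_{U_{jk}}(a) = \int_{\mathbb{R}} \big(e^{-(\alpha_j + \alpha_k) x} - 1\big)\, \nu_{U_{jk}}(dx) + \int_{\mathbb{R}} e^{-(\alpha_j + \alpha_k) x} \big(e^{(\alpha_j e^{-\theta_j} + \alpha_k e^{-\theta_k}) x} - 1\big)\, \nu_{U_{jk}}(dx).
\]

Note that the second summand in the above expression can be rewritten as

\[
\begin{aligned}
\int_{\mathbb{R}} e^{-(\alpha_j + \alpha_k) x} \big(e^{(\alpha_j e^{-\theta_j} + \alpha_k e^{-\theta_k}) x} - 1\big)\, \nu_{U_{jk}}(dx)
&= \int_{\mathbb{R}} e^{-(\alpha_j + \alpha_k) x} \sum_{\ell=1}^{\infty} \frac{1}{\ell!} \big(\alpha_j e^{-\theta_j} + \alpha_k e^{-\theta_k}\big)^{\ell} x^{\ell}\, \nu_{U_{jk}}(dx) \\
&= \sum_{\ell=1}^{\infty} \big(\alpha_j e^{-\theta_j} + \alpha_k e^{-\theta_k}\big)^{\ell} \underbrace{\frac{1}{\ell!} \int_{\mathbb{R}} e^{-(\alpha_j + \alpha_k) x} x^{\ell}\, \nu_{U_{jk}}(dx)}_{=:\, q^{(U_{jk})}_{\ell}}.
\end{aligned}
\]

Hence

\[
K_{U_{jk}}(a) = K_{U_{jk}}(\alpha_j + \alpha_k) + \sum_{\ell=1}^{\infty} \big(\alpha_j e^{-\theta_j} + \alpha_k e^{-\theta_k}\big)^{\ell} q^{(U_{jk})}_{\ell}.
\]

Similarly,

\[
\sum_{j=1}^{n} K_{V_j}(1 - e^{-\theta_j}) = \sum_{j=1}^{n} \Big\{K_{V_j}(1) + \sum_{\ell=1}^{\infty} e^{-\theta_j \ell} q^{(V_j)}_{\ell}\Big\},
\quad \text{where } q^{(V_j)}_{\ell} = \int_{\mathbb{R}} \frac{x^{\ell}}{\ell!} e^{-x}\, \nu_{V_j}(dx).
\]

So, overall we have

\[
\begin{aligned}
K_X(\theta) &= \ln \mathcal{L}(\theta_1, \dots, \theta_n) \\
&= \sum_{j=1}^{n} \sum_{k=j+1}^{n} K_{U_{jk}}(\alpha_j + \alpha_k) + \sum_{j=1}^{n} K_{V_j}(1) \\
&\quad + \sum_{j=1}^{n} \sum_{k=j+1}^{n} \sum_{\ell=1}^{\infty} \big(\alpha_j e^{-\theta_j} + \alpha_k e^{-\theta_k}\big)^{\ell} q^{(U_{jk})}_{\ell} + \sum_{j=1}^{n} \sum_{\ell=1}^{\infty} e^{-\theta_j \ell} q^{(V_j)}_{\ell} \\
&= -v + v\, \mathcal{L}_C(\theta),
\end{aligned}
\]

if and only if

\[
\begin{aligned}
v &= -\Big\{\sum_{j=1}^{n} \sum_{k=j+1}^{n} K_{U_{jk}}(\alpha_j + \alpha_k) + \sum_{j=1}^{n} K_{V_j}(1)\Big\}, \\
\mathcal{L}_C(\theta) &= \frac{1}{v} \Big\{\sum_{j=1}^{n} \sum_{k=j+1}^{n} \sum_{\ell=1}^{\infty} \big(\alpha_j e^{-\theta_j} + \alpha_k e^{-\theta_k}\big)^{\ell} q^{(U_{jk})}_{\ell} + \sum_{j=1}^{n} \sum_{\ell=1}^{\infty} e^{-\theta_j \ell} q^{(V_j)}_{\ell}\Big\}.
\end{aligned}
\]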
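For gamma mixing variables the ingredients of the proof are available in closed form, which allows a quick numerical check of the series expansion. The Python sketch below is an illustration under the following assumptions, which are standard facts about the gamma law rather than statements taken from this supplement: for U ∼ Γ(κ, 1) the Lévy measure is ν(dx) = κ x^{−1} e^{−x} dx on (0, ∞), so that K_U(a) = −κ ln(1 + a) and q_ℓ = κ/{ℓ (1 + α_j + α_k)^ℓ}. The function names are hypothetical.

```python
import math


def kumulant_gamma(a, kappa):
    """K_U(a) = ln E(exp(-a U)) = -kappa * ln(1 + a) for U ~ Gamma(kappa, 1)."""
    return -kappa * math.log1p(a)


def series_side(theta_j, theta_k, kappa, alpha_j, alpha_k, terms=200):
    """K_U(alpha_j + alpha_k) + sum over l of s^l * q_l, truncated after `terms` terms."""
    s = alpha_j * math.exp(-theta_j) + alpha_k * math.exp(-theta_k)
    ratio = s / (1.0 + alpha_j + alpha_k)  # s^l * q_l equals kappa * ratio^l / l
    tail = sum(kappa * ratio ** l / l for l in range(1, terms + 1))
    return kumulant_gamma(alpha_j + alpha_k, kappa) + tail


def direct_side(theta_j, theta_k, kappa, alpha_j, alpha_k):
    """K_U evaluated at a = alpha_j (1 - e^{-theta_j}) + alpha_k (1 - e^{-theta_k})."""
    a = (alpha_j * (1.0 - math.exp(-theta_j))
         + alpha_k * (1.0 - math.exp(-theta_k)))
    return kumulant_gamma(a, kappa)


args = (0.5, 0.2, 1.5, 2.0, 3.0)  # theta_j, theta_k, kappa, alpha_j, alpha_k (illustrative)
print(series_side(*args), direct_side(*args))  # the two values should agree
```

In the bivariate fully dependent case the rate reduces to v = −K_U(α1 + α2) = κ ln(1 + α1 + α2), which is the intensity used in Algorithm 2.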

6.1. Negative binomial marginal law

Let us now focus on the case that the mixing variables $U$s and $V$s follow a gamma marginal law so that $X$ follows a multivariate negative binomial distribution. Recall the result for the joint probability generating function of $(X_1, \dots, X_n)$,

\[
G(t_1, \dots, t_n) = E(t_1^{X_1} \cdots t_n^{X_n}) = \prod_{j=1}^{n} \prod_{k=j+1}^{n} M_{U_{jk}}\{\alpha_j (t_j - 1) + \alpha_k (t_k - 1)\} \prod_{j=1}^{n} M_{V_j}(t_j - 1),
\]

and consider three scenarios of particular interest:

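Example 7 (Independence case). Set $\alpha_i \equiv 0$, for $i = 1, \dots, n$, and choose $V_i \sim \Gamma(\kappa_i, 1/\beta_i)$. Then $E(t_1^{X_1} \cdots t_n^{X_n}) = \prod_{i=1}^{n} (1 - \beta_i (t_i - 1))^{-\kappa_i}$, which implies that the $X_i$ are independent and satisfy $X_i \sim \mathrm{NB}(\kappa_i, \beta_i/(1 + \beta_i))$.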

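Example 8 (Dependence through pairwise common factors). Choose $U_{ij} \sim \Gamma(\kappa_{ij}, 1)$ and $V_i \equiv 0$, for $i, j = 1, \dots, n$. Then

\[
G(t_1, \dots, t_n) = \prod_{j=1}^{n} \prod_{k=j+1}^{n} M_{U_{jk}}\{\alpha_j (t_j - 1) + \alpha_k (t_k - 1)\}
= \prod_{j=1}^{n} \prod_{k=j+1}^{n} \big[1 - \{\alpha_j (t_j - 1) + \alpha_k (t_k - 1)\}\big]^{-\kappa_{jk}},
\]

and

\[
G(t_i) = \{1 - \alpha_i (t_i - 1)\}^{-\sum_{j=1,\, j \neq i}^{n} \kappa_{ij}},
\]

which implies that $X_i \sim \mathrm{NB}\big(\sum_{j=1,\, j \neq i}^{n} \kappa_{ij},\, \alpha_i/(1 + \alpha_i)\big)$.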

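Example 9 (Dependence through pairwise common factors and additional independent factors). Suppose that $U_{ij} \sim \Gamma(\kappa_{ij}, 1)$ and $V_i \sim \Gamma(\kappa_i, 1/\alpha_i)$. Then we can deduce that

\[
G(t_1, \dots, t_n) = \prod_{j=1}^{n} \prod_{k=j+1}^{n} \big[1 - \{\alpha_j (t_j - 1) + \alpha_k (t_k - 1)\}\big]^{-\kappa_{jk}} \prod_{i=1}^{n} \{1 - \alpha_i (t_i - 1)\}^{-\kappa_i},
\]

and

\[
G(t_i) = \{1 - \alpha_i (t_i - 1)\}^{-\sum_{j=1,\, j \neq i}^{n} \kappa_{ij} - \kappa_i},
\]

which implies that $X_i \sim \mathrm{NB}\big(\sum_{j=1,\, j \neq i}^{n} \kappa_{ij} + \kappa_i,\, \alpha_i/(1 + \alpha_i)\big)$.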
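The probability generating function of Example 9 can be evaluated directly, which gives a simple numerical check of the marginal law. The Python sketch below (hypothetical names, illustrative parameter values, components indexed from 0) evaluates G and approximates the marginal mean by a central finite difference of G at t = (1, ..., 1); for a negative binomial marginal with the stated parameters this mean should equal α_i(Σ_{j≠i} κ_ij + κ_i).

```python
import math


def pgf(t, alpha, kappa_pair, kappa_own):
    """G(t_1, ..., t_n) for gamma mixing variables as in Example 9.

    kappa_pair is keyed by pairs (j, k) with j < k; indices start at 0 here.
    """
    n = len(t)
    log_g = 0.0
    for j in range(n):
        for k in range(j + 1, n):
            s = alpha[j] * (t[j] - 1.0) + alpha[k] * (t[k] - 1.0)
            log_g -= kappa_pair[(j, k)] * math.log(1.0 - s)
        log_g -= kappa_own[j] * math.log(1.0 - alpha[j] * (t[j] - 1.0))
    return math.exp(log_g)


def marginal_mean(i, alpha, kappa_pair, kappa_own, eps=1e-6):
    """Central finite-difference approximation of dG/dt_i at t = (1, ..., 1)."""
    up = [1.0] * len(alpha)
    down = [1.0] * len(alpha)
    up[i] += eps
    down[i] -= eps
    return (pgf(up, alpha, kappa_pair, kappa_own)
            - pgf(down, alpha, kappa_pair, kappa_own)) / (2.0 * eps)


alpha = [0.5, 1.2, 2.0]
kappa_pair = {(0, 1): 0.4, (0, 2): 0.3, (1, 2): 0.7}
kappa_own = [0.2, 0.1, 0.5]
approx = marginal_mean(0, alpha, kappa_pair, kappa_own)
exact = alpha[0] * (kappa_pair[(0, 1)] + kappa_pair[(0, 2)] + kappa_own[0])
print(approx, exact)
```

Setting all entries of `kappa_own` to values close to zero approximates the setting of Example 8.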


6.2. Pairwise likelihood

Let us next derive the pairwise likelihood for the random-additive-effect type model with pairwise interaction terms.

To this end, suppose we have the observations $y^{(k)}_i$, which are realizations of $Y^{(k)}_{i\Delta_m}$, for $i = 0, \dots, \lfloor t/\Delta_m \rfloor$, for $k = 1, \dots, n$. We will conveniently summarize them in the matrix $y = (y^{(k)}_i)_{i=0,\dots,\lfloor t/\Delta_m \rfloor,\, k=1,\dots,n}$.

We are going to construct a composite likelihood in the form of a pairwise likelihood as follows. We consider the pairs $(Y^{(k)}_{i\Delta_m}, Y^{(\ell)}_{j\Delta_m})$ for $k, \ell \in \{1, \dots, n\}$ and $i, j \in \{0, \dots, \lfloor t/\Delta_m \rfloor\}$.

From our construction of the model we can deduce that not all possible pairs need to be considered in order to identify the relevant parameters. More precisely, we can restrict ourselves to the

• purely temporal pairs $(Y^{(k)}_{i\Delta_m}, Y^{(k)}_{(i+h)\Delta_m})$ for $k \in \{1, \dots, n\}$, $h \in \{1, \dots, \lfloor t/\Delta_m \rfloor\}$ and $i \in \{0, \dots, \lfloor t/\Delta_m \rfloor - h\}$,

• purely cross-component pairs $(Y^{(k)}_{i\Delta_m}, Y^{(\ell)}_{i\Delta_m})$ for $k, \ell \in \{1, \dots, n\}$ with $k \neq \ell$ and $i \in \{0, \dots, \lfloor t/\Delta_m \rfloor\}$.
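The two families of pairs can be enumerated with a few lines of Python, which also makes the number of likelihood terms explicit. The sketch below uses hypothetical function names, writes m for ⌊t/Δ_m⌋, and lists each cross-component pair once with k < ℓ, which is an assumption in line with the summation range of the pairwise log-likelihood.

```python
def temporal_pairs(n, m):
    """Yield (k, i, i + h): component k observed at grid indices i and i + h."""
    for k in range(1, n + 1):
        for h in range(1, m + 1):
            for i in range(0, m - h + 1):
                yield (k, i, i + h)


def cross_component_pairs(n, m):
    """Yield (k, l, i): components k < l observed at the same grid index i."""
    for k in range(1, n):
        for l in range(k + 1, n + 1):
            for i in range(0, m + 1):
                yield (k, l, i)


n, m = 3, 4  # three components, grid indices 0, ..., 4 (illustrative)
n_temporal = sum(1 for _ in temporal_pairs(n, m))
n_cross = sum(1 for _ in cross_component_pairs(n, m))
print(n_temporal, n_cross)
```

The temporal family contains n m(m + 1)/2 pairs and the cross-component family n(n − 1)(m + 1)/2 pairs, so the temporal part dominates the computational cost for long series.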

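The resulting pairwise log-likelihood function for the parameter vector $\theta$ is then given by

\[
\begin{aligned}
l(\theta, y) &= \sum_{k=1}^{n} \sum_{i=0}^{\lfloor t/\Delta_m \rfloor} \sum_{h=-i,\, h \neq 0}^{\lfloor t/\Delta_m \rfloor - i} \ln\{\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(k)}_{(i+h)\Delta_m} = y^{(k)}_{i+h})\} \\
&\quad + \sum_{k=1}^{n} \sum_{i=0}^{\lfloor t/\Delta_m \rfloor} \ln\{\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i)\} \\
&\quad + \sum_{k=1}^{n-1} \sum_{\ell=k+1}^{n} \sum_{i=0}^{\lfloor t/\Delta_m \rfloor} \ln\{\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(\ell)}_{i\Delta_m} = y^{(\ell)}_i)\}.
\end{aligned}
\]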

6.2.1. Computing the pairwise probabilitiesFor the pairwise likelihood we need to compute the quantity Pr(Y(k)

i∆m= y(k)

i ,Y(ℓ)j∆m= y(ℓ)

j ). To this endnote that we have the following decomposition

Y(k)i∆m= L(k)(A(k)

i∆m) = L(k)(A(k)

i∆m∩ A(ℓ)

j∆m) + L(k)(A(k)

i∆m\ A(ℓ)

j∆m),

Y(ℓ)j∆m= L(ℓ)(A(ℓ)

j∆m) = L(ℓ)(A(k)

i∆m∩ A(ℓ)

j∆m) + L(ℓ)(A(ℓ)

j∆m\ A(k)

i∆m),

where the three sets

B(k,ℓ)i, j = A(k)

i∆m∩ A(ℓ)

j∆m,

B(k,ℓ)i\ j = A(k)

i∆m\ A(ℓ)

j∆m,

B(ℓ,k)j\i = A(ℓ)

j∆m\ A(k)

i∆m

are disjoint. Suppose for now that none of the setsB(k,ℓ)i, j , B

(k,ℓ)i\ j , B

(ℓ,k)j\i is empty in which case the computa-

tions could be further simplified. Then, using the law of total probability, we can deduce that

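\[
\begin{aligned}
&\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(\ell)}_{j\Delta_m} = y^{(\ell)}_j) \\
&= \sum_{c_1, c_2} \Pr\{Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(\ell)}_{j\Delta_m} = y^{(\ell)}_j, L^{(k)}(B^{(k,\ell)}_{i,j}) = c_1, L^{(\ell)}(B^{(k,\ell)}_{i,j}) = c_2\} \\
&= \sum_{c_1, c_2} \Pr\{L^{(k)}(B^{(k,\ell)}_{i \setminus j}) = y^{(k)}_i - c_1, L^{(\ell)}(B^{(\ell,k)}_{j \setminus i}) = y^{(\ell)}_j - c_2, L^{(k)}(B^{(k,\ell)}_{i,j}) = c_1, L^{(\ell)}(B^{(k,\ell)}_{i,j}) = c_2\} \\
&= \sum_{c_1=0}^{y^{(k)}_i} \sum_{c_2=0}^{y^{(\ell)}_j} \Pr\{L^{(k)}(B^{(k,\ell)}_{i \setminus j}) = y^{(k)}_i - c_1\} \Pr\{L^{(\ell)}(B^{(\ell,k)}_{j \setminus i}) = y^{(\ell)}_j - c_2\} \Pr\{L^{(k)}(B^{(k,\ell)}_{i,j}) = c_1, L^{(\ell)}(B^{(k,\ell)}_{i,j}) = c_2\}.
\end{aligned}
\]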
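The final double sum is a discrete convolution and is straightforward to evaluate once the three probability mass functions are available. The sketch below is a minimal Python illustration under stated assumptions: the function names are hypothetical, the probability mass functions are passed in as callables, and the toy check uses independent Poisson laws on the overlap set purely for verification (this is not the bivariate law used in the article).

```python
import math


def poisson_pmf(x, rate):
    """Poisson probability mass function; returns 0 for negative x."""
    if x < 0:
        return 0.0
    return math.exp(-rate + x * math.log(rate) - math.lgamma(x + 1))


def pairwise_probability(y_k, y_l, pmf_k_only, pmf_l_only, joint_pmf_overlap):
    """Evaluate the double sum over c1 and c2.

    pmf_k_only(x):            Pr{L^(k)(A_k minus A_l) = x}
    pmf_l_only(x):            Pr{L^(l)(A_l minus A_k) = x}
    joint_pmf_overlap(c1, c2): Pr{L^(k)(overlap) = c1, L^(l)(overlap) = c2}
    """
    total = 0.0
    for c1 in range(y_k + 1):
        for c2 in range(y_l + 1):
            total += (pmf_k_only(y_k - c1)
                      * pmf_l_only(y_l - c2)
                      * joint_pmf_overlap(c1, c2))
    return total


# Toy check: with independent Poisson counts on the overlap, the two
# components are independent Poisson variables with the summed rates.
toy_value = pairwise_probability(
    3, 2,
    lambda x: poisson_pmf(x, 0.7),
    lambda x: poisson_pmf(x, 1.1),
    lambda c1, c2: poisson_pmf(c1, 0.4) * poisson_pmf(c2, 0.9),
)
toy_expected = poisson_pmf(3, 0.7 + 0.4) * poisson_pmf(2, 1.1 + 0.9)
```

The cost of one evaluation is of order $(y^{(k)}_i + 1)(y^{(\ell)}_j + 1)$ calls to the joint probability mass function, which is why the closed-form expression for that function matters in practice.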


The univariate marginal probability mass functions are known, hence we only need to compute the bivariate probability mass function for terms of the form

\[
\Pr(L^{(k)}(A) = c_1, L^{(\ell)}(A) = c_2),
\]

when $k \neq \ell$; otherwise we get the corresponding univariate probability mass function.

Let $k \neq \ell$. Then

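\[
\begin{aligned}
\Pr(X_k = x_k, X_\ell = x_\ell) = \int_{(0,\infty)^{n + n(n-1)/2}} &\Pr(X_k = x_k, X_\ell = x_\ell \mid U_{ii'} = u_{ii'},\, i \in \{1, \dots, n\},\, i' \in \{i+1, \dots, n\},\, V_1 = v_1, \dots, V_n = v_n) \\
&\cdot \prod_{i=1}^{n} \prod_{i'=i+1}^{n} f_{U_{ii'}}(u_{ii'})\, du_{ii'} \prod_{\iota=1}^{n} f_{V_\iota}(v_\iota)\, dv_\iota.
\end{aligned}
\]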

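Note that since $X_k \mid Z_k \sim \mathrm{POI}(Z_k)$, we have

\[
\begin{aligned}
&\Pr(X_k = x_k, X_\ell = x_\ell \mid U_{ii'} = u_{ii'},\, i \in \{1, \dots, n\},\, i' \in \{i+1, \dots, n\},\, V_1 = v_1, \dots, V_n = v_n) \\
&= \exp\Big\{-\Big(\alpha_k \sum_{p=1,\, p \neq k}^{n} u_{(k \wedge p)(k \vee p)} + v_k\Big)\Big\} \frac{1}{x_k!} \Big(\alpha_k \sum_{p=1,\, p \neq k}^{n} u_{(k \wedge p)(k \vee p)} + v_k\Big)^{x_k} \\
&\quad \cdot \exp\Big\{-\Big(\alpha_\ell \sum_{p'=1,\, p' \neq \ell}^{n} u_{(\ell \wedge p')(\ell \vee p')} + v_\ell\Big)\Big\} \frac{1}{x_\ell!} \Big(\alpha_\ell \sum_{p'=1,\, p' \neq \ell}^{n} u_{(\ell \wedge p')(\ell \vee p')} + v_\ell\Big)^{x_\ell}.
\end{aligned}
\]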

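Using the binomial theorem, we can write

\[
\begin{aligned}
\Big(\alpha_k \sum_{p=1,\, p \neq k}^{n} u_{(k \wedge p)(k \vee p)} + v_k\Big)^{x_k}
&= \sum_{w_1=0}^{x_k} \binom{x_k}{w_1} \alpha_k^{w_1} \Big(\sum_{p=1,\, p \neq k}^{n} u_{(k \wedge p)(k \vee p)}\Big)^{w_1} v_k^{x_k - w_1}, \\
\Big(\alpha_\ell \sum_{p'=1,\, p' \neq \ell}^{n} u_{(\ell \wedge p')(\ell \vee p')} + v_\ell\Big)^{x_\ell}
&= \sum_{w_1'=0}^{x_\ell} \binom{x_\ell}{w_1'} \alpha_\ell^{w_1'} \Big(\sum_{p'=1,\, p' \neq \ell}^{n} u_{(\ell \wedge p')(\ell \vee p')}\Big)^{w_1'} v_\ell^{x_\ell - w_1'}.
\end{aligned}
\]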

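Also

\[
\begin{aligned}
\Big(\sum_{p=1,\, p \neq k}^{n} u_{(k \wedge p)(k \vee p)}\Big)^{w_1}
&= \sum_{w_2=0}^{w_1} \binom{w_1}{w_2} u_{(k \wedge \ell)(k \vee \ell)}^{w_2} \Big(\sum_{p=1,\, p \notin \{k,\ell\}}^{n} u_{(k \wedge p)(k \vee p)}\Big)^{w_1 - w_2}, \\
\Big(\sum_{p'=1,\, p' \neq \ell}^{n} u_{(\ell \wedge p')(\ell \vee p')}\Big)^{w_1'}
&= \sum_{w_2'=0}^{w_1'} \binom{w_1'}{w_2'} u_{(k \wedge \ell)(k \vee \ell)}^{w_2'} \Big(\sum_{p'=1,\, p' \notin \{k,\ell\}}^{n} u_{(\ell \wedge p')(\ell \vee p')}\Big)^{w_1' - w_2'}.
\end{aligned}
\]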
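The two binomial expansions can be checked numerically on a single power. The following short Python sketch does this for one component; the function name and the arguments `u_shared` (standing for $u_{(k \wedge \ell)(k \vee \ell)}$) and `u_rest` (standing for the sum over $p \notin \{k, \ell\}$) are illustrative assumptions.

```python
from math import comb


def two_step_expansion(alpha, u_shared, u_rest, v, x):
    """Expand (alpha * (u_shared + u_rest) + v) ** x in two binomial steps."""
    total = 0.0
    for w1 in range(x + 1):
        # Inner expansion of (u_shared + u_rest) ** w1.
        inner = sum(comb(w1, w2) * u_shared ** w2 * u_rest ** (w1 - w2)
                    for w2 in range(w1 + 1))
        # Outer expansion in powers of alpha * (...) and v.
        total += comb(x, w1) * alpha ** w1 * inner * v ** (x - w1)
    return total


# Direct evaluation of the same power for comparison.
direct_value = (0.7 * (1.2 + 0.5) + 0.3) ** 4
expanded_value = two_step_expansion(0.7, 1.2, 0.5, 0.3, 4)
```

Separating the shared variable from the remaining ones is what later allows the integral to factorize into a product of one-dimensional gamma expectations.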

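Hence we have

\[
\begin{aligned}
&\Pr(X_k = x_k, X_\ell = x_\ell) \\
&= \frac{1}{x_k!\, x_\ell!} \sum_{w_1=0}^{x_k} \binom{x_k}{w_1} \alpha_k^{w_1} \sum_{w_1'=0}^{x_\ell} \binom{x_\ell}{w_1'} \alpha_\ell^{w_1'} \sum_{w_2=0}^{w_1} \binom{w_1}{w_2} \sum_{w_2'=0}^{w_1'} \binom{w_1'}{w_2'} \\
&\quad \cdot \int_{(0,\infty)} u_{(k \wedge \ell)(k \vee \ell)}^{w_2 + w_2'} \exp\{-(\alpha_k + \alpha_\ell) u_{(k \wedge \ell)(k \vee \ell)}\} f_{U_{(k \wedge \ell)(k \vee \ell)}}\{u_{(k \wedge \ell)(k \vee \ell)}\}\, du_{(k \wedge \ell)(k \vee \ell)} \\
&\quad \cdot \int_{(0,\infty)^{n-2}} \Big(\sum_{p=1,\, p \notin \{k,\ell\}}^{n} u_{(k \wedge p)(k \vee p)}\Big)^{w_1 - w_2} \exp\Big(-\alpha_k \sum_{p=1,\, p \notin \{k,\ell\}}^{n} u_{(k \wedge p)(k \vee p)}\Big) \prod_{p=1,\, p \notin \{k,\ell\}}^{n} f_{U_{(k \wedge p)(k \vee p)}}\{u_{(k \wedge p)(k \vee p)}\}\, du_{(k \wedge p)(k \vee p)} \\
&\quad \cdot \int_{(0,\infty)^{n-2}} \Big(\sum_{p'=1,\, p' \notin \{k,\ell\}}^{n} u_{(\ell \wedge p')(\ell \vee p')}\Big)^{w_1' - w_2'} \exp\Big(-\alpha_\ell \sum_{p'=1,\, p' \notin \{k,\ell\}}^{n} u_{(\ell \wedge p')(\ell \vee p')}\Big) \prod_{p'=1,\, p' \notin \{k,\ell\}}^{n} f_{U_{(\ell \wedge p')(\ell \vee p')}}\{u_{(\ell \wedge p')(\ell \vee p')}\}\, du_{(\ell \wedge p')(\ell \vee p')} \\
&\quad \cdot \int_{(0,\infty)} v_k^{x_k - w_1} \exp(-v_k) f_{V_k}(v_k)\, dv_k \int_{(0,\infty)} v_\ell^{x_\ell - w_1'} \exp(-v_\ell) f_{V_\ell}(v_\ell)\, dv_\ell \\
&= \frac{1}{x_k!\, x_\ell!} \sum_{w_1=0}^{x_k} \binom{x_k}{w_1} \alpha_k^{w_1} \sum_{w_1'=0}^{x_\ell} \binom{x_\ell}{w_1'} \alpha_\ell^{w_1'} \sum_{w_2=0}^{w_1} \binom{w_1}{w_2} \sum_{w_2'=0}^{w_1'} \binom{w_1'}{w_2'}\, \Gamma_1 \Gamma_2 \Gamma_3 \Gamma_4 \Gamma_5,
\end{aligned}
\]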

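where

\[
\begin{aligned}
\Gamma_1 &= \int_{(0,\infty)} u_{(k \wedge \ell)(k \vee \ell)}^{w_2 + w_2'} \exp\{-(\alpha_k + \alpha_\ell) u_{(k \wedge \ell)(k \vee \ell)}\} f_{U_{(k \wedge \ell)(k \vee \ell)}}(u_{(k \wedge \ell)(k \vee \ell)})\, du_{(k \wedge \ell)(k \vee \ell)} \\
&= \mathrm{E}\{U^{w_2 + w_2'} e^{-(\alpha_k + \alpha_\ell) U}\} \quad \text{for some } U \sim \Gamma(\kappa_{(k \wedge \ell)(k \vee \ell)}, 1) \\
&= \frac{\Gamma\{w_2 + w_2' + \kappa_{(k \wedge \ell)(k \vee \ell)}\}}{\Gamma\{\kappa_{(k \wedge \ell)(k \vee \ell)}\} (\alpha_k + \alpha_\ell + 1)^{w_2 + w_2' + \kappa_{(k \wedge \ell)(k \vee \ell)}}},
\end{aligned}
\]

\[
\begin{aligned}
\Gamma_2 &= \int_{(0,\infty)^{n-2}} \Big(\sum_{p=1,\, p \notin \{k,\ell\}}^{n} u_{(k \wedge p)(k \vee p)}\Big)^{w_1 - w_2} \exp\Big(-\alpha_k \sum_{p=1,\, p \notin \{k,\ell\}}^{n} u_{(k \wedge p)(k \vee p)}\Big) \prod_{p=1,\, p \notin \{k,\ell\}}^{n} f_{U_{(k \wedge p)(k \vee p)}}\{u_{(k \wedge p)(k \vee p)}\}\, du_{(k \wedge p)(k \vee p)} \\
&= \mathrm{E}(U^{w_1 - w_2} e^{-\alpha_k U}) \quad \text{for some } U \sim \Gamma\Big(\sum_{p=1,\, p \notin \{k,\ell\}}^{n} \kappa_{(k \wedge p)(k \vee p)}, 1\Big) \\
&= \frac{\Gamma\{w_1 - w_2 + \sum_{p=1,\, p \notin \{k,\ell\}}^{n} \kappa_{(k \wedge p)(k \vee p)}\}}{\Gamma\{\sum_{p=1,\, p \notin \{k,\ell\}}^{n} \kappa_{(k \wedge p)(k \vee p)}\} (\alpha_k + 1)^{w_1 - w_2 + \sum_{p=1,\, p \notin \{k,\ell\}}^{n} \kappa_{(k \wedge p)(k \vee p)}}},
\end{aligned}
\]

\[
\begin{aligned}
\Gamma_3 &= \int_{(0,\infty)^{n-2}} \Big(\sum_{p'=1,\, p' \notin \{k,\ell\}}^{n} u_{(\ell \wedge p')(\ell \vee p')}\Big)^{w_1' - w_2'} \exp\Big(-\alpha_\ell \sum_{p'=1,\, p' \notin \{k,\ell\}}^{n} u_{(\ell \wedge p')(\ell \vee p')}\Big) \prod_{p'=1,\, p' \notin \{k,\ell\}}^{n} f_{U_{(\ell \wedge p')(\ell \vee p')}}\{u_{(\ell \wedge p')(\ell \vee p')}\}\, du_{(\ell \wedge p')(\ell \vee p')} \\
&= \mathrm{E}(U^{w_1' - w_2'} e^{-\alpha_\ell U}) \quad \text{for some } U \sim \Gamma\Big(\sum_{p'=1,\, p' \notin \{k,\ell\}}^{n} \kappa_{(\ell \wedge p')(\ell \vee p')}, 1\Big) \\
&= \frac{\Gamma\{w_1' - w_2' + \sum_{p'=1,\, p' \notin \{k,\ell\}}^{n} \kappa_{(\ell \wedge p')(\ell \vee p')}\}}{\Gamma\{\sum_{p'=1,\, p' \notin \{k,\ell\}}^{n} \kappa_{(\ell \wedge p')(\ell \vee p')}\} (\alpha_\ell + 1)^{w_1' - w_2' + \sum_{p'=1,\, p' \notin \{k,\ell\}}^{n} \kappa_{(\ell \wedge p')(\ell \vee p')}}},
\end{aligned}
\]

\[
\begin{aligned}
\Gamma_4 &= \int_{(0,\infty)} v_k^{x_k - w_1} \exp(-v_k) f_{V_k}(v_k)\, dv_k \\
&= \mathrm{E}(V^{x_k - w_1} e^{-V}) \quad \text{for some } V \sim \Gamma(\kappa_k, 1/\alpha_k) \\
&= \frac{\Gamma(x_k - w_1 + \kappa_k)\, \alpha_k^{x_k - w_1}}{\Gamma(\kappa_k) (\alpha_k + 1)^{x_k - w_1 + \kappa_k}},
\end{aligned}
\]

\[
\begin{aligned}
\Gamma_5 &= \int_{(0,\infty)} v_\ell^{x_\ell - w_1'} \exp(-v_\ell) f_{V_\ell}(v_\ell)\, dv_\ell \\
&= \mathrm{E}(V^{x_\ell - w_1'} e^{-V}) \quad \text{for some } V \sim \Gamma(\kappa_\ell, 1/\alpha_\ell) \\
&= \frac{\Gamma(x_\ell - w_1' + \kappa_\ell)\, \alpha_\ell^{x_\ell - w_1'}}{\Gamma(\kappa_\ell) (\alpha_\ell + 1)^{x_\ell - w_1' + \kappa_\ell}}.
\end{aligned}
\]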
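All five factors are instances of one identity: for $G \sim \Gamma(\kappa, r)$ in the shape and rate parametrization, $\mathrm{E}(G^{a} e^{-cG}) = \Gamma(a + \kappa)\, r^{\kappa} / [\Gamma(\kappa) (r + c)^{a + \kappa}]$. The Python sketch below uses it to evaluate the four-fold sum. It is a minimal illustration under stated assumptions: the function and argument names are hypothetical, `kappa_k_rest` and `kappa_l_rest` stand for the sums of $\kappa_{(k \wedge p)(k \vee p)}$ and $\kappa_{(\ell \wedge p')(\ell \vee p')}$ over indices outside $\{k, \ell\}$ and are assumed strictly positive (so $n \geq 3$), and the factor $1/(x_k!\, x_\ell!)$ from the conditional Poisson probabilities is carried along.

```python
import math


def gamma_expectation(a, shape, rate, c):
    """E[G**a * exp(-c * G)] for G ~ Gamma(shape, rate), in closed form."""
    return math.exp(math.lgamma(a + shape) - math.lgamma(shape)
                    + shape * math.log(rate)
                    - (a + shape) * math.log(rate + c))


def bivariate_pmf(x_k, x_l, alpha_k, alpha_l,
                  kappa_kl, kappa_k_rest, kappa_l_rest, kappa_k, kappa_l):
    """Pr(X_k = x_k, X_l = x_l) via the four-fold sum over w1, w1', w2, w2'."""
    total = 0.0
    for w1 in range(x_k + 1):
        g4 = gamma_expectation(x_k - w1, kappa_k, 1.0 / alpha_k, 1.0)
        for w1p in range(x_l + 1):
            g5 = gamma_expectation(x_l - w1p, kappa_l, 1.0 / alpha_l, 1.0)
            outer = (math.comb(x_k, w1) * alpha_k ** w1
                     * math.comb(x_l, w1p) * alpha_l ** w1p)
            for w2 in range(w1 + 1):
                g2 = gamma_expectation(w1 - w2, kappa_k_rest, 1.0, alpha_k)
                for w2p in range(w1p + 1):
                    g1 = gamma_expectation(w2 + w2p, kappa_kl, 1.0,
                                           alpha_k + alpha_l)
                    g3 = gamma_expectation(w1p - w2p, kappa_l_rest, 1.0,
                                           alpha_l)
                    total += (outer * math.comb(w1, w2) * math.comb(w1p, w2p)
                              * g1 * g2 * g3 * g4 * g5)
    return total / (math.factorial(x_k) * math.factorial(x_l))


def negative_binomial_pmf(x, kappa, alpha):
    """Marginal law of a Poisson mixture with Gamma(kappa, rate 1/alpha) mixing."""
    return math.exp(math.lgamma(x + kappa) - math.lgamma(kappa)
                    - math.lgamma(x + 1) + x * math.log(alpha)
                    - (x + kappa) * math.log(1.0 + alpha))
```

A useful sanity check is that summing `bivariate_pmf` over one argument recovers the negative binomial marginal with shape `kappa_kl + kappa_k_rest + kappa_k` and parameter `alpha_k`. All summands are positive, so the sum is numerically stable, but its cost grows like $x_k^2 x_\ell^2$.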

Finding the relation between $L^{(k)}(A)$, $L^{(\ell)}(A)$ and $X_k$, $X_\ell$: As in the common factor model, we need to find the relation between $L^{(k)}(A)$, $L^{(\ell)}(A)$ and $X_k$, $X_\ell$. We know that

\[
\begin{aligned}
\mathrm{E}[\exp\{i(\theta_k, \theta_\ell)(L^{(k)}(A), L^{(\ell)}(A))^{\top}\}] &= \exp\{\mathrm{Leb}(A)\, C_{(L^{(k)\prime}, L^{(\ell)\prime})}(\theta_k, \theta_\ell)\} \\
&= \exp\{\mathrm{Leb}(A)\, C_{(X_k, X_\ell)}(\theta_k, \theta_\ell)\}.
\end{aligned}
\]

Looking at the characteristic function, we observe the following: when comparing $(L^{(k)}(A), L^{(\ell)}(A))$ and $(X_k, X_\ell)$, the joint distribution of the former can be obtained by multiplying the intensity of the compound Poisson process underlying the latter by $\mathrm{Leb}(A)$.

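We have

\[
\Pr(L^{(k)}(A) = x_k, L^{(\ell)}(A) = x_\ell)
= \frac{\mathrm{Leb}(A)^{x_k + x_\ell}}{x_k!\, x_\ell!} \sum_{w_1=0}^{x_k} \binom{x_k}{w_1} \alpha_k^{w_1} \sum_{w_1'=0}^{x_\ell} \binom{x_\ell}{w_1'} \alpha_\ell^{w_1'} \sum_{w_2=0}^{w_1} \binom{w_1}{w_2} \sum_{w_2'=0}^{w_1'} \binom{w_1'}{w_2'}\, \Gamma_1 \Gamma_2 \Gamma_3 \Gamma_4 \Gamma_5,
\]

where

\[
\begin{aligned}
\Gamma_1 &= \frac{\Gamma\{w_2 + w_2' + \kappa_{(k \wedge \ell)(k \vee \ell)}\}}{\Gamma\{\kappa_{(k \wedge \ell)(k \vee \ell)}\} \{\mathrm{Leb}(A)(\alpha_k + \alpha_\ell) + 1\}^{w_2 + w_2' + \kappa_{(k \wedge \ell)(k \vee \ell)}}}, \\
\Gamma_2 &= \frac{\Gamma\{w_1 - w_2 + \sum_{p=1,\, p \notin \{k,\ell\}}^{n} \kappa_{(k \wedge p)(k \vee p)}\}}{\Gamma\{\sum_{p=1,\, p \notin \{k,\ell\}}^{n} \kappa_{(k \wedge p)(k \vee p)}\} \{\mathrm{Leb}(A) \alpha_k + 1\}^{w_1 - w_2 + \sum_{p=1,\, p \notin \{k,\ell\}}^{n} \kappa_{(k \wedge p)(k \vee p)}}}, \\
\Gamma_3 &= \frac{\Gamma\{w_1' - w_2' + \sum_{p'=1,\, p' \notin \{k,\ell\}}^{n} \kappa_{(\ell \wedge p')(\ell \vee p')}\}}{\Gamma\{\sum_{p'=1,\, p' \notin \{k,\ell\}}^{n} \kappa_{(\ell \wedge p')(\ell \vee p')}\} \{\mathrm{Leb}(A) \alpha_\ell + 1\}^{w_1' - w_2' + \sum_{p'=1,\, p' \notin \{k,\ell\}}^{n} \kappa_{(\ell \wedge p')(\ell \vee p')}}}, \\
\Gamma_4 &= \frac{\Gamma(x_k - w_1 + \kappa_k)\, \alpha_k^{x_k - w_1}}{\Gamma(\kappa_k) \{\mathrm{Leb}(A) \alpha_k + 1\}^{x_k - w_1 + \kappa_k}}, \\
\Gamma_5 &= \frac{\Gamma(x_\ell - w_1' + \kappa_\ell)\, \alpha_\ell^{x_\ell - w_1'}}{\Gamma(\kappa_\ell) \{\mathrm{Leb}(A) \alpha_\ell + 1\}^{x_\ell - w_1' + \kappa_\ell}}.
\end{aligned}
\]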

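6.2.2. The final pairwise likelihood function

When we combine all the above results, we end up with the following pairwise likelihood in the pairwise interaction model:

\[
\begin{aligned}
l(\theta, y) ={}& \sum_{k=1}^{n} \sum_{i=0}^{\lfloor t/\Delta_m \rfloor - 1} \sum_{h=1}^{\lfloor t/\Delta_m \rfloor - i} \log\{\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(k)}_{(i+h)\Delta_m} = y^{(k)}_{i+h})\} \\
&+ \sum_{k=1}^{n} \sum_{i=0}^{\lfloor t/\Delta_m \rfloor} \log\{\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i)\} \\
&+ \sum_{k=1}^{n-1} \sum_{\ell=k+1}^{n} \sum_{i=0}^{\lfloor t/\Delta_m \rfloor} \log\{\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(\ell)}_{i\Delta_m} = y^{(\ell)}_i)\},
\end{aligned}
\]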

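where

\[
\begin{aligned}
&\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(k)}_{(i+h)\Delta_m} = y^{(k)}_{i+h}) \\
&= \sum_{c=0}^{\min(y^{(k)}_i,\, y^{(k)}_{i+h})} \Pr\{L^{(k)}(B^{(k,k)}_{i \setminus (i+h)}) = y^{(k)}_i - c\} \Pr\{L^{(k)}(B^{(k,k)}_{(i+h) \setminus i}) = y^{(k)}_{i+h} - c\} \Pr\{L^{(k)}(B^{(k,k)}_{i,(i+h)}) = c\}.
\end{aligned}
\]

Since $L^{(k)}(B^{(k,k)}_{i \setminus (i+h)})$ and $L^{(k)}(B^{(k,k)}_{(i+h) \setminus i})$ have the same law, we can write

\[
\begin{aligned}
&\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(k)}_{(i+h)\Delta_m} = y^{(k)}_{i+h}) \\
&= \sum_{c=0}^{\min(y^{(k)}_i,\, y^{(k)}_{i+h})} \Pr\{L^{(k)}(B^{(k,k)}_{i \setminus (i+h)}) = y^{(k)}_i - c\} \Pr\{L^{(k)}(B^{(k,k)}_{i \setminus (i+h)}) = y^{(k)}_{i+h} - c\} \Pr\{L^{(k)}(B^{(k,k)}_{i,(i+h)}) = c\}.
\end{aligned}
\]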

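If the trawl parameters have already been inferred (e.g., via a previous (generalized) method of moments approach), then the pairwise likelihood for determining the parameters of the marginal multivariate negative binomial distribution simplifies to

\[
l(\theta, y) = \sum_{k=1}^{n} \sum_{i=0}^{\lfloor t/\Delta_m \rfloor} \log\{\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i)\}
+ \sum_{k=1}^{n-1} \sum_{\ell=k+1}^{n} \sum_{i=0}^{\lfloor t/\Delta_m \rfloor} \log\{\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(\ell)}_{i\Delta_m} = y^{(\ell)}_i)\}.
\]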

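We note that

\[
\begin{aligned}
\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(\ell)}_{i\Delta_m} = y^{(\ell)}_i)
= \sum_{c_1=0}^{y^{(k)}_i} \sum_{c_2=0}^{y^{(\ell)}_i} &\Pr\{L^{(k)}(A^{(k)} \setminus A^{(\ell)}) = y^{(k)}_i - c_1\} \Pr\{L^{(\ell)}(A^{(\ell)} \setminus A^{(k)}) = y^{(\ell)}_i - c_2\} \\
&\cdot \Pr\{L^{(k)}(A^{(k)} \cap A^{(\ell)}) = c_1, L^{(\ell)}(A^{(k)} \cap A^{(\ell)}) = c_2\}.
\end{aligned}
\]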

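6.3. Likelihood computations in the univariate case

For convenience, we also provide the formula for the pairwise likelihood if one wants to estimate the parameters of the $k$th component of the multivariate time series for a $k \in \{1, \dots, n\}$:

\[
l(\theta, y) = \sum_{i=0}^{\lfloor t/\Delta_m \rfloor - 1} \sum_{h=1}^{\lfloor t/\Delta_m \rfloor - i} \log\{\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(k)}_{(i+h)\Delta_m} = y^{(k)}_{i+h})\}
+ \sum_{i=0}^{\lfloor t/\Delta_m \rfloor} \log\{\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i)\},
\]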

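where

\[
\begin{aligned}
&\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(k)}_{(i+h)\Delta_m} = y^{(k)}_{i+h}) \\
&= \sum_{c=0}^{\min(y^{(k)}_i,\, y^{(k)}_{i+h})} \Pr\{L^{(k)}(B^{(k,k)}_{i \setminus (i+h)}) = y^{(k)}_i - c\} \Pr\{L^{(k)}(B^{(k,k)}_{(i+h) \setminus i}) = y^{(k)}_{i+h} - c\} \Pr\{L^{(k)}(B^{(k,k)}_{i,(i+h)}) = c\}.
\end{aligned}
\]

Since $L^{(k)}(B^{(k,k)}_{i \setminus (i+h)})$ and $L^{(k)}(B^{(k,k)}_{(i+h) \setminus i})$ have the same law, we can write

\[
\begin{aligned}
&\Pr(Y^{(k)}_{i\Delta_m} = y^{(k)}_i, Y^{(k)}_{(i+h)\Delta_m} = y^{(k)}_{i+h}) \\
&= \sum_{c=0}^{\min(y^{(k)}_i,\, y^{(k)}_{i+h})} \Pr\{L^{(k)}(B^{(k,k)}_{i \setminus (i+h)}) = y^{(k)}_i - c\} \Pr\{L^{(k)}(B^{(k,k)}_{i \setminus (i+h)}) = y^{(k)}_{i+h} - c\} \Pr\{L^{(k)}(B^{(k,k)}_{i,(i+h)}) = c\}.
\end{aligned}
\]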
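To make the univariate formula concrete, the following Python sketch assembles the pairwise log-likelihood in a deliberately simple special case. Everything specific here is an assumption made for illustration and is not taken from this section: a Poisson Lévy seed with intensity `nu`, so that $L(B) \sim \mathrm{POI}(\nu\, \mathrm{Leb}(B))$, and an exponential trawl with parameter `lam`, for which $\mathrm{Leb}(A) = 1/\lambda$ and the overlap of two trawls at lag $h\Delta_m$ has measure $e^{-\lambda h \Delta_m}/\lambda$. The function names are hypothetical.

```python
import math


def poisson_pmf(x, rate):
    """Poisson probability mass function, with the rate = 0 case handled."""
    if x < 0:
        return 0.0
    if rate == 0.0:
        return 1.0 if x == 0 else 0.0
    return math.exp(-rate + x * math.log(rate) - math.lgamma(x + 1))


def temporal_pair_probability(y_i, y_ih, rate_diff, rate_overlap):
    """Single sum over c: counts on the two set differences and the overlap."""
    return sum(poisson_pmf(y_i - c, rate_diff)
               * poisson_pmf(y_ih - c, rate_diff)
               * poisson_pmf(c, rate_overlap)
               for c in range(min(y_i, y_ih) + 1))


def univariate_pairwise_loglik(y, nu, lam, delta):
    """Pairwise log-likelihood for one component (Poisson, exponential trawl)."""
    leb_a = 1.0 / lam          # Leb(A) for the assumed exponential trawl
    last = len(y) - 1          # plays the role of floor(t / Delta_m)
    loglik = 0.0
    # Temporal pairs: i = 0, ..., last - 1 and h = 1, ..., last - i.
    for i in range(last):
        for h in range(1, last - i + 1):
            leb_overlap = math.exp(-lam * h * delta) / lam
            p = temporal_pair_probability(
                y[i], y[i + h],
                nu * (leb_a - leb_overlap), nu * leb_overlap)
            loglik += math.log(p)
    # Marginal terms.
    for y_i in y:
        loglik += math.log(poisson_pmf(y_i, nu * leb_a))
    return loglik


example_loglik = univariate_pairwise_loglik([1, 0, 2, 3, 1],
                                            nu=2.0, lam=0.8, delta=0.5)
```

In an estimation routine one would pass the negative of this function to a numerical optimizer over `(nu, lam)`. For the negative binomial case treated in the article, only `poisson_pmf` would need to be replaced by the appropriate probability mass functions of $L^{(k)}$ on the set differences and on the overlap.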

References

[1] O. E. Barndorff-Nielsen, F. E. Benth, A. E. D. Veraart, Ambit processes and stochastic partial differential equations, in:G. Di Nunno, B. Øksendal (Eds.), Advanced Mathematical Methods for Finance, Springer Berlin Heidelberg, Berlin, Heidel-berg, 2011, pp. 35–74.

[2] O. E. Barndorff-Nielsen, P. Blæsild, V. Seshadri, Multivariate distributions with generalized inverse Gaussian marginals, andassociated Poisson mixtures, The Canadian Journal of Statistics. La Revue Canadienne de Statistique 20 (1992) 109–120.

[3] O. E. Barndorff-Nielsen, A. Lunde, N. Shephard, A. E. Veraart, Integer-valued trawl processes: A class of stationary infinitelydivisible processes, Scandinavian Journal of Statistics 41 (2014) 693–724.

[4] D. Karlis, Multivariate Poisson models, 2002. Presentation slides, Limburg, October 2002.[5] D. Karlis, L. Meligkotsidou, Multivariate Poisson regression with covariance structure, Statistics and Computing 15 (2005) 255–

265.[6] C. D. Kemp, S. Loukas, The computer generation of bivariate discrete random variables, Journal of the Royal Statistical Society.

Series A (General) 141 (1978) 513–519.[7] S. Kocherlakota, K. Kocherlakota, The bivariate logarithmic series distribution, Communications in Statistics.Theory and Meth-

ods 19 (1990) 3387–3432.[8] A. Ruhe, Fitting empirical data by positive sums of exponentials, SIAM Journal on Scientific and Statistical Computing 1 (1980)

481–498.
