Transcript
Page 1: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

MODELING COVARIATE-CONTINGENT CORRELATION

AND TAIL-DEPENDENCE WITH COPULAS

FENG LI

Abstract. Copulas provide an attractive approach for constructing multivariate densities with flexible

marginal distributions and different forms of dependence. Of particular importance in many areas

is the possibility of explicitly modeling tail-dependence. Most of the available approaches estimatetail-dependence and correlations via nuisance parameters, yielding results that are neither tractable

nor interpretable for practitioners. We propose a general Bayesian approach for directly modeling tail-dependence and correlations as explicit functions of covariates. Our method allows for variable selection

among the covariates in the marginal models and in the copula parameters. Posterior inference is carried

out using a novel and efficient MCMC simulation method.

Keywords: Covariate-dependent copula; Bayesian variable selection; tail-dependence; Kendall’s τ ;

MCMC.

1. Introduction

Copula modeling has been an active research area dating back to Sklar’s theorem (Sklar, 1959) inwhich he proves that a multivariate cumulative distribution function F (x1, ..., xM ) can be written interms of univariate marginal distributions and a copula function C(u1, ..., uM ), where ui = Fi(xi) is thei:th marginal CDF. Various properties and applications of copulas have thereafter been studied, see e.g.Nelsen (2006) for an introduction to copulas, Joe (1997) for dependence and extreme value distributionwith copulas, and Dorota (2010) for constructions of multivariate dependences using bivariate copulas.

Copula models have been widely used in financial applications due to its ability in modeling tail-dependence and correlations, see Patton (2012b) for a recent survey. Jaworski et al. (2010) reviews thestate of the art approaches in copula estimation, including pair-copula constructions (Czado, 2010). Animportant concept in copula modeling is the tail-dependence. In bivariate copulas, the tail-dependencedescribes the dependence of random variables in the tail: limu→0+ p(X1 < F−11 (u)|X2 < F−12 (u)) is calledthe lower tail-dependence and limu→1− p(X1 > F−11 (u)|X2 > F−12 (u)) is the upper tail-dependence forthe two random variables X1 and X2. Dobric and Schmid (2005) use nonparametric methods to esti-mate the lower tail-dependence in bivariate copulas with continuous marginals. Schmidt and Stadtmuller(2006) explore the tail-dependence estimators for the tail copula where the lower tail copula and uppertail copula for a bivariate copula C are defined as limt→∞ tC(x/t, y/t) and limt→∞(x+y−t+tC(x/t, y/t))respectively, if the limits exist. Another special case in the tail-dependence is the asymptotically indepen-dent: x1 and x2 are asymptotically independent if F (x1, x2) = limx1→∞ F (x1, x2) limx2→∞ F (x1, x2).We do not further explore this possibility but see Draisma et al. (2004) for hypothesis testing to de-tect the dependence of extreme events when they are asymptotically independent, and for the study ofasymptotic dependence case, see e.g. Ledford and Tawn (1997).

Among the vast and extensive literature in copula modeling, few articles have investigated the un-derlying causes of the dependence structures. This is partially because the computational complexity isstill a challenge in many situations, which explains the relatively simplistic models used for the copula.Moreover, most copula approaches for modeling rank correlation and tail-dependence for existing copu-las require firstly modeling intermediate parameters and obtaining the correlation and tail-dependencein the end. Thus the results are neither tractable nor interpretable for practitioners. In this paperwe present a general Bayesian approach for copula modeling with explanatory variables entering boththe tail-dependence and correlation parameters in the copula function. This construction allows us toexplore the drivers of the different forms of dependence between the variables. We propose an efficientMCMC simulation method for the posterior inference which also allows for variable selection in both themarginal models and the copula features. In this paper a feature stands for a characteristic in the copulafunction, e.g. Kendall’s τ and tail-dependence are two features of a copula.

School of Statistics and Mathematics, Central University of Finance and Economics, 100081 Beijing, China. Email:

[email protected].

1

arX

iv:1

401.

0100

v2 [

stat

.ME

] 4

Nov

201

6

Page 2: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

2 FENG LI

The outline of the paper is as follows. In Section 2 we introduce the Bayesian covariate-dependentcopula model and Section 2.2 introduces the reparametrized covariate-dependent copula model withsome new properties. We discuss the prior specification for the model and present the general form forcopula posterior in Section 3. Section 4 presents the details of the MCMC scheme. We demonstrate theadvantages of having covariates to model dependence parameters with simulated data in Section 5. InSection 6 we apply the model to the daily returns from the S&P 100 and S&P 600 stock market indices.Section 7 concludes the paper and discusses potential directions for further research. The appendix ofthe paper documents the necessary analytical computation used in the MCMC.

2. The covariate-dependent copula model

A copula is a multivariate distribution that separates the univariate marginals and the multivariatedependence structure. The correspondence between a multivariate distribution F (x1, ..., xM ) and acopula function C(u1, ..., uM ) can be expressed as

F (x1, ..., xM ) = F (F−11 (u1), ..., F−1M (uM )) = C(u1, ..., uM ) = C(F1(x1), ..., FM (xM ))

where the correspondence is one to one with continuous marginal distributions. Copulas provide a generalapproach to constructing more flexible multivariate densities. For example, the bivariate Gaussian copulaC(u1, u2|ρ) = Ψ(F−11 (u1), F−12 (u2)|ρ) with the correlation parameter ρ, is a relaxed Gaussian densityin the sense that F1(·) and F2(·) do not need to be normal, see e.g. Pitt et al. (2006). New classes ofmultivariate densities are possible to construct in terms of copulas, see Joe (1997) for details. A keyfeature in copula models is that the multivariate dependence structure does not depend on the marginaldensities. Thus multivariate modeling with copulas consists of two parts: i) separate modeling of eachmarginal distribution and, ii) modeling the multivariate dependence.

2.1. Dependence concepts. Modeling the multivariate dependence typically involves quantifying twoimportant quantities: correlation and tail-dependence. In copula models, the correlation between twovariables are usually measured with rank correlations like Kendall’s τ

τ = 4

∫ ∫F (x1, x2)dF (x1, x2)− 1 = 4

∫ ∫C(u1, u2)dC(u1, u2)− 1.

In this paper we focus on modeling Kendall’s τ , but our methodology also applies to other correlationslike Spearman’s ρ. The tail-dependence describes the concordance between extreme values of randomvariables X1 and X2. The lower tail-dependence λL and the upper tail-dependence λU can also beexpressed in terms bivariate copulas

λL = limu→0+

p(X1 < F−11 (u)|X2 < F−12 (u)) = limu→0+

C(u, u)

u,

λU = limu→1−

p(X1 > F−11 (u)|X2 > F−12 (u)) = limu→1−

1− C(u, u)

1− u.

In principle, correlations and tail-dependencies can attain all values in the domain [−1, 1], but not allcopulas can specify them in the whole interval. For example the Clayton copula and Gumbel copula(Joe, 1997) can only have positive correlations and tail-dependence. Gumbel exhibits strong uppertail-dependence and relatively weak lower tail-dependence.

2.2. The reparametrized copula. In this paper we focus on the bivariate copula modeling with thetwo-parameter Joe-Clayton copula due to limited paper length, which is commonly used in the literature.Our copula modeling approach is general and can be applied to modeling any copulas with any featureof interest. In Section 6 and our computer program, we have also extended our model class to Claytoncopula, Gumbel copula, and student t-copula (Section A.3). The popular vine copula construction(Aas et al., 2009; Czado et al., 2012) can also be used to extend our bivariate copula modeling tohigher dimensions with discrete and continuous margins but requires further extensive work and will bediscussed in a separate paper.

2.2.1. The Joe-Clayton copula density. The Joe-Clayton copula, also known as the BB7 copula, wasintroduced by Joe (1997) and is of the form

C(u, v|θ, δ) =η(η−1(u) + η−1(v))

=1−[1−

{(1− uθ

)−δ+(1− vθ

)−δ − 1}−1/δ]1/θ

Page 3: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

COVARIATE-DEPENDENT COPULA 3

where η(s) = 1 − [1 − (1 + s)−1/δ]1/θ, θ ≥ 1, δ > 0, u = 1 − u, v = 1 − v with lower tail-dependenceparameter λL = 2−1/δ and upper tail-dependence parameter λU = 2−21/θ. The copula density functionfor the Joe-Clayton copula is then

c(u, v|θ, δ) =∂2C(u, v, θ, δ)

∂u∂v= [T1(u)T1(v)]

−1−δT2(u)T2(v)L

−2(1+δ)/δ1

× (1− L−1/δ1 )1/θ−2[(1 + δ)θL

1/δ1 − θδ − 1

] (1)

where T1(s) = 1− (1− s)θ, T2(s) = (1− s)θ−1 and L1 = T1(v)−δ + T1(u)−δ − 1.The Joe-Clayton copula has been widely used in modeling tail-dependence. Patton (2006) uses a

symmetric version of Joe-Clayton copula to model time-varying dependence with its autoregressive termsfor daily return of the Deutsche mark with the U.S. dollar and the Japanese yen with U.S. dollar. Aaset al. (2009) and Czado et al. (2012) among others develop a flexible class of multivariate copulasallowing multivariate dependence via a vine structure based on bivariate copulas, including the Joe-Clayton copula. Bouye and Salmon (2009) apply the Joe-Clayton copula to a quantile regression thatallows both positive and negative slopes for the quantile curves. But none of these works have a modelthat can explain the driving forces behind the dependence structures.

2.2.2. Kendall’s τ . Kendall’s τ of the Joe-Clayton copula for the case 1 ≤ θ < 2 can be found in e.g,Smith and Khaled (2012). We now present the full expression for all θ ≥ 1.

τ(θ, δ) =

1− 2/[δ(2− θ)] + 4B (δ + 2, 2/θ − 1) /(θ2δ), 1 ≤ θ < 2;

1− [ψ(2 + δ)− ψ(1)− 1] /δ, θ = 2;

1− 2/[δ(2− θ)]− 4π/[θ2δ(2 + δ) sin(2π/θ)B (1 + δ + 2/θ, 2− 2/θ)

], θ > 2,

where B(·) is the beta function and ψ(·) is the digamma function. It is easy verified that Kendall’s τ iscontinuous for all θ ≥ 1.

If we employ the equations δ = − log 2/ log λL and θ = log 2/ log(2− λU ) to the preceding result, wecan rewrite the Kendall’s τ in terms of lower and upper tail-dependence as τ(λL, λU ), see Figure 1 for acontour plot of these relationships.

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

τ

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

λL

λU

Figure 1. Contour plot of the Kendall’s τ with respect to lower tail-dependence (λL)and upper tail-dependence (λU ) for the Joe-Clayton copula.

Page 4: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

4 FENG LI

The Joe-Clayton copula can only determine positive correlations. If the relationship between twovariables is negative, we just need to rotate the axes of the copula and the estimation procedure remainsthe same. For example, for copulas rotated by 90 degrees, u has to be set to 1− u; for 270 degrees let vbe 1 − v and for 180 degrees set u and v to 1 − u and 1 − v, respectively. See Durrleman et al. (2000)for the proof that after this transformation it is still a copula and for other possible transformations toextend current bivariate copula with desired properties. In the financial application in Section 6, thecorrelations are known to be positive, but modeling negative correlations requires no extra work also forother copulas than the Joe-Clayton copula.

2.2.3. Some properties. The Joe-Clayton copula has some unique attributes. The upper tail-dependenceand lower tail-dependence are not functionally dependent. The Clayton copula (Clayton, 1978) and theB5 copula (Joe, 1993) are special cases of the Joe-Clayton copula. All of them belong to a more generalclass of Archimedean copulas. Furthermore, we also find the following new properties.

(1) The inequality holds 0 ≤ λL ≤ 21/2−1/(2τ) when the lower tail-dependence is not extremely high.We say that λL and τ are variationally dependent. The proof is non-trivial, but we have verifiedthe inequality numerically in a very careful way. We discover the bound of the inequality throughthe limit of τ(λL, λU ) when λU → 0, see Figure 1 for an illustrative plot.

(2) When λL → 0 (i.e. δ → 0), we have

τ → 1− 2H(2/θ)− 2

2− θ= 1− 2H(2 log(2− λu)/ log 2)− 2

2− log 2/ log(2− λu)

and

∂τ

∂θ→ 2(1−H(2/θ))

(θ − 2)2− 4ψ1(2/θ + 1)

(θ − 2)θ2

where H(·) is the Harmonic number. A special case is when θ → 2 (i.e. λU → 2 −√

2 ≈ 0.59),we have τ → 2 − π2/6 ≈ 0.36 and ∂τ/∂θ → π2/12 − Zeta(3)/2 ≈ 0.22. where Zeta(·) is theRiemann zeta function.

(3) Furthermore, we also have derived the analytical gradients for the copula density with respectto Kendall’s τ and tail-dependence of Joe-Clayton copula in Appendix A.2, which will be usedto construct efficient proposal distributions for MCMC.

2.2.4. Reparametrization. The parameters in most copula functions do not directly represent the copulafeatures, see e.g. the tail-dependence parameters and Kendall’s τ in the Joe-Clayton copula. In thissection we describe how to reparameterize the copula function so that the parameters are the copulafeatures of interest.

To simplify the interpretation of the copula model, we parameterize it in terms of lower tail-dependencyparameter λL and Kendall’s τ ,

C(u, v|λL, τ) =1−[1−

{[1− ulog 2/log(2−τ−1(λL))

]log 2/log λL

+[1− vlog 2/log(2−τ−1(λL))

]log 2/log λL− 1

}log λL/ log 2]log(2−τ−1(λL))/ log 2

where τ−1(λL) is the inverse function of Kendall’s τ given λL, i.e. the upper tail-dependence λU .And the related reparametrized copula density is obtained by substituting δ = − log 2/ log λL and θ =log 2/ log(2− τ−1(λL)) from (1).

An attractive property of Kendall’s τ is that it is invariant with respect strictly monotonic transforms.Other types of correlation like Spearman’s rank correlation can be equally well modeled with our method.For measuring the dependence in trivariate distributions, one may consider using a three-dimensionalversion of correlations, see e.g. Garcıa et al. (2013). Correlations in higher dimensions are usuallyestimated pairwise.

Our parameterization has two main advantages. Firstly, it reduces the efforts for specifying the priorinformation in our Bayesian approach in Section 3. Secondly, and most importantly, this parameterizationmake it possible to directly link correlations and tail-dependence to covariates, see Section 2.3 for details.

Modeling upper tail-dependence is also important in financial applications. There are also alternativereparametrization schemes that allow modeling lower and upper tail dependence parameter directly asshown in in our real data application. In principle, there is no difference between using parameterization(λL, λU ) and (λL, τ) with Joe-Clayton copula. No matter what two of the three parameterizations areused, it easily derives the other one (see, e.g. Figure 5) due to the deterministic relationship among lower

Page 5: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

COVARIATE-DEPENDENT COPULA 5

tail-dependence, upper tail-dependence and Kendall’s τ in Figure 1. For other single-parameter copulaslike Clayton copula, Gumbel copula, A simple way to achieve the same effect is to rotate the copula for180 degrees with our parameterization.

2.3. Covariate-dependent copula parameters. Letting the Kendall’s τ and tail-dependence param-eters in copula modeling be fixed numbers is very restrictive. This is particularly true in financialtime-series applications where the tail-dependence has been shown to vary with time (Patton, 2006).We introduce a covariate-dependent copula model that allows the copula features to be linked to ob-served covariate information. A prominent example is covariate-dependent monotone dependence andtail-dependence for ith observation and jth margins in a copula:

τij = l−1τ (x′βτ ), λij = l−1λ (x′βλ)

where λ without subscripts represents the dependence parameter in the lower and/or upper tail, τ isKendall’s τ , and x is the set of covariates used in the two margins. Furthermore lτ (·) and lλ(·) aresuitable link functions that connect λ and τ with x.

Generally in a p-dimensional copula, like student t-copula, our covariate-dependent structure can beconveniently represented in a matrix form as

vec(T ) = l−1([I ⊗X]vecB)

where T is n× [p(p− 1)/2] dependence parameter matrix, I is p(p− 1)/2 dimensional identity matrix,X is the set of covariates used in all marginal models and B is the corresponding coefficient matrix.Other copula parameters can be linked to covariates in the same way. This representation allows us tojointly estimate the correlation and tail-dependence. Note that there will be a curse of dimensionalityin higher dimensional modeling with covariate-dependent structures. Currently limited tests for smallerdimensions less than 10 were tried with t-copula. we notice that the computation time dramaticallyincreases as the dimension grows.

Patton (2006) allows ARMA-like variation in the dependence parameter. Our approach makes itpossible to use all marginal information to model the dependence parameters. This approach not onlyleads to more interesting interpretations of the features, but generates more accurate forecasts and allowsfor heteroscedasticity in the dependence parameters in comparison to multivariate stochastic volatilitymodel and multivariate GARCH model, see e.g. Yu and Meyer (2006)) and the references therein, seealso Section 5.3 and 6.2 for simulation and real data studies. We also use variable selection to selectmeaningful covariates that influence the dependence. Furthermore, variable selection also reduces themodel complexity and prevents overfitting, see Section 4 for details.

2.4. Marginal models. In this paper we use margins as synonym of marginal models. In principle, thecopula approach can be used with any margins, but we will assume the margins to be split-t distributions(Li et al., 2010) in our simulation studies in Section 5 and application in Section 6 . The split-t is aflexible four-parameter distribution with the student’s t, the asymmetric normal and symmetric normaldistributions as its special cases; see Li et al. (2010) for some properties of the split-t distribution.

Following Li et al. (2010) we allow the location parameter µ, the scale parameter φ, the degrees offreedom ν and the skewness parameter κ in the split-t density in the margins to be linked to covariatesas

µij = x′ijβµj νij = exp(x′ijβνj )φij = exp(x′ijβφj ) κij = exp(x′ijβλj ), for i = 1, ..., n,

where xij is the covariate vector for the i:th observation in the j :th margin.One may also consider using mixture models in the margins. Li et al. (2010) show in an application to

S&P 500 data that the one-component split-t model with all parameters linked to covariates does well incomparison with mixtures of split-t components. In this paper, we will therefore use the one-componentsplit-t model for demonstration purposes.

Our inference procedure can be generally applied to different margins. We also implemented GARCHmodel (Bollerslev, 1986)and stochastic volatility (SV) model (Melino and Turnbull, 1990) as margins forcomparison purpose in Section 6.2. GARCH model and stochastic volatility models are popularly usedin modeling volatility. The major difference between the two approaches is the volatility is determin-istic in GARCH model and in stochastic volatility model, volatility are modeled via a latent process.Furthermore, our Joe-Clayton copula with split-t margin can explicitly model skewness, volatility andtail-dependence with linked covariates. A related approach is the time varying copula model in e.g.,Ausin and Lopes (2010) and Manner and Reznikova (2012).

Page 6: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

6 FENG LI

3. The prior and posterior

We use the same technique to specify the priors for the marginal parameters and the prior in thecopula features. We omit the subscript that indicates the functionality of the parameter in this sectionfor convenience. We will first assume that the model parameters are independent a priori, and then turnto a more general situation with dependent model parameters in Section 3.3.

Let Ij be the variable selection indicator for a given covariate xj

Ij =

{1 if βj 6= 0

0 if βj = 0,

where βj is the jth covariate in the model. More informally, this can be expressed as

Ij =

{1 if variable j enters the model

0 otherwise.

We standardize each covariate to have zero mean and unit variance and assume prior independencebetween the intercept β0 and the slope β. We can decompose the joint prior as

p(β0, β, I) = p(β0)p(β, I) = p(β0)p(β|I)p(I).

We will use normal priors for both β0 and β. We also assume that the intercept is always included ineach parameter, so the variable selection indicator for β0 is always one.

3.1. Prior for the intercept. We set the prior for the intercept by following the strategy in Villaniet al. (2012) that firstly puts prior information on the model parameters (e.g. τ and λL in the Joe-Clayton copula), and then derive the implied prior on the intercept β0 under the assumption that thecovariates are at their means. The technique can be applied to the following two situations directly.When the link is the identity, setting the implied prior on the model parameter is trivially the same ason the intercept. When the link is the log function, assuming a log-normal distribution on the modelparameter with mean m and variance σ2 yields a normal prior with mean log(m)− log[σ2/m2 + 1]/2 andvariance log[σ2/m2 + 1] in the intercept.

We now generalize this to a general situation that can be applied to any link function with anydistribution. We take the tail-dependence parameter λ as an example, where 0 < λ < 1 in some copulafunctions, e.g. the Joe-Clayton copula in Section 2.2. It is natural to consider using the logit link toconnect the tail-dependence λ with covariates. When there is only an intercept in the covariates, we haveλ = 1/(1 + exp(−β0)). Therefore, if we assume λ to have a beta distribution Beta(m,σ2) with mean mand variance σ2, we have the mean and variance for β0 as

E(β0) =

∫ 1

0

log(x

1− x)Beta(x,m, σ2)dx = ψ(α1)− ψ(α2),

V (β0) =

∫ 1

0

(log(x

1− x))2Beta(x,m, σ2)dx− E2(β0) = ψ1(α1) + ψ1(α2)

where ψ(·) and ψ1(·) are the digamma and trigamma functions respectively, α1 = −m(m2−m+ σ2)/σ2

and α2 = −1 +m+ (m− 1)2m/σ2. Higher order moments are also possible to obtain either analyticallyor numerically. We can now set the prior on the intercept β0 based on the derived mean and varianceinformation. The prior for the intercept in the Kendall’s τ parameter can be elicited in the same fashionbut the integration domain and link function should be changed accordingly.

3.2. Prior for the slope and variable selection indicators. We first consider the case withoutvariable selection. We assume that the slopes are normally distributed with mean zero and covariancematrix Σ. The extension to a non-zero mean is trivial. The covariance matrix is defined as Σ = c2 ·P−1where P is a positive definite symmetric matrix and c is a scaling factor. In the application, P is theidentity matrix. Using the inverse Fisher information for P as in Villani et al. (2012) is also possibility.

We now consider the case with variable selection. Conditional on the variable selection indicators, theslopes are still normal distributed with mean µI + Σ21Σ−1Ic (βIc − µIc) and covariance matrix becomes

ΣI −Σ21Σ−1Ic Σ12 (Mardia and Kent, 1979), with obvious notations. The prior for each variable selectionindicator is identically Bernoulli distributed with probability of p.

A shrinkage prior is often used as an alternative method for reducing model complexity. In ourexperience, the choice between variable selection and shrinkage estimator depends on the context of theapplication. Variable selection is usually used to select meaningful variables, which is of great interest

Page 7: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

COVARIATE-DEPENDENT COPULA 7

here as we are exploring which variables that explain or drive the dependence among variables. See alsoVach et al. (2001) for a comparison for the two approaches in some commonly used models.

3.3. Priors when the parameters are dependent. In this section we consider a special case whentwo or more model parameters are dependent a priori. When we reparametrize the original densityfunction in terms of other parameters, it is common to introduce a variational dependence between thenew parameters in the sense that the outcome of one parameter puts a restriction on the domain of theother parameter. In our model, the original Joe-Clayton copula has two parameters θ and δ which arevariationally independent. When we reparametrize it in terms of lower tail-dependence and Kendall’s τ ,Section 2.2.3 shows the inequality between the two parameters (see Figure 1 for a visualization of therelations between the parameters).

As before, our aim is to elicit a prior on βτ0,, βτ , βλ0 and βλ via an elicited joint distribution on τ andλ. When the parameters are variationally dependent and we can no longer assume prior independenceand instead we decompose the joint prior for the model parameters as

p(τ, λ) = p(τ |λ)p(λ). (2)

The marginal priors for βλ0 , βλ and its variable selection indicators Iλ are the same as in Section 3.1and Section 3.2. We will now document the prior for βτ0 , βτ and its variable selection indicators Iτconditional on λ.

We first introduce the generalized beta function and the generalized logit link function.

Definition 1. The generalized beta distribution. Let gBeta(x, a, b,m, σ) be the generalized betadistribution with mean m, standard deviation σ where a < x < b. Then (x− a)/(b− a) follows the betadistribution with mean (m− a)/(b− a) and standard deviation σ/(b− a).

Definition 2. The generalized logit function. The generalized logit function that extends the logitfunction with two parameters a and b as

glogit(x, a, b) = a+b− a

1 + exp(−x),

where a < x < b.

The generalized beta distribution and the generalized logit link function now both have two boundaryparameters a and b. Furthermore, when a = 0 and b = 1, they reduce to their usual form.

Based on the decomposition in (2) we now assume that τ in Section 2 follows a generalized betadistribution gBeta(x, a, b,m, σ) conditional on λ with the generalized logit link glogit(Xτβτ , a, b) wherea = log(2)/(log(2)− log(λ)) and b = 1. We can now elicit the prior on the intercept, slopes and variableselection indicators for τ conditional on λ by following Section 3.1 and Section 3.2.

Furthermore, for conditional dependence with more than two parameters, we can always decomposethe joint distribution with pairwise conditional distributions and apply the technique thereafter. It isshown in our application that the conditional link function used in the prior also makes the MCMC algo-rithm more robust and gives higher acceptance probability in Metropolis-Hastings algorithm comparedto the case where the prior is simply truncated to the region of allowed (τ, λ) pairs and all proposaldraws outside this region are rejected in the MCMC.

3.4. The joint posterior. The posterior in the copula model can be written in terms of the likelihoodsfrom the marginal distributions, the copula likelihood and the prior for parameters in the copula andmarginal distributions as

log p({β,I}|y,x) =constant +∑M

j=1log p(y.j |{β,I}j ,xj)

+ logLC(u|{β,I}C ,y,x) + log p({β,I})

where log p(y.j |{β,I}j ,xj) is the log likelihood in j:th margin, the sets {β, I}j are the parameter blocks

in the j:th margin. Furthermore, u = (u1, ...,uM ), where uj = (u1j , ..., unj)′

and uij = Fj(yij), andFj(·) is the CDF of the j:th marginal distribution and LC is the likelihood for the copula function. Inour application, we have M = 2 and we use the reparametrized Joe-Clayton copula defined in Section2.2.4.

Page 8: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

8 FENG LI

4. The general MCMC scheme

We update the copula component together with the marginal components jointly. The joint posterioris not tractable and we use the Metropolis–Hastings within Gibbs sampler, i.e. a Gibbs sampler is usedfor updating the joint parameter components, with each conditional parameter block {β,I} updated bythe Metropolis–Hastings algorithm. The complete updating scheme is as follows.

The updating order in the Gibbs sampler is given in Table 1. We jointly update the coefficientsand variable selection indicators {β,I} in each parameter block using an efficient tailored Metropolis–Hastings algorithm with integrated finite-step Newton proposals. The acceptance probability for a

proposed draw {β(p),I(p)} conditional on current value of the parameters {β(c),I(c)} is

min

[1,p({β(p),I(p)}|{β(c),I(c)},y,x)g({β(c),I(c)}|{β(p),I(p)})p({β(c),I(c)}|{β(p),I(p)},y,x)g({β(p),I(p)}|{β(c),I(c)})

]where g(·) is the jumping rule in the Metropolis-Hastings for {β,I}. Note that it is convenient to

decompose g({β(p),I(p)}|{β(c),I(c)} = g1(β(p)|{β(c),I(p)}i)g2(I(p)|{β(c),I(c)}). And g1(·) is the pro-posal distribution where the proposal mode is from finite-step Newton approximation of the posteriordistribution (usually smaller than three steps) starting on current draw and the proposal covariance ma-trix is from the negative inverse Hessian matrix. In our application, g1(·) is multivariate t distributionwith six degrees of freedom. The distribution of g2(·) is such that we always propose a change of I: acurrently excluded variable is proposed to enter the model, and vice versa. We do not allow all indicatorto change in a given iteration, each indicator is proposed to change with probability pprop. This simplescheme works well in the copula model. For alternative types of variable selection schemes, see e.g. Nottand Kohn (2005).

The updating scheme is used in e.g. Villani et al. (2009) and Villani et al. (2012) where it is shownthat Metropolis–Hastings with finite-step Newton proposals increases the convergence rate rapidly. Thealgorithm only requires the gradient for the marginal distribution and copula model with respect to theirthe (low-dimensional) parameters. Appendix A.1 documents the details for calculating the gradientwith respect to copula features for reparametrized copulas in the MCMC implementation with bothindependent and dependent link functions.

Table 1. The Gibbs sampler for covariate-dependent copula. The notation {βµ,Iµ}mdenotes the covariates coefficients and variable selection indicators in copula componentm for parameter feature µ. And the notation {βµ,Iµ}−m indicates all other parametersin the model except {βµ,Iµ}m. The updating order is column-wise from left to right.If dependent link functions are used, the updating should be ordered accordingly.

Margin component (1) ... Margin component (M) Copula component (C)

(1.1) {βµ,Iµ}1|{βµ,Iµ}−1 ... (M.1) {βµ,Iµ}M |{βµ,Iµ}−M (C.1) {βλ,Iλ}C |{βλ,Iλ}−C(1.2) {βφ,Iφ}1|{βφ,Iφ}−1 ... (M.2) {βφ,Iφ}M |{βφ,Iφ}−M (C.2) {βτ ,Iτ}C |{βτ ,Iτ}−C(1.3) {βν ,Iν}1|{βν ,Iν}−1 ... (M.3) {βν ,Iν}M |{βν ,Iν}−M(1.4) {βκ,Iκ}1|{βκ,Iκ}−1 ... (M.4) {βκ,Iκ}M |{βκ,Iκ}−M

An alternative approach is the two-stage estimation method which first independently estimate themargins and then estimates the copula likelihood conditional on the estimated margins, see e.g. Xu(1996) and Joe (1997). The two-stage estimation method is widely used as it reduces the computationaldifficulty in maximizing the likelihood for high-dimensional copula models. Joe (2005) shows that theasymptotic relative efficiency of the two-stage estimation procedure depends on how close the copula isto the Frechet bounds. Furthermore, Engle and Sheppard (2001) shows that the two-stage approach inestimating the multivariate DCC-GARCH model is consistent but not fully efficient due to the limitedinformation provided by the estimators.

5. Simulation studies

In this section, we generate random datasets with our data generating process (DGP) in Section 5.1with various tail-dependence structures and estimate them with our Joe-Clayton copula and proposedMCMC scheme described in Section 4. Our model comparison in Section 5.3 is based on out-of-samplecross-validation.

Page 9: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

COVARIATE-DEPENDENT COPULA 9

5.1. DGP for copula model with dynamic dependences. The procedure of generating bivariaterandom variable with dynamic dependence is different from usual data generating process (DGP) suchas in regression. We document the algorithm as follows.

(1) Generate copula parameters.(a) Randomly generate n lower tail-dependence λL and n upper tail dependence λU from beta

distribution to allow for dynamic dependences, with mean 0 < λL, λU < 1 and standarddeviation σL and σU , respectively.

(b) Calculate linear predictors ηL = l(λL) and ηU = l(λU ) from link function l(·).(2) Generate dependent n× p covariates x for copula parameters for ηL and ηU .

(a) Set p × 1 coefficients β. Randomly pick one element β1 6= 0 from β and set the remainingp− 1 elements as β0.

(b) Randomly generate n× (p− 1) matrix x0 from U [0, 1].(c) Calculate x1 = (η − x0β0)/β1.(d) Let x = [x0,x1], so that the dependence parameter λ satisfies the covariate-dependent

structure λ = l−1(xβ).(3) Generate n × 2 matrix u in [0, 1]2 for Joe-Clayton copula with λL and λU in step 1.(a) by

conditional method (Nelsen, 2006).(4) Repeat step 1-2 with link function used in Table 3 to generate two sets of split−t marginal

parameters including µ, scale parameter φ, degrees of freedom ν, skewness parameter κ andtheir connected covariates x.

(5) Calculate quantile y, the finally generated n× 2 variable matrix with dynamic dependences, foreach margin u given in step 3 and its marginal parameters in step 4.

In the simulation studies, we generate 16 datasets of size n = 1, 000 with combination of differentmeans in lower and upper tail-dependences which are shown in Table 2.

Table 2. DGP settings for parameters in Joe-Clayton copula and split-t margins. Thefirst element of β coefficient is set for intercept.

Parameter feature Distribution mean sd True coefficients β within link function l(xβ)

mean µ Beta 0 1 [1, 1, −1, 1, −1, 0, 0]scale parameter φ logNormal 1 1 [1, 1, −1, 1, −1, 0, 0]degrees of freedom ν logNormal 6 1 [1, 1, −1, 1, −1, 0, 0]skewness parameter κ logNormal 1 1 [1, 1, −1, 1, −1, 0, 0]

lowertail-dependence λL Beta [0.3, 0.5, 0.7, 0.9] 0.1 [1, 1, −1, 1, −1, 0, 0, 1, −1, 1, −1, 0, 0]upper tail-dependence λU Beta [0.3, 0.5, 0.7, 0.9] 0.1 [1, 1, −1, 1, −1, 0, 0, 1, −1, 1, −1, 0, 0]

5.2. Model setup and computing time. We use exactly the same model setup for simulations inthis section and real data application in Section 6. The initial values for coefficients β in the MCMCare obtained by numerical optimization of the posterior distribution from a two-stage approach withrandomly selected 10% of the full dataset. This is just to guarantee the very first MCMC step can startgracefully. The initial variable selection indicators I are set to one in our application.

The prior for variable selection indicators is Bernoulli(0.5) which means a variable has equal prob-ability to be included and excluded in the model. The priors for coefficients are listed in Table 3 andthe details are described in Section 3. The scaling factors in Section 3.2 are set to be one in all pa-rameters, i.e., no shrinkage is used. Within each Metropolis-Hastings step, the proposal distributionfor coefficients is a multivariate t distribution with mean based on three-step Newton iterations andsix degrees of freedom. The proposal for variable selection indicators is Bernoulli(0.5) as described inSection 4. Our proposed MCMC algorithm allows for variable selection to reduce the model complexity.In high-dimensional problems, one may set more strict proposal to reject more unnecessary variables forvariable selection indicators and set the variable selection indicator be zero at initial stage of MCMC.

An R package “CDCopula” for estimating the covariate-dependent copula model with our MCMCscheme, posterior summary and visualization, DGP and model comparison is available for download athttps://bitbucket.org/fli/cdcopula/ upon request. The program runs on a Linux cluster node withtwo 2.0GHz Intel Xeon CPUs with total of 16 cores and 1, 024GB RAM. Parallelization is used whencross-validation is applied. We set the number of MCMC iterations to be 20, 000 and discard the first

Page 10: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

10 FENG LI

Table 3. Prior belief on parameter features and implied priors for coefficients incovariate-dependent copula models.

Parameter feature Link function Prior belief on feature Implied normal prior on β

mean µ identity N(mean = 0, var = 1) N(mean = 0, var = 1)scale parameter φ log logNorm(mean = 1, var = 1) N(mean = −0.34, var = 0.69)degrees of freedom ν log logNorm(mean = 5, var = 10) N(mean = 1.44, var = 0.34)skewness parameter κ log logNorm(mean = 1, var = 1) N(mean = −0.34, var = 0.69)

Kendall’s τ & tail-dependences λL, λU

logit Beta(mean = 0.2, var = 0.05) N(mean = −2.55, var = 6.91)

10% as burn-in. It takes roughly two hours to run a four-fold cross-validation with DGP data based onfull model including covariate-dependent structure and variable selection in all parameters.

5.3. Comparison of different modeling strategies. We evaluate the model performance based onK-fold out-of-sample log predictive score (LPS), defined as

LPS =1

K

∑K

k=1log p(yd|y−d,x),

where yd is an nd×p matrix containing the nd observations in the dth testing sample and y−d denotes thetraining observations used for estimation. If we assume that the observations are independent conditionalon {β,I}, then

p(yd|y−d,x) =

∫ ∏i∈d

p(yi|{β,I},xi)p({β,I}|y−d)d{β,I}.

The LPS is easily calculated by averaging∏i∈d p(yi|{β,I},xi) over the posterior draws from p({β,I}|y−d).

This requires sampling from each of the K posteriors p({β,I}|y−d) for k = 1, ...,K. The main advan-tage for choosing LPS instead of marginal likelihood based criteria is that the LPS is not sensitive to thechoice of prior (Kass, 1993; Richardson and Green, 1997).

Table 4 shows LPS of four-fold cross-validation for Joe-Clayton copula with given DGP settings inTable 2. Under each DGP setting, we estimate the model with our proposed joint approach and a two-stage approach. With each MCMC approach, we estimate our model using two different settings with i)our full covariate dependent structure plus variable selection scheme and ii) just modeling the constantin copula parameters. So four modeling strategies are applied for each dataset. It is worth mentioningthat we use full Bayesian method for estimating the marginal models in the two-stage approach. Thesecond-stage copula modeling is based on the full posterior in marginal models which is a significantlyimproved version of the typical two-stage approach, where only parameter mean of marginal model isused.

Note that LPS is not directly comparable among different datasets because the datasets are charac-tered with different lower and upper tail-dependences. We also have explored different models includingClayton copula, Gumbel copula, student t-copula in our model class. Compared with other models,out-of-sample cross-validation shows that Joe-Clayton copula performs the best. Due the limited paperlength, we choose to just present results based on Joe-Clayton copula. A real data comparison for thesefour copulas in our model class is presented in Table 6 in Section 6.2.

For all 16 DGP settings in Table 4, Joe-Clayton copula with full covariates in copula structure forjoint modeling approach performs the best compared to other three situations. The significance levelwith regard to LPS differences increase as the strength of tail-dependence increases. In same modelsetting, joint approach is always better than the two-stage approach. In our joint approach, includingcovariates in copula features can improve the out-of-sample performance, which is significant especiallywhen moderate or high dependences occur.

In the end of this section, we present the posterior summary of copula features in Figure 2 for jointmodeling approach with i) covariate-dependent structure in copula features, ii) only constant estimatedin copula feature, based on DGP (λL = 0.7, λU = 0.7) with 1, 000 observations in Table 4. We chooseto present this setup because in real financial application as show in Section 6, this setting is close tothe dependence during difficulty financial periods. Two subplots both depict modeling performance inan absolute sense by comparing abilities to recover true DGP features. It shows that our joint modelingapproach with covariates entered in copula parameters nicely captures the true DGP feature even whenvery high dependence occurs in the last part of second subplot. The model with only constant estimated

Page 11: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

COVARIATE-DEPENDENT COPULA 11

Table 4. LPS of four-fold cross-validation for Joe-Clayton copula with 16 DGP settingsand 64 simulations. Each dataset consists of 1, 000 observations with given mean andstandard deviation (0.1) for lower and upper tail-dependences.

DGP settings λ(DGP )U = 0.3 λU = 0.5 λU = 0.7 λU = 0.9

MCMC CD. Const. CD. Const. CD. Const. CD. Const.

λ(DGP )L = 0.3 J. −519.56 −520.91 −506.90 −508.95 −427.72 −432.35 −273.93 −306.99

T. −523.25 −522.00 −510.60 −511.75 −444.32 −439.68 −310.67 −321.38

λL = 0.5 J. −501.33 −502.57 −468.30 −471.97 −424.30 −436.54 −244.02 −268.56T. −510.51 −507.29 −476.68 −474.30 −446.38 −451.83 −299.08 −314.36

λL = 0.7 J. −440.81 −454.16 −424.20 −439.24 −380.30 −390.38 −243.16 −244.78T. −457.76 −460.83 −440.01 −440.70 −397.72 −402.37 −283.96 −295.11

λL = 0.9 J. −228.83 −256.11 −218.61 −294.52 −241.21 −255.13 −210.11 −269.86T. −244.01 -294.00 −292.74 −317.60 −280.67 −289.88 −259.15 −297.25

J. = joint modeling approach; T. = two-stage approach; CD. = covariate-dependent structure used in copula parameter;Const.= constant estimated in copula parameter (i.e. no covariate-dependent).

in copula features only captures the mean of true DGP features but fails to cover the volatility of featureseven with 95% HPD.

0 20 40 60 80 100

0.3

0.4

0.5

0.6

0.7

0.8

0.9

lam

bdaL

Posterior mean (No covariate−dependent) 95% HPD

Posterior mean 95% HPD DGP values

0 20 40 60 80 100

0.3

0.4

0.5

0.6

0.7

0.8

0.9

lam

bdaU

Posterior mean (No covariate−dependent) 95% HPD

Posterior mean 95% HPD DGP values

Figure 2. Posterior summary for lower and upper tail-dependences. Two models (J.+ CD. and J. + Const.) are estimated based on DGP (λL = 0.7, λU = 0.7) with 1, 000observations in Table 4. Only the tail-dependences for the first 100 observations areplotted due to resolution.

Page 12: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

12 FENG LI

6. Application to financial data

In order to illustrate our method, we use a financial application with daily stock returns. The copulamodel is the reparametrized Joe-Clayton copula with split-t distributions on the continuous margins.For the discrete case, see e.g. the approach by latent variables for the Gaussian copula in Pitt et al.(2006) and the extension to a general copula in Smith and Khaled (2012).

6.1. The S&P 100 and S&P 600 data. Our data are daily returns from the S&P 100 and S&P 600daily stock market indices during the period from February 01, 1989 to February 06, 2015 (Figure 3).The S&P 100 index includes the largest and most established companies in the U.S. which is a subsetof the well-known S&P 500 index. The S&P 600 index covers the small capitalization companies whichpresent the possibility of greater capital appreciation, but at greater risk. The S&P 600 index coversroughly three percent of the total US equities market.

1990 1995 2000 2005 2010 2015

−4

−2

02

4

S&P 600

1990 1995 2000 2005 2010 2015

−4

−2

02

4

S&P 100

Figure 3. Daily returns of the S&P 600 and S&P 100 indices from February 01, 1989to February 06, 2015.

Figure 3 shows the time series of the daily returns. It is seen that there is huge volatility in the returnsfor both S&P 100 and S&P 600 during the 2008 financial crisis. Figure 4 depicts six empirical densitiesfor Return that is orderly partitioned into blocks estimated with normal kernel methods by assumingindependent observations. The empirical densities in Figure 4 show that the dependence especially lowertail-dependence changes over time. There is sign of extreme positive dependence during 2008 financialcrisis. Figure 4 also suggests that there are strong lower and upper tail dependences during and after2008 financial crisis. The two-parameter Joe-Clayton copula is appropriate for modeling the data withoutrotating the scale.

Patton (2012a) uses hypothesis tests to show that there is significant time-varying dependence betweenS&P 100 and S&P 600. Both parametric and nonparametric methods are used to estimate the tail-dependence coefficient in different copula models in Patton (2012a). Nevertheless, little effort has beendevote to interpreting the dependence, in particular from using covariate information. The covariateswe used in the margins and in the copula function are described in Table 5. Villani et al. (2009) andLi et al. (2010) apply similar covariates in univariate response regression density estimation on S&P 500data using mixtures of Gaussian and asymmetric student’s t densities. Furthermore, Joe (2005) showsthat the usual two-stage approach for copula estimation is not fully efficient for extreme dependence,which is the case in our empirical studies in Figure 4.

6.2. Comparison of different copula models and volatility models. In our time series application,LPS is defined based on estimation of 80% of historical data and prediction of the last 20% data. Itevaluates the quality of the one-step-ahead predictions

LPS = log p(y(T+1):(T+p)|y1:T )

=

p∑i=1

log

∫p(yT+i|{β,I},y1:(T+i−1))p({β,I}|y1:(T+i−1))d{β,I}

where ya:b is the dataset from time a to b and {β,I} are the model parameters. This calculation of theLPS is usually computationally costly because every one-step-ahead prediction needs a new posteriorsample from the posterior based on the data available at the time of the forecast. We approximate theLPS by assuming that the posterior does not change much as we add a few data points to the estimationsample. Villani et al. (2009) document that this type of approximation is accurate in an application ofsmooth mixture of Gaussians for density predictions of the S&P 500 data. Furthermore, with simplealgebra, it can be shown that the global LPS in a copula model equals the sum of their LPSs in each

Page 13: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

COVARIATE-DEPENDENT COPULA 13

Table 5. Description of variables in the S&P 100 and S&P 600 data.

Variable Description

Return Daily return yt = 100 log(pt/pt−1) where pt is the closing price.RM1 Return of last day.RM5 Return of last week.RM20 Return of last month.CloseAbs95 Geometrically decaying average of absolute returns (1 − ρ)

∑∞s=0 ρ

s|yt−2−s| withρ = 0.95.

CloseAbs80 Geometrically decaying average of past absolute returns with ρ = 0.80.MaxMin95 Measure of volatility (1−ρ)

∑∞s=0 ρ

s(log(pht−1−s)−log(plt−1−s)) with ρ = 0.95, where

ph and pl are the highest and lowest prices.MaxMin80 Measure of volatility with ρ = 0.80.CloseSqr95 Geometrically decaying average of returns ((1−ρ)

∑∞s=0 ρ

sy2t−2−s)1/2 with ρ = 0.95.

CloseSqr80 Geometrically decaying average of returns with ρ = 0.80.

−2 −1 0 1 2

−2

−1

01

2

[1989−02−01, 1993−05−27]

−2 −1 0 1 2

−2

−1

01

2

[1993−05−28, 1997−09−23]

−2 −1 0 1 2

−2

−1

01

2

[1997−09−24, 2002−01−30]

−2 −1 0 1 2

−2

−1

01

2

[2002−01−31, 2006−06−02]

−2 −1 0 1 2

−2

−1

01

2

[2006−06−05, 2010−10−05]

−2 −1 0 1 2

−2

−1

01

2

[2010−10−06, 2015−02−06]

Figure 4. Contour plot for bivariate empirical densities with daily returns of S&P100 (y-axis) and S&P 600 (x-axis) indices from February 01, 1989 to February 06,2015. The whole data are partitioned into six data blocks in time order (marked insquared brackets) and the densities are estimated via normal kernel method in theirblock, respectively. The subplot in the middle of the second row is for the period during2008 financial crisis.

marginal model and copula component, i.e., LPS =∑Mi=1 LPSi + LPSC . This allows us to parallelly

compute LPS and compare the contribution in different components.Table 6 shows the out-of-sample comparison of Joe-Clayton copula, Clayton copula, Gumbel copula

and student t-copula in combination with split-t, GARCH and stochastic volatility margins. Both jointmodeling approach and two-stage approach are applied when applicable. For each combination, globalLPS for the full model is decomposed into marginal LPS components and copula LPS component. Injoint modeling approach, The Joe-Clayton copula with covariate dependent structure outperforms otherthree copulas that also have covariate dependent structures. The t-copula with correlation parameter anddegrees of freedom modeled with covariates is the second best model but still falls 36 LPS points behind.Compared to only modeling constant copula parameters, introducing covariates-dependent copula mod-eling improves prediction performance regarding LPS in every copula component in four copulas. Amongthe four copulas, t-copula is the weakest copula for modeling tail-dependences because its dependencerelies on small degrees of freedom. Allowing covariates entered into degrees of freedom through frame-work greatly enhances the LPS with about 90 points from 703.96 to 792.14. In the two-stage modelingapproach, we have compared three types of margins including the default split-t, GARCH and stochastic

Page 14: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

14 FENG LI

volatility with combination of four copula models that covariates entered in every copula parameters.Joe-Clayton copula with split-t margins is still the most successful model. GARCH margin and stochasticvolatility margin fall behind with distinctions in terms of LPS. We also calculated LPS for two bivari-ate DCC-GARCH and bivariate volatility model as benchmarks. Bivariate DCC-GARCH does muchbetter job than bivariate volatility model with a gap 269 LPS points. DCC-GARCH also outperformsall four copulas with stochastic volatility margin and GARCH margin. Out four covariate-dependentcopula models with split-t margins outperform DCC-GARCH with strong significance of least 350 LPSpoints. It is worth mentioning that implementing the univariate and multivariate GARCH model re-quires carefully calibrating their likelihood functions and constrains. However our inference based oncovariate-dependent structures is much straightforward and also makes the interpretation friendly. Tosummarize, models with covariates in the copula features could greatly improve the capacity of modelingdependences and outperform the commonly used volatility models. Joint modeling is more efficient thantwo-stage approach.

Table 6. Model comparison with LPS. Note that variable selection is used in all situa-tions in the margins. C(CD.) indicates covariate-contingent structure used in reparam-eterized copula component, and C(Const.) means only constant tail-dependence andcorrelation are estimated in copula component. Note that in principle, the global LPSshould be equal to the summation of LPSs in marginal and copula components. Due toMCMC numerical errors, the results are not exactly the same. The numerical standarderrors are all below one in all LPSs. The LPS for Joe-Clayton copula C(λL, τ) is thesame as C(λL, λU ).

Reparameterized CopulasJoe-Clayton Clayton Gumbel t-Copula

Margins LPS decomposition

(joint modeling approach)SPLIT-t M1 −1743.12 −1741.04 −1754.36 −1741.47

M2 −1435.98 −1468.25 −1485.68 −1430.07C(CD.) 837.50 690.22 797.78 792.14Global −2344.12 −2523.75 −2448.14 −2380.12

SPLIT-t M1 −1747.99 −1747.15 −1754.61 −1782.37M2 −1434.22 −1449.95 −1446.84 −1658.09C(Const.) 779.14 654.46 780.33 703.96Global −2411.06 −2547.14 −2421.15 −2736.49

(two-stage modeling approach)SPLIT-t M1 −1740.10 −1741.05 −1737.73 −1741.47

M2 −1428.39 −1436.63 −1427.83 −1433.41C(CD.) 819.63 694.84 781.39 788.22Global −2346.61 −2483.93 −2392.13 −2389.41

GARCH M1 −1948.07 −1948.07 −1948.07 −1948.07M2 −1673.85 −1673.85 −1673.85 −1673.85C(CD.) 702.35 530.48 810.39 791.55global −2919.57 -3091.44 −2811.53 −2830.37

SV M1 −2166.90 −2154.18 −2168.17 −2179.36M2 −1811.36 −1844.57 −1808.61 −1808.24C(CD.) 964.37 698.30 1012.10 1053.19Global −3013.90 −3300.46 −2964.68 −2934.40

(bivariate volatility models)DCC-GARCH −2730.78SV −2999.63

Page 15: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

COVARIATE-DEPENDENT COPULA 15

6.3. Posterior summary. We present the posterior summary for Joe-Clayton copula model with split-tmargins in two parameterization in terms of i) lower and upper tail-dependence (Table 7) and ii) lowertail-dependence and Kendall’s τ (Figure 5). The conditional link function is used for the dependentKendall’s τ and lower tail-dependence parameterization. We also estimated the model with the indepen-dent link for comparison. The mean posterior acceptance probability is 71% in comparison to 50% whenthe independent link function is used. The efficiency of the MCMC is monitored via the inefficiencyfactor IF = 1 + 2

∑∞i=1 ρi, where ρi is the autocorrelation at lag i in the MCMC iterations.

Our marginal models are similar to the model in Li et al. (2010) except that the location parameterof the split-t is fixed in Li et al. (2010). We focus on explaining the results in the copula componentand refer to Li et al. (2010) for a detailed interpretation of the marginal models. The variable selectionresults show that important variables for lower tail-dependence are RM5 from S&P 100, CloseAbs95 fromS&P 600, CloseAbs80 from both margins, MaxMin95 from S&P 100, and CloseSqr95 from both margins.Variables with large posterior inclusion probabilities in the upper-tail dependence are CloseAbs95 fromS&P 100, CloseSqr95 from S&P 100 and CloseSqr80 from S&P 600.

Multicollinearity may occur when the same variables are used in a margin and in the copula parame-ters. The same covariate in S&P 100 and in S&P 600 tend to be highly correlated. Since covariates inboth margins are used in the copula, there is a risk of covariate duplication with associated problems withnon-identification when an inefficient MCMC algorithm is used. Table 7 shows that when a covariate inone margin is selected in the copula feature, the same covariate in the other margin does not appear inthe copula with only two exceptions, which indicates that our variable selection algorithm is efficient inremoving superfluous covariates. Also notice that the initial value for variable selection indicators areall one, i.e., all covariates all in the model. our inclusion probabilities are very selective in the sensethat very few indicators are close to undecided region around 0.5 nor selecting all of the variables as theinitial values suggested. This also suggests that our algorithm is competent.

Figure 5 shows a correlation between S&P 100 and S&P 600 indices with the posterior mean and 95%HPD of Kendall’s τ , lower tail-dependence and upper tail-dependence. The tail-dependence is not sostrong during normal time, but there is significant variation over time. There is a very high dependencein the tail even though the overall correlation is relatively small. Figure 6 depicts the contour plot for theposterior density of Joe-Clayton copula model with dates that are selected based on estimated lower tail-dependence. We can see that the lower tail-dependence is much higher during the crisis in comparisonto the dates before the crisis. Also note from the shape of contour lines that the empirical densities inFigure 4 overestimate the dependence during normal time and underestimates the dependence during the2008 financial crisis because it assumes temporally independent observations. Our covariate-dependentcopula model not only captures but also estimates the dynamic dependence.

7. Concluding remarks

We have proposed a general approach for modeling a covariate-dependent copula. The copula param-eters as well as the parameters in the margins are linked to covariates. We use an efficient BayesianMCMC method to sample the posterior distribution and to simultaneously perform variable selection inall parts of the model. Both simulation studies and application to the daily returns of S&P 100 and S&P600 indices show the advantages of this approach in terms of understanding the copula dependence viacovariates and improved out-of-sample prediction performance. High-dimensional covariate-dependentcopula modeling with discrete margins is also possible using the data augmentation in Pitt et al. (2006)and Smith and Khaled (2012) or vine copula constructions but requires further extensive studies.

8. Acknowledgement

The author is grateful to Associate Editor, two anonymous reviewers, Prof. Mattias Villani, Prof.Sylvia Fruhwirth-Schnatter and Prof. Rong Chen for helpful comments that improved the contents ofthe paper.

Appendix A. The MCMC details

In this section, we briefly present the MCMC details. The MCMC implementation is straightforward,but requires great care of the proposal distribution in the Metropolis–Hastings algorithm.

A.1. The chain rule for parameters in copula models. We use the finite-step Newton methodembedded in the Metropolis-Hastings algorithm that requires the analytical gradient for the posteriorwith respect to the parameters of interest in marginal and copula components. The chain rule of gradientconveniently modularizes the copula model and reduces the complexity of the the gradient calculation.

Page 16: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

16 FENG LI

Table 7. Posterior mean summary of Joe-Clayton copula model reparameterized interms of lower tail-dependence (λL) and upper tail-dependence (λU ) with S&P 100 andS&P 600 data from February 01, 1989 to February 06, 2015. In the copula componentpart, the first row and second row for β and I are the results for the combined covariatesthat are used in the first and second marginal models, respectively. The intercepts arealways included in the model. Variables that are selected with inclusion probabilitygreater than 0.5 are marked in bold. The average of inefficiency factors for all parametersis 29. The average acceptance probabilities are all above 0.70 within each Metropolis-Hastings.

Intercept RM1 RM5 RM20 CloseAbs95 CloseAbs80 MaxMin95 MaxMin80 CloseSqr95 CloseSqr80

Marginal component (S&P 600)

βµ 0.146 0.120 0.000 0.006 0.001 0.116 −0.001 0.002 −0.013 −0.046Iµ 1.00 1.00 0.01 0.22 0.02 0.96 0.05 0.08 0.24 0.53

βφ −0.318 0.041 −0.115 −0.004 −0.010 −0.012 0.217 0.002 0.019 0.243Iφ 1.00 1.00 1.00 0.16 0.13 0.17 0.94 0.12 0.18 0.96

βν 1.405 0.008 −0.230 0.041 −0.160 −0.098 −0.220 −0.414 0.428 0.641Iν 1.00 0.13 0.78 0.27 0.48 0.38 0.50 0.65 0.56 0.74

βκ −0.248 −0.116 0.000 0.000 −0.039 −0.009 −0.057 −0.015 0.174 −0.003Iκ 1.00 1.00 0.02 0.17 0.29 0.16 0.38 0.32 0.83 0.13

Marginal component(S&P 100)

βµ 0.091 0.000 −0.001 −0.019 0.010 0.000 0.011 −0.005 0.011 0.020Iµ 1.00 0.02 0.05 0.57 0.30 0.13 0.25 0.15 0.19 0.26

βφ −0.432 −0.001 −0.109 −0.026 0.481 −0.191 0.135 0.045 −0.477 0.339Iφ 1.00 0.05 1.00 0.60 0.96 0.96 0.58 0.33 1.00 0.98

βν 0.641 −0.024 −0.173 0.067 −0.151 −0.160 −0.001 0.035 −0.005 −0.055Iν 1.00 0.20 0.62 0.36 0.41 0.42 0.34 0.43 0.32 0.27

βκ −0.140 −0.058 0.000 −0.005 0.033 0.000 0.020 0.000 −0.077 0.007Iκ 1.00 1.00 0.05 0.15 0.20 0.09 0.21 0.02 0.38 0.08

Copula component(C)

βλL 0.508 0.003 −0.017 0.000 1.497 −0.449 0.031 −0.030 0.359 0.0400.001 −0.072 0.002 0.088 0.268 −0.811 0.043 −0.498 −0.090

IλL 1.00 0.07 0.42 0.00 1.00 0.84 0.27 0.20 0.75 0.370.04 0.55 0.04 0.44 0.64 0.96 0.16 0.82 0.45

βλU 0.246 −0.001 −0.001 0.000 0.044 0.001 1.384 0.005 0.199 0.347−0.001 −0.005 −0.005 −0.887 −0.066 −0.466 0.226 −0.229 −0.157

IλU 1.00 0.03 0.02 0.11 0.11 0.08 1.00 0.13 0.47 0.940.02 0.09 0.36 1.00 0.41 0.43 0.43 0.67 0.38

∂ log c(u1, ..., uM , λL, τ)

∂λL=∂ log c(u1, ..., uM , θ, δ)

∂δ×(∂λL∂δ

)−1∂ log c(u1, ..., um, λL, τ)

∂τ=∂ log c(u1, ..., um, θ, δ)

∂θ×(∂τ(θ, δ)

∂θ

)−1∂ log c(u1, ..., uM , λL, τ)

∂ϕm=∂ log c(u1, ..., uM , θ, δ)

∂um× ∂um∂ϕm

+∂ log pm(ym, ϕm)

∂ϕm

Page 17: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

COVARIATE-DEPENDENT COPULA 17

Figure 5. The posterior mean and 95% highest probability density (HPD) intervals forJoe-Clayton copula model reparameterized in terms of lower tail-dependence (λL) andKendall’s τ (τ) with S&P 600 and S&P 100 data from February 01, 1989 to February06, 2015. The top-left and top-right four subplots show time series plots of the location,scale, degrees of freedom and skewness parameters in margin S&P 600 and S&P 100,respectively. The three subplots at the bottom are the time series plots for Kendall’s τ ,the lower tail-dependence, and the upper tail-dependence which is derived based on theposterior of lower tail-dependence and Kendall’s τ .

where ϕm is any parameter in the m:th marginal density pm(ym, ϕm) and um is the CDF function ofits marginal density. The parameters θ and δ are the intermediate parameters that link the dependenceand correlations with the traditional parametrization for copulas. Particularly, obtaining the gradient

Page 18: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

18 FENG LI

−2 −1 0 1 2

−2

−1

01

2(0.38, 0.33)

1991−12−31

−2 −1 0 1 2

−2

−1

01

2

(0.47, 0.36)

1992−02−21

−2 −1 0 1 2

−2

−1

01

2

(0.51, 0.43)

1989−08−29

−2 −1 0 1 2

−2

−1

01

2

(0.56, 0.45)

1990−04−12

−4 −2 0 2 4

−4

−2

02

4

(0.72, 0.48)

1991−02−22

−2 −1 0 1 2

−2

−1

01

2

(0.41, 0.39)

1993−07−02

−2 −1 0 1 2

−2

−1

01

2(0.47, 0.48)

1996−03−14

−2 −1 0 1 2

−2

−1

01

2

(0.51, 0.47)

1995−03−30

−2 −1 0 1 2

−2

−1

01

2

(0.54, 0.48)

1997−08−15

−4 −2 0 2 4

−4

−2

02

4

(0.66, 0.6)

1996−07−30

−2 −1 0 1 2

−2

−1

01

2

(0.4, 0.3)

1999−11−19

−2 −1 0 1 2

−2

−1

01

2

(0.54, 0.5)

1999−04−14

−2 −1 0 1 2

−2

−1

01

2(0.6, 0.57)

2001−02−12

−2 −1 0 1 2

−2

−1

01

2

(0.67, 0.66)

2001−03−05

−4 −2 0 2 4

−4

−2

02

4

(0.82, 0.61)

2000−05−24

−2 −1 0 1 2

−2

−1

01

2

(0.54, 0.57)

2003−03−20

−2 −1 0 1 2

−2

−1

01

2

(0.63, 0.48)

2002−04−02

−2 −1 0 1 2

−2

−1

01

2

(0.67, 0.52)

2003−01−15

−2 −1 0 1 2

−2

−1

01

2

(0.7, 0.7)

2004−08−10

−4 −2 0 2 4

−4

−2

02

4

(0.8, 0.46)

2002−10−29

−2 −1 0 1 2

−2

−1

01

2

(0.57, 0.71)

2007−03−01

−2 −1 0 1 2

−2

−1

01

2

(0.65, 0.65)

2007−08−13

−2 −1 0 1 2

−2

−1

01

2

(0.71, 0.66)

2009−08−17

−2 −1 0 1 2

−2

−1

01

2

(0.77, 0.73)

2010−06−23

−30 −10 0 10 20

−3

0−

10

10

(0.96, 0.75)

2008−12−16

−2 −1 0 1 2

−2

−1

01

2

(0.53, 0.45)

2012−11−07

−2 −1 0 1 2

−2

−1

01

2

(0.6, 0.54)

2014−02−20

−2 −1 0 1 2

−2

−1

01

2

(0.64, 0.63)

2014−10−09

−2 −1 0 1 2

−2

−1

01

2

(0.68, 0.68)

2012−04−05

−4 −2 0 2 4

−4

−2

02

4

(0.92, 0.78)

2011−11−07

Figure 6. Contour plot of the posterior densities for selected dates with Joe-Claytoncopula model reparameterized in terms of lower tail-dependence and upper tail-dependence (marked in brackets) for S&P 100 (y-axis) and S&P 600 (x-axis) data fromFebruary 01, 1989 to February 06, 2015. Five days (column-wise view) sorted by as-cending order of lower tail-dependences (1%, 25%, 50%, 75% and 99% quantiles) areselected in each data block of total six blocks (row-wise view) used in Figure 4. Thesubplots in the second last row are for the period during 2008 financial crisis.

for t copula requires the following extra decomposition,

∂ log ct(u1, ..., uM , λL, τ)

∂um=∂ log pt(x1, ..., xM , ν)

∂xm/∂um∂xm

where um =∫pm(xm, ν)dxm are the CDF for univariate student t density with ν degrees of freedom in

the mth margin. Gradients for other elliptical copulas are obtained in the similar way.

Page 19: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

COVARIATE-DEPENDENT COPULA 19

The MCMC algorithm requires evaluating τ−1λL excessively. It can be evaluated numerically or througha dictionary-lookup method. In practice, we have found that the dictionary-lookup method is particularlyfast and robust. Modeling the upper tail-dependence can be done in the same manner.

Our model in Section 2 is covariate-dependent. Let l(ϕ) = x′β be the link function where ϕ is theparameter of interest. The gradient expression can be written as

∂ log c(u1, ..., uM , ϕ)

∂β=∂ log c(u1, ..., uM , ϕ)

∂ϕ×(∂l(ϕ)

∂ϕ

)−1× ∂x′β

∂β.

When the conditional link function is used, e.g. τ depending on λL in our model in the link functionl(τ |λL) = x′β, the gradient for λL is slightly complicated. One needs to write τ as a function of λL withthe link function and substitute it into the copula density. The gradient for λL is obtained thereafter.The details are straightforward, but lengthy, and will be omitted here.

A.2. Gradients for parameters in Joe-Clayton copula. The gradient for the Joe-Clayton copulaw.r.t lower tail-dependence parameters λL can be decomposed as

∂ log c(u, v, θ, δ)

∂δ=− log T1(u)− log T1(v)− 2(1 + δ)∆1

δL1− (

1

θ− 2)

logL1 − δ∆1/L1

δ2(L1/δ1 − 1)

+2 logL1

δ2+L1/δ1 − (1 + δ)L

1/δ1 (logL1 − δ∆1/L1)/δ2 − 1

(1 + δ)L1/δ1 − δ − 1/θ

,

where ∆1 = ∂L1/∂δ = −T1(u)−δ log T1(u)− T1(v)−δ log T1(v). Furthermore, ∂λL/∂δ = 2−1/δ log(2)/δ2.The gradient for Joe-Clayton copula with respect to the Kendall’s τ is decomposed as

∂ log c(u, v, θ, δ)

∂θ=− (1 + δ)∆2(0) + ∆3(u) + ∆3(v) + 2(1 + δ)∆2(−δ)/L1

+(1− 2θ)∆2(1/δ)

(1− L−1/δ1 )θδ− log(1− L−1/δ1 )

θ2

+(1 + δ)L

1/δ1 + θ(1 + δ)/δL

1/δ−11 ∆2(−1/δ)− δ

(1 + δ)θL1/δ1 − θδ − 1

where ∆2(d) = −T1(u)d−1(1− u)θ log(1− u)− T1(v)d−1(1− v)θ log(1− v) and ∆3(s) = ∂ log T2(s)/∂θ =(1− s)θ−1 log(1− s)/T2(s). Furthermore,

∂τ(θ, δ)

∂θ=

−2/[(θ − 2)2δ]− 8B(2 + δ, 2/θ − 1) [θ + ψ(2/θ − 1)− ψ(2/θ + δ + 1)] /(θ4δ),

1 ≤ θ < 2;

−[12 + 24ψ(1) + 6ψ2(1) + π2 − 12(2 + ψ(1))ψ(2 + δ) + 6ψ2(2 + δ)− 6ψ1(2 + δ)]/(24δ),

θ = 2; −2(2 + δ)θ4B(1 + δ + 2/θ, 2− 2/θ)− 8π2(θ − 2)2 cos(2π/θ)/ sin2(2π/θ)

−8π(θ − 2)2[ψ(1 + δ + 2/θ)− ψ(2− 2/θ)− θ]/ sin(2π/θ)

δ(2 + δ)(θ − 2)2θ4B(1 + δ + 2/θ, 2− 2/θ)

,

θ > 2.

where ψ1(·) is the trigamma function. then gradient for the case θ = 2 can be obtained by taking thelimiting result from the cases of 1 ≤ θ < 2 or θ > 2 when θ → 2.

∂τ(θ, δ)

∂δ=

−2/[(θ − 2)δ2] + 4B(2 + δ, 2/θ − 1)[ψ(2 + δ)− ψ(2/θ + δ + 1)− 1/δ]/(θ2δ),

1 ≤ θ < 2;

[ψ(2 + δ)− δψ1(2 + δ)− ψ(1)− 1]/δ2, θ = 2;

− 2

(θ − 2)δ2− 4π[ψ(3 + δ)− ψ(2/θ + δ + 1)− 2(1 + δ)/(2δ + δ2)]

(2 + δ)δθ2 sin(2π/θ)B(1 + δ + 2/θ, 2− 2/θ),

θ > 2.

For the Joe-Clayton copula, u and v are exchangeable, and we only present the derivative with respectto u:

Page 20: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

20 FENG LI

∂ log c(u, v, θ, δ)

∂u=(1 + δ)θ∆4(0) + (1− θ)[(1− u)θ−2/T2(u) + (1− v)θ−2/T2(v)]

− 2(1 + δ)∆4(−δ)/L1 − (1/θ − 2)L−1/δ−11 ∆4(−δ)/(1− L−1/δ1 )

− (1 + δ)θL1/δ−11 ∆4(−δ)/

[(1 + δ)θL

1/δ1 − θδ − 1

]where ∆4(d) = −T1(u)d−1(1− u)θ−1θ − T1(v)d−1(1− v)θ−1θ.

A.3. Gradients for parameters in t copula. The gradient for t copula w.r.t. the lower tail depen-dence λLij for ith and ith margins can be obtained via the following chain rule,

∂ log ct(u, λLij , τ)

∂λLij=∂ log pt(x, ν, ρij)

∂ν/∂λLij∂ν

where the tail-dependence for ith and jth margins (λLij) are (Embrechts et al., 1997)

λLij =

∫ π/2π/4−arcsin(ρij)/2 cosν(t)dt∫ π

0cosν(t)dt

and ρij is the correlation coefficient for ith and jth margins.

A.4. Gradients for parameters in marginal distributions. The direct derivatives of CDF functionand PDF functions with respect to their parameters are straightforward for most densities. See existingderivatives for PDF functions in e.g. Li et al. (2010) (asymmetric student-t density where asymmetricnormal and symmetric student-t densities are its special cases), Li et al. (2011) (gamma and log-normaldensities) and Villani et al. (2012) (negative binomial, beta and generalized Poisson densities) and Liand Villani (2013) (spline model with free knots as unknown parameters).

We only document the split-t case for CDF functions. Let I = κ if y > µ and I = 1 elsewhere, J = 1,if y > µ and J = −1, and A = I2νφ2/[(y − µ)2 + I2νφ2], the gradient for the split-t CDF function withrespect to it parameters µ, φ, κ, ν are as follows,

∂usplit−t(y, µ, φ, κ, ν)

∂µ=−

2I√

1(y−µ)2+I2νφ2A

ν/2

(1 + κ)Beta[ν2 ,

12

] ,

∂usplit−t(y, µ, φ, κ, ν)

∂φ=−

2I(y − µ)√

1(y−µ)2+I2νφ2A

ν/2

(1 + κ)φBeta[ν2 ,

12

] ,

∂usplit−t(y, µ, φ, κ, ν)

∂φ=

−BetaR[A, ν2 ,

12 ]

(1+κ)2 y < µ

−2(1+κ)(y−µ)

√1

(y−µ)2+κ2νφ2Aν/2+Beta[A, ν2 ,

12 ]

(1+κ)2Beta[ ν2 ,12 ]

elsewhere,

∂usplit−t(y, µ, φ, κ, ν)

∂φ=

I

2(1 + κ)ν2Beta[ν2 ,

12

] {4JAν/2pFq

[{1

2,ν

2,ν

2

},{

1 +ν

2, 1 +

ν

2

}, A

]

(−2(y − µ)

√1

(y − µ)2 + κ2νφ2Aν/2

− νJ Beta

[A,

ν

2,

1

2

](log (A)− ψ

(ν2

)+ ψ

(1 + ν

2

)) }where BetaR is the regularized beta function, pFq is the generalized hypergeometric function.

However there are exceptions when this derivative is numerically unstable in practice. In this situation,we propose an alternative approach. Note that

∂u

∂ϕ=∂F (y, ϕ)

∂ϕ=

∫ y

−∞

∂f(x, ϕ)

∂ϕdx

where u = F (y, ϕ) is the CDF function of density f(y, ϕ), and calculating ∂f(y, ϕ)/∂ϕ is usually easierthan ∂F (y, ϕ)/∂ϕ. When the integral cannot be easily obtained analytically, numerical methods can beapplied in the last stage.

Page 21: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

COVARIATE-DEPENDENT COPULA 21

References

Aas, K., Czado, C., Frigessi, A., and Bakken, H. (2009), “Pair-copula constructions of multiple depen-dence,” Insurance: Mathematics and Economics, 44, 182–198.

Ausin, M. C. and Lopes, H. F. (2010), “Time-varying joint distribution through copulas,” ComputationalStatistics & Data Analysis, 54, 2383–2399.

Bollerslev, T. (1986), “Generalized autoregressive conditional heteroskedasticity,” Journal of Economet-rics, 31, 307–327.

Bouye, E. and Salmon, M. (2009), “Dynamic copula quantile regressions and tail area dynamic depen-dence in Forex markets,” The European Journal of Finance, 15, 721–750.

Clayton, D. (1978), “A model for association in bivariate life tables and its application in epidemiologicalstudies of familial tendency in chronic disease incidence,” Biometrika, 65, 141–151.

Czado, C. (2010), “Pair-copula constructions of multivariate copulas,” in Copula theory and its applica-tions: proceedings of the workshop held in Warsaw, 25-26 September 2009, eds. Jaworski, P., Durante,F., Hardle, W. K., and Rychlik, T., Springer, pp. 93–109.

Czado, C., Schepsmeier, U., and Min, A. (2012), “Maximum likelihood estimation of mixed C-vines withapplication to exchange rates,” Statistical Modelling, 12, 229–255.

Dobric, J. and Schmid, F. (2005), “Nonparametric estimation of the lower tail dependence λL in bivariatecopulas,” Journal of Applied Statistics, 32, 387–407.

Dorota, K. (2010), Dependence Modeling: Vine Copula Handbook, World Scientific.Draisma, G., Drees, H., Ferreira, A., and De Haan, L. (2004), “Bivariate tail estimation: dependence in

asymptotic independence,” Bernoulli, 10, 251–280.Durrleman, V., Nikeghbali, A., and Roncalli, T. (2000), “A simple tranformation of copulas,” Working

paper, groupe de Recherce Operationelle, Credit Lyonnais, France.Embrechts, P., Kluppelberg, C., and Mikosch, T. (1997), Modelling extremal events, vol. 33, Springer

Science & Business Media.Engle, R. F. and Sheppard, K. (2001), “Theoretical and Empirical properties of Dynamic Conditional

Correlation Multivariate GARCH,” Working Paper, 1–46.Garcıa, J. E., Gonzalez-Lopez, V., and Nelsen, R. (2013), “A new index to measure positive dependence

in trivariate distributions,” Journal of Multivariate Analysis, 115, 481–495.Jaworski, P., Durante, F., Hardle, W. K., and Rychlik, T. (2010), Copula theory and its applications:

proceedings of the workshop held in Warsaw, 25-26 September 2009, vol. 198, Springer.Joe, H. (1993), “Parametric families of multivariate distributions with given margins,” Journal of Mul-

tivariate Analysis, 46, 262–282.— (1997), Multivariate models and dependence concepts, Chapman & Hall, London.— (2005), “Asymptotic efficiency of the two-stage estimation method for copula-based models,” Journal

of Multivariate Analysis, 94, 401–419.Kass, R. (1993), “Bayes factors in practice,” Journal of the Royal Statistical Society. Series D (The

Statistician), 42, 551–560.Ledford, A. W. and Tawn, J. a. (1997), “Modelling Dependence within Joint Tail Regions,” Journal of

the Royal Statistical Society: Series B (Statistical Methodology), 59, 475–499.Li, F. and Villani, M. (2013), “Efficient Bayesian Multivariate Surface Regression,” Scandinavian Journal

of Statistics, 40, 706–723.Li, F., Villani, M., and Kohn, R. (2010), “Flexible modeling of conditional distributions using smooth

mixtures of asymmetric student t densities,” Journal of Statistical Planning and Inference, 140, 3638–3654.

— (2011), “Modeling Conditional Densities Using Finite Smooth Mixtures,” in Mixtures: estimation andapplications, eds. Mengersen, K., Robert, C., and Titterington, M., John Wiley & Sons Inc, Chichester,pp. 123–144.

Manner, H. and Reznikova, O. (2012), “A survey on time-varying copulas: specification, simulations,and application,” Econometric Reviews, 31, 654–687.

Mardia, K. and Kent, J. (1979), Multivariate analysis, Academic Press, London.Melino, A. and Turnbull, S. M. (1990), “Pricing foreign currency options with stochastic volatility,”

Journal of Econometrics, 45, 239–265.Nelsen, R. (2006), An introduction to copulas, Springer Verlag.Nott, D. and Kohn, R. (2005), “Adaptive sampling for Bayesian variable selection,” Biometrika, 92,

747–763.Patton, A. (2006), “Modelling Asymmetric Exchange Rate Dependence,” International economic review,

47, 527–556.

Page 22: arXiv:1401.0100v2 [stat.ME] 4 Nov 2016 · PDF file · 2016-11-07For example the Clayton copula and Gumbel copula (Joe, 1997) can only have positive correlations and tail-dependence

22 FENG LI

— (2012a), “Copula methods for forecasting multivariate time series,” Handbook of Economic Forecast-ing, 1–76.

— (2012b), “A review of copula models for economic time series,” Journal of Multivariate Analysis, 110,4–18.

Pitt, M., Chan, D., and Kohn, R. (2006), “Efficient Bayesian inference for Gaussian copula regressionmodels,” Biometrika, 93, 537–554.

Richardson, S. and Green, P. (1997), “On Bayesian analysis of mixtures with an unknown number of com-ponents (with discussion),” Journal of the Royal Statistical Society. Series B. Statistical Methodology,59, 731–792.

Schmidt, R. and Stadtmuller, U. (2006), “Non-parametric Estimation of Tail Dependence,” ScandinavianJournal of Statistics, 33, 307–335.

Sklar, A. (1959), “Fonctions de repartition a n dimensions et leurs marges,” Publications de l’Institut deStatistique de L’Universite de Paris, 8, 229–231.

Smith, M. and Khaled, M. (2012), “Estimation of Copula Models with Discrete Margins via BayesianData Augmentation,” Journal of the American Statistical Association, 107, 290–303.

Vach, K., Sauerbrei, W., and Schumacher, M. (2001), “Variable selection and shrinkage: comparison ofsome approaches,” Statistica Neerlandica, 55.

Villani, M., Kohn, R., and Giordani, P. (2009), “Regression density estimation using smooth adaptiveGaussian mixtures,” Journal of Econometrics, 153, 155–173.

Villani, M., Kohn, R., and Nott, D. J. (2012), “Generalized Smooth Finite Mixtures,” Journal of Econo-metrics, 171, 121–133.

Xu, J. (1996), “Statistical modelling and inference for multivariate and longitudinal discrete responsedata,” Ph.D. thesis, Department of Statistics, University of British Columbia.

Yu, J. and Meyer, R. (2006), “Multivariate stochastic volatility models: Bayesian estimation and modelcomparison,” Econometric Reviews, 25, 361–384.


Top Related