
Inferring volatility dynamics and risk premia from the S&P 500 and VIX markets∗

Chris Bardgett† Elise Gourier‡ Markus Leippold§

November 21, 2014

Abstract

This paper investigates the information contained in the S&P 500 and VIX markets. We

estimate a flexible affine model over a time series of underlying indices and option prices on

both markets simultaneously. We find that the S&P 500 and VIX derivatives markets contain

conflicting information on variance, especially during market distress. We analyze the information

spanned by each market on the distributions and tail properties of S&P 500 returns and their

variance process. An extensive model specification analysis reveals that jumps and two variance

factors help reproduce these distributions and play a significant role in the variance risk premium.

Keywords: S&P 500 and VIX joint modeling, variance risk premium, particle filter.

JEL Codes: G12, G13, C58.

∗The authors thank Yacine Aït-Sahalia, Peter Christoffersen, Jérôme Detemple, Garland Durham, Damir Filipović, Andras Fulop, Michael Johannes, Loriano Mancini, Chay Ornthanalai, Andrew Papanicolaou, Chris Rogers, Ronnie Sircar, Josef Teichmann, Fabio Trojani, Anders Trolle, Alexandre Ziegler, and the participants of the Brown Bag Lunch Seminar at the Department of Banking and Finance of the University of Zürich, the Gerzensee Swiss Finance Institute PhD workshop, the Bachelier 2012 Finance Conference, the 2014 Conference of the Swiss Society for Financial Market Research, the joint seminar with ETH Zürich, the 5th International Conference of the ERCIM WG on Computing & Statistics, the 40th Annual Conference of the European Finance Association and the Finance Seminar of the University of Aarhus and HEC Montreal for helpful comments. Financial support from the Swiss Finance Institute (SFI), Bank Vontobel, the Swiss National Science Foundation and the National Center of Competence in Research “Financial Valuation and Risk Management” is gratefully acknowledged.

†University of Zurich and Swiss Finance Institute (SFI), Plattenstrasse 14, 8032 Zürich, Switzerland; tel: (+41)-44-634-4045; Email: [email protected].

‡ORFE Department, Princeton University, Sherrerd Hall, Princeton NJ 08544, USA; Email: [email protected].

§University of Zurich and Swiss Finance Institute (SFI), Plattenstrasse 14, 8032 Zürich, Switzerland; tel: (+41)-44-634-5069; Email: [email protected].

1 Introduction

A central research question in empirical option pricing is the specification of asset returns dynamics.

The recent introduction of derivatives on the VIX index brought new and valuable information on

the S&P 500 index returns. Introduced by the CBOE in 1993, the VIX index non-parametrically

approximates the expected future realized volatility of the S&P 500 returns over the next 30 days.

VIX options started trading in 2006 and, as of today, represent a much larger market than VIX

futures. By definition, the VIX index, VIX options, and S&P 500 options are directly linked to the

S&P 500 index. However, to our knowledge, very little effort has been dedicated to studying

the information contents of these datasets. In this paper we investigate the information that they

span on the distribution of S&P 500 returns and of their variance process. We show that a joint

estimation of VIX index, VIX options, S&P 500 index and S&P 500 options is required to avoid

mispricing of one market’s derivatives.

Jointly analyzing VIX and S&P 500 option markets is a challenge. Not only do we need a set of

candidate models that are flexible enough to accommodate the stylized facts in both markets, but in

addition, the empirical analysis involves a large amount of data and poses computational challenges.

We develop a time-consistent estimation procedure that permits us to perform a comprehensive in-

and out-of-sample analysis of both the S&P 500 and VIX data. This methodology goes well beyond

a simple calibration exercise, and allows us to extract information simultaneously from S&P 500 and

VIX derivatives markets, to consistently match the joint evolution of prices on both markets over

time.

We make the following contributions to the empirical option pricing literature.

First, we analyze and compare the information contained in the S&P 500 and VIX markets. We find

that when the market is calm, options do not provide more information on the dynamics of volatility

than the underlying S&P 500 returns and VIX levels. However, during market turmoil, our results

indicate that the information contained in S&P 500 options is different from that contained in the

underlying index levels or in VIX options. A joint estimation is thus required to accurately represent

the level of reversion of the S&P 500 returns’ variance and infer jumps’ arrival times and magnitude.


These findings are further supported by a thorough in- and out-of-sample analysis of how models

estimated in one market perform in the pricing of options in the other market. We find that VIX

levels do not span the information contained in S&P 500 options, and that S&P 500 options do not

span the information contained in VIX options and vice-versa. It is crucial to be aware of this lack

of market integration when pricing, risk managing or hedging positions on one market with options

on the other one.

Second, we perform an extensive model specification analysis. We detail and explain the role of the

different features of the model in explaining option prices and risk-neutral distributions of returns

and of their variance process. We model the S&P 500 returns using the affine framework of Duffie,

Pan, and Singleton (2000). This structure allows us to price S&P 500 and VIX derivatives in semi-

closed form which is essential for carrying out the analysis of returns and volatility dynamics using a

large dataset of options. However, we point out and reduce the limitations of one-factor affine models

by advocating a stochastic level of reversion in the volatility dynamics.1 The flexibility of our model

makes it possible to investigate how many factors are needed to reproduce the time series features

of the data, and whether jumps should be incorporated. Extracting information from both S&P

500 and VIX derivatives markets indeed provides new valuable insight into the dynamics of asset

returns and volatility. Based on likelihood criteria as well as statistical tests of pricing errors, we find

that jumps in the return and variance processes are needed to jointly represent the index levels and

derivatives prices in both markets. Introducing a stochastic level of reversion for the variance (also

known as stochastic central tendency) helps to better represent the tails of the returns’ distribution

and the term structure of S&P 500 and VIX option prices. Furthermore, jumps make it possible to better

reproduce the right tail of the variance distribution and short-maturity options.

Third, estimating the dynamics from an extremely large dataset of options on the two markets and

for a long time series requires computationally efficient techniques that can easily deal with the

features of the model, in particular state-dependent jumps. To achieve this goal, we extend the

1While adding an additional factor to the Heston model increases the complexity, it has indeed been shown that two factors are needed to provide an accurate description of the volatility dynamics (see, e.g., Andersen, Benzoni, and Lund (2002), Alizadeh, Brandt, and Diebold (2002), Chernov, Gallant, Ghysels, and Tauchen (2003), Christoffersen, Heston, and Jacobs (2009), Egloff, Leippold, and Wu (2010), Todorov (2010), Kaeck and Alexander (2012), Bates (2012) and Mencía and Sentana (2013)).


Fourier Cosine method introduced by Fang and Oosterlee (2008) for S&P 500 options to price VIX

options and adapt the Auxiliary Particle Filter of Pitt and Shephard (1999) to filter out unobservable

processes over time and their jumps. Accordingly, we provide an extensive toolkit for inference and

diagnostics of affine option pricing models given index and option data from both S&P 500 and

VIX markets. Sequential Monte-Carlo techniques have recently increased in popularity and have

been used to estimate models, but most endeavors using this tool restrict their options dataset to

near at-the-money options and, to the best of our knowledge, none have used S&P 500 and VIX

derivatives jointly.

Our fourth contribution is the thorough analysis of the variance risk premium. In the literature,

the components of risk premia are usually found hard to estimate and statistically insignificant with

daily data, particularly when jumps are involved. One reason for this is that the estimation of risk

premia requires a large amount of returns and options data and therefore powerful computational

tools to extract the relevant information. Our approach reveals some interesting characteristics of

the variance risk premium. In particular, we find that the variance risk premium is very sensitive to

jumps, especially at the short end of the variance term structure, because a large movement in

the variance process has an immediate negative impact on the payoff of a short-term variance swap.

The stochastic central tendency plays a significant role in both continuous and discontinuous parts,

especially in calm markets. It is more persistent than the instantaneous variance and represents

long-term expectations of investors, and as such the contribution of the variance takes over in times

of market turmoil.

Several papers published in recent years aim to reconcile the cross-sectional information

of the S&P 500 and the VIX derivatives markets by modeling them jointly. Gatheral (2008)

was the first to point out that even though the Heston model prices S&P 500 options fairly well, it

fails to additionally price VIX options. In fact, modeling the instantaneous volatility as a square root

process leads to a VIX smile decreasing with moneyness, which is the opposite of what is observed

in practice.

Among the recent papers that attempted to simultaneously reproduce the smiles of volatility of S&P

500 and VIX options are Chung, Tsai, Wang, and Wenig (2011), Cont and Kokholm (2011), Song


and Xiu (2012), Papanicolaou and Sircar (2013) and Bayer, Gatheral, and Karlsmark (2013). We

build on this literature by considering extensions of the Heston model that remain within the affine

framework, but add more flexibility to the specifications used in the above mentioned papers. Our

model is a special case of the general affine framework developed by Duffie, Pan, and Singleton (2000)

but includes as sub-cases the usual extensions of the Heston model encountered in the literature, for

example Bates (2000), Eraker (2004) and Sepp (2008a).2 However, most if not all of the papers that

consider S&P 500 and VIX options in their calibration exercise have restricted their analysis to a

static one-day estimation. Therefore the estimated parameters might exhibit large variations when

calibrating the model to different dates. Lindström, Ströjby, Brodén, Wiktorsson, and Holst (2008)

show that the estimated parameters are not stable over time and therefore cannot be used to infer

time series properties of returns and risk premia.

Time-consistent estimation methods have been used previously to calibrate models to index returns

and options. See, e.g., Pan (2002), Eraker (2004), Broadie, Chernov, and Johannes (2007), Christoffersen,

Jacobs, and Mimouni (2010), Johannes, Polson, and Stroud (2009) and Duan and Yeh (2011).

However, as underlined in Ferriani and Pastorello (2012), most papers filtering information from op-

tion prices rely on one option per day or a limited set of options. Limiting the amount of data results

in a computationally less intensive empirical exercise, but it ignores a large part of the information

present in the markets.

The paper is organized as follows. In Section 2, we introduce the two-factor affine jump-diffusion

framework used later in the estimation. We describe the risk premium specification and derive the

expression for the VIX squared as well as the pricing formula for VIX and S&P 500 options. In

Section 3, we describe our datasets and perform a preliminary model selection exercise based on a

daily joint calibration to S&P 500 and VIX markets. In Section 4, we detail our time-series consistent

estimation method. Finally, in Section 5, we summarize the results of the estimation and present

our findings. Section 6 concludes.

2Some studies are going in the direction of non-affine models (e.g., Jones (2003), Aït-Sahalia and Kimmel (2007), Christoffersen, Jacobs, and Mimouni (2010), Ferriani and Pastorello (2012), Durham (2013), Kaeck and Alexander (2012)). However, tractability remains an issue that is of crucial importance when it comes to calibrating a model to a long time series containing hundreds of options each day.


2 Theoretical framework

We first present our modeling framework, which belongs to the affine class. Then, we discuss the

risk premium specification and we derive the pricing formula for VIX derivatives.

2.1 Model specification

Let $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t\ge 0}, \mathbb{P})$ be a filtered probability space satisfying the usual assumptions, where $\mathbb{P}$ denotes

the historical measure. We consider a risk-neutral measure $\mathbb{Q}$ equivalent to $\mathbb{P}$ and denote by

$(F_t)_{t\ge 0}$ the forward price of the S&P 500 index and by $Y = (Y_t)_{t\ge 0} = (\log(F_t))_{t\ge 0}$ the returns. The

dynamics of $Y$ under $\mathbb{Q}$ are specified as

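$$dY_t = \Big[-\lambda^{Yv}(v_{t^-}, m_{t^-})\big(\theta_Z^{\mathbb{Q}}(1,0,0) - 1\big) - \tfrac{1}{2} v_{t^-}\Big]\,dt + \sqrt{v_{t^-}}\,dW_t^{Y,\mathbb{Q}} + dJ_t^{Y,\mathbb{Q}}, \tag{1}$$

$$dv_t = \kappa_v^{\mathbb{Q}}\,(m_{t^-} - v_{t^-})\,dt + \sigma_v \sqrt{v_{t^-}}\,dW_t^{v,\mathbb{Q}} + dJ_t^{v,\mathbb{Q}}, \tag{2}$$

$$dm_t = \kappa_m^{\mathbb{Q}}\,(\theta_m^{\mathbb{Q}} - m_{t^-})\,dt + \sigma_m \sqrt{m_{t^-}}\,dW_t^{m,\mathbb{Q}} + dJ_t^{m,\mathbb{Q}}, \tag{3}$$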

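where $W^{Y,\mathbb{Q}}$, $W^{v,\mathbb{Q}}$, and $W^{m,\mathbb{Q}}$ are three Brownian motions under $\mathbb{Q}$ with dependence structure

$$d\langle W^{Y,\mathbb{Q}}, W^{v,\mathbb{Q}}\rangle_t = \rho_{Y,v}\,dt, \qquad d\langle W^{m,\mathbb{Q}}, W^{Y,\mathbb{Q}}\rangle_t = 0, \qquad d\langle W^{m,\mathbb{Q}}, W^{v,\mathbb{Q}}\rangle_t = 0. \tag{4}$$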

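The processes $J^Y$, $J^v$, and $J^m$ are finite activity jump processes defined as

$$dJ_t^{Y,\mathbb{Q}} = Z_t^{Y,\mathbb{Q}}\,dN_t^{Yv}, \qquad dJ_t^{v,\mathbb{Q}} = Z_t^{v,\mathbb{Q}}\,dN_t^{Yv}, \qquad dJ_t^{m,\mathbb{Q}} = Z_t^{m,\mathbb{Q}}\,dN_t^{m}. \tag{5}$$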

We first discuss the jump structure of our affine framework. As suggested by the price paths of

the S&P 500 and VIX index, large movements in equity returns and variance are likely to occur at

the same time. We therefore choose, in line with the literature, the same Poisson process $N_t^{Yv}$ to

generate jumps in the asset returns and variance process $(v_t)_{t\ge 0}$.3 We assume that jump intensities

3Many articles suggest including positive jumps in the variance, e.g., Christoffersen, Jacobs, and Mimouni (2010). Todorov (2010), Todorov and Tauchen (2011) and Jacod and Todorov (2010) find striking evidence for co-jumps in S&P 500 returns and in the VIX. See also Eraker (2004), Broadie, Chernov, and Johannes (2007), Cont and Kokholm (2011).


depend linearly on the factor levels:4

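$$\lambda^{m}(m_{t^-}) = \lambda_0^{m} + \lambda_1^{m}\, m_{t^-}, \tag{6}$$

$$\lambda^{Yv}(v_{t^-}, m_{t^-}) = \lambda_0^{Yv} + \lambda_1^{Yv}\, v_{t^-} + \lambda_2^{Yv}\, m_{t^-}. \tag{7}$$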

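Moreover, the process $Z^{\mathbb{Q}} = (Z^{Y,\mathbb{Q}}, Z^{v,\mathbb{Q}}, Z^{m,\mathbb{Q}})^{\top}$ corresponds to the random jump sizes under $\mathbb{Q}$,

which are independent and identically distributed (i.i.d.). The jump sizes in the returns are normally

distributed $\mathcal{N}(\mu_Y^{\mathbb{Q}}, \sigma_Y^{\mathbb{Q}})$ and the jump sizes in the two volatility factors are exponentially distributed

with respective means $\nu_v^{\mathbb{Q}}$ and $\nu_m^{\mathbb{Q}}$. The jump sizes are characterized by their joint Laplace transform

$$\theta_Z^{\mathbb{Q}}(\phi) = \theta_Z^{\mathbb{Q}}(\phi_Y, \phi_v, \phi_m) = E^{\mathbb{Q}}\big[\exp(\phi^{\top} Z^{\mathbb{Q}})\big], \qquad \phi \in \mathbb{C}^3. \tag{8}$$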
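As an illustration of equation (8), the following Python sketch evaluates the joint Laplace transform of the jump sizes in closed form. It relies on two assumptions that the text does not state: the three jump-size components are taken to be mutually independent, and the second parameter of the normal distribution is read as a standard deviation. The function name and the numerical values are hypothetical.

```python
import cmath

def jump_laplace(phi_y, phi_v, phi_m, mu_y, sigma_y, nu_v, nu_m):
    """Joint Laplace transform E[exp(phi' Z)] of the jump sizes.

    Assumptions (not taken from the paper): independent components, and
    sigma_y is a standard deviation. The exponential parts require
    Re(phi_v) < 1 / nu_v and Re(phi_m) < 1 / nu_m.
    """
    # Normal jump in returns: exp(mu * phi + sigma^2 * phi^2 / 2)
    normal_part = cmath.exp(mu_y * phi_y + 0.5 * (sigma_y * phi_y) ** 2)
    # Exponential jumps with means nu_v and nu_m: 1 / (1 - nu * phi)
    exp_part_v = 1.0 / (1.0 - nu_v * phi_v)
    exp_part_m = 1.0 / (1.0 - nu_m * phi_m)
    return normal_part * exp_part_v * exp_part_m

# theta_Z(1, 0, 0) is the quantity that enters the drift compensator in (1).
theta_100 = jump_laplace(1.0, 0.0, 0.0, mu_y=-0.05, sigma_y=0.10, nu_v=0.02, nu_m=0.01)
```

Under these assumptions, the term θZ(1, 0, 0) − 1 appearing in the drift of equation (1) reduces to exp(µY + σY²/2) − 1.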

The variance process $(v_t)_{t\ge 0}$ of the returns reverts towards a stochastic central tendency $(m_t)_{t\ge 0}$.

Hence, the levels of these two processes are not independent, even though their increments are. The

leverage effect is driven by the correlation between $W^{Y,\mathbb{Q}}$ and $W^{v,\mathbb{Q}}$ as well as the possibility of

simultaneous jumps in the returns and variance.
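To make the dynamics in equations (1) to (7) concrete, the sketch below simulates one path of (Y, v, m) under Q with a plain Euler scheme. This is not the estimation method of the paper. It is only a minimal illustration under several assumptions: jump-size components are independent, σY is a standard deviation, at most one jump of each type occurs per time step, and the two variance factors are floored at zero after each step. All parameter names and values are hypothetical.

```python
import math
import random

# Hypothetical parameter values, chosen only to make the sketch runnable.
EXAMPLE_PARAMS = {
    "kappa_v": 5.0, "kappa_m": 0.5, "theta_m": 0.04,
    "sigma_v": 0.4, "sigma_m": 0.2, "rho": -0.7,
    "lam_yv0": 0.5, "lam_yv1": 10.0, "lam_yv2": 5.0,
    "lam_m0": 0.2, "lam_m1": 5.0,
    "mu_y": -0.05, "sigma_y": 0.10, "nu_v": 0.02, "nu_m": 0.01,
}

def simulate_path(p, v0, m0, y0=0.0, horizon=1.0, n_steps=252, seed=0):
    """Euler path of (Y, v, m) under Q; returns a list of (y, v, m) tuples."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    # theta_Z(1, 0, 0) - 1 for a normal return jump (sigma_y read as a std dev)
    comp = math.exp(p["mu_y"] + 0.5 * p["sigma_y"] ** 2) - 1.0
    rho = p["rho"]
    y, v, m = y0, v0, m0
    path = [(y, v, m)]
    for _ in range(n_steps):
        lam_yv = p["lam_yv0"] + p["lam_yv1"] * v + p["lam_yv2"] * m  # eq. (7)
        lam_m = p["lam_m0"] + p["lam_m1"] * m                        # eq. (6)
        z1, z2, z3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        dw_y = sqrt_dt * z1
        dw_v = sqrt_dt * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)  # eq. (4)
        dw_m = sqrt_dt * z3
        # Bernoulli approximation of the Poisson increments over one step
        jump_yv = rng.random() < lam_yv * dt
        jump_m = rng.random() < lam_m * dt
        jy = rng.gauss(p["mu_y"], p["sigma_y"]) if jump_yv else 0.0
        jv = rng.expovariate(1.0 / p["nu_v"]) if jump_yv else 0.0  # same N^{Yv}
        jm = rng.expovariate(1.0 / p["nu_m"]) if jump_m else 0.0
        sv, sm = math.sqrt(v), math.sqrt(m)
        y_new = y + (-lam_yv * comp - 0.5 * v) * dt + sv * dw_y + jy
        v_new = v + p["kappa_v"] * (m - v) * dt + p["sigma_v"] * sv * dw_v + jv
        m_new = m + p["kappa_m"] * (p["theta_m"] - m) * dt + p["sigma_m"] * sm * dw_m + jm
        # Floor at zero so that the square roots stay well defined
        y, v, m = y_new, max(v_new, 0.0), max(m_new, 0.0)
        path.append((y, v, m))
    return path

path = simulate_path(EXAMPLE_PARAMS, v0=0.04, m0=0.04)
```

Because the return and variance jumps share the same counting process, every simulated jump in Y coincides with a positive jump in v, which is the co-jump feature described above.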

The model specification above implicitly defines the dynamics for the VIX. To derive its value within

our framework, we use the definition of the VIX as a finite sum of call and put prices that converges

to the integral

$$\mathrm{VIX}_t^2 = \frac{2}{\tau}\, E_t^{\mathbb{Q}}\left[\int_t^{t+\tau} \frac{dF_u}{F_{u^-}} - d(\ln F_u)\right],$$

where $\tau$ is 30 days in annual terms.

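Proposition 2.1. Under the model specification given in equations (1) to (3), the VIX squared at

time $t$ can be written as an affine function of $v_t$ and $m_t$:

$$\mathrm{VIX}_t^2 = \frac{1}{\tau}\, E_t^{\mathbb{Q}}\left[\int_t^{t+\tau} v_u\,du + 2\left(e^{Z_u^{Y,\mathbb{Q}}} - 1 - Z_u^{Y,\mathbb{Q}}\right) dN_u^{Yv}\right] \tag{9}$$

$$\phantom{\mathrm{VIX}_t^2} = \alpha_{\mathrm{VIX}^2}\, v_t + \beta_{\mathrm{VIX}^2}\, m_t + \gamma_{\mathrm{VIX}^2}, \tag{10}$$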

4The specification of jumps is of importance. Bates (1996), Pan (2002) and Eraker (2004) argue in favor of using state-dependent jumps in returns, which is intuitively appealing as jumps tend to occur more frequently when volatility increases. Using variance swaps, Aït-Sahalia, Karaman, and Mancini (2012) find that the state dependent intensity of jumps is a desirable model feature. However, evidence supporting this choice is mixed. Indeed, Bates (2000) finds that state dependent intensities lead to strong misspecification and Eraker (2004) finds that it does not significantly improve the option prices fit. Broadie, Chernov, and Johannes (2007) and Johannes, Polson, and Stroud (2009) use a constant intensity of jumps.


where the coefficients $\alpha_{\mathrm{VIX}^2}$, $\beta_{\mathrm{VIX}^2}$ and $\gamma_{\mathrm{VIX}^2}$ are known in closed form and provided in Appendix A.

For a proof of the above proposition, we refer to Appendix A.
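The affine structure in equation (10) can be checked numerically without the closed-form coefficients of Appendix A, which are not reproduced here. Because drifts and jump intensities are affine in (v, m), the conditional means of v and m solve a linear system of ODEs, and equation (9) reduces to two time integrals of these means. The sketch below integrates that system with a fourth-order Runge-Kutta scheme and recovers the three coefficients by evaluating the result at three states. It assumes independent jump sizes and treats σY as a standard deviation; function names, parameter names and values are hypothetical.

```python
import math

# Hypothetical Q-parameters, for illustration only.
EXAMPLE_PARAMS = {
    "kappa_v": 5.0, "kappa_m": 0.5, "theta_m": 0.04,
    "lam_yv0": 0.5, "lam_yv1": 10.0, "lam_yv2": 5.0,
    "lam_m0": 0.2, "lam_m1": 5.0,
    "mu_y": -0.05, "sigma_y": 0.10, "nu_v": 0.02, "nu_m": 0.01,
}

def vix_squared(v, m, p, tau=30.0 / 365.0, n_steps=200):
    """VIX^2 from eq. (9), using the ODEs for E[v_u] and E[m_u] (a sketch)."""
    # E[exp(Z^Y) - 1 - Z^Y] for a normal jump size (sigma_y read as a std dev)
    jump_term = math.exp(p["mu_y"] + 0.5 * p["sigma_y"] ** 2) - 1.0 - p["mu_y"]

    def rhs(state):
        ev, em, _, _ = state
        lam_yv = p["lam_yv0"] + p["lam_yv1"] * ev + p["lam_yv2"] * em
        lam_m = p["lam_m0"] + p["lam_m1"] * em
        return (
            p["kappa_v"] * (em - ev) + p["nu_v"] * lam_yv,            # d E[v] / du
            p["kappa_m"] * (p["theta_m"] - em) + p["nu_m"] * lam_m,   # d E[m] / du
            ev,                                                       # integral of E[v]
            lam_yv,                                                   # integral of E[lambda^{Yv}]
        )

    state = (v, m, 0.0, 0.0)
    h = tau / n_steps
    for _ in range(n_steps):  # classical RK4 step
        k1 = rhs(state)
        k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(
            s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    _, _, int_v, int_lam = state
    return (int_v + 2.0 * jump_term * int_lam) / tau

def affine_coefficients(p, tau=30.0 / 365.0):
    """Recover (alpha, beta, gamma) of eq. (10) from three evaluations."""
    gamma = vix_squared(0.0, 0.0, p, tau)
    alpha = vix_squared(1.0, 0.0, p, tau) - gamma
    beta = vix_squared(0.0, 1.0, p, tau) - gamma
    return alpha, beta, gamma

alpha, beta, gamma = affine_coefficients(EXAMPLE_PARAMS)
```

When all jump intensities are set to zero, the coefficient on v_t collapses to (1 − e^{−κv τ})/(κv τ), which provides a simple sanity check of the numerical scheme.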

2.2 Risk premium specification

We specify the change of measure from the pricing to the historical measure so that the model

dynamics keeps the same structure under P. We separate the total equity risk premium γt into a

Brownian contribution, which is proportional to the variance level and represents the compensation

for the diffusive price risk, and a jump contribution reflecting the compensation for jump risk:

$$\gamma_t = \eta_Y\, v_{t^-} + \lambda^{Yv}(v_{t^-}, m_{t^-})\left(\theta_Z^{\mathbb{P}}(1,0,0) - \theta_Z^{\mathbb{Q}}(1,0,0)\right), \tag{11}$$

where $\theta_Z^{\mathbb{P}}$ denotes the joint Laplace transform of jump sizes under the historical measure $\mathbb{P}$. We

follow Pan (2002) and Eraker (2004) and require the intensity of jumps to be the same under Q and

P.5 However, we allow for the mean and volatility of the jump sizes in returns to be different under

Q and P.6

Similarly, the volatility risk premium on the two volatility factors $v_t$ and $m_t$ decomposes into a

diffusive component and a jump component. The diffusive variance risk premium in $v_t$ is proportional

to the current level of variance, with coefficient of proportionality given by $\eta_v = \kappa_v^{\mathbb{Q}} - \kappa_v^{\mathbb{P}}$. The same

applies to the central tendency $m_t$, for which the coefficient is defined as $\eta_m = \kappa_m^{\mathbb{Q}} - \kappa_m^{\mathbb{P}}$. For the

jump part of the volatility risk premium, we allow the mean jump sizes $\nu_v$ and $\nu_m$ to be different

under P and Q.
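As a worked illustration of equation (11) and of the diffusive coefficients ηv and ηm, the following sketch computes the instantaneous equity risk premium for given factor levels. It assumes that θZ(1, 0, 0) equals exp(µY + σY²/2), which holds for a normal return jump when σY is read as a standard deviation. The function names and numerical values are hypothetical and are not estimates from the paper.

```python
import math

def equity_risk_premium(v, m, eta_y, lam0, lam1, lam2, mu_p, sigma_p, mu_q, sigma_q):
    """Equity risk premium gamma_t of eq. (11) for factor levels (v, m)."""
    lam_yv = lam0 + lam1 * v + lam2 * m              # jump intensity, eq. (7)
    theta_p = math.exp(mu_p + 0.5 * sigma_p ** 2)    # theta_Z^P(1, 0, 0)
    theta_q = math.exp(mu_q + 0.5 * sigma_q ** 2)    # theta_Z^Q(1, 0, 0)
    diffusive = eta_y * v
    jump = lam_yv * (theta_p - theta_q)
    return diffusive + jump

def diffusive_variance_premia(kappa_v_q, kappa_v_p, kappa_m_q, kappa_m_p):
    """Coefficients eta_v and eta_m of the diffusive variance risk premia."""
    return kappa_v_q - kappa_v_p, kappa_m_q - kappa_m_p

# Hypothetical inputs: jumps are more negative on average under Q than under P.
premium = equity_risk_premium(
    v=0.04, m=0.04, eta_y=2.0, lam0=0.5, lam1=10.0, lam2=5.0,
    mu_p=-0.02, sigma_p=0.05, mu_q=-0.05, sigma_q=0.10,
)
```

If the jump-size distribution is identical under both measures, the jump contribution vanishes and the premium reduces to its diffusive part ηY v.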

5Pan (2002) argues that introducing different intensities of jumps under the historical and pricing measure introduces a jump-timing risk premium that is very difficult to disentangle from the mean jump risk premium. The consequence of this assumption is that the jump-timing risk premium is artificially incorporated into the mean jump size risk premium.

6In the literature σY has sometimes been constrained to be the same under P and Q (Bates (1988), Naik and Lee (1990)), but this is not required by absence of arbitrage (in contrast to σv, σm, and ρY,v). We follow Broadie, Chernov, and Johannes (2007) by allowing them to be different. Indeed, they find strong evidence that they differ, with a strong impact on the magnitude of the premium attached to the mean price jump size.

For the estimation procedure, it is helpful to summarize the model in terms of P and Q parameters,

which need to be estimated accordingly:

$$\Theta^{\mathbb{P}} = \{\kappa_v^{\mathbb{P}}, \kappa_m^{\mathbb{P}}, \theta_m^{\mathbb{P}}, \nu_m^{\mathbb{P}}, \nu_v^{\mathbb{P}}, \mu_Y^{\mathbb{P}}, \sigma_Y^{\mathbb{P}}, \eta_Y\}, \qquad \Theta^{\mathbb{Q}} = \{\kappa_v^{\mathbb{Q}}, \kappa_m^{\mathbb{Q}}, \theta_m^{\mathbb{Q}}, \nu_m^{\mathbb{Q}}, \nu_v^{\mathbb{Q}}, \mu_Y^{\mathbb{Q}}, \sigma_Y^{\mathbb{Q}}\}. \tag{12}$$

The remaining parameters are, by assumption, equal under both measures:

$$\Theta^{\mathbb{P},\mathbb{Q}} = \{\lambda_0^{Yv}, \lambda_1^{Yv}, \lambda_2^{Yv}, \lambda_0^{m}, \lambda_1^{m}, \sigma_m, \sigma_v, \rho_{Y,v}\}. \tag{13}$$

2.3 Derivatives pricing

Within the class of affine models, option pricing is most efficiently performed using Fourier inversion

techniques. As a starting point, we need the characteristic function of the underlying processes. Due

to the affine property of the VIX2 in Proposition 2.1, we have the following result:

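Proposition 2.2. In our two-factor stochastic volatility model with jumps, the Laplace transforms

of the VIX2 and the S&P 500 returns are exponential affine in the factor processes $v_t$ and $m_t$:

$$\Psi_{\mathrm{VIX}_T^2}(t, v_t, m_t; \omega) := E_t^{\mathbb{Q}}\left[e^{\omega\, \mathrm{VIX}_T^2}\right] = e^{\alpha(T-t) + \beta(T-t)\, v_t + \gamma(T-t)\, m_t},$$

$$\Psi_{Y_T}(t, y_t, v_t, m_t; \omega) := E_t^{\mathbb{Q}}\left[e^{\omega\, Y_T}\right] = e^{\alpha_Y(T-t) + \beta_Y(T-t)\, y_t + \gamma_Y(T-t)\, v_t + \delta_Y(T-t)\, m_t},$$

where the coefficients are functions defined on $[0, T]$ by ODEs presented in Appendix C and $\omega \in \mathbb{C}$.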

Pricing options on the VIX poses technical difficulties that are not encountered when pricing equity

options. Given a call option with strike K and maturity T on the VIX at time t = 0, we need to

calculate

$$C(\mathrm{VIX}_0, K, T) = e^{-rT} \int_0^{\infty} (\sqrt{v} - K)^+ f_{\mathrm{VIX}_T^2}(v)\,dv, \tag{14}$$

where $f_{\mathrm{VIX}_T^2}$ is the Q-density of the VIX2 at time t = T . The square root appearing in the integral

as part of the payoff in (14) prevents us from using the Fast Fourier Transform of Carr and Madan

(1999). We would need the log of the VIX to be affine, which is incompatible with affine models for


log-returns.

However, this problem can be circumvented. Fang and Oosterlee (2008) introduce the Fourier Cosine

Expansion to price index options on the S&P 500. We extend their method to tackle the pricing of

VIX options. Our approach to price VIX options is comparable to the inversion performed by Sepp

(2008a), but it is more parsimonious in the number of computational parameters.7

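Proposition 2.3. Consider a European-style contingent claim on the VIX index with maturity $T$ and

payoff $u_{\mathrm{VIX}}(\mathrm{VIX}^2) = (\sqrt{\mathrm{VIX}^2} - K)^+$. Given an interval $[a_{\mathrm{VIX}}, b_{\mathrm{VIX}}]$ for the support of the $\mathrm{VIX}_T^2 \mid v_0, m_0$

density, the price $P_{\mathrm{VIX}}(t_0, \mathrm{VIX}_0)$ at time $t = t_0 \ge 0$ of the contingent claim is

$$P_{\mathrm{VIX}}(t_0, \mathrm{VIX}_0) = e^{-r(T-t_0)} \sum_{n=0}^{N-1}{}^{\prime}\; A_n^{\mathrm{VIX}^2}\, U_n^{\mathrm{VIX}^2}, \tag{15}$$

where the prime superscript in the sum $\sum'$ means that the first term $A_0 U_0$ is divided by 2. The

terms in the sum are defined by:

$$A_n^{\mathrm{VIX}^2} = \frac{2}{b_{\mathrm{VIX}} - a_{\mathrm{VIX}}}\, \mathrm{Re}\left\{\Psi_{\mathrm{VIX}_T^2}\left(t_0, v_0, m_0; \frac{i n \pi}{b_{\mathrm{VIX}} - a_{\mathrm{VIX}}}\right) \exp\left(-i\, a_{\mathrm{VIX}}\, \frac{n\pi}{b_{\mathrm{VIX}} - a_{\mathrm{VIX}}}\right)\right\}, \tag{16}$$

$$U_n^{\mathrm{VIX}^2} = \int_{a_{\mathrm{VIX}}}^{b_{\mathrm{VIX}}} u_{\mathrm{VIX}}(v) \cos\left(n\pi\, \frac{v - a_{\mathrm{VIX}}}{b_{\mathrm{VIX}} - a_{\mathrm{VIX}}}\right) dv. \tag{17}$$

The coefficient $A_n^{\mathrm{VIX}^2}$ is computed using Proposition 2.2 and $U_n^{\mathrm{VIX}^2}$ is known in closed form and given

in Appendix B.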
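The mechanics of equations (15) to (17) can be illustrated with a toy example. The sketch below is not the implementation used in the paper. The Laplace transform of Proposition 2.2 requires the ODE coefficients of Appendix C, so a Gamma distribution for VIX2 is used here as a hypothetical stand-in with a known transform. Likewise, the payoff coefficients U_n are computed by Simpson quadrature rather than with the closed form of Appendix B. The COS price is then compared with a direct integration of the payoff against the Gamma density.

```python
import cmath
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with n (even) sub-intervals."""
    if hi <= lo:
        return 0.0
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for j in range(1, n):
        total += (4.0 if j % 2 else 2.0) * f(lo + j * h)
    return total * h / 3.0

def cos_vix_call(laplace, strike, a, b, n_terms=128, r=0.0, ttm=0.25):
    """Price of a VIX call via eqs. (15)-(17); laplace(w) = E[exp(w * VIX_T^2)]."""
    width = b - a
    lower = max(a, strike ** 2)  # the payoff vanishes where sqrt(x) < strike
    price = 0.0
    for n in range(n_terms):
        freq = n * math.pi / width
        # Density coefficient A_n, eq. (16)
        a_n = 2.0 / width * (laplace(1j * freq) * cmath.exp(-1j * a * freq)).real
        # Payoff coefficient U_n, eq. (17), by quadrature (not the closed form)
        u_n = simpson(lambda x: (math.sqrt(x) - strike) * math.cos(freq * (x - a)), lower, b)
        price += (0.5 if n == 0 else 1.0) * a_n * u_n  # first term is halved
    return math.exp(-r * ttm) * price

# Toy stand-in for the model: VIX_T^2 ~ Gamma(shape, scale), mean 0.08.
SHAPE, SCALE = 4, 0.02

def gamma_laplace(w):
    return (1.0 - SCALE * w) ** (-SHAPE)

def gamma_density(x):
    return x ** (SHAPE - 1) * math.exp(-x / SCALE) / (math.gamma(SHAPE) * SCALE ** SHAPE)

strike, a, b = 0.25, 0.0, 0.8
cos_price = cos_vix_call(gamma_laplace, strike, a, b)
reference = simpson(lambda x: (math.sqrt(x) - strike) * gamma_density(x), strike ** 2, b, n=20000)
```

The toy density is smooth at the left end of its support, so the series converges quickly. Densities that are not differentiable there would converge more slowly, as discussed in footnote 7.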

3 Data and preliminary analysis

In this section, we describe our data and point out some important characteristics of VIX options.

Additionally, we perform a preliminary model selection exercise, in which we try to jointly calibrate

7For some model parameter values, the cosine expansion for the density of the VIX2 converges slowly. We found that this happens for parameters for which the density of the VIX2 is not differentiable at the left end of its support (close to $\gamma_{\mathrm{VIX}^2}$ in equation (10)) and this generates an oscillating Fourier approximation (referred to as the Gibbs phenomenon, well known in numerical analysis). One way to improve convergence is to use spectral filters as is illustrated in Ruijter, Versteegh, and Oosterlee (2013). Before one estimates the model, we cannot rule out the parameter values that generate the Gibbs phenomenon. Therefore, we need to take this problem into consideration so that the optimizer can run over the entire parameter space.


some candidate models to the S&P 500 and VIX option markets.

3.1 Data description

Options on the VIX were introduced in 2006. Our sample period ranges from March 1, 2006 to

October 29, 2010. The option data consist of daily closing prices of European options on the S&P

500 and VIX, obtained from OptionMetrics. This time series includes both periods of calm and

periods of crisis with extreme events. Therefore, it provides an ideal test bed for our candidate

models.

Both the S&P 500 and VIX options datasets are treated following usual procedures.8 We only

consider options with maturity between one week and one year and delete option quotes that were

not traded on a given date. We only work with liquid OTM options for the S&P 500 market and

only with liquid call options for the VIX market. If the VIX ITM call is not liquid, we use the

put-call parity to infer a liquid VIX ITM call from a more liquid VIX OTM put. Implied volatilities

are computed from futures prices, inferred from highly liquid options using the at-the-money (ATM)

put-call parity.9 By doing so, we avoid two issues: making predictions on future dividends and using

futures closing prices, which are not synchronized with the option closing prices.
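The inference of a futures price from put-call parity can be sketched as follows. For European options with the same strike K and maturity T, parity gives C − P = e^{−rT}(F − K), so that F = K + e^{rT}(C − P). The selection rule used below, namely taking the strike at which call and put prices are closest, is an assumption made for illustration. The function name and the quotes are hypothetical.

```python
import math

def implied_forward(quotes, r, ttm):
    """Infer the forward price from the most at-the-money call/put pair.

    quotes: list of (strike, call_price, put_price) for one maturity.
    """
    # At-the-money proxy: strike where call and put prices are closest
    strike, call, put = min(quotes, key=lambda q: abs(q[1] - q[2]))
    # Put-call parity on the forward: C - P = exp(-r T) (F - K)
    return strike + math.exp(r * ttm) * (call - put)

# Hypothetical quotes for one maturity slice
quotes = [(95.0, 7.40, 2.45), (100.0, 4.10, 4.10), (105.0, 2.05, 7.00)]
forward = implied_forward(quotes, r=0.02, ttm=0.5)
```

Since only option prices enter the calculation, the resulting forward is synchronized with the option quotes by construction.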

These adjustments leave a total of 383,286 OTM S&P 500 and 43,775 call options on the VIX, with

a daily average of 327 S&P 500 options and 37 VIX options. The number of S&P 500 (VIX) options

in our dataset on a given date increases with time with around 170 (5) options at the beginning of

the dataset and around 450 (70) options at the end. At the beginning of the sample, there are one or

two short maturities (less than six months) available for VIX options and around six maturities for

S&P 500 options with approximately 40 options per maturity slice. At the end of the sample, VIX

options have around five short maturities with a bit more than 10 options trading per maturity. For

S&P 500 options, around ten maturities are available per day with around 60 options for one-month

maturities and 40 options for the one-year slice. The low number of VIX options compared to the

8See, e.g., Aït-Sahalia and Lo (1998).

9We remark that VIX option prices do not satisfy no-arbitrage relations with respect to the VIX index, but rather with respect to the VIX futures value. A VIX call option at time t with maturity T is an option on the volatility for the time interval [T, T + 30d], where 30d stands for 30 days. The value VIXt at time t is related to the volatility on the time interval [t, t + 30d] which might not overlap at all with [T, T + 30d].


number of S&P 500 options comes from the fact that VIX options only started trading in 2006. At

the end of our sample, the total VIX options volume per day is about half the total volume of S&P

500 options traded.

3.2 Descriptive Statistics

Table 1 reports the first four sample moments of the S&P 500 futures returns and VIX index levels,

over two different periods of time. The first period starts in March 2006 and ends on March 1, 2009,

i.e., it spans the pre-crisis period as well as the beginning of the crisis. The second period begins

in December 2008 and lasts until October 2010. For our estimation, these two periods serve as

in-sample and out-of-sample period.

The S&P 500 log-returns exhibit negative skewness and a high kurtosis

over both periods, suggesting the presence of rare and large movements. The VIX index exhibits

a large positive skewness and kurtosis in the in-sample period. However, in the second period the

numbers in Table 1 suggest that the movements are more symmetric and centered around a higher

value (29% instead of 20% in the first period).

[Table 1 about here.]

In Panel A of Figure 1, we plot the joint evolution of the S&P 500 and the VIX index from 2006 to

2010. The S&P 500 returns and the VIX increments are highly negatively correlated (−83.2% over

this period), which explains the popularity of VIX contracts to hedge part of the equity risk of a

portfolio. In Panel B, we also plot the expected forward log-returns of the S&P 500 index

from March 1, 2006 to October 29, 2010, as implied by prices of S&P 500 options with

one month maturity.10 The expected forward returns illustrate the variety of market situations that

our time series covers. They were almost constant until the end of 2007, equal to a positive value

and thus indicating that market participants were expecting a stable income from investing in the

index. From the end of 2007, they exhibit more variation and eventually turn negative. Following

the bankruptcy of Lehman Brothers in September 2008, expected forward returns drop below -1.5%.

10We use the method described in Bakshi, Kapadia, and Madan (2003) to calculate the moments implied by option prices.


Then, expected forward returns recover and gradually come back to stabilize in mid-2009 around a

slightly negative level close to -0.2%. In 2010, the sudden increase in the VIX index coincides with

a further sudden drop of the expected forward returns falling to almost -0.5%.11

[Figure 1 about here.]

Even though the S&P 500 and VIX markets are closely related, we emphasize that options on the

VIX and S&P 500 differ substantially in their characteristics. First, S&P 500 and VIX derivatives

with the same maturity contain different information. While an S&P 500 option with maturity

T contains information about the future S&P 500 index level at time T and therefore about the

S&P 500 volatility up to T , a VIX option with maturity T embeds information about the VIX at

time T and therefore about the S&P 500 volatility between T and T + 30 days. Second, the implied

volatility smiles backed out from S&P 500 and VIX option prices have different shapes. Panels C and

D of Figure 1 display the S&P 500 and VIX smiles on May 10, 2010. The implied volatilities (IVs)

are computed using the standard Black-Scholes formula. The VIX IVs are in general substantially

higher, averaging around 75% with a range from 40% to 200%, compared to S&P 500 IVs with

an average around 23%. The implied volatilities are negatively skewed for S&P 500 options, i.e.,

generally decreasing with moneyness as risk-averse investors require a premium for negative states of

the economy. In contrast, VIX implied volatilities are positively skewed and increase with moneyness,

which can intuitively be explained by the fact that, through the leverage effect, negative returns are

often observed together with a rise of volatility. In high-volatility states, hedging generally becomes

more expensive.
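The implied volatilities discussed above can be backed out numerically. Since the text computes them from futures prices, the sketch below uses the Black (1976) version of the Black-Scholes formula written on the forward, which is an assumption about the exact variant, and inverts it by bisection. Function names and inputs are hypothetical.

```python
import math

def black76_price(forward, strike, ttm, r, sigma, is_call=True):
    """Black (1976) price of a European option on a forward or futures price."""
    sd = sigma * math.sqrt(ttm)
    d1 = (math.log(forward / strike) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # standard normal CDF
    df = math.exp(-r * ttm)
    if is_call:
        return df * (forward * cdf(d1) - strike * cdf(d2))
    return df * (strike * cdf(-d2) - forward * cdf(-d1))

def implied_vol(price, forward, strike, ttm, r, is_call=True, lo=1e-4, hi=5.0, tol=1e-10):
    """Invert black76_price by bisection; the price is increasing in sigma."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black76_price(forward, strike, ttm, r, mid, is_call) > price:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Round trip on a hypothetical OTM VIX call quoted at 75% implied volatility
quote = black76_price(forward=0.25, strike=0.30, ttm=0.1, r=0.01, sigma=0.75)
iv = implied_vol(quote, forward=0.25, strike=0.30, ttm=0.1, r=0.01)
```

Bisection is slower than Newton-type schemes but does not depend on the option vega, which becomes very small for deep OTM quotes.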

The difference between these markets is also reflected by other indicators such as the put-call trading

ratio. Almost twice as many puts as calls are traded daily in the S&P 500 options market, but the

situation is reversed in the VIX market where the amount of calls traded daily is almost double that

of the puts. In fact, we can observe in Panels C and D of Figure 1 that the log-moneynesses traded

for S&P 500 options are mostly negative (which corresponds to OTM put options) and often positive

11Both the VIX index and expected forward returns as implied by S&P 500 options indicate market expectations over the next month as reflected in index option prices. However, volatility provides information on returns through the leverage effect, while the implied expected forward returns are a direct measure of how investors expect returns to behave. They are much more stable in quiet periods and better reflect the different market situations that compose our time-series and that we aim to reproduce with a model.


for VIX options (OTM calls).

3.3 Joint calibration

Before we bring our models to the time series of data, we do a joint calibration exercise using the

cross section of S&P 500 and VIX options on a particular date. This exercise gives us some guidance

for model design and allows us to reduce the set of models to be estimated with the particle filter.

If a model is not flexible enough to reproduce simultaneously the implied volatility patterns of both

markets on a single date, the Q dynamics of the model is not sufficiently rich to accurately price both

S&P 500 and VIX derivatives jointly and we can safely discard this model from further consideration.

Let us fix a date $t$ and consider $\{IV^{\mathrm{Mkt}}_{\mathrm{SPX},i}\}_{i=1,\dots,N_{\mathrm{SPX}}}$ the set of $N_{\mathrm{SPX}}$ market implied volatilities of

S&P 500 options for strikes $\{K_i\}$ and maturities $\{T_i\}$. We denote by $\{IV^{\mathrm{Mkt}}_{\mathrm{VIX},j}\}_{j=1,\dots,N_{\mathrm{VIX}}}$ the set

of $N_{\mathrm{VIX}}$ market implied volatilities of VIX options. To estimate parameters, we minimize the root

mean squared error (RMSE) between market and model implied volatilities:12

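$$\mathrm{RMSE}_M(t) := \sqrt{\frac{1}{N_M} \sum_{1 \le i \le N_M} \left(IV^{Mkt}_{M,i} - IV^{Mod}_{M,i}\right)^2}, \quad M \in \{SPX, VIX\}, \qquad (18)$$

$$\mathrm{RMSE}(t) := \frac{1}{2}\left(\mathrm{RMSE}_{SPX}(t) + \mathrm{RMSE}_{VIX}(t)\right). \qquad (19)$$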
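The objective in (18)-(19) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names `rmse` and `joint_rmse` are hypothetical, and the implied volatilities are assumed to be supplied as plain lists of equal length.

```python
import math

def rmse(market_ivs, model_ivs):
    """Root mean squared error between market and model implied volatilities, as in (18)."""
    if len(market_ivs) != len(model_ivs) or not market_ivs:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(mkt - mod) ** 2 for mkt, mod in zip(market_ivs, model_ivs)]
    return math.sqrt(sum(squared) / len(squared))

def joint_rmse(spx_mkt, spx_mod, vix_mkt, vix_mod):
    """Equally weighted average of the two market RMSEs, as in (19)."""
    return 0.5 * (rmse(spx_mkt, spx_mod) + rmse(vix_mkt, vix_mod))
```

Because (19) averages the two market-level RMSEs rather than pooling all options, each market receives the same weight in the objective regardless of how many contracts it contributes.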

We use two global optimizers to cope with the non-convexity of the calibration problem and the

potential existence of multiple local minima, namely the Covariance Matrix Adaptation Evolution

Strategy (CMA-ES), introduced by Hansen and Ostermeier (1996), and the Differential Evolution

(DE) algorithm13 introduced by Storn (1996).
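To illustrate why a population-based optimizer helps with multiple local minima, the following is a minimal sketch of a textbook DE/rand/1/bin scheme in pure Python. It is not the implementation used in the paper; the control parameters `F`, `CR`, the population size and the Rastrigin test function are illustrative assumptions.

```python
import math
import random

def differential_evolution(f, bounds, pop_size=20, n_gen=200, F=0.7, CR=0.9, seed=0):
    """Minimal DE/rand/1/bin minimizer; returns (best_x, best_f, history of best values)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [f(x) for x in pop]
    history = [min(fit)]
    for _ in range(n_gen):
        for i in range(pop_size):
            # Pick three distinct population members other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # forces at least one mutated coordinate
            trial = []
            for d in range(dim):
                if rng.random() < CR or d == j_rand:
                    x = pop[a][d] + F * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    x = min(max(x, lo), hi)  # clip to the search box
                else:
                    x = pop[i][d]
                trial.append(x)
            f_trial = f(trial)
            if f_trial <= fit[i]:  # greedy selection: a member is never replaced by a worse point
                pop[i], fit[i] = trial, f_trial
        history.append(min(fit))
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best], history

def rastrigin(x):
    """Standard multimodal test function with many local minima and global minimum 0 at the origin."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)

best_x, best_f, history = differential_evolution(rastrigin, [(-5.12, 5.12)] * 2)
```

In a calibration setting, `f` would map a parameter vector to the joint RMSE of the model, and `bounds` would encode the admissible parameter ranges.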

For our calibration, we chose a date on which markets were under stress, namely May 5, 2010 at the

beginning of the European sovereign debt crisis. After cleaning our data as described previously, we

have 91 VIX options at six different maturities (from 0.04 to 0.46 years) and 486 S&P 500 options

12 Since we analyze the fit in terms of implied volatilities, we do not consider other popular choices of distances including absolute error of the logarithm of option prices, relative error of option prices (see Christoffersen and Jacobs (2004)). Alternatively, we checked that using distances taking into account the bid-ask spread of IVs as in Cont and Kokholm (2011) does not significantly change the quality of fit. Instead of the RMSE, we also looked at average relative errors (ARE). However, this does not affect our conclusions. The results using ARE are available upon request.

13 We thank Jochen Krause for his implementation of the CMA-ES and DE algorithms.


at eleven different maturities (from 0.05 to 0.91 years) available. We emphasize that we perform a

joint calibration. Hence, all this data is entered as input to minimize the total RMSE in (19) from

the VIX and the S&P 500 market simultaneously across all available maturities and moneyness.

In Figure 2, we plot the market and model implied volatilities for the S&P 500 (Panels A, C, E)

and the VIX (Panels B, D, F) for two maturity slices each. For the S&P 500 options, we choose the

two maturities T = 0.05 and T = 0.3, and for VIX options T = 0.04 and T = 0.36. As candidate

models, we choose different sub-specifications of the general model (SVJ2) presented in equations

(1)-(7): the Heston model and the Heston model with jumps in returns and volatility (SVJ). Since

we find that the jump component in the stochastic central tendency does not improve the results for

the SVJ2 model, we present the results with mt being a diffusive process.


From Panel A, Figure 2, we observe that the Heston model provides reasonable results for the S&P

500 market. However, for the VIX market (Panel B), the Heston model clearly fails to reproduce

one of the stylized facts of VIX option markets, namely the positive skew of the implied volatility

surface. This failure is most pronounced for the short-term options, where the Heston model generates

a significantly negative skew. The results for the SVJ model look much more promising. Just by

adding jump components to the return and volatility processes, we can now generate the positive

skew in the VIX market (Panel D), while providing an almost perfect fit for the S&P 500 options

market. The SVJ model only struggles at the short end of the VIX implied volatility surface. This

shortcoming disappears when we extend the SVJ specification to the SVJ2 model by adding the

factor mt. Doing so gives us not only a remarkable fit for the S&P 500 but also for the VIX options

market (Panel F). Looking at the RMSEs of the SVJ and SVJ2 models, we find that the SVJ

provides an RMSESPX of 1.27% and an RMSEVIX of 11.60%. The SVJ2 model yields 1.17% and

5.15%, respectively. Hence, while the two models are comparable in terms of their performance on

the S&P 500 options market, there is an obvious difference in the VIX market on the chosen date.

In unreported results, we performed calibration exercises on other days, also including calm periods.

Irrespective of the day, we observe that the SVJ and SVJ2 models perform comparably on the S&P

500 options market, both fitting the data very well. In contrast, we see that there are dates when


the SVJ model struggles to fit the VIX IVs in addition to the S&P 500 IVs whereas the SVJ2 model

satisfactorily fits both.14 Therefore, we conclude from our calibration exercise that we can discard

the Heston model from further analysis.

4 Estimation methodology

Daily calibration is essentially a multiple curve fitting exercise to check whether the models can fit

the risk-neutral distributions implied by option prices at different maturities. Some of the parameters

we get from daily calibrations in the previous section are unstable and vary substantially from one

day to the next.15 To judge whether one model is preferable to the other, a more elaborate analysis

is needed. We choose a methodology based on particle filtering. The particle filter not only allows

us to estimate the conditional densities of unobserved latent processes such as the volatility and

jump processes at every point in time, but it also provides us with standard errors of the parameter

estimates. Using a time series of S&P 500 and VIX indices and options, we estimate both the P and

Q dynamics of the model to obtain a set of model parameters that jointly prices spot and options

in both markets consistently across time. Before introducing the filter, we clarify how we discretize

our model framework and how we specify the measurement errors.

4.1 Discretization and error specifications

We discretize the continuous-time model on a uniform time grid composed of M + 1 points

$t \in \{t_0 = 0, t_1 = \Delta t, \dots, t_k = k\Delta t, \dots, t_M = M\Delta t\}$, for some $M \in \mathbb{N}^*$. Since we use daily data, $\Delta t$ corresponds

to one day. By a slight abuse of notation, we henceforth denote by t the integer time index of the

14 Our findings are consistent with Gatheral (2008), who shows that the Heston model is incapable of reproducing the positive skew in VIX IVs, and with Sepp (2008a,b), who finds that the Heston model with positive jumps in the volatility dynamics removes this shortcoming.

15 As explained in Broadie, Chernov, and Johannes (2007) and Lindström, Ströjby, Brodén, Wiktorsson, and Holst (2008), the parameters obtained when calibrating to daily options prices are not stable over time.


discretized process. In discrete time, the model evolves under P as follows:16

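$$\Delta Y_t = \left[-\lambda^{Yv}(v_t, m_t)\left(\theta^{P}_Z(1, 0, 0) - 1\right) - \frac{1}{2} v_t + \gamma_t\right]\Delta t + \sqrt{v_t}\,\Delta W^{Y,P}_t + Z^{Y,P}_t \Delta N^{Yv}_t, \qquad (20)$$

$$\Delta v_t = \kappa^{P}_v \left(\frac{\kappa^{Q}_v}{\kappa^{P}_v} m_t - v_t\right)\Delta t + \sigma_v \sqrt{v_t}\,\Delta W^{v,P}_t + Z^{v,P}_t \Delta N^{Yv}_t, \qquad (21)$$

$$\Delta m_t = \kappa^{P}_m \left(\theta^{P}_m - m_t\right)\Delta t + \sigma_m \sqrt{m_t}\,\Delta W^{m,P}_t + Z^{m,P}_t \Delta N^{m}_t. \qquad (22)$$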
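The recursion (20)-(22) can be simulated step by step. The sketch below is a plain Euler scheme with truncation at zero, not the Milstein scheme the authors use in the filter, and it rests on several labeled assumptions: an affine intensity `lam0 + lam1*v + lam2*m`, a compensator equal to E[exp(Z^Y)] - 1 for a normal return jump, an equity premium of the form `eta_y * v`, no jumps in m_t, and made-up parameter values that are not the paper's estimates.

```python
import math
import random

def simulate_paths(n_steps, dt, p, seed=0):
    """Euler scheme with truncation at zero for (Y, v, m) under P; returns a list of tuples."""
    rng = random.Random(seed)
    y, v, m = 0.0, p["v0"], p["m0"]
    path = [(y, v, m)]
    # Assumed compensator E[exp(Z^Y)] - 1 for a normally distributed return jump size.
    comp = math.exp(p["mu_y"] + 0.5 * p["sig_y"] ** 2) - 1.0
    sqdt = math.sqrt(dt)
    for _ in range(n_steps):
        lam = p["lam0"] + p["lam1"] * v + p["lam2"] * m  # assumed affine jump intensity
        jump = rng.random() < lam * dt                   # at most one joint jump per step
        z_y = rng.gauss(p["mu_y"], p["sig_y"]) if jump else 0.0   # normal return jump
        z_v = rng.expovariate(1.0 / p["nu_v"]) if jump else 0.0   # exponential variance jump
        e1, e2, e3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        dw_y = sqdt * e1
        dw_v = sqdt * (p["rho"] * e1 + math.sqrt(1 - p["rho"] ** 2) * e2)  # leverage correlation
        dw_m = sqdt * e3
        gamma = p["eta_y"] * v                           # assumed form of the premium gamma_t
        dy = (-lam * comp - 0.5 * v + gamma) * dt + math.sqrt(v) * dw_y + z_y
        dv = (p["kq_v"] * m - p["kp_v"] * v) * dt + p["sig_v"] * math.sqrt(v) * dw_v + z_v
        dm = p["kp_m"] * (p["theta_m"] - m) * dt + p["sig_m"] * math.sqrt(m) * dw_m
        y, v, m = y + dy, max(v + dv, 0.0), max(m + dm, 0.0)
        path.append((y, v, m))
    return path

# Illustrative values only (hypothetical, chosen for readability).
example_params = {
    "v0": 0.04, "m0": 0.04, "lam0": 0.0, "lam1": 1.0, "lam2": 4.0,
    "mu_y": -0.01, "sig_y": 0.02, "nu_v": 0.03, "rho": -0.7, "eta_y": 1.0,
    "kq_v": 5.0, "kp_v": 4.0, "sig_v": 0.5, "kp_m": 0.3, "theta_m": 0.04, "sig_m": 0.2,
}
path = simulate_paths(252, 1.0 / 252, example_params, seed=1)
```

Note that the variance drift in (21) simplifies to κ^Q_v m_t − κ^P_v v_t, which is the form used in the code.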

Equation (20) is the first measurement equation. The second measurement equation is given by the

observation of the VIX index level with error. Since the VIX index is in practice calculated using a

finite number of options, a discretization bias is introduced:17

$$VIX^2_t - \left(\alpha_{VIX^2} v_t + \beta_{VIX^2} m_t + \gamma_{VIX^2}\right) = \epsilon^{VIX}_t. \qquad (23)$$

The error term $\epsilon^{VIX}_t$ is assumed to follow a normal distribution with mean zero and variance s > 0.
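As a small illustration of the measurement equation (23), the residual and its Gaussian log-density can be written as follows. The function names are hypothetical, and the sketch assumes the VIX is quoted as a decimal (0.2 for a level of 20) so that its square is on the same scale as the variance factors.

```python
import math

def vix_residual(vix, v, m, alpha, beta, gamma):
    """Residual of (23): squared VIX minus its affine model value."""
    return vix ** 2 - (alpha * v + beta * m + gamma)

def vix_loglik(vix, v, m, alpha, beta, gamma, s):
    """Gaussian log-density of the residual with mean zero and variance s > 0."""
    eps = vix_residual(vix, v, m, alpha, beta, gamma)
    return -0.5 * (math.log(2.0 * math.pi * s) + eps ** 2 / s)
```

In a filter, this log-density would be evaluated once per particle (v, m) and combined with the contributions of the other measurement equations.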

The last measurements are the prices of S&P 500 and VIX options. We assume that option prices are

observed with an error, which is due to different sources such as bid-ask spreads, misspecification,

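and timing and processing errors. We define these errors as relative differences between market

prices $O^{M,Mkt}_t$ and model prices $O^{M,Mod}_t$, $M \in \{SPX, VIX\}$:

$$\frac{O^{SPX,Mod}_{t,i}(Y_t, v_t, m_t, \Theta^{Q}, \Theta^{P,Q}) - O^{SPX,Mkt}_{t,i}}{O^{SPX,Mkt}_{t,i}} = \epsilon^{SPX,options}_{t,i}, \quad i = 1, \dots, N_{SPX,t}, \qquad (24)$$

$$\frac{O^{VIX,Mod}_{t,j}(v_t, m_t, \Theta^{Q}, \Theta^{P,Q}) - O^{VIX,Mkt}_{t,j}}{O^{VIX,Mkt}_{t,j}} = \epsilon^{VIX,options}_{t,j}, \quad j = 1, \dots, N_{VIX,t}, \qquad (25)$$

where $N_{M,t}$ is the number of contracts available in the respective markets. We assume the error

terms to be normally distributed and heteroscedastic:

$$\epsilon^{SPX,options}_{t,i} \sim \mathcal{N}\left(0, \sigma^2_{\epsilon^{SPX}_{t,i}}\right), \quad \epsilon^{VIX,options}_{t,j} \sim \mathcal{N}\left(\mu_{\epsilon^{VIX}_t}, \sigma^2_{\epsilon^{VIX}_{t,j}}\right), \qquad (26)$$

where $\mu_{\epsilon^{VIX}_t}$ is proportional to the error $\epsilon^{VIX}_t$ made in the estimation of the VIX level.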

16 For the particle filter, we actually use a Milstein scheme to improve the precision of the discretized dynamics. See Kloeden and Platen (1992) for details.

17 Jiang and Tian (2007) point to systematic biases in the VIX.


Indeed, if the underlying’s value is not accurately estimated, it introduces a bias in the valuation of

VIX options. We specify the variance of errors as follows:

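$$\sigma^2_{\epsilon^{SPX}_{t,i}} = \exp\left(\phi_0 \cdot \text{bid-ask spread}_i + \phi_1 \left|\log\left(\frac{K_i}{F^{SPX}_t(T_i)}\right)\right| + \phi_2 (T_i - t) + \phi_3\right), \qquad (27)$$

$$\sigma^2_{\epsilon^{VIX}_{t,j}} = \exp\left(\psi_0 \cdot \text{bid-ask spread}_j + \psi_1 \left|\log\left(\frac{K_j}{F^{VIX}_t(T_j)}\right)\right| + \psi_2 (T_j - t) + \psi_3\right), \qquad (28)$$

where $\phi_i$ and $\psi_i$ are in $\mathbb{R}$, $i \in \{0, \dots, 3\}$.18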
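The error specification in (24)-(28) translates into a short likelihood computation. The sketch below is illustrative only: the function names are hypothetical, and treating the option errors as independent across contracts is an assumption made here for simplicity.

```python
import math

def error_variance(spread, strike, forward, tau, coef):
    """Variance in (27)-(28): exp(c0*spread + c1*|log(K/F)| + c2*tau + c3)."""
    c0, c1, c2, c3 = coef
    return math.exp(c0 * spread + c1 * abs(math.log(strike / forward)) + c2 * tau + c3)

def relative_error(model_price, market_price):
    """Relative pricing error as in (24)-(25)."""
    return (model_price - market_price) / market_price

def options_loglik(errors, variances, mean=0.0):
    """Sum of Gaussian log-densities with option-specific variances (independence assumed)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * s2) + (e - mean) ** 2 / s2)
        for e, s2 in zip(errors, variances)
    )
```

The exponential link keeps the variance positive for any real coefficients, which is why the coefficients can be optimized over the whole real line.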

4.2 Particle filter

At every period t, the measurement vector $y_t$ collects observed market prices. By $y^{t_n} = (y_{t_0}, \dots, y_{t_n})$,

we denote all the observations available up to time $t_n$. The filtering problem consists of recursively

calculating the distribution of the latent state $L_t$,

$$L_t = \left\{v_t, m_t, \Delta N^{Yv}_t, \Delta N^{m}_t, Z^{Y,P}_t, Z^{v,P}_t, Z^{m,P}_t\right\}, \qquad (29)$$

conditional on $y^{t_n}$. Particle filters are well suited to our problem: they can handle observations

that are nonlinear functions of latent variables as well as equations with non-Gaussian

innovations.

Multiple versions of the particle filter exist. We use the Auxiliary Particle Filter (APF) proposed by

Pitt and Shephard (1999). Compared to more basic particle filters such as the Sampling Importance

Resampling (SIR) filter, the APF is better suited to detect jumps whereas the SIR filter faces sample

impoverishment leading to potential particle degeneracy. Both filters are described in Johannes,

Polson, and Stroud (2009) for filtering latent factors from returns in a Heston model with jumps in

returns.
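For readers unfamiliar with particle filtering, the propagate, weight, and resample cycle can be sketched on a toy problem. The code below is a basic bootstrap (SIR) filter, not the auxiliary particle filter used in the paper. It assumes a single square-root variance factor without jumps, observed only through returns that are conditionally normal given the current variance particle; all names and parameter values are illustrative.

```python
import math
import random

def sir_filter(returns, dt, kappa, theta, sigma, n_particles=500, v0=0.04, seed=0):
    """Bootstrap (SIR) filter; returns (filtered means of v_t, log-likelihood estimate)."""
    rng = random.Random(seed)
    particles = [v0] * n_particles
    means, loglik = [], 0.0
    floor = 1e-8  # keeps the variance strictly positive after the Euler step
    for y in returns:
        # 1. Propagate each particle through the discretized state equation.
        particles = [
            max(v + kappa * (theta - v) * dt + sigma * math.sqrt(v * dt) * rng.gauss(0, 1), floor)
            for v in particles
        ]
        # 2. Weight each particle by the density of the observed return given that particle.
        weights = [
            math.exp(-0.5 * (y + 0.5 * v * dt) ** 2 / (v * dt)) / math.sqrt(2.0 * math.pi * v * dt)
            for v in particles
        ]
        total = sum(weights)
        if total == 0.0:
            raise RuntimeError("all particle weights underflowed")
        loglik += math.log(total / n_particles)
        weights = [w / total for w in weights]
        means.append(sum(w * v for w, v in zip(weights, particles)))
        # 3. Resample with replacement in proportion to the normalized weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means, loglik

toy_returns = [0.001, -0.002, 0.015, -0.03, 0.004]
filtered, ll = sir_filter(toy_returns, 1.0 / 252, kappa=3.0, theta=0.04, sigma=0.3)
```

The sample impoverishment mentioned above arises in step 3: after a large move, few particles carry meaningful weight and the resampled population collapses onto them. An auxiliary filter mitigates this by using the new observation before propagation.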

We develop an extension of their algorithm that is able to handle the second volatility factor mt and

the volatility jumps. As the jump sizes of returns and their variance are respectively normally and

18 The fact that option pricing errors are normally distributed does not constitute a restriction. The reason is that the errors are heteroscedastic and coefficients generating heteroskedasticity are driven by the data, i.e., we optimize over the parameters $\{\phi_i, \psi_i\}_{0 \le i \le 3}$.


exponentially distributed, the conditional likelihood of the new observations given a combination of

jumps involves the sum of a normal and (up to two) exponential random jumps, assuming a maximum

of one jump occurring within one time step. To compute the joint probability of jumps and preserve

tractability, we approximate the exponentially distributed jump sizes by a categorical distribution

(a generalization of the Bernoulli distribution) whose support is a chosen set of quantiles.19

The detailed particle filter is presented in Appendix D. Furthermore, we perform additional data

treatments for S&P 500 and VIX options before running the particle filter. They are described in

Appendix E.
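The categorical approximation of the exponential jump sizes mentioned above can be illustrated as follows. The paper does not state which quantiles are chosen, so this sketch assumes equally weighted points placed at the midpoint quantiles of n equal-probability bins; the function name is hypothetical.

```python
import math

def exponential_quantile_grid(mean, n_points):
    """Equal-weight categorical approximation of an exponential law with the given mean."""
    if mean <= 0 or n_points < 1:
        raise ValueError("mean must be positive and n_points at least 1")
    probs = [(k + 0.5) / n_points for k in range(n_points)]
    support = [-mean * math.log(1.0 - p) for p in probs]  # inverse CDF of the exponential law
    weights = [1.0 / n_points] * n_points
    return support, weights

support, weights = exponential_quantile_grid(1.0, 5)
approx_mean = sum(w * x for w, x in zip(weights, support))
```

With a finite support, the probability of each jump-size category can be computed by enumeration, which keeps the conditional likelihood tractable. The cost is a truncated right tail: with five midpoint quantiles the approximate mean falls somewhat below the true mean.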

4.3 Candidate models and datasets

To illustrate the incremental information contained in the different

markets, we proceed with our empirical investigation in a pedagogical way by

defining four different datasets as follows:

                         S&P 500 index    S&P 500 index and options

VIX index                D1               D2

VIX index and options    D3               D4

D1 only contains data on the S&P 500 and VIX indexes. D2 (resp. D3) additionally contains S&P

500 options (resp. VIX options). Finally, D4 contains all available data. Splitting up the data in

such a way allows us to draw inferences on the information contents of different markets and to study

whether these contents are consistent with one another.

As the calibration exercise described in Section 3.3 has shown, the SVJ and SVJ2 models perform

well in simultaneously fitting the S&P 500 and VIX markets on a particular day. Therefore, these

two models are natural candidates to analyze with the particle filter. In addition, to appreciate the

impact of jumps, we consider a two factor volatility model that has no jumps. We label this model

SV2.20

19 Robustness tests were performed on simulated data to check that the choice of quantiles was appropriate.

20 In unreported results, we do not find evidence that the inclusion of additional jumps in the central tendency factor

adds value to capture the time series of option prices. Therefore, as in the calibration exercise of the previous section,


5 Results and Discussion

After presenting the parameter estimates, we analyze the trajectories of the latent factors depending

on the model and dataset considered. We also investigate the benefits of jumps and stochastic central

tendency by analyzing the in and out-of-sample option pricing errors. We then discuss whether the

VIX market and S&P 500 markets convey the same information about volatility dynamics. Finally,

we study the dynamics of equity and variance risk-premia.21

5.1 Likelihood criteria and parameter estimates

Tables 2 and 3 report the point estimates and standard errors resulting from the estimation of the

SVJ2 model and its SVJ and SV2 sub-specifications to datasets D1 to D4. The last rows of Table

3 indicate the log-likelihood values and the values of the Akaike Information Criterion (AIC) and

Bayes Information Criterion (BIC) for each estimation.
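For reference, both information criteria follow directly from the log-likelihood. The sketch below uses the standard definitions; the function names are hypothetical.

```python
import math

def aic(loglik, n_params):
    """Akaike Information Criterion: 2k - 2 log L (lower is better)."""
    return 2 * n_params - 2 * loglik

def bic(loglik, n_params, n_obs):
    """Bayes Information Criterion: k log(n) - 2 log L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * loglik
```

The BIC penalizes additional parameters more heavily than the AIC as soon as the number of observations exceeds e² (about 7.4), so a richer model must improve the likelihood by more to be preferred under the BIC.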

Before we discuss specific parameters, we make the following remark. When we estimate the models

with index data (D1) only, the likelihood criteria are slightly in favor of the SVJ2 model. However,

the Q-parameters and κPv are difficult to estimate. Their standard errors are typically four to five

times larger than those obtained with the other datasets. Therefore, we conclude that D1 is not rich

enough to provide a reliable estimation of the models considered. In particular, despite the fact that

the VIX index is constructed from option prices, it does not contain enough information to accurately

infer the parameters which characterize the Q-distribution of the S&P 500 returns. Consequently,

D1 is not informative enough to provide reasonable pricing performance for S&P 500 options.


Given this result for D1, we focus our discussion on the results for the larger datasets D2 to D4.

We start by analyzing the parameters that we assume equal under both probability measures. The

estimates for the jump intensities suggest that the dominant factor driving the intensities is mt.

we restrict our analysis of the SVJ2 model by setting $J^m_t = 0$ for all t.

21 In the results presented below, we have used $n_p = 15000$ particles on days where observations contain option prices

and $n_p = 8000$ when the observations are only composed of returns. Larger numbers of particles did not change our estimates, but increased the computational complexity of the algorithm.


Indeed, the estimates of $\lambda^{Yv}_2$ range between 3.64 and 4.25 for the SVJ2 model and are significantly

different from zero. In contrast, the estimates of $\lambda^{Yv}_1$ range between 0.15 and 1.1 and are not

significant for all datasets. For the SVJ model with one volatility factor, the effect of $m_t$ is transferred

to $v_t$ and we can expect an increase in the estimates for $\lambda^{Yv}_1$. Indeed, they range between 2.38 and

2.85 and are significant. The constant term $\lambda^{Yv}_0$ is significantly different from zero for both the SVJ2

and SVJ model, when we use the whole dataset D4.

For the second volatility factor mt, we find a volatility parameter σm in the interval [19%, 35%] for

both the SV2 and SVJ2 models regardless of the dataset chosen for the estimation, which is about

half the value of the volatility factor found for the process vt for these two models. In addition, the

speed of mean-reversion of the process mt is about twenty times smaller than that of vt under both

measures. Hence, we can interpret the process vt as a factor representing short-term fluctuations

of the variance, whereas the process mt captures long-term trends. In the SVJ model, as expected,

the volvol parameter σv tends to range between the estimates σm and σv of the two-factor models.22

Not surprisingly, we also find a prominent leverage coefficient ρY v across all models and datasets.

By inspection of the P-parameters, the second volatility factor mt turns out to be more persistent

than vt. This finding also holds under the measure Q in Table 3. The equity risk premium coefficient

ηY is positive across all models and datasets which implies, as expected, a positive diffusive equity

risk premium. We find that under P the mean jump size of returns $\mu^P_Y$ and its variance $\sigma^P_Y$ are

difficult to identify, because the likelihood is not very sensitive to a change in their value. Therefore,

they were estimated from high-frequency returns. Five-minute returns were obtained from the TAQ

database and intra-daily jump occurrences and sizes were estimated following the method described

in Bollerslev and Todorov (2011). Daily jumps were obtained by aggregation of high-frequency jumps.

Resulting moments of jumps are: $\mu^P_Y = 0.0002$ and $\sigma^P_Y = 0.0039$. Their values were fixed throughout

the filtering exercise. Jump size estimates under Q are statistically significant and strongly negative

(around -10%), which is due to investors’ risk aversion to jumps and yields a non-zero jump risk

premium. Similarly, the volatility of jumps is larger under Q than under P. This finding for jumps in

22 We impose the Feller condition $2\kappa^Q_v m_0 > \sigma_v^2$ on the SVJ model, where $m_0$ is the level of reversion of the variance when the central tendency is a constant process. As a consequence, for all datasets containing options, the estimated volatility-of-volatility parameter $\sigma_v$ is considerably smaller for SVJ than for SV2 and SVJ2.


returns contrasts with the result for jumps in volatility: there is almost no difference between

the estimates of $\nu^P_v$ and $\nu^Q_v$. For the SVJ2 model with the full dataset D4, we find a mean jump of

0.03 under both measures. However, the estimate for $\nu^Q_v$ is highly significant, while the estimate for

$\nu^P_v$ is not statistically different from zero.


The diffusive part of the volatility risk premium $\eta_v = \kappa^Q_v - \kappa^P_v$ is mostly negative; its magnitude,

however, depends on the model used. In particular, we find much smaller magnitudes with the SVJ2

and SV2 models than with the SVJ model. The estimates for the dataset D1 illustrate again the

problem inherent in an estimation strategy using indices only. The estimates for the variance risk

premium ηv vary considerably from strongly negative to strongly positive for different models. For

the diffusive part of the stochastic central tendency risk premium $\eta_m = \kappa^Q_m - \kappa^P_m$, we find that the

estimation of the SV2 model may generate positive premia, depending on the dataset used. However,

whenever we account for the presence of jumps, the risk premium $\eta_m$ is negative.23

Finally, in the last columns of Table 3 we report the log-likelihood values for the different models

and datasets. All likelihood criteria imply that the SVJ2 model significantly outperforms the SVJ

and SV2 models when options are included in the estimation dataset. Furthermore, the two-factor

models (SVJ2 and SV2) are substantially superior to the SVJ model in fitting S&P 500 options.

5.2 Filtered trajectories

Figure 3 displays the filtered trajectory of the volatility process when estimating all three models

SVJ, SV2, and SVJ2 using D4. The volatility trajectories $\sqrt{v_t}$ are similar across models. For both

the in and out-of-sample period we cannot detect substantial differences across models and conclude

that the model choice has no significant impact on the filtered volatility state.


23 For the dataset D1, all the estimates of $\eta_m$ are positive. However, as argued above, these estimates should be interpreted with care.


Next, we compare the filtered trajectories of the volatility when using different estimation datasets.

By doing so, we can identify the information contents of each dataset on the variance of S&P 500

returns. In Panels B through D of Figure 3 we plot the absolute differences between the filtered

variance using D4 against datasets D1, D2, and D3, under the SVJ2 specification. Up until the peak

in the VIX towards the end of the in-sample period, the difference between the filtered variances is

small (less than 3%). During this period, the filtered variance using D1, which uses only S&P 500

and VIX index returns, is in general slightly smaller than the variance filtered using other datasets

(Panel B). Hence, the inclusion of options tends to moderately increase the filtered variance. With

the start of the financial crisis in the fall of 2008, some new patterns emerge. While the variance

filtered using D4 remains close to the one filtered using D2 (Panel C), the variance trajectories filtered

using D1 and D3 (Panels B and D) are substantially larger (up to 25 percentage points). Hence,

adding S&P 500 options to D1 provides highly valuable information for the identification of the

variance process, which is not spanned by VIX options (D3). Only relying on VIX data may lead

to a significant overestimation of volatility during volatile times. In the out-of-sample period, the

difference between the trajectories remains within an interval of ±3 percentage points except during

the period surrounding the second variance peak in May 2010, which again leads to an overestimation

of volatility when using D1 and D3.

In Figure 4, Panels A and B display the filtered time series of probability of jumps in the variance

process $\{P(\Delta N^{Yv}_{t_k} = 1)\}_{k=0,\dots,M}$ for D1 and D4.24 If this probability is above 10%, we plot the

corresponding jump size in Panels C and D. As for the contribution of jumps to the variation of the

variance, Figure 4 shows that a substantial number of jumps are filtered during the crisis regardless

of the dataset used for estimation. When options are part of the dataset, the largest jumps are

between 5 and 10%.


Although the volatility trajectories are similar across models (Panel A, Figure 3), the SVJ generates

more volatility jumps during the crisis. We attribute this observation to the fact that SVJ has a

constant mean-reversion level for the volatility process. Furthermore, we impose the Feller condition,

24 We do not report the results for D2 and D3, as the results are very similar to D4.


which restricts the amplitude of the volatility diffusive movements. This condition is relaxed for two-factor

models as the level of reversion of the variance is varying, allowing the volatility process to

have a larger amplitude. Hence, the SVJ model needs to generate more jumps for the volatility to

stay high.
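The constraint that the Feller condition places on the volatility of volatility can be made concrete with a small helper. This is an illustrative sketch of the inequality 2κm₀ > σ² for a square-root process with constant reversion level; the function names and numerical values are hypothetical.

```python
import math

def feller_margin(kappa, level, sigma):
    """Slack in the Feller condition 2*kappa*level > sigma**2 (positive means satisfied)."""
    return 2.0 * kappa * level - sigma ** 2

def satisfies_feller(kappa, level, sigma):
    """True when the square-root process stays strictly positive."""
    return feller_margin(kappa, level, sigma) > 0.0

def max_vol_of_vol(kappa, level):
    """Upper bound on sigma implied by the Feller condition for fixed kappa and level."""
    return math.sqrt(2.0 * kappa * level)

# With kappa = 5 and a reversion level of 0.04, sigma must stay below about 0.632.
bound = max_vol_of_vol(5.0, 0.04)
```

For a fixed mean-reversion speed and level, the bound caps the diffusive volatility of the variance, which is consistent with the SVJ model relying more heavily on jumps to keep the variance high.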

In Figure 5, we plot the trajectories of the central tendency process mt using different datasets.

Interestingly, the identification of mt is more sensitive to the choice of datasets. In contrast to the

estimation of vt, we observe significant differences depending on which model we take. The SV2

model indeed tends to generate higher values for the central tendency, in particular in 2009 just

after the peak in the process vt. Intuitively, especially during volatile times, the SV2 model needs to

compensate its inability to generate jumps by increasing mt.

Irrespective of the dataset, we observe that the process mt is overall more stable and less erratic than

the variance process vt, providing evidence that it captures long-term trends. The variance process

vt increases dramatically during the crisis (from September 2008) but gradually returns to a level

comparable to the one before the crisis, which is reached at the beginning of 2010. In contrast, the

central tendency starts to increase at the beginning of 2009, a few months after the spike observed

for vt. We attribute this delay to market participants needing time to adjust their long-term view

of the market. In turn, the process mt does not return to its pre-crisis level of 1 to 2% and remains

between 5% and 8% until the end of the time-series.


Even though the central tendency process is more persistent and less erratic than vt, approximating

$m_t$ by a constant as in the SVJ model is too rough an approximation. In particular, Figure 5

shows that the levels reached by the central tendency during and after the crisis are substantially

underestimated by the SVJ model. In fact, the constant central tendency estimated in the SVJ

model seems to be close to the average filtered central tendency of the SV2 and SVJ2 models over

the in-sample period. This makes the SVJ model insensitive and non-adaptable to different regimes

in long-term volatility especially when the out-of-sample estimation period exhibits more instabilities

than the in-sample period.


5.3 Pricing errors

We analyze pricing errors along different dimensions, across different models within the affine pricing

framework, and across different datasets. Our main focus, however, is on the simultaneous pricing

performance of different models on options on both the S&P 500 and VIX.25

We first investigate how the different models reproduce S&P 500 options across time, for the different

moneynesses and maturities. Focusing first on dataset D2 in Table 4, we find that SVJ2 and SV2

are superior to the SVJ. Indeed, the SVJ model exhibits higher Root Mean Square Relative Errors

(RMSREs) than the SV2 and SVJ2 models for all option categories except ATM options.26 The SVJ

prices short-term options fairly well (still not as well as the SVJ2 and SV2) but poorly represents

deep OTM calls and long-maturity options. Hence, introducing stochastic central tendency allows

us to price long-term and deep OTM S&P 500 options more accurately. This finding supports

the interpretation that mt captures the long-term trends of volatility and therefore helps better

reproduce the term structure of S&P 500 option prices. The SVJ2 and SV2 have similar in-sample

pricing errors overall. The SV2 is slightly better at pricing ATM and deep OTM calls, but the SVJ2

prices OTM puts more accurately. Out-of-sample, the SVJ2 model outperforms the SV2 model in

most moneyness and maturity categories. Exceptions are deep OTM puts and long-term options.

However, for those categories the difference in RMSRE does not exceed 2%. Therefore, the SVJ2

model does not overfit the data in the in-sample period; overfitting would translate into a deterioration

of its out-of-sample performance relative to the other models.


Focusing on the pricing errors of VIX options using dataset D3, we find that again the SVJ2 model

outperforms all models in terms of RMSE for most option categories. In sample, the SVJ2 always

performs better than the SVJ model. The SV2 model is only slightly better for deep OTM call

25 We also analyzed pricing errors for the VIX index. We find that all three models, SVJ, SV2, and SVJ2 accurately reproduce the time-series of VIX index irrespective of which estimation dataset is used. Hence, jumps and a stochastic central tendency appear superfluous to reproduce the trajectory of the VIX level, a result which is confirmed by a Diebold-Mariano test. Detailed results are available upon request.

26 As the model has been estimated to relative errors, it is sensible to use the same measure of error to evaluate its performance. We found that an assessment in terms of Root Mean Square Errors may be misleading as it focuses on expensive options, i.e., options which are closer to the ATM level.


options. Interestingly, the SVJ does a better job than the SV2 at pricing deep OTM VIX calls, indicating

that jumps are essential to accurately represent the tail of the volatility distribution. However,

the central tendency factor significantly improves the pricing of all other moneyness levels. These

observations are confirmed out-of-sample. The addition of a central tendency factor improves the

pricing of VIX options for all moneyness levels except for deep OTM calls. In fact, for this category,

the SVJ model even performs better than the SVJ2 out-of-sample. Moreover, consistent with the

results obtained when using D2 as estimation dataset, the SV2 model substantially outperforms the

SVJ model in pricing VIX options with a maturity exceeding two months.

When using all the available data D4, the SVJ2 model yields significantly smaller in-sample RMSEs

than the SVJ and the SV2 models when pricing most S&P 500 and VIX option categories. We find

that the two-factor models perform much better than the SVJ to price deep OTM puts and calls on

the S&P 500 as well as long maturity options. In turn, the SVJ model outperforms the SV2 model

in fitting deep OTM calls on the VIX. While the SVJ2 outperforms the SV2 model at pricing S&P

500 options in- and out-of-sample, it appears the SV2 model slightly outperforms the SVJ2 model

in fitting out-of-sample VIX options.

To test whether the pricing performance is significantly better for SVJ2 than for its nested models

on average, we use the Diebold and Mariano (1995) test (DM).27 We consider two loss functions, the

Mean Square Error (MSE) of option prices and the Mean Square Relative Error (MSRE).
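A minimal sketch of the DM statistic is given below. It assumes one-step-ahead errors and uses only the lag-zero variance of the loss differential; since pricing errors are typically autocorrelated, an application such as the one in this paper would more likely use a long-run (HAC) variance. The sign convention, under which positive values favor the second model, and the function names are assumptions of this sketch.

```python
import math

def diebold_mariano(errors_a, errors_b, loss=lambda e: e * e):
    """DM statistic for one-step-ahead errors; positive values favor model B."""
    if len(errors_a) != len(errors_b) or len(errors_a) < 2:
        raise ValueError("need two error series of equal length, at least 2 observations")
    d = [loss(a) - loss(b) for a, b in zip(errors_a, errors_b)]  # loss differential
    n = len(d)
    d_bar = sum(d) / n
    var_d = sum((x - d_bar) ** 2 for x in d) / n  # lag-0 autocovariance only
    if var_d == 0.0:
        raise ValueError("loss differential has zero variance")
    return d_bar / math.sqrt(var_d / n)

def two_sided_p_value(stat):
    """Asymptotic p-value under the standard normal null distribution."""
    return math.erfc(abs(stat) / math.sqrt(2.0))

# Toy example: model B has uniformly smaller errors than model A.
dm = diebold_mariano([1.0, 2.0, 1.0, 2.0], [0.0, 1.0, 0.0, 1.0])
```

Passing absolute or relative errors with a different `loss` function gives the MSE and MSRE variants of the test.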

For S&P 500 options and using dataset D2, the results in Panel A of Table 5 confirm that the SVJ2

model provides significantly better in-sample and out-of-sample MSEs than the other two models.

The DM tests for the MSRE loss function are not as significant but still positive and larger out-of-sample

than in-sample. When we switch to the full dataset D4, the results confirm that the SVJ2

model has smaller in-sample pricing errors than the other two models, especially for S&P 500 options.

However, the test indicates that the SVJ2 model does not outperform the SV2 model out-of-sample,

which might be due to an identification problem for the jump terms, since a large part of the crisis

belongs to the out-of-sample time period.

27 The DM test works as follows. Consider a loss function, e.g., $L(e_t) = |e_t|$, where $e_t$ is the difference at time t between the model filtered and observed value. If two models have comparable pricing errors, then the expectation of their loss differential should be zero. The DM test provides a test statistic for this differential.



For VIX options (and using D4), the results in Panel B of Table 5 are positive but fairly close to

zero, especially for the MSEs, indicating that all three models perform similarly in terms of MSEs,

but that the SVJ2 performs slightly better in terms of MSREs. Since the MSEs measure the average

dollar error and the MSREs the relative error, our results indicate that the SVJ2 model fits

cheap OTM options better than the SV2 model.

From the above analysis of the pricing errors, we can draw three conclusions. First, modeling

stochastic central tendency adds significant value for the pricing of long-term options and for the

representation of the tails of the returns’ distribution (OTM puts and calls on the S&P 500). Second,

jumps add value to represent the right tail of the variance distribution (OTM calls on the VIX) as

well as short-term options. Third, given that we use more than 4.5 years of options’ data with a wide

range of moneynesses and on two different derivatives markets, the pricing errors resulting from our

time-series estimation are small overall.

However, our analysis also uncovers some potential shortcomings of the affine framework. Even the

SVJ2 model has difficulties in reproducing the observed volatility smiles during and after the crisis. In

particular, OTM puts on the S&P 500 tend to be underpriced and OTM calls are generally overpriced,

i.e., the model-implied smile of volatility does not exhibit enough skewness. This phenomenon affects

short-maturity options in particular and indicates that the model struggles in reproducing higher

moments at the short end.


To further illustrate this point, we compare in Figure 6 the implied skewness and kurtosis of S&P

500 returns as implied by market and model option prices, when the models are calibrated to the full

dataset D4. While the skewness of the returns is well represented at the beginning of the in-sample

period, it is underestimated from late 2007 until the end of our sample. In the out-of-sample period

this phenomenon becomes much more apparent, and all three models yield an implied skewness that

is about half that implied by the market. The SVJ2 model provides a slight improvement over the

other two models but is still far from reality. Similarly, the kurtosis is only slightly underestimated


at the beginning of the time-series, but in the out-of-sample period the model kurtosis is about half

the market-implied kurtosis. We add that there is no improvement in the representation of S&P 500

implied moments when adding VIX options to the estimation dataset.
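This section does not restate how implied skewness and kurtosis are extracted from option prices. One common model-free approach is that of Bakshi, Kapadia, and Madan (2003), which integrates OTM option prices against fixed weights. The sketch below assumes that approach and is an illustration only, not necessarily the computation behind Figure 6. The helper names (`bs_price`, `integrate`, `bkm_skew_kurt`) and all parameter values are hypothetical. Black-Scholes prices are used as input so that the output can be checked against the known skewness (0) and kurtosis (3) of normally distributed log returns.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_price(S, K, r, tau, sigma, is_call):
    """Black-Scholes price, used here only to generate test input."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    if is_call:
        return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)
    return K * math.exp(-r * tau) * norm_cdf(-d2) - S * norm_cdf(-d1)


def integrate(f, a, b, n=4000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))


def bkm_skew_kurt(S, r, tau, call, put, k_min, k_max):
    """Risk-neutral skewness and kurtosis of the log return from OTM prices."""
    def contract(w_call, w_put):
        # Calls are weighted in ln(K/S) for K > S, puts in ln(S/K) for K < S.
        calls = integrate(lambda K: w_call(math.log(K / S)) / K ** 2 * call(K), S, k_max)
        puts = integrate(lambda K: w_put(math.log(S / K)) / K ** 2 * put(K), k_min, S)
        return calls + puts

    V = contract(lambda l: 2 * (1 - l), lambda l: 2 * (1 + l))                  # quadratic
    W = contract(lambda l: 6 * l - 3 * l ** 2, lambda l: -(6 * l + 3 * l ** 2))  # cubic
    X = contract(lambda l: 12 * l ** 2 - 4 * l ** 3, lambda l: 12 * l ** 2 + 4 * l ** 3)  # quartic
    g = math.exp(r * tau)
    mu = g - 1 - g * (V / 2 + W / 6 + X / 24)
    var = g * V - mu ** 2
    skew = (g * W - 3 * mu * g * V + 2 * mu ** 3) / var ** 1.5
    kurt = (g * X - 4 * mu * g * W + 6 * g * mu ** 2 * V - 3 * mu ** 4) / var ** 2
    return skew, kurt


S, r, tau, sigma = 100.0, 0.01, 0.25, 0.2  # hypothetical inputs
call = lambda K: bs_price(S, K, r, tau, sigma, True)
put = lambda K: bs_price(S, K, r, tau, sigma, False)
skew, kurt = bkm_skew_kurt(S, r, tau, call, put, k_min=20.0, k_max=400.0)
print(skew, kurt)
```

Applied to market prices and to model prices on the same strike grid, the same function would yield the two sets of moments being compared. With real data the strike range is finite and discrete, so truncation and interpolation choices matter.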

5.4 Market integration

Even for affine models, particle filter estimation of option pricing models is computationally highly

intensive. Therefore, recent literature proposes to estimate option pricing models by using only

S&P 500 and VIX index data, thereby avoiding the computational burden associated with option

valuation.28 However, as we mention in Section 5.1, the estimation results indicate that using index

data only may lead to erroneous estimated dynamics, and consequently option prices. Indeed, Table

4 reports RMSREs for S&P 500 options which are four to five times larger when using D1 instead

of D2. For VIX options, the RMSREs are three to four times larger when using D1 instead of D3.

The mispricing is particularly strong for the SVJ2 and SV2 models, which have more parameters

and require more information for their estimation. Hence, we conclude that one should refrain from

estimating an option pricing model using S&P 500 and VIX index data only.
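To make concrete what filtering from index data alone involves, the following is a minimal sketch of a bootstrap particle filter (in the spirit of Gordon, Salmond, and Smith, 1993) applied to daily returns only. It is not the estimation procedure of this paper: it assumes a one-factor square-root variance process without jumps, an Euler discretization, no leverage correlation, and known parameters, and all names and values are hypothetical. Including options would require evaluating option prices for every particle at every date, which is the source of the computational burden mentioned above.

```python
import math
import random


def step_variance(v, kappa, theta, sigma, dt, rng):
    """One Euler step of a square-root variance process, floored to stay positive."""
    shock = sigma * math.sqrt(v * dt) * rng.gauss(0.0, 1.0)
    return max(v + kappa * (theta - v) * dt + shock, 1e-8)


def simulate_returns(n_days, kappa, theta, sigma, mu, dt, rng):
    """Simulate daily log returns from the assumed model (test data only)."""
    v, returns = theta, []
    for _ in range(n_days):
        v = step_variance(v, kappa, theta, sigma, dt, rng)
        returns.append((mu - 0.5 * v) * dt + math.sqrt(v * dt) * rng.gauss(0.0, 1.0))
    return returns


def bootstrap_filter(returns, kappa, theta, sigma, mu, dt, n_particles, rng):
    """Return the log-likelihood estimate and the filtered mean of the variance."""
    particles = [theta] * n_particles
    loglik, filtered = 0.0, []
    for y in returns:
        # 1. Propagate each particle through the state dynamics.
        particles = [step_variance(v, kappa, theta, sigma, dt, rng) for v in particles]
        # 2. Weight by the Gaussian density of the observed return given the variance.
        weights = [
            math.exp(-0.5 * (y - (mu - 0.5 * v) * dt) ** 2 / (v * dt))
            / math.sqrt(2.0 * math.pi * v * dt)
            for v in particles
        ]
        total = sum(weights)
        loglik += math.log(total / n_particles)
        filtered.append(sum(w * v for w, v in zip(weights, particles)) / total)
        # 3. Multinomial resampling proportional to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return loglik, filtered


rng = random.Random(0)
params = dict(kappa=3.0, theta=0.04, sigma=0.3, mu=0.05, dt=1.0 / 252.0)
returns = simulate_returns(250, rng=rng, **params)
loglik, filtered = bootstrap_filter(returns, n_particles=500, rng=rng, **params)
print(loglik, filtered[-1])
```

Because daily returns are only weakly informative about the latent variance, the filtered path from such a returns-only filter is typically noisy, which is consistent with the large option pricing errors reported for D1.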

Although VIX and S&P 500 options provide different information on the trajectory of volatility in

times of market turmoil (Figure 3, Panels C and D), filtered jumps from estimating SVJ2 with D2

and D3 present similar patterns, with slightly fewer jumps filtered from S&P 500 options. In fall 2008,

some jumps around 10% occur with probability larger than 10%. Finally, slightly smaller jumps are

detected when the Eurozone sovereign debt crisis emerges in May 2010.

Despite these similarities, using D2 as estimation dataset generates for two-factor models RMSREs

on S&P 500 options that are much lower (reduced by a factor of approximately 4 to 5) than the

ones from estimating with D3, see Table 4. The ratio is much smaller for the SVJ model, partly

because its estimation with D2 already yields rather bad results. Estimating the SVJ2 and SVJ

models to VIX options leads to strongly mispriced OTM S&P 500 calls and long-term options, which

indicates that VIX options contain very little information on the right tail of the S&P 500 returns’

distribution. Concerning deep OTM puts on the S&P 500, it is striking to see that the estimation

28 See, e.g., Duan and Yeh (2010, 2011).


using D3 outperforms the one using D2 out-of-sample. This observation indicates that VIX options

provide valuable information on the left tail of the returns’ distribution. Conversely, we note that

the RMSREs of VIX options using D2 are about one and a half times those using D3, both in-sample

and out-of-sample. This observation indicates that S&P 500 options do not span the information

contained in VIX options.

5.5 Variance Risk Premium

Following Bollerslev and Todorov (2011), we define the annualized integrated variance risk premium

(IVRP) as:

\[
\mathrm{IVRP}(t,T) = \frac{1}{T-t}\left[ E^{P}_{t}\left(QV_{[t,T]}\right) - E^{Q}_{t}\left(QV_{[t,T]}\right) \right],
\]

where $QV_{[t,T]}$ denotes the quadratic variation of the log price process, which is the sum of the

integrated variance of the returns and the squared jumps in the time interval considered:

\[
QV_{[t,T]} = \int_{t}^{T} v_{s}\,ds + \sum_{t \le s \le T} \left(Z^{Y}_{s}\right)^{2} \Delta N^{Yv}_{s}.
\]

The IVRP represents the expected payoff when buying a variance swap at time t with maturity T .
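As a simplified illustration of this definition, consider a one-factor square-root variance process without jumps, for which the expected average variance over a horizon has a well-known closed form. This is a deliberate simplification of the models estimated here (no central tendency, no jumps), and the parameter values below are hypothetical.

```python
import math


def expected_avg_variance(v_t, kappa, theta, tau):
    """(1/tau) * E_t[ integral of v_s over [t, t+tau] ] for dv = kappa*(theta - v)dt + ..."""
    x = kappa * tau
    if x == 0.0:
        return v_t  # no mean reversion: the expected variance stays at v_t
    weight = -math.expm1(-x) / x  # (1 - exp(-kappa*tau)) / (kappa*tau)
    return theta + (v_t - theta) * weight


def ivrp_no_jumps(v_t, tau, kappa_p, theta_p, kappa_q, theta_q):
    """Annualized integrated variance risk premium: P-expectation minus Q-expectation."""
    return (expected_avg_variance(v_t, kappa_p, theta_p, tau)
            - expected_avg_variance(v_t, kappa_q, theta_q, tau))


# Hypothetical parameters: slower mean reversion to a higher level under Q.
premium = ivrp_no_jumps(v_t=0.04, tau=1.0 / 12.0,
                        kappa_p=5.0, theta_p=0.03, kappa_q=3.0, theta_q=0.05)
print(premium)
```

With these inputs the premium is negative: the risk-neutral expectation of future variance exceeds the historical one, so the buyer of the variance swap pays more than the variance expected to be realized. At long horizons the premium in this simplified model converges to the difference of the two long-run means.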

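We can further decompose the IVRP into a continuous and a discontinuous part as

\[
\mathrm{IVRP}(t,T) = \mathrm{IVRP}^{c}(t,T) + \mathrm{IVRP}^{d}(t,T)
\]

with

\[
\mathrm{IVRP}^{c}(t,T) = \frac{1}{T-t}\left[ E^{P}_{t}\left(\int_{t}^{T} v_{s-}\,ds\right) - E^{Q}_{t}\left(\int_{t}^{T} v_{s-}\,ds\right) \right],
\]

\[
\mathrm{IVRP}^{d}(t,T) = \frac{1}{T-t}\left[ E^{P}_{t}\left(\sum_{t \le s \le T} \left(Z^{Y}_{s}\right)^{2} \Delta N_{s}\right) - E^{Q}_{t}\left(\sum_{t \le s \le T} \left(Z^{Y}_{s}\right)^{2} \Delta N_{s}\right) \right].
\]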


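Each part can also be decomposed into a contribution of $m_t$ and another one of $v_t$ as follows:

\[
\mathrm{IVRP}^{c}(t,T) = \left( \left(A^{P} - A^{Q}\right) v_{t} + \left(B^{P}\,\frac{\kappa^{Q}_{v}}{\kappa^{P}_{v}} - B^{Q}\right) m_{t} + G^{P} - G^{Q} \right)(T-t),
\]

\[
\mathrm{IVRP}^{d}(t,T) = \mathrm{IVRP}^{d}(t) = \left( \lambda^{Yv}_{1} v_{t} + \lambda^{Yv}_{2} m_{t} + \lambda^{Yv}_{0} \right) \left[ \left(\sigma^{P}_{Y}\right)^{2} + \left(\mu^{P}_{Y}\right)^{2} - \left(\sigma^{Q}_{Y}\right)^{2} - \left(\mu^{Q}_{Y}\right)^{2} \right].
\]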

The coefficients under the appropriate measure are given in Appendix A.
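The discontinuous component above is the product of an affine jump intensity and the difference between the second moments of the return jump size under the two measures. The sketch below evaluates that expression directly; the function names and all numerical values are hypothetical and are not estimates from this paper.

```python
def jump_second_moment(mu_y, sigma_y):
    """E[Z^2] for a jump size with mean mu_y and standard deviation sigma_y."""
    return sigma_y ** 2 + mu_y ** 2


def ivrp_discontinuous(v_t, m_t, lam0, lam1, lam2, mu_p, sigma_p, mu_q, sigma_q):
    """Jump intensity (affine in v_t and m_t) times the P-minus-Q second-moment gap."""
    intensity = lam0 + lam1 * v_t + lam2 * m_t
    gap = jump_second_moment(mu_p, sigma_p) - jump_second_moment(mu_q, sigma_q)
    return intensity * gap


# Hypothetical inputs: jumps are larger and more negative on average under Q.
calm = ivrp_discontinuous(v_t=0.04, m_t=0.03, lam0=0.5, lam1=10.0, lam2=5.0,
                          mu_p=-0.01, sigma_p=0.03, mu_q=-0.05, sigma_q=0.05)
stressed = ivrp_discontinuous(v_t=0.40, m_t=0.03, lam0=0.5, lam1=10.0, lam2=5.0,
                              mu_p=-0.01, sigma_p=0.03, mu_q=-0.05, sigma_q=0.05)
print(calm, stressed)
```

The expression does not depend on the horizon, and the state variables enter only through the intensity. A spike in $v_t$ therefore scales the discontinuous premium proportionally, while $m_t$ and the constant term set its level when $v_t$ is low.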

In Figure 7, Panel A, we analyze the evolution of the IVRP during our sample period. In line with

the literature, we find that the IVRP is strongly negative, with a sharp drop at the end of 2008.29 At the beginning of

2009, the IVRP steadily shrinks back but, compared to the pre-crisis period, remains at higher levels

in absolute terms. In Panel B of Figure 7, we decompose the instantaneous variance risk premium

into a discontinuous and a continuous part. The discontinuous component of the IVRP dominates

for shorter maturities, indicating that the inclusion of jumps helps represent the shorter end of the

variance premium’s term structure. Indeed, as jumps in the variance are constrained to be positive,

a jump in the variance process will increase the variance premium’s term structure especially at

the short end. At the long end, the effect of a positive jump in the variance process is likely to be

dampened by the reversion of the variance to its long-term mean.


Finally, we find that the contribution of the stochastic central tendency in the continuous part of the

IVRP is small for short maturities but no longer negligible for maturities larger than three months.

In quiet times, the contribution of $m_t$ sets the level of the continuous IVRP, while the contribution of

$v_t$ takes over when the variance peaks, see Panel C of Figure 7. The central tendency $m_t$ also has a

non-negligible impact on the discontinuous part of the IVRP and determines a large part of its level

in quiet times, see Panel D of Figure 7. During the crisis, the peak of the discontinuous IVRP is then

driven by the shorter-term variance factor $v_t$. Hence, both jumps and stochastic central tendency

play a crucial role for the IVRP. While jumps help to represent the short end of the variance term

structure, it is mostly the stochastic central tendency $m_t$ that determines the IVRP during calm

market periods. In times of financial crises, the dominant role is played by the variance process $v_t$.

29 This result holds over all datasets and models. However, we only plot the results for SVJ2 using D4.


6 Conclusion

In this paper we estimate a series of affine models using a time series of S&P 500 and VIX levels as

well as option prices on both indices. Our estimation is based on particle filter methods. To extract

as much information about extreme events as possible, we use S&P 500 and VIX options with a

unique wide range of moneynesses. Instead of a step-wise estimation, we depart from most of the

literature and estimate the historical P-parameters and the risk-neutral Q-parameters jointly, in one

single step.

We argue that using a model with a stochastic central tendency and jumps in returns and volatility

provides significant improvements for pricing S&P 500 and VIX options simultaneously, both in-sample and

out-of-sample. Adding a stochastic central tendency helps to better represent the tails of the returns’

distribution as well as the term structure of S&P 500 and VIX option prices, while jumps allow for

more flexibility to match the right tail of the variance distribution as well as short-dated options.

We investigate the information contained in the underlying levels and in the options on both markets.

We find that the VIX index does not provide an accurate representation of the information contained

in S&P 500 options. Furthermore, the information contained in S&P 500 derivatives does not span the

information contained in VIX derivatives and vice-versa. It is therefore crucial to include underlyings

as well as derivatives on both markets to estimate a model and account for the cross section of

instruments. Moreover, we emphasize the importance of jumps and of a stochastic central tendency in

representing the term structure and level of variance risk premia.

Finally, we find that even the model with stochastic central tendency and jumps is not able to fully

reproduce the skewness and kurtosis of the underlying S&P 500 index in times of market turmoil. We

conjecture that the fitting of these periods is restricted by the affine nature of our modeling framework.

However, departing from this framework would generate additional computational complexity

for particle filter estimation. We leave this challenging avenue for future research.


References

Aït-Sahalia, Y., M. Karaman, and L. Mancini, 2012, “The Term Structure of Variance Swaps, Risk

Premia and the Expectation Hypothesis,” working paper, Swiss Finance Institute Working Paper.

Aït-Sahalia, Y., and R. Kimmel, 2007, “Maximum Likelihood Estimation of Stochastic Volatility

Models,” Journal of Financial Economics, 83(2), 413–452.

Aït-Sahalia, Y., and A. Lo, 1998, “Nonparametric Estimation of State-Price Densities Implicit in

Financial Asset Prices,” Journal of Finance, 53(2), 499–547.

Alizadeh, S., M. Brandt, and M. Diebold, 2002, “Range-based estimation of stochastic volatility

models,” Journal of Finance, 57(3), 1047–1091.

Andersen, T. G., L. Benzoni, and J. Lund, 2002, “An Empirical Investigation of Continuous-Time

Equity Return Models,” Journal of Finance, 57(3), 1239–1284.

Andrews, D., 1991, “Heteroskedasticity and autocorrelation consistent covariance matrix estimation,”

Econometrica, 59(3), 817–858.

Bakshi, G., N. Kapadia, and D. Madan, 2003, “Stock Return Characteristics, Skew Laws, and the

Differential Pricing of Individual Equity Options,” Review of Financial Studies, 16(1), 101–143.

Bates, D. S., 1996, “Jumps and Stochastic Volatility: Exchange Rate Processes Implicit in Deutsche

Mark Options,” Review of Financial Studies, 9(1), 69–107.

Bates, D. S., 2000, “Post-’87 Crash Fears in the S&P 500 Futures Options,” Journal of Econometrics,

94(1-2), 181–238.

Bates, D. S., 2012, “U.S. stock market crash risk, 1926-2010,” Journal of Financial Economics, 105(2),

229–259.

Bayer, C., J. Gatheral, and M. Karlsmark, 2013, “Fast Ninomiya-Victoir calibration of the Double-

Mean-Reverting Model,” Quantitative Finance, 13(11), 1813–1829.

Bollerslev, T., and V. Todorov, 2011, “Tails, Fears and Risk Premia,” Journal of Finance, 66(6),

2165–2211.


Breeden, D., and R. H. Litzenberger, 1978, “Prices of State-Contingent Claims Implicit in Option

Prices,” Journal of Business, 51(4), 621–651.

Broadie, M., M. Chernov, and M. Johannes, 2007, “Model Specification and Risk Premia: Evidence

from Futures Options,” Journal of Finance, 62(3), 1453–1490.

Carr, P., and D. B. Madan, 1999, “Option valuation using the fast Fourier transform,” Journal of

Computational Finance, 2(4), 1–18.

Chernov, M., A. R. Gallant, E. Ghysels, and G. T. Tauchen, 2003, “Alternative Models of Stock

Price Dynamics,” Journal of Econometrics, 116(1-2), 225–257.

Christoffersen, P., S. Heston, and K. Jacobs, 2009, “The Shape and Term Structure of the Index

Option Smirk: Why Multifactor Stochastic Volatility Models Work So Well,” Management Science,

55(12), 1914–1932.

Christoffersen, P., and K. Jacobs, 2004, “The importance of the loss function in option valuation,”

Journal of Financial Economics, 72(2), 291–318.

Christoffersen, P., K. Jacobs, and K. Mimouni, 2010, “Models for S&P 500 Dynamics: Evidence

from Realized Volatility, Daily Returns and Options Prices,” Review of Financial Studies, 23(8),

3141–3189.

Chung, S., W. Tsai, Y. Wang, and P. Wenig, 2011, “The Information Content of the S&P 500 Index

and VIX Options on the Dynamics of the S&P 500 Index,” The Journal of Futures Markets, 31(12),

1170–1201.

Cont, R., and T. Kokholm, 2011, “A Consistent Pricing Model for Index Options and Volatility

Derivatives,” Mathematical Finance, 3(2), 248–274.

Diebold, F. X., and R. S. Mariano, 1995, “Comparing predictive accuracy,” Journal of Business and

Economic Statistics, 13(3), 253–265.

Duan, J.-C., and C.-Y. Yeh, 2010, “Jump and volatility risk premiums implied by VIX,” Journal of

Economic Dynamics and Control, 34(11), 2232–2244.


Duan, J.-C., and C.-Y. Yeh, 2011, “Price and Volatility Dynamics Implied by the VIX Term Structure,” working paper,

NUS RMI Working Paper No. 11/05.

Duffie, D., J. Pan, and K. J. Singleton, 2000, “Transform Analysis and Asset Pricing for Affine

Jump-Diffusions,” Econometrica, 68(6), 1343–1376.

Durham, G. B., 2013, “Risk-neutral Modeling with Affine and Nonaffine Models,” Journal of Financial

Econometrics, 11(4), 650–681.

Egloff, D., M. Leippold, and L. Wu, 2010, “Valuation and Optimal Investing in Variance Swaps,”

Journal of Financial and Quantitative Analysis, 45(5), 1279–1310.

Eraker, B., 2004, “Do Stock Prices and Volatility Jump? Reconciling Evidence from Spot and Option

Prices,” Journal of Finance, 59(3), 1367–1404.

Fang, F., and C. Oosterlee, 2008, “A novel pricing method for European options based on Fourier-

cosine series expansions,” SIAM Journal on Scientific Computing, 31(2), 826–848.

Ferriani, F., and S. Pastorello, 2012, “Estimating and Testing Non-Affine Option Pricing Models

With a Large Unbalanced Panel of Options,” The Econometrics Journal, 15(2), 171–203.

Gatheral, J., 2008, “Consistent Modeling of SPX and VIX options,” The Fifth World Congress of

the Bachelier Finance Society, London, July 2008.

Gordon, N. J., D. J. Salmond, and A. F. M. Smith, 1993, “Novel approach to nonlinear/non-Gaussian

Bayesian state estimation,” Radar and Signal Processing, IEE Proceedings F, 140(2), 107–113.

Hansen, N., and A. Ostermeier, 1996, “Adapting arbitrary normal mutation distributions in evolution

strategies: The covariance matrix adaptation,” Proceedings of the 1996 IEEE Conference on

Evolutionary Computation (ICEC 96), pp. 312–317.

Jacod, J., and V. Todorov, 2010, “Do Price and Volatility Jump Together?,” Annals of Applied

Probability, 20(4), 1425–1469.

Jiang, G. J., and Y. S. Tian, 2007, “Extracting Model-Free Volatility from Option Prices: An

Examination of the VIX Index,” The Journal of Derivatives, 14(3), 35–60.


Johannes, M. S., N. G. Polson, and J. R. Stroud, 2009, “Optimal Filtering of Jump Diffusions:

Extracting Latent States from Asset Prices,” Review of Financial Studies, 22(7), 2759–2799.

Jones, C. S., 2003, “The Dynamics of Stochastic Volatility: Evidence From Underlying and Options

Markets,” Journal of Econometrics, 116(1-2), 181–224.

Kaeck, A., and C. Alexander, 2012, “Volatility dynamics for the S&P 500: Further evidence from

non-affine, multi-factor jump diffusions,” Journal of Banking & Finance, 36(11), 3110–3121.

Kloeden, P. E., and E. Platen, 1992, Numerical Solution of Stochastic Differential Equations.

Springer-Verlag, Berlin, Germany.

Lindström, E., J. Ströjby, M. Brodén, M. Wiktorsson, and J. Holst, 2008, “Sequential calibration of

options,” Computational Statistics & Data Analysis, 52(6), 2877–2891.

Mencía, J., and E. Sentana, 2013, “Valuation of VIX derivatives,” Journal of Financial Economics,

108(2), 367–391.

Newey, W. K., and K. D. West, 1987, “A Simple, Positive Semi-definite, Heteroskedasticity and

Autocorrelation Consistent Covariance Matrix,” Econometrica, 55(3), 703–708.

Pan, J., 2002, “The Jump-Risk Premia Implicit in Options: Evidence from an Integrated Time-Series

Study,” Journal of Financial Economics, 63(1), 3–50.

Papanicolaou, A., and R. Sircar, 2013, “A regime-switching Heston model for VIX and S&P 500

implied volatilities,” Quantitative Finance, Forthcoming.

Pitt, M. K., 2002, “Smooth particle filters for likelihood evaluation and maximisation,” working

paper 651, The Warwick Economics Research Paper Series.

Pitt, M. K., and N. Shephard, 1999, “Filtering via Simulation: Auxiliary Particle Filters,” Journal

of the American Statistical Association, 94(446), 590–599.

Rebonato, R., and T. Cardoso, 2004, “Unconstrained Fitting of Implied Volatility Surfaces Using a

Mixture of Normals,” Journal of Risk, 7(1), 55–74.


Ruijter, M., M. Versteegh, and C. W. Oosterlee, 2013, “On the Application of Spectral Filters in a

Fourier Option Pricing Technique,” working paper, SSRN.

Sepp, A., 2008a, “Pricing Options on Realized Variance in the Heston Model with Jumps in Returns

and Volatility,” Journal of Computational Finance, 11(4), 33–70.

Sepp, A., 2008b, “VIX Option Pricing in a Jump-Diffusion Model,” Risk Magazine, pp. 84–89.

Song, Z., and D. Xiu, 2012, “A Tale of Two Option Markets: Pricing Kernels and Volatility Risk,”

working paper, Chicago Booth Research Paper No 12-10 - Fama-Miller Working Paper.

Storn, R., 1996, “On the usage of differential evolution for function optimization,” in Biennial

Conference of the North American Fuzzy Information Processing Society (NAFIPS), pp. 519–523.

Todorov, V., 2010, “Variance Risk Premium Dynamics: The Role of Jumps,” Review of Financial

Studies, 23(1), 345–383.

Todorov, V., and G. Tauchen, 2011, “Volatility Jumps,” Journal of Business and Economic Statistics,

29(3), 356–371.


Appendix

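A Affine dependence of the VIX$^2$ on $v_t$ and $m_t$

The expressions for the coefficients $\alpha_{\mathrm{VIX}^2}$, $\beta_{\mathrm{VIX}^2}$ and $\gamma_{\mathrm{VIX}^2}$ in Proposition 2.1 are given by:

\begin{align}
\alpha_{\mathrm{VIX}^2} &= \left(1 + 2\lambda^{Yv}_{1} C\right) A, \tag{A.1} \\
\beta_{\mathrm{VIX}^2} &= \left(1 + 2\lambda^{Yv}_{1} C\right) B + \left(2\lambda^{Yv}_{2} C\right) \hat{A}, \tag{A.2} \\
\gamma_{\mathrm{VIX}^2} &= 2\lambda^{Yv}_{0} C + \left(1 + 2\lambda^{Yv}_{1} C\right) G + \left(2\lambda^{Yv}_{2} C\right) \hat{B}, \tag{A.3}
\end{align}

where $A = \frac{1}{a_v \tau_{\mathrm{VIX}}}\left(e^{a_v \tau_{\mathrm{VIX}}} - 1\right)$ and $A = 1$ if $a_v = 0$, and $C := \left(\theta^{Q}_{Z}(1,0,0) - \frac{\partial \theta^{Q}_{Z}}{\partial \phi_{Y}}(0,0,0) - 1\right)$. We

can calculate the remaining coefficients as: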

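\[
B =
\begin{cases}
\dfrac{1}{\tau_{\mathrm{VIX}}}\,\dfrac{h_v}{(a_m - a_v)} \left[ \left(\dfrac{e^{a_m \tau_{\mathrm{VIX}}} - 1}{a_m}\right) - \left(\dfrac{e^{a_v \tau_{\mathrm{VIX}}} - 1}{a_v}\right) \right], & \text{if } a_m, a_v \neq 0, \\[2ex]
\dfrac{h_v}{a_v} \left( e^{a_v \tau_{\mathrm{VIX}}} - \dfrac{1}{a_v \tau_{\mathrm{VIX}}}\left(e^{a_v \tau_{\mathrm{VIX}}} - 1\right) \right), & \text{if } a_v = a_m \neq 0, \\[2ex]
\dfrac{h_v}{a_m} \left( \dfrac{1}{a_m \tau_{\mathrm{VIX}}}\left(e^{a_m \tau_{\mathrm{VIX}}} - 1\right) - 1 \right), & \text{if } a_m \neq a_v = 0, \\[2ex]
\dfrac{h_v}{a_v} \left( \dfrac{1}{a_v \tau_{\mathrm{VIX}}}\left(e^{a_v \tau_{\mathrm{VIX}}} - 1\right) - 1 \right), & \text{if } a_v \neq a_m = 0, \\[2ex]
\dfrac{1}{2}\,\tau_{\mathrm{VIX}}\, h_v, & \text{if } a_m = a_v = 0,
\end{cases}
\tag{A.4}
\]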
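The case distinction in (A.4) can be read as one formula together with its limits when $a_m$ or $a_v$ vanish or coincide. The following sketch implements the five cases as printed and checks numerically that neighbouring cases agree. The function names and the input values are hypothetical; $a_m$, $a_v$ and $h_v$ are the model-dependent constants appearing in (A.4).

```python
import math


def phi(a, tau):
    """(exp(a*tau) - 1) / (a*tau), with its limit 1 at a = 0."""
    x = a * tau
    return 1.0 if x == 0.0 else math.expm1(x) / x


def coeff_b(a_m, a_v, h_v, tau):
    """Coefficient B of (A.4), case by case."""
    if a_m == 0.0 and a_v == 0.0:
        return 0.5 * tau * h_v
    if a_v == 0.0:  # a_m != a_v = 0
        return h_v / a_m * (phi(a_m, tau) - 1.0)
    if a_m == 0.0:  # a_v != a_m = 0
        return h_v / a_v * (phi(a_v, tau) - 1.0)
    if a_m == a_v:  # a_v = a_m != 0
        return h_v / a_v * (math.exp(a_v * tau) - phi(a_v, tau))
    # General case: (1/tau) * h_v/(a_m - a_v) * [(e^{a_m tau}-1)/a_m - (e^{a_v tau}-1)/a_v]
    return h_v / (a_m - a_v) * (phi(a_m, tau) - phi(a_v, tau))


tau_vix, h_v = 30.0 / 365.0, 1.5  # hypothetical values
general = coeff_b(-0.5, -2.0, h_v, tau_vix)
print(general)
```

Writing every case in terms of the single helper `phi` makes the limits transparent and avoids dividing by a quantity close to zero in the special cases.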

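\[
G =
\begin{cases}
\dfrac{b_v}{a_v} \left[ \left(\dfrac{e^{a_v \tau_{\mathrm{VIX}}} - 1}{a_v \tau_{\mathrm{VIX}}}\right) - 1 \right] - b_m B, & \text{if } a_v \neq 0, \\[2ex]
\dfrac{b_v}{a} \left[ \left(\dfrac{e^{a \tau_{\mathrm{VIX}}} - 1}{a \tau_{\mathrm{VIX}}}\right) - 1 \right] - b_m B, & \text{if } a_v = a_m \neq 0, \\[2ex]
\dfrac{1}{2}\, b_v \tau_{\mathrm{VIX}} - b_m B, & \text{if } a_m \neq a_v = 0, \\[2ex]
\dfrac{c_m}{a_v} \left( B - \dfrac{1}{2} h_v \tau_{\mathrm{VIX}} \right) + \dfrac{1}{a_v}\, \dfrac{\partial \theta^{Q}_{Z}}{\partial \phi_{v}}(0,0,0)\, \lambda^{Yv}_{0} \left[ \left(\dfrac{e^{a_v \tau_{\mathrm{VIX}}} - 1}{a_v \tau_{\mathrm{VIX}}}\right) - 1 \right], & \text{if } a_v \neq a_m = 0, \\[2ex]
\dfrac{1}{2}\,\tau_{\mathrm{VIX}}\, \big[\, \dots
\end{cases}
\]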
