Download - Realized Semicovariances - Duke Universitypublic.econ.duke.edu/~boller/Papers/Realized_Semicovariances.pdf · Figure2: SignedReturn-PairsforDJIAStocks.-1.5 -1.0 -0.5 0.0 0.5 1.0 1.5-1.5-1.0-0.5

Realized Semicovariances

This version: 17 February, 2020

Tim Bollersleva,∗, Jia Lib, Andrew J. Pattonb, Rogier Quaedvliegc

aDepartment of Economics, Duke University, NBER and CREATESbDepartment of Economics, Duke University

cErasmus School of Economics, Erasmus University Rotterdam

Abstract

We propose a decomposition of the realized covariance matrix into components based on

the signs of the underlying high-frequency returns, and we derive the asymptotic prop-

erties of the resulting realized semicovariance measures as the sampling interval goes

to zero. The first-order asymptotic results highlight how the same-sign and mixed-sign

components load differently on economic information related to stochastic correlation and

jumps. The second-order asymptotic results reveal the structure underlying the same-sign

semicovariances, as manifested in the form of co-drifting and dynamic “leverage” effects.

In line with this anatomy, we use data on a large cross-section of individual stocks to

empirically document distinct dynamic dependencies in the different realized semicovari-

ance components. We show that the accuracy of portfolio return variance forecasts may

be significantly improved by exploiting the information in realized semicovariances.

Keywords: High-frequency data; realized variances; semicovariances; co-jumps;

volatility forecasting.

JEL: C22, C51, C53, C58

�We would like to thank the Co-Editor (Aviv Nevo) and four anonymous referees for their helpfulcomments, which greatly improved the paper. We would also like to thank conference and seminar partic-ipants at Aarhus, Boston, Ca’Foscari Venice, Chicago, Cologne, Gerzensee, ITAM, Konstanz, Lancaster,Padova, Pennsylvania, PUC Rio, QUT, Rutgers, and Toulouse for helpful comments and suggestions.Bingzhi Zhao kindly provided us with the cleaned high-frequency data underlying our empirical investi-gations. Patton and Quaedvlieg gratefully acknowledge support from, respectively, Australian ResearchCouncil Discovery Project 180104120 and Netherlands Organisation for Scientific Research Grant 451-17-009. This paper subsumes the 2017 paper “Realized Semicovariances: Looking for Signs of DirectionInside the Covariance Matrix” by Bollerslev, Patton, and Quaedvlieg.

∗Corresponding author: Department of Economics, Duke University, 213 Social Sciences Building, Box90097, Durham, NC 27708-0097, United States. Email: [email protected]. A Supplemental Appendix tothe paper is available at http://www.econ.duke.edu/∼boller/research.html.

1. Introduction

The covariance matrix of asset returns arguably constitutes the most crucial input

for asset pricing, portfolio and risk management decisions. Correspondingly, there is a

substantial literature devoted to the estimation, modeling, and prediction of covariance

matrices dating back more than half a century (e.g., Kendall (1953), Elton and Gruber

(1973), and Bauwens, Laurent, and Rombouts (2006)). Meanwhile, a rapidly-growing

recent literature has forcefully advocated the use of high-frequency intraday data to more

reliably estimate lower-frequency return covariance matrices (e.g., Andersen, Bollerlsev,

Diebold, and Labys (2003), Barndorff-Nielsen and Shephard (2004a), and Barndorff-

Nielsen, Hansen, Lunde, and Shephard (2011)).

Set against this background, we propose a decomposition of the realized covariance

matrix into three realized semicovariance matrix components dictated by the signs of the

underlying high-frequency returns. The realized semicovariance matrices may be seen as

a high-frequency multivariate extension of the semivariances originally proposed in the

finance literature (e.g., Markowitz (1959), Mao (1970), Hogan and Warren (1972, 1974),

and Fishburn (1977)). Our high-frequency theoretical analysis is inspired by and extends

the pioneering work of Barndorff-Nielsen, Kinnebrock, and Shephard (2010).

To fix ideas, let Xt = (X1,t, . . . , Xd,t)� denote a d-dimensional log-price process, sam-

pled on a regular time grid {iΔn : 0 ≤ i ≤ [T/Δn]} over some fixed time span T > 0.

Let the ith return of X be denoted by ΔniX ≡ XiΔn −X(i−1)Δn . The realized covariance

matrix (Barndorff-Nielsen and Shephard (2004a)) is then defined as:

C ≡[T/Δn]∑i=1

(ΔniX) (Δn

iX)� . (1.1)

If we let p (x) ≡ max {x, 0} and n (x) ≡ min {x, 0} denote the component-wise positive

and negative elements of the real vector x, the corresponding “positive,” “negative,” and

“mixed” realized semicovariance matrices are then simply defined as:

P ≡[T/Δn]∑i=1

p (ΔniX) p (Δn

iX)� , N ≡[T/Δn]∑i=1

n (ΔniX)n (Δn

iX)� ,

M ≡[T/Δn]∑i=1

(p (Δn

iX)n (ΔniX)� + n (Δn

iX) p (ΔniX)�

).

(1.2)

Note that C = P + N + M for any sampling frequency Δn. The concordant realized

semicovariance matrices, P and N , are defined as sums of vector outer-products and thus

are positive semidefinite. By contrast, the mixed semicovariance matrix, M , has diagonal

elements that are identically zero, and thus is necessarily indefinite.

As an illustration of the different dynamic dependencies and information conveyed by

2

Figure 1: Realized Covariance Decomposition

Realized Covariance Concordant Semicovariance Mixed Semicovariance

1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014

05

1015

(Sem

i)Cov

aria

nce

Time


Note: The figure plots the time series of the concordant semicovariance (P + N), the

mixed semicovariance (M) and the realized covariance (C). Each series is constructed asthe moving average of the relevant daily realized measures averaged across 500 randompairs of S&P 500 stocks.

the realized semicovariances, Figure 1 plots the daily realized covariance averaged across

500 randomly-selected pairs of S&P 500 stocks, together with its concordant (P + N)

and mixed (M) semicovariance components.1 The mixed component is, of course, always

negative, while the concordant component is always positive. The two components are

typically similar in magnitude during “normal” time periods, while in periods of high

volatility the concordant component increases substantially more than the mixed compo-

nent declines, in line with the widely-held belief that during periods of financial market

stress correlations between financial assets tend to increase. As such,the (total) realized

covariance is largely determined by the concordant realized semicovariance components

in these “crisis” periods.

To help understand these empirical features, consider a simple setting in which the

vector log-price process Xt is generated by a Brownian motion with constant drift b, unit

volatility, and constant correlation ρ. Although stylized, this simple model captures the

central force in the first-order asymptotic behavior of the semicovariance estimators in

the no-jump setting; we provide a more general analysis below. Normalizing T = 1, by

the law of large numbers the probability limits (as Δn → 0) of the (j, k) off-diagonal

1Each day we randomly draw 500 pairs of assets and compute C ≡ (1/500)∑

j �=k Cjk. P , N and

M are computed similarly. A detailed description of the data are provided in Section 4 below. Toavoid cluttering the figure, we sum P and N into a single concordant component, and smooth the dailymeasures using a day t− 25 to day t+ 25 moving average.

3

Figure 2: Signed Return-Pairs for DJIA Stocks.

-1.5 -1.0 -0.5 0.0 0.5 1.0 1.5-1.5

-1.0

-0.5

0.0

0.5

1.0

1.5 September 18, 2013

Return of Asset k

Ret

urn

of A

sset

j

-0.4 -0.2 0.0 0.2 0.4-0.4

-0.2

0.0

0.2

0.4 February 25, 2013

Return of Asset k

Note: The figure shows a scatter plot of the one-minute returns of each pair of the 30Dow Jones Industrial Average stocks on two days in 2013. The left panel presents a daywith an FOMC announcement that led to positive stock price jumps for many stocks.The right panel presents a day with steady downward price moves for many stocks.

elements of the realized semicovariance matrices are then given by

plim Pjk = plim Njk = ψ(ρ), plim Mjk = −2ψ(−ρ), (1.3)

where

ψ(ρ) = (2π)−1(ρ arccos (−ρ) +

√1− ρ2

), (1.4)

corresponds to E[Z1Z21{Z1<0,Z2<0}] for (Z1, Z2) bivariate standard normally distributed

with correlation ρ. As these expressions illustrate, the limiting values of the semicovari-

ance components depend crucially on the value of ρ. As ρ increases to one, the limiting

value of the concordant component P + N also approaches one, while the mixed com-

ponent M approaches zero, and vice versa when ρ decreases to −1. This is consistent

with the empirical observation from Figure 1 that the concordant component accounts

for most of the covariance in periods of market stress, which are generally believed to be

accompanied by increased positive correlations.

This simple diffusive setting highlights the potentially different information conveyed

by the concordant and the mixed semicovariance components. It does not, however,

reveal any differences between the P and N components as they have the same limits in

this stylized setting. This is at odds with the intuition that these measures carry distinct

economic information about good and bad news over each day. To illustrate, Figure 2

presents high-frequency returns on each of the 30 Dow Jones Industrial Average (DJIA)

stocks on two different days. On September 18, 2013 (left panel), the Federal Reserve

announced that it would not taper its asset-purchasing program, in contrast to what the4

market had been anticipating, and individual stocks responded with positive jumps at

the announcement time, resulting in much larger estimates of P than N . By contrast, on

February 25, 2013 (right panel), the DJIA drifted down by 1.5% over the course of the

day amid concerns, according to market anecdotes,2 about political uncertainty in Italy,

and generated much larger estimates of N than P . Hence, the empirical estimates of P

and N can indeed differ depending on the directional content of the news, and whether

it manifests in the form of price jumps or price drifts.

Motivated by these empirical observations, in Section 2 we derive first- and second-

order asymptotics for the positive and the negative semicovariance estimators in a general

Ito semimartingale setting, focusing particularly on a deeper understanding of the infor-

mation they convey. Extending the work of Barndorff-Nielsen, Kinnebrock, and Shephard

(2010), our limit theory identifies three channels through which P and N may differ: di-

rectional “co-jumps,” a type of “co-drifting,” and a specific form of “dynamic leverage

effect.” The co-jump channel manifests straightforwardly in the first-order asymptotics.

In particular, assuming that the vector log-price process in the aforementioned example

is subject to finite activity jumps, it follows readily that

plim Pjk = ψ(ρ) +∑

0<s≤1

p (ΔXj,s) p (ΔXk,s) ,

plim Njk = ψ(ρ) +∑

0<s≤1

n (ΔXj,s)n (ΔXk,s) ,

where ΔXj,s denotes the jump of the jth component of X at time s. By comparison, co-

drifting and dynamic leverage effects manifest in second-order bias terms in a non-central

limit theorem. These terms are unique to our analysis of realized semicovariances, and,

from a methodological perspective, they set our asymptotic analysis apart from stan-

dard high-frequency econometric analysis, in which central limit theorems are generally

applied for the purpose of conducting statistical inference (see, e.g., Aıt-Sahalia and Ja-

cod (2014)). By contrast, the main purpose of our higher-order asymptotic results is to

“dissect” the semicovariance estimators, thereby allowing for additional theoretical and

empirical insights by comparing and contrasting the relevant terms.

To gain more insights into the price behaviors illustrated in Figure 2, we employ

a standard truncation technique (Mancini (2001)) to separately study the jump and

diffusive information in positive and negative semicovariances. For the jump component,

we establish a feasible central limit theorem that facilitates formal statistical inference.

For the diffusive component, which is more complicated to study, we provide a standard

error estimator that quantifies its sampling variability in a well-defined sense, and which,

under more restrictive regularity conditions, also yields an asymptotically valid test.

2Source: https://money.cnn.com/2013/02/25/investing/stocks-markets/index.html.

5

Implementing the new inference procedures on high-frequency data for the 30 DJIA

stocks over a nine-year period reveals strong evidence of significant differences in the P

and N semicovariance components on many days. Consistent with economic intuition,

large differences in the jump semicovariance components are typically associated with

“sharp” public news announcements (e.g., FOMC announcements). Large differences in

the diffusive semicovariance components are typically associated with more difficult-to-

interpret news, which manifests in the form of common price drifts within the day.3

The above results naturally suggest that decomposing the realized covariance matrix

into its semicovariance components may be useful for volatility forecasting. We investi-

gate this idea by analyzing a large cross-section of stocks comprised of all of the S&P

500 constituents. We show that out-of-sample forecasts of portfolio return volatility are

significantly improved by using semicovariance measures. The gains from doing so in-

crease with the number of stocks included in the portfolio, although in line with the gains

from portfolio diversification, they appear to plateau at around 30-40 stocks in the port-

folio. Further dissecting the forecasting gains, we find that the models that use realized

semicovariances generally respond to new information faster than models that use only

realized variances or semivariances (e.g., Corsi (2009) and Patton and Sheppard (2015)).

Interestingly, during the financial crisis existing volatility models reduce the weight on

recent information, whereas semicovariance-based models increase the weight, primarily

due to an increase in the short-run importance of the negative semicovariance component.

The forecasting gains obtained through the use of the realized semicovariances are

naturally linked to the early work on asymmetric volatility models (e.g., Kroner and Ng

(1998) and Cappiello, Engle, and Sheppard (2006)). Our proposed tests for differences be-

tween positive and negative semicovariance components are also related to existing tests

for asymmetric dependencies (e.g., Longin and Solnik (2001), Ang and Chen (2002) and

Hong, Tu, and Zhou (2007)). Our work also connects to earlier empirical results on corre-

lations between asset returns in “bear” versus “bull” markets, and notions of asymmetric

tail dependencies (e.g., Patton (2004), Poon, Rockinger, and Tawn (2004), and Tjøsthem

and Hufthammer (2013)), and to recent work on high-frequency based co-skewness and

co-kurtosis measures (e.g., Neuberger (2012) and Amaya, Christoffersen, Jacobs, and

Vasquez (2015)), and jumps and co-jumps (e.g., Das and Uppal (2004), Bollerslev, Law,

and Tauchen (2008), Lee and Mykland (2008), Mancini and Gobbi (2012), Jacod and

Todorov (2009), Aıt-Sahalia and Xiu (2016) and Li, Todorov, and Tauchen (2017b)). In

contrast to all of these studies, however, we retain the covariance matrix as the sum-

mary measure of dependence, and instead use information from signed high-frequency

returns to “look inside” this matrix to obtain additional information about the inherent

dependencies, both dynamically and cross-sectionally at a given point in time.

3The two days plotted in Figure 2 also correspond to these two scenarios, and are indeed detected byusing the new inference method, as further discussed in Section 3 below.

6

The rest of the paper is organized as follows. Section 2 presents the first- and second-

order asymptotic properties of the realized semicovariances. Readers primarily interested

in empirical applications of the new measures may skip the more technical parts of Section

2. Section 3 discusses our empirical findings from the new semicovariance-based tests.

Our results pertaining to the use of the realized semicovariances in the construction of

improved volatility forecasts are discussed in Section 4. Section 5 concludes. Technical

regularity conditions and proofs are deferred to the Appendix. Additional robustness

checks and extensions are available in a Supplemental Appendix.

2. Information content of realized semicovariances: an asymptotic analysis

In this section, we demonstrate the differential information embedded in the realized

semicovariance measures in equation (1.2) in an infill asymptotic framework. Sections

2.1 and 2.2 present the first- and the second-order asymptotics, respectively. Section 2.3

describes feasible inference methods. Below, for a matrix A, we denote its (j, k) element

by Ajk and its transpose by A�. Convergence in probability and stable convergence in

law are denoted byP−→ and

L-s−→, respectively. All limits are for the sample frequency

Δn → 0 on a probability space (Ω,F ,P).

2.1. First-order asymptotic properties

Suppose that the log-price vector Xt is an Ito semimartingale of the form

Xt = X0 +

∫ t

0

bsds+

∫ t

0

σsdWs + Jt, (2.1)

where b is the Rd-valued drift process, W is a d-dimensional standard Brownian motion,

σ is the d×d dimensional stochastic volatility matrix and J is a finitely active pure-jump

process. We denote the spot covariance matrix of X by ct ≡ σtσ�t and further set

vj,t ≡ √cjj,t, ρjk,t ≡

cjk,tvj,tvk,t

. (2.2)

That is, vj,t and ρjk,t denote the spot volatility of asset j and the spot correlation coef-

ficient between assets j and k, respectively. We explicitly allow for so-called “leverage

effect” (i.e., dependence between changes in the price and changes in volatility), stochastic

volatility of volatility, volatility jumps and price-volatility co-jumps.

We begin by characterizing the first-order limiting behavior of the realized semico-

variance estimators defined by equation (1.2) in the Introduction. Let ΔXs denote the

price jump occurring at time s, if a jump occurred, and set it to zero if no jump occurred

at time s. Further define

7

P † ≡∑s≤T

p(ΔXs)p(ΔXs)�,

N † ≡∑s≤T

n(ΔXs)n(ΔXs)�,

M † ≡∑s≤T

(p(ΔXs)n(ΔXs)

� + n(ΔXs)p(ΔXs)�) .

These measures characterize the discontinuous parts of the semicovariance measures, as

formally spelled out in the following theorem.

Theorem 1 Under Assumption 1 in the appendix, (P , N , M)P−→ (P,N,M), where P ,

N and M are d× d matrices with their (j, k) elements given by

Pjk ≡∫ T

0

vj,svk,sψ(ρjk,s)ds+ P †jk,

Njk ≡∫ T

0

vj,svk,sψ(ρjk,s)ds+N †jk,

Mjk ≡ −2

∫ T

0

vj,svk,sψ(−ρjk,s)ds+M †jk,

and ψ(·) is defined in equation (1.4).

It follows from Theorem 1 that each of the realized semicovariances contains both

diffusive and jump covariation components. Importantly, the limiting variables P and

N share exactly the same diffusive component, but their jump components differ. In

particular,

P − NP−→ P −N = P † −N †.

That is, the first-order asymptotic behavior of the concordant semicovariance differential

(CSD) is fully characterized by the “directional co-jumps.” Consequently, in line with

the stylized model in equation (1.2) discussed in the introduction, Theorem 1 cannot

distinguish the information conveyed by P and N in periods when there are no jumps.

Hence, in order to reveal the differential information inherent in the realized measures

more generally, we turn next to a more refined second-order asymptotic analysis.

2.2. Second-order asymptotic properties

Since the main theoretical lessons about the second-order asymptotic behavior of

the concordant realized semicovariance components can be readily learnt in a bivariate

setting, we set d = 2 and focus on the analysis of P12 and N12 throughout this subsection.

Correspondingly, we also write ρt in place of ρ12,t for simplicity. The joint analysis of

all semicovariance components can be done in a similar manner. However, it is much

more tedious to discuss, so for readability we defer this more general analysis to the

Supplemental Appendix, Section S1.

8

We need to impose some additional structure on the volatility dynamics. In particular,

we will assume that the stochastic volatility σt is also an Ito semimartingale of the form

(see, e.g., equation (4.4.4) in Jacod and Protter (2012))

σt = σ0 +

∫ t

0

bsds+

∫ t

0

σsdWs + Mt +∑s≤t

Δσs1{‖Δσs‖>σ}, (2.3)

where b is the drift, σ is a d × d × d tensor-valued process, M is a local martingale

that is orthogonal to the Brownian motion W .4 A few remarks are in order. The pro-

cess σ collects the loadings of the stochastic volatility matrix σ on the price Brownian

shocks dW , and hence is naturally thought of as a multivariate quantification of a “lever-

age effect.” We also allow σ to load on Brownian shocks that are independent of dW

through the local martingale M . The M process may also contain compensated “small”

volatility jumps in the form of a purely discontinuous local martingale.5 Meanwhile,

the term∑

s≤t Δσs1{‖Δσs‖>σ} collects the “large” volatility jumps (with an arbitrary but

fixed threshold σ > 0), which often occur in response to major news announcements (see,

e.g., Bollerslev, Li, and Xue (2018)). The Ito semimartingale setting also readily accom-

modates the well-established intraday periodicity in the volatility dynamics (see, e.g.,

Andersen and Bollerslev (1997)). Further regularity conditions regarding the σ process

are collected in the appendix.

Theorem 2, below, describes the F -stable convergence in law of the normalized statis-

tic Δ−1/2n (P12 − P12, N12 − N12). The limit variable turns out to be fairly complicated,

but it may be succinctly expressed as(B

−B

)+

(L

−L

)+

(ζ

−ζ

)+

(ζP

ζN

)+

(ξP

ξN

), (2.4)

where, as we will detail below, B and L are bias terms, and (ζ, ζ, ξ) capture sampling

variabilities that arise from various sources. In particular, we note that P12 and N12 load

on the B and L bias terms in exactly opposite ways, which would cancel with each other

in the (aggregated) realized covariance C12. Before presenting the actual limit theorem,

we begin by briefly describing each of these separate terms. Recall that the processes b,

σ, v, ρ, and σ have previously been introduced in equations (2.1), (2.2), and (2.3).

Bias components due to price drift, B. The first type of bias is related to the price

4By convention, the (j, k) element of the stochastic integral∫ t

0σsdWs equals

∑dl=1

∫ t

0σjkl,sdWl,s.

5We remind the reader that two local martingales are called orthogonal if their product is a localmartingale (or equivalently, their predictable covariation process is identically zero). A local martingaleis called purely discontinuous if it is orthogonal to all continuous local martingales. See Definition I.4.11and Proposition I.4.15 in Jacod and Shiryaev (2003) for additional details.

9

drift, which is defined for the semicovariance estimator P12 as

B =1

2√2π

∫ T

0

(b1,sv1,s

+b2,sv2,s

)v1,sv2,s (1 + ρs) ds. (2.5)

As mentioned above, the analogous bias term for N12 is −B. Other things equal, this

bias term is proportional to the average spot “Sharpe ratio” of the two assets

1

2

(b1,sv1,s

+b2,sv2,s

). (2.6)

Therefore, the B term tends to be more pronounced when the two assets drift in the

same direction, akin to a “co-drift” type phenomenon.

Bias components due to continuous price-volatility covariation, L. The second bias

term stems from the fact that the volatility matrix process σt may be partially driven by

the Brownian motion W (i.e., σ �= 0), corresponding to a “dynamic leverage” type effect.

To more precisely describe this bias term for P12, define f1(x) ≡ 1{x1≥0} max{x2, 0} and

f2 (x) ≡ max{x1, 0}1{x2≥0} and then set, for any 2× 2 matrix A,

Fj (A) ≡ E

[fj (AW1)

∫ 1

0

WsdW�s

], j = 1, 2.

The bias term in P12 due to the common price-volatility Brownian dependence may then

be expressed as

L ≡2∑

j=1

∫ T

0

Trace [σj,sFj (σs)] ds, (2.7)

where σj,s denotes the 2× 2 matrix [σjkl,s]1≤k,l≤2. The bias term for N12 may be defined

similarly, and it can be shown to equal −L.Diffusive sampling error spanned by price risk, ζ. The third component in the limit

of P12 captures the sampling variability in p (ΔniX1) p (Δ

niX2) that is spanned by the

Brownian price shock σtdWt. Formally,

ζ ≡∫ T

0

(c−1s γs

)�(σsdWs) , (2.8)

where the γt process is defined as

γt ≡(1 + ρt)

2 v1,tv2,t

2√2π

(v1,t

v2,t

). (2.9)

The analogous component for N12 equals −ζ. Note that the quadratic covariation matrix

10

of the local martingale (ζ,−ζ) equals ∫ T

0Γsds, where

Γt ≡(

γ�t c−1t γt −γ�t c−1

t γt

−γ�t c−1t γt γ�t c

−1t γt

). (2.10)

Diffusive sampling error orthogonal to price risk, ζ. While ζ defined above captures

the diffusive risk in the semicovariance spanned by the Brownian shocks to the price

process, ζ captures the diffusive risk component orthogonal to those shocks. This limit

variable may be represented by its F -conditional distribution as

ζ =

(ζP

ζN

)=

∫ T

0

γ1/2s dWs, (2.11)

where W is a 2-dimensional standard Brownian motion that is independent of the σ-field

F , and the γ process is defined by γt ≡ Γt − Γt, where

Γt ≡ v21,tv22,t

(Ψ(ρt)− ψ (ρt)

2 −ψ (ρt)2

−ψ (ρt)2 Ψ(ρt)− ψ (ρt)

2

), (2.12)

and

Ψ (ρ) ≡ 3ρ√1− ρ2 + (1 + 2ρ2) arccos (−ρ)

2π, (2.13)

with Ψ (ρ) corresponding to E[Z2

1Z221{Z1<0,Z2<0}

]for (Z1, Z2) standard normally dis-

tributed with correlation ρ.

Jump-induced sampling error, ξ. The price jumps also induce sampling errors. Let Tj

for j ∈ {1, 2} denote the collection of jump times of (Xj,t)t∈[0,T ], with the corresponding

“signed” subsets denoted by

Tj+ ≡ {τ ∈ Tj : ΔXj,τ > 0}, Tj− ≡ {τ ∈ Tj : ΔXj,τ < 0}.

For each τ ∈ T1 ∪ T2 associate the variables (κτ , ξτ−, ξτ+) that are, conditionally on F ,

mutually independent with the following conditional distributions: κτ ∼ Uniform[0, 1],

ξτ− ∼ MN (0, cτ−), and ξτ+ ∼ MN (0, cτ ). Further define ητ = (η1,τ , η2,τ )� ≡ √

κτ ξτ− +√1− κτ ξτ+. The limiting variable ξ = (ξP , ξN)

� may then be expressed as

ξP ≡∑

τ∈T1+∩T2+(ΔX1,τ η2,τ +ΔX2,τ η1,τ ) +

∑τ∈T1+\T2

ΔX1,τp (η2,τ ) +∑

τ∈T2+\T1ΔX2,τp (η1,τ ) ,

ξN ≡∑

τ∈T1−∩T2−(ΔX1,τ η2,τ +ΔX2,τ η1,τ ) +

∑τ∈T1−\T2

ΔX1,τn (η2,τ ) +∑

τ∈T2−\T1ΔX2,τn (η1,τ ) .

Note that the first component in ξP (resp. ξN) concerns times when both assets have

positive (resp. negative) jumps, while the other two terms are active when one asset

11

jumps upwards (resp. downwards) and the other asset does not jump. The latter terms

involve the half-truncated doubly mixed Gaussian variable p(ητ ) (resp. n (ητ )). To the

best of our knowledge, this limiting distribution is new to the literature.

With the definitions above, we are now ready to state the stable convergence in law

of the realized semicovariances.

Theorem 2 Under Assumption 2 in the appendix,

Δ−1/2n

(P12 − P12

N12 −N12

)L-s−→(

B

−B

)+

(L

−L

)+

(ζ

−ζ

)+

(ζP

ζN

)+

(ξP

ξN

).

Theorem 2 depicts a non-central limit theorem for the positive and negative realized

semicovariances, where (±B,±L) represent bias terms, while (±ζ, ζ, ξ) stem from various

sources of “sampling errors.” The latter sampling error terms are all formed as (local)

martingales and have zero mean under mild integrability conditions. We note that the

bias terms arise because the test functions used to define the realized semicovariances

(e.g., (x1, x2) → p (x1) p (x2) = max{x1, 0}max{x2, 0}) are not globally even.6 This

phenomenon also appears in earlier work by Kinnebrock and Podolskij (2008), Barndorff-

Nielsen, Kinnebrock, and Shephard (2010), and Li, Mykland, Renault, Zhang, and Zheng

(2014); also see Chapter 5 of Jacod and Protter (2012). In the Supplemental Appendix

Section S1, we provide a more general result for the joint convergence of all the realized

semicovariance components, including the realized semivariances of Barndorff-Nielsen,

Kinnebrock, and Shephard (2010) as special “diagonal” cases. We also demonstrate there

how to recover that prior result in the case without price jumps, and further characterize

the effect of jumps on the sampling variability of the realized semivariances.

The presence of the bias terms means that Theorem 2 is not directly suitable for the

construction of confidence intervals for the (P,N) estimand, however that is not our goal.

Instead, the main insight derived from Theorem 2 is to reveal the differential second-order

behavior of the realized semicovariances P and N , about which the first-order asymptotics

in Theorem 1 remains entirely silent in the absence of jumps. Indeed, while Theorem 1

states that P and N have the same limit in the no-jump case, Theorem 2 clarifies that

they actually load on higher-order “signals” ±B and ±L in the exact opposite way. From

a theoretical perspective, this therefore explains why P and N may behave differently.

From an empirical perspective, it helps guide our understanding of the actual P and N

estimates discussed in Section 3, and the use of these measures in the construction of

improved volatility forecasts discussed in Section 4.

Theorem 2 focuses on the concordant semicovariance terms, P and N . The asymptotic

6Following Jacod and Protter (2012), we say that a function f (·) defined on Rd is globally even iff (x) = f (−x) for all x ∈ Rd; see page 135 in that book. Kinnebrock and Podolskij (2008) simply referto such functions as even functions; see page 1057 of that paper.

12

behavior of the mixed semicovariance term, M , is of particular interest when studying

negatively correlated assets, such as in analyzing hedge portfolios. This term is considered

jointly with all other elements of the semicovariance matrices in the more general limit

theorem presented in the Supplemental Appendix. Meanwhile, Theorem 2 can also be

used to help understand the behavior of the mixed semicovariances by computing P and

N on a rotation of the original returns: use the original returns on the first asset, and

the negative of the returns on the second asset.

2.3. Tests based on concordant semicovariance differentials

Even though Theorem 2 does not allow for the construction of standard confidence

intervals, it is still possible to develop feasible inference methods for the difference P −N ,

that is, the CSD. In particular, the asymptotic theory in the previous subsection reveals

three types of signals underlying the CSD: a directional co-jump effect (i.e., P † −N †), a

co-drifting effect (i.e., 2B), and a dynamic leverage effect (i.e., 2L). Empirically, it is of

great interest to separate the variation due to jumps from that due to the diffusive price

moves. Below, we use the standard truncation method (see, e.g., Mancini (2001, 2009))

to achieve such a separation. As in Section 2.2, we consider a bivariate setting, or d = 2,

and focus on the inference for P12 − N12.

The truncation method involves a sequence un ∈ R2+ of truncation thresholds satisfy-

ing uj,n � Δ�n for some � ∈ (0, 1/2) and j ∈ {1, 2}. Under our maintained assumption

of finite activity jumps, it can be shown that the index set

I ≡ {i : −un ≤ ΔniX ≤ un does not hold}

consistently estimates the jump times of the vector log-price process X.7 In practice,

it is important to choose the truncation threshold un adaptively so as to account for

time-varying volatility, particularly its well-known U-shape intraday pattern; we discuss

this further in connection with our empirical analyses below. As a result, the diffusive

and jump returns may be separated, allowing for the separate estimation of the diffusive

components of the semicovariances using the “small” (non-jump) returns:

P � ≡∑i/∈I

p(ΔniX)p(Δn

iX)�, N� ≡∑i/∈I

n(ΔniX)n(Δn

iX)�,

7See Proposition 1 of Li, Todorov, and Tauchen (2017b), for which the inequality −un ≤ Δni X ≤ un is

interpreted element-by-element. Importantly, although jumps can be separately recovered in asymptotictheory, it is difficult to do so in practice. The realized semicovariance measures only concern time-aggregated jump characteristics, instead of individual jumps. In our analysis, the jump recovery is onlyused as theoretical auxiliary tool to prove asymptotic results for time-aggregated quantities.

13

and the jump components of the semicovariances using the “jump” returns:

P † ≡∑i∈I

p(ΔniX)p(Δn

iX)�, N † ≡∑i∈I

n(ΔniX)n(Δn

iX)�.

The aforementioned jump detection result permits the truncated estimators to be

analyzed in the following straightforward extension of Theorem 2.

Proposition 1 Under Assumption 2 in the appendix, the following convergences hold

jointly

Δ−1/2n

(P †12 − P †

12

N †12 −N †

12

)L-s−→(ξP

ξN

),

Δ−1/2n

(P �12 − P �

12

N�12 −N�

12

)L-s−→(

B

−B

)+

(L

−L

)+

(ζ

−ζ

)+

(ζP

ζN

),

where P �12 = N�

12 ≡∫ T

0v1,sv2,sψ(ρs)ds.

Proposition 1, and consistent estimation of the spot covariances (discussed below),

allows for feasible inference on each of the two separate CSD components, P †12 − N †

12 and

P �12 − N�

12. We will refer to these as the jump or diffusive concordant semicovariance

differentials (JCSD or DCSD) respectively.

We start with a discussion of how to implement the JCSD test, which is the simpler

of the two as it admits a central limit theorem. In particular, it follows immediately from

Proposition 1,

Δ−1/2n

(P †12 − N †

12 −(P †12 −N †

12

)) L-s−→ ξP − ξN ,

where, as discussed above, ξP and ξN are defined in terms of doubly-mixed Gaussian

variables. Hence, to consistently estimate the distribution of the limiting variable, we

first need to estimate the spot covariance matrix before and after each detected jump

time. In order to do so, we choose an integer sequence kn of local windows that satisfies

kn → ∞ and knΔn → 0, and set, for each i,

ci− ≡ 1

knΔn

kn∑l=1

(Δn

i−lX) (

Δni−lX

)�1{−un≤Δn

i−lX≤un},

ci+ ≡ 1

knΔn

kn∑l=1

(Δn

i+lX) (

Δni+lX

)�1{−un≤Δn

i+lX≤un}.

Algorithm 1 describes the requisite steps for implementing the resulting JCSD test for

the null hypothesis P †12 = N †

12, that is, equal directional jump covariation.

Algorithm 1 (JCSD Test).

14

Step 1. Draw random variables (κ∗i , ξ∗i−, ξ

∗i+) that are mutually independent such that

κ∗i ∼ Uniform[0, 1] and ξ∗i± ∼ MN (0, ci±). Set η∗i = (η∗i,1, η∗i,2) =

√κ∗i ξ

∗i− +

√1− κ∗i ξ

∗i+.

Step 2. Let ΔniX

∗j ≡ Δn

iXj1{|Δni Xj |>uj,n} for j ∈ {1, 2} and set

ξ∗P = Δ−1/2n

∑i∈I

(p(Δn

iX∗1 +Δ1/2

n η∗i,1)p(Δn

iX∗2 +Δ1/2

n η∗i,2)− p (Δn

iX∗1 ) p (Δ

niX

∗2 )),

ξ∗N = Δ−1/2n

∑i∈I

(n(Δn

iX∗1 +Δ1/2

n η∗i,1)n(Δn

iX∗2 +Δ1/2

n η∗i,2)− n (Δn

iX∗1 )n (Δ

niX

∗2 )).

Step 3. Repeat steps 1–2 many times. Compute the 1− α (resp. α) quantile of ξ∗P − ξ∗Nas the critical value of Δ

−1/2n (P †

12 − N †12) for the null hypothesis P

†12 = N †

12 in favor of the

one-sided alternative P †12 > N †

12 (resp. P †12 < N †

12) at significance level α. �

Algorithm 1 may be seen as a parametric bootstrap that exploits the approximate

(parametric) doubly-mixed Gaussian distribution of the detected jump returns given the

estimated spot covariances, with ξ∗P and ξ∗N being the bootstrap analogue of the original

normalized estimators. While this type of simulation-based inference is often used in

the study of jumps, a non-standard feature of Algorithm 1 is its use of the truncated

return ΔniX

∗j = Δn

iXj1{|Δni Xj |>uj,n}, which shrinks the detected diffusive returns to zero.

This shrinkage is needed in situations where exactly one asset jumps at time τ , so that

the sampling variability contributed by the other (no-jump) asset, say j, is given by a

half-truncated doubly-mixed Gaussian variable like p (ητ,j). This distribution may in turn

be mimicked by Δ−1/2n p(Δn

iX∗j + Δ

1/2n η∗i,j) = p(η∗i,j), which differs from the “un-shrunk”

variable Δ−1/2n p(Δn

iXj +Δ1/2n η∗i,j).

Proposition 2 Under Assumption 2 in the appendix, the conditional distribution of

ξ∗P −ξ∗N given the data converges in probability to the F-conditional distribution of ξP −ξNunder the uniform metric. Consequently, the test described in Algorithm 1 has asymp-

totic level α under the null {P †12 = N †

12} and asymptotic power of one under one-sided

alternatives.

We turn next to the conduct of feasible inference using the DCSD statistic P �12− N�

12.

This involves some additional non-standard theoretical subtlety. Proposition 1 implies

that

Δ−1/2n

(P �12 − N�

12

)− 2B − 2L

L-s−→ 2ζ + ζP − ζN , (2.14)

where we recall that B and L capture co-drift and dynamic leverage effects, respec-

tively. The first-order limiting variables P �12 and N�

12 exactly cancel with each other in

the P �12 − N�

12 difference. Consequently, as revealed by the convergence in (2.14), the re-

maining “signal” carried by the DCSD is given by the higher-order term 2B +2L, which

is comparable in magnitude with the statistical noise term 2ζ + ζP − ζN (defined as a

local martingale). Since the signal-to-noise ratio does not diverge to infinity even in large15

samples, the resulting test is generally not consistent.

A further non-standard complication related to (2.14) stems from the fact that the

limiting variable 2ζ+ ζP − ζN is generally not mixed Gaussian. Specifically, while ζP − ζNis F -conditional Gaussian, the remaining part

2ζ = 2

∫ T

0

(c−1s γs

)�(σsdWs)

is generally not mixed Gaussian unless, of course, the stochastic volatility σ is independent

of the Brownian motion W that drives the diffusive price moves.

Although these non-standard features of the limit theory prevent us from conducting

formal tests in the most general setting, it is nevertheless possible to assess the sampling

variability of P �12 − N�

12 in a well-defined way. Indeed, it follows from (2.8) and (2.11)

that the quadratic variation of the continuous local martingale 2ζ + ζP − ζN is given by

Σ� ≡ 2

∫ T

0

v21,sv22,sΨ(ρs) ds. (2.15)

Therefore,√Σ� may be naturally used as the standard error for gauging the sampling

variability of the centered variable Δ−1/2n (P �

12−N�12)−(2B+2L). The Σ� variable is defined

as an integrated functional of the spot covariance matrix and it may be consistently

estimated using a nonparametric “plug-in” estimator Σ� (see, e.g., Li, Todorov, and

Tauchen (2017a)).

Comparing P �12 − N�

12 with its standard error provides an econometrically disciplined

approach for detecting “large” differences in positive and negative concordant semicovari-

ances, as summarized in the following algorithm. We refer to it cautiously as a detection

rule instead of a test; it may be formally interpreted as a test under additional conditions,

as detailed below.

Algorithm 2 (DCSD Detection).

Step 1. Define v1,i, v2,i and ρi implicitly by decomposing the spot covariance matrix

estimator ci+ as

ci+ =

(v21,i ρiv1,iv2,i

ρiv1,iv2,i v22,i

),

and set

Σ� ≡ 2Δn

[T/Δn]− kn + 1

[T/Δn]−kn∑i=0

v21,iv22,iΨ(ρi) . (2.16)

Step 2. Use the 1−α quantile of a standard normal distribution zα as the critical value for

the t-statistic Δ−1/2n (P �

12−N�12)/√Σ� for one-sided detection of deviations from B+L = 0.

The DCSD detection rule described in Algorithm 2 should be interpreted carefully in

16

empirical work. Due to the aforementioned lack of mixed Gaussianity for the limiting

variable under the most general conditions, the t-statistic Δ−1/2n (P �

12 − N�12)/√Σ� is gen-

erally not asymptotically standard normally distributed. For this reason, the asymptotic

level of the proposed one-sided test is not guaranteed to be α. Instead, the t-statistic

may be interpreted as a well-defined “signal-to-noise” measure, as opposed to a formal

statistical test. These asymptotic results are also corroborated by the simulation results

pertaining to the finite-sample properties of the JCSD test and DCSD detection rule

presented in Section S2 of the Supplemental Appendix.

In lieu of these results, mixed Gaussianity can be formally restored under the addi-

tional assumption that the volatility process σ is independent of the Brownian motionW ,

which in turn allows us to analyze the size and power properties of the DCSD detection

rule as a formal test. Note that in this “no-leverage” case with σ being independent of

W , without loss of generality we can fix σ ≡ 0 in the volatility model in (2.3), which

further implies that L ≡ 0 (recall equation (2.7)). Hence, the statistic Δ−1/2n (P �

12 − N�12)

is centered at 2B, and the signal-to-noise ratio equals 2B/√Σ�.

Formally, the null hypothesis for DCSD is comprised of the following collection of

sample paths:

Ω0 ≡ {ω ∈ Ω : B (ω) = 0}.

We remind the reader that in non-ergodic high-frequency settings, it is standard to con-

sider hypotheses as events consisting of sample paths of interest (see, e.g., Aıt-Sahalia

and Jacod (2014) and the many references therein). Correspondingly, to state a one-side

alternative hypothesis, we fix a constant R > 0, and consider:

Ωa ≡{ω ∈ Ω :

2B (ω)√Σ� (ω)

≥ R

}.

Intuitively, R measures the “distance” between the null and the alternative by drawing

a lower bound for the (random) signal-to-noise ratio 2B/√Σ�, which, not surprisingly,

determines the power of the test. Proposition 3, below, characterizes the size and power

properties of the resulting DCSD test described in Algorithm 2, where we use Φ (·) to

denote the cumulative distribution function of the standard normal distribution.

Proposition 3 Suppose that (i) Assumption 2 in the appendix holds, and (ii) the (bt)

and (σt) processes are independent of W , and σt ≡ 0. Then, the critical region C+n ≡

{Δ−1/2n (P �

12 − N�12)/√Σ� > zα} associated with the (positive) one-sided DCSD test has

asymptotic level α in restriction to Ω0, that is

limn→∞

P(C+

n |Ω0

)= α.

17

Moreover, for any R > 0,

lim infn→∞

P(C+

n |Ωa

) ≥ 1− Φ (zα −R) > α.

Proposition 3 shows that under the additional “no-leverage” assumption the one-sided

DCSD test controls size under the null with “no co-drift” (i.e., B = 0), and is asymp-

totically unbiased with strictly non-trivial power under the alternative. The asymptotic

power is higher when the signal-to-noise ratio 2B/√Σ� is higher, and the power is at least

50% when R = zα. Also, even though the proposition focuses on a positive one-sided

test, these same results readily extend to a negative one-sided test, or two-sided tests.

2.4. Discussion of prior related studies

Putting our results further into perspective, Theorem 2 naturally extends Barndorff-

Nielsen, Kinnebrock, and Shephard’s (2010) asymptotic theory (Proposition 2) from the

univariate setting and realized semivariances to a multivariate setting and realized semi-

covariances. However, the results in Barndorff-Nielsen, Kinnebrock, and Shephard (2010)

rely on the theory of Kinnebrock and Podolskij (2008), and hence rule out the presence

of price or volatility jumps. Jacod and Protter (2012) provide a more general result al-

lowing for volatility jumps (see Theorem 5.3.5). By comparison, we allow for both price

and volatility jumps. Price jumps in turn result in an interesting limiting distribution

that involves truncated doubly-mixed Gaussian variables (i.e., p(ητ ) and n(ητ )), which, to

the best of our knowledge, are new to the literature on jump-related inference (see, e.g.,

Aıt-Sahalia and Jacod (2009), Jacod and Todorov (2009), and Chapter 14 of Aıt-Sahalia

and Jacod (2014)).

The higher-order bias terms that appear in the second-order asymptotics arise from

the fact that the realized semicovariances are not globally even transformations of the

high-frequency returns. Results in Kinnebrock and Podolskij (2008), Barndorff-Nielsen,

Kinnebrock, and Shephard (2010), and Chapter 5 of Jacod and Protter (2012) share that

same feature. In the probability literature, a closely related limit theorem is also proved

by Jacod (1997) and reported as Theorem IX.7.3 in Jacod and Shiryaev (2003). Another

interesting example of this type of higher-order bias terms is provided by Li, Mykland,

Renault, Zhang, and Zheng (2014), in their analysis of realized tricity (see their Theorem

2) and a test for endogenous sampling times.8

8The (realized) tricity estimator is defined as the sum of cubic powers of high-frequency returns.Theorem 2 of Li, Mykland, Renault, Zhang, and Zheng (2014) establishes a non-central limit theoremfor this estimator allowing for random sampling. In the special case with regular sampling (correspondingto ht = 1 in the notation in that paper), the bias term has the form (using the notation in the present

paper) 3∫ T

0σ2sbsds+3

∫ T

0σ3sdWs+(3/2)〈σ2, X〉t, where 〈σ2, X〉t denotes the quadratic variation between

σ2 and X, corresponding to a leverage type effect. These three components play similar roles to the B,ζ, and L terms in Theorem 2, respectively. It may be interesting to further generalize the results in thepresent paper to a setting with random sampling using the technique of Li, Mykland, Renault, Zhang,and Zheng (2014).

18

The feasible inference developed in Section 2.3 also further sets our analysis apart

from that of Barndorff-Nielsen, Kinnebrock, and Shephard (2010). In the absence of price

jumps, in particular, our DCSD inference about the diffusive component involves a nonlin-

ear transform of the spot covariance matrix, Σ� ≡ 2∫ T

0v21,sv

22,sΨ(ρs) ds. Correspondingly,

our inference relies on the estimator for general integrated volatility functionals recently

developed by Li and Xiu (2016) and Li, Todorov, and Tauchen (2017a); see also Reno

(2008) and Kristensen (2010) for earlier results on spot volatility estimation. By contrast,

the feasible inference of Barndorff-Nielsen, Kinnebrock, and Shephard (2010), available

under more restrictive conditions, only requires the estimation of integrated quarticity.

In further contrast to Barndorff-Nielsen, Kinnebrock, and Shephard (2010), who do

not consider jumps, our JCSD test is explicitly geared toward price jumps. By focusing on

the differential between positive and negative directional co-jumps, our test also serves

a different empirical purpose than other tests for the presence of jumps and co-jumps

(see, e.g., Barndorff-Nielsen and Shephard (2006), Aıt-Sahalia and Jacod (2009), Jacod

and Todorov (2009), and Caporin, Kolokolov, and Reno (2017)). Our feasible inference

described in Algorithm 1 may be seen as a parametric bootstrap, and as such is naturally

related to the high-frequency bootstrap methods developed by Goncalves and Meddahi

(2009), Dovonon, Goncalves, Hounyo, and Meddahi (2019), among others. However,

while the latter bootstrap methods are designed to mimic the sampling variability of

aggregated Brownian shocks, we aim to recover that of the truncated jump returns.

3. Empirical semicovariance tests

We begin our empirical investigations by looking at the realized semicovariances for

the 30 Dow Jones Industrial Average (DJIA) constituent stocks as of the last day of our

sample period. We use one-minute returns, obtained from the Trades and Quotes (TAQ)

database, spanning the period from January 2006 to December 2014, a total of 2,265

trading days. Our choice of a one-minute sampling frequency and recent sample period is

dictated by the need to reliably estimate the spot covariance matrix, which is used in the

tests described in the previous section.9 To more clearly pinpoint within-day market-wide

price moves and economic events, we exclude the returns for the first half-hour of the

trading day in the calculation of the statistics.

We calculate the JCSD and DCSD semicovariance-based tests for all of the 435 unique

stock-pairs and 2,265 days in the sample, resulting in close to one million test statistics

9The choice of a one-minute sampling frequency mirrors that of Li, Todorov, Tauchen, and Chen(2017) in their estimation of spot covariances. It is also supported by the “signature plots” in SectionS8 of the Supplemental Appendix. For additional discussion of market microstructure effects, see Zhang,Mykland, and Aıt-Sahalia (2005), Hansen and Lunde (2006), Barndorff-Nielsen, Hansen, Lunde, andShephard (2008), and Jacod, Li, and Zheng (2017).

19

for each of the two tests.10 We use one-sided tests at the 5% significance level. The

average rejection rates for the JCSD tests far exceed the nominal level, rejecting in favor

of positive (negative) co-jumps for about 30% (32%) of the pairs. The DCSD algorithm

detects significantly positive (negative) differences for 13% (10%) of all stock pairs.11

To shed additional light on these test results, Table 1 lists the days with the most

rejections for each of the tests for each of the nine years in the sample, along with a short

description of the most important economic events that occurred on these days. Panel A

shows that all but one of the days with the most rejections for the JCSD test are associated

with FOMC statements or changes in the federal funds rate. This finding is consistent

with the prior literature that links jumps in asset prices with public news announcements

(e.g., Andersen, Bollerslev, and Diebold (2007), Lee and Mykland (2008), Lee (2012),

and Caporin, Kolokolov, and Reno (2017)). It is also in line with the literature on testing

for co-jumps and the finding that those jumps are associated with economy-wide news

(e.g., Bollerslev, Law, and Tauchen (2008) and Lahaye, Laurent, and Neely (2011)).12

In contrast to the “sharp” economic events associated with the JCSD test, Panel B

shows that the days with the most rejections for the DCSD test are typically associated

with “softer” more difficult-to-interpret information. Kyle (1985)-type models can be

used to establish a more formal economic link between the “soft” information and price

drift (which drives the DCSD test through the co-drift effect). In these models, informed

agents trade strategically with liquidity traders to maximize their profit, and they do

so patiently in order to manage the market marker’s belief. As shown by Back (1992),

informed traders’ optimal order flow is smooth (i.e., differentiable) in time, which in turn

determines the drift of the equilibrium price. In a setting with stochastic liquidity, Collin-

Dufresne and Fos (2016) further show that, in equilibrium, the price drift exhibits mean

reversion towards the asset’s true value, and mean reversion is strong (weak) when the

short-term liquidity is high (low) relative to the long-term liquidity. This theory suggests

that, ceteris paribus, price drift is greater in magnitude when there is a greater degree of

mispricing, or when the revelation of the private information is imminent.

To illustrate the distinct price dynamics on the “sharp” and “soft” news event days

identified by the JCSD and DCSD tests, Figure 3 plots cumulative returns for each of

the 30 DJIA stocks for two days from Table 1: September 18, 2013 and February 25,

10We rely on the dynamic threshold advocated by Bollerslev and Todorov (2011a,b) based on threetimes the trailing bipower variation, as originally defined by Barndorff-Nielsen and Shephard (2004b,2006), adjusted for the intraday periodicity in the volatility. We set the local window kn = 45.

11We do not intend to make a formal statistical statement jointly across all pairs. Instead, we viewthe rejection frequencies as simple summary statistics of the pairwise test results.

12A detailed comparison of the dates in Table 1 with those reported by all of the studies cited here isbeyond the scope of this paper. However, we note that some of the dates overlap with those in Caporin,Kolokolov, and Reno (2017), for example, while others do not, as the different focus of the tests naturallyleads to some variation in the days with the most significant jumps.

20

Table 1: Top Rejection Days by Year

Year Date Direction % Headline Event

Panel A. JCSD Test

2006 June 29 + 100 Fed raises short-term rate by a quarter-percentagepoint.

2007 September 18 + 99 Fed cuts short-term rate by a half-percentage point.2008 December 16 + 100 Fed cuts short-term rate by a quarter-percentage

point.2009 March 18 + 100 Fed announces it will buy up to �300 billion in long-

term Treasuries.2010 August 10 + 98 Fed announces it will continue Quantitative Easing.2011 September 22 + 100 Fed announces Operation Twist.2012 September 13 + 100 Fed announces it will continue buying Mortgage

Backed Securities.2013 September 18 + 98 Fed announces it will sustain the asset buying pro-

gram.2014 August 5 − 97 Russian troops are reported lining on the borders

of Ukraine.

Panel B. DCSD Test

2006 July 19 + 57 Bernanke explains to the Senate Banking Commit-tee how the Fed sees the economic slowdown.

2007 August 29 + 58 Bernanke writes letter to senator that Fed is mon-itoring and ready to step in if necessary.

2008 January 2 − 67 Markets react to poor manufacturing, housing andcredit news.

2009 March 23 + 90 Obama administration announces its plan to buy�1 trillion in bad bank assets.

2010 July 7 + 76 EU reveals its first list of stress test banks.2011 June 1 − 83 Moody’s cuts Greece’s bond rating by three

notches.2012 June 21 − 84 Rumors of Moody’s downgrade for global banks.2013 February 25 − 83 Political uncertainty surrounding Italian elections.2014 February 3 − 85 Janet Yellen sworn in as the new Fed chair.

Note: The table reports the top rejection dates by year for the semicovariance-based testsacross all 435 DJIA stocks-pairs. The first column gives the date, the second gives the directionin which the rejections occurred, and the third provides the fraction of pairs of stocks forwhich the test rejects at the 5% level in that direction. The final column summarizes headlineeconomic news events for the different days. Panel A reports the results for the jump CSDtest, while Panel B is based on the diffusive CSD test.

2013.13 (These are also the dates presented in Figure 2.) For the event detected by the

JCSD test (left panel), all stocks experienced a large positive price change (of just over

1% on average) at 2pm, following the release of the FOMC meeting statement which

13Section S3 in the supplemental appendix presents plots for all of the dates listed in Table 1.21

Figure 3: DJIA Cumulative Returns on Representative Event Days

10:00 12:00 14:00 16:00

-1

0

1

2

September 18, 2013

Cum

ulat

ive

Ret

urn

Time 10:00 12:00 14:00 16:00

-2

-1

0

1

February 25, 2013

Time

Note: The figure plots the cumulative return of the 30 Dow Jones Industrial Averagestocks on two of the event dates associated with market-wide jump CSD and diffusiveCSD events from Table 1.

declared that the Fed would sustain its asset-purchasing program. By contrast, for the

event detected by the DCSD test (right panel), we observe slow and steadily decreasing

price paths throughout the day for all of the stocks. The total daily return is large, with

the median daily return of around negative 2%, but no “extreme” one-minute returns

occurred for any of the stocks during the course of that day.

The outcome of the DCSD or JCSD tests and the differences in the within day price

dynamics further translate into different dynamic dependencies in the semicovariance

components across days.14 The total realized covariance C, in particular, generally appear

more persistent following days with diffusive events, as detected by the DCSD algorithm.

On the other hand, the persistence of P increases primarily following positive DCSD

events, while the persistence of N is mostly higher following negative DCSD events,

again consistent with the idea that certain types of “soft” news is processed only slowly

over multiple days. By comparison, only negative JCSD jump events appear to affect the

average persistence of either component.

4. Forecasting with realized semicovariances

The results discussed in the previous section highlight the additional information and

economic insights afforded by the realized semicovariances beyond those from standard

realized covariances. The results also point to the existence of different dynamic depen-

dencies conditional on different days. In this section we further explore these empirical

14A summary table with the correlations conditional on different DCSD and JCSD event days isavailable in Supplemental Appendix Section S4.

22

Figure 4: Autocorrelations

RVPSVNSV

0 10 20 30 40 50

0.4

0.6Variance Elements Covariance Elements

Lag Lag

Aut

ocor

rela

tion

RVPSVNSV

CPNM

0 10 20 30 40 500.2

0.4

0.6

0.8CPNM

Note: The graph plots the autocorrelation functions for the different realized semi-covariance elements. All of the estimates are averaged across 1,000 randomly selectedS&P 500 pairs of stocks, and bias-adjusted following the approach of Hansen and Lunde(2014).

differences from the perspective of forecasting future variances and covariances.

To allow for the construction of larger-dimensional portfolios, we expand our previous

sample of 30 DJIA stocks to include all of the S&P 500 constituent stocks, and we consider

a longer sample period, from January 1993 to December 2014, a total of 5,541 trading

days. In order to reliably estimate models for covariances and semicovariances, we include

only stocks with at least 2,000 daily observations, resulting in a total of 749 stocks. Most

of these stocks are not as actively traded as the DJIA stocks, especially during the earlier

part of the sample. Correspondingly, since we only require consistent estimates for this

part of our analysis, we rely on a coarser 15-minute sampling scheme to construct the

realized measures. Finally, similar to most existing work on volatility forecasting (e.g.,

Hansen, Huang, and Shek (2012), and Noureldin, Shephard, and Sheppard (2012)), we

focus on the intra-daily period excluding the overnight returns.15

4.1. Vector autoregressions for realized semicovariances

Figure 1 discussed in the introduction already points to the existence of different dy-

namic dependencies in the average combined concordant and discordant semicovariance

components. To more directly highlight these differences, Figure 4 plots the lag 1 through

50 autocorrelations for each of the individual realized semicovariance components aver-

aged across 1,000 randomly selected S&P 500 pairs of stocks.16 While the autocorrelations

for the realized variances (RV) and the positive and negative realized semivariances (PSV

and NSV) shown in the left panel are almost indistinguishable, there is a clear ordering in

15Empirical results that include the overnight returns are presented in Supplemental Appendix S6.3.All of our main empirical findings remain qualitatively unaltered.

16We rely on the estimator of Hansen and Lunde (2014) to account for measurement errors in therealized measures. Section S5 of the Supplemental Appendix provides the corresponding unadjustedautocorrelation functions.

23

the rate of decay of the autocorrelations for the realized semicovariance elements shown

in the right panel. Most noticeably, the autocorrelations for the (total) covariances C

are systematically below those for the three realized semicovariance elements, with the

mixed M component exhibiting the highest overall persistence. The fact that almost all

of our pairs of stocks are positively correlated means that few co-jumps will appear in

M , and so by the asymptotic theory in Section 2, M is mostly comprised of diffusive

covariation, while P and N contain both diffusive and co-jump components. Previous

work (e.g., Maheu and McCurdy (2004), and Andersen, Bollerslev, and Diebold (2007))

has found that the diffusive component of volatility is generally more persistent than

the jump component, and our finding that M appears more persistent than P and N is

consistent with those findings.

These differences also naturally suggest that more accurate volatility and covariance

forecasts may be obtained by separately modeling the realized semicovariance components

that make up the realized covariance. To more directly investigate this, we estimate a

vector version of the popular HAR model of Corsi (2009), in which each of the elements

in the realized semicovariance matrix is allowed to depend on its own daily, weekly,

and monthly lags, as well as the lags of the other realized semicovariance components.17

Specifically, for each pair of assets (j, k) we estimate the following three-dimensional

vector autoregression:⎛⎜⎝ Pjk,t

Njk,t

Mjk,t

⎞⎟⎠ =

⎛⎜⎝ φjk,P

φjk,N

φjk,M

⎞⎟⎠+ Φjk,Day

⎛⎜⎝ Pjk,t−1

Njk,t−1

Mjk,t−1

⎞⎟⎠+ Φjk,Week

⎛⎜⎝ Pjk,t−2:t−5

Njk,t−2:t−5

Mjk,t−2:t−5

⎞⎟⎠+Φjk,Month

⎛⎜⎝ Pjk,t−6:t−22

Njk,t−6:t−22

Mjk,t−6:t−22

⎞⎟⎠+

⎛⎜⎝ εPjk,tεNjk,tεMjk,t

⎞⎟⎠ ,

(4.1)

where Pt−l:t−k ≡ 1k−l+1

∑ks=l Pt−s, with the other components defined analogously.

The first three columns of Table 2 report the resulting parameter estimates aver-

aged across 500 randomly selected (j, k) pairs of stocks. The table reveals a clear block

structure in the coefficients of this general specification. Most notably, the dynamic de-

pendencies in P and N are almost exclusively driven by the lagged N terms, while the

dynamic behavior of the mixed M elements is primarily determined by their own lags,

with the monthly lag receiving the largest weight.

The last two columns of Table 2 report the parameter estimates from regressing the re-

alized covariances C on the lagged realized semicovariances and the lagged covariances.18

17Building on the new ideas and theoretical results first presented here, Bollerslev, Patton, and Quaed-vlieg (2020) have recently shown how the realized semicovariances may similarly be used in the construc-tion of improved multivariate realized GARCH-type models.

18Note that due to the linear nature of the HAR model and the fact that realized semicovariances24

Table 2: Semicovariance HAR Estimates

Pjk,t Njk,t Mjk,t Cjk,t

Pjk,t−1 0.038** 0.050** -0.035** 0.052**

Pjk,t−2:t−5 0.004** 0.057** -0.002** 0.059**

Pjk,t−6:t−22 -0.074** 0.023** 0.099** 0.048**

Njk,t−1 0.248** 0.192** -0.096** 0.344**

Njk,t−2:t−5 0.312** 0.250** -0.090** 0.472**

Njk,t−6:t−22 0.349** 0.206** -0.021** 0.534**

Mjk,t−1 -0.075** -0.072** 0.141** -0.006**

Mjk,t−2:t−5 -0.044** -0.049** 0.209** 0.116**

Mjk,t−6:t−22 0.028** -0.020** 0.409** 0.417**

Cjk,t−1 0.184**

Cjk,t−2:t−5 0.305**

Cjk,t−6:t−22 0.304**R2 0.397** 0.376** 0.354** 0.313** 0.284**R2

adj 0.395** 0.374** 0.352** 0.311** 0.283**

Note: The table reports the average parameter estimates for the vector HAR model in(4.1) averaged across 500 randomly selected pairs of stocks. The first three columnsreport results for the unrestricted models. The fourth column reports the estimatesfrom a model that restricts the rows of Φjk,Day, Φjk,Week and Φjk,Month to be the same,

corresponding to a model for Cj,kt, whilst the final column reports the results of a

standard HAR model on Cjk,t. ** and * signify that the estimates for that coefficientare significant at the 5% level for 75% and 50% of the randomly selected pairs of stocks.

The model with individual semicovariances clearly reveals the most important compo-

nents: the three lags of N and the monthly lag of M constitute the main drivers of

the realized covariance C. Interestingly, the models based on the semicovariances also

put a greater weight on more recent information compared to the standard HAR model

reported in the last column: normalizing each of the explanatory variables by their sam-

ple means, the semicovariance-based HAR models effectively put a weight of 0.339 on

lagged daily information, while the final column shows that a standard HAR model on

average puts a weight of only 0.184 on the daily lag, implying a more muted reaction to

new information. These differences are naturally associated with an improved fit of the

semicovariance-based models, as shown by the R2 values. In the next section we inves-

tigate whether this improved in-sample fit is accompanied by a similar improvement in

out-of-sample forecast performance for models that utilize the realized semicovariances.

sum exactly to the realized covariance, each coefficient in the fourth column is simply the sum of thecorresponding coefficients in the first three columns.

25

4.2. Portfolio volatility forecasting

The realized variance of a portfolio depends on the realized semicovariances of the

assets included in the portfolio. In particular, utilizing the result that C = P + N + M ,

the realized variance of a portfolio with portfolio weights w may be expressed as

RVw ≡ w�Cw

= w�Pw + w�Nw + w�Mw

≡ Pw + Nw + Mw, (4.2)

where we use the superscript w to indicate the relevant (scalar-valued) portfolio quan-

tities. These portfolio semicovariance measures are distinct from the portfolio semivari-

ances of Barndorff-Nielsen, Kinnebrock, and Shephard (2010), which only depend on

the high-frequency returns of the portfolio. The portfolio semicovariances, on the other

hand, depend on the high-frequency returns for all of the individual assets included in

the portfolio, and cannot be computed using only returns on the portfolio itself.

To explore whether portfolio semicovariances convey useful information beyond re-

alized variances and semivariances, we extend the HAR model of Corsi (2009) to allow

the forecasts to depend on each of the realized portfolio semicovariance components.

Accordingly, the one-step forecast for the portfolio return variance is constructed from

RV wt+1|t = φ0 + φDay,P P

wt + φWeek,P P

wt−1:t−4 + φMonth,P P

wt−5:t−21

+φDay,NNwt + φWeek,NN

wt−1:t−4 + φMonth,NN

wt−5:t−21 (4.3)

+φDay,MMwt + φWeek,MM

wt−1:t−4 + φMonth,MM

wt−5:t−21.

We will refer to this model as the SemiCovariance HAR (SCHAR) model. The general

SCHAR model in (4.3) is obviously quite richly parameterized. Hence, motivated by the

results in Table 2, we also consider a restricted version, in which we only include the

daily, weekly and monthly lags of Nw, and the monthly lag of Mw. We will refer to this

specification as the restricted SCHAR model, or SCHAR-r for short.

If the parameters associated with the lagged realized semicovariance component all

coincide (i.e., φj,P = φj,N = φj,M = φj for j ∈ {Day, Week,Month}), the SCHAR model

trivially reduces to the basic HAR model and the corresponding forecasting scheme

RV wt+1|t = φ0 + φDayRV

w

t + φWeekRVw

t−1:t−4 + φMonthRVw

t−5:t−21. (4.4)

This simple and easy-to-implement model has arguably emerged as the benchmark for

judging alternative realized volatility-based forecasting procedures in the literature.

In addition to this commonly used benchmark, we also consider the forecasts from

the Semivariance HAR (SHAR) model of Patton and Sheppard (2015). This model uses

26

the portfolio realized semivariances defined as

PSV =

[T/Δn]∑i=1

p(w�Δn

iX)2, NSV =

[T/Δn]∑i=1

n(w�Δn

iX)2,

to decompose the daily realized variance into positive and negative semivariances, result-

ing in the forecasting scheme:

RV wt+1|t = φ0+φDay,+PSV t+φDay,−NSV t+φWeekRV

w

t−1:t−4+φMonthRVw

t−5:t−21. (4.5)

This model has been found to perform particularly well from the perspective of portfolio

variance forecasting, performing on par with or better than the forecasts from other HAR-

style models, and as such constitutes another particularly challenging benchmark.19

We consider equally-weighted portfolios comprised of d = 10 (“small”) and 100

(“large”) stocks randomly selected from the full set of 749 individual stocks. In each

case, we ensure that the selected stocks contain an overlap of at least 1,100 daily obser-

vations. We then construct rolling out-of-sample forecasts based on each of the different

models, with model parameters re-estimated daily using the most recent 1,000 daily ob-

servations. We rely on the commonly used mean-square-error (MSE) and QLIKE loss

functions to evaluate the performance of the forecasts vis-a-vis the actual portfolio real-

ized variances RVw

t+1.20 Table 3 reports the resulting losses averaged across 500 randomly

selected portfolios. In addition, for each of the 500 portfolios, we also compute the ratio of

each model’s average loss relative to the benchmark HAR model, and report the average

of these ratios over all of the random samples.21

Consistent with Patton and Sheppard (2015), the SHAR-based forecasts that uti-

lize the portfolio realized semivariances result in fairly large relative gains vis-a-vis the

benchmark HAR-based forecasts, especially for the large dimensional portfolios. The

performance of the unrestricted SCHAR model, however, is mixed: it has smaller MSE

loss than the HAR forecasts, but underperforms that same benchmark under QLIKE

loss. This finding is hardly surprising. There is ample evidence in the forecasting litera-

ture emphasizing the importance of parsimony (see, e.g., Zellner (1992)), and the results

in Table 2 clearly suggest that the unrestricted SCHAR model is “over-parameterized,”

and as such is likely to perform poorly in a forecasting context. Indeed, looking at the

19A comprehensive comparison of these models with the numerous other specifications that haveappeared in the volatility forecasting literature (e.g., Andersen, Bollerslev, and Diebold (2007), Corsi,Pirino, and Reno (2010) and Caporin, Kolokolov, and Reno (2017)) would be interesting. In the interestof space, we leave a more extensive empirical analysis along these lines for future research.

20The MSE and QLIKE loss functions may both be formally justified for the purpose of volatilityforecast evaluation based on the use of imperfect ex-post volatility proxies; see Patton (2011).

21The Supplemental Appendix Section S6 reports corroborative evidence from a series of additionalrobustness checks, including the results from multivariate models designed to forecast the full d × ddimensional realized covariance matrix.

27

Table 3: Performance Comparison for Portfolio Variance Forecasts

MSE QLIKEModel Average Ratio Average Ratio

Panel A. Small Portfolio Case (d =10)

HAR 1.849 1.000 0.141 1.000SHAR 1.671 0.966 0.139 0.986SCHAR 1.643 0.955 0.210 1.318SCHAR-r 1.567 0.908 0.139 0.979

Panel B. Large Portfolio Case (d =100)

HAR 0.048 1.000 0.119 1.000SHAR 0.045 0.935 0.115 0.957SCHAR 0.045 0.976 0.236 1.495SCHAR-r 0.041 0.862 0.111 0.925

Note: The table reports the loss for forecasting the portfolio variance for portfoliosof size d = 10 and 100 for each of the different forecasting models. The reportednumbers are based on 500 randomly selected portfolios. The Average columnprovides the average loss over time and all portfolios. The Ratio column gives thetime-average ratio of losses across all sets of portfolios relative to the HAR model.

results for the restricted SCHAR-r model guided by the estimates in Table 2, we see that

the forecasts from this model unambiguously outperform those from the other models;

the 13.8% improvement in terms of predictive accuracy (measured by MSE) for the large

portfolio case is particularly impressive. As such, this clearly shows the benefit of utilizing

the additional information inherent in the realized semicovariances, compared to both the

HAR- and SHAR-based forecasts that only rely on the portfolio realized (semi)variances.

These gains in forecast accuracy are not unique to the two portfolio dimensions high-

lighted in Table 3. Figure 5 plots the median loss ratios for the HAR, SHAR and SCHAR-r

models for all values of d ranging from 2 to 100, together with the 10% and 90% quantiles

for the SCHAR-r forecasts computed across the 500 random portfolio-formations. As the

figure shows, the median loss ratios for the SCHAR-r model are systematically below

those of the other two models, the only exception being the QLIKE loss for d = 2. Also,

the gains from using the information in the realized semicovariances accrue relatively

quickly as the number of stocks in the portfolio increases, and for the QLIKE loss appear

to reach somewhat of a plateau for d ≈ 40 stocks.

The gains in forecast accuracy obtained from using realized semicovariances are also

not specific to the one-day forecast horizon. Table S.6 in Section S7 of the Supplemental

Appendix reports results for 5-day- and 22-day-ahead forecasts, and shows that the fore-

cast improvements obtained by the SCHAR-r model relative to the standard HAR model

28

Figure 5: Median Loss Ratios

0 10 20 30 40 50 60 70 80 90 100

0.85

0.90

0.95

1.00

1.05M

SE R

atio

s

HARSHARSCHAR-rSCHAR-r 10% and 90% Quantiles

0 10 20 30 40 50 60 70 80 90 100

0.90

0.95

1.00

1.05

QLI

KE

Rat

io

d

HARSHARSCHAR-rSCHAR-r 10% and 90% Quantiles

Note: The graph plots the median loss ratios as a function of the number of stocks inthe portfolio, d. The ratio is calculated as the average loss of the models divided by theaverage loss of of the standard HAR, based on 500 random samples of d-stock portfolios.

appear even larger over longer forecast horizons.

To help understand the source of these forecast improvements it is instructive to

represent the SHAR and SCHAR models as HAR models with time-varying parameters.

To illustrate the idea, consider the forecasts from a SCHAR-type model based only on

the lagged daily realized semicovariances,

RV wt+1|t = φ0 + φDay,P P

wt + φDay,NN

wt + φDay,MM

wt

= φ0 +

(φDay,P

Pwt

RVw

t

+ φDay,NNw

t

RVw

t

+ φDay,MMw

t

RVw

t

)RV

w

t

≡ φ0 + φDay,tRVw

t .

As these equations show, even though the φDay,P , φDay,N and φDay,M parameters used in

the formulation of the model are all constant, the model may alternatively be interpreted

as a first-order autoregression for RVw

t+1 with a time-varying autoregressive parameter

φDay,t. This idea immediately extends to the SHAR and the more elaborate SCHAR-r

forecasting models used in our empirical analysis, in which the parameters associated

with the weekly and monthly lags in the implied HAR-type representations would be

29

Figure 6: Implied HAR Parameters

HARSHARSCHAR-r

2000 2005 2010 2015

0.4

0.6

0.8

Dai

lyHARSHARSCHAR-r

2000 2005 2010 20150.3

0.5

0.7

Wee

kly

2000 2005 2010 2015-0.5

0-0

.25

0.00

0.25

Mon

thly

2000 2005 2010 20150.8

1.0

Tota

l

TimeTime

Note: The figure plots the implied HAR-type parameters for predicting the variance of aten-stock portfolio. The figure shows moving averages of the estimates, averaged across500 randomly selected portfolios. The models are estimated over the full sample period.

time-varying as well.

To study these effects, Figure 6 plots the daily, weekly and monthly time-varying

HAR parameters implied by the SHAR and SCHAR-r models for d = 10, averaged across

500 randomly selected ten-stock portfolios.22 In addition to the implied daily, weekly

and monthly parameters, the last panel reports their sum as a measure of the overall

persistence of the different models.

The daily, weekly, and monthly parameters for the HAR model are by definition all

constant, with an average implied persistence of around 0.94. By comparison, the implied

daily parameter estimates for the SHAR model vary slightly above the constant daily

HAR parameter, while the constant weekly parameter for the SHAR model is slightly

below that of the HAR model. As such, the overall persistence of the SHAR-based

forecasts are generally close to that of the standard HAR-based forecasts.

By contrast, the implied time-varying daily and weekly HAR parameters for the

SCHAR-r model both far exceed those of the standard HAR model, especially over the

earlier part of the sample. On the other hand, the implied time-varying monthly pa-

rameters for the SCHAR-r model are typically less than the monthly parameters for the

standard HAR and SHAR models. Meanwhile, the sum of the three implied parameters

22In contrast to the out-of-sample forecast results in Table 3, which are based on a rolling estimationscheme, Figure 6 plots the implied parameter estimates obtained over the full sample period. Requiringobservations to be available over the full sample reduces the number of stocks to 121. To avoid con-taminating the results by a few influential outliers, we exclude any portfolios for which the maximum

Pw/RVw, Nw/RV

wand Mw/RV

wratios exceed ten, and further smooth the implied parameters using

a day t− 25 to day t+ 25 moving average.

30

for the SCHAR-r model, which may be interpreted as an overall measure of persistence,

are typically greater than those for the other two models. Thus, not only do the su-

perior SCHAR-based forecasts respond more quickly to new information, the forecasts

are typically also more persistent and slower to mean-revert than the benchmark HAR-

and SHAR-based forecasts. This explains why the SCHAR-r model performs well not

just over the short one-day forecast horizon, but also over weekly and monthly forecast

horizons. It also highlights the usefulness of the information residing in the new realized

semicovariance measures for volatility forecasting more generally.

5. Conclusion

We propose a decomposition of the realized covariance matrix based on the signs

of the underlying high-frequency returns into positive, negative and mixed-sign realized

semicovariance components. Under a standard infill asymptotic setting for continuous-

time Ito semimartingales, we derive the first- and second-order asymptotic properties of

these new realized semicovariance measures. The asymptotic theory, taking the form of

a non-central limit theorem, reveals the differential information carried by each of the

realized semicovariance components, related to stochastic correlation, signed co-jumps,

and notions of co-drifting and leverage effects. Using high-frequency data for a large

cross-section of U.S. equities, we demonstrate how the asymptotic theory may be used

to help understand key features of the realized semicovariance components. We further

document distinctly different dynamic dependencies in the different realized semicovari-

ance components. These differences in turn translate into superior forecast performance

for models that utilize the realized semicovariance measures.

The infill asymptotic framework underlying much of the recent financial econometrics

literature, including this paper, naturally suggests the use of high-frequency intraday

data for the calculation of daily realized variation measures. However, the availability of

reliable intraday data limits such analyses to relatively short and recent sample periods.

On the other hand, empirical work in the finance literature often employs daily data for

the calculation of monthly or quarterly (co)variances over longer time periods (see, e.g.,

Schwert (1989)), which speak more directly to the longer holding periods that are com-

monplace in empirical finance. In this spirit, Figure 7 presents a lower-frequency analog

to Figure 1, showing monthly covariances and semicovariances constructed from daily

data spanning 1926 to 2018.23 While the use of coarser daily returns in the construction

of the monthly semicovariances introduces a wedge between the infill asymptotic theory

and the empirical measures, thereby blurring the impact of co-jumps, short-lived co-drifts

and/or leverage effects for explaining the differences between the concordant semicovari-

ance components, the figure still reveals notable differences through time in the relative

23Analogous to Figure 1, each month we randomly select 500 pairs of stocks from that month’s list ofavailable stocks in the CRSP database, and average the monthly realized measures across all 500 pairs.

31

Figure 7: Decomposition of Monthly Realized Covariances

1930 1940 1950 1960 1970 1980 1990 2000 2010

-10

010

2030

(Sem

i)Cov

aria

nce

Time



Note: The figure plots seven-month moving averages of the monthly time series of theconcordant semicovariance (P + N), the mixed semicovariance (M) and the realized

covariance (C). Each series is constructed as the average of the corresponding monthlyrealized measures averaged across 500 randomly selected pairs of CRSP stocks over the1926 - 2018 period. The shaded regions represent NBER recession periods.

importance of the concordant versus discordant components. In particular, the combined

concordant semicovariance component generally, though not uniformly, increases in reces-

sions more than the mixed component declines, leading to an overall increase in the total

covariance. This is true for the Great Depression in the 1930’s and the Great Recession

in the late 2000’s, as well as for the 1970’s oil-shock recession. However, it is not the

case for the recessions of the early 1980s and 2000s. We leave further analyses of these

intriguing features for future research.

References

Aıt-Sahalia, Y., and J. Jacod (2009): “Testing for Jumps in a Discretely Observed

Process,” Annals of Statistics, 37, 184–222.

Aıt-Sahalia, Y., and J. Jacod (2014): High-Frequency Financial Econometrics.

Princeton University Press, Princeton.

Aıt-Sahalia, Y., and D. Xiu (2016): “Increased Correlation Among Asset Classes:

Are Volatility or Jumps to Blame, or Both?,” Journal of Econometrics, 194(2), 205–

219.

Amaya, D., P. Christoffersen, K. Jacobs, and A. Vasquez (2015): “Does Re-

alized Skewness Predict the Cross-Section of Equity Returns?,” Journal of Financial

Economics, 118(1), 135–167.32

Andersen, T. G., T. Bollerlsev, F. X. Diebold, and P. Labys (2003): “Mod-

eling and Forecasting Realized Volatility,” Econometrica, 71(2), 579–625.

Andersen, T. G., and T. Bollerslev (1997): “Intraday Periodicity and Volatility

Persistence in Financial Markets,” Journal of Empirical Finance, 4, 115–158.

Andersen, T. G., T. Bollerslev, and F. X. Diebold (2007): “Roughing It Up:

Including Jump Components in the Measurement, Modeling, and Forecasting of Return

Volatility,” Review of Economics and Statistics, 89(4), 701–720.

Ang, A., and J. Chen (2002): “Asymmetric Correlations of Equity Portfolios,” Journal

of Financial Economics, 63(3), 443–494.

Back, K. (1992): “Insider Trading in Continuous Time,” Review of Financial Studies,

5(3), 387–409.

Barndorff-Nielsen, O. E., P. R. Hansen, A. Lunde, and N. Shephard (2008):

“Designing Realized Kernels to Measure the ex post Variation of Equity Prices in the

Presence of Noise,” Econometrica, 76, 1481–1536.

(2011): “Multivariate Realised Kernels: Consistent Positive Semi-Definite Esti-

mators of the Covariation of Equity Prices with Noise and Non-Synchronous Trading,”

Journal of Econometrics, 162(2), 149–169.

Barndorff-Nielsen, O. E., S. Kinnebrock, and N. Shephard (2010): “Measur-

ing Downside Risk: Realised Semivariance,” in Volatility and Time Series Economet-

rics: Essays in Honor of Robert F. Engle, ed. by T. Bollerslev, J. R. Russell, and M. W.

Watson, pp. 117–136. Oxford University Press, Oxford.

Barndorff-Nielsen, O. E., and N. Shephard (2004a): “Econometric Analysis of

Realized Covariation: High-Frequency based Covariance, Regression, and Correlation

in Financial Economics,” Econometrica, 72(3), 885–925.

(2004b): “Power and Bipower Variation with Stochastic Volatility and Jumps,”

Journal of Financial Econometrics, 2(1), 1–37.

(2006): “Econometrics of testing for jumps in financial economics using bipower

variation,” Journal of Financial Econometrics, 4(1), 1–30.

Bauwens, L., S. Laurent, and J. V. K. Rombouts (2006): “Multivariate GARCH

Models: A Survey,” Journal of Applied Econometrics, 21(1), 79–109.

Bollerslev, T., T. Law, and G. Tauchen (2008): “Risk, Jumps and Diversifica-

tion,” Journal of Econometrics, 144(1), 234–256.

Bollerslev, T., J. Li, and Y. Xue (2018): “Volume, Volatility, and Public News

Announcements,” Review of Economic Studies, 85(4), 2005–2041.

Bollerslev, T., A. J. Patton, and R. Quaedvlieg (2020): “Multivariate Lever-

age Effects and Realized Semicovariance GARCH Models,” Journal of Econometrics,

Forthcoming.

Bollerslev, T., and V. Todorov (2011a): “Estimation of Jump Tails,” Economet-

rica, 79(6), 1727–1783.

33

(2011b): “Tails, Fears, and Risk Premia,” Journal of Finance, 66(6), 2165–2211.

Caporin, M., A. Kolokolov, and R. Reno (2017): “Systemic Co-jumps,” Journal

of Financial Economics, 126(3), 563–591.

Cappiello, L., R. Engle, and K. Sheppard (2006): “Asymmetric Dynamics in the

Correlations of Global Equity and Bond Returns,” Journal of Financial Econometrics,

4(4), 537–572.

Collin-Dufresne, P., and V. Fos (2016): “Insider Trading, Stochastic Liquidity,

and Equilibrium Prices,” Econometrica, 84(4), 1441–1475.

Corsi, F. (2009): “A Simple Approximate Long-Memory Model of Realized Volatility,”

Journal of Financial Econometrics, 7(2), 174–196.

Corsi, F., D. Pirino, and R. Reno (2010): “Threshold bipower variation and the

impact of jumps on volatility forecasting,” Journal of Econometrics, 159(2), 276–288.

Das, S. R., and R. Uppal (2004): “Systemic Risk and International Portfolio Choice,”

Journal of Finance, 59(6), 2809–2834.

Dovonon, P., S. Goncalves, U. Hounyo, and N. Meddahi (2019): “Bootstrapping

High-Frequency Jump Tests,” Journal of the American Statistical Association, 114,

793–803.

Elton, E. J., and M. J. Gruber (1973): “Estimating the Dependence Structure of

Share Prices – Implications for Portfolio Selection,” Journal of Finance, 28(5), 1203–

1232.

Fishburn, P. C. (1977): “Mean-Risk Analysis with Risk Associated with Below-Target

Returns,” American Economic Review, 67(2), 116–126.

Goncalves, S., and N. Meddahi (2009): “Bootstrapping Realized Volatility,” Econo-

metrica, 77, 283–306.

Hansen, P. R., Z. Huang, and H. H. Shek (2012): “Realized GARCH: A Joint Model

for Returns and Realized Measures of Volatility,” Journal of Applied Econometrics,

27(6), 877–906.

Hansen, P. R., and A. Lunde (2006): “Realized Variance and Market Microstructure

Noise,” Journal of Business and Economic Statistics, 24(2), 127–161.

(2014): “Estimating the Persistence and the Autocorrelation Function of a Time

Series that is Measured with Error,” Econometric Theory, 30(1), 60–93.

Hogan, W. W., and J. M. Warren (1972): “Computation of the Efficient Boundary

in the E-S Portfolio Selection Model,” Journal of Financial and Quantitative Analysis,

7(4), 1881–1896.

(1974): “Toward the Development of an Equilibrium Capital-Market Model

Based on Semivariance,” Journal of Financial and Quantitative Analysis, 9(1), 1–11.

Hong, Y., J. Tu, and G. Zhou (2007): “Asymmetries in Stock Returns: Statistical

Tests and Economic Interpretation,” Review of Financial Studies, 20(5), 1547–1581.

Jacod, J. (1997): “On Continuous Conditional Gaussian Martingales and Stable Con-

34

vergence in Law,” in Seminaire de Probabilites, pp. 232–246. Springer-Verlag, Berlin

Heidelberg New York.

Jacod, J., Y. Li, and X. Zheng (2017): “Statistical Properties of Microstructure

Noise,” Econometrica, 85(4), 1133–1174.

Jacod, J., and P. Protter (2012): Discretization of Processes. Springer Verlag.

Jacod, J., and A. N. Shiryaev (2003): Limit Theorems for Stochastic Processes.

Springer-Verlag, New York, second edn.

Jacod, J., and V. Todorov (2009): “Testing for Common Arrivals of Jumps for

Discretely Observed Multidimensional Processes,” Annals of Statistics, 37(4), 1792–

1838.

Kendall, M. G. (1953): “The Analysis of Economic Time-Series – Part I: Prices,”

Journal of the Royal Statistical Society, Series A, 116(1), 11–34.

Kinnebrock, S., and M. Podolskij (2008): “A Note on the Central Limit Theorem

for Bipower Variation of General Functions,” Stochastic Processes and Their Applica-

tions, 118(6), 1056–1070.

Kristensen, D. (2010): “Nonparametric Filtering of the Realized Spot Volatility: A

Kernel-Based Approach,” Econometric Theory, 26(1), pp. 60–93.

Kroner, K. F., and V. K. Ng (1998): “Modeling Asymmetric Comovements of Asset

Returns,” Review of Financial Studies, 11(4), 817–844.

Kyle, A. S. (1985): “Continuous Auctions and Insider Trading,” Econometrica, 53(6),

1315–1335.

Lahaye, J., S. Laurent, and C. J. Neely (2011): “Jumps, Cojumps and Macro

Announcements,” Journal of Applied Econometrics, 26(6), 893–921.

Lee, S. S. (2012): “Jumps and Information Flow in Financial Markets,” Review of

Financial Studies, 25(2), 439–479.

Lee, S. S., and P. A. Mykland (2008): “Jumps in Financial Markets: A New Non-

parametric Test and Jump Dynamics,” Review of Financial Studies, 21(6), 2535–2563.

Li, J., V. Todorov, and G. Tauchen (2017a): “Adaptive Estimation of Continuous-

Time Regression Models Using High-Frequency Data,” Journal of Econometrics,

200(1), 36–47.

(2017b): “Jump Regressions,” Econometrica, 85(1), 173–195.

Li, J., V. Todorov, G. Tauchen, and R. Chen (2017): “Mixed-Scale Jump Regres-

sions with Bootstrap Inference,” Journal of Econometrics, 201(2), 417–432.

Li, J., and D. Xiu (2016): “Generalized Method of Integrated Moments for High-

Frequency Data,” Econometrica, 84(4), 1613–1633.

Li, Y., P. A. Mykland, E. Renault, L. Zhang, and X. Zheng (2014): “Realized

Volatility when Sampling Times are Possibly Endogenous,” Econometric Theory, 30(3),

580–605.

Longin, F., and B. Solnik (2001): “Extreme Correlation of International Equity

35

Markets,” Journal of Finance, 56(2), 649–676.

Maheu, J. M., and T. H. McCurdy (2004): “News Arrival, Jump Dynamics, and

Volatility Components for Indivdual Stock Returns,” Journal of Finance, 59(2), 755–

793.

Mancini, C. (2001): “Disentangling the Jumps of the Diffusion in a Geometric Jumping

Brownian Motion,” Giornale dell’Istituto Italiano degli Attuari, LXIV, 19–47.

(2009): “Non-Parametric Threshold Estimation for Models with Stochastic

Diffusion Coefficient and Jumps,” Scandinavian Journal of Statistics, 36(2), 270–296.

Mancini, C., and F. Gobbi (2012): “Identifying the Brownian Covariation from the

Co-Jumps given Discrete Observations,” Econometric Theory, 28(2), 249–273.

Mao, J. C. T. (1970): “Survey of Capital Budgeting: Theory and Practice,” Journal

of Finance, 25(2), 349–360.

Markowitz, H. M. (1959): Portfolio Selection. Wiley.

Neuberger, A. (2012): “Realized Skewness,” Review of Financial Studies, 25(11),

3423–3455.

Noureldin, D., N. Shephard, and K. Sheppard (2012): “Multivariate High-

Frequency-Based Volatility (HEAVY) Models,” Journal of Applied Econometrics,

27(6), 907–933.

Patton, A. J. (2004): “On the Out-of-Sample Importance of Skewness and Asymmetric

Dependence for Asset Allocation,” Journal of Financial Econometrics, 2(1), 130–168.

(2011): “Volatility Forecast Comparison using Imperfect Volatility Proxies,”

Journal of Econometrics, 160(1), 246–256.

Patton, A. J., and K. Sheppard (2015): “Good Volatility, Bad Volatility: Signed

Jumps and the Persistence of Volatility,” Review of Economics and Statistics, 97(3),

683–697.

Poon, S.-H., M. Rockinger, and J. Tawn (2004): “Extreme Value Dependence

in Financial Markets: Diagnostics, Models, and Financial Implications.,” Review of

Financial Studies, 17(2), 581–610.

Reno, R. (2008): “Nonparametric Estimation of the Diffusion Coefficient of Stochastic

Volatility Models,” Econometric Theory, 24, 1174–1206.

Schwert, G. W. (1989): “Why Does Stock Market Volatility Change Over Time?,”

Journal of Finance, 44(5), 1115–1153.

Tjøsthem, D., and K. O. Hufthammer (2013): “Local Gaussian Correlation: A

New Measure of Dependence,” Journal of Econometrics, 172(1), 33–48.

Zellner, A. (1992): “Presidential Address: Statistics, Science and Public Policy,” Jour-

nal of the American Statistical Association, 87(417), 1–6.

Zhang, L., P. A. Mykland, and Y. Aıt-Sahalia (2005): “A Tale of Two Time

Scales: Determining Integrated Volatility with Noisy High-Frequency Data,” Journal

of the American Statistical Association, 100, 1394–1411.

36

Appendix: Regularity Conditions and Proofs

A.1. Regularity conditions

The following regularity conditions are needed for our asymptotic theory.

Assumption 1 The process X is an Ito semimartingale defined on a filtered probabil-

ity space (Ω,F , (Ft),P) of the form (2.1) with Jt =∫ t

0

∫Rδ (s, u)μ (ds, du), where the

process b is locally bounded, the process σ is cadlag and takes value in Rd×d, δ is a pre-

dictable function and μ is a Poisson random measure defined on R+×R with compensator

ν (dt, du) = dt⊗ λ (du) for some finite measure λ on R.

Assumption 2 We have Assumption 1. Moreover, the process σ has the form (2.3)

such that (i) σt is non-singular almost surely for all t; (ii) b is locally bounded; (iii)

σ is d × d × d cadlag adapted process; (iv) the process M is a local martingale that is

orthogonal to W with ‖ΔMt‖ ≤ σ for some constant σ > 0 and its predictable quadratic

covariation process has the form 〈M, M〉t =∫ t

0qsds for some locally bounded process q;

(v) the compensator of the pure-jump process∑

s≤t Δσs1{‖Δσs‖>σ} has the form∫ t

0qsds

for some locally bounded process q.

Overall, these assumptions are fairly mild and quite standard in the analysis of high-

frequency data. In particular, they allow for price and volatility jumps and co-jumps,

as well as the so-called leverage effect. The only notable restriction is the finite activity

of the price jumps. As in Li, Todorov, and Tauchen (2017b), we purposely impose this

condition because, in the current paper, the empirical interest vis-a-vis jumps mainly

concerns “large” market-wide co-jumps, which occur relatively infrequently (and thus

aligned with the finite-activity condition). Relaxing this condition would greatly com-

plicate our technical exposition, without leading to any change in the actual numerical

implementation.

A.2. Proofs

A.2.1. Notation and preliminary results

We begin by defining some notation. Recall that for j ∈ {1, . . . , d}, Tj = {τ :

ΔXj,τ �= 0} collects the jump times of asset j, and Tj+ ≡ {τ ∈ T : ΔXj,τ > 0} and

Tj− ≡ {τ ∈ T : ΔXj,τ < 0} collect the times at which asset j has positive and negative

jumps, respectively. For each jump time τ , we denote by i (τ) the unique random integer

i such that (i− 1)Δn < τ ≤ iΔn. We then set

Ij ≡ {i (τ) : τ ∈ Tj} and Ij± ≡ {i (τ) : τ ∈ Tj±}. (A.1)

37

Recall that p (x) ≡ max {x, 0} and n (x) ≡ min{x, 0}. To simplify notation, we

denote, for x, y ∈ R, f (x, y) ≡ p (x) p (y), g (x, y) ≡ n (x)n (y), and m (x, y) = p (x)n (y).

Note that f and g are both continuously differentiable except on {(x, y) : x = 0 or y = 0},with gradients

∂f (x, y) =

(1{x≥0} max {y, 0}max {x, 0} 1{y≥0}

), ∂g (x, y) =

(1{x≤0} min {y, 0}min {x, 0} 1{y≤0}

).

For generic functions h, h1 and h2 defined on Rd, and an invertible d × d matrix a

such that c = aa�, we define the following quantities:⎧⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩

Rc (h) ≡ EU [h (aU)] , γc (h) ≡ EU [h (aU) (aU)] ,

γa (h) ≡ EU [h (aU)U ] ,

Hj ≡ hj (aU)−Rc (h)−(c−1γc (hj)

)�aU, j = 1, 2,

γc (h1, h2) ≡ CovU (H1, H2) ,

Γc (h1, h2) ≡ CovU

((c−1γc (h1)

)�aU,(c−1γc (h2)

)�aU),

Γc (h1, h2) ≡ CovU (h1 (aU) , h2 (aU)) ,

(A.2)

where U is a generic d-dimensional standard normal variable, and EU and CovU are

the expectation and covariance operators with respect to U , respectively. We note that,

except for γa (h), the expected values in (A.2) depend on a only through c = aa�, which

explains our notation Rc (·), γc (·), etc.We need to establish a few identities for these functionals. Observe that the variable

Hj is the residual obtained from projecting the demeaned variable hj (aU)−EU [hj (aU)]

onto aU , with c−1γc (hj) being the corresponding projection coefficient. Since a is non-

singular, this residual may alternatively be written as Hj = hj (aU) − EU [hj (aU)] −γa (hj)

� U . Hence, the functional γc (h1, h2) can be rewritten as

γc (h1, h2) = EU

[(h1 (aU)− γa (h1)

� U)(

h2 (aU)− γa (h2)� U)]

−EU [h1 (aU)]EU [h2 (aU)] .(A.3)

We further note that Γc (h1, h2) computes the covariance of h1 (aU) − EU [h1 (aU)] and

h2 (aU)−EU [h2 (aU)], while Γc (h1, h2) computes the covariance of their projections. By

a decomposition of covariance, we then deduce

γc (h1, h2) = Γc (h1, h2)− Γc (h1, h2) . (A.4)

In addition,

Γc (h1, h2) = γc (h1)� c−1γc (h2) . (A.5)

Our analysis for the semicovariances relies on some explicit calculations of the expec-

38

tations in (A.2). Lemma A1, below, provides the details for c taking the form

c =

(v21 ρv1v2

ρv1v2 v22

). (A.6)

The proof is done by direct integration and is omitted for brevity.

Lemma A1 The following statements hold when the matrix c has the form (A.6):

(a) Rc (f) = Rc (g) = v1v2ψ (ρ) and Rc (m) = −v1v2ψ (−ρ);(b) Rc (∂f) = −Rc (∂g) =

(2√2π)−1

(1 + ρ) (v2, v1)�;

(c) γc (f) = −γc (g) =(2√2π)−1

(1 + ρ)2 v1v2 (v1, v2)�;

(d) Γc (f, f) = Γc (g, g) = v21v22(Ψ (ρ)− ψ (ρ)2) and Γc (f, g) = −v21v22ψ (ρ)2.

A.2.2. Proofs

Proof of Theorem 1. Let X ′ denote the continuous part of X, that is, X ′t ≡∫ t

0bsds+∫ t

0σsdWs. We define P ′ in the same way as P , but with X replaced by X ′. It follows

that

Pjk = P ′jk +

∑i∈Ij∪Ik

p (ΔniXj) p (Δ

niXk)−

∑i∈Ij∪Ik

p(Δn

iX′j

)p (Δn

iX′k) .

By Theorem 3.4.1(b) in Jacod and Protter (2012) and Lemma A1(a),

P ′jk

P−→∫ T

0

vj,svk,sψ(ρjk,s)ds. (A.7)

We also note that∑

i∈Ij∪Ik p (ΔniXj) p (Δ

niXk) =

∑τ∈Tj∪Tk p(Δ

ni(τ)Xj)p(Δ

ni(τ)Xk) and

Δni(τ)X → ΔXτ pathwise. Hence,

∑i∈Ij∪Ik

p (ΔniXj) p (Δ

niXk)

P−→ P †jk. (A.8)

Finally, by a standard estimate for continuous Ito semimartingales and the fact that Ij∪Ik

is almost surely finite, we deduce that∑

i∈Ij∪Ik p(Δn

iX′j

)p (Δn

iX′k) = Op(Δn). This

estimate, combined with (A.7) and (A.8), implies the asserted convergence PjkP−→ Pjk.

The proof for N and M follows essentially the same argument. Q.E.D.

Proof of Theorem 2. Step 1. We begin by outlining the basic steps of the proof.

First, we decompose Δ−1/2n (P12 − P12) =

∑5j=1 P

(j) and Δ−1/2n (N12 − N12) =

∑5j=1 N

(j),

39

where ⎧⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩

P (1) ≡ Δ−1/2n

⎛⎝ ∑i∈I1+∩I2+

p (ΔniX1) p (Δ

niX2)− P †

12

⎞⎠ ,

P (2) ≡ Δ−1/2n

∑i∈I1+\I2

p (ΔniX1) p (Δ

niX2) ,

P (3) ≡ Δ−1/2n

∑i∈I2+\I1

p (ΔniX1) p (Δ

niX2) ,

P (4) ≡ Δ−1/2n

∑i∈I1−∪I2−

p (ΔniX1) p (Δ

niX2) ,

P (5) ≡ Δ−1/2n

( ∑i/∈I1∪I2

p (ΔniX1) p (Δ

niX2)− P �

12

),

and ⎧⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩

N (1) ≡ Δ−1/2n

⎛⎝ ∑i∈I1−∩I2−

n (ΔniX1)n (Δ

niX2)−N †

12

⎞⎠ ,

N (2) ≡ Δ−1/2n

∑i∈I1−\I2

n (ΔniX1)n (Δ

niX2) ,

N (3) ≡ Δ−1/2n

∑i∈I2−\I1

n (ΔniX1)n (Δ

niX2) ,

N (4) ≡ Δ−1/2n

∑i∈I1+∪I2+

n (ΔniX1)n (Δ

niX2) ,

N (5) ≡ Δ−1/2n

( ∑i/∈I1∪I2

n (ΔniX1)n (Δ

niX2)−N�

12

).

To prove the assertion of Theorem 2, it suffices to show the following joint stable

convergence in law(4∑

j=1

P (j),4∑

j=1

N (j)

)L-s−→ (ξP , ξN) , (A.9)(

P (5)

N (5)

)L-s−→(

B

−B

)+

(L

−L

)+

(ζ

−ζ

)+

(ζP

ζN

). (A.10)

In steps 2 and 3, below, we prove these claims in turn. We note that, by a standard

argument, it is easy to show that the convergences in (A.9) and (A.10) hold jointly with F -

conditionally independent limits (as they involve non-overlapping Brownian increments).

Therefore, the remaining task is to separately show (A.9) and (A.10).

Step 2. This step proves the claim in (A.9). Let T ≡ T1 ∪ T2 collect all jump times

of the bivariate process X. We set, for each τ ∈ T , ητ = (η1,τ , η2,τ )� ≡ Δ

−1/2n (Δn

i(τ)X −ΔXτ ). By Proposition 4.4.10 in Jacod and Protter (2012),

(ητ )τ∈TL-s−→ (ητ )τ∈T , (A.11)

40

where ητ = (η1,τ , η2,τ )� ≡ √

κτ ξτ− +√1− κτ ξτ+ (recall the definitions in Section 2.2).

From (A.11), it is easy to derive the following representations uniformly for all τ ∈ T(note that T is finite almost surely): for j ∈ {1, 2},

p(Δn

i(τ)Xj

)=

⎧⎪⎨⎪⎩ΔXj,τ +Δ

1/2n ηj,τ + op(Δ

1/2n ), if τ ∈ Tj+,

Δ1/2n p (ηj,τ ) + op(Δ

1/2n ), if τ ∈ T \ T j,

op(Δ1/2n ), if τ ∈ Tj−,

(A.12)

and

n(Δn

i(τ)Xj

)=

⎧⎪⎨⎪⎩op(Δ

1/2n ), if τ ∈ Tj+,

Δ1/2n n (ηj,τ ) + op(Δ

1/2n ), if τ ∈ T \ T j,

ΔXj,τ +Δ1/2n ηj,τ + op(Δ

1/2n ), if τ ∈ Tj−.

(A.13)

Using these representations, we further deduce⎧⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩

P (1) =∑

τ∈T1+∩T2+(ΔX1,τ η2,τ +ΔX2,τ η1,τ ) + op(1),

P (2) =∑

τ∈T1+\T2ΔX1,τp (η2,τ ) + op(1),

P (3) =∑

τ∈T2+\T1ΔX2,τp (η1,τ ) + op(1),

P (4) = op(1),

and ⎧⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩

N (1) =∑

τ∈T1−∩T2−(ΔX1,τ η2,τ +ΔX2,τ η1,τ ) + op(1),

N (2) =∑

τ∈T1−\T2ΔX1,τn (η2,τ ) + op(1),

N (3) =∑

τ∈T2−\T1ΔX2,τn (η1,τ ) + op(1),

N (4) = op(1).

These representations, together with the convergence (A.11), imply (A.9).

Step 3. This step proves the claim in (A.10). Let X ′ denote the continuous part of

X, that is, X ′t ≡∫ t

0bsds+

∫ t

0σsdWs. We define P ′ and N ′ in the same way as P and N ,

but with X replaced by X ′. It is easy to see that P ′ −∑i/∈I1∪I2 p (ΔniX1) p (Δ

niX2) and

N ′ −∑i/∈I1∪I2 n (ΔniX1)n (Δ

niX2) are Op(Δn). Hence, it suffices to prove the claim with

P (5) and N (5) replaced by P ′ ≡ Δ−1/2n (P ′

12−P �12) and N

′ ≡ Δ−1/2n (N ′

12−N�12), respectively.

In order to derive the stable convergence in law of (P ′, N ′), we apply Theorem 5.3.5 in

Jacod and Protter (2012) to the test function x → (p(x1)p (x2) , n (x1)n (x2)). To do so,

it is enough to verify that the limiting variables (B,−B), (L,−L), (ζ,−ζ) and (ζP , ζN)

coincide with Jacod and Protter’s A, A′, U and U

′variables. Turning to the details,

we first note that, by Lemma A1(a), we can rewrite P �12 =

∫ T

0Rcs (f) ds and N�

12 =∫ T

0Rcs (g) ds. By Lemma A1(b), we can rewrite the B term as B =

∫ T

0b�s Rcs (∂f) ds =

41

− ∫ T

0b�s Rcs (∂g) ds. Given (2.7), no further rewriting is needed for the (L,−L) term.

By Lemma A1(c), we can rewrite

γt = γct (f) and − γt = γct (g) . (A.14)

Hence, γt = σtγσt (f) = −σtγσt (g) and, by (2.8), ζ =∫ T

0γσs (f)

� dWs = − ∫ T

0γσs (g)

� dWs.

By (A.14) and (A.5), we see that Γt defined in (2.10) can be written as

Γt =

(Γct (f, f) Γct (f, g)

Γct (g, f) Γct (g, g)

). (A.15)

By Lemma A1(d), we see that Γt defined in (2.12) can be expressed equivalently as

Γt =

(Γct (f, f) Γct (f, g)

Γct (g, f) Γct (g, g)

). (A.16)

Recall that γt ≡ Γt − Γt. By (A.4), (A.15) and (A.16), it follows that

γt =

(γct (f, f) γct (f, g)

γct (g, f) γct (g, g)

).

In view of (A.3), we verify that the local quadratic variation of the (ζP , ζN) term coincide

with that defined in (5.2.4) of Jacod and Protter (2012).

We are now ready to apply Theorem 5.3.5 in Jacod and Protter (2012) to finish the

proof of (A.10), and hence, the assertion of Theorem 2. Q.E.D.

Proof of Proposition 1. By Proposition 1 in Li, Todorov, and Tauchen (2017b), the

set I coincides with I1 ∪ I2 with probability approaching one and, in restriction to that

sequence of events,

P �12 =

∑i/∈I1∪I2

p (ΔniX1) p (Δ

niX2) , P †

12 =∑

i∈I1∪I2p (Δn

iX1) p (ΔniX2) ,

N�12 =

∑i/∈I1∪I2

n (ΔniX1)n (Δ

niX2) , N †

12 =∑

i∈I1∪I2n (Δn

iX1)n (ΔniX2) .

The assertion then follows from (A.9) and (A.10) in the proof of Theorem 2. Q.E.D.

Proof of Proposition 2. Let T = T 1 ∪ T2. For notational simplicity, we denote, for

each subset S ⊆ T ,

ξ∗P (S) ≡ Δ−1/2n

∑τ∈S

(p(Δn

i(τ)X∗1 +Δ1/2

n η∗i(τ),1)p(Δn

i(τ)X∗2 +Δ1/2

n η∗i(τ),2)

−p (Δni(τ)X

∗1

)p(Δn

i(τ)X∗2

)).

(A.17)

42

By Proposition 1 in Li, Todorov, and Tauchen (2017b), I coincides with I1 ∪ I2 with

probability approaching one. Hence, we can restrict our calculations to that sequence

of events without loss of generality. In particular, we can write ξ∗P as ξ∗P (T ) using the

notation (A.17). We can then decompose ξ∗P as

ξ∗P = ξ∗P (T1+ ∩ T2+) + ξ∗P (T1+ \ T2) + ξ∗P (T2+ \ T1) + ξ∗P (T1− ∪ T2−) .

We note that Δni(τ)X

∗j = 0 for all τ ∈ T \ T j with probability approaching one. It is then

easy to deduce that

p(Δn

i(τ)X∗j +Δ1/2

n η∗i(τ),j)=

⎧⎪⎨⎪⎩p(Δn

i(τ)X∗j ) + Δ

1/2n η∗i(τ),j + op(Δ

1/2n ), if τ ∈ Tj+,

Δ1/2n p(η∗i(τ),j) + op(Δ

1/2n ), if τ ∈ T \ T j,

op(Δ1/2n ), if τ ∈ Tj−.

From there, we readily deduce that⎧⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩

ξ∗P (T1+ ∩ T2+) =∑

τ∈T1+∩T2+

(p(Δn

i(τ)X∗1 )η

∗i(τ),2 + p(Δn

i(τ)X∗2 )η

∗i(τ),1

)+ op(1),

ξ∗P (T1+ \ T2) =∑

τ∈T1+\T2p(Δn

i(τ)X∗1 )p(η∗i(τ),2

)+ op(1),

ξ∗P (T2+ \ T1) =∑

τ∈T2+\T1p(Δn

i(τ)X∗2 )p(η∗i(τ),1

)+ op(1),

ξ∗P (T1− ∪ T2−) = op(1).

(A.18)

By a standard result for spot covariance estimation (see, e.g., Theorem 9.3.2 in Jacod

and Protter (2012)), we have ci(τ)±P−→ cτ±. Consequently, (η∗i(τ))τ∈T

L|F−→ (ητ )τ∈T , whereL|F−→ denotes the convergence in probability of F -conditional laws under the uniform

metric. In addition, we note that p(Δni(τ)X

∗j )

P−→ ΔXj,τ for all τ ∈ Tj+. Combining

these convergence results with (A.18), we deduce that ξ∗PL|F−→ ξP . Evidently, the same

argument can be used to show that ξ∗NL|F−→ ξN , jointly with ξ∗P

L|F−→ ξP , which implies the

first assertion of Proposition 2 (i.e., ξ∗P − ξ∗NL|F−→ ξP − ξN). The size and power properties

of the test readily follow from here. Q.E.D.

Proof of Proposition 3. Let G be the σ-field generated by (bt, σt)0≤t≤T . Since σ = 0,

L = 0 by definition (recall (2.7)). Proposition 1 then implies that Δ−1/2n (P �

12 − N�12) −

2BL-s−→ 2ζ + ζP − ζN . Since G and W are independent, and W in the definition of ζ

is independent of F , we see that 2ζ + ζP − ζN is, conditional on G, centered Gaussian

with conditional variance Σ� (recall (2.15)). By Theorem 3 of Li, Todorov, and Tauchen

(2017a), Σ� → Σ�. By the property of stable convergence in law, we have

Δ−1/2n

(P �12 − N�

12

)− 2B√

Σ�

L-s−→ 2ζ + ζP − ζN√Σ�

≡ N ,

43

where, by definition, N is a standard normal variable (defined on an extension of the

original probability space) that is independent of G. In restriction to Ω0 = {B = 0}, wehave Δ

−1/2n (P �

12 − N�12)/√Σ� L-s−→ N . Since Ω0 ∈ G, this convergence readily implies the

first assertion of the proposition concerning the null rejection probability.

To prove the second assertion, we fix some constant R > 0. We suppose that P (Ωa) >

0 without loss of generality because, otherwise, the conditional probability P (C+n |Ωa) can

be arbitrarily defined and there would be nothing to prove. By the property of stable

convergence, we have

P

({Δ

−1/2n (P �

12 − N�12)√

Σ�> zα

}∩ Ωa

)

= P

({Δ

−1/2n (P �

12 − N�12)− 2B√

Σ�> zα − 2B√

Σ�

}∩ Ωa

)

= P

({N > zα − 2B√

Σ�

}∩ Ωa

)+ o (1)

≥ P ({N > zα −R} ∩ Ωa) + o (1) .

Note that Ωa ∈ G. Since N is independent of G, P({N > zα −R} ∩ Ωa) = (1 − Φ(zα −R))P (Ωa). Hence,

lim infn

P

({Δ

−1/2n (P �

12 − N�12)√

Σ�> zα

}∩ Ωa

)≥ (1− Φ (zα −R))P (Ωa) .

Dividing both sides by P (Ωa), we finish the proof of the second assertion of this propo-

sition. Q.E.D.

44

Download - Realized Semicovariances - Duke Universitypublic.econ.duke.edu/~boller/Papers/Realized_Semicovariances.pdf · Figure2: SignedReturn-PairsforDJIAStocks.-1.5 -1.0 -0.5 0.0 0.5 1.0 1.5-1.5-1.0-0.5

Top Related