Realized Semicovariances
This version: 17 February, 2020
Tim Bollerslev^{a,∗}, Jia Li^{b}, Andrew J. Patton^{b}, Rogier Quaedvlieg^{c}
^{a} Department of Economics, Duke University, NBER and CREATES
^{b} Department of Economics, Duke University
^{c} Erasmus School of Economics, Erasmus University Rotterdam
Abstract
We propose a decomposition of the realized covariance matrix into components based on
the signs of the underlying high-frequency returns, and we derive the asymptotic prop-
erties of the resulting realized semicovariance measures as the sampling interval goes
to zero. The first-order asymptotic results highlight how the same-sign and mixed-sign
components load differently on economic information related to stochastic correlation and
jumps. The second-order asymptotic results reveal the structure underlying the same-sign
semicovariances, as manifested in the form of co-drifting and dynamic “leverage” effects.
In line with this anatomy, we use data on a large cross-section of individual stocks to
empirically document distinct dynamic dependencies in the different realized semicovari-
ance components. We show that the accuracy of portfolio return variance forecasts may
be significantly improved by exploiting the information in realized semicovariances.
Keywords: High-frequency data; realized variances; semicovariances; co-jumps;
volatility forecasting.
JEL: C22, C51, C53, C58
We would like to thank the Co-Editor (Aviv Nevo) and four anonymous referees for their helpful comments, which greatly improved the paper. We would also like to thank conference and seminar participants at Aarhus, Boston, Ca’Foscari Venice, Chicago, Cologne, Gerzensee, ITAM, Konstanz, Lancaster, Padova, Pennsylvania, PUC Rio, QUT, Rutgers, and Toulouse for helpful comments and suggestions. Bingzhi Zhao kindly provided us with the cleaned high-frequency data underlying our empirical investigations. Patton and Quaedvlieg gratefully acknowledge support from, respectively, Australian Research Council Discovery Project 180104120 and Netherlands Organisation for Scientific Research Grant 451-17-009. This paper subsumes the 2017 paper “Realized Semicovariances: Looking for Signs of Direction Inside the Covariance Matrix” by Bollerslev, Patton, and Quaedvlieg.
∗Corresponding author: Department of Economics, Duke University, 213 Social Sciences Building, Box 90097, Durham, NC 27708-0097, United States. Email: [email protected]. A Supplemental Appendix to the paper is available at http://www.econ.duke.edu/∼boller/research.html.
1. Introduction
The covariance matrix of asset returns arguably constitutes the most crucial input
for asset pricing, portfolio and risk management decisions. Correspondingly, there is a
substantial literature devoted to the estimation, modeling, and prediction of covariance
matrices dating back more than half a century (e.g., Kendall (1953), Elton and Gruber
(1973), and Bauwens, Laurent, and Rombouts (2006)). Meanwhile, a rapidly-growing
recent literature has forcefully advocated the use of high-frequency intraday data to more
reliably estimate lower-frequency return covariance matrices (e.g., Andersen, Bollerslev,
Diebold, and Labys (2003), Barndorff-Nielsen and Shephard (2004a), and Barndorff-
Nielsen, Hansen, Lunde, and Shephard (2011)).
Set against this background, we propose a decomposition of the realized covariance
matrix into three realized semicovariance matrix components dictated by the signs of the
underlying high-frequency returns. The realized semicovariance matrices may be seen as
a high-frequency multivariate extension of the semivariances originally proposed in the
finance literature (e.g., Markowitz (1959), Mao (1970), Hogan and Warren (1972, 1974),
and Fishburn (1977)). Our high-frequency theoretical analysis is inspired by and extends
the pioneering work of Barndorff-Nielsen, Kinnebrock, and Shephard (2010).
To fix ideas, let $X_t = (X_{1,t}, \ldots, X_{d,t})^\top$ denote a $d$-dimensional log-price process, sampled on a regular time grid $\{i\Delta_n : 0 \le i \le [T/\Delta_n]\}$ over some fixed time span $T > 0$.
Let the $i$th return of $X$ be denoted by $\Delta_i^n X \equiv X_{i\Delta_n} - X_{(i-1)\Delta_n}$. The realized covariance
matrix (Barndorff-Nielsen and Shephard (2004a)) is then defined as:
$$\widehat{C} \equiv \sum_{i=1}^{[T/\Delta_n]} (\Delta_i^n X)(\Delta_i^n X)^\top. \tag{1.1}$$
If we let $p(x) \equiv \max\{x, 0\}$ and $n(x) \equiv \min\{x, 0\}$ denote the component-wise positive
and negative elements of the real vector $x$, the corresponding “positive,” “negative,” and
“mixed” realized semicovariance matrices are then simply defined as:
$$\begin{aligned}
\widehat{P} &\equiv \sum_{i=1}^{[T/\Delta_n]} p(\Delta_i^n X)\, p(\Delta_i^n X)^\top, \qquad
\widehat{N} \equiv \sum_{i=1}^{[T/\Delta_n]} n(\Delta_i^n X)\, n(\Delta_i^n X)^\top, \\
\widehat{M} &\equiv \sum_{i=1}^{[T/\Delta_n]} \left( p(\Delta_i^n X)\, n(\Delta_i^n X)^\top + n(\Delta_i^n X)\, p(\Delta_i^n X)^\top \right).
\end{aligned} \tag{1.2}$$
Note that $\widehat{C} = \widehat{P} + \widehat{N} + \widehat{M}$ for any sampling frequency $\Delta_n$. The concordant realized
semicovariance matrices, $\widehat{P}$ and $\widehat{N}$, are defined as sums of vector outer-products and thus
are positive semidefinite. By contrast, the mixed semicovariance matrix, $\widehat{M}$, has diagonal
elements that are identically zero, and thus is necessarily indefinite.
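The decomposition in equation (1.2) can be sketched in a few lines. The snippet below is only an illustration under stated assumptions: the returns are taken to be stored as an (observations × assets) NumPy array, the function name `realized_semicovariances` is hypothetical, and the data are simulated rather than taken from the paper.

```python
import numpy as np

def realized_semicovariances(returns):
    """Return (C, P, N, M) for an (n_obs, d) array of high-frequency returns."""
    r = np.asarray(returns, dtype=float)
    pos = np.maximum(r, 0.0)        # p(x): component-wise positive part
    neg = np.minimum(r, 0.0)        # n(x): component-wise negative part
    C = r.T @ r                     # realized covariance, eq. (1.1)
    P = pos.T @ pos                 # positive semicovariance
    N = neg.T @ neg                 # negative semicovariance
    M = pos.T @ neg + neg.T @ pos   # mixed semicovariance
    return C, P, N, M

# Illustrative data: 390 simulated one-minute returns on 3 assets (hypothetical).
rng = np.random.default_rng(0)
returns = 0.001 * rng.standard_normal((390, 3))
C, P, N, M = realized_semicovariances(returns)
```

Because each return component is either positive or negative but never both, the diagonal of the mixed matrix is exactly zero and the four matrices satisfy the additive identity stated above.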
As an illustration of the different dynamic dependencies and information conveyed by
Figure 1: Realized Covariance Decomposition. [Time-series plot, 1994 to 2014. Horizontal axis: Time; vertical axis: (Semi)Covariance. Series: Realized Covariance, Concordant Semicovariance, Mixed Semicovariance.]
Note: The figure plots the time series of the concordant semicovariance (P + N), the mixed semicovariance (M) and the realized covariance (C). Each series is constructed as the moving average of the relevant daily realized measures averaged across 500 random pairs of S&P 500 stocks.
the realized semicovariances, Figure 1 plots the daily realized covariance averaged across
500 randomly-selected pairs of S&P 500 stocks, together with its concordant (P + N)
and mixed (M) semicovariance components.1 The mixed component is, of course, always
negative, while the concordant component is always positive. The two components are
typically similar in magnitude during “normal” time periods, while in periods of high
volatility the concordant component increases substantially more than the mixed compo-
nent declines, in line with the widely-held belief that during periods of financial market
stress correlations between financial assets tend to increase. As such, the (total) realized
covariance is largely determined by the concordant realized semicovariance components
in these “crisis” periods.
To help understand these empirical features, consider a simple setting in which the
vector log-price process Xt is generated by a Brownian motion with constant drift b, unit
volatility, and constant correlation ρ. Although stylized, this simple model captures the
central force in the first-order asymptotic behavior of the semicovariance estimators in
the no-jump setting; we provide a more general analysis below. Normalizing T = 1, by
the law of large numbers the probability limits (as Δn → 0) of the (j, k) off-diagonal
1Each day we randomly draw 500 pairs of assets and compute $\bar{C} \equiv (1/500)\sum_{j \neq k} \widehat{C}_{jk}$. $\bar{P}$, $\bar{N}$ and $\bar{M}$ are computed similarly. A detailed description of the data is provided in Section 4 below. To avoid cluttering the figure, we sum $\bar{P}$ and $\bar{N}$ into a single concordant component, and smooth the daily measures using a day t − 25 to day t + 25 moving average.
Figure 2: Signed Return-Pairs for DJIA Stocks. [Two scatter-plot panels. Left panel: September 18, 2013; right panel: February 25, 2013. Horizontal axis: Return of Asset k; vertical axis: Return of Asset j.]
Note: The figure shows a scatter plot of the one-minute returns of each pair of the 30 Dow Jones Industrial Average stocks on two days in 2013. The left panel presents a day with an FOMC announcement that led to positive stock price jumps for many stocks. The right panel presents a day with steady downward price moves for many stocks.
elements of the realized semicovariance matrices are then given by
$$\operatorname{plim} \widehat{P}_{jk} = \operatorname{plim} \widehat{N}_{jk} = \psi(\rho), \qquad \operatorname{plim} \widehat{M}_{jk} = -2\psi(-\rho), \tag{1.3}$$
where
$$\psi(\rho) = (2\pi)^{-1}\left(\rho \arccos(-\rho) + \sqrt{1-\rho^2}\right) \tag{1.4}$$
corresponds to $E[Z_1 Z_2 1_{\{Z_1<0,\,Z_2<0\}}]$ for $(Z_1, Z_2)$ bivariate standard normally distributed
with correlation $\rho$. As these expressions illustrate, the limiting values of the semicovariance
components depend crucially on the value of $\rho$. As $\rho$ increases to one, the limiting
value of the concordant component P + N also approaches one, while the mixed com-
ponent M approaches zero, and vice versa when ρ decreases to −1. This is consistent
with the empirical observation from Figure 1 that the concordant component accounts
for most of the covariance in periods of market stress, which are generally believed to be
accompanied by increased positive correlations.
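The limits in equations (1.3) and (1.4) are easy to evaluate numerically. The following is a minimal sketch, assuming unit volatilities as in the stylized model; the function name `psi`, the grid of correlations, the seed and the Monte Carlo sample size are illustrative choices and not part of the paper.

```python
import numpy as np

def psi(rho):
    """psi(rho) from eq. (1.4): E[Z1*Z2*1{Z1<0, Z2<0}] for correlated standard normals."""
    rho = np.asarray(rho, dtype=float)
    return (rho * np.arccos(-rho) + np.sqrt(1.0 - rho**2)) / (2.0 * np.pi)

rhos = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
concordant = 2.0 * psi(rhos)     # limit of P_jk + N_jk
mixed = -2.0 * psi(-rhos)        # limit of M_jk
total = concordant + mixed       # limit of C_jk, which equals rho here

# Monte Carlo cross-check of psi at rho = 0.5.
rng = np.random.default_rng(1)
rho = 0.5
z1 = rng.standard_normal(1_000_000)
z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(1_000_000)
mc_psi = np.mean(z1 * z2 * ((z1 < 0) & (z2 < 0)))
```

At ρ = 1 the concordant limit is 2ψ(1) = 1 and the mixed limit is −2ψ(−1) = 0, while at ρ = −1 the roles are reversed, which matches the discussion above.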
This simple diffusive setting highlights the potentially different information conveyed
by the concordant and the mixed semicovariance components. It does not, however,
reveal any differences between the P and N components as they have the same limits in
this stylized setting. This is at odds with the intuition that these measures carry distinct
economic information about good and bad news over each day. To illustrate, Figure 2
presents high-frequency returns on each of the 30 Dow Jones Industrial Average (DJIA)
stocks on two different days. On September 18, 2013 (left panel), the Federal Reserve
announced that it would not taper its asset-purchasing program, in contrast to what the
market had been anticipating, and individual stocks responded with positive jumps at
the announcement time, resulting in much larger estimates of P than N . By contrast, on
February 25, 2013 (right panel), the DJIA drifted down by 1.5% over the course of the
day amid concerns, according to market anecdotes,2 about political uncertainty in Italy,
and generated much larger estimates of N than P . Hence, the empirical estimates of P
and N can indeed differ depending on the directional content of the news, and whether
it manifests in the form of price jumps or price drifts.
Motivated by these empirical observations, in Section 2 we derive first- and second-
order asymptotics for the positive and the negative semicovariance estimators in a general
Ito semimartingale setting, focusing particularly on a deeper understanding of the infor-
mation they convey. Extending the work of Barndorff-Nielsen, Kinnebrock, and Shephard
(2010), our limit theory identifies three channels through which P and N may differ: di-
rectional “co-jumps,” a type of “co-drifting,” and a specific form of “dynamic leverage
effect.” The co-jump channel manifests straightforwardly in the first-order asymptotics.
In particular, assuming that the vector log-price process in the aforementioned example
is subject to finite activity jumps, it follows readily that
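$$\operatorname{plim} \widehat{P}_{jk} = \psi(\rho) + \sum_{0<s\le 1} p(\Delta X_{j,s})\, p(\Delta X_{k,s}),$$
$$\operatorname{plim} \widehat{N}_{jk} = \psi(\rho) + \sum_{0<s\le 1} n(\Delta X_{j,s})\, n(\Delta X_{k,s}),$$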
where ΔXj,s denotes the jump of the jth component of X at time s. By comparison, co-
drifting and dynamic leverage effects manifest in second-order bias terms in a non-central
limit theorem. These terms are unique to our analysis of realized semicovariances, and,
from a methodological perspective, they set our asymptotic analysis apart from stan-
dard high-frequency econometric analysis, in which central limit theorems are generally
applied for the purpose of conducting statistical inference (see, e.g., Aït-Sahalia and Jacod
(2014)). By contrast, the main purpose of our higher-order asymptotic results is to
“dissect” the semicovariance estimators, thereby allowing for additional theoretical and
empirical insights by comparing and contrasting the relevant terms.
To gain more insights into the price behaviors illustrated in Figure 2, we employ
a standard truncation technique (Mancini (2001)) to separately study the jump and
diffusive information in positive and negative semicovariances. For the jump component,
we establish a feasible central limit theorem that facilitates formal statistical inference.
For the diffusive component, which is more complicated to study, we provide a standard
error estimator that quantifies its sampling variability in a well-defined sense, and which,
under more restrictive regularity conditions, also yields an asymptotically valid test.
2Source: https://money.cnn.com/2013/02/25/investing/stocks-markets/index.html.
Implementing the new inference procedures on high-frequency data for the 30 DJIA
stocks over a nine-year period reveals strong evidence of significant differences in the P
and N semicovariance components on many days. Consistent with economic intuition,
large differences in the jump semicovariance components are typically associated with
“sharp” public news announcements (e.g., FOMC announcements). Large differences in
the diffusive semicovariance components are typically associated with more difficult-to-
interpret news, which manifests in the form of common price drifts within the day.3
The above results naturally suggest that decomposing the realized covariance matrix
into its semicovariance components may be useful for volatility forecasting. We investi-
gate this idea by analyzing a large cross-section of stocks comprised of all of the S&P
500 constituents. We show that out-of-sample forecasts of portfolio return volatility are
significantly improved by using semicovariance measures. The gains from doing so in-
crease with the number of stocks included in the portfolio, although in line with the gains
from portfolio diversification, they appear to plateau at around 30-40 stocks in the port-
folio. Further dissecting the forecasting gains, we find that the models that use realized
semicovariances generally respond to new information faster than models that use only
realized variances or semivariances (e.g., Corsi (2009) and Patton and Sheppard (2015)).
Interestingly, during the financial crisis existing volatility models reduce the weight on
recent information, whereas semicovariance-based models increase the weight, primarily
due to an increase in the short-run importance of the negative semicovariance component.
The forecasting gains obtained through the use of the realized semicovariances are
naturally linked to the early work on asymmetric volatility models (e.g., Kroner and Ng
(1998) and Cappiello, Engle, and Sheppard (2006)). Our proposed tests for differences be-
tween positive and negative semicovariance components are also related to existing tests
for asymmetric dependencies (e.g., Longin and Solnik (2001), Ang and Chen (2002) and
Hong, Tu, and Zhou (2007)). Our work also connects to earlier empirical results on corre-
lations between asset returns in “bear” versus “bull” markets, and notions of asymmetric
tail dependencies (e.g., Patton (2004), Poon, Rockinger, and Tawn (2004), and Tjøstheim
and Hufthammer (2013)), and to recent work on high-frequency based co-skewness and
co-kurtosis measures (e.g., Neuberger (2012) and Amaya, Christoffersen, Jacobs, and
Vasquez (2015)), and jumps and co-jumps (e.g., Das and Uppal (2004), Bollerslev, Law,
and Tauchen (2008), Lee and Mykland (2008), Mancini and Gobbi (2012), Jacod and
Todorov (2009), Aït-Sahalia and Xiu (2016) and Li, Todorov, and Tauchen (2017b)). In
contrast to all of these studies, however, we retain the covariance matrix as the sum-
mary measure of dependence, and instead use information from signed high-frequency
returns to “look inside” this matrix to obtain additional information about the inherent
dependencies, both dynamically and cross-sectionally at a given point in time.
3The two days plotted in Figure 2 also correspond to these two scenarios, and are indeed detected by using the new inference method, as further discussed in Section 3 below.
The rest of the paper is organized as follows. Section 2 presents the first- and second-
order asymptotic properties of the realized semicovariances. Readers primarily interested
in empirical applications of the new measures may skip the more technical parts of Section
2. Section 3 discusses our empirical findings from the new semicovariance-based tests.
Our results pertaining to the use of the realized semicovariances in the construction of
improved volatility forecasts are discussed in Section 4. Section 5 concludes. Technical
regularity conditions and proofs are deferred to the Appendix. Additional robustness
checks and extensions are available in a Supplemental Appendix.
2. Information content of realized semicovariances: an asymptotic analysis
In this section, we demonstrate the differential information embedded in the realized
semicovariance measures in equation (1.2) in an infill asymptotic framework. Sections
2.1 and 2.2 present the first- and the second-order asymptotics, respectively. Section 2.3
describes feasible inference methods. Below, for a matrix $A$, we denote its $(j, k)$ element
by $A_{jk}$ and its transpose by $A^\top$. Convergence in probability and stable convergence in
law are denoted by $\xrightarrow{P}$ and $\xrightarrow{\mathcal{L}\text{-}s}$, respectively. All limits are for the sample frequency
$\Delta_n \to 0$ on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$.
2.1. First-order asymptotic properties
Suppose that the log-price vector $X_t$ is an Itô semimartingale of the form
$$X_t = X_0 + \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s + J_t, \tag{2.1}$$
where $b$ is the $\mathbb{R}^d$-valued drift process, $W$ is a $d$-dimensional standard Brownian motion,
$\sigma$ is the $d \times d$ dimensional stochastic volatility matrix and $J$ is a finitely active pure-jump
process. We denote the spot covariance matrix of $X$ by $c_t \equiv \sigma_t \sigma_t^\top$ and further set
$$v_{j,t} \equiv \sqrt{c_{jj,t}}, \qquad \rho_{jk,t} \equiv \frac{c_{jk,t}}{v_{j,t}\, v_{k,t}}. \tag{2.2}$$
That is, $v_{j,t}$ and $\rho_{jk,t}$ denote the spot volatility of asset $j$ and the spot correlation coefficient
between assets $j$ and $k$, respectively. We explicitly allow for the so-called “leverage
effect” (i.e., dependence between changes in the price and changes in volatility), stochastic
volatility of volatility, volatility jumps and price-volatility co-jumps.
We begin by characterizing the first-order limiting behavior of the realized semico-
variance estimators defined by equation (1.2) in the Introduction. Let ΔXs denote the
price jump occurring at time s, if a jump occurred, and set it to zero if no jump occurred
at time s. Further define
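$$\begin{aligned}
P^\dagger &\equiv \sum_{s \le T} p(\Delta X_s)\, p(\Delta X_s)^\top, \\
N^\dagger &\equiv \sum_{s \le T} n(\Delta X_s)\, n(\Delta X_s)^\top, \\
M^\dagger &\equiv \sum_{s \le T} \left( p(\Delta X_s)\, n(\Delta X_s)^\top + n(\Delta X_s)\, p(\Delta X_s)^\top \right).
\end{aligned}$$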
These measures characterize the discontinuous parts of the semicovariance measures, as
formally spelled out in the following theorem.
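Theorem 1 Under Assumption 1 in the appendix, $(\widehat{P}, \widehat{N}, \widehat{M}) \xrightarrow{P} (P, N, M)$, where $P$,
$N$ and $M$ are $d \times d$ matrices with their $(j, k)$ elements given by
$$\begin{aligned}
P_{jk} &\equiv \int_0^T v_{j,s} v_{k,s}\, \psi(\rho_{jk,s})\,ds + P^\dagger_{jk}, \\
N_{jk} &\equiv \int_0^T v_{j,s} v_{k,s}\, \psi(\rho_{jk,s})\,ds + N^\dagger_{jk}, \\
M_{jk} &\equiv -2\int_0^T v_{j,s} v_{k,s}\, \psi(-\rho_{jk,s})\,ds + M^\dagger_{jk},
\end{aligned}$$
and $\psi(\cdot)$ is defined in equation (1.4).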
It follows from Theorem 1 that each of the realized semicovariances contains both
diffusive and jump covariation components. Importantly, the limiting variables P and
N share exactly the same diffusive component, but their jump components differ. In
particular,
$$\widehat{P} - \widehat{N} \xrightarrow{P} P - N = P^\dagger - N^\dagger.$$
That is, the first-order asymptotic behavior of the concordant semicovariance differential
(CSD) is fully characterized by the “directional co-jumps.” Consequently, in line with
the stylized model underlying equation (1.3) discussed in the introduction, Theorem 1 cannot
distinguish the information conveyed by P and N in periods when there are no jumps.
Hence, in order to reveal the differential information inherent in the realized measures
more generally, we turn next to a more refined second-order asymptotic analysis.
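The first-order result can be illustrated with a small simulation. The sketch below is not the paper's empirical design: it assumes T = 1, zero drift, unit volatilities, a constant correlation of 0.5, and a single hypothetical positive co-jump, so that the limits of the (1,2) elements are ψ(ρ) plus the jump product for the positive component and ψ(ρ) alone for the negative component.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 23_400                       # number of intraday returns; Delta_n = 1/n with T = 1
rho = 0.5
chol = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))

# Diffusive returns: zero drift, unit volatility, constant correlation rho.
returns = (rng.standard_normal((n, 2)) @ chol.T) / np.sqrt(n)

# One positive co-jump hitting both assets in the same interval (hypothetical sizes).
jump = np.array([0.8, 0.6])
returns[n // 2] += jump

pos = np.maximum(returns, 0.0)
neg = np.minimum(returns, 0.0)
P12 = np.sum(pos[:, 0] * pos[:, 1])   # realized positive semicovariance, (1,2) element
N12 = np.sum(neg[:, 0] * neg[:, 1])   # realized negative semicovariance, (1,2) element

psi_rho = (rho * np.arccos(-rho) + np.sqrt(1.0 - rho**2)) / (2.0 * np.pi)
P_limit = psi_rho + jump[0] * jump[1]  # diffusive part plus positive co-jump part
N_limit = psi_rho                      # no negative co-jumps in this design
```

In this design the difference between the two realized measures should be close to the co-jump product 0.8 × 0.6 = 0.48, since the common diffusive part cancels.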
2.2. Second-order asymptotic properties
Since the main theoretical lessons about the second-order asymptotic behavior of
the concordant realized semicovariance components can be readily learnt in a bivariate
setting, we set d = 2 and focus on the analysis of P12 and N12 throughout this subsection.
Correspondingly, we also write ρt in place of ρ12,t for simplicity. The joint analysis of
all semicovariance components can be done in a similar manner. However, it is much
more tedious to discuss, so for readability we defer this more general analysis to the
Supplemental Appendix, Section S1.
We need to impose some additional structure on the volatility dynamics. In particular,
we will assume that the stochastic volatility $\sigma_t$ is also an Itô semimartingale of the form
(see, e.g., equation (4.4.4) in Jacod and Protter (2012))
$$\sigma_t = \sigma_0 + \int_0^t \tilde{b}_s\,ds + \int_0^t \tilde{\sigma}_s\,dW_s + \tilde{M}_t + \sum_{s \le t} \Delta\sigma_s 1_{\{\|\Delta\sigma_s\| > \bar{\sigma}\}}, \tag{2.3}$$
where $\tilde{b}$ is the drift, $\tilde{\sigma}$ is a $d \times d \times d$ tensor-valued process, and $\tilde{M}$ is a local martingale
that is orthogonal to the Brownian motion $W$.4 A few remarks are in order. The process
$\tilde{\sigma}$ collects the loadings of the stochastic volatility matrix $\sigma$ on the price Brownian
shocks $dW$, and hence is naturally thought of as a multivariate quantification of a “leverage
effect.” We also allow $\sigma$ to load on Brownian shocks that are independent of $dW$
through the local martingale $\tilde{M}$. The $\tilde{M}$ process may also contain compensated “small”
volatility jumps in the form of a purely discontinuous local martingale.5 Meanwhile,
the term $\sum_{s \le t} \Delta\sigma_s 1_{\{\|\Delta\sigma_s\| > \bar{\sigma}\}}$ collects the “large” volatility jumps (with an arbitrary but
fixed threshold $\bar{\sigma} > 0$), which often occur in response to major news announcements (see,
e.g., Bollerslev, Li, and Xue (2018)). The Itô semimartingale setting also readily accommodates
the well-established intraday periodicity in the volatility dynamics (see, e.g.,
Andersen and Bollerslev (1997)). Further regularity conditions regarding the $\sigma$ process
are collected in the appendix.
Theorem 2, below, describes the $\mathcal{F}$-stable convergence in law of the normalized statistic
$\Delta_n^{-1/2}(\widehat{P}_{12} - P_{12}, \widehat{N}_{12} - N_{12})$. The limit variable turns out to be fairly complicated,
but it may be succinctly expressed as
$$\begin{pmatrix} B \\ -B \end{pmatrix} + \begin{pmatrix} L \\ -L \end{pmatrix} + \begin{pmatrix} \zeta \\ -\zeta \end{pmatrix} + \begin{pmatrix} \tilde{\zeta}_P \\ \tilde{\zeta}_N \end{pmatrix} + \begin{pmatrix} \xi_P \\ \xi_N \end{pmatrix}, \tag{2.4}$$
where, as we will detail below, $B$ and $L$ are bias terms, and $(\zeta, \tilde{\zeta}, \xi)$ capture sampling
variabilities that arise from various sources. In particular, we note that $\widehat{P}_{12}$ and $\widehat{N}_{12}$ load
on the $B$ and $L$ bias terms in exactly opposite ways, which would cancel with each other
in the (aggregated) realized covariance $\widehat{C}_{12}$. Before presenting the actual limit theorem,
we begin by briefly describing each of these separate terms. Recall that the processes $b$,
$\sigma$, $v$, $\rho$, and $\tilde{\sigma}$ have previously been introduced in equations (2.1), (2.2), and (2.3).
Bias components due to price drift, B. The first type of bias is related to the price
4By convention, the $(j, k)$ element of the stochastic integral $\int_0^t \tilde{\sigma}_s\,dW_s$ equals $\sum_{l=1}^{d} \int_0^t \tilde{\sigma}_{jkl,s}\,dW_{l,s}$.
5We remind the reader that two local martingales are called orthogonal if their product is a local martingale (or equivalently, their predictable covariation process is identically zero). A local martingale is called purely discontinuous if it is orthogonal to all continuous local martingales. See Definition I.4.11 and Proposition I.4.15 in Jacod and Shiryaev (2003) for additional details.
drift, which is defined for the semicovariance estimator P12 as
$$B = \frac{1}{2\sqrt{2\pi}} \int_0^T \left( \frac{b_{1,s}}{v_{1,s}} + \frac{b_{2,s}}{v_{2,s}} \right) v_{1,s} v_{2,s} (1 + \rho_s)\,ds. \tag{2.5}$$
As mentioned above, the analogous bias term for N12 is −B. Other things equal, this
bias term is proportional to the average spot “Sharpe ratio” of the two assets
$$\frac{1}{2} \left( \frac{b_{1,s}}{v_{1,s}} + \frac{b_{2,s}}{v_{2,s}} \right). \tag{2.6}$$
Therefore, the B term tends to be more pronounced when the two assets drift in the
same direction, akin to a “co-drift” type phenomenon.
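The co-drift bias can be checked by simulation in a constant-coefficient setting. The sketch below assumes T = 1, unit volatilities, constant correlation, no jumps and no leverage effect, so that only the drift term shifts the mean of the normalized difference between the positive and negative realized semicovariances, which should then be close to 2B with B as in equation (2.5). The drift values, sample sizes and seed are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 500, 2000             # Delta_n = 1/n with T = 1; both sizes are arbitrary
b = np.array([2.0, 2.0])        # constant drifts (hypothetical values)
rho = 0.5                       # unit volatilities, constant correlation
chol = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))

# returns[m, i, :] is the i-th bivariate return in replication m.
z = rng.standard_normal((reps, n, 2)) @ chol.T
returns = b / n + z / np.sqrt(n)             # drift * Delta_n + diffusive shock

pos = np.maximum(returns, 0.0)
neg = np.minimum(returns, 0.0)
P12 = np.sum(pos[..., 0] * pos[..., 1], axis=1)
N12 = np.sum(neg[..., 0] * neg[..., 1], axis=1)

# Average of Delta_n^{-1/2} (P12 - N12) across replications.
mc_mean = np.mean(np.sqrt(n) * (P12 - N12))

# 2B from eq. (2.5) with constant coefficients, v1 = v2 = 1 and T = 1.
two_B = 2.0 * (b[0] + b[1]) * (1.0 + rho) / (2.0 * np.sqrt(2.0 * np.pi))
```

With both drifts positive, the Monte Carlo mean is positive, in line with the interpretation that assets drifting upwards together tilt the positive semicovariance above the negative one.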
Bias components due to continuous price-volatility covariation, L. The second bias
term stems from the fact that the volatility matrix process σt may be partially driven by
the Brownian motion $W$ (i.e., $\tilde{\sigma} \neq 0$), corresponding to a “dynamic leverage” type effect.
To more precisely describe this bias term for $\widehat{P}_{12}$, define $f_1(x) \equiv 1_{\{x_1 \ge 0\}} \max\{x_2, 0\}$ and
$f_2(x) \equiv \max\{x_1, 0\} 1_{\{x_2 \ge 0\}}$ and then set, for any $2 \times 2$ matrix $A$,
$$F_j(A) \equiv E\left[ f_j(A W_1) \int_0^1 W_s\,dW_s^\top \right], \qquad j = 1, 2.$$
The bias term in P12 due to the common price-volatility Brownian dependence may then
be expressed as
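$$L \equiv \sum_{j=1}^{2} \int_0^T \operatorname{Trace}\left[ \tilde{\sigma}_{j,s} F_j(\sigma_s) \right] ds, \tag{2.7}$$
where $\tilde{\sigma}_{j,s}$ denotes the $2 \times 2$ matrix $[\tilde{\sigma}_{jkl,s}]_{1 \le k,l \le 2}$. The bias term for $\widehat{N}_{12}$ may be defined
similarly, and it can be shown to equal $-L$.
Diffusive sampling error spanned by price risk, $\zeta$. The third component in the limit
of $\widehat{P}_{12}$ captures the sampling variability in $p(\Delta_i^n X_1)\, p(\Delta_i^n X_2)$ that is spanned by the
Brownian price shock $\sigma_t\,dW_t$. Formally,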
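$$\zeta \equiv \int_0^T \left( c_s^{-1} \gamma_s \right)^\top (\sigma_s\,dW_s), \tag{2.8}$$
where the $\gamma_t$ process is defined as
$$\gamma_t \equiv \frac{(1 + \rho_t)^2\, v_{1,t} v_{2,t}}{2\sqrt{2\pi}} \begin{pmatrix} v_{1,t} \\ v_{2,t} \end{pmatrix}. \tag{2.9}$$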
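The analogous component for $\widehat{N}_{12}$ equals $-\zeta$. Note that the quadratic covariation matrix
of the local martingale $(\zeta, -\zeta)$ equals $\int_0^T \Gamma_s\,ds$, where
$$\Gamma_t \equiv \begin{pmatrix} \gamma_t^\top c_t^{-1} \gamma_t & -\gamma_t^\top c_t^{-1} \gamma_t \\ -\gamma_t^\top c_t^{-1} \gamma_t & \gamma_t^\top c_t^{-1} \gamma_t \end{pmatrix}. \tag{2.10}$$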
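Diffusive sampling error orthogonal to price risk, $\tilde{\zeta}$. While $\zeta$ defined above captures
the diffusive risk in the semicovariance spanned by the Brownian shocks to the price
process, $\tilde{\zeta}$ captures the diffusive risk component orthogonal to those shocks. This limit
variable may be represented by its $\mathcal{F}$-conditional distribution as
$$\tilde{\zeta} = \begin{pmatrix} \tilde{\zeta}_P \\ \tilde{\zeta}_N \end{pmatrix} = \int_0^T \tilde{\gamma}_s^{1/2}\,d\tilde{W}_s, \tag{2.11}$$
where $\tilde{W}$ is a 2-dimensional standard Brownian motion that is independent of the $\sigma$-field
$\mathcal{F}$, and the $\tilde{\gamma}$ process is defined by $\tilde{\gamma}_t \equiv \bar{\Gamma}_t - \Gamma_t$, where
$$\bar{\Gamma}_t \equiv v_{1,t}^2 v_{2,t}^2 \begin{pmatrix} \Psi(\rho_t) - \psi(\rho_t)^2 & -\psi(\rho_t)^2 \\ -\psi(\rho_t)^2 & \Psi(\rho_t) - \psi(\rho_t)^2 \end{pmatrix}, \tag{2.12}$$
and
$$\Psi(\rho) \equiv \frac{3\rho\sqrt{1 - \rho^2} + (1 + 2\rho^2) \arccos(-\rho)}{2\pi}, \tag{2.13}$$
with $\Psi(\rho)$ corresponding to $E[Z_1^2 Z_2^2 1_{\{Z_1<0,\,Z_2<0\}}]$ for $(Z_1, Z_2)$ standard normally distributed
with correlation $\rho$.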
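The closed forms in equations (1.4), (2.12) and (2.13) can be cross-checked numerically. The sketch below rests on one observation that is an interpretation rather than a statement from the paper: the matrix in (2.12) coincides with the covariance matrix of the pair formed by the product of positive parts and the product of negative parts of a bivariate normal vector with volatilities v1, v2 and correlation ρ. The function names, parameter values and seed are illustrative.

```python
import numpy as np

def psi(rho):
    """Eq. (1.4): E[Z1 Z2 1{Z1<0, Z2<0}]."""
    return (rho * np.arccos(-rho) + np.sqrt(1.0 - rho**2)) / (2.0 * np.pi)

def big_psi(rho):
    """Eq. (2.13): E[Z1^2 Z2^2 1{Z1<0, Z2<0}]."""
    return (3.0 * rho * np.sqrt(1.0 - rho**2)
            + (1.0 + 2.0 * rho**2) * np.arccos(-rho)) / (2.0 * np.pi)

def matrix_2_12(v1, v2, rho):
    """2x2 matrix in eq. (2.12) for given spot volatilities and correlation."""
    diag = big_psi(rho) - psi(rho) ** 2
    off = -psi(rho) ** 2
    return (v1 * v2) ** 2 * np.array([[diag, off], [off, diag]])

# Monte Carlo cross-check (hypothetical parameter values).
rng = np.random.default_rng(4)
rho, v1, v2 = 0.5, 1.0, 2.0
z1 = rng.standard_normal(1_000_000)
z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(1_000_000)
x1, x2 = v1 * z1, v2 * z2
pp = np.maximum(x1, 0.0) * np.maximum(x2, 0.0)   # product of positive parts
nn = np.minimum(x1, 0.0) * np.minimum(x2, 0.0)   # product of negative parts
mc_cov = np.cov(np.vstack([pp, nn]))             # 2x2 sample covariance matrix
```

Two quick sanity values follow directly from the formula: Ψ(0) = 1/4 (the product of two half second moments) and Ψ(1) = 3/2 (half of the fourth moment of a standard normal).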
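Jump-induced sampling error, $\xi$. The price jumps also induce sampling errors. Let $\mathcal{T}_j$
for $j \in \{1, 2\}$ denote the collection of jump times of $(X_{j,t})_{t \in [0,T]}$, with the corresponding
“signed” subsets denoted by
$$\mathcal{T}_{j+} \equiv \{\tau \in \mathcal{T}_j : \Delta X_{j,\tau} > 0\}, \qquad \mathcal{T}_{j-} \equiv \{\tau \in \mathcal{T}_j : \Delta X_{j,\tau} < 0\}.$$
For each $\tau \in \mathcal{T}_1 \cup \mathcal{T}_2$ associate the variables $(\kappa_\tau, \xi_{\tau-}, \xi_{\tau+})$ that are, conditionally on $\mathcal{F}$,
mutually independent with the following conditional distributions: $\kappa_\tau \sim \text{Uniform}[0, 1]$,
$\xi_{\tau-} \sim \mathcal{MN}(0, c_{\tau-})$, and $\xi_{\tau+} \sim \mathcal{MN}(0, c_\tau)$. Further define $\eta_\tau = (\eta_{1,\tau}, \eta_{2,\tau})^\top \equiv \sqrt{\kappa_\tau}\,\xi_{\tau-} + \sqrt{1 - \kappa_\tau}\,\xi_{\tau+}$. The limiting variable $\xi = (\xi_P, \xi_N)^\top$ may then be expressed as
$$\begin{aligned}
\xi_P &\equiv \sum_{\tau \in \mathcal{T}_{1+} \cap \mathcal{T}_{2+}} \left( \Delta X_{1,\tau}\, \eta_{2,\tau} + \Delta X_{2,\tau}\, \eta_{1,\tau} \right) + \sum_{\tau \in \mathcal{T}_{1+} \setminus \mathcal{T}_2} \Delta X_{1,\tau}\, p(\eta_{2,\tau}) + \sum_{\tau \in \mathcal{T}_{2+} \setminus \mathcal{T}_1} \Delta X_{2,\tau}\, p(\eta_{1,\tau}), \\
\xi_N &\equiv \sum_{\tau \in \mathcal{T}_{1-} \cap \mathcal{T}_{2-}} \left( \Delta X_{1,\tau}\, \eta_{2,\tau} + \Delta X_{2,\tau}\, \eta_{1,\tau} \right) + \sum_{\tau \in \mathcal{T}_{1-} \setminus \mathcal{T}_2} \Delta X_{1,\tau}\, n(\eta_{2,\tau}) + \sum_{\tau \in \mathcal{T}_{2-} \setminus \mathcal{T}_1} \Delta X_{2,\tau}\, n(\eta_{1,\tau}).
\end{aligned}$$
Note that the first component in $\xi_P$ (resp. $\xi_N$) concerns times when both assets have
positive (resp. negative) jumps, while the other two terms are active when one asset
jumps upwards (resp. downwards) and the other asset does not jump. The latter terms
involve the half-truncated doubly mixed Gaussian variable $p(\eta_\tau)$ (resp. $n(\eta_\tau)$). To the
best of our knowledge, this limiting distribution is new to the literature.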
With the definitions above, we are now ready to state the stable convergence in law
of the realized semicovariances.
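Theorem 2 Under Assumption 2 in the appendix,
$$\Delta_n^{-1/2} \begin{pmatrix} \widehat{P}_{12} - P_{12} \\ \widehat{N}_{12} - N_{12} \end{pmatrix} \xrightarrow{\mathcal{L}\text{-}s} \begin{pmatrix} B \\ -B \end{pmatrix} + \begin{pmatrix} L \\ -L \end{pmatrix} + \begin{pmatrix} \zeta \\ -\zeta \end{pmatrix} + \begin{pmatrix} \tilde{\zeta}_P \\ \tilde{\zeta}_N \end{pmatrix} + \begin{pmatrix} \xi_P \\ \xi_N \end{pmatrix}.$$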
Theorem 2 depicts a non-central limit theorem for the positive and negative realized
semicovariances, where $(\pm B, \pm L)$ represent bias terms, while $(\pm\zeta, \tilde{\zeta}, \xi)$ stem from various
sources of “sampling errors.” The latter sampling error terms are all formed as (local)
martingales and have zero mean under mild integrability conditions. We note that the
bias terms arise because the test functions used to define the realized semicovariances
(e.g., (x1, x2) → p (x1) p (x2) = max{x1, 0}max{x2, 0}) are not globally even.6 This
phenomenon also appears in earlier work by Kinnebrock and Podolskij (2008), Barndorff-
Nielsen, Kinnebrock, and Shephard (2010), and Li, Mykland, Renault, Zhang, and Zheng
(2014); also see Chapter 5 of Jacod and Protter (2012). In the Supplemental Appendix
Section S1, we provide a more general result for the joint convergence of all the realized
semicovariance components, including the realized semivariances of Barndorff-Nielsen,
Kinnebrock, and Shephard (2010) as special “diagonal” cases. We also demonstrate there
how to recover that prior result in the case without price jumps, and further characterize
the effect of jumps on the sampling variability of the realized semivariances.
The presence of the bias terms means that Theorem 2 is not directly suitable for the
construction of confidence intervals for the (P,N) estimand; however, that is not our goal.
Instead, the main insight derived from Theorem 2 is to reveal the differential second-order
behavior of the realized semicovariances P and N , about which the first-order asymptotics
in Theorem 1 remains entirely silent in the absence of jumps. Indeed, while Theorem 1
states that P and N have the same limit in the no-jump case, Theorem 2 clarifies that
they actually load on higher-order “signals” ±B and ±L in the exact opposite way. From
a theoretical perspective, this therefore explains why P and N may behave differently.
From an empirical perspective, it helps guide our understanding of the actual P and N
estimates discussed in Section 3, and the use of these measures in the construction of
improved volatility forecasts discussed in Section 4.
Theorem 2 focuses on the concordant semicovariance terms, P and N . The asymptotic
6Following Jacod and Protter (2012), we say that a function $f(\cdot)$ defined on $\mathbb{R}^d$ is globally even if $f(x) = f(-x)$ for all $x \in \mathbb{R}^d$; see page 135 in that book. Kinnebrock and Podolskij (2008) simply refer to such functions as even functions; see page 1057 of that paper.
behavior of the mixed semicovariance term, M , is of particular interest when studying
negatively correlated assets, such as in analyzing hedge portfolios. This term is considered
jointly with all other elements of the semicovariance matrices in the more general limit
theorem presented in the Supplemental Appendix. Meanwhile, Theorem 2 can also be
used to help understand the behavior of the mixed semicovariances by computing P and
N on a rotation of the original returns: use the original returns on the first asset, and
the negative of the returns on the second asset.
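The rotation argument can be verified directly: flipping the sign of the second asset's returns turns the two cross-sign products in the mixed semicovariance into (minus) the concordant products of the rotated pair. The following sketch uses simulated returns and hypothetical helper names (`concordant_12`, `mixed_12`) purely for illustration.

```python
import numpy as np

def concordant_12(r1, r2):
    """Off-diagonal elements P12 and N12 for two return series."""
    P12 = np.sum(np.maximum(r1, 0.0) * np.maximum(r2, 0.0))
    N12 = np.sum(np.minimum(r1, 0.0) * np.minimum(r2, 0.0))
    return P12, N12

def mixed_12(r1, r2):
    """Off-diagonal element M12 from the definition in eq. (1.2)."""
    return np.sum(np.maximum(r1, 0.0) * np.minimum(r2, 0.0)
                  + np.minimum(r1, 0.0) * np.maximum(r2, 0.0))

rng = np.random.default_rng(5)
r1 = 0.001 * rng.standard_normal(390)   # simulated one-minute returns (hypothetical)
r2 = 0.001 * rng.standard_normal(390)

# Rotate: keep asset 1 as is and flip the sign of asset 2.
P12_rot, N12_rot = concordant_12(r1, -r2)
M12 = mixed_12(r1, r2)
```

In finite samples the identity is exact: the mixed element equals minus the sum of the two concordant elements computed on the rotated returns.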
2.3. Tests based on concordant semicovariance differentials
Even though Theorem 2 does not allow for the construction of standard confidence
intervals, it is still possible to develop feasible inference methods for the difference P −N ,
that is, the CSD. In particular, the asymptotic theory in the previous subsection reveals
three types of signals underlying the CSD: a directional co-jump effect (i.e., P † −N †), a
co-drifting effect (i.e., 2B), and a dynamic leverage effect (i.e., 2L). Empirically, it is of
great interest to separate the variation due to jumps from that due to the diffusive price
moves. Below, we use the standard truncation method (see, e.g., Mancini (2001, 2009))
to achieve such a separation. As in Section 2.2, we consider a bivariate setting, or d = 2,
and focus on the inference for P12 − N12.
The truncation method involves a sequence $u_n \in \mathbb{R}_+^2$ of truncation thresholds satisfying
$u_{j,n} \asymp \Delta_n^{\varpi}$ for some $\varpi \in (0, 1/2)$ and $j \in \{1, 2\}$. Under our maintained assumption
of finite activity jumps, it can be shown that the index set
$$\mathcal{I} \equiv \{i : -u_n \le \Delta_i^n X \le u_n \text{ does not hold}\}$$
consistently estimates the jump times of the vector log-price process X.7 In practice,
it is important to choose the truncation threshold un adaptively so as to account for
time-varying volatility, particularly its well-known U-shape intraday pattern; we discuss
this further in connection with our empirical analyses below. As a result, the diffusive
and jump returns may be separated, allowing for the separate estimation of the diffusive
components of the semicovariances using the “small” (non-jump) returns:
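$$\widehat{P}^\star \equiv \sum_{i \notin \mathcal{I}} p(\Delta_i^n X)\, p(\Delta_i^n X)^\top, \qquad \widehat{N}^\star \equiv \sum_{i \notin \mathcal{I}} n(\Delta_i^n X)\, n(\Delta_i^n X)^\top,$$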
7See Proposition 1 of Li, Todorov, and Tauchen (2017b), for which the inequality $-u_n \le \Delta_i^n X \le u_n$ is interpreted element-by-element. Importantly, although jumps can be separately recovered in asymptotic theory, it is difficult to do so in practice. The realized semicovariance measures only concern time-aggregated jump characteristics, instead of individual jumps. In our analysis, the jump recovery is only used as a theoretical auxiliary tool to prove asymptotic results for time-aggregated quantities.
and the jump components of the semicovariances using the “jump” returns:
P † ≡∑i∈I
p(ΔniX)p(Δn
iX)�, N † ≡∑i∈I
n(ΔniX)n(Δn
iX)�.
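The separation of “small” and “jump” returns described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `truncated_semicovariances` and its arguments are hypothetical, the thresholds are taken as given, and $p(x) = \max\{x, 0\}$ and $n(x) = \min\{x, 0\}$ are applied element-by-element.

```python
import numpy as np

def truncated_semicovariances(returns, thresholds):
    """Split concordant semicovariances into diffusive and jump parts.

    returns    : (n, d) array of high-frequency return vectors.
    thresholds : (d,) or (n, d) array of truncation levels u_n (assumed given).
    """
    r = np.asarray(returns, dtype=float)
    u = np.broadcast_to(np.asarray(thresholds, dtype=float), r.shape)
    # Interval i is flagged as a jump when -u <= r <= u fails for any element.
    jump = np.any(np.abs(r) > u, axis=1)
    pos, neg = np.maximum(r, 0.0), np.minimum(r, 0.0)
    out = {}
    for label, rows in (("diffusive", ~jump), ("jump", jump)):
        out["P_" + label] = pos[rows].T @ pos[rows]
        out["N_" + label] = neg[rows].T @ neg[rows]
    return out, jump

# Toy example: the third return vector exceeds the threshold in its first element.
toy = [[0.1, 0.2], [-0.1, -0.3], [2.0, 0.1], [0.05, -0.05]]
parts, flagged = truncated_semicovariances(toy, [1.0, 1.0])
```

By construction the diffusive and jump parts add up to the untruncated concordant semicovariances, so the split loses no information.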
The aforementioned jump detection result permits the truncated estimators to be
analyzed in the following straightforward extension of Theorem 2.
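Proposition 1 Under Assumption 2 in the appendix, the following convergences hold
jointly
$$\Delta_n^{-1/2} \begin{pmatrix} \hat{P}_{12}^{\dagger} - P_{12}^{\dagger} \\ \hat{N}_{12}^{\dagger} - N_{12}^{\dagger} \end{pmatrix} \overset{\mathcal{L}\text{-}s}{\longrightarrow} \begin{pmatrix} \xi_P \\ \xi_N \end{pmatrix},$$
$$\Delta_n^{-1/2} \begin{pmatrix} \hat{P}_{12}^{\star} - P_{12}^{\star} \\ \hat{N}_{12}^{\star} - N_{12}^{\star} \end{pmatrix} \overset{\mathcal{L}\text{-}s}{\longrightarrow} \begin{pmatrix} B \\ -B \end{pmatrix} + \begin{pmatrix} L \\ -L \end{pmatrix} + \begin{pmatrix} \zeta \\ -\zeta \end{pmatrix} + \begin{pmatrix} \zeta_P \\ \zeta_N \end{pmatrix},$$
where $P_{12}^{\star} = N_{12}^{\star} \equiv \int_0^T v_{1,s} v_{2,s} \psi(\rho_s)\, ds$.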
Proposition 1, together with consistent estimation of the spot covariances (discussed below),
allows for feasible inference on each of the two separate CSD components, $\hat{P}_{12}^{\dagger} - \hat{N}_{12}^{\dagger}$ and
$\hat{P}_{12}^{\star} - \hat{N}_{12}^{\star}$. We will refer to these as the jump and diffusive concordant semicovariance
differentials (JCSD and DCSD), respectively.
We start with a discussion of how to implement the JCSD test, which is the simpler
of the two as it admits a central limit theorem. In particular, it follows immediately from
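Proposition 1 that
$$\Delta_n^{-1/2} \left( \hat{P}_{12}^{\dagger} - \hat{N}_{12}^{\dagger} - \left( P_{12}^{\dagger} - N_{12}^{\dagger} \right) \right) \overset{\mathcal{L}\text{-}s}{\longrightarrow} \xi_P - \xi_N,$$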
where, as discussed above, ξP and ξN are defined in terms of doubly-mixed Gaussian
variables. Hence, to consistently estimate the distribution of the limiting variable, we
first need to estimate the spot covariance matrix before and after each detected jump
time. In order to do so, we choose an integer sequence kn of local windows that satisfies
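$k_n \to \infty$ and $k_n \Delta_n \to 0$, and set, for each $i$,
$$\hat{c}_{i-} \equiv \frac{1}{k_n \Delta_n} \sum_{l=1}^{k_n} \left( \Delta_{i-l}^n X \right) \left( \Delta_{i-l}^n X \right)^{\top} 1_{\{-u_n \le \Delta_{i-l}^n X \le u_n\}},$$
$$\hat{c}_{i+} \equiv \frac{1}{k_n \Delta_n} \sum_{l=1}^{k_n} \left( \Delta_{i+l}^n X \right) \left( \Delta_{i+l}^n X \right)^{\top} 1_{\{-u_n \le \Delta_{i+l}^n X \le u_n\}}.$$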
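As a rough illustration of the two local-window estimators, the following Python sketch computes the pre- and post-jump spot covariance matrices from an array of returns. The function name, the `side` argument, and the choice to raise an error when the window leaves the sample are assumptions of this sketch rather than part of the procedure described above.

```python
import numpy as np

def spot_covariance(returns, thresholds, i, k_n, delta_n, side="+"):
    """Truncated local-window spot covariance before ("-") or after ("+") index i.

    returns    : (n, d) array whose row i holds the i-th return vector.
    thresholds : (d,) or (n, d) array of truncation levels u_n (assumed given).
    """
    r = np.asarray(returns, dtype=float)
    u = np.broadcast_to(np.asarray(thresholds, dtype=float), r.shape)
    lags = np.arange(1, k_n + 1)
    idx = i + lags if side == "+" else i - lags
    if idx.min() < 0 or idx.max() >= len(r):
        raise ValueError("local window runs outside the sample")
    window = r[idx]
    # Keep only returns satisfying -u <= r <= u element-by-element.
    keep = np.all(np.abs(window) <= u[idx], axis=1)
    kept = window[keep]
    return kept.T @ kept / (k_n * delta_n)

# Toy example: constant small returns with one large return at index 3.
toy = np.tile([0.1, -0.2], (7, 1))
toy[3] = [5.0, 5.0]
c_minus = spot_covariance(toy, [1.0, 1.0], i=3, k_n=2, delta_n=0.01, side="-")
c_plus = spot_covariance(toy, [1.0, 1.0], i=1, k_n=2, delta_n=0.01, side="+")
```

Note that the normalization stays at $k_n \Delta_n$ even when some returns in the window are truncated, mirroring the displayed formulas.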
Algorithm 1 describes the requisite steps for implementing the resulting JCSD test for
the null hypothesis $P_{12}^{\dagger} = N_{12}^{\dagger}$, that is, equal directional jump covariation.
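Algorithm 1 (JCSD Test).
Step 1. Draw random variables $(\kappa_i^*, \xi_{i-}^*, \xi_{i+}^*)$ that are mutually independent such that
$\kappa_i^* \sim \mathrm{Uniform}[0, 1]$ and $\xi_{i\pm}^* \sim \mathcal{MN}(0, \hat{c}_{i\pm})$. Set
$\eta_i^* = (\eta_{i,1}^*, \eta_{i,2}^*) = \sqrt{\kappa_i^*}\, \xi_{i-}^* + \sqrt{1 - \kappa_i^*}\, \xi_{i+}^*$.
Step 2. Let $\Delta_i^n X_j^* \equiv \Delta_i^n X_j 1_{\{|\Delta_i^n X_j| > u_{j,n}\}}$ for $j \in \{1, 2\}$ and set
$$\xi_P^* = \Delta_n^{-1/2} \sum_{i \in \mathcal{I}} \left( p\left(\Delta_i^n X_1^* + \Delta_n^{1/2} \eta_{i,1}^*\right) p\left(\Delta_i^n X_2^* + \Delta_n^{1/2} \eta_{i,2}^*\right) - p\left(\Delta_i^n X_1^*\right) p\left(\Delta_i^n X_2^*\right) \right),$$
$$\xi_N^* = \Delta_n^{-1/2} \sum_{i \in \mathcal{I}} \left( n\left(\Delta_i^n X_1^* + \Delta_n^{1/2} \eta_{i,1}^*\right) n\left(\Delta_i^n X_2^* + \Delta_n^{1/2} \eta_{i,2}^*\right) - n\left(\Delta_i^n X_1^*\right) n\left(\Delta_i^n X_2^*\right) \right).$$
Step 3. Repeat steps 1–2 many times. Compute the $1 - \alpha$ (resp. $\alpha$) quantile of $\xi_P^* - \xi_N^*$
as the critical value of $\Delta_n^{-1/2} (\hat{P}_{12}^{\dagger} - \hat{N}_{12}^{\dagger})$ for the null hypothesis $P_{12}^{\dagger} = N_{12}^{\dagger}$ in favor of the
one-sided alternative $P_{12}^{\dagger} > N_{12}^{\dagger}$ (resp. $P_{12}^{\dagger} < N_{12}^{\dagger}$) at significance level $\alpha$. □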
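The three steps of Algorithm 1 can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `jcsd_bootstrap` and its arguments are hypothetical, the detected jump returns, thresholds, and spot covariance estimates are assumed to have been computed beforehand, and a plain Gaussian draw with the estimated spot covariance stands in for the mixed normal variables.

```python
import numpy as np

def p(x):
    return np.maximum(x, 0.0)

def n(x):
    return np.minimum(x, 0.0)

def jcsd_bootstrap(jump_returns, thresholds, c_minus, c_plus, delta_n,
                   n_boot=999, alpha=0.05, seed=0):
    """Simulated (1 - alpha) and alpha quantiles of xi_P* - xi_N*.

    jump_returns    : (m, 2) returns at the detected jump indices.
    thresholds      : (2,) or (m, 2) truncation levels at those indices.
    c_minus, c_plus : sequences of m (2, 2) spot covariance estimates.
    """
    rng = np.random.default_rng(seed)
    r = np.asarray(jump_returns, dtype=float).reshape(-1, 2)
    u = np.broadcast_to(np.asarray(thresholds, dtype=float), r.shape)
    # Step 2: shrink elements that did not exceed their own threshold to zero.
    r_star = np.where(np.abs(r) > u, r, 0.0)
    base_p = p(r_star[:, 0]) * p(r_star[:, 1])
    base_n = n(r_star[:, 0]) * n(r_star[:, 1])
    m, zero = r.shape[0], np.zeros(2)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        # Step 1: uniform mixing weights and Gaussian draws around each jump.
        kappa = rng.uniform(size=(m, 1))
        xi_minus = np.array([rng.multivariate_normal(zero, c) for c in c_minus]).reshape(m, 2)
        xi_plus = np.array([rng.multivariate_normal(zero, c) for c in c_plus]).reshape(m, 2)
        eta = np.sqrt(kappa) * xi_minus + np.sqrt(1.0 - kappa) * xi_plus
        shifted = r_star + np.sqrt(delta_n) * eta
        xi_p = np.sum(p(shifted[:, 0]) * p(shifted[:, 1]) - base_p)
        xi_n = np.sum(n(shifted[:, 0]) * n(shifted[:, 1]) - base_n)
        draws[b] = (xi_p - xi_n) / np.sqrt(delta_n)
    # Step 3: critical values for the two one-sided alternatives.
    return np.quantile(draws, 1.0 - alpha), np.quantile(draws, alpha)

# Toy example: one positive co-jump with identity spot covariances.
upper, lower = jcsd_bootstrap([[0.02, 0.015]], [0.005, 0.005],
                              [np.eye(2)], [np.eye(2)], delta_n=1e-4,
                              n_boot=500, seed=1)
```

The observed statistic $\Delta_n^{-1/2} (\hat{P}_{12}^{\dagger} - \hat{N}_{12}^{\dagger})$ would then be compared with `upper` (alternative $P_{12}^{\dagger} > N_{12}^{\dagger}$) or `lower` (alternative $P_{12}^{\dagger} < N_{12}^{\dagger}$).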
Algorithm 1 may be seen as a parametric bootstrap that exploits the approximate
(parametric) doubly-mixed Gaussian distribution of the detected jump returns given the
estimated spot covariances, with $\xi_P^*$ and $\xi_N^*$ being the bootstrap analogues of the original
normalized estimators. While this type of simulation-based inference is often used in
the study of jumps, a non-standard feature of Algorithm 1 is its use of the truncated
return $\Delta_i^n X_j^* = \Delta_i^n X_j 1_{\{|\Delta_i^n X_j| > u_{j,n}\}}$, which shrinks the detected diffusive returns to zero.
This shrinkage is needed in situations where exactly one asset jumps at time $\tau$, so that
the sampling variability contributed by the other (no-jump) asset, say $j$, is given by a
half-truncated doubly-mixed Gaussian variable like $p(\eta_{\tau,j})$. This distribution may in turn
be mimicked by $\Delta_n^{-1/2} p(\Delta_i^n X_j^* + \Delta_n^{1/2} \eta_{i,j}^*) = p(\eta_{i,j}^*)$, which differs from the “un-shrunk”
variable $\Delta_n^{-1/2} p(\Delta_i^n X_j + \Delta_n^{1/2} \eta_{i,j}^*)$.
Proposition 2 Under Assumption 2 in the appendix, the conditional distribution of
$\xi_P^* - \xi_N^*$ given the data converges in probability to the $\mathcal{F}$-conditional distribution of $\xi_P - \xi_N$
under the uniform metric. Consequently, the test described in Algorithm 1 has asymptotic
level $\alpha$ under the null $\{P_{12}^{\dagger} = N_{12}^{\dagger}\}$ and asymptotic power of one under one-sided
alternatives.
We turn next to the conduct of feasible inference using the DCSD statistic $\hat{P}_{12}^{\star} - \hat{N}_{12}^{\star}$.
This involves some additional non-standard theoretical subtlety. Proposition 1 implies
that
$$\Delta_n^{-1/2} \left( \hat{P}_{12}^{\star} - \hat{N}_{12}^{\star} \right) - 2B - 2L \overset{\mathcal{L}\text{-}s}{\longrightarrow} 2\zeta + \zeta_P - \zeta_N, \qquad (2.14)$$
where we recall that $B$ and $L$ capture co-drift and dynamic leverage effects, respectively.
The first-order limiting variables $P_{12}^{\star}$ and $N_{12}^{\star}$ exactly cancel with each other in
the $\hat{P}_{12}^{\star} - \hat{N}_{12}^{\star}$ difference. Consequently, as revealed by the convergence in (2.14), the
remaining “signal” carried by the DCSD is given by the higher-order term $2B + 2L$, which
is comparable in magnitude with the statistical noise term $2\zeta + \zeta_P - \zeta_N$ (defined as a
local martingale). Since the signal-to-noise ratio does not diverge to infinity even in large
samples, the resulting test is generally not consistent.
A further non-standard complication related to (2.14) stems from the fact that the
limiting variable $2\zeta + \zeta_P - \zeta_N$ is generally not mixed Gaussian. Specifically, while $\zeta_P - \zeta_N$
is $\mathcal{F}$-conditional Gaussian, the remaining part
$$2\zeta = 2 \int_0^T \left( c_s^{-1} \gamma_s \right)^{\top} \left( \sigma_s\, dW_s \right)$$
is generally not mixed Gaussian unless, of course, the stochastic volatility $\sigma$ is independent
of the Brownian motion $W$ that drives the diffusive price moves.
Although these non-standard features of the limit theory prevent us from conducting
formal tests in the most general setting, it is nevertheless possible to assess the sampling
variability of $\hat{P}_{12}^{\star} - \hat{N}_{12}^{\star}$ in a well-defined way. Indeed, it follows from (2.8) and (2.11)
that the quadratic variation of the continuous local martingale $2\zeta + \zeta_P - \zeta_N$ is given by
$$\Sigma^{\star} \equiv 2 \int_0^T v_{1,s}^2 v_{2,s}^2 \Psi(\rho_s)\, ds. \qquad (2.15)$$
Therefore, $\sqrt{\Sigma^{\star}}$ may be naturally used as the standard error for gauging the sampling
variability of the centered variable $\Delta_n^{-1/2} (\hat{P}_{12}^{\star} - \hat{N}_{12}^{\star}) - (2B + 2L)$. The $\Sigma^{\star}$ variable is defined
as an integrated functional of the spot covariance matrix and it may be consistently
estimated using a nonparametric “plug-in” estimator $\hat{\Sigma}^{\star}$ (see, e.g., Li, Todorov, and
Tauchen (2017a)).
Comparing $\hat{P}_{12}^{\star} - \hat{N}_{12}^{\star}$ with its standard error provides an econometrically disciplined
approach for detecting “large” differences in positive and negative concordant semicovariances,
as summarized in the following algorithm. We refer to it cautiously as a detection
rule instead of a test; it may be formally interpreted as a test under additional conditions,
as detailed below.
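Algorithm 2 (DCSD Detection).
Step 1. Define $\hat{v}_{1,i}$, $\hat{v}_{2,i}$ and $\hat{\rho}_i$ implicitly by decomposing the spot covariance matrix
estimator $\hat{c}_{i+}$ as
$$\hat{c}_{i+} = \begin{pmatrix} \hat{v}_{1,i}^2 & \hat{\rho}_i \hat{v}_{1,i} \hat{v}_{2,i} \\ \hat{\rho}_i \hat{v}_{1,i} \hat{v}_{2,i} & \hat{v}_{2,i}^2 \end{pmatrix},$$
and set
$$\hat{\Sigma}^{\star} \equiv \frac{2\Delta_n}{[T/\Delta_n] - k_n + 1} \sum_{i=0}^{[T/\Delta_n] - k_n} \hat{v}_{1,i}^2 \hat{v}_{2,i}^2 \Psi(\hat{\rho}_i). \qquad (2.16)$$
Step 2. Use the $1 - \alpha$ quantile of a standard normal distribution $z_\alpha$ as the critical value for
the t-statistic $\Delta_n^{-1/2} (\hat{P}_{12}^{\star} - \hat{N}_{12}^{\star}) / \sqrt{\hat{\Sigma}^{\star}}$ for one-sided detection of deviations from $B + L = 0$.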
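A possible way to organize the detection rule in code is sketched below. Several pieces are assumptions of this example: the function and argument names are hypothetical, the function $\Psi$ from (2.8) and (2.11) must be supplied by the user as the callable `psi_fn`, and the integral in (2.15) is approximated by $2T$ times the sample average of the plug-in terms, which may differ from the exact normalization in (2.16).

```python
import numpy as np
from statistics import NormalDist

def dcsd_detection(p_diff_12, n_diff_12, spot_covs, psi_fn, delta_n,
                   horizon=1.0, alpha=0.05):
    """One-sided DCSD detection based on a plug-in standard error.

    p_diff_12, n_diff_12 : truncated (diffusive) semicovariance estimates.
    spot_covs            : (m, 2, 2) array of spot covariance estimates.
    psi_fn               : user-supplied callable for Psi(rho) (assumed given).
    """
    c = np.asarray(spot_covs, dtype=float)
    v1_sq, v2_sq = c[:, 0, 0], c[:, 1, 1]
    # Spot correlations implied by each estimated covariance matrix.
    rho = np.clip(c[:, 0, 1] / np.sqrt(v1_sq * v2_sq), -1.0, 1.0)
    # Riemann-sum style approximation of 2 * integral of v1^2 v2^2 Psi(rho).
    sigma_hat = 2.0 * horizon * np.mean(v1_sq * v2_sq * psi_fn(rho))
    t_stat = (p_diff_12 - n_diff_12) / np.sqrt(delta_n * sigma_hat)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    return t_stat, z_alpha, bool(t_stat > z_alpha)

# Toy example with a placeholder Psi that is identically one (illustration only).
covs = np.array([[[1.0, 0.5], [0.5, 1.0]]] * 3)
t_stat, z_alpha, detected = dcsd_detection(0.6, 0.1, covs,
                                           lambda rho: np.ones_like(rho),
                                           delta_n=0.01)
```

The returned t-statistic is best read as the signal-to-noise measure discussed in the text; a negative one-sided rule would compare it with `-z_alpha` instead.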
The DCSD detection rule described in Algorithm 2 should be interpreted carefully in
empirical work. Due to the aforementioned lack of mixed Gaussianity for the limiting
variable under the most general conditions, the t-statistic $\Delta_n^{-1/2} (\hat{P}_{12}^{\star} - \hat{N}_{12}^{\star}) / \sqrt{\hat{\Sigma}^{\star}}$ is
generally not asymptotically standard normally distributed. For this reason, the asymptotic
level of the proposed one-sided test is not guaranteed to be $\alpha$. Instead, the t-statistic
may be interpreted as a well-defined “signal-to-noise” measure, as opposed to a formal
statistical test. These asymptotic results are also corroborated by the simulation results
pertaining to the finite-sample properties of the JCSD test and DCSD detection rule
presented in Section S2 of the Supplemental Appendix.
In light of these results, mixed Gaussianity can be formally restored under the additional
assumption that the volatility process $\sigma$ is independent of the Brownian motion $W$,
which in turn allows us to analyze the size and power properties of the DCSD detection
rule as a formal test. Note that in this “no-leverage” case with $\sigma$ being independent of
$W$, without loss of generality we can fix $\tilde{\sigma} \equiv 0$ in the volatility model in (2.3), which
further implies that $L \equiv 0$ (recall equation (2.7)). Hence, the statistic $\Delta_n^{-1/2} (\hat{P}_{12}^{\star} - \hat{N}_{12}^{\star})$
is centered at $2B$, and the signal-to-noise ratio equals $2B / \sqrt{\Sigma^{\star}}$.
Formally, the null hypothesis for DCSD is comprised of the following collection of
sample paths:
$$\Omega_0 \equiv \{ \omega \in \Omega : B(\omega) = 0 \}.$$
We remind the reader that in non-ergodic high-frequency settings, it is standard to consider
hypotheses as events consisting of sample paths of interest (see, e.g., Aït-Sahalia
and Jacod (2014) and the many references therein). Correspondingly, to state a one-sided
alternative hypothesis, we fix a constant $R > 0$, and consider:
$$\Omega_a \equiv \left\{ \omega \in \Omega : \frac{2B(\omega)}{\sqrt{\Sigma^{\star}(\omega)}} \ge R \right\}.$$
Intuitively, $R$ measures the “distance” between the null and the alternative by drawing
a lower bound for the (random) signal-to-noise ratio $2B / \sqrt{\Sigma^{\star}}$, which, not surprisingly,
determines the power of the test. Proposition 3, below, characterizes the size and power
properties of the resulting DCSD test described in Algorithm 2, where we use $\Phi(\cdot)$ to
denote the cumulative distribution function of the standard normal distribution.
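Proposition 3 Suppose that (i) Assumption 2 in the appendix holds, and (ii) the $(b_t)$
and $(\sigma_t)$ processes are independent of $W$, and $\tilde{\sigma}_t \equiv 0$. Then, the critical region
$C_n^{+} \equiv \{ \Delta_n^{-1/2} (\hat{P}_{12}^{\star} - \hat{N}_{12}^{\star}) / \sqrt{\hat{\Sigma}^{\star}} > z_\alpha \}$ associated with the (positive) one-sided DCSD test has
asymptotic level $\alpha$ in restriction to $\Omega_0$, that is
$$\lim_{n \to \infty} \mathbb{P}\left( C_n^{+} \mid \Omega_0 \right) = \alpha.$$
Moreover, for any $R > 0$,
$$\liminf_{n \to \infty} \mathbb{P}\left( C_n^{+} \mid \Omega_a \right) \ge 1 - \Phi(z_\alpha - R) > \alpha.$$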
Proposition 3 shows that under the additional “no-leverage” assumption the one-sided
DCSD test controls size under the null with “no co-drift” (i.e., $B = 0$), and is asymptotically
unbiased with strictly non-trivial power under the alternative. The asymptotic
power is higher when the signal-to-noise ratio $2B / \sqrt{\Sigma^{\star}}$ is higher, and the power is at least
50% when $R = z_\alpha$. Also, even though the proposition focuses on a positive one-sided
test, these same results readily extend to a negative one-sided test, or two-sided tests.
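The lower bound $1 - \Phi(z_\alpha - R)$ from Proposition 3 is easy to evaluate numerically. The short Python sketch below (the function name is hypothetical) uses only the standard library and illustrates the statement that the bound equals 50% when $R = z_\alpha$.

```python
from statistics import NormalDist

def dcsd_power_lower_bound(R, alpha=0.05):
    """Asymptotic power lower bound 1 - Phi(z_alpha - R) for a given R."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha)
    return 1.0 - std_normal.cdf(z_alpha - R)

z_05 = NormalDist().inv_cdf(0.95)
bounds = [dcsd_power_lower_bound(R) for R in (0.5, 1.0, z_05, 3.0)]
```

The bound approaches the nominal level as $R$ shrinks toward zero and increases monotonically in $R$.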
2.4. Discussion of prior related studies
Putting our results further into perspective, Theorem 2 naturally extends Barndorff-
Nielsen, Kinnebrock, and Shephard’s (2010) asymptotic theory (Proposition 2) from the
univariate setting and realized semivariances to a multivariate setting and realized semi-
covariances. However, the results in Barndorff-Nielsen, Kinnebrock, and Shephard (2010)
rely on the theory of Kinnebrock and Podolskij (2008), and hence rule out the presence
of price or volatility jumps. Jacod and Protter (2012) provide a more general result al-
lowing for volatility jumps (see Theorem 5.3.5). By comparison, we allow for both price
and volatility jumps. Price jumps in turn result in an interesting limiting distribution
that involves truncated doubly-mixed Gaussian variables (i.e., p(ητ ) and n(ητ )), which, to
the best of our knowledge, are new to the literature on jump-related inference (see, e.g.,
Aït-Sahalia and Jacod (2009), Jacod and Todorov (2009), and Chapter 14 of Aït-Sahalia
and Jacod (2014)).
The higher-order bias terms that appear in the second-order asymptotics arise from
the fact that the realized semicovariances are not globally even transformations of the
high-frequency returns. Results in Kinnebrock and Podolskij (2008), Barndorff-Nielsen,
Kinnebrock, and Shephard (2010), and Chapter 5 of Jacod and Protter (2012) share that
same feature. In the probability literature, a closely related limit theorem is also proved
by Jacod (1997) and reported as Theorem IX.7.3 in Jacod and Shiryaev (2003). Another
interesting example of this type of higher-order bias terms is provided by Li, Mykland,
Renault, Zhang, and Zheng (2014), in their analysis of realized tricity (see their Theorem
2) and a test for endogenous sampling times.8
8 The (realized) tricity estimator is defined as the sum of cubic powers of high-frequency returns.
Theorem 2 of Li, Mykland, Renault, Zhang, and Zheng (2014) establishes a non-central limit theorem
for this estimator allowing for random sampling. In the special case with regular sampling (corresponding
to $h_t = 1$ in the notation in that paper), the bias term has the form (using the notation in the present
paper) $3 \int_0^T \sigma_s^2 b_s\, ds + 3 \int_0^T \sigma_s^3\, dW_s + (3/2) \langle \sigma^2, X \rangle_t$, where $\langle \sigma^2, X \rangle_t$ denotes the quadratic variation between
$\sigma^2$ and $X$, corresponding to a leverage type effect. These three components play similar roles to the $B$,
$\zeta$, and $L$ terms in Theorem 2, respectively. It may be interesting to further generalize the results in the
present paper to a setting with random sampling using the technique of Li, Mykland, Renault, Zhang,
and Zheng (2014).
The feasible inference developed in Section 2.3 also further sets our analysis apart
from that of Barndorff-Nielsen, Kinnebrock, and Shephard (2010). In the absence of price
jumps, in particular, our DCSD inference about the diffusive component involves a nonlinear
transform of the spot covariance matrix, $\Sigma^{\star} \equiv 2 \int_0^T v_{1,s}^2 v_{2,s}^2 \Psi(\rho_s)\, ds$. Correspondingly,
our inference relies on the estimator for general integrated volatility functionals recently
developed by Li and Xiu (2016) and Li, Todorov, and Tauchen (2017a); see also Renò
(2008) and Kristensen (2010) for earlier results on spot volatility estimation. By contrast,
the feasible inference of Barndorff-Nielsen, Kinnebrock, and Shephard (2010), available
under more restrictive conditions, only requires the estimation of integrated quarticity.
In further contrast to Barndorff-Nielsen, Kinnebrock, and Shephard (2010), who do
not consider jumps, our JCSD test is explicitly geared toward price jumps. By focusing on
the differential between positive and negative directional co-jumps, our test also serves
a different empirical purpose than other tests for the presence of jumps and co-jumps
(see, e.g., Barndorff-Nielsen and Shephard (2006), Aït-Sahalia and Jacod (2009), Jacod
and Todorov (2009), and Caporin, Kolokolov, and Renò (2017)). Our feasible inference
described in Algorithm 1 may be seen as a parametric bootstrap, and as such is naturally
related to the high-frequency bootstrap methods developed by Gonçalves and Meddahi
(2009) and Dovonon, Gonçalves, Hounyo, and Meddahi (2019), among others. However,
while the latter bootstrap methods are designed to mimic the sampling variability of
aggregated Brownian shocks, we aim to recover that of the truncated jump returns.
3. Empirical semicovariance tests
We begin our empirical investigations by looking at the realized semicovariances for
the 30 Dow Jones Industrial Average (DJIA) constituent stocks as of the last day of our
sample period. We use one-minute returns, obtained from the Trades and Quotes (TAQ)
database, spanning the period from January 2006 to December 2014, a total of 2,265
trading days. Our choice of a one-minute sampling frequency and recent sample period is
dictated by the need to reliably estimate the spot covariance matrix, which is used in the
tests described in the previous section.9 To more clearly pinpoint within-day market-wide
price moves and economic events, we exclude the returns for the first half-hour of the
trading day in the calculation of the statistics.
We calculate the JCSD and DCSD semicovariance-based tests for all of the 435 unique
stock-pairs and 2,265 days in the sample, resulting in close to one million test statistics
9 The choice of a one-minute sampling frequency mirrors that of Li, Todorov, Tauchen, and Chen
(2017) in their estimation of spot covariances. It is also supported by the “signature plots” in Section
S8 of the Supplemental Appendix. For additional discussion of market microstructure effects, see Zhang,
Mykland, and Aït-Sahalia (2005), Hansen and Lunde (2006), Barndorff-Nielsen, Hansen, Lunde, and
Shephard (2008), and Jacod, Li, and Zheng (2017).
for each of the two tests.10 We use one-sided tests at the 5% significance level. The
average rejection rates for the JCSD tests far exceed the nominal level, rejecting in favor
of positive (negative) co-jumps for about 30% (32%) of the pairs. The DCSD algorithm
detects significantly positive (negative) differences for 13% (10%) of all stock pairs.11
To shed additional light on these test results, Table 1 lists the days with the most
rejections for each of the tests for each of the nine years in the sample, along with a short
description of the most important economic events that occurred on these days. Panel A
shows that all but one of the days with the most rejections for the JCSD test are associated
with FOMC statements or changes in the federal funds rate. This finding is consistent
with the prior literature that links jumps in asset prices with public news announcements
(e.g., Andersen, Bollerslev, and Diebold (2007), Lee and Mykland (2008), Lee (2012),
and Caporin, Kolokolov, and Renò (2017)). It is also in line with the literature on testing
for co-jumps and the finding that those jumps are associated with economy-wide news
(e.g., Bollerslev, Law, and Tauchen (2008) and Lahaye, Laurent, and Neely (2011)).12
In contrast to the “sharp” economic events associated with the JCSD test, Panel B
shows that the days with the most rejections for the DCSD test are typically associated
with “softer,” more difficult-to-interpret information. Kyle (1985)-type models can be
used to establish a more formal economic link between the “soft” information and price
drift (which drives the DCSD test through the co-drift effect). In these models, informed
agents trade strategically with liquidity traders to maximize their profit, and they do
so patiently in order to manage the market maker’s belief. As shown by Back (1992),
informed traders’ optimal order flow is smooth (i.e., differentiable) in time, which in turn
determines the drift of the equilibrium price. In a setting with stochastic liquidity, Collin-
Dufresne and Fos (2016) further show that, in equilibrium, the price drift exhibits mean
reversion towards the asset’s true value, and mean reversion is strong (weak) when the
short-term liquidity is high (low) relative to the long-term liquidity. This theory suggests
that, ceteris paribus, price drift is greater in magnitude when there is a greater degree of
mispricing, or when the revelation of the private information is imminent.
To illustrate the distinct price dynamics on the “sharp” and “soft” news event days
identified by the JCSD and DCSD tests, Figure 3 plots cumulative returns for each of
the 30 DJIA stocks for two days from Table 1: September 18, 2013 and February 25,
10 We rely on the dynamic threshold advocated by Bollerslev and Todorov (2011a,b) based on three
times the trailing bipower variation, as originally defined by Barndorff-Nielsen and Shephard (2004b,
2006), adjusted for the intraday periodicity in the volatility. We set the local window $k_n = 45$.
11 We do not intend to make a formal statistical statement jointly across all pairs. Instead, we view
the rejection frequencies as simple summary statistics of the pairwise test results.
12 A detailed comparison of the dates in Table 1 with those reported by all of the studies cited here is
beyond the scope of this paper. However, we note that some of the dates overlap with those in Caporin,
Kolokolov, and Renò (2017), for example, while others do not, as the different focus of the tests naturally
leads to some variation in the days with the most significant jumps.
Table 1: Top Rejection Days by Year

Year  Date          Direction  %    Headline Event

Panel A. JCSD Test
2006  June 29       +          100  Fed raises short-term rate by a quarter-percentage point.
2007  September 18  +          99   Fed cuts short-term rate by a half-percentage point.
2008  December 16   +          100  Fed cuts short-term rate by a quarter-percentage point.
2009  March 18      +          100  Fed announces it will buy up to $300 billion in long-term Treasuries.
2010  August 10     +          98   Fed announces it will continue Quantitative Easing.
2011  September 22  +          100  Fed announces Operation Twist.
2012  September 13  +          100  Fed announces it will continue buying Mortgage Backed Securities.
2013  September 18  +          98   Fed announces it will sustain the asset buying program.
2014  August 5      −          97   Russian troops are reported lining on the borders of Ukraine.

Panel B. DCSD Test
2006  July 19       +          57   Bernanke explains to the Senate Banking Committee how the Fed sees the economic slowdown.
2007  August 29     +          58   Bernanke writes letter to senator that Fed is monitoring and ready to step in if necessary.
2008  January 2     −          67   Markets react to poor manufacturing, housing and credit news.
2009  March 23      +          90   Obama administration announces its plan to buy $1 trillion in bad bank assets.
2010  July 7        +          76   EU reveals its first list of stress test banks.
2011  June 1        −          83   Moody’s cuts Greece’s bond rating by three notches.
2012  June 21       −          84   Rumors of Moody’s downgrade for global banks.
2013  February 25   −          83   Political uncertainty surrounding Italian elections.
2014  February 3    −          85   Janet Yellen sworn in as the new Fed chair.

Note: The table reports the top rejection dates by year for the semicovariance-based tests
across all 435 DJIA stock pairs. The first column gives the date, the second gives the direction
in which the rejections occurred, and the third provides the fraction of pairs of stocks for
which the test rejects at the 5% level in that direction. The final column summarizes headline
economic news events for the different days. Panel A reports the results for the jump CSD
test, while Panel B is based on the diffusive CSD test.
2013.13 (These are also the dates presented in Figure 2.) For the event detected by the
JCSD test (left panel), all stocks experienced a large positive price change (of just over
1% on average) at 2pm, following the release of the FOMC meeting statement which
13 Section S3 in the Supplemental Appendix presents plots for all of the dates listed in Table 1.
Figure 3: DJIA Cumulative Returns on Representative Event Days
[Figure: two panels, September 18, 2013 (left) and February 25, 2013 (right); horizontal axis: Time (10:00 to 16:00); vertical axis: Cumulative Return.]
Note: The figure plots the cumulative return of the 30 Dow Jones Industrial Average
stocks on two of the event dates associated with market-wide jump CSD and diffusive
CSD events from Table 1.
declared that the Fed would sustain its asset-purchasing program. By contrast, for the
event detected by the DCSD test (right panel), we observe slow and steadily decreasing
price paths throughout the day for all of the stocks. The total daily return is large, with
the median daily return of around negative 2%, but no “extreme” one-minute returns
occurred for any of the stocks during the course of that day.
The outcome of the DCSD or JCSD tests and the differences in the within-day price
dynamics further translate into different dynamic dependencies in the semicovariance
components across days.14 The total realized covariance C, in particular, generally appears
more persistent following days with diffusive events, as detected by the DCSD algorithm.
On the other hand, the persistence of P increases primarily following positive DCSD
events, while the persistence of N is mostly higher following negative DCSD events,
again consistent with the idea that certain types of “soft” news are processed only slowly
over multiple days. By comparison, only negative JCSD jump events appear to affect the
average persistence of either component.
4. Forecasting with realized semicovariances
The results discussed in the previous section highlight the additional information and
economic insights afforded by the realized semicovariances beyond those from standard
realized covariances. The results also point to the existence of different dynamic depen-
dencies conditional on different days. In this section we further explore these empirical
14 A summary table with the correlations conditional on different DCSD and JCSD event days is
available in Supplemental Appendix Section S4.
Figure 4: Autocorrelations
[Figure: two panels, Variance Elements (left; series RV, PSV, NSV) and Covariance Elements (right; series C, P, N, M); horizontal axis: Lag (0 to 50); vertical axis: Autocorrelation.]
Note: The graph plots the autocorrelation functions for the different realized semicovariance
elements. All of the estimates are averaged across 1,000 randomly selected
S&P 500 pairs of stocks, and bias-adjusted following the approach of Hansen and Lunde
(2014).
differences from the perspective of forecasting future variances and covariances.
To allow for the construction of larger-dimensional portfolios, we expand our previous
sample of 30 DJIA stocks to include all of the S&P 500 constituent stocks, and we consider
a longer sample period, from January 1993 to December 2014, a total of 5,541 trading
days. In order to reliably estimate models for covariances and semicovariances, we include
only stocks with at least 2,000 daily observations, resulting in a total of 749 stocks. Most
of these stocks are not as actively traded as the DJIA stocks, especially during the earlier
part of the sample. Correspondingly, since we only require consistent estimates for this
part of our analysis, we rely on a coarser 15-minute sampling scheme to construct the
realized measures. Finally, similar to most existing work on volatility forecasting (e.g.,
Hansen, Huang, and Shek (2012), and Noureldin, Shephard, and Sheppard (2012)), we
focus on the intra-daily period excluding the overnight returns.15
4.1. Vector autoregressions for realized semicovariances
Figure 1 discussed in the introduction already points to the existence of different dy-
namic dependencies in the average combined concordant and discordant semicovariance
components. To more directly highlight these differences, Figure 4 plots the lag 1 through
50 autocorrelations for each of the individual realized semicovariance components aver-
aged across 1,000 randomly selected S&P 500 pairs of stocks.16 While the autocorrelations
for the realized variances (RV) and the positive and negative realized semivariances (PSV
and NSV) shown in the left panel are almost indistinguishable, there is a clear ordering in
15 Empirical results that include the overnight returns are presented in Supplemental Appendix S6.3.
All of our main empirical findings remain qualitatively unaltered.
16 We rely on the estimator of Hansen and Lunde (2014) to account for measurement errors in the
realized measures. Section S5 of the Supplemental Appendix provides the corresponding unadjusted
autocorrelation functions.
the rate of decay of the autocorrelations for the realized semicovariance elements shown
in the right panel. Most noticeably, the autocorrelations for the (total) covariances C
are systematically below those for the three realized semicovariance elements, with the
mixed M component exhibiting the highest overall persistence. The fact that almost all
of our pairs of stocks are positively correlated means that few co-jumps will appear in
M , and so by the asymptotic theory in Section 2, M is mostly comprised of diffusive
covariation, while P and N contain both diffusive and co-jump components. Previous
work (e.g., Maheu and McCurdy (2004), and Andersen, Bollerslev, and Diebold (2007))
has found that the diffusive component of volatility is generally more persistent than
the jump component, and our finding that M appears more persistent than P and N is
consistent with those findings.
These differences also naturally suggest that more accurate volatility and covariance
forecasts may be obtained by separately modeling the realized semicovariance components
that make up the realized covariance. To more directly investigate this, we estimate a
vector version of the popular HAR model of Corsi (2009), in which each of the elements
in the realized semicovariance matrix is allowed to depend on its own daily, weekly,
and monthly lags, as well as the lags of the other realized semicovariance components.17
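Specifically, for each pair of assets $(j, k)$ we estimate the following three-dimensional
vector autoregression:
$$\begin{pmatrix} \hat{P}_{jk,t} \\ \hat{N}_{jk,t} \\ \hat{M}_{jk,t} \end{pmatrix} = \begin{pmatrix} \phi_{jk,P} \\ \phi_{jk,N} \\ \phi_{jk,M} \end{pmatrix} + \Phi_{jk,Day} \begin{pmatrix} \hat{P}_{jk,t-1} \\ \hat{N}_{jk,t-1} \\ \hat{M}_{jk,t-1} \end{pmatrix} + \Phi_{jk,Week} \begin{pmatrix} \hat{P}_{jk,t-2:t-5} \\ \hat{N}_{jk,t-2:t-5} \\ \hat{M}_{jk,t-2:t-5} \end{pmatrix} + \Phi_{jk,Month} \begin{pmatrix} \hat{P}_{jk,t-6:t-22} \\ \hat{N}_{jk,t-6:t-22} \\ \hat{M}_{jk,t-6:t-22} \end{pmatrix} + \begin{pmatrix} \epsilon_{jk,t}^{P} \\ \epsilon_{jk,t}^{N} \\ \epsilon_{jk,t}^{M} \end{pmatrix}, \qquad (4.1)$$
where $\hat{P}_{t-l:t-k} \equiv \frac{1}{k-l+1} \sum_{s=l}^{k} \hat{P}_{t-s}$, with the other components defined analogously.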
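To make the lag structure in (4.1) concrete, the following Python sketch builds the daily, weekly, and monthly regressors and fits the system by equation-by-equation least squares. It is a minimal illustration: the function names are hypothetical, plain OLS via `numpy.linalg.lstsq` is an assumption of this example, and the data are simulated placeholders rather than realized measures.

```python
import numpy as np

def lag_average(x, t, l, k):
    """Average of x[t-l], ..., x[t-k] for 1 <= l <= k."""
    return x[t - k : t - l + 1].mean(axis=0)

def har_regressors(y):
    """Rows (1, daily lag, weekly average, monthly average) for t = 22, ..., T-1."""
    y = np.asarray(y, dtype=float)
    rows = [np.concatenate(([1.0], y[t - 1],
                            lag_average(y, t, 2, 5),
                            lag_average(y, t, 6, 22)))
            for t in range(22, len(y))]
    return np.array(rows)

def fit_ols(X, targets):
    """Least-squares coefficients; one column per dependent variable."""
    coef, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return coef

# Simulated placeholder series for (P, N, M); not real data.
rng = np.random.default_rng(0)
pnm = rng.random((200, 3))
X = har_regressors(pnm)
coef_pnm = fit_ols(X, pnm[22:])            # shape (10, 3): intercept plus 9 lag terms
coef_c = fit_ols(X, pnm[22:].sum(axis=1))  # same regressors, dependent variable C
```

Because least squares is linear in the dependent variable, the coefficients for C = P + N + M on the same regressors equal the row sums of the three component equations.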
The first three columns of Table 2 report the resulting parameter estimates aver-
aged across 500 randomly selected (j, k) pairs of stocks. The table reveals a clear block
structure in the coefficients of this general specification. Most notably, the dynamic de-
pendencies in P and N are almost exclusively driven by the lagged N terms, while the
dynamic behavior of the mixed M elements is primarily determined by their own lags,
with the monthly lag receiving the largest weight.
The last two columns of Table 2 report the parameter estimates from regressing the re-
alized covariances C on the lagged realized semicovariances and the lagged covariances.18
17 Building on the new ideas and theoretical results first presented here, Bollerslev, Patton, and Quaedvlieg
(2020) have recently shown how the realized semicovariances may similarly be used in the construction
of improved multivariate realized GARCH-type models.
18 Note that due to the linear nature of the HAR model and the fact that realized semicovariances
Table 2: Semicovariance HAR Estimates

                    P_{jk,t}   N_{jk,t}   M_{jk,t}   C_{jk,t}   C_{jk,t}
P_{jk,t-1}           0.038**    0.050**   -0.035**    0.052**
P_{jk,t-2:t-5}       0.004**    0.057**   -0.002**    0.059**
P_{jk,t-6:t-22}     -0.074**    0.023**    0.099**    0.048**
N_{jk,t-1}           0.248**    0.192**   -0.096**    0.344**
N_{jk,t-2:t-5}       0.312**    0.250**   -0.090**    0.472**
N_{jk,t-6:t-22}      0.349**    0.206**   -0.021**    0.534**
M_{jk,t-1}          -0.075**   -0.072**    0.141**   -0.006**
M_{jk,t-2:t-5}      -0.044**   -0.049**    0.209**    0.116**
M_{jk,t-6:t-22}      0.028**   -0.020**    0.409**    0.417**
C_{jk,t-1}                                                       0.184**
C_{jk,t-2:t-5}                                                   0.305**
C_{jk,t-6:t-22}                                                  0.304**
R^2                  0.397**    0.376**    0.354**    0.313**    0.284**
R^2_adj              0.395**    0.374**    0.352**    0.311**    0.283**

Note: The table reports the average parameter estimates for the vector HAR model in
(4.1) averaged across 500 randomly selected pairs of stocks. The first three columns
report results for the unrestricted models. The fourth column reports the estimates
from a model that restricts the rows of $\Phi_{jk,Day}$, $\Phi_{jk,Week}$ and $\Phi_{jk,Month}$ to be the same,
corresponding to a model for $C_{jk,t}$, whilst the final column reports the results of a
standard HAR model on $C_{jk,t}$. ** and * signify that the estimates for that coefficient
are significant at the 5% level for 75% and 50% of the randomly selected pairs of stocks.
The model with individual semicovariances clearly reveals the most important compo-
nents: the three lags of N and the monthly lag of M constitute the main drivers of
the realized covariance C. Interestingly, the models based on the semicovariances also
put a greater weight on more recent information compared to the standard HAR model
reported in the last column: normalizing each of the explanatory variables by their sam-
ple means, the semicovariance-based HAR models effectively put a weight of 0.339 on
lagged daily information, while the final column shows that a standard HAR model on
average puts a weight of only 0.184 on the daily lag, implying a more muted reaction to
new information. These differences are naturally associated with an improved fit of the
semicovariance-based models, as shown by the R2 values. In the next section we inves-
tigate whether this improved in-sample fit is accompanied by a similar improvement in
out-of-sample forecast performance for models that utilize the realized semicovariances.
sum exactly to the realized covariance, each coefficient in the fourth column is simply the sum of the
corresponding coefficients in the first three columns.
4.2. Portfolio volatility forecasting
The realized variance of a portfolio depends on the realized semicovariances of the
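assets included in the portfolio. In particular, utilizing the result that $\hat{C} = \hat{P} + \hat{N} + \hat{M}$,
the realized variance of a portfolio with portfolio weights $w$ may be expressed as
$$RV^{w} \equiv w^{\top} \hat{C} w = w^{\top} \hat{P} w + w^{\top} \hat{N} w + w^{\top} \hat{M} w \equiv \hat{P}^{w} + \hat{N}^{w} + \hat{M}^{w}, \qquad (4.2)$$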
where we use the superscript w to indicate the relevant (scalar-valued) portfolio quan-
tities. These portfolio semicovariance measures are distinct from the portfolio semivari-
ances of Barndorff-Nielsen, Kinnebrock, and Shephard (2010), which only depend on
the high-frequency returns of the portfolio. The portfolio semicovariances, on the other
hand, depend on the high-frequency returns for all of the individual assets included in
the portfolio, and cannot be computed using only returns on the portfolio itself.
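As a purely illustrative sketch of the decomposition in (4.2), the following code builds the semicovariance matrices from a small matrix of high-frequency returns and checks that the three portfolio components sum to the portfolio realized variance. The function names and the toy returns are hypothetical, and the definitions of P, N and M used here (sums of outer products of the positive and negative parts of the return vectors, with M collecting both mixed-sign terms) are assumed to match those given earlier in the paper.

```python
def semicovariances(returns):
    """Return (P, N, M, C) as d x d nested lists from an n x d list of returns."""
    d = len(returns[0])
    P = [[0.0] * d for _ in range(d)]
    N = [[0.0] * d for _ in range(d)]
    M = [[0.0] * d for _ in range(d)]
    C = [[0.0] * d for _ in range(d)]
    for r in returns:
        pos = [max(x, 0.0) for x in r]  # positive parts p(r)
        neg = [min(x, 0.0) for x in r]  # negative parts n(r)
        for j in range(d):
            for k in range(d):
                P[j][k] += pos[j] * pos[k]
                N[j][k] += neg[j] * neg[k]
                M[j][k] += pos[j] * neg[k] + neg[j] * pos[k]
                C[j][k] += r[j] * r[k]
    return P, N, M, C


def quad_form(w, A):
    """Compute w'Aw for a weight list w and a nested-list matrix A."""
    d = len(w)
    return sum(w[j] * A[j][k] * w[k] for j in range(d) for k in range(d))


# Hypothetical intraday returns for two assets (rows are intervals).
returns = [[0.01, 0.02], [-0.01, 0.03], [-0.02, -0.01], [0.03, -0.02]]
w = [0.5, 0.5]  # equally-weighted portfolio

P, N, M, C = semicovariances(returns)
rv_w = quad_form(w, C)
p_w, n_w, m_w = quad_form(w, P), quad_form(w, N), quad_form(w, M)
```

In this toy example the mixed component is negative while the two concordant components are positive, and computing them requires the individual asset returns rather than the portfolio returns alone.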
To explore whether portfolio semicovariances convey useful information beyond re-
alized variances and semivariances, we extend the HAR model of Corsi (2009) to allow
the forecasts to depend on each of the realized portfolio semicovariance components.
Accordingly, the one-step forecast for the portfolio return variance is constructed from
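\begin{align*}
RV^{w}_{t+1|t} = {} & \phi_0 + \phi_{\mathrm{Day},P} P^{w}_{t} + \phi_{\mathrm{Week},P} P^{w}_{t-1:t-4} + \phi_{\mathrm{Month},P} P^{w}_{t-5:t-21} \\
& + \phi_{\mathrm{Day},N} N^{w}_{t} + \phi_{\mathrm{Week},N} N^{w}_{t-1:t-4} + \phi_{\mathrm{Month},N} N^{w}_{t-5:t-21} \tag{4.3} \\
& + \phi_{\mathrm{Day},M} M^{w}_{t} + \phi_{\mathrm{Week},M} M^{w}_{t-1:t-4} + \phi_{\mathrm{Month},M} M^{w}_{t-5:t-21}.
\end{align*}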
We will refer to this model as the SemiCovariance HAR (SCHAR) model. The general
SCHAR model in (4.3) is obviously quite richly parameterized. Hence, motivated by the
results in Table 2, we also consider a restricted version, in which we only include the
daily, weekly and monthly lags of Nw, and the monthly lag of Mw. We will refer to this
specification as the restricted SCHAR model, or SCHAR-r for short.
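To make the lag structure concrete, the sketch below constructs the daily, weekly and monthly regressors and evaluates a SCHAR-r forecast. It assumes, as is standard for HAR-type models, that the multi-day lags are simple averages over days t-1 to t-4 and t-5 to t-21; the function names, the dictionary keys and the coefficient values are hypothetical and are not estimates from the paper.

```python
def har_lags(x, t):
    """Daily value and weekly/monthly lag averages of the series x at day t."""
    if t < 21:
        raise ValueError("need at least 21 observations before day t")
    day = x[t]
    week = sum(x[t - 4:t]) / 4.0          # x[t-4], ..., x[t-1]
    month = sum(x[t - 21:t - 4]) / 17.0   # x[t-21], ..., x[t-5]
    return day, week, month


def schar_r_forecast(phi, n_w, m_w, t):
    """One-step SCHAR-r forecast: three lags of N^w plus the monthly lag of M^w."""
    n_day, n_week, n_month = har_lags(n_w, t)
    _, _, m_month = har_lags(m_w, t)
    return (phi["const"]
            + phi["day_N"] * n_day
            + phi["week_N"] * n_week
            + phi["month_N"] * n_month
            + phi["month_M"] * m_month)


# Hypothetical inputs, chosen only so that the arithmetic is easy to follow.
n_series = [float(i) for i in range(30)]
m_series = [-0.5 * i for i in range(30)]
phi = {"const": 0.1, "day_N": 0.5, "week_N": 0.3, "month_N": 0.2, "month_M": 0.4}

lags_at_25 = har_lags(n_series, 25)
forecast_at_25 = schar_r_forecast(phi, n_series, m_series, 25)
```

The restricted model needs only five parameters, compared with ten for the unrestricted specification in (4.3).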
If the parameters associated with the lagged realized semicovariance components all
coincide (i.e., φj,P = φj,N = φj,M = φj for j ∈ {Day, Week, Month}), the SCHAR model
trivially reduces to the basic HAR model and the corresponding forecasting scheme
\[
RV^{w}_{t+1|t} = \phi_0 + \phi_{\mathrm{Day}} RV^{w}_{t} + \phi_{\mathrm{Week}} RV^{w}_{t-1:t-4} + \phi_{\mathrm{Month}} RV^{w}_{t-5:t-21}. \tag{4.4}
\]
This simple and easy-to-implement model has arguably emerged as the benchmark for
judging alternative realized volatility-based forecasting procedures in the literature.
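The reduction from (4.3) to (4.4) can be spelled out in one line. Writing $\ell \in \{t,\; t-1{:}t-4,\; t-5{:}t-21\}$ for the three lag windows, with $j(\ell)$ the corresponding label in {Day, Week, Month}, and assuming the multi-day lags are averages so that (4.2) carries over to each window, imposing $\phi_{j,P} = \phi_{j,N} = \phi_{j,M} = \phi_j$ gives

```latex
\begin{align*}
RV^{w}_{t+1|t}
  &= \phi_0 + \sum_{\ell} \phi_{j(\ell)} \left( P^{w}_{\ell} + N^{w}_{\ell} + M^{w}_{\ell} \right) \\
  &= \phi_0 + \sum_{\ell} \phi_{j(\ell)} \, RV^{w}_{\ell},
\end{align*}
```

which is the HAR forecasting scheme. The index $\ell$ and the map $j(\ell)$ are shorthand introduced only for this illustration.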
In addition to this commonly used benchmark, we also consider the forecasts from
the Semivariance HAR (SHAR) model of Patton and Sheppard (2015). This model uses
the portfolio realized semivariances defined as
\[
PSV = \sum_{i=1}^{[T/\Delta_n]} p\left(w^{\top} \Delta_i^n X\right)^2, \qquad
NSV = \sum_{i=1}^{[T/\Delta_n]} n\left(w^{\top} \Delta_i^n X\right)^2,
\]
to decompose the daily realized variance into positive and negative semivariances, result-
ing in the forecasting scheme:
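\[
RV^{w}_{t+1|t} = \phi_0 + \phi_{\mathrm{Day},+} PSV_t + \phi_{\mathrm{Day},-} NSV_t + \phi_{\mathrm{Week}} RV^{w}_{t-1:t-4} + \phi_{\mathrm{Month}} RV^{w}_{t-5:t-21}. \tag{4.5}
\]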
This model has been found to perform particularly well from the perspective of portfolio
variance forecasting, performing on par with or better than the forecasts from other HAR-
style models, and as such constitutes another particularly challenging benchmark.19
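For comparison with the portfolio semicovariances, the portfolio semivariances used by the SHAR model depend only on the signs of the portfolio returns. A minimal sketch, with a hypothetical function name and toy data, is:

```python
def portfolio_semivariances(returns, w):
    """Positive and negative realized semivariances of the portfolio return w'r."""
    psv = 0.0
    nsv = 0.0
    for r in returns:
        rp = sum(wj * rj for wj, rj in zip(w, r))  # portfolio return for this interval
        psv += max(rp, 0.0) ** 2
        nsv += min(rp, 0.0) ** 2
    return psv, nsv


# Hypothetical intraday returns for two assets and equal weights.
toy_returns = [[0.01, 0.02], [-0.01, 0.03], [-0.02, -0.01], [0.03, -0.02]]
weights = [0.5, 0.5]

psv, nsv = portfolio_semivariances(toy_returns, weights)
rv_portfolio = sum(
    sum(wj * rj for wj, rj in zip(weights, r)) ** 2 for r in toy_returns
)
```

The two semivariances sum to the portfolio realized variance, but an interval in which the two assets move in opposite directions contributes to PSV or NSV according to the sign of the net portfolio return only, whereas in the semicovariance decomposition it contributes to the mixed component as well.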
We consider equally-weighted portfolios comprised of d = 10 (“small”) and 100
(“large”) stocks randomly selected from the full set of 749 individual stocks. In each
case, we ensure that the selected stocks share at least 1,100 overlapping daily
observations. We then construct rolling out-of-sample forecasts based on each of the different
models, with model parameters re-estimated daily using the most recent 1,000 daily
observations. We rely on the commonly used mean-square-error (MSE) and QLIKE loss
functions to evaluate the performance of the forecasts vis-a-vis the actual portfolio
realized variances $RV^{w}_{t+1}$.20 Table 3 reports the resulting losses averaged across 500 randomly
selected portfolios. In addition, for each of the 500 portfolios, we also compute the ratio of
each model’s average loss relative to the benchmark HAR model, and report the average
of these ratios over all of the random samples.21
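The evaluation step can be sketched as follows. The QLIKE function below uses the normalization realized/forecast - log(realized/forecast) - 1, which is a commonly used form of this loss; whether it matches the exact normalization used for Table 3 is an assumption. All function names and numbers are hypothetical.

```python
import math


def mse_loss(realized, forecast):
    """Squared forecast error."""
    return (realized - forecast) ** 2


def qlike_loss(realized, forecast):
    """QLIKE loss; zero when the forecast equals the realized value."""
    ratio = realized / forecast
    return ratio - math.log(ratio) - 1.0


def mean(values):
    return sum(values) / len(values)


def average_of_ratios(model_avg_losses, benchmark_avg_losses):
    """Average, across portfolios, of each portfolio's loss ratio to the benchmark."""
    return mean([m / b for m, b in zip(model_avg_losses, benchmark_avg_losses)])


# Hypothetical time-averaged losses for two portfolios.
benchmark_losses = [2.0, 0.5]
model_losses = [1.0, 0.5]

avg_ratio = average_of_ratios(model_losses, benchmark_losses)
ratio_of_avgs = mean(model_losses) / mean(benchmark_losses)
```

Because the average of ratios generally differs from the ratio of averages, the Ratio columns in Table 3 need not equal the quotient of the corresponding Average columns. QLIKE is also asymmetric: with this normalization, under-predicting the variance by a factor of two is penalized more heavily than over-predicting it by the same factor.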
Consistent with Patton and Sheppard (2015), the SHAR-based forecasts that uti-
lize the portfolio realized semivariances result in fairly large relative gains vis-a-vis the
benchmark HAR-based forecasts, especially for the large dimensional portfolios. The
performance of the unrestricted SCHAR model, however, is mixed: it has smaller MSE
loss than the HAR forecasts, but underperforms that same benchmark under QLIKE
loss. This finding is hardly surprising. There is ample evidence in the forecasting litera-
ture emphasizing the importance of parsimony (see, e.g., Zellner (1992)), and the results
in Table 2 clearly suggest that the unrestricted SCHAR model is “over-parameterized,”
and as such is likely to perform poorly in a forecasting context. Indeed, looking at the
19 A comprehensive comparison of these models with the numerous other specifications that have appeared in the volatility forecasting literature (e.g., Andersen, Bollerslev, and Diebold (2007), Corsi, Pirino, and Reno (2010) and Caporin, Kolokolov, and Reno (2017)) would be interesting. In the interest of space, we leave a more extensive empirical analysis along these lines for future research.
20 The MSE and QLIKE loss functions may both be formally justified for the purpose of volatility forecast evaluation based on the use of imperfect ex-post volatility proxies; see Patton (2011).
21 The Supplemental Appendix Section S6 reports corroborative evidence from a series of additional robustness checks, including the results from multivariate models designed to forecast the full d × d dimensional realized covariance matrix.
Table 3: Performance Comparison for Portfolio Variance Forecasts

                     MSE                  QLIKE
Model         Average    Ratio     Average    Ratio

Panel A. Small Portfolio Case (d = 10)
HAR            1.849     1.000      0.141     1.000
SHAR           1.671     0.966      0.139     0.986
SCHAR          1.643     0.955      0.210     1.318
SCHAR-r        1.567     0.908      0.139     0.979

Panel B. Large Portfolio Case (d = 100)
HAR            0.048     1.000      0.119     1.000
SHAR           0.045     0.935      0.115     0.957
SCHAR          0.045     0.976      0.236     1.495
SCHAR-r        0.041     0.862      0.111     0.925

Note: The table reports the loss for forecasting the portfolio variance for portfolios of size d = 10 and 100 for each of the different forecasting models. The reported numbers are based on 500 randomly selected portfolios. The Average column provides the average loss over time and all portfolios. The Ratio column gives the time-average ratio of losses across all sets of portfolios relative to the HAR model.
results for the restricted SCHAR-r model guided by the estimates in Table 2, we see that
the forecasts from this model unambiguously outperform those from the other models;
the 13.8% improvement in terms of predictive accuracy (measured by MSE) for the large
portfolio case is particularly impressive. As such, this clearly shows the benefit of utilizing
the additional information inherent in the realized semicovariances, compared to both the
HAR- and SHAR-based forecasts that only rely on the portfolio realized (semi)variances.
These gains in forecast accuracy are not unique to the two portfolio dimensions high-
lighted in Table 3. Figure 5 plots the median loss ratios for the HAR, SHAR and SCHAR-r
models for all values of d ranging from 2 to 100, together with the 10% and 90% quantiles
for the SCHAR-r forecasts computed across the 500 random portfolio-formations. As the
figure shows, the median loss ratios for the SCHAR-r model are systematically below
those of the other two models, the only exception being the QLIKE loss for d = 2. Also,
the gains from using the information in the realized semicovariances accrue relatively
quickly as the number of stocks in the portfolio increases, and for the QLIKE loss appear
to reach somewhat of a plateau for d ≈ 40 stocks.
The gains in forecast accuracy obtained from using realized semicovariances are also
not specific to the one-day forecast horizon. Table S.6 in Section S7 of the Supplemental
Appendix reports results for 5-day- and 22-day-ahead forecasts, and shows that the fore-
cast improvements obtained by the SCHAR-r model relative to the standard HAR model
Figure 5: Median Loss Ratios
[Figure omitted: two panels plotting the MSE ratios (top) and QLIKE ratios (bottom) against the number of stocks d, from 0 to 100, for the HAR, SHAR and SCHAR-r models, together with the 10% and 90% quantiles for SCHAR-r.]
Note: The graph plots the median loss ratios as a function of the number of stocks in the portfolio, d. The ratio is calculated as the average loss of the models divided by the average loss of the standard HAR, based on 500 random samples of d-stock portfolios.
appear even larger over longer forecast horizons.
To help understand the source of these forecast improvements it is instructive to
represent the SHAR and SCHAR models as HAR models with time-varying parameters.
To illustrate the idea, consider the forecasts from a SCHAR-type model based only on
the lagged daily realized semicovariances,
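\begin{align*}
RV^{w}_{t+1|t} &= \phi_0 + \phi_{\mathrm{Day},P} P^{w}_{t} + \phi_{\mathrm{Day},N} N^{w}_{t} + \phi_{\mathrm{Day},M} M^{w}_{t} \\
&= \phi_0 + \left( \phi_{\mathrm{Day},P} \frac{P^{w}_{t}}{RV^{w}_{t}} + \phi_{\mathrm{Day},N} \frac{N^{w}_{t}}{RV^{w}_{t}} + \phi_{\mathrm{Day},M} \frac{M^{w}_{t}}{RV^{w}_{t}} \right) RV^{w}_{t} \\
&\equiv \phi_0 + \phi_{\mathrm{Day},t} RV^{w}_{t}.
\end{align*}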
As these equations show, even though the φDay,P , φDay,N and φDay,M parameters used in
the formulation of the model are all constant, the model may alternatively be interpreted
as a first-order autoregression for $RV^{w}_{t+1}$ with a time-varying autoregressive parameter
φDay,t. This idea immediately extends to the SHAR and the more elaborate SCHAR-r
forecasting models used in our empirical analysis, in which the parameters associated
with the weekly and monthly lags in the implied HAR-type representations would be
Figure 6: Implied HAR Parameters
[Figure omitted: four panels (Daily, Weekly, Monthly, Total) plotting the implied parameters of the HAR, SHAR and SCHAR-r models against time, 2000 to 2015.]
Note: The figure plots the implied HAR-type parameters for predicting the variance of a ten-stock portfolio. The figure shows moving averages of the estimates, averaged across 500 randomly selected portfolios. The models are estimated over the full sample period.
time-varying as well.
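The implied time-varying daily parameter is simply a share-weighted average of the constant semicovariance coefficients. A minimal sketch, with hypothetical coefficient values and component values, is:

```python
def implied_daily_parameter(phi_p, phi_n, phi_m, p_w, n_w, m_w):
    """Implied HAR daily coefficient: weights each phi by its component's share of RV."""
    rv = p_w + n_w + m_w
    return (phi_p * p_w + phi_n * n_w + phi_m * m_w) / rv


# Hypothetical coefficients, constant over time.
phi_p, phi_n, phi_m = 0.1, 0.8, 0.3

# Two hypothetical days with the same realized variance (1.0) but different composition.
phi_day_a = implied_daily_parameter(phi_p, phi_n, phi_m, p_w=0.6, n_w=0.5, m_w=-0.1)
phi_day_b = implied_daily_parameter(phi_p, phi_n, phi_m, p_w=0.3, n_w=0.9, m_w=-0.2)
```

With these illustrative numbers, the day on which the negative semicovariance accounts for a larger share of the realized variance has the larger implied daily parameter, even though the realized variance is the same on both days.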
To study these effects, Figure 6 plots the daily, weekly and monthly time-varying
HAR parameters implied by the SHAR and SCHAR-r models for d = 10, averaged across
500 randomly selected ten-stock portfolios.22 In addition to the implied daily, weekly
and monthly parameters, the last panel reports their sum as a measure of the overall
persistence of the different models.
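The implied parameters shown in Figure 6 are smoothed with a moving average running from day t - 25 to day t + 25. A minimal sketch of such a centered moving average follows; the function name is hypothetical, and truncating the window at the start and end of the sample is an assumption, since the treatment of the sample edges is not described.

```python
def centered_moving_average(x, half_window=25):
    """Average of x over days t - half_window, ..., t + half_window, truncated at the edges."""
    n = len(x)
    smoothed = []
    for t in range(n):
        lo = max(0, t - half_window)
        hi = min(n, t + half_window + 1)
        window = x[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Small hypothetical series with a one-day half window, so the result is easy to verify.
example = centered_moving_average([0.0, 1.0, 2.0, 3.0, 4.0], half_window=1)
```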
The daily, weekly, and monthly parameters for the HAR model are by definition all
constant, with an average implied persistence of around 0.94. By comparison, the implied
daily parameter estimates for the SHAR model vary slightly above the constant daily
HAR parameter, while the constant weekly parameter for the SHAR model is slightly
below that of the HAR model. As such, the overall persistence of the SHAR-based
forecasts is generally close to that of the standard HAR-based forecasts.
By contrast, the implied time-varying daily and weekly HAR parameters for the
SCHAR-r model both far exceed those of the standard HAR model, especially over the
earlier part of the sample. On the other hand, the implied time-varying monthly pa-
rameters for the SCHAR-r model are typically less than the monthly parameters for the
standard HAR and SHAR models. Meanwhile, the sum of the three implied parameters
22 In contrast to the out-of-sample forecast results in Table 3, which are based on a rolling estimation scheme, Figure 6 plots the implied parameter estimates obtained over the full sample period. Requiring observations to be available over the full sample reduces the number of stocks to 121. To avoid contaminating the results by a few influential outliers, we exclude any portfolios for which the maximum $P^{w}/RV^{w}$, $N^{w}/RV^{w}$ and $M^{w}/RV^{w}$ ratios exceed ten, and further smooth the implied parameters using a day t − 25 to day t + 25 moving average.
for the SCHAR-r model, which may be interpreted as an overall measure of persistence,
is typically greater than that for the other two models. Thus, not only do the
superior SCHAR-based forecasts respond more quickly to new information, the forecasts
are typically also more persistent and slower to mean-revert than the benchmark HAR-
and SHAR-based forecasts. This explains why the SCHAR-r model performs well not
just over the short one-day forecast horizon, but also over weekly and monthly forecast
horizons. It also highlights the usefulness of the information residing in the new realized
semicovariance measures for volatility forecasting more generally.
5. Conclusion
We propose a decomposition of the realized covariance matrix based on the signs
of the underlying high-frequency returns into positive, negative and mixed-sign realized
semicovariance components. Under a standard infill asymptotic setting for continuous-
time Ito semimartingales, we derive the first- and second-order asymptotic properties of
these new realized semicovariance measures. The asymptotic theory, taking the form of
a non-central limit theorem, reveals the differential information carried by each of the
realized semicovariance components, related to stochastic correlation, signed co-jumps,
and notions of co-drifting and leverage effects. Using high-frequency data for a large
cross-section of U.S. equities, we demonstrate how the asymptotic theory may be used
to help understand key features of the realized semicovariance components. We further
document distinctly different dynamic dependencies in the different realized semicovari-
ance components. These differences in turn translate into superior forecast performance
for models that utilize the realized semicovariance measures.
The infill asymptotic framework underlying much of the recent financial econometrics
literature, including this paper, naturally suggests the use of high-frequency intraday
data for the calculation of daily realized variation measures. However, the availability of
reliable intraday data limits such analyses to relatively short and recent sample periods.
On the other hand, empirical work in the finance literature often employs daily data for
the calculation of monthly or quarterly (co)variances over longer time periods (see, e.g.,
Schwert (1989)), which speak more directly to the longer holding periods that are com-
monplace in empirical finance. In this spirit, Figure 7 presents a lower-frequency analog
to Figure 1, showing monthly covariances and semicovariances constructed from daily
data spanning 1926 to 2018.23 While the use of coarser daily returns in the construction
of the monthly semicovariances introduces a wedge between the infill asymptotic theory
and the empirical measures, thereby blurring the impact of co-jumps, short-lived co-drifts
and/or leverage effects for explaining the differences between the concordant semicovari-
ance components, the figure still reveals notable differences through time in the relative
23 Analogous to Figure 1, each month we randomly select 500 pairs of stocks from that month’s list of available stocks in the CRSP database, and average the monthly realized measures across all 500 pairs.
Figure 7: Decomposition of Monthly Realized Covariances
[Figure omitted: time series of the Realized Covariance, Concordant Semicovariance and Mixed Semicovariance, with (Semi)Covariance on the vertical axis and Time, 1930 to 2010, on the horizontal axis.]
Note: The figure plots seven-month moving averages of the monthly time series of the concordant semicovariance (P + N), the mixed semicovariance (M) and the realized covariance (C). Each series is constructed as the average of the corresponding monthly realized measures averaged across 500 randomly selected pairs of CRSP stocks over the 1926 - 2018 period. The shaded regions represent NBER recession periods.
importance of the concordant versus discordant components. In particular, the combined
concordant semicovariance component generally, though not uniformly, increases in reces-
sions more than the mixed component declines, leading to an overall increase in the total
covariance. This is true for the Great Depression in the 1930s and the Great Recession
in the late 2000s, as well as for the 1970s oil-shock recession. However, it is not the
case for the recessions of the early 1980s and 2000s. We leave further analyses of these
intriguing features for future research.
References
Aït-Sahalia, Y., and J. Jacod (2009): “Testing for Jumps in a Discretely Observed
Process,” Annals of Statistics, 37, 184–222.
Aït-Sahalia, Y., and J. Jacod (2014): High-Frequency Financial Econometrics.
Princeton University Press, Princeton.
Aït-Sahalia, Y., and D. Xiu (2016): “Increased Correlation Among Asset Classes:
Are Volatility or Jumps to Blame, or Both?,” Journal of Econometrics, 194(2), 205–219.
Amaya, D., P. Christoffersen, K. Jacobs, and A. Vasquez (2015): “Does Re-
alized Skewness Predict the Cross-Section of Equity Returns?,” Journal of Financial
Economics, 118(1), 135–167.
Andersen, T. G., T. Bollerslev, F. X. Diebold, and P. Labys (2003): “Modeling
and Forecasting Realized Volatility,” Econometrica, 71(2), 579–625.
Andersen, T. G., and T. Bollerslev (1997): “Intraday Periodicity and Volatility
Persistence in Financial Markets,” Journal of Empirical Finance, 4, 115–158.
Andersen, T. G., T. Bollerslev, and F. X. Diebold (2007): “Roughing It Up:
Including Jump Components in the Measurement, Modeling, and Forecasting of Return
Volatility,” Review of Economics and Statistics, 89(4), 701–720.
Ang, A., and J. Chen (2002): “Asymmetric Correlations of Equity Portfolios,” Journal
of Financial Economics, 63(3), 443–494.
Back, K. (1992): “Insider Trading in Continuous Time,” Review of Financial Studies,
5(3), 387–409.
Barndorff-Nielsen, O. E., P. R. Hansen, A. Lunde, and N. Shephard (2008):
“Designing Realized Kernels to Measure the ex post Variation of Equity Prices in the
Presence of Noise,” Econometrica, 76, 1481–1536.
Barndorff-Nielsen, O. E., P. R. Hansen, A. Lunde, and N. Shephard (2011):
“Multivariate Realised Kernels: Consistent Positive Semi-Definite Estimators
of the Covariation of Equity Prices with Noise and Non-Synchronous Trading,”
Journal of Econometrics, 162(2), 149–169.
Barndorff-Nielsen, O. E., S. Kinnebrock, and N. Shephard (2010): “Measur-
ing Downside Risk: Realised Semivariance,” in Volatility and Time Series Economet-
rics: Essays in Honor of Robert F. Engle, ed. by T. Bollerslev, J. R. Russell, and M. W.
Watson, pp. 117–136. Oxford University Press, Oxford.
Barndorff-Nielsen, O. E., and N. Shephard (2004a): “Econometric Analysis of
Realized Covariation: High-Frequency based Covariance, Regression, and Correlation
in Financial Economics,” Econometrica, 72(3), 885–925.
Barndorff-Nielsen, O. E., and N. Shephard (2004b): “Power and Bipower Variation with Stochastic Volatility and Jumps,”
Journal of Financial Econometrics, 2(1), 1–37.
Barndorff-Nielsen, O. E., and N. Shephard (2006): “Econometrics of testing for jumps in financial economics using bipower
variation,” Journal of Financial Econometrics, 4(1), 1–30.
Bauwens, L., S. Laurent, and J. V. K. Rombouts (2006): “Multivariate GARCH
Models: A Survey,” Journal of Applied Econometrics, 21(1), 79–109.
Bollerslev, T., T. Law, and G. Tauchen (2008): “Risk, Jumps and Diversifica-
tion,” Journal of Econometrics, 144(1), 234–256.
Bollerslev, T., J. Li, and Y. Xue (2018): “Volume, Volatility, and Public News
Announcements,” Review of Economic Studies, 85(4), 2005–2041.
Bollerslev, T., A. J. Patton, and R. Quaedvlieg (2020): “Multivariate Lever-
age Effects and Realized Semicovariance GARCH Models,” Journal of Econometrics,
Forthcoming.
Bollerslev, T., and V. Todorov (2011a): “Estimation of Jump Tails,” Economet-
rica, 79(6), 1727–1783.
Bollerslev, T., and V. Todorov (2011b): “Tails, Fears, and Risk Premia,” Journal of Finance, 66(6), 2165–2211.
Caporin, M., A. Kolokolov, and R. Reno (2017): “Systemic Co-jumps,” Journal
of Financial Economics, 126(3), 563–591.
Cappiello, L., R. Engle, and K. Sheppard (2006): “Asymmetric Dynamics in the
Correlations of Global Equity and Bond Returns,” Journal of Financial Econometrics,
4(4), 537–572.
Collin-Dufresne, P., and V. Fos (2016): “Insider Trading, Stochastic Liquidity,
and Equilibrium Prices,” Econometrica, 84(4), 1441–1475.
Corsi, F. (2009): “A Simple Approximate Long-Memory Model of Realized Volatility,”
Journal of Financial Econometrics, 7(2), 174–196.
Corsi, F., D. Pirino, and R. Reno (2010): “Threshold bipower variation and the
impact of jumps on volatility forecasting,” Journal of Econometrics, 159(2), 276–288.
Das, S. R., and R. Uppal (2004): “Systemic Risk and International Portfolio Choice,”
Journal of Finance, 59(6), 2809–2834.
Dovonon, P., S. Goncalves, U. Hounyo, and N. Meddahi (2019): “Bootstrapping
High-Frequency Jump Tests,” Journal of the American Statistical Association, 114,
793–803.
Elton, E. J., and M. J. Gruber (1973): “Estimating the Dependence Structure of
Share Prices – Implications for Portfolio Selection,” Journal of Finance, 28(5), 1203–
1232.
Fishburn, P. C. (1977): “Mean-Risk Analysis with Risk Associated with Below-Target
Returns,” American Economic Review, 67(2), 116–126.
Goncalves, S., and N. Meddahi (2009): “Bootstrapping Realized Volatility,” Econo-
metrica, 77, 283–306.
Hansen, P. R., Z. Huang, and H. H. Shek (2012): “Realized GARCH: A Joint Model
for Returns and Realized Measures of Volatility,” Journal of Applied Econometrics,
27(6), 877–906.
Hansen, P. R., and A. Lunde (2006): “Realized Variance and Market Microstructure
Noise,” Journal of Business and Economic Statistics, 24(2), 127–161.
Hansen, P. R., and A. Lunde (2014): “Estimating the Persistence and the Autocorrelation Function of a Time
Series that is Measured with Error,” Econometric Theory, 30(1), 60–93.
Hogan, W. W., and J. M. Warren (1972): “Computation of the Efficient Boundary
in the E-S Portfolio Selection Model,” Journal of Financial and Quantitative Analysis,
7(4), 1881–1896.
Hogan, W. W., and J. M. Warren (1974): “Toward the Development of an Equilibrium Capital-Market Model
Based on Semivariance,” Journal of Financial and Quantitative Analysis, 9(1), 1–11.
Hong, Y., J. Tu, and G. Zhou (2007): “Asymmetries in Stock Returns: Statistical
Tests and Economic Interpretation,” Review of Financial Studies, 20(5), 1547–1581.
Jacod, J. (1997): “On Continuous Conditional Gaussian Martingales and Stable Con-
vergence in Law,” in Seminaire de Probabilites, pp. 232–246. Springer-Verlag, Berlin
Heidelberg New York.
Jacod, J., Y. Li, and X. Zheng (2017): “Statistical Properties of Microstructure
Noise,” Econometrica, 85(4), 1133–1174.
Jacod, J., and P. Protter (2012): Discretization of Processes. Springer Verlag.
Jacod, J., and A. N. Shiryaev (2003): Limit Theorems for Stochastic Processes.
Springer-Verlag, New York, second edn.
Jacod, J., and V. Todorov (2009): “Testing for Common Arrivals of Jumps for
Discretely Observed Multidimensional Processes,” Annals of Statistics, 37(4), 1792–
1838.
Kendall, M. G. (1953): “The Analysis of Economic Time-Series – Part I: Prices,”
Journal of the Royal Statistical Society, Series A, 116(1), 11–34.
Kinnebrock, S., and M. Podolskij (2008): “A Note on the Central Limit Theorem
for Bipower Variation of General Functions,” Stochastic Processes and Their Applica-
tions, 118(6), 1056–1070.
Kristensen, D. (2010): “Nonparametric Filtering of the Realized Spot Volatility: A
Kernel-Based Approach,” Econometric Theory, 26(1), 60–93.
Kroner, K. F., and V. K. Ng (1998): “Modeling Asymmetric Comovements of Asset
Returns,” Review of Financial Studies, 11(4), 817–844.
Kyle, A. S. (1985): “Continuous Auctions and Insider Trading,” Econometrica, 53(6),
1315–1335.
Lahaye, J., S. Laurent, and C. J. Neely (2011): “Jumps, Cojumps and Macro
Announcements,” Journal of Applied Econometrics, 26(6), 893–921.
Lee, S. S. (2012): “Jumps and Information Flow in Financial Markets,” Review of
Financial Studies, 25(2), 439–479.
Lee, S. S., and P. A. Mykland (2008): “Jumps in Financial Markets: A New Non-
parametric Test and Jump Dynamics,” Review of Financial Studies, 21(6), 2535–2563.
Li, J., V. Todorov, and G. Tauchen (2017a): “Adaptive Estimation of Continuous-
Time Regression Models Using High-Frequency Data,” Journal of Econometrics,
200(1), 36–47.
Li, J., V. Todorov, and G. Tauchen (2017b): “Jump Regressions,” Econometrica, 85(1), 173–195.
Li, J., V. Todorov, G. Tauchen, and R. Chen (2017): “Mixed-Scale Jump Regres-
sions with Bootstrap Inference,” Journal of Econometrics, 201(2), 417–432.
Li, J., and D. Xiu (2016): “Generalized Method of Integrated Moments for High-
Frequency Data,” Econometrica, 84(4), 1613–1633.
Li, Y., P. A. Mykland, E. Renault, L. Zhang, and X. Zheng (2014): “Realized
Volatility when Sampling Times are Possibly Endogenous,” Econometric Theory, 30(3),
580–605.
Longin, F., and B. Solnik (2001): “Extreme Correlation of International Equity
Markets,” Journal of Finance, 56(2), 649–676.
Maheu, J. M., and T. H. McCurdy (2004): “News Arrival, Jump Dynamics, and
Volatility Components for Individual Stock Returns,” Journal of Finance, 59(2), 755–793.
Mancini, C. (2001): “Disentangling the Jumps of the Diffusion in a Geometric Jumping
Brownian Motion,” Giornale dell’Istituto Italiano degli Attuari, LXIV, 19–47.
Mancini, C. (2009): “Non-Parametric Threshold Estimation for Models with Stochastic
Diffusion Coefficient and Jumps,” Scandinavian Journal of Statistics, 36(2), 270–296.
Mancini, C., and F. Gobbi (2012): “Identifying the Brownian Covariation from the
Co-Jumps given Discrete Observations,” Econometric Theory, 28(2), 249–273.
Mao, J. C. T. (1970): “Survey of Capital Budgeting: Theory and Practice,” Journal
of Finance, 25(2), 349–360.
Markowitz, H. M. (1959): Portfolio Selection. Wiley.
Neuberger, A. (2012): “Realized Skewness,” Review of Financial Studies, 25(11),
3423–3455.
Noureldin, D., N. Shephard, and K. Sheppard (2012): “Multivariate High-
Frequency-Based Volatility (HEAVY) Models,” Journal of Applied Econometrics,
27(6), 907–933.
Patton, A. J. (2004): “On the Out-of-Sample Importance of Skewness and Asymmetric
Dependence for Asset Allocation,” Journal of Financial Econometrics, 2(1), 130–168.
Patton, A. J. (2011): “Volatility Forecast Comparison using Imperfect Volatility Proxies,”
Journal of Econometrics, 160(1), 246–256.
Patton, A. J., and K. Sheppard (2015): “Good Volatility, Bad Volatility: Signed
Jumps and the Persistence of Volatility,” Review of Economics and Statistics, 97(3),
683–697.
Poon, S.-H., M. Rockinger, and J. Tawn (2004): “Extreme Value Dependence
in Financial Markets: Diagnostics, Models, and Financial Implications,” Review of
Financial Studies, 17(2), 581–610.
Reno, R. (2008): “Nonparametric Estimation of the Diffusion Coefficient of Stochastic
Volatility Models,” Econometric Theory, 24, 1174–1206.
Schwert, G. W. (1989): “Why Does Stock Market Volatility Change Over Time?,”
Journal of Finance, 44(5), 1115–1153.
Tjøstheim, D., and K. O. Hufthammer (2013): “Local Gaussian Correlation: A
New Measure of Dependence,” Journal of Econometrics, 172(1), 33–48.
Zellner, A. (1992): “Presidential Address: Statistics, Science and Public Policy,” Jour-
nal of the American Statistical Association, 87(417), 1–6.
Zhang, L., P. A. Mykland, and Y. Aït-Sahalia (2005): “A Tale of Two Time
Scales: Determining Integrated Volatility with Noisy High-Frequency Data,” Journal
of the American Statistical Association, 100, 1394–1411.
Appendix: Regularity Conditions and Proofs
A.1. Regularity conditions
The following regularity conditions are needed for our asymptotic theory.
Assumption 1 The process $X$ is an Ito semimartingale defined on a filtered probability
space $(\Omega, \mathcal{F}, (\mathcal{F}_t), \mathbb{P})$ of the form (2.1) with
$J_t = \int_0^t \int_{\mathbb{R}} \delta(s,u)\, \mu(ds,du)$, where the
process $b$ is locally bounded, the process $\sigma$ is cadlag and takes values in $\mathbb{R}^{d \times d}$, $\delta$ is a
predictable function and $\mu$ is a Poisson random measure defined on $\mathbb{R}_+ \times \mathbb{R}$ with compensator
$\nu(dt,du) = dt \otimes \lambda(du)$ for some finite measure $\lambda$ on $\mathbb{R}$.
Assumption 2 We have Assumption 1. Moreover, the process $\sigma$ has the form (2.3)
such that (i) $\sigma_t$ is non-singular almost surely for all $t$; (ii) $\tilde{b}$ is locally bounded; (iii)
$\tilde{\sigma}$ is a $d \times d \times d$ cadlag adapted process; (iv) the process $\tilde{M}$ is a local martingale that is
orthogonal to $W$ with $\|\Delta \tilde{M}_t\| \le \bar{\sigma}$ for some constant $\bar{\sigma} > 0$ and its predictable quadratic
covariation process has the form $\langle \tilde{M}, \tilde{M} \rangle_t = \int_0^t q_s\, ds$ for some locally bounded process $q$;
(v) the compensator of the pure-jump process $\sum_{s \le t} \Delta\sigma_s 1_{\{\|\Delta\sigma_s\| > \bar{\sigma}\}}$ has the form $\int_0^t \tilde{q}_s\, ds$
for some locally bounded process $\tilde{q}$.
Overall, these assumptions are fairly mild and quite standard in the analysis of high-
frequency data. In particular, they allow for price and volatility jumps and co-jumps,
as well as the so-called leverage effect. The only notable restriction is the finite activity
of the price jumps. As in Li, Todorov, and Tauchen (2017b), we purposely impose this
condition because, in the current paper, the empirical interest vis-a-vis jumps mainly
concerns “large” market-wide co-jumps, which occur relatively infrequently (and are thus
aligned with the finite-activity condition). Relaxing this condition would greatly
complicate our technical exposition, without leading to any change in the actual numerical
implementation.
A.2. Proofs
A.2.1. Notation and preliminary results
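We begin by defining some notation. Recall that for $j \in \{1, \ldots, d\}$, $\mathcal{T}_j = \{\tau : \Delta X_{j,\tau} \neq 0\}$
collects the jump times of asset $j$, and $\mathcal{T}_{j+} \equiv \{\tau \in \mathcal{T} : \Delta X_{j,\tau} > 0\}$ and
$\mathcal{T}_{j-} \equiv \{\tau \in \mathcal{T} : \Delta X_{j,\tau} < 0\}$ collect the times at which asset $j$ has positive and negative
jumps, respectively. For each jump time $\tau$, we denote by $i(\tau)$ the unique random integer
$i$ such that $(i-1)\Delta_n < \tau \le i\Delta_n$. We then set
\[
\mathcal{I}_j \equiv \{i(\tau) : \tau \in \mathcal{T}_j\} \quad \text{and} \quad \mathcal{I}_{j\pm} \equiv \{i(\tau) : \tau \in \mathcal{T}_{j\pm}\}. \tag{A.1}
\]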
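The mapping from jump times to sampling intervals can be illustrated with a short sketch. The function names and the jump times and sizes are hypothetical; the index is computed as the ceiling of τ/Δn, which satisfies (i - 1)Δn < τ ≤ iΔn for τ > 0, up to the usual floating-point caveats when τ lies very close to a grid point.

```python
import math


def interval_index(tau, delta_n):
    """Unique integer i with (i - 1) * delta_n < tau <= i * delta_n, for tau > 0."""
    return math.ceil(tau / delta_n)


def jump_index_sets(jumps, delta_n):
    """Build I_j, I_{j+} and I_{j-} for one asset from (jump time, jump size) pairs."""
    all_idx, pos_idx, neg_idx = set(), set(), set()
    for tau, size in jumps:
        i = interval_index(tau, delta_n)
        all_idx.add(i)
        if size > 0:
            pos_idx.add(i)
        elif size < 0:
            neg_idx.add(i)
    return all_idx, pos_idx, neg_idx


# Hypothetical jumps of one asset on [0, 1] sampled with delta_n = 0.25.
jumps_asset = [(0.6, 0.02), (0.5, -0.01)]
I_all, I_plus, I_minus = jump_index_sets(jumps_asset, 0.25)
```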
Recall that $p(x) \equiv \max\{x, 0\}$ and $n(x) \equiv \min\{x, 0\}$. To simplify notation, we
denote, for $x, y \in \mathbb{R}$, $f(x,y) \equiv p(x)p(y)$, $g(x,y) \equiv n(x)n(y)$, and $m(x,y) = p(x)n(y)$.
Note that $f$ and $g$ are both continuously differentiable except on $\{(x,y) : x = 0 \text{ or } y = 0\}$, with gradients
\[
\partial f(x,y) = \begin{pmatrix} 1_{\{x \ge 0\}} \max\{y, 0\} \\ \max\{x, 0\}\, 1_{\{y \ge 0\}} \end{pmatrix}, \qquad
\partial g(x,y) = \begin{pmatrix} 1_{\{x \le 0\}} \min\{y, 0\} \\ \min\{x, 0\}\, 1_{\{y \le 0\}} \end{pmatrix}.
\]
For generic functions $h$, $h_1$ and $h_2$ defined on $\mathbb{R}^d$, and an invertible $d \times d$ matrix $a$
such that $c = aa^{\top}$, we define the following quantities:
\[
\begin{cases}
R_c(h) \equiv E_U[h(aU)], \quad \gamma_c(h) \equiv E_U[h(aU)(aU)], \\
\gamma_a(h) \equiv E_U[h(aU)U], \\
H_j \equiv h_j(aU) - R_c(h_j) - \left(c^{-1}\gamma_c(h_j)\right)^{\top} aU, \quad j = 1, 2, \\
\gamma_c(h_1, h_2) \equiv \mathrm{Cov}_U(H_1, H_2), \\
\bar{\Gamma}_c(h_1, h_2) \equiv \mathrm{Cov}_U\left(\left(c^{-1}\gamma_c(h_1)\right)^{\top} aU, \left(c^{-1}\gamma_c(h_2)\right)^{\top} aU\right), \\
\Gamma_c(h_1, h_2) \equiv \mathrm{Cov}_U\left(h_1(aU), h_2(aU)\right),
\end{cases} \tag{A.2}
\]
where $U$ is a generic $d$-dimensional standard normal variable, and $E_U$ and $\mathrm{Cov}_U$ are
the expectation and covariance operators with respect to $U$, respectively. We note that,
except for $\gamma_a(h)$, the expected values in (A.2) depend on $a$ only through $c = aa^{\top}$, which
explains our notation $R_c(\cdot)$, $\gamma_c(\cdot)$, etc.

We need to establish a few identities for these functionals. Observe that the variable
$H_j$ is the residual obtained from projecting the demeaned variable $h_j(aU) - E_U[h_j(aU)]$
onto $aU$, with $c^{-1}\gamma_c(h_j)$ being the corresponding projection coefficient. Since $a$ is
non-singular, this residual may alternatively be written as $H_j = h_j(aU) - E_U[h_j(aU)] - \gamma_a(h_j)^{\top} U$.
Hence, the functional $\gamma_c(h_1, h_2)$ can be rewritten as
\[
\gamma_c(h_1, h_2) = E_U\left[\left(h_1(aU) - \gamma_a(h_1)^{\top} U\right)\left(h_2(aU) - \gamma_a(h_2)^{\top} U\right)\right] - E_U[h_1(aU)]\, E_U[h_2(aU)]. \tag{A.3}
\]
We further note that $\Gamma_c(h_1, h_2)$ computes the covariance of $h_1(aU) - E_U[h_1(aU)]$ and
$h_2(aU) - E_U[h_2(aU)]$, while $\bar{\Gamma}_c(h_1, h_2)$ computes the covariance of their projections. By
a decomposition of covariance, we then deduce
\[
\gamma_c(h_1, h_2) = \Gamma_c(h_1, h_2) - \bar{\Gamma}_c(h_1, h_2). \tag{A.4}
\]
In addition,
\[
\bar{\Gamma}_c(h_1, h_2) = \gamma_c(h_1)^{\top} c^{-1} \gamma_c(h_2). \tag{A.5}
\]
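For completeness, the two algebraic steps behind these identities can be written out as follows; this is a short worked derivation under the definitions in (A.2), using only $c = aa^{\top}$, the invertibility of $a$ and $E_U[UU^{\top}] = I_d$.

```latex
\begin{align*}
\gamma_c(h) &= E_U[h(aU)\, aU] = a\, E_U[h(aU)\, U] = a\, \gamma_a(h), \\
\left(c^{-1}\gamma_c(h)\right)^{\top} aU
  &= \gamma_a(h)^{\top} a^{\top} \left(aa^{\top}\right)^{-1} a\, U
   = \gamma_a(h)^{\top} U, \\
\mathrm{Cov}_U\!\left(\left(c^{-1}\gamma_c(h_1)\right)^{\top} aU,\;
                      \left(c^{-1}\gamma_c(h_2)\right)^{\top} aU\right)
  &= \gamma_c(h_1)^{\top} c^{-1} a\, E_U[UU^{\top}]\, a^{\top} c^{-1} \gamma_c(h_2)
   = \gamma_c(h_1)^{\top} c^{-1} \gamma_c(h_2).
\end{align*}
```

The second line gives the alternative expression for the residual $H_j$, and the third line gives the closed form for the covariance of the two projections.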
Our analysis for the semicovariances relies on some explicit calculations of the
expectations in (A.2). Lemma A1, below, provides the details for $c$ taking the form
\[
c = \begin{pmatrix} v_1^2 & \rho v_1 v_2 \\ \rho v_1 v_2 & v_2^2 \end{pmatrix}. \tag{A.6}
\]
The proof is done by direct integration and is omitted for brevity.
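Lemma A1 The following statements hold when the matrix $c$ has the form (A.6):
(a) $R_c(f) = R_c(g) = v_1 v_2 \psi(\rho)$ and $R_c(m) = -v_1 v_2 \psi(-\rho)$;
(b) $R_c(\partial f) = -R_c(\partial g) = (2\sqrt{2\pi})^{-1} (1 + \rho)\, (v_2, v_1)^{\top}$;
(c) $\gamma_c(f) = -\gamma_c(g) = (2\sqrt{2\pi})^{-1} (1 + \rho)^2\, v_1 v_2\, (v_1, v_2)^{\top}$;
(d) $\Gamma_c(f, f) = \Gamma_c(g, g) = v_1^2 v_2^2 \left(\Psi(\rho) - \psi(\rho)^2\right)$ and $\Gamma_c(f, g) = -v_1^2 v_2^2 \psi(\rho)^2$.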
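Parts (a) to (c) of the lemma can be checked by simulation. The sketch below is a rough Monte Carlo sanity check, not a substitute for the direct integration. It assumes that ψ(ρ) = (ρ arccos(-ρ) + √(1 - ρ²))/(2π), which is the closed form of E[max(Z1, 0) max(Z2, 0)] for standard normal variables with correlation ρ; the function names, parameter values, seed and number of draws are all hypothetical choices.

```python
import math
import random


def psi(rho):
    """Closed form of E[max(Z1, 0) * max(Z2, 0)] for standard normals with correlation rho."""
    return (rho * math.acos(-rho) + math.sqrt(1.0 - rho * rho)) / (2.0 * math.pi)


def monte_carlo_moments(v1, v2, rho, n_draws=200_000, seed=12345):
    """Simulate (X, Y) with covariance matrix (A.6) and average the relevant functions."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    acc = {"R_f": 0.0, "R_g": 0.0, "R_m": 0.0, "R_df_1": 0.0, "gamma_f_1": 0.0}
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x, y = v1 * z1, v2 * (rho * z1 + s * z2)
        px, py = max(x, 0.0), max(y, 0.0)
        nx, ny = min(x, 0.0), min(y, 0.0)
        acc["R_f"] += px * py                             # f = p(x) p(y)
        acc["R_g"] += nx * ny                             # g = n(x) n(y)
        acc["R_m"] += px * ny                             # m = p(x) n(y)
        acc["R_df_1"] += (1.0 if x >= 0 else 0.0) * py    # first element of the gradient of f
        acc["gamma_f_1"] += px * py * x                   # first element of gamma_c(f)
    return {key: total / n_draws for key, total in acc.items()}


v1, v2, rho = 1.0, 2.0, 0.5
kappa = 1.0 / (2.0 * math.sqrt(2.0 * math.pi))
closed_form = {
    "R_f": v1 * v2 * psi(rho),
    "R_g": v1 * v2 * psi(rho),
    "R_m": -v1 * v2 * psi(-rho),
    "R_df_1": kappa * (1.0 + rho) * v2,
    "gamma_f_1": kappa * (1.0 + rho) ** 2 * v1 * v2 * v1,
}
estimates = monte_carlo_moments(v1, v2, rho)
max_gap = max(abs(estimates[k] - closed_form[k]) for k in closed_form)
```

With this many draws the simulated moments should agree with the closed forms up to Monte Carlo error, which is why the comparison uses a loose tolerance rather than exact equality.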
A.2.2. Proofs
Proof of Theorem 1. Let X ′ denote the continuous part of X, that is, X ′t ≡∫ t
0bsds+∫ t
0σsdWs. We define P ′ in the same way as P , but with X replaced by X ′. It follows
that
Pjk = P ′jk +
∑i∈Ij∪Ik
p (ΔniXj) p (Δ
niXk)−
∑i∈Ij∪Ik
p(Δn
iX′j
)p (Δn
iX′k) .
By Theorem 3.4.1(b) in Jacod and Protter (2012) and Lemma A1(a),
P ′jk
P−→∫ T
0
vj,svk,sψ(ρjk,s)ds. (A.7)
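We also note that
$\sum_{i\in\mathcal{I}_j\cup\mathcal{I}_k} p(\Delta_i^n X_j)\,p(\Delta_i^n X_k) = \sum_{\tau\in\mathcal{T}_j\cup\mathcal{T}_k} p(\Delta_{i(\tau)}^n X_j)\,p(\Delta_{i(\tau)}^n X_k)$
and $\Delta_{i(\tau)}^n X \to \Delta X_\tau$ pathwise. Hence,
$$
\sum_{i\in\mathcal{I}_j\cup\mathcal{I}_k} p(\Delta_i^n X_j)\,p(\Delta_i^n X_k) \xrightarrow{\mathbb{P}} P^{\dagger}_{jk}. \tag{A.8}
$$
Finally, by a standard estimate for continuous Itô semimartingales and the fact that $\mathcal{I}_j\cup\mathcal{I}_k$
is almost surely finite, we deduce that
$\sum_{i\in\mathcal{I}_j\cup\mathcal{I}_k} p(\Delta_i^n X'_j)\,p(\Delta_i^n X'_k) = O_p(\Delta_n)$. This
estimate, combined with (A.7) and (A.8), implies the asserted convergence $\hat{P}_{jk} \xrightarrow{\mathbb{P}} P_{jk}$.
The proof for $\hat{N}$ and $\hat{M}$ follows essentially the same argument. Q.E.D.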
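To make the objects in Theorem 1 concrete, the following sketch computes realized semicovariances from a matrix of high-frequency returns and checks the continuous-part limit (A.7) on simulated data. It is an illustration under simplifying assumptions that are ours: constant volatilities $v_1, v_2$ and correlation $\rho$, no drift and no jumps, $T = 1$, the mixed-sign component taken as $\sum_i [p(r_i)n(r_i)^{\top} + n(r_i)p(r_i)^{\top}]$, and the closed form $\psi(\rho) = (2\pi)^{-1}\big(\rho\arccos(-\rho) + \sqrt{1-\rho^2}\big)$. The function names are hypothetical.

```python
import numpy as np

def psi(rho):
    # Assumed closed form of E[max(Z1, 0) * max(Z2, 0)] for correlation rho.
    return (np.sqrt(1.0 - rho**2) + rho * np.arccos(-rho)) / (2.0 * np.pi)

def realized_semicovariances(returns):
    """Split the realized covariance of an (n, d) return array by return signs."""
    pos = np.maximum(returns, 0.0)            # p(x) = max(x, 0), elementwise
    neg = np.minimum(returns, 0.0)            # n(x) = min(x, 0), elementwise
    P = pos.T @ pos                           # both returns positive
    N = neg.T @ neg                           # both returns negative
    M = pos.T @ neg + neg.T @ pos             # mixed signs (assumed definition)
    return P, N, M

def simulate_continuous_returns(v1, v2, rho, n, seed=0):
    """Increments of a bivariate Brownian motion with constant covariance over [0, 1]."""
    rng = np.random.default_rng(seed)
    c = np.array([[v1**2, rho * v1 * v2], [rho * v1 * v2, v2**2]])
    a = np.linalg.cholesky(c)
    return np.sqrt(1.0 / n) * (rng.standard_normal((n, 2)) @ a.T)

r = simulate_continuous_returns(1.0, 2.0, 0.4, n=200_000)
P, N, M = realized_semicovariances(r)
print(P[0, 1], 1.0 * 2.0 * psi(0.4))          # close for large n, in line with (A.7)
print(np.allclose(P + N + M, r.T @ r))        # the three components add up to the realized covariance
```

Because each return equals $p(r) + n(r)$, the three components sum to the realized covariance exactly, while the off-diagonal element of $\hat{P}$ approaches $v_1 v_2 \psi(\rho)$ as the sampling interval shrinks.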
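Proof of Theorem 2. Step 1. We begin by outlining the basic steps of the proof.
First, we decompose $\Delta_n^{-1/2}(\hat{P}_{12} - P_{12}) = \sum_{j=1}^{5} P^{(j)}$ and
$\Delta_n^{-1/2}(\hat{N}_{12} - N_{12}) = \sum_{j=1}^{5} N^{(j)}$, where
$$
\begin{cases}
P^{(1)} \equiv \Delta_n^{-1/2}\Big(\sum_{i\in\mathcal{I}_{1+}\cap\mathcal{I}_{2+}} p(\Delta_i^n X_1)\,p(\Delta_i^n X_2) - P^{\dagger}_{12}\Big), \\
P^{(2)} \equiv \Delta_n^{-1/2}\sum_{i\in\mathcal{I}_{1+}\setminus\mathcal{I}_{2}} p(\Delta_i^n X_1)\,p(\Delta_i^n X_2), \\
P^{(3)} \equiv \Delta_n^{-1/2}\sum_{i\in\mathcal{I}_{2+}\setminus\mathcal{I}_{1}} p(\Delta_i^n X_1)\,p(\Delta_i^n X_2), \\
P^{(4)} \equiv \Delta_n^{-1/2}\sum_{i\in\mathcal{I}_{1-}\cup\mathcal{I}_{2-}} p(\Delta_i^n X_1)\,p(\Delta_i^n X_2), \\
P^{(5)} \equiv \Delta_n^{-1/2}\Big(\sum_{i\notin\mathcal{I}_{1}\cup\mathcal{I}_{2}} p(\Delta_i^n X_1)\,p(\Delta_i^n X_2) - P^{\star}_{12}\Big),
\end{cases}
$$
and
$$
\begin{cases}
N^{(1)} \equiv \Delta_n^{-1/2}\Big(\sum_{i\in\mathcal{I}_{1-}\cap\mathcal{I}_{2-}} n(\Delta_i^n X_1)\,n(\Delta_i^n X_2) - N^{\dagger}_{12}\Big), \\
N^{(2)} \equiv \Delta_n^{-1/2}\sum_{i\in\mathcal{I}_{1-}\setminus\mathcal{I}_{2}} n(\Delta_i^n X_1)\,n(\Delta_i^n X_2), \\
N^{(3)} \equiv \Delta_n^{-1/2}\sum_{i\in\mathcal{I}_{2-}\setminus\mathcal{I}_{1}} n(\Delta_i^n X_1)\,n(\Delta_i^n X_2), \\
N^{(4)} \equiv \Delta_n^{-1/2}\sum_{i\in\mathcal{I}_{1+}\cup\mathcal{I}_{2+}} n(\Delta_i^n X_1)\,n(\Delta_i^n X_2), \\
N^{(5)} \equiv \Delta_n^{-1/2}\Big(\sum_{i\notin\mathcal{I}_{1}\cup\mathcal{I}_{2}} n(\Delta_i^n X_1)\,n(\Delta_i^n X_2) - N^{\star}_{12}\Big).
\end{cases}
$$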
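To prove the assertion of Theorem 2, it suffices to show the following joint stable
convergence in law:
$$
\Big(\sum_{j=1}^{4} P^{(j)},\ \sum_{j=1}^{4} N^{(j)}\Big) \xrightarrow{\mathcal{L}\text{-}s} (\xi_P, \xi_N), \tag{A.9}
$$
$$
\begin{pmatrix} P^{(5)} \\ N^{(5)} \end{pmatrix} \xrightarrow{\mathcal{L}\text{-}s}
\begin{pmatrix} B \\ -B \end{pmatrix} + \begin{pmatrix} L \\ -L \end{pmatrix} + \begin{pmatrix} \zeta \\ -\zeta \end{pmatrix} + \begin{pmatrix} \tilde{\zeta}_P \\ \tilde{\zeta}_N \end{pmatrix}. \tag{A.10}
$$
In steps 2 and 3, below, we prove these claims in turn. We note that, by a standard
argument, it is easy to show that the convergences in (A.9) and (A.10) hold jointly with
$\mathcal{F}$-conditionally independent limits (as they involve non-overlapping Brownian increments).
Therefore, the remaining task is to separately show (A.9) and (A.10).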
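Step 2. This step proves the claim in (A.9). Let $\mathcal{T} \equiv \mathcal{T}_1 \cup \mathcal{T}_2$ collect all jump times
of the bivariate process $X$. We set, for each $\tau\in\mathcal{T}$,
$\tilde{\eta}_\tau = (\tilde{\eta}_{1,\tau}, \tilde{\eta}_{2,\tau})^{\top} \equiv \Delta_n^{-1/2}(\Delta_{i(\tau)}^n X - \Delta X_\tau)$.
By Proposition 4.4.10 in Jacod and Protter (2012),
$$
(\tilde{\eta}_\tau)_{\tau\in\mathcal{T}} \xrightarrow{\mathcal{L}\text{-}s} (\eta_\tau)_{\tau\in\mathcal{T}}, \tag{A.11}
$$
where $\eta_\tau = (\eta_{1,\tau}, \eta_{2,\tau})^{\top} \equiv \sqrt{\kappa_\tau}\,\xi_{\tau-} + \sqrt{1-\kappa_\tau}\,\xi_{\tau+}$ (recall the definitions in Section 2.2).
From (A.11), it is easy to derive the following representations uniformly for all $\tau\in\mathcal{T}$
(note that $\mathcal{T}$ is finite almost surely): for $j\in\{1,2\}$,
$$
p(\Delta_{i(\tau)}^n X_j) =
\begin{cases}
\Delta X_{j,\tau} + \Delta_n^{1/2}\tilde{\eta}_{j,\tau} + o_p(\Delta_n^{1/2}), & \text{if } \tau\in\mathcal{T}_{j+}, \\
\Delta_n^{1/2}\, p(\tilde{\eta}_{j,\tau}) + o_p(\Delta_n^{1/2}), & \text{if } \tau\in\mathcal{T}\setminus\mathcal{T}_{j}, \\
o_p(\Delta_n^{1/2}), & \text{if } \tau\in\mathcal{T}_{j-},
\end{cases} \tag{A.12}
$$
and
$$
n(\Delta_{i(\tau)}^n X_j) =
\begin{cases}
o_p(\Delta_n^{1/2}), & \text{if } \tau\in\mathcal{T}_{j+}, \\
\Delta_n^{1/2}\, n(\tilde{\eta}_{j,\tau}) + o_p(\Delta_n^{1/2}), & \text{if } \tau\in\mathcal{T}\setminus\mathcal{T}_{j}, \\
\Delta X_{j,\tau} + \Delta_n^{1/2}\tilde{\eta}_{j,\tau} + o_p(\Delta_n^{1/2}), & \text{if } \tau\in\mathcal{T}_{j-}.
\end{cases} \tag{A.13}
$$
Using these representations, we further deduce
$$
\begin{cases}
P^{(1)} = \sum_{\tau\in\mathcal{T}_{1+}\cap\mathcal{T}_{2+}} \left(\Delta X_{1,\tau}\,\tilde{\eta}_{2,\tau} + \Delta X_{2,\tau}\,\tilde{\eta}_{1,\tau}\right) + o_p(1), \\
P^{(2)} = \sum_{\tau\in\mathcal{T}_{1+}\setminus\mathcal{T}_{2}} \Delta X_{1,\tau}\, p(\tilde{\eta}_{2,\tau}) + o_p(1), \\
P^{(3)} = \sum_{\tau\in\mathcal{T}_{2+}\setminus\mathcal{T}_{1}} \Delta X_{2,\tau}\, p(\tilde{\eta}_{1,\tau}) + o_p(1), \\
P^{(4)} = o_p(1),
\end{cases}
$$
and
$$
\begin{cases}
N^{(1)} = \sum_{\tau\in\mathcal{T}_{1-}\cap\mathcal{T}_{2-}} \left(\Delta X_{1,\tau}\,\tilde{\eta}_{2,\tau} + \Delta X_{2,\tau}\,\tilde{\eta}_{1,\tau}\right) + o_p(1), \\
N^{(2)} = \sum_{\tau\in\mathcal{T}_{1-}\setminus\mathcal{T}_{2}} \Delta X_{1,\tau}\, n(\tilde{\eta}_{2,\tau}) + o_p(1), \\
N^{(3)} = \sum_{\tau\in\mathcal{T}_{2-}\setminus\mathcal{T}_{1}} \Delta X_{2,\tau}\, n(\tilde{\eta}_{1,\tau}) + o_p(1), \\
N^{(4)} = o_p(1).
\end{cases}
$$
These representations, together with the convergence (A.11), imply (A.9).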
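The three regimes in (A.12) reflect a simple fact about $p(x) = \max(x, 0)$: once the sampling interval is small, the sign of an increment that contains a jump is determined by the jump, whereas an increment without a jump is a rescaled normalized increment and $p$ is positively homogeneous. The deterministic sketch below illustrates this for a fixed value of the normalized increment, written `eta`. The helper names are hypothetical, and the example ignores the randomness that gives rise to the $o_p$ terms in the formal statement.

```python
import math

def p(x):
    return max(x, 0.0)

def a12_leading_term(jump, eta, delta):
    """Leading term of p(increment) in (A.12) for a jump of size `jump`."""
    if jump > 0:                      # tau in T_{j+}: the positive jump dominates
        return jump + math.sqrt(delta) * eta
    if jump == 0:                     # tau in T \ T_j: no jump in component j
        return math.sqrt(delta) * p(eta)
    return 0.0                        # tau in T_{j-}: the negative jump dominates

def scaled_remainder(jump, eta, delta):
    """(p(increment) - leading term) / sqrt(delta), where increment = jump + sqrt(delta) * eta."""
    increment = jump + math.sqrt(delta) * eta
    return (p(increment) - a12_leading_term(jump, eta, delta)) / math.sqrt(delta)

# With a coarse grid the remainder can be large; with a fine grid it vanishes.
for delta in (1.0, 1e-2, 1e-4):
    print(delta, scaled_remainder(0.5, -3.0, delta))
```

For a positive jump of size 0.5 and `eta = -3`, the remainder is nonzero only while $\Delta_n^{1/2}\,|\eta|$ exceeds the jump size, which is the mechanism behind the first line of (A.12).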
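Step 3. This step proves the claim in (A.10). Let $X'$ denote the continuous part of
$X$, that is, $X'_t \equiv \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s$. We define $\hat{P}'$ and $\hat{N}'$ in the same way as
$\hat{P}$ and $\hat{N}$, but with $X$ replaced by $X'$. It is easy to see that
$\hat{P}'_{12} - \sum_{i\notin\mathcal{I}_1\cup\mathcal{I}_2} p(\Delta_i^n X_1)\,p(\Delta_i^n X_2)$ and
$\hat{N}'_{12} - \sum_{i\notin\mathcal{I}_1\cup\mathcal{I}_2} n(\Delta_i^n X_1)\,n(\Delta_i^n X_2)$ are $O_p(\Delta_n)$.
Hence, it suffices to prove the claim with $P^{(5)}$ and $N^{(5)}$ replaced by
$\tilde{P}' \equiv \Delta_n^{-1/2}(\hat{P}'_{12} - P^{\star}_{12})$ and $\tilde{N}' \equiv \Delta_n^{-1/2}(\hat{N}'_{12} - N^{\star}_{12})$, respectively.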
In order to derive the stable convergence in law of $(\tilde{P}', \tilde{N}')$, we apply Theorem 5.3.5 in
Jacod and Protter (2012) to the test function $x \mapsto (p(x_1)p(x_2),\ n(x_1)n(x_2))$. To do so,
it is enough to verify that the limiting variables $(B,-B)$, $(L,-L)$, $(\zeta,-\zeta)$ and $(\tilde{\zeta}_P, \tilde{\zeta}_N)$
coincide with Jacod and Protter's $A$, $A'$, $U$ and $U'$ variables. Turning to the details,
we first note that, by Lemma A1(a), we can rewrite $P^{\star}_{12} = \int_0^T R_{c_s}(f)\,ds$ and
$N^{\star}_{12} = \int_0^T R_{c_s}(g)\,ds$. By Lemma A1(b), we can rewrite the $B$ term as
$B = \int_0^T b_s^{\top} R_{c_s}(\partial f)\,ds = -\int_0^T b_s^{\top} R_{c_s}(\partial g)\,ds$.
Given (2.7), no further rewriting is needed for the $(L,-L)$ term.
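By Lemma A1(c), we can rewrite
$$
\gamma_t = \gamma_{c_t}(f) \quad\text{and}\quad -\gamma_t = \gamma_{c_t}(g). \tag{A.14}
$$
Hence, $\gamma_t = \sigma_t\gamma_{\sigma_t}(f) = -\sigma_t\gamma_{\sigma_t}(g)$ and, by (2.8),
$\zeta = \int_0^T \gamma_{\sigma_s}(f)^{\top}\,dW_s = -\int_0^T \gamma_{\sigma_s}(g)^{\top}\,dW_s$.
By (A.14) and (A.5), we see that $\bar{\Gamma}_t$ defined in (2.10) can be written as
$$
\bar{\Gamma}_t = \begin{pmatrix} \bar{\Gamma}_{c_t}(f,f) & \bar{\Gamma}_{c_t}(f,g) \\ \bar{\Gamma}_{c_t}(g,f) & \bar{\Gamma}_{c_t}(g,g) \end{pmatrix}. \tag{A.15}
$$
By Lemma A1(d), we see that $\Gamma_t$ defined in (2.12) can be expressed equivalently as
$$
\Gamma_t = \begin{pmatrix} \Gamma_{c_t}(f,f) & \Gamma_{c_t}(f,g) \\ \Gamma_{c_t}(g,f) & \Gamma_{c_t}(g,g) \end{pmatrix}. \tag{A.16}
$$
Recall that $\tilde{\gamma}_t \equiv \Gamma_t - \bar{\Gamma}_t$. By (A.4), (A.15) and (A.16), it follows that
$$
\tilde{\gamma}_t = \begin{pmatrix} \tilde{\gamma}_{c_t}(f,f) & \tilde{\gamma}_{c_t}(f,g) \\ \tilde{\gamma}_{c_t}(g,f) & \tilde{\gamma}_{c_t}(g,g) \end{pmatrix}.
$$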
In view of (A.3), we verify that the local quadratic variation of the $(\tilde{\zeta}_P, \tilde{\zeta}_N)$ term coincides
with that defined in (5.2.4) of Jacod and Protter (2012).
We are now ready to apply Theorem 5.3.5 in Jacod and Protter (2012) to finish the
proof of (A.10), and hence, the assertion of Theorem 2. Q.E.D.
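Proof of Proposition 1. By Proposition 1 in Li, Todorov, and Tauchen (2017b), the
set $\hat{\mathcal{I}}$ coincides with $\mathcal{I}_1\cup\mathcal{I}_2$ with probability approaching one and, in restriction to that
sequence of events,
$$
\hat{P}^{\star}_{12} = \sum_{i\notin\mathcal{I}_1\cup\mathcal{I}_2} p(\Delta_i^n X_1)\,p(\Delta_i^n X_2), \qquad
\hat{P}^{\dagger}_{12} = \sum_{i\in\mathcal{I}_1\cup\mathcal{I}_2} p(\Delta_i^n X_1)\,p(\Delta_i^n X_2),
$$
$$
\hat{N}^{\star}_{12} = \sum_{i\notin\mathcal{I}_1\cup\mathcal{I}_2} n(\Delta_i^n X_1)\,n(\Delta_i^n X_2), \qquad
\hat{N}^{\dagger}_{12} = \sum_{i\in\mathcal{I}_1\cup\mathcal{I}_2} n(\Delta_i^n X_1)\,n(\Delta_i^n X_2).
$$
The assertion then follows from (A.9) and (A.10) in the proof of Theorem 2. Q.E.D.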
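Proof of Proposition 2. Let $\mathcal{T} = \mathcal{T}_1\cup\mathcal{T}_2$. For notational simplicity, we denote, for
each subset $\mathcal{S}\subseteq\mathcal{T}$,
$$
\xi^{*}_{P}(\mathcal{S}) \equiv \Delta_n^{-1/2}\sum_{\tau\in\mathcal{S}} \Big( p\big(\Delta_{i(\tau)}^n X^{*}_1 + \Delta_n^{1/2}\eta^{*}_{i(\tau),1}\big)\, p\big(\Delta_{i(\tau)}^n X^{*}_2 + \Delta_n^{1/2}\eta^{*}_{i(\tau),2}\big) - p\big(\Delta_{i(\tau)}^n X^{*}_1\big)\, p\big(\Delta_{i(\tau)}^n X^{*}_2\big) \Big). \tag{A.17}
$$
By Proposition 1 in Li, Todorov, and Tauchen (2017b), $\hat{\mathcal{I}}$ coincides with $\mathcal{I}_1\cup\mathcal{I}_2$ with
probability approaching one. Hence, we can restrict our calculations to that sequence
of events without loss of generality. In particular, we can write $\xi^{*}_{P}$ as $\xi^{*}_{P}(\mathcal{T})$ using the
notation (A.17). We can then decompose $\xi^{*}_{P}$ as
$$
\xi^{*}_{P} = \xi^{*}_{P}(\mathcal{T}_{1+}\cap\mathcal{T}_{2+}) + \xi^{*}_{P}(\mathcal{T}_{1+}\setminus\mathcal{T}_{2}) + \xi^{*}_{P}(\mathcal{T}_{2+}\setminus\mathcal{T}_{1}) + \xi^{*}_{P}(\mathcal{T}_{1-}\cup\mathcal{T}_{2-}).
$$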
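We note that $\Delta_{i(\tau)}^n X^{*}_j = 0$ for all $\tau\in\mathcal{T}\setminus\mathcal{T}_j$ with probability approaching one. It is then
easy to deduce that
$$
p\big(\Delta_{i(\tau)}^n X^{*}_j + \Delta_n^{1/2}\eta^{*}_{i(\tau),j}\big) =
\begin{cases}
p(\Delta_{i(\tau)}^n X^{*}_j) + \Delta_n^{1/2}\eta^{*}_{i(\tau),j} + o_p(\Delta_n^{1/2}), & \text{if } \tau\in\mathcal{T}_{j+}, \\
\Delta_n^{1/2}\, p(\eta^{*}_{i(\tau),j}) + o_p(\Delta_n^{1/2}), & \text{if } \tau\in\mathcal{T}\setminus\mathcal{T}_{j}, \\
o_p(\Delta_n^{1/2}), & \text{if } \tau\in\mathcal{T}_{j-}.
\end{cases}
$$
From there, we readily deduce that
$$
\begin{cases}
\xi^{*}_{P}(\mathcal{T}_{1+}\cap\mathcal{T}_{2+}) = \sum_{\tau\in\mathcal{T}_{1+}\cap\mathcal{T}_{2+}} \Big( p(\Delta_{i(\tau)}^n X^{*}_1)\,\eta^{*}_{i(\tau),2} + p(\Delta_{i(\tau)}^n X^{*}_2)\,\eta^{*}_{i(\tau),1} \Big) + o_p(1), \\
\xi^{*}_{P}(\mathcal{T}_{1+}\setminus\mathcal{T}_{2}) = \sum_{\tau\in\mathcal{T}_{1+}\setminus\mathcal{T}_{2}} p(\Delta_{i(\tau)}^n X^{*}_1)\, p(\eta^{*}_{i(\tau),2}) + o_p(1), \\
\xi^{*}_{P}(\mathcal{T}_{2+}\setminus\mathcal{T}_{1}) = \sum_{\tau\in\mathcal{T}_{2+}\setminus\mathcal{T}_{1}} p(\Delta_{i(\tau)}^n X^{*}_2)\, p(\eta^{*}_{i(\tau),1}) + o_p(1), \\
\xi^{*}_{P}(\mathcal{T}_{1-}\cup\mathcal{T}_{2-}) = o_p(1).
\end{cases} \tag{A.18}
$$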
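By a standard result for spot covariance estimation (see, e.g., Theorem 9.3.2 in Jacod
and Protter (2012)), we have $\hat{c}_{i(\tau)\pm} \xrightarrow{\mathbb{P}} c_{\tau\pm}$. Consequently,
$(\eta^{*}_{i(\tau)})_{\tau\in\mathcal{T}} \xrightarrow{\mathcal{L}|\mathcal{F}} (\eta_\tau)_{\tau\in\mathcal{T}}$, where
$\xrightarrow{\mathcal{L}|\mathcal{F}}$ denotes the convergence in probability of $\mathcal{F}$-conditional laws under the uniform
metric. In addition, we note that $p(\Delta_{i(\tau)}^n X^{*}_j) \xrightarrow{\mathbb{P}} \Delta X_{j,\tau}$ for all $\tau\in\mathcal{T}_{j+}$. Combining
these convergence results with (A.18), we deduce that $\xi^{*}_{P} \xrightarrow{\mathcal{L}|\mathcal{F}} \xi_P$. Evidently, the same
argument can be used to show that $\xi^{*}_{N} \xrightarrow{\mathcal{L}|\mathcal{F}} \xi_N$, jointly with
$\xi^{*}_{P} \xrightarrow{\mathcal{L}|\mathcal{F}} \xi_P$, which implies the
first assertion of Proposition 2 (i.e., $\xi^{*}_{P} - \xi^{*}_{N} \xrightarrow{\mathcal{L}|\mathcal{F}} \xi_P - \xi_N$). The size and power properties
of the test readily follow from here. Q.E.D.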
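Proof of Proposition 3. Let $\mathcal{G}$ be the $\sigma$-field generated by $(b_t, \sigma_t)_{0\le t\le T}$. Since $\tilde{\sigma} = 0$,
$L = 0$ by definition (recall (2.7)). Proposition 1 then implies that
$\Delta_n^{-1/2}(\hat{P}^{\star}_{12} - \hat{N}^{\star}_{12}) - 2B \xrightarrow{\mathcal{L}\text{-}s} 2\zeta + \tilde{\zeta}_P - \tilde{\zeta}_N$.
Since $\mathcal{G}$ and $W$ are independent, and $\tilde{W}$ in the definition of $(\tilde{\zeta}_P, \tilde{\zeta}_N)$
is independent of $\mathcal{F}$, we see that $2\zeta + \tilde{\zeta}_P - \tilde{\zeta}_N$ is, conditional on $\mathcal{G}$, centered Gaussian
with conditional variance $\Sigma^{\star}$ (recall (2.15)). By Theorem 3 of Li, Todorov, and Tauchen
(2017a), $\hat{\Sigma}^{\star} \to \Sigma^{\star}$. By the property of stable convergence in law, we have
$$
\frac{\Delta_n^{-1/2}\big(\hat{P}^{\star}_{12} - \hat{N}^{\star}_{12}\big) - 2B}{\sqrt{\hat{\Sigma}^{\star}}}
\xrightarrow{\mathcal{L}\text{-}s} \frac{2\zeta + \tilde{\zeta}_P - \tilde{\zeta}_N}{\sqrt{\Sigma^{\star}}} \equiv \mathcal{N},
$$
where, by definition, $\mathcal{N}$ is a standard normal variable (defined on an extension of the
original probability space) that is independent of $\mathcal{G}$. In restriction to $\Omega_0 = \{B = 0\}$, we
have $\Delta_n^{-1/2}(\hat{P}^{\star}_{12} - \hat{N}^{\star}_{12})/\sqrt{\hat{\Sigma}^{\star}} \xrightarrow{\mathcal{L}\text{-}s} \mathcal{N}$. Since $\Omega_0\in\mathcal{G}$, this convergence readily implies the
first assertion of the proposition concerning the null rejection probability.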
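To prove the second assertion, we fix some constant $R > 0$. We suppose that $\mathbb{P}(\Omega_a) > 0$
without loss of generality because, otherwise, the conditional probability $\mathbb{P}(C_n^{+}\,|\,\Omega_a)$ can
be arbitrarily defined and there would be nothing to prove. By the property of stable
convergence, we have
$$
\begin{aligned}
&\mathbb{P}\left(\left\{\frac{\Delta_n^{-1/2}(\hat{P}^{\star}_{12} - \hat{N}^{\star}_{12})}{\sqrt{\hat{\Sigma}^{\star}}} > z_\alpha\right\}\cap\Omega_a\right) \\
&\quad= \mathbb{P}\left(\left\{\frac{\Delta_n^{-1/2}(\hat{P}^{\star}_{12} - \hat{N}^{\star}_{12}) - 2B}{\sqrt{\hat{\Sigma}^{\star}}} > z_\alpha - \frac{2B}{\sqrt{\hat{\Sigma}^{\star}}}\right\}\cap\Omega_a\right) \\
&\quad= \mathbb{P}\left(\left\{\mathcal{N} > z_\alpha - \frac{2B}{\sqrt{\Sigma^{\star}}}\right\}\cap\Omega_a\right) + o(1) \\
&\quad\ge \mathbb{P}\left(\{\mathcal{N} > z_\alpha - R\}\cap\Omega_a\right) + o(1).
\end{aligned}
$$
Note that $\Omega_a\in\mathcal{G}$. Since $\mathcal{N}$ is independent of $\mathcal{G}$,
$\mathbb{P}(\{\mathcal{N} > z_\alpha - R\}\cap\Omega_a) = (1 - \Phi(z_\alpha - R))\,\mathbb{P}(\Omega_a)$. Hence,
$$
\liminf_{n}\ \mathbb{P}\left(\left\{\frac{\Delta_n^{-1/2}(\hat{P}^{\star}_{12} - \hat{N}^{\star}_{12})}{\sqrt{\hat{\Sigma}^{\star}}} > z_\alpha\right\}\cap\Omega_a\right) \ge (1 - \Phi(z_\alpha - R))\,\mathbb{P}(\Omega_a).
$$
Dividing both sides by $\mathbb{P}(\Omega_a)$, we finish the proof of the second assertion of this
proposition. Q.E.D.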
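To give a sense of magnitudes, the lower bound $1 - \Phi(z_\alpha - R)$ on the asymptotic conditional rejection probability can be tabulated directly. The short sketch below assumes that $z_\alpha$ denotes the $(1-\alpha)$-quantile of the standard normal distribution, as is usual for a one-sided test; the function name is ours.

```python
from statistics import NormalDist

def rejection_probability_lower_bound(alpha, R):
    """Evaluate 1 - Phi(z_alpha - R), with z_alpha the (1 - alpha)-quantile (assumed)."""
    std_normal = NormalDist()                     # standard normal: mean 0, sd 1
    z_alpha = std_normal.inv_cdf(1.0 - alpha)
    return 1.0 - std_normal.cdf(z_alpha - R)

for R in (0.0, 1.0, 2.0, 4.0):
    print(R, rejection_probability_lower_bound(0.05, R))
```

At $R = 0$ the bound equals the nominal level $\alpha$, and it increases toward one as $R$ grows.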