modeling dynamic diurnal patterns in high frequency financial data · 2013-01-14 · modeling...
TRANSCRIPT
Modeling dynamic diurnal patterns in highfrequency financial data
Ryoko Ito1
1Faculty of Economics,Cambridge University
Tinbergen Institute Amsterdam, January 2013
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 1/27
Introduction
We want to build a model for high frequency observations offinancial data
Not easy: need to capture stylized features
– Concentration of zero-observations– Diurnal patterns– Skewness, heavy tail– Seasonality (intra-weekly, monthly, quarterly...)– Highly persistent dynamics (long-memory?)
An extension of DCS model works very well! Several advantagesover other existing methods.
Methods:
– Distribution decomposition at zero– Unobserved components– Dynamic cubic spline (Harvey and Koopman (1993))
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 2/27
Data
Trade volume of IBM stock traded on the NYSE. The numberof shares traded.
Period: 5 consequtive trading weeks in February - March 2000
Sampling frequency: 30 seconds
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 3/27
Empirical features
Diurnal U-shaped patterns
0
50,000
100,000
150,000
200,000
250,000
300,000
350,000
400,000
Mon‐20‐Mar Tue‐21‐Mar Wed‐22‐Mar Thu‐23‐Mar Fri‐24‐Mar0
500
1,000
1,500
2,000
2,500
3,000
3,500
4,000
Mon‐20‐Mar Tue‐21‐Mar Wed‐22‐Mar Thu‐23‐Mar Fri‐24‐Mar
Figure: IBM trade volume (left column) and the same series smoothed by the simplemoving average (right column). Time on the x-axis. Monday 20 - Friday 24 March2000. Each day covers trading hours between 9.30am-4pm (in the New York localtime).
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 4/27
Empirical features
Trade volume bottoms out at around 1pm.
0
10,000
20,000
30,000
40,000
50,000
60,000
70,000
80,000
90,000
100,000
09 11 13 150
200
400
600
800
1,000
1,200
1,400
1,600
1,800
2,000
09 11 13 15
Figure: IBM trade volume (left column) and the same series smoothed by the simplemoving average (right column). Time on the x-axis. Wednesday 22 March 2000,covering 9.30am-4pm (in the New York local time).
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 5/27
Empirical features
Sample autocorrelation. Highly persistent.
0 20 40 60 80 100 120 140 160 180 200−0.1
−0.05
0
0.05
0.1
0.15
0.2
0.25
0.3
Lag
Sam
ple
Aut
ocor
rela
tion
of y
95% confidence interval
Figure: Sample autocorrelation of IBM trade volume. Sampling period: 28 February -31 March 2000. The 200th lag corresponds approximately to 1.5 hours prior.
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 6/27
Empirical features
Skewness, long upper-tail.
0 0.5 1 1.5 2 2.5 3 3.5 4
x 104
0
0.5
1
1.5
All volume
Fre
quen
cy %
0 0.5 1 1.5 2 2.5 3 3.5 4
x 104
0
10
20
30
40
50
60
70
80
90
100
All volumeE
mpi
rical
CD
F %
Figure: Frequency distribution (left) and empirical cdf (right) of IBM trade volume.Sampling period: 28 February - 31 March 2000.
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 7/27
Empirical features
Sample statistics.
Observations (total) 19,500
Mean 10,539Median 6,100Maximum 1,652,100Minimum 0Std. Dev. 26,071Skewness 29
99.9% sample quantile 293,654Max - 99.9% quantile 1,358,446
Frequency of zero-obs. 0.47% (92 obs.)
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 8/27
Intra-day DCS with dynamic cubic spline
Our model: intra-day DCS with dynamic cubic spline
Time index: intra-day time τ on t-th trading day as ·t,ττ = 0, . . . , I and t = 1, . . . ,T .τ = 0 and τ = I : the moments of market open and close.
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 9/27
Distribution decomposition at zero
Define CDF F : R≥0 → [0, 1] of a standard random variable X ∼ F
– the origin has a discrete mass of probability– strictly positive support is captured by a conventional
continuous distribution
Formally,
PF (X = 0) = p p ∈ [0, 1]
PF (X > 0) = 1− p (1)
PF (X ≤ x |X > 0) = F ∗(x) x > 0
F ∗ : R>0 → [0, 1] is the cdf of a conventional standard continuousrandom variable with constant parameter vector θ∗.
Decomposition technique: Amemiya (1973), Heckman (1974),McCulloch and Tsay (2001) Rydberg and Shephard (2003),Hautsch, Malec, and Schienle (2010)
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 10/27
Our p is constant. OK as the number of zero-observations is small.
– Possible extension: time-varying pt,τ using logit link [Rydbergand Shephard (2003), Hautsch, Malec, and Schienle (2010)]
Apply DCS filter only to positive observations.
– irregular term ut,τ : the score of F ∗
– λt,τ driven by ut,τ−11l{yt,τ−1>0}
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 11/27
Unobserved components
Assumption 1: periodicity and autocorrelation in data are due to thetime-varying scale parameter αt,τ = exp(λt,τ ). Standardized observationsare iid and free of periodicity and autocorrelation.
⇒ Let λt,τ have a component structure to capture each feature of data.
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 12/27
Unobserved components
Unobserved components:
yt,τ = εt,τ exp(λt,τ ), εt,τ |Ft,τ−1 ∼ iidF
λt,τ = δ + µt,τ + ηt,τ + st,τ
µt,τ : low-frequency component. Captures highly persistentnonstationary dynamics
µt,τ = µt,τ−1 + κµut,τ−11l{yt,τ−1>0}
ηt,τ : stationary (autoregressive) component. A mixture of ARcomponents captures long-memory.
ηt,τ =J∑
j=1
η(j)t,τ
η(j)t,τ = φ
(j)1 η
(j)t,τ−1 + φ
(j)2 η
(j)t,τ−2 · · ·+ φ(j)
m η(j)
t,τ−m(j) + κ(j)η ut,τ−11l{yt,τ−1>0}
for some J ∈ N>0 and m(j) ∈ N>0.
st,τ : periodic component capturing diurnal patterns
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 13/27
Dynamic cubic spline
st,τ : dynamic cubic spline (Harvey and Koopman (1993))
st,τ =k∑
j=1
1l{τ∈[τj−1,τj ]} z j(τ) · γ†
k: number of knots
τ0 < τ1 < · · · < τk : coordinates of the knots along time-axis
γ† = (γ1, . . . , γk)>: y-coordinates (height) of the knots
z j : [τj−1, τj ]k → Rk : k-dimensional vector of functions.
Conveys information about (i) polynomial order, (ii) continuity, (iii)periodicity, and (iv) zero-sum conditions.
Bowsher and Meeks (2008): “special type of dynamic factor model”
Time-varying spline: let γ† → γ†t,τ
where
γ†t,τ
= γ†t,τ−1
+ κ∗ · ut,τ−11l{yt,τ−1>0}
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 14/27
Why use this spline?
Alternative options used by many:
Fourier representation
Sample moments for each intra-day bins
Diurnal pattern = deterministic function of intra-day time
(Andersen and Bollerslev (1998), Engle and Russell (1998), Shang et al.(2001), Campbell and Diebold (2005), Engle and Rangel (2008),Brownlees et al. (2011), Engle and Sokalska (2012).)
So why use this spline?
Allows for changing diurnal patterns
No need for a two-step procedure to “diurnally adjust” data
Formal test for the day-of-the-week effect. Compare shape ofdirunal patterns.
– Unlike the alternative: seasonal dummies. Test for leveldifferences. Used by many (e.g. Andersen and Bollerslev(1998), Lo and Wang (2010))
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 15/27
Estimation
Apply spline-DCS model to IBM trade volume data
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 16/27
Estimation results
Assumption 1: εt,τ = yt,τ/αt,τ has to be free of autocorrelation.Satisfied - no autocorrelation in εt,τ .
0 20 40 60 80 100 120 140 160 180 200−0.1
−0.05
0
0.05
0.1
0.15
0.2
0.25
0.3
Lag
Sam
ple
Aut
ocor
rela
tion
of y
95% confidence interval
0 20 40 60 80 100 120 140 160 180 200−0.1
−0.05
0
0.05
0.1
0.15
0.2
0.25
0.3
LagS
ampl
e A
utoc
orre
latio
n of
res
95% confidence interval
Figure: Sample autocorrelation of trade volume (top), of εt,τ (left). The 95%confidence interval is computed at ±2 standard errors.
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 17/27
Estimation results
F ∗ ∼ Burr distribution fits very well. PIT: F ∗(εt,τ ) ∼ U[0,1].
0 1 2 3 4 5 6 7 8 9 100
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
res > 0, Burr
Em
piric
al C
DF
Empirical CDF
res > 0Burr with estimated parameters
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
F*(res>0)E
mpi
rical
CD
F o
f F*(
res>
0)
Empirical CDF
Figure: Empirical cdf of εt,τ > 0 against cdf of Burr(ν, ζ) (left). Empirical cdf of the
PIT of εt,τ > 0 computed under F∗(·; θ∗) ∼Burr(ν, ζ) (right).
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 18/27
Compare with log-normal distribution
Log-normal distribution popular. Often used in literature. (e.g.Alizadeh, Brandt, Diebold (2002))
But log-normal inferior to Burr.
PIT of εt,τ far from U[0,1]. Why?
−4 −3 −2 −1 0 1 2 3 4 50
500
1000
1500
2000
2500
3000
3500
4000
log(IBM30s>0)
Em
piric
al fr
eque
ncy
−5 −4 −3 −2 −1 0 1 2 3 4 5−4
−3
−2
−1
0
1
2
3
4
5
Theoretical N(0,1) quantiles
Em
piric
al q
uant
iles
of IB
M30
s>0
QQ Plot of Sample Data versus Standard Normal
Figure: Log(trade volume): The frequency distribution (left) and the QQ-plot (right).Using non-zero observations re-centered around mean and standardized by onestandard deviation.
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 19/27
Estimated spline component
Reflects diurnal patterns that evolve over time.
‐1.0
‐0.5
0.0
0.5
1.0
1.5
Mon
day
Tuesday
Wed
nesday
Thursday
Friday
Mon
day
Tuesday
Wed
nesday
Thursday
Friday
Mon
day
Tuesday
Wed
nesday
Thursday
Friday
Mon
day
Tuesday
Wed
nesday
Thursday
Friday
6 ‐ 10 March 13 ‐ 17 March 20 ‐ 24 March 27 ‐ 31 March
‐1.0
‐0.5
0.0
0.5
1.0
09:30
10:30
11:30
12:30
13:30
14:30
15:30
Tuesday 14 March
Figure: st,τ : over 6 - 31 March 2000 (left) and of a typical day, Tuesday 14 March,from market open to close (right).
Formal test for the day-of-the-week effect (a likelihood ratio test)⇒ there is no statistically significant evidence in data.
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 20/27
Overnight effect
Standard methods:
Dummy variables. Treat extreme observations as outliers.
Differentiate day and overnight jumps
Treat morning events as censored.
(e.g. Rydberg and Shephard (2003), Gerhard and Hautsch (2007), Boes,Drost and Werker (2007).)
Issues:
It may take time for the overnight effect to diminish completelyduring the day
Difficult to identify which observations are due to overnightinformation
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 21/27
Overnight effect
0
400,000
800,000
1,200,000
1,600,000
2,000,000
1000 2000 3000 4000 5000 6000 7000 8000 9000
IBM_VOLUME
0
20,000
40,000
60,000
80,000
100,000
120,000
1000 2000 3000 4000 5000 6000 7000 8000 9000
EXP_LAMBDA
Figure: Capturing overnight effect: trade volume (left) and αt,τ = exp(λt,τ ) (right).
Periodic hikes in scale parameter αt,τ .Reflects cyclical surge in market activity.Overnight information interpreted as the elevated probability of extremeevents.Advantage of the exponential link.
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 22/27
Long memory
Two component specification for ηt,τ works well:
ηt,τ = η(1)t,τ + η
(2)t,τ
η(1)t,τ = φ
(1)1 η
(1)t,τ−1 + φ
(1)2 η
(1)t,τ−2 + κ
(1)η ut,τ−11l{yt,τ−1>0}
η(2)t,τ = φ
(2)1 η
(2)t,τ−1 + κ
(2)η ut,τ−11l{yt,τ−1>0}
0 50 100 150−0.2
0
0.2
0.4
0.6
0.8
Lag
Sam
ple
Aut
ocor
rela
tion
of e
ta
ACF of etaHyperbolic decay
Figure: Autocorrelation function of ηt,τ .
The degree of fractional integration in ηt,τ : d = 0.140 (s.e. 0.057).Picking up long-memory in data.
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 23/27
Estimated coefficients
κµ 0.007 (0.002) γ†1;1,0 0.122 (0.070)
φ(1)1 0.561 (0.129) γ†2;1,0 -0.485 (0.079)
φ(1)2 0.400 (0.129) γ†3;1,0 -0.229 (0.058)
κ(1)η 0.053 (0.008) δ 9.254 (0.181)
φ(2)1 0.676 (0.046) ν 1.635 (0.016)
κ(2)η 0.091 (0.009) ζ 1.467 (0.042)κ∗1 0.000 (0.001) p 0.0047 (0.0005)κ∗2 -0.002 (0.001)κ∗3 0.000 (0.001)
Parametric assumptions, identifiability requirements satisfied.ηt,τ stationary.p is consistent with sample statistics.
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 24/27
Out-of-sample performance
Our model and estimation results are stable
One-step ahead density forecasts (without re-estimation): very goodfor 20 days ahead.
Multi-step ahead density forecasts: very good (i.e. PIT approx. iid
∼ U[0,1]) for one complete trading-day ahead (equivalent of 780steps).
More details and discussions in the paper.
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 25/27
Out-of-sample performance
Multi-step density forecasts
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
PIT value
Em
piric
al C
DF
1 day ahead
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
PIT value
Em
piric
al C
DF
5 days ahead
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
PIT value
Em
piric
al C
DF
8 days ahead
Figure: Empirical cdf of the PIT of multi-step forecasts. Forecast horizons: 1 dayahead (left), 5 days ahead (middle), 8 days ahead (right).
0 10 20 30 40 50 60 70 80 90−0.5
0
0.5
1
Lag
Sam
ple
Aut
ocor
rela
tion
95% confidence interval
0 10 20 30 40 50 60 70 80 90−0.5
0
0.5
1
Lag
Sam
ple
Aut
ocor
rela
tion
95% confidence interval
0 10 20 30 40 50 60 70 80 90−0.5
0
0.5
1
LagS
ampl
e A
utoc
orre
latio
n
95% confidence interval
Figure: Autocorrelation of the PIT of multi-step forecasts. Forecast horizons: 1.5hours ahead (left), one day ahead (middle), five days ahead (right).
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 26/27
Future directions
Model for higher-frequency: 1 second?Asymptotic properties of MLE when DCS non-stationaryMulti-variate version: price and volumeApplication to panel data (using composite likelihood?)etc. etc.
Ryoko Ito, Faculty of Economics, Cambridge University, UK Modeling dynamic diurnal patterns in high freq. fin. data 27/27