
Bootstrap and Subsampling Dependent Data

Helle Bunzel

ISU

March 13, 2009


Bootstrapping Dependent Data I: Why it is different.

Suppose $X_1, X_2, \ldots, X_T$ is a sequence of stationary, weakly dependent random variables.

Weak dependence means, roughly, that random variables far apart from one another are nearly independent (mixing).

Note that the distribution of this data sequence can no longer be described by a single marginal distribution function.

We need the entire joint distribution.

Consider a situation where the parameter of interest is the population mean, such that $\theta = E(X_1)$.

Suppose we estimate $\theta$ by $\hat\theta = \frac{1}{T}\sum_{t=1}^{T} X_t$ and let $T_n = \hat\theta - \theta$.

Then $T_n$ depends on the entire joint distribution of $X_1, X_2, \ldots, X_T$.


Bootstrapping Dependent Data II: Why it is different.

For example:

$$V(T_n) = \frac{1}{T}\left[ V(X_1) + 2 \sum_{i=1}^{T-1}\left(1 - \frac{i}{T}\right) \operatorname{Cov}(X_1, X_{1+i}) \right]$$

This depends on all the bivariate distributions of $X_1$ and $X_i$ for $i = 1, \ldots, T$.

Now suppose we carry out the standard iid bootstrap in this situation. We create a sample $X_1^*, X_2^*, \ldots, X_T^*$.

Note that conditional on the original data, $X_1^*, X_2^*, \ldots, X_T^*$ are actually iid variables. As a result, $\hat\theta^* = \frac{1}{T}\sum_{t=1}^{T} X_t^*$ is still a valid estimator of the mean. However, $V\left(\hat\theta^*\right)$ is wrong.
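To see the problem concretely, here is a small illustration (not from the slides): it simulates an AR(1) series, applies the standard iid bootstrap to the sample mean, and compares the resulting variance estimate with the formula above evaluated at the known AR(1) autocovariances. All parameter values and the simulation setup are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a stationary AR(1): X_t = rho * X_{t-1} + e_t (illustrative parameters).
T, rho = 500, 0.7
e = rng.standard_normal(T)
x = np.empty(T)
x[0] = e[0] / np.sqrt(1 - rho**2)        # draw X_1 from the stationary distribution
for t in range(1, T):
    x[t] = rho * x[t - 1] + e[t]

# iid bootstrap estimate of V(theta-hat): resample observations independently.
B = 2000
boot_means = np.array([rng.choice(x, size=T, replace=True).mean() for _ in range(B)])
var_iid_boot = boot_means.var()

# The formula above, (1/T)[V(X_1) + 2 * sum_i (1 - i/T) Cov(X_1, X_{1+i})],
# evaluated at the known AR(1) autocovariances gamma(i) = rho^i / (1 - rho^2).
gamma = rho ** np.arange(T) / (1 - rho**2)
var_formula = (gamma[0] + 2 * np.sum((1 - np.arange(1, T) / T) * gamma[1:])) / T

print(f"iid bootstrap variance  : {var_iid_boot:.4f}")
print(f"formula with true gammas: {var_formula:.4f}")  # noticeably larger for rho > 0
```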


Bootstrapping Dependent Data I: When the innovations are iid

Suppose $\{X_t\}_{t \ge 1}$ is a sequence of random variables such that

$$X_t = h(X_{t-1}, \ldots, X_{t-p}; \beta) + \varepsilon_t$$

where:

$p < T$
$\beta$ is a $q \times 1$ vector of parameters
$h : \mathbb{R}^{p+q} \to \mathbb{R}$ is a known function
$\{\varepsilon_t\}_{t > p}$ is an iid sequence of random variables that are independent of $X_1, \ldots, X_p$.

Examples?

Let $\hat\beta_n$ be the estimator of $\beta$, and assume that we are interested in some statistic $T_n$.


Bootstrapping Dependent Data II: When the innovations are iid

Now, define the residuals:

$$\hat\varepsilon_t = X_t - h\left(X_{t-1}, \ldots, X_{t-p}; \hat\beta_n\right), \quad p < t \le T$$

Note that in general

$$\bar\varepsilon_T = \frac{1}{T-p}\sum_{t=1}^{T-p} \hat\varepsilon_{t+p} \ne 0$$

For this reason, define the centered residuals:

$$\tilde\varepsilon_t = \hat\varepsilon_t - \bar\varepsilon_T, \quad p < t \le T$$

It has been shown that if you forget to center, a non-disappearing random bias is introduced in the bootstrap.


Bootstrapping Dependent Data III: When the innovations are iid

Now, draw a random sample of size $(T - p)$, $\varepsilon^*_{p+1}, \ldots, \varepsilon^*_T$, from $\{\tilde\varepsilon_t\}_{p < t \le T}$. Finally, create your bootstrap sample as:

$$X_t^* = \begin{cases} X_t & 1 \le t \le p \\ h\left(X_{t-1}, \ldots, X_{t-p}; \hat\beta_n\right) + \varepsilon_t^* & p < t \le T \end{cases}$$

Note that this method is similar to the residual bootstrap for regression models, and that the $\varepsilon_t^*$ are iid and mean 0 by construction.

There are some different versions of this method, but all with the same principle.

Once you have the bootstrap data, it is possible to calculate $T^*$ and use the methods we covered under the iid bootstrap. A sketch of this scheme for an AR(1) model follows.
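A minimal sketch of the residual scheme for the special case of an AR(1), i.e. $h(X_{t-1}; \beta) = \beta X_{t-1}$ with $p = 1$. The least-squares estimator of $\beta$, the placeholder data, and the function name are illustrative assumptions rather than prescriptions from the slides.

```python
import numpy as np

def ar1_residual_bootstrap(x, rng):
    """One bootstrap replication of the residual scheme for an AR(1):
    X_t = beta * X_{t-1} + eps_t, so p = 1."""
    T = len(x)
    # Estimate beta by least squares of X_t on X_{t-1} (an illustrative choice).
    beta_hat = np.sum(x[1:] * x[:-1]) / np.sum(x[:-1] ** 2)
    # Residuals for p < t <= T, centered so that they sum to zero.
    resid = x[1:] - beta_hat * x[:-1]
    resid_centered = resid - resid.mean()
    # Draw T - p errors with replacement from the centered residuals.
    eps_star = rng.choice(resid_centered, size=T - 1, replace=True)
    # Bootstrap sample: keep X_1, then apply the fitted model plus eps*.
    # The original lagged values are used, as in the displayed formula;
    # a recursive variant would feed back X*_{t-1} instead.
    return np.concatenate(([x[0]], beta_hat * x[:-1] + eps_star))

rng = np.random.default_rng(1)
x = np.empty(200)
x[0] = 0.0
for t in range(1, 200):                      # placeholder AR(1) data
    x[t] = 0.6 * x[t - 1] + rng.standard_normal()
x_star = ar1_residual_bootstrap(x, rng)
```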


The principle of the Block Bootstrap I

To use the previous method, we needed very specific information about the model.

We do not always have that information.

Return to the initial example:

$X_1, X_2, \ldots, X_T$ is a sequence of stationary weakly dependent variables
the parameter of interest is the population mean, $\theta$
we estimate $\theta$ by $\hat\theta = \frac{1}{T}\sum_{t=1}^{T} X_t$ and let $T_n = \hat\theta - \theta$ and

$$V(T_n) = \frac{1}{T}\left[ V(X_1) + 2 \sum_{i=1}^{T-1}\left(1 - \frac{i}{T}\right) \operatorname{Cov}(X_1, X_{1+i}) \right]$$

Because of weak dependence, the lower-order lag autocovariances are more important for $V(T_n)$ than the higher-order lag autocovariances.


The principle of the Block Bootstrap II

For example, choose $\ell$ large but $\ell < T$. Then

$$\left| \sum_{i=\ell}^{T-1}\left(1 - \frac{i}{T}\right) \operatorname{Cov}(X_1, X_{1+i}) \right| \le \sum_{i=\ell}^{\infty} \left| \operatorname{Cov}(X_1, X_{1+i}) \right| \;\xrightarrow[\ell \to \infty]{}\; 0$$

That means we can approximate

$$V(T_n) \approx \frac{1}{T}\left[ V(X_1) + 2 \sum_{i=1}^{\ell-1}\left(1 - \frac{i}{T}\right) \operatorname{Cov}(X_1, X_{1+i}) \right]$$

This only depends on the joint distribution of the shorter series $X_1, X_2, \ldots, X_\ell$.

The block bootstrap makes use of this fact.


The principle of the Block Bootstrap III

Consider now a more general case:

We are interested in $T_n(X_1, X_2, \ldots, X_T, \theta)$ where $\theta$ is a level-one parameter and $T_n$ is invariant to permutations of $X_1, X_2, \ldots, X_T$. $T_n = \frac{1}{T}\sum_{t=1}^{T} X_t - \theta$ satisfies this.

Now suppose that $\ell$ is some integer such that both $\ell$ and $T/\ell$ are large.

Example: $\ell = \lfloor T^\delta \rfloor$, $0 < \delta < 1$.

To make notation easier, assume $b = T/\ell$ is an integer.

For the non-overlapping block bootstrap (NBB) we partition the data into $b$ blocks:

$$Y_1 = \{X_1, \ldots, X_\ell\}, \;\ldots,\; Y_b = \left\{X_{(b-1)\ell + 1}, \ldots, X_T\right\}$$

We then sample (with replacement) $b$ blocks $Y_1^*, \ldots, Y_b^*$ and, putting them together, get the bootstrap dataset $X_1^*, X_2^*, \ldots, X_T^*$.


The principle of the Block Bootstrap IV

This method, like the others, attempts to re-create the relationship between the population and the sample.

Notation: Let $P_k$ denote the joint distribution of $(X_1, \ldots, X_k)$, $k \ge 1$. Then, because of stationarity, $Y_1, \ldots, Y_b$ are all identically distributed with distribution $P_\ell$.

Also, because of the weak dependence assumption, $Y_1, \ldots, Y_b$ are approximately independent.


The principle of the Block Bootstrap V

The basic principle is therefore:

Assume stationarity and weak dependence. Then the data is approximately distributed as $P_\ell^b = P_\ell \otimes P_\ell \otimes \cdots \otimes P_\ell$.

Specifically, the empirical distribution of $Y_1, \ldots, Y_b$ is $\tilde P_\ell^b$, which is an estimate of $P_\ell^b$, which is "close" to the true distribution.

The bootstrap sample $Y_1^*, \ldots, Y_b^* = X_1^*, X_2^*, \ldots, X_T^*$ has the exact distribution $\tilde P_\ell^b$ (which is close to $P_\ell^b$, which is close to the true distribution).

Underlying all these "close to" are a lot of approximations, which require assumptions to hold.


The Block Bootstrap I: An Example

Suppose $X_1, \ldots, X_T$ are generated by a stationary ARMA(1,1) process:

$$X_t = \beta_1 X_{t-1} + \varepsilon_t + \alpha_1 \varepsilon_{t-1}$$

where $|\alpha_1| < 1$ and $|\beta_1| < 1$ are parameters and $\{\varepsilon_t\}$ is a sequence of standard iid random variables.

Suppose we have 100 observations and we wish to use the NBB with block length $\ell = 5$.

Then we create 20 blocks:

$$B_1 = (X_1, \ldots, X_5), \; B_2 = (X_6, \ldots, X_{10}), \;\ldots,\; B_{20} = (X_{96}, \ldots, X_{100})$$

We now re-sample 20 blocks $\{B_1^*, B_2^*, \ldots, B_{20}^*\}$ with replacement and use these to create the bootstrap dataset $X_1^*, \ldots, X_{100}^*$, as in the sketch below.
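A sketch of the NBB construction for this example ($T = 100$, $\ell = 5$); the ARMA parameter values and the seed are placeholders.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate 100 observations from an ARMA(1,1) with illustrative parameters.
T, ell, beta1, alpha1 = 100, 5, 0.5, 0.3
eps = rng.standard_normal(T + 1)
x = np.empty(T)
x[0] = eps[1] + alpha1 * eps[0]
for t in range(1, T):
    x[t] = beta1 * x[t - 1] + eps[t + 1] + alpha1 * eps[t]

# Partition the data into b = T / ell = 20 non-overlapping blocks B_1, ..., B_20.
b = T // ell
blocks = x.reshape(b, ell)          # row j is (X_{(j-1)*ell+1}, ..., X_{j*ell})

# Resample 20 blocks with replacement and concatenate them into X*_1, ..., X*_100.
idx = rng.integers(0, b, size=b)
x_star = blocks[idx].reshape(-1)
print(x_star.shape)                 # (100,)
```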

Now consider estimation of the variance of $\hat\mu = \bar X_T$.


The Block Bootstrap II: An Example

The bootstrap estimator is given by

$$\hat V(\bar X_T) = E^*\left(\bar X_T^* - E^*(\bar X_T^*)\right)^2 = E^*\left(\bar X_T^* - \bar X_T\right)^2$$

It turns out that in this case we do NOT need to draw many bootstrap samples to evaluate this expression.

Let $\bar B_j = \frac{1}{\ell}\sum_{i=1}^{\ell} X_{(j-1)\ell + i}$ denote the $j$-th block mean. The resampled block means $\bar B_1^*, \ldots, \bar B_b^*$ are iid draws from $\{\bar B_1, \ldots, \bar B_b\}$ with $E^*(\bar B_j^*) = \bar X_T$, so

$$\hat V(\bar X_T) = E^*\left(\frac{1}{b}\sum_{j=1}^{b} \bar B_j^* - \bar X_T\right)^2 = \frac{1}{b}\operatorname{Var}^*\left(\bar B_1^*\right) = \frac{1}{b^2}\sum_{j=1}^{b}\left(\bar B_j - \bar X_T\right)^2$$
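Here is a small self-contained check (an illustration, not part of the slides, using placeholder data) that the closed-form expression above matches a brute-force Monte Carlo average over NBB draws, so no resampling is actually needed for this particular quantity.

```python
import numpy as np

rng = np.random.default_rng(3)
T, ell = 100, 5
b = T // ell
x = rng.standard_normal(T).cumsum() * 0.05 + rng.standard_normal(T)   # placeholder series
block_means = x.reshape(b, ell).mean(axis=1)                           # B-bar_1, ..., B-bar_b
xbar = x.mean()

# Closed form: (1/b^2) * sum_j (B-bar_j - X-bar_T)^2, no bootstrap draws required.
v_closed = np.sum((block_means - xbar) ** 2) / b**2

# Brute force: average (X-bar* - X-bar_T)^2 over many NBB resamples.
B = 20000
xbar_star = block_means[rng.integers(0, b, size=(B, b))].mean(axis=1)
v_mc = np.mean((xbar_star - xbar) ** 2)

print(v_closed, v_mc)    # the two should agree up to Monte Carlo error
```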


The Block Bootstrap III: An Example

Now suppose we wish to estimate the distribution of

$$T_n = \frac{\sqrt{T}\left(\bar X_T - \mu\right)}{\hat\sigma_T},$$

that is,

$$G_n(u) = P(T_n \le u)$$

We then create the bootstrap version of $T_n$:

$$T_n^* = \frac{\sqrt{T}\left(\bar X_T^* - \bar X_T\right)}{\hat\sigma_T^*}$$

and we can find

$$G_n^*(u) = P\left(T_n^* \le u\right)$$


The Block Bootstrap IV: An Example

Here we are back to a situation where we need to draw many bootstrap samples to find $G_n^*(u)$.

A way of looking at this is that we are using Monte Carlo simulations to find our estimators.

Once we have $G_n^*(u)$, we can proceed to find critical values or create confidence intervals using the methods discussed under the iid bootstrap. A sketch of this Monte Carlo step follows.
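A sketch of the Monte Carlo step under the NBB. The studentizing scale $\hat\sigma_T^*$ is not pinned down on the slides; below it is taken to be a block-based long-run standard deviation computed from the resampled block means, which is an illustrative assumption, as are all tuning constants and names.

```python
import numpy as np

def nbb_studentized_distribution(x, ell, B=2000, seed=0):
    """Monte Carlo draws of T*_n = sqrt(T) (X-bar* - X-bar) / sigma-hat* under the NBB,
    from which G*_n(u) and bootstrap critical values can be read off."""
    rng = np.random.default_rng(seed)
    b = len(x) // ell
    x = x[: b * ell]                                 # drop any remainder
    T = b * ell
    block_means = x.reshape(b, ell).mean(axis=1)
    xbar = x.mean()
    t_star = np.empty(B)
    for r in range(B):
        bm_star = block_means[rng.integers(0, b, size=b)]        # resampled block means
        xbar_star = bm_star.mean()
        sigma_star = np.sqrt(ell * np.mean((bm_star - xbar_star) ** 2))  # illustrative sigma-hat*
        t_star[r] = np.sqrt(T) * (xbar_star - xbar) / sigma_star
    return np.sort(t_star)

rng = np.random.default_rng(4)
x = rng.standard_normal(200)                          # placeholder series
t_star = nbb_studentized_distribution(x, ell=10)
crit = np.quantile(np.abs(t_star), 0.95)              # symmetric 95% bootstrap critical value
print(crit)
```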


Different Versions of the Block Bootstrap I: Moving Block Bootstrap

The block bootstrap we have seen is just one version. That one is called the Non-overlapping Block Bootstrap, or NBB.

Next we will describe the Moving Block Bootstrap, or MBB.

Let the data be $X_1, \ldots, X_n$ and consider estimators of the form $\hat\theta = T(F_n)$, where $F_n$ is the empirical distribution function of $X_1, \ldots, X_n$.

Let $\ell$ be an integer such that $\ell \to \infty$ and $\ell/n \to 0$ as $n \to \infty$.

Let $B_i = (X_i, \ldots, X_{i+\ell-1})$ be the block of length $\ell$ starting at observation $i$, $1 \le i \le N = n - \ell + 1$.


Different Versions of the Block Bootstrap II: Moving Block Bootstrap

We are forming $N$ blocks and drawing each with probability $1/N$.

We then randomly sample $k$ blocks from $B_1, \ldots, B_N$, denoted by $B_1^*, \ldots, B_k^*$, and from here obtain data $X_1^*, \ldots, X_m^*$, where $m = k\ell$. A sketch is given below.
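A minimal MBB sketch, with $k$ chosen so that the bootstrap sample is roughly the original length and the concatenation trimmed back to $n$ observations; both conventions are illustrative choices.

```python
import numpy as np

def moving_block_bootstrap(x, ell, rng):
    """One MBB resample: draw k overlapping blocks of length ell, each starting
    index chosen with probability 1/N, N = n - ell + 1, then concatenate."""
    n = len(x)
    N = n - ell + 1
    k = int(np.ceil(n / ell))
    starts = rng.integers(0, N, size=k)                         # block starting positions
    sample = np.concatenate([x[s:s + ell] for s in starts])
    return sample[:n]                                           # trim so m is about n

rng = np.random.default_rng(5)
x = np.arange(20, dtype=float)        # toy data so the resampled blocks are easy to spot
print(moving_block_bootstrap(x, ell=4, rng=rng))
```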


Different Versions of the Block Bootstrap III: Moving Block Bootstrap

The bootstrap version of the estimator is then denoted by:

$$\theta^*_{m,n} = T\left(F^*_{m,n}\right)$$

where $F^*_{m,n}$ is the empirical distribution of $X_1^*, \ldots, X_m^*$.

Note that:

Typically $k$ is picked such that the bootstrap sample is approximately the same size as the original sample.
Each block is picked with probability $1/N$.
If $\ell = 1$ this method reduces to the iid bootstrap.

Why do we eventually get independent bootstrap samples?


Different Versions of the Block Bootstrap IV: Moving Block Bootstrap

For time series we also need more general estimators. Consider for example the estimator of the autocovariance function at lag $k \ge 0$:

$$\hat\gamma_n(k) = \frac{1}{n-k}\sum_{j=1}^{n-k}\left(X_{j+k} - \bar X_{n,k}\right)\left(X_j - \bar X_{n,k}\right)$$

This is not a simple function of $F_n$. Instead we can write it as a function of

$$F_{n,p} = \frac{1}{n-p+1}\sum_{j=1}^{n-p+1} F_{X_j, X_{j+1}, \ldots, X_{j+p-1}}$$

such that

$$\hat\theta = T(F_{n,p})$$

For $\hat\gamma_n(k)$, we'd use $p = k + 1$.


Different Versions of the Block Bootstrap V: Moving Block Bootstrap

Many other estimators of value for time series can be written like this as well.

To get an MBB version of this estimator we (see the sketch after this list):

1. Fix a block size $\ell$ such that $1 < \ell \le n - p$.
2. Define the blocks in terms of the $Y_j = (X_j, \ldots, X_{j+p-1})$ as
$$\tilde B_j = \left(Y_j, \ldots, Y_{j+\ell-1}\right), \quad 1 \le j \le n - p - \ell + 2$$
3. Now sample $k \ge 1$ blocks to generate bootstrap observations
$$Y_1^*, \ldots, Y_\ell^*, Y_{\ell+1}^*, \ldots, Y_{2\ell}^*, \ldots, Y_m^*$$
where $m = \ell \cdot k$.
4. We can now define the bootstrap estimator as
$$\theta^*_{m,n} = T\left(\tilde F^*_{m,n}\right)$$
where $\tilde F^*_{m,n}$ is the empirical distribution function of $Y_1^*, \ldots, Y_m^*$.
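A sketch of this procedure for $\hat\gamma_n(k)$: form the $p$-tuples $Y_j$ with $p = k + 1$, resample MBB blocks of $Y$'s, and recompute the autocovariance from the resampled tuples. Using the overall mean of the tuples for centering is an illustrative simplification of $\bar X_{n,k}$, and all names are placeholders.

```python
import numpy as np

def autocov_from_tuples(tuples):
    """gamma-hat(k) from rows (X_j, ..., X_{j+k}); centers with the overall mean
    of the tuples (an illustrative simplification)."""
    m = tuples.mean()
    return np.mean((tuples[:, 0] - m) * (tuples[:, -1] - m))

def blocks_of_blocks_autocov(x, k, ell, B, rng):
    """MBB ('blocks of blocks') bootstrap replicates of gamma-hat(k)."""
    p = k + 1
    Y = np.lib.stride_tricks.sliding_window_view(x, p)   # Y_j = (X_j, ..., X_{j+p-1})
    n_tuples = len(Y)
    N = n_tuples - ell + 1                                # number of available Y-blocks
    k_blocks = int(np.ceil(n_tuples / ell))               # blocks drawn per resample
    reps = np.empty(B)
    for r in range(B):
        starts = rng.integers(0, N, size=k_blocks)
        Y_star = np.concatenate([Y[s:s + ell] for s in starts])[:n_tuples]
        reps[r] = autocov_from_tuples(Y_star)
    return reps

rng = np.random.default_rng(6)
x = rng.standard_normal(300)                              # placeholder series
reps = blocks_of_blocks_autocov(x, k=2, ell=10, B=500, rng=rng)
print(reps.std())                                         # bootstrap standard error of gamma-hat(2)
```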


Different Versions of the Block Bootstrap VI: Moving Block Bootstrap

Because this involves sampling blocks of $Y$'s, which are actually themselves blocks of $X$'s, this method is also called blocks of blocks.

There are variations on this method.


Comparing MBB and NBB I

There are pros and cons of these two methods.

The MBB has more possible blocks. The NBB has less finite-sample dependence.

Compare a simple estimator calculated under the two different methods.

Consider the sample mean: $\hat\theta_n = \frac{1}{n}\sum_{j=1}^{n} X_j$

Let the MBB estimator be denoted by

$$\theta^*_{m,n} = \frac{1}{m}\sum_{j=1}^{m} X_j^*$$


Comparing MBB and NBB II

And the NBB estimator:

$$\theta^{*(2)}_{m,n} = \frac{1}{m}\sum_{j=1}^{m} X_{2,j}^*$$

where the "2" is used to signify the NBB estimator.

Let us calculate the means of these estimators:

Note that we are calculating the mean when we've carried out ONE draw.

$$E^*\left(\theta^*_{m,n}\right) = E^*\left[\frac{1}{\ell}\sum_{i=1}^{\ell} X_i^*\right] = \frac{1}{N}\sum_{j=1}^{N}\left(\frac{1}{\ell}\sum_{i=1}^{\ell} X_{j+i-1}\right)$$

$$= \frac{1}{N\ell}\left\{\left(X_1 + X_2 + \cdots + X_\ell\right) + \left(X_2 + X_3 + \cdots + X_{\ell+1}\right) + \cdots + \left(X_N + X_{N+1} + \cdots + X_n\right)\right\}$$


Comparing MBB and NBB III

Recall that $N = n - \ell + 1$.

$$E^*\left(\theta^*_{m,n}\right) = \frac{1}{N\ell}\left\{\left(X_1 + X_2 + \cdots + X_\ell\right) + \left(X_2 + X_3 + \cdots + X_{\ell+1}\right) + \cdots + \left(X_N + \cdots + X_n\right)\right\}$$

$$= \frac{1}{N\ell}\left\{ n\ell\,\bar X_n - \sum_{j=1}^{\ell-1}(\ell - j)\left(X_j + X_{n-j+1}\right)\right\}$$

$$= \frac{1}{N}\left\{ n\,\bar X_n - \frac{1}{\ell}\sum_{j=1}^{\ell-1}(\ell - j)\left(X_j + X_{n-j+1}\right)\right\}$$
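A quick numerical check (illustration only, with placeholder data) that the boundary-correction formula above coincides with the direct average of the $N$ overlapping block means.

```python
import numpy as np

rng = np.random.default_rng(7)
n, ell = 60, 7
N = n - ell + 1
x = rng.standard_normal(n)                                # placeholder data

# Direct computation: E*(theta*_{m,n}) is the average of the N overlapping block means.
block_means = np.array([x[j:j + ell].mean() for j in range(N)])
direct = block_means.mean()

# Closed form: (1/N){ n X-bar_n - (1/ell) sum_{j=1}^{ell-1} (ell - j)(X_j + X_{n-j+1}) }.
j = np.arange(1, ell)
correction = np.sum((ell - j) * (x[j - 1] + x[n - j]))     # X_j and X_{n-j+1}, 0-based indexing
closed = (n * x.mean() - correction / ell) / N

print(direct, closed)    # identical up to floating-point rounding
```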


Comparing MBB and NBB IV

Now for the NBB estimator:

$$E^*\left[\theta^{*(2)}_{m,n}\right] = E^*\left[\frac{1}{\ell}\sum_{i=1}^{\ell} X_{2,i}^*\right] = \frac{1}{b}\sum_{j=1}^{b}\left[\frac{1}{\ell}\sum_{i=1}^{\ell} X_{(j-1)\ell+i}\right]$$

$$= \frac{1}{b\ell}\sum_{j=1}^{b}\sum_{i=1}^{\ell} X_{(j-1)\ell+i} = \frac{1}{b\ell}\sum_{i=1}^{b\ell} X_i = \frac{1}{b\ell}\left\{ n\,\bar X_n - \sum_{i=b\ell+1}^{n} X_i\right\}$$

Note that these means are different. Under certain conditions, however, it is possible to show that

$$E\left[\left(E^*\left(\theta^*_{m,n}\right) - E^*\left[\theta^{*(2)}_{m,n}\right]\right)^2\right] = O\!\left(\frac{\ell}{n^2}\right)$$

Therefore the differences between the two are small for large sample sizes.


Overview of other bootstrap methods I

One issue with both the MBB and the NBB is that the first and last observations have a lower probability of appearing in the bootstrap sample.

Methods proposed to fix this:

Circular Block Bootstrap. Arrange the data in a circle and then proceed as in the MBB (a sketch follows this list). It turns out that here the expected value of the bootstrap estimator is the original sample mean.

Stationary Block Bootstrap. Uses stochastic block lengths.

Subsampling. Similar to the MBB, except take only 1 sample. Less computationally demanding, but no automatic second-order accuracy.
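A minimal sketch of the circular idea: wrap the series end-to-start so that every observation belongs to the same number of candidate blocks; the block count and trimming conventions are illustrative.

```python
import numpy as np

def circular_block_bootstrap(x, ell, rng):
    """One CBB resample: blocks are allowed to wrap around from the end of the
    series back to the start ('arrange the data in a circle')."""
    n = len(x)
    x_wrapped = np.concatenate([x, x[:ell - 1]])        # the circle, unrolled
    k = int(np.ceil(n / ell))
    starts = rng.integers(0, n, size=k)                 # every position 1..n can start a block
    sample = np.concatenate([x_wrapped[s:s + ell] for s in starts])
    return sample[:n]

rng = np.random.default_rng(8)
x = rng.standard_normal(50)
x_star = circular_block_bootstrap(x, ell=5, rng=rng)
# Averaged over resamples, the bootstrap mean equals the original sample mean here.
print(x_star.mean(), x.mean())
```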
