The Jackknife
Patrick Breheny
September 6
Patrick Breheny STA 621: Nonparametric Statistics 1/24
-
Introduction
In the last few lectures, we described influence functions as a tool for assessing the standard error of statistical functionals and obtaining nonparametric confidence intervals
Today we will discuss a tool called the jackknife
Like influence functions, the jackknife can be used to estimate standard errors in a nonparametric way
The jackknife can also be used to obtain nonparametric estimates of bias
Although superficially different, we will see that the jackknife is built on essentially the same idea as the influence function, although the jackknife was proposed much earlier (Quenouille, 1949)
-
Jackknife setup and notation
Suppose we have an estimator $\hat\theta$ which can be computed from a sample $\{x_i\}$
Jackknife methods revolve around computing the estimates $\{\hat\theta_{(i)}\}$, where $\hat\theta_{(i)}$ denotes the estimate calculated from the data with the $i$th observation removed (these are sometimes called the leave-one-out estimates)
Finally, let
$$\bar\theta = \frac{1}{n} \sum_{i=1}^{n} \hat\theta_{(i)}$$
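As a sketch of this setup (in Python with NumPy for illustration; the course's own code is R, and the helper name loo_estimates is mine):

```python
import numpy as np

def loo_estimates(x, theta):
    """Leave-one-out estimates: theta applied to the sample with the
    i-th observation removed, for each i."""
    x = np.asarray(x)
    return np.array([theta(np.delete(x, i)) for i in range(len(x))])

x = np.array([2.0, 4.0, 6.0, 8.0])
loo = loo_estimates(x, np.mean)   # the collection of theta_(i)
theta_bar = loo.mean()            # their average, theta-bar
```

For the mean, the average of the leave-one-out means recovers $\bar{x}$ itself.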
-
Jackknife estimation of bias
Note that, if $\hat\theta$ is unbiased,
$$E(\bar\theta) = \frac{1}{n} \sum_{i=1}^{n} E(\hat\theta_{(i)}) = \theta$$
However, suppose that $E(\hat\theta) = \theta + an^{-1} + bn^{-2} + O(n^{-3})$; then
$$E(\bar\theta - \hat\theta) = \frac{a}{n(n-1)} + O(n^{-3})$$
-
Jackknife estimation of bias
Thus,
$$\hat{b}_{jack} = (n-1)(\bar\theta - \hat\theta),$$
the jackknife estimate of bias, is correct up to second order
Furthermore, the bias-corrected jackknife estimate,
$$\hat\theta_{jack} = \hat\theta - \hat{b}_{jack},$$
is an unbiased estimate of $\theta$, again up to second order
-
Example: Plug-in variance
For example, consider the plug-in estimate of the variance: $\hat\sigma^2 = n^{-1} \sum_i (x_i - \bar{x})^2$
The expected value of the jackknife estimate of bias is
$$E(\hat{b}_{jack}) = -\frac{\sigma^2}{n} = \mathrm{Bias}(\hat\sigma^2)$$
Furthermore, it can be shown that the bias-corrected estimate is
$$\hat\theta_{jack} = s^2,$$
the usual unbiased estimate of the variance
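This claim is easy to check numerically. Below is a Python sketch (jackknife_bias is a hypothetical helper, not the homework's R function): bias-correcting the plug-in variance with $\hat{b}_{jack}$ recovers the divide-by-$(n-1)$ estimate $s^2$ exactly.

```python
import numpy as np

def jackknife_bias(x, theta):
    """Jackknife bias estimate: b_jack = (n-1)(theta_bar - theta_hat)."""
    x = np.asarray(x)
    n = len(x)
    theta_hat = theta(x)
    loo = np.array([theta(np.delete(x, i)) for i in range(n)])
    return theta_hat, (n - 1) * (loo.mean() - theta_hat)

plugin_var = lambda x: np.mean((x - x.mean())**2)   # divides by n (biased)

x = np.array([1.0, 3.0, 4.0, 7.0, 10.0])
theta_hat, b_jack = jackknife_bias(x, plugin_var)
theta_jack = theta_hat - b_jack   # equals s^2, the divide-by-(n-1) variance
```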
-
Pseudo-values
Another way to think about the jackknife is in terms of the pseudo-values:
$$\tilde\theta_i = n\hat\theta - (n-1)\hat\theta_{(i)}$$
Note that
$$\frac{1}{n} \sum_i \tilde\theta_i = n\hat\theta - (n-1)\bar\theta = \hat\theta - \hat{b}_{jack},$$
which is the bias-corrected estimate $\hat\theta_{jack}$
The idea behind pseudo-values is that they allow us to think of the bias-corrected estimate as simply the mean of $n$ independent data values
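The identity above can be verified numerically; in this Python sketch (pseudo_values is a hypothetical helper), the mean of the pseudo-values reproduces the bias-corrected estimate, here $s^2$ for the plug-in variance.

```python
import numpy as np

def pseudo_values(x, theta):
    """Pseudo-values: tilde_theta_i = n*theta_hat - (n-1)*theta_(i)."""
    x = np.asarray(x)
    n = len(x)
    theta_hat = theta(x)
    loo = np.array([theta(np.delete(x, i)) for i in range(n)])
    return n * theta_hat - (n - 1) * loo

plugin_var = lambda x: np.mean((x - x.mean())**2)   # biased plug-in variance

x = np.array([1.0, 3.0, 4.0, 7.0, 10.0])
pv = pseudo_values(x, plugin_var)
theta_jack = pv.mean()   # the bias-corrected estimate (= s^2 here)
```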
-
Properties of the pseudo-values
The pseudo-values $\{\tilde\theta_i\}$ are not, in general, independent, although note that for the special case of a linear statistic $\hat\theta = n^{-1} \sum_i a(x_i)$, we have $\tilde\theta_i = a(x_i)$; i.e., for the mean, $\tilde\theta_i = x_i$
A reasonable idea, therefore, is to treat the $\tilde\theta_i$ as approximations to iid observations and approach inference for $\hat\theta_{jack}$ as we would for the sample mean
Thus, letting $\tilde{s}^2$ denote the sample variance of the pseudo-values, consider
$$v_{jack} = \frac{\tilde{s}^2}{n}$$
as an estimate of $V(\hat\theta)$
-
Example: Mean
As a somewhat trivial example, suppose $\hat\theta = \bar{x}$
$\hat{b}_{jack} = 0$
$v_{jack} = s^2/n$, where $s^2$ is the usual, unbiased estimate of the variance
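Both facts for the mean can be checked directly (Python sketch; the jackknife helper here is my own illustration, separate from the homework's R function):

```python
import numpy as np

def jackknife(x, theta):
    """Jackknife bias and variance estimates via pseudo-values."""
    x = np.asarray(x)
    n = len(x)
    theta_hat = theta(x)
    loo = np.array([theta(np.delete(x, i)) for i in range(n)])
    b_jack = (n - 1) * (loo.mean() - theta_hat)
    pv = n * theta_hat - (n - 1) * loo
    v_jack = pv.var(ddof=1) / n   # sample variance of pseudo-values over n
    return b_jack, v_jack

x = np.array([2.0, 5.0, 5.0, 8.0, 11.0, 14.0])
b_jack, v_jack = jackknife(x, np.mean)
# For the mean: the pseudo-values are the x_i themselves,
# so b_jack = 0 and v_jack = s^2/n
```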
-
Jackknife confidence intervals?
The pseudo-value formulation of the jackknife suggests the following approach to constructing confidence intervals:
$$\hat\theta_{jack} \pm t_{1-\alpha/2;\,n-1} \, \mathrm{SE}_{jack},$$
where $\mathrm{SE}_{jack} = \sqrt{v_{jack}}$
These intervals turn out to be similar to those produced by the functional delta method, for reasons that will be discussed soon
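A minimal sketch of this interval in Python (the critical value is passed in by the caller to keep the example dependency-free; jackknife_ci is a hypothetical helper):

```python
import numpy as np

def jackknife_ci(x, theta, t_crit):
    """Pseudo-value t-interval: theta_jack +/- t_crit * SE_jack, where
    t_crit is the t_{1-alpha/2; n-1} quantile (e.g., from R's qt or
    scipy.stats.t.ppf)."""
    x = np.asarray(x)
    n = len(x)
    theta_hat = theta(x)
    loo = np.array([theta(np.delete(x, i)) for i in range(n)])
    pv = n * theta_hat - (n - 1) * loo
    theta_jack = pv.mean()
    se_jack = pv.std(ddof=1) / np.sqrt(n)
    return theta_jack - t_crit * se_jack, theta_jack + t_crit * se_jack

x = np.array([2.0, 5.0, 5.0, 8.0, 11.0, 14.0])
lo, hi = jackknife_ci(x, np.mean, 2.571)   # t_{.975; 5} is about 2.571
# For the mean this reproduces the classical t-interval for x-bar
```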
-
Homework
Homework: Write an R function called jackknife which implements the jackknife. The function should accept two arguments: x (the data) and theta (a function which, when applied to x, produces the estimate). The function should return a named list with the following components:
bias: the jackknife estimate of bias
se: the jackknife estimate of standard error
values: the leave-one-out estimates $\{\hat\theta_{(i)}\}$
Please submit the function via Dropbox and name your file Breheny-jack.R, with your last name replacing Breheny
-
Is $v_{jack}$ a good estimator?
Obviously, $v_{jack}$ is a good estimator of $V(\bar{x})$, where it is equivalent to the usual estimator
The same logic extends to any linear statistic
Furthermore, if $g$ is a function continuously differentiable at $\mu$, it can be shown that $v_{jack}$ is a consistent estimator of the variance of $g(\bar{x})$:
$$\frac{v_{jack}}{V\{g(\bar{x})\}} \stackrel{P}{\to} 1$$
-
Is $v_{jack}$ a good estimator? (cont'd)
However, there are also cases where $v_{jack}$ is not a good estimator of the variance of an estimate
In particular, $v_{jack}$ can be shown to perform poorly when the estimator is not a smooth function of the data
A simple example of a non-smooth estimator is the median
-
Jackknife applied to the median
Suppose we observe the sample $\{1, 2, \ldots, 9, 10\}$
What will our leave-one-out estimates look like?
We will obtain five 5s and five 6s
There is nothing special about these particular numbers: for any data set with an even number of observations, we will always obtain two unique values for $\hat\theta_{(i)}$, each with $n/2$ instances
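This is easy to see directly (a short Python check):

```python
import numpy as np

x = np.arange(1, 11)   # the sample {1, 2, ..., 10}
# Leave-one-out medians: delete observation i, take the median of the rest
loo_medians = np.array([np.median(np.delete(x, i)) for i in range(10)])
# Deleting any of 1..5 leaves a median of 6; deleting any of 6..10 leaves
# a median of 5, so only two distinct values appear, each n/2 = 5 times
```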
-
Inconsistency
This doesn't seem like a good way to estimate the variance of the median, and it isn't
It can be shown that the jackknife variance estimate of a sample quantile is inconsistent for all $F$ and all $p$
In particular, for the median,
$$\frac{v_{jack}}{V(\hat\theta)} \stackrel{d}{\to} \left( \frac{\chi^2_2}{2} \right)^2$$
-
Jackknife and influence function
There is a close connection between the jackknife and influence functions
Viewing the jackknife as a plug-in estimator, it calculates $n$ estimates based on slightly altered versions of the empirical distribution function and compares these altered estimates to the plug-in estimate in order to assess the variability of the estimate
What CDF is the jackknife using? One that places mass
$$\left( \frac{1}{n-1}, \ldots, 0, \ldots, \frac{1}{n-1} \right)$$
on the observations: $1/(n-1)$ on each one except the deleted observation, which gets mass 0
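A small Python check that this reweighted CDF reproduces a leave-one-out estimate (shown here for the mean; the sample and index are arbitrary):

```python
import numpy as np

x = np.array([1.0, 3.0, 4.0, 7.0, 10.0])
n = len(x)
i = 2                          # index of the deleted observation

w = np.full(n, 1.0 / (n - 1))  # mass 1/(n-1) on every observation...
w[i] = 0.0                     # ...except the deleted one, which gets 0

weighted_mean = np.sum(w * x)        # mean under the altered CDF
loo_mean = np.mean(np.delete(x, i))  # leave-one-out mean
# The two agree: the altered CDF is exactly "delete point i"
```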
-
Jackknife and influence function (cont'd)
This is very similar to the idea behind the influence function, with
$$1 - \epsilon = \frac{n}{n-1}, \qquad \text{i.e.,} \qquad \epsilon = -\frac{1}{n-1}$$
In a sense, then, the jackknife is a numerical approximation to the functional delta method
Indeed, an alternative name for the functional delta method is the infinitesimal jackknife
-
The positive jackknife
There is an important difference, however, between the jackknife and the functional delta method: the delta method adds point mass to observation $x_i$, while the jackknife takes mass away
Another take on the jackknife, then, is to compute $n$ estimates $\{\hat\theta^{(i)}\}$ by adding an observation at $x_i$ instead of taking one away (i.e., $\epsilon = 1/(n+1)$)
This method is called the positive jackknife; however, it is not commonly used
-
The delete-d jackknife
Another variation on the jackknife that has been proposed is called the delete-$d$ jackknife
As the name suggests, instead of leaving out one observation when calculating the collection $\{\hat\theta_{(s)}\}$, the delete-$d$ jackknife leaves out $d$ observations
This approach has merit: in particular, it can be shown that if $d$ is appropriately chosen, then the delete-$d$ jackknife estimate of variance is consistent for the median
However, it has the drawback that instead of calculating $n$ leave-one-out estimates, we now have to calculate $\binom{n}{d}$ leave-$d$-out estimates, a much larger number, often bordering on the computationally infeasible
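To get a feel for how fast $\binom{n}{d}$ grows, consider a sample of size 30 (Python sketch using the standard library's math.comb):

```python
from math import comb

n = 30
# Number of leave-d-out estimates required for a few choices of d
counts = {d: comb(n, d) for d in (1, 2, 5, 10)}
# d = 1 needs only 30 fits, but d = 10 needs comb(30, 10) = 30,045,015
```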
-
Homework
Homework: What would the collection of weights $\{w_i\}_{i=1}^n$ look like for the delete-$d$ jackknife?
-
Homework
Homework: The standardized test used by law schools is called the Law School Admission Test (LSAT), and it has a reasonably high correlation with undergraduate GPA. The course website contains data on the average LSAT score and average undergraduate GPA for the 1973 incoming class of 15 law schools.
(a) Use the jackknife to obtain an estimate of the bias and standard error of the correlation coefficient between GPA and LSAT scores. Comment on whether the estimate is biased upward or downward.
(b) If $x$ and $y$ are drawn from a bivariate normal distribution, then $nV(\hat\rho) \stackrel{P}{\to} (1 - \rho^2)^2$. Use this to estimate the standard error of $\hat\rho$.
-
Homework (cont'd)
(c) On page 21 of our textbook, the author gives the influence function for the correlation coefficient:
$$L(x, y) = \tilde{x}\tilde{y} - \frac{\rho}{2}(\tilde{x}^2 + \tilde{y}^2),$$
where
$$\tilde{x} = \frac{x - \mu_x}{\sigma_x}$$
and $\tilde{y}$ is defined similarly. Use this to estimate the standard error of $\hat\rho$; compare the three estimates (a)-(c).
-
Homework (cont'd)
(d) For each data point $(x_i, y_i)$, make a plot of $\hat\rho$ vs. the mass at point $i$, as the point mass at $i$ varies from 0 to 1/3 (and the rest of the mass is spread evenly over the remaining observations).
(e) Comment on why some plots slope upwards and others slope downwards.
(f) Extra credit: Comment on how the shape of the curve relates to the comparison between the delta method and jackknife estimates of the variance
-
Hints
The R function cov.wt can be used to calculate weighted covariance and correlation matrices, taking an optional argument that consists of a vector of weights for each observation
Please do not turn in 15 pages of plots; use mfrow to make a grid of plots on one page
You will likely have to reduce the margins of your plots to eliminate whitespace; you are free to design your plots however you wish, but I will pass on the suggestion par(mfrow=c(5,3), mar=c(2,4,2,1))