The Jackknife: Patrick Breheny

The document discusses the jackknife, a statistical tool for estimating standard errors and bias in a nonparametric way. It introduces the jackknife setup using leave-one-out estimates and shows how the jackknife can be used to estimate bias. Pseudo-values are presented as another way to conceptualize the jackknife. Examples are provided for how the jackknife applies to estimating the variance and median. The relationship between the jackknife and influence functions is explored.

The Jackknife

Patrick Breheny

September 9

Patrick Breheny STA 621: Nonparametric Statistics



Introduction

The last few lectures, we described influence functions as a tool for assessing the standard error of statistical functionals and obtaining nonparametric confidence intervals
Today we will discuss a tool called the jackknife
Like influence functions, the jackknife can be used to estimate standard errors in a nonparametric way
The jackknife can also be used to obtain nonparametric estimates of bias
Although superficially different, we will see that the jackknife is built on essentially the same idea as the influence function, although the jackknife was proposed much earlier (1949)


Jackknife setup and notation

Suppose we have an estimator θ̂ which can be computed from a sample {xᵢ}
Jackknife methods revolve around computing the estimates {θ̂(i)}, where θ̂(i) denotes the estimate calculated from the data with the ith observation removed (these are sometimes called the "leave one out" estimates)
Finally, let

θ̄ = (1/n) ∑_{i=1}^n θ̂(i)


Jackknife estimation of bias

Note that, if θ̂ is unbiased,

E(θ̄) = (1/n) ∑_{i=1}^n E(θ̂(i)) = θ,

so θ̄ is unbiased as well
However, suppose that E(θ̂) = θ + a n⁻¹ + b n⁻² + O(n⁻³)
Then

E(θ̄ − θ̂) = a/(n(n−1)) + (2n−1)b/(n²(n−1)²) + O(n⁻³)


Jackknife estimation of bias

Thus,

bjack = (n − 1)(θ̄ − θ̂),

the jackknife estimate of bias, is correct up to second order
Furthermore, the bias-corrected jackknife estimate,

θ̂jack = θ̂ − bjack,

is an unbiased estimate of θ, again up to second order
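The bias estimate above is straightforward to compute. Here is a minimal Python sketch (the course's code is in R; the function name `jackknife_bias` is my own):

```python
import numpy as np

def jackknife_bias(x, theta):
    """Jackknife estimate of bias and the bias-corrected estimate.

    theta is any function mapping a sample to a scalar estimate."""
    n = len(x)
    theta_hat = theta(x)
    # Leave-one-out estimates theta_hat_(i)
    loo = np.array([theta(np.delete(x, i)) for i in range(n)])
    theta_bar = loo.mean()
    b_jack = (n - 1) * (theta_bar - theta_hat)  # jackknife estimate of bias
    return b_jack, theta_hat - b_jack           # bias-corrected estimate
```

For an unbiased estimator such as the mean, the bias estimate comes out to exactly zero.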


Example: Plug-in variance

For example, consider the plug-in estimate of variance:

θ̂ = n⁻¹ ∑ᵢ (xᵢ − x̄)²

The expected value of the jackknife estimate of bias is

E(bjack) = −θ/n = Bias(θ̂)

Furthermore, it can be shown that the bias-corrected estimate is

θ̂jack = s²,

the usual unbiased estimate of the variance
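This identity can be checked numerically; a Python sketch (illustrative only; variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=20)
n = len(x)

def plugin_var(s):
    # Biased plug-in estimate: divides by n rather than n - 1
    return np.mean((s - s.mean()) ** 2)

theta_hat = plugin_var(x)
loo = np.array([plugin_var(np.delete(x, i)) for i in range(n)])
b_jack = (n - 1) * (loo.mean() - theta_hat)  # negative: plug-in variance is biased downward
theta_jack = theta_hat - b_jack              # bias-corrected jackknife estimate
```

Comparing `theta_jack` to `x.var(ddof=1)` confirms that the correction recovers the usual unbiased s² exactly, not just approximately.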


Pseudo-values

Another way to think about the jackknife is in terms of the pseudo-values:

θ̃ᵢ = nθ̂ − (n − 1)θ̂(i)

Note that, for the special case of a linear statistic θ̂ = μ + n⁻¹ ∑ᵢ a(xᵢ), we have θ̃ᵢ = μ + a(xᵢ) (i.e., for the mean, θ̃ᵢ = xᵢ)
The idea behind pseudo-values is that they are supposed to act like a sample of n independent data values – or at least, a linear approximation to that ideal
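The mean case is easy to verify directly; a Python sketch (function name is my own):

```python
import numpy as np

def pseudo_values(x, theta):
    # theta_tilde_i = n * theta_hat - (n - 1) * theta_hat_(i)
    n = len(x)
    theta_hat = theta(x)
    loo = np.array([theta(np.delete(x, i)) for i in range(n)])
    return n * theta_hat - (n - 1) * loo
```

For θ̂ = x̄, the pseudo-values reproduce the data themselves: `pseudo_values(x, np.mean)` equals `x`.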


Properties of the pseudo-values

Note that

(1/n) ∑ᵢ θ̃ᵢ = nθ̂ − (n − 1)θ̄ = θ̂ − bjack,

the bias-corrected estimate
Furthermore, letting s̃² denote the sample variance of the pseudo-values,

vjack = s̃²/n

is, in certain circumstances, a good estimate of V(θ̂)
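A sketch of the variance estimate in Python (illustrative; names are mine):

```python
import numpy as np

def v_jack(x, theta):
    """vjack: sample variance of the pseudo-values, divided by n."""
    n = len(x)
    loo = np.array([theta(np.delete(x, i)) for i in range(n)])
    pv = n * theta(x) - (n - 1) * loo   # pseudo-values
    return pv.var(ddof=1) / n
```

Since the pseudo-values of the mean are the data themselves, `v_jack(x, np.mean)` equals the familiar s²/n.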


Example: Mean

As a somewhat trivial example, suppose θ̂ = x̄; then
bjack = 0
vjack = s²/n, where s² is the usual, unbiased estimate of the variance


Jackknife confidence intervals?

The pseudo-value statement of the jackknife suggests the


following approach to constructing confidence intervals:

Mean(θ̃i ) ± t1−α/2;n−1 SEjack ,

These intervals turn out to be relatively similar to those


produced by the functional delta method, for reasons that will
be discussed soon
They are not, however, particularly commonly used


Homework

Homework: Write an R function called jackknife which implements the jackknife. The function should accept two arguments: x (the data) and theta (a function which, when applied to x, produces the estimate). The function should return a named list with the following components:
bias – the jackknife estimate of bias
se – the jackknife estimate of standard error
values – the leave-one-out estimates {θ̂(i)}
Please turn in the function electronically so I can verify that it works correctly.
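The assignment itself asks for R; purely as an illustration of the interface, a Python analogue might look like this (assuming the standard jackknife SE formula, equivalent to vjack above):

```python
import numpy as np

def jackknife(x, theta):
    """Python analogue of the requested R interface (names mirror the spec)."""
    n = len(x)
    # Leave-one-out estimates theta_hat_(i)
    values = np.array([theta(np.delete(x, i)) for i in range(n)])
    bias = (n - 1) * (values.mean() - theta(x))
    # SE = sqrt((n-1)/n * sum((theta_hat_(i) - theta_bar)^2))
    se = np.sqrt((n - 1) / n * np.sum((values - values.mean()) ** 2))
    return {"bias": bias, "se": se, "values": values}
```

For the mean, this reproduces bias 0 and SE = s/√n.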


Is vjack a good estimator?

Obviously, vjack is a good estimator of V(x̄), where it is equivalent to the usual estimator
The same logic extends to any linear statistic
Furthermore, if g is a function continuously differentiable at μ, it can be shown that vjack is a consistent estimator of V{g(x̄)}:

vjack / V{g(x̄)} →P 1


Is vjack a good estimator? (cont’d)

However, there are also cases where vjack is not a good estimator of the variance of an estimate
In particular, vjack can be shown to perform poorly when the estimator is not a smooth function of the data
A simple example of a non-smooth estimator is the median


Jackknife applied to the median

Suppose we observe the sample {1, 2, . . . , 9, 10}
What will our leave-one-out estimates look like?
We will obtain five 5's and five 6's
There is nothing special about these particular numbers: for any data set with an even number of observations, we will always obtain two unique values for θ̂(i), each with n/2 instances
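A quick numerical check of this claim (Python for illustration):

```python
import numpy as np

x = np.arange(1, 11)   # the sample {1, 2, ..., 10}
# Leave-one-out medians: deleting any of 1..5 gives median 6;
# deleting any of 6..10 gives median 5
loo_medians = [np.median(np.delete(x, i)) for i in range(len(x))]
print(sorted(loo_medians))   # five 5.0's followed by five 6.0's
```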


Inconsistency

This doesn’t seem like a good way to estimate the variance of


the median, and it isn’t
It can be shown that the jackknife variance estimate is
inconsistent for all F and all p
In particular, for the median,
2
χ22

vjack d
−→
V(θ̂) 2


Jackknife and influence function

There is a close connection between the jackknife and influence functions
Viewing the jackknife as a plug-in estimator, it calculates n estimates based on a slightly altered version of the empirical distribution function and compares these altered estimates to the plug-in estimate in order to assess variability of the estimate
What CDF is the jackknife using? One that places mass

(1/(n−1), · · · , 0, · · · , 1/(n−1))

on the observations – i.e., mass 1/(n−1) on every observation except xᵢ, which receives mass 0


Jackknife and influence function (cont’d)

This is very similar to the idea behind the influence function, with

1 − ε = n/(n − 1)
⟹ ε = −1/(n − 1)

In a sense, then, the jackknife is a numerical approximation to the functional delta method
Indeed, an alternative name for the functional delta method is the infinitesimal jackknife
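The value of ε can be checked with exact arithmetic: mixing (1 − ε) of the empirical CDF with ε of a point mass at xᵢ, with ε = −1/(n − 1), reproduces the jackknife weights above (a sketch; n = 10 is arbitrary):

```python
from fractions import Fraction

n = 10
eps = Fraction(-1, n - 1)       # epsilon = -1/(n-1)
# Under (1 - eps) * F_hat_n + eps * delta_{x_i}:
w_other = (1 - eps) / n          # every other observation's weight
w_i = (1 - eps) / n + eps        # x_i's weight, including the (negative) point mass
print(w_other, w_i)              # 1/9 0, i.e., 1/(n-1) and 0
```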


The positive jackknife

There is an important difference, however, between the jackknife and the functional delta method: the delta method adds point mass to observation xᵢ, while the jackknife takes mass away
Another take on the jackknife, then, is to compute n estimates {θ̂(i)} by adding an observation at xᵢ instead of taking one away (i.e., ε = 1/(n + 1))
This method is called the positive jackknife; however, it is not commonly used


The delete-d jackknife

Another variation on the jackknife that has been proposed is called the delete-d jackknife
As the name suggests, instead of leaving out one observation when calculating the collection {θ̂(s)}, the delete-d jackknife leaves out d observations
This approach has merit: in particular, it can be shown that if d is appropriately chosen, then the delete-d jackknife estimate of variance is consistent for the median
However, it has the drawback that instead of calculating n leave-one-out estimates, we now have to calculate (n choose d) leave-d-out estimates – a much larger number, often bordering on the computationally infeasible
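The combinatorial growth is easy to see (Python for illustration; n = 100 and d = 10 are arbitrary):

```python
from math import comb

# Number of leave-d-out estimates for a sample of size n
n = 100
print(comb(n, 1))    # 100 leave-one-out estimates
print(comb(n, 10))   # roughly 1.7e13 leave-10-out estimates: infeasible to enumerate
```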


Homework

The standardized test used by law schools is called the Law School
Admission Test (LSAT), and it has a reasonably high correlation
with undergraduate GPA. The course website contains data on the
average LSAT score and average undergraduate GPA for the 1973
incoming class of 15 law schools.
(a) Use the jackknife to obtain an estimate of the bias and
standard error of the correlation coefficient between GPA and
LSAT scores. Comment on whether the estimate is biased
upward or downward.
(b) If x and y are drawn from a bivariate normal distribution, then nV(ρ̂) → (1 − ρ²)². Use this to estimate the standard error of ρ̂.


Homework (cont’d)

(c) On page 21 of our textbook, the author gives the influence function for the correlation coefficient:

L(x, y) = x̃ỹ − (θ/2)(x̃² + ỹ²),

where

x̃ = (x − μx)/σx

and ỹ is defined similarly. Use this to estimate the standard error of ρ̂; compare the three estimates (a)-(c).


Homework (cont’d)

(d) For each data point (xi , yi ), make a plot of ρ̂ vs. the mass at
point i, as the point mass at i varies from 0 to 1/3 (and the
rest of the mass is spread evenly on the rest of the
observations).
(e) Comment on why some plots slope upwards and others slope
downwards.
(f) Extra credit: Comment on how the shape of the curve relates to the comparison between the delta method and jackknife estimates of the variance


Hints

The R function cov.wt can be used to calculate weighted correlation coefficients, taking an optional argument that consists of a vector of weights for each observation
Please do not turn in 15 pages of plots – use mfrow to make a grid of plots on one page
You will likely have to reduce the margins of your plots to eliminate whitespace; you are free to design your plot how you wish, but I will pass on the suggestion
par(mfrow=c(5,3), mar=c(2,4,2,1))
