Lec 6
Reading: Chapter 5
Rajan Patel
Cross-validation vs. the Bootstrap
Standard errors in linear regression
Standard error: SD of an estimate from a sample of size n.
Classical way to compute Standard Errors
Limitations of the classical approach
Example. Investing in two assets
Suppose that X and Y are the returns of two assets.
[Figure: two scatter plots of Y against X.]
Example. Investing in two assets
$$\hat\alpha = \frac{\hat\sigma_Y^2 - \widehat{\mathrm{Cov}}(X, Y)}{\hat\sigma_X^2 + \hat\sigma_Y^2 - 2\,\widehat{\mathrm{Cov}}(X, Y)}.$$
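A minimal sketch of how this estimate could be computed from observed returns. It assumes, as in the usual version of this example, that α is the fraction invested in X (with 1 − α in Y) chosen to minimize the variance of the combined return. The function names and the return values below are hypothetical, made up for illustration.

```python
from statistics import mean, variance

def sample_cov(xs, ys):
    """Unbiased sample covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def alpha_hat(xs, ys):
    """Plug-in estimate of alpha from sample variances and covariance."""
    var_x, var_y = variance(xs), variance(ys)
    cov_xy = sample_cov(xs, ys)
    return (var_y - cov_xy) / (var_x + var_y - 2 * cov_xy)

# Hypothetical returns, not data from the lecture.
x = [0.5, -0.2, 1.1, 0.3, -0.7]
y = [0.1, 0.4, 0.9, -0.3, -0.5]
print(alpha_hat(x, y))
```

Swapping the roles of X and Y gives 1 − α̂, which is a quick sanity check on any implementation.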
Resampling the data from the true distribution
[Figure: four scatter plots of Y against X, one per resampling.]
Computing the standard error of α̂
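For each resampling of the data,
$$(x_1^{(1)}, \dots, x_n^{(1)}), \quad (x_1^{(2)}, \dots, x_n^{(2)}), \quad \dots$$
we can compute a value of the estimate $\hat\alpha^{(1)}, \hat\alpha^{(2)}, \dots$.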
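As a rough illustration of this idea, the sketch below draws many datasets from a known distribution, computes α̂ on each, and takes the standard deviation of those values as the standard error. The true distribution used here is an assumption (Var(X) = 1, Var(Y) = 1.25, Cov(X, Y) = 0.5, so the true α is 0.6); the notes do not specify one. The sample size, number of resamplings, and function names are also illustrative choices.

```python
import random
from statistics import mean, stdev, variance

def alpha_hat(xs, ys):
    """Plug-in estimate: (var_y - cov) / (var_x + var_y - 2 * cov)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return (variance(ys) - cov) / (variance(xs) + variance(ys) - 2 * cov)

def draw_from_truth(n, rng):
    """Draw n pairs from an ASSUMED true distribution (not from the notes):
    Var(X) = 1, Var(Y) = 1.25, Cov(X, Y) = 0.5, so the true alpha is 0.6."""
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(z1)
        ys.append(0.5 * z1 + z2)
    return xs, ys

rng = random.Random(0)          # fixed seed so the run is reproducible
n, n_resamples = 100, 1000      # illustrative sizes
estimates = [alpha_hat(*draw_from_truth(n, rng)) for _ in range(n_resamples)]

# The SD of the estimates across resamplings approximates the SE of alpha-hat.
se_alpha = stdev(estimates)
print(f"mean of estimates: {mean(estimates):.3f}, approximate SE: {se_alpha:.3f}")
```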
In reality, we only have n samples
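[Figure: scatter plots of the observed data, Y against X.]
Use the empirical distribution, which puts mass 1/n on each observed pair:
$$\hat P(X, Y) = \frac{1}{n} \sum_{i=1}^{n} \delta_{(x_i, y_i)}.$$
Equivalently, resample the data by drawing n samples with replacement from the actual observations.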
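Drawing n observations with replacement can be sketched as follows. The observed data here are synthetic stand-ins generated from an assumed distribution, since the lecture's data are not reproduced in these notes; `bootstrap_se`, the seed, and the sizes are hypothetical choices.

```python
import random
from statistics import mean, stdev, variance

def alpha_hat(xs, ys):
    """Plug-in estimate: (var_y - cov) / (var_x + var_y - 2 * cov)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return (variance(ys) - cov) / (variance(xs) + variance(ys) - 2 * cov)

def bootstrap_se(xs, ys, n_boot, rng):
    """Resample (x_i, y_i) pairs with replacement n_boot times and return
    the SD of the resulting estimates together with the estimates."""
    n = len(xs)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # n draws with replacement
        estimates.append(alpha_hat([xs[i] for i in idx], [ys[i] for i in idx]))
    return stdev(estimates), estimates

rng = random.Random(1)
# Synthetic stand-in for the n observed pairs (an assumption for this sketch).
xs_obs = [rng.gauss(0.0, 1.0) for _ in range(100)]
ys_obs = [0.5 * x + rng.gauss(0.0, 1.0) for x in xs_obs]

se_boot, boot_estimates = bootstrap_se(xs_obs, ys_obs, n_boot=1000, rng=rng)
print(f"alpha-hat: {alpha_hat(xs_obs, ys_obs):.3f}, bootstrap SE: {se_boot:.3f}")
```

Pairs are resampled together (the same index is used for x and y) so that the dependence between the two returns is preserved in each bootstrap dataset.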
A schematic of the Bootstrap
Original Data (Z)
Obs  X    Y
1    4.3  2.4
2    2.1  1.1
3    5.3  2.8

Bootstrap sample Z*1, giving α̂*1
Obs  X    Y
3    5.3  2.8
1    4.3  2.4
3    5.3  2.8

Bootstrap sample Z*2, giving α̂*2
Obs  X    Y
2    2.1  1.1
3    5.3  2.8
1    4.3  2.4

...

Bootstrap sample Z*B, giving α̂*B
Obs  X    Y
2    2.1  1.1
2    2.1  1.1
1    4.3  2.4
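The schematic can be traced in a few lines of code: each bootstrap sample is a list of observation numbers drawn from the original three rows, and the estimate is recomputed on each. The sketch assumes α̂ is the plug-in ratio (var_Y − cov) / (var_X + var_Y − 2 cov); the dictionary names are hypothetical. With only three observations the values are purely mechanical illustrations, not meaningful estimates.

```python
from statistics import mean, variance

def alpha_hat(xs, ys):
    """Plug-in estimate: (var_y - cov) / (var_x + var_y - 2 * cov)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return (variance(ys) - cov) / (variance(xs) + variance(ys) - 2 * cov)

# Original data Z from the schematic: observation number -> (X, Y).
original = {1: (4.3, 2.4), 2: (2.1, 1.1), 3: (5.3, 2.8)}

# Observation numbers drawn (with replacement) in each bootstrap sample.
resamples = {"Z*1": [3, 1, 3], "Z*2": [2, 3, 1], "Z*B": [2, 2, 1]}

alpha_star = {}
for name, obs_ids in resamples.items():
    xs = [original[i][0] for i in obs_ids]
    ys = [original[i][1] for i in obs_ids]
    alpha_star[name] = alpha_hat(xs, ys)
    print(name, obs_ids, round(alpha_star[name], 3))
```

A resample that repeated a single observation three times would have zero variance and leave the ratio undefined; none of the three samples in the schematic has that problem, but a general implementation for very small n would need to handle it.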
Comparing Bootstrap resamplings
to resamplings from the true distribution
[Figure: histograms of the estimates of α (horizontal axis from 0.3 to 0.9, counts up to 200) and boxplots of α for the "True" and "Bootstrap" resamplings.]