Martingale Limit Theory
and
Stochastic Regression Theory

Ching-Zong Wei
Chapter 1
Example 1.1 Let $y_i = a y_{i-1} + \varepsilon_i$, where the $\varepsilon_i$ are i.i.d. with $E(\varepsilon_i) = 0$, $\operatorname{Var}(\varepsilon_i) = \sigma^2$, and we estimate $a$ by least squares:
$$\hat a = \frac{\sum_{i=1}^n y_{i-1} y_i}{\sum_{i=1}^n y_{i-1}^2}, \qquad \hat a - a = \frac{\sum_{i=1}^n y_{i-1}\varepsilon_i}{\sum_{i=1}^n y_{i-1}^2}.$$
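Example 1.1 is easy to check numerically. A minimal sketch (model and estimator as above; the values $a=0.5$, $\sigma=1$ and the sample size are illustrative choices, not from the text):

```python
import random

random.seed(0)

def simulate_ar1(a, sigma, n):
    """Generate y_0 = 0, y_i = a*y_{i-1} + eps_i with eps_i i.i.d. N(0, sigma^2)."""
    y = [0.0]
    for _ in range(n):
        y.append(a * y[-1] + random.gauss(0.0, sigma))
    return y

def ls_estimate(y):
    """Least squares: a_hat = sum y_{i-1}*y_i / sum y_{i-1}^2."""
    num = sum(y[i - 1] * y[i] for i in range(1, len(y)))
    den = sum(y[i - 1] ** 2 for i in range(1, len(y)))
    return num / den

y = simulate_ar1(a=0.5, sigma=1.0, n=20000)
a_hat = ls_estimate(y)
print(a_hat)
```

The decomposition $\hat a - a = \sum y_{i-1}\varepsilon_i / \sum y_{i-1}^2$ makes consistency plausible: the numerator is a martingale.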
$$L_n(\theta) = f_\theta(X_1,\dots,X_n) = f_\theta(X_n \mid X_1,\dots,X_{n-1}) \cdot f_\theta(X_1,\dots,X_{n-1}) = \prod_{i=2}^n f_\theta(X_i \mid X_1,\dots,X_{i-1}) \cdot f_\theta(X_1);$$
then with $R_n(\theta) = \dfrac{L_n(\theta)}{L_n(\theta_0)}$, $R_n(\theta)$ is a martingale.
For example, if $X_i = \theta u_i + \varepsilon_i$, where $u_i$ is constant and $\{\varepsilon_i\}$ is i.i.d. $N(0,1)$, then
$$f_\theta(x_1,\dots,x_n) = \Big(\frac{1}{\sqrt{2\pi}}\Big)^n e^{-\frac12\sum_{i=1}^n (x_i-\theta u_i)^2},$$
$$\frac{f_\theta(x_1,\dots,x_n)}{f_{\theta_0}(x_1,\dots,x_n)} = e^{-\sum_{i=1}^n \frac{(x_i-\theta u_i)^2}{2} + \sum_{i=1}^n \frac{(x_i-\theta_0 u_i)^2}{2}} = e^{(\theta-\theta_0)\sum_{i=1}^n u_i x_i - \frac{\theta^2-\theta_0^2}{2}\sum_{i=1}^n u_i^2}.$$
Let
$$u_i(\theta) = \frac{d\log f_\theta(X_i \mid X_1,\dots,X_{i-1})}{d\theta}, \qquad V_i(\theta) = \frac{d u_i(\theta)}{d\theta} = \frac{d^2\log f_\theta(X_i \mid X_1,\dots,X_{i-1})}{d\theta^2};$$
since
$$E_\theta(u_i^2(\theta) \mid X_1,\dots,X_{i-1}) = -E_\theta(V_i(\theta) \mid X_1,\dots,X_{i-1})$$
and
$$J_n(\theta) = \sum_{i=1}^n V_i(\theta).$$
Example 1.4 Branching Process with Immigration:
Let $Z_{n+1} = \sum_{i=1}^{Z_n} Y_{n+1,i} + I_{n+1}$, where $\{Y_{j,i}\}$ is i.i.d. with mean $E(Y_{j,i}) = m$, $\operatorname{Var}(Y_{j,i}) = \sigma^2$, and $\{I_n\}$ is i.i.d. with mean $E(I_n) = b$, $\operatorname{Var}(I_n) = \lambda$; then
$$Z_{n+1} = m Z_n + b + \delta_{n+1},$$
where $\delta_{n+1} = Z_{n+1} - E(Z_{n+1} \mid Z_n)$ and
$$\varepsilon_{n+1} = \frac{\delta_{n+1}}{\sqrt{\sigma^2 Z_n + \lambda}}.$$
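A quick simulation shows why $\varepsilon_{n+1}$ is the natural standardization in Example 1.4. The offspring and immigration laws below (each takes the value 0 or 2 with probability 1/2, so $m = b = 1$ and $\sigma^2 = \lambda = 1$) are illustrative choices, not from the text:

```python
import random

random.seed(1)

m, sigma2 = 1.0, 1.0      # offspring mean and variance
b, lam = 1.0, 1.0         # immigration mean and variance

def two_or_zero():
    # value 0 or 2 with probability 1/2: mean 1, variance 1
    return 2 if random.random() < 0.5 else 0

def one_step(z):
    """Z_{n+1} = sum_{i=1}^{Z_n} Y_{n+1,i} + I_{n+1}."""
    return sum(two_or_zero() for _ in range(z)) + two_or_zero()

z, reps = 50, 20000        # condition on Z_n = 50
eps = []
for _ in range(reps):
    delta = one_step(z) - (m * z + b)            # delta_{n+1}
    eps.append(delta / (sigma2 * z + lam) ** 0.5)

mean_eps = sum(eps) / reps
var_eps = sum(e * e for e in eps) / reps
print(mean_eps, var_eps)
```

Conditionally on $Z_n$, $\delta_{n+1}$ has mean 0 and variance $\sigma^2 Z_n + \lambda$, so the standardized $\varepsilon_{n+1}$ should show sample mean close to 0 and sample variance close to 1.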
Consider $(\Omega, \mathcal{F}, P)$, where
$\Omega$: sample space
$\mathcal{F}$: $\sigma$–algebra $\subset 2^\Omega$
$P$: probability
$\{X = a_i\} = E_i$, $i = 1,\dots,n$; $\mathcal{F}_X = $ minimum $\sigma$–algebra $\supset \{E_1,\dots,E_n\}$;
$\mathcal{F}_{X_1,X_2} = \sigma$–algebra $\supset \{X_1 = a_i, X_2 = b_j\}$, $i = \cdots$, $j = \cdots$.
Note that $\mathcal{F}_{X_1,X_2} \supset \mathcal{F}_{X_1}$.
$\Omega = \cup_{i=1}^\infty B_i$, where $B_i \cap B_j = \emptyset$ if $i \neq j$;
$\mathcal{F} = \sigma(B_i,\ 1 \le i < \infty)$, and $E(X \mid \mathcal{F}) = \sum_{i=1}^\infty E(X \mid B_i) I_{B_i}$.
Sol:
(i) $E(X \mid \mathcal{F}) = \sum_{i=1}^l E(X \mid B_i) I_{B_i}$,
(ii) $\forall\, G \in \mathcal{F}$,
$$\int_G E(X \mid \mathcal{F})\,dP = \int_G \sum_{i=1}^l E(X \mid B_i) I_{B_i}\,dP = \sum_{i=1}^l E(X \mid B_i) P(B_i \cap G) = \sum_{i=1}^l \sum_{j=1}^n a_j P(A_j \mid B_i) P(B_i \cap G) = \sum_{j=1}^n a_j \Big(\sum_{i=1}^l P(A_j \mid B_i) P(B_i \cap G)\Big) = \sum_{j=1}^n a_j P(A_j \cap G).$$
Since by hypothesis $G \in \mathcal{F}$, $\exists$ an index set $I$ s.t. $G = \cup_{i \in I} B_i$, so
$$\sum_{i=1}^l P(A_j \mid B_i) P(B_i \cap G) = \sum_{i \in I} P(A_j \mid B_i) P(B_i) = \sum_{i \in I} P(A_j \cap B_i) = P(A_j \cap (\cup_{i \in I} B_i)) = P(A_j \cap G).$$
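For a finite partition both defining properties can be verified by direct computation. A small sketch (the fair-die sample space and the two-block partition are hypothetical, chosen only for illustration):

```python
from fractions import Fraction

omega = [1, 2, 3, 4, 5, 6]                 # uniform sample space (fair die)
P = {w: Fraction(1, 6) for w in omega}
partition = [[1, 2], [3, 4, 5, 6]]         # B_1, B_2 generating F
X = {w: w for w in omega}                  # X(w) = w

def cond_exp(X, partition, P):
    """E(X|F)(w) = E(X|B_i) for the block B_i containing w."""
    out = {}
    for B in partition:
        pB = sum(P[w] for w in B)
        eXB = sum(X[w] * P[w] for w in B) / pB
        for w in B:
            out[w] = eXB
    return out

Y = cond_exp(X, partition, P)
# Defining property: for every block B of F, int_B Y dP = int_B X dP
for B in partition:
    assert sum(Y[w] * P[w] for w in B) == sum(X[w] * P[w] for w in B)
print(Y[1], Y[3])
```

Here $E(X \mid \mathcal{F})$ is constant on each block: $3/2$ on $B_1$ and $9/2$ on $B_2$.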
(i) $G$ is $\mathcal{F}$–measurable,
(ii) $\int_G (Z - W)\,dP = \int_G X\,dP - \int_G X\,dP = 0 \Rightarrow P(G) = 0$,
so $f = E(X \mid \mathcal{F})$ a.s.
• derivative: $\triangle f / \triangle t$
• ratio
2. $\mathcal{F}_A \neq \mathcal{F}_B \Rightarrow E(X \mid \mathcal{F}_A) \neq E(X \mid \mathcal{F}_B)$
Example 1.6
1. Discrete: $\mathcal{F} = \sigma(B_i,\ 1 \le i < \infty)$, $X \in L^1$,
$$E(X \mid \mathcal{F}) = \sum_{i=1}^\infty \frac{\int_{B_i} X\,dP}{P(B_i)}\, I_{B_i}$$
$\Rightarrow \varphi(X) = h(\tilde Y)$
2. $E(X \mid \{\emptyset, \Omega\}) = E X$.
3. If $X$ is $\mathcal{F}$–measurable then $E(X \mid \mathcal{F}) = X$ a.s.
Proof: Since $\forall G \in \mathcal{F}$, $\int_G E(X \mid \mathcal{F})\,dP = \int_G X\,dP$.
4. If $X = c$, a constant, a.s., then $E(X \mid \mathcal{F}) = c$ a.s.
Proof: $\int_G X\,dP = \int_G c\,dP$, and $Y \equiv c$ is $\mathcal{F}$–measurable.
5. $\forall$ constants $a, b$: $E(aX + bY \mid \mathcal{F}) = a E(X \mid \mathcal{F}) + b E(Y \mid \mathcal{F})$.
Proof: $\int_G (\text{rhs}) = \int_G (\text{lhs})$.
6. $X \le Y$ a.s. $\Rightarrow E(X \mid \mathcal{F}) \le E(Y \mid \mathcal{F})$.
Proof: Using (5), we only have to show that
$$Y - X = Z \ge 0 \text{ a.s.} \Rightarrow E(Z \mid \mathcal{F}) \ge 0 \text{ a.s.}$$
Let $A = \{E(Z \mid \mathcal{F}) < 0\}$; then
$$0 \le \int_A Z\,dP = \int_A E(Z \mid \mathcal{F})\,dP \Rightarrow P(A) = 0.$$
Proof: Set $Z_n = \sup_{k\ge n} |X_k - X|$; then $Z_n \le 2Y$. So $Z_n \in L^1$, and $Z_n \downarrow\ \Rightarrow E(Z_n \mid \mathcal{F}) \downarrow$.
So $\exists Z$ s.t. $\lim_{n\to\infty} E(Z_n \mid \mathcal{F}) = Z$ a.s. We only have to show that $Z = 0$ a.s.,
since $|E(X_n \mid \mathcal{F}) - E(X \mid \mathcal{F})| \le E(|X_n - X| \mid \mathcal{F}) \le E(Z_n \mid \mathcal{F})$.
Note that $Z \ge 0$ a.s., so we only have to prove $E Z = 0$. Since $E(Z_n \mid \mathcal{F}) \downarrow Z$,
$$E Z \le \lim_{n\to\infty} E(E(Z_n \mid \mathcal{F})) = \lim_{n\to\infty} E(Z_n) = E\big(\lim_{n\to\infty} Z_n\big) = 0 \Rightarrow E Z = 0.$$
Theorem 1.1 If X is F–measurable and Y, XY ∈ L1 , then E(XY |F) = XE(Y |F).
Proof :
1. $X = I_G$ where $G \in \mathcal{F}$: $\forall B \in \mathcal{F}$,
$$\int_B E(XY \mid \mathcal{F})\,dP = \int_B XY\,dP = \int_B I_G Y\,dP = \int_{B\cap G} Y\,dP = \int_{B\cap G} E(Y \mid \mathcal{F})\,dP \ (\text{since } B\cap G \in \mathcal{F}) = \int_B I_G E(Y \mid \mathcal{F})\,dP = \int_B X E(Y \mid \mathcal{F})\,dP.$$
2. Find $X_n$ s.t. $X_n = \sum_{k=0}^{n^2} \frac{k}{n} I_{[\frac{k}{n} \le X < \frac{k+1}{n}]} - \frac{k}{n} I_{[-\frac{k+1}{n} < X \le -\frac{k}{n}]}$;
then $|X_n| \le |X|$ and $X_n \to X$ a.s.
From (1), we obtain that $E(X_n Y \mid \mathcal{F}) = X_n E(Y \mid \mathcal{F})$.
Now $X_n Y \to XY$ a.s. and $|X_n Y| = |X_n||Y| \le |XY|$, so by the D.C.T.
$$\lim_{n\to\infty} E(X_n Y \mid \mathcal{F}) = E\big(\lim_{n\to\infty} X_n Y \mid \mathcal{F}\big) = E(XY \mid \mathcal{F}).$$
But $\lim_{n\to\infty} X_n E(Y \mid \mathcal{F}) = X E(Y \mid \mathcal{F})$ a.s., so $E(XY \mid \mathcal{F}) = X E(Y \mid \mathcal{F})$.
For $X = \sum_{i=1}^k a_i I_{A_i}$,
$$E(X \mid \mathcal{F}) = \sum_{i=1}^k a_i E(I_{A_i} \mid \mathcal{F}).$$
Since
$$\sum_{i=1}^k E(I_{A_i} \mid \mathcal{F}) = E\Big(\sum_{i=1}^k I_{A_i} \,\Big|\, \mathcal{F}\Big) = E(1 \mid \mathcal{F}) = 1 \text{ a.s.},$$
so
$$\varphi(E(X \mid \mathcal{F})) \le \sum_{i=1}^k E(I_{A_i} \mid \mathcal{F})\varphi(a_i) = E\Big(\sum_{i=1}^k \varphi(a_i) I_{A_i} \,\Big|\, \mathcal{F}\Big) = E(\varphi(X) \mid \mathcal{F}).$$
2. Find $X_n$ as before (i.e., $X_n$ is of the form $\sum a_i I_{A_i}$, $|X_n| \le |X|$, and $X_n \to X$ a.s.).
Then $\varphi(E(X_n \mid \mathcal{F})) \le E(\varphi(X_n) \mid \mathcal{F})$.
First observe that $E(X_n \mid \mathcal{F}) \to E(X \mid \mathcal{F})$ a.s. By continuity of $\varphi$:
fix $m$; we can find a convex function $\varphi_m$ such that $\varphi_m(x) = \varphi(x)$ $\forall |x| \le m$, $|\varphi_m(x)| \le C_m(|x|+1)$ $\forall x$, and $\varphi(x) \ge \varphi_m(x)$ $\forall x$.
Fix $m$; $\forall n$,
$$|\varphi_m(X_n)| \le C_m(|X_n|+1) \le C_m(|X|+1),$$
so
$$\lim_{n\to\infty} E[\varphi_m(X_n) \mid \mathcal{F}] = E\big[\lim_{n\to\infty} \varphi_m(X_n) \mid \mathcal{F}\big] = E[\varphi_m(X) \mid \mathcal{F}].$$
Corollary 1.1 If $X \in L^p$, $p \ge 1$, then $E(X \mid \mathcal{F}) \in L^p$.
Proof: Since $\varphi(x) = |x|^p$ is convex if $p \ge 1$,
$$E|E(X \mid \mathcal{F})|^p \le E\,E(|X|^p \mid \mathcal{F}) = E|X|^p < \infty.$$
Homework:
1. If $p > 1$ and $\frac1p + \frac1q = 1$, $X \in L^p$, $Y \in L^q$, then
$$E(|XY| \mid \mathcal{F}) \le E(|X|^p \mid \mathcal{F})^{\frac1p}\, E(|Y|^q \mid \mathcal{F})^{\frac1q} \text{ a.s.}$$
Therefore
$$\inf_{Y \in L^2(\mathcal{F})} E(X - Y)^2 = E(X - E(X \mid \mathcal{F}))^2.$$
Proof :
Remark 1.2 Let Fn = σ(X1 , · · · , Xn ). Then θ̂ is Fn –measurable
⇔ ∃ measurable function h such that θ̂ = h(X1 , · · · , Xn ) a.s.
So θ̂n = E(θ|Fn ) is the solution.
1.2 Martingale
(Ω, F, P)
Fn ⊂ F, Fn ⊂ Fn+1 : history(filtration)
Definition 1.2
(iii) The $\sigma$–fields $\mathcal{F}_n = \sigma(X_1,\cdots,X_n)$ are said to be the natural history of $\{X_n\}$. (It is obvious that $\mathcal{F}_n \uparrow$.)
(1) $X_n$ is $\mathcal{F}_n$–adapted.
(2) $E(X_n \mid \mathcal{F}_{n-1}) = X_{n-1}$, $\forall n \ge 2$.
(3) $\{\varepsilon_n, n \ge 1\}$ is said to be a martingale difference sequence w.r.t. $\{\mathcal{F}_n, n \ge 0\}$ if $E(\varepsilon_n \mid \mathcal{F}_{n-1}) = 0$ a.s., $\forall n \ge 1$.
{Fn , n ≥ 0}.
Proof :
Example 1.7
(a) If $\{\varepsilon_i\}$ are independent r.v.'s with $E(\varepsilon_i) = 0$ and $\operatorname{Var}(\varepsilon_i) = 1$ $\forall i$, let $S_n = \sum_{i=1}^n \varepsilon_i$ and $\mathcal{F}_n = \sigma(\varepsilon_1,\cdots,\varepsilon_n)$; then $E(\varepsilon_n \mid \mathcal{F}_{n-1}) = E(\varepsilon_n) = 0$.
(b) Let $X_n = \rho X_{n-1} + \varepsilon_n$, $|\rho| < 1$, where the $\varepsilon_n$ are i.i.d. with $E(\varepsilon_n) = 0$, $E(\varepsilon_n^2) < \infty$, and $X_0 \in L^2$ is independent of $\{\varepsilon_i, i \ge 1\}$; then $\sum_{i=1}^n X_{i-1}\varepsilon_i$ is a martingale w.r.t. $\{\mathcal{F}_n, n \ge 0\}$, where $\mathcal{F}_n = \sigma(X_0, \varepsilon_1, \cdots, \varepsilon_n)$, $\forall n \ge 0$.
proof:
$$X_n = \rho^2 X_{n-2} + \rho\varepsilon_{n-1} + \varepsilon_n = \cdots = \rho^n X_0 + \rho^{n-1}\varepsilon_1 + \cdots + \varepsilon_n.$$
$\mathcal{F}_n = \sigma(X_1,\cdots,X_n)$
Fix $\theta_0, \theta$; then $\{Y_n(\theta), \mathcal{F}_n, n \ge 1\}$ is a martingale:
$$E_{\theta_0}(Y_n(\theta) \mid \mathcal{F}_{n-1}) = E_{\theta_0}\Big(\frac{L_n(\theta)}{L_n(\theta_0)} \,\Big|\, \mathcal{F}_{n-1}\Big) = E_{\theta_0}\Big(\frac{f_\theta(X_n \mid X_1,\cdots,X_{n-1})}{f_{\theta_0}(X_n \mid X_1,\cdots,X_{n-1})}\cdot\frac{L_{n-1}(\theta)}{L_{n-1}(\theta_0)} \,\Big|\, \mathcal{F}_{n-1}\Big) = \frac{L_{n-1}(\theta)}{L_{n-1}(\theta_0)}\, E_{\theta_0}\Big(\frac{f_\theta(X_n \mid X_1,\cdots,X_{n-1})}{f_{\theta_0}(X_n \mid X_1,\cdots,X_{n-1})} \,\Big|\, \mathcal{F}_{n-1}\Big) = Y_{n-1}(\theta)\int\frac{f_\theta(x_n \mid X_1,\cdots,X_{n-1})}{f_{\theta_0}(x_n \mid X_1,\cdots,X_{n-1})}\, f_{\theta_0}(x_n \mid X_1,\cdots,X_{n-1})\,dx_n = Y_{n-1}(\theta);$$
i.e., $E(\varphi(X) \mid X_1,\cdots,X_n) = \int \varphi(x) f(x \mid X_1,\cdots,X_n)\,dx$.
$$E_\theta\Big(\frac{d\log L_n(\theta)}{d\theta} \,\Big|\, \mathcal{F}_{n-1}\Big) = E_\theta\Big(\frac{d\log f_\theta(X_n \mid X_1,\cdots,X_{n-1})}{d\theta} + \frac{d\log L_{n-1}(\theta)}{d\theta} \,\Big|\, \mathcal{F}_{n-1}\Big) = E_\theta\Big[\frac{\partial f_\theta(X_n \mid X_1,\cdots,X_{n-1})/\partial\theta}{f_\theta(X_n \mid X_1,\cdots,X_{n-1})} \,\Big|\, \mathcal{F}_{n-1}\Big] + \frac{d\log L_{n-1}(\theta)}{d\theta} = \int\frac{\partial f_\theta(x_n \mid X_1,\cdots,X_{n-1})/\partial\theta}{f_\theta(x_n \mid X_1,\cdots,X_{n-1})}\cdot f_\theta(x_n \mid X_1,\cdots,X_{n-1})\,dx_n + \frac{d\log L_{n-1}(\theta)}{d\theta} = \frac{d\log L_{n-1}(\theta)}{d\theta}.$$
Lemma: If $X_n$ is $\mathcal{F}_n$–adapted and $X_n \in L^1$, then $S_1 = X_1$, $S_n = X_1 + \sum_{i=2}^n (X_i - E(X_i \mid \mathcal{F}_{i-1}))$ is a martingale w.r.t. $\{\mathcal{F}_n, n \ge 1\}$.
proof: For $n \ge 2$,
$$E(S_n \mid \mathcal{F}_{n-1}) = X_1 + \sum_{i=2}^{n-1}(X_i - E(X_i \mid \mathcal{F}_{i-1})) + E[(X_n - E(X_n \mid \mathcal{F}_{n-1})) \mid \mathcal{F}_{n-1}] = S_{n-1},$$
since $E[(X_n - E(X_n \mid \mathcal{F}_{n-1})) \mid \mathcal{F}_{n-1}] = E(X_n \mid \mathcal{F}_{n-1}) - E(X_n \mid \mathcal{F}_{n-1}) = 0$.
(f) Let $X_i = \theta X_{i-1} + \varepsilon_i$ with the $\varepsilon_i$ i.i.d. $N(0,\sigma^2)$; therefore
$$\frac{d\log L_n(\theta)}{d\theta} = \frac{1}{\sigma^2}\sum_{i=1}^n x_{i-1}(x_i - \theta x_{i-1}) = \frac{1}{\sigma^2}\sum_{i=1}^n x_{i-1}\varepsilon_i.$$
i.e., $u_i(\theta) = \frac{1}{\sigma^2} X_{i-1}(X_i - \theta X_{i-1}) \Rightarrow u_i^2(\theta) = \frac{1}{\sigma^4} X_{i-1}^2(X_i - \theta X_{i-1})^2$.
Then
$$E[u_i^2(\theta) \mid \mathcal{F}_{i-1}] = \frac{1}{\sigma^4} X_{i-1}^2\, E[(X_i - \theta X_{i-1})^2 \mid \mathcal{F}_{i-1}] = \frac{1}{\sigma^4} X_{i-1}^2 \sigma^2 = \frac{X_{i-1}^2}{\sigma^2},$$
so
$$I(\theta) = \frac{1}{\sigma^2}\sum_{i=1}^n X_{i-1}^2, \qquad v_i(\theta) = \frac{d u_i(\theta)}{d\theta} = -\frac{X_{i-1}^2}{\sigma^2}, \qquad J(\theta) = \sum_{i=1}^n v_i(\theta) = -\frac{1}{\sigma^2}\sum_{i=1}^n X_{i-1}^2$$
$\Rightarrow I(\theta) + J(\theta) = 0$.
And $\sum_{i=1}^n u_i^2(\theta) + \sum_{i=1}^n E[v_i(\theta) \mid \mathcal{F}_{i-1}]$ is also a martingale, since
$$\frac{1}{\sigma^4}\sum_{i=1}^n X_{i-1}^2[X_i - \theta X_{i-1}]^2 - \frac{1}{\sigma^2}\sum_{i=1}^n X_{i-1}^2 = \frac{1}{\sigma^4}\sum_{i=1}^n X_{i-1}^2[\varepsilon_i^2 - \sigma^2].$$
Theorem 1.3
(i) Assume that $\{X_n, \mathcal{F}_n\}$ is a martingale. If $\varphi$ is convex and $\varphi(X_n) \in L^1$, then $\{\varphi(X_n), \mathcal{F}_n\}$ is a submartingale.
(ii) Assume that $\{X_n, \mathcal{F}_n\}$ is a submartingale. If $\varphi$ is convex, increasing, and $\varphi(X_n) \in L^1$, then $\{\varphi(X_n), \mathcal{F}_n\}$ is a submartingale.
Proof: By Jensen's inequality.
Prove that if $X_n = \sum_{i=1}^n \varepsilon_i$, where the $\varepsilon_i$'s are i.i.d. r.v.'s with $E(\varepsilon_i) = 0$ and $E|\varepsilon_i|^3 < \infty$, then
$$E|X_n|^3 \le E|X_{n+1}|^3 \le \dots.$$
(iii) [Gilat, D. (1977) Ann. Prob. 5, pp. 475–481]
For a nonnegative submartingale $\{X_n, \sigma(X_1,\dots,X_n)\}$, there is a martingale $\{Y_n, \sigma(Y_1,\dots,Y_n)\}$ s.t. $\{X_n\} \stackrel{D}{=} \{|Y_n|\}$.
{T ≤ n} ∈ Fn , ∀ n ⇔ {T = n} ∈ Fn , ∀ n.
(ii) If $T_1 \le T_2$ then $\mathcal{F}_{T_1} \subset \mathcal{F}_{T_2}$.
Proof:
$$\Leftrightarrow \int_A [E(Z \mid \mathcal{F}_n) - U]\,dP = 0.$$
$$\int_A X_n I_{[\beta\ge n]}\,dP = \int_{A\cap[\beta\ge n]} X_n\,dP = \int_{A\cap[\beta=n]} X_n\,dP + \int_{A\cap[\beta\ge n+1]} X_n\,dP \le \int_{A\cap[\beta=n]} X_\beta\,dP + \int_{A\cap[\beta\ge n+1]} X_{n+1}\,dP.$$
Since $B \in \mathcal{F}_n$,
$$\int_B E[X_{n+1} \mid \mathcal{F}_n]\,dP = \int_B X_{n+1}\,dP \ge \int_B X_n\,dP.$$
We have that
$$\int_A X_n I_{[\beta\ge n]}\,dP \le \int_{A\cap[\beta=n]} X_\beta\,dP + \dots + \int_{A\cap[\beta=K]} X_\beta\,dP + \int_{A\cap[\beta\ge K+1]} X_{K+1}\,dP = \int_{A\cap[n\le\beta\le K]} X_\beta\,dP = \int_{A\cap[n\le\beta]} X_\beta\,dP.$$
$\forall n$, $\{X_\alpha \le x\} \cap \{\alpha = n\} = \{X_n \le x\} \cap \{\alpha = n\} \in \mathcal{F}_n$, so $\{X_\alpha \le x\} \in \mathcal{F}_\alpha$.
2. Without knowing the limit:
$$\{\liminf X_n < \limsup X_n\} = \bigcup_{\substack{a<b\\ \text{rationals}}} \{\liminf X_n < a < b < \limsup X_n\}$$
$$\alpha_1 = \inf\{m : X_m \le a\}, \quad \beta_1 = \inf\{m > \alpha_1 : X_m \ge b\}, \quad \dots, \quad \alpha_k = \inf\{m > \beta_{k-1} : X_m \le a\}, \quad \beta_k = \inf\{m > \alpha_k : X_m \ge b\},$$
and define the upcrossing number $U_n = U_n[a,b] = \sup\{j : \beta_j \le n,\ j < \infty\}$. Note that if $\alpha_i' = \alpha_i \wedge n$, $\beta_i' = \beta_i \wedge n$, then $\alpha_n' = \beta_n' = n$.
Then define $\tau_0 = 1$, $\tau_1 = \alpha_1'$, $\dots$, $\tau_{2n-1} = \alpha_n'$, and $\tau_{2n} = \beta_n'$. Clearly $\tau_{2n} = n$.
If $\{X_n, \mathcal{F}_n\}$ is a submartingale, then $\{X_{\tau_k}, \mathcal{F}_{\tau_k}, 1 \le k \le 2n\}$ is a submartingale by the optional sampling theorem. (Since $\tau_k \le n$ $\forall\, 1 \le k \le 2n$.)
Theorem 1.6 (Upcrossing Inequality)
If $\{X_n, \mathcal{F}_n\}$ is a submartingale, then $(b-a) E U_n \le E(X_n - a)^+ - E(X_1 - a)^+$.
Proof: Observe that the upcrossing number $U_n[0, b-a]$ of $(X_n - a)^+$ is the same as $U_n[a,b]$ of $X_n$. Furthermore, $\{(X_n - a)^+, \mathcal{F}_n\}$ is also a submartingale, since $\varphi(x) = (x-a)^+$ is a convex increasing function. Hence we only have to treat the case $X_n \ge 0$ a.s. and $U_n = U_n[0, C]$.
Now consider
$$X_n - X_1 = X_{\tau_{2n}} - X_{\tau_{2n-1}} + \dots + X_{\tau_1} - X_{\tau_0} = \sum_{i=0}^{2n-1}(X_{\tau_{i+1}} - X_{\tau_i}) = \sum_{i\ \text{even}} + \sum_{i\ \text{odd}},$$
$$\because \sum_{i\ \text{odd}}(X_{\tau_{i+1}} - X_{\tau_i}) \ge U_n C,$$
$$\therefore E X_n - E X_1 \ge C\,E U_n + E\Big(\sum_{i\ \text{even}}\Big) \ge C\,E U_n + \sum_{i\ \text{even}}(E X_{\tau_{i+1}} - E X_{\tau_i}) \ge C\,E U_n.$$
Let $U_\infty[a,b]$ be the upcrossing number of $\{X_n\}$. Then $\{\liminf X_n < a < b < \limsup X_n\} \subset \{U_\infty[a,b] = \infty\}$ and $U_n[a,b] \uparrow U_\infty[a,b]$.
Corollary 1.3 If $\{X_n\}$ is a nonnegative supermartingale then $\exists\, X \in L^0$ s.t. $X_n \stackrel{a.s.}{\to} X$.
Proof: Since $-X_n$ is a nonpositive submartingale and $E(-X_n)^+ = 0$, $\forall n$.
Example 1.9
1. Likelihood Ratio
$$Y_n(\theta) = \frac{L_n(\theta)}{L_n(\theta_0)} \ge 0.$$
So $Y_n(\theta) \to Y(\theta)$ a.s. $(P_{\theta_0})$. ($Y(\theta) = 0$ if $\theta$, $\theta_0$ are distinguishable.)
2. Bayes est.
2. If $\exists$ a Borel–measurable function $f : [0,\infty) \mapsto [0,\infty)$ s.t. $\sup_n E f(|X_n|) < \infty$ and $\lim_{t\to\infty}\frac{f(t)}{t} = \infty$, then $\{X_n\}$ is u.i.
Theorem 1.9 Assume that $X_n \stackrel{p}{\to} X$; then the following statements are equivalent.
(i) $\{|X_n|^p\}$ is u.i.
(ii) $X_n \stackrel{L^p}{\to} X$ (i.e. $E|X_n - X|^p \stackrel{n\to\infty}{\longrightarrow} 0$)
(iii) $E|X_n|^p \stackrel{n\to\infty}{\longrightarrow} E|X|^p$
Remark 1.8 If $X_n \stackrel{D}{\to} X$ and $\{|X_n|^p\}$ is u.i., then $E|X_n|^p \stackrel{n\to\infty}{\longrightarrow} E|X|^p$.
Proof: We can reconstruct the probability space and r.v.'s $X_n'$, $X'$ s.t. $X_n' \stackrel{D}{=} X_n$, $X' \stackrel{D}{=} X$, and $X_n' \stackrel{a.s.}{\to} X'$.
Ex. Let $X_n \stackrel{D}{\to} N(0,\sigma^2)$ and $\{X_n^2\}$ be u.i.; then $E(X_n^2) \stackrel{n\to\infty}{\longrightarrow} \sigma^2$. How to know $\max_{1\le i\le n}|X_i|^p \in L^1$?
Proof: Define $\tau = \inf\{i : X_i > \lambda\}$ (recall: $\inf\emptyset = \infty$); then $\{\max_{1\le i\le n} X_i > \lambda\} = \{\tau \le n\}$. On the set $\tau = k \le n$, $X_\tau > \lambda$; then
$$\lambda P[\tau = k] \le \int_{[\tau=k]} X_\tau\,dP = \int_{[\tau=k]} X_k\,dP \le \int_{[\tau=k]} X_n\,dP.$$
Since
$$\tau = k \Leftrightarrow X_1 \le \lambda, \dots, X_{k-1} \le \lambda, X_k > \lambda,$$
then
$$\lambda P\big[\max_{1\le i\le n} X_i > \lambda\big] = \lambda\sum_{k=1}^n P[\tau = k] \le \int_{[\tau\le n]} X_n\,dP = \int_{[\max_{1\le i\le n} X_i > \lambda]} X_n\,dP.$$
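The maximal inequality just proved can be checked by Monte Carlo on the nonnegative submartingale $X_i = |S_i|$ of a Gaussian random walk (the walk, $\lambda = 10$, and $n = 50$ are illustrative choices, not from the text):

```python
import random

random.seed(2)

n, reps, lam = 50, 20000, 10.0
events = 0        # count of {max_{i<=n} X_i > lam}
rhs_sum = 0.0     # accumulates X_n on that event
for _ in range(reps):
    s = 0.0
    xs = []
    for _ in range(n):
        s += random.gauss(0.0, 1.0)
        xs.append(abs(s))           # X_i = |S_i| is a submartingale
    if max(xs) > lam:
        events += 1
        rhs_sum += xs[-1]
lhs = lam * events / reps           # lam * P[max X_i > lam]
rhs = rhs_sum / reps                # E[X_n ; max X_i > lam]
print(lhs, rhs)
```

The estimated left side should indeed stay below the estimated right side.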
where $\|X\|_p = (E|X|^p)^{\frac1p}$ and $\frac1p + \frac1q = 1$.
Proof: Since $\{|X_n|, \mathcal{F}_n\}$ is a submartingale, the theorem applies. Let $Z = \max_{1\le i\le n}|X_i|$; then
$$E(Z^p) = p\int_0^\infty x^{p-1} P[Z > x]\,dx \le p\int_0^\infty x^{p-2} E(|X_n| I_{[Z>x]})\,dx = p\,E\Big[|X_n|\int_0^\infty I_{[Z>x]} x^{p-2}\,dx\Big] = p\,E\Big[|X_n|\int_0^Z x^{p-2}\,dx\Big] = p\,E\Big[|X_n|\frac{Z^{p-1}}{p-1}\Big] \le \frac{p}{p-1}\|X_n\|_p\,\|Z^{p-1}\|_q = \frac{p}{p-1}\|X_n\|_p\,[E(Z^p)]^{\frac1q}.$$
Hence, since
$$\|Z^{p-1}\|_q = \{E(Z^{p-1})^q\}^{1/q} = [E(Z^p)]^{1/q},$$
$$\|Z\|_p = [E(Z^p)]^{1/p} = [E(Z^p)]^{1-\frac1q} \le q\|X_n\|_p.$$
Note that $\|\max_{1\le i\le n}|X_i|\|_p = \infty \Rightarrow q\|X_n\|_p = \infty$.
Corollary 1.4 If $\{X_n, \mathcal{F}_n, n \ge 1\}$ is a martingale s.t. $\sup_n E|X_n|^p < \infty$ for some $p > 1$, then $\{|X_n|^p\}$ is u.i. and $X_n$ converges in $L^p$.
Proof: $p > 1 \Rightarrow \sup_n E|X_n| < \infty$, so $X_n$ converges a.s. to a r.v. $X$. By Doob's inequality:
$$\|\max_{1\le i\le n}|X_i|\|_p \le q\|X_n\|_p \le q\sup_n\|X_n\|_p < \infty.$$
By the Monotone convergence theorem:
$$E\sup_{1\le i<\infty}|X_i|^p = \lim_{n\to\infty} E\max_{1\le i\le n}|X_i|^p \le q^p\sup_n E|X_n|^p < \infty.$$
So $\sup_{1\le i<\infty}|X_i|^p \in L^1$, $\{|X_n|^p\}$ is u.i., and $X_n \stackrel{L^p}{\longrightarrow} X$.
Ex. (Bayes Est.) $\hat\theta_n = E[\theta \mid X_1,\dots,X_n]$. If $\theta \in L^2$ then $\hat\theta_n \stackrel{a.s.}{\to} \theta_\infty$ and $E[\hat\theta_n - \theta_\infty]^2 \to 0$.
pf: $E\hat\theta_n^2 \le E\theta^2 < \infty$ ($p = 2$).
Therefore $X_n \stackrel{L^1}{\to} X_\infty$. So $\forall\,\Lambda \in \mathcal{F}$, $\int_\Lambda X_n\,dP \stackrel{n\to\infty}{\longrightarrow} \int_\Lambda X_\infty\,dP$, since $|\int_\Lambda X_n\,dP - \int_\Lambda X_\infty\,dP| \le \int_\Lambda |X_n - X_\infty|\,dP \le E|X_n - X_\infty| \to 0$. Fix $n$, $\Lambda \in \mathcal{F}_n$; $\forall\, m \ge n$,
$$\int_\Lambda X\,dP = \int_\Lambda X_n\,dP = \int_\Lambda X_m\,dP = \int_\Lambda X_\infty\,dP.$$
Example: $y_i = \theta x_i + \varepsilon_i$, $x_i$: constant, $\theta \in L^2$ with known density $f(\theta)$ (here $\theta \sim N(\mu, c^2)$), $\varepsilon_i$ i.i.d. $N(0,\sigma^2)$ with $\sigma^2$ known, and $\{\varepsilon_i\}$ independent of $\theta$.
$$\hat\theta_n = E(\theta \mid Y_1,\dots,Y_n) = \frac{\frac{\mu}{c^2} + \frac{\sum_{i=1}^n x_i Y_i}{\sigma^2}}{\frac{1}{c^2} + \frac{\sum_{i=1}^n x_i^2}{\sigma^2}}.$$
When $\sum_{i=1}^\infty x_i^2 < \infty$:
$$\hat\theta_n \stackrel{n\to\infty}{\longrightarrow} \frac{\frac{\mu}{c^2} + \frac{\sum_{i=1}^\infty x_i^2}{\sigma^2}\,\theta + \frac{\sum_{i=1}^\infty x_i\varepsilon_i}{\sigma^2}}{\frac{1}{c^2} + \frac{\sum_{i=1}^\infty x_i^2}{\sigma^2}} = \theta_\infty \stackrel{D}{\sim} N\Bigg(\mu,\ \frac{\frac{\sum_{i=1}^\infty x_i^2}{\sigma^4}\big(c^2\sum_{i=1}^\infty x_i^2 + \sigma^2\big)}{\big(\frac{1}{c^2} + \frac{\sum_{i=1}^\infty x_i^2}{\sigma^2}\big)^2}\Bigg) \neq \theta.$$
When $\sum_{i=1}^\infty x_i^2 = \infty$:
$$\hat\theta_n \sim \frac{\sum_{i=1}^n x_i y_i}{\sum_{i=1}^n x_i^2} = \theta + \frac{\sum_{i=1}^n x_i\varepsilon_i}{\sum_{i=1}^n x_i^2} \stackrel{a.s.}{\to} \theta.$$
In general, let $\tilde\theta_n = \frac{\sum_{i=1}^n x_i y_i}{\sum_{i=1}^n x_i^2}$. When $\sum_{i=1}^n x_i^2 \to \infty$,
$$E(\tilde\theta_n - \theta)^2 = E\Big\{\frac{\sum_{i=1}^n x_i\varepsilon_i}{\sum_{i=1}^n x_i^2}\Big\}^2 = \frac{\sigma^2}{\sum_{i=1}^n x_i^2} \to 0.$$
So $\tilde\theta_n \stackrel{p}{\to} \theta$. By our theorem, $\hat\theta_n \to \theta$ a.s. and in $L^2$.
How to calculate the upper and lower bounds of $E|X_n|^p$ and $E|\sum_{i=1}^n x_i\varepsilon_i|^p$?
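The posterior-mean formula above is easy to exercise numerically; a sketch under the normal prior $\theta \sim N(\mu, c^2)$ assumed in the example (the concrete values and the design $x_i \equiv 1$, which gives $\sum x_i^2 = \infty$, are illustrative):

```python
import random

random.seed(3)

mu, c2, sigma2 = 0.0, 4.0, 1.0
theta = random.gauss(mu, c2 ** 0.5)        # theta ~ N(mu, c^2)
n = 5000
x = [1.0] * n
y = [theta * xi + random.gauss(0.0, sigma2 ** 0.5) for xi in x]

# hat(theta)_n = (mu/c^2 + sum x_i y_i / sigma^2) / (1/c^2 + sum x_i^2 / sigma^2)
num = mu / c2 + sum(xi * yi for xi, yi in zip(x, y)) / sigma2
den = 1.0 / c2 + sum(xi * xi for xi in x) / sigma2
theta_hat = num / den
print(theta, theta_hat)
```

Since $\sum x_i^2 = \infty$ here, $\hat\theta_n$ recovers the realized value of $\theta$, in agreement with the text.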
1.4 Square function inequality
Let {Xn , Fn } be a martingale and d1 = X1 ,di = Xi − Xi−1 for i ≥ 2.
Example: Let $Y = \sum_{i=-\infty}^\infty a_i\varepsilon_i$, where $\sum_{i=-\infty}^\infty a_i^2 < \infty$ and the $\varepsilon_i$ are i.i.d. random variables with $E(\varepsilon_i) = 0$ and $\operatorname{Var}(\varepsilon_i) = \sigma^2 < \infty$. Assume $E|\varepsilon_i|^p < \infty$. With $Y_n = \sum_{i=-n}^n a_i\varepsilon_i$, $(a_{-n}\varepsilon_{-n},\ a_{-n}\varepsilon_{-n} + a_{-n+1}\varepsilon_{-n+1},\ \cdots,\ Y_n)$ is a martingale.
If $\varepsilon_i \stackrel{D}{\sim} N(0,\sigma^2)$ then $Y \stackrel{D}{\sim} N(0, (\sum_{-\infty}^\infty a_i^2)\sigma^2)$. With $C^2 = (\sum_{-\infty}^\infty a_i^2)\sigma^2$,
$$E|Y|^p = E\Big|\frac{Y}{C}\Big|^p C^p = (E|N(0,1)|^p) C^p = \{E|N(0,1)|^p\}\,\sigma^p\Big(\sum_{-\infty}^\infty a_i^2\Big)^{p/2}.$$
By Fatou's lemma, $E|Y|^p \le C_2(E|\varepsilon_1|^p)\{\sum_{-\infty}^\infty a_i^2\}^{p/2}$; $\exists\, C_1, C_2$ depending only on $p$ and $E|\varepsilon_i|^p$ s.t.
$$C_1\Big(\sum_{-\infty}^\infty a_i^2\Big)^{p/2} \le E|Y|^p \le C_2\Big(\sum_{-\infty}^\infty a_i^2\Big)^{p/2}.$$
Example: Consider $y_i = \alpha + \beta x_i + \varepsilon_i$ where the $\varepsilon_i$ are i.i.d. with mean 0 and $E|\varepsilon_i|^p < \infty$ for some $p \ge 2$. Assume that the $x_i$ are constant and $s_n^2 = \sum_{i=1}^n (x_i - \bar x_n)^2 \to \infty$. If $p > 2$ then the least squares estimator $\hat\beta$ is strongly consistent.
$$\hat\beta_n - \beta = \frac{\sum_{i=1}^n (x_i - \bar x_n)\varepsilon_i}{\sum_{i=1}^n (x_i - \bar x_n)^2} \qquad \Big(\operatorname{Var}(\hat\beta_n) = \frac{\sigma^2}{s_n^2}\Big)$$
$\bar x_n = \frac1n\sum_{i=1}^n x_i$; let
$$S_n = \sum_{i=1}^n (x_i - \bar x_n)\varepsilon_i, \quad n \ge 2, \qquad S_n = S_2 + (S_3 - S_2) + \cdots + (S_n - S_{n-1}).$$
When $n > m$,
$$S_n - S_m = \sum_{i=1}^n (x_i - \bar x_n)\varepsilon_i - \sum_{i=1}^m (x_i - \bar x_m)\varepsilon_i = \sum_{i=1}^m (\bar x_m - \bar x_n)\varepsilon_i + \sum_{i=m+1}^n (x_i - \bar x_n)\varepsilon_i,$$
$$E(S_n - S_{n-1})S_m = \sum_{i=1}^m (x_i - \bar x_m)(\bar x_m - \bar x_n)\sigma^2 = (\bar x_m - \bar x_n)\Big[\sum_{i=1}^m (x_i - \bar x_m)\Big]\sigma^2 = 0,$$
since $\sum_{i=1}^m (x_i - \bar x_m) = 0$.
So $s_n^2 = \sum_{i=2}^n C_i^2$, where $C_2^2 = E(S_2^2)/\sigma^2$ and $C_n^2 = E(S_n - S_{n-1})^2/\sigma^2$. We want to show $\frac{S_n}{s_n^2} \to 0$ a.s.
Theorem 1.14 (Burkholder–Davis–Gundy)
$\forall\, p > 0$, $\exists\, C$ depending only on $p$ s.t.
$$E(X_n^*)^p \le C\Big\{E\Big[\sum_{i=1}^n E(d_i^2 \mid \mathcal{F}_{i-1})\Big]^{p/2} + E\big(\max_{1\le i\le n}|d_i|^p\big)\Big\}$$
Cor. (Wei, 1987, Ann. Stat. 1667–1682)
Assume that $\{\varepsilon_i, \mathcal{F}_i\}$ is a martingale difference sequence s.t. $\sup_n E\{|\varepsilon_n|^p \mid \mathcal{F}_{n-1}\} \le C$ for some $p \ge 2$ and constant $C$.
Assume that $u_n$ is $\mathcal{F}_{n-1}$–measurable. Let $X_n = \sum_{i=1}^n u_i\varepsilon_i$ and $X_n^* = \sup_{1\le i\le n}|X_i|$.
Then $\exists\, K$ depending only on $C$ and $p$ s.t. $E(X_n^*)^p \le K\,E(\sum_{i=1}^n u_i^2)^{p/2}$.
Proof: By the B–D–G inequality:
$$E(X_n^*)^p \le C_p\Big\{E\Big[\sum_{i=1}^n E(u_i^2\varepsilon_i^2 \mid \mathcal{F}_{i-1})\Big]^{p/2} + E\max_{1\le i\le n}|u_i\varepsilon_i|^p\Big\},$$
$$\sum_{i=1}^n E(u_i^2\varepsilon_i^2 \mid \mathcal{F}_{i-1}) \le \sum_{i=1}^n u_i^2\big[E(|\varepsilon_i|^p \mid \mathcal{F}_{i-1})\big]^{2/p} \le C^{2/p}\Big(\sum_{i=1}^n u_i^2\Big),$$
$$\text{first term} \le C_p\,C\,E\Big(\sum_{i=1}^n u_i^2\Big)^{p/2},$$
$$\text{second term} \le E\sum_{i=1}^n |u_i|^p|\varepsilon_i|^p = \sum_{i=1}^n E(|u_i|^p|\varepsilon_i|^p) = \sum_{i=1}^n E\{E(|u_i|^p|\varepsilon_i|^p \mid \mathcal{F}_{i-1})\} \le C\sum_{i=1}^n E|u_i|^p = C\,E\Big(\sum_{i=1}^n |u_i|^p\Big) \le C\,E\sum_{i=1}^n u_i^2\big(\max_{1\le j\le n}|u_j|^{p-2}\big) \le C\,E\Big(\sum_{i=1}^n u_i^2\Big)\Big(\sum_{i=1}^n u_i^2\Big)^{\frac{p-2}{2}} = C\,E\Big(\sum_{i=1}^n u_i^2\Big)^{p/2},$$
so we may take $K = 2 C_p C$.
($a_i$ constant, $p \ge 2$: $\sum_{i=1}^n |a_i|^p \le (\sum_{i=1}^n a_i^2)^{p/2}$.)
1. If $\sum P(A_i) < \infty$ then $P(A_i \text{ i.o.}) = 0$.
2. If the $A_i$ are independent and $P(A_i \text{ i.o.}) = 0$ then $\sum P(A_i) < \infty$.
Define $X = \sum_{i=1}^\infty I_{A_i}$; then $\{A_i \text{ i.o.}\} = \{X = \infty\}$ and
$$\sum P(A_i) = \sum E(I_{A_i}) = E\Big(\sum I_{A_i}\Big) = E(X).$$
2.? $\sum_{i=1}^\infty E(I_{A_i} \mid \mathcal{F}_{i-1}) < \infty$ a.s. if $\sum_{i=1}^\infty E I_{A_i} < \infty$, $\mathcal{F}_n = \sigma(A_1,\cdots,A_n)$.
Theorem 1.16 Let $\{X_n\}$ be a sequence of nonnegative random variables and $\{\mathcal{F}_n\}$ be a sequence of increasing $\sigma$–fields. Let $M_n = E(X_n \mid \mathcal{F}_{n-1})$ for $n \ge 1$.
1. $\sum_{i=1}^\infty X_i < \infty$ a.s. on $\{\sum_{i=1}^\infty M_i < \infty\}$.
2. If $X_n$ is $\mathcal{F}_n$–measurable and $Y = \sup_n X_n/(1 + X_1 + \cdots + X_{n-1}) \in L^1$, then $\sum_{i=1}^\infty M_i < \infty$ a.s. on $\{\sum_{i=1}^\infty X_i < \infty\}$.
Classical results: $A_i$ events.
$$\sum_{i=1}^\infty P(A_i) < \infty \Rightarrow P(A_n \text{ i.o.}) = 0$$
If the $A_i$ are independent, then $P(A_n \text{ i.o.}) = 0$, or $P(\sum_{i=1}^\infty I_{A_i} < \infty) = 1$, $\Rightarrow \sum_{i=1}^\infty P(A_i) < \infty$.
$x_i = I_{A_i}$, $\mathcal{F}_n = \sigma(A_1,\cdots,A_n)$:
$$\sum_{i=1}^\infty P(A_i) = \sum_{i=1}^\infty E(I_{A_i}) = E\Big(\sum_{i=1}^\infty I_{A_i}\Big) = E\Big(\sum_{i=1}^\infty X_i\Big) = E\Big\{\sum_{i=1}^\infty E(X_i \mid \mathcal{F}_{i-1})\Big\} < \infty$$
$$\Rightarrow \sum_{i=1}^\infty E(X_i \mid \mathcal{F}_{i-1}) < \infty \text{ a.s.} \Rightarrow \sum_{i=1}^\infty X_i < \infty \text{ a.s.}$$
$$\{A_n \text{ i.o.}\} = \Big\{\sum_{i=1}^\infty I_{A_i} = \infty\Big\} = \Big\{\sum_{i=1}^\infty X_i = \infty\Big\}$$
$$\sum_{i=1}^\infty M_i = \sum_{i=1}^\infty E(I_{A_i} \mid \mathcal{F}_{i-1}) \stackrel{\text{indep.}}{=} \sum_{i=1}^\infty E(I_{A_i}) = \sum_{i=1}^\infty P(A_i)$$
$$P\Big\{\sum_{i=1}^\infty I_{A_i} < \infty\Big\} > 0 \Rightarrow \sum_{i=1}^\infty P(A_i) < \infty$$
proof of theorem:
(i) Let $M_0 = 1$. Consider
$$\sum_{i=1}^n \frac{M_i}{(M_0+\cdots+M_{i-1})(M_0+\cdots+M_i)} = \sum_{i=1}^n\Big\{\frac{1}{M_0+\cdots+M_{i-1}} - \frac{1}{M_0+\cdots+M_i}\Big\} = \frac{1}{M_0} - \frac{1}{M_0+\cdots+M_n} = 1 - \frac{1}{1+M_1+\cdots+M_n}.$$
Let $S_n = M_0 + \cdots + M_n$; then $S_n$ is $\mathcal{F}_{n-1}$–measurable.
Since
$$1 \ge E\sum_{i=1}^\infty \frac{M_i}{S_{i-1}S_i} = \sum_{i=1}^\infty E\Big(\frac{M_i}{S_{i-1}S_i}\Big) = \sum_{i=1}^\infty E\Big(\frac{E(X_i \mid \mathcal{F}_{i-1})}{S_{i-1}S_i}\Big) = \sum_{i=1}^\infty E\Big\{E\Big(\frac{X_i}{S_{i-1}S_i}\,\Big|\,\mathcal{F}_{i-1}\Big)\Big\} = \sum_{i=1}^\infty E\Big(\frac{X_i}{S_{i-1}S_i}\Big) = E\Big(\sum_{i=1}^\infty \frac{X_i}{S_{i-1}S_i}\Big),$$
so $\sum_{i=1}^\infty \frac{X_i}{S_{i-1}S_i} < \infty$ a.s.
(ii) Let $U_n = 1 + X_1 + \cdots + X_n$ ($U_0 = 1$), which is $\mathcal{F}_n$–measurable. Then
$$E\Big(\sum_{i=1}^\infty \frac{M_i}{U_{i-1}^2}\Big) = \sum_{i=1}^\infty E\Big(\frac{M_i}{U_{i-1}^2}\Big) = \sum_{i=1}^\infty E\,E\Big(\frac{X_i}{U_{i-1}^2}\,\Big|\,\mathcal{F}_{i-1}\Big) = E\Big(\sum_{i=1}^\infty \frac{X_i}{U_{i-1}^2}\Big) = E\Big(\sum_{i=1}^\infty \frac{X_i}{U_{i-1}U_i}\cdot\frac{U_i}{U_{i-1}}\Big) \le E\Big[\Big(\sum_{i=1}^\infty \frac{X_i}{U_{i-1}U_i}\Big)\Big(\sup_i\frac{U_i}{U_{i-1}}\Big)\Big] \le E\sup_i\frac{U_i}{U_{i-1}} = E\Big(\sup_i\Big(1+\frac{X_i}{U_{i-1}}\Big)\Big) = E(1+Y) < \infty.$$
So $\sum_{i=1}^\infty \frac{M_i}{U_{i-1}^2} < \infty$ a.s.
On the set $\{U_\infty < \infty\}$:
$$\sum_{i=1}^\infty \frac{M_i}{U_{i-1}^2} \ge \frac{\sum_{i=1}^\infty M_i}{U_\infty^2} \Rightarrow \sum_{i=1}^\infty M_i < \infty.$$
$$P\Big[\Big\{\sum_{i=1}^\infty M_i < \infty\Big\} \triangle \Big\{\sum_{i=1}^\infty X_i < \infty\Big\}\Big] = 0, \quad\text{and}\quad P\Big[\Big\{\sum_{i=1}^\infty M_i = \infty\Big\} \triangle \Big\{\sum_{i=1}^\infty X_i = \infty\Big\}\Big] = 0.$$
(ii) $\sum_{i=1}^n E(\varepsilon_i I_{[|\varepsilon_i|\le C]} \mid \mathcal{F}_{i-1})$ converges, and
(iii) $\sum_{i=1}^\infty \{E(\varepsilon_i^2 I_{[|\varepsilon_i|\le C]} \mid \mathcal{F}_{i-1}) - E^2(\varepsilon_i I_{[|\varepsilon_i|\le C]} \mid \mathcal{F}_{i-1})\} < \infty$
Remark: When the $\varepsilon_i$ are independent, (i), (ii) and (iii) are also necessary for $X_n$ to be an a.s. convergent series.
proof:
$$X_n = \sum_{i=1}^n \varepsilon_i = \sum_{i=1}^n \varepsilon_i I_{[|\varepsilon_i|>C]} + \sum_{i=1}^n\{\varepsilon_i I_{[|\varepsilon_i|\le C]} - E(\varepsilon_i I_{[|\varepsilon_i|\le C]} \mid \mathcal{F}_{i-1})\} + \sum_{i=1}^n E(\varepsilon_i I_{[|\varepsilon_i|\le C]} \mid \mathcal{F}_{i-1}) = I_{1n} + I_{2n} + I_{3n}.$$
Let $\Omega_0 = \{\text{(i), (ii) and (iii) hold}\}$. By (i) and the conditional Borel–Cantelli lemma,
$$\sum_{i=1}^\infty I_{[|\varepsilon_i|>C]} < \infty \text{ a.s. on } \Omega_0,$$
$$E(Z_i^2 \mid \mathcal{F}_{i-1}) = E(\varepsilon_i^2 I_{[|\varepsilon_i|\le C]} \mid \mathcal{F}_{i-1}) - E^2(\varepsilon_i I_{[|\varepsilon_i|\le C]} \mid \mathcal{F}_{i-1}).$$
The convergence of $I_{3n}$ follows from (ii).
Theorem 1.19 (Chow)
Let $\{X_n = \sum_{i=1}^n \varepsilon_i, \mathcal{F}_n\}$ be a martingale and $1 \le p \le 2$. Then $X_n$ converges a.s. on $\{\sum_{i=1}^\infty E(|\varepsilon_i|^p \mid \mathcal{F}_{i-1}) < \infty\}$.
proof: Let $C > 0$.
(i) $P[|\varepsilon_i| > C \mid \mathcal{F}_{i-1}] \le E(|\varepsilon_i|^p \mid \mathcal{F}_{i-1})/C^p$.
(ii)
$$\sum_{i=2}^\infty |E(\varepsilon_i I_{[|\varepsilon_i|\le C]} \mid \mathcal{F}_{i-1})| = \sum_{i=2}^\infty |E(\varepsilon_i I_{[|\varepsilon_i|>C]} \mid \mathcal{F}_{i-1})| \le \sum_{i=2}^\infty E(|\varepsilon_i| I_{[|\varepsilon_i|>C]} \mid \mathcal{F}_{i-1}) \le \sum_{i=2}^\infty E(|\varepsilon_i|^p \mid \mathcal{F}_{i-1})/C^{p-1}$$
(iii)
$$E\{\varepsilon_i^2 I_{[|\varepsilon_i|\le C]} \mid \mathcal{F}_{i-1}\} \le E\{|\varepsilon_i|^p C^{2-p} \mid \mathcal{F}_{i-1}\} \le C^{2-p} E\{|\varepsilon_i|^p \mid \mathcal{F}_{i-1}\}.$$
New proof: $\tau = \inf\{n : \sum_{i=1}^{n+1} E(|\varepsilon_i|^p \mid \mathcal{F}_{i-1}) > K\}$, $1 < p \le 2$.
$$E|X_{\tau\wedge n}|^p = E\Big|\sum_{i=1}^n I_{[\tau\ge i]}\varepsilon_i\Big|^p \le C_p\,E\Big(\sum_{i=1}^n I_{[\tau\ge i]}\varepsilon_i^2\Big)^{p/2} \le C_p\,E\Big\{\sum_{i=1}^n I_{[\tau\ge i]}|\varepsilon_i|^p\Big\} = C_p\,E\Big\{\sum_{i=1}^{n\wedge\tau} E(|\varepsilon_i|^p \mid \mathcal{F}_{i-1})\Big\} \le K C_p.$$
When $p = 1$,
$$E|X_{\tau\wedge n}| \le E\sum_{i=1}^n I_{[\tau\ge i]}|\varepsilon_i| = E\sum_{i=1}^{n\wedge\tau} E(|\varepsilon_i| \mid \mathcal{F}_{i-1}) \le K.$$
$y_i = \beta x_i + \varepsilon_i$, the $\varepsilon_i$ i.i.d. with $E\varepsilon_i = 0$ and $\operatorname{Var}(\varepsilon_i) = \sigma^2$; $x_i$ is $\mathcal{F}_{i-1} = \sigma(\varepsilon_1,\cdots,\varepsilon_{i-1})$–measurable.
$$\hat\beta_n = \beta + \frac{\sum_{i=1}^n x_i\varepsilon_i}{\sum_{i=1}^n x_i^2} \text{ converges a.s. to } \beta + \frac{\sum_{i=1}^\infty x_i\varepsilon_i}{\sum_{i=1}^\infty x_i^2} \text{ on } \Big\{\sum_{i=1}^\infty x_i^2 < \infty\Big\}.$$
Chow's Theorem:
$$\sum_{i=1}^n \varepsilon_i \text{ converges a.s. on } \Big\{\sum_{1}^\infty E(|\varepsilon_i|^p \mid \mathcal{F}_{i-1}) < \infty\Big\}, \text{ where } 1 \le p \le 2.$$
Special case:
pf: Take $x_i = \frac{1}{u_i}$. Then $\sum_{1}^\infty \frac{1}{u_i}\varepsilon_i$ converges a.s. by the previous corollary. In view of Kronecker's Lemma,
$$\frac{\sum_{i=1}^n \varepsilon_i}{u_n} \longrightarrow 0 \quad\text{when } u_n \uparrow \infty.$$
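The Kronecker step can be seen numerically: $\sum \varepsilon_i/i$ converges a.s. (Chow with $p = 2$), hence $\frac1n\sum_{i=1}^n\varepsilon_i \to 0$. A minimal sketch with $u_n = n$ (the sample size is an illustrative choice):

```python
import random

random.seed(4)

n = 200000
partial = 0.0       # sum of eps_i / i  (converges a.s.)
total = 0.0         # sum of eps_i
for i in range(1, n + 1):
    e = random.gauss(0.0, 1.0)
    partial += e / i
    total += e

print(partial, total / n)   # total/n should be near 0
```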
un
R∞
Pn : Let2 f : [0, ∞) → (0, ∞) be an increasing fun. s.t.
Corollary 0
f −2 (t)dt < ∞ . Let
2
sn = i=1 E(εi | Fi−1 )Fn−1 measurable. Then
n n
X εi X
2
converges a.s., εi = 0(f (s2n )) a.s.
i=1
f (si ) i=1
on {s2n → ∞} where lim f (t) = ∞.
t→∞
pf:
$$\sum_{i=1}^\infty E\Big[\Big(\frac{\varepsilon_i}{f(s_i^2)}\Big)^2 \,\Big|\, \mathcal{F}_{i-1}\Big] = \sum_{i=1}^\infty \frac{E(\varepsilon_i^2 \mid \mathcal{F}_{i-1})}{f^2(s_i^2)} = \sum_{i=1}^\infty \frac{s_i^2 - s_{i-1}^2}{f^2(s_i^2)} \le \sum_{i=1}^\infty \int_{s_{i-1}^2}^{s_i^2} \frac{1}{f^2(t)}\,dt \le \int_0^\infty \frac{1}{f^2(t)}\,dt < \infty.$$
Remark:
$$f(t) = \begin{cases} t^{1/2}(\log t)^{\frac{1+\delta}{2}}, & \delta > 0,\ t \ge 2 \\ f(2), & \text{o.w.} \end{cases} \qquad\text{or } f(t) = t.$$
For this, we have that
$$s_\infty^2 = \sum_{i=1}^\infty x_i^2\, E(\varepsilon_i^2 \mid \mathcal{F}_{i-1}),$$
$$\sum_{i=1}^n x_i\varepsilon_i = o\Big(\sum_{i=1}^n x_i^2\, E(\varepsilon_i^2 \mid \mathcal{F}_{i-1})\Big) \quad\text{on } \Big[\sum_{i=1}^\infty x_i^2\, E(\varepsilon_i^2 \mid \mathcal{F}_{i-1}) = \infty\Big].$$
If we assume that $\sup_i E(\varepsilon_i^2 \mid \mathcal{F}_{i-1}) < \infty$,
$$\sum_{i=1}^n x_i\varepsilon_i = o\Big(\sum_{i=1}^n x_i^2\Big) \text{ on } \Big\{\sum_{i=1}^\infty x_i^2 = \infty\Big\},$$
$$\sum_{i=1}^n x_i\varepsilon_i = \begin{cases} O(1) & \text{on } \{\sum_1^\infty x_i^2 < \infty\} \\ o(\sum_1^n x_i^2) & \text{on } \{\sum_1^\infty x_i^2 = \infty\}. \end{cases}$$
Example: $y_i = \beta x_i + \varepsilon_i$, where $\{\varepsilon_i, \mathcal{F}_i\}$ is a martingale difference sequence (with, say, $\sup_i E(\varepsilon_i^2 \mid \mathcal{F}_{i-1}) < \infty$ a.s.); then $\hat\beta_n$ converges a.s., and the limit is $\beta$ on $\{\sum_1^\infty x_i^2 = \infty\}$.
pf:
On $\{\sum_1^\infty x_i^2 < \infty\}$, $\sum_1^n x_i\varepsilon_i$ converges, so that
$$\hat\beta_n \to \beta + \frac{\sum_{i=1}^\infty x_i\varepsilon_i}{\sum_{i=1}^\infty x_i^2}.$$
On $\{\sum_1^\infty x_i^2 = \infty\}$,
$$\frac{\sum_{i=1}^n x_i\varepsilon_i}{\sum_{i=1}^n x_i^2} \longrightarrow 0 \text{ as } n \to \infty,$$
so that $\hat\beta_n \longrightarrow \beta$.
Application (control):
$y_i = \beta x_i + \varepsilon_i$ $(\beta \neq 0)$, where the $\varepsilon_i$ are i.i.d. with $E(\varepsilon_i) = 0$, $\operatorname{Var}(\varepsilon_i) = \sigma^2$.
Goal: design $x_i$, which depends on previous observations, so that $y \simeq y^* \neq 0$.
Strategy: choose $x_1$ arbitrary, and set
$$x_{n+1} = \frac{y^*}{\hat\beta_n}.$$
Question: $x_n \to \frac{y^*}{\beta}$ a.s.? or $\hat\beta_n \to \beta$ a.s.?
Then $x_{n+1}^2 = \frac{(y^*)^2}{\hat\beta_n^2}$ is bounded away from zero and $\sum_1^\infty x_{n+1}^2 = \infty$ a.s. Therefore $\hat\beta_n \to \beta$ a.s.
Open Question:
Is there a corresponding result for
$$y_i = \alpha + \beta x_i + \varepsilon_i \quad\text{or}\quad y_i = \alpha y_{i-1} + \beta x_i + \varepsilon_i\,?$$
Open Questions:
Assume that $\sum_1^\infty |x_i|^p < \infty$ a.s. and $\sup_n E(|\varepsilon_n|^p \mid \mathcal{F}_{n-1}) < \infty$ a.s. for some $1 \le p \le 2$.
What are the distribution properties of $S = \sum_1^\infty x_i\varepsilon_i$?
$x_i$ are constants, $x_i \neq 0$ i.o. $\Rightarrow S$ has a continuous distribution.
$p = 2$, $\liminf_{n\to\infty} E(|\varepsilon_n| \mid \mathcal{F}_{n-2}) > 0$ a.s.
Almost Supermartingale
Theorem (Robbins and Siegmund)
Let $\{\mathcal{F}_n\}$ be a sequence of increasing $\sigma$–fields and $x_n, \beta_n, y_n, z_n$ nonnegative $\mathcal{F}_n$–measurable random variables such that
$$E(x_{n+1} \mid \mathcal{F}_n) \le x_n(1+\beta_n) + y_n - z_n.$$
Then on
$$\Big\{\sum_{i=1}^\infty \beta_i < \infty,\ \sum_{i=1}^\infty y_i < \infty\Big\},$$
$x_n$ converges and $\sum_1^\infty z_i < \infty$ a.s.
Set
$$x_n' = x_n\prod_{i=1}^{n-1}(1+\beta_i)^{-1}, \qquad y_n' = y_n\prod_{i=1}^n(1+\beta_i)^{-1}, \qquad z_n' = z_n\prod_{i=1}^n(1+\beta_i)^{-1}.$$
Then
$$E(x_{n+1}' \mid \mathcal{F}_n) = E(x_{n+1} \mid \mathcal{F}_n)\prod_{i=1}^n(1+\beta_i)^{-1} \le [x_n(1+\beta_n) + y_n - z_n]\prod_{i=1}^n(1+\beta_i)^{-1} = x_n' + y_n' - z_n'.$$
On $\{\sum_{i=1}^\infty \beta_i < \infty\}$, $\prod_{i=1}^n(1+\beta_i)^{-1}$ converges to a nonzero limit.
Therefore,
(i) $\sum y_i < \infty \Longleftrightarrow \sum y_i' < \infty$
(ii) $x_n$ converges $\Longleftrightarrow x_n'$ converges
(iii) $\sum z_i < \infty \Longleftrightarrow \sum z_i' < \infty$
2° Assume that $\beta_n = 0$, $\forall n$:
$$E(x_{n+1} \mid \mathcal{F}_n) \le x_n + y_n - z_n.$$
Let
$$u_n = x_n - \sum_{1}^{n-1}(y_i - z_i) = x_n + \sum_{1}^{n-1} z_i - \sum_{1}^{n-1} y_i.$$
Then
$$E(u_{n+1} \mid \mathcal{F}_n) = E(x_{n+1} \mid \mathcal{F}_n) - \sum_{1}^n(y_i - z_i) \le x_n + y_n - z_n - \sum_{1}^n(y_i - z_i) = x_n - \sum_{1}^{n-1}(y_i - z_i) = u_n.$$
So that $x_n + \sum_{1}^{n-1} z_i$ converges a.s. on $\{\sum_{1}^\infty y_i < \infty\}$, so that $\sum_{1}^n z_i$ converges and so does $x_n$.
Example : Find the quantile
Condition (1): $\sum a_n^2 < \infty$. Then $X_n$ converges a.s. and $\sum Z_i < \infty$ a.s.
Condition (2): $\sum a_n = \infty$. $X_n$ converges to $X$,
$$\sum Z_i = 2\beta\sum a_{i+1}X_i < \infty \Rightarrow X = 0 \text{ a.s.}$$
Remark: Assume $\sum a_i < \infty$.
When $\sum a_j < \infty$, $C_n = \prod_{j=1}^n(1 - a_j\beta)$ converges to $C > 0$, so that
$$x_n - x^* \to C\Big[(x_1 - x^*) - \sum_{j=1}^\infty C_j^{-1}a_j\varepsilon_j\Big].$$
Note that $\sum_{j=1}^\infty (C_j^{-1}a_j)^2 < \infty$ and $C_j^{-1}a_j > 0$ $\forall j$, so that $\sum_{j=1}^\infty C_j^{-1}a_j\varepsilon_j$ has a continuous distribution.
Classical CLT:
Assume that $\forall n$, $X_{n,i}$, $1 \le i \le k_n$, are independent with $E X_{n,i} = 0$. Let
$$s_n^2 = \sum_{i=1}^{k_n} E X_{n,i}^2.$$
Thm. If $\forall\,\varepsilon > 0$,
$$\sum_{i=1}^{k_n} \frac{1}{s_n^2} E\big(X_{n,i}^2 I_{[|X_{n,i}|>s_n\varepsilon]}\big) \to 0,$$
then
$$\sum_{i=1}^{k_n} \frac{X_{n,i}}{s_n} \stackrel{D}{\to} N(0,1).$$
* Reformulation: $\widetilde X_{n,i} = \dfrac{X_{n,i}}{s_n}$:
(i) $\sum_{i=1}^{k_n} E(\widetilde X_{n,i}^2) = 1$
(ii) $\sum_{i=1}^{k_n} E\big[\widetilde X_{n,i}^2 I_{[|\widetilde X_{n,i}|>\varepsilon]}\big] \to 0$ $\forall\,\varepsilon$
(ii) is Lindeberg's condition.
* uniform negligibility (How to use mathematics to formulate?):
$$\max_{1\le i\le k_n}|X_{n,i}| \stackrel{D}{\to} 0, \qquad \text{control } X_{n,i}^2.$$
* condition of variance.
To recall Burkholder's inequality: $\forall\, 1 < p < \infty$,
$$C_p'\,E\Big(\sum_{i=1}^n d_i^2\Big)^{p/2} \le E|S_n|^p \le C_p\,E\Big(\sum_{i=1}^n d_i^2\Big)^{p/2}, \qquad\text{i.e. } E Z^p \text{ with } Z = \Big(\sum d_i^2\Big)^{1/2}.$$
Formalize:
$$\sum_{i=1}^{k_n} X_{n,i}^2 \to \sum_{i=1}^{k_n} E(X_{n,i}^2 \mid \mathcal{F}_{n,i-1}), \qquad \sum_{i=1}^j X_{n,i}^2 \quad\text{vs}\quad \sum_{i=1}^j E(X_{n,i}^2 \mid \mathcal{F}_{n,i-1})$$
(optional quadratic variation vs. predictable quadratic variation).
Thm. $\forall\, n \ge 1$, $\{\mathcal{F}_{n,j} ; 1 \le j \le k_n < \infty\}$ is a sequence of increasing $\sigma$–fields. Let $S_{n,j} = \sum_{i=1}^j X_{n,i}$, $1 \le j \le k_n$, be $\{\mathcal{F}_{n,j}\}$–adapted.
Define
$$X_n^* = \max_{1\le i\le k_n}|X_{n,i}|, \qquad U_{n,j}^2 = \sum_{i=1}^j X_{n,i}^2, \quad 1 \le j \le k_n.$$
Assume that
(i) $U_n^2 = U_{n,k_n}^2 = \sum_{i=1}^{k_n} X_{n,i}^2 \stackrel{D}{\to} C_o$, where $C_o > 0$ is a constant.
(ii) $X_n^* \stackrel{D}{\to} 0$
(iii) $\sup_{n\ge1} E(X_n^*)^2 < \infty$
(iv) $\sum_{j=1}^{k_n} E\{X_{n,j} \mid \mathcal{F}_{n,j-1}\} \stackrel{D}{\to} 0$ and $\sum_{j=1}^{k_n} E^2\{X_{n,j} \mid \mathcal{F}_{n,j-1}\} \stackrel{D}{\to} 0$
Then
$$S_n = \sum_{i=1}^{k_n} X_{n,i} = S_{n,k_n} \stackrel{D}{\to} N(0, C_o).$$
Remark: $\{X_{n,j}, 1 \le j \le k_n\}$ can be defined on different probability spaces for different $n$.
Step 1. Reduce the problem to the case where $\{S_{n,j}, \mathcal{F}_{n,j}, 1 \le j \le k_n\}$ is a martingale. Set
$$\widetilde X_{n,j} = X_{n,j} - E(X_{n,j} \mid \mathcal{F}_{n,j-1}), \quad 1 \le j \le k_n, \quad \mathcal{F}_{n,o}: \text{trivial field},$$
$$\widetilde U_n^2 = \sum_{j=1}^{k_n} \widetilde X_{n,j}^2, \qquad \widetilde X_n^* = \max_{1\le j\le k_n}|\widetilde X_{n,j}|, \qquad \widetilde S_n = \sum_{j=1}^{k_n} \widetilde X_{n,j}.$$
(a) $S_n - \widetilde S_n = \sum_{j=1}^{k_n} E(X_{n,j} \mid \mathcal{F}_{n,j-1}) \stackrel{D}{\to} 0$ by (iv).
(b) $\widetilde X_n^* \le \max_{1\le j\le k_n}|X_{n,j}| + \max_{1\le j\le k_n}|E(X_{n,j} \mid \mathcal{F}_{n,j-1})| \le \max_{1\le j\le k_n}|X_{n,j}| + \Big\{\sum_{j=1}^{k_n} E^2(X_{n,j} \mid \mathcal{F}_{n,j-1})\Big\}^{1/2}$,
so that $\widetilde X_n^* \stackrel{D}{\to} 0$ by (ii) and (iv).
$$(\widetilde X_n^*)^2 \le 2(X_n^*)^2 + 2\max_{1\le j\le k_n} E^2(X_{n,j} \mid \mathcal{F}_{n,j-1}) \le 2(X_n^*)^2 + 2\max_{1\le j\le k_n} E^2(X_n^* \mid \mathcal{F}_{n,j-1}),$$
so that
$$E(\widetilde X_n^*)^2 \le 2E(X_n^*)^2 + 2\times4\,E(X_n^*)^2 = 10\,E(X_n^*)^2 < \infty.$$
$$\widetilde U_n^2 - U_n^2 = \sum_{j=1}^{k_n} E^2(X_{n,j} \mid \mathcal{F}_{n,j-1}) - 2\sum_{j=1}^{k_n} X_{n,j}E(X_{n,j} \mid \mathcal{F}_{n,j-1}) \stackrel{D}{\to} 0:$$
$$\sum_{j=1}^{k_n} E^2(X_{n,j} \mid \mathcal{F}_{n,j-1}) \stackrel{D}{\to} 0 \text{ by (iv)}, \qquad \sum_{j=1}^{k_n} X_{n,j}E(X_{n,j} \mid \mathcal{F}_{n,j-1}) \stackrel{D}{\to} 0.$$
Because
$$\Big|\sum_{j=1}^{k_n} X_{n,j}E(X_{n,j} \mid \mathcal{F}_{n,j-1})\Big| \le \Big(\sum_{j=1}^{k_n} X_{n,j}^2\Big)^{1/2}\Big(\sum_{j=1}^{k_n} E^2(X_{n,j} \mid \mathcal{F}_{n,j-1})\Big)^{1/2} \stackrel{D}{\to} 0,$$
where
$$\Big(\sum_{j=1}^{k_n} X_{n,j}^2\Big)^{1/2} = (U_n^2)^{1/2} \stackrel{D}{\to} C_o^{1/2}, \qquad \Big(\sum_{j=1}^{k_n} E^2(X_{n,j} \mid \mathcal{F}_{n,j-1})\Big)^{1/2} \stackrel{D}{\to} 0,$$
so that $\widetilde U_n^2 \stackrel{D}{\to} C_o$. Then
$$S_n = \sum_{i=1}^{k_n} X_{n,i} \stackrel{D}{\to} N(0, C_o).$$
where $C > C_o$
$$\Rightarrow \widehat U_n^2 \stackrel{D}{\to} C_o.$$
Clearly, $\widehat X_n^* \le X_n^*$. Therefore $\widehat X_n^* \stackrel{D}{\to} 0$ by (ii), and
$$\sup_{n\ge1} E(\widehat X_n^*)^2 \le \sup_{n\ge1} E(X_n^*)^2 < \infty.$$
Reason: Step 3 $\Rightarrow E e^{iS_n} \to e^{-C_o/2}$.
Now replace $S_n$ by $tS_n$. Using step 3 again, we obtain $E e^{itS_n} \to e^{-t^2 C_o/2}$.
(a) Expansion:
$$e^{ix} = (1+ix)e^{(-x^2/2)+r(x)}, \quad\text{where } |r(x)| \le |x|^3 \text{ for } |x| < 1.$$
Because $|x| < 1$,
$$ix = [\log(1+ix)] - x^2/2 + r(x) \ \Rightarrow\ r(x) = \frac{x^2}{2} + ix - \log(1+ix) = \frac{x^2}{2} + ix - \Big[\sum_{j=1}^\infty(-1)^{j+1}\frac{(ix)^j}{j}\Big] = \sum_{j=3}^\infty(-1)^j\frac{(ix)^j}{j} = -\frac{(ix)^3}{3} + \frac{(ix)^4}{4} - \cdots = x^4 a(x) + x^3 b(x)\,i,$$
where
$$a(x) = \frac14 - \frac{x^2}{6} + \frac{x^4}{8} - \cdots < \frac14, \qquad b(x) = \frac13 - \frac{x^2}{5} + \frac{x^4}{7} - \cdots < \frac13,$$
$$|r(x)| = \sqrt{x^8 a^2(x) + x^6 b^2(x)} \le \sqrt{\frac{x^8}{16} + \frac{x^6}{9}} \le |x|^3\sqrt{\frac1{16}+\frac19} \le |x|^3.$$
$$e^{i\widehat S_n} = \prod_{j=1}^{k_n} e^{i\widehat X_{n,j}} = \Big[\prod_{j=1}^{k_n}(1+i\widehat X_{n,j})\Big]\,e^{-\sum_{j=1}^{k_n}\widehat X_{n,j}^2/2 + \sum_{j=1}^{k_n} r(\widehat X_{n,j})} \stackrel{\text{def}}{=} T_n\,e^{-\widehat U_n^2/2 + R_n} = (T_n-1)e^{-C_o/2} + (T_n-1)\big[e^{-\widehat U_n^2/2+R_n} - e^{-C_o/2}\big] + e^{-\widehat U_n^2/2+R_n} = I_n + II_n + III_n.$$
Note that on $\{\widehat X_n^* < 1\}$,
$$|R_n| \le \sum_{j=1}^{k_n}|r(\widehat X_{n,j})| \le \sum_{j=1}^{k_n}|\widehat X_{n,j}|^3 \le \widehat X_n^*\sum_{j=1}^{k_n}\widehat X_{n,j}^2 = \widehat X_n^*\,\widehat U_n^2 \stackrel{D}{\to} 0\cdot C_o$$
$$\Rightarrow R_n \stackrel{D}{\to} 0,$$
so that $III_n \stackrel{D}{\to} e^{-C_o/2}$.
Now
$$E|T_n|^2 = E\Big\{\prod_{j=1}^{k_n}(1+\widehat X_{n,j}^2)\Big\} = E(1+\widehat X_{n,\tau}^2)\prod_{j<\tau}(1+\widehat X_{n,j}^2) \le E(1+\widehat X_n^{*2})\,e^{\sum_{j=1}^{\tau-1}\widehat X_{n,j}^2} \le e^C E(1+\widehat X_n^{*2}) < \infty,$$
$$|II_n| = |T_n-1|\,|III_n - e^{-C_o/2}| \stackrel{D}{\to} 0, \qquad |T_n-1| = O_p(1).$$
$$E(I_n) = e^{-C_o/2}[E(T_n)-1] = 0, \qquad E(T_n) = E\Big\{\prod_{j=1}^{k_n}(1+i\widehat X_{n,j})\Big\} = 1,$$
$$e^{i\widehat S_n} - I_n = II_n + III_n \stackrel{D}{\to} e^{-C_o/2}.$$
Note:
$$\forall n, \quad \Big\{S_{n,j} = \sum_{i=1}^j X_{n,i},\ \mathcal{F}_{n,j}\Big\} \text{ is a martingale,}$$
(i) $U_n^2 = \sum_{i=1}^{k_n} X_{n,i}^2 \stackrel{D}{\to} C > 0$
(ii) $\max_{1\le i\le k_n}|X_{n,i}| \stackrel{D}{\to} 0$
Lemma 1. Assume that $\mathcal{F}_o \subset \mathcal{F}_1 \subset \cdots \subset \mathcal{F}_n$ and $A_i \in \mathcal{F}_i$. Then $\forall\,\varepsilon > 0$,
$$P\Big(\bigcup_{i=1}^n A_i\Big) \le \varepsilon + P\Big\{\sum_{j=1}^n P(A_j \mid \mathcal{F}_{j-1}) > \varepsilon\Big\}.$$
pf: Let $\mu_k = \sum_{j=1}^k P(A_j \mid \mathcal{F}_{j-1})$; then $\mu_k$ is $\mathcal{F}_{k-1}$–measurable.
So that
$$P\Big(\bigcup_{i=1}^n A_i \cap [\mu_n \le \varepsilon]\Big) \le \sum_{i=1}^n P(A_i \cap [\mu_n \le \varepsilon]) \le \sum_{i=1}^n P(A_i \cap [\mu_i \le \varepsilon]) = \sum_{i=1}^n E\,E(I_{A_i} I_{[\mu_i\le\varepsilon]} \mid \mathcal{F}_{i-1}) = \sum_{i=1}^n E\big[E(I_{A_i} \mid \mathcal{F}_{i-1})\,I_{[\mu_i\le\varepsilon]}\big] \le \varepsilon.$$
Lemma: $Z_j \ge 0$, $\mu_j = \sum_{i=1}^j E(Z_i \mid \mathcal{F}_{i-1})$. Then
$$E\sum_{i=1}^n Z_i I_{[\mu_i\le\varepsilon]} = E\sum_{i=1}^n E(Z_i \mid \mathcal{F}_{i-1})\,I_{[\mu_i\le\varepsilon]} \le \varepsilon.$$
Corollary. Assume that $Y_{n,j} \ge 0$ a.s. and $\mathcal{F}_{n,1} \subset \cdots \subset \mathcal{F}_{n,k_n}$. Then
$$\sum_{j=1}^{k_n} P(Y_{n,j} > \varepsilon \mid \mathcal{F}_{n,j-1}) \stackrel{D}{\to} 0\ \forall\,\varepsilon \quad\Rightarrow\quad \max_{1\le j\le k_n} Y_{n,j} \stackrel{D}{\to} 0.$$
Remark: $\sum_{j=1}^{k_n} E[Y_{n,j} I_{[Y_{n,j}>\varepsilon]} \mid \mathcal{F}_{n,j-1}] \stackrel{D}{\to} 0$ is sufficient.
pf: Let $Y_n^* = \max_{1\le j\le k_n} Y_{n,j}$.
Lemma 2 (with $U_{n,j} = \sum_{i=1}^j Y_{n,i}$ and $V_{n,j} = \sum_{i=1}^j E(Y_{n,i} \mid \mathcal{F}_{n,i-1})$, $V_n = V_{n,k_n}$): then
$$\max_{1\le j\le k_n} |U_{n,j} - V_{n,j}| \stackrel{D}{\to} 0.$$
pf: By the previous corollary, $Y_n^* \stackrel{D}{\to} 0$.
Let $Y_{n,j}' = Y_{n,j} I_{[Y_{n,j}\le\delta,\ V_{n,j}\le\lambda]}$.
Define $U_{n,j}'$, $V_{n,j}'$, $U_n'$, $V_n'$ similarly. Then
$$P\Big[\max_{1\le j\le k_n}|U_{n,j}-V_{n,j}| > 3\gamma\Big] \le P\Big[\max_{1\le j\le k_n}|U_{n,j}-U_{n,j}'| > \gamma\Big] + P\Big[\max_{1\le j\le k_n}|U_{n,j}'-V_{n,j}'| > \gamma\Big] + P\Big[\max_{1\le j\le k_n}|V_{n,j}'-V_{n,j}| > \gamma\Big] \stackrel{\text{def}}{\equiv} I_n + II_n + III_n.$$
(3) Note that
$$\max_{1\le j\le k_n}|V_{n,j}'-V_{n,j}| \le \max_{1\le j\le k_n}\Big|\sum_{i=1}^j\big(E(Y_{n,i} \mid \mathcal{F}_{n,i-1}) - E(Y_{n,i}' \mid \mathcal{F}_{n,i-1})\big)\Big| \le \sum_{i=1}^{k_n} E(|Y_{n,i}-Y_{n,i}'| \mid \mathcal{F}_{n,i-1}) \le \sum_{j=1}^{k_n} E(Y_{n,j} I_{[Y_{n,j}>\delta\ \text{or}\ V_{n,j}>\lambda]} \mid \mathcal{F}_{n,j-1}) \le \sum_{j=1}^{k_n} E(Y_{n,j} I_{[Y_{n,j}>\delta]} \mid \mathcal{F}_{n,j-1}) + \sum_{j=1}^{k_n} E(Y_{n,j} \mid \mathcal{F}_{n,j-1})\,I_{[V_{n,j}>\lambda]} \le \sum_{j=1}^{k_n} E(Y_{n,j} I_{[Y_{n,j}>\delta]} \mid \mathcal{F}_{n,j-1}) + V_n I_{[V_n>\lambda]},$$
$$III_n \le P\Big[\sum_{j=1}^{k_n} E(Y_{n,j} I_{[Y_{n,j}>\delta]} \mid \mathcal{F}_{n,j-1}) > \frac\gamma2\Big] + P\Big[V_n I_{[V_n>\lambda]} > \frac\gamma2\Big] \le P\Big[\sum_{j=1}^{k_n} E(Y_{n,j} I_{[Y_{n,j}>\delta]} \mid \mathcal{F}_{n,j-1}) > \frac\gamma2\Big] + P[V_n > \lambda].$$
So that
$$\limsup_{n\to\infty} P\Big[\max_{1\le j\le k_n}|U_{n,j}-V_{n,j}| > 3\gamma\Big] \le 2\sup_n P[V_n > \lambda] + \frac{4\delta\lambda}{\gamma^2}.$$
Let $\lambda \to \infty$, $\delta = \frac{1}{\lambda^2}$. The proof is completed.
Thm. $\forall n$, $\{S_{n,j} = \sum_{i=1}^j X_{n,i},\ \mathcal{F}_{n,j}\}$ is a martingale.
If (i) $V_n^2 = \sum_{i=1}^{k_n} E(X_{n,i}^2 \mid \mathcal{F}_{n,i-1}) \stackrel{D}{\to} C > 0$
and (ii) $\sum_{i=1}^{k_n} E(X_{n,i}^2 I_{[X_{n,i}^2>\varepsilon]} \mid \mathcal{F}_{n,i-1}) \stackrel{D}{\to} 0$ (conditional Lindeberg's condition),
then $S_n = \sum_{i=1}^{k_n} X_{n,i} \stackrel{D}{\to} N(0, C)$.
pf: Set $Y_{n,j} = X_{n,j}^2$. By (ii) and Lemma 1, $Y_n^* = \max_{1\le j\le k_n} X_{n,j}^2 \stackrel{D}{\to} 0$, or $\max_{1\le j\le k_n}|X_{n,j}| \stackrel{D}{\to} 0$.
By (i), $\{V_n^2\}$ is tight. Therefore, by (ii) and Lemma 2,
$$V_n^2 - U_n^2 \stackrel{D}{\to} 0, \text{ so that } U_n^2 \stackrel{D}{\to} C \text{ by (i).}$$
Now define
$$X_{n,j}' = X_{n,j}\, I_{\big[\sum_{i=1}^j E(X_{n,i}^2 I_{[X_{n,i}^2>\varepsilon]} \mid \mathcal{F}_{n,i-1}) \le 1\big]}.$$
Since
$$P[S_n \neq S_n'] \le P\Big[\sum_{j=1}^{k_n} E(X_{n,j}^2 I_{[X_{n,j}^2>\varepsilon]} \mid \mathcal{F}_{n,j-1}) > 1\Big] \to 0,$$
it is sufficient to show that $S_n' \stackrel{D}{\to} N(0, C)$.
(a) $\max_{1\le j\le k_n}|X_{n,j}'| \le X_n^* \stackrel{D}{\to} 0$
(b) $P[U_n'^2 \neq U_n^2] \le P\big[\sum_{j=1}^{k_n} E(X_{n,j}^2 I_{[X_{n,j}^2>\varepsilon]} \mid \mathcal{F}_{n,j-1}) > 1\big] \to 0$, so that $U_n'^2 \stackrel{D}{\to} C$.
(c)
$$E\max_{1\le j\le k_n}(X_{n,j}')^2 \le E\max_{1\le j\le k_n}(X_{n,j}')^2 I_{[(X_{n,j}')^2\le\varepsilon]} + E\max_{1\le j\le k_n}(X_{n,j}')^2 I_{[(X_{n,j}')^2>\varepsilon]} \le \varepsilon + E\sum_{j=1}^{k_n}(X_{n,j}')^2 I_{[(X_{n,j}')^2>\varepsilon]} = \varepsilon + E\sum_{j=1}^{k_n} X_{n,j}^2 I_{[X_{n,j}^2>\varepsilon]}\, I_{\big[\sum_{i=1}^j E(X_{n,i}^2 I_{[X_{n,i}^2>\varepsilon]} \mid \mathcal{F}_{n,i-1}) \le 1\big]} \le \varepsilon + 1 < \infty.$$
Thm. Let $\{S_{n,i} = \sum_{j=1}^i X_{n,j},\ \mathcal{F}_{n,i},\ 1 \le i \le k_n\}$ be a martingale s.t.
(i) $\sum_{i=1}^{k_n} E(X_{n,i}^2 \mid \mathcal{F}_{n,i-1}) \stackrel{D}{\to} C > 0$
and
(ii) $A_n = \sum_{i=1}^{k_n} E(X_{n,i}^2 I_{[X_{n,i}^2>\varepsilon]} \mid \mathcal{F}_{n,i-1}) \stackrel{D}{\to} 0$ $\forall\,\varepsilon$.
Then $S_n = \sum_{i=1}^{k_n} X_{n,i} \stackrel{D}{\to} N(0, C)$.
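The conditional-Lindeberg CLT can be illustrated with a martingale whose increments depend on the past. In the sketch below the sign $s_i$ is $\mathcal{F}_{i-1}$-measurable, $E(X_i \mid \mathcal{F}_{i-1}) = 0$, and $\sum_i E(X_i^2 \mid \mathcal{F}_{i-1}) = 1$, so condition (i) holds with $C = 1$ and (ii) holds trivially for large $n$ (the construction itself is hypothetical, chosen only for illustration):

```python
import random

random.seed(6)

def martingale_sum(n):
    """S = sum X_i, X_i = s_i * eps_i / sqrt(n), eps_i = +-1 fair;
    s_i is a sign chosen from the past partial sum (F_{i-1}-measurable)."""
    s, total = 1.0, 0.0
    for _ in range(n):
        eps = random.choice((-1.0, 1.0))
        total += s * eps / n ** 0.5
        s = 1.0 if total >= 0 else -1.0
    return total

reps, n = 10000, 300
samples = [martingale_sum(n) for _ in range(reps)]
mean = sum(samples) / reps
var = sum(v * v for v in samples) / reps
print(mean, var)
```

Despite the dependence, the sample mean should be near 0 and the sample variance near $C = 1$, consistent with $S_n \stackrel{D}{\to} N(0, 1)$.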
Both are sufficient since An ≥ 0 and Bn ≥ 0
Example: $y_i = \beta x_i + \varepsilon_i$, $i = 1, 2, \cdots$
$$\hat\beta_n = \frac{\sum_{i=1}^n x_i y_i}{\sum_{i=1}^n x_i^2} = \beta + \frac{\sum_{i=1}^n x_i\varepsilon_i}{\sum_{i=1}^n x_i^2}$$
Assumptions:
(1) $\exists\, a_n > 0$ s.t. $a_n \uparrow \infty$, $\frac{a_n}{a_{n+1}} \to 1$ and $\sum_{i=1}^n x_i^2/a_n \to 1$ a.s.
(2) $\varepsilon_i$ i.i.d., $E(\varepsilon_i) = 0$, $\operatorname{Var}(\varepsilon_i) = \sigma^2$
(3) $x_i$ is $\mathcal{F}_i = \sigma(x_o, \varepsilon_1,\cdots,\varepsilon_{i-1})$–measurable
(a) If $E|\varepsilon_1|^{2+\delta} < \infty$ then
$$\sqrt{a_n}(\hat\beta_n - \beta) \stackrel{D}{\to} N(0, \sigma^2)$$
(b) If $(x_i, \varepsilon_i)$ are identically distributed with $E(X_i^2) < \infty$ and $a_n = n$, then
$$\sqrt{n}(\hat\beta_n - \beta) \stackrel{D}{\to} N(0, \sigma^2)$$
Consider $S_n = \frac{\sum_{i=1}^n x_i\varepsilon_i}{\sqrt{a_n}}$, i.e. $X_{n,i} = \frac{x_i\varepsilon_i}{\sqrt{a_n}}$, $k_n = n$.
(1)
$$\sum_{i=1}^{k_n} E(X_{n,i}^2 \mid \mathcal{F}_{n,i-1}) = \sum_{i=1}^n \frac{x_i^2}{a_n}\,E(\varepsilon_i^2) = \sigma^2\,\frac{\sum_{i=1}^n x_i^2}{a_n} \stackrel{a.s.}{\to} \sigma^2$$
(a)
$$\sum_{i=1}^n E(|X_{n,i}|^{2+\delta} \mid \mathcal{F}_{n,i-1}) = \sum_{i=1}^n \Big|\frac{x_i}{\sqrt{a_n}}\Big|^{2+\delta}(E|\varepsilon_1|^{2+\delta}) \le \frac{\max_{1\le i\le n}|x_i|^\delta}{a_n^{\delta/2}}\cdot\frac{\sum_{i=1}^n x_i^2}{a_n}\,E|\varepsilon_1|^{2+\delta} \stackrel{a.s.}{\to} 0,$$
since
$$\frac{x_n^2}{a_n} = \frac{\sum_{i=1}^n x_i^2}{a_n} - \frac{a_{n-1}}{a_n}\cdot\frac{\sum_{i=1}^{n-1} x_i^2}{a_{n-1}} \stackrel{a.s.}{\to} 0 \quad\Rightarrow\quad \frac{\max_{1\le i\le n}(x_i^2)}{a_n} \stackrel{a.s.}{\to} 0.$$
(b)
$$E\sum_{i=1}^n \frac{X_i^2\varepsilon_i^2}{n}\,I_{\big[\frac{X_i^2\varepsilon_i^2}{n}>\delta\big]} = \frac1n\sum_{i=1}^n E\big(X_1^2\varepsilon_1^2 I_{[X_1^2\varepsilon_1^2>n\delta]}\big) = E\big(X_1^2\varepsilon_1^2 I_{[X_1^2\varepsilon_1^2>n\delta]}\big) \stackrel{n\to\infty}{\longrightarrow} 0.$$
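Conclusion (a) can be checked by simulation with a design that genuinely depends on past errors. The concrete choice $x_i = 1 + 0.5\,\operatorname{sign}(\varepsilon_{i-1})$ — so that $a_n = 1.25\,n$ works, since $\sum x_i^2/(1.25n) \to 1$ a.s. — is hypothetical, for illustration only:

```python
import random

random.seed(7)

def experiment(n):
    """y_i = beta*x_i + eps_i with x_i F_{i-1}-measurable;
    returns sqrt(a_n) * (beta_hat - beta)."""
    beta = 1.0
    sxy = sxx = 0.0
    prev_sign = 1.0
    for _ in range(n):
        x = 1.0 + 0.5 * prev_sign          # depends only on the past
        eps = random.gauss(0.0, 1.0)
        y = beta * x + eps
        sxy += x * y
        sxx += x * x
        prev_sign = 1.0 if eps >= 0 else -1.0
    beta_hat = sxy / sxx
    a_n = 1.25 * n                          # sum x_i^2 / a_n -> 1 a.s.
    return (a_n ** 0.5) * (beta_hat - beta)

reps, n = 4000, 500
vals = [experiment(n) for _ in range(reps)]
mean = sum(vals) / reps
var = sum(v * v for v in vals) / reps
print(mean, var)
```

With $\sigma^2 = 1$ the normalized error $\sqrt{a_n}(\hat\beta_n - \beta)$ should have sample mean near 0 and sample variance near 1.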
Let $\{S_{n,i} = \sum_{j=1}^i X_{n,j},\ \mathcal{F}_{n,i},\ 1 \le i \le k_n\}$ be a martingale s.t.
(1) $\sum_{j=1}^{k_n} X_{n,j}^2 \stackrel{D}{\to} C > 0$
(2) $X_n^* = \max_{1\le i\le k_n}|X_{n,i}| \stackrel{D}{\to} 0$
Theorem 3.
(1) + (2) + $E(X_n^*) \to 0$ is sufficient.
(Note that (3) $\Rightarrow \{X_n^*\}$ is u.i., and (2) + u.i. $\Rightarrow \lim_{n\to\infty} E(X_n^*) = 0$.)
Theorem 3'.
(1) + (2) +
$$\sum_{j=1}^{k_n} |E(X_{n,j} I_{[|X_{n,j}|>1]} \mid \mathcal{F}_{n,j-1})| \stackrel{D}{\to} 0$$
is sufficient.
pf: Define
$$\tau_n = \begin{cases} \inf\{1 \le j \le k_n : Y_{n,j} > \varepsilon\} & \text{on } \bigcup_{j=1}^{k_n}[Y_{n,j} > \varepsilon] = [Y_n^* > \varepsilon] \\ k_n & \text{otherwise.} \end{cases}$$
$\forall\,\delta > 0$,
$$P\Big[\sum_{j=1}^{k_n} E(Y_{n,j} I_{[Y_{n,j}>\varepsilon]} \mid \mathcal{F}_{n,j-1}) > \delta\Big] \qquad (\mathcal{F}_{n,j-1}\text{–measurable}).$$
Therefore
$$\sum_{j=1}^{k_n} E(z_{n,j} I_{[z_{n,j}>\frac12]} \mid \mathcal{F}_{n,j-1}) \stackrel{D}{\to} 0,$$
$$= \sum_{j=1}^{k_n} E(I_{[Y_{n,j}>\varepsilon]} I_{[z_{n,j}=1]} \mid \mathcal{F}_{n,j-1}) = \sum_{j=1}^{k_n} E(I_{[Y_{n,j}>\varepsilon]} \mid \mathcal{F}_{n,j-1}) = \sum_{j=1}^{k_n} P(Y_{n,j} > \varepsilon \mid \mathcal{F}_{n,j-1}).$$
pf. of Theorem 3':
$$S_n = \sum_{i=1}^{k_n} X_{n,i} = \sum_{i=1}^{k_n} X_{n,i} I_{[|X_{n,i}|\le1]} + \sum_{i=1}^{k_n} X_{n,i} I_{[|X_{n,i}|>1]}.$$
Let $\widetilde X_{n,i} = X_{n,i} I_{[|X_{n,i}|\le1]}$. Note that
$$P[X_{n,j} \neq \widetilde X_{n,j}\ \text{for some } 1 \le j \le k_n] \le P[X_n^* > 1] \to 0 \text{ by (2)},$$
so that $S_n - \widetilde S_n \stackrel{D}{\to} 0$, and (1) gives $\sum_{j=1}^{k_n}\widetilde X_{n,j}^2 \stackrel{D}{\to} C$.
$$\bar X_{n,j} = \widetilde X_{n,j} - E(\widetilde X_{n,j} \mid \mathcal{F}_{n,j-1}),$$
$$\widetilde S_n - \bar S_n = \sum_{j=1}^{k_n} E(X_{n,j} I_{[|X_{n,j}|\le1]} \mid \mathcal{F}_{n,j-1}) = -\sum_{j=1}^{k_n} E(X_{n,j} I_{[|X_{n,j}|>1]} \mid \mathcal{F}_{n,j-1}) \quad\text{(by martingale properties)},$$
so that
$$|\widetilde S_n - \bar S_n| \le \sum_{j=1}^{k_n} |E(X_{n,j} I_{[|X_{n,j}|>1]} \mid \mathcal{F}_{n,j-1})| \stackrel{D}{\to} 0.$$
Observe that $|\widetilde X_{n,j}| \le 1 \Rightarrow |\bar X_{n,j}| \le 2$, so that $\sup_n E(\bar X_n^*) \le 2$ [(3) is satisfied].
$$\bar X_n^* = \max_{1\le j\le k_n}|\widetilde X_{n,j} - E(\widetilde X_{n,j} \mid \mathcal{F}_{n,j-1})| \le \max_{1\le j\le k_n}|\widetilde X_{n,j}| + \max_{1\le j\le k_n}|E(X_{n,j} I_{[|X_{n,j}|>1]} \mid \mathcal{F}_{n,j-1})| \le \max_{1\le j\le k_n}|X_{n,j}| + \sum_{j=1}^{k_n}|E(X_{n,j} I_{[|X_{n,j}|>1]} \mid \mathcal{F}_{n,j-1})|,$$
$$\sum_{j=1}^{k_n}\bar X_{n,j}^2 - \sum_{j=1}^{k_n}\widetilde X_{n,j}^2 = -2\sum_{j=1}^{k_n}\widetilde X_{n,j}E(\widetilde X_{n,j} \mid \mathcal{F}_{n,j-1}) + \sum_{j=1}^{k_n}E^2(\widetilde X_{n,j} \mid \mathcal{F}_{n,j-1}) \le 2\Big|\sum_{j=1}^{k_n}\widetilde X_{n,j}E(X_{n,j} I_{[|X_{n,j}|>1]} \mid \mathcal{F}_{n,j-1})\Big| + \sum_{j=1}^{k_n}E^2(X_{n,j} I_{[|X_{n,j}|>1]} \mid \mathcal{F}_{n,j-1}) \le 2\Big(\sum_{j=1}^{k_n}\widetilde X_{n,j}^2\Big)^{1/2}\Big(\sum_{j=1}^{k_n}E^2(X_{n,j} I_{[|X_{n,j}|>1]} \mid \mathcal{F}_{n,j-1})\Big)^{1/2} + \sum_{j=1}^{k_n}E^2(X_{n,j} I_{[|X_{n,j}|>1]} \mid \mathcal{F}_{n,j-1}).$$
It is sufficient to show (by the assumption, $\forall\, 0 < \delta < 1$)
$$\sum_{j=1}^{k_n}|E(X_{n,j} I_{[|X_{n,j}|>1]} \mid \mathcal{F}_{n,j-1})|^2 \le \Big\{\sum_{j=1}^{k_n}|E(X_{n,j} I_{[|X_{n,j}|>1]} \mid \mathcal{F}_{n,j-1})|\Big\}^2 \stackrel{D}{\to} 0.$$
65
Homework: Assume that $X_{n,j}$ is $F_{n,j}$-measurable and
\[
(1)\ \sum_{j=1}^{k_n} E\big(X_{n,j}^2 I_{[X_{n,j}^2>\varepsilon]} \mid F_{n,j-1}\big) \xrightarrow{D} 0,
\qquad
(2)\ \sum_{j=1}^{k_n} E(X_{n,j}\mid F_{n,j-1}) \xrightarrow{D} 0,
\]
\[
(3)\ \sum_{j=1}^{k_n}\big\{E(X_{n,j}^2\mid F_{n,j-1}) - E^2(X_{n,j}\mid F_{n,j-1})\big\} \xrightarrow{D} C > 0 .
\]
Then $S_n = \sum_{j=1}^{k_n} X_{n,j} \xrightarrow{D} N(0, C)$.
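The normal limit in this homework is easy to see in the simplest special case. Here is a minimal Monte Carlo sketch (not part of the notes): with $X_{n,j} = \varepsilon_j/\sqrt n$ for i.i.d. Rademacher $\varepsilon_j$, conditions (1)–(3) hold with $C = 1$, so $S_n$ should be approximately $N(0,1)$.

```python
import numpy as np

# Monte Carlo sketch of the martingale CLT above: X_{n,j} = eps_j / sqrt(n)
# with i.i.d. Rademacher eps_j gives C = 1, so S_n should be close to N(0, 1).
rng = np.random.default_rng(0)

def simulate_sn(n, reps):
    """Return `reps` replications of S_n for the Rademacher example."""
    eps = rng.choice([-1.0, 1.0], size=(reps, n))
    return eps.sum(axis=1) / np.sqrt(n)

samples = simulate_sn(n=2000, reps=20000)
mean, var = samples.mean(), samples.var()
```

With 20,000 replications the sample mean and variance should be close to 0 and 1; this only illustrates the i.i.d. special case, not the general martingale statement.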
Exponential Inequality:

Theorem 1 (Bennett's inequality):
Assume that $\{X_n\}$ is a martingale difference with respect to $\{F_n\}$ and $\tau$ is an $\{F_n\}$-stopping time (with possible value $\infty$). Let $\sigma_n^2 = E(X_n^2 \mid F_{n-1})$ for $n \ge 1$. Assume that there exist positive constants $\upsilon$ and $V$ such that $X_n \le \upsilon$ a.s. for $n \ge 1$ and $\sum_{i=1}^{\tau}\sigma_i^2 \le V$ a.s. Then $\forall\,\lambda > 0$,
\[
P\Big\{\sum_{i=1}^{\tau} X_i \ge \lambda\Big\} \le \exp\Big\{-\frac12\,\lambda^2 V^{-1}\,\psi\big(\upsilon\lambda V^{-1}\big)\Big\} .
\]
(i) $\sum_{i=1}^n X_i/\sqrt n \Longrightarrow N(0,1)$, and
\[
\frac1{\sqrt{2\pi}}\int_{\lambda}^{\infty} e^{-\frac{x^2}2}\,dx \sim \frac1{\sqrt{2\pi}\,\lambda}\,e^{-\frac{\lambda^2}2} .
\]
Reference: (i) Annals of Probability (1985), Johnson, Schechtman, and Zinn. (ii) Journal of Theoretical Probability (1989), Levental.

Corollary (Bernstein's inequality):
\[
P\Big(\sum_{i=1}^{\tau} X_i \ge \lambda\Big) \le \exp\Big\{-\frac12\,\lambda^2\Big/\Big(V + \frac13\,\upsilon\lambda\Big)\Big\} .
\]
proof: By $\psi(\lambda) \ge \big(1+\frac{\lambda}3\big)^{-1}$, $\forall\,\lambda > 0$.
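The passage from Bennett to Bernstein can be checked numerically. This is a sketch under an assumption the notes leave implicit: $\psi$ is the standard Bennett function $\psi(x) = 2x^{-2}[(1+x)\log(1+x)-x]$, which matches the properties of $\psi$ listed in the Remark below.

```python
import numpy as np

# Sketch: assuming psi(x) = 2[(1+x)log(1+x) - x]/x^2 (the standard Bennett
# function), verify psi(x) >= (1+x/3)^{-1} and that the Bennett bound is
# always at least as sharp as the Bernstein bound.
def psi(x):
    return 2.0 * ((1.0 + x) * np.log1p(x) - x) / x**2

def bennett_bound(lam, V, u):
    return np.exp(-(lam**2 / (2.0 * V)) * psi(u * lam / V))

def bernstein_bound(lam, V, u):
    return np.exp(-0.5 * lam**2 / (V + u * lam / 3.0))

lams = np.linspace(0.1, 10.0, 50)
V, u = 2.0, 1.0
bennett = bennett_bound(lams, V, u)
bernstein = bernstein_bound(lams, V, u)
```

Since $\psi(x) \ge (1+x/3)^{-1}$, the Bennett exponent dominates and the Bennett bound lies below the Bernstein bound at every $\lambda$.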
idea: (i) Note that on $(\tau = \infty)$, since
\[
\sum_{i=1}^{\infty} E(X_i^2 \mid F_{i-1}) = \sum_{i=1}^{\tau}\sigma_i^2 \le V \quad a.s.,
\]
$\sum_{i=1}^{\tau} X_i$ converges a.s. on $(\tau = \infty)$.

(ii) Let $\delta \downarrow 0$:
\[
\text{left} = P\Big\{\sum_{i=1}^{\tau} X_i \ge \lambda\Big\},
\qquad
\text{right} = \exp\Big\{-\frac12\,\lambda^2 V^{-1}\psi\big(\upsilon\lambda V^{-1}\big)\Big\} .
\]
(iii)
\[
\sum_{i=1}^{\tau} X_i = \sum_{i=1}^{\infty} X_i I_{[\tau\ge i]} = \lim_{n\to\infty}\sum_{i=1}^{n} X_i I_{[\tau\ge i]} \quad a.s.\ \text{(by (i))},
\]
\[
P\Big(\sum_{i=1}^{\tau} X_i > \lambda\Big) = E\Big(I_{\big[\sum_{i=1}^{\tau} X_i > \lambda\big]}\Big)
\le E\Big(\liminf_{n\to\infty} I_{\big[\sum_{i=1}^{n} X_i I_{[\tau\ge i]} > \lambda\big]}\Big) \quad\text{(Fatou's Lemma)}
\]
\[
\le \liminf_{n\to\infty} E\Big(I_{\big[\sum_{i=1}^{n} X_i I_{[\tau\ge i]} > \lambda\big]}\Big) .
\]
So that,
\[
\sum_{i=1}^{n} E\big(X_i^2 I_{[\tau\ge i]}\mid F_{i-1}\big) = \sum_{i=1}^{n} I_{[\tau\ge i]}\,E(X_i^2\mid F_{i-1}) \le \sum_{i=1}^{\tau}\sigma_i^2 \le V,
\]
and $X_i I_{[\tau\ge i]} \le \upsilon$ a.s.
Proof: Let $Y_i = X_i I_{[\tau\ge i]}$.

Claim:
\[
\exp\Big\{t\sum_{i=1}^{j} Y_i - g(t)\sum_{i=1}^{j} I_{[\tau\ge i]}\sigma_i^2\Big\}
\]
is a supermartingale.

proof:
\[
E\Big[\exp\Big\{t\sum_{i=1}^{n} Y_i - g(t)\sum_{i=1}^{n} I_{[\tau\ge i]}\sigma_i^2\Big\}\ \Big|\ F_{n-1}\Big]
= \exp\Big\{t\sum_{i=1}^{n-1} Y_i - g(t)\sum_{i=1}^{n} I_{[\tau\ge i]}\sigma_i^2\Big\}\,E\big(e^{tY_n}\mid F_{n-1}\big)
\]
\[
\le \exp\Big\{t\sum_{i=1}^{n-1} Y_i - g(t)\sum_{i=1}^{n-1} I_{[\tau\ge i]}\sigma_i^2\Big\} .
\]
Hence
\[
E\,e^{t\sum_{i=1}^{n} Y_i} \le E\Big[e^{t\sum_{i=1}^{n} Y_i - g(t)\sum_{i=1}^{n} I_{[\tau\ge i]}\sigma_i^2}\Big]\,e^{g(t)V} \le e^{g(t)V},
\qquad\text{since } V - \sum_{i=1}^{n} I_{[\tau\ge i]}\sigma_i^2 \ge 0 .
\]
\[
P\Big\{\sum_{i=1}^{n} Y_i > \lambda\Big\} \le e^{-\lambda t}\,E\,e^{t\sum_{i=1}^{n} Y_i}
\quad\Rightarrow\quad
P\Big\{\sum_{i=1}^{n} Y_i > \lambda\Big\} \le e^{\inf_{t>0}(-\lambda t + g(t)V)} .
\]
Therefore, with $h(t) = -\lambda t + g(t)V$ minimized at $t_0$,
\[
P\Big\{\sum_{i=1}^{n} Y_i > \lambda\Big\} \le e^{h(t_0)} = \exp\Big\{-\frac{\lambda^2}{2}\,V^{-1}\psi\big(\upsilon\lambda V^{-1}\big)\Big\} .
\]
Note:
\[
E\,e^{t\sum_{i=1}^{n} Y_i} = E\Big[E\Big(e^{t\sum_{i=1}^{n} Y_i}\mid F_{n-1}\Big)\Big] .
\]
Remark:
(i) $\psi(0^+) = 1$;
(ii) $\psi(\lambda) \cong 2\lambda^{-1}\log\lambda$ as $\lambda\to\infty$;
(iii) $\psi(\lambda) \ge \big(1+\frac{\lambda}3\big)^{-1}$, $\forall\,\lambda>0$.

Reference: Appendix of Shorack and Wellner (1986, p.852).

$\forall\,\lambda > 0$,
\[
P\Big\{\sum_{i=1}^{\tau} X_i > \lambda,\ \sum_{i=1}^{\tau}\sigma_i^2 \le V\Big\}
\le \exp\Big\{-\frac{\lambda^2}{2}\,V^{-1}\psi\big(\upsilon\lambda V^{-1}\big)\Big\}
\]
also holds.
Example: $V = \sum_{i=1}^{\infty}\sigma_i^2 < \infty$. Let $\tau = \inf\big\{n : \sum_{i=1}^{n} X_i > \lambda\big\}$. Then
\[
P\Big\{\sum_{i=1}^{n} X_i > \lambda \text{ for some } n\Big\} \le P\Big\{\sum_{i=1}^{\tau} X_i > \lambda\Big\} .
\]
Then $\forall\,\lambda > 0$,
\[
P\Big\{\sum_{i=1}^{n} X_i - \sum_{i=1}^{n}\mu_i \ge \lambda\Big\} \le \exp\Big\{-\frac{2\lambda^2}{\sum_{i=1}^{n}(b_i-a_i)^2}\Big\},
\quad\text{or}\quad
P\big\{\bar X_n - \bar\mu_n \ge \lambda\big\} \le \exp\Big\{-\frac{2n^2\lambda^2}{\sum_{i=1}^{n}(b_i-a_i)^2}\Big\} .
\]
By convexity,
\[
e^{tX_i} \le \frac{b_i - X_i}{b_i - a_i}\,e^{ta_i} + \frac{X_i - a_i}{b_i - a_i}\,e^{tb_i},
\]
and with $L(h_i) = \log\big((1-P_i)e^{-h_i} + P_i\big) + \cdots$,
\[
L''(h_i) = \frac{P_i(1-P_i)e^{-h_i}}{[(1-P_i)e^{-h_i}+P_i]^2} = u_i(1-u_i) \le \frac14,
\]
so that
\[
E\big(e^{t(X_i-\mu_i)}\big) \le \exp\Big\{\frac{t^2(b_i-a_i)^2}{8}\Big\} .
\]
Conditioning successively,
\[
E\,e^{t\sum_{i=1}^{n}(X_i-\mu_i)} \le E\{E(\cdots\mid F_{n-1})\}
\le e^{\frac18 t^2(b_n-a_n)^2}\,E\,e^{t\sum_{i=1}^{n-1}(X_i-\mu_i)}
\le e^{\frac18 t^2\sum_{i=1}^{n}(b_i-a_i)^2} .
\]
So that
\[
P\Big\{\sum_{i=1}^{n}(X_i-\mu_i) > \lambda\Big\} \le \exp\Big[-\lambda t + \frac18\,t^2\sum_{i=1}^{n}(b_i-a_i)^2\Big] .
\]
Let $h(t) = -\lambda t + \frac18\,t^2\sum_{i=1}^{n}(b_i-a_i)^2$, with minimizer
\[
t_0 = 4\lambda\Big/\sum_{i=1}^{n}(b_i-a_i)^2,
\qquad
h(t_0) = -\frac{4\lambda^2}{\sum(b_i-a_i)^2} + \frac18\Big(\frac{4\lambda}{\sum(b_i-a_i)^2}\Big)^2\sum_{i=1}^{n}(b_i-a_i)^2
= -\frac{2\lambda^2}{\sum_{i=1}^{n}(b_i-a_i)^2} .
\]
So that
\[
P\Big\{\sum_{i=1}^{n}(X_i-\mu_i) > \lambda\Big\} \le \exp\Big[-2\lambda^2\Big/\sum_{i=1}^{n}(b_i-a_i)^2\Big] .
\]
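The Hoeffding-type bound just derived can be compared with an empirical tail probability. A minimal sketch, using i.i.d. Uniform$(-1,1)$ summands (so $a_i = -1$, $b_i = 1$, $\mu_i = 0$):

```python
import numpy as np

# Empirical check of the bound derived above:
# P{sum (X_i - mu_i) > lam} <= exp(-2 lam^2 / sum (b_i - a_i)^2),
# illustrated with i.i.d. Uniform(-1, 1), where (b_i - a_i)^2 = 4.
rng = np.random.default_rng(1)
n, reps, lam = 200, 20000, 15.0
sums = rng.uniform(-1.0, 1.0, size=(reps, n)).sum(axis=1)
empirical = np.mean(sums > lam)
bound = np.exp(-2.0 * lam**2 / (n * 4.0))
```

The empirical frequency should sit well below the bound; the bound is not tight here because the Gaussian approximation for the uniform sum is much lighter-tailed.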
$\varepsilon_n$ is independent of $F_{n-1} \supset \sigma(\varepsilon_1,\cdots,\varepsilon_{n-1})$, $E\varepsilon_n = 0$, $0 < \mathrm{Var}(\varepsilon_n) = \sigma^2 < \infty$.

Question: Test $F = F_0$ ($H_0$).

Example: AR(1) process, $y_n = \beta y_{n-1} + \varepsilon_n$, $y_0$ $F_0$-measurable,
\[
\hat F_n(u) = \frac1n\sum_{i=1}^{n} I_{[y_i - \hat\beta_n x_i \le u]} .
\]
Wish: $\sqrt n\,\sup_u|\hat F_n(u) - F_n(u)| \xrightarrow{P} 0$ (in general, this is wrong), which would give
\[
\sqrt n\,\sup_u|\hat F_n(u) - F_n(u)| \xrightarrow{D} \sup_{0\le t\le1}|\omega^o(t)|,
\]
and we would reject if $\sqrt n\,\sup_u|\hat F_n(u) - F_n(u)| > C_\alpha$.

Compare:
\[
(i)\ \sqrt n\,\sup_u\Big|\hat F_n(u) - \frac1n\sum_{i=1}^{n} F\big(u + (\hat\beta_n-\beta)x_i\big) - F_n(u) + F(u)\Big| \xrightarrow{P} 0 \quad(\text{right}),
\]
\[
(ii)\ \sqrt n\,\sup_u\big|[\hat F_n(u) - F(u)] - [F_n(u) - F(u)]\big| \xrightarrow{P} 0 \quad(\text{wrong, in general}).
\]
Here
\[
\hat F_n(u) = \frac1n\sum_{i=1}^{n} I_{[y_i - \hat\beta_n x_i \le u]} = \frac1n\sum_{i=1}^{n} I_{[\varepsilon_i \le u + (\hat\beta_n-\beta)x_i]},
\]
and $F(c\,x_i + u) = E\big(I_{[\varepsilon_i \le c\,x_i + u]}\mid F_{i-1}\big)$ (if $c$ is constant, we can use the exponential bound).

Decompose:
\[
\sqrt n\,\big(\hat F_n(u) - F(u)\big)
= \sqrt n\Big(\hat F_n(u) - \frac1n\sum_{i=1}^{n} F\big(u+(\hat\beta_n-\beta)x_i\big) - F_n(u) + F(u)\Big)\ \cdots(1)
\]
\[
+\ \sqrt n\Big(\frac1n\sum_{i=1}^{n} F\big(u+(\hat\beta_n-\beta)x_i\big) - F(u)\Big)\ \cdots(2)
\ +\ \sqrt n\,\big(F_n(u) - F(u)\big)\ \cdots(3) .
\]
In fact, (2) tells us:
\[
\frac1{\sqrt n}\sum_{i=1}^{n}\big[F\big(u+(\hat\beta_n-\beta)x_i\big) - F(u)\big]
\cong \frac1{\sqrt n}\sum_{i=1}^{n} F'(u)(\hat\beta_n-\beta)x_i
= F'(u)\Big(\frac1{\sqrt n}\sum_{i=1}^{n} x_i\Big)(\hat\beta_n-\beta),
\]
which does not converge to zero.
Example:
wish: (1) $\to o_p(1)$, (2) $\to 0$, and it is known that (3) $\xrightarrow{D} W^o(t)$, $0\le t\le1$.

Classical result: under $U(0,1) = F$, define $\alpha_n(t) = \sqrt n\,(F_n(t)-t)$ and study its oscillation modulus via $\sum_{i=1}^{n} Y_i$ with $(Y_i\mid F_{i-1}) \sim b(1, P_i)$, $P_i$ $F_{i-1}$-measurable, and the exponential bound.

$\bullet$ Lemma: If $\|F'\|_\infty < \infty$, then
\[
\frac1{\sqrt n}\,\sup_u\Big|\sum_{i=1}^{n}\big(I_{[\varepsilon_i\le u+\delta_{ni}]} - F(u+\delta_{ni}) - I_{[\varepsilon_i\le u]} + F(u)\big)\Big|
\xrightarrow{P} 0, \quad\text{if } \delta_n = o_p\Big(\frac1{\sqrt n}\Big) .
\]
$\bullet$ $(\hat\beta_n-\beta) = o_p(a_n)$. Choose lattice points $C_n$ so that for every $x$ in the relevant square set there exists $c\in C_n$ with
\[
|c-x|\,\sup_{1\le i\le n}|x_i| = O\Big(\frac1{\sqrt n}\Big),\qquad \#(C_n) \le n^k .
\]
wish:
\[
\sqrt n\,\sup_u\Big|\hat F_n(u) - \frac1n\sum_{i=1}^{n} F\big(u+(\hat\beta_n-\beta)x_i\big) - F_n(u) + F(u)\Big| \xrightarrow{P} 0 .
\]
By
\[
\sqrt n\,\sup_u\sup_{c\in C_n}\Big|\hat F_n(u) - \frac1n\sum_{i=1}^{n} F(u+c\,x_i) - F_n(u) + F(u)\Big|,
\]
$\forall\,\varepsilon > 0$,
\[
\sum_{c\in C_n} P\Big\{\sqrt n\,\sup_u|\hat F_n(u) - \cdots| > \varepsilon\Big\}
\le \sum_{u\in U_n}\sum_{c\in C_n} P\big\{\sqrt n\,|\hat F_n(u)\cdots| > \varepsilon\big\}
\le n^{k+k'}\,e^{-\frac{n\varepsilon^2}{2}},\quad\text{if } \#(U_n)\le n^{k'} .
\]
Question:
\[
\frac1{\sqrt n}\sum_{i=1}^{n}\Big(I_{[\varepsilon_i\le(\hat\beta_n-\beta)x_i+u]} - F\big((\hat\beta_n-\beta)x_i+u\big) - I_{[\varepsilon_i\le u]} + F(u)\Big) .
\]
$\bullet$ $(\hat\beta_n-\beta) = O_p(a_n)$. $Y_i = \beta X_i + \varepsilon_i$, $\varepsilon_i$ i.i.d. with distribution function $F$, $X_i$ $F_{i-1}$-measurable, $\varepsilon_i$ independent of $F_{i-1}$:
\[
\frac1{\sqrt n}\sum_{i=1}^{n}\Big(I_{[\varepsilon_i\le\delta X_i+u]} - F(\delta X_i+u) - I_{[\varepsilon_i\le u]} + F(u)\Big) = \frac1{\sqrt n}\sum_{i=1}^{n} Y_i .
\]
(a)
\[
E[Y_i\mid F_{i-1}] = \int_{[\varepsilon\le\delta X_i+u]} dF(\varepsilon) - F(\delta X_i+u) - \int_{[\varepsilon\le u]} dF(\varepsilon) + F(u) = 0 .
\]
(b) $-1 \le Y_i \le 1$, i.e. $-1 = a_i \le Y_i \le b_i = 1$, so
\[
P\{\bar Y_n \ge t\} \le \exp\Big\{-\frac{2n^2t^2}{\sum_{i=1}^{n}(b_i-a_i)^2}\Big\},
\]
so that
\[
P\Big\{\frac1{\sqrt n}\Big|\sum_{i=1}^{n} Y_i\Big| \ge \lambda\Big\} \le 2\exp\Big[-\frac{\lambda^2}{2}\Big] .
\]
Moreover,
\[
\sum_{i=1}^{n} E[Y_i^2\mid F_{i-1}] \le \|F'\|_\infty\,|\delta|\sum_{i=1}^{n}|x_i| .
\]
Now
\[
\hat\beta_n - \beta = \frac{\sum_{i=1}^{n} x_i\varepsilon_i}{\sum_{i=1}^{n} x_i^2}
= \frac{\sum_{i=1}^{n} x_i\varepsilon_i}{\big(\sum_{i=1}^{n} x_i^2\big)^{1/2}}\cdot\frac1{\big(\sum_{i=1}^{n} x_i^2\big)^{1/2}},
\qquad \sum_{i=1}^{n} x_i^2 \cong a_n^2 c_n,
\]
and, since $\sum_{i=1}^{n}|x_i| \le n^{1/2}\big(\sum_{i=1}^{n} x_i^2\big)^{1/2}$,
\[
(\hat\beta_n-\beta)\sum_{i=1}^{n}|x_i| \approx O_p(1)\,\frac{\sum_{i=1}^{n}|x_i|}{\big(\sum_{i=1}^{n} x_i^2\big)^{1/2}} \le O_p(1)\,n^{1/2} .
\]
Take $V = \sqrt n\,c$, $\tau = n$, $\upsilon = 1$:
\[
P\Big\{\Big|\frac1{\sqrt n}\sum_{i=1}^{n} Y_i\Big| > \lambda\Big\}
\le \exp\Big\{-\frac{(\sqrt n\,\lambda)^2}{2\sqrt n\,c}\,\psi\big(\sqrt n\,\lambda/\sqrt n\,c\big)\Big\}
= \exp\Big\{-\frac{\sqrt n\,\lambda^2}{2c}\,\psi\Big(\frac{\lambda}{c}\Big)\Big\} .
\]
Law of the iterated logarithm:

classical: $X_n$ i.i.d., $EX_n = 0$, $0 < \mathrm{Var}(X_n) = \sigma^2 < \infty$:
\[
\limsup_{n\to\infty}\frac{S_n}{\sqrt{2n\log\log n}} = \sigma \quad a.s.
\]
(a) $Z_n = S_n/(\sqrt n\,\sigma) \xrightarrow{D} N(0,1)$, so at the LIL scale $S_n/(\sigma\sqrt{2n\log\log n}) = Z_n/\sqrt{2\log\log n}$.

(b) If $m$ and $n$ are very close, then $Z_m$ and $Z_n$ are very close:
\[
E(Z_m Z_n) = \frac{E\big(\sum_{i=1}^{n} X_i\big)^2}{\sigma^2\sqrt{mn}} = \frac{n\sigma^2}{\sigma^2\sqrt{mn}} = \sqrt{\frac nm} \qquad (n\le m) .
\]
(c) Take $n/m = 1/c$ with $c$ large enough: $n_1 = c,\ n_2 = c^2,\cdots,\ n_k = c^k$; then $Z_{n_1}, Z_{n_2},\cdots,Z_{n_k} \simeq$ i.i.d. $N(0,1)$.

(d) proof sketch: for $Y_n$ i.i.d. $N(0,1)$ and $\forall\,\varepsilon > 0$,
\[
P\big\{Y_n \ge (1+\varepsilon)\sqrt{2\log n}\ \text{i.o.}\big\} = 0,
\qquad
P\big\{Y_n \ge (1-\varepsilon)\sqrt{2\log n}\ \text{i.o.}\big\} = 1,
\]
by Borel–Cantelli, since
\[
\sum_{n=1}^{\infty}\frac1{(1+\delta)\sqrt{2\log n}}\cdot\frac1{n^{(1+\delta)^2}} < \infty \quad\text{if }\delta > 0 .
\]
(e) $\displaystyle\limsup_{k\to\infty}\frac{Z_{n_k}}{\sqrt{2\log k}} = 1$ a.s., with $n_k = c^k$, $\log\log n_k = \log k + \log\log c$.

(f) $\displaystyle\limsup_{k\to\infty}\frac{S_{c^k}}{\sqrt{c^k\cdot2\log\log c^k}} = 1$ a.s.

(g) $\displaystyle\limsup_{n\to\infty}\frac{S_n}{\sigma\sqrt{2n\log\log n}} = 1$ a.s.
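The LIL normalization can be watched on a single long simple random walk. A sketch (my illustration; convergence in the LIL is notoriously slow, so only a loose bracket is checked):

```python
import numpy as np

# Running maximum of |S_n| / sqrt(2 n log log n) for a Rademacher walk:
# the limsup is 1 (sigma = 1), so the observed peak should be of order 1.
rng = np.random.default_rng(3)
n = 10**6
s = np.cumsum(rng.choice([-1.0, 1.0], size=n))
k = np.arange(100, n + 1)                      # skip tiny n where loglog is unstable
ratio = np.abs(s[99:]) / np.sqrt(2 * k * np.log(np.log(k)))
peak = ratio.max()
```

At $n = 10^6$ the peak typically falls well inside $(0.3, 3)$; the a.s. statement itself only emerges as $n\to\infty$.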
Theorem A: Let $\{X_i, F_i\}$ be a martingale difference such that $|X_i| \le \upsilon$ a.s. and
\[
s_n^2 = \sum_{i=1}^{n} E(X_i^2\mid F_{i-1}) \to \infty \quad a.s.
\]
Then
\[
\limsup_{n\to\infty}\frac{S_n}{s_n\big(2\log\log s_n^2\big)^{1/2}} \le 1 \quad a.s.,
\qquad\text{where } S_n = \sum_{i=1}^{n} X_i .
\]
Corollary:
\[
\liminf_{n\to\infty}\frac{S_n}{s_n\big(2\log\log s_n^2\big)^{1/2}} \ge -1
\quad\text{and}\quad
\limsup_{n\to\infty}\frac{|S_n|}{s_n\big(2\log\log s_n^2\big)^{1/2}} \le 1 \quad a.s.
\]
proof (Theorem A): Fix $c > 1$. For each $k$, let $T_k = \inf\{n : s_{n+1}^2 \ge c^{2k}\}$, so that $T_k$ is a stopping time, and $T_k < \infty$ a.s. since $s_n^2\to\infty$ a.s. Consider $S_{T_k}$:
\[
s_{T_k}^2 \le c^{2k},\qquad s_{T_k}^2/c^{2k} \xrightarrow{a.s.} 1 .
\]
Want to show:
\[
(*)\quad P\big\{S_{T_k} > (1+\varepsilon)c^k\sqrt{2\log k}\ \text{i.o.}\big\} = 0
\;\Rightarrow\; \limsup_{k\to\infty}\frac{S_{T_k}}{s_{T_k}\big(2\log\log s_{T_k}^2\big)^{1/2}} \le 1+\varepsilon \quad a.s.
\]
By Bennett's inequality, let $\lambda = (1+\varepsilon)c^k\sqrt{2\log k}$, $V = c^{2k}$, $\upsilon = \upsilon$:
\[
(*) \le \sum_{k=1}^{\infty}\exp\Big\{-\frac{\lambda^2}{2V}\,\psi\Big(\frac{\upsilon\lambda}{V}\Big)\Big\}
= \sum_{k=1}^{\infty}\exp\Big\{-\frac{(1+\varepsilon)^2 c^{2k}\,2\log k}{2c^{2k}}\,\psi\Big(\frac{\upsilon(1+\varepsilon)c^k\sqrt{2\log k}}{c^{2k}}\Big)\Big\}
\]
\[
\le c_0\sum_{k=1}^{\infty}\exp\big[-(1+\varepsilon')^2\log k\big]
= c_0\sum_{k=1}^{\infty}\frac1{k^{(1+\varepsilon')^2}} < \infty,
\]
because, for large $k$,
\[
(1+\varepsilon)^2\log k\cdot\frac1{1+\dfrac{\upsilon(1+\varepsilon)c^k\sqrt{2\log k}}{c^{2k}}} \ge (1+\varepsilon')^2\log k .
\]
For each $n$ there exist $T_k, T_{k+1}$ s.t. $T_k \le n \le T_{k+1}$, and $S_n = S_{T_k} + (S_n - S_{T_k})$:
\[
\frac{S_n}{s_n\sqrt{2\log\log s_n^2}} \le \frac{S_{T_k}}{s_n\sqrt{2\log\log s_n^2}} + \frac{S_n - S_{T_k}}{s_n\sqrt{2\log\log s_n^2}} .
\]
Want to prove:
\[
P\Big\{\sup_{T_k<n\le T_{k+1}}(S_n - S_{T_k}) > \varepsilon c^k\sqrt{2\log k}\ \text{i.o.}\Big\} = 0 .
\]
pf: Define $\tau = \inf\Big\{j : \sum_{i=1}^{j} X_i I_{[T_k<i\le T_{k+1}]} > \varepsilon c^k\sqrt{2\log k}\Big\}$.
\[
\sum_{k=1}^{\infty} P\Big\{\sup_{T_k<n\le T_{k+1}}(S_n - S_{T_k}) > \varepsilon c^k\sqrt{2\log k}\Big\}
= \sum_{k=1}^{\infty} P\Big\{\sum_{i=1}^{\tau} X_i I_{[T_k<i\le T_{k+1}]} > \varepsilon c^k\sqrt{2\log k}\Big\}
\]
\[
\le \sum_{k=1}^{\infty}\exp\Big\{-\frac{\varepsilon^2 c^{2k}\,2\log k}{2(c^2-1)c^{2k}}\,\psi\Big(\frac{\upsilon c^k\sqrt{2\log k}}{(c^2-1)c^{2k}}\Big)\Big\}
\le \sum_{k=1}^{\infty}\exp\Big\{-\frac{\varepsilon^2\log k}{c^2-1}\,\psi\Big(\frac{\upsilon\sqrt{2\log k}\,c^k}{(c^2-1)c^{2k}}\Big)\Big\} .
\]
Reference:
Exponential Centering:

$X\sim F$, $\exists\ \varphi(t) = Ee^{tX}$:
\[
P\{X > u\} = \int_{[x>u]} dF(x) = \varphi(t)\int_{[x>u]} e^{-tx}\,\frac{e^{tx}\,dF(x)}{\varphi(t)} = \varphi(t)\int_{[x>u]} e^{-tx}\,dG(x) .
\]
$\bullet$
\[
\int x\,dG(x) = \int\frac{x e^{tx}}{\varphi(t)}\,dF(x) = \frac{\frac{d}{dt}\int e^{tx}\,dF}{\varphi(t)} = \frac{\varphi'(t)}{\varphi(t)} = [\log\varphi(t)]' = \psi'(t) .
\]
Similarly for $\int x^2\,dG(x)$. So,
\[
P\{X>u\} = \varphi(t)\int_{[x>u]} e^{-tx}\,dG(x)
= \varphi(t)\,e^{-t\psi'(t)}\int_{[x>u]} e^{-t(x-\psi'(t))}\,dG(x)
= e^{\psi(t)-t\psi'(t)}\int_{z>\frac{u-\psi'(t)}{\sqrt{\psi''(t)}}} e^{-t\sqrt{\psi''(t)}\,z}\,dH(z),
\]
where $H(z) = G\big(\sqrt{\psi''(t)}\,z + \psi'(t)\big)$.

Example: $X\sim N(0,1)$, $\varphi(t) = e^{\frac{t^2}2}$, $\psi(t) = t^2/2$, $\psi'(t) = t$, $\psi''(t) = 1$:
\[
P\{X>u\} = e^{-\frac{t^2}2}\int_{[z>u-t]} e^{-tz}\,dH(z),\qquad H(z)\sim N(0,1) .
\]
Simulation: take $t = u$:
\[
P\{X>u\} = e^{-\frac{u^2}2}\int_{[z>0]} e^{-uz}\,dH(z) .
\]
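The last identity is exactly an importance-sampling scheme. A sketch of the simulation it suggests (my illustration): estimate $P\{N(0,1) > u\}$ by tilting at $t = u$.

```python
import numpy as np
from math import erf, sqrt

# Exponential centering at t = u gives the unbiased, low-variance estimator
# P{X > u} = e^{-u^2/2} E[e^{-uZ} 1{Z > 0}] with Z ~ N(0,1).
rng = np.random.default_rng(4)
u = 3.0
z = rng.normal(size=200000)
estimate = np.exp(-u**2 / 2) * np.mean(np.exp(-u * z) * (z > 0))
exact = 0.5 * (1 - erf(u / sqrt(2)))   # P{N(0,1) > 3} ~ 1.35e-3
```

Naive Monte Carlo at this tail needs millions of draws for the same accuracy; the tilted integrand is bounded by $e^{-u^2/2}$, so the relative error here is a fraction of a percent.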
Lemma 1: Suppose $E(X\mid F) = 0$, $E(X^2\mid F) \ge c$ and $E(X^4\mid F) \le d$. Then $P\{X>0\mid F\}\wedge P\{X<0\mid F\} \ge c^2/4d$.

proof:
\[
\frac c2 \le E\big(X^{+2}\mid F\big) = E\big[(X^+)^{2/3}\cdot(X^+)^{4/3}\mid F\big]
\le E^{2/3}(X^+\mid F)\,\big(E(X^+)^4\big)^{1/3} \quad(\text{H\"older's inequality}),
\]
so that
\[
\Big(\frac c2\Big)^3 \le E^2(X^+\mid F)\,E(X^4\mid F),\qquad
\Big(\frac c2\Big)^3\Big/d \le E^2(X^+\mid F),\qquad
\Big(\frac c2\Big)^{3/2}\Big/d^{1/2} \le E(X^+\mid F) .
\]
Further,
\[
E(X^+\mid F) = E\big(X^+ I_{[X>0]}\mid F\big)
\le E^{1/4}(X^4\mid F)\,E^{3/4}\big(I_{[X>0]}\mid F\big)
\le d^{1/4}\,P^{3/4}\{X>0\mid F\} \quad(\text{H\"older's inequality}),
\]
\[
\Big(\frac c2\Big)^6\Big/d^2 \le d\,P^3\{X>0\mid F\},\quad\text{implies}\quad
P\{X>0\mid F\} \ge \frac{c^2}{4d} .
\]
Similarly, $E(X^-\mid F) \ge \big(\frac c2\big)^{3/2}/d^{1/2}$, and $P\{X<0\mid F\} \ge c^2/4d$.

Now suppose
\[
\sum_{i=1}^{n} E(\varepsilon_i^2\mid F_{i-1}) \le c_1 \quad\text{and}\quad \sup_{1\le i\le n}|\varepsilon_i| \le M \quad a.s.
\]
Then there is a universal constant $B$ s.t.
\[
P\Big\{\sum_{i=1}^{n}\varepsilon_i < 0\ \Big|\ F_0\Big\} \wedge P\Big\{\sum_{i=1}^{n}\varepsilon_i > 0\ \Big|\ F_0\Big\}
\ge B\,c_2^2/(c_1^2 + M^4) .
\]
For $p > 0$, by a conditional version of the B-G-D inequality,
\[
E\Big[\sup_{1\le i\le n}\Big|\sum_{j=1}^{i}\varepsilon_j\Big|^p\ \Big|\ F_0\Big]
\le k\,E\Big[\Big(\sum_{j=1}^{n} E(\varepsilon_j^2\mid F_{j-1})\Big)^{p/2}\Big|F_0\Big]
+ k\,E\Big[\max_{1\le j\le n}|\varepsilon_j|^p\ \Big|\ F_0\Big]
\]
$\Rightarrow P(A) = 0$.

(ii) By a conditional version of the B-G-D inequality with $p = 4$,
\[
E\Big[\Big|\sum_{j=1}^{n}\varepsilon_j\Big|^4\ \Big|\ F_0\Big]
\le k\,E\Big[\Big(\sum_{i=1}^{n} E(\varepsilon_i^2\mid F_{i-1})\Big)^2\Big|F_0\Big]
+ k\,E\Big[\max_{1\le i\le n}|\varepsilon_i|^4\ \Big|\ F_0\Big]
\le k c_1^2 + k M^4 = k(c_1^2 + M^4) .
\]
By Lemma 1,
\[
P\Big\{\sum_{i=1}^{n}\varepsilon_i > 0\ \Big|\ F_0\Big\}\wedge P\Big\{\sum_{i=1}^{n}\varepsilon_i < 0\ \Big|\ F_0\Big\}
\ge c_2^2\big/4k(c_1^2+M^4) = B\,c_2^2/(c_1^2+M^4),\qquad B = \frac1{4k},
\]
using
\[
(i)\ E\Big(\sum_{i=1}^{n}\varepsilon_i\ \Big|\ F_0\Big) = 0,
\qquad
(ii)\ E\Big(\Big(\sum_{i=1}^{n}\varepsilon_i\Big)^2\Big|F_0\Big) = E\Big(\sum_{i=1}^{n}\varepsilon_i^2\ \Big|\ F_0\Big) \ge c_2 .
\]
Similarly,
\[
P\Big\{\sum_{i=1}^{n}\varepsilon_i\in(-\lambda,0)\ \Big|\ F_0\Big\}\wedge P\Big\{\sum_{i=1}^{n}\varepsilon_i\in(0,\lambda)\ \Big|\ F_0\Big\}
\ge B\,c_2^2/(c_1^2+M^4) - c_1/\lambda^2
\]
(by Markov's inequality).
Let $S_n = \sum_{i=1}^{n} X_i$.

Assumptions:
(i) $\{X_i, F_i\}$ is a martingale difference sequence.
(ii) $P\{|X_i|\le d\} = 1$, $\forall\,1\le i\le n$.

Notations:
\[
\sigma_i^2 = E(X_i^2\mid F_{i-1}),\qquad s_i^2 = \sum_{j=1}^{i}\sigma_j^2,\qquad
g_1(x) = x^{-1}(e^x-1),\qquad g(x) = x^{-2}(e^x-1-x) .
\]
So that
\[
P\{S_n>\lambda\mid F_0\}
= \int\cdots\int_{[S_n>\lambda]}\Big[\prod_{i=1}^{n}\varphi_i(t)\Big]\,e^{-t\sum_{i=1}^{n} x_i}\,dF_n^{(t)}\cdots dF_1^{(t)}
= \int\cdots\int_{[S_n>\lambda]} e^{\sum_{i=1}^{n}\psi_i(t)}\,e^{-t\sum_{i=1}^{n} x_i}\,dF_n^{(t)}\cdots dF_1^{(t)}
\]
\[
= \int\cdots\int_{[S_n>\lambda]} e^{\sum_{i=1}^{n}[\psi_i(t)-t\psi_i'(t)]}\,e^{-t\sum_{i=1}^{n}(x_i-\psi_i'(t))}\,dF_n^{(t)}\cdots dF_1^{(t)} .
\]
Now, if $s_n^2 \le M$ and $g(-td) - t^2d^2g^2(-td) - g_1(td) \le 0$, then, $\forall\,t > 0$,
\[
P\{S_n>\lambda\mid F_0\}
= \int\cdots\int_{[S_n>\lambda]}\Big[\prod_{i=1}^{n}\varphi_i(t)\Big]\,e^{-t\sum_{i=1}^{n} x_i}\,dF_n^{(t)}\cdots dF_1^{(t)}
= \int\cdots\int_{[S_n>\lambda]} e^{\sum_{i=1}^{n}\psi_i(t) - t\sum_{i=1}^{n} x_i}\,dF_n^{(t)}\cdots dF_1^{(t)}
\]
\[
(**)\quad = \int\cdots\int_{[S_n>\lambda]} e^{\sum_{i=1}^{n}(\psi_i(t)-t\psi_i'(t))}\,e^{-t\sum_{i=1}^{n}(x_i-\psi_i'(t))}\,dF_n^{(t)}\cdots dF_1^{(t)} .
\]
Under $dF_n^{(t)},\cdots,dF_1^{(t)}$:
\[
E[Y_i\mid F_{i-1}] = \int y\,dF_i^{(t)}(y) = E\big[X_i e^{tX_i}\mid F_{i-1}\big]\big/\varphi_i(t)
= [\log\varphi_i(t)]' = \psi_i'(t),
\qquad
\mathrm{Var}(Y_i\mid F_{i-1}) = \psi_i''(t) .
\]
$\bullet\bullet$
\[
\varphi_i(t) = E\big[e^{tX_i}\mid F_{i-1}\big] = E\big[1 + tX_i + t^2X_i^2 g(tX_i)\mid F_{i-1}\big]
\begin{cases} \le 1 + t^2\sigma_i^2\,g(td),\\[2pt] \ge 1 + t^2\sigma_i^2\,g(-td) .\end{cases}
\]
$\bullet\bullet\bullet$
\[
\psi_i'(t) = \frac{\varphi_i'(t)}{\varphi_i(t)}
\begin{cases} \le t\,g_1(td)\,\sigma_i^2,\\[4pt] \ge \dfrac{t\,g_1(-td)\,\sigma_i^2}{1+t^2\sigma_i^2 g(td)},\end{cases}
\]
and
\[
\sum_{i=1}^{n}\big(\psi_i(t)-t\psi_i'(t)\big) \ge t^2 s_n^2\big\{g(-td) - t^2d^2g^2(-td) - g_1(td)\big\},
\]
because $\sum_{i=1}^{n}\sigma_i^2 = s_n^2$.
Thus,
\[
(**) \ge e^{t^2M[g(-td)-t^2d^2g^2(-td)-g_1(td)]}
\int\cdots\int_{[S_n>\lambda]} e^{-t\sum_{i=1}^{n}(x_i-\psi_i'(t))}\,dF_n^{(t)}\cdots dF_1^{(t)} .
\]
\[
[S_n>\lambda] = \Big[S_n - \sum_{i=1}^{n}\psi_i'(t) > \lambda - \sum_{i=1}^{n}\psi_i'(t)\Big],
\]
so
\[
\ge e^{t^2M[g(-td)-t^2d^2g^2(-td)-g_1(td)]}
\int\cdots\int_{\big[0\,\ge\,S_n-\sum_{i=1}^{n}\psi_i'(t)\,\ge\,\lambda-\frac{t m g(-td)}{1+t^2d^2g(td)}\big]}
e^{-t\sum_{i=1}^{n}(x_i-\psi_i'(t))}\,dF_n^{(t)}\cdots dF_n^{(t)}
\]
\[
\ge e^{t^2M[g(-td)-t^2d^2g^2(-td)-g_1(td)]}
\int\cdots\int_{\big[0\,\ge\,S_n-\sum_{i=1}^{n}\psi_i'(t)\,\ge\,\lambda-\frac{t m g(-td)}{1+t^2d^2g(td)}\big]} 1\,dF_n^{(t)}\cdots dF_1^{(t)} .
\]
So,
\[
\sum_{i=1}^{n}\psi_i''(t)
\begin{cases} \le s_n^2\,e^{td},\\[2pt] \ge s_n^2\big\{e^{-td-t^2d^2g(td)} - t^2d^2g_1^2(td)\big\} .\end{cases}
\]
Replace $t$ by $t/\sqrt M$ and $\lambda$ by $(1-r)\sqrt M\,t$:
\[
(***)\quad P\{S_n > (1-r)\sqrt M\,t \mid F_0\}
\ge e^{t^2\{g(-td/\sqrt M)-(td/\sqrt M)^2g^2(-td/\sqrt M)-g_1(td/\sqrt M)\}}
\]
\[
\cdot\int\cdots\int_{\Big[0\,\ge\,\sum_{i=1}^{n} x_i-\psi_i'\big(\frac t{\sqrt M}\big)\,\ge\,(1-r)\sqrt M\,t-\frac{m\,(t/\sqrt M)\,g(-td/\sqrt M)}{1+(t^2d^2/M)\,g^2(td/\sqrt M)}\Big]}
dF_n^{(t/\sqrt M)}\cdots dF_1^{(t/\sqrt M)} .
\]
Let $\varepsilon_i = X_i - \psi_i'(t/\sqrt M)$, so $|\varepsilon_i| \le \dfrac{2d}{\sqrt M}$:
\[
(1)\quad \sum_{i=1}^{n} E(\varepsilon_i^2\mid F_{i-1}) \le \frac{s_n^2}{M}\,e^{td/\sqrt M} \le e^{td/\sqrt M} = c_1,
\]
\[
(2)\quad E\Big[\sum_{i=1}^{n}\varepsilon_i^2\ \Big|\ F_0\Big] = E\Big[\sum_{i=1}^{n}\big(X_i-\psi_i'(t/\sqrt M)\big)^2\ \Big|\ F_0\Big]
\ge \frac{M_1}{M} - 2\,\frac{td}{\sqrt M}\,g_1\Big(\frac{td}{\sqrt M}\Big)
+ \frac mM\,\frac{t}{\sqrt M}\,g_1\Big(-\frac{td}{\sqrt M}\Big)\Big/\Big(1+\frac{t^2d^2}{M^2}\,g^2\Big(\frac{td}{\sqrt M}\Big)\Big) .
\]
Thus, applying the lemma (with a Markov-inequality correction for the restriction to the bounded interval),
\[
(***) \ge e^{t^2[g(-td/\sqrt M)-(td/\sqrt M)^2g^2(-td/\sqrt M)-g_1(td/\sqrt M)]}
\left\{\frac{B\Big[\frac{M_1}M - \frac{2td}{\sqrt M}g_1\big(\frac{td}{\sqrt M}\big) + \frac m{\sqrt M}\frac t{\sqrt M}g\big(-\frac{td}{\sqrt M}\big)\Big]^2}{e^{2td/\sqrt M}+(2d/\sqrt M)^4}
- \frac{e^{td/\sqrt M}}{t^2\Big[(1-r)-\frac mM\,t\,g\big(-\frac{td}{\sqrt M}\big)\big/\big(1+\frac{t^2d^2}M g^2\big(\frac{td}{\sqrt M}\big)\big)\Big]^2}\right\} .
\]
Let $t\to\infty$ with $td/\sqrt M\to0$. Assume that $\sum_{i=1}^{n} E(X_i^2\mid F_0) \ge M_1 > 0$ and
\[
m \le \sum_{i=1}^{n} E(X_i^2\mid F_{i-1}) \le M,\qquad 1-(m/M) < r,
\]
and let $M_1/M\to1$, $m/M\to1$. Then
\[
P\{S_n > (1-r)\sqrt M\,t\mid F_0\} \ge e^{-\frac{t^2}2(1+o(1))}\cdot B(1+o(1)) .
\]
In summary:
For each $n$, $\{X_{n,i}, F_{n,i},\ i = 1, 2,\cdots,n\}$ is a martingale difference such that the above conditions hold. Note that
\[
(1)\quad s_{\tau_{k+1}}^2 - s_{\tau_k}^2 \le c_{k+1} - s_{\tau_k+1}^2 + \sigma_{\tau_k+1}^2 \le c_{k+1} - c_k + d^2,
\]
\[
(2)\quad s_{\tau_{k+1}}^2 - s_{\tau_k}^2 \ge s_{\tau_{k+1}+1}^2 - \sigma_{\tau_{k+1}+1}^2 - c_k \ge c_{k+1} - d^2 - c_k .
\]
By the summary,
\[
S_{\tau_{k+1}} - S_{\tau_k} = \sum_{i=\tau_k+1}^{\tau_{k+1}} X_i = \sum_{i=1}^{\infty} X_i I_{[\tau_k<i\le\tau_{k+1}]},
\]
\[
(*) = P\big\{S_{\tau_{k+1}} - S_{\tau_k} > (1-r)\sqrt{M_k}\,t_k\ \big|\ F_{\tau_k}\big\}
\ge e^{-\frac{t_k^2}2(1+o(1))}\,B(1+o(1))
\ge B(1+o(1))\,e^{-\alpha^2\log\log c_{k+1}(1+o(1))}
\ge B(1+o(1))\big((k+1)^{\alpha^2(1+o(1))}\big)^{-1} .
\]
So that,
\[
\sum_{k=1}^{\infty} P\big\{S_{\tau_{k+1}} - S_{\tau_k} > (1-r)\sqrt{M_k}\,t_k\ \big|\ F_{\tau_k}\big\} = \infty \quad a.s.
\]
But —

History of the L.I.L.:

Step 1:
(1922) Steinhaus;
(1923) Khinchine: $S_n = O\big((n\log\log n)^{1/2}\big)$;
(1924) Khinchine.

Step 2:
(1929) Kolmogorov: $X_i$ indep. r.v.'s, $EX_i = 0$, $s_n^2 = \sum_{i=1}^{n} EX_i^2$,
\[
(i)\ \sup_{1\le k\le n}|X_k| \le k_n\,\frac{s_n}{(\log\log s_n^2)^{1/2}},
\qquad
(ii)\ k_n \to 0,\ s_n^2 \to \infty .
\]
Then the L.I.L. holds.
Step 3:
(196?) Strassen: $X_i$ i.i.d., $EX_i = 0$, $\mathrm{Var}(X_i) = 1$; the set of limit points of $S_n/(2n\log\log n)^{1/2}$ is $[-1, 1]$.

If $W_n$ is a Brownian motion, then $|S_n - W_n| = o\big(n^{1/2}(\log\log n)^{1/2}\big)$: construct a Brownian motion $W(t)$ and stopping times $\tau_1, \tau_2,\cdots$ so that
\[
\{S_n\} \stackrel{D}{=} \Big\{W\Big(\sum_{i=1}^{n}\tau_i\Big),\ n = 1, 2,\cdots\Big\},
\qquad
|S_n - W_n| = \Big|W\Big(\sum_{i=1}^{n}\tau_i\Big) - W_n\Big| .
\]
(1965) Strassen: the independent case and special martingales.

(1970) W. F. Stout: Martingale version of Kolmogorov's Law of the Iterated Logarithm. Z.W.V.G. 15, 279–290.
\[
X_n = \sum_{i=1}^{n} Y_i,\ \{X_n, F_n\}\ \text{a martingale},
\qquad
s_n^2 = \sum_{i=1}^{n} E[Y_i^2\mid F_{i-1}] .
\]
If $s_n^2\to\infty$ a.s. and $|Y_n| \le k_n s_n/(2\log_2 s_n^2)^{1/2}$ a.s., where $k_n$ is $F_{n-1}$-measurable and $\lim_{n\to\infty} k_n = 0$, then the L.I.L. holds.
where $0.3533/a \le c \le \min_{b>0}\big[\frac1b + b\,g(a,b)\big]$.

(1986) E. Fisher: Sankhyā, Series A, 48, p.267–272. Martingale version:
\[
\text{implies}\quad \limsup_{n\to\infty}\ \sum_{i=1}^{n} Y_i\Big/s_n\big(2\log_2 s_n^2\big)^{1/2} \le 1+\varepsilon(k),
\qquad
\varepsilon(k) = \begin{cases} k/4, & 0 < k \le 1,\\ (3+2k^2)/4k - 1, & k > 1 .\end{cases}
\]
Papers:
D. Freedman (1973). Annals of Probability, 1, 910–925.

Basic assumptions:
(i) $F_0\subset F_1\subset\cdots\subset F_n\cdots$ ($\sigma$-fields);
(ii) $X_n$ is $F_n$-measurable, $n\ge1$;
(iii) $0\le X_n\le1$ a.s.
\[
S_n = \sum_{i=1}^{n} X_i,\qquad M_i = E[X_i\mid F_{i-1}],\qquad T_n = \sum_{i=1}^{n} M_i .
\]
Theorem: Let $\tau$ be a stopping time.
(i) If $0\le a\le b$, then
\[
P\Big\{\sum_{i=1}^{\tau} X_i \le a \ \text{and}\ \sum_{i=1}^{\tau} M_i \ge b\Big\}
\le (b/a)^a\,e^{a-b} \le \exp\Big\{-\frac{(a-b)^2}{2c}\Big\},
\quad\text{where } c = a\vee b = \max\{a,b\} .
\]
(ii) If $0\le b\le a$, then
\[
P\Big\{\sum_{i=1}^{\tau} X_i \ge a \ \text{and}\ \sum_{i=1}^{\tau} M_i \le b\Big\}
\le (b/a)^a\,e^{a-b} \le \exp\Big\{-\frac{(b-a)^2}{2c}\Big\},
\quad\text{where } c = a\vee b .
\]
Lemma: Let $0\le X\le1$ be a r.v. on $(\Omega, F, P)$, let $\Sigma$ be a sub-$\sigma$-field of $F$, let $M = E\{X\mid\Sigma\}$, and let $h$ be a real number. Then
\[
E\{\exp(hX)\mid\Sigma\} \le \exp\big[M(e^h-1)\big] .
\]
By convexity of $e^{hx}$ on $[0,1]$,
\[
E[e^{hX}\mid\Sigma] \le E\big[(1-X) + e^h X\mid\Sigma\big] = (1-M) + e^h M = 1 + (e^h-1)M \le e^{(e^h-1)M}
\]
(because $1+x\le e^x$, $\forall x$).

Corollary: For each $h$, define $R_h(m,x) = \exp[hx - (e^h-1)m]$. Then $R_h(T_n, S_n)$ is a supermartingale.

proof:
Then
\[
P\{S_\tau \le a \ \text{and}\ T_\tau \ge b\}
= \int u(T_\tau, S_\tau)\,dP
= \int_G u(T_\tau, S_\tau)\,dP,\qquad G = \{T_\tau<\infty \text{ or } S_\tau<\infty\},
\]
\[
\le \int_G Q_h(T_\tau, S_\tau)\,dP \qquad (Q_h \ge u),
\]
where $Q_h(m,x) = \exp\big[-h(x-a) + (1-e^{-h})(m-b)\big] \ge 1$ if $m\ge b$ and $x < a$,
\[
= Q_h(0,0)\int_G R_{-h}(T_\tau, S_\tau)\,dP \le Q_h(0,0) .
\]
Lemma 1: $a\ge0$, $b\ge0$, $c = a\vee b$. Then $(b/a)^a e^{a-b} \le \exp[-(a-b)^2/2c]$.

Lemma 1$'$: For $0<\varepsilon<1$, let $f(\varepsilon) = \big(\frac1{1-\varepsilon}\big)^{1-\varepsilon}e^{-\varepsilon}$ and $g(\varepsilon) = (1-\varepsilon)e^{\varepsilon}$. We have
\[
f(\varepsilon) < \exp[-\varepsilon^2/2] < 1 \quad\text{and}\quad g(\varepsilon) < \exp[-\varepsilon^2/2] < 1 .
\]
proof of Lemma 1:
(i) $a = b$ (trivial).
(ii) case 1: $0 < a < b$, let $\varepsilon = (b-a)/b = 1 - a/b$.
case 2: $0 < b < a$, $\varepsilon = (a-b)/a = 1 - b/a$:
\[
(b/a)^a e^{a-b} = (1-\varepsilon)^a e^{a\varepsilon} = g^a(\varepsilon)
\le \exp\Big[-a\,\frac{\varepsilon^2}2\Big]
= \exp\Big[-a\cdot\frac{(a-b)^2}{2a^2}\Big]
= \exp\Big[-\frac{(a-b)^2}{2a}\Big] .
\]
If $0\le a\le b$ then
\[
P\Big\{\sum_{i=1}^{\tau} X_i \le a \ \text{and}\ \sum_{i=1}^{\tau} M_i \ge b\Big\} \le \exp\big[-(a-b)^2/2(a\vee b)\big] .
\]
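The two-sided inequality in Lemma 1 can be verified numerically over a grid (a sketch; $a = 0$ and $b = 0$ are taken as limits and excluded from the grid):

```python
import numpy as np

# Numerical check of Lemma 1: (b/a)^a e^{a-b} <= exp(-(a-b)^2 / (2 max(a,b)))
# for a, b > 0.
a = np.linspace(0.1, 10.0, 60)
b = np.linspace(0.1, 10.0, 60)
A, B = np.meshgrid(a, b)
lhs = (B / A) ** A * np.exp(A - B)
rhs = np.exp(-(A - B) ** 2 / (2.0 * np.maximum(A, B)))
```

The inequality is tight near $a = b$ (both sides are $\approx 1 - (a-b)^2/2c$ there), which is why the check uses a small numerical tolerance.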
Application:
Let $X_n = \rho X_{n-1} + \varepsilon_n$, $n = 1, 2,\cdots$, $|\rho| < 1$, where $\{\varepsilon_n, F_n\}$ is a martingale difference sequence such that $E[\varepsilon_n^2\mid F_{n-1}] = \sigma^2$ and
\[
\sup_n E\big[(\varepsilon_n^2)^p\mid F_{n-1}\big] \le c < \infty .
\]
\[
\hat\rho_n - \rho = \Big(\sum_{i=1}^{n} X_{i-1}^2\Big)^{-1}\Big(\sum_{i=1}^{n} X_{i-1}\varepsilon_i\Big),
\qquad
\Big(\sum_{i=1}^{n} X_{i-1}^2\Big)(\hat\rho_n-\rho)^2 = \frac{\big(\sum_{i=1}^{n} X_{i-1}\varepsilon_i\big)^2}{\sum_{i=1}^{n} X_{i-1}^2} .
\]
Difficulty: $\sum_{i=1}^{n} X_{i-1}^2$ is a random variable; the problem is how to handle it. The corresponding $\chi^2$-statistic is
\[
Q_n = \Big(\sum_{i=1}^{n} X_{i-1}^2\Big)(\hat\rho_n-\rho)^2
= \Big(\sum_{i=1}^{n} X_{i-1}\varepsilon_i\Big)^2\Big/\sum_{i=1}^{n} X_{i-1}^2
\le \frac{\Big[\big(\sum_{i=1}^{n} X_{i-1}^2\big)^{1/2}\big(\sum_{i=1}^{n}\varepsilon_i^2\big)^{1/2}\Big]^2}{\sum_{i=1}^{n} X_{i-1}^2}
= \sum_{i=1}^{n}\varepsilon_i^2
\quad(\text{Cauchy–Schwarz inequality}) .
\]
Question:
\[
E(Q_n^p) \stackrel{?}{\to} \sigma^{2p}\,E|N(0,1)|^{2p} .
\]

Ideas:
Then
\[
P\Big\{\sum_{i=1}^{n}\varepsilon_i^2 \le n\tau\Big\}
\le P\Big\{\sum_{i=1}^{n}\varepsilon_i^2 I_{[\varepsilon_i^2\le k]} \le n\tau\Big\}
\le P\Big\{\sum_{i=1}^{n}(\varepsilon_i^2/k)\,I_{[\varepsilon_i^2\le k]} \le \frac nk\,\tau\Big\}
\]
\[
= P\Big\{\sum_{i=1}^{n}(\varepsilon_i^2/k)\,I_{[\varepsilon_i^2/k\le1]} \le \frac nk\,\tau,\
\sum_{i=1}^{n} E\Big[\frac{\varepsilon_i^2}k\,I_{[\varepsilon_i^2/k<1]}\ \Big|\ F_{i-1}\Big] \ge \frac nk\,\alpha\Big\},
\]
using $n E\big[(\varepsilon_i^2/k)\,I_{[\varepsilon_i^2/k\le1]}\big] \ge \frac nk\,\alpha > \frac nk\,\tau$,
\[
\le \exp\Big[-\frac{\big(\frac nk\alpha - \frac nk\tau\big)^2}{2\big(\frac nk\alpha\big)}\Big]
= \exp\big[-n(\alpha-\tau)^2/2k\alpha\big] = r^{-n},
\qquad
r = \exp\Big[\frac{(\alpha-\tau)^2}{2k\alpha}\Big] > 1 .
\]
(v) Let $A_n = \big[\sum_{i=1}^{n-1}\varepsilon_i^2 \le (n-1)\tau\big]$ and $q > p' > p \ge 1$. By Minkowski's and H\"older's inequalities,
\[
\Big[E\Big(\sum_{i=1}^{n}\varepsilon_i^2 I_{A_n}\Big)^{p'}\Big]^{1/p'}
\le \sum_{i=1}^{n}\big[E\,\varepsilon_i^{2p'} I_{A_n}\big]^{1/p'}
\le \sum_{i=1}^{n}\Big[\big(E[\varepsilon_i^2]^{q}\big)^{1/q}\big(E\,I_{A_n}^{s}\big)^{1/s}\Big]^{1/p'},
\qquad \frac1q+\frac1s = 1,
\]
\[
\le \big(E(\varepsilon_i^2)^{q}\big)^{1/q}\cdot n\,\{P(A_n)\}^{1/sp'}
\le c\cdot n\cdot r^{-n/sp'} \to 0 .
\]
(vi)
\[
E\,Q_n^{p'} I_{A_n^c} \le c'\,\frac{E\big|\sum_{i=1}^{n} X_{i-1}\varepsilon_i\big|^{2p'}}{(n-1)^{p'}} .
\]
Recall: (1987) Wei, Ann. Stat., 1667–1687. $X_n = \sum_{i=1}^{n} u_i\varepsilon_i$, $u_i$ $F_{i-1}$-measurable, $\{\varepsilon_i, F_i\}$ a martingale difference sequence, $p \ge 2$,
\[
\sup_n E\{|\varepsilon_n|^p\mid F_{n-1}\} \le c \quad a.s.
\]
Then
\[
E\sup_{1\le i\le n}|X_i|^p \le k\,E\Big(\sum_{i=1}^{n} u_i^2\Big)^{p/2},
\]
where $k$ depends only on $p$ and $c$. So,
\[
E\Big|\sum_{i=1}^{n} X_{i-1}\varepsilon_i\Big|^{2p'}
\le k\,E\Big(\sum_{i=1}^{n} X_{i-1}^2\Big)^{p'}
= k\,\Big\|\sum_{i=1}^{n} X_{i-1}^2\Big\|_{p'}^{p'}
\le k\,\Big(\sum_{i=1}^{n}\|X_{i-1}\|_{2p'}^2\Big)^{p'} .
\]
Now, $X_n = \rho X_{n-1}+\varepsilon_n = \varepsilon_n + \rho\varepsilon_{n-1}+\cdots+\rho^{n-1}\varepsilon_1+\rho^n X_0 = Y_n + \rho^n X_0$, and
\[
E|Y_n+\rho^n X_0|^{2p'} \le 2^{2p'}\big[E|Y_n|^{2p'} + (|\rho|^n|X_0|)^{2p'}\big] .
\]
It is sufficient to show that $\sup_n E|Y_n|^{2p'} < \infty$, since this implies
\[
E\Big|\sum_{i=1}^{n} X_{i-1}\varepsilon_i\Big|^{2p'} = O(n^{p'})
\quad\text{and}\quad
E\big[Q_n^{p'} I_{A_n^c}\big] = O(1) .
\]
Chapter 2

2.1 Introduction:

Model: $y_n = \beta_1 x_{n,1} + \cdots + \beta_p x_{n,p} + \varepsilon_n$, where $\{\varepsilon_n, F_n\}$ is a martingale difference sequence and $\vec x_n = (x_{n,1},\cdots,x_{n,p})$ is $F_{n-1}$-measurable.

Issue: Based on the observations $\{\vec x_1, y_1,\cdots,\vec x_n, y_n\}$, make inference on $\vec\beta$.

Examples:
(i) Classical Regression Model (fixed design, i.e. the $\vec x_i$'s are constant vectors).
(ii) Time series: AR(p) model,
\[
y_n = \beta_1 y_{n-1} + \beta_2 y_{n-2} + \cdots + \beta_p y_{n-p} + \varepsilon_n,
\]
where the $\varepsilon_n$ are i.i.d. $N(0,\sigma^2)$, and $\vec x_n = (y_{n-1},\cdots,y_{n-p})'$.
(iii) Input–Output Dynamic System.
(1) System Identification (Economics or Control).
(2) Control: $\vec u_n$ $F_{n-1}$-measurable.

Example: $y_n = \alpha y_{n-1} + \beta u_{n-1} + \varepsilon_n$.
Goal: $y_n \equiv T$, $T$ a fixed constant.
If $\alpha, \beta$ are known:
After observing $\{u_1, y_1,\cdots,u_{n-1}, y_{n-1}\}$, define $u_{n-1}$ so that
\[
T = \alpha y_{n-1} + \beta u_{n-1},\quad\text{i.e.}\quad u_{n-1} = \frac{T-\alpha y_{n-1}}{\beta}\ (\beta\ne0),
\]
which is $F_{n-1}$-measurable.

If $\alpha, \beta$ are unknown: based on $\{u_1, y_1,\cdots,u_{n-1}, y_{n-1}\}$, estimate $\alpha$ and $\beta$ (say by $\hat\alpha_{n-1}, \hat\beta_{n-1}$), and define
\[
u_{n-1} = \frac{T-\hat\alpha_{n-1} y_{n-1}}{\hat\beta_{n-1}} .
\]
Question: Is the system under control? Is $\dfrac1m\displaystyle\sum_{n=1}^{m}(y_n-\varepsilon_n-T)^2$ small?
(iv) Transformed Model:
Branching Process with Immigration: $X_{n+1} = \sum_{i=1}^{X_n} Y_{n+1,i} + I_{n+1}$.
$X_n$: the population size of the $n$-th generation;
$Y_{n+1,i}$: the number of descendants of the $i$-th member of the $n$-th generation;
$I_{n+1}$: the size of the immigration in the $(n+1)$-th generation.

Assumptions:
\[
E(X_{n+1}\mid F_n) = \sum_{i=1}^{X_n} E[Y_{n+1,i}\mid F_n] + E[I_{n+1}\mid F_n] = m X_n + b,
\]
\[
\mathrm{Var}(X_{n+1}\mid F_n) = \sum_{i=1}^{X_n} E\big((Y_{n+1,i}-m)^2\mid F_n\big) + E\big((I_{n+1}-b)^2\mid F_n\big) = X_n\sigma^2 + \sigma_I^2 .
\]
Let
\[
\varepsilon_{n+1} = \frac{\sum_{i=1}^{X_n}(Y_{n+1,i}-m) + (I_{n+1}-b)}{\sqrt{\sigma^2 X_n + \sigma_I^2}} .
\]
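The standardized increments $\varepsilon_{n+1}$ are martingale differences with conditional variance 1, which can be checked by simulation. A sketch (my illustration, with an assumed concrete law: Poisson$(m)$ offspring and Poisson$(b)$ immigration, so $\sigma^2 = m$ and $\sigma_I^2 = b$):

```python
import numpy as np

# Branching process with immigration, subcritical (m < 1) so X_n stays stable.
# The standardized increments eps_{n+1} should have mean ~ 0 and variance ~ 1.
rng = np.random.default_rng(5)
m, b, steps = 0.8, 2.0, 20000
x, eps = 5, []
for _ in range(steps):
    x_next = rng.poisson(m, size=x).sum() + rng.poisson(b)
    eps.append((x_next - m * x - b) / np.sqrt(m * x + b))
    x = x_next
eps = np.array(eps)
```

The long-run mean of $X_n$ is $b/(1-m) = 10$ here; the empirical mean and variance of the $\varepsilon_n$ come out near 0 and 1.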
T. L. Lai and C. Z. Wei (1982). Ann. Stat., 10, 154–166.

Model: $y_i = \vec\beta'\vec x_i + \varepsilon_i$, where $\{\varepsilon_i, F_i\}$ is a sequence of martingale differences and $\vec x_i$ is $F_{i-1}$-measurable.

Basic Issue: Make inference on $\vec\beta$, based on observations $\{\vec x_1, y_1,\cdots,\vec x_n, y_n\}$.

Estimation:
(a) $\varepsilon_i \sim$ i.i.d. $N(0,\sigma^2)$; $\vec x_1$ fixed, $\vec x_i \in \sigma(y_1,\cdots,y_{i-1})$, $i = 2, 3,\cdots$.
MLE of $\vec\beta$:
\[
L(\vec\beta) = L(\vec\beta, y_1,\cdots,y_n)
= L(\vec\beta, y_1,\cdots,y_{n-1})\,L(\vec\beta, y_n\mid y_1,\cdots,y_{n-1})
= L(\vec\beta, y_1,\cdots,y_{n-1})\,\frac1{\sqrt{2\pi}\,\sigma}\,e^{-(y_n-\vec\beta'\vec x_n)^2/2\sigma^2} = \cdots
\]
Solving the likelihood equation, we obtain the M.L.E.
\[
\hat{\vec\beta}_n = \Big(\sum_{i=1}^{n}\vec x_i\vec x_i'\Big)^{-1}\sum_{i=1}^{n}\vec x_i y_i,
\qquad
\hat\sigma_n^2 = \frac1n\sum_{i=1}^{n}\big(y_i - \hat{\vec\beta}_n'\vec x_i\big)^2 .
\]
Computation Aspect:
$\bullet$ Recursive Formula:
\[
\hat{\vec\beta}_{n+1} = \hat{\vec\beta}_n + \frac{y_{n+1}-\hat{\vec\beta}_n'\vec x_{n+1}}{1+\vec x_{n+1}'V_n\vec x_{n+1}}\,V_n\vec x_{n+1},
\qquad
V_{n+1} = V_n - \frac{V_n\vec x_{n+1}\vec x_{n+1}'V_n}{1+\vec x_{n+1}'V_n\vec x_{n+1}},
\qquad
V_n = \Big(\sum_{i=1}^{n}\vec x_i\vec x_i'\Big)^{-1} .
\]
$f$: hardware or program; $\hat{\vec\beta}_n$, $V_n$: stored in memory; $\vec x_{n+1}$: new data.
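The recursion can be checked against the batch least-squares solution. A minimal sketch (the first $p$ observations initialize $V$, assuming the initial design block is invertible):

```python
import numpy as np

# Recursive least squares (the Sherman-Morrison update above) vs batch LS.
rng = np.random.default_rng(6)
n, p = 200, 3
X = rng.normal(size=(n, p))
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + 0.1 * rng.normal(size=n)

V = np.linalg.inv(X[:p].T @ X[:p])        # V_p = (sum x_i x_i')^{-1}
bhat = V @ X[:p].T @ y[:p]
for i in range(p, n):
    x = X[i]
    denom = 1.0 + x @ V @ x
    bhat = bhat + ((y[i] - bhat @ x) / denom) * (V @ x)
    V = V - np.outer(V @ x, x @ V) / denom
batch = np.linalg.lstsq(X, y, rcond=None)[0]
```

With exact initialization, the recursive and batch estimates agree up to floating-point error at every $n$; only $\hat{\vec\beta}_n$ and $V_n$ need to be kept in memory.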
(1) If $A$ ($m\times m$) is nonsingular and $\upsilon, V\in\Re^m$, then
\[
(A+\upsilon V')^{-1} = A^{-1} - \frac{(A^{-1}\upsilon)(V'A^{-1})}{1+V'A^{-1}\upsilon} .
\]
If we set $V_{p_0} = \big(\sum_{i=1}^{p_0}\vec x_i\vec x_i'\big)^{-1}$ and $\hat{\vec\beta}_{p_0} = $ least squares estimator, then $V_{n+1} = \big(\sum_{i=1}^{n+1}\vec x_i\vec x_i'\big)^{-1}$ and the $\hat{\vec\beta}_n$ are least squares estimators of $\vec\beta$.

Engineer: set initial value $V_0 = CI$, $C$ very small; $\hat{\vec\beta}_0$: guess.

(2) If $A = B + \vec w\vec w'$ is nonsingular, then
\[
\vec w'A^{-1}\vec w = \frac{|A|-|B|}{|A|} .
\]
Notice: as $a_n\uparrow\infty$ with $a_{n-1}/a_n\to1$,
\[
\sum_{n=1}^{N}\frac{a_n-a_{n-1}}{a_n} \sim \log a_N .
\]
Special Case: $\sum_{i=1}^{n+1} x_i^2 = \big(\sum_{i=1}^{n} x_i^2\big) + x_{n+1}^2$.

proof (of (2)):
\[
|B| = |A-\vec w\vec w'| = \begin{vmatrix} A & \vec w\\ \vec w' & 1\end{vmatrix} \quad(*)
\]
Lemma: If $A$ is nonsingular, then $\begin{vmatrix} A & C\\ B & D\end{vmatrix} = |A|\,|D - BA^{-1}C|$.
proof:
\[
\det\begin{pmatrix} I & O\\ -BA^{-1} & I\end{pmatrix}\begin{pmatrix} A & C\\ B & D\end{pmatrix}
= \det\begin{pmatrix} A & C\\ 0 & D - BA^{-1}C\end{pmatrix} .
\]
So $(*) = |A|\,|1-\vec w'A^{-1}\vec w|$.

2. Strong Consistency:

Conditional Fisher's information matrix:
\[
L(\vec\beta, y_1,\cdots,y_n) = \prod_{i=1}^{n} L(\vec\beta, y_i\mid y_1,\cdots,y_{i-1}),\quad\text{implies}\quad
\log L(\vec\beta, y_1,\cdots,y_n) = \sum_{i=1}^{n}\log L(\vec\beta, y_i\mid y_1,\cdots,y_{i-1}) .
\]
Definition:
\[
J_i = E\left\{\frac{\partial\log L(\vec\beta, y_i\mid y_1,\cdots,y_{i-1})}{\partial\vec\beta}\,
\Big[\frac{\partial\log L(\vec\beta, y_i\mid y_1,\cdots,y_{i-1})}{\partial\vec\beta}\Big]'\
\Big|\ y_1,\cdots,y_{i-1}\right\},
\]
and the conditional Fisher information matrix is $I_n = \sum_{i=1}^{n} J_i$.

Model: $y_n = \vec\beta'\vec x_n+\varepsilon_n$, $\varepsilon_n$ i.i.d. $\sim N(0,\sigma^2)$, $\vec x_n\in\sigma\{y_1,\cdots,y_{n-1}\} = F_{n-1}$:
\[
\log L(\vec\beta, y_i\mid y_1,\cdots,y_{i-1}) = \log\frac1{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(y_i-\vec\beta'\vec x_i)^2}{2\sigma^2}}
= -\log\sqrt{2\pi}\,\sigma - \frac{(y_i-\vec\beta'\vec x_i)^2}{2\sigma^2},
\]
\[
J_i = E\Big\{\frac{(y_i-\vec\beta'\vec x_i)}{\sigma^2}\,\vec x_i\vec x_i'\,\frac{(y_i-\vec\beta'\vec x_i)}{\sigma^2}\ \Big|\ F_{i-1}\Big\}
= \vec x_i\vec x_i'\,E\{\varepsilon_i^2\mid F_{i-1}\}/\sigma^4 = \vec x_i\vec x_i'/\sigma^2,
\]
\[
I_n = \sum_{i=1}^{n}\vec x_i\vec x_i'/\sigma^2,
\qquad
I_n^{-1} = \Big(\sum_{i=1}^{n}\vec x_i\vec x_i'\Big)^{-1}\sigma^2 .
\]
Let $\delta_n$ (resp. $\vec e_*$) be the minimum eigenvalue (resp. eigenvector) of $I_n$. Then $\mathrm{Var}(\vec e_*'\hat{\vec\beta}_n) = \vec e_*'I_n^{-1}\vec e_* = 1/\delta_n \ge \vec e'I_n^{-1}\vec e$, $\forall\,\vec e$.
So the data set $\{\vec x_1, y_1, \vec x_2, y_2,\cdots,\vec x_n, y_n\}$ provides the least information for estimating $\vec\beta$ along the direction $\vec e_*$; we can interpret the maximum eigenvalue similarly.

When is the L.S.E. $\hat{\vec\beta}_n$ (strongly) consistent? Heuristically, if the most difficult direction has "infinite" information, we should be able to estimate $\vec\beta$ consistently. More precisely, if $\lambda_{\min}(I_n)\to\infty$, we expect $\hat{\vec\beta}_n\to\vec\beta$ a.s.

Example: $\varepsilon_i\sim$ i.i.d., $E\varepsilon_i = 0$, $\mathrm{Var}(\varepsilon_i)<\infty$. More generally, $\{\varepsilon_n, F_n\}$ is a martingale difference sequence.

Stochastic Case:

⟨1⟩ First Attempt (reduce to the 1-dimensional case):
\[
\hat{\vec\beta}_n - \vec\beta = \Big(\sum_{i=1}^{n}\vec x_i\vec x_i'\Big)^{-1}\sum_{i=1}^{n}\vec x_i\varepsilon_i .
\]
Recall that for $\{\varepsilon_i, F_i\}$ a martingale difference sequence and $u_i$ $F_{i-1}$-measurable,
\[
\sum_{i=1}^{n} u_i\varepsilon_i
\begin{cases}
\text{converges a.s. on } \big\{\sum_{i=1}^{\infty} u_i^2 < \infty\big\},\\[4pt]
= o\Big(\big(\sum_{i=1}^{n} u_i^2\big)^{1/2}\big[\log\big(\sum_{i=1}^{n} u_i^2\big)\big]^{\frac{1+\delta}2}\Big)\ a.s.\ \forall\,\delta>0 .
\end{cases}
\]
For $p = \dim(\vec\beta) = 1$:
Conclusion: $\hat\beta_n$ converges a.s.; the limit is $\beta$ on the set $\{I_n = \sum_{i=1}^{n} x_i^2\to\infty\}$. In fact, on this set
\[
\hat\beta_n - \beta = O\Big(\Big[\log\sum_{i=1}^{n} x_i^2\Big]^{\frac{1+\delta}2}\Big/\Big(\sum_{i=1}^{n} x_i^2\Big)^{1/2}\Big)\quad a.s.\ \forall\,\delta>0 .
\]
Let $P_n = \sum_{i=1}^{n}\vec x_i\vec x_i'$, $V_n = P_n^{-1}$, $D_n = \mathrm{diag}(P_n)$:
\[
\hat{\vec\beta}_n - \vec\beta = (P_n^{-1}D_n)\Big(D_n^{-1}\sum_{i=1}^{n}\vec x_i\varepsilon_i\Big)
= P_n^{-1}D_n\begin{pmatrix}\sum_{i=1}^{n} x_{i1}\varepsilon_i\big/\sum_{i=1}^{n} x_{i1}^2\\ \vdots\\ \sum_{i=1}^{n} x_{ip}\varepsilon_i\big/\sum_{i=1}^{n} x_{ip}^2\end{pmatrix} .
\]
So
\[
\|\hat{\vec\beta}_n-\vec\beta\| \le \|P_n^{-1}\|\,\|D_n\|\,\max_{1\le j\le p}\frac{\big[\log\sum_{i=1}^{n} x_{ij}^2\big]^{\frac{1+\delta}2}}{\big(\sum_{i=1}^{n} x_{ij}^2\big)^{1/2}}
= O\Big(\frac1{\lambda_n}\cdot\lambda_n^*\cdot\frac{(\log\lambda_n^*)^{\frac{1+\delta}2}}{\lambda_n^{1/2}}\Big),
\qquad \lambda_n^*:\ \text{max. eigenvalue},
\]
since $(0,\cdots,0,1,0,\cdots,0)\,P_n\,(0,\cdots,0,1,0,\cdots,0)' = \sum_{i=1}^{n} x_{ij}^2 \ge \lambda_n$:
\[
= O\big(\lambda_n^*(\log\lambda_n^*)^{\frac{1+\delta}2}\big/\lambda_n^{3/2}\big) . \quad(*)
\]
Conclusion: $\hat{\vec\beta}_n\to\vec\beta$ a.s. on the set
\[
C = \Big\{\lim_{n\to\infty}\lambda_n^*(\log\lambda_n^*)^{(1+\delta)/2}\big/\lambda_n^{3/2} = 0,\ \text{for some }\delta>0\Big\} .
\]
Remark: $C \subset \big\{\lim_{n\to\infty}\lambda_n^*/\lambda_n^{3/2} = 0\big\}$, and (for $p = 2$)
\[
\frac{\lambda_n}2 \le \frac{\det P_n}{\mathrm{tr}(P_n)} = \frac{\lambda_n^*\lambda_n}{\lambda_n^*+\lambda_n} \le \lambda_n .
\]
Example 1: $y_i = \beta_1+\beta_2 i+\varepsilon_i$, $i = 1, 2, 3,\cdots,n$, $\vec x_i = \binom1i$:
\[
P_n = \sum_{i=1}^{n}\vec x_i\vec x_i' = \begin{pmatrix} n & \sum_{i=1}^{n} i\\[2pt] \sum_{i=1}^{n} i & \sum_{i=1}^{n} i^2\end{pmatrix}
\quad\Rightarrow\quad
\mathrm{tr}(P_n) = n+\sum_{i=1}^{n} i^2 \sim n^3,
\]
\[
\det(P_n) = n\sum_{i=1}^{n} i^2 - \Big(\sum_{i=1}^{n} i\Big)^2
\sim n\cdot\frac{n^3}3 - \Big(\frac{n^2}2\Big)^2 = \frac{n^4}3 - \frac{n^4}4 = \frac{n^4}{12},
\]
which implies $\lambda_n^* \sim n^3$ and $\lambda_n \sim n$.

Example 2 (AR(2), $z_n = \beta_1 z_{n-1}+\beta_2 z_{n-2}+\varepsilon_n$): the characteristic polynomial is
\[
P(\lambda) = (\lambda-\rho_1)(\lambda-\rho_2) = \lambda^2 - (\rho_1+\rho_2)\lambda + \rho_1\rho_2,
\qquad
\beta_1 = \rho_1+\rho_2,\quad \beta_2 = -\rho_1\rho_2,
\]
\[
y_n = z_n,\qquad \vec x_n = \binom{z_{n-1}}{z_{n-2}} .
\]
Decomposition:
\[
\binom{v_n}{w_n} = \begin{pmatrix}1 & -\rho_1\\ 1 & -\rho_2\end{pmatrix}\binom{z_n}{z_{n-1}}
= \binom{z_n-\rho_1 z_{n-1}}{z_n-\rho_2 z_{n-1}} .
\]
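The growth rates claimed in Example 1 ($\lambda_n^*\sim n^3$, $\lambda_n\sim n$) can be checked numerically. A sketch:

```python
import numpy as np

# Eigenvalues of P_n for the trend regression y_i = beta1 + beta2*i + eps_i.
# Doubling n should multiply lambda_max by ~ 2^3 = 8 and lambda_min by ~ 2.
def eigs(n):
    i = np.arange(1, n + 1, dtype=float)
    P = np.array([[n, i.sum()], [i.sum(), (i**2).sum()]])
    return np.linalg.eigvalsh(P)   # ascending: [lambda_min, lambda_max]

lo1, hi1 = eigs(1000)
lo2, hi2 = eigs(2000)
```

In particular $\lambda_n^*/\lambda_n^{3/2}\sim n^{3/2}$ does not vanish, so the crude sufficient condition $C$ of the Remark fails for this design.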
Claim: $v_n = \rho_2 v_{n-1}+\varepsilon_n$, $w_n = \rho_1 w_{n-1}+\varepsilon_n$.

Take $\rho_2 = 1$, $\rho_1 = 0$; then $v_n - v_{n-1} = \varepsilon_n$, so $v_n = \sum_{i=1}^{n}\varepsilon_i + v_0$, and $w_n = \varepsilon_n$.
\[
P_n = \sum_{i=1}^{n}\binom{z_{i-1}}{z_{i-2}}(z_{i-1}, z_{i-2}),
\]
\[
\begin{pmatrix}1 & -\rho_1\\ 1 & -\rho_2\end{pmatrix}P_n\begin{pmatrix}1 & 1\\ -\rho_1 & -\rho_2\end{pmatrix}
= \sum_{i=1}^{n}\binom{v_{i-1}}{w_{i-1}}(v_{i-1}, w_{i-1})
= \begin{pmatrix}\sum v_i^2 & \sum v_i w_i\\ \sum v_i w_i & \sum w_i^2\end{pmatrix} .
\]
With $v_0 = 0$: $v_n = \sum_{i=1}^{n}\varepsilon_i$, $w_i = \varepsilon_i$, where $\varepsilon_i$ are i.i.d., $E\varepsilon_i = 0$, $\mathrm{Var}(\varepsilon_i)<\infty$.
\[
\mathrm{tr}(P_n)\ \text{is of the order}\ \Big(\sum_{i=1}^{n} v_i^2\Big)+\Big(\sum_{i=1}^{n}\varepsilon_i^2\Big),
\]
\[
\det(P_n) = \Big(\sum_{i=1}^{n} v_i^2\Big)\Big(\sum_{i=1}^{n}\varepsilon_i^2\Big) - \Big(\sum_{i=1}^{n} v_i\varepsilon_i\Big)^2
= \Big(\sum_{i=1}^{n} v_i^2\Big)\Big(\sum_{i=1}^{n}\varepsilon_i^2\Big) - \Big(\sum_{i=1}^{n}\varepsilon_i^2 + \sum_{i=1}^{n} v_{i-1}\varepsilon_i\Big)^2,
\]
because $v_i = v_{i-1}+\varepsilon_i$. Moreover,
\[
\limsup_{n\to\infty}\frac{\sum_{i=1}^{n} v_i^2}{n(2n\log\log n)} < \infty \quad a.s.\ \text{(Donsker-type argument)},
\qquad
\liminf_{n\to\infty}\frac{(\log\log n)\sum_{i=1}^{n} v_i^2}{n^2} > 0 \quad a.s.,
\]
which implies $\mathrm{tr}(P_n) \sim \sum_{i=1}^{n} v_i^2$.

Because $\sum_{i=1}^{n} v_{i-1}\varepsilon_i = O\Big(\big(\sum_{i=1}^{n} v_{i-1}^2\big)^{1/2}\big[\log\sum_{i=1}^{n} v_{i-1}^2\big]^{\frac{1+\delta}2}\Big)$,
\[
\det(P_n) = \Big(\sum_{i=1}^{n} v_i^2\Big)\Big(\sum_{i=1}^{n}\varepsilon_i^2\Big)
- O\Big(n^2 + \sum_{i=1}^{n} v_{i-1}^2\Big(\log\sum_{i=1}^{n} v_{i-1}^2\Big)^{1+\delta}\Big)
\]
\[
= \Big(\sum_{i=1}^{n} v_i^2\Big)\Big(\sum_{i=1}^{n}\varepsilon_i^2\Big)
\Bigg(1 - O\Bigg(\frac{n^2 + \sum_{i=1}^{n} v_{i-1}^2\big(\log\sum_{i=1}^{n} v_{i-1}^2\big)^{1+\delta}}{\big(\sum_{i=1}^{n} v_i^2\big)\big(\sum_{i=1}^{n}\varepsilon_i^2\big)}\Bigg)\Bigg) .
\]
Here
\[
\frac{n^2}{\big(\sum_{i=1}^{n} v_{i-1}^2\big)\big(\sum_{i=1}^{n}\varepsilon_i^2\big)}
\sim \frac{n^2}{\big(\sum_{i=1}^{n} v_{i-1}^2\big)\cdot n}
= O\Big(\frac{n}{n^2/\log\log n}\Big) = O\Big(\frac{\log\log n}{n}\Big),
\]
\[
\frac{\big[\log\sum_{i=1}^{n} v_{i-1}^2\big]^{1+\delta}\sum_{i=1}^{n} v_{i-1}^2}{\big(\sum_{i=1}^{n} v_{i-1}^2\big)\big(\sum_{i=1}^{n}\varepsilon_i^2\big)}
= O\Big(\frac{(\log n)^{1+\delta}}{n}\Big) = o(1),
\]
which implies
\[
\mathrm{tr}(P_n) \sim \sum_{i=1}^{n} v_i^2,
\qquad
\det(P_n) \sim \Big(\sum_{i=1}^{n} v_i^2\Big)\cdot n .
\]
So the crude condition of ⟨1⟩ is not applicable here.
⟨2⟩ Second Approach:
Energy function (Lyapunov function): $d\varepsilon(x(t))/dt < 0$. Roughly speaking, construct a function $V:\Re^p\to\Re$ with
\[
V(\vec x) > 0 \ \text{if}\ \vec x\ne\vec0,\qquad V(\vec0) = 0,\qquad \inf_{|\vec x|>M}V(\vec x) > 0 ;
\]
if $V(\vec w_{n+1}) \le V(\vec w_n)$ and $\lim_{n\to\infty}V(\vec\omega_n) = 0$, then $\lim_{n\to\infty}\vec w_n = \vec0$.
Two essential ideas:
(1) decreasing;
(2) never ending unless it reaches zero.

What are the probabilistic analogues? Decreasing → supermartingale → almost supermartingale.

Recall the following theorem (Robbins and Siegmund, 1971, Optimization Methods in Statistics, ed. Rustagi, 233–).

Lemma (Important Theorem): Let $a_n, b_n, c_n, d_n$ be $F_n$-measurable nonnegative random variables s.t.
\[
E[a_{n+1}\mid F_n] \le a_n(1+b_n) + c_n - d_n .
\]
Then on the event $\big\{\sum_{i=1}^{\infty} b_i < \infty,\ \sum_{i=1}^{\infty} c_i < \infty\big\}$, $\lim_{n\to\infty} a_n$ exists and is finite a.s., and $\sum_{i=1}^{n} d_i < \infty$ a.s.

What is the supermartingale in the above? Ans: $b_n = 0$, $c_n = 0$, $d_n = 0$.

We start with the residual sum of squares:
\[
\sum_{i=1}^{n}\big(y_i-\hat{\vec\beta}_n'\vec x_i\big)^2 = \sum_{i=1}^{n}\varepsilon_i^2 - Q_n,
\quad\text{where}\quad
Q_n = \sum_{i=1}^{n}\big(\hat{\vec\beta}_n'\vec x_i-\vec\beta'\vec x_i\big)^2
= (\hat{\vec\beta}_n-\vec\beta)'\Big(\sum_{i=1}^{n}\vec x_i\vec x_i'\Big)(\hat{\vec\beta}_n-\vec\beta) .
\]
That is, relative to $\sum_{i=1}^{n}\varepsilon_i^2$, $Q_n$ should be smaller. Therefore $Q_n/a_n^*$ may be the right candidate for the "energy function". Another aspect of $Q_n$ is that it is a quadratic function of $(\hat{\vec\beta}_n-\vec\beta)$, which reaches zero only when $\hat{\vec\beta}_n = \vec\beta$.
How to choose $a_n^*$? Since
\[
Q_n \ge \|\hat{\vec\beta}_n-\vec\beta\|^2\cdot\lambda_n,
\quad\text{i.e.}\quad
Q_n/\lambda_n \ge \|\hat{\vec\beta}_n-\vec\beta\|^2,
\]
choose $a_n^* = \lambda_n$.

Theorem: In the stochastic regression model $y_n = \vec\beta'\vec x_n+\varepsilon_n$, if $\sup_n E[\varepsilon_n^2\mid F_{n-1}] < \infty$ a.s.

proof: $a_n = Q_n/\lambda_n$, $b_n = 0$.
\[
Q_n = \Big(\sum_{i=1}^{n}\vec x_i\varepsilon_i\Big)'\Big(\sum_{i=1}^{n}\vec x_i\vec x_i'\Big)^{-1}\Big(\sum_{i=1}^{n}\vec x_i\varepsilon_i\Big),
\]
\[
E[a_n\mid F_{n-1}]
= \Big(\sum_{i=1}^{n-1}\vec x_i\varepsilon_i\Big)'V_n\Big(\sum_{i=1}^{n-1}\vec x_i\varepsilon_i\Big)\Big/\lambda_n
+ 2E\Big[\varepsilon_n\,\vec x_n'V_n\sum_{i=1}^{n-1}\vec x_i\varepsilon_i\ \Big|\ F_{n-1}\Big]\Big/\lambda_n
+ E\big(\vec x_n'V_n\vec x_n\,\varepsilon_n^2\mid F_{n-1}\big)\big/\lambda_n
\]
\[
= \Big(\sum_{i=1}^{n-1}\vec x_i\varepsilon_i\Big)'V_n\Big(\sum_{i=1}^{n-1}\vec x_i\varepsilon_i\Big)\Big/\lambda_n
+ \vec x_n'V_n\vec x_n\,E[\varepsilon_n^2\mid F_{n-1}]/\lambda_n
\]
\[
\le \Big(\sum_{i=1}^{n-1}\vec x_i\varepsilon_i\Big)'V_{n-1}\Big(\sum_{i=1}^{n-1}\vec x_i\varepsilon_i\Big)\Big/\lambda_n + c_{n-1}
= Q_{n-1}/\lambda_n + c_{n-1}
\]
\[
= Q_{n-1}/\lambda_{n-1} - Q_{n-1}\Big(\frac1{\lambda_{n-1}}-\frac1{\lambda_n}\Big) + c_{n-1}
= a_{n-1} - a_{n-1}\big(1-\lambda_{n-1}/\lambda_n\big) + c_{n-1} .
\]
By the almost supermartingale theorem,
\[
\lim_{n\to\infty} a_n < \infty \quad\text{and}\quad \sum a_{n-1}\,\frac{\lambda_n-\lambda_{n-1}}{\lambda_n} < \infty
\]
a.s. on
\[
\Big\{\sum c_{n-1} < \infty\Big\} = \Big\{\sum\frac{\vec x_n'V_n\vec x_n}{\lambda_n}\,E[\varepsilon_n^2\mid F_{n-1}] < \infty\Big\}
\supset \Big\{\sum\frac{\vec x_n'V_n\vec x_n}{\lambda_n} < \infty\Big\} .
\]
If $\lim_{n\to\infty} a_n = a > 0$, then $\exists N$ s.t. $a_n \ge a/2$, $\forall\,n > N$, so
\[
\sum_{i} a_{i-1}\,\frac{\lambda_i-\lambda_{i-1}}{\lambda_i} \ge \frac a2\sum_{i=N}^{\infty}\frac{\lambda_i-\lambda_{i-1}}{\lambda_i}
\ge \frac a2\,\inf_{n\ge N}(\lambda_{n-1}/\lambda_n)\sum_{i=N}^{\infty}\int_{\lambda_{i-1}}^{\lambda_i}\frac{dx}x
\ge \frac a2\,\inf_{n\ge N}(\lambda_{n-1}/\lambda_n)\int_{\lambda_{N-1}}^{\infty}\frac{dx}x = \infty,
\]
a contradiction.

Note 1: If $\lambda_{n-1}/\lambda_n$ has a limit point $\lambda < 1$, then there exists a subsequence $n_j$ with
\[
\lim_{j\to\infty}\lambda_{n_j-1}/\lambda_{n_j} = \lambda,
\qquad
\lim_{j\to\infty}\frac{\lambda_{n_j}-\lambda_{n_j-1}}{\lambda_{n_j}} = 1-\lambda .
\]
This contradicts the convergence of the series.

Note 2: If $\sum_{i}\frac{\lambda_i-\lambda_{i-1}}{\lambda_i} < \infty$, then $\frac{\lambda_n-\lambda_{n-1}}{\lambda_n}\to0$, i.e. $\lambda_{n-1}/\lambda_n\to1$.

Therefore, on the event
\[
\Big\{\sum\frac{\vec x_n'V_n\vec x_n}{\lambda_n} < \infty,\ \lambda_n\to\infty\Big\},
\]
$a_n\to0$ a.s.; since $a_n \ge \|\hat{\vec\beta}_n-\vec\beta\|^2$, $\hat{\vec\beta}_n\to\vec\beta$ a.s. on the same event.
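The conclusion can be illustrated on a concrete model. A sketch (my illustration, not from the notes): for the stable AR(1) model $y_n = \beta y_{n-1}+\varepsilon_n$ with $|\beta|<1$, $\lambda_{\min}\big(\sum y_{i-1}^2\big)\to\infty$ and the least-squares estimator is strongly consistent.

```python
import numpy as np

# Strong consistency of the LSE in a stable AR(1): the estimation error
# should shrink as n grows (error is Op(n^{-1/2}) here).
rng = np.random.default_rng(7)
beta = 0.6

def lse(n):
    y = np.zeros(n + 1)
    for i in range(1, n + 1):
        y[i] = beta * y[i - 1] + rng.normal()
    x, resp = y[:-1], y[1:]
    return (x @ resp) / (x @ x)

err_small = abs(lse(500) - beta)
err_large = abs(lse(50000) - beta)
```

With $n = 50{,}000$ the error is of order $\sqrt{(1-\beta^2)/n}\approx0.004$, two orders of magnitude below the true coefficient.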
Corollary: On the event $\{\lambda_n\to\infty$ and $(\log\lambda_n^*)^{1+\delta} = O(\lambda_n)$ for some $\delta > 0\}$,
\[
\sum_{n=p}^{\infty}\frac{\vec x_n'V_n\vec x_n}{\lambda_n}
= \sum_{n=p}^{\infty}\frac{|P_n|-|P_{n-1}|}{|P_n|\,\lambda_n}
\quad(\text{by } P_n = P_{n-1}+\vec x_n\vec x_n')
\]
\[
= O\Big(\sum_{n=p}^{\infty}\frac{|P_n|-|P_{n-1}|}{|P_n|\,(\log\lambda_n^*)^{1+\delta}}\Big)
= O\Big(\sum_{n=p}^{\infty}\frac{|P_n|-|P_{n-1}|}{|P_n|\,(\log|P_n|)^{1+\delta}}\Big)
= O(1),
\]
since $|P_n| = \lambda_n^*\cdots\lambda_n\to\infty$ implies $\log|P_n| \le p\log\lambda_n^*$.

$\bullet\bullet$ Knopp, Sequences and Series: as $a_n\uparrow\infty$,
\[
\sum\frac{a_n-a_{n-1}}{a_n(\log a_n)^{1+\delta}} < \infty,
\qquad\text{cf.}\quad \int_2^{\infty}\frac{dx}{x(\log x)^{1+\delta}} < \infty .
\]
Because $\vec x_n'V_n = \vec x_n'V_{n-1}/(1+\vec x_n'V_{n-1}\vec x_n)$:
\[
\vec x_n'V_n = \vec x_n'V_{n-1} - \frac{\vec x_n'V_{n-1}\vec x_n\,\vec x_n'V_{n-1}}{1+\vec x_n'V_{n-1}\vec x_n} .
\]
⟨3⟩ Third Approach:
\[
Q_k = \Big(\sum_{i=1}^{k}\vec x_i\varepsilon_i\Big)'V_k\Big(\sum_{i=1}^{k}\vec x_i\varepsilon_i\Big)
= \Big(\sum_{i=1}^{k-1}\vec x_i\varepsilon_i\Big)'V_k\Big(\sum_{i=1}^{k-1}\vec x_i\varepsilon_i\Big)
+ \vec x_k'V_k\vec x_k\,\varepsilon_k^2
+ 2\Big(\vec x_k'V_k\sum_{i=1}^{k-1}\vec x_i\varepsilon_i\Big)\varepsilon_k
\]
\[
= Q_{k-1} - \Big(\vec x_k'V_{k-1}\sum_{i=1}^{k-1}\vec x_i\varepsilon_i\Big)^2\Big/(1+\vec x_k'V_{k-1}\vec x_k)
+ \vec x_k'V_k\vec x_k\,\varepsilon_k^2
+ 2\Big(\vec x_k'V_k\sum_{i=1}^{k-1}\vec x_i\varepsilon_i\Big)\varepsilon_k .
\]
Summing,
\[
Q_n - Q_N = \sum_{j=N+1}^{n}(Q_j-Q_{j-1})
= -\sum_{k=N+1}^{n}\Big(\vec x_k'V_{k-1}\sum_{i=1}^{k-1}\vec x_i\varepsilon_i\Big)^2\Big/(1+\vec x_k'V_{k-1}\vec x_k)
+ \sum_{k=N+1}^{n}\vec x_k'V_k\vec x_k\,\varepsilon_k^2
+ 2\sum_{k=N+1}^{n}\Big(\vec x_k'V_k\sum_{i=1}^{k-1}\vec x_i\varepsilon_i\Big)\varepsilon_k,
\]
which implies
\[
Q_n - Q_N + \sum_{k=N+1}^{n}\Big(\vec x_k'V_{k-1}\sum_{i=1}^{k-1}\vec x_i\varepsilon_i\Big)^2\Big/(1+\vec x_k'V_{k-1}\vec x_k)
\]
\[
(1)\quad = \sum_{k=N+1}^{n}\vec x_k'V_k\vec x_k\,\varepsilon_k^2
+ 2\sum_{k=N+1}^{n}\frac{\vec x_k'V_{k-1}\big(\sum_{i=1}^{k-1}\vec x_i\varepsilon_i\big)}{1+\vec x_k'V_{k-1}\vec x_k}\,\varepsilon_k
= \sum_{k=N+1}^{n}\vec x_k'V_k\vec x_k\,\varepsilon_k^2 + 2\sum_{k=N+1}^{n} U_k\varepsilon_k,
\]
where
\[
(2)\quad U_k := \frac{\vec x_k'V_{k-1}\big(\sum_{i=1}^{k-1}\vec x_i\varepsilon_i\big)}{1+\vec x_k'V_{k-1}\vec x_k} .
\]
Therefore
\[
\sum_{k=N+1}^{n} U_k\varepsilon_k =
\begin{cases}
O\Big(\sum_{k=N+1}^{n} U_k^2\Big) & \text{on } \big[\sum_{k=N+1}^{\infty} U_k^2 < \infty\big],\\[6pt]
o\Big(\sum_{k=N+1}^{n} U_k^2\Big) & \text{on } \big[\sum_{k=N+1}^{\infty} U_k^2 = \infty\big] .
\end{cases}
\]
But
\[
\sum_{N+1}^{n} U_k^2 \le \sum_{N+1}^{n} U_k^2\big(1+\vec x_k'V_{k-1}\vec x_k\big)
= \sum_{N+1}^{n}\Big(\vec x_k'V_{k-1}\sum_{i=1}^{k-1}\vec x_i\varepsilon_i\Big)^2\Big/(1+\vec x_k'V_{k-1}\vec x_k) .
\]
For example, with $x_i\equiv1$,
\[
\sum_{k=N+1}^{n} U_k^2 \sim \sum_{k=N+1}^{n}\frac{\varepsilon_k^2}{k},
\qquad
\sum_{k=N+1}^{n}\frac{\Big(\frac1{k-1}\sum_{i=1}^{k-1}\varepsilon_i\Big)^2}{1+\frac1{k-1}}
= \sum_{k=N+1}^{n}\frac{k-1}{k}\,(\bar\varepsilon_{k-1})^2 .
\]
Because $\sum(\bar\varepsilon_k)^2 \sim (\log n)\sigma^2$,
\[
Q_n + \sum_{k=N+1}^{n}\Big(\vec x_k'V_{k-1}\sum_{i=1}^{k-1}\vec x_i\varepsilon_i\Big)^2\Big/(1+\vec x_k'V_{k-1}\vec x_k)
\sim \sum_{k=N+1}^{n}\vec x_k'V_k\vec x_k\,\varepsilon_k^2,\quad\text{if one of them}\to\infty,
\]
where
\[
Q_n = (\hat{\vec\beta}_n-\vec\beta)'\Big(\sum_{i=1}^{n}\vec x_i\vec x_i'\Big)(\hat{\vec\beta}_n-\vec\beta)
= \Big(\sum_{i=1}^{n}\vec x_i\varepsilon_i\Big)'V_n\Big(\sum_{i=1}^{n}\vec x_i\varepsilon_i\Big) .
\]
Then: (i)
\[
\sum_{k=1}^{\infty}|u_k|\,\varepsilon_k^2 < \infty \ a.s.\ \text{on}\ \Big\{\sum_{k=1}^{\infty}|u_k| < \infty\Big\},
\]
and
\[
\sum_{k=1}^{n}|u_k|\,\varepsilon_k^2 = o\Big(\Big(\sum_{k=1}^{n}|u_k|\Big)\Big[\log\sum_{k=1}^{n}|u_k|\Big]^{1+\delta}\Big)
\quad\text{on the set}\ \Big\{\sum_{k=1}^{\infty}|u_k| = \infty\Big\},\ \text{for all }\delta>0 .
\]
(ii) Assume that $\sup_n E[|\varepsilon_n|^{\alpha}\mid F_{n-1}] < \infty$ for some $\alpha > 2$. Then
\[
\sum_{k=1}^{n}|u_k|\,\varepsilon_k^2 - \sum_{k=1}^{n}|u_k|\,E[\varepsilon_k^2\mid F_{k-1}]
= o\Big(\sum_{k=1}^{n}|u_k|\Big)
\quad a.s.\ \text{on}\ \Big\{\sum_{k=1}^{\infty}|u_k| = \infty,\ \sup_n|u_n| < \infty\Big\} .
\]
Therefore, if $\lim_{k\to\infty} E[\varepsilon_k^2\mid F_{k-1}] = \sigma^2$ a.s., then
\[
\lim_{n\to\infty}\sum_{k=1}^{n}|u_k|\,\varepsilon_k^2\Big/\sum_{k=1}^{n}|u_k| = \sigma^2
\quad a.s.\ \text{on}\ \Big\{\sum_{k=1}^{\infty}|u_k| = \infty,\ \sup_n|u_n| < \infty\Big\} .
\]
Note: the basic idea is to ask, for $z_i \ge 0$, the relation of
\[
\sum_{i=1}^{n} z_i \quad\text{and}\quad \sum_{i=1}^{n} E[z_i\mid F_{i-1}],
\]
because
\[
\sum_{k=1}^{n} E\big(|u_k|\,\varepsilon_k^2\mid F_{k-1}\big) = \sum_{k=1}^{n}|u_k|\,E\big(\varepsilon_k^2\mid F_{k-1}\big) .
\]
\[
\le E\Big[\sum_{i=1}^{\infty}|u_i|\,I_{\big[\sum_{j=1}^{i}|u_j|\le M\big]}\cdot M\Big] \le M^2 < \infty,
\]
so $\sum_{i=1}^{\infty}|v_i|\,\varepsilon_i^2 < \infty$ a.s. Observe that $v_k = u_k$, $\forall\,k$, on
\[
\Omega_M = \Big\{\sup_n E[\varepsilon_n^2\mid F_{n-1}] \le M,\ \sum_{n=1}^{\infty}|u_n| \le M\Big\} .
\]
So $\sum_{i=1}^{\infty}|u_i|\,\varepsilon_i^2 < \infty$ a.s. on $\Omega_M$, $\forall\,M$. But
\[
\bigcup_{M=1}^{\infty}\Omega_M = \Big\{\sup_n E[\varepsilon_n^2\mid F_{n-1}] < \infty,\ \sum_{n=1}^{\infty}|u_n| < \infty\Big\}
= \Big\{\sum_{n=1}^{\infty}|u_n| < \infty\Big\} .
\]
( n
)
X
By Kronecker0 s Lemma, on sn = | ui |→ ∞
i=1
n
X
| uk | ε2k
k=1
lim = 0 a.s.
n→∞ sn (log sn )1+δ
(ii) (Chow (1965), local convergence theorem).
For a martingale difference sequence {δk , Fk }
X n
εk converges a.s. on
k=1
(∞ )
X
E(| δk |r | Fk−1 ) < ∞ .
k=1
where 1 ≤ r ≤ 2.
Set $\delta_k=u_k^2\big[\varepsilon_k^2-E(\varepsilon_k^2\mid\mathcal F_{k-1})\big]$. Then $\{\delta_k,\mathcal F_k\}$ is a martingale difference sequence. Without loss of generality we can assume that $2<\alpha\le 4$: if $\alpha\ge 4$, then $E^{1/4}(\varepsilon_i^4\mid\mathcal F_{i-1})\le E^{1/\alpha}(|\varepsilon_i|^\alpha\mid\mathcal F_{i-1})$. Set $r=\alpha/2$ and let $t_n=\sum_{i=1}^{n}|u_i|^{2r}$. Then
$$E[|\delta_k|^r\mid\mathcal F_{k-1}]
=|u_k|^{2r}E\big\{\big|\varepsilon_k^2-E[\varepsilon_k^2\mid\mathcal F_{k-1}]\big|^r\mid\mathcal F_{k-1}\big\}$$
$$\le|u_k|^{2r}E\big\{\big[\max\big(\varepsilon_k^2,\,E[\varepsilon_k^2\mid\mathcal F_{k-1}]\big)\big]^r\mid\mathcal F_{k-1}\big\}
\le|u_k|^{2r}E\big\{|\varepsilon_k|^{2r}+E^r[\varepsilon_k^2\mid\mathcal F_{k-1}]\mid\mathcal F_{k-1}\big\}.$$
So
$$\sum_{k=1}^{n}\delta_k=o(t_n)\ \text{a.s. on }\{t_n\to\infty\},$$
but $\sum_{k=1}^{n}\delta_k$ converges a.s. on $\Big\{\sum_{i=1}^{\infty}|u_i|^{2r}=\lim_{n\to\infty}t_n<\infty\Big\}$, by Chow's Theorem applied to $\sum_{i=1}^{n}\delta_i$.
Observe that on $\{\sup_n|u_n|<\infty\}$,
$$t_n\le\Big(\sum_{i=1}^{n}|u_i|\Big)\Big(\sup_n|u_n|\Big)^{2r-1}.$$
This is because:
(a) On $\Big\{\sum_{i=1}^{\infty}|u_i|<\infty,\ \sup_n|u_n|<\infty\Big\}$,
$$\sum_{k=1}^{n}|u_k|\,\varepsilon_k^2=O(1)=O\Big(\sum_{k=1}^{n}|u_k|\Big)\quad\text{(by (i))}.$$
(b) On $\Big\{\sum_{k=1}^{\infty}|u_k|=\infty,\ \sup_n|u_n|<\infty\Big\}$,
$$\sum_{k=1}^{n}|u_k|\,\varepsilon_k^2
=\sum_{k=1}^{n}|u_k|\,E(\varepsilon_k^2\mid\mathcal F_{k-1})+o\Big(\sum_{k=1}^{n}|u_k|\Big)$$
$$\le\Big(\sum_{i=1}^{n}|u_i|\Big)\sup_n E(\varepsilon_n^2\mid\mathcal F_{n-1})+o\Big(\sum_{k=1}^{n}|u_k|\Big)
=\Big(\sum_{i=1}^{n}|u_i|\Big)\Big(\sup_n E(\varepsilon_n^2\mid\mathcal F_{n-1})+o(1)\Big)
=O\Big(\sum_{i=1}^{n}|u_i|\Big).$$
Now, if $\lim_{n\to\infty}E[\varepsilon_n^2\mid\mathcal F_{n-1}]=\sigma^2$, then
$$\sum_{k=1}^{n}|u_k|\,E[\varepsilon_k^2\mid\mathcal F_{k-1}]\Big/\sum_{k=1}^{n}|u_k|\to\sigma^2
\ \text{a.s. on }\Big\{\sum_{k=1}^{\infty}|u_k|=\infty\Big\},$$
by the elementary fact that if $a_n\ge 0$, $b_n\ge 0$, $b_n\to b$ and $\sum_{i=1}^{n}a_i\to\infty$, then $\sum_{i=1}^{n}a_ib_i\big/\sum_{i=1}^{n}a_i\to b$. So
$$\sum_{k=1}^{n}|u_k|\,\varepsilon_k^2\Big/\sum_{k=1}^{n}|u_k|
=\frac{\sum_{k=1}^{n}|u_k|\,E[\varepsilon_k^2\mid\mathcal F_{k-1}]}{\sum_{k=1}^{n}|u_k|}+o(1)
\to\sigma^2\ \text{a.s. on }\Big\{\sup_n|u_n|<\infty,\ \sum_{k=1}^{\infty}|u_k|=\infty\Big\}.$$
proof: (i) Trivial.
(ii)
$$\vec w_n'A_n^{-1}\vec w_n=\frac{|A_n|-|A_{n-1}|}{|A_n|},\qquad
(\lambda_n^*)^p\ge|A_n|\ge\lambda_n^*\,\lambda_{*n}^{p-1}.$$
(iii) Note that
$$\sum_{i=N}^{n}\vec w_i'A_i^{-1}\vec w_i=\sum_{i=N}^{n}\frac{|A_i|-|A_{i-1}|}{|A_i|}
\le\sum_{i=N+1}^{n}\int_{|A_{i-1}|}^{|A_i|}\frac{dx}{x}+1
=1+\log|A_n|-\log|A_N|
=O(\log|A_n|)=O(\log\lambda_n^*).$$
Now $\dfrac{|A_n|-|A_{n-1}|}{|A_n|}\to 0$ implies
$$\sum_{i=N}^{n}\frac{|A_i|-|A_{i-1}|}{|A_i|}\sim\log|A_n|.$$
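Both the determinant identity in (ii) and the telescoping bound in (iii) are easy to verify numerically. The following sketch uses illustrative choices ($p = 3$, $A_0 = I$, Gaussian update vectors $\vec w_i$) and checks them for random rank-one updates:

```python
import numpy as np

# Check: for A_n = A_{n-1} + w_n w_n' (A_0 = I, p = 3),
#   w_n' A_n^{-1} w_n = (|A_n| - |A_{n-1}|) / |A_n|,
# and the telescoping sum is bounded by log|A_n| - log|A_0| = log|A_n|.
rng = np.random.default_rng(1)
p = 3
A = np.eye(p)
total = 0.0
for _ in range(500):
    w = rng.normal(size=p)
    det_prev = np.linalg.det(A)
    A = A + np.outer(w, w)
    det_cur = np.linalg.det(A)
    lhs = w @ np.linalg.solve(A, w)            # w_n' A_n^{-1} w_n
    rhs = (det_cur - det_prev) / det_cur       # determinant ratio
    assert abs(lhs - rhs) < 1e-6
    total += lhs
print(total, np.log(np.linalg.det(A)))         # total <= log|A_n|
```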
Corollary 1: (1) If $\sup_n E[\varepsilon_n^2\mid\mathcal F_{n-1}]<\infty$ a.s., then
$$\sum_{k=N+1}^{n}\vec x_k'V_k\vec x_k\,\varepsilon_k^2=O\big((\log\lambda_n^*)^{1+\delta}\big)\ \text{a.s., for all }\delta>0.$$
proof:
$$0\le u_k=\vec x_k'V_k\vec x_k=\frac{|P_k|-|P_{k-1}|}{|P_k|}\le 1.$$
If $\lambda_n^*\to\infty$, $\sum_{i=1}^{n}u_i=O(\log\lambda_n^*)$, and
$$\sum_{i=1}^{n}u_i\varepsilon_i^2
=O\Big(\Big(\sum_{i=1}^{n}u_i\Big)\Big[\log\sum_{i=1}^{n}u_i\Big]^{1+\delta}\Big)
=O\big(\log\lambda_n^*(\log\log\lambda_n^*)^{1+\delta}\big)
=O\big((\log\lambda_n^*)^{1+\delta}\big).$$
(2) Note that $0\le u_i\le 1$, and $u_n\to 0$ on $\Omega_o=\{\lim_{n\to\infty}\vec x_n'V_n\vec x_n=0,\ \lambda_n^*\to\infty\}$, while $\sum_{i=1}^{n}u_i\to\infty$ on $\Omega_o$. By Lemma 1(ii),
$$\sum_{i=1}^{n}u_i\varepsilon_i^2\Big/\sum_{i=1}^{n}u_i\to\sigma^2\ \text{a.s. on }\Omega_o,$$
so that
$$\sum_{i=1}^{n}u_i\varepsilon_i^2\sim(\log|P_n|)\sigma^2\ \text{on }\Omega_o.$$
Remark:
$1^o$
$$R_n=Q_n+\sum_{k=N+1}^{n}\Big(\vec x_k'V_{k-1}\sum_{i=1}^{k-1}\vec x_i\varepsilon_i\Big)^2\Big/\big(1+\vec x_k'V_{k-1}\vec x_k\big)
\sim\sum_{k=N+1}^{n}\vec x_k'V_k\vec x_k\,\varepsilon_k^2\quad\text{if either side }\to\infty.$$
$3^o$ If $\sup_n E[\,|\varepsilon_n|^\alpha\mid\mathcal F_{n-1}]<\infty$ a.s. and $\lim_{n\to\infty}E[\varepsilon_n^2\mid\mathcal F_{n-1}]<\infty$ a.s.,
then $\|\vec b_n-\vec\beta\|^2=O\big((\log\lambda_n^*)^{1+\delta}/\lambda_n\big)$ a.s., for all $\delta>0$, since
$$Q_n=(\vec b_n-\vec\beta)'\Big(\sum_{i=1}^{n}\vec x_i\vec x_i'\Big)(\vec b_n-\vec\beta)
\ge\lambda_n(\vec b_n-\vec\beta)'(\vec b_n-\vec\beta)
=\lambda_n\|\vec b_n-\vec\beta\|^2.$$
proof: By Remark $3^o$,
$$Q_n+\sum_{k=N+1}^{n}\Big(\vec x_k'V_{k-1}\sum_{i=1}^{k-1}\vec x_i\varepsilon_i\Big)^2\Big/\big(1+\vec x_k'V_{k-1}\vec x_k\big)
=Q_n+\sum_{k=N+1}^{n}\big[\vec x_k'(\vec b_{k-1}-\vec\beta)\big]^2\big/\big(1+\vec x_k'V_{k-1}\vec x_k\big)
\sim\sigma^2\log\Big[\det\Big(\sum_{i=1}^{n}\vec x_i\vec x_i'\Big)\Big]\ \text{a.s.}$$
Moreover,
$$\sum_{k=N+1}^{n}\big[\vec x_k'(\vec b_{k-1}-\vec\beta)\big]^2\big/\big(1+\vec x_k'V_{k-1}\vec x_k\big)
\sim\sum_{k=N+1}^{n}\big[\vec x_k'(\vec b_{k-1}-\vec\beta)\big]^2\quad\text{if it }\to\infty\text{ and }\vec x_k'V_{k-1}\vec x_k\to 0,$$
since
$$1+\vec x_k'V_{k-1}\vec x_k=\frac{1}{1-\vec x_k'V_k\vec x_k}\to 1,$$
and $\sum_{i=1}^{n}a_ib_i\sim\sum_{i=1}^{n}a_i$ ($a_i,b_i>0$) if $b_i\to 1$ and $\sum_{i=1}^{n}a_i\to\infty$.
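The identity $1+\vec x_k'V_{k-1}\vec x_k=1/(1-\vec x_k'V_k\vec x_k)$ used in this proof follows from the Sherman–Morrison formula; a quick numeric check with illustrative dimensions and data:

```python
import numpy as np

# With P_k = P_{k-1} + x_k x_k' and V_k = P_k^{-1}, verify
#   1 + x_k' V_{k-1} x_k = 1 / (1 - x_k' V_k x_k).
rng = np.random.default_rng(2)
p = 4
P = np.eye(p)                        # start from a nonsingular P_{k-1}
for _ in range(50):
    x = rng.normal(size=p)
    V_prev = np.linalg.inv(P)
    P = P + np.outer(x, x)
    V = np.linalg.inv(P)
    lhs = 1.0 + x @ V_prev @ x
    rhs = 1.0 / (1.0 - x @ V @ x)
    assert abs(lhs - rhs) < 1e-8 * max(1.0, lhs)
print("identity verified")
```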
(Because $y_k=\vec\beta'\vec x_k+\varepsilon_k$.)
Predict:
At stage $n$ we have already observed $\{y_1,\vec x_1,\cdots,y_n,\vec x_n\}$. Since we cannot foresee the future, we have to use the observed data to predict $y_{n+1}$; i.e., the predictor $\hat y_{n+1}$ is $\mathcal F_n$-measurable.
If we are only interested in a single-period prediction, we may use $(y_{n+1}-\hat y_{n+1})^2$ as a measure of performance. In the adaptive prediction case, it may be more appropriate to use the accumulated prediction errors
$$L_n=\sum_{k=1}^{n}(y_{k+1}-\hat y_{k+1})^2.$$
In the stochastic regression model,
$$L_n=\sum_{k=1}^{n}(\vec\beta'\vec x_{k+1}-\hat y_{k+1})^2
+2\sum_{k=1}^{n}(\vec\beta'\vec x_{k+1}-\hat y_{k+1})\varepsilon_{k+1}
+\sum_{k=1}^{n}\varepsilon_{k+1}^2.$$
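This three-term decomposition of $L_n$ is a pure algebraic identity once $y_{k+1}=\vec\beta'\vec x_{k+1}+\varepsilon_{k+1}$ is substituted; a minimal check on simulated data (all numeric values are hypothetical):

```python
import numpy as np

# Verify L_n = sum(beta'x - yhat)^2 + 2 sum(beta'x - yhat) eps + sum eps^2
# when y = beta' x + eps.
rng = np.random.default_rng(3)
n, p = 1000, 2
beta = np.array([1.0, -0.5])
X = rng.normal(size=(n, p))
eps = rng.normal(size=n)
y = X @ beta + eps
yhat = rng.normal(size=n)   # any predictor; the identity is purely algebraic
Ln = np.sum((y - yhat) ** 2)
m = X @ beta - yhat         # the "bias" terms beta'x_{k+1} - yhat_{k+1}
decomp = np.sum(m ** 2) + 2.0 * np.sum(m * eps) + np.sum(eps ** 2)
print(Ln, decomp)
```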
Example: AR(1).
$$x_k=\rho x_{k-1}+\varepsilon_k,\qquad\varepsilon_k\ \text{i.i.d.},\quad E[\varepsilon_i]=0,\ \operatorname{Var}(\varepsilon_i)=\sigma^2,\ E|\varepsilon_i|^3<\infty.$$
(i) $|\rho|<1$: $\sum_{i=1}^{n}x_i^2/n\to\sigma^2/(1-\rho^2)$ a.s.
(ii) $|\rho|=1$: $\sum_{i=1}^{n}x_i^2=O(n^2\log\log n)$ and
$$\liminf_{n\to\infty}\frac{(\log\log n)\sum_{i=1}^{n}x_i^2}{n^2}>0\ \text{a.s.}$$
Hence $\lambda_n^*=O(n^3)$ a.s. for $|\rho|\le 1$, and $\liminf_{n\to\infty}\lambda_n/n>0$, so
$$\hat\rho_n-\rho=O\Big(\Big(\frac{\log n}{n}\Big)^{1/2}\Big)\quad\text{(by Corollary 1)}.$$
Moreover $x_n^2\big/\sum_{i=1}^{n}x_i^2\to 0$:
(i) $|\rho|<1$:
$$x_n^2\Big/\sum_{i=1}^{n}x_i^2
=\frac{\sum_{i=1}^{n}x_i^2/n-\sum_{i=1}^{n-1}x_i^2/n}{\sum_{i=1}^{n}x_i^2/n}
\to\frac{0}{\sigma^2/(1-\rho^2)}=0.$$
(ii) $|\rho|=1$:
$$x_n^2\Big/\sum_{i=1}^{n}x_i^2
=O\left(\Big(\sum_{i=1}^{n}\varepsilon_i\Big)^2\Big/\frac{n^2}{\log\log n}\right)
=O\Big(\frac{\log\log n}{n^2}\big(\sqrt{2n\log\log n}\big)^2\Big)
=O\Big(\frac{(\log\log n)^2}{n}\Big)=o(1).$$
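Fact (i) and the consistency of the least squares estimate are easy to see in simulation; an illustrative sketch with the hypothetical choices $\rho=0.6$, $\sigma^2=1$ (so $\sigma^2/(1-\rho^2)=1.5625$):

```python
import numpy as np

# Simulation sketch of fact (i): for |rho| < 1,
# (1/n) sum x_i^2 -> sigma^2 / (1 - rho^2) a.s., and rho_hat -> rho.
rng = np.random.default_rng(4)
rho, sigma2, n = 0.6, 1.0, 200_000
eps = rng.normal(scale=np.sqrt(sigma2), size=n)
x = np.empty(n)
x[0] = 0.0
for k in range(1, n):
    x[k] = rho * x[k - 1] + eps[k]
mean_sq = np.mean(x ** 2)
rho_hat = np.sum(x[1:] * x[:-1]) / np.sum(x[:-1] ** 2)
print(mean_sq, sigma2 / (1 - rho ** 2))   # both close to 1.5625
print(rho_hat)                            # close to 0.6
```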
$$Q_n=\Big(\sum_{i=1}^{n}x_i\varepsilon_i\Big)^2\Big/\sum_{i=1}^{n}x_i^2
=O\left(\frac{1}{\sum_{i=1}^{n}x_i^2}\Big[\Big(\sum_{i=1}^{n}x_i^2\Big)^{1/2}\Big(\log\sum_{i=1}^{n}x_i^2\Big)^{1/3}\Big]^2\right)
=O\Big(\Big(\log\sum_{i=1}^{n}x_i^2\Big)^{2/3}\Big)
=O\big((\log n)^{2/3}\big)=o(\log\lambda_n^*).$$
By Corollary 2,
$$\sum_{i=2}^{n}(\hat\rho_i-\rho)^2x_{i+1}^2\sim\sigma^2\log\Big(\sum_{i=1}^{n+1}x_i^2\Big)\ \text{a.s.}
\sim\begin{cases}\sigma^2\log n, & \text{a.s. if }|\rho|<1,\\ 2\sigma^2\log n, & \text{a.s. if }|\rho|=1,\end{cases}$$
since
$$\log[n^2\log\log n]=2\log n+\log(\log\log n),\qquad
\log[n^2/\log\log n]=2\log n-\log(\log\log n).$$
$1^o$
$$\liminf_{n\to\infty}\inf_{\|\vec x\|=1}\vec x'B_n\vec x
\ \ne\ \inf_{\|\vec x\|=1}\lim_{n\to\infty}\vec x'B_n\vec x\quad\text{(the place of difficulty)}.$$
$2^o$ Lemma: Assume that $\{\mathcal F_n\}$ is a sequence of $\uparrow$ $\sigma$-fields and $\vec y_n=\vec x_n+\vec\varepsilon_n$, where $\vec x_n$ is $\mathcal F_{n-\ell}$-measurable, $\vec\varepsilon_n=\sum_{j=1}^{\ell}\vec\varepsilon_n(j)$ and $E\{\vec\varepsilon_n(j)\mid\mathcal F_{n-j-1}\}=0$,
$\sup_n E[\|\vec\varepsilon_n(j)\|^\alpha\mid\mathcal F_{n-j-1}]<\infty$ a.s. for some $\alpha>2$. Also assume that
$$\lambda_n=\lambda_*\Big(\sum_{i=1}^{n}\vec x_i\vec x_i'+\sum_{i=1}^{n}\vec\varepsilon_i\vec\varepsilon_i'\Big)\to\infty\ \text{a.s.}
\quad\text{and}\quad
\log\lambda^*\Big(\sum_{i=1}^{n}\vec x_i\vec x_i'\Big)=o(\lambda_n)\ \text{a.s.}$$
Then
$$\lim_{n\to\infty}\lambda_*\Big(\sum_{i=1}^{n}\vec y_i\vec y_i'\Big)\Big/\lambda_n=1\ \text{a.s.}$$
proof: Let $R_n=\sum_{i=1}^{n}\vec x_i\vec x_i'$ and $G_n=\sum_{i=1}^{n}\vec\varepsilon_i\vec\varepsilon_i'$. Then
$$\sum_{i=1}^{n}\vec y_i\vec y_i'=R_n+\sum_{i=1}^{n}\vec x_i\vec\varepsilon_i'+\sum_{i=1}^{n}\vec\varepsilon_i\vec x_i'+G_n,$$
and
$$\Big\|R_n^{-\frac12}\sum_{i=1}^{n}\vec x_i\vec\varepsilon_i'(j)\Big\|^2=O(\log\lambda_n^*)\quad\text{(by Corollary 1)}\quad=o(\lambda_n).$$
Therefore $\Big\|R_n^{-\frac12}\sum_{i=1}^{n}\vec x_i\vec\varepsilon_i'\Big\|^2=O(\log\lambda_n^*)$, so for a unit vector $\vec u$,
$$\Big|\vec u'\Big(\sum_{i=1}^{n}\vec x_i\vec\varepsilon_i'\Big)\vec u\Big|
\le(\vec u'R_n\vec u)^{\frac12}\,O\big((\log\lambda_n^*)^{\frac12}\big)
\le\big(\vec u'(R_n+G_n)\vec u\big)^{\frac12}\,O\big(\log^{\frac12}\lambda_n^*\big)
\le\frac{\vec u'(R_n+G_n)\vec u}{\lambda_n^{\frac12}}\,O\big(\log^{\frac12}\lambda_n^*\big)$$
(because $1\le\vec u'(R_n+G_n)\vec u/\lambda_n$)
$$=\vec u'(R_n+G_n)\vec u\ O\big((\log\lambda_n^*/\lambda_n)^{\frac12}\big)
=\big(\vec u'(R_n+G_n)\vec u\big)\,o(1).$$
So
$$\vec u'\Big(\sum_{i=1}^{n}\vec y_i\vec y_i'\Big)\vec u=\vec u'(R_n+G_n)\vec u\,(1+o(1)).$$
$$y_i=\beta_1y_{i-1}+\cdots+\beta_py_{i-p}+\varepsilon_i,\qquad
\psi(z)=z^p-\beta_1z^{p-1}-\cdots-\beta_p.$$
Then the L.S.E. satisfies
$$\vec b_n=\Big(\sum_{i=1}^{n}\vec y_{i-1}\vec y_{i-1}'\Big)^{-1}\Big(\sum_{i=1}^{n}\vec y_{i-1}\varepsilon_i\Big)+\vec\beta,$$
and the model implies $\vec y_n=B\vec y_{n-1}+\vec e\,\varepsilon_n$, where $\vec e=(1,0,\cdots,0)'$, so that
$$\vec y_n=B^n\vec y_0+B^{n-1}\vec e\,\varepsilon_1+\cdots+B^0\vec e\,\varepsilon_n.$$
$B$ can be written as $B=C^{-1}DC$, where $D=\operatorname{diag}[D_1,\cdots,D_q]$ and
$$D_j=\begin{pmatrix}\lambda_j&1&0&\cdots&0\\0&\lambda_j&1&\cdots&0\\\vdots&&\ddots&\ddots&\vdots\\0&0&\cdots&&\lambda_j\end{pmatrix},$$
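The relation between the companion matrix $B$ and $\psi$ can be checked directly: the eigenvalues of $B$ are exactly the roots of $\psi$. A numeric sketch with hypothetical coefficients ($p=3$):

```python
import numpy as np

# The companion matrix B of y_i = b1 y_{i-1} + ... + bp y_{i-p} + eps_i
# has characteristic polynomial psi(z) = z^p - b1 z^{p-1} - ... - bp,
# so its eigenvalues are the roots of psi.
beta = np.array([0.5, 0.3, -0.1])        # hypothetical coefficients, p = 3
p = len(beta)
B = np.zeros((p, p))
B[0, :] = beta                           # first row carries the coefficients
B[1:, :-1] = np.eye(p - 1)               # sub-diagonal shift structure
eig = np.sort_complex(np.linalg.eigvals(B))
psi = np.concatenate(([1.0], -beta))     # z^3 - 0.5 z^2 - 0.3 z + 0.1
roots = np.sort_complex(np.roots(psi))
print(np.allclose(eig, roots))           # True
```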
with $\sum_{j=1}^{q}m_j=p$, $\lambda_j$ the roots of $\psi$, and $C$ a nonsingular matrix. Moreover
$$D_j^k=\begin{pmatrix}
\lambda_j^k&\binom{k}{1}\lambda_j^{k-1}&\binom{k}{2}\lambda_j^{k-2}&\cdots&\binom{k}{m_j-1}\lambda_j^{k-m_j+1}\\
0&\lambda_j^k&\ddots&&\vdots\\
\vdots&&\ddots&\ddots&\vdots\\
0&0&\cdots&&\lambda_j^k
\end{pmatrix}$$
and $B^n=C^{-1}D^nC=C^{-1}\operatorname{diag}[D_1^n,\cdots,D_q^n]C$. Now
$$\lambda_{\max}\Big(\sum_{i=1}^{n}\vec y_{i-1}\vec y_{i-1}'\Big)
\le\Big\|\sum_{i=1}^{n}\vec y_{i-1}\vec y_{i-1}'\Big\|
\le\sum_{i=1}^{n}\|\vec y_{i-1}\vec y_{i-1}'\|
\le\sum_{i=1}^{n}\|\vec y_{i-1}\|^2
=O\Big(\sum_{i=1}^{n}i^{p+1}\Big)\ \text{a.s.}
=O(n^{p+2})\ \text{a.s.}$$
This implies $\lambda_{\max}=O(n^{p+2})$.
Claim:
$$\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\vec\varepsilon_i\vec\varepsilon_i'
=\sigma^2\sum_{j=0}^{p-1}B^j\vec e\,\vec e'(B')^j\equiv\Gamma\quad\text{a.s.},$$
since (with $\vec\varepsilon_i=\sum_{j=0}^{p-1}B^j\vec e\,\varepsilon_{i-j}$)
$$\vec\varepsilon_i\vec\varepsilon_i'
=\sum_{j=0}^{p-1}B^j\vec e\,\vec e'(B')^j\varepsilon_{i-j}^2
+\sum_{j\ne\ell}B^j\vec e\,\vec e'(B')^\ell\varepsilon_{i-j}\varepsilon_{i-\ell}.$$
Observe that
$$\Gamma=\sigma^2\,(\vec e,B\vec e,\cdots,B^{p-1}\vec e)
\begin{pmatrix}\vec e'\\\vec e'B'\\\vdots\\\vec e'(B')^{p-1}\end{pmatrix}.$$
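This factorization, $\Gamma=\sigma^2KK'$ with $K=(\vec e,B\vec e,\cdots,B^{p-1}\vec e)$, and the nonsingularity of $K$ for a companion matrix (so that $\Gamma>0$) can be checked numerically; the coefficients below are hypothetical:

```python
import numpy as np

# Gamma = sigma^2 sum_{j=0}^{p-1} B^j e e' (B')^j = sigma^2 K K' with
# K = (e, B e, ..., B^{p-1} e); for a companion B, K is nonsingular.
beta = np.array([0.4, -0.2, 0.1])        # hypothetical coefficients, p = 3
p = len(beta)
B = np.zeros((p, p))
B[0, :] = beta
B[1:, :-1] = np.eye(p - 1)
e = np.zeros(p); e[0] = 1.0
sigma2 = 2.0
cols = [np.linalg.matrix_power(B, j) @ e for j in range(p)]
K = np.column_stack(cols)
Gamma = sigma2 * sum(np.outer(c, c) for c in cols)
print(np.allclose(Gamma, sigma2 * K @ K.T), np.linalg.det(K))
```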
$\vec x_n=B^p\vec y_{n-p}$, so
$$\lambda^*\Big(\sum_{i=p}^{n}\vec x_i\vec x_i'\Big)\le\|B^p\|^2\sum_{i=p}^{n}\|\vec y_{i-p}\vec y_{i-p}'\|=O(n^{p+2})\ \text{a.s.}$$
But
$$\lambda_n\ge\lambda_*\Big(\sum_{i=1}^{n}\vec\varepsilon_i\vec\varepsilon_i'\Big)\sim n\,\lambda_*(\Gamma),$$
so
$$\log\lambda^*\Big(\sum_{i=1}^{n}\vec x_i\vec x_i'\Big)=O(\log n)=o(\lambda_n)\ \text{a.s.}$$
By the previous theorem,
$$\lim_{n\to\infty}\lambda_*\Big(\sum_{i=1}^{n}\vec y_{i-1}\vec y_{i-1}'\Big)\Big/\lambda_n=1\ \text{a.s.}$$
Therefore $\liminf_{n\to\infty}\lambda_*\Big(\sum_{i=1}^{n}\vec y_{i-1}\vec y_{i-1}'\Big)\Big/n>0$ a.s., so
$$\log\lambda^*\Big(\sum_{i=1}^{n}\vec y_{i-1}\vec y_{i-1}'\Big)
=o\left(\lambda_*\Big(\sum_{i=1}^{n}\vec y_{i-1}\vec y_{i-1}'\Big)\right).$$
3. Limiting Distribution:
Assume that for every $n$ there exist $\uparrow$ $\sigma$-fields $\{\mathcal F_{n,j};\ j=0,1,2,\cdots,n\}$ such that, for each $n$, $\{\varepsilon_{n,j},\mathcal F_{n,j}\}$ is a martingale difference sequence and $\vec x_{n,j}$ is $\mathcal F_{n,j-1}$-measurable. Assume that:
Then if
$$\vec b_n=\Big(\sum_{i=1}^{k_n}\vec x_{n,i}\vec x_{n,i}'\Big)^{-1}\Big(\sum_{i=1}^{k_n}\vec x_{n,i}y_{n,i}\Big),$$
we have
$$(A_n')^{-1}(\vec b_n-\vec\beta)\overset{D}{\to}N(0,\sigma^2\Gamma^{-1}),$$
where the sums run over $i=1,2,\cdots,k_n$.
Note: If $\{X_{n,j},\mathcal F_{n,j},\ 1\le j\le k_n\}$ is a martingale difference sequence such that
(i) $\displaystyle\sum_{j=1}^{k_n}E[X_{n,j}^2\mid\mathcal F_{n,j-1}]\overset{P}{\to}C$, a constant,
(ii) $\displaystyle\sum_{j=1}^{k_n}E\big[X_{n,j}^2\,I_{\{X_{n,j}^2>\varepsilon\}}\mid\mathcal F_{n,j-1}\big]\overset{P}{\to}0$,
then
$$\sum_{j=1}^{k_n}X_{n,j}\overset{D}{\to}N(0,C).$$
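A minimal simulation sketch of this note in the degenerate special case $X_{n,j}=\varepsilon_j/\sqrt n$ with i.i.d. $\varepsilon_j$ (so conditions (i) and (ii) hold trivially):

```python
import numpy as np

# X_{n,j} = eps_j / sqrt(n), eps_j i.i.d. with variance C: the normalized
# row sums sum_j X_{n,j} are approximately N(0, C).
rng = np.random.default_rng(5)
C, n, reps = 1.0, 500, 4000
S = rng.normal(scale=np.sqrt(C), size=(reps, n)).sum(axis=1) / np.sqrt(n)
print(S.mean(), S.var())   # approximately 0 and C
```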
and
$$\sum_{i=1}^{k_n}E\big[u_{n,i}^2\,I_{\{u_{n,i}^2>\varepsilon\}}\mid\mathcal F_{n,i-1}\big]
\le\varepsilon^{-\frac{\alpha-2}{2}}\sum_{i=1}^{k_n}E[\,|u_{n,i}|^\alpha\mid\mathcal F_{n,i-1}]
=\varepsilon^{-\frac{\alpha-2}{2}}\sum_{i=1}^{k_n}|\vec t'A_n\vec x_{n,i}|^\alpha\,E[\,|\varepsilon_{n,i}|^\alpha\mid\mathcal F_{n,i-1}]$$
$$\le\varepsilon^{-\frac{\alpha-2}{2}}\sup_{1\le i\le k_n}E[\,|\varepsilon_{n,i}|^\alpha\mid\mathcal F_{n,i-1}]
\Big(\sum_{i=1}^{k_n}|\vec t'A_n\vec x_{n,i}|^2\Big)\cdot\sup_{1\le i\le k_n}|\vec t'A_n\vec x_{n,i}|^{\alpha-2}$$
$$\le\varepsilon^{-\frac{\alpha-2}{2}}\sup_{1\le i\le k_n}E[\,|\varepsilon_{n,i}|^\alpha\mid\mathcal F_{n,i-1}]
\cdot\vec t'A_n\Big(\sum_{i=1}^{k_n}\vec x_{n,i}\vec x_{n,i}'\Big)A_n'\vec t
\cdot\|\vec t\|^{\alpha-2}\sup_{1\le i\le k_n}\|A_n\vec x_{n,i}\|^{\alpha-2}
\overset{P}{\to}0.$$
Example: $y_0=0$,
$$y_n=\alpha+\beta y_{n-1}+\varepsilon_n,\quad\text{where }|\beta|<1,\ \varepsilon_n\ \text{i.i.d.},\ E[\varepsilon_n]=0,\ \operatorname{Var}[\varepsilon_n]=\sigma^2,\ E[\,|\varepsilon_n|^\alpha]<\infty\ \text{for some }\alpha>2.$$
Since
$$\alpha+\beta\alpha+\beta^2\alpha+\cdots+\beta^{n-1}\alpha+\cdots
=\alpha(1+\beta+\beta^2+\cdots+\beta^{n-1}+\cdots)=\frac{\alpha}{1-\beta},$$
this implies
$$\frac{1}{n}\sum_{i=1}^{n}y_i^2\to\Big(\frac{\alpha}{1-\beta}\Big)^2+\frac{\sigma^2}{1-\beta^2}
\quad\text{and}\quad
\frac{1}{n}\sum_{i=1}^{n}y_i\to\frac{\alpha}{1-\beta}\ \text{a.s.}$$
Writing $\vec x_n=(1,y_{n-1})'$, we have $y_n=(\alpha,\beta)\vec x_n+\varepsilon_n=\vec\beta'\vec x_n+\varepsilon_n$, and
$$\frac{1}{n}\sum_{i=1}^{n}\begin{pmatrix}1\\y_{i-1}\end{pmatrix}(1,y_{i-1})
=\frac{1}{n}\begin{pmatrix}n&\sum_{i=1}^{n}y_{i-1}\\[2pt]\sum_{i=1}^{n}y_{i-1}&\sum_{i=1}^{n}y_{i-1}^2\end{pmatrix}
\to\begin{pmatrix}1&\alpha/(1-\beta)\\[2pt]\alpha/(1-\beta)&\big(\frac{\alpha}{1-\beta}\big)^2+\frac{\sigma^2}{1-\beta^2}\end{pmatrix}\equiv\Gamma.$$
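The limit matrix $\Gamma$ can be checked by simulation; an illustrative sketch with hypothetical values $\alpha=1$, $\beta=0.5$, $\sigma^2=1$ (so the entries of $\Gamma$ are $1$, $2$, and $4+4/3$):

```python
import numpy as np

# Simulation sketch of the example: y_n = a + b y_{n-1} + eps_n, |b| < 1.
# With x_i = (1, y_{i-1})', (1/n) sum x_i x_i' converges to Gamma with
# entries 1, a/(1-b), and (a/(1-b))^2 + sigma^2/(1-b^2).
rng = np.random.default_rng(6)
a, b, sigma2, n = 1.0, 0.5, 1.0, 300_000
eps = rng.normal(scale=np.sqrt(sigma2), size=n)
y = np.empty(n)
y[0] = 0.0
for k in range(1, n):
    y[k] = a + b * y[k - 1] + eps[k]
X = np.column_stack([np.ones(n - 1), y[:-1]])
G = X.T @ X / (n - 1)
mu = a / (1 - b)                      # stationary mean, = 2.0 here
Gamma = np.array([[1.0, mu], [mu, mu ** 2 + sigma2 / (1 - b ** 2)]])
print(G)
print(Gamma)
```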
Take $A_n=(\sqrt n)^{-1}I$. Now $k_n=n$ and
$$\sup_{1\le i\le n}\Big\|\frac{1}{\sqrt n}\begin{pmatrix}1\\y_{i-1}\end{pmatrix}\Big\|
\le\frac{1}{\sqrt n}+\frac{1}{\sqrt n}\sup_{1\le i\le n}|y_{i-1}|.$$
It is sufficient to show that $y_{n-1}/\sqrt n\to 0$ a.s.:
$$\frac{y_{n-1}^2}{n}=\frac{\sum_{i=1}^{n-1}y_i^2-\sum_{i=1}^{n-2}y_i^2}{n}\to 0\ \text{a.s.}$$
$$A_n\Big(\sum_{i=1}^{k_n}\vec x_{n,i}\vec x_{n,i}'\Big)A_n'\to\Gamma\ \text{a.s.}
\quad\text{and}\quad A_n/A_{n-1}\to 1\ \text{a.s.}$$
imply $\sup_{1\le i\le k_n}\|A_n\vec x_{n,i}\|\to 0$.