Summary: This document reproduces Christine Jacob's presentation on conditional least squares estimation for nonlinear and nonstationary stochastic regression models. It discusses estimating the parameter of interest $\theta_0$ from one observed trajectory of the stochastic processes $\{Z_n, U_n\}$ by minimizing the sum of squared residuals between $Z_n$ and its conditional expectation given past information. Difficulties arise from the stochasticity, nonstationarity, and nonlinearity of the model. Asymptotic properties of the estimator are discussed, as well as extensions allowing for missing data, outliers, and multivariate models. Examples of time series, financial, and branching process models are provided.


Prediction of time series and nonstationary times, Maison des Sciences Economiques, Paris, February 10-11, 2012

Conditional Least Squares Estimation in nonlinear and nonstationary stochastic regression models

Christine Jacob
Applied Mathematics and Informatics Unit, INRA, Jouy-en-Josas, France
[email protected]

Model

$\{Z_n\}$: real stochastic process, which may depend on an external process $\{U_n\}$,

$$E(Z_n \mid \mathcal{F}_{n-1}) = g(\theta_0, \eta_0, \mathcal{F}_{n-1}) = \underbrace{g^{(1)}(\theta_0, \mathcal{F}_{n-1})}_{\text{approximate model}} + \underbrace{g^{(2)}(\theta_0, \eta_0, \mathcal{F}_{n-1})}_{\text{nuisance, negligible part}} \stackrel{a.s.}{<} \infty,$$

where the full model $g(\cdot)$ is Lipschitz in $\theta$, and

$$\mathrm{Var}(Z_n \mid \mathcal{F}_{n-1}) = \sigma^2(\theta_0, \eta_0, \mathcal{F}_{n-1}) \stackrel{a.s.}{<} \infty,$$

where $\mathcal{F}_{n-1} := \sigma\{Z_{n-1}, Z_{n-2}, \ldots, U_n, U_{n-1}, \ldots\}$ (the observations).

$\theta_0 \in \Theta \subset \mathbb{R}^p$, $p < \infty$: parameter of interest
$\eta_0 \in \mathbb{R}^q$, $q \le \infty$: nuisance parameter
If $q = \infty$, set $\eta_n := g^{(2)}(\theta_0, \eta_0, \mathcal{F}_{n-1})$; if $q = 0$, $g^{(2)}(\theta_0, \eta_0, \mathcal{F}_{n-1}) = 0$.

Goal

Estimate $\theta_0$ from one observed trajectory of $\{Z_n, U_n\}$ by Conditional Least Squares:

$$Z_n \stackrel{a.s.}{=} g(\theta_0, \eta_0, \mathcal{F}_{n-1}) + e_n, \qquad E(e_n \mid \mathcal{F}_{n-1}) = 0, \qquad E(e_n^2 \mid \mathcal{F}_{n-1}) := \sigma^2(\theta_0, \eta_0, \mathcal{F}_{n-1}),$$

$$\widehat{\theta}_n := \arg\min_{\theta} S_n(\theta, \widehat{\eta}), \qquad S_n(\theta, \widehat{\eta}) := \sum_{k=1}^{n} \big(Z_k - g(\theta, \widehat{\eta}, \mathcal{F}_{k-1})\big)^2\, \lambda(\mathcal{F}_{k-1}),$$

or, if $q < \infty$,

$$(\widehat{\theta}_n, \widehat{\eta}_n) := \arg\min_{\theta, \eta} S_n(\theta, \eta), \qquad S_n(\theta, \eta) := \sum_{k=1}^{n} \big(Z_k - g(\theta, \eta, \mathcal{F}_{k-1})\big)^2\, \lambda(\mathcal{F}_{k-1}),$$

where $\lambda(\mathcal{F}_{k-1})$ denotes an $\mathcal{F}_{k-1}$-measurable weight.
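As a concrete illustration of the criterion above, here is a minimal numerical sketch of CLS estimation, assuming a toy nonlinear autoregression, unit weights $\lambda(\mathcal{F}_{k-1}) = 1$, and scipy's general-purpose optimizer; none of these choices come from the talk.

```python
# Minimal CLS sketch (illustrative model, not from the talk):
# Z_n = theta0 * Z_{n-1} / (1 + Z_{n-1}^2) + e_n, e_n ~ N(0, 0.5^2).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
theta0, n = 0.8, 500
Z = np.zeros(n + 1)
for k in range(1, n + 1):
    Z[k] = theta0 * Z[k - 1] / (1.0 + Z[k - 1] ** 2) + rng.normal(0.0, 0.5)

def g(theta, z_prev):
    # conditional mean g(theta, F_{k-1}) of the toy model
    return theta * z_prev / (1.0 + z_prev ** 2)

def S_n(theta):
    # S_n(theta) = sum_k (Z_k - g(theta, F_{k-1}))^2, with lambda = 1
    resid = Z[1:] - g(theta, Z[:-1])
    return np.sum(resid ** 2)

res = minimize(lambda t: S_n(t[0]), x0=[0.0])  # no closed form: minimize numerically
print("CLS estimate of theta0:", res.x[0])
```

The absence of a closed-form minimizer in nonlinear models is exactly why the theory below works through the criterion $S_n$ itself rather than through an explicit expression of $\widehat{\theta}_n$.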

Asymptotic properties of $\widehat{\theta}_n$ as $n \to \infty$: strong consistency, asymptotic distribution.

Difficulties: stochasticity, nonstationarity, nonlinearity, no explicit expression of $\widehat{\theta}_n$!

Literature: for the strong consistency, the existing sufficient conditions forbid many types of nonstationarity; the asymptotic distribution is available only in very particular cases.

Optimal properties of $\widehat{\theta}_n$ for a finite $n$

For $q = 0$ (no nuisance parameter), optimality holds if $\lambda(\mathcal{F}_{k-1}) \propto \sigma^{-2}(\theta_0, \mathcal{F}_{k-1})$ (Heyde, 1997):

Ex. $p = 1$ $\Rightarrow$ optimality if the variance of the score $\dot{S}_n(\theta_0)$ is minimum while $E(\ddot{S}_n(\theta_0))$ is maximum, i.e. if

$$\frac{E(\dot{S}_n^2(\theta_0))}{\big(E(\ddot{S}_n(\theta_0))\big)^2} \ \text{ is minimum, which holds for } \lambda(\mathcal{F}_{k-1}) \propto \sigma^{-2}(\theta_0, \mathcal{F}_{k-1})$$

$\Rightarrow$ take $\lambda(\mathcal{F}_{k-1}) = \widehat{\sigma}^{-2}(\theta_0, \mathcal{F}_{k-1})$.

Note: if $\lambda(\mathcal{F}_{k-1})$ depends on $\theta$, then the consistency of $\widehat{\theta}_n$ is generally not ensured.

Usefulness of $g^{(2)}(\theta_0, \eta_0, \mathcal{F}_{n-1}) := \underbrace{g(\theta_0, \eta_0, \mathcal{F}_{n-1})}_{E(Z_n \mid \mathcal{F}_{n-1})} - \underbrace{g^{(1)}(\theta_0, \mathcal{F}_{n-1})}_{\text{approximate model}}$

Assumption: $g^{(2)}(\theta, \widehat{\eta}, \mathcal{F}_{n-1})$ is A.N. (Asymptotically Negligible):

$$\forall \varepsilon > 0, \qquad \lim_{n} \frac{\sup_{\|\theta - \theta_0\| \ge \varepsilon} \big|g^{(2)}(\theta, \widehat{\eta}, \mathcal{F}_{n-1}) - g^{(2)}(\theta_0, \eta_0, \mathcal{F}_{n-1})\big|}{\inf_{\|\theta - \theta_0\| \ge \varepsilon} \big|g^{(1)}(\theta, \mathcal{F}_{n-1}) - g^{(1)}(\theta_0, \mathcal{F}_{n-1})\big|} \stackrel{a.s.}{=} 0.$$

$g(\theta_0, \eta_0, \mathcal{F}_{n-1})$ is more realistic than the simpler model $g^{(1)}(\theta_0, \mathcal{F}_{n-1})$, but the estimator based on the full model may have no asymptotic properties $\Rightarrow$ study $\widehat{\theta}_n$.

$Z_n$ is observed with an observation error $\xi_n$, $E(\xi_n \mid \mathcal{F}_{n-1}^*) = 0$:

$$Z_n^* = Z_n + \xi_n \ \Rightarrow\ E(Z_n^* \mid \mathcal{F}_{n-1}^*) = E(Z_n \mid \mathcal{F}_{n-1}^*) = g(\theta_0, \eta_0, \mathcal{F}_{n-1}^*),$$

where $\mathcal{F}_{n-1}^*$ is built on the observed process $\{Z_k^*\}$, and

$$g(\theta_0, \eta_0, \mathcal{F}_{n-1}) = g(\theta_0, \eta_0, \mathcal{F}_{n-1}^*) + g(\theta_0, \eta_0, \mathcal{F}_{n-1}) - g(\theta_0, \eta_0, \mathcal{F}_{n-1}^*)$$
$$= g^{(1)}(\theta_0, \mathcal{F}_{n-1}^*) + \underbrace{g^{(2)}(\theta_0, \eta_0, \mathcal{F}_{n-1}^*) + g(\theta_0, \eta_0, \mathcal{F}_{n-1}) - g(\theta_0, \eta_0, \mathcal{F}_{n-1}^*)}_{g^{(2)}(\theta_0, \eta_0, \mathcal{F}_{n-1}, \mathcal{F}_{n-1}^*) =: \eta_n}.$$

$Z_n$ depends on the past errors $e_{n-1}, e_{n-2}, \ldots$

Ex. $ARMA(p, q)$:

$$Z_n = \sum_{j=1}^{p} \alpha_j Z_{n-j} + \sum_{j=1}^{q} \beta_j e_{n-j} + e_n \ \Longleftrightarrow\ A(L) Z_n = B(L) e_n,$$

$$A(L) := 1 - \sum_{j=1}^{p} \alpha_j L^j, \qquad B(L) := 1 + \sum_{j=1}^{q} \beta_j L^j, \qquad L Z_n = Z_{n-1}$$

$$\Rightarrow\ Z_n = \sum_{j=1}^{p} \alpha_j Z_{n-j} + (B(L) - 1) L^{-1} \underbrace{e_{n-1}}_{B(L)^{-1} A(L) Z_{n-1}} + e_n = \sum_{j=1}^{p} \alpha_j Z_{n-j} + \underbrace{\sum_{j=1}^{n-1} \gamma_{n-j} Z_j}_{\text{approximate model } g^{(1)}(\cdot)} + \underbrace{\sum_{j=n_0}^{0} \gamma_{n-j} Z_j}_{\text{nuisance part } g^{(2)}(\cdot)} + e_n,$$

where $\gamma_l$ is a function of $\{\alpha_j\}$ and $\{\beta_j\}$ (for $p = 1$, $q = 1$: $\gamma_j = (-1)^{j-1}(\alpha_1 + \beta_1)\beta_1^{j-1}$), and $Z_n = 0$ for $n < n_0 \le 0$.
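As a sanity check on the $ARMA(1,1)$ inversion, the following sketch compares the closed-form coefficients $\gamma_j$ quoted above with a direct power-series inversion of $B(L)^{-1}A(L)$; the parameter values and truncation order are illustrative.

```python
# Check gamma_j = (-1)^(j-1) (alpha1 + beta1) beta1^(j-1) for ARMA(1,1)
# against the power-series inversion e_n = B(L)^{-1} A(L) Z_n.
import numpy as np

alpha1, beta1, J = 0.5, 0.3, 10

gamma_closed = np.array([(-1) ** (j - 1) * (alpha1 + beta1) * beta1 ** (j - 1)
                         for j in range(1, J + 1)])

# pi(L) := B(L)^{-1} A(L) with A(L) = 1 - alpha1 L, B(L) = 1 + beta1 L;
# from B(L) pi(L) = A(L): pi_j + beta1 pi_{j-1} = a_j, a_1 = -alpha1, a_j = 0 (j >= 2)
pi = np.zeros(J + 1)
pi[0] = 1.0
for j in range(1, J + 1):
    a_j = -alpha1 if j == 1 else 0.0
    pi[j] = a_j - beta1 * pi[j - 1]

gamma_series = -pi[1:]  # e_n = Z_n - sum_j gamma_j Z_{n-j}
print(np.allclose(gamma_closed, gamma_series))  # True
```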

Extension:

$$S_n(\theta, \eta) := \sum_{k=1}^{n} \big[\psi_k\big(Z_k - g(\theta, \eta, \mathcal{F}_{k-1})\big)\big]^2\, \lambda(\mathcal{F}_{k-1}), \qquad E\big(\psi_k(Z_k - g(\theta_0, \eta_0, \mathcal{F}_{k-1})) \mid \mathcal{F}_{k-1}\big) = 0.$$

Missing data
$\Rightarrow \psi_k(Z_k - g(\theta, \eta, \mathcal{F}_{k-1})) = 1_{\{Z_k \in I_k^{obs}\}}\,(Z_k - g(\theta, \eta, \mathcal{F}_{k-1}))$

$E(Z_n \mid \mathcal{F}_{n-1}) = \infty$ or $E(Z_n^2 \mid \mathcal{F}_{n-1}) = \infty$
$\Rightarrow \psi_k(Z_k - g(\theta, \eta, \mathcal{F}_{k-1})) = 1_{\{Z_k \in I_k\}}\,(Z_k - g(\theta, \eta, \mathcal{F}_{k-1})), \qquad \lim_k I_k \stackrel{a.s.}{=} \mathbb{R}$

Existence of outliers (Heyde, 1997)
$\Rightarrow \lambda(\mathcal{F}_{k-1}) = \sum_{h=1}^{k} \rho_{h,k}\, \psi_{k-h}\big(Z_{k-h} - g(\theta, \eta, \mathcal{F}_{k-h-1})\big)$, with
$\psi_k(x) = \psi(x) = x$ if $|x| < m$, $\psi(x) = m\,\mathrm{sign}(x)$ if $|x| \ge m$ (Huber's function); see the sketch below.

Extension:

$$S_n(\theta, \eta) := \sum_{k=1}^{n} \big[\psi_k\big(Z_k - g(\theta, \eta, \mathcal{F}_{k-1})\big)\big]^2\, \lambda(\mathcal{F}_{k-1}) + \mathrm{pen}(\theta, p, \mathcal{F}_{n-1}), \qquad E\big(\psi_k(Z_k - g(\theta_0, \eta_0, \mathcal{F}_{k-1})) \mid \mathcal{F}_{k-1}\big) = 0,$$

where $p$ is the number of $\theta_i \ne 0$ $\Rightarrow$ penalize large values of $p$.
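A minimal sketch of these modified residuals, assuming a generic one-step predictor g; the threshold m, the observation mask, and the uniform weights are illustrative assumptions.

```python
# Robust / missing-data variants of the CLS criterion (illustrative sketch).
import numpy as np

def huber_psi(x, m=2.5):
    # Huber's function: psi(x) = x if |x| < m, m * sign(x) otherwise
    return np.where(np.abs(x) < m, x, m * np.sign(x))

def S_n_modified(theta, Z, g, m=2.5, lam=None, observed=None):
    # S_n = sum_k [psi_k(Z_k - g(theta, F_{k-1}))]^2 lambda(F_{k-1})
    resid = Z[1:] - g(theta, Z[:-1])
    if observed is not None:
        # missing data: the indicator 1_{Z_k in I_k^obs} zeroes unobserved terms
        resid = np.where(observed, resid, 0.0)
    w = np.ones_like(resid) if lam is None else lam
    return np.sum(huber_psi(resid, m) ** 2 * w)
```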

Extension: multivariate stochastic regression model:

$$S_n(\theta, \eta) = \sum_{k=1}^{n} \big(Z_k - g(\theta, \eta, \mathcal{F}_{k-1})\big)^T \Gamma_k^{-1} \big(Z_k - g(\theta, \eta, \mathcal{F}_{k-1})\big), \qquad \Gamma_k^{-1}\ \mathcal{F}_{k-1}\text{-measurable}$$

$$\Rightarrow\ S_n(\theta, \eta) = \sum_{k=1}^{n} \big(Z_k - g(\theta, \eta, \mathcal{F}_{k-1})\big)^T U_k \Lambda_k^{-1} U_k^{-1} \big(Z_k - g(\theta, \eta, \mathcal{F}_{k-1})\big), \qquad U_k \Lambda_k^{-1} U_k^{-1} = \Gamma_k^{-1}, \quad \Lambda_k \text{ diagonal}, \quad U_k\ \mathcal{F}_{k-1}\text{-measurable}$$

$$= \sum_{k=1}^{n} \big(\underbrace{\Lambda_k^{-1/2} U_k^{-1}\big(Z_k - g(\theta, \eta, \mathcal{F}_{k-1})\big)}_{\text{denoted } Y_k - f_k(\theta, \eta, \mathcal{F}_{k-1})}\big)^T \big(\Lambda_k^{-1/2} U_k^{-1}\big(Z_k - g(\theta, \eta, \mathcal{F}_{k-1})\big)\big) = \sum_{j=1}^{d} \sum_{k=1}^{n} \big(Y_{k,j} - f_{k,j}(\theta, \eta, \mathcal{F}_{k-1})\big)^2,$$

where $d$ is the dimension of $Z_k$.
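A short sketch of the diagonalization step, assuming each $\Gamma_k$ is symmetric positive definite so that $U_k$ can be taken orthogonal; the interface is mine.

```python
# Whitening one multivariate residual: Gamma = U diag(lam) U^T, so that
# (z - g)^T Gamma^{-1} (z - g) = sum_j (Y_j - f_j)^2
# with Y - f = diag(lam)^(-1/2) U^T (z - g).
import numpy as np

def whiten_residual(z_k, g_k, Gamma_k):
    lam, U = np.linalg.eigh(Gamma_k)       # eigendecomposition of SPD Gamma_k
    return (z_k - g_k) @ U / np.sqrt(lam)  # entries Y_{k,j} - f_{k,j}

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
Gamma = A @ A.T + np.eye(3)                # random SPD matrix
r = rng.normal(size=3)
y = whiten_residual(r, np.zeros(3), Gamma)
print(np.allclose(y @ y, r @ np.linalg.solve(Gamma, r)))  # True
```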

Examples of models

Classical regression with stochastic regressors
Time series: $ARMAX(p, q, b)$ model, nonlinear time series
Financial models: $GARCH(p, q)$ model, nonlinear financial models

GARCH-type models: $\varepsilon_n = s_n(\theta_0) U_n$, $\{U_n\}$ i.i.d. $(0, 1)$

$$\Rightarrow\ Z_n := \varepsilon_n^2 = s_n^2(\theta_0) + \underbrace{s_n^2(\theta_0)(U_n^2 - 1)}_{e_n}, \qquad s_n^2(\theta) = \alpha_0 + \sum_{j=1}^{p} \alpha_j \varepsilon_{n-j}^2 + \sum_{j=1}^{q} \beta_j s_{n-j}^2(\theta) \quad \text{(volatility)}$$

Branching processes:

$$N_n = \sum_{i=1}^{N_{n-1}} X_{n,i}, \qquad \{X_{n,i}\}_i \mid \mathcal{F}_{n-1} \ \text{i.i.d.}\ \big(m_{\theta_0, \eta_0}(\mathcal{F}_{n-1}),\ \sigma^2_{\theta_0, \eta_0}(\mathcal{F}_{n-1})\big)$$

$$\Rightarrow\ Z_n := N_n = m_{\theta_0, \eta_0}(\mathcal{F}_{n-1})\, N_{n-1} + \sqrt{N_{n-1}}\, e_n, \qquad E(e_n^2 \mid \mathcal{F}_{n-1}) = \sigma^2_{\theta_0, \eta_0}(\mathcal{F}_{n-1})$$
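A simulation sketch of the branching regression, assuming Poisson offspring so that the conditional mean and variance are both $m_0$; the talk only specifies the first two conditional moments, so the offspring law here is an illustrative choice.

```python
# Branching process: N_n = sum over N_{n-1} individuals of i.i.d. offspring
# counts, rewritten as N_n = m0 * N_{n-1} + sqrt(N_{n-1}) e_n.
import numpy as np

rng = np.random.default_rng(1)
m0, n_gen = 1.05, 100
N = [1000]
for _ in range(n_gen):
    N.append(int(rng.poisson(m0, size=N[-1]).sum()) if N[-1] > 0 else 0)

N = np.array(N, dtype=float)
valid = N[:-1] > 0
# standardized errors e_n = (N_n - m0 N_{n-1}) / sqrt(N_{n-1})
e = (N[1:][valid] - m0 * N[:-1][valid]) / np.sqrt(N[:-1][valid])
print(e.mean(), e.var())  # approximately 0 and sigma^2 = m0 (Poisson case)
```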

Asymptotic properties for $q = 0$ (no nuisance parameter)

$$\widehat{\theta}_n = \arg\min_{\theta} S_n(\theta), \qquad S_n(\theta) := \sum_{k=1}^{n} \big(Z_k - g(\theta, \mathcal{F}_{k-1})\big)^2\, \lambda(\mathcal{F}_{k-1}) =: \sum_{k=1}^{n} \big(Y_k - f(\theta, \mathcal{F}_{k-1})\big)^2$$

Let $e_k := Y_k - f(\theta_0, \mathcal{F}_{k-1})$, $\sigma_k^2 := \mathrm{Var}(e_k \mid \mathcal{F}_{k-1})$. Assume

$$A1:\ \forall \varepsilon > 0, \qquad \sup_{\|\theta - \theta_0\| \ge \varepsilon} \sum_{k=1}^{\infty} \frac{\sigma_k^2\, d_k^2(\theta)}{\big(\sum_{h=1}^{k} d_h^2(\theta)\big)^2} \stackrel{a.s.}{<} \infty, \qquad d_k(\theta) := f(\theta, \mathcal{F}_{k-1}) - f(\theta_0, \mathcal{F}_{k-1}).$$

Strong consistency (Jacob, 2010)

Let $D_n(\theta) = \sum_{k=1}^{n} \big(f(\theta, \mathcal{F}_{k-1}) - f(\theta_0, \mathcal{F}_{k-1})\big)^2$ (identifiability criterion). Assume A1. Then

$$A2:\ \forall \varepsilon > 0, \quad \lim_{n} \inf_{\|\theta - \theta_0\| \ge \varepsilon} D_n(\theta) \stackrel{a.s.}{=} \infty \qquad \Longrightarrow \qquad \lim_{n} \widehat{\theta}_n \stackrel{a.s.}{=} \theta_0.$$
Linear case: $f(\theta, \mathcal{F}_{n-1}) = \theta^T W_{n-1}$

$$D_n(\theta) = (\theta - \theta_0)^T \Big(\sum_{k=1}^{n} W_{k-1} W_{k-1}^T\Big) (\theta - \theta_0), \qquad \widehat{\theta}_n - \theta_0 = \Big(\sum_{k=1}^{n} W_{k-1} W_{k-1}^T\Big)^{-1} \sum_{k=1}^{n} e_k W_{k-1},$$

$$\Rightarrow\ \lim_{n} \inf_{\|\theta - \theta_0\| \ge \varepsilon} D_n(\theta) \stackrel{a.s.}{=} \infty \ \Longleftrightarrow\ \lim_{n} \lambda_{\min}\Big(\sum_{k=1}^{n} W_{k-1} W_{k-1}^T\Big) \stackrel{a.s.}{=} \infty.$$

Note. In the literature, additional sufficient conditions exist: $D_n(\theta)$ has to tend to $\infty$ at some rate. Ex. in the linear case $f(\theta, \mathcal{F}_{n-1}) = \theta^T W_{n-1}$, the additional condition is

$$\lim_{n} \Big[\ln\Big(\lambda_{\max}\Big\{\sum_{k=1}^{n} W_k W_k^T\Big\}\Big)\Big] \Big[\lambda_{\min}\Big\{\sum_{k=1}^{n} W_k W_k^T\Big\}\Big]^{-1} \stackrel{a.s.}{=} 0 \quad \text{(Lai \& Wei, 1982).}$$
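The two quantities appearing in these conditions are easy to monitor numerically in the linear case; here is a sketch with i.i.d. Gaussian regressors as an illustrative, well-behaved choice (under strong nonstationarity the Lai & Wei ratio need not vanish).

```python
# Linear-case diagnostics: lambda_min(sum_k W_k W_k^T) for A2, and the
# Lai & Wei (1982) ratio log(lambda_max) / lambda_min.
import numpy as np

def identifiability_diagnostics(W):
    # W: (n, p) array whose rows are the regressors
    eig = np.linalg.eigvalsh(W.T @ W)
    lam_min, lam_max = eig[0], eig[-1]
    return lam_min, np.log(lam_max) / lam_min

rng = np.random.default_rng(2)
W = rng.normal(size=(10_000, 3))
print(identifiability_diagnostics(W))  # lam_min grows like n, ratio -> 0
```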

Proof

1. Wu's Lemma (1981):

$$\forall \varepsilon > 0, \quad \liminf_{n} \inf_{\|\theta - \theta_0\| \ge \varepsilon} \big(S_n(\theta) - S_n(\theta_0)\big) \stackrel{a.s.}{>} 0 \qquad \Longrightarrow \qquad \lim_{n} \widehat{\theta}_n \stackrel{a.s.}{=} \theta_0.$$

2. $S_n(\theta) - S_n(\theta_0) = D_n(\theta) - 2 L_n(\theta)$, where $L_n(\theta) := \sum_{k=1}^{n} e_k \big(f(\theta, \mathcal{F}_{k-1}) - f(\theta_0, \mathcal{F}_{k-1})\big)$, so that

$$\inf_{\|\theta - \theta_0\| \ge \varepsilon} \big(S_n(\theta) - S_n(\theta_0)\big) \ \ge\ \inf_{\|\theta - \theta_0\| \ge \varepsilon} D_n(\theta)\, \Big(1 - 2 \sup_{\|\theta - \theta_0\| \ge \varepsilon} \frac{|L_n(\theta)|}{D_n(\theta)}\Big), \qquad \frac{L_n(\theta)}{D_n(\theta)} = \frac{\sum_{k=1}^{n} e_k \big(f(\theta, \mathcal{F}_{k-1}) - f(\theta_0, \mathcal{F}_{k-1})\big)}{\sum_{k=1}^{n} \big(f(\theta, \mathcal{F}_{k-1}) - f(\theta_0, \mathcal{F}_{k-1})\big)^2}.$$

3. Prove that $\sup_{\|\theta - \theta_0\| \ge \varepsilon} |L_n(\theta)|\, D_n(\theta)^{-1}$ converges a.s. to 0.

Linear case $f(\theta, \mathcal{F}_{n-1}) = \theta^T W_{n-1}$:

$$\frac{|L_n(\theta)|}{D_n(\theta)} = \frac{\big|(\theta - \theta_0)^T \sum_{k} e_k W_{k-1}\big|}{(\theta - \theta_0)^T \big(\sum_{k=1}^{n} W_{k-1} W_{k-1}^T\big) (\theta - \theta_0)}, \qquad \sup_{\|\theta - \theta_0\| \ge \varepsilon} \frac{|L_n(\theta)|}{D_n(\theta)} = \frac{|L_n(\widetilde{\theta}_n)|}{D_n(\widetilde{\theta}_n)}, \quad \widetilde{\theta}_n \text{ depending on } e_n, \mathcal{F}_{n-1},$$

$$p = 1 \ \Rightarrow\ \frac{|L_n(\widetilde{\theta}_n)|}{D_n(\widetilde{\theta}_n)} = \big|(\widetilde{\theta}_n - \theta_0)^{-1}\big|\, \bigg|\frac{\sum_{k} e_k W_{k-1}}{\sum_{k} W_{k-1}^2}\bigg| \ \Rightarrow\ \text{SLLNM (Hall \& Heyde, 1980).}$$

General nonlinear case with $p \ge 1$?

$$\sum_{k=1}^{n} e_k\, \frac{f(\widetilde{\theta}_n, \mathcal{F}_{k-1}) - f(\theta_0, \mathcal{F}_{k-1})}{\sum_{h=1}^{k} \big(f(\widetilde{\theta}_n, \mathcal{F}_{h-1}) - f(\theta_0, \mathcal{F}_{h-1})\big)^2} \quad \text{is not a martingale!}$$

Strong Law of Large Numbers for Submartingales (Jacob, 2010)

Let $d_k(\theta) := f(\theta, \mathcal{F}_{k-1}) - f(\theta_0, \mathcal{F}_{k-1})$ be $\mathcal{F}_{k-1}$-measurable and Lipschitz in $\theta$, with $E(e_k \mid \mathcal{F}_{k-1}) = 0$, $E(e_k^2 \mid \mathcal{F}_{k-1}) =: \sigma_k^2$. Then

$$\liminf_{n} \sum_{k=1}^{n} d_k^2(\theta) \stackrel{a.s.}{=} \infty \ \text{ and } \ \sup_{\theta} \sum_{k=1}^{\infty} \frac{\sigma_k^2\, d_k^2(\theta)}{\big(\sum_{h=1}^{k} d_h^2(\theta)\big)^2} \stackrel{a.s.}{<} \infty \quad \Longrightarrow \quad \lim_{n} \sup_{\theta} \bigg|\frac{\sum_{k=1}^{n} e_k d_k(\theta)}{\sum_{k=1}^{n} d_k^2(\theta)}\bigg| \stackrel{a.s.}{=} 0.$$

Proof: $\sup_{\theta} \big|\sum_{k=1}^{n} e_k d_k(\theta) \big(\sum_{h=1}^{k} d_h^2(\theta)\big)^{-1}\big|$ is a submartingale $\Rightarrow$ use submartingale properties (Hall & Heyde, 1980) and analytical lemmas.

Note. SLLNM: $\lim_{n} \sum_{k=1}^{n} e_k d_k(\theta) \big/ \sum_{k=1}^{n} d_k^2(\theta) \stackrel{a.s.}{=} 0$ for fixed $\theta$.
Markov's theorem: $\lim_{n} \sum_{k=1}^{n} \sigma_k^2 d_k^2(\theta) / n^2 = 0 \ \Rightarrow\ \lim_{n} n^{-1} \sum_{k=1}^{n} e_k d_k(\theta) = 0$.
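A quick numerical illustration of the SLLNM ratio for a fixed $\theta$, assuming i.i.d. $N(0,1)$ increments and $d_k^2 = 1/k$, so that $\sum_k d_k^2$ diverges while $\sum_k \sigma_k^2 d_k^2 (\sum_{h \le k} d_h^2)^{-2}$ converges; purely illustrative.

```python
# The ratio sum_k e_k d_k / sum_k d_k^2 tends to 0 when sum d_k^2 diverges
# and the normalized variance series converges.
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
e = rng.normal(size=n)                    # martingale differences, sigma_k = 1
d = 1.0 / np.sqrt(np.arange(1.0, n + 1))  # d_k^2 = 1/k, partial sums ~ log n
ratio = np.cumsum(e * d) / np.cumsum(d ** 2)
print(ratio[[99, 9_999, n - 1]])          # tends to 0
```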

Asymptotic distribution (Jacob, 2010)

1. Taylor expansion of $\dot{S}_n(\widehat{\theta}_n)$ at $\theta_0$:

$$0 = \dot{S}_n(\widehat{\theta}_n) = \dot{S}_n(\theta_0) + \ddot{S}_n(\theta_n^*)\big(\widehat{\theta}_n - \theta_0\big) \ \Rightarrow\ \widehat{\theta}_n - \theta_0 = -\ddot{S}_n(\theta_n^*)^{-1}\, \dot{S}_n(\theta_0).$$

2. Find a $p \times p$ deterministic matrix $\Phi_n$ such that

$$\Phi_n^{1/2}\big(\widehat{\theta}_n - \theta_0\big) = -\Phi_n^{1/2}\, \ddot{S}_n(\theta_n^*)^{-1}\, \dot{S}_n(\theta_0),$$

with $\lim_n \ddot{S}_n(\theta_n^*)\, \Phi_n^{-1} \stackrel{P}{=} I$ and $\lim_n \Phi_n^{-1/2}\, \dot{S}_n(\theta_0)$ existing in distribution, where

$$\dot{S}_n(\theta_0) = -2 \sum_{k=1}^{n} e_k\, \dot{f}(\theta_0, \mathcal{F}_{k-1}), \qquad \ddot{S}_n(\theta_n^*) = 2 \sum_{k=1}^{n} \dot{f}(\theta_n^*, \mathcal{F}_{k-1})\, \dot{f}(\theta_n^*, \mathcal{F}_{k-1})^T - 2 \underbrace{\sum_{k=1}^{n} e_k\, \ddot{f}(\theta_n^*, \mathcal{F}_{k-1})}_{\ne\, 0 \text{ in the nonlinear case}},$$

$$\Phi_n^{1/2} = O\bigg(\Big(\sum_{k=1}^{n} \dot{f}(\theta_0, \mathcal{F}_{k-1})\, \dot{f}(\theta_0, \mathcal{F}_{k-1})^T\Big)^{1/2}\bigg).$$

Use the SLLNSM to prove that $\lim_n \sum_{k=1}^{n} e_k\, \ddot{f}(\theta_n^*, \mathcal{F}_{k-1})\, \Phi_n^{-1/2} \stackrel{a.s.}{=} 0$.

Examples

Polymerase Chain Reaction (PCR): replication in vitro of a population of $N_0$ DNA molecules (Lalam, Jacob & Jagers, 2004)

$$N_n = \sum_{i=1}^{N_{n-1}} (1 + X_{n,i}), \qquad \{X_{n,i}\}_i \mid \mathcal{F}_{n-1} \ \text{i.i.d.}\ Ber\big(p_{\theta_0}(N_{n-1})\big),$$

$$p_{\theta_0}(N_{n-1}) := P(X_{n,i} = 1 \mid N_{n-1}) = \frac{K_0}{(K_0 + N_{S_0, n-1})\,\big(1 + \exp(C_0 (S_0^{-1} N_{S_0, n-1} - 1))\big)},$$

where $N_{S_0, n-1} = S_0$ if $N_{n-1} < S_0$, and $N_{S_0, n-1} = N_{n-1}$ if $N_{n-1} \ge S_0$.

$N_n$ increases exponentially when $N_{n-1} < S_0$ (BGW branching process); $N_n$ increases linearly ($\lim_n N_n n^{-1} \stackrel{a.s.}{=} K_0/2$) when $N_{n-1} \ge S_0$.

Observations: $Z_n = N_n + \xi_n$

$$\underset{n \text{ large}}{\Longrightarrow} \quad Z_n \approx \Big(1 + \frac{K_0}{(K_0 + N_{S_0, n-1})\big(1 + \exp(C_0 (S_0^{-1} N_{S_0, n-1} - 1))\big)}\Big)\, N_{n-1} + e_n + \xi_n \approx \underbrace{Z_{n-1} + \frac{K_0 Z_{n-1}}{2 (K_0 + Z_{n-1})}}_{g^{(1)}(\theta_0, Z_{n-1})} + \underbrace{O\big(\exp(C_0 (S_0^{-1} Z_{n-1} - 1))\big)}_{g^{(2)}(\theta_0, Z_{n-1}, \xi_n)} + e_n.$$

No asymptotic identifiability of $(K, C, S)$, because of $C$ and $S$.

Strong identifiability of $K$ given $(\widehat{C}_n, \widehat{S}_n)$:

$$\Rightarrow\ \lim_n \widehat{K}_n \mid (\widehat{C}_n, \widehat{S}_n) \stackrel{a.s.}{=} K_0, \qquad n^{1/2}\big(\widehat{K}_n - K_0\big) \mid (\widehat{C}_n, \widehat{S}_n) \ \stackrel{D}{\longrightarrow}\ N(0, K/2).$$
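A simulation sketch of this saturation model, using the efficiency formula as reconstructed above with $C_0 = 0$ (the value used in Figure 1); all constants are illustrative, and the binomial draw encodes the Bernoulli duplication of the $N_{n-1}$ molecules.

```python
# PCR branching process with saturating duplication probability p(N).
import numpy as np

rng = np.random.default_rng(4)
K0, S0, C0 = 4.0e10, 1.0e10, 0.0
N = [1.0e6]

def efficiency(N_prev):
    Ns = max(N_prev, S0)  # N_{S0,n-1} = S0 below the threshold, N_{n-1} above
    return K0 / ((K0 + Ns) * (1.0 + np.exp(C0 * (Ns / S0 - 1.0))))

for _ in range(40):
    p = efficiency(N[-1])
    N.append(N[-1] + rng.binomial(int(N[-1]), p))  # each molecule duplicated w.p. p

N = np.array(N)
emp = N[1:] / N[:-1] - 1.0  # empirical efficiency Z_k / Z_{k-1} - 1
print(emp[:3], emp[-3:])    # constant in the exponential phase, then decreasing
```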

Figure 1: Efficiency $\{p_{\theta_0}(N_{k-1})\}_{k \le n}$ calculated from a simulated trajectory of the branching process ($K_0 = 4.00311 \cdot 10^{10}$, $S_0 = 10^{10}$, $C_0 = 0$). Dashed line: $\widetilde{p}(Z_{k-1}) = Z_k Z_{k-1}^{-1} - 1$, $k \le n$ (empirical efficiency); continuous line: $\widehat{p}_n(Z_{k-1})$, $k \le n$ (estimated efficiency). (Plot omitted.)

Figure 2: Real-time PCR, well 21 of data set 1, efficiency $\{p_{\theta_0}(N_{k-1})\}_{k \le n}$. Dashed line: $\{\widetilde{p}(Z_{k-1}) = Z_k Z_{k-1}^{-1} - 1\}_{k \le n}$ (empirical efficiency); continuous line: $\{\widehat{p}_n(Z_{k-1})\}_{k \le n}$ (estimated efficiency), with $\widehat{n}_s = 23$ (saturation threshold cycle), $\widehat{K}_{h,n} = 0.38055$, $\widehat{S}_{h,n} = 0.070553$, $\widehat{C}_{h,n} = 0.6$. (Plot omitted.)

PCR: another model (Jacob, 2010)

$$N_n = \sum_{i=1}^{N_{n-1}} (1 + X_{n,i}), \qquad P(X_{n,i} = 1 \mid N_{n-1}) = \frac{K_0 \big(1 + S_0^{\delta_0} N_{S_0, n-1}^{-\delta_0}\big)}{2\,(K_0 + N_{S_0, n-1})}, \qquad \delta_0 > 0,$$

$$\underset{n \text{ large}}{\Longrightarrow} \quad Z_n \approx Z_{n-1} + \frac{K_0 Z_{n-1}}{2 (K_0 + Z_{n-1})} + \frac{K_0 S_0^{\delta_0} Z_{n-1}^{1 - \delta_0}}{2 (K_0 + Z_{n-1})} + O(\eta_n) + e_n.$$

$(K, S, \delta)$ is not asymptotically identifiable, due to $\delta$ $\Rightarrow$ assume $\delta_0$ known, with $0 < 2\delta_0 \le 1$.

Let $\theta = (K, S^{\delta_0})$. Then

$$\lim_n \Phi_n^{1/2}\big(\widehat{\theta}_n - \theta_0\big) \stackrel{d}{=} N\big(0, (K/2) I\big), \qquad \Phi_n = \frac{1}{4} \begin{pmatrix} n & K n^{1 - \delta_0} \\ K n^{1 - \delta_0} & K^2 a_n(\delta_0) \end{pmatrix}.$$

GARCH(1, 1)

$$Z_n := \varepsilon_n^2 = \underbrace{s_n^2(\theta_0)}_{g(\theta_0, \eta_0, \mathcal{F}_{n-1})} + \underbrace{s_n^2(\theta_0)(U_n^2 - 1)}_{e_n}, \qquad E(U_n^2 \mid \mathcal{F}_{n-1}) = 1, \qquad s_n^2(\theta) = \alpha_0 + \alpha_1 \varepsilon_{n-1}^2 + \beta_1 s_{n-1}^2(\theta)$$

$$\Rightarrow\ s_n^2(\theta) = \alpha_0 \sum_{l=0}^{n-1} \beta_1^l + \alpha_1 \sum_{l=0}^{n-1} \beta_1^l\, \varepsilon_{n-1-l}^2 + \beta_1^n s_0^2 = \underbrace{\alpha_0 \sum_{l=0}^{\infty} \beta_1^l + \alpha_1 \sum_{l=0}^{n-1} \beta_1^l\, \varepsilon_{n-1-l}^2}_{g^{(1)}(\theta, \mathcal{F}_{n-1})} + \underbrace{\beta_1^n \Big(s_0^2 - \alpha_0 \sum_{l=0}^{\infty} \beta_1^l\Big)}_{g^{(2)}(\theta, \eta, \mathcal{F}_{n-1}) =: \eta_n}$$

Let $\gamma_0 := \alpha_{10} + \beta_{10}$.

A.N. of $g^{(2)}$: $\lim_n g^{(2)}(\theta_0, \eta_0, \mathcal{F}_{n-1}) = 0$ for $\beta_{10} < 1$ $\Rightarrow$ take $\widehat{\eta}_n := \widehat{g^{(2)}}(\theta_0, \eta_0, \mathcal{F}_{n-1}) = 0$.

$$E(s_n^2(\theta_0)) = \alpha_{00} \sum_{k=0}^{n-1} \gamma_0^k + \gamma_0^n\, s_0^2(\theta_0):$$

$\gamma_0 < 1 \ \Rightarrow\ \lim_n E(s_n^2(\theta_0)) = \alpha_{00} (1 - \gamma_0)^{-1}$
$\gamma_0 > 1 \ \Rightarrow\ \lim_n s_n^2(\theta_0)\, \gamma_0^{-n} \stackrel{a.s.}{=} W$, $E(W) < \infty$
$\gamma_0 = 1 \ \Rightarrow\ E(s_n^2(\theta_0)) = n\, \alpha_{00} + s_0^2(\theta_0)$
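A simulation sketch of this GARCH(1,1) setup with $U_n^2$ i.i.d. $\exp(1)$, as in the figures below; the parameter values are those of the first line of Figure 3.

```python
# GARCH(1,1): Z_n = eps_n^2 = s_n^2(theta0) * U_n^2, U_n^2 ~ exp(1).
import numpy as np

rng = np.random.default_rng(5)
a0, a1, b1 = 10.0, 0.1, 0.8   # (alpha_00, alpha_10, beta_10), gamma_0 = 0.9
n = 10_000
s2, Z = np.zeros(n + 1), np.zeros(n + 1)
for k in range(1, n + 1):
    s2[k] = a0 + a1 * Z[k - 1] + b1 * s2[k - 1]  # volatility recursion
    Z[k] = s2[k] * rng.exponential(1.0)          # Z_k = s_k^2 * U_k^2

# gamma_0 < 1: E s_n^2 -> alpha_00 / (1 - gamma_0) = 100
print(Z.mean(), a0 / (1.0 - (a1 + b1)))
```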

Figure 3: Simulations with $\{U_n^2\}$ i.i.d. $\exp(1)$. Red line: $\{\varepsilon_n^2\}$; blue line: $\{s_n^2(\theta)\}$. On the first line, $\theta_0 = (\alpha_{00}, \alpha_{10}, \beta_{10}) = (10, 0.1, 0.8)$; on the second line, $\theta_0 = (10, 0.22, 0.8)$; on the third line, $\theta_0 = (10, 0.3, 0.8)$. (Plots omitted.)

Conditional Least Squares estimator of $\theta_0 := (C_0, \alpha_{10}, \beta_{10})$, $\beta_{10} < 1$, $C_0 := \alpha_{00}(1 - \beta_{10})^{-1}$

$$Z_n := \varepsilon_n^2 = \underbrace{s_n^2(\theta_0)}_{g(\theta_0, \eta_0, \mathcal{F}_{n-1})} + \underbrace{s_n^2(\theta_0)(U_n^2 - 1)}_{e_n} =: g(\theta_0, \eta_0, \mathcal{F}_{n-1}) + e_n$$

$$= \underbrace{C_0 + \alpha_{10} \sum_{l=0}^{n-1} \beta_{10}^l\, \varepsilon_{n-1-l}^2}_{g^{(1)}(\theta_0, \mathcal{F}_{n-1})} + \underbrace{g^{(2)}(\theta_0, \eta_0, \mathcal{F}_{n-1})}_{\eta_n} + e_n, \qquad E(e_n^2 \mid \mathcal{F}_{n-1}) \propto s_n^4(\theta_0)$$

$$\widehat{\theta}_n \mid \widehat{\eta} = 0 := \arg\min_{\theta} S_n(\theta, 0), \qquad S_n(\theta, 0) := \sum_{k=1}^{n} \big(Z_k - g^{(1)}(\theta, \mathcal{F}_{k-1})\big)^2\, \lambda(\mathcal{F}_{k-1})$$

Optimality if $\lambda(\mathcal{F}_{n-1}) \propto \big(\mathrm{Var}(Z_n \mid \mathcal{F}_{n-1})\big)^{-1} \propto s_n^{-4}(\theta_0)$

$$\Rightarrow\ \text{take } \lambda(\mathcal{F}_{n-1}) = \big(g^{(1)}(\theta_n^*, \mathcal{F}_{n-1})\big)^{-2} = \Big(1 + \sum_{l=0}^{n-1} \delta^l\, \varepsilon_{n-1-l}^2\Big)^{-2}, \qquad \delta < 1.$$

Strong consistency of $\widehat{\theta}_n$ if A1 and A2 are checked:

A1 is checked for all $\delta$, $0 < \delta < 1$.

For $\theta = (C, \alpha_1, \beta_1)$, A2 $\Longleftrightarrow \sum_{k=1}^{\infty} \lambda(\mathcal{F}_{k-1}) \stackrel{a.s.}{=} \infty$:

$$\sum_{k=1}^{\infty} \lambda(\mathcal{F}_{k-1}) = \sum_{k=1}^{\infty} \Big(1 + \sum_{l=0}^{k-2} \delta^{k-2-l}\, \varepsilon_l^2\Big)^{-2} \ \ge\ \sum_{m} (L_{m+1} - L_m)\, \big(1 + (1 - \delta)^{-1} L_m^2\big)^{-2},$$

where $M_{k-1}^2 := \sup_{l \le k-1} \{\varepsilon_l^2\}$ and $L_m^2 :=$ $m$-th record of $\{\varepsilon_n^2\}$

$\Rightarrow$ A2 holds if $(L_{m+1} - L_m)(L_m^2)^{-2}$ does not tend to 0 too quickly ($\gamma_0 \le 1$);

$\Rightarrow$ for $\gamma_0 > 1$,

$$\sum_{k=1}^{\infty} \lambda(\mathcal{F}_{k-1}) \le \sum_{k=1}^{\infty} \Big(1 + W \gamma_0^{k-1} \sum_{l} (\delta/\gamma_0)^{k-1-l}\, U_l^2\Big)^{-2} \stackrel{a.s.}{<} \infty.$$

For $\theta = (C_0, \alpha_1, \beta_{10})$ or $\theta = (C_0, \alpha_{10}, \beta_1)$, A2 is checked for all $\gamma_0$.


Asymptotic distribution of $\widehat{\theta}_n$, $\theta = (\alpha_1, \beta_1)$, $\gamma_0 > 1$:

$$\lim_n \Phi_n^{1/2}\big(\widehat{\theta}_n - \theta_0\big) = N(0, \Sigma), \quad \Sigma \text{ depending on } \theta_0 \text{ and } \delta; \qquad \mathrm{Var}(\widehat{\alpha}_n) \text{ and } \mathrm{Var}(\widehat{\beta}_n) \text{ are minimum for } \delta = \beta_{10}.$$

Simulations of $\{\varepsilon_n^2\}_{n=1}^{N}$, $\{U_n^2\}$ i.i.d. $\exp(1)$, $s_1^2(\theta_0) = 0$; computation of $\widehat{\theta}_N$ for different values of $\theta_0 = (\alpha_{00}, \alpha_{10}, \beta_{10})$ and $N$.

For each value of $\theta_0$ and of $N$, two graphics are given. The first one represents $\varepsilon_1^2, \ldots, \varepsilon_N^2$ (erratic line) together with $s_1^2(\theta_0), \ldots, s_N^2(\theta_0)$ (smooth line). The second one represents $S_n(\theta, 0)$ calculated with $(1, 1, 0.999)$, for $\theta \in [\alpha_{00} - 0.1, \alpha_{00} + 0.05] \times [\alpha_{10} - 0.1, \alpha_{10} + 0.05] \times [\beta_{10} - 0.1, \beta_{10} + 0.05]$.
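A sketch of the grid evaluation behind Figures 4 and 5, assuming the weight $\lambda(\mathcal{F}_{k-1}) = (1 + \sum_l \delta^l \varepsilon_{k-1-l}^2)^{-2}$ with $\delta = 0.999$ and the parametrization $\theta = (C, \alpha_1, \beta_1)$ introduced above; the grid bounds and sample size are illustrative.

```python
# Grid evaluation of the weighted criterion S_n(theta, 0) for GARCH(1,1).
import numpy as np
from itertools import product

def S_n(theta, Z, delta=0.999):
    # g1(theta, F_{k-1}) = C + a1 * sum_l b1^l eps_{k-1-l}^2,
    # weight lambda(F_{k-1}) = (1 + sum_l delta^l eps_{k-1-l}^2)^(-2)
    C, a1, b1 = theta
    total, acc_b, acc_d = 0.0, 0.0, 0.0
    for k in range(1, len(Z)):
        total += (Z[k] - (C + a1 * acc_b)) ** 2 / (1.0 + acc_d) ** 2
        acc_b = Z[k] + b1 * acc_b      # running sum for the predictor
        acc_d = Z[k] + delta * acc_d   # running sum for the weight
    return total

rng = np.random.default_rng(6)         # simulate eps_n^2, theta_0 as in Figure 4
a0, a1, b1, n = 0.2, 0.2, 0.1, 300
Z, s2 = np.zeros(n + 1), np.zeros(n + 1)
for k in range(1, n + 1):
    s2[k] = a0 + a1 * Z[k - 1] + b1 * s2[k - 1]
    Z[k] = s2[k] * rng.exponential(1.0)

C0 = a0 / (1.0 - b1)
grid = product(np.linspace(C0 - 0.1, C0 + 0.05, 4),
               np.linspace(a1 - 0.1, a1 + 0.05, 4),
               np.linspace(b1 - 0.1, b1 + 0.05, 4))
theta_hat = min(grid, key=lambda t: S_n(t, Z))
print(theta_hat, S_n(theta_hat, Z), S_n((C0, a1, b1), Z))
```

As in the figures, the criterion value at the grid minimizer is typically only slightly below its value at the true parameter.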
Figure 4: $\theta_0 = (0.2, 0.2, 0.1)$. (Plots omitted.)
1st line, $N = 100$: $\min_\theta S_n(\theta) = S_n(0.16, 0.11, 0.15) = 0.2913$, and $S_n(\theta_0) = 0.3129$.
2nd line, $N = 300$: $\min_\theta S_n(\theta) = S_n(0.24, 0.11, 0.09) = 0.1460$, and $S_n(\theta_0) = 0.1564$.
3rd line, $N = 600$: $\min_\theta S_n(\theta) = S_n(0.25, 0.25, 0.07) = 0.1612$, and $S_n(\theta_0) = 0.1889$.

Figure 5: $\theta_0 = (0.2, 0.3, 0.9)$. (Plots omitted.)
1st line, $N = 100$: $\min_\theta S_n(\theta) = S_n(0.23, 0.22, 0.95) = 4.2846$, and $S_n(\theta_0) = 4.3297$.
2nd line, $N = 300$: $\min_\theta S_n(\theta) = S_n(0.25, 0.25, 0.92) = 9.4485$, and $S_n(\theta_0) = 9.6351$.
3rd line, $N = 600$: $\min_\theta S_n(\theta) = S_n(0.12, 0.35, 0.86) = 24.0845$, and $S_n(\theta_0) = 24.3542$.

General comments:
For each value of $\theta_0$ and of $N$, the minimum value of $S_n(\theta)$ is quite close to $S_n(\theta_0)$.
Assuming $C_0$ known, or setting $\widehat{\eta} = 0$, does not significantly improve the results.

Conclusion

Indirect way of proof (Wu's Lemma) + SLLNSM
$\Rightarrow$ the difficulties (stochasticity, nonstationarity, nonlinearity, no explicit expression of $\widehat{\theta}_n$) are removed
$\Rightarrow$ strong consistency and asymptotic distribution of $\widehat{\theta}_n$.

Thank you for your attention!

Main reference:
Jacob, C. (2010). Conditional Least Squares Estimation in nonstationary nonlinear stochastic regression models. Ann. Statist., 38(1), 566-597.
