Lecture2 PDF

This document provides an overview and summary of key concepts related to random processes, linear least squares, and system identification. It begins with an overview of dynamical systems and random processes, including definitions of linear time-invariant systems, input-output representations, and stochastic signals. It then discusses deterministic least squares problems, including defining the least squares solution and proving the classical solution using normal equations. Finally, it discusses statistical properties of least squares estimates and applications of least squares to system identification.

Filtering and Identification

Lecture 2: Random processes and Linear Least Squares

Michel Verhaegen and Jan-Willem van Wingerden



Delft Center for Systems and Control


Delft University of Technology
Overview
• Dynamical Systems and Random Processes
• Deterministic least squares (LS) problems
• Statistical Properties of LS estimates
• Application of LS to System Identification

Dynamical Systems (Chapter 3)

[Block diagram: system Σ with input u(k), output y(k) and disturbance v(k)]

Σ: LTI (Linear Time-Invariant) system with representation:
• IIR: $y(k) = \sum_{\ell=-\infty}^{\infty} g(\ell)\,u(k-\ell) + v(k)$
• (Or) State Space Model:
  $x(k+1) = A x(k) + B u(k)$
  $y(k) = C x(k) + D u(k) + v(k)$
• (Or) ...
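
As a rough illustration of how the first two representations relate, the sketch below simulates a small state-space model and checks that its output equals the convolution of u(k) with the impulse response g(0) = D, g(ℓ) = C A^(ℓ−1) B for ℓ ≥ 1. It assumes a causal system, zero initial state and v(k) = 0. The matrices A, B, C, D and the helper names are illustrative choices, not taken from the lecture.

```python
import numpy as np

def simulate_state_space(A, B, C, D, u):
    """Simulate x(k+1) = A x(k) + B u(k), y(k) = C x(k) + D u(k) with x(0) = 0."""
    x = np.zeros(A.shape[0])
    y = np.zeros(len(u))
    for k, uk in enumerate(u):
        y[k] = C @ x + D * uk
        x = A @ x + B * uk
    return y

def impulse_response(A, B, C, D, length):
    """Return g(0), ..., g(length-1) with g(0) = D and g(l) = C A^(l-1) B."""
    g = np.zeros(length)
    g[0] = D
    A_pow_B = B.copy()            # holds A^(l-1) B
    for l in range(1, length):
        g[l] = C @ A_pow_B
        A_pow_B = A @ A_pow_B
    return g

# Hypothetical stable second-order SISO system (values chosen for illustration only).
A = np.array([[1.5, -0.7], [1.0, 0.0]])
B = np.array([1.0, 0.0])
C = np.array([1.0, -1.0])
D = 0.0

rng = np.random.default_rng(0)
u = rng.standard_normal(50)

y_ss = simulate_state_space(A, B, C, D, u)
g = impulse_response(A, B, C, D, len(u))
y_conv = np.convolve(g, u)[:len(u)]   # causal convolution sum, truncated to the data length
```

Comparing `y_ss` with `y_conv` (for example with `np.allclose`) is a quick sanity check that both descriptions produce the same input-output behaviour.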
Signals (Chapter 4: 4.1 - 4.4)

[Block diagram: system Σ with input u(k), output y(k) and disturbance v(k)]

1. The quantities u(k), y(k) are (measured) signals, also called discrete-time sequences or sampled input-output data sequences, generally denoted as:

   Given: $\{u(k), y(k)\}_{k=1}^{N}$, $N \in \mathbb{N}$

2. The quantity v(k) is a stochastic process (generally unknown), that is, a discrete-time sequence of random variables with a (Gaussian) probability density. In this course we restrict ourselves to its mean and covariance function.
Stochastic Signals

[Block diagram: white noise e(k) enters the LTI system $\Sigma_n$, which produces v(k)]

In this course, a stochastic process v(k) is stationary and is assumed to result from filtering zero-mean white noise e(k) with an LTI system $\Sigma_n$.
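
A minimal sketch of this noise model, assuming a hypothetical first-order FIR noise filter v(k) = e(k) + c e(k−1). The coefficient c, the sample size and the variable names are illustrative assumptions. It compares the sample mean and variance of v(k) with the values expected for such a filter.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000
sigma_e = 1.0
c = 0.5                                   # hypothetical filter coefficient

e = sigma_e * rng.standard_normal(N)      # zero-mean white noise e(k)
h = np.array([1.0, c])                    # impulse response of the assumed noise filter
v = np.convolve(h, e)[:N]                 # v(k) = e(k) + c e(k-1), with e(-1) taken as 0

mean_v = v.mean()                         # sample mean, expected to be near 0
var_v = v.var()                           # sample variance
var_theory = sigma_e**2 * (1.0 + c**2)    # variance of v(k) for this filter
```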

Stationarity
Definition of wide-sense stationarity (WSS): A random process $x(k) \in \mathbb{R}$ is WSS if the following three conditions are satisfied:
1. the mean is constant, $m_x(k) = m_x$
2. the auto-correlation function $R_x(k, \ell) = E[x(k)x(\ell)]$ only depends on the lag $k - \ell$
3. the variance $E[(x(k) - m_x)^2]$ is finite
Hence, with $\tau = k - \ell$,
$R_x(k, \ell) = R_x(k - \ell) = R_x(\tau) = E[x(k)x(k - \tau)]$
White noise $e(k) \sim (0, \sigma_e^2)$
A zero-mean white noise sequence (ZMWN): The random process e(k) is a ZMWN if it has mean zero and its auto-covariance (auto-correlation) function equals:
$E[e(k)e(\ell)] = \begin{cases} \sigma_e^2 & \text{for } k = \ell \\ 0 & \text{otherwise} \end{cases}$

Denoted as $R_e(\tau) = E[e(k)e(k - \tau)] = \sigma_e^2 \Delta(\tau)$, with $\Delta(\tau)$ the unit pulse.

RPs in the time-domain
If the (real) RPs x(k) and y(k) are wide-sense stationary (WSS), then these RPs are fully characterized in the time domain by their means
$E[x(k)] = m_x, \quad E[y(k)] = m_y$
and their auto- and cross-covariance functions:
$C_x(\tau) = E\left[(x(k) - m_x)(x(k - \tau) - m_x)^T\right]$
$C_{xy}(\tau) = E\left[(x(k) - m_x)(y(k - \tau) - m_y)^T\right]$

RPs in the time-domain
An equivalent characterization is to replace the auto- and cross-covariance functions by the auto- and cross-correlation functions:
$R_x(\tau) = E\left[x(k)x(k - \tau)^T\right] = E\left[x(k + \tau)x(k)^T\right]$
$R_{xy}(\tau) = E\left[x(k)y(k - \tau)^T\right]$

RPs in the time-domain
The numerical calculation may proceed under the assumption of ergodicity, which makes it possible to prove relationships like:
$\Pr\left[\lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} x(k)x(k - \tau)^T = R_x(\tau)\right] = 1$
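
In practice the limit is replaced by a finite time average. The sketch below, with an illustrative helper name, variance and sample size (all assumptions), estimates $R_e(\tau)$ for a scalar zero-mean white noise sequence. The estimate should be close to $\sigma_e^2$ at lag 0 and close to 0 at the other lags.

```python
import numpy as np

def sample_autocorrelation(x, max_lag):
    """Estimate R_x(tau) = E[x(k) x(k - tau)] for tau = 0..max_lag by time averaging."""
    N = len(x)
    return np.array([np.dot(x[tau:], x[:N - tau]) / N for tau in range(max_lag + 1)])

rng = np.random.default_rng(2)
sigma_e = 2.0                                 # assumed noise standard deviation
e = sigma_e * rng.standard_normal(200_000)    # zero-mean white noise, variance sigma_e^2

R_hat = sample_autocorrelation(e, max_lag=5)  # R_hat[tau] approximates sigma_e^2 * Delta(tau)
```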

Deterministic least squares problem
Given $F \in \mathbb{R}^{N \times n}$, $y \in \mathbb{R}^N$, the problem is:

$\min_x \epsilon^T \epsilon$ subject to: $y = Fx + \epsilon$

The argument that minimizes this problem is the least squares solution and is denoted as $\hat{x}$.
For all $x \in \mathbb{R}^n$, it satisfies
$\|y - F\hat{x}\|_2^2 \le \|y - Fx\|_2^2$

Deterministic least squares problem

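$\|y - F\hat{x}\|_2^2 \le \|y - Fx\|_2^2$

[Figure: the vectors y, Fx, $\hat{y}$ and the residual $\epsilon$, drawn relative to span(F), the plane spanned by $f_1$ and $f_2$]

where $\hat{y} = F\hat{x} = \begin{bmatrix} f_1 & f_2 \end{bmatrix} \hat{x}$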

The classical solution
Lemma: Let the matrix F in
$\min_x \epsilon^T \epsilon$ subject to: $y = Fx + \epsilon$
have full column rank, then the solution $\hat{x}$ is:
$\hat{x} = (F^T F)^{-1} F^T y$

This follows from the normal equations:
$F^T F \hat{x} = F^T y$
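
A small numerical sketch of the lemma, using random data as a stand-in (the dimensions and values are assumptions). It solves the normal equations directly and compares the result with NumPy's general least squares routine.

```python
import numpy as np

rng = np.random.default_rng(3)
N, n = 20, 3
F = rng.standard_normal((N, n))          # a random F has full column rank with probability 1
y = rng.standard_normal(N)

# Solve the normal equations F^T F x = F^T y (without forming an explicit inverse).
x_hat = np.linalg.solve(F.T @ F, F.T @ y)

# Reference solution from NumPy's least squares routine.
x_ref, *_ = np.linalg.lstsq(F, y, rcond=None)

# The residual is orthogonal to the columns of F, which is the normal equations restated.
residual = y - F @ x_hat
```

The normal equations are convenient for analysis, but forming $F^T F$ squares the condition number of the problem, so orthogonal factorizations are usually preferred numerically.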

Proof of the classical solution
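Via the completion of squares. For all x, and for $\hat{x}$ satisfying:

$(F^T F)\hat{x} = F^T y$

we can write the least squares cost function as:

$\|y - Fx\|_2^2 = (y - Fx)^T (y - Fx)$
$= y^T y - x^T F^T y - y^T F x + x^T F^T F x$
$= y^T y - y^T F \hat{x} + (x - \hat{x})^T F^T F (x - \hat{x})$

Since F has full column rank, $F^T F$ is positive definite, so the last term is nonnegative and vanishes only for $x = \hat{x}$. Therefore,
$\arg\min_x \|y - Fx\|_2^2 = \hat{x}$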
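
The last equality in the derivation can be checked by expanding the quadratic term and using the normal equations, which also give $\hat{x}^T F^T F = y^T F$. A short worked version:

```latex
\begin{aligned}
(x-\hat{x})^T F^T F (x-\hat{x})
  &= x^T F^T F x - x^T F^T F \hat{x} - \hat{x}^T F^T F x + \hat{x}^T F^T F \hat{x} \\
  &= x^T F^T F x - x^T F^T y - y^T F x + y^T F \hat{x}
\end{aligned}
```

Adding $y^T y - y^T F \hat{x}$ to both sides recovers $y^T y - x^T F^T y - y^T F x + x^T F^T F x$, which is the expanded cost function.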

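Proof of the classical solution
$\|y - Fx\|_2^2 = \begin{bmatrix} 1 & x^T \end{bmatrix} \underbrace{\begin{bmatrix} y^T y & -y^T F \\ -F^T y & F^T F \end{bmatrix}}_{M} \begin{bmatrix} 1 \\ x \end{bmatrix}$

$M = \begin{bmatrix} I & -\hat{x}^T \\ 0 & I \end{bmatrix} \begin{bmatrix} y^T y - y^T F \hat{x} & 0 \\ 0 & F^T F \end{bmatrix} \begin{bmatrix} I & 0 \\ -\hat{x} & I \end{bmatrix},$

for $\hat{x}$ satisfying
$F^T F \hat{x} = F^T y.$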

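Proof of the classical solution (cont'd)
$\|y - Fx\|_2^2 = \begin{bmatrix} 1 & x^T \end{bmatrix} \begin{bmatrix} y^T y & -y^T F \\ -F^T y & F^T F \end{bmatrix} \begin{bmatrix} 1 \\ x \end{bmatrix}$

$= \begin{bmatrix} 1 & x^T \end{bmatrix} \begin{bmatrix} I & -\hat{x}^T \\ 0 & I \end{bmatrix} \begin{bmatrix} y^T y - y^T F \hat{x} & 0 \\ 0 & F^T F \end{bmatrix} \begin{bmatrix} I & 0 \\ -\hat{x} & I \end{bmatrix} \begin{bmatrix} 1 \\ x \end{bmatrix}$

$= \begin{bmatrix} 1 & (x - \hat{x})^T \end{bmatrix} \begin{bmatrix} y^T y - y^T F \hat{x} & 0 \\ 0 & F^T F \end{bmatrix} \begin{bmatrix} 1 \\ x - \hat{x} \end{bmatrix}$

for $\hat{x}$ satisfying $F^T F \hat{x} = F^T y$.

$\|y - Fx\|_2^2 = (y^T y - y^T F \hat{x}) + (x - \hat{x})^T F^T F (x - \hat{x}).$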

“Measurement Errors” $\epsilon \sim (0, I)$

Given $F \in \mathbb{R}^{N \times n}$, $y \in \mathbb{R}^N$, the problem is:

$\min_x \epsilon^T \epsilon$ subject to: $y = Fx + \epsilon$

• F is a known full column rank matrix.
• x is an unknown, deterministic vector.
• $\epsilon$ is a zero-mean random vector with $E[\epsilon \epsilon^T] = I$

Linear Estimators for the least squares problem

The least squares solution
$\hat{x} = (F^T F)^{-1} F^T y$
is a linear estimator: it is linear in y.
Definition: A linear estimator for x given y has the form:
$\tilde{x} = My$
with $M \in \mathbb{R}^{n \times N}$

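Unbiased and minimum variance
The linear estimator $\hat{x} = \hat{M}y$ is unbiased if
$E[\hat{x} - x] = 0$

The unbiased linear estimator $\hat{x} = \hat{M}y$ is called the minimum variance estimator if
$E\left[(\hat{x} - x)(\hat{x} - x)^T\right] \le E\left[(\tilde{x} - x)(\tilde{x} - x)^T\right]$
for all unbiased linear estimators $\tilde{x} = My$, where $\le$ is understood in the positive semidefinite sense.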

The Gauss-Markov theorem
The least squares solution
$\hat{x} = (F^T F)^{-1} F^T y$
is an unbiased minimum variance estimate (UMVE) and has covariance matrix:
$E\left[(x - \hat{x})(x - \hat{x})^T\right] = (F^T F)^{-1}$
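
The covariance expression can be checked empirically. The following is a minimal Monte Carlo sketch under assumed values for F, the true x, the dimensions and the number of trials. It is not the course's `lsq_mvar.m` script, whose contents are not shown here.

```python
import numpy as np

rng = np.random.default_rng(4)
N, n = 30, 2
F = rng.standard_normal((N, n))            # known regressor matrix (assumed values)
x_true = np.array([1.0, -2.0])             # unknown deterministic vector (assumed values)
M_hat = np.linalg.solve(F.T @ F, F.T)      # least squares estimator matrix (F^T F)^{-1} F^T

trials = 20_000
eps = rng.standard_normal((trials, N))     # each row: zero-mean noise with E[eps eps^T] = I
Y = x_true @ F.T + eps                     # each row: one realisation of y = F x + eps
X_hat = Y @ M_hat.T                        # each row: the LS estimate for that realisation

errors = X_hat - x_true
bias = errors.mean(axis=0)                 # empirical bias, expected to be near zero
cov_mc = errors.T @ errors / trials        # Monte Carlo estimate of E[(x_hat - x)(x_hat - x)^T]
cov_theory = np.linalg.inv(F.T @ F)        # covariance stated by the theorem
```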

Proof of the Gauss-Markov theorem
Linear estimator which is UNBIASED
$\tilde{x} = My = MFx + M\epsilon \;\Rightarrow\; \tilde{x} - x = (MF - I)x + M\epsilon$
Consider the mean of $\tilde{x} - x$:
$E[\tilde{x} - x] = (MF - I_n)x + M E[\epsilon] = (MF - I)x$
The linear estimator is unbiased provided

$E[\tilde{x} - x] = 0 \;\Leftrightarrow\; MF = I$

The least squares estimator $M = (F^T F)^{-1} F^T$ clearly satisfies $MF = I_n$.
Proof of Minimum Variance Property
Recall
$\tilde{x} - x = (MF - I)x + M\epsilon = M\epsilon$
Then, the covariance matrix of the unbiased linear estimate $\tilde{x} = My$ with M satisfying $MF = I$ is:
$E\left[(\tilde{x} - x)(\tilde{x} - x)^T\right] = M E[\epsilon \epsilon^T] M^T = M M^T$

For the least squares solution $\hat{x}$ (with $\hat{M} = (F^T F)^{-1} F^T$), the covariance matrix equals
$E\left[(\hat{x} - x)(\hat{x} - x)^T\right] = (F^T F)^{-1} F^T F (F^T F)^{-1} = (F^T F)^{-1}$
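
The comparison between $MM^T$ and $(F^T F)^{-1}$ is not written out above. One standard way to complete the argument is sketched here, with $\Delta = M - \hat{M}$ as an auxiliary symbol that is not part of the lecture notation:

```latex
\begin{aligned}
\Delta F &= MF - \hat{M}F = I - I = 0, \\
\hat{M}\Delta^T &= (F^T F)^{-1} F^T \Delta^T = (F^T F)^{-1} (\Delta F)^T = 0, \\
M M^T &= (\hat{M} + \Delta)(\hat{M} + \Delta)^T
       = \hat{M}\hat{M}^T + \Delta\Delta^T
       = (F^T F)^{-1} + \Delta\Delta^T \succeq (F^T F)^{-1}.
\end{aligned}
```

Because $\Delta\Delta^T$ is positive semidefinite, no unbiased linear estimator has a smaller covariance matrix than the least squares estimator.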



lsq_mvar.m

A simple System Identification Problem

[Block diagram: system Σ with input u(k), noise e(k) and output y(k)]

With Σ given by the following (2nd-order) difference equation:

$\underbrace{y(k) + a_1 y(k-1) + a_2 y(k-2)}_{\text{AR}} = \underbrace{b_1 u(k-1) + b_2 u(k-2)}_{\text{X}} + e(k)$

This can be written in transfer function form:

$y(k) = G(q)u(k) + H(q)e(k)$, with
$G(q) = \dfrac{b_1 q^{-1} + b_2 q^{-2}}{1 + a_1 q^{-1} + a_2 q^{-2}}, \quad H(q) = \dfrac{1}{1 + a_1 q^{-1} + a_2 q^{-2}}$

Then the identification problem is: $\{u(k), y(k)\}_{k=1}^{N} \rightarrow \hat{\Sigma}$
Solving the ARX Identification Problem
We can write the difference equation as:
$y(k) = \begin{bmatrix} -y(k-1) & -y(k-2) & u(k-1) & u(k-2) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ b_1 \\ b_2 \end{bmatrix} + e(k)$

If we define this data relationship as the (k − 2)th row of the following matrix equation (for k = 3 : N):
$y = Fx + \epsilon$
then the ARX parameters $\begin{bmatrix} a_1 & a_2 & b_1 & b_2 \end{bmatrix}$ can be found by solving the least squares problem:
$\min_x \epsilon^T \epsilon$ subject to: $y = Fx + \epsilon$
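
Written out in full, stacking the rows for k = 3, ..., N gives the following matrix equation. This is simply the row definition above, stacked:

```latex
\underbrace{\begin{bmatrix} y(3) \\ y(4) \\ \vdots \\ y(N) \end{bmatrix}}_{y}
=
\underbrace{\begin{bmatrix}
-y(2)   & -y(1)   & u(2)   & u(1)   \\
-y(3)   & -y(2)   & u(3)   & u(2)   \\
\vdots  & \vdots  & \vdots & \vdots \\
-y(N-1) & -y(N-2) & u(N-1) & u(N-2)
\end{bmatrix}}_{F}
\underbrace{\begin{bmatrix} a_1 \\ a_2 \\ b_1 \\ b_2 \end{bmatrix}}_{x}
+
\underbrace{\begin{bmatrix} e(3) \\ e(4) \\ \vdots \\ e(N) \end{bmatrix}}_{\epsilon}
```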
lsq_demo.m
Consider the following AR(MA)X model:
$y(k) - 1.5y(k-1) + 0.7y(k-2) = u(k-1) - u(k-2) + e(k) - e(k-1) + 0.2e(k-2)$

with u(k) and e(k) independent zero-mean white noise sequences of unit variance and length 1000. Using $\{u(k), y(k)\}_{k=1}^{903}$, estimate the parameters of a 2nd-order ARX model for (a) the MA part set to zero and (b) the MA part as given.
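
One possible way to set up this experiment is sketched below. It is not the course's `lsq_demo.m`; the function names, the random seed and the zero initial conditions are assumptions. With the sign convention of the ARX model, the true parameters are $a_1 = -1.5$, $a_2 = 0.7$, $b_1 = 1$, $b_2 = -1$.

```python
import numpy as np

def simulate_armax(u, e, ma_coeffs):
    """Simulate y(k) = 1.5 y(k-1) - 0.7 y(k-2) + u(k-1) - u(k-2)
    + e(k) + c1 e(k-1) + c2 e(k-2) with zero initial conditions; ma_coeffs = (c1, c2)."""
    c1, c2 = ma_coeffs
    y = np.zeros(len(u))
    for k in range(2, len(u)):
        y[k] = (1.5 * y[k - 1] - 0.7 * y[k - 2] + u[k - 1] - u[k - 2]
                + e[k] + c1 * e[k - 1] + c2 * e[k - 2])
    return y

def estimate_arx2(u, y):
    """Least squares estimate of [a1, a2, b1, b2] for a 2nd-order ARX model."""
    # Each row is [-y(k-1), -y(k-2), u(k-1), u(k-2)] for 0-based k = 2..N-1.
    F = np.column_stack([-y[1:-1], -y[:-2], u[1:-1], u[:-2]])
    theta, *_ = np.linalg.lstsq(F, y[2:], rcond=None)
    return theta

rng = np.random.default_rng(5)
n_samples, n_used = 1000, 903
u = rng.standard_normal(n_samples)
e = rng.standard_normal(n_samples)

theta_true = np.array([-1.5, 0.7, 1.0, -1.0])       # [a1, a2, b1, b2]

y_a = simulate_armax(u, e, ma_coeffs=(0.0, 0.0))    # case (a): MA part zero
y_b = simulate_armax(u, e, ma_coeffs=(-1.0, 0.2))   # case (b): MA part as given

theta_a = estimate_arx2(u[:n_used], y_a[:n_used])
theta_b = estimate_arx2(u[:n_used], y_b[:n_used])
```

In case (a) the equation error is white, so `theta_a` should land near `theta_true`. In case (b) the equation error is coloured and correlated with the regressors, so the ARX estimate `theta_b` is generally biased.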

The use of the QR factorization

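The QR-Theorem: Let $A \in \mathbb{R}^{m \times n}$ ($m \ge n$), then there exists an orthogonal matrix $Q \in \mathbb{R}^{m \times m}$ that can be partitioned as:
$Q = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix}, \quad Q_1 \in \mathbb{R}^{m \times n}$

such that
$Q^T A = \begin{bmatrix} R \\ 0 \end{bmatrix}$ with $R \in \mathbb{R}^{n \times n}$ and R upper-triangular

[or the matrix A is factorized as $Q_1 R$.]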


QR Solution to LS problem
LSQR-Theorem: Consider the LS problem $\min_x \|y - Fx\|_2$ and consider the following QR factorization of F and the application of $Q^T$ to y as
$F = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R \\ 0 \end{bmatrix}, \quad \begin{bmatrix} Q_1^T \\ Q_2^T \end{bmatrix} y = \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}$

Let the matrix F have full column rank; then the LS solution $\hat{x}$ and the LS residual satisfy:

$\hat{x} = R^{-1} d_1, \quad \|y - F\hat{x}\|_2 = \|d_2\|_2$
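
A brief numerical sketch of the theorem, using NumPy's full QR factorization on random data. The dimensions and values are assumptions, and this is not the course's `SensNeq.m`.

```python
import numpy as np

rng = np.random.default_rng(6)
N, n = 15, 3
F = rng.standard_normal((N, n))                # assumed to have full column rank
y = rng.standard_normal(N)

Q, R_full = np.linalg.qr(F, mode="complete")   # Q is N x N, R_full is N x n
R = R_full[:n, :]                              # upper-triangular n x n block
d = Q.T @ y
d1, d2 = d[:n], d[n:]                          # d1 = Q1^T y, d2 = Q2^T y

x_hat = np.linalg.solve(R, d1)                 # x_hat = R^{-1} d1
residual_norm = np.linalg.norm(y - F @ x_hat)  # should match the norm of d2
```

A general solver is used for the triangular system to keep the sketch short; a dedicated back-substitution routine would exploit the structure of R.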

SensNeq.m

Summary of Lecture 2
• Refresher on the characterization of RPs in the time (and frequency) domain
• The linear least squares problem: unknown x deterministic.
• Gauss-Markov theorem (MVUE).
• Ready for the first homework: download it now!

Next Instruction session
Preparation:
Study Chapters 2 (2.6 - 2.7) and 4 (4.1 - 4.5.2)
Get Homework 1

Next lecture:
Addressing your questions on Homework 1.
Tuesday 17-11-2015
