
Model-Based Output-Difference Feedback Optimal Control

1 Introduction
This document investigates a model-based method for designing the optimal Output-Difference Feedback Controller (ODFC). We begin by assuming the presence of an observer that provides an unbiased estimate of the state:

$$\hat{x}_k = x_k + \epsilon_k, \qquad \epsilon_k \sim \mathcal{N}(0, \Sigma_\epsilon)$$

2 Theorem 3.1: Optimal Control Problem


Consider the optimal control problem defined by equations (2)-(5). The optimal state-feedback controller gain $K^*$ is given by

$$K^* = (\bar{R} + B^T P^* B)^{-1} (B^T P^* \bar{A} + \bar{N}^T),$$

where $P^* > 0$ is the solution of the Algebraic Riccati Equation (ARE)

$$\bar{A}^T P^* \bar{A} - P^* - (\bar{A}^T P^* B + \bar{N})(\bar{R} + B^T P^* B)^{-1}(B^T P^* \bar{A} + \bar{N}^T) + \bar{Q} = 0.$$

Here, $\bar{Q} = \bar{A}^T Q_x \bar{A}$, $\bar{R} = B^T Q_x B + R$, $\bar{N} = \bar{A}^T Q_x B$, and $\bar{A} = A - I$.

3 Average Cost
The average cost associated with $K^*$ is given by:

$$\lambda_{K^*} = \operatorname{Tr}(A_{\text{eff}}^T Q_x A_{\text{eff}} \Sigma_\epsilon) + \operatorname{Tr}(Q_x W_w) + 2\operatorname{Tr}(Q_y W_v) + \operatorname{Tr}(K^{*T} B^T P^* K^* W_v) + \operatorname{Tr}(P^*(W_w + \Sigma_\epsilon)) - \operatorname{Tr}((A - BK^*)^T P^* (A - BK^*) \Sigma_\epsilon)$$

3.1 Deriving Each Term


1. State cost, $\operatorname{Tr}(A_{\text{eff}}^T Q_x A_{\text{eff}} \Sigma_\epsilon)$: captures the cost associated with the state estimation error.
2. Process-noise cost, $\operatorname{Tr}(Q_x W_w)$: reflects the cost of the process noise acting on the state.
3. Output cost, $2\operatorname{Tr}(Q_y W_v)$: represents the cost linked to the output noise.
4. Feedback-gain cost, $\operatorname{Tr}(K^{*T} B^T P^* K^* W_v)$: captures the cost incurred by control action based on the feedback gain $K^*$.
5. Covariance cost, $\operatorname{Tr}(P^*(W_w + \Sigma_\epsilon))$: accounts for the combined effect of the process-noise covariance and the estimation-error covariance.
6. Adjustment for feedback, $-\operatorname{Tr}((A - BK^*)^T P^* (A - BK^*) \Sigma_\epsilon)$: adjusts for the effect of the feedback control on the state dynamics.

4 Proof Overview
The proof parallels standard results for linear stochastic systems with state-dependent quadratic costs, following procedures similar to those in [?]. The optimal feedback gain $K^*$ is obtained by minimizing the Bellman equation, which yields equations (9) and (10).

5 Theorem 3.2: Iterative Algorithm


Let $K_0$ be any stabilizing state-feedback controller gain and let $P_i > 0$ be the solution of the Lyapunov equation

$$\bar{A}_i^T P_i \bar{A}_i - P_i + \bar{Q} + K_i^T \bar{R} K_i - K_i^T \bar{N}^T - \bar{N} K_i = 0,$$

where $i = 0, 1, 2, \dots$ and $\bar{A}_i = \bar{A} - BK_i$. For $K_{i+1}$ calculated as

$$K_{i+1} = (\bar{R} + B^T P_i B)^{-1}(B^T P_i \bar{A} + \bar{N}^T),$$

the following holds:

• $\bar{A} - BK_{i+1}$ is Schur.
• $P^* \le P_{i+1} \le P_i$.
• $\lim_{i \to \infty} P_i = P^*$ and $\lim_{i \to \infty} K_i = K^*$.
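A minimal Python sketch of this iteration using SciPy's discrete Lyapunov solver follows; the matrices are illustrative placeholders, and $K_0 = 0$ is a valid starting gain here only because this example's $\bar{A}$ is already Schur.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Placeholder data (assumptions, not from the source).
A = np.array([[1.0, 0.1], [0.0, 0.9]]); B = np.array([[0.0], [0.1]])
Qx, R = np.eye(2), np.array([[1.0]])
Abar = A - np.eye(2)
Qbar, Rbar, Nbar = Abar.T @ Qx @ Abar, B.T @ Qx @ B + R, Abar.T @ Qx @ B

K = np.zeros((1, 2))  # K0; stabilizing here since Abar is already Schur
for i in range(50):
    Ai = Abar - B @ K
    Mi = Qbar + K.T @ Rbar @ K - K.T @ Nbar.T - Nbar @ K
    # solve_discrete_lyapunov(a, q) solves a X a^T - X + q = 0, so passing
    # Ai.T yields the theorem's equation Ai^T Pi Ai - Pi + Mi = 0.
    Pi = solve_discrete_lyapunov(Ai.T, Mi)
    K = np.linalg.solve(Rbar + B.T @ Pi @ B, B.T @ Pi @ Abar + Nbar.T)
```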

6 Proof Overview
The proof follows arguments similar to those in [?] (Theorem 3.1) and is there-
fore omitted here.

7 Theorem 3.3: Parameterized Observer
A parameterized observer is introduced to estimate the system state $x_k$ from the output-difference measurement. The observer can be combined with (8) to provide a solution to the optimal control problem.

The state parametrization is given by

$$\bar{x}_k = \Gamma_u \alpha_k + \Gamma_y \beta_k,$$

which converges exponentially in mean to the state $x_k$ as $k \to \infty$ for an observable system. The estimation error is

$$\tilde{x}_k \equiv x_k - \bar{x}_k \sim \mathcal{N}(0, \Sigma_\epsilon),$$

where $\Sigma_\epsilon$ is a bounded error covariance matrix.

7.1 Matrices and Updates


The matrices $\Gamma_u$ and $\Gamma_y$ contain system-dependent transfer-function coefficients. The updates for $\alpha_k$ and $\beta_k$ are defined as follows:

$$\alpha_{k+1}^i = A\alpha_k^i + Bu_k^i, \qquad \forall i = 1, 2, \dots, m,$$

$$\beta_k^i = C\sigma_k^i + D(y_k - y_{k-1}), \qquad \forall i = 1, 2, \dots, p,$$

where $u^i$ and $y^i$ are the $i$-th input and output, respectively.

7.2 Existence of the Observer


The existence of the parametrization is equivalent to that of the difference-feedback state observer

$$\bar{x}_{k+1} = (A - LCA + LC)\bar{x}_k + (B - LCB)u_k + L(y_{k+1} - y_k),$$

where $L$ is the observer gain. The mean and covariance of the estimation error can be determined from this formulation.
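A one-step implementation sketch of this observer update follows (the function name and argument layout are my own, not from the source):

```python
import numpy as np

def observer_step(xbar_k, u_k, y_next, y_k, A, B, C, L):
    """One step of: xbar_{k+1} = (A - LCA + LC) xbar_k
                               + (B - LCB) u_k + L (y_{k+1} - y_k)."""
    F = A - L @ C @ A + L @ C   # observer state-transition matrix
    G = B - L @ C @ B           # observer input matrix
    return F @ xbar_k + G @ u_k + L @ (y_next - y_k)
```

Note that the update consumes the output difference $y_{k+1} - y_k$, so it is applied once the measurement at time $k+1$ is available.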

8 Derivation of Discrete-Time ARE


The Algebraic Riccati Equation (ARE) is a fundamental equation in optimal
control theory, particularly for discrete-time linear systems. Below, we derive
the discrete-time ARE from the principles of optimal control.

8.1 Discrete-Time Linear System
Consider a discrete-time linear system described by

$$x_{k+1} = Ax_k + Bu_k,$$

where:

• $x_k$ is the state vector at time $k$,
• $u_k$ is the control input,
• $A$ is the state transition matrix,
• $B$ is the input matrix.

8.2 Cost Function


We want to minimize a quadratic cost function of the form

$$J = \sum_{k=0}^{\infty} \left( x_k^T Q x_k + u_k^T R u_k + 2x_k^T N u_k \right),$$

where:

• $Q$ is a positive semi-definite matrix,
• $R$ is a positive definite matrix,
• $N$ is a matrix that captures the coupling between the state and control inputs.
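As a small numeric illustration of this setup (a sketch with arbitrary placeholder matrices and a hand-picked stabilizing gain, none of which come from the source), the cost can be approximated by truncating the sum over a finite horizon:

```python
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 0.9]]); B = np.array([[0.0], [0.1]])
Q, R, N = np.eye(2), np.array([[1.0]]), np.zeros((2, 1))
K = np.array([[0.5, 1.0]])      # a stabilizing gain for this example
x = np.array([1.0, 0.0])

J = 0.0
for k in range(200):            # truncation of the infinite-horizon sum
    u = -K @ x
    J += x @ Q @ x + u @ R @ u + 2 * (x @ N @ u)
    x = A @ x + B @ u
```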

8.3 Bellman Equation


The optimal control problem can be formulated using the Bellman equation. The value function $V(x)$ represents the minimum cost-to-go from state $x$:

$$V(x) = \min_u \left\{ x^T Q x + u^T R u + 2x^T N u + V(Ax + Bu) \right\}.$$

Assuming a quadratic form for the value function,

$$V(x) = x^T P x,$$

where $P$ is a positive semi-definite matrix, we can write

$$V(Ax + Bu) = (Ax + Bu)^T P (Ax + Bu).$$

8.4 Substituting into the Bellman Equation
Substituting back into the Bellman equation, we have

$$V(x) = \min_u \left\{ x^T Q x + u^T R u + 2x^T N u + x^T A^T P A x + x^T A^T P B u + u^T B^T P A x + u^T B^T P B u \right\}.$$

Grouping terms, we get

$$V(x) = \min_u \left\{ x^T (Q + A^T P A) x + u^T (R + B^T P B) u + 2x^T (A^T P B + N) u \right\}.$$

8.5 Minimizing the Cost Function


To minimize this quadratic expression with respect to $u$, we take the derivative and set it to zero:

$$\frac{\partial V}{\partial u} = 2(R + B^T P B)u + 2(B^T P A + N^T)x = 0.$$

Solving for $u$ gives

$$u^* = -(R + B^T P B)^{-1}(B^T P A + N^T)x.$$
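A quick numerical check of this closed-form minimizer (with arbitrary placeholder matrices, not from the source): the gradient should vanish at $u^*$, and small perturbations should not decrease the Bellman right-hand side.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2
A = 0.5 * rng.standard_normal((n, n)); B = rng.standard_normal((n, m))
Q, R, N = np.eye(n), np.eye(m), 0.1 * rng.standard_normal((n, m))
P = np.eye(n)                   # any symmetric P >= 0 works for this check
x = rng.standard_normal(n)

def bellman_rhs(u):
    xn = A @ x + B @ u
    return x @ Q @ x + u @ R @ u + 2 * (x @ N @ u) + xn @ P @ xn

u_star = -np.linalg.solve(R + B.T @ P @ B, (B.T @ P @ A + N.T) @ x)

grad = 2 * (R + B.T @ P @ B) @ u_star + 2 * (B.T @ P @ A + N.T) @ x
assert np.allclose(grad, 0)     # first-order condition holds at u*

for _ in range(100):            # u* beats nearby perturbations
    assert bellman_rhs(u_star) <= bellman_rhs(u_star + 1e-3 * rng.standard_normal(m))
```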


8.6 Substituting Back into the Cost Function


Substituting $u^*$ back into the right-hand side of the Bellman equation gives

$$V(x) = x^T (Q + A^T P A) x - x^T (A^T P B + N)(R + B^T P B)^{-1}(B^T P A + N^T) x,$$

which can be written as

$$V(x) = x^T \left( Q + A^T P A - (A^T P B + N)(R + B^T P B)^{-1}(B^T P A + N^T) \right) x.$$

For the quadratic ansatz $V(x) = x^T P x$ to hold for every $x$, the matrix in parentheses must equal $P$ itself, i.e.,

$$A^T P A - P - (A^T P B + N)(R + B^T P B)^{-1}(B^T P A + N^T) + Q = 0.$$

8.7 Algebraic Riccati Equation


Rearranging gives the discrete-time Algebraic Riccati Equation (ARE):

$$A^T P A - P - (A^T P B + N)(R + B^T P B)^{-1}(B^T P A + N^T) + Q = 0.$$

When $N = 0$, this reduces to the more familiar form $A^T P A - P - A^T P B (R + B^T P B)^{-1} B^T P A + Q = 0$.

9 Conclusion
The discrete-time ARE is a key result in optimal control, allowing us to compute the optimal feedback gain matrix $K^*$ via

$$K^* = (R + B^T P B)^{-1}(B^T P A + N^T).$$

The solution $P$ can be found using various numerical methods, such as iterative algorithms or matrix factorizations.
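One simple option is fixed-point (value) iteration on the Riccati map, sketched below with placeholder matrices; for production use, a dedicated solver such as SciPy's `solve_discrete_are` is preferable.

```python
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 0.9]]); B = np.array([[0.0], [0.1]])
Q, R, N = np.eye(2), np.array([[1.0]]), np.zeros((2, 1))

P = np.zeros_like(Q)
for _ in range(1000):
    S = A.T @ P @ B + N
    # Riccati map: P <- A^T P A - S (R + B^T P B)^{-1} S^T + Q
    P_next = A.T @ P @ A - S @ np.linalg.solve(R + B.T @ P @ B, S.T) + Q
    if np.max(np.abs(P_next - P)) < 1e-12:
        P = P_next
        break
    P = P_next

K_star = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A + N.T)
```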
