Navigation Filter Best Practices - NASA Report 2018
NASA/TP–2018–219822
J. Russell Carpenter
Goddard Space Flight Center, Greenbelt, Maryland
Christopher N. D’Souza
Johnson Space Center, Houston, Texas
April 2018
The use of trademarks or names of manufacturers in this report is for accurate reporting and
does not constitute an official endorsement, either expressed or implied, of such products or
manufacturers by the National Aeronautics and Space Administration.
This material is declared a work of the US Government and is not subject to copyright
protection in the United States, but may be subject to US Government copyright in other
countries.
If there be two subsequent events, the probability of the second b/N and the probability
of both together P/N , and it being first discovered that the second event has also happened,
from hence I guess that the first event has also happened, the probability I am right is P/b.
Thomas Bayes, c. 1760
The explicit calculation of the optimal estimate as a function of the observed variables
is, in general, impossible.
Rudolph Kalman, 1960.
The use of Kalman Filtering techniques in the on-board navigation systems for the
Apollo Command Module and the Apollo Lunar Excursion Module was an important factor
in the overwhelming success of the Lunar Landing Program.
Peter Kachmar, 2002.
Dedicated to the memory of Gene Muller, Emil Schiesser, and Bill Lear.
Contents

Foreword
Editor’s Preface
Notational Conventions
Foreword

It should come as no surprise to the reader that navigation systems are at the heart of almost all of NASA's missions, whether on our launch vehicles, on robotic science spacecraft, or on our crewed exploration vehicles. Navigation is fundamental to operating our space systems across the wide spectrum of mission regimes. Safe and reliable navigation systems are essential for routine low Earth orbiting science missions, for rendezvous and proximity operations or precision formation flying missions (where relative navigation is a necessity), for navigation through the outer planets of the solar system, for accomplishing pinpoint landing on planets and small bodies, and for many other mission types.
I believe the reader will find that the navigation filter best practices the team has
collected, documented, and shared in this first edition book will be of practical value in
your work designing, developing, and operating modern navigation systems for NASA’s
challenging future missions. I want to thank the entire team that has diligently worked
to create this NASA Engineering and Safety Center (NESC) GN&C knowledge capture
report. I especially want to acknowledge the dedication, care, and attention to detail as
well as the energy that both Russell Carpenter and Chris D’Souza, the report editors,
have invested in producing this significant product for the GN&C community of practice.
It was Russell and Chris who had the inspiration to create this report and they have
done a masterful job in not only directly technically contributing to the report but also
coordinating its overall development. It should be mentioned that some high-level limited
work was previously performed under NESC sponsorship to capture the lessons learned
over the course of the several decades NASA has been navigating space vehicles. This
report however fills a unique gap by providing extensive technical details and, perhaps
more importantly, providing the underlying rationale for each of the navigation filter best
practices presented here. Capturing these rationales has proven to be a greatly needed but
very challenging task. I congratulate the team for taking this challenge on.
The creation, and the wide dissemination of this report, is absolutely consistent with the
NESC’s commitment to engineering excellence by capturing and passing along, to NASA’s
next generation of engineers, the lessons learned emerging from the collective professional
experiences of NASA’s navigation system subject matter experts. I believe this book will
not only provide relevant tutorial-type guidance for early career GN&C engineers that have
limited real-world on the job experience but it should also serve as a very useful memory
aid for more experienced GN&C engineers, especially as a handy reference to employ for
technical peer reviews of navigation systems under development.
As the NASA Technical Fellow for GN&C, I urge the reader (especially the "navigators" among you, obviously) to invest the time to digest and consider how the best practices
provided in this report should influence your own work developing navigation systems for
the Agency’s future missions. The editors and I recognize this will be a living document
and we sincerely welcome your feedback on this first edition of the report, especially your
constructive recommendations on ways to improve and/or augment this set of best practices.
Cornelius J. Dennehy
NASA Technical Fellow for GN&C
January 2018
Editor’s Preface
As the era of commercial spaceflight begins, NASA must ensure that lessons the US
has learned over the first 50 years of the Space Age will continue to positively influence the
continuing exploration and development of space. Of the many successful strands of this
legacy, onboard navigation stands out as an early triumph of technology whose continuous
development and improvement remains as important to future exploration and commercial
development as it was in the era of Gemini and Apollo. The key that opened the door
to practical and reliable onboard navigation was the discovery and development of the extended Kalman filter (EKF) in the 1960s, a story that has been well-chronicled by Stanley Schmidt [65], and Kalman filtering has far outgrown NASA's applications over the intervening decades. Less well-documented are the accumulated art and lore, tips and
tricks, and other institutional knowledge that NASA navigators have employed to design
and operate EKFs in support of dozens of missions in the Gemini/Apollo era, well over one
hundred Space Shuttle missions, and numerous robotic missions, without a failure ever attributed to an EKF. To document the best of these practices is the purpose that motivates
the contributors of the present document.
Kernels of such best practices have appeared scattered throughout the open technical literature, but such contributions are limited by organizational publication policies, and in some cases by technology export considerations. Even within NASA, there has heretofore not been any attempt to codify this knowledge into a readily available design handbook that could continue to evolve along with the navigation community of practice. As a result, even within the Agency, it is possible for isolated practitioners "not to know any better": to fail to appreciate the subtleties of successful and robust navigation filter design, and to lack an understanding of the motivations for, and the implied cost/benefit trades of, many of the tried and true approaches to filter design.
Some limited progress toward filling this void has been made at a summary level in
reports and briefings prepared for the NASA Engineering and Safety Center (NESC) [13].
In particular, one of a series of recommendations in Reference 13 “...directed towards the
development of future non-human rated [rendezvous] missions...” included as its fourteenth
recommendation the admonishment to “[u]tilize best practices for rendezvous navigation
filter design.” This recommendation listed eight such practices, as follows:
• Non-simultaneous measurements
• Backward smoothing (for BETs)
• Error Budgets
• Sensitivity Analysis
• IMU/Accelerometer processing
• Observability
While these summary-level lists give the community a place to start, they are lacking in
some respects. They lack sufficient rationale that would motivate a designer to adopt them.
Even if so motivated, a designer needs much more detailed information concerning how to
implement the recommendations.
The present work is an attempt to address these shortcomings. Each contributor has
selected one aspect of navigation filter design, or several closely related ones, as the basis
of a chapter. Each chapter clearly identifies best practices, where a consensus of the community of practice exists. While it is sometimes difficult to cast aside one's opinions and express such a consensus, each contributor has made a best effort in this regard. Where a
diversity of opinion exists, the chapter will summarize the arguments for and against each
approach. Also, if promising new developments are currently afoot, the chapter will assess
their prospects.
While the contributors strive for consistency of convention and notation, each has his
own preferences, and readers may need to accommodate subtle differences along these lines
as they traverse the book. The first chapter, which summarizes the EKF, sets the stage,
and should be briefly perused by even seasoned navigators in order to become familiar with
the conventions adopted for this work. Subsequent chapters should stand on their own, and
may be consulted in any order.
While this is a NASA document concerned with space navigation, it is likely that many
of the principles would apply equally to the wider navigation community. That said, readers
should keep in mind that hard-earned best practices of a particular discipline do not always
carry over to others, even though they may be seemingly similar. To assume so is a classic
example of the logical fallacy argumentum ad verecundiam, or the argument from [false]
authority.
Finally, the contributors intend for this work to be a living document, which will continue
to evolve with the state of the practice.
Notational Conventions

A              A set
a, α, A        Scalars
q, M           An array of scalars, e.g. column, row, matrix
x              A point in an abstract vector space
~r             A physical vector, i.e. an arrow in 3-D space
F              A coordinate frame
M^T            The transpose of the array M
‖x‖            The (2-)norm of the vector x
y              A random variable
z              A random vector
p_x(x)         The probability density function of the random variable x evaluated at the realization x
Pr(y < Y)      The probability that y < Y
E[z]           The expectation of the random vector z
exp(t)         The exponential function of t
e^t            exp(t) written as Euler's number raised to the t power
dx             Leibniz' (total) differential of x
dy/dx          (First) (total) derivative of y with respect to x
d^n y/dx^n     nth (total) derivative of y with respect to x
F d^n~r/dt^n   nth (total) derivative of ~r with respect to t in frame F
∂M/∂x          (First) partial derivative of M with respect to x
∂M/∂x |_{x_o}  (First) partial of M with respect to x, evaluated at x_o
CHAPTER 1

The Extended Kalman Filter
As described in the preface, use of the Extended Kalman Filter (EKF) for navigation has
a long history of flight-proven success. The EKF thus forms the foundational best practice
advocated by this work, and it forms the basis for many of the best practices later chapters
describe. The purpose of the present chapter is not to derive the EKF and its relations, but
rather to present them in a basic form, as a jumping-off point for the rest of the material
we shall present. As we shall show, while the EKF is a powerful and robust algorithm, it
is based on a few ad hoc assumptions, which can lead to misuses and misunderstandings.
Many of the best practices we shall describe are tricks of the trade that address such issues.
1.1. The Additive Extended Kalman Filter
The additive EKF is distinguished from the multiplicative EKF (MEKF) by the form
of its measurement update. The additive EKF is the usual and original form of the EKF,
and when we refer to the EKF without a modifier, one may assume we mean the additive
form.
1.1.1. The Dynamics Model Suppose we have a list of n real quantities that we need to know in order to perform navigation, and we have a differential equation that tells us how these quantities evolve through time, such as

Ẋ(t) = f(X(t), t)    (1.1)

where X ∈ R^n, which we call the state vector, contains the quantities of interest; we shall call f(X, t) the dynamics function. If we knew these quantities perfectly at any time, (1.1) would allow us to know them at any other time. For a variety of reasons, however, this is not the case; both the initial conditions and the dynamics function are corrupted by uncertainty.
Suppose instead that the quantities of interest are realizations of a random process, X(t), whose distribution at some initial time t_o is known to us, and whose evolution (forward) in time follows the stochastic differential equation given by

dX(t) = f(X(t), t) dt + B(t) dw(t)    (1.2)

where the presence of the process noise dw(t) reflects uncertainty in the dynamics. To interpret (1.2), imagine dw(t) as the limit of a discrete sequence of random increments, as the time between increments goes to zero. The result will be a continuous but non-differentiable process; hence the notation Ẋ(t) has ambiguous meaning. Henceforth, we shall define our notation such that when we write

Ẋ(t) = f(X(t), t) + B(t) w(t)    (1.3)
any particular time in the future depends only on its present value, and its accumulated
diffusion due to the process noise over the interval between now and the future time of
interest. Random processes such as this are said to possess the Markov Property. Using
this property, we can write (1.11) in terms of the measurement history as follows:

p_{X_i|Y_i}(X_i | Y_i) = p_{Y_i|X_i}(Y_i | X_i) p_{X_i|Y_{i−1}}(X_i | Y_{i−1}) / p_{Y_i|Y_{i−1}}(Y_i | Y_{i−1})    (1.12)
Even if we could compute all of the PDFs in (1.12), we are not guaranteed that the
sequence of measurements provide sufficient information to reduce the initial uncertainty
of all the modes of (1.1). If the system given by (1.1) and (1.10) is such that use of (1.12)
results in uncertainty in all the modes going asymptotically to zero in finite time, from any
initial condition, then we say the system is globally asymptotically observable. If at least all
of the unstable modes are observable, then we say the system is detectable. Unfortunately,
for nonlinear systems, there is no known way to compute global observability. At best, under certain restrictions on (1.1) and (1.10), we can in principle establish local observability, in the neighborhood of a particular initial condition. However, this is a laborious calculation,
often numerically unstable to evaluate. Also, note that observability is a property of the
structure of (1.1) and (1.10), and hence is dependent on how one chooses to represent the
navigation problem. Hence, a system that is observable with one representation may be
unobservable with a different representation.
Kalman’s original filter, which we now usually call the linear Kalman filter (LKF),
is the result when the dynamics and measurement models are linear, Markov, Gaussian,
and observable. An appreciation of the linear Kalman filter is essential to understanding
the strengths and weaknesses of the EKF, although it is almost never the case that such
assumptions are valid for real-world navigation problems.
1.1.3. The Linear Kalman Filter Suppose the dynamics and measurements are given by the following discrete-time linear models:

x_i = Φ_{i,i−1} x_{i−1} + Γ_i u_i    (1.13)
y_i = H_i x_i + v_i    (1.14)

with

E[x_o] = x̄_o  and  E[(x_o − x̄_o)(x_o − x̄_o)^T] = P_o    (1.15)
E[u_i] = 0  and  E[Γ_i u_i u_j^T Γ_i^T] = S_i δ_{ij}    (1.16)

and the moments of v_i as given by (1.8). This system will be globally observable if the observability Gramian is strictly positive definite,

W_k = Σ_{i=1}^{k} Φ_{i,1}^T H_i^T H_i Φ_{i,1} > 0    (1.17)

i.e. it has full rank.
With such assumptions, Kalman showed [36] that Algorithm 1.1 provides an optimal (both minimum variance and maximum likelihood) estimate of the moments of the PDFs appearing in (1.11). Note that in Algorithm 1.1, the covariance recursion given by (1.19) and (1.22) does not depend on the measurement history, and hence one may compute the gain sequence, K_i, off-line and store it as a time-indexed table or schedule, along with Φ_{i,i−1} and H_i. Also note that because the system is globally observable, there is no chance that it
will fail to converge from any initial condition, except perhaps due to buildup of numerical truncation and/or roundoff error.
If we further suppose the dynamics and measurements are given by linear time-invariant (LTI) models,

x_i = Φ x_{i−1} + Γ u_i    (1.23)
y_i = H x_i + v_i    (1.24)

then we may test its global observability using a somewhat simpler calculation than (1.17), as follows:

W = [ H ; HΦ ; HΦ² ; … ; HΦ^{n−1} ]  must have full rank    (1.25)
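The rank test in (1.25) is straightforward to evaluate numerically. A minimal sketch follows; the two-state constant-velocity model and the position-only and velocity-only measurement matrices are illustrative assumptions, not examples from this report:

```python
import numpy as np

# Illustrative LTI model: constant-velocity motion sampled at dt.
dt = 1.0
Phi = np.array([[1.0, dt],
                [0.0, 1.0]])        # state transition matrix
H_pos = np.array([[1.0, 0.0]])      # measure position only
H_vel = np.array([[0.0, 1.0]])      # measure velocity only

def is_observable(Phi, H):
    """Stack H, H*Phi, ..., H*Phi^(n-1) as in (1.25) and test for full rank."""
    n = Phi.shape[0]
    W = np.vstack([H @ np.linalg.matrix_power(Phi, k) for k in range(n)])
    return np.linalg.matrix_rank(W) == n

print(is_observable(Phi, H_pos))  # True: velocity is inferred from successive positions
print(is_observable(Phi, H_vel))  # False: absolute position never enters the measurements
```

The velocity-only case illustrates the closing remark of Section 1.1.2: observability depends on the structure of the dynamics and measurement models together, not on either one alone.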
If the system is detectable, then it turns out that the covariance recursion given by (1.19) and (1.22) reaches a steady state, which we denote P_∞. The corresponding gain is K_∞ = P_∞ H^T R^{−1}. There exist numerous software packages that will compute such quantities, e.g. the Matlab Control Systems Toolbox, which may unfortunately lead to their misuse in inappropriate contexts. Perhaps worse, experts from other domains, who are familiar with techniques such as pole placement for control of LTI systems, may recognize that the steady-state linear Kalman filter is "just a pole placement algorithm," and may infer that the EKF is not much more than a clever pole placement algorithm as well. As we shall show below, this is far from being the case; the EKF operates directly on the nonlinear system of interest, for which such LTI concepts have dubious applicability.
1.1.4. The Linearized Kalman Filter An immediately apparent generalization of
the linear Kalman Filter is to use it to solve for small corrections to a nonlinearly propagated
reference trajectory. While such an approach may have certain applications over limited
time horizons, and/or for ground-based applications where an operator can periodically
intervene, experience with onboard navigation systems has shown that such corrections can
fail to remain small enough to justify the required approximations.
1.1.5. The Extended Kalman Filter There are a number of ways to proceed from
Algorithm 1.1 to “derive” the EKF, but all contain a variety of ad hoc assumptions that
are not guaranteed to hold in all circumstances. Most weaknesses and criticisms of the
EKF arise from such assumptions. Rather than reproduce one or more of such derivations,
we will simply point out that if one replaces (1.18) with an integral of (1.1) over the time
between measurements, and computes the coefficient matrices appearing in (1.13) and (1.14)
as Jacobians evaluated at the current solution of (1.1), then the result is Algorithm 1.2,
which bears more than a passing resemblance to the Kalman filter.
Algorithm 1.2:

A(t) = ∂f/∂X |_{X̂(t)},    H_i = ∂h_i/∂X |_{X̂_i^−}    (1.27)

Φ(t_i, t_{i−1}) = I + ∫_{t_{i−1}}^{t_i} A(τ) Φ(τ, t_{i−1}) dτ,    Φ(t_{i−1}, t_{i−1}) = I    (1.28)

S_i = ∫_{t_{i−1}}^{t_i} ∫_{t_{i−1}}^{t_i} Φ(t_i, τ) B(τ) E[w(τ) w^T(τ′)] B^T(τ′) Φ^T(t_i, τ′) dτ dτ′    (1.29)

P_i^− = Φ(t_i, t_{i−1}) P_{i−1}^+ Φ^T(t_i, t_{i−1}) + S_i,    P_0^+ = P_o    (1.30)

K_i = P_i^− H_i^T ( H_i P_i^− H_i^T + R_i )^{−1}    (1.31)

X̂_i^+ = X̂_i^− + K_i ( Y_i − h_i(X̂_i^−) )    (1.32)

P_i^+ = P_i^− − K_i H_i P_i^−    (1.33)
This implies that an initially Gaussian distribution for the state cannot in general remain Gaussian. At best, all we can hope is that X̂_i^− ≈ E[X_i | Y_{i−1}].
• Let us define the estimation error as e(t) = X(t) − X̂(t). Then since X̂(t) ≠ E[X(t) | Y_{i−1}],

P(t) ≠ E[e(t) e^T(t) | Y_{i−1}]    (1.35)

At best, all we can hope is that P(t) ≈ E[e(t) e^T(t) | Y_{i−1}].
• Let us define the innovation as r_i = Y_i − h(X̂_i^−). Then since h(X̂_i^−) ≠ E[Y_i],

H_i P_i^− H_i^T + R_i ≠ E[r_i r_i^T]    (1.36)

At best, all we can hope is that the above will hold approximately.
• Taken together, the approximations listed above imply that (1.32) and (1.33) can at best satisfy (1.12) only approximately, not only because the mean and covariance are approximations, but also because the PDFs fail to remain Gaussian, and hence fail to be characterized completely by only their first two moments.
• Even if all of the above are reasonable approximations, there is a problem with (1.33). The posterior covariance should be approximated by

P_i^+ ≈ E[e_i^+ (e_i^+)^T | Y_i]    (1.37)
• In general, one would need to simultaneously integrate (1.26) and (1.28) due to their interdependence via the Jacobian A. If the time between measurements is small enough, then if one were to employ a suitable approximation for (1.28), perhaps as simple as

Φ(t_i, t_{i−1}) ≈ I + A(t_i)(t_i − t_{i−1})    (1.44)

then one may reasonably expect that a carefully chosen approximation would be no worse than the many other approximations inherent in the EKF. One may also consider the same or simpler approximations when considering approximations to (1.42).
• Because there is no way to prove global observability for a nonlinear system, the EKF may fail to converge from some initial conditions, even if the system is locally observable in particular neighborhoods.
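The quality of the first-order approximation (1.44) is easy to examine for dynamics whose state transition matrix is known in closed form. The harmonic-oscillator model and the 90-minute-period rate below are illustrative assumptions for such a check:

```python
import numpy as np

# Harmonic oscillator xdot = A x with A = [[0, 1], [-w^2, 0]]; its STM is
# known in closed form, so the error of Phi ≈ I + A*dt from (1.44) can be
# measured directly.
w = 2.0 * np.pi / 5400.0    # ~90-minute period, an illustrative orbit-like rate

A = np.array([[0.0,   1.0],
              [-w**2, 0.0]])

def phi_exact(dt):
    # Closed-form STM of the undamped oscillator
    return np.array([[np.cos(w*dt),     np.sin(w*dt)/w],
                     [-w*np.sin(w*dt),  np.cos(w*dt)]])

def phi_approx(dt):
    # First-order approximation (1.44)
    return np.eye(2) + A*dt

for dt in (1.0, 10.0, 100.0):
    err = np.abs(phi_exact(dt) - phi_approx(dt)).max()
    print(dt, err)   # error grows rapidly with the propagation interval
```

For intervals short relative to the dynamical period the first-order STM is essentially exact, consistent with the text's expectation that such an approximation can be no worse than the other approximations inherent in the EKF.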
In light of the above observations, we conclude this section by presenting a slightly
improved version of the EKF as Algorithm 1.3. In subsequent chapters, we shall describe
additional improvements to the EKF.
Algorithm 1.3:

A(t) = ∂f/∂X |_{X̂(t)},    H_i = ∂h_i/∂X |_{X̂_i^−}    (1.46)

Φ(t_i, t_{i−1}) = a suitable approximation to (1.28)    (1.47)

S_i = a suitable approximation to (1.42)    (1.48)

P_i^− = Φ(t_i, t_{i−1}) P_{i−1}^+ Φ^T(t_i, t_{i−1}) + S_i,    P_0^+ = P_o    (1.49)

K_i = P_i^− H_i^T ( H_i P_i^− H_i^T + R_i )^{−1}    (1.50)

X̂_i^+ = X̂_i^− + K_i ( Y_i − h_i(X̂_i^−) )    (1.51)

P_i^+ = (I − K_i H_i) P_i^− (I − K_i H_i)^T + K_i R_i K_i^T    (1.52)
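The measurement update (1.50)-(1.52) can be sketched in a few lines; the scalar range measurement of a 2-D position state below is an illustrative assumption, not an example from this report:

```python
import numpy as np

def ekf_update(x, P, y, h, H_of, R):
    """One additive-EKF measurement update per (1.50)-(1.52).  The Joseph
    form of (1.52) keeps the posterior covariance symmetric and positive
    semidefinite even when the gain is only approximately optimal."""
    H = H_of(x)
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)       # (1.50)
    x_post = x + K @ (y - h(x))                        # (1.51)
    I_KH = np.eye(len(x)) - K @ H
    P_post = I_KH @ P @ I_KH.T + K @ R @ K.T           # (1.52), Joseph form
    return x_post, P_post

# Illustrative nonlinear measurement: range to a 2-D position.
h = lambda x: np.array([np.hypot(x[0], x[1])])
H_of = lambda x: np.array([[x[0], x[1]]]) / np.hypot(x[0], x[1])
x0 = np.array([3.0, 4.0])          # prior estimate (predicted range 5.0)
P0 = np.eye(2)
R = np.array([[0.01]])
x1, P1 = ekf_update(x0, P0, np.array([5.2]), h, H_of, R)
print(np.trace(P1) < np.trace(P0))   # the update reduces total variance
```

Note that only the component of uncertainty along the measurement direction is reduced; the cross-range component is unobservable from a single range measurement, echoing the observability caveats above.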
CHAPTER 2

The Covariance Matrix

As Chapter 1 pointed out, the EKF estimate X̂(t) is at best an approximation for E[X(t) | Y], and hence the EKF symbol P(t) is at best an approximation for E[e(t) e^T(t) | Y]. In the present chapter we discuss best practices for maintaining such approximations, and henceforth we will simply refer to P(t) as the covariance matrix.
2.1. Metrics for Orbit State Covariances
To discuss what makes one covariance a better or worse approximation than another, we must be able to compare matrices to one another, and hence we must adopt metrics. For matrices that serve as coefficients, there exist various matrix norms that serve. For covariance matrices, we are usually more interested in measures of the range of possible realizations that could be drawn from the probability distribution characterized by the covariance. The square root of the trace of the covariance is often a reasonable choice, since it is the root sum squared (RSS) formal estimation error. A drawback to the trace for orbital applications is that coordinates and their derivatives typically differ by several orders of magnitude, so that for example the RSS position error will dominate the RSS velocity error if the trace is taken over a 6 × 6 Cartesian position and velocity state error covariance, unless some scaling is introduced. Although arbitrary scalings are possible, we discuss several metrics herein that have been found to be especially suitable to space applications.
Orbit determination is distinguishable from other types of positioning and navigation
not only by the use of dynamics suitable to orbiting bodies, but also by a fundamental need
to produce states that predict accurately. This need arises because spacecraft operations
require accurate predictions for acquisition by communications assets, for planning future
activities such as maneuvers and observations, for predicting conjunctions with other space
objects, etc. For closed, i.e. elliptical, orbits about most planetary bodies, the two-body
potential dominates all other forces by several orders of magnitude. Thus, in most cases, the
ability of an orbit estimate to predict accurately is dominated by semi-major axis (SMA) error, Δa. This is because SMA error translates into period error through Kepler's third law, and an error in orbit period translates into a secularly increasing error in position along the orbit track. As Reference [6] shows, the along-track drift per orbit revolution, Δs, for an elliptical orbit with eccentricity e is bounded by

Δs = 3π Δa √[(1 + e)/(1 − e)]  from periapse to periapse    (2.1)

Δs = 3π Δa √[(1 − e)/(1 + e)]  from apoapse to apoapse    (2.2)
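For a sense of scale, (2.1) and (2.2) are simple to evaluate; the 10 m SMA error used below is an illustrative value:

```python
import math

def alongtrack_drift(da, e, periapse_to_periapse=True):
    """Bound on along-track drift per revolution due to SMA error da,
    per (2.1) and (2.2).  The result carries the units of da."""
    ratio = (1 + e) / (1 - e) if periapse_to_periapse else (1 - e) / (1 + e)
    return 3.0 * math.pi * da * math.sqrt(ratio)

# A 10 m SMA error in a circular orbit (e = 0) drifts about 94 m per revolution.
print(round(alongtrack_drift(10.0, 0.0), 1))  # 94.2
```

The factor-of-ten amplification from SMA error to per-revolution drift is why the SMA metric of Section 2.1.2 is so useful for judging predictive accuracy.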
This phenomenon is especially significant for rendezvous and formation flying applications,
where relative positions must be precisely controlled.
For a central body whose gravitational constant is μ, the SMA of a closed Keplerian orbit, a, may be found from the vis viva equation,

−μ/(2a) = −μ/r + v²/2    (2.3)
from which one can see that achieving SMA accuracy requires good knowledge of both
radius, r, and speed, v. What is less obvious from (2.3) is that radius and speed errors
must also be both well-balanced and well-correlated to maximize SMA accuracy [6, 9, 27],
as Figure 1 illustrates. In this figure, radius error, σ_r, has been normalized by the squared ratio of radius to SMA, and speed error, σ_v, by n̂v̂_c/v̂.

[Figure 1: Contours of SMA error as a function of normalized radius error σ_r/(r̂²/â²) and normalized speed error σ_v/(n̂v̂_c/v̂), for correlation coefficients ρ_rv = 0, −0.5, −0.9, and −0.99.]
figures of merit for evaluating orbit determination performance, particularly for relative
navigation applications.
In fact, it is easy to show that any function of two (scalar) random variables possesses a similar correlation and balance structure, at least to first order. For example, navigation requirements for atmospheric entry are often stated in terms of flight-path angle error, Δγ. Since sin γ = ~r · ~v /(‖~r‖ ‖~v‖), from geometrical considerations we should expect that Δγ depends on the component of position error which is in the local horizontal plane, in the direction of the velocity vector, and on the component of velocity error that is normal to both velocity and angular momentum, i.e. binormal to the velocity vector. These are of course the in-plane components of position and velocity that are normal to radius and speed. Thus by using the pair Δa and Δγ as metrics, we can fully characterize the correlation and balance of the in-plane covariance components. The following subsections derive these relationships.
2.1.1. Variance of an Arbitrary Function of Two Random Variables Suppose there exists a random variable z which is a possibly nonlinear function of two other random variables, x and y, such that

z = f(x, y)    (2.4)

and let the joint covariance of x and y be given by

P = [ σ_x²           ρ_xy σ_x σ_y ]
    [ ρ_xy σ_x σ_y   σ_y²         ]    (2.5)

The variance of z is given by

σ_z² = E[(z − E[z])²]    (2.6)

where

E[z] = ∫_{−∞}^{∞} ζ p_z(ζ) dζ = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(ξ, η) p_{xy}(ξ, η) dξ dη    (2.7)
Let x̂ = E[x] and ŷ = E[y]. Then E[x − x̂] = 0 and E[y − ŷ] = 0. Expanding f(x, y) around f(x̂, ŷ) in a Taylor series to first order, we find that

f(x, y) ≈ f(x̂, ŷ) + f_x · (x − x̂) + f_y · (y − ŷ)    (2.8)

where f_x and f_y are the partials of f with respect to x and y, respectively; so, to first order,

ẑ = E[z] = E[f(x, y)] ≈ f(x̂, ŷ)    (2.9)

Now let us similarly expand (z − E[z])²:

(z − E[z])² = (f(x, y) − f(x̂, ŷ))²    (2.10)
            ≈ f_x² · (x − x̂)² + 2 f_x f_y · (x − x̂)(y − ŷ) + f_y² · (y − ŷ)²    (2.11)

Taking expectations on both sides yields

E[(z − E[z])²] = f_x² E[(x − x̂)²] + 2 f_x f_y E[(x − x̂)(y − ŷ)] + f_y² E[(y − ŷ)²]    (2.12)

σ_z² = f_x² σ_x² + 2 f_x f_y ρ_xy σ_x σ_y + f_y² σ_y²    (2.13)
     = F P F^T    (2.14)
where F = [f_x, f_y]. Since −1 < ρ_xy < 1, it is clear that a high negative correlation between x and y will minimize σ_z for given values of σ_x and σ_y, but if either f_x σ_x ≫ f_y σ_y or f_x σ_x ≪ f_y σ_y, the impact of the negative correlation will be insignificant. Thus, the only way to simultaneously achieve σ_z ≪ f_x σ_x and σ_z ≪ f_y σ_y is when ρ_xy ≈ −1 and f_x σ_x ≈ f_y σ_y, which are the correlation and balance conditions mentioned above, and which occur along the diagonal of Figure 1.

Note also that by defining new variables scaled by their respective partial derivatives, x̃ = x f_x and ỹ = y f_y, and correspondingly σ̃_x = f_x σ_x and σ̃_y = f_y σ_y, then a normalization of the fashion described above is also possible:

σ_z = √( σ̃_x² + 2 ρ_xy σ̃_x σ̃_y + σ̃_y² )    (2.15)
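The first-order formula (2.13) can be checked against a Monte Carlo draw; the function f(x, y) = xy and the moments below are illustrative choices:

```python
import numpy as np

# Check (2.13) against Monte Carlo for an example function z = f(x, y) = x*y.
xh, yh = 10.0, 2.0                  # means
sx, sy, rho = 0.1, 0.05, -0.9       # standard deviations and correlation

fx, fy = yh, xh                     # partials of f evaluated at the mean
var_lin = fx**2*sx**2 + 2*fx*fy*rho*sx*sy + fy**2*sy**2    # (2.13)

P = np.array([[sx**2,      rho*sx*sy],
              [rho*sx*sy,  sy**2    ]])                    # (2.5)
rng = np.random.default_rng(0)
xy = rng.multivariate_normal([xh, yh], P, size=200_000)
var_mc = np.var(xy[:, 0] * xy[:, 1])

print(round(var_lin, 6))     # first-order variance
print(var_mc)                # Monte Carlo variance; close for these small sigmas
```

Note how the negative correlation pulls the variance well below the sum of the two individual contributions, illustrating the correlation-and-balance effect discussed above.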
2.1.2. Semi-Major Axis Variance To derive a relationship for semi-major axis variance, let us take variations on (2.3), which results in

Δa/a² = 2 ( Δr/r² + v Δv/μ )    (2.16)

If we replace the variations with deviations of random variables from their expectations, and the non-deviated terms with their expected values, we find that

(a − â) = 2â² ( (r − r̂)/r̂² + v̂(v − v̂)/μ )    (2.17)

which by squaring and taking expectation yields the following equation for the SMA variance:

σ_a² = 4â⁴ { σ_r²/r̂⁴ + 2 (v̂/(μr̂²)) ρ_rv σ_r σ_v + (v̂²/μ²) σ_v² }    (2.18)

For the normalization used in Figure 1, rewrite (2.18) as

σ_a = 2 √[ (σ_r/(r̂²/â²))² + 2 ρ_rv (σ_r/(r̂²/â²)) (σ_v/(μ/(â²v̂))) + (σ_v/(μ/(â²v̂)))² ]    (2.19)
and note that μ/(â²v̂) = n̂v̂_c/v̂. As mentioned above, normalizing radius and speed standard deviation in this manner permits comparison of data across all points in all closed orbits.

If the orbit is exactly circular, then further simplification of (2.18) is possible. In this case, a = r and â²v̂/μ = T_p/(2π), where T_p is the orbit period. Then (2.18) may be rewritten as

σ_a = 2 √[ σ_r² + 2 (T_p/2π) ρ_rv σ_r σ_v + (T_p/2π)² σ_v² ]    (2.20)
For orbit determination applications, the state representation most often chosen is a Cartesian inertial state vector, x = [~r^T, ~v^T]^T. Rewriting (2.3) as

a(x) = ( 2/‖~r‖ − ‖~v‖²/μ )^{−1}    (2.21)

and taking partials yields

∂a/∂x |_{x̂} = F_a(x̂) = 2â² [ ~r̂^T/r̂³   ~v̂^T/μ ]    (2.22)

so that

σ_a² = F_a(x̂) P_x F_a^T(x̂)    (2.23)

where P_x is the state error covariance.
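Equations (2.21)-(2.23) map directly to code. A sketch follows, assuming an Earth gravitational constant and an illustrative circular 7000 km orbit with 10 m position and 1 cm/s velocity errors:

```python
import numpy as np

MU = 398600.4418   # km^3/s^2, Earth gravitational constant (assumed value)

def sigma_sma(x, P):
    """SMA standard deviation from a Cartesian inertial state x = [r; v]
    (km, km/s) and its 6x6 covariance P, per (2.21)-(2.23)."""
    r, v = x[:3], x[3:]
    rn = np.linalg.norm(r)
    a = 1.0 / (2.0/rn - v @ v / MU)                   # (2.21)
    Fa = 2.0 * a**2 * np.hstack([r / rn**3, v / MU])  # (2.22)
    return float(np.sqrt(Fa @ P @ Fa))                # (2.23)

# Illustrative circular orbit with uncorrelated 10 m / 1 cm/s axis errors.
r = np.array([7000.0, 0.0, 0.0])
v = np.array([0.0, np.sqrt(MU/7000.0), 0.0])
P = np.diag([0.01**2]*3 + [1e-5**2]*3)
print(sigma_sma(np.hstack([r, v]), P))   # ~0.027 km: radius and speed both contribute
```

With zero correlation in P, the radial and along-velocity error contributions simply add in quadrature; introducing the negative radius-speed correlation discussed above would reduce the result, per (2.18).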
2.1.3. Flight-Path Angle Variance The flight-path angle, γ, is the angle between the velocity vector and the local horizontal plane; it is therefore the complement of the angle between the position and velocity vectors, so that

γ = arcsin( u_~r · u_~v )    (2.24)

where u_~r = ~r/r and u_~v = ~v/v. Taking partials with respect to x, we find that

F_γ(x̂) = ∂γ/∂x |_{x̂} = (1/√(1 − sin²γ̂)) [ (u_v̂^T − sin γ̂ u_r̂^T)/r̂   (u_r̂^T − sin γ̂ u_v̂^T)/v̂ ],   −π/2 < γ̂ < π/2    (2.25)

so that

σ_γ² = F_γ(x̂) P_x F_γ^T(x̂)    (2.26)

which is a form suitable for use in an OD filter estimating a Cartesian inertial state.
For analysis, a simpler form of (2.26) is as follows. Let us define two vectors that are normal to both the position vector and a vector normal to the orbit plane, u_~n: the unit in-track vector,

u_~t = u_~n × u_~r = (u_~v − u_~r sin γ)/√(1 − sin²γ)    (2.27)

which defines a unit vector in the orbit plane that is along the orbit track at apoapsis and periapsis, and the unit bi-normal vector,

u_~b = u_~v × u_~n = (u_~r − u_~v sin γ)/√(1 − sin²γ)    (2.28)

which defines a unit vector in the orbit plane that is along the position vector at apoapsis and periapsis. Let us next define a composite transformation matrix as follows. Let

M_rtn = [ u_~r_I   u_~t_I   u_~n_I ]    (2.29)

where the subscript I indicates that the specified vector is expressed in an inertial basis. The unitary matrix M_rtn^T thus transforms coordinates of vectors in physical space, which are given in the frame I, to coordinates defined in the basis given by the radial, transverse, and normal unit vectors. Similarly, define M_vnb as

M_vnb = [ u_~v_I   u_~n_I   u_~b_I ]    (2.30)

Now define the block diagonal transformation matrix M as

M = [ M_rtn   0_3×3 ]
    [ 0_3×3   M_vnb ]    (2.31)

Using the matrix M, we can transform the state error covariance, given in inertial coordinates, such that its position error covariance is expressed in the "RTN" frame, and its velocity error covariance is in the "VNB" frame. Using the preceding results in (2.26) results in considerable simplification:

σ_γ² = [ u_t̂^T/r̂   u_b̂^T/v̂ ] M ( M^T P_x M ) M^T [ u_t̂/r̂ ; u_b̂/v̂ ]    (2.32)
     = (σ_rt/r̂)² + 2 ρ_{rt,vb} (σ_rt/r̂)(σ_vb/v̂) + (σ_vb/v̂)²    (2.33)
which demonstrates the aforementioned assertion that flight-path angle error depends on
the in-track component of position error, and the bi-normal component of velocity error.
Note that (2.33) possesses the desirable feature that the relevant covariance information
(in-track position variance and bi-normal velocity variance) is normalized by radius and
speed, allowing differing orbital conditions to be readily compared with one another. In most applications, σ_rt ≪ r̂ and σ_vb ≪ v̂, so that these ratios can be taken as small angles and expressed in angular measures commensurate with the units chosen for flight-path angle
itself. Figure 2 employs this convention.
[Figure 2: flight-path angle error as a function of the normalized in-track position and bi-normal velocity errors, shown on logarithmic axes spanning 10⁻⁶ to 10⁻¹.]
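The chain from (2.25) to (2.33) is compact enough to sketch numerically. The following is a minimal illustration (function and variable names are ours, not the report's), computing the flight-path angle error standard deviation from a 6×6 inertial-frame covariance:

```python
import numpy as np

def flight_path_angle_sigma(r, v, Px):
    """Flight-path angle error sigma via sigma_gamma^2 = F Px F^T, per (2.25)-(2.26).

    r, v : Cartesian inertial position and velocity; Px : 6x6 state error covariance.
    """
    r = np.asarray(r, float)
    v = np.asarray(v, float)
    ur, uv = r / np.linalg.norm(r), v / np.linalg.norm(v)
    sg = ur @ uv                      # sin(gamma) = (r . v) / (r v)
    cg = np.sqrt(1.0 - sg**2)        # sqrt(1 - sin^2(gamma))
    ut = (uv - ur * sg) / cg         # unit in-track vector, (2.27)
    ub = (ur - uv * sg) / cg         # unit bi-normal vector, (2.28)
    F = np.hstack([ut / np.linalg.norm(r), ub / np.linalg.norm(v)])  # (2.25)
    return float(np.sqrt(F @ Px @ F))
```

For a circular orbit (sin γ = 0) with isotropic position and velocity covariance, this reduces to the quadrature sum of the normalized position and velocity sigmas, consistent with (2.33).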
2.2.1. Matrix Riccati Equation Many textbooks on Kalman filtering derive (2.40) as the solution of a matrix Riccati equation; using the notation of Chapter 1, this takes the following form:
    Ṗ(t) = A(t)P(t) + P(t)Aᵀ(t) + Q(t)   (2.41)
Use of (2.41) would seem to avoid the need to perform the integrations required to compute
the state transition matrix (STM) and process noise covariance (PNC). In orbit determina-
tion practice however, (2.40) has been found to be more numerically stable and also, despite
the need to compute or approximate the state transition and process noise integrals, more
efficient than (2.41).
2.2.2. State Transition Matrix A common approach in ground-based OD, especially
in the batch least squares context, is to simultaneously integrate the STM along with the
state vector,
    Ẋ(t) = f(X(t), t),   X(t₀) = X₀   (2.42)
    Φ̇(t, t₀) = A(t)Φ(t, t₀),   Φ(t₀, t₀) = I   (2.43)
When coupled with a good numerical integration algorithm, this method has excellent fidelity, a level of fidelity that is rarely necessary in onboard OD applications. Lear [40] studied a number
of practical methods for computing the STM in the onboard OD context. As his report is
not widely available, we will summarize some key findings here.
Lear’s approach was to compare various orders of truncated Taylor series and Runge-
Kutta approximations to the solution of (2.43). He used these STM approximations to
propagate an initially diagonal covariance for one revolution in a two-body circular orbit
around a point mass with the GM of Earth. By comparing these results to those he obtained
using an analytic STM, Lear could compute the maximum step size that would result in a
given relative accuracy for radius and speed formal standard deviations. Table 1 lists a few
of Lear’s results. Notably, Method H has the desirable feature that simply saving the value
of the state Jacobian from the previous propagation step allows for more than an order of
magnitude increase in allowable time step, with essentially the same computational burden
as Method B. If it is not too burdensome to compute A at the midpoint of the propagation
step, then Method G o↵ers nearly equivalent performance without the need to retain the
previous value of A. For higher-rate propagations, Method B o↵ers far more accuracy
than Method A with only a small additional computational burden. While Method A
appears to be a poor choice for many applications, it does play a central role in some useful
approximations to the process noise covariance, as the sequel shows.
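Lear's Table 1 is not reproduced here, but the flavor of his comparison can be sketched as follows for Methods A and B, the 1st- and 2nd-order Taylor truncations of the STM; the orbit radius, step size, and reference order are our illustrative assumptions, not Lear's numbers:

```python
import numpy as np

MU = 3.986004418e14  # Earth GM, m^3/s^2

def plant_matrix(r_vec):
    """A(t) = [[0, I], [G, 0]] with the two-body gravity gradient G."""
    r = np.linalg.norm(r_vec)
    u = r_vec / r
    G = MU / r**3 * (3.0 * np.outer(u, u) - np.eye(3))
    A = np.zeros((6, 6))
    A[:3, 3:] = np.eye(3)
    A[3:, :3] = G
    return A

def stm_taylor(A, dt, order):
    """Truncated Taylor-series STM: Phi = sum_{k=0..order} (A dt)^k / k!."""
    Phi, term = np.eye(6), np.eye(6)
    for k in range(1, order + 1):
        term = term @ (A * dt) / k
        Phi = Phi + term
    return Phi

A = plant_matrix(np.array([7000e3, 0.0, 0.0]))
dt = 10.0
Phi_ref = stm_taylor(A, dt, 20)                          # high-order reference
err_A = np.linalg.norm(stm_taylor(A, dt, 1) - Phi_ref)   # Method A: I + A dt
err_B = np.linalg.norm(stm_taylor(A, dt, 2) - Phi_ref)   # Method B: adds (A dt)^2/2
```

As expected, the second-order truncation tracks the reference more closely at a given step size, which is the trade Lear quantified against allowable time step.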
2.2.3. Process Noise Covariance As stated in Chapter 1, nearly all practical methods for computing the process noise covariance assume that E[w(t)wᵀ(τ)] = Q(t)δ(t − τ), so that (1.29) simplifies to a single integral for the process noise covariance¹, given by (1.42),
¹A notable exception is the work of Wright [76], which describes a correlated process noise model that is intended to account for gravity modeling errors in a “physically realistic” manner. Although this method
Tapley, Schutz, and Born [69] describe two approximations to the portion of (2.44) which
corresponds to position and velocity state noise, which have proven useful in both ground-
and onboard-OD applications. Reference 69 refers to these methods as “State Noise Com-
pensation” (SNC) and “Dynamic Model Compensation” (DMC). Before describing SNC
and DMC however, we will consider some inappropriate models.
Chapter 1 pointed out that
    E[ ∫_{t_{i−1}}^{t_i} ∫_{t_{i−1}}^{t_i} Φ(t_i, τ) B(τ) w(τ) wᵀ(σ) Bᵀ(σ) Φᵀ(t_i, σ) dτ dσ ]
        ≠ E[ ∫_{t_{i−1}}^{t_i} Φ(t_i, τ) B(τ) w(τ) dτ ∫_{t_{i−1}}^{t_i} wᵀ(τ) Bᵀ(τ) Φᵀ(t_i, τ) dτ ]   (2.45)
Let us explore the implications of assuming equality of the expression above. Suppose we assume that the process noise increments are approximately constant over some particular interval Δt = t_i − t_{i−1}, and that E[w(t_i)wᵀ(t_j)] = W(t_i)δ_ij, where δ_ij denotes the Kronecker delta. Then,
    E[ ∫_{t_{i−1}}^{t_i} Φ(t_i, τ) B(τ) w(τ) dτ ∫_{t_{i−1}}^{t_i} wᵀ(τ) Bᵀ(τ) Φᵀ(t_i, τ) dτ ]
        = ∫_{t_{i−1}}^{t_i} Φ(t_i, τ) B(τ) dτ E[w(t_i)wᵀ(t_i)] ∫_{t_{i−1}}^{t_i} Bᵀ(τ) Φᵀ(t_i, τ) dτ   (2.46)
        = Γ_i W_i Γ_iᵀ
There is a subtlety with (2.46) that can lead to issues: if the time interval associated with the assumption that E[w(t_i)wᵀ(t_j)] = W(t_i)δ_ij is not the same as the time interval associated with the integral Γ_i = ∫_{t_{i−1}}^{t_i} Φ(t_i, τ)B(τ) dτ, then the process noise will not be consistently applied. For example, if the EKF is tuned at a particular time step using a particular noise covariance W, and then for some reason the time step is changed, then one must retune the value of W.
A similar issue occurs when the process noise covariance is chosen without regard for
the dynamics, e.g. by setting it equal to a diagonal matrix of user-specified parameters.
Whatever careful tuning has been done to choose such parameters will be invalidated by a
change in the time step.
2.2.3.1. State Noise Compensation For SNC, as applied to OD, we assume velocity error is an uncorrelated random walk with fixed intensity in orbit-fixed coordinates, such as the RTN or VNB coordinates described above. Thus, assuming RTN coordinates without loss of generality, the process noise spectral density matrix becomes
    Q(t) = Q_rtn = [ q_r  0    0  ]
                   [ 0    q_t  0  ]
                   [ 0    0    q_n ]   (2.47)
has had occasional onboard application, it is more widely known for its inclusion in commercial-off-the-shelf software for ground-based OD.
We assume the transformation from orbit-fixed coordinates to the coordinates used for navigation, which are typically inertial coordinates, is approximately constant over the interval Δt = t_i − t_{i−1}, and ignore the correlation-inducing dependence of this transformation on the estimated position and velocity. This results in
    B(t) = B_rtn = [ 0₃ₓ₃  ]
                   [ M_rtn ]   (2.48)
We also assume that Δt is small enough that a 1st-order Taylor series truncation (Lear's Method A) is adequate for modeling the STM Φ(t_i, τ) in the integrand of (2.44), and that M_rtn is constant. With these assumptions, (2.44) becomes
    S_i = [ Q̃ Δt³/3   Q̃ Δt²/2 ]
          [ Q̃ Δt²/2   Q̃ Δt   ]   (2.49)
where Q̃ = M_rtn Q_rtn Mᵀ_rtn. Note that with the SNC model, velocity covariance grows linearly with time, as expected for a random walk model, and hence we should expect the units of √q_i, i = r, t, n, to be meters per second^(3/2).
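Equation (2.49) is straightforward to assemble in code. A minimal sketch (names are ours; any inertial-to-RTN rotation may be supplied for M_rtn):

```python
import numpy as np

def snc_process_noise(q_rtn, M_rtn, dt):
    """SNC process-noise covariance per (2.49).

    q_rtn : (q_r, q_t, q_n) velocity random-walk intensities, units m^2/s^3.
    M_rtn : 3x3 rotation whose columns are the RTN unit vectors in inertial axes.
    dt    : propagation step, s.
    """
    Qt = M_rtn @ np.diag(q_rtn) @ M_rtn.T   # Q-tilde, rotated into inertial axes
    S = np.zeros((6, 6))
    S[:3, :3] = Qt * dt**3 / 3.0
    S[:3, 3:] = Qt * dt**2 / 2.0
    S[3:, :3] = Qt * dt**2 / 2.0
    S[3:, 3:] = Qt * dt
    return S
```

The velocity block grows linearly in Δt, exhibiting the random-walk behavior noted above; doubling the step doubles the velocity covariance increment.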
2.2.3.2. State Noise Compensation for Maneuvers During powered flight, it is often necessary to include additional process noise to accommodate maneuver magnitude and direction errors. One approach is to simply define an additional SNC process noise covariance, with intensities that are sized to the maneuvering errors. While this works fine for modeling maneuver magnitude errors, direction errors may be more accurately modeled by recognizing that a misaligned maneuver vector may be represented by (I₃ − θ^×)Δv⃗_nom, where θ^× represents a skew-symmetric matrix of small angle misalignments, and Δv⃗_nom is the nominal maneuver vector. Thus, Δv⃗^×_nom θ is the error in the velocity increment due to maneuver direction errors, or, if sensed accelerations are being fed forward into the dynamics, due to IMU misalignments. To model these direction errors as a process noise term, let
    B(t) = [ 0₃ₓ₃      ]
           [ Δv⃗^×_nom ]   (2.50)
and let q_θ be the intensity of the maneuver direction noise. Then the SNC-style process noise for accommodating maneuver direction errors becomes
    S_i = q_θ [ −(Δv⃗^×_nom)² Δt³/3   −(Δv⃗^×_nom)² Δt²/2 ]
              [ −(Δv⃗^×_nom)² Δt²/2   −(Δv⃗^×_nom)² Δt   ]   (2.51)
since Δv⃗^×_nom (Δv⃗^×_nom)ᵀ = −(Δv⃗^×_nom)². A version of this method was used by the Space Shuttle during powered flight with IMU-sensed accelerations.
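A sketch of (2.51) in code (names ours). Note that −(Δv^×)² = Δv^×(Δv^×)ᵀ is positive semi-definite, with a null space along the nominal burn direction, so misalignment noise is injected only perpendicular to the burn:

```python
import numpy as np

def cross_matrix(w):
    """Skew-symmetric matrix such that cross_matrix(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def maneuver_direction_noise(dv_nom, q_theta, dt):
    """SNC-style process noise (2.51) for maneuver direction errors."""
    Vx = cross_matrix(np.asarray(dv_nom, float))
    VV = -Vx @ Vx                 # -(dv^x)^2 = dv^x (dv^x)^T, positive semi-definite
    S = np.zeros((6, 6))
    S[:3, :3] = VV * dt**3 / 3.0
    S[:3, 3:] = VV * dt**2 / 2.0
    S[3:, :3] = VV * dt**2 / 2.0
    S[3:, 3:] = VV * dt
    return q_theta * S
```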
2.2.3.3. Dynamic Model Compensation The DMC approach assumes the presence of
exponentially-correlated acceleration biases, which are included as additional solve-fors in
the filter state. As Chapter 5 and Appendix A discuss, a model for such biases is given by
    b(t + Δt) = e^(−Δt/τ) b(t) + ϖ(t)   (2.52)
where b(t₀) ~ N(0, p_b₀) and ϖ(t) ~ N(0, (qτ/2)(1 − e^(−2Δt/τ))). As Chapter 5 discusses, τ
is a time constant controlling the “smoothness” of the random process, and q is a power
spectral density that describes the intensity of the random input. While Chapter 5 discusses
a variety of other bias models that might be used, the exponentially-correlated model has
proved to be a best practice for applications in which there are measurements continually
available to persistently excite it. Refer to Chapter 5 for a fuller discussion of the relative
merits of various bias modeling approaches.
As above, without loss of generality we can assume the acceleration biases are aligned with the RTN frame, and again assume that Δt is small enough that a 1st-order Taylor series truncation is adequate for modeling the portion of the STM corresponding to position and velocity errors. With these assumptions, the terms appearing in the integrand of (2.44) become
    B(t) = B_rtn = [ 0₃ₓ₃  ]
                   [ 0₃ₓ₃  ]
                   [ M_rtn ]   (2.53)
and
    Φ(t + Δt, t) = [ I₃     Δt I₃   (τΔt − τ²(1 − e^(−Δt/τ))) I₃ ]
                   [ 0₃ₓ₃   I₃      τ(1 − e^(−Δt/τ)) I₃          ]
                   [ 0₃ₓ₃   0₃ₓ₃    e^(−Δt/τ) I₃                 ]   (2.54)
with
    S_i = [ φ_pp Q̃   φ_pv Q̃   φ_pa Q̃ ]
          [ φ_pv Q̃   φ_vv Q̃   φ_va Q̃ ]
          [ φ_pa Q̃   φ_va Q̃   φ_aa Q̃ ]²   (2.55)
where
    φ_pp = (τ⁵/2){ 1 − e^(−2Δt/τ) + 2(Δt/τ)(1 − 2e^(−Δt/τ)) − 2(Δt/τ)² + (2/3)(Δt/τ)³ }   (2.56)
    φ_pv = (τ⁴/2){ (e^(−Δt/τ) − 1)² + 2(Δt/τ)(e^(−Δt/τ) − 1) + (Δt/τ)² }   (2.57)
    φ_pa = (τ³/2){ 1 − e^(−2Δt/τ) − 2(Δt/τ)e^(−Δt/τ) }   (2.58)
    φ_vv = (τ³/2){ 1 − e^(−2Δt/τ) − 4(1 − e^(−Δt/τ)) + 2(Δt/τ) }   (2.59)
    φ_va = (τ²/2)(1 − e^(−Δt/τ))²   (2.60)
    φ_aa = (τ/2)(1 − e^(−2Δt/τ))   (2.61)
²In (2.55), the matrix Q̃ has the same form as it does for the SNC method, but with √q_i, i = r, t, n, now representing acceleration intensities, with units of meters per second^(5/2). Also note that (2.55) assumes the same time constant is applicable to all three acceleration channels. While this is usually sufficient, it is straightforward to extend (2.55) to accommodate separate time constants for each channel.
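The exponential coefficients (2.56)–(2.61) are easy to mistype; a sketch in code (our naming), which can be checked against their small-Δt limits (Δt⁵/20, Δt⁴/8, Δt³/6, Δt³/3, Δt²/2, and Δt, respectively):

```python
import math

def dmc_phi(dt, tau):
    """Coefficients (2.56)-(2.61) for the DMC process-noise covariance blocks."""
    b = dt / tau
    e1, e2 = math.exp(-b), math.exp(-2.0 * b)
    return {
        "pp": tau**5 / 2 * (1 - e2 + 2*b*(1 - 2*e1) - 2*b**2 + (2/3)*b**3),
        "pv": tau**4 / 2 * ((e1 - 1)**2 + 2*b*(e1 - 1) + b**2),
        "pa": tau**3 / 2 * (1 - e2 - 2*b*e1),
        "vv": tau**3 / 2 * (1 - e2 - 4*(1 - e1) + 2*b),
        "va": tau**2 / 2 * (1 - e1)**2,
        "aa": tau / 2 * (1 - e2),
    }
```

For Δt ≪ τ the coefficients reduce to the familiar white-acceleration (SNC-like) growth rates, while for Δt ≫ τ the acceleration variance coefficient saturates at τ/2, reflecting the bounded Gauss-Markov bias.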
2.2.3.4. Explicit Dynamic Biases While the DMC approach allows for quite general
estimation of otherwise unmodeled forces on the spacecraft, it is often the case that the
domain of application provides context that can narrow the filter designer’s focus. For
example, it may be the case that the only under-modeled force of appreciable significance
on the spacecraft is drag, or perhaps solar radiation pressure, within the context of the
application. Alternatively, the application may require much higher resolution models than
DMC, which might necessitate estimation of smaller forces with larger uncertainties such
as Earth radiation pressure, spacecraft thermal emission, etc., or panel-based modeling of
drag and/or solar radiation pressure, etc. In such cases, it is often useful to tailor the DMC approach so that it estimates model-specific biases, such as drag or SRP corrections, rather than modeling three general RTN biases. Similarly, during powered flight, maneuver
magnitude and direction errors might be more successfully modeled explicitly.
As an example of an explicit bias, consider estimating a multiplicative correction to the
density; a similar approach may be used for drag or solar radiation pressure coefficients.
Let t denote geocentric coordinate time. Let R denote a planetary-body-fixed, body-centric system of coordinates, aligned with the central body's rotation axis. Let I denote a body-centric, celestially-referenced system of coordinates, aligned with R at an epoch t₀. Let r⃗ represent the position of the center of gravity of a satellite, expressed in I. Let v⃗ represent the satellite's velocity within I. Let v⃗_r represent the satellite's velocity within frame R. Assume that r⃗ evolves with respect to t and I in the vicinity of t₀ according to
    ᴵd²r⃗/dt² = −(μ/r³) r⃗ − (1/2) C_D (A/m) ρ (1 + Δρ/ρ) v_r v⃗_r   (2.62)
where r = ‖r⃗‖, v_r = ‖v⃗_r‖, Δρ is an atmospheric density disturbance, ρ is the undisturbed atmospheric density, A is the area of the satellite in a plane normal to v⃗_r, m is the satellite mass, and C_D is the satellite's coefficient of drag. Assume that Δρ/ρ is a random process that formally evolves as a first-order Gauss-Markov process, similar to a DMC bias:
    d/dt (Δρ/ρ) = −(1/τ)(Δρ/ρ) + w_ρ   (2.63)
where q_ρ is the intensity of w_ρ. Let the state vector be x = [r⃗ᵀ, v⃗ᵀ, (Δρ/ρ)]ᵀ. With these
assumptions, the state dynamics and noise input partials are
    A(t) = [ 0₃ₓ₃            I₃       0₃ₓ₁  ]
           [ G(t) + D_r(t)   D_v(t)   d⃗(t) ]
           [ 0₁ₓ₃            0₁ₓ₃     −1/τ  ] ,   B(t) = [ 0₃ₓ₁ ]
                                                         [ 0₃ₓ₁ ]
                                                         [ 1    ]   (2.64)
where G(t) is the gravity gradient matrix, d⃗(t) = −(1/2) C_D (A/m) ρ v_r v⃗_r is the nominal drag acceleration, and D_r and D_v are partials of the drag acceleration with respect to position and velocity, respectively³.
Using the DMC results from above, with the assumption that the nominal drag acceleration is approximately constant over the integration time, Δt, the term Φ(t + Δt, t)B(t)
³The sensitivity D_r contains terms that are roughly proportional to the drag acceleration magnitude divided by the atmospheric scale height, and to the product of drag acceleration magnitude and the ratio of planetary rotation rate to speed relative to the atmosphere. For nearly all spacecraft, these terms will be many orders of magnitude smaller than the gravity gradient, which is proportional to the gravity acceleration divided by the radius. So D_r can usually be neglected. For reference, D_v = −(1/2) C_D (A/m) ρ v_r (u⃗_vr u⃗ᵀ_vr + I₃), and D_r = (d/R_s)(u⃗_r u⃗ᵀ_vr)ᵀ − D_v ω⃗^×, where ω⃗^× is the skew-symmetric “cross-product” matrix formed from the central body's rotation rate vector, and the notation u⃗_(·) indicates the unit vector of its subscript.
2.2.3.5. Episodic Dynamic Biases It has sometimes been found to be the case, particu-
larly for crewed missions, that episodic spacecraft activities can produce un-modeled accel-
erations. In the early days of manned spaceflight, events such as vents, momentum unloads,
RCS firings, etc. that may perturb a spacecraft’s trajectory were not well modeled, and came
to be described as FLAK, which was supposed to be an acronym for (un)-Fortunate Lack
of Acceleration Knowledge.
If the mean time between such activities can be characterized, along with the expected intensity of the acceleration, a compound Poisson-Gaussian process noise model may be effective. Conveniently, it turns out that the covariance of a linear system driven by a train of Gaussian-distributed impulses whose arrival times follow a Poisson distribution is the same as the covariance of the same system driven by a white noise input process, except for the scaling of the process noise covariance by the Poisson process rate parameter [24].
To understand this result, consider a linear model of the error in a spacecraft trajectory
as follows
ẋ(t) = A(t)x(t) + B(t)u(t) (2.67)
where x represents the deviation of the actual position/velocity state from its estimated or
nominal value, and u represents the FLAK. Then, if we make use of inertial coordinates,
    A(t) = [ 0      I ]
           [ G(t)   0 ] ,   B(t) = [ 0    ]
                                   [ M(t) ]   (2.68)
where G represents the gravity gradient matrix, and M is the direction cosine matrix rotating the supposed body-fixed FLAK into the inertial frame. There is no general solution to this differential equation, but over short time intervals, we can assume that
    x(t_k) = Φ(t_k, t_{k−1}) x_{k−1} + ∫_{t_{k−1}}^{t_k} Φ(t_k, τ) B(τ) u(τ) dτ   (2.69)
where
    Φ(t_k, t_{k−1}) = I + A(t_k)(t_k − t_{k−1})   (2.70)
Since the input is a sequence of a (random) number n_k of impulses,
    u(t) = Σ_{i=1}^{n_k} u_i δ(t − t_i)   (2.71)
with t_{k−1} < t_i < t_k, the delta functions annihilate the integral and our model becomes (with Δt_i = t_k − t_i):
    x(t_k) = Φ(t_k, t_{k−1}) x_{k−1} + Σ_{i=1}^{n_k} u_i [ η⃗_i Δt_i ]
                                                         [ η⃗_i     ]   (2.72)
where η⃗_i is the unit inertial direction vector for the FLAK event. Note that the input response is in some sense fundamentally an increment to the entire state vector at each t_k; we can however compute an equivalent zero-order hold acceleration by dividing the velocity increment by the time step t_k − t_{k−1}.
Since we assume each impulse is Gaussian, the input response has zero mean. To find a tractable form for the covariance, assume that the direction of the FLAK event is constant over each interval t_k − t_{k−1}. This assumption assures that the impulses are identically distributed over each sampling interval. Then, the process noise covariance is given by Reference 24:
    S(t_k) = qλ ∫_{t_{k−1}}^{t_k} [ η⃗_k(t_k − τ) ] [ η⃗ᵀ_k(t_k − τ)   η⃗ᵀ_k ] dτ   (2.73)
                                  [ η⃗_k          ]
where q is the intensity of the Gaussian impulses and λ is the rate parameter of the Poisson process. Carrying out the integration results in
    S(t_k) = qλ [ η⃗_k η⃗ᵀ_k Δt³/3   η⃗_k η⃗ᵀ_k Δt²/2 ]
                [ η⃗_k η⃗ᵀ_k Δt²/2   η⃗_k η⃗ᵀ_k Δt   ]   (2.74)
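A sketch of (2.74) in code (names ours; q and λ would come from characterizing the vents or unloads in question):

```python
import numpy as np

def flak_process_noise(q, lam, eta, dt):
    """Process noise (2.74): Poisson-arriving Gaussian impulses along unit vector eta.

    q   : intensity of the Gaussian impulse magnitudes.
    lam : Poisson rate parameter (expected events per second).
    eta : unit inertial direction of the episodic acceleration.
    dt  : propagation interval t_k - t_{k-1}, s.
    """
    eta = np.asarray(eta, float)
    E = np.outer(eta, eta)          # rank-one direction dyad
    S = np.zeros((6, 6))
    S[:3, :3] = E * dt**3 / 3.0
    S[:3, 3:] = E * dt**2 / 2.0
    S[3:, :3] = E * dt**2 / 2.0
    S[3:, 3:] = E * dt
    return q * lam * S
```

The result has the same Δt³/3, Δt²/2, Δt block structure as SNC, but is rank-deficient: noise is injected only along the assumed event direction.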
2.2.3.6. Computational Considerations The primary computational issues that can affect covariance propagation may be broadly characterized as underflow and overflow. Underflow can occur especially because many of the process noise parameters described above may often have values that approach computational truncation limits, which can lead to non-positive-definite process noise covariances. Overflow can similarly occur when truncation limits are approached by very large covariance values, such as can occur with long propagation times. Because the orbital dynamics are at best marginally stable, even propagating without process noise can result in differences of many orders of magnitude between the largest and smallest eigenvalues. This problem will be exacerbated if process noise is present, since all of the process noise models described above introduce unbounded position covariance error growth⁴.
Simple tricks like enforcing symmetry, or adding a small positive diagonal matrix, will
not always ensure positive eigenvalues in such cases. A better solution is to maintain the
covariance in factorized form, for example as Chapter 7 describes. In lieu of a fully factorized filtering approach, process noise covariances may be computed from their factorizations. Chapter 5 shows a few examples of Cholesky factorizations that may be employed in this fashion.
2.2.4. Tuning the Covariance Propagation Since even the best practices this
Chapter has discussed are at best approximations, it is inevitable that EKF designers must
perform some artful tuning of the free parameters to achieve acceptable results. Further-
more, computational limitations of flight computers often lead to the need for compromises
in modeling fidelity. What one generally hopes to accomplish via tuning of the covariance
propagation is that any approximations or compromises the EKF has had to endure to be
implementable have not impaired its covariance’s accuracy too much. In particular, one
would like to compute an idealized “truth” covariance matrix, based on the best-available
models and data, and adjust the EKF’s “formal” covariance via the tuning process to yield
some semblance of a match.
⁴Reference 8 proposes an approximate “solution” to this problem via a Floquet analysis of a modified set of covariance propagation dynamics that include artificially-introduced damping.
    Ê_e{e(t_i)eᵀ(t_i)} = (1/(K − 1)) Σ_{j=1}^{K} e_j(t_i) e_jᵀ(t_i)   (2.78)
⁵Stationary data are those for which the statistics do not change when the time origin shifts.
The ensemble statistics will generally give the best indication of performance if
the data are non-stationary, so long as we use an adequate number of Monte Carlo
cases.
Since there is nothing analogous to an ensemble of Monte Carlo cases for flight data, we cannot use ensemble statistics as defined above. Let d represent the random difference vector between the quantity of interest and its comparison value. We can apply time series statistics, but as the mission evolves, the span of the time series continually extends, so we have to decide which subsets of the entire mission span to use, e.g., the time series extending back over the entire history of the mission, extending back only over some shorter interval, etc., and also how frequently to recompute the time series statistics, e.g., continuously, once per day, etc. There are some other approximations to the expectation that we might use here.
Sliding Window Time Series Expectation: We can take statistics of the difference realizations, d(t_i), across a sliding window extending Δt into the past from each observation, for each case, to get the Δt-sliding window time series expectations:
    Ê_{t,Δt}(d) = (1/n) Σ_{i=0}^{n−1} d(t_{N−i})   (2.79)
    Ê_{t,Δt}(ddᵀ) = (1/(n − 1)) Σ_{i=0}^{n−1} d(t_{N−i}) d(t_{N−i})ᵀ   (2.80)
where n is the number of time samples in the window Δt.
Period-Folding Expectation: If the data are periodic, we can break up the data into K spans of one period in duration each, and shift the time origin of each span so that the data are “folded” into the same, one-period-long interval. We can then take ensemble statistics over times at the same phase angle, t_φᵢ, within each period.
    Ê_f{d(t_φᵢ)} = (1/K) Σ_{j=1}^{K} d_j(t_φᵢ)   (2.81)
    Ê_f{d(t_φᵢ)d(t_φᵢ)ᵀ} = (1/(K − 1)) Σ_{j=1}^{K} d_j(t_φᵢ) d_jᵀ(t_φᵢ)   (2.82)
It is often useful to fold the data into bins of equal mean anomaly. This is especially
useful for orbits with notable eccentricity, since it ensures that a roughly equal
number of time points will be present in each bin.
Sliding Window Period-Folding Expectation: Period-folding can obviously be
applied over a sliding window as well, with each window extending n periods into
the past.
    Ê_{f,n}{d(t_φᵢ)} = (1/n) Σ_{j=0}^{n−1} d_{K−j}(t_φᵢ)   (2.83)
    Ê_{f,n}{d(t_φᵢ)d(t_φᵢ)ᵀ} = (1/(n − 1)) Σ_{j=0}^{n−1} d_{K−j}(t_φᵢ) d_{K−j}ᵀ(t_φᵢ)   (2.84)
This is especially useful for identifying secular trends in periodic data sets.
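The period-folding expectations amount to reshaping the time series so that samples at the same phase line up; a scalar-valued sketch (names ours), using the uncentered second moment exactly as the folding expectations above are written:

```python
import numpy as np

def period_fold_moments(d, K):
    """Period-folding mean and second moment, per (2.81)-(2.82), scalar case.

    d : 1-D array of length K * n_phase, time-ordered; each consecutive block
        of n_phase samples spans one period, so column j collects the samples
        at the same phase angle across the K periods.
    """
    folded = np.asarray(d, float).reshape(K, -1)    # rows = periods, cols = phase bins
    mean = folded.mean(axis=0)                      # (2.81)
    second = (folded ** 2).sum(axis=0) / (K - 1)    # (2.82), uncentered, unbiased-style
    return mean, second
```

Binning by equal mean anomaly, as suggested above, corresponds to choosing the phase bins so that each column holds a comparable number of samples for eccentric orbits.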
2.2.4.2. Tuning for Along-track Error Growth As described above, the position error component along the orbit track will dominate covariance propagation error, and so the most important step in tuning the covariance propagation is to ensure that this component grows no faster or slower than it should, based on the truncations and approximations that the EKF design has employed. One may use any of the analytical or empirical methods described above to estimate the “true” covariance. For example, for preflight analysis, one may generate a time series or ensemble of time series of differences between states propagated using the formal models the filter employs and a best-available “truth” model of the system. One can then compare the appropriate empirical covariance computed from this data set to the filter's formal covariance, and adjust the process noise intensities until a reasonable match occurs. For flight data analysis, one may similarly difference across overlaps between predictive and definitive states, and compare the empirical covariances of these differences to the sum of the predictive and definitive formal covariances from the filter.
If one uses the SNC method, the primary “knob” for tuning the along-track covariance growth rate is the corresponding along-track component of the process noise intensity, q_T or q_V, depending on whether RTN or VNB components are used, respectively. Essentially, an impulse along the velocity vector, or change in speed, causes a change in SMA, corresponding to a change in period, and hence a secular growth in position error along the orbit, as discussed at the beginning of this Chapter. This mechanism is especially transparent for near-circular orbits, and some simple analysis yields a good starting point. One may find a fuller exposition of the following result in Reference 21.
For near-circular orbits, the position components of the integrand in (2.44) become, in RTN coordinates,
    Φ_rv(Δt) [ q_R  0    0  ]
             [ 0    q_T  0  ]
             [ 0    0    q_N ] Φᵀ_rv(Δt)   (2.85)
where Φ_rv(Δt) is given, per Hill, Clohessy, and Wiltshire, by
    Φ_rv(Δt) = [ sin(nΔt)/n          2(1 − cos(nΔt))/n    0          ]
               [ 2(cos(nΔt) − 1)/n   4 sin(nΔt)/n − 3Δt   0          ]
               [ 0                   0                    sin(nΔt)/n ]   (2.86)
Retaining only secular terms and carrying out the integral, the along-track component of the process noise covariance becomes
    S_T(Δt) ≈ 3Δt³ q_T   (2.87)
an approximation which holds for Δt > T_p. Thus, one may use an empirical covariance of the along-track error after one orbit period, such as σ̂_s² = Ê_f{Δs²}, to derive a starting point from which to tune q_T, as
    q_T = σ̂_s² / (3T_p³)   (2.88)
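The round trip implied by (2.87) and (2.88) is a one-liner each; a sketch (names ours):

```python
def qT_from_alongtrack_sigma(sigma_s, Tp):
    """Starting-point SNC along-track intensity (2.88), from the empirical
    one-period along-track error sigma (m) and orbit period Tp (s)."""
    return sigma_s**2 / (3.0 * Tp**3)

def alongtrack_var(qT, dt):
    """Secular along-track process-noise variance (2.87), valid for dt > Tp."""
    return 3.0 * dt**3 * qT
```

For example, a LEO period near 5400 s and a 100 m empirical one-period sigma give a q_T that, propagated back through (2.87) for one period, reproduces the 100² m² variance, which is exactly the intended starting point for further tuning.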
computational burden from O(n³) to O(n²/2), where n is the state dimension.
Although Algorithm 2.1 does not show the state update, the state may also be sequentially updated as part of the iteration. However, the order in which the scalar measurements update the state can affect the outcome, if the measurement partials are computed one column at a time, corresponding with each scalar update. This may produce undesirable or even unstable outcomes. Chapter 3 will discuss such issues further.
Despite the extensive and successful flight heritage of Algorithm 2.1, it cannot guarantee
numerical stability and positive definiteness of the covariance. Therefore, the recommended
best practice for the covariance update is to utilize the U D-factorization, which Chapter 7
describes.
2.3.2. Use of Consider States It may often be the case that unobservable states are present in the system being estimated. Most commonly, such states will be parameters whose values are unknown or uncertain. Inclusion of such parameters as solve-for states in the EKF is not a recommended practice. However, if the EKF completely ignores the uncertainty that such parameters introduce, its covariance can become overly optimistic, a condition sometimes known as “filter smugness.” One approach to addressing this problem was introduced by Schmidt [64], originally in the context of reducing the computational burden that the EKF imposed on flight computers of the 1960s. Schmidt's idea is essentially for the EKF to maintain a covariance containing all of the states whose uncertainties are significant enough to affect filter performance, but only to update a subset of those states.
The states which are not updated in this framework are typically known as “consider”
parameters, and such a filter has been called a “consider filter” or a “Schmidt-Kalman”
filter. Although most commonly the state space is simply partitioned by selecting states
as either solve-for or consider states, Reference 48 points out that partitioning using linear
combinations of the full state space is also possible.
Following Reference 48, suppose the filter produces estimates for a subset of n_s solve-for states, out of the full state of size n. The filter does not estimate the remaining n_c = n − n_s consider states. Denote the true solve-for vector by s(t), and the true consider vector by c(t). Assume that these are linear combinations of the true states, according to the following:
    s(t) = S(t)x(t) and c(t) = C(t)x(t)   (2.91)
where the n_s × n matrix S(t) and the n_c × n matrix C(t) are such that the matrix
    M = [ S ]
        [ C ]   (2.92)
is non-singular. The inverse of M is partitioned into an n × n_s matrix S̃ and an n × n_c matrix C̃:
    M⁻¹ = [ S̃, C̃ ].   (2.93)
The properties of the matrix inverse then lead immediately to the identities
    S S̃ = I_{n_s},   C C̃ = I_{n_c},   S C̃ = 0_{n_s×n_c},   C S̃ = 0_{n_c×n_s},   (2.94)
and
S̃S + C̃C = In . (2.95)
In the usual case that the elements of the solve-for and consider vectors are merely selected
and possibly permuted components of the state vector, the matrix M is an orthogonal
permutation matrix. In this case, and in any case for which M is orthogonal, the matrices
S̃ and C̃ are just the transposes of S and C, respectively, which makes inversion of M
unnecessary and simplifies many of the following equations.
It follows from Eqs. 2.91 and 2.95 that
x(t) = S̃(t)s(t) + C̃(t)c(t). (2.96)
Relations similar to Eq. 2.91 give the estimated solve-for vector ŝ(t) and the assumed con-
sider vector ĉ(t) in terms of the estimated state x̂(t). Thus, errors in the solve-for and
consider states are given by
    e_s(t) = s(t) − ŝ(t) = S(t)e(t)   (2.97)
    e_c(t) = c(t) − ĉ(t) = C(t)e(t)   (2.98)
and the true error may be written in terms of the solve-for and consider errors by
e(t) = S̃(t)es (t) + C̃(t)ec (t). (2.99)
In terms of this notation, the EKF update has the form
    ŝᵢ⁺ = ŝᵢ⁻ + Kᵢ rᵢ   (2.100)
where
    rᵢ = yᵢ − ŷᵢ = Hᵢ eᵢ + vᵢ,   (2.101)
and the subscript i is a shorthand for the time argument tᵢ. The usual EKF will not contain the full covariance, but only its solve-for part
    P_ss(t) = E[e_s(t) e_sᵀ(t)]   (2.102)
By contrast, the Schmidt-Kalman filter will use the full covariance, P(t). In the usual case, the Kalman gain is given by
    Kᵢ = P_ssᵢ Hᵀ_sᵢ [ H_sᵢ P_ssᵢ Hᵀ_sᵢ + Rᵢ ]⁻¹   (2.103)
where
    H_sᵢ = Hᵢ S̃ᵢ   (2.104)
In the Schmidt-Kalman case,
    Kᵢ = Sᵢ Pᵢ Hᵢᵀ [ Hᵢ Pᵢ Hᵢᵀ + Rᵢ ]⁻¹   (2.105)
Thus, the Schmidt-Kalman gain matrix is computed from the full covariance, but only applies measurement innovations to the solve-for states.
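A toy numerical instance of the Schmidt-Kalman gain (2.105); the state partition, covariance, and measurement values below are purely illustrative assumptions of ours:

```python
import numpy as np

def schmidt_gain(P, H, R, S):
    """Schmidt-Kalman gain (2.105): innovation covariance from the FULL
    covariance P, but the gain maps only into the solve-for partition."""
    W = H @ P @ H.T + R
    return S @ P @ H.T @ np.linalg.inv(W)

# 2 solve-for states + 1 consider state; S selects the solve-for partition,
# so S-tilde is simply S transposed (M orthogonal, as noted above).
S = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
P = np.array([[4.0, 0.5, 0.3],
              [0.5, 2.0, 0.8],
              [0.3, 0.8, 1.0]])
H = np.array([[1.0, 0.0, 1.0]])   # the measurement also senses the consider state
R = np.array([[0.25]])
K = schmidt_gain(P, H, R, S)      # shape (2, 1): no update to the consider state
```

Compare with (2.103): a filter carrying only P_ss would omit the consider-state contribution to the innovation covariance and thus be over-confident, which is precisely the “smugness” the Schmidt formulation avoids.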
2.4. Sigma-Point Methods
Before concluding this Chapter, it is worth noting that a promising method for propa-
gating and updating the covariance that is coming into greater use and acceptance within
the navigation community is the use of sigma-point methods, also known as “unscented”
transforms. A subsequent chapter covering Advanced Topics will cover these topics in a
section on sigma-point filtering.
CHAPTER 3
Processing Measurements
This chapter will discuss how to handle real-world measurement processing issues. In particular, being able to handle measurements that are not synchronous is of paramount importance to running filters in a real-time environment. In addition, navigation filters with nonlinear measurement models are susceptible to divergence depending on the order in which measurements occurring at the same epoch are processed. Therefore, a technique which provides invariance to the order of measurement processing is detailed. A technique for processing correlated measurements is presented, and brief comments on filter cascading and processing of inertial data are offered.
3.1. Measurement Latency
In general, the measurement time tags are not going to be equal to the current filter epoch time, t_k. To state it another way, the measurements do not come in at the current filter time. Rather, they may be latent by up to p seconds. Thus, a situation will arise where the filter has propagated its state and covariance to time t = t_k from time t = t_{k−1}, and is subsequently given a measurement to be filtered (denoted by subscript m) that corresponds to the time t = t_m, where
    t_m ≤ t_k   (3.1)
If Δt = t_m − t_k is not insignificant, the time difference between the measurement and the filter state and covariance will need to be accounted for during filtering in order to accurately process the measurement. This can be done in much the same way a batch filter operates (see pages 196-197 of Tapley [69]). If the measurement at time t = t_m is denoted as y_m, the nominal filter state at that time is given by X*_m ≡ X*(t_m) (* denotes the nominal), and the measurement model is denoted as h_m(X_m, t_m), then one can expand the measurement model to first order about the nominal filter state to get
    h_m(X_m, t_m) = h_m(X*_m, t_m) + H_m x_m + ν_m   (3.2)
where x_m = X_m − X*_m and H_m is defined as
    H_m = ( ∂h_m(X, t_m)/∂X )|_{X=X*_m}   (3.3)
The perturbed state at time t_m can be written in terms of the state at time t_k as follows
    x_m = Φ(t_m, t_k) x_k + Γ_m w_m   (3.4)
so we can compute the measurement as
    h_m(X_m, t_m) = h_m(X*_m, t_m) + H_m Φ(t_m, t_k) x_k + H_m Γ_m w_m + ν_m   (3.5)
where the measurement noise has the characteristics E[ν_m] = 0 and E[ν_m²] = R_m, the state process noise from t = t_m to t = t_k has the characteristics E[w_m] = 0 and E[w_m w_mᵀ] = Q_m, and the state deviation is given by
    x_k = ΔX_k = X_k − X*_k   (3.6)
(Note that the effect of the state process noise would be to increase the measurement noise variance. However, because the process noise term is very small over time periods of a few seconds, it can safely be neglected for the remainder of this analysis.)
Upon taking the conditional expectation of the measurement equation and rearranging,
the scalar residual of the measurement is given by

δy_m − H̃_m x̂_k(−) = y_m − h_m(X*_m, t_m) − H_m Φ(t_m, t_k) x̂_k(−)    (3.7)

where ˆ· denotes an estimated value,

δy_m = y_m − h_m(X*_m, t_m)
x̂_k(−) = X̂_k(−) − X*_k    (3.8)

The measurement partials that are used in the update, which map the measurement to the
state at time t = t_k, are given by

H̃_m = H_m Φ(t_m, t_k)    (3.9)

Eq. 3.9 was derived by noting that

H_m x̂_m = H_m Φ(t_m, t_k) x̂_k = H̃_m x̂_k    (3.10)

From the above discussion, it is evident that the unknown quantities needed to update
the state at time t = t_k with a measurement from time t = t_m are the nominal state at the
measurement time, X*_m, and the state transition matrix relating the two times, Φ(t_m, t_k).
Given those values, h_m(X*_m, t_m) and H̃_m can be calculated.
Thus the nominal state at the measurement time is calculated by back-propagating the
filter state from time t_k to time t_m using buffered IMU data. The same is done to
calculate the required state transition matrix. The same propagation algorithms used in
forward propagation ought to be utilized for the back-propagation, with the exception that
the smaller time step allows for a first-order approximation of the matrix exponential used
to update the state transition matrix.
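The latent-measurement update above can be sketched in a few lines of numpy. This is an illustrative linear example, not flight code: the function name, the Joseph-form update, and the assumption that the back-propagated nominal h_m(X*_m), the Jacobian H_m, and Φ(t_m, t_k) are already available are all choices made here for clarity.

```python
import numpy as np

def latent_measurement_update(x_hat_k, P_k, y_m, h_of_Xm, H_m, Phi_mk, R_m):
    """Update the deviation estimate at t_k with a measurement taken at an
    earlier time t_m, via the residual-mapping approach of Section 3.1.
    Phi_mk is Phi(t_m, t_k) from back-propagation; process noise over the
    short latency interval is neglected, as in the text."""
    H_tilde = H_m @ Phi_mk                 # Eq. (3.9): map partials to t_k
    dy = y_m - h_of_Xm                     # residual about the nominal
    W = H_tilde @ P_k @ H_tilde.T + R_m    # residual covariance
    K = P_k @ H_tilde.T @ np.linalg.inv(W)
    x_hat_plus = x_hat_k + K @ (dy - H_tilde @ x_hat_k)
    I_KH = np.eye(P_k.shape[0]) - K @ H_tilde
    P_plus = I_KH @ P_k @ I_KH.T + K @ R_m @ K.T   # Joseph form
    return x_hat_plus, P_plus
```

With Φ = I this reduces to an ordinary scalar-measurement Kalman update, which provides a simple sanity check.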
3.2. Invariance to the Order of Measurement Processing
It has long been known that the performance of an EKF is dependent on the order in
which one processes measurements. This is of particular import when a powerful
measurement is coupled with a large a priori error. The state (and covariance)
update will be large, very likely outside the linear range. Subsequent measurements which
are processed may well be outside the residual edit thresholds, and hence will be rejected.
In order to remedy this, we employ a hybrid linear/extended Kalman filter measurement
update. Recall that in an extended Kalman filter, the state is updated / relinearized
/ rectified after each measurement is processed, regardless of whether the measurements
occur at the same time. Hence, the solution is highly dependent on the order in which the
measurements are processed. This is not a desirable situation in which to be.
We obviate this difficulty simply by not updating the state until all the measurements
at a given time are processed. We accumulate the state updates in state deviations x̂, using
Algorithm 3.1. This algorithm makes use of the fact that, in the absence of process noise, a
batch/least squares algorithm is mathematically equivalent to a linear Kalman filter [25].
Algorithm 3.1 is a recommended best practice.

Algorithm 3.1 Measurement Update Invariant to Order of Processing
  for each (scalar) measurement j = 1 through k do
    δy_j = Y_j − h_j(X*_m, t_m)
    H_j = ∂h_j/∂X (X*_m, t_m)
    K_j = P_j H_j^T (H_j P_j H_j^T + R_j)^{−1}
    x̂_j ← x̂_j + K_j (δy_j − H_j x̂_j)
    P_j ← (I − K_j H_j) P_j (I − K_j H_j)^T + K_j R_j K_j^T
  end for
  X_m ← X*_m + x̂_j

Algorithm 3.1 may readily be combined with
the residual mapping approach described above when the measurements are asynchronous.
Algorithm 3.1 may also be readily combined with Algorithm 2.1, for cases in which the
preferred factorized covariance methods are precluded.
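A minimal numpy sketch of Algorithm 3.1 follows. The function name and argument layout are illustrative assumptions; the essential point is that the deviation x̂ is updated as in a linear Kalman filter, and the nominal state is rectified only after the loop, so the result does not depend on the order of the scalar measurements.

```python
import numpy as np

def order_invariant_update(X_nom, P, ys, hs, Hs, Rs):
    """Process all scalar measurements at one epoch as a linear KF on the
    deviation x_hat; apply the accumulated deviation to the nominal state
    only at the end (Algorithm 3.1).  ys, hs, Hs, Rs are lists of scalar
    measurements, predicted measurements h_j(X*), 1xn Jacobians, and
    scalar variances."""
    n = P.shape[0]
    x_hat = np.zeros(n)
    for y, h, H, R in zip(ys, hs, Hs, Rs):
        dy = y - h                                    # residual about nominal
        W = (H @ P @ H.T).item() + R                  # scalar residual variance
        K = (P @ H.T).ravel() / W                     # Kalman gain
        x_hat = x_hat + K * (dy - (H @ x_hat).item())  # linear deviation update
        I_KH = np.eye(n) - np.outer(K, H)
        P = I_KH @ P @ I_KH.T + np.outer(K, K) * R    # Joseph form
    return X_nom + x_hat, P
```

Because the deviation update is linear, swapping the order of the measurement lists leaves the result unchanged to machine precision.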
3.3. Processing Vector Measurements
If the UDU factorization is used, measurements need to be processed as scalars. If
the vector measurements are correlated, one option is to assume they are uncorrelated and
ignore the correlations between the measurements.
However, there is a better alternative. Given the measurement equation (Yj = Hj (Xj , tj )+
⌫ j ) with measurement error covariance matrix, Ri , first decompose the matrix with a
Cholesky factorization as
⇥ ⇤ 1/2 T/2
Ri = E ⌫ j ⌫ Tj = Ri Ri (3.11)
1/2
and premultiply the measurement equation by Ri to yield
Ỹj = H̃j (Xj , tj ) + ⌫
˜j (3.12)
with
1/2
Ỹj = Ri Yj (3.13)
1/2
H̃j (Xj , tj ) = Ri Hj (Xj , tj ) (3.14)
1/2
˜j
⌫ = Ri ⌫j (3.15)
so that E[˜⌫j⌫˜ j ] = I. Thus, the new measurement equation has errors which are now
decorrelated.
Alternatively, one can decompose the m ⇥ m measurement error covariance matrix
with a UDU decomposition as Ri = URi DRi UTRi so that using a similar reasoning, we
premultiply the measurement equation by URi1 so that in this case
Ỹj = URi1 Yj (3.16)
1
H̃j (Xj , tj ) = URi Hj (Xj , tj ) (3.17)
1
˜j
⌫ = U Ri ⌫ j (3.18)
so that E[˜ ˜ j ] = DRi where DRi is a diagonal matrix and, as in the case of the Cholesky
⌫j⌫
decomposition, the new measurement model has decorrelated measurement errors.
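The Cholesky variant of Eqs. (3.11)-(3.15) can be sketched as below. The function name is illustrative; in practice one applies a triangular solve rather than forming R^{−1/2} explicitly, which is what `np.linalg.solve` does here.

```python
import numpy as np

def decorrelate_cholesky(Y, H, R):
    """Whiten a correlated vector measurement: factor R = L L^T
    (Eq. 3.11) and premultiply by L^{-1} (Eqs. 3.13-3.14) so the
    transformed errors have identity covariance and the components
    can be processed as independent scalars."""
    L = np.linalg.cholesky(R)
    Y_t = np.linalg.solve(L, Y)   # L^{-1} Y
    H_t = np.linalg.solve(L, H)   # L^{-1} H
    return Y_t, H_t
```

By construction the transformed error covariance is L^{−1} R L^{−T} = I, so each row of the transformed measurement can feed a scalar UDU update.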
CHAPTER 4
Measurement Underweighting
4.1. Introduction
Given an m-dimensional random measurement y which is somehow related to an un-
known, n-dimensional random vector x, the family of affine estimators of x from y is

x̂ = a + K y    (4.1)

where a ∈ ℝ^n and K ∈ ℝ^{n×m}. The optimal, in a Minimum Mean Square Error sense, affine
estimator has

K = P_xy P_yy^{−1}    (4.2)
a = E[x] − K E[y]    (4.3)

where

P_xy = E[(x − E[x])(y − E[y])^T]    (4.4)
P_yy = E[(y − E[y])(y − E[y])^T]    (4.5)
In the presence of nonlinear measurements of the state,

y = h(x) + v    (4.6)

(where v is zero-mean measurement noise) the extended Kalman filter (EKF) [20] approxi-
mates all moments of y by linearization of the measurement function centered on the mean
of x. This methodology has proven very effective and produces very satisfactory results
in most cases. Approaches other than the EKF exist; for example, the Unscented Kalman
Filter [32] approximates the same quantities via stochastic linearization using a determin-
istic set of Sigma Points. Higher-order truncations of the Taylor series are also possible.
Underweighting [38, 40] is an ad hoc technique for compensating for nonlinearities in the
measurement models that are neglected by the EKF; it successfully flew on the Space
Shuttle and on Orion Exploration Flight Test 1.
The commonly implemented method for the underweighting of measurements for hu-
man space navigation was introduced by Lear [39] for the Space Shuttle navigation system.
In 1966, Denham and Pines showed the possible inadequacy of the linearization approx-
imation when the effect of measurement nonlinearity is comparable to the measurement
error [12]. To compensate for the nonlinearity, Denham and Pines proposed to increase the
measurement noise covariance by a constant amount. In the early seventies, in anticipation
of Shuttle flights, Lear and others developed relationships which accounted for the second-
order effects in the measurements [40]. It was noted that in situations involving large state
errors and very precise measurements, application of the standard extended Kalman filter
mechanization leads to conditions in which the state estimation error covariance decreases
more rapidly than the actual state errors. Consequently the extended Kalman filter be-
gins to ignore new measurements even when the measurement residual is relatively large.
Underweighting was introduced to slow down the convergence of the state estimation error
covariance thereby addressing the situation in which the error covariance becomes overly
optimistic with respect to the actual state errors. The original work on the application of
second-order correction terms led to the determination of the underweighting method by
trial-and-error [39].
More recently, studies on the effects of nonlinearity in sensor fusion problems with
application to relative navigation have produced a so-called "bump-up" factor [16, 44, 56,
58]. While Ferguson [16] seems to have initiated the use of the bump-up factor, the problem
of mitigating filter divergence was more fully studied by Plinval [58] and subsequently by
Mandic [44]. Mandic generalized Plinval's bump-up factor to allow flexibility and notes
that the value selected influences the steady-state convergence of the filter. In essence, it
was found that a larger factor keeps the filter from converging to the level that a lower factor
would permit. This finding prompted Mandic to propose a two-step algorithm in which the
bump-up factor is applied for a certain number of measurements only, after which the factor
is completely turned off. Finally, Perea et al. [56] summarize the findings of the previous
works and introduce several ways of computing the applied factor. In all cases, the bump-up
factor amounts in application to the underweighting factor introduced by Lear [39]. Save for
the two-step procedure of Mandic, the bump-up factor is allowed to persistently affect the
Kalman gain, which directly influences the obtainable steady-state covariance. Effectively,
the ability to remove the underweighting factor autonomously and under some convergence
condition was not introduced.
The work of Lear is not well known, as it is only documented in internal NASA memos
[39, 40]. Kriegsman and Tau [38] mention underweighting in their 1975 Shuttle navigation
paper without a detailed explanation of the technique.
4.2. Nonlinear Effects and the Need for Underweighting
We briefly review three state estimate update approaches: a linear time-varying
measurement model leading to the classical Kalman filter, a nonlinear measurement
model with first-order linearization approximations leading to the widely used extended
Kalman filter, and a nonlinear model with second-order approximations leading to the
second-order extended Kalman filter.
4.2.1. Linear Measurement Model and the Classical Kalman Filter Update
Let us briefly recap the linear Kalman filter. The measurement model is

y_i = H_i x_i + v_i,    (4.7)

where y_i ∈ ℝ^m are the m measurements at each time t_i, x_i ∈ ℝ^n is the n-dimensional state
vector, H_i ∈ ℝ^{m×n} is the known measurement mapping matrix, and v_i ∈ ℝ^m is modeled as a
all expectations are conditioned on past measurements, and we find that the state estimate
update is given by [20]

x̂_i^+ = x̂_i^− + K_i [y_i − h(x̂_i^−)].    (4.17)

Similarly, the measurement residual is given by

r_i = y_i − h(x̂_i^−) ≃ H_i e_i + v_i,    (4.18)

Computing the measurement residual covariance E[r_i r_i^T] yields

W_i = H_i P_i^− H_i^T + R_i.    (4.19)

The state estimation error covariance and Kalman gain are the same as in Eqs. (4.9) and
(4.10), respectively, with H_i given as in Eq. (4.15). The state estimation error covariances
in the forms shown in Eqs. (4.11) and (4.12) also hold in the nonlinear setting with H_i as
in Eq. (4.15).
From Eqs. (4.12) and (4.17), it is seen that reducing the Kalman gain leads to a smaller
update in both the state estimation error covariance and the state estimate, respectively.
Reducing the gain, and hence the update, is the essence of underweighting; the need for
this adjustment is illuminated in the following discussion.
Adopting the viewpoint that the state estimation error covariance matrix represents the
level of uncertainty in the state estimate, we expect that when we process a measurement
(adding new information) the uncertainty would decrease (or at least, not increase).
This is, in fact, the case and can be seen in Eq. (4.12). Under the assumption that the
symmetric matrices P_i^− > 0 and R_i > 0, it follows that

K_i [H_i P_i^− H_i^T + R_i] K_i^T ≥ 0,    (4.20)

and we can find a number α_i ≥ 0 at each time t_i such that

P_i^− − P_i^+ ≥ α_i I,    (4.21)

which shows that P_i^− − P_i^+ is non-negative definite. The same argument can be made
from the viewpoint of comparing the trace (or the matrix norm) of the a posteriori and
a priori state estimation error covariances. As each new measurement is processed by the
EKF, we expect the uncertainty in the estimation error to decrease. The question is, does
the a posteriori uncertainty as computed by the EKF represent the actual uncertainty, or
in other words, is the state estimation error covariance matrix always consistent with the
actual state errors? In the nonlinear setting, when there is a large a priori uncertainty in the
state estimate and a very accurate measurement, it can happen that the state estimation
error covariance reduction at the measurement update is too large. Underweighting is a
method to address this situation by limiting the magnitude of the state estimation error
covariance update, with the goal of retaining consistency of the filter covariance and the
actual state estimation error through situations of high nonlinearity of the measurements.
Pre- and post-multiplying the a posteriori state estimation error covariance in Eq. (4.12)
by H_i and H_i^T, respectively, yields (after some manipulation)

H_i P_i^+ H_i^T = H_i P_i^− H_i^T (H_i P_i^− H_i^T + R_i)^{−1} R_i.    (4.22)

In Eq. (4.22), we see that if H_i P_i^− H_i^T ≫ R_i, then it follows that

H_i P_i^+ H_i^T ≃ R_i.    (4.23)

The result in Eq. (4.23) is of fundamental importance and is the motivation behind un-
derweighting. What this equation expresses is the fact that when the a priori estimated
state uncertainty H_i P_i^− H_i^T is much larger than the measurement error covariance R_i, the
Kalman filter largely neglects the prior information and relies heavily on the measurement.
Therefore, the a posteriori estimated state uncertainty H_i P_i^+ H_i^T is approximately equal to
R_i.
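A quick scalar check of Eqs. (4.22)-(4.23), with assumed example values P^− = 100 and R = 1 (any values with P^− ≫ R behave the same way), shows the mapped posterior variance collapsing to nearly R:

```python
# Scalar case: H P- H^T = 100 >> R = 1, so Eq. (4.23) predicts H P+ H^T ~ R.
P_minus, H, R = 100.0, 1.0, 1.0
W = H * P_minus * H + R                      # residual variance, Eq. (4.19)
P_plus = P_minus - (P_minus * H) ** 2 / W    # scalar covariance update
HPH_plus = H * P_plus * H                    # equals P- R / (P- + R)
```

Here HPH_plus = 100/101 ≈ 0.99, within one percent of R, regardless of how nonlinear the true measurement actually was, which is exactly the risk the text describes.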
4.2.3. Nonlinear Measurement Model and the 2nd-Order Kalman Filter Update
Eq. (4.14) truncates the Taylor series expansion of the nonlinear measurement func-
tion to first order; carrying it to second order, we obtain

y_i ≃ h(x̂_i, t_i) + H_i e_i + (1/2) Σ_{k=1}^{n} e_i(k) (∂²h(x_i, t_i)/∂x_i ∂x_i(k))|_{x_i = x̂_i} e_i + v_i    (4.24)
    = h(x̂_i, t_i) + H_i e_i + b_i + v_i,    (4.25)

where e_i(k) and x_i(k) are the k-th elements of the vectors e_i and x_i, respectively. The expected
value of the measurement now includes contributions from the second-order terms, denoted
as b̂_i:

E[y_i] ≃ h(x̂_i, t_i) + b̂_i    (4.26)

Define

H_{i,k} ≜ (∂²h_k(x_i)/∂x_i ∂x_i^T)|_{x_i = x̂_i},

where h_k(x_i) is the k-th component of h(x_i). Then the k-th component of b_i is given by

b_{i,k} = (1/2)(e_i)^T H_{i,k} e_i = (1/2) tr(H_{i,k} e_i (e_i)^T),    (4.27)

where tr denotes the trace. To keep the filter unbiased, the k-th component of b̂_i is given by
b̂_{i,k} = (1/2) tr(H_{i,k} P_i^−).
The measurement residual is

r_i = y_i − E[y_i]    (4.28)

Expanding Eq. (4.28), the k-th component of the residual is obtained to be

r_{i,k} = h_{i,k}^T e_i + (1/2) tr(H_{i,k} e_i (e_i)^T) − (1/2) tr(H_{i,k} P_i^−) + v_{i,k},    (4.29)

where h_{i,k}^T is the k-th row of the measurement Jacobian and v_{i,k} is the k-th component of
the measurement noise v_i. Computing the measurement residual covariance E[r_i r_i^T] yields

W_i = H_i P_i^− H_i^T + B_i + R_i,    (4.30)

where the matrix B_i is the contribution of the second-order effects and its (kj)-th component
is given by

B_{i,kj} ≜ (1/4) E[tr(H_{i,k} e_i (e_i)^T) tr(H_{i,j} e_i (e_i)^T)] − (1/4) tr(H_{i,k} P_i^−) tr(H_{i,j} P_i^−),

where it was assumed that the third-order central moments are all zero. Assuming the
prior estimation error is distributed as a zero-mean Gaussian distribution with covariance
P_i^−, the (kj)-th component of B_i is given by

B_{i,kj} = (1/2) tr(H_{i,k} P_i^− H_{i,j} P_i^−).    (4.31)
Comparing the measurement residual covariance for the EKF in Eq. (4.19) with the
measurement residual covariance for the second-order filter in Eq. (4.30), we see that when
the nonlinearities lead to significant second-order terms which should not be neglected, the
EKF tends to provide residual covariance estimates that are not consistent with the
actual errors. Typically, we address this by tuning the EKF using R_i as a parameter to
be tweaked. If the contribution of the a priori estimation error H_i P_i^− H_i^T to the residual
covariance is much larger than the contribution of the measurement error R_i, the EKF
algorithm will produce H_i P_i^+ H_i^T ≃ R_i. If B_i is of comparable magnitude to R_i, then the
actual covariance of the posterior measurement estimate should be H_i P_i^+ H_i^T ≃ R_i + B_i.
Therefore, a large underestimation of the a posteriori covariance can occur in the presence
of nonlinearities when the a priori estimated measurement covariance is much larger than the
measurement error covariance.
The covariance update is given by the modified Gaussian second-order filter update [30]

P_i^+ = P_i^− − P_i^− H_i^T W_i^{−1} H_i P_i^−,    (4.32)

where the residual covariance W_i is given by Eq. (4.30).
4.3. Underweighting Measurements
In the prior section we saw that when "large" values of H_i P_i^− H_i^T exist (or similarly,
large values of P_i^−), and possibly "small" values of R_i, the EKF is at risk of underestimating
the posterior estimation error covariance matrix. We must repeat that this can only happen
in the presence of "large" nonlinearities. The larger P_i^−, the larger the domain of possible
values of the true state x, and hence the more likely the higher-order terms of the expansion of the
nonlinear measurement functions will become relevant. If a measurement function is highly
nonlinear, but the prior estimate is very precise, the EKF algorithm and linearization are
likely sufficiently accurate since:
(1) The posterior estimate will rely heavily on the prior and rely less on the
measurement.
(2) Since the error is small, while the Hessian matrix might be relatively large, the
actual contributions of the second-order effects are likely to remain small.
Underweighting is the process of modifying the residual covariance to reduce the update
and compensate for the second-order effects described above. In this section, we describe
three common methods for performing underweighting with the EKF algorithm.
4.3.1. Additive Compensation Method The most straightforward underweighting
scheme is to add an underweighting factor U_i as

W_i = H_i P_i^− H_i^T + R_i + U_i.    (4.33)

With the Kalman gain given by

K_i = P_i^− H_i^T W_i^{−1},    (4.34)

we see that the symmetric, positive-definite underweighting factor U_i decreases the Kalman
gain, thereby reducing the state estimate and state estimation error covariance updates.
One choice is to select U_i = B_i, which is the contribution to the covariance assuming
the prior distribution of the estimation error is Gaussian. The advantage of this choice is
its theoretical foundation based on analyzing the second-order terms of the Taylor series
expansions. The disadvantages include higher computational costs to calculate the second-
order partials and the reliance on the assumption that the estimation errors possess Gaussian
distributions. In practical applications, the matrix U_i needs to be tuned appropriately for
acceptable overall performance of the EKF. The process of tuning a positive-definite matrix
is less obvious than tuning a single scalar parameter.
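The additive scheme of Eqs. (4.33)-(4.34) is only a one-line change to the gain computation. The following sketch, with an illustrative function name and toy numbers, shows the gain reduction; U = 0 recovers the standard EKF gain.

```python
import numpy as np

def underweighted_gain(P, H, R, U):
    """Kalman gain with the additive underweighting factor U of
    Eqs. (4.33)-(4.34).  U is symmetric positive (semi-)definite;
    larger U means a smaller gain and a smaller covariance update."""
    W = H @ P @ H.T + R + U              # inflated residual covariance
    return P @ H.T @ np.linalg.inv(W)    # reduced Kalman gain
```

Choosing U = B_i ties the inflation directly to the second-order analysis above, at the cost of computing the Hessians H_{i,k}.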
4.3.3. Lear's Method Lear's choice was to make U_i a percentage of the a priori
estimation error covariance via [39]

U_i = β H_i P_i^− H_i^T.    (4.36)

Let P̄_i^− ∈ ℝ^{3×3} be the partition of the state estimation error covariance associated with
the position error states. The Space Shuttle employs underweighting when sqrt(tr P̄_i^−) > α.
The positive scalars α and β are design parameters. For the Space Shuttle, α is selected to
be 1000 meters and β is selected to be 0.2 [39]. When sqrt(tr P̄_i^−) > 1000 m, then β = 0.2;
otherwise β = 0.
Orion employs a slightly different approach: underweighting is applied when H_i P_i^− H_i^T >
α, where α is a tunable flight software parameter.
This choice of underweighting scheme is sound, since it assumes that the higher-order
effects are a fraction (or a multiple) of the first-order effects, which are a related quantity.
Some unusual nonlinear measurement cases, where the measurement Jacobian evaluates to
zero or a small value while the Hessian does not vanish, are not appropriately handled by
underweighting.
Running simulations pre-flight, the designer can calculate the time history of tr B_i / tr(H_i P_i^− H_i^T)
and choose an appropriate value of β. It is unlikely that terms of higher order than B_i will
need to be considered in designing the value of β.
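The Shuttle-style trigger can be sketched as follows. The function name is illustrative; the default α and β are the Shuttle design values quoted in the text, and P_pos stands for the 3×3 position partition P̄_i^−.

```python
import numpy as np

def lear_residual_covariance(P_pos, HPH, R, alpha=1000.0, beta=0.2):
    """Residual covariance with Lear underweighting, Eq. (4.36) with
    U = beta * H P- H^T: when the RMS position uncertainty
    sqrt(tr(P_pos)) exceeds alpha (meters), inflate the H P- H^T
    term by (1 + beta); otherwise use the standard EKF form."""
    if np.sqrt(np.trace(P_pos)) > alpha:
        return (1.0 + beta) * HPH + R
    return HPH + R
```

Because U_i scales with H_i P_i^− H_i^T itself, the inflation automatically fades as the filter converges, unlike a fixed additive term.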
CHAPTER 5
Bias Modeling
5.1.1. Random Constant The simplest type of systematic error is a constant bias
on the measurement. There are two types of such biases: deterministic constants, which
are truly constant for all time, and random constants, which are constant or very nearly so
over a particular time of interest. For example, each time a sensor is power-cycled, a bias
associated with it may change in value, but so long as the sensor remains powered on, the
bias will not change.
In some cases, we may have reason to believe that a particular systematic error source
truly is a deterministic bias, but due to limited observability, we do not have knowledge of
its true value. In such cases, we may view our estimate of the bias as a random constant,
and its variance as a measure of the imprecision of our knowledge.
Thus, we may view all constants that could be solve-for or consider parameters in orbit
determination as random constants. A model for a random constant is

ḃ(t) = 0,   b(t_o) ∼ N(0, p_{b_o}).    (5.2)

Thus the unconditional mean of b(t) is zero for all time, and its unconditional covariance is
constant for all time as well. If b(t) is a filter solve-for variable that is observable, then its
covariance conditioned on the measurement sequence will reach zero in the limit as t → ∞.
This is an undesirable characteristic for application in a sequential navigation filter.
To simulate a realization of the random constant, we need only generate a random
number according to N(0, p_{b_o}), as the previous subsection described.
5.1.2. Random Ramp The random ramp model assumes that the rate of change of
the bias is itself a random constant; thus the random ramp model is

b̈(t) = 0,   ḃ(t_o) ∼ N(0, p_{ḃ_o}).    (5.3)

Thus, the initial condition ḃ(t_o) is a random constant. For a pure random ramp, the initial
condition on b(t_o) and its covariance are taken to be zero, but an obvious and common
generalization is to allow b(t_o) to also be a random constant.
It is convenient to write this model as a first-order vector system as follows:

[ḃ(t); b̈(t)] = [ḃ(t); ḋ(t)] = [0 1; 0 0] [b(t); d(t)]    (5.4)
ẋ(t) = A x(t)    (5.5)

The resulting output equation for the total measurement error is

e = [1 0] x + v    (5.6)
  = H x + v    (5.7)

Note that the ensemble of realizations of x(t) has zero mean for all time. The unconditional
covariance evolves in time according to

P_x(t) = Φ(t − t_o) P_{x_o} Φ^T(t − t_o)    (5.8)

where

Φ(Δt) = [1 Δt; 0 1]   and   P_{x_o} = [p_{b_o} 0; 0 p_{ḃ_o}]    (5.9)

which we can also write in recursive form as

P_x(t + Δt) = Φ(Δt) P_x(t) Φ^T(Δt)    (5.10)

Thus, we can generate realizations of the random ramp with either x(t) ∼ N(0, P_x(t)) or
recursively from

x(t + Δt) = Φ(Δt) x(t)    (5.11)
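A realization of Eqs. (5.9)-(5.11) only requires one random draw for the initial state, since the propagation itself is deterministic. The following numpy sketch uses an illustrative function name and a fixed seed:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_random_ramp(p_bo, p_bdo, dt, n_steps):
    """One realization of the bias/drift random ramp: draw the
    random-constant initial state x(t_o) ~ N(0, diag(p_bo, p_bdo)),
    then propagate deterministically with Phi(dt), Eq. (5.11)."""
    Phi = np.array([[1.0, dt], [0.0, 1.0]])
    x = rng.multivariate_normal(np.zeros(2), np.diag([p_bo, p_bdo]))
    hist = [x]
    for _ in range(n_steps):
        x = Phi @ x
        hist.append(x)
    return np.array(hist)  # rows are [bias, drift] over time
```

In any realization the drift component stays constant and the bias grows linearly at that drift rate, which is a convenient check of the propagation.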
Note that the norm of the unconditional covariance becomes infinite as t becomes in-
finite. This could lead to an overflow of the representation of the covariance in a computer
program if the propagation time between measurements is large, if the bias is unobserv-
able, or if the bias is a consider parameter, and could also lead to the representation of
the covariance losing either its symmetry and/or its positive definiteness due to roundoff
and/or truncation. If the bias and drift are filter solve-for variables, then the norm of their
covariance conditioned on the measurement sequence will reach zero in the limit as t ! 1.
These are all undesirable characteristics for application in a sequential navigation filter.
5.1.3. Higher-Order Derivatives of Random Constants In principle, a random
constant may be associated with any derivative of the bias in a straightforward extension
of the models above. In practice, it is rare to need more than two or three derivatives.
Conventional terminology does not appear in the literature for derivatives of higher order
than the random ramp. The slope of the bias is most commonly described as the “bias
drift,” so that a “drift random ramp” would be one way to describe a bias whose second
derivative is a random constant. The measurement partials matrix needs to be accordingly
padded with trailing zeros for the derivatives of the bias in such cases.
5.2. Single-Input Bias State Models
The simplest non-constant systematic errors are systems with a single input that is a
random process. We can think of a random process as the result of some kind of limit in
which the intervals between an uncorrelated sequence of random variables get infinitesimally
small. In this limit, each random increment instantaneously perturbs the sequence, so that
the resulting process is continuous but non-differentiable. We call this kind of random
input "process noise."
Although such random processes are non-differentiable, there are various techniques
for generalizing the concept of integration so that something like integrals of the process
noise exist, and hence so do the differentials that appear under the integral signs. It turns
out that, so long as any coefficients of the process noise are non-random, these differentials
behave for all practical purposes as if they were differentiable.
5.2.1. Random Walk The random walk is the simplest random process of the type
described above. In terms of the "formal derivatives" mentioned above, the random walk
model for a measurement bias is

ḃ(t) = w(t),   w(t) ∼ N(0, q δ(t − s))    (5.12)

The input noise process on the right-hand side is known as "white noise," and the Dirac
delta function that appears in the expression for its variance indicates that the white noise
process consists of something like an infinitely tightly spaced set of impulses. The term q
that appears along with the delta function is the intensity of each impulse¹. The initial
condition b(t_o) is an unbiased random constant. Since b(t_o) and w(t) are zero-mean,
b(t) is also zero-mean for all time. The unconditional variance of b evolves in time according
to

p_b(t) = p_{b_o} + q(t − t_o)    (5.13)

which we can also write in recursive form as

p_b(t + Δt) = p_b(t) + q Δt    (5.14)

Thus, to generate a realization of the random walk at time t, we need only generate a
random number according to N(0, p_b(t)). Equivalently, we could also generate realizations
of ϖ(t) ∼ N(0, q Δt), and recursively add these discrete noise increments to the bias as
follows:

b(t + Δt) = b(t) + ϖ(t)    (5.15)

¹Another way to imagine the input sequence, in terms of a frequency-domain interpretation, is that
it is a noise process whose power spectral density, q, is non-zero at all frequencies, which implies infinite
bandwidth.
Note that the unconditional variance becomes infinite as t becomes infinite. This could
lead to an overflow of the representation of p_b if q is large in the following circumstances:
in a long gap between measurements, if the bias is unobservable, or if the bias is a consider
parameter. These are all somewhat undesirable characteristics for application in a sequential
navigation filter. However, because the process is persistently stimulated by the input, its
variance conditioned on a measurement history will remain positive for all time. Hence the
random walk finds frequent application in sequential navigation filters, particularly when
continuous measurements from which the bias is observable are generally available, as
often occurs for GPS data.
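The recursion of Eq. (5.15) can be sketched as below (illustrative function name, fixed seed). An ensemble of such realizations should exhibit the linear variance growth of Eq. (5.13).

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_random_walk(p_bo, q, dt, n_steps):
    """One realization of the bias random walk, Eq. (5.15): start from
    b(t_o) ~ N(0, p_bo) and add increments with variance q*dt."""
    b = rng.normal(0.0, np.sqrt(p_bo))
    hist = [b]
    for _ in range(n_steps):
        b += rng.normal(0.0, np.sqrt(q * dt))
        hist.append(b)
    return np.array(hist)
```

Sampling many realizations at a fixed time and comparing the sample variance against p_{b_o} + q(t − t_o) is a useful unit test for this kind of model code.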
5.2.2. Random Run The random run model assumes that the rate of change of the
bias is itself a random walk; thus the random run model is

b̈(t) = w(t),   w(t) ∼ N(0, q δ(t − s))    (5.16)

The initial condition ḃ(t_o) is a random constant. For a pure random run, the initial condition
on b(t_o) and its covariance are taken to be zero, but an obvious and common generalization
is to allow b(t_o) to also be a random constant.
It is convenient to write this model as a first-order vector system as follows:

[ḃ(t); b̈(t)] = [ḃ(t); ḋ(t)] = [0 1; 0 0] [b(t); d(t)] + [0; 1] w(t)    (5.17)
ẋ(t) = A x(t) + B w(t)    (5.18)

The measurement partial is the same as for the random ramp. The initial condition x(t_o)
is an unbiased random constant. Since x(t_o) and w(t) are zero-mean, x(t) is also
zero-mean for all time. The covariance evolves in time according to

P_x(t) = Φ(t − t_o) P_{x_o} Φ^T(t − t_o) + S(t − t_o)    (5.19)

where

Φ(t − t_o) = [1 (t − t_o); 0 1]   and   P_{x_o} = [p_{b_o} 0; 0 p_{ḃ_o}]    (5.20)

and

S(Δt) = q [Δt³/3 Δt²/2; Δt²/2 Δt]    (5.21)

which we can also write in recursive form as

P_x(t + Δt) = Φ(Δt) P_x(t) Φ^T(Δt) + S(Δt)    (5.22)

Thus, we can generate realizations of the random run with either x(t) ∼ N(0, P_x(t)) or
recursively from

x(t + Δt) = Φ(Δt) x(t) + ϖ(t)    (5.23)

where ϖ(t) ∼ N(0, S(Δt)) is a noise sample vector arising from formal integration of the
scalar noise input process over the sample time. A Cholesky decomposition of S(Δt) useful
for sampling is

S^C(Δt) = √q [√(Δt³/3) 0; √(3Δt)/2 √(Δt)/2]    (5.24)

Note that the norm of the unconditional covariance becomes infinite as t³ becomes
infinite, and the process is persistently stimulated by the input, so its covariance conditioned
on a measurement history will remain positive definite for all time. Hence, the random run
shares similar considerations with the random walk for application in sequential navigation
filters.
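The Cholesky factor of Eq. (5.24) makes sampling the correlated noise vector of Eq. (5.23) trivial: multiply it into a pair of independent standard normals. A sketch, with illustrative function names:

```python
import numpy as np

rng = np.random.default_rng(3)

def random_run_noise_chol(q, dt):
    """Cholesky factor S^C(dt) of the random-run process noise,
    Eq. (5.24), so that S^C @ S^C.T = S(dt) of Eq. (5.21)."""
    return np.sqrt(q) * np.array([[np.sqrt(dt**3 / 3.0), 0.0],
                                  [np.sqrt(3.0 * dt) / 2.0, np.sqrt(dt) / 2.0]])

def step_random_run(x, q, dt):
    """One recursion of Eq. (5.23): x <- Phi(dt) x + w, w ~ N(0, S(dt)),
    sampled via the Cholesky factor and two independent unit normals."""
    Phi = np.array([[1.0, dt], [0.0, 1.0]])
    return Phi @ x + random_run_noise_chol(q, dt) @ rng.standard_normal(2)
```

Verifying S^C (S^C)^T against Eq. (5.21) is a quick check that the factor was transcribed correctly.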
²One sometimes sees τ described as the "half-life," but since 1/e < 1/2, this is not an accurate label.
and

Q^{(1,1)}(Δt) = q/(4ζω_n³) [1 − (e^{−2ζω_n Δt}/ω_d²)(ω_d² + 2ζω_n ω_d cos(ω_d Δt) sin(ω_d Δt) + 2ζ²ω_n² sin²(ω_d Δt))]    (5.40)

Q^{(2,2)}(Δt) = q/(4ζω_n) [1 − (e^{−2ζω_n Δt}/ω_d²)(ω_d² − 2ζω_n ω_d cos(ω_d Δt) sin(ω_d Δt) + 2ζ²ω_n² sin²(ω_d Δt))]    (5.41)

Q^{(1,2)}(Δt) = q/(2ω_d²) e^{−2ζω_n Δt} sin²(ω_d Δt),    (5.42)

Q^{(2,1)}(Δt) = Q^{(1,2)}(Δt)    (5.43)

where ω_d = ω_n √(1 − ζ²). In the over-damped case (ζ > 1), replace sin and cos with sinh
and cosh, respectively. In the critically-damped case,

Φ(Δt) = e^{−ω_n Δt} [1 + ω_n Δt   Δt; −ω_n² Δt   1 − ω_n Δt]    (5.44)

and

Q^{(1,1)}(Δt) = q/(4ω_n³) [1 − e^{−2ω_n Δt}(1 + 2ω_n Δt + 2ω_n² Δt²)]    (5.45)

Q^{(2,2)}(Δt) = q/(4ω_n) [1 − e^{−2ω_n Δt}(1 − 2ω_n Δt + 2ω_n² Δt²)]    (5.46)

Q^{(2,1)}(Δt) = Q^{(1,2)}(Δt) = (q Δt²/2) e^{−2ω_n Δt}    (5.47)

Note that for any damping ratio, ‖P_x‖ remains finite, since as t → ∞,

P_x(t → ∞) = q/(4ζω_n) [1/ω_n² 0; 0 1].    (5.48)

Thus, the ratio of the steady-state standard deviations of the bias and drift will be

σ_d/σ_b = ω_n,    (5.49)

and these are related to the power spectral density by

q = 4ζ σ_d³/σ_b.    (5.50)

Hence, we can choose the parameters of the SOGM so that we avoid any overflow, or loss of
symmetry and/or positive definiteness of P_x due to roundoff and/or truncation. For these
reasons, the SOGM is recommended as a best practice for bias drift modeling in sequential
navigation filters.
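Eqs. (5.48)-(5.50) give a direct tuning recipe: pick the desired steady-state bias and drift standard deviations and a damping ratio, then solve for ω_n and q. A sketch with illustrative function names:

```python
import numpy as np

def tune_sogm(sigma_b, sigma_d, zeta=0.7071):
    """Choose SOGM parameters from desired steady-state bias and drift
    standard deviations: omega_n = sigma_d / sigma_b (Eq. 5.49) and
    q = 4 zeta sigma_d^3 / sigma_b (Eq. 5.50)."""
    omega_n = sigma_d / sigma_b
    q = 4.0 * zeta * sigma_d**3 / sigma_b
    return omega_n, q

def sogm_steady_state_cov(q, zeta, omega_n):
    """Steady-state covariance of Eq. (5.48)."""
    return q / (4.0 * zeta * omega_n) * np.array([[1.0 / omega_n**2, 0.0],
                                                  [0.0, 1.0]])
```

Round-tripping through Eq. (5.48) recovers the requested standard deviations exactly, which confirms the algebra.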
where, as previously, b(t_o) ∼ N(0, p_{b_o}), and w(t) ∼ N(0, q δ(t − s)). A formal solution to
(5.51) gives

b(t) = b(t_o) e^{−(t−t_o)/τ} + b_∞ (1 − e^{−(t−t_o)/τ}) + ∫_{t_o}^{t} e^{−(t−s)/τ} w(s) ds    (5.52)

Since b(t_o) and w(t) are zero-mean,

E[b(t)] = b_∞ (1 − e^{−(t−t_o)/τ})    (5.53)

and E[b(t)] → b_∞ as t → ∞. Since E[b(t)]² is subtracted from E[b(t)²] to get the covariance,
the covariance evolves in time identically to the FOGM,

p_b(t) = e^{−2(t−t_o)/τ} p_{b_o} + s(t − t_o)    (5.54)

where, as before,

s(t − t_o) = (qτ/2)(1 − e^{−2(t−t_o)/τ})    (5.55)

Thus, to generate a realization of the Vasicek model at a particular time t, we could generate
a realization of the initial bias value, and then at each sample time generate realizations of
ϖ(t) ∼ N(0, s(Δt)), and recursively add these discrete noise sample increments to the bias
sample history as follows:

b(t + Δt) = b(t) e^{−Δt/τ} + b_∞ (1 − e^{−Δt/τ}) + ϖ(t)    (5.56)

or we could generate a random realization of N(0, p_b(t)) and add this to b_∞ (1 − e^{−(t−t_o)/τ}).
To configure or “tune” the Vasicek model, one chooses the time constant ⌧ and the
noise PSD q in a manner analogous to the FOGM process; it is less clear how one might
choose b1 . Seago et al. [66] proposed that b1 be estimated as a random constant filter
state. Casting the Vasicek model into such a two-state form results in the following model:
ḃ(t) 1/⌧ 1/⌧ b(t) w(t)
= + , (5.57)
ḃ1 0 0 b1 0
which leads to the following state transition matrix and process noise covariance:
" ⇣ 2t
⌘ #
q⌧
e t/⌧ 1 e t/⌧ 2 1 e ⌧ 0
(t) = , S(t) = . (5.58)
0 1 0 0
While the Vasicek model shares with the FOGM the desirable feature that p_b \to q\tau/2
as t \to \infty, in the two-state form just described it also has the undesirable feature that the
variance of b_\infty goes to zero as t \to \infty. Modeling b_\infty with process noise, e.g. as a random
walk with PSD of q_1, introduces an unstable integral of the process noise as occurs for the
integrated FOGM:
S(t) = \begin{bmatrix} \frac{q\tau}{2}\left(1 - e^{-2t/\tau}\right) + q_1\left(t - \frac{3\tau}{2} + 2\tau e^{-t/\tau} - \frac{\tau}{2}e^{-2t/\tau}\right) & q_1 t - q_1\tau\left(1 - e^{-t/\tau}\right) \\ q_1 t - q_1\tau\left(1 - e^{-t/\tau}\right) & q_1 t \end{bmatrix}, \qquad (5.59)
although choosing q_1 appropriately small may mitigate this concern. In any case, retaining
a steady-state bias across long data gaps may not always be warranted, depending on the
context; and if long measurement gaps are not present, such a bias, with the accompanying
complexity of maintaining an additional state, may not be necessary. We will consider
further such multi-input bias models in the sequel.
5.3. Multi-Input Bias State Models
5.3.1. Bias and Drift Random Walks (Random Walk + Random Run). A
common model for biases in clocks, gyros, and accelerometers is that the bias is driven
both by its own white noise input and by the integral of the white noise of its drift.
Such models derive from observations that the error magnitudes of these devices depend
on the time scale over which the device is observed. They are often characterized by Allan
deviation specifications, which may be heuristically associated with the white noise power
spectral densities. The model is as follows:
\begin{bmatrix} \dot b(t) \\ \dot d(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} b(t) \\ d(t) \end{bmatrix} + \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} w_b(t) \\ w_d(t) \end{bmatrix} \qquad (5.60)
ẋ(t) = Ax(t) + Bw(t) (5.61)
The measurement partial is the same as for the random ramp. The initial condition x(to )
is an unbiased random constant. Since x(to ) and w(t) are zero-mean, then x(t) is also
zero-mean for all time. The covariance evolves in time according to
P_x(t) = \Phi(t-t_o)\, P_{x_o}\, \Phi^T(t-t_o) + S(t-t_o) \qquad (5.62)
where
\Phi(t) = \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad P_{x_o} = \begin{bmatrix} p_{b_o} & 0 \\ 0 & p_{d_o} \end{bmatrix} \qquad (5.63)
and
S(t) = \begin{bmatrix} q_b t + q_d t^3/3 & q_d t^2/2 \\ q_d t^2/2 & q_d t \end{bmatrix} \qquad (5.64)
which we can also write in recursive form as
P_x(t+\Delta t) = \Phi(\Delta t)\, P_x(t)\, \Phi^T(\Delta t) + S(\Delta t) \qquad (5.65)
Thus, we can generate realizations of the random run with either x(t) \sim N(0, P_x(t)) or
recursively from
x(t+\Delta t) = \Phi(\Delta t)\, x(t) + \varpi(t) \qquad (5.66)
where \varpi(t) \sim N(0, S(\Delta t)). Note that an upper-triangular Cholesky factor C of S(t), such that S(t) = C C^T, is
C = \begin{bmatrix} \sqrt{q_b t + q_d t^3/12} & \sqrt{q_d t^3}/2 \\ 0 & \sqrt{q_d t} \end{bmatrix} \qquad (5.67)
As with its constituent models, the norm of the unconditional covariance becomes infinite
as t^3 becomes infinite; but because the process is persistently stimulated by the input, its
covariance conditioned on a measurement history will remain positive definite for all time.
Hence, this model shares similar considerations with its constituents for application in
sequential navigation filters.
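The propagation and realization machinery of Eqs. (5.66)-(5.67) can be sketched as follows (Python; function name hypothetical). The closed-form Cholesky factor avoids a numerical decomposition at every step:

```python
import numpy as np

def rw_rr_realization(x0, qb, qd, dt, n_steps, rng=None):
    """Simulate the random walk + random run model via Eq. (5.66),
    x_{k+1} = Phi(dt) x_k + w_k, w_k ~ N(0, S(dt)), drawing w_k with
    the closed-form Cholesky factor of Eq. (5.67)."""
    rng = np.random.default_rng() if rng is None else rng
    phi = np.array([[1.0, dt], [0.0, 1.0]])
    # upper-triangular C with C C^T = S(dt)
    C = np.array([[np.sqrt(qb*dt + qd*dt**3/12.0), np.sqrt(qd*dt**3)/2.0],
                  [0.0, np.sqrt(qd*dt)]])
    x = np.empty((n_steps + 1, 2))
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = phi @ x[k] + C @ rng.standard_normal(2)
    return x
```

Multiplying the factor out reproduces S(Δt) of Eq. (5.64) term by term, which is an easy unit-test for an implementation.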
50 5. BIAS MODELING
5.3.2. Bias, Drift, and Drift Rate Random Walks (Random Walk + Random
Run + Random Zoom). Another model for biases in very-high-precision clocks, gyros,
and accelerometers is that the bias is driven by two integrals of white noise in addition
to its own white noise input. Such models are often characterized by Hadamard deviation
specifications, which may be heuristically associated with the white noise power spectral
densities. The model is as follows:
\begin{bmatrix} \dot b(t) \\ \dot d(t) \\ \ddot d(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} b(t) \\ d(t) \\ \dot d(t) \end{bmatrix} + \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} w_b(t) \\ w_d(t) \\ w_{\dot d}(t) \end{bmatrix} \qquad (5.68)
ẋ(t) = Ax(t) + Bw(t) (5.69)
The resulting output equation is
e = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} x + v \qquad (5.70)
  = H x + v \qquad (5.71)
The initial condition x(to ) is an unbiased random constant. Since x(to ) and w(t) are zero-
mean, then x(t) is also zero-mean for all time. The covariance evolves in time according
to
P_x(t) = \Phi(t-t_o)\, P_{x_o}\, \Phi^T(t-t_o) + S(t-t_o) \qquad (5.72)
where
\Phi(t) = \begin{bmatrix} 1 & t & t^2/2 \\ 0 & 1 & t \\ 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad P_{x_o} = \begin{bmatrix} p_{b_o} & 0 & 0 \\ 0 & p_{d_o} & 0 \\ 0 & 0 & p_{\dot d_o} \end{bmatrix} \qquad (5.73)
and
S(t) = \begin{bmatrix} q_b t + q_d t^3/3 + q_{\dot d}\, t^5/20 & q_d t^2/2 + q_{\dot d}\, t^4/8 & q_{\dot d}\, t^3/6 \\ q_d t^2/2 + q_{\dot d}\, t^4/8 & q_d t + q_{\dot d}\, t^3/3 & q_{\dot d}\, t^2/2 \\ q_{\dot d}\, t^3/6 & q_{\dot d}\, t^2/2 & q_{\dot d}\, t \end{bmatrix} \qquad (5.74)
which we can also write in recursive form as
P_x(t+\Delta t) = \Phi(\Delta t)\, P_x(t)\, \Phi^T(\Delta t) + S(\Delta t) \qquad (5.75)
Thus, we can generate realizations of the process with either x(t) \sim N(0, P_x(t)) or
recursively from
x(t+\Delta t) = \Phi(\Delta t)\, x(t) + \varpi(t) \qquad (5.76)
where \varpi(t) \sim N(0, S(\Delta t)). Note that an upper-triangular Cholesky factor C of S(t), such that S(t) = C C^T, is
C = \begin{bmatrix} \sqrt{q_b t + q_d t^3/12 + q_{\dot d}\, t^5/720} & (t/2)\sqrt{q_d t + q_{\dot d}\, t^3/12} & (t^2/6)\sqrt{q_{\dot d}\, t} \\ 0 & \sqrt{q_d t + q_{\dot d}\, t^3/12} & (t/2)\sqrt{q_{\dot d}\, t} \\ 0 & 0 & \sqrt{q_{\dot d}\, t} \end{bmatrix} \qquad (5.77)
Similar to its constituent models, the norm of the unconditional covariance becomes
infinite as t^5 becomes infinite; but because the process is persistently stimulated by the input,
its covariance conditioned on a measurement history will remain positive definite for all time.
Hence, this model shares similar considerations with its constituents for application in
sequential navigation filters.
5.3.3. Bias and Drift Coupled First- and Second-Order Gauss-Markov. The
following model provides a stable alternative, developed in Reference 10, to the “Random
Walk + Random Run” model. Note that the following description corrects a sign error in
the process noise cross-covariance results of the cited work. The transient response of the
stable alternative can be tuned to approximate the Random Walk + Random Run model,
and its stable steady-state response can be used to avoid computational issues with long
propagation times, observability, consider states, etc. Although this model has received
limited application as of the time of this writing, due to its stability it shows promising
potential to evolve into a best practice for sequential navigation filtering applications.
The coupled first- and second-order Gauss-Markov model is as follows:
\begin{bmatrix} \dot b(t) \\ \dot d(t) \end{bmatrix} = \begin{bmatrix} -1/\tau & 1 \\ -\omega_n^2 & -2\zeta\omega_n \end{bmatrix} \begin{bmatrix} b(t) \\ d(t) \end{bmatrix} + \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} w_b(t) \\ w_d(t) \end{bmatrix} \qquad (5.78)
ẋ(t) = Ax(t) + Bw(t) (5.79)
The measurement partial is the same as for the random ramp. The initial condition x(to )
is an unbiased random constant. Since x(to ) and w(t) are zero-mean, then x(t) is also
zero-mean for all time. The covariance evolves in time according to
P_x(t) = \Phi(t-t_o)\, P_{x_o}\, \Phi^T(t-t_o) + S(t-t_o) \qquad (5.80)
where
\Phi(t) = \frac{e^{-\eta t}}{\nu} \begin{bmatrix} \nu\cos\nu t + (2\zeta\omega_n - \eta)\sin\nu t & \sin\nu t \\ -\omega_n^2 \sin\nu t & \nu\cos\nu t + (\beta - \eta)\sin\nu t \end{bmatrix} \qquad (5.81)
with
\beta = 1/\tau, \qquad (5.82)
\eta = \frac{1}{2}\left(\beta + 2\zeta\omega_n\right), \qquad (5.83)
\nu = \sqrt{\omega_d^2 + \beta\zeta\omega_n - \frac{1}{4}\beta^2}, \qquad (5.84)
\omega_d = \omega_n\sqrt{1 - \zeta^2}. \qquad (5.85)
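Because the closed form of Eq. (5.81) is easy to get wrong (several minus signs are load-bearing), a quick numerical check against the matrix exponential of the dynamics matrix A of Eq. (5.78) is worthwhile. The sketch below uses a truncated Taylor series for the exponential to stay self-contained; function names are hypothetical:

```python
import numpy as np

def sogm_stm(t, tau, zeta, wn):
    """Closed-form state transition matrix of the coupled first/second-
    order Gauss-Markov model, Eq. (5.81), with the auxiliary constants
    of Eqs. (5.82)-(5.85)."""
    beta = 1.0 / tau
    eta = 0.5 * (beta + 2.0 * zeta * wn)
    nu = np.sqrt(wn**2 * (1.0 - zeta**2) + beta * zeta * wn - beta**2 / 4.0)
    s, c = np.sin(nu * t), np.cos(nu * t)
    return (np.exp(-eta * t) / nu) * np.array(
        [[nu * c + (2.0 * zeta * wn - eta) * s, s],
         [-wn**2 * s, nu * c + (beta - eta) * s]])

def expm_series(A, terms=30):
    """Truncated Taylor series for the matrix exponential; adequate for
    the small, well-scaled arguments used in this check."""
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out
```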
CHAPTER 6
State Representations
This chapter discusses state representations, primarily for translational states; attitude representations are discussed in Chapter 8.
6.5.1. Dual Inertial State Representation. Here, we consider only two spacecraft,
but the results are easily generalized. Let x_i = [r_i^T, v_i^T]^T, i = 1, 2 denote the true state of
spacecraft i, with r_i, v_i the position and velocity vectors expressed in non-rotating coordinates
centered on the primary central gravitational body. Based on mission requirements,
any appropriate fidelity of dynamics may be directly utilized, e.g.
\dot x_i = \begin{bmatrix} v_i \\ -\mu \dfrac{r_i}{\|r_i\|^3} + \sum_j f_j \end{bmatrix} \qquad (6.2)
where the specific forces f_j may include thrust, higher-order gravity, drag, solar radiation
pressure, gravity from non-central bodies such as the moon and the sun, etc.
Let e_i = \hat x_i - x_i, where \hat x_i is an estimate for the state of spacecraft i. Then, the error
in the state estimate \hat x = [\hat x_1^T, \hat x_2^T]^T is e = [e_1^T, e_2^T]^T, and the error covariance is
P = E[e e^T] = \begin{bmatrix} P_1 & P_{12} \\ P_{12}^T & P_2 \end{bmatrix} \qquad (6.3)
Any linear unbiased estimate of x will have the following measurement update equation:
\hat x^+ = \hat x^- + K\left(y - h(\hat x^-)\right) \qquad (6.4)
where \hat x^- is the value of \hat x immediately prior to incorporating the observation, y, and h(\hat x^-)
is an unbiased prediction of the measurement’s value. The optimal gain is
K = P^- H^T \left(H P^- H^T + R\right)^{-1} \qquad (6.5)
where R is the measurement noise covariance and H = \partial h(x)/\partial x \big|_{\hat x^-}. Partition the update
as follows:
\begin{bmatrix} \hat x_1^+ \\ \hat x_2^+ \end{bmatrix} = \begin{bmatrix} \hat x_1^- \\ \hat x_2^- \end{bmatrix} + \begin{bmatrix} K_1 \\ K_2 \end{bmatrix}\left(y - h(\hat x^-)\right) \qquad (6.6)
= \begin{bmatrix} \hat x_1^- \\ \hat x_2^- \end{bmatrix} + \begin{bmatrix} P_1 H_1^T + P_{12} H_2^T \\ P_{12}^T H_1^T + P_2 H_2^T \end{bmatrix} \left(H P^- H^T + R\right)^{-1} \left(y - h(\hat x^-)\right) \qquad (6.7)
from which it is clear that the optimal update for the relative state \hat x_{rel} = \hat x_2 - \hat x_1 is
\hat x_{rel}^+ = \hat x_{rel}^- + \left(P_2 H_2^T - P_1 H_1^T - P_{12} H_2^T + P_{12}^T H_1^T\right)\left(H P^- H^T + R\right)^{-1}\left(y - h(\hat x^-)\right) \qquad (6.8)
with corresponding relative error covariance
P_{rel} = P_1 + P_2 - P_{12} - P_{12}^T \qquad (6.9)
Noting that it must be true that h(x_{rel}) = h(x), and hence that \partial h(x_{rel})/\partial x_{rel} = \partial h(x_2)/\partial x_2 =
-\partial h(x_1)/\partial x_1, let H_{rel} = H_2 = -H_1. Then it is clear that
P_{rel} H_{rel}^T = P_2 H_2^T - P_1 H_1^T - P_{12} H_2^T + P_{12}^T H_1^T \qquad (6.10)
and that
H_{rel} P_{rel} H_{rel}^T = H P^- H^T \qquad (6.11)
and hence
\hat x_{rel}^+ = \hat x_{rel}^- + P_{rel} H_{rel}^T \left(H_{rel} P_{rel} H_{rel}^T + R\right)^{-1}\left(y - h(\hat x_{rel}^-)\right) \qquad (6.12)
Therefore, the dual inertial state update is mathematically (although perhaps not compu-
tationally) equivalent to a direct update of the relative state.
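The equivalence of Eqs. (6.8) and (6.12) is easy to demonstrate numerically. The sketch below builds a hypothetical joint covariance with a nonzero cross-covariance and compares the two gain computations (all dimensions and values are illustrative, not from the report):

```python
import numpy as np

# Numerical check of the dual-inertial / relative-state equivalence,
# Eqs. (6.5)-(6.12); setup is hypothetical.
rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((2 * n, 2 * n))
P = A @ A.T                          # joint covariance with nonzero P12
P1, P2, P12 = P[:n, :n], P[n:, n:], P[:n, n:]
Hrel = rng.standard_normal((1, n))   # relative-measurement partial
H = np.hstack([-Hrel, Hrel])         # h depends only on x2 - x1
R = np.array([[0.25]])

# full-state (dual inertial) gain, Eq. (6.5); its relative part is K2 - K1
K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
K_rel_dual = K[n:] - K[:n]

# direct relative-state gain, Eq. (6.12)
P_rel = P1 + P2 - P12 - P12.T
K_rel = P_rel @ Hrel.T @ np.linalg.inv(Hrel @ P_rel @ Hrel.T + R)
```

The two gains agree to machine precision, as does the innovation covariance of Eq. (6.11).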
Appendix C reproduces a memorandum that further details the benefits of the dual
inertial formulation.
6.5. Relative State Representations
Consider a spherical coordinate state x = [\rho, \dot\rho, \theta, \dot\theta, \phi, \dot\phi]^T, where \rho is the radius from
the central body, \theta is the angle traversed in a reference great-circle plane, and \phi is the
out-of-plane angle. If the only force acting on the spacecraft is point-mass gravity from the
central body, then the equations of motion are given by
\dot x = f(x) = \begin{bmatrix} \dot\rho \\ -\mu/\rho^2 + \rho\dot\phi^2 + \rho\dot\theta^2\cos^2\phi \\ \dot\theta \\ -2\dot\rho\dot\theta/\rho + 2\dot\phi\dot\theta\tan\phi \\ \dot\phi \\ -2\dot\rho\dot\phi/\rho - \dot\theta^2\cos\phi\sin\phi \end{bmatrix} \qquad (6.14)
Now consider a circular reference orbit, with radius \rho_\star, which is in the plane of the great
circle containing the \theta coordinate. Let \omega_\star = \sqrt{\mu/\rho_\star^3}. Then, the state of an object following
the circular reference orbit at any time t > t_o will be x_\star(t) = [\rho_\star, 0, \omega_\star(t-t_o) + \theta_o, \omega_\star, 0, 0]^T.
Without loss of generality, take \theta_o = t_o = 0. Letting \delta x = x - x_\star, linearization of (6.14) in
the neighborhood of x_\star yields
\delta\dot x(t) = \left.\frac{\partial f(x)}{\partial x}\right|_{x_\star(t)} \delta x(t) = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ 3\omega_\star^2 & 0 & 0 & 2\omega_\star\rho_\star & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & -2\omega_\star/\rho_\star & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & -\omega_\star^2 & 0 \end{bmatrix} \delta x(t) \qquad (6.15)
6.6.1. The Gyro Model. The gyro is modeled in terms of its bias, scale factor, and
non-orthogonality. The IMU case frame is defined such that the x-axis of the gyro is
the reference direction, with the x-y plane being the reference plane; the y- and z-axes
are not mounted perfectly orthogonal to it (this is why we do not have a full misalignment/non-orthogonality
matrix, as we will in the accelerometer model). The errors in
determining these misalignments are the so-called non-orthogonality errors, expressed as a
matrix \Gamma, as
\Gamma = \begin{bmatrix} 0 & 0 & 0 \\ \gamma_{yx} & 0 & 0 \\ \gamma_{zx} & \gamma_{zy} & 0 \end{bmatrix}
The gyro scale factor represents the error in conversion from raw sensor outputs (gyro
digitizer pulses) to useful units. In general we model the scale-factor error as a first-order
Markov (or Gauss-Markov) process in terms of a diagonal matrix given as
S_g = \begin{bmatrix} s_x^g & 0 & 0 \\ 0 & s_y^g & 0 \\ 0 & 0 & s_z^g \end{bmatrix}
Similarly, the gyro bias errors are modeled as first-order vector Gauss-Markov processes
as
b_g = \begin{bmatrix} b_x^g \\ b_y^g \\ b_z^g \end{bmatrix}
6.6. Modeling Inertial Components
6.6.2. The Accumulated \Delta\theta. In order to find the accumulated angle (not as a function
of the measurement, but purely as a function of the true angular velocity), we define
\Delta\theta as
\Delta\theta_m\left(C_k^{C_{k-1}}\right) = \int_{t_{k-1}}^{t_k} \left[\omega_m^C(\tau) + \frac{1}{2}\, \phi^C(\tau)\Big|_{C_{ref}} \times \omega_m^C(\tau)\right] d\tau \qquad (6.19)
= \int_{t_{k-1}}^{t_k} \left[\omega_m^C(\tau) + \frac{1}{2}\left(\int_{t_{k-1}}^{\tau} \dot\phi^C(\lambda)\, d\lambda\right)\Bigg|_{C_{ref}} \times \omega_m^C(\tau)\right] d\tau \qquad (6.20)
= \int_{t_{k-1}}^{t_k} \left[\omega_m^C(\tau) + \frac{1}{2}\left(\int_{t_{k-1}}^{\tau} \left(\omega_m^C(\lambda) + \frac{1}{2}\, \phi^C(\lambda) \times \omega_m^C(\lambda)\right) d\lambda\right) \times \omega_m^C(\tau)\right] d\tau
Ignoring second-order terms, we get
\Delta\theta_m\left(C_k^{C_{k-1}}\right) = \int_{t_{k-1}}^{t_k} \left[\omega_m^C(\tau) + \frac{1}{2}\left(\int_{t_{k-1}}^{\tau} \omega_m^C(\lambda)\, d\lambda\right) \times \omega_m^C(\tau)\right] d\tau \qquad (6.21)
With this expression, we find that, by analogy, we can express \Delta\theta\left(C_k^{C_{k-1}}\right) in terms of the true angular velocity as
\Delta\theta\left(C_k^{C_{k-1}}\right) = \int_{t_{k-1}}^{t_k} \left[\omega^C(\tau) + \frac{1}{2}\left(\int_{t_{k-1}}^{\tau} \omega^C(\lambda)\, d\lambda\right) \times \omega^C(\tau)\right] d\tau \qquad (6.22)
6.6.3. The Accelerometer Model. The accelerometer package will likely be misaligned
relative to the IMU reference frame, because the three accelerometers (contained
in the accelerometer package) are not mounted orthogonal to each other; these errors are
expressed in terms of six different small angles as:
\Xi_a = \begin{bmatrix} 0 & \xi_{xy}^a & \xi_{xz}^a \\ \xi_{yx}^a & 0 & \xi_{yz}^a \\ \xi_{zx}^a & \xi_{zy}^a & 0 \end{bmatrix}
Similar to the gyros, the accelerometer scale factor represents the error in conversion
from raw sensor outputs (accelerometer digitizer pulses) to useful units. In general we model
the scale-factor error as a first-order (Gauss-) Markov process in terms of a diagonal matrix
given as
S_a = \begin{bmatrix} s_x^a & 0 & 0 \\ 0 & s_y^a & 0 \\ 0 & 0 & s_z^a \end{bmatrix}
Similarly, the bias errors are modeled as first-order Gauss-Markov processes as
b_a = \begin{bmatrix} b_x^a \\ b_y^a \\ b_z^a \end{bmatrix}
So, the accelerometer measurements, a_m^C, are modeled as:
a_m^C = (I_3 + \Xi_a)(I_3 + S_a)\, a^C + b_a + \nu_a \qquad (6.23)
where I_3 is a 3 \times 3 identity matrix, the superscript C indicates that this is an inertial
measurement at the ‘box level’ expressed in case-frame coordinates, and a^C is the ‘true’
non-gravitational acceleration in the case frame. The quantity \nu_a is a zero-mean white
sequence on acceleration that integrates into a velocity random walk; it is the ‘noise’ on
the accelerometer output. If we assume that the errors are small, then to first order
(I_3 + \Xi_a)(I_3 + S_a) \approx I_3 + \Xi_a + S_a
So, the linear accelerometer measurements (in the case frame) are:
a_m^C = (I_3 + \Xi_a + S_a)\, a^C + b_a + \nu_a \qquad (6.24)
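The linearized error model of Eq. (6.24) is a one-liner to apply in simulation. The sketch below (function name hypothetical) is useful for generating synthetic accelerometer measurements when testing a filter:

```python
import numpy as np

def apply_accel_errors(a_true, xi, s, b, noise):
    """Apply the linearized accelerometer error model of Eq. (6.24):
    a_m = (I + Xi_a + S_a) a + b_a + nu_a.
    xi: 3x3 misalignment matrix (zero diagonal); s: 3-vector of
    scale-factor errors; b: 3-vector of biases; noise: 3-vector sample."""
    return (np.eye(3) + xi + np.diag(s)) @ a_true + b + noise
```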
6.6.4. Accumulated \Delta v. We note that the measured \Delta v in the case frame, \Delta v_m^C, is
mapped to the end of its corresponding time interval by the sculling algorithm within the
IMU firmware, so that we can write
\Delta v_m^{C_k} = \int_{t_{k-1}}^{t_k} T_{C(t)}^{C_k}\, a_m^{C(t)}\, dt \qquad (6.25)
where \Delta v_m^{C_k} covers the time interval from t_{k-1} to t_k (t_k > t_{k-1}) and C(t) is the instantaneous
case frame§. We recall that a transformation matrix can be written in terms of the
§Or equivalently,
\Delta v_m^{B_k} = \int_{t_{k-1}}^{t_k} T_{B(t)}^{B_k}\, a_m^{B(t)}\, dt \qquad (6.26)
But since T_{B(t)}^{B_k} \approx I_3 - \left[\phi_{B(t)}^{B_k} \times\right], we find
\Delta v_m^{B_k} = \int_{t_{k-1}}^{t_k} \left(I_3 - \left[\phi_{B(t)}^{B_k} \times\right]\right) a_m^{B(t)}\, dt \qquad (6.27)
Euler axis/angle as
T(\phi) = \cos\phi\, I - \frac{\sin\phi}{\phi}\left[\phi\times\right] + \frac{1-\cos\phi}{\phi^2}\, \phi\phi^T \qquad (6.28)
= I - \frac{\sin\phi}{\phi}\left[\phi\times\right] + \frac{1-\cos\phi}{\phi^2}\left[\phi\times\right]\left[\phi\times\right] \qquad (6.29)
Finally, the accelerometer noise \nu_a, a zero-mean white process, integrates to
\int_{t_{k-1}}^{t_k} \nu_a\, dt = u_a \qquad (6.35)
where u_a is a random vector with covariance equal to the accelerometer noise spectral
density multiplied by (t_k - t_{k-1}). So, Eq. (6.32) becomes
\Delta v_m^{C_k} = \left[I_3 + \Xi_a + S_a\right] \Delta v^{C_k} + b_a \Delta t + u_a \qquad (6.36)
Since we have established that \left[I_3 + \Xi_a + S_a\right]^{-1} \approx \left[I_3 - \Xi_a - S_a\right], and neglecting terms of
second order,
\Delta v^{C_k} = \left[I_3 - \Xi_a - S_a\right] \Delta v_m^{C_k} - b_a \Delta t - u_a \qquad (6.37)
The average acceleration in the case frame is
a_{ave}^C = \frac{\Delta v^{C_k}}{\Delta t} \qquad (6.38)
and the average measured acceleration in the case frame is
a_{m,ave}^C = \frac{\Delta v_m^{C_k}}{\Delta t} \qquad (6.39)
so we find that
a_{ave}^C = \left[I_3 - \Xi_a - S_a\right] a_{m,ave}^C - b_a - u_a/\Delta t \qquad (6.40)
Recalling that the IMU measures all accelerations except gravity, the total acceleration is
a^I = g^I(r) + T_{B_{ref}}^I\, T_{B_k}^{B_{ref}}\, T_C^{B_k}\, a_{ave}^C \qquad (6.41)
6.6.5. The Gravity Call. One of the more expensive computations involved in propagating
the trajectory with IMU data is the gravity calculation. This is particularly acute
when the gravity field used is of high order, and the gravity gradient matrix requires even
more computation. Hence, minimizing the number of gravity calls makes the navigation
software more tractable. Taking advantage of the fact that propagation of the trajectory
using IMU data occurs at a high rate (usually at 40 Hz or higher), we expand the gravity
vector in a Taylor series about r_\star as
g(r) = g(r_\star) + \left.\frac{\partial g}{\partial r}\right|_{r=r_\star}\left[r - r_\star\right] + \frac{1}{2}\left[r - r_\star\right]^T \left.\frac{\partial^2 g}{\partial r^2}\right|_{r=r_\star}\left[r - r_\star\right] + \ldots \qquad (6.42)
Recognizing that the gravity gradient matrix, G, is
G(r_\star) = \left.\frac{\partial g}{\partial r}\right|_{r=r_\star} \qquad (6.43)
and truncating after the first order in r, we find that
g(r) \approx g(r_\star) + G(r_\star)\left[r - r_\star\right] \qquad (6.44)
\approx G(r_\star)\, r + \left[g(r_\star) - G(r_\star)\, r_\star\right] \qquad (6.45)
where now the gravity vector and the gravity gradient matrix need only be evaluated at
the beginning of the major cycle.
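The caching pattern implied by Eq. (6.45) can be sketched as follows. The class and function names are hypothetical, and a simple point-mass field stands in for the expensive high-order spherical-harmonic model:

```python
import numpy as np

class GravityLinearizer:
    """Cache one high-fidelity gravity evaluation per major cycle and
    serve the fast propagation loop with the first-order expansion of
    Eq. (6.45): g(r) ~= G(r*) r + [g(r*) - G(r*) r*]."""

    def __init__(self, gravity_fn, gradient_fn):
        self.gravity_fn = gravity_fn      # expensive g(r) model
        self.gradient_fn = gradient_fn    # expensive G(r) model

    def start_major_cycle(self, r_star):
        g_star = self.gravity_fn(r_star)
        self.G = self.gradient_fn(r_star)
        self.c = g_star - self.G @ r_star  # constant term in Eq. (6.45)

    def gravity(self, r):
        # cheap per-minor-cycle evaluation (e.g., at 40 Hz)
        return self.G @ r + self.c

# stand-in point-mass field for illustration (km, km^3/s^2)
mu = 398600.4418
pm_g = lambda r: -mu * r / np.linalg.norm(r)**3
pm_G = lambda r: mu * (3.0 * np.outer(r, r) / np.linalg.norm(r)**5
                       - np.eye(3) / np.linalg.norm(r)**3)
```

At the expansion point the approximation is exact, and nearby it is accurate to second order in the displacement.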
CHAPTER 7
Factorization Methods
Of the various covariance factorization methods⇤, the UDU covariance factorization
technique is among the most commonly used in practice. It is implemented in GEONS,
flew on MMS, and is at the heart of the Orion Absolute Navigation System. This chapter
is intended to present the UDU triangular factorization method and the rationale for its use.
Above all, we demonstrate that the UDU factorization results in a significant reduction
in the arithmetic operations (specifically adds and multiplies) compared with the usual
\Phi P \Phi^T + Q time update and the Joseph measurement update.
In the next section, we present some notation and preliminary operations for the
matrix factors U and D. In the section that follows, we derive the time update equations
for the covariance matrix factors. Next, we derive the measurement update equations for
the covariance matrix factors. Finally, we present some concluding comments.
7.1. Why Use the UDU Factorization?
The usual Kalman filter equations work well for rather simple problems. But once
the state space becomes large, the condition number of the covariance matrix becomes
large, and nonlinear effects begin to affect the numerical characteristics; problems such as
filter divergence and loss of positive definiteness of the covariance matrix occur. These issues
began to be observed almost as soon as Kalman filters began to be used in real problems.
Matrix factorization techniques were introduced to solve at least some of these issues. The
earliest was the Potter square root factorization, which was used in the on-board Apollo
navigation filters.
In fact, Bierman and Thornton, in a 1976 JPL report, rather cheekily compared those who
insist on using the conventional Kalman filtering and batch least-squares algorithms (contra
the matrix factorization algorithms) to unrepentant smokers, describing “an attitude often
encountered among estimation practitioners [is] that they will switch to the more accurate
and stable algorithms if and when numerical problems occur. An analogy comes to mind of
a smoker who promises to stop when cancer or heart ailment symptoms are detected. To
expand on the analogy, one may note the following:
• Most smokers do not get cancer or heart disease. (Most applications of the Kalman
algorithms work.)
⇤Other options include the Square Root Covariance Factorization and the Square Root Information
Filter (SRIF).
• Even when catastrophic illness does not occur, there is diminished health. (Even
when algorithms work, performance may be degraded.)
• Smokers can take precautions to lessen the danger, such as smoking low tar or
filtered cigarettes. (Engineers can scale their variables to reduce the dynamic range
or use double-precision arithmetic.)
• Lung cancer may not be diagnosed until it is too advanced for treatment. (Numer-
ical problems may not be detected in time to be remedied.) ” [71]
In addition, a little-advertised but incredibly useful feature of the UDU factorization
is the ability to interrogate the covariance matrix for positive definiteness for ‘free’,
since the definiteness of the D matrix may be trivially evaluated.
This sets the stage for the need for matrix factorization techniques, and the UDU
technique in particular.
7.2. Preliminaries
Let us factor a covariance matrix, P, into the following form
P = U D U^T \qquad (7.1)
where U is an upper triangular matrix with 1’s on the diagonal and 0’s below the diagonal,
and D is a diagonal matrix. We can write U and D compactly as
U = \{u_{ij}\}, \quad i < j \qquad (7.2)
D = \{d_{ii}\} \qquad (7.3)
with, as well,
u_{ii} = 1 \qquad (7.4)
It should be noted that Eq. (7.2) gives the upper triangular portion; the lower triangular
portion can be obtained by reflection or by evaluation of Eq. (7.2), with u_{lm} = 0 for l > m.
Equally valid is
p_{ij} = \sum_{k=i}^{n} u_{ik}\, d_{kk}\, u_{jk}, \quad j < i \qquad (7.5)
and for the diagonals we find
p_{ii} = \sum_{k=i}^{n} u_{ik}^2\, d_{kk} \qquad (7.6)
So, given an n \times n symmetric, positive semi-definite matrix P, the unit upper triangular
factor U and the diagonal factor D (such that P = U D U^T) are obtained using the following
equations. We begin with the (n, n) element and work upwards (along the columns).
The algorithm is as follows:
d_{n,n} = p_{n,n}
u_{n,n} = 1
for j = n-1 : -1 : 1 do
    u_{j,n} = p_{j,n} / d_{n,n}
end for
for j = n : -1 : 2 do
    d_{j,j} = p_{j,j}
    for k = j+1 : n do
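The complete factorization, including the inner accumulation loop, can be sketched compactly in code. The following is a standard Bierman-style UD decomposition (the function name is hypothetical, and the loop organization is the usual in-place form rather than a line-by-line transcription of the pseudocode above):

```python
import numpy as np

def ud_factorize(P):
    """Factor a symmetric positive (semi-)definite P as P = U diag(d) U^T,
    with U unit upper triangular; works right-to-left over the columns."""
    P = P.copy().astype(float)
    n = P.shape[0]
    U = np.eye(n)
    d = np.zeros(n)
    for j in range(n - 1, -1, -1):
        d[j] = P[j, j]
        alpha = 1.0 / d[j] if d[j] > 0.0 else 0.0
        for i in range(j):
            beta = P[i, j]
            U[i, j] = alpha * beta
            # remove column j's rank-one contribution from the remainder
            for k in range(i + 1):
                P[k, i] -= beta * U[k, j]
    return U, d
```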
7.3. The Time Update of the Covariance
7.3.1. The General Time Update Problem. Consider a state, x, that evolves according
to
x(t_k) = \Phi(t_k, t_{k-1})\, x(t_{k-1}) + G_k w_k
where w_k is the process noise at time t_k, x is an n \times 1 vector, and w is an m \times 1
vector. With this in hand, the general problem is as follows [70]: we wish to propagate the
covariance matrix defined by
\bar P(t_k) = \Phi(t_k, t_{k-1})\, P(t_{k-1})\, \Phi^T(t_k, t_{k-1}) + G_k Q_k G_k^T \qquad (7.7)
where \bar P is the propagated covariance (the overbar indicates a propagated quantity), P
is the updated covariance at the prior time step, Q_k is the diagonal process noise covariance
matrix, and G_k is the mapping of the noise to the state. To save memory, since Q_k and D_k
are diagonal matrices, in the implementation we pass Q_k and D_k as vectors.
We want to find the propagated factors \bar U_k and \bar D_k such that \bar P_k = \bar U_k \bar D_k \bar U_k^T. For
compactness, we now drop the time subscripts where possible. Given the UDU factorization
of the covariance matrices, we can write Eq. (7.7) as
\bar U_k \bar D_k \bar U_k^T = \Phi_k U_{k-1} D_{k-1} U_{k-1}^T \Phi_k^T + G_k Q_k G_k^T \qquad (7.8)
= \begin{bmatrix} \Phi_k U_{k-1} & G_k \end{bmatrix} \begin{bmatrix} D_{k-1} & 0 \\ 0 & Q_k \end{bmatrix} \begin{bmatrix} \Phi_k U_{k-1} & G_k \end{bmatrix}^T \qquad (7.9)
Since x is an n \times 1 vector and w is an m \times 1 vector, \Phi_k is an n \times n matrix, G_k is an n \times m
matrix, and Q_k is an m \times m matrix.
We recall that both \bar U_k and U_{k-1} are n \times n upper triangular matrices with 1’s on
the diagonal, and D and \bar D are purely diagonal matrices. So, we have some work to do
on Eq. (7.9), because \begin{bmatrix} \Phi_k U_{k-1} & G_k \end{bmatrix} is an n \times (n+m) matrix and \begin{bmatrix} D_{k-1} & 0 \\ 0 & Q_k \end{bmatrix} is an
(n+m) \times (n+m) diagonal matrix.
Defining the first matrix on the right-hand side of Eq. (7.9) as
Y_k = \begin{bmatrix} \Phi_k U_{k-1} & G_k \end{bmatrix} \qquad (7.10)
we seek a matrix T_k that transforms Y_k such that
Y_k T_k^T = \begin{bmatrix} \bar U_k & 0_{n \times m} \end{bmatrix} \qquad (7.11)
where \bar U_k is an n \times n upper triangular matrix with 1’s on the diagonal. In order to find the
desired matrix T_k, we perform a Gram-Schmidt orthogonalization‡. We define the diagonal
matrix \tilde D_k as
\tilde D_k = \begin{bmatrix} D_{k-1} & 0 \\ 0 & Q_k \end{bmatrix} \qquad (7.12)
which will be important as we define the weighted inner product in the Gram-Schmidt
orthogonalization. Eq. (7.9), which was
\bar U_k \bar D_k \bar U_k^T = Y_k \tilde D_k Y_k^T \qquad (7.13)
can now be rewritten as
\bar U_k \bar D_k \bar U_k^T = \left[Y_k T_k^T\right] \left[T_k^{-T} \tilde D_k T_k^{-1}\right] \left[Y_k T_k^T\right]^T \qquad (7.14)
We note that the matrix T_k^{-T} \tilde D_k T_k^{-1} is a diagonal matrix.
“All” that remains is to find T_k. This is where we harness the power of the modified
Gram-Schmidt orthogonalization process, which provides us with what we were after: \bar U_k
and \bar D_k.
for k = 1, \cdots, n do
    b_k = Y_k(k, :)   (initialize the b vectors with the rows of Y_k)
end for
for j = n, \cdots, 2 do
    f_j = \tilde D\, b_j \quad [0, n(n+m), 0]
    \bar D_{jj} = b_j^T f_j \quad [n(n+m), n(n+m), 0]
    f_j = f_j / \bar D_{jj} \quad [0, 0, (n+m)(n-1)]
    for i = 1, \cdots, j-1 do
        \bar U_{ij} = b_i^T f_j \quad [(n+m)\frac{(n^2-n)}{2}, (n+m)\frac{(n^2-n)}{2}, 0]
        b_i = b_i - \bar U_{ij}\, b_j \quad [(n+m)\frac{(n^2-n)}{2}, (n+m)\frac{(n^2-n)}{2}, 0]
    end for
end for
\bar U_{11} = 1
f_1 = \tilde D\, b_1
\bar D_{11} = b_1^T f_1
Thus, the algorithm not only provides the orthogonal basis vectors b_j, j = 1, \cdots, n,
but also provides the triangular matrix factors \bar U and \bar D.
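The weighted MGS loop above can be implemented compactly. The sketch below (Python, hypothetical names) propagates the factors per Eq. (7.9); the reconstruction test at the end is the natural unit-test for such an implementation:

```python
import numpy as np

def ud_time_update(Phi, U, d, G, q):
    """Weighted modified Gram-Schmidt time update: find Ubar, dbar with
    Ubar diag(dbar) Ubar^T = Phi U diag(d) U^T Phi^T + G diag(q) G^T."""
    n = U.shape[0]
    Y = np.hstack([Phi @ U, G])        # n x (n+m), Eq. (7.10)
    dtil = np.concatenate([d, q])      # diagonal of D-tilde, Eq. (7.12)
    b = Y.copy()                       # rows to be D-tilde-orthogonalized
    Ubar = np.eye(n)
    dbar = np.zeros(n)
    for j in range(n - 1, -1, -1):
        f = dtil * b[j]                # f_j = D-tilde b_j
        dbar[j] = b[j] @ f
        f = f / dbar[j]
        for i in range(j):
            Ubar[i, j] = b[i] @ f
            b[i] = b[i] - Ubar[i, j] * b[j]
    return Ubar, dbar
```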
Since we are also interested in the arithmetic operations, we find that there are [n_x^3 +
n_x^2 m_x] adds, [n_x(n_x+1)(n_x+m_x)] multiplies, and [(n_x+m_x)(n_x-1)] divides. For the case
when m_x = 0, i.e. no process noise, we have n_x^3 adds, [n_x^3 + n_x^2] multiplies, and [n_x^2 - n_x]
divides.
Finally, the entire covariance update algorithm, including the computation of Y_k, uses
[1.5 n_x^3 + 0.5 n_x^2 (2 m_x - 1)] adds, [0.5 n_x^2 (3 n_x + 1) + n_x m_x (n_x + 1)] multiplies, and
[(n_x + m_x)(n_x - 1)] divides.
The modified Gram-Schmidt orthogonalization process makes no assumptions regarding
the structure of \Phi. For a large number of states (say, n_x = 35 with process noise inputs,
m_x = 35), most of which might be biases (or Gauss-Markov processes), much of this
computation is wasted considering the sparseness of \Phi. In Appendix B.1, we show that we
use [n_x^3 + n_x^2 m_x] adds and [n_x(n_x+1)(n_x+m_x)] multiplies to obtain \bar U_k and \bar D_k. For
n_x = 35 and m_x = 35, we require 85,750 adds and 88,200 multiplies, quite a large number
of computations. We can stop here and all will be well, if we are willing to pay the heavy
computational price.
But we can do better! We can vastly reduce the number of computations by partitioning
the original state vector into ‘states’ and ‘parameters’, where the parameters will be modeled
as first-order Gauss-Markov processes. Unlike the parameters, the states can vary in any
manner. This motivates the next section.
7.3.2. An Improvement for the Case of Parameters. As stated earlier, for a large
number of states (say, n_x = 35), the UDU time update for the full covariance matrix
(à la Gram-Schmidt orthogonalization) is computationally expensive, requiring about
2n_x^3 + 2n_x^2 multiplies and additions. This is not competitive with the ‘standard’ \Phi P \Phi^T + Q
formulation (which uses 2n_x^3 multiplies). However, one might guess that an improvement
can be made. This is particularly significant because, normally, most of the states are biases
or ECRVs (Exponentially Correlated Random Variables), i.e., first-order Gauss-Markov
processes. In order to generalize the development, we assume ECRVs for the ‘parameter’
(or bias) states.
For most space-borne navigation applications, we can usually partition the states into
position, velocity, attitude (if applicable), and clock states, all of which we group together
and denote as x, and parameter states, which usually comprise the sensor biases, scale
factors, etc., which we denote as p. This means that the full state space is
X = \begin{bmatrix} x \\ p \end{bmatrix} \qquad (7.15)
The ‘states’ partition must comprise all those quantities whose time evolution cannot be
described as purely self-autocorrelated processes. With this in hand, we partition U and
D as
U = \begin{bmatrix} U_{xx} & U_{xp} \\ 0 & U_{pp} \end{bmatrix} \qquad D = \begin{bmatrix} D_{xx} & 0 \\ 0 & D_{pp} \end{bmatrix} \qquad (7.16)
Also, partition \Phi according to
\Phi = \begin{bmatrix} \Phi_{xx} & \Phi_{xp} \\ 0 & M \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & M \end{bmatrix} \begin{bmatrix} \Phi_{xx} & \Phi_{xp} \\ 0 & I \end{bmatrix} = \Phi_2 \Phi_1 \qquad (7.17)
where M is a diagonal matrix, representing ECRVs whose propagation for p_k is
\bar p_k = e^{-\Delta t/\tau}\, \bar p_{k-1} \qquad (7.18)
so that
M(i,i) = m_i = e^{-\Delta t/\tau_i} \qquad (7.19)
where \tau_i is the time constant of the i-th ECRV state, and Q is partitioned according to
Q = \begin{bmatrix} Q_{xx} & 0 \\ 0 & Q_{pp} \end{bmatrix} = \begin{bmatrix} Q_{xx} & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & Q_{pp} \end{bmatrix} = Q_1 + Q_2 \qquad (7.20)
Recall that the original propagation equation was
\bar U \bar D \bar U^T = \Phi U D U^T \Phi^T + Q \qquad (7.21)
Harnessing the development in Appendix B.1, \bar U \bar D \bar U^T becomes
\bar U \bar D \bar U^T = \Phi_2 \left[\Phi_1 U D U^T \Phi_1^T + Q_1\right] \Phi_2^T + Q_2 \qquad (7.22)
This suggests the following two-step process:
1) Find \tilde U and \tilde D from
\tilde U \tilde D \tilde U^T = \Phi_1 U D U^T \Phi_1^T + Q_1 \qquad (7.23)
2) Find \bar U and \bar D from
\bar U \bar D \bar U^T = \Phi_2 \tilde U \tilde D \tilde U^T \Phi_2^T + Q_2 \qquad (7.24)
7.3.2.1. The First Sub-Problem (\Phi_1 U D U^T \Phi_1^T + Q_1). Let us consider step 1). The
left-hand side of Eq. (7.23) is
\tilde U \tilde D \tilde U^T = \begin{bmatrix} \tilde U_{xx} \tilde D_{xx} \tilde U_{xx}^T + \tilde U_{xp} \tilde D_{pp} \tilde U_{xp}^T & \tilde U_{xp} \tilde D_{pp} \tilde U_{pp}^T \\ \tilde U_{pp} \tilde D_{pp} \tilde U_{xp}^T & \tilde U_{pp} \tilde D_{pp} \tilde U_{pp}^T \end{bmatrix} \qquad (7.25)
and
Q_2 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & q_b & 0 \\ 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & Q_c \end{bmatrix} = Q_b + Q_c \qquad (7.33)
As in the previous section, we note that \Phi_c^{-1} Q_b \Phi_c^{-T} = Q_b. So, now Eq. (7.24) becomes
\bar U \bar D \bar U^T = \Phi_c \left[\Phi_b \tilde U \tilde D \tilde U^T \Phi_b^T + Q_b\right] \Phi_c^T + Q_c \qquad (7.34)
The term in the square bracket in Eq. (7.34) is
\breve U \breve D \breve U^T = \Phi_b \tilde U \tilde D \tilde U^T \Phi_b^T + Q_b \qquad (7.35)
In Appendix B the above equation is expanded until we find that
\breve U_{ac} = \tilde U_{ac}, \qquad \breve D_{cc} = \tilde D_{cc}, \qquad \breve U_{cc} = \tilde U_{cc} \qquad (7.36)
Additionally,
\breve U_{bc} = m_b \tilde U_{bc} \qquad (7.37)
and
\breve d_b = m_b^2\, \tilde d_b + q_b \qquad (7.38)
and
\breve U_{ab} = m_b \frac{\tilde d_b}{\breve d_b}\, \tilde U_{ab} \qquad (7.39)
Finally, we find that
\breve U_{aa} \breve D_{aa} \breve U_{aa}^T = \tilde U_{aa} \tilde D_{aa} \tilde U_{aa}^T + \frac{\tilde d_b\, q_b}{\breve d_b}\, \tilde U_{ab} \tilde U_{ab}^T \qquad (7.40)
We note that \tilde U_{ab} is a column vector, so the second term in Eq. (7.40) is of rank 1; hence
Eq. (7.40) constitutes a ‘rank-one’ update. Since \breve d_b, \tilde d_b, and q_b are all positive (assuming
m_b is a positive quantity), we can use the Agee-Turner rank-one update [1]. It should be
pointed out that as the algorithm proceeds down the ‘list’ of parameters, the size of the
states a increases by one (and consequently the size of the parameters c decreases by one).
Hence \breve U_{aa} and \breve D_{aa} begin with a dimension of n_x and conclude with dimension n_x + n_p - 1.
Therefore, this is done recursively for all the (sensor) parameters p, which are of size n_p.
The algorithm can be expressed as follows (with the arithmetic operations (adds, multiplies,
divides) in square brackets per k):
for k = 1, \cdots, n_p do
    \breve D(n_x+k, n_x+k) = M(k,k)^2\, \tilde D(n_x+k, n_x+k) + Q_{pp}(k,k) \quad (Eq. (7.38)) [1, 2, 0]
    \alpha = M(k,k)\, \tilde D(n_x+k, n_x+k) / \breve D(n_x+k, n_x+k) \quad [0, 1, 1]
    for i = 1, \cdots, (n_x+k-1) do
        \breve U(i, n_x+k) = \alpha\, \tilde U(i, n_x+k) \quad (Eq. (7.39)) [0, n_x+k-1, 0]
    end for
    for j = n_x+k+1, \cdots, (n_x+n_p) do
        \breve U(n_x+k, j) = M(k,k)\, \tilde U(n_x+k, j) \quad (Eq. (7.37)) [0, n_p-k, 0]
    end for
    Solve for \breve U_{xx}^{(k)}, \breve D_{xx}^{(k)} using the Agee-Turner rank-one update \quad [(n_x+k)^2, (n_x+k)^2 + 3(n_x+k) + 2, 0]
end for
Thus, the arithmetic operations are as follows.
Adds:
\sum_{k=1}^{n_p} \left[(n_x+k)^2 + 1\right] = n_x^2 n_p + n_p + n_x(n_p+1)n_p + \sum_{k=1}^{n_p} k^2 \qquad (7.41)
Multiplies:
\sum_{k=1}^{n_p} \left[3 + (n_x+k-1) + (n_p-k) + (n_x+k)^2 + 3(n_x+k) + 2\right] = (5.5 + 5n_x + n_x^2)\, n_p + \frac{1}{2}(2n_x+5)\, n_p^2 + \sum_{k=1}^{n_p} k^2 \qquad (7.42)
and n_p divides.
Appendix B.3 contains the development of the Agee-Turner rank-one update, which is
the key to reducing the numerical operations in the UDU time update. Given U and D,
along with the scalar c and the vector x, we are interested in obtaining \tilde U and \tilde D along
the lines of
\tilde U \tilde D \tilde U^T = U D U^T + c\, x x^T \qquad (7.43)
The algorithm to compute \tilde u_{ij} and \tilde d_{ii} is:
C_n = c
for j = n, \cdots, 2 do
    \tilde D_{jj} = D_{jj} + C_j x_j^2 \quad [n-1, 2(n-1), 0]
    \tilde U_{jj} = 1
    \beta_j = C_j / \tilde D_{jj} \quad [0, 0, (n-1)]
    v_j = \beta_j x_j \quad [0, (n-1), 0]
    for i = 1, \cdots, j-1 do
        x_i := x_i - U_{ij} x_j \quad [\frac{1}{2}(n^2-n), \frac{1}{2}(n^2-n), 0]
        \tilde U_{ij} = U_{ij} + x_i v_j \quad [\frac{1}{2}(n^2-n), \frac{1}{2}(n^2-n), 0]
    end for
    C_{j-1} = \beta_j D_{jj} \quad [0, (n-1), 0]
end for
\tilde D_{11} = D_{11} + C_1 x_1^2 \quad [1, 2, 0]
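The Agee-Turner update above can be sketched directly in code (function name hypothetical). Note that c must be non-negative for the result to remain a valid factorization; the negative-c case is exactly the situation the Carlson measurement update addresses later:

```python
import numpy as np

def agee_turner(U, d, c, x):
    """Agee-Turner rank-one UDU update, Eq. (7.43): return (Ut, dt)
    such that Ut diag(dt) Ut^T = U diag(d) U^T + c x x^T."""
    n = len(d)
    U = U.copy()
    d = d.copy().astype(float)
    x = x.copy().astype(float)
    for j in range(n - 1, 0, -1):
        d_new = d[j] + c * x[j] ** 2
        beta = c / d_new
        v = beta * x[j]
        c = beta * d[j]          # C_{j-1} = beta_j D_{jj}
        d[j] = d_new
        for i in range(j):
            x[i] -= U[i, j] * x[j]
            U[i, j] += v * x[i]
    d[0] += c * x[0] ** 2
    return U, d
```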
7.3.3. Arithmetic Operations for the Time Update. For the time update of the covariance
matrix, we have (from the two preceding sections) the following arithmetic operations:
Adds: 1.5 n_x^3 + n_x^2 m_x + n_x^2 n_p - 0.5 n_x^2 + n_p + n_x (n_p+1) n_p + \sum_{k=1}^{n_p} k^2
Multiplies: 0.5 n_x^2 (3 n_x + 1) + n_x m_x (n_x+1) + (5.5 + 5 n_x + n_x^2)\, n_p + \frac{1}{2}(2 n_x + 5)\, n_p^2 + \sum_{k=1}^{n_p} k^2
For n_x = 9, m_x = 9, n_p = 26, we will utilize 16,407 adds, 19,338 multiplies, and 170
divides. In contrast, if we did the MGS on all 35 states (n_x = 35, m_x = 35, and n_p = 0),
we would use 85,750 adds, 88,200 multiplies, and 34 divides. Finally, if the covariance were
updated in the conventional manner from \Phi P \Phi^T (without any consideration given to the
structure of \Phi), with n_x = 35, m_x = 35, it would cost 84,525 adds, 85,750 multiplies, and
no divides†. Thus, a very strong case is made for using the UDU factorization and harnessing
the benefit of updating the sensor parameters using the Agee-Turner rank-one update.
The UDU time update that takes advantage of the fact that most of the states are sensor
parameters results in nearly five times fewer adds and multiplies (at the cost of 170 more
divides) than if we operated on the full covariance matrix.
7.4. The Measurement Update
The UDU factorization requires that we process the measurements one at a time [2].
This should not be construed as a weakness of the formulation: if the measurements are
correlated, they can be ‘decorrelated’ as in Appendix B.4.
So, the covariance update equation is
P = \bar P - K H \bar P \qquad (7.45)
where K is the Kalman gain matrix, H is the measurement partial, \bar P is the a priori
covariance, and P is the a posteriori covariance matrix. Using the covariance factors U and
D, we rewrite the above equation as
U D U^T = \bar U \bar D \bar U^T - K H \bar U \bar D \bar U^T \qquad (7.46)
We note that U and \bar U and D and \bar D are n \times n matrices, and because we are using the
paradigm of processing the measurements one at a time, H is a 1 \times n vector and K is an
n \times 1 vector. Recalling that K is defined as
K = \bar P H^T \left(H \bar P H^T + R\right)^{-1} = \frac{\bar U \bar D \bar U^T H^T}{H \bar U \bar D \bar U^T H^T + R} = \frac{\bar U \bar D \bar U^T H^T}{\alpha} \qquad (7.47)
where the scalar \alpha is defined to be
\alpha = H \bar U \bar D \bar U^T H^T + R \qquad (7.48)
†For matrices A and B of dimension n \times m and m \times p, respectively, the product
C = A B \qquad (7.44)
results in n(m-1)p adds, nmp multiplies, and no divides.
\tilde U \tilde D \tilde U^T = \bar D - \frac{1}{\alpha}\, v v^T \qquad (7.52)
Therefore,
U = \bar U \tilde U \qquad \text{and} \qquad D = \tilde D \qquad (7.53)
So, how do we proceed? This has all the marks of a rank-one update, for after all v is of
rank one. We can proceed by using the Agee-Turner rank-one update. Except for one thing
– that pesky minus sign in Eq. (7.52). That minus sign portends all sorts of numerical
issues because there is a strong possibility that we can lose numerical precision if the Agee-
Turner update is used blindly. It turns out that we can have ‘our cake and eat it too’, for
Neil Carlson developed a rank-one update to remedy precisely our issue. The mathematical
development of this algorithm is detailed in Appendix B.5.
For the normal Joseph Kalman filter update, for a scalar measurement, we find that
if we use efficient methods of calculating and storing quantities [2], we use 4.5n_x² + 3.5n_x
adds, 4n_x² + 4.5n_x multiplies, and 1 divide.
For the “Conventional” Kalman filter update (P = P̄ − K H P̄, Eq. (7.45)), for a scalar
measurement, we find that [2] we use 1.5n_x² + 1.5n_x adds, 1.5n_x² + 0.5n_x multiplies, and 1
divide.
Thus, for n_x = 35, the covariance update due to measurement processing with the UDU
factorization uses 1820 adds, 1890 multiplies, and 104 divides, compared with 5635 adds, 5058
multiplies, and 1 divide for the efficient Joseph update. The “Conventional” Kalman update
uses 1890 adds, 1855 multiplies, and 1 divide.
Hence there is almost a factor of 2.5 improvement in the adds and multiplies using the
triangular (UDU) update compared with the Joseph update, rivaling the efficiency of
the “conventional” Kalman filter update.
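The one-at-a-time UDU measurement update discussed above can be illustrated concretely. The following Python sketch implements Bierman's form of the rank-one measurement update; the function name, the dense-matrix storage, and the test problem are illustrative assumptions, not taken from this report (an onboard implementation would store U and d compactly):

```python
import numpy as np

def udu_measurement_update(x, U, d, h, R, y):
    """Scalar-measurement UDU update (Bierman's algorithm).
    P = U @ diag(d) @ U.T, with U unit upper triangular and d > 0."""
    U = U.copy(); d = d.copy()
    n = x.size
    a = U.T @ h                  # a = U^T H^T
    b = d * a                    # b = D a
    alpha = R + a[0] * b[0]      # running innovation variance
    gamma = 1.0 / alpha
    d[0] *= R * gamma
    K = np.zeros(n)
    K[0] = b[0]
    for j in range(1, n):
        beta = alpha
        alpha += a[j] * b[j]
        lam = -a[j] * gamma
        gamma = 1.0 / alpha
        d[j] *= beta * gamma
        for i in range(j):
            U_old = U[i, j]
            U[i, j] = U_old + lam * K[i]   # update column j of U
            K[i] += b[j] * U_old           # accumulate unscaled gain
        K[j] = b[j]
    gain = gamma * K                       # Kalman gain = P̄ H^T / alpha
    x = x + gain * (y - h @ x)
    return x, U, d
```

Note that the loops touch only the strict upper triangle of U and the diagonal d, which is where the operation-count advantage over the Joseph form comes from.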
7.5. Consider Covariance and Its Implementation in the UDU Filter
‘Consider’ Analysis was first introduced by S. F. Schmidt of NASA Ames in the mid
1960s as a means to account for errors in both the dynamic and measurement models due to
uncertain parameters [64]. The Consider Kalman Filter, also called the Schmidt-Kalman
Filter, resulted from this body of work. The consider approach is especially useful when
parameters have low observability.
We partition the state vector, x, into the ns “estimated states”, s, and the np “consider”
parameters, p, as
x = [s; p] (7.54)
so that
P = [Pss Psp; Pps Ppp],   H = [Hs Hp],   Kopt = [Ks,opt; Kp,opt] = [Pss Hs^T + Psp Hp^T; Pps Hs^T + Ppp Hp^T] W⁻¹
where W = H P H^T + R is the innovation covariance.
Here Kopt is the optimal Kalman gain computed for the full state, x. Therefore, if we
now choose Ks and Kp such that Ks = Ks,opt, the a posteriori covariance
matrix is
P⁺ = [ Pss − Ks W Ks^T               Psp − Ks H [Psp; Ppp]
       Pps − [Psp; Ppp]^T H^T Ks^T   Ppp − Kp W Kp^T       ]   (7.55)
This equation is valid for any value of Kp . Notice that there is no Kp in the correlation terms
of the covariance matrix. Therefore, what is remarkable about this equation is that
once the optimal Ks is chosen, the correlation between s and p is independent
of the choice of Kp .
In essence, the consider parameters are not updated; therefore, the Kalman gain
associated with the consider parameters, p, is zero, i.e. Kp = 0. However, several comments
are in order:
(1) When using the Schmidt-Kalman filter, the a priori and a posteriori covariance of
the parameters (Ppp) are the same.
(2) The a posteriori covariance matrix of the states and the correlation between the
states and the parameters are the same regardless of whether one uses the Schmidt-
Kalman filter or the optimal Kalman update.
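The two numbered properties can be verified numerically. A minimal Python sketch, assuming a scalar measurement and using the Joseph-form covariance update (valid for any suboptimal gain); the partition sizes and measurement model are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
ns, np_ = 3, 2                       # estimated states s, consider parameters p
n = ns + np_
A = rng.standard_normal((n, n))
P = A @ A.T + n * np.eye(n)          # a priori covariance, partitioned [s; p]
H = rng.standard_normal((1, n))      # scalar measurement partial [Hs Hp]
R = 0.5
W = float(H @ P @ H.T + R)           # innovation variance

K_opt = P @ H.T / W                  # optimal gain for the full state
K_sk = K_opt.copy()
K_sk[ns:] = 0.0                      # Schmidt-Kalman: Kp = 0, Ks = Ks,opt

def joseph(P, K, H, R):
    """Joseph-form update, valid for any (suboptimal) gain."""
    IKH = np.eye(P.shape[0]) - K @ H
    return IKH @ P @ IKH.T + R * (K @ K.T)

P_opt = joseph(P, K_opt, H, R)       # fully optimal update
P_sk = joseph(P, K_sk, H, R)         # Schmidt-Kalman (consider) update
```

The assertions below check that Ppp is unchanged by the consider update, and that the Pss and Psp blocks match the fully optimal update, as claimed above.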
7.6. Conclusions
Matrix factorization methods, particularly the UDU factorization, are very useful –
indeed essential – for onboard navigation algorithms. They are numerically stable and
computationally efficient, competitive with the classic Kalman filter implementation. In
addition, they allow the navigation designer to investigate the positive definiteness of the
covariance matrix for ‘free’, via the entries of the diagonal matrix D.
CHAPTER 8
Attitude Estimation
For actual applications, representations are 3 × 3 matrices that transform the representations
of vectors in one frame, i.e. their components along the basis vectors in that
frame, to their representations in a different frame. Thus attitude representations describe
a fixed physical vector in a rotated frame rather than a rotated vector. This is the passive
interpretation of a transformation, also known as the alias sense (from the Latin word for
“otherwise,” in the sense of “otherwise known as”) [67]. The alternative active interpre-
tation (also known as the alibi sense from the Latin word for “elsewhere”) considers the
representation in a fixed reference frame of a rotated physical vector. It is crucial to keep
this distinction in mind, because an active rotation in one direction corresponds to a pas-
sive rotation in the opposite direction. Overlooking this point has led to errors in flight
software.1
Now consider transforming the representation of a vector x in a frame F to its representation
in a frame G and then from frame G to frame H, or directly from frame F to frame
H, so
xH = AHG xG = AHG (AGF xF ) = AHF xF (8.1)
These transformations must be equivalent for any vector xF , so successive transformations
are accomplished by simple matrix multiplication:
AHF = AHG AGF (8.2)
This may appear to be an obvious result, but only one other attitude representation
has such a simple composition rule. Matrix multiplication is associative, meaning that
AHG (AGF AF E ) = (AHG AGF )AF E . Matrix multiplication is not commutative, however,
which means that AHG AGF ≠ AGF AHG in general. The non-commutativity of matrix
multiplication is at the heart of the problem of finding a suitable attitude representation.
Transforming from frame F to frame G and back to frame F is effected by the matrix
AF F = AF G AGF , which must be the identity matrix. Rotations must also preserve inner
products and norms of vectors, so
xF · yF = xG · yG = (AGF xF)^T (AGF yF) = xF^T AGF^T AGF yF (8.3)
These two observations mean that
AGF^T = AGF⁻¹ = AFG (8.4)
A matrix whose transpose is equal to its inverse is called an orthogonal matrix, and its
determinant must equal ±1. The attitude matrix must be a proper orthogonal matrix, i.e.
have determinant +1, in order to transform a right-handed coordinate frame to a right-
handed coordinate frame.
The nine-component attitude matrix is in some ways the ideal representation of a ve-
hicle’s attitude. It has a 1:1 correspondence with physical attitudes, it varies smoothly as
the physical attitude varies smoothly, its elements all have magnitudes less than or equal
to one, and it follows a simple rule for combining successive rotations. It is not an efficient
representation, though; only three of its nine parameters are independent because the or-
thogonality constraint is equivalent to six independent scalar constraints. This provides the
opportunity to specify an attitude or an attitude matrix using only three parameters, but
not, as was pointed out above, in a globally continuous and nonsingular fashion.
1One example is an incorrect sign for the velocity aberration correction for star tracker measurements
on the WMAP spacecraft, which fortunately was easily corrected.
8.2. Euler Axis/Angle Representation
where
ω^GF_G ≡ lim_{Δt→0} (Δϑ e)/Δt (8.12)
is the vector representation in frame G of the angular velocity of frame G with respect to
frame F. The angular velocity is known to be represented in frame G because the product
AGF(t + Δt) AFG(t) is a rotation from frame G at one time to frame G at a different time,
and these two frames coincide in the limit that Δt goes to zero. The angular velocity is
usually written simply as ω, with the frames understood. Its units are rad/sec, assuming
that time is measured in seconds, because radian measure was assumed in taking the small
angle limit of sin Δϑ.
Equation (8.11) is the fundamental equation of attitude kinematics. It does not dis-
tinguish between the situations where frame F or frame G or both frames are rotating in
an absolute sense; it only cares about the relative rotation between the two frames. This
equation can also be written as
ȦGF = −AGF AFG [ω^GF_G ×] AFG^T = −AGF [(AFG ω^GF_G)×] = −AGF [ω^GF_F ×] (8.13)
which expresses the kinematics in terms of the representation in frame F of ω^GF. The second
equality uses an identity that holds for any proper orthogonal matrix. These kinematic
equations, if integrated exactly, preserve the orthogonality of the attitude matrix.
The Euler axis/angle representation is fundamental for analysis, as demonstrated above,
but it has been entirely superseded for practical applications by a superior four-parameter
representation described in the next section.
8.3. Quaternion Representation
Substituting half-angle identities into Eq. (8.7) gives the attitude matrix as
A(q) = (q4² − ‖q1:3‖²) I3 − 2 q4 [q1:3×] + 2 q1:3 q1:3^T (8.14)
where the three-component vector q1:3 and the scalar q4 are defined by
q1:3 = e sin(ϑ/2), q4 = cos(ϑ/2) (8.15)
This representation has the advantage over the Euler axis/angle representation of requiring
no trigonometric function evaluations, and its four components are more economical than
the nine-component attitude matrix.
The four parameters of this representation were first considered by Euler but their full
significance was revealed by Rodrigues, so they are often referred to as the Euler symmet-
ric parameters or Euler-Rodrigues parameters. This representation is called the quaternion
representation and denoted A(q) because the four parameters can be regarded as the com-
ponents of a quaternion
q = [q1:3; q4] (8.16)
with vector part q1:3 and scalar part q4. A quaternion is basically a four-component vector with
some additional operations defined for it.2 The attitude quaternion
q(e, ϑ) = [e sin(ϑ/2); cos(ϑ/2)] (8.17)
is a unit quaternion, obeying the norm constraint ‖q‖ = 1, but not all quaternions are
unit quaternions. It is clear from Eq. (8.14) that q and −q give the same attitude matrix.
This 2:1 mapping of quaternions to rotations is a minor annoyance that cannot be removed
without introducing discontinuities in the representation.
The most important added quaternion operations are two different products of two
quaternions q and q̄. They can be implemented in matrix form similar to the matrix form
of the vector cross product:3
q ⊗ q̄ = [Ψ(q) q] q̄ (8.18a)
q ⊙ q̄ = [Ξ(q) q] q̄ (8.18b)
where Ψ(q) and Ξ(q) are the 4 × 3 matrices
Ψ(q) ≡ [q4 I3 − [q1:3×]; −q1:3^T] (8.19a)
Ξ(q) ≡ [q4 I3 + [q1:3×]; −q1:3^T] (8.19b)
Unlike the vector cross product, though, the norm of either product of two quaternions is
equal to the product of their norms.
Both quaternion products are associative but not commutative in general, in parallel
with matrix products. The two product definitions differ only in the signs of the cross
product matrices in Eqs. (8.19a) and (8.19b), from which it follows that
q ⊗ q̄ = q̄ ⊙ q (8.20)
The identity quaternion
Iq ≡ [0 0 0 1]^T (8.21)
acts in quaternion multiplication like the identity matrix in matrix multiplication. The
conjugate q* of a quaternion is obtained by changing the sign of its vector part:
q* ≡ [−q1:3; q4] (8.22)
Either product of a quaternion with its conjugate is equal to the square of its norm times
the identity quaternion.
The inverse of any quaternion having nonzero norm is defined by
q⁻¹ ≡ q*/‖q‖² (8.23)
2This is conceptually different from the quaternion introduced by Hamilton in 1844, before the introduction
of vector notation, as a hypercomplex extension q = q0 + iq1 + jq2 + kq3 of a complex number z = x + iy.
The scalar part of a quaternion is often labeled q0 and put at the top of the column vector. Care must be
taken to thoroughly understand the conventions embodied in any quaternion equation that one chooses to
reference.
3The notation q ⊗ q̄ was introduced in Ref. [43], and q ⊙ q̄ is a modification of notation introduced in
Ref. [57]. Hamilton’s product q̄q corresponds to q ⊙ q̄, but q ⊗ q̄ has proven to be more useful in attitude
analysis. The order of the quaternion products in Eqs. (8.27) and (8.28) would be reversed with the classical
definition of quaternion multiplication.
A unit quaternion, such as the attitude quaternion, always has an inverse, which is
identical with its conjugate. The conjugate of the product of two quaternions is equal to
the product of their conjugates in the opposite order: (q̄ ⊗ q)* = q* ⊗ q̄*. The same
relationship holds for the other product definition and with conjugates replaced by inverses.
Equation (8.14) can be compactly written as
A(q) = Ξ^T(q) Ψ(q) (8.24)
Now consider, for a unit quaternion q, the product
q ⊗ [x; 0] ⊗ q* = q* ⊙ (q ⊗ [x; 0]) = [Ξ(q*) q*] [Ψ(q) q] [x; 0]
= [Ξ^T(q); q^T] Ψ(q) x = [A(q) x; 0] (8.25)
Applying a transformation by a second quaternion q̄ gives
q̄ ⊗ (q ⊗ [x; 0] ⊗ q*) ⊗ q̄* = q̄ ⊗ [A(q) x; 0] ⊗ q̄* = [A(q̄) A(q) x; 0]
= (q̄ ⊗ q) ⊗ [x; 0] ⊗ (q̄ ⊗ q)* = [A(q̄ ⊗ q) x; 0] (8.26)
Because this must hold for any x, it shows that
A(q̄ ⊗ q) = A(q̄) A(q) (8.27)
This and Eq. (8.2) mean that the quaternion corresponding to successive rotations is just
the product
qHF = qHG ⊗ qGF (8.28)
A simple bilinear composition rule of this type holds only for the attitude matrix and
quaternion representations, a major reason for the popularity of quaternions.
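These operations translate directly into code. A minimal Python sketch, assuming scalar-last component ordering as in Eq. (8.16) (the function names are illustrative):

```python
import numpy as np

def cross(v):
    """Cross-product matrix [v×]."""
    return np.array([[0., -v[2], v[1]],
                     [v[2], 0., -v[0]],
                     [-v[1], v[0], 0.]])

def Psi(q):
    """4x3 matrix Ψ(q) of Eq. (8.19a); q = [q1, q2, q3, q4], scalar last."""
    return np.vstack([q[3] * np.eye(3) - cross(q[:3]), -q[:3][np.newaxis, :]])

def Xi(q):
    """4x3 matrix Ξ(q) of Eq. (8.19b)."""
    return np.vstack([q[3] * np.eye(3) + cross(q[:3]), -q[:3][np.newaxis, :]])

def qmul(q, qbar):
    """Quaternion product q ⊗ qbar, Eq. (8.18a)."""
    return np.column_stack([Psi(q), q]) @ qbar

def att(q):
    """Attitude matrix A(q) = Ξᵀ(q)Ψ(q), Eq. (8.24)."""
    return Xi(q).T @ Psi(q)
```

For random unit quaternions, `att(qmul(qb, q))` matches `att(qb) @ att(q)`, which is the composition rule of Eq. (8.27), and the result is a proper orthogonal matrix.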
Representing the rotation between times t and t + Δt by an Euler axis e and angle Δϑ,
Eqs. (8.28), (8.17), (8.20), and (8.18b) give
q(t + Δt) = [e sin(Δϑ/2); cos(Δϑ/2)] ⊗ q(t) = cos(Δϑ/2) q(t) + sin(Δϑ/2) [e; 0] ⊗ q(t)
= cos(Δϑ/2) q(t) + sin(Δϑ/2) Ξ(q(t)) e (8.29)
This quaternion propagation equation has proven to be very useful. It preserves the unity
norm of the attitude quaternion. If the angular velocity is well approximated as constant
over the time interval, then Δϑ = ‖ω‖Δt and e = ω/‖ω‖. Alternatively, and particularly for
onboard applications, Δϑ e can be computed by differencing the outputs of rate-integrating
gyroscopes.
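Equation (8.29) can be sketched in a few lines of Python, assuming the angular velocity is constant over the step (names and tolerances are illustrative):

```python
import numpy as np

def Xi(q):
    """4x3 matrix Ξ(q) of Eq. (8.19b), scalar part last."""
    v, s = q[:3], q[3]
    vx = np.array([[0., -v[2], v[1]],
                   [v[2], 0., -v[0]],
                   [-v[1], v[0], 0.]])
    return np.vstack([s * np.eye(3) + vx, -v[np.newaxis, :]])

def propagate(q, w, dt):
    """Discrete quaternion propagation, Eq. (8.29), with angular
    velocity w (rad/s) held constant over the step dt."""
    wn = np.linalg.norm(w)
    if wn * dt < 1e-12:
        return q.copy()
    dth = wn * dt                 # rotation angle Δϑ
    e = w / wn                    # rotation axis e
    return np.cos(dth / 2) * q + np.sin(dth / 2) * (Xi(q) @ e)
```

As the text notes, this form preserves the unit norm of the quaternion exactly (up to roundoff), even over many steps.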
Using small angle approximations for the sine and cosine leads to the kinematic equation
for the quaternion:
q̇ ≡ lim_{Δt→0} [q(t + Δt) − q(t)]/Δt = (1/2) [ω; 0] ⊗ q = (1/2) Ξ(q) ω (8.30)
where ω is defined by Eq. (8.12) and several time arguments have been omitted. Exact
integration of this equation preserves the unit norm of the quaternion. The inverse of
Eq. (8.30) is often useful:
ω = 2 Ξ^T(q) q̇ (8.31)
by either an MRP vector with ‖p‖ ≤ 1 or an equivalent MRP vector in the shadow set of
MRPs with ‖p‖ ≥ 1.
The MRP vector corresponding to the quaternion product q̄ ⊗ q follows the composition
rule
p̄ ⊗ p ≡ [(1 − ‖p‖²) p̄ + (1 − ‖p̄‖²) p − 2 p̄ × p] / [1 + ‖p‖² ‖p̄‖² − 2 p̄ · p] (8.45)
This composition law is associative but not commutative in general, in parallel with matrix
and quaternion products. It cannot be represented as a matrix product.
The kinematic equation for the MRPs is
ṗ = [(1 + ‖p‖²)/4] ( I3 + 2 ([p×]² + [p×]) / (1 + ‖p‖²) ) ω (8.46)
The matrix in parentheses is orthogonal, so the inverse of Eq. (8.46) is
ω = [4/(1 + ‖p‖²)] ( I3 + 2 ([p×]² − [p×]) / (1 + ‖p‖²) ) ṗ (8.47)
The norm of an MRP vector can grow without limit during dynamic propagation or
attitude estimation; Eq. (8.41) shows that the norm is infinite for ϑ = ±2π. This singularity
can be avoided by switching from the MRP vector to its shadow. The norm can be restricted
to be less than or equal to unity in theory, but in practice it is best to allow the norm to
exceed unity by some amount before switching in order to avoid “chattering” between the
MRP and its shadow. An error Δp in an MRP vector corresponds to an error
Δp^S = (∂p^S/∂p) Δp = [(2 p p^T − ‖p‖² I3)/‖p‖⁴] Δp = [2 p^S (p^S)^T − ‖p^S‖² I3] Δp (8.48)
in its shadow. This relation is useful for mapping covariance matrices into and out of
the shadow set. The 2:1 four-component quaternion representation does not have these
complications because the two quaternions representing the same attitude both have unit
norm, so there is no need to switch between them.
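A Python sketch of the shadow switch with a deadband; the shadow relation p^S = −p/‖p‖² is the standard one (assumed here, as the defining equation falls outside this excerpt), and the 1.2 switching limit is an illustrative tuning choice:

```python
import numpy as np

def shadow(p):
    """Shadow-set MRP representing the same attitude: pS = -p/||p||^2
    (standard shadow relation, assumed)."""
    return -p / np.dot(p, p)

def switch_if_needed(p, limit=1.2):
    """Switch to the shadow set only once ||p|| exceeds a deadband
    limit > 1, avoiding 'chattering' near ||p|| = 1."""
    return shadow(p) if np.linalg.norm(p) > limit else p
```

The shadow map is an involution, and it sends a long MRP vector to a short one, which is exactly why switching bounds the representation during propagation.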
8.6. Rotation Vector Representation
It is convenient for analysis, but not for computations, to combine the Euler axis and
angle into the three-component rotation vector
ϑ ≡ ϑ e = 2 (cos⁻¹ q4) q1:3/‖q1:3‖ (8.49)
This leads to the very elegant expression
A(ϑ) = exp(−[ϑ×]) (8.50)
where exp(·) is the matrix exponential. This equation can be verified by expansion of it and
Eq. (8.7) as Taylor series in ϑ and repeated applications of the identity [e×]² = e e^T − I3.
The kinematic equation for the rotation vector is
ϑ̇ = ω + (1/2) ϑ × ω + (1/ϑ²) [1 − (ϑ/2) cot(ϑ/2)] ϑ × (ϑ × ω) (8.51)
The coefficient of ϑ × (ϑ × ω) goes to 1/12 as ϑ goes to zero, but it is singular for ϑ equal
to any nonzero multiple of 2π. The inverse of Eq. (8.51) is
ω = ϑ̇ − [(1 − cos ϑ)/ϑ²] ϑ × ϑ̇ + [(ϑ − sin ϑ)/ϑ³] ϑ × (ϑ × ϑ̇) (8.52)
The rotation vector is useful for the analysis of small rotations, but not for large rotations,
because of both the computational cost of evaluating the matrix exponential and the
kinematic singularity for ‖ϑ‖ = 2π. This singularity can be avoided, as for the MRPs, by
switching from the rotation vector to its shadow
ϑ^S ≡ (1 − 2π‖ϑ‖⁻¹) ϑ (8.53)
which represents the same attitude. This can restrict the norm of the rotation vector to ⇡
or less in theory, but in practice it is best to allow the norm to exceed ⇡ by some amount
before switching in order to avoid “chattering” between the rotation vector and its shadow.
The properties of the rotation vector are very similar to those of the MRPs, and it has
no obvious advantages over the MRPs. It has the disadvantage of requiring transcenden-
tal function evaluations to compute the attitude matrix, so it is rarely used in practical
applications.
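Equation (8.50) can be checked numerically. A Python sketch comparing the closed-form Euler axis/angle matrix (the standard form of Eq. (8.7) is assumed) with a truncated Taylor series for the matrix exponential; the naive series is illustrative only, and is exactly the computational cost the paragraph above warns about:

```python
import numpy as np

def cross(v):
    """Cross-product matrix [v×]."""
    return np.array([[0., -v[2], v[1]],
                     [v[2], 0., -v[0]],
                     [-v[1], v[0], 0.]])

def att_from_rotvec(th):
    """Attitude matrix from a rotation vector via the closed-form
    Euler axis/angle formula (Eq. (8.7), standard form assumed)."""
    a = np.linalg.norm(th)
    if a < 1e-12:
        return np.eye(3)
    e = th / a
    return (np.cos(a) * np.eye(3) - np.sin(a) * cross(e)
            + (1 - np.cos(a)) * np.outer(e, e))

def expm_taylor(M, terms=40):
    """Matrix exponential by truncated Taylor series (illustrative only)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out
```

For a moderate rotation vector, the two agree to machine precision, which is the content of Eq. (8.50).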
8.7. Euler Angles
An Euler angle representation parameterizes a rotation by the product of three rotations
about coordinate frame unit vectors:
A_ijk(φ, θ, ψ) = A(e_k, ψ) A(e_j, θ) A(e_i, φ) (8.54)
where e_i, e_j, and e_k are selected from the set
e_1 = [1 0 0]^T,  e_2 = [0 1 0]^T,  e_3 = [0 0 1]^T (8.55)
The possible choice of axes is constrained by the requirements i ≠ j and j ≠ k, leaving six
symmetric sets with ijk equal to 121, 131, 232, 212, 313, and 323 and six asymmetric sets
with ijk equal to 123, 132, 231, 213, 312, and 321. Symmetric Euler angle sets are used in
classical studies of rigid body motion [22, 28, 35, 49, 74].
The asymmetric sets of angles are called the Tait-Bryan angles or roll, pitch, and yaw
angles. The latter terminology originally described the motions of ships and then was
carried over into aircraft and spacecraft. Roll is a rotation about the vehicle body axis
that is closest to the vehicle’s usual direction of motion, and hence would be perceived as a
screwing motion. The roll axis is conventionally assigned index 1. Yaw is a rotation about
the vehicle body axis that is usually closest to the direction of local gravity, and hence
would be perceived as a motion that points the vehicle left or right. The yaw axis is
conventionally assigned index 3. Pitch is a rotation about the remaining vehicle body axis,
and hence would be perceived as a motion that points the vehicle up or down. The pitch
axis is conventionally assigned index 2. Note that Eq. (8.54) assigns the variables φ, θ, and ψ
based on the order of rotations in the sequence, making no definite association between
these variables and either the axis labels 1, 2, and 3 or the names roll, pitch, and yaw. Many
authors follow a different convention, denoting roll by φ, pitch by θ, and yaw by ψ. As
always, the reader consulting any source should be careful to understand the conventions
that it follows.
Using the product rule and Eq. (8.11) to compute the time derivative of Eq. (8.54) gives
[ω×] A_ijk(φ, θ, ψ) = { [(ψ̇ e_k)×] + A(e_k, ψ) [(θ̇ e_j)×] A^T(e_k, ψ)
+ A(e_k, ψ) A(e_j, θ) [(φ̇ e_i)×] A^T(e_j, θ) A^T(e_k, ψ) } A_ijk(φ, θ, ψ) (8.56)
The identity A[x×]A^T = [(Ax)×], which holds for any proper orthogonal A, gives
ω = ψ̇ e_k + θ̇ A(e_k, ψ) e_j + φ̇ A(e_k, ψ) A(e_j, θ) e_i = A(e_k, ψ) M [φ̇ θ̇ ψ̇]^T (8.57)
where
M = [A(e_j, θ) e_i   e_j   e_k] (8.58)
The second equality in Eq. (8.57) makes use of A(e_k, ψ) e_k = e_k. The Euler angle rates as
functions of the angular velocity are
[φ̇ θ̇ ψ̇]^T = M⁻¹ A^T(e_k, ψ) ω (8.59)
This kinematic equation is singular if the determinant of M is zero. Equation (8.7) and the
Euler axis requirement that e_i · e_j = e_j · e_k = 0 gives
det M = [A(e_j, θ) e_i] · (e_j × e_k) = cos θ [e_i · (e_j × e_k)] − sin θ (e_i · e_k) (8.60)
The triple vector product e_i · (e_j × e_k) is zero for the symmetric Euler angles, so the kinematic
equations of these representations are singular if sin θ = 0. The dot product e_i · e_k is zero
for the asymmetric Euler angles, so the kinematics of these representations are singular if
cos θ = 0. This singularity is known as gimbal lock and is caused by collinearity of the
physical rotation axis vectors of the first and third rotations in the sequence. Note that the
column vector representations of these rotation axes are always parallel for the symmetric
Euler angle sequences and always perpendicular for the asymmetric sequences, but this
neither causes nor prevents gimbal lock.
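The gimbal-lock condition can be demonstrated numerically from Eqs. (8.58) and (8.60). A Python sketch, with axis indices following the 1-2-3 labels above (function names are illustrative):

```python
import numpy as np

def axis_rot(e, ang):
    """Passive rotation about a unit axis (Euler formula, Eq. (8.7) assumed)."""
    ex = np.array([[0., -e[2], e[1]],
                   [e[2], 0., -e[0]],
                   [-e[1], e[0], 0.]])
    return (np.cos(ang) * np.eye(3) - np.sin(ang) * ex
            + (1 - np.cos(ang)) * np.outer(e, e))

def det_M(i, j, k, theta):
    """det M for the i-j-k Euler sequence, with M from Eq. (8.58)."""
    e = np.eye(3)
    M = np.column_stack([axis_rot(e[j-1], theta) @ e[i-1], e[j-1], e[k-1]])
    return np.linalg.det(M)
```

The asymmetric 321 sequence locks at cos θ = 0 and the symmetric 313 sequence at sin θ = 0, matching the text.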
Because Euler angles are discussed in many references on rotational motion and because
they are not widely used in navigation filters, they will not be discussed further here.
Kinematic equations and explicit forms of the attitude matrices for all twelve sets can be
found in Refs. [28, 35, 49, 74].
The constraints on the representations are satisfied because Atrue, A(δϑ), and Â are all proper
orthogonal matrices, and qtrue, δq, and q̂ all have unit norm. The MEKF avoids the attitude
restrictions or switching required by additive attitude EKFs because the error vector δϑ
is assumed to be small enough to completely avoid singularities in the parameterizations
A(δϑ) or δq(δϑ). In some sense, though, Eq. (8.64) incorporates a continuous switching
of the attitude reference.
Only the quaternion version of the MEKF, which is much more widely employed, is
presented here. Reference [43] reviews its history. Any three-component attitude representation
that is related to first order in δϑ to the quaternion by
δq ≈ [δϑ/2; 1] (8.65)
can be used as the error vector. Common choices are the rotation vector, as suggested by
the notation of Eq. (8.64), two times the vector part of the quaternion, two times the vector
of Rodrigues parameters, four times the vector of MRPs, or a vector of suitably indexed
roll, pitch, and yaw angles [49].
The order of the factors on the right side of Eq. (8.64) means that the attitude errors
are in the body reference frame. This leads to a major advantage of the MEKF, that
the covariance of the attitude error angles in the body frame has a transparent physical
interpretation. The covariance of estimators using other attitude representations has a
less obvious interpretation unless the attitude matrix is close to the identity matrix. It is
possible to reverse the order of the factors on the right side of Eq. (8.64) so the attitude
errors are in the reference frame [19]. The covariance can be mapped into the body frame
if desired.
The MEKF estimates δϑ and any other state variables of interest. This discussion
addresses only the attitude part, as the equations for the other components of the state
vector obey the usual EKF equations. The MEKF proceeds by iteration of three steps:
measurement update, state vector reset, and propagation to the next measurement time.
The measurement update step updates the local error state vector. The reset moves the
updated information from the local error state to the global attitude representation and
resets the components of the local error state to zero. The propagation step propagates the
global variables to the time of the next measurement. The local error state variables do not
need to be propagated because they are identically zero over the propagation step. These
three steps will be discussed in more detail.
8.9.1. Measurement Update The observation model is given in terms of the true
global state
y = h(qtrue) + v (8.66)
but the measurement sensitivity matrix is the partial derivative with respect to the local
error state:
H = (∂h/∂q) (∂qtrue/∂(δϑ)) (8.67)
Equations (8.64), (8.65), (8.18b), and (8.20) give, to first order in the error vector,
qtrue ≈ [δϑ/2; 1] ⊗ q̂ = q̂ + (1/2) Ξ(q̂) δϑ (8.68)
90 8. ATTITUDE ESTIMATION
Inserting the partial derivative of this with respect to δϑ into Eq. (8.67) then gives
H = (1/2) (∂h/∂q) Ξ(q̂) (8.69)
Simplifications are possible in some special cases.
The Kalman gain computation and covariance update have the standard Kalman filter
forms. The error state update employs the first-order Taylor expansion
E{h(qtrue)} ≈ h(q̂⁻) + H δϑ̂⁻ (8.70)
giving
δϑ̂⁺ = δϑ̂⁻ + K [y − E{h(qtrue)}] = (I − KH) δϑ̂⁻ + K [y − h(q̂⁻)] (8.71)
8.9.2. Reset The discrete measurement update assigns a finite post-update value to
δϑ̂⁺, but the global state still retains the value q̂⁻. A reset procedure is used to move the
update information to a post-update estimated global state vector q̂⁺, while simultaneously
resetting δϑ̂ to 0₃, the three-component vector with all zero components. The reset does
not change the overall estimate, so the reset must obey
q̂⁺ = δq(0₃) ⊗ q̂⁺ = δq(δϑ̂⁺) ⊗ q̂⁻ (8.72)
Thus the reset moves information from one part of the estimate to another part.
Every EKF includes an additive reset of the global state vector, but this is usually
implicit rather than explicit. The multiplicative quaternion reset is the special feature of
the MEKF. This reset has to preserve the quaternion norm, so an exact unit-norm expression
for the functional dependence of δq on δϑ must be used, not the linear approximation of
Eq. (8.65). Using the Rodrigues parameter vector has the practical advantage that the reset
operation for this parameterization is
q̂⁺ = δq(δϑ̂⁺) ⊗ q̂⁻ = [1/√(1 + ‖δϑ̂⁺/2‖²)] [δϑ̂⁺/2; 1] ⊗ q̂⁻ (8.73)
Using an argument similar to Eq. (8.68), this can be accomplished in two steps:
q′ = q̂⁻ + (1/2) Ξ(q̂⁻) δϑ̂⁺ (8.74)
followed by
q̂⁺ = q′/‖q′‖ (8.75)
The first step is just the usual linear Kalman update, and the second step is equivalent to
a brute force normalization of the updated quaternion. Thus the MEKF using Rodrigues
parameters for the error vector provides a theoretical justification for brute force renormal-
ization, with the added advantage of completely avoiding the accumulation of quaternion
norm errors after many updates. The Rodrigues parameters also have the conceptual ad-
vantage that they map the rotation group into three-dimensional Euclidean space, with the
largest possible 180° attitude errors mapped to points at infinity. Thus probability distri-
butions with infinitely long tails, such as Gaussian distributions, make sense in Rodrigues
parameter space.
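The two-step reset of Eqs. (8.74)-(8.75) is only a few lines of code. A Python sketch, with scalar-last quaternion ordering assumed:

```python
import numpy as np

def Xi(q):
    """4x3 matrix Ξ(q) of Eq. (8.19b), scalar part last."""
    v, s = q[:3], q[3]
    vx = np.array([[0., -v[2], v[1]],
                   [v[2], 0., -v[0]],
                   [-v[1], v[0], 0.]])
    return np.vstack([s * np.eye(3) + vx, -v[np.newaxis, :]])

def mekf_reset(q_minus, dtheta_plus):
    """MEKF quaternion reset, Eqs. (8.74)-(8.75): linear update of the
    global quaternion followed by brute-force renormalization."""
    q_prime = q_minus + 0.5 * Xi(q_minus) @ dtheta_plus   # Eq. (8.74)
    return q_prime / np.linalg.norm(q_prime)              # Eq. (8.75)
```

As the text explains, the normalization step makes this exactly the Rodrigues-parameter reset of Eq. (8.73), so no quaternion norm error accumulates over many updates.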
If a measurement update immediately follows a reset or propagation, the δϑ̂⁻ term on
the right side of Eq. (8.71) can be omitted because δϑ̂⁻ is zero. The reset is often delayed
for computational efficiency until all the updates for a set of simultaneous measurements
have been performed, though, in which case δϑ̂⁻ may have a finite value and all the terms
in Eq. (8.71) must be included. It is imperative to perform a reset before beginning the time
propagation, however, to avoid the necessity of propagating δϑ̂ between measurements.
There is some controversy over the question of whether the reset affects the covariance.
One argument holds that it does not, because the covariance depends not on the actual
measurements but on their assumed statistics. Measurement errors are assumed to have
zero mean, so the mean reset is zero. But the reset is very different from the measurement
update in that it changes the reference frame for the attitude covariance, which might be
expected to modify the covariance even though it adds no new information. The change
in the covariance of δϑ resulting from the effect of the actual update, rather than its zero
expectation, can be computed to be [47, 60]
P^reset_δϑδϑ = (I3 − [δϑ̂⁺×]/2) P⁺_δϑδϑ (I3 − [δϑ̂⁺×]/2)^T (8.76)
to first order in δϑ̂⁺. Comparison with Eq. (8.7) shows that the covariance reset looks to
this order like a rotation by δϑ̂⁺/2, but this equivalence does not hold in higher orders. Most
applications omit this covariance reset, but Reynolds has found that it speeds convergence
and adds robustness in the presence of large updates, and that omitting it can even lead to
filter divergence in some cases [60].
8.9.3. Propagation An EKF must propagate the expectation and covariance of the
state. The MEKF is unusual in propagating the expectation q̂ and the covariance of the
error-state vector. The propagation of the attitude error is found by differentiating
Eq. (8.64):
q̇true = δq̇ ⊗ q̂ + δq ⊗ dq̂/dt (8.77)
The true and estimated quaternions satisfy the kinematic equations
q̇true = (1/2) [ωtrue; 0] ⊗ qtrue (8.78a)
dq̂/dt = (1/2) [ω̂; 0] ⊗ q̂ (8.78b)
where ωtrue and ω̂ are the true and estimated angular rates, respectively. Substituting these
equations and Eq. (8.64) into Eq. (8.77), multiplying on the right by q̂⁻¹, and rearranging
terms gives [43]
δq̇ = (1/2) ([ωtrue; 0] ⊗ δq − δq ⊗ [ω̂; 0]) (8.79)
Substituting Eq. (8.65) and ωtrue = ω̂ + δω, where δω is the angular velocity error, and
multiplying by two leads to
[δϑ̇; 0] = [ω̂; 0] ⊗ [δϑ/2; 1] − [δϑ/2; 1] ⊗ [ω̂; 0] + [δω; 0] ⊗ [δϑ/2; 1] (8.80)
Ignoring products of the small terms δω and δϑ, in the spirit of the (linearized) EKF, the
first three components of Eq. (8.80) are
δϑ̇ = −ω̂ × δϑ + δω (8.81)
and the fourth component is 0 = 0. Equation (8.81) is the equation needed to propagate
the attitude error-angle covariance.
The expectation of Eq. (8.81) is
d(δϑ̂)/dt = −ω̂ × δϑ̂ (8.82)
because δω has zero expectation. This says that if δϑ̂ is zero at the beginning of a propagation
it will remain zero through the propagation, which is equivalent to saying that δq̂
will be equal to the identity quaternion throughout the propagation.
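Equation (8.81) gives the attitude error-state dynamics matrix F = −[ω̂×]. A minimal first-order discrete covariance propagation step in Python; the process-noise density Q is an illustrative assumption (e.g. from a gyro noise budget), not a value from this report:

```python
import numpy as np

def cross(v):
    """Cross-product matrix [v×]."""
    return np.array([[0., -v[2], v[1]],
                     [v[2], 0., -v[0]],
                     [-v[1], v[0], 0.]])

def propagate_cov(P, w_hat, Q, dt):
    """One first-order discrete step of the attitude error-angle covariance
    from Eq. (8.81): Phi ≈ I - [ω̂×] dt, then P ← Phi P Phiᵀ + Q dt,
    where Q is an assumed process-noise spectral density."""
    Phi = np.eye(3) - cross(w_hat) * dt
    return Phi @ P @ Phi.T + Q * dt
```

Higher-fidelity implementations would use the exact transition matrix (a rotation by ω̂Δt) and an integrated process-noise matrix, but the structure is the same.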
CHAPTER 9
Usability Considerations
This chapter is a catch-all to address best practices for things like selective processing
of measurements, backup ephemeris, reinitializations and restarts, availability of ground-
commandable tuning parameters, etc.
9.1. Editing
Let r = y − h(x), where y is the observed measurement, h(x) is the value of the measurement
computed from the state x, y = h(x) + v, and v is the measurement noise, with E[v] = 0
and E[vv^T] = R. The quantity r is known as the innovation or sometimes the pre-fit residual.
The covariance of r is given by
W = H P H^T + R (9.1)
where P = E[ee^T], H = ∂h(x)/∂x, and e is the error in the estimate of x. The squared
innovation can then be compared against a multiple of W to decide whether to accept or
edit the measurement.
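A common form of the resulting edit test compares r² against a multiple of W. A Python sketch; the 3σ threshold is an illustrative tuning choice, not a value prescribed by this report:

```python
import numpy as np

def accept(r, H, P, R, nsig=3.0):
    """Accept a scalar measurement only if the squared innovation is
    within nsig^2 times its predicted variance W = H P Hᵀ + R (Eq. 9.1)."""
    W = float(H @ P @ H.T + R)
    return r * r <= nsig**2 * W
```

In practice the threshold nsig is one of the tuning parameters that should remain ground-commandable, for the reasons discussed below.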
Experience has shown that the fairly drastic step of completely re-initializing the
filter may not always be necessary or prudent. In particular, under conditions in which it
is reasonably clear that the filter has begun to edit measurements because its covariance
matrix has become overly optimistic, but there remains reason to believe that the state
estimate has not yet become corrupted, it may be beneficial to reinitialize the covariance
while retaining the current state estimate. It may also be desirable to retain flexibility
to retain only the position/velocity state components, while reinitializing the various bias
components.
If the filter has halted for some reason other than divergence (e.g. a flight computer
reset), or if the start of the divergence can be reliably determined, it may be useful to
“restart” the filter from a previously saved state and covariance, especially if there would
otherwise be a long time required for re-convergence. To enable such a restart capability,
the full state and covariance must have been periodically saved, and then they must be
propagated to the restart time.
Periodically saving the filter state also enables the capability to maintain a backup
ephemeris, which provides an additional comparison source for evaluating filter divergence.
For flight phases of limited duration, the backup ephemeris may be propagated inertially
without any measurement updates. For extended operations, it will usually be necessary
to re-seed the backup with a current filter state at periodic intervals. In some applications,
such as GPS filtering, an independent “point solution” may also serve as a useful comparison
source.
9.3. Ground System Considerations
While it may seem intuitive that all parameters affecting filter performance should be
available for re-tuning from the ground via mechanisms such as commands, table uploads,
etc., experience has shown that decisions about ground system design often limit the accessi-
bility of key tuning parameters. Adequate bandwidth in telemetry for full insight into filter
performance, including access to full covariance data, must be available, if only for limited
periods during commissioning and/or troubleshooting activities. The ground system must
be able to reproduce the onboard filter’s performance when provided with corresponding
input data via telemetry playback. And the ground system must also be able to form “best
estimated trajectories” for comparison to onboard filter performance, e.g. through the use
of smoothing algorithms.
CHAPTER 10
Smoothing
Since this work is primarily concerned with onboard navigation filters, one might ques-
tion the need for a chapter on best practices for smoothing. While the addition of a smoother
to an onboard navigation system has usually proved unnecessary, smoothing has nonethe-
less proved to be a useful ancillary capability for trajectory reconstruction by ground-based
analysts. Smoothed trajectories form the basis for our best proxies for truth, in the form of
"best estimated trajectories" (BET), and McReynolds's "filter-smoother consistency test,"
propagated by Jim Wright [77], has proven to be a useful aid to tuning a filter using flight
data. Also, sequential smoothing techniques can provide optimal estimates of the process
noise sequence, as Appendix X of Bierman’s text [2] shows. These estimates may prove
useful as part of the filter tuning process.
It is also worth mentioning the topic of "smoothability." As described in, for example,
Gelb [20], only states that are controllable from the process noise will be affected by
smoothing. So, for example, estimates of random constant measurement biases cannot be
improved by smoothing.
In point of fact there are three types of smoothing: fixed-interval smoothing, fixed-lag
smoothing, and fixed-point smoothing. The context described above is concerned with fixed-
interval smoothing. Maximum likelihood estimation (MLE) of states over a fixed interval has
been the subject of investigation ever since the advent of the Kalman filter [36] in 1961. In 1962,
Bryson and Frazier first approached the problem from a continuous time perspective [4] and
obtained the smoother equations as necessary conditions of an optimal control problem†. In
1965, Rauch, Tung and Striebel [59] (RTS) continued the development of the MLE filters, but
from a discrete time perspective. Their smoother, soon called the RTS smoother, was widely
used because of its ease of implementation. However, as Bierman [2] and others [51] pointed
out, numerical difficulties can sometimes arise in implementing the RTS smoother. A
short time later, Fraser and Potter [17,18] approached the problem a bit differently, looking
at smoothing as an optimal combination of two optimal linear filters, and obtained different,
yet equivalent, equations. Bierman's Square-Root Information Filter [2] (SRIF) also has
an accompanying smoother form, suitable for applications utilizing the SRIF. Since the
Fraser-Potter form avoids the numerical issues of the RTS form, and since it can be easily
adapted from existing onboard Kalman-type forward filtering algorithms, it is generally to
be preferred.
†The Bryson-Frazier smoother is a continuous time instantiation of the smoother. It will not be
considered here because we are interested in discrete-time smoothers. [4]
The boundary conditions for the Fraser-Potter (FP) smoother require the backward
covariance at the final time to be infinite, and the backward filter’s final state to be zero.
Fraser and Potter avoided the infinity by maintaining the backward filter covariance in
information form, so that both the information matrix and the information vector are
zero. As Brown points out [3], the backward filter may be retained in covariance form,
and the infinite boundary condition covariance replaced by either a covariance that is many
multiples of the forward filter’s initial covariance, or by the covariance and state from a short
batch least-squares solution using the final few measurements. Many practical smoother
implementations used by NASA have followed an approach of this sort.
Thus, a practical covariance form of the FP smoother results from running whatever
implementation of Algorithm 1.3 has proved suitable for the application at hand, but in
reverse time, and combining the backward filter results with the forward filter results at
each measurement time. Given the forward filter state and covariance, \hat{X}^+_{F_i} and P^+_{F_i}, which
include the measurement at t_i, and the backward filter state and covariance, \hat{X}_{B_i} and P_{B_i},
which do not include the measurement at t_i, the optimally smoothed state and covariance
at t_i are given by
\hat{X}^S_i = P^S_i \left[ (P^+_{F_i})^{-1} \hat{X}^+_{F_i} + (P_{B_i})^{-1} \hat{X}_{B_i} \right]  (10.1)
P^S_i = \left[ (P^+_{F_i})^{-1} + (P_{B_i})^{-1} \right]^{-1}  (10.2)
If covariance form is to be retained, the numerous matrix inverses apparent in Eqs. (10.1)
and (10.2) may be avoided as follows. Suppose we define the smoothed state as a linear
fusion of the forward and backward filter states:
\hat{X}^S_i = W_{F_i} \hat{X}^+_{F_i} + W_{B_i} \hat{X}_{B_i}  (10.3)
For \hat{X}^S_i to be unbiased, we must choose either W_{F_i} = I - W_{B_i} or W_{B_i} = I - W_{F_i}. Choosing
the latter, the smoothed state becomes
\hat{X}^S_i = W_{F_i} \hat{X}^+_{F_i} + (I - W_{F_i}) \hat{X}_{B_i}  (10.4)
Given the enforced lack of correlation between the forward and backward filters, the fused
(smoothed) covariance is given by
P^S_i = W_{F_i} P^+_{F_i} W_{F_i}^T + W_{B_i} P_{B_i} W_{B_i}^T  (10.5)
= W_{F_i} P^+_{F_i} W_{F_i}^T + (I - W_{F_i}) P_{B_i} (I - W_{F_i})^T  (10.6)
Choosing W_{F_i} to minimize the trace of P^S_i results in
W_{F_i} = P_{B_i} (P^+_{F_i} + P_{B_i})^{-1}  (10.7)
To see that Eq. (10.6), with Eq. (10.7), is equal to Eq. (10.2), expand Eq. (10.6),
substituting Eq. (10.7), and recall Woodbury's identity:
\left[ (P^+_{F_i})^{-1} + (P_{B_i})^{-1} \right]^{-1} = P_{B_i} - P_{B_i} (P^+_{F_i} + P_{B_i})^{-1} P_{B_i}  (10.8)
= P^+_{F_i} - P^+_{F_i} (P^+_{F_i} + P_{B_i})^{-1} P^+_{F_i}  (10.9)
To see that Eq. (10.4), with Eq. (10.7), is equal to Eq. (10.1), use Eq. (10.8) to show that
P^S_i (P_{B_i})^{-1} = W_{B_i}, and use Eq. (10.9) to show that P^S_i (P^+_{F_i})^{-1} = W_{F_i}.
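This equivalence is easy to check numerically. The following sketch is illustrative NumPy code, not from this report (the function name fp_smooth is invented here); it fuses a forward and a backward estimate using Eqs. (10.7), (10.4), and (10.6), the inverse-free forms:

```python
import numpy as np

def fp_smooth(xF, PF, xB, PB):
    """Fraser-Potter fusion sketch: combine the forward estimate (xF, PF),
    which includes the measurement at t_i, with the backward estimate
    (xB, PB), which does not, per Eqs. (10.7), (10.4), and (10.6)."""
    I = np.eye(xF.size)
    WF = PB @ np.linalg.inv(PF + PB)                   # Eq. (10.7)
    xS = WF @ xF + (I - WF) @ xB                       # Eq. (10.4)
    PS = WF @ PF @ WF.T + (I - WF) @ PB @ (I - WF).T   # Eq. (10.6)
    return xS, PS
```

For well-conditioned covariances the result agrees with the information-weighted forms of Eqs. (10.1) and (10.2) to machine precision, while requiring only a single matrix inverse.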
In a typical application, the forward filter has been running continuously onboard the
vehicle, and ground-based analysts will periodically wish to generate a BET over a particular
span of recently downlinked data. If the telemetry system has recorded and downlinked
the full state and covariance at each measurement time, along with the measurements,
the ground system need only run a “copy” of the forward filter backwards through the
measurements and fuse the data according to Eqs. (10.1) and (10.2). Care must be taken
that regeneration of the state transition matrices and process noise covariances is consistent
with the forward filter’s modeling, and with the negative flow of time.
For various reasons, it may be necessary or desirable to run the forward filter on the
ground as well, e.g. with higher-fidelity modeling than the onboard implementation permits.
If so, it is efficient to store the state transition matrices and process noise covariances
computed in the forward pass for use in the backward filter. In this case, Brown shows that
the backward covariance may be propagated using
P_{B_i} = \Phi^{-1}_{i+1,i} \left[ P^+_{B_{i+1}} + S_{i+1} \right] \Phi^{-T}_{i+1,i}  (10.10)
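A minimal sketch of this backward propagation step follows (illustrative code, not from the report; it assumes the transition matrix and accumulated process noise covariance were stored from the forward pass):

```python
import numpy as np

def backward_cov_step(PB_plus_next, Phi, S):
    """One step of Eq. (10.10): map the backward filter covariance at t_{i+1}
    back to t_i using the stored transition matrix Phi and process noise S."""
    Phi_inv = np.linalg.inv(Phi)
    # Note the + sign on S: the backward filter accumulates process noise
    # with the reversed flow of time, per Eq. (10.10).
    return Phi_inv @ (PB_plus_next + S) @ Phi_inv.T
```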
CHAPTER 11
Advanced Estimation Algorithms
This chapter will describe advanced estimation algorithms that have yet to achieve the
status of best practices, but which appear to the contributors to have good potential for
someday reaching such status.
The Sigma-Point Filter. In its most general form, the sigma-point filter performs
sequential estimation of the n-dimensional state, x, whose nonlinear dynamics over the
time interval [t_k, t_{k+1}] are given by
x_{k+1} = f(x_k, w_k)  (11.1)
The process noise input, w, consists of independent increments whose first two moments
are E[w_k] = 0 and E[w_k w_\ell^T] = Q_k \delta_{k\ell}, where \delta_{k\ell} is the Kronecker delta. Although the
second moment may be a function of the time index, this estimator assumes that all of the
samples of w arise from the same type of distribution, and this work further assumes that
this distribution is Gaussian, so that higher-order moments may be neglected.
The filter sequentially processes an ordered set of measurements, Y_k = [y_0, y_1, ..., y_k], of
the form
y_k = h(x_k, v_k)  (11.2)
where the measurement noise input, v, consists of independent and identically distributed
(again, in this work, Gaussian) increments whose first two moments are E[v_k] = 0 and
E[v_k v_\ell^T] = R_k \delta_{k\ell}. By contrast, the ADDSPF (additive divided difference sigma-point filter) utilizes models where the noise sources enter
additively:
x_{k+1} = f(x_k) + g(x_k) w_k  (11.3)
and
y_k = h(x_k) + v_k.  (11.4)
All sigma-point filters utilize a linear measurement update equation of the form
\hat{x}^+_k = \hat{x}^-_k + K_k \left( y_k - \hat{y}_k \right)  (11.5)
where the accented variables in Eq. 11.5 denote conditional expectations, as in the Kalman
filter:
\hat{x}^+_k = E[x_k | Y_k]  (11.6)
\hat{x}^-_k = E[x_k | Y_{k-1}]  (11.7)
\hat{y}_k = E[y_k | Y_{k-1}]  (11.8)
The gain matrix, K, is based on conditional covariances, as in the Kalman filter:
K_k = P_{xy_k} P_{yy_k}^{-1}  (11.9)
P^-_k = P^-_{xx_k} = E\left[ (x_k - \hat{x}^-_k)(x_k - \hat{x}^-_k)^T | Y_{k-1} \right]  (11.10)
P_{xy_k} = E\left[ (x_k - \hat{x}^-_k)(y_k - \hat{y}_k)^T | Y_{k-1} \right]  (11.11)
P_{yy_k} = E\left[ (y_k - \hat{y}_k)(y_k - \hat{y}_k)^T | Y_{k-1} \right]  (11.12)
and the covariance associated with state estimate \hat{x}^+_k is
P^+_k = P^+_{xx_k} = E\left[ (x_k - \hat{x}^+_k)(x_k - \hat{x}^+_k)^T | Y_k \right]  (11.14)
Hereafter, equations will suppress the time index if it is the same for all variables in the
equation.
Estimators such as the Kalman filter estimate these conditional expectations by approximating
the nonlinear functions f and h with first-order Taylor series truncations, e.g.:
f(x) \approx f(\hat{x}^-) + f'(\hat{x}^-)(x - \hat{x}^-)  (11.15)
where f' is an exact gradient. By contrast, the divided difference filter uses a second-order
truncation along with numerical differencing formulas for the derivatives:
f(x) \approx f(\hat{x}^-) + \tilde{D}^{(1)}_{\Delta x} f(\hat{x}^-) + \tilde{D}^{(2)}_{\Delta x} f(\hat{x}^-)  (11.16)
where the divided difference operators, \tilde{D}^{(i)}_{\Delta x} f(\hat{x}^-), approximate the coefficients of the
multidimensional Taylor series expansion using Stirling interpolations. These interpolators
difference perturbations of f(\hat{x}^-) across an interval, h, over a spanning basis set. Whether
they are first-order, such as the unscented filter, or second-order, sigma-point filters choose
the interval so as to better approximate the moments required for the gain calculation, and
choose as the spanning basis a set of sigma points, which are derived from \hat{x}^- and the
columns of the Cholesky factors of P^- as follows.
The ADDSPF. Let \hat{X} denote the array whose columns are a particular ordering of the
sigma points derived from \hat{x} and its corresponding covariance, P. Then
\hat{X} = \left[ \hat{x},\ \hat{x} + h\sqrt[c]{P}_{(:,1)},\ \hat{x} + h\sqrt[c]{P}_{(:,2)},\ \dots,\ \hat{x} - h\sqrt[c]{P}_{(:,1)},\ \hat{x} - h\sqrt[c]{P}_{(:,2)},\ \dots \right]  (11.17)
where the subscript (:, i) denotes column i of the corresponding array, and P = \sqrt[c]{P}\left(\sqrt[c]{P}\right)^T
denotes a Cholesky factorization. In the sequel, the shorthand notation \hat{x} \pm h\sqrt[c]{P} will denote
the array on the right-hand side of the equation above. Then, for the ADDSPF, Ref. 41
shows that as each new measurement becomes available, an array of sigma points generated
from the prior update should be propagated to the new measurement time:
\hat{X}^-_k = f(\hat{X}^+_{k-1})  (11.18)
These propagated sigma points are then merged to form the state estimate just prior to
incorporating the new measurement as follows:
\hat{x}^- = \mu_h(\hat{X}^-) = \frac{h^2 - n}{h^2} \hat{X}^-_{(:,1)} + \frac{1}{2h^2} \sum_{i=2}^{2n+1} \hat{X}^-_{(:,i)}  (11.19)
To form an associated covariance, the following divided differences are next computed:
\tilde{D}^{(1)}_{\Delta x} f(\hat{x}^-)_{(:,i)} = \frac{1}{2h} \left[ \hat{X}^-_{(:,i+1)} - \hat{X}^-_{(:,i+1+n)} \right]  (11.20)
\tilde{D}^{(2)}_{\Delta x} f(\hat{x}^-)_{(:,i)} = \frac{\sqrt{h^2 - 1}}{2h^2} \left[ \hat{X}^-_{(:,i+1)} + \hat{X}^-_{(:,i+1+n)} - 2\hat{X}^-_{(:,1)} \right]  (11.21)
Ref. 41 shows that the covariance may then be computed from
P^- = \left[ \tilde{D}^{(1)}_{\Delta x} f(\hat{x}^-),\ \tilde{D}^{(2)}_{\Delta x} f(\hat{x}^-),\ \sqrt[c]{Q_d} \right] \left[ \tilde{D}^{(1)}_{\Delta x} f(\hat{x}^-),\ \tilde{D}^{(2)}_{\Delta x} f(\hat{x}^-),\ \sqrt[c]{Q_d} \right]^T  (11.22)
One advantage of sigma-point filters is that the full covariance need not be maintained,
but rather only its Cholesky factor. Although the factors in square brackets in Eq. 11.22
are not Cholesky factors, since each is a full n \times 3n matrix, one may extract an n \times n
triangular factor from it using the so-called "thin" version [23] of the QR decomposition¹,
or alternatively using a Householder factorization [2]. Thus,
\begin{bmatrix} \left(\sqrt[c]{P^-}\right)^T \\ O_{2n\times n} \end{bmatrix} = M \left[ \tilde{D}^{(1)}_{\Delta x} f(\hat{x}^-),\ \tilde{D}^{(2)}_{\Delta x} f(\hat{x}^-),\ \sqrt[c]{Q_d} \right]^T  (11.23)
where M is a full 3n \times 3n orthonormal matrix, and O_{2n\times n} is a 2n \times n matrix of zeros.
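Equations (11.17) through (11.23) can be collected into a single time-update routine. The NumPy fragment below is an illustrative sketch only (names are invented here, not from the report); it takes h^2 = 3, the Gaussian choice of the interval parameter discussed later in this chapter:

```python
import numpy as np

def dd2_time_update(xhat, S, f, Sq, h=np.sqrt(3.0)):
    """Divided-difference (DD2) time update sketch, per Eqs. (11.17)-(11.23).
    xhat: prior mean (n,); S: Cholesky factor of its covariance;
    f: dynamics function; Sq: Cholesky factor of the process noise Qd."""
    n = xhat.size
    # Sigma points [xhat, xhat + h*S columns, xhat - h*S columns], Eq. (11.17)
    X = np.column_stack([xhat] + [xhat + h * S[:, i] for i in range(n)]
                        + [xhat - h * S[:, i] for i in range(n)])
    FX = np.column_stack([f(X[:, j]) for j in range(2 * n + 1)])  # Eq. (11.18)
    # Weighted merge of the propagated sigma points, Eq. (11.19)
    xbar = (h * h - n) / (h * h) * FX[:, 0] + FX[:, 1:].sum(axis=1) / (2 * h * h)
    # First- and second-order divided differences, Eqs. (11.20)-(11.21)
    D1 = (FX[:, 1:n + 1] - FX[:, n + 1:2 * n + 1]) / (2 * h)
    D2 = (np.sqrt(h * h - 1) / (2 * h * h)) * (
        FX[:, 1:n + 1] + FX[:, n + 1:2 * n + 1] - 2 * FX[:, [0]])
    # Thin QR of the compound array's transpose extracts the factor, Eq. (11.23)
    R = np.linalg.qr(np.hstack([D1, D2, Sq]).T, mode='r')
    return xbar, R.T  # R.T is a triangular Cholesky factor of the covariance
```

For a linear dynamics function the update reduces exactly to the familiar Kalman time update, which provides a convenient unit test.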
For the measurement update, a new array of sigma points must be generated from \hat{x}^-
and P^-; this array is denoted \hat{X}^*. These sigma points are used to generate a set of sigma
points representing the measurement:
\hat{Y} = h(\hat{X}^*)  (11.24)
In similar fashion to the time update, the sigma points of the measurement are then merged
to form the estimated measurement:
\hat{y} = \mu_h(\hat{Y})  (11.25)
¹For Matlab users, this may be accomplished in several ways, e.g. by passing the transpose of this
matrix to the qr function, then keeping the first n non-zero rows from the second output, and transposing
this result.
least one free parameter. In the unscented filter, the weights for combining the sigma points
involve three parameters whose physical interpretation is perhaps less clear than with the
single parameter in the divided difference algorithms, where the free parameter h is clearly
associated with the size of the perturbation in the numerical differencing formulae. Ref. 53
shows that h should be bounded below by h > 1, and that for symmetric distributions, h
should be equal to the kurtosis, which for a Gaussian distribution is three.³
³Some authors subtract three from the definition of kurtosis, so that Gaussian distributions have zero
kurtosis.
APPENDIX A
Random Variables
A continuous random variable is a function that maps the outcomes of random events to
the real line. Realizations of random variables are thus real numbers. A vector of n random
variables maps outcomes of random events to Rn . For our purposes, random variables
will always be associated with a probability density function that indicates the likelihood
that a realization occurs within a particular interval of the real line, or within a particular
subspace of Rn for the vector case. It is common to assume that this density is the normal
or Gaussian density. For the vector case, the normal probability density function is
p_x(x) = \frac{1}{\sqrt{|2\pi P|}} \exp\left[ -\frac{1}{2} (x - \mu)^T P^{-1} (x - \mu) \right]  (A.1)
where \mu is a vector of mean values for each component of x, and P is a matrix that contains
the variances of each component of x along its diagonal, and the covariances between
each component as its off-diagonal components. The covariances indicate the degree of
correlation between the random variables composing x. The matrix P is thus called the
variance-covariance matrix, which we will hereafter abbreviate to just "covariance matrix,"
or “covariance.” Since the normal density is completely characterized by its mean and co-
variance, we will use the following notation as a shorthand to describe drawing a realization
from a normally-distributed random vector:
x \sim N(\mu, P)  (A.2)
Thus, the model for realizations of a measurement noise vector is
v \sim N(0, R)  (A.3)
For the scalar case, or for the vector case when the covariance is diagonal, we may
directly generate realizations of a normally-distributed random vector from normal random
number generators available in most software libraries. If P has non-zero off-diagonal
elements, we must model the specified correlations when we generate realizations. If P is
strictly positive definite, we can factor it as follows:
P = \sqrt[c]{P} \left( \sqrt[c]{P} \right)^T  (A.4)
where \sqrt[c]{P} is a triangular matrix known as a Cholesky factor; this can be viewed as a "matrix
square root." The Cholesky factorization is available in many linear algebra libraries. We
can then use \sqrt[c]{P} to generate correlated realizations of x as follows. Let z be a realization
of a normally-distributed random vector of the same dimension as x, with zero mean and
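The sampling procedure this appendix builds toward, coloring uncorrelated unit-variance draws with the Cholesky factor, can be sketched as follows (illustrative NumPy code; the function name is invented here):

```python
import numpy as np

def correlated_samples(mu, P, m, seed=None):
    """Draw m realizations of x ~ N(mu, P) using the Cholesky factor of P."""
    rng = np.random.default_rng(seed)
    C = np.linalg.cholesky(P)               # lower triangular, P = C @ C.T
    Z = rng.standard_normal((mu.size, m))   # uncorrelated unit-variance draws z
    return mu[:, None] + C @ Z              # each column is one realization of x
```

The sample mean and sample covariance of a large batch of such draws recover mu and P to within statistical error.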
so Eq. (B.16) becomes
\breve{U}_{aa} \breve{D}_{aa} \breve{U}_{aa}^T = \tilde{U}_{aa} \tilde{D}_{aa} \tilde{U}_{aa}^T + \frac{\tilde{d}_b q_b}{\breve{d}_b} \tilde{U}_{ab} \tilde{U}_{ab}^T  (B.18)
and
\tilde{p}_{ii} = \sum_{k=i}^{n} \tilde{u}_{ik}^2 \tilde{d}_{kk} = \sum_{k=i}^{n} u_{ik}^2 d_{kk} + c\, x_i^2  (B.23)
We recall that \tilde{u}_{ii} = u_{ii} = 1 and thus, for an n \times n matrix, for j = n (i.e. the last column),
\tilde{d}_{nn} = d_{nn} + c\, x_n^2  (B.24)
\tilde{p}_{in} = \tilde{u}_{in} \tilde{d}_{nn} \tilde{u}_{nn} = d_{nn} u_{in} u_{nn} + c\, x_i x_n  (B.25)
so that
\tilde{u}_{in} = \frac{1}{\tilde{d}_{nn}} \left( d_{nn} u_{in} + c\, x_i x_n \right)  (B.26)
110 B. FACTORIZATION MATH
The second-to-last ((n-1)-th) column of U can now be operated on, by means
of the following decomposition of Eq. (B.22), as
\sum_{k=j}^{n-1} \tilde{u}_{ik} \tilde{d}_{kk} \tilde{u}_{jk} + \tilde{u}_{in} \tilde{d}_{nn} \tilde{u}_{jn} = \sum_{k=j}^{n-1} u_{ik} d_{kk} u_{jk} + u_{in} d_{nn} u_{jn} + c\, x_i x_j  (B.27)
which leads to
\sum_{k=j}^{n-1} \tilde{u}_{ik} \tilde{d}_{kk} \tilde{u}_{jk} = \sum_{k=j}^{n-1} u_{ik} d_{kk} u_{jk} + \Upsilon_n  (B.28)
If we work on the terms outside the two summations, using Eq. (B.26) for \tilde{u}_{in} and \tilde{u}_{jn}, \Upsilon_n
becomes
\Upsilon_n = -\tilde{u}_{in} \tilde{d}_{nn} \tilde{u}_{jn} + u_{in} d_{nn} u_{jn} + c\, x_i x_j
= -\frac{1}{\tilde{d}_{nn}} \left[ d_{nn} u_{in} + c\, x_i x_n \right] \left[ d_{nn} u_{jn} + c\, x_j x_n \right] + \frac{d_{nn} + c\, x_n^2}{\tilde{d}_{nn}} \left( u_{in} d_{nn} u_{jn} + c\, x_i x_j \right)
= \frac{1}{\tilde{d}_{nn}} \left[ -c\, d_{nn} u_{jn} x_n x_i - c\, d_{nn} u_{in} x_n x_j + c\, d_{nn} x_i x_j + c\, d_{nn} x_n^2 u_{in} u_{jn} \right]
= \frac{c\, d_{nn}}{\tilde{d}_{nn}} \left( x_i - u_{in} x_n \right) \left( x_j - u_{jn} x_n \right)  (B.29)
Therefore, Eq. (B.27) can be written as
\sum_{k=j}^{n-1} \tilde{u}_{ik} \tilde{d}_{kk} \tilde{u}_{jk} = \sum_{k=j}^{n-1} u_{ik} d_{kk} u_{jk} + \frac{c\, d_{nn}}{\tilde{d}_{nn}} \left( x_i - u_{in} x_n \right) \left( x_j - u_{jn} x_n \right)  (B.30)
Now, if we operate a bit more on the quantity \tilde{u}_{in}, we find from Eq. (B.26) that we get
\tilde{u}_{in} = \frac{d_{nn}}{\tilde{d}_{nn}} u_{in} + \frac{c}{\tilde{d}_{nn}} x_i x_n  (B.31)
= \frac{\tilde{d}_{nn} - c\, x_n^2}{\tilde{d}_{nn}} u_{in} + \frac{c}{\tilde{d}_{nn}} x_i x_n  (B.32)
= u_{in} + \frac{c\, x_n}{\tilde{d}_{nn}} \left( x_i - u_{in} x_n \right)  (B.33)
and if we define \alpha_i and v_n as
\alpha_i = (x_i - u_{in} x_n)  (B.34)
v_n = \frac{c\, x_n}{\tilde{d}_{nn}}  (B.35)
then \tilde{u}_{in} can be rewritten as
\tilde{u}_{in} = u_{in} + \alpha_i v_n  (B.36)
If we want to generalize this, we can write Eq. (B.30) as
\sum_{k=j}^{n-1} \tilde{u}_{ik} \tilde{d}_{kk} \tilde{u}_{jk} = \sum_{k=j}^{n-1} u_{ik} d_{kk} u_{jk} + C^n X_i^n X_j^n  (B.37)
B.3. THE AGEE-TURNER RANK-ONE UPDATE 111
where
C^n = \frac{c\, d_{nn}}{\tilde{d}_{nn}}  (B.38)
X_i^n = x_i - u_{in} x_n  (B.39)
which produces
\sum_{k=j}^{n-2} \tilde{u}_{ik} \tilde{d}_{kk} \tilde{u}_{jk} = \sum_{k=j}^{n-2} u_{ik} d_{kk} u_{jk} + \frac{C^n d_{n-1,n-1}}{\tilde{d}_{n-1,n-1}} \left[ X_i^n - u_{i,n-1} X_{n-1}^n \right] \left[ X_j^n - u_{j,n-1} X_{n-1}^n \right]  (B.41)
so that using the same machinery as above, we get
\sum_{k=j}^{n-2} \tilde{u}_{ik} \tilde{d}_{kk} \tilde{u}_{jk} = \sum_{k=j}^{n-2} u_{ik} d_{kk} u_{jk} + C^{n-1} X_i^{n-1} X_j^{n-1}  (B.42)
where
C^{n-1} = \frac{C^n d_{n-1,n-1}}{\tilde{d}_{n-1,n-1}}  (B.43)
X_i^{n-1} = \alpha_i^{n-1} = X_i^n - u_{i,n-1} X_{n-1}^n  (B.44)
v_{n-1} = \frac{C^n X_{n-1}^n}{\tilde{d}_{n-1,n-1}}  (B.45)
\tilde{u}_{i,n-1} = u_{i,n-1} + X_i^{n-1} v_{n-1}  (B.46)
\tilde{d}_{n-1,n-1} = d_{n-1,n-1} + C^n \left( X_{n-1}^n \right)^2  (B.47)
We also are reminded that \tilde{u}_{i,i} = 1.
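The column-by-column recursion above can be collected into one routine. The sketch below is illustrative (names invented here, not from the report) and implements the Agee-Turner rank-one update, producing a unit upper-triangular factor and diagonal such that the updated U-D factors reproduce U D U^T + c x x^T:

```python
import numpy as np

def agee_turner(U, d, c, x):
    """Agee-Turner rank-one update sketch:
    Ut*diag(dt)*Ut^T = U*diag(d)*U^T + c*x*x^T,
    with U unit upper triangular, d the diagonal of D, and c > 0."""
    n = len(d)
    Ut = np.array(U, dtype=float, copy=True)
    dt = np.array(d, dtype=float, copy=True)
    x = np.array(x, dtype=float, copy=True)
    C = c                                    # running coefficient, cf. C^j
    for j in range(n - 1, 0, -1):
        dt[j] = d[j] + C * x[j] ** 2         # cf. Eqs. (B.24), (B.47)
        v = C * x[j] / dt[j]                 # cf. Eqs. (B.35), (B.45)
        C = C * d[j] / dt[j]                 # cf. Eqs. (B.38), (B.43)
        for i in range(j):
            x[i] = x[i] - x[j] * U[i, j]     # cf. Eqs. (B.34), (B.44)
            Ut[i, j] = U[i, j] + x[i] * v    # cf. Eqs. (B.36), (B.46)
    dt[0] = d[0] + C * x[0] ** 2
    return Ut, dt
```

A direct reconstruction check against the full-matrix sum verifies the recursion.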
\tilde{U} \tilde{D} \tilde{U}^T = \bar{D} - \frac{1}{\alpha} v v^T  (B.61)
which can be rewritten as:
\tilde{U} \tilde{D} \tilde{U}^T = \bar{U} \bar{D} \bar{U}^T - \frac{1}{\alpha} v v^T  (B.62)
where \bar{U} = I. Following the reasoning in the description for the rank-one update earlier
in this appendix,
\tilde{p}_{ij} = \sum_{k=j}^{n} \tilde{u}_{ik} \tilde{d}_{kk} \tilde{u}_{jk} = -\frac{1}{\alpha} v_i v_j  (B.63)
and
\tilde{p}_{ii} = \sum_{k=i}^{n} \tilde{u}_{ik}^2 \tilde{d}_{kk} = d_{ii} - \frac{1}{\alpha} v_i^2  (B.64)
Substituting for \tilde{u}_{in} and \tilde{u}_{jn} from Eq. (B.68), we find
\sum_{k=j}^{n-1} \tilde{u}_{ik} \tilde{d}_{kk} \tilde{u}_{jk} = -\frac{1}{\alpha} \left[ 1 + \frac{1}{\alpha \tilde{d}_{nn}} v_n^2 \right] v_i v_j  (B.70)
But since
1 + \frac{1}{\alpha \tilde{d}_{nn}} v_n^2 = \frac{d_{nn}}{\tilde{d}_{nn}}  (B.71)
\tilde{u}_{n-1,n-1}^2 \tilde{d}_{n-1,n-1} + \tilde{u}_{n-1,n}^2 \tilde{d}_{n,n} = d_{n-1,n-1} - \frac{1}{\alpha} v_{n-1}^2  (B.74)
Recalling that \tilde{u}_{n-1,n-1} = 1 and that \tilde{u}_{n-1,n} was obtained (with i = n-1) in Eq. (B.68), we get
\tilde{d}_{n-1,n-1} = d_{n-1,n-1} - \frac{1}{\alpha} \left[ 1 + \frac{1}{\alpha \tilde{d}_{n,n}} v_n^2 \right] v_{n-1}^2  (B.75)
Knowing that
1 + \frac{1}{\alpha \tilde{d}_{n,n}} v_n^2 = \frac{d_{n,n}}{\tilde{d}_{n,n}}  (B.76)
Now we work on \tilde{u}_{i,n-1}. We recall from Eq. (B.63), with j = n-1, that
\tilde{u}_{i,n-1} \tilde{d}_{n-1,n-1} \tilde{u}_{n-1,n-1} + \tilde{u}_{i,n} \tilde{d}_{n,n} \tilde{u}_{n-1,n} = -\frac{1}{\alpha} v_i v_{n-1}  (B.78)
We substitute for \tilde{d}_{n-1,n-1} from Eq. (B.77), for \tilde{u}_{i,n} and \tilde{u}_{n-1,n} from Eq. (B.68), and noting
that \tilde{u}_{n-1,n-1} = 1, we get
\tilde{u}_{i,n-1} = -\frac{1}{\alpha \tilde{d}_{n-1,n-1}} \left[ 1 + \frac{1}{\alpha \tilde{d}_{n,n}} v_n^2 \right] v_i v_{n-1}  (B.79)
Using Eq. (B.76), \tilde{u}_{i,n-1} becomes
\tilde{u}_{i,n-1} = -\frac{1}{\alpha \tilde{d}_{n-1,n-1}} \left( \frac{d_{n,n}}{\tilde{d}_{n,n}} \right) v_i v_{n-1}  (B.80)
B.5. THE CARLSON RANK-ONE UPDATE 115
With this in mind, we are now prepared to work on Eq. (B.73), and we find that
\sum_{k=j}^{n-2} \tilde{u}_{ik} \tilde{d}_{kk} \tilde{u}_{jk} = -\frac{1}{\alpha} \frac{d_{n,n}}{\tilde{d}_{n,n}} \left[ 1 + \frac{d_{n,n}}{\alpha \tilde{d}_{n,n}} \frac{v_{n-1}^2}{\tilde{d}_{n-1,n-1}} \right] v_i v_j
= -\left( \frac{d_{n,n}}{\alpha \tilde{d}_{n,n}} \right) \left( \frac{d_{n-1,n-1}}{\tilde{d}_{n-1,n-1}} \right) v_i v_j  (B.81)
This has the same form as Eq. (B.69), so this suggests a recursion as follows.
With C^n = -1/\alpha, for j = n, \dots, 1:
\tilde{d}_{jj} = d_{jj} + C^j v_j^2  (B.82)
\tilde{u}_{ij} = C^j v_i v_j / \tilde{d}_{jj}, \quad i = 1, \dots, j-1  (B.83)
C^{j-1} = C^j d_{jj} / \tilde{d}_{jj}  (B.84)
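The recursion of Eqs. (B.82) through (B.84) can be exercised directly. The sketch below is illustrative (names invented here) and assumes the downdated matrix remains positive definite:

```python
import numpy as np

def rank_one_downdate(d, v, alpha):
    """Sketch of Eqs. (B.82)-(B.84): build unit upper-triangular Ut and
    diagonal dt such that Ut*diag(dt)*Ut^T = diag(d) - (1/alpha)*v*v^T."""
    n = len(d)
    Ut = np.eye(n)
    dt = np.array(d, dtype=float, copy=True)
    C = -1.0 / alpha                            # C^n = -1/alpha
    for j in range(n - 1, -1, -1):
        dt[j] = d[j] + C * v[j] ** 2            # Eq. (B.82)
        for i in range(j):
            Ut[i, j] = C * v[i] * v[j] / dt[j]  # Eq. (B.83)
        C = C * d[j] / dt[j]                    # Eq. (B.84)
    return Ut, dt
```

Reconstructing the product from the returned factors verifies the recursion against the defining rank-one downdate.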
From Eq. (B.82), we get
\tilde{d}_{jj} = d_{jj} + C^j v_j^2
and from Eq. (B.84), we find that
C^{j-1} = C^j \frac{d_{jj}}{\tilde{d}_{jj}} = \frac{C^j d_{jj}}{d_{jj} + C^j v_j^2} = \frac{d_{jj}}{\frac{d_{jj}}{C^j} + v_j^2}  (B.85)
This can be written as
\frac{1}{C^{j-1}} = \frac{1}{C^j} + \frac{v_j^2}{d_{jj}}  (B.86)
or
\frac{1}{C^j} = \frac{1}{C^{j-1}} - \frac{v_j^2}{d_{jj}}  (B.87)
Comparing Eqs. (B.60) and (B.87), we find that
\alpha_j = -\frac{1}{C^j}  (B.88)
Using this equation, we find that
\tilde{d}_{jj} = \left( \frac{\alpha_{j-1}}{\alpha_j} \right) d_{jj}  (B.89)
From Eq. (B.83), and using Eqs. (B.89) and (B.88), we can express \tilde{u}_{ij} as
\tilde{u}_{ij} = -\frac{v_i v_j}{d_{jj}\, \alpha_{j-1}}  (B.90)
Recalling that v_j = d_{jj} f_j,
\tilde{u}_{ij} = -\frac{v_i f_j}{\alpha_{j-1}}  (B.91)
If we define \lambda_j as
\lambda_j = -\frac{f_j}{\alpha_{j-1}}  (B.92)
then \tilde{u}_{ij} becomes
\tilde{u}_{ij} = \lambda_j v_i  (B.93)
and \tilde{U} has the structure
\tilde{U} = \begin{bmatrix} 1 & \lambda_2 v_1 & \lambda_3 v_1 & \lambda_4 v_1 & \cdots & \lambda_n v_1 \\ 0 & 1 & \lambda_3 v_2 & \lambda_4 v_2 & \cdots & \lambda_n v_2 \\ 0 & 0 & 1 & \lambda_4 v_3 & \cdots & \lambda_n v_3 \\ 0 & 0 & 0 & 1 & \cdots & \lambda_n v_4 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 1 \end{bmatrix}  (B.94)
We can rewrite \tilde{U} as
\tilde{U} = I_n + \begin{bmatrix} 0_{n\times 1} & \lambda_2 v^{(1)} & \lambda_3 v^{(2)} & \lambda_4 v^{(3)} & \cdots & \lambda_n v^{(n-1)} \end{bmatrix}  (B.95)
where v^{(j)} is an n \times 1 vector defined as
v^{(j)} = \begin{bmatrix} v_1 & v_2 & v_3 & \cdots & v_j & 0 & \cdots & 0 \end{bmatrix}^T  (B.96)
We recall that
\bar{U} \bar{D} \bar{U}^T = U \left[ D - \frac{1}{\alpha} v v^T \right] U^T
and
\tilde{U} \tilde{D} \tilde{U}^T = D - \frac{1}{\alpha} v v^T
and
\bar{U} = U \tilde{U} \quad\text{and}\quad \bar{D} = \tilde{D}
With this in mind, \bar{U} is
\bar{U} = U + U \begin{bmatrix} 0_{n\times 1} & \lambda_2 v^{(1)} & \lambda_3 v^{(2)} & \lambda_4 v^{(3)} & \cdots & \lambda_n v^{(n-1)} \end{bmatrix}  (B.97)
If we denote U^{(j)} and \bar{U}^{(j)} as the jth columns of U and \bar{U}, respectively, we find that
\bar{U}^{(j)} = U^{(j)} + \lambda_j K_{j-1}  (B.98)
where
K_j = U v^{(j)} = K_{j-1} + v_j U^{(j)}, \quad\text{with } K_0 = 0_{n\times 1}  (B.99)
Finally,
K = \frac{1}{\alpha_n} K_n  (B.100)
APPENDIX C
Dual Inertial vs. Inertial/Relative Trade
This appendix describes a dual inertial-absolute state and dual inertial-relative state
navigation filter trade study performed for Orion. The formulation of each of these filters is
detailed, the advantages and disadvantages of each are discussed, and a recommendation to
use the dual-inertial formulation is made. This appendix is reproduced from CEV Flight
Dynamics Technical Brief Number FltDyn–CEV–07–141, dated December 21, 2007.
C.1. Introduction
Orion will need an efficient and well formulated relative navigation filter. Among the
many possibilities, two of the most promising will be discussed in this report. The two are
the dual inertial-absolute state navigation filter and the dual inertial-relative state naviga-
tion filter. The dual inertial-absolute state filter includes the absolute inertial state of both
vehicles (with respect to the center of mass of the central body). The dual inertial-relative
state navigation filter has as its states the absolute inertial state of the chaser (Orion)
vehicle and the relative inertial state of the target with respect to the chaser (x_{rel} = x_T - x_C).
¹The expectation operator E(·) for continuous random variables is defined as follows:
E(X) = \int_{-\infty}^{\infty} x\, p(x)\, dx
can be expressed as
\dot{x}_{C_{nom}} = f_C(x_{C_{nom}})  (C.3)
\dot{x}_{T_{nom}} = f_T(x_{T_{nom}})  (C.4)
Defining
\delta x_C = x_C - x_{C_{nom}}  (C.5)
\delta x_T = x_T - x_{T_{nom}}  (C.6)
and taking derivatives and expanding to first order, we get
\delta\dot{x}_C = A_C(x_{C_{nom}})\, \delta x_C + w_C  (C.7)
\delta\dot{x}_T = A_T(x_{T_{nom}})\, \delta x_T + w_T  (C.8)
where
A_C(x_{C_{nom}}) = \left( \frac{\partial f_C}{\partial x_C} \right)_{x_C = x_{C_{nom}}} \quad\text{and}\quad A_T(x_{T_{nom}}) = \left( \frac{\partial f_T}{\partial x_T} \right)_{x_T = x_{T_{nom}}}  (C.9)
Equivalently, we can express the filter errors as
\delta\hat{x}_C = \hat{x}_C - x_C  (C.10)
\delta\hat{x}_T = \hat{x}_T - x_T  (C.11)
where, with a bit of abuse of notation,²
\hat{x}_C = E(x_C) \quad\text{and}\quad \hat{x}_T = E(x_T)  (C.12)
so that the filter error dynamics evolve as
\delta\dot{\hat{x}}_C = A_C(x_C)\, \delta\hat{x}_C + w_C  (C.13)
\delta\dot{\hat{x}}_T = A_T(x_T)\, \delta\hat{x}_T + w_T  (C.14)
We can, therefore, write the inertial-absolute³ filter error dynamics (dropping the functional
dependence for compactness) as
\begin{bmatrix} \delta\dot{\hat{x}}_C \\ \delta\dot{\hat{x}}_T \end{bmatrix} = \begin{bmatrix} A_C & 0 \\ 0 & A_T \end{bmatrix} \begin{bmatrix} \delta\hat{x}_C \\ \delta\hat{x}_T \end{bmatrix} + \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} w_C \\ w_T \end{bmatrix}  (C.15)
Defining
P_{IA} = E\left\{ \begin{bmatrix} \delta\hat{x}_C \\ \delta\hat{x}_T \end{bmatrix} \begin{bmatrix} \delta\hat{x}_C^T & \delta\hat{x}_T^T \end{bmatrix} \right\} = \begin{bmatrix} E[\delta\hat{x}_C \delta\hat{x}_C^T] & E[\delta\hat{x}_C \delta\hat{x}_T^T] \\ E[\delta\hat{x}_T \delta\hat{x}_C^T] & E[\delta\hat{x}_T \delta\hat{x}_T^T] \end{bmatrix} = \begin{bmatrix} P_{C,C} & P_{C,T} \\ P_{T,C} & P_{T,T} \end{bmatrix}  (C.16)
where the subscript IA denotes that this is the covariance associated with the inertial-absolute
filter. The differential equation for the covariance (assuming that the plant/process
noise for the two vehicles are independent and are independent of the states of the two
vehicles) is
\dot{P}_{IA} = A_{IA} P_{IA} + P_{IA} A_{IA}^T + G_{IA} Q_{IA} G_{IA}^T  (C.17)
²To be precise, \hat{x}_C should be written in terms of the conditional expectation.
where
A_{IA} = \begin{bmatrix} A_C & 0 \\ 0 & A_T \end{bmatrix}, \quad G_{IA} = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}, \quad Q_{IA} = \begin{bmatrix} Q_C & 0 \\ 0 & Q_T \end{bmatrix}  (C.18)
with the initial condition
P_{IA_0} = \begin{bmatrix} P_{C,C_0} & 0 \\ 0 & P_{T,T_0} \end{bmatrix}  (C.19)
where P_{C,C_0} and P_{T,T_0} are the initial covariances of the chaser and target inertial (absolute)
states, respectively. Finally, the covariance of the relative state is
P_{rel,rel} = E\left[ (\delta\hat{x}_C - \delta\hat{x}_T)(\delta\hat{x}_C - \delta\hat{x}_T)^T \right]  (C.20)
= P_{C,C} + P_{T,T} - P_{T,C} - P_{C,T} = P_{C,C} + P_{T,T} - P_{T,C} - P_{T,C}^T  (C.21)
C.2.2. The Dual Inertial-Relative Filter Dynamics. Consistent with the earlier
definitions, we define the inertial relative state as
x_{rel} = x_T - x_C  (C.22)
Taking derivatives of this equation and substituting from Eqs. (C.1) and (C.2) yields
\dot{x}_{rel} = f_T(x_T) - f_C(x_C) + w_T - w_C  (C.23)
Expanding this equation to first order yields
\delta\dot{\hat{x}}_{rel} = A_T(x_T)\, \delta\hat{x}_T - A_C(x_C)\, \delta\hat{x}_C + w_T - w_C  (C.24)
= A_T(x_T)(\delta\hat{x}_{rel} + \delta\hat{x}_C) - A_C(x_C)\, \delta\hat{x}_C + w_T - w_C  (C.25)
= (A_T - A_C)\, \delta\hat{x}_C + A_T\, \delta\hat{x}_{rel} + (w_T - w_C)  (C.26)
Therefore we write the inertial-relative⁴ filter error dynamics (once again dropping the
functional dependence for compactness) as
\begin{bmatrix} \delta\dot{\hat{x}}_C \\ \delta\dot{\hat{x}}_{rel} \end{bmatrix} = \begin{bmatrix} A_C & 0 \\ A_T - A_C & A_T \end{bmatrix} \begin{bmatrix} \delta\hat{x}_C \\ \delta\hat{x}_{rel} \end{bmatrix} + \begin{bmatrix} I & 0 \\ -I & I \end{bmatrix} \begin{bmatrix} w_C \\ w_T \end{bmatrix}  (C.27)
Defining, as before,
P_{IR} = E\left\{ \begin{bmatrix} \delta\hat{x}_C \\ \delta\hat{x}_{rel} \end{bmatrix} \begin{bmatrix} \delta\hat{x}_C^T & \delta\hat{x}_{rel}^T \end{bmatrix} \right\} = \begin{bmatrix} E[\delta\hat{x}_C \delta\hat{x}_C^T] & E[\delta\hat{x}_C \delta\hat{x}_{rel}^T] \\ E[\delta\hat{x}_{rel} \delta\hat{x}_C^T] & E[\delta\hat{x}_{rel} \delta\hat{x}_{rel}^T] \end{bmatrix}  (C.28)
= \begin{bmatrix} P_{C,C} & P_{C,rel} \\ P_{rel,C} & P_{rel,rel} \end{bmatrix}  (C.29)
where the subscript IR denotes that this is the covariance associated with the inertial-relative
filter. The differential equation for the covariance is
\dot{P}_{IR} = A_{IR} P_{IR} + P_{IR} A_{IR}^T + G_{IR} Q_{IR} G_{IR}^T  (C.30)
where
A_{IR} = \begin{bmatrix} A_C & 0 \\ A_T - A_C & A_T \end{bmatrix}, \quad G_{IR} = \begin{bmatrix} I & 0 \\ -I & I \end{bmatrix}, \quad Q_{IR} = \begin{bmatrix} Q_C & 0 \\ 0 & Q_T \end{bmatrix}  (C.31)
⁴We refer to this filter as the inertial-relative filter to distinguish it from the prior inertial-absolute filter.
In this formulation, the filter states consist of the inertial absolute chaser state and the inertial relative target
state.
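The IA and IR error states are related by a constant, invertible transformation, which is the sense in which the two formulations are mathematically equivalent. The following sketch (illustrative only, with random A_C and A_T standing in for the true gravity-gradient partials) verifies numerically that the IR system matrices of Eq. (C.31) are the similarity transforms of the IA matrices of Eq. (C.18):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
A_C = rng.normal(size=(n, n))
A_T = rng.normal(size=(n, n))
I, Z = np.eye(n), np.zeros((n, n))

A_IA = np.block([[A_C, Z], [Z, A_T]])          # Eq. (C.18)
A_IR = np.block([[A_C, Z], [A_T - A_C, A_T]])  # Eq. (C.31)
G_IA = np.block([[I, Z], [Z, I]])
G_IR = np.block([[I, Z], [-I, I]])

# T maps the IA error state [dx_C; dx_T] into the IR error state [dx_C; dx_rel]
T = np.block([[I, Z], [-I, I]])

assert np.allclose(A_IR, T @ A_IA @ np.linalg.inv(T))
assert np.allclose(G_IR, T @ G_IA)
```

The equivalence in exact arithmetic is what makes the trade in this appendix purely a question of operation counts and finite-precision behavior.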
C.3.1.2. Range Measurements. For the case of range measurements (either from the RF
link or from the Lidar), the measurement equation can be written simply as
z^{IA}_{range} = \sqrt{(r_T - r_C) \cdot (r_T - r_C)} + b_{range} + \nu_{range}  (C.38)
with the range (measurement) noise statistics \nu_{range} \sim N(0, R_{range}). Since the target
state is a member of this filter's state-space, the measurement partials associated with this
⁵We assume that at the initial time, the chaser and target initial error covariance matrices are
uncorrelated.
C.3. INCORPORATION OF MEASUREMENTS 121
where T_{SB} is the transformation matrix from the body (IMU) frame to the sensor frame (with
q_{SB} being the quaternion associated with the transformation from the body frame to the sensor
frame) and T_{BI} is the transformation matrix from the inertial frame to the body (IMU)
frame (with q_{BI} being the quaternion associated with the transformation from the inertial
frame to the body frame).
With this in hand, the measurement partials can be obtained (after a bit of manipulation
[1]) to be
\frac{\partial\alpha}{\partial r_C} = -\frac{u_\alpha^T}{r} T_{SB}(q_{SB}) T_{BI}(q_{BI})  (C.47)
\frac{\partial\alpha}{\partial v_C} = 0_{1\times 3}  (C.48)
\frac{\partial\alpha}{\partial r_T} = \frac{u_\alpha^T}{r} T_{SB}(q_{SB}) T_{BI}(q_{BI})  (C.49)
\frac{\partial\alpha}{\partial v_T} = 0_{1\times 3}  (C.50)
and
\frac{\partial\beta}{\partial r_C} = -\frac{u_\beta^T}{r} T_{SB}(q_{SB}) T_{BI}(q_{BI})  (C.51)
\frac{\partial\beta}{\partial v_C} = 0_{1\times 3}  (C.52)
\frac{\partial\beta}{\partial r_T} = \frac{u_\beta^T}{r} T_{SB}(q_{SB}) T_{BI}(q_{BI})  (C.53)
\frac{\partial\beta}{\partial v_T} = 0_{1\times 3}  (C.54)
where
u_\alpha = \frac{1}{\cos\beta} \begin{bmatrix} -\sin\alpha \\ \cos\alpha \\ 0 \end{bmatrix}  (C.55)
u_\beta = \begin{bmatrix} -\cos\alpha \sin\beta \\ -\sin\alpha \sin\beta \\ \cos\beta \end{bmatrix}  (C.56)
C.3.2.2. Range Measurements. For the case of range measurements with the inertial-relative
filter, the measurement equation can be written simply as
z^{IR}_{range} = \sqrt{r_{rel}^T r_{rel}} + b_{range} + \nu_{range}  (C.59)
with, as before, the range (measurement) noise statistics \nu_{range} \sim N(0, R_{range}). Since the
relative state is a member of this filter's state-space, the measurement partials associated
with this measurement for the inertial-relative filter are
\frac{\partial z^{IR}_{range}}{\partial r_C} = 0_{1\times 3}  (C.60)
\frac{\partial z^{IR}_{range}}{\partial v_C} = 0_{1\times 3}  (C.61)
\frac{\partial z^{IR}_{range}}{\partial r_{rel}} = \frac{r_{rel}^T}{|r_{rel}|}  (C.62)
\frac{\partial z^{IR}_{range}}{\partial v_{rel}} = 0_{1\times 3}  (C.63)
the quantities in Eq. (C.64) are defined in Section C.3.1.3. The measurement partials are
expressed as
\frac{\partial\alpha}{\partial r_C} = 0_{1\times 3}  (C.65)
\frac{\partial\alpha}{\partial v_C} = 0_{1\times 3}  (C.66)
\frac{\partial\alpha}{\partial r_{rel}} = \frac{u_\alpha^T}{r} T_{SB}(q_{SB}) T_{BI}(q_{BI})  (C.67)
\frac{\partial\alpha}{\partial v_{rel}} = 0_{1\times 3}  (C.68)
and
\frac{\partial\beta}{\partial r_C} = 0_{1\times 3}  (C.69)
\frac{\partial\beta}{\partial v_C} = 0_{1\times 3}  (C.70)
\frac{\partial\beta}{\partial r_{rel}} = \frac{u_\beta^T}{r} T_{SB}(q_{SB}) T_{BI}(q_{BI})  (C.71)
\frac{\partial\beta}{\partial v_{rel}} = 0_{1\times 3}  (C.72)
where
u_\alpha = \frac{1}{\cos\beta} \begin{bmatrix} -\sin\alpha \\ \cos\alpha \\ 0 \end{bmatrix}  (C.73)
u_\beta = \begin{bmatrix} -\cos\alpha \sin\beta \\ -\sin\alpha \sin\beta \\ \cos\beta \end{bmatrix}  (C.74)
C.4. Analysis of the Merits of the Inertial-Absolute and Inertial-Relative
Filters
The Flight Day 1 rendezvous trajectory and models as described in [2] were used to
analyze the two filter formulations. Both formulations had the same driving dynamics
and measurement models; only the covariance propagation and covariance updates were
different. With this in mind, the comparison of the two filter formulations, as it relates to
covariance operations, was analyzed.
C.4.1. Covariance Propagation. First it must be pointed out that the propagation
of the filter dynamics is identical between the two filter parameterizations. That is to say,
in each filter, the inertial absolute states of both the chaser vehicle and the target vehicle
will be propagated. The difference arises in the propagation of the covariance matrices
associated with each of the filter parameterizations. In the IA filter, the covariances (and
cross-covariances) of the chaser inertial states and the target inertial states are computed. It
should be noted that the dynamics of the two vehicles' states are inherently uncorrelated
(see A_{IA} in Eq. (C.18)). In contrast, for the IR filter the covariances (and cross-covariances)
of the chaser inertial states and the inertial relative states (of the target with respect to
the chaser) are computed. It should be noted that the dynamics of this filter's states
are inherently correlated (see A_{IR} in Eq. (C.31)). Hence, there are inherently more non-zero
computations (both additions and multiplications) involved.⁶ Hence, there is more room for
round-off error in the covariance propagation in the IR filter. In order to see this, Table 1
contains an analysis of the propagation error as a function of the propagation interval. This
propagation is carried out without process noise on either the chaser or target states. In
addition, the chaser and the target states are uncorrelated at the initial time. Hence, since
there are no measurements which could correlate the two vehicles' states, the correlation
coefficients should remain zero (i.e. P_{C,T} = 0_{6\times 6}) throughout the interval. This was verified
to be the case. Because of the reduced number of computations inherent in the IA filter
formulation, it was assumed that the IA filter propagation was the 'truth' and the IR filter
was compared to it.
It is clear that the additional non-zero multiplications and additions for the IR filter
formulation compared to the IA formulation result in a build-up of round-off error. This has
⁶First, notice that in the IR filter, the term A_T - A_C will inherently cause a loss of precision. Second,
assuming only the gravity gradient term in A, which is symmetric, A_T - A_C involves 6 additions/subtractions.
The term (A_T - A_C)P_{C,C} in Eqs. (C.30) and (C.31) involves 18 multiplications and 12 additions/subtractions. The
term (A_T - A_C)P_{C,rel} (or P_{rel,C}(A_T - A_C)^T) involves 27 multiplications and 18 additions/subtractions. The
terms (A_T - A_C)P_{C,C} + A_T P_{rel,C} and (A_T - A_C)P_{C,rel} + A_T P_{rel,rel} each involve 9 additions/subtractions.
These total 45 multiplications and 54 additions/subtractions. This is doubled because of the symmetric
nature of the covariance matrix. Therefore, there are 90 additional multiplications and 108 additional
additions/subtractions per function evaluation in the IR filter formulation over the IA filter formulation.
For a fourth-order Runge-Kutta integration method, the IR filter formulation results in an additional 360
multiplications and 432 additions/subtractions per integration step. Each of these operations results in a
numerical loss of precision in finite-precision machines.
the e↵ect of reducing the propagation accuracy of the IR filter vis-à-vis the IA filter. The
additional operations, in concert with the accompanying loss of precision, make a strong
case for the use of the IA filter formulation.
C.4.2. Measurement Update

It should be apparent from comparing Eqs. (35) and (56) that, for the case of the target ground-update, there are 18 more multiplications (because of the identity matrix) in the IR formulation than in the IA formulation for each target ground-update.
For all other relative measurements, there are more operations for the IA filter formulation than for the IR formulation. In fact, for range measurements, there are 54 more multiplications and 54 more additions for the IA formulation than for the IR formulation for each range measurement update.
For bearing measurements, there are 108 more multiplications and 108 more additions for the IA formulation than for the IR formulation for each bearing measurement update.
So, for relative sensor measurements, clearly there are more multiplications and more
additions for the IA filter formulation than for the IR formulation.
C.5. Conclusions
While the inertial-absolute and inertial-relative filter formulations are mathematically equivalent, their implementation on finite-precision machines influences the choice between them.
With regard to propagation of the covariance matrices, there are more computations for
the IR filter than the (mathematically equivalent) IA filter. These additional computations,
in concert with the types of operations, result in a (significant) loss of precision with regard
to the propagation of the covariance matrices.
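The precision loss that comes from differencing two nearly equal quantities, as happens when the target and chaser gravity-gradient matrices are subtracted, can be seen even in a scalar toy example. The numbers below are hypothetical and purely illustrative; they are not drawn from either filter.

```python
# Toy illustration (hypothetical numbers) of catastrophic cancellation:
# two values that agree to ~12 significant digits are differenced, and
# the result keeps only the digits where rounding error already lives.
a_target = 1.000000000001   # e.g. an entry of the target's dynamics matrix
a_chaser = 1.000000000000   # nearly equal entry for the chaser

diff = a_target - a_chaser

# The subtraction itself is exact, but a_target was already rounded on
# entry, so diff inherits a relative error of roughly 1e-4: about
# twelve of the sixteen available decimal digits cancelled away.
rel_err = abs(diff - 1.0e-12) / 1.0e-12
print(f"diff = {diff:.6e}, relative error vs. exact 1e-12: {rel_err:.1e}")
```

The same effect, repeated over the many extra operations counted above, drives the round-off build-up in the IR covariance propagation.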
With regard to the measurement updates to the covariance matrices, for the case of relative navigation measurements, there are fewer non-zero operations for the IR filter than for the IA filter. This is one of the strengths of the IR formulation; if the computational cost and precision of the covariance propagation were the same for both, the IR filter would be advantageous in terms of computations and precision.
After considering all these factors, the Inertial-Absolute filter is recommended for use
on Orion.
Bibliography
[1] W. S. Agee and R. H. Turner. Triangular decomposition of a positive definite matrix plus a symmetric dyad with application to Kalman filtering. White Sands Missile Range Tech. Rep. No. 38, 1972.
[2] G. J. Bierman. Factorization Methods for Discrete Sequential Estimation. Academic Press, New York, 1977; Dover Publications, 2006.
[3] Robert G. Brown and Patrick Y.C. Hwang. Introduction to Random Signals and Applied Kalman Fil-
tering. John Wiley and Sons, Inc., New York, NY, 3rd edition, 1997.
[4] A. E. Bryson and M. Frazier. Smoothing for linear and nonlinear dynamic systems. Technical Report
TDR-63-119, Aeronautical Systems Division, Wright-Patterson Air Force Base, Ohio, Sept. 1962.
[5] N.A. Carlson. Fast triangular factorization of the square root filter. AIAA Journal, 11(9):1259–1265,
September 1973.
[6] J. R. Carpenter and K. T. Alfriend. Navigation accuracy guidelines for orbital formation flying. Journal
of the Astronautical Sciences, 53(2):207–219, 2006. Also appears as AIAA Paper 2003–5443.
[7] J. Russell Carpenter. Navigation filter best practices. https://mediaex-server.larc.nasa.gov/Academy/Catalog/Full/f1d0abb028d3491f8701da3fc64bcb2021, January 2015. Accessed: January 4, 2018.
[8] J. Russell Carpenter and Kevin Berry. Artificial damping for stable long-term orbital covariance propa-
gation. In Astrodynamics 2007, volume 129 of Advances in the Astronautical Sciences, pages 1697–1707.
Univelt, 2008.
[9] J. Russell Carpenter and Emil R. Schiesser. Semimajor axis knowledge and GPS orbit determination. NAVIGATION: Journal of The Institute of Navigation, 48(1):57–68, Spring 2001. Also appears as AAS Paper 99–190.
[10] Russell Carpenter and Taesul Lee. A stable clock error model using coupled first- and second-order Gauss-Markov processes. In AAS/AIAA 18th Spaceflight Mechanics Meeting, volume 130, pages 151–162. Univelt, 2008.
[11] W. H. Clohessy and R. S. Wiltshire. Terminal guidance for satellite rendezvous. Journal of the Aerospace
Sciences, 27(5):653–658, 674, 1960.
[12] W. F. Denham and S. Pines. Sequential estimation when measurement function nonlinearity is compa-
rable to measurement error. AIAA Journal, 4(6):1071–1076, June 1966.
[13] Cornelius J. Dennehy and J. Russell Carpenter. A summary of the rendezvous, proximity operations, docking, and undocking (RPODU) lessons learned from the Defense Advanced Research Projects Agency (DARPA) Orbital Express (OE) demonstration system mission. NASA Technical Memorandum 2011-217088, NASA Engineering and Safety Center, 2011.
[14] J. L. Farrell. Attitude determination by Kalman filtering. Automatica, 6:419–430, 1970.
[15] James L. Farrell. Attitude determination by Kalman filtering. Contractor Report NASA-CR-598, NASA
Goddard Space Flight Center, Washington, DC, Sept. 1964.
[16] P. Ferguson and J. How. Decentralized estimation algorithms for formation flying spacecraft. In AIAA
Guidance, Navigation and Control Conference, 2003.
[17] Donald C. Fraser. A New Technique for the Optimal Smoothing of Data. Sc.D. thesis, Massachusetts
Institute of Technology, Cambridge, Massachusetts, 1967.
[18] Donald C. Fraser and James E. Potter. The optimum smoother as a combination of two optimum linear filters. IEEE Transactions on Automatic Control, AC-14(4):387–390, Aug. 1969.
[19] Eliezer Gai, Kevin Daly, James Harrison, and Linda Lemos. Star-sensor-based attitude/attitude-rate estimator. Journal of Guidance, Control, and Dynamics, 8(5):560–565, Sept.-Oct. 1985.
[20] Arthur Gelb, editor. Applied Optimal Estimation. The MIT Press, Cambridge, MA, 1974.
[21] David K. Geller. Orbital rendezvous: When is autonomy required? Journal of Guidance, Control and
Dynamics, 30(4):974–981, July–August 2007.
[22] Herbert Goldstein. Classical Mechanics. Addison-Wesley Publishing Company, Reading, MA, 2nd edi-
tion, 1980.
[23] Gene Howard Golub and Charles F. Van Loan. Matrix Computations. JHU Press, 3rd edition, 1996.
[24] M. Grigoriu. Response of dynamic systems to Poisson white noise. Journal of Sound and Vibration, 195(3):375–389, 1996.
[25] C. F. Hanak. Reducing the effects of measurement ordering on the GKF algorithm via a hybrid linear/extended Kalman filter. Technical Report FltDyn-CEV-06-0107, NASA Johnson Space Center, August 2006.
[26] G. W. Hill. Researches in the lunar theory. American Journal of Mathematics, 1(1):5–26, 129–147,
245–260, 1878.
[27] Jonathan P. How, Louis S. Breger, Megan Mitchell, Kyle T. Alfriend, and Russell Carpenter. Differential semimajor axis estimation performance using carrier-phase differential Global Positioning System measurements. Journal of Guidance, Control, and Dynamics, 30(2):301–313, Mar.-Apr. 2007.
[28] Peter C. Hughes. Spacecraft Attitude Dynamics. Wiley, New York, NY, 1986.
[29] M. Idan. Estimation of Rodrigues parameters from vector observations. IEEE Transactions on Aerospace
and Electronic Systems, 32(2):578–586, April 1996.
[30] Andrew H. Jazwinski. Stochastic Processes and Filtering Theory. Academic Press, Inc., and Dover
Publications, Inc., New York, NY, and Mineola, NY, 1970 and 2007.
[31] S. C. Jenkins and D. K. Geller. State estimation and targeting for autonomous rendezvous and proximity
operations. In Proceedings of the AAS/AIAA Astrodynamics Specialists Conference. Mackinac Island,
MI, August 19–23 2007.
[32] S. J. Julier and J. K. Uhlmann. A new extension of the Kalman filter to nonlinear systems. In Int. Symp. Aerospace/Defense Sensing, Simul. and Controls, Orlando, FL, 1997.
[33] Simon Julier and Jeffrey Uhlmann. Authors' reply. IEEE Transactions on Automatic Control, 47(8):1408–1409, August 2002.
[34] Simon Julier, Jeffrey Uhlmann, and Hugh F. Durrant-Whyte. A new method for the nonlinear transformation of means and covariances in filters and estimators. IEEE Transactions on Automatic Control, 45(3):477–482, March 2000.
[35] John L. Junkins and J. D. Turner. Optimal Spacecraft Rotational Maneuvers. Elsevier, New York, NY,
1986.
[36] R. E. Kalman. A new approach to linear filtering and prediction problems. Transactions of the ASME
– Journal of Basic Engineering, 82:35–45, 1960.
[37] Christopher D. Karlgaard and Hanspeter Schaub. Nonsingular attitude filtering using modified Ro-
drigues parameters. The Journal of the Astronautical Sciences, 57(4):777–791, Oct.-Dec. 2010.
[38] B. A. Kriegsman and Y. C. Tau. Shuttle navigation system for entry and landing mission phases. Journal
of Spacecraft, 12(4):213–219, April 1975.
[39] W. M. Lear. Multi-phase navigation program for the space shuttle orbiter. Internal Note No. 73-FM-132,
NASA Johnson Space Center, Houston, TX, 1973.
[40] William M. Lear. Kalman Filtering Techniques. Technical Report JSC-20688, NASA Johnson Space
Center, Houston, TX, 1985.
[41] Deok-Jin Lee and Kyle T. Alfriend. Additive divided difference filtering for attitude estimation using modified Rodrigues parameters. In John L. Crassidis et al., editors, Proceedings of the F. Landis Markley Astronautics Symposium, volume 132 (CD-ROM Supplement) of Advances in the Astronautical Sciences. American Astronautical Society, Univelt, 2008.
[42] Tine Lefebvre, Herman Bruyninckx, and Joris De Schutter. Comment on 'A new method for the nonlinear transformation of means and covariances in filters and estimators'. IEEE Transactions on Automatic Control, 47(8):1406–1408, August 2002.
[43] Eugene J. Lefferts, F. Landis Markley, and Malcolm D. Shuster. Kalman filtering for spacecraft attitude estimation. Journal of Guidance, Control, and Dynamics, 5(5):417–429, Sept.-Oct. 1982. doi:10.2514/3.56190.
[44] M. Mandic. Distributed estimation architectures and algorithms for formation flying spacecraft. Master’s
thesis, Massachusetts Institute of Technology, 2006.
[45] S. R. Marandi and V. J. Modi. A preferred coordinate system and the associated orientation represen-
tation in attitude dynamics. Acta Astronautica, 15(11):833–843, Nov. 1987.
[46] F. Landis Markley. Unit quaternion from rotation matrix. Journal of Guidance, Control, and Dynamics,
31(2):440–442, March-April 2008.
[47] F. Landis Markley. Lessons learned. The Journal of the Astronautical Sciences, 57(1 & 2):3–29, Jan.-
June 2009.
[48] F. Landis Markley and J. Russell Carpenter. Generalized linear covariance analysis. The Journal of the Astronautical Sciences, 57(1 & 2):233–260, Jan.-June 2009.
[49] F. Landis Markley and John L. Crassidis. Fundamentals of Spacecraft Attitude Determination and
Control, pages 46, 257–260, 263–269. Springer, New York, NY, 2014. doi:10.1007/978-1-4939-0802-8.
[50] Peter S. Maybeck. Stochastic Models, Estimation and Control, Vol. 1. Academic Press, New York, NY,
1979.
[51] Peter S. Maybeck. Stochastic Models, Estimation and Control, Vol. 2. Academic Press, New York, NY,
1979.
[52] Eugene S. Muller and Peter M. Kachmar. A new approach to on-board orbit navigation. Navigation:
Journal of the Institute of Navigation, 18(4):369–385, Winter 1971–72.
[53] Magnus Nørgaard, Niels K. Poulsen, and Ole Ravn. Advances in derivative-free state estimation for non-
linear systems. Technical Report IMM–REP–1998–15, Technical University of Denmark, 2800 Lyngby,
Denmark, April 7, 2000.
[54] Magnus Nørgaard, Niels K. Poulsen, and Ole Ravn. New developments in state estimation for nonlinear
estimation problems. Automatica, 36(11):1627–1638, November 2000.
[55] Young W. Park, Jack P. Brazzel, J. Russell Carpenter, Heather D. Hinkel, and James H. Newman. Flight test results from real-time relative GPS experiment on STS-69. In Spaceflight Mechanics 1996, volume 93 of Advances in the Astronautical Sciences, pages 1277–1296, San Diego, CA, 1996. Univelt.
[56] L. Perea, J. How, L. Breger, and P. Elosegui. Nonlinearity in sensor fusion: Divergence issues in EKF, modified truncated SOF, and UKF. In AIAA Guidance, Navigation and Control Conference and Exhibit, Hilton Head, SC, August 20–23, 2007.
[57] Mark E. Pittelkau. An analysis of the quaternion attitude determination filter. The Journal of the
Astronautical Sciences, 51(1), Jan.-March 2003.
[58] H. Plinval. Analysis of relative navigation architectures for formation flying spacecraft. Master’s thesis,
Massachusetts Institute of Technology, 2006.
[59] H. E. Rauch, F. Tung, and C. T. Striebel. Maximum likelihood estimates of linear dynamic systems.
AIAA Journal, 3(8):1445–1450, Aug. 1965.
[60] Reid G. Reynolds. Asymptotically optimal attitude filtering with guaranteed convergence. Journal of Guidance, Control, and Dynamics, 31(1):114–122, Jan.-Feb. 2008.
[61] Olinde Rodrigues. Des lois géométriques qui régissent les déplacements d’un système solide dans l’espace,
et de la variation des coordonnées provenant de ces déplacements considérés indépendamment des causes
qui peuvent les produire. Journal de Mathématiques Pures et Appliquées, 5:380–440, 1840.
[62] Hanspeter Schaub and John L. Junkins. Analytical Mechanics of Aerospace Systems. American Institute
of Aeronautics and Astronautics, Inc., New York, NY, 2nd edition, 2009.
[63] Emil Schiesser, Jack P. Brazzel, J. Russell Carpenter, and Heather D. Hinkel. Results of STS-80 relative GPS navigation flight experiment. In Spaceflight Mechanics 1998, volume 99 of Advances in the Astronautical Sciences, pages 1317–1334, San Diego, CA, 1998. Univelt.
[64] S. F. Schmidt. Application of state-space methods to navigation problems. In C. T. Leondes, editor,
Advances in Control Systems: Theory and Applications, volume 3, pages 293–340. Academic Press, New
York, 1966.
[65] Stanley F. Schmidt. The Kalman filter: its recognition and development for aerospace applications. Journal of Guidance, Control, and Dynamics, 4(1):4–7, 1981.
[66] John H. Seago, Jacob Griesback, James W. Woodburn, and David A. Vallado. Sequential orbit-
estimation with sparse tracking. In Space Flight Mechanics 2011, volume 140 of Advances in the Astro-
nautical Sciences, pages 281–299. Univelt, 2011.
[67] Malcolm D. Shuster. A survey of attitude representations. The Journal of the Astronautical Sciences,
41(4):439–517, Oct.-Dec. 1993.
[68] John Stuelpnagel. On the parametrization of the three-dimensional rotation group. SIAM Review,
6(4):422–430, Oct. 1964.
[69] Byron D. Tapley, Bob E. Schutz, and George R. Born. Statistical Orbit Determination. Academic Press,
2004.
[70] C. L. Thornton and G. J. Bierman. Gram-Schmidt algorithms for covariance propagation. IEEE Conference on Decision and Control, pages 489–498, 1975.
[71] C. L. Thornton and G. J. Bierman. Numerical comparison of discrete Kalman filtering algorithms: An orbit determination case study. JPL Technical Memorandum 33-771, June 1976.
[72] Oldrich Vasicek. An equilibrium characterization of the term structure. J. Financial Economics,
5(2):177–188, November 1977.
[73] M. C. Wang and G. E. Uhlenbeck. On the theory of Brownian motion II. In N. Wax, editor, Selected Papers on Noise and Stochastic Processes, pages 113–132. Dover, 1954.
[74] James R. Wertz, editor. Spacecraft Attitude Determination and Control. Kluwer Academic Publishers,
The Netherlands, 1978.
[75] Thomas F. Wiener. Theoretical Analysis of Gimballess Inertial Reference Equipment Using Delta-
Modulated Instruments. Ph.D. dissertation, Massachusetts Institute of Technology, Cambridge, MA,
1962.
[76] J. R. Wright. Sequential orbit determination with auto-correlated gravity modeling errors. Journal of
Guidance and Control, 4(3):304–309, May–June 1980.
[77] James R. Wright. Optimal orbit determination. In Kyle T. Alfriend et al., editors, Space Flight Mechanics 2002, volume 112 of Advances in the Astronautical Sciences, pages 1123–1134. Univelt, 2002.
[78] Renato Zanetti, Kyle J. DeMars, and Robert H. Bishop. Underweighting nonlinear measurements.
Journal of Guidance, Control and Dynamics, 33(5):1670–1675, September–October 2010.
[79] Renato Zanetti and Christopher D’Souza. Recursive implementations of the consider filter. Proceedings
of the 2012 AAS Jer-Nan Juang Symposium, 2012.
REPORT DOCUMENTATION PAGE (Standard Form 298, OMB No. 0704-0188)
1. REPORT DATE (DD-MM-YYYY): 18-04-2018
2. REPORT TYPE: Technical Publication
4. TITLE AND SUBTITLE: Navigation Filter Best Practices
14. ABSTRACT
This work identifies best practices for onboard navigation filtering algorithms. These best practices have been collected by
NASA practitioners of the art and science of navigation over the first 50 years of the Space Age. While this is a NASA
document concerned with space navigation, it is likely that many of the principles would apply equally to the wider
navigation community.
16. SECURITY CLASSIFICATION OF: a. REPORT: U; b. ABSTRACT: U; c. THIS PAGE: U
17. LIMITATION OF ABSTRACT: UU
18. NUMBER OF PAGES: 149
19a. NAME OF RESPONSIBLE PERSON: STI Information Desk ([email protected])
19b. TELEPHONE NUMBER (Include area code): (757) 864-9658
Standard Form 298 (Rev. 8/98), Prescribed by ANSI Std. Z39.18