Quaternion Attitude Estimation Using Vector Observations



F. Landis Markley¹ and Daniele Mortari²

ABSTRACT

This paper contains a critical comparison of estimators minimizing Wahba’s loss function. Some
new results are presented for the QUaternion ESTimator (QUEST) and EStimators of the Optimal
Quaternion (ESOQ and ESOQ-2) to avoid the computational burden of sequential rotations in these
algorithms. None of these methods is as robust in principle as Davenport’s q-method or the Singular
Value Decomposition (SVD) method, which are significantly slower. Robustness is only an issue for
measurements with widely differing accuracies, so the fastest estimators, the modified ESOQ and ESOQ-
2, are well suited to sensors that track multiple stars with comparable accuracies. More robust forms
of ESOQ and ESOQ-2 are developed that are intermediate in speed.

INTRODUCTION
In many spacecraft attitude systems, the attitude observations are naturally represented as unit vectors.
Typical examples are the unit vectors giving the direction to the sun or a star and the unit vector in the
direction of the Earth’s magnetic field. This paper will consider algorithms for estimating spacecraft
attitude from vector measurements taken at a single time, which are known as “single-frame” methods
or “point” methods, instead of filtering methods that employ information about spacecraft dynamics.
Almost all single-frame algorithms are based on a problem proposed in 1965 by Grace Wahba [1].
Wahba’s problem is to find the orthogonal matrix A with determinant +1 that minimizes the loss
function
$$ L(A) \equiv \frac{1}{2} \sum_i a_i \, |\mathbf{b}_i - A \mathbf{r}_i|^2 \tag{1} $$
where the $\mathbf{b}_i$ are a set of unit vectors measured in a spacecraft’s body frame, the $\mathbf{r}_i$ are the corresponding unit
vectors in a reference frame, and the $a_i$ are non-negative weights. In this paper we choose the weights to be
inverse variances, $a_i \equiv \sigma_i^{-2}$, in order to relate Wahba’s problem to Maximum Likelihood Estimation [2].
This choice differs from that of Wahba and many other authors, who assumed the weights normalized
to unity. It is possible and has proven very convenient to write the loss function as

$$ L(A) = \lambda_0 - \mathrm{tr}(A B^T) \tag{2} $$


1 Guidance, Navigation, and Control Center, NASA’s Goddard Space Flight Center, Greenbelt, MD, lan-

[email protected]
2 Aerospace Engineering School, University of Rome, “La Sapienza,” Rome, Italy, [email protected], Visiting

Associate Professor, Texas A&M University, College Station, TX

with
$$ \lambda_0 \equiv \sum_i a_i \tag{3} $$

and
$$ B \equiv \sum_i a_i \, \mathbf{b}_i \mathbf{r}_i^T \tag{4} $$

Now it is clear that $L(A)$ is minimized when the trace, $\mathrm{tr}(A B^T)$, is maximized. This has a close
relation to the orthogonal Procrustes problem, which is to find the orthogonal matrix $A$ that is closest
to $B$ in the Frobenius norm (also known as the Euclidean, Schur, or Hilbert-Schmidt norm) [3]
$$ \|M\|_F^2 \equiv \sum_{i,j} M_{ij}^2 = \mathrm{tr}(M M^T) \tag{5} $$

Now
$$ \|A - B\|_F^2 = \|A\|_F^2 + \|B\|_F^2 - 2\,\mathrm{tr}(A B^T) = 3 + \|B\|_F^2 - 2\,\mathrm{tr}(A B^T) \tag{6} $$
so Wahba’s problem is equivalent to the orthogonal Procrustes problem with the proviso that the
determinant of A must be +1.
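
As a quick numerical check of Eqs. (1)-(4), the following Python/NumPy sketch builds $\lambda_0$ and $B$ for a small made-up data set and confirms that the loss can be evaluated as $\lambda_0 - \mathrm{tr}(A B^T)$. The function names, test vectors, noise values, and weights are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def wahba_loss(A, b, r, a):
    """Eq. (1): 0.5 * sum_i a_i |b_i - A r_i|^2 for (n, 3) arrays b and r."""
    resid = b - r @ A.T                      # row i holds b_i - A r_i
    return 0.5 * np.sum(a * np.sum(resid**2, axis=1))

def attitude_profile(b, r, a):
    """Eqs. (3)-(4): lambda_0 = sum_i a_i and B = sum_i a_i b_i r_i^T."""
    return np.sum(a), (a[:, None] * b).T @ r

# Hypothetical data: a rotation of 2 rad about z and four reference vectors.
c, s = np.cos(2.0), np.sin(2.0)
A_true = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
r = np.array([[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0], [1.0, 1.0, 1.0]])
r /= np.linalg.norm(r, axis=1, keepdims=True)
noise = 1e-4 * np.array([[1, -2, 3], [-1, 0.5, 2], [2, 1, -1], [0.5, -1.5, 1]])
b = r @ A_true.T + noise                     # body-frame observations
b /= np.linalg.norm(b, axis=1, keepdims=True)
a = np.array([1.0, 2.0, 3.0, 4.0])           # arbitrary non-negative weights

lam0, B = attitude_profile(b, r, a)
loss_direct = wahba_loss(A_true, b, r, a)    # Eq. (1)
loss_trace = lam0 - np.trace(A_true @ B.T)   # Eq. (2)
```

The two values agree only because every $\mathbf{b}_i$ and $\mathbf{r}_i$ is a unit vector, which is why the sketch renormalizes the noisy observations.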
The purpose of this paper is to give an overview in a unified notation of algorithms for solving
Wahba’s problem, to provide accuracy and speed comparisons, and to present two significant enhancements
of existing methods. The popular QUaternion ESTimator (QUEST) and EStimators of the Optimal
Quaternion (ESOQ and ESOQ-2) algorithms avoid singularities by employing a rotated reference
system. Methods introduced in this paper use information from an a priori quaternion estimate or from
the diagonal elements of the B matrix to determine a desirable reference system, avoiding expensive
sequential computations. Also, tests show that a first-order expansion in the loss function is adequate,
avoiding the need for iterative refinement of the loss function, and motivating the introduction of new
first-order versions of ESOQ and ESOQ-2, which are at present the fastest known first-order methods
for solving Wahba’s problem.

FIRST SOLUTIONS OF WAHBA’S PROBLEM


J. L. Farrell and J. C. Stuelpnagel [4], R. H. Wessner [5], J. R. Velman [6], J. E. Brock [7], R. Desjardins,
and Wahba presented the first solutions of Wahba’s problem. Farrell and Stuelpnagel noted that any
real square matrix, including B, has the polar decomposition

$$ B = W R, \tag{7} $$

where $W$ is orthogonal and $R$ is symmetric and positive semidefinite. Then $R$ can be diagonalized by

$$ R = V D V^T, \tag{8} $$

where $V$ is orthogonal and $D$ is diagonal with elements arranged in decreasing order. The optimal
attitude estimate is then given by

$$ A_{\mathrm{opt}} = W V \, \mathrm{diag}[1 \;\; 1 \;\; \det(W)] \, V^T. \tag{9} $$

In most cases, $\det(W)$ is positive and $A_{\mathrm{opt}} = W$, but this is not guaranteed.
Wessner and Brock independently proposed the alternative solution

$$ A_{\mathrm{opt}} = (B^T)^{-1} (B^T B)^{1/2} = B (B^T B)^{-1/2}; \tag{10} $$

but the matrix inverses in Eq. (10) exist only if B is non-singular, which means that a minimum of three
vectors must be observed. It is well known that two vectors are sufficient to determine the attitude;
and the method of Farrell and Stuelpnagel, as well as the other methods described in this paper, only
require B to have rank two.

SINGULAR VALUE DECOMPOSITION (SVD) METHOD


This method has not been widely used in practice, because of its computational expense, but it yields
valuable analytic insights [8, 9]. The matrix B has the Singular Value Decomposition [3]:

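$$ B = U \Sigma V^T = U \, \mathrm{diag}[\Sigma_{11} \;\; \Sigma_{22} \;\; \Sigma_{33}] \, V^T, \tag{11} $$

where $U$ and $V$ are orthogonal, and the singular values obey the inequalities $\Sigma_{11} \ge \Sigma_{22} \ge \Sigma_{33} \ge 0$.
Then
$$ \mathrm{tr}(A B^T) = \mathrm{tr}(A V \, \mathrm{diag}[\Sigma_{11} \;\; \Sigma_{22} \;\; \Sigma_{33}] \, U^T) = \mathrm{tr}(U^T A V \, \mathrm{diag}[\Sigma_{11} \;\; \Sigma_{22} \;\; \Sigma_{33}]) \tag{12} $$
The trace is maximized, consistent with the constraint $\det(A) = 1$, by

$$ U^T A_{\mathrm{opt}} V = \mathrm{diag}[1 \;\; 1 \;\; (\det U)(\det V)] \tag{13} $$

which gives the optimal attitude matrix

$$ A_{\mathrm{opt}} = U \, \mathrm{diag}[1 \;\; 1 \;\; (\det U)(\det V)] \, V^T \tag{14} $$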

The SVD solution is completely equivalent to the original solution by Farrell and Stuelpnagel, since Eq.
(14) is identical to Eq. (9) with U = W V . The difference is that robust SVD algorithms exist now
[3, 10]. In fact, computing the SVD is one of the most robust numerical algorithms.
It is convenient to define

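$$ s_1 = \Sigma_{11}, \quad s_2 = \Sigma_{22}, \quad \text{and} \quad s_3 = \det(U) \det(V) \, \Sigma_{33}, \tag{15} $$

so that $s_1 \ge s_2 \ge |s_3|$. We will loosely refer to $s_1$, $s_2$, and $s_3$ as the singular values, although the third
singular value of $B$ is actually $|s_3|$. It is clear from Eq. (11) that redefinition of the basis vectors in the
reference or body frame affects $V$ or $U$, respectively, but does not affect the singular values.
The estimation error is characterized by the rotation angle error vector $\boldsymbol{\phi}_{\mathrm{err}}$ in the body frame,
defined by
$$ \exp[\boldsymbol{\phi}_{\mathrm{err}} \times] = A_{\mathrm{true}} A_{\mathrm{opt}}^T \tag{16} $$
where $[\boldsymbol{\phi}_{\mathrm{err}} \times]$ is the cross product matrix:
$$ [\boldsymbol{\phi}_{\mathrm{err}} \times] \equiv \begin{bmatrix} 0 & -\phi_3 & \phi_2 \\ \phi_3 & 0 & -\phi_1 \\ -\phi_2 & \phi_1 & 0 \end{bmatrix} \tag{17} $$

The SVD method gives the covariance of the rotation angle error vector as

$$ P = U \, \mathrm{diag}[(s_2 + s_3)^{-1} \;\; (s_3 + s_1)^{-1} \;\; (s_1 + s_2)^{-1}] \, U^T \tag{18} $$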
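
The SVD solution of Eqs. (11)-(18) maps directly onto a library SVD call. The sketch below is a minimal illustration in Python/NumPy (the paper's own tests use MATLAB's `svd`); the function name and the noise-free demo data are assumptions made for this example.

```python
import numpy as np

def svd_attitude(B):
    """Eqs. (11)-(18): optimal attitude, signed singular values, covariance."""
    U, sig, Vt = np.linalg.svd(B)              # B = U diag(sig) Vt, sig descending
    d = np.linalg.det(U) * np.linalg.det(Vt)   # (det U)(det V), equal to +1 or -1
    A_opt = U @ np.diag([1.0, 1.0, d]) @ Vt    # Eq. (14)
    s1, s2, s3 = sig[0], sig[1], d * sig[2]    # Eq. (15)
    P = U @ np.diag([1 / (s2 + s3), 1 / (s3 + s1), 1 / (s1 + s2)]) @ U.T  # Eq. (18)
    return A_opt, np.array([s1, s2, s3]), P

# Hypothetical noise-free data: rotation of 2 rad about z, four observations.
c, s = np.cos(2.0), np.sin(2.0)
A_true = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
r = np.array([[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0], [1.0, 1.0, 1.0]])
r /= np.linalg.norm(r, axis=1, keepdims=True)
b = r @ A_true.T                               # exact body-frame vectors
a = np.array([1.0, 2.0, 3.0, 4.0])
B = (a[:, None] * b).T @ r                     # Eq. (4)

A_opt, s_vals, P = svd_attitude(B)
```

With error-free observations the estimate reproduces the true attitude, and $P$ coincides with the inverse of $\sum_i a_i (I - \mathbf{b}_i \mathbf{b}_i^T)$.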

DAVENPORT’S q-METHOD
Davenport provided the first useful solution of Wahba’s problem for spacecraft attitude determination
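[11, 12]. He parameterized the attitude matrix by a unit quaternion [13, 14]
$$ q = \begin{bmatrix} \mathbf{q}_v \\ q_4 \end{bmatrix} = \begin{bmatrix} \mathbf{e} \sin(\phi/2) \\ \cos(\phi/2) \end{bmatrix} \tag{19} $$
as
$$ A(q) = (q_4^2 - |\mathbf{q}_v|^2) \, I + 2 \, \mathbf{q}_v \mathbf{q}_v^T - 2 \, q_4 [\mathbf{q}_v \times] \tag{20} $$
The rotation axis $\mathbf{e}$ and angle $\phi$ will be useful later. Since $A(q)$ is a homogeneous quadratic function of $q$,
we can write
$$ \mathrm{tr}(A B^T) = q^T K q \tag{21} $$
where $K$ is the symmetric traceless matrix
$$ K \equiv \begin{bmatrix} S - I \, \mathrm{tr}(B) & \mathbf{z} \\ \mathbf{z}^T & \mathrm{tr}(B) \end{bmatrix} \tag{22} $$

with
$$ S = B + B^T \quad \text{and} \quad \mathbf{z} \equiv \begin{bmatrix} B_{23} - B_{32} \\ B_{31} - B_{13} \\ B_{12} - B_{21} \end{bmatrix} = \sum_i a_i \, \mathbf{b}_i \times \mathbf{r}_i \tag{23} $$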

The optimal attitude is represented by the quaternion maximizing the right side of Eq. (21), subject to
the unit constraint $|q| = 1$, which is implied by Eq. (19). It is not difficult to see that the optimal
quaternion is equal to the normalized eigenvector of $K$ with the largest eigenvalue, i.e., the solution of

$$ K q_{\mathrm{opt}} = \lambda_{\max} q_{\mathrm{opt}} \tag{24} $$

With Eqs. (2) and (21), this gives the optimized loss function as

$$ L(A_{\mathrm{opt}}) = \lambda_0 - \lambda_{\max} \tag{25} $$

Very robust algorithms exist to solve the symmetric eigenvalue problem [3, 10].
The eigenvalues of the $K$ matrix, $\lambda_{\max} \equiv \lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \lambda_4 \equiv \lambda_{\min}$, are related to the singular
values by [11, 15]

$$ \lambda_1 = s_1 + s_2 + s_3, \quad \lambda_2 = s_1 - s_2 - s_3, \quad \lambda_3 = -s_1 + s_2 - s_3, \quad \lambda_4 = -s_1 - s_2 + s_3. \tag{26} $$

The eigenvalues sum to zero because K is traceless. There is no unique solution if the two largest
eigenvalues of K are equal, or s2 + s3 = 0. This is not a failure of the q-method; it means that the data
aren’t sufficient to determine the attitude uniquely. Equation (18) shows that the covariance is infinite
in this case. This is expected, since the covariance should be infinite when the attitude is unobservable.
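
Davenport's q-method reduces to a symmetric eigenvalue problem, so it can be sketched in a few lines. The Python/NumPy example below assembles $K$ from Eqs. (22)-(23), takes the eigenvector belonging to the largest eigenvalue, and also evaluates the right sides of Eq. (26) from the SVD of $B$ for comparison. Function names and the noisy demo data are assumptions introduced for illustration.

```python
import numpy as np

def davenport_K(B):
    """Eqs. (22)-(23): symmetric traceless 4x4 matrix K built from B."""
    S = B + B.T
    z = np.array([B[1, 2] - B[2, 1], B[2, 0] - B[0, 2], B[0, 1] - B[1, 0]])
    K = np.empty((4, 4))
    K[:3, :3] = S - np.trace(B) * np.eye(3)
    K[:3, 3] = z
    K[3, :3] = z
    K[3, 3] = np.trace(B)
    return K

def attitude_matrix(q):
    """Eq. (20) for q = [q_v, q_4]."""
    qv, q4 = q[:3], q[3]
    cross = np.array([[0, -qv[2], qv[1]], [qv[2], 0, -qv[0]], [-qv[1], qv[0], 0]])
    return (q4**2 - qv @ qv) * np.eye(3) + 2 * np.outer(qv, qv) - 2 * q4 * cross

def q_method(B):
    """Eq. (24): unit eigenvector of K with the largest eigenvalue."""
    lam, vecs = np.linalg.eigh(davenport_K(B))   # eigenvalues in ascending order
    return vecs[:, -1], lam

# Hypothetical noisy data: rotation of 2 rad about z, four observations.
c, s = np.cos(2.0), np.sin(2.0)
A_true = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
r = np.array([[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0], [1.0, 1.0, 1.0]])
r /= np.linalg.norm(r, axis=1, keepdims=True)
noise = 1e-4 * np.array([[1, -2, 3], [-1, 0.5, 2], [2, 1, -1], [0.5, -1.5, 1]])
b = r @ A_true.T + noise
b /= np.linalg.norm(b, axis=1, keepdims=True)
a = np.array([1.0, 2.0, 3.0, 4.0])
B = (a[:, None] * b).T @ r

q_opt, lam = q_method(B)
loss_opt = a.sum() - lam[-1]                     # Eq. (25)

# Right sides of Eq. (26), computed from the signed singular values of Eq. (15).
U, sig, Vt = np.linalg.svd(B)
s1, s2 = sig[0], sig[1]
s3 = np.linalg.det(U) * np.linalg.det(Vt) * sig[2]
lam_from_s = np.sort([s1 + s2 + s3, s1 - s2 - s3, -s1 + s2 - s3, -s1 - s2 + s3])
```

The sign of the returned eigenvector is arbitrary; $q$ and $-q$ give the same attitude matrix.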

QUATERNION ESTIMATOR (QUEST)


This algorithm, first applied in the MAGSAT mission in 1979, has been the most widely used algorithm
for Wahba’s problem [16, 17]. Equation (24) is equivalent to the two equations

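$$ [(\lambda_{\max} + \mathrm{tr}(B)) \, I - S] \, \mathbf{q}_v = q_4 \, \mathbf{z} \tag{27} $$

and
$$ (\lambda_{\max} - \mathrm{tr}(B)) \, q_4 = \mathbf{q}_v^T \mathbf{z} \tag{28} $$
Equation (27) gives (setting $\tau = \lambda_{\max} + \mathrm{tr}(B)$),
$$ \mathbf{q}_v = q_4 \, (\tau I - S)^{-1} \mathbf{z} = \frac{q_4}{\det(\tau I - S)} \, \mathrm{adj}(\tau I - S) \, \mathbf{z} \tag{29} $$

The optimal quaternion is then given by

$$ q_{\mathrm{opt}} = \frac{1}{\sqrt{\gamma^2 + |\mathbf{x}|^2}} \begin{bmatrix} \mathbf{x} \\ \gamma \end{bmatrix} \tag{30} $$

where
$$ \mathbf{x} \equiv \mathrm{adj}[(\lambda_{\max} + \mathrm{tr}(B)) \, I - S] \, \mathbf{z} = [\alpha I + (\lambda_{\max} - \mathrm{tr}(B)) \, S + S^2] \, \mathbf{z} \tag{31} $$
and
$$ \gamma \equiv \det[(\lambda_{\max} + \mathrm{tr}(B)) \, I - S] = \alpha \, [\lambda_{\max} + \mathrm{tr}(B)] - \det(S) \tag{32} $$
with
$$ \alpha \equiv \lambda_{\max}^2 - [\mathrm{tr}(B)]^2 + \mathrm{tr}(\mathrm{adj}[S]) \tag{33} $$
The second form on the right sides of Eqs. (31) and (32) follows from the Cayley-Hamilton Theorem
[3, 17]. These computations require knowledge of $\lambda_{\max}$, which is obtained by substituting Eqs. (30)
and (31) into Eq. (28), yielding:

$$ 0 = \psi(\lambda_{\max}) \equiv \gamma \, [\lambda_{\max} - \mathrm{tr}(B)] - \mathbf{z}^T [\alpha I + (\lambda_{\max} - \mathrm{tr}(B)) \, S + S^2] \, \mathbf{z} \tag{34} $$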

Substituting α and γ from Eqs. (32) and (33) gives a fourth-order equation in λmax , which is simply
the characteristic equation det(λmax I − K) = 0. Shuster observed that λmax can be easily obtained by
Newton-Raphson iteration of Eq. (34) starting from λ = λ0 as the initial estimate, since Eq. (25) shows
that λmax is very close to λ0 if the optimized loss function is small. In fact, a single iteration is generally
sufficient. But numerical analysts know that solving the characteristic equation is an unreliable way to
find eigenvalues, in general, so QUEST is in principle less robust than Davenport’s original q-method.
The analytic solution of the quartic characteristic equation is slower and no more accurate than the
iterative solution. Shuster provided the first estimate of the covariance of the rotation angle error vector
in the body frame,
$$ P = \left[ \sum_i a_i \, (I - \mathbf{b}_i \mathbf{b}_i^T) \right]^{-1} \tag{35} $$
He also showed that $2 L(A_{\mathrm{opt}})$ obeys a $\chi^2$ probability distribution with $(2 n_{\mathrm{obs}} - 3)$ degrees of freedom, to
a good approximation and assuming Gaussian measurement errors, where $n_{\mathrm{obs}}$ is the number of vector
observations. This can often provide a useful data quality check, as will be seen below.
The optimal quaternion is not defined by Eq. (30) if $\gamma^2 + |\mathbf{x}|^2 = 0$, so it is of interest to see when this
condition arises. Applying the Cayley-Hamilton theorem twice to eliminate $S^4$ and $S^3$ after substituting
Eq. (31) gives, with some tedious algebra,

$$ \gamma^2 + |\mathbf{x}|^2 = \gamma \, (d\psi/d\lambda) \tag{36} $$

where $\psi(\lambda)$ is the quartic function defined implicitly by Eq. (34). The discussion following Eq. (15)
implies that $d\psi/d\lambda$ is invariant under rotations, since the coefficients in the polynomial $\psi(\lambda)$ depend
only on the singular values of $B$ [15]. The Newton-Raphson iteration for $\lambda_{\max}$ requires $d\psi/d\lambda$ to be
nonzero, so $\gamma^2 + |\mathbf{x}|^2 = 0$ implies that $\gamma = 0$. This means that $(q_{\mathrm{opt}})_4 = 0$ and the optimal attitude
represents a 180° rotation. Shuster devised the method of sequential rotations to avoid this singular
case [16, 17, 18].
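
The QUEST steps of Eqs. (30)-(34) can be sketched as follows in Python/NumPy. This is an illustrative sketch, not Shuster's implementation: the function names and demo data are assumptions, the derivative of $\psi$ is obtained by differentiating Eqs. (31)-(34) directly, and no reference frame rotation is applied, so it presumes the rotation is not close to 180°.

```python
import numpy as np

def attitude_matrix(q):
    """Eq. (20) for q = [q_v, q_4]."""
    qv, q4 = q[:3], q[3]
    cross = np.array([[0, -qv[2], qv[1]], [qv[2], 0, -qv[0]], [-qv[1], qv[0], 0]])
    return (q4**2 - qv @ qv) * np.eye(3) + 2 * np.outer(qv, qv) - 2 * q4 * cross

def quest(B, lam0, iterations=1):
    """Newton-Raphson on Eq. (34) from lam0, then the quaternion of Eq. (30)."""
    S = B + B.T
    trB = np.trace(B)
    z = np.array([B[1, 2] - B[2, 1], B[2, 0] - B[0, 2], B[0, 1] - B[1, 0]])
    tr_adjS = 0.5 * (np.trace(S)**2 - np.trace(S @ S))     # tr(adj S) for 3x3 S
    detS = np.linalg.det(S)

    def terms(lam):
        alpha = lam**2 - trB**2 + tr_adjS                      # Eq. (33)
        gamma = alpha * (lam + trB) - detS                     # Eq. (32)
        x = (alpha * np.eye(3) + (lam - trB) * S + S @ S) @ z  # Eq. (31)
        return alpha, gamma, x

    lam = lam0
    for _ in range(iterations):
        alpha, gamma, x = terms(lam)
        psi = gamma * (lam - trB) - z @ x                      # Eq. (34)
        dgamma = 2 * lam * (lam + trB) + alpha                 # d(gamma)/d(lambda)
        dpsi = dgamma * (lam - trB) + gamma - z @ (2 * lam * z + S @ z)
        lam -= psi / dpsi
    _, gamma, x = terms(lam)
    q = np.append(x, gamma)                                    # Eq. (30), unnormalized
    return q / np.linalg.norm(q), lam

# Hypothetical noisy data: rotation of 2 rad about z, four observations.
c, s = np.cos(2.0), np.sin(2.0)
A_true = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
r = np.array([[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0], [1.0, 1.0, 1.0]])
r /= np.linalg.norm(r, axis=1, keepdims=True)
noise = 1e-4 * np.array([[1, -2, 3], [-1, 0.5, 2], [2, 1, -1], [0.5, -1.5, 1]])
b = r @ A_true.T + noise
b /= np.linalg.norm(b, axis=1, keepdims=True)
a = np.array([1.0, 2.0, 3.0, 4.0])
B = (a[:, None] * b).T @ r

q_quest, lam_quest = quest(B, a.sum(), iterations=1)

# Reference solution from the SVD method, Eqs. (14) and (26).
U, sig, Vt = np.linalg.svd(B)
d = np.linalg.det(U) * np.linalg.det(Vt)
A_ref = U @ np.diag([1.0, 1.0, d]) @ Vt
lam_ref = sig[0] + sig[1] + d * sig[2]
```

In this small example a single Newton-Raphson step from $\lambda_0$ already reproduces $\lambda_{\max}$ from the SVD to well below the noise level.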

REFERENCE FRAME ROTATIONS


The qopt (4) = 0 singularity occurs because QUEST does not treat the four components of the quaternion
on an equal basis. Davenport’s q-method avoids this singularity by treating the four components
symmetrically, but some other methods have singularities similar to that in QUEST. These singularities
can be avoided by solving for the attitude with respect to a reference coordinate frame related to the
original frame by 180° rotations about the x, y, or z coordinate axis. That is, we solve for one of the
quaternions
$$ q^i \equiv q \otimes \begin{bmatrix} \mathbf{e}_i \\ 0 \end{bmatrix} = \begin{bmatrix} \mathbf{q}_v \\ q_4 \end{bmatrix} \otimes \begin{bmatrix} \mathbf{e}_i \\ 0 \end{bmatrix} = \begin{bmatrix} q_4 \mathbf{e}_i - \mathbf{q}_v \times \mathbf{e}_i \\ -\mathbf{q}_v^T \mathbf{e}_i \end{bmatrix} \quad \text{for } i = 1, 2, 3, \tag{37} $$
where $\mathbf{e}_i$ is the unit vector along the $i$-th coordinate axis. We use the convention of Ref. [14] for
quaternion multiplication, rather than the historic convention. The products in Eq. (37) are trivial to
implement by merely permuting and changing signs of the quaternion components. For example,
$$ q^1 = [\, q_1, q_2, q_3, q_4 \,]^T \otimes [\, 1, 0, 0, 0 \,]^T = [\, q_4, -q_3, q_2, -q_1 \,]^T \tag{38} $$

The equations for the inverse transformations are the same, since a 180° rotation in the opposite
direction has the same effect. These rotations are also easy to implement on the input data, since a
rotation about axis i simply changes the signs of the j-th and k-th columns of the B matrix, where
{i, j, k} is a permutation of {1, 2, 3}. The reference system rotation is easily “undone” by Eq. (38) or
its equivalent after the optimal quaternion has been computed.
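
The following Python/NumPy sketch illustrates Eqs. (37)-(38) and the corresponding column sign changes. The helper names are hypothetical, and the axis index is zero-based (0, 1, 2 for x, y, z).

```python
import numpy as np

def attitude_matrix(q):
    """Eq. (20) for q = [q_v, q_4]."""
    qv, q4 = q[:3], q[3]
    cross = np.array([[0, -qv[2], qv[1]], [qv[2], 0, -qv[0]], [-qv[1], qv[0], 0]])
    return (q4**2 - qv @ qv) * np.eye(3) + 2 * np.outer(qv, qv) - 2 * q4 * cross

def rotate_quaternion(q, i):
    """Eq. (37): q (x) [e_i, 0], a 180-degree reference frame rotation about axis i."""
    qv, q4 = q[:3], q[3]
    e = np.zeros(3)
    e[i] = 1.0
    return np.append(q4 * e - np.cross(qv, e), -qv @ e)

def flip_columns(M, i):
    """Change the signs of the two columns of M other than column i."""
    signs = -np.ones(3)
    signs[i] = 1.0
    return M * signs                       # broadcasting scales each column

q = np.array([1.0, 2.0, 3.0, 4.0])
q /= np.linalg.norm(q)
q1 = rotate_quaternion(q, 0)               # Eq. (38): [q4, -q3, q2, -q1]
q_back = rotate_quaternion(q1, 0)          # equals -q, i.e. the same attitude as q
```

Applying the same product twice returns $-q$, which represents the same attitude as $q$, so the rotation is undone by the same operation. The attitude matrix of the rotated quaternion is the original attitude matrix with two columns negated, which mirrors the sign changes applied to the columns of $B$.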
The original QUEST implementation performed sequential rotations one axis at a time, until an
acceptable reference coordinate system was found. It is clearly preferable to save computations by
choosing a single desirable rotation as early in the computation as possible. This can be accomplished
by considering the components of an a priori quaternion, which is always available in a star tracker
application since an a priori attitude estimate is needed to identify the stars in the tracker’s field of
view. If the fourth component of the a priori quaternion has the largest magnitude, no rotation is
performed, while a rotation about the i-th axis is performed if the i-th component has the largest
magnitude. Then Eq. (38) or its equivalent shows that the fourth component of the rotated quaternion
will have the largest magnitude. This magnitude must be at least 1/2, but no larger magnitude can be
guaranteed, because a unit quaternion may have all four components with magnitude 1/2. The use of
a previous estimate as the a priori attitude guarantees $q_4 > \cos(82.5°) \approx 0.13$ in the rotated frame if
rotations between successive estimates are less than 45°.

FAST OPTIMAL ATTITUDE MATRIX (FOAM)


The singular value decomposition of $B$ gives a convenient representation for $\mathrm{adj}(B)$, $\det(B)$, and $\|B\|_F^2$.
These can be used to write the optimal attitude matrix as [15]
$$ A_{\mathrm{opt}} = [\kappa \lambda_{\max} - \det(B)]^{-1} \, [\, (\kappa + \|B\|_F^2) \, B + \lambda_{\max} \, \mathrm{adj}(B^T) - B B^T B \,] \tag{39} $$

where
$$ \kappa \equiv \frac{1}{2} (\lambda_{\max}^2 - \|B\|_F^2). \tag{40} $$
It is important to note that all the quantities in Eqs. (39) and (40) can be computed without performing
the singular value decomposition. In this method, $\lambda_{\max}$ is found from

$$ \lambda_{\max} = \mathrm{tr}(A_{\mathrm{opt}} B^T) = [\kappa \lambda_{\max} - \det(B)]^{-1} \, [\, (\kappa + \|B\|_F^2) \|B\|_F^2 + 3 \lambda_{\max} \det(B) - \mathrm{tr}(B B^T B B^T) \,], \tag{41} $$

or, after some matrix algebra,

$$ 0 = \psi(\lambda_{\max}) \equiv (\lambda_{\max}^2 - \|B\|_F^2)^2 - 8 \lambda_{\max} \det(B) - 4 \|\mathrm{adj}(B)\|_F^2 \tag{42} $$

Equations (34) and (42) for $\psi(\lambda_{\max})$ would be numerically identical with infinite-precision computations,
but the FOAM form of the coefficients is less subject to errors arising in finite-precision computations.
The FOAM algorithm gives the error covariance as:

$$ P = [\kappa \lambda_{\max} - \det(B)]^{-1} \, (\kappa I + B B^T) \tag{43} $$

A quaternion can be extracted from Aopt , with a cost of 13 MATLAB flops. This has several advantages:
the four-component quaternion is more economical than the nine-component attitude matrix, easier to
interpolate, and more easily normalized if Aopt is not exactly orthogonal due to computational errors
[19].
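
A minimal Python/NumPy sketch of FOAM, following Eqs. (39)-(43), is given below. The function names, the Newton-Raphson derivative of Eq. (42), and the demo data are assumptions made for illustration; the SVD appears only to provide a reference answer.

```python
import numpy as np

def adjugate3(M):
    """Adjugate of a 3x3 matrix: its rows are cross products of the columns of M."""
    m1, m2, m3 = M[:, 0], M[:, 1], M[:, 2]
    return np.array([np.cross(m2, m3), np.cross(m3, m1), np.cross(m1, m2)])

def foam(B, lam0, iterations=1):
    """FOAM: Newton-Raphson on Eq. (42), then Eqs. (39), (40), and (43)."""
    normB2 = np.sum(B**2)                      # squared Frobenius norm of B
    detB = np.linalg.det(B)
    adjB = adjugate3(B)
    lam = lam0
    for _ in range(iterations):
        psi = (lam**2 - normB2)**2 - 8 * lam * detB - 4 * np.sum(adjB**2)  # Eq. (42)
        dpsi = 4 * lam * (lam**2 - normB2) - 8 * detB
        lam -= psi / dpsi
    kappa = 0.5 * (lam**2 - normB2)            # Eq. (40)
    zeta = kappa * lam - detB
    A_opt = ((kappa + normB2) * B + lam * adjB.T - B @ B.T @ B) / zeta     # Eq. (39)
    P = (kappa * np.eye(3) + B @ B.T) / zeta   # Eq. (43)
    return A_opt, lam, P

# Hypothetical noisy data: rotation of 2 rad about z, four observations.
c, s = np.cos(2.0), np.sin(2.0)
A_true = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
r = np.array([[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0], [1.0, 1.0, 1.0]])
r /= np.linalg.norm(r, axis=1, keepdims=True)
noise = 1e-4 * np.array([[1, -2, 3], [-1, 0.5, 2], [2, 1, -1], [0.5, -1.5, 1]])
b = r @ A_true.T + noise
b /= np.linalg.norm(b, axis=1, keepdims=True)
a = np.array([1.0, 2.0, 3.0, 4.0])
B = (a[:, None] * b).T @ r

A_foam, lam_foam, P_foam = foam(B, a.sum())

# Reference from the SVD method, Eqs. (14) and (18).
U, sig, Vt = np.linalg.svd(B)
d = np.linalg.det(U) * np.linalg.det(Vt)
A_svd = U @ np.diag([1.0, 1.0, d]) @ Vt
s1, s2, s3 = sig[0], sig[1], d * sig[2]
P_svd = U @ np.diag([1 / (s2 + s3), 1 / (s3 + s1), 1 / (s1 + s2)]) @ U.T
```

The sketch uses $\mathrm{adj}(B^T) = \mathrm{adj}(B)^T$, so only one adjugate needs to be formed.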

ESTIMATOR OF THE OPTIMAL QUATERNION (ESOQ or ESOQ-1)


Davenport’s eigenvalue equation, Eq. (24), says that the optimal quaternion is orthogonal to all the
columns of the matrix
$$ H \equiv K - \lambda_{\max} I, \tag{44} $$
which means that it must be orthogonal to the three-dimensional subspace spanned by the columns of
$H$. The optimal quaternion is conveniently computed as the generalized four-dimensional cross-product
of any three columns of this matrix [20, 21, 22].
Another way of seeing this result is to examine the classical adjoint of $H$. Representing $K$ in terms
of its eigenvalues and eigenvectors and using the orthonormality of the eigenvectors gives, for any scalar $\lambda$,
$$ \mathrm{adj}(K - \lambda I) = \mathrm{adj}\left[ \sum_{\mu=1}^{4} (\lambda_\mu - \lambda) \, q_\mu q_\mu^T \right] = \sum_{\mu=1}^{4} (\lambda_\nu - \lambda)(\lambda_\rho - \lambda)(\lambda_\tau - \lambda) \, q_\mu q_\mu^T \tag{45} $$

We use Greek indices to label different quaternions, to avoid confusion with Latin indices that label
quaternion components; and let $\{\mu, \nu, \rho, \tau\}$ denote a permutation of $\{1, 2, 3, 4\}$. Setting $\lambda = \lambda_{\max} \equiv \lambda_1$
causes all the terms in this sum to vanish except the first, with the result

$$ \mathrm{adj}(H) = (\lambda_2 - \lambda_{\max})(\lambda_3 - \lambda_{\max})(\lambda_4 - \lambda_{\max}) \, q_{\mathrm{opt}} q_{\mathrm{opt}}^T \tag{46} $$

Thus $q_{\mathrm{opt}}$ can be computed by normalizing any non-zero column of $\mathrm{adj}(H)$, which we denote by index
$k$. Let $F$ denote the symmetric 3 × 3 matrix obtained by deleting the $k$-th row and $k$-th column from
$H$, and let $\mathbf{f}$ denote the three-component column vector obtained by deleting the $k$-th element from the
$k$-th column of $H$. Then the $k$-th element of the optimal quaternion is given by

$$ (q_{\mathrm{opt}})_k = -c \, \det(F) \tag{47} $$

and the other three elements are

$$ (q_{\mathrm{opt}})_{1, \cdots, k-1, k+1, \cdots, 4} = c \, \mathrm{adj}(F) \, \mathbf{f} \tag{48} $$

where the coefficient c is determined by normalizing the quaternion. It is desirable to let k denote the
column with the maximum Euclidean norm, which Eq. (46) shows to be the column containing the
maximum diagonal element of the adjoint. Computing all the diagonal elements of adj(H), though
not as burdensome as QUEST’s sequential rotations, is somewhat expensive; but this computation can
be avoided by using an a priori quaternion as in QUEST. In the ESOQ case, however, no rotation is
performed; we merely choose k to be the index of the element of the a priori quaternion with maximum
magnitude.
The original formulation of ESOQ used the analytic solution of the characteristic equation [23];
but the analytic formula sometimes gives complex eigenvalues, which is theoretically impossible for
a real symmetric matrix. These errors arise from inaccurate values of the coefficients of the quartic
characteristic equation, not from the solution method. It is faster, and equally accurate, to compute
$\lambda_{\max}$ by iterative solution of Eq. (42). Equation (34) would give a faster solution, but it would be less
robust, and an even more efficient solution is described below.
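
Equations (47)-(48) can be sketched in Python/NumPy as follows. To keep the example short, $\lambda_{\max}$ is taken from a library eigenvalue routine rather than from an iterative solution of Eq. (42); that substitution, the function names, and the demo data are assumptions of this sketch. The index `k` is zero-based.

```python
import numpy as np

def davenport_K(B):
    """Eqs. (22)-(23): symmetric traceless 4x4 matrix K built from B."""
    S = B + B.T
    z = np.array([B[1, 2] - B[2, 1], B[2, 0] - B[0, 2], B[0, 1] - B[1, 0]])
    K = np.empty((4, 4))
    K[:3, :3] = S - np.trace(B) * np.eye(3)
    K[:3, 3] = z
    K[3, :3] = z
    K[3, 3] = np.trace(B)
    return K

def adjugate3(M):
    """Adjugate of a 3x3 matrix: its rows are cross products of the columns of M."""
    m1, m2, m3 = M[:, 0], M[:, 1], M[:, 2]
    return np.array([np.cross(m2, m3), np.cross(m3, m1), np.cross(m1, m2)])

def esoq(B, lam_max, k):
    """Eqs. (44), (47), (48): normalized k-th column of adj(H), up to sign."""
    H = davenport_K(B) - lam_max * np.eye(4)   # Eq. (44)
    keep = [j for j in range(4) if j != k]
    F = H[np.ix_(keep, keep)]                  # H without row k and column k
    f = H[keep, k]                             # column k of H without element k
    q = np.empty(4)
    q[k] = -np.linalg.det(F)                   # Eq. (47)
    q[keep] = adjugate3(F) @ f                 # Eq. (48)
    return q / np.linalg.norm(q)

# Hypothetical noisy data: rotation of 2 rad about z, four observations.
c, s = np.cos(2.0), np.sin(2.0)
A_true = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
r = np.array([[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0], [1.0, 1.0, 1.0]])
r /= np.linalg.norm(r, axis=1, keepdims=True)
noise = 1e-4 * np.array([[1, -2, 3], [-1, 0.5, 2], [2, 1, -1], [0.5, -1.5, 1]])
b = r @ A_true.T + noise
b /= np.linalg.norm(b, axis=1, keepdims=True)
a = np.array([1.0, 2.0, 3.0, 4.0])
B = (a[:, None] * b).T @ r

lam_all, vecs = np.linalg.eigh(davenport_K(B))
q_ref = vecs[:, -1]                            # q-method reference solution
q_prior = np.array([0.0, 0.0, np.sin(1.0), np.cos(1.0)])   # quaternion of A_true
k = int(np.argmax(np.abs(q_prior)))            # a priori choice of the column index
q_esoq = esoq(B, lam_all[-1], k)
```

Choosing `k` from the a priori quaternion avoids computing all four diagonal elements of $\mathrm{adj}(H)$, as the text describes.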

First Order Update (ESOQ-1.1)

Test results show that higher-order updates do not improve the performance of the iterative methods,
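providing motivation for developing a first-order approximation. The matrix $H$ can be expanded to
first order in $\Delta\lambda \equiv \lambda_0 - \lambda_{\max}$ as
$$ H = H^0 + (\Delta\lambda) \, I \tag{49} $$
where
$$ H^0 \equiv K - \lambda_0 I \tag{50} $$
The vector $\mathbf{f}$ does not depend on $\lambda_{\max}$, which only appears in the diagonal elements of $H$; but the
matrix $F$ depends on $\lambda_{\max}$, giving
$$ F = F^0 + (\Delta\lambda) \, I \tag{51} $$
where $F^0$ is derived from $H^0$ in the same way that $F$ is derived from $H$. Matrix identities give

$$ \mathrm{adj}(F) = \mathrm{adj}(F^0) + \Delta\lambda \, [\mathrm{tr}(F^0) \, I - F^0] \tag{52} $$

and
$$ \det(F) = \det(F^0) + (\Delta\lambda) \, \mathrm{tr}(\mathrm{adj}(F^0)) \tag{53} $$
to first order in $\Delta\lambda$. The characteristic equation can be expressed to the same order as
$$ 0 = \det(H) = (H^0_{kk} + \Delta\lambda) \det(F) - \mathbf{f}^T \mathrm{adj}(F) \, \mathbf{f} = H^0_{kk} \det(F^0) - \mathbf{f}^T \mathbf{g} + [\, H^0_{kk} \, \mathrm{tr}(\mathrm{adj}(F^0)) + \det(F^0) - \mathbf{f}^T \mathbf{h} \,] \, \Delta\lambda, \tag{54} $$

where
$$ \mathbf{g} \equiv \mathrm{adj}(F^0) \, \mathbf{f} \tag{55} $$
and
$$ \mathbf{h} \equiv [\mathrm{tr}(F^0) \, I - F^0] \, \mathbf{f} \tag{56} $$
Equation (54) is easily solved for $\Delta\lambda \equiv \lambda_0 - \lambda_{\max}$, and then the first order quaternion estimate is given
by
$$ (q_{\mathrm{opt}})_k = -c \, [\det(F^0) + (\Delta\lambda) \, \mathrm{tr}(\mathrm{adj}(F^0))] \tag{57} $$
and
$$ (q_{\mathrm{opt}})_{1, \cdots, k-1, k+1, \cdots, 4} = c \, (\mathbf{g} + \mathbf{h} \, \Delta\lambda). \tag{58} $$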
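
The first-order update can be sketched directly from Eqs. (49)-(58). The Python/NumPy code below is an illustration under assumed names and demo data; the q-method eigenvector is computed only as a reference for comparison, and `k` is zero-based.

```python
import numpy as np

def davenport_K(B):
    """Eqs. (22)-(23): symmetric traceless 4x4 matrix K built from B."""
    S = B + B.T
    z = np.array([B[1, 2] - B[2, 1], B[2, 0] - B[0, 2], B[0, 1] - B[1, 0]])
    K = np.empty((4, 4))
    K[:3, :3] = S - np.trace(B) * np.eye(3)
    K[:3, 3] = z
    K[3, :3] = z
    K[3, 3] = np.trace(B)
    return K

def adjugate3(M):
    """Adjugate of a 3x3 matrix: its rows are cross products of the columns of M."""
    m1, m2, m3 = M[:, 0], M[:, 1], M[:, 2]
    return np.array([np.cross(m2, m3), np.cross(m3, m1), np.cross(m1, m2)])

def esoq11(B, lam0, k):
    """First-order ESOQ update, Eqs. (50)-(58); returns (q, lambda_max estimate)."""
    H0 = davenport_K(B) - lam0 * np.eye(4)             # Eq. (50)
    keep = [j for j in range(4) if j != k]
    F0 = H0[np.ix_(keep, keep)]
    f = H0[keep, k]
    adjF0 = adjugate3(F0)
    detF0 = np.linalg.det(F0)
    g = adjF0 @ f                                      # Eq. (55)
    h = (np.trace(F0) * np.eye(3) - F0) @ f            # Eq. (56)
    # Solve the first-order characteristic equation, Eq. (54), for delta-lambda.
    dlam = -(H0[k, k] * detF0 - f @ g) / (H0[k, k] * np.trace(adjF0) + detF0 - f @ h)
    q = np.empty(4)
    q[k] = -(detF0 + dlam * np.trace(adjF0))           # Eq. (57)
    q[keep] = g + h * dlam                             # Eq. (58)
    return q / np.linalg.norm(q), lam0 - dlam

# Hypothetical noisy data: rotation of 2 rad about z, four observations.
c, s = np.cos(2.0), np.sin(2.0)
A_true = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
r = np.array([[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0], [1.0, 1.0, 1.0]])
r /= np.linalg.norm(r, axis=1, keepdims=True)
noise = 1e-4 * np.array([[1, -2, 3], [-1, 0.5, 2], [2, 1, -1], [0.5, -1.5, 1]])
b = r @ A_true.T + noise
b /= np.linalg.norm(b, axis=1, keepdims=True)
a = np.array([1.0, 2.0, 3.0, 4.0])
B = (a[:, None] * b).T @ r

q_first, lam_first = esoq11(B, a.sum(), k=2)           # k from the largest a priori component
lam_all, vecs = np.linalg.eigh(davenport_K(B))         # q-method reference
```

Because $\Delta\lambda$ equals the optimized loss function, the neglected second-order terms are negligible whenever the measurement residuals are small.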

SECOND ESTIMATOR OF THE OPTIMAL QUATERNION (ESOQ-2)


This algorithm uses the rotation axis/angle form of the optimal quaternion, as given in Eq. (19).
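Substituting these into Eqs. (27) and (28) gives

$$ [\lambda_{\max} - \mathrm{tr}(B)] \cos(\phi/2) = \mathbf{z}^T \mathbf{e} \sin(\phi/2) \tag{59} $$

and
$$ \mathbf{z} \cos(\phi/2) = \{ [\lambda_{\max} + \mathrm{tr}(B)] \, I - S \} \, \mathbf{e} \sin(\phi/2) \tag{60} $$
Multiplying Eq. (60) by $[\lambda_{\max} - \mathrm{tr}(B)]$ and substituting Eq. (59) gives

$$ M \mathbf{e} \sin(\phi/2) = 0 \tag{61} $$

where
$$ M \equiv [\lambda_{\max} - \mathrm{tr}(B)] \, \{ [\lambda_{\max} + \mathrm{tr}(B)] \, I - S \} - \mathbf{z} \mathbf{z}^T = [\, \mathbf{m}_1 \; \vdots \; \mathbf{m}_2 \; \vdots \; \mathbf{m}_3 \,] \tag{62} $$
These computations lose numerical significance if $[\lambda_{\max} - \mathrm{tr}(B)]$ and $\mathbf{z}$ are both close to zero, which
would be the case for zero rotation angle. We can always avoid this singular condition by using one of
the sequential reference system rotations [16, 17, 18] to ensure that $\mathrm{tr}(B)$ is less than or equal to zero.
If we rotate the reference frame about the $i$-th axis

$$ \mathrm{tr}(B)_{\mathrm{rotated}} = (B_{ii} - B_{jj} - B_{kk})_{\mathrm{unrotated}} = [\, 2 B_{ii} - \mathrm{tr}(B) \,]_{\mathrm{unrotated}} \tag{63} $$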

where {i, j, k} is a permutation of {1, 2, 3}. Thus no rotation is performed in ESOQ-2 if tr(B) is
the minimum of (B11 , B22 , B33 , tr(B)), while a rotation about the i-th axis is performed if Bii is the
minimum. This will ensure the most negative value for the trace in the rotated frame. As in the
QUEST case, the rotation is easily “undone” by Eq. (38) or its equivalent after the quaternion has
been computed. Note that efficiently finding an acceptable rotated frame for ESOQ-2 does not require
an a priori attitude estimate.
Equation (61) says that the rotation axis is a null vector of $M$. The columns of $\mathrm{adj}(M)$ are the cross
products of the columns of $M$:
$$ \mathrm{adj}(M) = [\, \mathbf{m}_2 \times \mathbf{m}_3 \; \vdots \; \mathbf{m}_3 \times \mathbf{m}_1 \; \vdots \; \mathbf{m}_1 \times \mathbf{m}_2 \,] \tag{64} $$

Because $M$ is singular, all these columns are parallel, and all are parallel to the rotation axis $\mathbf{e}$. Thus
we set
$$ \mathbf{e} = \mathbf{y} / |\mathbf{y}|, \tag{65} $$

where $\mathbf{y}$ is the column of $\mathrm{adj}(M)$ (i.e., the cross product) with maximum norm. Because $M$ is symmetric,
it is only necessary to find the maximum diagonal element of its adjoint to determine which column to
use.
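The rotation angle is found from Eq. (59) or one of the components of Eq. (60). We will show that
Eq. (59) is the best choice. Comparing Eq. (22) with the eigenvector/eigenvalue expansion
$$ K = \sum_{\mu=1}^{4} \lambda_\mu \, q_\mu q_\mu^T, \tag{66} $$

establishes the identities

$$ \mathrm{tr}(B) = \sum_{\mu=1}^{4} \lambda_\mu \, (q_\mu)_4^2, \tag{67} $$

and
$$ \mathbf{z} = \sum_{\mu=1}^{4} \lambda_\mu \, (q_\mu)_4 \, (\mathbf{q}_\mu)_v. \tag{68} $$

Using Eq. (67) and the orthonormality of the eigenvectors of $K$, we find that
$$ |\mathbf{z}|^2 = \sum_{\mu=1}^{4} \lambda_\mu^2 \, (q_\mu)_4^2 - [\mathrm{tr}(B)]^2 \tag{69} $$

This equation and the normalization $\sum_{\mu} (q_\mu)_4^2 = 1$ give the inequality

$$ |\mathbf{z}| \le \max_{\mu = 1, \cdots, 4} |\lambda_\mu| = \max(\lambda_{\max}, -\lambda_{\min}) \tag{70} $$

This shows that choosing the rotated reference system that provides the most negative value of $\mathrm{tr}(B)$
makes Eq. (59) the best equation to solve for the rotation angle. With Eq. (65), this can be written

$$ [\lambda_{\max} - \mathrm{tr}(B)] \, |\mathbf{y}| \cos(\phi/2) = (\mathbf{z} \cdot \mathbf{y}) \sin(\phi/2) \tag{71} $$

which means that there is some scalar $\eta$ for which

$$ \cos(\phi/2) = \eta \, (\mathbf{z} \cdot \mathbf{y}) \tag{72} $$

and
$$ \sin(\phi/2) = \eta \, [\lambda_{\max} - \mathrm{tr}(B)] \, |\mathbf{y}| \tag{73} $$
Substituting into Eq. (19) and using Eq. (65) gives the optimal quaternion as [24, 25]
$$ q_{\mathrm{opt}} = \frac{1}{\sqrt{|[\lambda_{\max} - \mathrm{tr}(B)] \, \mathbf{y}|^2 + (\mathbf{z} \cdot \mathbf{y})^2}} \begin{bmatrix} [\lambda_{\max} - \mathrm{tr}(B)] \, \mathbf{y} \\ \mathbf{z} \cdot \mathbf{y} \end{bmatrix} \tag{74} $$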

Note that it is not necessary to normalize the rotation axis. ESOQ-2 does not define the rotation axis
uniquely if M has rank less than two. This includes the usual case of unobservable attitude and also the
case of zero rotation angle. Requiring tr(B) to be non-positive avoids zero rotation angle singularity,
however. We compute λmax by iterative solution of Eq. (42) in the general case, as for ESOQ.
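
The core of ESOQ-2, Eqs. (62), (64), and (74), can be sketched as below in Python/NumPy. For brevity this sketch omits the reference frame rotation of Eq. (63), which is acceptable here only because the demo rotation angle is far from zero, and it takes $\lambda_{\max}$ from the SVD through Eq. (26) instead of iterating Eq. (42). Both simplifications, along with all names and data, are assumptions of the example.

```python
import numpy as np

def attitude_matrix(q):
    """Eq. (20) for q = [q_v, q_4]."""
    qv, q4 = q[:3], q[3]
    cross = np.array([[0, -qv[2], qv[1]], [qv[2], 0, -qv[0]], [-qv[1], qv[0], 0]])
    return (q4**2 - qv @ qv) * np.eye(3) + 2 * np.outer(qv, qv) - 2 * q4 * cross

def esoq2(B, lam_max):
    """Eqs. (62), (64), (74): quaternion from the null vector of M (no frame rotation)."""
    trB = np.trace(B)
    S = B + B.T
    z = np.array([B[1, 2] - B[2, 1], B[2, 0] - B[0, 2], B[0, 1] - B[1, 0]])
    M = (lam_max - trB) * ((lam_max + trB) * np.eye(3) - S) - np.outer(z, z)  # Eq. (62)
    m1, m2, m3 = M[:, 0], M[:, 1], M[:, 2]
    columns = [np.cross(m2, m3), np.cross(m3, m1), np.cross(m1, m2)]         # Eq. (64)
    y = max(columns, key=np.linalg.norm)       # column of adj(M) with maximum norm
    q = np.append((lam_max - trB) * y, z @ y)  # Eq. (74), unnormalized
    return q / np.linalg.norm(q)

# Hypothetical noisy data: rotation of 2 rad about z, four observations.
c, s = np.cos(2.0), np.sin(2.0)
A_true = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
r = np.array([[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0], [1.0, 1.0, 1.0]])
r /= np.linalg.norm(r, axis=1, keepdims=True)
noise = 1e-4 * np.array([[1, -2, 3], [-1, 0.5, 2], [2, 1, -1], [0.5, -1.5, 1]])
b = r @ A_true.T + noise
b /= np.linalg.norm(b, axis=1, keepdims=True)
a = np.array([1.0, 2.0, 3.0, 4.0])
B = (a[:, None] * b).T @ r

# Reference attitude and lambda_max from the SVD method, Eqs. (14) and (26).
U, sig, Vt = np.linalg.svd(B)
d = np.linalg.det(U) * np.linalg.det(Vt)
A_ref = U @ np.diag([1.0, 1.0, d]) @ Vt
lam_max = sig[0] + sig[1] + d * sig[2]

q_esoq2 = esoq2(B, lam_max)
```

As the text notes, $\mathbf{y}$ does not need to be normalized before it is inserted into Eq. (74), since the final normalization of the quaternion absorbs its length.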

First Order Update (ESOQ-2.1)

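The motivation for and development of this algorithm are similar to those of the first order update
used in ESOQ-1.1. The matrix $M$ can be expanded to first order in $\Delta\lambda \equiv \lambda_0 - \lambda_{\max}$ as

$$ M = M^0 + (\Delta\lambda) \, N, \tag{75} $$

where
$$ M^0 \equiv [\lambda_0 - \mathrm{tr}(B)] \, \{ [\lambda_0 + \mathrm{tr}(B)] \, I - S \} - \mathbf{z} \mathbf{z}^T = [\, \mathbf{m}^0_1 \; \vdots \; \mathbf{m}^0_2 \; \vdots \; \mathbf{m}^0_3 \,] \tag{76} $$
and
$$ N \equiv S - 2 \lambda_0 I = [\, \mathbf{n}_1 \; \vdots \; \mathbf{n}_2 \; \vdots \; \mathbf{n}_3 \,], \tag{77} $$
To this same order, we have

$$ \mathbf{y} \equiv \mathbf{m}_i \times \mathbf{m}_j = (\mathbf{m}^0_i + \mathbf{n}_i \, \Delta\lambda) \times (\mathbf{m}^0_j + \mathbf{n}_j \, \Delta\lambda) = \mathbf{y}^0 + \mathbf{p} \, \Delta\lambda, \tag{78} $$

where
$$ \mathbf{y}^0 \equiv \mathbf{m}^0_i \times \mathbf{m}^0_j, \tag{79} $$
and
$$ \mathbf{p} \equiv \mathbf{m}^0_i \times \mathbf{n}_j + \mathbf{n}_i \times \mathbf{m}^0_j. \tag{80} $$
The maximum eigenvalue can be found from the condition that $M$ is singular. This gives the first-order
approximation

$$ 0 = \det(M) = (\mathbf{m}_i \times \mathbf{m}_j) \cdot \mathbf{m}_k = (\mathbf{y}^0 + \mathbf{p} \, \Delta\lambda) \cdot (\mathbf{m}^0_k + \mathbf{n}_k \, \Delta\lambda) = \mathbf{y}^0 \cdot \mathbf{m}^0_k + (\mathbf{y}^0 \cdot \mathbf{n}_k + \mathbf{m}^0_k \cdot \mathbf{p}) \, \Delta\lambda, \tag{81} $$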

where {i, j, k} is a cyclic permutation of {1, 2, 3}. This is solved for ∆λ, and the attitude estimate is
found by substituting Eq. (78) and λmax = λ0 − ∆λ into Eq. (74).
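
Equations (75)-(81) and (74) can be combined into a short Python/NumPy sketch of the first-order update. The choice of the column pair by the largest $|\mathbf{y}^0|$, the omission of the reference frame rotation, and all names and data are assumptions of this illustration; the SVD result serves only as a reference.

```python
import numpy as np

def attitude_matrix(q):
    """Eq. (20) for q = [q_v, q_4]."""
    qv, q4 = q[:3], q[3]
    cross = np.array([[0, -qv[2], qv[1]], [qv[2], 0, -qv[0]], [-qv[1], qv[0], 0]])
    return (q4**2 - qv @ qv) * np.eye(3) + 2 * np.outer(qv, qv) - 2 * q4 * cross

def esoq21(B, lam0):
    """First-order ESOQ-2 update, Eqs. (75)-(81), followed by Eq. (74)."""
    trB = np.trace(B)
    S = B + B.T
    z = np.array([B[1, 2] - B[2, 1], B[2, 0] - B[0, 2], B[0, 1] - B[1, 0]])
    M0 = (lam0 - trB) * ((lam0 + trB) * np.eye(3) - S) - np.outer(z, z)   # Eq. (76)
    N = S - 2 * lam0 * np.eye(3)                                          # Eq. (77)
    # Pick the cyclic index triple (i, j, k) whose cross product y0 is largest.
    triples = [((k + 1) % 3, (k + 2) % 3, k) for k in range(3)]
    i, j, k = max(triples, key=lambda t: np.linalg.norm(np.cross(M0[:, t[0]], M0[:, t[1]])))
    y0 = np.cross(M0[:, i], M0[:, j])                                     # Eq. (79)
    p = np.cross(M0[:, i], N[:, j]) + np.cross(N[:, i], M0[:, j])         # Eq. (80)
    dlam = -(y0 @ M0[:, k]) / (y0 @ N[:, k] + M0[:, k] @ p)               # Eq. (81)
    y = y0 + p * dlam                                                     # Eq. (78)
    lam = lam0 - dlam
    q = np.append((lam - trB) * y, z @ y)                                 # Eq. (74)
    return q / np.linalg.norm(q), lam

# Hypothetical noisy data: rotation of 2 rad about z, four observations.
c, s = np.cos(2.0), np.sin(2.0)
A_true = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
r = np.array([[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0], [1.0, 1.0, 1.0]])
r /= np.linalg.norm(r, axis=1, keepdims=True)
noise = 1e-4 * np.array([[1, -2, 3], [-1, 0.5, 2], [2, 1, -1], [0.5, -1.5, 1]])
b = r @ A_true.T + noise
b /= np.linalg.norm(b, axis=1, keepdims=True)
a = np.array([1.0, 2.0, 3.0, 4.0])
B = (a[:, None] * b).T @ r

q_21, lam_21 = esoq21(B, a.sum())

# Reference from the SVD method, Eqs. (14) and (26).
U, sig, Vt = np.linalg.svd(B)
d = np.linalg.det(U) * np.linalg.det(Vt)
A_ref = U @ np.diag([1.0, 1.0, d]) @ Vt
lam_ref = sig[0] + sig[1] + d * sig[2]
```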
There is an interesting relation between the eigenvalue condition $\det[M(\lambda)] = 0$ used in ESOQ-2.1 and
the condition $\psi(\lambda) = 0$ used in other algorithms. Since $M(\lambda)$ is a 3 × 3 matrix quadratic in $\lambda$, the
eigenvalue condition is of sixth order in $\lambda$. Straightforward matrix algebra shows that

$$ \det[M(\lambda)] = [\lambda - \mathrm{tr}(B)]^2 \, \psi(\lambda). \tag{82} $$

Thus $\det[M(\lambda)]$ has the four roots of $\psi(\lambda)$, the eigenvalues of Davenport’s $K$ matrix, and an additional
double root at $\mathrm{tr}(B)$. Choosing the rotated reference frame maximizing $-\mathrm{tr}(B)$ ensures that this
double root is far from the desired root at $\lambda_{\max}$.

TESTS
We test the accuracy and speed of MATLAB implementations of these methods, using simulated data.
The q and SVD methods use the functions eig and svd, respectively; the others use the equations in
this paper. MATLAB uses IEEE double-precision floating-point arithmetic, in which the numbers have
approximately 16 significant decimal digits [26].
We analyze three test scenarios. In all these scenarios, the pointing of one spacecraft axis, which
we take to be the spacecraft x-axis, is much better determined than the rotation about this axis. This
is a very common case that arises in spacecraft that point a single instrument (like an astronomical
telescope) very precisely. This is also a characteristic of attitude estimates from a single narrow-field-of-view
star tracker, where the rotation about the tracker boresight is much less well determined than the
pointing of the boresight. The x-axis error and the yz error, which is the error about an axis orthogonal
to the x-axis and determines the x-axis pointing, are computed from an error quaternion $q_{\mathrm{err}}$ by writing
$$ q_{\mathrm{err}} = \begin{bmatrix} \mathbf{q}_{v\,\mathrm{err}} \\ q_{4\,\mathrm{err}} \end{bmatrix} = \begin{bmatrix} \mathbf{e}_x \sin(\phi_x/2) \\ \cos(\phi_x/2) \end{bmatrix} \otimes \begin{bmatrix} \mathbf{e}_{yz} \sin(\phi_{yz}/2) \\ \cos(\phi_{yz}/2) \end{bmatrix} = \begin{bmatrix} \mathbf{e}_x \cos(\phi_{yz}/2) \sin(\phi_x/2) + \mathbf{e}_{yz} \cos(\phi_x/2) \sin(\phi_{yz}/2) - (\mathbf{e}_x \times \mathbf{e}_{yz}) \sin(\phi_x/2) \sin(\phi_{yz}/2) \\ \cos(\phi_x/2) \cos(\phi_{yz}/2) \end{bmatrix} \tag{83} $$

where $\mathbf{e}_x = [1 \;\; 0 \;\; 0]^T$ and $\mathbf{e}_{yz}$ is a unit vector orthogonal to $\mathbf{e}_x$. We can always find $\phi_x$ in $[-\pi, \pi]$ and
$\phi_{yz}$ in $[0, \pi]$ by selecting $\mathbf{e}_{yz}$ appropriately. Then, since $\mathbf{e}_{yz}$ and $\mathbf{e}_x \times \mathbf{e}_{yz}$ form an orthonormal basis for
the y-z (or 2-3) plane, the error angles are given by

$$ \phi_x = 2 \tan^{-1}(q_{\mathrm{err}\,1} / q_{\mathrm{err}\,4}) \tag{84} $$

and
$$ \phi_{yz} = 2 \sin^{-1}\left( \sqrt{q_{\mathrm{err}\,2}^2 + q_{\mathrm{err}\,3}^2} \right) \tag{85} $$

Equations (84) and (85) would be unchanged if the order of the rotations about ex and eyz were
reversed; only the unit vector eyz would be different. The magnitude φerr of the rotation angle error
vector defined by Eq. (16) is given by

cos(φerr /2) = qerr4 = cos(φx /2) cos(φyz /2). (86)

Thus φerr /2 is the hypotenuse of a right spherical triangle with sides φx /2 and φyz /2. The angles φx
and φyz are the spherical trigonometry equivalents of two orthogonal components of the error vector.
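
The error-angle decomposition of Eqs. (83)-(86) can be checked numerically. In the Python/NumPy sketch below, the quaternion product follows the convention of Eq. (37); the function names and the test angles are assumptions chosen for illustration, and the fourth component of the error quaternion is assumed to be positive.

```python
import numpy as np

def quat_mult(q, p):
    """Quaternion product q (x) p in the convention of Eq. (37), with q = [q_v, q_4]."""
    qv, q4, pv, p4 = q[:3], q[3], p[:3], p[3]
    return np.append(q4 * pv + p4 * qv - np.cross(qv, pv), q4 * p4 - qv @ pv)

def error_angles(q_err):
    """Eqs. (84)-(86): x-axis error, yz error, and total error angle (radians)."""
    phi_x = 2.0 * np.arctan(q_err[0] / q_err[3])                        # Eq. (84)
    phi_yz = 2.0 * np.arcsin(min(1.0, np.hypot(q_err[1], q_err[2])))    # Eq. (85)
    phi_err = 2.0 * np.arccos(min(1.0, q_err[3]))                       # Eq. (86)
    return phi_x, phi_yz, phi_err

# Hypothetical error rotation: 0.3 rad about x combined with 0.1 rad about an
# axis in the y-z plane, composed as in Eq. (83).
phi_x_true, phi_yz_true = 0.3, 0.1
e_x = np.array([1.0, 0.0, 0.0])
e_yz = np.array([0.0, np.cos(0.7), np.sin(0.7)])
q_x = np.append(e_x * np.sin(phi_x_true / 2), np.cos(phi_x_true / 2))
q_yz = np.append(e_yz * np.sin(phi_yz_true / 2), np.cos(phi_yz_true / 2))
q_err = quat_mult(q_x, q_yz)

phi_x, phi_yz, phi_err = error_angles(q_err)
```

The recovered angles match the inputs, and $\cos(\phi_{\mathrm{err}}/2)$ equals the product $\cos(\phi_x/2) \cos(\phi_{yz}/2)$ of Eq. (86).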

First Test Scenario

The first scenario simulates an application for which the QUEST algorithm has been widely used.
A single star tracker with a narrow field of view and boresight at $[1, 0, 0]^T$ is assumed to track five stars
at
$$ \mathbf{b}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad \mathbf{b}_2 = \begin{bmatrix} 0.99712 \\ 0.07584 \\ 0 \end{bmatrix}, \quad \mathbf{b}_3 = \begin{bmatrix} 0.99712 \\ -0.07584 \\ 0 \end{bmatrix}, \quad \mathbf{b}_4 = \begin{bmatrix} 0.99712 \\ 0 \\ 0.07584 \end{bmatrix}, \quad \mathbf{b}_5 = \begin{bmatrix} 0.99712 \\ 0 \\ -0.07584 \end{bmatrix}. \tag{87} $$

We simulate 1,000 test cases with uniformly distributed random attitude matrices, which we use to map
the five observation vectors to the reference frame. The reference vectors are corrupted by Gaussian
random noise with equal standard deviations of 6 arcseconds per axis and then normalized. Equation
(35) gives the covariance for the star tracker scenario as
$$ P = (6 \ \mathrm{arcsec})^2 \left[ 5 I - \sum_{i=1}^{5} \mathbf{b}_i \mathbf{b}_i^T \right]^{-1} = \mathrm{diag}[1565, \; 7.2, \; 7.2] \ \mathrm{arcsec}^2 \tag{88} $$

which gives the standard deviations of the attitude estimation errors as
$$ \sigma_x = \sqrt{1565} \ \mathrm{arcsec} = 40 \ \mathrm{arcsec} \quad \text{and} \quad \sigma_{yz} = \sqrt{7.2 + 7.2} \ \mathrm{arcsec} = 3.8 \ \mathrm{arcsec} \tag{89} $$
This is a very favorable five-star case, since the stars are uniformly and symmetrically distributed across
the tracker’s field of view. One advantage of simulating a fixed star distribution and applying Gaussian
noise to the reference vectors rather than the observation vectors is that Eqs. (87-89) are always valid,
and the predicted covariance can be compared with the results of the Monte Carlo simulation.
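
The numbers in Eqs. (88) and (89) follow directly from the five star vectors of Eq. (87). The short Python/NumPy calculation below reproduces them; the variable names are illustrative.

```python
import numpy as np

# Five star vectors of Eq. (87), one per row.
b = np.array([
    [1.0, 0.0, 0.0],
    [0.99712, 0.07584, 0.0],
    [0.99712, -0.07584, 0.0],
    [0.99712, 0.0, 0.07584],
    [0.99712, 0.0, -0.07584],
])
sigma = 6.0                                    # arcsec per axis, equal for all stars

# Eq. (88): P = sigma^2 [5 I - sum_i b_i b_i^T]^{-1}, in arcsec^2.
P = sigma**2 * np.linalg.inv(5.0 * np.eye(3) - b.T @ b)

# Eq. (89): standard deviations about the boresight and of the boresight pointing.
sigma_x = np.sqrt(P[0, 0])
sigma_yz = np.sqrt(P[1, 1] + P[2, 2])
```

The large ratio between the first diagonal element and the other two reflects how poorly a narrow-field-of-view tracker determines the rotation about its own boresight.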

Figure 1: Empirical (solid line) and Theoretical (circles) Loss Function Distribution for the Seven-
Degree-of-Freedom Star Tracker Scenario

The loss function is computed with measurement variances in radians$^2$, since this results in
$2 L(A_{\mathrm{opt}})$ approximately obeying a $\chi^2$ distribution. The minimum and maximum values of the loss
function in the 1,000 test runs are 0.23 and 12.1, respectively. The probability distribution of the loss
function is plotted as the solid line in Figure 1, and several values of $P(\chi^2 | \nu)$ for $\chi^2 = 2 L(A_{\mathrm{opt}})$ and
$\nu = 2 n_{\mathrm{obs}} - 3 = 7$ are plotted as circles [23]. The agreement is seen to be excellent, which indicates
that the measurement weights accurately reflect the normally-distributed measurement errors for this
scenario.

The RSS (outside of parentheses) and the maximum (in parentheses) estimation errors over the
1, 000 cases for the star tracker scenario are presented in Table 1. The q-method and the SVD method
should both give the truly optimal solution, since they are based on robust matrix analysis algorithms
[3, 10]. The q-method is taken as optimal by definition, so no estimated-to-optimal differences are
presented for that algorithm, and the differences between the SVD and q-methods provide an estimate
of the computational errors of both methods. No estimate of the loss function is provided when no
update of λmax is performed, accounting for the lack of entries in the loss function column in the tables
for these cases.
The loss function is computed exactly by both the q and SVD methods, in principle. Equation
(3) gives $\lambda_0 = 5.9 \times 10^{9}$ rad$^{-2}$ for this scenario, so the expected error in double-precision machine
computation of $\lambda_{\max}$, and thus of the loss function, is on the order of $10^{-16}$ times this, or about $10^{-6}$,
in rough agreement with the difference shown in the table. In fact, all the algorithms that compute
the loss function give nearly the same accuracy in this scenario. The table also shows that one
Newton-Raphson iteration for $\lambda_{\max}$ is always sufficient, with a second iteration providing no practical
improvement.
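
The order-of-magnitude argument above is easy to reproduce. The following Python sketch evaluates Eq. (3) for five observations with 6-arcsecond errors and multiplies by the double-precision machine epsilon; the variable names are illustrative.

```python
import math
import sys

arcsec = math.pi / (180.0 * 3600.0)        # one arcsecond in radians
sigma = 6.0 * arcsec                       # per-axis measurement error
lam0 = 5.0 / sigma**2                      # Eq. (3) with a_i = sigma^-2, in rad^-2

eps = sys.float_info.epsilon               # about 2.2e-16 for IEEE double precision
loss_rounding = eps * lam0                 # expected size of rounding error in lambda_max
```

Here `lam0` comes out near $5.9 \times 10^{9}$ and `loss_rounding` near $10^{-6}$, consistent with the figures quoted in the text.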

Table 1: Estimation Errors for Star Tracker Scenario

Although the attitude estimates of the different algorithms in Table 1 may be closer or farther from
the optimal estimate, all the algorithms provide estimates that are equally close to the true attitude.
The number of digits presented in the table is chosen to emphasize this point, but these digits are not
all significant; the results of 1,000 different random cases would not agree with these cases to more than
two decimal places. The smallest angle differences in the table, about $10^{-10}$ arcseconds, are equal to
$5 \times 10^{-16}$ radians, which is at the limit of double precision math. The differences between the estimated
and optimal values further show that no update of λmax is required in this scenario, since the estimates
using λ0 are equally close to the truth. Finally, it is apparent that the covariance estimate of Eq. (88)
is quite accurate.

Figure 2: Empirical (solid line) and Theoretical (circles) Loss Function Distribution for the Three-
Degree-of-Freedom Unequal Measurement Weight Scenario

Second Test Scenario

The second scenario uses three observations with widely varying accuracies to provide a difficult test
case for the algorithms under consideration. The three observation vectors are
     
1 −0.99712 −0.99712
b1 =  0  , b2 =  0.07584  , and b3 =  −0.07584  . (90)
0 0 0

15
We simulate 1, 000 test cases as in the star tracker scenario, but with Gaussian noise of one arcsecond
per axis on the first observation, and 1 per axis on the other two. This models the case that the
first observation is from an onboard astronomical telescope, and the other two observations are from
a coarse sun sensor and a magnetometer, for example. A very accurate estimate of the orientation of
the x-axis is required in such an application, but the rotation about this axis is expected to be fairly
poorly determined. This is reflected by the predicted covariance in this scenario, which is to a very
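good approximation,
\[
P = \mathrm{diag}\left[ \tfrac{1}{2}\,(1 - 0.99712^2)^{-1}\ \mathrm{deg}^2,\ 1\ \mathrm{arcsec}^2,\ 1\ \mathrm{arcsec}^2 \right], \tag{91}
\]
giving
\[
\sigma_x = 9.3\ \mathrm{deg} \quad \text{and} \quad \sigma_{yz} = 1.4\ \mathrm{arcsec}. \tag{92}
\]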
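The numbers in Eq. (92) follow directly from the diagonal of Eq. (91). The sketch below reproduces the arithmetic; it assumes that σyz denotes the root-sum-square of the y and z standard deviations, which is an interpretation rather than a definition given here, and the variable names are illustrative.

```python
import math

b2_x = -0.99712  # x component of b2 and b3 in Eq. (90)

# Variance of the rotation about the x-axis, in deg^2, from Eq. (91).
var_x_deg2 = 0.5 / (1.0 - b2_x**2)
sigma_x_deg = math.sqrt(var_x_deg2)

# Assumption: sigma_yz combines the y and z variances (1 arcsec^2 each)
# as a root-sum-square.
sigma_yz_arcsec = math.sqrt(1.0 + 1.0)

print(f"sigma_x = {sigma_x_deg:.1f} deg, sigma_yz = {sigma_yz_arcsec:.1f} arcsec")
```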
The minimum and maximum values of the loss function computed by the q-method in the 1,000
test runs for the second scenario are 0.003 and 8.5, respectively. The probability distribution of the
loss function is plotted as the solid line in Figure 2, and several values of the χ² distribution with three
degrees of freedom are plotted as circles. The agreement is almost as good as the seven-degree-of-freedom
case.

Table 2: Estimation Errors for Unequal Measurement Weight Scenario

The estimation errors for this scenario are presented in Table 2, which is similar to Table 1 except
that the rotation errors about the x-axis are given in degrees. The agreement of the q and SVD
methods is virtually identical to their agreement in the star tracker scenario, but the other algorithms
show varying performance. The best results for the attitude accuracies are in agreement with the
covariance estimates of Eq. (91).
Equation (3) gives λ0 = 8.5 × 10¹⁰ rad⁻² for this scenario, so the expected accuracy of the loss function
in double-precision machine computations is on the order of 10⁻⁵, which is the level of agreement
between the values computed by the q and SVD methods. None of the other methods computes
the loss function nearly as accurately. This differs from the first scenario, where all the algorithms
came close to achieving the maximum precision available in double-precision arithmetic. The iterative
computation of λmax in QUEST, ESOQ-1.1, and ESOQ-2.1 is poor, but this has surprisingly little
effect on the determination of the x-axis pointing. The determination of the rotation about the x-axis
is adversely affected by an inaccurate computation of λmax , however, with maximum deviations from
the optimal estimate of almost 180°. The only useful results of QUEST are obtained by not performing
any iterations for λmax .
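The quoted accuracy of about 10⁻⁵ for the loss function can be reproduced with a one-line estimate. The sketch assumes that the loss is obtained as a small difference of quantities of order λ0, so that its absolute round-off error scales as machine epsilon times λ0; that scaling argument and the variable names are assumptions for illustration.

```python
import sys

eps = sys.float_info.epsilon  # about 2.2e-16 for IEEE double precision

# lambda_0 quoted in the text for the second scenario, in rad^-2.
lambda0 = 8.5e10

# Assumed round-off scale of a loss value formed from terms of size lambda_0.
loss_error_scale = eps * lambda0

print(f"expected loss-function accuracy ~ {loss_error_scale:.1e}")
```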

Figure 3: Empirical Loss Function Distribution for the Mismodeled Measurement Weight Scenario

As noted in the discussion of the analytic solution of the characteristic equation for ESOQ-1, errors
in the computation of the eigenvalues are believed to arise from inaccurate values of the coefficients
of the quartic characteristic equation rather than from the solution method employed. The superior
accuracy of the iterative computation of λmax in FOAM, ESOQ, and ESOQ-2 as compared to QUEST,
ESOQ-1.1, and ESOQ-2.1 is likely due to the fact that Eq. (42) deals with B directly, while the other
algorithms lose some numerical significance by using the symmetric and skew parts S and z.

Table 3: Estimation Errors for Mismodeled Measurement Weight Scenario

Third Test Scenario

The third scenario investigates the effect of measurement noise mismodeling, illustrating problems
that first appeared in analyzing data from the Upper Atmosphere Research Satellite [27]. Of course, no
one would intentionally use erroneous models, but it can be very difficult to determine an accurate noise
model for real data, and the assumption of any level of white noise is often a poor approximation to
real measurement errors. This scenario uses the same three observation vectors as the second scenario,
given by Eq. (90). We again simulate 1,000 test cases, but with Gaussian white noise of 1° per axis on
the first observation and 0.1° per axis on the other two. The estimator, however, incorrectly assumes
measurement errors of 0.1° per axis on all three observations, so it weights the measurements equally.
The minimum and maximum values of the loss function computed by the q-method in the 1,000
test runs for the third scenario are 0.07 and 453, respectively. The probability distribution of the loss
function is plotted in Figure 3. The theoretical three-degree-of-freedom distribution was plotted in
Figure 2; it is not plotted in Figure 3 since it would be a very poor fit to the data. More than 95%
of the values of L(Aopt) are theoretically expected to lie below 4, according to the χ² distribution of
Figure 2, but almost half of the values of the loss function in Figure 3 have values greater than 50.
Shuster has emphasized that large values of the loss function are an excellent indication of measurement
mismodeling or simply of bad data.
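The 95% figure quoted above can be checked against the closed-form χ² distribution function for three degrees of freedom. The sketch assumes that it is 2L(Aopt), not L(Aopt) itself, that follows the χ² distribution, so that L < 4 corresponds to a χ² value below 8; this scaling is an assumption stated here for the check, and the function name is illustrative.

```python
import math

def chi2_cdf_3dof(x: float) -> float:
    """CDF of the chi-square distribution with three degrees of freedom."""
    if x <= 0.0:
        return 0.0
    # Closed form of the regularized lower incomplete gamma function P(3/2, x/2).
    return (math.erf(math.sqrt(x / 2.0))
            - math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

# Assumption: 2 * L(A_opt) is chi-square distributed, so L < 4 means 2L < 8.
p_below_4 = chi2_cdf_3dof(2.0 * 4.0)

print(f"P(L < 4) = {p_below_4:.3f}")
```

Under this assumption the probability comes out slightly above 0.95, in line with the statement in the text.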
The estimation errors for this scenario are presented in Table 3, which is similar to Tables 1 and 2
except that all the angular errors are given in degrees. The truly optimal q and SVD methods agree
even more closely than in the other scenarios. Equation (3) gives λ0 = 5 × 10⁵ rad⁻² for this scenario,
so the expected accuracy of the loss function in double-precision machine computations is on the order
of 10⁻¹⁰, the level of agreement between the q and SVD methods. As in the second scenario, none
of the other methods computes the loss function nearly as accurately. In the third scenario, though,
the iterative computation of λmax works well for all the algorithms, and both iterations improve the
agreement of the loss function and attitude estimates with the optimal values. The first order refinement
is reflected in a reduction of the attitude errors, particularly in determining the rotation about the x-
axis, but no algorithm is aided significantly by a second-order update. As in the first scenario, all the
algorithms with the first order update to λmax perform as well as the q and SVD methods.

Figure 4: Execution Times for Robust Estimation Algorithms

FOAM, ESOQ, and ESOQ-2 use first order approximation for λmax .

Speed

Absolute speed numbers are not very important for ground computations, since the actual estimation
algorithm is only a part of the overall attitude determination data processing effort. Speed was more
important in the past, when thousands of attitude solutions had to be computed by slower machines,
which is why QUEST was so important for the MAGSAT mission. For a real-time computer in a
spacecraft attitude control system or a star tracker, which must finish all its required tasks in a limited
time, the longest computation time is more important than the average time. This would penalize some
algorithms for real-time applications, unless we eliminate the need for sequential rotations by using the
methods described above. Our speed comparisons use an a priori quaternion for QUEST and ESOQ,
or the diagonal elements of B for ESOQ-2, to eliminate these extra computations. Its independence
from a priori attitude information somewhat favors ESOQ-2 for real-time applications.

Figure 5: Execution Times for Fast Estimation Algorithms. Numbers in parentheses denote the order of
the λmax approximation.

Figures 4 and 5 show the maximum number of MATLAB floating-point operations (flops) to compute
an attitude using two to six reference vectors; the times to process more than six vectors follow the
trends seen in the figure. The inputs for the timing tests are the nobs normalized reference and
observation vector pairs and their nobs weights. One thousand test cases with random attitudes and
random reference vectors with Gaussian measurement noise were simulated for each number of reference
vectors.
Figure 4 plots the times of the more robust methods. The break in the plots for FOAM, ESOQ,
and ESOQ-2 at nobs = 3 results from using an exact solution of the characteristic equation in the
two-observation case, when det(B) = 0 and Eq. (42) shows that ψ(λmax) is a quadratic function of λmax².
For three or more observations, these algorithms are timed for a first-order update to λmax using Eq.
(42). Additional iterations for λmax are not expensive, costing only 11 flops each. It is clear that the q-
method and the SVD method require significantly more computational effort than the other algorithms,
as expected. The q-method is more efficient than the SVD method, except in the least interesting two-
observation case. The other three algorithms are much faster, with the fastest, ESOQ and ESOQ-2,
being nearly equal in speed. An implementation of the q-method computing only the largest eigenvalue
of the K matrix and its eigenvector would be faster in principle than eig, which computes all four
eigenvalues and eigenvectors. However, the MATLAB routine that can be configured this way is much
slower than eig. This option was not investigated further, since the q-method is unlikely to be the
method of choice when speed is a primary consideration.
Figure 5 compares the timing of the fastest methods, which generally use zeroth order and first order
approximations for λmax. Both QUEST(1) and ESOQ-2.1 use the exact quadratic solution for λmax in the
two-observation case, but ESOQ-1.1 uses its faster first order approximation for any number of observations.
It is clear that ESOQ and ESOQ-2 are the fastest algorithms using the zeroth order approximation for
λmax, and ESOQ-1.1 is the fastest of the first order methods.

CONCLUSIONS
This paper has examined the most useful algorithms for estimating spacecraft attitude from vector
measurements based on minimizing Wahba’s loss function. These were tested in three scenarios, which
show that the most robust, reliable, and accurate estimators are Davenport’s q-method and the Singular
Value Decomposition (SVD) method. This is not surprising, since these methods are based on robust and
well-tested general-purpose matrix algorithms. The q-method, which computes the optimal quaternion
as the eigenvector of a symmetric 4 × 4 matrix with the largest eigenvalue, is the faster of these two.
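The description of the q-method above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the implementation timed in the paper: B, S, and z follow the notation used in the text, while the layout of the 4 × 4 matrix K, the scalar-last quaternion convention, and the function names `q_method` and `attitude_matrix` are assumptions made for this example.

```python
import numpy as np

def q_method(b, r, a):
    """Davenport q-method sketch.

    b, r: (n, 3) arrays of body-frame and reference-frame unit vectors.
    a:    (n,) array of non-negative weights.
    Returns the quaternion [q1, q2, q3, q4] (scalar last) and lambda_max.
    """
    b = np.asarray(b, dtype=float)
    r = np.asarray(r, dtype=float)
    a = np.asarray(a, dtype=float)

    # Attitude profile matrix B = sum_i a_i b_i r_i^T.
    B = (a[:, None] * b).T @ r
    S = B + B.T
    # z = sum_i a_i (b_i x r_i)
    z = (a[:, None] * np.cross(b, r)).sum(axis=0)

    # Symmetric 4 x 4 matrix K (assumed layout, scalar part last).
    K = np.empty((4, 4))
    K[:3, :3] = S - np.trace(B) * np.eye(3)
    K[:3, 3] = z
    K[3, :3] = z
    K[3, 3] = np.trace(B)

    # eigh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals, eigvecs = np.linalg.eigh(K)
    return eigvecs[:, -1], eigvals[-1]

def attitude_matrix(q):
    """Attitude matrix for a scalar-last unit quaternion."""
    qv, q4 = q[:3], q[3]
    qx = np.array([[0.0, -qv[2], qv[1]],
                   [qv[2], 0.0, -qv[0]],
                   [-qv[1], qv[0], 0.0]])
    return (q4**2 - qv @ qv) * np.eye(3) + 2.0 * np.outer(qv, qv) - 2.0 * q4 * qx

# Noise-free check with a hypothetical true attitude: 30 degrees about z.
ang = np.radians(30.0)
A_true = np.array([[np.cos(ang), np.sin(ang), 0.0],
                   [-np.sin(ang), np.cos(ang), 0.0],
                   [0.0, 0.0, 1.0]])
r = np.eye(3)        # three orthogonal reference vectors, one per row
b = r @ A_true.T     # b_i = A_true r_i for each row
q_opt, lam_max = q_method(b, r, np.ones(3))
A_est = attitude_matrix(q_opt)
```

With noise-free inputs the recovered matrix should match the true attitude to round-off, and the largest eigenvalue should equal the sum of the weights. The overall sign of the eigenvector returned by `eigh` is arbitrary, but it does not affect the attitude matrix.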
Several algorithms are significantly less burdensome computationally than the q and SVD methods.
These methods are less robust in principle, since they solve the quartic characteristic polynomial
equation for the maximum eigenvalue, a procedure that is potentially numerically unreliable. Algorithms
that use the form of the characteristic polynomial from the Fast Optimal Attitude Matrix (FOAM)
algorithm performed as well as the q and SVD methods in practice, however. The fastest of these
algorithms are the EStimators of the Optimal Quaternion, ESOQ and ESOQ-2. The execution times
of these methods are reduced by using the information from an a priori attitude estimate to eliminate
sequential rotations in QUEST and extra computations in ESOQ, or information derived from the
observations to speed ESOQ-2.
All the algorithms tested perform as well as the more robust algorithms in cases where measurement
weights do not vary too widely and are reasonably well modeled. If the measurement uncertainties are
not well represented by white noise, however, an update of λmax is required, and this update can be
unreliable if the measurement weights span a wide range. The examples in the paper show that these robustness
concerns are not an issue for the processing of multiple star observations with comparable accuracies,
the most common application of Wahba’s loss function. Thus the fastest algorithms, the zeroth-order
ESOQ and ESOQ-2 and the first-order ESOQ-1.1, are well suited to star tracker attitude determination
applications. In general-purpose applications where measurement weights may vary greatly, one of the
more robust algorithms may be preferred.

Bibliography

[1] Wahba, Grace “A Least Squares Estimate of Spacecraft Attitude,” SIAM Review, Vol. 7, No. 3,
July 1965, p. 409.
[2] Shuster, Malcolm D. “Maximum Likelihood Estimate of Spacecraft Attitude,” The Journal of the
Astronautical Sciences, Vol. 37, No. 1, January-March 1989, pp. 79-88.
[3] Horn, Roger A. and Charles R. Johnson Matrix Analysis, Cambridge, UK, Cambridge University
Press, 1985.
[4] Farrell, J. L. and J. C. Stuelpnagel “A Least Squares Estimate of Spacecraft Attitude,” SIAM
Review, Vol. 8, No. 3, July 1966, pp. 384-386.
[5] Wessner, R. H. SIAM Review, Vol. 8, No. 3, July 1966, p. 386.
[6] Velman, J. R. SIAM Review, Vol. 8, No. 3, July 1966, p. 386.
[7] Brock, J. E. SIAM Review, Vol. 8, No. 3, July 1966, p. 386.
[8] Markley, F. Landis “Attitude Determination Using Vector Observations and the Singular Value
Decomposition,” AAS Paper 87-490, AAS/AIAA Astrodynamics Specialist Conference, Kalispell,
MT, August 1987.
[9] Markley, F. Landis “Attitude Determination Using Vector Observations and the Singular Value
Decomposition,” Journal of the Astronautical Sciences, Vol. 36, No. 3, July-Sept. 1988, pp. 245-258.
[10] Golub, Gene H. and Charles F. Van Loan Matrix Computations, Baltimore, MD, The Johns
Hopkins University Press, 1983.
[11] Keat, J. “Analysis of Least-Squares Attitude Determination Routine DOAOP,” Computer Sciences
Corporation Report CSC/TM-77/6034, February 1977.
[12] Lerner, Gerald M. “Three-Axis Attitude Determination,” in Spacecraft Attitude Determination and
Control, ed. by James R. Wertz, Dordrecht, Holland, D. Reidel, 1978.
[13] Markley, F. Landis “Parameterizations of the Attitude,” in Spacecraft Attitude Determination and
Control, ed. by James R. Wertz, Dordrecht, Holland, D. Reidel, 1978.
[14] Shuster, Malcolm D. “A Survey of Attitude Representations,” The Journal of the Astronautical
Sciences, Vol. 41, No. 4, October-December 1993, pp. 439-517.
[15] Markley, F. Landis “Attitude Determination Using Vector Observations: a Fast Optimal Matrix
Algorithm,” The Journal of the Astronautical Sciences, Vol. 41, No. 2, April-June 1993,
pp. 261-280.
[16] Shuster, Malcolm D. “Approximate Algorithms for Fast Optimal Attitude Computation,” AIAA
Paper 78-1249, AIAA Guidance and Control Conference, Palo Alto, CA, August 7-9, 1978.
[17] Shuster, Malcolm D. and S. D. Oh “Three-Axis Attitude Determination from Vector Observations,”
Journal of Guidance and Control, Vol. 4, No. 1, January-February 1981, pp. 70-77.
[18] Shuster, Malcolm D. and Gregory A. Natanson “Quaternion Computation from a Geometric Point
of View,” The Journal of the Astronautical Sciences, Vol. 41, No. 4, October-December 1993, pp.
545-556.
[19] Markley, F. L. “New Quaternion Attitude Estimation Method,” Journal of Guidance, Control, and
Dynamics, Vol. 17, No. 2, March-April 1994, pp. 407-409.
[20] Mortari, Daniele “EULER-q Algorithm for Attitude Determination from Vector Observations,”
Journal of Guidance, Control, and Dynamics, Vol. 21, No. 2, March-April 1998, pp. 328-334.
[21] Mortari, Daniele “ESOQ: A Closed-Form Solution to the Wahba Problem,” The Journal of the
Astronautical Sciences, Vol. 45, No. 2, April-June 1997, pp. 195-204.
[22] Mortari, Daniele “n-Dimensional Cross Product and its Application to Matrix Eigenanalysis,”
Journal of Guidance, Control, and Dynamics, Vol. 20, No. 3, May-June 1997, pp. 509-515.
[23] Abramowitz, Milton, and Irene A. Stegun, Handbook of Mathematical Functions with Formulas,
Graphs, and Mathematical Tables, New York, NY, Dover Publications, Inc., 1965, Chapter 26.
[24] Mortari, Daniele “ESOQ-2 Single-Point Algorithm for Fast Optimal Attitude Determination,”
Paper AAS 97-167, AAS/AIAA Space Flight Mechanics Meeting, Huntsville, AL, February 10-12,
1997.
[25] Mortari, Daniele “Second Estimator of the Optimal Quaternion,” Journal of Guidance, Control,
and Dynamics, Vol. 23, No. 5, September-October 2000, pp. 885-888.
[26] The Math Works, Inc., MATLAB User’s Guide, Natick, MA, 1992.
[27] Deutschmann, J. “Comparison of Single Frame Attitude Determination Methods,” Goddard Space
Flight Center Memo to Thomas H. Stengle, July 26, 1993.
