KLT DSP Part 1
Stephane Dumas
jgsdumas@gmail
1 The Eigenproblems
1.1 Introduction
1.1.1 Euler's rotation theorem
1.2 Solving the problem
1.2.1 The Jacobi Method
1.2.2 The QR Algorithm
1.2.3 The Lanczos Algorithm
1.2.4 Performance
1.3 Also known as...
4 Information Content
4.1 Detecting Structure in Signal
4.2 SETI Quest
This document is the first part of a report on my work on the Karhunen-Loève Transform and its application to data analysis (i.e. Digital Signal Processing). I also present some other aspects of the KLT as a tool for data compression. Several parts of this document have been presented at different conferences. My first application of the eigenproblem was related to my work on using Principal Component Analysis (PCA) to analyse infrared spectra in order to classify them. I was able to develop a method to identify unknown samples by comparing them to known substances.
Stephane Dumas
Chapter 1
The Eigenproblems
1.1 Introduction
Eigenproblems [12, 21] are part of the field known as Functional Analysis. In linear algebra, an eigenvector of a square matrix A is a vector that does not change direction under the associated linear transformation, as illustrated by equation 1.1.
Figure 1.1: Illustration of the eigenspace in 2D. In the original space (black), the data require two coordinates to be located, while in the eigenspace (red) only one coordinate is required.
1.1.1 Euler's rotation theorem

~w = R~v = ~v   (1.2)

which is a special case of the more general equation

~w = R~v = λ~v   (1.3)

After some rearrangement, one can see that

(R − λI)~v = 0   (1.4)
where I is the identity matrix. Equation 1.4 holds if and only if
              | R11 − λ   R12       R13     |
det(R − λI) = | R21       R22 − λ   R23     | = 0   (1.5)
              | R31       R32       R33 − λ |
where det(R) is the determinant of the matrix R. This produces three possible values for λ (i.e. roots, characteristic values or eigenvalues). Only real values (as opposed to complex ones) are valid solutions for this type of problem.
Later (around 1850), the same problem was explored under the name of the Sturm-Liouville problem. At the beginning of the 20th century, H. Hotelling [11] revisited the problem and expanded it to the form known today.
1.2 Solving the problem

1.2.1 The Jacobi Method

A′ ← JᵀAJ   (1.6)
It is named after Carl Gustav Jacob Jacobi [13], who first proposed the method in 1846, but it only became widely used in the 1950s with the advent of computers. Each transformation (i.e. a Jacobi rotation) is just a plane rotation designed to annihilate one of the off-diagonal matrix elements. With each rotation, the matrix becomes more and more diagonal.
The algorithm is relatively easy to implement on a computer, but its performance degrades quickly with increasing values of N (the order of the matrix being processed).
In summary, the method is as follows. Choose a pair (p, q) such that 1 ≤ p < q ≤ n (where n is the rank of the matrix), then compute the corresponding pair (c, s) (the cosine and sine of the rotation angle θ):

θ ≡ (aqq − app) / (2apq)   (1.7)

where app, aqq and apq are elements of the matrix A. The cosine and sine are then defined as:
c = 1 / √(1 + t²)
s = t·c                           (1.8)
t = 1 / (|θ| + √(θ² + 1))
    | 1                         |
    |   ·.                      |
    |      c   ···   s          |
J = |      :    1    :          |   (1.9)
    |     −s   ···   c          |
    |                   ·.      |
    |                      1    |
The method requires manipulating the whole matrix A at every step, and it becomes slow for large values of N.
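As an illustration, the following minimal Python sketch implements a cyclic sweep strategy for the Jacobi method using equations 1.7 to 1.9. It is not the code benchmarked in table 1.1; the function name and tolerances are illustrative choices.

    import numpy as np

    def jacobi_eigen(A, tol=1e-10, max_sweeps=50):
        # Cyclic Jacobi method for a symmetric matrix A (illustrative sketch).
        A = A.astype(float).copy()
        n = A.shape[0]
        V = np.eye(n)                      # accumulates the eigenvectors
        for _ in range(max_sweeps):
            off = np.sqrt((A**2).sum() - (np.diag(A)**2).sum())
            if off < tol:                  # off-diagonal mass is gone
                break
            for p in range(n - 1):
                for q in range(p + 1, n):
                    if abs(A[p, q]) < tol:
                        continue
                    # eqs. 1.7 and 1.8: rotation angle theta, then t, c, s
                    theta = (A[q, q] - A[p, p]) / (2.0 * A[p, q])
                    t = (1.0 if theta >= 0 else -1.0) / (abs(theta) + np.sqrt(theta**2 + 1.0))
                    c = 1.0 / np.sqrt(1.0 + t**2)
                    s = t * c
                    # eq. 1.9: plane rotation J, applied as A' = J^T A J
                    # (a real implementation updates only rows/columns p and q)
                    J = np.eye(n)
                    J[p, p] = J[q, q] = c
                    J[p, q], J[q, p] = s, -s
                    A = J.T @ A @ J        # annihilates A[p, q]
                    V = V @ J
        return np.diag(A), V               # eigenvalues, eigenvectors (columns)

The nested sweeps over every (p, q) pair make the cost per pass visible, which is why the timings in table 1.1 grow so quickly with N.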
1.2.2 The QR Algorithm

A = Q·R   (1.10)

where Q is orthogonal and R is upper triangular. The algorithm is based on the tridiagonal decomposition of the matrix A, which must be symmetric. This method is harder to implement but faster than Jacobi.
The decomposition is constructed by applying Householder [9, 28] transformations to annihilate successive columns of A below the diagonal.
This method also requires the manipulation of the whole matrix during the
operations.
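For illustration, a bare-bones QR iteration can be sketched in a few lines of Python. This is a minimal, unshifted version (production codes first reduce A to tridiagonal form and use implicit shifts, as described above); it is not the Numerical Recipes routine used in table 1.1.

    import numpy as np

    def qr_eigen(A, iters=500):
        # Unshifted QR iteration for a symmetric matrix (illustrative sketch).
        A = A.astype(float).copy()
        for _ in range(iters):
            Q, R = np.linalg.qr(A)   # factor A_k = Q_k R_k (eq. 1.10)
            A = R @ Q                # A_{k+1} = R_k Q_k is similar to A_k
        return np.sort(np.diag(A))   # diagonal converges to the eigenvalues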
1.2.3 The Lanczos Algorithm

Extremal eigenvectors (i.e. Φ1 and Φn) tend to emerge long before the tridiagonalisation is complete. This means the algorithm does not have to be executed over the full range of eigenvalues.
The algorithm can be summarised as a series of operations constructing a tridiagonal matrix Tj. The main diagonal is defined by αi = viᵀAvi, where i runs from 1 to N (the rank of the matrix) and v1 is a randomly generated unit vector. The diagonals above and below the main one are defined by βi+1 = vi+1ᵀAvi, with β1 = 0. A third relation links the α, β and v in the iteration: βi+1 vi+1 = Avi − αi vi − βi vi−1. The matrices T are used to construct the answer of the eigenproblem. A more elaborate explanation of the Lanczos algorithm can be found in [3, 9].
     | α1   β2             0  |
     | β2   α2   ·.           |
Tj = |      ·.   ·.    βj     |   (1.11)
     | 0         βj    αj     |
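A minimal Python sketch of this recurrence follows. It omits the re-orthogonalisation a robust implementation needs, and the function name and stopping tolerance are illustrative; note that A is touched only through matrix-vector products.

    import numpy as np

    def lanczos(A, k, seed=0):
        # Build the k x k tridiagonal matrix T_k for a symmetric A (sketch,
        # no re-orthogonalisation). A is only used in matrix-vector products.
        rng = np.random.default_rng(seed)
        n = A.shape[0]
        v = rng.standard_normal(n)
        v /= np.linalg.norm(v)            # v1: random unit vector
        v_prev = np.zeros(n)
        alphas, betas = [], [0.0]         # beta_1 = 0
        for _ in range(k):
            w = A @ v
            alpha = v @ w                 # alpha_i = v_i^T A v_i
            w -= alpha * v + betas[-1] * v_prev
            beta = np.linalg.norm(w)      # beta_{i+1}
            alphas.append(alpha)
            if beta < 1e-12:              # invariant subspace found; stop early
                break
            betas.append(beta)
            v_prev, v = v, w / beta
        m = len(alphas)
        T = np.diag(alphas) + np.diag(betas[1:m], 1) + np.diag(betas[1:m], -1)
        return T  # eigenvalues of T approximate the extremal eigenvalues of A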
1.2.4 Performance
The Jacobi and QR methods require access to the whole matrix A, while Lanczos can work on a small part of it at a time. The only weak link in the Lanczos algorithm is the matrix-vector multiplication subroutine.
Table 1.1 shows a series of comparisons between Jacobi, QR and Lanczos. The calculations were performed on the same matrix A and on the same computer (an AMD Phenom II X4 955). The time for each calculation is a mean value over several executions.
Table 1.1: Comparison between the Jacobi, QR and Lanczos algorithms. The calculations involve a time series vector of N points, which requires a matrix R of N × N. The Jacobi and QR implementations are taken from Numerical Recipes. In this example, the Lanczos algorithm is executed using only two cores of the CPU.
N Jacobi QR Modified Lanczos
500 0m12.52s 0m1.00s 0m0.09s
1000 2m58.04s 0m26.25s 0m0.31s
2000 32m37.68s 4m24.23s 0m1.51s
4000 n/a 48m0.78s 0m4.56s
8000 n/a n/a 0m12.20s
16000 n/a n/a 0m34.58s
32000 n/a n/a 2m55.42s
64000 n/a n/a 9m47.65s
The author has also written a GPU version of the Lanczos algorithm. Its performance is given in table 1.2. This version of the algorithm is used to explore the analysis capability of the KLT applied to the search for pulsars (see chapter ??). Note that a matrix of rank 100,000 would require about 10 Gb to be stored in memory; the Lanczos algorithm requires only about 3.2 Gb to hold all its buffers (i.e. matrices and vectors).
Chapter 2

The Karhunen-Loève Transform
The functions Φk are also called eigenvectors; they define a new orthogonal axis system (figure 1.1). They are found by solving the eigenproblem associated with X. If X is a time-dependent series, the problem to solve involves the autocorrelation matrix of X (see section 3.2); if X is spatially dependent (e.g. an image), the problem involves the covariance matrix.
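As a concrete illustration, a minimal Python sketch of the KLT of a time series follows; the function name, the estimator used for the autocorrelation matrix and the fixed order are illustrative assumptions, not the author's benchmarked code.

    import numpy as np

    def klt(x, order):
        # Eigenvectors/eigenvalues of the autocorrelation (Toeplitz) matrix
        # of the series x (see section 3.2). Illustrative sketch.
        n = len(x)
        r = np.correlate(x, x, mode='full')[n - 1:n - 1 + order] / n
        idx = np.arange(order)
        R = r[np.abs(idx[:, None] - idx[None, :])]   # Toeplitz matrix
        lam, Phi = np.linalg.eigh(R)                 # ascending eigenvalues
        desc = np.argsort(lam)[::-1]                 # sort by decreasing lambda
        return lam[desc], Phi[:, desc]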
[Figure: amplitude versus N.]
The author has also found that relation 4.1 can be useful for determining the minimal number of φk to include (see chapter 4). If the criterion is below 1, the KLT method will not find any information in the data.
Chapter 3

Digital Signal Processing
This chapter presents a series of results using the KLT as the main tool in the analysis of digital signals. The field of Digital Signal Processing (DSP) is dedicated to the analysis of signals, including any time-varying measurements (i.e. time series) such as sounds, images, voltages, electrocardiograms, temperatures, etc. Its main tool is the Fourier Transform. This transform has limitations that the KLT does not, and examples will be shown throughout this chapter.
A signal X(t) can be expanded as a Fourier series,

X(t) = a0 + Σn (an cos(nt) + bn sin(nt))   (3.1)

where the coefficients (the an and bn are the projections of X(t) onto the orthonormal functions cos and sin) are given by

a0 = (1/2π) ∫ X(t) dt          (3.2a)

an = (1/π) ∫ X(t) cos(nt) dt   (3.2b)

bn = (1/π) ∫ X(t) sin(nt) dt   (3.2c)

with all integrals taken from −∞ to ∞.
Hamming:  w(n) = 0.54 − 0.46 cos(2πn/(N−1))
Hann:     w(n) = 0.5 (1 − cos(2πn/(N−1)))
Effects related to the sampling of the data may introduce artefacts in the resulting discrete signal. One method to reduce these is to apply a windowing technique (i.e. a convolution with another signal, typically based on a cosine). Typical window functions are Hanning, Hamming and Blackman. They smooth the edges of the signal to reduce the artefacts.
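As a short illustration, the Hamming window from the table above can be applied before the FFT; this sketch assumes numpy and a real-valued signal x.

    import numpy as np

    def hamming_spectrum(x):
        # Apply a Hamming window (see table above) before taking the FFT.
        N = len(x)
        n = np.arange(N)
        w = 0.54 - 0.46 * np.cos(2 * np.pi * n / (N - 1))
        return np.abs(np.fft.rfft(x * w))   # magnitude spectrum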
Through Fourier analysis, parts of the signal's frequency content can be removed. This process is known as filtering; it can be used to remove low or high frequencies.
Figure 3.1 illustrates the difference between applying and not applying a windowing function before passing the data through the FFT. The windowing removes the small perturbations. More on the use of windowing functions and their applications can be found in [24].
[Figure 3.1: spectra of the same signal with no window and with a Hamming window.]
The KLT can be used to remove noise from a signal. The idea is to select only the eigenvectors that contain the information, by inspecting the values of λ. The eigenproblem associated with this type of analysis is solved using the autocorrelation matrix of the signal, defined by equation 3.3.
R(t1, t2) = E[x(t1)x(t2)] = ∫∫ x1 x2 f(x1, x2; t1, t2) dx1 dx2   (3.3)

with both integrals taken from −∞ to ∞.
Equation 3.4 shows how to obtain an estimated signal from the first N eigenvectors.

        N
X̃[n] =  Σ  Zi Φi[n]   (3.4)
       i=1
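A minimal Python sketch of this reconstruction follows; it builds the autocorrelation matrix at full order for simplicity (a real analysis would use a much smaller order), and the function name and estimator are illustrative.

    import numpy as np

    def klt_reconstruct(x, keep):
        # Estimate the signal from its 'keep' leading eigenvectors (eq. 3.4).
        n = len(x)
        r = np.correlate(x, x, mode='full')[n - 1:] / n     # autocorrelation
        idx = np.arange(n)
        R = r[np.abs(idx[:, None] - idx[None, :])]          # eq. 3.3, sampled
        lam, Phi = np.linalg.eigh(R)
        lead = np.argsort(lam)[::-1][:keep]                 # largest lambda_i
        Phi_k = Phi[:, lead]
        Z = Phi_k.T @ x                # Z_i = sum_k x[k] Phi_i[k]
        return Phi_k @ Z               # eq. 3.4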
Figure 3.2: Comparison between FFT and KLT for a sine without noise. The spectrum of the first eigenvector is shown under the label KLT.
Figure 3.3: Comparison between FFT and KLT for a sine with a noise level of SNR=-5dB.
Figure 3.4: Comparison between FFT and KLT for a sine with a noise level of SNR=-15dB.
When the noise is too strong for the FFT, its spectrum shows multiple peaks at different frequencies. This is not the case for the KLT: even when the noise is strong, the KLT will show a spectrum with a single peak. The difficulty is then to make sure that the answer is valid, which can be done by looking at the signal parameters (i.e. duration and sampling frequency). The duration of the signal has an impact on the outcome of the KLT: it performs better for long durations than for short ones. Duration has no such effect on the outcome of the FFT.
Section 3.6 discusses in more detail the characteristics of the KLT analysis and the signal parameters.
[Figure 3.5: amplitude versus frequency (Hz).]
Figure 3.6: Spectrum of the signal from fig. 3.5. There are too many peaks to be able to isolate the right frequencies.
Figure 3.7: The eigenvalues of the signal shown in fig. 3.6. The first four λ are far larger than the rest, which indicates the signal can be represented by the first 4 Φk.
[Figure: amplitude versus frequency (Hz).]
Table 3.2: Values of the sampling frequency (Fs) required to reach a 100% detection success rate with the KLT, for different signal durations (T) and noise levels. The values in parentheses are the FFT success rates under the same conditions.
SNR T=1 T=2 T=3 T=4 T=5
-1.0 250.0 250.0 250.0 250.0 250.0
(1.00) (1.00) (1.00) (1.00) (1.00)
-2.0 250.0 250.0 250.0 250.0 250.0
(1.00) (1.00) (1.00) (1.00) (1.00)
-3.0 250.0 250.0 250.0 250.0 250.0
(1.00) (1.00) (1.00) (1.00) (1.00)
-4.0 250.0 250.0 250.0 250.0 250.0
(1.00) (1.00) (1.00) (1.00) (1.00)
-5.0 250.0 250.0 250.0 250.0 250.0
(1.00) (1.00) (1.00) (1.00) (1.00)
-7.0 250.0 250.0 250.0 250.0 250.0
(0.98) (1.00) (1.00) (1.00) (1.00)
-9.0 300.0 250.0 250.0 250.0 250.0
(0.97) (1.00) (1.00) (1.00) (1.00)
-10.0 300.0 250.0 250.0 250.0 250.0
(0.96) (0.99) (1.00) (1.00) (1.00)
-11.0 300.0 250.0 250.0 250.0 250.0
(0.84) (0.94) (1.00) (1.00) (1.00)
-13.0 550.0 250.0 250.0 250.0 250.0
(0.71) (0.73) (0.94) (0.92) (1.00)
-15.0 950.0 600.0 300.0 250.0 250.0
(0.61) (0.97) (0.85) (0.72) (0.87)
-17.0 1150.0 500.0 450.0 300.0 250.0
(0.44) (0.42) (0.72) (0.64) (0.60)
-19.0 1950.0 1000.0 700.0 600.0 450.0
(0.42) (0.50) (0.38) (0.84) (0.52)
-21.0 3250.0 1950.0 1400.0 900.0 600.0
(0.56) (0.65) (0.66) (0.60) (0.60)
-23.0 5600.0 2600.0 2100.0 1400.0 1150.0
(0.33) (0.38) (0.60) (0.45) (0.50)
Figure 3.9 shows results for the detection of a pulse (Fc = 10 Hz) at three noise levels. Each panel shows the rate of success as a function of the duration of the segment and the sampling frequency; the regions of interest are the black ones. Each scenario (i.e. a pair (T, Fs)) was run 100 times and the average is plotted.
Figure 3.9: Relation between the duration (T) of the signal and the sampling frequency (Fs), for SNR = -10 dB, -15 dB and -20 dB (one panel each). The black regions indicate a 100% rate of success in detection of the pulse.
where Ei is the energy in the i-th frequency bin and A(0) is the total energy of the signal. The factor 2 appears in the KLT measure because the second highest eigenvalue is ideally the same as the first and carries an equal amount of energy.
Figure 3.10: Probability of detection at a selected confidence level, for the KLT and the FFT, as a function of the noise level (dB). For each noise level several repetitions of the measure are plotted (this is a stochastic process).
Chapter 4
Information Content
4.1 Detecting Structure in Signal

Ck = (λk − µ) / µ   (4.1a)

     100
µ =   Σ  λi         (4.1b)
     i=1
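A literal Python transcription of relation 4.1 is given below; note that for the reported values of C1 to exceed 1, µ must behave like an average of the eigenvalues, so a 1/100 normalisation of the sum in equation 4.1b is assumed here.

    import numpy as np

    def criterion(lam, k=0):
        # Relation 4.1, with lam sorted in decreasing order. The mean of the
        # first 100 eigenvalues is assumed for mu (see note above).
        mu = np.mean(lam[:100])
        return (lam[k] - mu) / mu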
[Figure 4.1: amplitude of the eigenvalues versus N, on a logarithmic scale.]
Figure 4.2: Values of C1 for several data sets (FM, MUSIC, NOAA, SINE and NOISE samples), related to figure 4.1.
Figure 4.3 shows the values of Ck for a single pulse, with and without noise. Notice that the value decreases as the noise level increases.
Figure 4.3: Values of C1 for a single sine, without noise and for noise levels from -5 dB down to -25 dB.
4.2 SETI Quest

[Figure 4.4: criterion values for 35 recordings from the SETI Quest database (exoplanet, pulsar, satellite, GPS, Moon, Sun and galactic-centre targets), each labelled with its observing frequency in Hz.]
Figure 4.4: Results of the KLT analysis of a small portion of the SETI Quest
database. The two interesting candidates (above the blue line) are artificial
satellites (i.e. Deep Impact and AMC7).
1 http://http://setiquest.org/
Chapter 5

The Null Hypothesis Test

5.1 Introduction
Null hypothesis testing is part of inferential statistics. The rejection or acceptance of a hypothesis is a central task in the modern practice of science, and it gives a precise sense in which a claim can be proven false.
It is not the purpose of this document to explain the mathematical foundations of the method. The reader can instead consult the extended bibliography on the topic, in particular the following papers.
Many of the basic ideas and techniques of estimation originated in the fundamental papers of [4, 5]. The formulation of the modern hypothesis testing problem is due to [20]; see [27] for a general discussion. The first use of the KL expansion in statistical problems was by [10], who also extended the basic concepts of estimation and testing to general stochastic processes.
[15] uses similar techniques in a very broad treatment of the detection of radar signals buried in noise and of accurate parameter estimation. [23] discusses the KL expansion, its application to the null hypothesis and the use of the likelihood ratio in the process. [25, 19, 1] are also good readings on the topic, as they discuss other aspects of the method.
L = P[x = s + n] / P[x = n]   (5.2)
This ratio is compared with two thresholds, A = (1 − β)/α and B = β/(1 − α). If α, β < 1/2, then A > B. The test terminates with the decision that s is present when and if L exceeds A; the decision is that s is absent when L drops below B. Here α is defined as the conditional probability of deciding that s is present when s is in fact absent, and β as the conditional probability of deciding that s is absent when s is in fact present.
Following [23], a discrete system can be defined by equation 5.3, where s and x are expressed in terms of the eigenvectors and eigenvalues.

Si[k] = s[k]Φi[k] / √λi   (5.3a)

Xi[k] = x[k]Φi[k] / √λi   (5.3b)
where Φi is the i-th eigenvector and λi the i-th eigenvalue. The likelihood ratio can then be rewritten as equation 5.4, and the only remaining criterion is that L > 1 for a detection.
L = ( Πi P[xi = si + ni] ) / ( Πi P[xi = ni] ) = Σi ( Xi Si − Si²/2 )   (5.4)
where Xi is the noisy signal and Si is the candidate signal defined by equa-
tions 5.5 and 5.6. The summation is performed over the number of eigenvalues.
Si = (1/√λi) Σk S[k] Φi[k]   (5.5)

Xi = (1/√λi) Σk X[k] Φi[k]   (5.6)
The summation is done over the whole length of the signal. The likelihood is then computed for each candidate signal, and the largest value indicates the most likely signal s.
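A minimal Python sketch of equations 5.4 to 5.6 is shown below; the eigenvectors Phi (as columns) and the eigenvalues lam are assumed to come from the autocorrelation eigenproblem of chapter 3, and the function name is illustrative.

    import numpy as np

    def likelihood_ratio(x, s, Phi, lam):
        # Eqs. 5.5 and 5.6: project the candidate s and the observation x
        # onto each eigenvector, scaled by 1/sqrt(lambda_i).
        S = (Phi.T @ s) / np.sqrt(lam)
        X = (Phi.T @ x) / np.sqrt(lam)
        # Eq. 5.4: a detection is declared when L exceeds the threshold.
        return np.sum(X * S - S**2 / 2.0)

    # Hypothetical usage: scan candidate frequencies and keep the best one.
    # best_fc = max(freqs, key=lambda fc: likelihood_ratio(x, sine(fc), Phi, lam))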
Figure 5.1: Results of the Null Hypothesis Test method for a single pulse signal. The bars show the value of L for each candidate Fc (0 to 20 Hz).
Figure 5.2 shows the results when no data s are present in the signal x. There are no positive values of L.
Figure 5.2: Results of the Hypothesis Test method for a signal without data, only noise.
Figure 5.3 shows the results for a signal containing two frequencies, with each noise level plotted in a different colour. The candidate signal is a single sine. The only frequencies that present strong positive values are the ones in the signal (i.e. 7 and 11 Hz). The smaller peaks at frequencies between 8 and 12 Hz could be filtered out by post-processing (depending on the application of this signal). Since this is a stochastic process, the data shown in the figure are average values over 100 iterations per case.
Figure 5.3: Results of the Hypothesis Test method for a signal with two frequencies, for SNR = -5, -7, -9 and -11 dB.
[Figure 5.4: spectrum of the test signal, amplitude versus frequency (Hz).]
Figure 5.6 shows the results for a noisy signal with SNR=-10dB. The highest positive value points to 11 Hz as the valid frequency. There are more false alarms than with a simple sine, but their values are smaller and can be filtered out.
Figure 5.5: Results of the Hypothesis Test method for the signal shown in figure 5.4, for SNR = -5, -7, -9 and -11 dB.
Figure 5.6: Close-up of figure 5.5 illustrating the results for SNR=-9dB.
Chapter 6
Reconstruction and
compression
[Figure 6.1: amplitude of the eigenvalues (indices 0 to 4).]
Figure 6.2 shows an example of data compression. The whole signal can be represented by only 3 eigenvalues. The original data can be approximated by summing the first three eigenvectors into a single vector. Each eigenvector can
Figure 6.2: Illustration of the data compression of sin(x). The whole curve can be approximated using only 2 eigenvalues.
Figure 6.3 illustrates a more complex scenario, where the original data set is a Brownian motion. The data being more complex than a single sine curve, more eigenvectors are required to approximate it. Compression of even more complex signals can be performed, as the next section will illustrate with the KLT applied to infrared spectroscopy.
[Figure 6.3: a Brownian motion reconstructed with 3, 6 and 30 eigenvalues (three panels).]
Figure 6.4: KLT applied to a Brownian motion data set of 1 million points.
6.2 Eigenimage
The compression aspect of the eigenproblem can also be applied to images. The SVD (eq. 6.1) is used in this case, and the matrix A is built from the covariance matrix of the image. The compression is related to the number of bits per pixel and not to the dimensions of the image. If the dimensions of the image are N and M, then the total storage is N × M words, while storing k eigenimages requires only k(M + N + 1) words. An eigenimage Sk is described by equation 6.2.
A = U W Vᵀ   (6.1)
where U is a matrix containing the equivalent of A in the eigenspace, W is a
matrix containing the eigenvalues on the diagonal and V is a matrix containing
the eigenvectors.
      k
Sk =  Σ  λi ui φi   (6.2)
     i=1
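For illustration, the following Python sketch compresses a grayscale image by keeping the k leading SVD components; applying the SVD directly to the image matrix is assumed here for simplicity, and the function name is illustrative.

    import numpy as np

    def eigenimage_compress(img, k):
        # A = U W V^T (eq. 6.1); keep the k largest singular triplets.
        U, w, Vt = np.linalg.svd(img.astype(float), full_matrices=False)
        # Storage drops from M*N words to k(M + N + 1) words.
        return (U[:, :k] * w[:k]) @ Vt[:k, :]   # rank-k approximation (eq. 6.2)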
The method cannot be used to retrieve information that is not in the signal. It is a method to fill gaps based on the available information: it inserts data points that follow the general trend. The method can be used to fill gaps in the data of targets that are not observed continuously (e.g. exoplanets).
[Figure: a light curve and its KLT gap-filling reconstruction.]

Bibliography
[1] I.V. Basawa and B.L.S. Prakasa Rao. Asymptotic Inference for Stochastic
Processes. Stochastic Processes and their Applications, 10:221–254, 1980.
[2] K. Bryan and T. Leise. The $25,000,000,000 eigenvector: the linear algebra behind Google. SIAM Review, 48(3):569–581, 2006.
[3] J.K. Cullum and R.A. Willoughby. Lanczos algorithms for large symmetric
eigenvalue computations. SIAM, 2002.
[4] R.A. Fisher. On the mathematical foundations of theoretical statistics.
Philos. Trans. Roy. Soc., 222:309–368, 1922.
[5] R.A. Fisher. Theory of statistical estimation. Proc. Cambridge Philos.
Soc., 22:700–725, 1925.
[6] J.G.F. Francis. The QR Transformation, I. The Computer Journal,
4(3):265–271, 1961.
[7] J.G.F. Francis. The QR Transformation, II. The Computer Journal,
4(4):332–345, 1962.
[8] K. Fukunaga. Representation of random processes using the finite Karhunen-Loève expansion. Information and Control, 16:85–101, 1970.
[9] G.H. Golub and C.F. van Loan. Matrix Computations. Johns Hopkins University Press, Baltimore, 1983.
[10] U. Grenander. Stochastic processes and statistical inference. Arkiv for
Matematik, (17):195–277, 1950.
[11] H. Hotelling. Analysis of a complex of statistical variables into principal
components. Journal of Educational Psychology, 24(6):417–441, 1933.
[12] I.T. Jolliffe. Principal Component Analysis. Springer, 2004.
[13] C.G.J. Jacobi. Über ein leichtes Verfahren, die in der Theorie der Säkularstörungen vorkommenden Gleichungen numerisch aufzulösen. Crelle's Journal, 30:51–94, 1846.