Lecture 1: Statistical Signal Processing
x ∈ R^n or x ∈ C^n.
This is why we will sometimes write x in the textbook’s notation, which is the notation often used
for a vector. This course will focus on four fundamental inference problems in statistical
signal processing.²
• Detection: Suppose our signal x is a realization of a random variable X drawn from one of
two possible probability distributions, f_1(x) or f_2(x). When we measure x we need to decide
which model explains it better, f_1 or f_2. This is called binary hypothesis testing. More
generally, we may have M different models to choose from. We call this M-ary hypothesis testing.
Example: Solar flares. Astrophysicists would like to develop better algorithms to detect
when a solar flare erupts on the Sun. We may ask whether or not a solar flare is occurring
based on image frames from a video of the Sun.
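The binary decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two candidate densities (unit-variance Gaussians with means 0 and 1), the i.i.d. sample model, and the names `gaussian_pdf` and `decide` are hypothetical and do not come from the notes. The rule shown simply picks the model with the larger likelihood.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def decide(samples, f1, f2):
    """Return 1 or 2: the model with the larger likelihood (ties go to 1).

    Assumes the samples are independent draws from the same density.
    """
    log_ratio = sum(math.log(f2(x)) - math.log(f1(x)) for x in samples)
    return 2 if log_ratio > 0.0 else 1

# Hypothetical candidate models (assumptions for illustration only).
f1 = lambda x: gaussian_pdf(x, 0.0, 1.0)
f2 = lambda x: gaussian_pdf(x, 1.0, 1.0)

print(decide([0.1, -0.2, 0.3], f1, f2))  # samples near 0
print(decide([0.9, 1.2, 0.8], f1, f2))   # samples near 1
```

Working with the log of the ratio avoids numerical underflow when many samples are multiplied together.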
∗ Last updated January 6, 2016.
¹ http://en.wikipedia.org/wiki/Statistical_inference
² This and following material is heavily borrowed from notes by Professor Robert Nowak for his
statistical signal processing course at the University of Wisconsin.
• Parameter Estimation: Suppose now that our signal x is a realization of a random variable
X drawn from one of an infinite collection of probability distributions, and that the
collection is indexed by a parameter vector θ. For example, we may model life expectancy as
a Gaussian (normally distributed) variable with unknown mean µ and variance σ^2, and so
we let
θ = [µ, σ^2]^T.
There are infinitely many values θ could take, and in estimation we choose among these
infinitely many values for θ.
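As a concrete illustration of the Gaussian model above, one simple way to estimate θ is to use the sample mean and the sample variance. The sketch below assumes independent observations and uses the 1/n variance normalization; the function name `estimate_gaussian` and the example numbers are hypothetical.

```python
def estimate_gaussian(samples):
    """Estimate (mu, sigma^2) of a Gaussian from independent samples.

    Uses the sample mean and the 1/n sample variance (an assumption of
    this sketch; a 1/(n-1) normalization is another common choice).
    """
    n = len(samples)
    mu_hat = sum(samples) / n
    var_hat = sum((x - mu_hat) ** 2 for x in samples) / n
    return mu_hat, var_hat

# Made-up observations, for illustration only.
mu_hat, var_hat = estimate_gaussian([70.0, 80.0, 90.0])
print(mu_hat, var_hat)
```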
Example: sinusoidal parameter estimation for radar. Suppose we work at DTW
air traffic control, and we send a sinusoidal signal (an electromagnetic wave) to detect whether
there are airplanes in the vicinity. That wave will bounce off an airplane and come back to our
antenna; we then receive a signal s(n) = A cos(2πfn + φ). We can estimate the distance to
the plane using the speed of our wave, but to do so we must know precisely when that wave
arrives back. The phase φ of the received signal is crucial for determining exactly
when the front edge arrived at the receiver. The amplitude A is also unknown because of
attenuation, and the frequency f is unknown because of the Doppler effect; the latter can also
be used to estimate the speed of the plane. We would like to estimate all of these
unknowns from the received signal.
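One rough way to estimate A, f, and φ from samples of s(n) is to correlate the signal with complex exponentials over a grid of candidate frequencies and keep the strongest response. The sketch below is an assumption-laden illustration, not necessarily the method developed in this course: it ignores noise, relies on a hypothetical frequency grid, and is only accurate when f is well away from 0 and 0.5 cycles per sample and the record is reasonably long.

```python
import cmath
import math

def estimate_sinusoid(s, freq_grid):
    """Estimate (A, f, phi) for s(n) = A cos(2*pi*f*n + phi).

    For each candidate frequency, compute the scaled correlation
    c = (2/N) * sum_n s(n) * exp(-j*2*pi*f*n). At the true frequency,
    c is approximately A * exp(j*phi).
    """
    n_samples = len(s)
    best = (0.0, 0.0, 0.0)  # (amplitude, frequency, phase)
    for f in freq_grid:
        c = sum(s[n] * cmath.exp(-2j * math.pi * f * n) for n in range(n_samples))
        c *= 2.0 / n_samples
        if abs(c) > best[0]:
            best = (abs(c), f, cmath.phase(c))
    return best

# Synthetic, noise-free test signal (values chosen for illustration).
signal = [2.0 * math.cos(2.0 * math.pi * 0.1 * n + 0.5) for n in range(100)]
grid = [k / 1000.0 for k in range(1, 500)]
amplitude, frequency, phase = estimate_sinusoid(signal, grid)
print(amplitude, frequency, phase)
```

The grid spacing limits the frequency resolution, so a finer grid or a local refinement step would be needed in practice.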
Example: Estimating wildlife populations. Suppose now we work with the Michigan
DEQ (Department of Environmental Quality) and we’d like to understand the health of the
fish populations in the Great Lakes. To do that we would capture some fish, tag them, and
release them into the lakes. Then over time, we would capture more fish, see how many were
already tagged, and tag others. We’ll continue this sampling over time, and from these data
we’d like to estimate the total fish population.
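The simplest version of this tagging experiment uses only two capture rounds, and a classical estimate for that case is the Lincoln-Petersen estimator: if the tagged fraction in the second catch matches the tagged fraction in the whole population, then the population size is roughly (number tagged) × (number caught) / (number recaptured). The sketch below illustrates only that two-round case, not the repeated sampling described above; the function name and the counts are hypothetical.

```python
def lincoln_petersen(num_tagged, num_caught, num_recaptured):
    """Two-sample capture-recapture estimate of the population size.

    num_tagged:     fish tagged and released in the first round
    num_caught:     fish caught in the second round
    num_recaptured: fish in the second round that already carried a tag
    """
    if num_recaptured <= 0:
        raise ValueError("need at least one recaptured fish to form an estimate")
    return num_tagged * num_caught / num_recaptured

# Made-up counts, for illustration only.
print(lincoln_petersen(100, 50, 5))
```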
Example: human health. Often we try to predict a person’s short-term or long-term health
using proxies like weight, blood pressure, cholesterol levels, hormone levels, etc. Those
are all things we can measure, but health is the signal we really care about, and often we
only detect the “unhealthy” state once the person has gotten sick.
scene. These algorithms train a model using hundreds of millions of labeled images, where
our signal x is the image pixels and y is a vector of labels of what objects are in the scene.
These models have proven to be very powerful, and yet we still don’t deeply understand why
they work.
• Filtering: Here we consider the case where the number n of measurements is increasing
over time. This situation arises whenever “streaming data” are being collected and need to
be processed in real-time or at least in an online fashion. We may model these data as a
random process, and filtering is the act of estimating parameters in an online fashion from a
realization of the random process as it reveals itself.
Example: target tracking. Here the goal is to track a moving target, or markings seen from a
moving platform. For example, ecologists who are interested in animals’ movements may be able to
collect measurements like bird song or chipmunk chirps, and from those they can track the animal.
Driverless cars need to track the dotted lines on either side of them as well as the vehicle in
front in order to stay in the lane at a safe distance.
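To make the idea of online estimation concrete, the sketch below updates an estimate of a mean each time a new measurement arrives, without storing past data. It is a deliberately minimal illustration; the class name `RunningMean` is hypothetical, and real tracking problems typically call for richer models than a constant mean.

```python
class RunningMean:
    """Online estimate of the mean of a data stream."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, x):
        """Fold in one new measurement and return the updated estimate."""
        self.count += 1
        # Move the old estimate toward x by a step of size 1/count.
        self.mean += (x - self.mean) / self.count
        return self.mean

# Measurements arriving one at a time (made-up values).
tracker = RunningMean()
for measurement in [1.0, 2.0, 3.0, 4.0]:
    print(tracker.update(measurement))
```

Each update costs constant time and memory, which is what makes the approach suitable for streaming data.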
1 Sufficient Statistics
Definition 1. A statistic is a function of observed data, and may be scalar or vector valued.
• Likelihood ratios (which we will learn about later) are statistics; they are relevant to the
detection problem and are sufficient statistics in that case.
X ∼ f_θ(x).
The statistic T = τ(X) is a sufficient statistic for θ if the conditional distribution of X given T
does not depend on θ.
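A small numerical check can make this definition tangible. The sketch below assumes a model that is not in the notes: X is a vector of n independent Bernoulli(θ) bits and T is their sum. It computes the conditional probability of a particular sequence given its sum by brute-force enumeration, and the result is the same for every θ, as the definition of sufficiency requires. The function name is hypothetical.

```python
from itertools import product

def conditional_given_sum(x, theta):
    """P(X = x | T = sum(x)) for i.i.d. Bernoulli(theta) bits."""
    n, t = len(x), sum(x)
    # Probability of this exact sequence.
    joint = theta ** t * (1.0 - theta) ** (n - t)
    # Probability that the sum equals t, by enumerating all sequences.
    p_t = sum(
        theta ** sum(y) * (1.0 - theta) ** (n - sum(y))
        for y in product((0, 1), repeat=n)
        if sum(y) == t
    )
    return joint / p_t

x = (1, 0, 1, 0)
for theta in (0.2, 0.5, 0.7):
    print(theta, conditional_given_sum(x, theta))
```

For this sequence there are six arrangements with two ones, so the conditional probability is 1/6 regardless of θ.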