Introduction To Compressive Sampling

The document introduces compressed sensing as an efficient method for signal acquisition. Compressed sensing samples signals in a compressed, sparse form using far fewer measurements than traditional methods. This is possible because many real-world signals are sparse or compressible when represented in a suitable basis. Compressed sensing involves taking coded measurements of the sparse signal, then using computational techniques to reconstruct the original signal from these incomplete measurements by exploiting the signal's sparsity. The method can significantly reduce measurement time and resource usage compared to traditional uniform sampling.


Introduction to compressive sampling

Sparsity and the equation Ax = y

Emanuele Grossi

DAEIMI, Università degli Studi di Cassino


e-mail: [email protected]

Gorini 2010, Pistoia



Outline
1 Introduction
Traditional data acquisition
Compressive data acquisition

2 Compressed sensing
Measurement protocol
Recovery procedure
Recovery conditions
Sensing matrices

3 Discussion
Connections with other fields
Numerical methods
Applications
Conclusions

Traditional data acquisition

Uniformly sample (or sense) data at the Nyquist rate

Compress the data (adaptive, non-linear)

Pipeline: sample (size n) → compress (size k ≪ n) → transmit/store → receive (size k) → decompress → recovered data (size n)

Sparsity/Compressibility
Many signals can be well approximated by a sparse expansion in terms of a suitable basis, i.e., by a few non-zero coefficients.

Example: Fourier transform — a signal with n = 512 time-domain samples has only k = 6 ≪ n non-zero frequency-domain coefficients.

Sparsity/Compressibility

Wavelet transform

Illustration: a 1.5 MB image and its wavelet-domain representation.

Sparsity/Compressibility

Wavelet transform

Plot of the sorted wavelet coefficients against their index, with a threshold line: of n = 6.016 · 10^6 coefficients, only k = 7% of n lie above the threshold — the sorted coefficients decay quickly.

Sparsity/Compressibility

Wavelet transform

Illustration: the original image next to the result of compressing and decompressing with only the retained wavelet coefficients.

Traditional data acquisition


Pipeline: sample (size n) → compress (size k ≪ n) → transmit/store → receive (size k) → decompress → recovered data (size n)

Pro: simple data recovery

Cons: inefficient
n can be very large even if k is small
n transform coefficients must be computed, but only the largest k are stored
the locations of the k largest coefficients must also be encoded (= overhead)
in some applications measurements can be costly, lengthy, or otherwise difficult (e.g., radar, MRI, etc.)


Compressive data acquisition

Why spend so much effort to acquire all the data when most of it
will be discarded?

Wouldn’t it be possible to acquire the data in a compressed form, so that nothing needs to be thrown away?

Yes: compressed sensing (CS)

Compressed sensing, a.k.a. compressive sensing or compressive sampling,† is a simple and efficient signal acquisition protocol.

† E. J. Candès and T. Tao, “Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information,” IEEE Trans. Inform. Theory, 2006; D. L. Donoho, “Compressed sensing,” IEEE Trans. Inform. Theory, 2006.

Compressive data acquisition


CS samples – in a signal-independent fashion – at a low rate, and later uses computational power, exploiting sparsity, to reconstruct the signal from what appears to be an incomplete set of measurements.

Pipeline: compressed sensing (size m = O(k ln n)) → transmit/store → receive → sparsity-aware reconstruction → recovered data (size n)

reduced measurement time
reduced sampling rates
reduced ADC resource usage


CS recipe

Sparse signal representation

Coded measurements (sampling process)

Recovery algorithms (non-linear)



Sparse signal representation


Many types of real-world signals (e.g., sound, images, video) can be viewed as an n-dimensional vector of real numbers, where n is large (e.g., n = 10^6)

They may have a concise representation in a suitable basis:

s = (s_1, …, s_n)^T = Σ_{i=1}^n x_i ψ_i = Ψx

where s is the signal to be sensed, the basis vectors ψ_1, …, ψ_n (not necessarily orthogonal; e.g., spikes, sinusoids, wavelets) are collected into an n × n matrix Ψ = (ψ_1 · · · ψ_n), and the signal coefficients x_i are collected into an n-dimensional vector x.
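For concreteness, here is a minimal numpy/scipy sketch (an illustration, not part of the slides) of the synthesis model s = Ψx, with Ψ the orthonormal inverse-DCT basis and x a k-sparse coefficient vector:

```python
import numpy as np
from scipy.fft import idct

rng = np.random.default_rng(0)
n, k = 512, 6

# k-sparse coefficient vector x in the DCT basis
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)

# synthesis: s = Psi x, with Psi the (orthonormal) inverse-DCT basis
s = idct(x, norm="ortho")

print(np.count_nonzero(x), "non-zero coefficients describe all", n, "samples")
```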

Sparsity and compressibility

‖x‖_p = (Σ_{i=1}^n |x_i|^p)^{1/p}, the ℓp-norm, e.g.,

‖x‖_2 = (Σ_{i=1}^n |x_i|^2)^{1/2}, the Euclidean norm

‖x‖_1 = Σ_{i=1}^n |x_i|, which gives the Manhattan distance

‖x‖_0 = card{ i ∈ {1, …, n} : x_i ≠ 0 }, the number of non-zero entries (with a little abuse: it is not a norm)
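A quick numpy illustration of these three (pseudo-)norms (not from the slides):

```python
import numpy as np

x = np.array([0.0, 3.0, 0.0, -4.0, 0.0])

print(np.linalg.norm(x, 2))     # 5.0  (Euclidean norm)
print(np.linalg.norm(x, 1))     # 7.0  (Manhattan distance)
print(np.count_nonzero(x))      # 2    ("l0 norm": number of non-zeros)
```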

Sparsity and compressibility

‖x‖_p = (Σ_{i=1}^n |x_i|^p)^{1/p}, the ℓp-norm

Definitions:
x is sparse if ‖x‖_0 ≪ n
x is k-sparse if ‖x‖_0 ≤ k ≤ n
the best k-term approximation of x is

x_k = arg min_{w : ‖w‖_0 ≤ k} ‖x − w‖_p = x with its smallest n − k entries set to 0

x is compressible if ‖x − x_k‖_p ≤ c k^{−r} for some c > 0 and r > 1 (namely, the approximation error decays quickly in k)
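A small numpy sketch (illustrative, not from the slides) of the best k-term approximation — keep the k largest-magnitude entries, zero the rest:

```python
import numpy as np

def best_k_term(x: np.ndarray, k: int) -> np.ndarray:
    # keep the k largest-magnitude entries of x, set the other n - k to zero
    xk = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    xk[idx] = x[idx]
    return xk

x = np.array([0.1, -5.0, 0.02, 3.0, -0.4])
print(best_k_term(x, 2))   # [ 0. -5.  0.  3.  0.]
```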

Traditional compression

If x is compressible then encode x with xk

Inefficiencies of the protocol:


adaptive (i.e., x must be known to select its largest k entries)
can be non-linear

Measurement model
Take m linear measurements:

y = Φs = ΦΨx = Ax (linear model)

where Φ is the measurement matrix (m × n) and y collects the measurements (an m-dimensional vector)

Common sensing matrices:
the rows of Φ are Dirac deltas ⇒ y contains samples of s
the rows of Φ are sinusoids ⇒ y contains Fourier coefficients of s (typical in MRI)
many others
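A sketch of the measurement step (illustrative sizes; the Gaussian Φ anticipates the random sensing matrices discussed later):

```python
import numpy as np
from scipy.fft import idct

rng = np.random.default_rng(0)
n, k, m = 512, 6, 120

# signal s = Psi x, sparse (k non-zeros) in the DCT basis
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
s = idct(x, norm="ortho")

# m linear, non-adaptive measurements y = Phi s = A x
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
y = Phi @ s

print(y.shape)   # (120,) measurements instead of n = 512 samples
```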

Traditional data acquisition (again)

Sensor side: signal s (size n) → measurement process y = Ax (size m ≥ n) → compress → x_k (size k ≪ n)
Receiver side: x_k (size k) → decompress → ŝ (size n)

compression is adaptive (or signal-dependent): x must be known
Ax = y is inverted at the sensor side
sparsity is neglected: m ≥ n is needed for matrix inversion

CS intuition
If x is k-sparse, then it should have k degrees of freedom, not n
⇒ only k measurements or so are needed

Analogy with the 12-coin problem:
“Of 12 coins, one is counterfeit and weighs either more or less than the others. Find the counterfeit coin and say if it’s lighter or heavier with 3 weighings on a balance scale.”

General problem: (3^p − 3)/2 coins and p weighings

3 coins — a possible weighing plan:
1st weighing: coin 1 vs. coin 2
2nd weighing: coin 1 vs. coin 3

CS intuition

12 coins — a possible weighing plan (left pan vs. right pan):

1st weighing: 1 2 3 10 vs. 4 5 6 11
2nd weighing: 1 2 3 11 vs. 7 8 9 10
3rd weighing: 1 4 7 10 vs. 2 5 8 12

Key points
counterfeit data is sparse
weigh the coins in suitably chosen batches
each measurement picks up a little information about many coins

CS protocol

Sensor side: signal s (size n) → measurement process y = Ax (size m = O(k ln n))
Receiver side: y (size m) → reconstruct → ŝ (size n)

sparsity is exploited ⇒ m can be comparable with k
Ax = y is inverted at the receiver side through non-linear processing
measurements have to be suitably designed; remarkably, random measurement matrices work!
sensing is non-adaptive (i.e., signal-independent): x need not be known

CS protocol

Sensor side: signal s (size n) → measurement process y = Ax (size m = O(k ln n))
Receiver side: y (size m) → reconstruct → ŝ (size n)

What is needed, then, is:
a reconstruction algorithm to invert Ax = y
a sensing matrix Φ that gives a good A


The equation Ax = y
Assume w.l.o.g. rank(A) = min{m, n}

m = n, determined system ⇒ solution x = A^{−1}y

m > n, over-determined system; two cases:
y ∈ I(A) ⇒ solution x = (A^T A)^{−1} A^T y = A†y
y ∉ I(A) (e.g., noisy measurements) ⇒ no solution; the least-squares (LS) one is
arg min_x ‖Ax − y‖_2 = (A^T A)^{−1} A^T y = A†y

m < n, under-determined system ⇒ infinitely many solutions; the LS (minimum-norm) one is
arg min_{x : Ax = y} ‖x‖_2 = A^T (A A^T)^{−1} y = A†y
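All three cases are handled by the pseudo-inverse; e.g., a numpy check of the under-determined, minimum-norm case (illustrative sizes):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 8                            # under-determined: m < n
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)

# minimum l2-norm solution x = A^T (A A^T)^{-1} y = pinv(A) y
x_ls = np.linalg.pinv(A) @ y

print(np.allclose(A @ x_ls, y))        # True: it satisfies Ax = y
print(np.linalg.norm(x_ls))            # smallest l2 norm among all solutions
```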

The under-determined case

Recovery of x is possible only if prior information is available:
if x is low-energy (i.e., ‖x‖_2 is small), then LS is reasonable

Least-squares
min_x ‖x‖_2, s.t. Ax = y

unique solution for any A and y
solution in closed form

The under-determined case

Recovery of x is possible only if prior information is available:
if x is sparse (i.e., ‖x‖_0 is small)

Problem
(P0) min_x ‖x‖_0, s.t. Ax = y

solution not always unique
problem in general NP-hard

Uniqueness

Proposition
If any 2k ≤ m columns of A are linearly independent, then any
k-sparse signal x can be recovered uniquely from Ax.

Proof: if not, there would exist distinct k-sparse x_1, x_2 with Ax_1 = Ax_2. This implies A(x_1 − x_2) = 0 with x_1 − x_2 2k-sparse, which is not possible.

Observation
If the (A)_{i,j} are i.i.d. Gaussian (or drawn from another continuous distribution), then the condition is satisfied w.p.1.

Computational complexity

Plot: |x|^p for p = 2, 1, 1/3, 0.

‖x‖_p^p is convex if p ≥ 1 and non-convex otherwise. That’s why (P0) is hard!

Computational complexity

Plot: |x|^p for p = 2, 1, 1/3, 0.

Possible ways out:
look for iterative algorithms: greedy algorithms
convex relaxation: use the convex norm with the lowest p (i.e., p = 1)

ℓ1 regularization

Problem
(P1) min_x ‖x‖_1, s.t. Ax = y

(P1) is a convex optimization problem and admits a solution

It can be recast as

min_{t,x} Σ_{i=1}^n t_i, s.t. |x_i| ≤ t_i ∀ i, Ax = y

a linear program (LP) in the real case, a second-order cone program (SOCP) in the complex case
fast (polynomial-time), accurate algorithms are available
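A minimal sketch of the LP recast (illustrative sizes), solved with scipy.optimize.linprog over the stacked variable z = [x, t]:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, k, m = 128, 5, 40

# k-sparse ground truth, Gaussian A, exact measurements y = A x
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true

# min sum(t)  s.t.  x - t <= 0,  -x - t <= 0,  A x = y
I = np.eye(n)
c = np.concatenate([np.zeros(n), np.ones(n)])
A_ub = np.block([[I, -I], [-I, -I]])
b_ub = np.zeros(2 * n)
A_eq = np.hstack([A, np.zeros((m, n))])
bounds = [(None, None)] * n + [(0, None)] * n

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y, bounds=bounds)
x_hat = res.x[:n]
print(np.max(np.abs(x_hat - x_true)))   # ~0: exact recovery (typically)
```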

ℓ1 regularization

Problem
(P1) min_x ‖x‖_1, s.t. Ax = y

A heuristic way to obtain sparse solutions (2-D illustration: the ℓ1 ball growing until it touches the line Ax = y).

In the example:
the solution is always unique and sparse unless the line has ±45° slope
if A is sampled from an i.i.d. continuous distribution, this happens w.p.0

ℓ0, ℓ1, and ℓ2 together

Three panels, each showing the affine set {z : Az = y}: ℓ0 minimization gives x̂ = x, ℓ2 minimization gives x̂ ≠ x, and ℓ1 minimization gives x̂ = x.

x is k-sparse, and y = Ax

Example: k = 1, A ∈ R^{2×3}, any 2 columns of A linearly independent

ℓ0, ℓ1, and ℓ2 together

Same three panels ({z : Az = y} under ℓ0, ℓ2, and ℓ1 minimization), where x is k-sparse and y = Ax:
ℓ0 works if any 2k columns of A are linearly independent
ℓ2 never works
ℓ1 works if the condition on A is strengthened

Example
Reconstruction of a 512-long signal from 120 random measurements

Plots: the frequency-domain coefficients, k = 10 ≪ n (a superposition of 10 cosines), and the corresponding time-domain signal.

Example
Reconstruction of a 512-long signal from 120 random measurements

Plots: the true frequency-domain coefficients, the ℓ2 reconstruction, and the ℓ1 reconstruction.

Example
Reconstruction of a 256 × 256 image (= 65536 pixels) from 5481
measurements in the Fourier domain†

Images: the Shepp-Logan phantom (a toy model for MRI) and the sampling pattern in the frequency domain (22 approximately radial lines).

† E. J. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information,” IEEE Trans. Inform. Theory, 2006.

Example
Reconstruction of a 256 × 256 image (= 65536 pixels) from 5481
measurements in the Fourier domain†

Images: the original phantom, the minimum-energy reconstruction, and the minimum-TV reconstruction.

† E. J. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information,” IEEE Trans. Inform. Theory, 2006.


Sparse recovery & incoherence

Theorem†
Let a_1, …, a_n be the columns of A, normalized so that ‖a_i‖_2 = 1 ∀ i, let M = max_{i≠j} |a_i^T a_j|, and let y = Ax. If

‖x‖_0 < (1 + 1/M)/2

then x is the unique solution of (P1).

M = max_{i≠j} |a_i^T a_j| is called the mutual coherence

Easy to check, but “coarse/pessimistic”

† D. L. Donoho and M. Elad, “Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization,” Proc. Nat. Acad. Sci., 2003.
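A small numpy sketch (illustrative) computing the mutual coherence and the resulting sparsity bound:

```python
import numpy as np

def mutual_coherence(A: np.ndarray) -> float:
    # M = max_{i != j} |a_i^T a_j| over unit-norm columns a_i
    An = A / np.linalg.norm(A, axis=0)
    G = np.abs(An.T @ An)
    np.fill_diagonal(G, 0.0)
    return float(G.max())

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 128))
M = mutual_coherence(A)
print(M, "-> unique l1 recovery guaranteed for ||x||_0 <", (1 + 1 / M) / 2)
```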

Sparse recovery & RIP


For k ∈ {1, 2, …, n}, let δ_k be the smallest δ such that

(1 − δ)‖x‖_2^2 ≤ ‖Ax‖_2^2 ≤ (1 + δ)‖x‖_2^2, ∀ x : ‖x‖_0 ≤ k

A satisfies the restricted isometry property (RIP) of order k if δ_k ∈ [0, 1), i.e., any k columns are nearly orthogonal

Theorem†
Let y = Ax and let x̂ be the solution of (P1). If δ_{2k} < √2 − 1, then

‖x − x̂‖_1 ≤ C_0 ‖x − x_k‖_1
‖x − x̂‖_2 ≤ C_0 ‖x − x_k‖_1 / √k

for some constant C_0. In particular, if x is k-sparse, x̂ = x.

† E. J. Candès, “The restricted isometry property and its implications for compressed sensing,” Compte Rendus de l’Academie des Sciences, 2008.

Noisy measurements

Any measurement process introduces noise (say n):

y = Ax + n

In this case, if ‖n‖_2 ≤ ε,

Problem
(P1,ε) min_x ‖x‖_1, s.t. ‖Ax − y‖_2 ≤ ε

(P1,ε) is a convex optimization problem and can be recast as the SOCP

min_{t,x} Σ_{i=1}^n t_i, s.t. |x_i| ≤ t_i ∀ i, ‖Ax − y‖_2 ≤ ε

Noisy measurements

Problem
(P1,ε) min_x ‖x‖_1, s.t. ‖Ax − y‖_2 ≤ ε

(2-D illustration: the feasible set ‖Ax − y‖_2 ≤ ε and the ℓ1 ball.)

Approximate recovery & RIP


Theorem†
Let y = Ax + n, with ‖n‖_2 ≤ ε, and let x̂ be the solution of (P1,ε). If δ_{2k} < √2 − 1, then

‖x − x̂‖_2 ≤ C_0 ‖x − x_k‖_1 / √k + C_1 ε

for some constant C_1 (C_0 same as before).

Stable recovery
The reconstruction error is bounded by 2 terms:
one is the same as in the noiseless case
the other is proportional to the noise level
C_0 and C_1 are rather small, e.g., if δ_{2k} = 0.25, then C_0 ≤ 5.5 and C_1 ≤ 6

† E. J. Candès, “The restricted isometry property and its implications for compressed sensing,” Compte Rendus de l’Academie des Sciences, 2008.

Sparse recovery & NSP

A has the null space property (NSP) of order k if, for some γ ∈ (0, 1),

‖η_T‖_1 ≤ γ ‖η_{T^c}‖_1, ∀ η ∈ ker(A), ∀ T ⊂ {1, …, n} with card(T) ≤ k

The elements of the null space should have no structure (they should look like noise)

NSP is actually equivalent to sparse ℓ1-recovery, since:

Theorem†
Let y = Ax. If A has the NSP of order k, then x is the solution of (P1) for every k-sparse x. Conversely, if x is the solution of (P1) for every k-sparse x, then A has the NSP of order 2k.

† A. Cohen, W. Dahmen, and R. DeVore, “Compressed sensing and best k-term approximation,” J. Amer. Math. Soc., 2009.

Recovery conditions
Mutual coherence: easy to check, but “coarse/pessimistic”

RIP: maybe “almost sharp” and works in the noisy case, but
hard to compute
not invariant to invertible linear transformations G: y = Ax is equivalent to Gy = GAx, but A satisfying the RIP does not imply that GA satisfies the RIP

NSP: tight, but
hard to compute (usually the NSP is verified through the RIP)
not available in the noisy case

Others: many conditions are present in the literature (e.g., incoherence between Φ and Ψ)

How many measurements?


If ‖x‖_1 ≤ R, the reconstruction error from m linear measurements of any recovery method is lower bounded by C_2 R √((ln(n/m) + 1)/m), for some constant C_2†

If A is such that δ_{2k} ≤ √2 − 1, then

‖x − x̂‖_2 ≤ C_0 ‖x − x_k‖_1 / √k ≤ C_0 R / √k

Comparing the two bounds, C_0 R / √k ≥ C_2 R √((ln(n/m) + 1)/m), and then, for a constant C,

m ≥ C k (ln(n/m) + 1)

O(k ln n) measurements are sufficient to recover the signal with an accuracy comparable to that attainable with direct knowledge of the k largest coefficients

† B. Kashin, “Diameters of some finite-dimensional sets and classes of smooth functions,” Izv. Akad. Nauk SSSR, Ser. Mat., 1977; A. Y. Garnaev and E. D. Gluskin, “On widths of the Euclidean ball,” Sov. Math.–Dokl., 1984.


Good sensing matrices


Goal: find A which satisfies the RIP

Deterministic constructions of A have been proposed, but m is much larger than the optimal value

Try random matrices instead, and accept a (hopefully small) probability of failure. Key property:

Concentration inequality
The random m × n matrix A satisfies the concentration inequality if, ∀ x and ∀ ε ∈ (0, 1),

P( | ‖Ax‖_2^2 − ‖x‖_2^2 | ≥ ε ‖x‖_2^2 ) ≤ 2 e^{−m c(ε)}

where c(ε) > 0

Good sensing matrices

Theorem†
Let δ ∈ (0, 1). If A satisfies the concentration inequality, then there exist constants c_1, c_2 > 0 depending only on δ such that the restricted isometry constant of A satisfies δ_k ≤ δ with probability exceeding 1 − 2e^{−c_1 m}, provided that m ≥ c_2 k ln(n/k).

Observation: m ≥ c_2 k ln(n/k) ⇒ m ≥ (c_2/(1 + c_2)) k (ln(n/m) + 1)

† R. Baraniuk, M. Davenport, R. DeVore, and M. Wakin, “A simple proof of the restricted isometry property for random matrices,” Constr. Approx., 2009.

Random sensing

Chain: random matrices → concentration inequality → RIP → solution of (P1)/(P1,ε)

Random matrices allow perfect/approximate recovery of k-sparse/compressible signals with overwhelming probability using O(k ln n) measurements

Examples
Two important cases satisfy the concentration inequality:
Gaussian: (A)_{i,j} ∼ N(0, 1/m) i.i.d.
Bernoulli: (A)_{i,j} ∼ B(1/2) i.i.d. with values ±1/√m
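A sketch (illustrative) generating both matrices and empirically checking the concentration ‖Ax‖_2^2 ≈ ‖x‖_2^2:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 128, 512

# Gaussian sensing matrix: entries i.i.d. N(0, 1/m)
A_gauss = rng.standard_normal((m, n)) / np.sqrt(m)

# Bernoulli sensing matrix: entries i.i.d. +/-1/sqrt(m) with probability 1/2
A_bern = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(m)

x = rng.standard_normal(n)
for A in (A_gauss, A_bern):
    print(np.linalg.norm(A @ x) ** 2 / np.linalg.norm(x) ** 2)  # close to 1
```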

The sensing matrix


Recall that

y = Φs = ΦΨx = Ax, with Φ the sensing matrix and Ψ the sparsifying matrix (A = ΦΨ)

If Ψ is orthogonal ⇒ Φ = AΨ^T

Not actually needed: just take Φ Gaussian or Bernoulli. If Φ satisfies the concentration inequality, so does A:

P( | ‖Ax‖_2^2 − ‖x‖_2^2 | ≥ ε ‖x‖_2^2 ) = P( | ‖Φs‖_2^2 − ‖Ψ^T s‖_2^2 | ≥ ε ‖Ψ^T s‖_2^2 ) = P( | ‖Φs‖_2^2 − ‖s‖_2^2 | ≥ ε ‖s‖_2^2 )

Random sensing is “universal”: it does not matter in which basis the signal is sparse (Ψ is not needed at the sensor side)

Random partial Fourier matrices

Gaussian and Bernoulli matrices provide a minimal number of measurements, but
physical constraints on the sensor may preclude Gaussian matrices
neither is structured ⇒ no fast matrix-vector multiplication algorithm

Possible alternative: select m rows uniformly at random from an n × n Fourier matrix F,

(F)_{h,k} = (1/√n) e^{−2πi hk/n}

equivalent to observing m random entries of the DFT of the signal
the RIP holds with high probability, but m ≥ C k ln^4 n
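A sketch of partial Fourier sensing applied via the FFT, so the m × n matrix is never formed explicitly (illustrative sizes):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 512, 120

# pick m random rows of the n x n DFT matrix
rows = rng.choice(n, size=m, replace=False)

def partial_fourier(s: np.ndarray) -> np.ndarray:
    # m random entries of the (orthonormal) DFT, in O(n log n) time
    return np.fft.fft(s, norm="ortho")[rows]

s = rng.standard_normal(n)
y = partial_fourier(s)
print(y.shape)   # (120,) complex measurements
```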

More random matrices

(A)_{i,j} i.i.d. with distribution (1/6) δ_{−√(3/m)} + (2/3) δ_0 + (1/6) δ_{+√(3/m)}; in this case

m ≥ C k ln(n/k)

A formed by selecting m vectors uniformly at random from the surface of the unit ℓ2 sphere in R^m; in this case

m ≥ C k ln(n/k)

A formed by selecting m rows uniformly at random from a unitary matrix U and re-normalizing the columns to unit ℓ2-norm; in this case

m ≥ C µ^2 k ln^4 n

where µ = √n max_{i,j} |(U)_{i,j}| (for the Fourier matrix, µ = 1)

More random matrices


If U = Φ̃Ψ, with both factors unitary, then

m ≥ C µ^2 k ln^4 n, µ = √n max_{i,j} |⟨φ̃_i, ψ_j⟩|

µ is a measure of the mutual incoherence between the measurement basis and the sparsity basis
µ ∈ [1, √n], and low coherence is good: e.g., Φ̃ = Fourier and Ψ = I gives µ = 1, i.e., maximal incoherence
the basis vectors of Ψ must be “spread out” in the basis Φ̃ (e.g., in the Fourier–identity case, a δ spike ↔ a complex exponential)

Illustration: a sparse signal and its incoherent measurements.


Three equivalent problems

There is considerable interest in solving the unconstrained optimization problem (convex but non-differentiable)

min_x ‖Ax − y‖_2^2 + λ‖x‖_1   (1)

Example: Bayesian estimation
If x ∼ Laplace and n ∼ white Gaussian, the MAP estimate of x from y = Ax + n solves (1), since

max_x f(x|y) ≡ max_x f(y|x) f(x) ≡ max_x e^{−‖Ax−y‖_2^2/(2σ^2)} e^{−γ‖x‖_1}
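The slides do not fix a particular solver for (1); as an illustration, here is a minimal sketch of iterative soft-thresholding (ISTA), with step size 1/(2L) where L = ‖A‖_2^2:

```python
import numpy as np

def ista(A: np.ndarray, y: np.ndarray, lam: float, n_iter: int = 500) -> np.ndarray:
    # solves min_x ||Ax - y||_2^2 + lam * ||x||_1 by proximal gradient steps
    L = np.linalg.norm(A, 2) ** 2          # 2L bounds the gradient's Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - y) / L      # gradient step on the quadratic term
        x = np.sign(z) * np.maximum(np.abs(z) - lam / (2 * L), 0.0)  # soft threshold
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 128)) / np.sqrt(40)
x0 = np.zeros(128)
x0[[3, 17, 60]] = [1.0, -2.0, 1.5]
x_hat = ista(A, A @ x0, lam=0.01)
print(np.flatnonzero(np.abs(x_hat) > 0.1))   # recovers the support {3, 17, 60}
```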

Three equivalent problems

There is considerable interest in solving the unconstrained optimization problem (convex but non-differentiable)

min_x ‖Ax − y‖_2^2 + λ‖x‖_1   (1)

Problem (1) is closely related to

min_x ‖x‖_1, s.t. ‖Ax − y‖_2^2 ≤ ε   (2a)
min_x ‖Ax − y‖_2^2, s.t. ‖x‖_1 ≤ η   (2b)

The solution of (2a), which is just (P1,ε), is either x = 0 or a solution of (1) for some λ > 0
The solution of (2b) is also a solution of (1) for some λ depending on η

Geophysics: early references



min_x ‖Ax − y‖_2^2 + λ‖x‖_1   (1)

Claerbout and Muir wrote in 1973:†

“In deconvolving any observed seismic trace, it is rather disappointing to discover that there is a nonzero spike at every point in time regardless of the data sampling rate. One might hope to find spikes only where real geologic discontinuities take place. Perhaps the ℓ1 norm can be utilized to give a [sparse] output trace. . . ”

Santosa and Symes proved in 1986 that (1) succeeds under mild conditions in recovering spike trains from seismic traces∗

† J. F. Claerbout and F. Muir, “Robust modeling of erratic data,” Geophysics, 1973.
∗ F. Santosa and W. W. Symes, “Linear inversion of band-limited reflection seismograms,” SIAM J. Sci. Statist. Comput., 1986.

Signal processing: basis pursuit†


 
y = (a_1 · · · a_n)(x_1, …, x_n)^T = Ax

where y is the signal to be represented, the a_i are the (overcomplete) basis vectors, and x holds the (sparse) coefficients

There is a very large number of basis functions (called a dictionary), so that x is likely to be sparse
Goal: find a good fit of the signal as a linear combination of a small number of the basis functions – i.e., basis pursuit (BP)
BP finds signal representations in overcomplete dictionaries by solving

min_x ‖x‖_1, s.t. Ax = y

† S. S. Chen, D. L. Donoho, and M. A. Saunders, “Atomic decomposition by basis pursuit,” SIAM J. Sci. Comput., 1999.

Signal processing: basis pursuit†

y = Ax + n, with n the noise

BP can also deal with noisy measurements of the signal (basis pursuit denoising). In this case it solves

min_x ‖x‖_1, s.t. ‖Ax − y‖_2^2 ≤ ε

which is (P1,ε), or equivalently

min_x ‖Ax − y‖_2^2 + λ‖x‖_1

† S. S. Chen, D. L. Donoho, and M. A. Saunders, “Atomic decomposition by basis pursuit,” SIAM J. Sci. Comput., 1999.

Statistics: linear regression


A common problem in statistics is linear regression:

y = Σ_{i=1}^n a_i x_i + n = Ax + n

where y holds the measurements (response variables), the a_i are the regressors (explanatory variables), x is the parameter vector (regression coefficients), and n is the noise (error term), i.i.d. and zero-mean

In order to mitigate modeling biases, a large number of regressors can be included ⇒ m < n

Goals:
minimize the prediction error ‖y − Ax‖_2 (good data fit)
identify the significant regressors (variable selection)

Regularization

Penalized regression can be used:

min_x ‖Ax − y‖_2^2 + λ‖x‖_p

As the parameter λ varies over (0, ∞), the solution traces out the optimal trade-off curve

The most common choice is ridge regression†

min_x ‖Ax − y‖_2^2 + λ‖x‖_2^2

The solution is x̂ = (A^T A + λI)^{−1} A^T y, but it cannot produce model selection

† A. E. Hoerl and R. W. Kennard, “Ridge regression: applications to nonorthogonal problems,” Technometrics, 1970.
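A quick numpy check (illustrative) of the closed-form ridge solution and of the fact that it zeroes no coefficients:

```python
import numpy as np

def ridge(A: np.ndarray, y: np.ndarray, lam: float) -> np.ndarray:
    # closed-form ridge solution (A^T A + lam I)^{-1} A^T y
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100))     # m < n: more regressors than data
y = rng.standard_normal(50)
x_hat = ridge(A, y, lam=1.0)
print(np.count_nonzero(x_hat))         # 100: all coefficients survive, no selection
```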

Regularization
Bridge regression† is more general:

min_x ‖Ax − y‖_2^2 + λ‖x‖_p^p

If p ≤ 1 and λ is sufficiently large, it combines parameter estimation and model selection

The p = 1 case is related to the least absolute shrinkage and selection operator (lasso)∗

min_x ‖Ax − y‖_2, s.t. ‖x‖_1 ≤ η

Lasso and problem (P1,ε) are formally identical, and equivalent to

min_x ‖Ax − y‖_2^2 + λ‖x‖_1

† I. E. Frank and J. H. Friedman, “A statistical view of some chemometrics regression tools,” Technometrics, 1993.
∗ R. Tibshirani, “Regression shrinkage and selection via the lasso,” J. Roy. Statist. Soc. B, 1996.

Variants of lasso
If more prior information (other than sparsity) is available on x, it can be included in the optimization problem through proper penalty terms.

The fused lasso† preserves local constancy when the regressors are properly arranged:

min_x { ‖Ax − y‖_2^2 + λ_1 ‖x‖_1 + λ_2 Σ_{i=2}^n |x_i − x_{i−1}| }

In reconstruction/denoising problems this can be used to recover sparse and piece-wise constant signals
If the signal is smooth, the total variation Σ_{i=2}^n |x_i − x_{i−1}| can be substituted with a quadratic smoothing term Σ_{i=2}^n |x_i − x_{i−1}|^2

† R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, and K. Knight, “Sparsity and smoothness via the fused lasso,” J. Roy. Stat. Soc. B, 2005.

Variants of lasso
The group lasso† promotes group selection:

min_x { ‖Ax − y‖_2^2 + λ Σ_{i=1}^k ‖x_i‖_2 }

where x = (x_1^T · · · x_k^T)^T has been partitioned into k groups
effective for the recovery of sparse signals whose coefficients appear in groups

The elastic net∗ is a “stabilized” version of the lasso:

min_x ‖Ax − y‖_2^2 + λ_1 ‖x‖_1 + λ_2 ‖x‖_2^2

It can select more than m variables even when m < n. It sits in between ridge and lasso.

† M. Yuan and Y. Lin, “Model selection and estimation in regression with grouped variables,” J. Roy. Stat. Soc. B, 2006.
∗ H. Zou and T. Hastie, “Regularization and variable selection via the elastic net,” J. Roy. Stat. Soc. B, 2005.


Convex programs

1 min_x ‖Ax − y‖_2^2 + λ‖x‖_1 — can be recast as a perturbed LP (a quadratic program (QP) with structure similar to an LP)

2 min_x ‖Ax − y‖_2^2, s.t. ‖x‖_1 ≤ η — a QP

3 min_x ‖x‖_1, s.t. Ax = y — can be cast as an LP

4 min_x ‖x‖_1, s.t. ‖Ax − y‖_2^2 ≤ ε — can be cast as an SOCP

Algorithms
All can be solved through standard convex optimization methods, e.g., interior-point methods (primal-dual, log-barrier, etc.):
general-purpose solvers can handle small- to medium-size problems
optimized algorithms (with fast matrix-vector operations) can scale to large problems

Homotopy methods, e.g., least angle regression (LARS):
compute the entire solution path (i.e., for any λ > 0)
exploit the piece-wise linear property of the regularization path
fast if the solution is very sparse

Greedy algorithms for signal reconstruction, e.g., matching pursuit (MP) and orthogonal MP (OMP) — see the sketch below:
not based on optimization
iteratively choose the dictionary element with the highest inner product with the current residual
low complexity, but less powerful
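A minimal numpy sketch of OMP (illustrative sizes; the greedy selection and the least-squares refit are the two commented steps):

```python
import numpy as np

def omp(A: np.ndarray, y: np.ndarray, k: int) -> np.ndarray:
    # orthogonal matching pursuit: greedily select k columns of A
    x = np.zeros(A.shape[1])
    support = []
    r = y.copy()                                  # current residual
    for _ in range(k):
        # 1) pick the column most correlated with the residual
        support.append(int(np.argmax(np.abs(A.T @ r))))
        # 2) least-squares refit on the support, then update the residual
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        r = y - A[:, support] @ coef
    x[support] = coef
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 128)) / np.sqrt(40)
x0 = np.zeros(128)
x0[[5, 30, 99]] = [1.0, -1.0, 2.0]
print(np.flatnonzero(omp(A, A @ x0, k=3)))   # [ 5 30 99]
```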


Applications

Compressed sensing is advantageous whenever
signals are sparse in a known basis
measurements (or computation at the sensor end) are expensive, but computations at the receiver end are cheap

Such situations can arise in:
Compressive imaging (e.g., the “single-pixel camera”)
Medical imaging (e.g., MRI and computed tomography)
AD conversion
Computational biology (e.g., DNA microarrays)
Geophysical data analysis (e.g., seismic data recovery)
Radar
Sensor networks
Astronomy
and others

Compressive imaging: the single-pixel camera

http://dsp.rice.edu/cscamera

Compressive imaging: the single-pixel camera

Original image and CS reconstruction (65536 pixels) from 3300 measurements (5%).

http://dsp.rice.edu/cscamera

Compressive imaging: the single-pixel camera

Original image and CS reconstruction (65536 pixels) from 6600 measurements (10%).

http://dsp.rice.edu/cscamera

Medical imaging: MRI


1 MRI “scans” the patient by collecting coefficients in the frequency domain
2 These coefficients are very sparse
3 An inverse Fourier transform produces the medical image

Medical imaging: MRI

Original image: rapid acquisition of a mouse heart beating in dynamic MRI.†

† M. E. Davies and T. Blumensath, “Faster & greedier: algorithms for sparse reconstruction of large datasets,” IEEE ISCCSP, 2008.

Medical imaging: MRI

Reconstruction from 20% of the available measurements (linear and CS): rapid acquisition of a mouse heart beating in dynamic MRI.†

† M. E. Davies and T. Blumensath, “Faster & greedier: algorithms for sparse reconstruction of large datasets,” IEEE ISCCSP, 2008.

Medical imaging: MRI

original image
Angiogram with observations along 80 lines in the Fourier domain
and 16129 measurements†

† E. J. Candès and J. Romberg, “Practical signal recovery from random projections,” SPIE Conf. on Wavelet App. in Signal and Image Process., 2008.

Medical imaging: MRI

minimum energy and CS reconstructions


Angiogram with observations along 80 lines in the Fourier domain
and 16129 measurements†

† E. J. Candès and J. Romberg, “Practical signal recovery from random projections,” SPIE Conf. on Wavelet App. in Signal and Image Process., 2008.

Medical imaging: MRI

detail
Angiogram with observations along 80 lines in the Fourier domain
and 16129 measurements†

† E. J. Candès and J. Romberg, “Practical signal recovery from random projections,” SPIE Conf. on Wavelet App. in Signal and Image Process., 2008.

AD conversion: the random demodulator†

Block diagram: x(t) is multiplied by p(t), a high-rate pseudo-noise sequence, low-pass filtered, and sampled at rate R, producing y(n).

x(t) = Σ_{ℓ∈Λ} a_ℓ e^{2πi f_ℓ t}, a multi-tone signal
Λ ⊂ {0, ±1, …, ±(W/2 − 1), W/2}, W/2 ∈ N, card(Λ) = k ≪ W
sampling rate: R = O(k ln W) ⇒ no need for a high-rate ADC

† J. A. Tropp, J. N. Laska, M. F. Duarte, J. K. Romberg, and R. G. Baraniuk, “Beyond Nyquist: efficient sampling of sparse bandlimited signals,” IEEE Trans. Inform. Theory, 2010.

AD conversion: the random demodulator†

Block diagram: x(t) is multiplied by p(t), a high-rate pseudo-noise sequence, low-pass filtered, and sampled at rate R, producing y(n).

Spectra: |X(f)| on [−W/2, W/2], |X(f)P(f)| on [−W/2, W/2], and |Y(f)| on [−R/2, R/2].

Each frequency receives a unique signature that can be discerned by examining the filter output.

† J. A. Tropp, J. N. Laska, M. F. Duarte, J. K. Romberg, and R. G. Baraniuk, “Beyond Nyquist: efficient sampling of sparse bandlimited signals,” IEEE Trans. Inform. Theory, 2010.
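An idealized numpy sketch of the random demodulator measurement matrix (sign chipping followed by an accumulate-and-dump integrator; the paper's filter and normalization differ, so treat this as illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
W, R = 512, 64                       # Nyquist rate W, sampling rate R (R divides W)

# chipping sequence: pseudo-random +/-1 at the Nyquist rate
eps = rng.choice([-1.0, 1.0], size=W)

# accumulate-and-dump: each output sample sums W/R consecutive chipped samples
H = np.kron(np.eye(R), np.ones(W // R))
Phi = H * eps                        # R x W measurement matrix

# multi-tone signal, k-sparse in frequency
a = np.zeros(W, dtype=complex)
a[rng.choice(W, size=5, replace=False)] = rng.standard_normal(5)
x = np.fft.ifft(a, norm="ortho")

y = Phi @ x                          # R measurements instead of W Nyquist samples
print(y.shape)                       # (64,)
```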

AD conversion: the modulated wideband converter†


Block diagram: x(t) is fed to m parallel branches; branch i multiplies by a mixing function p_i(t), low-pass filters, and samples at rate B, producing y_i(n).

Spectrum: X(f) occupies a few bands of width B within [−W/2, W/2], with W ∼ 10’s of GHz.

A practical sampling stage for sparse wideband analog signals
It enables generating a low-rate sequence corresponding to each of the bands, without going through the high Nyquist rate

† M. Mishali and Y. C. Eldar, “From theory to practice: sub-Nyquist sampling of sparse wideband analog signals,” IEEE Trans. Signal Process., 2010.

AD conversion: the modulated wideband converter†


Block diagram as above, with a photograph of the hardware implementation.

† M. Mishali and Y. C. Eldar, “From theory to practice: sub-Nyquist sampling of sparse wideband analog signals,” IEEE Trans. Signal Process., 2010.

CDMA synchronization and channel estimation†


Diagram: the known code matrix A (pilot symbols) collects shifts of each user’s signature (users 1, …, K); the unknown channel vector x stacks the channel responses of the users; y is the received multiplex and n the noise.

Model: y = Ax + n, with x sparse

Standard method: m > n & LS
Sparse recovery allows m < n ⇒ higher data rates

† D. Angelosante, E. Grossi, G. B. Giannakis, and M. Lops, “Sparsity-aware estimation of CDMA system parameters,” EURASIP J. Adv. Signal Process., 2010.

CDMA synchronization and channel estimation†

Plots (NMSE vs. SNR for lasso, LS, and OMP): P=10, N=15, K=5, ISR=0dB (left); P=4, N=15, K=5, ISR=0dB (right).

Normalized mean square error (NMSE) in channel estimation for a known number of active users (left: over-determined case; right: under-determined case).

† D. Angelosante, E. Grossi, G. B. Giannakis, and M. Lops, “Sparsity-aware estimation of CDMA system parameters,” EURASIP J. Adv. Signal Process., 2010.

CDMA synchronization and channel estimation†

Plots (for lasso, LS, and OMP): receiver operating characteristic, PD vs. PFA, at P=10, N=15, K=10, S=5, SNR=20dB, ISR=0dB (left); probability of miss (PMD) vs. SNR at P=10, N=15, K=10, S=5, ISR=0dB, PFA=0.01 (right).

User activity detection for an unknown number of active users: receiver operating characteristics (left) and probability of miss versus SNR (right).

† D. Angelosante, E. Grossi, G. B. Giannakis, and M. Lops, “Sparsity-aware estimation of CDMA system parameters,” EURASIP J. Adv. Signal Process., 2010.


Conclusions

Compressed sensing is an efficient signal acquisition protocol that collects data in a compressed form
Linear measurements can be taken at a low rate and non-adaptively (signal-independently)
Sparsity is exploited for reconstruction
The measurement matrix must be properly chosen, but random matrices work

Some on-line resources

CS resources http://dsp.rice.edu/cs
CS blog http://nuit-blanche.blogspot.com/
Software:
SparseLab http://sparselab.stanford.edu/
ℓ1-magic http://www.acm.caltech.edu/l1magic/
GPSR http://www.lx.it.pt/~mtf/GPSR/
l1_ls http://www.stanford.edu/~boyd/l1_ls/
