
Stats 330: An Introduction to Compressed Sensing

Lecture 1 & 2: agenda

1 Underdetermined systems of linear equations


2 Convex optimization often returns the “correct” answer
3 What is compressed sensing?
Underdetermined Systems of Linear Equations
Underdetermined systems of linear equations

[figure: b = Ax with a wide matrix A]

x ∈ R^n
m ≪ n linear equations about x

    b = Ax,   b ∈ R^m

Would like to recover x


Arises in many fields of science and engineering
Example in genetics: sparse regression

Find the locations on the genome that influence a trait: e.g. cholesterol level

Model
y = Ax + z

yi : measured cholesterol level of patient i


A : genotype matrix; ith row encodes the genotype of patient i
e.g. Ai,j = number of alleles of type a at location j
zi : environmental factors (not accounted for by genetics) for patient i
Typical dimensions:
# of columns: about 500,000
# of rows: a few thousand
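A minimal simulation sketch of this measurement model (the scaled-down dimensions, the {0, 1, 2} genotype entries, the sparsity level, and the noise scale are illustrative assumptions, not the actual study data):

    import numpy as np

    rng = np.random.default_rng(0)
    n, m, s = 5000, 1000, 10    # stand-ins for ~500,000 columns, a few thousand rows, tens of nonzeros

    A = rng.integers(0, 3, size=(m, n)).astype(float)   # genotype-like entries: 0, 1 or 2 alleles at each location
    x = np.zeros(n)                                     # sparse vector of genetic effects
    x[rng.choice(n, size=s, replace=False)] = rng.normal(size=s)
    z = 0.1 * rng.normal(size=m)                        # environmental factors
    y = A @ x + z                                       # measured trait, e.g. cholesterol level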
Example in reflection seismology (sketch)
Impedance and reflectivity profiles

[figure, two panels: (a) impedance A(z); (b) reflectivity α′(z)]
Probe and seismic trace

[figure, two panels: the probing wavelet and the resulting seismic trace]


Linear inversion of bandlimited reflection seismograms
1D wave propagation

A(z)utt = (A(z)uz )z , t > 0, z ≥ 0

t is time and z is depth


u(t, z) particle displacement at time t and depth z
A(z) acoustical impedance; desired unknown
System is probed by a “wavelet” f (t)

uz (t, z = 0) = f (t)

Reflection seismogram (seismic trace): surface particle velocity

ut(t, z = 0) = h(t),   0 ≤ t ≤ T

Problem: find A(z) from given data (f (t), h(t))


Linear inversion

A(z) = 1 + α(z)
α is a small perturbation

Linear approximation

    g(t) = f(t) + h(t) = ∫₀ᵗ f(t − τ) α′(τ/2) dτ

so with r(τ) = α′(τ/2) (this is the reflectivity function)

    f ⋆ r = g.

This is a deconvolution problem


Bandpass effect

Probing wavelet f is bandpass


Consequence: seismic experiment is missing low and high frequencies
Unknown reflectivity has low and high frequency content

[figure: the probing wavelet and its Fourier transform (bandpass)]


Ill-posed or undersampled problem

Discrete-time Fourier transform:

    f̂[k] = (1/√N) Σ_{t=0}^{N−1} f[t] e^{−i2πkt/N},   0 ≤ k ≤ N − 1

Deconvolution in the Fourier domain

    ĝ[k] = f̂[k] r̂[k],   0 ≤ k ≤ N − 1

f̂[k] ≈ 0 → missing frequencies/equations
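A small numpy sketch of this point: when the probing wavelet is bandpass, the corresponding Fourier-domain equations vanish (circular convolution and the particular frequency band are illustrative assumptions):

    import numpy as np

    rng = np.random.default_rng(1)
    N = 256

    r = np.zeros(N)                                     # sparse "spike train" reflectivity
    r[rng.choice(N, size=12, replace=False)] = rng.normal(size=12)

    k = np.fft.fftfreq(N, d=1.0 / N)                    # integer frequencies -N/2 ... N/2-1
    f_hat = ((np.abs(k) > 8) & (np.abs(k) < 40)).astype(float)   # bandpass wavelet in the Fourier domain

    g_hat = f_hat * np.fft.fft(r)                       # convolution theorem: g_hat[k] = f_hat[k] r_hat[k]
    missing = np.mean(f_hat == 0.0)                     # bins where f_hat = 0 carry no information about r
    print(f"fraction of frequencies (equations) missing: {missing:.2f}")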


Widefield microscope
Deconvolution/super-resolution in widefield microscopy
Widefield microscope

    I_out = |H|² ⋆ I_in

[figure: widefield microscope schematic (UV laser, CCD camera); optical transfer function H, supported on [−kmax, kmax]]

    I_out = H ⋆ I_in
Magnetic resonance imaging (sketch)
(Powerful) magnetic field aligns nuclear magnetization
of (usually) hydrogen atoms in water in the body
RF fields systematically alter the alignment of this
magnetization → hydrogen nuclei produce a rotating
magnetic field detectable by the scanner
Make excitation strength space dependent
Goal is to recover proton density

Model
Discrete-time Fourier transform of f [t1 , t2 ], 0 ≤ t1 , t2 ≤ n − 1

    f̂(ω1, ω2) = Σ_{(t1,t2)} f[t1, t2] e^{−i2π(ω1 t1 + ω2 t2)}

MR scan

    y(ω1, ω2) = f̂(ω1, ω2) + noise,   for some frequencies (ω1, ω2)
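A minimal sketch of this acquisition model (the toy image, the random set of sampled frequencies, and the noise level are illustrative assumptions):

    import numpy as np

    rng = np.random.default_rng(2)
    n = 64

    f = np.zeros((n, n))                                # toy proton-density image f[t1, t2]
    f[20:40, 24:44] = 1.0

    f_hat = np.fft.fft2(f)                              # full 2D DFT
    mask = rng.random((n, n)) < 0.2                     # sample ~20% of the (omega1, omega2) pairs
    noise = 0.01 * (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
    y = np.where(mask, f_hat + noise, 0.0)              # MR scan: noisy Fourier samples at the chosen frequencies

    print(f"kept {mask.mean():.0%} of the Fourier-domain equations")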
Undersampling in magnetic resonance angiography

[figure, two panels: space domain (image) and frequency domain (sampling pattern)]

Wish to speed up MRI by sampling less (very important to widen applicability)

example: 500 × 500 pixel image


about 20 radial lines; 500 measurements along each line
only 20 × 500 = 10,000 of the 250,000 equations are observed: ∼ 96% of the equations are missing
Frequency interpolation
[figures: original Logan-Shepp phantom; a row of the Fourier matrix]

Undersample Nyquist by 25 at high frequencies!


Underdetermined system

N -pixel image f
Underdetermined system

    y(ω1, ω2) = f̂(ω1, ω2) + z(ω1, ω2)

z is a noise term
Interested in m ≪ N samples
The Netflix problem

Netflix database
About a million users
About 25,000 movies
People rate movies
Sparsely sampled entries

[figure: Users × Movies rating matrix with only a few entries (×) observed]

Challenge
Complete the “Netflix matrix”

Many such problems → collaborative filtering


Partially filled out surveys...
Matrix completion

Matrix M ∈ R^{n1 × n2}
Observe a subset of entries
Wish to infer the missing entries

[figure: matrix with observed entries marked × and missing entries marked ?]
Underdetermined systems of linear equations

[figure: b = Ax with a wide matrix A]

When there are fewer equations than unknowns

the fundamental theorem of linear algebra (rank-nullity) says we cannot uniquely determine x
In general, this is absolutely correct
Convex Optimization often Returns the Correct Answer
Special structure

[figure: b = Ax with a sparse unknown x]

If unknown is assumed to be
sparse (genomics example)
low-rank (Netflix)
then one can often find solutions to these problems by convex optimization
Sparsity in genomics

y = Ax (+z)

n ∼ 500,000
m ∼ 1,000
But the number of nonzero components of x is in the tens

x is said to be s-sparse if

    ‖x‖ℓ0 = #{i : xi ≠ 0} ≤ s

Under some conditions, noiseless recovery by ℓ1 minimization is exact


ℓ1 minimization

ℓ0 minimization:                       ℓ1 minimization:
    min ‖x‖ℓ0  s.t. Ax = b                 min ‖x‖ℓ1  s.t. Ax = b
    combinatorially hard                   linear program

Why an ℓ1 objective?
The ℓ1 norm is the closest convex approximation to the ℓ0 quasi-norm

[figure: unit balls of the ℓ0 quasi-norm, the ℓ1 norm, and the ℓ2 norm]

ℓ1 ball: smallest convex ball containing ±ei = (0, ..., 0, ±1, 0, ..., 0)


Linear programming formulation

    minimize   Σi |xi|
    subject to Ax = b

is equivalent to

    minimize   Σi ti
    subject to Ax = b
               −ti ≤ xi ≤ ti

with variables x, t ∈ R^n

x⋆ solution ⇔ (x⋆, t⋆ = |x⋆|) solution
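A sketch of this LP reformulation using scipy.optimize.linprog (the helper name and the random test instance are illustrative, not part of the lecture):

    import numpy as np
    from scipy.optimize import linprog

    def l1_min(A, b):
        """min ||x||_1 s.t. Ax = b, via the equivalent LP in the variables (x, t)."""
        m, n = A.shape
        c = np.concatenate([np.zeros(n), np.ones(n)])         # objective: sum_i t_i
        A_ub = np.block([[np.eye(n), -np.eye(n)],             # x - t <= 0
                         [-np.eye(n), -np.eye(n)]])           # -x - t <= 0
        A_eq = np.hstack([A, np.zeros((m, n))])                # Ax = b
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * n), A_eq=A_eq, b_eq=b,
                      bounds=[(None, None)] * (2 * n))         # x and t unbounded (t >= |x| >= 0 is implied)
        return res.x[:n]

    rng = np.random.default_rng(3)
    n, m, s = 128, 40, 5                                       # illustrative sizes
    A = rng.normal(size=(m, n))
    x0 = np.zeros(n)
    x0[rng.choice(n, s, replace=False)] = rng.normal(size=s)
    print("recovery error:", np.linalg.norm(l1_min(A, A @ x0) - x0))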


Sparse regression in genomics

    y = Ax + z

Lasso:                                 Dantzig selector:
    min ‖x‖ℓ1                              min ‖x‖ℓ1
    s.t. ‖b − Ax‖ℓ2 ≤ ε                    s.t. ‖A*(b − Ax)‖ℓ∞ ≤ δ

Under some conditions


accurate recovery/estimation from noisy data
perfect recovery from noiseless data (z = 0)
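A sketch of the lasso in its constrained form using cvxpy (sizes, noise level, and the choice ε = σ√m are illustrative assumptions; the Dantzig selector is obtained by swapping the constraint):

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(4)
    n, m, s, sigma = 200, 80, 8, 0.05                    # far smaller than the genomics problem

    A = rng.normal(size=(m, n))
    x0 = np.zeros(n)
    x0[rng.choice(n, s, replace=False)] = rng.normal(size=s)
    b = A @ x0 + sigma * rng.normal(size=m)

    eps = sigma * np.sqrt(m)                             # noise-level-based tolerance
    x = cp.Variable(n)
    cp.Problem(cp.Minimize(cp.norm1(x)),                 # lasso: min ||x||_1 s.t. ||b - Ax||_2 <= eps
               [cp.norm2(b - A @ x) <= eps]).solve()
    print("estimation error:", np.linalg.norm(x.value - x0))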
Sparse deconvolution in seismics
    g = f ⋆ r

[figure, two panels: (a) layered impedance; (b) spike-train reflectivity]

Real earth is irregularly layered

Consequence: r is an irregular series of spikes ("sparse spike train")
Recovery by ℓ1 minimization

    min ‖r‖ℓ1   s.t.   f ⋆ r = g

Under some conditions, the recovery is exact
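A sketch of this program on a toy spike train (circular convolution, the bandpass wavelet, and the sparsity level are illustrative assumptions; whether recovery is exact depends on the conditions alluded to above):

    import numpy as np
    import cvxpy as cp
    from scipy.linalg import circulant

    rng = np.random.default_rng(5)
    N = 128

    r0 = np.zeros(N)                                     # sparse spike-train reflectivity
    r0[rng.choice(N, 8, replace=False)] = rng.normal(size=8)

    k = np.fft.fftfreq(N, d=1.0 / N)
    f = np.real(np.fft.ifft(((np.abs(k) > 4) & (np.abs(k) < 30)).astype(float)))   # bandpass wavelet

    F = circulant(f)                                     # circular convolution g = f * r as a matrix (rank deficient)
    g = F @ r0

    r = cp.Variable(N)
    cp.Problem(cp.Minimize(cp.norm1(r)), [F @ r == g]).solve()   # min ||r||_1 s.t. f * r = g
    print("recovery error:", np.linalg.norm(r.value - r0))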


ℓ1 reconstruction from undersampled freq. data

Signal length: n = 256
Data: m = 30 complex-valued Fourier coefficients
Sparsity: s = 15

[figures: original signal; ℓ1 recovery = perfect!; ℓ2 recovery (not sparse)]
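A sketch of this experiment with cvxpy (the random frequency set and the comparison against the minimum-energy solution are assumptions about the setup, not the original code):

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(6)
    n, m, s = 256, 30, 15                                # the sizes quoted on the slide

    x0 = np.zeros(n)
    x0[rng.choice(n, s, replace=False)] = rng.normal(size=s)
    freqs = rng.choice(n, m, replace=False)
    F = np.exp(-2j * np.pi * np.outer(freqs, np.arange(n)) / n)   # m rows of the DFT matrix
    y = F @ x0

    A = np.vstack([F.real, F.imag])                      # stack real/imaginary parts: a real system
    b = np.concatenate([y.real, y.imag])

    x = cp.Variable(n)
    cp.Problem(cp.Minimize(cp.norm1(x)), [A @ x == b]).solve()    # l1 recovery
    x_l2 = np.linalg.lstsq(A, b, rcond=None)[0]                   # minimum-energy (l2) solution

    print("l1 error:", np.linalg.norm(x.value - x0))
    print("l2 error:", np.linalg.norm(x_l2 - x0))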


Example in MRI

image is not sparse
gradient is sparse
So minimize the ℓ1 norm of the gradient subject to constraints

    min ‖x‖TV := Σ_{t1,t2} |∇x(t1, t2)|   s.t.   Ax = b

[figures: original Logan-Shepp phantom; naive reconstruction (filtered backprojection); min-TV reconstruction with nonnegativity constraint = perfect recovery]

Under some conditions, the recovery is exact
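A small sketch of TV-minimizing recovery with cvxpy (a tiny piecewise-constant image stands in for the phantom; the sampling mask, the image, and the solver are illustrative assumptions):

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(7)
    n = 32                                               # tiny image so the example solves quickly

    f = np.zeros((n, n))                                 # piecewise-constant test image: sparse gradient
    f[8:24, 8:24] = 1.0
    f[12:20, 12:20] = 0.5

    mask = (rng.random((n, n)) < 0.35).astype(float)     # observe a random subset of 2D Fourier coefficients
    f_hat = np.fft.fft2(f)

    F1 = np.exp(-2j * np.pi * np.outer(np.arange(n), np.arange(n)) / n)   # fft2(X) = F1 @ X @ F1 (F1 symmetric)
    X = cp.Variable((n, n))
    FXF = F1 @ X @ F1
    constraints = [X >= 0,                                                # nonnegativity, as on the slide
                   cp.multiply(mask, cp.real(FXF)) == mask * f_hat.real,
                   cp.multiply(mask, cp.imag(FXF)) == mask * f_hat.imag]

    cp.Problem(cp.Minimize(cp.tv(X)), constraints).solve()
    print("reconstruction error:", np.linalg.norm(X.value - f))

As on the slide, whether the recovery is exact depends on the sampling pattern and the sparsity of the gradient.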


Low-rank modeling in collaborative filtering

[figure: partially observed matrix with entries × observed and ? missing]

large number of entries
only sees a few
But the unknown matrix has low rank

Under some conditions, noiseless recovery by convex programming is exact


Nuclear-norm minimization

rank minimization:                          nuclear-norm minimization:
    min rank(X)                                 min Σi σi(X)
    s.t. Xij = Mij (Mij observed)               s.t. Xij = Mij (Mij observed)
    combinatorially hard                        semidefinite program

Convex relaxation of the rank minimization program

Ball {X : Σi σi(X) ≤ 1}: convex hull of rank-one matrices xy* obeying ‖xy*‖ ≤ 1
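A sketch of this program with cvxpy (matrix sizes, rank, and the 40% sampling rate are illustrative assumptions, far smaller than the Netflix matrix):

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(8)
    n1, n2, r = 30, 30, 2

    M = rng.normal(size=(n1, r)) @ rng.normal(size=(r, n2))      # random rank-r matrix
    mask = (rng.random((n1, n2)) < 0.4).astype(float)            # observed entries

    X = cp.Variable((n1, n2))
    cp.Problem(cp.Minimize(cp.normNuc(X)),                       # min sum of singular values
               [cp.multiply(mask, X) == mask * M]).solve()       # agree with M on observed entries

    print("relative completion error:", np.linalg.norm(X.value - M) / np.linalg.norm(M))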
Low-rank regression

Yij = Mij + Zij , (i, j) ∈ E

Under some conditions


accurate matrix recovery/estimation from noisy data
perfect matrix recovery from noiseless data (Z = 0)
What is Compressive Sensing?
The goal of image compression is to represent the digital model of an object as
compactly as possible. One can regard the possibility of digital compression as
a failure of sensor design. If it is possible to compress measured data, one might
argue that too many measurements were taken.
David Brady
Compressed sensing

Name coined by David Donoho

Has become a label for sparse signal recovery
But really one instance of underdetermined problems

Informs analysis of underdetermined problems
Changes viewpoint about underdetermined problems

Starting point of a general burst of activity in
    information theory
    signal processing
    statistics
    some areas of computer science
    ...
Inspired new areas of research, e.g. low-rank matrix recovery
A little experiment

Raw 5 MegaPixel image


15 Megabytes
A little experiment

JPEG Compressed 5 MegaPixel image


150 Kilobytes
A contemporary paradox

Massive data acquisition

Most of the data is redundant and can be thrown away
Seems enormously wasteful

[figure: the same image raw (15 MB) and JPEG compressed (150 KB)]
A long established tradition

Acquire/Sample (A-to-D converter, digital camera)


Compress (signal dependent, nonlinear)

[figure: acquisition pipeline with N ≫ M: sample (N coefficients) → sparse wavelet transform → compress (M coefficients) → transmit/store → receive → decompress (N)]
Sparsity in signal processing

[figures: a 1 megapixel image; its wavelet coefficients; zoom-in on a segment of the coefficients]

Implication: can discard small coefficients without much perceptual loss


Sparsity and wavelet “compression”
Take a mega-pixel image
1 Compute 1,000,000 wavelet coefficients
2 Set to zero all but the 25,000 largest coefficients
3 Invert the wavelet transform

[figures: 1 megapixel image; its 25k-term approximation]

This principle underlies modern lossy coders
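A sketch of these three steps using PyWavelets (the toy image, the db4 wavelet, the decomposition level, and keeping 2.5% of the coefficients, mirroring the 25,000-out-of-1,000,000 ratio, are all illustrative assumptions):

    import numpy as np
    import pywt

    rng = np.random.default_rng(9)
    img = rng.normal(size=(256, 256)).cumsum(axis=0).cumsum(axis=1)   # smooth-ish stand-in for a photograph

    coeffs = pywt.wavedec2(img, "db4", level=4)           # 1. compute the wavelet coefficients
    arr, slices = pywt.coeffs_to_array(coeffs)

    k = int(0.025 * arr.size)                             # 2. keep only the largest 2.5% of coefficients
    thresh = np.sort(np.abs(arr).ravel())[-k]
    arr_k = np.where(np.abs(arr) >= thresh, arr, 0.0)

    coeffs_k = pywt.array_to_coeffs(arr_k, slices, output_format="wavedec2")
    img_k = pywt.waverec2(coeffs_k, "db4")                # 3. invert the wavelet transform
    img_k = img_k[:img.shape[0], :img.shape[1]]           # crop possible padding

    print("relative error of the sparse approximation:",
          np.linalg.norm(img_k - img) / np.linalg.norm(img))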


Transform-domain image coding

Sparse representation = good compression

Why? Because there are fewer things to send/store

Canonical image coder
Most of the transformed coefficients are ≈ 0 ⇒ few bits


How many measurements to acquire a sparse signal?

x is s-sparse
Take m random and nonadaptive measurements

    yk = ⟨ak, x⟩,   k = 1, ..., m

e.g. ak ∼ N(0, I)
Reconstruct by ℓ1 minimization

First fundamental result
If m ≳ s log n, recovers the original exactly

Efficient acquisition is possible by nonadaptive sensing
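A numerical sketch of this first fundamental result (the constant in m ≳ s log n, the sizes, and the use of cvxpy are illustrative assumptions):

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(10)
    n, s = 400, 8
    m = int(2 * s * np.log(n))                            # m on the order of s log n (constant 2 is arbitrary)

    x0 = np.zeros(n)
    x0[rng.choice(n, s, replace=False)] = rng.normal(size=s)
    A = rng.normal(size=(m, n))                           # random, nonadaptive measurements a_k ~ N(0, I)
    y = A @ x0

    x = cp.Variable(n)
    cp.Problem(cp.Minimize(cp.norm1(x)), [A @ x == y]).solve()   # reconstruct by l1 minimization
    print("recovery error:", np.linalg.norm(x.value - x0))

In experiments like this one, the recovery error is numerically zero when m is large enough relative to s log n.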
Motivation: Sampling a Sparse Vector

[figure: a concentrated (sparse) vector and incoherent measurement waveforms]

• Signal is local, measurements are global
• Each measurement picks up a little information about each component
• Triangulate significant components from measurements
• Formalization: relies on uncertainty principles between sparsity basis and measurement system

Signals/images may not be exactly sparse

[figure: image and its wavelet coefficient table]


Nonadaptive sensing of compressible signals

[figure: a compressible signal with rapidly decaying coefficients]

Classical viewpoint                             Compressed sensing
Measure everything                              Take m random measurements
(all the pixels, all the coefficients)              bk = ⟨ak, x⟩
Store s-largest coefficients                    Reconstruct by ℓ1 minimization
distortion: ‖x − xs‖ℓ2

Second fundamental result

With m ≳ s log n, or even m ≳ s log(n/s),

    ‖x̂ − x‖ℓ2 ≤ ‖x − xs‖ℓ2

Simultaneous (nonadaptive) acquisition and compression


Optimality

Cannot do essentially better with

fewer measurements
other reconstruction algorithms
But data are always noisy...
Random sensing with a ∼ N(0, I)

    yk = ⟨ak, x⟩ + σ zk,   k = 1, ..., m

ak ∼ N(0, I) (say)
zk iid N(0, 1) (say)

Third fundamental result: C. and Plan ('09)

Recovery via lasso or Dantzig selector
s̄ = m / log(n/m)

    ‖x̂ − x‖²ℓ2 ≲ inf_{1 ≤ s ≲ s̄} [ ‖x − xs‖²ℓ2 + (s σ²/m) log n ]

= near-optimal bias-variance trade-off

Result holds more generally


Opportunities

When measurements are


expensive (e.g. fuel cell imaging, near IR imaging)
slow (e.g. MRI)
beyond current capabilities (e.g. wideband analog to digital conversion)
wasteful
missing
...
Compressed sensing may offer a way out
Example of hardware development
New analog-to-digital converter architectures (guest lecture by A. Emami)
New optical devices: cameras, spectroscopes, microscopes
Fast MRI (guest lecture by M. Lustig)
Course agenda and objectives

1 fundamental theoretical and mathematical ideas
2 efficient numerical methods in large-scale convex optimization
3 progress in implementing compressive sensing ideas into acquisition devices

Course objectives
expose students to a novel, active field of research
give tools to make theoretical and/or practical contributions
