
Mathematical methods for Image Processing

François Malgouyres

Institut de Mathématiques de Toulouse, France

Invitation: Jidesh P., NITK Surathkal
Funding: Global Initiative on Academic Network

Oct. 23–27



Plan
1. Sparsity prior: ℓ0 case

2. Sparsity prior: ℓ1 case



Dimensionality reduction

[Figure: a data point u in R^N (N large)]


Dimensionality reduction for approximation

[Figure: u in R^N (N large) lies at a small distance from V_K, a vector space (v.s.) of dimension K ≪ N]


Dimensionality reduction for denoising

[Figure: noisy data u + b in R^N (N large), where b is a noise of size ∼ √N σ; the clean image u lies at a small distance from V_K, a v.s. of dimension K ≪ N]


Dimensionality reduction

Applications:
Approximation
Denoising, compressed sensing/inverse problems (later in the talk)
Feature extraction
Questions:
Can we do it for images?
How can we construct V_K in a stable and realistic manner (from a numerical standpoint)?



Images are smooth

Let B be an orthonormal basis (e.g. Fourier, wavelets...).

We assume B is such that, for any image u, the coefficients x defined by u = Bx satisfy:

‖x‖_p is small, for p small.
[Figure: unit balls of the ℓ^p (quasi-)norm for p = 0.2, p = 1 and p = 10]

Asymptotically, this corresponds to a Sobolev smoothness assumption for Fourier, and to a Besov smoothness assumption for wavelets.



Images are smooth
We denote by S the set of indices of the K largest entries of x, and define

x̃_k = x_k , if k ∈ S,
      0 ,   otherwise.

Then Bx̃ ∼ u and Bx̃ ∈ V_K = Vect{B_k , k ∈ S}.

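To make the construction concrete, here is a minimal numerical sketch of K-term approximation, assuming a random orthonormal matrix as a stand-in for B (Fourier/wavelets); all names are illustrative:

    import numpy as np

    # Keep the K largest coefficients of u in the orthonormal basis B.
    def k_term_approx(u, B, K):
        x = B.T @ u                             # coefficients: u = B x
        S = np.argsort(np.abs(x))[::-1][:K]     # indices of the K largest entries
        x_tilde = np.zeros_like(x)
        x_tilde[S] = x[S]                       # keep only the entries in S
        return B @ x_tilde                      # B x_tilde lies in V_K = Vect{B_k, k in S}

    rng = np.random.default_rng(0)
    N, K = 512, 32
    B, _ = np.linalg.qr(rng.standard_normal((N, N)))          # orthonormal basis
    x = rng.standard_normal(N) * np.arange(1, N + 1) ** -1.5  # compressible coefficients
    u = B @ x
    u_K = k_term_approx(u, B, K)
    print(np.linalg.norm(u - u_K) / np.linalg.norm(u))        # small relative error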


Non-linear approximation

Theorem
We have, for 0 < p ≤ 2,

‖u − Bx̃‖_2 = ‖x − x̃‖_2 ≤ ‖x‖_p / K^r ,

for r = 1/p − 1/2.

Examples:
If p = 0.01, r = 99.5
If p = 0.1, r = 9.5
If p = 1, r = 0.5

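A quick numerical check of this bound, a sketch under the same illustrative setup (sorted compressible coefficients, p = 1, K = 32):

    import numpy as np

    # Verify ||x - x_tilde||_2 <= ||x||_p / K^r on a compressible vector.
    rng = np.random.default_rng(0)
    p, K = 1.0, 32
    x = np.sort(np.abs(rng.standard_normal(1024)))[::-1] ** 3  # fast-decaying entries
    err = np.linalg.norm(x[K:])                                # best K-term error in l2
    bound = np.sum(np.abs(x) ** p) ** (1 / p) / K ** (1 / p - 0.5)
    print(err <= bound)                                        # True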


Dimensionality reduction of smooth images

[Figure: noisy data u + b in R^N (N large), where b is a noise of size ∼ √N σ; the clean image u lies within ‖x‖_p / K^r of V_K = Vect{B_k , k ∈ S}]


Dimensionality reduction: Manipulating vector spaces

Implicit representation (often called Analysis, or co-sparsity prior): We have an analysis linear operator

A : R^N −→ R^P

where typically P ≫ N, and we build a set of equations S ⊂ {1, . . . , P} such that

V_K = {w ∈ R^N | ∀i ∈ S, (Aw)_i = 0}.

(Ex.: A computes finite differences, as in the Total Variation prior.)

Explicit representation (often called Synthesis, or sparsity prior): We have a synthesis linear operator

D : R^P −→ R^N

where typically P ≫ N, and we build a support S ⊂ {1, . . . , P} such that

V_K = {Dx | x ∈ R^P and Supp(x) ⊂ S}.

(Ex.: D is a wavelet transform.)

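A small sketch of the two representations, assuming a 1-D finite-difference operator for A (Total Variation style) and a random dictionary standing in for D; all names are hypothetical:

    import numpy as np

    N, P = 8, 16
    rng = np.random.default_rng(0)

    # Analysis: (A w)_i = w_{i+1} - w_i (finite differences).
    A = np.eye(N - 1, N, k=1) - np.eye(N - 1, N)
    S = [0, 1, 2]                      # equations forced to vanish
    w = np.ones(N)                     # constant signal: all differences are zero
    print(np.allclose(A[S] @ w, 0))    # w satisfies (A w)_i = 0 for i in S -> True

    # Synthesis: V_K = {D x : Supp(x) included in S_syn}.
    D = rng.standard_normal((N, P))
    S_syn = [3, 7, 11]
    x = np.zeros(P)
    x[S_syn] = [1.0, -2.0, 0.5]        # coefficients supported on S_syn
    v = D @ x                          # v belongs to V_K by construction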


Redundancy in the Synthesis prior

For P > N and ‖x‖_0 = |{i, x_i ≠ 0}|:

(P0): min_{x∈R^P} ‖x‖_0  s.t. ‖Dx − u‖_2 ≤ τ    or    (P0'): min_{x∈R^P} ‖Dx − u‖_2  s.t. ‖x‖_0 ≤ K

[Figure: a wide matrix D mapping a sparse x ∈ R^P to Dx ∼ u ∈ R^N]


Redundancy: the picture in RN

[Figure: drawing for N = 2, K = 1, with atoms D_1 , . . . , D_5 and data u, illustrating min_{x∈R^P} ‖x‖_0 s.t. ‖Dx − u‖ ≤ τ]

Number of v.s. of dimension K:

C_P^K = P(P − 1) . . . (P − K + 1) / K!


Redundancy: the picture in RP
[Figure: drawing for P = 3, K = 1, τ = 0, in coordinates (x_1 , x_2 , x_3 ), showing x_0 and the affine set x_0 + Ker(D)]

Hyp.: There exists x_0 s.t. ‖x_0‖_0 ≤ K and u = Dx_0 .

x* ∈ Argmin_{x∈R^P} ‖x‖_0  s.t. ‖Dx − u‖ = 0

x* ∈ (x_0 + Ker(D)) ∩ ∪_{|S|≤K} V_S

But when dim(Ker(D)) = P − N and K ≪ N, the intersection of x_0 + Ker(D) with any fixed finite collection of vector spaces of dimension K is unlikely to contain anything besides x_0, in R^P.



Compressed sensing, noiseless ℓ0 case

Theorem
If
  there exists x_0 such that ‖x_0‖_0 ≤ K and u = Dx_0 ,
  D is such that, for any x ≠ 0 with ‖x‖_0 ≤ 2K, Dx ≠ 0,
then x_0 is the unique solution of (P0), when τ = 0.

Proof: Otherwise, there exists x ≠ x_0 with ‖x‖_0 ≤ ‖x_0‖_0 ≤ K and such that Dx = Dx_0 . Therefore ‖x − x_0‖_0 ≤ 2K, x − x_0 ≠ 0 and D(x − x_0) = 0. Contradiction. □

Thanks to the sparsity hypothesis, we can invert D!

(But as such it might be unstable, and (P0) is NP-hard.)

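The condition of the theorem can be checked by brute force on toy examples; a sketch (only feasible for small P, names illustrative):

    import numpy as np
    from itertools import combinations

    # Dx != 0 for every x != 0 with ||x||_0 <= 2K  <=>  every set of
    # 2K columns of D is linearly independent.
    def uniqueness_condition(D, K):
        P = D.shape[1]
        return all(np.linalg.matrix_rank(D[:, list(S)]) == 2 * K
                   for S in combinations(range(P), 2 * K))

    D = np.random.default_rng(0).standard_normal((6, 10))
    print(uniqueness_condition(D, K=2))   # True: generic 4-column subsets have rank 4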


Compressed sensing, noisy ℓ0 case

We assume the columns of D have norm 1.

Definition (Restricted Isometry Property)
We say that D satisfies the "K-Restricted Isometry Property" (K-RIP), where K ∈ {1, . . . , N}, if there exists δ_K ∈ [0, 1) such that

(1 − δ_K)‖x‖² ≤ ‖D_K x‖² ≤ (1 + δ_K)‖x‖² ,  ∀x ∈ R^K,

whatever the matrix D_K extracted from D, of size N × K.

We can equivalently say

(1 − δ_K)‖x‖² ≤ ‖Dx‖² ≤ (1 + δ_K)‖x‖² ,  ∀x ∈ R^P s.t. ‖x‖_0 ≤ K.

(Geometrically, in the parameter space, the RIP forces the singular vectors of D corresponding to small singular values to be almost orthogonal to all K-sparse vectors.)

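Computing δ_K exactly requires examining all C(P, K) column subsets, so here is a hedged Monte Carlo sketch that only produces a lower bound on δ_K (all names illustrative):

    import numpy as np

    # Sample random supports S of size K and track the worst deviation of
    # the singular values of D[:, S] from 1; this lower-bounds delta_K.
    def rip_lower_bound(D, K, trials=2000, seed=0):
        rng = np.random.default_rng(seed)
        P, delta = D.shape[1], 0.0
        for _ in range(trials):
            S = rng.choice(P, size=K, replace=False)
            s = np.linalg.svd(D[:, S], compute_uv=False)
            delta = max(delta, s[0] ** 2 - 1.0, 1.0 - s[-1] ** 2)
        return delta

    rng = np.random.default_rng(1)
    N, P, K = 64, 256, 8
    D = rng.standard_normal((N, P)) / np.sqrt(N)   # columns of norm ~1
    print(rip_lower_bound(D, K))                   # estimated lower bound on delta_K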


Compressed sensing, noisy ℓ0 case

Recall

(P0) :  x* ∈ Argmin_x ‖x‖_0  s.t. ‖Dx − u‖ ≤ τ.

Theorem
Assume there exist K ≤ N and x_0 such that ‖x_0‖_0 ≤ K, and consider the data

u = Dx_0 + b.

If D satisfies the 2K-RIP with constant δ_2K, then

‖x_0 − x*‖ ≤ 2τ / √(1 − δ_2K) ,

where x* is any solution of (P0) and τ is such that ‖b‖ ≤ τ.

(b contains some approximation error and some noise.)



Compressed sensing, noisy ℓ0 case, Thm proof

Notice first that, since ‖b‖ ≤ τ, x_0 satisfies the constraint ‖Dx_0 − u‖ ≤ τ.

Therefore
‖x*‖_0 ≤ ‖x_0‖_0 ≤ K
and
‖x_0 − x*‖_0 ≤ 2K.

Therefore

(1 − δ_2K)‖x_0 − x*‖² ≤ ‖D(x_0 − x*)‖²
                      = ‖(Dx_0 − u) − (Dx* − u)‖²
                      ≤ (‖Dx_0 − u‖ + ‖Dx* − u‖)²
                      ≤ (2τ)².

Remark: the upper RIP inequality (‖Dx‖² ≤ (1 + δ_K)‖x‖²) is not used.



ℓ0 minimization: A NP-Hard problem

Theorem
All the variants

x* ∈ Argmin_x ‖x‖_0  s.t. ‖Dx − u‖ ≤ τ,

x* ∈ Argmin_x ‖Dx − u‖²  s.t. ‖x‖_0 ≤ K,

x* ∈ Argmin_x ‖x‖_0 + λ‖Dx − u‖²

are NP-hard problems.

(See Davis-Mallat-Avellaneda, "Adaptive Greedy Approximations", Constructive Approximation, 13:57–98, 1997.)



ℓ0 minimization: the proximal gradient algorithm
Theorem
For any x ∈ R^P and any t > 0,

prox_{t‖·‖_0}(x)_i = x_i , if |x_i| ≥ √(2/t),
                     0 ,   otherwise.

It is a hard thresholding.

Proof: By definition,

prox_{t‖·‖_0}(x) = Argmin_{x'∈R^P} (t/2)‖x − x'‖²_2 + ‖x'‖_0 .

This problem is separable, and we can prove that, for all i = 1..P,

prox_{t‖·‖_0}(x)_i = Argmin_{y∈R} (t/2)(x_i − y)² + 1_{y≠0}(y).

We have

(t/2)(x_i − y)² + 1_{y≠0}(y) = 1 ,          if y = x_i ≠ 0,
                               (t/2) x_i² ,  if y = 0.

Choosing the smallest objective value leads to the result. □
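A minimal sketch of this prox, assuming the convention above, prox_{t‖·‖_0}(x) = Argmin (t/2)‖x − x'‖² + ‖x'‖_0, which gives the threshold √(2/t):

    import numpy as np

    # Hard thresholding: keep x_i when |x_i| >= sqrt(2/t), zero otherwise.
    def prox_l0(x, t):
        return np.where(np.abs(x) >= np.sqrt(2.0 / t), x, 0.0)

    x = np.array([3.0, 0.5, -2.0, 0.1])
    print(prox_l0(x, t=1.0))   # threshold sqrt(2) ~ 1.41 -> [ 3.  0. -2.  0.]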
ℓ0 minimization: the proximal gradient algorithm

Theorem
For any starting point and any stepsize 0 < t < 1/L, where L is the Lipschitz constant of the gradient

x ↦ 2λD*(Dx − u)

(i.e. L = 2λσ_max(D)², where σ_max(D) is the largest singular value of D), the proximal gradient algorithm (also called "Iterative Hard Thresholding")

x^{k+1} = prox_{t‖·‖_0}(x^k − t 2λD*(Dx^k − u))

converges to a critical point of

‖x‖_0 + λ‖Dx − u‖².

This is a straightforward consequence of Bolte-Sabach-Teboulle, "Proximal alternating linearized minimization for nonconvex and nonsmooth problems", Math. Prog., Series A, 2013.

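A hedged sketch of the iteration (lam and n_iter are illustrative names; the hard threshold √(2t) below is the one produced by a gradient step of size t in the standard prox-gradient convention):

    import numpy as np

    # Iterative Hard Thresholding for min ||x||_0 + lam * ||D x - u||^2.
    def iht(D, u, lam=1.0, n_iter=500):
        L = 2.0 * lam * np.linalg.norm(D, 2) ** 2      # Lipschitz constant of the gradient
        t = 0.9 / L                                    # stepsize 0 < t < 1/L
        thr = np.sqrt(2.0 * t)                         # hard threshold of the l0 prox
        x = np.zeros(D.shape[1])
        for _ in range(n_iter):
            z = x - t * 2.0 * lam * D.T @ (D @ x - u)  # gradient step on the smooth term
            x = np.where(np.abs(z) >= thr, z, 0.0)     # hard thresholding
        return x

    rng = np.random.default_rng(0)
    D = rng.standard_normal((32, 64)) / np.sqrt(32)    # random dictionary
    x0 = np.zeros(64); x0[[5, 20]] = [2.0, -3.0]       # 2-sparse ground truth
    print(np.linalg.norm(iht(D, D @ x0, lam=50.0) - x0))  # distance to the ground truth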


Application of compressed sensing: Inpainting

[Figure: top, image with missing pixels; bottom, restored and ideal images]


Application of compressed sensing: Zoom

[Figure: un-zoomed image; zoomed (×4) by two methods]


Application of compressed sensing: Morphological
component analysis

Figure: Initial image, wavelet part, curvelet part (courtesy of J.-L. Starck)


Plan
1. Sparsity prior: ℓ0 case

2. Sparsity prior: ℓ1 case


ℓ1 minimization

We replace the nonconvex, discontinuous, NP-hard problem by a convex, non-differentiable one:

(P1) :  x* ∈ Argmin_x ‖x‖_1  s.t. ‖Dx − u‖ ≤ τ,

where τ ≥ 0 and the ℓ1 norm is defined by

‖x‖_1 = Σ_{i=1}^P |x_i|.

Or a variant using a Lagrangian:

x* ∈ Argmin_x ‖x‖_1 + λ‖Dx − u‖²,

for λ ≥ 0.

(They describe the same solution path, when λ and τ vary.)



ℓ1 minimization: Proximal gradient algorithm

The objective function

E(x) = λ‖Dx − u‖² + ‖x‖_1

has the form f + R, where
  f : W −→ R is convex and differentiable, with a Lipschitz gradient¹ of constant L > 0,
  R is lower semi-continuous, proper, convex and coercive.

Aside from the fact that f might not be coercive (which turns out not to be an issue), this guarantees that the iterates of the proximal gradient algorithm converge (see the lecture on "non-smooth optimization").

¹ ∀w, w' ∈ W, ‖∇f(w') − ∇f(w)‖ ≤ L‖w' − w‖.



ℓ1 minimization: Proximal gradient algorithm
Theorem (Iterative Soft Thresholding convergence)
The iterates

x^{k+1} = prox_{t‖·‖_1}(x^k − t 2λD*(Dx^k − u))

converge, for any t < 1/L, where L is the Lipschitz constant of the gradient

x ↦ 2λD*(Dx − u)

(i.e. L = 2λσ_max(D)², where σ_max(D) is the largest singular value of D). Moreover, (E(x^k))_{k∈N} is non-increasing and, for any minimizer x* of E,

E(x^k) − E(x*) ≤ (L/(2k)) ‖x^0 − x*‖².

We recall that

prox_{L‖·‖_1}(x')_i = x'_i − 1/L , if x'_i > 1/L,
                      0 ,          if −1/L ≤ x'_i ≤ 1/L,
                      x'_i + 1/L , if x'_i < −1/L.

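A minimal sketch of the iteration (lam and n_iter are illustrative names; soft thresholding at level t is the ℓ1 prox of the gradient step in the standard convention):

    import numpy as np

    # ISTA for E(x) = lam * ||D x - u||^2 + ||x||_1.
    def soft_threshold(z, thr):
        return np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)

    def ista(D, u, lam=1.0, n_iter=500):
        L = 2.0 * lam * np.linalg.norm(D, 2) ** 2      # Lipschitz constant of the gradient
        t = 0.9 / L                                    # stepsize t < 1/L
        x = np.zeros(D.shape[1])
        for _ in range(n_iter):
            z = x - t * 2.0 * lam * D.T @ (D @ x - u)  # gradient step on the smooth term
            x = soft_threshold(z, t)                   # l1 prox: soft thresholding
        return x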


ℓ1 minimization
We recall

(P1) :  x* ∈ Argmin_x ‖x‖_1  s.t. ‖Dx − u‖ ≤ τ.

Theorem (Compressed sensing: ℓ1 case)
Let K ≤ N and x_0 be such that ‖x_0‖_0 ≤ K. We consider the datum

u = Dx_0 + b.

If D satisfies the 4K-RIP and the constants δ_3K and δ_4K are such that δ_3K + 3δ_4K < 2, then

‖x_0 − x*‖ ≤ C_K τ,

where x* is the solution of (P1), τ is any value such that ‖b‖ ≤ τ, and

C_K = 4 / (√(3 − 3δ_4K) − √(1 + δ_3K)).

(b contains some approximation error and some noise.)



Compressed sensing ℓ1 : Geometric intuition
Lemma
We have

‖D(x* − x_0)‖ ≤ 2τ  and  ‖x*‖_1 ≤ ‖x_0‖_1.

Proof:

‖D(x* − x_0)‖ = ‖(Dx* − u) − (Dx_0 − u)‖
              ≤ ‖Dx* − u‖ + ‖Dx_0 − u‖
              ≤ 2τ.

The second inequality is a straightforward consequence of ‖b‖ ≤ τ : x_0 then satisfies the constraint of (P1), so the minimizer x* satisfies ‖x*‖_1 ≤ ‖x_0‖_1. □

The rest of the proof quantifies the fact that the ℓ1 ball is narrow. (See Candès-Romberg-Tao, "Stable Signal Recovery from Incomplete and Inaccurate Measurements", Comm. Pure Appl. Math., 2006.)

