Mathematical Methods in Image Processing
François Malgouyres
Invited by Jidesh P., NITK Surathkal
Funded by the Global Initiative on Academic Network
Oct. 23–27
Figure: A signal $u \in \mathbb{R}^N$ ($N$ large) and noisy data $u + b$ with $\|b\| \approx \sqrt{N}\sigma$; $u$ is close to a vector space $V_K$ of dimension $K \ll N$, so projecting the data onto $V_K$ leaves a noise of size only $\approx \sqrt{K}\sigma$.
Applications:
Approximation
Denoising, compressed sensing/inverse problems (later in the talk)
Feature extraction
Questions:
Can we do it for images?
How to construct $V_K$ in a stable and realistic manner (from a numerical standpoint)?
Figure: Unit balls of $\ell^p$ for $p = 0.2$, $p = 1$, and $p = 10$.
$\tilde{x}$: the best $K$-term approximation of $x$, with $B\tilde{x} \sim u$ and $B\tilde{x} \in V_K = \operatorname{Vect}\{B_k,\ k \in S\}$.
Theorem
We have, for $0 < p \le 2$,
$$\|u - B\tilde{x}\|_2 = \|x - \tilde{x}\|_2 \le \frac{\|x\|_p}{K^r}, \qquad \text{for } r = \frac{1}{p} - \frac{1}{2}.$$
Examples:
If p = 0.01, r = 99.5
If p = 0.1, r = 9.5
If p = 1, r = 0.5
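The bound is easy to check numerically. A minimal sketch (the helper `best_k_term_error` and the test values are ours, not from the lecture): keep the $K$ largest entries of $x$ and compare the $\ell^2$ error to $\|x\|_p / K^r$.

```python
import numpy as np

def best_k_term_error(x, K):
    """l2 error of the best K-term approximation of x (keep the K largest |x_i|)."""
    idx = np.argsort(np.abs(x))[::-1]
    x_tilde = np.zeros_like(x)
    x_tilde[idx[:K]] = x[idx[:K]]
    return np.linalg.norm(x - x_tilde)

rng = np.random.default_rng(0)
p = 1.0
x = rng.standard_normal(1000)
x /= np.sum(np.abs(x) ** p) ** (1.0 / p)   # normalize so that ||x||_p = 1
r = 1.0 / p - 0.5                          # r = 1/p - 1/2
for K in (10, 100, 500):
    assert best_k_term_error(x, K) <= 1.0 / K ** r   # error <= ||x||_p / K^r
```

The bound holds for every $x$ (it is a Stechkin-type inequality), so the assertions pass regardless of the random draw.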
Figure: Denoising by projection onto $V_K$: data $u + b$ with $\|b\| \approx \sqrt{N}\sigma$ in $\mathbb{R}^N$ ($N$ large); the error splits into an approximation term $\|x\|_p / K^r$ and a remaining noise term $\approx \sqrt{K}\sigma$.

$V_K = \operatorname{Vect}\{B_k,\ k \in S\}$
$V_K = \{w \in \mathbb{R}^N \mid \forall i \in S,\ (Aw)_i = 0\}$
Find a sparse $x$ with $Dx \sim u$, where $D$ is a dictionary:
$$\min_{x \in \mathbb{R}^P} \|x\|_0 \quad \text{s.t. } \|Dx - u\| \le \tau$$
Figure: Drawing for $N = 2$, $K = 1$: the data $u$ and the lines spanned by the atoms $D_1, \ldots, D_5$.
Number of v.s. of dimension $K$:
$$C_P^K = \frac{P(P-1)\cdots(P-K+1)}{K!}$$
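Even for moderate $P$ and $K$ this count explodes, which is why exhaustive search over supports is hopeless. A quick sanity check with Python's `math.comb` (the values of $P$ and $K$ are just for illustration):

```python
from math import comb

P, K = 1000, 10
n_subspaces = comb(P, K)     # C(P, K) = P(P-1)...(P-K+1) / K!
assert n_subspaces > 10**23  # far too many supports to enumerate
assert comb(P, K) == comb(P - 1, K - 1) + comb(P - 1, K)  # Pascal's rule
```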
$$x^* \in \operatorname{Argmin}_{x \in \mathbb{R}^P} \|x\|_0 \quad \text{s.t. } \|Dx - u\| = 0$$
Figure: The point $x_0$ and the affine set $x_0 + \operatorname{Ker}(D)$ in the $(x_1, x_2)$ plane.
$$x^* \in (x_0 + \operatorname{Ker}(D)) \cap \bigcup_{|S| \le K} V_S$$
But when $\dim(\operatorname{Ker}(D)) = P - N$ and $K \ll N$, the intersection of $x_0 + \operatorname{Ker}(D)$ and any finite collection of vector spaces of dimension $K$ is unlikely in $\mathbb{R}^P$.
Theorem
If
there exists $x_0$ such that $\|x_0\|_0 \le K$ and $u = Dx_0$,
$D$ is such that, for any $x \ne 0$ with $\|x\|_0 \le 2K$, $Dx \ne 0$,
then $x_0$ is the unique solution of $(P_0)$ when $\tau = 0$.
Proof: Otherwise, there exists $x \ne x_0$ with $\|x\|_0 \le \|x_0\|_0 \le K$ and such that $Dx = Dx_0$. Therefore $x - x_0 \ne 0$, $\|x - x_0\|_0 \le 2K$ and $D(x - x_0) = 0$. Contradiction.
(Geometrically, in the parameter space, the RIP forces the singular vectors of $D$ corresponding to small singular values to be almost orthogonal to all $K$-sparse vectors.)
Recall
$$(P_0): \quad x^* \in \operatorname{Argmin}_x \|x\|_0 \quad \text{s.t. } \|Dx - u\| \le \tau.$$
Theorem
Assume there exist $K \le N$ and $x_0$ such that $\|x_0\|_0 \le K$, and consider the data
$$u = Dx_0 + b.$$
If $D$ satisfies the $2K$-RIP with constant $\delta_{2K}$, then
$$\|x_0 - x^*\| \le \frac{2}{\sqrt{1 - \delta_{2K}}}\,\tau.$$
Theorem
All the variants
$$x^* \in \operatorname{Argmin}_x \|x\|_0 \quad \text{s.t. } \|Dx - u\| \le \tau,$$
$$x^* \in \operatorname{Argmin}_x \|Dx - u\| \quad \text{s.t. } \|x\|_0 \le K,$$
$$x^* \in \operatorname{Argmin}_x \|x\|_0 + \lambda \|Dx - u\|^2$$
are NP-hard problems.
It is a hard-thresholding.
Proof: By definition,
$$\operatorname{prox}_{t\|\cdot\|_0}(x) = \operatorname{Argmin}_{x' \in \mathbb{R}^P} \frac{t}{2}\|x - x'\|_2^2 + \|x'\|_0.$$
This problem is separable and we can prove that, for all $i = 1, \ldots, P$,
$$\operatorname{prox}_{t\|\cdot\|_0}(x)_i = \operatorname{Argmin}_{y \in \mathbb{R}} \frac{t}{2}(x_i - y)^2 + 1_{y \ne 0}(y).$$
We have
$$\frac{t}{2}(x_i - y)^2 + 1_{y \ne 0}(y) = \begin{cases} 1, & \text{if } y = x_i \ne 0, \\ \frac{t}{2} x_i^2, & \text{if } y = 0. \end{cases}$$
Choosing the smallest objective value leads to the result.
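The resulting hard thresholding is straightforward to implement. A minimal sketch (the function name is ours) under the convention above, which keeps $x_i$ exactly when $\frac{t}{2}x_i^2 > 1$, i.e. $|x_i| > \sqrt{2/t}$:

```python
import numpy as np

def prox_l0(x, t):
    """Hard thresholding: keep x_i when (t/2) x_i^2 > 1, i.e. |x_i| > sqrt(2/t)."""
    out = x.copy()
    out[np.abs(x) <= np.sqrt(2.0 / t)] = 0.0
    return out

x = np.array([3.0, 0.5, -2.0, 0.1])
print(prox_l0(x, t=1.0))   # threshold sqrt(2): keeps 3.0 and -2.0, zeroes the rest
```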
François Malgouyres (IMT), Mathematics for Image Processing, Oct. 23–27
ℓ0 minimization: the proximal gradient algorithm
Theorem
For any starting point and any stepsize $0 \le t < \frac{1}{L}$, where $L$ is the Lipschitz constant of the gradient
$$x \longmapsto 2\lambda D^*(Dx - u)$$
(i.e. $L = 2\lambda \sigma_{\max}(D)^2$, where $\sigma_{\max}(D)$ is the largest singular value of $D$), the iterates of the proximal gradient algorithm (also called "Iterative Hard Thresholding") converge.

Figure: Top: image with missing pixels; bottom: restored and ideal images.
Figure: Initial image, wavelet part, curvelet part (courtesy of J.-L. Starck).
for $\lambda \ge 0$. (They describe the same solution path when $\lambda$ and $\tau$ vary.)
Besides the fact that $E$ might not be coercive (which turns out not to be an issue), this makes it possible to guarantee that the iterates of the proximal gradient algorithm converge (see the lecture on "non-smooth optimization").
$$\forall w, w' \in W, \quad \|\nabla E(w') - \nabla E(w)\| \le L\|w' - w\|$$
(i.e. $L = 2\lambda \sigma_{\max}(D)^2$, where $\sigma_{\max}(D)$ is the largest singular value of $D$).
Moreover, $(E(x^k))_{k \in \mathbb{N}}$ is non-increasing and, for any minimizer $x^*$ of $E$,
$$E(x^k) - E(x^*) \le \frac{L}{2k}\|x^0 - x^*\|^2.$$
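Putting the gradient step and the hard-thresholding prox together gives the algorithm. A minimal sketch for $E(x) = \|x\|_0 + \lambda\|Dx - u\|^2$ (the stepsize factor, iteration count and test problem are illustrative choices of ours, not from the lecture):

```python
import numpy as np

def iht(D, u, lam, n_iter=500):
    """Proximal gradient / Iterative Hard Thresholding for
    E(x) = ||x||_0 + lam * ||D x - u||^2 (sketch)."""
    L = 2.0 * lam * np.linalg.norm(D, 2) ** 2   # Lipschitz constant of the gradient
    s = 0.9 / L                                  # stepsize 0 < s < 1/L
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - s * 2.0 * lam * D.T @ (D @ x - u)  # gradient step on the smooth part
        z[np.abs(z) <= np.sqrt(2.0 * s)] = 0.0     # hard thresholding (prox of ||.||_0)
        x = z
    return x

rng = np.random.default_rng(1)
D = rng.standard_normal((20, 50))
x0 = np.zeros(50)
x0[[3, 17]] = [2.0, -3.0]
u = D @ x0
x = iht(D, u, lam=1.0)
E = np.count_nonzero(x) + np.linalg.norm(D @ x - u) ** 2
assert E <= np.linalg.norm(u) ** 2 + 1e-9   # E non-increasing from x^0 = 0
```

The final assertion only checks the monotonicity guaranteed by the theorem; recovery of $x_0$ itself is not guaranteed in general.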
We recall that
$$\operatorname{prox}_{L\|\cdot\|_1}(x')_i = \begin{cases} x'_i - \frac{1}{L}, & \text{if } x'_i > \frac{1}{L}, \\ 0, & \text{if } -\frac{1}{L} \le x'_i \le \frac{1}{L}, \\ x'_i + \frac{1}{L}, & \text{if } x'_i < -\frac{1}{L}. \end{cases}$$
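In code, this soft thresholding is one line. A sketch with a generic threshold `thr` (take `thr = 1/L` to match the formula above):

```python
import numpy as np

def soft_threshold(x, thr):
    """prox of the l1 norm (soft thresholding): shrink each entry toward 0 by thr."""
    return np.sign(x) * np.maximum(np.abs(x) - thr, 0.0)

x = np.array([2.0, 0.3, -1.5, -0.1])
print(soft_threshold(x, 0.5))   # shrinks 2.0 -> 1.5, -1.5 -> -1.0, zeroes the small entries
```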
$$u = Dx_0 + b.$$
If $D$ satisfies the $4K$-RIP and the constants $\delta_{3K}$ and $\delta_{4K}$ are such that $\delta_{3K} + 3\delta_{4K} < 2$, then
$$\|x_0 - x^*\| \le C_K \tau,$$
where $x^*$ is the solution of $(P_1)$, $\tau$ is any value such that $\|b\| \le \tau$, and
$$C_K = \frac{4}{\sqrt{3 - 3\delta_{4K}} - \sqrt{1 + \delta_{3K}}}.$$
Proof:
The rest of the proof quantifies and argues that the $\ell^1$ ball is narrow. (See Candès, Romberg, Tao, "Stable signal recovery from incomplete and inaccurate measurements", Comm. Pure Appl. Math., 2006.)
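In the noiseless case $\tau = 0$, $(P_1)$ is basis pursuit, $\min \|x\|_1$ s.t. $Dx = u$, which can be solved exactly as a linear program by splitting $x = p - n$ with $p, n \ge 0$. A sketch (assuming SciPy is available; the problem sizes are illustrative):

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(D, u):
    """min ||x||_1 s.t. D x = u, as an LP over (p, n) with x = p - n, p, n >= 0."""
    N, P = D.shape
    c = np.ones(2 * P)            # objective sum(p) + sum(n) = ||x||_1 at the optimum
    A_eq = np.hstack([D, -D])     # D p - D n = u
    res = linprog(c, A_eq=A_eq, b_eq=u, bounds=(0, None))
    assert res.status == 0
    return res.x[:P] - res.x[P:]

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 40))
x0 = np.zeros(40)
x0[[5, 12, 30]] = [1.0, -2.0, 0.5]
x = basis_pursuit(D, D @ x0)
# x is feasible and has l1 norm at most ||x0||_1, since x0 itself is feasible
assert np.linalg.norm(D @ x - D @ x0) < 1e-6
assert np.abs(x).sum() <= np.abs(x0).sum() + 1e-6
```

For such a sparse $x_0$ and a Gaussian $D$, the LP solution typically recovers $x_0$ exactly, but the assertions only check feasibility and $\ell^1$ optimality, which hold regardless.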