Image Analysis Lecture 9
Image reconstruction
From model-based to data-driven approaches
Subhadip Mukherjee
Indian Institute of Technology
Kharagpur, India
[email protected]
sites.google.com/view/subhadip-mukherjee/home
github.com/Subhadip-1
This lecture
• Previously, we have seen how to apply optimization methods to
reconstruct an image by minimizing a variational energy function (data
fidelity plus regularizer).
• In this lecture, we will learn how to formulate different data-adaptive
reconstruction approaches and training strategies by drawing inspiration
from the variational framework and various optimization algorithms.
• We will study two important classes of techniques for learning an image
reconstruction operator and two representative methods from each class.
1. Supervised approaches
1.1. Algorithm unrolling
1.2. Bi-level learning
2. Unsupervised and weakly supervised approaches
2.1. Plug-and-play (PnP) denoising algorithms
2.2. Adversarial regularization
• You will get an overview of the pros and cons of these approaches and learn
which approach is suitable in what context. You will learn to implement a
simple unrolling approach in the practical (using odl and pytorch).
Variational image reconstruction: A Bayesian view
• The Bayesian approach to reconstruction views the image x and the
corresponding data y as two random variables related as y = Ax + w, where w denotes the measurement noise.
• In the Bayesian framework, we characterize the posterior distribution
p(x|y) by either summarizing it in a point estimate or by sampling from it.
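• A short derivation linking the two views (a standard argument, assuming Gaussian noise w ∼ N(0, σ²I)): Bayes' rule gives p(x|y) ∝ p(y|x) p(x), so
−log p(x|y) = (1/(2σ²)) ‖y − Ax‖₂² − log p(x) + const.
The MAP estimate x̂_MAP = arg max_x p(x|y) = arg min_x (1/(2σ²)) ‖y − Ax‖₂² + R(x), with R(x) = −log p(x), i.e., the negative log-prior plays the role of the regularizer in the variational energy.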
Different data-driven image reconstruction techniques
Image Source: SM, A. Hauptmann, O Öktem, M Pereyra, and C.-B. Schönlieb, “Learned reconstruction
methods with convergence guarantees: A survey of concepts and applications," IEEE Signal Processing
Magazine 40 (1), 164-182.
Data-driven post-processing
• Post-processing approaches seek to remove artifacts from the output of a
model-driven approach using machine learning.
yδ −(model-driven)→ x† −(data-driven)→ x̂
• These approaches are simple to implement, but they work as a black box
with limited interpretability.
• An example in the context of tomographic reconstruction:
◦ Create a dataset consisting of n (FBP, ground-truth) pairs of the form (x_i†, x_i), i = 1, …, n, where x_i† = FBP(y_i) are the FBP images.
◦ Train a U-net U_θ to remove artifacts from the FBP images (a minimal PyTorch sketch follows below):
min_θ Σ_{i=1}^{n} ‖x_i − U_θ(x_i†)‖₂².
• Cons: Needs more data to generalize well, does not combine imaging
physics and data in a principled fashion, does not admit a variational
interpretation, and lacks data consistency (i.e., A x_i† ≈ y_i does not imply A U_θ(x_i†) ≈ y_i).
Reference: Jin et al., “Deep convolutional neural network for inverse problems in imaging," IEEE-TIP, 2017.
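A minimal PyTorch sketch of this post-processing setup. The dataset tensors fbp_images and gt_images (shape (n, 1, H, W)) are assumed to be precomputed (e.g., FBPs obtained with odl), and the small residual CNN below merely stands in for the U-net of Jin et al.; all names are illustrative:

import torch
import torch.nn as nn

class PostProcessCNN(nn.Module):
    """Small residual CNN standing in for the U-net: maps an FBP image to a cleaned image."""
    def __init__(self, channels=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 1, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)  # learn the artifact as a residual

def train_postprocessor(fbp_images, gt_images, epochs=50, lr=1e-3):
    """Minimize the mean of ||x_i - U_theta(x_i_dagger)||^2 over the network parameters."""
    model = PostProcessCNN()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = ((model(fbp_images) - gt_images) ** 2).mean()
        loss.backward()
        opt.step()
    return model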
Algorithm unrolling: combining imaging physics with machine learning
Algorithm unrolling: how does it work?
Key idea: Build an optimization-inspired architecture of the reconstruction
network. Each iteration in the optimization algorithm forms a layer in the
reconstruction network.
The parameters θ of the N-layer unrolled network x_N(θ; ·) are trained end-to-end in a supervised fashion:
min_θ (1/n) Σ_{i=1}^{n} ‖x_N(θ; y_δ^(i)) − x^(i)‖².
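A minimal sketch of an unrolled reconstruction network (learned gradient descent) in PyTorch. For simplicity, the forward operator is a plain matrix A acting on flattened image vectors and the per-layer corrections are small MLPs; in the practical, A would be an odl ray transform and the corrections CNNs. The names are illustrative:

import torch
import torch.nn as nn

class UnrolledNet(nn.Module):
    """Each layer mimics one gradient step on the data fidelity, plus a learned correction."""
    def __init__(self, A, n_iter=10, hidden=64):
        super().__init__()
        self.A = A                                            # forward operator, shape (m, d)
        d = A.shape[1]
        self.n_iter = n_iter
        self.step = nn.Parameter(1e-3 * torch.ones(n_iter))   # learned step sizes
        self.corr = nn.ModuleList([
            nn.Sequential(nn.Linear(d, hidden), nn.ReLU(), nn.Linear(hidden, d))
            for _ in range(n_iter)
        ])

    def forward(self, y):
        x = self.A.t() @ y                                    # back-projection as initialization
        for k in range(self.n_iter):
            grad = self.A.t() @ (self.A @ x - y)              # gradient of 0.5 * ||Ax - y||^2
            x = x - self.step[k] * grad + self.corr[k](x)     # learned correction per layer
        return x

# Supervised training: minimize the mean of ||UnrolledNet(y_i) - x_i||^2 over training pairs.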
The learned primal-dual (LPD) approach
• The learned gradient network seeks to learn the proximal operator
corresponding to the regularizer, but the data fidelity remains fixed.
• The learned variant of PDHG, on the other hand, offers the flexibility of
learning both the regularizer and the fidelity (by parameterizing the
proximal operators in both primal and dual spaces using two CNNs).
Image source: Adler and Öktem, "Learned primal-dual reconstruction," IEEE-TMI, 2018.
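A heavily simplified sketch of the LPD iteration, again assuming a matrix forward operator and using small MLPs on flattened vectors (instead of CNNs) as the learned primal and dual "proximal" updates; names are illustrative:

import torch
import torch.nn as nn

def mlp(in_dim, out_dim, hidden=64):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))

class LearnedPrimalDual(nn.Module):
    """Alternates learned updates in the dual (data) space and the primal (image) space."""
    def __init__(self, A, n_iter=10):
        super().__init__()
        self.A = A                                            # forward operator, shape (m, d)
        m, d = A.shape
        self.n_iter = n_iter
        self.dual_nets = nn.ModuleList([mlp(3 * m, m) for _ in range(n_iter)])
        self.primal_nets = nn.ModuleList([mlp(2 * d, d) for _ in range(n_iter)])

    def forward(self, y):
        m, d = self.A.shape
        x = torch.zeros(d)                                    # primal iterate (image)
        h = torch.zeros(m)                                    # dual iterate (data space)
        for k in range(self.n_iter):
            h = h + self.dual_nets[k](torch.cat([h, self.A @ x, y]))      # learned dual update
            x = x + self.primal_nets[k](torch.cat([x, self.A.t() @ h]))   # learned primal update
        return x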
Some key points about algorithm unrolling
Image source: Adler and Öktem, "Learned primal-dual reconstruction," IEEE-TMI, 2018.
Bi-level learning: Supervised learning of a variational image reconstruction operator
Bi-level learning of regularizer parameters – 1
• In unrolling, the reconstructed image cannot be interpreted as a
minimizer of some variational energy function such as (1).
• Can we have a reconstruction operator that indeed corresponds to a
minimizer of a (learned) variational energy?
min_θ g(θ), where g(θ) := (1/N) Σ_{i=1}^{N} ‖x̂^(i)(θ; y^(i)) − x^(i)‖²,
subject to x̂^(i)(θ; y^(i)) ∈ arg min_x J_i(x, θ) := f(y^(i), Ax) + R_θ(x).
∇g(θ) = (1/N) Σ_{i=1}^{N} ∂x̂^(i)(θ; y^(i))^⊤ (x̂^(i)(θ; y^(i)) − x^(i))
= −(1/N) Σ_{i=1}^{N} ∇²_{xθ} J_i(x, θ)^⊤ [∇²_{xx} J_i(x, θ)]^{−1} (x − x^(i)), evaluated at x = x̂^(i)(θ; y^(i)).
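A sketch of how this hypergradient can be computed for one training pair with automatic differentiation: the Hessian inverse is never formed explicitly; instead, a conjugate-gradient solve uses Hessian-vector products. This assumes J(x, θ) is implemented as a differentiable PyTorch function of a flattened image x and a list of parameters theta, with a positive-definite Hessian in x; all names are illustrative:

import torch

def hypergradient(J, x_hat, theta, x_true, cg_iters=20):
    """Implicit-differentiation gradient, for one training pair, of 0.5 * ||x_hat(theta) - x_true||^2."""
    x_hat = x_hat.detach().requires_grad_(True)
    g_x, = torch.autograd.grad(J(x_hat, theta), x_hat, create_graph=True)  # grad_x J at x_hat

    def hvp(v):
        # Hessian-vector product (d^2 J / dx^2) v via a second backward pass
        hv, = torch.autograd.grad(g_x, x_hat, grad_outputs=v, retain_graph=True)
        return hv

    # Conjugate gradient for v = [d^2 J / dx^2]^{-1} (x_hat - x_true)
    b = (x_hat - x_true).detach()
    v = torch.zeros_like(b)
    r, p = b.clone(), b.clone()
    for _ in range(cg_iters):
        Ap = hvp(p)
        alpha = (r * r).sum() / (p * Ap).sum()
        v = v + alpha * p
        r_new = r - alpha * Ap
        p = r_new + ((r_new * r_new).sum() / (r * r).sum()) * p
        r = r_new

    # (d^2 J / dx dtheta)^T v = grad_theta <grad_x J, v>; the overall sign is negative
    mixed = torch.autograd.grad((g_x * v).sum(), theta)
    return [-m for m in mixed]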
• Both unrolling and bi-level learning are supervised, i.e., they need
(x^(i), y^(i)) pairs for training, which are difficult to obtain for practical
problems. Moreover, if the imaging forward operator changes, one needs
to retrain the networks.
How PnP methods work
1. PnP methods use an off-the-shelf image denoiser, which is typically
learned (modern PnP methods), but it can be model-driven too (classical
PnP methods).
2. Gaussian denoisers are image priors in disguise.
For a sufficiently small σ, we can approximate ∇ log p(x) ≈ ∇ log pσ (x), and
then Tweedie’s identity leads to
x_{k+1} = x_k − η ∇f(y_δ, A x_k) + (ηλ/σ²) (D_σ(x_k) − x_k).
The RED-PnP algorithm: x_{k+1} = x_k − η ∇f(y_δ, A x_k) + ηγ (D_σ(x_k) − x_k).
For widely used regularizers such as g(x) = ∥x∥1 , the proximal operator can
essentially be interpreted as a denoiser.
Idea: Replace the proximal operators in proximal algorithms with
off-the-shelf denoisers.
• Can get the PnP variants of algorithms such as PGD, ADMM, etc. by
replacing the proximal operators with denoisers.
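A sketch of one such PnP variant, namely PnP proximal gradient descent (PGD), where the proximal step is replaced by a generic denoiser. A is again a matrix forward operator and denoiser is any callable (e.g., a pre-trained Gaussian denoiser D_σ); names are illustrative:

import torch

def pnp_pgd(y, A, denoiser, step=1e-3, n_iter=100):
    """PnP-PGD: gradient step on the data fidelity, then a denoiser instead of the prox."""
    x = A.t() @ y                                   # back-projection as initialization
    with torch.no_grad():
        for _ in range(n_iter):
            grad = A.t() @ (A @ x - y)              # gradient of 0.5 * ||Ax - y||^2
            x = denoiser(x - step * grad)           # denoiser replaces the proximal operator
    return x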
• First train a denoiser D_σ^(θ) to remove Gaussian noise:
min_θ Σ_{i=1}^{n} ‖D_σ^(θ)(x_i + σ_i ε_i) − x_i‖₂², where σ_i ∼ uniform[0, σ] and ε_i ∼ N(0, I).
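A short sketch of the corresponding training loop, assuming clean_images is a tensor of ground-truth images of shape (n, 1, H, W) and D is any image-to-image network; the noise level is resampled per image, as in the objective above:

import torch

def train_denoiser(D, clean_images, sigma_max=0.1, epochs=100, lr=1e-3):
    """Train D to remove Gaussian noise with noise levels sampled uniformly in [0, sigma_max]."""
    opt = torch.optim.Adam(D.parameters(), lr=lr)
    n = clean_images.shape[0]
    for _ in range(epochs):
        sigma = sigma_max * torch.rand(n, 1, 1, 1)                      # sigma_i ~ uniform[0, sigma]
        noisy = clean_images + sigma * torch.randn_like(clean_images)   # epsilon_i ~ N(0, I)
        loss = ((D(noisy) - clean_images) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return D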
Figure: Compressive image recovery (with 20% subsampling) using PnP vis-à-vis
unrolling (ISTA-Net+). The PnP (AR) method uses a problem-dependent
artifact-removal (AR) operator, while PnP (denoising) uses a pre-trained Gaussian
denoiser.
Image source: Kamilov et al., "Plug-and-play methods for integrating physical and learned models in
computational imaging: Theory, algorithms, and applications," IEEE Signal Processing Magazine, 2023.
Learning a direct regularizer using deep neural networks
Learning an explicit regularizer
• There is a class of techniques that learn a direct regularization function
parameterized by a neural network (denoted as Rθ , where θ represents the
parameters of the network modeling the regularizer).
• The regularizer is generally learned independently of the imaging operator A
and then plugged into the variational framework, which is then minimized
for reconstruction.
• We will learn about two specific techniques for learning explicit
regularization functions.
1. Adversarial regularization (AR): Uses ideas from optimal transport.
2. Network Tikhonov (NETT): Uses an encoder-decoder-based approach to learn
a direct regularizer.
• Adversarial regularization trains R_θ to take small values on ground-truth images (x ∼ πx) and large values on unregularized reconstructions (z ∼ πz), by minimizing
L(θ) := Ex∼πx [Rθ (x)] − Ez∼πz [Rθ (z)] subject to Rθ being 1-Lipschitz. (3)
◦ In practice, (3) is minimized over mini-batches of size n_b, with the Lipschitz constraint enforced softly through a gradient penalty:
L(θ) ≈ (1/n_b) Σ_{j=1}^{n_b} R_θ(x_j) − (1/n_b) Σ_{j=1}^{n_b} R_θ(z_j) + λ_gp · (1/n_b) Σ_{j=1}^{n_b} (‖∇R_θ(x_j^(ϵ))‖₂ − 1)²,
where x_j^(ϵ) is a random convex combination of x_j and z_j.
◦ Update θ^(m) = Adam-optimizer(θ^(m−1), ∇_θ L(θ^(m−1))).
• Output: The trained network with parameters θ^(m) = θ*.
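A sketch of the mini-batch AR loss in PyTorch, with the gradient penalty computed on random interpolates via autograd. Here R is the regularizer network R_θ, and x_batch / z_batch are mini-batches of ground-truth images and unregularized reconstructions of shape (n_b, 1, H, W); names are illustrative:

import torch

def ar_loss(R, x_batch, z_batch, lambda_gp=10.0):
    """R should be small on ground truth, large on reconstructions; the penalty promotes 1-Lipschitzness."""
    loss = R(x_batch).mean() - R(z_batch).mean()

    # Gradient penalty evaluated at random convex combinations of x and z
    eps = torch.rand(x_batch.shape[0], 1, 1, 1)
    x_eps = (eps * x_batch + (1 - eps) * z_batch).requires_grad_(True)
    grad, = torch.autograd.grad(R(x_eps).sum(), x_eps, create_graph=True)
    gp = ((grad.flatten(1).norm(dim=1) - 1) ** 2).mean()
    return loss + lambda_gp * gp

# Per mini-batch: ar_loss(R, x_batch, z_batch).backward(), followed by an Adam step on theta.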
Reconstruction using an AR
min_x ‖y_δ − Ax‖₂² + λ R_θ*(x)
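A sketch of solving this variational problem by gradient descent with a trained regularizer R (its parameters θ* kept fixed), using autograd to differentiate through R. For simplicity, x is a flattened image vector and A a matrix; names are illustrative:

import torch

def reconstruct_with_ar(y, A, R, lam=0.1, step=1e-4, n_iter=500):
    """Gradient descent on ||y - Ax||^2 + lam * R(x) over the image x only."""
    for p in R.parameters():
        p.requires_grad_(False)                 # the regularizer stays fixed at theta*
    x = (A.t() @ y).clone().requires_grad_(True)
    opt = torch.optim.SGD([x], lr=step)
    for _ in range(n_iter):
        opt.zero_grad()
        objective = ((A @ x - y) ** 2).sum() + lam * R(x).sum()
        objective.backward()
        opt.step()
    return x.detach()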
Introducing convexity in the regularizer
• If the regularizer is convex in its input, the overall variational problem is a
convex optimization =⇒ efficient solver and theoretical guarantees.
• Such a regularizer can be constructed by composing simple convex functions (a minimal sketch follows below).
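A minimal sketch of one such construction as an input-convex network: nonnegative weights on the hidden path and convex, nondecreasing activations (here ReLU) make the output convex in the input. This is only illustrative; see the ACR reference below for the architecture actually used:

import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNN(nn.Module):
    """Input-convex network: R(x) is convex in x by construction."""
    def __init__(self, dim, hidden=64, n_layers=3):
        super().__init__()
        self.Wx = nn.ModuleList([nn.Linear(dim, hidden) for _ in range(n_layers)])
        self.Wz = nn.ParameterList([nn.Parameter(0.01 * torch.randn(hidden, hidden))
                                    for _ in range(n_layers - 1)])
        self.w_out = nn.Parameter(0.01 * torch.randn(1, hidden))

    def forward(self, x):
        z = F.relu(self.Wx[0](x))
        for Wz, Wx in zip(self.Wz, self.Wx[1:]):
            # clamping keeps the weights acting on z nonnegative, which preserves convexity in x
            z = F.relu(F.linear(z, Wz.clamp(min=0)) + Wx(x))
        return F.linear(z, self.w_out.clamp(min=0))    # scalar, convex output R_theta(x)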
Panel labels (PSNR, SSIM): LPD (33.62 dB, 0.89), AR (31.83 dB, 0.84), ACR (30.00 dB, 0.82); Ground truth, FBP (21.61 dB, 0.17), TV (25.74 dB, 0.80), LPD (29.51 dB, 0.85), AR (26.83 dB, 0.71), ACR (27.98 dB, 0.84).
Figure: Limited-angle CT reconstruction, along with the respective PSNR and SSIM
values. In this case, ACR outperforms TV and AR in terms of reconstruction quality.
LPD produces incorrect structures not present in the ground truth (unacceptable for
clinical applications).
Reference: SM et al., “Learned convex regularizers for inverse problems,” arXiv:2008.02839v2, 2020.
Some observations