
Part 3, Lecture 2: Learned regularization for image reconstruction
From model-based to data-driven approaches

Subhadip Mukherjee
Indian Institute of Technology
Kharagpur, India
Email: [email protected]
Web: sites.google.com/view/subhadip-mukherjee/home
GitHub: github.com/Subhadip-1
This lecture
• Previously, we have seen how to apply optimization methods to
reconstruct an image by minimizing a variational energy function (data
fidelity plus regularizer).
• In this lecture, we will learn how to formulate different data-adaptive
reconstruction approaches and training strategies by drawing inspiration
from the variational framework and various optimization algorithms.
• We will study two important classes of techniques for learning an image
reconstruction operator and two representative methods from each class.
1. Supervised approaches
1.1. Algorithm unrolling
1.2. Bi-level learning
2. Unsupervised and weakly supervised approaches
2.1. Plug-and-play (PnP) denoising algorithms
2.2. Adversarial regularization

• You will get an overview of the pros and cons of these approaches and learn
which approach is suitable in what context. You will learn to implement a
simple unrolling approach in the practical (using odl and pytorch).
Variational image reconstruction: A Bayesian view
• The Bayesian approach of reconstruction views the image x and the
corresponding data y as two random variables related as y = Ax + w.
• In the Bayesian framework, we characterize the posterior distribution
p(x|y) by either summarizing it in a point estimate or by sampling from it.

  Bayes rule:  p(x|y) = p_data(y|x) p_prior(x) / p_y(y)

• The Bayes maximum a-posteriori probability (MAP) estimate is the
  maximizer of the (log-)posterior distribution p(x|y):

  MAP estimate:  x̂_MAP = arg min_x [− log p(x|y)]
                        ≡ arg min_x [− log p_data(y|x) − log p_prior(x)].

• When w is Gaussian and p_prior(x) ∝ exp(−γ R(x)), we have

    x̂_MAP = arg min_x ∥y − Ax∥₂² + λ R(x),

  where λ absorbs γ and the noise variance.

• One might be interested in other statistical estimators (besides MAP).


The inadequacy of model-driven approaches

Recall the model-based variational approach:

  x_λ(y_δ) = arg min_{x∈ℝᵈ}  f(y_δ, A x) + λ R(x),        (1)

where f(y_δ, A x) is the data fidelity and R(x) is the regularizer.

• It is generally difficult to handcraft a regularizer R that models images well
  across different applications.
• Solving the optimization problem (1) above using one of the iterative
algorithms that we studied before could take a few thousand iterations to
converge. This leads to a slow and computationally inefficient
reconstruction, especially for large images.
• Learning the regularizer from training data can lead to a significant
improvement in image quality, as we will see next.
A motivating example

Figure: Ground truth compared with a model-based and a data-driven reconstruction.
Different data-driven image reconstruction techniques

Figure: Categorization of data-driven reconstruction approaches, which are
color-coded based on the strongest type of convergence guarantee they satisfy.

Image source: SM, A. Hauptmann, O. Öktem, M. Pereyra, and C.-B. Schönlieb, “Learned
reconstruction methods with convergence guarantees: A survey of concepts and
applications,” IEEE Signal Processing Magazine 40(1), 164–182.
Data-driven post-processing
• Post-processing approaches use machine learning to remove artifacts from a
  model-driven reconstruction.

    y_δ  —(model-driven)→  x†  —(data-driven)→  x̂

• These approaches are simple to implement, but they work as a black box
with limited interpretability.
• An example in the context of tomographic reconstruction:
  ◦ Create a dataset consisting of (FBP, ground-truth) pairs {(x_i†, x_i)}_{i=1}^n,
    where x_i† = FBP(y_i) are the FBP images.
  ◦ Train a U-net U_θ to remove artifacts from the FBP images (sketched in code below):

      min_θ  Σ_{i=1}^n ∥ x_i − U_θ(x_i†) ∥₂² .

• Cons: Needs more data to generalize well, does not combine imaging
  physics and data in a principled fashion, does not admit a variational
  interpretation, and lacks data consistency (i.e., A x_i† ≈ y_i does not imply
  A U_θ(x_i†) ≈ y_i).
Reference: Jin et al., “Deep convolutional neural network for inverse problems in imaging," IEEE-TIP, 2017.
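A rough pytorch sketch of the U-net training loop above; the `UNet` module and a dataset yielding (FBP, ground-truth) tensor pairs are assumed to exist elsewhere (their names are placeholders, not part of the lecture material):

```python
import torch
from torch.utils.data import DataLoader

def train_postprocessor(unet, dataset, epochs=20, lr=1e-4, batch_size=8):
    """Minimize sum_i ||x_i - U_theta(x_i^dagger)||_2^2 over the U-net parameters.
    `unet` and `dataset` (yielding (fbp, ground_truth) tensor pairs) are placeholders."""
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    opt = torch.optim.Adam(unet.parameters(), lr=lr)
    for _ in range(epochs):
        for fbp, gt in loader:
            loss = ((unet(fbp) - gt) ** 2).mean()  # squared-error training loss
            opt.zero_grad()
            loss.backward()
            opt.step()
    return unet
```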
Algorithm unrolling: combining imaging physics with machine learning
Algorithm unrolling: how does it work?
Key idea: Build an optimization-inspired architecture of the reconstruction
network. Each iteration in the optimization algorithm forms a layer in the
reconstruction network.

(Proximal) gradient descent
  • Initialize x₀, choose step-size η.
  • Iterate until convergence:
      x_{k+1} = prox_{λR}( x_k − η ∇f(y_δ, A x_k) ).

Learned gradient descent
  • Initialize x₀.
  • Iterate for N ≈ 10 steps:
      x_{k+1} = h_{θ_k}( x_k, ∇f(y_δ, A x_k) ),
    where h_{θ_k} is a CNN with learnable parameters.

Training an unrolled network


Learn θ by using back-propagation on the (supervised) training loss:

    min_θ  (1/n) Σ_{i=1}^n ∥ x_N(θ; y_δ^{(i)}) − x^{(i)} ∥₂² .
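A minimal pytorch sketch of such an unrolled (learned gradient descent) network, assuming the forward operator A is given as a dense matrix acting on flattened images (in the practical, an odl operator wrapped for pytorch would take its place); the CNN architecture and all sizes are illustrative choices:

```python
import torch
import torch.nn as nn

class LearnedGradientDescent(nn.Module):
    """Unrolled gradient descent: each of the N iterations is a small CNN
    h_{theta_k}(x_k, grad f(y, A x_k)). A is a dense matrix here for simplicity."""
    def __init__(self, A, n_iter=10, img_shape=(64, 64)):
        super().__init__()
        self.n_iter, self.img_shape = n_iter, img_shape
        self.register_buffer("A", A)  # forward operator (illustrative)
        self.blocks = nn.ModuleList([
            nn.Sequential(nn.Conv2d(2, 32, 3, padding=1), nn.PReLU(),
                          nn.Conv2d(32, 32, 3, padding=1), nn.PReLU(),
                          nn.Conv2d(32, 1, 3, padding=1))
            for _ in range(n_iter)])

    def forward(self, y):
        H, W = self.img_shape
        x = torch.zeros(y.shape[0], H * W, device=y.device)  # x_0 = 0
        for block in self.blocks:
            residual = x @ self.A.T - y                      # A x - y (batched)
            grad = residual @ self.A                         # gradient of 0.5||A x - y||^2
            inp = torch.stack([x.view(-1, H, W), grad.view(-1, H, W)], dim=1)
            x = x + block(inp).view(-1, H * W)               # x_{k+1} = h_{theta_k}(x_k, grad)
        return x.view(-1, 1, H, W)

# Supervised training (sketch): for (y, x_true) pairs,
#   loss = ((net(y) - x_true) ** 2).mean(); loss.backward(); optimizer.step()
```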
The learned primal-dual (LPD) approach
• The learned gradient network seeks to learn the proximal operator
corresponding to the regularizer, but the data fidelity remains fixed.
• The learned variant of PDHG, on the other hand, offers the flexibility of
learning both the regularizer and the fidelity (by parameterizing the
proximal operators in both primal and dual spaces using two CNNs).

PDHG iterations
  ◦ u_{k+1} = prox_{σ f*}( u_k + σ A x̄_k )
  ◦ x_{k+1} = prox_{τ g}( x_k − τ A⊤ u_{k+1} )
  ◦ x̄_{k+1} = x_{k+1} + θ ( x_{k+1} − x_k )

Learned primal-dual (LPD)
  ◦ u_{k+1} = ϕ_{θ_k}( u_k, y_δ, σ_k A x_k )
  ◦ x_{k+1} = ψ_{γ_k}( x_k, τ_k A⊤ u_{k+1} )
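A compact sketch of the corresponding LPD forward pass; `fwd` and `adj` are assumed callables applying A and A⊤ to batched tensors (e.g., odl operators wrapped for pytorch), and the tiny residual CNNs, without the extra memory channels of the original architecture, are illustrative simplifications:

```python
import torch
import torch.nn as nn

def conv_block(in_ch):
    # tiny CNN used for both the dual (phi) and primal (psi) updates
    return nn.Sequential(nn.Conv2d(in_ch, 32, 3, padding=1), nn.PReLU(),
                         nn.Conv2d(32, 1, 3, padding=1))

class LPD(nn.Module):
    """Sketch of the learned primal-dual network: phi_{theta_k} updates the dual
    variable from (u_k, y, A x_k); psi_{gamma_k} updates the primal variable from
    (x_k, A^T u_{k+1}). `fwd`/`adj` are placeholder operator callables."""
    def __init__(self, fwd, adj, n_iter=10):
        super().__init__()
        self.fwd, self.adj = fwd, adj
        self.dual_nets = nn.ModuleList([conv_block(3) for _ in range(n_iter)])    # phi_{theta_k}
        self.primal_nets = nn.ModuleList([conv_block(2) for _ in range(n_iter)])  # psi_{gamma_k}

    def forward(self, y, x0):
        x, u = x0, torch.zeros_like(y)
        for phi, psi in zip(self.dual_nets, self.primal_nets):
            u = u + phi(torch.cat([u, y, self.fwd(x)], dim=1))  # u_{k+1} = phi(u_k, y, A x_k)
            x = x + psi(torch.cat([x, self.adj(u)], dim=1))     # x_{k+1} = psi(x_k, A^T u_{k+1})
        return x
```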

Pros and cons of LPD (and unrolling in general)


• Pros: Fast and efficient reconstruction, data-efficient (generalizes well
with fewer training examples), parsimonious parameterization.
• Cons: Lack theoretical guarantees (unless parameterized and trained in a
specific fashion), training could be resource-intensive (due to the presence
of A and A⊤ in the architecture).
Schematic of the LPD network

Figure: A schematic diagram of the learned primal-dual (LPD) reconstruction network
for X-ray CT reconstruction.

Image source: Adler and Öktem, "Learned primal-dual reconstruction," IEEE-TMI, 2018.
Some key points about algorithm unrolling

• Unrolling exploits the modular structure of iterative optimization algorithms:
  only the component (proximal operator) related to the image prior (regularizer)
  is learned.
• The choice of the optimization algorithm dictates the architecture of the
reconstruction network.
• A trained unrolled network cannot generally be interpreted as a variational
  minimizer (although its architecture is inspired by an optimization
  algorithm). When trained with a squared-error loss, one can instead interpret
  the unrolled reconstruction as an approximation of the conditional mean of the
  true image x given its noisy measurement y (denoted E[x|y]).
Learned primal-dual (LPD) for low-dose CT

Image source: Adler and Öktem, "Learned primal-dual reconstruction," IEEE-TMI, 2018.
Bi-level learning: Supervised learning of a variational image reconstruction operator
Bi-level learning of regularizer parameters – 1
• In unrolling, the reconstructed image cannot be interpreted as a
minimizer of some variational energy function such as (1).
• Can we have a reconstruction operator that indeed corresponds to a
minimizer of a (learned) variational energy?

Bi-level learning of regularizer

  min_θ  g(θ) := (1/N) Σ_{i=1}^N ∥ x̂^{(i)}(θ; y^{(i)}) − x^{(i)} ∥₂²

  subject to  x̂^{(i)}(θ; y^{(i)}) ∈ arg min_x  f(y^{(i)}, A x) + R_θ(x) =: J_i(x, θ).

• Unlike unrolling, the reconstruction operator x̂^{(i)}(θ; y^{(i)}) in bi-level
  learning corresponds to the minimizer of a variational energy J_i(x, θ).
Reference: Crockett and Fessler, “Bilevel methods for image reconstruction," arXiv:2109.09610v2, 2021.
Bi-level learning of regularizer parameters – 2
For training, we need to differentiate the upper-level loss g(θ) w.r.t. the
learnable parameter θ. Thanks to the implicit function theorem, this can be
obtained as

  ∇g(θ) = (1/N) Σ_{i=1}^N ∂_θ x̂^{(i)}(θ; y^{(i)})⊤ ( x̂^{(i)}(θ; y^{(i)}) − x^{(i)} )

        = −(1/N) Σ_{i=1}^N ∇²_{xθ} J_i(x, θ)⊤ ∇²_{xx} J_i(x, θ)^{−1} ( x − x^{(i)} ),
          evaluated at x = x̂^{(i)}(θ; y^{(i)}).

• Computing the upper-level gradient w.r.t. θ needs derivatives of the lower-level
  loss w.r.t. both x and θ, evaluated at the true lower-level solution x̂^{(i)}(θ; y^{(i)}).
• However, ^x(i) (θ; y(i) ) is typically only approximated using an iterative
optimization algorithm, and therefore all these quantities are only known
approximately, leading to an inaccurate gradient ∇g(θ).
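A sketch of how the implicit gradient above can be computed in pytorch for a single training pair, using double backpropagation for Hessian-vector products and a conjugate-gradient solve; `lower_energy` and the single parameter tensor `theta` (with requires_grad=True) are illustrative assumptions:

```python
import torch

def hypergradient(x_hat, x_true, theta, lower_energy, cg_iters=20):
    """Implicit-function-theorem gradient of g(theta) = 0.5*||x_hat(theta) - x_true||^2
    for one training pair. `lower_energy(x, theta)` is assumed to return the
    lower-level energy J(x, theta); x_hat is an (approximate) minimizer of it."""
    x = x_hat.detach().requires_grad_(True)
    grad_x = torch.autograd.grad(lower_energy(x, theta), x, create_graph=True)[0]

    def hvp(v):  # Hessian-vector product (d^2 J / dx^2) v via double backprop
        return torch.autograd.grad(grad_x, x, grad_outputs=v, retain_graph=True)[0]

    # Conjugate gradient solve: v = (d^2 J / dx^2)^{-1} (x_hat - x_true)
    b = (x_hat - x_true).detach()
    v, r = torch.zeros_like(b), b.clone()
    p, rs_old = r.clone(), b.flatten() @ b.flatten()
    for _ in range(cg_iters):
        Hp = hvp(p)
        alpha = rs_old / (p.flatten() @ Hp.flatten() + 1e-12)
        v, r = v + alpha * p, r - alpha * Hp
        rs_new = r.flatten() @ r.flatten()
        if rs_new < 1e-12:
            break
        p, rs_old = r + (rs_new / rs_old) * p, rs_new

    # grad g(theta) = - (d^2 J / dtheta dx) v, via differentiating <grad_x J, v> w.r.t. theta
    mixed = torch.autograd.grad((grad_x * v).sum(), theta)[0]
    return -mixed
```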
Bi-level learning for image denoising

Figure: Bi-level learning of the regularizer (with two different parameterizations) for
image denoising. Panels: ground truth; noisy input (20.28 dB); fields-of-experts
regularizer (30.01 dB); smoothed TV (29.33 dB). (Source: Salehi et al.,
arXiv:2308.10098, 2023.)
Plug-and-play (PnP) algorithms: Using image denoisers to solve image reconstruction problems
Plug-and-play (PnP) algorithms

• Both unrolling and bi-level learning are supervised, i.e., they need (x^{(i)}, y^{(i)})
  pairs for training, which are difficult to obtain for practical problems. Moreover,
  if the imaging forward operator changes, one needs to retrain the networks.
How PnP methods work
1. PnP methods use an off-the-shelf image denoiser, which is typically
learned (modern PnP methods), but it can be model-driven too (classical
PnP methods).
2. Gaussian denoisers are image priors in disguise.

  Tweedie’s identity:   E[x* | x] − x = σ² ∇ log p_σ(x),

  where E[x* | x] is the (MMSE) denoiser D_σ(x) and ∇ log p_σ(x) is the gradient of
  the log-prior (the score).

• x*: clean image; x: noisy image, x = x* + Gaussian noise with variance σ²; p_σ:
  probability density of x.
Two variants of PnP: RED-PnP and proximal PnP – 1
Regularization by denoising (RED)
Recall the MAP estimation problem given by

  min_{x∈ℝⁿ}  f(y_δ, A x) − λ log p(x).        (2)

Gradient descent for (2) takes the form

  x_{k+1} = x_k − η ∇f(y_δ, A x_k) + η λ ∇ log p(x_k).

For a sufficiently small σ, we can approximate ∇ log p(x) ≈ ∇ log p_σ(x), and
then Tweedie’s identity leads to

  x_{k+1} = x_k − η ∇f(y_δ, A x_k) + (η λ / σ²) ( D_σ(x_k) − x_k ).

The RED-PnP algorithm:  x_{k+1} = x_k − η ∇f(y_δ, A x_k) + η γ ( D_σ(x_k) − x_k ),
with γ absorbing λ/σ².

Note: Although a practical Gaussian denoiser is not always the gradient of an
underlying potential, the RED algorithm works surprisingly well in practice.
Reference: Reehorst and Schniter, “Regularization by denoising: clarifications and new interpretations,”
IEEE-TCI, 2019.
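A minimal sketch of the RED-PnP update above for the quadratic fidelity f(y, Ax) = ½∥Ax − y∥₂²; `fwd`, `adj`, and the pretrained `denoiser` are placeholder callables on image tensors:

```python
import torch

def red_pnp(y, fwd, adj, denoiser, eta=1e-3, gamma=1.0, n_iter=200, x0=None):
    """RED-PnP: x_{k+1} = x_k - eta*grad f(y, A x_k) + eta*gamma*(D_sigma(x_k) - x_k),
    with f(y, Ax) = 0.5*||A x - y||^2. `fwd`/`adj` apply A and A^T; `denoiser` is a
    pretrained Gaussian denoiser (all placeholder callables)."""
    x = adj(y) if x0 is None else x0
    for _ in range(n_iter):
        with torch.no_grad():
            grad_f = adj(fwd(x) - y)  # gradient of the data fidelity
            x = x - eta * grad_f + eta * gamma * (denoiser(x) - x)
    return x
```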
Two variants of PnP: RED-PnP and proximal PnP – 2
Proximal PnP
Recall proximal gradient descent:

xk+1 = proxηg (xk − η ∇f (xk )) .

For widely used regularizers such as g(x) = ∥x∥1 , the proximal operator can
essentially be interpreted as a denoiser.
Idea: Replace the proximal operators in proximal algorithms with
off-the-shelf denoisers.
• Can get the PnP variants of algorithms such as PGD, ADMM, etc. by
replacing the proximal operators with denoisers.
• First train a denoiser D_σ^{(θ)} to remove Gaussian noise:

    min_θ  Σ_{i=1}^n ∥ D_σ^{(θ)}(x_i + σ_i ε_i) − x_i ∥₂² ,
    where σ_i ∼ uniform[0, σ] and ε_i ∼ N(0, I).

• Plug it inside a proximal algorithm (see the sketch below).
• The denoiser is trained independently of the imaging operator.
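A minimal sketch of the resulting PnP iteration for proximal gradient descent with the quadratic fidelity; `fwd`, `adj`, and the pretrained `denoiser` are again placeholder callables defined elsewhere:

```python
import torch

def pnp_pgd(y, fwd, adj, denoiser, eta=1e-3, n_iter=200, x0=None):
    """Proximal PnP with gradient descent: the proximal step prox_{eta g} is
    replaced by a pretrained Gaussian denoiser (placeholder callable)."""
    x = adj(y) if x0 is None else x0
    for _ in range(n_iter):
        with torch.no_grad():
            x = x - eta * adj(fwd(x) - y)  # gradient step on 0.5*||A x - y||^2
            x = denoiser(x)                # "prox" step: plug in the denoiser
    return x
```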
PnP for compressive image recovery

Figure: Compressive image recovery (with 20% subsampling) using PnP vis-à-vis
unrolling (ISTA-Net+). The PnP (AR) method uses a problem-dependent
artifact-removal (AR) operator, while PnP (denoising) uses a pre-trained Gaussian
denoiser.

Image source: Kamilov et al., "Plug-and-play methods for integrating physical and learned models in
computational imaging: Theory, algorithms, and applications," IEEE Signal Processing Magazine, 2023.
Learning a direct regularizer using deep neural networks
Learning an explicit regularizer
• There is a class of techniques that learn a direct regularization function
parameterized by a neural network (denoted as Rθ , where θ represents the
parameters of the network modeling the regularizer).
• The regularizer is generally learned independently of the imaging operator A
  and then plugged into the variational framework, which is then minimized
  for reconstruction.
• We will learn about two specific techniques for learning explicit
regularization functions.
1. Adversarial regularization (AR): Uses ideas from optimal transport.
2. Network Tikhonov (NETT): Uses an encoder-decoder-based approach to learn
a direct regularizer.

• The main advantage of these frameworks is that the training is unsupervised,
  i.e., one does not need (measurement, ground-truth) pairs to train the
  regularizer. This makes these approaches generalizable to different imaging
  operators (at least in principle).
Adversarial regularization (AR)
The key idea is to train a regularizer such that it is small on clean
(ground-truth) images and large on images with artifacts.

L(θ) := Ex∼πx [Rθ (x)] − Ez∼πz [Rθ (z)] subject to Rθ being 1-Lipschitz. (3)

• Here, π_x and π_z denote the distributions of clean and artifact-ridden images.
• Rθ is a (convolutional) neural network with learnable weights and biases.
• The training dataset consists of a set of clean images (x_i)_{i=1}^{N₁} drawn from π_x
  and a set of images (z_i)_{i=1}^{N₂} that can be obtained using a cheap model-based
  approach (e.g., the pseudo-inverse solution z_i = A† y_i).
• The clean images and the artifact-ridden images do not have to correspond
to each other, alleviating the need for strict supervision.
• The 1-Lipschitz condition on the regularizer makes the training problem
well-posed and leads to nice theoretical interpretations of (3) and the
resulting variational problem using concepts from optimal transport.
Reference: S. Lunz, O. Öktem, and C.-B. Schönlieb, “Adversarial regularizers in inverse problems,”
NeurIPS-2018.
Training and reconstruction for AR
Training an AR
• Input: A penalty parameter λgp (to enforce the Lipschitz constraint), initial
value of the network parameter(s) θ(0) .
• for mini-batches m = 1, 2, · · · , do (until convergence):
  ◦ Sample x_j ∼ π_x, z_j ∼ π_z, and ε_j ∼ uniform[0, 1], for 1 ⩽ j ⩽ n_b, where n_b is the
    mini-batch size.
  ◦ Compute x_j^{(ε)} = ε_j x_j + (1 − ε_j) z_j.
  ◦ Compute the training loss L(θ) for the m-th mini-batch:

      L(θ) = (1/n_b) Σ_{j=1}^{n_b} R_θ(x_j) − (1/n_b) Σ_{j=1}^{n_b} R_θ(z_j)
             + λ_gp · (1/n_b) Σ_{j=1}^{n_b} ( ∥∇R_θ(x_j^{(ε)})∥₂ − 1 )².

  ◦ Update θ^{(m)} = Adam-optimizer( θ^{(m−1)}, ∇_θ L(θ^{(m−1)}) ).
• Output: The trained network with parameters θ^{(m)} = θ*.

Reconstruction using an AR
  min_x  ∥ y_δ − A x ∥₂² + λ R_{θ*}(x)
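A sketch of one mini-batch update implementing the training loss above in pytorch; the regularizer network `R`, the optimizer `opt`, and the data batches are assumed to exist elsewhere:

```python
import torch

def ar_training_step(R, opt, x_clean, z_artifact, lambda_gp=10.0):
    """One mini-batch update for an adversarial regularizer R (scalar-output CNN):
    L(theta) = mean R(x) - mean R(z) + lambda_gp * mean (||grad R(x_eps)||_2 - 1)^2,
    with x_eps a random convex combination of clean and artifact-ridden images."""
    eps = torch.rand(x_clean.shape[0], 1, 1, 1, device=x_clean.device)
    x_eps = (eps * x_clean + (1 - eps) * z_artifact).requires_grad_(True)

    # gradient penalty enforcing the (soft) 1-Lipschitz constraint
    grad = torch.autograd.grad(R(x_eps).sum(), x_eps, create_graph=True)[0]
    grad_norm = grad.flatten(1).norm(dim=1)
    loss = (R(x_clean).mean() - R(z_artifact).mean()
            + lambda_gp * ((grad_norm - 1) ** 2).mean())

    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Reconstruction then fixes θ* and minimizes ∥y_δ − Ax∥₂² + λ R_{θ*}(x) over x, e.g., with gradient descent driven by autograd.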
Introducing convexity in the regularizer
• If the regularizer is convex in its input, the overall variational problem is a
  convex optimization problem ⇒ efficient solvers and theoretical guarantees.
• Such a regularizer can be constructed by composing simple convex functions.

Figure: Adversarial convex regularizer: The blue rectangles indicate convolutional
layers with non-negative weights, and the orange rectangles represent standard
convolutional layers (with no restrictions on the weights). The gray triangle at the
end denotes an average-pooling operation. The (pointwise) nonlinear activation φ needs
to be convex and monotonically increasing to preserve convexity. This requirement is
already satisfied by activations such as ReLU or leaky-ReLU.

• The architecture above is the so-called input-convex neural network (ICNN),
  first proposed by Amos et al. (ICML-2017); a minimal pytorch sketch follows below.
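A minimal sketch of such an input-convex regularizer; the layer sizes and the use of leaky-ReLU are illustrative assumptions, not the exact architecture from the figure:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleICNN(nn.Module):
    """Sketch of an input-convex CNN regularizer R_theta(x). Convexity in x holds because
    the skip ("orange") convolutions act directly on x (affine in x), the feature-path
    ("blue") convolutions use non-negative weights, leaky-ReLU is convex and
    non-decreasing, and the final average pooling is a non-negative linear map."""
    def __init__(self, channels=32, n_layers=3):
        super().__init__()
        self.skip = nn.ModuleList(
            [nn.Conv2d(1, channels, 5, padding=2) for _ in range(n_layers)])
        self.nonneg = nn.ModuleList(
            [nn.Conv2d(channels, channels, 5, padding=2, bias=False)
             for _ in range(n_layers - 1)])

    def forward(self, x):
        z = F.leaky_relu(self.skip[0](x))
        for conv_z, conv_x in zip(self.nonneg, self.skip[1:]):
            # clamping enforces non-negative weights on the feature path
            z = F.leaky_relu(F.conv2d(z, conv_z.weight.clamp(min=0), padding=2)
                             + conv_x(x))
        return z.mean(dim=(1, 2, 3))  # global average pooling -> one scalar per image
```

In practice one would also project the feature-path weights onto the non-negative orthant after every optimizer step; clamping them in the forward pass, as above, is simply the shortest way to keep the sketch convex by construction.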
Performance on sparse-view CT reconstruction

Figure: Sparse-view CT reconstruction: Comparison of different reconstruction
methods in terms of the PSNR and SSIM of the reconstructed image. Panels:
ground truth; FBP (21.63 dB, 0.24); TV (29.25 dB, 0.79); LPD (33.62 dB, 0.89);
AR (31.83 dB, 0.84); ACR (30.00 dB, 0.82).
Performance on limited-angle CT reconstruction

Panels: Ground truth; FBP: 21.61 dB, 0.17; TV: 25.74 dB, 0.80; LPD: 29.51 dB, 0.85;
AR: 26.83 dB, 0.71; ACR: 27.98 dB, 0.84.
Figure: Limited-angle CT reconstruction, along with the respective PSNR and SSIM
values. In this case, ACR outperforms TV and AR in terms of reconstruction quality.
LPD produces incorrect structures not present in the ground truth (unacceptable for
clinical applications).

Reference: SM et al., “Learned convex regularizers for inverse problems,” arXiv:2008.02839v2, 2020.
Some observations

• Learning the regularizer works better than handcrafting it.


• Unrolling-based (supervised) methods perform the best if well-trained,
but can lead to undesirable artifacts in the reconstruction for severely
ill-conditioned imaging operators (such as limited-angle CT).
• When the forward operator is less ill-posed (e.g., in sparse-view CT),
introducing convexity in the regularizer reduces its expressive power,
leading to degradation in image quality.
• When the problem is more severely ill-posed (e.g., in limited-angle CT),
the inductive bias of convexity helps achieve better reconstruction.
• More research is needed to bridge the gap between good numerical
performance and interpretability.
Training the regularizer using the NETT approach
• Parameterize the regularizer as R_θ(x) = Σ_i β_i |[ϕ_θ(x)]_i|^q, where β_i > 0 and
  ϕ_θ is a deep neural network.

Figure: Training a NETT regularizer (reference: Li et al., “Solving inverse
problems with deep neural networks,” Inverse Problems, 2020).
• Training data: (image, artifact) pairs {(x_j, r_j)}_{j=1}^{n₁+n₂}. For j = 1, …, n₁,
  r_j = x_j − A† y_j, and for j = n₁ + 1, …, n₁ + n₂, r_j = 0.
• Training loss: J(ν, θ) = Σ_{j=1}^{n₁+n₂} loss( Ψ_ν(ϕ_θ(x_j)), r_j )  ⇒ the regularizer
  takes a small (large) value when the image is clean (artifact-ridden).
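A short sketch of evaluating such a regularizer, assuming a trained encoder network playing the role of ϕ_θ and a positive weight tensor β are given (placeholder names):

```python
import torch

def nett_regularizer(x, encoder, beta, q=2):
    """NETT-style regularizer R_theta(x) = sum_i beta_i * |[phi_theta(x)]_i|^q.
    `encoder` stands in for phi_theta (trained elsewhere); `beta` is a tensor of
    positive weights broadcastable to the encoder output."""
    features = encoder(x)
    return (beta * features.abs() ** q).sum()

# Reconstruction then minimizes f(y, A x) + lambda * nett_regularizer(x, ...) over x,
# e.g., by gradient descent on x using torch autograd.
```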
Summary and key takeaways
• The modularity of the variational framework and proximal algorithms for
minimization opens up the possibility of incorporating imaging physics
within data-driven approaches.
• Modeling the proximal operator with a neural network and learning it in a
data-adaptive fashion enables data-driven regularization, which
significantly outperforms model-driven (analytical/hand-crafted)
regularizers.
• Supervised techniques such as algorithm unrolling generally perform better
  empirically than unsupervised approaches, but they do not easily admit a
  variational interpretation.
• Unsupervised approaches are more flexible in terms of requirements on
the training data.
• Regularizers can be learned implicitly (e.g., using a denoiser) or in a more
direct fashion (e.g., adversarial regularization).
