Image Reconstruction Using Deep Learning

Image Reconstruction Using Deep Learning
Po-Yu Liu, 3035123887, The University of Hong Kong

Supervisor: Prof. Edmund Y. Lam
2nd examiner: Dr. Yik-Chung Wu
Abstract—This paper proposes a deep learning Due to cost and space efficiency, mobiles phones are
architecture that attains statistically significant usually shipped with lower-grade camera lenses and
improvements over traditional algorithms in Poisson sensors. When taking pictures, especially in the
image denoising espically when the noise is strong. nighttime, the resulting images are usually plagued with
Poisson noise commonly occurs in low-light and photon-
dirty pixels, which is the image noise. With the
limited settings, where the noise can be most accurately
modeled by the Poission distribution. Poisson noise increasing prevalence of mobiles devices, the necessity
traditionally prevails only in specific fields such as for an effective noise-removing algorithm is also
astronomical imaging. However, with the booming increased.
market of surveillance cameras, which commonly operate Image noise can be further categorized by how it is
in low-light environments, or mobile phones, which
modeled mathematically. When an image is taken in a
produce noisy night scene pictures due to lower-grade
sensors, the necessity for an advanced Poisson image bright setting, the Gaussian distribution can most
denoising algorithm has increased. Deep learning has conveniently approximate the noise characteristics [7].
achieved amazing breakthroughs in other imaging The resulting noise is the called Gaussian noise, a most
problems, such image segmentation and recognition, and common and most extensively researched type of noise.
this paper proposes a deep learning denoising network Another type of noise, the Poisson noise, models the
that outperforms traditional algorithms in Poisson brightness of a pixel as proportional to the number of
denoising especially when the noise is strong. The independently arriving photons. Since the photons
architecture incorporates a hybrid of convolutional and arrive independently of each other and arrive at a fixed
deconvolutional layers along with symmetric connections. rate, we can assume the pixel brightness as being
The denoising network achieved statistically significant
sampled from a Poisson distribution [8]. Poisson noise
0.38dB, 0.68dB, and 1.04dB average PSNR gains over
benchmark traditional algorithms in experiments with is most significant in low-light and photo-limited
image peak values 4, 2, and 1. The denoising network can settings, and typical cases include night scenes, medical
also operate with shorter computational time while still imaging [9] and astronomical images [10]. Poisson
outperforming the benchmark algorithm by tuning the noise was less researched because it was far less
reconstruction stride sizes. common than Gaussian noise. But due of the booming
number of smart phones, more and more Poisson-noisy
Keywords: Image reconstruction, deep learning, night pictures are produced. Therefore, this paper only
convolutional neural network, image denoising, Poisson focuses on removing the Poisson noise, although the
noise. algorithm proposed in this paper might be able to adapt
to other types of noise with minimal modifications.
Previous research on denoising Poisson noise
1 INTRODUCTION includes VST+BM3D [11], [12], non-local space PCA
[13], and sparse coding with dictionary learning [14].
Image reconstruction, or image restoration, refers to Although these algorithms achieved varying degrees of
success, a pursuit for a more effective and efficient
recovering the original clean images from corrupted
denoising algorithm has never stopped. A popular
ones. The corruption arises in various forms, such as
motion blur, low resolution, and the topic of this paper: technique in machine learning, deep learning, has been
showing superior performance in other imaging
noise. Image noise refers the variations of color and
problems, such as image recognition [15], [16]
brightness in an image with respect to an ideal image of
the real scene. Image noise originates from the segmentation [17], and captioning [18], and is also a top
atmospheric disturbances [1], heat in semiconductor candidate in addressing the denoising problem.
Traditional algorithms are often limited in
devices [2], or simply the stochastic process of
representation complexity and number of parameters.
incoming photons [3]. Visually, the noise adds “dirty”
grains with random intensity to the images, which in For example, sparse coding with dictionary learning
assumes that each pixel can be linearly reconstructed
some cases severely degrades visual pleasure and image
from a sparse dictionary. Such a linearity assumption,
details such as edges [4]. Image noise is ubiquitous due
to lack of light, or imperfect camera sensors [5], [6]. however, oversimplifies the nature of image noise,
1
which is often applied in a non-linear fashion. Deep The following sections are arranged as follows.
learning, on the other hand, learns parameters from data Section 2 summarizes previous works in image
without human intervention. Moreover, deep learning denoising via both traditional and deep learning
does not assume linearity and can learn arbitrarily algorithms. Section 3 explains the characteristics of the
complex non-linear transformation [19], [20]. For Poisson denoising problem. Section 4 details the design
example, a deep learning model can learn a complex of experiments for both the benchmark algorithm and
transformation from noisy images to clean images, and the proposed deep learning network. Section 5 reports
such a transformation constitutes a denoising algorithm. both the qualitative and quantitative performance of the
Burger, Schuler, and Harmeling [21] demonstrated that proposed deep learning architecture, along with the
a simple feed-forward neural network could achieve the effect of stride sizes and noise level on the denoising
same level of performance as BM3D in Gaussian performance. Section 6 concludes that the proposed
denoising. Other researchers also demonstrated the network can achieve statistically significant
potential of convolutional neural network in image improvement over traditional algorithms especially
denoising [22], [23], [24], [25], [26]. The success stories when the Poisson noise strong.
prompted me into addressing the Poisson denoising
problem with deep learning.
2 RELATED WORKS
The major contributions of this paper can be
summarized as follows: A popular non-machine learning approach [29], [30],
[31] for Poisson denoising depends on variance-
stabilizing transformation (VST) that transforms
• I proposed a deep convolutional neural network for Poisson noisy images into Gaussian images, such as
Poisson denoising. The network incorporates two Anscombe [32] and Fisz [33], [34], and then apply
designs in other works: convolutional autoencoders effective Gaussian denosing algorithms [35], [36], [37],
[25] and symmetric connections [26]. [38], [11] to denoise images. Among Gaussian
Autoencoders [27] compress the input to compact denoising algorithms, Block-matching and 3D Filtering
representations via the bottleneck design. The (BM3D) [11], a pioneering non-machine learning
compact representations only retain principle Gaussian denoising algorithm, is frequently employed
elements of the input while discarding insignificant with VST for Poisson denoising; this combination is
information such as noise. Therefore, simple often referred to as VST+BM3D. Other algorithms
autoencoders are effective in denoising [28], and tackle Poisson noise directly without relying on
one of the variants, convolutional autoencoders, are Gaussian denoising algorithms, which include non-
effective in denoising images. The symmetric local space PCA [13], and sparse coding with dictionary
connections are placed between corresponding learning [14].
encoders and decoders. The connections are However, the trend of tackling imaging problems
advantageous in reminding the decoders of the with deep learning has been growing ever since deep
image details forgotten by the decoders. They also learning achieved ground-breaking achievements in
let backpropagation propagates gradients to other fields such as speech recognition [39] and
previous layers more efficiently. machine translation [40]. The deep learning approach
• The network contains multiple branches of for image denoising almost always relies on
convolutional autoencoders with varying depths. A convolutional neural network (CNN) due to the
deeper branch smoothes color fluctuations more network’s ability to capture images’ spatial locality
effectively while sacrifices slightly more image [41]. The training styles are nevertheless split into two
details. By incorporating branches of varying categories. One approach is to attempt to directly
depths, the whole network can learn color translate a noisy image to an uncorrupted image.
smoothing more from deeper branches while Another approach is to extract noise from a noisy image
learning image details more from shallower and subtract the noisy image with the noise; this is also
branches. called residual learning [42]. Zhang, Zuo, Gu, and
• The network has demonstrated outstanding Zhang [23] developed a deep CNN denoiser for
denoising performance compared to the non- denoising, de-blurring, and super-resolution with
machine learning benchmark algorithm. In a test of residual learning. Remez et al. [24] also developed a
denoising 21 standard test images of peak values 4 residual CNN with a different structure that
with Poisson noise, the propoesed network outperforms VST+BM3D and other traditional
achieved a statistically significant improvement of algorithms in Poisson denoising. Gondara [25]
0.38dB PSNR gain on average over the benchmark proposed a convolutional autoencoder that achieved
algorithm, and even higher gains for stronger noise. medical image denoising. Mao, Shen, and Yang [26]
2
improved upon a simple convolutional autoencoder and modeling the number of photons in a clean pixel. Then
incorporated symmetric connections. They claim that 8 and 9 will be related by the following formula
the symmetric connections are valuable in combating
56
loss of back propagation gradient and image details. ) 34
5>0
!(8 = *|9 = 5) = ; #! (2)
Their model accomplished outstanding results in
several low-level imaging problems such as denoising >? 5=0
and super-resolution. where >? denotes a probability distribution with 0
appearing with probability of 1. Unlike Gaussian noise,
where the noise is determined by a single parameter: the
3 THE POISSON DENOISING PROBLEM standard deviation, Poisson noise is not determined by
any specific parameter except for the pixel intensity as
5. As a result, the peak value of 9 is conventionally
Poisson noise commonly exists in low-light, or
adopted to define the strength of Poisson noise in an
photon-limited, images. Although the visual effect of
image [24]. Lower peak values such as 1 and 2 result in
the noise is similar to that of Gaussian noise, the stronger Poisson noise, while higher peak values such
properties as well as the underlying stochastic processes
as 8 and 16 result in weaker Poisson noise. The rationale
vary in term of mathematical modeling. This section
behind this convention will be discussed in Section 3.3.
introduces some important properties of Poisson noise,
beginning with an introduction to the Poisson
distribution. 3.3 Signal-to-noise Ratio
The true signal’s power divided by the noise’s power,
3.1 Poisson Distribution namely the signal-to-noise ratio, is a commonly
embraced measure to quantify the strength of noise.
Poisson distribution models discrete occurrences When the ratio is higher, the image enjoys a higher
within a time interval when the events occur in a fixed
quality. For Poisson noise, assuming the average photon
rate, and when the arrival of an event is independent of
numbers in a pixel is 9, and the noise level is defined
each other [8]. It is a discrete probability distribution,
and its probability mass function is by the standard deviation, which is √9, the signal-to-
noise ratio is roughly
56
!(# %&&'(()*&)+ ,* ,*-)(./0) = ) 34 (1) 9
#! = √9 (3)
where 5 is the average number of event occurrences. √9
This distribution is widely adopted to model events As the brightness becomes higher, the signal-to-noise
happening with a fixed rate and independently, such as ratio also becomes higher, so that a Poisson noisy image
the number of customers arriving at checkout counters, will be cleaner compared to low-light images. This
the number of phone calls received by a customer corresponds to our previous statement that Poisson
service center, and also the focus of this paper: the noise is most significant in low-light settings. We can
number of photons hitting the image sensors in cameras. also observe from this equation that when the peak
value of Y becomes higher, the noise becomes weaker
and vice versa. Therefore, when we apply the Poisson
3.2 Poisson Noise noise to an image, the peak value of the image is used
A large quantity of photons hit a sensor pixel in order to control the strength of the applied Poisson noise..
to form a pixel in the resulting images. In the relatively However, the above statement is only useful for
short exposure time, which usually ranges from 0.1 to controlling the level of applied noise. To precisely
0.01 seconds, we can assume the photons arrive at the quantify a denoising algorithm’s effectiveness, the peak
sensor in a fixed rate. We can also assume that photons signal-to-noise ratio (PSNR) is commonly adopted. Its
arrive independently. Therefore, the number of photons definition is
hitting the sensor, and hence the brightness, can be
approximated by a Poisson distribution. If 5 is very FG8HI
10 0%BCD E L (4)
large, as is usually the case in picture taking, the Poisson FJK
distribution resembles a Gaussian distribution. The
prevalence is why Gaussian noise is the most where FJK stands for mean squared error, and
investigated in image denoising. Nevertheless, when the FG8H stands for the greatest potential pixel intensity in
pictures are taken in low-light settings, the 5 will be image M, 255 in the case of an 8-bit grey scale image. A
small, and the Gaussian approximation no longer holds higher PSNR in a reconstructed image represents a
true. The Poisson noise thus explains a major proportion cleaner reconstruction compared with the uncorrupted
of the noise in low-light images. More specifically, let image.
8 modeling the number of photons in a noisy pixel, 9
3
4 DESIGN OF EXPERIMENTS
3
P = 2RO + (5)
8
This section presents details regarding the data
sources, and the experiment designs for both the where O are the original values and P are the
traditional benchmark and the proposed deep learning transformed values. After applying BM3D on the
algorithms. transformed values P, we conduct the inverse transform
and obtain the denoised image. The inverse transform
can be done by simply reversing the role of O and P so
4.1 Data that
I chose the PASCAL Visual Object Classes 2010
P I 3
(PASCAL VOC) images for deep learning training, O=V W − (6)
which contains a total of 11,321 images, and I also used 2 8
a set of 21 standard test images for final evaluation and However, Mäkitalo and Foi [30] indicates that this
visual impression. Standard test images are a small set naive inverse transform will bias the estimators. To
of conventional images widely used across various eliminate the bias, the following closed-form
image processing works to produce comparable results, approximation of an unbiased transform
which include well-known images such as pepper and
1 I 1 1 3 3C 11 3I 5 3 3[
Lena. All the images are turned to grey scale for ease of O= P − + R P − P + R P (7)
4 8 4 2 8 8 2
experimentation. To generate a image training dataset
suitable for deep learning training and prediction, I should be used instead, and this is the inverse
extracted 64 image patches of dimension 64×64 from transform employed in this paper.
each of the PASCAL VOC image as the training data
for the deep learning network. Any larger patch size
would be too memory consuming and infeasible be fed 4.3 Deep Learning Approach
into my computer’s memory all at once. Figure 1 visualizes the deep learning denoising
network proposed in this paper for Poisson denoising.
Arrows marked “Conv” are convolutional layers,
4.2 VST+BM3D
arrows marked “Deconv” are deconvolutional layers,
Block-matching and 3D Filtering (BM3D) with and arrows without any marks are simple connections.
Variance-stabilizing transformation (VST) [12] is a The 3D blocks represent the input or output tensors of
very popular non-machine learning Poisson denoising the neural layers.
algorithm, and I adopted it as the non-machine learning
benchmark algorithm in this paper. BM3D is a
pioneering non-machine learning Gaussian denoising
algorithm [11]. To denoise a patch of a larger image,
this algorithm first searches other resembling patches.
These patches are transformed into a 3D matrix, filtered
in the 3D space, and the transformed back to 2D space
to obtain the denoised patch. This algorithm is designed
for Gaussian noise specifically. Therefore, when we
perform Poisson denoising through BM3D, VST is
applied to a Poisson noisy image before applying
BM3D so that the noisy image fed into BM3D will be
approximately Gaussian noisy. Figure 1. Visualization of the proposed deep learning
VST is a subclass of data transformations in applied architecture. This figure illustrates the deep learning
statistics. The transformation is a function N where architecture proposed in this paper. This network contains 2
values O in a dataset are fed as the input to generate branches. The lower branch contains 3 convolutional layers
appended by 3 deconvolutional layers, and the upper branch
values P where contains 2 convolutional layers appended by 2
P = N (O) deconvolutional layers.
so that the P’s variance is fixed and independent of their
mean values. A specific kind of transformation, The network contains two branches, and each branch
Anscombe transform [32], converts a random variable contains two main components: the “compressor” built
of Poission distribution to approximately standard with convolutional layers, followed by the
Gaussian distribution by making the standard deviation “decompressor” built with deconvolutional layers. A
approximately constant at 1. The formula of this noisy image patch of 64×64 is fed into the input, and the
transformation is corresponding clean patch of the same size is fed into
4
the output to instruct the network how to transform a each noise level, and each network took around 40
noisy patch into a clean one. As an example, when a epochs to finish training. I chose the mean squared error
noisy patch is passed through the upper branch, the first (MSE) as the loss function because my evaluation
convolutional layer transforms it into 32 smaller images metric, PSNR, links directly to the mean squared error.
of 32×32, and the second convolutional layer further Minimizing MSE is equivalent to maximizing PSNR.
compresses the 32 images into 16 smaller images of
When denoising an image larger than 64×64,
16×16. The following 2 deconvolutional layers reverse
overlapping patches of 64×64 are extracted from the
the operations and reconstruct the smaller images back
image with a fixed stride, and the patches are denoised
to the original size. Because the convolutional layers
individually. To reconstruct the clean image from the
compress a large patch into images of lower resolutions,
clean patches, the patches are stacked back to the
only representative components of the image will be
original positions, and overlapping regions are averaged
retained, while less representative components such as
by 2D Gaussian weights. For example, when the
noise will be removed. The function of the
reconstruction stride is 2, an image of 512×512 yields
convolutional layers is also similar to a down-sampler. I
( )
The compact representation is then passed through a V 512 − 64 ^2 + 1W = 50,625 patches. Each of the
series of deconvolutional layers, which reconstruct the patches is denoised by the network before stacked back
representation back to a 64×64 denoised image patch. to the original position. A smaller reconstruction stride
The deconvolutional layers act as an up-sampler in the size yields more patches, which results in longer
process. computational time but a more accurate reconstruction.
However, although the compressor-decompressior While larger stride yields less patches, which results in
hybrid architecture suppresses noise, it inevitably shorter computation time but a less accurate
degrades some of the image details such as object edges reconstruction.
in the lossy compression process. In order to mitigate
this problem, the network contains two branches with
different compression ratios. The lower branch 5 EMPIRICAL RESULTS
possesses a higher compression ratio, so that it can
suppress more noise but retain less details. The upper
branch possesses a lower compression ratio, so that it This section presents the empirical results, both
can retain more details but suppress less noise. By qualitatively and quantitatively and the effect of the
incorporating both branch into a single network, the reconstruction stride size and noise level on the
network can learn how to suppress noise from the lower denoising performance, for the denoising network
branch while learning how to recover the image details proposed in Section 4.3.
from the upper branch.
In addition, the network is also special in the 5.1 Visual Impression
symmetric connections between each pair of Figure 2 visualizes the denoised images for both the
convolutional and deconvolutional layers with the same deep learning algorithm with stride 1 and the
dimension. These connections mitigate two major benchmark algorithm VST+BM3D. The images are a
problems common in training a simple network [26]. sample of the standard test images, and the noisy images
First, the connections alleviate the loss of details during are clean images applied with Poisson noise with image
the compression process. By “reminding” the later peak value 4. The resulting PSNR values are reported
layers of the noisy but uncompressed images at the on top of each noisy image; the higher PSNR, the
previous layers, the network can reconstruct principle cleaner the image is.
features from the compressed images and reconstruct
details from the noisy but uncompressed images. Numerically, the proposed denoising network
Second, it is generally more difficult to propagate performs denoising more accurately than the
gradients back to bottom layers in a deep network. With benchmark algorithm by achieving higher PSNRs.
the connections, the gradients would be more Visually, in the 1st set of images, we can observe that
effectively and efficiently propagated in the the woman’s cheek and chin are plagued by uneven
backpropagation process. color fluctuations when denoised by VST+BM3D,
while the same locations are smoother when denoised
This network has a total of 39,098 parameters to be by the denoising network. The background also
trained. With a training sample of 579,635 patches and demonstrates similar effects, where my denoising
validation sample of 144,909 patches, RMSProp as the network delivers smoother color transitions. The
optimizer, mean squared error as the cost function, and denoising network also exhibits similar advantages in
100 as the training batch, the deep learning architecture the 2nd set of images, where surfaces of the peppers are
took 2,000 seconds for a single training epoch on an smoother with less color fluctuations and ripples. The
Nvidia Quadro K620 GPU. I trained one network for
5
lusters on the peppers are also more accurately Image ID
(1) VST+BM3D
1 2 3 4 5 6 7 8 9 10 11
24.14 26.78 25.37 28.34 24.98 27.76 21.00 24.89 21.95 23.84 26.30
recovered. (2) Deep Learning 24.63 27.32 26.34 28.75 25.31 28.70 20.86 25.48 22.25 24.16 26.64
(2) - (1) 0.49 0.54 0.96 0.40 0.33 0.93 -0.14 0.58 0.30 0.32 0.34
12 13 14 15 16 17 18 19 20 21 Mean
23.82 23.99 22.28 26.73 24.73 26.16 21.98 25.27 28.13 28.36 25.09
23.00 24.30 22.13 27.26 25.03 26.84 22.25 25.48 29.18 28.86 25.47 t stat p-value
-0.82 0.32 -0.16 0.53 0.30 0.68 0.27 0.21 1.05 0.50 0.38 4.2418 0.0004
Table 1. PSNR for each denoised standard test image. This

table presents the PSNR (dB) for each denoised standard test
image with each denoising algorithm when the noise is
Poisson noise of image peak value 4. Winning PSNR values
are marked bold. The last row reports individual PSNR gains
of the denoising network compared with the benchmark
algorithm. This table also reports the t-test results against the
null hypothesis that the average improvement is 0.
The denoising network outperforms the benchmark

algorithm in 18 of the 21 test images, and achieved an
average PSNR gain of 0.38dB. To further test whether
the improvement is statistically significant enough, a
two-tail t-test over the PSNR improvements is
conducted against the null hypothesis that the average
PSNR gain is 0. The resulting t statistic is 4.2418, and
the p-value is 0.0004. I can thus reject the null
hypothesis under a significance level of 0.05 and assert
that my denoising network achieves statistically
significant improvements over the benchmark
Figure 2. Visual impression of denoising algorithms. This algorithm VST+BM3D.
figure presents the visual impression and resulting PSNR
values for both the deep learning and benchmark denoising
algorithm. The noisy images are obtained by applying 5.3 Effect of Stride Size
Poisson noise when the clean images are of peak value 4. While the denoising network delivers superior
performance compared to VST+BM3D, its execution
Moreover, the denoising network does not simply time is approximately 20 times longer than
smooth out the color fluctuations at the cost of details. VST+BM3D when the reconstruction stride is 1. An
In the 3rd set of images, the “F-16” word on the back immediate approach to shorten the computational time
wing of the fighter jet is as clear when denoised by the is by enlarging the stride so that the patch numbers are
denoising network as by VST+BM3D. In the 4th set of decreased. However, reducing the quantity of patches
images, the stripes on the hat’s feather when denoised will inevitably degrade the denoising accuracy. Table 2
by the denoising network are also at the same clarity reports the effect of stride size on both the
level compared to the benchmark. Roughly speaking, computational time per image and the average PSNR
the denoising network possesses a stronger smoothing gain in the 21 test images with Poisson noise with image
power than VST+BM3D without severely sacrificing peak value 4.
image details, and therefore can denoise with better
performance than the benchmark.
VST+BM3D Deep Learning
Stride 1 2 4 8 16 32
Time per Image (s) 7.74 131.23 31.57 8.03 2.08 0.56 0.16
5.2 Quantitative Comparisons
Average PSNR 25.09 25.47 25.43 25.36 25.32 25.32 25.32
Table 1 reports the individual PSNR values for each PSNR Gain 0.38 0.35 0.28 0.23 0.23 0.23
of the 21 standard test images denoised with the t stat 4.2418 3.9187 3.2585 2.8097 2.8097 2.8089
p-value 0.0004 0.0009 0.0039 0.0108 0.0108 0.0108
denoising network with stride 1 and the benchmark
algorithm when the noise is Poisson noise with image Table 2. Effect of stride size on computational time and
peak value 4. The winning PSNR values between the denoising accuracy. This table reports the effect of the stride
two algorithms are marked bold. size on the computational time and denoising accuracy. As
the stride size increases, the computational time is
significantly decreased, while the average PSNR gain is only
slightly affected.
6
Theoretically, whenever the stride size is doubled, Peak Value 1 2 4 8 16
the number of patches is reduced by 4 times, and so is VST+BM3D PSNR 21.92 23.56 25.09 26.69 28.23
Deep Learning PSNR 22.97 24.24 25.47 26.73 28.12
the computational time. The empirical data reported in
PSNR Gain 1.04 0.68 0.38 0.04 -0.11
Table 2 supports this idea, where the computational Win 100.00% 95.24% 85.71% 71.43% 33.33%
time per images is 131 seconds with reconstruction t stat 9.52 7.57 4.24 0.68 -1.71
stride 1, and decreased by roughly 4 times whenever the p-value 7.17E-09 2.68E-07 4.00E-04 5.05E-01 1.02E-01
stride size is doubled. Starting from stride size of 8 up
to 32, the denoising network runs faster than Table 3. Deep learning denoising performance under
different noise levels. This table reports the deep learning
VST+BM3D while maintaining statistically significant
denoising performance compared with the benchmark
PSNR gains. In practice, when we adopt the denoising algorithm VST+BM3D under various noise levels. The “Win”
network as a solution toward image denoising, a row reports the percentage of the 21 standard test images
tradeoff between computational speed and denoising where the deep learning achieves higher PSNR. The winning
accuracy should be considered. In real-time PSNR values are marked bold, and the results for t test
applications where low latency is desired, we should against the null hypothesis that PSNR equals zero are
choose large stride sizes for fastest possible reported. The reconstruction strides are 1 for all cases in this
computational time while keeping a statistically table.
significant PSNR gain. When the reconstruction
accuracy is desired, we should choose smaller stride Figure 3 and Figure 4 further visualizes the images
size so that an optimal reconstruction is achieved. reconstructed by the denoising network compared to the
benchmark algorithm. As the noise becomes stronger,
the images reconstructed by VST+BM3D are plagued
5.4 Effect of Noise Strength by stronger color fluctuations in large color chunks. The
denoising network, on the other hand, succeeds in
Section 5.1 to section 5.3 focus on the denoising obtaining cleaner color chunks without blurring object
network’s performance when the images are Poisson edges. Even when the peak value is 1, where the noise
noisy with image peak value 4. This section further is so strong to the extent that the noisy images only
analyzes the denoising network’s performance under contain 2 to 3 grayscale levels, the denoising network
different noise levels. I trained one network for each of still successfully smoothes the large color chunks while
tested peak values 1, 2, 8 and 16 in addition to the being able to retain clear object edges.
original peak value 4. Poisson noise with peak value 1
and 2 are stronger than peak value 4, specifically when It can thus be concluded that the denoising network
the peak value is 1, the noisy images consist of only 2 demonstrates a general tendency to be more superior
to 3 brightness levels, which is a severe degradation to against the benchmark algorithm when the noise is
the original image. Poisson noise with peak value 8 and stronger. The reasons behind this phenomenon,
16, on the other hand, are weaker noise compared to although not verified in this paper, could stem from two
peak value 4. Table 3 reports the performance of the possible factors. The first reason might be that when the
denoising network under various peak values when the noise is weak, the compression power of the denoising
reconstruction stride is 1. It can be observed that when network is too strong so that after the noise is removed,
the noise is weak, as in the cases of peak value 8 and 16, some of the image details are also removed, while the
the denoising network only wins by a small margin or VST+BM3D could adapt to weaker noise and maintain
even loses compared to the benchmark algorithm. But as many image details as possible. The second reason
when the noise level is strong, as in the cases of peak might be that although VST can transform a Poisson
values 1 and 2, the denoising network demonstrates distribution to a Gaussian distribution with a constant
superior denoising accuracy with statistically variance of 1, this relationship is most strongly held
significant PSNR gains of 1.04 and 0.68 respectively. In when the mean of the transformed Poisson distribution
addition, the denoising network wins 100% and 95.24% is larger than 4 [30]. When this mean value drops below
of the cases in the 21 standard test images when the peak 4, the variance of the Gaussian distribution begins to
values are 1 and 2. drop below 1 and become unstable. BM3D needs to
know the noise variance in advance before denoising
Gaussian noise. When the variance is unknown, as
commonly in the cases for Poisson noise of peak value
1 and 2, BM3D performs suboptimally.
7
6 CONCLUSION
Although state-of-the-art non-machine learning

algorithms for image denoising exist, we are constantly
wondering that can we achieve better performance with
the assistance of deep learning. This paper proposes a
deep learning denoising network that achieves
statistically significant improvements over traditional
benchmark algorithms in Poisson denoising. The
denoising network can achieve a statistically significant
0.38dB PSNR gain under Poisson noise of peak value
4, and even more superior PSNR gain when the noise is
stronger, 1.04dB and 0.68dB for peak values 1 and 2,
respectively. Although the denoising network is 20
times slower than the benchmark algorithm in
computational speed with reconstruction stride 1, by
tuning the stride size, the network can achieve a faster
computational speed while maintaining a positive
PSNR gain. Due to the computer’s computational
capability, this network contains only 6 layers, which is
relatively shallow compared to modern architectures
[42], [43] where more than 50 or 100 layers are made
feasible. Nevertheless, this network performs Poisson
Figure 3. Visual impression of denoising algorithms with denoising without being specifically taught the noise
image peak value 2. characteristics, while still being able to learn the
parameters from the data alone. Future research could
investigate whether the network can also learn to
denoise other types of noise, such as the Gaussian noise
or a random noise with unknown characteristics. Future
research can investigate whether this architecture can be
adapted to other imaging problems such as deblurring
or impainting.
7 REFERENCES
[1] A. K. Boyat and B. K. Joshi, "A Review

Paper: Noise Models in Digital Image
Processing," Signal & Image Processing : An
International Journal, vol. 6, no. 2, pp. 63-75,
2015.
[2] M. Covington, Digital SLR Astrophotography,
Cambridge University Press, 2007.
[3] H. Cao, Y. Ling, J. Y. Xu, C. Cao and P.
Kumar, "Photon Statistics of Random Lasers
with Resonant Feedback," Physical Review
Letters, vol. 86, no. 20, pp. 4524-4527, 2001.
[4] R. M. Willett and R. D. Nowak, "Platelets: a
Figure 4. Visual impression of denoising algorithms with multiscale approach for recovering edges and
image peak value 1.
surfaces in photon-limited medical imaging,"
IEEE Transactions on Medical Imaging, vol.
22, no. 3, pp. 332-350, 2003.
[5] K. Irie, A. E. McKinnon, K. Unsworth and I.
Woodhead, "A model for measurement of
noise in CCD digital-video cameras,"
8
Measurement Science and Technology, vol. computer-assisted intervention, 2015, pp. 234-
19, no. 4, 2008. 241.
[6] J. Pawley, Handbook of Biological Confocal [18] K. Xu, J. Ba, R. Kiros, K. Cho, A. C.
Microscopy, Springer, 2006. Courville, R. Salakhudinov, R. Zemel and Y.
[7] A. Foi, M. Trimeche, V. Katkovnik and K. Bengio, "Show, attend and tell: Neural image
Egiazarian, "Practical Poissonian-Gaussian caption generation with visual attention," in
noise modeling and fitting for single-image International Conference on Machine
raw-data," IEEE Transactions on Image Learning, 2015, pp. 2048-2057.
Processing, vol. 17, no. 10, pp. 1737-1754, [19] K. Hornik, M. Stinchcombe and H. White,
2008. "Multilayer feedforward networks are
[8] F. A. Haight, Handbook of the Poisson universal approximators," Neural Networks,
Distribution, John Wiley & Sons, 1967. vol. 2, no. 5, pp. 359-366, 1989.
[9] D. L. Snyder and M. I. Miller, Random point [20] Y. Cho and L. K. Saul, "Large-margin
processes in time and space, Springer Science classification in infinite neural networks,"
& Business Media, 2012. Neural Computation, vol. 22, no. 10, pp.
2678-2697, 2010.
[10] J. Schmitt, J. Fadili and I. A. Grenier, "Poisson
denoising on the sphere: application to the [21] H. C. Burger, C. J. Schuler and S. Harmeling,
Fermi gamma ray space telescope," Astronomy "Image denoising: Can plain neural networks
and Astrophysics, vol. 517, 2010. compete with BM3D?," in IEEE Conference
on Computer Vision and Pattern Recognition
[11] K. Dabov, A. Foi, V. Katkovnik and K. O.
(CVPR), 2012, pp. 2392-2399.
Egiazarian, "Image denoising with block-
matching and 3D filtering," in Image [22] L. Xu, J. S. J. Ren, C. Liu and J. Jia, "Deep
Processing: Algorithms and Systems, Neural convolutional neural network for image
Networks, and Machine Learning, vol. 6064, deconvolution," in Advances in Neural
International Society for Optics and Photonics, Information Processing Systems, 2014, pp.
2006, pp. 354-365. 1790-1798.
[12] L. Azzari and A. Foi, "Variance Stabilization [23] K. Zhang, W. Zuo, S. Gu and L. Zhang,
for Noisy+Estimate Combination in Iterative "Learning deep CNN denoiser prior for image
Poisson Denoising," IEEE Signal Processing restoration," arXiv preprint, 2017.
Letters, vol. 23, no. 8, pp. 1086-1090, 2016. [24] T. Remez, O. Litany, R. Giryes and A. M.
[13] J. Salmon, Z. Harmany, C.-A. Deledalle and Bronstein, "Deep Convolutional Denoising of
R. Willett, "Poisson noise reduction with non- Low-Light Images," arXiv preprint, vol.
local PCA," Journal of mathematical imaging arXiv:1701.01687, 2017.
and vision, vol. 48, no. 2, pp. 279-294, 2014. [25] L. Gondara, "Medical image denoising using
[14] R. Giryes and M. Elad, "Sparsity Based convolutional denoising autoencoders," in
Poisson Denoising with Dictionary Learning," IEEE 16th International Conference on Data
IEEE Transactions on Image Processing, vol. Mining Workshops (ICDMW), 2016, pp. 241-
23, no. 12, pp. 5057-5069, 2014. 246.
[15] Y. LeCun, B. E. Boser, J. S. Denker, D. [26] X.-J. Mao, C. Shen and Y.-B. Yang, "Image
Henderson, R. E. Howard, W. E. Hubbard and restoration using convolutional auto-encoders
L. D. Jackel, "Handwritten digit recognition with symmetric skip connections," arXiv
with a back-propagation network," in preprint, vol. abs/1606.08921, 2016.
Advances in neural information processing [27] D. E. Rumelhart, G. E. Hinton and R. J.
systems, 1990, pp. 396-404. Williams, "Learning representations by back-
[16] A. Krizhevsky, I. Sutskever and G. E. Hinton, propagating errors," Nature, vol. 323, no.
"ImageNet classification with deep 6088, pp. 533-536, 1986.
convolutional neural networks," in Advances [28] K. G. Lore, A. Akintayo and S. Sarkar,
in neural information processing systems, "LLNet: A deep autoencoder approach to
2012, pp. 1097-1105. natural low-light image enhancement," Pattern
[17] O. Ronneberger, P. Fischer and T. Brox, "U- Recognition, vol. 61, pp. 650-662, 2017.
net: Convolutional networks for biomedical [29] F.-X. Dupé, J. M. Fadili and J.-L. Starck, "A
image segmentation," in International Proximal Iteration for Deconvolving Poisson
Conference on Medical image computing and Noisy Images Using Sparse Representations,"
9
IEEE Transactions on Image Processing, vol. [41] K. Fukushima, "Neocognitron: A Self-
18, no. 2, pp. 310-321, 2009. organizing Neural Network Model for a
[30] M. Mäkitalo and A. Foi, "Optimal Inversion of Mechanism of Pattern Recognition Unaffected
the Anscombe Transformation in Low-Count by Shift in Position," Biological Cybernetics,
Poisson Image Denoising," IEEE Transactions vol. 36, no. 4, pp. 193-202, 1980.
on Image Processing, vol. 20, no. 1, pp. 99- [42] K. He, X. Zhang, S. Ren and J. Sun, "Deep
109, 2011. Residual Learning for Image Recognition," in
[31] B. Zhang, J. M. Fadili and J.-L. Starck, Proceedings of the IEEE conference on
"Wavelets, ridgelets, and curvelets for Poisson computer vision and pattern recognition,
noise removal," IEEE Transactions on Image 2016, pp. 770-778.
Processing, vol. 17, no. 7, pp. 1093-1108, [43] K. He, X. Zhang, S. Ren and J. Sun, "Identity
2008. Mappings in Deep Residual Networks," in
[32] F. J. Anscombe, "The transformation of European Conference on Computer Vision,
Poisson, binomial and negative-binomial 2016, pp. 630-645.
data," Biometrika, vol. 35, no. 3/4, pp. 246- [44] D. L. Snyder, A. M. Hammoud and R. L.
254, 1948. White, "Image recovery from data acquired
[33] M. Fisz, "The Limiting Distribution of a with a charge-coupled-device camera,"
Function of Two Independent Random Journal of the Optical Society of America A,
Variables and its Statistical Application," vol. 10, no. 5, pp. 1014-1023, 1993.
Colloquium Mathematicum, vol. 3, no. 2, p.
138–146, 1955.
[34] P. Fryzlewicz and G. P. Nason, "A Haar-Fisz
Algorithm for Poisson Intensity Estimation,"
Journal of Computational and Graphical
Statistics, vol. 13, no. 3, pp. 621-638, 2004.
[35] W. Dong, G. Shi, Y. Ma and X. Li, "Image
Restoration via Simultaneous Sparse Coding:
Where Structured Sparsity Meets Gaussian
Scale Mixture," International Journal of
Computer Vision, vol. 114, no. 2-3, pp. 217-
232, 2015.
[36] W. Dong, L. Zhang, G. Shi and X. Li,
"Nonlocally Centralized Sparse
Representation for Image Restoration," IEEE
Transactions on Image Processing, vol. 22,
no. 4, pp. 1620-1630, 2013.
[37] J. Mairal, F. R. Bach, J. Ponce, G. Sapiro and
A. Zisserman, "Non-local sparse models for
image restoration," in 2009 IEEE 12th
International Conference on Computer Vision,
2009, pp. 2272-2279.
[38] U. Schmidt, Q. Gao and S. Roth, "A
generative perspective on MRFs in low-level
vision," in IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), 2010,
pp. 1751-1758.
[39] A. Graves, A.-r. Mohamed and G. E. Hinton,
"Speech recognition with deep recurrent
neural networks," in IEEE International
Conference on Acoustics, Speech and Signal
Processing (ICASSP), 2013, pp. 6645-6649.
[40] Y. LeCun, Y. Bengio and G. Hinton, "Deep
learning," Nature, vol. 521, no. 7553, p. 436,
2015.
10

Image Reconstruction Using Deep Learning

Uploaded by

Copyright:

Available Formats

Image Reconstruction Using Deep Learning

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Image Reconstruction Using Deep Learning

Uploaded by

Copyright:

Available Formats

Image Reconstruction Using Deep Learning

Po-Yu Liu, 3035123887, The University of Hong Kong

Table 1. PSNR for each denoised standard test image. This

The denoising network outperforms the benchmark

Although state-of-the-art non-machine learning

[1] A. K. Boyat and B. K. Joshi, "A Review

You might also like