
Image Super-Resolution as Sparse Representation of Raw Image Patches

Jianchao Yang† , John Wright‡ , Yi Ma‡ , Thomas Huang†


University of Illinois at Urbana-Champaign
Beckman Institute† and Coordinated Science Laboratory‡
{jyang29, jnwright, yima, [email protected]}

Abstract

This paper addresses the problem of generating a super-resolution (SR) image from a single low-resolution input image. We approach this problem from the perspective of compressed sensing. The low-resolution image is viewed as a downsampled version of a high-resolution image, whose patches are assumed to have a sparse representation with respect to an over-complete dictionary of prototype signal-atoms. The principle of compressed sensing ensures that under mild conditions, the sparse representation can be correctly recovered from the downsampled signal. We demonstrate the effectiveness of sparsity as a prior for regularizing the otherwise ill-posed super-resolution problem. We further show that a small set of randomly chosen raw patches from training images of similar statistical nature to the input image generally serves as a good dictionary, in the sense that the computed representation is sparse and the recovered high-resolution image is competitive with, or even superior in quality to, images produced by other SR methods.

1. Introduction

Conventional approaches to generating a super-resolution (SR) image require multiple low-resolution images of the same scene, typically aligned with sub-pixel accuracy. The SR task is cast as the inverse problem of recovering the original high-resolution image by fusing the low-resolution images, based on assumptions or prior knowledge about the generation model from the high-resolution image to the low-resolution images. The basic reconstruction constraint is that applying the image formation model to the recovered image should produce the same low-resolution images. However, because much information is lost in the high-to-low generation process, the reconstruction problem is severely underdetermined and the solution is not unique. Various methods have been proposed to further regularize the problem. For instance, one can choose a MAP (maximum a-posteriori) solution under generic image priors such as the Huber MRF (Markov Random Field) and Bilateral Total Variation [14, 11, 25]. However, the performance of these reconstruction-based super-resolution algorithms degrades rapidly if the magnification factor is large or if there are not enough low-resolution images to constrain the solution, as in the extreme case of only a single low-resolution input image [2].

Another class of super-resolution methods that can overcome this difficulty are learning-based approaches, which use a learned co-occurrence prior to predict the correspondence between low-resolution and high-resolution image patches [12, 26, 16, 5, 20]. In [12], the authors propose an example-based learning strategy that applies to generic images, where the low-resolution to high-resolution prediction is learned via a Markov Random Field (MRF) solved by belief propagation. [23] extends this approach by using Primal Sketch priors to enhance blurred edges, ridges, and corners. Nevertheless, these methods typically require enormous databases of millions of high-resolution and low-resolution patch pairs to make the databases expressive enough. In [5], the authors adopt the philosophy of LLE [22] from manifold learning, assuming similarity between the two manifolds of the high-resolution patch space and the low-resolution patch space. Their algorithm maps the local geometry of the low-resolution patch space to the high-resolution patch space, generating each high-resolution patch as a linear combination of its neighbors. Using this strategy, more patch patterns can be represented with a smaller training database. However, using a fixed number K of neighbors for reconstruction often results in blurring effects, due to over- or under-fitting.

In this paper, we focus on the problem of recovering the super-resolution version of a given low-resolution image. Although our method can be readily extended to handle multiple input images, we mostly deal with a single input image. Like the aforementioned learning-based methods, we will rely on patches from example images. Our method does not require any learning on the high-resolution patches, instead working directly with the low-resolution training patches or their features. Our approach is motivated by recent results in sparse signal representation, which ensure that linear relationships among high-resolution signals can be precisely recovered from their low-dimensional projections [3, 9].
To be more precise, let D ∈ R^{n×K} be an overcomplete dictionary of K prototype signal-atoms, and suppose a signal x ∈ R^n can be represented as a sparse linear combination of these atoms. That is, the signal vector x can be written as x = D α_0, where α_0 ∈ R^K is a vector with very few (≪ K) nonzero entries. In practice, we might observe only a small set of measurements y of x:

    y = L x = L D α_0,    (1)

where L ∈ R^{k×n} with k < n. In the super-resolution context, x is a high-resolution image (patch), while y is its low-resolution version (or features extracted from it). If the dictionary D is overcomplete, the equation x = D α is underdetermined for the unknown coefficients α. The equation y = L D α is even more dramatically underdetermined. Nevertheless, under mild conditions, the sparsest solution α_0 to this equation is unique. Furthermore, if D satisfies an appropriate near-isometry condition, then for a wide variety of matrices L, any sufficiently sparse linear representation of a high-resolution image x in terms of D can be recovered (almost) perfectly from the low-resolution image [9, 21]. Figure 1 shows an example that demonstrates the capabilities of our method derived from this principle. Even for this complicated texture, sparse representation recovers a visually appealing reconstruction of the original signal.

Figure 1. Reconstruction of a raccoon face with magnification factor 2. Left: result by our method. Right: the original image. There is little noticeable difference.

Recently, sparse representation has been applied to many other related inverse problems in image processing, such as compression, denoising [10], and restoration [17], often improving on the state of the art. For example, in [10] the authors use the K-SVD algorithm [1] to learn an overcomplete dictionary from natural image patches and successfully apply it to the image denoising problem. In our setting, we do not directly compute the sparse representation of the high-resolution patch. Instead, we work with two coupled dictionaries: D_h for high-resolution patches, and D_ℓ = L D_h for low-resolution patches. The sparse representation of a low-resolution patch in terms of D_ℓ is directly used to recover the corresponding high-resolution patch from D_h. We obtain a locally consistent solution by allowing patches to overlap and demanding that the reconstructed high-resolution patches agree on the overlapped areas. Finally, we apply global optimization to eliminate reconstruction errors in the high-resolution image recovered from the local sparse representation, suppressing noise and ensuring consistency with the low-resolution input.

Compared to the aforementioned learning-based methods, our algorithm requires a much smaller database. The online recovery of the sparse representation uses the low-resolution dictionary only; the high-resolution dictionary is used only to calculate the final high-resolution image. The computation, mainly based on linear programming, is reasonably efficient and scalable. In addition, the computed sparse representation adaptively selects the most relevant patches in the dictionary to best represent each patch of the given low-resolution image. This leads to superior performance, both qualitatively and quantitatively, compared to methods [5] that use a fixed number of nearest neighbors, generating sharper edges and clearer textures.

The remainder of this paper is organized as follows. Section 2 details our formulation and solution to the image super-resolution problem based on sparse representation. In Section 3, we discuss how to prepare a dictionary from sample images and what features to use. Various experimental results in Section 4 demonstrate the efficacy of sparsity as a prior for image super-resolution.

2. Super-resolution from Sparsity

The single-image super-resolution problem asks: given a low-resolution image Y, recover a higher-resolution image X of the same scene. The fundamental constraint is that the recovered X should be consistent with the input Y:

Reconstruction constraint. The observed low-resolution image Y is a blurred and downsampled version of the solution X:

    Y = D H X,    (2)

where H represents a blurring filter and D the downsampling operator.

Super-resolution remains extremely ill-posed, since for a given low-resolution input Y, infinitely many high-resolution images X satisfy the above reconstruction constraint. We regularize the problem via the following prior on small patches x of X:

Sparse representation prior. The patches x of the high-resolution image X can be represented as a sparse linear combination in a dictionary D_h of high-resolution patches sampled from training images:^1

    x ≈ D_h α  for some α ∈ R^K with ‖α‖_0 ≪ K.    (3)

^1 Similar mechanisms – sparse coding with an overcomplete dictionary – are also believed to be employed by the human visual system [19].
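For illustration, the degradation model (2) can be simulated in a few lines of Python/NumPy. The Gaussian blur width and the simple decimation used here for H and D are illustrative assumptions; the formulation does not commit to a particular blur kernel.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def degrade(X, s=3, sigma=1.2):
        """Simulate Y = DHX: blur with a Gaussian H, then downsample by s."""
        blurred = gaussian_filter(X, sigma=sigma)  # H: blurring filter
        return blurred[::s, ::s]                   # D: downsampling operator

    X = np.random.rand(90, 90)   # stand-in high-resolution image
    Y = degrade(X)               # low-resolution observation, shape (30, 30)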
To address the super-resolution problem using the sparse representation prior, we divide the problem into two steps. First, using the sparse prior (3), we find the sparse representation for each local patch, respecting spatial compatibility between neighbors. Next, using the result from this local sparse representation, we further regularize and refine the entire image using the reconstruction constraint (2). In this strategy, a local model from the sparse prior is used to recover lost high-frequency content for local details. The global model from the reconstruction constraint is then applied to remove possible artifacts from the first step and make the image more consistent and natural.

2.1. Local Model from Sparse Representation

As in the patch-based methods mentioned previously, we try to infer the high-resolution patch for each low-resolution patch from the input. For this local model, we have two dictionaries D_ℓ and D_h: D_h is composed of high-resolution patches and D_ℓ is composed of the corresponding low-resolution patches. We subtract the mean pixel value from each patch, so that the dictionary represents image textures rather than absolute intensities.

For each input low-resolution patch y, we find a sparse representation with respect to D_ℓ. The corresponding high-resolution patches in D_h will be combined according to these coefficients to generate the output high-resolution patch x. The problem of finding the sparsest representation of y can be formulated as:

    min ‖α‖_0  s.t.  ‖F D_ℓ α − F y‖_2^2 ≤ ε,    (4)

where F is a (linear) feature extraction operator. The main role of F in (4) is to provide a perceptually meaningful constraint^2 on how closely the coefficients α must approximate y. We will discuss the choice of F in Section 3.

^2 Traditionally, one would seek the sparsest α s.t. ‖D_ℓ α − y‖_2 ≤ ε. For super-resolution, it is more appropriate to replace this 2-norm with a quadratic norm ‖·‖_{F^T F} that penalizes visually salient high-frequency errors.

Although the optimization problem (4) is NP-hard in general, recent results [7, 8] indicate that as long as the desired coefficients α are sufficiently sparse, they can be efficiently recovered by instead minimizing the ℓ1-norm, as follows:

    min ‖α‖_1  s.t.  ‖F D_ℓ α − F y‖_2^2 ≤ ε.    (5)

Lagrange multipliers offer an equivalent formulation

    min λ‖α‖_1 + ½‖F D_ℓ α − F y‖_2^2,    (6)

where the parameter λ balances sparsity of the solution and fidelity of the approximation to y. Notice that this is essentially a linear regression regularized with the ℓ1-norm on the coefficients, known in the statistical literature as the Lasso [24].
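To make (6) concrete, the following NumPy sketch solves it by iterative soft-thresholding (ISTA), a standard proximal-gradient method for the Lasso; this is a minimal illustration of the objective, not the solver used in our experiments (our computation is based mainly on linear programming). Here D stands for F D_ℓ and y for F y.

    import numpy as np

    def ista_lasso(D, y, lam, n_iter=500):
        """Minimize lam*||a||_1 + 0.5*||D a - y||_2^2 by iterative
        soft-thresholding (proximal gradient descent)."""
        L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the smooth term
        a = np.zeros(D.shape[1])
        for _ in range(n_iter):
            z = a - D.T @ (D @ a - y) / L                           # gradient step
            a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft-threshold
        return a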
Solving (6) individually for each patch does not guarantee compatibility between adjacent patches. We enforce compatibility between adjacent patches using a one-pass algorithm similar to that of [13].^3 The patches are processed in raster-scan order in the image, from left to right and top to bottom. We modify (5) so that the super-resolution reconstruction D_h α of patch y is constrained to closely agree with the previously computed adjacent high-resolution patches. The resulting optimization problem is

    min ‖α‖_1  s.t.  ‖F D_ℓ α − F y‖_2^2 ≤ ε_1,
                     ‖P D_h α − w‖_2^2 ≤ ε_2,    (7)

where the matrix P extracts the region of overlap between the current target patch and the previously reconstructed high-resolution image, and w contains the values of the previously reconstructed high-resolution image on the overlap. The constrained optimization (7) can be similarly reformulated as

    min λ‖α‖_1 + ½‖D̃ α − ỹ‖_2^2,    (8)

where D̃ = [F D_ℓ; β P D_h] and ỹ = [F y; β w] are formed by stacking the two constraints. The parameter β controls the tradeoff between matching the low-resolution input and finding a high-resolution patch that is compatible with its neighbors. In all our experiments, we simply set β = 1. Given the optimal solution α* to (8), the high-resolution patch can be reconstructed as x = D_h α*.

^3 There are different ways to enforce compatibility. In [5], the values in the overlapped regions are simply averaged, which will result in blurring effects. The one-pass algorithm [13] is shown to work almost as well as the use of a full MRF model [12].

2.2. Enforcing Global Reconstruction Constraint

Notice that (5) and (7) do not demand exact equality between the low-resolution patch y and its reconstruction D_ℓ α. Because of this, and also because of noise, the high-resolution image X^0 produced by the sparse representation approach of the previous section may not satisfy the reconstruction constraint (2) exactly. We eliminate this discrepancy by projecting X^0 onto the solution space of D H X = Y, computing

    X* = arg min_X ‖X − X^0‖  s.t.  D H X = Y.    (9)

The solution to this optimization problem can be efficiently computed using the back-projection method, originally developed in computed tomography and applied to super-resolution in [15, 4]. The update equation for this iterative method is

    X^{t+1} = X^t + ((Y − D H X^t) ↑ s) ∗ p,    (10)

where X^t is the estimate of the high-resolution image after the t-th iteration, p is a "backprojection" filter, and ↑ s denotes upsampling by a factor of s.
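A minimal sketch of the update (10), reusing the degradation model from earlier; the Gaussian choices for the blur H and the backprojection filter p, and the nearest-neighbor upsampling used for ↑ s, are illustrative assumptions rather than prescribed operators.

    import numpy as np
    from scipy.ndimage import gaussian_filter, zoom

    def back_project(X0, Y, s=3, sigma=1.2, n_iter=20):
        """Iterate X_{t+1} = X_t + ((Y - DHX_t) upsampled by s) * p."""
        X = X0.copy()
        for _ in range(n_iter):
            Yt = gaussian_filter(X, sigma)[::s, ::s]  # simulate DHX_t
            err = zoom(Y - Yt, s, order=0)            # residual, upsampled by s
            X += gaussian_filter(err, sigma)          # convolve with filter p
        return X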
Algorithm 1 (Super-Resolution via Sparse Representation).
1: Input: training dictionaries D_h and D_ℓ, a low-resolution image Y.
2: For each 3 × 3 patch y of Y, taken starting from the upper-left corner with 1 pixel overlap in each direction:
   • Solve the optimization problem with D̃ and ỹ defined in (8): min λ‖α‖_1 + ½‖D̃ α − ỹ‖_2^2.
   • Generate the high-resolution patch x = D_h α*. Put the patch x into a high-resolution image X^0.
3: End
4: Using back-projection, find the closest image to X^0 which satisfies the reconstruction constraint:
   X* = arg min_X ‖X − X^0‖  s.t.  D H X = Y.
5: Output: super-resolution image X*.

We take the result X* from back-projection as our final estimate of the high-resolution image. This image is as close as possible to the initial super-resolution X^0 given by sparsity, while satisfying the reconstruction constraint. The entire super-resolution process is summarized as Algorithm 1.
2.3. Global Optimization Interpretation

The simple SR algorithm outlined above can be viewed as a special case of a more general sparse representation framework for inverse problems in image processing. Related ideas have been profitably applied in image compression, denoising [10], and restoration [17]. These connections provide context for understanding our work, and also suggest means of further improving the performance, at the cost of increased computational complexity.

Given sufficient computational resources, one could in principle solve for the coefficients associated with all patches simultaneously. Moreover, the entire high-resolution image X itself can be treated as a variable. Rather than demanding that X be perfectly reproduced by the sparse coefficients α, we can penalize the difference between X and the high-resolution image given by these coefficients, allowing solutions that are not perfectly sparse but better satisfy the reconstruction constraints. This leads to a large optimization problem:

    X* = arg min_{X, {α_ij}} ‖D H X − Y‖_2^2 + η Σ_{i,j} ‖α_ij‖_0
         + γ Σ_{i,j} ‖D_h α_ij − P_ij X‖_2^2 + τ ρ(X).    (11)

Here, α_ij denotes the representation coefficients for the (i, j)-th patch of X, and P_ij is a projection matrix that selects the (i, j)-th patch from X. ρ(X) is a penalty function that encodes prior knowledge about the high-resolution image. This function may depend on the image category, or may take the form of a generic regularization term (e.g., Huber MRF, Total Variation, Bilateral Total Variation).

Algorithm 1 can be interpreted as a computationally efficient approximation to (11). The sparse representation step recovers the coefficients α by approximately minimizing the sum of the second and third terms of (11). The sparsity term ‖α_ij‖_0 is relaxed to ‖α_ij‖_1, while the high-resolution fidelity term ‖D_h α_ij − P_ij X‖^2 is approximated by its low-resolution version ‖F D_ℓ α_ij − F y_ij‖^2.

Notice that if the sparse coefficients α are fixed, the third term of (11) essentially penalizes the difference between the super-resolution image X and the reconstruction given by the coefficients: Σ_{i,j} ‖D_h α_ij − P_ij X‖_2^2 ≈ ‖X^0 − X‖_2^2. Hence, for small γ, the back-projection step of Algorithm 1 approximately minimizes the sum of the first and third terms of (11).

Algorithm 1 does not, however, incorporate any prior besides sparsity of the representation coefficients – the term ρ(X) is absent in our approximation. In Section 4 we will see that sparsity in a relevant dictionary is a strong enough prior that we can already achieve good super-resolution performance. Nevertheless, in settings where further assumptions on the high-resolution signal are available, these priors can be incorporated into the global reconstruction step of our algorithm.

3. Dictionary Preparation

3.1. Random Raw Patches from Training Images

Learning an over-complete dictionary capable of optimally representing broad classes of image patches is a difficult problem. Rather than trying to learn such a dictionary [19, 1] or using a generic set of basis vectors [21] (e.g., Fourier, Haar, curvelets, etc.), we generate dictionaries by simply sampling raw patches at random from training images of similar statistical nature. We will demonstrate that such simply prepared dictionaries are already capable of generating high-quality reconstructions,^4 when used together with the sparse representation prior.

^4 The competitiveness of such random patches has also been noticed empirically in the context of content-based image classification [18].

Figure 2 shows several training images and the patches sampled from them. For our experiments, we prepared two dictionaries: one sampled from flowers (Figure 2, top), which will be applied to generic images with relatively simple textures, and one sampled from animal images (Figure 2, bottom), with fine furry or fractal textures. For each high-resolution training image X, we generate the corresponding low-resolution image Y by blurring and downsampling. For each category of images, we sample only about 100,000 patches from about 30 training images to form each dictionary, which is considerably smaller than that needed by other learning-based methods [12, 23]. Empirically, we find that such a small dictionary is more than sufficient.
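A sketch of this preparation step, assuming patch locations are drawn uniformly at random and using the 3 × 3 / 9 × 9 patch sizes from Section 4; the per-patch means are removed as described in Section 2.1.

    import numpy as np

    def sample_coupled_patches(X_hr, Y_lr, n, s=3, lp=3, seed=0):
        """Randomly sample n corresponding low-res / high-res patch pairs
        from one training image pair to build columns of D_l and D_h."""
        rng = np.random.default_rng(seed)
        hp = lp * s
        cols_l, cols_h = [], []
        for _ in range(n):
            i = rng.integers(0, Y_lr.shape[0] - lp + 1)
            j = rng.integers(0, Y_lr.shape[1] - lp + 1)
            y = Y_lr[i:i+lp, j:j+lp].ravel()
            x = X_hr[i*s:i*s+hp, j*s:j*s+hp].ravel()
            cols_l.append(y - y.mean())     # subtract mean pixel value
            cols_h.append(x - x.mean())
        return np.stack(cols_l, 1), np.stack(cols_h, 1)   # D_l, D_h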
Figure 2. Left: three out of the 30 training images we use in our experiments. Right: the training patches extracted from them.

Figure 3. Number of nonzero coefficients in the sparse representation computed for 300 typical patches in a test image (vertical axis: number of supports; horizontal axis: patch index).

3.2. Derivative Features

In (4), we use a feature transformation F to ensure that the computed coefficients fit the most relevant part of the low-resolution signal. Typically, F is chosen as some kind of high-pass filter. This is reasonable from a perceptual viewpoint, since people are more sensitive to the high-frequency content of the image. The high-frequency components of the low-resolution image are also arguably the most important for predicting the lost high-frequency content in the target high-resolution image.

Freeman et al. [12] use a high-pass filter to extract the edge information from the low-resolution input patches as the feature. Sun et al. [23] use a set of Gaussian derivative filters to extract the contours in the low-resolution patches. Chang et al. [5] use the first-order and second-order gradients of the patches as the representation. For our algorithm, we also use the first-order and second-order derivatives as the feature for the low-resolution patch. While simple, these features turn out to work very well. To be precise, the four 1-D filters used to extract the derivatives are:

    f_1 = [−1, 0, 1],        f_2 = f_1^T,
    f_3 = [1, 0, −2, 0, 1],  f_4 = f_3^T,    (12)

where the superscript "T" means transpose. Applying these four filters, we get four feature vectors for each patch, which are concatenated into one vector as the final representation of the low-resolution patch.
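A minimal sketch of this feature operator F: each (upsampled) patch is convolved with the four filters of (12) and the responses are concatenated. The use of SciPy's convolve1d along each axis is an implementation choice for illustration.

    import numpy as np
    from scipy.ndimage import convolve1d

    def derivative_features(patch):
        """Concatenate the four derivative responses (12) of a 2-D patch."""
        f1 = np.array([-1, 0, 1])        # first-order derivative filter
        f3 = np.array([1, 0, -2, 0, 1])  # second-order derivative filter
        responses = [convolve1d(patch.astype(float), f, axis=ax)
                     for f in (f1, f3) for ax in (1, 0)]  # f2, f4 = transposes
        return np.concatenate([r.ravel() for r in responses])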
4. Experiments

Experimental settings: In our experiments, we mostly magnify the input image by a factor of 3. In the low-resolution images, we always use 3 × 3 low-resolution patches, with an overlap of 1 pixel between adjacent patches, corresponding to 9 × 9 patches with an overlap of 3 pixels for the high-resolution patches. The features are not extracted directly from the 3 × 3 low-resolution patch, but rather from an upsampled version produced by bicubic interpolation. For color images, we apply our algorithm to the illuminance component only, since humans are more sensitive to illuminance changes. Our algorithm has only one free parameter, λ, which balances sparsity of the solution with fidelity to the reconstruction constraint. In our experience, the reconstruction quality is stable over a large range of λ. The rule of thumb λ = 50 × dim(patch feature) gives good results for all the test cases in this paper.

One advantage of our approach over methods such as neighbor embedding [5] is that it selects the number of relevant dictionary elements adaptively for each patch. Figure 3 demonstrates this for 300 typical patches in one test image. Notice that the recovered coefficients are always sparse (< 35 nonzero entries), but the level of sparsity varies depending on the complexity of each test patch. However, empirically, we find that the support of the recovered coefficients typically is neither a superset nor a subset of the K nearest neighbors [5]. The chosen patches are more informative for recovering the high-resolution patch, leading to more faithful texture reconstruction in the experiments below.

Experimental results: We first apply our algorithm to generic images including a flower, a human face, and architecture, all using the same dictionary sampled from training images of flowers (first row of Figure 2). We will further demonstrate our algorithm's ability to handle complicated textures in animal images, with the second dictionary sampled from training animal images (second row of Figure 2).

Figure 4 compares our results with neighbor embedding [5]^5 on two test images of a flower and a girl. In both cases, our method gives sharper edges and reconstructs the details of the scene more clearly. There are noticeable differences in the texture of the leaves, the fuzz on the leafstalk, and also the freckles on the face of the girl.

In Figure 5, we compare our method with several other methods on an image of the Parthenon used in [6], including back projection, neighbor embedding [5], and the recently proposed method based on a learned soft edge prior [6].

^5 Our implementation of the neighbor embedding method [5] differs slightly from the original. The feature for the low-resolution patch is extracted not from the original 3 × 3 patch, which would give smoother results, but from the upsampled low-resolution patch. We find that setting K = 15 gives the best performance; this is approximately the average number of coefficients recovered by sparse representation (see Figure 3).
Figure 4. The flower and girl images magnified by a factor of 3. Left to right: input, bicubic interpolation, neighbor embedding [5], our method, and the original. (Also see Figure 8 for the same girl image magnified by a factor of 4.)

Figure 5. Results on an image of the Parthenon with magnification factor 3. Top row: low-resolution input, bicubic interpolation, back projection. Bottom row: neighbor embedding [5], soft edge prior [6], and our method.

The result from back projection has many jagged effects along the edges. Neighbor embedding generates sharp edges in places, but blurs the texture on the temple's facade. The soft edge prior method gives a decent reconstruction, but introduces undesired smoothing that is not present in our result. Additional results on generic images using this dictionary are shown in Figure 7, left and center. Notice that in both cases, the algorithm significantly improves the image resolution by sharpening edges and textures.

We now conduct more challenging experiments on the more intricate textures found in animal images, using the animal dictionary with merely 100,000 training patches (second row of Figure 2). As already shown in Figure 1, our method performs quite well in magnifying the image of a raccoon face by a factor of 2. When complex textures such as this one are downsampled further, the SR task becomes more difficult than on images with simpler textures, such as flowers or faces. In Figure 6, we apply our method to the same raccoon face image with magnification factor 3. Since there are no explicit edges in most parts of the image, the methods proposed in [12], [23], and [6] would have tremendous difficulty here. Compared to neighbor embedding [5], our method gives clearer fur and sharper whiskers. Figure 7 shows an additional image of a cat face reconstructed using this dictionary. We compare several SR methods quantitatively in terms of their RMS errors for some of the images shown above; the results are shown in Table 1.

Finally, we test our algorithm on the girl image again, but with a more challenging magnification factor of 4. The results are shown in Figure 8. Here, back-projection again yields jagged edges. Freeman et al.'s method [12] introduces many artifacts and fails to capture the facial texture, despite relying on a much larger database. Compared to the soft edge prior method [6], our method generates sharper edges and is more faithful to the original facial texture.
Figure 6. A raccoon face magnified by a factor of 3. Left to right: the input image, bicubic interpolation, neighbor embedding, and our method.

Figure 7. More results on a few more generic (left and center) and animal (right) images. Top: input images. Bottom: super-resolution images by our method, with magnification factor 3.

Figure 8. The girl image magnified by a factor of 4. From left to right: low-resolution input, back projection, the learning-based method of [12], soft edge prior [6], and our method.
    Images      Bicubic    NE [5]     Our method
    Flower       3.5052     4.1972     3.2276
    Girl         5.9033     6.6588     5.6175
    Parthenon   12.7431    13.5562    12.2491
    Raccoon      9.7399     9.8490     9.1874

Table 1. The RMS errors of different methods for super-resolution with magnification factor 3, with respect to the original images.
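For reference, the RMS error reported in Table 1 is the root-mean-square pixel difference between the reconstruction and the original image (computed on the single illuminance channel, by assumption):

    import numpy as np

    def rmse(X_est, X_orig):
        """Root-mean-square error against the original image."""
        d = np.asarray(X_est, float) - np.asarray(X_orig, float)
        return np.sqrt(np.mean(d ** 2))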
5. Discussion

The experimental results of the previous section demonstrate the effectiveness of sparsity as a prior for learning-based super-resolution. However, one of the most important questions for future investigation is to determine, in terms of the within-category variation, the number of raw sample patches required to generate a dictionary satisfying the sparse representation prior. Tighter connections to the theory of compressed sensing may also yield conditions on the appropriate patch size or feature dimension.

From a more practical standpoint, it would be desirable to have a way of effectively combining dictionaries to work with images containing multiple types of textures or multiple object categories. One approach would be to integrate supervised image segmentation and super-resolution, applying the appropriate dictionary within each segment.
References

[1] M. Aharon, M. Elad, and A. Bruckstein. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, Vol. 54, No. 11, November 2006.
[2] S. Baker and T. Kanade. Limits on super-resolution and how to break them. IEEE TPAMI, 24(9):1167-1183, 2002.
[3] E. Candes. Compressive sensing. Proc. International Congress of Mathematicians, 2006.
[4] D. Capel. Image mosaicing and super-resolution. Ph.D. thesis, Department of Eng. Science, University of Oxford, 2001.
[5] H. Chang, D.-Y. Yeung, and Y. Xiong. Super-resolution through neighbor embedding. Proc. CVPR, 2004.
[6] S. Dai, M. Han, W. Xu, Y. Wu, and Y. Gong. Soft edge smoothness prior for alpha channel super resolution. Proc. ICCV, 2007.
[7] D. L. Donoho. For most large underdetermined systems of linear equations, the minimal ℓ1-norm solution is also the sparsest solution. Comm. on Pure and Applied Math, Vol. 59, No. 6, 2006.
[8] D. L. Donoho. For most large underdetermined systems of linear equations, the minimal ℓ1-norm near-solution approximates the sparsest near-solution. Preprint, accessed at http://www-stat.stanford.edu/~donoho/, 2004.
[9] D. L. Donoho. Compressed sensing. Preprint, accessed at http://www-stat.stanford.edu/~donoho/, 2005.
[10] M. Elad and M. Aharon. Image denoising via sparse and redundant representations over learned dictionaries. IEEE TIP, Vol. 15, No. 12, 2006.
[11] S. Farsiu, M. D. Robinson, M. Elad, and P. Milanfar. Fast and robust multiframe super-resolution. IEEE TIP, 2004.
[12] W. T. Freeman, E. C. Pasztor, and O. T. Carmichael. Learning low-level vision. IJCV, 2000.
[13] W. T. Freeman, T. R. Jones, and E. C. Pasztor. Example-based super-resolution. IEEE Computer Graphics and Applications, Vol. 22, Issue 2, 2002.
[14] R. C. Hardie, K. J. Barnard, and E. A. Armstrong. Joint MAP registration and high-resolution image estimation using a sequence of undersampled images. IEEE TIP, 1997.
[15] M. Irani and S. Peleg. Motion analysis for image enhancement: resolution, occlusion and transparency. JVCI, 1993.
[16] C. Liu, H. Y. Shum, and W. T. Freeman. Face hallucination: theory and practice. IJCV, Vol. 75, No. 1, pp. 115-134, October 2007.
[17] J. Mairal, G. Sapiro, and M. Elad. Learning multiscale sparse representations for image and video restoration. SIAM Multiscale Modeling and Simulation, 2008.
[18] E. Nowak, F. Jurie, and B. Triggs. Sampling strategies for bag-of-features image classification. Proc. ECCV, 2006.
[19] B. Olshausen and D. Field. Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37:3311-3325, 1997.
[20] L. C. Pickup, S. J. Roberts, and A. Zisserman. A sampled texture prior for image super-resolution. Proc. NIPS, 2003.
[21] H. Rauhut, K. Schnass, and P. Vandergheynst. Compressed sensing and redundant dictionaries. Preprint, accessed at http://homepage.univie.ac.at/holger.rauhut/, 2007.
[22] S. T. Roweis and L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500):2323-2326, 2000.
[23] J. Sun, N.-N. Zheng, H. Tao, and H. Shum. Image hallucination with primal sketch priors. Proc. CVPR, 2003.
[24] R. Tibshirani. Regression shrinkage and selection via the Lasso. J. Royal Statist. Soc. B, Vol. 58, No. 1, pages 267-288, 1996.
[25] M. E. Tipping and C. M. Bishop. Bayesian image super-resolution. Proc. NIPS, 2003.
[26] Q. Wang, X. Tang, and H. Shum. Patch based blind image super resolution. Proc. ICCV, 2005.
