Guided Linear Upsampling

SHUANGBING SONG, FAN ZHONG∗ , TIANJU WANG, XUEYING QIN, and CHANGHE TU, Shandong
University, China

arXiv:2307.09582v1 [cs.CV] 13 Jul 2023

[Figure 1: pipeline diagram. The input image 𝐼 is downsampled to 𝐼↓, processed into 𝑇↓, and upsampled to the output 𝑇ˆ; the downsampling and the interpolation parameters Θ are jointly optimized, and Θ is reused for self-upsampling of the source.]
Fig. 1. Our method accelerates high-resolution image processing with guided linear upsampling. Given a high-resolution source image 𝐼, our method jointly optimizes the downsampled source image 𝐼↓ and the interpolation parameters Θ; 𝐼↓ is then processed by a black-box image operator to get the low-resolution target image 𝑇↓. The high-resolution target image 𝑇ˆ can be linearly upsampled from 𝑇↓ with the optimized parameters Θ.

Guided upsampling is an effective approach for accelerating high-resolution image processing. In this paper, we propose a simple yet effective guided upsampling method. Each pixel in the high-resolution image is represented as a linear interpolation of two low-resolution pixels, whose indices and weights are optimized to minimize the upsampling error. The downsampling can be jointly optimized in order to prevent missing small isolated regions. Our method can be derived from the color line model and local color transformations. Compared to previous methods, our method can better preserve detail effects while suppressing artifacts such as bleeding and blurring. It is efficient, easy to implement, and free of sensitive parameters. We evaluate the proposed method with a wide range of image operators, and show its advantages through quantitative and qualitative analysis. We demonstrate the advantages of our method for both interactive image editing and real-time high-resolution video processing. In particular, for interactive editing, the joint optimization can be precomputed, thus allowing for instant feedback without hardware acceleration.

CCS Concepts: • Imaging/Video → Matting & Compositing; Interactive Editing.

Additional Key Words and Phrases: guided upsampling, optimized downsampling, image processing

ACM Reference Format:
Shuangbing Song, Fan Zhong, Tianju Wang, Xueying Qin, and Changhe Tu. 2023. Guided Linear Upsampling. ACM Trans. Graph. 42, 4 (August 2023), 12 pages. https://doi.org/10.1145/3592453

∗ Corresponding author.
Authors' address: Shuangbing Song, [email protected]; Fan Zhong, zhongfan@sdu.edu.cn; Tianju Wang, [email protected]; Xueying Qin, [email protected]; Changhe Tu, [email protected], Shandong University, China.

1 INTRODUCTION

In the past decades, many useful image processing methods have been proposed for various tasks such as enhancement [Aubry et al. 2014], style transfer [Li et al. 2018; Zhu et al. 2017], matting [Levin et al. 2007], colorization [Iizuka et al. 2016], etc. Most of them require intensive computation and memory, and thus face great challenges for high-resolution images. At the same time, the popularity of mobile devices requires us to pay more attention to computational efficiency. The problem is even more prominent for interactive image editing [Bousseau et al. 2009; Levin et al. 2004], which requires repetitive user interactions, so instant feedback is necessary for a good user experience.

For general image processing, guided upsampling is perhaps the simplest and most effective way to achieve acceleration. By using the original image as a guidance map, a large-ratio downsampling of the output image can be upsampled to the original resolution without noticeable artifacts. This is remarkable because even for image operators of linear complexity in image size, 8× downsampling yields a 64× speed-up.

Two classical approaches for guided upsampling are joint bilateral upsampling (JBU) [Kopf et al. 2007] and bilateral guided upsampling (BGU) [Chen et al. 2016]. JBU is an extension of the bilateral filter [Durand and Dorsey 2002; Paris and Durand 2006], while BGU is based on local color transformations [Levin et al. 2007], whose effectiveness for guided upsampling has been demonstrated in earlier works such as transform recipes [Gharbi et al. 2015] and the guided filter [He et al. 2012]. In BGU, the local transformations are applied in bilateral space [Barron et al. 2015], which further improves the efficiency and quality.
Recent works are mainly learning-based [Dai et al. 2021; Gharbi et al. 2017; Xia et al. 2021, 2020], and can better exploit domain knowledge to improve quality. However, they need to be trained for each specific task, and thus cannot be generalized to other tasks. Instead, we follow the roadmap of the classical approaches, in order to seek a universal guided upsampler applicable to a wide range of image operators.

In this paper, we propose Guided Linear Upsampling (GLU), which is pretty simple but very effective. We introduce a new representation of high-resolution images, with each pixel represented as the linear interpolation of only two low-resolution pixels. By optimizing the representation parameters, i.e. the indices and weights of the interpolated pixel pairs, very small errors can be achieved even for large-ratio upsampling. The parameters can be optimized for the source image and then applied to the target image, resulting in a high-resolution target image that preserves details well while avoiding artifacts such as bleeding and blurring. We also propose an efficient method to optimize the downscaled source image, in order to better preserve thin structures and small regions. As illustrated in Figure 1, with our method the downsampling and upsampling can be jointly optimized to minimize the upsampling error, thus effectively preventing the loss of small isolated local structures.

The proposed method contains only a few parameters that are easy to set, and a fixed parameter setting is well suited to various tasks and input images. It is efficient and easy to implement, and can achieve fast speed with a GPU implementation. Moreover, the joint optimization of upsampling and downsampling is target-free, i.e. independent of the target image, and thus needs to be done only once for each image, regardless of how the target image changes. As a demonstration, we show that with our method, real-time image editing and video processing can be achieved easily for time-costly image operators.

[Figure 2]

Fig. 2. The proposed linear representation of high-resolution images. Each pixel 𝑝 in the high-resolution image is represented as the linear interpolation of two pixels (𝑎, 𝑏) in the low-resolution image, with (𝑎, 𝑏) and the interpolation weight optimized to minimize the representation error.

2 RELATED WORK

The computational efficiency of image processing algorithms is very important for many applications, so it has attracted many studies across various image processing tasks, such as filtering [Liu and Shen 2011; Vaudrey and Klette 2009], enhancement [Farbman et al. 2008], and modern learning-based image synthesis [Chai et al. 2022]. GPU-based acceleration has also been widely studied [Kazhdan and Hoppe 2008; Li et al. 2012; Wu and Xu 2009]. Most of these methods are specific to the processing algorithms. In contrast, we focus on guided upsampling, which can accelerate a wide range of image operators by treating them as black boxes.

The problem of guided upsampling was first introduced in [Kopf et al. 2007], in which joint bilateral upsampling (JBU) is proposed as the solution. JBU represents each output pixel as the weighted average of a set of low-resolution pixels. The weights are computed with a bilateral weighting function [Tomasi and Manduchi 1998] incorporating the guidance of the high-resolution input image. As a result, JBU also inherits the problems of the bilateral filter, such as edge blur and gradient reversal [Durand and Dorsey 2002; He et al. 2012]. Our method has the same general form as JBU, but it involves the weighted average of only two pixels, and the artifacts can be avoided by the proposed optimization techniques.

In guided image filtering [He et al. 2012], the target image is locally represented as affine transformations of the source image. This approach is effective in preserving the local structures of the source image. [Gharbi et al. 2015] introduces the concept of the transform recipe for efficient cloud image processing, showing that high-quality upsampling can be obtained with local affine transformations for a wide range of processing tasks. [Chen et al. 2016] proposes to apply the local color transformations in bilateral space, which further enhances the ability to represent detail effects by localizing the transformations in both the spatial and range domains.

Recent works mainly achieve improvements by leveraging machine learning. [Gharbi et al. 2017] shows that the local color transformations can be directly learned from pairs of training images, which eliminates the need to run the processing operator online. [Wu et al. 2018] proposes an end-to-end trainable guided filter by formulating the local transformations as a fully differentiable module. For better detail preservation, [Pan et al. 2019] proposes linear representations with more localized support and learning-based regularization. [Shi et al. 2021] reformulates the guided filter as an unsharp mask operator more suitable for learning. Although impressive results can be achieved, the learning-based methods need to be trained for each specific task, and thus may be unavailable or inconvenient in some cases.

3 METHOD

Given an image operator 𝑓 and a high-resolution input image 𝐼 with RGB colors, our goal is to obtain an approximation 𝑇ˆ of the original output 𝑇 = 𝑓(𝐼). With the guided upsampling method, we first apply 𝑓 to a downsampled image 𝐼↓, and then upsample the result 𝑇↓ to the high-resolution output 𝑇ˆ with the guidance of 𝐼. We need to optimize the downsampling and upsampling processes in order to minimize the difference between 𝑇ˆ and 𝑇. Note that, the same as previous universal guided upsampling methods [Chen et al. 2016; Kopf et al. 2007], we treat 𝑓 as a black-box operator that is scale-invariant.
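As a concrete overview, the whole pipeline can be written in a few lines. The following is a minimal NumPy-style sketch under the stated assumptions, not the authors' code; `optimize_glu` and `apply_glu` are placeholder names for the parameter optimization and interpolation steps, both sketched after Algorithm 1 below, and `f` is any black-box operator:

```python
import numpy as np

def accelerate(f, I, ratio=8):
    """Run a costly black-box operator f at low resolution and
    upsample its result with guided linear upsampling."""
    I_lo = I[::ratio, ::ratio]        # regular grid downsampling (Sec. 3.3
                                      # replaces this with the jointly
                                      # optimized low-res image)
    theta = optimize_glu(I, I_lo)     # target-free optimization, Eq. (2)
    T_lo = f(I_lo)                    # low-resolution processing
    return apply_glu(T_lo, theta, ratio)  # linear upsampling, Eq. (1)
```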


[Figure 3: (a) Source (560 × 560); (b) JBU (8×); (c) GLU∗ (16×); (d) GLU∗ (32×); (e) GLU (32×)]

Fig. 3. Guided self-upsampling with JBU and our method. The input source image is downscaled and then upsampled to the original resolution under its own guidance. GLU∗ is our method as described in Section 3.1; GLU is the accelerated version introduced in Section 3.2. Both can recover the original image well for ratios as large as 32×. In comparison, JBU produces obvious blur even for smaller ratios such as 8×.

3.1 Guided Linear Upsampling

We first assume that the downsampled input image 𝐼↓ is given, or produced with regular grid downsampling by default. As illustrated in Figure 2, in order to optimize the upsampling, our basic assumption is that each pixel 𝑝 of the high-resolution target image 𝑇 can be well approximated by the linear interpolation of a pair of low-resolution pixels (𝑎, 𝑏) as follows:

$$\hat{T}_p = \omega_{ab} T_a^{\downarrow} + (1-\omega_{ab})\, T_b^{\downarrow} \quad \text{s.t.}\ a, b \in \Omega_{p^{\downarrow}} \tag{1}$$

where 𝜔𝑎𝑏 is the weighting function, 𝑝↓ is the downscaled coordinate of 𝑝, and Ω𝑝↓ is a small neighborhood of 𝑝↓. 𝑇ˆ𝑝 is an estimate of the original output 𝑇𝑝. The above assumption can actually be derived from the well-known color line model [Levin et al. 2007], which we explain in Section 4.1. Eq. (1) contains three parameters 𝑎, 𝑏, 𝜔𝑎𝑏, which need to be optimized in order to minimize the upsampling error. Denote the parameters of pixel 𝑝 by Θ𝑝 = {𝑎, 𝑏, 𝜔𝑎𝑏}; then Θ is a 3 × 𝐻 × 𝑊 tensor containing the parameters of all pixels of 𝑇. Given Θ, the corresponding high-resolution output 𝑇ˆ(Θ) can be easily computed with Eq. (1).

The same as previous local color transformation methods [Chen et al. 2016; He et al. 2012; Levin et al. 2007], we also assume that the target image can be locally represented as an affine transformation of the source image. As will be explained in Section 4.1, in this case the source and target images can be optimally upsampled with the same set of parameters. In other words, if Θ is optimal for the source image, then it should be optimal for the target image as well. Therefore, the optimal parameters Θ can be solved w.r.t. only the source image in order to minimize its upsampling error:

$$\Theta = \arg\min_{\Theta} \| \hat{I}(\Theta) - I \| \tag{2}$$

in which 𝐼ˆ(Θ) is the upsampled source image with the given parameters Θ. We assume that the Θ𝑝 of each pixel are independent of each other, so the above equation can be solved for each pixel as

$$\Theta_p = \arg\min_{\Theta_p} \| \omega_{ab} I_a^{\downarrow} + (1-\omega_{ab}) I_b^{\downarrow} - I_p \| \tag{3}$$

which is a combinatorial optimization problem that is usually difficult to solve. Fortunately, in our case Ω𝑝↓ is a small neighborhood (a 3 × 3 window in our experiments), so it is easy to enumerate all possible pixel pairs. For each selected pixel pair (𝑎, 𝑏), the optimal weighting parameter should be

$$\omega_{ab} = \frac{(I_p - I_b^{\downarrow}) \cdot (I_a^{\downarrow} - I_b^{\downarrow})}{\| I_a^{\downarrow} - I_b^{\downarrow} \|^2 + \varepsilon} \tag{4}$$

which makes the interpolation result 𝐼ˆ𝑝 the projection of 𝐼𝑝 onto the color line determined by 𝐼𝑎↓ and 𝐼𝑏↓, just as in previous sampling-based matting methods [Wang and Cohen 2007]. 𝜀 is a small constant (10⁻³ in our implementation) that avoids division by zero in flat patches.

Figure 3 demonstrates an example of upsampling an image from its downscaled counterpart. The above simple method achieves surprisingly good results. Even for large ratios such as 32×, details of the original image can be reconstructed almost perfectly. In comparison, the result of JBU is obviously blurred even for smaller ratios. Note that for methods based on local color transformations [Chen et al. 2016; He et al. 2012], the above task is trivial, because an identity transformation would be learned if the source and target images are the same.
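For illustration, the exhaustive per-pixel search of Eqs. (3) and (4) can be written directly. This is a minimal sketch under our naming, not the authors' implementation; it assumes `I_p` is one high-resolution RGB color and `patch` is the 3 × 3 low-resolution neighborhood Ω𝑝↓ flattened to a (9, 3) array:

```python
import numpy as np

def solve_pixel(I_p, patch, eps=1e-3):
    """Enumerate all pixel pairs (a, b) in the low-res neighborhood and
    keep the pair whose color-line projection (Eq. (4)) best
    reconstructs I_p (Eq. (3))."""
    best = (0, 0, 1.0, np.inf)  # (a, b, w_ab, error)
    for a in range(len(patch)):
        for b in range(len(patch)):
            d = patch[a] - patch[b]
            # Eq. (4): project I_p onto the line through patch[a], patch[b].
            w = float(np.dot(I_p - patch[b], d) / (np.dot(d, d) + eps))
            err = np.linalg.norm(w * patch[a] + (1.0 - w) * patch[b] - I_p)
            if err < best[3]:
                best = (a, b, w, err)
    return best[:3]
```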

[Figure 4: (a) Source (112 × 168); (b) GNU; (c) GLU]

Fig. 4. Comparison of GLU and GNU for upsampling an image patch from its 8× downsampling. GNU cannot represent the smooth variations of the original image, and thus produces obvious artifacts.
3.2 Efficient Computation

The complexity of the above method is quadratic in the number of pixels in Ω𝑝↓. For a typical 3 × 3 window, 36 pairs of pixels need to be checked in order to minimize Eq. (3). For high-resolution images, this still requires a large amount of computation, so we propose the following improvements for better efficiency.

Firstly, we find that it is not necessary to enumerate all pixel pairs (𝑎, 𝑏) ∈ Ω𝑝↓ in order to optimize Eq. (3). Instead, we can first fix 𝑎 as the pixel whose color is most similar to 𝐼𝑝, and then optimize only 𝑏 and 𝜔𝑎𝑏 with respect to Eq. (3). In this way, the complexity is reduced to linear in |Ω𝑝↓|. Since 𝐼𝑎↓ is close to 𝐼𝑝 in color space, the approximation error of projecting 𝐼𝑝 onto the color line should be small.

Secondly, it is easy to see that if 𝐼𝑝 is on the color line determined by 𝐼𝑎↓ and 𝐼𝑏↓, the interpolation weight 𝜔𝑎𝑏 of Eq. (4) reduces to

$$\omega_{ab} = \frac{\| I_p - I_b^{\downarrow} \|}{\| I_p - I_a^{\downarrow} \| + \| I_p - I_b^{\downarrow} \| + \varepsilon} \tag{5}$$

which can be computed more efficiently, and the result is guaranteed to be in [0, 1]. Since color lines not crossing 𝐼𝑝 are less likely to be selected, this approximation has little impact on the quality of our method.

As shown in Figure 3, the above accelerations do not introduce noticeable differences compared to our original method, but the complexity is much lower. Therefore, in the following we use the accelerated method by default.

Our final upsampling method is described in Algorithm 1. It is very simple and efficient. Ω𝑝↓ is typically chosen as a 3 × 3 window, so for each pixel only 9 pixel pairs need to be checked. Note that if we fix 𝜔𝑎𝑏 to 1, then the optimization in line 3 is not necessary, and 𝑇ˆ𝑝 is simply equal to 𝑇𝑎↓. We call this special case of our method Guided Nearest Upsampling (GNU). As shown in Figure 4, GNU lacks the ability to recover the ramp edges and smooth variations of natural images, producing blocky effects and false contours, which are effectively eliminated by GLU.

ALGORITHM 1: Efficient Guided Linear Upsampling.
Input: High-res source image 𝐼, low-res source image 𝐼↓ and corresponding target image 𝑇↓.
Output: High-res target image 𝑇ˆ.
1: for each pixel 𝑝 ∈ 𝐼 do
2:   Find 𝑎 as the pixel in Ω𝑝↓ with the color most similar to 𝐼𝑝;
3:   Fix 𝑎 and optimize 𝑏, 𝜔𝑎𝑏 with Eqs. (3) and (5);
4:   Compute 𝑇ˆ𝑝 with Eq. (1);
5: end for
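A possible NumPy realization of Algorithm 1 is sketched below. The naming is ours and borders are handled by simple clamping; the loops are kept for clarity rather than speed (the paper's implementations are in C++ and CUDA):

```python
import numpy as np

def _window(img_lo, cy, cx):
    """Gather the clamped 3x3 neighborhood of (cy, cx) as a (9, 3) array."""
    h, w = img_lo.shape[:2]
    ys = np.clip(np.arange(cy - 1, cy + 2), 0, h - 1)
    xs = np.clip(np.arange(cx - 1, cx + 2), 0, w - 1)
    return img_lo[ys][:, xs].reshape(9, 3)

def optimize_glu(I, I_lo, eps=1e-3):
    """Algorithm 1, lines 1-3: per-pixel (a, b, w_ab) stored as a
    (H, W, 3) tensor Theta. a is fixed to the most similar color
    (line 2); b is searched with the simplified weight of Eq. (5)."""
    H, W = I.shape[:2]
    ratio = H // I_lo.shape[0]
    theta = np.zeros((H, W, 3))
    for y in range(H):
        for x in range(W):
            cy = min(y // ratio, I_lo.shape[0] - 1)
            cx = min(x // ratio, I_lo.shape[1] - 1)
            nbr = _window(I_lo, cy, cx)
            d = np.linalg.norm(nbr - I[y, x], axis=1)
            a = int(np.argmin(d))                # line 2
            best = (a, 1.0, d[a])                # (b, w_ab, error)
            for b in range(9):                   # line 3
                w = d[b] / (d[a] + d[b] + eps)   # Eq. (5)
                e = np.linalg.norm(w * nbr[a] + (1 - w) * nbr[b] - I[y, x])
                if e < best[2]:
                    best = (b, w, e)
            theta[y, x] = (a, best[0], best[1])
    return theta

def apply_glu(T_lo, theta, ratio):
    """Algorithm 1, line 4: upsample any low-res image with Theta (Eq. (1))."""
    H, W = theta.shape[:2]
    out = np.empty((H, W, 3))
    for y in range(H):
        for x in range(W):
            cy = min(y // ratio, T_lo.shape[0] - 1)
            cx = min(x // ratio, T_lo.shape[1] - 1)
            nbr = _window(T_lo, cy, cx)
            a, b, w = int(theta[y, x, 0]), int(theta[y, x, 1]), theta[y, x, 2]
            out[y, x] = w * nbr[a] + (1 - w) * nbr[b]
    return out
```

Note that `optimize_glu` touches only the source image, while `apply_glu` can then be run on any low-resolution target; this separation is what makes the optimization target-free.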
3.3 Downsample Optimization

For large downsampling ratios, isolated thin structures and small regions may be completely lost with regular grid downsampling. In this case, it is impossible for the upsampling process to recover the original content. Figure 5 demonstrates such a situation. Although downsampling optimization has been studied extensively, previous works mainly aim to avoid aliasing artifacts [Kopf et al. 2013; Oeztireli and Gross 2015; Weber et al. 2016], which is different from our goal. Some super-resolution methods [Kim et al. 2018; Sun and Chen 2020; Xiao et al. 2020] also jointly optimize their downscaling and upscaling processes, which, however, are not well suited to the proposed GLU upsampler.

Given the GLU upsampler Ψ(𝐼↓, Θ), we can formulate the downsampling process as an optimization problem aiming to minimize the self-upsampling error of the source image. In practice, since Θ is unknown, the downsampling and upsampling need to be jointly optimized as

$$I^{\downarrow}, \Theta = \arg\min_{I^{\downarrow},\, \Theta} \| I - \Psi(I^{\downarrow}, \Theta) \| \tag{6}$$

with each pixel of 𝐼↓ taken from exactly one pixel of 𝐼. Note that this is different from previous downsampling optimization methods, in which each pixel of 𝐼↓ is usually filtered from multiple pixels of 𝐼 in order to reduce aliasing artifacts. For our method, filtering during downsampling may significantly blur the upsampled image because it shrinks the endpoints of color lines, which is detrimental to image details.

Eq. (6) can be solved by iteratively optimizing 𝐼↓ and Θ. Given 𝐼↓, the upsampling parameters Θ can be solved as in Algorithm 1. To optimize the downsampling, we first compute the pixel-wise error map 𝐸, with 𝐸𝑝 = ∥𝐼𝑝 − 𝐼ˆ𝑝∥. Obviously, the pixels with large errors must be those that cannot be well represented by 𝐼↓, and thus need to be added to 𝐼↓ by replacing some existing pixels. Note that since each pixel in 𝐼↓ may be used to interpolate multiple pixels of 𝐼, the above operation may not reduce the total error. Therefore, we adopt a trial-and-error approach: if replacing some pixels in 𝐼↓ does not reduce the total error, the replaced pixels are rolled back. Figure 6 illustrates the procedure of our method; more details are given in Algorithm 2. The trial-and-error procedure is executed for each connected region 𝐶𝑖 of pixels with large errors (E). The pixels with large errors are tentatively added to the downsampled image, and the operation is accepted if it reduces the total error; otherwise it is rolled back. For multiple high-resolution pixels [𝑞↑] mapped to the same low-resolution pixel location 𝑞 ∈ 𝐼↓, the one with the largest error is selected to replace the original color of 𝑞.

ALGORITHM 2: Joint Optimization of Down- and Upsampling.
Input: High-res source image 𝐼, the error threshold 𝜏, the maximum iterations 𝑁.
Output: Optimized low-res image 𝐼↓ and upsampling parameters Θ.
1: Initialize 𝐼↓ with regular grid downsampling;
2: Initialize Θ from 𝐼, 𝐼↓ with Algorithm 1;
3: Compute the initial error map 𝐸;
4: for 𝑛 = 1, ..., 𝑁 do
5:   Find the set of pixels with large errors: E = {𝑝 | 𝐸𝑝 > 𝜏};
6:   if E = ∅ then
7:     break
8:   end if
9:   Cluster E into connected components 𝐶1, · · · , 𝐶𝑀;
10:  for 𝑖 = 1, · · · , 𝑀 do
11:    Back up Θ, 𝐼↓, 𝐸 for rollback;
12:    Compute 𝑒⁰ = Σ_{𝑝 ∈ 𝐶𝑖} 𝐸𝑝;
13:    𝑄 = {𝑞 | 𝑞 ∈ 𝐼↓ and [𝑞↑] ∩ 𝐶𝑖 ≠ ∅};
14:    for 𝑞 ∈ 𝑄 do
15:      Update 𝐼𝑞↓ with 𝐼𝑝, 𝑝 = arg max_{𝑝 ∈ [𝑞↑] ∩ 𝐶𝑖} 𝐸𝑝;
16:    end for
17:    for 𝑝 ∈ 𝐶𝑖 do
18:      Update Θ𝑝 as in Algorithm 1;
19:      Update 𝐸𝑝 with the updated Θ𝑝;
20:    end for
21:    Compute 𝑒¹ = Σ_{𝑝 ∈ 𝐶𝑖} 𝐸𝑝;
22:    if 𝑒¹ > 𝑒⁰ then
23:      Roll back the updated regions of Θ, 𝐼↓ and 𝐸;
24:    end if
25:  end for
26: end for

As shown in Figure 5, the above method can effectively prevent the loss of thin structures and small regions. In most cases, it requires only 1 or 2 iterations to converge, and after the initialization only pixels with large errors are involved in further processing, so only a little extra computation is required.
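A high-level sketch of the trial-and-error loop is given below, assuming the `optimize_glu`/`apply_glu` sketches above and using `scipy.ndimage.label` for the connected components. Unlike Algorithm 2, this sketch recomputes Θ and 𝐸 globally after each trial for brevity, whereas the paper updates only the affected region 𝐶𝑖:

```python
import numpy as np
from scipy import ndimage

def optimize_downsampling(I, ratio=8, tau=30/255, N=3):
    """Sketch of Algorithm 2: inject large-error high-res pixels into
    I_lo and roll back any trial that does not reduce the total
    self-upsampling error. Defaults follow Section 5.2."""
    I_lo = I[::ratio, ::ratio].copy()            # line 1: regular grid init
    theta = optimize_glu(I, I_lo)                # line 2
    E = np.linalg.norm(I - apply_glu(I_lo, theta, ratio), axis=2)
    for _ in range(N):
        mask = E > tau                           # large-error pixels
        if not mask.any():
            break
        labels, M = ndimage.label(mask)          # connected components C_i
        for i in range(1, M + 1):
            region = labels == i
            backup = (I_lo.copy(), theta.copy(), E.copy())
            e0 = E[region].sum()
            # Inject high-res pixels; sorting by ascending error makes
            # the largest-error pixel of each low-res cell the one
            # written last, matching line 15 of Algorithm 2.
            ys, xs = np.nonzero(region)
            order = np.argsort(E[region])
            for y, x in zip(ys[order], xs[order]):
                I_lo[min(y // ratio, I_lo.shape[0] - 1),
                     min(x // ratio, I_lo.shape[1] - 1)] = I[y, x]
            theta = optimize_glu(I, I_lo)
            E = np.linalg.norm(I - apply_glu(I_lo, theta, ratio), axis=2)
            if E[region].sum() > e0:             # trial failed: roll back
                I_lo, theta, E = backup
    return I_lo, theta
```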


[Figure 5: (a) Source (284 × 202); (b) 16× downsampling; (c) GLU−; (d) Optimized downsampling; (e) GLU]

Fig. 5. Demonstration of downsample optimization. (a) The input image with some thin structures. (b) Most thin structures are lost with the 16× default downsampling. (c) The result upsampled from (b); the thin structures cannot be recovered. (d) The optimized 16× downsampled image. (e) The result upsampled from (d); the thin structures are well recovered.

[Figure 6: flow diagram of the iterative optimizer, alternating GLU(Θ) self-upsampling with per-region trial updates of 𝐼↓ and rollbacks when the regional error 𝑒¹ exceeds 𝑒⁰.]

Fig. 6. Illustration of the proposed downsample optimization method. For the input high-resolution image 𝐼, we initialize 𝐼↓ with regular grid downsampling, and then iteratively update 𝐼↓ by trying to add large-error pixels to it, so as to minimize the total upsampling error.

4 ANALYSIS

An ideal guided upsampling method should preserve the detail effects of the target image while avoiding artifacts such as bleeding and blurring. In the following we analyze the capabilities of our method and show how it relates to previous methods.

4.1 Theoretical Derivation

The proposed upsampling method in Section 3.1 can be derived from the color line model [Levin et al. 2007] and local color transformation methods [Chen et al. 2016; He et al. 2012; Levin et al. 2007].

The color line model tells us that the colors of pixels in a small patch should lie roughly on a single line in color space. Therefore, the color of each pixel in the patch can be well approximated by the linear interpolation of the two endpoints [𝑎, 𝑏] of the color line. After downsampling, it can be expected that [𝑎, 𝑏] can still be well represented by two pixels in the downsampled patch, because of the information redundancy in the high-resolution image. As a result, each pixel color in the original patch can also be linearly interpolated from two pixels in the downsampled patch, as in Eq. (1).

The local color transformation methods assume that the output image can be locally represented as an affine transformation of the input image, i.e. 𝑇𝑝 = 𝐴𝑝𝐼𝑝, where 𝐴𝑝 is an affine transformation that varies smoothly over the image space. In addition, we require the operator to be approximately scale-invariant: 𝑇𝑝↓ = 𝐴𝑝𝐼𝑝↓. Therefore, if Θ𝑝 linearly interpolates 𝐼𝑝, i.e.

$$I_p = \omega_{ab} I_a^{\downarrow} + (1-\omega_{ab}) I_b^{\downarrow} \tag{7}$$

then it immediately follows that

$$T_p = \omega_{ab} A_p I_a^{\downarrow} + (1-\omega_{ab}) A_p I_b^{\downarrow} = \omega_{ab} T_a^{\downarrow} + (1-\omega_{ab}) T_b^{\downarrow} \tag{8}$$

which means that the same Θ𝑝 also linearly interpolates 𝑇𝑝, as we assumed in Section 3.1.

4.2 Edge Recovery

[Figure 7: (a) Step edge; (b) Ramp edge; (c) Roof edge]

Fig. 7. The three types of edges in natural images.

Typical image edges can be classified into three types: step edges, ramp edges, and roof edges [Koschan and Abidi 2005; Yin et al. 2019]. For natural images, most edges should be ramp edges connecting two regions. Obviously, the transition effects of ramp edges can be well represented by linear interpolation of the two region colors. Therefore, by interpolating only two pixels, GLU can recover the edges of the original image very well. In comparison, GNU can recover only step edges, and thus would introduce significant artifacts, as shown in Figure 4.

A natural question is whether we can achieve further improvements by interpolating more pixels. Indeed, Eq. (1) can be expressed more generally as

$$\hat{T}_p = \sum_{q \in \Omega_{p^{\downarrow}}} \omega_q T_q^{\downarrow} \tag{9}$$

with 𝜔𝑞 as normalized weights. Interestingly, this is exactly the form of JBU. However, in JBU 𝜔𝑞 is not optimized, and the filtering effect results in blur and edge reversal artifacts [He et al. 2012]. It is easy to see that when 𝜎𝑑 → ∞ and 𝜎𝑟 → 0, JBU reduces to GNU. In practice, however, this is hard to achieve due to the numerical problems of the exp weighting function. By decreasing 𝜎𝑟, the blurring artifacts of JBU can be reduced, but this may lead to aliasing artifacts as in GNU. Therefore, in this sense both GLU and GNU can be taken as special cases of JBU with optimized weights.

Although not tested, we do not see the need to interpolate more pixels. Involving more pixels not only makes the optimization more difficult, but may also lead to overfitting and extrapolation, both of which can reduce the result quality.
4.3 Detail Preservation

As discussed in Section 4.1, our method implicitly takes advantage of the local color transformation for transferring the upsampling parameters. However, it should be noted that unlike previous approaches such as the guided filter [He et al. 2012] and BGU [Chen et al. 2016], our method does not require the transformations to be smooth in either image space or bilateral space. Therefore, it can better preserve the detail effects of the target image while avoiding the bleeding artifacts caused by over-smoothing.

One potential issue with our method is the lack of an explicit smoothness constraint. Although preserving smoothness is important for most image processing operators, we find that our method, which operates on each pixel independently, also works well in most situations. This is mainly because the linear interpolation model can well approximate the appearance of the original source image, which serves as a smooth guidance map that can suppress unsmooth artifacts if the target image has local affinities similar to those of the source image. However, if the pixel affinities of the source and target images are significantly different (e.g., when new edges are introduced in the target image), unsmooth artifacts may be produced. Actually, this is the main limitation of our method, which we discuss further in Section 6.

5 EXPERIMENTS

In our experiments we evaluate the proposed method in various image processing applications, and compare it qualitatively and quantitatively with previous methods. We also demonstrate the advantages of our method for interactive image editing and real-time video processing, and reveal its limitations for more diverse applications.

5.1 Comparisons

For quantitative evaluation we tested our method with the following applications and datasets:

• Alpha Matting with the method of [Chen et al. 2013]. The dataset is from [Rhemann et al. 2009], and consists of 27 high-resolution images with sizes of about 6-8M pixels.
• Colorization with the method of [Levin et al. 2004]. The dataset is the high-quality 2K super-resolution image set of [Agustsson and Timofte 2017]. We use all 100 images of the validation set. The source grayscale images are produced by graying the original RGB images and then converting them to 3 channels simply by replicating the channel. The required seed constraints [Levin et al. 2004] are sparsely sampled from the original color images.
• Dehazing with the method of [Li et al. 2017]. The dataset is from the NTIRE-19 benchmark [Ancuti et al. 2019], which includes 55 real hazy images with ∼2M pixels.
• Unsharp Masking for enhancing image details with the method of [Ngo et al. 2020]. The source images are the same as for colorization.
• 𝐿0 Smoothing using the method of [Xu et al. 2011]. The dataset is the same as for colorization.
• Laplacian Filtering for enhancing image details with the method of [Aubry et al. 2014]. The dataset is the same as for colorization.

The low-resolution target images are produced from the downsampled source images using the chosen image operators, and then upsampled with our technique to produce the full-resolution target images. Table 1 and Table 2 show the quantitative results with PSNR and SSIM scores, respectively. PSNR measures the difference in pixel values, while SSIM mainly measures the similarity of local structures. We compare our method with JBU [Kopf et al. 2007] and BGU [Chen et al. 2016]. For JBU, we use the default parameter setting with 5 × 5 support windows and 𝜎𝑑 = 0.5, 𝜎𝑟 = 0.1. For BGU we test both the global method with the authors' MATLAB code, and the fast local method (BGU-fast) implemented with Halide [Ragan-Kelley et al. 2012, 2013]. For GLU we use the default settings described in Section 5.2. GLU− is our method without downsampling optimization.

For most applications our method outperforms JBU and BGU on both scores. Figure 8 shows some examples. In general, for large scaling ratios such as 8× and 16×, JBU tends to overblur low-contrast edges, while BGU tends to produce bleeding artifacts that mix the effects of different regions due to the smoothness constraint of the local transformations, as analyzed in Sections 4.2 and 4.3.

Table 1. Comparisons of different methods with PSNR scores. The low-resolution target is produced from the low-resolution source using the image operator, except the applications marked with †, for which the low-resolution target is obtained by downsampling the reference image.

PSNR↑       alpha matting   colorization   unsharp mask   𝐿0-smoothing   dehazing       laplacian filter   unsharp mask†
            8×     16×      8×     16×     8×     16×     8×     16×     8×     16×     8×     16×         8×     16×
JBU         25.6   22.9     20.9   20.1    18.2   16.9    22.3   20.1    25.9   22.3    15.6   13.8        19.0   17.8
BGU-fast    21.4   22.2     28.5   27.8    23.8   23.2    22.8   22.2    21.1   17.7    21.8   21.3        25.1   24.9
BGU         28.3   25.8     30.7   28.8    23.5   22.4    27.0   25.4    26.8   23.4    23.7   22.3        25.4   25.0
GLU−        31.4   28.9     29.7   27.7    23.6   22.3    23.6   24.5    27.6   24.1    20.5   17.2        25.2   24.0
GLU         31.5   29.1     31.3   29.6    24.0   22.4    28.8   27.1    27.6   24.1    23.1   24.5        25.9   25.2

Table 2. Comparisons of different methods with SSIM scores.

SSIM↑       alpha matting   colorization   unsharp mask   𝐿0-smoothing   dehazing       laplacian filter   unsharp mask†
            8×     16×      8×     16×     8×     16×     8×     16×     8×     16×     8×     16×         8×     16×
JBU         0.93   0.91     0.60   0.56    0.40   0.36    0.80   0.78    0.91   0.88    0.32   0.27        0.41   0.37
BGU-fast    0.71   0.64     1.00   1.00    0.85   0.84    0.83   0.82    0.79   0.72    0.82   0.81        0.77   0.75
BGU         0.86   0.78     1.00   1.00    0.88   0.85    0.88   0.84    0.90   0.85    0.88   0.84        0.79   0.77
GLU−        0.96   0.94     0.97   0.97    0.86   0.83    0.86   0.83    0.94   0.89    0.80   0.68        0.82   0.79
GLU         0.96   0.94     0.99   0.99    0.87   0.83    0.89   0.85    0.94   0.89    0.85   0.80        0.83   0.81

[Figure 8 rows (columns: source, local zoom, reference, JBU, BGU, GLU):
alpha matting: JBU (20.2/0.88), BGU (25.3/0.79), GLU (28.2/0.94);
colorization: JBU (22.9/0.70), BGU (31.6/1.00), GLU (34.0/0.99);
L0-smoothing: JBU (16.3/0.96), BGU (30.8/0.85), GLU (31.1/0.98);
optical flow: JBU (31.6/0.98), BGU (29.4/0.96), GLU (35.1/0.98)]

Fig. 8. Visual comparisons of different methods with 8× downsampling. The numbers in parentheses are PSNR/SSIM scores.

[Figure 9: (a) source (1356×919); (b) reference; (c) low-res target of GLU−; (d) GLU−; (e) low-res target of GLU; (f) GLU]

Fig. 9. The effect of downsampling optimization demonstrated on colorization at 16×. GLU can better preserve fine image structures (the cat whiskers) than GLU−.

Since BGU represents the target image as local transformations of the source image, it is good at preserving the source image structures, and is therefore advantageous for applications such as Colorization, Unsharp Masking, and Laplacian Filtering. For these applications BGU achieves better SSIM scores than our method. In particular, for Colorization, BGU obtains a full SSIM score, because our colorization method modifies only the chrominance channels, while SSIM is computed using only the grayscale channel. However, the PSNR scores of these applications are still comparable to or lower than our method. For applications where some source details need to be removed, such as Matting and Smoothing, preserving the local structures of the source image may have unwanted effects; e.g., for image smoothing BGU may re-introduce some source details that have been removed by the smoothing operator, as demonstrated in Figure 8. For these applications our method can significantly outperform BGU.

The downsample optimization improves the PSNR and SSIM scores for all tested applications. As shown in Figure 9, fine image structures can be better preserved with the optimized downsampling. In comparison with the regular grid downsampler, the optimized downsampler may result in more gritty and aliased low-resolution images due to the irregular spatial sampling. However, note that such aliasing can actually improve the upsampled image, because the joint optimization only accepts results with lower upsampling errors. Moreover, for operators that can preserve local affinities (as assumed by our method), the low-resolution target image should have the same aliasing as the downsampled source image, which should be beneficial for the upsampled target image according to our analysis in Section 4.

[Figure 10: (a) source; (b) reference; (c) BGU; (d) GLU]

Fig. 10. Unsharp Masking with 8× downsampling. The results of BGU and GLU are comparable, and both look better than the reference.

For Unsharp Masking, it is strange that BGU-fast performs better than BGU in terms of PSNR. Actually, this is caused by the different effects of the image operator at different scales. As shown in Figure 10, the upsampled results of BGU and GLU are largely different from the reference, which is obviously caused by the image operator rather than the upsampler. In this case, we consider the scores to be mostly noise and not trustworthy for the evaluation. The last column of Tables 1 and 2 shows the results with the low-resolution target downsampled from the reference, which better reflects the net effect of the upsampling methods.

Finally, to see the effect of our method for large upsampling ratios, we conducted experiments with larger ratios of 32×, 64×, and even 128×. Figure 11 shows an example for matting. Surprisingly, for the tested image, even for 128× downsampling and upsampling, our method still obtains pretty good results, which are better than the results of JBU and BGU with smaller ratios.

5.2 Parameters

Our method contains only 3 parameters: the window size 𝑆 of Ω𝑝↓, the error threshold 𝜏, and the maximum number of iterations 𝑁. 𝜏 should be an error value that may introduce a noticeable visual difference, and for most examples our joint optimization requires only 1 or 2 iterations to remove large-error regions, so the parameters are easy to set. In our experiments, they are fixed to 𝑆 = 3, 𝜏 = 30/255, 𝑁 = 3.

Table 3 investigates the effect of the window size 𝑆. Using a larger window size can lead to better self-upsampling, but slightly reduces the quality of the target images. Actually, a 3 × 3 window in the downscaled image corresponds to a large enough neighborhood at the original resolution, so using a larger window size is not necessary and may degrade performance around low-contrast edges.

[Figure 11: (a) Source (3270 × 2388); (b) BGU 32×; (c) BGU 64×; (d) JBU 32×; (e) Reference; (f) GLU 32×; (g) GLU 64×; (h) GLU 128×]

Fig. 11. Experiments with large ratios of downsampling and upsampling. JBU produces obvious blur at 32×, BGU produces significant bleeding at 64×, and our method obtains pretty good results even at 128×. For this experiment the low-resolution target images are downsampled from the reference, in order to observe the net effect of the upsampling methods.

Table 3. Effect of the window size 𝑆 on the resulting quality in PSNR.

window size       3×3     5×5     7×7     9×9
self-upsampling   41.31   42.83   44.04   44.92
matting           35.47   35.18   34.76   34.15
colorization      32.39   29.95   29.94   28.74

Table 4. Time cost (ms).

Image Size   JBU (C++)   BGU (Matlab)   BGU-fast (Halide)   GLU (C++)   GLU− (CUDA)
2K           364         5772           13.1                125         5.6
4K           1427        18029          28.5                507         14.3

Compared with JBU and BGU, the parameter setting of our method is much simpler. This advantage makes it more suitable for use as a universal guided upsampler in different situations.
5.3 Time Cost

Table 4 compares the time cost of different methods. The BGU global method implemented in MATLAB is slow to compute; JBU and GLU are much faster, but still cannot reach real-time speed. BGU-fast implemented with Halide can be very fast even on CPU, but as compared in Tables 1 and 2, its quality is significantly lower than our method because the local method is unstable in some situations.

The main computation of our method is the joint optimization of Θ and 𝐼↓. Note that since each pixel is optimized independently, our method is parallelizable and can be greatly accelerated with a GPU. As a test, we implemented GLU− with CUDA and ran it on a laptop with an Nvidia GTX1650 GPU. The accelerated version takes only about 5ms for 2K images and 14ms for 4K images, so it can easily be incorporated into real-time video processing.
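The per-pixel independence also vectorizes well on CPU. For example, the Eq. (5) weights and errors for one candidate pair per pixel can be evaluated for the whole image at once; the following is a NumPy sketch of the idea behind the GPU version, not the CUDA code itself, with names of our choosing:

```python
import numpy as np

def batch_weights(I, Ia, Ib, eps=1e-3):
    """Vectorized Eq. (5). I, Ia, Ib: (H, W, 3) arrays, where Ia/Ib hold
    the candidate low-res colors gathered for every high-res pixel."""
    da = np.linalg.norm(I - Ia, axis=2)
    db = np.linalg.norm(I - Ib, axis=2)
    w = db / (da + db + eps)  # guaranteed to lie in [0, 1]
    err = np.linalg.norm(w[..., None] * Ia + (1 - w[..., None]) * Ib - I,
                         axis=2)
    return w, err
```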

A great advantage of our method is that the joint optimization process is target-free, i.e. the optimization is independent of the target image. Therefore, it can be precomputed before the target image is acquired, as we demonstrate in Sections 5.4 and 5.5. For applications where multiple operators may be applied to the same image, the optimized parameters can be cached and shared between the different operators, which further reduces the overall computation of our method.

[Figure 12: (a) input image with initial scribbles; (b) initial alpha matte; (c)(d) instant updates while adding a new brush]

Fig. 12. Interactive matting with instant feedback. (a) input image with initial scribble brushes. (b) the initial alpha matte. (c)(d) the changes in the alpha matte can be observed instantly while the user moves the mouse to add a new brush.
5.4 Interactive Editing with Instant Feedback

As mentioned above, the joint optimization process is target-free, so it can be precomputed for interactive image editing. The optimized upsampling parameters Θ can be stored as a 3 × 𝐻 × 𝑊 tensor. Given Θ, the upsampling requires only a simple linear interpolation per pixel, whose computation is negligible for interactive applications.
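In code, the interaction loop then amounts to one precomputation and one cheap interpolation per edit. The following is a sketch only: `user_is_editing`, `scribbles`, `matting_operator`, and `display` are hypothetical stand-ins for the UI and the low-resolution matting solver, while `optimize_glu`/`apply_glu` are the sketches from Section 3.2:

```python
theta = optimize_glu(I, I_lo)                   # precomputed once, target-free
while user_is_editing():                        # hypothetical UI loop
    T_lo = matting_operator(I_lo, scribbles())  # low-res solve (~0.5 s here)
    display(apply_glu(T_lo, theta, ratio=8))    # per-pixel linear interp only
```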
As a demonstration, we built an interactive matting system that enables instant feedback without GPU acceleration. It is known that automatic image matting is very hard for general objects, while interactive matting requires fast speed for a better user experience. To demonstrate the power of the proposed approach, we adopted the global matting method proposed in [Levin et al. 2007], which needs to solve a large sparse linear system and is therefore very slow for large images.

Regardless of the size of the input image, our system resizes it to a small image of no more than 10K pixels, for which the matting method is invoked; the resulting matte is then upsampled to full resolution for display. Thus, the main computation is to solve the 10000 × 10000 sparse linear system. We adopted the MKL PARDISO solver, which is very efficient and requires about 0.5 seconds for each solution. Surprisingly, although 10K pixels seems very small, we find it works well with the proposed guided upsampling method. Actually, as shown in Figure 11, our method can achieve good results for very large ratios such as 64× and 128×. For a typical high-resolution input of 10M pixels, downsampling it to 10K requires a downscale ratio of about 30×, which is not too large in comparison. After a period of user interaction, a trimap can be estimated from the current alpha matte, and our system then proceeds with the unknown pixels at a finer scale. In this way, the matting operator is invoked from coarse to fine, which helps avoid upsampling errors caused by large downsampling ratios.

Figure 12 demonstrates the effectiveness of our system. The user can instantly observe changes in the matting result when moving the mouse for editing. This enables the user to add brushes in the areas with large errors, thus greatly improving efficiency and bringing a much better user experience. Please see the supplementary material for video demonstrations.

5.5 Real-time Video Processing

Our method is parallelizable and can be executed quickly with a GPU implementation, thus enabling real-time video processing. To demonstrate this, we apply our method to BackgroundMattingV2 [Lin et al. 2021], an efficient video matting method that can achieve 30fps for 4K videos on an RTX2080ti GPU. However, on our low-end test machine with a GTX1650 GPU, it requires more than 200ms per frame. To achieve real-time speed, we apply our method with 8× downsampling; the low-resolution matting with BackgroundMattingV2 then requires about 21ms, and upsampling with GLU− requires about 14ms, so in total it takes about 35ms per frame.

The target-free property of our method can further improve the efficiency. Since the optimization of GLU− is independent of the matting result, it can be executed in parallel with the matting procedure. In this way the time cost can be further reduced. Compared to the results at full resolution, the sacrifice in quality is usually small for 8× downsampling, as shown in Figure 13. In comparison with BGU-fast, our method achieves faster speed and better quality.

[Figure 13: (a) Source (3840×2160); (b) full res. (<5fps); (c) BGU-fast 8× (∼20fps); (d) GLU− 8× (>30fps)]

Fig. 13. Real-time video matting with BackgroundMattingV2 [Lin et al. 2021], which achieves less than 5fps on a GTX1650 GPU. With the acceleration of our method, it easily achieves real-time speed.

[Figure 14: (a) Source; (b) Reference; (c) GLU 8×]

Fig. 14. Failure case of our method. New edges that do not appear in the source image cannot be well recovered.

[Figure 15: (a) Source; (b) Reference; (c) GLU 8×]

Fig. 15. Our method may produce unsmooth artifacts when the source and target images have very different pixel affinities.

[Figure 16: rows image2Vangogh, apple2orange, horse2zebra; columns (a) source, (b) reference, (c) BGU, (d) GLU]

Fig. 16. Examples of CycleGAN style transfer [Zhu et al. 2017], which may introduce dramatic changes to local image structures. In this case, our method may produce unsmooth artifacts, while BGU may smooth out the new structures. Since CycleGAN can only output 256×256 fixed-size results, the reference here is the low-resolution target image.

6 LIMITATIONS

The same as JBU and BGU, a basic assumption of our method is that the source and target images have almost the same local affinity, i.e. pixels with more similar colors in the source image should have more similar colors in the target output. Therefore, our method is not suitable for applications that may introduce new edges in the target image. Figure 14 shows such an example. The affinity of new edges differs a lot from the source and is therefore outside the scope of the joint optimization. In a more general case, our method may produce unsmooth edges due to different local affinities. As shown in Figure 15, since the object has a color similar to the background, our method fails to accurately recover the sharp matte edges.

Note that unlike adding new edges, removing edges does not usually cause problems for our method, because the resulting pixels can still be interpolated from neighboring pixels, so applications such as matting and smoothing can be well supported. In contrast, BGU may insist on preserving the edges and local structures of the source image, so for edge-removing applications it may produce artifacts, as demonstrated in Figure 8.

Due to the limitation in handling new edges, our method is not suitable for applications that may drastically change the local image structures, such as recent learning-based style transfer methods [Park et al. 2020; Zhu et al. 2017]. This is a common limitation of universal guided upsampling methods, including JBU and BGU. Figure 16 shows some examples. As shown, for cases such as Horse2Zebra, BGU may completely ignore new edges, while our method produces unsmooth artifacts. In fact, since there is no guidance information for new edges in the source image, good results cannot be obtained without domain-specific prior knowledge, so learning-based approaches should be preferred in such cases.

For real applications, a practical issue is to find an image operator that is suitable for low-resolution processing. As demonstrated in Figure 10, the input image scale may have a great effect on the output of some image processing methods. For example, we found that BackgroundMattingV2 produces significantly more errors when applied to 8× downsampled inputs, because it is originally designed for high-resolution video. Another example is optical flow estimation [Teed and Deng 2020], whose accuracy is greatly influenced by image resolution and which requires special treatment after upsampling. How to adapt such methods for better low-resolution image processing remains an open problem.

7 CONCLUSION

We propose a simple yet effective guided upsampling method, which represents each high-resolution pixel as a linear interpolation of two low-resolution pixels. Transition edges and smooth variations in natural images can be well represented. The upsampling parameters and the downscaled input image are jointly optimized in order to minimize the upsampling error. We reveal and discuss the connections with previous methods. In particular, our method can be considered as a special case of JBU with optimized weights, and it also implicitly exploits pixel-level local color transformations. These properties enable it to overcome the blurring and bleeding artifacts of previous approaches.

We demonstrate the advantages of our method with a wide range of image operators. For interactive editing tasks, our approach enables time-costly operators to achieve instant feedback without hardware acceleration. Real-time video processing can also be enabled easily for high-resolution inputs. Finally, our work shows that a high-resolution image can be well represented by simple linear interpolation of its downscaled counterpart. Similar to the bilateral filter and local color transformations, the power of such a representation can be further explored in related tasks such as learning-based image processing [Gharbi et al. 2017], super-resolution [Sun and Chen 2020], segmentation [Mazzini 2018], etc.

8 ACKNOWLEDGMENTS

The authors would like to thank the anonymous reviewers for their valuable comments and suggestions. This work is supported by the National Key R&D Program of China under grant (2022YFB3303200), the Natural Science Foundation of China (62072284), and the Center-initiated Research Project of Zhejiang Lab (2021NB0AL01).

REFERENCES

Eirikur Agustsson and Radu Timofte. 2017. NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops.
Codruta O Ancuti, Cosmin Ancuti, Radu Timofte, Luc Van Gool, Lei Zhang, and Ming-Hsuan Yang. 2019. NTIRE 2019 image dehazing challenge report. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops.
Mathieu Aubry, Sylvain Paris, Samuel W Hasinoff, Jan Kautz, and Frédo Durand. 2014. Fast local laplacian filters: Theory and applications. ACM Transactions on Graphics (TOG) 33, 5 (2014), 1–14.
Jonathan T Barron, Andrew Adams, YiChang Shih, and Carlos Hernández. 2015. Fast bilateral-space stereo for synthetic defocus. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4466–4474.
Adrien Bousseau, Sylvain Paris, and Frédo Durand. 2009. User-assisted intrinsic images. In ACM SIGGRAPH Asia 2009 Papers. 1–10.
Lucy Chai, Michaël Gharbi, Eli Shechtman, Phillip Isola, and Richard Zhang. 2022. Any-Resolution Training For High-Resolution Image Synthesis. In Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XVI. Springer-Verlag, Berlin, Heidelberg, 170–188.
Jiawen Chen, Andrew Adams, Neal Wadhwa, and Samuel W Hasinoff. 2016. Bilateral guided upsampling. ACM Transactions on Graphics (TOG) 35, 6 (2016), 1–8.
Qifeng Chen, Dingzeyu Li, and Chi-Keung Tang. 2013. KNN matting. IEEE Transactions on Pattern Analysis and Machine Intelligence 35, 9 (2013), 2175–2188.
Yutong Dai, Hao Lu, and Chunhua Shen. 2021. Learning Affinity-Aware Upsampling for Deep Image Matting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 6841–6850.
Frédo Durand and Julie Dorsey. 2002. Fast bilateral filtering for the display of high-dynamic-range images. In Proceedings of the 29th Annual Conference on Computer Graphics and Interactive Techniques. 257–266.
Zeev Farbman, Raanan Fattal, Dani Lischinski, and Richard Szeliski. 2008. Edge-preserving decompositions for multi-scale tone and detail manipulation. ACM Transactions on Graphics (TOG) 27, 3 (2008), 1–10.
Michaël Gharbi, Jiawen Chen, Jonathan T Barron, Samuel W Hasinoff, and Frédo Durand. 2017. Deep bilateral learning for real-time image enhancement. ACM Transactions on Graphics (TOG) 36, 4 (2017), 1–12.
Michaël Gharbi, YiChang Shih, Gaurav Chaurasia, Jonathan Ragan-Kelley, Sylvain Paris, and Frédo Durand. 2015. Transform recipes for efficient cloud photo enhancement. ACM Transactions on Graphics (TOG) 34, 6 (2015), 1–12.
Kaiming He, Jian Sun, and Xiaoou Tang. 2012. Guided image filtering. IEEE Transactions on Pattern Analysis and Machine Intelligence 35, 6 (2012), 1397–1409.
Satoshi Iizuka, Edgar Simo-Serra, and Hiroshi Ishikawa. 2016. Let there be color! Joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification. ACM Transactions on Graphics (TOG) 35, 4 (2016), 1–11.
Michael Kazhdan and Hugues Hoppe. 2008. Streaming multigrid for gradient-domain operations on large images. ACM Transactions on Graphics (TOG) 27, 3 (2008), 1–10.
Heewon Kim, Myungsub Choi, Bee Lim, and Kyoung Mu Lee. 2018. Task-aware image downscaling. In Proceedings of the European Conference on Computer Vision (ECCV). 399–414.
Johannes Kopf, Michael F Cohen, Dani Lischinski, and Matt Uyttendaele. 2007. Joint bilateral upsampling. ACM Transactions on Graphics (TOG) 26, 3 (2007), 96–es.
Johannes Kopf, Ariel Shamir, and Pieter Peers. 2013. Content-adaptive image downscaling. ACM Transactions on Graphics (TOG) 32, 6 (2013), 1–8.
Andreas Koschan and Mongi Abidi. 2005. Detection and classification of edges in color images. IEEE Signal Processing Magazine 22, 1 (2005), 64–73.
Anat Levin, Dani Lischinski, and Yair Weiss. 2004. Colorization using optimization. In ACM SIGGRAPH 2004 Papers. 689–694.
Anat Levin, Dani Lischinski, and Yair Weiss. 2007. A closed-form solution to natural image matting. IEEE Transactions on Pattern Analysis and Machine Intelligence 30, 2 (2007), 228–242.
Boyi Li, Xiulian Peng, Zhangyang Wang, Jizheng Xu, and Dan Feng. 2017. AOD-Net: All-in-one dehazing network. In Proceedings of the IEEE International Conference on Computer Vision. 4770–4778.
Ping Li, Hanqiu Sun, Jianbing Shen, and Chen Huang. 2012. HDR image rerendering using GPU-based processing. International Journal of Image and Graphics 12, 01 (2012), 1250007.
Yijun Li, Ming-Yu Liu, Xueting Li, Ming-Hsuan Yang, and Jan Kautz. 2018. A closed-form solution to photorealistic image stylization. In Proceedings of the European Conference on Computer Vision (ECCV). 453–468.
Shanchuan Lin, Andrey Ryabtsev, Soumyadip Sengupta, Brian L Curless, Steven M Seitz, and Ira Kemelmacher-Shlizerman. 2021. Real-time high-resolution background matting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 8762–8771.
Hengsheng Liu and Jianbing Shen. 2011. Tone mapping using intensity layer decomposition-based fast trilateral filter. Journal of Computer-Aided Design & Computer Graphics 23, 1 (2011), 85–90.
Davide Mazzini. 2018. Guided upsampling network for real-time semantic segmentation. arXiv preprint arXiv:1807.07466 (2018).
Dat Ngo, Seungmin Lee, and Bongsoon Kang. 2020. Nonlinear Unsharp Masking Algorithm. In 2020 International Conference on Electronics, Information, and Communication (ICEIC). IEEE, 1–6.
A Cengiz Oeztireli and Markus Gross. 2015. Perceptually based downscaling of images. ACM Transactions on Graphics (TOG) 34, 4 (2015), 1–10.
Jinshan Pan, Jiangxin Dong, Jimmy S Ren, Liang Lin, Jinhui Tang, and Ming-Hsuan Yang. 2019. Spatially variant linear representation models for joint filtering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1702–1711.
Sylvain Paris and Frédo Durand. 2006. A fast approximation of the bilateral filter using a signal processing approach. In European Conference on Computer Vision. Springer, 568–580.
Taesung Park, Jun-Yan Zhu, Oliver Wang, Jingwan Lu, Eli Shechtman, Alexei A. Efros, and Richard Zhang. 2020. Swapping Autoencoder for Deep Image Manipulation. In Advances in Neural Information Processing Systems.
Jonathan Ragan-Kelley, Andrew Adams, Sylvain Paris, Marc Levoy, Saman Amarasinghe, and Frédo Durand. 2012. Decoupling algorithms from schedules for easy optimization of image processing pipelines. ACM Transactions on Graphics (TOG) 31, 4 (2012), 1–12.
Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Frédo Durand, and Saman Amarasinghe. 2013. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. ACM SIGPLAN Notices 48, 6 (2013), 519–530.
Christoph Rhemann, Carsten Rother, Jue Wang, Margrit Gelautz, Pushmeet Kohli, and Pamela Rott. 2009. A perceptually motivated online benchmark for image matting. In 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 1826–1833.
Zenglin Shi, Yunlu Chen, Efstratios Gavves, Pascal Mettes, and Cees GM Snoek. 2021. Unsharp Mask Guided Filtering. IEEE Transactions on Image Processing 30 (2021), 7472–7485.
Wanjie Sun and Zhenzhong Chen. 2020. Learned image downscaling for upscaling using content adaptive resampler. IEEE Transactions on Image Processing 29 (2020), 4027–4040.
Zachary Teed and Jia Deng. 2020. RAFT: Recurrent all-pairs field transforms for optical flow. In European Conference on Computer Vision. Springer, 402–419.
Carlo Tomasi and Roberto Manduchi. 1998. Bilateral filtering for gray and color images. In Sixth International Conference on Computer Vision (IEEE Cat. No. 98CH36271). IEEE, 839–846.
Tobi Vaudrey and Reinhard Klette. 2009. Fast trilateral filtering. In International Conference on Computer Analysis of Images and Patterns. Springer, 541–548.
Jue Wang and Michael F Cohen. 2007. Optimized color sampling for robust matting. In 2007 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 1–8.
Nicolas Weber, Michael Waechter, Sandra C Amend, Stefan Guthe, and Michael Goesele. 2016. Rapid, detail-preserving image downscaling. ACM Transactions on Graphics (TOG) 35, 6 (2016), 1–6.
Hao Wu and Dan Xu. 2009. Improved Poisson image editing and its implementation on GPU. In 2009 IEEE 10th International Conference on Computer-Aided Industrial Design & Conceptual Design. IEEE, 1044–1048.
Huikai Wu, Shuai Zheng, Junge Zhang, and Kaiqi Huang. 2018. Fast end-to-end trainable guided filter. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1838–1847.
Xide Xia, Tianfan Xue, Wei-sheng Lai, Zheng Sun, Abby Chang, Brian Kulis, and Jiawen Chen. 2021. Real-time localized photorealistic video style transfer. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 1089–1098.
Xide Xia, Meng Zhang, Tianfan Xue, Zheng Sun, Hui Fang, Brian Kulis, and Jiawen Chen. 2020. Joint bilateral learning for real-time universal photorealistic style transfer. In European Conference on Computer Vision. Springer, 327–342.
Mingqing Xiao, Shuxin Zheng, Chang Liu, Yaolong Wang, Di He, Guolin Ke, Jiang Bian, Zhouchen Lin, and Tie-Yan Liu. 2020. Invertible image rescaling. In European Conference on Computer Vision. Springer, 126–144.
Li Xu, Cewu Lu, Yi Xu, and Jiaya Jia. 2011. Image smoothing via L0 gradient minimization. In Proceedings of the 2011 SIGGRAPH Asia Conference. 1–12.
Hui Yin, Yuanhao Gong, and Guoping Qiu. 2019. Side window filtering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 8758–8766.
Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. 2017. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. In Computer Vision (ICCV), 2017 IEEE International Conference on.