Johnson ECCV16 Supplementary
1 Network Architectures
Our style transfer networks use the architecture shown in Table 1 and our super-
resolution networks use the architecture shown in Table 2. In these tables,
“C × H × W conv” denotes a convolutional layer with C filters of size H × W,
immediately followed by spatial batch normalization [1] and a ReLU nonlinearity.
Our residual blocks each contain two 3 × 3 convolutional layers with the same
number of filters in both layers. We use the residual block design of Gross and
Wilber [2] (shown in Figure 1), which differs from that of He et al. [3] in that the
ReLU nonlinearity following the addition is removed; this modified design was
found in [2] to perform slightly better for image classification.
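The difference between the two block designs can be sketched as follows (a minimal NumPy sketch; the function names are illustrative and `f` stands for the conv–batch norm–ReLU–conv–batch norm branch, not the paper's implementation):

```python
import numpy as np

def he_block(x, f):
    # He et al. [3]: ReLU is applied after the shortcut addition,
    # so the block's output is always non-negative.
    return np.maximum(x + f(x), 0)

def gross_wilber_block(x, f):
    # Gross & Wilber [2]: the post-addition ReLU is removed,
    # so the residual branch can also subtract from the identity.
    return x + f(x)
```

With the post-addition ReLU removed, the block is a plain identity-plus-residual sum, which [2] found to work slightly better for classification.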
For style transfer, we found that standard zero-padded convolutions resulted
in severe artifacts around the borders of the generated image. We therefore
remove padding from the convolutions in residual blocks. A 3 × 3 convolution
with no padding reduces the size of a feature map by 1 pixel on each side, so in
this case the identity connection of the residual block performs a center crop on
the input feature map. We also add spatial reflection padding to the beginning
of the network so that the input and output of the network have the same size.
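The unpadded residual block with its center-cropped identity connection can be sketched as follows (a minimal NumPy sketch with batch normalization omitted; `conv3x3_valid` and the weight shapes are illustrative, not the paper's implementation):

```python
import numpy as np

def conv3x3_valid(x, w):
    # x: (C_in, H, W); w: (C_out, C_in, 3, 3); stride 1, no padding,
    # so each spatial dimension shrinks by 2 (1 pixel on each side).
    c_out = w.shape[0]
    H, W = x.shape[1] - 2, x.shape[2] - 2
    out = np.zeros((c_out, H, W))
    for i in range(3):
        for j in range(3):
            out += np.tensordot(w[:, :, i, j], x[:, i:i + H, j:j + W], axes=1)
    return out

def residual_block(x, w1, w2):
    # Two unpadded 3x3 convs shrink each spatial dim by 4 in total,
    # so the identity branch center-crops the input to match.
    y = np.maximum(conv3x3_valid(x, w1), 0)  # conv + ReLU (batch norm omitted)
    y = conv3x3_valid(y, w2)                 # second conv; no ReLU after the add
    identity = x[:, 2:-2, 2:-2]              # center crop by 2 pixels per side
    return y + identity
```

The crop makes the two branches agree in size without any zero padding inside the block, which is what avoids the border artifacts described above.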
Table 2. Network architectures used for ×4 and ×8 super-resolution.

×4 network:
Layer                          Activation size
Input                          3 × 72 × 72
64 × 9 × 9 conv, stride 1      64 × 72 × 72
Residual block, 64 filters     64 × 72 × 72
Residual block, 64 filters     64 × 72 × 72
Residual block, 64 filters     64 × 72 × 72
Residual block, 64 filters     64 × 72 × 72
64 × 3 × 3 conv, stride 1/2    64 × 144 × 144
64 × 3 × 3 conv, stride 1/2    64 × 288 × 288
3 × 9 × 9 conv, stride 1       3 × 288 × 288

×8 network:
Layer                          Activation size
Input                          3 × 36 × 36
64 × 9 × 9 conv, stride 1      64 × 36 × 36
Residual block, 64 filters     64 × 36 × 36
Residual block, 64 filters     64 × 36 × 36
Residual block, 64 filters     64 × 36 × 36
Residual block, 64 filters     64 × 36 × 36
64 × 3 × 3 conv, stride 1/2    64 × 72 × 72
64 × 3 × 3 conv, stride 1/2    64 × 144 × 144
64 × 3 × 3 conv, stride 1/2    64 × 288 × 288
3 × 9 × 9 conv, stride 1       3 × 288 × 288
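The activation sizes in Table 2 follow mechanically from the strides, since a "stride 1/2" (fractionally strided) convolution doubles H and W. A sketch of this size bookkeeping (the function name and layer accounting are illustrative):

```python
def sr_activation_sizes(n_upsample, in_hw):
    """Trace spatial sizes through the super-resolution network:
    a padded 9x9 stride-1 conv, four residual blocks, n_upsample
    stride-1/2 convs that each double the resolution, and a final
    padded 9x9 stride-1 conv."""
    sizes = [in_hw]        # input
    sizes.append(in_hw)    # 9x9 conv, stride 1, keeps the size
    sizes += [in_hw] * 4   # residual blocks keep the size
    hw = in_hw
    for _ in range(n_upsample):
        hw *= 2            # stride-1/2 conv doubles H and W
        sizes.append(hw)
    sizes.append(hw)       # final 9x9 conv, stride 1
    return sizes
```

For example, the ×4 network (`n_upsample=2`, 72-pixel input) and the ×8 network (`n_upsample=3`, 36-pixel input) both end at 288 × 288, matching the table.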
[Figure 1 diagram: residual block (3 × 3 conv → batch norm → ReLU → 3 × 3 conv → batch norm → addition with identity) alongside an equivalent convolutional block (3 × 3 conv → batch norm → ReLU → 3 × 3 conv → batch norm → ReLU).]
Fig. 1. Residual block used in our networks and an equivalent convolutional block.
3 Super-Resolution Metrics
Table 3. Quantitative results for super-resolution using FSIM [5] and VIF [6].
5 Super-Resolution Examples
We show additional examples of ×4 single-image super-resolution in Figure 4
and additional examples of ×8 single-image super-resolution in Figure 3.
[Figure labels: per-example comparisons of Ground Truth, Bicubic, Ours (ℓpixel), SRCNN [7], and Ours (ℓfeat), with PSNR / SSIM for each method:]

Metric         Bicubic           Ours (ℓpixel)     SRCNN [7]         Ours (ℓfeat)
PSNR / SSIM    30.18 / 0.8737    29.96 / 0.8760    32.00 / 0.9026    27.80 / 0.8053
PSNR / SSIM    29.84 / 0.8144    29.69 / 0.8113    31.20 / 0.8394    28.18 / 0.7757
PSNR / SSIM    32.48 / 0.8575    32.30 / 0.8568    33.49 / 0.8741    30.85 / 0.8125
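For reference, PSNR values like those in the figure labels can be computed as below (a minimal sketch assuming images scaled to [0, max_val]; SSIM is more involved and omitted):

```python
import numpy as np

def psnr(img, ref, max_val=1.0):
    # Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE).
    mse = np.mean((img.astype(np.float64) - ref.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Higher PSNR rewards low per-pixel error, which is why the ℓpixel and SRCNN models score well on it even when the ℓfeat results look sharper perceptually.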
References
1. Ioffe, S., Szegedy, C.: Batch normalization: Accelerating deep network training by
reducing internal covariate shift. In: ICML. (2015)
2. Gross, S., Wilber, M.: Training and investigating residual nets.
http://torch.ch/blog/2016/02/04/resnets.html (2016)
3. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition.
In: CVPR. (2016)
4. Kingma, D., Ba, J.: Adam: A method for stochastic optimization. In: ICLR. (2015)
5. Zhang, L., Zhang, L., Mou, X., Zhang, D.: FSIM: A feature similarity index for
image quality assessment. IEEE Transactions on Image Processing 20(8) (2011)
2378–2386
6. Sheikh, H.R., Bovik, A.C.: Image information and visual quality. IEEE Transac-
tions on Image Processing 15(2) (2006) 430–444
7. Dong, C., Loy, C.C., He, K., Tang, X.: Learning a deep convolutional network for
image super-resolution. In: ECCV. (2014)
8. Bevilacqua, M., Roumy, A., Guillemot, C., Alberi-Morel, M.L.: Low-complexity
single-image super-resolution based on nonnegative neighbor embedding. (2012)
9. Zeyde, R., Elad, M., Protter, M.: On single image scale-up using sparse-
representations. In: Curves and Surfaces. Springer (2010) 711–730
10. Huang, J.B., Singh, A., Ahuja, N.: Single image super-resolution from transformed
self-exemplars. In: CVPR. (2015)