Need For Upsampling in GANs
The GAN architecture comprises a generator and a discriminator model, both of which are
implemented as deep convolutional neural networks. The discriminator is responsible for
classifying images as either real (from the domain) or fake (generated). The generator is
responsible for generating new plausible examples from the problem domain.
The generator works by taking a random point from the latent space as input and outputting
a complete image, in a one-shot manner.
A traditional convolutional neural network for image classification and related tasks will use
pooling layers to downsample input images. For example, an average pooling or max
pooling layer will halve the feature maps from a convolutional layer on each dimension,
resulting in an output that is one quarter the area of the input.
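The halving can be checked with a small NumPy sketch. This is not how a deep learning framework implements pooling; it assumes a 4×4 single-channel feature map and a 2×2 average pooling window with a stride of 2, and the names `feature_map` and `pooled` are illustrative only.

```python
import numpy as np

# hypothetical 4x4 single-channel feature map (values chosen for illustration)
feature_map = np.arange(16, dtype=float).reshape(4, 4)

# 2x2 average pooling with stride 2: split the map into 2x2 blocks
# and average the values inside each block
pooled = feature_map.reshape(2, 2, 2, 2).mean(axis=(1, 3))

# each dimension is halved, so the area shrinks to one quarter
print(feature_map.shape, '->', pooled.shape)
print(pooled.size / feature_map.size)
```

The ratio of output size to input size is 0.25, matching the one-quarter area described above.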
Both of these layers can be used in a GAN to perform the required upsampling operation,
transforming a small input into a large image output.
In the following sections, we will take a closer look at each and develop an intuition for how
they work so that we can use them effectively in our GAN models.
For example, an input image with the shape 2×2 would be output as 4×4.
         1, 2
Input = (3, 4)

          1, 1, 2, 2
Output = (1, 1, 2, 2)
          3, 3, 4, 4
          3, 3, 4, 4
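The same row and column repetition can be reproduced outside of any model. The following is a minimal sketch using NumPy's `repeat` function rather than a Keras layer; the name `upsampled` is illustrative only.

```python
import numpy as np

# the 2x2 input from the illustration above
X = np.asarray([[1, 2],
                [3, 4]])

# repeat each row twice (axis 0), then each column twice (axis 1)
upsampled = np.repeat(np.repeat(X, 2, axis=0), 2, axis=1)

print(upsampled.shape)
print(upsampled)
```

Each input value ends up occupying a 2×2 block in the 4×4 result.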
The UpSampling2D layer can be added to a convolutional neural network; it repeats the rows
and columns provided as input in the output. For example:
...
# define model
model = Sequential()
model.add(UpSampling2D())
We can demonstrate the behavior of this layer with a simple contrived example.
First, we can define a contrived input image that is 2×2 pixels. We can use specific values
for each pixel so that after upsampling, we can see exactly what effect the operation had on
the input.
...
# define input data
X = asarray([[1, 2],
             [3, 4]])
# show input data for context
print(X)
Once the image is defined, we must add a channel dimension (e.g. grayscale) and also a
sample dimension (e.g. we have 1 sample) so that we can pass it as input to the model.
...
# reshape input data into one sample with one channel
X = X.reshape((1, 2, 2, 1))
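To make the effect of the reshape concrete, the sketch below prints the shapes before and after. It assumes the channels-last layout (samples, rows, columns, channels), which is the Keras default; the name `X4` is illustrative only.

```python
from numpy import asarray

X = asarray([[1, 2],
             [3, 4]])

# add a leading sample dimension and a trailing channel dimension
X4 = X.reshape((1, 2, 2, 1))
print(X.shape, '->', X4.shape)

# the pixel values are unchanged: indexing the single sample and
# the single channel recovers the original 2x2 image
print(X4[0, :, :, 0])
```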
The model has only the UpSampling2D layer which takes 2×2 grayscale images as input
directly and outputs the result of the upsampling operation.
...
# define model
model = Sequential()
model.add(UpSampling2D(input_shape=(2, 2, 1)))
# summarize the model
model.summary()
We can then use the model to make a prediction, that is, to upsample a provided input image.
...
# make a prediction with the model
yhat = model.predict(X)
The output will have four dimensions, like the input; therefore, we can convert it back to a
4×4 array to make it easier to review the result.
...
# reshape output to remove channel to make printing easier
yhat = yhat.reshape((4, 4))
# summarize output
print(yhat)
Tying all of this together, the complete example of using the UpSampling2D layer in Keras
is provided below.
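# example of using the upsampling layer
from numpy import asarray
from keras.models import Sequential
from keras.layers import UpSampling2D
# define input data
X = asarray([[1, 2],
             [3, 4]])
# show input data for context
print(X)
# reshape input data into one sample with one channel
X = X.reshape((1, 2, 2, 1))
# define model
model = Sequential()
model.add(UpSampling2D(input_shape=(2, 2, 1)))
# summarize the model
model.summary()
# make a prediction with the model
yhat = model.predict(X)
# reshape output to remove channel to make printing easier
yhat = yhat.reshape((4, 4))
# summarize output
print(yhat)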
Running the example first creates and summarizes our 2×2 input data.
Next, the model is summarized. We can see that it will output a 4×4 result as we expect,
and importantly, the layer has no parameters or model weights. This is because it is not
learning anything; it is just doubling the input.
Finally, the model is used to upsample our input, resulting in a doubling of each row and
column for our input data, as we expected.
[[1 2]
 [3 4]]

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
up_sampling2d_1 (UpSampling2D) (None, 4, 4, 1)         0
=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________


[[1. 1. 2. 2.]
 [1. 1. 2. 2.]
 [3. 3. 4. 4.]
 [3. 3. 4. 4.]]
By default, the UpSampling2D layer will double each input dimension. This is defined by the
'size' argument, which is set to the tuple (2, 2).
You may want to use different factors on each dimension, such as doubling the height and
tripling the width. This could be achieved by setting the 'size' argument to (2, 3), where the
first factor applies to rows and the second to columns. The result of applying this operation
to a 2×2 image would be a 4×6 output image (i.e. 2×2 rows and 2×3 columns). For
example:
...
# example of using different scale factors for each dimension
model.add(UpSampling2D(size=(2, 3)))
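The resulting shape can be previewed without building a model. The sketch below mimics the repetition with NumPy's `repeat`, assuming the factors apply to rows first and then columns as in the Keras documentation for the 'size' argument; `row_factor`, `col_factor` and `upsampled_23` are illustrative names.

```python
import numpy as np

X = np.asarray([[1, 2],
                [3, 4]])

# mirrors size=(2, 3): first factor for rows, second factor for columns
row_factor, col_factor = 2, 3
upsampled_23 = np.repeat(np.repeat(X, row_factor, axis=0), col_factor, axis=1)

print(upsampled_23.shape)
print(upsampled_23)
```

The result has 4 rows and 6 columns, with each input value filling a block that is 2 rows tall and 3 columns wide.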
Additionally, by default, the UpSampling2D layer will use a nearest neighbor algorithm to fill
in the new rows and columns. This has the effect of simply repeating rows and columns, as
described, and is specified by the 'interpolation' argument set to 'nearest'.
Alternatively, a bilinear interpolation method can be used, which draws upon multiple
surrounding points. This can be specified by setting the 'interpolation' argument to 'bilinear'.
For example:
...
# example of using bilinear interpolation when upsampling
model.add(UpSampling2D(interpolation='bilinear'))
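To build an intuition for how bilinear interpolation differs from simple repetition, the following NumPy sketch interpolates along columns and then along rows. It is an illustration under stated assumptions, not the Keras implementation: it assumes a half-pixel-centre sampling convention with values clamped at the image edges, and the exact numbers produced by Keras may differ between versions. The function name `bilinear_upsample` is hypothetical.

```python
import numpy as np

def bilinear_upsample(image, factor=2):
    """Upsample a 2D array by linear interpolation on each axis in turn."""
    rows, cols = image.shape
    # output pixel centres expressed in input coordinates
    # (half-pixel-centre convention, an assumption of this sketch)
    r = (np.arange(rows * factor) + 0.5) / factor - 0.5
    c = (np.arange(cols * factor) + 0.5) / factor - 0.5
    # interpolate along the columns of every row; np.interp clamps
    # positions outside the input range to the edge values
    tmp = np.array([np.interp(c, np.arange(cols), row) for row in image])
    # interpolate along the rows of every (already widened) column
    out = np.array([np.interp(r, np.arange(rows), col) for col in tmp.T]).T
    return out

X = np.asarray([[1.0, 2.0],
                [3.0, 4.0]])
result = bilinear_upsample(X)
print(result.shape)
print(result)
```

Unlike the nearest neighbor result, the new values here are weighted blends of neighbouring input pixels, so the output changes gradually rather than in 2×2 blocks.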