
CS485/585

Deep Generative Networks


Bilkent University
Content
• Reconstruct an image
• Manipulate an image
• Find interpretable directions in the latent space
Image2StyleGAN
• The goal is to edit an existing photograph with StyleGAN.
• StyleGAN generates novel faces.
• First, we need to find the latent code that can generate an existing photograph.
Latent Space Embedding
• Learn an encoder.
– Fast.
– Limited ability to generalize beyond the training dataset.
• Select a random initial latent code and optimize it using gradient descent.
– Slow.
– Adapts to novel images.
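The optimization route can be sketched in a few lines. This is a minimal toy in which a fixed random linear map stands in for StyleGAN's pretrained generator (purely an assumption for illustration; the real generator is a deep convolutional network, but the loop has the same shape):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-in for the pretrained generator: a fixed linear map
# from a 512-d latent code to "pixels".
G = rng.standard_normal((1024, 512)) / np.sqrt(512)

target = G @ rng.standard_normal(512)  # pretend this is the photograph

w = rng.standard_normal(512)  # random initial latent code
lr = 0.1
losses = []
for _ in range(200):
    residual = G @ w - target
    losses.append(float(residual @ residual))  # L2 reconstruction loss
    grad = 2.0 * G.T @ residual                # backprop through the generator
    w -= lr * grad                             # gradient-descent step on w

# losses[-1] should be far smaller than losses[0]
```

In practice the loss combines a per-pixel term with a perceptual term, and the optimizer is typically Adam rather than plain gradient descent.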
Image2StyleGAN

Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?, ICCV 2019
Image2StyleGAN
• Start with a random latent code and backpropagate the target loss.
Loss Functions
• The low-level similarity between two images is measured in pixel space with L1/L2 loss functions.
• Per-pixel losses do not capture perceptual differences between the output and target image.
– Two identical images offset from each other by one pixel yield a high per-pixel loss, despite their high perceptual similarity.
• Perceptual Loss

Perceptual Losses for Real-Time Style Transfer and Super-Resolution, J. Johnson, A. Alahi, L. Fei-Fei, ECCV 2016
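The one-pixel-offset failure case can be checked numerically. Here a random array stands in for an image (a hypothetical toy, but the metric behaves the same way):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((64, 64))          # toy grayscale "image"
shifted = np.roll(img, 1, axis=1)   # identical content, 1-pixel offset

l2_same = np.mean((img - img) ** 2)       # zero, as expected
l2_shift = np.mean((img - shifted) ** 2)  # large, despite identical content
```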
Perceptual Loss
• Perceptual loss functions are based not on differences between pixels but on differences between high-level image feature representations extracted from pretrained convolutional neural networks.
• Key insight: convolutional neural networks pretrained for image classification have already learned to encode the perceptual and semantic information we would like to measure in our loss functions.

Perceptual Loss

Let φj(x) be the activations of the jth layer of the network φ when processing the image x; if j is a convolutional layer, then φj(x) is a feature map of shape Cj × Hj × Wj.
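As a sketch of the feature reconstruction loss, the toy φ below is a fixed bank of 3×3 convolutions standing in for a layer of the pretrained VGG-16 (an assumption for illustration only):

```python
import numpy as np

def phi(x, kernels):
    """Toy stand-in for phi_j: a bank of 3x3 convolutions producing a
    C_j x H_j x W_j feature map (the real phi is a pretrained VGG-16)."""
    C = len(kernels)
    H, W = x.shape[0] - 2, x.shape[1] - 2
    feats = np.empty((C, H, W))
    for c in range(C):
        for i in range(H):
            for j in range(W):
                feats[c, i, j] = np.sum(x[i:i + 3, j:j + 3] * kernels[c])
    return feats

rng = np.random.default_rng(0)
kernels = rng.standard_normal((4, 3, 3))
x = rng.random((16, 16))                      # target image
y = x + 0.01 * rng.standard_normal((16, 16))  # slightly perturbed output

fx, fy = phi(x, kernels), phi(y, kernels)
# Feature reconstruction loss: squared L2 distance between the two
# feature maps, normalized by C_j * H_j * W_j.
feat_loss = np.sum((fx - fy) ** 2) / fx.size
```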

Perceptual Loss

Optimization finds an image y′ that minimizes the feature reconstruction loss for several layers j of the pretrained VGG-16 loss network φ. When reconstructing from higher layers, image content and overall spatial structure are preserved, but color, texture, and exact shape are not.

Which Latent Space to Choose?

1) The initial latent space Z.
2) The intermediate latent space W.
3) The extended latent space W+: a concatenation of 18 different 512-dimensional w vectors, one for each layer of the StyleGAN architecture that can receive input via AdaIN.
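The three spaces differ in shape and in how much freedom the optimizer gets. A quick sketch of the dimensions (assuming the 18-layer, 1024×1024 StyleGAN):

```python
import numpy as np

rng = np.random.default_rng(0)

z = rng.standard_normal(512)   # 1) initial latent space Z
w = rng.standard_normal(512)   # 2) intermediate space W: one shared vector
                               #    produced by the mapping network
w_plus = np.tile(w, (18, 1))   # 3) extended space W+: one 512-d w per AdaIN
                               #    layer; initialized shared, but each row
                               #    can be optimized independently
```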

Image2StyleGAN - Morphing

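Morphing between two embedded faces reduces to linear interpolation of their latent codes, with each interpolated code decoded by the generator. A minimal sketch:

```python
import numpy as np

def morph(w_a, w_b, alpha):
    # Linear interpolation between two embedded codes; decoding each
    # interpolated code with the generator yields one morph frame.
    return (1.0 - alpha) * w_a + alpha * w_b

rng = np.random.default_rng(0)
w_a = rng.standard_normal((18, 512))  # W+ code of the first face
w_b = rng.standard_normal((18, 512))  # W+ code of the second face
frames = [morph(w_a, w_b, a) for a in np.linspace(0.0, 1.0, 8)]
```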
Image2StyleGAN – Expression Transfer

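Expression transfer can be sketched as a linear edit in W+: add the latent difference between an expressive and a neutral embedding of a source face to the target embedding (λ below is a hypothetical edit-strength parameter):

```python
import numpy as np

rng = np.random.default_rng(0)
w_target = rng.standard_normal((18, 512))   # embedded target face
w_neutral = rng.standard_normal((18, 512))  # embedded source, neutral
w_expr = rng.standard_normal((18, 512))     # embedded source, with expression

lam = 1.0  # edit strength (hypothetical value)
w_edit = w_target + lam * (w_expr - w_neutral)  # transfer the expression
```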
Image2StyleGAN – Style Transfer

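Style transfer in this setting can be sketched as a layer-wise crossover of the two embedded W+ codes: coarse layers keep the content code, fine layers take the style code (the split layer below is a hypothetical choice):

```python
import numpy as np

rng = np.random.default_rng(0)
w_content = rng.standard_normal((18, 512))  # embedded content image
w_style = rng.standard_normal((18, 512))    # embedded style image

split = 9  # hypothetical crossover layer
w_mix = np.vstack([w_content[:split], w_style[split:]])
```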
In-Domain GAN Inversion

In-Domain GAN Inversion for Real Image Editing, ECCV 2020



In-Domain GAN Inversion
• Inversion methods typically focus on reconstructing the target image by pixel values.
• Is that enough? What if the inverted code is not aligned with the semantic domain of the latent space?


In-Domain GAN Inversion
• Inversion methods typically focus on reconstructing the target image by pixel values, yet fail to land the inverted code in the semantic domain of the original latent space.
• As a result, the reconstructed image cannot well support semantic editing through varying the inverted code.
• To solve this problem, the in-domain GAN inversion approach not only faithfully reconstructs the input image but also ensures that the inverted code is semantically meaningful for editing.


In-Domain GAN Inversion
• First, train a domain-guided encoder.
• Then, refine the code by optimization.


In-Domain GAN Inversion
• It is hard to learn a perfect reverse mapping with an encoder alone, due to its limited representation capability.
• Therefore, even though the inverted code from the domain-guided encoder can reconstruct the input image well with the pretrained generator, and is itself semantically meaningful, we still need to refine the code so that it better fits the target image at the pixel level.


In-Domain GAN Inversion
• The domain-guided encoder provides a good starting point, which keeps the code from getting stuck in a poor local minimum.
• The encoder is also used as a regularizer during optimization, to preserve the latent code within the semantic domain of the generator.


In-Domain GAN Inversion

[29] Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space? In: ICCV (2019)
[36] Generative Visual Manipulation on the Natural Image Manifold. In: ECCV (2016)

High-fidelity Image Inversion

High-Fidelity GAN Inversion for Image Attribute Editing, CVPR 2022




Next Class – finding directions

