
Module 2 - Variational Autoencoders

Basic Components of VAEs

●​ Encoder Network
●​ Converts input data into a probability distribution in latent space
●​ Maps high-dimensional data to lower-dimensional representations
●​ Produces the mean (μ) and standard deviation (σ) parameters of the latent distribution
●​ Latent Space
●​ Compressed, continuous representation of input data
●​ Usually follows a normal distribution
●​ Enables meaningful interpolation between data points
●​ Dimensionality is much lower than input space
●​ Decoder Network
●​ Reconstructs input data from latent space samples
●​ Maps latent representations back to original data space
●​ Learns to generate new data samples
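These three components can be made concrete with a short sketch. The following is a minimal, illustrative PyTorch implementation (not part of the original notes); the class name VAE and the layer sizes (a 784-dimensional input, 400 hidden units, a 20-dimensional latent space) are assumptions chosen for MNIST-like data.

import torch
import torch.nn as nn

class VAE(nn.Module):
    """Minimal VAE: encoder -> (mu, log_var) -> sampling -> decoder."""

    def __init__(self, input_dim=784, hidden_dim=400, latent_dim=20):
        super().__init__()
        # Encoder network: maps input to the parameters of the latent distribution
        self.enc = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.fc_mu = nn.Linear(hidden_dim, latent_dim)      # mean (mu)
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)  # log variance (log sigma^2)
        # Decoder network: maps latent samples back to the original data space
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim), nn.Sigmoid(),
        )

    def encode(self, x):
        h = self.enc(x)
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps with eps ~ N(0, I): the reparameterization trick
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + std * eps

    def decode(self, z):
        return self.dec(z)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar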

Architecture and Training of VAEs

1.​ Network Structure


●​ Encoder: Input → Hidden Layers → Latent Parameters (μ, σ)
●​ Sampling Layer: Uses the reparameterization trick (sketched after this list)
●​ Decoder: Latent Sample → Hidden Layers → Reconstructed Output
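The sampling layer relies on the reparameterization trick: the random draw z ~ N(μ, σ²) is rewritten as a deterministic function of μ, σ and an external noise term, so gradients can flow back through the encoder. A minimal standalone sketch, assuming a diagonal Gaussian posterior:

import torch

def reparameterize(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """Draw z ~ N(mu, sigma^2) as z = mu + sigma * eps with eps ~ N(0, I).

    All randomness lives in eps, so backpropagation can reach the encoder
    weights through mu and logvar.
    """
    std = torch.exp(0.5 * logvar)  # sigma = exp(log(sigma^2) / 2)
    eps = torch.randn_like(std)    # noise drawn independently of the parameters
    return mu + std * eps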
The basic architecture of a VAE is depicted in the diagram below:

2.​ Training Process


●​ Forward pass through encoder
●​ Sampling from latent distribution
●​ Reconstruction through decoder
●​ Backpropagation using loss function
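Putting the four steps together, one training iteration could look like the sketch below. The names model, data_loader, optimizer and loss_fn are placeholders for a VAE, a data iterator, an optimizer such as torch.optim.Adam, and a loss like the one sketched under Loss Function Components; this illustrates the flow only, not a prescribed implementation.

def train_epoch(model, data_loader, optimizer, loss_fn):
    """One epoch: forward pass, latent sampling, reconstruction, backpropagation."""
    model.train()
    total = 0.0
    for x, _ in data_loader:
        x = x.view(x.size(0), -1)        # flatten images into vectors
        optimizer.zero_grad()
        recon, mu, logvar = model(x)     # encoder -> sampling -> decoder
        loss = loss_fn(recon, x, mu, logvar)
        loss.backward()                  # backpropagate the combined loss
        optimizer.step()
        total += loss.item()
    return total / len(data_loader.dataset)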
The training process is depicted in the diagram given below:

Loss Function Components

1.​ Reconstruction Loss


●​ Measures how well the decoder reconstructs input
●​ Usually binary cross-entropy or mean squared error
2.​ KL Divergence Loss
●​ Ensures the latent space follows the desired prior distribution (typically a standard normal)
●​ Regularizes the latent space
●​ Prevents overfitting and enables generation
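Both terms can be written in a few lines. A commonly used form is sketched below, assuming Bernoulli (binary cross-entropy) outputs and a standard normal prior; the beta weight is an assumed hyperparameter for trading the two terms off (beta = 1 gives the standard VAE objective).

import torch
import torch.nn.functional as F

def vae_loss(recon_x, x, mu, logvar, beta=1.0):
    """Reconstruction loss plus beta-weighted KL divergence."""
    # Reconstruction term: how well the decoder reproduces the input
    recon = F.binary_cross_entropy(recon_x, x, reduction="sum")
    # KL(N(mu, sigma^2) || N(0, I)) = -0.5 * sum(1 + log sigma^2 - mu^2 - sigma^2)
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kld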
​ The Loss Function Components are depicted in the diagram given below:

Latent Space Representation and Inference

●​ Properties
●​ Continuous and smooth
●​ Semantically meaningful
●​ Supports interpolation
●​ Enables feature disentanglement
​ The Latent Space Representation is depicted in the diagram given below:
●​ Inference Process
●​ Encode input to get latent parameters
●​ Sample from latent distribution
●​ Generate new samples via decoder
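Both directions of use, encoding an input and generating new samples from the prior, can be sketched as follows. The model argument is assumed to expose encode, reparameterize and decode methods like the earlier sketch; the latent dimensionality of 20 is likewise an assumption.

import torch

@torch.no_grad()
def infer_latent(model, x):
    """Encode an input and draw a latent sample from the approximate posterior q(z | x)."""
    mu, logvar = model.encode(x)
    return model.reparameterize(mu, logvar)

@torch.no_grad()
def generate(model, num_samples=16, latent_dim=20):
    """Generate new samples by decoding z drawn from the prior N(0, I)."""
    z = torch.randn(num_samples, latent_dim)
    return model.decode(z)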

Applications in Image Generation

1.​ Image Synthesis


●​ Generate new, realistic images
●​ Interpolate between existing images
●​ Style transfer and modification
2.​ Image Manipulation
●​ Feature manipulation in latent space (see the sketch after this list)
●​ Attribute editing
●​ Image completion/inpainting
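Interpolation and attribute editing both reduce to vector arithmetic in the latent space, as sketched below. The model is assumed to be a trained VAE as before, and attribute_direction is a hypothetical latent vector (for example, an average "with attribute" code minus an average "without attribute" code) estimated from labelled examples.

import torch

@torch.no_grad()
def interpolate(model, x_a, x_b, steps=8):
    """Decode points on the straight line between the latent codes of two inputs."""
    mu_a, _ = model.encode(x_a)               # use the posterior means as latent codes
    mu_b, _ = model.encode(x_b)
    alphas = torch.linspace(0, 1, steps).unsqueeze(1)
    z = (1 - alphas) * mu_a + alphas * mu_b   # linear interpolation in latent space
    return model.decode(z)

@torch.no_grad()
def edit_attribute(model, x, attribute_direction, strength=1.0):
    """Shift a latent code along a learned attribute direction, then decode."""
    mu, _ = model.encode(x)
    return model.decode(mu + strength * attribute_direction)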
The image manipulation workflow is depicted in the diagram given below:

3.​ Domain Adaptation


●​ Transfer learning between domains
●​ Style transfer applications
●​ Cross-domain translation

Key Advantages

●​ Probabilistic approach to generative modeling


●​ Learns meaningful latent representations
●​ Enables both generation and inference
●​ Supports various data types (images, text, audio)

Challenges and Limitations

●​ Blurry reconstructions in image applications


●​ Difficulty with complex, high-dimensional data
●​ Balancing reconstruction vs. KL divergence
●​ Posterior collapse, where the decoder learns to ignore the latent code (often loosely called mode collapse)

Module 2.2: Types of Autoencoders

1. Undercomplete Autoencoders

●​ Definition: The simplest form, in which the hidden layers have fewer dimensions than the input layer
●​ Key Characteristics:
●​ Forces compressed data representation
●​ Acts as dimensionality reduction technique
●​ More powerful than PCA because of its non-linear transformation capabilities
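A minimal sketch of an undercomplete autoencoder is given below; the layer sizes are arbitrary assumptions, and the only structural requirement is that the bottleneck is smaller than the input.

import torch.nn as nn

class UndercompleteAE(nn.Module):
    """Bottleneck (32 units) far smaller than the input (784) forces compression."""

    def __init__(self, input_dim=784, bottleneck_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, bottleneck_dim),            # compressed representation
        )
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        # Trained by minimising reconstruction error (e.g. MSE) between output and input
        return self.decoder(self.encoder(x))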

2. Sparse Autoencoders

●​ Core Concept: Similar to undercomplete autoencoders, but uses a different regularization approach


●​ Distinctive Features:
●​ Does not rely on reducing the hidden-layer dimension
●​ Uses a sparsity penalty in the loss function
●​ Penalizes the activation of hidden-layer neurons
●​ Benefits:
●​ Different nodes specialize for different input types
●​ Better at preventing overfitting
●​ Uses L1 regularization or a KL-divergence sparsity penalty (an L1 sketch is given below)
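The L1 form of the sparsity penalty can be sketched as one extra term added to the reconstruction loss, as below. The model is assumed to expose encoder and decoder modules (as in the undercomplete sketch), and sparsity_weight is an assumed hyperparameter; the KL-divergence variant, which compares average activations to a target sparsity level, is analogous.

import torch.nn.functional as F

def sparse_ae_loss(model, x, sparsity_weight=1e-3):
    """Reconstruction loss plus an L1 penalty on the hidden (code) activations."""
    h = model.encoder(x)                     # hidden-layer activations
    recon = model.decoder(h)
    recon_loss = F.mse_loss(recon, x)
    l1_penalty = h.abs().mean()              # pushes most activations toward zero
    return recon_loss + sparsity_weight * l1_penalty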

3. Contractive Autoencoders

●​ Primary Principle: Similar inputs should have similar encodings


●​ Key Features:
●​ Keeps the derivatives of hidden-layer activations with respect to the input small
●​ Enforces robustness in the learned representations
●​ Contracts a neighborhood of inputs to a small neighborhood of outputs (see the sketch below)
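Keeping those derivatives small is done by penalising the Frobenius norm of the encoder's Jacobian. For a single sigmoid encoder layer the penalty has a convenient closed form, sketched below; the layer shapes and the weight lam are assumptions.

import torch.nn as nn
import torch.nn.functional as F

def contractive_loss(x, recon, h, encoder_layer: nn.Linear, lam=1e-4):
    """Reconstruction loss + lam * ||J||_F^2 for h = sigmoid(W x + b).

    For a sigmoid layer, row j of the Jacobian dh/dx is h_j (1 - h_j) * W_j,
    so ||J||_F^2 = sum_j (h_j (1 - h_j))^2 * ||W_j||^2.
    """
    recon_loss = F.mse_loss(recon, x)
    dh = (h * (1 - h)) ** 2                         # (batch, hidden)
    w_sq = (encoder_layer.weight ** 2).sum(dim=1)   # (hidden,)
    jacobian_norm = (dh * w_sq).sum(dim=1).mean()
    return recon_loss + lam * jacobian_norm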

4. Denoising Autoencoders

●​ Main Purpose: Remove noise from input data


●​ Operational Method:
●​ Input and target output are intentionally different
●​ Fed with corrupted/noisy versions of the input
●​ Trained to recover the clean versions
●​ Applications:
●​ Image restoration
●​ Signal denoising
●​ Data cleaning
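A sketch of the corrupt-then-reconstruct training step is given below. Gaussian corruption and the noise level are assumptions for illustration; masking (dropout-style) noise is an equally common choice, and model stands for any autoencoder such as the sketches above.

import torch
import torch.nn.functional as F

def denoising_step(model, x_clean, noise_std=0.3):
    """Corrupt the input, reconstruct, and compare against the clean target."""
    noise = noise_std * torch.randn_like(x_clean)
    x_noisy = (x_clean + noise).clamp(0.0, 1.0)   # corrupted version fed to the model
    recon = model(x_noisy)
    return F.mse_loss(recon, x_clean)             # loss is measured against the clean input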

5. Variational Autoencoders (VAEs)

●​ Unique Approach: Models each latent dimension as a probability distribution rather than a fixed value


●​ Key Characteristics:
●​ Encoder outputs probability distribution parameters (μ, σ)
●​ Latent samples drawn from these distributions are fed to the decoder
●​ Enables generative capabilities
●​ Applications:
●​ Data generation
●​ Image synthesis
●​ Feature interpolation

The different types of autoencoders are shown in the diagram below:

graph TD
    A[Input Data] --> B{Autoencoder Type}
    B --> C[Undercomplete]
    B --> D[Sparse]
    B --> E[Contractive]
    B --> F[Denoising]
    B --> G[Variational]

    C --> H["Encoder (compressed mapping) → Bottleneck → Decoder"]
    D --> I["Encoder (with sparsity penalty) → Bottleneck → Decoder"]
    E --> J["Encoder (with contractive penalty, keeping activations locally invariant) → Bottleneck → Decoder"]
    F --> K["Encoder (processes noisy input) → Bottleneck → Decoder (reconstructs clean output)"]
    G --> L["Probabilistic Encoder (outputs μ and σ) → Sampling → Decoder (generative model)"]

Comparison of Architectures

Training Objectives:
1.​ Undercomplete: Minimize reconstruction error with a bottleneck constraint
2.​ Sparse: Balance reconstruction with activation sparsity
3.​ Contractive: Maintain similar encodings for similar inputs
4.​ Denoising: Reconstruct clean data from noisy input
5.​ Variational: Balance reconstruction with distribution matching

Use Cases:

●​ Dimensionality Reduction: Undercomplete, Sparse


●​ Feature Learning: Sparse, Contractive
●​ Noise Removal: Denoising
●​ Data Generation: Variational

Benefits and Limitations:

●​ Undercomplete: Simple but may lose important information


●​ Sparse: Better feature extraction but more complex training
●​ Contractive: Robust features but computationally intensive
●​ Denoising: Effective noise removal but requires corrupt-clean pairs
●​ Variational: Powerful generation but complex optimization
