VAE
Data Processing: MNIST digits normalized to [0, 1] and reshaped into flat vectors of 784 dimensions (28 × 28).
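A minimal sketch of this preprocessing step is shown below. The use of TensorFlow/Keras is an assumption; the report does not name the framework.

```python
from tensorflow.keras.datasets import mnist

# Load MNIST, scale pixel values from [0, 255] to [0, 1], and flatten each
# 28x28 image into a 784-dimensional vector.
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
x_train = x_train.reshape(-1, 28 * 28)
x_test = x_test.reshape(-1, 28 * 28)
```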
1. Encoder: Two dense layers; the first with ReLU activation, followed by layers outputting the mean (μ) and log variance (log σ²) of the latent distribution.
Training: Conducted over 100 epochs, with visual evaluations at selected intervals.
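The sketch below shows how such a model, objective, and training loop could be wired together. It reuses `x_train` from the preprocessing sketch above. The framework (TensorFlow/Keras), the 256-unit hidden width, the Adam optimizer, and the batch size of 128 are assumptions; only the dense encoder with μ / log σ² heads, the 2D latent space, the 100 epochs, and the reconstruction-plus-KL objective come from the report.

```python
import tensorflow as tf
from tensorflow.keras import layers

INPUT_DIM = 28 * 28
HIDDEN_DIM = 256   # assumed hidden width; not stated in the report
LATENT_DIM = 2     # 2D latent space, matching the grid sampling below

class VAE(tf.keras.Model):
    def __init__(self):
        super().__init__()
        # Encoder: a ReLU dense layer, then two heads for the mean and
        # log variance of the approximate posterior q(z | x).
        self.enc_hidden = layers.Dense(HIDDEN_DIM, activation="relu")
        self.enc_mu = layers.Dense(LATENT_DIM)
        self.enc_logvar = layers.Dense(LATENT_DIM)
        # Decoder: mirror of the encoder, with a sigmoid output so pixels stay in [0, 1].
        self.dec_hidden = layers.Dense(HIDDEN_DIM, activation="relu")
        self.dec_out = layers.Dense(INPUT_DIM, activation="sigmoid")

    def encode(self, x):
        h = self.enc_hidden(x)
        return self.enc_mu(h), self.enc_logvar(h)

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps with eps ~ N(0, I), keeping sampling differentiable.
        eps = tf.random.normal(tf.shape(mu))
        return mu + tf.exp(0.5 * logvar) * eps

    def decode(self, z):
        return self.dec_out(self.dec_hidden(z))

    def call(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar

def vae_loss(x, x_recon, mu, logvar):
    # Reconstruction term: binary cross-entropy, rescaled to a per-image sum
    # over the 784 pixels.
    recon = tf.keras.losses.binary_crossentropy(x, x_recon) * INPUT_DIM
    # KL divergence between N(mu, sigma^2) and the standard normal prior.
    kl = -0.5 * tf.reduce_sum(1.0 + logvar - tf.square(mu) - tf.exp(logvar), axis=-1)
    return tf.reduce_mean(recon + kl)

# Assumed training setup: Adam optimizer and batch size 128.
model = VAE()
optimizer = tf.keras.optimizers.Adam(1e-3)
dataset = tf.data.Dataset.from_tensor_slices(x_train).shuffle(60_000).batch(128)

for epoch in range(100):
    for batch in dataset:
        with tf.GradientTape() as tape:
            x_recon, mu, logvar = model(batch)
            loss = vae_loss(batch, x_recon, mu, logvar)
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
```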
1. Training Loss: The sum of reconstruction loss and KL divergence gradually declines
over 100 epochs, reflecting that the VAE is learning a smoother latent representation
of MNIST.
2. Reconstruction: Random test images and their reconstructions reveal that by ~100 epochs, the VAE captures the digit shapes reasonably well (though some blurring remains).
3. Latent Space Sampling: Sampling from a 2D latent grid over [−2, 2] × [−2, 2] yields a grid of digits that morph smoothly between digit classes. This indicates the VAE has learned a continuous latent manifold where nearby z points correspond to similar digit features.
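A possible implementation of this latent-space sweep is sketched below. It reuses the `model` object from the architecture sketch above; the 15 × 15 grid resolution is an assumption.

```python
import numpy as np
import matplotlib.pyplot as plt

# Decode a uniform 2D grid of latent points over [-2, 2] x [-2, 2] and tile
# the decoded digits into a single image.
n = 15  # grid resolution per axis (assumed)
grid = np.linspace(-2.0, 2.0, n)
canvas = np.zeros((n * 28, n * 28))

for i, z1 in enumerate(grid):
    for j, z2 in enumerate(grid):
        z = np.array([[z1, z2]], dtype="float32")
        digit = model.decode(z).numpy().reshape(28, 28)
        canvas[i * 28:(i + 1) * 28, j * 28:(j + 1) * 28] = digit

plt.imshow(canvas, cmap="gray")
plt.axis("off")
plt.show()
```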
The VAE's samples tend to be somewhat blurry, a known consequence of a reconstruction objective that averages over many possible styles. However, it offers a more interpretable latent space and stable training. By epoch 100, the latent distribution is well-organized, as the smooth latent-grid samples illustrate.
Conclusions
VAE:
The VAE builds a probabilistic latent space, encoding each input into a mean/variance
distribution.
After 100 epochs, the reconstructions are fairly accurate, and the latent space forms a smooth, continuous manifold.
The VAE is typically more stable and interpretable, though it sometimes produces blurrier samples. In any case, both models yield convincing synthetic digits that confirm they have captured the underlying structure of the MNIST data.