Generating Diverse High-Fidelity Images with VQ-VAE-2

Razavi, Ali; Oord, Aaron van den; Vinyals, Oriol

arXiv:1906.00446 (cs)

[Submitted on 2 Jun 2019]

Authors: Ali Razavi, Aaron van den Oord, Oriol Vinyals

Abstract: We explore the use of Vector Quantized Variational AutoEncoder (VQ-VAE) models for large-scale image generation. To this end, we scale and enhance the autoregressive priors used in VQ-VAE to generate synthetic samples of much higher coherence and fidelity than was possible before. We use simple feed-forward encoder and decoder networks, making our model an attractive candidate for applications where the encoding and/or decoding speed is critical. Additionally, VQ-VAE requires sampling an autoregressive model only in the compressed latent space, which is an order of magnitude faster than sampling in the pixel space, especially for large images. We demonstrate that a multi-scale hierarchical organization of VQ-VAE, augmented with powerful priors over the latent codes, is able to generate samples with quality that rivals that of state-of-the-art Generative Adversarial Networks on multifaceted datasets such as ImageNet, while not suffering from GANs' known shortcomings such as mode collapse and lack of diversity.
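
The speed claim in the abstract can be made concrete by counting sequential sampling steps. The sketch below is illustrative only: the 256x256 RGB image size and the two latent grids (32x32 and 64x64) are assumptions chosen for the example and are not stated in the abstract, and the helper name `autoregressive_steps` is hypothetical. It counts steps only, not wall-clock time, which also depends on the cost of each step.

```python
def autoregressive_steps(height, width, channels=1):
    """Sequential sampling steps for a grid sampled one value at a time."""
    return height * width * channels


# Assumed sizes, for illustration only (not taken from the abstract).
image_side = 256          # assumed square RGB image
latent_sides = (32, 64)   # assumed two-level latent hierarchy

# Pixel-space sampling visits every sub-pixel (3 colour channels per pixel).
pixel_steps = autoregressive_steps(image_side, image_side, channels=3)

# Latent-space sampling visits one discrete code per latent grid position.
latent_steps = sum(autoregressive_steps(side, side) for side in latent_sides)

ratio = pixel_steps / latent_steps
print(pixel_steps, latent_steps, round(ratio, 1))
```

Under these assumed sizes the latent grids need far fewer sequential steps than the pixel grid, and the gap widens as the image grows if the downsampling factors stay fixed.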

Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Cite as:	arXiv:1906.00446 [cs.LG]
	(or arXiv:1906.00446v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1906.00446

Submission history

From: Aäron van den Oord
[v1] Sun, 2 Jun 2019 16:46:42 UTC (12,827 KB)