
Generative Adversarial Text to Image Synthesis

Amit Manchanda | Anshul Jain | Dr. Vinod Pankajakshan


Department of Electronics and Communication Engineering, IIT Roorkee

Abstract

We implemented a deep recurrent neural network architecture and a Generative Adversarial Network (GAN) formulation to effectively bridge the advances in text and image modeling, translating visual concepts from characters to pixels. We show the capability of the model to generate images of flowers from detailed text descriptions.

Introduction

• Artificial synthesis of images from text descriptions could have profound applications in visual editing, animation, and digital design.
• The distribution of images conditioned on a text description is highly multimodal.
• In GANs, the discriminator D tries to distinguish real images from synthesized images, while the generator G tries to fool D.
• The discriminator views (text, image) pairs as joint observations and is trained to judge a pair as real or fake (a sketch of this idea follows the Subproblems list).

Subproblems

• Learn a text feature representation that captures the important visual details.
• Use these features to synthesize a compelling image.
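
To make the joint-observation idea concrete, here is a minimal PyTorch sketch of a discriminator that scores a (text, image) pair together, in the spirit of the matching-aware discriminator of [1]. The class name, layer sizes, and the 128-dimensional sentence embedding are illustrative assumptions, not the exact architecture used in this work.

import torch
import torch.nn as nn

class TextConditionalDiscriminator(nn.Module):
    """Scores a (text, image) pair as real or fake by fusing a sentence
    embedding with convolutional image features (illustrative sizes)."""
    def __init__(self, emb_dim=128):
        super().__init__()
        # Downsample a 64x64 RGB image to a 4x4 feature map of depth 256.
        self.conv = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 256, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(256, 256, 4, 2, 1), nn.LeakyReLU(0.2),
        )
        # Final decision over image features plus replicated text features.
        self.judge = nn.Sequential(nn.Conv2d(256 + emb_dim, 1, 4), nn.Sigmoid())

    def forward(self, image, text_emb):
        h = self.conv(image)                                  # (B, 256, 4, 4)
        # Replicate the sentence embedding over all spatial locations and
        # concatenate depth-wise, so the pair is judged jointly.
        t = text_emb[:, :, None, None].expand(-1, -1, h.size(2), h.size(3))
        return self.judge(torch.cat([h, t], dim=1)).view(-1)  # (B,) in [0, 1]

Because the text embedding is fused with the image features before the final decision, the same network can penalize both unrealistic images and real images paired with mismatched text.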

Literature Survey

• [1] estimated generative models via an adversarial process to generate images conditioned on text and input noise.
• In [2, 4] the authors describe architectural guidelines for stable GANs.
• In [3] the authors give an unsupervised approach to training a generic sentence encoder.

Methodology

Figure 1: Text-conditional convolutional GAN architecture.

DCGAN

The GAN training procedure is similar to a two-player min-max game with the following objective function:

min_G max_D V(D, G) = E_{x∼p_data}[log D(x)] + E_{z∼p_z}[log(1 − D(G(z)))]

where x is a real image from the true data distribution p_data, and z is a noise vector sampled from p_z, which might be a Gaussian or uniform distribution.
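
As a concrete illustration of this objective, here is a minimal PyTorch sketch of one alternating update. It is a sketch under assumptions rather than our exact training code: G and D stand for any generator/discriminator pair with a sigmoid-output D, the text conditioning is omitted for brevity, and the generator uses the standard non-saturating loss suggested in [4].

import torch
import torch.nn.functional as F

def gan_step(G, D, opt_g, opt_d, real_images, z_dim=100):
    """One alternating min-max update on a batch of real images."""
    B = real_images.size(0)
    ones, zeros = torch.ones(B), torch.zeros(B)

    # Discriminator step: ascend log D(x) + log(1 - D(G(z))).
    z = torch.randn(B, z_dim)          # z ~ p_z (Gaussian here)
    fake = G(z).detach()               # block gradients into G
    d_loss = F.binary_cross_entropy(D(real_images), ones) \
           + F.binary_cross_entropy(D(fake), zeros)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: maximize log D(G(z)) rather than minimizing
    # log(1 - D(G(z))), which gives stronger gradients early on [4].
    z = torch.randn(B, z_dim)
    g_loss = F.binary_cross_entropy(D(G(z)), ones)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()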

Skip Thought Vectors

An unsupervised approach to training a generic, distributed sentence encoder [3]. We train an encoder-decoder model in which the encoder maps the input sentence to a vector h_i and the decoder generates the surrounding sentences. The objective is to maximize the sum of the log-probabilities of the forward and backward sentences conditioned on the encoder output:

∑_t log P(w_{i+1}^t | w_{i+1}^{<t}, h_i) + ∑_t log P(w_{i−1}^t | w_{i−1}^{<t}, h_i)

where w_{i+1}^t is the t-th word of the next sentence, w_{i+1}^{<t} denotes the words preceding it, and h_i is the encoding of the current sentence.
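
Below is a minimal PyTorch sketch of this objective, assuming single-layer GRU encoder and decoders with teacher forcing; the vocabulary and hidden sizes are placeholders, not the settings of [3]. Minimizing the summed cross-entropy is equivalent to maximizing the log-probability objective above.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SkipThought(nn.Module):
    """Encoder maps sentence s_i to h_i; two decoders predict the previous
    and next sentences token by token, conditioned on h_i."""
    def __init__(self, vocab_size=20000, emb_dim=300, hid_dim=600):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.dec_prev = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.dec_next = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def loss(self, s_i, s_prev, s_next):
        # h_i: final encoder state for the middle sentence (1, B, hid_dim).
        _, h_i = self.encoder(self.embed(s_i))
        total = 0.0
        for dec, s in ((self.dec_prev, s_prev), (self.dec_next, s_next)):
            # Predict each word w^t from the preceding words w^{<t},
            # with the decoder initialized from h_i (teacher forcing).
            y, _ = dec(self.embed(s[:, :-1]), h_i)
            logits = self.out(y)
            total = total + F.cross_entropy(
                logits.reshape(-1, logits.size(-1)), s[:, 1:].reshape(-1))
        return total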

Results

Sample captions: "this flower has petals that are red and are bunched together"; "the flower has an abundance of yellow petals and brown anthers"; "flower is purple and pink in petal and features a dark, dense core".

Figure: images generated from the corresponding captions using the trained model.

Conclusion

• We developed a simple and effective model for generating images based on detailed visual descriptions.
• The generated images capture the shape and color of the flower but lack other significant details needed to pass as realistic samples.
• The model could not generalize to images with multiple objects.

Future Works

• Improve generator learning with manifold interpolation (a sketch of this idea follows this list).
• Implement Stacked GANs to produce high-quality images.
• Explore the possibility of using Wasserstein GANs and Cyclic GANs.
• Generalize the model to generate images with multiple objects and variable backgrounds using the MS-COCO dataset.
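
On the manifold-interpolation item: [1] (the GAN-INT variant) trains the generator on convex combinations of pairs of sentence embeddings; since such interpolated descriptions have no paired real image, they contribute only to the generator loss. A minimal sketch, assuming a text-conditional generator G(z, t) and discriminator D(x, t) with sigmoid output:

import torch
import torch.nn.functional as F

def generator_interp_loss(G, D, t1, t2, z_dim=100, beta=0.5):
    """Generate from an interpolated text embedding and ask D to
    score the result as real (GAN-INT style, after [1])."""
    t_int = beta * t1 + (1.0 - beta) * t2   # convex combination of embeddings
    z = torch.randn(t1.size(0), z_dim)      # noise, z ~ N(0, I)
    score = D(G(z, t_int), t_int)
    return F.binary_cross_entropy(score, torch.ones_like(score))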

References

[1] S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, and H. Lee. Generative adversarial text-to-image synthesis. In ICML, 2016.
[2] A. Radford, L. Metz, and S. Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. In ICLR, 2016.
[3] R. Kiros, Y. Zhu, R. R. Salakhutdinov, R. Zemel, R. Urtasun, A. Torralba, and S. Fidler. Skip-thought vectors. In NIPS, 2015.
[4] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. C. Courville, and Y. Bengio. Generative adversarial nets. In NIPS, 2014.
