
Deep Learning (CSE641/ECE555)

Assignment 3 (5 Marks)

Generative Adversarial Text-to-Image Synthesis


In this assignment, you will learn about text-to-image synthesis using conditional GANs. A typical GAN has a Generator (G) that takes random noise as input to generate realistic data samples (e.g., images, audio, or text) and a Discriminator (D) that acts as a binary classifier, distinguishing between real and generated data. In conditional GANs, the input to (G) is conditioned on additional information.
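For orientation only, a minimal conditional generator/discriminator pair might look like the sketch below, written in PyTorch (any library is allowed per the Rules; noise_dim, text_dim, img_dim and the layer sizes are hypothetical choices, not requirements):

import torch
import torch.nn as nn

class CondGenerator(nn.Module):
    # Generator conditioned on a text embedding: input = [noise ; text encoding].
    def __init__(self, noise_dim=100, text_dim=128, img_dim=64 * 64 * 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + text_dim, 512), nn.ReLU(),
            nn.Linear(512, img_dim), nn.Tanh(),          # images scaled to [-1, 1]
        )

    def forward(self, z, text_emb):
        return self.net(torch.cat([z, text_emb], dim=1))

class CondDiscriminator(nn.Module):
    # Binary classifier over (image, text) pairs: real vs. generated.
    def __init__(self, text_dim=128, img_dim=64 * 64 * 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img_dim + text_dim, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 1), nn.Sigmoid(),
        )

    def forward(self, img, text_emb):
        return self.net(torch.cat([img.flatten(1), text_emb], dim=1))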

In this assignment, you have to train a conditional GAN to generate images, where the input to the Target Generator (G) is conditioned on textual descriptions. In addition, you have to train a Source Encoder, which will provide learned representations as input to (G) instead of noise. You may train the whole setup in an end-to-end manner or in parts; for instance, one approach could be knowledge distillation from the Source Encoder to the Generator.
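A sketch of one possible distillation signal, assuming the module interfaces from the sketch above and that the generator outputs an image tensor the Source Encoder can consume (this is only one interpretation, not the required method):

import torch
import torch.nn.functional as F

def distillation_loss(source_encoder, generator, real_img, text_emb):
    # Re-encode the generated image with the Source Encoder and pull its
    # representation towards the representation of the corresponding real image.
    with torch.no_grad():
        target_repr = source_encoder(real_img)      # "teacher" representation
    fake_img = generator(target_repr, text_emb)     # G consumes the representation, not noise
    student_repr = source_encoder(fake_img)         # gradients flow back to G through the encoder
    return F.mse_loss(student_repr, target_repr)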

Overall Setup:
1. Source Encoder: Takes an input image and outputs a representation. Any model size or type.
2. Target Generator: Takes the representation from the source model and a text encoding to generate new samples. Its number of parameters should be half that of the Source Encoder (see the sketch after this list). Any model type.
3. Discriminator: Distinguishes between real and generated data. Any model size or type.
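A minimal sketch of how the half-parameter constraint could be checked; the architectures and sizes below are purely illustrative (and deliberately avoid transformer/diffusion components, per Rule 3 below):

import torch.nn as nn

def num_params(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())

# Hypothetical 64x64 RGB setup; replace with your own designs.
source_encoder = nn.Sequential(                      # image -> representation
    nn.Conv2d(3, 64, 4, 2, 1), nn.ReLU(),            # 64x64 -> 32x32
    nn.Conv2d(64, 128, 4, 2, 1), nn.ReLU(),          # 32x32 -> 16x16
    nn.Flatten(), nn.Linear(128 * 16 * 16, 256),
)
target_generator = nn.Sequential(                    # [representation ; text encoding] -> image
    nn.Linear(256 + 128, 256), nn.ReLU(),
    nn.Linear(256, 64 * 64 * 3), nn.Tanh(),
)
# The Target Generator should have half (or fewer) of the Source Encoder's parameters.
assert num_params(target_generator) <= num_params(source_encoder) // 2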

Rules:
1. You can use any library to design your GAN.
2. You can use any loss function, coding style, batch size, optimizer, or learning-rate scheduler.
3. You can use any model architecture except modern ones, such as transformer- or diffusion-based models. (If you are unsure, please ask and clarify first.)
4. You can use the following as the base repo for data: https://github.com/aelnouby/Text-to-Image-Synthesis?tab=readme-ov-file
5. You cannot use any pretrained model/checkpoint, i.e., all parameters in your setup should be trained from scratch (with a random seed of your choice).
6. You have to demonstrate your setup by randomly selecting 20 classes (for training) and 5 classes (for testing) from the Oxford-102 dataset; a class-selection sketch follows this list. Text descriptions are available in the GitHub repo mentioned above.
7. The Source Encoder cannot use class labels during training. You may use any loss function to make it as discriminative as possible for the real images of all 25 classes.
8. We will only run and test your code on Google Colab. You have a maximum of 200 epochs for training using Colab resources. Time per epoch doesn't matter, but it is advisable that training and testing finish within 1 hr (though this is not mandatory). Hence, choose a reasonable model size.
9. We encourage you to save .ipynb cell outputs such as plots, visualizations, and loss/accuracy logs to aid the subjective evaluation component.
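A sketch of a reproducible class split for Rule 6 (the seed value 42 is an arbitrary choice; Oxford-102 has 102 flower classes, labelled 1-102):

import random

random.seed(42)                                # any fixed seed keeps the split reproducible
all_classes = list(range(1, 103))              # Oxford-102 class ids
chosen = random.sample(all_classes, 25)        # 25 distinct classes in total
train_classes, test_classes = chosen[:20], chosen[20:]
print("train:", sorted(train_classes))
print("test :", sorted(test_classes))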

[Input Image] --> [Source Encoder] --> [Representation] --> [Target Generator] --> [Generated Image]
                                                                    ^
                                                                    |
[Text Input] --> [Text Encoder] --> [Text Encoding] ---------------+

[Generated Image] --> [Discriminator] <--> [Real Images]
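The Text Encoder in the diagram is not prescribed by the assignment. Since transformer-based models are excluded, one simple recurrent option is sketched below (vocabulary size and dimensions are hypothetical):

import torch.nn as nn

class TextEncoder(nn.Module):
    # Encodes a padded sequence of token ids into a fixed-size text encoding.
    def __init__(self, vocab_size=5000, embed_dim=128, text_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.rnn = nn.GRU(embed_dim, text_dim, batch_first=True)

    def forward(self, token_ids):               # token_ids: (batch, seq_len) of int64
        emb = self.embed(token_ids)
        _, hidden = self.rnn(emb)                # hidden: (1, batch, text_dim)
        return hidden.squeeze(0)                 # (batch, text_dim) text encoding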

Deliverables:
1. We don't need your trained model, but a robust code base that can replicate your best setting.
2. Submit a single .ipynb file for this assignment with clean, documented code. Structure your notebook beautifully, as if you are giving a demo tutorial to a 1st-year B.Tech student who can easily follow the steps.
3. Highlight the innovations (new things), if any, that you believe make your submission stand out from the rest of the class.
4. There should be two separate sections, one for Training and one for Testing.
5. In Training/Testing, you may use the dataloader from the above-mentioned GitHub repo.
6. In Testing, using the best model checkpoint, you have to
   (a) Generate and plot 5 random images from each test class as a 5x5 grid. (Hint: use diverse unseen text.)
   (b) Plot the 3D t-SNE embedding of the Source Encoder on all images from both the train and test sets.
   (c) Print, in the form of a table, the total number of parameters, the number of trainable parameters, and the model size on disk for the encoder, generator, and discriminator. (A reporting sketch for (b) and (c) follows this list.)
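One possible way to produce the table in 6(c) and the 3D t-SNE in 6(b), assuming source_encoder, target_generator, and discriminator are your trained PyTorch modules, and all_embeddings / all_labels hold the Source Encoder outputs and class ids for all train and test images (all of these names are placeholders):

import os
import torch

def summarize(name, model):
    # Total/trainable parameter counts and serialized size on disk (MB).
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    path = f"{name}.pt"
    torch.save(model.state_dict(), path)
    size_mb = os.path.getsize(path) / (1024 ** 2)
    return name, total, trainable, size_mb

print(f"{'model':<15}{'total':>12}{'trainable':>12}{'size (MB)':>12}")
for name, model in [("encoder", source_encoder),
                    ("generator", target_generator),
                    ("discriminator", discriminator)]:
    n, total, trainable, size_mb = summarize(name, model)
    print(f"{n:<15}{total:>12,}{trainable:>12,}{size_mb:>12.2f}")

from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

coords = TSNE(n_components=3, init="random").fit_transform(all_embeddings)
ax = plt.figure().add_subplot(projection="3d")
ax.scatter(coords[:, 0], coords[:, 1], coords[:, 2], c=all_labels, s=5)
ax.set_title("3D t-SNE of Source Encoder embeddings (train + test)")
plt.show()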

Marking:

This assignment will not be fully auto-graded. Marking will be done manually, with subjective evaluation of the following components:

1. Overall structure & cleanliness of the submitted code notebook [1 mark]
2. Successful training of the full GAN model [1 mark]
3. Discriminative ability of the embeddings from the Source Encoder [1 mark]
4. Subjective diversity and quality of generation [1 mark]
5. Subjective evaluation of innovation in model architecture (including its size and memory footprint) and training paradigm [1 mark]
