Advanced Deep Learning - BIT L1
ZG513
Dr. Sugata Ghosal
BITS Pilani [email protected]
Pilani Campus
Session 1
Time – 8:30 AM to 10:30 PM
• Objective of course
• Evaluation Plan
• Course Overview
Please note there will be no change in submission dates for quiz and assignment
1 Autoencoders
2 Deep Autoencoders
3 Convolutional Autoencoders
4 Variational Autoencoders
5 Generative Adversarial Networks
Yann LeCun
Need tremendous amount of information to build machines that have common sense and generalize
[LeCun-20161205-NeurIPS-keynote]
BITS Pilani, Pilani Campus
Topics Covered in This Course
• PCA and Variants
• Likelihood-based Models
• Autoregressive Models
• Semi-supervised Learning
• Autoencoders and VAE
• Generative Adversarial Networks
• Diffusion-based Models
• Energy-based Models
• Applications in Time Series
Principal Component Analysis
PixelCNN (figure panels labeled 4, 8, 16, 32, 64, 128, 256)
A. van den Oord et al., "Conditional Image Generation with PixelCNN Decoders", NeurIPS 2016
S. Reed et al., "Parallel Multiscale Autoregressive Density Estimation", ICML 2017
Normalizing Flow Models
Class-conditioned samples generated by BigGAN
$$\min_{\theta} \max_{\phi} \; \mathbb{E}_{x \sim p_{\text{data}}}[\log D_{\phi}(x)] + \mathbb{E}_{x \sim p_{\theta}}[\log(1 - D_{\phi}(x))]$$
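As a rough illustration of the minimax objective on this slide, the sketch below estimates the two expectations by Monte Carlo on a toy 1-D problem. The logistic discriminator, the two Gaussians standing in for the data and generator distributions, and all function names are assumptions made for illustration; none of them come from the slides.

```python
import math
import random

def discriminator(x, w=1.0, b=0.0):
    """Toy logistic discriminator D(x) = sigmoid(w*x + b) (illustrative assumption)."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def gan_value(real_samples, fake_samples, d=discriminator):
    """Monte Carlo estimate of E_real[log D(x)] + E_fake[log(1 - D(x))]."""
    real_term = sum(math.log(d(x)) for x in real_samples) / len(real_samples)
    fake_term = sum(math.log(1.0 - d(x)) for x in fake_samples) / len(fake_samples)
    return real_term + fake_term

rng = random.Random(0)
real = [rng.gauss(2.0, 1.0) for _ in range(1000)]   # stand-in for data samples
fake = [rng.gauss(-2.0, 1.0) for _ in range(1000)]  # stand-in for generator samples

# A discriminator that separates the two sets scores higher than one that
# always outputs 0.5, whose value is exactly -2*log(2).
value_separated = gan_value(real, fake)
value_chance = gan_value(real, fake, d=lambda x: 0.5)
print(value_separated, value_chance)
```

The discriminator tries to push this value up, while the generator tries to move its samples so that the value falls back toward the chance level.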
I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets”, NIPS 2014.
A. Radford, L. Metz, and S. Chintala, “Unsupervised representation learning with deep convolutional generative adversarial networks”, ICLR 2016
L. Karacan, Z. Akata, A. Erdem and E. Erdem, “Learning to Generate Images of Outdoor Scenes from Attributes and Semantic Layouts”, arXiv preprint 2016
A. Brock, J. Donahue, K. Simonyan, “Large Scale GAN Training for High Fidelity Natural Image Synthesis”, ICLR 2019
Progress in GANs
When we started
Source: https://github.com/hindupuravinash/the-gan-zoo
Score-Based and Denoising Diffusion Models
J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”, NAACL-HLT 2019.
C. Raffel et al., “Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer”, JMLR 2020.
Multimodal Pretraining
J. Lu, D. Batra, D. Parikh, S. Lee, “ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks”, NeurIPS 2019
X. Li et al., “Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks”, ECCV 2020.
What is Deep Unsupervised Learning
Figure: pdata, pmodel, and the model family P
Assumptions on P:
• tractable sampling
• tractable likelihood function
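To make the two assumptions concrete, here is a minimal sketch of a model family that satisfies both: a 1-D Gaussian, which can be sampled directly and has a closed-form log-likelihood. The class name `GaussianModel`, the helper `fit_mle`, and the synthetic data are hypothetical choices for illustration, not part of the course material.

```python
import math
import random

class GaussianModel:
    """A 1-D Gaussian: one simple member of a tractable model family P."""

    def __init__(self, mean, std):
        self.mean = mean
        self.std = std

    def sample(self, rng):
        """Tractable sampling: draw x ~ p_model directly."""
        return rng.gauss(self.mean, self.std)

    def log_prob(self, x):
        """Tractable likelihood: closed-form log p_model(x)."""
        z = (x - self.mean) / self.std
        return -0.5 * math.log(2 * math.pi) - math.log(self.std) - 0.5 * z * z

def fit_mle(data):
    """Maximum-likelihood fit: sample mean and (biased) sample std."""
    mean = sum(data) / len(data)
    var = sum((x - mean) ** 2 for x in data) / len(data)
    return GaussianModel(mean, math.sqrt(var))

rng = random.Random(0)
data = [rng.gauss(3.0, 0.5) for _ in range(2000)]  # synthetic stand-in for p_data
model = fit_mle(data)

avg_ll = sum(model.log_prob(x) for x in data) / len(data)
new_sample = model.sample(rng)
print(model.mean, model.std, avg_ll, new_sample)
```

Fitting by maximum likelihood picks the member of P closest to the data in the KL sense; richer families (autoregressive models, flows) keep the same two operations but with neural parameterizations.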
Self-Supervised/Predictive Learning
• Given unlabeled data, design
supervised tasks that induce a good
representation for downstream
tasks.
• No good mathematical formalization,
but the intuition is to “force” the
predictor used in the task to learn
something “semantically meaningful”.
Image credit: LeCun’s self-supervised learning slide
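One way to picture "designing a supervised task from unlabeled data" is rotation prediction: rotate each image by a random multiple of 90 degrees and use the rotation index as the label. The sketch below only builds the (input, label) pairs; the choice of rotation as the pretext task, the tiny list-of-lists "images", and the function names are assumptions for illustration.

```python
import random

def rotate90(image, k):
    """Rotate a 2-D list-of-lists image clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        # Reverse the row order, then transpose: one clockwise quarter turn.
        image = [list(row) for row in zip(*image[::-1])]
    return image

def make_rotation_task(images, rng):
    """Turn unlabeled images into (rotated_image, rotation_label) pairs."""
    pairs = []
    for img in images:
        k = rng.randrange(4)  # the label is created for free from the data itself
        pairs.append((rotate90(img, k), k))
    return pairs

unlabeled = [[[1, 2], [3, 4]], [[0, 1], [1, 0]]]
pairs = make_rotation_task(unlabeled, random.Random(0))
for rotated, label in pairs:
    print(label, rotated)
```

A classifier trained on such pairs has to pick up on object orientation, which is the kind of structure the representation is then expected to carry over to downstream tasks.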
Geoffrey E. Hinton, Simon Osindero and Yee-Whye Teh, A Fast Learning Algorithm for Deep Belief Nets, Neural Computation, 2006
Generate Images
I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio. Generative Adversarial Networks. NIPS 2014.
Generate Images
Alec Radford, Luke Metz, Soumith Chintala, “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”, ICLR 2016.
Generate Images
Alec Radford, Luke Metz, Soumith Chintala, “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”, ICLR 2016.
Generate Images
Christian Ledig, Lucas Theis, Ferenc Huszar et al., Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network, CVPR 2017
Generate Images
Andrew Brock, Jeff Donahue, Karen Simonyan, Large Scale GAN Training for High Fidelity Natural Image Synthesis, ICLR 2019
Generate Images
Tero Karras, Samuli Laine, Timo Aila, A Style-Based Generator Architecture for Generative Adversarial Networks, CVPR 2019
Generate Images
Eric Ryan Chan et al., EG3D: Efficient Geometry-aware 3D Generative Adversarial Networks, arXiv:2112.07945, 2021.
Generate Images
Rinon Gal, Or Patashnik, Haggai Maron, Gal Chechik, Daniel Cohen-Or, StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators, arXiv:2108.00946, 2021.
Generate Audio
Parametric WaveNet
Generate Video
DVD-GAN: Adversarial Video Generation on Complex Datasets, Clark, Donahue, Simonyan, 2019
Generate Text
PANDARUS:
Alas, I think he shall be come approached and the day
When little srain would be attain'd into being never fed,
And who is but a chain and subjects of his death,
I should not sleep.
Second Senator:
They are away this miseries, produced upon my soul,
Breaking and strongly should be buried, when I perish
The earth and thoughts of many states.
DUKE VINCENTIO:
Well, your wit is in the care of side and that.
Generate Math
\begin{proof}
We may assume that $\mathcal{I}$ is an abelian sheaf on
$\mathcal{C}$.
\item Given a morphism $\Delta : \mathcal{F} \to
\mathcal{I}$
is an injective and let $\mathfrak q$ be an abelian sheaf on
$X$.
Let $\mathcal{F}$ be a fibered complex. Let $\mathcal{F}$
be a category.
\begin{enumerate}
\item \hyperref[setain-construction-phantom]{Lemma}
\label{lemma-characterize-quasi-finite}
Let $\mathcal{F}$ be an abelian quasi-coherent sheaf on
$\mathcal{C}$.
Let $\mathcal{F}$ be a coherent $\mathcal{O}_X$-module.
Then
$\mathcal{F}$ is an abelian catenary over $\mathcal{C}$.
\item The following are equivalent
\begin{enumerate}
\item $\mathcal{F}$ is an $\mathcal{O}_X$-module.
\end{lemma}
Generate Text
Prompt [human written]: A train carriage containing controlled nuclear materials was stolen in
Cincinnati today. Its whereabouts are unknown.
GPT-2: The incident occurred on the downtown train line, which runs from Covington and Ashland stations.
In an email to Ohio news outlets, the U.S. Department of Energy said it is working with the Federal
Railroad Administration to find the thief.
‘The theft of this nuclear material will have significant negative consequences on public and
environmental health, our workforce and the economy of our nation,’ said Tom Hicks, the U.S. Energy
Secretary, in a statement. ‘Our top priority is to secure the theft and ensure it doesn’t happen again.’
The stolen material was taken from the University of Cincinnati’s Research Triangle Park nuclear
research site, according to a news release from Department officials.
The Nuclear Regulatory Commission did not immediately release any information. According to the release,
the U.S. Department of Energy’s Office of Nuclear Material Safety and Security is leading that team’s
investigation.
‘The safety of people, the environment and the nation’s nuclear stockpile is our highest priority,’
Hicks said. ‘We will get to the bottom of this and make no excuses.’
[OpenAI’s GPT2 -- Radford, Wu, Child, Luan, Amodei, Sutskever, 2019]
Generate Text
Prompt (human written): Recycling is good for the world, no, you could not be more
wrong.
GPT-2: Recycling is NOT good for the world. It is bad for the environment, it is
bad for our health, and it is bad for our economy. I’m not kidding. Recycling is
not good for the environment. It is destructive to the earth and it is a major
contributor to global warming. Recycling is not good for our health. It
contributes to obesity and diseases like heart disease and cancer. Recycling is
bad for our economy. It increases the cost of a product, and in turn, the price of
everything that is made with that product. Recycling is not good for our nation.
We pay a tremendous price for the privilege of having the world’s most advanced
and efficient recycling system. Recycling is a huge, colossal waste of time,
energy, money, and resources.
Generating Text
Tom B. Brown, Benjamin Mann, Nick Ryder et al., Language Models are Few-Shot Learners, NeurIPS 2020
Generating Text
Tom B. Brown, Benjamin Mann, Nick Ryder et al., Language Models are Few-Shot Learners, NeurIPS 2020
Generating Images from Text
Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray et al., DALL·E: Creating Images from Text, OpenAI, 2021
Generating Images from Text
C. Baykal, A. B. Anees, D. Ceylan, E. Erdem, A. Erdem, D. Yuret, Manipulating Images with Text Prompts, Work in Progress, 2022
Generating Code
Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, et al., Evaluating Large Language Models Trained on Code, OpenAI, arXiv:2107.03374, 2021.
Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma et al., Program Synthesis with Large Language Models, arXiv:2108.07732, 2021
Generating Code
Yujia Li, David Choi, Junyoung Chung, Nate Kushman et al., Competition-Level Code Generation with AlphaCode, DeepMind, 2022
Generating Molecules
Nicola De Cao, Thomas Kipf, MolGAN: An implicit generative model for small molecular graphs, ICML 2018 workshop on Theoretical Foundations and Applications of Deep Generative Models, 2018
Compression - Lossless
Rewon Child, Scott Gray, Alec Radford, Ilya Sutskever, Generating Long Sequences with Sparse Transformers, arXiv:1904.10509, 2019
Compression - Lossy
JPEG, JPEG2000, WaveOne (figure panel labels)
Oren Rippel, Lubomir Bourdev, Real-Time Adaptive Image Compression, ICML 2017
Downstream Task – Sentiment Detection
Downstream Tasks - NLP (BERT Revolution)
https://gluebenchmark.com/leaderboard
Downstream Tasks - Vision (Contrastive)
Olivier J. Hénaff, Aravind Srinivas, Jeffrey De Fauw, Ali Razavi, Carl Doersch, S. M. Ali Eslami, Aaron van den Oord, Data-Efficient Image Recognition with Contrastive Predictive Coding, ICML 2020
Summary
Y. LeCun, Y. Bengio, G. Hinton, "Deep Learning", Nature, Vol. 521, 28 May 2015
Neural building blocks: RNNs
K. Xu et al., “Show, Attend and Tell: Neural Image Caption Generation with
Visual Attention”, ICML 2015
D. Bahdanau, K. Cho and Y. Bengio, “Neural Machine Translation by Jointly
Learning to Align and Translate”, ICLR 2015
A. Vaswani et al., “Attention Is All You Need”, NIPS 2017
Seq2Seq with Attention