CS7015 (Deep Learning) : Lecture 21

Variational Autoencoders

Mitesh M. Khapra

Department of Computer Science and Engineering

Indian Institute of Technology Madras

Acknowledgments
Tutorial on Variational Autoencoders by Carl Doersch
Blog on Variational Autoencoders by Jaan Altosaar

Module 21.1: Revisiting Autoencoders

Before we start talking about VAEs, let us quickly revisit autoencoders.
An autoencoder contains an encoder which takes the input $X$ and maps it to a hidden representation $h$:
$$h = g(WX + b)$$
The decoder then takes this hidden representation and tries to reconstruct the input from it as $\hat{X}$:
$$\hat{X} = f(W^{*} h + c)$$
The training happens using the following objective function:
$$\min_{W, W^{*}, c, b} \; \frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{n} (\hat{x}_{ij} - x_{ij})^2$$
where $m$ is the number of training instances $\{x_i\}_{i=1}^{m}$, and each $x_i \in \mathbb{R}^n$ ($x_{ij}$ is thus the $j$-th dimension of the $i$-th training instance).
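To make the autoencoder equations concrete, here is a minimal NumPy sketch of the forward pass and the squared-error objective. The choice of a sigmoid for $g$, the identity for $f$, the layer sizes and the random weights are all assumptions made for illustration; the slides do not fix them.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def autoencoder_forward(X, W, b, W_star, c):
    """X has shape (m, n). Returns hidden codes (m, d) and reconstructions (m, n)."""
    h = sigmoid(X @ W.T + b)      # h = g(W x + b), with g assumed to be a sigmoid
    X_hat = h @ W_star.T + c      # X_hat = f(W* h + c), with f assumed to be the identity
    return h, X_hat

def reconstruction_loss(X, X_hat):
    """(1/m) * sum_i sum_j (x_hat_ij - x_ij)^2"""
    m = X.shape[0]
    return np.sum((X_hat - X) ** 2) / m

# Hypothetical sizes and untrained random weights, for illustration only.
rng = np.random.default_rng(0)
m, n, d = 5, 8, 3
X = rng.normal(size=(m, n))
W = rng.normal(scale=0.1, size=(d, n))
b = np.zeros(d)
W_star = rng.normal(scale=0.1, size=(n, d))
c = np.zeros(n)

h, X_hat = autoencoder_forward(X, W, b, W_star, c)
loss = reconstruction_loss(X, X_hat)
```

Training would adjust $W, W^{*}, b, c$ to reduce `loss`; that step is omitted here.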
But where’s the fun in this? We are taking an input and simply reconstructing it.
Of course, the fun lies in the fact that we are getting a good abstraction of the input.
But RBMs were able to do something more besides abstraction (they were able to do generation).
Let us revisit generation in the context of autoencoders.

Can we do generation with autoencoders?
In other words, once the autoencoder is trained, can I remove the encoder, feed a hidden representation $h$ to the decoder and decode an $\hat{X}$ from it?
In principle, yes! But in practice there is a problem with this approach.
$h$ is a very high dimensional vector and only a few vectors in this space would actually correspond to meaningful latent representations of our input.
So of all the possible values of $h$, which values should I feed to the decoder? (We had asked a similar question before: slide 67, bullet 5 of lecture 19.)
Ideally, we should only feed those values of $h$ which are highly likely.
In other words, we are interested in sampling from $P(h|X)$ so that we pick only those $h$’s which have a high probability.
But unlike RBMs, autoencoders do not have such a probabilistic interpretation.
They learn a hidden representation $h$ but not a distribution $P(h|X)$.
Similarly, the decoder is also deterministic and does not learn a distribution over $X$ (given an $h$ we can get an $X$ but not $P(X|h)$).

We will now look at variational autoencoders which have the same structure as
autoencoders but they learn a distribution over the hidden variables

Module 21.2: Variational Autoencoders: The Neural Network Perspective

Let $X = \{x_i\}_{i=1}^{N}$ be the training data.
We can think of $X$ as a random variable in $\mathbb{R}^n$.
For example, $X$ could be an image and the dimensions of $X$ correspond to pixels of the image.
We are interested in learning an abstraction (i.e., given an $X$ find the hidden representation $z$).
[Figure: Abstraction]
We are also interested in generation (i.e., given a hidden representation generate an $X$).
[Figure: Generation]
In probabilistic terms we are interested in $P(z|X)$ and $P(X|z)$ (to be consistent with the literature on VAEs we will use $z$ instead of $H$ and $X$ instead of $V$).

Earlier we saw RBMs where we learnt $P(z|X)$ and $P(X|z)$.
Below we list certain characteristics of RBMs.
Structural assumptions: We assume certain independencies in the Markov Network.
Computational: When training with Gibbs Sampling we have to run the Markov Chain for many time steps, which is expensive.
Approximation: When using Contrastive Divergence, we approximate the expectation by a point estimate.
(Nothing wrong with the above but we just mention them to make the reader aware of these characteristics.)
[Figure: RBM with visible units $V \in \{0, 1\}^m$ (biases $b_1, \ldots, b_m$), hidden units $H \in \{0, 1\}^n$ (biases $c_1, \ldots, c_n$) and weights $W \in \mathbb{R}^{m \times n}$]

We now return to our goals.
Goal 1: Learn a distribution over the latent variables ($Q(z|X)$).
Goal 2: Learn a distribution over the visible variables ($P(X|z)$).
VAEs use a neural network based encoder for Goal 1 and a neural network based decoder for Goal 2.
We will look at the encoder first.
[Figure: Data $X$, Encoder $Q_\theta(z|X)$, latent $z$, Decoder $P_\phi(X|z)$, Reconstruction $\hat{X}$]
$\theta$: the parameters of the encoder neural network
$\phi$: the parameters of the decoder neural network
Encoder: What do we mean when we say we want to learn a distribution? We mean that we want to learn the parameters of the distribution.
But what are the parameters of $Q(z|X)$? Well, it depends on our modeling assumption!
In VAEs we assume that the latent variables come from a standard normal distribution $N(0, I)$ and that $Q_\theta(z|X)$ is a Gaussian; the job of the encoder is then to predict the parameters $\mu$ and $\Sigma$ of this Gaussian.
$X \in \mathbb{R}^n$, $\mu \in \mathbb{R}^m$ and $\Sigma \in \mathbb{R}^{m \times m}$
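As a rough sketch of what "predicting the parameters of a distribution" can look like, the snippet below maps an input to a mean vector and a covariance matrix. The single tanh hidden layer, the diagonal covariance obtained from a predicted log-variance, the parameter names and the sizes are all assumptions for illustration and are not taken from the slides. The latent dimensionality is called `k` here.

```python
import numpy as np

def encoder(x, params):
    """Map x in R^n to the mean and covariance of a Gaussian Q(z|x) over R^k."""
    hidden = np.tanh(params["W1"] @ x + params["b1"])
    mu = params["W_mu"] @ hidden + params["b_mu"]
    # Assumption: predict an unconstrained log-variance per latent dimension,
    # so that the resulting diagonal covariance is always positive definite.
    log_var = params["W_lv"] @ hidden + params["b_lv"]
    Sigma = np.diag(np.exp(log_var))
    return mu, Sigma

# Hypothetical sizes and untrained random weights, for illustration only.
rng = np.random.default_rng(0)
n, hidden_dim, k = 784, 32, 2
params = {
    "W1": rng.normal(scale=0.05, size=(hidden_dim, n)),
    "b1": np.zeros(hidden_dim),
    "W_mu": rng.normal(scale=0.05, size=(k, hidden_dim)),
    "b_mu": np.zeros(k),
    "W_lv": rng.normal(scale=0.05, size=(k, hidden_dim)),
    "b_lv": np.zeros(k),
}

x = rng.normal(size=n)
mu, Sigma = encoder(x, params)
```

Restricting $\Sigma$ to be diagonal is a common simplification; a full covariance would need a different parameterization to stay positive definite.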

Now what about the decoder?
The job of the decoder is to predict a probability distribution over $X$: $P(X|z)$.
Once again we will assume a certain form for this distribution.
For example, if we want to predict 28 x 28 pixels and each pixel belongs to $\mathbb{R}$ (i.e., $X \in \mathbb{R}^{784}$) then what would be a suitable family for $P(X|z)$?
We could assume that $P(X|z)$ is a Gaussian distribution with unit variance.
The job of the decoder $f$ would then be to predict the mean of this distribution as $f_\phi(z)$.

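What would be the objective function of the decoder?
For any given training sample $x_i$ it should maximize $P(x_i)$ given by
$$P(x_i) = \int P(z) P(x_i|z) \, dz$$
In practice the decoder is trained on $z$ sampled from the encoder, by minimizing the negative expected log-likelihood
$$l_i(\theta) = -E_{z \sim Q_\theta(z|x_i)}[\log P_\phi(x_i|z)]$$
(As usual we take log for numerical stability.)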

This is the loss function for one data point ($l_i(\theta)$) and we will just sum over all the data points to get the total loss $\mathscr{L}(\theta)$:
$$\mathscr{L}(\theta) = \sum_{i=1}^{m} l_i(\theta)$$
In addition, we also want a constraint on the distribution over the latent variables.
Specifically, we had assumed $P(z)$ to be $N(0, I)$ and we want $Q(z|X)$ to be as close to $P(z)$ as possible.
Thus, we will modify the loss function such that
$$l_i(\theta, \phi) = -E_{z \sim Q_\theta(z|x_i)}[\log P_\phi(x_i|z)] + KL(Q_\theta(z|x_i) \| P(z))$$
(KL divergence captures the difference, or distance, between 2 distributions.)
The second term in the loss function can actually be thought of as a regularizer.
It ensures that the encoder does not cheat by mapping each $x_i$ to a different point (a normal distribution with very low variance) in the Euclidean space.
In other words, in the absence of the regularizer the encoder can learn a unique mapping for each $x_i$ and the decoder can then decode from this unique mapping.
Even with high variance in samples from the distribution, we want the decoder to be able to reconstruct the original data very well (the motivation is similar to that of adding noise).
To summarize, for each data point we predict a distribution such that, with high probability, a sample from this distribution should be able to reconstruct the original data point.
But why do we choose a normal distribution? Isn’t it too simplistic to assume that $z$ follows a normal distribution?
Isn’t it a very strong assumption that $P(z) = N(0, I)$?
For example, in the 2-dimensional case how can we be sure that $P(z)$ is a normal distribution and not any other distribution?
The key insight here is that any distribution in $d$ dimensions can be generated by the following steps:
Step 1: Start with a set of $d$ variables that are normally distributed (that’s exactly what we are assuming for $P(z)$).
Step 2: Map these variables through a sufficiently complex function (that’s exactly what the first few layers of the decoder can do).
In particular, note that in the adjoining example if $z$ is 2-D and normally distributed then $f(z)$ is roughly ring shaped (giving us the distribution in the bottom figure):
$$f(z) = \frac{z}{10} + \frac{z}{\|z\|}$$
A non-linear neural network, such as the one we use for the decoder, could learn a complex mapping from $z$ to $f_\phi(z)$ using its parameters $\phi$.
The initial layers of a non-linear decoder could learn their weights such that the output is $f_\phi(z)$.
The above argument suggests that even if we start with normally distributed variables, the initial layers of the decoder could learn a complex transformation of these variables, say $f_\phi(z)$, if required.
The objective function of the decoder will ensure that an appropriate transformation of $z$ is learnt to reconstruct $X$.
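The ring-shaped example can be checked numerically. The sketch below draws 2-D standard normal samples and pushes them through $f(z) = z/10 + z/\|z\|$; the sample count and the seed are arbitrary choices made for illustration.

```python
import numpy as np

def f(z):
    """f(z) = z / 10 + z / ||z||, applied to each row of z."""
    norm = np.linalg.norm(z, axis=1, keepdims=True)
    return z / 10.0 + z / norm

rng = np.random.default_rng(0)
z = rng.standard_normal(size=(1000, 2))   # z ~ N(0, I) in 2-D
x = f(z)

# Each output keeps the direction of z but has radius 1 + ||z|| / 10,
# so the samples concentrate in a thin ring just outside the unit circle.
radii = np.linalg.norm(x, axis=1)
```

Because $f(z)$ is a positive multiple of $z$, the radius of each output is $1 + \|z\|/10$, which is why the transformed samples form a ring rather than a blob around the origin.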

Module 21.3: Variational Autoencoders: The Graphical Model Perspective

Here we can think of $z$ and $X$ as random variables.
We are then interested in the joint probability distribution $P(X, z)$ which factorizes as $P(X, z) = P(z)P(X|z)$.
This factorization is natural because we can imagine that the latent variables are fixed first and then the visible variables are drawn based on the latent variables.
For example, if we want to draw a digit we could first fix the latent variables: the digit, size, angle, thickness, position and so on, and then draw a digit which corresponds to these latent variables.
And of course, unlike RBMs, this is a directed graphical model.
[Figure: directed graphical model with $z \rightarrow X$, repeated over $N$ instances]
Now at inference time, we are given an $X$ (observed variable) and we are interested in finding the most likely assignments of latent variables $z$ which would have resulted in this observation.
Mathematically, we want to find
$$P(z|X) = \frac{P(X|z)P(z)}{P(X)}$$
This is hard to compute because the RHS contains $P(X)$, which is intractable:
$$P(X) = \int P(X|z)P(z) \, dz = \int \int \cdots \int P(X|z_1, z_2, \ldots, z_n) P(z_1, z_2, \ldots, z_n) \, dz_1 \cdots dz_n$$
In RBMs, we had a similar integral which we approximated using Gibbs Sampling.
VAEs, on the other hand, cast this into an optimization problem and learn the parameters of the optimization problem.
Specifically, in VAEs, we assume that instead of $P(z|X)$, which is intractable, the posterior distribution is given by $Q_\theta(z|X)$.
Further, we assume that $Q_\theta(z|X)$ is a Gaussian whose parameters are determined by a neural network: $\mu, \Sigma = g_\theta(X)$.
The parameters of the distribution are thus determined by the parameters $\theta$ of a neural network.
Our job then is to learn the parameters of this neural network.

But what is the objective function for this neural network?
Well, we want the proposed distribution $Q_\theta(z|X)$ to be as close as possible to the true distribution $P(z|X)$.
We can capture this using the following objective function:
$$\text{minimize} \; KL(Q_\theta(z|X) \| P(z|X))$$
What are the parameters of the objective function? (They are the parameters of the neural network; we will return to this again.)

Let us expand the KL divergence term:
$$D[Q_\theta(z|X) \| P(z|X)] = \int Q_\theta(z|X) \log Q_\theta(z|X) \, dz - \int Q_\theta(z|X) \log P(z|X) \, dz$$
$$= E_{z \sim Q_\theta(z|X)}[\log Q_\theta(z|X) - \log P(z|X)]$$
For shorthand we will use $E_Q = E_{z \sim Q_\theta(z|X)}$.
Substituting $P(z|X) = \frac{P(X|z)P(z)}{P(X)}$, we get
$$D[Q_\theta(z|X) \| P(z|X)] = E_Q[\log Q_\theta(z|X) - \log P(X|z) - \log P(z) + \log P(X)]$$
$$= E_Q[\log Q_\theta(z|X) - \log P(z)] - E_Q[\log P(X|z)] + \log P(X)$$
$$= D[Q_\theta(z|X) \| P(z)] - E_Q[\log P(X|z)] + \log P(X)$$
$$\therefore \log P(X) = E_Q[\log P(X|z)] - D[Q_\theta(z|X) \| P(z)] + D[Q_\theta(z|X) \| P(z|X)]$$

So, we have
$$\log P(X) = E_Q[\log P(X|z)] - D[Q_\theta(z|X) \| P(z)] + D[Q_\theta(z|X) \| P(z|X)]$$
Recall that we are interested in maximizing the log likelihood of the data, i.e., $\log P(X)$.
Since KL divergence (the last term, $D[Q_\theta(z|X) \| P(z|X)]$) is always $\geq 0$, we can say that
$$E_Q[\log P(X|z)] - D[Q_\theta(z|X) \| P(z)] \leq \log P(X)$$
The quantity on the LHS is thus a lower bound for the quantity that we want to maximize and is known as the Evidence Lower Bound (ELBO).
Maximizing this lower bound serves as a tractable surrogate for maximizing $\log P(X)$ (the bound is tight when $Q_\theta(z|X)$ equals $P(z|X)$), and hence our objective now becomes
$$\text{maximize} \; E_Q[\log P(X|z)] - D[Q_\theta(z|X) \| P(z)]$$
And this method of learning parameters of probability distributions associated with graphical models using optimization (by maximizing the ELBO) is called variational inference.
Why is this any easier? It is easy because of certain assumptions that we make, as discussed on the next slide.
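The decomposition of $\log P(X)$ into the ELBO plus a KL term can be verified numerically on a toy model where everything has a closed form. The model below ($z \sim N(0, 1)$, $X|z \sim N(z, 1)$, and a 1-D Gaussian $Q$ with mean `m` and variance `s2`) is an assumption chosen purely for illustration; it is not a model from the slides.

```python
import numpy as np

def log_evidence(x):
    """log P(x) for z ~ N(0, 1), x|z ~ N(z, 1), which gives x ~ N(0, 2)."""
    return -0.5 * np.log(2.0 * np.pi * 2.0) - x**2 / 4.0

def elbo(x, m, s2):
    """E_Q[log P(x|z)] - D[Q || P(z)] for Q = N(m, s2)."""
    expected_log_lik = -0.5 * np.log(2.0 * np.pi) - 0.5 * ((x - m) ** 2 + s2)
    kl_to_prior = 0.5 * (s2 + m**2 - 1.0 - np.log(s2))
    return expected_log_lik - kl_to_prior

def kl_to_posterior(x, m, s2):
    """D[Q || P(z|x)], where the true posterior is N(x/2, 1/2) in this toy model."""
    post_mean, post_var = x / 2.0, 0.5
    return 0.5 * (s2 / post_var + (m - post_mean) ** 2 / post_var
                  - 1.0 + np.log(post_var / s2))

x, m, s2 = 1.3, 0.2, 0.8      # arbitrary observation and arbitrary Q
gap = log_evidence(x) - elbo(x, m, s2)
```

Here `gap` coincides with `kl_to_posterior(x, m, s2)`, and it shrinks to zero when `m = x / 2` and `s2 = 0.5`, that is, when $Q$ matches the true posterior.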
First we will just reintroduce the parameters in the equation to make things explicit:
$$\text{maximize} \; E_Q[\log P_\phi(X|z)] - D[Q_\theta(z|X) \| P(z)]$$
At training time, we are interested in learning the parameters $\theta$ and $\phi$ which maximize the above for every training example ($x_i \in \{x_i\}_{i=1}^{N}$).
So our total objective function is
$$\underset{\theta, \phi}{\text{maximize}} \; \sum_{i=1}^{N} \Big( E_Q[\log P_\phi(X = x_i|z)] - D[Q_\theta(z|X = x_i) \| P(z)] \Big)$$
We will shorthand $P(X = x_i)$ as $P(x_i)$.
However, we will assume that we are using stochastic gradient descent, so we need to deal with only one of the terms in the summation, corresponding to the current training example.
So our objective function w.r.t. one example is
$$\underset{\theta, \phi}{\text{maximize}} \; E_Q[\log P_\phi(x_i|z)] - D[Q_\theta(z|x_i) \| P(z)]$$
Now, first we will do a forward prop through the encoder using $X_i$ and compute $\mu(X)$ and $\Sigma(X)$.
The second term in the above objective function is the KL divergence between two normal distributions, $N(\mu(X), \Sigma(X))$ and $N(0, I)$.
With some simple trickery you can show that this term reduces to the following expression (see proof here):
$$D[N(\mu(X), \Sigma(X)) \| N(0, I)] = \frac{1}{2}\left( tr(\Sigma(X)) + \mu(X)^T \mu(X) - k - \log \det(\Sigma(X)) \right)$$
where $k$ is the dimensionality of the latent variables.
This term can be computed easily because we have already computed $\mu(X)$ and $\Sigma(X)$ in the forward pass.
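The closed-form KL term translates directly into a few lines of NumPy. The function name below is hypothetical, and the example inputs are chosen only so that the results are easy to check by hand.

```python
import numpy as np

def kl_to_standard_normal(mu, Sigma):
    """D[N(mu, Sigma) || N(0, I)] = 0.5 * (tr(Sigma) + mu^T mu - k - log det(Sigma))."""
    k = mu.shape[0]
    # slogdet is used instead of log(det(...)) for numerical robustness.
    _, logdet = np.linalg.slogdet(Sigma)
    return 0.5 * (np.trace(Sigma) + mu @ mu - k - logdet)

# Identical distributions: the divergence is zero.
kl_zero = kl_to_standard_normal(np.zeros(3), np.eye(3))

# Shifting the mean by one unit along one axis costs 0.5.
kl_shift = kl_to_standard_normal(np.array([1.0, 0.0, 0.0]), np.eye(3))

# A diagonal covariance with unit determinant: only the trace term contributes.
kl_diag = kl_to_standard_normal(np.zeros(2), np.diag([0.5, 2.0]))
```

With a diagonal $\Sigma$ the trace and log-determinant reduce to sums over the diagonal entries, so the cost is linear in $k$.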

Now let us look at the other term in the objective function:
$$\sum_{i=1}^{N} E_Q[\log P_\phi(x_i|z)]$$
This is again an expectation and hence intractable (integral over $z$).
In VAEs, we approximate this with a single $z$ sampled from $N(\mu(X), \Sigma(X))$.
Hence this term is also easy to compute (of course it is a nasty approximation but we will live with it!).

Further, as usual, we need to assume some parametric form for $P(X|z)$.
For example, if we assume that $P(X|z)$ is a Gaussian with mean $\mu(z)$ and variance $I$ then
$$\log P(X = X_i|z) = C - \frac{1}{2}\|X_i - \mu(z)\|^2$$
$\mu(z)$ in turn is a function of the parameters of the decoder and can be written as $f_\phi(z)$:
$$\log P(X = X_i|z) = C - \frac{1}{2}\|X_i - f_\phi(z)\|^2$$
Our effective objective function thus becomes (dropping the constant $C$ and the factor $\frac{1}{2}$ on the reconstruction term)
$$\underset{\theta, \phi}{\text{minimize}} \; \sum_{i=1}^{N} \left[ \frac{1}{2}\left( tr(\Sigma(X_i)) + \mu(X_i)^T \mu(X_i) - k - \log \det(\Sigma(X_i)) \right) + \|X_i - f_\phi(z)\|^2 \right]$$
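Putting the two terms together, the contribution of a single training example to the objective can be sketched as below. The function name is hypothetical, `x_hat` stands for $f_\phi(z)$ computed from one sampled $z$ (the sampling and the decoder are not shown), and the reconstruction term is left unscaled exactly as written in the objective above.

```python
import numpy as np

def vae_loss_single(x, mu, Sigma, x_hat):
    """KL regularizer plus squared reconstruction error for one example."""
    k = mu.shape[0]
    _, logdet = np.linalg.slogdet(Sigma)
    kl = 0.5 * (np.trace(Sigma) + mu @ mu - k - logdet)
    recon = np.sum((x - x_hat) ** 2)     # ||x - f_phi(z)||^2
    return kl + recon

# Made-up encoder outputs and reconstruction, for illustration only.
mu = np.array([0.5, -0.5])
Sigma = np.eye(2)
x = np.ones(4)
x_hat = np.zeros(4)
loss = vae_loss_single(x, mu, Sigma, x_hat)

# The loss vanishes only when the reconstruction is exact and Q equals the prior.
perfect = vae_loss_single(x, np.zeros(2), np.eye(2), x)
```

Summing this quantity over the training examples (or over a mini-batch when using stochastic gradient descent) gives the objective to be minimized over $\theta$ and $\phi$.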

The above loss can be easily computed and we can update the parameters $\theta$ of the encoder and $\phi$ of the decoder using backpropagation.
However, there is a catch!
The network is not end to end differentiable because the output $f_\phi(z)$ is not an end to end differentiable function of the input $X$.
Why? Because after passing $X$ through the network we simply compute $\mu(X)$ and $\Sigma(X)$ and then sample a $z$ to be fed to the decoder.
This makes the entire process non-deterministic and hence $f_\phi(z)$ is not a continuous function of the input $X$.

VAEs use a neat trick to get around this problem.
This is known as the reparameterization trick, wherein we move the process of sampling to an input layer.
For the 1-dimensional case, given $\mu$ and $\sigma$ we can sample from $N(\mu, \sigma^2)$ by first sampling $\epsilon \sim N(0, 1)$, and then computing
$$z = \mu + \sigma * \epsilon$$
The adjacent figure shows the difference between the original network and the reparameterized network.
The randomness in $f_\phi(z)$ is now associated with $\epsilon$ and not $X$ or the parameters of the model.
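A small numerical sketch of the 1-dimensional reparameterization trick is given below. The values of `mu` and `sigma`, the sample count and the finite-difference step are arbitrary choices for illustration. The point is that, once $\epsilon$ is drawn, $z$ is an ordinary deterministic function of $\mu$ and $\sigma$, so its derivatives with respect to them exist.

```python
import numpy as np

def reparameterize(mu, sigma, eps):
    """z = mu + sigma * eps, with eps drawn from N(0, 1) outside this function."""
    return mu + sigma * eps

rng = np.random.default_rng(0)
mu, sigma = 2.0, 0.5
eps = rng.standard_normal(100_000)    # all the randomness lives here
z = reparameterize(mu, sigma, eps)    # samples distributed as N(mu, sigma^2)

# With eps held fixed, finite differences recover dz/dmu = 1 and dz/dsigma = eps.
step = 1e-6
dz_dmu = (reparameterize(mu + step, sigma, eps) - z) / step
dz_dsigma = (reparameterize(mu, sigma + step, eps) - z) / step
```

In a real VAE an automatic differentiation library would compute these derivatives; the finite differences here only illustrate that they are well defined.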
With that we are done with the process of training VAEs.
Specifically, we have described the data, model, parameters, objective function and learning algorithm:
Data: $\{X_i\}_{i=1}^{N}$
Model: $\hat{X} = f_\phi(\mu(X) + \Sigma(X)^{1/2} * \epsilon)$, with $\epsilon \sim N(0, I)$
Parameters: $\theta, \phi$
Algorithm: Gradient descent
Objective:
$$\sum_{i=1}^{N} \left[ \frac{1}{2}\left( tr(\Sigma(X_i)) + \mu(X_i)^T \mu(X_i) - k - \log \det(\Sigma(X_i)) \right) + \|X_i - f_\phi(z)\|^2 \right]$$
Now what happens at test time? We need to consider both abstraction and generation.
In other words, we are interested in computing a $z$ given an $X$ as well as in generating an $X$ given a $z$.
Let us look at each of these goals.
Abstraction
After the model parameters are learned we feed an $X$ to the encoder.
By doing a forward pass using the learned parameters of the model we compute $\mu(X)$ and $\Sigma(X)$.
We then sample a $z$ from the distribution $N(\mu(X), \Sigma(X))$, using the same reparameterization trick.
In other words, once we have obtained $\mu(X)$ and $\Sigma(X)$, we first sample $\epsilon \sim N(0, I)$ and then compute $z$:
$$z = \mu + \sigma * \epsilon$$

Generation
After the model parameters are learned we remove the encoder and feed a $z \sim N(0, I)$ to the decoder.
The decoder will then predict $f_\phi(z)$ and we can draw an $X \sim N(f_\phi(z), I)$.
Why would this work?
Well, we had trained the model to minimize $D(Q_\theta(z|X) \| P(z))$ where $P(z)$ was $N(0, I)$.
If the model is trained well then $Q_\theta(z|X)$ should also be close to $N(0, I)$.
Hence, if we feed $z \sim N(0, I)$, it is almost as if we are feeding a $z \sim Q_\theta(z|X)$, and the decoder was indeed trained to produce a good $f_\phi(z)$ from such a $z$.
Hence this will work!
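The generation procedure can be sketched as follows. The decoder here is a stand-in with untrained random weights, and its architecture, parameter names and sizes are assumptions for illustration; with a trained decoder the same three steps would produce meaningful samples.

```python
import numpy as np

def decoder(z, params):
    """Stand-in for f_phi(z): returns the mean of P(X|z)."""
    hidden = np.tanh(params["W1"] @ z + params["b1"])
    return params["W2"] @ hidden + params["b2"]

def generate(params, k, rng):
    z = rng.standard_normal(k)                     # step 1: z ~ N(0, I)
    mean = decoder(z, params)                      # step 2: predict f_phi(z)
    x = mean + rng.standard_normal(mean.shape)     # step 3: X ~ N(f_phi(z), I)
    return z, mean, x

# Hypothetical sizes and untrained random weights, for illustration only.
rng = np.random.default_rng(0)
k, hidden_dim, n = 2, 32, 784
params = {
    "W1": rng.normal(scale=0.5, size=(hidden_dim, k)),
    "b1": np.zeros(hidden_dim),
    "W2": rng.normal(scale=0.5, size=(n, hidden_dim)),
    "b2": np.zeros(n),
}

z, mean, x = generate(params, k, rng)
```

In practice the mean $f_\phi(z)$ itself is often used as the generated sample instead of adding unit-variance noise to it.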