Deep Learning Notes
MODULE-1
Ans: A Recurrent Neural Network (RNN) is a type of neural network in which
the output from the previous step is fed as input to the current step. In traditional
neural networks, all the inputs and outputs are independent of each other, but in
cases such as predicting the next word of a sentence, the previous
words are required, so the network needs to remember them.
RNNs address this issue with the help of a hidden
layer. The most important feature of an RNN is the hidden state, which
remembers some information about a sequence.
An RNN has a “memory” which retains information about what has been
calculated so far. It uses the same parameters for each input, as it performs the same task
on all the inputs or hidden layers to produce the output. This reduces the
number of parameters, unlike other neural networks.
The working of an RNN can be understood with the help of the example below:
Example:
Suppose there is a deeper network with one input layer, three hidden layers and
one output layer. Then, like other neural networks, each hidden layer will have its
own set of weights and biases: say (w1, b1) for the first hidden layer,
(w2, b2) for the second hidden layer and (w3, b3) for the third hidden layer.
This means that each of these layers is independent of the others, i.e. they do not
memorize the previous outputs.
Now the RNN will do the following:
The RNN converts the independent activations into dependent activations by
providing the same weights and biases to all the layers, thus avoiding a
growing number of parameters, and it memorizes each previous output
by giving each output as input to the next hidden layer.
Hence these three layers can be joined together into a single recurrent layer, such that
the weights and biases of all the hidden layers are the same.
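The weight sharing described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the tanh activation, the layer sizes, and the names rnn_forward, W_xh, W_hh and b_h are assumptions made for the example.

```python
import numpy as np

def rnn_forward(inputs, W_xh, W_hh, b_h):
    """Run a simple tanh RNN over a sequence, reusing one weight set at every step."""
    h = np.zeros(W_hh.shape[0])          # initial hidden state
    states = []
    for x_t in inputs:                   # one step per sequence element
        # The same (W_xh, W_hh, b_h) are applied at every step; the previous
        # hidden state h is fed back in, which is what gives the RNN its memory.
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
input_size, hidden_size, steps = 4, 3, 5   # illustrative sizes (assumed)
W_xh = rng.normal(scale=0.5, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.5, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)
sequence = rng.normal(size=(steps, input_size))

hidden_states = rnn_forward(sequence, W_xh, W_hh, b_h)
print(hidden_states.shape)  # (5, 3)
```

However long the sequence is, the number of parameters stays fixed, because the loop reuses the same three arrays.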
2) What are the advantages and disadvantages of RNN and mention the
steps involved to train neural network through RNN?
Recurrent neural networks are even used with convolutional layers to extend
the effective pixel neighbourhood.
The following steps are performed to train a neural network through RNN:
the weights between the input and the hidden layer (the ‘reservoir’), W_in, and also
the weights within the reservoir are randomly assigned and not trainable
the weights of the output neurons (the ‘readout’ layer) are trainable and can be
learned so that the network can reproduce specific temporal patterns
the hidden layer (or the ‘reservoir’) is very sparsely connected (typically < 10%
connectivity)
The following are the reasons why, and when, to use an echo state
network:
Traditional NN architectures suffer from the vanishing/exploding gradient
problem and as such, the parameters in the hidden layers either don’t change
that much or they lead to numeric instability and chaotic behavior. Echo state
networks don’t suffer from this problem.
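As a rough sketch of the echo state idea, the code below fixes random input and reservoir weights and trains only the readout. The toy sine-wave task, the reservoir size, the spectral-radius rescaling to 0.9, and the use of ridge regression for the readout are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 1, 100

# Input weights W_in and reservoir weights W are random and never trained.
W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
W[rng.random((n_res, n_res)) > 0.1] = 0.0           # keep roughly 10% of connections
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))     # rescale spectral radius (assumed 0.9)

def run_reservoir(u):
    """Collect reservoir states for an input sequence u of shape (T, n_in)."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(W_in @ u_t + W @ x)
        states.append(x)
    return np.array(states)

# Toy task (assumed): predict the next value of a sine wave.
t = np.arange(0, 60, 0.1)
signal = np.sin(t)
u, y = signal[:-1, None], signal[1:]
X = run_reservoir(u)[50:]     # drop an initial washout period
y = y[50:]

# Only the readout weights are learned, here by ridge regression.
ridge = 1e-4
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ y)
mse = np.mean((X @ W_out - y) ** 2)
print(mse)
```

Because training reduces to one linear solve for W_out, no gradients are propagated back through time.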
1. First, present the input pattern and propagate it through the network to get
the output.
2. Then compare the predicted output to the expected output and calculate the
error.
3. Then calculate the derivatives of the error with respect to the network weights.
The backpropagation algorithm is suitable for feedforward neural networks with
fixed-size input-output pairs.
LSTMs can be applied to a variety of deep learning tasks that mostly include
prediction based on previous information. Two noteworthy examples include text
prediction and stock prediction:
Text Prediction
Stock Prediction
Simple Machine Learning (SML) models are able to predict stock values and
prices based on inputs such as the opening value and the volume of the stock.
While these values do take part in stock prediction, they lack a key component.
To properly predict a stock value with high accuracy, the model needs to take
into account one of the biggest factors—the trend of the stock. To do so, the
model needs to identify the trend based on the values recorded over the preceding
days—a task suited to an LSTM network.
MODULE-2
The Algorithm:
3. Every node is examined to calculate which one’s weights are most like the
input vector. The winning node is commonly known as the Best Matching Unit
(BMU).
4. Then the neighbourhood of the BMU is calculated. The number of neighbors
decreases over time.
5. The winning weight is rewarded by becoming more like the sample vector.
The neighbors also become more like the sample vector. The closer a node is to
the BMU, the more its weights get altered, and the farther away the neighbor is
from the BMU, the less it learns.
1. It does not build a generative model for the data, i.e, the model does not
understand how data is created.
2. It does not handle categorical data well, and handles mixed-type data even
worse.
3. Model preparation is slow, and it is hard to train against slowly evolving
data.
Q3. Explain the training processes of SOM.
SOM doesn’t use backpropagation with SGD to update weights; this type of
unsupervised artificial neural network uses competitive learning to update its
weights. Competitive learning is based on three processes:
• Competition
• Cooperation
• Adaptation
Competition: each neuron in a SOM is assigned a weight vector with the same
dimensionality as the input space, so each neuron of the
output layer has a vector of dimension n. We compute the distance
between each neuron of the output layer and the input data, and the
neuron with the lowest distance is the winner of the competition.
Cooperation: the vector of the winner neuron will be updated in the final process
(adaptation), but it is not the only one; its neighbors will also be updated.
To choose neighbors we use a neighborhood kernel function. This function depends
on two factors: time (incremented with each new input) and the distance
between the winner neuron and the other neuron (how far the neuron is from the
winner neuron). The neighbors of the winner neuron are chosen depending on
these distance and time factors.
Adaptation: after choosing the winner neuron and its neighbors, we compute the
neuron updates. The chosen neurons are updated, but not all by the same amount:
the greater the distance between a neuron and the input data, the less we adjust it.
The winner neuron and its neighbors are updated using a formula governed by a
learning rate, which indicates how much we want to adjust the weights. As time t
tends to positive infinity, this learning rate converges to zero, so there is no
update even for the winner neuron.
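The three processes can be sketched as a single training step. The notes do not give the exact update formula, so this example assumes a common form: a Gaussian neighbourhood kernel and exponential decay of both the learning rate and the neighbourhood radius. The function name som_step and the parameters lr0 and sigma0 are hypothetical.

```python
import numpy as np

def som_step(weights, x, t, n_steps, lr0=0.5, sigma0=2.0):
    """One competition / cooperation / adaptation step on a 2-D SOM grid.

    weights: array of shape (rows, cols, dim); x: input vector of shape (dim,).
    """
    rows, cols, _ = weights.shape
    # Competition: the neuron whose weight vector is closest to x wins.
    dists = np.linalg.norm(weights - x, axis=2)
    bmu = np.unravel_index(np.argmin(dists), dists.shape)

    # Learning rate and neighbourhood radius both shrink with time (assumed decay).
    lr = lr0 * np.exp(-t / n_steps)
    sigma = sigma0 * np.exp(-t / n_steps)

    # Cooperation: Gaussian neighbourhood kernel centred on the winner.
    grid_r, grid_c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    grid_dist2 = (grid_r - bmu[0]) ** 2 + (grid_c - bmu[1]) ** 2
    h = np.exp(-grid_dist2 / (2 * sigma ** 2))

    # Adaptation: move each neuron toward x, scaled by lr and its kernel value.
    weights += lr * h[:, :, None] * (x - weights)
    return bmu

rng = np.random.default_rng(0)
data = rng.random((200, 3))          # toy inputs in [0, 1]^3
weights = rng.random((5, 5, 3))      # 5x5 map of 3-D weight vectors
n_steps = 1000
for t in range(n_steps):
    som_step(weights, data[rng.integers(len(data))], t, n_steps)
```

The winner receives the largest update (kernel value 1), and neurons farther away on the grid receive progressively smaller ones.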
The K-means algorithm is an iterative algorithm that tries to partition the dataset into
K pre-defined distinct non-overlapping subgroups (clusters) where each data point
belongs to only one group. It tries to make the intra-cluster data points as similar
as possible while also keeping the clusters as different (far) as possible. It assigns
data points to a cluster such that the sum of the squared distance between the data
points and the cluster’s centroid (arithmetic mean of all the data points that
belong to that cluster) is at the minimum. The less variation we have within
clusters, the more homogeneous (similar) the data points are within the same
cluster.
3. Keep iterating until there is no change to the centroids, i.e. the assignment of data
points to clusters isn’t changing.
4. Compute the sum of the squared distance between data points and all centroids.
5. Assign each data point to the closest cluster (centroid).
6. Compute the centroids for the clusters by taking the average of all the data points
that belong to each cluster.
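The iteration described above can be sketched as follows. The random initialisation from k data points, the iteration cap n_iters, and the toy two-blob data are assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialise centroids by picking k distinct data points at random (assumed scheme).
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Squared distance from every point to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)               # assign each point to the closest centroid
        # Recompute each centroid as the mean of its points (keep it if the cluster is empty).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):    # stop when centroids no longer move
            break
        centroids = new_centroids
    return centroids, labels

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])
centroids, labels = kmeans(X, k=2)
```

On data like this, with two well-separated groups, the first 50 points end up in one cluster and the last 50 in the other.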
5. Explain the application and distance measure of k means clustering.
Academic Performance
Diagnostic systems
Search engines
The clustering algorithm plays the role of finding the cluster heads, which
collect all the data in their respective clusters.
Distance Measure
Distance measure determines the similarity between two elements and influences
the shape of clusters.
K-Means clustering supports various kinds of distance measures, such as:
Euclidean distance: the most common case is determining the distance between two
points. If we have a point P and a point Q, the Euclidean distance is the length of the
ordinary straight line between the two points in Euclidean space.
Squared Euclidean distance: this is identical to the Euclidean distance measurement
but does not take the square root at the end.
Manhattan distance: the simple sum of the horizontal and vertical
components, or the distance between two points measured along axes at right
angles.
Cosine distance: in this case, we take the angle between the two vectors formed by
joining the points to the origin.
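The four measures can be written down directly. The function names are illustrative, and expressing the cosine measure as 1 minus the cosine of the angle is one common convention, assumed here.

```python
import numpy as np

def euclidean(p, q):
    return np.sqrt(np.sum((p - q) ** 2))

def squared_euclidean(p, q):
    return np.sum((p - q) ** 2)          # same as Euclidean, without the square root

def manhattan(p, q):
    return np.sum(np.abs(p - q))         # sum of per-axis differences

def cosine_distance(p, q):
    # 1 - cos(angle between p and q, viewed as vectors from the origin)
    return 1.0 - np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q))

p = np.array([1.0, 2.0])
q = np.array([4.0, 6.0])
print(euclidean(p, q))          # 5.0
print(squared_euclidean(p, q))  # 25.0
print(manhattan(p, q))          # 7.0
print(cosine_distance(p, q))
```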
Pros:
1. Simple: It is easy to implement k-means and identify unknown groups of data
from complex data sets. The results are presented in an easy and simple manner.
2. Flexible: K-means algorithm can easily adjust to the changes. If there are any
problems, adjusting the cluster segment will allow changes to easily occur on the
algorithm.
4. Efficient: The algorithm is good at segmenting large data sets. Its
efficiency depends on the shape of the clusters. K-means works well with
hyperspherical clusters.
10. Spherical clusters: This mode of clustering works well when dealing with
spherical clusters. It operates on an assumption about the joint distribution of features,
since each cluster is spherical: all of a cluster's features have equal
variance and are independent of each other.
Cons:
3. Uniform effect: It produces clusters of uniform size even when the groups in the
input data have different sizes.
4. Order of values: The order in which the data is presented to the algorithm
affects the final results for the data set.
The objective of an autoencoder is to minimize the reconstruction error between the input and output.
This helps autoencoders to learn important features present in the data. When a
representation allows a good reconstruction of its input then it has retained much of the
information present in the input.
Modern autoencoders have generalized the idea of an encoder and a decoder beyond
deterministic functions to stochastic mappings p_encoder(h | x) and p_decoder(x | h).
The idea of autoencoders has been part of the historical landscape of neural networks for
decades. Traditionally, autoencoders were used for dimensionality reduction or feature
learning. Recently, theoretical connections between autoencoders and latent variable
models have brought autoencoders to the forefront of generative modeling. Autoencoders
may be thought of as being a special case of feedforward networks and may be trained
with all the same techniques, typically mini batch gradient descent following gradients
computed by back-propagation. Unlike general feedforward networks, autoencoders may
also be trained using recirculation, a learning algorithm based on comparing the
activations of the network on the original input to the activations on the reconstructed
input. Recirculation is regarded as more biologically plausible than back-propagation but
is rarely used for machine learning applications.
The general structure of an autoencoder maps an input x to an output (called the
reconstruction) r through an internal representation or code h. The autoencoder has two
components: the encoder f (mapping x to h) and the decoder g (mapping h to r).
One way to obtain useful features from the autoencoder is to constrain h to have a smaller
dimension than x. An autoencoder whose code dimension is less than the input dimension
is called undercomplete. Learning an undercomplete representation forces the
autoencoder to capture the most salient features of the training data. The learning process
is described simply as minimizing a loss function L(x, g(f(x))), where L is a loss function
penalizing g(f(x)) for being dissimilar from x, such as the mean squared error. When the
decoder is linear and L is the mean squared error, an undercomplete autoencoder learns to
span the same subspace as PCA. In this case, an autoencoder trained to perform the
copying task has learned the principal subspace of the training data as a side effect.
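To make the PCA connection concrete, the sketch below does not train an autoencoder. Instead it builds the linear encoder f and decoder g directly from the principal directions and evaluates L(x, g(f(x))) with mean squared error. The toy data, the code dimension of 2, and all names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data (assumed): 200 points in 5-D that lie close to a 2-D subspace.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))
X = X - X.mean(axis=0)                # centre the data, as PCA does

# Principal directions from the SVD of the centred data.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
W = Vt[:2].T                          # shape (5, 2): top-2 principal directions

def f(x):                             # encoder: x -> h, code dimension 2 < 5
    return x @ W

def g(h):                             # linear decoder: h -> r
    return h @ W.T

def mse_loss(x, r):                   # L(x, g(f(x))) with mean squared error
    return np.mean((x - r) ** 2)

loss = mse_loss(X, g(f(X)))
print(f(X).shape, loss)
```

A linear undercomplete autoencoder trained with this loss would be expected to reach the same reconstruction error, since it spans the same subspace.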
Undercomplete autoencoders, with code dimension less than the input dimension, can
learn the most salient features of the data distribution. We have seen that these
autoencoders fail to learn anything useful if the encoder and decoder are given too much
capacity.
This is when our encoding output's dimension is larger than our input's dimension. A
similar problem occurs if the hidden code is allowed to have dimension equal to the
input, and in the overcomplete case in which the hidden code has dimension greater than
the input. In these cases, even a linear encoder and a linear decoder can learn to copy the
input to the output without learning anything useful about the data distribution.
The penalty Ω(h) is the squared Frobenius norm (sum of squared elements) of the
Jacobian matrix of partial derivatives associated with the encoder function. There is a
connection between the denoising autoencoder and the contractive autoencoder: Alain
and Bengio showed that in the limit of small Gaussian input noise, the denoising
reconstruction error is equivalent to a contractive penalty on the reconstruction function
that maps x to r = g(f(x)).
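For a concrete case, the penalty can be computed in closed form when the encoder is a single sigmoid layer. That encoder form, h = sigmoid(Wx + b), is an assumption for this sketch; the notes do not fix the encoder.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encoder(x, W, b):
    return sigmoid(W @ x + b)

def contractive_penalty(x, W, b):
    """Squared Frobenius norm of the encoder Jacobian dh/dx at the point x."""
    h = encoder(x, W, b)
    # For a sigmoid encoder, J[j, i] = h_j * (1 - h_j) * W[j, i].
    J = (h * (1 - h))[:, None] * W
    return np.sum(J ** 2)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))     # 5 inputs, 3 code units (illustrative sizes)
b = rng.normal(size=3)
x = rng.normal(size=5)
penalty = contractive_penalty(x, W, b)
print(penalty)
```

During training this term would be added to the reconstruction loss with a weighting coefficient, which discourages the code from changing much under small input perturbations.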
In a denoising autoencoder, noise is intentionally added to the raw input before it is
provided to the network. This corruption can be achieved using a stochastic mapping. Denoising
autoencoders create a corrupted copy of the input by introducing some noise. This prevents the
autoencoder from simply copying the input to the output without learning features about the data.
Corruption of the input can be done randomly by setting some of the inputs to zero.
The remaining nodes copy their input values to the noised input. Denoising autoencoders must remove
the corruption to generate an output that is similar to the input. The output is compared with
the input and not with the noised input. To minimize the loss function, training continues until
convergence.
Denoising autoencoders minimize the loss function between the output node and the
original, uncorrupted input. Denoising helps the autoencoder to learn the latent representation
present in the data. Denoising autoencoders rely on the idea that a good representation is one that can
be derived robustly from a corrupted input and that will be useful for recovering the
corresponding clean input.
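A small sketch of the corruption step and the loss: the drop probability, the mean squared error loss, and the identity function standing in for an untrained g(f(.)) are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt(x, drop_prob=0.3):
    """Masking noise: set a random subset of input entries to zero."""
    mask = rng.random(x.shape) >= drop_prob
    return x * mask

def denoising_loss(x_clean, reconstruct):
    """Reconstruct from the corrupted input, but score against the clean input."""
    x_noisy = corrupt(x_clean)
    r = reconstruct(x_noisy)
    return np.mean((x_clean - r) ** 2)

x = rng.random((8, 10)) + 0.1      # toy batch with strictly positive entries
identity = lambda v: v             # stand-in for g(f(.)); a real model would be trained
loss_identity = denoising_loss(x, identity)
print(loss_identity)
```

Because the target is the clean input, a model that merely copies its (corrupted) input incurs a nonzero loss, which is why copying is no longer a useful shortcut.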
MODULE-4
1. What is Boltzmann Machine? Explain with its Testing and Training Algorithm.
A Boltzmann Machine is a network of symmetrically connected, neuron-like units that
make stochastic decisions about whether to be on or off. Boltzmann machines have a
simple learning algorithm that allows them to discover interesting features in datasets
composed of binary vectors. The learning algorithm is very slow in networks with many
layers of feature detectors, but it can be made much faster by learning one layer of feature
detectors at a time.
Boltzmann machines are used to solve two quite different computational problems.
For a search problem, the weights on the connections are fixed and are used to represent
the cost function of an optimization problem. The stochastic dynamics of a Boltzmann
machine then allow it to sample binary state vectors that represent good solutions to the
optimization problem.
For a learning problem, the Boltzmann machine is shown a set of binary data vectors and
it must find weights on the connections so that the data vectors are good solutions to the
optimization problem defined by those weights. To solve a learning problem, Boltzmann
machines make many small updates to their weights, and each update requires them to
solve many different search problems.
In terms of architecture, a Boltzmann machine is a two-dimensional array of units.
Here, weights on interconnections between units are -p where p > 0. The weights of
self-connections are given by b where b > 0.
2. What is Restricted Boltzmann machine? Explain its working in detail.
RBMs are two-layered artificial neural networks with generative capabilities. They have the
ability to learn a probability distribution over their set of inputs. RBMs were invented by
Geoffrey Hinton and can be used for dimensionality reduction, classification, regression,
collaborative filtering, feature learning, and topic modeling. RBMs are a special class
of Boltzmann Machines and they are restricted in terms of the connections between the visible
and the hidden units. This makes them easier to implement than general Boltzmann
Machines. As stated earlier, they are two-layered neural networks (one layer being the visible layer
and the other being the hidden layer) and these two layers are connected by a complete
bipartite graph. This means that every node in the visible layer is connected to every node in
the hidden layer but no two nodes in the same group are connected to each other. This
restriction allows for more efficient training algorithms than what is available for the general
class of Boltzmann machines, in particular, the gradient-based contrastive divergence
algorithm.
Each visible node takes a low-level feature from an item in the dataset to be learned. At node
1 of the hidden layer, x is multiplied by a weight and added to a bias. The result of those two
operations is fed into an activation function, which produces the node’s output, or the
strength of the signal passing through it, given input x.
Next, let’s look at how several inputs would combine at one hidden node. Each x is
multiplied by a separate weight, the products are summed, added to a bias, and again the result
is passed through an activation function to produce the node’s output.
At each hidden node, each input x is multiplied by its respective weight w. That is, a single
input x would have three weights here, making 12 weights altogether (4 input nodes x 3 hidden
nodes). The weights between the two layers will always form a matrix where the number of rows
equals the number of input nodes, and the number of columns equals the number of output nodes.
Each hidden node receives the four inputs multiplied by their respective weights. The sum of
those products is again added to a bias (which forces at least some activations to happen), and
the result is passed through the activation function, producing one output for each hidden node.
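The 4-input, 3-hidden-node case above can be written as one matrix product. The sigmoid activation, the random weights, and the example input values are assumptions for this sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_visible, n_hidden = 4, 3
# Rows correspond to input (visible) nodes, columns to hidden nodes.
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
b_hidden = np.zeros(n_hidden)

x = np.array([1.0, 0.0, 1.0, 1.0])        # one value per visible node (example input)
hidden_out = sigmoid(x @ W + b_hidden)    # weighted sum plus bias, then activation

print(W.size)             # 12
print(hidden_out.shape)   # (3,)
```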
With the working of a Restricted Boltzmann Machine in place, the steps involved in training an
RBM are as follows.
The training of the Restricted Boltzmann Machine differs from the training of regular neural
networks via stochastic gradient descent.
Gibbs Sampling
The first part of the training is called Gibbs Sampling. Given an input vector v, we use p(h|v) to
predict the hidden values h. Knowing the hidden values, we use p(v|h)
to predict new input values v. This process is repeated k times. After k iterations, we
obtain another input vector v_k, which was recreated from the original input values v_0.
Contrastive Divergence step
The update of the weight matrix happens during the Contrastive Divergence step.
Vectors v_0 and v_k are used to calculate the activation probabilities for hidden
values h_0 and h_k, that is, p(h_0 | v_0) and p(h_k | v_k).
The difference between the outer products of those probabilities with input
vectors v_0 and v_k results in the update matrix:
ΔW = v_0 ⊗ p(h_0 | v_0) - v_k ⊗ p(h_k | v_k)
Using the update matrix, the new weights can be calculated with gradient ascent, given by:
W_new = W_old + ΔW
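Gibbs sampling and the contrastive divergence update can be sketched together for a binary RBM. The sigmoid conditionals, the layer sizes, the learning rate, and the names cd_k_update, b_v and b_h are assumptions made for the example; bias updates are left out to keep it short.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd_k_update(v0, W, b_v, b_h, k=1, rng=None):
    """Return the CD-k weight update matrix for one binary input vector v0."""
    rng = np.random.default_rng() if rng is None else rng
    v = v0
    for _ in range(k):
        # Gibbs sampling: sample h from p(h|v), then a new v from p(v|h).
        p_h = sigmoid(v @ W + b_h)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        p_v = sigmoid(h @ W.T + b_v)
        v = (rng.random(p_v.shape) < p_v).astype(float)
    vk = v
    # Activation probabilities of the hidden units for v0 and vk.
    p_h0 = sigmoid(v0 @ W + b_h)
    p_hk = sigmoid(vk @ W + b_h)
    # The difference of the two outer products gives the update matrix.
    return np.outer(v0, p_h0) - np.outer(vk, p_hk)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(6, 3))      # 6 visible, 3 hidden units (illustrative)
b_v, b_h = np.zeros(6), np.zeros(3)
v0 = np.array([1.0, 1.0, 0.0, 0.0, 1.0, 0.0])

learning_rate = 0.1                          # assumed step size
delta_W = cd_k_update(v0, W, b_v, b_h, k=1, rng=rng)
W_new = W + learning_rate * delta_W          # gradient ascent step
```

Scaling the update by a learning rate is a common practical choice; setting it to 1 recovers the plain form of the update.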
4. Explain Deep Boltzmann Machine. How does it differ from Deep Belief Network?
Deep Boltzmann Machine
Like RBM, no intralayer connection exists in DBM. Connections exist only between units
of the neighboring layers
DBM can be organized as a bipartite graph with odd layers on one side and even layers on the
other side
Units within the layers are independent of each other but are dependent on neighboring layers
Learning is made efficient by greedy layer-by-layer pre-training, which is done
slightly differently than in DBN
After learning the binary features in each layer, DBM is fine tuned by back propagation.
Difference between Deep Belief Networks (DBN) and Deep Boltzmann Machines (DBM)
Deep Belief Networks (DBN) have top two layers with undirected connections and lower
layers with directed connections
The approximate inference procedure for DBM uses top-down feedback in addition to the usual
bottom-up pass, allowing Deep Boltzmann Machines to better incorporate uncertainty about
ambiguous inputs.
A disadvantage of DBM is that the approximate inference based on the mean field approach is slower
compared to a single bottom-up pass as in Deep Belief Networks. Mean field inference needs
to be performed for every new test input.
Deep belief nets are probabilistic generative models that are composed of multiple layers of
stochastic, latent variables. The latent variables typically have binary values and are often
called hidden units or feature detectors. The top two layers have undirected, symmetric
connections between them and form an associative memory. The lower layers receive top-down,
directed connections from the layer above. The states of the units in the lowest layer represent a
data vector.
The two most significant properties of deep belief nets are:
There is an efficient, layer-by-layer procedure for learning the top-down, generative weights
that determine how the variables in one layer depend on the variables in the layer above.
After learning, the values of the latent variables in every layer can be inferred by a single,
bottom-up pass that starts with an observed data vector in the bottom layer and uses the
generative weights in the reverse direction.
Deep belief nets are learned one layer at a time by treating the values of the latent variables in
one layer, when they are being inferred from data, as the data for training the next layer. This
efficient, greedy learning can be followed by, or combined with, other learning procedures that
fine-tune all of the weights to improve the generative or discriminative performance of the whole
network.
Discriminative fine-tuning can be performed by adding a final layer of variables that represent
the desired outputs and backpropagating error derivatives. When networks with many hidden
layers are applied to highly-structured input data, such as images, backpropagation works much
better if the feature detectors in the hidden layers are initialized by learning a deep belief net that
models the structure in the input data.
The principle of greedy layer-wise unsupervised training can be applied to DBNs with RBMs as
the building blocks for each layer. The process is as follows:
1. Train the first layer as an RBM that models the raw input as its visible layer.
2. Use that first layer to obtain a representation of the input that will be used as data for the second
layer. Two common solutions exist: this representation can be chosen as being the mean
activations of the first layer's hidden units, or samples drawn from them.
3. Train the second layer as an RBM, taking the transformed data (samples or mean activations) as
training examples (for the visible layer of that RBM).
4. Iterate (2 and 3) for the desired number of layers, each time propagating upward either samples
or mean values.
5. Fine-tune all the parameters of this deep architecture with respect to a proxy for the DBN log-
likelihood, or with respect to a supervised training criterion (after adding extra learning machinery
to convert the learned representation into supervised predictions, e.g. a linear classifier).
Adaptive computation time. We can run sequential refinement for a long amount of time
to generate sharp, diverse samples or for a short amount of time to get coarser, less diverse
samples. In the limit of infinite time, this procedure is known to generate true samples
from the energy model.
Not restricted by generator network. In both VAEs and Flow based models, the
generator must learn a map from a continuous space to a possibly disconnected space
containing different data modes, which requires large capacity and may not be possible to
learn. EBMs, by contrast, can easily learn to assign low energies at disjoint regions.
Built-in compositionality. Since each model represents an unnormalized probability
distribution, models can be naturally combined through product of experts or other
hierarchical models.
Generation
Studies found energy-based models are able to generate qualitatively and quantitatively high-
quality images, especially when running the refinement process for a longer period at test time.
By running iterative optimization on individual images, we can auto-complete images and morph
images from one class (such as truck) to another (such as frog).
In addition to generating images, studies found that energy-based models are able to generate
stable robot dynamics trajectories across a large number of timesteps. EBMs can generate a
diverse set of possible futures, while feedforward models collapse to a mean prediction.
MODULE-5
1) What is a Generative Adversarial Network (GAN) and why were GANs developed?
Generative Adversarial Networks (GANs) are a powerful class of neural networks that
are used for unsupervised learning. They were developed and introduced by Ian J.
Goodfellow in 2014. GANs are basically made up of a system of two neural
network models which compete with each other and are able to analyze, capture and
copy the variations within a dataset.
GANs were developed because it was noticed that most mainstream neural networks
can be easily fooled into misclassifying things by adding only a small amount of noise
to the original data. After the noise is added, the model has higher confidence in the wrong
prediction than it had when it predicted correctly. The reason for this weakness is that most
machine learning models learn from a limited amount of data, which is a huge
drawback, as it makes them prone to overfitting. Also, the mapping between the input and the
output is almost linear. It may seem that the boundaries of separation between the
various classes are linear, but in reality they are composed of linearities, and even a
small change in a point in the feature space might lead to misclassification of data.
Generative Adversarial Networks (GANs) can be broken down into three parts:
Generative: To learn a generative model, which describes how data is generated
in terms of a probabilistic model.
Adversarial: The training of a model is done in an adversarial setting.
Networks: Deep neural networks are used as the artificial intelligence (AI) algorithms
for training. In GANs, there is a generator and a discriminator.
The Generator generates fake samples of data (be it an image, audio, etc.) and tries
to fool the Discriminator. The Discriminator tries to distinguish between the real
and fake samples. The Generator and the Discriminator are both neural networks
and they run in competition with each other in the training phase. The steps
are repeated several times, and the Generator and Discriminator get better and better at
their respective jobs after each repetition. The working can be visualized by the
diagram given.
The generative model captures the distribution of data and is trained in such a
manner that it tries to maximize the probability of the Discriminator making a mistake.
The Discriminator is based on a model that estimates the probability that the sample it
received came from the training data and not from the Generator.
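The two objectives can be sketched as loss functions on the Discriminator's outputs. The binary cross-entropy form and the non-saturating Generator loss are common choices assumed here; the notes do not specify the exact losses. d_real and d_fake stand for the Discriminator's estimated probability that a sample came from the training data.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Binary cross-entropy: D should output 1 on real data and 0 on generated data."""
    return -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))

def generator_loss(d_fake, eps=1e-12):
    """Non-saturating loss (assumed variant): G wants D to label its samples as real."""
    return -np.mean(np.log(d_fake + eps))

# A discriminator that separates real from fake well has a low loss ...
confident_d = discriminator_loss(np.array([0.9, 0.95]), np.array([0.1, 0.05]))
# ... while one that outputs 0.5 everywhere has been fooled by the generator.
fooled_d = discriminator_loss(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
print(confident_d, fooled_d)
```

The Generator's loss falls as the Discriminator's output on fake samples rises, which is the sense in which the two networks compete.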
The above method is repeated for a few epochs, and the fake data is then checked manually
to see whether it seems genuine. If it seems acceptable, the training is stopped; otherwise, it is
allowed to continue for a few more epochs.
3) What are the different types of GANs?
Many different types of GAN have been implemented. Some of the important ones that
are actively used are described below:
1. Vanilla GAN: This is the simplest type of GAN. Here, the Generator and the
Discriminator are simple multi-layer perceptrons. In vanilla GAN, the algorithm is
really simple: it tries to optimize the mathematical equation using stochastic
gradient descent.
2. Conditional GAN (CGAN): CGAN can be described as a deep learning method in
which some conditional parameters are put into place. In CGAN, an additional
parameter ‘y’ is added to the Generator for generating the corresponding data.
Labels are also fed into the input of the Discriminator in order to help the Discriminator
distinguish the real data from the fake generated data.
3. Deep Convolutional GAN (DCGAN): DCGAN is one of the most popular and also the
most successful implementations of GAN. It is composed of ConvNets in place of
multi-layer perceptrons. The ConvNets are implemented without max pooling,
which is in fact replaced by convolutional stride. Also, the layers are not fully
connected.
4. Laplacian Pyramid GAN (LAPGAN): The Laplacian pyramid is a linear invertible
image representation consisting of a set of band-pass images, spaced an octave
apart, plus a low-frequency residual. This approach uses multiple
Generator and Discriminator networks at different levels of the Laplacian
Pyramid. This approach is mainly used because it produces very high-quality
images. The image is down-sampled at first at each layer of the pyramid and then
it is again up-scaled at each layer in a backward pass where the image acquires
some noise from the Conditional GAN at these layers until it reaches its original
size.
5. Super Resolution GAN (SRGAN): SRGAN, as the name suggests, is a way of
designing a GAN in which a deep neural network is used along with an adversarial
network in order to produce higher resolution images. This type of GAN is
particularly useful in optimally up-scaling native low-resolution images to enhance
their details while minimizing errors.
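The Laplacian pyramid mentioned under LAPGAN (item 4) is linear and invertible, which the sketch below demonstrates on a random image. It uses 2x2 block averaging and nearest-neighbour up-sampling as simplifying assumptions in place of the Gaussian filtering normally used, and it contains no GAN: in LAPGAN the detail bands would be produced by generators rather than computed from a real image.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging 2x2 blocks (assumes even height and width)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double resolution by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_pyramid(img, levels):
    bands = []
    current = img
    for _ in range(levels):
        low = downsample(current)
        bands.append(current - upsample(low))   # band-pass detail at this scale
        current = low
    return bands, current                        # detail bands plus low-frequency residual

def reconstruct(bands, residual):
    current = residual
    for band in reversed(bands):                 # coarse to fine
        current = upsample(current) + band
    return current

rng = np.random.default_rng(0)
image = rng.random((16, 16))
bands, residual = build_pyramid(image, levels=3)
restored = reconstruct(bands, residual)
print(np.allclose(restored, image))
```

Each band lives at a resolution half that of the previous one, so the levels are spaced an octave apart as described in item 4.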
A generative adversarial network consists of two parts: a generator and a discriminator. The
generator model produces synthetic examples (e.g., images) from random noise
sampled from a distribution, which along with real examples from a training data set
are fed to the discriminator, which attempts to distinguish between the two. Both the
generator and discriminator improve in their respective abilities until the discriminator is
unable to tell the real examples from the synthesized examples with better than the 50%
accuracy expected of chance.
GANs train in an unsupervised fashion, meaning that they infer the patterns within data
sets without reference to known, labelled, or annotated outcomes. Interestingly, the
discriminator’s work informs that of the generator: every time the discriminator correctly
identifies a synthesized work, it tells the generator how to tweak its output so that it
might be more realistic in the future.
1. Image and video synthesis- GANs are perhaps best known for their
contributions to image synthesis. StyleGAN, a model Nvidia developed,
has generated high-resolution head shots of fictional people by learning
attributes like facial pose, freckles, and hair. A newly released version .
In June 2019, Microsoft researchers detailed ObjGAN, a novel GAN that could
understand captions, sketch layouts, and refine the details based on the wording. The
co-authors of a related study proposed a system, StoryGAN, that synthesizes
storyboards from paragraphs. GANs have been applied to the problems of super-
resolution and pose estimation (object transformation).
Tang says one of his teams used GANs to train a model to upscale 200-by-200-pixel
satellite imagery to 1,000 by 1,000 pixels, and to produce images that appear as though
they were captured from alternate angles.
2) Video- Predicting future events from only a few video frames, a task
once considered impossible, is nearly within grasp thanks to
state-of-the-art approaches involving GANs and novel data sets.
One of the newest papers on the subject from DeepMind details recent advances in the
budding field of AI clip generation. Using “computationally efficient” components and
techniques and a new custom-tailored data set, researchers say their best-performing
model Dual Video Discriminator GAN (DVD-GAN) can generate coherent 256 x 256-
pixel videos of “notable fidelity” up to 48 frames in length.
3) Artwork- GANs are capable of more than generating images and video
footage. When trained on the right data sets, they’re able to produce de
novo works of art.
Researchers at the Indian Institute of Technology Hyderabad and the Sri Sathya Sai
Institute of Higher Learning devised a GAN, dubbed SkeGAN, that generates stroke-
based vector sketches of cats, fire trucks, mosquitoes, and yoga poses.
Meanwhile, a team at the University of Edinburgh’s Institute for Perception and Institute
for Astronomy designed a model that generates images of fictional galaxies that closely
follow the distributions of real galaxies.
In March during its GPU Technology Conference (GTC) in San Jose, California, Nvidia
took the wraps off of GauGAN, a generative adversarial AI system that lets users create
lifelike landscape images that never existed.
6) Medicine- In the medical field, GANs have been used to produce data
on which other AI models (in some cases, other GANs) might train, and to
invent treatments for rare diseases that to date haven’t received much
attention.
In April, the Imperial College London, University of Augsburg, and Technical University
of Munich sought to synthesize data to fill in gaps in real data with a model dubbed
Snore-GAN. In a similar vein, researchers from Nvidia, the Mayo Clinic, and the MGH
and BWH Centre for Clinical Data Science proposed a model that generates synthetic
magnetic resonance images (MRIs) of brains with cancerous tumours.
“The idea of using adversarial loss for training agent trajectories is not new, but what’s
new is allowing it to work with a lot less data,” Tang said. “The trick to applying these
adversarial learning approaches is figuring out which inputs the discriminator has
access to what information is available to avoid being tricked, discriminators need
access to data alone, allowing us to train with expert demonstrations where all we have
are the state data.”