AI Slide 2

The document provides an overview of deep learning concepts, including supervised and unsupervised learning, neural networks, convolutional networks, and reinforcement learning. It highlights the use of TensorFlow for building and training models, particularly focusing on applications of Generative Adversarial Networks (GANs) and their capabilities in generating realistic images and data augmentation. Additionally, it discusses various techniques such as backpropagation, data preprocessing, and the architecture of convolutional neural networks.

Uploaded by

curvelearning52

Introduction

• Basics of TensorFlow
• Machine Learning: analytic solution vs. gradient descent
Supervised Learning (image recognition)
• Neural networks reminder
• Convolution Networks
• Going deeper with ConvNets
Unsupervised Learning
• Autoencoder
• Generative Adversarial Network
Sequence modelling
• RNN, LSTM
• Word2vec
Reinforcement Learning
• Deep Q-learning
• Frozen Lake
DIGITAL VISUALIZATION OF IMAGE:
MNIST DATASET

• Handwritten digits
• 60,000 training images and 10,000 test images
• 28x28 grayscale images
• Each image is a 28x28 matrix with values between 0 and 255
• Data preprocessing = rescaling to [0,1]
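
The rescaling step can be sketched in a few lines. This is a minimal illustration in plain Python, assuming an image stored as a list of rows of 8-bit values; the `rescale` helper and the synthetic gradient image are made up for the example and are not part of the MNIST tooling.

```python
def rescale(image):
    """Map 8-bit pixel values (0..255) to floats in [0, 1]."""
    return [[pixel / 255.0 for pixel in row] for row in image]

# Hypothetical 28x28 image: a synthetic gradient, not a real MNIST digit.
image = [[(r * 28 + c) % 256 for c in range(28)] for r in range(28)]
scaled = rescale(image)
```

Dividing by 255 keeps the relative intensities unchanged while putting every input on the same small scale, which usually makes gradient-based training better behaved.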
DIGITAL VISUALIZATION OF IMAGE:

Inputs and Outputs

[Figure: a 256 x 256 matrix is fed to the DL model, which outputs a 4-element vector.]

[Figure: mapping from the elements 1 to 6 of a set X to lettered elements of a set Y.]

With deep learning, we are searching for a surjective (or onto) function f from a set X to a set Y.
NEURAL NETWORKS
Supervised Deep Learning with Neural
Networks
Input Hidden Layers Output
From one layer to the next

[Figure: inputs X1, X2, X3 with weights W1, W2, W3 feeding the output Y3.]

f is the activation function, Wi is the weight, and bi is the bias.
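
The formula on this slide did not survive extraction. A common form, assumed here, is Y = f(W1·X1 + W2·X2 + W3·X3 + b). The sketch below computes one such unit in plain Python; the numeric values and the choice of sigmoid and ReLU as f are illustrative assumptions.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def relu(a):
    return max(0.0, a)

def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    pre_activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(pre_activation)

x = [1.0, 2.0, 3.0]      # X1, X2, X3 (made-up values)
w = [0.5, -0.25, 0.1]    # W1, W2, W3 (made-up values)
b = 0.2                  # bias (made-up value)

y_relu = neuron(x, w, b, relu)
y_sigmoid = neuron(x, w, b, sigmoid)
```

A full layer repeats this computation once per output unit, each with its own weights and bias.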
Activation Functions

Image Credit: towardsdatascience.com


BACKPROPAGATION

• Forward Activation: Predict the output
• Compute the loss
• Backward Error: Correct the parameters

[Figure: X → f_θ → ŷ, compared with y. The forward pass produces ŷ, the error is measured between ŷ and y, and the error is backpropagated over the network using the derivative function.]
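
The three steps can be traced on the smallest possible network. The sketch below assumes a single sigmoid unit and a squared-error loss, neither of which is specified on the slide, and checks the chain-rule gradient against a finite-difference estimate.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, w, b):
    """Forward activation: predict y_hat from x."""
    return sigmoid(w * x + b)

def loss(y_hat, y):
    return 0.5 * (y_hat - y) ** 2

def backward(x, w, b, y):
    """Chain rule: dL/dw = dL/dy_hat * dy_hat/da * da/dw, with a = w*x + b."""
    y_hat = forward(x, w, b)
    dL_dyhat = y_hat - y
    dyhat_da = y_hat * (1.0 - y_hat)
    return dL_dyhat * dyhat_da * x, dL_dyhat * dyhat_da

x, y = 1.5, 1.0          # one made-up training example
w, b = 0.3, -0.1         # made-up initial parameters
grad_w, grad_b = backward(x, w, b, y)

# Finite-difference check of the analytic gradient.
eps = 1e-6
numeric_grad_w = (loss(forward(x, w + eps, b), y)
                  - loss(forward(x, w - eps, b), y)) / (2 * eps)

# Correct the parameters with one small step against the gradient.
learning_rate = 0.5
w_new = w - learning_rate * grad_w
b_new = b - learning_rate * grad_b
```

In a deeper network the same chain rule is applied layer by layer, reusing the error signal computed for the layer above.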
Training - Minimizing the Loss
The loss function L is defined with regard to the weights and biases of the network.

The weight update is computed by moving a step in the direction opposite to the cost gradient.

Iterate until L stops decreasing.

[Figure: inputs X1, X2, X3 with parameters (W1, b1), (W2, b2), (W3, b3) producing the output Y2 and the loss L.]
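
The update rule is usually written w ← w − η ∂L/∂w, where η is the learning rate; the slide's own formulas were lost in extraction, so this form is an assumption. A minimal sketch of the loop, using a one-weight linear model and a mean squared error loss chosen only for illustration:

```python
def mse(w, b, xs, ys):
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradients(w, b, xs, ys):
    n = len(xs)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return grad_w, grad_b

def train(xs, ys, learning_rate=0.05, tolerance=1e-12, max_steps=10_000):
    w, b = 0.0, 0.0
    previous = mse(w, b, xs, ys)
    for _ in range(max_steps):
        grad_w, grad_b = gradients(w, b, xs, ys)
        w -= learning_rate * grad_w   # step against the gradient
        b -= learning_rate * grad_b
        current = mse(w, b, xs, ys)
        if previous - current < tolerance:   # L has stopped decreasing
            break
        previous = current
    return w, b, current

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]   # generated from y = 2x + 1
w, b, final_loss = train(xs, ys)
```

The stopping test mirrors the slide: training ends once a step no longer reduces L by more than a small tolerance.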


CONVOLUTIONAL NETWORKS
Convolution in 2D
Convolution Kernel
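
The figures for these slides are missing, so here is a small sketch of what a 2D convolution with a kernel computes. It assumes the ConvNet convention (no kernel flip), no padding and a stride of 1; the image and the vertical-edge kernel are made-up examples.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output

# Left half dark, right half bright.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Responds where intensity increases from left to right.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d(image, vertical_edge)
```

The resulting feature map is non-zero only in the middle column, which is where the dark-to-bright edge sits.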
Max Pooling
Pooling - Max-Pooling and Sum-Pooling
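
Both pooling variants reduce each window of a feature map to a single number. A minimal sketch, assuming non-overlapping square windows; the `pool2d` helper and the sample feature map are illustrative.

```python
def pool2d(feature_map, size, reduce):
    """Apply `reduce` (e.g. max or sum) over non-overlapping size x size windows."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(reduce(window))
        pooled.append(row)
    return pooled

feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
max_pooled = pool2d(feature_map, 2, max)   # keeps the strongest response per window
sum_pooled = pool2d(feature_map, 2, sum)   # keeps the total response per window
```

With 2x2 windows, a 4x4 feature map shrinks to 2x2 in both cases; only the summary statistic differs.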
Convolutional Neural Networks
A convolutional neural network (CNN, or ConvNet) is a class of deep, feed-forward
artificial neural networks that explicitly assumes that the inputs are images, which allows
us to encode certain properties into the architecture.

(Image Credit: https://becominghuman.ai)


Deep Learning for Facial Recognition

(Image Credit: www.edureka.co)


MNIST - CNN Visualization
CONVNET

« Convolutional neural networks »

• Created by Yann LeCun (90’s)

• Well known since the 2000s

• Big acceleration with GPUs

• Computer vision

• NLP

• Artificial Intelligence

• Convolution & Pooling

ConvNets are usually evaluated on ImageNet (the ILSVRC subset has about 1.2 million training images and 1000 classes)


CONVNETS

AlexNet: 80.1%

GoogLeNet (Inception): 93.4%
CONVNETS
FEATURE MAPS

Layer 1: ~ Gabor filters


FILTERS
FINE-TUNING

FROZEN FINE-TUNED

• Filters after the first convolutional layer are generic (Gabor filters)
• The deeper you go in the network, the more task-specific the filters become
Transfer Learning
Large compute overhead for power-limited edge computing applications!
CNN Implementation - Data Augmentation (DA)

DA helps to populate artificial training instances from the existing training data sets.
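
As a rough illustration, new instances can be derived from one image with simple geometric transforms. The sketch below assumes a horizontal flip and a one-pixel shift, two common choices that the slide does not name; the helper names are made up.

```python
def horizontal_flip(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]

def shift_right(image, pixels, fill=0):
    """Shift every row right by `pixels`, padding with `fill` on the left."""
    width = len(image[0])
    return [[fill] * pixels + row[: width - pixels] for row in image]

def augment(image):
    """Return the original plus two transformed copies that keep the same label."""
    return [image, horizontal_flip(image), shift_right(image, 1)]

image = [
    [1, 2, 3],
    [4, 5, 6],
]
augmented = augment(image)
```

Whether a transform preserves the label depends on the task: a flipped cat is still a cat, but a flipped handwritten digit may no longer be a valid digit.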
Object Detection

Clean Background, Static Objects


Object Detection

Cluttered Background, Movable Objects


Object Detection

Cluttered Background, Static Objects


YOLO
Object detection using Regression

Methods such as Faster R-CNN first detect possible regions of interest using a Region Proposal Network and then perform recognition on those regions separately. YOLO instead performs all of its predictions with the help of a single fully connected layer.
WHAT IS TENSORFLOW?
• A Python library
• pip install tensorflow
• Google
• open-source
• library for numerical computation using data flow graphs
• CPU and GPU
• Research & Industry
PRINCIPLE
« HELLO WORLD »

• INTRODUCTION EXERCISES

• Difference between constant, variable and placeholder

• Constant = a fixed variable

• With a placeholder, you need to feed data to your graph during your session

• TensorFlow workflow:
• Draw your graph
• Feed data
• … and optimize
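
The workflow above can be mimicked with a toy graph in plain Python. This is not TensorFlow's API: the `Constant`, `Variable`, `Placeholder`, `Add` and `Multiply` classes are hypothetical stand-ins that only illustrate the idea of building a graph first and feeding data later. In TensorFlow 1.x the corresponding pieces are `tf.constant`, `tf.Variable`, `tf.placeholder` and `Session.run(..., feed_dict=...)`.

```python
class Constant:
    """A fixed value baked into the graph."""
    def __init__(self, value):
        self.value = value
    def run(self, feed):
        return self.value

class Variable:
    """A value stored in the graph that an optimizer may update."""
    def __init__(self, initial):
        self.value = initial
    def run(self, feed):
        return self.value

class Placeholder:
    """An empty slot whose value must be fed at run time."""
    def __init__(self, name):
        self.name = name
    def run(self, feed):
        return feed[self.name]

class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right
    def run(self, feed):
        return self.left.run(feed) + self.right.run(feed)

class Multiply:
    def __init__(self, left, right):
        self.left, self.right = left, right
    def run(self, feed):
        return self.left.run(feed) * self.right.run(feed)

# 1. Draw the graph y = w * x + b. Nothing is computed yet.
x = Placeholder("x")
w = Variable(2.0)
b = Constant(1.0)
y = Add(Multiply(w, x), b)

# 2. Feed data: the placeholder gets its value only now.
result = y.run({"x": 3.0})

# 3. "Optimize": an optimizer would change the variable, never the constant.
w.value = 3.0
updated = y.run({"x": 3.0})
```

Running the same graph twice with the same fed value gives different results once the variable has changed, which is the distinction the slide draws between the three node types.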
AUTOENCODER
NEURAL NETWORK LEARNING

X → f → Y

Supervised learning
‣ y is given!

X → f → Z → g → X

Self-supervised learning
‣ y is no longer needed
AUTOENCODER

X → f (encoder) → Z (latent) → g (decoder) → X

• Learning a compact data representation
• Encode input to smaller latent space
• Decode from the latent space to the input
• Predict input from input
• Loss function = mean squared error
• f and g are neural networks
• SGD as usual
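
The bullets above can be sketched with the smallest autoencoder that still has a bottleneck: 2-D inputs, a 1-D latent, linear f and g, a mean squared error loss and plain SGD. The toy data, initial weights and learning rate are assumptions made for the example, not values from the slides.

```python
import random

random.seed(0)
# Toy data: 2-D points lying on a line, so one latent number is enough.
data = [(t, 2.0 * t) for t in (random.uniform(-1.0, 1.0) for _ in range(50))]

encoder = [0.1, 0.1]   # f: z = encoder . x
decoder = [0.1, 0.1]   # g: x_hat = decoder * z

def reconstruct(x):
    z = encoder[0] * x[0] + encoder[1] * x[1]
    return z, [decoder[0] * z, decoder[1] * z]

def mean_squared_error():
    total = 0.0
    for x in data:
        _, x_hat = reconstruct(x)
        total += (x_hat[0] - x[0]) ** 2 + (x_hat[1] - x[1]) ** 2
    return total / len(data)

initial_loss = mean_squared_error()
learning_rate = 0.02
for epoch in range(300):
    random.shuffle(data)
    for x in data:
        z, x_hat = reconstruct(x)
        error = [x_hat[0] - x[0], x_hat[1] - x[1]]
        # Gradient of the squared error with respect to the latent z.
        dz = 2.0 * (error[0] * decoder[0] + error[1] * decoder[1])
        for i in range(2):
            grad_decoder = 2.0 * error[i] * z
            grad_encoder = dz * x[i]
            decoder[i] -= learning_rate * grad_decoder
            encoder[i] -= learning_rate * grad_encoder
final_loss = mean_squared_error()
```

The target of the loss is the input itself, so no labels are involved; the network is forced to squeeze each point through the single latent value and recover it.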
AUTOENCODER

X → f (encoder) → Z (latent) → g (decoder) → X

Operation in encoder CNN
[Figure: three convolution panels: no padding, no stride; no padding, stride; padding, stride.]

Operation in decoder CNN
[Figure: padding and stride.]

[Figure: reconstructions at Epoch 1 (top) vs Epoch 10 (bottom).]
GENERATIVE MODELS
GENERATIVE MOMENT MATCHING NETWORKS

[Figure: autoencoder X → f (encoder) → Z (latent) → g (decoder) → X', trained with an RMSE loss between X and X'.]

[Figure: a generated latent Z is predicted from noise drawn from N(0,1).]

[Figure: MMD is computed between the latent Z produced by the encoder f from X and the generated latent Z predicted from N(0,1).]

[Figure: the decoder g maps the generated latent Z to a generated X.]

[Figure: full pipeline combining the encoder f, the latent Z, the generated latent Z predicted from N(0,1), and the decoder g producing the generated X.]
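
The slides name MMD (maximum mean discrepancy) without giving its formula. A common biased estimate of the squared MMD between samples x and y is mean k(x, x') + mean k(y, y') − 2 mean k(x, y) for a kernel k. The sketch below assumes a Gaussian kernel with bandwidth 1 and 1-D samples, both chosen for illustration only.

```python
import math
import random

def gaussian_kernel(a, b, bandwidth=1.0):
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two 1-D samples."""
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

random.seed(0)
prior = [random.gauss(0.0, 1.0) for _ in range(200)]      # target N(0,1) sample
matching = [random.gauss(0.0, 1.0) for _ in range(200)]   # latents that match it
shifted = [random.gauss(3.0, 1.0) for _ in range(200)]    # latents that do not

mmd_matching = mmd_squared(matching, prior)
mmd_shifted = mmd_squared(shifted, prior)
```

A small value means the two samples look alike under the kernel, so minimizing it pushes one distribution toward the other.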
Discriminator

▪ The Discriminator is a Convolutional Neural Network consisting of many hidden layers and one output layer. Its output takes one of two values, 1 or 0: an output of 1 means the provided data is classified as real, and an output of 0 means it is classified as fake.

▪ The Discriminator is trained on real data, so it learns to recognize what actual data looks like and which features the data should have to be classified as real.
Generator

▪ The Generator is an Inverse Convolutional Neural Network: it does exactly the opposite of what a CNN does. A CNN is given an actual image as input and is expected to output a class label, whereas the Generator is given a random input (a vector of values).

▪ An actual image is expected as output. In simple terms, the Generator produces data from a piece of input data using what it has learned.
GENERATIVE ADVERSARIAL NETWORKS
Intuition

[Figure: the Generator produces fake money. The Discriminator receives both the fake money and real money and must decide for each: fake or real?]

GENERATIVE ADVERSARIAL NETWORKS

• The Discriminator is trained on actual data to classify whether given data is real or not. The Generator starts generating data from a random input, and the Discriminator analyzes that data and checks how close it is to being classified as real.

• If the generated data does not contain enough features to be classified as real by the Discriminator, the Generator weights are readjusted using backpropagation to create new data that is better than the previous one.

• This process keeps repeating as long as the Discriminator keeps classifying the generated data as fake.

• Eventually, the Generator becomes so accurate that it is tough to distinguish between the real data and the data it generates.
GENERATIVE ADVERSARIAL NETWORKS

[Figure: noise Z → Generator G → fake image → Discriminator D → Y (real or not?). A real image is also fed to D → Y.]

• G and D are neural networks
• Find a G that minimizes the accuracy of the best D
• Alternate optimization of G and D

http://blog.aylien.com/introduction-generative-adversarial-networks-code-tensorflow/
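
The alternating optimization can be sketched on a deliberately tiny problem. This is a toy under several assumptions that are not in the slides: 1-D "images" drawn from N(4, 0.5), a linear Generator G(z) = a·z + b, a logistic Discriminator D(x) = sigmoid(w·x + c), and the non-saturating generator loss −log D(G(z)). It only shows the structure of the loop and makes no claim about convergence.

```python
import math
import random

def sigmoid(s):
    # Numerically stable form for large |s|.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

random.seed(0)
a, b = 1.0, 0.0    # Generator parameters: G(z) = a*z + b
w, c = 0.1, 0.0    # Discriminator parameters: D(x) = sigmoid(w*x + c)
learning_rate = 0.01

for step in range(5000):
    x_real = random.gauss(4.0, 0.5)   # hypothetical real data
    z = random.gauss(0.0, 1.0)        # noise
    x_fake = a * z + b

    # Discriminator step: push D(x_real) toward 1 and D(x_fake) toward 0.
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    grad_w = -(1.0 - d_real) * x_real + d_fake * x_fake
    grad_c = -(1.0 - d_real) + d_fake
    w -= learning_rate * grad_w
    c -= learning_rate * grad_c

    # Generator step: push D(G(z)) toward 1, backpropagating through D.
    d_fake = sigmoid(w * x_fake + c)
    grad_x = -(1.0 - d_fake) * w
    a -= learning_rate * grad_x * z
    b -= learning_rate * grad_x

samples = [a * random.gauss(0.0, 1.0) + b for _ in range(1000)]
fake_mean = sum(samples) / len(samples)
```

Note that the Generator never sees real data directly: its only training signal is the gradient that flows back through the Discriminator's judgment.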
GAN: APPLICATIONS
Generate Examples for Image Datasets
GANs can be used to generate new examples for image datasets in various domains, such as medical
imaging, satellite imagery, and natural language processing. By generating synthetic data,
researchers can augment existing datasets and improve the performance of machine learning
models.
Generate Photographs of Human Faces
GANs can generate realistic photographs of human faces, including images of people who do not
exist in the real world. You can use these rendered images for various purposes, such as creating
avatars for online games or social media profiles.
Generate Realistic Photographs
GANs can generate realistic photographs of various objects and scenes, including landscapes,
animals, and architecture. These rendered images can be used to augment existing image datasets
or to create entirely new datasets.
Generate Cartoon Characters
GANs can be used to generate cartoon characters that are similar to those found in popular movies
or television shows. These generated characters can be used to create new content or to customize
existing characters in games and other applications.
Image-to-Image Translation
GANs can translate images from one domain to another, such as converting a photograph of a
real-world scene into a line drawing or a painting. This can be used to create new content or to
transform existing images in various ways.
Text-to-Image Translation
GANs can be used to generate images based on a given text description. You can use it to create
visual representations of concepts or generate images for machine learning tasks.
GAN: APPLICATIONS
Semantic-Image-to-Photo Translation
GANs can translate images from a semantic representation (such as a label map or a
segmentation map) into a realistic photograph. You can use it to generate synthetic data for
training machine learning models or to visualize concepts more practically.
Face Frontal View Generation
GANs can generate frontal views of faces from images that show the face at an angle. You
can use it to improve face recognition algorithms' performance or synthesize pictures for
use in other applications.
Generate New Human Poses
GANs can generate images of people in new poses, including poses that are difficult or impossible
for humans to achieve. This can be used to create new content or to augment existing image
datasets.
Photos to Emojis
GANs can be used to convert photographs of people into emojis, creating a more
personalized and expressive form of communication.
Photograph Editing
GANs can be used to edit photographs in various ways, such as changing the background,
adding or removing objects, or altering the appearance of people or animals in the image.
Face Aging
GANs can be used to generate images of people at different ages, allowing users to visualize
how they might look in the future or to see what they might have looked like in the past.
GAN: APPLICATIONS

Photo Blending
GANs can blend two or more photographs, creating a new image that combines elements from
the original images.
Super Resolution
GANs can enhance the resolution of images, allowing users to produce higher-quality versions of
low-resolution images.
Photo Inpainting
GANs can fill in missing or damaged parts of photographs, creating a more complete and visually
appealing image.
Clothing Translation
Clothing translation is the task of converting an image of clothing from one style or design to
another. GANs have been used to develop systems that can translate images of clothing from one
type to another, such as changing the color or pattern of a shirt or dress.
Video Prediction
Video prediction is generating future frames of a video based on a given sequence of past frames.
GANs have been used to develop systems that can generate realistic, high-quality video frames
that accurately predict the future evolution of the scene.
3D Object Generation
3D object generation creates 3D models of objects or scenes from 2D images or other data. GANs
have been used to develop systems that can generate realistic, high-quality 3D models of objects
and settings, such as buildings, cars, and people. You can use these systems for various
applications, such as virtual reality, video games, and computer-aided design.
GAN: EXAMPLES

Ongoing topic…
Sequence Generation through SeqGAN
Sequence Generation GAN
GAN Startup Landscape
[Figure: startups plotted by geographic reach (low to high) against application diversity (low to high).]
