
Neural Network and Deep Learning

Md Shad Akhtar
Research Scholar
IIT Patna
Neural Network
• Mimics the functionality of a brain.
• A neural network is a graph with neurons (nodes, units, etc.) connected by links.
Neural Network: Neuron
Neural Network: Perceptron
• Network with only a single layer.
• No hidden layers.
Neural Network: Perceptron
Exercise: choose weights and a threshold t so that a single neuron computes each gate.
• AND gate: inputs X1, X2; weights W1 = ?, W2 = ?; threshold t = ?
• OR gate: inputs X1, X2; weights W1 = ?, W2 = ?; threshold t = ?
• NOT gate: input X1; weight W1 = ?; threshold t = ?
Neural Network: Perceptron
Solution (the output a fires when the weighted input sum exceeds t; these settings are checked in the sketch below):
• AND gate: W1 = 1, W2 = 1, t = 1.5
• OR gate: W1 = 1, W2 = 1, t = 0.5
• NOT gate: W1 = -1, t = -0.5
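A minimal sketch (not from the slides) verifying these weight/threshold settings against the gate truth tables; the perceptron here outputs 1 exactly when the weighted input sum exceeds the threshold t:

```python
import numpy as np

def perceptron(inputs, weights, threshold):
    """Fire (output 1) iff the weighted sum of the inputs exceeds the threshold."""
    return int(np.dot(inputs, weights) > threshold)

# AND: W1 = W2 = 1, t = 1.5
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "AND ->", perceptron([x1, x2], [1, 1], 1.5))

# OR: W1 = W2 = 1, t = 0.5
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "OR  ->", perceptron([x1, x2], [1, 1], 0.5))

# NOT: W1 = -1, t = -0.5
for x1 in (0, 1):
    print(x1, "NOT ->", perceptron([x1], [-1], -0.5))
```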
Neural Network: Multi Layer Perceptron
(MLP) or Feed-Forward Network (FNN)
• Network with n + 1 layers: one output layer and n hidden layers.
Training: Backpropagation algorithm
• A gradient descent algorithm.
1. Initialize the network with random weights.
2. For all training cases (called examples):
   a. Present the training inputs to the network and calculate the output.
   b. For all layers (starting with the output layer, back to the input layer):
      i. Compare the network output with the correct output (error function).
      ii. Adapt the weights in the current layer.
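As an illustration of this loop, here is a hedged numpy sketch; the network size (2-4-1), sigmoid activations, squared error, learning rate, and the XOR training set are all assumptions, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Assumed toy task: learn XOR with a 2-4-1 sigmoid network.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

# Step 1: initialize the network with random weights.
W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros((1, 1))

lr = 1.0
for epoch in range(5000):
    # Step 2a: present the training inputs and calculate the output.
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)

    # Step 2b-i: compare the network output with the correct output.
    delta_out = (y - Y) * y * (1 - y)             # error signal at the output
    delta_hid = (delta_out @ W2.T) * h * (1 - h)  # propagated back one layer

    # Step 2b-ii: adapt the weights in each layer.
    W2 -= lr * h.T @ delta_out
    b2 -= lr * delta_out.sum(0, keepdims=True)
    W1 -= lr * X.T @ delta_hid
    b1 -= lr * delta_hid.sum(0, keepdims=True)

print(y.round(2))  # should approach [[0], [1], [1], [0]]
```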
Deep Learning
What is Deep Learning?
• A family of methods that uses deep architectures to learn high-level feature representations.
Example 1: [Figure: an input image recognized as "MAN"]
Example 2: [Figure]
Why are Deep Architectures hard to train?

• Vanishing/exploding gradient problem in backpropagation: the error signal is multiplied through many layers, so it can shrink towards zero or grow without bound before it reaches the early layers.
Layer-wise Pre-training
• First, train one layer at a time, optimizing the data-likelihood objective P(x).
• Then, train the second layer, optimizing the data-likelihood objective P(h) (treating the first layer's hidden activations h as data).
• Finally, fine-tune the labelled objective P(y|x) by backpropagation.
Deep Belief Nets
• Uses Restricted Boltzmann Machines (RBMs).
• Hinton et al. (2006), "A fast learning algorithm for deep belief nets."
Restricted Boltzmann Machine (RBM)
• RBM is a simple energy-based model:

  P(x, h) = exp(−E(x, h)) / Z

  where E(x, h) = −hᵀWx − bᵀx − dᵀh, and Z = Σ_{x,h} exp(−E(x, h)) is the partition function.

Example:
• Let the weights (h1, x1) and (h1, x3) be positive, all others zero, and b = d = 0.
• Which configuration (x, h) has the highest probability p(x, h)?
• Ans: the maximum is at p(x1 = 1, x2 = 0, x3 = 1, h1 = 1, h2 = 0, h3 = 0).
Restricted Boltzmann Machine (RBM)
• P(x, h) = P(h|x) P(x)
• P(h|x): easy to compute (the hidden units are conditionally independent given x).
• P(x): hard to compute (the partition function Z sums over exponentially many configurations).

Contrastive Divergence: approximate the gradient of log P(x) with only a few steps of Gibbs sampling, instead of exact expectations under the model.
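A minimal numpy sketch of CD-1 for a binary RBM, following the W, b, d naming in the energy above; the learning rate, training pattern, and number of steps are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

n_visible, n_hidden, lr = 3, 3, 0.1
W = rng.normal(0, 0.1, (n_hidden, n_visible))   # pairwise weights (h_i, x_j)
b = np.zeros(n_visible)                         # visible biases
d = np.zeros(n_hidden)                          # hidden biases

x = np.array([1.0, 0.0, 1.0])                   # the pattern from the example
for step in range(1000):
    # Positive phase: P(h|x) factorizes, so it is easy to compute and sample.
    ph = sigmoid(W @ x + d)
    h = (rng.random(n_hidden) < ph).astype(float)
    # Negative phase: one Gibbs step x -> h -> x' -> h', instead of sampling
    # from the model's (intractable) equilibrium distribution.
    px = sigmoid(W.T @ h + b)
    x_neg = (rng.random(n_visible) < px).astype(float)
    ph_neg = sigmoid(W @ x_neg + d)
    # CD-1 update: data statistics minus reconstruction statistics.
    W += lr * (np.outer(ph, x) - np.outer(ph_neg, x_neg))
    b += lr * (x - x_neg)
    d += lr * (ph - ph_neg)
```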
Deep Belief Nets (DBN) = Stacked RBM
Auto-Encoders: a simpler alternative to RBMs
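For concreteness, a hedged sketch of a one-hidden-layer auto-encoder with tied weights (a common but assumed design choice): it encodes x into features h and is trained to reconstruct x from h, so the layer can be pre-trained without labels.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

X = rng.integers(0, 2, (32, 8)).astype(float)   # toy binary data (assumed)
W = rng.normal(0, 0.1, (8, 4))                  # encoder weights; decoder = W.T
b_h, b_x, lr = np.zeros(4), np.zeros(8), 0.5

for epoch in range(2000):
    h = sigmoid(X @ W + b_h)          # encode: the learned feature layer
    x_hat = sigmoid(h @ W.T + b_x)    # decode: reconstruction of the input
    # Backpropagate the squared reconstruction error through both halves.
    d_out = (x_hat - X) * x_hat * (1 - x_hat)
    d_hid = (d_out @ W) * h * (1 - h)
    W -= lr * (X.T @ d_hid + d_out.T @ h) / len(X)   # tied-weight gradient
    b_x -= lr * d_out.mean(0)
    b_h -= lr * d_hid.mean(0)

print(((x_hat - X) ** 2).mean())  # reconstruction error shrinks with training
```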
Deep Learning: Architectures
• Recurrent Neural Network (RNN)
• Convolutional Neural Network (CNN)
Recurrent Neural Network (RNN)
• Enables networks to do temporal processing and learn sequences.

Character-level language model (vocabulary: [h, e, l, o])
[Figure: the network reads one character at a time and predicts the next one]
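A minimal sketch of one forward step of such a character-level model, using the U, W, V weight names from the BPTT slide below; the hidden size and initialization are assumptions:

```python
import numpy as np

vocab = ['h', 'e', 'l', 'o']
V_size, H = len(vocab), 3
rng = np.random.default_rng(0)
U = rng.normal(0, 0.1, (H, V_size))   # input  -> hidden
W = rng.normal(0, 0.1, (H, H))        # hidden -> hidden (the recurrence)
V = rng.normal(0, 0.1, (V_size, H))   # hidden -> output

def step(ch, h_prev):
    """Consume one character; return next-char probabilities and new state."""
    x = np.zeros(V_size)
    x[vocab.index(ch)] = 1.0                  # one-hot input
    h = np.tanh(U @ x + W @ h_prev)           # new hidden state
    logits = V @ h
    p = np.exp(logits) / np.exp(logits).sum() # softmax over the vocabulary
    return p, h

h = np.zeros(H)
for ch in "hell":            # feed "hell"; the training targets would be "ello"
    p, h = step(ch, h)
print(dict(zip(vocab, p.round(2))))  # untrained, so roughly uniform
```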

Training of RNN: BPTT
[Figure: the network unrolled through time, with weight matrices U (input to hidden), W (hidden to hidden), and V (hidden to output); the legend distinguishes predicted from actual outputs]

RNN input-output configurations:
• One to many: sequence output (e.g. image captioning takes an image and outputs a sentence of words).
• Many to one: sequence input (e.g. sentiment analysis, where a given sentence is classified as expressing positive or negative sentiment).
• Many to many: sequence input and sequence output (e.g. machine translation: an RNN reads a sentence in English and then outputs a sentence in French).
• Many to many (synced): sequence input and output in lockstep (e.g. language modelling, where we wish to predict the next word at every step).
RNN Extensions
• Bidirectional RNN
• Deep (Bidirectional) RNNs
RNN (Cont..)
• "the clouds are in the sky"
[Figure: after reading "the clouds are in the", the network predicts "sky" from nearby context]

RNN (Cont..)
• "India is my home country. I can speak fluent Hindi."
[Figure: predicting "Hindi" requires remembering "India" from many steps earlier]

It is very hard for an RNN to learn such "long-term dependencies".


LSTM
• Capable of learning long-term dependencies.

[Figures: the repeating module in a simple RNN vs. in an LSTM]
LSTM
• The LSTM removes or adds information to the cell state, carefully regulated by structures called gates.

• Cell state: the "conveyor belt" of the cell.
LSTM
• Gates
– Forget Gate
– Input Gate
– Output Gate
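A minimal numpy sketch of a single LSTM step showing the three gates regulating the cell state; the weight shapes, one-matrix-per-gate layout, and sizes are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

n_in, n_hid = 4, 3
# One weight matrix per gate, plus one for the candidate cell update.
Wf, Wi, Wo, Wc = (rng.normal(0, 0.1, (n_hid, n_in + n_hid)) for _ in range(4))
bf, bi, bo, bc = (np.zeros(n_hid) for _ in range(4))

def lstm_step(x, h_prev, c_prev):
    z = np.concatenate([h_prev, x])  # gates see the previous state and input
    f = sigmoid(Wf @ z + bf)         # forget gate: what to erase from the cell
    i = sigmoid(Wi @ z + bi)         # input gate: how much new info to write
    o = sigmoid(Wo @ z + bo)         # output gate: what to expose as h
    c_tilde = np.tanh(Wc @ z + bc)   # candidate values to write
    c = f * c_prev + i * c_tilde     # the "conveyor belt": cell state update
    h = o * np.tanh(c)
    return h, c

h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_step(rng.normal(size=n_in), h, c)
print(h.round(3), c.round(3))
```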
LSTM: Variants
Convolutional Neural Network (CNN)

• A special kind of multi-layer neural network.
• Implicitly extracts relevant features.
• A fully-connected network architecture does not take the spatial structure of the input into account; in contrast, a CNN tries to take advantage of this spatial structure.
Convolutional Neural Network (CNN)

1. Convolutional layer
2. Pooling layer
3. Fully connected layer
Convolutional Neural Network (CNN)

1. Convolutional layer

Convolution filter (3 x 3):
1 0 1
0 1 0
1 0 1

Image (5 x 5):
1 1 1 0 0
0 1 1 1 0
0 0 1 1 1
0 0 1 1 0
0 1 1 0 0

The filter slides across the image, producing one output value per position.
• Local receptive field: each output unit is connected only to a small patch of the image.
• Shared weights: the same filter weights are reused at every position.
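The sliding-window computation, spelled out with the exact 3x3 filter and 5x5 image above (implemented as cross-correlation, as is conventional in CNNs):

```python
import numpy as np

kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])
image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])

out = np.zeros((3, 3), dtype=int)    # 5 - 3 + 1 = 3 valid positions per axis
for i in range(3):
    for j in range(3):
        # Each output cell sees only a 3x3 local receptive field, and every
        # position reuses the same (shared) filter weights.
        out[i, j] = (image[i:i+3, j:j+3] * kernel).sum()

print(out)
# [[4 3 4]
#  [2 4 3]
#  [2 3 4]]
```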
Convolutional Neural Network (CNN)

2. Pooling layer
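A small sketch of max pooling, which downsamples each feature map by keeping the strongest activation in each window; the 2x2 window, stride 2, and the toy feature map are assumptions:

```python
import numpy as np

feature_map = np.array([[1, 3, 2, 1],
                        [4, 6, 5, 0],
                        [3, 1, 1, 2],
                        [0, 2, 4, 3]])

# Split the 4x4 map into 2x2 blocks, then take the max within each block.
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)   # [[6 5]
                #  [3 4]]
```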
Convolutional Neural Network (CNN)

3. Fully connected layer
[Figure: the pooled features feed into an ordinary fully-connected network]
Convolutional Neural Network (CNN)

Putting it all together:
Input matrix -> 3 convolution filters -> convolution features -> pooling -> pooled features -> flatten -> fully-connected layers -> labels
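To make the stage order concrete, a hedged shape walk-through of the pipeline; the 28x28 input, random filter values, 2x2 pooling, and 10 output labels are all assumptions (the slide only fixes the order of the stages):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((28, 28))                       # input matrix

filters = rng.normal(0, 0.1, (3, 3, 3))        # 3 convolution filters (3x3)
conv = np.stack([
    np.array([[(x[i:i+3, j:j+3] * f).sum()     # valid cross-correlation
               for j in range(26)] for i in range(26)])
    for f in filters])                         # -> (3, 26, 26) feature maps

pooled = conv.reshape(3, 13, 2, 13, 2).max(axis=(2, 4))   # -> (3, 13, 13)
flat = pooled.reshape(-1)                                  # -> (507,)

W_fc = rng.normal(0, 0.1, (10, flat.size))     # fully-connected layer
logits = W_fc @ flat
probs = np.exp(logits) / np.exp(logits).sum()  # probabilities over 10 labels
print(conv.shape, pooled.shape, flat.shape, probs.shape)
```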


Example 1: CNN for Image
Example 2: CNN for Text
