Ch1 - Fundamentals of Neural Networks

The document provides an overview of deep learning, including fundamental concepts like perceptrons, neural network architectures, and types of learning. It explains the workings of various neural network types, such as feedforward and feedback networks, and discusses the importance of weights and biases in the learning process. Additionally, it highlights applications of deep learning and categorizes deep architectures into generative, discriminative, and hybrid models.

Deep Learning

By Dr. Shraddha Atul Mithbavkar


Introduction
Example of McCulloch-Pitts Neuron
• AND gate
Truth table:
  X1  X2  Output
   0   0    0
   0   1    0
   1   0    0
   1   1    1
The neuron receives inputs x1 and x2 through weights w1 and w2 and produces output y. A weight W = 1 is an excitatory value, W = -1 is an inhibitory value, and each input x can be 1 or 0.
For w1 = 1, w2 = 1:
For input (0,0), yin = x1w1 + x2w2 = (0×1) + (0×1) = 0
For input (0,1), yin = x1w1 + x2w2 = (0×1) + (1×1) = 1
For input (1,0), yin = x1w1 + x2w2 = (1×1) + (0×1) = 1
For input (1,1), yin = x1w1 + x2w2 = (1×1) + (1×1) = 2
Hence the threshold Ѳ = 2:
y = 1 if ∑xw ≥ 2
y = 0 if ∑xw < 2
Example of McCulloch-Pitts Neuron
• OR gate
Truth table:
  X1  X2  Output
   0   0    0
   0   1    1
   1   0    1
   1   1    1
For w1 = 1, w2 = 1:
For input (0,0), yin = x1w1 + x2w2 = (0×1) + (0×1) = 0
For input (0,1), yin = x1w1 + x2w2 = (0×1) + (1×1) = 1
For input (1,0), yin = x1w1 + x2w2 = (1×1) + (0×1) = 1
For input (1,1), yin = x1w1 + x2w2 = (1×1) + (1×1) = 2
Hence the threshold Ѳ = 1:
y = 1 if ∑xw ≥ 1
y = 0 if ∑xw < 1
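The same threshold test is easy to express in code. A minimal sketch (not from the slides) that evaluates a McCulloch-Pitts unit with the weights and thresholds derived above, w1 = w2 = 1 with Ѳ = 2 for AND and Ѳ = 1 for OR:

```python
def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts unit: fire (1) if the weighted sum reaches the threshold."""
    yin = sum(x * w for x, w in zip(inputs, weights))
    return 1 if yin >= threshold else 0

# AND gate: w1 = w2 = 1, threshold 2; OR gate: w1 = w2 = 1, threshold 1
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    and_out = mp_neuron((x1, x2), (1, 1), threshold=2)
    or_out = mp_neuron((x1, x2), (1, 1), threshold=1)
    print(f"({x1},{x2}) -> AND: {and_out}, OR: {or_out}")
```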
Example of McCulloch-Pitts Neuron
• XOR gate: XOR is realized with two intermediate units, Z1 = X1 AND NOT X2 and Z2 = NOT X1 AND X2, combined as Y = Z1 OR Z2.
Truth table for Z1:
  X1  X2  Output
   0   0    0
   0   1    0
   1   0    1
   1   1    0
Truth table for Z2:
  X1  X2  Output
   0   0    0
   0   1    1
   1   0    0
   1   1    0
Truth table for Y (yin = Z1 + Z2):
  X1  X2  Output
   0   0    0
   0   1    1
   1   0    1
   1   1    0
Example of McCulloch-Pitts Neuron
• XOR gate, unit Z1 (output 1 only for input (1,0)):
  X1  X2  Output
   0   0    0
   0   1    0
   1   0    1
   1   1    0
Let w1 = 1, w2 = 1:
For input (0,0), yin = x1w1 + x2w2 = (0×1) + (0×1) = 0
For input (0,1), yin = x1w1 + x2w2 = (0×1) + (1×1) = 1
For input (1,0), yin = x1w1 + x2w2 = (1×1) + (0×1) = 1
For input (1,1), yin = x1w1 + x2w2 = (1×1) + (1×1) = 2
No threshold (0, 1, or 2) reproduces the desired output, so these weights do not work.
Let w1 = 1, w2 = -1:
For input (0,0), yin = x1w1 + x2w2 = (0×1) + (0×-1) = 0
For input (0,1), yin = x1w1 + x2w2 = (0×1) + (1×-1) = -1
For input (1,0), yin = x1w1 + x2w2 = (1×1) + (0×-1) = 1
For input (1,1), yin = x1w1 + x2w2 = (1×1) + (1×-1) = 0
With threshold Ѳ = 1 the desired output is satisfied.
Example of McCulloch-Pitts Neuron
• XOR gate, unit Z2 (output 1 only for input (0,1)):
  X1  X2  Output
   0   0    0
   0   1    1
   1   0    0
   1   1    0
Let w1 = 1, w2 = 1:
For input (0,0), yin = x1w1 + x2w2 = (0×1) + (0×1) = 0
For input (0,1), yin = x1w1 + x2w2 = (0×1) + (1×1) = 1
For input (1,0), yin = x1w1 + x2w2 = (1×1) + (0×1) = 1
For input (1,1), yin = x1w1 + x2w2 = (1×1) + (1×1) = 2
No threshold (0, 1, or 2) reproduces the desired output, so these weights do not work.
Let w1 = -1, w2 = 1:
For input (0,0), yin = x1w1 + x2w2 = (0×-1) + (0×1) = 0
For input (0,1), yin = x1w1 + x2w2 = (0×-1) + (1×1) = 1
For input (1,0), yin = x1w1 + x2w2 = (1×-1) + (0×1) = -1
For input (1,1), yin = x1w1 + x2w2 = (1×-1) + (1×1) = 0
With threshold Ѳ = 1 the desired output is satisfied.
Example of McCulloch-Pitts Neuron
XOR gate model: three threshold units, each firing when its net input is at least 1.
• Z1 receives x1 with weight 1 and x2 with weight -1 (fires when x1 - x2 ≥ 1).
• Z2 receives x1 with weight -1 and x2 with weight 1 (fires when -x1 + x2 ≥ 1).
• Y receives Z1 and Z2, each with weight 1, and fires when Z1 + Z2 ≥ 1.
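Putting the three units together gives the complete XOR network. A minimal sketch (not from the slides) using the weights and thresholds derived above:

```python
def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts unit: fire (1) if the weighted sum reaches the threshold."""
    return 1 if sum(x * w for x, w in zip(inputs, weights)) >= threshold else 0

def xor_mp(x1, x2):
    """XOR from three units: Z1 = X1 AND NOT X2, Z2 = NOT X1 AND X2, Y = Z1 OR Z2."""
    z1 = mp_neuron((x1, x2), (1, -1), threshold=1)
    z2 = mp_neuron((x1, x2), (-1, 1), threshold=1)
    return mp_neuron((z1, z2), (1, 1), threshold=1)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(f"({x1},{x2}) -> XOR: {xor_mp(x1, x2)}")
```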
What is a Perceptron?
• A Perceptron is the smallest element of a neural network. It is a single-layer neural network, or a Machine Learning algorithm, used for supervised learning of binary classifiers. It works as an artificial neuron: it performs computations on the input features, learning weights for them, in order to detect patterns in the input data.
Neural Network Architecture
Neural networks are broadly divided into two classes:
• Feedforward networks: Single-layer Perceptron, Multilayer Perceptron, Radial Basis Function Network
• Feedback/Recurrent networks: Competitive Network, Self-Organizing Map, Hopfield Network
Feed Forward Network
• Feedforward network: all neurons are connected in the forward direction only, from inputs x1 ... xn through weights wji to outputs Y1 ... Ym.
• Each output neuron j computes a net input as the weighted sum of its inputs, netj = ∑i wji·xi, and produces the output Yj = f(netj), where f is the activation function.
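As an illustration, a minimal sketch of this forward computation (the sigmoid activation and the 3-input, 2-output sizes are assumptions made only for the example, not values from the slides):

```python
import math

def sigmoid(net):
    """Unipolar (logistic) activation: f(net) = 1 / (1 + e^-net)."""
    return 1.0 / (1.0 + math.exp(-net))

def feedforward(x, W, b):
    """One feedforward layer: netj = sum_i(wji * xi) + bj, Yj = f(netj)."""
    outputs = []
    for w_row, bias in zip(W, b):
        net = sum(w * xi for w, xi in zip(w_row, x)) + bias
        outputs.append(sigmoid(net))
    return outputs

# Example with 3 inputs and 2 output neurons (weights chosen arbitrarily)
x = [0.5, 1.0, -0.5]
W = [[0.2, -0.4, 0.1],   # weights into Y1
     [0.7, 0.3, -0.6]]   # weights into Y2
b = [0.0, 0.1]
print(feedforward(x, W, b))
```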
Feed Forward Network
• Single-layer Perceptron: only the input and output layers are present; it consists of a single layer of weights, with the inputs connected directly to the outputs.
• The sum of the products of the inputs and the weight matrix is calculated. If the value is above the threshold, the output is 1; otherwise it is -1.
Feed Forward Network
• Multilayer Perceptron Network: the inputs x1, x2, x3 are connected through weights to hidden layer 1, then hidden layer 2, and finally to the outputs Y1 ... Ym.
Feed Forward Network
• Radial Basis Function network: a single hidden layer is present between the inputs x1, x2, x3 and the outputs Y1 ... Ym.
Feedback/Recurrent Network
• Feedback/Recurrent Network: the output is fed back to the input. The inputs x1, x2, x3 pass through a hidden layer (with biases) to the output Y1, and the output is connected back to the input through a feedback path.
Feedback/Recurrent Network
• Competitive networks: the neurons of the output layer compete among themselves to produce the maximum output.
• Self-organizing map: each input activates the closest output neuron.
• Hopfield network: each neuron is connected to every other neuron but not back to itself. It is generally used for auto-association and optimization tasks, such as identifying stored patterns.
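As an illustration of the Hopfield idea, a minimal sketch (not from the slides) that stores one bipolar pattern in a symmetric, zero-diagonal weight matrix and recalls it from a corrupted copy:

```python
import numpy as np

def hopfield_train(patterns):
    """Hebbian storage: W = sum of outer products of bipolar patterns, zero diagonal (no self-connections)."""
    n = len(patterns[0])
    W = np.zeros((n, n))
    for p in patterns:
        p = np.asarray(p, dtype=float)
        W += np.outer(p, p)
    np.fill_diagonal(W, 0.0)
    return W

def hopfield_recall(W, state, steps=10):
    """Repeated updates: each neuron takes the sign of its weighted input."""
    s = np.asarray(state, dtype=float)
    for _ in range(steps):
        s = np.where(W @ s >= 0, 1.0, -1.0)
    return s

# Store one bipolar pattern and recover it from a copy with one flipped bit
stored = [1, -1, 1, -1, 1, -1]
W = hopfield_train([stored])
noisy = [1, -1, -1, -1, 1, -1]
print(hopfield_recall(W, noisy))   # recovers the stored pattern
```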
Types of Learning
• Supervised learning: the input is applied to the network and a supervisor provides the desired output. The difference between the actual response and the desired response is calculated as an error, which is used to correct the network parameters.
• Unsupervised learning: no supervisor is present, so there is no target output. The network adjusts its weights based on the patterns in the input data.
• Hybrid learning: a combination of supervised and unsupervised learning.
• Competitive learning: the neurons of the output layer compete among themselves for the maximum output. The neuron with the maximum response is declared the winner, and only the weights of the winner neuron are modified; the others remain unchanged.
• Example: a system has 6 inputs and 2 outputs.
How many neurons are required?
• Answer -> 8 neurons (6 input + 2 output)
What is the size of the weight matrix?
• Size = (2 × 6): 2 outputs and 6 inputs
Which output function should be used?
• Unipolar
• Compute the output of the following network using the unipolar (sigmoid) activation.
Inputs: x1 = 0, x2 = 1.
Hidden neuron H1: weights 4.83 (from x1) and -4.83 (from x2), bias -2.82.
Hidden neuron H2: weights -4.6 (from x1) and 4.6 (from x2), bias -2.74.
Output neuron O: weights 5.73 (from H1) and 5.83 (from H2), bias -2.86.

H1(net) = (4.83×0) + (-4.83×1) - 2.82 = -7.65
H1(out) = 1/(1+e^-H1(net)) = 4.759×10^-4
H2(net) = (-4.6×0) + (4.6×1) - 2.74 = 1.86
H2(out) = 1/(1+e^-H2(net)) = 0.865
O(net) = (4.759×10^-4 × 5.73) + (0.865 × 5.83) - 2.86 ≈ 2.187
O(out) = 1/(1+e^-O(net)) ≈ 0.899
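A minimal sketch that reproduces this calculation (the weights and biases are the ones listed in the example above):

```python
import math

def sigmoid(net):
    """Unipolar activation: 1 / (1 + e^-net)."""
    return 1.0 / (1.0 + math.exp(-net))

x1, x2 = 0.0, 1.0

# Hidden layer: net = x1*w1 + x2*w2 + bias
h1 = sigmoid(4.83 * x1 - 4.83 * x2 - 2.82)   # ≈ 4.76e-4
h2 = sigmoid(-4.6 * x1 + 4.6 * x2 - 2.74)    # ≈ 0.865

# Output layer
o_net = h1 * 5.73 + h2 * 5.83 - 2.86         # ≈ 2.19
print(sigmoid(o_net))                         # ≈ 0.899
```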
Components of a Perceptron
Each Perceptron comprises the following parts:
• Input Values: a set of values, or a dataset, used for predicting the output value. They are also described as the features of the dataset.
• Weights: the real value associated with each feature is known as its weight. It tells the importance of that feature in predicting the final value.
• Bias: the activation function is shifted towards the left or right using the bias. You may think of it simply as the y-intercept in the line equation.
• Summation Function: the summation function binds the weights and inputs together and computes their weighted sum.
• Activation Function: it introduces non-linearity into the perceptron model.
Why do we Need Weights and Bias?
• Weights and bias are two important aspects of the perceptron model. They are learnable parameters: as the network gets trained, it adjusts both to achieve the desired values and the correct output.
Perceptron Learning Rule
• The late 1950s saw the development of a new type of neural network called the perceptron, which was similar to the neuron model from the earlier work of McCulloch and Pitts.
• One key contribution by Frank Rosenblatt was his work on training these networks with the perceptron learning rule.
• According to the rule, a perceptron can learn automatically to generate the desired results through optimal weight coefficients.
Perceptron Learning Rule
Perceptron architecture: the input X (together with the bias b) is multiplied by the weights W to produce the observed output o, which is compared with the desired output d; the error (d - o), scaled by the constant c, drives the weight update.
X: input, b: bias, W: weight, o: observed output, d: desired output, c: learning constant, j = 1 to n

ΔWij = c·(di - oi)·Xj
W new = W old + ΔWij   (W is updated only if di is not equal to oi)

For example, if d1 = 1 and o1 = -1, then ΔWij = c·(di - oi)·Xj = c·[1 - (-1)]·Xj = 2cXj
If d1 = -1 and o1 = 1, then ΔWij = c·(di - oi)·Xj = c·[-1 - 1]·Xj = -2cXj
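A minimal sketch of this learning rule, assuming bipolar (+1/-1) targets, a sign activation, and an illustrative learning constant c = 0.1 (these specific values are assumptions, not taken from the slides):

```python
def sign(net):
    """Bipolar threshold activation: +1 if net >= 0, else -1."""
    return 1 if net >= 0 else -1

def perceptron_train(samples, n_inputs, c=0.1, epochs=10):
    """Perceptron rule: W_new = W_old + c*(d - o)*x, applied only when d != o."""
    W = [0.0] * n_inputs
    b = 0.0
    for _ in range(epochs):
        for x, d in samples:
            o = sign(sum(w * xi for w, xi in zip(W, x)) + b)
            if o != d:  # update only on misclassification
                W = [w + c * (d - o) * xi for w, xi in zip(W, x)]
                b = b + c * (d - o)
    return W, b

# Example: learn the AND function with bipolar targets
data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
print(perceptron_train(data, n_inputs=2))
```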


Linearly Separable / Linearly Non-Separable
• A problem is linearly separable if a single straight line (or hyperplane) can separate the two classes; otherwise it is linearly non-separable.
• The XOR gate is linearly non-separable.


• Multilayer Perceptron (MLP) model:
• An MLP consists of at least three layers of nodes: an input layer, a hidden layer, and an output layer.
• Except for the input nodes, each node is a neuron that uses a nonlinear activation function.
• An MLP uses a supervised learning technique called back-propagation for training. Its multiple layers and non-linear activations distinguish an MLP from a linear perceptron; it can distinguish data that is not linearly separable.
Weight update:
r = (di - Oi)·f'(net), where di = desired output and f'(net) = derivative of the output
For unipolar continuous activation, f'(net) = o(1 - o)
For bipolar continuous activation, f'(net) = 1/2·(1 - o^2)
Weight increment: ΔWij = C·(di - Oi)·f'(net)·Xj, where C is a constant, X is the input, and j = 1 to n
Wnew = Wold + ΔWij
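A minimal sketch of this continuous-perceptron (delta rule) update for a single neuron, assuming the unipolar sigmoid activation and an illustrative constant C = 0.5 (values not from the slides):

```python
import math

def sigmoid(net):
    """Unipolar continuous activation."""
    return 1.0 / (1.0 + math.exp(-net))

def delta_rule_step(W, x, d, C=0.5):
    """One update: dWj = C*(d - o)*f'(net)*xj, with f'(net) = o*(1 - o) for the unipolar sigmoid."""
    net = sum(w * xi for w, xi in zip(W, x))
    o = sigmoid(net)
    r = (d - o) * o * (1.0 - o)          # error signal times activation derivative
    return [w + C * r * xi for w, xi in zip(W, x)]

# Example: one update step with arbitrary illustrative numbers
W = [0.1, -0.2]
print(delta_rule_step(W, x=[1.0, 0.5], d=1.0))
```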
Deep Network
Working
• Understand the problem and check the
feasibility for deep learning
• Identify relevant data and prepare it
• Select Deep learning algorithm
• Train the algorithm
• Test the performance of the model
Application of Deep learning
• Automatic text generation
• Health care
• Automatic machine translation
• Image recognition
• Predicting earthquakes
Three classes of deep learning
• Generative deep architectures, which are intended to characterize the high-order correlation properties of the observed or visible data for pattern analysis or synthesis purposes, and/or to characterize the joint statistical distribution of the visible data and their associated classes.
• Discriminative deep architectures, which are intended to directly provide discriminative power for pattern classification, often by characterizing the posterior distribution of the classes conditioned on the visible data.
• Hybrid deep architectures, where the goal is discrimination but this is assisted by the outcomes of a generative architecture via better optimization and/or regularization, or where discriminative criteria are used to learn the parameters of a deep generative model.
Deep learning terminology
• Deep learning: a class of machine learning techniques in which many layers of information-processing stages in hierarchical architectures are exploited for unsupervised feature learning and for pattern analysis and classification.
• Deep belief network (DBN): in machine learning, a deep belief network is a generative graphical model, or alternatively a class of deep neural network, composed of multiple layers of latent variables ("hidden units"), with connections between the layers but not between units within each layer. The top two layers have undirected, symmetric connections between them. The lower layers receive top-down, directed connections from the layer above.
(Diagram: a DBN stack consisting of a visible layer with hidden layers stacked above it.)
• What are Boltzmann Machines?
• A Boltzmann machine is a network of neurons in which all the neurons are connected to each other. The machine has two layers, the visible (input) layer, denoted v, and the hidden layer, denoted h. In a Boltzmann machine there is no output layer. Boltzmann machines are stochastic, generative neural networks capable of learning internal representations, and they are able to represent and (given enough time) solve difficult combinatorial problems.
• Restricted Boltzmann Machine
• The term "restricted" means that we are not allowed to connect units of the same layer to each other. In other words, two neurons of the input layer, or two neurons of the hidden layer, cannot connect to each other, although the hidden layer and the visible layer can be connected to each other.
• Since there is no output layer in this machine, the question arises how we identify and adjust the weights, and how we measure whether our prediction is accurate or not. All these questions have one answer: the Restricted Boltzmann Machine.
• The RBM algorithm was proposed by Geoffrey Hinton (2007); it learns a probability distribution over its training data inputs. It has seen wide application in different areas of supervised/unsupervised machine learning such as feature learning, dimensionality reduction, classification, collaborative filtering, and topic modeling.
• Consider the movie-rating example discussed in the recommender-system section. Movies like Avengers, Avatar, and Interstellar have strong associations with a fantasy and science-fiction factor. Based on the user ratings, an RBM will discover latent factors that can explain the activation of movie choices. In short, an RBM describes variability among correlated input variables in terms of a potentially smaller number of unobserved variables.
• Deep Boltzmann Machines (DBMs):
• DBMs are similar to DBNs except that apart from the
connections within layers, the connections between the
layers are also undirected (unlike DBN in which the
connections between layers are directed). DBMs can
extract more complex or sophisticated features and
hence can be used for more complex tasks.
• Deep autoencoders: a deep autoencoder is composed of two symmetrical deep-belief networks, typically having four to five shallow layers each. One network represents the encoding half of the net and the second network makes up the decoding half.
Thank You
