ML Unit-2
NEURAL NETWORKS
Syllabus
UNIT - II
Artificial Neural Networks-1 – Introduction, neural network representation, appropriate problems for neural network learning, perceptrons, multilayer networks and the backpropagation algorithm.
Artificial Neural Networks-2 – Remarks on the Back-Propagation algorithm, an illustrative example: face recognition, advanced topics in artificial neural networks.
Evaluating Hypotheses – Motivation, estimating hypothesis accuracy, basics of sampling theory, a general approach for deriving confidence intervals, difference in error of two hypotheses, comparing learning algorithms.
TEXT BOOK:
1. Machine Learning – Tom M. Mitchell, McGraw-Hill.
Introduction
ANNs were first introduced in 1943 by the neurophysiologist Warren McCulloch and the mathematician Walter Pitts.
Artificial neural networks (ANNs) provide a general, practical method for learning real-valued, discrete-valued, and vector-valued functions from examples.
Suppose the perceptron outputs −1 when the target output is +1. To make the perceptron output +1 instead of −1 in this case, the weights must be altered to increase the value of w · x. For example, if xᵢ > 0, then increasing wᵢ will bring the perceptron closer to correctly classifying this example.
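A minimal sketch of this weight update in Python (assuming the standard perceptron training rule wᵢ ← wᵢ + η(t − o)xᵢ with learning rate η; the function name and defaults are illustrative, not from the text):

```python
import numpy as np

def perceptron_train(X, t, eta=0.1, epochs=50):
    """Perceptron training rule: w_i <- w_i + eta * (t - o) * x_i.

    X : (n_samples, n_features) inputs; t : targets in {-1, +1}.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, t):
            o = 1 if np.dot(w, x) + b > 0 else -1  # perceptron output
            if o != target:                        # misclassified example:
                w += eta * (target - o) * x        # move w . x toward target
                b += eta * (target - o)
    return w, b
```

When the target is +1 and the output is −1, (t − o) = 2, so every wᵢ with xᵢ > 0 is increased, raising w · x exactly as described above.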
WHAT IS GRADIENT DESCENT?
Gradient descent is an optimization algorithm for finding a local minimum of a differentiable function. It is used to find the values of a function's parameters (coefficients) that minimize a cost function as far as possible.
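A small sketch of the idea (the quadratic cost and step size below are made-up examples for illustration, not from the text):

```python
import numpy as np

def gradient_descent(grad, w0, eta=0.05, steps=1000):
    """Repeatedly step opposite the gradient to approach a local minimum."""
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        w -= eta * grad(w)  # w <- w - eta * dE/dw
    return w

# Example: minimize E(w) = (w - 3)^2, whose gradient is 2(w - 3).
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=[0.0])
print(w_min)  # approaches [3.]
```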
Multilayer Networks and the Backpropagation Algorithm:
Single perceptrons can only express linear decision surfaces. In contrast, the kind of multilayer networks learned by the Backpropagation algorithm are capable of expressing a rich variety of nonlinear decision surfaces.
FIGURE 4.6: The sigmoid threshold unit.
The sigmoid unit computes its output o as o = σ(w · x), where σ(y) = 1 / (1 + e^(−y)). The sigmoid function, often called the logistic function or squashing function, has an output range of (0, 1) and increases monotonically with its input. The sigmoid function has the useful property that its derivative is easily expressed in terms of its output. In particular,
dσ(y)/dy = σ(y) · (1 − σ(y))
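A short sketch of the sigmoid and its derivative in Python (a direct transcription of the two formulas above; numpy is assumed):

```python
import numpy as np

def sigmoid(y):
    """Logistic (squashing) function: output in (0, 1), monotonic in y."""
    return 1.0 / (1.0 + np.exp(-y))

def sigmoid_derivative(y):
    """d(sigma)/dy = sigma(y) * (1 - sigma(y)), expressed via the output."""
    s = sigmoid(y)
    return s * (1.0 - s)
```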
NEED FOR ACTIVATION FUNCTIONS
ANNs use activation functions (AFs) to perform complex computations in the hidden layers and then transfer the result to the output layer. The primary purpose of AFs is to introduce non-linear properties into the neural network.
Softmax Function
The softmax function maps a vector of K real-valued scores to a probability distribution: softmax(y)ₖ = e^(yₖ) / Σⱼ e^(yⱼ). It is typically used in the output layer for multi-class classification.
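A minimal sketch of softmax (the max-subtraction step is a common numerical-stability convention, not something the slides specify):

```python
import numpy as np

def softmax(y):
    """Map a vector of scores to a probability distribution."""
    z = np.exp(y - np.max(y))  # subtract max for numerical stability
    return z / z.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # approx. [0.659 0.242 0.099]
```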
The Backpropagation Algorithm:
The Backpropagation algorithm:
• learns the weights for a multilayer network, given a network with a fixed set of units and interconnections.
• employs gradient descent to attempt to minimize the squared error between the network output values and the target values for these outputs.
We can write
E(w) = ½ Σ_{d∈D} Σ_{k∈outputs} (t_kd − o_kd)²
where outputs is the set of output units in the network, and t_kd and o_kd are the target and output values associated with the k-th output unit and training example d.
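A minimal sketch of one stochastic-gradient Backpropagation step for a two-layer sigmoid network, following the standard update rules δₖ = oₖ(1 − oₖ)(tₖ − oₖ), δₕ = oₕ(1 − oₕ)Σₖ wₖₕδₖ, Δw = η · δ · input (the network shape and names here are illustrative, not from the text):

```python
import numpy as np

def sigmoid(y):
    return 1.0 / (1.0 + np.exp(-y))

def backprop_step(x, t, W_h, W_o, eta=0.05):
    """One stochastic-gradient step for a two-layer sigmoid network,
    reducing E = 1/2 * sum_k (t_k - o_k)^2 for a single example."""
    # Forward pass.
    h = sigmoid(W_h @ x)  # hidden-unit outputs
    o = sigmoid(W_o @ h)  # output-unit outputs o_k
    # Backward pass: error terms (delta) for each unit.
    delta_o = o * (1 - o) * (t - o)            # output units
    delta_h = h * (1 - h) * (W_o.T @ delta_o)  # hidden units
    # Gradient-descent weight updates: w <- w + eta * delta * input.
    W_o += eta * np.outer(delta_o, h)
    W_h += eta * np.outer(delta_h, x)
    return o

# Tiny usage example: fit one training example.
rng = np.random.default_rng(0)
W_h = rng.normal(scale=0.5, size=(3, 2))  # 3 hidden units, 2 inputs
W_o = rng.normal(scale=0.5, size=(1, 3))  # 1 output unit
for _ in range(1000):
    backprop_step(np.array([1.0, 0.0]), np.array([1.0]), W_h, W_o)
```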
An Illustrative Example: Face Recognition
Commercial face recognition systems include:
• Deep Vision AI
• SenseTime
• Amazon Rekognition
• FaceFirst
• Trueface
• Cognitec
Applications: hospital security, airline industry.
Advanced topics in artificial neural networks
Evaluating Hypotheses
Comparing Learning Algorithms.
Machine learning offers many algorithms, and there is often confusion about which one to choose, so we consider several factors (a minimal empirical comparison is sketched after this list):
1. Time complexity
2. Space complexity
3. Sample complexity
4. Availability of unbiased data
5. Online vs. offline operation
6. Parallelizability
7. Parametricity – parametric models use a fixed number of parameters regardless of data size; non-parametric models grow with the size of the data.
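As a sketch of how such a comparison might look in practice (assuming scikit-learn and its built-in iris data purely for illustration; this estimates each learner's accuracy on the same k folds, in the spirit of the cross-validated comparison described in the text):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Score both learners on the same 10 folds, so the comparison
# reflects the algorithms rather than the data split.
for name, clf in [("decision tree", DecisionTreeClassifier(random_state=0)),
                  ("k-NN", KNeighborsClassifier())]:
    scores = cross_val_score(clf, X, y, cv=10)
    print(f"{name}: mean accuracy {scores.mean():.3f} (+/- {scores.std():.3f})")
```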