Chapter 3
SUPERVISED LEARNING NETWORK
DEFINITION OF SUPERVISED LEARNING NETWORKS
Training and test data sets
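Supervised learning fits a model on labeled training data and evaluates it on held-out test data. A minimal sketch of such a split (the 80/20 fraction and the seed are illustrative choices, not from the slides):

import random

def train_test_split(examples, test_fraction=0.2, seed=0):
    # examples: list of (inputs, target) pairs; shuffle, then split.
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]   # (training set, test set)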
[Figure: A perceptron. Inputs x1, x2, …, xn with weights w1, w2, …, wn (plus a bias weight w0 on the fixed input x0 = 1) feed a summing unit whose thresholded result is the output o.]

$$o = f(\mathbf{x}) = \begin{cases} 1 & \text{if } \sum_{i=0}^{n} w_i x_i > 0 \\ -1 & \text{otherwise} \end{cases}$$
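A minimal Python sketch of this thresholded output (the AND weights below are illustrative, not from the slides):

# Perceptron output: +1 if the weighted sum exceeds 0, otherwise -1.
def perceptron_output(weights, inputs):
    # weights[0] is the bias w0; its input x0 is fixed at 1.
    net = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    return 1 if net > 0 else -1

# Example: a two-input perceptron computing logical AND.
w = [-1.5, 1.0, 1.0]                    # w0 (bias), w1, w2
print(perceptron_output(w, [1, 1]))     # 1
print(perceptron_output(w, [1, 0]))     # -1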
PERCEPTRON LEARNING
$$w_i \leftarrow w_i + \Delta w_i, \qquad \Delta w_i = \eta\,(t - o)\,x_i$$

where
t = c(x) is the target value,
o is the perceptron output,
η is a small constant (e.g., 0.1) called the learning rate.
If the output is correct (t = o), the weights wi are not changed.
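A sketch of one training pass with this rule, assuming η = 0.1 and targets in {−1, +1}:

def train_perceptron(weights, examples, eta=0.1):
    # examples: list of (inputs, target) pairs with targets in {-1, +1};
    # weights[0] is the bias w0 with fixed input x0 = 1.
    for inputs, t in examples:
        net = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
        o = 1 if net > 0 else -1
        if t != o:  # correct outputs (t = o) leave the weights unchanged
            weights[0] += eta * (t - o)
            for i, x in enumerate(inputs):
                weights[i + 1] += eta * (t - o) * x
    return weights

Because the update fires only on misclassified examples, repeated passes converge when the data are linearly separable.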
[Figure: A simple layered network. Input signals (external stimuli) enter the input layer and pass through adjustable weights to the output layer, which produces the output values.]
LAYERS IN A NEURAL NETWORK
The input layer:
• Introduces input values into the network.
• Applies no activation function or other processing.
The Adaline network uses the Delta Learning Rule, also called the Widrow-Hoff Learning Rule or the Least Mean Square (LMS) Rule. The delta rule for adjusting the weights is given as (i = 1 to n):

$$w_i(\text{new}) = w_i(\text{old}) + \alpha\,(t - y_{in})\,x_i$$

where α is the learning rate, t is the target value, and y_in is the Adaline's net input.
USING ADALINE NETWORKS
Initialize:
• Assign random weights to all links.
Training:
• Feed in known inputs in random sequence.
• Simulate the network.
• Compute the error between the target and the output (error function).
• Adjust the weights (learning function).
• Repeat until total error < ε (a sketch of this loop follows below).
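A minimal sketch of this training loop for a single Adaline unit; the values α = 0.1, ε = 0.01, and the epoch cap are illustrative:

import random

def train_adaline(examples, n_inputs, alpha=0.1, eps=0.01, max_epochs=1000):
    # Initialize: assign small random weights to all links (index 0 = bias).
    w = [random.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
    for _ in range(max_epochs):          # repeat until total error < eps
        random.shuffle(examples)         # feed known inputs in random sequence
        total_error = 0.0
        for inputs, t in examples:
            # Simulate: Adaline trains on the raw net input, not a threshold.
            y_in = w[0] + sum(wi * x for wi, x in zip(w[1:], inputs))
            err = t - y_in               # error function
            total_error += err ** 2
            # Learning function (delta rule): w_i += alpha * (t - y_in) * x_i
            w[0] += alpha * err
            for i, x in enumerate(inputs):
                w[i + 1] += alpha * err * x
        if total_error < eps:
            break
    return w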
Thinking:
• Simulate the network.
• The network will respond to any input.
• It does not guarantee a correct solution, even for trained inputs.
MADALINE NETWORK
[Figure: A Madaline network. Inputs I0–I3 feed a layer of hidden Adaline units h0–h2, which feed output units o0–o1.]
MULTILAYER FEEDFORWARD NETWORK: ACTIVATION AND TRAINING
For feedforward networks:
• A continuous function can be differentiated, allowing gradient descent.
• Backpropagation is an example of a gradient-descent technique.
• It uses a sigmoid (binary or bipolar) activation function.
In multilayer networks, the activation function is usually more complex than a simple threshold function, e.g. the binary sigmoid 1/[1 + exp(−x)] or the bipolar sigmoid 2/[1 + exp(−x)] − 1, which allows for inhibition, etc.
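Sketches of those two activations in Python; the derivative identity is the standard one backpropagation relies on:

import math

def binary_sigmoid(x):
    # 1 / (1 + exp(-x)); output range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def bipolar_sigmoid(x):
    # 2 / (1 + exp(-x)) - 1; output range (-1, 1), so units can inhibit
    return 2.0 / (1.0 + math.exp(-x)) - 1.0

def binary_sigmoid_deriv(y):
    # Derivative expressed via the output y = binary_sigmoid(x): y(1 - y)
    return y * (1.0 - y)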
GRADIENT DESCENT
Gradient-Descent(training_examples, η)
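The slide gives only the procedure's signature; below is a sketch of the standard batch gradient-descent loop for a single linear unit, consistent with the delta rule above (the learning rate and epoch count are illustrative):

def gradient_descent(training_examples, eta=0.05, epochs=100):
    # training_examples: list of (inputs, target) pairs; w[0] is the bias.
    n = len(training_examples[0][0])
    w = [0.0] * (n + 1)
    for _ in range(epochs):
        delta = [0.0] * (n + 1)              # accumulated batch update
        for inputs, t in training_examples:
            o = w[0] + sum(wi * x for wi, x in zip(w[1:], inputs))
            delta[0] += eta * (t - o)        # from the squared-error gradient
            for i, x in enumerate(inputs):
                delta[i + 1] += eta * (t - o) * x
        w = [wi + d for wi, d in zip(w, delta)]  # apply after the full batch
    return w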
Applications include:
• Image processing.
• Signature verification.
• Bioinformatics.
Supervised learning networks covered:
• Perceptron
• Adaline
• Madaline
• Backpropagation Network
• Radial Basis Function Network