
BIT 33603 : DATA MINING

LECTURE 7 : NEURAL NETWORKS


A Neural Network is a set of connected INPUT/OUTPUT UNITS, where each connection has a WEIGHT associated with it.

Neural Network learning is also called CONNECTIONIST learning, due to the connections between units.

A Neural Network learns by adjusting the weights so as to correctly classify the training data and hence, after the testing phase, to classify unknown data.
[Figure: a single unit. Inputs 2.7, -8.6 and 0.002 arrive on connections with weights w1 = -0.06, w2 = -2.5 and w3 = 1.4; the unit computes the weighted sum x = (-0.06)(2.7) + (-2.5)(-8.6) + (1.4)(0.002) = 21.34 and passes it through an activation function f(x).]
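To make the figure concrete, here is a minimal sketch of that single-unit computation (the slide leaves f(x) unspecified; a sigmoid is assumed here, and numpy is assumed available):

```python
import numpy as np

# Inputs and weights taken from the figure above.
inputs = np.array([2.7, -8.6, 0.002])
weights = np.array([-0.06, -2.5, 1.4])

x = np.dot(weights, inputs)      # weighted sum: approx. 21.34
f_x = 1.0 / (1.0 + np.exp(-x))   # assumed sigmoid activation; f(x) is close to 1.0
print(x, f_x)
```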
[Figure: perceptron architectures — the Perceptron, the Single Layer Perceptron (SLP), and the Multilayer Perceptron (MLP).]
• INPUT: records without the class attribute, with normalized attribute values.

• INPUT VECTOR: X = {x1, x2, …, xn}, where n is the number of (non-class) attributes.

• INPUT LAYER – there are as many nodes as non-class attributes, i.e. as the length of the input vector.

• HIDDEN LAYER – the number of nodes in the hidden layer and the number of hidden layers depends on the implementation.

• OUTPUT LAYER – corresponds to the class attribute; there are as many nodes as classes (values of the class attribute), k = 1, 2, …, #classes.
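As an illustrative sketch only, here is how those layer sizes follow from the data, with counts chosen to match the example dataset used later in this lecture (numpy assumed):

```python
import numpy as np

n_attributes = 3   # INPUT LAYER: one node per non-class attribute (x1, x2, x3)
n_hidden = 2       # HIDDEN LAYER: size is an implementation choice
n_classes = 2      # OUTPUT LAYER: one node per value of the class attribute

# Every connection between consecutive layers carries one weight.
rng = np.random.default_rng(seed=0)
W_input_to_hidden = rng.uniform(-1, 1, size=(n_attributes, n_hidden))
W_hidden_to_output = rng.uniform(-1, 1, size=(n_hidden, n_classes))
```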
Neural Network Learning
TWO MAJOR PROCESSES ARE INVOLVED:

1) Forward propagation
• The inputs are fed simultaneously into the input layer.
• The weighted outputs of these units are fed into the hidden layer.
• The weighted outputs of the last hidden layer are the inputs to the units making up the output layer.

2) Back-propagation
• Back-propagation learns by iteratively processing a set of training data (samples).
• For each sample, the weights are modified to minimize the error between the network's classification and the actual classification.
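A minimal sketch of forward propagation under these rules, assuming sigmoid units and a single hidden layer (variable names are illustrative):

```python
import numpy as np

def sigmoid(I):
    return 1.0 / (1.0 + np.exp(-I))

def forward(x, W_h, b_h, W_o, b_o):
    """Feed one input vector through the network, layer by layer."""
    O_h = sigmoid(x @ W_h + b_h)    # weighted inputs into the hidden layer
    O_o = sigmoid(O_h @ W_o + b_o)  # hidden outputs feed the output layer
    return O_h, O_o
```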
Steps in training the MLP
• STEP ONE: initialize the weights and biases.

• The weights in the network are initialized to random numbers from the interval [-1,1] or [0,1]

• Each unit has a BIAS associated with it

• The biases are similarly initialized to random numbers from the interval [-1,1].

• STEP TWO: feed the training sample.

• STEP THREE: Propagate the inputs forward; we compute the net input and output of each unit in the hidden and output layers.

• STEP FOUR: back propagate the error.

• STEP FIVE: update weights and biases to reflect the propagated errors.

• STEP SIX: check the terminating conditions.
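Putting the six steps together, here is a minimal sketch of one training pass for a single sample, assuming sigmoid units (names are illustrative; the error formulas used here are derived in the Backpropagation Formulas section below):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def init_network(n_in, n_hidden, n_out):
    """STEP ONE: weights and biases initialized randomly in [-1, 1]."""
    return {"W_h": rng.uniform(-1, 1, (n_in, n_hidden)),
            "b_h": rng.uniform(-1, 1, n_hidden),
            "W_o": rng.uniform(-1, 1, (n_hidden, n_out)),
            "b_o": rng.uniform(-1, 1, n_out)}

def train_step(net, x, target, lr):
    """STEPS TWO to FIVE for one training sample."""
    # STEP THREE: propagate the inputs forward.
    O_h = 1 / (1 + np.exp(-(x @ net["W_h"] + net["b_h"])))
    O_o = 1 / (1 + np.exp(-(O_h @ net["W_o"] + net["b_o"])))
    # STEP FOUR: back-propagate the error.
    err_o = O_o * (1 - O_o) * (target - O_o)
    err_h = O_h * (1 - O_h) * (net["W_o"] @ err_o)
    # STEP FIVE: update weights and biases to reflect the propagated errors.
    net["W_o"] += lr * np.outer(O_h, err_o)
    net["b_o"] += lr * err_o
    net["W_h"] += lr * np.outer(x, err_h)
    net["b_h"] += lr * err_h
    return O_o
```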


Terminating Conditions
• When to stop the training:
• all Δwij in the previous epoch are below some threshold, or
• the percentage of samples misclassified in the previous epoch is below some threshold, or
• a pre-specified number of epochs has expired.

• In practice, several hundred thousand epochs may be required before the weights converge.

• Training a neural network with the backpropagation learning algorithm usually requires that the whole input set (one full presentation of it is called an epoch) be presented many times. For example, the ANN may need hundreds to thousands of epochs.
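A sketch of a training loop with these three stopping rules, reusing the train_step sketch from above (the thresholds are illustrative choices, not prescribed values):

```python
def train(net, samples, targets, lr=0.9, max_epochs=500_000,
          w_threshold=1e-6, err_threshold=0.01):
    for epoch in range(max_epochs):               # rule 3: epoch budget expires
        old = {k: v.copy() for k, v in net.items()}
        misclassified = 0
        for x, t in zip(samples, targets):        # one epoch = every sample once
            out = train_step(net, x, t, lr)
            misclassified += int((out.round() != t).any())
        max_delta_w = max(abs(net[k] - old[k]).max() for k in net)
        if max_delta_w < w_threshold:             # rule 1: all delta w_ij tiny
            break
        if misclassified / len(samples) < err_threshold:  # rule 2: error rate low
            break
```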
Backpropagation Formulas

(Read the network from the input vector x_i at the input nodes, through the hidden nodes, to the output nodes that emit the output vector.)

Net input and output of a unit j:
  I_j = Σ_i w_ij O_i + θ_j
  O_j = 1 / (1 + e^(-I_j))

Error at an output node k (T_k is the target output):
  Err_k = O_k (1 - O_k)(T_k - O_k)

Error at a hidden node j:
  Err_j = O_j (1 - O_j) Σ_k Err_k w_jk

Weight and bias updates (l is the learning rate):
  w_ij = w_ij + (l) Err_j O_i
  θ_j = θ_j + (l) Err_j
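The same formulas transcribed directly into Python, as a sketch with explicit subscripts mirroring the i, j, k indices above (the helper names are invented for illustration):

```python
def output_unit_error(O_k, T_k):
    """Err_k = O_k (1 - O_k)(T_k - O_k) for an output node k."""
    return O_k * (1 - O_k) * (T_k - O_k)

def hidden_unit_error(O_j, downstream):
    """Err_j = O_j (1 - O_j) * sum_k Err_k w_jk; `downstream` holds (Err_k, w_jk) pairs."""
    return O_j * (1 - O_j) * sum(err_k * w_jk for err_k, w_jk in downstream)

def updated_weight(w_ij, l, err_j, O_i):
    """w_ij <- w_ij + (l) Err_j O_i."""
    return w_ij + l * err_j * O_i

def updated_bias(theta_j, l, err_j):
    """theta_j <- theta_j + (l) Err_j."""
    return theta_j + l * err_j
```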
A dataset:

Inputs class
1.4 2.7 1.9 0
3.8 3.4 3.2 0
6.4 2.8 1.7 1
4.1 0.1 0.2 0
etc …
Training the neural network
(using the training data above; the slides animate one network over the following steps)

• Initialise with random weights.
• Present a training pattern, e.g. inputs 1.4, 2.7, 1.9 (class 0).
• Feed it through to get the output, e.g. 0.8.
• Compare with the target output: the target is 0, so the error is 0.8.
• Adjust the weights based on the error.
• Present the next training pattern, e.g. inputs 6.4, 2.8, 1.7 (class 1).
• Feed it through to get the output, e.g. 0.9; compare with the target 1: the error is -0.1; adjust the weights.
• And so on …

Repeat this thousands, maybe millions of times – each time taking a random training instance, and making slight weight adjustments.

Algorithms for weight adjustment are designed to make changes that will reduce the error.
The decision boundary perspective…

[Figure sequence: starting from initial random weights, the decision boundary sits at a random position; each time a training instance is presented and the weights are adjusted, the boundary shifts slightly; eventually it separates the two classes.]
The points are:
• weight-learning algorithms for NNs are a slow, tedious process;

• they work by making thousands and thousands of tiny adjustments, each making the network do better on the most recent pattern, but perhaps a little worse on many others;

• but eventually this tends to be good enough to learn effective classifiers for many real applications.
Some other points

If f(x) is non-linear, a network with one hidden layer can, in theory, learn any classification problem perfectly: a set of weights exists that can produce the targets from the inputs. The problem is finding them.
Some other 'by the way' points
• If f(x) is linear, the NN can only draw straight decision boundaries (like a single-layer perceptron, SLP).
• NNs use nonlinear f(x), so they can draw complex boundaries while keeping the data unchanged.
• SVMs, by contrast, only draw straight lines, but they transform the data first in a way that makes that OK.
So how are the NN weights adjusted?
Example of Backpropagation
Input = 3, Hidden neurons = 2, Output = 1

Initialize the weights and biases to random numbers from -1.0 to 1.0.

Initial input and weights:

x1  x2  x3  w14  w15   w24  w25  w34   w35  w46   w56
1   0   1   0.2  -0.3  0.4  0.1  -0.5  0.2  -0.3  -0.2

Initial biases:

θ4    θ5   θ6
-0.4  0.2  0.1


Net Input and Output Calculation

Unit j   Net input I_j                                 Output O_j
4        0.2 + 0 - 0.5 - 0.4 = -0.7                    O_4 = 1 / (1 + e^(0.7)) = 0.332
5        -0.3 + 0 + 0.2 + 0.2 = 0.1                    O_5 = 1 / (1 + e^(-0.1)) = 0.525
6        (-0.3)(0.332) - (0.2)(0.525) + 0.1 = -0.105   O_6 = 1 / (1 + e^(0.105)) = 0.475
Calculation of the Error at Each Node

Err_k = O_k (1 - O_k)(T_k - O_k)
Err_j = O_j (1 - O_j) Σ_k Err_k w_jk

We assume the target output T_6 = 1, where T is the target output and O the current output.

Unit j   Error_j
6        0.475 (1 - 0.475)(1 - 0.475) = 0.1311
5        0.525 × (1 - 0.525) × 0.1311 × (-0.2) = -0.0065
4        0.332 × (1 - 0.332) × 0.1311 × (-0.3) = -0.0087
Calculation of Weight and Bias Updates

Let the learning rate be l = 0.9, and apply w_ij = w_ij + (l) Err_j O_i and θ_j = θ_j + (l) Err_j:

Weight/bias   New value
w46           -0.3 + (0.9)(0.1311)(0.332) = -0.261
w56           -0.2 + (0.9)(0.1311)(0.525) = -0.138
w14           0.2 + (0.9)(-0.0087)(1) = 0.192
w15           -0.3 + (0.9)(-0.0065)(1) = -0.306
θ6            0.1 + (0.9)(0.1311) = 0.218
…and similarly for the remaining weights and biases.
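A short script, as a sketch, that reproduces these numbers end to end (all values are taken from the tables above; numpy is assumed):

```python
import numpy as np

def sigmoid(I):
    return 1.0 / (1.0 + np.exp(-I))

# Initial inputs, weights and biases from the tables above.
x = np.array([1.0, 0.0, 1.0])         # x1, x2, x3
W_h = np.array([[0.2, -0.3],          # w14, w15
                [0.4, 0.1],           # w24, w25
                [-0.5, 0.2]])         # w34, w35
W_o = np.array([-0.3, -0.2])          # w46, w56
theta_h = np.array([-0.4, 0.2])       # theta4, theta5
theta_o = 0.1                         # theta6
T, l = 1.0, 0.9                       # target output and learning rate

# Forward pass: O4 = 0.332, O5 = 0.525, O6 = 0.475 (approx.)
O_h = sigmoid(x @ W_h + theta_h)
O_o = sigmoid(O_h @ W_o + theta_o)

# Errors: Err6 = 0.1311, Err5 = -0.0065, Err4 = -0.0087 (approx.)
err_o = O_o * (1 - O_o) * (T - O_o)
err_h = O_h * (1 - O_h) * (W_o * err_o)

# Updates: w46 = -0.261, w56 = -0.138, w14 = 0.192, w15 = -0.306, theta6 = 0.218
W_o += l * err_o * O_h
theta_o += l * err_o
W_h += l * np.outer(x, err_h)
theta_h += l * err_h
print(O_h, O_o, err_o, err_h)
```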
Developing a Neural Network-Based System

Applications
• Forecasting/market prediction: finance and banking
• Manufacturing: quality control, fault diagnosis
• Medicine: analysis of electrocardiogram data, RNA & DNA sequencing, drug development without animal testing
• Control: process control, robotics
Time Series Prediction
• Time series prediction: given an existing data series, we observe or model the series to make accurate forecasts (a data-preparation sketch follows below).

• Example time series:
• Financial (e.g., stocks, exchange rates)
• Physically observed (e.g., weather, sunspots, river flow)

• Why is it important?
• Preventing undesirable events by forecasting the event, identifying the circumstances preceding it, and taking corrective action so the event can be avoided (e.g., an inflationary economic period)
• Forecasting undesirable, yet unavoidable, events to preemptively lessen their impact (e.g., solar maximum w/ sunspots)
• Profiting from forecasting (e.g., financial markets)
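A minimal sketch of how a series is commonly prepared for an NN forecaster: a sliding window of past values becomes the input vector and the next value becomes the target (the window length and the sample values are illustrative):

```python
import numpy as np

def make_windows(series, window=4):
    """Turn a 1-D series into (inputs, targets) pairs for supervised training."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = np.array(series[window:])   # the value right after each window
    return X, y

# e.g. monthly observations; each row of X is used to predict the matching y
X, y = make_windows([112, 118, 132, 129, 121, 135, 148, 148, 136], window=4)
```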
