
Chapter-4

Neural Network

By: Yeshambel A.
Neural Network

The Power of Brain vs. Machine

• The Brain
  – Pattern Recognition
  – Association
  – Complexity
  – Noise Tolerance

• The Machine
  – Calculation
  – Precision
Features of the Brain

• Ten billion (10^10) neurons
• Face recognition in ~0.1 s
• On average, each neuron has several thousand connections
• Hundreds of operations per second
• High degree of parallel computation
• Distributed representations
• Neurons die off frequently (and are never replaced)
Neural Network classifier
⚫ It is represented as a layered set of interconnected processors.
⚫ These processor nodes and their connections resemble the neurons of the brain and their links.
⚫ Each node has a weighted connection to several other nodes in adjacent layers.
⚫ Individual nodes take the input received from connected nodes and use the weights together to compute output values.
⚫ The inputs are fed simultaneously into the input layer.
⚫ The weighted outputs of these units are fed into the hidden layer.
⚫ The weighted outputs of the last hidden layer are inputs to the units making up the output layer.
Neural Networks Applications
There are two basic goals for neural network research:
Brain modelling
• Aid our understanding of how the brain works. This helps us understand the nature of perception, action, learning, memory, thought and intelligence, and/or formulate medical solutions for brain-damaged patients.
Artificial system construction / real-world applications
• Financial modelling – predicting the stock market
• Time series prediction – climate, weather, seizures
• Computer games – intelligent agents, chess, backgammon
• Robotics – autonomous adaptable robots
• Pattern recognition – speech recognition, seismic activity, sonar signals
• Data analysis – data compression, data mining
• Bioinformatics – DNA sequencing, alignment
Architecture of Neural network
⚫ Neural networks are used to look for patterns in data, learn these patterns, and then classify new patterns and make forecasts.
⚫ A network with the input and output layer only is called a single-layer neural network, whereas a multilayer neural network is a generalized one with one or more hidden layers.
⚫ A network containing two hidden layers is called a three-layer neural network, and so on.
A Multilayer Neural Network
⚫ Input layer: corresponds to the input attributes, with normalized attribute values.
  – There are as many nodes as attributes, X = {x1, x2, …, xm}, where m is the number of attributes.
• Hidden layer
  – Neither its input nor its output can be observed from outside.
  – The number of nodes in the hidden layer and the number of hidden layers depend on the implementation.
  – Different numbers of hidden layers and nodes mostly produce different results.
• Output layer: corresponds to the class attribute.
  – There are as many nodes as classes (values of the class attribute).
Multi-layer Perceptron (MLP)
• One of the most popular neural network models is the multi-layer perceptron (MLP).
• In an MLP, neurons are arranged in layers. There is one input layer, one output layer, and several (or many) hidden layers.
Hidden layer: Neuron with Activation
⚫ The neuron is the basic information processing unit of a NN.
⚫ It consists of:
  1. A set of links, describing the neuron inputs, with weights W1, W2, …, Wm.
  2. An adder function (linear combiner) for computing the weighted sum of the inputs (real numbers):

     y = ∑_{j=1}^{m} w_j x_j

  3. An activation function (also called a squashing function) for limiting the output behavior of the neuron.
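As a minimal sketch of this neuron model in Python (the input values, the weights, and the choice of a sigmoid activation are illustrative assumptions, not taken from the slides):

```python
import numpy as np

def neuron(x, w, activation):
    """Compute a neuron's output: activation of the weighted sum of its inputs."""
    y = np.dot(w, x)       # adder function: y = sum_j w_j * x_j
    return activation(y)   # squashing function limits the output

sigmoid = lambda y: 1.0 / (1.0 + np.exp(-y))

x = np.array([0.5, -1.0, 2.0])   # example inputs (assumed values)
w = np.array([0.4, 0.6, -0.1])   # example weights (assumed values)
print(neuron(x, w, sigmoid))     # a single output value in (0, 1)
```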
Activation Functions

(a) Step function or threshold function (hard-limiting)
(b) Sigmoid function: 1/(1 + e^(-x))

⚫ Changing the bias weight W0 moves the threshold location.
⚫ Bias helps the neural network to be more flexible, since it shifts the activation function left or right, centering it on some value other than x = 0. To this effect, an additional node is added to the input layer with a constant input, say 1 or -1. When this constant is multiplied by its weight, it provides a bias to the activation function.
Activation Functions

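To illustrate the bias effect described above, a small sketch with assumed weights: a bias weight of -2 on a constant input of 1 shifts the sigmoid's center from x = 0 to x = 2.

```python
import numpy as np

def sigmoid(y):
    return 1.0 / (1.0 + np.exp(-y))

w, w0 = 1.0, -2.0        # input weight and bias weight (assumed values)
for x in [-1.0, 0.0, 2.0, 4.0]:
    # the constant bias input (1 * w0) shifts the activation's center
    # from x = 0 to x = -w0/w = 2, where the sigmoid crosses 0.5
    print(x, sigmoid(w * x + 1.0 * w0))
```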
Two Topologies of neural network
⚫ A NN can be designed in a feed-forward or recurrent manner.
⚫ In a feed-forward neural network, connections between the units do not form a directed cycle.
⚫ In this network, information moves in only one direction: forward, from the input nodes, through the hidden nodes (if any), to the output nodes. There are no cycles, loops or feedback connections in the network, that is, no connections extending from outputs of units to inputs of units in the same layer or previous layers.
⚫ In recurrent networks, data circulates back and forth until the activation of the units stabilizes.
⚫ Recurrent networks have a feedback loop where data can be fed back into the input at some point before it is fed forward again for further processing and final output.
Training the neural network
⚫ The purpose is to learn to generalize using a set of sample patterns where the desired output is known.
⚫ Back propagation is the most commonly used method for training multilayer feed-forward NNs.
⚫ Back propagation learns by iteratively processing a set of training data (samples).
⚫ For each sample, weights are modified to minimize the error between the desired output and the actual output.
⚫ After propagating an input through the network, the error is calculated and propagated back through the network while the weights are adjusted.
Training Algorithm
⚫ The learning algorithm is as follows:
⚫ Initialize the weights and threshold to small random numbers.
⚫ Present a vector x to the neuron inputs and calculate the output using the adder function:

   y = ∑_{j=1}^{m} w_j x_j

⚫ Apply the activation function (in this case a step function) such that

   y = 0 if y ≤ 0
   y = 1 if y > 0

⚫ Update the weights according to the error:

   W_j = W_j + η (y_T − y) x_j

  where y_T is the target (desired) output.
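Putting these steps together, below is a minimal sketch of the training loop for a single perceptron. The AND dataset, learning rate, and epoch count are assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
X = np.hstack([np.ones((4, 1)), X])       # prepend a constant bias input
t = np.array([0, 0, 0, 1], dtype=float)   # targets for the AND function

w = rng.uniform(-0.5, 0.5, size=3)  # initialize weights to small random numbers
eta = 0.1                           # learning rate (assumed)

for epoch in range(20):
    for x, target in zip(X, t):
        y = 1.0 if np.dot(w, x) > 0 else 0.0   # adder + step activation
        w += eta * (target - y) * x            # W_j = W_j + eta*(y_T - y)*x_j

print(w)  # learned weights; should separate the AND function after convergence
```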
Training Multi-layer NN

[Figure sequence: the layers are trained in order – train this layer first, then this layer, then this layer, finally this one.]
Calculating the Error
⚫ Evaluate the predicted output: calculate the error as the difference between the predicted output and the target output of sample n, and pass it to a loss function.

Calculating the Error: Example
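As a concrete example, a common loss function is the squared error. A minimal sketch, with made-up predicted and target values:

```python
import numpy as np

y_pred = np.array([0.8, 0.3])   # network outputs (assumed values)
y_true = np.array([1.0, 0.0])   # desired targets (assumed values)

error = y_pred - y_true            # per-output error
loss = 0.5 * np.sum(error ** 2)    # squared-error loss
print(error, loss)                 # [-0.2  0.3] 0.065
```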
Reducing Error
• The main goal of training is to reduce the error, i.e. the difference between the prediction and the actual output.
• By decomposing the prediction into its basic elements, we find that the weights are the variable elements affecting the prediction value. In other words, to change the prediction value, we need to change the weight values.

How do we change/update the weight values so that the error is reduced?

The answer is Backpropagation!
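Backpropagation updates each weight in the direction that reduces the error, using the chain rule to obtain the gradient. A minimal sketch for a single sigmoid output unit (the inputs, weights, target, and learning rate are all assumed values):

```python
import numpy as np

def sigmoid(y):
    return 1.0 / (1.0 + np.exp(-y))

x = np.array([0.5, 1.0])      # inputs to the unit (assumed)
w = np.array([0.2, -0.4])     # current weights (assumed)
t, eta = 1.0, 0.5             # target output and learning rate (assumed)

y = sigmoid(np.dot(w, x))     # forward pass
# gradient of the loss 0.5*(y - t)^2 w.r.t. w, via the chain rule:
grad = (y - t) * y * (1 - y) * x
w -= eta * grad               # gradient-descent step reduces the error
print(y, w)
```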
Pros and Cons of Neural Network
• Useful for learning complex data like handwriting, speech and image recognition.

Pros
 Can learn more complicated class boundaries
 Fast application
 Can handle a large number of features
 High tolerance to noisy and incomplete data

Cons
 Slow training time: neural networks need a long time for training
 Hard to interpret and understand the learned function (weights)
 Hard to implement: trial and error for choosing the number of nodes

Conclusion: use neural nets only if decision trees fail.
Deep Learning…

What exactly is deep learning?

1. 'Deep Learning' means using a neural network with several layers of nodes between input and output.
2. The series of layers between input and output do feature identification and processing in a series of stages, just as our brains seem to.
Convolutional Neural Networks (CNNs)

• CNNs are a special kind of multi-layer neural network, designed for processing data that has an input shape like a 2D matrix, such as images.

• CNNs are typically used for image detection and classification.

• Images are 2D matrices of pixels on which we run a CNN to either recognize or classify the image.

• Example: identify whether an image is of a human being, a car, or just digits on an address.
Convolutional Neural Network Architecture

• A CNN typically has three kinds of layers:
  • Convolutional layer,
  • Pooling layer, and
  • Fully connected layer.

• The convolutional layer is the core building block of a CNN, and it is where the majority of computation occurs.
• The term convolution refers to the mathematical combination of two functions to produce a third function. It merges two sets of information.
• In the case of a CNN, the convolution is performed on the input data with the use of a filter or kernel to produce a feature map.
Convolution Operation

The convolution layer (CONV) uses filters that perform convolution operations as they scan the input with respect to its dimensions. Its hyperparameters include the filter size and stride. The resulting output is called a feature map / activation map.
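A minimal sketch of the convolution operation described above, with the filter size and stride as explicit hyperparameters (the toy image and kernel values are assumptions for illustration):

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Slide the kernel over the image, producing a feature map.
    (Strictly this is cross-correlation, which is what CNN libraries
    implement under the name 'convolution'.)"""
    kh, kw = kernel.shape
    oh = (image.shape[0] - kh) // stride + 1
    ow = (image.shape[1] - kw) // stride + 1
    fmap = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            fmap[i, j] = np.sum(patch * kernel)  # elementwise multiply, then sum
    return fmap

image = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 "image"
kernel = np.array([[1., 0.], [0., -1.]])           # toy 2x2 filter
print(conv2d(image, kernel, stride=1))             # 3x3 feature map
```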
Pooling Layer
The pooling layer is a mechanism of down-sampling. It is usually appended after convolutional layers to progressively decrease the spatial size of the feature maps.

• Max pooling takes the largest value from the window of the image currently covered by the kernel.
• Average pooling takes the average of all values in the window.
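A minimal sketch of both pooling variants with a 2×2 window and stride 2 (the feature-map values are made up):

```python
import numpy as np

def pool2d(fmap, size=2, stride=2, mode="max"):
    """Down-sample a feature map by taking the max or mean of each window."""
    oh = (fmap.shape[0] - size) // stride + 1
    ow = (fmap.shape[1] - size) // stride + 1
    out = np.zeros((oh, ow))
    reduce = np.max if mode == "max" else np.mean
    for i in range(oh):
        for j in range(ow):
            out[i, j] = reduce(fmap[i*stride:i*stride+size,
                                    j*stride:j*stride+size])
    return out

fmap = np.array([[1., 3., 2., 4.],
                 [5., 6., 1., 2.],
                 [7., 2., 9., 0.],
                 [4., 8., 3., 5.]])
print(pool2d(fmap, mode="max"))   # [[6. 4.] [8. 9.]]
print(pool2d(fmap, mode="avg"))   # [[3.75 2.25] [5.25 4.25]]
```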
The whole CNN

[Figure: input image -> Convolution -> Max Pooling (this pair can repeat many times) -> Flatten -> Fully Connected Feedforward network -> output, e.g. "cat" or "dog"]
Recurrent Neural Networks (RNN)
• A recurrent neural network (RNN) is an extension of a regular feedforward neural network that is able to handle variable-length sequential data and time-series prediction.
• Example: if you want to predict the next word in a sentence, you need to know which words came before it.
• In sequence problems, the output depends on:
  • the current input
  • the previous output
• Example: sequence order is important for part-of-speech (POS) tagging.
• Traditional neural networks cannot capture such relationships.
Typical RNN Architecture

An RNN can be seen as an MLP network with the addition of loops to the architecture.
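The loop corresponds to a hidden state that is fed back at every time step. A minimal sketch of one recurrent step, h_t = tanh(Wx·x_t + Wh·h_(t-1)); the dimensions and the random stand-in weights are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 4
Wx = rng.normal(size=(hidden_dim, input_dim))   # input-to-hidden weights
Wh = rng.normal(size=(hidden_dim, hidden_dim))  # hidden-to-hidden (the loop)

def rnn_step(x_t, h_prev):
    """One recurrent step: the new state mixes current input and previous state."""
    return np.tanh(Wx @ x_t + Wh @ h_prev)

h = np.zeros(hidden_dim)                  # initial hidden state
sequence = [rng.normal(size=input_dim) for _ in range(5)]
for x_t in sequence:                      # process a variable-length sequence
    h = rnn_step(x_t, h)
print(h)                                  # final state summarizes the sequence
```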
RNN Example: Guess part of speech (POS)

RNN Example: Sentiment Analysis

Recurrent Neural Networks: Process Sequences

• e.g. Image Captioning: image -> sequence of words
• e.g. Sentiment Classification: sequence of words -> sentiment
• e.g. Machine Translation: seq of words -> seq of words
• e.g. Video classification on frame level
RNN Applications
• Natural language processing
  • E.g. given a sequence of words, an RNN predicts the probability of the next word given the previous ones.
• Machine translation: similar to language modelling
  • E.g. Google translator (English to Amharic)
• Speech recognition
  • Given a sequence of acoustic signals as input, produce phonetic segments as output.
• Image tagging: RNN + CNN jointly trained
  • The CNN generates features (hidden state representation).
  • The RNN reads the CNN features and produces the output (end-to-end training).
• Time series prediction: forecast of future values in a time series from past seen values.
  • e.g. weather forecasts, financial time series
THANK YOU
