
UNIT – II

Chapter-II

Artificial Neural Network:


The term "Artificial Neural Network" is derived from Biological neural networks that develop
the structure of a human brain. Similar to the human brain that has neurons interconnected to
one another, artificial neural networks also have neurons that are interconnected to one another
in various layers of the networks. These neurons are known as nodes.

The first figure below illustrates a typical biological neural network, and the second a typical artificial neural network.

Dendrites in the biological neural network correspond to inputs in the artificial neural network, the cell nucleus corresponds to nodes, synapses correspond to weights, and the axon corresponds to the output.

Relationship between a biological neural network and an artificial neural network:

An artificial neural network is an artificial intelligence system that attempts to mimic the network of neurons that makes up the human brain, so that computers have an option to understand things and make decisions in a human-like manner. The artificial neural network is designed by programming computers to behave simply like interconnected brain cells.
Example:

• Consider the example of a digital logic gate that takes inputs and gives an output, such as an "OR" gate, which takes two inputs.
• If one or both of the inputs are "On," then the output is "On."
• If both of the inputs are "Off," then the output is "Off."
• Here the output depends only on the input.
• Our brain does not perform the same task.
• The output-to-input relationship keeps changing, because the neurons in our brain are "learning."
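
To make the contrast concrete, here is a minimal Python sketch of the fixed OR gate described above; unlike the neurons in our brain, its output-to-input relationship never changes.

```python
# A fixed OR gate: the output is fully determined by the inputs.
def or_gate(a: int, b: int) -> int:
    return 1 if (a == 1 or b == 1) else 0

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", or_gate(a, b))  # output is 0 only when both inputs are 0
```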

The architecture of an artificial neural network:
To understand the architecture of an artificial neural network, we first have to understand what a neural network consists of: a large number of artificial neurons, termed units, arranged in a sequence of layers. Let us look at the various types of layers available in an artificial neural network.

Artificial Neural Network primarily consists of three layers:

Input Layer:

As the name suggests, it accepts inputs in several different formats provided by the
programmer.

Hidden Layer:

The hidden layer sits between the input and output layers. It performs all the calculations needed to find hidden features and patterns.

Output Layer:

The input goes through a series of transformations in the hidden layer, which finally results in the output that is conveyed through this layer.

The artificial neural network takes the inputs, computes the weighted sum of the inputs, and includes a bias. This computation is represented in the form of a transfer function:

∑ (wi · xi) + b

The weighted total is then passed as input to an activation function to produce the output. Activation functions decide whether a node should fire or not; only those nodes that fire make it to the output layer. There are distinctive activation functions available that can be applied depending on the sort of task being performed.
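
As a rough sketch of that computation in Python (the inputs, weights, bias, and step activation below are illustrative assumptions, not a fixed standard):

```python
import numpy as np

# One artificial neuron: weighted sum of inputs plus a bias (the transfer
# function), passed through an activation function that decides firing.
def neuron(inputs, weights, bias):
    weighted_sum = np.dot(inputs, weights) + bias  # transfer function
    return 1.0 if weighted_sum > 0 else 0.0        # step activation: fire or not

output = neuron(np.array([0.5, 0.3]), np.array([0.4, -0.2]), bias=0.1)
print(output)  # 1.0, because 0.5*0.4 + 0.3*(-0.2) + 0.1 = 0.24 > 0
```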

Working of Artificial neural networks:
An artificial neural network can best be represented as a weighted directed graph, where the artificial neurons form the nodes and the associations between neuron outputs and neuron inputs can be viewed as directed edges with weights. The artificial neural network receives the input signal from an external source in the form of a pattern or image, expressed as a vector. These inputs are then mathematically denoted x(1), x(2), …, x(n) for n inputs.

Afterward, each input is multiplied by its corresponding weight (these weights are the details the artificial neural network uses to solve a specific problem). In general terms, these weights represent the strength of the interconnections between neurons inside the artificial neural network. All the weighted inputs are summed inside the computing unit.

If the weighted sum is zero, a bias is added to make the output non-zero, or otherwise to scale up the system's response; the bias can be viewed as an extra input fixed at 1 with its own weight. The total of the weighted inputs can lie anywhere in the range from 0 to positive infinity. To keep the response within the limits of the desired value, a certain maximum value is benchmarked, and the total of the weighted inputs is passed through the activation function.

The activation function refers to the set of transfer functions used to achieve the desired output. There are different kinds of activation functions, primarily either linear or non-linear sets of functions. Some of the commonly used activation functions are the binary, linear, and tan hyperbolic sigmoidal activation functions. Let us take a look at each of them in detail:

Binary:

In a binary activation function, the output is either a one or a 0. To accomplish this, a threshold value is set up: if the net weighted input of the neuron exceeds the threshold, the final output of the activation function is returned as 1; otherwise the output is returned as 0.
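
A minimal sketch of the binary activation function, with the threshold value as an assumed parameter:

```python
# Binary (threshold) activation: returns 1 when the net weighted input
# exceeds the threshold, otherwise 0. The default threshold is an assumption.
def binary_activation(net_input: float, threshold: float = 1.0) -> int:
    return 1 if net_input > threshold else 0

print(binary_activation(1.3))  # 1: above the threshold
print(binary_activation(0.7))  # 0: below the threshold
```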

Sigmoidal Hyperbolic:

The sigmoidal hyperbola function is generally seen as an "S"-shaped curve. Here the tan hyperbolic function is used to approximate the output from the actual net input. The function is defined as:

F(x) = 1 / (1 + exp(-λx))

where λ is considered the steepness parameter.
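
A small sketch of this function in Python, with the steepness parameter λ as a keyword argument:

```python
import math

# Sigmoid activation F(x) = 1 / (1 + exp(-steepness * x)); the steepness
# parameter controls how sharply the "S" curve rises.
def sigmoid(x: float, steepness: float = 1.0) -> float:
    return 1.0 / (1.0 + math.exp(-steepness * x))

print(sigmoid(0.0))               # 0.5: the curve's midpoint
print(sigmoid(2.0, steepness=5))  # close to 1 with a steeper curve
```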

Appropriate problems for neural network learning:

1. Instances are represented by many attribute-value pairs.

2. The target function may have discrete values, continuous values, or a combination of both.

3. The training examples may contain errors or missing values.

4. Long training times are acceptable.

5. Fast evaluation of the learned target function may be required.

6. The ability of humans to understand the target function learned by the machine is not important.

Perceptrons:
The perceptron is a building block of an Artificial Neural Network. In the mid-20th century (1957), Frank Rosenblatt invented the perceptron to perform certain calculations that detect capabilities in the input data.

The perceptron is a linear machine learning algorithm used for the supervised learning of binary classifiers.

This algorithm enables neurons to learn, processing the training elements one at a time.

The perceptron model is also treated as one of the best and simplest types of artificial neural networks; it is a supervised learning algorithm for binary classifiers.

Hence, we can consider it a single-layer neural network with four main parameters: input values, weights and bias, net sum, and an activation function.

Basic Components of Perceptron:

Frank Rosenblatt designed the perceptron model as a binary classifier containing three main components. These are as follows:

o Input Nodes or Input Layer:

This is the primary component of the perceptron, which accepts the initial data into the system for further processing. Each input node contains a real numerical value.

o Weight and Bias:

The weight parameter represents the strength of the connection between units and is another important parameter of the perceptron's components. Weight is directly proportional to the strength of the associated input neuron in deciding the output. Further, the bias can be considered as the intercept term in a linear equation.
o Activation Function:

These are the final and most important components, which help to determine whether the neuron will fire or not. The activation function can be considered primarily as a step function.

Types of Activation functions:

o Sign function

o Step function, and

o Sigmoid function

The data scientist chooses the activation function based on the problem statement and the desired form of output. The activation function used may differ (e.g., sign, step, or sigmoid) across perceptron models, chosen by checking whether the learning process is slow or has vanishing or exploding gradients.
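
The three activation functions listed above can be sketched in a few lines of Python (illustrative only):

```python
import math

def sign_fn(x):      # Sign: -1 or +1
    return 1 if x >= 0 else -1

def step_fn(x):      # Step: 0 or 1
    return 1 if x >= 0 else 0

def sigmoid_fn(x):   # Sigmoid: smooth value in (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

for f in (sign_fn, step_fn, sigmoid_fn):
    print(f.__name__, f(-2.0), f(2.0))
```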

Working of perceptron:
In machine learning, the perceptron is considered a single-layer neural network that consists of four main parameters: input values (input nodes), weights and bias, net sum, and an activation function. The perceptron model begins by multiplying all input values by their weights, then adds these values together to create the weighted sum. This weighted sum is then applied to the activation function 'f' to obtain the desired output. This activation function is also known as the step function and is represented by 'f'.

This step function or activation function plays a vital role in ensuring that the output is mapped between required values such as (0, 1) or (-1, 1). It is important to note that the weight of an input is indicative of the strength of a node. Similarly, an input's bias value gives the ability to shift the activation function curve up or down.

The perceptron model works in two important steps, as follows:

Step-1

In the first step, multiply all input values by their corresponding weight values and then add the products to determine the weighted sum. Mathematically, we can calculate the weighted sum as follows:

∑ (wi · xi) = x1·w1 + x2·w2 + ... + xn·wn

Add a special term called the bias 'b' to this weighted sum to improve the model's performance:

z = ∑ (wi · xi) + b

Step-2

In the second step, an activation function f is applied to the above-mentioned weighted sum, which gives us an output either in binary form or as a continuous value, as follows:

Y = f(∑ (wi · xi) + b)
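
Putting the two steps together, a minimal sketch (with illustrative inputs, weights, and bias) looks like this:

```python
import numpy as np

# Step-1: weighted sum plus bias. Step-2: step activation function.
def perceptron_output(x, w, b):
    z = np.dot(w, x) + b        # Step-1: weighted sum + bias
    return 1 if z >= 0 else 0   # Step-2: activation (step function)

print(perceptron_output(np.array([1.0, 0.0]), np.array([0.6, 0.6]), b=-0.5))  # 1
```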

Types of Perceptron Models:

Based on the layers, Perceptron models are divided into two types. These are as follows:

1. Single-layer Perceptron Model

2. Multi-layer Perceptron model

Single-layer Perceptron Model:

This is one of the easiest types of artificial neural networks (ANNs). A single-layer perceptron model consists of a feed-forward network and also includes a threshold transfer function inside the model. The main objective of the single-layer perceptron model is to analyze linearly separable objects with binary outcomes.

In a single-layer perceptron model, the algorithm does not have prior recorded data, so it begins with randomly allocated values for the weight parameters. It then sums up all the weighted inputs. If the total sum of all the inputs is more than a pre-determined value, the model is activated and shows the output value as +1.

If the outcome matches the pre-determined threshold value, the performance of this model is stated as satisfactory, and the weights are left unchanged. However, this model runs into discrepancies when multiple weighted input values are fed into it. Hence, to reach the desired output and minimize errors, some changes to the weights may be necessary.
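
The sketch below illustrates this weight-adjustment idea on the linearly separable OR problem, using the classic perceptron learning rule; the starting weights, learning rate, and epoch count are assumptions:

```python
import numpy as np

# Single-layer perceptron training on the linearly separable OR problem.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])            # OR targets
w, b, lr = np.zeros(2), 0.0, 0.1      # assumed starting weights and learning rate

for epoch in range(10):
    for xi, target in zip(X, y):
        pred = 1 if np.dot(w, xi) + b >= 0 else 0
        error = target - pred          # 0 when the prediction is correct
        w += lr * error * xi           # perceptron learning rule
        b += lr * error

print(w, b)  # weights that separate the OR classes after training
```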

Multi-layer Perceptron model:

Like a single-layer perceptron model, a multi-layer perceptron model has the same overall structure but a greater number of hidden layers.

The multi-layer perceptron model is trained with the backpropagation algorithm, which executes in two stages as follows:
o Forward Stage: In the forward stage, activations flow from the input layer and terminate at the output layer.

o Backward Stage: In the backward stage, weight and bias values are modified as per the model's requirement; the error between the actual and the demanded output is propagated backward, originating at the output layer and ending at the input layer.

Hence, a multi-layer perceptron model can be considered a deeper artificial neural network with multiple layers, in which the activation function need not remain linear, unlike in a single-layer perceptron model. Instead of a linear function, the activation function can be a sigmoid, tanh, ReLU, etc.

A multi-layer perceptron model has greater processing power and can process both linear and non-linear patterns. Further, it can also implement logic gates such as AND, OR, XOR, NAND, NOT, XNOR, and NOR.
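
As an illustration of that extra power, the sketch below implements XOR, which no single-layer perceptron can represent. The weights are hand-chosen for the example rather than learned: one hidden unit computes OR, the other NAND, and the output unit ANDs them together.

```python
import numpy as np

step = lambda z: (z >= 0).astype(int)   # step activation

def xor_mlp(x):
    # Hidden layer: h1 = OR(x1, x2), h2 = NAND(x1, x2)
    h = step(np.array([[1, 1], [-1, -1]]) @ x + np.array([-1, 1.5]))
    # Output layer: AND(h1, h2) = XOR(x1, x2)
    return step(np.array([1, 1]) @ h - 1.5)

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", xor_mlp(np.array(x)))  # 0, 1, 1, 0
```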

Backpropagation Algorithm:
Backpropagation is a technique used in deep learning to train artificial neural networks, particularly feed-forward networks. It works iteratively to adjust the weights and biases to minimize the cost function.

Backpropagation often uses optimization algorithms like gradient descent or stochastic gradient descent. The algorithm computes the gradient using the chain rule from calculus, allowing it to effectively navigate the complex layers of the neural network to minimize the cost function.
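
As a minimal sketch of gradient descent itself, here is the update rule applied to a toy cost function C(w) = (w - 3)^2; the starting point, learning rate, and iteration count are illustrative:

```python
# Gradient descent on the toy cost C(w) = (w - 3)^2, whose gradient is 2*(w - 3).
w, lr = 0.0, 0.1            # assumed starting point and learning rate
for _ in range(50):
    grad = 2 * (w - 3)      # gradient of the cost at the current w
    w -= lr * grad          # update rule: w = w - lr * dC/dw
print(round(w, 4))          # approaches 3.0, the minimum of C
```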

Figure: how backpropagation works by adjusting the weights.

Working of Backpropagation Algorithm:
The Backpropagation algorithm involves two main steps: the Forward Pass and
the Backward Pass.
Working of Forward Pass:

In the forward pass, the input data is fed into the input layer. These inputs, combined with their respective weights, are passed to the hidden layers. For example, in a network with two hidden layers (h1 and h2, as shown in Fig. (a)), the output from h1 serves as the input to h2. Before the activation function is applied, a bias is added to the weighted inputs.

Each hidden layer computes the weighted sum (`a`) of its inputs, then applies an activation function like ReLU (Rectified Linear Unit) to obtain the output (`o`). The output is passed to the next layer, where an activation function such as softmax converts the weighted outputs into probabilities for classification.
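
A sketch of this forward pass in Python follows; the layer sizes, random weights, and zero biases are assumptions made for the example:

```python
import numpy as np

def relu(a):                     # activation for the hidden layers
    return np.maximum(0, a)

def softmax(a):                  # converts output-layer scores to probabilities
    e = np.exp(a - a.max())
    return e / e.sum()

rng = np.random.default_rng(0)
x = np.array([0.5, -1.2, 0.3])                   # input layer
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)    # input -> h1
W2, b2 = rng.normal(size=(4, 4)), np.zeros(4)    # h1 -> h2
W3, b3 = rng.normal(size=(2, 4)), np.zeros(2)    # h2 -> output

o1 = relu(W1 @ x + b1)           # h1: weighted sum `a`, then activation -> `o`
o2 = relu(W2 @ o1 + b2)          # h2 takes h1's output as its input
probs = softmax(W3 @ o2 + b3)    # output layer: probabilities for 2 classes
print(probs, probs.sum())        # the probabilities sum to 1
```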

Figure: the forward pass using weights and biases.

Working of Backward Pass:

In the backward pass, the error (the difference between the predicted and actual output) is propagated back through the network to adjust the weights and biases. One common method for error calculation is the Mean Squared Error (MSE), given by:

MSE = (1/n) ∑ (y_predicted − y_actual)²

Once the error is calculated, the network adjusts the weights using gradients, which are computed with the chain rule. These gradients indicate how much each weight and bias should be adjusted to minimize the error in the next iteration. The backward pass continues layer by layer, ensuring that the network learns and improves its performance. The activation function, through its derivative, plays a crucial role in computing these gradients during backpropagation.
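
The sketch below works through one such backward pass by hand for a single sigmoid output neuron trained on MSE; the inputs, weights, and learning rate are assumptions:

```python
import numpy as np

sigmoid = lambda a: 1 / (1 + np.exp(-a))

x = np.array([0.4, 0.7])                 # assumed inputs
w, b = np.array([0.1, -0.3]), 0.05       # assumed weights and bias
y_target, lr = 1.0, 0.5                  # assumed target and learning rate

y = sigmoid(w @ x + b)                   # forward pass
dE_dy = 2 * (y - y_target)               # d(MSE)/dy for a single example
dy_da = y * (1 - y)                      # derivative of the sigmoid activation
grad_w = dE_dy * dy_da * x               # chain rule: dE/dw
grad_b = dE_dy * dy_da                   # chain rule: dE/db

w, b = w - lr * grad_w, b - lr * grad_b  # adjust weights to reduce the error
print(y, sigmoid(w @ x + b))             # the new output moves toward the target
```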

Example of Backpropagation in Machine Learning

Assume the neurons use the sigmoid activation function for both the forward and backward pass. The target output is 0.5, and the learning rate is 1.

Forward Propagation

1. Initial Calculation

The weighted sum at each node j is calculated as:

a_j = ∑_i (w_{i,j} · x_i) + b_j

where w_{i,j} is the weight from node i to node j, x_i is the input from node i, and b_j is the bias.

After applying the activation function to a_j, we get the output of the neuron:

y_j = F(a_j)

2. Sigmoid Function

The sigmoid function returns a value between 0 and 1, introducing non-linearity into the model:

F(x) = 1 / (1 + e^(−x))

3. Computing Outputs

To find the outputs y3, y4, and y5: at node h1, the weighted sum a1 is computed first; applying the sigmoid function to a1 gives y3. The values of y4 at h2 and y5 at O3 are found in the same way (the numeric values follow from the inputs and initial weights shown in the network figure).

4. Error Calculation

Our target output is 0.5, but we obtained 0.67. The error is the difference between the target and the predicted output:

Error = y_target − y5 = 0.5 − 0.67 = −0.17

Using this error value, we will be backpropagating.

Backpropagation

1. Calculating Gradients

The change in each weight is calculated as:

Δw_{i,j} = η · δ_j · y_i

where η is the learning rate, δ_j is the error term of the receiving node, and y_i is the output of the sending node.

2. Output Unit Error

For O3, the sigmoid derivative y5(1 − y5) scales the raw error:

δ5 = y5 · (1 − y5) · (y_target − y5)

3. Hidden Unit Error

For h1:

δ3 = y3 · (1 − y3) · w_{3,5} · δ5

For h2:

δ4 = y4 · (1 − y4) · w_{4,5} · δ5

4. Weight Updates

For the weights from the hidden layer to the output layer:

Δw = η · δ5 · y_hidden

New weight:

w_new = w_old + Δw

For the weights from the input layer to the hidden layer:

Δw = η · δ_hidden · x_input

New weight: the same update rule, w_new = w_old + Δw.

The other weights are updated similarly; the updated weights are illustrated in the figure below.

After updating the weights, the forward pass is repeated, yielding a new output. Since this is still not the target output, the process of calculating the error and backpropagating continues until the desired output is reached.

This process demonstrates how backpropagation iteratively updates the weights by minimizing the error until the network accurately predicts the output.
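
The whole worked example can be reproduced end to end with a short script. The sketch below uses a 2-2-1 sigmoid network with target output 0.5 and learning rate 1, as stated above, but the inputs and initial weights are assumptions (the original values came from the network figure), so the intermediate numbers will differ from the document's:

```python
import numpy as np

sigmoid = lambda a: 1 / (1 + np.exp(-a))

# Assumed inputs and initial weights (the document's values are in the figure).
x = np.array([0.35, 0.7])
W_h = np.array([[0.2, 0.3],          # input -> h1
                [0.4, 0.1]])         # input -> h2
w_o = np.array([0.5, 0.6])           # h1, h2 -> O3
target, lr = 0.5, 1.0                # as stated in the example

for step in range(1000):
    # Forward pass
    y_hidden = sigmoid(W_h @ x)                          # y3, y4
    y5 = sigmoid(w_o @ y_hidden)                         # network output
    if abs(target - y5) < 1e-3:
        break
    # Backward pass: sigmoid deltas, as in the formulas above
    delta5 = y5 * (1 - y5) * (target - y5)               # output unit error
    delta_h = y_hidden * (1 - y_hidden) * w_o * delta5   # hidden unit errors
    # Weight updates: w_new = w_old + lr * delta * input
    w_o = w_o + lr * delta5 * y_hidden
    W_h = W_h + lr * np.outer(delta_h, x)

print(step, y5)  # the output converges toward the 0.5 target
```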

Remarks on the backpropagation algorithm:


Convergence and local minima:

Backpropagation over multilayer networks is only guaranteed to converge toward some local minimum of the error E, not necessarily to the global minimum error.

When gradient descent falls into a local minimum with respect to one of those weights, it will
not necessarily be in a local minimum with respect to the other weights.

A second perspective on local minima can be gained by considering the manner in which
network weights evolve as the number of training iterations increases.

An illustrative example: Face recognition:
Face recognition using Artificial Intelligence (AI) is a computer vision technology used to identify a person or object from an image or video. It uses a combination of techniques, including deep learning, computer vision algorithms, and image processing. These technologies enable a system to detect, recognize, and verify faces in digital images or videos.

Image Processing and Machine learning

Image processing by computers involves computer vision, which deals with high-level understanding of digital images or videos. The requirement is to automate tasks that the human visual system can do, so a computer should be able to recognize objects such as a human face, a lamppost, or even a statue.
Image reading:

The computer reads any image as a range of values between 0 and 255. For a color image, there are three primary colors: red, green, and blue. A matrix is formed for each primary color, and these matrices combine to provide the pixel values for the individual R, G, and B channels. Each element of the matrices provides data about the intensity of the brightness of the pixel.

OpenCV is a computer vision library with Python bindings that is designed to solve computer vision problems. OpenCV was originally developed in 1999 by Intel and later supported by Willow Garage.
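
A minimal sketch of image reading with OpenCV; "face.jpg" is a hypothetical file name, and note that OpenCV loads color images in BGR channel order:

```python
import cv2  # pip install opencv-python

img = cv2.imread("face.jpg")   # hypothetical file; OpenCV loads color as BGR
if img is None:
    raise FileNotFoundError("face.jpg not found")

print(img.shape)               # (height, width, 3): one matrix per color channel
print(img[0, 0])               # B, G, R intensities (0-255) of the top-left pixel
b, g, r = cv2.split(img)       # the three primary-color matrices
```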
Machine learning

Every machine learning algorithm takes a dataset as input and learns from that data: it identifies patterns in the provided inputs and outputs and derives the desired model. For instance, to identify whose face is present in a given image, multiple things can be looked at as a pattern:

• Height/width of the face.

• Height and width may not be reliable, since the image could be rescaled to a smaller face or grid. However, even after rescaling, the ratios remain unchanged: the ratio of the height of the face to the width of the face won't change.

• Color of the face.


• Width of other parts of the face like lips, nose, etc.

There is a pattern involved: different faces have different dimensions like the ones above, while similar faces have similar dimensions. The challenge is that machine learning algorithms only understand numbers. This numerical representation of a "face" (or an element in the training set) is termed a feature vector. A feature vector comprises various numbers in a specific order.

As a simple example, we can map a “face” into a feature vector which can comprise various
features like:

• Height of face (cm)

• Width of the face (cm)

• Average color of face (R, G, B)

• Width of lips (cm)

• Height of nose (cm)

Essentially, given an image, we can convert it into a feature vector like:

Height of face (cm) | Width of face (cm) | Average color of face (R, G, B) | Width of lips (cm) | Height of nose (cm)
23.1                | 15.8               | (255, 224, 189)                 | 5.2                | 4.4

So the image is now a vector that could be represented as (23.1, 15.8, 255, 224, 189, 5.2, 4.4). Countless other features could be derived from the image, for instance hair color, facial hair, spectacles, etc.

Machine learning performs two major functions in face recognition technology. These are given below:

1. Deriving the feature vector: it is difficult to manually list all of the features because there are just so many. A machine learning algorithm can intelligently extract many such features. For instance, a complex feature could be the ratio of the height of the nose to the width of the forehead.

2. Matching algorithms: once the feature vectors have been obtained, a machine learning algorithm needs to match a new image against the set of feature vectors present in the corpus, as the sketch below illustrates.
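
A minimal sketch of the matching idea from item 2: each face is a feature vector, and a new face is matched to its nearest neighbor in the corpus. The names and vectors below are illustrative assumptions.

```python
import numpy as np

# Hypothetical corpus: name -> feature vector
# (face height, face width, R, G, B, lip width, nose height).
corpus = {
    "alice": np.array([23.1, 15.8, 255, 224, 189, 5.2, 4.4]),
    "bob":   np.array([21.0, 14.2, 210, 180, 160, 4.8, 5.1]),
}
new_face = np.array([23.0, 15.9, 250, 220, 190, 5.1, 4.5])

# Euclidean distance: the smallest distance is the best match.
match = min(corpus, key=lambda name: np.linalg.norm(corpus[name] - new_face))
print(match)  # "alice"
```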

Face Recognition Operations:
The technology may vary from system to system; different software applies different methods and means to achieve face recognition. The stepwise method is as follows:

Face Detection:

To begin with, the camera will detect and recognize a face. The face is best detected when the person is looking directly at the camera, as this makes facial recognition easy. With advancements in technology, this has improved so that the face can be detected even with slight variations in posture relative to the camera.
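
A minimal detection sketch using OpenCV's bundled Haar cascade classifier; "photo.jpg" is a hypothetical input file:

```python
import cv2  # pip install opencv-python

# Load OpenCV's bundled frontal-face Haar cascade.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("photo.jpg")        # hypothetical input file
if img is None:
    raise FileNotFoundError("photo.jpg not found")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Returns one (x, y, w, h) rectangle per detected face.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detected.jpg", img)
print(f"{len(faces)} face(s) detected")
```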

Face Analysis:

Then the photo of the face is captured and analyzed. Most facial recognition relies on 2D images rather than 3D because 2D images are more convenient to match against the database. Facial recognition software analyzes features such as the distance between your eyes and the shape of your cheekbones.

Image to Data Conversion:

Now the face is converted to a mathematical representation, and these facial features become numbers. This numerical code is known as a faceprint. Just as every person has a unique fingerprint, they also have a unique faceprint.
Match Finding:

Then the code is compared against a database of other faceprints. This database contains photos with identification that can be compared. The technology then identifies a match for your exact features in the provided database and returns the match with its attached information, such as a name and address, depending on the information saved in the database for that individual.

