
MODULE 5

Classification models

What is a neural network?

Neural networks are computational models that mimic the complex functions
of the human brain. They consist of interconnected nodes, or neurons, that
process and learn from data, enabling tasks such as pattern recognition and
decision-making in machine learning.

Evolution of Neural Networks:


Since the 1940s, there have been a number of noteworthy advancements in the
field of neural networks:
• 1940s-1950s: Early Concepts
Neural networks began with the introduction of the first
mathematical model of artificial neurons by McCulloch and Pitts.
But computational constraints made progress difficult.
• 1960s-1970s: Perceptrons
This era is defined by Rosenblatt's work on perceptrons.
Perceptrons are single-layer networks whose applicability was
limited to problems that are linearly separable.
• 1980s: Backpropagation and Connectionism
Multi-layer network training was made possible by Rumelhart,
Hinton, and Williams’ invention of the backpropagation method.
With its emphasis on learning through interconnected nodes,
connectionism gained appeal.
• 1990s: Boom and Winter
With applications in image identification, finance, and other fields,
neural networks saw a boom. Neural network research did, however,
experience a “winter” due to exorbitant computational costs and
inflated expectations.
• 2000s: Resurgence and Deep Learning
Larger datasets, innovative structures, and enhanced processing
capability spurred a comeback. Deep learning has shown amazing
effectiveness in a number of disciplines by utilizing numerous layers.
• 2010s-Present: Deep Learning Dominance
Convolutional neural networks (CNNs) and recurrent neural
networks (RNNs), two deep learning architectures, came to dominate
machine learning. Their power was demonstrated by innovations in
gaming, image recognition, and natural language processing.

What are Neural Networks?

Neural networks extract identifying features from data without
pre-programmed understanding. Network components include neurons,
connections, weights, biases, propagation functions, and a learning rule.
Neurons receive inputs governed by thresholds and activation functions.
Connections involve weights and biases that regulate information transfer.
Learning, which adjusts weights and biases, occurs in three stages: input
computation, output generation, and iterative refinement that enhances the
network's proficiency in diverse tasks.

These stages are:
1. The neural network is stimulated by a new environment.
2. The free parameters of the neural network are changed as a
result of this stimulation.
3. The neural network then responds in a new way to the environment
because of the changes in its free parameters.



Importance of Neural Networks

The ability of neural networks to identify patterns, solve intricate puzzles, and
adjust to changing surroundings is essential. Their capacity to learn from data
has far-reaching effects, ranging from revolutionizing technology like natural
language processing and self-driving automobiles to automating decision-
making processes and increasing efficiency in numerous industries. The
development of artificial intelligence is largely dependent on neural networks,
which also drive innovation and influence the direction of technology.

How do Neural Networks work?

Consider a neural network for email classification. The input layer takes features
like email content, sender information, and subject. These inputs, multiplied by
adjusted weights, pass through hidden layers. The network, through training,
learns to recognize patterns indicating whether an email is spam or not. The
output layer, with a binary activation function, predicts whether the email is
spam (1) or not (0). As the network iteratively refines its weights through
backpropagation, it becomes adept at distinguishing between spam and
legitimate emails, showcasing the practicality of neural networks in real-world
applications like email filtering.

Working of a Neural Network:


Neural networks are complex systems that mimic some features of the
functioning of the human brain. A network is composed of an input layer,
one or more hidden layers, and an output layer, each made up of coupled
artificial neurons. The basic process has two stages: forward propagation
and backpropagation.



Forward Propagation
• Input Layer: Each feature in the input layer is represented by a node
on the network, which receives input data.
• Weights and Connections: The weight of each neuronal connection
indicates how strong the connection is. Throughout training, these
weights are changed.
• Hidden Layers: Each hidden layer neuron processes inputs by
multiplying them by weights, adding them up, and then passing them
through an activation function. By doing this, non-linearity is
introduced, enabling the network to recognize intricate patterns.
• Output: The final result is produced by repeating the process until
the output layer is reached.
Backpropagation
• Loss Calculation: The network's output is evaluated against the real
target values, and a loss function is used to compute the difference.
For a regression problem, the Mean Squared Error (MSE) is
commonly used as the cost function:
MSE = (1/n) Σ (y_i − ŷ_i)²
• Gradient Descent: Gradient descent is then used by the network to
reduce the loss. To lower the inaccuracy, weights are changed based
on the derivative of the loss with respect to each weight.
• Adjusting weights: The weights are adjusted at each connection by
applying this iterative process, or backpropagation, backward across
the network.
• Training: During training with different data samples, the entire
process of forward propagation, loss calculation, and
backpropagation is done iteratively, enabling the network to adapt
and learn patterns from the data.
• Activation Functions: Non-linearity is introduced by
activation functions like the rectified linear unit (ReLU) or sigmoid.
Whether a neuron "fires" is decided based on its total
weighted input.
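
To make the two stages concrete, here is a minimal numpy sketch of one forward pass, an MSE loss, and one backpropagation update for a tiny 2-3-1 network. The data, initialization, and learning rate are illustrative assumptions, not values from the text:

```python
import numpy as np

# A minimal sketch, assuming a tiny 2-3-1 network with sigmoid
# activations and MSE loss. The data, initialization, and learning
# rate below are illustrative, not values from the text.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = np.array([[0.5, 1.0]])   # one sample with two features
y = np.array([[1.0]])        # target value

W1 = rng.normal(size=(2, 3)); b1 = np.zeros(3)  # input -> hidden
W2 = rng.normal(size=(3, 1)); b2 = np.zeros(1)  # hidden -> output
lr = 0.1                     # learning rate

# Forward propagation: weighted sums pass through activations.
h = sigmoid(X @ W1 + b1)     # hidden layer activations
y_hat = sigmoid(h @ W2 + b2) # network output

loss = np.mean((y - y_hat) ** 2)  # MSE loss

# Backpropagation: the chain rule gives the gradient of the loss
# with respect to each weight, layer by layer, from output to input.
# (The constant factor 2 from the MSE derivative is folded into lr.)
d_out = (y_hat - y) * y_hat * (1 - y_hat)  # error at the output layer
dW2 = h.T @ d_out
d_hid = (d_out @ W2.T) * h * (1 - h)       # error propagated to hidden layer
dW1 = X.T @ d_hid

# Gradient descent: move each weight against its gradient.
W2 -= lr * dW2; b2 -= lr * d_out.sum(axis=0)
W1 -= lr * dW1; b1 -= lr * d_hid.sum(axis=0)
print(loss)
```

Repeating these steps over many data samples is exactly the training loop described above.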

Types of Neural Networks:
Commonly used types of neural networks include:
• Feedforward Networks: A feedforward neural network is a simple
artificial neural network architecture in which data moves from input
to output in a single direction. It has input, hidden, and output layers;
feedback loops are absent. Its straightforward architecture makes it
appropriate for a number of applications, such as regression and
pattern recognition.



• Multilayer Perceptron (MLP): MLP is a type of feedforward neural
network with three or more layers, including an input layer, one or
more hidden layers, and an output layer. It uses nonlinear activation
functions.
• Convolutional Neural Network (CNN): A Convolutional Neural
Network (CNN) is a specialized artificial neural network designed
for image processing. It employs convolutional layers to
automatically learn hierarchical features from input images, enabling
effective image recognition and classification. CNNs have
revolutionized computer vision and are pivotal in tasks like object
detection and image analysis.
• Recurrent Neural Network (RNN): A Recurrent Neural Network
(RNN) is an artificial neural network type intended for sequential
data processing. It is appropriate for applications where
contextual dependencies are critical, such as time series prediction
and natural language processing, since it makes use of feedback
loops, which enable information to persist within the network.
• Long Short-Term Memory (LSTM): LSTM is a type of RNN that
is designed to overcome the vanishing gradient problem in training
RNNs. It uses memory cells and gates to selectively read, write, and
erase information.

Artificial Neural Network and Biological Neural Network

Artificial Neural Networks and Biological Neural Networks are both forms of neural
networks. The primary distinction between them is that in an artificial neural
network, memory is stored separately from the CPU, whereas in a biological neural
network, distributed memory is located within the neural interconnections.

The biological processes of the human nervous system (the brain) serve as the
foundation for neural computers. Neural computing entails substantial parallel
processing and self-learning, similar to a brain, which is made feasible by the
brain's neural network. A neural network is simply a collection of processing
elements that are interconnected in a web-like fashion and can produce outcomes
after receiving input.

In this article, you will learn the difference between Artificial Neural
Networks and Biological Neural Networks. But before discussing the differences, you
must know about Artificial Neural Networks and Biological Neural Networks along with
their advantages and disadvantages.

What is an Artificial Neural Network?


An artificial neural network is a mathematical model inspired mainly by
the biological neuron system in the human brain. The neural network is made up of
a large number of processing elements that are linked together by weighted paths
to form networks. The output of every element is computed by applying a non-linear
function to its weighted inputs. When these processing elements are combined
into networks, they can perform arbitrarily complicated non-linear functions such as
classification, prediction, or optimization.

Advantages and Disadvantages of Artificial Neural Network


There are various advantages and disadvantages of artificial neural networks. Some
advantages and disadvantages of artificial neural networks are as follows:

Advantages

1. Artificial neural networks may be trained using vast datasets and can generalize
from them, allowing them to make pattern-based predictions and judgments.
2. ANNs may be improved and employed efficiently on hardware accelerators or
dedicated AI processors like GPUs and AI accelerators for quick and parallel
processing.
3. Another advantage of ANNs is that they continue to function even in the presence of
noise or errors in data. As a result, they are appropriate in scenarios involving noisy,
partial, or distorted data.
4. They are also non-linear in nature, which enables them to represent complex data
relationships and patterns. They can also be customized to handle various sorts of data
and perform various activities.
5. They are capable of extracting features from data, which removes the need for manual
feature engineering. They can also be trained to handle many tasks at once. As a result,
they may be utilized in advanced AI applications.

Disadvantages



1. Artificial neural networks may grow overly complex due to their architecture and the
massive amounts of data used to train them. They can memorize the training data, which
may result in poor generalization to new data.
2. Artificial neural networks require suitable hardware components like central processors
or dedicated AI accelerators, vast storage spaces, and massive random access memory.
3. Their working principles and even outcomes can be difficult to grasp because of the
complexities of ANNs. Some people may find it difficult to comprehend their decision-
making processes.
4. No explicit rule determines the structure of an ANN. A proper network structure is
obtained by trial and error.
5. They are also susceptible to adversarial instances or slight changes in input data. These
modifications may cause the artificial neural network to make wrong decisions and
produce irrelevant results.

What is a Biological Neural Network?


A biological neural network is also composed of several processing elements, known
as neurons, that are linked together via synapses. These neurons accept either
external input or the results of other neurons. The output generated by the
individual neurons propagates through the entire network to the last layer, where
the results can be presented to the outside world.

Every synapse has a processing value and a weight that are determined during network
training. The performance and potency of the network fully depend on the number of
neurons in the network, how they are connected to each other (i.e., the topology),
and the weights assigned to every synapse.

Advantages and Disadvantages of Biological Neural Network

There are various advantages and disadvantages of the biological neural network.
Some of them are as follows:

Advantages

1. It can handle extremely complex parallel inputs.
2. The synapses serve as the input processing elements.

Disadvantages



1. As it is complex, the processing speed is slow.
2. There is no controlling mechanism in this network.

Some main differences between the Artificial Neural Network and Biological Neural
Network are as follows:

1. An ANN is a mathematical model inspired mainly by the biological neuron
system in the human brain. In contrast, a biological neural network is composed
of processing elements known as neurons that are linked together via synapses.
2. An artificial neural network's processing is sequential and centralized. In contrast, a
biological neural network processes information in parallel and in a distributed manner.
3. The artificial neural network is of a much smaller size than the biological neural
network. In contrast, the biological neural network is large in size.
4. The biological neural network is fault tolerant. In contrast, the artificial neural network
is not fault tolerant.
5. The processing speed of an artificial neural network is in the nanosecond range, which
is faster than the biological neural network, where the cycle time associated with a
neural event triggered by an external input is in the millisecond range.
6. A BNN can handle more difficult problems than an artificial neural network.
7. The operating environment of the artificial neural network is well-defined and well-
constrained. In contrast, the operating environment of the biological neural network is
poorly defined and unconstrained.
8. The artificial neural network is very vulnerable to failures. In contrast, the biological
neural network is robust.

Comparison between Artificial Neural Network and Biological Neural Network

Here is a head-to-head comparison between Artificial Neural Networks and
Biological Neural Networks. The main differences are as follows:

Feature | Artificial Neural Network | Biological Neural Network
Definition | A mathematical model inspired mainly by the biological neuron system in the human brain. | Composed of processing elements known as neurons that are linked together via synapses.
Processing | Sequential and centralized. | Parallel and distributed.
Size | Small. | Large.
Control Mechanism | A control unit keeps track of all operations. | There is no central control; processing is distributed across the network.
Rate | Processes information at a faster speed. | Processes information at a slower speed.
Complexity | Cannot perform complex pattern recognition. | The large quantity and complexity of the connections allow the brain to perform complicated tasks.
Feedback | Does not provide feedback. | Provides feedback.
Fault tolerance | No fault tolerance. | Fault tolerant.
Operating Environment | Well-defined and well-constrained. | Poorly defined and unconstrained.
Memory | Separate from the processor, localized, and non-content-addressable. | Integrated into the processor, distributed, and content-addressable.
Reliability | Very vulnerable. | Robust.
Learning | Requires accurate structures and formatted data. | Tolerant of ambiguity.
Response time | Measured in nanoseconds. | Measured in milliseconds.



1. McCulloch-Pitts Model of Neuron
The McCulloch-Pitts neural model, which was the earliest ANN model, has
only two types of inputs — Excitatory and Inhibitory. The excitatory inputs
have weights of positive magnitude and the inhibitory inputs have weights of
negative magnitude. The inputs of the McCulloch-Pitts neuron could be either
0 or 1. It has a threshold function as an activation function. So, the output
signal y_out is 1 if the input y_sum is greater than or equal to a given threshold
value, and 0 otherwise.

[Figure: McCulloch-Pitts Model]

Simple McCulloch-Pitts neurons can be used to design logical operations. For
that purpose, the connection weights and the threshold value of the activation
function need to be chosen correctly. For better understanding, consider an example:
John carries an umbrella if it is sunny or if it is raining. There are four given
situations. I need to decide when John will carry the umbrella. The situations
are as follows:
• First scenario: It is not raining, nor it is sunny
• Second scenario: It is not raining, but it is sunny
• Third scenario: It is raining, and it is not sunny
• Fourth scenario: It is raining as well as it is sunny
To analyse the situations using the McCulloch-Pitts neural model, I can
consider the input signals as follows:
• X1: Is it raining?
• X2 : Is it sunny?
Each input can take the value 0 or 1. We can set both weights (on X1 and X2)
to 1 and the threshold value to 1.



The truth table for this case will be:

Situation | x1 | x2 | y_sum | y_out
1         | 0  | 0  | 0     | 0
2         | 0  | 1  | 1     | 1
3         | 1  | 0  | 1     | 1
4         | 1  | 1  | 2     | 1

From the truth table above, I can conclude that in the situations where the
value of y_out is 1, John needs to carry an umbrella. Hence, he will need to
carry an umbrella in scenarios 2, 3 and 4.
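
As a quick check, the umbrella neuron can be written directly in Python. This is a minimal sketch; the function name is illustrative, and the weights and threshold are the values chosen above:

```python
# A McCulloch-Pitts neuron for the umbrella example: both weights
# are 1 and the threshold is 1, which yields the logical OR of the
# two binary inputs.

def mcculloch_pitts(x1, x2, w1=1, w2=1, threshold=1):
    y_sum = w1 * x1 + w2 * x2               # weighted sum of binary inputs
    return 1 if y_sum >= threshold else 0   # threshold activation

# Reproduce the truth table: scenarios 2, 3 and 4 output 1.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, mcculloch_pitts(x1, x2))
```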

Perceptron model & Perceptron Learning Rule:

Perceptron is a machine learning algorithm for supervised learning of various
binary classification tasks. Further, a Perceptron is also understood as an
artificial neuron, or neural network unit, that helps to detect certain input data
computations in business intelligence.



The Perceptron model is also treated as one of the simplest types of artificial
neural networks. It is a supervised learning algorithm for binary
classifiers. Hence, we can consider it a single-layer neural network with four
main parameters: input values, weights and bias, net sum, and an
activation function.

Basic Components of Perceptron

The Perceptron model is a binary classifier that contains three main components.
These are as follows:

o Input Nodes or Input Layer:

This is the primary component of Perceptron which accepts the initial data into
the system for further processing. Each input node contains a real numerical
value.

o Weight and Bias:

The weight parameter represents the strength of the connection between units and
is one of the most important parameters of the Perceptron. The weight is
directly proportional to the strength of the associated input neuron in deciding the
output. Further, the bias can be considered as the intercept in a linear equation.



o Activation Function:

These are the final and important components that help to determine whether the
neuron will fire or not. Activation Function can be considered primarily as a step
function.

Types of Activation functions:

o Sign function
o Step function, and
o Sigmoid function

The data scientist chooses the activation function based on the problem statement
and the desired form of output. The choice of activation function (e.g., sign,
step, or sigmoid) in a perceptron model can also depend on whether the learning
process is slow or suffers from vanishing or exploding gradients.

How does Perceptron work?

In machine learning, the Perceptron is considered a single-layer neural network
that consists of four main parameters: input values (input nodes), weights
and bias, net sum, and an activation function. The Perceptron model begins by
multiplying all input values by their weights, then adds these products
together to create the weighted sum. This weighted sum is then applied to the
activation function 'f' to obtain the desired output. This activation function is also
known as the step function and is represented by 'f'.



This step function or Activation function plays a vital role in ensuring that output
is mapped between required values (0,1) or (-1,1). It is important to note that the
weight of input is indicative of the strength of a node. Similarly, an input's bias
value gives the ability to shift the activation function curve up or down.

Perceptron model works in two important steps as follows:

Step-1

In the first step, multiply all input values with their corresponding weight values
and then add them to determine the weighted sum. Mathematically, we can
calculate the weighted sum as follows:

∑wi*xi = x1*w1 + x2*w2 + … + xn*wn

Add a special term called bias 'b' to this weighted sum to improve the model's
performance.

∑wi*xi + b

Step-2

In the second step, an activation function is applied to the above-mentioned
weighted sum, which gives us an output either in binary form or as a continuous
value, as follows:

Y = f(∑wi*xi + b)
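
As a check of these two steps, here is a minimal Python sketch. The input, weight, and bias values are illustrative assumptions:

```python
import numpy as np

# A minimal sketch of the two Perceptron steps described above:
# Step 1 computes the weighted sum plus bias, Step 2 applies a
# step activation function 'f'.

def perceptron_predict(x, w, b):
    weighted_sum = np.dot(w, x) + b        # Step 1: sum(wi*xi) + b
    return 1 if weighted_sum >= 0 else 0   # Step 2: step activation f

x = np.array([1.0, 0.5])   # input values
w = np.array([0.6, -0.4])  # weights
b = -0.1                   # bias

print(perceptron_predict(x, w, b))  # binary output, 0 or 1
```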



Types of Perceptron Models
Based on the layers, Perceptron models are divided into two types. These are as
follows:
1. Single-layer Perceptron Model
2. Multi-layer Perceptron model

Single Layer Perceptron Model:

The main objective of the single-layer perceptron model is to analyze linearly
separable objects with binary outcomes. A single-layer perceptron has no recorded
data to start from, so it begins with randomly allocated values for the weight
parameters. It then sums up all the weighted inputs. If the total sum is more
than a pre-determined threshold value, the model is activated and shows the
output value as +1.

If the output matches the desired value, the performance of the model is
considered satisfactory, and the weights are not changed. However, the model
produces errors when multiple weighted input values are fed into it. Hence, to
obtain the desired output and minimize errors, the input weights must be adjusted.

"Single-layer perceptron can learn only linearly separable patterns."

Multi-Layered Perceptron Model:

Like a single-layer perceptron model, a multi-layer perceptron model has the
same basic structure but a greater number of hidden layers.

The multi-layer perceptron model is trained with the backpropagation
algorithm, which executes in two stages as follows:

o Forward Stage: Activation functions start from the input layer in the
forward stage and terminate on the output layer.
o Backward Stage: In the backward stage, weight and bias values are
modified as per the model's requirement. In this stage, the error between
the actual output and the desired output is propagated backward from the
output layer to the input layer.

Hence, a multi-layered perceptron model is considered as multiple artificial
neural network layers in which the activation function does not remain linear,
unlike in a single-layer perceptron model. Instead of a linear function, the
activation function can be sigmoid, TanH, ReLU, etc., for deployment.

A multi-layer perceptron model has greater processing power and can process
linear and non-linear patterns. Further, it can also implement logic gates such as
AND, OR, XOR, NAND, NOT, XNOR, NOR.

Limitations of Perceptron Model


o The output of a perceptron can only be a binary number (0 or 1) due to the
hard limit transfer function.
o Perceptron can only be used to classify linearly separable sets of input
vectors. If the input vectors are not linearly separable, it is not easy to
classify them properly.

o Sigmoid Function: It is by far the most commonly used activation
function in neural networks. The need for the sigmoid function stems from
the fact that many learning algorithms require the activation function to
be differentiable and hence continuous. There are two types of sigmoid
function:

o 1. Binary Sigmoid Function

o A binary sigmoid function is of the form f(x) = 1 / (1 + e^(−kx)),
where k is the steepness or slope parameter. By varying the value of k,
sigmoid functions with different slopes can be obtained. It has a range of
(0,1). The slope at the origin is k/4. As the value of k becomes very large,
the sigmoid function becomes a threshold function.

o 2. Bipolar Sigmoid Function

o A bipolar sigmoid function is of the form f(x) = (1 − e^(−kx)) / (1 + e^(−kx)).
o The range of values of sigmoid functions can be varied depending on the
application. However, the range of (−1, +1) is most commonly adopted.

o Ramp:
o It has a very similar appearance to the sigmoid activation function and
maps inputs to outputs over the range (0,1); however, the ramp has a sharp,
piecewise-linear shape rather than a smooth curve. It is a linear function
that has been truncated.
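
A minimal Python sketch of these three activation functions, using the forms given above (the parameter defaults and test inputs are illustrative):

```python
import numpy as np

# Binary sigmoid, bipolar sigmoid, and ramp, following the formulas
# in the text. k is the steepness (slope) parameter.

def binary_sigmoid(x, k=1.0):
    return 1.0 / (1.0 + np.exp(-k * x))                       # range (0, 1)

def bipolar_sigmoid(x, k=1.0):
    return (1.0 - np.exp(-k * x)) / (1.0 + np.exp(-k * x))    # range (-1, 1)

def ramp(x):
    return np.clip(x, 0.0, 1.0)   # truncated linear: 0 below 0, 1 above 1

x = np.linspace(-3, 3, 7)
print(binary_sigmoid(x, k=2.0))   # larger k -> steeper, closer to a threshold
print(bipolar_sigmoid(x))
print(ramp(x))
```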

Delta learning rule:

The delta rule in an artificial neural network is a specific kind of backpropagation
that assists in refining the network, making associations between inputs and
outputs across layers of artificial neurons. The Delta rule is also called the
Delta learning rule.

Delta learning does this by using the difference between a target activation and
an obtained activation. By using a linear activation function, network connections
are balanced. Another approach to explain the Delta rule is that it uses an error
function to perform gradient descent learning.



The Delta rule compares the actual output with a target output; when they do
not match, the weights are adjusted. The exact implementation of the Delta rule
varies with the network and its composition. Still, by applying a linear
activation function, the delta rule can be useful in refining some types of
neural networks with specific kinds of backpropagation.

The Delta rule was introduced by Widrow and Hoff and is one of the most
significant learning rules that depends on supervised learning.

This rule states that the change in the weight of a node is equivalent to the product
of error and the input.

Mathematical equation:

The delta learning rule is given by:

∆w = µ · x · z = µ · (t − y) · x

Here,

∆w = the weight change,
µ = a constant, positive learning rate,
x = the input value from the pre-synaptic neuron,
z = (t − y), the difference between the desired output t and the actual output y.

The above rule can be used only for a single output unit. The weight update
falls into two cases:

Case 1 - When t ≠ y, then
w(new) = w(old) + ∆w

Case 2 - When t = y, then
No change in weight
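
Here is a minimal sketch of the delta rule applied to a single linear output unit. The toy inputs, targets, learning rate, and epoch count are illustrative assumptions:

```python
import numpy as np

# Delta learning rule for one linear output unit:
# delta_w = mu * (t - y) * x, applied sample by sample.

mu = 0.1                                             # learning rate
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])   # inputs
T = np.array([1.0, 0.0, 1.0])                        # desired outputs
w = np.zeros(2)                                      # initial weights

for epoch in range(20):
    for x, t in zip(X, T):
        y = np.dot(w, x)            # linear activation output
        w += mu * (t - y) * x       # no change when t == y

print(w)   # weights refined toward reproducing the targets
```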



Multi-layer Perceptron Network:
A multi-layer perceptron is also known as an MLP. It consists of fully connected
dense layers, which transform any input dimension to the desired dimension. A
multi-layer perceptron is a neural network that has multiple layers. To create a
neural network we combine neurons together so that the outputs of some neurons
are inputs of other neurons.
A multi-layer perceptron has one input layer with one neuron (or node) for each
input, one output layer with a single node for each output, and it can have any
number of hidden layers, each with any number of nodes. A schematic diagram of a
Multi-Layer Perceptron (MLP) is depicted below.

[Figure: schematic diagram of a Multi-Layer Perceptron (MLP)]

In the multi-layer perceptron diagram above, we can see that there are three
inputs and thus three input nodes, and the hidden layer has three nodes. The
output layer gives two outputs, therefore there are two output nodes. The nodes
in the input layer take input and forward it for further processing; in the
diagram above, the nodes in the input layer forward their output to each of the
three nodes in the hidden layer, and in the same way, the hidden layer processes
the information and passes it to the output layer.
Every node in the multi-layer perceptron uses a sigmoid activation function. The
sigmoid activation function takes real values as input and converts them to
numbers between 0 and 1 using the sigmoid formula.
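
A minimal numpy sketch of the 3-3-2 network just described, with a sigmoid at every node, might look as follows. The weight values are random stand-ins, not trained values:

```python
import numpy as np

# Forward pass through a 3-3-2 MLP: three input nodes, a hidden
# layer of three nodes, and two output nodes, sigmoid everywhere.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(42)
W_hidden = rng.normal(size=(3, 3))   # input (3) -> hidden (3)
W_output = rng.normal(size=(3, 2))   # hidden (3) -> output (2)

x = np.array([0.2, 0.7, 0.1])        # three input values

h = sigmoid(x @ W_hidden)            # hidden layer processes the inputs
y = sigmoid(h @ W_output)            # output layer produces two outputs

print(y)   # two numbers between 0 and 1
```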
Error backpropagation algorithm :

In backpropagation, the error is propagated backward from the output layer or
output neuron through the hidden layers toward the input layer, so that neurons
can adjust themselves along the way if they played a role in producing the error.
Activation functions enable neurons to learn new complex patterns and information
and to adjust their weights and biases, mitigating this error to improve the
network.

The algorithm gets its name because the weights are updated backward, from
output to input.

What is the objective of a backpropagation algorithm?

Backpropagation algorithms are used extensively to train feedforward neural
networks, such as convolutional neural networks, in areas such as deep learning.
A backpropagation algorithm is pragmatic because it computes the gradient
needed to adjust a network's weights more efficiently than computing the gradient
based on each individual weight. It enables the use of gradient methods, such as
gradient descent and stochastic gradient descent, to train multilayer networks and
update weights to minimize errors.

It's not easy to understand exactly how changing weights and biases affects the
overall behavior of an ANN. That was one factor that held back more
comprehensive use of neural network applications until the early 2000s, when
advances in computing power provided the necessary capability.

Today, backpropagation algorithms have practical applications in many areas of
artificial intelligence, including OCR, natural language processing and image
processing.

Advantages and disadvantages of backpropagation algorithms

There are several advantages to using a backpropagation algorithm, but there are
also challenges.

Advantages of backpropagation algorithms

• They don't have any parameters to tune except for the number of inputs.
• They're highly adaptable and efficient, and don't require prior
knowledge about the network.
• They use a standard process that usually works well.
• They're user-friendly, fast and easy to program.
• Users don't need to learn any special functions.
Disadvantages of backpropagation algorithms

• They prefer a matrix-based approach over a mini-batch approach.
• Data mining is sensitive to noisy data and other irregularities. Unclean
data can affect the backpropagation algorithm when training a neural
network used for data mining.
• Performance is highly dependent on input data.
• Training is time- and resource-intensive.
What is a backpropagation algorithm in machine learning?
Backpropagation is a type of supervised learning since it requires a known,
desired output for each input value to calculate the loss function gradient, which
measures how desired output values differ from actual output. Supervised learning, the
most common training approach in machine learning, uses a training data set that
has clearly labeled data and specified desired outputs.

Along with classifier algorithms such as naive Bayesian filters, K-nearest
neighbors and support vector machines, the backpropagation training algorithm
has emerged as an important part of machine learning applications that
involve predictive analytics. While backpropagation techniques are mainly
applied to neural networks, they can also be applied to both classification and
regression problems in machine learning. In real-world applications, developers
and machine learning experts implement backpropagation algorithms for neural
networks using programming languages such as Python.



Logistic Regression:

Logistic regression is a supervised machine learning algorithm used for
classification tasks, where the goal is to predict the probability that an
instance belongs to a given class or not. Logistic regression is a statistical
algorithm that analyzes the relationship between two data factors.

What is Logistic Regression?

Logistic regression is used for binary classification, where we use a sigmoid
function that takes the independent variables as input and produces a probability
value between 0 and 1.
For example, suppose we have two classes, Class 0 and Class 1. If the value of
the logistic function for an input is greater than 0.5 (the threshold value), the
input belongs to Class 1; otherwise, it belongs to Class 0. It's referred to as
regression because it is an extension of linear regression, but it is mainly used
for classification problems.
Logistic Function – Sigmoid Function
• The sigmoid function is a mathematical function used to map the
predicted values to probabilities.
• It maps any real value into another value within a range of 0 and 1.
The value of the logistic regression must be between 0 and 1, which
cannot go beyond this limit, so it forms a curve like the “S” form.
• The S-form curve is called the Sigmoid function or the logistic
function.
• In logistic regression, we use the concept of a threshold value,
which separates the predictions into 0 and 1. Values above the
threshold tend toward 1, and values below the threshold tend
toward 0.
Types of Logistic Regression
On the basis of the categories, Logistic Regression can be classified into three
types:
1. Binomial: In binomial Logistic regression, there can be only two
possible types of the dependent variables, such as 0 or 1, Pass or Fail,
etc.
2. Multinomial: In multinomial Logistic regression, there can be 3 or
more possible unordered types of the dependent variable, such as
“cat”, “dogs”, or “sheep”
3. Ordinal: In ordinal Logistic regression, there can be 3 or more
possible ordered types of dependent variables, such as “low”,
“Medium”, or “High”.
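
As a concrete illustration, here is a minimal numpy sketch of binomial logistic regression trained with gradient descent. The toy data, learning rate, and iteration count are illustrative assumptions:

```python
import numpy as np

# Binary logistic regression: sigmoid maps the linear score to a
# probability in (0, 1); gradient descent fits the weights.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])   # one feature
y = np.array([0, 0, 0, 1, 1, 1])                           # binary labels

w = np.zeros(X.shape[1]); b = 0.0
lr = 0.5

for _ in range(1000):
    p = sigmoid(X @ w + b)            # predicted probabilities
    grad_w = X.T @ (p - y) / len(y)   # gradient of the log loss
    grad_b = np.mean(p - y)
    w -= lr * grad_w; b -= lr * grad_b

# Classify with the 0.5 threshold: probability > 0.5 -> Class 1.
print((sigmoid(X @ w + b) > 0.5).astype(int))
```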
