ML Module 5
Classification models
Neural networks are computational models that mimic the complex functions
of the human brain. They consist of interconnected nodes, or neurons, that
process and learn from data, enabling tasks such as pattern recognition and
decision-making in machine learning.
Learning in a neural network involves the following steps:
1. The neural network is stimulated by an environment.
2. The free parameters of the neural network are changed as a result of
this stimulation.
3. The neural network then responds in a new way to the environment
because of the changes in its free parameters.
The ability of neural networks to identify patterns, solve complex problems, and
adapt to changing environments is essential. Their capacity to learn from data
has far-reaching effects, from revolutionizing technologies such as natural
language processing and self-driving automobiles to automating decision-
making processes and increasing efficiency in numerous industries. The
development of artificial intelligence depends heavily on neural networks,
which drive innovation and shape the direction of technology.
Consider a neural network for email classification. The input layer takes features
like email content, sender information, and subject. These inputs, multiplied by
adjusted weights, pass through hidden layers. The network, through training,
learns to recognize patterns indicating whether an email is spam or not. The
output layer, with a binary activation function, predicts whether the email is
spam (1) or not (0). As the network iteratively refines its weights through
backpropagation, it becomes adept at distinguishing between spam and
legitimate emails, showcasing the practicality of neural networks in real-world
applications like email filtering.
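Below is a minimal sketch of this forward pass, assuming three hypothetical numeric features (content, sender, and subject scores) and illustrative, untrained weights; in a real system the weights would be learned from labeled emails via backpropagation.

```python
import numpy as np

# Minimal sketch of the spam classifier's forward pass. The feature names,
# layer sizes, and weights are illustrative assumptions, not a trained model.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical input features: [content_score, sender_score, subject_score]
x = np.array([0.8, 0.1, 0.9])

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(4, 3))  # input -> hidden (3 -> 4)
b1 = np.zeros(4)
W2 = rng.normal(scale=0.5, size=(1, 4))  # hidden -> output (4 -> 1)
b2 = np.zeros(1)

h = sigmoid(W1 @ x + b1)            # hidden-layer activations
p_spam = sigmoid(W2 @ h + b2)[0]    # output in (0, 1)

# Binary decision: 1 = spam, 0 = not spam
print("spam" if p_spam >= 0.5 else "not spam", round(p_spam, 3))
```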
The biological processes of the human nervous system (the brain) serve as the
foundation for neural computing. Neural computing entails massively parallel
processing and self-learning, similar to the brain, made feasible by the brain's
neural network. A neural network is simply a collection of processing elements,
interconnected in a web-like fashion, that produce outputs after receiving
inputs.
This section covers the difference between an Artificial Neural Network (ANN)
and a Biological Neural Network (BNN).
Advantages
1. Artificial neural networks can be trained on large volumes of data and generalize
from them, making pattern-based predictions and judgments.
2. ANNs can be implemented efficiently on hardware accelerators or dedicated AI
processors such as GPUs, enabling fast, parallel processing.
3. ANNs continue to function even in the presence of noise or errors in the data. As a
result, they are appropriate in scenarios involving noisy, partial, or distorted data.
4. They are non-linear in nature, which enables them to represent complex
relationships and patterns in data. They can also be adapted to handle various types
of data and perform various tasks.
5. They can extract features from data automatically, removing the need for manual
feature engineering. They can also be trained to handle many tasks at once, making
them useful in advanced AI applications.
Disadvantages
Every synapse has a processing value and a weight, which are determined during
network training. The performance and potency of the network depend entirely on
the number of neurons in the network, how they are connected to each other (i.e.,
the topology), and the weights assigned to every synapse.
Some main differences between an Artificial Neural Network and a Biological Neural
Network are as follows:
1. An ANN is a mathematical model mainly inspired by the biological neural system of
the human brain. A BNN is likewise composed of processing elements, known as
neurons, that are linked together via synapses.
2. An artificial neural network processes information sequentially and centrally. In
contrast, a biological neural network processes information in a parallel and
distributed manner.
3. An artificial neural network is much smaller in size than a biological neural network.
4. A biological neural network is fault tolerant. In contrast, an artificial neural network
is not.
5. The processing speed of an artificial neural network is in the nanosecond range,
faster than the biological neural network, where the cycle time of a neural event
triggered by an external input is in the millisecond range.
6. A BNN can handle more difficult problems than an artificial neural network.
7. The operating environment of an artificial neural network is well defined and well
constrained. In contrast, the operating environment of a biological neural network is
poorly defined and unconstrained.
8. An artificial neural network is vulnerable to failures. In contrast, a biological neural
network is robust.
Characteristic | Artificial Neural Network | Biological Neural Network
Processing | Sequential and centralized. | Parallel and distributed.
Control mechanism | A control unit keeps track of all computing-related operations. | All processing is managed centrally.
Complexity | Cannot perform complex pattern recognition. | The large number and complexity of the connections allow the brain to perform complicated tasks.
Memory | Separate from the processor, localized, and non-content-addressable. | Integrated into the processor, distributed, and content-addressable.
Learning | Requires very accurate structures and formatted data. | Tolerant to ambiguity.
Response time | Measured in nanoseconds. | Measured in milliseconds.
McCulloch-Pitts Model
With two excitatory inputs x1 and x2, unit weights, and threshold θ = 1, the
McCulloch-Pitts neuron computes the OR function:
Case | x1 | x2 | yin = x1 + x2 | Output y
1 | 0 | 0 | 0 | 0
2 | 0 | 1 | 1 | 1
3 | 1 | 0 | 1 | 1
4 | 1 | 1 | 2 | 1
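A short sketch of the unit behind this table: with two excitatory inputs of weight 1 and threshold 1, the McCulloch-Pitts neuron fires exactly when at least one input is active, reproducing the rows above.

```python
# McCulloch-Pitts neuron: fire (output 1) iff the weighted input sum
# reaches the threshold. Weights (1, 1) and threshold 1 realize OR.

def mcculloch_pitts(inputs, weights, threshold):
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        y = mcculloch_pitts((x1, x2), weights=(1, 1), threshold=1)
        print(x1, x2, x1 + x2, y)   # matches the table rows above
```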
o Input Nodes or Input Layer:
This is the primary component of the Perceptron, which accepts the initial data
into the system for further processing. Each input node contains a real numerical
value.
o Weight and Bias:
Weight represents the strength of the connection between units and is another
important parameter of the Perceptron. A weight is directly proportional to the
influence of the associated input neuron on the output. The bias can be thought
of as the intercept term in a linear equation.
o Activation Function:
These are the final and most important components, which help determine
whether the neuron will fire or not. The activation function can be considered
primarily as a step function. Common choices are:
o Sign function
o Step function, and
o Sigmoid function
The data scientist chooses the activation function based on the problem
statement and the desired form of output. The activation function used in a
perceptron model (e.g., sign, step, or sigmoid) may be changed depending on
whether the learning process is slow or suffers from vanishing or exploding
gradients.
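For concreteness, here is a sketch of the three activation functions using common textbook definitions (conventions for the value at exactly 0 vary between sources):

```python
import numpy as np

def sign_fn(x):
    return np.where(x >= 0, 1, -1)    # outputs in {-1, +1}

def step_fn(x):
    return np.where(x >= 0, 1, 0)     # outputs in {0, 1}

def sigmoid_fn(x):
    return 1.0 / (1.0 + np.exp(-x))   # smooth outputs in (0, 1)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sign_fn(z), step_fn(z), sigmoid_fn(z))
```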
Step-1
In the first step, multiply all input values by their corresponding weight values
and then add the products to determine the weighted sum. A special term called
the bias 'b' is added to this weighted sum to improve the model's performance.
Mathematically, the weighted sum is:
∑ wi * xi + b
Step-2
An activation function f is then applied to the weighted sum to produce the
output:
Y = f(∑ wi * xi + b)
The main objective of the single-layer perceptron model is to analyze linearly
separable objects with binary outcomes. A single-layer perceptron model has no
prior recorded data, so it begins with randomly allocated values for the weight
parameters. It then sums all the weighted inputs; if the total sum exceeds a
pre-determined threshold, the model is activated and outputs +1.
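A minimal sketch of this single-layer model, trained with the classic perceptron rule on the linearly separable AND problem; the initial weights, learning rate, and epoch count are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=2)   # randomly allocated initial weights
b = 0.0                             # bias term
lr = 0.1                            # learning rate

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([0, 0, 0, 1])          # targets for logical AND

def predict(x):
    # Step 1: weighted sum plus bias; Step 2: step activation
    return 1 if np.dot(w, x) + b >= 0 else 0

for epoch in range(20):
    for x, target in zip(X, t):
        y = predict(x)
        # Perceptron rule: nudge weights in proportion to the error
        w = w + lr * (target - y) * x
        b = b + lr * (target - y)

print([predict(x) for x in X])      # expected: [0, 0, 0, 1]
```

Because AND is linearly separable, this rule converges; on XOR it would not, which motivates the multi-layer model described next.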
o Forward Stage: Activation functions start from the input layer in the
forward stage and terminate on the output layer.
o Backward Stage: In the backward stage, weight and bias values are
modified according to the model's requirements. The error between the
actual output and the desired output is propagated backward, originating
at the output layer and ending at the input layer.
A multi-layer perceptron model has greater processing power and can process
both linear and non-linear patterns. It can also implement logic gates such as
AND, OR, XOR, NAND, NOT, XNOR, and NOR.
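As a sketch of why the extra layer matters, the hand-chosen weights below make a two-layer network with step units compute XOR, which no single-layer perceptron can represent: the hidden units compute OR and NAND, and the output unit ANDs them together.

```python
import numpy as np

step = lambda x: (x >= 0).astype(int)   # step activation

W1 = np.array([[1.0, 1.0],      # hidden unit 1: OR   (fires if x1 + x2 >= 0.5)
               [-1.0, -1.0]])   # hidden unit 2: NAND (fires if x1 + x2 <= 1.5)
b1 = np.array([-0.5, 1.5])
W2 = np.array([1.0, 1.0])       # output unit: AND of the two hidden units
b2 = -1.5

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    h = step(W1 @ np.array(x, dtype=float) + b1)
    y = step(W2 @ h + b2)
    print(x, int(y))            # XOR truth table: 0, 1, 1, 0
```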
o Ramp:
o It looks very similar to the sigmoid activation function and maps inputs
to outputs over the range (0, 1); however, the ramp is piecewise linear
with sharp corners rather than a smooth curve. It is a truncated (clipped)
linear function.
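A short sketch of the ramp under this description, assuming truncation to the range (0, 1):

```python
import numpy as np

def ramp(x):
    # Linear between 0 and 1, clipped flat outside that range
    return np.clip(x, 0.0, 1.0)

print(ramp(np.array([-1.0, 0.25, 0.75, 2.0])))  # [0.   0.25 0.75 1.  ]
```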
The delta rule was introduced by Widrow and Hoff and is one of the most
significant learning rules; it depends on supervised learning. Delta learning uses
the difference between a target activation and the obtained activation: with a
linear activation function, the network's connection weights are adjusted to
reduce this difference. Another way to describe the delta rule is that it performs
gradient descent learning on an error function.
This rule states that the change in the weight of a node is equivalent to the product
of error and the input.
Mathematical equation:
The delta learning rule is given by:
∆w = µ · x · z = µ(t − y)x
Here,
∆w is the weight change,
µ is the learning rate,
x is the input, and
z = (t − y) is the difference between the desired output t and the actual output y.
The above rule can be used only for a single output unit.
The weight update is determined with respect to two cases:
Case 1 (error z ≠ 0): w(new) = w(old) + ∆w
Case 2 (error z = 0): no change in weight
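A minimal sketch of the delta rule in code, assuming a single linear output unit and an illustrative toy target t = x1 + 2·x2:

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(scale=0.1, size=2)      # initial weights
mu = 0.05                              # learning rate

# Toy data whose target is a linear function of the inputs
X = rng.uniform(-1, 1, size=(100, 2))
T = X[:, 0] + 2 * X[:, 1]

for epoch in range(50):
    for x, t in zip(X, T):
        y = np.dot(w, x)               # linear activation
        z = t - y                      # error between desired and actual
        if z != 0:
            w = w + mu * z * x         # Case 1: delta-rule update
        # Case 2: z == 0 -> no change in weight

print(np.round(w, 3))                  # approaches [1.0, 2.0]
```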
In the multi-layer perceptron described above, there are three inputs and thus
three input nodes, and the hidden layer also has three nodes. The output layer
gives two outputs, so there are two output nodes. The nodes in the input layer
take the input and forward it for further processing: each input node forwards
its output to each of the three nodes in the hidden layer, and in the same way
the hidden layer processes the information and passes it to the output layer.
Every node in the multi-layer perceptron uses a sigmoid activation function,
which takes real values as input and converts them to numbers between 0 and 1
using the sigmoid formula σ(x) = 1 / (1 + e^(−x)).
Error backpropagation algorithm:
The algorithm gets its name from gradient descent: the error is propagated
backward and the weights are updated from the output layer toward the input
layer.
It is not easy to understand exactly how changing weights and biases affects the
overall behavior of an ANN. That was one factor that held back more widespread
use of neural network applications until the early 2000s, when computing power
made large-scale training practical.
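A minimal sketch of backpropagation on the XOR problem, assuming a small sigmoid network trained by gradient descent; the hidden-layer size, learning rate, and epoch count are illustrative, and the result depends on the random initialization:

```python
import numpy as np

rng = np.random.default_rng(42)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)    # input -> hidden
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)    # hidden -> output
lr = 0.5

for epoch in range(10000):
    # Forward pass: input -> hidden -> output
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    # Backward pass: errors flow from the output layer toward the input
    dy = (y - T) * y * (1 - y)        # output-layer delta
    dh = (dy @ W2.T) * h * (1 - h)    # hidden-layer delta
    # Gradient-descent weight updates
    W2 -= lr * h.T @ dy; b2 -= lr * dy.sum(axis=0)
    W1 -= lr * X.T @ dh; b1 -= lr * dh.sum(axis=0)

print(np.round(y.ravel(), 2))         # approaches [0, 1, 1, 0]
```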