Unit 1.1
The activation function decides whether a neuron should be activated or not by computing the
weighted sum of its inputs and adding a bias to it. The purpose of the activation function is to
introduce non-linearity into the output of a neuron.
Explanation: A neural network is made up of neurons, each of which works with its own weights,
bias, and activation function.
In a neural network, we update the weights and biases of the neurons on the basis of the
error at the output. This process is known as back-propagation. Activation functions make
back-propagation possible, since their gradients are supplied along with the error to update the
weights and biases.
A neural network without an activation function is essentially just a linear regression model. The
activation function performs a non-linear transformation on the input, making the network capable
of learning and performing more complex tasks.
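To make this concrete, here is a minimal NumPy sketch (with illustrative random weights) showing that two stacked linear layers without an activation in between collapse into a single linear map, i.e. the network stays a linear model:

import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)   # layer 1
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)   # layer 2
x = rng.normal(size=3)

# Two "layers" with no activation function in between...
h = W1 @ x + b1
y_two_layers = W2 @ h + b2

# ...are equivalent to one linear layer with W = W2·W1 and b = W2·b1 + b2.
W, b = W2 @ W1, W2 @ b1 + b2
y_one_layer = W @ x + b

print(np.allclose(y_two_layers, y_one_layer))  # True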
Mathematical proof
Suppose we have a neural net like this:
Hidden layer i.e. layer 1:
• X1: Is it raining?
• X2: Is it sunny?
The value of each input can be either 0 or 1. We use a weight of 1 for both X1 and X2 and a
threshold of 1, so the neuron outputs 1 whenever the weighted sum X1·w1 + X2·w2 is at least 1.
Truth Table for this case will be:

Scenario   X1   X2   X1·w1 + X2·w2   yout
1          0    0    0               0
2          0    1    1               1
3          1    0    1               1
4          1    1    2               1
So, I can say that yout = 1 whenever X1·w1 + X2·w2 ≥ 1, and yout = 0 otherwise.
From the truth table, I can conclude that in the situations where the value of yout is 1,
John needs to carry an umbrella. Hence, he will need to carry an umbrella in scenarios
2, 3 and 4.
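A minimal Python sketch of this example, using the weights w1 = w2 = 1 and a step function with threshold 1 as above, reproduces the truth table:

def step(net, threshold=1):
    # Threshold (step) activation: fire only when the net input reaches the threshold.
    return 1 if net >= threshold else 0

w1, w2 = 1, 1
for scenario, (x1, x2) in enumerate([(0, 0), (0, 1), (1, 0), (1, 1)], start=1):
    net = x1 * w1 + x2 * w2
    print(f"Scenario {scenario}: X1={x1}, X2={x2}, net={net}, yout={step(net)}")
# yout is 1 in scenarios 2, 3 and 4, exactly the cases where John carries an umbrella.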
Perceptron Model
• A perceptron is regarded as a single-layer neural network comprising four key components in
machine learning: input values (input nodes), weights and bias, net sum, and an activation
function.
• The perceptron model starts by multiplying each input value by its weight. It then adds these
products to generate the weighted sum. This weighted sum is passed to the activation function
“f” to get the anticipated output. This activation function is also called the step function
(a short sketch follows below).
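The sketch below illustrates this forward pass; the weights and bias are hypothetical values chosen so that the perceptron computes a logical AND:

import numpy as np

def step(net):
    # Step activation function "f": output 1 once the net input is non-negative.
    return 1 if net >= 0 else 0

def perceptron_output(x, w, b):
    net = np.dot(w, x) + b   # multiply inputs by weights, sum, add the bias
    return step(net)

w = np.array([1.0, 1.0])     # hypothetical weights
b = -1.5                     # hypothetical bias
print([perceptron_output(np.array(x), w, b) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]])
# -> [0, 0, 0, 1]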
Basic Components of Perceptron
1. Input Layer: The input layer consists of one or more input neurons, which receive input signals
from the external world or from other layers of the neural network.
2. Weights: Each input neuron is associated with a weight, which represents the strength of the
connection between the input neuron and the output neuron.
3. Bias: A bias term is added to the weighted sum to give the perceptron additional flexibility in
modeling complex patterns in the input data.
4. Activation Function: The activation function determines the output of the perceptron based on
the weighted sum of the inputs and the bias term. Common activation functions used in perceptrons
include the step function, sigmoid function, and ReLU function.
5. Output: The output of the perceptron is a single binary value, either 0 or 1, which indicates
the class or category to which the input data belongs.
6. Training Algorithm: The perceptron is typically trained using a supervised learning algorithm
such as the perceptron learning algorithm or backpropagation. During training, the weights and
bias of the perceptron are adjusted to minimize the error between the predicted output and the
true output for a given set of training examples (a minimal sketch follows this list).
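A minimal sketch of the perceptron learning algorithm mentioned in point 6, assuming binary {0, 1} targets and a small learning rate eta (both choices are illustrative):

import numpy as np

def train_perceptron(X, y, eta=0.1, epochs=20):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, ti in zip(X, y):
            pred = 1 if np.dot(w, xi) + b >= 0 else 0
            error = ti - pred         # error between true and predicted output
            w += eta * error * xi     # adjust weights to reduce the error
            b += eta * error          # adjust the bias the same way
    return w, b

# Example: learn the OR function from the umbrella truth table above.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])
print(train_perceptron(X, y))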
Types of Perceptron:
1. Single layer: A single-layer perceptron can learn only linearly separable patterns.
2. Multilayer: A multilayer perceptron has two or more layers, giving it greater processing
power; it can also learn patterns that are not linearly separable.
The following methods can be used to check whether data is linearly separable:
1. Visual Inspection: Linear separability can be examined visually by plotting the data points in
a 2D or 3D space and checking whether a distinct straight line or plane divides the various
groups. The data may be linearly separable if such a boundary can be seen.
2. Perceptron Learning Algorithm: This binary linear classifier divides the input into two
classes by learning a separating hyperplane iteratively. If the algorithm converges and finds a
separating hyperplane, the data are linearly separable; if not, they are not (see the sketch
after this list).
3. Support Vector Machines: SVMs are a widely used classification technique that can handle
linearly separable data. They identify the separating hyperplane that maximizes the margin
between the two classes. The data can be linearly separated if the margin is greater than zero.
4. Kernel methods: This family of techniques transforms the data into a higher-dimensional space,
where it may become linearly separable. Note that separability in the transformed space does not
mean the original data is linearly separable; it means the original data can be separated by a
non-linear boundary.
5. Quadratic programming: Quadratic programming can be used to find the separating hyperplane
that minimizes the classification error. If a solution exists, the data can be separated
linearly.
Figure: Linearly Separable 2D Data
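As a minimal sketch of check 2 above (with an illustrative learning rate and epoch limit), we can train a perceptron and report whether it converges, i.e. classifies every point correctly:

import numpy as np

def is_linearly_separable(X, y, eta=0.1, max_epochs=1000):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(max_epochs):
        errors = 0
        for xi, ti in zip(X, y):
            pred = 1 if np.dot(w, xi) + b >= 0 else 0
            if pred != ti:
                w += eta * (ti - pred) * xi
                b += eta * (ti - pred)
                errors += 1
        if errors == 0:   # a separating hyperplane was found
            return True
    return False          # no convergence within the limit: likely not separable

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
print(is_linearly_separable(X, np.array([0, 1, 1, 1])))  # OR:  True
print(is_linearly_separable(X, np.array([0, 1, 1, 0])))  # XOR: False

Note that failing to converge within max_epochs does not strictly prove non-separability; the perceptron convergence theorem only guarantees convergence when the data are separable.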
Adaline, which stands for Adaptive Linear Neuron, is a network having a single linear unit.
It was developed by Widrow and Hoff in 1960. Some important points about Adaline are as follows:
• An Adaline neuron can be trained using the delta rule, also known as the Least Mean Square
(LMS) rule or the Widrow-Hoff rule (a minimal sketch follows below).
• The net input is compared with the target value to compute the error signal.
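A minimal sketch of Adaline training with the delta (LMS / Widrow-Hoff) rule; the learning rate and epoch count are illustrative. The key difference from the perceptron rule is that the error is computed on the net input itself, not on a thresholded output:

import numpy as np

def train_adaline(X, y, eta=0.01, epochs=100):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, ti in zip(X, y):
            net = np.dot(w, xi) + b   # net input (no threshold applied here)
            error = ti - net          # error signal: target minus net input
            w += eta * error * xi     # Widrow-Hoff weight update
            b += eta * error
    return w, b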
Madaline, which stands for Multiple Adaptive Linear Neuron, is a network which consists of many
Adalines in parallel. It has a single output unit. Some important points about Madaline are as
follows:
• It is just like a multilayer perceptron, where the Adalines act as hidden units between the
input and the Madaline layer.
• The weights and the bias between the input and Adaline layers, as we see in the Adaline
architecture, are adjustable.
• The weights and bias between the Adaline and Madaline layers are fixed at 1.
• Training can be done with the help of the delta rule.
The architecture consists of n units in the input layer, m units in the Adaline layer, and 1 unit
in the Madaline layer. Each neuron in the Adaline and Madaline layers has a bias input of 1. The
Adaline layer sits between the input layer and the Madaline layer and is therefore considered the
hidden layer.
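A minimal sketch of this architecture's forward pass, assuming bipolar (±1) Adaline outputs and, as described above, fixed weights of 1 and a bias of 1 on the Madaline output unit; the example weights are hypothetical:

import numpy as np

def signum(net):
    return 1 if net >= 0 else -1

def madaline_output(x, W, b):
    # W: (m, n) adjustable hidden weights, b: (m,) adjustable hidden biases
    adaline_out = np.array([signum(wi @ x + bi) for wi, bi in zip(W, b)])
    # Madaline unit: fixed weight of 1 on each Adaline output, plus a bias of 1.
    return signum(np.sum(adaline_out) + 1)

W = np.array([[1.0, 1.0], [-1.0, 1.0]])   # hypothetical weights for 2 Adalines
b = np.array([-0.5, 0.5])
print(madaline_output(np.array([1.0, 1.0]), W, b))  # -> 1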