Soft Computing Manual.-1
Experiment No.: 01
Software: SCILAB
Theory:
Activation functions are an integral building block of neural networks that enable
them to learn complex patterns in data. They transform the input signal of a node in a neural
network into an output signal that is then passed on to the next layer.
Without activation functions, a neural network would consist only of linear operations such as
matrix multiplication. Every layer would perform a linear transformation of its input, and
because a composition of linear transformations is itself linear, the whole network could model
only linear relationships between inputs and outputs.
Most real-world data is non-linear. For example, the relationship between house price and house
size, or between income and purchases, is generally non-linear. A network without activation
functions would fail to learn such patterns.
By introducing non-linearity at each node, activation functions greatly increase the
flexibility and power of neural networks to model complex and nuanced data.
The linear activation function is the simplest activation function, defined as:
f(x) = x
It simply returns the input x as the output. Graphically, it looks like a straight line with a slope
of 1.
The main use case of the linear activation function is in the output layer of a neural network
used for regression. For regression problems where we want to predict a numerical value,
using a linear activation function in the output layer ensures the neural network outputs a
numerical value. The linear activation function does not squash or transform the output, so the
actual predicted value is returned.
However, the linear activation function is rarely used in hidden layers of neural networks. This
is because it does not provide any non-linearity. The whole point of hidden layers is to learn
non-linear combinations of the input features. Using a linear activation throughout would
restrict the model to just learning linear transformations of the input.
Sigmoid activation
The sigmoid activation function, often represented as σ(x), is a smooth, continuously
differentiable function that is historically important in the development of neural networks.
The sigmoid activation function has the mathematical form:
f(x) = 1 / (1 + e^-x)
It takes a real-valued input and squashes it to a value between 0 and 1. The sigmoid function
has an "S"-shaped curve that asymptotes to 0 for large negative numbers and 1 for large
positive numbers. The outputs can be easily interpreted as probabilities, which makes it natural
for binary classification problems.
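As a minimal sketch of the two functions defined so far, the following Python snippet evaluates the linear and sigmoid activations using only the standard library. The function names `linear` and `sigmoid` are illustrative choices and do not come from any particular toolbox.

```python
import math

def linear(x):
    """Linear (identity) activation: f(x) = x."""
    return x

def sigmoid(x):
    """Sigmoid activation: f(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

# Evaluate both functions at a few sample inputs.
for x in (-5.0, 0.0, 5.0):
    print(x, linear(x), round(sigmoid(x), 4))
```

The sigmoid output stays strictly between 0 and 1 and equals 0.5 at x = 0, while the linear activation returns its input unchanged.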
ReLU activation
The rectified linear unit (ReLU) activation function has the mathematical form:
f(x) = max(0, x)
For inputs greater than 0, ReLU acts as a linear function with a gradient of 1. This means that
it does not alter the scale of positive inputs and allows the gradient to pass through unchanged
during backpropagation. This property helps mitigate the vanishing gradient problem.
Even though ReLU is linear on each half of its input space, it is a non-linear function overall:
its slope changes abruptly at x = 0, from 0 for negative inputs to 1 for positive inputs, and it
is not differentiable at that point. This non-linearity allows neural networks to learn complex
patterns.
Since ReLU outputs zero for all negative inputs, it naturally leads to sparse activations; at any
time, only a subset of neurons are activated, leading to more efficient computation.
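The behaviour described above can be sketched in a few lines of Python. The helper names `relu` and `relu_grad` are hypothetical, and the gradient value returned at exactly x = 0 is a convention assumed here (0), since the function is not differentiable at that point.

```python
def relu(x):
    """ReLU activation: f(x) = max(0, x)."""
    return max(0.0, x)

def relu_grad(x):
    """Gradient of ReLU: 1 for x > 0, 0 for x < 0 (0 assumed at x = 0)."""
    return 1.0 if x > 0 else 0.0

inputs = [-2.0, -0.5, 0.0, 1.5, 3.0]
outputs = [relu(x) for x in inputs]

# Fraction of units whose output is exactly zero (sparse activation).
sparsity = outputs.count(0.0) / len(outputs)
print(outputs, sparsity)
```

In this small example three of the five inputs produce a zero output, which illustrates the sparse activation pattern.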
Softmax activation
The softmax activation function, also known as the normalized exponential function, is
particularly useful within the context of multi-class classification problems. This function
operates on a vector, often referred to as the logits, which represents the raw predictions or
scores for each class computed by the previous layers of a neural network. For an input vector x
with elements x1, x2, ..., xC, the softmax function is defined as:
f(xi) = e^xi / Σj e^xj
The output of the softmax function is a probability distribution that sums to one. Each element
of the output represents the probability that the input belongs to a particular class.
The use of the exponential function ensures that all output values are non-negative. This is
crucial because probabilities cannot be negative.
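The following is a minimal sketch of softmax in plain Python. Subtracting the largest logit before exponentiating is a common numerical safeguard that is assumed here; it is not part of the definition above and does not change the result. The logit values are made up for illustration.

```python
import math

def softmax(logits):
    """Softmax: f(xi) = e^xi / sum_j e^xj, computed in a numerically stable way."""
    m = max(logits)                      # shift so the largest exponent is 0
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three classes.
probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))
```

The largest logit receives the largest probability, every output is positive, and the outputs sum to one.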
Conclusion
Experiment No.: 02
Theory:
The MP model represents a binary neuron that takes multiple binary inputs and produces a
binary output based on a fixed threshold. It captures the basic functionality of a real neuron
by modelling the integration of input signals and the activation of the neuron based on a
threshold.
2. Activation Function:
The weighted sum is compared to the threshold using an activation function. If the
weighted sum reaches or exceeds the threshold, the neuron fires and produces an output of 1;
otherwise, it remains inactive and produces an output of 0.
The MP model provides a binary decision mechanism, making it suitable for tasks that require
simple classification or decision-making. It forms the basis for more complex and powerful
neural network models, such as the perceptron and artificial neural networks with multiple
layers (multilayer perceptron), which allow for more flexible and sophisticated computations.
McCulloch-Pitts Neuron Activation Function
AND Gate:
• Inputs: x1, x2
• Weights: w1=1, w2=1
• Threshold: 𝜃 =2
X1 X2 Output(AND)
0 0 0
0 1 0
1 0 0
1 1 1
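The AND gate above can be checked with a short Python sketch. The function name `mp_neuron` is illustrative, and the neuron is assumed to fire when the weighted sum is greater than or equal to the threshold, which is what the truth table requires with 𝜃 = 2.

```python
def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts neuron: output 1 if the weighted sum reaches the threshold."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= threshold else 0

# Evaluate the AND gate with w1 = w2 = 1 and threshold 2 for all binary inputs.
and_table = {
    (x1, x2): mp_neuron((x1, x2), (1, 1), 2)
    for x1 in (0, 1)
    for x2 in (0, 1)
}
for pair, out in and_table.items():
    print(pair, out)
```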
The Hebbian Learning Rule:
The Hebbian learning rule, proposed by Donald Hebb in 1949, is a simple and
fundamental principle in neural network learning. It states that "cells that fire together wire
together," meaning that the strength of the connection between two neurons increases when
they are activated simultaneously.
The Hebbian learning rule is based on the idea of synaptic plasticity, which refers to the ability
of the connections (synapses) between neurons to change in strength. It provides a mechanism
for learning and memory formation in neural networks.
The basic principle of the Hebbian learning rule can be summarized as follows:
1. If two connected neurons are both activated (fire) at the same time, the strength of the
connection between them is strengthened.
2. If two connected neurons are not activated together, or only one of them is activated, the
connection between them remains unchanged or weakens.
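Mathematically, the weight update is:
ΔW = η * X * Y
where ΔW is the change in the connection weight, η is the learning rate, X is the input
(presynaptic) activation, and Y is the output (postsynaptic) activation.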
The Hebbian learning rule can be applied in a single neuron or in a network of neurons. It
allows the network to learn associations between input patterns and modify its connection
strengths accordingly.
It is important to note that while the Hebbian learning rule is a fundamental concept in neural
network learning, it has certain limitations and can lead to instability or overfitting in certain
scenarios. Therefore, modern neural network architectures often employ more advanced
learning algorithms, such as backpropagation, that address these limitations and offer more
efficient and robust learning mechanisms.
AND Gate using Hebbian Learning
• Inputs: x1, x2
• Initially: w1=0, w2=0
• Learning rate: η =1
Training Process
4. For input (1, 1), desired output is 1: Update the weights using Hebb's rule
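A minimal sketch of this training process follows. It assumes binary (0/1) inputs, takes Y in Hebb's rule to be the desired output, uses no bias, and borrows the firing threshold of 2 from the McCulloch-Pitts AND gate; these choices are assumptions for illustration.

```python
# Training pairs for the AND gate: ((x1, x2), desired output).
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

eta = 1        # learning rate
w = [0, 0]     # initial weights w1, w2

# Hebb's rule: delta_w_i = eta * x_i * y, applied once per training pair.
for x, y in samples:
    for i in range(len(w)):
        w[i] += eta * x[i] * y
    print("input", x, "target", y, "weights", w)

theta = 2      # assumed firing threshold (taken from the MP AND gate)
predictions = [
    1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
    for x, _ in samples
]
print(predictions)
```

With 0/1 inputs, only the pair (1, 1) with target 1 changes the weights, so training ends with w1 = w2 = 1.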
Conclusion:
EXPERIMENT NO.: 3
In supervised learning with a neural network, the network learns a mapping from inputs to
outputs by adjusting its internal weights and biases. The process involves forward propagation,
where inputs are passed through the network to produce outputs, and backpropagation, where
the errors are used to update the weights. Key elements include the choice of activation
function, loss function, optimization method, and regularization techniques to prevent
overfitting. The Neural Network Toolbox simplifies these processes, allowing users to focus
on high-level model design and evaluation.
1. Regression Algorithms
These algorithms are used when the output variable is continuous, meaning the target is
a real value. The goal of regression is to predict a continuous number based on the input data.
Regression is therefore used for continuous numerical response values. Regression models can be
trained with the Regression Learner app in MATLAB.
Common Algorithms:
Linear Regression: A method that models the relationship between a dependent variable and
one or more independent variables using a linear equation.
Ridge and Lasso Regression: Extensions of linear regression that include regularization to
reduce overfitting.
Polynomial Regression: Extends linear regression to account for non-linear relationships by
introducing polynomial terms.
Support Vector Regression (SVR): Uses support vectors for regression tasks, similar to
SVMs for classification.
Decision Trees for Regression: Builds a tree-like model to predict a continuous value by
splitting data based on feature values.
Neural Networks for Regression: Use deep learning models to map inputs to continuous
outputs.
Supervised learning algorithms can be broadly categorized into two main types based on the
nature of the output variable: regression, described above, and classification, described next.
2. Classification Algorithms
These algorithms are used when the output variable is categorical, meaning the target belongs
to a set of predefined categories or classes. The goal of classification is to assign the input to
one of these categories. Classification applies when the data can be separated into specific
classes: a binary classification model has two classes, and a multiclass classification model
has more than two. Classification models can be trained with the Classification Learner app
in MATLAB.
Common Algorithms:
Logistic Regression: A statistical method for binary classification that models the probability
of a binary outcome.
Support Vector Machines (SVM): A classification algorithm that finds the optimal
hyperplane to separate data points of different classes.
K-Nearest Neighbours (KNN): A simple algorithm that classifies new data points based on
the majority class of their nearest neighbours.
Decision Trees for Classification: A tree-like model where internal nodes represent features,
branches represent decision rules, and leaves represent class labels.
Random Forest: An ensemble method using multiple decision trees to improve classification
accuracy and reduce overfitting.
Naive Bayes: A probabilistic classifier based on Bayes' theorem, often used for text
classification.
Neural Networks for Classification: Deep learning models like feedforward and
convolutional neural networks (CNNs) are commonly used for image, text, and other complex
classification tasks.
3. Forward Propagation
In supervised learning with neural networks, the model is trained to predict outputs given
inputs. During forward propagation, the input data is passed through each layer of the network.
Each neuron in the hidden and output layers applies its activation function to the weighted
sum of its inputs and passes the result forward.
The final output is compared with the true target using a loss function (e.g., mean squared
error for regression or cross-entropy for classification), and the difference (error) is computed.
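Forward propagation can be sketched for a tiny network with two inputs, two sigmoid hidden neurons, and one linear output neuron. All weights, biases, the input, and the target below are made-up illustrative values, and the helper name `dense` is hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One layer: each neuron applies its activation to (weighted sum + bias)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

x = [0.5, -1.0]                    # input vector (illustrative)
W1 = [[0.1, 0.4], [-0.3, 0.2]]     # hidden-layer weights, one row per neuron
b1 = [0.0, 0.1]                    # hidden-layer biases
W2 = [[0.7, -0.5]]                 # output-layer weights
b2 = [0.05]                        # output-layer bias

hidden = dense(x, W1, b1, sigmoid)            # hidden activations
output = dense(hidden, W2, b2, lambda z: z)   # linear output for regression

target = 1.0
error = target - output[0]         # difference passed on to the loss function
print(hidden, output, error)
```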
4. Loss Function
The loss function (also called the cost function) measures how well the neural network’s
predictions match the true targets. For different tasks, different loss functions are used:
Regression: Mean Squared Error (MSE) is often used.
MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
Classification: Cross-Entropy Loss is used for tasks where the output is categorical.
H(p, q) = -\sum_{x} p(x) \log q(x)
The goal is to minimize this loss function, which means the neural network is producing
predictions close to the actual labels.
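Both loss functions can be computed directly from their definitions. The sketch below uses plain Python; the small `eps` guard against log(0) is an assumption added for numerical safety, and the sample values are illustrative.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: average of the squared differences."""
    n = len(y_true)
    return sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred)) / n

def cross_entropy(p, q, eps=1e-12):
    """Cross-entropy H(p, q) = -sum p(x) log q(x); eps avoids log(0)."""
    return -sum(pi * math.log(max(qi, eps)) for pi, qi in zip(p, q))

# Regression example: errors of 1 and -2 give (1 + 4) / 2.
mse_value = mse([3.0, 5.0], [2.0, 7.0])

# Classification example: one-hot target for class 2, predicted probabilities q.
ce_value = cross_entropy([0, 1, 0], [0.2, 0.7, 0.1])

print(mse_value, ce_value)
```

With a one-hot target, the cross-entropy reduces to the negative log of the probability assigned to the true class.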
% Create a feedforward neural network with one hidden layer of 10 neurons
net = patternnet(10);
% Set the training, validation, and testing ratios
net.divideParam.trainRatio = 0.7;   % 70% for training
net.divideParam.valRatio = 0.15;    % 15% for validation
net.divideParam.testRatio = 0.15;   % 15% for testing
Theory:
The perceptron is one of the earliest and simplest artificial neural network
architectures. It was introduced by Frank Rosenblatt in 1957. Despite its simplicity, the
perceptron has been shown to solve certain classification problems successfully. It is
the simplest type of feedforward neural network, consisting of a single layer of input nodes
that are fully connected to a layer of output nodes. It can learn linearly separable patterns.
It uses a slightly different type of artificial neuron known as the threshold logic unit (TLU).
Types of Perceptron
• Input Features: The perceptron takes multiple input features, each input feature
represents a characteristic or attribute of the input data.
• Weights: Each input feature is associated with a weight, determining the significance
of each input feature in influencing the perceptron’s output. During training, these
weights are adjusted to learn the optimal values.
• Summation Function: The perceptron calculates the weighted sum of its inputs
using the summation function. The summation function combines the inputs with
their respective weights to produce a weighted sum.
• Activation Function: The weighted sum is then passed through an activation
function. The perceptron uses the Heaviside step function, which takes the summed
value as input, compares it with the threshold, and outputs 0 or 1.
• Output: The final output of the perceptron is determined by the activation function's
result. For example, in binary classification problems, the output might represent a
predicted class (0 or 1).
• Bias: A bias term is often included in the perceptron model. The bias allows the model
to make adjustments that are independent of the input. It is an additional parameter
that is learned during training.
• Learning Algorithm (Weight Update Rule): During training, the perceptron learns
by adjusting its weights and bias based on a learning algorithm. A common approach
is the perceptron learning algorithm, which updates weights based on the difference
between the predicted output and the true output.
A perceptron has a single layer of threshold logic units with each TLU connected to
all inputs.
The output of the fully connected layer can be written as:
f(X) = h(XW + b)
where X is the input, W is the weight for each input neuron, b is the bias, and h is
the step function.
During training, the perceptron's weights are adjusted to minimize the difference
between the predicted output and the actual output. Usually, supervised learning algorithms
like the delta rule or the perceptron learning rule are used for this.
2. Activation:
The weighted sum is passed through an activation function to produce an output:
3. Error Calculation:
The difference between predicted and actual outputs is computed to find the error
term: E = y − ŷ, where y is the actual output and ŷ is the predicted output.
4. Weight Update:
Weights are adjusted based on the error using a learning rate α:
wi = wi + α E xi
This step aims to minimize future errors by refining the model's parameters.
5. Iteration:
The process repeats over multiple epochs until convergence, meaning that weights
stabilize and further adjustments yield minimal changes in output accuracy.
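The training steps above can be sketched as a short Python routine. It is shown on the AND gate as an example problem; the zero initial weights, the learning rate of 1, the step function firing at a weighted sum of 0 or more, and the epoch cap are all assumptions made for this illustration.

```python
def step(z):
    """Heaviside step function (assumed to output 1 when z >= 0)."""
    return 1 if z >= 0 else 0

def train_perceptron(samples, alpha=1.0, max_epochs=100):
    """Perceptron learning rule: w_i = w_i + alpha * E * x_i, with E = y - y_hat."""
    n = len(samples[0][0])
    w = [0.0] * n
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in samples:
            y_hat = step(sum(wi * xi for wi, xi in zip(w, x)) + b)
            e = y - y_hat                      # error term
            if e != 0:
                mistakes += 1
                w = [wi + alpha * e * xi for wi, xi in zip(w, x)]
                b += alpha * e                 # bias treated as a weight on input 1
        if mistakes == 0:                      # converged: a full epoch with no errors
            break
    return w, b

and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_samples)
predictions = [
    step(sum(wi * xi for wi, xi in zip(w, x)) + b) for x, _ in and_samples
]
print(w, b, predictions)
```

Because the AND gate is linearly separable, the loop stops once an entire epoch produces no misclassification.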
Input:
Output:
Conclusion:
EXPERIMENT NO.: 5
Aim: To study the development of the ADALINE algorithm with bipolar inputs and outputs
Software requirement: SCILAB
Theory:
An Artificial Neural Network, inspired by the human nervous system, is a network used to
process data. It consists of three types of layers: the input layer, the hidden layer, and the
output layer. The most basic neural network contains only two layers, the input and output
layers. The layers are connected by weighted paths, which are used to compute the net input.
In this section, we discuss two basic types of neural networks: Adaline, which does not have
any hidden layer, and Madaline, which has one hidden layer.
1. Adaline (Adaptive Linear Neuron):
• A network with a single linear unit is called Adaline (Adaptive Linear Neuron). A unit
with a linear activation function is called a linear unit. In Adaline, there is only one
output unit, and output values are bipolar (+1, -1). Weights between the input units and
the output unit are adjustable. It uses the delta rule, i.e.
wi(new) = wi(old) + α(t − y_in)xi, where wi, y_in, and t are the weight, predicted
output, and true value respectively.
• The learning rule is derived to minimize the mean squared error between the activation and
the target values. Adaline consists of trainable weights; it compares the actual output with
the calculated output, and the training algorithm is applied based on the error.
Workflow:
First, calculate the net input to the Adaline network, then apply the activation function to
it, and compare the result with the target output. If the two are equal, give the output;
otherwise, send the error back to the network and update the weights according to the error,
as calculated by the delta learning rule, i.e. wi(new) = wi(old) + α(t − y_in)xi, where wi,
y_in, and t are the weight, predicted output, and true value respectively.
Architecture:
In Adaline, every input neuron is directly connected to the output neuron through a weighted
path. A bias b, whose input activation is fixed at 1, is also present.
Algorithm:
Step 1: Initialize the weights to small random values rather than zero. Set the learning rate α.
Step 2: While the stopping condition is false, do steps 3 to 7.
Step 3: For each training pair, perform steps 4 to 6.
Step 4: Set the activation of each input unit: xi = si for i = 1 to n.
Step 5: Compute the net input to the output unit: y_in = b + Σ xi wi (summed over i = 1 to n).
Here, b is the bias and n is the total number of input neurons.
Step 6: Update the weights and bias for i = 1 to n: wi(new) = wi(old) + α(t − y_in)xi and
b(new) = b(old) + α(t − y_in), and calculate the error E = (t − y_in)^2. When the predicted
output and the true value are the same, the weights do not change.
Step 7: Test the stopping condition. The stopping condition may be that the weight
changes have become very small or have stopped altogether.
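The algorithm can be sketched in plain Python as follows. As an illustrative training set it uses a bipolar OR gate with all weights and the bias initialized to 0.1, a learning rate of 0.1, and an error tolerance of 2; the function name `adaline_epoch` and the epoch cap are assumptions of this sketch.

```python
def adaline_epoch(samples, w, b, alpha):
    """One pass of the delta rule; returns updated weights, bias, total squared error."""
    total_error = 0.0
    for x, t in samples:
        y_in = b + sum(wi * xi for wi, xi in zip(w, x))    # net input (step 5)
        delta = t - y_in
        w = [wi + alpha * delta * xi for wi, xi in zip(w, x)]  # weight update (step 6)
        b = b + alpha * delta                               # bias update (step 6)
        total_error += delta ** 2
    return w, b, total_error

# Bipolar OR gate: ((x1, x2), target).
or_samples = [((1, 1), 1), ((1, -1), 1), ((-1, 1), 1), ((-1, -1), -1)]

w, b = [0.1, 0.1], 0.1
alpha, tolerance = 0.1, 2.0
errors = []
for _ in range(1000):            # hard cap so the loop always terminates
    w, b, total = adaline_epoch(or_samples, w, b, alpha)
    errors.append(total)
    if total <= tolerance:       # stopping condition (step 7)
        break

print(errors, w, b)
```

Each entry in `errors` is the total squared error of one epoch, so the list shows how the error falls as the weights adapt.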
Implementations
Problem: Design OR gate using Adaline Network?
Solution:
• Initially, all weights are assumed to be small random values, say 0.1, and the learning
rate is set to 0.1.
• Also, set the least squared error to 2.
• The weights will be updated as long as the total error is greater than the least squared error.
x1 x2 t
1 1 1
1 -1 1
-1 1 1
-1 -1 -1
This is epoch 1, where the total error is 0.49 + 0.69 + 0.83 + 1.01 = 3.02, so more epochs will
run until the total error becomes less than or equal to the least squared error, i.e. 2.
Python3
Error = [stop + 1]
# check the stop condition for the network
while Error[-1] > stop or Error[-1] - Error[-2] > 0.0001:
    error = []
    for i in range(Input.shape[0]):
        Y_input = sum(weight * Input[i]) + bias
Output:
Error : [2.33228319]
Error : [1.09355784]
Error : [0.73680883]
Error : [0.50913731]
Error : [0.35233593]
Error : [0.24384625]
Error : [0.16876305]
Error : [0.01283534]
Error : [0.00888318]
Error : [0.00614795]
Error : [0.00425492]
Error : [0.00294478]
Error : [0.00203805]
Error : [0.00141051]
Error : [0.0009762]
weight : [0.01081771 0.01081771 0.98675106]
Bias : [0.01081771]
Predictions:
Predict from the evaluated weight and bias of Adaline
Python3
output: [array([1.0192042]), array([0.99756877]), array([0.99756877]), array([-0.99756877])]
Conclusion:
EXPERIMENT NO.:6
Aim: Fuzzy Logic-Based PID Controller for Pressure Regulation in Gas Pipeline
Software: PYTHON
Theory:
Steps:
1. Problem Definition:
Pressure regulation in a gas pipeline system is a critical task. Traditional PID controllers
often struggle to maintain precise control due to varying demand, changes in gas flow,
and environmental conditions, resulting in oscillations or overshoot. A Fuzzy Logic
system will be used to tune the PID parameters in real-time to handle these complexities.
2. System Overview:
• Input: Error (difference between setpoint and current pressure), and rate of change
of error (derivative of error).
• Output: PID parameters (Kp, Ki, Kd) adjustment.
• Controller: A PID controller tuned by Fuzzy Logic.
a. Fuzzification:
Define the fuzzy sets for inputs (error, rate of change of error) and outputs (Kp, Ki,
Kd):
• Input 1: Pressure error (Pa): Negative Large (NL), Negative Medium (NM),
Negative Small (NS), Zero (Z), Positive Small (PS), Positive Medium (PM),
Positive Large (PL).
• Input 2: Rate of change of pressure error (Pa/s): Negative (N), Zero (Z), Positive
(P).
• Outputs: PID controller parameters (Kp, Ki, Kd):
Kp (Proportional gain): Small (S), Medium (M), Large (L).
Ki (Integral gain): Small (S), Medium (M), Large (L).
Rule Base:
Define a set of rules that adjust the PID gains based on the error and rate of change of
error:
e. Defuzzification:
Use the centroid method to defuzzify the fuzzy outputs and generate the crisp values of
Kp, Ki, and Kd that will be used by the PID controller.
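Centroid defuzzification can be sketched in plain Python as a weighted average over a sampled output universe. The Kp universe (0 to 10), the triangular set shapes, and the two rule firing strengths below are hypothetical values chosen only to illustrate the calculation.

```python
def trimf(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def centroid(xs, mus):
    """Centroid of a sampled fuzzy set: sum(x * mu) / sum(mu)."""
    total = sum(mus)
    if total == 0:
        raise ValueError("no rule fired, centroid is undefined")
    return sum(x * m for x, m in zip(xs, mus)) / total

xs = [i * 0.1 for i in range(101)]   # assumed Kp universe: 0.0 to 10.0

# Two hypothetical fired rules: "Kp is Medium" at strength 0.6 and
# "Kp is Large" at strength 0.3. Each set is clipped (min) and the
# clipped sets are combined (max) before defuzzification.
aggregated = [
    max(min(0.6, trimf(x, 2, 5, 8)), min(0.3, trimf(x, 5, 10, 15)))
    for x in xs
]

kp = centroid(xs, aggregated)
print(round(kp, 3))
```

The crisp Kp falls between the peaks of the two sets and is pulled toward the set whose rule fired more strongly.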
You can use a Neural Network to predict the optimal PID parameters by learning from
historical data. This would act as an adaptive mechanism to improve the fuzzy logic
system’s performance over time.
a. Simulation Tool:
Use MATLAB with the Fuzzy Logic Toolbox and Simulink to simulate the gas pipeline
pressure control system and the fuzzy-tuned PID controller.
b. Testing:
1. Create different test scenarios with varying pressure setpoints and external
disturbances.
2. Observe how the Fuzzy-PID system responds to changes, focusing on response
time, stability, and overshoot.
3. Compare the performance of the Fuzzy-PID controller with a traditional PID
controller.
c. Performance Metrics:
• Settling Time: Time taken to reach the steady state after a disturbance.
• Overshoot: Measure the maximum deviation from the setpoint.
• Steady-State Error: Difference between the desired pressure and actual pressure at
steady state.
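These three metrics can be computed from a recorded response as in the sketch below. The sample pressure trace is made up for illustration and is not a simulation result; the 2% settling band and the three-sample averaging window are assumed conventions.

```python
def overshoot(response, setpoint):
    """Largest excursion above the setpoint, as a percentage of the setpoint."""
    peak = max(response)
    return max(0.0, (peak - setpoint) / setpoint * 100.0)

def settling_time(times, response, setpoint, band=0.02):
    """First time after which the response stays within +/- band of the setpoint."""
    tol = band * abs(setpoint)
    settled_at = None
    for t, y in zip(times, response):
        if abs(y - setpoint) <= tol:
            if settled_at is None:
                settled_at = t
        else:
            settled_at = None      # left the band again, so reset
    return settled_at

def steady_state_error(response, setpoint, tail=3):
    """Setpoint minus the average of the last few samples."""
    return setpoint - sum(response[-tail:]) / tail

# Hypothetical pressure trace (illustrative only).
times = [0, 1, 2, 3, 4, 5, 6, 7]
pressure = [0.0, 60.0, 110.0, 104.0, 99.0, 100.5, 100.0, 100.0]
setpoint = 100.0

os_pct = overshoot(pressure, setpoint)
ts = settling_time(times, pressure, setpoint)
sse = steady_state_error(pressure, setpoint)
print(os_pct, ts, sse)
```

Running the same functions on the responses of the Fuzzy-PID and the traditional PID controller gives a like-for-like comparison.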
6. Results:
# Example input values for error and rate of change of error
error_input = 5.841275134629758
rate_of_error_input = 6.965029156946181
# Pass inputs to the Fuzzy Logic system
pid_simulation.input['error'] = error_input
pid_simulation.input['rate_of_error'] = rate_of_error_input
# Run the inference (assumes pid_simulation is a scikit-fuzzy ControlSystemSimulation)
pid_simulation.compute()
# Check if the output exists before accessing it
if 'kp' in pid_simulation.output and 'ki' in pid_simulation.output and 'kd' in pid_simulation.output:
    print(f"Kp: {pid_simulation.output['kp']:.2f}")
    print(f"Ki: {pid_simulation.output['ki']:.2f}")
    print(f"Kd: {pid_simulation.output['kd']:.2f}")
else:
    print("Fuzzy logic system did not produce an output for Kp, Ki, or Kd. Please check the rules or input values.")
Output:
Theory: A membership function is a function that specifies the degree to which a given input
belongs to a set. The output of a membership function is the degree of membership, also known
as the membership value or membership grade; this value always lies between 0 and 1.
Membership functions are used in the fuzzification and defuzzification steps of a fuzzy logic
system (FLS) to map non-fuzzy input values to fuzzy linguistic terms and vice versa.
That is, μA: X → [0, 1], where [0, 1] denotes the real numbers between 0 and 1 (including 0 and 1).
Consequently, a fuzzy set has a vague boundary, in contrast to a crisp set.
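As a minimal sketch, a triangular membership function maps a crisp value to a degree in [0, 1]. The fuzzy set "warm" and its breakpoints (15, 25, 35) are hypothetical values chosen for illustration.

```python
def triangular(x, a, b, c):
    """Triangular membership function: 0 outside (a, c), rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical fuzzy set "warm" over temperature in degrees Celsius.
for temp in (10, 20, 25, 30, 40):
    print(temp, triangular(temp, 15, 25, 35))
```

A temperature of 25 belongs fully to the set, 20 and 30 belong to it with degree 0.5, and 10 and 40 do not belong at all.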
Many classical relation properties can be extended to fuzzy relations with some
modifications:
• Fuzziness: Unlike traditional binary relations that are either true or false, fuzzy
relations assign degrees of membership ranging from 0 to 1. This allows for partial
truths and reflects the ambiguity inherent in many real-world situations.
• Symmetry: A fuzzy relation R is symmetric if, for any elements x and y, the
degree of relation R(x, y) is equal to R(y, x). This property
is important in contexts where the relationship is inherently bidirectional.
• Reflexivity: A fuzzy relation is reflexive if, for every element x, the degree of
relation R(x, x) is 1. This means that every element is related to itself to the fullest extent.
• Transitivity: A fuzzy relation is transitive if, for any elements x, y, and z, the
degree of relation R(x, z) is at least the minimum of R(x, y)
and R(y, z). This property helps to maintain consistency in relationships
across multiple elements.
5. Fuzzy Equivalence Relations:
A fuzzy relation is an equivalence relation if it satisfies reflexivity, symmetry, and
transitivity. Fuzzy equivalence relations are crucial for clustering and classification tasks
where elements are grouped based on their degree of similarity.
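The three properties can be checked mechanically for a relation on a small finite set. In the sketch below a relation is stored as a dictionary from (x, y) pairs to degrees, with missing pairs treated as degree 0; the helper names and the sample similarity values are illustrative assumptions.

```python
def is_reflexive(R, elements):
    """R(x, x) = 1 for every element x."""
    return all(R.get((x, x), 0.0) == 1.0 for x in elements)

def is_symmetric(R, elements):
    """R(x, y) = R(y, x) for every pair."""
    return all(
        R.get((x, y), 0.0) == R.get((y, x), 0.0)
        for x in elements for y in elements
    )

def is_transitive(R, elements):
    """R(x, z) >= min(R(x, y), R(y, z)) for every triple."""
    return all(
        R.get((x, z), 0.0) >= min(R.get((x, y), 0.0), R.get((y, z), 0.0))
        for x in elements for y in elements for z in elements
    )

def is_equivalence(R, elements):
    return (is_reflexive(R, elements)
            and is_symmetric(R, elements)
            and is_transitive(R, elements))

elements = ["a", "b", "c"]
similarity = {
    ("a", "a"): 1.0, ("b", "b"): 1.0, ("c", "c"): 1.0,
    ("a", "b"): 0.8, ("b", "a"): 0.8,
    ("b", "c"): 0.6, ("c", "b"): 0.6,
    ("a", "c"): 0.6, ("c", "a"): 0.6,
}
print(is_equivalence(similarity, elements))

# Lowering R(a, c) below min(R(a, b), R(b, c)) breaks transitivity.
broken = dict(similarity)
broken[("a", "c")] = broken[("c", "a")] = 0.3
print(is_transitive(broken, elements))
```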
7. Applications:
Decision-making
Control systems
Image processing
8. Algorithm:
• Step 2: Define fuzzy rules: Define the fuzzy rules based on the given conditions.
• Step 3: Use domains, sets, and rules: Use these three parts to model your system.
• Step 4: Use set operations: Combine sets with set operations like &, |, and ~.
• Step 5: Use rules to map domains: Use rules to map the input domain to the output
domain.
INPUT
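class FuzzySet:
    """A named fuzzy set; membership maps each element to a degree in [0, 1]."""
    def __init__(self, name, membership):
        self.name = name
        self.membership = membership

    def __repr__(self):
        return f"{self.name}: {self.membership}"


class FuzzyRelation:
    """A named fuzzy relation; matrix maps (x, y) pairs to a degree in [0, 1]."""
    # The constructor and get_degree are reconstructed from how they are used below.
    def __init__(self, relation_name, matrix):
        self.relation_name = relation_name
        self.matrix = matrix

    def get_degree(self, x, y):
        # Pairs that are not stored are treated as having degree 0.
        return self.matrix.get((x, y), 0)

    def __repr__(self):
        return f"{self.relation_name}: {self.matrix}"


def fuzzy_union(relation1, relation2):
    # Union: maximum of the two degrees for every pair
    result_matrix = {}
    for key in set(relation1.matrix.keys()).union(relation2.matrix.keys()):
        result_matrix[key] = max(relation1.get_degree(*key), relation2.get_degree(*key))
    return FuzzyRelation("Union", result_matrix)


def fuzzy_intersection(relation1, relation2):
    # Intersection: minimum of the two degrees for every pair
    result_matrix = {}
    for key in set(relation1.matrix.keys()).union(relation2.matrix.keys()):
        result_matrix[key] = min(relation1.get_degree(*key), relation2.get_degree(*key))
    return FuzzyRelation("Intersection", result_matrix)


def fuzzy_composition(relation1, relation2):
    # Max-min composition: for each (x, y), take the best path through a shared z
    result_matrix = {}
    for (x, z) in relation1.matrix.keys():
        for (z2, y) in relation2.matrix.keys():
            if z == z2:
                degree = min(relation1.get_degree(x, z), relation2.get_degree(z, y))
                if (x, y) in result_matrix:
                    result_matrix[(x, y)] = max(result_matrix[(x, y)], degree)
                else:
                    result_matrix[(x, y)] = degree
    return FuzzyRelation("Composition", result_matrix)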
# Print results
print(A)
print(B)
print(R1)
print(R2)
print(union_relation)
Output
CONCLUSION: