Implementing the backpropagation algorithm for a neural network in Python
from scratch involves defining the network architecture, implementing
forward propagation, calculating the error, and then performing backward
propagation to update the weights.
Here is a simplified example demonstrating the core concepts using
NumPy:
import numpy as np

class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size, learning_rate=0.1):
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.learning_rate = learning_rate
        # Initialize weights and biases
        self.weights_input_hidden = np.random.randn(self.input_size, self.hidden_size)
        self.bias_hidden = np.zeros((1, self.hidden_size))
        self.weights_hidden_output = np.random.randn(self.hidden_size, self.output_size)
        self.bias_output = np.zeros((1, self.output_size))

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(self, x):
        return x * (1 - x)

    def forward_propagation(self, X):
        # Input to hidden layer
        self.hidden_layer_input = np.dot(X, self.weights_input_hidden) + self.bias_hidden
        self.hidden_layer_output = self.sigmoid(self.hidden_layer_input)
        # Hidden to output layer
        self.output_layer_input = np.dot(self.hidden_layer_output, self.weights_hidden_output) + self.bias_output
        self.predicted_output = self.sigmoid(self.output_layer_input)
        return self.predicted_output

    def backward_propagation(self, X, y, predicted_output):
        # Calculate output layer error and delta
        error_output = y - predicted_output
        delta_output = error_output * self.sigmoid_derivative(predicted_output)
        # Calculate hidden layer error and delta
        error_hidden = np.dot(delta_output, self.weights_hidden_output.T)
        delta_hidden = error_hidden * self.sigmoid_derivative(self.hidden_layer_output)
        # Update weights and biases
        self.weights_hidden_output += np.dot(self.hidden_layer_output.T, delta_output) * self.learning_rate
        self.bias_output += np.sum(delta_output, axis=0, keepdims=True) * self.learning_rate
        self.weights_input_hidden += np.dot(X.T, delta_hidden) * self.learning_rate
        self.bias_hidden += np.sum(delta_hidden, axis=0, keepdims=True) * self.learning_rate

    def train(self, X, y, epochs):
        for epoch in range(epochs):
            predicted_output = self.forward_propagation(X)
            self.backward_propagation(X, y, predicted_output)
            # Optional: Print error every few epochs
            if epoch % 100 == 0:
                loss = np.mean(np.square(y - predicted_output))
                print(f"Epoch {epoch}, Loss: {loss}")

# Example Usage (XOR Problem)
if __name__ == "__main__":
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([[0], [1], [1], [0]])

    nn = NeuralNetwork(input_size=2, hidden_size=2, output_size=1, learning_rate=0.1)
    nn.train(X, y, epochs=5000)

    print("\nPredictions after training:")
    for i in range(len(X)):
        prediction = nn.forward_propagation(X[i])
        print(f"Input: {X[i]}, Predicted Output: {prediction.round()}")
Implement Backpropagation in Python.
Backpropagation in Python – A Quick Guide
Sometimes you need to improve the accuracy of your neural network model, and backpropagation is exactly what helps you achieve it. The backpropagation algorithm helps your neural network model make good predictions. In this article, we will learn about the backpropagation algorithm in detail and also how to implement it in Python.
What is backpropagation and why is it necessary?
The backpropagation algorithm is a supervised learning algorithm for artificial neural networks in which we fine-tune the weights to improve the accuracy of the model. It employs the gradient descent method to reduce the cost function, i.e. the mean squared error between the predicted and the actual outputs. This type of algorithm is generally used for training feed-forward neural networks on data whose classifications are known to us.
You can also think of backward propagation as the backward spread of errors in order to achieve higher accuracy. If the prediction we receive from a neural network model differs greatly from the actual output, we apply the backpropagation algorithm to achieve higher accuracy.
Note: Feed-forward neural networks are generally multi-layered
neural networks (MLN). The data travels from the input layer to the
hidden layer to the output layer.
How Does Backpropagation in Python Work?
Now let's build some intuition about how the algorithm actually works. There are mainly three kinds of layers in a backpropagation model: the input layer, the hidden layer, and the output layer. Following are the main steps of the algorithm (a small numerical sketch follows the list):
Step 1: The input layer receives the input.
Step 2: The input is multiplied by the weights and passed on to the hidden layer.
Step 3: Each hidden layer processes its input and passes the result forward until the output layer produces a prediction. The difference between this prediction and the desired output is the error.
Step 4: The algorithm then moves back through the hidden layers to optimize the weights and reduce the error.
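To make these steps concrete, here is a minimal, illustrative sketch (not part of the model built later in this article) of one forward pass, one error calculation, and one gradient-descent weight update for a single sigmoid neuron with a single weight w and no bias; all numbers are chosen only for illustration:

import numpy as np

# Hypothetical values: one input x, target t, initial weight w
x, t = 1.5, 1.0
w = 0.2
learning_rate = 0.5

# Steps 1-2: forward pass through one sigmoid unit
z = w * x
y = 1 / (1 + np.exp(-z))             # predicted output

# Step 3: error (squared-error loss)
loss = 0.5 * (t - y) ** 2

# Step 4: backward pass - the chain rule gives dloss/dw, then gradient descent
grad_w = (y - t) * y * (1 - y) * x
w = w - learning_rate * grad_w       # updated weight, which slightly reduces the loss
print(f"loss={loss:.4f}, updated w={w:.4f}")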
Types of Backpropagation in Python
There are mainly two types of backpropagation methods: static backpropagation and recurrent backpropagation. Let's look at what each of the two types actually means. In static backpropagation, static inputs generate static outputs. This is used for static classification problems such as optical character recognition. Recurrent backpropagation, on the other hand, keeps running until the activations reach a definite or threshold value; once that fixed value is reached, the error is propagated backward.
Implementing Backpropagation in Python
Let’s see how we can implement Backpropagation in Python in a
step-by-step manner. First of all, we need to import all the
necessary libraries.
1. Import Libraries
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
Now let’s look at what dataset we will be working with.
2. Load the Dataset
We will be working with a very simple dataset today, the iris dataset. We will load it using the load_iris() function, which is part of the scikit-learn library. The dataset consists of three classes, and we will divide it into features and a target variable.
# Loading dataset
data = load_iris()

# Dividing the dataset into target variable and features
X = data.data
y = data.target
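As a quick, optional sanity check (my addition, not part of the original walkthrough), you can print the shapes of the features and target to confirm what was loaded:

# 150 samples with 4 features each, and integer class labels 0, 1 and 2
print(X.shape)        # (150, 4)
print(y.shape)        # (150,)
print(np.unique(y))   # [0 1 2]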
3. Split the Dataset into Training and Test Sets
Now we will split the dataset into training and test sets using the train_test_split() function. We pass it the features, the target, the size of the test set (an integer test_size of 20 means 20 samples are held out), and a random_state for reproducibility.
# Split dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=20, random_state=4)
In the next step, we initialize the hyperparameters: the learning rate, the number of iterations, the input size, the hidden layer size, and the output layer size (one unit per class).
learning_rate = 0.1
iterations = 5000
N = y_train.size

# Input features
input_size = 4
# Hidden layer size
hidden_size = 2
# Output layer size (number of classes)
output_size = 3
4. Initialize Weights
np.random.seed(10)

# Hidden layer weights
W1 = np.random.normal(scale=0.5, size=(input_size, hidden_size))
# Output layer weights
W2 = np.random.normal(scale=0.5, size=(hidden_size, output_size))
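Note that this implementation trains only the two weight matrices and omits bias terms. If you wanted biases as well, a minimal sketch (an assumption on my part, not part of the original article) would initialize one bias vector per layer and add it in the forward pass:

# Hypothetical bias vectors (one row each, broadcast across all samples)
b1 = np.zeros((1, hidden_size))
b2 = np.zeros((1, output_size))
# The forward pass below would then become, for example:
#   Z1 = np.dot(X_train, W1) + b1
#   Z2 = np.dot(A1, W2) + b2
# and each bias would be updated with the column-sum of its layer's delta,
# scaled by the learning rate, just like the weight updates.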
Now we will create helper functions such as mean squared error,
accuracy and sigmoid.
# Helper functions
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def mean_squared_error(y_pred, y_true):
    # One-hot encode y_true (i.e., convert [0, 1, 2] into [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
    y_true_one_hot = np.eye(output_size)[y_true]
    # Reshape y_true_one_hot to match y_pred shape
    y_true_reshaped = y_true_one_hot.reshape(y_pred.shape)
    # Compute the mean squared error between y_pred and y_true_reshaped
    error = ((y_pred - y_true_reshaped) ** 2).sum() / (2 * y_pred.size)
    return error

def accuracy(y_pred, y_true):
    acc = y_pred.argmax(axis=1) == y_true.argmax(axis=1)
    return acc.mean()

results = pd.DataFrame(columns=["mse", "accuracy"])
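As a small illustration of how these helpers behave (a toy example I am adding, not from the article), accuracy() expects both arguments in one-hot-like form and compares them by the position of their largest entry:

y_true_demo = np.eye(3)[np.array([0, 2])]      # one-hot labels for classes 0 and 2
y_pred_demo = np.array([[0.8, 0.1, 0.1],       # argmax 0 -> correct
                        [0.2, 0.5, 0.3]])      # argmax 1 -> wrong (true class is 2)
print(accuracy(y_pred_demo, y_true_demo))      # 0.5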
Now we will start building our backpropagation model.
5. Building the Backpropagation Model in Python
We will create a for loop for the given number of iterations and update the weights in each iteration. The model will go through three phases: feedforward propagation, error calculation, and backpropagation.
# Training loop
for itr in range(iterations):
    # Feedforward propagation
    Z1 = np.dot(X_train, W1)
    A1 = sigmoid(Z1)
    Z2 = np.dot(A1, W2)
    A2 = sigmoid(Z2)

    # Calculate error
    mse = mean_squared_error(A2, y_train)
    acc = accuracy(np.eye(output_size)[y_train], A2)
    new_row = pd.DataFrame({"mse": [mse], "accuracy": [acc]})
    results = pd.concat([results, new_row], ignore_index=True)

    # Backpropagation (despite the names, E1/dW1 belong to the output layer
    # and E2/dW2 to the hidden layer)
    E1 = A2 - np.eye(output_size)[y_train]
    dW1 = E1 * A2 * (1 - A2)
    E2 = np.dot(dW1, W2.T)
    dW2 = E2 * A1 * (1 - A1)

    # Update weights
    W2_update = np.dot(A1.T, dW1) / N
    W1_update = np.dot(X_train.T, dW2) / N
    W2 = W2 - learning_rate * W2_update
    W1 = W1 - learning_rate * W1_update
Now we will plot the mean squared error and accuracy using the
pandas plot() function.
results.mse.plot(title="Mean Squared Error")
plt.show()
results.accuracy.plot(title="Accuracy")
plt.show()
Accuracy and Mean squared error plot.
Now we will calculate the accuracy of the model.
# Test the model
Z1 = np.dot(X_test, W1)
A1 = sigmoid(Z1)
Z2 = np.dot(A1, W2)
A2 = sigmoid(Z2)
test_acc = accuracy(np.eye(output_size)[y_test], A2)
print("Test accuracy: {}".format(test_acc))
Output:
Test accuracy: 0.95
You can see that the model reaches a test accuracy of about 95% on the held-out set.
Putting It All Together
To make things easier, here's the complete code for performing backpropagation.
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
# Loading dataset
data = load_iris()
X = data.data
y = data.target
# Split dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=20, random_state=4)
# Hyperparameters
learning_rate = 0.1
iterations = 5000
N = y_train.size
input_size = 4
hidden_size = 2
output_size = 3
np.random.seed(10)
W1 = np.random.normal(scale=0.5, size=(input_size, hidden_size))
W2 = np.random.normal(scale=0.5, size=(hidden_size, output_size))
# Helper functions
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def mean_squared_error(y_pred, y_true):
    # One-hot encode y_true (i.e., convert [0, 1, 2] into [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
    y_true_one_hot = np.eye(output_size)[y_true]
    # Reshape y_true_one_hot to match y_pred shape
    y_true_reshaped = y_true_one_hot.reshape(y_pred.shape)
    # Compute the mean squared error between y_pred and y_true_reshaped
    error = ((y_pred - y_true_reshaped) ** 2).sum() / (2 * y_pred.size)
    return error

def accuracy(y_pred, y_true):
    acc = y_pred.argmax(axis=1) == y_true.argmax(axis=1)
    return acc.mean()
results = pd.DataFrame(columns=["mse", "accuracy"])
# Training loop
for itr in range(iterations):
    # Feedforward propagation
    Z1 = np.dot(X_train, W1)
    A1 = sigmoid(Z1)
    Z2 = np.dot(A1, W2)
    A2 = sigmoid(Z2)

    # Calculate error
    mse = mean_squared_error(A2, y_train)
    acc = accuracy(np.eye(output_size)[y_train], A2)
    new_row = pd.DataFrame({"mse": [mse], "accuracy": [acc]})
    results = pd.concat([results, new_row], ignore_index=True)

    # Backpropagation
    E1 = A2 - np.eye(output_size)[y_train]
    dW1 = E1 * A2 * (1 - A2)
    E2 = np.dot(dW1, W2.T)
    dW2 = E2 * A1 * (1 - A1)

    # Update weights
    W2_update = np.dot(A1.T, dW1) / N
    W1_update = np.dot(X_train.T, dW2) / N
    W2 = W2 - learning_rate * W2_update
    W1 = W1 - learning_rate * W1_update
# Visualizing the results
results.mse.plot(title="Mean Squared Error")
plt.show()
results.accuracy.plot(title="Accuracy")
plt.show()
# Test the model
Z1 = np.dot(X_test, W1)
A1 = sigmoid(Z1)
Z2 = np.dot(A1, W2)
A2 = sigmoid(Z2)
test_acc = accuracy(np.eye(output_size)[y_test], A2)
print("Test accuracy: {}".format(test_acc))
Output:
Test accuracy: 0.95
Accuracy and Mean squared error plot.
Advantages of Backpropagation in Python
It is a relatively fast and simple algorithm to implement, and it is used extensively in fields such as face recognition and speech recognition. Moreover, it is a flexible method, as no prior knowledge of the network is needed.
Disadvantages of Backpropagation
The algorithm does not work well with noisy and irregular data, and the performance of backpropagation depends heavily on the input.
Conclusion
We learnt that backpropagation is a great way to improve the accuracy of a feed-forward neural network model. It is a fairly simple and flexible algorithm, but it does not work well with noisy data. It is a great way to reduce the error and improve the accuracy of the model. It optimizes the weights by going backwards, minimizing the loss function with the help of gradient descent.
Write a program to implement Max-Min Composition and Max-Product Composition.
import numpy as np
def max_min_composition(R, S):
    """
    Performs Max-Min Composition of two fuzzy relations R and S.

    Args:
        R (np.ndarray): The first fuzzy relation matrix.
        S (np.ndarray): The second fuzzy relation matrix.

    Returns:
        np.ndarray: The resulting fuzzy relation matrix from Max-Min composition.

    Raises:
        ValueError: If the inner dimensions of the matrices do not match.
    """
    if R.shape[1] != S.shape[0]:
        raise ValueError("Inner dimensions of matrices must match for composition.")

    rows_R = R.shape[0]
    cols_S = S.shape[1]
    inner_dim = R.shape[1]
    result = np.zeros((rows_R, cols_S))

    for i in range(rows_R):
        for j in range(cols_S):
            min_values = []
            for k in range(inner_dim):
                min_values.append(min(R[i, k], S[k, j]))
            result[i, j] = max(min_values)
    return result
def max_product_composition(R, S):
    """
    Performs Max-Product Composition of two fuzzy relations R and S.

    Args:
        R (np.ndarray): The first fuzzy relation matrix.
        S (np.ndarray): The second fuzzy relation matrix.

    Returns:
        np.ndarray: The resulting fuzzy relation matrix from Max-Product composition.

    Raises:
        ValueError: If the inner dimensions of the matrices do not match.
    """
    if R.shape[1] != S.shape[0]:
        raise ValueError("Inner dimensions of matrices must match for composition.")

    rows_R = R.shape[0]
    cols_S = S.shape[1]
    inner_dim = R.shape[1]
    result = np.zeros((rows_R, cols_S))

    for i in range(rows_R):
        for j in range(cols_S):
            product_values = []
            for k in range(inner_dim):
                product_values.append(R[i, k] * S[k, j])
            result[i, j] = max(product_values)
    return result
if __name__ == "__main__":
    # Example Usage:
    R = np.array([[0.7, 0.6],
                  [0.8, 0.3]])
    S = np.array([[0.8, 0.5, 0.4],
                  [0.1, 0.6, 0.7]])

    print("Fuzzy Relation R:\n", R)
    print("Fuzzy Relation S:\n", S)

    # Max-Min Composition
    try:
        max_min_res = max_min_composition(R, S)
        print("\nMax-Min Composition (R o S):\n", max_min_res)
    except ValueError as e:
        print(f"\nError in Max-Min Composition: {e}")

    # Max-Product Composition
    try:
        max_product_res = max_product_composition(R, S)
        print("\nMax-Product Composition (R . S):\n", max_product_res)
    except ValueError as e:
        print(f"\nError in Max-Product Composition: {e}")

    # Example with incompatible dimensions
    R_incompatible = np.array([[0.1, 0.2]])
    S_incompatible = np.array([[0.3, 0.4],
                               [0.5, 0.6]])
    try:
        max_min_composition(R_incompatible, S_incompatible)
    except ValueError as e:
        print(f"\nCaught expected error for incompatible dimensions: {e}")
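As an aside, the explicit loops above can also be vectorized with NumPy broadcasting. The following is a minimal sketch (my own addition, assumed equivalent for valid inputs) of both compositions; for the example relations above, the (0, 0) entry of the max-min result works out to max(min(0.7, 0.8), min(0.6, 0.1)) = 0.7 either way:

def max_min_composition_vec(R, S):
    # Broadcast (m, n, 1) against (1, n, p): element-wise min over the shared
    # index k, then take the maximum over k (axis 1)
    return np.minimum(R[:, :, None], S[None, :, :]).max(axis=1)

def max_product_composition_vec(R, S):
    # Same idea, but with element-wise products instead of minima
    return (R[:, :, None] * S[None, :, :]).max(axis=1)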