
Backpropagation

Introduction
• Backpropagation is a fundamental algorithm
for training artificial neural networks,
especially feedforward neural networks.
• It's a supervised learning algorithm used to
adjust the weights and biases of a neural
network in order to minimize the error
between the predicted and actual outputs.
• Backpropagation is the key technique behind
the success of deep learning.
Backpropagation
• Backpropagation is used in the context of
artificial neural networks, which consist of
layers of interconnected artificial neurons.
• The neural network typically consists of an
input layer, one or more hidden layers, and an
output layer.
• Each neuron in the network processes input
data, applies an activation function, and
produces an output.
Forward Pass
• In the forward pass, input data is propagated
through the network to make predictions.
• Each neuron computes a weighted sum of its
inputs and applies an activation function to
produce an output.
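As an illustrative sketch of the step described above (not part of the original slides), a single neuron's computation can be written in a few lines of NumPy; the values of inputs, weights, and bias below are placeholders chosen for the example:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# One neuron: weighted sum of its inputs plus a bias, passed through the activation
inputs = np.array([0.4, 0.7])      # example inputs (assumed values)
weights = np.array([0.2, -0.3])    # example weights (assumed values)
bias = 0.1
z = np.dot(weights, inputs) + bias  # weighted sum
a = sigmoid(z)                      # neuron output
print(a)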
Loss or Cost Function
• A loss or cost function measures the error
between the predicted output and the actual
target.
• Common cost functions include Mean Squared
Error (MSE) for regression tasks and Cross-
Entropy for classification tasks.
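For reference, here is a minimal sketch of both cost functions in NumPy (assuming y_true holds the targets and y_pred the predictions; these helper names are not from the original slides):

import numpy as np

def mse(y_true, y_pred):
    # Mean Squared Error: average of the squared differences
    return np.mean((y_true - y_pred) ** 2)

def cross_entropy(y_true, y_pred, eps=1e-12):
    # Binary cross-entropy; eps guards against log(0)
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))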
Backward Pass (Backpropagation)
• The goal of backpropagation is to minimize the
loss function by adjusting the weights and
biases in the network.
• The gradient of the loss with respect to every weight is computed by propagating the error backward, layer by layer, from the output toward the input, and this process is repeated iteratively.
Error Calculation
• The error is calculated for the output layer by
taking the derivative of the loss function with
respect to the predicted output.
• For regression problems:
δ_output = (Target - Predicted Output) *
f'(weighted sum)
• For classification problems with Cross-Entropy
loss:
δ_output = Target - Predicted Output
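A short sketch of the output-layer delta under the slide's (Target - Predicted) sign convention, assuming a sigmoid output neuron trained with MSE (the function name is illustrative):

def output_delta(target, a_out):
    # (Target - Predicted) * f'(z), where f'(z) = a_out * (1 - a_out) for a sigmoid
    return (target - a_out) * a_out * (1 - a_out)

# With cross-entropy and a sigmoid (or softmax) output, the f'(z) factor cancels
# and the delta reduces to (target - a_out).
print(output_delta(0.0, 0.630))  # ≈ -0.147, as in the worked example later in these slides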
Propagating Error Backward
• The error is then propagated backward
through the network.
• For each hidden neuron, the error δ is obtained by summing the δ values of the neurons it feeds into, weighted by the connecting weights, and multiplying by the derivative of the activation function at that neuron's weighted sum.
• This δ is then used to adjust the neuron's
weights and bias.
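Sketched in vectorized form (assuming W_next holds the weights from this layer to the next layer and delta_next holds the next layer's deltas; these names are assumptions, not from the original slides):

import numpy as np

def hidden_delta(delta_next, W_next, a_hidden):
    # Sum the downstream deltas weighted by the connecting weights,
    # then multiply by the sigmoid derivative at this layer's outputs
    return delta_next.dot(W_next.T) * a_hidden * (1 - a_hidden)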
Weight and Bias Update
• The weights and biases of each neuron are
updated to minimize the error.
• With δ defined as above ((Target - Predicted) * f'), this gradient-descent step takes the form of the delta rule (see the sketch below):
New Weight = Old Weight + Learning Rate * δ * Input
New Bias = Old Bias + Learning Rate * δ
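Sketched as code, one delta-rule update for a single neuron looks roughly like this (the function name and arguments are illustrative):

def update(weight, bias, delta, neuron_input, learning_rate=0.1):
    # Move each parameter in the direction that reduces the error
    new_weight = weight + learning_rate * delta * neuron_input
    new_bias = bias + learning_rate * delta
    return new_weight, new_bias

# e.g. update(0.5, 0.3, -0.147, 0.545) gives roughly (0.492, 0.285),
# matching the worked example later in these slides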
Iterative Process
• The entire process of forward and backward
passes is repeated iteratively for a defined
number of epochs.
• The learning rate, which controls the step size
in weight updates, is an important
hyperparameter.
Convergence
• Training continues until the loss converges to a
minimum, indicating that the network has
learned the underlying patterns in the data.
Testing and Prediction
• After training, the network can be used to
make predictions on new, unseen data by
performing a forward pass with the learned
weights and biases.
Regularization and Optimization
• To prevent overfitting, techniques like L1 and L2
regularization are often used.
• Variants of gradient descent, like Adam and RMSprop,
are used to improve training efficiency and convergence.
• Backpropagation is a cornerstone of training deep neural
networks and is responsible for the success of deep
learning in various applications, including image
recognition, natural language processing, and more.
• It allows the network to learn complex patterns and
relationships in data through an iterative optimization
process.
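As a rough illustration (not from the original slides), L2 regularization can be folded into the delta-rule update as a weight-decay term; the names and the value of lam below are assumptions:

def update_with_l2(weight, delta, neuron_input, learning_rate=0.1, lam=0.01):
    # L2 adds lam/2 * weight^2 to the loss, which shows up in the update
    # as an extra -lam * weight (weight decay) term
    return weight + learning_rate * (delta * neuron_input - lam * weight)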
Example
• Problem Statement: We want to train a neural
network to classify whether an input (X) is greater
than 0.5 (Class 1) or less than or equal to 0.5 (Class 0).
• Network Architecture:
– Input Layer: 1 neuron
– Hidden Layer: 2 neurons
– Output Layer: 1 neuron
– Activation Function: Sigmoid
– Loss Function: Mean Squared Error (MSE)
– Learning Rate (η): 0.1
Training Data
– Input (X): 0.4
– Target (y): 0 (because 0.4 is less than or equal to
0.5)
Forward Pass
• Initialize weights and biases with random
values.
– Weights: w1 = 0.2, w2 = -0.3 (input to hidden neurons 1 and 2), w3 = 0.5, w4 = -0.1 (hidden neurons 1 and 2 to the output)
– Biases: b1 = 0.1, b2 = -0.2, b3 = 0.3
• Calculate the input to the first hidden neuron
(z1):
z1 = (w1 * X) + b1 = (0.2 * 0.4) + 0.1 = 0.18
• Calculate the output of the first hidden neuron (a1) using the sigmoid activation function:
a1 = 1 / (1 + e^(-z1)) = 1 / (1 + e^(-0.18)) ≈ 0.545
• Similarly, calculate the input and output of the second hidden neuron (z2 and a2):
z2 = (w2 * X) + b2 = (-0.3 * 0.4) - 0.2 = -0.32
a2 = 1 / (1 + e^(-z2)) = 1 / (1 + e^(0.32)) ≈ 0.421
• Calculate the input to the output neuron (z3):
z3 = (w3 * a1) + (w4 * a2) + b3 = (0.5 * 0.545) + (-0.1 * 0.421) + 0.3 ≈ 0.530
• Calculate the output of the output neuron (a3):
a3 = 1 / (1 + e^(-z3)) = 1 / (1 + e^(-0.530)) ≈ 0.630
Error Calculation
• Calculate the error (E) between the predicted
output (a3) and the target (y):

E = y - a3 = 0 - 0.630 = -0.630
Backward Pass
• Calculate the delta (δ) for the output neuron:
δ3 = E * f'(z3) = -0.630 * (a3 * (1 - a3)) = -0.630 * (0.630 * 0.370) ≈ -0.147
• Calculate the deltas for the hidden neurons:
δ1 = (w3 * δ3) * f'(z1) = (0.5 * -0.147) * (0.545 * (1 - 0.545)) ≈ -0.018
δ2 = (w4 * δ3) * f'(z2) = (-0.1 * -0.147) * (0.421 * (1 - 0.421)) ≈ 0.004
Weight and Bias Updates
• Update the weights and biases using the delta
rule and learning rate (η):
• For the output neuron:
w3 = w3 + η * δ3 * a1 ≈ 0.5 + 0.1 * (-0.147) * 0.545 ≈ 0.492
w4 = w4 + η * δ3 * a2 ≈ -0.1 + 0.1 * (-0.147) * 0.421 ≈ -0.106
b3 = b3 + η * δ3 ≈ 0.3 + 0.1 * (-0.147) ≈ 0.285
• For the hidden neurons:
w1 = w1 + η * δ1 * X ≈ 0.2 + 0.1 * (-0.018) * 0.4 ≈ 0.199
w2 = w2 + η * δ2 * X ≈ -0.3 + 0.1 * 0.004 * 0.4 ≈ -0.300
b1 = b1 + η * δ1 ≈ 0.1 + 0.1 * (-0.018) ≈ 0.098
b2 = b2 + η * δ2 ≈ -0.2 + 0.1 * 0.004 ≈ -0.200
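To double-check the arithmetic above, here is a small self-contained script (an illustrative addition, not part of the original slides) that carries out this single training step with the same starting values:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X, y, eta = 0.4, 0.0, 0.1
w1, w2, w3, w4 = 0.2, -0.3, 0.5, -0.1
b1, b2, b3 = 0.1, -0.2, 0.3

# Forward pass
z1 = w1 * X + b1; a1 = sigmoid(z1)              # z1 = 0.18,  a1 ≈ 0.545
z2 = w2 * X + b2; a2 = sigmoid(z2)              # z2 = -0.32, a2 ≈ 0.421
z3 = w3 * a1 + w4 * a2 + b3; a3 = sigmoid(z3)   # z3 ≈ 0.530, a3 ≈ 0.630

# Backward pass (Target - Predicted convention)
E = y - a3                                      # ≈ -0.630
d3 = E * a3 * (1 - a3)                          # ≈ -0.147
d1 = (w3 * d3) * a1 * (1 - a1)                  # ≈ -0.018
d2 = (w4 * d3) * a2 * (1 - a2)                  # ≈ 0.004

# Weight and bias updates
w3 += eta * d3 * a1; w4 += eta * d3 * a2; b3 += eta * d3
w1 += eta * d1 * X;  w2 += eta * d2 * X
b1 += eta * d1;      b2 += eta * d2
print(round(w3, 3), round(w4, 3), round(b3, 3))  # ≈ 0.492 -0.106 0.285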
End of One Training Iteration
• Repeat the forward and backward pass for more
training iterations until the loss converges.
• This is a simplified example of backpropagation for
a small neural network.
• In practice, networks have many more neurons and
layers, and training occurs over numerous
iterations.
• Additionally, this example assumes a single training
sample, whereas real-world applications involve
entire datasets.
Implementation of Backpropagation in Python
import numpy as np

# Define the sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Define the derivative of the sigmoid function
def sigmoid_derivative(x):
    return x * (1 - x)

# Neural network class
class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        # Initialize weights with random values
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.learning_rate = 0.1

        self.weights_input_hidden = np.random.rand(input_size, hidden_size)
        self.bias_hidden = np.zeros((1, hidden_size))
        self.weights_hidden_output = np.random.rand(hidden_size, output_size)
        self.bias_output = np.zeros((1, output_size))

    def forward(self, X):
        # Forward pass
        self.hidden_input = np.dot(X, self.weights_input_hidden) + self.bias_hidden
        self.hidden_output = sigmoid(self.hidden_input)
        self.output = sigmoid(np.dot(self.hidden_output, self.weights_hidden_output) + self.bias_output)

    def backward(self, X, y):
        # Backpropagation
        self.output_error = y - self.output
        self.output_delta = self.output_error * sigmoid_derivative(self.output)

        self.hidden_error = self.output_delta.dot(self.weights_hidden_output.T)
        self.hidden_delta = self.hidden_error * sigmoid_derivative(self.hidden_output)

        # Update weights and biases
        self.weights_hidden_output += self.hidden_output.T.dot(self.output_delta) * self.learning_rate
        self.bias_output += np.sum(self.output_delta, axis=0, keepdims=True) * self.learning_rate
        self.weights_input_hidden += X.T.dot(self.hidden_delta) * self.learning_rate
        self.bias_hidden += np.sum(self.hidden_delta, axis=0, keepdims=True) * self.learning_rate

    def train(self, X, y, epochs):
        for _ in range(epochs):
            self.forward(X)
            self.backward(X, y)

    def predict(self, X):
        self.forward(X)
        return self.output

# Sample dataset for XOR gate
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

# Create and train the neural network
input_size = 2
hidden_size = 4
output_size = 1
nn = NeuralNetwork(input_size, hidden_size, output_size)
nn.train(X, y, epochs=10000)

# Make predictions
predictions = nn.predict(X)
print("Predictions:")
print(predictions)
Output
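The exact numbers printed depend on the random weight initialization, but after 10,000 epochs the predictions are typically close to the XOR targets: values near 0 for inputs [0, 0] and [1, 1], and values near 1 for [0, 1] and [1, 0].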
