
Experiment 2.4

Aim: Implement a backpropagation algorithm to train a DNN with at least 2 hidden layers.

Theory: 1. Deep Neural Networks (DNNs)


A Deep Neural Network (DNN) is an artificial neural network made up of multiple layers of
nodes (neurons) connected by weighted links. DNNs are used for tasks such as classification
and regression.

Key Concepts:

• Neurons & Layers: Each neuron receives inputs, processes them, and passes the output to
the next layer. DNNs typically consist of an input layer, one or more hidden layers, and an
output layer.
• Activation Functions: Functions that introduce non-linearity into the model. Common
activation functions include:
o ReLU (Rectified Linear Unit): f(x) = max(0, x), which allows only positive
values to pass and is computationally efficient.
o Softmax: Used in the output layer for multi-class classification, converting raw
scores into probabilities for each class.
• Forward Propagation: The process of passing inputs through the network, layer by layer,
to obtain outputs (see the sketch after this list).
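
As an illustration of forward propagation (separate from the experiment code below), a single pass through one hidden layer and a softmax output can be sketched in NumPy; the shapes and values here are invented for the example:

import numpy as np

x = np.array([[1.0, 2.0]])                       # 1 sample, 2 input features
W1 = np.array([[0.3, -0.2, 0.5],
               [0.1, 0.4, -0.3]])                # 2 inputs -> 3 hidden units
b1 = np.zeros(3)
h = np.maximum(x @ W1 + b1, 0)                   # ReLU keeps only positive pre-activations

W2 = np.array([[0.2, -0.1],
               [0.3, 0.3],
               [-0.4, 0.5]])                     # 3 hidden units -> 2 output classes
logits = h @ W2
probs = np.exp(logits) / np.exp(logits).sum()    # softmax turns raw scores into probabilities
print(h, probs)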

2. Loss Functions
Loss functions measure how well the neural network's predictions match the actual target values.
The goal of training is to minimize this loss.

• Cross-Entropy Loss: Commonly used for classification tasks. It measures the performance
of a model whose output is a probability value between 0 and 1. Mathematically for a single
instance:

Loss = −∑_{i=1}^{C} y_i log(p_i)

where y_i is the true label (one-hot encoded), p_i is the predicted probability for class i, and C is the number of classes.
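
As a quick check of this formula, the loss for one instance can be computed directly in NumPy (the probabilities below are invented for illustration):

import numpy as np

y_true = np.array([0, 1, 0])             # one-hot label: the instance belongs to class 2
p_pred = np.array([0.2, 0.7, 0.1])       # predicted class probabilities
loss = -np.sum(y_true * np.log(p_pred))  # only the true class contributes: -log(0.7) ≈ 0.357
print(loss)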

3. Backpropagation Algorithm
Backpropagation is the primary method for training neural networks. It involves:

• Computing the gradient of the loss function with respect to each weight using the chain rule.
• Propagating the gradients backward through the network and updating the weights via gradient
descent (a minimal sketch follows this list).
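
For the softmax output with cross-entropy loss used later in this experiment, the chain rule collapses to a convenient form: the gradient of the loss with respect to the logits is the predicted probabilities minus the one-hot targets. A minimal sketch of one backward step (values invented for illustration):

import numpy as np

y_true = np.array([[0.0, 1.0, 0.0]])         # one-hot target, 1 sample
h = np.array([[0.5, 1.2]])                   # activations of the last hidden layer
W = np.array([[0.1, 0.4, -0.3],
              [0.2, -0.1, 0.5]])             # hidden (2) -> output (3) weights

logits = h @ W
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

d_logits = probs - y_true                    # dLoss/dlogits for softmax + cross-entropy
d_W = h.T @ d_logits                         # chain rule: dLoss/dW = h^T · dLoss/dlogits
d_h = d_logits @ W.T                         # gradient propagated back to the hidden layer
print(d_W, d_h)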

4. Gradient Descent
A common optimization algorithm used to update the weights in the direction that most reduces the
loss function.
• Learning Rate: A hyperparameter that controls how much to change the model in response
to the estimated error each time the weights are updated. A small learning rate can lead to
slow convergence, while a large one might overshoot minima.
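
The effect of the learning rate can be seen on a one-dimensional example: minimizing f(w) = w^2, whose gradient is 2w (the rates below are chosen only for illustration):

# Gradient descent on f(w) = w**2 with three different learning rates
for lr in (0.01, 0.1, 1.1):
    w = 5.0
    for _ in range(20):
        w -= lr * 2 * w                 # step in the direction that reduces f
    print(f'lr={lr}: w after 20 steps = {w:.4f}')
# lr=0.01 converges slowly, lr=0.1 converges quickly, lr=1.1 overshoots and diverges.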

5. One-Hot Encoding
A method for converting categorical variables into a binary matrix representation. Each category
value is converted into a binary vector that is all zeros except for the index of the category, which is
marked with a one. It's particularly useful for feeding categorical data into neural networks.
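
For example, the three Iris classes 0, 1, and 2 become three-element binary vectors. A minimal sketch with scikit-learn (the label values below are invented):

import numpy as np
from sklearn.preprocessing import OneHotEncoder

labels = np.array([0, 2, 1, 0]).reshape(-1, 1)    # class indices as a column vector
encoder = OneHotEncoder(sparse_output=False)       # use sparse=False on scikit-learn < 1.2
one_hot = encoder.fit_transform(labels)
print(one_hot)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]
#  [1. 0. 0.]]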

Two common techniques to prevent overfitting in neural networks (sketched briefly after this list):

• Dropout: Randomly zeros a fraction of the neurons during training to prevent co-
adaptation.
• L2 Regularization: Adds a penalty to the loss function based on the magnitude of the
weights, discouraging overly complex models.
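
Neither technique is used in the experiment code below, but both are easy to add. A brief NumPy sketch (the keep probability and penalty strength are illustrative choices):

import numpy as np

h = np.random.rand(4, 10)                       # some hidden-layer activations
keep_prob = 0.8                                 # dropout: keep 80% of neurons during training
mask = (np.random.rand(*h.shape) < keep_prob) / keep_prob   # inverted dropout scaling
h_dropped = h * mask

weights = [np.random.rand(4, 10), np.random.rand(10, 3)]
lam = 1e-3                                      # L2 regularization strength
l2_penalty = lam * sum(np.sum(w ** 2) for w in weights)
# total_loss = data_loss + l2_penalty   (the penalty discourages large weights)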

Code:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Load Iris dataset (150 samples, 4 features, 3 classes)
iris = load_iris()
X = iris.data
y = iris.target

# One-hot encode the target variable as a dense array
# (use sparse=False instead on scikit-learn versions older than 1.2)
one_hot_encoder = OneHotEncoder(sparse_output=False)
y_one_hot = one_hot_encoder.fit_transform(y.reshape(-1, 1))  # convert labels to one-hot

# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y_one_hot, test_size=0.2,
                                                    random_state=42)

# Define the DNN architecture: 4 inputs -> 10 -> 10 -> 3 outputs
n_inputs = 4
n_hidden1 = 10
n_hidden2 = 10
n_outputs = 3

# Initialize weights and biases (weights are scaled down so the initial
# activations stay small and the softmax does not start out saturated)
w1 = np.random.rand(n_inputs, n_hidden1) * 0.1
b1 = np.zeros((n_hidden1,))
w2 = np.random.rand(n_hidden1, n_hidden2) * 0.1
b2 = np.zeros((n_hidden2,))
w3 = np.random.rand(n_hidden2, n_outputs) * 0.1
b3 = np.zeros((n_outputs,))

# Define the activation functions
def relu(x):
    return np.maximum(x, 0)

def softmax(x):
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))  # subtract row max for stability
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)

# Define the derivative of the ReLU activation function
def relu_derivative(x):
    return np.where(x > 0, 1, 0)

# Define the learning rate and number of epochs
learning_rate = 0.01
n_epochs = 100

# Train the DNN with full-batch gradient descent
for epoch in range(n_epochs):
    # Forward pass
    h1 = relu(np.dot(X_train, w1) + b1)
    h2 = relu(np.dot(h1, w2) + b2)
    y_pred = softmax(np.dot(h2, w3) + b3)

    # Compute the loss (Cross-Entropy Loss); the small constant avoids log(0)
    loss = -np.mean(np.sum(y_train * np.log(y_pred + 1e-10), axis=1))

    # Backward pass: propagate gradients through the network via the chain rule
    d_y_pred = y_pred - y_train
    d_h2 = np.dot(d_y_pred, w3.T) * relu_derivative(h2)
    d_h1 = np.dot(d_h2, w2.T) * relu_derivative(h1)

    d_w3 = np.dot(h2.T, d_y_pred) / X_train.shape[0]
    d_b3 = np.sum(d_y_pred, axis=0) / X_train.shape[0]
    d_w2 = np.dot(h1.T, d_h2) / X_train.shape[0]
    d_b2 = np.sum(d_h2, axis=0) / X_train.shape[0]
    d_w1 = np.dot(X_train.T, d_h1) / X_train.shape[0]
    d_b1 = np.sum(d_h1, axis=0) / X_train.shape[0]

    # Update weights and biases (gradient descent step)
    w3 -= learning_rate * d_w3
    b3 -= learning_rate * d_b3
    w2 -= learning_rate * d_w2
    b2 -= learning_rate * d_b2
    w1 -= learning_rate * d_w1
    b1 -= learning_rate * d_b1

    print(f'Epoch {epoch + 1}, Loss: {loss:.4f}')
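
The training loop above never uses the held-out test split; a minimal sketch of how the trained weights could be evaluated (assuming the variables defined above are still in scope):

# Evaluate on the test set with a forward pass through the trained network
h1_test = relu(np.dot(X_test, w1) + b1)
h2_test = relu(np.dot(h1_test, w2) + b2)
test_probs = softmax(np.dot(h2_test, w3) + b3)
test_accuracy = np.mean(np.argmax(test_probs, axis=1) == np.argmax(y_test, axis=1))
print(f'Test accuracy: {test_accuracy:.2%}')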

Output:

Conclusion:
Understanding the theoretical foundations of neural networks provides insights into how they work
and the challenges faced during training. This knowledge is critical for optimizing performance and
effectively applying DNNs in various applications such as image classification, natural language
processing, and more.
