Finalized Review Report 3 (Gradient, Confusion Matrix)


Department of Electrical Engineering

Summer Project 2023

Literature Review Report


about Gradient Descent and
Confusion Matrix in Machine Learning

Submitted by

1) Faseeh Ahmed | 02-3-1-013-2021

Section: A

Submitted To: Dr. Sufi Tabassum Gul

Due: July 24, 2023


Contents

1 Literature Review about Gradient Descent and Confusion Matrix in Machine Learning
  1.1 Gradient Descent
  1.2 Confusion Matrix
      1.2.1 Example about Interpreting Confusion Matrix
      1.2.2 Main Uses of Confusion Matrix
1 Literature Review about Gradient Descent and Confusion Matrix in Machine Learning

1.1 Gradient Descent


Gradient Descent is an optimization algorithm commonly used in neural networks
(and other machine learning models) to update the model’s parameters in order to
minimize the loss function. It is an iterative process that helps the model to learn
from the training data and make better predictions.
In the context of neural networks, the model’s parameters are the weights and
biases of the individual neurons. The goal of training is to find the optimal values
for these parameters, which allow the neural network to make accurate predictions
on new, unseen data.
Here’s how Gradient Descent works in neural networks:

1. Initialization: The model’s weights and biases are initialized with small random values.

2. Forward Pass: During the forward pass, the input data is fed into the neural
network. The data propagates through the network layer by layer, and the
activation function of each neuron is applied to produce the output of each
neuron. This process is repeated until the final output is generated.

3. Loss Function: A loss function is used to quantify how well the model is performing on the training data. It measures the difference between the predicted output and the actual target values. Common loss functions for different tasks include Mean Squared Error (MSE) for regression and Cross-Entropy Loss for classification.

4. Backward Pass (Backpropagation): This is the core of Gradient Descent in neural networks. During the backward pass, the gradients of the loss function with respect to each weight and bias are calculated. This step tells us how much each parameter contributes to the error in the predictions.

5. Gradient Update: The gradients calculated in the previous step are used
to update the model’s parameters. The goal is to adjust the parameters in a
way that reduces the loss function and improves the model’s predictions. The
learning rate hyperparameter controls the size of the steps taken during the
update.


6. Repeat: Steps 2 to 5 are repeated for each batch of training data; one complete pass over the entire training set is known as an epoch. Training can continue for multiple epochs until the model converges to a point where the loss is minimized, or a predefined stopping criterion is met.
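The six steps above can be sketched end-to-end for the simplest possible "network": a linear model with one weight and one bias, trained with an MSE loss. The toy data, learning rate, and epoch count below are illustrative assumptions, not values from this report:

```python
import numpy as np

# Toy data: y = 2x + 1 plus a little noise (an invented example)
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=100)
y = 2.0 * X + 1.0 + 0.01 * rng.normal(size=100)

# Step 1: initialize the weight and bias with small random values
w = rng.normal(scale=0.01)
b = 0.0
lr = 0.1                                  # learning rate hyperparameter
n = len(y)

for epoch in range(500):                  # Step 6: repeat for many epochs
    y_pred = w * X + b                    # Step 2: forward pass
    loss = np.mean((y_pred - y) ** 2)     # Step 3: MSE loss
    grad_w = 2.0 / n * np.sum((y_pred - y) * X)  # Step 4: gradient w.r.t. w
    grad_b = 2.0 / n * np.sum(y_pred - y)        # Step 4: gradient w.r.t. b
    w -= lr * grad_w                      # Step 5: gradient update
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # should approach the true slope 2 and intercept 1
```

Because the model is linear, the gradients in step 4 can be written by hand; in a real neural network, backpropagation computes them automatically layer by layer.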

By iteratively applying the Gradient Descent algorithm, the neural network "learns"
from the training data and updates its parameters in the direction that reduces the
prediction error. The process continues until the model reaches a satisfactory level of
performance, allowing it to make accurate predictions on new, unseen data.
Gradient Descent is a fundamental optimization technique used in training neural
networks and other machine learning models, and there are several variants of it, such
as Stochastic Gradient Descent (SGD), Mini-batch Gradient Descent, and more advanced
techniques like Adam and RMSprop, which improve the optimization process
and converge faster.
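The variants differ mainly in how many samples contribute to each gradient and in whether past gradients are reused. A minimal sketch of two update rules follows; the momentum formulation shown is one common variant, not necessarily the one a given library uses:

```python
def sgd_step(w, grad, lr=0.01):
    """One vanilla gradient-descent update on a parameter."""
    return w - lr * grad

def momentum_step(w, grad, velocity, lr=0.01, beta=0.9):
    """Momentum variant: accumulates a decaying sum of past gradients,
    which damps oscillations and can speed up convergence.
    (One common formulation; implementations differ in details.)"""
    velocity = beta * velocity + grad
    return w - lr * velocity, velocity

# With gradients pointing consistently in one direction,
# momentum moves the parameter farther than plain SGD does:
w_m, v = 1.0, 0.0
for _ in range(3):
    w_m, v = momentum_step(w_m, grad=1.0, velocity=v, lr=0.01)
print(round(1.0 - w_m, 4))  # 0.0561, vs 0.03 for three plain SGD steps
```

Adam and RMSprop build on the same idea but additionally rescale each parameter's step by a running estimate of its gradient magnitude.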

1.2 Confusion Matrix


A confusion matrix is a performance measurement tool used in machine learning,
especially in the context of classification tasks. It is a table that helps visualize the
performance of a classification model by summarizing the predictions made by the
model on a set of data, and how they compare to the actual ground-truth labels.
The confusion matrix is organized into four categories based on the predicted and
actual class labels:

• True Positives (TP): The number of instances correctly predicted as positive (correctly classified as the positive class).

• False Positives (FP): The number of instances incorrectly predicted as positive (incorrectly classified as the positive class).

• True Negatives (TN): The number of instances correctly predicted as negative (correctly classified as the negative class).

• False Negatives (FN): The number of instances incorrectly predicted as negative (incorrectly classified as the negative class).

The confusion matrix typically looks like this:

                            Predicted Positive (Class 1)   Predicted Negative (Class 0)
Actual Positive (Class 1)   True Positives (TP)            False Negatives (FN)
Actual Negative (Class 0)   False Positives (FP)           True Negatives (TN)
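With the layout above (rows are actual labels, columns are predicted labels), the four counts can be tallied in a few lines of Python. The label lists here are made-up toy data:

```python
# Toy ground-truth labels and model predictions (1 = positive, 0 = negative)
actual    = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)

print([[tp, fn], [fp, tn]])  # [[3, 1], [1, 3]] — rows: actual, columns: predicted
```

Libraries such as scikit-learn provide the same tally ready-made (e.g. `sklearn.metrics.confusion_matrix`), but the counting itself is no more than this.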

1.2.1 Example about Interpreting Confusion Matrix


Here’s an example of how to interpret a confusion matrix for a binary classification
problem:
Assume we are classifying emails as spam (positive class) or not spam (negative
class).

• TP (True Positives): The number of emails correctly classified as spam.

• TN (True Negatives): The number of emails correctly classified as not spam.


• FP (False Positives): The number of emails incorrectly classified as spam (false alarms).

• FN (False Negatives): The number of emails incorrectly classified as not spam (missed spam).

1.2.2 Main Uses of Confusion Matrix


The main uses of a confusion matrix are:

• Evaluating Model Performance: The confusion matrix provides valuable insights into how well a classification model is performing. It allows you to calculate metrics like accuracy, precision, recall (sensitivity), specificity, F1-score, etc.

• Identifying Model Errors: By looking at the confusion matrix, you can identify the types of errors your model is making and understand which class is being misclassified more often.

• Model Selection and Parameter Tuning: When comparing multiple models or tuning hyperparameters, the confusion matrix can help you choose the model or parameter settings that yield the best overall performance.
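All of the metrics listed above derive directly from the four counts in the matrix. A quick sketch, using assumed counts (40/10/5/45 is an invented example):

```python
# Assumed example counts from a confusion matrix (invented for illustration)
tp, fn, fp, tn = 40, 10, 5, 45

accuracy    = (tp + tn) / (tp + tn + fp + fn)  # fraction of all predictions correct
precision   = tp / (tp + fp)   # of predicted positives, how many were right
recall      = tp / (tp + fn)   # sensitivity: of actual positives, how many were found
specificity = tn / (tn + fp)   # of actual negatives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of P and R

print(round(accuracy, 3), round(precision, 3), round(recall, 3),
      round(specificity, 3), round(f1, 3))  # 0.85 0.889 0.8 0.9 0.842
```

Note that accuracy alone can be misleading on imbalanced data, which is exactly why the per-class view the confusion matrix gives is useful.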
