
Machine Learning

Assignment No. 3
Q.1. Define the following terms with suitable examples:
i) Confusion matrix ii) False positive rate iii) True positive rate

i) Confusion Matrix

A confusion matrix is a table used to evaluate the performance of a classification model. It summarizes the results of predictions made by the model compared to the actual outcomes. The matrix typically includes four components:

 True Positives (TP): Correctly predicted positive instances
 True Negatives (TN): Correctly predicted negative instances
 False Positives (FP): Incorrectly predicted positive instances (Type I error)
 False Negatives (FN): Incorrectly predicted negative instances (Type II error)

Example:

Consider a binary classification problem where we are trying to predict whether an email is
spam or not. After testing the model, we get the following results:

                  Predicted Spam    Predicted Not Spam
Actual Spam       50 (TP)           10 (FN)
Actual Not Spam    5 (FP)           35 (TN)
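
To make the example concrete, here is a minimal sketch (not part of the original assignment) of how such a matrix can be computed with scikit-learn; the label encoding (1 = spam, 0 = not spam) and the counts are assumed to match the spam example above.

# Illustrative sketch: reproducing the spam confusion matrix (assumed label encoding).
from sklearn.metrics import confusion_matrix

# 60 actual spam emails and 40 actual non-spam emails (1 = spam, 0 = not spam)
y_true = [1] * 60 + [0] * 40
# Predictions: 50 TP, 10 FN on the spam emails; 5 FP, 35 TN on the non-spam emails
y_pred = [1] * 50 + [0] * 10 + [1] * 5 + [0] * 35

# With labels=[1, 0] the rows/columns are ordered Spam, Not Spam: [[TP, FN], [FP, TN]]
print(confusion_matrix(y_true, y_pred, labels=[1, 0]))
# [[50 10]
#  [ 5 35]]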

ii) False Positive Rate (FPR)

The false positive rate measures the proportion of negative instances that were incorrectly
classified as positive. It is calculated using the formula:

FPR = FP / (FP + TN)

Example:

Using the confusion matrix from the previous example:

 FP (False Positives) = 5
 TN (True Negatives) = 35

FPR = 5 / (5 + 35) = 5 / 40 = 0.125

So, the false positive rate is 0.125, or 12.5%.

iii) True Positive Rate (TPR)

The true positive rate, also known as recall or sensitivity, measures the proportion of actual
positive instances that were correctly classified. It is calculated using the formula:

TPR = TP / (TP + FN)

Example:

Again using the confusion matrix from earlier:

 TP (True Positives) = 50
 FN (False Negatives) = 10

TPR = 50 / (50 + 10) = 50 / 60 = 5/6 ≈ 0.833

Thus, the true positive rate is approximately 0.833, or 83.3%.
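
Both rates follow directly from the confusion matrix; here is a short sketch using the counts from the example above (plain Python, nothing beyond the standard library assumed):

# FPR and TPR from the spam example above.
TP, FN, FP, TN = 50, 10, 5, 35

fpr = FP / (FP + TN)   # 5 / 40  = 0.125
tpr = TP / (TP + FN)   # 50 / 60 ≈ 0.833

print(f"FPR = {fpr:.3f}, TPR = {tpr:.3f}")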

Q.2. What is a contingency table/matrix? What is its use?

Contingency Table (Matrix)

Definition: A contingency table, also known as a cross-tabulation or crosstab, is a statistical table that categorizes data based on two or more variables. It is used to summarize and analyze the relationship between these variables.

Structure:

 Rows: Represent one variable.
 Columns: Represent another variable.
 Cells: Contain the frequency or count of observations that fall into specific categories of both variables.

Example:

Gender \ Education Level   High School   College
Male                       50            75
Female                     40            60

Uses of Contingency Tables:

1. Summarizing Data: Contingency tables provide a clear and concise way to present
data, making it easier to understand relationships and patterns between variables.
2. Testing for Independence: Statistical tests like the chi-square test can be applied to
contingency tables to determine if two variables are independent or related.
3. Analyzing Associations: Contingency tables can help identify associations or
dependencies between variables. For example, you might find that gender is
associated with education level.
4. Comparing Groups: Contingency tables can be used to compare different groups
based on their distribution across categories of another variable.
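
As a rough illustration of points 1 and 2 (the data below is assumed to mirror the example table, and pandas/scipy are assumed to be available), a contingency table can be built with pandas.crosstab and tested for independence with a chi-square test:

# Sketch: building a contingency table and running a chi-square test (assumed data).
import pandas as pd
from scipy.stats import chi2_contingency

df = pd.DataFrame({
    "Gender": ["Male"] * 125 + ["Female"] * 100,
    "Education": (["High School"] * 50 + ["College"] * 75
                  + ["High School"] * 40 + ["College"] * 60),
})

table = pd.crosstab(df["Gender"], df["Education"])  # rows: Gender, columns: Education
print(table)

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.3f}, p-value = {p:.3f}")  # a small p-value suggests the variables are related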

Q.3. What are support vectors and margins? Explain soft SVM and hard SVM.

Support Vectors and Margins in SVM

Support Vectors: In Support Vector Machines (SVMs), support vectors are the data points
that lie closest to the decision boundary. These points are crucial because they define the
margin, which separates the two classes.

Margins: The margin is the distance between the decision boundary and the nearest data
points (support vectors) on either side. In SVM, the goal is to find the decision boundary that
maximizes this margin. This is because a larger margin generally leads to better
generalization performance, as it helps the model avoid overfitting.

Soft SVM vs. Hard SVM

Hard SVM:

 Assumption: Assumes that the data is linearly separable, meaning there exists a clear
hyperplane that perfectly separates the two classes.
 Goal: Finds the hyperplane with the largest margin that correctly classifies all training
examples.
 Limitation: Can be sensitive to outliers or noisy data, as a single outlier can
significantly affect the position of the decision boundary.

Soft SVM:

 Assumption: Acknowledges that real-world data may not be perfectly linearly separable due to noise or outliers.
 Goal: Introduces a slack variable for each data point, allowing for some misclassifications. The objective is to find a hyperplane that maximizes the margin while minimizing the number of misclassifications.
 Regularization: Uses a regularization parameter to balance the trade-off between maximizing the margin and minimizing the number of misclassifications, as shown in the sketch below.
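
A minimal sketch of this trade-off, assuming scikit-learn and synthetic toy data: a very large C approximates a hard margin, while a small C yields a softer margin that tolerates more margin violations.

# Sketch: 'hard-like' vs. soft margin via the regularization parameter C.
from sklearn.svm import SVC
from sklearn.datasets import make_blobs

X, y = make_blobs(n_samples=100, centers=2, cluster_std=1.5, random_state=0)

soft_svm = SVC(kernel="linear", C=0.1)        # small C: wider margin, tolerates violations
hard_like_svm = SVC(kernel="linear", C=1e6)   # very large C: approximates a hard margin

soft_svm.fit(X, y)
hard_like_svm.fit(X, y)

# A softer margin typically leaves more points on or inside the margin, i.e. more support vectors.
print("soft-margin support vectors:", len(soft_svm.support_vectors_))
print("hard-like support vectors:  ", len(hard_like_svm.support_vectors_))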

Q.4. What is a slack variable? Discuss margin errors.

Slack Variables

In Support Vector Machines (SVMs), slack variables are introduced to allow for some
misclassifications when the data is not perfectly linearly separable. These variables measure
the degree to which a data point violates the margin constraint.

 Positive slack variable: Indicates that a data point is on the wrong side of the margin
or even misclassified.
 Zero slack variable: Indicates that a data point is correctly classified and lies on or
within the margin.

Margin Errors

Margin errors refer to the data points that are misclassified or lie on the wrong side of the
margin. The number of margin errors is directly related to the values of the slack variables.

 Hard Margin SVM: Does not allow for any margin errors, as the slack variables are
constrained to be zero.
 Soft Margin SVM: Allows for a certain number of margin errors, as the slack
variables can be positive.

The regularization parameter (C) in soft margin SVM controls the trade-off between
maximizing the margin and minimizing the number of margin errors. A larger C value
penalizes margin errors more heavily, leading to a smaller margin and fewer
misclassifications. Conversely, a smaller C value allows for more margin errors but results in
a larger margin.
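
For reference, the standard soft-margin objective (not written out in the original answer) makes this trade-off explicit, with slack variables \xi_i and the regularization parameter C:

\min_{w,\,b,\,\xi} \;\; \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0 \;\; \text{for all } i.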

Q.5. Explain kernel methods for non-linearity.

Kernel Methods for Non-Linearity in SVM

Kernel Trick: The kernel trick is a mathematical technique used in SVMs to transform data
into a higher-dimensional feature space, making it possible to find non-linear decision
boundaries. This transformation is done without explicitly computing the coordinates of the
data points in the higher-dimensional space.

Kernel Functions: Kernel functions are used to compute the inner product between data
points in the transformed feature space. Common kernel functions include:

 Linear Kernel: For linear relationships between data points.
 Polynomial Kernel: For polynomial relationships.
 Radial Basis Function (RBF) Kernel: For non-linear relationships with a Gaussian-shaped kernel.
 Sigmoid Kernel: For neural network-like behavior.

How Kernel Methods Work:

1. Choose a Kernel Function: Select a kernel function that is appropriate for the
expected relationship between the data points.
2. Compute Kernel Matrix: Calculate the kernel matrix, where each element represents
the inner product between two data points in the transformed feature space.
3. Train SVM: Use the kernel matrix to train the SVM. The SVM algorithm operates in
the transformed feature space, finding a hyperplane that separates the data points.
4. Make Predictions: To make predictions for new data points, compute their kernel
values with the training data and use the trained SVM model.
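
An illustrative sketch of these steps, assuming scikit-learn and a synthetic dataset that is not linearly separable in the original feature space:

# Sketch: non-linear classification with an RBF-kernel SVM.
from sklearn.svm import SVC
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split

# Concentric circles cannot be separated by a straight line in 2-D.
X, y = make_circles(n_samples=300, noise=0.1, factor=0.4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Steps 1-3: choose the RBF kernel and train; the kernel matrix is handled internally.
clf = SVC(kernel="rbf", gamma=1.0, C=1.0)
clf.fit(X_train, y_train)

# Step 4: predictions use kernel values between new points and the support vectors.
print("test accuracy:", clf.score(X_test, y_test))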

Q.6. Calculate macro average precision, macro average recall and macro average F-score for the following given confusion matrix of multi-class classification.

Calculating Macro Average Precision, Recall, and F-Score

Understanding the Confusion Matrix:

Before we calculate the metrics, let's identify the true positives (TP), false positives (FP), true
negatives (TN), and false negatives (FN) for each class:

Class TP FP FN TN
A 100 0 0 9
B 9 0 0 100
C 8 0 0 100
D 9 0 0 100

Calculating Precision, Recall, and F-Score for Each Class:

 Precision = TP / (TP + FP)
 Recall = TP / (TP + FN)
 F-Score = 2 * (Precision * Recall) / (Precision + Recall)

For Class A:

 Precision = 100 / (100 + 0) = 1
 Recall = 100 / (100 + 0) = 1
 F-Score = 2 * (1 * 1) / (1 + 1) = 1

Similarly, calculate for classes B, C, and D. Since FP = 0 and FN = 0 for classes B, C, and D as well, each of these classes also has Precision = Recall = F-Score = 1.

Calculating Macro Averages:

 Macro Average Precision = (Precision for Class A + Precision for Class B + Precision
for Class C + Precision for Class D) / 4
 Macro Average Recall = (Recall for Class A + Recall for Class B + Recall for Class C
+ Recall for Class D) / 4
 Macro Average F-Score = (F-Score for Class A + F-Score for Class B + F-Score for
Class C + F-Score for Class D) / 4

Based on the given confusion matrix:

 Macro Average Precision = (1 + 1 + 1 + 1) / 4 = 1
 Macro Average Recall = (1 + 1 + 1 + 1) / 4 = 1
 Macro Average F-Score = (1 + 1 + 1 + 1) / 4 = 1

Interpretation:

In this case, the macro average precision, recall, and F-score are all 1, indicating that the
model performs perfectly across all classes. This is likely due to the simplified nature of the
confusion matrix, where each class is correctly predicted for all instances. In real-world
scenarios, these metrics will typically be less than perfect.
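
The same computation expressed as a short sketch in plain Python, using the per-class counts from the table above:

# Macro-averaged precision, recall and F-score from per-class (TP, FP, FN) counts.
counts = {
    "A": (100, 0, 0),
    "B": (9, 0, 0),
    "C": (8, 0, 0),
    "D": (9, 0, 0),
}

precisions, recalls, f_scores = [], [], []
for tp, fp, fn in counts.values():
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    precisions.append(p)
    recalls.append(r)
    f_scores.append(2 * p * r / (p + r))

print("macro precision:", sum(precisions) / len(precisions))  # 1.0
print("macro recall:   ", sum(recalls) / len(recalls))        # 1.0
print("macro F-score:  ", sum(f_scores) / len(f_scores))      # 1.0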

Q.7. Define following terms with reference to SVM.

i) Separating hyperplane ii) Margin

Separating Hyperplane

In a Support Vector Machine (SVM), a separating hyperplane is a decision boundary that divides the feature space into two regions, each corresponding to a different class. It is defined by a linear equation in the input features.

 In two dimensions: The separating hyperplane is a line.
 In three or more dimensions: It is a plane or, more generally, a hyperplane (a flat subspace with one dimension less than the feature space).

The goal of SVM is to find the optimal separating hyperplane that maximizes the margin
between the two classes.

Margin

The margin is the distance between the separating hyperplane and the nearest data points
(support vectors) on either side. In SVM, the objective is to find the hyperplane that
maximizes this margin.

 Why maximize the margin? A larger margin generally leads to better generalization
performance, as it helps the model avoid overfitting. It implies that the model is more
confident in its predictions for new, unseen data.
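
In standard notation (not spelled out in the original answer), the separating hyperplane is the set of points x satisfying

w^{\top}x + b = 0,

and, under the usual scaling y_i\,(w^{\top}x_i + b) \ge 1 for all training points, the margin that the SVM maximizes equals 2 / \lVert w \rVert.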
