Machine Learning Assignment-2

The document covers various machine learning concepts including linear regression, logistic regression, PCA, LDA, and linear classification, providing definitions, equations, and Python examples for each. Linear regression predicts continuous values, while logistic regression predicts binary outcomes using the sigmoid function. PCA and LDA are dimensionality reduction techniques, with PCA focusing on variance maximization and LDA on class separation.

Uploaded by

rishikammari22

MACHINE LEARNING

ASSIGNMENT-2
1. What is linear regression?

Linear Regression is a statistical method used in machine learning to model the
relationship between a dependent variable (target) and one or more
independent variables (predictors). It is used for predicting continuous values.

The equation of simple linear regression (with one independent variable) is:
Y= mX + C
where:

Y is the dependent variable (target).

X is the independent variable (feature).

m is the slope (coefficient).

C is the intercept.
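
The slope and intercept in this equation can be computed directly from the least-squares formulas. The following is a minimal sketch, assuming NumPy is available; the helper name fit_line is hypothetical and is not part of the assignment's own program. It uses the hours-studied and exam-score values from the dataset in this section.

```python
import numpy as np

def fit_line(x, y):
    """Return (m, c) minimizing the squared error of y ~ m*x + c."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: co-variation of x and y divided by the variation of x
    m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means
    c = y_mean - m * x_mean
    return m, c

hours = [1, 2, 3, 4, 5]
scores = [50, 55, 65, 70, 75]
m, c = fit_line(hours, scores)
print(f"Slope (m): {m:.2f}, Intercept (C): {c:.2f}")
```

For this data the formulas give m = 65 / 10 = 6.5 and C = 63 - 6.5 * 3 = 43.5.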

Example of Linear Regression in Python


We will use Python's sklearn library to implement linear regression on a simple
dataset.

Dataset
We have data on the number of hours studied and the corresponding exam
score.

Hours Studied    Exam Score
1                50
2                55
3                65
4                70
5                75

Program:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1) # Independent variable
Y = np.array([50, 55, 65, 70, 75]) # Dependent variable
model = LinearRegression()
model.fit(X, Y)
predictions = model.predict(X)
print("Slope (m):", model.coef_[0])
print("Intercept (C):", model.intercept_)
print("Predicted values:", predictions)
plt.scatter(X, Y, color='blue', label='Actual Data')
plt.plot(X, predictions, color='red', label='Regression Line')
plt.xlabel("Hours Studied")
plt.ylabel("Exam Score")
plt.legend()
plt.show()

Output (the last digits may differ slightly due to floating-point rounding):
Slope (m): 6.5
Intercept (C): 43.5
Predicted values: [50.  56.5 63.  69.5 76. ]

2. What is logistic regression?

Logistic Regression is a classification algorithm used to predict binary or
categorical outcomes. Unlike linear regression, which predicts continuous
values, logistic regression predicts a probability by passing the linear
combination mX + C through the sigmoid function; that probability is then
mapped to a class label:
P(Y=1) = 1 / (1 + e^(-(mX + C)))

where:

P(Y=1) is the probability of the positive class (Y = 1)

m is the coefficient (weight)

X is the independent variable

C is the intercept

If the probability is greater than 0.5, we classify it as 1 (Positive Class);
otherwise, as 0 (Negative Class).
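
The sigmoid mapping and the 0.5 threshold can be sketched in a few lines. This is an illustrative example only: the function names sigmoid and predict_class are hypothetical, and the values of m and C below are chosen by hand, not fitted to any data.

```python
import math

def sigmoid(z):
    """Logistic function: maps a real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(x, m, c, threshold=0.5):
    """Return (probability, label) for a single input x."""
    p = sigmoid(m * x + c)
    return p, int(p > threshold)

# Hypothetical parameters, picked only for illustration (not fitted values)
m, c = 1.0, -3.5
for hours in [1, 3, 4, 5]:
    p, label = predict_class(hours, m, c)
    print(f"hours={hours}: P(Y=1)={p:.2f}, class={label}")
```

With these assumed parameters the probability crosses 0.5 at X = 3.5, so inputs below that value are labelled 0 and inputs above it are labelled 1. Note that math.exp can overflow for very large negative inputs; a production implementation would guard against that.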

Example of Logistic Regression in Python


Let's use logistic regression to predict whether a student will pass an exam
based on study hours.

Dataset
Hours Studied    Pass (1) / Fail (0)
1                0
2                0
3                0
4                1
5                1

Program:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1) # Independent variable
Y = np.array([0, 0, 0, 1, 1]) # Dependent variable (Pass/Fail)
model = LogisticRegression()
model.fit(X, Y)

predicted_probs = model.predict_proba(X)[:, 1] # Probability of passing
predictions = model.predict(X) # Predicted class (0 or 1)
print("Predicted Probabilities:", np.round(predicted_probs, 2)) # Rounded for readability
print("Predicted Classes:", predictions)
plt.scatter(X, Y, color='blue', label='Actual Data')
plt.plot(X, predicted_probs, color='red', label='Sigmoid Curve')
plt.xlabel("Hours Studied")
plt.ylabel("Probability of Passing")
plt.legend()
plt.show()

Output (approximate, with scikit-learn's default L2 regularization, C=1.0):

Predicted Probabilities: [0.06 0.16 0.35 0.61 0.82]
Predicted Classes: [0 0 0 1 1]

3. What is PCA?

Principal Component Analysis (PCA) is a dimensionality reduction technique
used in machine learning and statistics to transform high-dimensional data into
a lower-dimensional space while retaining the most important information.

Key Concepts of PCA:


1. Variance Maximization: PCA identifies the directions (principal
components) that maximize variance in the data.

2. Orthogonal Transformation: It creates new features (principal
components) that are mutually orthogonal and uncorrelated.

3. Feature Reduction: It reduces computational cost and can help limit
overfitting in high-dimensional datasets.
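
The three ideas above can be sketched with an eigendecomposition of the covariance matrix. This is a minimal illustration under stated assumptions, not the method scikit-learn uses internally: the function name pca_eig is hypothetical and the dataset is synthetic, made up for this example.

```python
import numpy as np

def pca_eig(X, n_components):
    """Project X onto its top principal components via eigendecomposition."""
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)           # center each feature
    cov = np.cov(X_centered, rowvar=False)    # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # re-order by variance, descending
    components = eigvecs[:, order[:n_components]].T
    explained_ratio = eigvals[order[:n_components]] / eigvals.sum()
    return X_centered @ components.T, components, explained_ratio

# Synthetic data (assumption): two strongly correlated features plus a noise feature
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t,
                     2 * t + 0.1 * rng.normal(size=200),
                     0.1 * rng.normal(size=200)])

Z, components, ratio = pca_eig(X, n_components=2)
print("Explained variance ratio:", np.round(ratio, 3))
print("Covariance of projected data:\n", np.round(np.cov(Z, rowvar=False), 6))
```

The rows of components are orthogonal unit vectors, and the covariance matrix of the projected data is diagonal, which is what "uncorrelated" means in the second point.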

Example of PCA in Python


Let's apply PCA on the Iris dataset, which has 4 features. We'll reduce it to 2
dimensions for visualization.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

iris = load_iris()
X = iris.data # Features (4D)

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

plt.scatter(X_pca[:, 0], X_pca[:, 1], c=iris.target, cmap='viridis', edgecolors='k')


plt.xlabel("Principal Component 1")

plt.ylabel("Principal Component 2")
plt.title("PCA on Iris Dataset")
plt.colorbar(label="Target Classes")
plt.show()

print("Explained Variance Ratio:", pca.explained_variance_ratio_)


print("Principal Components:", pca.components_)

Output:

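Explained Variance Ratio: [0.73 0.23]

Principal Components:
[[ 0.52 -0.27  0.58  0.56]
 [ 0.38  0.92  0.02  0.07]]

(Values are rounded to two decimals; the signs of the components can differ between scikit-learn versions.)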

4. What is LDA?

Linear Discriminant Analysis (LDA) is a supervised dimensionality reduction
technique used for classification problems. Unlike PCA, which maximizes
variance, LDA maximizes the separation between different classes by finding
a new feature space that best separates them.

Key Concepts of LDA


1. Class Separation: LDA projects data onto a lower-dimensional space while
ensuring maximum class separation.

2. Supervised Learning: LDA requires class labels, unlike PCA.

3. Feature Reduction: It reduces dimensions while preserving discriminative
information.
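
For two classes, the class-separation idea reduces to Fisher's discriminant, where the projection direction is proportional to Sw^-1 (mu1 - mu0), with Sw the within-class scatter matrix. The sketch below illustrates that special case only; the function name fisher_direction is hypothetical and the data is synthetic, made up for this example.

```python
import numpy as np

def fisher_direction(X, y):
    """Two-class Fisher discriminant: w proportional to Sw^-1 (mu1 - mu0)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the scatter matrices of both classes
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    w = np.linalg.solve(Sw, mu1 - mu0)
    return w / np.linalg.norm(w)

# Synthetic two-class data (assumption): Gaussian clusters around (0, 0) and (3, 3)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 1.0, size=(50, 2)),
               rng.normal([3, 3], 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

w = fisher_direction(X, y)
z = X @ w  # 1D projection of every sample
print("Projected class means:", z[y == 0].mean(), z[y == 1].mean())
```

Unlike the PCA sketch, the labels y are required here, because the direction is chosen to pull the class means apart relative to the spread inside each class.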

Example of LDA in Python


Let's apply LDA on the Iris dataset (3 classes, 4 features) and reduce it to 2
dimensions for visualization.
Program:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
iris = load_iris()
X = iris.data # Features (4D)
y = iris.target # Class labels (3 classes)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X_scaled, y)

plt.scatter(X_lda[:, 0], X_lda[:, 1], c=y, cmap='viridis', edgecolors='k')


plt.xlabel("LDA Component 1")
plt.ylabel("LDA Component 2")
plt.title("LDA on Iris Dataset")
plt.colorbar(label="Target Classes")
plt.show()

print("Explained Variance Ratio:", lda.explained_variance_ratio_)


Output:

Explained Variance Ratio: [0.99 0.01]

5. What is Linear classification?

Linear classification is a method used in machine learning where a model
separates different classes using a straight decision boundary (a line in 2D, a
plane in 3D, or a hyperplane in higher dimensions).

Key Concepts of Linear Classification:


1. Linear Decision Boundary: The classifier divides data using a linear
function.

2. Binary & Multi-Class Classification: It can handle both types.

3. Examples of Linear Classifiers: Logistic Regression, Support Vector
Machines (SVM) with a linear kernel, and Linear Discriminant Analysis (LDA).
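
A linear decision boundary amounts to checking the sign of w·x + b. The following is a minimal sketch with hand-picked weights; the function name linear_classify and the values of w and b are assumptions for illustration, not learned parameters.

```python
import numpy as np

def linear_classify(X, w, b):
    """Assign class 1 where w.x + b > 0, otherwise class 0."""
    scores = np.asarray(X, dtype=float) @ np.asarray(w, dtype=float) + b
    return (scores > 0).astype(int)

# Hypothetical weights: the boundary is the line x1 + x2 = 3
w, b = [1.0, 1.0], -3.0
points = [[0, 0], [1, 1], [2, 2], [4, 0]]
print(linear_classify(points, w, b))  # prints [0 0 1 1]
```

Logistic regression, a linear SVM, and LDA differ in how they choose w and b, but all of them classify with a rule of this form.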

Example: Linear Classification using Logistic Regression


We'll classify students as Pass (1) or Fail (0) based on their study hours.

Program:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression

X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1) # Independent variable (Hours Studied)
y = np.array([0, 0, 0, 1, 1]) # Dependent variable (Pass/Fail)

model = LogisticRegression()
model.fit(X, y)
X_test = np.linspace(0, 6, 100).reshape(-1, 1)
y_prob = model.predict_proba(X_test)[:, 1] # Probability of passing
plt.scatter(X, y, color='blue', label='Actual Data')
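plt.plot(X_test, y_prob, color='red', label='Predicted Probability')
# The decision boundary is the point where P(Y=1) = 0.5, i.e. X = -intercept / slope
plt.axvline(-model.intercept_[0] / model.coef_[0][0], color='green', linestyle='--', label='Decision Boundary')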
plt.xlabel("Hours Studied")
plt.ylabel("Probability of Passing")
plt.legend()
plt.show()

print("Coefficient (Slope):", model.coef_[0][0])


print("Intercept:", model.intercept_[0])
Output (approximate, with scikit-learn's default L2 regularization, C=1.0):

Coefficient (Slope): 1.05
Intercept: -3.75
