VMTW ML Lab Manual

The document outlines the vision and mission of an educational institution and its Computer Science department, focusing on the development of skills in Artificial Intelligence and Machine Learning. It includes program educational objectives, outcomes, and specific outcomes related to engineering and machine learning, alongside a detailed lab course structure with various experiments and Python programming tasks. The document emphasizes hands-on experience and the application of ethical considerations in AI development.

Uploaded by

Anitha Vazzu
Copyright: © All Rights Reserved

MACHINE LEARNING LAB

Laboratory Record

B. Tech III Year-II Semester

(A.Y.2025-26)

Department of CSE
VISION & MISSION OF THE INSTITUTE

Vision of the College:

To empower students with professional education using creative & innovative technical
practices of global competence and research aptitude to become competitive engineers with
ethical values and entrepreneurial skills.

Mission of the college:

To impart value based professional education through creative and innovative teaching-learning
process to face the global challenges of the new era technology.

To inculcate research aptitude and to bring out creativity in students by imparting
engineering knowledge imbibing interpersonal skills to promote innovation, research and
entrepreneurship.

Vision & Mission of the Department

Vision of the Department:

To be a globally recognized center of excellence in education, research, and innovation in
Artificial Intelligence and Machine Learning.

Mission of the Department:

DM1: Providing a comprehensive curriculum that covers foundational concepts and
advanced topics in AI and machine learning, and emphasizes hands-on experience and
project-based learning.

DM2: Encouraging and supporting cutting-edge research in AI and machine learning, and
fostering collaborations with industry and academia.

DM3: Providing opportunities for professional development, networking, and lifelong learning,
and supporting our alumni in their continued success in the field of AI and machine learning.

Program Educational Objectives (PEOs):

PEO1: Exhibit the knowledge of artificial intelligence and machine learning concepts to
solve complex problems in diverse industries.

PEO2: Apply the necessary skills to design, develop, and implement intelligent systems that
can analyze large datasets, learn from them, and make informed decisions.

PEO3: Apply deep understanding of ethical considerations in the development and deployment
of AI systems, including privacy, security, and fairness.

Program Outcomes (POs):

PO1 : Engineering Knowledge: Apply the knowledge of mathematics, science, engineering
fundamentals and an engineering specialization to the solution of complex engineering problems.

PO2 : Problem Analysis: Identify, formulate, review research literature, and analyze complex
engineering problems reaching substantiated conclusions using first principles of mathematics,
natural sciences and engineering sciences.

PO3 : Design/Development of Solutions: Design solutions for complex engineering problems
and design system components or processes that meet the specified needs with appropriate
consideration for public health and safety, and cultural, societal, and environmental
considerations.

PO4 : Conduct Investigations of Complex Problems: Use research-based knowledge and
research methods including design of experiments, analysis and interpretation of data, and
synthesis of the information to provide valid conclusions.

PO5 : Modern Tool Usage: Create, select and apply appropriate techniques, resources and
modern engineering and IT tools including prediction and modeling to complex engineering
activities with an understanding of the limitations.

PO6 : The Engineer and Society: Apply reasoning informed by the contextual knowledge to
assess societal, health, safety, legal and cultural issues and the consequent responsibilities
relevant to the professional engineering practice.

PO7 : Environment and Sustainability: Understand the impact of the professional engineering
solutions in societal and environmental contexts and demonstrate the knowledge of, and need for
sustainable development.

PO8 : Ethics: Apply ethical principles and commit to professional ethics and responsibilities
and norms of the engineering practice.

PO9 : Individual and Team Work: Function effectively as an individual and as a member or
leader in diverse teams and in multidisciplinary settings.

PO10 : Communication: Communicate effectively on complex engineering activities with the
engineering community and with society at large, such as being able to comprehend and write
effective reports and design documentation, make effective presentations and give and receive
clear instructions.

PO11 : Project Management and Finance: Demonstrate knowledge and understanding of the
engineering management principles and apply these to one's own work, as a member and leader in
a team, to manage projects and in multidisciplinary environments.

PO 12 : Life-Long Learning: Recognize the need for and have the preparation and ability to
engage in independent and lifelong learning in the broadest context of technological change.

Program Specific Outcomes (PSOs):

PSO1: Engineering Fundamentals: Ability to apply mathematical foundations, algorithms,
and statistical models to analyze and solve complex problems in AI and machine learning.

PSO2: Industrial Skills: Ability to design and develop AI and machine learning
models using various tools and programming languages such as Python, R, and TensorFlow.
Ability to evaluate the performance of AI and machine learning models using various
performance metrics and data visualization techniques.

PSO3: Ethical and Social Responsibility: Understanding of the ethical, social, and legal
issues associated with the development and deployment of AI and machine learning systems.
Ability to work collaboratively in interdisciplinary teams to design and implement AI and
machine learning solutions to real-world problems.

Course Objective: The objective of this lab is to give an overview of the various machine
learning techniques and to demonstrate them using Python.

Course Outcomes:

• Understand modern notions in predictive data analysis
• Select data, choose a model and its complexity, and identify the trends
• Understand a range of machine learning algorithms along with their strengths and weaknesses
• Build predictive models from data and analyze their performance

INDEX

S.No   Name of the experiment
1      Write a Python program to compute Central Tendency Measures: Mean, Median, Mode; Measures of Dispersion: Variance, Standard Deviation
2      Study of Python Basic Libraries such as Statistics, Math, Numpy and Scipy
3      Study of Python Libraries for ML application such as Pandas and Matplotlib
4      Write a Python program to implement Simple Linear Regression
5      Implementation of Multiple Linear Regression for House Price Prediction using sklearn
6      Implementation of Decision tree using sklearn and its parameter tuning
7      Implementation of KNN using sklearn
8      Implementation of Logistic Regression using sklearn
9      Implementation of K-Means Clustering
10     Performance analysis of Classification Algorithms on a specific dataset (Mini Project)

WEEK1:

1. Write a Python program to compute Central Tendency Measures: Mean, Median, Mode;
Measures of Dispersion: Variance, Standard Deviation

Solution:

# Import Python's built-in statistics module, which provides functions for
# statistical calculations.
import statistics

# Import Python's built-in math module, which provides mathematical functions
# that go beyond basic arithmetic.
import math

l = [1, 3, 8, 15]

# The mean is the average value: the sum of all values divided by the number of values.
print(statistics.mean(l))

Output:

6.75

import statistics

s = [5, 6, 7, 8, 9, 11]

print(statistics.mean(s))
print(statistics.median(s))
print(statistics.mean([1, 3, 5, 7, 9, 11, 13]))
print(statistics.mean([1, 3, 5, 7, 9, 11]))
print(statistics.mean([-11, 5.5, -3.4, 7.1, -9, 22]))

Output:

7.666666666666667
7.5
7
6
1.8666666666666667

# Calculate the median from a sample of data
print(statistics.median([1, 3, 5, 7, 9, 11, 13]))
print(statistics.median([1, 3, 5, 7, 9, 11]))
print(statistics.median([-11, 5.5, -3.4, 7.1, -9, 22]))

Output:

7
6.0
1.05

# Calculate the mode from a sample of data
print(statistics.mode([1, 3, 3, 3, 3, 5, 7, 9, 11]))
print(statistics.mode([1, 1, 3, -5, 7, -9, 11]))
print(statistics.mode(['red', 'green', 'blue', 'red']))

Output:

3
1
red

# Calculate the variance from a sample of data
print(statistics.variance([1, 3, 5, 7, 9, 11]))
print(statistics.variance([2, 2.5, 1.25, 3.1, 1.75, 2.8]))
print(statistics.variance([-11, 5.5, -3.4, 7.1]))
print(statistics.variance([1, 30, 50, 100]))

Output:

14
0.47966666666666663
70.80333333333333
1736.9166666666667

import statistics

def compute_statistics(data):
    # Measures of central tendency
    mean = statistics.mean(data)
    median = statistics.median(data)
    mode = statistics.mode(data)

    # Measures of dispersion (sample variance and sample standard deviation)
    variance = statistics.variance(data)
    std_dev = statistics.stdev(data)

    print(f"Mean: {mean}")
    print(f"Median: {median}")
    print(f"Mode: {mode}")
    print(f"Variance: {variance}")
    print(f"Standard Deviation: {std_dev}")

if __name__ == "__main__":
    data = [1, 2, 2, 3, 4, 5, 5, 5, 6, 7]
    compute_statistics(data)

Output:

Mean: 4
Median: 4.5
Mode: 5
Variance: 3.7777777777777777
Standard Deviation: 1.9436506316151

WEEK2: Implementation of Python Basic Libraries such as Math, Numpy and Scipy

Theory/Description:

Python Libraries: One reason Python is popular among developers is its large collection of
libraries. This section discusses the Python Standard Library and other libraries available for
Python, such as SciPy and NumPy. A module is a file containing Python code, and a package is a
directory of sub-packages and modules. A Python library is a reusable chunk of code that you may
want to include in your programs or projects; the term loosely describes a collection of modules.
Third-party packages such as numpy are installed with a package manager like pip.

Python Standard Library: The Python Standard Library is a collection of modules that ships with
Python. It simplifies programming by removing the need to rewrite commonly used code. A module
is made available by importing it, usually at the beginning of a script. Some of the most
important Standard Library modules are: time, sys, csv, math, random, os, statistics, tkinter,
socket.

To display a list of all available modules, use the following command in the Python console:

>>> help('modules')
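
A minimal sketch of what importing Standard Library modules looks like in practice. The values used are made up for illustration.

```python
import math
import random
import statistics
import sys

# Each imported name is a module from the Python Standard Library.
print(math.sqrt(16))                # square root from the math module
print(statistics.mean([2, 4, 6]))   # average from the statistics module

random.seed(0)                      # fix the seed so the run is repeatable
print(random.randint(1, 6))         # a pseudo-random die roll

# sys.modules maps the names of already-imported modules to module objects.
print("math" in sys.modules)
```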

List of important Python Libraries

Python Libraries for Data Collection

• Beautiful Soup
• Scrapy
• Selenium

Python Libraries for Data Cleaning and Manipulation

• Pandas
• PyOD
• NumPy
• SciPy
• spaCy

Python Libraries for Data Visualization

• Matplotlib
• Seaborn
• Bokeh

• NumPy (Numerical Python): Efficient numerical operations, arrays, and mathematical computations.

• SciPy (Scientific Python): Built on top of NumPy, providing additional functionality for
optimization, integration, statistics, and signal processing.

1. NumPy: Numerical Computation & Array Operations

NumPy provides a powerful n-dimensional array object (ndarray) and functions for numerical
computation.

1.1 Installing NumPy

pip install numpy

1.2 Basic NumPy Operations

import numpy as np

# Creating arrays

arr1 = np.array([1, 2, 3, 4, 5])

arr2 = np.array([[1, 2, 3], [4, 5, 6]]) # 2D array

# Display arrays
print("1D Array:", arr1)

print("2D Array:\n", arr2)

# Array Properties

print("Shape:", arr2.shape) # (rows, columns)

print("Size:", arr2.size) # Total elements

print("Data Type:", arr2.dtype)

# Array Operations

print("Sum:", np.sum(arr1))

print("Mean:", np.mean(arr1))

print("Standard Deviation:", np.std(arr1))

# Element-wise Operations

print("Multiplication:", arr1 * 2)

print("Square Root:", np.sqrt(arr1))

# Creating Special Arrays

zeros = np.zeros((3,3)) # 3x3 matrix of zeros

ones = np.ones((2,2)) # 2x2 matrix of ones

identity = np.eye(3) # 3x3 identity matrix

# Random Numbers

rand_array = np.random.rand(3,3) # 3x3 random values


1.3 NumPy in Machine Learning

• Dataset Handling: Used to load, manipulate, and preprocess data.

• Linear Algebra: Matrix operations in deep learning and ML.

• Random Sampling: Initializing weights in neural networks.
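
The three uses listed above can be sketched in a few lines. The dataset values, the weight scale of 0.01, and the layer size are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seeded generator for repeatable results

# Dataset handling: 5 samples, 3 features (made-up values)
X = np.array([[1.0, 200.0, 3.0],
              [2.0, 180.0, 1.0],
              [3.0, 240.0, 4.0],
              [4.0, 210.0, 2.0],
              [5.0, 170.0, 5.0]])

# Preprocessing: standardize each column to zero mean and unit variance
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Random sampling: small random initial weights for one linear layer (3 inputs, 2 outputs)
W = rng.normal(loc=0.0, scale=0.01, size=(3, 2))

# Linear algebra: one matrix product gives the layer output for every sample
out = X_std @ W
print(out.shape)
```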

2. SciPy: Scientific Computation & Advanced Operations

SciPy extends NumPy by adding modules for statistics, optimization, and signal processing.

2.1 Installing SciPy

pip install scipy

2.2 SciPy Modules & Examples

2.2.1 Optimization (scipy.optimize)

Used for solving mathematical optimization problems.

from scipy.optimize import minimize

# Define function to minimize (e.g., x^2 + 3x + 5)
def func(x):
    return x**2 + 3*x + 5

result = minimize(func, x0=0) # Find minimum starting at x=0

print("Optimized Result:", result.x)
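
For this quadratic the minimum can also be found by hand: setting the derivative 2x + 3 to zero gives x = -1.5, where the function value is 2.75. The sketch below compares the numerical result with that closed form; the tolerance of 1e-4 is an arbitrary choice.

```python
import numpy as np
from scipy.optimize import minimize

def func(x):
    # minimize passes x as a 1-element array; return a scalar objective value
    return x[0] ** 2 + 3 * x[0] + 5

result = minimize(func, x0=[0.0])

x_exact = -3 / 2            # from f'(x) = 2x + 3 = 0
f_exact = func([x_exact])   # 2.75

print(result.success, result.x[0], result.fun)
print(np.isclose(result.x[0], x_exact, atol=1e-4))
```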

2.2.2 Linear Algebra (scipy.linalg)

import numpy as np
from scipy.linalg import inv, det

A = np.array([[4, 7], [2, 6]])

print("Determinant:", det(A)) # Compute determinant

print("Inverse Matrix:\n", inv(A)) # Compute inverse

2.2.3 Statistics (scipy.stats)

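import numpy as np
from scipy import stats

data = [12, 15, 14, 10, 13, 18, 21, 19]

print("Mean:", np.mean(data))
print("Median:", np.median(data))

# Depending on the SciPy version, stats.mode returns a scalar or a 1-element array;
# np.atleast_1d handles both. Every value occurs once here, so the reported mode
# is not very informative for this data.
mode_result = stats.mode(data)
print("Mode:", np.atleast_1d(mode_result.mode)[0])

print("Standard Deviation:", np.std(data))  # population standard deviation (divides by n)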

2.2.4 Signal Processing (scipy.signal)

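import numpy as np
from scipy.signal import butter, filtfilt

# Low-pass filter
b, a = butter(3, 0.05) # 3rd order, cutoff 0.05 (as a fraction of the Nyquist frequency)

# Apply the filter forward and backward to a sine wave sampled at 100 points
filtered_signal = filtfilt(b, a, np.sin(np.linspace(0, 10, 100)))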
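
The filter is easier to appreciate on a noisy signal. The sketch below adds synthetic Gaussian noise to a sine wave and compares the error before and after filtering. The noise level (0.3), the number of samples (1000), and the helper rmse are assumptions made for illustration.

```python
import numpy as np
from scipy.signal import butter, filtfilt

rng = np.random.default_rng(seed=0)

t = np.linspace(0, 10, 1000)
clean = np.sin(t)
noisy = clean + rng.normal(scale=0.3, size=t.size)   # synthetic noise

b, a = butter(3, 0.05)             # same 3rd-order low-pass design as above
filtered = filtfilt(b, a, noisy)   # zero-phase filtering (forward and backward pass)

def rmse(u, v):
    """Root-mean-square error between two signals."""
    return float(np.sqrt(np.mean((u - v) ** 2)))

print("RMSE before filtering:", rmse(noisy, clean))
print("RMSE after filtering: ", rmse(filtered, clean))
```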
Week 3:
Study of Python Libraries for ML application such as Pandas and Matplotlib

1. Introduction to Python for ML

Machine Learning requires efficient data handling, processing, and visualization. Python provides several
libraries that make these tasks easier, among which Pandas (for data manipulation) and Matplotlib (for
visualization) are widely used.

2. Pandas: Data Handling & Manipulation

Pandas is a Python library used for data analysis and manipulation, built on top of NumPy.

2.1 Key Features

• DataFrames & Series: Core data structures for handling tabular and labeled data.

• Data Cleaning & Transformation: Handling missing values, filtering, merging, and reshaping data.

• Descriptive Statistics: Mean, median, correlation, and other statistical operations.

• Integration: Works well with other ML libraries such as Scikit-learn, TensorFlow, and PyTorch.

2.2 Common Pandas Functions

import pandas as pd

# Creating a DataFrame

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35], 'Score': [85, 90, 95]}

df = pd.DataFrame(data)

# Display DataFrame

print(df)
# Basic Operations

print(df.describe()) # Summary statistics

print(df.head(2)) # First two rows

print(df.dtypes) # Data types of columns

# Data Manipulation

df['Age'] = df['Age'] + 1 # Modify values

df_filtered = df[df['Score'] > 85] # Filtering data

df_sorted = df.sort_values(by='Age') # Sorting data

2.3 Use Cases in ML

• Preprocessing: Cleaning, normalizing, and structuring datasets before feeding into ML models.

• Feature Engineering: Creating new features from existing data.

• Exploratory Data Analysis (EDA): Analyzing data distributions, correlations, and outliers.
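
The three use cases above can be sketched on a tiny DataFrame. The column names and values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Small made-up dataset with one missing value
df = pd.DataFrame({
    "Height_cm": [150.0, 160.0, np.nan, 180.0],
    "Weight_kg": [50.0, 60.0, 70.0, 80.0],
})

# Preprocessing: fill the missing height with the column median
df["Height_cm"] = df["Height_cm"].fillna(df["Height_cm"].median())

# Feature engineering: derive body-mass index from the two existing columns
df["BMI"] = df["Weight_kg"] / (df["Height_cm"] / 100) ** 2

# EDA: summary statistics and pairwise correlations
print(df.describe())
print(df.corr())
```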

3. Matplotlib: Data Visualization

Matplotlib is a powerful library for creating static, animated, and interactive visualizations.

3.1 Key Features

• Plotting Types: Line plots, bar charts, histograms, scatter plots, etc.

• Customization: Colors, labels, annotations, and styling.

• Integration: Works well with Pandas, NumPy, and Seaborn.

3.2 Common Matplotlib Functions

import matplotlib.pyplot as plt

# Sample Data

x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 50]

# Line Plot

plt.plot(x, y, marker='o', linestyle='-', color='b', label="Growth")

plt.xlabel("X-axis")

plt.ylabel("Y-axis")

plt.title("Simple Line Plot")

plt.legend()

plt.show()

# Scatter Plot

plt.scatter(x, y, color='r')

plt.title("Scatter Plot")

plt.show()

# Histogram

import numpy as np

data = np.random.randn(1000)

plt.hist(data, bins=30, color='g', alpha=0.7)

plt.title("Histogram")

plt.show()

3.3 Use Cases in ML

• Data Exploration: Understanding data distributions and trends.

• Feature Relationships: Identifying correlations between variables.

• Model Performance Evaluation: Visualizing errors, predictions, and accuracy.

4. Combining Pandas and Matplotlib for ML Applications


import pandas as pd

import matplotlib.pyplot as plt

# Load dataset (e.g., Titanic dataset)

df = pd.read_csv("https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv")

# Data Preprocessing: replace missing ages with the median age
# (assigning the result back avoids the chained in-place fillna that newer pandas warns about)

df['Age'] = df['Age'].fillna(df['Age'].median())

# Plotting Age Distribution

plt.hist(df['Age'], bins=20, color='blue', alpha=0.7)

plt.xlabel("Age")

plt.ylabel("Count")

plt.title("Age Distribution of Titanic Passengers")

plt.show()

# Scatter Plot: Age vs Fare

plt.scatter(df['Age'], df['Fare'], alpha=0.5, color='red')

plt.xlabel("Age")

plt.ylabel("Fare")

plt.title("Age vs Fare")

plt.show()
Week 4:
Write a Python program to implement Simple Linear Regression and plot
the graph.
Linear Regression: Linear regression is an algorithm that fits a linear relationship between
an independent variable and a dependent variable in order to predict future outcomes. It is a statistical
method used in data science and machine learning for predictive analysis. Linear regression is a supervised
learning algorithm that models a mathematical relationship between variables and makes predictions for
continuous or numeric variables such as sales, salary, age, product price, etc.
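
A minimal sketch of simple linear regression with a plot, assuming made-up data (years of experience against salary in thousands). The helper fit_line is illustrative and uses the ordinary least-squares formulas: slope = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)^2) and intercept = y_mean - slope * x_mean.

```python
import numpy as np
import matplotlib.pyplot as plt

def fit_line(x, y):
    """Least-squares slope and intercept for y ~ slope * x + intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Made-up data: years of experience vs. salary (in thousands)
x = np.array([1, 2, 3, 4, 5])
y = np.array([30, 35, 45, 50, 60])

slope, intercept = fit_line(x, y)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")  # slope = 7.50, intercept = 21.50

# Plot the data points and the fitted line
plt.scatter(x, y, color="blue", label="Data")
plt.plot(x, slope * x + intercept, color="red", label="Fitted line")
plt.xlabel("Years of experience")
plt.ylabel("Salary (thousands)")
plt.title("Simple Linear Regression")
plt.legend()
plt.show()
```

The same fit can be obtained with LinearRegression from sklearn.linear_model; the closed-form version is shown here because it makes the slope and intercept calculation explicit.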
Week 5:
Write a Python program to implement Multiple Linear Regression for House Price Prediction using
sklearn

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

import seaborn as sns

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import StandardScaler

from sklearn.linear_model import LinearRegression

from sklearn.metrics import mean_squared_error, r2_score

# %matplotlib inline   (Jupyter notebooks only; uncomment when running in a notebook)

# Load the dataset (house-prices.csv must be in the working directory)
df = pd.read_csv('house-prices.csv')

print(df.head())

print(df.isnull().sum())

sns.pairplot(df)

plt.show()

X = df[['SqFt', 'Bedrooms', 'Bathrooms', 'Offers']]

Y = df['Price']

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=42)


scaler = StandardScaler()

X_train_scaled = scaler.fit_transform(X_train)

X_test_scaled = scaler.transform(X_test)

model = LinearRegression()

model.fit(X_train_scaled, Y_train)

Y_pred = model.predict(X_test_scaled)

mse = mean_squared_error(Y_test, Y_pred)

r2 = r2_score(Y_test, Y_pred)

print(f"Mean Squared Error: {mse:.2f}")

print(f"R² Score: {r2:.2f}")

plt.scatter(Y_test, Y_pred, color='blue', label="Predicted vs Actual")

plt.plot(Y_test, Y_test, color='red', linestyle="dashed", label="Perfect Prediction Line")

plt.xlabel("Actual Price")

plt.ylabel("Predicted Price")

plt.title("House Price Prediction: Actual vs Predicted")

plt.legend()

plt.show()
Week 6:
Implementation of Decision tree using sklearn and its parameter tuning

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

import seaborn as sns

from sklearn.model_selection import train_test_split

from sklearn.tree import DecisionTreeRegressor, DecisionTreeClassifier, plot_tree

from sklearn.metrics import accuracy_score, mean_squared_error

# Load the dataset (house-prices.csv must be in the working directory)
df = pd.read_csv('house-prices.csv')

print(df.head())

X = df[['SqFt', 'Bedrooms', 'Bathrooms', 'Offers']]

Y = df['Price']

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=42)

tree = DecisionTreeRegressor(max_depth=5, random_state=42)

tree.fit(X_train, Y_train)

Y_pred = tree.predict(X_test)

mse = mean_squared_error(Y_test, Y_pred)

print(f"Mean Squared Error: {mse:.2f}")

plt.figure(figsize=(12, 6))

plot_tree(tree, feature_names=X.columns, filled=True, rounded=True)

plt.show()

# Parameter tuning: a shallower tree (max_depth=3) acts as a simple form of pruning
pruned_tree = DecisionTreeRegressor(max_depth=3, random_state=42)

pruned_tree.fit(X_train, Y_train)

# Predict again

Y_pruned_pred = pruned_tree.predict(X_test)

# Evaluate pruned model

mse_pruned = mean_squared_error(Y_test, Y_pruned_pred)

print(f"Pruned Mean Squared Error: {mse_pruned:.2f}")

Output:

Pruned Mean Squared Error: 833626308.91
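
Trying one or two depths by hand is a start; a grid search automates the comparison with cross-validation. The sketch below uses GridSearchCV on synthetic data from make_regression, because house-prices.csv is not bundled here. The parameter grid and the synthetic dataset are assumptions for illustration, not values from the experiment above.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the house-price data (assumption: 4 numeric features)
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Candidate hyperparameter values to compare
param_grid = {
    "max_depth": [2, 3, 5, None],
    "min_samples_leaf": [1, 5, 10],
}

# 5-fold cross-validation on the training set for every combination in the grid
search = GridSearchCV(
    DecisionTreeRegressor(random_state=42),
    param_grid,
    cv=5,
    scoring="neg_mean_squared_error",
)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
# score() returns the negated MSE because of the scoring choice above
print("Test MSE:", -search.score(X_test, y_test))
```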


Week 7:
Implementation of KNN using sklearn

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

import seaborn as sns

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import StandardScaler

from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

from sklearn.metrics import accuracy_score, mean_squared_error

from sklearn.datasets import load_iris

iris = load_iris()

df = pd.DataFrame(data=iris.data, columns=iris.feature_names)

df['target'] = iris.target

print(df.head())

X = df.iloc[:, :-1] # Features (all columns except target)

Y = df['target'] # Target labels

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=42)

scaler = StandardScaler()

X_train_scaled = scaler.fit_transform(X_train)

X_test_scaled = scaler.transform(X_test)

knn = KNeighborsClassifier(n_neighbors=5) # Using k=5

knn.fit(X_train_scaled, Y_train)

Y_pred = knn.predict(X_test_scaled)

# KNN regression on the numeric class labels (shown only to demonstrate the regressor API)
knn_reg = KNeighborsRegressor(n_neighbors=5)
knn_reg.fit(X_train_scaled, Y_train)
Y_reg_pred = knn_reg.predict(X_test_scaled)

accuracy = accuracy_score(Y_test, Y_pred)
print(f"Accuracy: {accuracy:.2f}")

# Misclassification error on the test set for k = 1..20
error_rates = []
for k in range(1, 21):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train_scaled, Y_train)
    Y_k_pred = knn.predict(X_test_scaled)
    error_rates.append(np.mean(Y_k_pred != Y_test))  # Misclassification error

mse = mean_squared_error(Y_test, Y_reg_pred)
print(f"Mean Squared Error: {mse:.2f}")

# Plot Error Rate vs. K Value

plt.figure(figsize=(8, 5))

plt.plot(range(1, 21), error_rates, marker='o', linestyle='dashed', color='green')

plt.xlabel("K Value")

plt.ylabel("Error Rate")

plt.title("Choosing Best K Value using Elbow Method")

plt.show()
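
Picking k from test-set error, as above, lets the test data influence the model choice. A common alternative is to choose k by cross-validation. The sketch below is one way to do that, assuming 5 folds and the same range of k; it is not part of the original experiment.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

k_values = list(range(1, 21))
cv_scores = []
for k in k_values:
    # The pipeline refits the scaler inside each fold, so scaling never sees held-out data
    model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    cv_scores.append(cross_val_score(model, X, y, cv=5).mean())

best_k = k_values[int(np.argmax(cv_scores))]
print(f"Best k by 5-fold cross-validation: {best_k}")
```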
Week 8:
Implementation of Logistic Regression using sklearn

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

import seaborn as sns

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import StandardScaler

from sklearn.linear_model import LogisticRegression

from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

from sklearn.datasets import load_iris

iris = load_iris()

df = pd.DataFrame(data=iris.data, columns=iris.feature_names)

df['target'] = iris.target # Add target labels (0,1,2)

print(df.head())

X = df.iloc[:, :-1]

Y = df['target']

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=42, stratify=Y)

scaler = StandardScaler()

X_train_scaled = scaler.fit_transform(X_train)

X_test_scaled = scaler.transform(X_test)

# With the lbfgs solver, current scikit-learn fits a multinomial (softmax) model for multiclass targets
model = LogisticRegression(solver='lbfgs', max_iter=1000)

model.fit(X_train_scaled, Y_train)

Y_pred = model.predict(X_test_scaled)

accuracy = accuracy_score(Y_test, Y_pred)

print(f"Accuracy: {accuracy:.2f}")

print("\nClassification Report:\n", classification_report(Y_test, Y_pred))

conf_matrix = confusion_matrix(Y_test, Y_pred)

sns.heatmap(conf_matrix, annot=True, cmap='Blues', fmt='d')

plt.xlabel("Predicted")

plt.ylabel("Actual")

plt.title("Confusion Matrix")

plt.show()
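
Logistic regression predicts class probabilities, and the predicted label is the class with the highest probability. The sketch below shows this with predict_proba on the same Iris split; it is a self-contained illustration rather than part of the original program.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = LogisticRegression(max_iter=1000)
model.fit(X_train_scaled, y_train)

proba = model.predict_proba(X_test_scaled)   # shape: (n_samples, n_classes)
print(proba[:3].round(3))                     # probabilities for the first three test rows

# Taking the most probable class for each row reproduces predict()
labels_from_proba = model.classes_[np.argmax(proba, axis=1)]
print(np.array_equal(labels_from_proba, model.predict(X_test_scaled)))
```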
Week 9:
Implementation of K-Means Clustering

import os

# Set before importing NumPy/scikit-learn so the thread limit takes effect
# (commonly used to avoid the KMeans memory-leak warning on Windows with MKL)
os.environ["OMP_NUM_THREADS"] = "1"

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
print(df.head())

# Standardize the features before clustering
scaler = StandardScaler()
df_scaled = scaler.fit_transform(df)

kmeans = KMeans(n_clusters=3, random_state=42, n_init=10)
df['Cluster'] = kmeans.fit_predict(df_scaled)

# Centroids are found in scaled space; convert them back to the original units
# so they line up with the unscaled points in the plot
centroids = scaler.inverse_transform(kmeans.cluster_centers_)

plt.figure(figsize=(8, 6))
sns.scatterplot(x=df.iloc[:, 0], y=df.iloc[:, 1], hue=df['Cluster'], palette='viridis', s=100)
plt.scatter(centroids[:, 0], centroids[:, 1], s=300, c='red', marker='X', label='Centroids')

plt.xlabel("Feature 1")

plt.ylabel("Feature 2")

plt.title("K-Means Clustering Visualization")

plt.legend()

plt.show()
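
The program above fixes the number of clusters at 3. When the number is not known in advance, one common heuristic is the elbow method: fit K-Means for several values of k and look for the point where the inertia stops dropping sharply. The sketch below computes the inertia values; the range of k from 1 to 8 is an arbitrary choice.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

X = StandardScaler().fit_transform(load_iris().data)

k_values = range(1, 9)
inertias = []
for k in k_values:
    km = KMeans(n_clusters=k, random_state=42, n_init=10)
    km.fit(X)
    inertias.append(km.inertia_)   # within-cluster sum of squared distances

for k, inertia in zip(k_values, inertias):
    print(f"k={k}: inertia={inertia:.1f}")
```

Plotting inertias against k_values with plt.plot gives the usual elbow curve.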
Week 10:
Performance analysis of Classification Algorithms on a specific dataset (Mini
Project)

# Performance Analysis of Classification Algorithms on the Iris Dataset

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

import seaborn as sns

from sklearn.datasets import load_iris

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import StandardScaler

from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Import classifiers

from sklearn.linear_model import LogisticRegression

from sklearn.tree import DecisionTreeClassifier

from sklearn.ensemble import RandomForestClassifier

from sklearn.svm import SVC

from sklearn.naive_bayes import GaussianNB

from sklearn.neighbors import KNeighborsClassifier


# Load the Iris dataset

data = load_iris()

df = pd.DataFrame(data.data, columns=data.feature_names)

df['target'] = data.target

# Features and labels

X = df.drop('target', axis=1)

y = df['target']

# Train-Test split

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Feature Scaling

scaler = StandardScaler()

X_train = scaler.fit_transform(X_train)

X_test = scaler.transform(X_test)

# Define models

models = {
    'Logistic Regression': LogisticRegression(),
    'Decision Tree': DecisionTreeClassifier(),
    'Random Forest': RandomForestClassifier(),
    'Support Vector Machine': SVC(),
    'Naive Bayes': GaussianNB(),
    'K-Nearest Neighbors': KNeighborsClassifier()
}

# Store results
results = {}

# Train and evaluate each model
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)

    acc = accuracy_score(y_test, predictions)
    cm = confusion_matrix(y_test, predictions)
    cr = classification_report(y_test, predictions, output_dict=True)

    results[name] = {
        'Accuracy': acc,
        'Confusion Matrix': cm,
        'Classification Report': cr
    }

    print(f"\n{name} Results:")
    print("Accuracy:", acc)
    print("Confusion Matrix:\n", cm)
    print("Classification Report:\n", classification_report(y_test, predictions))


# Visualize Accuracy

accs = {name: results[name]['Accuracy'] for name in models}

plt.figure(figsize=(10,6))

sns.barplot(x=list(accs.keys()), y=list(accs.values()))

plt.ylabel("Accuracy")

plt.title("Comparison of Classification Algorithms on Iris Dataset")

plt.xticks(rotation=45)

plt.ylim(0.8, 1.05)

plt.show()

Output:

Logistic Regression Results:
Accuracy: 1.0
Confusion Matrix:
[[19  0  0]
 [ 0 13  0]
 [ 0  0 13]]
Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        19
           1       1.00      1.00      1.00        13
           2       1.00      1.00      1.00        13

    accuracy                           1.00        45
   macro avg       1.00      1.00      1.00        45
weighted avg       1.00      1.00      1.00        45
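
An accuracy of 1.0 on a single 45-sample test split says little about how the classifiers differ. Cross-validation gives a steadier comparison. The sketch below is one possible way to do it, assuming 5 stratified folds and three of the classifiers used above; the names cv and cv_results are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Machine": SVC(),
    "K-Nearest Neighbors": KNeighborsClassifier(),
}

cv_results = {}
for name, clf in models.items():
    # Scaling is part of the pipeline, so it is refit on each training fold
    pipe = make_pipeline(StandardScaler(), clf)
    scores = cross_val_score(pipe, X, y, cv=cv)
    cv_results[name] = (scores.mean(), scores.std())
    print(f"{name}: mean accuracy = {scores.mean():.3f}, std = {scores.std():.3f}")
```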
