IML Lab Manual

DCE

GUJARAT TECHNOLOGICAL UNIVERSITY

Chandkheda, Ahmedabad

B. H. Gardi College of Engineering & Technology

Subject: Introduction to Machine Learning

(4350702)

D.E. , Semester - V , Computer Engineering.

LAB MANUAL

Prof. Arkesh Vora

(Faculty Guide)

Prof. Monika Shah

(Head of the Department)

Academic Year

(2024-2025)

B. H. Gardi College of Engineering & Technology

CERTIFICATE

This is to certify that Mr. / Miss. ____________________ of semester ______,
branch ____________________, Enrollment no. ____________________, has satisfactorily
completed his/her term work in the subject ____________________ for the term ending
in 2024.

Date:

Prof. Arkesh Vora (Subject Incharge)

Prof. Monika Shah (Head of Department)


Index

S. No. / Practical Outcomes (PrOs) / Page No. / Sign

1. Explore any one machine learning tool (like Weka, Tensorflow, Scikit-learn, Colab, etc.).

2. Write a NumPy program to implement the following operations:
• to convert a list of numeric values into a one-dimensional NumPy array
• to create a 3x3 matrix with values ranging from 2 to 10
• to append values at the end of an array
• to create another shape from an array without changing its data (3*2 to 2*3)

3. Write a NumPy program to implement the following operations:
• to split an array of 14 elements into 3 arrays, each with 2, 4, and 8 elements in the original order
• to stack arrays horizontally (column wise)

4. Write a NumPy program to implement the following operations:
• to add, subtract, multiply, divide arguments element-wise
• to round elements of the array to the nearest integer
• to calculate mean across dimension, in a 2D NumPy array
• to calculate the difference between neighboring elements, element-wise, of a given array

5. Write a NumPy program to implement the following operations:
• to find the maximum and minimum value of a given flattened array
• to compute the mean, standard deviation, and variance of a given array along the second axis

6. Write a Pandas program to implement the following operations:
• to convert a NumPy array to a Pandas Series
• to convert the first column of a DataFrame to a Series
• to compute the mean and standard deviation of the data of a given Series
• to sort a given Series

7. Write a Pandas program to implement the following operations:
• to create a DataFrame from a dictionary and display it
• to sort the DataFrame by 'name' in ascending order
• to delete one specific column from the DataFrame
• to write a DataFrame to a CSV file using a tab separator

8. Write a Pandas program to create a line plot of the opening and closing stock prices of a given company between two specific dates.

9. Write a Pandas program to create a plot of Open, High, Low, Close, Adjusted Closing prices and Volume of a given company between two specific dates.

10. Write a Pandas program to implement the following operations:
• to find and drop the missing values from the given dataset
• to remove the duplicates from the given dataset

11. Write a Pandas program to select all columns in which every entry is present, check which rows and columns have a NaN, and finally drop rows with any NaNs from the given dataset.

12. Write a Python program using Scikit-learn to print the keys, number of rows and columns, feature names and the description of the given data.

13. Write a Python program to implement the K-Nearest Neighbour supervised machine learning algorithm for a given dataset.

14. Write a Python program to implement a machine learning algorithm for a given dataset. (It is recommended to assign different machine learning algorithms group wise – micro project)


Practical - 1
➢ Explore any one machine learning tool (like Weka, Tensorflow, Scikit-learn, Colab, etc.).

Overview of Scikit-learn:

⚫ Scikit-learn is an open-source library that provides simple and efficient tools for data
mining, data analysis, and machine learning. It is built on top of other Python libraries such as
NumPy, SciPy, and matplotlib, making it an excellent choice for machine learning tasks. It is
widely used for creating predictive models, performing data analysis, and feature extraction.

Key Features:
1. Classification: Identifying which category an object belongs to (e.g., spam detection).
2. Regression: Predicting continuous-valued attributes associated with an object (e.g.,
predicting prices).
3. Clustering: Grouping similar objects together (e.g., customer segmentation).
4. Dimensionality Reduction: Reducing the number of random variables to consider (e.g.,
principal component analysis).
5. Model Selection: Comparing, validating, and choosing models with different parameters.
6. Preprocessing: Feature extraction and normalization for data preparation.
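
Three of the features listed above (preprocessing, dimensionality reduction and clustering) can be chained in a few lines. The following is only a minimal sketch: the choice of the built-in Iris data, of 2 principal components and of 3 clusters are assumptions made for illustration, not part of the manual.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = load_iris().data  # 150 samples, 4 features

# Preprocessing: rescale every feature to zero mean and unit variance
X_scaled = StandardScaler().fit_transform(X)

# Dimensionality reduction: project the 4 features onto 2 principal components
X_2d = PCA(n_components=2).fit_transform(X_scaled)

# Clustering: group the samples into 3 clusters (3 is an assumed value)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_2d)

print("Reduced shape:", X_2d.shape)
print("Cluster labels found:", sorted(set(labels.tolist())))
```

Each step follows the same pattern: create the object, then call a fit method on the data. This shared pattern is what makes the library easy to learn.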

Commonly Used Algorithms in Scikit-learn:

a. Linear models: Linear Regression, Logistic Regression.
b. Tree-based models: Decision Trees, Random Forest, Gradient Boosting.
c. Support Vector Machines (SVM).
d. K-Nearest Neighbors (KNN).
e. Clustering models: K-Means, DBSCAN.

Basic Workflow:
i. Importing data: Load and prepare your dataset (usually as a NumPy array or pandas
DataFrame).
ii. Splitting data: Divide your dataset into training and test sets using train_test_split().
iii. Model selection: Choose an appropriate model (e.g., LinearRegression,
DecisionTreeClassifier).

iv. Training: Fit the model on the training data using .fit().
v. Prediction: Use the model to predict results on the test data with .predict().

vi. Evaluation: Measure the model’s performance with a metric suited to the task, such as R² score or mean squared error for regression, and accuracy or a confusion matrix for classification.
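
The six workflow steps can be sketched for a classification task as follows. The Iris dataset, the DecisionTreeClassifier and the fixed random_state values are assumptions chosen only to keep the example short and repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# i. Importing data
X, y = load_iris(return_X_y=True)

# ii. Splitting data (random_state is fixed only to make the run repeatable)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# iii. Model selection
model = DecisionTreeClassifier(random_state=0)

# iv. Training
model.fit(X_train, y_train)

# v. Prediction
y_pred = model.predict(X_test)

# vi. Evaluation
accuracy = accuracy_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred)
print("Accuracy:", accuracy)
print("Confusion matrix:\n", matrix)
```

Row i, column j of the confusion matrix counts the test samples of true class i that were predicted as class j, so correct predictions lie on the diagonal.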
Why Use Scikit-learn?

1. Easy to use: Its API is intuitive and beginner-friendly.

2. Extensive documentation: Detailed guides and examples make it easy to get started.
3. Wide community support: A large community contributes to its development, ensuring
regular updates.
4. Versatility: It supports a broad range of models for various machine learning tasks.

1.1 Scikit-learn example

⚫ Input:

from sklearn.model_selection import train_test_split

from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import numpy as np

# Sample data (X: independent variables, y: target)

X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1, 2, 3, 4, 5])

# Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Initialize the model

model = LinearRegression()
# Train the model
model.fit(X_train, y_train)

# Predict values using the test set

y_pred = model.predict(X_test)

# Evaluate the model

mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")

⚫ Output:

Mean Squared Error: 0.0


Practical - 2

➢ Write a NumPy program to implement the following operations:

2.1 to convert a list of numeric values into a one-dimensional NumPy array

⚫ Input:

import numpy as np

# List of numeric values

lst = [10, 20, 30, 40, 50]

# Convert the list into a NumPy array

array_1d = np.array(lst)

print("1D NumPy array:", array_1d)

⚫ Output:

1D NumPy array: [10 20 30 40 50]

2.2 to create a 3x3 matrix with values ranging from 2 to 10

⚫ Input:

import numpy as np
# Create a 3x3 matrix with values ranging from 2 to 10
matrix_3x3 = np.arange(2, 11).reshape(3, 3)

print("3x3 matrix with values from 2 to 10:\n", matrix_3x3)

⚫ Output:

3x3 matrix with values from 2 to 10:
 [[ 2  3  4]
 [ 5  6  7]
 [ 8  9 10]]

2.3 to append values at the end of an array

⚫ Input:

import numpy as np

# Original array
arr = np.array([1, 2, 3])

# Append values to the end of the array

new_arr = np.append(arr, [4, 5, 6])

print("Array after appending values:", new_arr)

⚫ Output:

Array after appending values: [1 2 3 4 5 6]

2.4 to create another shape from an array without changing its data(3*2 to 2*3)

⚫ Input:

import numpy as np

# Original 3x2 array

arr_3x2 = np.array([[1, 2], [3, 4], [5, 6]])

# Reshape to 2x3
reshaped_arr = arr_3x2.reshape(2, 3)

print("Reshaped array (2x3):\n", reshaped_arr)

⚫ Output:

Reshaped array (2x3):
 [[1 2 3]
 [4 5 6]]
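
Reshaping is easy to confuse with transposing, since both turn a 3x2 array into a 2x3 one. The sketch below, using the same sample values, shows the difference: reshape keeps the elements in their original reading order, while the transpose swaps rows and columns.

```python
import numpy as np

arr_3x2 = np.array([[1, 2], [3, 4], [5, 6]])

reshaped = arr_3x2.reshape(2, 3)  # same element order, new shape
transposed = arr_3x2.T            # rows and columns swapped

print("reshape(2, 3):\n", reshaped)
print("transpose:\n", transposed)

# -1 asks NumPy to infer that dimension from the array size
inferred = arr_3x2.reshape(2, -1)
print("reshape(2, -1) shape:", inferred.shape)
```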


Practical - 3

➢ Write a NumPy program to implement the following operations:

3.1 to split an array of 14 elements into 3 arrays, each with 2, 4, and 8 elements in the original
order

⚫ Input:

import numpy as np

# Create an array of 14 elements

arr = np.arange(1, 15)

# Split the array into 3 arrays with 2, 4, and 8 elements

arr_split = np.split(arr, [2, 6])

print("Split arrays:")
for part in arr_split:
    print(part)

⚫ Output:

Split arrays:
[1 2]
[3 4 5 6]
[ 7  8  9 10 11 12 13 14]
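
The second argument of np.split is a list of cut positions, not a list of part sizes, which is why [2, 6] produces parts of 2, 4 and 8 elements. One way to derive the positions from the sizes is a cumulative sum, sketched below (the variable names are illustrative).

```python
import numpy as np

arr = np.arange(1, 15)
sizes = [2, 4, 8]  # desired length of each part

# Cumulative sums give the cut positions; the last one (14) is the array end
cut_points = np.cumsum(sizes)[:-1]  # positions 2 and 6
parts = np.split(arr, cut_points)

for part in parts:
    print(len(part), part)
```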

3.2 to stack arrays horizontally (column wise)

⚫ Input:

import numpy as np

# Create two arrays to stack horizontally

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# Stack the arrays horizontally (column-wise)

arr_hstack = np.hstack((arr1, arr2))

print("Horizontally stacked arrays:\n", arr_hstack)

⚫ Output:

Horizontally stacked arrays:
 [[1 2 5 6]
 [3 4 7 8]]
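
With one-dimensional inputs, np.hstack simply joins the arrays end to end, which may not be what "column wise" suggests. The sketch below compares it with np.column_stack and np.vstack on two small arrays chosen for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

joined = np.hstack((a, b))           # one longer 1-D array
as_columns = np.column_stack((a, b)) # a and b become the columns of a 2-D array
as_rows = np.vstack((a, b))          # a and b become the rows of a 2-D array

print("hstack:", joined)
print("column_stack:\n", as_columns)
print("vstack:\n", as_rows)
```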


Practical - 4

➢ Write a NumPy program to implement the following operations:

4.1 to add, subtract, multiply, divide arguments element-wise

⚫ Input:

import numpy as np

# Define two arrays

arr1 = np.array([10, 20, 30, 40])
arr2 = np.array([1, 2, 3, 4])

# Element-wise operations
add_result = np.add(arr1, arr2)
subtract_result = np.subtract(arr1, arr2)
multiply_result = np.multiply(arr1, arr2)
divide_result = np.divide(arr1, arr2)

print("Element-wise Addition:", add_result)

print("Element-wise Subtraction:", subtract_result)
print("Element-wise Multiplication:", multiply_result)
print("Element-wise Division:", divide_result)

⚫ Output:

Element-wise Addition: [11 22 33 44]

Element-wise Subtraction: [ 9 18 27 36]
Element-wise Multiplication: [ 10 40 90 160]
Element-wise Division: [10. 10. 10. 10.]

4.2 to round elements of the array to the nearest integer

⚫ Input:

import numpy as np

# Define an array with floating-point numbers

arr = np.array([1.5, 2.8, 3.3, 4.6])

# Round the elements to the nearest integer

rounded_arr = np.rint(arr)

print("Rounded Array:", rounded_arr)

⚫ Output:

Rounded Array: [2. 3. 3. 5.]
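
Note that 1.5 became 2 above, but np.rint does not always round halves up: exact halves go to the nearest even integer. The sketch below shows this, together with one common alternative (floor of x + 0.5) for rounding halves upward.

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])

# np.rint rounds exact halves to the nearest even integer
to_even = np.rint(halves)

# Rounding halves upward instead, using floor(x + 0.5)
half_up = np.floor(halves + 0.5)

print("np.rint:", to_even)
print("floor(x + 0.5):", half_up)
```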

4.3 to calculate mean across dimension, in a 2D numpy array

⚫ Input:

import numpy as np

# Define a 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

# Calculate the mean across rows (axis=1) and columns (axis=0)

mean_across_rows = np.mean(arr_2d, axis=1)
mean_across_columns = np.mean(arr_2d, axis=0)

print("Mean across rows:", mean_across_rows)

print("Mean across columns:", mean_across_columns)

⚫ Output:

Mean across rows: [2. 5.]

Mean across columns: [2.5 3.5 4.5]

4.4 to calculate the difference between neighboring elements, element-wise, of a given array

⚫ Input:

import numpy as np

# Define an array
arr = np.array([10, 20, 30, 40, 50])

# Calculate the difference between neighboring elements

diff_arr = np.diff(arr)

print("Element-wise difference between neighboring elements:", diff_arr)

⚫ Output:

Element-wise difference between neighboring elements: [10 10 10 10]
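
np.diff computes arr[1:] - arr[:-1], so the result is one element shorter than the input. The sketch below checks this against plain slicing and uses the n parameter to take the difference twice; the sample values (square numbers) are chosen only for illustration.

```python
import numpy as np

arr = np.array([1, 4, 9, 16, 25])

first_diff = np.diff(arr)        # arr[1:] - arr[:-1]
manual = arr[1:] - arr[:-1]      # the same result written with slicing
second_diff = np.diff(arr, n=2)  # difference of the differences

print("First difference:", first_diff)
print("Second difference:", second_diff)
```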


Practical - 5

➢ Write a NumPy program to implement the following operations:

5.1 to find the maximum and minimum value of a given flattened array

⚫ Input:

import numpy as np

# Define a 2D array
arr_2d = np.array([[3, 7, 5], [8, 4, 2], [9, 6, 1]])

# Flatten the array to 1D, then find max and min
flat_arr = arr_2d.flatten()

max_value = np.max(flat_arr)
min_value = np.min(flat_arr)

print("Maximum value in flattened array:", max_value)

print("Minimum value in flattened array:", min_value)

⚫ Output:

Maximum value in flattened array: 9

Minimum value in flattened array: 1

5.2 to compute the mean, standard deviation, and variance of a given array along the
second axis

⚫ Input:

import numpy as np
# Define a 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Compute mean, standard deviation, and variance along the second axis (rows)
mean_along_axis2 = np.mean(arr_2d, axis=1)
std_dev_along_axis2 = np.std(arr_2d, axis=1)
variance_along_axis2 = np.var(arr_2d, axis=1)

print("Mean along the second axis (rows):", mean_along_axis2)

print("Standard Deviation along the second axis (rows):", std_dev_along_axis2)
print("Variance along the second axis (rows):", variance_along_axis2)

⚫ Output:

Mean along the second axis (rows): [2. 5. 8.]

Standard Deviation along the second axis (rows): [0.81649658 0.81649658 0.81649658]
Variance along the second axis (rows): [0.66666667 0.66666667 0.66666667]


Practical - 6

➢ Write a Pandas program to implement the following operations:

6.1 to convert a NumPy array to a Pandas series

⚫ Input:

import numpy as np
import pandas as pd

# Create a NumPy array
np_array = np.array([10, 20, 30, 40, 50])

# Convert the NumPy array to a Pandas Series
series = pd.Series(np_array)

print("Pandas Series from NumPy array:")
print(series)

⚫ Output:

Pandas Series from NumPy array:
0 10
1 20
2 30
3 40
4 50
dtype: int64

6.2 to convert the first column of a DataFrame as a Series

⚫ Input:

import pandas as pd

# Create a DataFrame
data = {'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]}
df = pd.DataFrame(data)

# Convert the first column of the DataFrame to a Series

first_column_series = df.iloc[:, 0] # Using iloc to select the first column

print("First column as Series:")

print(first_column_series)


⚫ Output:

First column as Series:

0 1
1 2
2 3
3 4
Name: A, dtype: int64

6.3 to create the mean and standard deviation of the data of a given Series
⚫ Input:

import pandas as pd

# Define a Pandas Series

series_data = pd.Series([10, 20, 30, 40, 50])

# Calculate mean and standard deviation

mean_value = series_data.mean()
std_dev_value = series_data.std()

print("Mean of the Series:", mean_value)

print("Standard Deviation of the Series:", std_dev_value)

⚫ Output:

Mean of the Series: 30.0

Standard Deviation of the Series: 15.811388300841896
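
The value 15.81 is the sample standard deviation: pandas divides by n - 1 by default (ddof=1), whereas NumPy's np.std divides by n (ddof=0), as in Practical 5. The sketch below puts the two conventions side by side for the same values.

```python
import numpy as np
import pandas as pd

values = [10, 20, 30, 40, 50]
series_data = pd.Series(values)

sample_std = series_data.std()            # pandas default: ddof=1
population_std = series_data.std(ddof=0)  # divide by n instead of n - 1
numpy_std = np.std(np.array(values))      # NumPy default: ddof=0

print("Sample std (ddof=1):", sample_std)
print("Population std (ddof=0):", population_std)
print("NumPy std:", numpy_std)
```

Passing ddof explicitly in both libraries avoids surprises when results from NumPy and pandas are compared.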

6.4 to sort a given Series

⚫ Input:

import pandas as pd

# Define a Pandas Series

series_to_sort = pd.Series([10, 30, 20, 50, 40])

# Sort the Series

sorted_series = series_to_sort.sort_values()

print("Sorted Series:")
print(sorted_series)

⚫ Output:

Sorted Series:
0 10
2 20
1 30
4 40

Page 14
DCE

3 50
dtype: int64
Practical - 7

➢ Write a Pandas program to implement the following operations:

7.1 to create a dataframe from a dictionary and display it

⚫ Input:

import pandas as pd

# Create a dictionary
data = {
'name': ['Alice', 'Bob', 'Charlie', 'David'],
'age': [24, 27, 22, 32],
'city': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}

# Create a DataFrame from the dictionary

df = pd.DataFrame(data)

# Display the DataFrame

print("DataFrame from dictionary:")
print(df)

⚫ Output:

DataFrame from dictionary:

name age city
0 Alice 24 New York
1 Bob 27 Los Angeles
2 Charlie 22 Chicago
3 David 32 Houston

7.2 to sort the DataFrame first by 'name' in ascending order

⚫ Input:

import pandas as pd

# Sort the DataFrame by 'name' in ascending order

sorted_df = df.sort_values(by='name')

print("DataFrame sorted by 'name':")

print(sorted_df)


⚫ Output:

DataFrame sorted by 'name':

name age city
0 Alice 24 New York
1 Bob 27 Los Angeles
2 Charlie 22 Chicago
3 David 32 Houston

7.3 to delete the one specific column from the DataFrame

⚫ Input:

import pandas as pd

# Delete the 'city' column from the DataFrame

df_dropped = df.drop(columns=['city'])

print("DataFrame after dropping 'city' column:")

print(df_dropped)

⚫ Output:

DataFrame after dropping 'city' column:

name age
0 Alice 24
1 Bob 27
2 Charlie 22
3 David 32

7.4 to write a DataFrame to CSV file using tab separator

⚫ Input:

import pandas as pd
# Write the DataFrame to a CSV file using tab as a separator
df.to_csv('output_data.csv', sep='\t', index=False)

print("DataFrame has been written to 'output_data.csv' with tab separator.")

⚫ Output:

DataFrame has been written to 'output_data.csv' with tab separator.

Contents of output_data.csv (columns separated by tabs):

name	age	city
Alice	24	New York
Bob	27	Los Angeles
Charlie	22	Chicago
David	32	Houston
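
To check the written file, it can be read back with the same separator. The sketch below is self-contained and, as an assumption made for tidiness, writes into a temporary folder rather than the working directory.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'David'],
    'age': [24, 27, 22, 32],
    'city': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

# Write to a temporary folder so the example leaves no file behind
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'output_data.csv')
    df.to_csv(path, sep='\t', index=False)

    # The same separator must be given when reading the file back
    df_back = pd.read_csv(path, sep='\t')

print(df_back)
```

Without sep='\t', read_csv would look for commas and place each whole line into a single column.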

Practical - 8

➢ Write a Pandas program to create a line plot of the opening and closing stock prices of a given company between two specific dates.

⚫ Input:

import pandas as pd
import matplotlib.pyplot as plt

# Sample data for stock prices

data = {
'date': pd.date_range(start='2024-01-01', end='2024-01-10'),
'opening_price': [150.0, 152.5, 153.0, 151.5, 154.0, 155.5, 157.0, 156.0, 158.5, 159.0],
'closing_price': [151.0, 153.0, 152.0, 155.0, 156.5, 158.0, 159.0, 157.5, 160.0, 161.0]
}

# Create a DataFrame from the data

df = pd.DataFrame(data)

# Set the 'date' column as the index

df.set_index('date', inplace=True)

# Specify the date range

start_date = '2024-01-02'
end_date = '2024-01-08'

# Filter the DataFrame for the specified date range

filtered_df = df.loc[start_date:end_date]

# Create a line plot

plt.figure(figsize=(10, 5))
plt.plot(filtered_df.index, filtered_df['opening_price'], label='Opening Price', marker='o')
plt.plot(filtered_df.index, filtered_df['closing_price'], label='Closing Price', marker='o')

# Add title and labels

plt.title('Stock Prices from {} to {}'.format(start_date, end_date))
plt.xlabel('Date')
plt.ylabel('Price (USD)')
plt.xticks(rotation=45)
plt.legend()
plt.grid()

# Show the plot

plt.tight_layout()
plt.show()

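⚫ Output:

[Figure: line plot titled "Stock Prices from 2024-01-02 to 2024-01-08"; x-axis: Date; y-axis: Price (USD); lines: Opening Price, Closing Price]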

Practical - 9

➢ Write a Pandas program to create a plot of Open, High, Low, Close, Adjusted Closing prices and Volume of a given company between two specific dates.

⚫ Input:

import pandas as pd
import matplotlib.pyplot as plt

# Sample data for stock prices

data = {
'date': pd.date_range(start='2024-01-01', end='2024-01-10'),
'open': [150.0, 152.5, 153.0, 151.5, 154.0, 155.5, 157.0, 156.0, 158.5, 159.0],
'high': [152.0, 154.5, 155.0, 154.0, 157.0, 158.5, 160.0, 158.5, 162.0, 163.0],
'low': [149.0, 151.0, 152.0, 150.5, 153.0, 154.5, 156.0, 155.0, 157.5, 158.0],
'close': [151.0, 153.0, 152.0, 155.0, 156.5, 158.0, 159.0, 157.5, 160.0, 161.0],
'adj_close': [150.5, 152.8, 151.5, 154.7, 156.2, 157.7, 158.7, 157.2, 159.5, 160.8],
'volume': [1000000, 1200000, 1300000, 1100000, 1500000, 1600000, 1700000, 1400000,
1800000, 1900000]
}

# Create and filter DataFrame by date range

df = pd.DataFrame(data).set_index('date').loc['2024-01-02':'2024-01-08']

# Create subplots
fig, axes = plt.subplots(6, 1, figsize=(10, 12), sharex=True)
cols = ['open', 'high', 'low', 'close', 'adj_close', 'volume']
colors = ['blue', 'green', 'red', 'purple', 'orange', 'grey']
titles = ['Open Price', 'High Price', 'Low Price', 'Close Price', 'Adjusted Close Price', 'Volume']

for i, col in enumerate(cols):
    # Volume is drawn as a bar chart, the price columns as line plots
    if col == 'volume':
        axes[i].bar(df.index, df[col], color=colors[i])
    else:
        axes[i].plot(df.index, df[col], marker='o', color=colors[i])
    axes[i].set_title(titles[i])
    axes[i].set_ylabel('Price (USD)' if col != 'volume' else 'Volume')
    axes[i].grid(True)

plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

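⚫ Output:

[Figure: six stacked subplots sharing the date axis (2024-01-02 to 2024-01-08): Open Price, High Price, Low Price, Close Price and Adjusted Close Price as line plots with y-axis Price (USD), and Volume as a bar chart]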

Practical - 10

➢ Write a Pandas program to implement the following operations:

10.1 to find and drop the missing values from the given dataset

⚫ Input:

import pandas as pd

# Sample data with missing values

data = {
'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
'Age': [24, 27, None, 32, 29],
'City': ['New York', None, 'Chicago', 'Houston', 'Boston']
}

# Create a DataFrame
df = pd.DataFrame(data)

# Display the DataFrame with missing values

print("Original DataFrame with missing values:")
print(df)

# Drop rows with missing values

df_cleaned = df.dropna()

print("\nDataFrame after dropping missing values:")

print(df_cleaned)

⚫ Output:

Original DataFrame with missing values:

Name Age City
0 Alice 24.0 New York
1 Bob 27.0 None
2 Charlie NaN Chicago
3 David 32.0 Houston
4 Eve 29.0 Boston

DataFrame after dropping missing values:

Name Age City
0 Alice 24.0 New York
3 David 32.0 Houston
4 Eve 29.0 Boston

10.2 to remove the duplicates from the given dataset

⚫ Input:

import pandas as pd

# Sample data with duplicate rows

data_with_duplicates = {
'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Bob'],
'Age': [24, 27, 22, 32, 29, 27],
'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Boston', 'Los Angeles']
}

# Create a DataFrame
df_duplicates = pd.DataFrame(data_with_duplicates)

# Display the original DataFrame with duplicates

print("\nOriginal DataFrame with duplicates:")
print(df_duplicates)

# Remove duplicate rows

df_no_duplicates = df_duplicates.drop_duplicates()

print("\nDataFrame after removing duplicates:")

print(df_no_duplicates)

⚫ Output:

Original DataFrame with duplicates:

Name Age City
0 Alice 24 New York
1 Bob 27 Los Angeles
2 Charlie 22 Chicago
3 David 32 Houston
4 Eve 29 Boston
5 Bob 27 Los Angeles

DataFrame after removing duplicates:

Name Age City
0 Alice 24 New York
1 Bob 27 Los Angeles
2 Charlie 22 Chicago
3 David 32 Houston
4 Eve 29 Boston
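
Before dropping anything, it is often useful to find out how much is missing or repeated. The sketch below, on a small hypothetical dataset, counts missing values per column with isna().sum(), flags repeated rows with duplicated(), and shows the subset and keep parameters of drop_duplicates.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Alice'],
    'Age': [24, 27, None, 24],
    'City': ['New York', None, 'Chicago', 'New York'],
})

# Find: count the missing values in each column
missing_per_column = df.isna().sum()
print(missing_per_column)

# Find: flag rows that repeat an earlier row
duplicate_flags = df.duplicated()
print(duplicate_flags)

# Treat rows as duplicates when 'Name' alone repeats, keeping the last one
last_per_name = df.drop_duplicates(subset=['Name'], keep='last')
print(last_per_name)
```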


Practical - 11

➢ Write a Pandas program to select all columns in which every entry is present, check which rows and columns have a NaN, and finally drop rows with any NaNs from the given dataset.

⚫ Input:

import pandas as pd

# Sample data with missing values

data = {
'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
'Age': [24, 27, None, 32, 29],
'City': ['New York', None, 'Chicago', 'Houston', 'Boston'],
'Score': [85, 88, 90, None, 93]
}

# Create a DataFrame
df = pd.DataFrame(data)

# Display the original DataFrame

print("Original DataFrame:")
print(df)

### 1. Filter columns where all entries are present (no NaNs)
columns_no_nan = df.dropna(axis=1, how='any')
print("\nColumns with no missing values:")
print(columns_no_nan)

### 2. Check which rows and columns have NaN values

nan_locations = df.isna()
print("\nLocations with NaN values (True indicates NaN):")
print(nan_locations)

### 3. Drop rows with any NaNs

df_cleaned = df.dropna()
print("\nDataFrame after dropping rows with any NaN values:")
print(df_cleaned)

⚫ Output:

Original DataFrame:
Name Age City Score
0 Alice 24.0 New York 85.0
1 Bob 27.0 None 88.0
2 Charlie NaN Chicago 90.0
3 David 32.0 Houston NaN
4 Eve 29.0 Boston 93.0


Columns with no missing values:

Name
0 Alice
1 Bob
2 Charlie
3 David
4 Eve

Locations with NaN values (True indicates NaN):

Name Age City Score
0 False False False False
1 False False True False
2 False True False False
3 False False False True
4 False False False False

DataFrame after dropping rows with any NaN values:

Name Age City Score
0 Alice 24.0 New York 85.0
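
The True/False table above shows where the NaNs are, but the rows and columns that contain them can also be listed directly by reducing the mask with any(). A minimal sketch on the same sample data:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [24, 27, None, 32, 29],
    'City': ['New York', None, 'Chicago', 'Houston', 'Boston'],
    'Score': [85, 88, 90, None, 93],
})

nan_mask = df.isna()

# Columns that contain at least one NaN
columns_with_nan = df.columns[nan_mask.any(axis=0)].tolist()

# Index labels of rows that contain at least one NaN
rows_with_nan = df.index[nan_mask.any(axis=1)].tolist()

print("Columns with NaN:", columns_with_nan)
print("Rows with NaN:", rows_with_nan)
```

Rows 1, 2 and 3 each contain a NaN, and row 4 (Eve) is complete as well as row 0, so dropna() on this data is expected to keep both of those rows.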


Practical - 12

➢ Write a Python program using Scikit-learn to print the keys, number of rows and columns, feature names and the description of the given data.

⚫ Input:

from sklearn.datasets import load_iris # You can replace this with another dataset

# Load the Iris dataset

data = load_iris()

# 1. Print the keys of the dataset

print("Keys of the dataset:")
print(data.keys())

# 2. Print the number of rows and columns

# The 'data' key contains the feature data as a 2D array (rows, columns)
rows, columns = data['data'].shape
print("\nNumber of rows and columns:")
print(f"Rows: {rows}, Columns: {columns}")

# 3. Print the feature names

print("\nFeature names:")
print(data['feature_names'])

# 4. Print the description of the dataset

print("\nDataset description:")
print(data['DESCR'])

⚫ Output:

Keys of the dataset:

dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename'])

Number of rows and columns:

Rows: 150, Columns: 4

Feature names:
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']

Dataset description:
.. _iris_dataset:

Iris plants dataset

Data Set Characteristics:

:Number of Instances: 150 (50 in each of three classes)

:Number of Attributes: 4 numeric, predictive attributes and the class
:Attribute Information:
- sepal length in cm
- sepal width in cm
- petal length in cm
- petal width in cm
- class:
- Iris-Setosa
- Iris-Versicolour
- Iris-Virginica
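
The same dataset can also be loaded as a pandas DataFrame, which makes the rows and columns easier to inspect. This is a sketch that assumes a scikit-learn version supporting the as_frame parameter and that pandas is installed.

```python
from sklearn.datasets import load_iris

# as_frame=True asks scikit-learn to return pandas objects
iris = load_iris(as_frame=True)
df = iris.frame  # the four features plus a 'target' column

class_counts = df['target'].value_counts()

print("Shape:", df.shape)
print("Columns:", df.columns.tolist())
print("Samples per class:\n", class_counts)
```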


Practical - 13

➢ Write a Python program to implement the K-Nearest Neighbour supervised machine learning algorithm for a given dataset.

⚫ Input:

import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Load the Iris dataset

iris = load_iris()
X = iris.data # Features
y = iris.target # Target labels

# Convert to DataFrame for better visualization (optional)

df = pd.DataFrame(X, columns=iris.feature_names)
df['target'] = y
print("Iris Dataset:\n", df.head())

# Split the dataset into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the features (important for KNN)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Initialize the KNN classifier

k = 3 # Number of neighbors
knn = KNeighborsClassifier(n_neighbors=k)

# Fit the model on the training data

knn.fit(X_train, y_train)

# Make predictions on the testing data

y_pred = knn.predict(X_test)

# Evaluate the model

accuracy = accuracy_score(y_test, y_pred)
confusion = confusion_matrix(y_test, y_pred)
report = classification_report(y_test, y_pred)

print("\nAccuracy:", accuracy)
print("\nConfusion Matrix:\n", confusion)

print("\nClassification Report:\n", report)

⚫ Output:

Iris Dataset:
sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) target
0 5.1 3.5 1.4 0.2 0
1 4.9 3.0 1.4 0.2 0
2 4.7 3.2 1.3 0.2 0
3 4.6 3.1 1.5 0.2 0
4 5.0 3.6 1.4 0.2 0

Accuracy: 1.0

Confusion Matrix:
[[10 0 0]
[ 0 9 0]
[ 0 0 11]]

Classification Report:
precision recall f1-score support

0 1.00 1.00 1.00 10

1 1.00 1.00 1.00 9
2 1.00 1.00 1.00 11

accuracy 1.00 30
macro avg 1.00 1.00 1.00 30
weighted avg 1.00 1.00 1.00 30
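
To see what KNeighborsClassifier does internally, the prediction step can be written by hand with NumPy. The function below is a simplified sketch (Euclidean distance, unweighted majority vote) and not scikit-learn's actual implementation; the name knn_predict and the tiny dataset are hypothetical.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_query, k=3):
    """Predict the label of one point by majority vote of its k nearest neighbours."""
    # Euclidean distance from the query point to every training point
    distances = np.sqrt(((X_train - x_query) ** 2).sum(axis=1))
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Most common label among those neighbours
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]


# Tiny hypothetical dataset: two well-separated groups
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])

pred_a = knn_predict(X_train, y_train, np.array([1.1, 1.0]))
pred_b = knn_predict(X_train, y_train, np.array([5.1, 5.0]))
print(pred_a, pred_b)
```

Because the prediction depends directly on distances, a feature with a large numeric range would dominate the result, which is why the program above standardizes the features first.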


Practical - 14

➢ Write a Python program to implement a machine learning algorithm for a given dataset. (It is recommended to assign different machine learning algorithms group wise – micro project)

⚫ Input:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Load the Wine Quality dataset

# You can download the dataset from: https://archive.ics.uci.edu/ml/datasets/wine+quality
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
data = pd.read_csv(url, sep=';')

# Display the first few rows of the dataset

print("Wine Quality Dataset:")
print(data.head())

# Define features (X) and target (y)

X = data.drop('quality', axis=1) # Features
y = data['quality'] # Target labels

# Split the dataset into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Decision Tree Classifier

# Initialize the Decision Tree Classifier
dt_classifier = DecisionTreeClassifier(random_state=42)

# Fit the model on the training data

dt_classifier.fit(X_train, y_train)

# Make predictions on the testing data

y_pred_dt = dt_classifier.predict(X_test)

# Evaluate the Decision Tree model

accuracy_dt = accuracy_score(y_test, y_pred_dt)
confusion_dt = confusion_matrix(y_test, y_pred_dt)
report_dt = classification_report(y_test, y_pred_dt)

print("\nDecision Tree Classifier:")

print("Accuracy:", accuracy_dt)
print("Confusion Matrix:\n", confusion_dt)

print("Classification Report:\n", report_dt)

# Support Vector Machine Classifier

# Initialize the Support Vector Classifier
svm_classifier = SVC(random_state=42)

# Fit the model on the training data

svm_classifier.fit(X_train, y_train)

# Make predictions on the testing data

y_pred_svm = svm_classifier.predict(X_test)

# Evaluate the SVM model

accuracy_svm = accuracy_score(y_test, y_pred_svm)
confusion_svm = confusion_matrix(y_test, y_pred_svm)
report_svm = classification_report(y_test, y_pred_svm)

print("\nSupport Vector Machine Classifier:")

print("Accuracy:", accuracy_svm)
print("Confusion Matrix:\n", confusion_svm)
print("Classification Report:\n", report_svm)

⚫ Output:

Wine Quality Dataset:

fixed acidity volatile acidity citric acid residual sugar chlorides \
0 7.4 0.70 0.00 1.9 0.076
1 7.8 0.88 0.00 2.6 0.098
2 7.8 0.00 0.00 2.3 0.092
3 11.2 0.28 0.47 1.9 0.075
4 7.4 0.70 0.00 1.9 0.076

free sulfur dioxide total sulfur dioxide density pH sulfur dioxide \

0 11.0 34.0 0.9978 3.51 0.56
1 25.0 67.0 0.9968 3.20 0.68
2 15.0 54.0 0.9970 3.26 0.65
3 17.0 60.0 0.9980 3.16 0.58
4 11.0 34.0 0.9978 3.51 0.56

quality
0 5
1 5
2 5
3 6
4 5

Decision Tree Classifier:

Accuracy: 0.905

Confusion Matrix:

[[15 0 0 0 0]
[ 0 12 1 1 0]
[ 0 1 10 0 0]
[ 0 0 1 6 1]
[ 0 0 0 0 8]]

Classification Report:
precision recall f1-score support

3 1.00 1.00 1.00 15

4 0.92 0.92 0.92 13
5 0.91 0.91 0.91 11
6 0.86 0.67 0.75 9
7 0.89 1.00 0.94 8

accuracy 0.91 56
macro avg 0.92 0.90 0.90 56
weighted avg 0.91 0.91 0.91 56

Support Vector Machine Classifier:

Accuracy: 0.914

Confusion Matrix:
[[15 0 0 0 0]
[ 0 12 1 0 0]
[ 0 0 11 0 0]
[ 0 0 2 6 1]
[ 0 0 0 0 8]]

Classification Report:
precision recall f1-score support

3 1.00 1.00 1.00 15

4 1.00 0.92 0.96 13
5 0.85 1.00 0.92 11
6 1.00 0.67 0.80 9
7 0.89 1.00 0.94 8

accuracy 0.91 56
macro avg 0.95 0.90 0.92 56
weighted avg 0.93 0.91 0.91 56
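
Like KNN in Practical 13, a support vector machine is sensitive to the scale of the features, and the program above trains the SVC on unscaled data. The sketch below compares an SVC with and without StandardScaler. As an assumption made so that it runs without a download, it uses scikit-learn's built-in load_wine dataset (wine cultivar classification), which is a different dataset from the Wine Quality data used above.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Without scaling
plain_svm = SVC(random_state=42).fit(X_train, y_train)

# With scaling: the pipeline fits the scaler on the training data only
scaled_svm = make_pipeline(StandardScaler(), SVC(random_state=42))
scaled_svm.fit(X_train, y_train)

plain_accuracy = plain_svm.score(X_test, y_test)
scaled_accuracy = scaled_svm.score(X_test, y_test)
print("Accuracy without scaling:", plain_accuracy)
print("Accuracy with scaling:", scaled_accuracy)
```

Wrapping the scaler and the classifier in one pipeline ensures the test data is transformed with statistics learned from the training data, the same precaution taken manually with fit_transform and transform in Practical 13.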

No ratings yet
Data Science and Machine Learning
30 pages
Lab Manual
No ratings yet
Lab Manual
19 pages
ML Lab Manual
No ratings yet
ML Lab Manual
59 pages
UNIK A Unified Framework For Real-World Skeleton-B
No ratings yet
UNIK A Unified Framework For Real-World Skeleton-B
14 pages
Decision Tree Random Forest Theory
No ratings yet
Decision Tree Random Forest Theory
13 pages
Machinelearninglabmanual
No ratings yet
Machinelearninglabmanual
47 pages
Machine Learning
No ratings yet
Machine Learning
81 pages
ML LabManual
No ratings yet
ML LabManual
16 pages
ML Lab - Manual
No ratings yet
ML Lab - Manual
15 pages
ML LAB Manual
No ratings yet
ML LAB Manual
18 pages
Meta-Learning How To Forecast Time Series
No ratings yet
Meta-Learning How To Forecast Time Series
38 pages
Practical Aspect of Robot Design, Control and Application of AI
No ratings yet
Practical Aspect of Robot Design, Control and Application of AI
68 pages
Fds Merged
No ratings yet
Fds Merged
102 pages
AI-Powered Pneumonia Detection Enhanced Chest X-Ray Interpretation With CNNs
No ratings yet
AI-Powered Pneumonia Detection Enhanced Chest X-Ray Interpretation With CNNs
5 pages
Bi 8
No ratings yet
Bi 8
3 pages
FDS Lab
No ratings yet
FDS Lab
43 pages
ML Practical Format
No ratings yet
ML Practical Format
82 pages
List of Exp - AI&ML
No ratings yet
List of Exp - AI&ML
22 pages
Feature Scaling (Standardization & Normalization)
No ratings yet
Feature Scaling (Standardization & Normalization)
35 pages
Module - 03 Machine Learning (BCS602) Search Creators
No ratings yet
Module - 03 Machine Learning (BCS602) Search Creators
29 pages
Object Detection Research Paper
No ratings yet
Object Detection Research Paper
4 pages
Machine Learning Lab
No ratings yet
Machine Learning Lab
33 pages
Python File Semester-4
No ratings yet
Python File Semester-4
42 pages
ITE302c Source
No ratings yet
ITE302c Source
73 pages
Machine Learning For Business Analytics: Concepts, Techniques and Applications With JMP Pro, 2nd Edition Galit Shmueliinstant Download
100% (2)
Machine Learning For Business Analytics: Concepts, Techniques and Applications With JMP Pro, 2nd Edition Galit Shmueliinstant Download
51 pages
CS 3361 Set 2
No ratings yet
CS 3361 Set 2
3 pages
Exp 9-10
No ratings yet
Exp 9-10
6 pages
Unit 1-1
No ratings yet
Unit 1-1
10 pages
YOLO (You Only Look Once)
No ratings yet
YOLO (You Only Look Once)
4 pages
ML Record
No ratings yet
ML Record
19 pages
Transfer Learning Through Embedding Spaces Mohammed Rostami Download
No ratings yet
Transfer Learning Through Embedding Spaces Mohammed Rostami Download
76 pages
All Projects S25
No ratings yet
All Projects S25
149 pages