IML Lab Manual
Chandkheda, Ahmedabad
(4350702)
LAB MANUAL
(Faculty Guide)
Academic Year
(2024-2025)
CERTIFICATE
This is to certify that Mr./Ms. ____________________ has satisfactorily completed his/her term work in the subject for the term ending in 2024.
Date:
Index
S. No.  Practical Outcomes (PrOs)  Page No.  Sign

8.  Write a Pandas program to create a line plot of the opening, closing stock prices of given company between two specific dates.
11. Write a Pandas program to filter all columns where all entries are present, check which rows and columns have a NaN, and finally drop rows with any NaNs from the given dataset.
Practical - 1
Explore any one machine learning tool (e.g., Weka, TensorFlow, Scikit-learn, Colab).
Overview of Scikit-learn:
⚫ Scikit-learn is an open-source Python library that provides simple and efficient tools for data mining, data analysis, and machine learning. It is built on top of NumPy, SciPy, and matplotlib, which makes it an excellent choice for machine learning tasks, and it is widely used for building predictive models, analysing data, and extracting features.
Key Features:
1. Classification: Identifying which category an object belongs to (e.g., spam detection).
2. Regression: Predicting continuous-valued attributes associated with an object (e.g.,
predicting prices).
3. Clustering: Grouping similar objects together (e.g., customer segmentation).
4. Dimensionality Reduction: Reducing the number of random variables to consider (e.g.,
principal component analysis).
5. Model Selection: Comparing, validating, and choosing models with different parameters.
6. Preprocessing: Feature extraction and normalization for data preparation.
Basic Workflow:
i. Importing data: Load and prepare your dataset (usually as a NumPy array or pandas
DataFrame).
ii. Splitting data: Divide your dataset into training and test sets using train_test_split().
iii. Model selection: Choose an appropriate model (e.g., LinearRegression,
DecisionTreeClassifier).
iv. Training: Fit the model on the training data using .fit().
v. Prediction: Use the model to predict results on the test data with .predict().
vi. Evaluation: Measure the model’s performance using metrics such as accuracy, R² score, or a confusion matrix.
Why Use Scikit-learn?
⚫ It offers a consistent, well-documented API for the entire workflow above, integrates cleanly with NumPy, SciPy, and matplotlib, and lets you swap one model for another with minimal code changes.
⚫ Input:
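The original demonstration code was not captured in this copy; below is a minimal sketch of the workflow described above, using the bundled iris dataset and a decision tree classifier (the choice of dataset and model is an assumption).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# i. Import data: the iris dataset bundled with scikit-learn
X, y = load_iris(return_X_y=True)

# ii. Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# iii./iv. Select and train a model
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# v. Predict on the test data
y_pred = model.predict(X_test)

# vi. Evaluate the model
print("Accuracy:", accuracy_score(y_test, y_pred))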
⚫ Output:
Practical - 2
⚫ Input:
import numpy as np
⚫ Output:
⚫ Input:
import numpy as np
# Create a 3x3 matrix with values ranging from 2 to 10
matrix_3x3 = np.arange(2, 11).reshape(3, 3)
print("3x3 matrix with values 2 to 10:")
print(matrix_3x3)
⚫ Output:
⚫ Input:
import numpy as np
# Original array
arr = np.array([1, 2, 3])
⚫ Output:
2.4 to create another shape from an array without changing its data (3x2 to 2x3)
⚫ Input:
import numpy as np
# Original 3x2 array (values assumed, since the definition was missing)
arr_3x2 = np.array([[1, 2], [3, 4], [5, 6]])
# Reshape to 2x3 without changing the data
reshaped_arr = arr_3x2.reshape(2, 3)
print(reshaped_arr)
⚫ Output:
Practical - 3
3.1 to split an array of 14 elements into 3 arrays, each with 2, 4, and 8 elements in the original
order
⚫ Input:
import numpy as np
# Array of 14 elements (values assumed)
arr = np.arange(1, 15)
# Split into 3 arrays of 2, 4, and 8 elements (split points at indices 2 and 6)
arr_split = np.split(arr, [2, 6])
print("Split arrays:")
for part in arr_split:
    print(part)
⚫ Output:
⚫ Input:
import numpy as np
⚫ Output:
Practical - 4
⚫ Input:
import numpy as np
# Two input arrays (values assumed; the originals were not included)
arr1 = np.array([10, 20, 30, 40])
arr2 = np.array([1, 2, 3, 4])
# Element-wise operations
add_result = np.add(arr1, arr2)
subtract_result = np.subtract(arr1, arr2)
multiply_result = np.multiply(arr1, arr2)
divide_result = np.divide(arr1, arr2)
print("Add:", add_result, "Subtract:", subtract_result)
print("Multiply:", multiply_result, "Divide:", divide_result)
⚫ Output:
⚫ Input:
import numpy as np
⚫ Output:
⚫ Input:
import numpy as np
# Define a 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
⚫ Output:
4.4 to calculate the difference between neighboring elements, element-wise, of a given array
⚫ Input:
import numpy as np
# Define an array
arr = np.array([10, 20, 30, 40, 50])
# Difference between neighboring elements
diff_result = np.diff(arr)
print("Differences:", diff_result)
⚫ Output:
Practical - 5
Write a NumPy program to implement the following operations.
5.1 to find the maximum and minimum value of a given flattened array
⚫ Input:
import numpy as np
# Define a 2D array
arr_2d = np.array([[3, 7, 5], [8, 4, 2], [9, 6, 1]])
# Maximum and minimum value of the flattened array
print("Maximum value:", np.max(arr_2d))
print("Minimum value:", np.min(arr_2d))
⚫ Output:
5.2 to compute the mean, standard deviation, and variance of a given array along the
second axis
⚫ Input:
import numpy as np
# Define a 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Compute mean, standard deviation, and variance along the second axis (rows)
mean_along_axis2 = np.mean(arr_2d, axis=1)
std_dev_along_axis2 = np.std(arr_2d, axis=1)
variance_along_axis2 = np.var(arr_2d, axis=1)
print("Mean along second axis:", mean_along_axis2)
print("Standard deviation along second axis:", std_dev_along_axis2)
print("Variance along second axis:", variance_along_axis2)
⚫ Output:
Practical - 6
⚫ Input:
import pandas as pd
# Create a dictionary
data = {
    'name': ['Alice', 'Bob', 'Charlie', 'David'],
    'age': [24, 27, 22, 32],
    'city': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}
# Convert the dictionary into a DataFrame and display it
df = pd.DataFrame(data)
print(df)
⚫ Output:
⚫ Input:
import pandas as pd
# Create a DataFrame
data = {'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]}
df = pd.DataFrame(data)
print(df)
⚫ Output:
6.3 to compute the mean and standard deviation of the data of a given Series
⚫ Input:
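The original input was not captured in this copy; below is a minimal sketch, assuming a small numeric Series with arbitrary values.
import pandas as pd

# Sample Series (values assumed)
s = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

print("Mean of the Series:", s.mean())
print("Standard deviation of the Series:", s.std())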
⚫ Output:
⚫ Input:
print("Sorted Series:")
print(sorted_series)
⚫ Output:
Sorted Series:
0    10
2    20
1    30
4    40
3    50
dtype: int64
Practical - 7
⚫ Input:
import pandas as pd
# Create a dictionary
data = {
'name': ['Alice', 'Bob', 'Charlie', 'David'],
'age': [24, 27, 22, 32],
'city': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}
⚫ Output:
⚫ Input:
import pandas as pd
⚫ Output:
⚫ Input:
import pandas as pd
⚫ Output:
⚫ Input:
import pandas as pd
# DataFrame to export (contents assumed; the original definition was missing)
df = pd.DataFrame({'name': ['Alice', 'Bob'], 'age': [24, 27]})
# Write the DataFrame to a CSV file using tab as a separator
df.to_csv('output_data.csv', sep='\t', index=False)
print("DataFrame written to output_data.csv")
⚫ Output:
Practical - 8
Write a Pandas program to create a line plot of the opening, closing stock prices of given company between two specific dates.
⚫ Input:
import pandas as pd
import matplotlib.pyplot as plt
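The rest of the program was not captured in this copy; below is a minimal sketch, assuming the stock data is in a CSV file named stock_data.csv (hypothetical) with date, open, and close columns, and an arbitrary pair of dates.
# Load stock data (file name, column names, and dates are assumptions)
df = pd.read_csv('stock_data.csv', parse_dates=['date'])

# Keep only rows between two specific dates
mask = (df['date'] >= '2024-01-01') & (df['date'] <= '2024-03-31')
df_range = df.loc[mask]

# Line plot of opening and closing prices
plt.plot(df_range['date'], df_range['open'], label='Open')
plt.plot(df_range['date'], df_range['close'], label='Close')
plt.xlabel('Date')
plt.ylabel('Price')
plt.title('Opening and Closing Prices')
plt.legend()
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()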
⚫ Output:
Practical - 9
Write a Pandas program to create a plot of Open, High, Low, Close, Adjusted
Closing prices and Volume of given company between two specific dates.
⚫ Input:
import pandas as pd
import matplotlib.pyplot as plt
# Load stock data (file name and column names are assumed; the original was not captured)
df = pd.read_csv('stock_data.csv', parse_dates=['date'], index_col='date')
# Keep only rows between two specific dates (dates assumed)
df = df.loc['2024-01-01':'2024-03-31']
# Create subplots
fig, axes = plt.subplots(6, 1, figsize=(10, 12), sharex=True)
cols = ['open', 'high', 'low', 'close', 'adj_close', 'volume']
colors = ['blue', 'green', 'red', 'purple', 'orange', 'grey']
titles = ['Open Price', 'High Price', 'Low Price', 'Close Price', 'Adjusted Close Price', 'Volume']
# Plot each column on its own subplot
for ax, col, color, title in zip(axes, cols, colors, titles):
    ax.plot(df.index, df[col], color=color)
    ax.set_title(title)
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
⚫ Output:
Practical - 10
10.1 to find and drop the missing values from the given dataset
⚫ Input:
import pandas as pd
import numpy as np
# Create a DataFrame with some missing values (contents assumed)
df = pd.DataFrame({'A': [1, 2, np.nan, 4], 'B': [5, np.nan, 7, 8]})
print("Missing values per column:\n", df.isnull().sum())   # find missing values
print("After dropping rows with NaNs:\n", df.dropna())      # drop missing values
⚫ Output:
⚫ Input:
import pandas as pd
# Create a DataFrame containing duplicate rows (contents assumed)
data_with_duplicates = {'name': ['Alice', 'Bob', 'Alice'], 'age': [24, 27, 24]}
df_duplicates = pd.DataFrame(data_with_duplicates)
print("Duplicate rows:\n", df_duplicates[df_duplicates.duplicated()])
print("After dropping duplicates:\n", df_duplicates.drop_duplicates())
⚫ Output:
Practical - 11
Write a Pandas program to filter all columns where all entries are present, check which rows and columns have a NaN, and finally drop rows with any NaNs from the given dataset.
⚫ Input:
import pandas as pd
import numpy as np
# Create a DataFrame (values reconstructed from the output below)
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
        'Age': [24, 27, np.nan, 32, 29],
        'City': ['New York', None, 'Chicago', 'Houston', 'Boston'],
        'Score': [85, 88, 90, np.nan, 93]}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
# 1. Filter columns where all entries are present (no NaNs)
columns_no_nan = df.dropna(axis=1, how='any')
print("\nColumns with no missing values:")
print(columns_no_nan)
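The remaining two steps of the program were not captured in this copy; below is a minimal sketch of steps 2 and 3, continuing from the DataFrame df created above.
# 2. Check which rows and columns have a NaN
print("\nRows containing a NaN:")
print(df.isnull().any(axis=1))
print("\nColumns containing a NaN:")
print(df.isnull().any(axis=0))
# 3. Drop rows with any NaNs
df_no_nan_rows = df.dropna(axis=0, how='any')
print("\nDataFrame after dropping rows with any NaNs:")
print(df_no_nan_rows)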
⚫ Output:
Original DataFrame:
Name Age City Score
0 Alice 24.0 New York 85.0
1 Bob 27.0 None 88.0
2 Charlie NaN Chicago 90.0
3 David 32.0 Houston NaN
4 Eve 29.0 Boston 93.0
Practical - 12
Write a Python program using Scikit-learn to print the keys, number of rows and columns, feature names, and the description of the given data.
⚫ Input:
from sklearn.datasets import load_iris # You can replace this with another dataset
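Only the import line survived in this copy; below is a minimal sketch of the rest of the program, continuing from the load_iris import above.
# Load the dataset into a scikit-learn "bunch" object
data = load_iris()

print("Keys of the dataset:", data.keys())
print("Number of rows and columns:", data.data.shape)
print("Feature names:", data.feature_names)
print("Dataset description:")
print(data.DESCR)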
⚫ Output:
Feature names:
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
Dataset description:
.. _iris_dataset:
Practical - 13
⚫ Input:
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
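# NOTE: the body of the program between the imports and the final print
# statements was missing from this copy. The following reconstruction is
# consistent with the output shown below; the split and model parameters
# (test_size=0.2, random_state=42, n_neighbors=5) are assumptions.
# Load the iris dataset into a DataFrame and show the first rows
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target
print("Iris Dataset:")
print(df.head())
# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)
# Standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Train a K-Nearest Neighbors classifier and predict on the test set
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
confusion = confusion_matrix(y_test, y_pred)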
print("\nAccuracy:", accuracy)
print("\nConfusion Matrix:\n", confusion)
⚫ Output:
Iris Dataset:
sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) target
0 5.1 3.5 1.4 0.2 0
1 4.9 3.0 1.4 0.2 0
2 4.7 3.2 1.3 0.2 0
3 4.6 3.1 1.5 0.2 0
4 5.0 3.6 1.4 0.2 0
Accuracy: 1.0
Confusion Matrix:
[[10 0 0]
[ 0 9 0]
[ 0 0 11]]
Classification Report:
precision recall f1-score support
accuracy 1.00 30
macro avg 1.00 1.00 1.00 30
weighted avg 1.00 1.00 1.00 30
Practical - 14
⚫ Input:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
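The body of the program was not captured in this copy. Below is a sketch of one way the comparison might be implemented, assuming the red wine quality dataset in a file named winequality-red.csv (hypothetical path, semicolon-separated) with a quality target column; these assumptions will not reproduce the exact numbers in the output below.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Load the wine quality dataset (file name and separator are assumptions)
df = pd.read_csv('winequality-red.csv', sep=';')
print(df[['quality']].head())

# Features and target
X = df.drop('quality', axis=1)
y = df['quality']

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train and evaluate a Decision Tree and an SVM classifier
for name, model in [('Decision Tree', DecisionTreeClassifier(random_state=42)), ('SVM', SVC())]:
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    print("\n" + name, "Accuracy:", accuracy_score(y_test, y_pred))
    print(name, "Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
    print(name, "Classification Report:\n", classification_report(y_test, y_pred, zero_division=0))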
⚫ Output:
quality
0 5
1 5
2 5
3 6
4 5
Confusion Matrix:
[[15 0 0 0 0]
[ 0 12 1 1 0]
[ 0 1 10 0 0]
[ 0 0 1 6 1]
[ 0 0 0 0 8]]
Classification Report:
precision recall f1-score support
accuracy 0.91 56
macro avg 0.92 0.90 0.90 56
weighted avg 0.91 0.91 0.91 56
Confusion Matrix:
[[15 0 0 0 0]
[ 0 12 1 0 0]
[ 0 0 11 0 0]
[ 0 0 2 6 1]
[ 0 0 0 0 8]]
Classification Report:
precision recall f1-score support
accuracy 0.91 56
macro avg 0.95 0.90 0.92 56
weighted avg 0.93 0.91 0.91 56