
P.E.S. COLLEGE OF ENGINEERING
Mandya-571401, Karnataka
(An Autonomous Institution, under Visvesvaraya Technological University, Belagavi)
Aided by Govt. of Karnataka Recognized by AICTE, New Delhi.
Phone: 08232-220043, 220120 Extn:213 Fax:08232-222075

Department of Master of Computer Applications

II SEMESTER

LAB MANUAL
Machine Learning and Data Analytics using Python
(Integrated course)

Subject Code: P24MCA21

Academic Year: 2024-2026

VISION AND MISSION



Vision of PESCE
PESCE shall be a leading institution imparting quality engineering and management education
developing creative and socially responsible professionals.

Mission of PESCE
 Provide state-of-the-art infrastructure, motivate the faculty to be proficient in their field of
specialization and adopt best teaching-learning practices.
 Impart engineering and managerial skills through competent and committed faculty using an
outcome-based educational curriculum.
 Inculcate professional ethics, leadership qualities and entrepreneurial skills to meet
societal needs.
 Promote research, product development and industry-institution interaction.

Vision of the Department


A Department of high repute imparting quality education to develop competent computer
application software professionals and technocrats to serve the society.

Mission of the Department


Committed to
 To provide state-of-the-art facilities with a supportive environment for teaching and learning.
 To prepare the students with curricula meeting industry expectations.
 To train the students to be competent in solving real-world problems in the field of computer
applications, nurturing them with ethical values for the well-being of society.


PROGRAM EDUCATIONAL OBJECTIVES (PEOs)


PEO-1. Deliver competence in a global environment as a computer software professional with
practice of software engineering principles.

PEO-2. Exhibit technical and managerial skills to provide solutions for societally acceptable
problems and manage projects.

PEO-3. Excel in the profession with effective communication skills, ethical attitude, teamwork
and the ability to relate computer applications to the broader societal context.

PROGRAMME OUTCOMES (POs)


PO-1. (Foundation Knowledge): Apply knowledge of mathematics, programming logic and
coding fundamentals for solution architecture and problem solving.

PO-2. (Problem Analysis): Identify, review, formulate and analyze problems for primarily
focusing on customer requirements using critical thinking frameworks.

PO-3. (Development of Solutions): Design, develop and investigate problems with an
innovative approach for solutions incorporating ESG/SDG goals.

PO-4. (Modern Tool Usage): Select, adapt and apply modern computational tools such as
development of algorithms with an understanding of the limitations including human biases.

PO-5. (Individual and Teamwork): Function and communicate effectively as an individual or a


team leader in diverse and multidisciplinary groups. Use methodologies such as agile.

PO-6. (Project Management and Finance): Use the principles of project management such as

scheduling, work breakdown structure and be conversant with the principles of Finance for

profitable project management.

PO-7. (Ethics): Commit to professional ethics in managing software projects with financial
aspects, learn to use new technologies for cyber security and insulate customers from
malware.

PO-8. (Life-long Learning): Change management skills and the ability to learn, keep up with
contemporary technologies and ways of working.


List of Experiments

1. Python programs to show the usage of Python Libraries for ML applications such as Pandas, Matplotlib and Seaborn. Read the training data from a .CSV file
2. Write a program to demonstrate Regression analysis with residual plots on a given data set
3. Write a program to implement the binary logistic Bayesian classifier for a sample training data set stored as a .CSV file. Compute the accuracy of the classifier, considering few test data sets
4. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print both correct and wrong predictions
5. Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an appropriate data set for building the decision tree and apply this knowledge to classify a new sample
6. Write a program to implement k-Means clustering algorithm to cluster the set of data stored in a .CSV file
7. Write a program to implement SVM algorithm to classify the iris data set. Print both correct and wrong predictions
8. Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets
9. Write a program to compute summary statistics such as mean, median, mode, standard deviation and variance of the given different types of data


1. Python programs to show the usage of Python Libraries for ML applications such as Pandas,
Matplotlib and Seaborn. Read the training data from a .CSV file

Name of the Dataset

Autos_mpg.data: info about different cars & their characteristics


(http://archive.ics.uci.edu/ml/datasets/auto+mpg)

Dataset Description

1. 'mpg': Miles per gallon, a measure of fuel efficiency.


2. 'cylinders': Number of cylinders in the engine.
3. 'displacement': Total volume of all cylinders in an engine.
4. 'horsepower': Engine power output measured in horsepower.
5. 'weight': Total weight of the vehicle.
6. 'acceleration': Rate at which the vehicle can increase its speed.
7. 'year': The manufacturing year of the vehicle.
8. 'origin': Country of origin of the vehicle.
9. 'name': Name or identifier of the vehicle model.

In [1]: import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

In [2]: autos=pd.read_csv(r'D:\Teaching\ML\auto+mpg\auto-mpg.data', sep='\s+', header=None)

In [3]: print(autos.head(6))

In [4]: autos.info()

Out: <class 'pandas.core.frame.DataFrame'>

RangeIndex: 398 entries, 0 to 397
Data columns (total 9 columns):

Dept. of MCA, PESCE, Mandya Page 5


ML and Data Analytics using Python LAB MANUAL (P24MCA21)

 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   0       398 non-null    float64
 1   1       398 non-null    int64
 2   2       398 non-null    float64
 3   3       398 non-null    object
 4   4       398 non-null    float64
 5   5       398 non-null    float64
 6   6       398 non-null    int64
 7   7       398 non-null    int64
 8   8       398 non-null    object
dtypes: float64(4), int64(3), object(2)
memory usage: 28.1+ KB

In [5]: autos.columns = ['mpg', 'cylinders', 'displacement', 'horsepower', 'weight',
'acceleration', 'year', 'origin', 'name']

In [6]: autos.info()

Out: <class 'pandas.core.frame.DataFrame'>

RangeIndex: 398 entries, 0 to 397
Data columns (total 9 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   mpg           398 non-null    float64
 1   cylinders     398 non-null    int64
 2   displacement  398 non-null    float64
 3   horsepower    398 non-null    object
 4   weight        398 non-null    float64
 5   acceleration  398 non-null    float64
 6   year          398 non-null    int64
 7   origin        398 non-null    int64
 8   name          398 non-null    object

dtypes: float64(4), int64(3), object(2)


memory usage: 28.1+ KB

In[7]: autos.shape

Out: (398, 9)

In[8]: autos.horsepower.unique()


Out: array(['130.0', '165.0', '150.0', '140.0', '198.0', '220.0', '215.0','225.0', '190.0', '170.0', '160.0',
'95.00', '97.00', '85.00', '88.00', '46.00', '87.00', '90.00', '113.0', '200.0', '210.0', '193.0', '?', '100.0',
'105.0', '175.0', '153.0', '180.0', '110.0','72.00', '86.00', '70.00', '76.00', '65.00', '69.00', '60.00','80.00',
'54.00', '208.0', '155.0', '112.0', '92.00', '145.0', '137.0', '158.0', '167.0', '94.00', '107.0', '230.0',
'49.00','75.00', '91.00', '122.0', '67.00', '83.00', '78.00', '52.00', '61.00', '93.00', '148.0', '129.0', '96.00',
'71.00', '98.00', '115.0', '53.00', '81.00', '79.00', '120.0', '152.0', '102.0',
'108.0', '68.00', '58.00', '149.0', '89.00', '63.00', '48.00','66.00', '139.0','103.0', '125.0', '133.0',
'138.0', '135.0', '142.0', '77.00', '62.00', '132.0', '84.00', '64.00', '74.00', '116.0', '82.00'],
dtype=object)

In[9]: autos["horsepower"] = pd.to_numeric(autos["horsepower"], errors='coerce')


autos.info()

Out: <class 'pandas.core.frame.DataFrame'>

RangeIndex: 398 entries, 0 to 397
Data columns (total 9 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   mpg           398 non-null    float64
 1   cylinders     398 non-null    int64
 2   displacement  398 non-null    float64
 3   horsepower    392 non-null    float64
 4   weight        398 non-null    float64
 5   acceleration  398 non-null    float64
 6   year          398 non-null    int64
 7   origin        398 non-null    int64
 8   name          398 non-null    object
dtypes: float64(5), int64(3), object(1)
memory usage: 28.1+ KB

In[10]: autos.describe()

Out:
              mpg   cylinders  displacement  horsepower       weight  acceleration        year      origin
count  398.000000  398.000000    398.000000  392.000000   398.000000    398.000000  398.000000  398.000000
mean    23.514573    5.454774    193.425879  104.469388  2970.424623     15.568090   76.010050    1.572864
std      7.815984    1.701004    104.269838   38.491160   846.841774      2.757689    3.697627    0.802055
min      9.000000    3.000000     68.000000   46.000000  1613.000000      8.000000   70.000000    1.000000
25%     17.500000    4.000000    104.250000   75.000000  2223.750000     13.825000   73.000000    1.000000
50%     23.000000    4.000000    148.500000   93.500000  2803.500000     15.500000   76.000000    1.000000
75%     29.000000    8.000000    262.000000  126.000000  3608.000000     17.175000   79.000000    2.000000
max     46.600000    8.000000    455.000000  230.000000  5140.000000     24.800000   82.000000    3.000000

In[11]: autos[autos.horsepower.isnull()]

Out:

In[12]: val=autos['horsepower'].mean()
print(val)

Out: 104.46938775510205

In[13]: autos['horsepower'].fillna(autos['horsepower'].mean( ), inplace=True)

In[14]: print(autos.head(7))

Out:
mpg cylinders displacement horsepower weight acceleration year \
0 18.0 8 307.0 130.000000 3504.0 12.0 70
1 15.0 8 350.0 165.000000 3693.0 11.5 70
2 18.0 8 318.0 150.000000 3436.0 11.0 70
3 16.0 8 304.0 150.000000 3433.0 12.0 70
4 17.0 8 302.0 140.000000 3449.0 10.5 70
5 15.0 8 429.0 198.000000 4341.0 10.0 70
6 14.0 8 454.0 220.000000 4354.0 9.0 70

In[15]: autos.mpg.describe()

Out: count 398.000000


mean 23.514573
std 7.815984
min 9.000000
25% 17.500000
50% 23.000000
75% 29.000000
max 46.600000
Name: mpg, dtype: float64

In[16]: # So the minimum value is 9 and the maximum is 46, but on average it is 23.51 with a variation (std) of 7.8

sns.distplot(autos['mpg'])
plt.title('Distribution plot for MPG values', fontsize=21)

Out: Text(0.5, 1.0, 'Distribution plot for MPG values')


Analysis: The minimum MPG value is 9 and the maximum is 46, but on average it is 23.51, with a
standard deviation of about 7.8.

In[17]: autos['origin'] = autos.origin.replace([1,2,3],['USA','Europe','Japan'])

In[18]: autos.head()

Out:


In[19]: x=autos['origin']
y=autos['mpg']
fig = plt.figure(figsize=(10, 5))
plt.bar(x, y, color='Purple', width=0.4)
plt.xlabel("Country Name", fontsize=12)
plt.ylabel("MPG", fontsize=12)
plt.title("Average mpg values for different countries", fontsize=20)
plt.show()

Out:

Analysis: Japan has higher MPG values compared to the USA and Europe.
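The bar chart plots individual mpg values rather than averages, so it is worth verifying this claim numerically. A minimal sketch, reusing the autos DataFrame and the renamed columns from the cells above:

# Mean mpg per country of origin
avg_mpg = autos.groupby('origin')['mpg'].mean()
print(avg_mpg)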


2. Write a program to demonstrate Regression analysis with residual plots on a given data set

Name of the Dataset


MCA Salary.csv
Dataset Description


1. YearsExperience: Shows how long he/she has been working.

2. Salary: Indicates how much money he/she earns for their work.
In[1]: import pandas as pd
import numpy as np
import statsmodels.api as sn
from sklearn.model_selection import train_test_split
from sklearn import metrics
import math
import matplotlib.pyplot as plt
import seaborn as sns

In[2]: #mca_sal_df=pd.read_csv(r'D:\Teaching\ML\Codes-Data-Files\Machine Learning (Codes


and Data Files)\Data\MCA Salary.csv')
sal_df=pd.read_csv(r'D:\Teaching\ML\2023\Salary.csv')
sal_df.head(10)

Out:
YearsExperience Salary
0 1.1 39343.0
1 1.3 46205.0
2 1.5 37731.0
3 2.0 43525.0
4 2.2 39891.0
5 2.9 56642.0
6 3.0 60150.0
7 3.2 54445.0
8 3.2 64445.0
9 3.7 57189.0

In[3]: sal_df.shape

Out: (30, 2)

In [4]: sal_df.info()

Out: <class 'pandas.core.frame.DataFrame'>


RangeIndex: 30 entries, 0 to 29
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 YearsExperience 30 non-null float64
1 Salary 30 non-null float64

dtypes: float64(2)
memory usage: 612.0 bytes

In [5]: sal_df.describe()

Out:
YearsExperience Salary
count 30.000000 30.000000
mean 5.313333 76003.000000
std 2.837888 27414.429785
min 1.100000 37731.000000
25% 3.200000 56720.750000
50% 4.700000 65237.000000
75% 7.700000 100544.750000
max 10.500000 122391.000000

In [6]: # Data distribution


plt.title('Salary Distribution Plot')
sns.distplot(sal_df['Salary'])
plt.show()

In [7]: #add constant term of 1 to the dataset


X=sn.add_constant(sal_df["YearsExperience"])
Y=sal_df["Salary"]

In [8]: #split dataset into train and test sets in a 70:30 ratio
train_X,test_X,train_y,test_y=train_test_split(X,Y,train_size=0.7,random_state=100)


In [9]: #fitting the model using the OLS method


sal_lm=sn.OLS(train_y,train_X).fit()

In [10]: #print the estimated parameters


print(sal_lm.params)

Out: const 25202.887786


YearsExperience 9731.203838
dtype: float64

In [11]: #prints the model summary, which contains the information required for diagnosing a regression model
sal_lm.summary()

Out: OLS Regression Results


Dep. Variable: Salary R-squared: 0.949
Model: OLS Adj. R-squared: 0.946
Method: Least Squares F-statistic: 352.9
Date: Sat, 23 Dec 2023 Prob (F- statistic): 9.91e-14
Time: 23:22:55 Log-Likelihood: -211.80
No. Observations: 21 AIC: 427.6
Df Residuals: 19 BIC: 429.7
Df Model: 1
Covariance Type: nonrobust

                     coef    std err        t   P>|t|     [0.025     0.975]

const            2.52e+04   2875.387    8.765   0.000   1.92e+04   3.12e+04
YearsExperience  9731.2038   517.993   18.786   0.000   8647.033   1.08e+04

Omnibus: 1.843 Durbin-Watson: 1.749


Prob(Omnibus): 0.398 Jarque-Bera(JB): 1.106
Skew: 0.219 Prob(JB): 0.575
Kurtosis: 1.964 Cond. No. 12.3

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

In [12]:#takes the X parameter and returns the predicted values


pred_y=sal_lm.predict(test_X)

In [13]:print(pred_y, test_y)

Out :


9 61208.341988
26 117649.324249
28 125434.287320
13 65100.823523
5 53423.378917
12 64127.703139
27 118622.444633
25 112783.722331
6 54396.499301
dtype: float64
9 57189.0
26 116969.0
28 122391.0
13 57081.0
5 56642.0
12 56957.0
27 112635.0
25 105582.0
6 60150.0
Name: Salary, dtype: float64

In [14]: #R squared score on the test set

from sklearn.metrics import r2_score, mean_squared_error
error_score = metrics.r2_score(test_y,pred_y)
print("R squared score:",error_score)

Out: R squared score: 0.9627668685473271

In [15]:# Prediction on test set


sns.regplot(x=test_y, y=pred_y, color = 'Green')
plt.title('Salary vs Experience (Test Set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.legend(['predicted [test] values'], loc='upper left')
plt.show()
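The experiment statement also calls for residual plots. A minimal sketch using the fitted statsmodels model sal_lm from above; for a well-specified linear model the residuals should scatter randomly around zero with no visible pattern:

# Residuals vs fitted values on the training set
plt.scatter(sal_lm.fittedvalues, sal_lm.resid, color='blue')
plt.axhline(y=0, color='red', linestyle='--')
plt.xlabel('Fitted Salary')
plt.ylabel('Residual')
plt.title('Residuals vs Fitted Values (Training Set)')
plt.show()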


3. Write a program to implement the binary logistic Bayesian classifier for a sample training
data set stored as a .CSV file. Compute the accuracy of the classifier, considering few test
data sets

DATASET

pima_indian.csv

In[1]: import pandas as pd


from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn import metrics

In[2]: df = pd.read_csv(r"C:\Users\DEPT\Downloads\pima_indian.csv")
feature_col_names = ['num_preg', 'glucose_conc', 'diastolic_bp', 'thickness', 'insulin', 'bmi',
'diab_pred', 'age']
predicted_class_names = ['diabetes']

In[3]: X = df[feature_col_names].values # these are factors for the prediction


y = df[predicted_class_names].values # this is what we want to predict

In[4]: #splitting the dataset into train and test data


xtrain,xtest,ytrain,ytest=train_test_split(X,y,test_size=0.33)

In[5]: print ('\n The total number of Training Data :',ytrain.shape)


print ('\n The total number of Test Data :',ytest.shape)

out:
The total number of Training Data : (514, 1)

The total number of Test Data : (254, 1)

In[6]: # Training Naive Bayes (NB) classifier on training data.


clf = GaussianNB( ).fit(xtrain,ytrain.ravel( ))
predicted = clf.predict(xtest)

In[7]: #printing Confusion matrix, accuracy, Precision and Recall


print('\n Confusion matrix')
print(metrics.confusion_matrix(ytest,predicted))

Out:
Confusion matrix
[[135 28]
[ 33 58]]
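The experiment asks to compute the accuracy of the classifier; it can be printed directly using the metrics module imported above:

print('Accuracy of the classifier:', metrics.accuracy_score(ytest, predicted))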


In[8]: from sklearn.metrics import confusion_matrix, classification_report


from sklearn.metrics import accuracy_score
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Creates a confusion matrix
cm = confusion_matrix(ytest, predicted)
sns.heatmap(cm, annot=True, fmt='g')
plt.title('Accuracy: {0:.2f}'.format(accuracy_score(ytest, predicted)))
plt.ylabel('Actual label')
plt.xlabel('Predicted label')
plt.show()

In[9]: print('Accuracy Metrics')


print(classification_report(ytest,predicted))

Out:
Accuracy Metrics
precision recall f1-score support

0 0.80 0.83 0.82 163


1 0.67 0.64 0.66 91

accuracy 0.76 254


macro avg 0.74 0.73 0.74 254
weighted avg 0.76 0.76 0.76 254

In[10]:#Prediction for new data set



predictTestData= clf.predict([[6,148,72,35,0,33.6,0.627,50]])
print("Predicted Value for individual Test Data:", predictTestData)

Out:
Predicted Value for individual Test Data: [1]

In[11]: predictTestData1= clf.predict([[1,80,66,29,0,26.6,0.351,31]])


print("Predicted Value for individual Test Data:", predictTestData1)

Out:
Predicted Value for individual Test Data: [0]


4. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set.
Print both correct and wrong predictions

DATA SET

‘iris’ dataset from sklearn

DATASET DESCRIPTION

1. Sepal Length: The length of the sepal (the green leaf-like structure) of the iris flower, measured
in centimeters.
2. Sepal Width: The width of the sepal of the iris flower, measured in centimeters.
3. Petal Length: The length of the petal (the colored leaf-like structure) of the iris flower, measured
in centimeters.
4. Petal Width: The width of the petal of the iris flower, measured in centimeters.
5. Species: The species of the iris plant, which can be one of three types: Setosa, Versicolor, or
Virginica. This feature categorizes the iris flowers into distinct species based on their
characteristics.

In[1]: from sklearn.model_selection import train_test_split


from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn import datasets

In[2]: iris=datasets.load_iris()

In[3]: x = iris.data
y = iris.target

In[4]: print ('sepal-length', 'sepal-width', 'petal-length', 'petal-width')


print(x)

Out: sepal-length sepal-width petal-length petal-width


[[5.1 3.5 1.4 0.2]
[4.9 3. 1.4 0.2]
[4.7 3.2 1.3 0.2]
[4.6 3.1 1.5 0.2]
[5. 3.6 1.4 0.2]
[5.4 3.9 1.7 0.4]
[4.6 3.4 1.4 0.3]
[5. 3.4 1.5 0.2]
.. ..


In[5]: print('class: 0-Iris-Setosa, 1- Iris-Versicolour, 2- Iris-Virginica')


print(y)

Out: class: 0-Iris-Setosa, 1- Iris-Versicolour, 2- Iris-Virginica


[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2]

In[6]: x_train, x_test, y_train, y_test = train_test_split(x,y,test_size=0.3)

In[7]: classifier = KNeighborsClassifier(n_neighbors=5)


classifier.fit(x_train, y_train)

Out: KNeighborsClassifier()

In[8]: y_pred=classifier.predict(x_test)

In[9]: import numpy as np


for i in range(len(x_test)):
    x=x_test[i]
    x_new=np.array([x])
    prediction=classifier.predict(x_new)
    print("TARGET=", y_test[i], iris["target_names"][y_test[i]],
          "PREDICTED=", prediction, iris["target_names"][prediction])
print(classifier.score(x_test,y_test))

Out : TARGET= 1 versicolor PREDICTED= [1] ['versicolor']


TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
.. ..

In[10]: print('Confusion Matrix')


print(confusion_matrix(y_test,y_pred))

Out: Confusion Matrix


[[12 0 0]
[ 0 16 1]
[ 0 0 16]]


In[11]:from sklearn.metrics import confusion_matrix


from sklearn.metrics import accuracy_score
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Creates a confusion matrix
cm = confusion_matrix(y_test, y_pred)
# Transform to df for easier plotting
cm_df = pd.DataFrame(cm, index = ['setosa','versicolor','virginica'],
columns = ['setosa','versicolor','virginica'])
sns.heatmap(cm_df, annot=True)
plt.title('Accuracy: {0:.3f}'.format(accuracy_score(y_test, y_pred)))
plt.ylabel('Actual label')
plt.xlabel('Predicted label')
plt.show()

Out:

In[12]: print('Accuracy Metrics')


print(classification_report(y_test,y_pred))

Out: Accuracy Metrics


precision recall f1-score support

0 1.00 1.00 1.00 12


1 1.00 0.94 0.97 17
2 0.94 1.00 0.97 16

accuracy 0.98 45
macro avg 0.98 0.98 0.98 45
weighted avg 0.98 0.98 0.98 45
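As an optional extension, the effect of the choice of k can be checked by evaluating the classifier for a range of n_neighbors values (a sketch; the exact accuracies vary with the random train/test split):

for k in range(1, 11):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(x_train, y_train)
    print('k =', k, 'accuracy =', knn.score(x_test, y_test))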


5. Write a program to demonstrate the working of the decision tree based ID3 algorithm.
Use an appropriate data set for building the decision tree and apply this knowledge to
classify a new sample

DATA SET

‘iris’ dataset from seaborn library

DATASET DESCRIPTION

1. Sepal Length: The length of the sepal (the green leaf-like structure) of the iris flower, measured
in centimeters.
2. Sepal Width: The width of the sepal of the iris flower, measured in centimeters.
3. Petal Length: The length of the petal (the colored leaf-like structure) of the iris flower, measured
in centimeters.
4. Petal Width: The width of the petal of the iris flower, measured in centimeters.
5. Species: The species of the iris plant, which can be one of three types: Setosa, Versicolor, or
Virginica. This feature categorizes the iris flowers into distinct species based on their
characteristics.

These features collectively describe various physical attributes of iris flowers, which are commonly
used in machine learning tasks for tasks such as classification and clustering.

In [1]: import pandas as pd


import numpy as np
import statsmodels.api as sn
from sklearn.model_selection import train_test_split
from sklearn import metrics
import matplotlib.pyplot as plt
import seaborn as sns

In [2]: iris_df=sns.load_dataset('iris')

In [3]: iris_df.head()

Out:    sepal_length  sepal_width  petal_length  petal_width  species

0                5.1          3.5           1.4          0.2   setosa
1                4.9          3.0           1.4          0.2   setosa
2                4.7          3.2           1.3          0.2   setosa
3                4.6          3.1           1.5          0.2   setosa
4                5.0          3.6           1.4          0.2   setosa


In [4]: iris_df.info()

Out: <class 'pandas.core.frame.DataFrame'>


RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   sepal_length  150 non-null    float64
 1   sepal_width   150 non-null    float64
 2   petal_length  150 non-null    float64
 3   petal_width   150 non-null    float64
 4   species       150 non-null    object
dtypes: float64(4), object(1)
memory usage: 6.0+ KB

In [5]: # Unique Classes in the dataset


iris_df['species'].unique()

Out: array(['setosa', 'versicolor', 'virginica'], dtype=object)

In [6]: iris_df.isnull().sum()

Out: sepal_length    0
     sepal_width     0
     petal_length    0
     petal_width     0
     species         0
     dtype: int64

In [7]: # Replaces the target class values to numerical values (Object to numeric)
iris_df['species']=iris_df['species'].map({'setosa':0,'versicolor':1,'virginica':2})

In[8]: iris_df.head(105)


In [9]: #independent feature and dependent features


X=iris_df.iloc[:,:-1]
y=iris_df['species']
In [10]: X,y

Out: (     sepal_length  sepal_width  petal_length  petal_width
      0             5.1          3.5           1.4          0.2
      1             4.9          3.0           1.4          0.2
      2             4.7          3.2           1.3          0.2
      3             4.6          3.1           1.5          0.2
      4             5.0          3.6           1.4          0.2
      ..            ...          ...           ...          ...
      145           6.7          3.0           5.2          2.3
      146           6.3          2.5           5.0          1.9
      147           6.5          3.0           5.2          2.0
      148           6.2          3.4           5.4          2.3
      149           5.9          3.0           5.1          1.8

[150 rows x 4 columns],

 0      0
 1      0
 2      0
 3      0
 4      0
        ..
145 2
146 2
147 2
148 2
149 2
Name: species, Length: 150, dtype: int64)

In [11]: ### train test split


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [12]: X_train
Out[12]:
sepal_length sepal_width petal_length petal_width
81 5.5 2.4 3.7 1.0
133 6.3 2.8 5.1 1.5
137 6.4 3.1 5.5 1.8
75 6.6 3.0 4.4 1.4
109 7.2 3.6 6.1 2.5

... ... ... ... ...

106 4.9 2.5 4.5 1.7


14 5.8 4.0 1.2 0.2
92 5.8 2.6 4.0 1.2

102 7.1 3.0 5.9 2.1
105 rows × 4 columns

In [13]: y_train
Out: 81 1
133 2
137 2
75 1
109 2
..
71 1
106 2
14 0
92 1
102 2
Name: species, Length: 105, dtype: int64

In [14]: #Model building


from sklearn.tree import DecisionTreeClassifier
## Pre-pruning: restrict the depth of the tree while it is grown
treemodel=DecisionTreeClassifier(max_depth=2)
treemodel.fit(X_train,y_train)
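Note: DecisionTreeClassifier splits on the Gini index by default, whereas ID3 selects attributes by information gain (entropy). A sketch of a tree closer to ID3 (id3_tree is an illustrative name):

# criterion='entropy' approximates ID3's information-gain splitting
id3_tree = DecisionTreeClassifier(criterion='entropy', max_depth=2)
id3_tree.fit(X_train, y_train)

The remaining cells work the same with either criterion.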

In [15]: #prediction
y_pred=treemodel.predict(X_test)
y_pred
Out[15]: array([1, 0, 2, 1, 2, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 1, 2, 1, 1, 2, 0, 2,
0, 2, 2, 2, 2, 2, 0, 0, 0, 0, 1, 0, 0, 2, 1, 0, 0, 0, 2, 1, 1, 0,
0], dtype=int64)

In [16]: from sklearn.metrics import accuracy_score,classification_report


score=accuracy_score(y_pred,y_test)
print(score)

Out: 0.9777777777777777

In [17]: print(classification_report(y_pred,y_test))

Out[17]:              precision    recall  f1-score   support

           0       1.00      1.00      1.00        19
           1       0.92      1.00      0.96        12
           2       1.00      0.93      0.96        14

    accuracy                           0.98        45
   macro avg       0.97      0.98      0.97        45
weighted avg       0.98      0.98      0.98        45

In [18]: from sklearn import tree


plt.figure(figsize=(15,10))
tree.plot_tree(treemodel,filled=True)

Out:

In [19]: # Classify a new sample (the tree was trained on the four iris features)
predictTestData1= treemodel.predict([[5.1, 3.5, 1.4, 0.2]])

print("Predicted Value for individual Test Data:", predictTestData1)


6. Write a program to implement k-Means clustering algorithm to cluster the set of data
stored in .CSV file

DATASET
Income Data.csv
DATASET DESCRIPTION
1. income: income of the individual.
2. age: age of the individual.

In [1]: import pandas as pd


import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
import warnings
warnings.filterwarnings('ignore')

In [2]: income_df=pd.read_csv(r'D:\Teaching\ML\Codes-Data-Files\Machine Learning


(Codes and Data Files)\Data\Income Data.csv')

In [3]: income_df.head()

Out:
income age
0 41100.0 48.75
1 54100.0 28.10
2 47800.0 46.75
3 19100.0 40.25
4 18200.0 35.80

In [4]: income_df.info()

Out: <class 'pandas.core.frame.DataFrame'>


RangeIndex: 300 entries, 0 to 299
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
 0   income  300 non-null    float64
 1   age     300 non-null    float64
dtypes: float64(2)
memory usage: 4.8 KB

In [5]: plt.figure(figsize=(10,6))


plt.scatter(income_df['income'], income_df['age'])
plt.xlabel('Income')
plt.ylabel('Age')
plt.title('Income Data')

Out: Text(0.5, 1.0, 'Income Data')

Analysis: Individuals aged up to about 30 have high incomes, in the range of 50000-60000.
In [6]: cluster_range = range(1, 10)
cluster_errors = [ ]
for num_clusters in cluster_range:
    clusters = KMeans(num_clusters)
    clusters.fit(income_df)
    cluster_errors.append(clusters.inertia_)
plt.figure(figsize=(6,4))
plt.plot(cluster_range, cluster_errors, marker = "o")
plt.title('Elbow method')
plt.xlabel('Number of clusters')
plt.ylabel('Cluster Score')


Out: Text(0, 0.5, 'Cluster Score')

Analysis: The elbow appears at 2, so we take the n_clusters value as 2.
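Since income is several orders of magnitude larger than age, it dominates the Euclidean distances used by k-Means, so the clusters above are driven almost entirely by income. A sketch of clustering on standardized features (StandardScaler is an assumption; it is not used elsewhere in this program):

from sklearn.preprocessing import StandardScaler
# standardize both features so neither dominates the distance computation
scaled = StandardScaler().fit_transform(income_df[['income', 'age']])
scaled_model = KMeans(n_clusters=2, random_state=42).fit(scaled)
print(scaled_model.cluster_centers_)   # centers in standardized units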

In [7]: cluster_errors

Out: [77496243724.64746,
12598951960.688824,
6107696328.700776,
3093566239.1138325,
2208535279.104451,
1468601128.8812134,
1167521998.0943167,
916192175.9564873,
727270333.3059859]

In [8]: clusters_model = KMeans(n_clusters=2, random_state=42)


clusters_model.fit(income_df)

In [9]: pred=clusters_model.predict(income_df)
pred

Out: array([0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0,
0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0,
0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1,
0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,
0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0,
0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1,
0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0,

1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1,
0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1,
1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0,
1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0])

In [10]: clusters_model.labels_

Out: array([0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0,
0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0,
0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1,
0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,
0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0,
0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1,
0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0,
1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1,
0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1,
1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0,
1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0])

In [11]: income_df['cluster'] = pd.DataFrame(pred, columns=['cluster'])

In [12]: income_df.head()

Out:
   income    age  cluster
0  41100.0  48.75        0
1  54100.0  28.10        0
2  47800.0  46.75        0
3  19100.0  40.25        1
4  18200.0  35.80        1

In [13]: import seaborn as sn


sn.lmplot(x="age", y="income", data=income_df, fit_reg=False, hue='cluster');
#plt.legend('lower right')
plt.show()


In [14]: clusters_model.cluster_centers_

Out: array([[4.98601990e+04, 3.80713930e+01],


[1.85808081e+04, 3.92449495e+01]])

In [15]: # Refit with 3 clusters; drop the 'cluster' column added above so it does not leak into the features
clusters_model = KMeans(n_clusters=3, random_state=42)
clusters_model.fit(income_df[['income', 'age']])

In [16]: pred=clusters_model.predict(income_df[['income', 'age']])
pred

Out: array([2, 0, 2, 1, 1, 1, 0, 2, 1, 2, 0, 0, 0, 2, 0, 1, 2, 2, 1, 0, 1, 2,
0, 2, 1, 1, 2, 1, 0, 0, 1, 2, 2, 0, 0, 1, 0, 1, 2, 0, 1, 0, 2, 0,
0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 2, 2, 1, 1, 0, 0, 0, 2, 1, 0, 1,
2, 0, 2, 0, 1, 1, 1, 1, 0, 2, 0, 1, 2, 2, 1, 2, 0, 2, 2, 0, 0, 1,
2, 2, 1, 0, 1, 0, 0, 0, 2, 0, 1, 2, 0, 1, 2, 0, 0, 2, 1, 2, 0, 0,
2, 1, 0, 2, 1, 1, 2, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1,
2, 1, 1, 0, 1, 2, 1, 1, 0, 2, 0, 2, 1, 1, 2, 1, 1, 0, 2, 1, 2, 0,
1, 1, 0, 0, 2, 0, 2, 0, 0, 2, 1, 0, 2, 2, 2, 1, 0, 2, 1, 0, 0, 0,
2, 0, 2, 0, 0, 1, 2, 2, 2, 2, 0, 1, 2, 1, 2, 2, 0, 0, 1, 2, 0, 1,
2, 1, 0, 1, 0, 1, 0, 1, 2, 1, 2, 0, 2, 2, 1, 0, 0, 0, 0, 2, 1, 0,
2, 0, 0, 0, 2, 1, 1, 2, 0, 2, 2, 0, 0, 2, 0, 1, 1, 1, 2, 2, 0, 1,
1, 1, 1, 0, 2, 1, 2, 0, 0, 2, 0, 0, 1, 2, 0, 1, 2, 0, 1, 0, 1, 1,
2, 1, 2, 0, 0, 0, 0, 2, 2, 2, 2, 0, 1, 1, 0, 0, 2, 0, 0, 0, 1, 0,


1, 1, 0, 0, 1, 1, 1, 0, 2, 2, 1, 0, 2, 2])
In [17]: income_df['cluster'] = pd.DataFrame(pred, columns=['cluster'])
income_df.head()

Out:
   income    age  cluster
0  41100.0  48.75        2
1  54100.0  28.10        0
2  47800.0  46.75        2
3  19100.0  40.25        1
4  18200.0  35.80        1

In [18]: import seaborn as sn


sn.lmplot(x="age", y="income", data=income_df, fit_reg=False, hue='cluster');
#plt.legend('lower right')
plt.show()


7. Write a program to implement SVM algorithm to classify the iris data set. Print both
correct and wrong predictions

In[1]: # Import necessary libraries


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score
from matplotlib.colors import ListedColormap
import seaborn as sns

In[2]: # Loading the Iris dataset using scikit-learn’s datasets module. The load_iris() function from
this module loads the well-known Iris dataset
iris=datasets.load_iris( )

In[3]: #Selecting specific features from the Iris dataset.


x=iris.data[:, [2, 3]]
y=iris.target

In[4]: x

out:
array([[1.4, 0.2],
[1.4, 0.2],
[1.3, 0.2],
[1.5, 0.2],
[1.4, 0.2],
[1.7, 0.4],
[1.4, 0.3],
[1.5, 0.2],
.. .. ])

In[5]: y

out:
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,

2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

In[6]: # Creating a Pandas DataFrame (`iris_df`) from the feature matrix `X` and the
target vector `y` obtained from the Iris dataset.
iris_df=pd.DataFrame(x, columns=iris.feature_names[2:])
iris_df['target']=y

In[7]: plt.figure(figsize=(10,6))
plt.scatter(x[y==0,0], x[y==0,1],color='red', marker='o', label='Setosa')
plt.scatter(x[y==1,0], x[y==1,1],color='blue', marker='x', label='Versicolor')
plt.scatter(x[y==2,0], x[y==2,1],color='green', marker='^', label='Virginica')
plt.xlabel('Petal length')
plt.ylabel('Petal width')
plt.legend(loc='upper left')
plt.title('Data Distribution')
plt.show()

Out:

In[8]: # The code is using the train_test_split function from scikit-learn to split the dataset into
training and testing sets
x_train, x_test, y_train, y_test= train_test_split(x, y, test_size=0.3, random_state=42)

# By using the StandardScaler from scikit-learn to standardize the features in the training
and test sets.
sc=StandardScaler()
x_train_std=sc.fit_transform(x_train)
x_test_std=sc.transform(x_test)

In[9]: # By using scikit-learn’s SVC (Support Vector Classification) to create a Support Vector
Machine (SVM) model with a linear kernel
svm_cl=SVC(kernel='linear', C=1.0, random_state=0)
svm_cl.fit(x_train_std, y_train)

Out:
SVC(kernel='linear', random_state=0)

In[10]: # defines a function called plot_dec_region that can be used to visualize
decision boundaries of a classifier
def plot_dec_region(x, y, classifier, test_idx=None, resolution=0.02):
    #setup marker and color map
    markers=('s','x','o','^', 'v')
    colors=('red','blue','lightgreen', 'gray', 'cyan')
    cmap=ListedColormap(colors[:len(np.unique(y))])

    #plot the decision surface
    x1_min, x1_max=x[:, 0].min()-1, x[:, 0].max()+1
    x2_min, x2_max=x[:, 1].min()-1, x[:, 1].max()+1
    xx1, xx2=np.meshgrid(np.arange(x1_min, x1_max, resolution),
                         np.arange(x2_min, x2_max, resolution))
    z=classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    z=z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, z, alpha=0.4, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())

    #plot all samples
    for idx, c1 in enumerate(np.unique(y)):
        plt.scatter(x=x[y==c1, 0], y=x[y==c1, 1], alpha=0.8, c=cmap(idx),
                    marker=markers[idx], label=c1)

In[12]: #combine the standardized feature matrices (X_train_std and X_test_std) and the
corresponding target vectors (y_train and y_test)
x_combine_std=np.vstack((x_train_std, x_test_std))
#combine train and test target values
y_combine=np.hstack((y_train,y_test))

In[13]: # By using the plot_dec_region function to visualize the decision boundaries
of the Support Vector Machine (SVM) classifier (svm_cl) on the combined standardized
feature matrix (x_combine_std) and target vector (y_combine)


#visualizing the decision boundaries


plot_dec_region(x_combine_std, y_combine, classifier=svm_cl)
plt.xlabel('Petal length [Standardized]')
plt.ylabel('Petal width [Standardized]')
plt.legend(loc='upper left')
plt.title('SVM Decision Boundaries')
plt.show()

Out:

In[14]: # Make predictions using the SVM model (svm_cl) on the standardized test data
(x_test_std) and then calculate the confusion matrix.
y_pred=svm_cl.predict(x_test_std)
cm=confusion_matrix(y_test, y_pred)
print("Confusion Matrix\n", cm)
accuracy=accuracy_score(y_test,y_pred)
print("Accuracy:", accuracy)

Out:
Confusion Matrix
[[19 0 0]
[ 0 13 0]
[ 0 0 13]]
Accuracy: 1.0
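The experiment also asks to print both correct and wrong predictions. A minimal sketch over the standardized test set, reusing the names from the cells above:

for i in range(len(x_test_std)):
    pred = svm_cl.predict(x_test_std[i].reshape(1, -1))[0]
    status = 'CORRECT' if pred == y_test[i] else 'WRONG'
    print('TARGET=', iris.target_names[y_test[i]], 'PREDICTED=', iris.target_names[pred], status)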

In[15]: #plotting the confusion matrix


plt.figure(figsize=(8,6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.title('Confusion Matrix')
plt.show()


Out:

8. Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets


Training Examples:

Example   Sleep   Study   Expected % in Exams
1         2       9       92
2         1       5       86
3         3       6       89

Normalize the input:

Example   Sleep              Study              Expected % in Exams
1         2/3 = 0.66666667   9/9 = 1            0.92
2         1/3 = 0.33333333   5/9 = 0.55555556   0.86
3         3/3 = 1            6/9 = 0.66666667   0.89

import numpy as np
X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)
X = X/np.amax(X,axis=0) # maximum of X array longitudinally
y = y/100

#Sigmoid Function
def sigmoid(x):
    return 1/(1 + np.exp(-x))

#Derivative of Sigmoid Function (here x is already the sigmoid output)
def derivatives_sigmoid(x):
    return x * (1 - x)

#Variable initialization
epoch=5000 #Setting training iterations
lr=0.1 #Setting learning rate
inputlayer_neurons = 2 #number of features in data set
hiddenlayer_neurons = 3 #number of neurons in the hidden layer
output_neurons = 1 #number of neurons at output layer

#weight and bias initialization


wh=np.random.uniform(size=(inputlayer_neurons,hiddenlayer_neurons))
bh=np.random.uniform(size=(1,hiddenlayer_neurons))
wout=np.random.uniform(size=(hiddenlayer_neurons,output_neurons))
bout=np.random.uniform(size=(1,output_neurons))

#draws a random range of numbers uniformly of dim x*y


for i in range(epoch):
    #Forward Propagation
    hinp1=np.dot(X,wh)
    hinp=hinp1 + bh
    hlayer_act = sigmoid(hinp)
    outinp1=np.dot(hlayer_act,wout)
    outinp= outinp1 + bout
    output = sigmoid(outinp)
    #Backpropagation
    EO = y-output
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)
    #how much hidden layer weights contributed to the error
    hiddengrad = derivatives_sigmoid(hlayer_act)
    d_hiddenlayer = EH * hiddengrad
    #weight update: dot product of layer activations and deltas, scaled by lr
    #(for simplicity the biases bh and bout are left at their initial values)
    wout += hlayer_act.T.dot(d_output) * lr
    wh += X.T.dot(d_hiddenlayer) * lr

print("Input: \n" + str(X))


print("Actual Output: \n" + str(y))
print("Predicted Output: \n" ,output)

Out:

Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[0.92]
[0.86]
[0.89]]
Predicted Output:
[[0.89417246]
[0.88311751]
[0.89255249]]
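A quick way to quantify how close the predictions are to the targets is the training mean squared error (a sketch; X, y and output come from the program above):

mse = np.mean((y - output) ** 2)   # average squared error per example
print("Training MSE:", mse)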

9. Write simple python programs to understand the Basic Libraries such as Statistics, Math,
Numpy and Scipy

a. Statistics

# Python code to demonstrate the working of mean(), median(), mode()


# importing statistics functions to handle statistical operations
from statistics import mean
from statistics import mode
from statistics import median
from statistics import median_low
from statistics import median_high
from statistics import variance
from statistics import stdev

# List of positive integer numbers


data1 = [20, 30, 40, 20, 50, 50, 70, 90, 50, 10]

# List of floating point values


data2 = [21.4, 51.1, 62.7, 82.9]

# List of a set of negative integers


data3 = [-45, -11, -12, -19, -34]

# List of a set of positive and negative integers


data4 = [-11, -12, -13, -14, 15, 15, 17, 18]

print("DATA SET")
print("Data-set 1", data1)
print("Data-set 1", data2)
print("Data-set 1", data3)
print("Data-set 1", data4)

print("MEAN")
# using mean () to calculate average of list elements
print ("The average of data-set 1 is : %.2f " %(mean(data1)))
print ("The average of data-set 2 is : %.2f " %(mean(data2)))
print ("The average of data-set 3 is : %.2f " %(mean(data3)))
print ("The average of data-set 4 is : %.2f " %(mean(data4)))
print("\n")

print("MODE")
# Printing the mode of the above datasets
print("Mode of data-set 1 is %.2f " %(mode(data1)))
print("Mode of data-set 2 is %.2f " %(mode(data2)))
print("Mode of data-set 3 is %.2f " %(mode(data3)))
print("Mode of data-set 4 is %.2f " %(mode(data4)))
print("\n")

print("MEDIAN")
# Printing the median of above datasets

print("Median of data-set 1 is %.2f " %(median(data1)))


print("Median of data-set 2 is %.2f " %(median(data2)))
print("Median of data-set 3 is %.2f " %(median(data3)))
print("Median of data-set 4 is %.2f " %(median(data4)))
print("\n")

print("LOW and HIGH MEDIAN")


# simple list of a set of integers
sample = [1, 3, 3, 4, 5, 7]
print("Sample Set:", sample)

# Printing the median of the sample set

print("Median of the sample set is %.2f " %(median(sample)))

# Print low median of the data-set


print("Low Median of the set is %.2f " %(median_low(sample)))

# Print high median of the data-set


print("High Median of the set is %.2f " %(median_high(sample)))
print("\n")

print("VARIANCE")
# Print the variance of the data-sets
print("Variance of data-set 1 is %.2f " %(variance(data1)))
print("Variance of data-set 2 is %.2f " %(variance(data2)))
print("Variance of data-set 3 is %.2f " %(variance(data3)))
print("Variance of data-set 4 is %.2f " %(variance(data4)))
print("\n")

print("STANDARD DEVIATION")
# Print the standard deviation of the data-sets
print("The Standard Deviation of data-set 1 is %.2f" % (stdev(data1)))
print("The Standard Deviation of data-set 2 is %.2f" % (stdev(data2)))
print("The Standard Deviation of data-set 3 is %.2f" % (stdev(data3)))
print("The Standard Deviation of data-set 4 is %.2f" % (stdev(data4)))

Output:

DATA SET
Data-set 1 [20, 30, 40, 20, 50, 50, 70, 90, 50, 10]

Data-set 2 [21.4, 51.1, 62.7, 82.9]


Data-set 3 [-45, -11, -12, -19, -34]
Data-set 4 [-11, -12, -13, -14, 15, 15, 17, 18]

MEAN
The average of data-set 1 is : 43.00
The average of data-set 2 is : 54.53
The average of data-set 3 is : -24.20
The average of data-set 4 is : 1.88

MODE
Mode of data-set 1 is 50.00
Mode of data-set 2 is 21.40
Mode of data-set 3 is -45.00
Mode of data-set 4 is 15.00

MEDIAN
Median of data-set 1 is 45.00
Median of data-set 2 is 56.90
Median of data-set 3 is -19.00
Median of data-set 4 is 2.00

LOW and HIGH MEDIAN


Sample Set: [1, 3, 3, 4, 5, 7]
Median of the sample set is 3.50
Low Median of the set is 3.00
High Median of the set is 4.00

VARIANCE
Variance of data-set 1 is 601.11
Variance of data-set 2 is 660.32
Variance of data-set 3 is 219.70
Variance of data-set 4 is 237.84

STANDARD DEVIATION
The Standard Deviation of data-set 1 is 24.52
The Standard Deviation of data-set 2 is 25.70
The Standard Deviation of data-set 3 is 14.82
The Standard Deviation of data-set 4 is 15.42

b. Math

#Calculation of the permutations and the combinations using math and scipy library.
#p = n! / (n - r)!

#c = n! / (r! * (n - r)!)

import math
from scipy.special import perm, comb
n = int(input("Enter value for n:"))
r = int(input("Enter value for r:"))
def permutations_count(n, r):
    return math.factorial(n) // math.factorial(n - r)
def combinations_count(n, r):
    return math.factorial(n) // (math.factorial(n - r) * math.factorial(r))
print("The permutation of", n, "and", r, "is ")
print(permutations_count(n, r))
print("The combination of", n, "and", r, "is ")
print(combinations_count(n, r))

output:

Enter value for n:6

Enter value for r:4

The permutation of 6 and 4 is


360

The combination of 6 and 4 is


15
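The perm and comb functions imported from scipy.special above are otherwise unused; they can cross-check the math-based results (exact=True returns integers rather than floats):

print("scipy perm:", perm(n, r, exact=True))
print("scipy comb:", comb(n, r, exact=True))

For n = 6 and r = 4 this prints 360 and 15, matching the output above.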

c. Numpy

# Python program for matrix multiplication operations


# importing numpy

import numpy as np
Rows1 = int(input("Give the number of rows for matrix1:"))
Columns1 = int(input("Give the number of columns for matrix1:"))
Rows2 = int(input("Give the number of rows for matrix2:"))
Columns2 = int(input("Give the number of columns for matrix2:"))
if (Columns1 != Rows2):
    print("Multiplication not possible....")
else:
    print("Please write the elements of the matrix1 in a single line and separated by a space: ")
    # User will give the entries in a single line
    elements1 = list(map(int, input().split()))
    print("Please write the elements of the matrix2 in a single line and separated by a space: ")
    elements2 = list(map(int, input().split()))
    # Printing the matrices given by the user
    mat1 = np.array(elements1).reshape(Rows1, Columns1)
    print("Matrix 1")
    print(mat1)
    mat2 = np.array(elements2).reshape(Rows2, Columns2)
    print("Matrix 2")
    print(mat2)
    # multiplying the matrices
    print("Product of (mat1, mat2)...")
    print(np.dot(mat1, mat2))
    print()  # prints newline

Output1:

Give the number of rows for matrix1:2


Give the number of columns for matrix1:3

Give the number of rows for matrix2:3


Give the number of columns for matrix2:2

Please write the elements of the matrix1 in a single line and separated by a space:
1 2 3 4 5 6
Please write the elements of the matrix2 in a single line and separated by a space:
2 4 6 8 1 5

Matrix 1
[[1 2 3]
[4 5 6]]

Matrix 2
[[2 4]
 [6 8]
 [1 5]]
Product of (mat1, mat2)...
[[17 35]
 [44 86]]

Output2:

Give the number of rows for matrix1:2


Give the number of columns for matrix1:3

Give the number of rows for matrix2:2


Give the number of columns for matrix2:3

Multiplication not possible....

d. Scipy

#Python program to calculate the determinant, eigenvalues and corresponding
#eigenvectors of a two-dimensional square matrix
from scipy import linalg
import numpy as np
n = int(input("Enter the value for n:"))

#enter value for square matrix


print("Enter matrix elements")
elements = list(map(int, input().split()))
arr = np.array(elements).reshape(n,n)
print("Input Matrix")
print(arr)
print()

#pass values to det() function


Mdet=linalg.det( arr )
print ("Determinant of a matrix is :", Mdet)
print()

#pass value into eig function


eg_val, eg_vect = linalg.eig(arr)
print("Eigen values are")
#get eigenvalues
print(eg_val)
print()

print("Eigen Vectors are")



#get eigenvectors
print(eg_vect)

Output:
Enter the value for n:2
Enter matrix elements
4 7 1 5
Input Matrix
[[4 7]
[1 5]]
Determinant of a matrix is : 12.999999999999998
Eigen values are
[1.8074176+0.j 7.1925824+0.j]

Eigen Vectors are


[[-0.95428251 -0.90983868]
[ 0.29890615 -0.41496214]]
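A sketch to verify the result: each eigenpair should satisfy A·v = λ·v (arr, eg_val and eg_vect come from the program above):

for k in range(len(eg_val)):
    # compare A @ v with lambda * v for the k-th eigenpair
    print("Eigenpair", k, "check:", np.allclose(arr @ eg_vect[:, k], eg_val[k] * eg_vect[:, k]))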
