ML Lab 2024-26 Final
P.E.S. COLLEGE OF ENGINEERING
Mandya-571401, Karnataka
(An Autonomous Institution under Visvesvaraya Technological University,
Belagavi)
Aided by Govt. of Karnataka, Recognized by AICTE, New Delhi.
Phone: 08232-220043, 220120 Extn:213 Fax:08232-222075
II SEMESTER
LAB MANUAL
Machine Learning and Data Analytics using Python
(Integrated course)
Vision of PESCE
PESCE shall be a leading institution imparting quality engineering and management education
developing creative and socially responsible professionals.
Mission of PESCE
Provide state-of-the-art infrastructure, motivate the faculty to be proficient in their field of
specialization and adopt best teaching-learning practices.
Impart engineering and managerial skills through competent and committed faculty using an
outcome-based educational curriculum.
Inculcate professional ethics, leadership qualities and entrepreneurial skills to meet
societal needs.
Promote research, product development and industry-institution interaction.
PEO-2. Exhibit technical and managerial skills to provide solutions for societally acceptable
problems and manage projects.
PEO-3. Excel in profession with effective communication skills, ethical attitude, teamwork
and ability to relate computer applications to a broader societal context.
PO-2. (Problem Analysis): Identify, review, formulate and analyze problems, primarily
focusing on customer requirements, using critical thinking frameworks.
PO-4. (Modern Tool Usage): Select, adapt and apply modern computational tools such as
development of algorithms with an understanding of the limitations including human biases.
PO-6. (Project Management and Finance): Use the principles of project management such as
scheduling, work breakdown structure and be conversant with the principles of Finance for
PO-7. (Ethics): Commit to professional ethics in managing software projects with financial
aspects, learn to use new technologies for cyber security and insulate customers from
malware.
PO-8. (Life-long Learning): Demonstrate change management skills and the ability to learn
and keep up with contemporary technologies and ways of working.
List of Experiments (Sl. No. / Experiments / COs / POs / Blooms Levels):
1. Python programs to show the usage of Python Libraries for ML applications such as Pandas, Matplotlib and Seaborn. Read the training data from a .CSV file.
2. Write a program to demonstrate Regression analysis with residual plots on a given data set.
3. Write a program to implement the binary logistic Bayesian classifier for a sample training data set stored as a .CSV file. Compute the accuracy of the classifier, considering few test data sets.
4. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print both correct and wrong predictions.
5. Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an appropriate data set for building the decision tree and apply this knowledge to classify a new sample.
6. Write a program to implement k-Means clustering algorithm to cluster the set of data stored in .CSV file.
7. Write a program to implement SVM algorithm to classify the iris data set. Print both correct and wrong predictions.
8. Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets.
9. Write a program to compute summary statistics such as mean, median, mode, standard deviation and variance of the given different types of data.
1. Python programs to show the usage of Python Libraries for ML applications such as Pandas,
Matplotlib and Seaborn. Read the training data from a .CSV file
Dataset Description
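The import and file-reading cells are not reproduced in this extract. A minimal sketch of what the cells below assume, with a hypothetical file name auto-mpg.csv:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

autos = pd.read_csv('auto-mpg.csv')  # file name assumed; columns as used below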
In [4]: autos.info()
In [6]: autos.info()
In[7]: autos.shape
Out: (398, 9)
In[8]: autos.horsepower.unique()
Out: array(['130.0', '165.0', '150.0', '140.0', '198.0', '220.0', '215.0','225.0', '190.0', '170.0', '160.0',
'95.00', '97.00', '85.00', '88.00', '46.00', '87.00', '90.00', '113.0', '200.0', '210.0', '193.0', '?', '100.0',
'105.0', '175.0', '153.0', '180.0', '110.0','72.00', '86.00', '70.00', '76.00', '65.00', '69.00', '60.00','80.00',
'54.00', '208.0', '155.0', '112.0', '92.00', '145.0', '137.0', '158.0', '167.0', '94.00', '107.0', '230.0',
'49.00','75.00', '91.00', '122.0', '67.00', '83.00', '78.00', '52.00', '61.00', '93.00', '148.0', '129.0', '96.00',
'71.00', '98.00', '115.0', '53.00', '81.00', '79.00', '120.0', '152.0', '102.0',
'108.0', '68.00', '58.00', '149.0', '89.00', '63.00', '48.00','66.00', '139.0','103.0', '125.0', '133.0',
'138.0', '135.0', '142.0', '77.00', '62.00', '132.0', '84.00', '64.00', '74.00', '116.0', '82.00'],
dtype=object)
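The '?' entries above make horsepower an object column, yet describe() below reports numeric statistics for it (count 392 instead of 398). The conversion cell is missing; a sketch of the presumed step:
autos['horsepower'] = pd.to_numeric(autos['horsepower'], errors='coerce')  # each '?' becomes NaN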
In[10]: autos.describe()
Out:
              mpg   cylinders  displacement  horsepower       weight  acceleration        year      origin
count  398.000000  398.000000    398.000000  392.000000   398.000000    398.000000  398.000000  398.000000
mean    23.514573    5.454774    193.425879  104.469388  2970.424623     15.568090   76.010050    1.572864
50%     23.000000    4.000000    148.500000   93.500000  2803.500000     15.500000   76.000000    1.000000
75%     29.000000    8.000000    262.000000  126.000000  3608.000000     17.175000   79.000000    2.000000
In[11]: autos[autos.horsepower.isnull()]
Out:
In[12]: val=autos['horsepower'].mean()
print(val)
Out: 104.46938775510205
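The next cell is not reproduced; the float-formatted horsepower values in head(7) below suggest the missing values were imputed with the mean computed above, e.g.:
autos['horsepower'] = autos['horsepower'].fillna(val)  # mean imputation assumed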
In[14]: print(autos.head(7))
Out:
mpg cylinders displacement horsepower weight acceleration year \
0 18.0 8 307.0 130.000000 3504.0 12.0 70
1 15.0 8 350.0 165.000000 3693.0 11.5 70
2 18.0 8 318.0 150.000000 3436.0 11.0 70
3 16.0 8 304.0 150.000000 3433.0 12.0 70
4 17.0 8 302.0 140.000000 3449.0 10.5 70
5 15.0 8 429.0 198.000000 4341.0 10.0 70
6 14.0 8 454.0 220.000000 4354.0 9.0 70
In[15]: autos.mpg.describe()
In[16]: # So the minimum value is 9 and the maximum is about 46, but on average it is 23.51, with considerable variation
sns.distplot(autos['mpg'])
plt.title('Distribution plot for MPG values', fontsize=21)
In[18]: autos.head()
Out:
In[19]: x=autos['origin']
y=autos['mpg']
fig = plt.figure(figsize=(10, 5))
plt.bar(x, y, color='Purple', width=0.4)
plt.xlabel("Country Name", fontsize=12)
plt.ylabel("MPG", fontsize=12)
plt.title("Average mpg values for different countries", fontsize=20)
plt.show()
Out:
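Note that plt.bar above draws one bar per row; to plot true per-origin averages, as the title suggests, one could aggregate first. A sketch:
avg_mpg = autos.groupby('origin')['mpg'].mean()  # mean mpg per origin code
plt.bar(avg_mpg.index, avg_mpg.values, color='Purple', width=0.4)
plt.xlabel("Origin")
plt.ylabel("Average MPG")
plt.show()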
2. Write a program to demonstrate Regression analysis with residual plots on a given data set
DATASET
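The loading cell for this dataset is missing from this extract; a minimal sketch, assuming the conventional file name Salary_Data.csv (hypothetical):
import pandas as pd
sal_df = pd.read_csv('Salary_Data.csv')  # file name assumed
sal_df.head(10)
Out: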
YearsExperience Salary
0 1.1 39343.0
1 1.3 46205.0
2 1.5 37731.0
3 2.0 43525.0
4 2.2 39891.0
5 2.9 56642.0
6 3.0 60150.0
7 3.2 54445.0
8 3.2 64445.0
9 3.7 57189.0
In[3]: sal_df.shape
Out: (30, 2)
In [4]: sal_df.info()
In [5]: sal_df.describe()
Out:
YearsExperience Salary
count 30.000000 30.000000
mean 5.313333 76003.000000
std 2.837888 27414.429785
min 1.100000 37731.000000
25% 3.200000 56720.750000
50% 4.700000 65237.000000
75% 7.700000 100544.750000
max 10.500000 122391.000000
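The cells defining the feature and target are not reproduced; given the split call in In [8] below, presumably:
from sklearn.model_selection import train_test_split
X = sal_df['YearsExperience']
Y = sal_df['Salary']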
In [8]: # split the dataset into train and test sets in a 70:30 ratio
train_X,test_X,train_y,test_y=train_test_split(X,Y,train_size=0.7,random_state=100)
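The model-fitting cell is also missing. Since In [11] calls sal_lm.summary(), an ordinary least squares fit with statsmodels is assumed; a sketch:
import statsmodels.api as sm
train_X_const = sm.add_constant(train_X)  # add the intercept term
sal_lm = sm.OLS(train_y, train_X_const).fit()
pred_y = sal_lm.predict(sm.add_constant(test_X))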
In [11]: # print the model summary, which contains the information required for diagnosing a regression model
sal_lm.summary()
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
In [13]:print(pred_y, test_y)
Out :
9 61208.341988
26 117649.324249
28 125434.287320
13 65100.823523
5 53423.378917
12 64127.703139
27 118622.444633
25 112783.722331
6 54396.499301
dtype: float64
9 57189.0
26 116969.0
28 122391.0
13 57081.0
5 56642.0
12 56957.0
27 112635.0
25 105582.0
6 60150.0
Name: Salary, dtype: float64
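The residual plot the experiment asks for is not reproduced in this extract; a minimal sketch using the fitted model above:
import matplotlib.pyplot as plt
residuals = train_y - sal_lm.fittedvalues  # residual = observed minus fitted
plt.scatter(sal_lm.fittedvalues, residuals)
plt.axhline(y=0, color='red', linestyle='--')
plt.xlabel('Fitted values')
plt.ylabel('Residuals')
plt.title('Residual plot')
plt.show()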
3. Write a program to implement the binary logistic Bayesian classifier for a sample training
data set stored as a .CSV file. Compute the accuracy of the classifier, considering few test
data sets
DATASET
pima_indian.csv
In[2]: df = pd.read_csv(r"C:\Users\DEPT\Downloads\pima_indian.csv")
feature_col_names = ['num_preg', 'glucose_conc', 'diastolic_bp', 'thickness', 'insulin', 'bmi',
'diab_pred', 'age']
predicted_class_names = ['diabetes']
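The cells that split the data and train the classifier are missing from this extract. A sketch consistent with the outputs below, assuming a Gaussian naive Bayes model and roughly a 67:33 split (inferred from the 514 training labels and the 254-sample confusion matrix):
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
X = df[feature_col_names].values
y = df[predicted_class_names].values
xtrain, xtest, ytrain, ytest = train_test_split(X, y, test_size=0.33)  # split ratio inferred
print('The total number of Training Data :', ytrain.shape)
clf = GaussianNB()
clf.fit(xtrain, ytrain.ravel())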
Out:
The total number of Training Data : (514, 1)
Out:
Confusion matrix
[[135 28]
[ 33 58]]
Out:
Accuracy Metrics
precision recall f1-score support
predictTestData= clf.predict([[6,148,72,35,0,33.6,0.627,50]])
print("Predicted Value for individual Test Data:", predictTestData)
Out:
Predicted Value for individual Test Data: [1]
Out:
Predicted Value for individual Test Data: [0]
4. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set.
Print both correct and wrong predictions
DATA SET
DATASET DESCRIPTION
1. Sepal Length: The length of the sepal (the green leaf-like structure) of the iris flower, measured
in centimeters.
2. Sepal Width: The width of the sepal of the iris flower, measured in centimeters.
3. Petal Length: The length of the petal (the colored leaf-like structure) of the iris flower, measured
in centimeters.
4. Petal Width: The width of the petal of the iris flower, measured in centimeters.
5. Species: The species of the iris plant, which can be one of three types: Setosa, Versicolor, or
Virginica. This feature categorizes the iris flowers into distinct species based on their
characteristics.
In[2]: iris=datasets.load_iris()
In[3]: x = iris.data
y = iris.target
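Earlier cells (e.g. from sklearn import datasets) and the split/training cells are not reproduced. A sketch consistent with the outputs below (the support of 45 in the report implies a 70:30 split of the 150 samples):
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)  # 45 test samples
classifier = KNeighborsClassifier()  # default k = 5
classifier.fit(x_train, y_train)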
Out: KNeighborsClassifier()
In[8]: y_pred=classifier.predict(x_test)
Out:
accuracy 0.98 45
macro avg 0.98 0.98 0.98 45
weighted avg 0.98 0.98 0.98 45
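The experiment also asks to print both correct and wrong predictions; that cell is missing here. A minimal sketch using the variables above:
# compare each prediction against the true label
for actual, predicted in zip(y_test, y_pred):
    result = "Correct" if actual == predicted else "Wrong"
    print(result, "- actual:", iris.target_names[actual], "predicted:", iris.target_names[predicted])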
5. Write a program to demonstrate the working of the decision tree based ID3 algorithm.
Use an appropriate data set for building the decision tree and apply this knowledge to
classify a new sample
DATA SET
DATASET DESCRIPTION
1. Sepal Length: The length of the sepal (the green leaf-like structure) of the iris flower, measured
in centimeters.
2. Sepal Width: The width of the sepal of the iris flower, measured in centimeters.
3. Petal Length: The length of the petal (the colored leaf-like structure) of the iris flower, measured
in centimeters.
4. Petal Width: The width of the petal of the iris flower, measured in centimeters.
5. Species: The species of the iris plant, which can be one of three types: Setosa, Versicolor, or
Virginica. This feature categorizes the iris flowers into distinct species based on their
characteristics.
These features collectively describe the physical attributes of iris flowers and are commonly
used in machine learning for tasks such as classification and clustering.
In [2]: iris_df=sns.load_dataset('iris')
In [3]: iris_df.head()
In [4]: iris_df.info()
In [6]: iris_df.isnull().sum()
In [7]: # Replaces the target class values to numerical values (Object to numeric)
iris_df['species']=iris_df['species'].map({'setosa':0,'versicolor':1,'virginica':2})
In[8]: iris_df['species']
Out:
0      0
1      0
2      0
3      0
4      0
      ..
145    2
146    2
147    2
148    2
149    2
Name: species, Length: 150, dtype: int64
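The split and model-building cells are missing. A sketch consistent with the 105/45 lengths shown below; scikit-learn's DecisionTreeClassifier with criterion='entropy' approximates the ID3 information-gain criterion:
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
X = iris_df.drop('species', axis=1)
y = iris_df['species']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)  # 105 train / 45 test
treemodel = DecisionTreeClassifier(criterion='entropy')  # entropy-based splits, ID3-style
treemodel.fit(X_train, y_train)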
In [12]: X_train
Out[12]:
sepal_length sepal_width petal_length petal_width
81 5.5 2.4 3.7 1.0
133 6.3 2.8 5.1 1.5
137 6.4 3.1 5.5 1.8
75 6.6 3.0 4.4 1.4
109 7.2 3.6 6.1 2.5
In [13]: y_train
Out: 81 1
133 2
137 2
75 1
109 2
..
71 1
106 2
14 0
92 1
102 2
Name: species, Length: 105, dtype: int64
In [15]: # prediction
y_pred = treemodel.predict(X_test)
y_pred
Out[15]: array([1, 0, 2, 1, 2, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 1, 2, 1, 1, 2, 0, 2,
0, 2, 2, 2, 2, 2, 0, 0, 0, 0, 1, 0, 0, 2, 1, 0, 0, 0, 2, 1, 1, 0,
0], dtype=int64)
Out: 0.9777777777777777
In [17]: print(classification_report(y_test, y_pred))
Out:
6. Write a program to implement k-Means clustering algorithm to cluster the set of data
stored in .CSV file
DATASET
Income Data.csv
DATASET DESCRIPTION
1. income: income of the individual.
2. age: age of the individual.
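The import and loading cells are not reproduced; a minimal sketch assuming the file named above:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

income_df = pd.read_csv('Income Data.csv')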
In [3]: income_df.head()
Out:
income age
0 41100.0 48.75
1 54100.0 28.10
2 47800.0 46.75
3 19100.0 40.25
4 18200.0 35.80
In [4]: income_df.info()
In [5]: plt.figure(figsize=(10,6))
plt.scatter(income_df['income'], income_df['age'])
plt.xlabel('Income')
plt.ylabel('Age')
plt.title('Income Data')
Analysis: Individuals aged up to about 30 show the highest incomes, roughly in the range 50000-60000.
In [6]: cluster_range = range(1, 10)
cluster_errors = []
for num_clusters in cluster_range:
    clusters = KMeans(n_clusters=num_clusters)
    clusters.fit(income_df)
    cluster_errors.append(clusters.inertia_)  # inertia_: within-cluster sum of squared distances
plt.figure(figsize=(6,4))
plt.plot(cluster_range, cluster_errors, marker="o")
plt.title('Elbow method')
plt.xlabel('Number of clusters')
plt.ylabel('Cluster Score')
In [7]: cluster_errors
Out: [77496243724.64746,
12598951960.688824,
6107696328.700776,
3093566239.1138325,
2208535279.104451,
1468601128.8812134,
1167521998.0943167,
916192175.9564873,
727270333.3059859]
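In [8], which fits the final model, is missing; the labels below take only the values 0 and 1, so a two-cluster fit is assumed:
clusters_model = KMeans(n_clusters=2)  # two clusters inferred from the labels below
clusters_model.fit(income_df)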
In [9]: pred=clusters_model.predict(income_df)
pred
Out: array([0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0,
0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0,
0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1,
0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,
0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0,
0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1,
0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0,
1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1,
0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1,
1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0,
1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0])
In [10]: clusters_model.labels_
Out: array([0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0,
0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0,
0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1,
0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,
0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0,
0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1,
0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0,
1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1,
0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1,
1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0,
1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0])
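The cell that attaches these labels to the data frame (In [11]) is not reproduced; presumably:
income_df['cluster'] = clusters_model.labels_  # one label per row, as shown in head() below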
In [12]: income_df.head()
Out:
income age cluster
0 41100.0 48.75 0
1 54100.0 28.10 0
2 47800.0 46.75 0
3 19100.0 40.25 1
4 18200.0 35.80 1
In [14]: clusters_model.cluster_centers_
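In [15] is missing; the predictions below contain three distinct labels, which implies the model was refit with three clusters, e.g.:
clusters_model = KMeans(n_clusters=3)  # three clusters inferred from the labels below
clusters_model.fit(income_df)  # note: income_df now also carries the earlier 'cluster' column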
In [16]: pred=clusters_model.predict(income_df)
pred
Out: array([2, 0, 2, 1, 1, 1, 0, 2, 1, 2, 0, 0, 0, 2, 0, 1, 2, 2, 1, 0, 1, 2,
0, 2, 1, 1, 2, 1, 0, 0, 1, 2, 2, 0, 0, 1, 0, 1, 2, 0, 1, 0, 2, 0,
0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 2, 2, 1, 1, 0, 0, 0, 2, 1, 0, 1,
2, 0, 2, 0, 1, 1, 1, 1, 0, 2, 0, 1, 2, 2, 1, 2, 0, 2, 2, 0, 0, 1,
2, 2, 1, 0, 1, 0, 0, 0, 2, 0, 1, 2, 0, 1, 2, 0, 0, 2, 1, 2, 0, 0,
2, 1, 0, 2, 1, 1, 2, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1,
2, 1, 1, 0, 1, 2, 1, 1, 0, 2, 0, 2, 1, 1, 2, 1, 1, 0, 2, 1, 2, 0,
1, 1, 0, 0, 2, 0, 2, 0, 0, 2, 1, 0, 2, 2, 2, 1, 0, 2, 1, 0, 0, 0,
2, 0, 2, 0, 0, 1, 2, 2, 2, 2, 0, 1, 2, 1, 2, 2, 0, 0, 1, 2, 0, 1,
2, 1, 0, 1, 0, 1, 0, 1, 2, 1, 2, 0, 2, 2, 1, 0, 0, 0, 0, 2, 1, 0,
2, 0, 0, 0, 2, 1, 1, 2, 0, 2, 2, 0, 0, 2, 0, 1, 1, 1, 2, 2, 0, 1,
1, 1, 1, 0, 2, 1, 2, 0, 0, 2, 0, 0, 1, 2, 0, 1, 2, 0, 1, 0, 1, 1,
2, 1, 2, 0, 0, 0, 0, 2, 2, 2, 2, 0, 1, 1, 0, 0, 2, 0, 0, 0, 1, 0,
1, 1, 0, 0, 1, 1, 1, 0, 2, 2, 1, 0, 2, 2])
In [17]: income_df['cluster'] = pd.DataFrame(pred, columns=['cluster'])
income_df.head()
Out:
income age cluster
0 41100.0 48.75 2
1 54100.0 28.10 0
2 47800.0 46.75 2
3 19100.0 40.25 1
4 18200.0 35.80 1
7. Write a program to implement SVM algorithm to classify the iris data set. Print both
correct and wrong predictions
In[2]: # Loading the Iris dataset using scikit-learn's datasets module. The load_iris()
# function from this module loads the well-known Iris dataset
iris=datasets.load_iris( )
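In[3] is not reproduced; the two-column feature array and the use of iris.feature_names[2:] below imply that only the petal measurements were selected, e.g.:
x = iris.data[:, 2:]  # petal length and petal width only
y = iris.target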
In[4]: x
Out:
array([[1.4, 0.2],
[1.4, 0.2],
[1.3, 0.2],
[1.5, 0.2],
[1.4, 0.2],
[1.7, 0.4],
[1.4, 0.3],
[1.5, 0.2],
.. .. ])
In[5]: y
Out:
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
In[6]: # Creating a Pandas DataFrame (iris_df) from the feature matrix x and the
# target vector y obtained from the Iris dataset.
iris_df=pd.DataFrame(x, columns=iris.feature_names[2:])
iris_df['target']=y
In[7]: plt.figure(figsize=(10,6))
plt.scatter(x[y==0,0], x[y==0,1],color='red', marker='o', label='Setosa')
plt.scatter(x[y==1,0], x[y==1,1],color='blue', marker='x', label='Versicolor')
plt.scatter(x[y==2,0], x[y==2,1],color='green', marker='^', label='Virginica')
plt.xlabel('Petal length')
plt.ylabel('Petal width')
plt.legend(loc='upper left')
plt.title('Data Distribution')
plt.show()
Out:
In[8]: # Use the train_test_split function from scikit-learn to split the dataset into
# training and testing sets
x_train, x_test, y_train, y_test= train_test_split(x, y, test_size=0.3,
random_state=42)
# Use the StandardScaler from scikit-learn to standardize the features in the
# training and test sets.
sc=StandardScaler()
x_train_std=sc.fit_transform(x_train)
x_test_std=sc.transform(x_test)
In[9]: # Use scikit-learn's SVC (Support Vector Classification) to create a Support Vector
# Machine (SVM) model with a linear kernel
svm_cl=SVC(kernel='linear', C=1.0, random_state=0)
svm_cl.fit(x_train_std, y_train)
Out:
SVC(kernel='linear', random_state=0)
In[12]: # combine the standardized feature matrices (x_train_std and x_test_std) and the
# corresponding target vectors (y_train and y_test)
x_combine_std=np.vstack((x_train_std, x_test_std))
#combine train and test target values
y_combine=np.hstack((y_train,y_test))
In[14]: # Make predictions using the SVM model (svm_cl) on the standardized test data
# (x_test_std) and then calculate the confusion matrix.
y_pred=svm_cl.predict(x_test_std)
cm=confusion_matrix(y_test, y_pred)
print("Confusion Matrix\n", cm)
accuracy=accuracy_score(y_test,y_pred)
print("Accuracy:", accuracy)
Out:
Confusion Matrix
[[19 0 0]
[ 0 13 0]
[ 0 0 13]]
Accuracy: 1.0
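The experiment also asks to print both correct and wrong predictions; that cell is not reproduced here. A minimal sketch using the variables above:
# compare each prediction against the true label
for actual, predicted in zip(y_test, y_pred):
    result = "Correct" if actual == predicted else "Wrong"
    print(result, "- actual:", actual, "predicted:", predicted)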
8. Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets
Training Examples:
Example | Sleep | Study | Expected % in Exams
1       | 2     | 9     | 92
2       | 1     | 5     | 86
3       | 3     | 6     | 89
Normalized training examples (each input divided by its column maximum, expected output divided by 100):
Example | Sleep            | Study              | Expected % in Exams
1       | 2/3 = 0.66666667 | 9/9 = 1            | 0.92
2       | 1/3 = 0.33333333 | 5/9 = 0.55555556   | 0.86
3       | 3/3 = 1          | 6/9 = 0.66666667   | 0.89
import numpy as np
X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)
X = X/np.amax(X,axis=0) # normalize each input column by its maximum
y = y/100

#Sigmoid Function
def sigmoid(x):
    return 1/(1 + np.exp(-x))

#Derivative of sigmoid (x is already the sigmoid activation)
def derivatives_sigmoid(x):
    return x * (1 - x)

#Variable initialization
epoch=5000 #Setting training iterations
lr=0.1 #Setting learning rate
inputlayer_neurons = 2 #number of features in data set
hiddenlayer_neurons = 3 #number of hidden layer neurons
output_neurons = 1 #number of neurons at output layer

#Weight and bias initialization (random initialization assumed; this cell is
#not reproduced in the original listing)
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))

for i in range(epoch):
    #Forward Propagation
    hinp1 = np.dot(X, wh)
    hinp = hinp1 + bh
    hlayer_act = sigmoid(hinp)
    outinp1 = np.dot(hlayer_act, wout)
    outinp = outinp1 + bout
    output = sigmoid(outinp)
    #Backpropagation
    EO = y - output
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)
    #how much hidden layer weights contributed to the error
    hiddengrad = derivatives_sigmoid(hlayer_act)
    d_hiddenlayer = EH * hiddengrad
    #dot product of next-layer error and current-layer output
    wout += hlayer_act.T.dot(d_output) * lr
    wh += X.T.dot(d_hiddenlayer) * lr

print("Input:\n" + str(X))
print("Actual Output:\n" + str(y))
print("Predicted Output:\n", output)
Out:
Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[0.92]
[0.86]
[0.89]]
Predicted Output:
[[0.89417246]
[0.88311751]
[0.89255249]]
9. Write simple python programs to understand the Basic Libraries such as Statistics, Math,
Numpy and Scipy
a. Statistics
print("DATA SET")
print("Data-set 1", data1)
print("Data-set 1", data2)
print("Data-set 1", data3)
print("Data-set 1", data4)
print("MEAN")
# using mean () to calculate average of list elements
print ("The average of data-set 1 is : %.2f " %(mean(data1)))
print ("The average of data-set 2 is : %.2f " %(mean(data2)))
print ("The average of data-set 3 is : %.2f " %(mean(data3)))
print ("The average of data-set 4 is : %.2f " %(mean(data4)))
print("\n")
print("MODE")
# Printing the mode of above datasets
print("Mode of data-set 1 is %.2f " %(mode(data1)))
print("Mode of data-set 2 is %.2f " %(mode(data2)))
print("Mode of data-set 3 is %.2f " %(mode(data3)))
print("Mode of data-set 4 is %.2f " %(mode(data4)))
print("\n")
print("MEDIAN")
# Printing the median of above datasets
Dept. of MCA, PESCE, Mandya Page 40
ML and Data Analytics using Python LAB MANUAL (P24MCA21)
print("VARIANCE")
# Print the variance of the data-sets
print("Variance of data-set 1 is %.2f " %(variance(data1)))
print("Variance of data-set 2 is %.2f " %(variance(data2)))
print("Variance of data-set 3 is %.2f " %(variance(data3)))
print("Variance of data-set 4 is %.2f " %(variance(data4)))
print("\n")
print("STANDARD DEVIATION")
# Print the standard deviation of the data-sets
print("The Standard Deviation of data-set 1 is %.2f" % (stdev(data1)))
print("The Standard Deviation of data-set 2 is %.2f" % (stdev(data2)))
print("The Standard Deviation of data-set 3 is %.2f" % (stdev(data3)))
print("The Standard Deviation of data-set 4 is %.2f" % (stdev(data4)))
Output:
DATA SET
Data-set 1 [20, 30, 40, 20, 50, 50, 70, 90, 50, 10]
MEAN
The average of data-set 1 is : 43.00
The average of data-set 2 is : 54.53
The average of data-set 3 is : -24.20
The average of data-set 4 is : 1.88
MODE
Mode of data-set 1 is 50.00
Mode of data-set 2 is 21.40
Mode of data-set 3 is -45.00
Mode of data-set 4 is 15.00
MEDIAN
Median of data-set 1 is 45.00
Median of data-set 2 is 56.90
Median of data-set 3 is -19.00
Median of data-set 4 is 2.00
VARIANCE
Variance of data-set 1 is 601.11
Variance of data-set 2 is 660.32
Variance of data-set 3 is 219.70
Variance of data-set 4 is 237.84
STANDARD DEVIATION
The Standard Deviation of data-set 1 is 24.52
The Standard Deviation of data-set 2 is 25.70
The Standard Deviation of data-set 3 is 14.82
The Standard Deviation of data-set 4 is 15.42
b. Math
#Calculation of the permutations and the combinations using math and scipy library.
#p = n! / (n - r)!
#c = n! / (r! * (n - r)!)
import math
from scipy.special import perm, comb
n = int(input("Enter value for n:"))
r = int(input("Enter value for r:"))
def permutations_count(n, r):
    return math.factorial(n) // math.factorial(n - r)

def combinations_count(n, r):
    return math.factorial(n) // (math.factorial(n - r) * math.factorial(r))
print("The permutation of", n, "and", r, "is ")
print(permutations_count(n, r))
print("The combination of", n, "and", r, "is ")
print(combinations_count(n, r))
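The perm and comb imports from scipy.special are otherwise unused in this listing; a short cross-check consistent with the comment at the top (exact=True returns integer results):
print("scipy permutation check:", perm(n, r, exact=True))
print("scipy combination check:", comb(n, r, exact=True))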
Output:
c. Numpy
import numpy as np
Rows1 = int(input("Give the number of rows for matrix1:"))
Columns1 = int(input("Give the number of columns for matrix1:"))
Rows2 = int(input("Give the number of rows for matrix2:"))
Columns2 = int(input("Give the number of columns for matrix2:"))
if (Columns1 != Rows2):
    print("Multiplication not possible....")
else:
    print("Please write the elements of the matrix1 in a single line and separated by a space: ")
    # User will give the entries in a single line
    elements1 = list(map(int, input().split()))
    print("Please write the elements of the matrix2 in a single line and separated by a space: ")
    elements2 = list(map(int, input().split()))
    # Printing the matrices given by the user
    mat1 = np.array(elements1).reshape(Rows1, Columns1)
    print("Matrix 1")
    print(mat1)
    mat2 = np.array(elements2).reshape(Rows2, Columns2)
    print("Matrix 2")
    print(mat2)
    # multiplying the matrices
    print("Product of(mat1,mat2)...")
    print(np.dot(mat1, mat2))
    print()  # prints newline
Output1:
Please write the elements of the matrix1 in a single line and separated by a space:
1 2 3 4 5 6
Please write the elements of the matrix2 in a single line and separated by a space:
2 4 6 8 1 5
Matrix 1
[[1 2 3]
 [4 5 6]]
Matrix 2
[[2 4]
 [6 8]
 [1 5]]
Product of(mat1,mat2)...
[[17 35]
 [44 86]]
Output2:
d. Scipy
#get eigenvectors
print(eg_vect)
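Only the final two lines of this listing survive in the extract. A fuller sketch, reconstructed to be consistent with the prompts and output below (the input-reading scheme is assumed):
import numpy as np
from scipy import linalg

n = int(input("Enter the value for n:"))
print("Enter matrix elements")
# read n*n integers in a single line and reshape into an n x n matrix
elements = list(map(int, input().split()))
mat = np.array(elements).reshape(n, n)
print("Input Matrix")
print(mat)
# determinant via scipy.linalg
print("Determinant of a matrix is :", linalg.det(mat))
# eigen decomposition: eigenvalues and eigenvectors
eg_val, eg_vect = linalg.eig(mat)
print("Eigen values are")
print(eg_val)
#get eigenvectors
print(eg_vect)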
Output:
Enter the value for n:2
Enter matrix elements
4 7 1 5
Input Matrix
[[4 7]
 [1 5]]
Determinant of a matrix is : 12.999999999999998
Eigen values are
[1.8074176+0.j 7.1925824+0.j]