8 To 12 Jaimeen
AIM: Apply the EM algorithm to cluster a set of data stored in a .CSV file. Use
the same data set for clustering using the k-Means algorithm.
from sklearn import datasets
from sklearn.cluster import KMeans
from matplotlib import pyplot as plt
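The aim above can be sketched as follows. This is a minimal illustration, not the code used in this practical: it assumes the Iris data from sklearn.datasets as a stand-in for the .CSV file (which is not shown here), uses GaussianMixture as the EM-based clusterer, and compares both clusterings with the known labels through the adjusted Rand index.

```python
from sklearn import datasets
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture

# Assumption: Iris stands in for the CSV data set, which is not shown here.
iris = datasets.load_iris()
X, y = iris.data, iris.target

# k-Means: each point is assigned to its nearest centroid (hard assignment).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
kmeans_labels = kmeans.fit_predict(X)

# EM: GaussianMixture fits a mixture of Gaussians by expectation-maximization.
gmm = GaussianMixture(n_components=3, random_state=42)
gmm_labels = gmm.fit_predict(X)

# The adjusted Rand index compares each clustering with the species labels.
kmeans_ari = adjusted_rand_score(y, kmeans_labels)
gmm_ari = adjusted_rand_score(y, gmm_labels)
print(f"k-Means ARI: {kmeans_ari:.3f}")
print(f"EM (GMM) ARI: {gmm_ari:.3f}")
```

Both models receive the same feature matrix, so any difference in the scores comes from the clustering method rather than from the data.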
ERP:- 210303105045 P a g e | 36
Faculty of Engineering & Technology
Machine Learning Laboratory (203105403)
B. Tech CSE 4th Year 7th Semester
k_values = [2,3,4,5]
wcss_values = []
plt.plot(k_values,wcss_values)
plt.show()
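The loop that fills wcss_values is not visible in the listing above. A minimal sketch of the usual elbow computation is given here, assuming the Iris features as a stand-in dataset and assuming that the within-cluster sum of squares (WCSS) is taken from the inertia_ attribute of a fitted KMeans model.

```python
from sklearn import datasets
from sklearn.cluster import KMeans

# Assumption: Iris features stand in for the data used in the practical.
X = datasets.load_iris().data

k_values = [2, 3, 4, 5]
wcss_values = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    # inertia_ is the sum of squared distances of samples to their closest centroid.
    wcss_values.append(model.inertia_)

for k, wcss in zip(k_values, wcss_values):
    print(f"k={k}: WCSS={wcss:.2f}")
```

Plotting k_values against wcss_values then shows the "elbow", the value of k after which adding clusters reduces WCSS only slightly.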
PRACTICAL: 09
patient_data.hist(figsize=(20, 16))
➢ Logistic Regression
# Create a Logistic Regression model
logreg_model = LogisticRegression(random_state=42)
# Perform grid search with cross-validation
logreg_grid_search = GridSearchCV(
    logreg_model, logreg_param_grid, cv=5, scoring="accuracy"
)
logreg_grid_search.fit(X_train, Y_train)
Best Hyperparameters for Logistic Regression: {'C': 0.1, 'penalty': 'l1', 'solver': 'liblinear'}
Best Accuracy of Logistic Regression model: 85.29%
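The definition of logreg_param_grid is not shown in the listing above. The sketch below shows one way such a grid search could be set up end to end. The grid values are hypothetical, and synthetic data from make_classification stands in for the patient dataset, so the printed results will not match the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Assumption: synthetic data stands in for the patient dataset.
X, Y = make_classification(n_samples=300, n_features=8, random_state=42)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=42
)

# Hypothetical grid; the liblinear solver supports both l1 and l2 penalties.
logreg_param_grid = {
    "C": [0.01, 0.1, 1, 10],
    "penalty": ["l1", "l2"],
    "solver": ["liblinear"],
}

logreg_grid_search = GridSearchCV(
    LogisticRegression(random_state=42), logreg_param_grid, cv=5, scoring="accuracy"
)
logreg_grid_search.fit(X_train, Y_train)

# After fitting, the search object predicts with the best estimator found.
test_acc = logreg_grid_search.score(X_test, Y_test)
print("Best hyperparameters:", logreg_grid_search.best_params_)
print("Test accuracy: {:.2f}%".format(test_acc * 100))
```

The cross-validated score selects the hyperparameters, while the held-out test split gives the accuracy that is reported.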
➢ Decision Tree
)
Classification Report:
precision recall f1-score support
➢ Random Forest
Best Hyperparameters for Random Forest: {'max_depth': 10, 'max_features': 'sqrt',
'min_samples_leaf': 1, 'min_samples_split': 2, 'n_estimators': 100, 'random_state': 12}
Best Accuracy of Random Forest: 94.54%
➢ KNN
best_KNN_acc = accuracy_score(Y_test, KNN_predict)
print("Best Accuracy of K-Neighbors Classifier:", "{:.2f}%".format(best_KNN_acc * 100))
Best Hyperparameters for K-Nearest Neighbors: {'algorithm': 'auto', 'n_neighbors': 19, 'weights': 'distance'}
Best Accuracy of K-Neighbors Classifier: 94.12%
➢ SVM
# Create a Support Vector Classifier model
SVM = SVC()
➢ Naive Bayes
print("\nClassification Report:")
nb_cr = classification_report(Y_test, NB_predict)
print(nb_cr)
plt.show()
Classification Report:
precision recall f1-score support
➢ Adaboost
PRACTICAL: 10
➢ Comparison Table
print("Comparison Table:")
print(comparison)
Comparison Table:
Model Accuracy
0 Logistic Regression 85.29%
1 Decision Tree 86.97%
2 Random Forest 94.54%
3 KNN 94.12%
4 SVM 89.92%
5 Naive Bayes 85.71%
6 Adaboost 89.08%
Best Model:
Random Forest: 94.54%
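The code that builds the comparison object is not shown. One possible construction with pandas is sketched below; the accuracies are copied from the table above, while the variable names accuracies and best are illustrative assumptions.

```python
import pandas as pd

# Accuracies in percent, copied from the comparison table above.
accuracies = {
    "Logistic Regression": 85.29,
    "Decision Tree": 86.97,
    "Random Forest": 94.54,
    "KNN": 94.12,
    "SVM": 89.92,
    "Naive Bayes": 85.71,
    "Adaboost": 89.08,
}

comparison = pd.DataFrame(
    {"Model": list(accuracies), "Accuracy": list(accuracies.values())}
)

# idxmax returns the row index of the highest accuracy.
best = comparison.loc[comparison["Accuracy"].idxmax()]

print("Comparison Table:")
print(comparison.assign(Accuracy=comparison["Accuracy"].map("{:.2f}%".format)))
print("Best Model:")
print(f"{best['Model']}: {best['Accuracy']:.2f}%")
```

Keeping the accuracies numeric and formatting them only for display makes it possible to select the best model with idxmax.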
➢ Comparison Bar Plot
num_classifiers = 7
num_rows = (num_classifiers - 1) // 4 + 1
num_cols = min(num_classifiers, 4)
classifiers = [
    ("Logistic Regression", logreg_cm, best_logreg_acc),
    ("Decision Tree", DT_cm, max_dt_acc),
    ("Random Forest", RF_cm, best_RF_acc),
    ("K-Neighbors", KNN_cm, best_KNN_acc),
    ("SVM", SVM_cm, best_SVM_acc),
    ("Naive Bayes", NB_cm, NB_acc_score),
    ("AdaBoost", adaboost_cm, adaboost_acc_score),
]
for (name, cm, acc_score), ax in zip(classifiers, axes.flatten()):
    sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", ax=ax)
plt.tight_layout(rect=[0, 0, 1, 0.96])
plt.show()
f1_scores = {}
recall_scores = {}
precision_scores = {}
):
    f1_scores[name] = [
        cr[label]["f1-score"] for label in cr.keys() if label.isnumeric()
    ]
    recall_scores[name] = [
        cr[label]["recall"] for label in cr.keys() if label.isnumeric()
    ]
    precision_scores[name] = [
        cr[label]["precision"] for label in cr.keys() if label.isnumeric()
    ]
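The per-class extraction above relies on the dictionary form of the classification report. A small self-contained sketch is given here, assuming the reports were created with output_dict=True; the labels and predictions are hypothetical.

```python
from sklearn.metrics import classification_report

# Hypothetical true labels and predictions for a binary problem.
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 1, 1, 0, 1, 0]

# output_dict=True returns nested dictionaries instead of a formatted string.
cr = classification_report(y_true, y_pred, output_dict=True)

# Class keys are strings such as "0" and "1"; the summary keys "accuracy",
# "macro avg" and "weighted avg" are skipped by the isnumeric() filter.
class_labels = [label for label in cr.keys() if label.isnumeric()]
f1_per_class = [cr[label]["f1-score"] for label in class_labels]
recall_per_class = [cr[label]["recall"] for label in class_labels]
precision_per_class = [cr[label]["precision"] for label in class_labels]

print(class_labels)
print(f1_per_class, recall_per_class, precision_per_class)
```

The isnumeric() filter only works when the class labels are integers; string class names would need a different filter.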
PRACTICAL: 11
AIM: Compare various unsupervised learning algorithms using
appropriate datasets.
Theory:
2. Hierarchical Clustering (Agglomerative)
Hierarchical clustering builds a tree-like structure (dendrogram) where each data point starts in its own cluster,
and clusters are iteratively merged based on some linkage criteria (e.g., distance).
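A minimal sketch of agglomerative clustering with scikit-learn is given below. It assumes synthetic blobs as a stand-in dataset and Ward linkage as the merge criterion; both are illustrative choices, not necessarily those used in this practical.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Assumption: synthetic blobs stand in for the dataset used in the practical.
X, _ = make_blobs(n_samples=200, centers=3, cluster_std=0.8, random_state=42)

# Ward linkage merges the pair of clusters whose union gives the smallest
# increase in within-cluster variance.
agg = AgglomerativeClustering(n_clusters=3, linkage="ward")
agg_labels = agg.fit_predict(X)

agg_silhouette = silhouette_score(X, agg_labels)
print(f"Agglomerative Silhouette Score: {agg_silhouette:.3f}")
```

Setting n_clusters cuts the merge tree at the level that leaves exactly that many clusters.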
3. Density-Based Clustering (DBSCAN)
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) forms clusters by identifying dense
regions of data points. Points in low-density regions are considered noise.
# Calculate silhouette score (DBSCAN labels noise points as -1, so we filter them out)
mask = dbscan_labels != -1
if len(set(dbscan_labels[mask])) > 1:
    silhouette_avg = silhouette_score(X_pca[mask], dbscan_labels[mask])
    print(f'DBSCAN Silhouette Score: {silhouette_avg:.3f}')
else:
    print('DBSCAN formed fewer than two clusters; silhouette score is undefined.')
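For reference, a self-contained version of the DBSCAN step is sketched below. The two-moons data, eps and min_samples values are assumptions chosen for illustration; the X_pca features used in this practical are not reproduced here.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons
from sklearn.metrics import silhouette_score

# Assumption: two interleaved half-moons stand in for the PCA-reduced data.
X, _ = make_moons(n_samples=300, noise=0.05, random_state=42)

# eps is the neighbourhood radius; min_samples is the density threshold.
dbscan_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

# Noise points carry the label -1 and are excluded from the score.
clustered = dbscan_labels != -1
n_clusters = len(set(dbscan_labels[clustered].tolist()))
n_noise = int(np.sum(~clustered))
print(f"Clusters found: {n_clusters}, noise points: {n_noise}")

dbscan_silhouette = None
if n_clusters > 1:
    dbscan_silhouette = silhouette_score(X[clustered], dbscan_labels[clustered])
    print(f"DBSCAN Silhouette Score: {dbscan_silhouette:.3f}")
else:
    print("Fewer than two clusters; silhouette score is undefined.")
```

The silhouette score needs at least two distinct cluster labels, which is why the check counts clusters after removing the noise label.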
4. Model-Based Clustering (Gaussian Mixture Models, GMM):
GMM assumes that the data is generated from a mixture of several Gaussian distributions. Each cluster is
modeled by a multivariate Gaussian distribution, and the goal is to maximize the likelihood of the data given the
parameters.
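The description above can be illustrated with scikit-learn's GaussianMixture, which fits the mixture by expectation-maximization. This is a sketch on synthetic blobs (an assumption); it shows the soft assignments and the average log-likelihood that the fit tries to maximize.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Assumption: synthetic blobs stand in for the dataset used in the practical.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=42)

# One full-covariance multivariate Gaussian per cluster.
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=42)
gmm.fit(X)

gmm_labels = gmm.predict(X)              # most likely component per point
responsibilities = gmm.predict_proba(X)  # soft assignment; each row sums to 1
avg_log_likelihood = gmm.score(X)        # mean log-likelihood per sample

print("Component weights:", np.round(gmm.weights_, 3))
print(f"Average log-likelihood: {avg_log_likelihood:.3f}")
```

Unlike k-Means, each point receives a probability for every component, so points between clusters are not forced into a hard assignment.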
5. Mean-Shift Clustering:
Mean-shift clustering iteratively shifts data points toward the mode (densest region) of the dataset, ultimately
identifying clusters around these density peaks.
Score:{silhouette_avg:.3f}')
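A self-contained mean-shift sketch is given below. The synthetic blobs and the quantile passed to estimate_bandwidth are assumptions for illustration; the number of clusters is not specified in advance but follows from the bandwidth.

```python
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets import make_blobs

# Assumption: synthetic blobs stand in for the dataset used in the practical.
X, _ = make_blobs(n_samples=200, centers=3, cluster_std=0.6, random_state=42)

# The bandwidth sets the kernel radius used when shifting points toward modes.
bandwidth = estimate_bandwidth(X, quantile=0.2)

mean_shift = MeanShift(bandwidth=bandwidth)
ms_labels = mean_shift.fit_predict(X)

# Each cluster centre is a density peak that the points converged to.
n_found = len(mean_shift.cluster_centers_)
print(f"Estimated bandwidth: {bandwidth:.3f}")
print(f"Clusters found: {n_found}")
```

A smaller bandwidth tends to produce more, tighter clusters, while a larger one merges neighbouring peaks.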
6. Spectral Clustering:
Spectral clustering uses graph theory to partition data into clusters by using the eigenvalues (spectrum) of
the similarity matrix of the data. It's well-suited for data that is not linearly separable.
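The sketch below applies spectral clustering to two interleaved half-moons, a common example of data that is not linearly separable. The dataset and the nearest-neighbour affinity are illustrative assumptions.

```python
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

# Assumption: two half-moons stand in for the dataset used in the practical.
X, y = make_moons(n_samples=200, noise=0.05, random_state=42)

# The similarity graph is built from each point's 10 nearest neighbours;
# clusters are then found in the space spanned by the graph's eigenvectors.
spectral = SpectralClustering(
    n_clusters=2, affinity="nearest_neighbors", n_neighbors=10, random_state=42
)
spectral_labels = spectral.fit_predict(X)

spectral_ari = adjusted_rand_score(y, spectral_labels)
print(f"Spectral clustering ARI: {spectral_ari:.3f}")
```

Because the method works on graph connectivity rather than on distances to centroids, it can follow curved cluster shapes.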
PRACTICAL: 12
AIM: Build an Artificial Neural Network by implementing the
Backpropagation algorithm and test the same using appropriate data sets.
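To make the backpropagation step explicit, a minimal NumPy-only sketch is given below. It is an illustration under stated assumptions, not the Keras code used in this practical: the XOR data, the single hidden layer of 4 sigmoid units, the learning rate and the epoch count are all illustrative choices.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


rng = np.random.default_rng(0)

# XOR inputs and targets: a problem that needs a hidden layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Illustrative sizes: 2 inputs -> 4 hidden units -> 1 output.
W1 = rng.normal(size=(2, 4))
b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1))
b2 = np.zeros((1, 1))
lr = 0.5
eps = 1e-12  # avoids log(0)
losses = []

for epoch in range(5000):
    # Forward pass.
    a1 = sigmoid(X @ W1 + b1)
    a2 = sigmoid(a1 @ W2 + b2)
    loss = -np.mean(y * np.log(a2 + eps) + (1 - y) * np.log(1 - a2 + eps))
    losses.append(loss)

    # Backward pass: with a sigmoid output and binary cross-entropy,
    # the gradient with respect to the output pre-activation is (a2 - y) / n.
    dz2 = (a2 - y) / len(X)
    dW2 = a1.T @ dz2
    db2 = dz2.sum(axis=0, keepdims=True)
    # Chain rule through the hidden sigmoid: derivative is a1 * (1 - a1).
    dz1 = (dz2 @ W2.T) * a1 * (1 - a1)
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0, keepdims=True)

    # Gradient descent update.
    W2 -= lr * dW2
    b2 -= lr * db2
    W1 -= lr * dW1
    b1 -= lr * db1

# Final forward pass with the trained weights.
probs = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
print(f"Initial loss: {losses[0]:.4f}, final loss: {losses[-1]:.4f}")
print("Predicted probabilities:", np.round(probs.ravel(), 3))
```

Keras performs the same forward and backward passes automatically when fit is called; the explicit version only shows where each gradient comes from.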
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler
le = LabelEncoder()
for col in ['job', 'marital', 'default', 'education', 'housing', 'loan', 'contact', 'month', 'poutcome', 'y']:
    bank_ds[col] = le.fit_transform(bank_ds[col])
bank_ds.head()
x = bank_ds.iloc[:,:-1].values
y = bank_ds.iloc[:,-1].values
# import numpy as np
# x.shape
# for i in range(x.shape[1]):
#     x[:,i] = x[:,i]/np.max(x[:,i])
x_std = StandardScaler().fit_transform(x)
x_std[0]
array([-1.05626965, 1.71680374, -0.24642938, -1.64475535, -0.1307588 ,
0.12107186, -1.14205138, -0.42475611, -0.72364152, 0.37405206,
1.48541444, -0.7118608 , -0.57682947, -0.4072183 , -0.32041282,
0.44441328])
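The StandardScaler call above rescales every column to zero mean and unit variance. The short sketch below checks this against the manual formula on a small hypothetical array.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical feature matrix with two columns on very different scales.
x = np.array([[30.0, 1000.0], [40.0, 2000.0], [50.0, 6000.0]])

x_std = StandardScaler().fit_transform(x)

# StandardScaler uses the population standard deviation (ddof=0),
# which is also the default of np.std.
manual = (x - x.mean(axis=0)) / x.std(axis=0)

print(np.allclose(x_std, manual))
print(x_std.mean(axis=0), x_std.std(axis=0))
```

Standardized inputs keep features with large raw values from dominating the gradient updates of the network.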
# ANN
model_bc = Sequential()
model_bc.add(Dense(20, activation='relu', input_shape=(16,)))
model_bc.add(Dense(5, activation='relu'))
model_bc.add(Dense(1, activation='sigmoid'))  # binary classification: sigmoid output activation
Epoch 1/10
29/29 [==============================] - 1s 11ms/step - loss: 132.9066 - accuracy: 0.1773 - val_loss: 49.4584 - val_accuracy: 0.4177
Epoch 2/10
29/29 [==============================] - 0s 5ms/step - loss: 9.2283 - accuracy: 0.7312 - val_loss: 0.9392 - val_accuracy: 0.8674
Epoch 3/10
29/29 [==============================] - 0s 7ms/step - loss: 1.1626 - accuracy: 0.8722 - val_loss: 0.7767 - val_accuracy: 0.8674
Epoch 4/10
29/29 [==============================] - 0s 8ms/step - loss: 0.8962 - accuracy: 0.8728 - val_loss: 0.6238 - val_accuracy: 0.8674
Epoch 5/10
29/29 [==============================] - 0s 7ms/step - loss: 0.6996 - accuracy: 0.8744 - val_loss: 0.5429 - val_accuracy: 0.8685
Epoch 6/10
29/29 [==============================] - 0s 8ms/step - loss: 0.5973 - accuracy: 0.8761 - val_loss: 0.5306 - val_accuracy: 0.8685
Epoch 7/10
29/29 [==============================] - 0s 14ms/step - loss: 0.5731 - accuracy: 0.8769 - val_loss: 0.5318 - val_accuracy: 0.8685
Epoch 8/10
29/29 [==============================] - 0s 16ms/step - loss: 0.5687 - accuracy: 0.8780 - val_loss: 0.5266 - val_accuracy: 0.8685
Epoch 9/10
29/29 [==============================] - 0s 8ms/step - loss: 0.5616 - accuracy: 0.8791 - val_loss: 0.5201 - val_accuracy: 0.8685
Epoch 10/10
29/29 [==============================] - 0s 8ms/step - loss: 0.5560 - accuracy: 0.8808 - val_loss: 0.5174 - val_accuracy: 0.8729
# Predict probabilities and convert them to class labels with a 0.5 threshold
pred_proba = model_bc.predict(x)
pred = []
for proba in pred_proba:
    if proba>=0.5:
        pred.append(1)
    else:
        pred.append(0)
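The thresholding loop above can also be written as a single vectorized NumPy expression. The probabilities below are hypothetical; the shape (n_samples, 1) matches what predict returns for a single sigmoid output unit.

```python
import numpy as np

# Hypothetical sigmoid outputs with shape (n_samples, 1).
pred_proba = np.array([[0.12], [0.50], [0.93], [0.49]])

# Compare against the 0.5 threshold, cast to int, and flatten to 1-D.
pred = (pred_proba >= 0.5).astype(int).ravel()
print(pred)  # [0 1 1 0]
```

The result is a 1-D integer array that can be passed directly to metrics such as accuracy_score or classification_report.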
iris_ds = datasets.load_iris()
x = iris_ds.data
y = iris_ds.target
model_mc = Sequential()
model_mc.add(Dense(10, activation='relu', input_shape=(4,)))
model_mc.add(Dense(5, activation='relu'))
model_mc.add(Dense(3, activation='softmax'))  # multiclass classification: softmax output activation
Epoch 1/4
2/2 [==============================] - 1s 249ms/step - loss: 1.1049 - accuracy: 0.1667 - val_loss: 0.5357 - val_accuracy: 1.0000
Epoch 2/4
2/2 [==============================] - 0s 58ms/step - loss: 1.0822 - accuracy: 0.1750 - val_loss: 0.5653 - val_accuracy: 1.0000
Epoch 3/4
2/2 [==============================] - 0s 92ms/step - loss: 1.0599 - accuracy: 0.2250 - val_loss: 0.5945 - val_accuracy: 1.0000
Epoch 4/4
2/2 [==============================] - 0s 107ms/step - loss: 1.0394 - accuracy: 0.3750 - val_loss: 0.6243 - val_accuracy: 1.0000
x.shape
np.unique(y)
array([0, 1, 2])
y_pred_proba = model_mc.predict(x)
y_pred = []
for proba in y_pred_proba:
    y_pred.append(np.argmax(proba))
print(classification_report(y, y_pred))
C:\Users\HP\anaconda3\lib\site-packages\sklearn\metrics\_classification.py:1318: UndefinedMetricWarning:
Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division`
parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
C:\Users\HP\anaconda3\lib\site-packages\sklearn\metrics\_classification.py:1318: UndefinedMetricWarning:
Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division`
parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
C:\Users\HP\anaconda3\lib\site-packages\sklearn\metrics\_classification.py:1318: UndefinedMetricWarning:
Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division`
parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
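The UndefinedMetricWarning above is raised when at least one class is never predicted, so its precision has a zero denominator. The sketch below reproduces that situation with hypothetical softmax outputs and shows the zero_division parameter that the warning message refers to, together with a vectorized argmax.

```python
import numpy as np
from sklearn.metrics import classification_report

y = np.array([0, 0, 1, 1, 2, 2])

# Hypothetical softmax outputs in which class 2 never has the highest probability.
y_pred_proba = np.array([
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.6, 0.1],
    [0.5, 0.4, 0.1],
    [0.4, 0.5, 0.1],
])

# argmax along axis 1 picks the most probable class for every row at once.
y_pred = np.argmax(y_pred_proba, axis=1)

# zero_division=0 sets the undefined metrics to 0.0 without emitting a warning.
report = classification_report(y, y_pred, zero_division=0, output_dict=True)
print(classification_report(y, y_pred, zero_division=0))
```

Silencing the warning does not fix the underlying issue: a class with no predicted samples usually means the model needs more training or a different configuration.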