
EX 1:

(A) IMPLEMENTATION OF UNINFORMED SEARCH ALGORITHMS:(BFS)

CODE:

graph={
    'A':['B','C'],
    'B':['D','E'],
    'C':['B','F'],
    'D':[],
    'E':['F'],
    'F':[]
}
visited=[]
queue=[]

def bfs(visited,graph,node):
    visited.append(node)
    queue.append(node)

    while queue:
        s=queue.pop(0)
        print(s,end=" ")

        for neighbour in graph[s]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)

bfs(visited,graph,'A')

OUTPUT:

A B C D E F
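
The queue above is a plain list, so queue.pop(0) is O(n). A minimal variant of the same traversal (a sketch, assuming the graph dictionary defined above) uses collections.deque so that each dequeue is O(1):

from collections import deque

def bfs_deque(graph, start):
    visited = [start]
    queue = deque([start])      # deque supports O(1) pops from the left
    while queue:
        s = queue.popleft()
        print(s, end=" ")
        for neighbour in graph[s]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)

bfs_deque(graph, 'A')           # expected order: A B C D E F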
(B) IMPLEMENTATION OF UNINFORMED SEARCH ALGORITHMS:(DFS)

CODE:

def dfs(graph,start,visited=None):
    if visited is None:
        visited=set()
    visited.add(start)

    print(start)

    # graph[start]-visited is evaluated once per call, so a neighbour that gets visited
    # deeper in the recursion can still be recursed into again (hence the repeated 3 and 2
    # in the output); set iteration order is arbitrary, so the order may vary between runs.
    for next in graph[start]-visited:
        dfs(graph,next,visited)
    return visited

graph={
    '0':set(['1','2']),
    '1':set(['0','3','4']),
    '2':set(['0']),
    '3':set(['1']),
    '4':set(['2','3'])
}
dfs(graph,'0')

OUTPUT:

0
1
4
2
3
3
2
{'0', '1', '2', '3', '4'}

EX 2:
(A) IMPLEMENTATION OF INFORMED SEARCH ALGORITHMS:(A*)

CODE:

def astaralgo(start, stop):
    open_set, closed_set, g, parents = {start}, set(), {start: 0}, {start: start}

    while open_set:
        n = min(open_set, key=lambda v: g[v] + h(v))
        if n == stop:
            path = []
            while n != start:
                path.append(n)
                n = parents[n]
            path.append(start)
            print("Path found:", path[::-1])  # reverse the path so it runs from start to goal
            return path[::-1]

        open_set.remove(n)
        closed_set.add(n)
        for m, weight in get_neighbors(n):
            if m not in closed_set and (m not in open_set or g[m] > g[n] + weight):
                open_set.add(m)
                g[m], parents[m] = g[n] + weight, n

    print("Path does not exist!")
    return None

def get_neighbors(v):
    # treat missing nodes and nodes mapped to None (e.g. 'C') as having no neighbours
    return Graph_nodes.get(v) or []

def h(n):
    return {'A': 11, 'B': 6, 'C': 99, 'D': 1, 'E': 7, 'G': 0}.get(n, float('inf'))

Graph_nodes = {
    'A': [('B', 2), ('E', 3)],
    'B': [('C', 1), ('G', 9)],
    'C': None,
    'E': [('D', 6)],
    'D': [('G', 1)]
}
astaralgo('A', 'G')

OUTPUT:

Path found: ['A', 'E', 'D', 'G']


['A', 'E', 'D', 'G']
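
As a quick sanity check (a sketch, assuming the Graph_nodes dictionary defined above), the cost of the returned path can be recomputed from the edge weights; A-E-D-G costs 3 + 6 + 1 = 10, cheaper than the A-B-G route of cost 2 + 9 = 11:

def path_cost(path):
    # sum the edge weights along consecutive pairs of the path
    total = 0
    for a, b in zip(path, path[1:]):
        total += dict(Graph_nodes.get(a) or [])[b]
    return total

print(path_cost(['A', 'E', 'D', 'G']))  # 10
print(path_cost(['A', 'B', 'G']))       # 11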

(B) IMPLEMENTATION OF INFORMED SEARCH ALGORITHMS: (MEMORY-BOUNDED A*)

CODE:
import math

graph = {
'S': [('A', 1), ('B', 3)],
'A': [('B', 1), ('C', 3)],
'B': [('C', 1), ('D', 2)],
'C': [('D', 1), ('G', 2)],
'D': [('E', 3)],
'E': [('G', 2)],
'G': []
}

heuristic = {
'S': 7,
'A': 6,
'B': 2,
'C': 4,
'D': 2,
'E': 1,
'G': 0
}

def ma_star(graph, heuristic, start, goal, max_memory):
    stack = [(0, start, [start])]
    min_fcost = {start: 0}
    while stack:
        (f, node, path) = stack.pop()
        if node == goal:
            return path
        if math.isnan(f) or f > max_memory:
            continue
        for adjacent, cost in graph[node]:
            g = cost + min_fcost[node]
            if adjacent in min_fcost and g >= min_fcost[adjacent]:
                continue
            f = g + heuristic[adjacent]
            stack.append((f, adjacent, path + [adjacent]))
            min_fcost[adjacent] = g
    return None

start = 'S'
goal = 'G'
max_memory = 8

path = ma_star(graph, heuristic, start, goal, max_memory)


print('Shortest path:', path)

OUTPUT:

Shortest path: ['S', 'B', 'C', 'G']

EX 3:

IMPLEMENT NAIVE BAYES MODELS

LOADS A CSV FILE INTO A PANDAS DATAFRAME:

CODE:
import pandas as pd
import numpy as np
path = '/content/drive/MyDrive/PlayTennis.csv'
df = pd.read_csv(path)
df

OUTPUT:

ASSIGNS A COLUMN AS THE DEPENDENT VARIABLE:

CODE:

# Assign Dependent variable


y = df['PlayTennis']
y
OUTPUT:

ASSIGNS INDEPENDENT VARIABLES BY DROPPING THE DEPENDENT COLUMN:

CODE:

# Assign Independent variable


x = df.drop('PlayTennis', axis=1)
x

OUTPUT:
ENCODES CATEGORICAL FEATURES AS NUMERIC VALUES:

CODE:

# We are going to convert the dataset from string to unique numbers


from sklearn.preprocessing import LabelEncoder
numData = LabelEncoder()
x['Outlook'] = numData.fit_transform(df['Outlook'])
x['Temp'] = numData.fit_transform(df['Temp'])
x['Humidity'] = numData.fit_transform(df['Humidity'])
x['Windy'] = numData.fit_transform(df['Windy'])  # the column is named 'Windy'; attribute-style assignment (x.Wind) would not update it
x

OUTPUT:
TRAINS A GAUSSIAN NAIVE BAYES MODEL ON THE DATASET:

CODE:

# Import Gaussian Naive Bayes model


from sklearn.naive_bayes import GaussianNB
# Assign the Gaussian Bayes function
model = GaussianNB()
# Train the model using training set
model.fit(x,y)

OUTPUT:
PREDICTS OUTCOMES USING THE TRAINED MODEL:

CODE:

# Predict the output


pred1 = model.predict([[0,2,0,1]])
print("Predicted result: ", pred1)

pred2 = model.predict([[1,0,1,0]])
print("Predicted result: ", pred2)

pred3 = model.predict([[2,2,0,0]])
print("Predicted result: ", pred3)

OUTPUT:

Predicted result: ['yes']


Predicted result: ['yes']
Predicted result: ['no']
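
GaussianNB treats the label-encoded columns as continuous values. Since Outlook, Temp, Humidity and Windy are all categorical, CategoricalNB is arguably the closer fit; a hedged alternative sketch (reusing the encoded x and y above):

from sklearn.naive_bayes import CategoricalNB

cat_model = CategoricalNB()
cat_model.fit(x, y)
print("Predicted result:", cat_model.predict([[0, 2, 0, 1]]))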

EX 4:
IMPLEMENT BAYESIAN NETWORK

PROGRAM:

INSTALLING PGMPY AND LOADING HEART DISEASE DATA

pip install pgmpy

import numpy as np
import pandas as pd
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.models import BayesianModel
from pgmpy.inference import VariableElimination

heartDisease = pd.read_csv("C:/Users/Student/Desktop/Heartdisease.csv")
heartDisease = heartDisease.replace('?', np.nan)
print('Sample instances from the dataset are given below')
print(heartDisease.head())

Sample instances from the dataset are given below


   age  gender  cp  trestbps  chol  fbs  restecg  thalach  exang  oldpeak  \
0   63       1   1       145   233    1        2      150      0      2.3
1   67       1   4       160   286    0        2      108      1      1.5
2   67       1   4       120   229    0        2      129      1      2.6
3   37       1   3       130   250    0        0      187      0      3.5
4   41       0   2       130   204    0        2      172      0      1.4

   slope  ca  thal  heartdisease
0      3   0     6             0
1      2   3     3             2
2      2   2     7             1
3      3   0     3             0
4      1   0     3             0

PRINTING ATTRIBUTES AND DATATYPES OF HEART DISEASE DATASET

print('\n Attributes and datatypes')


print(heartDisease.dtypes)

Attributes and datatypes


age int64
gender int64
cp int64
trestbps int64
chol int64
fbs int64
restecg int64
thalach int64
exang int64
oldpeak float64
slope int64
ca object
thal object
heartdisease int64
dtype: object

LEARNING CPD IN BAYESIAN MODEL

model = BayesianModel([('age', 'heartdisease'), ('exang', 'heartdisease'), ('cp', 'heartdisease'),
                       ('heartdisease', 'restecg'), ('heartdisease', 'chol')])
print('\nLearning CPD using Maximum likelihood estimators')
model.fit(heartDisease, estimator=MaximumLikelihoodEstimator)

Learning CPD using Maximum likelihood estimators

BAYESIAN NETWORK INFERENCE

print('\n Inferencing with Bayesian Network:')


HeartDisease_test_infer = VariableElimination(model)

Inferencing with Bayesian Network:

HEART DISEASE PROBABILITY WITH EVIDENCE

print('\n 1. Probability of HeartDisease given evidence= restecg')
q1 = HeartDisease_test_infer.query(variables=['heartdisease'], evidence={'restecg': 1})
print(q1)

 1. Probability of HeartDisease given evidence= restecg
+-----------------+---------------------+
| heartdisease    |   phi(heartdisease) |
+=================+=====================+
| heartdisease(0) | 0.1386 |
+-----------------+---------------------+
| heartdisease(1) | 0.0000 |
+-----------------+---------------------+
| heartdisease(2) | 0.2403 |
+-----------------+---------------------+
| heartdisease(3) | 0.2174 |
+-----------------+---------------------+
| heartdisease(4) | 0.4036 |
+-----------------+---------------------+

HEART DISEASE PROBABILITY WITH CP EVIDENCE

print('\n 2. Probability of HeartDisease given evidence= cp ')
q2 = HeartDisease_test_infer.query(variables=['heartdisease'], evidence={'cp': 2})
print(q2)

 2. Probability of HeartDisease given evidence= cp
+-----------------+---------------------+
| heartdisease    |   phi(heartdisease) |
+=================+=====================+
| heartdisease(0) | 0.4433 |
+-----------------+---------------------+
| heartdisease(1) | 0.1888 |
+-----------------+---------------------+
| heartdisease(2) | 0.1189 |
+-----------------+---------------------+
| heartdisease(3) | 0.1377 |
+-----------------+---------------------+
| heartdisease(4) | 0.1114 |
+-----------------+---------------------+
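
VariableElimination also accepts several evidence variables at once; as a small sketch using the same fitted model, the two pieces of evidence above can be combined in a single query:

q3 = HeartDisease_test_infer.query(variables=['heartdisease'],
                                   evidence={'restecg': 1, 'cp': 2})
print(q3)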

EX 5:(A)
LINEAR REGRESSION MODEL

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

path = '/content/drive/MyDrive/3rd Semester/data science/datasets/diabetes (5).csv'


df = pd.read_csv(path)
df.head(3)

X = df[['Glucose']]
y = df['Outcome']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


model = LinearRegression()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
y_pred

array([ 0.1918881 ,  0.28684498,  0.25971444,  0.25293181,  0.4496282 ,
        0.22580127,  0.00875697,  0.32075815,  0.57171562,  0.52423718,
        0.28006234,  0.74128148,  0.53101982,  0.17832283,  0.12406175,
        0.37501923,  0.09693121,  0.11049648,  0.76841202,  0.47675874,
        0.2325839 ,  0.12406175,  0.24614917,  0.17832283,  0.40214976,
        0.77519466,  0.26649708,  0.20545337,  0.21901863,  0.13084439,
        0.6259767 ,  0.51745455,  0.77519466,  0.65988987,  0.26649708,
        0.55815035,  0.8633689 ,  0.26649708,  0.34788869,  0.41571503,
        0.10371385,  0.44284557,  0.26649708,  0.2325839 ,  0.15119229,
        0.59884616,  0.28006234,  0.17154019,  0.28684498,  0.70058568,
        0.16475756,  0.51067191,  0.47675874,  0.43606294,  0.46997611,
        0.02910487,  0.59884616,  0.06980068,  0.34788869,  0.66667251,
        0.57171562,  0.35467132,  0.40214976,  0.15119229,  0.09693121,
        0.52423718,  0.07658331,  0.45641084,  0.2325839 ,  0.69380304,
        0.54458509,  0.09014858,  0.14440966,  0.17154019,  0.22580127,
        0.22580127,  0.23936654,  0.17832283,  0.32075815,  0.27327971,
        0.6056288 ,  0.2325839 ,  0.0358875 ,  0.44284557,  0.33432342,
        0.79554256,  0.55815035,  0.36823659,  0.19867073,  0.11727912,
        0.15797492,  0.15119229,  0.02910487,  0.34110605,  0.36145396,
        0.50388928,  0.24614917,  0.09014858,  0.72771622, -0.05228674,
        0.63954197,  0.05623541,  0.38180186,  0.51745455,  0.59884616,
        0.28684498,  0.48354138,  0.70058568,  0.06980068,  0.26649708,
        0.18510546,  0.71415095,  0.23936654,  0.65988987,  0.28684498,
        0.36823659,  0.6259767 ,  0.27327971,  0.14440966,  0.30719288,
        0.23936654,  0.20545337,  0.33432342,  0.13762702,  0.55136772,
        ...
        0.76841202,  0.55136772,  0.26649708,  0.4496282 ,  0.75484675,
        0.21901863,  0.34110605,  0.59884616,  0.54458509, -0.01159094,
        0.28684498,  0.13762702,  0.46997611,  0.46997611,  0.15797492,
        0.09693121,  0.10371385,  0.02232224,  0.28684498,  0.6463246 ,
        0.04945277,  0.17154019,  0.51745455,  0.02910487])

# Evaluate the model


mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print('Mean squared error:', mse)
print('Coefficient of determination (R^2):', r2)

Mean squared error: 0.17113033279525355


Coefficient of determination (R^2): 0.25463232826956206
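
Because Outcome is a 0/1 label, the fitted line produces continuous scores (some even negative). One rough way to read them as class predictions, sketched here on the variables above, is to threshold at 0.5 and measure accuracy; logistic regression (EX 5(C)) handles this case more naturally:

from sklearn.metrics import accuracy_score

y_pred_class = (y_pred >= 0.5).astype(int)   # threshold the regression scores at 0.5
print("Accuracy after thresholding:", accuracy_score(y_test, y_pred_class))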

EX 5:(B)
MULTIPLE LINEAR REGRESSION
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

path = '/content/drive/MyDrive/3rd Semester/data science/datasets/diabetes (5).csv'


df = pd.read_csv(path)
df.head(3)

feature_cols = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI',


'DiabetesPedigreeFunction', 'Age']
X = df[feature_cols]
y = df['Outcome']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


model = LinearRegression()
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)
y_pred

array([ 0.33550028,  0.23809869,  0.1510522 ,  0.2401365 ,  0.48142376,
        0.45257375, -0.17450469,  0.60662287,  0.52417796,  0.70476953,
        0.32360466,  0.85290601,  0.38466612,  0.36056948,  0.09946712,
        0.41539557,  0.17869123,  0.07782301,  0.80730861,  0.51299477,
        0.28090594,  0.08303057,  0.5099157 ,  0.11381771,  0.51325022,
        0.82528549,  0.17892718, -0.0594202 ,  0.28338572,  0.16407949,
        0.83851225,  0.80737515,  0.68154389,  0.7649502 ,  0.56140297,
        0.62123131,  1.06134554,  0.30990775,  0.51752336,  0.63691482,
        0.07075333,  0.57757007,  0.55015462,  0.37541745, -0.07644182,
        0.50119208,  0.59600162,  0.27464761,  0.42477995,  0.9941898 ,
        0.00969584,  0.61763578,  0.73395288,  0.31090975,  0.13456812,
       -0.02536316,  0.71219147, -0.30518218,  0.41994556,  0.67869594,
        0.66891428,  0.3798452 ,  0.2956646 ,  0.288035  ,  0.06813053,
        0.55464338,  0.01368504,  0.6272007 , -0.02033281,  0.6372293 ,
        0.61928494,  0.07019372,  0.26388322,  0.14080565,  0.12425109,
        0.50054317,  0.24772661,  0.21027229,  0.18419241,  0.28346361,
        0.60206367,  0.19720081,  0.04718638,  0.39163459,  0.31373787,
        0.75789609,  0.82549769,  0.35944228,  0.1723114 ,  0.0957888 ,
        0.05894136,  0.277268  , -0.35746245,  0.52802473,  0.48569971,
        0.57670079,  0.40681613,  0.16649133,  0.56927171,  0.09451543,
        0.6570335 ,  0.03311435,  0.68073803,  0.48441106,  0.58967882,
        0.27055501,  0.33149868,  0.66512401,  0.17581258,  0.51566149,
        0.13045166,  0.38010107, -0.0949753 ,  0.65582849,  0.23302651,
        0.3716743 ,  0.68391471,  0.28174341,  0.05450268,  0.53690397,
        0.04284507,  0.33357357,  0.30472023,  0.10053203,  0.33006507,
        ...
        0.6518381 ,  0.77042295,  0.11555357,  0.44926623,  0.72795331,
        0.15230489,  0.21288603,  0.76637265,  0.72722441, -0.20395979,
        0.12946513, -0.02149655,  0.27508285,  0.39903148,  0.15993455,
        0.33468331,  0.20438069, -0.12662191,  0.43170733,  0.68158975,
        0.163167  ,  0.4815615 ,  0.30101739,  0.26110909])

# Evaluate the model


mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print('Mean squared error:', mse)
print('Coefficient of determination (R^2):', r2)

Mean squared error: 0.17104527280850104


Coefficient of determination (R^2): 0.25500281176741757

EX 5:(C)
LOGISTIC REGRESSION
# import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report

# Load the Pima Indian Diabetes dataset


url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-
diabetes.data.csv"
columns = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI',
'DiabetesPedigreeFunction', 'Age', 'Outcome']
df = pd.read_csv(url, names=columns)

# Display the first few rows of the dataset


print(df.head())

# Check for missing values


print(df.isnull().sum())
Pregnancies 0
Glucose 0
BloodPressure 0
SkinThickness 0
Insulin 0
BMI 0
DiabetesPedigreeFunction 0
Age 0
Outcome 0
dtype: int64

# Split data into features (X) and target (Y)


X = df.drop('Outcome', axis=1)
y = df['Outcome']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize the Logistic Regression model


logreg = LogisticRegression(max_iter=1000)

logreg.fit(X_train, y_train)

y_pred = logreg.predict(X_test)
print('y_test', np.array(y_test))
print('y_pred', y_pred)

y_test [0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 1 1 0 0 0 0 0 1 0 0 1
0 1 1 1 1 0 1 1
1 0 1 0 0 0 1 0 1 1 0 0 0 0 1 1 1 0 0 0 0 0 1 1 0 0 1 0 0 0 1 0
1 0 0 0 1
0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 0 1 0
0 1 1 1 0
0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 1 1 0 1 1 0 1 1 0 1 1 1 0 0 0 0
0 0 0 1 0
0 1 0 0 1 0]
y_pred [0 0 0 0 0 0 0 1 1 1 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 1 0 0 0
0 1 1 1 1 1 1 1
0 1 1 0 1 1 0 0 1 1 0 0 1 0 1 1 0 0 0 1 0 0 1 1 0 0 0 0 1 0 1 0
1 1 0 0 0
0 1 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 1 1 1 0 0 1 0 1 0 1 1 1 0
0 1 0 1 0
0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 0 0 1 1 0 0 0 0
0 0 0 0 0
0 1 0 0 0 0]

accuracy = accuracy_score(y_test, y_pred)


print(f'Accuracy: {accuracy * 100:.2f}%')

Accuracy: 74.68%

# Generate a classification report


print(classification_report(y_test, y_pred))

# Confusion matrix
conf_matrix = confusion_matrix(y_test, y_pred)
conf_matrix

array([[78, 21],
[18, 37]])

# Plotting the confusion matrix


plt.figure(figsize=(6, 4))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues', xticklabels=['No Diabetes',
'Diabetes'], yticklabels=['No Diabetes', 'Diabetes'])
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.title('Confusion matrix')
plt.show()

EX 6:(A)
BUILD DECISION TREE AND RANDOM FOREST:

DECISION TREE:

IMPORTING REQUIRED LIBRARIES

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn import metrics

LOADING THE DATASET

path = '/content/drive/MyDrive/AIML/2024-25/Lab/diabetes.csv'
df = pd.read_csv(path)
df.head()

OUTPUT:

DISPLAYING DATASET DIMENSIONS

df.shape

OUTPUT:

(768, 9)

DECISION TREE MODEL TRAINING AND EVALUATION


# Feature Selection
feature_cols = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI',
'DiabetesPedigreeFunction', 'Age']
X = df[feature_cols]
y = df.Outcome

# Splitting Data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

# Building the Decision Tree Model


# Create Decision Tree Classifier object
model = DecisionTreeClassifier()

# Train the Decision Tree Classifier


model = model.fit(X_train, y_train)

# Predict the values for the test data


y_pred = model.predict(X_test)

# Evaluating the Model


print("Accuracy: ",metrics.accuracy_score(y_test,y_pred))

OUTPUT:

Accuracy: 0.7012987012987013
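
An unconstrained tree tends to overfit the training data. As a quick experiment (a sketch on the same split as above), limiting the depth often improves test accuracy:

# Shallower tree: fewer splits, less overfitting
model_d3 = DecisionTreeClassifier(max_depth=3)
model_d3 = model_d3.fit(X_train, y_train)
print("Accuracy (max_depth=3):", metrics.accuracy_score(y_test, model_d3.predict(X_test)))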

DECISION TREE VISUALIZATION

# To visualize the decision tree:

from sklearn.tree import export_graphviz


from six import StringIO
from IPython.display import Image
import pydotplus

dot_data = StringIO()
export_graphviz(model, out_file=dot_data,
filled=True, rounded=True,
special_characters=True,feature_names = feature_cols,class_names=['0','1'])
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())
graph.write_png('diabetes.png')
Image(graph.create_png())

OUTPUT:

EX 6:(B)
RANDOM FOREST
CODE:

from sklearn.datasets import make_classification, make_regression


from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, mean_squared_error, r2_score

# Classification example
X_class, y_class = make_classification(n_samples=1000, n_features=20, n_classes=2,
random_state=42)
X_train_class, X_test_class, y_train_class, y_test_class = train_test_split(X_class, y_class,
test_size=0.2, random_state=23)

rf_class = RandomForestClassifier()
rf_class.fit(X_train_class, y_train_class)
y_pred_class = rf_class.predict(X_test_class)
print("Classification Accuracy Score:", accuracy_score(y_test_class, y_pred_class))

# Regression example
X_reg, y_reg = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)
X_train_reg, X_test_reg, y_train_reg, y_test_reg = train_test_split(X_reg, y_reg, test_size=0.2,
random_state=23)

rf_reg = RandomForestRegressor()
rf_reg.fit(X_train_reg, y_train_reg)
y_pred_reg = rf_reg.predict(X_test_reg)

# Using R² score for regression


print("Regression R² Score:", r2_score(y_test_reg, y_pred_reg))
print("Regression Mean Squared Error:", mean_squared_error(y_test_reg, y_pred_reg))

OUTPUT:

Classification Accuracy Score: 0.9


Regression R² Score: 0.7475395783989639
Regression Mean Squared Error: 9314.387251516939

EX 7:

BUILD SVM MODEL


LOADING AND DISPLAYING WINE DATASET

import pandas as pd

path ='/content/drive/MyDrive/4th sem/AIML/ai ml lab/AIML Dataset/winequality-red.csv'


df = pd.read_csv(path)
df.head(3)

OUTPUT:

SPLITTING DATA AND TRAINING SVM MODEL

# Import train_test_split function


from sklearn.model_selection import train_test_split

X=df[['fixed acidity', 'volatile acidity', 'citric acid', 'residual sugar', 'chlorides',
      'free sulfur dioxide', 'total sulfur dioxide', 'density', 'pH', 'sulphates', 'alcohol']] # Features
y=df['quality'] # Labels

# Split dataset into training set and test set


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20) # 80% training and 20% test

from sklearn.svm import SVC # SVC is scikit-learn's support vector classifier: with kernel='linear' it fits a
# linear separating hyperplane, while non-linear kernels (the default is RBF) can separate data that is not
# linearly separable.
model=SVC()

model.fit(X_train, y_train)

OUTPUT:
PREDICTION AND EVALUATION

y_pred=model.predict(X_test)

# Importing the classification report and confusion matrix


from sklearn.metrics import classification_report, confusion_matrix

print(confusion_matrix(y_test,y_pred))

OUTPUT:

[[ 0 0 0 1 0 0]
[ 0 0 3 10 0 0]
[ 0 0 61 76 0 0]
[ 0 0 28 92 0 0]
[ 0 0 2 42 0 0]
[ 0 0 1 4 0 0]]

CLASSIFICATION REPORT

print(classification_report(y_test, y_pred))

OUTPUT:

precision recall f1-score support

3 0.00 0.00 0.00 1


4 0.00 0.00 0.00 13
5 0.64 0.45 0.53 137
6 0.41 0.77 0.53 120
7 0.00 0.00 0.00 44
8 0.00 0.00 0.00 5

accuracy 0.48 320


macro avg 0.18 0.20 0.18 320
weighted avg 0.43 0.48 0.43 320
/usr/local/lib/python3.11/dist-packages/sklearn/metrics/
_classification.py:1565: UndefinedMetricWarning: Precision is
ill-defined and being set to 0.0 in labels with no predicted
samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, f"{metric.capitalize()} is",
len(result))
/usr/local/lib/python3.11/dist-packages/sklearn/metrics/
_classification.py:1565: UndefinedMetricWarning: Precision is
ill-defined and being set to 0.0 in labels with no predicted
samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, f"{metric.capitalize()} is",
len(result))
/usr/local/lib/python3.11/dist-packages/sklearn/metrics/
_classification.py:1565: UndefinedMetricWarning: Precision is
ill-defined and being set to 0.0 in labels with no predicted
samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, f"{metric.capitalize()} is",
len(result))

ACTUAL VS PREDICTED RESULTS

data_p=pd.DataFrame({'Actual':y_test, 'Predicted':y_pred})
data_p

OUTPUT:

MODEL ACCURACY
#Import scikit-learn metrics module for accuracy calculation
from sklearn import metrics

# Model Accuracy, how often is the classifier correct?


print("Accuracy:",metrics.accuracy_score(y_test, y_pred))

OUTPUT:

Accuracy: 0.478125
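
The default RBF kernel on unscaled features gives only about 48% accuracy here. A small, hedged experiment (same split as above) is to scale the features and compare a few kernels:

from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

for kernel in ['linear', 'rbf', 'poly']:
    clf = make_pipeline(StandardScaler(), SVC(kernel=kernel))   # scale, then fit the SVC
    clf.fit(X_train, y_train)
    print(kernel, "accuracy:", metrics.accuracy_score(y_test, clf.predict(X_test)))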

VISUALIZING SVM DECISION BOUNDARY

# prompt: generate code to visualize the svc of the given dataset

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report, confusion_matrix
from sklearn import metrics
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

# ... (your existing code for data loading, model training, and evaluation)

# SVM Visualization with custom data


# Generate sample data (replace with your actual data if needed)
X, y = make_blobs(n_samples=100, centers=2, random_state=0, cluster_std=1.0)

# Create and fit the SVM model


model_viz = SVC(kernel='linear', C=1) # You can change the kernel and C value
model_viz.fit(X, y)

# Plot the data and the decision boundary


plt.figure(figsize=(8, 6)) # Adjust figure size if needed
plt.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap='viridis') # Use a different cmap for better visibility

ax = plt.gca()
xlim = ax.get_xlim()
ylim = ax.get_ylim()

# Create grid to evaluate model


xx = np.linspace(xlim[0], xlim[1], 100)
yy = np.linspace(ylim[0], ylim[1], 100)
YY, XX = np.meshgrid(yy, xx)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
Z = model_viz.decision_function(xy).reshape(XX.shape)

# Plot decision boundary and margins


ax.contour(XX, YY, Z, colors='k', levels=[-1, 0, 1], alpha=0.5, linestyles=['--', '-', '--'])
ax.scatter(model_viz.support_vectors_[:, 0], model_viz.support_vectors_[:, 1], s=100,
linewidth=1, facecolors='none', edgecolors='k')
plt.title('SVM Decision Boundary')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')

plt.show()

OUTPUT:

EX 8:(A)
IMPLEMENT ENSEMBLING TECHNIQUE

ENSEMBLE LEARNING CLASSIFIER:

LOADING CAR PRICE DATASET

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

from sklearn.linear_model import LinearRegression


from sklearn.ensemble import RandomForestRegressor
import xgboost as xgb

df=pd.read_csv('/content/drive/MyDrive/AIML/car_price.csv')
print(df.head())
print(df.shape)

OUTPUT:

   Car_ID    Brand    Model  Year  Kilometers_Driven  ...  Mileage  Engine  Power  Seats    Price
0       1   Toyota  Corolla  2018              50000  ...       15    1498    108      5   800000
1       2    Honda    Civic  2019              40000  ...       17    1597    140      5  1000000
2       3     Ford  Mustang  2017              20000  ...       10    4951    395      4  2500000
3       4   Maruti    Swift  2020              30000  ...       23    1248     74      5   600000
4       5  Hyundai   Sonata  2016              60000  ...       18    1999    194      5   850000

[5 rows x 13 columns]
(100, 13)

ENCODING CATEGORICAL DATA (BRAND)


from sklearn import preprocessing
le=preprocessing.LabelEncoder()

Brand_c=df['Brand'].unique()
print(Brand_c)

OUTPUT:

['Toyota' 'Honda' 'Ford' 'Maruti' 'Hyundai' 'Tata' 'Mahindra' 'Volkswagen'
 'Audi' 'BMW' 'Mercedes']

ENCODING 'BRAND' COLUMN

df['Brand']=le.fit_transform(df['Brand'])
Brand_1=df["Brand"].unique()
print(Brand_1)

OUTPUT:

[ 9 3 2 6 4 8 5 10 0 1 7]

BRAND ENCODING COMPARISON

df['Brand']= le.fit_transform(df['Brand'])
Brand_1 = df['Brand'].unique()
print(pd.DataFrame({'Brand_c':Brand_c, 'Brand_1':Brand_1}))

OUTPUT:

Brand_c Brand_1
0 Toyota 9
1 Honda 3
2 Ford 2
3 Maruti 6
4 Hyundai 4
5 Tata 8
6 Mahindra 5
7 Volkswagen 10
8 Audi 0
9 BMW 1
10 Mercedes 7

DISPLAYING UNIQUE 'MODEL' VALUES

Model_c=df["Model"].unique()
print(Model_c)

OUTPUT:

['Corolla' 'Civic' 'Mustang' 'Swift' 'Sonata' 'Nexon' 'Scorpio' 'Polo'
 'A4' 'X1' 'C-Class' 'Endeavour' 'Creta' 'Harrier' 'Ertiga' 'City'
 'Tiguan' 'Q3' '5 Series' 'GLC' 'Innova' 'Figo' 'Verna' 'Altroz' 'Thar'
 'Passat' 'A6' 'X3' 'E-Class' 'Fortuner' 'Aspire' 'Elantra' 'Safari'
 'Vitara' 'WR-V' 'Ameo' 'A3' '7 Series' 'GLE' 'Yaris' 'Ranger' 'Santro'
 'Tigor' 'S-Cross' 'BR-V' 'T-Roc' 'Q7' 'X5' 'GLA' 'Camry' 'Venue' 'Tiago'
 'XUV300' 'Vento' 'A5' '3 Series' 'Innova Crysta' 'EcoSport']

ENCODING 'MODEL' COLUMN

df['Model']=le.fit_transform(df['Model'])
Model_1=df["Model"].unique()
print(Model_1)

OUTPUT:

[15 14 30 42 41 31 40 33 4 53 11 20 16 27 21 13 47 34 1 25 28
22 50 7
44 32 6 54 17 23 9 19 38 51 52 8 3 2 26 57 36 39 46 37 10
43 35 55
24 12 49 45 56 48 5 0 29 18]

COMPARING ORIGINAL AND ENCODED 'MODEL' VALUES


df['Model']= le.fit_transform(df['Model'])
Model_1 = df['Model'].unique()
print(pd.DataFrame({'Model_c':Model_c, 'Model_l':Model_1}))

OUTPUT:

Model_c Model_l
0 Corolla 15
1 Civic 14
2 Mustang 30
3 Swift 42
4 Sonata 41
5 Nexon 31
6 Scorpio 40
7 Polo 33
8 A4 4
9 X1 53
10 C-Class 11
11 Endeavour 20
12 Creta 16
13 Harrier 27
14 Ertiga 21
15 City 13
16 Tiguan 47
17 Q3 34
18 5 Series 1
19 GLC 25
20 Innova 28
21 Figo 22
22 Verna 50
23 Altroz 7
24 Thar 44
25 Passat 32
26 A6 6
27 X3 54
28 E-Class 17
29 Fortuner 23
30 Aspire 9
31 Elantra 19
32 Safari 38
33 Vitara 51
34 WR-V 52
35 Ameo 8
36 A3 3
37 7 Series 2
38 GLE 26
39 Yaris 57
40 Ranger 36
41 Santro 39
42 Tigor 46
43 S-Cross 37
44 BR-V 10
45 T-Roc 43
46 Q7 35
47 X5 55
48 GLA 24
49 Camry 12
50 Venue 49
51 Tiago 45
52 XUV300 56
53 Vento 48
54 A5 5
55 3 Series 0
56 Innova Crysta 29
57 EcoSport 18

DISPLAYING UNIQUE 'FUEL_TYPE' VALUES

Fuel_type_c=df['Fuel_Type'].unique()
print(Fuel_type_c)

OUTPUT:

['Petrol' 'Diesel']

ENCODING 'FUEL_TYPE' COLUMN

df['Fuel_Type']=le.fit_transform(df['Fuel_Type'])
Fuel_Type_1=df["Fuel_Type"].unique()
print(Fuel_Type_1)
OUTPUT:

[1 0]

COMPARING ORIGINAL AND ENCODED 'FUEL_TYPE' VALUES

df['Fuel_Type']= le.fit_transform(df['Fuel_Type'])
Fuel_Type_1 = df['Fuel_Type'].unique()
print(pd.DataFrame({'Fuel_type_c':Fuel_type_c, 'Fuel_Type_1':Fuel_Type_1}))

OUTPUT:

Fuel_type_c Fuel_Type_1
0 Petrol 1
1 Diesel 0

DISPLAYING UNIQUE 'TRANSMISSION' VALUES

Transmission_c=df['Transmission'].unique()
print(Transmission_c)

OUTPUT:

['Manual' 'Automatic']

ENCODING 'TRANSMISSION' COLUMN

df['Transmission']=le.fit_transform(df['Transmission'])
Transmission_1=df["Transmission"].unique()
print(Transmission_1)

OUTPUT:

[1 0]

COMPARING ORIGINAL AND ENCODED 'TRANSMISSION' VALUES

df['Transmission']= le.fit_transform(df['Transmission'])
Transmission_1 = df['Transmission'].unique()
print(pd.DataFrame({'Transmission_c':Transmission_c, 'Transmission_1':Transmission_1}))
OUTPUT:

Transmission_c Transmission_1
0 Manual 1
1 Automatic 0

DISPLAYING UNIQUE 'OWNER_TYPE' VALUES

Owner_Type_c=df['Owner_Type'].unique()
print(Owner_Type_c)

OUTPUT:

['First' 'Second' 'Third']

ENCODING 'OWNER_TYPE' COLUMN

df['Owner_Type']=le.fit_transform(df['Owner_Type'])
Owner_Type_1=df["Owner_Type"].unique()
print(Owner_Type_1)

OUTPUT:

[0 1 2]

OWNER_TYPE ENCODING COMPARISON

df['Owner_Type']= le.fit_transform(df['Owner_Type'])
Owner_Type_1 = df['Owner_Type'].unique()
print(pd.DataFrame({'Owner_Type_c':Owner_Type_c, 'Owner_Type_1':Owner_Type_1}))

OUTPUT:

Owner_Type_c Owner_Type_1
0 First 0
1 Second 1
2 Third 2

OWNER_TYPE ENCODING COMPARISON


data = pd.DataFrame({'Car_ID':df['Car_ID'], 'Brand_l':df['Brand'], 'Model_l':df['Model'],
'Year':df['Year'], 'FuelType_l':df['Fuel_Type'], 'Kilometers_Driven':df['Kilometers_Driven'],
'Transmission_l':df['Transmission'], 'OwnerType_l':df['Owner_Type'], 'Mileage':df['Mileage'],
'Engine':df['Engine'], 'Power':df['Power'], 'Seats':df['Seats'], 'Price':df['Price']})
print(data)

OUTPUT:

    Car_ID  Brand_l  Model_l  Year  FuelType_l  ...  Mileage  Engine  Power  Seats    Price
0        1        9       15  2018           1  ...       15    1498    108      5   800000
1        2        3       14  2019           1  ...       17    1597    140      5  1000000
2        3        2       30  2017           1  ...       10    4951    395      4  2500000
3        4        6       42  2020           0  ...       23    1248     74      5   600000
4        5        4       41  2016           0  ...       18    1999    194      5   850000
..     ...      ...      ...   ...         ...  ...      ...     ...    ...    ...      ...
95      96        7       11  2019           0  ...       16    1950    191      5  2900000
96      97        9       29  2017           0  ...       13    2755    171      7  1400000
97      98        2       18  2018           1  ...       18    1497    121      5   750000
98      99        4       50  2019           1  ...       17    1497    113      5   850000
99     100        8        7  2020           1  ...       20    1199     85      5   600000

[100 rows x 13 columns]

EX 8:(B)
ENSEMBLE LEARNING REGRESSION
LOADING DATA FOR ENSEMBLE AVERAGING

# Ensemble Technique 1: Averaging


# importing utility modules
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# importing machine learning models for prediction


from sklearn.ensemble import RandomForestRegressor
import xgboost as xgb
from sklearn.linear_model import LinearRegression

# loading train data set in dataframe from train_data.csv file


path = '/content/drive/MyDrive/AIML/2023-24/Lab/car_price.csv'
df = pd.read_csv(path)
df.head(3)

df.shape

OUTPUT:

(100, 13)

ENCODING 'BRAND' FOR ENSEMBLE MODEL

#print(df['Brand'].drop_duplicates())
Brand_c = df['Brand'].unique() # c:categorical
print(Brand_c)

# Label Encoding
from sklearn import preprocessing
label_encoder = preprocessing.LabelEncoder()
df['Brand']= label_encoder.fit_transform(df['Brand'])
Brand_l = df['Brand'].unique() # Brand_l: after label coding
print(Brand_l)

OUTPUT:
[ 9 3 2 6 4 8 5 10 0 1 7]

MODEL ENCODING COMPARISON

Model_c = df['Model'].unique() # c:categorical


print(Model_c)

df['Model']= label_encoder.fit_transform(df['Model'])
Model_l = df['Model'].unique()
print(pd.DataFrame({'Model_c':Model_c, 'Model_l':Model_l}))

OUTPUT:

Model_c Model_l
0 Corolla 15
1 Civic 14
2 Mustang 30
3 Swift 42
4 Sonata 41
5 Nexon 31
6 Scorpio 40
7 Polo 33
8 A4 4
9 X1 53
10 C-Class 11
11 Endeavour 20
12 Creta 16
13 Harrier 27
14 Ertiga 21
15 City 13
16 Tiguan 47
17 Q3 34
18 5 Series 1
19 GLC 25
20 Innova 28
21 Figo 22
22 Verna 50
23 Altroz 7
24 Thar 44
25 Passat 32
26 A6 6
27 X3 54
28 E-Class 17
29 Fortuner 23
30 Aspire 9
31 Elantra 19
32 Safari 38
33 Vitara 51
34 WR-V 52
35 Ameo 8
36 A3 3
37 7 Series 2
38 GLE 26
39 Yaris 57
40 Ranger 36
41 Santro 39
42 Tigor 46
43 S-Cross 37
44 BR-V 10
45 T-Roc 43
46 Q7 35
47 X5 55
48 GLA 24
49 Camry 12
50 Venue 49
51 Tiago 45
52 XUV300 56
53 Vento 48
54 A5 5
55 3 Series 0
56 Innova Crysta 29
57 EcoSport 18

DISPLAYING UNIQUE 'FUEL_TYPE' VALUES

FuelType_c = df['Fuel_Type'].unique() # c:categorical


print(FuelType_c)

OUTPUT:

['Petrol' 'Diesel']
ENCODING FUEL_TYPE & TRANSMISSION

df['Fuel_Type']= label_encoder.fit_transform(df['Fuel_Type'])
FuelType_l = df['Fuel_Type'].unique()
print(pd.DataFrame({'FuelType_c':FuelType_c, 'FuelType_l':FuelType_l}))
Transmission_c = df['Transmission'].unique() # c:categorical
print(Transmission_c)

OUTPUT:

['Manual' 'Automatic']

ENCODING 'TRANSMISSION' COLUMN

df['Transmission']= label_encoder.fit_transform(df['Transmission'])
Transmission_l = df['Transmission'].unique()
print(pd.DataFrame({'Transmission_c':Transmission_c, 'Transmission_l':Transmission_l}))

OUTPUT:

Transmission_c Transmission_l
0 Manual 1
1 Automatic 0

DISPLAYING UNIQUE 'OWNER_TYPE' VALUES

OwnerType_c = df['Owner_Type'].unique() # c:categorical


print(OwnerType_c)

OUTPUT:

['First' 'Second' 'Third']

ENCODING 'OWNER_TYPE' COLUMN

df['Owner_Type']= label_encoder.fit_transform(df['Owner_Type'])
OwnerType_l = df['Owner_Type'].unique()
print(pd.DataFrame({'OwnerType_c':OwnerType_c, 'OwnerType_l':OwnerType_l}))
OUTPUT:

OwnerType_c OwnerType_l
0 First 0
1 Second 1
2 Third 2

FINAL DATA PREPROCESSING

# All numeric now


data = pd.DataFrame({'Car_ID':df['Car_ID'], 'Brand_l':df['Brand'], 'Model_l':df['Model'],
'Year':df['Year'], 'FuelType_l':df['Fuel_Type'], 'Kilometers_Driven':df['Kilometers_Driven'],
'Transmission_l':df['Transmission'], 'OwnerType_l':df['Owner_Type'], 'Mileage':df['Mileage'],
'Engine':df['Engine'], 'Power':df['Power'], 'Seats':df['Seats'], 'Price':df['Price']})
print(data)

OUTPUT:

    Car_ID  Brand_l  Model_l  Year  FuelType_l  Kilometers_Driven  \
0        1        9       15  2018           1              50000
1        2        3       14  2019           1              40000
2        3        2       30  2017           1              20000
3        4        6       42  2020           0              30000
4        5        4       41  2016           0              60000
..     ...      ...      ...   ...         ...                ...
95      96        7       11  2019           0              22000
96      97        9       29  2017           0              38000
97      98        2       18  2018           1              26000
98      99        4       50  2019           1              24000
99     100        8        7  2020           1              18000

    Transmission_l  OwnerType_l  Mileage  Engine  Power  Seats    Price
0                1            0       15    1498    108      5   800000
1                0            1       17    1597    140      5  1000000
2                0            0       10    4951    395      4  2500000
3                1            2       23    1248     74      5   600000
4                0            1       18    1999    194      5   850000
..             ...          ...      ...     ...    ...    ...      ...
95               0            0       16    1950    191      5  2900000
96               1            1       13    2755    171      7  1400000
97               1            2       18    1497    121      5   750000
98               0            1       17    1497    113      5   850000
99               1            0       20    1199     85      5   600000

[100 rows x 13 columns]

ENSEMBLE AVERAGING FOR PRICE PREDICTION

# Class Label
y = data["Price"]

# Feature set
X = data[['Car_ID', 'Brand_l', 'Model_l', 'Year', 'Kilometers_Driven', 'FuelType_l',
'Transmission_l', 'OwnerType_l', 'Mileage', 'Engine', 'Power', 'Seats']]

# Splitting between train data into training and validation dataset


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)

# Ensemble Technique 1: Averaging


# initializing all the model objects with default parameters
model_1 = LinearRegression()
model_2 = xgb.XGBRegressor()
model_3 = RandomForestRegressor()

# training all the model on the training dataset


model_1.fit(X_train, y_train)
model_2.fit(X_train, y_train)
model_3.fit(X_train, y_train)

# predicting the output on the validation dataset


pred_1 = model_1.predict(X_test)
pred_2 = model_2.predict(X_test)
pred_3 = model_3.predict(X_test)

# final prediction after averaging on the prediction of all 3 models


pred_Avg = (pred_1 + pred_2 + pred_3) / 3.0
print(pred_Avg)

# printing the mean squared error between real value and predicted value
print(mean_squared_error(y_test, pred_Avg))

OUTPUT:

[1154492.14103494 1491231.34689431 2590969.44972621 2471757.86852053
 2062117.02588711  469883.43677897 2504765.20794544 1102025.04616787
  457812.15993689 1931025.70920589  677944.32160731  967821.24892562
  909898.05244325 2698853.65961274  351216.67845367 1774569.25234341
  381275.56217277 3153092.10439319  677049.28144696 1288730.16039758]
39457081116.02587
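
To see whether averaging actually helps, the same metric can be computed for each base model individually (a sketch reusing the predictions already computed above):

for name, pred in [('Linear Regression', pred_1), ('XGBoost', pred_2), ('Random Forest', pred_3)]:
    print(name, 'MSE:', mean_squared_error(y_test, pred))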

ENSEMBLE BAGGING FOR PRICE PREDICTION

# Ensemble Technique 2: Bagging


from sklearn.ensemble import BaggingRegressor
from sklearn import tree
model = BaggingRegressor(tree.DecisionTreeRegressor(random_state=1))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print(pd.DataFrame({'y_test':y_test,'y_pred':y_pred}))
print(mean_squared_error(y_test, y_pred))
model.score(X_test, y_test)
OUTPUT:

y_test y_pred
24 1200000 1220000.0
40 1500000 1660000.0
55 2600000 2630000.0
18 3000000 2570000.0
11 2000000 2100000.0
90 500000 520000.0
27 2800000 2485000.0
69 800000 905000.0
51 550000 530000.0
74 2000000 1865000.0
15 650000 790000.0
22 850000 910000.0
81 700000 765000.0
57 2900000 2600000.0
41 450000 480000.0
87 1800000 1870000.0
3 600000 505000.0
85 3200000 3500000.0
77 650000 680000.0
6 900000 1170000.0
32356250000.0

0.963739134698223

ENSEMBLE BOOSTING (ADABOOST) FOR PRICE PREDICTION

# Ensemble Technique 3: Boosting


# ADABOOST

from sklearn.ensemble import AdaBoostRegressor


model = AdaBoostRegressor()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)   # predict with the AdaBoost model (otherwise the bagging predictions would be reused)
print(pd.DataFrame({'y_test':y_test,'y_pred':y_pred}))
print(mean_squared_error(y_test, y_pred))
model.score(X_test,y_test)

OUTPUT:
y_test y_pred
24 1200000 1220000.0
40 1500000 1660000.0
55 2600000 2630000.0
18 3000000 2570000.0
11 2000000 2100000.0
90 500000 520000.0
27 2800000 2485000.0
69 800000 905000.0
51 550000 530000.0
74 2000000 1865000.0
15 650000 790000.0
22 850000 910000.0
81 700000 765000.0
57 2900000 2600000.0
41 450000 480000.0
87 1800000 1870000.0
3 600000 505000.0
85 3200000 3500000.0
77 650000 680000.0
6 900000 1170000.0
32356250000.0

0.9653902329105001

EX 9:(A)

K-MEANS CLUSTERING

SCATTER PLOT VISUALIZATION

import matplotlib.pyplot as plt

x=[2,4,6,8,10,12,14,16,18,20]
y=[23,56,44,56,67,76,65,54,34,32]

plt.scatter(x,y)
plt.show()
OUTPUT:

CREATING DATA FROM X AND Y

data=list(zip(x,y))
print(data)

OUTPUT:

[(2, 23), (4, 56), (6, 44), (8, 56), (10, 67), (12, 76), (14,
65), (16, 54), (18, 34), (20, 32)]

K-MEANS ELBOW METHOD FOR CLUSTER SELECTION

from sklearn.cluster import KMeans


inertias=[]
for i in range(1,11):
    kmeans=KMeans(n_clusters=i)
    kmeans.fit(data)
    inertias.append(kmeans.inertia_)

plt.plot(range(1,11),inertias,marker='.')
plt.title('Elbow method')
plt.xlabel('Number of clusters')
plt.ylabel('Inertias')
plt.show()

OUTPUT:
K-MEANS CLUSTERING RESULTS VISUALIZATION

kmeans=KMeans(n_clusters=2)
kmeans.fit(data)

plt.scatter(x, y, c=kmeans.labels_)
plt.show()

OUTPUT:
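
The fitted estimator also exposes the cluster assignments and centroids directly; a short sketch for inspecting them (using the kmeans object fitted above with n_clusters=2):

print("Cluster labels:", kmeans.labels_)              # cluster index assigned to each (x, y) point
print("Cluster centroids:", kmeans.cluster_centers_)  # mean (x, y) of each cluster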

EX 9:(B)
KNN CLASSIFICATION

LOADING SOCIAL NETWORK ADS DATASET

import pandas as pd

data=pd.read_csv("/content/drive/MyDrive/AIML/Social_Network_Ads.csv")
print(data.head())
print(data.shape)

OUTPUT:
User ID Gender Age EstimatedSalary Purchased
0 15624510 Male 19 19000 0
1 15810944 Male 35 20000 0
2 15668575 Female 26 43000 0
3 15603246 Female 27 57000 0
4 15804002 Male 19 76000 0
(400, 5)

KNN CLASSIFIER TRAINING FOR SOCIAL NETWORK ADS

import numpy as np
import matplotlib.pyplot as plt

X=data.iloc[:,[1,2,3]].values
y=data.iloc[:,-1].values

from sklearn.preprocessing import LabelEncoder


le=LabelEncoder()
X[:,0]=le.fit_transform(X[:,0])

from sklearn.model_selection import train_test_split


X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2,random_state=42)

from sklearn.preprocessing import StandardScaler


sc=StandardScaler()
X_train=sc.fit_transform(X_train)
X_test=sc.transform(X_test)

from sklearn.neighbors import KNeighborsClassifier


classifier=KNeighborsClassifier(n_neighbors=5,metric='minkowski',p=2)
classifier.fit(X_train,y_train)

OUTPUT:

KNN EVALUATION (CONFUSION MATRIX & ACCURACY)

y_pred=classifier.predict(X_test)
from sklearn.metrics import confusion_matrix,accuracy_score
cm=confusion_matrix(y_test,y_pred)
print(cm)
ac=accuracy_score(y_test,y_pred)
print(ac)

OUTPUT:

[[48 4]
[ 2 26]]
0.925
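
The choice k = 5 is arbitrary; a quick way to check it (a sketch on the same scaled split) is to sweep a range of k values and compare test accuracy:

for k in range(1, 16, 2):
    knn = KNeighborsClassifier(n_neighbors=k, metric='minkowski', p=2)
    knn.fit(X_train, y_train)
    print("k =", k, "accuracy =", accuracy_score(y_test, knn.predict(X_test)))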

EX 10:
IMPLEMENT EM FOR BAYESIAN NETWORKS

PROGRAM:

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
from scipy.stats import gaussian_kde
import seaborn as sns
# Generate a dataset with two Gaussian components
mu1, sigma1 = 2, 1
mu2, sigma2 = -1, 0.8
X1 = np.random.normal(mu1, sigma1, size=200)
X2 = np.random.normal(mu2, sigma2, size=600)
X = np.concatenate([X1, X2])
# Plot the density estimation using seaborn
sns.kdeplot(X)
plt.xlabel('X')
plt.ylabel('Density')
plt.title('Density Estimation of X')
plt.show()

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Initialize parameters
mu1_hat, sigma1_hat = np.mean(X1), np.std(X1)
mu2_hat, sigma2_hat = np.mean(X2), np.std(X2)
pi1_hat, pi2_hat = len(X1) / len(X), len(X2) / len(X)

# Perform EM algorithm for 20 epochs


num_epochs = 20
log_likelihoods = []

for epoch in range(num_epochs):
    # E-step: Compute responsibilities
    gamma1 = pi1_hat * norm.pdf(X, mu1_hat, sigma1_hat)
    gamma2 = pi2_hat * norm.pdf(X, mu2_hat, sigma2_hat)
    total = gamma1 + gamma2
    gamma1 /= total
    gamma2 /= total

    # M-step: Update parameters
    mu1_hat = np.sum(gamma1 * X) / np.sum(gamma1)
    mu2_hat = np.sum(gamma2 * X) / np.sum(gamma2)
    sigma1_hat = np.sqrt(np.sum(gamma1 * (X - mu1_hat)**2) / np.sum(gamma1))
    sigma2_hat = np.sqrt(np.sum(gamma2 * (X - mu2_hat)**2) / np.sum(gamma2))
    pi1_hat = np.mean(gamma1)
    pi2_hat = np.mean(gamma2)

    # Compute log-likelihood
    log_likelihood = np.sum(np.log(pi1_hat * norm.pdf(X, mu1_hat, sigma1_hat) +
                                   pi2_hat * norm.pdf(X, mu2_hat, sigma2_hat)))
    log_likelihoods.append(log_likelihood)

# Plot log-likelihood values over epochs


plt.plot(range(1, num_epochs+1), log_likelihoods)
plt.xlabel('Epoch')
plt.ylabel('Log-Likelihood')
plt.title('Log-Likelihood vs. Epoch')
plt.show()

from scipy.stats import gaussian_kde

# Plot the final estimated density


X_sorted = np.sort(X)
density_estimation = (pi1_hat * norm.pdf(X_sorted, mu1_hat, sigma1_hat) +
                      pi2_hat * norm.pdf(X_sorted, mu2_hat, sigma2_hat))
plt.plot(X_sorted, gaussian_kde(X)(X_sorted), color='green', linewidth=2)
plt.plot(X_sorted, density_estimation, color='red', linewidth=2)
plt.xlabel('X')
plt.ylabel('Density')
plt.title('Density Estimation of X')
plt.legend(['Kernel Density Estimation', 'Mixture Density'])
plt.show()
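
The hand-rolled EM loop above can be cross-checked against scikit-learn's two-component Gaussian mixture, which runs the same EM procedure internally (a sketch; GaussianMixture expects a 2-D array, hence the reshape):

from sklearn.mixture import GaussianMixture

gmm = GaussianMixture(n_components=2, random_state=0).fit(X.reshape(-1, 1))
print("weights :", gmm.weights_)                        # compare with pi1_hat, pi2_hat
print("means   :", gmm.means_.ravel())                  # compare with mu1_hat, mu2_hat
print("stddevs :", np.sqrt(gmm.covariances_).ravel())   # compare with sigma1_hat, sigma2_hat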

EX 11:
Build Simple NN Models

PROGRAM:

import numpy as np
import keras
from keras import layers

# Model / data parameters


num_classes = 10
input_shape = (28, 28, 1)

# Load the data and split it between train and test sets
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Scale images to the [0, 1] range


x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255
# Make sure images have shape (28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
print("x_train shape:", x_train.shape)
print(x_train.shape[0], "train samples")
print(x_test.shape[0], "test samples")

# convert class vectors to binary class matrices


y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

x_train shape: (60000, 28, 28, 1)


60000 train samples
10000 test samples

model = keras.Sequential(
[
keras.Input(shape=input_shape),
layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
layers.MaxPooling2D(pool_size=(2, 2)),
layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
layers.MaxPooling2D(pool_size=(2, 2)),
layers.Flatten(),
layers.Dropout(0.5),
layers.Dense(num_classes, activation="softmax"),
]
)

model.summary()
# Train the model
batch_size = 128
epochs = 15

model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)

Epoch 1/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 48s 109ms/step - accuracy: 0.7681 -
loss: 0.7563 - val_accuracy: 0.9772 - val_loss: 0.0895
Epoch 2/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 78s 102ms/step - accuracy: 0.9606 -
loss: 0.1286 - val_accuracy: 0.9840 - val_loss: 0.0600
Epoch 3/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 88s 116ms/step - accuracy: 0.9731 -
loss: 0.0862 - val_accuracy: 0.9868 - val_loss: 0.0475
Epoch 4/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 76s 102ms/step - accuracy: 0.9773 -
loss: 0.0727 - val_accuracy: 0.9875 - val_loss: 0.0450
Epoch 5/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 80s 98ms/step - accuracy: 0.9809 -
loss: 0.0609 - val_accuracy: 0.9890 - val_loss: 0.0396
Epoch 6/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 83s 100ms/step - accuracy: 0.9830 -
loss: 0.0571 - val_accuracy: 0.9878 - val_loss: 0.0388
Epoch 7/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 81s 99ms/step - accuracy: 0.9823 -
loss: 0.0561 - val_accuracy: 0.9890 - val_loss: 0.0390
Epoch 8/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 82s 100ms/step - accuracy: 0.9853 -
loss: 0.0474 - val_accuracy: 0.9910 - val_loss: 0.0334
Epoch 9/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 83s 102ms/step - accuracy: 0.9856 -
loss: 0.0455 - val_accuracy: 0.9907 - val_loss: 0.0341
Epoch 10/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 82s 103ms/step - accuracy: 0.9867 -
loss: 0.0421 - val_accuracy: 0.9918 - val_loss: 0.0320
Epoch 11/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 80s 100ms/step - accuracy: 0.9880 -
loss: 0.0386 - val_accuracy: 0.9913 - val_loss: 0.0311
Epoch 12/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 82s 100ms/step - accuracy: 0.9879 -
loss: 0.0378 - val_accuracy: 0.9910 - val_loss: 0.0297
Epoch 13/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 82s 99ms/step - accuracy: 0.9889 -
loss: 0.0351 - val_accuracy: 0.9923 - val_loss: 0.0273
Epoch 14/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 82s 99ms/step - accuracy: 0.9889 -
loss: 0.0336 - val_accuracy: 0.9910 - val_loss: 0.0314
Epoch 15/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 82s 99ms/step - accuracy: 0.9887 -
loss: 0.0337 - val_accuracy: 0.9922 - val_loss: 0.0285
<keras.src.callbacks.history.History at 0x79aaffd14510>

# Evaluate the trained model


score = model.evaluate(x_test, y_test, verbose=0)
print("Test loss:", score[0])
print("Test accuracy:", score[1])
EX 12:
DEEP NEURAL NETWORK
LOADING MNIST DATASET

import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import numpy as np

(X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()

len(X_train)

OUTPUT:

60000
MNIST TEST SAMPLE COUNT

len(X_test)

OUTPUT:

10000

MNIST TRAINING DATA SHAPE

X_train.shape

OUTPUT:

(60000, 28, 28)

FIRST TRAINING SAMPLE OF MNIST

print(X_train[0])

OUTPUT:

[[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 3 18 18
18 126 136
175 26 166 255 247 127 0 0 0 0]
[ 0 0 0 0 0 0 0 0 30 36 94 154 170 253 253
253 253 253
225 172 253 242 195 64 0 0 0 0]
[ 0 0 0 0 0 0 0 49 238 253 253 253 253 253 253
253 253 251
93 82 82 56 39 0 0 0 0 0]
[ 0 0 0 0 0 0 0 18 219 253 253 253 253 253 198
182 247 241
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 80 156 107 253 253 205 11
0 43 154
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 14 1 154 253 90 0
0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 139 253 190 2
0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 11 190 253 70
0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 35 241 225
160 108 1
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 81 240
253 253 119
25 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 45
186 253 253
150 27 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
16 93 252
253 187 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 249
253 249 64 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 46
130 183 253
253 207 2 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 39 148 229
253 253 253
250 182 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 24 114 221 253 253
253 253 201
78 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 23 66 213 253 253 253 253
198 81 2
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 18 171 219 253 253 253 253 195 80
9 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 55 172 226 253 253 253 253 244 133 11 0
0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 136 253 253 253 212 135 132 16 0 0 0
0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0
0 0 0 0 0 0 0 0 0 0]]

VISUALIZING FIRST TRAINING SAMPLE OF MNIST

plt.matshow(X_train[0])

OUTPUT:

<matplotlib.image.AxesImage at 0x7bb032329610>

FIRST IMAGE LABEL – MNIST

print(y_train[0])

OUTPUT:
5

VISUALIZATION OF 3RD DIGIT IMAGE

plt.matshow(X_train[2])

OUTPUT:

<matplotlib.image.AxesImage at 0x7bb02ff4bc90>
DIGIT LABEL OF 3RD IMAGE IN TRAINING DATASET

print(y_train[2])

OUTPUT:

4
SHAPE OF X_TRAIN_FLATTENED

X_train_flattened=X_train.reshape(len(X_train),28*28)

X_test_flattened=X_test.reshape(len(X_test),28*28)

X_train_flattened.shape

OUTPUT:

(60000, 784)

FIRST FLATTENED IMAGE

X_train_flattened[0]

OUTPUT:

array([ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 18,
18, 18,
126, 136, 175, 26, 166, 255, 247, 127, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 30, 36, 94, 154,
170, 253,
253, 253, 253, 253, 225, 172, 253, 242, 195, 64, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 49, 238, 253,
253, 253,
253, 253, 253, 253, 253, 251, 93, 82, 82, 56, 39,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 18,
219, 253,
253, 253, 253, 253, 198, 182, 247, 241, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
80, 156, 107, 253, 253, 205, 11, 0, 43, 154, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 14, 1, 154, 253, 90, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 139, 253, 190, 2,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 11, 190,
253, 70,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 35,
241, 225, 160, 108, 1, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 81, 240, 253, 253, 119, 25, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 45, 186, 253, 253, 150, 27,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 16, 93, 252,
253, 187,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 249,
253, 249, 64, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
46, 130,
183, 253, 253, 207, 2, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
39, 148,
229, 253, 253, 253, 250, 182, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
24, 114,
221, 253, 253, 253, 253, 201, 78, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
23, 66,
213, 253, 253, 253, 253, 198, 81, 2, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
18, 171,
219, 253, 253, 253, 253, 195, 80, 9, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
55, 172,
226, 253, 253, 253, 253, 244, 133, 11, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
136, 253, 253, 253, 212, 135, 132, 16, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0], dtype=uint8)
SECOND FLATTENED IMAGE

X_train_flattened[2]

OUTPUT:

array([ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 67, 232, 39, 0, 0, 0, 0,
0, 0,
0, 0, 0, 62, 81, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 120, 180, 39, 0, 0,
0, 0,
0, 0, 0, 0, 0, 126, 163, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 2, 153, 210, 40,
0, 0,
0, 0, 0, 0, 0, 0, 0, 220, 163, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 27, 254,
162, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 222, 163,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
183, 254,
125, 0, 0, 0, 0, 0, 0, 0, 0, 0, 46,
245, 163,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
198, 254, 56, 0, 0, 0, 0, 0, 0, 0, 0,
0, 120,
254, 163, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 23, 231, 254, 29, 0, 0, 0, 0, 0, 0,
0, 0,
0, 159, 254, 120, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 163, 254, 216, 16, 0, 0, 0, 0,
0, 0,
0, 0, 0, 159, 254, 67, 0, 0, 0, 0, 0,
0, 0,
0, 0, 14, 86, 178, 248, 254, 91, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 159, 254, 85, 0, 0, 0,
47, 49,
116, 144, 150, 241, 243, 234, 179, 241, 252, 40, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 150, 253, 237, 207,
207, 207,
253, 254, 250, 240, 198, 143, 91, 28, 5, 233, 250,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 119,
177, 177,
177, 177, 177, 98, 56, 0, 0, 0, 0, 0, 102,
254, 220,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 169,
254, 137, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 169, 254, 57, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 169, 254, 57, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 169, 255, 94, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 169, 254, 96, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 169, 254,
153, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
169, 255,
153, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
96, 254, 153, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0,
0, 0, 0, 0], dtype=uint8)

NORMALIZED FIRST FLATTENED IMAGE

X_train_flattened=X_train_flattened/255
X_test_flattened=X_test_flattened/255

X_train_flattened[0]

OUTPUT:

array([0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0.01176471, 0.07058824,
0.07058824,
0.07058824, 0.49411765, 0.53333333, 0.68627451,
0.10196078,
0.65098039, 1. , 0.96862745, 0.49803922,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0.11764706, 0.14117647, 0.36862745,
0.60392157,
0.66666667, 0.99215686, 0.99215686, 0.99215686,
0.99215686,
0.99215686, 0.88235294, 0.6745098 , 0.99215686,
0.94901961,
0.76470588, 0.25098039, 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0.19215686,
0.93333333,
0.99215686, 0.99215686, 0.99215686, 0.99215686,
0.99215686,
0.99215686, 0.99215686, 0.99215686, 0.98431373,
0.36470588,
0.32156863, 0.32156863, 0.21960784, 0.15294118,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0.07058824, 0.85882353, 0.99215686,
0.99215686,
0.99215686, 0.99215686, 0.99215686, 0.77647059,
0.71372549,
0.96862745, 0.94509804, 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0.31372549, 0.61176471, 0.41960784, 0.99215686,
0.99215686,
0.80392157, 0.04313725, 0. , 0.16862745,
0.60392157,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0.05490196,
0.00392157, 0.60392157, 0.99215686, 0.35294118,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0.54509804,
0.99215686, 0.74509804, 0.00784314, 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0.04313725, 0.74509804,
0.99215686,
0.2745098 , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0.1372549 , 0.94509804, 0.88235294,
0.62745098,
0.42352941, 0.00392157, 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0.31764706, 0.94117647, 0.99215686, 0.99215686,
0.46666667,
0.09803922, 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0.17647059,
0.72941176, 0.99215686, 0.99215686, 0.58823529,
0.10588235,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0.0627451 ,
0.36470588,
0.98823529, 0.99215686, 0.73333333, 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0.97647059,
0.99215686,
0.97647059, 0.25098039, 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0.18039216,
0.50980392,
0.71764706, 0.99215686, 0.99215686, 0.81176471,
0.00784314,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0.15294118,
0.58039216, 0.89803922, 0.99215686, 0.99215686,
0.99215686,
0.98039216, 0.71372549, 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0.09411765, 0.44705882, 0.86666667, 0.99215686,
0.99215686,
0.99215686, 0.99215686, 0.78823529, 0.30588235,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0.09019608, 0.25882353, 0.83529412,
0.99215686,
0.99215686, 0.99215686, 0.99215686, 0.77647059,
0.31764706,
0.00784314, 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0.07058824, 0.67058824,
0.85882353,
0.99215686, 0.99215686, 0.99215686, 0.99215686,
0.76470588,
0.31372549, 0.03529412, 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0.21568627,
0.6745098 ,
0.88627451, 0.99215686, 0.99215686, 0.99215686,
0.99215686,
0.95686275, 0.52156863, 0.04313725, 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0.53333333, 0.99215686, 0.99215686,
0.99215686,
0.83137255, 0.52941176, 0.51764706, 0.0627451 ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ,
0. ,
0. , 0. , 0. , 0. ])

TRAINING A SIMPLE NN FOR MNIST CLASSIFICATION

model=keras.Sequential([
keras.layers.Dense(10,input_shape=(784,),activation='sigmoid')
])

model.compile(
optimizer="adam",
loss="sparse_categorical_crossentropy",
metrics=["accuracy"]
)
model.fit(X_train_flattened,y_train,epochs=5)

OUTPUT:

Epoch 1/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - accuracy: 0.8133 -
loss: 0.7189
Epoch 2/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 6s 2ms/step - accuracy: 0.9153 -
loss: 0.3040
Epoch 3/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 8s 4ms/step - accuracy: 0.9219 -
loss: 0.2839
Epoch 4/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 7s 2ms/step - accuracy: 0.9233 -
loss: 0.2702
Epoch 5/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 5s 3ms/step - accuracy: 0.9255 -
loss: 0.2680

<keras.src.callbacks.history.History at 0x7bb0326f7a50>

EVALUATING THE MODEL PERFORMANCE ON TEST DATA

model.evaluate(X_test_flattened,y_test)

OUTPUT:

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.9143 -


loss: 0.3019

[0.26684650778770447, 0.925000011920929]

MODEL PREDICTION ON TEST DATA

y_predict=model.predict(X_test_flattened)

OUTPUT:

313/313 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step

VISUALIZATION OF TEST IMAGE AT INDEX 5

plt.matshow(X_test[5])

OUTPUT:

<matplotlib.image.AxesImage at 0x7bb0119fbf90>
PREDICTED LABEL FOR TEST IMAGE AT INDEX 5

y_predict[5]

OUTPUT:

array([7.6672986e-05, 9.9796927e-01, 5.5210060e-01, 4.4936424e-01,
       1.0058323e-02, 2.7723329e-02, 1.2036454e-02, 6.0284495e-01,
       6.0808712e-01, 1.7743033e-01], dtype=float32)

PREDICTING DIGIT CLASS FROM MODEL OUTPUT

print(np.argmax(y_predict[5]))

OUTPUT:

1
DISPLAYING TRUE LABELS FOR FIRST 5 TEST DATA POINTS

y_test[:5]

OUTPUT:

array([7, 2, 1, 0, 4], dtype=uint8)

DISPLAYING PREDICTED LABELS FOR FIRST 5 TEST DATA POINTS


y_predicted_Labels=[np.argmax(i) for i in y_predict]

y_predicted_Labels[:5]

OUTPUT:

[np.int64(7), np.int64(2), np.int64(1), np.int64(0), np.int64(4)]

CONFUSION MATRIX FOR PREDICTIONS

tf.math.confusion_matrix(labels=y_test,predictions=y_predicted_Labels)

OUTPUT:

<tf.Tensor: shape=(10, 10), dtype=int32, numpy=
array([[ 960,    0,    2,    2,    0,    5,    7,    2,    2,    0],
       [   0, 1109,    3,    2,    0,    1,    4,    2,   14,    0],
       [   5,    8,  928,   17,    8,    4,   12,    8,   39,    3],
       [   1,    0,   19,  934,    1,   15,    2,    8,   23,    7],
       [   2,    1,    5,    1,  916,    0,   10,    2,   10,   35],
       [   8,    3,    5,   46,   10,  759,   11,    7,   36,    7],
       [  14,    3,    9,    1,    7,   11,  906,    1,    6,    0],
       [   1,    5,   23,    9,   10,    1,    0,  938,    6,   35],
       [   5,    6,    7,   22,    9,   18,    8,    5,  887,    7],
       [  10,    7,    1,   11,   32,    7,    0,   16,   12,  913]],
      dtype=int32)>

CONFUSION MATRIX WITH HEATMAP

import seaborn as sn
cm=tf.math.confusion_matrix(labels=y_test,predictions=y_predicted_Labels)
plt.figure(figsize=(10,10))
sn.heatmap(cm,annot=True,fmt="d")
plt.xlabel("Predicted")
plt.ylabel("Truth")

Text(95.72222222222221, 0.5, 'Truth')

OUTPUT:
