AI&ML Lab Manual
LOYOLA INSTITUTE OF TECHNOLOGY
Palanchur, Chennai – 600123
MACHINE LEARNING LABORATORY
IV – SEMESTER
CERTIFICATE
CONTENTS
EX.NO:1 IMPLEMENTATION OF UNINFORMED SEARCH ALGORITHMS (BFS, DFS)
Date:
AIM:
To implement the uninformed search algorithms BFS and DFS and verify their traversal output on a sample graph.
ALGORITHM(BFS):
Step 1: Initialize all nodes to the ready state (STATUS = 1)
Step 2: Enqueue the starting node A and set its STATUS = 2 (waiting state)
Step 3: Repeat Steps 4 and 5 until the queue is empty
Step 4: Dequeue a node N. Process it and set its STATUS = 3 (processed state)
Step 5: Enqueue all the neighbours of N that are in the ready state (whose STATUS = 1) and set their STATUS = 2 (waiting state) [END OF LOOP]
Step 6: EXIT
ALGORITHM(DFS):
Step 1: Initialize all nodes to the ready state (STATUS = 1)
Step 2: Push the starting node A onto the stack and set its STATUS = 2 (waiting state)
Step 3: Repeat Steps 4 and 5 until the stack is empty
Step 4: Pop the top node N. Process it and set its STATUS = 3 (processed state)
Step 5: Push onto the stack all the neighbours of N that are in the ready state (whose STATUS = 1) and set their STATUS = 2 (waiting state) [END OF LOOP]
Step 6: EXIT
PROGRAM(BFS):
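# The Graph class definition up to the neighbour loop is elided in this
# copy; a plausible reconstruction (assumed: standard adjacency-list BFS)
# is sketched here, with the surviving fragment continuing inside the loop.
from collections import defaultdict

class Graph:
    def __init__(self):
        self.graph = defaultdict(list)

    def addEdge(self, u, v):
        self.graph[u].append(v)

    def BFS(self, s):
        visited = [False] * (max(self.graph) + 1)
        queue = [s]
        visited[s] = True
        while queue:
            s = queue.pop(0)          # dequeue and process the node
            print(s, end=" ")
            for i in self.graph[s]:   # enqueue ready-state neighbours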
                if visited[i] == False:
                    queue.append(i)
                    visited[i] = True
g = Graph()
g.addEdge(0, 1)
g.addEdge(0, 2)
g.addEdge(1, 2)
g.addEdge(2, 0)
g.addEdge(2, 3)
g.addEdge(3, 3)
print("Following is Breadth First Traversal (starting from vertex 2)")
g.BFS(2)
OUTPUT(BFS):
PROGRAM(DFS):
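No DFS listing survives in this copy. The following is a minimal sketch consistent with the algorithm above (assumed: a recursive DFS on the same sample graph, using the call stack in place of the explicit stack described in the algorithm):

from collections import defaultdict

class Graph:
    def __init__(self):
        self.graph = defaultdict(list)

    def addEdge(self, u, v):
        self.graph[u].append(v)

    def DFSUtil(self, v, visited):
        visited.add(v)            # mark the node processed
        print(v, end=" ")
        for neighbour in self.graph[v]:
            if neighbour not in visited:
                self.DFSUtil(neighbour, visited)

    def DFS(self, v):
        self.DFSUtil(v, set())

g = Graph()
g.addEdge(0, 1)
g.addEdge(0, 2)
g.addEdge(1, 2)
g.addEdge(2, 0)
g.addEdge(2, 3)
g.addEdge(3, 3)
print("Following is Depth First Traversal (starting from vertex 2)")
g.DFS(2)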
OUTPUT(DFS):
RESULT:
Thus the uninformed search algorithms BFS and DFS were executed successfully and the output was verified.
EX.NO:2 IMPLEMENTATION OF INFORMED SEARCH ALGORITHM (A*)
Date:
AIM:
To implement the informed search algorithm A* and find the least-cost path between two nodes of a graph.
ALGORITHM(A*):
Step 1: Place the starting node in the OPEN list and set its cost g = 0.
Step 2: Repeat Steps 3 to 5 until the OPEN list is empty.
Step 3: Select the node n in OPEN with the lowest f(n) = g(n) + h(n), where h is the heuristic estimate of the distance to the goal.
Step 4: If n is the goal node, reconstruct the path by following the recorded parents and EXIT.
Step 5: Otherwise move n to the CLOSED list and, for each neighbour m of n, update g(m) and its parent if a cheaper path through n is found, placing m back in OPEN if necessary.
PROGRAM(A*):
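Only the driver call survives below. The following sketch (assumed: the usual lab graph over nodes 'A'..'J' with a fixed heuristic table, consistent with the call aStarAlgo('A', 'J')) shows one way aStarAlgo could be implemented:

def aStarAlgo(start_node, stop_node):
    open_set = {start_node}
    closed_set = set()
    g = {start_node: 0}                 # cost from the start node
    parents = {start_node: start_node}  # parent map for path reconstruction
    while open_set:
        # pick the open node with the lowest f = g + h
        n = min(open_set, key=lambda v: g[v] + heuristic(v))
        if n == stop_node:
            path = []
            while parents[n] != n:
                path.append(n)
                n = parents[n]
            path.append(start_node)
            path.reverse()
            print('Path found: {}'.format(path))
            return path
        open_set.remove(n)
        closed_set.add(n)
        for (m, weight) in get_neighbors(n):
            if m in closed_set:
                continue
            # adopt the cheaper path through n
            if m not in open_set or g[n] + weight < g.get(m, float('inf')):
                g[m] = g[n] + weight
                parents[m] = n
                open_set.add(m)
    print('Path does not exist!')
    return None

def get_neighbors(v):
    return Graph_nodes.get(v, [])

def heuristic(n):
    # assumed heuristic table for the sample graph
    H_dist = {'A': 11, 'B': 6, 'C': 5, 'D': 7, 'E': 3,
              'F': 6, 'G': 5, 'H': 3, 'I': 1, 'J': 0}
    return H_dist[n]

# assumed adjacency list with edge costs
Graph_nodes = {
    'A': [('B', 6), ('F', 3)],
    'B': [('A', 6), ('C', 3), ('D', 2)],
    'C': [('B', 3), ('D', 1), ('E', 5)],
    'D': [('B', 2), ('C', 1), ('E', 8)],
    'E': [('C', 5), ('D', 8), ('I', 5), ('J', 5)],
    'F': [('A', 3), ('G', 1), ('H', 7)],
    'G': [('F', 1), ('I', 3)],
    'H': [('F', 7), ('I', 2)],
    'I': [('E', 5), ('G', 3), ('H', 2), ('J', 3)],
    'J': [('E', 5), ('I', 3)],
}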
aStarAlgo('A', 'J')
OUTPUT(A*):
RESULT:
Thus the program to implement the informed search algorithm A* was executed successfully and the output was verified.
EX.NO:3 IMPLEMENT NAÏVE BAYES MODELS.
Date:
AIM:
To diagnose heart patients and predict disease using the heart disease dataset with the Naïve Bayes Classifier algorithm.
ALGORITHM:
1. Load the heart disease dataset and split it into training (75%) and test (25%) sets.
2. For each class, compute the mean and standard deviation of every attribute from the training set.
3. For each test record, compute the Gaussian class-conditional likelihood of every attribute and multiply by the class prior.
4. Predict the class with the highest posterior probability.
5. Report accuracy, confusion matrix, F1 score and the ROC curve over repeated splits.
PROGRAM:
NB_from_scratch.py
import csv
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score, roc_curve, auc
import matplotlib.pyplot as plt
from itertools import cycle
from scipy import interp
import warnings
import random
import math
warnings.filterwarnings("ignore")
# Example of Naive Bayes implemented from Scratch in Python
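# Loading of the heart-disease CSV is elided in this copy; a plausible
# completion (assumed: a comma-separated file whose last column is the
# class label) is:
with open('heart_disease.csv', 'r') as csvfile:
    dataset = list(csv.reader(csvfile))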
def mean(columnvalues):
    s = 0
    n = float(len(columnvalues))
    for i in range(len(columnvalues)):
        s = s + float(columnvalues[i])
    return s / n
for z in range(5):
    print("\n\n\nTest Train Split no. ", z + 1, "\n\n\n")
    trainsize = int(len(dataset) * 0.75)
    trainset = []
    testset = list(dataset)
    for i in range(trainsize):
        index = random.randrange(len(testset))
        trainset.append(testset.pop(index))
class_data[class_num] = class_datarow
y_pred.append(resultant_class)
# Getting Accuracy
count = 0
for i in range(len(testset)):
    if testset[i][-1] == y_pred[i]:
        count += 1
accuracy = (count / float(len(testset))) * 100.0
print("\n\n Accuracy: ", accuracy, "%")
print("\n\n\n\nF1 Score")
f_score = f1_score(y1, y_pred1, average='weighted')
print(f_score)
y2 = np.zeros(shape=(len(y1), 5))
y3 = np.zeros(shape=(len(y_pred1), 5))
for i in range(len(y1)):
    y2[i][int(y1[i])] = 1
for i in range(len(y_pred1)):
    y3[i][int(y_pred1[i])] = 1
# ROC Curve generation
n_classes = 5
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(n_classes):
    fpr[i], tpr[i], _ = roc_curve(y2[:, i], y3[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
# (the computation of all_fpr and mean_tpr is elided in this copy)
fpr["macro"] = all_fpr
tpr["macro"] = mean_tpr
roc_auc["macro"] = auc(fpr["macro"], tpr["macro"])
plt.plot(fpr["macro"], tpr["macro"],
         label='macro-average (area = {0:0.2f})'.format(roc_auc["macro"]),
         color='navy', linestyle=':', linewidth=4)
NB_from_Gaussian_Sklearn.py
import csv
import pandas as pd
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn.metrics import confusion_matrix, f1_score, roc_curve, auc
import matplotlib.pyplot as plt
from itertools import cycle
from scipy import interp
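# Reading the heart-disease CSV into a DataFrame is elided here; a
# plausible completion (assumed: 13 feature columns followed by the class
# column) is:
df = pd.read_csv('heart_disease.csv', header=None)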
training_x = df.iloc[1:df.shape[0], 0:13]
# print(training_set)
training_y = df.iloc[1:df.shape[0], 13:14]
# print(testing_set)
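# (assumed) convert to arrays for the train/test split below
x = np.array(training_x)
y = np.array(training_y)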
for z in range(5):
    print("\n\n\nTest Train Split no. ", z + 1, "\n\n\n")
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=None)
    # Gaussian Naive Bayes from sklearn
    gnb = GaussianNB()
    gnb.fit(x_train, y_train.ravel())
    y_pred = gnb.predict(x_test)
    # (conversion of y_test/y_pred to the arrays y1/y_pred1 is elided in this copy)
    print("\n\n\n\nConfusion Matrix")
    cf_matrix = confusion_matrix(y1, y_pred1)
    print(cf_matrix)
    print("\n\n\n\nF1 Score")
    f_score = f1_score(y1, y_pred1, average='weighted')
    print(f_score)
y2 = np.zeros(shape=(len(y1), 5))
y3 = np.zeros(shape=(len(y_pred1), 5))
for i in range(len(y1)):
    y2[i][int(y1[i])] = 1
for i in range(len(y_pred1)):
    y3[i][int(y_pred1[i])] = 1
# ROC Curve generation
n_classes = 5
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(n_classes):
    fpr[i], tpr[i], _ = roc_curve(y2[:, i], y3[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
# (the computation of all_fpr and mean_tpr is elided in this copy)
fpr["macro"] = all_fpr
tpr["macro"] = mean_tpr
roc_auc["macro"] = auc(fpr["macro"], tpr["macro"])
plt.plot(fpr["macro"], tpr["macro"],
         label='macro-average (area = {0:0.2f})'.format(roc_auc["macro"]),
         color='navy', linestyle=':', linewidth=4)
OUTPUT:
RESULT:
Thus the program to diagnose heart patients and predict disease using the heart disease dataset with the Naïve Bayes Classifier algorithm was executed successfully and the output was verified.
EX.NO:4 IMPLEMENT BAYESIAN NETWORKS
Date:
AIM:
To construct a Bayesian network to demonstrate the diagnosis of heart patients using the standard Heart Disease Data Set.
ALGORITHM:
1. Read the heart disease records and encode each attribute (age, gender, family history, diet, lifestyle, cholesterol, heart disease) as a categorical value.
2. Define a Dirichlet prior and a Categorical node for each attribute.
3. Observe the data to learn the conditional probability distribution of heart disease given the other attributes.
4. Query the network interactively for the probability of heart disease given user-entered evidence.
PROGRAM:
import bayespy as bp
import numpy as np
import csv
from colorama import init
from colorama import Fore, Back, Style
init()
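# Reading the encoded dataset is elided in this copy; a plausible
# completion (assumed: an integer-encoded CSV with columns age, gender,
# familyhistory, diet, lifestyle, cholesterol, heartdisease) is:
data = np.array([row for row in csv.reader(open('heart_disease_data.csv'))], dtype=int)
N = data.shape[0]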
p_age = bp.nodes.Dirichlet(1.0*np.ones(5))
age = bp.nodes.Categorical(p_age, plates=(N,))
age.observe(data[:,0])
p_gender = bp.nodes.Dirichlet(1.0*np.ones(2))
gender = bp.nodes.Categorical(p_gender, plates=(N,))
gender.observe(data[:,1])
p_familyhistory = bp.nodes.Dirichlet(1.0*np.ones(2))
familyhistory = bp.nodes.Categorical(p_familyhistory, plates=(N,))
familyhistory.observe(data[:,2])
p_diet = bp.nodes.Dirichlet(1.0*np.ones(3))
diet = bp.nodes.Categorical(p_diet, plates=(N,))
diet.observe(data[:,3])
p_lifestyle = bp.nodes.Dirichlet(1.0*np.ones(4))
lifestyle = bp.nodes.Categorical(p_lifestyle, plates=(N,))
lifestyle.observe(data[:,4])
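# The attribute enums and the cholesterol/heart-disease nodes are elided in
# this copy; a plausible completion (assumed, following the pattern of the
# nodes above, with enum values taken from the OUTPUT section) is:
ageEnum = {'SuperSeniorCitizen': 0, 'SeniorCitizen': 1, 'MiddleAged': 2, 'Youth': 3, 'Teen': 4}
genderEnum = {'Male': 0, 'Female': 1}
familyHistoryEnum = {'Yes': 0, 'No': 1}
dietEnum = {'High': 0, 'Medium': 1, 'Low': 2}
lifeStyleEnum = {'Athlete': 0, 'Active': 1, 'Moderate': 2, 'Sedetary': 3}
cholesterolEnum = {'High': 0, 'BorderLine': 1, 'Normal': 2}
heartDiseaseEnum = {'Yes': 0, 'No': 1}
p_cholesterol = bp.nodes.Dirichlet(1.0*np.ones(3))
cholesterol = bp.nodes.Categorical(p_cholesterol, plates=(N,))
cholesterol.observe(data[:,5])
# heart disease conditioned on all six parent attributes
p_heartdisease = bp.nodes.Dirichlet(np.ones(2), plates=(5, 2, 2, 3, 4, 3))
heartdisease = bp.nodes.MultiMixture([age, gender, familyhistory, diet, lifestyle, cholesterol],
                                     bp.nodes.Categorical, p_heartdisease)
heartdisease.observe(data[:,6])
p_heartdisease.update()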
m = 0
while m == 0:
    print("\n")
    res = bp.nodes.MultiMixture([int(input('Enter Age: ' + str(ageEnum))),
                                 int(input('Enter Gender: ' + str(genderEnum))),
                                 int(input('Enter FamilyHistory: ' + str(familyHistoryEnum))),
                                 int(input('Enter dietEnum: ' + str(dietEnum))),
                                 int(input('Enter LifeStyle: ' + str(lifeStyleEnum))),
                                 int(input('Enter Cholesterol: ' + str(cholesterolEnum)))],
                                bp.nodes.Categorical,
                                p_heartdisease).get_moments()[0][heartDiseaseEnum['Yes']]
    print("Probability(HeartDisease) = " + str(res))
    m = int(input("Enter for Continue:0, Exit :1 "))
OUTPUT:
Enter Age: {'SuperSeniorCitizen': 0, 'SeniorCitizen': 1, 'MiddleAged': 2, 'Youth': 3, 'Teen': 4}1
Enter Gender: {'Male': 0, 'Female': 1}0
Enter FamilyHistory: {'Yes': 0, 'No': 1}
Enter dietEnum: {'High': 0, 'Medium': 1, 'Low': 2}2
Enter LifeStyle: {'Athlete': 0, 'Active': 1, 'Moderate': 2, 'Sedetary': 3}2
Enter Cholesterol: {'High': 0, 'BorderLine': 1, 'Normal': 2}1
Probability(HeartDisease) = 0.5
Enter for Continue:0, Exit :1 1
RESULT:
Thus the program to construct a Bayesian network on the given heart disease dataset was executed successfully and the output was verified.
EX.NO:5 BUILD REGRESSION MODELS
Date:
AIM:
To build regression models such as locally weighted linear regression and plot the necessary graphs.
ALGORITHM:
1. Read the Given data Sample to X and the curve (linear or non-linear) to Y
2. Set the value of the smoothing (free) parameter tau
3. Set the point of interest x0, a subset of X
4. Determine the weight matrix using:
   w(x, x0) = exp(-(x - x0)^2 / (2 * tau^2))
5. Determine the model parameter beta using:
   beta = (X^T W X)^(-1) X^T W y
6. Prediction = x0 * beta.
PROGRAM:
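# The opening of the lowess() function is elided in this copy; a plausible
# reconstruction (assumed: the standard robust iterative fit with a tricube
# kernel, matching the surviving fragment below) is:
import numpy as np

def lowess(x, y, f, iterations):
    n = len(x)
    r = int(np.ceil(f * n))                      # neighbourhood size
    h = [np.sort(np.abs(x - x[i]))[r] for i in range(n)]
    w = np.clip(np.abs((x[:, None] - x[None, :]) / h), 0.0, 1.0)
    w = (1 - w ** 3) ** 3                        # tricube weights
    yest = np.zeros(n)
    delta = np.ones(n)
    for iteration in range(iterations):
        for i in range(n):
            weights = delta * w[:, i]
            b = np.array([np.sum(weights * y), np.sum(weights * y * x)])
            A = np.array([[np.sum(weights), np.sum(weights * x)],
                          [np.sum(weights * x), np.sum(weights * x * x)]])
            beta = np.linalg.solve(A, b)
            yest[i] = beta[0] + beta[1] * x[i]
        # robustness weights from the residuals (surviving fragment below)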
        residuals = y - yest
        s = np.median(np.abs(residuals))
        delta = np.clip(residuals / (6.0 * s), -1, 1)
        delta = (1 - delta ** 2) ** 2
    return yest
import math

n = 100
x = np.linspace(0, 2 * math.pi, n)
y = np.sin(x) + 0.3 * np.random.randn(n)
f = 0.25
iterations = 3
yest = lowess(x, y, f, iterations)
OUTPUT:
RESULT:
Thus the program to implement non-parametric Locally Weighted Regression algorithm in
order to fit data points with a graph visualization have been executedsuccessfully.
EX.NO:6 BUILD DECISION TREES AND RANDOM FORESTS
Date:
AIM:
To implement the concept of decision trees with a suitable dataset from real-world problems using the CART algorithm.
ALGORITHM:
STEPS IN CART ALGORITHM:
1. It begins with the original set S as the root node.
2. On each iteration, the algorithm iterates through every unused attribute of the set S and calculates the Gini index of this attribute.
3. The Gini index works with the categorical target variable "Success" or "Failure". It performs only binary splits.
4. The set S is then split by the selected attribute to produce a subset of the data.
5. The algorithm continues to recur on each subset, considering only attributes never selected before.
PROGRAM:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('/Users/ganesh/PycharmProjects/DecisionTree/Social_Network_Ads.csv')
data.head()
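# The feature selection, train/test split, and classifier construction are
# elided in this copy; a plausible completion (assumed: the usual columns of
# the Social_Network_Ads dataset) is:
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

feature_cols = ['Age', 'EstimatedSalary']
x = data[feature_cols]
y = data['Purchased']
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=0)
classifier = DecisionTreeClassifier(criterion='gini')
classifier.fit(x_train, y_train)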
y_pred = classifier.predict(x_test)
from sklearn import metrics
print('Accuracy Score:', metrics.accuracy_score(y_test, y_pred))
from sklearn.tree import export_graphviz
from six import StringIO
from IPython.display import Image
import pydotplus

dot_data = StringIO()
export_graphviz(classifier, out_file=dot_data, filled=True, rounded=True,
                special_characters=True, feature_names=feature_cols, class_names=['0', '1'])
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())
graph.write_png('decisiontree.png')
Image('decisiontree.png')
dot_data = StringIO()
export_graphviz(classifier, out_file=dot_data, filled=True, rounded=True,
                special_characters=True, feature_names=feature_cols, class_names=['0', '1'])
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())
graph.write_png('opt_decisiontree_gini.png')
Image('opt_decisiontree_gini.png')
Optimized output of decision tree using Gini Index (CART):
RESULT:
Thus the program to implement the concept of decision trees with a suitable dataset from real-world problems using the CART algorithm was executed successfully.
EX.NO:7 BUILD SVM MODELS
Date:
AIM:
To create a machine learning model which classifies Spam and Ham e-mails from a given dataset using the Support Vector Machine algorithm.
ALGORITHM:
1. Read the e-mail dataset and explore the distribution of spam and ham messages.
2. Clean each message by removing punctuation and stop words.
3. Convert the cleaned text into count vectors using CountVectorizer.
4. Split the vectors into training and test sets and train an SVM classifier.
5. Evaluate the model with accuracy, F1 score, recall, precision and the ROC curve.
PROGRAM:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import string
from nltk.corpus import stopwords
import os
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator
from PIL import Image
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import roc_curve, auc
from sklearn import metrics
from sklearn import model_selection
from sklearn import svm
from nltk import word_tokenize
from sklearn.metrics import roc_auc_score
from matplotlib import pyplot
from sklearn.metrics import plot_confusion_matrix
class data_read_write(object):
    def __init__(self):
        pass
    def __init__(self, file_link):
        self.data_frame = pd.read_csv(file_link)
    def read_csv_file(self, file_link):
        return self.data_frame
    def write_to_csvfile(self, file_link):
        self.data_frame.to_csv(file_link, encoding='utf-8', index=False, header=True)
        return
class generate_word_cloud(data_read_write):
    def __init__(self):
        pass
    def variance_column(self, data):
        return np.var(data)
    def word_cloud(self, data_frame_column, output_image_file):
        text = " ".join(review for review in data_frame_column)
        stopwords = set(STOPWORDS)
        stopwords.update(["subject"])
        wordcloud = WordCloud(width=1200, height=800, stopwords=stopwords,
                              max_font_size=50, margin=0,
                              background_color="white").generate(text)
        plt.imshow(wordcloud, interpolation='bilinear')
        plt.axis("off")
        plt.savefig("Distribution.png")
        plt.show()
        wordcloud.to_file(output_image_file)
        return
class data_cleaning(data_read_write):
    def __init__(self):
        pass
    def message_cleaning(self, message):
        Test_punc_removed = [char for char in message if char not in string.punctuation]
        Test_punc_removed_join = ''.join(Test_punc_removed)
        Test_punc_removed_join_clean = [word for word in Test_punc_removed_join.split()
                                        if word.lower() not in stopwords.words('english')]
        final_join = ' '.join(Test_punc_removed_join_clean)
        return final_join
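    # apply_to_column() is used later in the driver code but its definition
    # is elided in this copy; a plausible completion (assumed) is:
    def apply_to_column(self, data_column):
        return data_column.apply(self.message_cleaning)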
class apply_embeddding_and_model(data_read_write):
    def __init__(self):
        pass
    # method name reconstructed from the call cv_object.apply_count_vector() below
    def apply_count_vector(self, v_data_column):
        vectorizer = CountVectorizer(min_df=2, analyzer="word", tokenizer=None,
                                     preprocessor=None, stop_words=None)
        return vectorizer.fit_transform(v_data_column)
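    # The opening of apply_svm() is elided in this copy; a plausible
    # reconstruction (assumed: a linear SVC evaluated with the metrics shown
    # in the OUTPUT section) is:
    def apply_svm(self, X, y):
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
        classifier = svm.SVC(kernel='linear', C=1.0)
        classifier.fit(X_train, y_train)
        y_pred = classifier.predict(X_test)
        print("test set")
        print("Accuracy Score:", metrics.accuracy_score(y_test, y_pred))
        print("F1 Score:", metrics.f1_score(y_test, y_pred))
        print("Recall:", metrics.recall_score(y_test, y_pred))
        print("Precision:", metrics.precision_score(y_test, y_pred))
        lr_fpr, lr_tpr, _ = roc_curve(y_test, classifier.decision_function(X_test))
        # the surviving plotting fragment continues below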
        pyplot.plot(lr_fpr, lr_tpr, marker='.', label='SVM')
        pyplot.xlabel('False Positive Rate')
        pyplot.ylabel('True Positive Rate')
        pyplot.legend()
        pyplot.savefig("SVMMat.png")
        pyplot.show()
        return
data_obj = data_read_write("emails.csv")
data_frame = data_obj.read_csv_file("processed.csv")
data_frame.head()
data_frame.tail()
data_frame.describe()
data_frame.info()
data_frame.head()
data_frame.groupby('spam').describe()
data_frame['length'] = data_frame['text'].apply(len)
data_frame['length'].max()
sns.set(rc={'figure.figsize':(11.7,8.27)})
ham_messages_length = data_frame[data_frame['spam']==0]
spam_messages_length = data_frame[data_frame['spam']==1]
data_frame[data_frame['spam']==0].text.values
sns.set(rc={'figure.figsize':(11.7,8.27)})
# (assumed) word counts per message, used by the distribution plots below
ham_words_length = ham_messages_length['text'].apply(lambda x: len(word_tokenize(x)))
spam_words_length = spam_messages_length['text'].apply(lambda x: len(word_tokenize(x)))
ax = sns.distplot(ham_words_length, norm_hist=True, bins=30, label='Ham')
ax = sns.distplot(spam_words_length, norm_hist=True, bins=30, label='Spam')
plt.title('Distribution of Number of Words')
plt.xlabel('Number of Words')
plt.legend()
plt.savefig("SVMGraph.png")
plt.show()
def mean_word_length(x):
    word_lengths = np.array([])
    for word in word_tokenize(x):
        word_lengths = np.append(word_lengths, len(word))
    return word_lengths.mean()
ham_meanword_length = data_frame[data_frame['spam']==0].text.apply(mean_word_length)
spam_meanword_length = data_frame[data_frame['spam']==1].text.apply(mean_word_length)
stop_words = set(stopwords.words('english'))  # (assumed) used by the ratio below
def stop_words_ratio(x):
    num_total_words = 0
    num_stop_words = 0
    for word in word_tokenize(x):
        if word in stop_words:
            num_stop_words += 1
        num_total_words += 1
    return num_stop_words / num_total_words
ham = data_frame[data_frame['spam']==0]
spam = data_frame[data_frame['spam']==1]
spam['length'].plot(bins=60, kind='hist')
ham['length'].plot(bins=60, kind='hist')
data_frame['Ham(0) and Spam(1)'] = data_frame['spam']
print( 'Spam percentage =', (len(spam) / len(data_frame) )*100,"%")
print( 'Ham percentage =', (len(ham) / len(data_frame) )*100,"%")
sns.countplot(data_frame['Ham(0) and Spam(1)'], label = "Count")
data_clean_obj = data_cleaning()
data_frame['clean_text'] = data_clean_obj.apply_to_column(data_frame['text'])
data_frame.head()
data_obj.data_frame.head()
data_obj.write_to_csvfile("processed_file.csv")
cv_object = apply_embeddding_and_model()
spamham_countvectorizer = cv_object.apply_count_vector(data_frame['clean_text'])
X = spamham_countvectorizer
label = data_frame['spam'].values
y = label
cv_object.apply_svm(X,y)
OUTPUT:
test set
Accuracy Score: 0.9895287958115183
F1 Score: 0.9776119402985075
Recall: 0.9739776951672863
Precision: 0.9812734082397003
RESULT:
Thus the program to create a machine learning model which classifies Spam and Ham e-mails from a given dataset using the Support Vector Machine algorithm was executed successfully.
EX.NO:8 IMPLEMENT ENSEMBLING TECHNIQUES
Date:
AIM:
To implement the ensembling technique of Blending with the given Alcohol QCM
Dataset.
ALGORITHM:
1. Split the training dataset into train, test and validation dataset.
2. Fit all the base models using train dataset.
3. Make predictions on validation and test dataset.
4. These predictions are used as features to build a second-level model.
5. This model is used to make predictions on the test set and meta-features.
PROGRAM:
import pandas as pd
from sklearn.metrics import mean_squared_error
from sklearn.ensemble import RandomForestRegressor
import xgboost as xgb
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
df = pd.read_csv("train_data.csv")
target = df["target"]
train = df.drop("target", axis=1)
X_train, X_test, y_train, y_test = train_test_split(train, target, test_size=0.20)
train_ratio = 0.70
validation_ratio = 0.20
test_ratio = 0.10
x_train, x_test, y_train, y_test = train_test_split(
train, target, test_size=1 - train_ratio)
x_val, x_test, y_val, y_test = train_test_split(
x_test, y_test, test_size=test_ratio/(test_ratio + validation_ratio))
model_1 = LinearRegression()
model_2 = xgb.XGBRegressor()
model_3 = RandomForestRegressor()
model_1.fit(x_train, y_train)
val_pred_1 = model_1.predict(x_val)
test_pred_1 = model_1.predict(x_test)
val_pred_1 = pd.DataFrame(val_pred_1)
test_pred_1 = pd.DataFrame(test_pred_1)
model_2.fit(x_train, y_train)
val_pred_2 = model_2.predict(x_val)
test_pred_2 = model_2.predict(x_test)
val_pred_2 = pd.DataFrame(val_pred_2)
test_pred_2 = pd.DataFrame(test_pred_2)
model_3.fit(x_train, y_train)
val_pred_3 = model_3.predict(x_val)
test_pred_3 = model_3.predict(x_test)
val_pred_3 = pd.DataFrame(val_pred_3)
test_pred_3 = pd.DataFrame(test_pred_3)
df_val = pd.concat([x_val, val_pred_1, val_pred_2, val_pred_3], axis=1)
df_test = pd.concat([x_test, test_pred_1, test_pred_2, test_pred_3], axis=1)
final_model = LinearRegression()
final_model.fit(df_val, y_val)
final_pred = final_model.predict(df_test)
print(mean_squared_error(y_test, final_pred))
OUTPUT:
4790
RESULT:
Thus the program to implement the ensembling technique of blending with the given Alcohol QCM dataset was executed successfully and the output was verified.
EX.NO:9 IMPLEMENT CLUSTERING ALGORITHMS
Date:
AIM:
To implement the k-Nearest Neighbour algorithm to classify the Iris dataset.
ALGORITHM:
Step-1: Select the number K of the neighbors
Step-2: Calculate the Euclidean distance of K number of neighbors
Step-3: Take the K nearest neighbors as per the calculated Euclidean distance.
Step-4: Among these k neighbors, count the number of the data points in each
category.
Step-5: Assign the new data points to that category for which the number of the
neighbor is maximum.
Step-6: Our model is ready.
PROGRAM:
import pandas as pd
import numpy as np
from sklearn import datasets
iris=datasets.load_iris()
iris_data=iris.data
iris_labels=iris.target
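# The train/test split, classifier, and prediction are elided in this copy;
# a plausible completion (assumed: k = 5 with an 80/20 split, matching the
# support of 30 shown in the output) is:
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report
x_train, x_test, y_train, y_test = train_test_split(iris_data, iris_labels, test_size=0.2)
classifier = KNeighborsClassifier(n_neighbors=5)
classifier.fit(x_train, y_train)
y_pred = classifier.predict(x_test)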
print("accuracy is")
print(classification_report(y_test, y_pred))
OUTPUT:
accuracy is
precision recall f1-score support
accuracy 0.97 30
macro avg 0.96 0.98 0.97 30
weighted avg 0.97 0.97 0.97 30
RESULT:
Thus the program to implement the k-Nearest Neighbour algorithm on the Iris dataset was executed successfully and the output was verified.
EX.NO:10 IMPLEMENT EM FOR BAYESIAN NETWORKS
Date:
AIM:
To implement the EM algorithm for clustering networks using the given dataset.
ALGORITHM:
Initialize θ randomly.
Repeat until convergence:
  E-step: Compute q(h) = P(H = h | E = e; θ) for each h (probabilistic inference);
  create fully-observed weighted examples (h, e) with weight q(h).
  M-step: Maximum likelihood (count and normalize) on the weighted examples to get θ.
PROGRAM:
from sklearn.cluster import KMeans
from sklearn import preprocessing
from sklearn.mixture import GaussianMixture
from sklearn.datasets import load_iris
import sklearn.metrics as sm
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
dataset=load_iris()
# print(dataset)
X=pd.DataFrame(dataset.data)
X.columns=['Sepal_Length','Sepal_Width','Petal_Length','Petal_Width']
y=pd.DataFrame(dataset.target)
y.columns=['Targets']
# print(X)
plt.figure(figsize=(14,7))
colormap=np.array(['red','lime','black'])
# REAL PLOT
plt.subplot(1,3,1)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y.Targets], s=40)
plt.title('Real')
# K-PLOT
plt.subplot(1,3,2)
model = KMeans(n_clusters=3)
model.fit(X)
predY = np.choose(model.labels_, [0, 1, 2]).astype(np.int64)
plt.scatter(X.Petal_Length,X.Petal_Width,c=colormap[predY],s=40)
plt.title('KMeans')
# GMM PLOT
scaler=preprocessing.StandardScaler()
scaler.fit(X)
xsa=scaler.transform(X)
xs=pd.DataFrame(xsa,columns=X.columns)
gmm = GaussianMixture(n_components=3)
gmm.fit(xs)
y_cluster_gmm = gmm.predict(xs)
plt.subplot(1,3,3)
plt.scatter(X.Petal_Length,X.Petal_Width,c=colormap[y_cluster_gmm],s=40)
plt.title('GMM Classification')
OUTPUT:
RESULT:
Thus the program to implement the EM algorithm for clustering networks using the given dataset was executed successfully and the output was verified.
EX.NO:11 BUILD SIMPLE NN MODELS
Date:
AIM:
To build a simple multilayer perceptron neural network model that recognizes handwritten characters from the EMNIST dataset.
ALGORITHM:
1. Image Acquisition: The first step is to acquire images of paper documents with the help of optical scanners. This way, an original image can be captured and stored.
2. Pre-processing: The noise level on an image should be optimized and areas outside the text removed. Pre-processing is especially vital for recognizing handwritten documents, which are more sensitive to noise.
3. Segmentation: The process of segmentation is aimed at grouping characters into meaningful chunks. There can be predefined classes for characters, so images can be scanned for patterns that match the classes.
4. Feature Extraction: This step means splitting the input data into a set of features, that is, finding the essential characteristics that make one or another pattern recognizable.
5. Training an MLP neural network using the following steps:
   1. Starting with the input layer, propagate data forward to the output layer. This step is the forward propagation.
   2. Based on the output, calculate the error (the difference between the predicted and known outcome). The error needs to be minimized.
   3. Backpropagate the error. Find its derivative with respect to each weight in the network, and update the model.
6. Post-processing: This stage is the process of refinement, as an OCR model can require some corrections. However, it isn't possible to achieve 100% recognition accuracy; the identification of characters heavily depends on the context.
PROGRAM:
from __future__ import print_function
import numpy as np
import tensorflow as tf
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers import Conv2D, MaxPooling2D, Flatten
from keras.optimizers import RMSprop, SGD
from keras.optimizers import Adam
from keras.utils import np_utils
from emnist import list_datasets
from emnist import extract_training_samples
from emnist import extract_test_samples
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
np.random.seed(1671)  # for reproducibility
# network and training
NB_EPOCH = 30
BATCH_SIZE = 256
VERBOSE = 2
NB_CLASSES = 256  # number of outputs = number of classes
OPTIMIZER = Adam()
N_HIDDEN = 512
VALIDATION_SPLIT = 0.2  # how much TRAIN is reserved for VALIDATION
DROPOUT = 0.20
print(list_datasets())
X_train, y_train = extract_training_samples('byclass')
print("train shape: ", X_train.shape)
print("train labels: ",y_train.shape)
X_test, y_test = extract_test_samples('byclass')
print("test shape: ",X_test.shape)
print("test labels: ", y_test.shape)
# for indexing from 0
y_train = y_train - 1
y_test = y_test - 1
RESHAPED = len(X_train[0]) * len(X_train[1])
X_train = X_train.reshape(len(X_train), RESHAPED)
X_test = X_test.reshape(len(X_test), RESHAPED)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
# normalize
X_train /= 255
X_test /= 255
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, NB_CLASSES)
Y_test = np_utils.to_categorical(y_test, NB_CLASSES)
# N_HIDDEN hidden layers
# final stage is softmax
model = Sequential()
model.add(Dense(N_HIDDEN, input_shape=(RESHAPED,)))
model.add(Activation('relu'))
model.add(Dropout(DROPOUT))
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(DROPOUT))
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(DROPOUT))
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(DROPOUT))
model.add(Dense(NB_CLASSES))
model.add(Activation('softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',
optimizer=OPTIMIZER,
metrics=['accuracy'])
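# Training and evaluation are elided in this copy; the usual next step
# (assumed, using the hyperparameters defined above) would be:
history = model.fit(X_train, Y_train, batch_size=BATCH_SIZE, epochs=NB_EPOCH,
                    verbose=VERBOSE, validation_split=VALIDATION_SPLIT)
score = model.evaluate(X_test, Y_test, verbose=VERBOSE)
print('Test accuracy:', score[1])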
OUTPUT:
['balanced', 'byclass', 'bymerge', 'digits', 'letters', 'mnist']
train shape: (697932, 28, 28)
train labels: (697932,)
test shape: (116323, 28, 28)
test labels: (116323,)
697932 train samples
116323 test samples
Model: "sequential"
activation_2 (Activation) (None, 256) 0
dropout_2 (Dropout) (None, 256) 0
dense_3 (Dense) (None, 256) 65792
activation_3 (Activation) (None, 256) 0
dropout_3 (Dropout) (None, 256) 0
dense_4 (Dense) (None, 256) 65792
activation_4 (Activation) (None, 256) 0
=================================================================
Total params: 730,624
Trainable params: 730,624
Non-trainable params: 0
RESULT:
Thus the program to implement a simple neural network model for the given dataset was executed successfully and the output was verified.
EX.NO:12 BUILD DEEP LEARNING NN MODELS
Date:
AIM:
To implement and build a Convolutional Neural Network model which predicts the age and gender of a person using the given pre-trained models.
ALGORITHM:
STEPS IN CNN ALGORITHM:
Step-1: Choose the Dataset.
Step-2: Prepare the Dataset for training.
Step-3: Create training Data.
Step-4: Shuffle the Dataset.
Step-5: Assigning Labels and Features.
Step-6: Normalising X and converting labels to categorical data.
Step-7: Split X and Y for use in CNN.
Step-8: Define, compile and train the CNN Model.
Step-9: Accuracy and Score of the model.
PROGRAM:
import cv2 as cv
import math
import time
from google.colab.patches import cv2_imshow
MODEL_MEAN_VALUES = (78.4263377603, 87.7689143744, 114.895847746)
ageList = ['(0-2)', '(4-6)', '(8-12)', '(15-20)', '(25-32)', '(38-43)', '(48-53)', '(60-100)']
genderList = ['Male', 'Female']
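# The network files, the getFaceBox() helper, and the padding value are
# elided in this copy; a plausible setup (assumed: the standard pre-trained
# OpenCV face detector and Caffe age/gender models) is:
faceProto = "opencv_face_detector.pbtxt"
faceModel = "opencv_face_detector_uint8.pb"
ageProto = "age_deploy.prototxt"
ageModel = "age_net.caffemodel"
genderProto = "gender_deploy.prototxt"
genderModel = "gender_net.caffemodel"
faceNet = cv.dnn.readNet(faceModel, faceProto)
ageNet = cv.dnn.readNet(ageModel, ageProto)
genderNet = cv.dnn.readNet(genderModel, genderProto)
padding = 20

def getFaceBox(net, frame, conf_threshold=0.7):
    # detect faces and return the annotated frame plus bounding boxes
    frameOut = frame.copy()
    h, w = frameOut.shape[:2]
    blob = cv.dnn.blobFromImage(frameOut, 1.0, (300, 300), [104, 117, 123], True, False)
    net.setInput(blob)
    detections = net.forward()
    bboxes = []
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if confidence > conf_threshold:
            x1 = int(detections[0, 0, i, 3] * w)
            y1 = int(detections[0, 0, i, 4] * h)
            x2 = int(detections[0, 0, i, 5] * w)
            y2 = int(detections[0, 0, i, 6] * h)
            bboxes.append([x1, y1, x2, y2])
            cv.rectangle(frameOut, (x1, y1), (x2, y2), (0, 255, 0), 2)
    return frameOut, bboxes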
def age_gender_detector(frame):
    # Read frame
    t = time.time()
    frameFace, bboxes = getFaceBox(faceNet, frame)
    for bbox in bboxes:
        # print(bbox)
        face = frame[max(0, bbox[1]-padding):min(bbox[3]+padding, frame.shape[0]-1),
                     max(0, bbox[0]-padding):min(bbox[2]+padding, frame.shape[1]-1)]
        blob = cv.dnn.blobFromImage(face, 1.0, (227, 227), MODEL_MEAN_VALUES, swapRB=False)
        genderNet.setInput(blob)
        genderPreds = genderNet.forward()
        gender = genderList[genderPreds[0].argmax()]
        # print("Gender Output : {}".format(genderPreds))
        print("Gender : {}, conf = {:.3f}".format(gender, genderPreds[0].max()))
        ageNet.setInput(blob)
        agePreds = ageNet.forward()
        age = ageList[agePreds[0].argmax()]
        print("Age Output : {}".format(agePreds))
        print("Age : {}, conf = {:.3f}".format(age, agePreds[0].max()))
        label = "{},{}".format(gender, age)
        cv.putText(frameFace, label, (bbox[0], bbox[1]-10), cv.FONT_HERSHEY_SIMPLEX,
                   0.8, (0, 255, 255), 2, cv.LINE_AA)
    return frameFace
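# Example usage (assumed: a test image named sample.jpg in the working
# directory, displayed with the Colab image helper imported above):
input_image = cv.imread("sample.jpg")
output = age_gender_detector(input_image)
cv2_imshow(output)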
OUTPUT:
RESULT:
Thus the program to implement and build a Convolutional Neural Network model which predicts the age and gender of a person using the given pre-trained models was executed successfully and the output was verified.