DLT Record Final

EXPERIMENT-1

AIM: Build a Convolutional Neural Network for Image Recognition.

Description:

What is Image Recognition?

Image recognition is the task of identifying an object within an image and classifying it into a specific category,
modelled on the way humans recognize objects across different sets of images.

Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are a class of deep learning models designed to automatically learn and extract
hierarchical features from images. CNNs consist of layers that perform convolution, pooling, and fully connected
operations. Convolutional layers apply filters to input data, capturing local patterns and edges. Pooling layers
downsample feature maps, retaining important information while reducing computation. Fully connected layers make
decisions based on the learned features. CNNs excel in image classification, object detection, and segmentation tasks
due to their ability to capture spatial hierarchies of features.

Steps to Build an Image Recognition Model using CNN

Before we train a CNN model, let’s build a basic, Fully Connected Neural Network for the dataset. The basic steps to
build an image classification model using a neural network are:

1)Flatten the input image dimensions to 1D (width pixels x height pixels)

2)Normalize the image pixel values (divide by 255)

3)One-Hot Encode the categorical column

4)Build a model architecture (Sequential) with Dense layers (Fully connected layers)

5)Train the model and make predictions
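
For reference, a minimal sketch of such a fully connected baseline might look like the following (layer sizes and epoch count are illustrative, not taken from this record; it assumes 32 x 32 x 3 inputs and 10 classes as in CIFAR-10):

# Minimal fully connected baseline (illustrative sizes; assumes 32x32x3 inputs and 10 classes)
import keras

num_classes = 10
fc_model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=(32, 32, 3)),          # step 1: flatten to 1D (3072 values)
    keras.layers.Dense(512, activation='relu'),              # fully connected hidden layer
    keras.layers.Dense(num_classes, activation='softmax')    # one output node per class
])
fc_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Steps 2, 3 and 5: normalize pixels, one-hot encode labels, then train
# X = X.astype('float32') / 255.
# y = keras.utils.to_categorical(y, num_classes)
# fc_model.fit(X, y, epochs=5, validation_split=0.2)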

Identifying Images From the CIFAR-10 Dataset Using CNNs


Defining Dataset

The CIFAR-10 dataset consists of 60,000 32 x 32 color images in 10 classes, with 6,000 images per class. There are
50,000 training images and 10,000 test images.

The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains
exactly 1000 randomly selected images from each class. The training batches contain the remaining images in random
order, but some training batches may contain more images from one class than another. Between them, the training
batches contain exactly 5000 images from each class.

The important points that distinguish this dataset from MNIST are:

Images are colored in CIFAR-10 as compared to the black-and-white texture of MNIST.

Each image is 32 x 32 pixels.

50,000 training images and 10,000 testing images. The classes are completely mutually exclusive. There is no overlap
between automobiles and trucks. "Automobile" includes sedans, SUVs, things of that sort. "Truck" includes only big
trucks. Neither includes pickup trucks.
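
As an aside, if the zipped copy used below is not at hand, CIFAR-10 can also be loaded directly through Keras; a small sketch (the download is handled by Keras itself):

# Alternative to the zipped copy on Drive: load CIFAR-10 directly through Keras
from keras.datasets import cifar10

(x_tr, y_tr), (x_te, y_te) = cifar10.load_data()
print(x_tr.shape, y_tr.shape)   # (50000, 32, 32, 3) (50000, 1)
print(x_te.shape, y_te.shape)   # (10000, 32, 32, 3) (10000, 1)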
Source code:
Import the required Libraries

This code imports essential libraries for data manipulation, visualization, and machine learning in Python, including NumPy,
Pandas, Matplotlib, Seaborn, OpenCV, and TensorFlow with Keras. It also includes modules for image recognition and
tools for creating and working with convolutional neural networks.

from matplotlib import pyplot as plt

%matplotlib inline

from sklearn.preprocessing import LabelEncoder
import keras

import pandas as pd
import numpy as np

from PIL import Image
import os

import warnings

warnings.filterwarnings('ignore')

Unzip dataset

This command unzips the "cifar10.zip" file quietly in the specified Google Drive directory, commonly used to extract
datasets in image recognition tasks.

!unzip -q "/content/drive/MyDrive/deepl/cifar10.zip"

The code below reads a CSV file containing the labels for the CIFAR-10 images into a Pandas DataFrame, setting the first
column as the index. It then prints the label of the image at index 5, giving information about the content of that image.
labels.shape returns the shape of the DataFrame (number of rows and columns), which gives an idea of the size and
structure of the label data associated with the images.

labels = pd.read_csv("/content/drive/MyDrive/deepl/cifar10Labels.csv", index_col=0)

# View an image

img_idx = 5

print(labels.label[img_idx])

Image.open('cifar10/'+str(img_idx)+'.png')
labels.shape

automobile
(50000, 1)

The code performs a random split of the CIFAR-10 labels into training and testing sets and then stores the
corresponding indexes, so that the same split can be reused in the subsequent image-loading steps.

# Splitting data into Train and Test data

from sklearn.model_selection import train_test_split

y_train, y_test = train_test_split(labels.label, test_size=0.3, random_state=42)

train_idx, test_idx = y_train.index, y_test.index  # Storing indexes for later use

The below code reads and processes images from the CIFAR-10 dataset for training. It creates a NumPy array X_train
containing the float32 representations of the training images, ready for use in image recognition model training.
# Reading images for training
temp = []

for img_idx in y_train.index:
    img_path = os.path.join('cifar10/', str(img_idx) + '.png')
    img = np.array(Image.open(img_path)).astype('float32')
    temp.append(img)

X_train = np.stack(temp)

This code reads and converts images from the test set into a NumPy array, facilitating their use as input for image
recognition models during the testing phase. The resulting array X_test contains the pixel data of the test images.

# Reading images for testing
temp = []

for img_idx in y_test.index:
    img_path = os.path.join('cifar10/', str(img_idx) + '.png')
    img = np.array(Image.open(img_path)).astype('float32')
    temp.append(img)

X_test = np.stack(temp)

Displays a subset of the normalized testing image data, showing the pixel values after normalization, which is crucial for
efficient training of neural networks in image recognition tasks.

# Normalizing image data
X_train = X_train/255.
X_test = X_test/255.

y_train.shape
X_test[6:7]

array([[[[0.9882353 , 0.9882353 , 0.9843137 ],

[0.9607843 , 0.9764706 , 0.9647059 ],

[0.90588236, 0.92941177, 0.91764706],

...,

[0.79607844, 0.87058824, 0.8980392 ],

[0.7764706 , 0.8509804 , 0.8784314 ],

[0.7921569 , 0.8666667 , 0.8980392 ]],

[[0.972549 , 0.98039216, 0.98039216],

[0.9411765 , 0.9607843 , 0.9607843 ],

[0.91764706, 0.9529412 , 0.9490196 ],

...,

[0.79607844, 0.85490197, 0.87058824],

[0.8039216 , 0.8627451 , 0.8784314 ],

[0.79607844, 0.85490197, 0.8745098 ]],

[[0.94509804, 0.9647059 , 0.9764706 ],

[0.9137255 , 0.9490196 , 0.95686275],

[0.90588236, 0.9490196 , 0.95686275],

...,

[0.74509805, 0.79607844, 0.8 ],

[0.7882353 , 0.8352941 , 0.8392157 ],


[0.78431374, 0.83137256, 0.8352941 ]],

...,

[[0.39607844, 0.45882353, 0.5921569 ],

[0.34117648, 0.44313726, 0.6 ],

[0.34117648, 0.41960785, 0.5411765 ],

...,

[0.21960784, 0.1764706 , 0.18039216],

[0.27450982, 0.23137255, 0.23921569],

[0.3372549 , 0.29411766, 0.3019608 ]],

[[0.49803922, 0.54509807, 0.627451 ],

[0.34117648, 0.42352942, 0.54509807],

[0.3529412 , 0.4392157 , 0.5764706 ],

...,

[0.1882353 , 0.16078432, 0.16078432],

[0.25882354, 0.23137255, 0.23529412],

[0.33333334, 0.30588236, 0.30980393]],

[[0.74509805, 0.7764706 , 0.8117647 ],

[0.41960785, 0.47058824, 0.5568628 ],

[0.3647059 , 0.4392157 , 0.5568628 ],

...,

[0.21176471, 0.2 , 0.2 ],

[0.25490198, 0.23921569, 0.24313726],

[0.3372549 , 0.32156864, 0.3254902 ]]]], dtype=float32)

Converts the encoded numerical labels into one-hot encoded format using Keras utility function to_categorical, creating a
binary matrix representation for each class, suitable for training a multi-class image recognition model.

# One-hot encoding 10 output classes
encode_X = LabelEncoder()

encode_X_fit = encode_X.fit_transform(y_train)

y_train = keras.utils.to_categorical(encode_X_fit)
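
For intuition, a small toy example (labels invented for illustration) of what LabelEncoder followed by to_categorical produces:

# Toy example of LabelEncoder + to_categorical (labels invented for illustration)
from sklearn.preprocessing import LabelEncoder
from keras.utils import to_categorical

toy_labels = ['cat', 'dog', 'cat', 'ship']
encoded = LabelEncoder().fit_transform(toy_labels)   # [0, 1, 0, 2]
print(to_categorical(encoded))
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [1. 0. 0.]
#  [0. 0. 1.]]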

The code constructs a CNN model with two convolutional layers, each followed by batch normalization and max-pooling, a
flattening layer, and a fully-connected layer with softmax activation for classification into 10 classes. Regularization is
applied to the convolutional layers for improved generalization.

# Defining CNN network

num_classes = 10

model = keras.models.Sequential([
    # Adding first convolutional layer
    keras.layers.Conv2D(filters=32, kernel_size=(3, 3), strides=1, padding='same', activation='relu',
                        kernel_regularizer=keras.regularizers.l2(0.001), input_shape=(32, 32, 3), name='Conv_1'),
    # Normalizing the parameters from the last layer to speed up the performance (optional)
    keras.layers.BatchNormalization(name='BN_1'),
    # Adding first pooling layer
    keras.layers.MaxPool2D(pool_size=(2, 2), name='MaxPool_1'),
    # Adding second convolutional layer
    keras.layers.Conv2D(filters=64, kernel_size=(3, 3), strides=1, padding='same', activation='relu',
                        kernel_regularizer=keras.regularizers.l2(0.001), name='Conv_2'),
    keras.layers.BatchNormalization(name='BN_2'),
    # Adding second pooling layer
    keras.layers.MaxPool2D(pool_size=(2, 2), name='MaxPool_2'),
    # Flattens the input
    keras.layers.Flatten(name='Flat'),
    # Fully-Connected layer
    keras.layers.Dense(num_classes, activation='softmax', name='pred_layer')
])

The summary offers a comprehensive overview of the model architecture, aiding in understanding layer configurations
and parameter counts. It helps ensure the proper design and efficient training of the Convolutional Neural Network for
image recognition.

model.summary()

Model: "sequential"

Layer (type) Output Shape Param #


=================================================================
Conv_1 (Conv2D) (None, 32, 32, 32) 896
BN_1 (BatchNormalization) (None, 32, 32, 32) 128

MaxPool_1 (MaxPooling2D) (None, 16, 16, 32) 0

Conv_2 (Conv2D) (None, 16, 16, 64) 18496

BN_2 (BatchNormalization) (None, 16, 16, 64) 256

MaxPool_2 (MaxPooling2D) (None, 8, 8, 64) 0

Flat (Flatten) (None, 4096) 0

pred_layer (Dense) (None, 10) 40970

=================================================================

Total params: 60746 (237.29 KB)

Trainable params: 60554 (236.54 KB)

Non-trainable params: 192 (768.00 Byte)
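
The parameter counts in the summary can be checked by hand with the usual formulas (kernel height x width x input channels + 1 bias per filter for Conv2D; 4 parameters per channel for BatchNormalization, of which 2 are non-trainable; inputs x outputs + outputs for Dense). A small sketch of that arithmetic:

# Verifying the parameter counts reported by model.summary()
conv1 = (3 * 3 * 3 + 1) * 32      # 3x3 kernel, 3 input channels, +1 bias per filter -> 896
bn1   = 4 * 32                    # gamma, beta, moving mean, moving variance -> 128
conv2 = (3 * 3 * 32 + 1) * 64     # -> 18496
bn2   = 4 * 64                    # -> 256
dense = 4096 * 10 + 10            # flattened 8*8*64 = 4096 inputs, 10 outputs -> 40970
print(conv1 + bn1 + conv2 + bn2 + dense)   # 60746 total; 2*(32+64) = 192 of these are non-trainable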


Compiles the model and trains it on the normalized, one-hot encoded training data (X_train, y_train) for 5 epochs, using
20% of the data for validation, with a ModelCheckpoint callback intended to save the best weights. Note that the callback
monitors 'val_acc', while recent Keras versions log the metric as 'val_accuracy'; this mismatch is what triggers the truncated
"Can save best model only wit..." warnings in the output below, so no checkpoint is actually saved. Monitoring 'val_accuracy'
instead avoids the issue.

# Compiling the model

model.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.Adam(), metrics=['accuracy'])

cpfile = r'CIFAR10_checkpoint.hdf5'  # Weights to be stored in HDF5 format

cb_checkpoint = keras.callbacks.ModelCheckpoint(cpfile, monitor='val_acc', verbose=1, save_best_only=True,
                                                mode='max')
epochs = 5

model.fit(X_train, y_train, epochs=epochs, validation_split=0.2, callbacks=[cb_checkpoint])

Epoch 1/5
875/875 [==============================] - ETA: 0s - loss: 1.7468 - accuracy: 0.4747
WARNING:tensorflow:Can save best model only wit
875/875 [==============================] - 72s 81ms/step - loss: 1.7468 - accuracy: 0.4747 - val_loss: 2.3047 - val_accuracy: 0.381
Epoch 2/5
875/875 [==============================] - ETA: 0s - loss: 1.2392 - accuracy: 0.6043
WARNING:tensorflow:Can save best model only wit
875/875 [==============================] - 69s 79ms/step - loss: 1.2392 - accuracy: 0.6043 - val_loss: 1.4320 - val_accuracy: 0.554
Epoch 3/5
875/875 [==============================] - ETA: 0s - loss: 1.0461 - accuracy: 0.6601
WARNING:tensorflow:Can save best model only wit
875/875 [==============================] - 83s 95ms/step - loss: 1.0461 - accuracy: 0.6601 - val_loss: 1.3571 - val_accuracy: 0.570
Epoch 4/5
875/875 [==============================] - ETA: 0s - loss: 0.9319 - accuracy: 0.6989
WARNING:tensorflow:Can save best model only wit
875/875 [==============================] - 75s 86ms/step - loss: 0.9319 - accuracy: 0.6989 - val_loss: 1.4030 - val_accuracy: 0.585
Epoch 5/5
875/875 [==============================] - ETA: 0s - loss: 0.8606 - accuracy: 0.7212
WARNING:tensorflow:Can save best model only wit
875/875 [==============================] - 66s 76ms/step - loss: 0.8606 - accuracy: 0.7212 - val_loss: 1.3002 - val_accuracy: 0.590

<keras.src.callbacks.History at 0x7b164663da80>

Creates a Pandas DataFrame with columns 'predicted' and 'actual' to compare predicted and actual labels for the first 10
test images, aiding in model evaluation and result visualization

# << DeprecationWarning: The truth value of an empty array is ambiguous >> can arise with NumPy versions higher than 1.13.3.
# The issue will be fixed in an upcoming version.

# pred = encode_X.inverse_transform(model.predict_classes(X_test[:10]))
pred = np.argmax(model.predict(X_test[:10]), axis=-1)
pred = encode_X.inverse_transform(pred)
act = y_test[:10]

res = pd.DataFrame([pred, act]).T
res.columns = ['predicted', 'actual']
res

1/1 [==============================] - 0s 154ms/step


predicted     actual
cat           horse
ship          ship
frog          airplane
frog          frog
cat           automobile
frog          frog
ship          ship
ship          airplane
frog          frog
frog          dog

Prints the training and testing accuracy scores, rounded to five decimal places, providing a quantitative measure of the
model's performance on both datasets.

# Printing the train and test accuracy
from mlxtend.evaluate import scoring

train_acc = scoring(encode_X.inverse_transform(np.argmax(model.predict(X_train), axis=-1)),
                    encode_X.inverse_transform([np.argmax(x) for x in y_train]))
test_acc = scoring(encode_X.inverse_transform(np.argmax(model.predict(X_test), axis=-1)), y_test)

print('Train accuracy: ', np.round(train_acc, 5))
print('Test accuracy: ', np.round(test_acc, 5))

1094/1094 [==============================] - 24s 22ms/step

469/469 [==============================] - 11s 24ms/step

Train accuracy: 0.33417

Test accuracy: 0.403

The code uses mlxtend to compute and plot confusion matrices for training and testing datasets, providing insights into
the model's performance by visualizing the distribution of predicted and actual class labels.

from mlxtend.evaluate import confusion_matrix
from mlxtend.plotting import plot_confusion_matrix

def plot_cm(cm, text):
    class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
    plot_confusion_matrix(conf_mat=cm, colorbar=True, figsize=(8, 8), cmap='Greens',
                          show_absolute=False, show_normed=True)
    tick_marks = np.arange(len(class_names))
    plt.xticks(tick_marks, class_names, rotation=45, fontsize=12)
    plt.yticks(tick_marks, class_names, fontsize=12)
    plt.xlabel('Predicted label', fontsize=14)
    plt.ylabel('True label', fontsize=14)
    plt.title(text, fontsize=19, weight='bold')
    plt.show()

# Confusion matrix on train data
train_cm = confusion_matrix(y_target=encode_X.inverse_transform([np.argmax(x) for x in y_train]),
                            y_predicted=encode_X.inverse_transform(np.argmax(model.predict(X_train), axis=-1)),
                            binary=False)
plot_cm(train_cm, 'Confusion Matrix on Train Data')

# Confusion matrix on test data
test_cm = confusion_matrix(y_target=y_test,
                           y_predicted=encode_X.inverse_transform(np.argmax(model.predict(X_test), axis=-1)),
                           binary=False)
plot_cm(test_cm, 'Confusion Matrix on Test Data')

1094/1094 [==============================] - 24s 22ms/step

469/469 [==============================] - 9s 19ms/step


EXPERIMENT-2

AIM: Identifying age group of an actor

Description:

Age Detection : Our goal here is to create a program that will predict the age group of the person using an image.

We will use the Indian Movie Face Database (IMFDB), created by Shankar Setty et al., as a benchmark for facial
recognition with wide variation. The database consists of thousands of images of 50+ actors taken from more than 100
videos. Since the database has been created manually by cropping the images from the videos, there is high variability in
terms of pose, expression, illumination, resolution, etc. The original database provides many attributes.

In this scenario, we will use a cleaned and formatted data set with 26742 images split as 19906 train images and 6636
test images respectively. The target here is to use the images and predict the age of the actor/actress within the
available classes i.e. young, middle and old making it a multi-class classification problem.

We will resize all the images to 32 x 32. All the images have red, blue and green colour components; therefore, the
final shape becomes 32 x 32 x 3, giving us a total of 3072 nodes for the input layer.

Next, we will choose one hidden layer to start with, with 500 nodes, giving a total of 1,536,500 parameters between the
input and the hidden layer (3072 x 500 weights plus 500 biases). We will use the ReLU activation function in this layer.

Next, we have the output layer with only three classes and hence three nodes, giving a total of 1,503 parameters
(500 x 3 weights plus 3 biases) between the hidden and output layer. In this layer, we will use the Softmax activation function.
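
Putting those numbers together, a minimal sketch of the described network (sizes taken from the text above; the variable name age_model is illustrative):

# Sketch of the fully connected network described above (sizes from the text; name age_model is illustrative)
import keras
from keras.models import Sequential
from keras.layers import InputLayer, Flatten, Dense

age_model = Sequential([
    InputLayer(input_shape=(32, 32, 3)),    # 32 x 32 x 3 = 3072 input values
    Flatten(),
    Dense(500, activation='relu'),          # 3072*500 + 500 = 1,536,500 parameters
    Dense(3, activation='softmax')          # young / middle / old: 500*3 + 3 = 1,503 parameters
])
age_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])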

Dataset:
You can download the train and test data sets. In each directory, you will find a folder consisting of images along with an
excel file which has two columns, ID and Class. The ID column consists of image names like 352.jpg and Class column
holds the respective image character’s age like Old.

The training data is the biggest (in size) subset of the original dataset, which is used to train or fit the machine learning
model. Firstly, the training data is fed to the ML algorithms, which lets them learn how to make predictions for the given
task.

Once we train the model with the training dataset, it's time to test the model with the test dataset. This dataset
evaluates the performance of the model and ensures that the model can generalize well with the new or unseen
dataset. The test dataset is another subset of original data, which is independent of the training dataset.
Splitting the dataset into train and test sets is one of the important parts of data pre-processing, as by doing so, we can
improve the performance of our model and hence give better predictability.

Therefore, if we train the model on one dataset and then evaluate it on a completely unrelated dataset, the measured
performance of the model will suffer. Hence it is important to split a single dataset into two parts, i.e., train and test set.

In this way, we can easily evaluate the performance of our model. Such as, if it performs well with the training data, but
does not perform well with the test dataset, then it is estimated that the model may be overfitted.

The main difference between training data and testing data is that training data is the subset of original data that is used
to train the machine learning model, whereas testing data is used to check the accuracy of the model.

The training dataset is generally larger in size compared to the testing dataset.

In a dataset, a training set is implemented to build up a model, while a test (or validation) set is to validate the model
built. Data points in the training set are excluded from the test (validation) set.
Source Code:
Age Detection of Indian Actors

The line import sys is a Python statement that imports the sys module into your script or interactive session. The sys
module is part of the Python standard library and provides access to some variables and functions that interact with the
Python interpreter.

Here's a brief explanation of the sys module:

Purpose: sys stands for "system." The module provides access to some variables used or maintained by the Python
interpreter and functions that interact with the interpreter. The code sys.modules[__name__].__dict__.clear() is a way to
clear the namespace (dictionary of names) of the current Python module. Let's break down the terms used in the code:

sys: The sys module is a part of the Python standard library that provides access to some variables used or maintained by
the interpreter and functions that interact with the interpreter. In this case, it's used to access the modules attribute.

sys.modules: This is a dictionary that maps module names to module objects. It's a global dictionary that keeps track of all
loaded modules in the current Python process.

[__name__]: __name__ is a special variable in Python that is automatically set by the interpreter. When a Python script or
module is executed, __name__ is set to '__main__' if the script is being run as the main program. If the module is being
imported, __name__ is set to the module's name. So, sys.modules[__name__] retrieves the module object for the current module.

.__dict__: The __dict__ attribute of a module is a dictionary that holds the symbol table of the module. It contains all the
names (variables, functions, classes, etc.) defined in the module.

.clear(): This method is called on a dictionary and clears all its elements, effectively removing all names from the module's
symbol table.

So, the overall meaning of the line of code is: "Access the global dictionary of loaded modules (sys.modules), get the
module object for the current module (sys.modules[__name__]), access its symbol table (__dict__), and clear that symbol
table (clear()), which effectively removes all names defined in the module."

import sys

sys.modules[__name__].__dict__.clear()

import os: The os module provides a way of interacting with the operating system. It's commonly used for tasks like file
and directory manipulation.

import numpy as np: NumPy is a library for numerical computing in Python. It provides support for large, multi-dimensional
arrays and matrices, along with mathematical functions to operate on these elements.

import pandas as pd: Pandas is a data manipulation library. It provides data structures like DataFrame for efficient data
analysis and manipulation.

import matplotlib.pyplot as plt: Matplotlib is a plotting library. The pyplot module provides a convenient interface for
creating various types of plots and charts.

%matplotlib inline: This is a Jupyter notebook magic command that ensures that Matplotlib plots are displayed inline
within the notebook.

from sklearn.preprocessing import LabelEncoder: scikit-learn (sklearn) is a machine learning library in Python.
LabelEncoder is used for encoding categorical labels as integer numbers.

from tensorflow.python.keras import utils: This imports utility functions from the TensorFlow library for tasks related to
neural networks. In this case, it might be used for one-hot encoding or other preprocessing tasks.

from keras.models import Sequential: Keras is a high-level neural networks API. This import statement brings in the
Sequential class, which is used to build a linear stack of neural network layers.

from keras.layers import Dense, Flatten, InputLayer: This imports specific layer types (Dense, Flatten, InputLayer) from
Keras. These layers are commonly used in the construction of neural network architectures.

import keras: Importing the Keras library itself. It might be redundant here if you are already importing specific
components from Keras.

import imageio: ImageIO is a library for reading and writing images in various formats.

from PIL import Image: PIL (Python Imaging Library) is a library for opening, manipulating, and saving many different image
file formats. In this case, it's used for image resizing.

# Importing necessary libraries
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn.preprocessing import LabelEncoder
from tensorflow.python.keras import utils
from keras.models import Sequential
from keras.layers import Dense, Flatten, InputLayer
import keras

import imageio         # To read images
from PIL import Image  # For image resizing

It is specific to the Google Colab environment, which is a free, cloud-based platform for running Python code, especially
popular in the context of machine learning and data science.

from google.colab import drive

drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive

!: In Jupyter Notebooks or Colab, the exclamation mark ! is used to run shell commands directly from the notebook cells.

unzip: This is the command-line utility for unzipping files.

-q: This flag stands for "quiet" and is used to suppress the output of the unzip command. It means the command will not
print the names of the files and directories as they are being extracted.

"/content/drive/MyDrive/DLT/agedetectiontrain.zip": This is the path to the ZIP file that you want to unzip. You're
specifying the full path to a file in your Google Drive under the "/content/drive/MyDrive/DLT/" directory.

When you run these commands in a Colab notebook, they will unzip the contents of the training and testing archives into
the current directory. The -q flag is used to do this quietly, without printing each file as it's being extracted.

!unzip -q "/content/drive/MyDrive/DLT/agedetectiontrain.zip"

!unzip -q "/content/drive/MyDrive/DLT/agedetectiontest.zip"

The provided code reads data from CSV files ('train.csv' and 'test.csv') using the pandas library in Python.

Reading Training Data: train = pd.read_csv('train.csv'). This line reads the contents of the 'train.csv' file and creates a
pandas DataFrame named train to store the data. A DataFrame is a two-dimensional tabular data structure in pandas that
can hold data of different types. The assumption here is that your training data is stored in a CSV (Comma-Separated
Values) file, and pandas is used to read it into a structured format.

Reading Test Data: test = pd.read_csv('test.csv'). Similarly, this line reads the contents of the 'test.csv' file and creates a
pandas DataFrame named test to store the test data.

After running these lines, you'll have two DataFrames (train and test), each containing the data from its respective CSV file.
You can then explore, analyze, and preprocess the data using the pandas library and proceed with your machine learning
or data analysis tasks.

# Reading the data
train = pd.read_csv('train.csv')
test = pd.read_csv('test.csv')

This code snippet generates a random index (idx) from the training dataset, retrieves the image file name and
corresponding age group label from the training data, reads the image using imageio, and then displays the image along
with its associated age group using Matplotlib.

Setting Random Seed: np.random.seed(10). This line sets the random seed to ensure reproducibility. By setting the seed,
you ensure that the random number generation is the same each time you run the code. This is useful when you want to
obtain the same random results for debugging or analysis purposes.

Choosing a Random Index: idx = np.random.choice(train.index). This line selects a random index from the training dataset.
The np.random.choice function is used to randomly choose an index from the indices of the training dataset.

Getting Image Information: img_name = train.ID[idx]; img = imageio.imread(os.path.join('Train', img_name)). These lines
retrieve the image file name (img_name) and read the corresponding image using imageio. The image file is assumed to be
located in a directory named 'Train'. The os.path.join function is used to create the complete path to the image file.

Displaying Image and Age Group: print('Age group:', train.Class[idx]); plt.imshow(img); plt.axis('off'); plt.show(). This
section prints the age group associated with the randomly selected image (train.Class[idx]) and then uses Matplotlib to
display the image. The plt.axis('off') line removes axis labels and ticks for a cleaner display.

# Displaying any random movie character along with age group
np.random.seed(10)

idx = np.random.choice(train.index)
img_name = train.ID[idx]
img = imageio.imread(os.path.join('Train', img_name))

print('Age group:', train.Class[idx])
plt.imshow(img)
plt.axis('off')
plt.show()
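
The record stops after displaying a sample image. The description above calls for resizing every image to 32 x 32 and training a three-class dense model, so a minimal sketch of those remaining steps might look like this (it reuses the imports above; the dense model from the description is assumed to be available as age_model, and .convert('RGB') is added defensively in case some crops are not RGB):

# Sketch of the remaining steps (illustrative; assumes the dense model above is available as age_model)
temp = []
for img_name in train.ID:
    # resize every crop to 32 x 32; convert('RGB') guards against non-RGB crops
    img = Image.open(os.path.join('Train', img_name)).convert('RGB').resize((32, 32))
    temp.append(np.array(img).astype('float32'))
X_train = np.stack(temp) / 255.                     # normalize pixel values

encode_y = LabelEncoder()
y_train_oh = keras.utils.to_categorical(encode_y.fit_transform(train.Class))   # young/middle/old -> one-hot

age_model.fit(X_train, y_train_oh, epochs=5, validation_split=0.2)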
EXPERIMENT-3
AIM: Design a CNN for Image Recognition which includes hyperparameter tuning
DESCRIPTION:
A CNN can have multiple layers, each of which learns to detect the different features of an input image. A
filter or kernel is applied to each image to produce an output that gets progressively better and more detailed
after each layer. In the lower layers, the filters can start as simple features.

Convolutional Neural Networks (CNNs) are a class of deep neural networks commonly used for image
recognition and analysis, but they are also applied to various other tasks like natural language processing and
speech recognition. Hyperparameters play a crucial role in the performance of CNNs, and tuning them
effectively is essential for achieving optimal results
CNNs have achieved state-of-the-art performance on a wide range of image recognition tasks, including object
classification, object detection, and image segmentation. They are widely used in computer vision, image
processing, and other related fields, and have been applied to a wide range of applications, including self-
driving cars, medical imaging, and security systems.
Hyperparameter Tuning
The first hyperparameter to tune is the number of neurons in each hidden layer. In this case, the number of
neurons in every layer is set to be the same, though it can also be made different. The number of neurons should be
adjusted to the complexity of the problem: a task that is harder to predict needs more neurons. Here, the range for the
number of neurons is set from 10 to 100.
An activation function is a parameter in each layer. Input data are fed to the input layer, followed by hidden
layers, and the final output layer. The output layer contains the output value. The input values moving from a
layer to another layer keep changing according to the activation function. The activation function decides how
to compute the input values of a layer into output values. The output values of a layer are then passed to the
next layer as input values again. The next layer then computes the values into output values for another layer
again. There are 9 activation functions to try in this demonstration. Each activation function has its own
formula (and graph) for transforming input values into output values.
The layers of a neural network are compiled and an optimizer is assigned. The optimizer is responsible for adjusting the
learning rate and the weights of the neurons in the neural network to reach the minimum loss function. The optimizer is
very important for achieving the highest possible accuracy or the minimum loss. There are 7 optimizers to choose from.
Each has a different concept behind it.
One of the hyperparameters in the optimizer is the learning rate. We will also tune the learning rate. Learning
rate controls the step size for a model to reach the minimum loss function. A higher learning rate makes the
model learn faster, but it may miss the minimum loss function and only reach the surrounding of it. A lower
learning rate gives a better chance of finding the minimum loss function. As a tradeoff, a lower learning rate needs
more epochs, or more time and memory capacity.

If the observation size of the training dataset is too large, it will definitely take a longer time to build the
model. To make the model learn faster, we can assign batch size so that not all of the training data are given
to the model at the same time. Batch size is the number of training data sub-samples for the input. If the
training dataset has 77,500 observations and the batch size is 1000, the model will learn from 77 batches of 1000
training sub-samples plus one final batch of the remaining 500 sub-samples. The smaller batch
size makes the learning process faster, but the variance of the validation dataset accuracy is higher. A bigger
batch size has a slower learning process, but the validation dataset accuracy has a lower variance.
The number of times a whole dataset is passed through the neural network model is called an epoch. One
epoch means that the training dataset is passed forward and backward through the neural network once. A
too-small number of epochs results in underfitting because the neural network has not learned enough.
The training dataset needs to pass multiple times or multiple epochs are required. On the other hand, too
many epochs will lead to overfitting where the model can predict the data very well, but cannot predict new
unseen data well enough. The number of epochs must be tuned to gain the optimal result. This demonstration
searches for a suitable number of epochs between 20 and 100.
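
A minimal sketch of how such a search could be wired up by hand is shown below (the grid over hidden units, learning rate and batch size is illustrative and far smaller than the ranges discussed above; it assumes X_train and y_train have already been prepared as in the code later in this experiment):

# Minimal manual hyperparameter search (illustrative grid; assumes X_train and y_train are already prepared)
import keras

def build_model(units, learning_rate):
    model = keras.models.Sequential([
        keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
        keras.layers.MaxPool2D((2, 2)),
        keras.layers.Flatten(),
        keras.layers.Dense(units, activation='relu'),
        keras.layers.Dense(10, activation='softmax'),
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
                  loss='categorical_crossentropy', metrics=['accuracy'])
    return model

best_setting, best_val_acc = None, 0.0
for units in [32, 64]:                       # number of neurons in the hidden layer
    for lr in [1e-2, 1e-3]:                  # learning rate
        for batch_size in [64, 128]:         # batch size
            m = build_model(units, lr)
            h = m.fit(X_train, y_train, epochs=5, batch_size=batch_size,
                      validation_split=0.2, verbose=0)
            val_acc = max(h.history['val_accuracy'])
            if val_acc > best_val_acc:
                best_setting, best_val_acc = (units, lr, batch_size), val_acc
print('Best (units, lr, batch_size):', best_setting, 'val_accuracy:', best_val_acc)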
The CIFAR-10 dataset
The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There
are 50000 training images and 10000 test images.

The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch
contains exactly 1000 randomly-selected images from each class. The training batches contain the remaining
images in random order, but some training batches may contain more images from one class than another.
Between them, the training batches contain exactly 5000 images from each class.
Here are the classes in the dataset:

airplane

automobile

bird

cat

deer

dog

frog

horse

ship

truck

The classes are completely mutually exclusive. There is no overlap between automobiles and trucks.
"Automobile" includes sedans, SUVs, things of that sort. "Truck" includes only big trucks. Neither includes
pickup trucks.
Source Code:
Matplotlib: Matplotlib is a popular plotting library in Python for creating static, animated, and interactive
visualizations. It is often used for creating various types of charts and plots.
%matplotlib inline: This is a magic command in Jupyter notebooks that allows Matplotlib plots to be displayed
directly in the notebook, rather than in a separate window.
LabelEncoder: LabelEncoder is part of scikit-learn and is used for encoding categorical labels into numerical
values. This is often used in machine learning tasks where algorithms require numerical inputs.
Keras: Keras is a high-level neural networks API written in Python. It provides an easy-to-use interface for building
and training deep learning models.
Pandas: Pandas is a powerful data manipulation and analysis library for Python. It provides data structures like
DataFrames, which are particularly useful for working with structured data.
NumPy: NumPy is a fundamental package for scientific computing with Python. It provides support for large,
multi-dimensional arrays and matrices, along with mathematical functions to operate on these arrays.
PIL (Python Imaging Library): PIL is a library for opening, manipulating, and saving many different image file
formats. It has been succeeded by the Pillow library, but the import statement still uses "Image" from "PIL" for
compatibility.
os: The os module provides a way of using operating system-dependent functionality, such as reading or writing
to the file system.
Warnings: The warnings module is used to control the display of warning messages in the code.
from matplotlib import pyplot as plt
%matplotlib inline
from sklearn.preprocessing import LabelEncoder
import keras
import pandas as pd
import numpy as np
from PIL import Image
import os
import warnings
warnings.filterwarnings('ignore')
from google.colab import drive : Imports the drive module from the google.colab package. This module
provides functions for mounting and managing Google Drive in Colab.
drive.mount('/content/drive', force_remount=True) : Mounts your Google Drive to the specified directory (
/content/drive ) in the Colab environment. The force_remount=True parameter is used to force a remount in
case the drive is already mounted.
After running this code, you will be prompted to authenticate and give Colab access to your Google Drive. Once
you've done that, your Google Drive will be accessible from within the Colab notebook, and you can navigate to
the /content/drive directory to access your files.
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive
! : In a Jupyter notebook or Colab environment, adding an exclamation mark before a command allows you to run
shell commands directly from the notebook.
unzip : This is a command-line utility for extracting files from a ZIP archive.
-q : Stands for "quiet" mode. It suppresses the output, making the extraction process less verbose.
"/content/drive/MyDrive/DLT/cifar10.zip" : This is the path to the ZIP file you want to unzip. In this case, it's
located in your Google Drive under the path /content/drive/MyDrive/DLT/cifar10.zip .
So, the command is essentially unzipping the contents of the specified ZIP file into the current working
directory in the Colab environment.
!unzip -q "/content/drive/MyDrive/DeepvLearning for Developers/MODULE-3-CNN/cifar10.zip"
labels = pd.read_csv('/content/drive/MyDrive/DLT/cifar10Labels.csv', index_col=0) : Reads a CSV file named
'cifar10Labels.csv' from the specified path into a Pandas DataFrame. The index_col=0 argument sets the first
column of the CSV file as the index of the
DataFrame.
img_idx = 5 : Assigns the index 5 to the variable img_idx . This index is then used to retrieve information about a
specific image.
print(labels.label[img_idx]) : Prints the label of the image at the specified index ( img_idx). It assumes that the
DataFrame has a column named 'label' containing the labels for each image.
Image.open('cifar10/'+str(img_idx)+'.png') : Opens and displays the image with the filename constructed using
the index ( img_idx ). The images are assumed to be located in the 'cifar10' directory.
Please note that the path used in Image.open is relative, so it's looking for images in the 'cifar10' directory.
Ensure that the images are correctly located in that directory, and the file naming convention matches the
expected format (e.g., '5.png' for index 5).
labels = pd.read_csv('/content/drive/MyDrive/DeepvLearning for Developers/MODULE-3-CNN/cifar10Labels.csv', index_col=0)
# View an image
img_idx = 5
print(labels.label[img_idx])
Image.open('cifar10/'+str(img_idx)+'.png')
automobile

It looks like you're splitting your data into training and testing sets using the train_test_split function from
scikit-learn. Afterward, you're
reading the images associated with the training and testing indices, converting them to NumPy arrays, and
normalizing the pixel values. Here's a breakdown of your code:
Splitting Data: train_test_split is used to split the indices of your data into training and testing sets. The test_size=0.3
argument indicates that 30% of the data will be used for testing, and random_state=42 ensures reproducibility.
Storing Indexes for Later Use: The indexes for training and testing sets are stored in train_idx and test_idx for
later use.
Reading and Storing Images for Training: Images associated with training indices are read, converted to NumPy
arrays, and stored in
X_train .
Reading and Storing Images for Testing: Similarly, images associated with testing indices are read, converted to
NumPy arrays, and stored in X_test .
Normalizing Image Data:The pixel values of the images are normalized to the range [0, 1] by dividing each pixel
value by 255.
# Splitting data into Train and Test data
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(labels.index, labels.label, test_size=0.3, random_state=42)
train_idx, test_idx = y_train.index, y_test.index  # Storing indexes for later use

# Reading images for training
temp = []
for img_idx in y_train.index:
    img_path = os.path.join('cifar10/', str(img_idx) + '.png')
    img = np.array(Image.open(img_path)).astype('float32')
    temp.append(img)
X_train = np.stack(temp)

# Reading images for testing
temp = []
for img_idx in y_test.index:
    img_path = os.path.join('cifar10/', str(img_idx) + '.png')
    img = np.array(Image.open(img_path)).astype('float32')
    temp.append(img)
X_test = np.stack(temp)

# Normalizing image data
X_train = X_train/255.
X_test = X_test/255.

print(X_train.shape, y_train.shape)

(35000, 32, 32, 3) (35000,)
LabelEncoder instantiation:Creates an instance of the LabelEncoder class. This is used to encode the
categorical labels into numerical values.
Fit and transform the training labels: Fits the encoder on the training labels ( y_train ) and transforms them into
numerical values. This step is necessary to convert categorical labels into a format suitable for machine learning
models.
One-hot encoding: Uses the to_categorical function from Keras to perform one-hot encoding on the
transformed labels. This converts the numerical labels into a binary matrix representation suitable for training
neural networks. After this process, y_train will contain the one-hot encoded representations of your original
categorical labels. Each row in y_train corresponds to a training sample, and each column represents a class,
with a value of 1 indicating the presence of that class.
# One-hot encoding 10 output classes
from keras.utils import to_categorical

encode_X = LabelEncoder()
encode_X_fit = encode_X.fit_transform(y_train)
y_train = to_categorical(encode_X_fit)

encode_X = LabelEncoder()
encode_X_fit = encode_X.fit_transform(y_test)
y_test = to_categorical(encode_X_fit)
First Convolutional Layer ( Conv_1 ):
32 filters, each with a 5x5 kernel. LeakyReLU activation function. Valid padding.
Input shape is (32, 32, 3), assuming RGB images.
First MaxPooling Layer ( MaxPool_1 ):
Pooling layer with a 2x2 pool size, followed by Dropout(0.2).
Second Convolutional Layer ( Conv_2 ):
64 filters, each with a 5x5 kernel. LeakyReLU activation function.
Second MaxPooling Layer ( MaxPool_2 ):
Pooling layer with a 2x2 pool size, followed by Dropout(0.2).
Third Convolutional Layer ( Conv_3 ):
128 filters, each with a 5x5 kernel. LeakyReLU activation function.
Flatten Layer ( Flat ):
Flattens the input from the previous layer to a 1D array.
Hidden Dense Layer ( pred1_layer ):
100 neurons with sigmoid activation.
Output Layer ( pred_layer ):
Dense layer with num_classes neurons (10 classes) and softmax activation for classification.
from keras.layers import LeakyReLU

num_classes = 10

model = keras.models.Sequential([
    # Adding first convolutional layer
    keras.layers.Conv2D(filters=32, kernel_size=(5, 5), strides=1, padding='valid', activation='LeakyReLU',
                        input_shape=(32, 32, 3), name='Conv_1', batch_size=128),
    # Adding first pooling layer
    keras.layers.MaxPool2D(pool_size=(2, 2), name='MaxPool_1'),
    keras.layers.Dropout(0.2),
    # Adding second convolutional layer
    keras.layers.Conv2D(filters=64, kernel_size=(5, 5), strides=1, padding='valid', activation='LeakyReLU',
                        name='Conv_2', batch_size=128),
    # Adding second pooling layer
    keras.layers.MaxPool2D(pool_size=(2, 2), name='MaxPool_2'),
    keras.layers.Dropout(0.2),
    # Adding third convolutional layer
    keras.layers.Conv2D(filters=128, kernel_size=(5, 5), strides=1, padding='valid', activation='LeakyReLU',
                        name='Conv_3', batch_size=128),
    # keras.layers.MaxPool2D(pool_size=(3, 3), name='MaxPool_3'),
    # keras.layers.Conv2D(filters=128, kernel_size=(5, 5), strides=1, padding='valid', activation='LeakyReLU',
    #                     name='Conv_4', batch_size=128),
    # keras.layers.MaxPool2D(pool_size=(2, 2), name='MaxPool_4'),
    # Flattens the input
    keras.layers.Flatten(name='Flat'),
    # Fully-Connected layers
    keras.layers.Dense(100, activation='sigmoid', name='pred1_layer'),
    keras.layers.Dense(num_classes, activation='softmax', name='pred_layer')
])
Loss Function ( loss='categorical_crossentropy' ):Categorical crossentropy is commonly used for multi-class
classification problems. It measures the difference between the predicted probabilities and the true one-hot
encoded class labels.
Optimizer ( optimizer=keras.optimizers.Adam() ):Adam is an optimization algorithm that adapts the learning rate
during training. It is well-suited for a variety of machine learning tasks, including deep learning.
Metrics ( metrics=['accuracy'] ):During training, the model will monitor the accuracy metric. This provides the
percentage of correctly classified samples out of the total.
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['accuracy'])
cpfile = r'CIFAR10_checkpoint_filter_4.hdf5' :Specifies the file path where the model weights will be saved in
HDF5 format. Adjust the path and filename as needed.
cb_checkpoint = keras.callbacks.ModelCheckpoint(...) :
Creates a model checkpoint callback.
monitor='val_acc' : Monitors the validation accuracy during training.
verbose=1 : Prints a message when a checkpoint is saved.
save_best_only=True : Saves only the best model based on the validation accuracy.
mode='max' : The checkpoint is saved when the monitored quantity ( val_acc ) is maximized.
epochs = 30 : Specifies the number of training epochs.
model.fit(...) :
Trains the model using the training data ( X_train , y_train ).
validation_split=0.2 : Allocates 20% of the training data for validation.
In the run below the checkpoint lines are commented out, so model.fit is called without callbacks.
print(X_train.shape,y_train.shape)
(35000, 32, 32, 3) (35000, 10)
# cpfile = r'CIFAR10_checkpoint_filter_4.hdf5'  # Weights to be stored in HDF5 format
# cb_checkpoint = keras.callbacks.ModelCheckpoint(cpfile, monitor='val_acc', verbose=1, save_best_only=True, mode='max')

epochs = 30
model.fit(X_train, y_train, epochs=epochs, validation_split=0.2)
Epoch 1/30
875/875 [==============================] - 63s 71ms/step - loss: 1.6262 - accuracy: 0.4070 - val_loss:
1.3718 - val_accuracy: 0.Epoch 2/30
875/875 [==============================] - 61s 70ms/step - loss: 1.3142 - accuracy: 0.5263 - val_loss:
1.2849 - val_accuracy: 0.Epoch 3/30
875/875 [==============================] - 62s 71ms/step - loss: 1.1692 - accuracy: 0.5876 - val_loss:
1.2412 - val_accuracy: 0.
Epoch 4/30
875/875 [==============================] - 62s 71ms/step - loss: 1.0734 - accuracy: 0.6211 - val_loss:
1.1623 - val_accuracy: 0.Epoch 5/30
875/875 [==============================] - 60s 68ms/step - loss: 0.9884 - accuracy: 0.6515 - val_loss:
1.0747 - val_accuracy: 0.Epoch 6/30
875/875 [==============================] - 67s 76ms/step - loss: 0.9279 - accuracy: 0.6740 - val_loss:
1.0356 - val_accuracy: 0.Epoch 7/30
875/875 [==============================] - 64s 73ms/step - loss: 0.8788 - accuracy: 0.6925 - val_loss:
1.0096 - val_accuracy: 0.
Epoch 8/30
875/875 [==============================] - 62s 71ms/step - loss: 0.8296 - accuracy: 0.7103 - val_loss:
0.9798 - val_accuracy: 0.Epoch 9/30
875/875 [==============================] - 61s 70ms/step - loss: 0.7854 - accuracy: 0.7243 - val_loss:
0.9746 - val_accuracy: 0.Epoch 10/30
875/875 [==============================] - 62s 71ms/step - loss: 0.7562 - accuracy: 0.7363 - val_loss:
0.9749 - val_accuracy: 0.
Epoch 11/30
875/875 [==============================] - 59s 68ms/step - loss: 0.7247 - accuracy: 0.7453 - val_loss:
0.9452 - val_accuracy: 0.Epoch 12/30
875/875 [==============================] - 63s 73ms/step - loss: 0.6945 - accuracy: 0.7578 - val_loss:
0.9468 - val_accuracy: 0.Epoch 13/30
875/875 [==============================] - 64s 73ms/step - loss: 0.6698 - accuracy: 0.7649 - val_loss:
0.9679 - val_accuracy: 0.Epoch 14/30
875/875 [==============================] - 61s 70ms/step - loss: 0.6556 - accuracy: 0.7702 - val_loss:
0.9648 - val_accuracy: 0.
Epoch 15/30
875/875 [==============================] - 61s 70ms/step - loss: 0.6320 - accuracy: 0.7794 - val_loss:
0.9883 - val_accuracy: 0.Epoch 16/30
875/875 [==============================] - 61s 69ms/step - loss: 0.6158 - accuracy: 0.7818 - val_loss:
0.9931 - val_accuracy: 0.Epoch 17/30
875/875 [==============================] - 62s 71ms/step - loss: 0.5991 - accuracy: 0.7910 - val_loss:
1.0092 - val_accuracy: 0.
Epoch 18/30
875/875 [==============================] - 61s 69ms/step - loss: 0.5838 - accuracy: 0.7949 - val_loss:
1.0224 - val_accuracy: 0.Epoch 19/30
875/875 [==============================] - 63s 72ms/step - loss: 0.5780 - accuracy: 0.7963 - val_loss:
0.9906 - val_accuracy: 0.Epoch 20/30
875/875 [==============================] - 63s 72ms/step - loss: 0.5611 - accuracy: 0.8019 - val_loss:
0.9947 - val_accuracy: 0.Epoch 21/30
875/875 [==============================] - 65s 74ms/step - loss: 0.5577 - accuracy: 0.8053 - val_loss:
1.0092 - val_accuracy: 0.
Epoch 22/30
875/875 [==============================] - 61s 70ms/step - loss: 0.5434 - accuracy: 0.8095 - val_loss:
1.0150 - val_accuracy: 0.Epoch 23/30
875/875 [==============================] - 61s 70ms/step - loss: 0.5458 - accuracy: 0.8068 - val_loss:
1.0408 - val_accuracy: 0.Epoch 24/30
875/875 [==============================] - 62s 70ms/step - loss: 0.5341 - accuracy: 0.8112 - val_loss:
1.0476 - val_accuracy: 0.Epoch 25/30
875/875 [==============================] - 59s 68ms/step - loss: 0.5273 - accuracy: 0.8128 - val_loss:
1.0004 - val_accuracy: 0.Epoch 26/30
875/875 [==============================] - 62s 70ms/step - loss: 0.5238 - accuracy: 0.8144 - val_loss:
1.0359 - val_accuracy: 0.
Epoch 27/30
875/875 [==============================] - 63s 72ms/step - loss: 0.5127 - accuracy: 0.8198 - val_loss:
1.0636 - val_accuracy: 0.Epoch 28/30
875/875 [==============================] - 64s 74ms/step - loss: 0.5080 - accuracy: 0.8206 - val_loss:
1.0647 - val_accuracy: 0.Epoch 29/30
Evaluating accuracy on the training and test sets:

predicted_train : The trained model's predicted class probabilities for the training set, obtained with model.predict(X_train).
prediction : The probabilities are thresholded at 0.5 with np.where, giving a binary matrix in the same one-hot format as y_train.
accuracy_score(y_train, prediction) : Compares the thresholded predictions against the one-hot training labels and returns the training accuracy.

The same steps are repeated with X_test and y_test to obtain the test accuracy.
from sklearn.metrics import accuracy_score

predicted_train = model.predict(X_train)
prediction = np.where(predicted_train < 0.5, 0, 1)
accuracy_score(y_train, prediction)
1094/1094 [==============================] - 18s 16ms/step
0.8165142857142857
from sklearn.metrics import accuracy_score

predicted_test = model.predict(X_test)
prediction = np.where(predicted_test < 0.5, 0, 1)
accuracy_score(y_test, prediction)
469/469 [==============================] - 9s 19ms/step
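
Thresholding the one-hot probabilities at 0.5 counts a sample as wrong whenever no class (or the wrong class) crosses 0.5. An arguably more standard check is to compare the argmax of the predictions with the argmax of the one-hot labels; a small sketch reusing the arrays above:

# Class-wise accuracy via argmax (reuses model, X_train/X_test and the one-hot y_train/y_test above)
from sklearn.metrics import accuracy_score

train_acc = accuracy_score(np.argmax(y_train, axis=1), np.argmax(model.predict(X_train), axis=1))
test_acc = accuracy_score(np.argmax(y_test, axis=1), np.argmax(model.predict(X_test), axis=1))
print('Train accuracy:', train_acc, 'Test accuracy:', test_acc)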
EXPERIMENT - 4
AIM: Implement a Recurrent Neural Network for Predicting Sequential Data.
DESCRIPTION:
RNN :- A recurrent neural network (RNN) is a type of artificial neural network that is used in deep learning and
in the development of models that imitate the activity of neurons in the human brain. RNNs are mainly used in
speech recognition and natural language processing (NLP).
RNNs recognize the sequential characteristics of data and use those patterns to predict the next likely scenario. They are
suited to time series data or data that involves sequences. Ordinary feed-forward neural networks are only meant for
data points that are independent of each other.
RNNs are characterized by the direction of the flow of information between their layers. Gated variants have gates
that determine whether or not to let new input in, delete information because it isn't important, or let it impact the
output at the current time step.
The Gated Recurrent Unit (GRU) network is another type of RNN that is designed to address the vanishing gradient
problem. It has two gates: the reset gate and the update gate.

LSTM
LSTM networks are the most commonly used variation of Recurrent Neural Networks (RNNs). The critical components of
the LSTM are the memory cell and the gates (including the forget gate as well as the input gate); the inner contents of the
memory cell are modulated by the input and forget gates. Assuming that both of these gates are closed, the contents of
the memory cell will remain unmodified between one time-step and the next. This gating structure allows information to
be retained across many time-steps, and consequently also allows gradients to flow across many time-steps. This lets the
LSTM model overcome the vanishing gradient problem that occurs with most Recurrent Neural Network models.

In this module, you will learn how a neural network can benefit from predicting a sequential data set such as a
time series or sentence formation. To start with, we use the Infosys Equities data set from January 1st, 2000
till December 31st, 2009, giving us a total of 10 years of data. The National Stock Exchange market opens only
on weekdays, excluding weekends and national holidays; therefore, you can't expect data for all 365/366 days
a year.
Dataset

Data set includes the following features:

• Symbol: INFOSYSTCH throughout the dataset.


• Series: EQ (Equity) throughout the dataset.
• Date: Date corresponding to the data.
• Prev Close: Previous closing price.
• Open Price: Corresponding date's open price.
• High Price: Corresponding date's high price.
• Low Price: Corresponding date's low price.
• Last Price: Corresponding date's last price.
• Close Price: Corresponding date's closing price.
• Average Price: Corresponding date's average price.
• Total Traded Quantity: Number of quantity traded.
• Turnover: Corresponding date's turnover.
• No. of Trades: Total number of trades.
• Deliverable Qty: Deliverable stock volume
• % Dly Qt to Traded Qty: Ratio of deliverable volume to traded volume.

For our time series, we will be considering only two features, Date and Average Price.
Source Code:
The program below uses a Recurrent Neural Network (RNN) to predict sequential data. It loads the Infosys equity time
series, prepares the data, trains an LSTM model on a portion of it, and evaluates its performance on held-out data.

The output includes the training/validation loss during training and the test loss after evaluation. Finally, it
displays the model's predictions.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import keras
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense, Activation, Dropout

from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from sklearn.utils import shuffle

Here's a brief overview of the libraries you've imported:


NumPy ( import numpy as np ): NumPy is a powerful library for numerical operations in Python. It provides
support for large, multi-dimensional arrays and matrices, along with mathematical functions to operate on
these arrays.
Pandas ( import pandas as pd ): Pandas is a data manipulation library that provides data structures like Series and
DataFrame, which are essential for data analysis and manipulation.
Matplotlib ( import matplotlib.pyplot as plt ): Matplotlib is a popular plotting library for creating static,
interactive, and animated visualizations in Python.
Seaborn ( import seaborn as sns ): Seaborn is a statistical data visualization library built on top of Matplotlib. It
provides an interface for creating informative and attractive statistical graphics.
Keras ( import keras ): Keras is a high-level neural networks API written in Python. It allows for easy and fast
prototyping of deep learning models and is capable of running on top of other popular deep learning
frameworks.
Keras Sequential Model ( from keras.models import Sequential ): Sequential is a linear stack of layers in a
neural network model.
Keras LSTM Layer ( from keras.layers import LSTM ): LSTM (Long Short-Term Memory) is a type of recurrent
neural network (RNN) layer, which is particularly effective for sequence prediction problems.
Keras Dense Layer ( from keras.layers import Dense ): Dense is a standard fully connected neural network layer. It is the most commonly used layer type.
Keras Activation Layer ( from keras.layers import Activation ): Activation layers are used to apply element-wise activation functions to the output of a layer.
Keras Dropout Layer ( from keras.layers import Dropout ): Dropout is a regularization technique used to prevent overfitting by randomly setting a fraction of input units to zero during training.
Scikit-learn MinMaxScaler ( from sklearn.preprocessing import MinMaxScaler ): MinMaxScaler is used for scaling numerical features to a specified range, usually between 0 and 1.
Scikit-learn Mean Squared Error ( from sklearn.metrics import mean_squared_error ): Mean Squared Error (MSE) is a common metric used to evaluate the performance of regression models.
Scikit-learn Shuffle ( from sklearn.utils import shuffle ): Shuffle is used for shuffling the data randomly. It can be useful when training machine learning models to ensure that the order of data does not affect the learning process.

from google.colab import drive

drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive

This is a common step when working with Colab to access files and datasets stored in your Google Drive.
from google.colab import drive: This line imports the drive module from the google.colab package. Colab
provides this module to interact with Google Drive.
drive.mount('/content/drive', force_remount=True): This line mounts your Google Drive to the Colab
environment. It prompts you to visit a link, authorize access to your Google Drive, and enter an
authorization code. After successful authentication, your Google Drive will be mounted at the specified
path ('/content/drive'), and the content will be accessible within your Colab notebook.
The force_remount=True parameter is used to force a remount of Google Drive, even if it has been
mounted before. This can be useful if you want to ensure that the most up-to-date content from your
Google Drive is available in the Colab environment.
After running this code cell, you should see a prompt that asks you to follow a link to authorize access to your Google Drive. Once you complete the authorization, you'll get a code to enter in the cell, and your Google Drive will be mounted. You can then navigate to the mounted path to access your Google Drive files.
# Loading data
data = pd.read_csv('/content/drive/MyDrive/dlt/INFY20002008.csv')
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2496 entries, 0 to 2495
Data columns (total 16 columns):
 #   Column                  Non-Null Count  Dtype
 0   Unnamed: 0              2496 non-null   int64
 1   Symbol                  2496 non-null   object
 2   Series                  2496 non-null   object
 3   Date                    2496 non-null   object
 4   Prev Close              2496 non-null   float64
 5   Open Price              2496 non-null   float64
 6   High Price              2496 non-null   float64
 7   Low Price               2496 non-null   float64
 8   Last Price              2496 non-null   float64
 9   Close Price             2496 non-null   float64
 10  Average Price           2496 non-null   float64
 11  Total Traded Quantity   2496 non-null   int64
 12  Turnover                2496 non-null   float64
 13  No. of Trades           2496 non-null   object
 14  Deliverable Qty         2496 non-null   object
 15  % Dly Qt to Traded Qty  2496 non-null   object
dtypes: float64(8), int64(2), object(6)
memory usage: 312.1+ KB

pd.read_csv('/content/drive/MyDrive/dlt/INFY20002008.csv'): This line uses the read_csv function from the pandas library to read a CSV file. The file path is specified as '/content/drive/MyDrive/dlt/INFY20002008.csv'. This path indicates that the file is located in the 'dlt' folder within your Google Drive.

data.info(): After loading the data, the info() method is called on the DataFrame (data). This method provides a concise summary of the DataFrame, including the column data types, non-null values, and memory usage. Keep in mind that the information displayed by data.info() reflects the structure of the loaded data, including the number of non-null entries in each column, the data types of the columns, and memory usage.
# Selecting only Date and Average Price columns
data = data[['Date', 'Average Price']]
This code creates a new DataFrame (data) by extracting only the 'Date' and 'Average Price' columns from the
original DataFrame. The double square brackets ([['Date', 'Average Price']]) are used to select specific columns,
and the new DataFrame will contain only these two columns.
Now, data will be a DataFrame that includes only the 'Date' and 'Average Price' columns, which can be useful if
you want to focus your analysis or visualization on these specific columns.
# Scaling the values in the range of 0 to 1
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_price = scaler.fit_transform(data.loc[:, 'Average Price'].values.reshape(-1, 1))

MinMaxScaler(feature_range=(0, 1)): This line creates an instance of the MinMaxScaler class from scikit-learn.
The feature_range parameter is set to (0, 1), which means that the transformed data will be scaled to the
range [0, 1].
data.loc[:, 'Average Price'].values.reshape(-1, 1): This extracts the 'Average Price' column from the DataFrame
(data). The .values attribute converts it to a NumPy array, and .reshape(-1, 1) reshapes it into a single column.
This is necessary because the fit_transform method expects a 2D array or matrix as input.
scaler.fit_transform(...): This fits the scaler to the data and transforms the data simultaneously. The
fit_transform method computes the minimum and maximum values of the data and scales it to the specified
range.
The scaled values are stored in the scaled_price variable, and these scaled values can be used for training
machine learning models or other analyses where scaled features are beneficial.
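As a quick illustration of how this scaling behaves, the following minimal sketch (using made-up numbers, not the stock data) shows fit_transform mapping the minimum of a column to 0 and the maximum to 1, and inverse_transform recovering the original values:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy column of prices (hypothetical values, independent of the dataset)
prices = np.array([100.0, 150.0, 200.0]).reshape(-1, 1)

toy_scaler = MinMaxScaler(feature_range=(0, 1))
scaled = toy_scaler.fit_transform(prices)        # [[0.0], [0.5], [1.0]]
restored = toy_scaler.inverse_transform(scaled)  # back to [[100.], [150.], [200.]]
print(scaled.ravel(), restored.ravel())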

# Splitting dataset in the ratio of 75:25 for training and test
train_size = int(data.shape[0] * 0.75)
train, test = scaled_price[0:train_size, :], scaled_price[train_size:data.shape[0], :]
print("Number of entries (training set, test set): " + str((len(train), len(test))))

Number of entries (training set, test set): (1872, 624)
1. train_size = int(data.shape[0] * 0.75): This line calculates the size of the training set. It takes 75%
of the total number of entries in the dataset ( data ) and converts it to an integer using int().
2. train, test = scaled_price[0:train_size, :], scaled_price[train_size:data.shape[0], :] : This line
actually performs the split. It uses array slicing to create two sets, train and test , from the scaled_price
array. The training set includes the first train_size entries,
and the test set includes the remaining entries.
3. print("Number of entries (training set, test set): " + str((len(train), len(test)))) : This line prints
the number of entries in the training and test sets.

After running this code, you'll have two sets ( train and test ) that you can use for training and evaluating
your machine learning model. The training set will contain 75% of the data, and the test set will contain
the remaining 25%.

def create_dataset(scaled_price, window_size=1):
    data_X, data_Y = [], []
    for i in range(len(scaled_price) - window_size - 1):
        a = scaled_price[i:(i + window_size), 0]
        data_X.append(a)
        data_Y.append(scaled_price[i + window_size, 0])
    return (np.array(data_X), np.array(data_Y))

1. def create_dataset(scaled_price, window_size=1): : This line defines a function named create_dataset that takes two parameters: scaled_price , which is the scaled time series data, and window_size , which represents the size of the input sequence.
2. data_X, data_Y = [], [] : These lines initialize two empty lists, data_X and data_Y , which will be used

to store input sequences and corresponding output values.


3. for i in range(len(scaled_price) - window_size - 1): : This line starts a loop that iterates through the
indices of the time series data, considering the specified window_size . The loop goes up to
len(scaled_price) - window_size - 1 to ensure that there is enough room for creating input-output
pairs.
4. a = scaled_price[i:(i + window_size), 0] : This line extracts an input sequence of length
window_size from the scaled_price array and appends it to the data_X list. The [0] indexing is used
to access the first column of the array.
5. data_Y.append(scaled_price[i + window_size, 0]) : This line appends the corresponding output
value (next element in the time series) to the data_Y list.
6. return(np.array(data_X), np.array(data_Y)) : Finally, the function returns NumPy arrays converted from
the lists data_X and data_Y .
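For intuition, here is a tiny hand-checkable sketch (with assumed toy values) of what create_dataset returns for window_size=2: each row of data_X is a sliding window, and the matching entry of data_Y is the value that immediately follows that window.

# Toy example (hypothetical values) for create_dataset with window_size=2
toy = np.array([[0.1], [0.2], [0.3], [0.4], [0.5]])
X, Y = create_dataset(toy, window_size=2)
print(X)  # [[0.1 0.2]
          #  [0.2 0.3]]
print(Y)  # [0.3 0.4]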

# Create test and training sets for one-step-ahead regression.
window_size = 3
train_X, train_Y = create_dataset(train, window_size)
test_X, test_Y = create_dataset(test, window_size)
print("Original training data shape:")
print(train_X.shape)

# Reshape the input data into appropriate form for Keras.
train_X = np.reshape(train_X, (train_X.shape[0], 1, train_X.shape[1]))
test_X = np.reshape(test_X, (test_X.shape[0], 1, test_X.shape[1]))
print("New training data shape:")
print(train_X.shape)

Original training data shape:
(1868, 3)
New training data shape:
(1868, 1, 3)

1. window_size = 3 : You've chosen a window_size of 3, indicating that each input sequence for the
LSTM model will consist of three time steps.
2. train_X, train_Y = create_dataset(train, window_size) : This line uses the create_dataset function to
generate training sets
( train_X and train_Y ) from the train data. train_X will contain input sequences, and train_Y will
contain corresponding output values.

3. test_X, test_Y = create_dataset(test, window_size) : Similarly, this line generates test sets ( test_X and
test_Y ) using the
create_dataset function from the test data.
4. print("Original training data shape:") : This line prints the shape of the original training data ( train_X )
before reshaping.
5. print(train_X.shape) : This line prints the shape of the original training data ( train_X ), which
represents the number of input sequences, the number of time steps, and the size of each
time step.
6. train_X = np.reshape(train_X, (train_X.shape[0], 1, train_X.shape[1])) : This line reshapes the
training input data to fit the expected input shape for an LSTM model. The new shape is
(number of input sequences, 1, number of time steps) .
7. test_X = np.reshape(test_X, (test_X.shape[0], 1, test_X.shape[1])) : Similarly, this line reshapes
the test input data to match the format required by the LSTM model.
8. print("New training data shape:") : This line prints the shape of the reshaped training data ( train_X ). This reshaping is necessary because LSTM models in Keras expect input data in the shape (number of samples, number of time steps, number of features). With the reshape used here, each window is treated as a single sample with one time step and window_size (here three) features.
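A minimal sketch on dummy data (assumed shapes only) may make the (samples, time steps, features) convention clearer:

# Dummy data: 4 windows of length 3, i.e. shape (4, 3)
dummy = np.arange(12).reshape(4, 3)
# Reshape to what the Keras LSTM expects: (samples=4, time steps=1, features=3)
dummy_lstm = np.reshape(dummy, (4, 1, 3))
print(dummy.shape, dummy_lstm.shape)  # (4, 3) (4, 1, 3)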
The LSTM architecture here consists of:

• One input layer.
• One LSTM layer with 4 units.
• One Dense layer to produce a single output, trained with MSE as the loss function.

# Designing the LSTM model
model = Sequential()
model.add(LSTM(4, input_shape=(1, window_size)))
model.add(Dense(1))

# Compiling the model
model.compile(loss="mean_squared_error", optimizer="adam")

# Training the model
model.fit(train_X, train_Y, epochs=3, batch_size=1)

Epoch 1/3
1868/1868 [==============================] - 4s 2ms/step - loss: 0.0055
Epoch 2/3
1868/1868 [==============================] - 3s 2ms/step - loss: 3.9482e-04
Epoch 3/3
1868/1868 [==============================] - 3s 1ms/step - loss: 3.7661e-04
<keras.src.callbacks.History at 0x7b1dbf77dd20>
1. model = Sequential() : This line initializes a sequential model. A sequential model is appropriate for a
plain stack of layers where each layer has exactly one input tensor and one output tensor.
2. model.add(LSTM(4, input_shape=(1, window_size))) : This line adds an LSTM layer to the model.
The layer has 4 units (LSTM cells), and the input_shape parameter is set to (1, window_size) ,
indicating that the input sequences have one time step and window_size features.

3. model.add(Dense(1)) : This line adds a dense (fully connected) layer with 1 unit to the model. This
layer is responsible for producing the output of the model.
4. model.compile(loss="mean_squared_error", optimizer="adam") : This line compiles the model.
The chosen loss function is mean squared error ( "mean_squared_error" ), and the optimizer is
Adam ( "adam" ). The loss function is a measure of how well the model is performing, and the
optimizer is responsible for updating the model's weights during training to minimize the loss.
5. model.fit(train_X, train_Y, epochs=3, batch_size=1) : This line trains the model using the
training data ( train_X as input and train_Y as target). The training is performed for 3 epochs
with a batch size of 1. An epoch is one complete pass through the entire training dataset.

After running this code, your LSTM model will be trained on the specified data. You can then use the
trained model for making predictions on new data or evaluating its performance on the test set.
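For example, a one-step-ahead forecast for the day after the last available window could be sketched as follows. This snippet is illustrative only and is not part of the original experiment; it assumes the model, scaler, scaled_price, and window_size defined above.

# Forecast the next (unseen) average price from the most recent window
last_window = scaled_price[-window_size:, 0].reshape(1, 1, window_size)  # shape (1, 1, 3)
next_scaled = model.predict(last_window)             # prediction on the scaled [0, 1] range
next_price = scaler.inverse_transform(next_scaled)   # back to the original price scale
print("Predicted next average price:", next_price[0, 0])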
def predict_and_score(model, X, Y):
    # Make predictions on the original scale of the data.
    pred = scaler.inverse_transform(model.predict(X))
    # Prepare Y data to also be on the original scale for interpretability.
    orig_data = scaler.inverse_transform([Y])
    # Calculate RMSE.
    score = np.sqrt(mean_squared_error(orig_data[0], pred[:, 0]))
    return (score, pred)

rmse_train, train_predict = predict_and_score(model, train_X, train_Y)
rmse_test, test_predict = predict_and_score(model, test_X, test_Y)

print("Training data score: %.2f RMSE" % rmse_train)
print("Test data score: %.2f RMSE" % rmse_test)

59/59 [==============================] - 1s 1ms/step
20/20 [==============================] - 0s 1ms/step
Training data score: 293.01 RMSE
Test data score: 188.13 RMSE

1. pred = scaler.inverse_transform(model.predict(X)) : This line uses the trained model to make


predictions ( model.predict(X) ) on the scaled input data ( X ). The predictions are then inverse-
transformed using scaler.inverse_transform to bring them back to the original scale of the data.

2. orig_data = scaler.inverse_transform([Y]) : This line inverse-transforms the true values ( Y ) back


to the original scale. The [Y] is used to reshape the array to the expected input shape.
3. score = np.sqrt(mean_squared_error(orig_data[0], pred[:, 0])) : This line calculates the RMSE
between the true values and the predictions on the original scale.
4. The function returns the RMSE score and the predictions.

5. Calculating RMSE for the training set and test set:

rmse_train, train_predict = predict_and_score(model, train_X, train_Y) : This line calculates


the RMSE for the training set using the predict_and_score function.

rmse_test, test_predict = predict_and_score(model, test_X, test_Y) : Similarly, this line calculates


the RMSE for the test set.
6. Printing the results:

print("Training data score: %.2f RMSE" % rmse_train) : This line prints the RMSE for the training set.
print("Test data score: %.2f RMSE" % rmse_test) : This line prints the RMSE for the test set.

These RMSE scores can be used to evaluate the performance of your LSTM model on both the training and
test datasets. Lower RMSE values indicate better model performance.
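As a sanity check on the metric itself, RMSE can also be computed directly with NumPy; this toy example (made-up numbers) shows that it is simply the square root of the mean squared error:

from sklearn.metrics import mean_squared_error
y_true = np.array([100.0, 200.0, 300.0])  # hypothetical true prices
y_pred = np.array([110.0, 190.0, 310.0])  # hypothetical predictions
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
print(rmse)  # 10.0, since every prediction is off by exactly 10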
# Start with training predictions.
train_predict_plot = np.empty_like(scaled_price)
train_predict_plot[:, :] = np.nan
train_predict_plot[window_size:len(train_predict) + window_size, :] = train_predict

# Add test predictions.
test_predict_plot = np.empty_like(scaled_price)
test_predict_plot[:, :] = np.nan
test_predict_plot[len(train_predict) + (window_size * 2) + 1:len(scaled_price) - 1, :] = test_predict

# Create the plot.
plt.figure(figsize=(15, 5))
plt.plot(scaler.inverse_transform(scaled_price), label="True value")
plt.plot(train_predict_plot, label="Training set prediction")
plt.plot(test_predict_plot, label="Test set prediction")
plt.xlabel("Days")
plt.ylabel("Average Price")
plt.title("Comparison true vs. predicted training / test")
plt.legend()
plt.show()

Here's what each part of the code is doing:


1. train_predict_plot = np.empty_like(scaled_price) : Creates an empty array with the same shape
as scaled_price to store training set predictions.
2. train_predict_plot[:, :] = np.nan : Fills the array with NaN values.
3. train_predict_plot[window_size:len(train_predict) + window_size, :] = train_predict :
Places the training set predictions ( train_predict ) into the corresponding positions of the
array, leaving the rest as NaN.
4. test_predict_plot = np.empty_like(scaled_price) : Creates an empty array for test set predictions.
5. test_predict_plot[:, :] = np.nan : Fills the test set prediction array with NaN values.
6. test_predict_plot[len(train_predict) + (window_size * 2) + 1:len(scaled_price) - 1, :] =
test_predict : Places the test set predictions ( test_predict ) into the corresponding positions of the
array, leaving the rest as NaN.

7. Creating the plot using Matplotlib:

• plt.figure(figsize=(15, 5)) : Sets the figure size.
• plt.plot(scaler.inverse_transform(scaled_price), label="True value") : Plots the true values by inverse-transforming the scaled prices to the original scale.
• plt.plot(train_predict_plot, label="Training set prediction") : Plots the training set predictions.
• plt.plot(test_predict_plot, label="Test set prediction") : Plots the test set predictions.
• plt.xlabel("Days") : Sets the x-axis label.
• plt.ylabel("Average Price") : Sets the y-axis label.
• plt.title("Comparison true vs. predicted training / test") : Sets the title of the plot.
• plt.legend() : Displays the legend.
• plt.show() : Displays the plot.

This plot allows you to visually compare the true values with the predicted values for both the training and
test sets.
EXPERIMENT-5

AIM : Removing Noise from the Images.

Description:

Noise: Humans are prone to making mistakes when collecting data, and data collection instruments may be
unreliable, resulting in dataset errors. The errors are referred to as noise. Data noise in deep learning can cause
problems since the algorithm interprets the noise as a pattern and can start generalizing from it.

Noise Removal: In the proposed algorithm, the training process consists of three successive steps. In the first
step, a classifier is trained to classify the noisy and clean images. In the second step, a denoiser network aims
to remove the noise in the image features that are extracted by the trained classifier. Finally, a decoder is utilized
to map back the denoised images features into images pixels.

Denoising: Denoising is an advanced technique used to decrease grainy spots and discoloration in images while
minimizing the loss of quality.
Image Denoising is a computer vision task that involves removing noise from an image.
Image denoising plays an important role in a wide range of applications such as image restoration, visual
tracking, image registration, image segmentation, and image classification, where obtaining the original image
content is crucial for strong performance.
The median filter is excellent for denoising an image in the case of salt-and-pepper noise because it does not blur the image as a mean filter would. Note that the median filter is not a linear filter: it does not respect the linearity property, and therefore it cannot be written as a convolution.
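For reference, a median filter of this kind can be applied with OpenCV in a couple of lines. This is a small illustrative sketch (the file names are hypothetical) and is separate from the autoencoder approach used in this experiment:

import cv2

# cv2.medianBlur replaces each pixel with the median of its 3x3 neighbourhood,
# which suppresses salt-and-pepper noise without smearing edges.
noisy = cv2.imread('noisy_image.png')   # hypothetical input file
denoised = cv2.medianBlur(noisy, 3)     # kernel size must be an odd integer
cv2.imwrite('denoised_image.png', denoised)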

Process Involved in Removing Noise from the Images:


➢ We will take existing image data and add random noise to it. Then we will feed the noisy images as input and the original, clean images as the target output (a minimal sketch of this step follows this list). Our autoencoder will learn the relationship between a clean image and a noisy image and how to clean a noisy image.
➢ (For comparison, in MATLAB's Deep Learning Toolbox the denoiseImage function relies on the activations function to estimate the noise of the input image, A. The denoiseImage function specifies the OutputAs name-value argument of activations as "channels" so that A can be larger than the network input size.)
➢ Spatial domain methods aim to remove noise by calculating the gray value of each pixel based on the
correlation between pixels/image patches. In general, spatial domain methods can be divided into two
categories: spatial domain filtering and variational denoising methods.
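A minimal sketch of that noise-injection step (assuming the images are float arrays scaled to [0, 1]) looks like the following; np.clip keeps the noisy pixels inside the valid range:

import numpy as np

# clean_imgs: hypothetical batch of images scaled to [0, 1], shape (N, 32, 32, 3)
clean_imgs = np.random.rand(10, 32, 32, 3)
noise_factor = 0.05
noisy = clean_imgs + noise_factor * np.random.normal(size=clean_imgs.shape)
noisy = np.clip(noisy, 0.0, 1.0)  # keep pixel values inside the valid [0, 1] range
# The autoencoder is then trained with `noisy` as input and `clean_imgs` as target.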

Dataset:
For Image Denoising we use the CIFAR-10 datasets.
➢ The CIFAR-10 dataset (Canadian Institute For Advanced Research) is a collection of images that are commonly
used to train machine learning and computer vision algorithms. It is one of the most widely used datasets for
machine learning research.
➢ The CIFAR-100 dataset (Canadian Institute for Advanced Research, 100 classes) is a subset of the Tiny Images
dataset and consists of 60000 32x32 color images. The 100 classes in the CIFAR-100 are grouped into 20 super
classes. There are 600 images per class.
➢ Image Classification is a method to classify the images into their respective category classes. CIFAR-10
Dataset as it suggests has 10 different categories of images in it. There is a total of 60000 images of 10 different
classes naming Airplane, Automobile, Bird, Cat, Deer, Dog, Frog, Horse, Ship, Truck.
➢ CIFAR-10 is a dataset containing 60,000 images of 10 classes and is considered a typical dataset for computer
vision problems. In this assignment, I'll try out a few neural network configurations and hyperparameters and
compare the results.
➢ The Range of cifar-10 is 0 to 255.
➢ The error rate of a human on CIFAR-10 is estimated to be around 6%, which means that a model achieving
above 94% accuracy will be regarded as a super-human performance. According to paperswithcode.com, the
best model can reach 99% accuracy on CIFAR-10.
➢ For the CIFAR-10 results the best test accuracy corresponds to batch sizes m=4 and m=8, although quite good results are maintained out to m=128.
➢ CIFAR-10 is an established computer-vision dataset used for object recognition. It is a subset of the 80 million tiny images dataset and consists of 60,000 32x32 color images containing one of 10 object classes, with 6000 images per class. It was collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.
➢ CIFAR-10 – An image classification dataset consisting of ten classes of sixty thousand images. There are five
training batches and one test batch in the dataset and there are 10000 images in each batch. The size is 170
MB.
➢ CINIC-10 is a dataset for image classification. It has a total of 270,000 images, 4.5 times that of CIFAR-10. It
is constructed from two different sources: ImageNet and CIFAR-10. Specifically, it was compiled as a bridge
between CIFAR-10 and ImageNet.
➢ The Convolutional Neural Network has 4 convolution layers and pooling layers with 2 fully connected layers
in CIFAR-10.
➢ CIFAR-10 is a well-understood dataset and widely used for benchmarking computer vision algorithms in the
field of machine learning. The problem is “solved.” It is relatively straightforward to achieve 80% classification
accuracy.
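In this experiment the images are loaded from a small sample of CIFAR-10 stored as PNG files in Google Drive. For reference, the full dataset can also be pulled directly through Keras; the following short sketch is not used in the code below:

from keras.datasets import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print(x_train.shape, x_test.shape)  # (50000, 32, 32, 3) (10000, 32, 32, 3)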
Source Code:
The import statement import pandas in Python brings the entire pandas library into your script or Jupyter
Notebook. However, it's a common convention to use the alias pd to refer to pandas. This makes it more
concise and is widely adopted in the data science community.
Importing NumPy: The import numpy part brings the entire NumPy library into your script or Jupyter
Notebook. NumPy is a powerful library for numerical operations and array manipulations in Python.
Assigning an Alias: The as np part gives NumPy a shorter alias, in this case, np. This is a widely adopted
convention and makes it more convenient to refer to NumPy functions and classes throughout your code.
When you use the statement import matplotlib.pyplot as plt in Python, you are importing the pyplot
module from the Matplotlib library and assigning it the alias plt. This is a common convention in the data
visualization community, making it more convenient to refer to Matplotlib functions and classes
throughout your code.
When you use the statement from PIL import Image in Python, you are importing the Image module from
the Python Imaging Library (PIL) or, more commonly, from its fork called "Pillow." Pillow is a powerful
library for working with images in various formats.

When you use the statement import os in Python, you are importing the os module, which provides a way to
interact with the operating system. The os module allows you to perform various operating system-related tasks,
such as working with file systems, directories, and environment variables.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import os

It is specific to the Google Colab environment, which is a free, cloud-based platform for running Python
code, especially popular in the context of machine learning and data science.
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive

!: In Jupyter Notebooks or Colab, the exclamation mark ! is used to run shell commands

directly from the notebook cells. unzip: This is the command-line utility for unzipping files.

-q: This flag stands for "quiet" and is used to suppress the output of the unzip command. It means the
command will not print the names of the files and directories as they are being extracted.
"/content/drive/MyDrive/DLT/cifar10.zip": This is the path to the ZIP file that you want to unzip. You're
specifying the full path to a file named "cifar10.zip" in your Google Drive under the
"/content/drive/MyDrive/DLT/" directory.
When you run this command in a Colab notebook, it will unzip the contents of the "cifar10.zip" file into the
current directory. The -q flag is used to do this quietly, without printing each file as it's being extracted.
!unzip -q "/content/drive/MyDrive/DLT/cifar10.zip"
The np.array function is used to create a NumPy array, which is a multi-dimensional, homogeneous, and
flexible array object. NumPy arrays are more efficient than Python lists for numerical operations and are a
cornerstone for numerical computing tasks, including data analysis,
machine learning, and scientific research.
The plt.imshow() function is used to display an image or 2D array as a plot. It is particularly useful for
visualizing images in the context of data analysis, computer vision, and image processing.
The plt.show() function is a part of the Matplotlib library in Python and is used to display the current figure
that has been created using Matplotlib functions. It is commonly used in scripts or Jupyter Notebooks to
render and show the Matplotlib plots.
img = np.array(Image.open('cifar10/5.png'))
plt.imshow(img)
plt.show()

img_arr = []

for i in range(1, 151):
    img_path = os.path.join('cifar10/' + str(i) + '.png')
    img = np.array(Image.open(img_path)) / 255.
    img_arr.append(img)

The line img_arr = np.array(img_arr) is converting the Python list img_arr, which contains NumPy arrays
representing images, into a single NumPy array. This is often done to create a structured and efficient
representation of a dataset for further processing or analysis.

If img_arr is a NumPy array resulting from the code img_arr = np.array(img_arr), you can check its shape using the
.shape attribute. The shape of the NumPy array provides information about the number of dimensions and the
size of each dimension.

# Converting back to numpy array
img_arr = np.array(img_arr)
img_arr.shape

(150, 32, 32, 3)

The code plt.imshow((img_arr[4]*255).astype(np.uint8)) is using Matplotlib to display an image.


The plt.show() function is used to display the Matplotlib plots that have been created in your code. When you
create plots using Matplotlib functions, the plots are stored in memory, and plt.show() is required to actually
render and display them.
plt.imshow((img_arr[4]*255).astype(np.uint8))
plt.show()

The term "noise factor" typically refers to a parameter or variable used to introduce random variations or disturbances
into a system, process, or data. It is commonly employed in various fields such as signal processing, communication
systems, simulations, and machine learning.

Here, the noise_factor is used to add random noise to the images stored in the img_arr variable. The code applies a Gaussian (normal) distribution of random numbers to each pixel.

Matplotlib is then used to visualize one of the noisy images. The plt.imshow() function is commonly used to display images in Python.

The plt.show() line is included because it is essential for displaying the image. When working with Matplotlib in interactive environments or scripts, this function is used to show the plot you've created with plt.imshow().

# Adding random noise to the images
noise_factor = 0.05
noisy_imgs = img_arr + noise_factor * np.random.normal(size=img.shape)

# Image with noise
plt.imshow((noisy_imgs[4]*255).astype(np.uint8))
plt.show()

It looks like you're importing layers from the Keras library, which is commonly used for building neural network models in
Python. The layers you've imported are typical components of a Convolutional Autoencoder, a type of neural network
architecture used for tasks like image
denoising and dimensionality reduction. Input: Input is used to create an input tensor. It defines the shape of the input
data that will be fed into the model.

Conv2D: Conv2D is a 2D convolutional layer. It performs convolutional operations on 2D input data, which is often used
for image processing tasks. This layer is responsible for learning spatial hierarchies of features.

MaxPooling2D: MaxPooling2D is a downsampling layer that performs max pooling operation on the spatial dimensions of
the input. It helps reduce the spatial dimensions of the representation and retains the most important information.

UpSampling2D: UpSampling2D is an upsampling layer that increases the spatial dimensions of the input. It is often used in
combination with convolutional layers to learn to reconstruct the input data.

The Model class from Keras is used to instantiate a model in Keras, which can be a complete neural network or a
submodel (e.g., an encoder or decoder in an autoencoder). This class allows you to define the input and output of the
model, essentially specifying the architecture.

from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

def auto_encoder is used to define a function called auto_encoder that implements an autoencoder. An autoencoder is a type of neural network architecture used for unsupervised learning, particularly for dimensionality reduction and feature learning.

f_size defines a variable with a value of 3, representing the filter size. In the context of neural networks, particularly convolutional neural networks (CNNs), the filter size refers to the dimensions of the convolutional kernel.

p_size defines a variable with a value of 1, representing the pool size. In CNNs, the pool size refers to the dimensions of the pooling window used in max pooling or average pooling layers.

conv_1 = Conv2D(32, (f_size, f_size), activation='relu', padding='same')(img)defining a convolutional layer in Keras with 32
filters, a specified filter size (f_size), ReLU activation, and 'same' padding. This is a common configuration for a
convolutional layer in a neural network.

pool_1 = MaxPooling2D(pool_size=(p_size, p_size))(conv_1) defines that we are adding a MaxPooling2D layer to your
neural network. This layer is commonly used to downsample the spatial dimensions of the input by taking the maximum
value from a pool of values.

conv_2 = Conv2D(64, (f_size, f_size), activation='relu', padding='same')(pool_1) adds another convolutional layer (conv_2) to the network. This is a common practice in deep learning models, where each convolutional layer is designed to learn increasingly complex features from the input data.

pool_2 = MaxPooling2D(pool_size=(p_size, p_size))(conv_2) says that we are adding another MaxPooling2D layer
(pool_2) to your neural network. This is a common practice, especially in convolutional neural networks, where pooling
layers are used to downsample the spatial dimensions of the input.

conv_3 = Conv2D(128, (f_size, f_size), activation='relu', padding='same')(pool_2) adds a convolutional layer with 128 filters, ReLU activation, and same padding.

conv_4 = Conv2D(128, (f_size, f_size), activation='relu', padding='same')(conv_3) adds another convolutional layer with 128 filters, ReLU activation, and same padding.

up_1 = UpSampling2D((p_size, p_size))(conv_4) defines the adding of an UpSampling2D layer (up_1) to your neural
network. The UpSampling2D layer is used to increase the spatial dimensions of the input, effectively "upsampling" the
feature maps.

conv_5 = Conv2D(64, (f_size, f_size), activation='relu', padding='same')(up_1) adds a convolutional layer with 64 filters, ReLU activation, and same padding.

up_2 = UpSampling2D((p_size, p_size))(conv_5) adds another UpSampling2D layer (up_2) to the network. As before, the UpSampling2D layer is used to increase the spatial dimensions of the input.

decoded = Conv2D(3, (f_size, f_size), activation='sigmoid', padding='same')(up_2) describes the final Conv2D layer (decoded):

Type: Convolutional layer. Number of Filters: 3 - assuming you're working with RGB images, this is typical, as RGB images have three channels (Red, Green, Blue). Filter (Kernel) Size: (f_size, f_size) - the size of the convolutional kernel, determining the spatial extent of weights learned by the layer. Activation Function: Sigmoid - often used for the last layer of an autoencoder to squash the pixel values between 0 and 1, suitable for image data. Padding: 'same' - pads the input so that the output has the same spatial dimensions as the input.

Input to decoded (up_2): up_2 is the output from the previous UpSampling2D layer.

Output of decoded: the output of decoded has three channels (assuming RGB images) and the same spatial dimensions as the network input, 32 x 32, since a pool size of 1 means the pooling and upsampling layers do not actually change the resolution.

def auto_encoder(img):
    # Encoder module
    f_size = 3  # filter size
    p_size = 1  # pool size

    conv_1 = Conv2D(32, (f_size, f_size), activation='relu', padding='same')(img)
    pool_1 = MaxPooling2D(pool_size=(p_size, p_size))(conv_1)

    conv_2 = Conv2D(64, (f_size, f_size), activation='relu', padding='same')(pool_1)
    pool_2 = MaxPooling2D(pool_size=(p_size, p_size))(conv_2)

    conv_3 = Conv2D(128, (f_size, f_size), activation='relu', padding='same')(pool_2)

    # Decoder module
    conv_4 = Conv2D(128, (f_size, f_size), activation='relu', padding='same')(conv_3)
    up_1 = UpSampling2D((p_size, p_size))(conv_4)

    conv_5 = Conv2D(64, (f_size, f_size), activation='relu', padding='same')(up_1)
    up_2 = UpSampling2D((p_size, p_size))(conv_5)

    decoded = Conv2D(3, (f_size, f_size), activation='sigmoid', padding='same')(up_2)
    return decoded

img = Input(shape=(32, 32, 3)) creates the input layer (img). Its shape, (32, 32, 3), specifies the shape of the input data: a 3D tensor of (height, width, channels), where 32 is the height of the input image, 32 is the width of the input image, and 3 is the number of channels, typically representing the Red, Green, and Blue (RGB) channels in image data.

model = Model(img, auto_encoder(img)) creates a Keras model using the Model class, where the input is img (the image input tensor) and the output is the result of the auto_encoder function applied to img.

model.compile(loss='mean_squared_error', optimizer='adam') configures the model for training.

Loss Function ('mean_squared_error'): The choice of mean squared error (MSE) as the loss function suggests that the model is designed for a regression-style task. In the context of autoencoders, MSE is often used when the goal is to minimize the difference between the input and the reconstructed output.

Optimizer ('adam'): The Adam optimizer is specified for training the model. Adam is a popular optimization algorithm that adapts the learning rates for each parameter during training, making it well-suited for a variety of tasks.

Compile Method: The compile method configures the model for training by specifying the loss function and optimizer. After compiling, the model is ready to be trained using the fit method.

model.summary(): To obtain a summary of your compiled Keras model, you can use the summary method. This provides a detailed overview of the model's architecture, including the number of parameters in each layer.
img = Input(shape=(32, 32, 3))
model = Model(img, auto_encoder(img))
model.compile(loss='mean_squared_error', optimizer='adam')
model.summary()


Model: "model"

 Layer (type)                     Output Shape             Param #
=================================================================
 input_1 (InputLayer)             [(None, 32, 32, 3)]      0
 conv2d (Conv2D)                  (None, 32, 32, 32)       896
 max_pooling2d (MaxPooling2D)     (None, 32, 32, 32)       0
 conv2d_1 (Conv2D)                (None, 32, 32, 64)       18496
 max_pooling2d_1 (MaxPooling2D)   (None, 32, 32, 64)       0
 conv2d_2 (Conv2D)                (None, 32, 32, 128)      73856
 conv2d_3 (Conv2D)                (None, 32, 32, 128)      147584
 up_sampling2d (UpSampling2D)     (None, 32, 32, 128)      0
 conv2d_4 (Conv2D)                (None, 32, 32, 64)       73792
 up_sampling2d_1 (UpSampling2D)   (None, 32, 32, 64)       0
 conv2d_5 (Conv2D)                (None, 32, 32, 3)        1731
=================================================================

Total params: 316355 (1.21 MB)

Trainable params: 316355 (1.21 MB)

Non-trainable params: 0 (0.00 Byte)

model.fit(noisy_imgs[:120], img_arr[:120], epochs=10, validation_split=0.2, batch_size=1) trains the model:

Input Data (noisy_imgs[:120]): The input data for training is the noisy images. noisy_imgs[:120] implies that the first 120 samples of your noisy images dataset are used for training.

Target Data (img_arr[:120]): The target data is the clean images corresponding to the noisy input images. img_arr[:120] implies that the first 120 samples of your clean images dataset are used as the target for training.

Number of Epochs (epochs=10): Training will be performed for 10 epochs. An epoch is one complete pass through the entire training dataset.

Validation Split (validation_split=0.2): 20% of the training data (noisy_imgs[:120] and img_arr[:120]) will be used as a validation set. The model's performance on this set will be monitored during training.

Batch Size (batch_size=1): The training is performed with a batch size of 1. This means that the model's parameters will be updated after processing each individual sample.

model.fit(noisy_imgs[:120], img_arr[:120], epochs=10, validation_split=0.2, batch_size=1)

Epoch 1/10
96/96 [==============================] - 8s 61ms/step - loss: 0.0383 - val_loss: 0.0131
Epoch 2/10

96/96 [==============================] - 6s 64ms/step - loss: 0.0095 - val_loss: 0.0082

Epoch 3/10

96/96 [==============================] - 6s 67ms/step - loss: 0.0068 - val_loss: 0.0048

Epoch 4/10

96/96 [==============================] - 5s 47ms/step - loss: 0.0047 - val_loss: 0.0035

Epoch 5/10

96/96 [==============================] - 7s 70ms/step - loss: 0.0030 - val_loss: 0.0029

Epoch 6/10

96/96 [==============================] - 7s 70ms/step - loss: 0.0033 - val_loss: 0.0060

Epoch 7/10

96/96 [==============================] - 7s 75ms/step - loss: 0.0067 - val_loss: 0.0039

Epoch 8/10

96/96 [==============================] - 6s 60ms/step - loss: 0.0035 - val_loss: 0.0031

Epoch 9/10

96/96 [==============================] - 4s 46ms/step - loss: 0.0028 - val_loss: 0.0033

Epoch 10/10

96/96 [==============================] - 6s 62ms/step - loss: 0.0026 - val_loss: 0.0020

<keras.src.callbacks.History at 0x7e947b5add80>

pred = model.predict(img_arr): This line assumes that the model is already trained and has learned a representation of the data. The predictions (pred) will be compared with the original clean images to evaluate how well the model has learned to denoise the input.
pred = model.predict(img_arr)

5/5 [==============================] - 3s 442ms/step

plt.figure(figsize=(10, 5))

It seems like you are creating a new figure for a plot using matplotlib. The plt.figure function is commonly used to create a
new figure for plotting. The figsize parameter is used to specify the width and height of the figure in inches.

ax1 = plt.subplot2grid((1, 3), (0, 0)): This line creates a subplot in a grid with 1 row and 3 columns. (1, 3) specifies the grid shape, and (0, 0) specifies the position of the subplot within the grid. In this case, you're creating a subplot in the first column of the first row.

ax1.set_title('Original image', fontsize='large'): This line sets the title of the subplot to 'Original image' with a fontsize of 'large'.

ax1.imshow(img_arr[4]): This line displays the image stored in img_arr[4] within the first subplot (ax1).

plt.subplot2grid((1, 3), (0, 1)): This line creates a subplot in the same grid with 1 row and 3 columns. (1, 3) specifies the grid shape, and (0, 1) specifies the position of the subplot within the grid. In this case, you're creating a subplot in the second column of the first row.

ax2.set_title('Noisy image', fontsize='large'):

This line sets the title of the subplot to 'Noisy image' with a fontsize of 'large'.
ax2.imshow((noisy_imgs[4]*255).astype('uint8')):
This line displays the noisy image stored in noisy_imgs[4] after scaling it by 255 and converting the data type to 'uint8'. The
multiplication by 255 and type conversion is often done to scale the pixel values back to the standard 8-bit range.

plt.subplot2grid((1, 3), (0, 2)):

This line creates a subplot in the same grid with 1 row and 3 columns. (1, 3) specifies the grid shape, and (0, 2) specifies
the position of the subplot within the grid. In this case, you're creating a subplot in the third column of the first row.

ax3.set_title('Reconstructed image', fontsize='large'): This line sets the title of the subplot to 'Reconstructed image' with a fontsize of 'large'.

ax3.imshow(pred[4]): This line displays the reconstructed image stored in pred[4]. The pred variable is assumed to contain the model's predictions.

plt.show() is used to display the entire plot with all three subplots. This function is typically used to display the plots you have created using matplotlib.

plt.figure(figsize=(10, 5))

ax1 = plt.subplot2grid((1, 3), (0, 0))
ax1.set_title('Original image', fontsize='large')
ax1.imshow(img_arr[4])

ax2 = plt.subplot2grid((1, 3), (0, 1))
ax2.set_title('Noisy image', fontsize='large')
ax2.imshow((noisy_imgs[4]*255).astype('uint8'))

ax3 = plt.subplot2grid((1, 3), (0, 2))
ax3.set_title('Reconstructed image', fontsize='large')
ax3.imshow(pred[4])

plt.show()

plt.figure(figsize=(10, 5))

It seems like you are creating a new figure with a specific size using plt.figure(figsize=(10, 5)). This line sets the width and
height of the figure to be 10 inches by 5 inches, respectively.

plt.subplot2grid((1, 3), (0, 0)):

This line creates a subplot in a grid with 1 row and 3 columns. (1, 3) specifies the grid shape, and (0, 0) specifies the
position of the subplot within the grid. In this case, you're creating a subplot in the first column of the first row.

ax1.set_title('Original image', fontsize='large'): This line sets the title of the subplot to 'Original image' with a fontsize of 'large'.

ax1.imshow(img_arr[131]): This line displays the image stored in img_arr[131] within the first subplot (ax1).

plt.subplot2grid((1, 3), (0, 1)): This line creates a subplot in the same grid with 1 row and 3 columns. (1, 3) specifies the grid shape, and (0, 1) specifies the position of the subplot within the grid. In this case, you're creating a subplot in the second column of the first row.

ax2.set_title('Noisy image', fontsize='large'): This line sets the title of the subplot to 'Noisy image' with a fontsize of 'large'.

ax2.imshow(noisy_imgs[131]): This line displays the noisy image stored in noisy_imgs[131] within the second subplot (ax2).

plt.subplot2grid((1, 3), (0, 2)): This line creates a subplot in the same grid with 1 row and 3 columns. (1, 3) specifies the grid shape, and (0, 2) specifies the position of the subplot within the grid. In this case, you're creating a subplot in the third column of the first row.

ax3.set_title('Reconstructed image', fontsize='large'):

This line sets the title of the subplot to 'Reconstructed image' with a fontsize of 'large'.
ax3.imshow(pred[131]):

This line displays the reconstructed image stored in pred[131] within the third subplot (ax3).

The plt.show() command will display the entire figure with all three subplots. This will include the
original image, the corresponding noisy image, and the reconstructed image for the data at index
131.

plt.figure(figsize=(10, 5))

ax1 = plt.subplot2grid((1, 3), (0, 0))
ax1.set_title('Original image', fontsize='large')
ax1.imshow(img_arr[131])

ax2 = plt.subplot2grid((1, 3), (0, 1))
ax2.set_title('Noisy image', fontsize='large')
ax2.imshow(noisy_imgs[131])

ax3 = plt.subplot2grid((1, 3), (0, 2))
ax3.set_title('Reconstructed image', fontsize='large')
ax3.imshow(pred[131])

plt.show()

WARNING:matplotlib.image:Clipping input data to the valid range for imshow with RGB data ([0..1]
for floats or [0..255] for integer
EXPERIMENT - 6

AIM: Implement Object Detection Using YOLO.

DESCRIPTION:

Object Detection:

Object detection is the task of detecting instances of objects of a specific class within an image or video. Basically,
it locates the existence of objects in an image using a bounding box and assigns the types or classes of the objects
found. For instance, it takes an image as input and generates one or more bounding boxes, each with the class label
attached. These algorithms are powerful enough to handle multi-class classification and localization and objects with
multiple occurrences.

Object detection is a combination of two tasks:

• Image classification

• Object localization

Image classification algorithms predict the type or class of an object in an image among a predefined set of classes
that the algorithm was trained for. Usually, input is an image with a single object, such as a cat. Output is a class or
label representing a particular object, often with a probability of that prediction.

Object localization algorithms locate the presence of an object in the image and represent its location with a
bounding box. They take an image with one or more objects as input and output the location of one or more
bounding boxes using their position, height, and width.

Object Detection Methods

Generally, object detection methods can be classified as either neural network-based or non-neural approaches.
Also, some of them are rule-based, where the rule is predefined to match specific objects. Non-neural approaches
require defining features using some feature engineering techniques and then using a method such as a support
vector machine (SVM) to do the classification.

Some of the neural network methods are:

• Region-Based Convolutional Neural Networks (R-CNN, Fast R-CNN, etc.)

• Single Shot Detector (SSD)

• You Only Look Once (YOLO)
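Of these, this experiment uses YOLO (YOLOv4 through the Darknet framework). As an aside, a trained YOLO model can also be run outside Darknet; for example, OpenCV's DNN module can load the same configuration and weights files. The sketch below is illustrative only and is not part of the Darknet workflow used later (the cfg/weights/image paths are assumptions based on the files downloaded in this experiment):

import cv2

# Load the YOLOv4 network from its Darknet config and weights
net = cv2.dnn.readNetFromDarknet('cfg/yolov4.cfg', 'yolov4.weights')
img = cv2.imread('data/object3.jpg')
# Preprocess the image into a 416x416 blob, scaled to [0, 1], BGR -> RGB
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
# Each row of each output is [cx, cy, w, h, objectness, class scores...]
outputs = net.forward(net.getUnconnectedOutLayersNames())
print([o.shape for o in outputs])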

DATASET:
We used COCO dataset for object detection

• Common Objects in Context (COCO) is one such example of a benchmarking dataset, used widely throughout the
computer vision research community. It even has applications for general practitioners in the field, too. The
Microsoft Common Objects in Context (COCO) dataset is the gold standard benchmark for evaluating state of the art
of computer vision models.

• COCO contains over 330,000 images, of which more than 200,000 are labelled, across dozens of categories of
objects. COCO is a collaborative project maintained by computer vision professionals from numerous prestigious
institutions, including Google, Caltech, and Georgia Tech.

• The COCO dataset is designed to represent a vast array of things that we regularly encounter in everyday life, from vehicles like bikes to animals like dogs to people. The COCO dataset contains images from over 80 "object" and 91 generic "stuff" categories, which means the dataset can be used for benchmarking general-purpose models more effectively than small-scale datasets.

• In addition, the COCO dataset contains:

1. 121,408 images
2. 883,331 object annotations
3. 80 classes of data
4. A median image resolution of 640x480

• The COCO dataset can be used for multiple computer vision tasks. COCO is commonly used for object detection,
semantic segmentation, and keypoint detection.

• Objects are annotated with a bounding box and class label. This annotation can be used to identify what is in an
image. In the example below, giraffes and cows are identified in a photo of the outdoors.

• The dataset has two main parts: the images and their annotations:

1. The annotations are provided in JSON format; a single annotation file covers all the images in a split (a short sketch of reading such a file appears at the end of this section).

2. The images are organized into a hierarchy of directories, with the top-level directory containing subdirectories for
the train,test and validation sets.

• COCO offers various types of annotations:

1. Object detection with bounding box coordinates and full segmentation masks for 80 different objects

2. Stuff image segmentation with pixel maps displaying 91 amorphous background areas

3. Panoptic segmentation identifies items in images based on 80 "things" and 91 "stuff" categories

• The COCO dataset can be used to train object detection models. The dataset provides bounding box coordinates
for 80 different types of objects, which can be used to train models to detect bounding boxes and classify objects in
the images.

• In object detection, the bounding boxes are always rectangular. As a result, if the object contains the curvature
part, it does not help determine its shape. In order to find precisely the shape of the object, we should use some of
the image segmentation techniques.
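COCO-style annotation files are plain JSON, so they can be inspected without any special tooling. The following is a small illustrative sketch (the file path is hypothetical, but the keys follow the standard COCO layout):

import json

with open('annotations/instances_val2017.json') as f:  # hypothetical path
    coco = json.load(f)

print(len(coco['images']), 'images,', len(coco['annotations']), 'annotations')
# Each annotation carries an image_id, a category_id and a [x, y, width, height] bbox
first = coco['annotations'][0]
print(first['image_id'], first['category_id'], first['bbox'])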
Source Code:
The Darknet repository is an open-source deep learning framework written in C and CUDA, which supports various
neural network architectures, including YOLO. It is developed by AlexeyAB, and it provides a complete
implementation of the YOLO algorithm. The Darknet framework is well- known for its performance in object
detection tasks and is frequently used by researchers and developers working with YOLO. The code

provided suggests the first step to start working with the YOLO algorithm using the Darknet framework is to clone the
Darknet repository from GitHub. Cloning the repository means downloading a copy of the repository to your local
machine for further development, experimentation, or usage.

# clone darknet repository
!git clone https://github.com/AlexeyAB/darknet

Cloning into 'darknet'...
remote: Enumerating objects: 15833, done.
remote: Total 15833 (delta 0), reused 0 (delta 0), pack-reused 15833
Receiving objects: 100% (15833/15833), 14.39 MiB | 18.62 MiB/s, done.
Resolving deltas: 100% (10666/10666), done.

This code is meant to modify the Makefile of the Darknet repository to enable GPU support and OpenCV (computer
vision library) integration. These modifications are crucial for accelerating the training and inference of YOLO models
and for working with images and videos. Here's an explanation of each line in the code:

%cd darknet: This is a Jupyter Notebook magic command that changes the current working directory to the
"darknet" directory. It assumes that the "darknet" directory is present in the current location, and it's typically where
you would find the Darknet source code.

!sed -i 's/OPENCV=0/OPENCV=1/' Makefile: This line uses the sed command to perform an in-place replacement in
the Makefile of Darknet. It searches for the line that specifies OPENCV=0 and replaces it with OPENCV=1 , enabling
OpenCV support.

!sed -i 's/GPU=0/GPU=1/' Makefile: Similarly, this line replaces GPU=0 with GPU=1 in the Makefile, enabling GPU
support.

!sed -i 's/CUDNN=0/CUDNN=1/' Makefile: This line replaces CUDNN=0 with CUDNN=1 in the Makefile. Enabling
CUDNN (CuDNN) is essential for using NVIDIA's optimized deep learning libraries for improved performance.

!sed -i 's/CUDNN_HALF=0/CUDNN_HALF=1/' Makefile : This line replaces CUDNN_HALF=0 with CUDNN_HALF=1 .


Enabling CUDNN_HALF allows for reduced precision (half-precision) computations, which can further accelerate deep
learning operations on compatible GPUs. After running this code, the Makefile in the Darknet repository will be
configured to use GPU, CuDNN, and OpenCV, which are important for optimizing YOLO's performance. This setup is
typical when you want to compile Darknet with these features enabled, which is necessary for training and deploying
YOLO models effectively.

# change makefile to make sure GPU and OPENCV enabled

%cd darknet

!sed -i 's/OPENCV=0/OPENCV=1/' Makefile

!sed -i 's/GPU=0/GPU=1/' Makefile

!sed -i 's/CUDNN=0/CUDNN=1/' Makefile

!sed -i 's/CUDNN_HALF=0/CUDNN_HALF=1/' Makefile

/content/darknet

The below code is meant to verify the CUDA (Compute Unified Device Architecture) version installed on a virtual
machine. CUDA is a parallel
computing platform and API developed by NVIDIA, and it is commonly used for GPU-accelerated deep learning and
other scientific computing tasks.

!/usr/local/cuda/bin/nvcc --version: This is a shell command that uses the NVIDIA CUDA Compiler (nvcc) to check the version of CUDA installed on the system. It will display the version information in the output. Make sure that the path /usr/local/cuda/bin/nvcc is correct for your virtual machine. In some cases, you might need to use a different path to the nvcc executable if CUDA is installed in a different location. When you run this code, it will show you information about the installed CUDA version, which is important for compatibility with GPU-accelerated libraries and frameworks like TensorFlow, PyTorch, and Darknet.

# verify CUDA on Virtual Machine

!/usr/local/cuda/bin/nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0

This code is a command to build the Darknet framework. Building Darknet will compile the source code, resulting in an
executable file that you can use to run or train object detectors using YOLO models. Here's what the code does:

1. !make: This is a shell command that invokes the make utility. The make utility is commonly used for building
software projects by reading a Makefile (a script that defines how to compile and link the code) and executing
the compilation and

linking commands specified within it. When you run !make, it will execute the compilation process specified in
Darknet's Makefile. This process typically involves compiling the source code, linking libraries, and generating
the executable file that allows you to work with YOLO for object detection.

# make darknet (builds the darknet to make executable file to run or train object detectors)

!make

mkdir -p ./obj/
mkdir -p backup
mkdir -p results
chmod +x *.sh

g++ -std=c++11 -std=c++11 -Iinclude/ -I3rdparty/stb/include -DOPENCV `pkg-config --cflags opencv4 2> /dev/null ||
pkg-config --c

The below code is a command to download a pre-trained YOLOv4 model weights file from a specific GitHub release. Here's what this code does: !wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.weights: This command uses the wget utility to download a file from a given URL. In this case, it's downloading the YOLOv4 model weights file from the specified GitHub release. The URL points to a specific release version where the YOLOv4 model weights are hosted. These weights can be used for inference or fine-tuning in your YOLOv4-based object detection tasks. After running this command, the yolov4.weights file will be downloaded to your current working directory, and you can use it with the Darknet framework or other YOLO-based implementations for various object detection tasks. Make sure you have the necessary permissions and disk space to download the file.
detection tasks. Make sure you have the necessary permissions and disk space to download the file.

!wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.weights

--2023-12-20 11:45:38-- https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.weights
Resolving github.com (github.com)... 192.30.255.112
Connecting to github.com (github.com)|192.30.255.112|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/75388965/ba4b6380-889c-11ea-9751-f994f596179
--2023-12-20 11:45:38-- https://objects.githubusercontent.com/github-production-release-asset-2e65be/75388965/ba4b6380-889c-11ea-9
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.111.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 257717640 (246M) [application/octet-stream]
Saving to: ‘yolov4.weights’

yolov4.weights 100%[===================>] 245.78M 320MB/s in 0.8s

2023-12-20 11:45:39 (320 MB/s) - ‘yolov4.weights’ saved [257717640/257717640]

This code defines a set of useful functions for working with images and files in a Google Colab environment. Here's an
explanation of each of these functions:

imShow(path): This function is used to display an image stored at the specified path. It relies on the OpenCV library for
reading and

processing images and Matplotlib for displaying them. The image is read from the given path, resized to a larger size,
and then displayed in the Colab notebook using Matplotlib. This is a handy function for visualizing images.

upload() : This function allows you to upload files to a Google Colab notebook. It uses the files.upload()
function from the google.colab library to prompt the user to upload files. Once the user uploads a file, it is saved to
the current working directory of the Colab notebook.

download(path): This function is used to download a file from a Google Colab notebook. You specify the path
to the file you want to download, and it uses the files.download() function to trigger the download of that file to
your local machine.

# object class for useful functions
def imShow(path):
  import cv2
  import matplotlib.pyplot as plt
  %matplotlib inline

  image = cv2.imread(path)
  height, width = image.shape[:2]
  resized_image = cv2.resize(image, (3*width, 3*height), interpolation=cv2.INTER_CUBIC)

  fig = plt.gcf()
  fig.set_size_inches(18, 10)
  plt.axis("off")
  plt.imshow(cv2.cvtColor(resized_image, cv2.COLOR_BGR2RGB))
  plt.show()

# use this to upload files
def upload():
  from google.colab import files
  uploaded = files.upload()
  for name, data in uploaded.items():
    with open(name, 'wb') as f:
      f.write(data)
      print('saved file', name)

# use this to download a file
def download(path):
  from google.colab import files
  files.download(path)

The following command runs Darknet detection on a test image using the YOLOv4 model and the COCO dataset configuration. Here's an explanation of the command:

!./darknet: This is the path to the Darknet executable that you previously built using the make command. The
./ at the beginning specifies that the executable is in the current directory.

detector test: This is the command used to perform object detection with Darknet. It specifies that you want to run a
detection test.

cfg/coco.data: The path to the data configuration file. In this case, it's configured to use the COCO dataset.

cfg/yolov4.cfg: The path to the YOLOv4 model configuration file. This file contains information about the
architecture and settings of the YOLOv4 model.

yolov4.weights: The path to the YOLOv4 model weights file. This file contains the learned parameters of the YOLOv4
model.

data/object3.jpg: The path to the test image on which you want to perform object detection. When you run
this command, Darknet will use the YOLOv4 model to detect objects in the specified image. Detected objects will be
outlined and labeled in the output image. The result of the detection, including bounding boxes and class labels, will
be displayed in the terminal or the Colab notebook, depending on where you're running the code.

# run darknet detection using COCO Dataset on test images

!./darknet detector test cfg/coco.data cfg/yolov4.cfg yolov4.weights data/object3.jpg

CUDA-version: 12020 (12020), cuDNN: 8.9.6, CUDNN_HALF=1, GPU count: 1 CUDNN_HALF=1

OpenCV version: 4.5.4

0 : compute_capability = 750, cudnn_half = 1, GPU: Tesla T4 net.optimized_memory = 0

mini_batch = 1, batch = 8, time_steps = 1, train = 0

   layer   filters  size/strd(dil)          input                output
   0 Create CUDA-stream - 0
     Create cudnn-handle 0
     conv     32       3 x 3/ 1    608 x 608 x   3 ->  608 x 608 x  32 0.639 BF
   1 conv     64       3 x 3/ 2    608 x 608 x  32 ->  304 x 304 x  64 3.407 BF
   2 conv     64       1 x 1/ 1    304 x 304 x  64 ->  304 x 304 x  64 0.757 BF
   3 route  1                                      ->  304 x 304 x  64
   4 conv     64       1 x 1/ 1    304 x 304 x  64 ->  304 x 304 x  64 0.757 BF
   5 conv     32       1 x 1/ 1    304 x 304 x  64 ->  304 x 304 x  32 0.379 BF
   6 conv     64       3 x 3/ 1    304 x 304 x  32 ->  304 x 304 x  64 3.407 BF
   7 Shortcut Layer: 4,  wt = 0, wn = 0, outputs: 304 x 304 x  64 0.006 BF
   8 conv     64       1 x 1/ 1    304 x 304 x  64 ->  304 x 304 x  64 0.757 BF
   9 route  8 2                                    ->  304 x 304 x 128
  10 conv     64       1 x 1/ 1    304 x 304 x 128 ->  304 x 304 x  64 1.514 BF
  11 conv    128       3 x 3/ 2    304 x 304 x  64 ->  152 x 152 x 128 3.407 BF
  12 conv     64       1 x 1/ 1    152 x 152 x 128 ->  152 x 152 x  64 0.379 BF
  13 route  11                                     ->  152 x 152 x 128
  14 conv     64       1 x 1/ 1    152 x 152 x 128 ->  152 x 152 x  64 0.379 BF
  15 conv     64       1 x 1/ 1    152 x 152 x  64 ->  152 x 152 x  64 0.189 BF
  16 conv     64       3 x 3/ 1    152 x 152 x  64 ->  152 x 152 x  64 1.703 BF
  17 Shortcut Layer: 14, wt = 0, wn = 0, outputs: 152 x 152 x  64 0.001 BF
  18 conv     64       1 x 1/ 1    152 x 152 x  64 ->  152 x 152 x  64 0.189 BF
  19 conv     64       3 x 3/ 1    152 x 152 x  64 ->  152 x 152 x  64 1.703 BF
  20 Shortcut Layer: 17, wt = 0, wn = 0, outputs: 152 x 152 x  64 0.001 BF
  21 conv     64       1 x 1/ 1    152 x 152 x  64 ->  152 x 152 x  64 0.189 BF
  22 route  21 12                                  ->  152 x 152 x 128
  23 conv    128       1 x 1/ 1    152 x 152 x 128 ->  152 x 152 x 128 0.757 BF
  24 conv    256       3 x 3/ 2    152 x 152 x 128 ->   76 x  76 x 256 3.407 BF
  25 conv    128       1 x 1/ 1     76 x  76 x 256 ->   76 x  76 x 128 0.379 BF
  26 route  24                                     ->   76 x  76 x 256
  27 conv    128       1 x 1/ 1     76 x  76 x 256 ->   76 x  76 x 128 0.379 BF
  28 conv    128       1 x 1/ 1     76 x  76 x 128 ->   76 x  76 x 128 0.189 BF
  29 conv    128       3 x 3/ 1     76 x  76 x 128 ->   76 x  76 x 128 1.703 BF
  30 Shortcut Layer: 27, wt = 0, wn = 0, outputs:  76 x  76 x 128 0.001 BF
  31 conv    128       1 x 1/ 1     76 x  76 x 128 ->   76 x  76 x 128 0.189 BF
  32 conv    128       3 x 3/ 1     76 x  76 x 128 ->   76 x  76 x 128 1.703 BF
  33 Shortcut Layer: 30, wt = 0, wn = 0, outputs:  76 x  76 x 128 0.001 BF
  34 conv    128       1 x 1/ 1     76 x  76 x 128 ->   76 x  76 x 128 0.189 BF
  35 conv    128       3 x 3/ 1     76 x  76 x 128 ->   76 x  76 x 128 1.703 BF
  36 Shortcut Layer: 33, wt = 0, wn = 0, outputs:  76 x  76 x 128 0.001 BF
  37 conv    128       1 x 1/ 1     76 x  76 x 128 ->   76 x  76 x 128 0.189 BF
  38 conv    128       3 x 3/ 1     76 x  76 x 128 ->   76 x  76 x 128 1.703 BF
  39 Shortcut Layer: 36, wt = 0, wn = 0, outputs:  76 x  76 x 128 0.001 BF
  40 conv    128       1 x 1/ 1     76 x  76 x 128 ->   76 x  76 x 128 0.189 BF
  41 conv    128       3 x 3/ 1     76 x  76 x 128 ->   76 x  76 x 128 1.703 BF
  42 Shortcut Layer: 39, wt = 0, wn = 0, outputs:  76 x  76 x 128 0.001 BF
  43 conv    128       1 x 1/ 1     76 x  76 x 128 ->   76 x  76 x 128 0.189 BF
  44 conv    128       3 x 3/ 1     76 x  76 x 128 ->   76 x  76 x 128 1.703 BF
  45 Shortcut Layer: 42, wt = 0, wn = 0, outputs:  76 x  76 x 128 0.001 BF
  46 conv    128       1 x 1/ 1     76 x  76 x 128 ->   76 x  76 x 128 0.189 BF
  47 conv    128       3 x 3/ 1     76 x  76 x 128 ->   76 x  76 x 128 1.703 BF
  48 Shortcut Layer: 45, wt = 0, wn = 0, outputs:  76 x  76 x 128 0.001 BF

The code imShow('predictions.jpg') is using the previously defined imShow() function to display an image named
"predictions.jpg." This
image is likely generated as a result of running the object detection with Darknet, and it's expected to show the
detected objects with bounding boxes and labels.

# show image using our defined object imShow('predictions.jpg')

This code first moves up one directory ( %cd .. ), which means it navigates out of the "darknet" directory. Then, it calls the
upload() function, which allows you to upload a new image from your computer into your current Google Colab environment.
Afterward, it navigates back to the "darknet" directory.

%cd ..: This command moves up one directory from the current working directory. It's used to navigate out of the
"darknet" directory.
#Upload new image from the computer using the defined upload() function
%cd ..
upload()
%cd darknet

/content
Upload widget is only available when the cell has been executed in the current browser session. Please rerun this cell to enable.

# run darknet with YOLOv4 on your personal image! (note yours will not be called highway.jpg so change the name)

!./darknet detector test cfg/coco.data cfg/yolov4.cfg yolov4.weights ../object2.jpg
imShow('predictions.jpg')

CUDA-version: 12020 (12020), cuDNN: 8.9.6, CUDNN_HALF=1, GPU count: 1 CUDNN_HALF=1

OpenCV version: 4.5.4

0 : compute_capability = 750, cudnn_half = 1, GPU: Tesla T4 net.optimized_memory = 0

mini_batch = 1, batch = 8, time_steps = 1, train = 0

layer filters size/strd(dil) input output

0 Create CUDA-stream - 0

Create cudnn-handle 0

conv 32 3 x 3/ 1 608 x 608 x 3 -> 608 x 608 x 32 0.639 BF


1 conv 64 3 x 3/ 2 608 x 608 x 32 -> 304 x 304 x 64 3.407 BF
2 conv 64 1 x 1/ 1 304 x 304 x 64 -> 304 x 304 x 64 0.757 BF
3 route 1 -> 304 x 304 x 64
4 conv 64 1 x 1/ 1 304 x 304 x 64 -> 304 x 304 x 64 0.757 BF
5 conv 32 1 x 1/ 1 304 x 304 x 64 -> 304 x 304 x 32 0.379 BF
6 conv 64 3 x 3/ 1 304 x 304 x 32 -> 304 x 304 x 64 3.407 BF
Shortcut Layer: 4, wt = 0, wn = 0, outputs: 304 x 304 x 64 0.006 BF
conv 64 1 x 1/ 1 304 x 304 x 64 -> 304 x 304 x 64 0.757 BF
9 route 8 2 -> 304 x 304 x 128
10 conv 64 1 x 1/ 1 304 x 304 x 128 -> 304 x 304 x 64 1.514 BF
11 conv 128 3 x 3/ 2 304 x 304 x 64 -> 152 x 152 x 128 3.407 BF
12 conv 64 1 x 1/ 1 152 x 152 x 128 -> 152 x 152 x 64 0.379 BF
13 route 11 -> 152 x 152 x 128
14 conv 64 1 x 1/ 1 152 x 152 x 128 -> 152 x 152 x 64 0.379 BF
15 conv 64 1 x 1/ 1 152 x 152 x 64 -> 152 x 152 x 64 0.189 BF

28 conv 128 1 x 1/ 1 76 x 76 x 128 -> 76 x 76 x 128 0.189 BF
29 conv 128 3 x 3/ 1 76 x 76 x 128 -> 76 x 76 x 128 1.703 BF
30 Shortcut Layer: 27, wt = 0, wn = 0, outputs: 76 x 76 x 128 0.001 BF
31 conv 128 1 x 1/ 1 76 x 76 x 128 -> 76 x 76 x 128 0.189 BF
32 conv 128 3 x 3/ 1 76 x 76 x 128 -> 76 x 76 x 128 1.703 BF
33 Shortcut Layer: 30, wt = 0, wn = 0, outputs: 76 x 76 x 128 0.001 BF
34 conv 128 1 x 1/ 1 76 x 76 x 128 -> 76 x 76 x 128 0.189 BF
35 conv 128 3 x 3/ 1 76 x 76 x 128 -> 76 x 76 x 128 1.703 BF
36 Shortcut Layer: 33, wt = 0, wn = 0, outputs: 76 x 76 x 128 0.001 BF
37 conv 128 1 x 1/ 1 76 x 76 x 128 -> 76 x 76 x 128 0.189 BF
38 conv 128 3 x 3/ 1 76 x 76 x 128 -> 76 x 76 x 128 1.703 BF
39 Shortcut Layer: 36, wt = 0, wn = 0, outputs: 76 x 76 x 128 0.001 BF
40 conv 128 1 x 1/ 1 76 x 76 x 128 -> 76 x 76 x 128 0.189 BF
41 conv 128 3 x 3/ 1 76 x 76 x 128 -> 76 x 76 x 128 1.703 BF
42 Shortcut Layer: 39, wt = 0, wn = 0, outputs: 76 x 76 x 128 0.001 BF
43 conv 128 1 x 1/ 1 76 x 76 x 128 -> 76 x 76 x 128 0.189 BF
44 conv 128 3 x 3/ 1 76 x 76 x 128 -> 76 x 76 x 128 1.703 BF
45 Shortcut Layer: 42, wt = 0, wn = 0, outputs: 76 x 76 x 128 0.001 BF
46 conv 128 1 x 1/ 1 76 x 76 x 128 -> 76 x 76 x 128 0.189 BF
47 conv 128 3 x 3/ 1 76 x 76 x 128 -> 76 x 76 x 128 1.703 BF

from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).

## DOWNLOAD to LOCAL MACHINE
download('predictions.jpg')


EXPERIMENT-7
AIM : Design a Deep learning Network for Robust Bi-Tempered Logistic Loss.
DESCRIPTION:
The Bi-Tempered Logistic Loss is an enhancement over the conventional cross-entropy loss function,
particularly useful in scenarios where the data is noisy or imbalanced. It aims to address the drawbacks of
the standard logistic loss by introducing temperature parameters to control the sharpness of the loss
function and better handle outliers and noisy data points.
Standard Logistic Loss:
- Used for classification tasks to measure the difference between predicted and actual class probabilities.
- Can be sensitive to outliers and noisy data points.
Bi-Tempered Logistic Loss:
- Introduces two temperature parameters, t1 and t2, to control the sharpness of the loss curve.
- Adapts the loss function to be less sensitive to outliers and noisy samples.
- Utilizes a modified log function with temperature scaling to calculate the loss.
- Incorporates two temperature-scaling functions to balance between confident and uncertain predictions.
We know that deep learning model performance depends on the quality of the training data, and real-world training sets are often noisy; corrupted images and mislabeled samples are common examples. The loss function can fail to handle noisy training data for two reasons:
1. Highly deviated outliers: loss functions such as the logistic loss are sensitive to outliers.
2. Mislabeled data samples: the neural network assigns a class label to each test sample by widening the margin between classes. While the decision boundary is being pushed outward, the loss value drops very quickly, so training tends to pull the boundary toward the outliers or mislabeled samples, and prediction errors occur.
So a robust loss function is required; the bi-tempered logistic loss function can be used to handle the problem of noisy training data.
As the name suggests, there are two tunable "temperature" parameters that handle outliers and mislabeled data:
t1: controls the boundedness of the loss, and
t2: controls the rate of decay in the tail of the transfer (softmax-like) function.
Initializing t1 and t2 to 1.0 recovers the ordinary logistic loss. Setting t1 < 1.0 makes the loss bounded, and setting t2 > 1.0 makes the transfer function heavy-tailed.
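For reference, the tempered logarithm and tempered exponential that give the loss its name can be written as follows (this is the formulation from the original bi-tempered loss paper by Amid et al., 2019; the Keras code later in this experiment implements a simplified variant of the same idea):

$$\log_t(x) = \frac{x^{1-t}-1}{1-t} \ (t \neq 1), \qquad \exp_t(x) = \big[\,1 + (1-t)\,x\,\big]_+^{1/(1-t)}$$

With labels $y$ and tempered-softmax outputs $\hat{y}$, the bi-tempered loss is

$$L(y,\hat{y}) = \sum_i \Big( y_i \log_{t_1} y_i - y_i \log_{t_1} \hat{y}_i - \tfrac{1}{2-t_1}\big(y_i^{\,2-t_1} - \hat{y}_i^{\,2-t_1}\big) \Big),$$

and choosing $t_1 = t_2 = 1$ recovers the ordinary softmax cross-entropy (logistic) loss.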

DATASET:-
The MNIST dataset is a widely used dataset in the field of machine learning and computer vision. It is a
collection of 70,000 small, grayscale images of handwritten digits (0 through 9), each of size 28x28 pixels.
The dataset is often used as a benchmark in the development and testing of various machine learning
algorithms, particularly for image classification tasks.
Here are the key details of the MNIST dataset:
1. Size of the Dataset:
- The MNIST dataset consists of 70,000 images in total.
-This dataset is commonly divided into two subsets: a training set and a test set.
2. Training Set:
- The training set typically comprises 60,000 images.
- These images are used to train machine learning models. The model learns patterns and features from
this set.
3. Test Set:
- The test set contains the remaining 10,000 images.
- It is used to evaluate the performance of a trained model on unseen data. This helps assess how well the
model
generalizes to new, previously unseen examples.
4. Image Characteristics:
- Each image in the MNIST dataset is grayscale, meaning it has only one channel (as opposed to RGB
images,
which have three channels for red, green, and blue).
- The images are 28x28 pixels in size, resulting in a total of 784 pixels per image.
5.Labeling:
- Each image is associated with a label indicating the digit it represents (0 through 9). For instance, an
image of
the digit "5" will have a label of 5.
6. Usage in Machine Learning:
- The MNIST dataset is commonly used for training and testing machine learning models, especially for
tasks
related to image classification.
- It serves as a standard benchmark to compare the performance of different algorithms.
7. Challenges:
- While MNIST has been instrumental in the development of many image classification techniques, it is
considered a relatively simple dataset compared to real-world scenarios. Some modern models may
achieve near- perfect accuracy on MNIST, but this does not necessarily guarantee success on more complex
datasets.
8. Availability:
- The MNIST dataset is freely available and can be accessed from various machine learning libraries and
Repositories. When working with MNIST, it's common to preprocess the data, normalize the pixel values,
and use techniques like convolutional neural networks (CNNs) for effective feature extraction from the
images.
Source Code:
Designing a deep learning network for the Robust Bi-Tempered Logistic Loss involves creating a neural network architecture that works effectively with this specialized loss function. Here's a concise breakdown:
Understanding the Loss Function:
Bi-Tempered Logistic Loss is a robust loss function used in classification tasks, especially in scenarios where the data might be noisy or mislabeled. It introduces temperature parameters and a scaling factor to control the impact of different classes on the loss.
Network Architecture:
Choose an appropriate architecture (e.g., a CNN for images) considering the complexity and nature of the data. Include normalization layers and techniques to prevent overfitting.
Custom Loss Function:
Implement the Bi-Tempered Logistic Loss as a custom loss function in TensorFlow/Keras. Define the function with parameters (t1, t2, c) and handle numerical stability.
Model Compilation:
Compile the model with an optimizer (e.g., Adam) using the custom Bi-Tempered Logistic Loss function. Include relevant metrics for monitoring performance (e.g., accuracy).
Training and Evaluation:
Train the model on the training data and monitor performance on validation data. Evaluate the model's performance using metrics (loss, accuracy) on unseen test data.
Hyperparameter Tuning:
Experiment with different values for the temperature parameters (t1, t2, c) to enhance the model's performance.
Considerations:
Ensure data quality, adjust model complexity, and apply regularization techniques. Document experiments and track model performance for iterative improvements.
Deployment:
Save the trained model and consider optimization techniques for deployment in applications.
The focus is on crafting a neural network that effectively learns from data using the Bi-Tempered Logistic Loss, ensuring robustness in the face of noisy or mislabeled data while achieving good performance.
import tensorflow as tf
!pip install -U tensorflow-addons
Requirement already satisfied: tensorflow-addons in /usr/local/lib/python3.10/dist-pa
Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (
Requirement already satisfied: typeguard<3.0.0,>=2.7 in /usr/local/lib/python3.10/dis
The provided code accomplishes the following tasks:
Imports: It imports the necessary modules from TensorFlow/Keras to build and train neural networks, including components for handling the MNIST dataset.
Bi-Tempered Logistic Loss Function: Defines a custom loss function named bi_tempered_logistic_loss that implements the Bi-Tempered Logistic Loss equation, addressing noisy or mislabeled data by modeling the class distribution more flexibly.
Loading and Preprocessing the MNIST Dataset: Loads the MNIST dataset, normalizes pixel values to a range between 0 and 1, reshapes the data to comply with a CNN's input shape, and one-hot encodes the labels.
Building the CNN Model: Constructs a Sequential model using Convolutional Neural Network (CNN) layers:
Two sets of convolutional layers followed by MaxPooling layers to extract features.
A Flatten layer to prepare for the Dense layers.
Dense layers for learning representations.
A final Lambda layer to normalize the outputs.
Compiling the Model: Compiles the model using the Adam optimizer and the custom Bi-Tempered Logistic Loss function.
Training the Model: Trains the compiled model on the prepared MNIST dataset for 10 epochs while validating the model's performance on the test set.
Evaluation and Output: Evaluates the trained model's performance on the test set to compute the test loss value. The output shows the training progress for each epoch, displaying the loss values, and finally prints the test loss value calculated after training completes.
The output indicates the training progress across 10 epochs, displaying the loss values at each epoch. The final line prints the test loss value calculated after the model has completed training and evaluation on the test dataset. The negative loss values stem from the way this custom loss function is defined; negative values can occur when a loss is not a conventional mean-squared-error or categorical cross-entropy but is specific to the application or the custom function's design.
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Lambda
from tensorflow.keras.models import Sequential
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Bi-tempered logistic loss function
def bi_tempered_logistic_loss(y_true, y_pred, t1, t2, c, epsilon=1e-7):
    y_pred = tf.clip_by_value(y_pred, epsilon, 1 - epsilon)
    temp1 = (tf.math.pow(1 - y_pred, t1) - 1) / t1
    temp2 = (tf.math.pow(1 - y_pred, t2) - 1) / t2
    loss = y_true * (temp1 + temp2) + c * (temp1 * temp2)
    return tf.reduce_sum(loss, axis=-1)

# Load the MNIST dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train, X_test = X_train / 255.0, X_test / 255.0  # Normalize pixel values
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)
y_train = to_categorical(y_train, 10)  # One-hot encode labels
y_test = to_categorical(y_test, 10)

# Build a CNN model with Bi-Tempered Logistic Loss
model = Sequential([
    Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='linear'),  # Use linear activation for logits
    Lambda(lambda x: tf.math.l2_normalize(x, axis=1))  # Normalize the logits
])

# Compile the model with Bi-Tempered Logistic Loss
t1 = 0.8
t2 = 1.2
c = 1.0
model.compile(optimizer='adam',
              loss=lambda y_true, y_pred: bi_tempered_logistic_loss(y_true, y_pred, t1, t2, c))

# Train the model
model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test))

# Evaluate the model
test_loss = model.evaluate(X_test, y_test, verbose=0)
print(f"Test loss: {test_loss}")
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mni
11490434/11490434 [==============================] - 0s 0us/step
Epoch 1/10
1875/1875 [==============================] - 58s 30ms/step - loss: -0.9584 - val_loss
Epoch 2/10
1875/1875 [==============================] - 58s 31ms/step - loss: -1.0171 - val_loss
Epoch 3/10
1875/1875 [==============================] - 59s 31ms/step - loss: -1.0246 - val_loss
Epoch 4/10
1875/1875 [==============================] - 55s 29ms/step - loss: -1.0276 - val_loss
Epoch 5/10
1875/1875 [==============================] - 55s 29ms/step - loss: -1.0309 - val_loss

Epoch 6/10
1875/1875 [==============================] - 55s 29ms/step - loss: -1.0307 - val_loss

Epoch 7/10
1875/1875 [==============================] - 55s 29ms/step - loss: -1.0326 - val_loss
Epoch 8/10
1875/1875 [==============================] - 55s 29ms/step - loss: -1.0335 - val_loss
Epoch 9/10
1875/1875 [==============================] - 59s 31ms/step - loss: -1.0338 - val_loss
Epoch 10/10
1875/1875 [==============================] - 55s 29ms/step - loss: -1.0359 - val_loss
Test loss: -1.026029348373413
Number of Epochs:
The first code snippet trains the model for 10 epochs, while the second snippet trains for only 5 epochs.
Output:
The output provided in both cases displays the training progress for each epoch, showing the loss values at each epoch. The final line prints the test loss value calculated after training completes.
Training Time:
Training times might differ because of the different number of epochs. The exact times for each epoch vary between the two executions but should be similar given similar hardware and dataset sizes.
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Lambda
from tensorflow.keras.models import Sequential
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Bi-tempered logistic loss function
def bi_tempered_logistic_loss(y_true, y_pred, t1, t2, c, epsilon=1e-7):
    y_pred = tf.clip_by_value(y_pred, epsilon, 1 - epsilon)
    temp1 = (tf.math.pow(1 - y_pred, t1) - 1) / t1
    temp2 = (tf.math.pow(1 - y_pred, t2) - 1) / t2
    loss = y_true * (temp1 + temp2) + c * (temp1 * temp2)
    return tf.reduce_sum(loss, axis=-1)

# Load the MNIST dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train, X_test = X_train / 255.0, X_test / 255.0  # Normalize pixel values

# Reshape the data to have a single channel (grayscale)
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)

# One-hot encode the labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Build a CNN model with Bi-Tempered Logistic Loss
model = Sequential([
    Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='linear'),  # Use linear activation for logits
    Lambda(lambda x: tf.math.l2_normalize(x, axis=1))  # Normalize the logits
])

# Compile the model with Bi-Tempered Logistic Loss
t1 = 0.8
t2 = 1.2
c = 1.0
model.compile(optimizer='adam',
              loss=lambda y_true, y_pred: bi_tempered_logistic_loss(y_true, y_pred, t1, t2, c))

# Train the model
model.fit(X_train, y_train, epochs=5, validation_data=(X_test, y_test))

# Evaluate the model
test_loss = model.evaluate(X_test, y_test, verbose=0)
print(f"Test loss: {test_loss}")
Epoch 1/5
1875/1875 [==============================] - 56s 29ms/step - loss: -0.9596 - val_loss
Epoch 2/5
1875/1875 [==============================] - 59s 31ms/step - loss: -1.0175 - val_loss
Epoch 3/5
1875/1875 [==============================] - 57s 30ms/step - loss: -1.0250 - val_loss
Epoch 4/5
1875/1875 [==============================] - 55s 29ms/step - loss: -1.0277 - val_loss
Epoch 5/5
1875/1875 [==============================] - 57s 31ms/step - loss: -1.0311 - val_loss
Test loss: -1.026908040046692
The primary difference lies in the optimizer used during compilation; the rest of the code is nearly identical to the previous snippet. rmsprop is another optimizer commonly used in neural network training, but its behavior and convergence might differ from adam, so the output and behavior of the model could vary because of this change in optimization algorithm.
If you want to delve deeper into the optimizer differences, adam generally adapts the learning rate during training, while rmsprop also adapts the learning rate but in a slightly different manner. The performance and convergence of the model might be affected by this change, potentially leading to different training dynamics and final accuracy.
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Lambda
from tensorflow.keras.models import Sequential
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Bi-tempered logistic loss function
def bi_tempered_logistic_loss(y_true, y_pred, t1, t2, c, epsilon=1e-7):
    y_pred = tf.clip_by_value(y_pred, epsilon, 1 - epsilon)
    temp1 = (tf.math.pow(1 - y_pred, t1) - 1) / t1
    temp2 = (tf.math.pow(1 - y_pred, t2) - 1) / t2
    loss = y_true * (temp1 + temp2) + c * (temp1 * temp2)
    return tf.reduce_sum(loss, axis=-1)

# Load the MNIST dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train, X_test = X_train / 255.0, X_test / 255.0  # Normalize pixel values

# Reshape the data to have a single channel (grayscale)
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)

# One-hot encode the labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Build a CNN model with Bi-Tempered Logistic Loss
model = Sequential([
    Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D(pool_size=(2, 2)),
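The listing above is truncated in this record. Going by the description, the remaining layers presumably mirror the previous snippet, with only the compile step changed to use rmsprop. A minimal sketch of that compile step (an assumption, since it is not shown here) would be:

# Presumed compile step for this variant (assumption: only the optimizer differs
# from the previous snippet; t1, t2 and c keep the earlier values 0.8, 1.2 and 1.0)
t1, t2, c = 0.8, 1.2, 1.0
model.compile(optimizer='rmsprop',
              loss=lambda y_true, y_pred: bi_tempered_logistic_loss(y_true, y_pred, t1, t2, c))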
EXPERIMENT - 8
AIM : Build AlexNet using Advanced CNN.
Description :

Introduction
AlexNet is a groundbreaking convolutional neural network (CNN) architecture that achieved remarkable
success in the ImageNet Large Scale Visual Recognition Challenge in 2012. It played a pivotal role in
popularizing deep learning for image classification tasks. In this document, we'll delve into the process of
building an AlexNet-inspired model using advanced CNN techniques.
Key Components of AlexNet

Convolutional Layers
1. 1st Convolutional Layer : 96 filters, 11x11 kernel size, 4x4 strides, ReLU activation.
2. Max Pooling : 3x3 pool size, 2x2 strides.
3. 2nd Convolutional Layer : 256 filters, 5x5 kernel size, 1x1 strides, ReLU activation.
4. Max Pooling : 3x3 pool size, 2x2 strides.
5. 3rd Convolutional Layer : 384 filters, 3x3 kernel size, 1x1 strides, ReLU activation.
6. 4th Convolutional Layer : 384 filters, 3x3 kernel size, 1x1 strides, ReLU activation.
7. 5th Convolutional Layer : 256 filters, 3x3 kernel size, 1x1 strides, ReLU activation.
8. Max Pooling : 3x3 pool size, 2x2 strides.

Fully Connected Layers


9. Flatten : Flatten the output for fully connected layers.
10. 1st Fully Connected Layer : 4096 neurons, ReLU activation, Dropout (40% rate).
11. 2nd Fully Connected Layer : 4096 neurons, ReLU activation, Dropout (40% rate).
12. 3rd Fully Connected Layer (Output Layer) : 1000 neurons (based on ImageNet classes), softmax
activation.

Model Compilation
• Loss Function: Categorical Cross entropy
• Optimizer: Adam
• Metrics: Accuracy

Advanced CNN Techniques


1. Normalization of Data: Normalize pixel values to have zero mean and unit variance.
2. Data Augmentation : Augment training data with random transformations to improve model generalization (a small illustrative sketch follows this list).
3. Transfer Learning : Leverage pre-trained models or pre-trained layers to boost performance on
specific tasks.
4. Dropout Regularization: Use dropout layers to prevent overfitting during training.
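Of these techniques, data augmentation is not shown in the source code of this experiment; a minimal, illustrative Keras sketch (assuming x_train and y_train are shaped as in the preprocessing shown later) might look like this:

from keras.preprocessing.image import ImageDataGenerator

# Illustrative augmentation settings for handwritten digits: small rotations,
# shifts and zooms; horizontal flips are avoided because digits are not symmetric.
datagen = ImageDataGenerator(rotation_range=10,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             zoom_range=0.1)

# The generator yields augmented batches on the fly during training, e.g.:
# model.fit(datagen.flow(x_train, y_train, batch_size=32), epochs=epochs)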

Training and Evaluation


• Train the model for a specified number of epochs using the training set.
• Evaluate the model on a separate validation set to monitor generalization.
• Test the final model on an independent test set to assess performance.

Visualization and Analysis


• Plot training and validation accuracy/loss curves.
• Generate a classification report for detailed performance metrics.
• Visualize correctly and incorrectly classified samples for insights.

DATASET:
MNIST Dataset Overview : The MNIST dataset is a widely used collection of handwritten digit images,
commonly employed as a benchmark in the field of machine learning and computer vision. Here's a
comprehensive description of the MNIST dataset:

Overview:
• Name: MNIST (Modified National Institute of Standards and Technology)
• Nature: Image Classification
• Dataset Size:
• Training Set: 60,000 images
• Test Set: 10,000 images
• Image Size: 28x28 pixels
• Classes: 10 (Digits 0 through 9)
• Source: Originally created by Yann LeCun, Corinna Cortes, and Christopher J.C. Burges for NIST,
modified for machine learning experiments.

Characteristics:
1.Image Content: Each image in the MNIST dataset is a grayscale image of a handwritten digit.
2.Labeling : Every image is associated with a label indicating the digit it represents (0 through 9).
3.Digit Variety : The dataset includes a diverse set of digits, capturing variations in writing styles.
4.Size Consistency : All images are resized to a standard size of 28x28 pixels, providing consistency for
machine learning models.
5.Gray Scale: Images are in grayscale, with pixel values ranging from 0 to 255
Source Code:
This code block imports various Python libraries and modules commonly used in machine learning, particularly for
working with image data and building neural networks using the Keras framework. Here's a brief description of each:

numpy: A library for numerical operations, particularly useful for handling arrays and matrices.

pandas: A data manipulation library that provides data structures like DataFrames, making it easy to work with
structured data.

matplotlib.pyplot: A plotting library for creating visualizations

keras: A high-level deep learning library. It provides an easy-to-use interface for building and training neural networks.

to_categorical: A utility function in Keras for one-hot encoding categorical variables.

image: Keras module for working with images.

img_to_array, array_to_img: Functions for converting images to arrays and vice versa.

train_test_split: Function from scikit-learn for splitting datasets into training and testing sets.

Sequential: Keras class for creating a linear stack of layers.

Various layers (Dense, Dropout, Flatten, Input, Conv2D, MaxPooling2D, AveragePooling2D, BatchNormalization):
Different types of layers used in building neural networks.

classification_report: A function from scikit-learn for generating a classification report, which includes precision,
recall, and F1-score.

datasets: A module in Keras for loading common datasets.

Importing the Model class from Keras, which allows building more complex neural network architectures than the
simple sequential model.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import keras

from keras.utils import to_categorical
from keras.preprocessing import image
from keras.preprocessing.image import img_to_array, array_to_img
from sklearn.model_selection import train_test_split

from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Input
from keras.layers import Conv2D, MaxPooling2D, AveragePooling2D
from keras.layers import BatchNormalization

from sklearn.metrics import classification_report
from keras import datasets

#from scipy.misc import imresize
from keras.models import Model

In this section of the code, the MNIST dataset is being loaded using Keras. MNIST is a well-known dataset of
handwritten digits, commonly used for training and testing machine learning models

This line loads the MNIST dataset using Keras. The dataset is split into training and testing sets, with images and
corresponding labels.

x_train and x_test contain the images of handwritten digits.

y_train and y_test contain the corresponding labels (the digit each image represents).

This line creates a new variable y_true and assigns it the values of y_test. y_true is commonly used in machine
learning contexts to represent the true (actual) labels.
(x_train, y_train), (x_test, y_test) = datasets.mnist.load_data()
y_true = y_test

This code block is preparing the MNIST dataset for input into a neural network by reshaping the images. Here's an
explanation of the code:

img_row, img_cols = 28, 28

input_shape = (img_row, img_cols, 1)

These lines define the dimensions of the images in the MNIST dataset. Each image is 28 pixels in height and 28 pixels
in width, forming a 28x28 pixel grid. The input_shape variable is set to (28, 28, 1), indicating the dimensions of each
image with a single channel (grayscale).

x_train = x_train.reshape(x_train.shape[0], img_cols, img_row, 1)
x_test = x_test.reshape(x_test.shape[0], img_cols, img_row, 1)

These lines reshape the image arrays (x_train and x_test) to match the specified input_shape. The reshape function
is used to rearrange the dimensions of the arrays. The new shape is set to (number of samples, height, width,
channels). In this case, it's (number of samples, 28, 28, 1) to match the expected input shape for a Convolutional
Neural Network (CNN).

print("Train set shape", x_train.shape, 'train label shape', y_train.shape)

print('Test set shape', x_test.shape, 'test labels:', y_test.shape)

These lines print the shapes of the training and testing sets after the reshaping process. It helps to verify that the
reshaping was successful and that the input shapes match the expected format for a CNN. The shapes of x_train and
x_test should now be (number of samples, 28, 28, 1), and the shapes of y_train and y_test represent the labels for
the corresponding sets.

This preparation is crucial for feeding the data into a convolutional neural network (CNN) since CNNs expect input
data in a specific format, especially when working with image data

# parsing through the dataset
img_row, img_cols = 28, 28
input_shape = (img_row, img_cols, 1)

x_train = x_train.reshape(x_train.shape[0], img_cols, img_row, 1)
x_test = x_test.reshape(x_test.shape[0], img_cols, img_row, 1)

print("Train set shape", x_train.shape, 'trainlabel shape', y_train.shape)
print('test set shape', x_test.shape, 'test labels:', y_test.shape)

Train set shape (60000, 28, 28, 1) trainlabel shape (60000,)

test set shape (10000, 28, 28, 1) test labels: (10000,)

This code block is splitting the original training set (x_train and y_train) into a new training set (x_train and y_train), a
validation set (x_val and

y_val), and a test set (x_test and y_test). Here's a breakdown of the code:

x_train, x_val, y_train, y_val = train_test_split(x_test, y_test, test_size=0.2)

This line uses the train_test_split function from scikit-learn to split the original training set (x_test and y_test) into a
new training set (x_train and y_train) and a validation set (x_val and y_val). The test_size parameter is set to 0.2,
indicating that 20% of the data will be used for validation.

x_test = x_train[:5000]
y_test = y_train[:5000]

This line further splits the original training set (x_train and y_train) into a smaller test set (x_test and y_test) by
selecting the first 5000 samples. This smaller test set is likely used for quick testing or validation purposes.
print('X_train shape:', x_train.shape, 'X_label shape:', y_train.shape) print('Val_set shape:', x_val.shape, 'val_label
shape:', y_val.shape) print('Test_set shape:', x_test.shape, 'y_test shape:', y_test.shape)

These lines print the shapes of the newly created training, validation, and test sets. It's a useful step to verify that the
data has been split

correctly, and the shapes match the expectations. The shapes printed indicate the number of samples and the
dimensions of each sample for the respective sets.

# split the train set into a validation set
x_train, x_val, y_train, y_val = train_test_split(x_test, y_test, test_size=0.2)
x_test = x_train[:5000]
y_test = y_train[:5000]

print('X_train shape:', x_train.shape, 'X_label shape:', y_train.shape)
print('Val_set shape:', x_val.shape, 'val_label shape:', y_val.shape)
print('Test_set shape:', x_test.shape, 'y_test shape:', y_test.shape)

X_train shape: (8000, 28, 28, 1) X_label shape: (8000,)

Val_set shape: (2000, 28, 28, 1) val_label shape: (2000,)

Test_set shape: (5000, 28, 28, 1) y_test shape: (5000,)

This code block is performing normalization on the pixel values of the images in the training, validation, and test sets.
Normalization is a common preprocessing step in machine learning, especially for image data. Here's an explanation
of the code:

For each set (x_train, x_val, and x_test), the pixel values are normalized. The normalization process involves
subtracting the mean and dividing by the standard deviation.

x_train.mean() and x_train.std() calculate the mean and standard deviation of the pixel values in the training set.

Similarly, x_val.mean() and x_val.std() are used for the validation set, and x_test.mean() and x_test.std() for the test set.

The result is that each set is transformed so that its pixel values have a mean of approximately 0 and a standard
deviation of approximately 1. Normalizing the data helps in training neural networks by ensuring that the features
(pixel values in this case) are on a similar scale, which can lead to better convergence during training.

This normalization process is a common practice in machine learning to improve the stability and performance of the
training process.

# normalization of data

x_train = (x_train - x_train.mean()) / x_train.std()
x_val = (x_val - x_val.mean()) / x_val.std()
x_test = (x_test - x_test.mean()) / x_test.std()

This code block appears to be preparing the data for a neural network model, specifically for image classification.
Here's an overall description:

num_labels = 10

This line sets the variable num_labels to 10, suggesting that the dataset involves classifying images into 10 different
categories.

im_row = 227

im_col = 227
These lines set the dimensions (im_row and im_col) to 227x227 pixels, indicating the desired size for the images in the
dataset.

def reformat(dataset):

dataset = np.asarray([img_to_array(array_to_img(im, scale=False).resize ((im_row, im_col))) for im in dataset])


return dataset

This function, reformat, takes a dataset of images and applies the following operations:

Converts each image to a NumPy array.

Uses array_to_img and img_to_array from Keras to convert the array back to an image and then resizes it to the
specified dimensions (227x227 pixels). The result is a reformatted dataset of images.

y_train = keras.utils.to_categorical(y_train) x_train = reformat(x_train)

print('X_train shape:', x_train.shape, 'X_label shape:', y_train.shape)

The training set labels (y_train) are one-hot encoded using to_categorical from Keras, converting them into binary
vectors.

The training set images (x_train) are then reformatted using the reformat function. The shapes of the reformatted
training set and its labels are printed.

y_test = keras.utils.to_categorical(y_test) x_test = reformat(x_test)

print('test set shape:', x_test.shape, 'test label shape', y_test.shape) Similar operations are performed for the test set
(x_test and y_test). y_val = keras.utils.to_categorical(y_val)

x_val = reformat(x_val)

print('val set shape:', x_val.shape, 'val_lavels shape:', y_val.shape)

Lastly, the same operations are applied to the validation set (x_val and y_val).

num_labels = 10

# formatting the data for model input
im_row = 227
im_col = 227

def reformat(dataset):
    dataset = np.asarray([img_to_array(array_to_img(im, scale=False).resize((im_row, im_col))) for im in dataset])
    return dataset

y_train = keras.utils.to_categorical(y_train)
x_train = reformat(x_train)
print('X_train shape:', x_train.shape, 'X_label shape:', y_train.shape)

y_test = keras.utils.to_categorical(y_test)
x_test = reformat(x_test)
print('test set shape:', x_test.shape, 'test label shape', y_test.shape)

y_val = keras.utils.to_categorical(y_val)
x_val = reformat(x_val)
print('val set shape:', x_val.shape, 'val_lavels shape:', y_val.shape)

X_train shape: (8000, 227, 227, 1) X_label shape: (8000, 10)

test set shape: (5000, 227, 227, 1) test label shape (5000, 10)

val set shape: (2000, 227, 227, 1) val_lavels shape: (2000, 10)

AlexNet Architecture
This code defines the architecture of the AlexNet model using the Keras framework. AlexNet is a well-known deep
convolutional neural network architecture designed for image classification tasks. Here's a breakdown of the code:

batch_size = 32

num_classes = 10

epochs = 50

These variables define parameters for training the model, such as the batch size, the number of classes, and the
number of epochs (iterations through the entire dataset during training).

model = Sequential()

This line initializes a sequential model, which is a linear stack of layers in Keras.

1st Convolutional Layer

model.add(Conv2D(filters=96, input_shape=(227,227,1), kernel_size=(11, 11), strides=(4, 4), activation='relu'))

The first convolutional layer with 96 filters, an input shape of (227, 227, 1) (representing grayscale images), a kernel
size of (11, 11), and a stride of (4, 4). ReLU (Rectified Linear Unit) is used as the activation function.

Max Pooling

model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2)))

Max pooling layer following the first convolutional layer with a pool size of (3, 3) and a stride of (2, 2). The subsequent
layers follow a similar pattern:

2nd Convolutional Layer

model.add(Conv2D(filters=256, kernel_size=(5, 5), strides=(1, 1), activation='relu'))

Max Pooling

model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2)))

3rd Convolutional Layer

model.add(Conv2D(filters=384, kernel_size=(3, 3), strides=(1, 1), activation='relu'))

4th Convolutional Layer

model.add(Conv2D(filters=384, kernel_size=(3, 3), strides=(1, 1), activation='relu'))

5th Convolutional Layer

model.add(Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation='relu'))

Max Pooling

model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2)))

These convolutional layers are followed by three fully connected layers:

Passing it to a Fully Connected layer

model.add(Flatten())

1st Fully Connected Layer

model.add(Dense(4096, activation='relu'))

Add Dropout to prevent overfitting

model.add(Dropout(0.4))
2nd Fully Connected Layer

model.add(Dense(4096, activation='relu'))

Add Dropout

model.add(Dropout(0.4))

3rd Fully Connected Layer

model.add(Dense(1000, activation='relu'))

Add Dropout

model.add(Dropout(0.4))

Finally, the output layer with the number of classes defined earlier:

Output Layer

model.add(Dense(num_classes, activation='softmax'))

The model is compiled using categorical cross-entropy as the loss function, the Adam optimizer, and accuracy as the
evaluation metric:

Compile the model

model.compile(loss=keras.losses.categorical_crossentropy, optimizer='adam', metrics=['accuracy'])

This code defines the architecture of the AlexNet model and prepares it for training on a dataset with 10 classes.

# Defining AlexNet architecture

batch_size = 32

num_classes = 10

epochs = 50

model = Sequential()

# 1st Convolutional Layer

model.add(Conv2D(filters=96, input_shape=(227,227,1), kernel_size=(11, 11), strides=(4, 4), activation='relu'))

# Max Pooling

model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2)))

# 2nd Convolutional Layer

model.add(Conv2D(filters=256, kernel_size=(5, 5), strides=(1, 1), activation='relu'))

# Max Pooling

model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2)))

# 3rd Convolutional Layer

model.add(Conv2D(filters=384, kernel_size=(3, 3), strides=(1, 1), activation='relu'))

# 4th Convolutional Layer

model.add(Conv2D(filters=384, kernel_size=(3, 3), strides=(1, 1), activation='relu'))

# 5th Convolutional Layer

model.add(Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1), activation='relu'))

# Max Pooling
model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2)))

# Passing it to a Fully Connected layer
model.add(Flatten())

# 1st Fully Connected Layer

model.add(Dense(4096, activation='relu'))

# Add Dropout to prevent overfitting
model.add(Dropout(0.4))

# 2nd Fully Connected Layer

model.add(Dense(4096, activation='relu'))

# Add Dropout

model.add(Dropout(0.4))

# 3rd Fully Connected Layer

model.add(Dense(1000, activation='relu'))

# Add Dropout

model.add(Dropout(0.4))

# Output Layer

model.add(Dense(num_classes, activation='softmax'))

# Compile the model

model.compile(loss=keras.losses.categorical_crossentropy, optimizer='adam', metrics=['accuracy'])

The model.summary() method provides a concise summary of the architecture and parameters of the defined neural
network

model.summary()

Model: "sequential"

Layer (type) Output Shape Param #

====================================================

conv2d (Conv2D) (None, 55, 55, 96) 11712

max_pooling2d (None, 27, 27, 96) 0


(MaxPooling2D)

conv2d_1 (Conv2D) (None, 23, 23, 256) 614656

max_pooling2d_1 (None, 11, 11, 256) 0


(MaxPooling2D)

conv2d_2 (Conv2D) (None, 9, 9, 384) 885120

conv2d_3 (Conv2D) (None, 7, 7, 384) 1327488

conv2d_4 (Conv2D) (None, 5, 5, 256) 884992

max_pooling2d_2 (None, 2, 2, 256) 0


(MaxPooling 2D)

flatten (Flatten) (None, 1024) 0

dense (Dense) (None, 4096) 4198400


dropout (Dropout) (None, 4096) 0

dense_1 (Dense) (None, 4096) 16781312

dropout_1 (Dropout) (None, 4096) 0

dense_2 (Dense) (None, 1000) 4097000

dropout_2 (Dropout) (None, 1000) 0

dense_3 (Dense) (None, 10) 10010

=================================================================

Total params: 28810690 (109.90 MB)

Trainable params: 28810690 (109.90 MB)

Non-trainable params: 0 (0.00 Byte)

This line of code is using the fit method to train the AlexNet model on the provided training data (x_train and y_train).
Here's a breakdown of the parameters:

hist = model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1, validation_data=(x_val, y_val))


x_train: The input data (features) for training the model.

y_train: The target labels for training the model.

batch_size: The number of samples per gradient update. In this case, it's set to 32, meaning the model will be updated
after processing each batch of 32 samples.

epochs: The number of epochs (iterations over the entire training dataset) for training the model. In this case, it's set
to 50.

verbose: Controls the amount of information printed during training. A value of 1 prints progress bar and loss
information for each epoch.

validation_data: A tuple containing validation data to be used during training. In this case, it's specified as (x_val,
y_val).

The fit method trains the model on the provided data, updates the model parameters, and returns a history object
(hist) that contains information about the training process, such as the training and validation loss and accuracy for
each epoch

hist = model.fit(x_train, y_train, batch_size= batch_size, epochs= epochs, verbose=1, validation_data=(x_val,y_val))

This code block is evaluating the trained AlexNet model on the test dataset (x_test and y_test). Here's a breakdown of
the code:

score = model.evaluate(x_test, y_test, verbose=1)

The evaluate method is used to evaluate the model on the test data. It returns a list containing the test loss and test
accuracy. The verbose=1

parameter indicates that progress information should be printed during evaluation.

print('Test loss:', score[0]) print('Test accuracy:', score[1])

These lines print the test loss and test accuracy obtained from the evaluation. The values are extracted from the score
list.

After executing this code block, you will get printed output indicating the test loss and accuracy of the trained model
on the test dataset. The test accuracy represents the percentage of correctly classified samples in the test set, while
the test loss is a measure of the model's
performance on the test data.

score = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

10000/10000 [==============================] - 169s 17ms/step
Test loss: 0.14832029934376478
Test accuracy: 0.9602

This code block is using Matplotlib to visualize the training and validation accuracy as well as the training and
validation loss over the epochs. Here's a breakdown of the code:

accuracy = hist.history['accuracy'] val_accuracy = hist.history['val_accuracy'] loss = hist.history['loss']

val_loss = hist.history['val_loss'] epochs = range(len(accuracy))

These lines extract the training and validation accuracy, as well as the training and validation loss, from the training
history (hist) obtained during the model training.

plt.plot(epochs, accuracy, 'bo', label='Training accuracy') plt.plot(epochs, val_accuracy, 'b', label='Validation


accuracy') plt.title('Training and validation accuracy')

plt.legend() plt.figure()

The first plt.plot line creates a plot of training accuracy with blue dots, and the second line creates a plot of validation
accuracy with a solid blue line. The plt.title function adds a title to the accuracy plot, and plt.legend adds a legend to
distinguish between training and validation accuracy.

plt.plot(epochs, loss, 'bo', label='Training loss') plt.plot(epochs, val_loss, 'b', label='Validation loss') plt.title('Training
and validation loss') plt.legend() plt.show()

Similarly, these lines create two plots: one for training loss and another for validation loss. The blue dots represent
training loss, and the solid blue line represents validation loss. Titles and legends are added to make the plots more
informative.

After executing this code block, it will generate two plots showing the training and validation accuracy and loss over
the epochs. These visualizations can help you assess the model's performance during training, identify overfitting or
underfitting, and make decisions on whether further adjustments are needed.

accuracy = hist.history['accuracy']
val_accuracy = hist.history['val_accuracy']
loss = hist.history['loss']
val_loss = hist.history['val_loss']
epochs = range(len(accuracy))

plt.plot(epochs, accuracy, 'bo', label='Training accuracy')
plt.plot(epochs, val_accuracy, 'b', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()

This code block is making predictions on the test data using the trained AlexNet model and then separating the indices
of correctly and incorrectly classified samples. Here's a breakdown of the code:

Get the predictions for the test data

predicted_classes = model.predict_classes(x_test)

The predict_classes method is used to obtain the predicted class labels for the test data (x_test). This generates an
array of predicted classes for each sample in the test set.

Get the indices to be plotted

correct = np.nonzero(predicted_classes == y_true)[0]

incorrect = np.nonzero(predicted_classes != y_true)[0]

The np.nonzero function is used to find the indices where the predicted classes match the true labels (correct) and
where they do not match (incorrect). This helps in separating the indices of correctly and incorrectly classified samples.

After executing this code block, the correct and incorrect arrays contain the indices of samples in the test set that were
correctly and incorrectly classified, respectively. These indices can be useful for further analysis or visualization, such as
inspecting specific images to understand model performance.

#get the predictions for the test data

predicted_classes = model.predict_classes(x_test)

#get the indices to be plotted

correct = np.nonzero(predicted_classes==y_true)[0]

incorrect = np.nonzero(predicted_classes!=y_true)[0]

This code block is using scikit-learn's classification_report function to generate a detailed classification report, which
includes precision, recall, and F1-score, for the predictions made by the AlexNet model on the test data. Here's a
breakdown of the code:

target_names = ["Class {}".format(i) for i in range(num_classes)]

This line creates a list of class names for the classification report. The list is based on the number of classes
(num_classes), with each class represented as "Class 0", "Class 1", ..., "Class 9".
print(classification_report(y_true, predicted_classes, target_names=target_names))

The classification_report function is used to generate a detailed classification report. It takes the true labels (y_true) and
predicted labels (predicted_classes) as inputs. The target_names parameter is set to the list of class names created
earlier.

The printed classification report provides metrics such as precision, recall, and F1-score for each class, as well as overall
performance metrics. It's a valuable tool for assessing the model's performance on individual classes and gaining insights
into its strengths and weaknesses across different categories.

target_names = ["Class {}".format(i) for i in range(num_classes)]

print(classification_report(y_true, predicted_classes, target_names=target_names))

              precision    recall  f1-score   support

     Class 0       0.97      0.98      0.98       980
     Class 1       0.99      0.98      0.98      1135
     Class 2       0.99      0.88      0.93      1032
     Class 3       0.98      0.98      0.98      1010
     Class 4       1.00      0.93      0.96       982
     Class 5       0.92      0.99      0.95       892
     Class 6       0.99      0.97      0.98       958
     Class 7       0.96      0.98      0.97      1028
     Class 8       0.87      0.99      0.92       974
     Class 9       0.95      0.94      0.94      1009

   micro avg       0.96      0.96      0.96     10000
   macro avg       0.96      0.96      0.96     10000
weighted avg       0.96      0.96      0.96     10000

This code block is using Matplotlib to create a visual representation of correctly classified images from the test set. It
plots a 3x3 grid of images with their predicted and true class labels. Here's a breakdown of the code:

for i, c in enumerate(correct[:9]):

plt.subplot(3,3,i+1)

plt.imshow(x_test[c].reshape(227,227), cmap='gray', interpolation='none') plt.title("Predicted {}, Class


{}".format(predicted_classes[c], y_true[c])) plt.tight_layout()

The for loop iterates over the indices of correctly classified samples (correct[:9]) and uses enumerate to get both the
index (i) and the corresponding index value (c).

plt.subplot(3,3,i+1)sets up a subplot in a 3x3 grid, and plt.imshow displays the image using a grayscale colormap
(cmap='gray'). The title of each subplot includes the predicted class and the true class labels for better understanding.

plt.tight_layout() is used to improve the layout spacing. After executing this code block, you should see a 3x3 grid of
correctly classified images from the test set, with each subplot displaying an image along with its predicted and true
class labels.

for i, c in enumerate(correct[:9]):
    plt.subplot(3,3,i+1)
    plt.imshow(x_test[c].reshape(227,227), cmap='gray', interpolation='none')
    plt.title("Predicted {}, Class {}".format(predicted_classes[c], y_true[c]))
    plt.tight_layout()

This code block is using Matplotlib to create a visual representation of incorrectly classified images from the test set. It
plots a 3x3 grid of images with their predicted and true class labels. Here's a breakdown of the code:

for i, inc in enumerate(incorrect[:9]):
    plt.subplot(3,3,i+1)
    plt.imshow(x_test[inc].reshape(227,227), cmap='gray', interpolation='none')
    plt.title("Predicted {}, Class {}".format(predicted_classes[inc], y_true[inc]))
    plt.tight_layout()

The for loop iterates over the indices of incorrectly classified samples (incorrect[:9]) and uses enumerate to get both the
index (i) and the corresponding index value (inc).

plt.subplot(3,3,i+1)sets up a subplot in a 3x3 grid, and plt.imshow displays the image using a grayscale colormap
(cmap='gray'). The title of each subplot includes the predicted class and the true class labels for better understanding.

plt.tight_layout() is used to improve the layout spacing. After executing this code block, you should see a 3x3 grid of
incorrectly classified images from the test set, with each subplot displaying an image along with its predicted and true
class labels.

for i, inc in enumerate(incorrect[:9]):
    plt.subplot(3,3,i+1)
    plt.imshow(x_test[inc].reshape(227,227), cmap='gray', interpolation='none')
    plt.title("Predicted {}, Class {}".format(predicted_classes[inc], y_true[inc]))
    plt.tight_layout()

This code block displays a single image from the training set; the image is reshaped and shown using Matplotlib.
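The original cell is not reproduced in this record. A minimal sketch of what it likely contained, assuming x_train and y_train hold the flattened 227x227 training images and their labels used earlier in this experiment, is:

# Sketch: show the first training image with its label (x_train/y_train assumed from the earlier preprocessing)
plt.imshow(x_train[0].reshape(227,227), cmap='gray', interpolation='none')
plt.title("Class {}".format(y_train[0]))
plt.axis('off')
plt.show()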
EXPERIMENT – 9
AIM : Demonstration of Application of Autoencoders
Description:

AUTOENCODERS:
Autoencoders are data-encoding techniques based on unsupervised artificial neural networks. This special
type of ANN is trained to encode the data in such a way that it is represented in a compressed form.
Autoencoders are also trained to decode the data so that the original data can be reconstructed as closely
as possible.

Architecture of Autoencoders:
Autoencoder architectures vary. In this section the LSTM autoencoder is discussed. LSTM-based
autoencoders are used to encode and decode sequence data.

Why is sequence data challenging to process?


• Sequence data is challenging for prediction tasks because the length of the input is not fixed but varies.
• The temporal nature of the data also makes it challenging to extract features.

So building a predictive model for sequence data involves mapping one sequence to another, and such
problems are called sequence-to-sequence problems. Autoencoders are a good choice for handling sequence-to-
sequence problems.

To solve the problem, the following are provided:


A large MNIST dataset in which images of all the digits are provided.
Images can be selected from it as required and fed to the model as raw data.
An autoencoder consists of 3 components: encoder, code and decoder.
DATASET:
MNIST Dataset Overview : The MNIST dataset is a widely used collection of handwritten digit images,
commonly employed as a benchmark in the field of machine learning and computer vision. Here's a
comprehensive description of the MNIST dataset:

Overview:
• Name: MNIST (Modified National Institute of Standards and Technology)
• Nature: Image Classification
• Dataset Size:
• Training Set: 60,000 images
• Test Set: 10,000 images
• Image Size: 28x28 pixels
• Classes: 10 (Digits 0 through 9)
• Source: Originally created by Yann LeCun, Corinna Cortes, and Christopher J.C. Burges for NIST, modified
for machine learning experiments.

Characteristics:
1.Image Content: Each image in the MNIST dataset is a grayscale image of a handwritten digit.
2.Labeling : Every image is associated with a label indicating the digit it represents (0 through 9).
3.Digit Variety : The dataset includes a diverse set of digits, capturing variations in writing styles.
4.Size Consistency : All images are resized to a standard size of 28x28 pixels, providing consistency for machine
learning models.
5.Gray Scale: Images are in grayscale, with pixel values ranging from 0 to 255
Source Code:
Purpose:
It creates an LSTM autoencoder model to recreate a given input sequence. It's designed for sequential
data, where order matters (like time series or text).
Key Steps:

Step 1:

Import libraries:
numpy for array operations

keras for building the LSTM model

STEP 2:

Define input sequence:


Creates a sample sequence of numbers [0.1, 0.2, ..., 0.9].

STEP 3:

Reshape input:
Converts the sequence into a 3D shape suitable for LSTM (samples, timesteps, features).

STEP 4:

Build the model:


Sequential model with 4 layers:

LSTM layer (100 units, relu activation) to encode the input sequence.
RepeatVector layer to repeat the encoded representation for output generation.
LSTM layer (100 units, relu activation) to decode the repeated representation.
TimeDistributed Dense layer (1 unit) to output the reconstructed sequence.

STEP 5:

Compile the model:


Sets the optimizer to 'adam' and loss function to 'mse' (mean squared error).

STEP 6:

Train the model:

Fits the model to the input sequence, aiming to learn a representation that can recreate it.
Trains for 300 epochs (iterations through the data).

Visualize the model (optional):

Creates a plot of the model architecture.


STEP:7
Reconstruct and evaluate:
Predicts the output using the same input sequence.
Prints the reconstructed sequence to compare with the original.

#1. Reconstruction of sequence using Autoencoders


# Step1: Building a simple autoencoder to recreate a simple sequence
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model

# lstm autoencoder recreate sequence
# define input sequence
sequence = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
# reshape input into [samples, timesteps, features]
n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)
plot_model(model, show_shapes=True, to_file='reconstruct_lstm_autoencoder.png')
# demonstrate recreation
yhat = model.predict(sequence, verbose=0)
print(yhat[0,:,0])

[0.10749847 0.20450562 0.30195725 0.3997921  0.49816766 0.5970611
 0.69709414 0.79899496 0.90359807]
2. Prediction of the sequence of number using Autoencoders

Like reconstruction, autoencoders can be used to predict a sequence; the code is given below:
Purpose:

It creates an LSTM autoencoder model to predict the next values in a given sequence. It learns patterns in
sequential data to anticipate future values.

Key Steps:

STEP 1:
Import libraries:
numpy for array operations

keras for building the LSTM model

STEP 2:

Define input sequence:


Creates a sample sequence of numbers [0.1, 0.2, ..., 0.9].

STEP 3:

Reshape input:
Converts the sequence into a 3D shape suitable for LSTM (samples, timesteps, features).

STEP 4:

Prepare output sequence:


Shifts the input sequence by one step to create the target output (e.g., input = [1, 2, 3], output = [2, 3]).

STEP 5:

Build the model:


Sequential model with 4 layers:

LSTM layer (100 units, relu activation) to encode the input sequence.
RepeatVector layer to repeat the encoded representation for output generation.
LSTM layer (100 units, relu activation) to decode the repeated representation.
TimeDistributed Dense layer (1 unit) to output the predicted sequence.

STEP 6:

Compile the model:


Sets the optimizer to 'adam' and loss function to 'mse' (mean squared error).

STEP 7:

Train the model:


Fits the model to the input and output sequences, aiming to learn to predict future values.
Trains for 300 epochs (iterations through the data).


Visualize the model (optional):
Creates a plot of the model architecture.

STEP 8:

Predict and evaluate:


Predicts the output using the same input sequence. Prints the predicted sequence to compare with the
shifted input.
# lstm autoencoder predict sequence
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model

# define input sequence
seq_in = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
# reshape input into [samples, timesteps, features]
n_in = len(seq_in)
seq_in = seq_in.reshape((1, n_in, 1))
# prepare output sequence
seq_out = seq_in[:, 1:, :]
n_out = n_in - 1
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_in,1)))
model.add(RepeatVector(n_out))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
plot_model(model, show_shapes=True, to_file='predict_lstm_autoencoder.png')
# fit model
model.fit(seq_in, seq_out, epochs=300, verbose=0)
# demonstrate prediction
yhat = model.predict(seq_in, verbose=0)
print(yhat[0,:,0])

[0.16358767 0.28784174 0.40271223 0.51003706 0.6112754  0.7076115
 0.8000286  0.88936013]
3. Outlier/Anomaly detection using Autoencoders: Suppose the input data is highly correlated and a
technique is required to detect an anomaly or an outlier; then autoencoders are a good choice. Since
autoencoders encode the data in a compressed form, they can handle correlated data.

Let's train an autoencoder on the MNIST dataset using a simple feed-forward neural network. Code:

Simple 6-layered feed-forward autoencoder

Once the autoencoder is trained on the MNIST dataset, anomaly detection can be done using two different
images. First, one of the images from the MNIST dataset is chosen and fed to the trained autoencoder. Since
this image is not an anomaly, the error (loss) is expected to be very low. Next, when some random image is
given as the test image, the loss is expected to be very high, as it is an anomaly.

Simple 6-layered autoencoder built to train on MNIST data

Purpose:

It creates an autoencoder model to learn a compressed representation of images from the MNIST
dataset.

It aims to reconstruct the original images from the compressed representation.

Key Steps:

STEP 1:

Import libraries:
keras for building and training the autoencoder model

numpy for array operations

STEP 2:

Load MNIST data:

Imports the MNIST dataset of handwritten digits.


Preprocesses the data by reshaping and normalizing pixel values.

STEP 3:

Build the autoencoder:


Sequential model with 6 layers:

Encoder:
Dense layer (512 units, relu activation)
Dense layer (128 units, relu activation)
Bottleneck layer (10 units, linear activation)

Decoder:
Dense layer (128 units, relu activation)
Dense layer (512 units, relu activation)
Output layer (784 units, sigmoid activation)

STEP 4:

Compile the model:


Sets the loss function to 'mean_squared_error' and the optimizer to 'Adam'.

STEP 5:

Train the model:


Fits the model to the MNIST training data, aiming to reconstruct the input images.
Trains for 10 epochs (iterations through the data).
Monitors the loss on both the training and validation sets.

STEP 6:

Extract the encoder:

Creates a separate model that takes an input and outputs the bottleneck representation.

Generate compressed representations:


Uses the encoder model to create compressed representations of the training data.

Generate reconstructions:
Uses the full autoencoder model to reconstruct the original images from the compressed
representations.

Extract the decoder:


Creates a separate model that takes a compressed representation and outputs the
reconstructed image.
import numpy as np
import keras
from keras.datasets import mnist
from keras.models import Sequential, Model
from keras.layers import Dense, Input
from keras import optimizers
from keras.optimizers import Adam

(x_train, y_train), (x_test, y_test) = mnist.load_data()
train_x = x_train.reshape(60000, 784) / 255
val_x = x_test.reshape(10000, 784) / 255

autoencoder = Sequential()
autoencoder.add(Dense(512, activation='relu', input_shape=(784,)))
autoencoder.add(Dense(128, activation='relu'))
autoencoder.add(Dense(10, activation='linear', name="bottleneck"))
autoencoder.add(Dense(128, activation='relu'))
autoencoder.add(Dense(512, activation='relu'))
autoencoder.add(Dense(784, activation='sigmoid'))
autoencoder.compile(loss='mean_squared_error', optimizer=Adam())
# validation_data=(val_x, val_x) is assumed here (the original line was cut off); it produces the val_loss values in the log below
trained_model = autoencoder.fit(train_x, train_x, batch_size=1024, epochs=10, verbose=1,
                                validation_data=(val_x, val_x))
encoder = Model(autoencoder.input, autoencoder.get_layer('bottleneck').output)
encoded_data = encoder.predict(train_x)         # bottleneck representation
decoded_output = autoencoder.predict(train_x)   # reconstruction

encoding_dim = 10
# return the decoder
encoded_input = Input(shape=(encoding_dim,))
decoder = autoencoder.layers[-3](encoded_input)
decoder = autoencoder.layers[-2](decoder)
decoder = autoencoder.layers[-1](decoder)
decoder = Model(encoded_input, decoder)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mni
11490434/11490434 [==============================] - 0s 0us/step
Epoch 1/10
59/59 [==============================] - 7s 108ms/step - loss: 0.0752 - val_loss: 0.0
Epoch 2/10
59/59 [==============================] - 8s 145ms/step - loss: 0.0414 - val_loss: 0.0
Epoch 3/10
59/59 [==============================] - 12s 198ms/step - loss: 0.0319 - val_loss: 0.
Epoch 4/10
59/59 [==============================] - 10s 163ms/step - loss: 0.0273 - val_loss: 0.
Epoch 5/10
59/59 [==============================] - 9s 157ms/step - loss: 0.0249 - val_loss: 0.0
Epoch 6/10
59/59 [==============================] - 9s 159ms/step - loss: 0.0233 - val_loss: 0.0
Epoch 7/10
59/59 [==============================] - 10s 178ms/step - loss: 0.0220 - val_loss: 0.
Epoch 8/10
59/59 [==============================] - 6s 108ms/step - loss: 0.0210 - val_loss: 0.0
Epoch 9/10
59/59 [==============================] - 7s 119ms/step - loss: 0.0202 - val_loss: 0.0
Epoch 10/10
59/59 [==============================] - 6s 99ms/step - loss: 0.0196 - val_loss: 0.01
1875/1875 [==============================] - 6s 3ms/step
1875/1875 [==============================] - 9s 5ms/step
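A usage sketch (not part of the original record) showing how the extracted encoder and decoder can be chained: one validation digit is encoded to its 10-dimensional bottleneck code and then rebuilt from that code alone. It assumes the encoder, decoder and val_x variables defined in the code above.

import matplotlib.pyplot as plt

sample = val_x[:1]                 # one normalized 28x28 digit, flattened to 784 values
code = encoder.predict(sample)     # shape (1, 10) bottleneck representation
rebuilt = decoder.predict(code)    # shape (1, 784) reconstruction from the code alone

plt.subplot(1, 2, 1)
plt.imshow(sample.reshape(28, 28), cmap='gray')
plt.title('Original')
plt.subplot(1, 2, 2)
plt.imshow(rebuilt.reshape(28, 28), cmap='gray')
plt.title('Reconstructed')
plt.show()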
Anomaly Detection

from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive

Purpose:

It tests the performance of the trained autoencoder model on an image that's not from the
MNIST dataset. It measures the reconstruction error to assess how well the model generalizes to unseen
data.

Key Steps:

STEP 1:

Load an external image:


Loads a JPEG image named "don-joshuva.jpeg" from the specified path.
Resizes it to 28x28 pixels to match the MNIST format.

Converts it to grayscale.

STEP 2:

Preprocess the image:


Converts the image to a NumPy array.
Reshapes the array into a 1-dimensional vector of 784 pixels

STEP 3:

Reconstruct the image:

Uses the trained autoencoder model to reconstruct the image from its compressed
representation.

STEP 4:

Calculate reconstruction error:


Computes the Euclidean distance between the original image and the reconstructed image using
np.linalg.norm.

This measures how much information was lost during compression and reconstruction.

STEP 5:

Print the error:


Displays the reconstruction error, indicating how well the model performed on this unseen image.

# %matplotlib inline
from keras.preprocessing import image

# if the image is not one of the MNIST digits the model was trained on, the reconstruction error is expected to be large
# color_mode="grayscale" is assumed here (the original call was cut off) so the 28x28 image flattens to 784 values
img = image.load_img("/content/drive/MyDrive/DLT/don-joshuva.jpeg", target_size=(28, 28), color_mode="grayscale")
input_img = image.img_to_array(img)
inputs = input_img.reshape(1, 784)
target_data = autoencoder.predict(inputs)
dist = np.linalg.norm(inputs - target_data, axis=-1)
print(dist)

1/1 [==============================] - 0s 81ms/step

[3366.3535]
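For comparison, the same check can be run on an image the model was trained on. A sketch (not part of the original record), assuming the trained autoencoder and the normalized MNIST test images val_x defined above; the threshold value is illustrative only:

# Compare reconstruction error for an in-distribution digit and a random image
normal_img = val_x[0:1]                      # a genuine MNIST digit
random_img = np.random.rand(1, 784)          # an out-of-distribution image

normal_err = np.linalg.norm(normal_img - autoencoder.predict(normal_img), axis=-1)
random_err = np.linalg.norm(random_img - autoencoder.predict(random_img), axis=-1)

threshold = 5.0                              # illustrative cut-off, tune on validation data
print("MNIST digit error:", normal_err, "anomaly:", normal_err > threshold)
print("Random image error:", random_err, "anomaly:", random_err > threshold)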
EXPERIMENT -10
AIM: Demonstration of GAN.

DESCRIPTION:

GAN:
Generative Adversarial Networks, or GANs, represent a cutting-edge approach to generative modeling within
deep learning, often leveraging architectures like convolutional neural networks. The goal of generative
modeling is to autonomously identify patterns in input data, enabling the model to produce new examples
that feasibly resemble the original dataset.

GANs tackle this challenge through a unique setup, treating it as a supervised learning problem involving two
key components: the generator, which learns to produce novel examples, and the discriminator, tasked with
distinguishing between genuine and generated instances. Through adversarial training, these models engage
in a competitive interplay until the generator becomes adept at creating realistic samples, fooling the
discriminator approximately half the time.

This dynamic field of GANs has rapidly evolved, showcasing remarkable capabilities in generating lifelike
content across various domains. Notable applications include image-to-image translation tasks and the
creation of photorealistic images indistinguishable from real photos, demonstrating the transformative
potential of GANs in the realm of generative modeling.

ARCHITECTURE:

The architecture of a deep learning GAN consists of two modules,
namely the generator and the discriminator.

1. Generator model:
• It is the learning component of a GAN.
• It learns to generate new data by incorporating the feedback received from the discriminator.
• It learns to produce data that the discriminator classifies as real.
• Hence, training the generator also requires the discriminator to be taken into account.

2. Discriminator model:

• It is the classifier in a GAN.
• Its job is to distinguish the output of the generator (newly generated data) from the real data.
GAN ARCHITECTURE

DATASET:

MNIST Dataset Overview : The MNIST dataset is a widely used collection of handwritten digit images,
commonly employed as a benchmark in the field of machine learning and computer vision. Here's a
comprehensive description of the MNIST dataset:

Overview:
• Name: MNIST (Modified National Institute of Standards and Technology)
• Nature: Image Classification
• Dataset Size:
• Training Set: 60,000 images
• Test Set: 10,000 images
• Image Size: 28x28 pixels
• Classes: 10 (Digits 0 through 9)
• Source: Originally created by Yann LeCun, Corinna Cortes, and Christopher J.C. Burges for NIST, modified
for machine learning experiments.
Source Code:
Importing Libraries

import tensorflow as tf

from tensorflow.keras.layers import Dense, Flatten, Reshape

from tensorflow.keras.models import Sequential

from tensorflow.keras.optimizers import Adam
import numpy as np

import matplotlib.pyplot as plt

TensorFlow is imported for building and training neural networks. Specific layers and models from the Keras API (which is
integrated into TensorFlow) are imported. NumPy is imported for numerical operations. Matplotlib is imported for
plotting graphs and visualizations.

Loading MNIST Dataset and Normalizing

# Load the MNIST dataset

(train_images, train_labels), (_, _) = tf.keras.datasets.mnist.load_data()
train_images = train_images / 255.0

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 [==============================] - 10s 1us/step

The MNIST dataset is loaded. It contains 28x28 pixel grayscale images of handwritten digits (0-9). The images are
normalized to the range [0, 1] by dividing by 255.

Generator and Discriminator Models

# Generator model

generator = Sequential([

Dense(128, input_shape=(100,), activation='relu'),
Dense(784, activation='sigmoid'),

Reshape((28, 28))

])

The generator and discriminator models are defined using the Sequential API. The generator takes a random noise vector
of size 100, passes it through a dense layer, reshapes it to the size of an image (28x28). The discriminator takes an image,
flattens it, passes it through dense layers, and produces a binary output indicating whether the input is real or
generated.

Compiling the Discriminator and Combining into GAN

# Discriminator model

discriminator = Sequential([

Flatten(input_shape=(28, 28)),
Dense(128, activation='relu'),
Dense(1, activation='sigmoid')

])

# Compile the discriminator

discriminator.compile(loss='binary_crossentropy', optimizer=Adam(learning_rate=0.0002), metrics=['accuracy'])

The discriminator is compiled with binary cross-entropy loss and Adam optimizer. The GAN is created by combining the
generator and discriminator. Discriminator's weights are frozen during GAN training. The GAN is compiled with binary
cross-entropy loss and Adam optimizer.

# Combine generator and discriminator into a GAN
discriminator.trainable = False

gan_input = tf.keras.Input(shape=(100,))
x = generator(gan_input)

gan_output = discriminator(x)

gan = tf.keras.models.Model(gan_input, gan_output)

gan.compile(loss='binary_crossentropy', optimizer=Adam(learning_rate=0.0002))

Training the GAN

# Training the GAN
def train_gan(epochs, batch_size):
    batch_count = train_images.shape[0] // batch_size
    for e in range(epochs):
        for _ in range(batch_count):
            noise = np.random.normal(0, 1, size=[batch_size, 100])
            generated_images = generator.predict(noise)
            image_batch = train_images[np.random.randint(0, train_images.shape[0], size=batch_size)]

            # Train the discriminator
            discriminator.trainable = True
            d_loss_real = discriminator.train_on_batch(image_batch, np.ones((batch_size, 1)))
            d_loss_fake = discriminator.train_on_batch(generated_images, np.zeros((batch_size, 1)))
            d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)

            # Train the generator
            noise = np.random.normal(0, 1, size=[batch_size, 100])
            discriminator.trainable = False
            g_loss = gan.train_on_batch(noise, np.ones((batch_size, 1)))

        print(f"Epoch {e}/{epochs}, D Loss: {d_loss[0]}, G Loss: {g_loss}")
        if e % 10 == 0:
            plot_generated_images(e, generator)

4/4 [==============================] - 0s 3ms/step

Visualization of Generated Images

def plot_generated_images(epoch, generator, examples=10, dim=(1, 10), figsize=(10, 1)):
    noise = np.random.normal(0, 1, size=[examples, 100])
    generated_images = generator.predict(noise)
    generated_images = generated_images.reshape(examples, 28, 28)
    plt.figure(figsize=figsize)
    for i in range(generated_images.shape[0]):
        plt.subplot(dim[0], dim[1], i+1)
        plt.imshow(generated_images[i], interpolation='nearest', cmap='gray_r')
        plt.axis('off')
    plt.tight_layout()
    plt.savefig(f'gan_generated_image_epoch_{epoch}.png')
    plt.show()

# Train the GAN

train_gan(100, 128)

Here's the demonstration of how you can interpret the output:

Early in training you will typically see the D Loss fall while the G Loss rises, which means the discriminator is
getting better at telling real data from generated data. As the generator improves at producing realistic data,
the two losses should settle into a rough balance rather than drifting apart.

If the D Loss becomes very low (close to zero), it might indicate that the discriminator is too strong or that the generator
is not effective at fooling it. You may need to adjust the architecture or training parameters.

If the G Loss becomes very high, it might indicate that the generator is not making significant progress in generating
realistic data. You might need to adjust the architecture, hyperparameters, or training strategy.

The visualizations of generated images can provide a qualitative assessment of the quality of generated data. You should
look for clear and recognizable patterns in the generated images as training progresses.

It's important to note that GAN training can be complex, and interpreting the output requires a balance between
quantitative measures (loss values) and qualitative inspection of generated samples. You may need to experiment with
different hyperparameters and network architectures to achieve the desired results.
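One commonly tried adjustment, shown here only as a sketch and not part of the original record, is to slow the discriminator down relative to the generator and to smooth the "real" labels so the discriminator does not overpower the generator. The learning rates and the 0.9 smoothing value below are illustrative assumptions.

# Recompile with a smaller learning rate for the discriminator than for the GAN
discriminator.compile(loss='binary_crossentropy',
                      optimizer=Adam(learning_rate=0.0001),
                      metrics=['accuracy'])
gan.compile(loss='binary_crossentropy', optimizer=Adam(learning_rate=0.0002))

# Inside the training loop, replace the hard "real" targets with smoothed ones:
batch_size = 128
real_labels = np.full((batch_size, 1), 0.9)   # one-sided label smoothing
# d_loss_real = discriminator.train_on_batch(image_batch, real_labels)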
EXERCISE – 11
AIM: TRAFFIC-SIGN RECOGNITION SYSTEM
Description:
Problem statement:
The problem is to develop a Traffic-sign recognition system which can recognize the traffic signs put up on the
road e.g. "speed limit" or "children" or "turn ahead" etc. Given the traffic signs in the image form as input, the
problem is to recognize the signs using Machine learning techniques. To solve the problem following are
provided:
A huge collection of traffic-sign images taken in different scenes is available as input. These signs may not be
clearly visible and are challenging to process, as they are captured from a distance.
A separate set of images is available for testing the model.
Use the available data and develop a traffic sign recognition system which can categorize
signs, i.e., classify which class each traffic sign belongs to.
Project implementation: Traffic signs are of different types, like speed limits, traffic signals, turn left or right,
etc. The traffic-sign recognition problem can be treated as a traffic-sign classification problem. Since the
traffic signs may have been captured from a distance, the model we build should be able to detect them accurately.
We use a deep learning technique that can extract the features accurately and predict the sign class. Sign
detection methods are based on features like colour and shape. To extract features from the complex
images, a deep learning technique (a Convolutional Neural Network) together with image processing techniques is used.
A Convolutional Neural Network (CNN): It is a type of Deep Learning neural network architecture commonly
used in Computer Vision. Computer vision is a field of Artificial Intelligence that enables a computer to
understand and interpret the image or visual data. Convolutional Neural Network consists of multiple layers
like the input layer, Convolutional layer, Pooling layer, and fully connected layers. The Convolutional layer
applies filters to the input image to extract features, the Pooling layer down samples the image to reduce
computation, and the fully connected layer makes the final prediction. The network learns the optimal filters
through backpropagation and gradient descent.
-Dataset used in the project:

To implement this project, traffic sign data set is used. This data set can be downloaded from Kaggle. The data
set used here is from German Traffic Sign Benchmark. Data set description is as follows:

➢ Number of images = 50,000


➢ Size of the data set = around 300 MB
➢ Number of classes = 43
➢ The class distribution is varying
➢ The train folder contains images that can be used to train
➢ The paths for reading the training images are in the Train.csv file
➢ Similarly, test folder has test images.

Before implementing this project ensure, the following necessary packages are installed:

➢ Keras – CNN model building


➢ Matplotlib and seaborn - data visualization
➢ Scikit-Learn – Predicting and model summary
➢ PIL, CV- Image reading and processing
➢ Pandas – Data manipulation

First download the data set and extract the files into a directory. The extracted data set contains 3 folder and 3
.csv file. Meta, Train and Test folder contains the images for target class, training and testing respectively.
Meta, Train and Test .csv files contains paths for images, image-ID and other information.

The method used to build and evaluate the model for traffic sign categorization is a given below:
Source code:
Exploring the Dataset
Step1: Import the necessary files
!pip install opencv-python

Requirement already satisfied: opencv-python in c:\users\vanam\anaconda3\lib\site-packages (4.8.1.78)


Requirement already satisfied: numpy>=1.19.3 in c:\users\vanam\anaconda3\lib\site-packages (from opencv-python) (1.26.1)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import cv2
import tensorflow as tf
from PIL import Image
import os
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
from keras.models import Sequential, load_model
from keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropout

Step2: Fetch the path for the images


data = []
labels = []
classes = 43
# dataset_dir = '/content/drive/MyDrive/DLT/Traffic Sign Recognition.zip'   # path when running on Colab
dataset_dir = 'C:/Users/vanam/OneDrive/Downloads/Traffic Sign Recognition'

# Retrieving the images and their labels
metaDf = pd.read_csv('C:/Users/vanam/OneDrive/Downloads/Traffic Sign Recognition/Meta.csv')
trainDf = pd.read_csv('C:/Users/vanam/OneDrive/Downloads/Traffic Sign Recognition/Train.csv')
testDf = pd.read_csv('C:/Users/vanam/OneDrive/Downloads/Traffic Sign Recognition/Test.csv')
labels = ['20 km/h', '30 km/h', '50 km/h', '60 km/h', '70 km/h', '80 km/h', '80 km/h end', '100 km/h', '120
km/h', 'No overtaking','No overtaking for tracks', 'Crossroad with secondary way', 'Main road', 'Give way',
'Stop', 'Road up', 'Road up
for tra 'Other dangerous', 'Turn left', 'Turn right', 'Winding road', 'Hollow road', 'Slippery road', 'Narrowing
road', 'Roadwor 'Pedestrian', 'Children', 'Bike', 'Snow', 'Deer', 'End of the limits', 'Only right', 'Only left', 'Only
straight', 'Only 'Only straight and left', 'Take right', 'Take left', 'Circle crossroad', 'End of overtaking limit', 'End
of overtaking l
print('SHAPE of training set:', trainDf.shape)
print('SHAPE of test set:', testDf.shape)
print('SHAPE of MetaInfo:', metaDf.shape)
SHAPE of training set: (39209, 8)
SHAPE of test set: (39209, 8)
SHAPE of MetaInfo: (39209, 8)
Step3: Load the path
trainDf['Path'] = list(map(lambda x: os.path.join(dataset_dir,x.lower()), trainDf['Path'])) testDf['Path'] =
list(map(lambda x: os.path.join(dataset_dir,x.lower()), testDf['Path'])) metaDf['Path'] = list(map(lambda x:
os.path.join(dataset_dir,x.lower()), metaDf['Path']))
Step4: Print the samples from the files to verify its correctness
trainDf.sample(10)

       Width  Height  Roi.X1  Roi.Y1  Roi.X2  Roi.Y2  ClassId  Path
24169     43      44       5       6      38      39       16  C:/Users/vanam/OneDrive/...
17636     55      52       6       5      50      47       11  C:/Users/vanam/OneDrive/...
18839     71      72       7       7      65      66       12  C:/Users/vanam/OneDrive/...
19644    111     111      10      10     102     102       12  C:/Users/vanam/OneDrive/...
19843     37      39       5       6      32      33       12  C:/Users/vanam/OneDrive/...
14477     39      39       6       5      34      34        9  C:/Users/vanam/OneDrive/...

metaDf.sample(10)

    Path                                                 ClassId  ShapeId  ColorId  SignId
30  C:/Users/vanam/OneDrive/Downloads/Traffic Sign...        36        1        1    4.4
37  C:/Users/vanam/OneDrive/Downloads/Traffic Sign...        42        1        3    3.28
9   C:/Users/vanam/OneDrive/Downloads/Traffic Sign...        16        1        0    3.3
21  C:/Users/vanam/OneDrive/Downloads/Traffic Sign...        28        0        0    1.33
40  C:/Users/vanam/OneDrive/Downloads/Traffic Sign...         7        1        0    3.29
38  C:/Users/vanam/OneDrive/Downloads/Traffic Sign...         5        1        0    3.29
17  C:/Users/vanam/OneDrive/Downloads/Traffic Sign...        23        0        0    1.13
7   C:/Users/vanam/OneDrive/Downloads/Traffic Sign...        14        3        0    2.2
19  C:/Users/vanam/OneDrive/Downloads/Traffic Sign...        25        0        0    1.37
8   C:/Users/vanam/OneDrive/Downloads/Traffic Sign...        15        1        0    3.1

Step5: Analyze the class distribution. The data set may be imbalanced, meaning that the number of samples
available for each class may not be the same. So it is advisable to analyze the class distribution so that
training and validation of the model can be planned accordingly.

To check the imbalance, a histogram plot is used. From the output below we can see that the class
distribution for this data set is not uniform. The Seaborn library is used to plot and visualize the
histogram. We can also observe that the train and test subsets have a very similar class imbalance.

fig, axs = plt.subplots(1, 2, sharex=True, sharey=True, figsize=(25, 6))
axs[0].set_title('Train classes distribution')
axs[0].set_xlabel('Class')
axs[0].set_ylabel('Count')
axs[1].set_title('Test classes distribution')
axs[1].set_xlabel('Class')
axs[1].set_ylabel('Count')
sns.countplot(trainDf.ClassId, ax=axs[0], palette="Set1")
sns.countplot(testDf.ClassId, ax=axs[1])
axs[0].set_xlabel('Class ID');
axs[1].set_xlabel('Class ID');

Step6: Analyze the size distribution of images. Since the accuracy of sign recognition depends on the
quality of the images, it is necessary to know the resolution of the images.

A KDE plot, also called a Kernel Density Estimate, is used for visualizing the PDF (probability density function)
of a continuous variable. Seaborn is used to draw the KDE plot. The plot below shows the probability density
at different values of a continuous variable.

A multivariate plot is used to visualize the width and height of the images. This dataset contains
thousands of images, but not all with the same resolution. We can see from the output below that most
images are rectangular, most of them have a resolution of about 35x35 pixels, and a few samples have a higher
resolution, such as 100x100 pixels.

trainDfDpiSubset = trainDf[(trainDf.Width < 80) & (trainDf.Height < 80)];
testDfDpiSubset = testDf[(testDf.Width < 80) & (testDf.Height < 80)];
g = sns.JointGrid(x="Width", y="Height", data=trainDfDpiSubset)
sns.kdeplot(x=trainDfDpiSubset.Width, y=trainDfDpiSubset.Height, cmap="Reds", shade=False,
            shade_lowest=False, ax=g.ax_joint)
sns.kdeplot(x=testDfDpiSubset.Width, y=testDfDpiSubset.Height, cmap="Blues", shade=False,
            shade_lowest=False, ax=g.ax_joint)
g.fig.set_figwidth(25)
g.fig.set_figheight(8)
plt.show();


Step7: Visualize the target class. In this data set the target class is represented not only by a numeric label but
also by an image of the sign itself. Since some of these may look different from the dataset samples, a few
target-class images are visualized.
rows = 6
cols = 8
fig, axs = plt.subplots(rows, cols, sharex=True, sharey=True, figsize=(25, 12))
plt.subplots_adjust(left=None, bottom=None, right=None, top=0.9, wspace=None, hspace=None)
metaDf = metaDf.sort_values(by=['ClassId'])
idx = 0
for i in range(rows):
    for j in range(cols):
        if idx > 42:
            break
        img = cv2.imread(metaDf["Path"].tolist()[idx], cv2.IMREAD_UNCHANGED)
        if img is not None:
            img[np.where(img[:,:,3]==0)] = [255,255,255,255]
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            img = cv2.resize(img, (60,60))
            axs[i,j].imshow(img)
            axs[i,j].set_facecolor('xkcd:salmon')
            axs[i,j].set_facecolor((1.0, 0.47, 0.42))
            axs[i,j].set_title(labels[int(metaDf["ClassId"].tolist()[idx])])
        axs[i,j].get_xaxis().set_visible(False)
        axs[i,j].get_yaxis().set_visible(False)
        idx += 1

Step8: Visualize the training set. A few images of the training set are also visualized.
rows = 10
cols = 10
fig, axs = plt.subplots(rows, cols, sharex=True, sharey=True, figsize=(25, 12))
plt.subplots_adjust(left=None, bottom=None, right=None, top=0.9, wspace=None, hspace=None)
cur_path = 'C:/Users/vanam/OneDrive/Downloads/Traffic Sign Recognition/'
print(cur_path)
idx = 0
for i in range(rows):
    for j in range(cols):
        path = os.path.join(cur_path, trainDf["Path"].tolist()[idx])
        img = cv2.imread(path, cv2.IMREAD_UNCHANGED)
        #print(path)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = cv2.resize(img, (60,60))
        axs[i,j].imshow(img)
        axs[i,j].set_title(labels[int(trainDf["ClassId"].tolist()[idx])])
        axs[i,j].get_xaxis().set_visible(False)
        axs[i,j].get_yaxis().set_visible(False)
        idx += 1

C:/Users/vanam/OneDrive/Downloads/Traffic Sign Recognition/

Model building using a Convolutional Neural Network: Keras is used for model building. For presentation
purposes, the model-building Python code is presented separately. The code and its description are as follows:

Step1: Import the necessary libraries. All the libraries required for model building, data manipulation,
visualization and image processing must be imported.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import cv2
import tensorflow as tf
from PIL import Image
import os
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
from keras.models import Sequential, load_model
from keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropout
Step2: Extract the paths for the training, testing and label images. Image paths and the images are stored
separately, so fetch the paths of the images from the folders where they are stored.
data = []
labels = []
classes = 43
cur_path = os.getcwd()
print(cur_path)
# Retrieving the images and their labels
for i in range(classes):
    path = os.path.join(cur_path, 'Traffic Sign Recognition', 'Train', str(i))
    images = os.listdir(path)
    print(path)
    print(images)
    # Converting lists into numpy arrays
    for a in images:
        try:
            image = Image.open(os.path.join(path, a))
            image = image.resize((30, 30))
            image = np.array(image)
            data.append(image)
            labels.append(i)
        except:
            print("Error loading image")
C:\Users\vanam\OneDrive\Downloads
C:\Users\vanam\OneDrive\Downloads\Traffic Sign Recognition\Train\0
['00000_00000_00000.png', '00000_00000_00001.png', '00000_00000_00002.png',
'00000_00000_00003.png', '00000_00000_00004.png', '0

Step3: Read the images and convert them to NumPy arrays. The lists need to be converted into NumPy arrays
before they are fed to the model.

#Converting lists into numpy arrays
data = np.array(data)
labels = np.array(labels)
print(data.shape, labels.shape)

(39209, 30, 30, 3) (39209,)
Step4: Split the training set. The training set is split as shown in the code. The printed shapes
(31367, 30, 30, 3) (7842, 30, 30, 3) (31367,) (7842,) mean that there are 31,367 training images of size 30×30
pixels, and the 3 means that the data is represented in the RGB color space. With the sklearn package, we use the
train_test_split() method to split the data into training and testing parts.
#Splitting training and testing dataset
X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
(31367, 30, 30, 3) (7842, 30, 30, 3) (31367,) (7842,)

Step5: Convert the labels using one-hot encoding. From the keras.utils package, the to_categorical() method is
used to convert the labels present in y_train and y_test into one-hot encoding.

#Converting the labels into one hot encoding
y_train = to_categorical(y_train, 43)
y_test = to_categorical(y_test, 43)
Step6: Build the CNN model. A CNN model is used to classify the images into their corresponding categories.
The description of the CNN architecture used here is:

2 Conv2D layers (filters=32, kernel_size=(5,5), activation="relu")
MaxPool2D layer (pool_size=(2,2))
Dropout layer (rate=0.25)
2 Conv2D layers (filters=64, kernel_size=(3,3), activation="relu")
MaxPool2D layer (pool_size=(2,2))
Dropout layer (rate=0.25)
Flatten layer to squeeze the layers into 1 dimension
Dense fully connected layer (256 nodes, activation="relu")
Dropout layer (rate=0.5)
Dense layer (43 nodes, activation="softmax")
#Building the model
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu', input_shape=X_train.shape[1:]))
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(rate=0.25))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(rate=0.25))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(rate=0.5))
model.add(Dense(43, activation='softmax'))
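As a quick sanity check before training (not part of the original record), the layer output shapes and parameter counts can be inspected with Keras's built-in summary:

# Print layer-by-layer output shapes and trainable parameter counts
model.summary()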

Step7: Compile the model. The model needs to be compiled with the chosen hyperparameters. We have used the Adam
optimizer, which gives good accuracy and converges quickly. Since the classes are categorical,
"categorical_crossentropy" is used as the loss function with the Adam optimizer.

#Compilation of the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
epochs = 15
history = model.fit(X_train, y_train, batch_size=32, epochs=epochs, validation_data=(X_test, y_test))
Epoch 1/15
981/981 [==============================] - 88s 79ms/step - loss: 2.0051 - accuracy: 0.4934 - val_loss: 0.6057 - val_accuracy: 0.855
Epoch 2/15
981/981 [==============================] - 81s 82ms/step - loss: 0.7954 - accuracy: 0.7685 - val_loss: 0.2969 - val_accuracy: 0.927
Epoch 3/15
981/981 [==============================] - 80s 81ms/step - loss: 0.5348 - accuracy: 0.8429 - val_loss: 0.2080 - val_accuracy: 0.936
Epoch 4/15
981/981 [==============================] - 80s 81ms/step - loss: 0.4474 - accuracy: 0.8716 - val_loss: 0.1528 - val_accuracy: 0.961
Epoch 5/15
981/981 [==============================] - 82s 83ms/step - loss: 0.3850 - accuracy: 0.8864 - val_loss: 0.1238 - val_accuracy: 0.964
Epoch 6/15
981/981 [==============================] - 80s 82ms/step - loss: 0.3487 - accuracy: 0.8999 - val_loss: 0.1113 - val_accuracy: 0.967
Epoch 7/15
981/981 [==============================] - 82s 84ms/step - loss: 0.3119 - accuracy: 0.9100 - val_loss: 0.0806 - val_accuracy: 0.976
Epoch 8/15
981/981 [==============================] - 87s 88ms/step - loss: 0.2839 - accuracy: 0.9183 - val_loss: 0.0767 - val_accuracy: 0.977
Epoch 9/15
981/981 [==============================] - 82s 84ms/step - loss: 0.2678 - accuracy: 0.9216 - val_loss: 0.0806 - val_accuracy: 0.975
Epoch 10/15
981/981 [==============================] - 84s 86ms/step - loss: 0.2657 - accuracy: 0.9250 - val_loss: 0.0644 - val_accuracy: 0.981
Epoch 11/15
981/981 [==============================] - 83s 85ms/step - loss: 0.2656 - accuracy: 0.9253 - val_loss: 0.2162 - val_accuracy: 0.944
Epoch 12/15
981/981 [==============================] - 119s 122ms/step - loss: 0.2760 - accuracy: 0.9221 - val_loss: 0.0602 - val_accuracy: 0.9
Epoch 13/15
981/981 [==============================] - 115s 118ms/step - loss: 0.2144 - accuracy: 0.9393 - val_loss: 0.0689 - val_accuracy: 0.9
Epoch 14/15
981/981 [==============================] - 90s 92ms/step - loss: 0.2654 - accuracy: 0.9279 - val_loss: 0.0652 - val_accuracy: 0.979
Epoch 15/15
981/981 [==============================] - 82s 84ms/step - loss: 0.2442 - accuracy: 0.9338 - val_loss: 0.0781 - val_accuracy: 0.977
Step8: Train and validate the model. After building the CNN model with the required hyperparameters, we train it
on the training images using model.fit(). The model was trained with different batch sizes and epoch counts (a
sketch of such a comparison is given below). Initially the epoch count was 10 and the batch size 50. When both
were increased to 15 epochs and a batch size of 64, the accuracy was 92%. Increasing both parameters further
gave no significant improvement in accuracy or loss.
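A sketch (not in the original record) of how the batch-size/epoch experiments described above could be organized: rebuild a fresh model for each setting and compare the final validation accuracy. The build_model helper below simply repeats the Step 6 architecture.

def build_model(input_shape):
    m = Sequential()
    m.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu', input_shape=input_shape))
    m.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu'))
    m.add(MaxPool2D(pool_size=(2, 2)))
    m.add(Dropout(rate=0.25))
    m.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
    m.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
    m.add(MaxPool2D(pool_size=(2, 2)))
    m.add(Dropout(rate=0.25))
    m.add(Flatten())
    m.add(Dense(256, activation='relu'))
    m.add(Dropout(rate=0.5))
    m.add(Dense(43, activation='softmax'))
    m.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return m

# Compare a few (epochs, batch_size) settings on the same train/validation split
for n_epochs, b_size in [(10, 50), (15, 64)]:
    m = build_model(X_train.shape[1:])
    h = m.fit(X_train, y_train, batch_size=b_size, epochs=n_epochs,
              validation_data=(X_test, y_test), verbose=0)
    print(n_epochs, b_size, 'val_accuracy:', h.history['val_accuracy'][-1])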
Step9: Plot the accuracy and loss. Graphs are plotted to visualize epoch vs accuracy and epoch vs loss.

#plotting graphs for accuracy
plt.figure(0)
plt.plot(history.history['accuracy'], label='training accuracy')
plt.plot(history.history['val_accuracy'], label='val accuracy')
plt.title('Accuracy')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.legend()
plt.show()

plt.figure(1)
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='val loss')
plt.title('Loss')
plt.xlabel('epochs')
plt.ylabel('loss')
plt.legend()
plt.show()
Step10: Load the test data along with labels. This dataset contains a Test folder with the test images, and a
Test.csv file with the image paths and their respective class labels. We first extract the image paths and labels
using pandas. Each image is resized to 30×30 pixels, and a NumPy array is used to store all the image data.

#testing accuracy on test dataset
cur_path = os.getcwd()
print(cur_path)
#Retrieving the images and their labels
path = os.path.join(cur_path, 'Traffic Sign Recognition', 'Test')
images = os.listdir(path)
print(path)
print(images)
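Step 10 as recorded only lists the files in the Test folder. A sketch of the loading it describes, assuming Test.csv provides a 'Path' and a 'ClassId' column as used in the exploration section above, is:

# Load the test images listed in Test.csv, resized to 30x30, into a NumPy array
test_csv = pd.read_csv(os.path.join(cur_path, 'Traffic Sign Recognition', 'Test.csv'))
test_labels = test_csv['ClassId'].values

test_data = []
for rel_path in test_csv['Path']:
    img = Image.open(os.path.join(cur_path, 'Traffic Sign Recognition', rel_path))
    img = img.resize((30, 30))
    test_data.append(np.array(img))
test_images = np.array(test_data)
print(test_images.shape, test_labels.shape)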

Step11: Predict the class. Using the test images, the class labels are predicted. accuracy_score is imported from
sklearn.metrics to compare the predicted values with the true labels.

# Assuming a multi-class classification model
pred = np.argmax(model.predict(X_test), axis=1)

# Accuracy with the test data
from sklearn.metrics import accuracy_score
print(accuracy_score(labels, pred))
1226/1226 [==============================] - 31s 22ms/step
0.9807187125404881
EXERCISE-12
AIM: OBJECT CLASSIFICATION FOR AUTOMATED CCTV
Description:
Nowadays, surveillance has become an essential part of any industry for safety and monitoring. Recent
developments in technologies like computer vision and machine learning have brought significant advancements in
various automatic surveillance systems. Generally, CCTV runs all the time and hence consumes a lot of
storage.
One industry decides to adopt artificial intelligence to automate CCTV recording. The idea is to
customize the CCTV operation based on object detection. The industry has come up with a plan to
automate the CCTV so that recording starts only when objects are recognized and categorized as belonging to a
specific class. This avoids the need to record images continuously, thereby reducing the memory requirements.
So the problem is to categorize the object type as human, vehicle, animal, etc. Suppose you are asked to
analyze this industry requirement and come up with a feasible solution that can help the company
customize the CCTV based on image classification.
Instructions for problem solving:
As a deep learning developer, design a best model by training the neural network with 60,000 training
samples.
Use all the test image samples to test whether the product is labelled appropriately.
You can use tensorflow / Keras for downloading the data set and to build the model.
Fine tune the hyperparameters and perform the model evaluation.
Substantiate your solution based on your insights for better visualization and provide a report on model
performance.
Here we are using Convolution Neural Network (CNN) to implement this project
Convolution Neural Network (CNN):
Convolutional Neural Network is a specialized neural network designed for visual data, such as images &
videos. But CNNs also work well for non-image data (especially in NLP & text classification).
Its concept is similar to that of a vanilla neural network (multilayer perceptron) – it follows the same general
principle of forward & backward propagation.
Dataset:
Fashion MNIST:
Data set description:
Initially, to test the model, you can use the benchmark Fashion-MNIST data set before deploying it. This is a standard dataset that can be loaded directly through Keras.
The data set description is as follows:
Size of training set = 60,000 images
Number of samples per class = 6,000 training images (plus 1,000 test images per class)
Image size= Each example is a 28x28 grayscale image.
Each pixel has a single pixel-value associated with it, indicating the lightness or darkness of that pixel. This
pixel-value is an integer that ranges between 0 and 255.
Number of class = 10 classes.
The training and test data sets have 785 columns. The details of the data set organization are as given below:
Each row is a separate image
Column 1 is the class label
Remaining columns are pixel numbers (784 total)
Each value is the darkness of the pixel (0 to 255)
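As an illustration of this layout, the CSV form of the data could be read and reshaped back into images as sketched below (the file name fashion-mnist_train.csv is an assumption; in this exercise the data is instead loaded directly through Keras):

# Sketch only: reading the CSV form of Fashion-MNIST (file name assumed);
# this exercise loads the data through keras.datasets instead.
import numpy as np
import pandas as pd

df = pd.read_csv('fashion-mnist_train.csv')      # 785 columns per row
y_train = df.iloc[:, 0].values                   # column 1: class label
X_train = df.iloc[:, 1:].values                  # remaining 784 pixel columns
X_train = X_train.reshape(-1, 28, 28) / 255.0    # back to 28x28 images, scaled to [0, 1]
print(X_train.shape, y_train.shape)              # (60000, 28, 28) (60000,)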
Each training and test example is assigned one of the following labels:
T-shirt/top
Trouser
Pullover
Dress
Coat
Sandal
Shirt
Sneaker
Bag
Ankle boot
Tools and Technology required:
TensorFlow/Keras
Knowledge of Convolutional Neural Networks (deep learning) and a basic understanding of image representation
Pandas
Data Visualization: Matplotlib and seaborn
Source Code:
CIFAR-10 data set:
Dataset Description
The CIFAR-10 data consists of 60,000 32x32 color images in 10 classes, with 6000 images per class. There are
50,000 training images and 10,000 test images in the official data. We have preserved the train/test split from
the original dataset.
The provided files are:
train.7z - a folder containing the training images in png format
test.7z - a folder containing the test images in png format
trainLabels.csv - the training labels
To discourage certain forms of cheating (such as hand labeling) we have added 290,000 junk images in the test
set. These images are ignored in the scoring. We have also made trivial modifications to the official 10,000
test images to prevent looking them up by file hash. These modifications should not appreciably affect the
scoring. You should predict labels for all 300,000 images.
The label classes in the dataset are:
airplane
automobile
bird
cat
deer
dog
frog
horse
ship
truck
The classes are completely mutually exclusive. There is no overlap between automobiles and trucks.
"Automobile" includes sedans, SUVs, things of that sort. "Truck" includes only big trucks. Neither includes
pickup trucks.
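Note that the source code below trains on Fashion-MNIST. If the same pipeline is to be tried on CIFAR-10, the dataset can also be loaded directly through Keras rather than from the Kaggle archives described above; a minimal sketch of that alternative (an addition, not part of the recorded run) is:

# Sketch: loading CIFAR-10 directly through Keras as an alternative to the
# Kaggle train.7z / test.7z archives described above
from tensorflow import keras

(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0   # scale pixel values to [0, 1]
print(x_train.shape, x_test.shape)                  # (50000, 32, 32, 3) (10000, 32, 32, 3)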
OBJECT CLASSIFICATION
This code snippet imports essential libraries for machine learning and data visualization in Python, including
TensorFlow and Keras for deep learning, NumPy for numerical operations, and Matplotlib, Seaborn, and
Pandas for data visualization and manipulation.
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
Load Fashion-MNIST Dataset: The code uses the keras.datasets.fashion_mnist module to load the Fashion-
MNIST dataset, a collection of
grayscale images of 10 different types of clothing items. The dataset is divided into training and testing sets,
with corresponding images and labels stored in variables train_images, train_labels, test_images, and
test_labels.
# Load the Fashion-MNIST dataset
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
29515/29515 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
26421880/26421880 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
5148/5148 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
4422102/4422102 [==============================] - 0s 0us/step
Reshape Image Data: The train_images and test_images arrays are reshaped to have dimensions (60000, 28,
28, 1) and (10000, 28, 28, 1)
respectively, where the last dimension represents the single channel (grayscale). Normalize Pixel Values: The
pixel values of the image data are normalized by dividing them by 255.0. This scales the pixel values between 0
and 1, a common preprocessing step in neural network training, aiding in model convergence.
train_images = train_images.reshape((60000, 28, 28, 1))
test_images = test_images.reshape((10000, 28, 28, 1))
train_images, test_images = train_images / 255.0, test_images / 255.0
Class Labels Definition: The class_labels list is defined to map numerical labels in the dataset to their
corresponding class names for better interpretability. Each index in the list represents a unique class, and the
order corresponds to the numerical labels assigned to each class in the Fashion-MNIST dataset.
# Define the class labels
class_labels = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Ankle
boot"]
Train-Validation Split: The train_test_split function from sklearn.model_selection is used to split the training
data into training and validation sets. The split is performed with a test size of 20%, and a random seed of 42 is
set for reproducibility. The resulting sets are train_images, val_images, train_labels, and val_labels.
# Split the training data into training and validation sets
from sklearn.model_selection import train_test_split
train_images, val_images, train_labels, val_labels = train_test_split(train_images, train_labels, test_size=0.2,
random_state=42)
Dataset Visualization: This code uses Matplotlib to create a 5x5 grid of subplots to visualize a subset of the
training dataset. For each subplot, an image from train_images is displayed in grayscale, along with its
corresponding class label from class_labels[train_labels[i]]. plt.axis('off') is used to hide axis labels
# Visualize the dataset
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.imshow(train_images[i].reshape(28, 28), cmap='gray')
    plt.title(class_labels[train_labels[i]])
    plt.axis('off')
plt.show()

Convolutional Neural Network (CNN) Model: A sequential model is created using Keras, representing a
Convolutional Neural Network (CNN) for image classification. The model consists of convolutional layers with
ReLU activation, max-pooling layers, a flattening layer, and two dense
layers (fully connected) with ReLU activation for feature extraction and classification. The output layer has 10
neurons corresponding to the 10 classes in the Fashion-MNIST dataset, and no activation function is specified,
indicating a raw output used for classification

# Create a CNN model for image classification
model = keras.Sequential([
    keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(64, (3, 3), activation='relu'),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10)  # 10 classes for Fashion-MNIST
])
Model Summary: The model.summary() method provides a concise summary of the CNN architecture. It
displays the layers, their output shapes, and the number of parameters in each layer, providing a useful
overview of the model's structure and complexity.
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                   Output Shape              Param #
=================================================================
 conv2d (Conv2D)                (None, 26, 26, 32)        320
 max_pooling2d (MaxPooling2D)   (None, 13, 13, 32)        0
 conv2d_1 (Conv2D)              (None, 11, 11, 64)        18496
 max_pooling2d_1 (MaxPooling2D) (None, 5, 5, 64)          0
 flatten (Flatten)              (None, 1600)              0
 dense (Dense)                  (None, 128)               204928
 dense_1 (Dense)                (None, 10)                1290
=================================================================
Total params: 225034 (879.04 KB)
Trainable params: 225034 (879.04 KB)
Non-trainable params: 0 (0.00 Byte)
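As a quick cross-check of these parameter counts (a worked calculation added for clarity, not part of the recorded output): each Conv2D layer has kernel_height × kernel_width × input_channels × filters weights plus one bias per filter, and each Dense layer has inputs × units weights plus one bias per unit.

# Worked check of the parameter counts reported by model.summary()
conv1  = 3 * 3 * 1 * 32 + 32      # 320    (3x3 kernel, 1 input channel, 32 filters + biases)
conv2  = 3 * 3 * 32 * 64 + 64     # 18496  (3x3 kernel, 32 input channels, 64 filters + biases)
dense1 = 1600 * 128 + 128         # 204928 (flattened 5*5*64 = 1600 inputs, 128 units + biases)
dense2 = 128 * 10 + 10            # 1290
print(conv1 + conv2 + dense1 + dense2)  # 225034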
Model Compilation: The model.compile function is used to configure the training process of the neural
network. The Adam optimizer is used with its default learning rate. Sparse categorical
crossentropy is set as the loss function, suitable for integer-encoded class labels. The metric for evaluation
during training is accuracy, indicating the fraction of correctly classified images.
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
Model Training: The model.fit method is utilized to train the neural network on the provided training data
(train_images and train_labels). Training is conducted for 10 epochs, allowing the model to learn from the data
through multiple passes. The validation data (val_images and val_labels) are used to evaluate the model's
performance on unseen data during each epoch. The training history, containing information such as loss and
accuracy for each epoch on both training and validation sets, is stored in the history variable.
# Train the model using the training data
history = model.fit(train_images, train_labels, epochs=10, validation_data=(val_images, val_labels))
Epoch 1/10
1500/1500 [==============================] - 42s 27ms/step - loss: 0.4681 - accuracy: 0.8307 - val_loss: 0.3390 - val_accuracy: 0.8
Epoch 2/10
1500/1500 [==============================] - 40s 27ms/step - loss: 0.3154 - accuracy: 0.8837 - val_loss: 0.2846 - val_accuracy: 0.8
Epoch 3/10
1500/1500 [==============================] - 40s 27ms/step - loss: 0.2666 - accuracy: 0.9010 - val_loss: 0.2843 - val_accuracy: 0.8
Epoch 4/10
1500/1500 [==============================] - 40s 27ms/step - loss: 0.2336 - accuracy: 0.9140 - val_loss: 0.2485 - val_accuracy: 0.9
Epoch 5/10
1500/1500 [==============================] - 40s 26ms/step - loss: 0.2061 - accuracy: 0.9230 - val_loss: 0.2504 - val_accuracy: 0.9
Epoch 6/10
1500/1500 [==============================] - 39s 26ms/step - loss: 0.1831 - accuracy: 0.9308 - val_loss: 0.2493 - val_accuracy: 0.9
Epoch 7/10
1500/1500 [==============================] - 39s 26ms/step - loss: 0.1605 - accuracy: 0.9411 - val_loss: 0.2817 - val_accuracy: 0.9
Epoch 8/10
1500/1500 [==============================] - 50s 33ms/step - loss: 0.1417 - accuracy: 0.9460 - val_loss: 0.2573 - val_accuracy: 0.9
Epoch 9/10
1500/1500 [==============================] - 40s 27ms/step - loss: 0.1245 - accuracy: 0.9531 - val_loss: 0.2614 - val_accuracy: 0.9
Epoch 10/10
1500/1500 [==============================] - 40s 27ms/step - loss: 0.1103 - accuracy: 0.9583 - val_loss: 0.2681 - val_accuracy: 0.9

The model was trained for 10 epochs, achieving decreasing training and validation losses, as well as increasing
training and validation accuracies. The final accuracy on the validation set is around 91.25%, indicating effective
learning. However, it's important to monitor for signs of overfitting, where the model may perform well on the
training set but not generalize well to new data.
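One common safeguard against such overfitting is early stopping on the validation loss. A minimal sketch (not used in the recorded run above, which trains a fixed 10 epochs) is:

# Sketch: early stopping on validation loss, restoring the best weights
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss',
                                           patience=3,
                                           restore_best_weights=True)
history = model.fit(train_images, train_labels,
                    epochs=30,
                    validation_data=(val_images, val_labels),
                    callbacks=[early_stop])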
Generate Predictions: The model.predict method is employed to generate predictions on the test dataset
(test_images). The resulting predictions are then processed to obtain the corresponding class labels by finding
the index of the maximum value in each prediction using np.argmax. The predicted labels are stored in the
predicted_labels variable, providing the model's classification for the test dataset.
# Generate predictions on the test dataset
predictions = model.predict(test_images)
predicted_labels = [class_labels[np.argmax(prediction)] for prediction in predictions]
313/313 [==============================] - 4s 13ms/step
Visualize Sample Predictions: Matplotlib is used to create a 5x5 grid of subplots for visualizing a subset of the
test dataset along with their predicted labels. For each subplot, an image from test_images is displayed in
grayscale. The true class label is obtained from class_labels[test_labels[i]], and the predicted label is obtained
from the predicted_labels array. The title of each subplot shows the true and predicted labels for comparison,
providing a visual assessment of the model's performance on the test data.
# Print some sample images with predicted labels
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.imshow(test_images[i].reshape(28, 28), cmap='gray')  # Reshape for visualization
    true_label = class_labels[test_labels[i]]
    predicted_label = predicted_labels[i]
    plt.title(f"True: {true_label}\nPredicted: {predicted_label}")
    plt.axis('off')
plt.show()
Confusion Matrix for Evaluation: The code utilizes the pd.crosstab function from the Pandas
library to create a confusion matrix based on the true labels (test_labels) and predicted labels
converted to numerical indices. The confusion matrix is then visualized as a heatmap using
Seaborn (sns.heatmap), with annotations displaying the counts in each cell. This heatmap
provides a comprehensive view of the model's performance by showing how well it predicts
each class in relation to the ground truth
# Create a confusion matrix to evaluate the model's performance
confusion_matrix = pd.crosstab(test_labels,
                               [class_labels.index(label) for label in predicted_labels],
                               rownames=['Actual'], colnames=['Predicted'])
sns.heatmap(confusion_matrix, annot=True, fmt="d", xticklabels=class_labels, yticklabels=class_labels)
plt.show()
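Beyond the confusion matrix, per-class precision, recall, and F1-score can be reported with scikit-learn; a small sketch (an addition, not part of the recorded run) is:

# Sketch: per-class precision / recall / F1 using scikit-learn
from sklearn.metrics import classification_report

pred_indices = [class_labels.index(label) for label in predicted_labels]
print(classification_report(test_labels, pred_indices, target_names=class_labels))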

Evaluate Model on Test Data: The model.evaluate method is used to assess the model's performance on
the test dataset (test_images and test_labels). The resulting test loss and accuracy are stored in the
variables test_loss and test_accuracy. The test accuracy is then printed to the console, providing a
quantitative measure of the model's ability to generalize to unseen data

test_loss, test_accuracy = model.evaluate(test_images, test_labels, verbose=2)

print(f"Test accuracy: {test_accuracy}")

313/313 - 2s - loss: 0.2883 - accuracy: 0.9062 - 2s/epoch - 6ms/step
Test accuracy: 0.9061999917030334


Visualize Training History: Matplotlib is used to plot the training accuracy (history.history['accuracy']) and validation accuracy (history.history['val_accuracy']) over the epochs. The x-axis represents the number of training epochs, and the y-axis represents the corresponding accuracy values. This visualization provides insight into the training process, showing how well the model performs on both the training and validation sets over time.

# Visualize the model's training history
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
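The loss curves can be plotted in the same way if desired (an optional addition mirroring the loss plot from the earlier experiment):

# Optional: plot training and validation loss over the epochs
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()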

Model Performance Report: The code prints a summary report on the model's performance.
Specifically, it prints the test accuracy obtained from the evaluation of the model on the test
dataset. This provides a concise overview of the model's accuracy on unseen data, summarizing
its effectiveness in making predictions.
# report on the model's performance
print("Model Performance Report:")
print(f"Test accuracy: {test_accuracy}")
Model Performance Report:
Test accuracy: 0.9061999917030334
The model's performance on the test dataset is reported as an accuracy of approximately 90.62%. This metric reflects the proportion of correctly classified images in the test set.