
Application of LLMs

Task 1: Design a CNN Architecture for Multi-Class Classification

Objective: Develop a Convolutional Neural Network (CNN) for multi-class classification using a publicly available dataset of your choice from the internet. This task involves designing, experimenting with, and improving your CNN architecture while documenting all key steps, challenges, and insights.

Dataset Selection:

Imagenette (https://s3.amazonaws.com/fast-ai-imageclas/imagenette2-160.tgz) is a subset of the larger ImageNet dataset, designed to be more manageable while still providing a meaningful challenge for image classification tasks. It contains images from 10 classes: cassette player, chain saw, church, English springer, French horn, garbage truck, gas pump, golf ball, parachute, and tench.
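For reference, a minimal sketch of how the archive might be fetched and extracted with a Keras utility; the report does not describe the download step, so this is an assumption:

```python
import tensorflow as tf

# Download and extract the 160px Imagenette archive into the Keras cache.
archive = tf.keras.utils.get_file(
    fname="imagenette2-160.tgz",
    origin="https://s3.amazonaws.com/fast-ai-imageclas/imagenette2-160.tgz",
    extract=True,
)
# The extracted folder contains train/ and val/ subdirectories,
# with one subfolder per class.
```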

Dataset Preprocessing:

The training data was merged into a single directory while the target labels were tracked, and the files were renamed. Zero padding was then added to make each image square, and finally the images were resized to 80 px x 80 px to reduce the number of parameters in the model. The same preprocessing was applied to the test (validation) dataset.
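A minimal sketch of the pad-and-resize step, assuming Pillow is used (the report does not name the library; the function name is illustrative):

```python
from PIL import Image

def pad_to_square_and_resize(path, size=80):
    """Zero-pad an image to a square canvas, then resize to size x size."""
    img = Image.open(path).convert("RGB")
    w, h = img.size
    side = max(w, h)
    # Paste the image onto a black (zero-valued) square canvas, centred.
    canvas = Image.new("RGB", (side, side), (0, 0, 0))
    canvas.paste(img, ((side - w) // 2, (side - h) // 2))
    return canvas.resize((size, size))
```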

MODEL ARCHITECTURE
Initial Approach:

The starting point was the architecture outlined in the TensorFlow Convolutional Neural Network (CNN) documentation (https://storage.googleapis.com/tensorflow_docs/docs/site/en/tutorials/images/cnn.ipynb).
Modifications:

 Applying Data Augmentation: rotate, shift, and flip images.
 Adding Noise to Images Before Training: use Gaussian noise.
 Adding an Extra Convolutional Layer: another convolutional layer in the architecture.
 Using Leaky ReLU: activation function in certain layers.
 Using Dropout: regularization method in specific layers.
 Using L2 Regularization: regularization applied to certain layers.
 Using Early Stopping: callback to prevent overfitting.
 Reducing Learning Rate on Plateau: callback to adjust the learning rate during training.

These modifications were aimed at reducing the chance of overfitting; the extra convolutional layer also improved accuracy.
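A sketch of how several of these modifications might be set up in Keras; the rotation/shift ranges, noise level, and callback patience values are assumptions, not the report's actual settings:

```python
import tensorflow as tf
from tensorflow.keras import layers, callbacks

# Data augmentation: rotate, shift, and flip (ranges are illustrative).
augment = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),          # rotate up to ~36 degrees
    layers.RandomTranslation(0.1, 0.1),  # shift up to 10% in each axis
])

# Gaussian noise applied to inputs; active during training only.
noise = layers.GaussianNoise(0.05)  # assumes pixel values scaled to [0, 1]

# Callbacks: early stopping and learning-rate reduction on plateau.
training_callbacks = [
    callbacks.EarlyStopping(monitor="val_loss", patience=5,
                            restore_best_weights=True),
    callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3),
]
```

The augmentation and noise layers can be placed at the front of the model so they run only at training time and are skipped at inference.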

Experiments:
Setting Dropout Too High, Adding Too Much L2 Regularization, and Adding Another Dense Layer:

These modifications collectively slowed the learning process significantly. The excessive dropout rates and high L2 regularization penalized the model's weights too harshly, while the added complexity of another dense layer increased the model's capacity but made it harder to optimize. As a result, the model struggled to reach good accuracy during training, began to overfit prematurely, and failed to generalize well to new data.

Final Structure:

1. Input Layer: Accepts input images of size (80x80x3).

2. Convolutional Layer 1:

 32 filters, 3x3 kernel size, with ReLU activation.

 Output is followed by:


 MaxPooling2D: Pool size of (2x2).

 Dropout: Rate of 0.2 to prevent overfitting.

3. Convolutional Layer 2:

 64 filters, 3x3 kernel size, with ReLU activation.

 Includes L2 regularization with a factor of 0.001.

 Output is followed by:

 AveragePooling2D: Pool size of (2x2).

 Dropout: Rate of 0.1.

4. Convolutional Layer 3:

 64 filters, 3x3 kernel size, with ReLU activation.

 Output is followed by:

 MaxPooling2D: Pool size of (2x2).

 Dropout: Rate of 0.2.

5. Convolutional Layer 4:

 128 filters, 3x3 kernel size.

 Activation is LeakyReLU with an alpha of 0.05.

 Output is followed by:

 Dropout: Rate of 0.1.

6. Flatten Layer: Converts the multi-dimensional feature maps into a 1D vector.

7. Fully Connected Layer 1:

 256 units.

 Activation is LeakyReLU with an alpha of 0.1.

 Output is followed by:

 Dropout: Rate of 0.3.

8. Fully Connected Output Layer:

 10 units, corresponding to the number of output classes.

 Activation is sigmoid.

 Includes L2 regularization with a factor of 0.0001.
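Assembling the layers above, a minimal Keras sketch of the described architecture; the optimizer and loss are assumptions, since the report does not state them:

```python
from tensorflow.keras import layers, models, regularizers

model = models.Sequential([
    layers.Input(shape=(80, 80, 3)),
    # Block 1
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.2),
    # Block 2 (with L2 regularization)
    layers.Conv2D(64, (3, 3), activation="relu",
                  kernel_regularizer=regularizers.l2(0.001)),
    layers.AveragePooling2D((2, 2)),
    layers.Dropout(0.1),
    # Block 3
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.2),
    # Block 4 (LeakyReLU activation)
    layers.Conv2D(128, (3, 3)),
    layers.LeakyReLU(alpha=0.05),
    layers.Dropout(0.1),
    # Classifier head
    layers.Flatten(),
    layers.Dense(256),
    layers.LeakyReLU(alpha=0.1),
    layers.Dropout(0.3),
    # Sigmoid output as specified in the report.
    layers.Dense(10, activation="sigmoid",
                 kernel_regularizer=regularizers.l2(0.0001)),
])

# Optimizer and loss below are assumptions, not stated in the report.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```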


Performance of the Model

The model achieves an accuracy of over 76% on test data and over 80% on training
data.
Other metrics:

                   precision  recall  f1-score  support

cassette player         0.79    0.79      0.79      357
chain saw               0.68    0.65      0.66      386
church                  0.76    0.81      0.79      409
English springer        0.92    0.79      0.85      395
French horn             0.63    0.83      0.72      394
garbage truck           0.77    0.85      0.81      389
gas pump                0.71    0.71      0.71      419
golf ball               0.84    0.71      0.77      399
parachute               0.87    0.84      0.85      390
tench                   0.92    0.81      0.86      387

accuracy                                  0.78     3925
macro avg               0.79    0.78      0.78     3925
weighted avg            0.79    0.78      0.78     3925
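These per-class figures have the format of scikit-learn's classification report; a minimal sketch of how such a table could be produced (model, test_images, y_true, and class_names are illustrative names, not taken from the report):

```python
import numpy as np
from sklearn.metrics import classification_report

# y_true: integer labels of the test set; probs: model output per class.
probs = model.predict(test_images)
y_pred = np.argmax(probs, axis=1)
print(classification_report(y_true, y_pred, target_names=class_names))
```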

Analysis of Performance: (Please refer to the following confusion matrix)


The model misclassifies many images labelled chain saw as French horn. This can be explained by the fact that many images of both classes also show a person holding the object. Some other misclassifications can be attributed to similar reasons, though most are due to shortcomings of the model that could not be completely eliminated.
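The confusion matrix figure itself is not reproduced in this extract; a minimal sketch of how it could be generated with scikit-learn, reusing the illustrative names from the previous sketch:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

# Plot the confusion matrix for the test-set predictions.
ConfusionMatrixDisplay.from_predictions(
    y_true, y_pred, display_labels=class_names, xticks_rotation=45)
plt.tight_layout()
plt.show()
```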
