Image Recognition Using ML (CNN) For Beginners - by Akhil Haridasan - The Startup - Medium

10/2/23, 8:24 PM Image Recognition Using ML (CNN) for Beginners | by Akhil Haridasan | The Startup | Medium

Image Recognition Using ML (CNN) for Beginners
Akhil Haridasan
Published in The Startup
9 min read · Oct 8, 2020

Image recognition, image processing, and computer vision are some of the hottest
topics in the tech industry these days, and many products have been built on
these technologies. Face recognition, gesture recognition, and driverless cars
are some of the coolest creations of computer vision, and the foundation of all
of them is “Image Recognition”.

In this article, we will try to understand how a Convolutional Neural Network (a type
of Deep Learning algorithm) can be used for image classification. This article is
designed for beginners and anyone interested in learning Image Recognition and
Machine Learning. (It is also one of the easiest walkthroughs out there.)

https://medium.com/swlh/image-recognition-using-ml-cnn-b7e4993d5059
Agenda:

1. Fashion MNIST (FMNIST) Clothing Classification

2. Creating a better Dataset

3. How to Define/Use a Model

4. Evaluate the Model

5. Make Predictions

Pre-requisite:

1. Python 3.6 (preferred 3.6.10)

2. TensorFlow 2.1.0 and Keras 2.3.1 (as we are going to work with Deep Learning
models and Keras)

3. Google Colab/PyCharm/Jupyter Notebook (I prefer Colab because there is free
GPU support🤣)

1. FMNIST Clothing Classification


Why FMNIST dataset?

First, Fashion MNIST is one of the most widely used benchmark image datasets, and it
is a useful starting point for beginners learning image classification with
convolutional neural networks. Second, the dataset already comes with well-defined
training and test splits that can be used without any hassle.

This dataset consists of 70,000 small 28x28 pixel grayscale images (60,000 for
training and 10,000 for testing) of 10 different types of clothing, including
shoes, t-shirts, dresses, and bags, with labels assigned to them as follows:

0: T-shirt/top

1: Trouser

2: Pullover

3: Dress

4: Coat

5: Sandal

6: Shirt

7: Sneaker

8: Bag

9: Ankle boot

Let us load this FMNIST dataset and see how it exactly looks.

from keras.datasets import fashion_mnist
from matplotlib import pyplot
from keras.utils import to_categorical

(trainX, trainy), (testX, testy) = fashion_mnist.load_data()

# summarize loaded dataset
print('Train: X=%s, y=%s' % (trainX.shape, trainy.shape))
print('Test: X=%s, y=%s' % (testX.shape, testy.shape))

# plot first 9 images in the training dataset
for i in range(9):
    pyplot.subplot(330 + 1 + i)  # define subplot in a 3x3 grid
    pyplot.imshow(trainX[i], cmap=pyplot.get_cmap('gray'))  # plot raw pixel data
pyplot.show()

We can see that there are 60,000 examples in the training dataset and 10,000 in the
test dataset.

2. Creating a better Dataset


This step is divided into two:

a. Loading of the dataset

Here, we know that every image in our dataset is already labeled (i.e. it is
assigned an integer from 0 to 9; a T-shirt/top has number 0, a trouser has
number 1, and so on). All our images are of size 28x28 and they are all grayscale
images.

The images are loaded as arrays of shape 28x28 with no channel dimension, while the
convolutional layers expect an explicit channel axis. So, we will reshape every
image to 28x28x1, where the 1 stands for the single grayscale channel. Note that
reshape only changes how the array is indexed. It does not resize or recolor
anything, so it would raise an error if an image did not already have 28x28 pixels.

# load dataset
(trainX, trainY), (testX, testY) = fashion_mnist.load_data()

# reshape dataset to have a single channel
trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
testX = testX.reshape((testX.shape[0], 28, 28, 1))

Now, because we know that our images are assigned a particular integer value, we
will be using a technique called “one-hot encoding” to convert these integers into
binary vectors.

#one hot encode target values

trainY = to_categorical(trainY)
testY = to_categorical(testY)
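
To make one-hot encoding concrete, here is a minimal pure-Python sketch of the idea. The helper name `one_hot` is made up for this illustration and is not part of Keras; `to_categorical` produces the same pattern of zeros and ones, but returns it as a float array rather than nested lists.

```python
def one_hot(labels, num_classes):
    """Return one binary vector per label, with a single 1 at the label's index."""
    encoded = []
    for label in labels:
        row = [0] * num_classes
        row[label] = 1
        encoded.append(row)
    return encoded


# Labels from the table above: 0 = T-shirt/top, 2 = Pullover, 9 = Ankle boot
sample_labels = [0, 2, 9]
vectors = one_hot(sample_labels, num_classes=10)
for label, vector in zip(sample_labels, vectors):
    print(label, vector)
```

Each vector has exactly ten entries, one per clothing class, which is what lets the model output one probability per class.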

We will now create a single function to perform all these three steps together.

def load_dataset():
    # load dataset
    (trainX, trainY), (testX, testY) = fashion_mnist.load_data()

    # reshape dataset to have a single channel
    trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
    testX = testX.reshape((testX.shape[0], 28, 28, 1))

    # one hot encode target values
    trainY = to_categorical(trainY)
    testY = to_categorical(testY)

    return trainX, trainY, testX, testY

b. Preparation of the dataset

Every pixel in an image is stored as an integer value that ranges from 0 to 255,
where 0 means black and 255 means white.

Now, we need to convert these 0–255 pixel values to the range 0–1 for a better
result. So, basically, we are re-scaling our images to the range [0,1]. We will do
that by converting the pixel data to float values and then dividing these values
by 255 (which is our maximum pixel value).

def prepare_pixels(train, test):
    # convert from integers to floats
    train_norm = train.astype('float32')
    test_norm = test.astype('float32')

    # normalize to range 0-1
    train_norm = train_norm / 255.0
    test_norm = test_norm / 255.0

    # return normalized images
    return train_norm, test_norm
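
As a quick sanity check of the re-scaling, the sketch below applies the same division to a handful of made-up pixel values. The helper `scale_pixels` and the sample row are assumptions for illustration only; the real function above works on whole arrays at once.

```python
def scale_pixels(row):
    """Map integer pixel values in 0..255 to floats in 0.0..1.0."""
    return [value / 255.0 for value in row]


sample_row = [0, 51, 102, 255]  # hypothetical pixel values
scaled = scale_pixels(sample_row)
print(scaled)
```

Black stays at 0.0, white becomes 1.0, and every gray level lands somewhere in between.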

3. Creating/Using a Model
We will now create our baseline model.

This is a basic model for our dataset that should remain a workable starting point
even if we later change the dataset (like adding new photos, changing the color of
photos, etc). It will be our base model, and it can then be improved based on the
accuracy and other parameters.

ML Model — Convolutional Neural Network

A neural network is a type of ML algorithm that has been developed to recognize
underlying relationships in a set of data through a process loosely inspired by the
way the human brain operates.

A CNN is one of the main types of neural network used for image recognition, image
classification, object detection, facial recognition, etc.

Why is CNN preferred for image datasets?

In a CNN, every image is read in parts rather than as a whole image. For instance,
let's say we have a 300x300 pixel image. The CNN will slide a small filter over the
image and look at one small patch at a time (say, a 4x4 matrix of pixels), dealing
with these small matrices one-by-one. Features are then extracted from each of those
smaller image matrices.
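
The patch-by-patch reading described above can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions (a square image, a square filter, no padding, a step of one pixel), and the function name `convolve2d`, the tiny 4x4 image, and the 2x2 filter are all made up for the example. Real convolutional layers do the same multiply-and-sum, but learn the filter values during training.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    k = len(kernel)
    out_size = len(image) - k + 1
    feature_map = []
    for top in range(out_size):
        row = []
        for left in range(out_size):
            # multiply the patch by the filter element-wise and sum the result
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# a tiny image: dark on the left, bright on the right
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# a filter that responds where brightness increases from left to right
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
```

The feature map is largest in the middle column, exactly where the dark-to-bright edge sits, which is the sense in which a filter "extracts a feature" from each patch.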

Any CNN model will have two main aspects:

1. Feature extraction — Performed using convolutional and pooling layers

2. Classifier — that will make a prediction.

How will our CNN model be?

1. We will start with a single convolutional layer with a small filter size (3,3) and a
modest number of filters (32) followed by a max-pooling layer.

2. We know that here we have to categorize the data into 10 different classes, right?
So this will be called a multi-class classification problem. Let me ask you a
question, based on the images that we have seen. What do you think would be
the number of nodes in the output layer? 10!! (come on that was obvious).

3. We will require an activation function (AF). An AF is responsible for
transforming the summed weighted input into an output.

4. We will also add a Dense layer between the feature extractor and the output layer
to interpret the features. Let us add 100 nodes and see how it goes.

5. We will use the ReLU activation function and the He weight initialization scheme
(best practice). ReLU is a kind of AF that looks at its input: if the input is
positive, it is passed through unchanged; otherwise the output is zero.

6. We will then use a stochastic gradient descent (SGD) optimizer to optimize our
learning algorithm. Our SGD will have a learning rate of 0.01 and a momentum
of 0.9. (Try changing learning rates to see the differences in accuracy values)

7. Finally, we will compile the model with our SGD optimizer and a categorical
cross-entropy loss function (considered suitable for multi-class classification),
and we will monitor our classification accuracy.
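
Before the Keras code, here is a small pure-Python sketch of three pieces named in this walkthrough: the ReLU activation, the softmax activation that the model's output layer uses, and the categorical cross-entropy loss. The function names and the sample scores are assumptions for illustration; Keras has its own optimized implementations of all three.

```python
import math


def relu(x):
    """Pass positive inputs through unchanged; clamp everything else to zero."""
    return max(0.0, x)


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot_target, probabilities):
    """Loss is small when the true class gets a high probability."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probabilities))


activations = [relu(x) for x in [-2.0, 0.0, 3.5]]
print(activations)

probs = softmax([2.0, 1.0, 0.1])          # hypothetical scores for 3 classes
loss = categorical_cross_entropy([1, 0, 0], probs)
print(probs, loss)
```

The loss shrinks toward zero as the probability assigned to the correct class approaches 1, which is what the optimizer pushes the model toward.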

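from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.optimizers import SGD

# define our CNN model
def cnn_model():
    model = Sequential()
    # feature extractor: 32 filters of size 3x3, then 2x2 max pooling
    model.add(Conv2D(32, (3, 3), activation='relu',
                     kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    # classifier: 100-node hidden layer, then one softmax output per class
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(10, activation='softmax'))
    # compile model (Keras 2.3+ names the argument learning_rate; older code used lr)
    opt = SGD(learning_rate=0.01, momentum=0.9)
    model.compile(optimizer=opt, loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model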
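
It can help to work out by hand what shapes flow through this model. The sketch below does the arithmetic in plain Python, assuming the Keras defaults that the code above relies on (no padding and a step of one pixel for the convolution, and a 2x2 pooling window that moves two pixels at a time). The helper `conv_output_size` is a made-up name for this illustration.

```python
def conv_output_size(size, kernel, padding=0, stride=1):
    """Width (or height) of a feature map after a convolution."""
    return (size + 2 * padding - kernel) // stride + 1


conv_size = conv_output_size(28, kernel=3)    # 28x28 input, 3x3 filter
pooled_size = conv_size // 2                  # 2x2 max pooling halves each side
flattened = pooled_size * pooled_size * 32    # 32 feature maps, flattened

# trainable parameters: (inputs per node + 1 bias) * number of nodes
conv_params = (3 * 3 * 1 + 1) * 32
dense_params = (flattened + 1) * 100
output_params = (100 + 1) * 10
total_params = conv_params + dense_params + output_params

print(conv_size, pooled_size, flattened)
print(conv_params, dense_params, output_params, total_params)
```

Almost all of the parameters sit in the 100-node Dense layer, which is worth knowing before adding more filters or nodes. Calling `cnn_model().summary()` should let you compare these numbers with what Keras reports.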

4. Evaluate the Model


Once we are ready with our model, the next step is to evaluate our model for
accuracy.

Evaluation Method: K-fold cross-validation

We will evaluate our model using k-fold cross-validation. Here, try to
choose your k value in such a way that it’s not too large, because the model is
trained from scratch once per fold. A modest k (we use 5) lets us evaluate the
model repeatedly while avoiding long running times.

The training dataset is shuffled before the split, using a fixed random seed, so
that any model we evaluate will have the same train and test datasets in each
fold.

We will train the model for 10 epochs with a default batch size of 32 examples. After
every epoch, the held-out fold will be used to evaluate the model. This will help
us to create a learning curve to identify the performance of the model.

from sklearn.model_selection import KFold

def evaluate_model(dataX, dataY, n_folds=5):
    scores, histories = list(), list()
    # prepare cross validation
    kfold = KFold(n_splits=n_folds, shuffle=True, random_state=1)
    # enumerate splits
    for train_ix, test_ix in kfold.split(dataX):
        # build a fresh model for every fold
        model = cnn_model()
        trainX, trainY = dataX[train_ix], dataY[train_ix]
        testX, testY = dataX[test_ix], dataY[test_ix]
        history = model.fit(trainX, trainY, epochs=10, batch_size=32,
                            validation_data=(testX, testY), verbose=0)
        _, acc = model.evaluate(testX, testY, verbose=0)
        print('> %.3f' % (acc * 100.0))
        scores.append(acc)
        histories.append(history)
    # save the model from the last fold for future use
    model.save('final_model.h5')
    return scores, histories
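
To see what a k-fold split actually hands to the loop above, here is a simplified pure-Python sketch that splits 10 sample indices into 5 folds. It is not scikit-learn's implementation (the function name `k_fold_indices` is made up, and the way it deals indices into folds is an assumption for illustration), but it shows the two properties that matter: every sample is held out exactly once, and a fixed seed gives the same folds on every run.

```python
import random


def k_fold_indices(n_samples, n_folds, seed=1):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # fixed seed => reproducible folds
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        test_ix = folds[held_out]
        train_ix = [ix for f, fold in enumerate(folds) if f != held_out
                    for ix in fold]
        yield train_ix, test_ix


splits = list(k_fold_indices(n_samples=10, n_folds=5))
for train_ix, test_ix in splits:
    print('train:', sorted(train_ix), 'test:', sorted(test_ix))
```

With 60,000 training images and 5 folds, each fold would train on 48,000 images and evaluate on the remaining 12,000.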

Result of Evaluation

We will be presenting two aspects of the results. First, the accuracy diagnosis and
second, the loss between training and testing dataset.

def accuracy_summary(histories):
    for i in range(len(histories)):
        pyplot.subplot(222)
        pyplot.title('Classification Accuracy')
        pyplot.plot(histories[i].history['accuracy'], color='blue',
                    label='train')
        pyplot.plot(histories[i].history['val_accuracy'],
                    color='orange', label='test')
    pyplot.show()

def loss_summary(histories):
    for i in range(len(histories)):
        pyplot.subplot(211)
        pyplot.title('Loss')
        pyplot.plot(histories[i].history['loss'], color='blue',
                    label='train')
        pyplot.plot(histories[i].history['val_loss'],
                    color='orange', label='test')
    pyplot.show()

Now, a final function to call all the above-defined functions.

def final():
    trainX, trainY, testX, testY = load_dataset()
    trainX, testX = prepare_pixels(trainX, testX)
    scores, histories = evaluate_model(trainX, trainY)
    accuracy_summary(histories)
    loss_summary(histories)

final()

The above image shows the result for accuracy values for each fold of the cross-
validation process. The results may vary with the stochastic nature of the algorithm
on running it multiple times.

Blue lines in the graph indicate model performance on train dataset and orange
lines indicate performance on test dataset. Additionally, we can see that the model
is able to achieve a good fit with train and test learning curves converging.

5. Make Predictions
An important thing to keep in mind is that when making predictions, we need to
have a grayscale image, as we have trained our model on grayscale
images. A workaround for color images is to add a function that
converts an RGB image into a grayscale image. For now, I will use one of the images
from the test dataset and predict the class of that image.

# make a prediction for a new image.

# notebook-only magic; remove this line when running outside Jupyter/Colab
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.models import load_model

def load_image(filename):
    # load the image as grayscale, resized to 28x28
    img = load_img(filename, color_mode='grayscale', target_size=(28, 28))
    # convert to array
    img = img_to_array(img)
    # reshape into a single sample with 1 channel
    img = img.reshape(1, 28, 28, 1)
    # prepare pixel data
    img = img.astype('float32')
    img = img / 255.0
    return img

Use the saved model to predict the class of the image. I have added if-else
conditions to make the predicted category easier for you guys to read.

import numpy as np

img1 = mpimg.imread('/content/sample_data/sample_image.png')
imgplot = plt.imshow(img1)
plt.show()
img = load_image("/content/sample_data/sample_image.png")
model = load_model('/content/final_model.h5')
# predict the class: index of the highest softmax probability
# (predict_classes was removed in newer TensorFlow releases)
result = np.argmax(model.predict(img), axis=-1)

if result[0] == 0:
    print("Top")
elif result[0] == 1:
    print("Trouser")
elif result[0] == 2:
    print("Pullover")
elif result[0] == 3:
    print("Dress")
elif result[0] == 4:
    print("Coat")
elif result[0] == 5:
    print("Sandal")
elif result[0] == 6:
    print("Shirt")
elif result[0] == 7:
    print("Sneaker")
elif result[0] == 8:
    print("Bag")
elif result[0] == 9:
    print("Ankle Boot")
else:
    print("Not in the list")
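
If the if-else chain feels long, one possible alternative is to keep the class names in a list and use the predicted number as an index. This is just a sketch: the names `CLASS_NAMES` and `class_name` are made up here, and the labels follow the table at the start of the article.

```python
CLASS_NAMES = [
    'T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
    'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot',
]


def class_name(index):
    """Translate a predicted class number into a readable label."""
    if 0 <= index < len(CLASS_NAMES):
        return CLASS_NAMES[index]
    return 'Not in the list'


print(class_name(2))
```

With the prediction code above, this would be called as `class_name(int(result[0]))`.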

From the above image, you can see that the image that has been passed on to our
model was that of a pullover and it did predict the image as a “pullover”. You can also
try using a different image and check for your own.

Some additional FYI

How this model can be further improved

1. By padding the convolution: helps more features (including those at the image
borders) contribute to the output

2. By increasing the number of filters: helps in extracting more features from the
input images

If you are facing any issue pertaining to Deep Learning or ML models, you
can contact me via LinkedIn or Facebook, or comment here. Feedback
is always a good way to improve.

I will be posting something interesting again with easy steps soon. Till then, enjoy
coding!! 👍
