0% found this document useful (0 votes)
45 views

Program Code - Digit Recognizer Paper

The document discusses loading and preprocessing digit recognition data using Python libraries like Pandas and TensorFlow. It builds a convolutional neural network model with Conv2D, MaxPooling2D, Flatten, and Dense layers that achieves over 97% accuracy on the test set. The trained model is saved and used to make predictions on the test data, which are submitted in a CSV file for evaluation.

Uploaded by

Tanjir Ahmed
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
45 views

Program Code - Digit Recognizer Paper

The document discusses loading and preprocessing digit recognition data using Python libraries like Pandas and TensorFlow. It builds a convolutional neural network model with Conv2D, MaxPooling2D, Flatten, and Dense layers that achieves over 97% accuracy on the test set. The trained model is saved and used to make predictions on the test data, which are submitted in a CSV file for evaluation.

Uploaded by

Tanjir Ahmed
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 4

r

# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-
python
# For example, here's several helpful packages to load in

import numpy as np # linear algebra


import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the "../input/" directory.


# For example, running this (by clicking run or pressing Shift+Enter) will list all
files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
for filename in filenames:
print(os.path.join(dirname, filename))

# Any results you write to the current directory are saved as output.
/kaggle/input/digit-recognizer/train.csv
/kaggle/input/digit-recognizer/test.csv
/kaggle/input/digit-recognizer/sample_submission.csv
In [2]:
import tensorflow as tf
import matplotlib.pyplot as plt
tf.__version__
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Activation, MaxPooling2D, Conv2D,
Conv1D, MaxPooling1D
In [3]:
train=pd.read_csv('/kaggle/input/digit-recognizer/train.csv')
test=pd.read_csv('/kaggle/input/digit-recognizer/test.csv')
sample_submission=pd.read_csv('/kaggle/input/digit-recognizer/sample_submission.csv')
In [4]:
X_train=train.drop(columns=['label']).values
y_train=train.label.values
X_train=tf.keras.utils.normalize(X_train, axis=1)
X_test=tf.keras.utils.normalize(test, axis=1).values
In [5]:
print(X_train.shape, y_train.shape, X_test.shape)
(42000, 784) (42000,) (28000, 784)
In [6]:
X_test1 = X_test.reshape(X_test.shape[0],28,28,1)
X_train1 = X_train.reshape(X_train.shape[0],28,28,1)
In [7]:
#model=tf.keras.models.Sequential()
#model.add(tf.keras.layers.Flatten())
#model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
#model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
#model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))

model=Sequential()
model.add(Conv2D(128, (3,3), input_shape=X_train1.shape[1:]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Conv2D(128, (3,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Conv2D(128, (3,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))

model.add(Dense(10))
model.add(Activation('softmax'))

model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)
model.fit(X_train1, y_train, epochs=20, validation_split=0.1)
Train on 37800 samples, validate on 4200 samples
Epoch 1/20
37800/37800 [==============================] - 75s 2ms/sample - loss: 0.2935 -
accuracy: 0.9065 - val_loss: 0.1064 - val_accuracy: 0.9629
Epoch 2/20
37800/37800 [==============================] - 75s 2ms/sample - loss: 0.0934 -
accuracy: 0.9708 - val_loss: 0.0834 - val_accuracy: 0.9748
Epoch 3/20
37800/37800 [==============================] - 74s 2ms/sample - loss: 0.0653 -
accuracy: 0.9793 - val_loss: 0.0587 - val_accuracy: 0.9821
Epoch 4/20
37800/37800 [==============================] - 74s 2ms/sample - loss: 0.0503 -
accuracy: 0.9845 - val_loss: 0.0615 - val_accuracy: 0.9795
Epoch 5/20
7872/37800 [=====>........................] - ETA: 55s - loss: 0.0369 -
accuracy: 0.9874
In [8]:

model.save('my_digit_recognizer')
new_model=tf.keras.models.load_model('my_digit_recognizer')
y_pred=new_model.predict_classes(X_test1)
In [9]:
sample_submission.head()
submission=pd.DataFrame({'ImageId': sample_submission.ImageId,'Label':y_pred})
submission.to_csv('/kaggle/working/submission.csv',index=False)
check=pd.read_csv('/kaggle/working/submission.csv')
check.head()

Out[9]:

ImageId Label

0 1 2

1 2 0

2 3 9

3 4 9

4 5 3

Let's test the model

In [10]:

X_test_1=X_test.reshape(X_test.shape[0],28,28)
plt.imshow(X_test_1[100])
plt.show()
print('Prediction: ', y_pred[100])

Prediction: 0

You might also like