Assignment 3
• The goal of this assignment is to implement and use gradient descent (and its variants) with
backpropagation for a classification task.
• We strongly recommend that you work on this assignment in teams of 2. Both members
of the team are expected to work together (and not divide the work), since this assignment
is designed with a learning outcome in view.
• You may use Python (numpy and Pandas) for your implementation. If you are using any
other language, please contact the TAs before you proceed.
• All models will be tested in the same environment on Kaggle, so you must submit all your
results on the test data to Kaggle.
• You have to turn in well-documented code along with a detailed report of the results of
your experiments, submitted electronically on Moodle. Typeset your report in LaTeX. Reports
that are not written in LaTeX will not be accepted.
• The report should be precise and concise. Unnecessary verbosity will be penalized.
• You have to check the Moodle discussion forum regularly for updates regarding the assign-
ment.
1 Task
In this assignment you need to implement a feedforward neural network (we strongly rec-
ommend using numpy for all matrix/vector operations). This network will be trained and
tested using the Fashion-MNIST dataset. Specifically, given an input image (28 x 28 = 784
pixels) from the Fashion-MNIST dataset, the network will be trained to classify the image
into 1 of 10 classes.
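To make the task concrete, here is a minimal numpy sketch of a forward pass through such a
network, with sigmoid hidden layers, a softmax output over the 10 classes, and the cross entropy
loss. The layer sizes, the initialization scale and the function names are illustrative assumptions,
not a required interface; images are assumed to be flattened into 784-dimensional vectors.

import numpy as np

np.random.seed(1234)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)            # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Illustrative architecture: 784 -> 100 -> 100 -> 10
sizes = [784, 100, 100, 10]
weights = [0.01 * np.random.randn(m, n) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    # x: (batch_size, 784) array of flattened images
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = sigmoid(a @ W + b)                      # hidden layers
    return softmax(a @ weights[-1] + biases[-1])    # (batch_size, 10) class probabilities

def cross_entropy(probs, y):
    # y: (batch_size,) array of integer labels in 0-9
    return -np.mean(np.log(probs[np.arange(len(y)), y] + 1e-12))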
1.1 Format
Your implementation should support the use of the following hyper-parameters/options:
• --num_hidden (number of hidden layers - this does not include the 784-dimensional
input layer and the 10-dimensional output layer)
• --sizes (a comma-separated list giving the size of each hidden layer)
• --opt (the optimization algorithm to be used: gd, momentum, nag, adam - you will be
implementing the mini-batch version of these algorithms; see the update sketch after this list)
• --batch_size (the batch size to be used - valid values are 1 and multiples of 5)
• --anneal (if true, the algorithm should halve the learning rate if at any epoch the
validation loss increases, and then restart that epoch)
• --save_dir (the directory in which the pickled model should be saved - by model we
mean all the weights and biases of the network)
• --expt_dir (the directory in which the log files will be saved - see below for a detailed
description of which log files should be generated)
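For reference, here is a minimal sketch of the parameter update that each of these four options
performs on one mini-batch, for a single parameter array w with gradient g. The hyperparameter
values, the function names and the state-handling convention are illustrative assumptions; the
actual values come from the command line.

import numpy as np

eta, gamma = 0.01, 0.9                  # illustrative learning rate and momentum
beta1, beta2, eps = 0.9, 0.999, 1e-8    # illustrative Adam constants

def gd_step(w, g):
    # plain mini-batch gradient descent
    return w - eta * g

def momentum_step(w, g, v):
    # v is the running velocity, initialized to zeros and carried across mini-batches
    v = gamma * v + eta * g
    return w - v, v

def nag_step(w, grad_fn, v):
    # Nesterov accelerated gradient: the gradient is evaluated at the look-ahead point
    g = grad_fn(w - gamma * v)
    v = gamma * v + eta * g
    return w - v, v

def adam_step(w, g, m, s, t):
    # m and s are running first/second moment estimates, t is the update count (starting at 1)
    m = beta1 * m + (1 - beta1) * g
    s = beta2 * s + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)
    s_hat = s / (1 - beta2 ** t)
    return w - eta * m_hat / (np.sqrt(s_hat) + eps), m, s

Each variant keeps its own state across mini-batches: a velocity for momentum and NAG, and
first and second moment estimates (plus a step counter) for Adam.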
You should use the argparse module in Python for parsing these parameters (see the sketch below).
You need to submit the source code for the assignment. Your code should include one
file called train.py which should be runnable using the following command:
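A plausible form of this command (the specific values, and the --lr and --momentum flag names,
are illustrative rather than prescribed):

python train.py --lr 0.01 --momentum 0.9 --num_hidden 3 --sizes 100,100,100 --activation sigmoid --loss ce --opt adam --batch_size 20 --anneal true --save_dir save/ --expt_dir expt/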
Of course, the actual values for the options can change (for example, we could try dif-
ferent learning rates, momentum, optimization algorithms, etc.). For the remainder of this
document we will assume that the above command is saved as a shell script named run.sh (from
now on, whenever we refer to run.sh, we mean the above command).
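A minimal sketch of how train.py might declare the options above using argparse (the default
values, and the --lr and --momentum options, are illustrative assumptions):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--lr", type=float, default=0.01)        # learning rate (illustrative flag)
parser.add_argument("--momentum", type=float, default=0.9)   # momentum (illustrative flag)
parser.add_argument("--num_hidden", type=int, default=3)
parser.add_argument("--sizes", type=str, default="100,100,100")
parser.add_argument("--activation", type=str, default="sigmoid")
parser.add_argument("--loss", type=str, default="ce")
parser.add_argument("--opt", type=str, default="adam")
parser.add_argument("--batch_size", type=int, default=20)
parser.add_argument("--anneal", type=str, default="false")
parser.add_argument("--save_dir", type=str, default="save/")
parser.add_argument("--expt_dir", type=str, default="expt/")
args = parser.parse_args()

hidden_sizes = [int(s) for s in args.sizes.split(",")]        # e.g. [100, 100, 100]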
1.2 Logging
Your code should write log files in the expt dir, recording the loss and error on the training
and validation data as training proceeds. Every 100 steps, a line in the following format should
be appended:
Epoch 0, Step 100, Loss: <value>, Error: <value>, lr: 0.01
Epoch 0, Step 200, Loss: <value>, Error: <value>, lr: 0.01
Epoch 0, Step 300, Loss: <value>, Error: <value>, lr: 0.01
Epoch 0, Step 400, Loss: <value>, Error: <value>, lr: 0.01
...
Epoch 1, Step 100, Loss: <value>, Error: <value>, lr: 0.01
...
Epoch 2, Step 100, Loss: <value>, Error: <value>, lr: 0.01
...
...
Epoch <max_epoch>, Step 100, Loss: <value>, Error: <value>, lr: 0.01
...
(where lr is the learning rate, which may change if you use annealing)
(Error is a real value between 0 and 100; round it off to two decimal places)
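A short sketch of how such a line might be written (the file name and the helper function are
illustrative, not prescribed):

def log_line(path, epoch, step, loss, error, lr):
    # error is a percentage in [0, 100], rounded to two decimal places as required
    with open(path, "a") as f:
        f.write("Epoch {}, Step {}, Loss: {:.4f}, Error: {:.2f}, lr: {}\n".format(
            epoch, step, loss, error, lr))

# e.g. log_line("expt/log_train.txt", 0, 100, 0.6931, 12.34, 0.01)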
1.3 Predictions
In addition, your code should also generate a test_submission.csv file in the expt dir. Each
line of this file should contain an image ID from the test set and the predicted label (0-9) of
the corresponding test image. The file must also have column headers, namely id and label.
As the test data has 10K examples, your submission file must have 10K lines. Here is what
a sample test_submission.csv file would look like:
id,label
0,3
1,2
3,8
...
9999,4
A sample submission file has been provided along with the dataset.
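One way such a file could be written from an array of predicted labels (a hedged sketch; the
function and variable names are illustrative):

import csv

def write_submission(path, predictions):
    # predictions: 10,000 integer labels (0-9), indexed by test image id
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "label"])
        for i, label in enumerate(predictions):
            writer.writerow([i, int(label)])

# e.g. write_submission("expt/test_submission.csv", predictions)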
Note: You will have to initialize the weights of the network randomly. To ensure repli-
cability of your results, make sure that you set a seed of 1234 [numpy.random.seed(1234)]
before initializing the weights and biases. If you don't do this, we may get very different
results when we run your code after submission.
1.4 Kaggle
You can make a limited number of submissions on the portal every day. These will be evaluated
on 30% of the test data. Until Feb 15, your first-step evaluation scores will be visible on the
Public leaderboard. On Feb 15, you can
select 2 of your best submissions to get evaluated in the second step. By default, your best 2
submissions will be taken for evaluation in the second step. After the deadline, your scores
in the second step of evaluation will be visible on the Private leaderboard.
1.5 Report
You need to submit a report along with the assignment. The report should contain plots
comparing the training and validation loss across epochs for the following (a minimal plotting
sketch is given at the end of this list).
1. varying the size of the hidden layer (50, 100, 200, 300) - [keeping just one hidden layer]
2. varying the size of the hidden layer (50, 100, 200, 300) - [with two hidden layers and
the same size for each hidden layer]
3. varying the size of the hidden layer (50, 100, 200, 300) - [with three hidden layers and
the same size for each hidden layer]
4. varying the size of the hidden layer (50, 100, 200, 300) - [with four hidden layers and
the same size for each hidden layer]
For all of the above 4 cases you will use sigmoid activation, cross entropy loss, Adam,
batch size 20, and tune the learning rate to get the best results. For each of the 4 questions
above you need to draw the following plots:
• x-axis: number of epochs, y-axis: training loss [4 curves - each curve corresponding
to one of the four hidden sizes mentioned above]
• x-axis: number of epochs, y-axis: validation loss [4 curves - each curve corre-
sponding to one of the four hidden sizes mentioned above]
5. Adam, NAG, Momentum, GD [with 3 hidden layers and each layer having 300 neurons]
(again sigmoid activation, cross entropy loss, batch size 20; tune the learning rate and
momentum, wherever applicable, to get the best results). You need to make the following
plots:
• x-axis: number of epochs, y-axis: training loss [4 curves - each curve corresponding
to one of the four optimization methods mentioned above]
• x-axis: number of epochs, y-axis: validation loss [4 curves - each curve corre-
sponding to one of the four optimization methods mentioned above]
6. sigmoid v/s tanh activation [Adam, 2 hidden layers, 100 neurons in each layer, batch
size 20, cross entropy loss]. You need to make the following plots:
• x-axis: number of epochs, y-axis: training loss [2 curves - each curve corresponding
to one of the two activation functions mentioned above]
• x-axis: number of epochs, y-axis: validation loss [2 curves - each curve corre-
sponding to one of the two activation functions mentioned above]
7. cross entropy loss v/s squared error loss [Adam, 2 hidden layers, 100 neurons in each
layer, batch size 20, sigmoid activation]. You need to make the following plots:
• x-axis: number of epochs, y-axis: training loss [2 curves - each curve corresponding
to one of the two loss functions mentioned above]
• x-axis: number of epochs, y-axis: validation loss [2 curves - each curve corre-
sponding to one of the two loss functions mentioned above]
8. Batch size: 1, 20, 100, 1000 [Adam, 2 hidden layers, 100 neurons in each layer, sigmoid
activation, cross entropy loss]. You need to make the following plots:
• x-axis: number of epochs, y-axis: training loss [4 curves - each curve corresponding
to one of the 4 batch sizes mentioned above]
• x-axis: number of epochs, y-axis: validation loss [4 curves - each curve corre-
sponding to one of the 4 batch sizes mentioned above]
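A minimal matplotlib sketch of one such plot; the loss histories and the output file name are
placeholders, and the per-epoch losses are assumed to have been recorded already:

import matplotlib.pyplot as plt

def plot_curves(histories, ylabel, out_file):
    # histories: dict mapping a legend label (e.g. "hidden size 50") to a list of per-epoch losses
    for label, losses in histories.items():
        plt.plot(range(len(losses)), losses, label=label)
    plt.xlabel("number of epochs")
    plt.ylabel(ylabel)
    plt.legend()
    plt.savefig(out_file)
    plt.close()

# e.g. plot_curves({"hidden size 50": [2.3, 1.1, 0.8]}, "training loss", "train_loss.png")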
1.6 Evaluation
We anticipate that some of you may not be able to support all values of hyperparameters
mentioned above. To help us evaluate only those options which are supported by your code
you need to submit a file (supported.txt) listing the supported options for the following
hyperparameters: anneal, opt, loss, activation. For example, if you have supported all
possible options for these hyperparameters, then supported.txt will contain the following
contents:
--anneal: true,false
--opt: gd,momentum,nag,adam
--loss: sq,ce
--activation: tanh,sigmoid
However, if your code does not support the tanh activation, adam, or squared error loss,
then supported.txt will contain the following contents:
--anneal: true,false
--opt: gd,momentum,nag
--loss: ce
--activation: sigmoid
Your final submission should include the following files:
- train.py
- run.sh (containing the best hyperparameters setting)
- any other python scripts that you have written
- supported.txt (as described above)
- report (in LaTeX) of the results of your experiments
- Kaggle_subs/ (folder containing all the Kaggle submissions)
• Your task is to achieve an error rate of 8% on the test data. You will be evaluated
based on how close you get to achieving this error rate.
• Along with the source code you need to submit a run.sh file containing the command
(with hyperparameters) that gave you the lowest error rate on the public test set.
• In addition, we will also run your code using different hyperparameter configurations
(for example, number of hidden layers, size of hidden layers, etc.). You will then be
evaluated based on how good/bad your performance is compared to the performance
of others on different hyperparameter configurations.
• And of course, you will also be evaluated based on which of the specified hyperparam-
eters are supported correctly by your code.