The CS540 Spring 2025 Homework 7 focuses on implementing and training the LeNet-5 convolutional neural network using PyTorch, while exploring hyper-parameter variations and counting trainable parameters. Students will work with the CIFAR100 dataset and utilize provided helper code to facilitate the training and evaluation process. The assignment requires submission of the implemented code and results, with a deadline of April 8th, 2025.

Uploaded by

ishita.kapoor0712

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

2 views

CS540_Spring_2025_Homework_7

Uploaded by

ishita.kapoor0712

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 7

CS540 Spring 2025 Homework 7

1 Assignment Goals
• Implement and train LeNet-5 [3], a simple convolutional neural network (CNN).

• Understand and count the number of trainable parameters in the model.

• Explore different training configurations by varying hyper-parameters like batch size, learning rate and
number of training epochs.
• Design and customize your own deep network for scene recognition.

2 Summary
Your implementation in this assignment might take one or two hours to run. We highly recommend
that you start working on this assignment early! In this homework, we will explore building deep neural
networks, specifically Convolutional Neural Networks (CNNs), using PyTorch. Helper code is provided in this
assignment. If you have compute limitations, you can train and evaluate your model on CSL servers (see
instructions in HW6 and at the end of this assignment). Alternatively, if you have a sufficiently powerful
CPU, you can also run this on your own machine. Go through the submission details in Section 7 carefully.

3 Packages Needed for this Project

You are only allowed to use the Python 3 standard library as well as numpy, torch, torchvision, and tqdm. We
recommend using the conda package manager as suggested in the PyTorch tutorial to easily set up the correct
environment on the CSL machine.

3.1 Install Conda

Documentation reference (Miniconda) here.

3.1.1 On CSL

>>> ssh <userid>@best-linux.cs.wisc.edu

>>> mkdir -p ~/miniconda3
>>> wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O
~/miniconda3/miniconda.sh
>>> bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
>>> rm ~/miniconda3/miniconda.sh
>>> source ~/miniconda3/bin/activate
>>> conda init --all
You may need to restart your terminal or try source ~/.bashrc.

3.1.2 On your machine
Alternatively, if you can run the code on your own machines, follow the instructions depending on your OS:
• MacOS
• Windows
• Linux

3.2 Use conda to set up environment for this HW

Installing the PyTorch environment may take a while:
>>> conda create -n "cse540-hw7" pytorch torchvision torchaudio anaconda::tqdm cpuonly
-c pytorch
>>> conda activate cse540-hw7

4 Dataset
You will implement LeNet and design your own CNN model on CIFAR100 [2], a scene recognition dataset
from Alex Krizhevsky and Geoffrey Hinton (Nobel Prize winner!) at the University of Toronto. The
CIFAR100 dataset has 100 classes containing 600 tiny images each. There are 500 training images and 100
testing images per class. Each image is 32x32 in size. You can always assume this input resolution.

Figure 1: Examples of images in the CIFAR100 dataset.

4.1 Helper Code

We provide helper methods in train_cifar100.py, eval_cifar100.py and dataloader.py, and skeleton
code in student_code.py. See comments in these files for more details.
Before beginning the training procedure, we define the dataloader, model, optimizer, image transforms
and criterion. We use train_model() and test_model() methods for training and evaluating the CNN,
which is similar to what we did in HW6. We provide the helper code on Canvas (HW7.zip). Download and
unzip this file.
If you are using CSL, here is how to copy HW7.zip contents to the CSL server:

>>> unzip HW7.zip
>>> scp -r HW7 <userid>@best-linux.cs.wisc.edu:<path_you_want>
Our data loader will try to download the full dataset the first time you run python train_cifar100.py,
and you should see:
>>> Loaded trainset: 1563
>>> Loaded testset: 313
>>> ...
>>> ValueError: optimizer got an empty parameter list
The ValueError occurs because you have not yet written the code for LeNet() in student_code.py. Think: What
do the numbers 1563 and 313 represent? Hint: how do the dataloaders load the data? If this process works,
skip to Section 5.
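
One reading that is consistent with the printed numbers, sketched below: the length of a PyTorch DataLoader is a number of batches, not a number of images. The batch size of 32 is an assumption (this handout does not state the default), so check the helper code for the actual value.

```python
import math

# Dataset sizes from Section 4: 100 classes x 500 training / 100 testing images.
NUM_TRAIN = 100 * 500   # 50,000 images
NUM_TEST = 100 * 100    # 10,000 images

# Assumption: the default batch size is 32 (not stated in this handout).
BATCH_SIZE = 32


def num_batches(num_samples: int, batch_size: int) -> int:
    """Number of batches when the last, smaller batch is kept."""
    return math.ceil(num_samples / batch_size)


train_batches = num_batches(NUM_TRAIN, BATCH_SIZE)
test_batches = num_batches(NUM_TEST, BATCH_SIZE)
print(train_batches, test_batches)
```

Under that assumption the two values match the "Loaded trainset" and "Loaded testset" lines shown above.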

Backup setup: If the CIFAR100 website is down, you will get a download error. Follow these instructions
to set up the dataset manually:

1. Download the backup CIFAR100 zip file from this link

2. If you are using CSL Server: Copy the downloaded zip file to the CSL server via scp. Unpack the
CIFAR100 zip file (data.zip) by running the command below.
>>> scp data.zip <userid>@best-linux.cs.wisc.edu:<path_you_want>
>>> unzip data.zip

If your CSL account does not have the unzip command installed, do the following:
>>> sudo apt-get install unzip

3. If you are using your own machine: Unpack the zip file by running the command below.
>>> unzip data.zip

4. Now, you will have a data/ directory, which contains cifar-100-python.tar.gz. You can then run
train_cifar100.py, and our code will generate the following: (1) data/cifar100/images/ directory,
(2) data/cifar100/train.txt label file, and (3) data/cifar100/test.txt label file.
Once you have this all set up, you can delete data.zip.

5 Program Specification
Implement the following in student_code.py:
1. class LeNet(): define the network layers in __init__() and the forward process in forward().

2. count_model_params(): return the number of trainable parameters in the model

5.1 Creating LeNet-5 ([30] points)

Background: LeNet [3] was one of the first convolutional neural networks (CNNs), and its success was
foundational in furthering research into deep learning for computer vision. While we are implementing an
existing architecture, it might be helpful to think about why early researchers chose certain kernel sizes,
padding, and strides which we learned about conceptually in class.

Implementation: In LeNet-5, you should use the following layers in this order (see Figure 2):
1. One conv layer with 6 output channels, kernel size = 5, stride = 1, followed by a ReLU [1, 4]
activation and a 2D max pool operation (kernel size = 2 and stride = 2).
2. One conv layer with 16 output channels, kernel size = 5, stride = 1, followed by a ReLU activation
and a 2D max pool operation (kernel size = 2 and stride = 2).
3. A flatten layer to convert the 3D tensor to a 1D tensor.
4. A linear layer with output dimension = 256, followed by a ReLU activation.
5. A linear layer with output dimension = 128, followed by a ReLU activation.

6. A linear layer with output dimension = number of classes (in our case, 100).

Figure 2: LeNet-5 Architecture.
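
To see how the spatial size changes through stages 1 and 2, the standard output-size formula for convolution and pooling windows can be applied by hand. The sketch below assumes padding = 0 for every layer, which the list above does not state explicitly; the helper names are made up for illustration.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_spatial_sizes(size: int = 32) -> list:
    """Follow one spatial axis through stages 1 and 2 (padding = 0 assumed)."""
    sizes = [size]
    for _ in range(2):                               # two conv + pool stages
        size = conv_out(size, kernel=5, stride=1)    # 5x5 conv, stride 1
        size = conv_out(size, kernel=2, stride=2)    # 2x2 max pool, stride 2
        sizes.append(size)
    return sizes


sizes = trace_spatial_sizes(32)
flat_features = 16 * sizes[-1] * sizes[-1]           # 16 channels after stage 2
print(sizes, flat_features)
```

The last value is the number of features per image after the flatten layer under these assumptions, which is the input dimension the first linear layer would need.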

Implement the model with the LeNet() class in student_code.py. You are expected to create the model
following this PyTorch tutorial, which is different from using nn.Sequential() as we did in the last HW. In
addition, given a batch of inputs with shape [N, C, H, W], where N is the batch size, C is the number of input channels,
and H, W are the height and width of the image (both 32 in our case), you are expected to return both
the output of the model (torch.Tensor) along with the shape of the intermediate outputs for the above 6
stages. The shape should be a Python dictionary with the keys = [1, 2, 3, 4, 5, 6] (integers) denoting each
stage, where the corresponding value is a list that denotes the shape of the intermediate outputs.

Hint: The expected model has the following form:

class LeNet(nn.Module):
    def __init__(self, input_shape=(32, 32), num_classes=100):
        super(LeNet, self).__init__()
        # certain definitions

    def forward(self, x):
        shape_dict = {}
        # certain operations
        return out, shape_dict
shape_dict should have the following form: {1: [a, b, c, d], 2:[e, f, g, h], ... , 6: [x, y]}
The linear layers and conv layers have bias terms. You need to use torch.nn.Conv2d to create a convolutional
layer. Its parameters allow us to specify the details of the conv layer, e.g., the input/output
dimensions, padding, stride, and kernel size. More information can be found in the documentation.

5.2 Count the number of trainable parameters of LeNet-5 ([20] points)
Background: As discussed in the lecture, fully connected models (like what we created in HW6) are
dense, with many trainable parameters. After finishing this section, think about the number of parameters
(also sometimes called model size) in the CNN model compared to the number of parameters in a fully
connected model of similar depth (similar number of layers). Especially, how does the difference in size
impact efficiency and accuracy?

Implementation: In this part, you are expected to return the number of trainable parameters of the
LeNet model you created in Section 5.1. You have to fill in count_model_params() in student_code.py.
The function output should be in units of millions (1e6), i.e., how many millions of trainable parameters
are in your implementation of LeNet. Please do not use any external libraries (see Section 3) which directly
calculate the number of parameters (other libraries, such as NumPy, can be used as helpers).

Hint: You can use the model.named_parameters() to get the name and corresponding parameters of a
torch model. Please do not round your result.
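
As a cross-check on whatever count_model_params() returns, the count for a layer with bias terms can also be worked out by hand. The sketch below uses toy layer sizes chosen only for illustration (they are not the LeNet-5 layers), and the helper names are hypothetical.

```python
def conv2d_params(in_ch: int, out_ch: int, kernel: int, bias: bool = True) -> int:
    """Weights (out_ch * in_ch * k * k) plus one bias per output channel."""
    return out_ch * in_ch * kernel * kernel + (out_ch if bias else 0)


def linear_params(in_features: int, out_features: int, bias: bool = True) -> int:
    """Weights (out * in) plus one bias per output unit."""
    return out_features * in_features + (out_features if bias else 0)


# Toy layer sizes for illustration only; these are not the LeNet-5 layers.
total = conv2d_params(in_ch=3, out_ch=8, kernel=3) + linear_params(10, 5)
total_in_millions = total / 1e6   # same unit the assignment asks for, unrounded
print(total, total_in_millions)
```

Summing the same two formulas over your own layers should agree with the total obtained by iterating over model.named_parameters().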

5.3 Training LeNet-5 under different configurations ([50] points)

Background: A large part of creating neural networks is designing the architecture (part 1). However,
there are other ways of tuning the neural net to change its performance. In this section, we can see how
batch size, learning rate, and number of epochs impact how well the model learns. As you get your results,
it might be helpful to think about how and why the changes have impacted the training.

Implementation: Based on the LeNet-5 model created in Section 5.1, in this section you are expected to
train the LeNet-5 model under different configurations. You will use similar implementations of train_model
and test_model as you did for HW6 (which we provide in student_code.py). When you run the training
script train_cifar100.py, the script will save two files in the outputs/ folder.
• checkpoint.pth.tar is the model checkpoint at the latest epoch.
• model_best.pth.tar contains the model weights that have the highest accuracy on the validation set.
Our code supports resuming from a previous checkpoint, so you can pause training and resume later.
This can be achieved by running python train_cifar100.py --resume ./outputs/checkpoint.pth.tar.
This is also very helpful if your training is interrupted for any reason.

Evaluation: After training, you can evaluate your model on the val set with the eval_cifar100.py script
we provide. This script will grab a pre-trained model and evaluate it on the val set of 10K images. For
example, you can run python eval_cifar100.py --load ./outputs/model_best.pth.tar. The output
shows the validation accuracy and also the model evaluation time in seconds (see an example below).
=> Loading from cached file ./data/cifar100/cached_val.pkl
=> loading checkpoint './outputs/model_best.pth.tar'
=> loaded checkpoint './outputs/model_best.pth.tar' (epoch x)
Evaluting the model ...
[Test set] Epoch: xx, Accuracy: xx.xx%
Evaluation took 2.26 sec
You can run this script a few times to see the average runtime of your model. Please train the model under
the following configurations:

1. The default configuration provided in the code, which means you do not have to make modifications.
2. Set the batch size to 8; the remaining hyper-parameters are the same as in the default configuration.
3. Set the batch size to 16; the remaining hyper-parameters are the same as in the default configuration.
4. Set the learning rate to 0.05; the remaining hyper-parameters are the same as in the default configuration.
5. Set the learning rate to 0.01; the remaining hyper-parameters are the same as in the default configuration.
6. Set the epochs to 20; the remaining hyper-parameters are the same as in the default configuration.
7. Set the epochs to 5; the remaining hyper-parameters are the same as in the default configuration.

After training, you are expected to get the validation accuracy using the best model (model_best.pth.tar),
then save the output into a results.txt file, where the accuracy of each configuration above is placed on its
own line, in order. Your .txt file will end up looking like this:
11.11
22.22
33.33
...
These exact accuracies will probably not match your results. They are just for illustration purposes.
Follow the submission details in Section 7.
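
The results.txt layout described above can be produced with a few lines of standard-library Python. This is only a sketch: write_results is a hypothetical helper, the two-decimal formatting is an assumption based on the sample values, and the numbers are placeholders rather than real results.

```python
import tempfile
from pathlib import Path


def write_results(accuracies, path, expected_count=7):
    """Write one validation accuracy per line, in configuration order."""
    if len(accuracies) != expected_count:
        raise ValueError(
            f"expected {expected_count} accuracies, got {len(accuracies)}"
        )
    lines = [f"{acc:.2f}" for acc in accuracies]   # two decimals is an assumption
    Path(path).write_text("\n".join(lines) + "\n")
    return lines


# Placeholder values only, not real results; written to a temp directory here.
placeholder = [11.11, 22.22, 33.33, 44.44, 55.55, 66.66, 77.77]
out_path = Path(tempfile.gettempdir()) / "results_example.txt"
written = write_results(placeholder, out_path)
```

Checking the count before writing guards against accidentally skipping one of the seven configurations.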

6 Help with Training

6.1 Profiling Your Model (Optional)
You might find that the training or evaluation of your model is a bit slower than expected. Fortunately,
PyTorch has its own profiling tool. Here is a quick tutorial of using PyTorch profiler. You can easily inject
the profiler into train_cifar100.py to inspect the runtime and memory consumption of different parts of your
model. A general principle is that a deep (many layers) and wide (many feature channels) network will train
much slower. It is your design choice to balance between efficiency and accuracy.

6.2 Training on CSL

You are required to train and evaluate your models on the CSL machines. You should find a way to allow
your remote session to remain active if you are disconnected from CSL. In this case, we recommend using
tmux, a terminal multiplexer for Unix-like systems. tmux is already installed on CSL. To use tmux, simply
type tmux in the terminal. You can then run your code in the tmux session, and the session will remain active
even if you are disconnected.
• If you want to detach a tmux session without closing it, press "ctrl + b" then "d" (detach) within a
tmux session. This will exit to the terminal while keeping the session active. Later you can re-attach
the session.

• If you want to enter an active tmux session, type "tmux a" to attach to the last session in the terminal
(outside of tmux).
• If you want to close a tmux session, press "ctrl + b" then "x" (exit) within a tmux session. You
won’t be able to enter this session again. Please make sure that you close your tmux sessions after this
assignment.

• Here is a brief tutorial about this powerful tool, tmux.

7 Submission Notes
Please submit two files named student_code.py and results.txt (from Section 5.3) to Gradescope. Do
not submit a Jupyter notebook .ipynb file. Be sure to remove all debugging output before submission.
Failure to remove debugging output may be penalized.

• No code should be put outside the function definitions (except for import statements; helper functions
are allowed).
• The validation accuracy for different configurations must be in .txt format and named results.txt.
This assignment is due on April 8th at 11:59 PM. We highly recommend starting early. We also
suggest submitting a version well before the deadline (at least one hour before) and checking
the content/format of the submission to make sure it’s the right version. You can then update your submission
until the deadline, if needed.

Good luck and happy (deep) learning!

References
[1] Alston S Householder. A theory of steady-state activity in nerve-fiber networks: I. definitions and
preliminary lemmas. The bulletin of mathematical biophysics, 3:63–69, 1941.

[2] Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009.
[3] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to
document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
[4] Vinod Nair and Geoffrey E Hinton. Rectified linear units improve restricted boltzmann machines. In
Proceedings of the 27th international conference on machine learning (ICML-10), pages 807–814, 2010.
