Assignment 4: Neural Networks
Goal: Get familiar with neural networks by implementing them and applying them to image classification.
In this assignment we are going to learn about neural networks (NNs). The goal is to implement two
neural networks, a fully-connected neural network and a convolutional neural network, and to analyze
their behavior.
The task we consider is image classification. We use a dataset of small natural images (see the
additional file) with multiple classes. We formulate a model (a neural network) and train it with the
negative log-likelihood (i.e., the cross-entropy loss) as the objective function and stochastic
gradient descent as the optimizer.
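As a rough illustration of that setup, here is a minimal sketch of a few gradient steps with the negative log-likelihood and SGD. The data is random, and the sizes (64 inputs, 10 classes), the learning rate and the single linear layer are assumptions made only for this example. They are not the assignment's model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in data (assumption): 32 flattened 8x8 "images", 10 classes.
x = torch.randn(32, 64)
y = torch.randint(0, 10, (32,))

# A tiny classifier whose output is a vector of log-probabilities.
model = nn.Sequential(nn.Linear(64, 10), nn.LogSoftmax(dim=1))

# NLLLoss expects log-probabilities, which is why the model ends in LogSoftmax.
nll = nn.NLLLoss(reduction='mean')
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for step in range(20):
    optimizer.zero_grad()      # clear gradients from the previous step
    loss = nll(model(x), y)    # average negative log-likelihood of the batch
    loss.backward()            # backpropagate
    optimizer.step()           # gradient descent update
    losses.append(loss.item())

print(losses[0], losses[-1])
```

On this toy problem the loss after the last step should be lower than after the first one.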
In the second part, you are asked to implement the whole pipeline for a given dataset by yourself.
Please run the code below and take some time to examine the images.
If any line of code is unclear to you, please look it up in the NumPy, SciPy, matplotlib and PyTorch docs.
assignment_4 22/05/2022, 23:45
In [48]:
import os
import sys
!{sys.executable} -m pip install torch
import numpy as np
import matplotlib.pyplot as plt
import torch
from sklearn.datasets import load_digits
from sklearn import datasets
from torch.utils.data import Dataset, DataLoader
import torch.nn as nn
import torch.nn.functional as F
EPS = 1.e-7
In [49]:
# IF YOU USE COLAB, THIS IS VERY USEFUL! OTHERWISE, PLEASE REMOVE IT.
# mount drive: WE NEED IT FOR SAVING IMAGES!
#from google.colab import drive
#drive.mount('/content/gdrive')
In [50]:
# IF YOU USE COLAB, THIS IS VERY USEFUL! OTHERWISE, PLEASE REMOVE IT.
# PLEASE CHANGE IT TO YOUR OWN GOOGLE DRIVE!
#results_dir = '/content/gdrive/My_Drive/Colab Notebooks/TEACHING/'
In [51]:
        self.transforms = transforms

    def __len__(self):
        return len(self.data)
In [52]:
for i in range(4):
    for j in range(4):
        img = np.reshape(x[4*i+j],(8,8))
        axs[i,j].imshow(img, cmap='gray')
        axs[i,j].axis('off')
NOTE: Please pay attention to the inputs and outputs of each function.
Below, we have two helper modules (layers) that can be used to reshape and flatten a tensor. They are
useful for building nn.Sequential models that contain convolutional layers.
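The cell with these helpers is not reproduced here, so the following is only a sketch of what such modules could look like. The names `Reshape` and `Flatten` match how they are used later in the notebook, but the bodies are an assumption.

```python
import torch
import torch.nn as nn


class Reshape(nn.Module):
    """Turn a batch of flat vectors into tensors of shape (batch, *size)."""

    def __init__(self, size):
        super().__init__()
        self.size = tuple(size)  # e.g. (1, 8, 8): channels, height, width

    def forward(self, x):
        return x.reshape(x.shape[0], *self.size)


class Flatten(nn.Module):
    """Turn a batch of tensors into flat vectors of shape (batch, -1)."""

    def forward(self, x):
        return x.reshape(x.shape[0], -1)


batch = torch.zeros(5, 64)               # five flattened 8x8 images
images = Reshape(size=(1, 8, 8))(batch)  # shape (5, 1, 8, 8)
flat = Flatten()(images)                 # shape (5, 64)
```

Placing `Reshape` first lets a `Conv2d` layer receive the 4D input it expects, and `Flatten` converts the feature maps back into vectors for the linear layers.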
In [53]:
In [54]:
#=========
# GRADING:
# 0
# 0.5 pt if code works but it is explained badly
# 1.0 pt if code works and it is explained well
#=========
# Implement a neural network (NN) classifier.
class ClassifierNeuralNet(nn.Module):
    def __init__(self, classnet):
        super(ClassifierNeuralNet, self).__init__()
        # We provide a sequential module with layers and activations
        self.classnet = classnet
        # The loss function (the negative log-likelihood)
        self.nll = nn.NLLLoss(reduction='none')  # it requires log-softmax as input!!

        return y_pred
Question 1 (0-0.5pt): What is the objective function for a classification task? In other words, what is
nn.NLLLoss in the code above? Please write it in mathematical terms.
Answer: In this case, the objective function is the negative log likelihood function.
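In mathematical terms, this objective can be written as follows. The notation is an assumption made for this note: $\pi_k(x_n; \theta)$ is the softmax output of the network for class $k$, $K$ is the number of classes, and $N$ is the number of examples. With `reduction='none'`, `nn.NLLLoss` returns the per-example terms $\ell_n$. Averaging them over the batch is assumed here.

```latex
\ell_n(\theta) = -\log p_\theta(y_n \mid x_n)
               = -\sum_{k=1}^{K} [y_n = k] \, \log \pi_k(x_n; \theta),
\qquad
\mathcal{L}(\theta) = \frac{1}{N} \sum_{n=1}^{N} \ell_n(\theta)
```

Here $[y_n = k]$ equals 1 if the true label of example $n$ is $k$ and 0 otherwise, so only the log-probability of the correct class contributes to each term.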
Question 2 (0-0.5pt): In the code above, it is said to use the logarithm of the softmax as the final
activation function. Is it correct to use the log-softmax instead of the softmax for making predictions (i.e.,
picking the most probable label)?
Answer: Yes, it is fine. The logarithm is strictly increasing, so it preserves the ordering of the class
probabilities: the label with the highest log-probability is also the label with the highest probability.
Only the values change (probabilities become log-probabilities), not the most probable label.
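A small numerical check of this argument, written in plain Python so that it does not depend on the notebook's code. The logits are arbitrary example values.

```python
import math


def softmax(logits):
    """Numerically stable softmax of a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(logits):
    """Numerically stable log-softmax of a list of floats."""
    m = max(logits)
    log_total = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_total for v in logits]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


logits = [2.0, -1.0, 0.5, 3.2]   # example values, chosen arbitrarily
probs = softmax(logits)
log_probs = log_softmax(logits)

# Both activations select the same label.
print(argmax(probs), argmax(log_probs))
```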
2.2 Evaluation
In [55]:
In [56]:
2.4 Experiments
Initialize dataloaders
In [57]:
In [58]:
In [59]:
Initialize hyperparameters
In [60]:
Running experiments
In the code below, you are supposed to implement architectures for MLP and CNN. For properly
implementing these architectures, you can get 0.5pt for each of them.
In [62]:
# MLP
if name[0:14] == 'classifier_mlp':
    #=========
    # GRADING:
    # 0
# CNN
elif name[0:14] == 'classifier_cnn':
    #=========
    # GRADING:
    # 0
    # 0.5pt if properly implemented
    #=========
    #------
    # PLEASE FILL IN:
    size = (1,8,8)
    classnet = nn.Sequential(
        Reshape(size=(1, 8, 8)),
        nn.Conv2d(in_channels=1, out_channels=64, kernel_size=3),
        nn.ReLU(),
        nn.MaxPool2d(kernel_size=2, stride = 1),
        nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3),
        nn.ReLU(),
        nn.MaxPool2d(kernel_size=2, stride = 1),
        nn.Flatten(),
        nn.Linear(M, 10),
        nn.LogSoftmax(dim=1))
    #
    # You are asked here to propose your own architecture
    # NOTE: Please note that the images are represented as vectors, thus, you must
    # use Reshape(size) as the first layer, and Flatten() after all convolutional
    # layers and before linear layers.
    # NOTE: Please remember that the output must be LogSoftmax!
    #------
# Init ClassifierNN
model = ClassifierNeuralNet(classnet)
# Training procedure
nll_val, error_val = training(name=name,
                              max_patience=max_patience,
                              num_epochs=num_epochs,
                              model=model,
                              optimizer=optimizer,
                              training_loader=training_loader,
                              val_loader=val_loader)
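The value `M` passed to `nn.Linear` in the CNN above is not defined in the visible part of the cell. As a sketch, the number of flattened features can be worked out from the standard output-size formula for convolution and pooling layers (assuming dilation 1). The helper `out_size` below is hypothetical and not part of the notebook.

```python
def out_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a Conv2d or MaxPool2d layer (dilation 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


h = 8                                      # input images are 8x8
h = out_size(h, kernel_size=3)             # Conv2d(kernel_size=3)           -> 6
h = out_size(h, kernel_size=2, stride=1)   # MaxPool2d(kernel_size=2, stride=1) -> 5
h = out_size(h, kernel_size=3)             # Conv2d(kernel_size=3)           -> 3
h = out_size(h, kernel_size=2, stride=1)   # MaxPool2d(kernel_size=2, stride=1) -> 2

out_channels = 64                          # channels of the last Conv2d
M = out_channels * h * h                   # features after Flatten
print(M)  # 256
```

Under these assumptions the architecture above needs `M = 256`. Changing a kernel size, stride or padding changes this number, so it is worth recomputing it whenever the convolutional part is modified.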
2.5 Analysis
Question 3 (0-0.5pt): Please compare the convergence of MLP and CNN in terms of the loss function
and the classification error.
Answer: The CNN reaches both a lower value of the loss function and a lower classification error than
the MLP, which means that the CNN performs better than the MLP.
Question 4 (0-0.5pt): In general, for a properly picked architecture, a CNN should work better than an
MLP. Did you notice that? Why are CNNs (in general) better suited to images than MLPs?
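Answer: See Question 3 for the images. CNNs are better suited to images than MLPs because they keep
the image as a 2D grid and apply small filters whose weights are shared across all positions, whereas an
MLP flattens the image into a vector and treats every pixel as an unrelated input. CNNs can therefore
capture spatial relations between neighbouring pixels with far fewer parameters, which allows them to
outperform MLPs, especially on more complex images.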
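One way to make this comparison concrete is to count parameters. The sketch below compares a 3x3 convolution on an 8x8 single-channel image with a fully-connected layer that produces the same number of outputs. The layer sizes are chosen for illustration only.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a Conv2d layer with a square kernel."""
    return out_channels * in_channels * kernel_size * kernel_size + out_channels


def linear_params(in_features, out_features):
    """Weights plus biases of a Linear layer."""
    return in_features * out_features + out_features


# Conv2d(1, 64, kernel_size=3) on an 8x8 image yields 64 feature maps of size 6x6.
conv = conv2d_params(in_channels=1, out_channels=64, kernel_size=3)

# A Linear layer mapping the 64 flattened pixels to the same 64*6*6 outputs.
dense = linear_params(in_features=8 * 8, out_features=64 * 6 * 6)

print(conv, dense)  # 640 149760
```

The convolution reuses the same 3x3 filters at every position, so it needs far fewer parameters than the dense layer for the same output size.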
Please repeat (some of) the code from the previous section and apply a bigger convolutional neural network
(CNN) to the following dataset:
http://ufldl.stanford.edu/housenumbers/
1. (1pt) Create an appropriate Dataset class. Please remember to use the original training data and test
data, and also to create a validation set from the training data (at least 10% of the training examples).
Do not use extra examples!
2. (1pt) Implement an architecture that gives a classification error of at most 0.1. For instance, see this
paper as a reference:
https://arxiv.org/pdf/1204.3968.pdf
3. (1pt) Think of an extra component that could improve the performance (e.g., regularization, specific
activation functions).
4. (1pt) Provide a good explanation of the applied architecture and a description of all components.
5. (2pt) Analyze the results.
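For step 1, a possible starting point is sketched below. It assumes the cropped-digit `.mat` files from the dataset page have already been loaded into NumPy arrays (for example with `scipy.io.loadmat`), with images `X` of shape `(32, 32, 3, N)` and labels `y` of shape `(N, 1)` where the digit 0 is stored as label 10. The class name `SVHNDigits` and the helper `train_val_split` are hypothetical, and random arrays stand in for the real data.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, Subset


class SVHNDigits(Dataset):
    """Wraps SVHN-style arrays: X is (32, 32, 3, N) uint8, y is (N, 1)."""

    def __init__(self, X, y, transforms=None):
        # Reorder to (N, channels, height, width), the layout Conv2d expects.
        self.data = np.ascontiguousarray(np.transpose(X, (3, 2, 0, 1)))
        labels = np.asarray(y).reshape(-1).astype(np.int64)
        labels[labels == 10] = 0  # assumed convention: digit 0 is stored as 10
        self.targets = labels
        self.transforms = transforms

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        sample = torch.from_numpy(self.data[idx]).float() / 255.0  # scale to [0, 1]
        if self.transforms:
            sample = self.transforms(sample)
        return sample, int(self.targets[idx])


def train_val_split(dataset, val_fraction=0.1, seed=0):
    """Randomly hold out a fraction of the training data for validation."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(dataset))
    n_val = int(np.ceil(val_fraction * len(dataset)))
    return Subset(dataset, perm[n_val:].tolist()), Subset(dataset, perm[:n_val].tolist())


# Random stand-in data with the assumed layout (not real SVHN images).
X_fake = np.random.randint(0, 256, size=(32, 32, 3, 20), dtype=np.uint8)
y_fake = np.array([[10]] + [[d] for d in range(1, 10)] * 2 + [[5]])
full_train = SVHNDigits(X_fake, y_fake)
train_set, val_set = train_val_split(full_train, val_fraction=0.1)
```

The test file would be wrapped in the same class and kept aside, so that the validation examples come only from the training data.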
Please be very precise, comment your code and provide a comprehensive and clear analysis.
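For steps 2 and 3, the following sketch shows one possible shape of a larger CNN for 3x32x32 inputs, with batch normalization and dropout as the extra components. The function name `make_svhn_cnn`, the layer widths and the dropout rate are assumptions for illustration. Nothing here is tuned, and no claim is made that it reaches the required error.

```python
import torch
import torch.nn as nn


def make_svhn_cnn(num_classes=10, p_drop=0.3):
    """A small VGG-style CNN for 3x32x32 images that outputs log-probabilities."""
    return nn.Sequential(
        nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
        nn.MaxPool2d(2),                    # 32x32 -> 16x16
        nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
        nn.MaxPool2d(2),                    # 16x16 -> 8x8
        nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.BatchNorm2d(128), nn.ReLU(),
        nn.MaxPool2d(2),                    # 8x8 -> 4x4
        nn.Flatten(),
        nn.Dropout(p_drop),                 # regularization before the dense layers
        nn.Linear(128 * 4 * 4, 256), nn.ReLU(),
        nn.Dropout(p_drop),
        nn.Linear(256, num_classes),
        nn.LogSoftmax(dim=1),               # matches nn.NLLLoss
    )


model = make_svhn_cnn()
model.eval()  # disable dropout and use running batch-norm statistics
with torch.no_grad():
    log_probs = model(torch.randn(4, 3, 32, 32))  # random stand-in batch
```

With `padding=1` each 3x3 convolution keeps the spatial size, so only the pooling layers shrink the feature maps, which makes the input size of the first linear layer easy to track.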
In [ ]: