Machine Learning HW3 - Image Classification
ML TAs
[email protected]
Objective - Image Classification
1. Solve image classification with convolutional neural networks.
2. Improve the performance with data augmentations.
3. Understand popular image model techniques such as residual connections.
Task Introduction - Food Classification
● The images come from the food-11 dataset and are classified into 11
classes.
● Training set: 9866 labeled images
● Validation set: 3430 labeled images
● Testing set: 3347 images
Rules
● DO NOT attempt to find the original labels of the testing set.
● DO NOT use any external datasets.
● DO NOT use any pretrained models.
○ Also, do not attempt to “test how effective pretraining is” by submitting to Kaggle.
Pretraining is very effective; you may test it after the competition ends.
● You may use any publicly available packages/code
○ But make sure you do not use pretrained models; most publicly available code does.
○ You may not make your code/checkpoints publicly available during the timespan
of this homework.
Baseline
Simple : 0.50099
Submission format: both fields should be strings. Id corresponds to the .jpg
filenames in the test set; see the sketch below. Follow the sample code if you
have trouble with formatting.
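A minimal sketch of writing the submission file with pandas. The Id column name is taken from the text above; the "Category" column name and the placeholder values are assumptions for illustration, so check them against the sample code.

import pandas as pd

# Placeholder values; real Ids are the test .jpg filenames.
ids = ["0001", "0002"]
preds = [3, 10]
# Both columns are written as strings, as required.
df = pd.DataFrame({"Id": ids, "Category": [str(p) for p in preds]})
df.to_csv("submission.csv", index=False)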
Model Selection
● Visit torchvision.models for a list of model structures, or go to timm for
the latest model structures.
● Pretrained weights are not allowed; specifically, set pretrained=False to
ensure that this guideline is met (see the sketch below).
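A minimal sketch of instantiating a torchvision model without pretrained weights. The choice of ResNet-18 is an assumption for illustration; any structure from torchvision.models or timm works the same way.

import torch.nn as nn
import torchvision.models as models

model = models.resnet18(pretrained=False)        # pretrained must stay False per the rules
model.fc = nn.Linear(model.fc.in_features, 11)   # food-11 has 11 classes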
Data Augmentation
● Modify the image data so that non-identical inputs are given to the model
each epoch, to prevent the model from overfitting.
● Visit torchvision.transforms for a list of choices and their corresponding
effect. Diversity is encouraged! Usually, stacking multiple transformations
leads to better results.
● Coding : fill in train_tfm to gain this effect; one possible version is sketched below.
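One possible train_tfm, assuming 128x128 inputs as in the sample code; the particular transforms and their parameters are illustrative choices, not the official solution.

import torchvision.transforms as transforms

train_tfm = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.RandomHorizontalFlip(),                            # random left-right flip
    transforms.RandomRotation(15),                                # rotate up to 15 degrees
    transforms.ColorJitter(brightness=0.3, contrast=0.3),         # perturb colors
    transforms.ToTensor(),
])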
Advanced Data Augmentation - mixup
[Figure: mixup example. 0.5 × image A + 0.5 × image B = a blended image; the one-hot labels (e.g. [1, 0, …] and [0, 1, …]) are mixed with the same 0.5/0.5 weights.]
● Coding :
● In your torch.utils.data.Dataset, __getitem__() needs to return an
image that is the linear combination of two images.
● In your torch.utils.data.Dataset, __getitem__() needs to return a label
that is a vector, assigning a probability to each class.
● You need to explicitly write out the math formula of the cross-entropy
loss, as CrossEntropyLoss does not support soft labels; a sketch follows.
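A rough sketch of both coding points, assuming the dataset mirrors the sample code's (files, labels, tfm) layout; the class name, the fixed 0.5 mixing weight, and the helper names are assumptions for illustration.

import random
import torch
import torch.nn.functional as F
from PIL import Image
from torch.utils.data import Dataset

class MixupDataset(Dataset):
    def __init__(self, files, labels, tfm, num_classes=11):
        # files/labels mirror the sample code's dataset; tfm must end
        # with ToTensor so two images can be linearly combined.
        self.files, self.labels = files, labels
        self.tfm, self.num_classes = tfm, num_classes

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        j = random.randrange(len(self.files))   # random partner image
        lam = 0.5                               # mixing weight; sampling lam from a Beta distribution is also common
        img = lam * self.tfm(Image.open(self.files[idx])) \
            + (1 - lam) * self.tfm(Image.open(self.files[j]))
        label = torch.zeros(self.num_classes)   # label is now a probability vector
        label[self.labels[idx]] += lam
        label[self.labels[j]] += 1 - lam
        return img, label

def soft_cross_entropy(logits, target):
    # cross entropy written out explicitly for probability-vector targets
    return -(target * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()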
Test Time Augmentation
● The sample code tests images using a deterministic “test transformation”
● You may use the train transformation for a more diversified
representation of the images, and predict with multiple variants of the
test images.
● Coding : You need to fill in train_tfm, change the augmentation
method for test_dataset, and modify the prediction code to gain this effect.
[Figure: several train_tfm variants of each test image plus the test_tfm version are each predicted, and the predictions are ensembled into one final prediction.]
● Usually, test_tfm will produce images that are more identifiable, so you
can assign a larger weight to the test_tfm results for better performance;
see the sketch below.
[Figure: final prediction = a weighted combination of the test_tfm prediction and avg_train_tfm_pred, the average of the train_tfm predictions.]
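A hedged sketch of weighted test-time augmentation. It assumes the loaders yield bare image batches in a fixed order; the function name and the 0.6/0.4 weighting are illustrative assumptions.

import torch

@torch.no_grad()
def tta_predict(model, test_loader, tta_loaders, w_test=0.6):
    # test_loader uses test_tfm; each loader in tta_loaders uses train_tfm.
    # All loaders must keep shuffle=False so logits line up image by image.
    model.eval()
    test_logits = torch.cat([model(x) for x in test_loader])
    avg_train_tfm_pred = torch.stack(
        [torch.cat([model(x) for x in loader]) for loader in tta_loaders]
    ).mean(dim=0)
    final = w_test * test_logits + (1 - w_test) * avg_train_tfm_pred
    return final.argmax(dim=1)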
Cross Validation
● Even if you don’t do cross validation, you are encouraged to resplit the
train/validation sets into more suitable proportions.
○ Currently train : validation ≈ 3 : 1; more training data could be valuable. A resplitting sketch follows.
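A minimal k-fold resplitting sketch using sklearn's KFold; the fold count, the placeholder paths, and the variable names are assumptions for illustration.

from sklearn.model_selection import KFold

# 9866 train + 3430 validation images merged into one pool
# (the paths below are placeholders for the real filenames).
all_files = [f"food-11/{i}.jpg" for i in range(9866 + 3430)]

kf = KFold(n_splits=4, shuffle=True, random_state=0)
for fold, (train_idx, valid_idx) in enumerate(kf.split(all_files)):
    train_files = [all_files[i] for i in train_idx]
    valid_files = [all_files[i] for i in valid_idx]
    # train one model per fold, then ensemble the fold models' predictions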
Ensemble
● Averaging logits or probabilities : you need to save the full model
outputs, but the result is less ambiguous.
● Voting : easier to implement, but ties need to be broken.
Both styles are sketched below.
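A sketch of the two ensembling styles, assuming logits_list holds each model's saved (N, 11) outputs over the same test order; the function names are assumptions.

import torch

def ensemble_average(logits_list):
    # average the saved outputs, then take the most likely class
    return torch.stack(logits_list).mean(dim=0).argmax(dim=1)

def ensemble_vote(logits_list):
    votes = torch.stack([l.argmax(dim=1) for l in logits_list])
    # torch.mode resolves ties implicitly; add an explicit rule if needed
    return votes.mode(dim=0).values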
Change sorting to “Recently Run” if you can’t find the code.
How to get data : In the Input section, there should already be a dataset titled “ml2022spring-hw3”.
Since GPU time is limited, it is advised NOT to utilize the GPU while debugging.
How to run interactively : The commands are very similar to Google Colab.
Make sure your code is bug-free, as an error in any code block will terminate
the whole run early.
How to view “Run in background” results
=> Run any code with GPU on Kaggle today (3/4) to get (possible) extra
time next week.
=> When your quota is about to run out, running one script in the background
effectively gives you 12 more hours of runtime.
=> When your quota is about to run out, running two scripts in the background
effectively gives you 24 more hours of runtime.
Time management - Parallelization
● You can run two notebooks in the background.
● If you are satisfied with your code, use this to run multiple random
seeds / multiple train-validation splits / multiple model structures, so
you can ensemble the results.
A sample procedure for beating the boss baseline
The boss baseline can be beaten with a single model trained on Kaggle for 12 hrs.
[Figure: timeline - a 12 h background training run followed by prediction.]
Q2. Residual Connection Implementation (2%)
Residual connections are widely used in CNNs, e.g. in Deep Residual Learning
for Image Recognition (ResNet). The idea is demonstrated in the following graph.
Image Source : https://towardsdatascience.com/what-is-residual-connection-efb07cab0d55
Implement Residual Connections in the Residual_Model, following the
graph below. Copy your Residual_Model code and paste it on Gradescope.
● Your Residual_Model should connect the layers as shown in the graph.
[Figure: Residual_Model connection graph. Legend: one arrow symbol = Addition, the other = ReLU. Layers in order: cnn_layer1, cnn_layer2, cnn_layer3, cnn_layer4, cnn_layer5, cnn_layer6, fc_layer.]
● You should only modify the forward part of the model; a hedged sketch follows.
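A hedged sketch of a residual forward pass that slots into the sample code's Residual_Model. The exact skip pattern (adding each odd layer's activation back after the following layer) is an assumption; follow the graph on the slide.

import torch.nn.functional as F

# Drop-in replacement for Residual_Model.forward; the cnn_layer* and
# fc_layer modules are already defined in the sample code.
def forward(self, x):
    x1 = F.relu(self.cnn_layer1(x))
    x2 = F.relu(self.cnn_layer2(x1) + x1)   # addition happens before the ReLU
    x3 = F.relu(self.cnn_layer3(x2))
    x4 = F.relu(self.cnn_layer4(x3) + x3)   # skip connection around layer 4
    x5 = F.relu(self.cnn_layer5(x4))
    x6 = F.relu(self.cnn_layer6(x5) + x5)   # skip connection around layer 6
    return self.fc_layer(x6.flatten(1))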
Submission Format
● train_tfm and Residual_Model are present in the Colab notebook (scroll to
the bottom) and on Kaggle (ML2022HW3 - Report Questions); you only need to
modify our sample code.
Regulations and Grading Policy
Grading
● simple (public) +0.5 pts
● simple (private) +0.5 pts
● medium (public) +0.5 pts
● medium (private) +0.5 pts
● strong (public) +0.5 pts
● strong (private) +0.5 pts
● boss (public) +0.5 pts
● boss (private) +0.5 pts
● code submission +2 pts
● report +4 pts
Total : 10 pts
Code Submission
● NTU COOL
○ Compress your code into a .zip file named
<student_ID>_hw3.zip
Colab :
https://colab.research.google.com/drive/15hMu9YiYjE_6HY99UXon2vKGk2KwugWu
Contact us if you have problems…
● Kaggle Discussion (Recommended for this HW)
● NTU COOL
● Email
○ [email protected]
○ The title should begin with “[hw3]”