Individual Assignment 3 Guideline
Note: Do not include the questions or the dataset in your submission (to avoid similarity with
other submissions).
Objective:
The objective of this assignment is to familiarize you with two essential concepts in machine
learning: K-Fold Cross Validation and Early Stopping. You will apply these techniques to a
classification task using a mobile price classification dataset. Through this assignment, you will
learn how to implement K-Fold Cross Validation to assess model performance and how to use
Early Stopping to prevent overfitting during model training.
For both parts of this assignment, you'll use the provided 'mobile.csv' dataset to classify mobile
prices as low (0) or high (1).
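As a rough illustration of the K-Fold Cross Validation idea described above, the sketch below splits a dataset into 5 folds, trains on 4 of them, and scores on the held-out fold, repeating until every fold has been used once for validation. It is not the code from the provided notebook: the synthetic data (a stand-in for 'mobile.csv'), the logistic regression model, the number of folds, and the helper name kfold_scores are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for 'mobile.csv' (assumption): 20 features, binary label 0/1.
X, y = make_classification(n_samples=400, n_features=20, random_state=0)


def kfold_scores(X, y, n_splits=5, seed=0):
    """Return per-fold accuracy and F1 for a simple classifier (hypothetical helper)."""
    splitter = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    acc, f1 = [], []
    for train_idx, val_idx in splitter.split(X, y):
        # A fresh model per fold, so no information leaks between folds.
        model = LogisticRegression(max_iter=1000)
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[val_idx])
        acc.append(accuracy_score(y[val_idx], pred))
        f1.append(f1_score(y[val_idx], pred))
    return np.array(acc), np.array(f1)


acc, f1 = kfold_scores(X, y)
print(f"accuracy: {acc.mean():.3f} +/- {acc.std():.3f}")
print(f"f1_score: {f1.mean():.3f} +/- {f1.std():.3f}")
```

The spread of the per-fold scores gives a sense of how much the estimate depends on the particular split, which a single train/test split cannot show.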
Step-by-Step Procedure:
The step-by-step procedure (11 steps) for applying cross validation and evaluating model
performance is provided in the file ‘Classification Metrics and K-fold CV.ipynb’, which can be
found under the Additional Materials folder.
IMPORTANT NOTE: Some parts of the code are missing, marked with ‘???’, and left for you to
complete. This is straightforward provided that you follow the instructions carefully. (Also read
the green comments above the code; they will help you fill in the ‘???’ sections.)
Deliverables:
1. At the end, provide a short summary of the following results from k-fold cross validation: the
optimal k (based on ‘accuracy’ and ‘f1_score’) and the ‘accuracy’ and ‘f1_score’ score lists
(20 scores, one for each of the 20 values of k).
Optimal K
| Evaluation Metric: accuracy | Evaluation Metric: f1_score |
|                             |                             |

Score List
| Evaluation Metric: accuracy | Evaluation Metric: f1_score |
|                             |                             |
3. A short discussion of whether the model is overfitted, underfitted, or properly fitted. Interpret
the results to understand the model's performance and potential bias-variance trade-offs.
| Discussion on model fitness (question 3) | MAXIMUM 5 Lines |
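The Part 1 deliverables (an optimal k and one score per candidate k for each metric) could be produced along the lines of the sketch below. It rests on assumptions that the guideline does not state: that k is the number of neighbours in a k-nearest-neighbours classifier, that the 20 candidates are k = 1..20, and that 5 folds are used. The data is again a synthetic stand-in for 'mobile.csv'. Note that scikit-learn names the F1 scorer "f1", not "f1_score".

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for 'mobile.csv' (assumption).
X, y = make_classification(n_samples=400, n_features=20, random_state=0)

k_values = list(range(1, 21))  # assumed: the 20 candidate values of k
scores = {"accuracy": [], "f1": []}

for k in k_values:
    # Assumed model: k-nearest neighbours, with k as the tuned hyperparameter.
    # Scaling lives inside the pipeline so it is refit on each training fold.
    model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    for metric in scores:
        fold_scores = cross_val_score(model, X, y, cv=5, scoring=metric)
        scores[metric].append(fold_scores.mean())

for metric, values in scores.items():
    best_k = k_values[int(np.argmax(values))]
    print(f"{metric}: optimal k = {best_k}, best mean score = {max(values):.3f}")
```

For the discussion on model fitness, one common check is to compare the training score with the cross-validated score at the chosen k: a large gap suggests overfitting, while low scores on both suggest underfitting.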
Part 2: Early Stopping and Artificial Neural Network Classification
In this section, you will learn about the concept of Early Stopping and how to implement it to
prevent overfitting during model training.
Step-by-Step Procedure:
The step-by-step procedure (9 steps) for applying cross validation and evaluating model
performance is provided in the file ‘Early Stopping.ipynb’, which can be found under the
Additional Materials folder.
IMPORTANT NOTE: Some parts of the code are missing, marked with ‘???’, and left for you to
complete. This is straightforward provided that you follow the instructions carefully. (Also read
the green comments above the code; they will help you fill in the ‘???’ sections.)
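To make the Early Stopping idea concrete, here is a minimal sketch of a patience-based training loop: train one epoch at a time, track the validation loss, and stop once it has not improved for a fixed number of epochs. The notebook's actual framework and settings are not given in this guideline, so everything here is an assumption: the scikit-learn MLPClassifier, the hidden layer size, the patience of 10, the 100-epoch cap, and the synthetic stand-in for 'mobile.csv'.

```python
import copy

import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for 'mobile.csv' (assumption).
X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
scaler = StandardScaler().fit(X_train)  # fit on training data only
X_train, X_val = scaler.transform(X_train), scaler.transform(X_val)

model = MLPClassifier(hidden_layer_sizes=(32,), random_state=0)
max_epochs, patience = 100, 10  # assumed settings
best_loss, best_epoch, best_model, wait = float("inf"), 0, None, 0
train_losses, val_losses = [], []

for epoch in range(1, max_epochs + 1):
    # partial_fit performs a single pass over the training data.
    model.partial_fit(X_train, y_train, classes=np.array([0, 1]))
    train_losses.append(log_loss(y_train, model.predict_proba(X_train)))
    val_losses.append(log_loss(y_val, model.predict_proba(X_val)))

    if val_losses[-1] < best_loss:
        # Validation loss improved: remember this model and reset the counter.
        best_loss, best_epoch, wait = val_losses[-1], epoch, 0
        best_model = copy.deepcopy(model)
    else:
        wait += 1
        if wait >= patience:
            break  # no improvement for `patience` epochs: stop early

print(f"stopped after {len(val_losses)} epochs; best epoch = {best_epoch}")
print(f"validation accuracy of best model: {best_model.score(X_val, y_val):.3f}")
```

If the notebook uses Keras instead, the `EarlyStopping` callback with `monitor="val_loss"`, a `patience` value, and `restore_best_weights=True` plays the same role as the manual bookkeeping above.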
Deliverables:
1. Screenshot of the training and validation loss over epochs.
2. The train and test accuracy.
3. Do you think the model is overfitted after 100 epochs? If not, give your reasons; if so, what
is the optimal number of epochs at which to stop training?
Accuracy
| Test | Train |
|      |       |
| Discussion on model overfitting (question 3) | MAXIMUM 5 Lines |
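One way to reason about question 3 is to read the optimal stopping point directly off the recorded loss curves: the epoch with the lowest validation loss is a natural candidate, and a validation loss that rises afterwards while the training loss keeps falling is the usual sign of overfitting. The sketch below encodes that heuristic. The function names and the loss values are made up for illustration and are not results from the assignment.

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=val_losses.__getitem__) + 1


def looks_overfitted(train_losses, val_losses, tolerance=0.0):
    """Heuristic: after the best epoch, validation loss ends higher
    while training loss ends lower."""
    stop = best_epoch(val_losses)
    val_rose = val_losses[-1] > val_losses[stop - 1] + tolerance
    train_fell = train_losses[-1] < train_losses[stop - 1]
    return val_rose and train_fell


# Made-up loss curves for illustration only.
train = [0.70, 0.55, 0.45, 0.38, 0.33, 0.29, 0.26]
val = [0.72, 0.60, 0.52, 0.49, 0.50, 0.53, 0.57]

print(best_epoch(val))               # → 4
print(looks_overfitted(train, val))  # → True
```

With real curves, a small positive `tolerance` helps avoid flagging overfitting because of noise in the validation loss.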
IMPORTANT: The report submission should include ONLY the following components:
Part 1:
• Three Tables: Optimal K, Score list, and discussion on model fitness.
• Two Screenshots: Train and test confusion matrix.
Part 2:
• Two Tables: Train and test accuracy, and discussion on model overfitting.
• One Screenshot: Training and validation loss over epochs.
Please adhere to this specified structure and format for your submission.
N.B. Failure to comply with the above will result in a low grade.