Lab 1 Assignment

The Lab 1 Assignment requires students to implement the KNN algorithm for fruit classification using the 'fruit data with colors.txt' dataset. Students must complete tasks including data import, dataset splitting, feature scaling, KNN classifier construction, model training, and cross-validation accuracy computation. The assignment is to be submitted as a .ipynb file by February 9 at 11:59 PM, with specific instructions for using Google Colab.

Uploaded by

kongjun9423
Lab 1 Assignment

Student Name:                    Student ID:

Submit your .ipynb file to the Lab Assignments section, ensuring that your name and student ID are
included, by 11:59 PM on February 9.

Note: Please answer the following questions using Google Colab, the online Python platform at
https://colab.research.google.com; click New Notebook to open a Colab notebook. For all questions
that involve coding, the code must be provided; please also refer to the relevant lecture slides.

1. (10 points) In this assignment, we will use the “fruit data with colors.txt” dataset to implement
the KNN algorithm for fruit classification. We will use mass, width, height, and color score as
features and the fruit label as the target variable. Our objective is to develop a KNN-based predictor
that classifies a fruit from its features.
(a) (2 points) Please import the dataset “fruit data with colors.txt” and show the first 10 rows.
Use mass, width, height, and color scores as the features and fruit label as the target.
(b) (1 point) Please split the dataset into training and test sets in a 4:1 ratio (80% training, 20% test).
(c) (3 points) Please rescale the features by using min-max scaling. Discuss the importance of
scaling. Explain two different methods of scaling. Put your explanation in a markdown cell.
(d) (2 points) Please construct a KNN-based fruit classifier by performing a Grid Search to find
the best value of K for the KNN classifier, using values of K ranging from 1 to 20.
(e) (1 point) Using the best K found from Grid Search, train a KNN model and compute its test
set accuracy.
(f) (1 point) Using the trained model, perform 5-fold cross-validation and compute the accuracy
for each fold. Report the individual fold accuracies and the average cross-validation accuracy.
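
For orientation, the coding steps in parts (b) through (f) could be wired together roughly as in the sketch below, which assumes scikit-learn and pandas as provided in Colab. It is one possible arrangement, not the required solution. The column names `mass`, `width`, `height`, `color_score`, and `fruit_label` are assumptions about the file header, and the helper `make_synthetic_fruit` with its made-up class centers is purely hypothetical stand-in data so the sketch runs without the real file. Min-max scaling here means mapping each feature x to (x - min) / (max - min), with min and max taken from the training data.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Assumed column names; adjust them to match the header of the actual file.
FEATURES = ["mass", "width", "height", "color_score"]
TARGET = "fruit_label"


def make_synthetic_fruit(n_per_class=30, seed=0):
    """Hypothetical stand-in data: four made-up fruit classes, NOT the real dataset."""
    rng = np.random.default_rng(seed)
    centers = {1: (170, 7.5, 7.3, 0.80), 2: (80, 5.9, 4.4, 0.75),
               3: (200, 7.6, 8.0, 0.70), 4: (150, 6.5, 9.0, 0.65)}
    frames = []
    for label, center in centers.items():
        block = rng.normal(loc=center, scale=(10, 0.3, 0.3, 0.03), size=(n_per_class, 4))
        frame = pd.DataFrame(block, columns=FEATURES)
        frame[TARGET] = label
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)


def run_knn_workflow(df, seed=0):
    """Run steps (b)-(f) on a DataFrame holding the FEATURES and TARGET columns."""
    X, y = df[FEATURES], df[TARGET]

    # (b) 4:1 split: 20% of the rows are held out for testing.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )

    # (c) Min-max scaling inside a pipeline, so the scaler is fit on training folds only.
    pipe = Pipeline([("scale", MinMaxScaler()), ("knn", KNeighborsClassifier())])

    # (d) Grid search over K = 1..20, scored by accuracy with 5-fold CV on the training set.
    grid = GridSearchCV(pipe, {"knn__n_neighbors": list(range(1, 21))},
                        cv=5, scoring="accuracy")
    grid.fit(X_train, y_train)

    # (e) GridSearchCV refits the best pipeline on the whole training set by default.
    best_k = grid.best_params_["knn__n_neighbors"]
    test_acc = grid.score(X_test, y_test)

    # (f) 5-fold cross-validation of the best pipeline; one accuracy per fold.
    fold_acc = cross_val_score(grid.best_estimator_, X, y, cv=5, scoring="accuracy")
    return best_k, test_acc, fold_acc


# (a) With the real file, loading might look like this (assuming it is tab-separated):
#     df = pd.read_table("fruit_data_with_colors.txt"); df.head(10)
df = make_synthetic_fruit()
best_k, test_acc, fold_acc = run_knn_workflow(df)
print("best K:", best_k)
print("test accuracy:", round(test_acc, 3))
print("fold accuracies:", np.round(fold_acc, 3), "mean:", round(fold_acc.mean(), 3))
```

Wrapping the scaler and the classifier in a single `Pipeline` is a design choice rather than something the assignment prescribes: it keeps the scaling statistics from leaking out of the validation folds during grid search. Scaling the training and test sets in a separate step, as part (c) describes, is an equally valid way to structure the notebook.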