ALY 6020 Week 1 Midweek Assignment

This report details the Module 1 Midweek project for ALY 6020 Predictive Analytics, which builds a classification model using the k-nearest neighbour (k-NN) algorithm on the 'iris' dataset. With k = 1, the model achieved an accuracy of 96.7%, with per-species accuracies of 100% for setosa and versicolor and 90% for virginica. The analysis indicates that the accuracy stays at 90% or above as k is varied, supporting k-NN as an effective model for classifying iris types.

Uploaded by

liankairen

ALY 6020 - Module 1 Midweek Project

Liankai Ren

ALY6020: Predictive Analytics, College of Professional Studies,
Northeastern University

Professor: Chris Luciuk

Sep 24, 2022


Introduction
This report is the Module 1 Midweek project for ALY 6020 Predictive Analytics. In
this report, we use the k-nearest neighbour (k-NN) algorithm to build a classification
model. k-NN is a non-parametric supervised learning method first developed by
Evelyn Fix and Joseph Hodges in 1951. It classifies a new observation by finding the k
training observations closest to it and assigning the most common class among them.
It is simple and easy to implement, and it can be used for both classification and
regression problems, which has helped make it one of the best-known classification
algorithms in industry. The dataset used in this report is “iris”. The data set contains
3 classes of 50 instances each, where each class refers to a type of iris plant. There are
5 attributes: sepal length, sepal width, petal length, petal width, and class. The goal of
this report is to use k-NN to build a classification model that predicts the type of iris.
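The neighbour-voting idea described above can be sketched in a few lines. This is a minimal illustration in standard-library Python, not the code used for this report (which is not shown here); the function name `knn_predict`, the choice of Euclidean distance, and the six sample rows are assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=1):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point (assumed metric)
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Keep the k closest points; sorted() is stable, so ties keep training order
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbours' labels
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# A handful of iris measurements, used only for illustration:
# (sepal length, sepal width, petal length, petal width)
train_X = [
    (5.1, 3.5, 1.4, 0.2), (4.9, 3.0, 1.4, 0.2),
    (7.0, 3.2, 4.7, 1.4), (6.4, 3.2, 4.5, 1.5),
    (6.3, 3.3, 6.0, 2.5), (5.8, 2.7, 5.1, 1.9),
]
train_y = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]

# A flower with short, narrow petals lands next to the setosa rows
print(knn_predict(train_X, train_y, (5.0, 3.4, 1.5, 0.2), k=1))  # setosa
```

With k = 1 the prediction is simply the label of the single closest training row; larger values of k let several neighbours vote.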
Analysis
First, load the required packages and the “iris” dataset. There are 5 attributes in
this dataset. The first 4 attributes are the numeric features used as predictors, and the
last attribute, “Species”, is the class label to be predicted.

Then split the data into a training set and a test set, with 80% of the rows used for
training. Also, scale the feature columns, which include all 4 numeric attributes of the
iris data.
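As a rough sketch of this preparation step, the standard-library Python below shuffles the rows, cuts them 80/20, and standardizes each feature column. The helper names (`train_test_split`, `fit_scaler`, `apply_scaler`), the seed value, and the sample rows are hypothetical, and standardizing to zero mean and unit sample standard deviation is an assumption about what the scaling step does.

```python
import random
import statistics

def train_test_split(rows, train_ratio=0.8, seed=42):
    """Shuffle a copy of `rows` and cut it into (train, test) parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # arbitrary fixed seed for repeatability
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

def fit_scaler(features):
    """Return per-column means and sample standard deviations."""
    columns = list(zip(*features))
    return ([statistics.mean(c) for c in columns],
            [statistics.stdev(c) for c in columns])

def apply_scaler(features, means, stdevs):
    """Standardize each column: (value - mean) / standard deviation."""
    return [tuple((v - m) / s for v, m, s in zip(row, means, stdevs))
            for row in features]

# Splitting the 150 row indices of the iris data 80/20
train_idx, test_idx = train_test_split(range(150))
print(len(train_idx), len(test_idx))  # 120 30

# A handful of iris measurements, used only to illustrate the scaling
sample = [
    (5.1, 3.5, 1.4, 0.2), (4.9, 3.0, 1.4, 0.2),
    (7.0, 3.2, 4.7, 1.4), (6.4, 3.2, 4.5, 1.5),
    (6.3, 3.3, 6.0, 2.5), (5.8, 2.7, 5.1, 1.9),
]
means, stdevs = fit_scaler(sample)
scaled = apply_scaler(sample, means, stdevs)
```

A common practice, assumed here rather than taken from the report, is to compute the means and standard deviations on the training rows only and reuse them for the test rows, so that no information from the test set leaks into the model. Scaling matters for k-NN because the distance calculation would otherwise be dominated by the attributes with the largest numeric range.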

After this preparation, we can build the k-NN model with k = 1. The confusion
matrix summarizes the result of the k-NN model and shows that the overall accuracy of
the model is 0.967. By species, the accuracy is 100% for setosa, 100% for versicolor,
and 90% for virginica.
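The arithmetic behind these figures can be checked with a small sketch. The label vectors below are HYPOTHETICAL: they assume a 30-row test set (20% of 150) with 10 flowers per species and one virginica misclassified as versicolor, which is one layout consistent with the reported numbers but is not taken from the report's actual output. The function names are also illustrative.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))

def overall_accuracy(actual, predicted):
    """Share of all samples whose predicted label matches the actual label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

def per_class_accuracy(actual, predicted):
    """Share of each class's samples that were predicted correctly."""
    totals = Counter(actual)
    hits = Counter(a for a, p in zip(actual, predicted) if a == p)
    return {label: hits[label] / totals[label] for label in totals}

# HYPOTHETICAL test labels: 10 per species, one virginica predicted as versicolor
actual = ["setosa"] * 10 + ["versicolor"] * 10 + ["virginica"] * 10
predicted = ["setosa"] * 10 + ["versicolor"] * 10 + ["virginica"] * 9 + ["versicolor"]

print(round(overall_accuracy(actual, predicted), 3))  # 0.967
print(per_class_accuracy(actual, predicted))
# {'setosa': 1.0, 'versicolor': 1.0, 'virginica': 0.9}
```

Under these assumptions, 29 correct predictions out of 30 give 29/30 ≈ 0.967, and 9 correct virginica predictions out of 10 give 90%.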

For further analysis, the value of k can be varied to check the accuracy. The
output shows that when k = 2, 3, 5, and 10, the accuracy is 90%. Based on these
results, the accuracy of the k-NN model stays at 90% or above for every k tested.
Therefore, we can conclude that the k-NN model is a good classification model for
predicting the type of iris.
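A comparison across several values of k can be sketched as a simple loop. The Python below is a self-contained illustration, not the report's code: the helper names and the 12 sample rows are assumptions, and a toy sample this small is not expected to reproduce the accuracies reported above.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points nearest to `query`."""
    ranked = sorted(zip(train_X, train_y), key=lambda xy: math.dist(xy[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

def accuracy_for_k(train_X, train_y, test_X, test_y, k):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_y)

# A handful of iris measurements, used only for illustration
train_X = [
    (5.1, 3.5, 1.4, 0.2), (4.9, 3.0, 1.4, 0.2), (4.7, 3.2, 1.3, 0.2),
    (7.0, 3.2, 4.7, 1.4), (6.4, 3.2, 4.5, 1.5), (6.9, 3.1, 4.9, 1.5),
    (6.3, 3.3, 6.0, 2.5), (5.8, 2.7, 5.1, 1.9), (7.1, 3.0, 5.9, 2.1),
]
train_y = ["setosa"] * 3 + ["versicolor"] * 3 + ["virginica"] * 3
test_X = [(4.6, 3.1, 1.5, 0.2), (6.5, 2.8, 4.6, 1.5), (6.3, 2.9, 5.6, 1.8)]
test_y = ["setosa", "versicolor", "virginica"]

# Evaluate the same train/test data for each candidate k
results = {k: accuracy_for_k(train_X, train_y, test_X, test_y, k)
           for k in (1, 2, 3, 5)}
for k, acc in results.items():
    print(f"k={k}: accuracy={acc:.3f}")
```

Keeping the train/test split fixed while only k changes makes the accuracies directly comparable. Even values of k can produce tied votes; in this sketch a tie goes to whichever tied label appears first among the nearest neighbours.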
References
“K-NN Classifier in R Programming - GeeksforGeeks.” GeeksforGeeks, 18
June 2020, https://www.geeksforgeeks.org/k-nn-classifier-in-r-programming/.
skalskip. “Iris Data Visualization and KNN Classification | Kaggle.” Kaggle:
Your Machine Learning and Data Science Community, Kaggle, 27 Sept. 2017,
https://www.kaggle.com/code/skalskip/iris-data-visualization-and-knn-classification.
