Assignment 1 (3)

The document outlines Homework 1 for CSCI410, which involves implementing the K Nearest Neighbors (KNN) algorithm in Python. It details the required functions, including calculating Euclidean distance, finding K nearest neighbors, predicting class labels, and calculating accuracy. Additionally, it discusses weighted voting methods for KNN and provides a scenario for determining class labels using inverse-square and Gaussian voting methods.


CSCI410: Homework 1 (Due October 24)

Exercise 1

KNN implementation

Your task is to implement the K Nearest Neighbors algorithm from scratch in Python. The algorithm should be able to classify a given set of data points into predefined classes based on the majority vote of each point's K nearest neighbors. Your implementation should include the following functions:

1. euclidean_distance(point1, point2): This function calculates the Euclidean distance between two
   data points, point1 and point2. The Euclidean distance metric is commonly used in KNN to
   measure the similarity between data points.
2. get_k_nearest_neighbors(X_train, y_train, x_test, k): Given a test data point x_test, this function
   should calculate and return the K nearest neighbors from the training data. Here X_train is a
   matrix representing the training data, with each row representing a data point, and y_train is a
   vector holding the corresponding label or class for each data point in the training set. The
   function should use euclidean_distance() to compute the distances between the test point and
   all the training data points, and then select the K nearest neighbors based on the shortest
   distances.
3. predict(x_test, k): This function predicts the class label for a given test data point x_test. It
   should make use of get_k_nearest_neighbors() to obtain the K nearest neighbors and then use
   majority voting to determine the predicted class label.
4. accuracy(y_pred, y_true): This function calculates the accuracy of the KNN algorithm by
   comparing the predicted labels y_pred with the true labels y_true.

Please submit your code in a well-documented format, including comments to explain the different
sections and steps of your implementation.
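The four required functions could be sketched as follows. This is a minimal from-scratch version, not a reference solution; note that predict() here takes X_train and y_train as explicit parameters, whereas the assignment's signature predict(x_test, k) presumably assumes the training data is stored elsewhere (for example, as attributes of a class).

```python
import math
from collections import Counter

def euclidean_distance(point1, point2):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(point1, point2)))

def get_k_nearest_neighbors(X_train, y_train, x_test, k):
    """Return the labels of the k training points closest to x_test."""
    # Pair each training point's distance to x_test with its label.
    distances = [(euclidean_distance(x, x_test), y)
                 for x, y in zip(X_train, y_train)]
    distances.sort(key=lambda pair: pair[0])  # ascending by distance
    return [label for _, label in distances[:k]]

def predict(X_train, y_train, x_test, k):
    """Predict a class label for x_test by majority vote of k neighbors."""
    neighbors = get_k_nearest_neighbors(X_train, y_train, x_test, k)
    return Counter(neighbors).most_common(1)[0][0]

def accuracy(y_pred, y_true):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == t for p, t in zip(y_pred, y_true))
    return correct / len(y_true)
```

For example, with X_train = [[0, 0], [1, 1], [5, 5], [6, 6]] and y_train = ['a', 'a', 'b', 'b'], calling predict(X_train, y_train, [0.5, 0.5], 3) selects the two 'a' points and one 'b' point as neighbors and returns 'a' by majority vote.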

Exercise 2

The k-NN algorithm looks at several close neighbors and lets them vote with equal strength, with victory going to the value with the most votes. This approach raises concerns when the close neighbors predict different values. One solution is to let each neighbor vote in proportion to a weight, 𝑤, determined by its distance. Here are some possibilities:

 Each nearby neighbor votes according to the inverse square of its distance, 𝑤 = 1/𝑑².
 Each nearby neighbor votes according to a Gaussian function of the square of its distance, 𝑤 = 𝑒^(−𝛼𝑑²).
In the figure below, suppose that we use three neighbors (k = 3) to predict the color of the unknown black circle (P). Determine whether P is blue, red, or green under inverse-square voting and under Gaussian voting with 𝛼 = 0.2 and with 𝛼 = 0.4.
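The weighted-voting calculation can be sketched as follows. The (label, distance) pairs below are placeholders only; the actual labels and distances must be read from the figure in the assignment.

```python
import math
from collections import defaultdict

def weighted_vote(neighbors, weight_fn):
    """Sum the weight of each neighbor per class; return the winning class.

    neighbors: list of (label, distance) pairs.
    weight_fn: maps a distance d to a vote weight w.
    """
    totals = defaultdict(float)
    for label, d in neighbors:
        totals[label] += weight_fn(d)
    return max(totals, key=totals.get)

# Hypothetical neighbor data -- replace with the values from the figure.
neighbors = [("blue", 1.0), ("red", 2.0), ("red", 3.0)]

# Inverse-square voting: w = 1/d^2
inv_square = weighted_vote(neighbors, lambda d: 1 / d ** 2)

# Gaussian voting: w = exp(-alpha * d^2), for alpha = 0.2 and 0.4
gauss_02 = weighted_vote(neighbors, lambda d: math.exp(-0.2 * d ** 2))
gauss_04 = weighted_vote(neighbors, lambda d: math.exp(-0.4 * d ** 2))
```

With these placeholder distances, inverse-square voting gives blue a weight of 1/1² = 1.0 against red's 1/2² + 1/3² ≈ 0.361, so the single closest blue neighbor outvotes the two red ones; unweighted majority voting would have picked red. This illustrates how distance weighting can flip the outcome when the nearest neighbor disagrees with the majority.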
