K-NN Classification
K-Nearest Neighbors (KNN) is a simple, intuitive machine learning algorithm used for
both classification and regression tasks. The core idea behind KNN is that data points
that are close to each other in feature space are likely to belong to the same class (for
classification) or have similar values (for regression).
1. Choose the number of neighbors (K): You first select the number of nearest
neighbors to consider for making a decision. For example, if K = 3, we consider
the 3 closest points to the new point.
2. Calculate the distance: For a new data point, calculate the distance from it to every
point in the training dataset. A common distance metric is Euclidean distance.
3. Sort the distances: Once you calculate all the distances, sort them in ascending
order to find the K nearest points (neighbors).
4. Make a prediction (a code sketch of all four steps follows this list):
   - For classification: Assign the most common class (the majority class)
     among the K nearest neighbors to the new point.
   - For regression: Take the average (mean) of the values of the K nearest
     neighbors and assign it to the new point.
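To make the four steps concrete, here is a minimal from-scratch sketch in plain Python. The names `euclidean` and `knn_predict` are illustrative choices for this sketch, not part of any library API.

```python
import math
from collections import Counter

def euclidean(a, b):
    # Straight-line distance between two feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_X, train_y, query, k=3, task="classification"):
    # Step 2: distance from the query point to every training point
    dists = [(euclidean(x, query), label) for x, label in zip(train_X, train_y)]
    # Step 3: sort ascending and keep the K nearest neighbors
    nearest = sorted(dists, key=lambda d: d[0])[:k]
    labels = [label for _, label in nearest]
    if task == "classification":
        # Step 4 (classification): majority vote among the K neighbors
        return Counter(labels).most_common(1)[0][0]
    # Step 4 (regression): mean of the K neighbors' values
    return sum(labels) / k
```

Note that K (step 1) is just the `k` argument; everything else the algorithm needs is the training data itself, which is why KNN is often called a "lazy" learner: there is no training phase beyond storing the data.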
Example 2: Classification of Fruits
Let's say we have a dataset of fruits based on two features: weight and color intensity.
We need to classify a new fruit as either an apple or a banana.
| Fruit  | Weight (grams) | Color Intensity (scale 1-10) | Class  |
|--------|----------------|------------------------------|--------|
| Apple  | 150            | 8                            | Apple  |
| Apple  | 160            | 7                            | Apple  |
| Banana | 120            | 4                            | Banana |
| Banana | 130            | 5                            | Banana |
| Apple  | 140            | 8                            | Apple  |
| Banana | 125            | 6                            | Banana |
Step-by-step Process
1. Choose K: Here K = 3.
2. Calculate the distances: Compute the Euclidean distance from the new fruit's
(weight, color intensity) pair to each of the six fruits in the table.
3. Sort the distances: Order the six distances in ascending order.
4. Select the neighbors: Take the 3 fruits with the smallest distances.
5. Classify: The majority of the 3 nearest neighbors are apples, so the new fruit is
classified as an apple.
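The same example can be run with scikit-learn's `KNeighborsClassifier`. Since the original example does not state the new fruit's measurements, the query point below is a made-up illustration:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Training data from the table: [weight (grams), color intensity (1-10)]
X = np.array([[150, 8], [160, 7], [120, 4], [130, 5], [140, 8], [125, 6]])
y = ["Apple", "Apple", "Banana", "Banana", "Apple", "Banana"]

clf = KNeighborsClassifier(n_neighbors=3)  # K = 3; Euclidean distance by default
clf.fit(X, y)

# Hypothetical new fruit: 145 g, color intensity 7 (not given in the example)
new_fruit = [[145, 7]]
print(clf.predict(new_fruit))  # -> ['Apple'] for these made-up measurements
```

One caveat worth noting: weight (120-160) spans a much larger numeric range than color intensity (4-8), so Euclidean distance here is dominated by weight. In practice, features are usually standardized (e.g., with `sklearn.preprocessing.StandardScaler`) before applying KNN.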