2. Introduction to Supervised Learning and K Nearest Neighbors
Supervised Learning
• Objectives: Explain supervised learning and how it can be applied to regression and classification problems
Machine learning allows computers to learn and infer from data.
Machine Learning in Our Daily Lives
• Spam filtering
• Web search
• Postal mail routing
• Fraud detection
• Movie recommendations
• Vehicle driver assistance
Types of Supervised Learning
• Classification: the outcome is a category
• Regression: the outcome is a continuous value
Supervised Learning Overview
[Diagram: labeled data (e.g., movie features with revenue) + model specification → fit → trained model]
Classification
• Assume we want to teach a computer to distinguish between cats and dogs …

Several steps:
1. Feature transformation
2. Model / classifier specification
3. Model / classifier estimation (with regularization)
4. Feature selection
Classifier
[Diagram: input X → classifier with parameters w1, w2, … → output Y; a "teacher" supplies labeled pairs (X, Y) for training]
Types of classifiers
• We can divide the large variety of classification approaches into roughly three main types:
1. Instance based:
- use the training observations directly, without building an explicit model
- e.g., K nearest neighbors
2. Generative:
- build a generative statistical model
- e.g., Naïve Bayes, Bayesian networks
3. Discriminative:
- directly estimate a decision rule/boundary
- e.g., decision tree
Machine Learning Vocabulary
• Target: the predicted category or value (the column to predict)
• Features: properties of the data used for prediction (the non-target columns)
• Example: a single data point within the data (one row)
• Label: the target value for a single data point
Machine Learning Vocabulary

sepal length | sepal width | petal length | petal width | species
6.9          | 3.1         | 4.9         | 1.5         | versicolor

In this table, the four measurement columns are the features, the species column is the target, the row is one example, and "versicolor" is that example's label.
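The vocabulary above can be sketched in a few lines of Python, using the single iris row from the table (the variable names here are illustrative, not from any particular library):

```python
# One "example" (row) from the table above. The measurement columns are
# the "features"; the row's value in the "target" column is its "label".
example = {
    "sepal length": 6.9,
    "sepal width": 3.1,
    "petal length": 4.9,
    "petal width": 1.5,
    "species": "versicolor",
}

target_column = "species"
features = {k: v for k, v in example.items() if k != target_column}
label = example[target_column]

print(features)  # the four measurements
print(label)     # 'versicolor'
```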
What is
Classification?
Which flower is a customer
most likely to purchase based
on similarity to previous
purchase?
What is Needed for Classification?
• Model data with:
• Features that can be quantified
• Labels that are known
• A method to measure similarity
KNN: Pseudocode
KNN: Example
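The KNN procedure above can be sketched as a short from-scratch implementation: compute distances to all training points, keep the k nearest, and take a majority vote. The toy cat/dog data points are invented for illustration:

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # 1. Distance from the query to every labeled training example.
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    # 2. Sort by distance and keep the labels of the k closest points.
    k_labels = [label for _, label in sorted(distances)[:k]]
    # 3. Majority vote among those k labels.
    return Counter(k_labels).most_common(1)[0][0]

# Invented toy data: two features per point, two classes.
train_X = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
train_y = ["cat", "cat", "cat", "dog", "dog", "dog"]

print(knn_predict(train_X, train_y, query=(2, 2), k=3))  # 'cat'
print(knn_predict(train_X, train_y, query=(7, 8), k=3))  # 'dog'
```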
K nearest neighbors (KNN)
• Need to determine an appropriate value for k
• What happens if we choose k = 1?
• What if k = 3?
Euclidean Distance (L2 Distance)
[Plot: patients plotted by Number of Malignant Nodes (x-axis) vs Age (y-axis); d is the straight-line distance between two points, with legs ΔNodes and ΔAge]

d = √(ΔNodes² + ΔAge²)
KNN: Euclidean distance matrix
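A sketch of the distance computation, first between one pair of points and then as the full pairwise matrix; the patient values (age, number of malignant nodes) are invented for illustration:

```python
import numpy as np

# Invented patient records: (age, number of malignant nodes)
patients = np.array([
    [50.0, 2.0],
    [30.0, 1.0],
    [60.0, 4.0],
])

# Euclidean (L2) distance between the first two patients:
# d = sqrt(dAge^2 + dNodes^2)
d01 = np.sqrt(np.sum((patients[0] - patients[1]) ** 2))
print(d01)  # sqrt(20^2 + 1^2)

# Full pairwise distance matrix via broadcasting:
# diff[i, j] holds the feature-wise difference between patients i and j.
diff = patients[:, None, :] - patients[None, :, :]
dist_matrix = np.sqrt((diff ** 2).sum(axis=-1))
print(dist_matrix)  # 3x3, symmetric, zeros on the diagonal
```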
Decision Boundaries
Voronoi diagram
🞑 Describes the areas that are nearest to any given point, given a set of data.
🞑 Each line segment is equidistant between two points of opposite classes.
https://www.youtube.com/watch?v=j2c3kumwoAk
Decision Boundaries
• KNN creates local models (or neighborhoods) across the feature space, with each neighborhood defined by a subset of the training data.
• Implicitly, a "global" decision space is created with boundaries between the training data.
Decision Boundaries
With a large number of examples and possible noise in the labels, the decision boundary can become jagged and overly complex!
🞑 The "overfitting" problem
Effect of K
Larger k produces a smoother decision boundary.
When k = N, the model always predicts the majority class.
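The k = N extreme can be demonstrated directly: when every query sees all training points, the vote is always won by the majority class, regardless of where the query sits. The toy data here is invented:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Invented toy data: four points of class 0, two points of class 1.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [9, 9], [9, 10]])
y = np.array([0, 0, 0, 0, 1, 1])

# k = N: the neighborhood is the entire training set, so the 4-vs-2
# majority vote always picks class 0.
knn = KNeighborsClassifier(n_neighbors=len(X)).fit(X, y)
print(knn.predict([[9, 9]]))  # [0] - even for a point sitting on class 1
```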
Discussion
• Which model is better, K = 1 or K = 15? Why?
• How do we choose k? What is the empirically optimal k?
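One common way to answer "how to choose k" empirically is cross-validation: score each candidate k on held-out folds and keep the best. A minimal sketch on the iris dataset (the range 1–20 is an arbitrary choice for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Score each candidate k with 5-fold cross-validation.
scores = {
    k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    for k in range(1, 21)
}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```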
Feature scaling
Standardize the range of the independent variables (the features of the data).
A.k.a. normalization or standardization.
Standardization
Standardization, or Z-score normalization:
🞑 Rescale the data so that the mean is zero and the standard deviation is one

x_norm = (x − μ) / σ

where μ is the mean and σ is the standard deviation; the rescaled values are called standard scores (z-scores).
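The z-score formula above in a few lines of NumPy (the age values are invented for illustration):

```python
import numpy as np

# Invented feature column (patient ages).
ages = np.array([30.0, 40.0, 50.0, 60.0, 70.0])

# Z-score normalization: x_norm = (x - mu) / sigma
mu, sigma = ages.mean(), ages.std()
ages_norm = (ages - mu) / sigma

print(ages_norm.mean())  # ~0.0
print(ages_norm.std())   # 1.0
```

scikit-learn's StandardScaler performs the same transformation and is the usual choice inside a pipeline.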
Min-Max scaling
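Min-max scaling maps each feature into [0, 1] via x_norm = (x − x_min) / (x_max − x_min). A minimal sketch with invented data (scikit-learn's MinMaxScaler does the same):

```python
import numpy as np

# Invented feature column (patient ages).
ages = np.array([30.0, 40.0, 50.0, 60.0, 70.0])

# Min-max scaling: x_norm = (x - min) / (max - min), values land in [0, 1].
ages_scaled = (ages - ages.min()) / (ages.max() - ages.min())
print(ages_scaled)  # [0.   0.25 0.5  0.75 1.  ]
```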
[Scatter plot: Number of Malignant Nodes (x-axis) vs Age (y-axis); point classes: full remission, partial remission, did not survive]
Regression with KNN
[Plots: KNN regression fits for K = 20, K = 3, and K = 1]
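The three settings from the plots can be reproduced with scikit-learn's KNeighborsRegressor, which predicts the average target of the k nearest neighbors. The noisy sine data is invented for illustration; training R² rises as k shrinks, reaching exactly 1.0 at k = 1 (the model memorizes the data):

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Invented 1-D data: noisy samples of a smooth curve.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 10, 40)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, 40)

# K = 20 averages over half the data (smooth, possibly underfit);
# K = 1 reproduces each training point exactly (jagged fit).
r2 = {}
for k in (20, 3, 1):
    model = KNeighborsRegressor(n_neighbors=k).fit(X, y)
    r2[k] = model.score(X, y)  # training R^2
print(r2)
```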
Pros and Cons
Pros
🞑 Learning and implementation are extremely simple and intuitive
🞑 Flexible decision boundaries
Cons
🞑 Irrelevant or correlated features have a high impact and must be eliminated
🞑 Typically difficult to handle high dimensionality
🞑 Computational costs: memory and classification-time computation
K Nearest Neighbors: The Syntax

Import the class containing the classification method:
from sklearn.neighbors import KNeighborsClassifier

To accelerate scikit-learn with the Intel Extension for Scikit-learn, apply the patch (note: patch_sklearn comes from the sklearnex package, and the call must run before the sklearn import above for the patch to take effect):
from sklearnex import patch_sklearn
patch_sklearn()
K Nearest Neighbors: The Syntax

Import the class containing the classification method:
from sklearn.neighbors import KNeighborsClassifier

Create an instance of the class:
KNN = KNeighborsClassifier(n_neighbors=3)

Fit the instance on the data and then predict the expected value:
KNN = KNN.fit(X_data, y_data)
y_predict = KNN.predict(X_data)
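Putting the syntax together, a runnable end-to-end sketch on the iris dataset; the train/test split, the scaling step, and n_neighbors=3 are illustrative choices, not prescribed by the slides:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Scale features first - KNN is distance based, so feature scaling matters.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Fit on the training split, evaluate on the held-out split.
KNN = KNeighborsClassifier(n_neighbors=3)
KNN = KNN.fit(X_train_s, y_train)
y_predict = KNN.predict(X_test_s)

accuracy = (y_predict == y_test).mean()
print(accuracy)  # held-out accuracy
```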