Machine Learning Concepts part 4

Uploaded by Sagar Das

Training and Testing

[Figure: Training set → ML Algorithm → Model f(x). The model takes inputs (income, gender, age, family status, zipcode) and outputs a credit amount $ and a credit yes/no decision.]
K-nearest neighbors

• Not every ML method builds a model!

• Our first ML method: KNN.

• Main idea: uses the similarity between examples.

• Assumption: two similar examples should have the same label.

• Assumes all examples (instances) are points in the d-dimensional space R^d.
K-nearest neighbors

• KNN uses the standard Euclidean distance to define nearest neighbors. Given two examples x_i and x_j:

d(x_i, x_j) = \sqrt{\sum_{k=1}^{d} (x_{ik} - x_{jk})^2}
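As a minimal sketch, the distance above can be computed directly from its definition (the function name `euclidean_distance` is illustrative, not from the slides):

```python
import math

def euclidean_distance(xi, xj):
    """Standard Euclidean distance between two points in R^d:
    sqrt of the sum over k of (x_ik - x_jk)^2."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

# A 3-4-5 right triangle: the distance from (0, 0) to (3, 4) is 5.
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
```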
K-nearest neighbors

Training algorithm:
Add each training example (x, y) to the dataset D, where x ∈ R^d, y ∈ {+1, −1}.

Classification algorithm:
Given an example x_q to be classified, let N_k(x_q) be the set of the K nearest neighbors of x_q. Then

\hat{y}_q = \mathrm{sign}\left( \sum_{x_i \in N_k(x_q)} y_i \right)
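The two algorithms above can be sketched in a few lines of Python. The helper names `train` and `classify` are mine, and ties (a vote summing to 0, possible when K is even) are broken toward +1 as an arbitrary choice the slides do not specify:

```python
import math

def euclidean(xi, xj):
    """Euclidean distance between two points in R^d."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def train(D, x, y):
    """Training: simply memorize the labeled example (x, y) in the dataset D."""
    D.append((x, y))

def classify(D, xq, k):
    """Classification: label x_q by the sign of the summed labels
    of its k nearest neighbors in D."""
    neighbors = sorted(D, key=lambda ex: euclidean(ex[0], xq))[:k]
    vote = sum(y for _, y in neighbors)
    return 1 if vote >= 0 else -1  # arbitrary tie-break toward +1

# Tiny 2-D example: two clusters labeled -1 and +1.
D = []
for x, y in [([0, 0], -1), ([1, 0], -1), ([5, 5], 1), ([6, 5], 1), ([5, 6], 1)]:
    train(D, x, y)
print(classify(D, [0, 1], k=3))      # -1 (nearest to the first cluster)
print(classify(D, [5.5, 5.5], k=3))  # 1
```

Note that `classify` scans the whole dataset for every query, which is exactly the O(n × d) cost discussed in the cons below.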
K-nearest neighbors

[Figure: 3-NN classification example. Credit: Introduction to Statistical Learning.]

Question: Draw an approximate decision boundary for K = 3.

K-nearest neighbors

[Figure. Credit: Introduction to Statistical Learning.]

K-nearest neighbors

Question: What are the pros and cons of K-NN?

Pros:
+ Simple to implement.
+ Works well in practice.
+ Does not require building a model, making assumptions, or tuning parameters.
+ Can be extended easily with new examples.

Cons:
- Requires large space to store the entire training dataset.
- Slow: given n examples and d features, classifying a query takes O(n × d) time.
- Suffers from the curse of dimensionality.
Applications of K-NN

1. Information retrieval.

2. Handwritten character classification using nearest neighbors in large databases.

3. Recommender systems (users like you may like similar movies).

4. Breast cancer diagnosis.

5. Medical data mining (similar patient symptoms).

6. Pattern recognition in general.

Training and Testing

[Figure: Training set → ML Algorithm → Model f(x). The model takes inputs (income, gender, age, family status, zipcode) and outputs a credit amount $ and a credit yes/no decision.]

Question: How can we be confident about f?


Training and Testing
• We calculate E train the in-sample error (training error or em-
pirical error/risk).

n
E train(f ) =
X
`oss(yi, f (xi))
i=1
