Machine Learning Concepts part 4
[Figure: a training set is fed to an ML algorithm, which outputs a model f(x).]
Training and Testing
[Figure: a training set with features (income, gender, age, family status, zipcode) is passed to an ML algorithm, which produces a model f(x) predicting the credit amount and credit yes/no.]
K-nearest neighbors
• Not every ML method builds a model!
The Euclidean distance between two examples:
$$d(x_i, x_j) = \sqrt{\sum_{k=1}^{d} (x_{ik} - x_{jk})^2}$$
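The distance formula above can be sketched directly in Python; the function name is an illustrative choice, not part of the slides.

```python
import math

# A minimal sketch of the Euclidean distance between two feature
# vectors with d components, matching the formula above.
def euclidean_distance(xi, xj):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

# Example: the distance between (0, 0) and (3, 4) is 5.
print(euclidean_distance([0, 0], [3, 4]))  # → 5.0
```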
K-nearest neighbors
Training algorithm:
Add each training example (x, y) to the dataset D.
$x \in \mathbb{R}^d$, $y \in \{+1, -1\}$.
Classification algorithm:
$$\hat{y}_q = \operatorname{sign}\Big(\sum_{x_i \in N_k(x_q)} y_i\Big)$$
where $N_k(x_q)$ is the set of the $k$ training examples nearest to the query point $x_q$.
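The two algorithms above can be sketched together: training just stores examples, and classification takes the sign of the summed labels of the k nearest neighbors. The function and variable names here are illustrative, not from the slides.

```python
import math

# A minimal sketch of k-NN for binary labels y ∈ {+1, -1},
# assuming Euclidean distance and the sign-of-sum rule above.

def train(D, x, y):
    # "Training" just adds the example (x, y) to the dataset D.
    D.append((x, y))

def classify(D, k, xq):
    # Find the k training examples nearest to the query point xq...
    nearest = sorted(D, key=lambda ex: math.dist(ex[0], xq))[:k]
    # ...and predict the sign of the sum of their labels.
    label_sum = sum(y for _, y in nearest)
    return 1 if label_sum >= 0 else -1

D = []
for x, y in [([0.0, 0.0], -1), ([0.1, 0.2], -1),
             ([5.0, 5.0], 1), ([5.2, 4.9], 1)]:
    train(D, x, y)
print(classify(D, 3, [4.8, 5.1]))  # → 1
```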
K-nearest neighbors
Cons:
- Requires large space to store the entire training dataset.
- Slow: given n examples and d features, classifying a single query takes
O(n × d) time.
- Suffers from the curse of dimensionality.
Applications of K-NN
1. Information retrieval.
The training error of a model $f$ over $n$ examples:
$$E_{\text{train}}(f) = \sum_{i=1}^{n} \mathrm{loss}(y_i, f(x_i))$$
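The training error above can be sketched by summing a per-example loss over the training set. The slides leave the loss unspecified, so 0/1 loss is assumed here as an illustration.

```python
# A minimal sketch of training error, assuming 0/1 loss:
# 0 if the prediction matches the label, 1 otherwise.
def zero_one_loss(y, y_hat):
    return 0 if y == y_hat else 1

def train_error(f, examples):
    # Sum the per-example loss over all n training examples.
    return sum(zero_one_loss(y, f(x)) for x, y in examples)

# Example: a constant classifier f(x) = +1 on three labeled points
# misclassifies exactly one of them.
data = [([0.0], 1), ([1.0], -1), ([2.0], 1)]
print(train_error(lambda x: 1, data))  # → 1
```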