Lecture_07_slides
k-Nearest Neighbour
Outline
Recap: Bayes rule with discrete features

P(y = c | x) = P(x | y = c) P(y = c) / P(x)

Prior probability: P(y = c)
Generative model (Naive Bayes assumption): P(x | y = c) = ∏_{i=1}^d P(x_i | y = c)
Why is this helpful? Each factor P(x_i | y = c) is easy to estimate from data.

Spam example: x_i indicates whether the word "money" appears in the email;
P(x_i = 1 | y = spam) = (# of spam emails containing "money") / (# of spam emails).
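As a small illustration of this estimate (the data below is a made-up toy example, not from the slides):

```python
# Minimal sketch: estimate P(word "money" appears | email is spam) by counting.
emails = [
    ("win money now", "spam"),
    ("money back guarantee", "spam"),
    ("meeting at noon", "ham"),
    ("send money please", "spam"),
    ("lunch tomorrow?", "ham"),
]

spam_emails = [text for text, label in emails if label == "spam"]
n_spam_with_money = sum("money" in text.split() for text in spam_emails)

# P(x_i = 1 | y = spam) = (# spam emails containing "money") / (# spam emails)
p_money_given_spam = n_spam_with_money / len(spam_emails)
print(p_money_given_spam)  # 1.0 for this toy data
```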
Bayes rule with continuous features, x ∈ R^d

P(y = c | x) = f_{x|y=c}(x) P(y = c) / f_x(x)

f_x: probability density function of x, a function f_x: R^d → R.
Note: Pr(x = v) = 0 for any single point v ∈ R^d,
but Pr(x ∈ D) = ∫_D f_x(x) dx for a region D ⊆ R^d.
Recall: ∫_{R^d} f_x(x) dx = 1 and f_x(x) ≥ 0.
Naive Bayes assumption
Continuous features
P(y = c | x) = f_{x|y=c}(x) P(y = c) / f_x(x)

f_{x|y=c}: conditional probability density of x given the class c.
Naive Bayes assumption: f_{x|y=c}(x) = ∏_{j=1}^d f_{x_j|y=c}(x_j),
i.e. the features are conditionally independent given the class c.
Gaussian Naive Bayes
Assumes that the generative model for each feature follows a Gaussian distribution:

f_{x_j|y=c}(x_j) = 1 / sqrt(2π σ²_{j,c}) · exp( −(x_j − μ_{j,c})² / (2σ²_{j,c}) )

μ_{j,c}: mean, σ²_{j,c}: variance, i.e. x_j | y = c ~ N(μ_{j,c}, σ²_{j,c}).
Gaussian Naive Bayes
Estimating the conditional Gaussian distribution for each feature
using sample data
Training data: {(x_i, y_i)}, i = 1, ..., N, with y_i ∈ {1, 2, ..., K}.

μ_{j,c} = (1 / |I_c|) Σ_{i ∈ I_c} x_{i,j}
σ²_{j,c} = (1 / |I_c|) Σ_{i ∈ I_c} (x_{i,j} − μ_{j,c})²

where I_c is the set of indices of the data points corresponding to class c.
Note: the unbiased variance estimate divides by |I_c| − 1 instead of |I_c|.
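A minimal NumPy sketch of these estimates (the function and variable names are my own, not from the slides):

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate per-class priors, feature means and variances.

    X: (N, d) array of features, y: (N,) array of class labels.
    Returns dicts mapping class c -> prior, mean vector, variance vector.
    """
    priors, means, variances = {}, {}, {}
    N = len(y)
    for c in np.unique(y):
        Xc = X[y == c]                      # rows whose indices are in I_c
        priors[c] = len(Xc) / N             # P(y = c)
        means[c] = Xc.mean(axis=0)          # mu_{j,c} for every feature j
        variances[c] = Xc.var(axis=0)       # sigma^2_{j,c}, divides by |I_c|
        # use Xc.var(axis=0, ddof=1) for the unbiased estimate instead
    return priors, means, variances
```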
Gaussian Naive Bayes
Classifier prediction
P(y = c | x) = f_{x|y=c}(x) P(y = c) / f_x(x)

To predict the label y for a data point x, we need to compare P(y = c | x) for every c ∈ {1, 2, ..., K}.
The denominator f_x(x) is the same for any c ∈ {1, ..., K},
so it is sufficient to compare the numerators f_{x|y=c}(x) P(y = c).
Gaussian Naive Bayes
Naive Bayes assumption and final classifier
P(y = c | x) = f_{x_1|y=c}(x_1) f_{x_2|y=c}(x_2) … f_{x_d|y=c}(x_d) P(y = c) / f_x(x)

Naive Bayes assumption: f_{x|y=c}(x) = ∏_{j=1}^d f_{x_j|y=c}(x_j).
From the data we computed f_{x_j|y=c}(x_j) = N(μ_{j,c}, σ²_{j,c}).
Since the densities f_{x_j|y=c}(x_j) can be small, take the logarithm of the product above:

ŷ = argmax_c [ log P(y = c) + Σ_{j=1}^d log f_{x_j|y=c}(x_j) ]
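Continuing the sketch above with the same assumed names, prediction compares log P(y = c) + Σ_j log f_{x_j|y=c}(x_j) across the classes:

```python
import numpy as np

def predict_gaussian_nb(x, priors, means, variances):
    """Return the class maximizing log P(y=c) + sum_j log f_{x_j|y=c}(x_j)."""
    best_class, best_score = None, -np.inf
    for c in priors:
        mu, var = means[c], variances[c]
        # log of the Gaussian density, summed over the d features
        log_likelihood = np.sum(
            -0.5 * np.log(2 * np.pi * var) - (x - mu) ** 2 / (2 * var)
        )
        score = np.log(priors[c]) + log_likelihood
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```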
Gaussian Naive Bayes
Summary
▪ Probabilistic classifier.
▪ Assumption: features are conditionally independent given the class; hard to verify in practice.
▪ Training is easy.
• 2) Show that, conditioned on being stopped, the probability of being arrested is independent of being Black.

Group    Population    Number arrested    Number stopped
Black    1.8 × 10^6    10 × 10^3          5 × 10^5
White    2.7 × 10^6    2 × 10^3           1 × 10^5
Define the events:
A: being Black
B: being arrested
C: being stopped
① P(A | B) = (10 × 10^3) / (12 × 10^3) = 5/6        (# Black arrested / # arrested)
   P(A) = (1.8 × 10^6) / (4.5 × 10^6) = 0.4
   (Recall: P(A | B) = P(A) if and only if A and B are independent.)
   Since P(A | B) ≠ P(A), A and B are not independent.
② Conditional independence given C: is P(A, B | C) = P(A | C) P(B | C)?
   P(A | C) = (5 × 10^5) / (6 × 10^5) = 5/6
   P(B | C) = (12 × 10^3) / (6 × 10^5) = 1/50
   P(A, B | C) = (10 × 10^3) / (6 × 10^5) = 1/60
   Indeed (5/6) · (1/50) = 1/60, so P(A, B | C) = P(A | C) P(B | C):
   given C (being stopped), A and B are conditionally independent.
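The arithmetic can be re-checked in a few lines of Python (numbers taken directly from the table above):

```python
# Counts from the table (Black and White rows combined where needed)
n_total      = 1.8e6 + 2.7e6   # 4.5e6 people
n_arrested   = 10e3 + 2e3      # 12e3 arrested
n_stopped    = 5e5 + 1e5       # 6e5 stopped
n_black      = 1.8e6
n_black_arr  = 10e3
n_black_stop = 5e5

# Unconditionally: P(A|B) != P(A), so A and B are not independent
print(n_black_arr / n_arrested)   # P(A|B) = 5/6 ≈ 0.833
print(n_black / n_total)          # P(A)   = 0.4

# Conditioned on being stopped: P(A,B|C) equals P(A|C) * P(B|C)
p_A_given_C  = n_black_stop / n_stopped   # 5/6
p_B_given_C  = n_arrested / n_stopped     # 1/50
p_AB_given_C = n_black_arr / n_stopped    # 1/60
print(abs(p_AB_given_C - p_A_given_C * p_B_given_C) < 1e-12)  # True
```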
k-Nearest Neighbour
k-NN Problem setup
Supervised machine learning: data consists of features and labels.
Recall terminology:
Features: input variables
Label: what we are predicting
k-NN Abstraction for classification
Minkowski (L_p) distance: D_p(x, x') = ( Σ_{j=1}^d |x_j − x'_j|^p )^{1/p},  p = 1, 2, 3, ...
p = 1: Manhattan (L1) distance
p = 2: Euclidean (L2) distance

For binary vectors x, x' ∈ {0, 1}^d, the Hamming distance is the number of positions in which the vectors differ.
Example: x = (1, 0, 0, 0, 1, ..., 0) and x' = (0, 1, 0, 0, 1, ..., 0) differ in two positions, so their Hamming distance is 2.
k-NN Distance Metric
How to measure the proximity between data points? → Measure distance
Euclidean (L2) distance: D(x, x') = sqrt( Σ_{j=1}^d (x_j − x'_j)² )
Manhattan (L1) distance: D(x, x') = Σ_{j=1}^d |x_j − x'_j|
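A minimal sketch of these distances in NumPy (the function names are my own):

```python
import numpy as np

def euclidean(x, xp):
    """L2 distance: sqrt(sum_j (x_j - x'_j)^2)."""
    return np.sqrt(np.sum((x - xp) ** 2))

def manhattan(x, xp):
    """L1 distance: sum_j |x_j - x'_j|."""
    return np.sum(np.abs(x - xp))

def minkowski(x, xp, p):
    """L_p distance: (sum_j |x_j - x'_j|^p)^(1/p)."""
    return np.sum(np.abs(x - xp) ** p) ** (1.0 / p)

def hamming(x, xp):
    """Number of positions in which two binary vectors differ."""
    return int(np.sum(x != xp))
```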
k-NN Feature scaling
Normalize the features so that features measured on different scales contribute comparably to the distance.
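One possible way to do this, as a sketch (the slide does not fix a particular scheme): z-score each feature using statistics computed on the training set only.

```python
import numpy as np

def standardize(X_train, X_test):
    """Rescale each feature to zero mean and unit variance (training statistics)."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0   # avoid division by zero for constant features
    return (X_train - mean) / std, (X_test - mean) / std
```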
k-NN
Visualisation
When K = 1: take the label of the nearest neighbour.
[Figure: decision boundaries for K = 1, K = 3, K = 9]
Implementation
Given a test point x^test, we want to decide its label y^test.

For each test sample x^test:
▪ compute the distance D(x^test, x_i) to every training point (x_i, y_i), i = 1, ..., N
▪ find the k training points with the smallest distances
▪ predict the most common label among these k neighbours (sketched in code below)
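A minimal sketch of this procedure, assuming Euclidean distance and NumPy arrays for the training data (the names are my own):

```python
import numpy as np
from collections import Counter

def knn_predict(x_test, X_train, y_train, k):
    """Predict the label of x_test by majority vote among its k nearest neighbours."""
    # distance D(x_test, x_i) to every training point, i = 1, ..., N
    dists = np.sqrt(np.sum((X_train - x_test) ** 2, axis=1))
    # indices of the k smallest distances
    nearest = np.argsort(dists)[:k]
    # most common label among the k neighbours
    return Counter(y_train[nearest]).most_common(1)[0][0]
```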
Which distance metric and which value of k should we use? Very problem-dependent.
Must try them all out and see what works best.
k-NN
Setting hyperparameters
The best k can differ from dataset to dataset: use 5-fold cross-validation to choose the value of k.
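As a sketch of how this could be done with scikit-learn (a library choice of mine, not mentioned in the slides):

```python
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def choose_k(X_train, y_train, candidates=(1, 3, 5, 9, 15)):
    """Return the k with the highest mean 5-fold cross-validation accuracy."""
    scores = {}
    for k in candidates:
        clf = KNeighborsClassifier(n_neighbors=k)
        scores[k] = cross_val_score(clf, X_train, y_train, cv=5).mean()
    return max(scores, key=scores.get)
```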
① k-NN does not work so well in high dimensions.
② Manhattan distance works better in high dimensions.
[Figure: distribution of all pairwise distances between randomly distributed points within d-dimensional unit squares]
References:
Theory: Aggarwal et al. 2001, On the Surprising Behavior of Distance Metrics in High Dimensional Space
More intuition: StackExchange article
k-NN
Summary
Advantages:
▪ Easy to implement
▪ No training required (tuning the hyperparameter k is the only "training")
▪ New data can be added seamlessly
▪ Versatile - useful for regression and classification

Disadvantages:
▪ Does not work as well in high dimensions
▪ Sensitive to noisy data and skewed class distribution
▪ Requires high memory
▪ Prediction stage is slow with large data, requires comparison with all samples in dataset
ML
Supervised: linear regression, logistic regression, Naive Bayes, k-NN, decision trees, neural networks
Unsupervised: clustering, dimensionality reduction
Reinforcement learning