Notes Class 4 Annotated Presented

The document outlines the topics covered in a classification course, including announcements about homework deadlines and algorithm categories such as regression and classification. It discusses various algorithms like K-NN and logistic regression, along with concepts like false alarm rates and the Base Rate Fallacy. The document emphasizes the importance of understanding these concepts in the context of data interpretation and classification tasks.

DS502/MA543

Classification, Part 1

Prof. Randy Paffenroth


Worcester Polytechnic Institute
Announcements

HW 1 due today!

HW 2 out and due in two weeks on 10/2.
– Let’s take a look at it.
Announcements

Any teammate issues?
Announcements

How was class last week?
Any questions from last time?
Categories of algorithms
Regression Classification

Dimension Reduction Clustering


Demo 1 notes
See it in R.

Fit
Predict
Training and testing

KNN
LDA: all classes have Gaussians with the same covariance
QDA: classes have different Gaussians
Logistic Regression

Error: TP, FP, FN, TN to see how we did!
Notes 1a
False alarm rates

                              Predict “is green”     Predict “is not green”

Truly “is green”              True Positive (TP)     False Negative (FN)
(null hypothesis is false)                           Type 2 errors

Truly “is not green”          False Positive (FP)    True Negative (TN)
(null hypothesis is true)     Type 1 errors
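
The four cells of the table can be counted directly from true and predicted labels. The sketch below is in Python rather than the R used in the course demos; the function names and the labels are made up for illustration, and the false alarm rate is taken to be the usual false positive rate FP / (FP + TN).

```python
def confusion_counts(y_true, y_pred, positive="green"):
    """Count TP, FN, FP, TN for a binary problem; `positive` names the positive class."""
    tp = fn = fp = tn = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == positive:
            if guess == positive:
                tp += 1  # truly green, predicted green
            else:
                fn += 1  # truly green, predicted not green (Type 2 error)
        else:
            if guess == positive:
                fp += 1  # truly not green, predicted green (Type 1 error)
            else:
                tn += 1  # truly not green, predicted not green
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


def false_alarm_rate(counts):
    """False alarm (false positive) rate: FP / (FP + TN)."""
    negatives = counts["FP"] + counts["TN"]
    return counts["FP"] / negatives if negatives else float("nan")


# Made-up labels, purely for illustration.
y_true = ["green", "green", "red", "red", "red", "green"]
y_pred = ["green", "red", "green", "red", "red", "green"]
counts = confusion_counts(y_true, y_pred)
rate = false_alarm_rate(counts)
print(counts, rate)
```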
Notes 1
Bayes classifier
See it in R. Demo 2 and Demo 3
Notes 1
K-NN
See it in R. Demo 4
Normalize your data!!
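
One way to see why normalization matters for K-NN: Euclidean distance is dominated by whichever feature has the largest units. The sketch below is in Python rather than the R of Demo 4; it uses z-score scaling, which is one common choice and not necessarily the one used in the demo. The data and all function names are made up for illustration.

```python
import math
from collections import Counter


def zscore_fit(rows):
    """Per-feature mean and standard deviation computed from the training rows."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    stds = []
    for j in range(dims):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n
        stds.append(math.sqrt(var) or 1.0)  # fall back to 1.0 for constant features
    return means, stds


def zscore_apply(row, means, stds):
    """Scale one row with the training means and standard deviations."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points nearest to `query` (Euclidean)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Made-up data: feature 1 separates the classes, feature 2 is large-scale noise.
train_x = [[1.0, 1000.0], [1.2, 5000.0], [3.0, 1200.0], [3.2, 5200.0]]
train_y = ["a", "a", "b", "b"]
query = [1.1, 5150.0]

raw_pred = knn_predict(train_x, train_y, query, k=1)

means, stds = zscore_fit(train_x)
scaled_train = [zscore_apply(r, means, stds) for r in train_x]
scaled_pred = knn_predict(scaled_train, train_y, zscore_apply(query, means, stds), k=1)
print(raw_pred, scaled_pred)
```

On the raw features the nearest neighbor is decided almost entirely by the second column; after scaling, both columns contribute comparably.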
K-NN

How to pick K?
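
One common answer, offered here as an assumption rather than as the method from the lecture, is to try several values of K and keep the one with the lowest held-out error. The Python sketch below uses leave-one-out error on made-up 1-D data; the function names are illustrative.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points nearest to `query` (Euclidean)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loo_error(xs, ys, k):
    """Leave-one-out error rate: predict each point from all of the others."""
    mistakes = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        if knn_predict(rest_x, rest_y, xs[i], k) != ys[i]:
            mistakes += 1
    return mistakes / len(xs)


# Made-up data: two clusters, each containing one point with the "wrong" label.
xs = [[0.0], [0.5], [1.0], [1.5], [2.0], [5.0], [5.5], [6.0], [6.5], [7.0]]
ys = ["a", "a", "b", "a", "a", "b", "b", "a", "b", "b"]

# Odd values of K avoid tied votes in a two-class problem.
errors = {k: loo_error(xs, ys, k) for k in (1, 3, 5)}
best_k = min(errors, key=errors.get)
print(errors, best_k)
```

With K = 1 the classifier chases the noisy labels; a somewhat larger K smooths them out, at the cost of a less flexible decision boundary.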
K-NN

Consistency!
Curse of dimensionality!
Logistic regression
Notes 2-6
Logistic regression
Logistic regression in R Demo 5
Notes 2-6
Logistic regression: Derivation
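
The derivation itself lives in the handwritten notes. For orientation only, the standard single-feature logistic model and its likelihood are written out below; the symbols $\beta_0$, $\beta_1$, $x_i$, and $y_i \in \{0, 1\}$ are notation assumed here and may differ from the notes.

```latex
p(x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}},
\qquad
\log\frac{p(x)}{1 - p(x)} = \beta_0 + \beta_1 x,
\qquad
\ell(\beta_0, \beta_1) = \prod_{i=1}^{n} p(x_i)^{y_i} \bigl(1 - p(x_i)\bigr)^{1 - y_i}
```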
Demo 5

Logistic regression in R: Back to code


Be careful:
Logistic regression on separable data
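
A standard reason for caution: when the classes are perfectly separable, the likelihood keeps improving as the coefficients grow, so there is no finite maximum likelihood estimate. The Python sketch below (not the R code of Demo 5) fits a one-feature model by plain gradient ascent on made-up data; the function names, learning rate, and step counts are illustrative assumptions.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, steps, lr=0.5):
    """Gradient ascent on the log-likelihood of p(y=1|x) = sigmoid(b0 + b1*x)."""
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            resid = y - sigmoid(b0 + b1 * x)
            g0 += resid        # gradient with respect to the intercept
            g1 += resid * x    # gradient with respect to the slope
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1


# Made-up, perfectly separable data: every x < 0 is class 0, every x > 0 is class 1.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

# The fitted slope after more and more iterations.
slopes = [fit_logistic(xs, ys, steps)[1] for steps in (10, 100, 1000)]
print(slopes)
```

The slope never settles: each additional batch of iterations pushes it higher, which is the numerical symptom of the missing finite optimum.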
Logistic regression: Multiple dimensions
Review: Conditional probabilities

"Seven 5732852" by Niklas Morberg - Seven. Licensed under Creative Commons Attribution-Share Alike 2.0 via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Seven_5732852.jpg#mediaviewer/File:Seven_5732852.jpg
Review: Bayes Theorem
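
For reference, Bayes' theorem in its standard form, with the denominator expanded by the law of total probability. The event names $A$ and $B$ are generic notation assumed here.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)},
\qquad
P(B) = P(B \mid A)\, P(A) + P(B \mid \neg A)\, P(\neg A)
```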
Base Rate Fallacy

The Base Rate Fallacy is a very common error
that people make when interpreting data.

It is quite easy to describe (and hopefully
understand).

It does not require very much mathematical
background.

It demonstrates that our intuition can lead us astray.

https://en.wikipedia.org/wiki/Base_rate_fallacy
Base Rate Fallacy

Suppose you have taken a test for a deadly disease.

The doctor tells you that the test is quite accurate, in that, if you have the disease then the test will correctly tell you that you have the disease 100% of the time.

However, if you don't have the disease, the test will very occasionally (say 1 time in 10) mistakenly tell you that you have it.

The test comes back positive (it says you have the disease)! Are you worried!?

In particular, can you estimate the probability that you actually have the disease given that the test came back positive?

Licensed under Public Domain via Wikimedia Commons - https://commons.wikimedia.org/wiki/File:US_Navy_060105-N-8154G-010_A_hospital_corpsman_with_the_Blood_Donor_Team_from_Portsmouth_Naval_Hospital_takes_samples_of_blood_from_a_donor_for_testing.jpg#/media/File:US_Navy_060105-N-8154G-010_A_hospital_corpsman_with_the_Blood_Donor_Team_from_Portsmouth_Naval_Hospital_takes_samples_of_blood_from_a_donor_for_testing.jpg
Base Rate Fallacy

What is your estimate?
A) 99% probability I have the disease
B) 90% probability I have the disease
C) 50% probability I have the disease
D) 10% probability I have the disease
E) I don't know and I am mad at you for asking me!
The importance of asking the right question.
Base Rate Fallacy

By Jgsho (Own work) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons
The importance of asking the right question.
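
The doctor's description gives two numbers (the test is positive 100% of the time when you have the disease, and 1 time in 10 when you don't) but not how common the disease is. The Python sketch below applies Bayes' theorem for several base rates; these base rates are hypothetical values chosen for illustration, since the surviving slide text does not state one, and the function name is made up.

```python
def posterior_given_positive(base_rate, sensitivity=1.0, false_positive_rate=0.1):
    """P(disease | positive test) via Bayes' theorem."""
    true_pos = sensitivity * base_rate                   # P(positive and disease)
    false_pos = false_positive_rate * (1.0 - base_rate)  # P(positive and no disease)
    return true_pos / (true_pos + false_pos)


# Hypothetical base rates (an assumption; not taken from the slides).
for base_rate in (0.5, 0.1, 0.01, 0.001):
    posterior = posterior_given_positive(base_rate)
    print(f"base rate {base_rate}: P(disease | positive) = {posterior:.3f}")
```

The same positive result means very different things depending on the base rate: when the disease is rare, most positives come from the healthy majority.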