Machine Learning - Classification
CS102, Winter 2019
Big Data Tools and Techniques
§ Basic Data Manipulation and Analysis
Performing well-defined computations or asking
well-defined questions (“queries”)
§ Data Mining
Looking for patterns in data
§ Machine Learning
Using data to build models and make predictions
§ Data Visualization
Graphical depiction of data
§ Data Collection and Preparation
Regression
Using data to build models and make predictions
§ Supervised
§ Training data, each example:
• Set of predictor values - “independent variables”
• Numerical output value - “dependent variable”
§ Model is function from predictors to output
• Use model to predict output value for new
predictor values
§ Example
• Predictors: mother height, father height, current age
• Output: height
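As a concrete illustration of this setup, here is a minimal regression sketch in Python with scikit-learn; the column order and all numbers are invented for illustration, not real training data:

# Sketch: regression model from predictor values to a numerical output.
# Training rows are [mother_height_cm, father_height_cm, current_age];
# the output is height_cm. All numbers are made up for illustration.
from sklearn.linear_model import LinearRegression

X_train = [[160, 175, 30],
           [155, 180, 25],
           [170, 182, 40],
           [162, 170, 35]]
y_train = [172, 168, 181, 169]

model = LinearRegression().fit(X_train, y_train)
print(model.predict([[165, 178, 28]]))  # predicted height for new predictor values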
Classification
Using data to build models and make predictions
§ Supervised
§ Training data, each example:
• Set of feature values – numeric or categorical (“independent variables”)
• Categorical label (“dependent variable”)
§ Model is “function” (method) from feature values to label
• Use model to predict label for new feature values
§ Example
• Feature values: age, gender, income, profession
• Label: buyer or non-buyer
Other Examples
Medical diagnosis
• Feature values: age, gender, history,
symptom1-severity, symptom2-severity,
test-result1, test-result2
• Label: disease
Email spam detection
• Feature values: sender-domain, length,
#images, keyword1, keyword2, …, keywordN
• Label: spam or not-spam
Credit card fraud detection
• Feature values: user, location, item, price
• Label: fraud or okay
Algorithms for Classification
Despite similarity of problem statement to
regression, non-numerical nature of classification
leads to completely different approaches
§ K-nearest neighbors
§ Decision trees
§ Naïve Bayes
§ … and others
K-Nearest Neighbors (KNN)
For any pair of data items i1 and i2, from their
feature values compute distance(i1,i2)
Example:
Features - gender, profession, age, income, postal-code
person1 = (male, teacher, 47, $25K, 94305)
person2 = (female, teacher, 43, $28K, 94309)
distance(person1, person2)
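The slides don't fix a particular distance for mixed categorical and numeric features; one simple possibility (an assumption for illustration, including the scaling constants) is:

# Sketch of one possible distance for mixed features: categorical
# mismatches contribute 1, numeric features contribute a scaled
# absolute difference. The scaling choices here are arbitrary.
def distance(p1, p2):
    gender1, prof1, age1, income1, postal1 = p1
    gender2, prof2, age2, income2, postal2 = p2
    d = 0.0
    d += 0 if gender1 == gender2 else 1
    d += 0 if prof1 == prof2 else 1
    d += abs(age1 - age2) / 100.0           # ages assumed to span roughly 100 years
    d += abs(income1 - income2) / 100000.0  # incomes assumed to span roughly $100K
    d += 0 if postal1 == postal2 else 1     # postal code treated as categorical
    return d

person1 = ("male", "teacher", 47, 25000, "94305")
person2 = ("female", "teacher", 43, 28000, "94309")
print(distance(person1, person2))  # 1 (gender) + 0.04 (age) + 0.03 (income) + 1 (postal) = 2.07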
K-Nearest Neighbors (KNN)
Features - gender, profession, age, income, postal-code
person1 = (male, teacher, 47, $25K, 94305) buyer
person2 = (female, teacher, 43, $28K, 94309) non-buyer
Remember training data has labels
To classify a new item i : In the labeled data find
the K closest items to i, assign most frequent label
person3 = (female, doctor, 40, $40K, 95123)
KNN Example
§ City temperatures – France and Germany
§ Features: longitude, latitude
§ Distance is Euclidean distance
distance([o1,a1],[o2,a2]) = sqrt((o1−o2)² + (a1−a2)²)
= actual distance in x-y plane
§ Labels: frigid, cold, cool, warm, hot
[Figure: French and German cities plotted by longitude and latitude with their temperature labels]
KNN Summary
To classify a new item i : find K closest items to i
in the labeled data, assign most frequent label
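A minimal KNN sketch in Python with scikit-learn, in the spirit of the city-temperature example; the coordinates and labels below are invented placeholders, not the actual dataset:

# Sketch: KNN on (longitude, latitude) features with Euclidean distance.
# Training points and labels are made up for illustration.
from sklearn.neighbors import KNeighborsClassifier

X_train = [[2.35, 48.85], [5.37, 43.30], [13.40, 52.52], [11.58, 48.14]]
y_train = ["cool", "warm", "cold", "cool"]

knn = KNeighborsClassifier(n_neighbors=3)  # K = 3
knn.fit(X_train, y_train)
print(knn.predict([[7.75, 48.58]]))  # most frequent label among the 3 closest points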
“Regression” Using KNN
Features - gender, profession, age, income, postal-code
person1 = (male, teacher, 47, $25K, 94305) $250
person2 = (female, teacher, 43, $28K, 94309) $100
Remember training data has labels – here numeric values rather than categories
To predict a value for a new item i, find K closest items to i
in the labeled data, assign the average value of their labels
person3 = (female, doctor, 40, $40K, 95123)
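A matching sketch for regression using KNN, again with scikit-learn and made-up numbers; the prediction is the average label value of the K nearest items:

# Sketch: KNN "regression" - predict a numeric value (e.g., dollars spent)
# as the average over the K nearest neighbors. Data is made up.
from sklearn.neighbors import KNeighborsRegressor

X_train = [[47, 25000], [43, 28000], [50, 60000], [35, 45000]]  # [age, income]
y_train = [250, 100, 400, 180]                                  # numeric "labels"

knn_reg = KNeighborsRegressor(n_neighbors=2)  # K = 2
knn_reg.fit(X_train, y_train)
print(knn_reg.predict([[40, 40000]]))  # average value of the 2 closest items' labels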
Regression Using KNN - Example
[Figure omitted]
Decision Trees
§ Use the training data to construct a decision tree
§ Use the decision tree to classify new data
Decision Trees
Nodes: features (with apologies for binary gender)
Edges: feature values
Leaves: labels
[Example tree: Gender at the root; the male branch continues with Age (<20, 20-50, >50) and the female branch with Income (<$100K, ≥$100K); some branches test further features such as Profession or Postal Code; leaves carry the labels Buyer and Non-Buyer]
Decision Trees
Primary challenge is building good decision
trees from training data
• Which features and feature values to use
at each choice point
• HUGE number of possible trees even with
small number of features and values
Common approach: “forest” of many trees,
combine the results
• Still impossible to consider all trees
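A hedged sketch of training a single tree and a forest with scikit-learn; the feature encoding and data are hypothetical:

# Sketch: fit one decision tree and a random forest ("forest" of many trees
# whose results are combined). Rows are [gender_encoded, age, income];
# all values are made up for illustration.
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X_train = [[0, 25, 40000], [1, 45, 120000], [0, 60, 80000], [1, 30, 30000]]
y_train = ["non-buyer", "buyer", "buyer", "non-buyer"]

tree = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)
print(tree.predict([[0, 50, 110000]]))
print(forest.predict([[0, 50, 110000]]))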
Naïve Bayes
Given new data item i, based on i’s feature values
and the training data, compute the probability of
each possible label. Pick highest one.
Efficiency relies on conditional independence
assumption:
Given any two features F1,F2 and a label L, the
probability that F1=v1 for an item with label L is
independent of the probability that F2=v2 for that
item
Examples:
gender and age? income and postal code?
The conditional independence assumption often doesn't hold, which is why the approach is “naïve”. Nevertheless, the approach works very well in practice.
Naïve Bayes Example
Predict temperature category for a country based
on whether the country has a coastline and
whether it is in the EU
Naïve Bayes Preparation
Step 1: Compute fraction (probability) of items in
each category
cold .18
cool .38
warm .24
hot .20
Naïve Bayes Preparation
Step 2: For each category and each feature value, compute
the fraction of items in that category having that value
cold (.18): coastline=yes .83, coastline=no .17, EU=yes .67, EU=no .33
cool (.38): coastline=yes .69, coastline=no .31, EU=yes .77, EU=no .23
warm (.24): coastline=yes .5, coastline=no .5, EU=yes .5, EU=no .5
hot (.20): coastline=yes 1.0, coastline=no .0, EU=yes .71, EU=no .29
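A small sketch of both preparation steps and the prediction rule in Python with pandas; the tiny table and the column names ("temp", "coastline", "EU") are assumptions for illustration, not the actual country data:

# Sketch: Naive Bayes preparation and prediction from a labeled table.
# The DataFrame below is a tiny made-up stand-in for the country data.
import pandas as pd

df = pd.DataFrame({
    "temp":      ["cold", "cool", "cool", "warm", "hot"],
    "coastline": ["no",   "yes",  "yes",  "no",   "yes"],
    "EU":        ["yes",  "yes",  "no",   "no",   "yes"],
})

def predict_temp(data, coastline, eu):
    scores = {}
    for label, group in data.groupby("temp"):
        prior = len(group) / len(data)                      # Step 1: fraction per category
        p_coast = (group["coastline"] == coastline).mean()  # Step 2: fraction with this coastline value
        p_eu = (group["EU"] == eu).mean()                   # Step 2: fraction with this EU value
        scores[label] = prior * p_coast * p_eu              # product of probabilities
    return max(scores, key=scores.get), scores

print(predict_temp(df, coastline="yes", eu="yes"))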
Naïve Bayes Prediction
New item: France, coastline=yes, EU=yes
For each category: probability of category times
product of probabilities of new item’s features
in that category. Pick highest.
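Working this out from the Step 1 and Step 2 fractions above (values rounded):
cold: .18 × .83 × .67 ≈ .10
cool: .38 × .69 × .77 ≈ .20
warm: .24 × .5 × .5 = .06
hot: .20 × 1.0 × .71 ≈ .14
Highest product: cool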
Naïve Bayes Prediction
New item: Serbia, coastline=no, EU=no
For each category: probability of category times
product of probabilities of new item’s features
in that category. Pick highest.
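Working this out from the Step 1 and Step 2 fractions above (values rounded):
cold: .18 × .17 × .33 ≈ .01
cool: .38 × .31 × .23 ≈ .03
warm: .24 × .5 × .5 = .06
hot: .20 × .0 × .29 = .0
Highest product: warm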
Naïve Bayes Prediction
New item: Austria, coastline=no, EU=yes
For each category: probability of category times
product of probabilities of new item’s features
in that category. Pick highest.
category   prob.   coastline=no   EU=yes   product
cold       .18     .17            .67      .02
cool       .38     .31            .77      .09
warm       .24     .5             .5       .06
hot        .20     .0             .71      .0
Highest product: cool

Note: many presentations of Naïve Bayes include an additional normalization step so the final products are probabilities that sum to 1.0. The choice of label is unchanged, so we've omitted that step for simplicity.
Feature Selection
Real applications often have thousands of features
§ Naïve Bayes typically uses only some of the
features, those most affecting the label
§ Decision trees also rely on choosing features
that most affect the label
§ Feature selection is a key part of machine
learning – an art and a science
Training and Test
You've created a machine learning model from training data.
How do you know whether it's a good model?
§ Try it on known data
[Diagram: a held-out table of feature values and labels, labeled “Test Data”]
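A minimal sketch of this train/test split in Python with scikit-learn; the data and the choice of KNN as the model are made up for illustration:

# Sketch: hold out part of the labeled data as "test data", fit on the rest,
# and measure accuracy on the held-out part. All values are made up.
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X = [[2.35, 48.85], [5.37, 43.30], [13.40, 52.52], [11.58, 48.14],
     [7.75, 48.58], [3.88, 43.61], [9.99, 53.55], [4.83, 45.76]]
y = ["cool", "warm", "cold", "cool", "cool", "warm", "cold", "cool"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
print(accuracy_score(y_test, model.predict(X_test)))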
Other Terms You Might Hear
Logistic regression
• Recall regression model is function f from
predictor values to numeric output value
• For classification: from training data obtain one
regression function fL for each label L
fL(feature-values) = probability of item having label L
Support Vector Machine
• Two labels only (“binary classifier”)
• Features = multidimensional space
• From training data SVM finds
hyper-plane that best divides
space according to labels
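A brief sketch of both in scikit-learn on made-up two-label data; predict_proba gives the per-label probability described above:

# Sketch: logistic regression (per-label probabilities) and a linear SVM
# (separating hyperplane) on the same made-up buyer/non-buyer data.
# Rows are [age, income in $K]; all values are for illustration only.
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X_train = [[25, 30], [45, 120], [60, 80], [30, 35], [50, 90]]
y_train = ["non-buyer", "buyer", "buyer", "non-buyer", "buyer"]

logreg = LogisticRegression().fit(X_train, y_train)
print(logreg.predict_proba([[40, 70]]))  # probability of each label

svm = SVC(kernel="linear").fit(X_train, y_train)
print(svm.predict([[40, 70]]))  # which side of the hyperplane the item falls on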
Other Terms You Might Hear
Deep Learning
• Complex, mysterious (the ultimate “black box”
software), becoming extremely popular
• Multiple layers, each layer uses classification
techniques to reduce complexity for next layer
and further classification
• Important plus: identifies features from raw data
Neural Network
• Precursor to deep learning, typically two layers
• Leap to deep learning enabled by massive
amounts of data, powerful computing
Classification Summary
§ Supervised machine learning
§ Training data, each example:
• Set of feature values – numeric or categorical
• Categorical output value – label
§ Model is “function” from feature values to label
• Use model to predict label for new feature values
§ Approaches we covered
• K-nearest neighbors – relies on distance (or
similarity) function
• Decision trees – relies on finding good trees/forests
• Naïve Bayes – relies on conditional independence
assumption