Probabilistic Models for Classification
Binary Classification Problem
• N iid training samples: $\{(\mathbf{x}_n, t_n)\}_{n=1}^{N}$
• Class label: $t_n \in \{0, 1\}$
• Feature vector: $\mathbf{x}_n \in \mathbb{R}^D$
Generative models for classification
• Model the joint probability $p(\mathbf{x}, t) = p(t)\, p(\mathbf{x} \mid t)$
Generative Process for Data
• Enables generation of new data points (see the sketch below)
• Repeat N times:
• Sample class $t_n \sim p(t)$
• Sample feature value $\mathbf{x}_n \sim p(\mathbf{x} \mid t_n)$
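A minimal sketch of this ancestral sampling in Python, with hypothetical parameters (an invented prior pi and 2-D Gaussian class conditionals with a shared covariance, matching the case studied below):

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters: prior p(t=1) and Gaussian class conditionals
pi = 0.4
mu = {0: np.array([0.0, 0.0]), 1: np.array([2.0, 1.0])}
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.0]])

def sample(N):
    # Repeat N times: sample the class, then the features given the class
    t = rng.binomial(1, pi, size=N)
    X = np.stack([rng.multivariate_normal(mu[tn], Sigma) for tn in t])
    return X, t

X, t = sample(1000)   # 1000 fresh (x, t) pairs drawn from the model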
Conditional Probability in a Generative Model
• $p(t = 1 \mid \mathbf{x}) = \dfrac{p(\mathbf{x} \mid t=1)\, p(t=1)}{p(\mathbf{x} \mid t=1)\, p(t=1) + p(\mathbf{x} \mid t=0)\, p(t=0)} = \sigma(a)$,
where $a = \ln \dfrac{p(\mathbf{x} \mid t=1)\, p(t=1)}{p(\mathbf{x} \mid t=0)\, p(t=0)}$ is the log-odds
• Logistic function: $\sigma(a) = \dfrac{1}{1 + \exp(-a)}$
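A direct transcription of this identity (the function names and the log-joint arguments are mine, not from the slides):

import numpy as np

def sigmoid(a):
    # Logistic function: 1 / (1 + exp(-a))
    return 1.0 / (1.0 + np.exp(-a))

def posterior_t1(log_joint1, log_joint0):
    # a is the log-odds: ln p(x, t=1) - ln p(x, t=0)
    return sigmoid(log_joint1 - log_joint0)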
Case: Binary classification with Gaussians
• Parameters: prior $p(t = 1) = \pi$; class conditionals $p(\mathbf{x} \mid t = k) = \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma})$ for $k \in \{0, 1\}$
• Note: the covariance parameter $\boldsymbol{\Sigma}$ is shared between the two classes
• $p(t = 1 \mid \mathbf{x}) = \sigma(\mathbf{w}^T \mathbf{x} + w_0)$,
where $\mathbf{w} = \boldsymbol{\Sigma}^{-1} (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0)$ and $w_0 = -\tfrac{1}{2} \boldsymbol{\mu}_1^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_1 + \tfrac{1}{2} \boldsymbol{\mu}_0^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_0 + \ln \dfrac{\pi}{1 - \pi}$
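A sketch of these formulas in Python; gda_posterior and its argument names are mine:

import numpy as np

def gda_posterior(x, mu0, mu1, Sigma, pi):
    # p(t=1|x) = sigma(w^T x + w0) for shared-covariance Gaussians
    Sinv = np.linalg.inv(Sigma)
    w = Sinv @ (mu1 - mu0)
    w0 = (-0.5 * mu1 @ Sinv @ mu1
          + 0.5 * mu0 @ Sinv @ mu0
          + np.log(pi / (1.0 - pi)))
    return 1.0 / (1.0 + np.exp(-(w @ x + w0)))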
Special Cases
• Isotropic shared covariance, $\boldsymbol{\Sigma} = \sigma^2 \mathbf{I}$
• Class boundary: a hyper-plane orthogonal to the line joining the two means
• Arbitrary shared $\boldsymbol{\Sigma}$
• Decision boundary still linear, but not orthogonal to the line joining the two means
MLE for Binary Gaussian
• Formulate the log-likelihood in terms of the parameters $\pi, \boldsymbol{\mu}_0, \boldsymbol{\mu}_1, \boldsymbol{\Sigma}$: $\ln p(\mathbf{t}, \mathbf{X}) = \sum_{n=1}^{N} \big[ t_n \ln \big( \pi\, \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_1, \boldsymbol{\Sigma}) \big) + (1 - t_n) \ln \big( (1 - \pi)\, \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_0, \boldsymbol{\Sigma}) \big) \big]$
• Maximizing gives closed-form estimates (see the sketch below)
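A sketch of the resulting estimators; the helper name is mine, the formulas are the standard closed-form MLE solutions:

import numpy as np

def gda_mle(X, t):
    # X: (N, D) features, t: (N,) integer labels in {0, 1}
    N = len(t)
    N1 = t.sum()
    pi = N1 / N                       # MLE of the prior
    mu1 = X[t == 1].mean(axis=0)      # MLE of the class means
    mu0 = X[t == 0].mean(axis=0)
    d1 = X[t == 1] - mu1              # shared covariance: pooled scatter
    d0 = X[t == 0] - mu0
    Sigma = (d1.T @ d1 + d0.T @ d0) / N
    return pi, mu0, mu1, Sigma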
Case: Gaussian Multi-class Classification
• $K$ classes: $t_n \in \{1, \ldots, K\}$
• Prior: $p(t = k) = \pi_k$
• Class conditional densities: $p(\mathbf{x} \mid t = k) = \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma})$
• Posterior: $p(t = k \mid \mathbf{x}) = \dfrac{\exp(a_k)}{\sum_j \exp(a_j)}$, where $a_k = \ln \big( p(\mathbf{x} \mid t = k)\, p(t = k) \big)$
• Soft-max / normalized exponential function
• For Gaussian class conditionals, $a_k(\mathbf{x}) = \mathbf{w}_k^T \mathbf{x} + w_{k0}$, with $\mathbf{w}_k = \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_k$ and $w_{k0} = -\tfrac{1}{2} \boldsymbol{\mu}_k^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_k + \ln \pi_k$
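A sketch of the soft-max posterior for the Gaussian case; function and argument names are mine:

import numpy as np

def softmax(a):
    # Normalized exponential, shifted for numerical stability
    e = np.exp(a - a.max())
    return e / e.sum()

def multiclass_posterior(x, mus, Sigma, pis):
    # a_k = w_k^T x + w_k0 with w_k = Sigma^{-1} mu_k
    Sinv = np.linalg.inv(Sigma)
    a = np.array([mu @ Sinv @ x - 0.5 * mu @ Sinv @ mu + np.log(p)
                  for mu, p in zip(mus, pis)])
    return softmax(a)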
MLE for Gaussian Multi-class
• Similar to the binary case: $\hat{\pi}_k = N_k / N$, $\hat{\boldsymbol{\mu}}_k = \frac{1}{N_k} \sum_{n : t_n = k} \mathbf{x}_n$, and the shared $\hat{\boldsymbol{\Sigma}}$ is the $N_k$-weighted average of the per-class covariances
Case: Naïve Bayes
• Similar to the Gaussian setting, except that the features are discrete (binary, for simplicity): $x_i \in \{0, 1\}$
• Naïve Bayes assumption: the features are conditionally independent given the class
• Class conditional probability: $p(\mathbf{x} \mid t = k) = \prod_{i=1}^{D} \mu_{ki}^{x_i} (1 - \mu_{ki})^{1 - x_i}$, with $\mu_{ki} = p(x_i = 1 \mid t = k)$
• Posterior probability: $p(t = k \mid \mathbf{x}) = \dfrac{\exp(a_k(\mathbf{x}))}{\sum_j \exp(a_j(\mathbf{x}))}$,
where $a_k(\mathbf{x}) = \sum_{i=1}^{D} \big[ x_i \ln \mu_{ki} + (1 - x_i) \ln (1 - \mu_{ki}) \big] + \ln \pi_k$ is again linear in $\mathbf{x}$
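A sketch of this posterior computation; mu[k, i] stands for $\mu_{ki}$ and all names are mine:

import numpy as np

def nb_posterior(x, mu, pis):
    # x: (D,) binary features, mu: (K, D) Bernoulli parameters, pis: (K,) priors
    a = (x * np.log(mu) + (1 - x) * np.log(1 - mu)).sum(axis=1) + np.log(pis)
    e = np.exp(a - a.max())           # soft-max, shifted for stability
    return e / e.sum()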
MLE for Naïve Bayes
• Formulate the log-likelihood in terms of the parameters; maximizing gives count-based estimates, $\hat{\mu}_{ki} = N_{ki} / N_k$
• MLE overfits
• Susceptible to 0 frequencies in the training data: if feature $i$ never fires for class $k$ in training, then $\hat{\mu}_{ki} = 0$ and any test point with $x_i = 1$ gets posterior probability 0 for class $k$
Bayesian Estimation for Naïve Bayes
• Model the parameters as random variables and analyze their posterior distributions
• Take point estimates if necessary (see the sketch below)
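A sketch, assuming independent Beta(a, b) priors on each $\mu_{ki}$ (a = b = 1 gives add-one / Laplace smoothing); the posterior-mean point estimate can never be exactly 0, which fixes the zero-frequency problem above:

import numpy as np

def nb_fit_bayes(X, t, K, a=1.0, b=1.0):
    # Posterior-mean estimates: (N_ki + a) / (N_k + a + b)
    N, D = X.shape
    mu = np.empty((K, D))
    pis = np.empty(K)
    for k in range(K):
        Xk = X[t == k]
        mu[k] = (Xk.sum(axis=0) + a) / (len(Xk) + a + b)
        pis[k] = (len(Xk) + 1.0) / (N + K)   # uniform Dirichlet smoothing
    return mu, pis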
Discriminative Models for Classification
• Familiar form for the posterior class distribution: model $p(t \mid \mathbf{x})$ directly, without a generative model of $\mathbf{x}$
Logistic Regression for Binary Classification
• $p(t = 1 \mid \mathbf{x}, \mathbf{w}) = \sigma(\mathbf{w}^T \mathbf{x})$,
where $\sigma(\cdot)$ is the logistic function defined earlier
MLE for Binary Logistic Regression
• Maximize the likelihood w.r.t. the weights $\mathbf{w}$; equivalently, minimize the cross-entropy error $E(\mathbf{w}) = -\sum_{n=1}^{N} \big[ t_n \ln y_n + (1 - t_n) \ln (1 - y_n) \big]$, where $y_n = \sigma(\mathbf{w}^T \mathbf{x}_n)$
• Not quadratic, but still convex
• No closed-form solution; iterative optimization using gradient descent (updates have the same form as the LMS algorithm), with gradient $\nabla E(\mathbf{w}) = \sum_{n=1}^{N} (y_n - t_n)\, \mathbf{x}_n$
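A sketch of the gradient descent loop using the gradient above; the fixed step size lr is an arbitrary choice (Newton / IRLS steps converge faster):

import numpy as np

def logreg_fit(X, t, lr=0.1, iters=1000):
    # X: (N, D) design matrix (include a column of 1s for a bias), t in {0, 1}
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        y = 1.0 / (1.0 + np.exp(-(X @ w)))   # y_n = sigma(w^T x_n)
        grad = X.T @ (y - t)                 # sum_n (y_n - t_n) x_n
        w -= lr * grad / len(t)              # averaged step for stability
    return w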
Bayesian Binary Logistic Regression
• Bayesian model exists, but intractable
• Conjugacy breaks down because of the sigmoid function
• Laplace approximation for the posterior
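• As a reminder (the standard Laplace form, not reproduced from these slides): fit a Gaussian at the posterior mode, $q(\mathbf{w}) = \mathcal{N}(\mathbf{w} \mid \mathbf{w}_{MAP}, \mathbf{S}_N)$ with $\mathbf{S}_N^{-1} = -\nabla \nabla \ln p(\mathbf{w} \mid \mathcal{D}) \big|_{\mathbf{w} = \mathbf{w}_{MAP}}$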
Soft-max regression for Multi-class Classification
• Left as exercise
Choices for the activation function
• Probit function: CDF of the standard Gaussian, $\Phi(a) = \int_{-\infty}^{a} \mathcal{N}(\theta \mid 0, 1)\, d\theta$
• Complementary log-log model: CDF of the exponential distribution, giving $p = 1 - \exp(-e^{a})$
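A small numerical comparison of the three activation functions (names are mine; probit uses the identity $\Phi(a) = \tfrac{1}{2}(1 + \mathrm{erf}(a/\sqrt{2}))$):

import numpy as np
from scipy.special import erf

def logistic(a):
    return 1.0 / (1.0 + np.exp(-a))

def probit(a):
    # Standard Gaussian CDF
    return 0.5 * (1.0 + erf(a / np.sqrt(2.0)))

def cloglog(a):
    # Complementary log-log: asymmetric, approaches 1 much faster than 0
    return 1.0 - np.exp(-np.exp(a))

a = np.linspace(-3.0, 3.0, 7)
for f in (logistic, probit, cloglog):
    print(f.__name__, np.round(f(a), 3))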
Generative vs Discriminative: Summary
• Generative models
• Easy parameter estimation
• Require more parameters OR simplifying assumptions
• Models and “understands” each class
• Easy to accommodate unlabeled data
• Poorly calibrated probabilities
• Discriminative models
• Complicated estimation problem
• Fewer parameters and fewer assumptions
• No understanding of individual classes
• Difficult to accommodate unlabeled data
• Better calibrated probabilities
Decision Theory
• From posterior distributions to actions
• Loss functions measure extent of error
• Optimal action depends on loss function
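A minimal sketch of "optimal action depends on the loss function", with an invented asymmetric loss matrix L[a, t] (rows = actions, columns = true classes):

import numpy as np

# Hypothetical costs: a false negative is 10x worse than a false positive
L = np.array([[0.0, 10.0],
              [1.0,  0.0]])

def best_action(posterior):
    # Pick the action minimizing expected loss under p(t|x)
    return int(np.argmin(L @ posterior))

print(best_action(np.array([0.85, 0.15])))   # -> 1, despite p(t=0|x) > p(t=1|x)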
Loss functions
• 0-1 loss: $L(a, t) = \mathbb{I}[a \neq t]$; expected loss is minimized by predicting the most probable class
• False positive loss
• False negative loss
• Consider the class conditional distributions of the classifier score
• Decision rule: predict class 1 when $p(t = 1 \mid \mathbf{x}) > \theta$, with the threshold $\theta$ set by the relative losses
• Confusion matrix (counts at a given threshold): rows = predicted class, columns = true class, with entries TP, FP, FN, TN
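A sketch of the confusion counts at a given threshold (names are mine):

import numpy as np

def confusion_matrix(scores, t, theta):
    # Decision rule: predict 1 iff the score exceeds the threshold
    pred = (scores > theta).astype(int)
    tp = int(np.sum((pred == 1) & (t == 1)))
    fp = int(np.sum((pred == 1) & (t == 0)))
    fn = int(np.sum((pred == 0) & (t == 1)))
    tn = int(np.sum((pred == 0) & (t == 0)))
    return tp, fp, fn, tn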
ROC curves
[figure: ROC curve — true positive rate (TPR) vs false positive rate (FPR) as the decision threshold varies]
• Quality of classifier measured by area under the curve (AUC)
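A sketch of tracing the ROC curve by sweeping the threshold down over the sorted scores (ties between scores are ignored for simplicity):

import numpy as np

def roc_points(scores, t):
    # Lower the threshold one sample at a time, from highest score down
    order = np.argsort(-scores)
    t = t[order]
    tpr = np.cumsum(t) / t.sum()             # true positive rate
    fpr = np.cumsum(1 - t) / (1 - t).sum()   # false positive rate
    return fpr, tpr

def auc(fpr, tpr):
    # Trapezoidal area under the curve
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))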
Precision-recall curves
• In settings such as information retrieval:
• Precision = TP / (TP + FP)
• Recall = TP / (TP + FN)
• Plot precision vs recall for varying values of the threshold
• Quality of classifier measured by area under the curve (AUC) or by specific values, e.g. P@k
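The same threshold sweep yields the precision-recall curve; note that the precision at cutoff k is exactly P@k:

import numpy as np

def pr_points(scores, t):
    order = np.argsort(-scores)
    tp = np.cumsum(t[order])              # true positives among the top k
    k = np.arange(1, len(t) + 1)
    precision = tp / k                    # P@k for every cutoff k
    recall = tp / t.sum()
    return recall, precision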
F1-scores
• To evaluate at a single threshold, need to combine precision and recall
• Harmonic mean: $F_1 = \dfrac{2\, P\, R}{P + R}$
• Why? The harmonic mean is dominated by the smaller of the two values, so a classifier cannot score well by maximizing one at the expense of the other: $P = 1.0$, $R = 0.01$ has arithmetic mean $\approx 0.5$ but $F_1 \approx 0.02$
Estimating generalization error
• Training set performance is not a good indicator of generalization error
• A more complex model overfits, a less complex one underfits
• Which model do I select?
• Validation set
• Typically an 80% / 20% train/validation split
• Wastes valuable labeled data
• Cross validation (see the sketch below)
• Split the training data into K folds
• For the i-th iteration, train on the other K − 1 folds, test on the i-th fold
• Average the generalization error over all K folds
• Leave-one-out cross validation: K = N
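A sketch of K-fold cross validation; fit and error are hypothetical callbacks standing in for any of the models above:

import numpy as np

def kfold_error(X, t, K, fit, error, seed=0):
    idx = np.random.default_rng(seed).permutation(len(t))
    folds = np.array_split(idx, K)        # K roughly equal folds
    errs = []
    for i in range(K):
        train = np.concatenate([folds[j] for j in range(K) if j != i])
        model = fit(X[train], t[train])                      # train on K-1 folds
        errs.append(error(model, X[folds[i]], t[folds[i]]))  # test on fold i
    return float(np.mean(errs))           # average generalization error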
Summary
• Generative models
• Gaussian Discriminant Analysis
• Naïve Bayes
• Discriminative models
• Logistic regression
• Iterative algorithms for training
• Binary vs Multiclass
• Generalization performance
• Cross validation