Week 11 - PROG 8510 Week 11

This class describes various classification algorithms including logistic regression, decision trees, random forests, and support vector machines. It discusses how classification predicts categorical outcomes like spam/not-spam rather than continuous values. Common steps in classification problems include estimating probabilities that records belong to a class, setting a cutoff probability, and assigning records above the cutoff to that class. Specific algorithms covered are support vector machines, which find boundaries to separate classes; logistic regression which assigns classes based on model output; decision trees which classify using a series of questions; and random forests which combine predictions from many decision trees. Python implementations of these algorithms using Scikit-Learn are also demonstrated. The next class will cover evaluating classification model performance.

Uploaded by

Vineel Kumar


PROG 8510 : Programming Statistics for Business

Classification

Week 11
This class
• Describe diverse types of classification algorithms.
• Build classification models, such as Logistic Regression, Decision Trees, and
Random Forests.
What are classification algorithms?
• Classification is perhaps the most important form of prediction: the goal is to predict
whether a record is a 1 or a 0 (phishing/not-phishing, click/don’t click, churn/don’t
churn), or in some cases, one of several categories.
• Often, we need more than a simple binary classification: we want to know the
predicted probability that a case belongs to a class.
• Rather than having a model simply assign a binary classification, most algorithms
can return a probability score (propensity) of belonging to the class of interest.
• For example, predicting the loan amount for a potential client is a regression
problem, whereas predicting whether the client will qualify for the loan is a
classification problem.
Figure: basic spam-filtering classifier

Steps involved in classification
Because most models return a propensity score, a sliding cutoff can be used to convert that score to a decision. The general approach is as
follows:

1. Establish a cutoff probability for the class of interest, above which we consider a record as belonging to
that class.

2. Estimate (with any model) the probability that a record belongs to the class of interest.

3. If that probability is above the cutoff probability, assign the new record to the class of interest.

The next sections cover the inner workings of some of the most common classification
algorithms.
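The three steps above can be sketched in a few lines. This is a minimal illustration, not the course notebook: the toy data, the variable names, and the choice of LogisticRegression as the probability model are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one feature, binary label (1 = class of interest).
X = np.array([[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Step 1: establish a cutoff probability for the class of interest.
cutoff = 0.5

# Step 2: estimate, with any model, the probability of belonging to class 1.
model = LogisticRegression().fit(X, y)
propensity = model.predict_proba(X)[:, 1]  # column 1 corresponds to class 1

# Step 3: assign records whose probability is above the cutoff to class 1.
assigned = (propensity > cutoff).astype(int)

print(np.round(propensity, 2))
print(assigned)
```

Raising the cutoff makes the model more conservative about assigning records to the class of interest; lowering it does the opposite.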
Support Vector Machines
• Support vector machines (SVMs) are a type of discriminative classification
algorithm. Rather than modeling each class, we simply find a line or
curve (in two dimensions) or a manifold (in multiple dimensions) that divides
the classes from each other.
• A linear discriminative classifier attempts to draw a straight line
separating the two sets of data, and thereby creates a model for classification.
SVM Models
• SVM fits a line that maximizes the
margin between the two sets of
points.
• Notice that a few of the training
points just touch the margin: they
are indicated by the black circles in
this figure. These points are the
pivotal elements of this fit, and are
known as the support vectors, and
give the algorithm its name.

Figure: SVM model dividing the red and yellow classes
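As a rough sketch of the idea, a fitted scikit-learn SVC exposes the support vectors and, for a linear kernel, the coefficients of the separating line. The six points below are made up for illustration, and the large C value is an assumption used to approximate a hard margin on separable data.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, linearly separable 2-D points in two classes.
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
              [4.0, 4.0], [4.5, 5.0], [5.0, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel fits a straight separating line; a large C leaves
# little room for margin violations.
svm = SVC(kernel="linear", C=1e6).fit(X, y)

# The training points that define the margin.
support_vectors = svm.support_vectors_

# Separating line: w . x + b = 0; the margin width is 2 / ||w||.
w = svm.coef_[0]
b = svm.intercept_[0]
margin_width = 2.0 / np.linalg.norm(w)

print(support_vectors)
print(margin_width)
```

Only the support vectors influence the fitted line; moving any other point without crossing the margin leaves the model unchanged.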

SVM in Python
Notice that, because this is a classification problem, the y value is changed
from a continuous variable to a categorical (binary) variable, i.e., was the
tip value more than $2 or not?

The support vector classifier model is imported from the relevant
submodule in the sklearn package. The rest of the coding steps are in line
with the Sklearn API.
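A minimal sketch of these steps follows. The small DataFrame is a hypothetical stand-in for the tips data used in class (the values and the choice of feature columns are made up); the course notebook remains the reference for the actual code.

```python
import pandas as pd
from sklearn.svm import SVC  # support vector classifier

# Hypothetical stand-in for the tips data (values are invented).
tips = pd.DataFrame({
    "total_bill": [10.3, 21.0, 23.7, 24.6, 8.8, 26.9, 15.0, 14.8, 35.3, 9.6],
    "size":       [2, 3, 2, 4, 2, 4, 2, 2, 4, 2],
    "tip":        [1.7, 3.5, 3.3, 3.6, 1.5, 3.1, 1.9, 3.0, 5.0, 1.4],
})

X = tips[["total_bill", "size"]]

# Turn the continuous tip amount into a binary target:
# was the tip more than $2 or not?
y = tips["tip"] > 2

# Instantiate and fit, following the usual scikit-learn estimator pattern.
model = SVC(kernel="linear")
model.fit(X, y)

print(model.classes_)
```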
SVM in Python Continued

The SVM model created is now used to predict values by using the predict
method.

The resultant output is an array of Boolean values indicating, for each
X record, whether the tip value exceeded $2 or not.
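The prediction step might look like the sketch below. The feature values are hypothetical (bill amount and party size), and the Boolean target stands in for "tip above $2".

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical features: [total_bill, party size].
X = np.array([[8.0, 2], [10.0, 2], [12.0, 2],
              [25.0, 3], [30.0, 4], [40.0, 4]])
# Hypothetical Boolean target: did the tip exceed $2?
y = np.array([False, False, False, True, True, True])

model = SVC(kernel="linear").fit(X, y)

# New records to classify.
X_new = np.array([[9.0, 2], [35.0, 4]])

# predict returns one label per row, with the same dtype as the
# training target (Boolean here).
predictions = model.predict(X_new)
print(predictions)
```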
Logistic Regression
• Logistic regression is analogous to
multiple linear regression, except that
the outcome is binary.

• Based on the value of the model output,
logit(p), class 1 or class 0 is
assigned.

Figure labels: Class 1, Class 0
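For reference, the standard definitions behind logit(p) are given below. The notation (predictors x_1, ..., x_k and coefficients β_0, ..., β_k) is assumed here and does not come from the slides.

```latex
\operatorname{logit}(p) = \ln\frac{p}{1-p}
  = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k,
\qquad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
```

With a cutoff probability of 0.5, a record is assigned to class 1 exactly when logit(p) > 0, since logit(0.5) = 0.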
Logistic Regression in Python

The sklearn.linear_model submodule contains the LogisticRegression class,
similar to LinearRegression, which is used
to perform logistic regression in Python.

Notice that the rest of the steps in model
building and classification remain the
same.
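A short sketch, using made-up one-feature data, shows the same fit/predict pattern and how the model output relates to the assigned class. The data and variable names are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one predictor, binary outcome.
X = np.array([[8.0], [10.0], [12.0], [14.0],
              [22.0], [25.0], [30.0], [40.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

model = LogisticRegression().fit(X, y)

labels = model.predict(X)                # assigned class (0 or 1)
probs = model.predict_proba(X)[:, 1]     # estimated P(class 1)
log_odds = model.decision_function(X)    # model output on the logit scale

# predict assigns class 1 when the log-odds are positive,
# which corresponds to a probability above 0.5.
print(labels)
print(np.round(probs, 2))
```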
Decision Trees and Random Forests
• Decision trees are extremely
intuitive ways to classify or label
objects: you simply ask a series of
questions designed to zero-in on
the classification.
• For example, if you wanted to build
a decision tree to classify an animal
you come across while on a hike,
you might construct the one shown
here:
Fig Source : Python Data Science Handbook, Jake VanderPlas
Decision Trees in Python

• The sklearn.tree submodule contains the DecisionTreeClassifier
class, which is used to
perform decision tree classification in
Python.

• Notice that the rest of the steps in
model building and classification
remain the same.
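The sketch below fits a small tree and prints the questions it learned. The data and feature names are hypothetical, and max_depth=2 is an arbitrary choice to keep the printed tree short.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical features: [total_bill, party size]; label 1 = tip above $2.
X = np.array([[8.0, 2], [10.0, 2], [12.0, 2],
              [25.0, 3], [30.0, 4], [40.0, 4]])
y = np.array([0, 0, 0, 1, 1, 1])

# Same fit/predict pattern as the other scikit-learn classifiers.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
predictions = tree.predict(X)

# export_text renders the learned series of questions as indented rules.
rules = export_text(tree, feature_names=["total_bill", "size"])
print(rules)
```

Each line of the printed rules is one threshold question on a feature, which is what makes decision trees easy to read and explain.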
Random Forests in Python

• When the predictive power of
several individual trees is
combined in a single model, that
model is called a random forest.

• Modeling that relies on the outputs of
several individual component
models is called ensemble
modeling.
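As an illustration of how a forest combines its trees, the sketch below fits a RandomForestClassifier on made-up data and checks that its predicted probabilities are the average of the individual trees' probabilities. The data, the number of trees, and the variable names are assumptions for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical features: [total_bill, party size]; label 1 = tip above $2.
X = np.array([[8.0, 2], [10.0, 2], [12.0, 2],
              [25.0, 3], [30.0, 4], [40.0, 4]])
y = np.array([0, 0, 0, 1, 1, 1])

# An ensemble of 25 decision trees, each trained on a bootstrap sample.
forest = RandomForestClassifier(n_estimators=25, random_state=0).fit(X, y)
n_trees = len(forest.estimators_)

# The forest's class probabilities...
forest_proba = forest.predict_proba(X)

# ...are the mean of the individual trees' class probabilities.
tree_mean_proba = np.mean(
    [t.predict_proba(X) for t in forest.estimators_], axis=0
)

predictions = forest.predict(X)
print(n_trees, predictions)
```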
Please refer to the notebook file
PROG 8510 Classification Modeling and Evaluation (Week 11
and 12)
Next Class :
• Classification Model Evaluation
