
AKRE UNIVERSITY FOR APPLIED SCIENCES
TECHNICAL COLLEGE OF INFORMATICS-AKRE

Logistic Regression Classification


in Natural Language Processing
(NLP)

Academic Year: 2024-2025


INTRODUCTION

 Logistic regression is one of the most important analytic tools for classification in NLP. As the textbook states:
 "Logistic regression is the baseline supervised machine learning algorithm for classification, and has a very close relationship with neural networks. A neural network can be viewed as a series of logistic regression classifiers stacked on top of each other."
IDEA OF LOGISTIC REGRESSION

 We’ll compute a score z = w∙x + b
 Then we’ll pass it through the sigmoid function: σ(w∙x + b)
 And we'll treat the result as a probability, P(y = 1 | x)
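As a minimal sketch in Python (the weight vector, feature vector, and bias below are illustrative, not values from these slides):

    import numpy as np

    def sigmoid(z):
        # Maps any real-valued score to a probability in (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    # Hypothetical weights, features, and bias for illustration only
    w = np.array([0.5, -1.2, 2.0])
    x = np.array([1.0, 0.0, 1.0])
    b = 0.1

    z = np.dot(w, x) + b      # z = w∙x + b
    p = sigmoid(z)            # σ(w∙x + b), treated as P(y = 1 | x)
    print(p)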
THE SIGMOID FUNCTION

σ(z) = 1 / (1 + e^(−z)), where z = w∙x + b
THE SIGMOID FUNCTION

This is the Sigmoid Function Plot, showing how logistic regression maps input
values to probabilities between 0 and 1. The red dashed line represents the decision
threshold at 0.5.
TURNING A PROBABILITY INTO A CLASSIFIER

ŷ = 1 (Spam) if σ(w∙x + b) > 0.5, i.e., if w∙x + b > 0
ŷ = 0 (Not Spam) if σ(w∙x + b) ≤ 0.5, i.e., if w∙x + b ≤ 0

Since σ(z) crosses 0.5 exactly at z = 0, thresholding the probability at 0.5 is the same as thresholding the raw score at 0.
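A minimal sketch of this decision rule, with illustrative values (not from the slides):

    import numpy as np

    def classify(w, x, b):
        # Predict 1 when w∙x + b > 0, which is exactly when σ(w∙x + b) > 0.5
        return 1 if np.dot(w, x) + b > 0 else 0

    # Hypothetical example values
    print(classify(np.array([0.5, -1.2, 2.0]), np.array([1.0, 0.0, 1.0]), 0.1))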
COMPONENTS OF LOGISTIC REGRESSION

 Four Key Components:

1. Feature Representation:
Convert text to numerical vectors (e.g., count of positive words).
2. Classification Function (Sigmoid/Softmax):
Transforms outputs into probabilities.
3. Loss Function (Cross-Entropy Loss):
Measures the error between predictions and true labels.
4. Optimization Algorithm (Gradient Descent):
Updates weights to minimize the error (see the sketch after this list).
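Below is a minimal sketch of components 3 and 4, assuming binary labels y ∈ {0, 1}; the function names and learning rate are illustrative:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def cross_entropy_loss(y_true, p):
        # L = -[y log p + (1 - y) log(1 - p)]
        return -(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))

    def gradient_descent_step(w, b, x, y_true, lr=0.1):
        # For logistic regression with cross-entropy loss, the gradient of the
        # loss is (σ(w∙x + b) - y) * x for the weights and (σ(w∙x + b) - y) for the bias
        p = sigmoid(np.dot(w, x) + b)
        error = p - y_true
        return w - lr * error * x, b - lr * error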
CLASSIFICATION WITH LOGISTIC REGRESSION (USING
TEXT FEATURES)

 In this example, we classify emails as Spam (1) or Not Spam (0) based on three textual features:
• Keyword Presence (e.g., "Discount," "Free," "Meeting")
• Sender Type (e.g., "Company," "Individual," "Unknown")
• Number of Links in the email (e.g., "Few," "Moderate," "Many")
 Since logistic regression requires numerical input, we use one-hot encoding to convert categorical values into numerical features.
Define the Dataset

Keyword Sender Type Link Count Spam (1) / Not Spam (0)
Discount Company Many 1
Free Unknown Moderate 1
Meeting Individual Few 0
Offer Company Moderate 1
Hello Individual Few 0

Apply One-Hot Encoding

Each categorical feature is converted into separate binary columns (one column per category value):

Discount  Free  Meeting  Offer  Hello  Company  Individual  Unknown  Many  Moderate  Few  Spam (1) / Not Spam (0)
    1       0      0       0      0       1         0          0       1       0      0             1
    0       1      0       0      0       0         0          1       0       1      0             1
    0       0      1       0      0       0         1          0       0       0      1             0
    0       0      0       1      0       1         0          0       0       1      0             1
    0       0      0       0      1       0         1          0       0       0      1             0

Now, our categorical features are transformed into numerical binary values, which can be used in logistic regression.
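For reference, this encoding can be reproduced with pandas get_dummies (a sketch; pandas sorts the generated columns alphabetically, so the order may differ from the table above):

    import pandas as pd

    data = pd.DataFrame({
        "Keyword": ["Discount", "Free", "Meeting", "Offer", "Hello"],
        "Sender":  ["Company", "Unknown", "Individual", "Company", "Individual"],
        "Links":   ["Many", "Moderate", "Few", "Moderate", "Few"],
        "Spam":    [1, 1, 0, 1, 0],
    })

    # One-hot encode the three categorical columns into binary indicators
    X = pd.get_dummies(data[["Keyword", "Sender", "Links"]])
    y = data["Spam"]
    print(X.astype(int))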
LOGISTIC REGRESSION MODEL

Assuming the model has been trained, we define the learned weights (one weight per one-hot column, in the same order as the table above, plus a bias b) as:

w(Discount) = −2.0   w(Free) = 1.5        w(Meeting) = 1.2   w(Offer) = 0.5   w(Hello) = −1.3
w(Company) = 1.1     w(Individual) = 0.7  w(Unknown) = −0.9
w(Many) = 1.4        w(Moderate) = 0.8    w(Few) = −0.5
b = 1
COMPUTE MODEL OUTPUT

Features (One-Hot Encoded)    z = w∙x + b                            e^(−z)   1 + e^(−z)   g(z)    Predicted y
(1,0,0,0,0,1,0,0,1,0,0)       −2(1) + 1.1(1) + 1.4(1) + 1 = 1.5      0.223    1.223        0.818   1
(0,1,0,0,0,0,0,1,0,1,0)       1.5(1) − 0.9(1) + 0.8(1) + 1 = 2.4     0.091    1.091        0.916   1
(0,0,1,0,0,0,1,0,0,0,1)       1.2(1) + 0.7(1) − 0.5(1) + 1 = 2.4     0.091    1.091        0.916   1
(0,0,0,1,0,1,0,0,0,1,0)       0.5(1) + 1.1(1) + 0.8(1) + 1 = 3.4     0.033    1.033        0.968   1
(0,0,0,0,1,0,1,0,0,0,1)       −1.3(1) + 0.7(1) − 0.5(1) + 1 = −0.1   1.105    2.105        0.476   0

Note that the third email (Meeting, Individual, Few) gets g(z) = 0.916 ≥ 0.5 and is therefore predicted Spam (1), even though its true label is Not Spam (0): with these weights the model misclassifies it.
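A sketch that reproduces this table with the weights above (feature order: Discount, Free, Meeting, Offer, Hello, Company, Individual, Unknown, Many, Moderate, Few):

    import numpy as np

    w = np.array([-2.0, 1.5, 1.2, 0.5, -1.3, 1.1, 0.7, -0.9, 1.4, 0.8, -0.5])
    b = 1.0

    X = np.array([
        [1,0,0,0,0,1,0,0,1,0,0],   # Discount, Company, Many
        [0,1,0,0,0,0,0,1,0,1,0],   # Free, Unknown, Moderate
        [0,0,1,0,0,0,1,0,0,0,1],   # Meeting, Individual, Few
        [0,0,0,1,0,1,0,0,0,1,0],   # Offer, Company, Moderate
        [0,0,0,0,1,0,1,0,0,0,1],   # Hello, Individual, Few
    ])

    z = X @ w + b                      # scores: 1.5, 2.4, 2.4, 3.4, -0.1
    g = 1.0 / (1.0 + np.exp(-z))       # matches the g(z) column above (up to rounding)
    y_hat = (g >= 0.5).astype(int)     # decision rule at threshold 0.5
    print(np.round(g, 3), y_hat)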
DECISION RULE

• If g(z) ≥ 0.5, classify as Spam (1).
• If g(z) < 0.5, classify as Not Spam (0).

•Emails with strong spam keywords (like "Discount," "Free," or "Offer") are classified as Spam (1).
•Emails with neutral words (like "Meeting" or "Hello") are more likely to be Not Spam (0).
•The sender type and the number of links also influence classification.
Conclusions

• Effectiveness in Text Classification: Logistic regression is a simple yet powerful algorithm for binary classification problems like spam detection, sentiment analysis, and topic classification.
• Feature Representation Matters: Proper text representation techniques like One-Hot Encoding, TF-IDF, or Word Embeddings significantly impact model performance.
• Interpretability: Unlike deep learning models, logistic regression provides clear decision boundaries and is easy to interpret, making it useful for explainable AI.
• Limitations: While effective for linearly separable data, logistic regression struggles with complex, high-dimensional text data, where advanced models like SVMs, Random Forests, or Deep Learning perform better.
