05 Logistic Regression
Logistic Regression is a classification algorithm, used when the response variable is categorical. The idea of Logistic
Regression is to model the relationship between the features and the probability of a particular outcome.
E.g. when we have to predict whether a student passes or fails an exam, given the number of hours spent
studying as a feature, the response variable takes two values: pass and fail.
If the predicted probability is more than 50%, the observation is assigned to that class; otherwise it is
assigned to the other class. Therefore, we can say that logistic regression acts as a binary classifier, as the
short sketch below illustrates.
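As a minimal sketch (the probabilities here are made-up values, not taken from the notebook), thresholding at 0.5 turns predicted probabilities into class labels:

import numpy as np

p = np.array([0.2, 0.7, 0.5, 0.9])  # hypothetical predicted probabilities
labels = (p >= 0.5).astype(int)     # 1 if p >= 0.5, else 0
print(labels)                       # [0 1 1 1]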
But if we used the linear regression equation (i) to calculate the probability, we would get values less than 0 as well as greater than 1,
which makes no sense for a probability. So we need an equation whose output always lies between 0 and
1, as we desire when calculating a probability.
Sigmoid function
We use the sigmoid function as the underlying function in Logistic Regression. Mathematically, it is

$\sigma(z) = \dfrac{1}{1 + e^{-z}}$

and its graph is the characteristic S-shaped curve rising from 0 to 1. Two properties make it a good fit here:
1) The sigmoid function's range is bounded between 0 and 1, so its output can be interpreted directly as the probability in the logistic model.
2) Its derivative is easier to compute than those of many alternatives, $\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$, which is useful during the gradient descent calculation (see the sketch below).
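A minimal NumPy sketch of the sigmoid and its derivative (illustrative only):

import numpy as np

def sigmoid(z):
    # squashes any real number into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    # sigma'(z) = sigma(z) * (1 - sigma(z))
    s = sigmoid(z)
    return s * (1.0 - s)

z = np.array([-5.0, 0.0, 5.0])
print(sigmoid(z))             # approx [0.0067 0.5    0.9933]
print(sigmoid_derivative(z))  # approx [0.0066 0.25   0.0066]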
Writing $p(x)$ for the probability that the output equals 1 and setting the sigmoid's argument to a linear combination of the inputs, $p(x) = \sigma(\beta_0 + \beta_1 x)$, inverting gives

$\log\!\left(\dfrac{p(x)}{1 - p(x)}\right) = \beta_0 + \beta_1 x$

where the left-hand side is called the logit or log-odds function, and $p(x)/(1-p(x))$ is called the odds.
The odds are the ratio of the probability of success to the probability of failure. Therefore, in Logistic
Regression, a linear combination of the inputs is mapped to the log-odds of the output being equal to 1.
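For example, if $p(x) = 0.8$, the odds are $0.8/0.2 = 4$ (success is four times as likely as failure), and the log-odds are $\log 4 \approx 1.39$.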
The cost function for the whole training set (the binary cross-entropy, or log loss) is given as:

$J(\theta) = -\dfrac{1}{m} \sum_{i=1}^{m} \left[\, y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\!\big(1 - h_\theta(x^{(i)})\big) \right]$

where $h_\theta(x^{(i)})$ is the predicted probability for the $i$-th example and $m$ is the number of training examples.
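A minimal NumPy implementation of this cost (the function and variable names are illustrative, not from the notebook):

import numpy as np

def log_loss(y_true, y_prob, eps=1e-12):
    # binary cross-entropy averaged over the training set;
    # clipping guards against log(0)
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

y_true = np.array([0, 1, 1, 0])          # hypothetical labels
y_prob = np.array([0.1, 0.8, 0.6, 0.3])  # hypothetical predicted probabilities
print(log_loss(y_true, y_prob))          # approx 0.299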
# standard imports for data handling and plotting
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import warnings
warnings.filterwarnings("ignore")  # suppress library warnings in the notebook output

# load the Social Network Ads dataset
dataset = pd.read_csv('Social_Network_Ads.csv')
In [3]:
dataset.head()
Out[3]:
[Table: first five rows of the dataset. It contains the User ID, Gender and Purchased columns plus two numeric feature columns (Age and EstimatedSalary in the standard version of this dataset).]
# keep the two numeric features; drop the target and the identifier/categorical columns
X = dataset.drop(['Purchased', 'User ID', 'Gender'], axis=1)
y = dataset['Purchased']
In [5]:
X.shape, y.shape
Out[5]:
Splitting the dataset into the Training set and Test set
In [6]:
# reconstructed cell (the original body is not preserved): a standard
# train/test split, assuming 75/25 (the test-set output below has 100 rows)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
Feature Scaling
In [7]:
# reconstructed cell: standardise the features so both are on a comparable scale
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
In [8]:
# reconstructed cell: fit the classifier on the training set
# (the default constructor matches the Out[8] repr below)
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression()
classifier.fit(X_train, y_train)
Out[8]:
LogisticRegression()
In [9]:
# predict class labels for the test set
y_pred = classifier.predict(X_test)
In [10]:
# reconstructed cell: compare the predictions with the actual labels
# (the column names and their order are assumptions)
pd.DataFrame({'Actual': y_test.values, 'Predicted': y_pred})
Out[10]:
    Actual  Predicted
0        0          0
1        0          0
2        0          0
3        0          0
4        0          0
..     ...        ...
95       1          0
96       0          0
97       1          0
98       1          1
99       1          1

[100 rows x 2 columns]
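The row-by-row comparison can be summarised with a single accuracy figure; a short sketch (not part of the original excerpt):

from sklearn.metrics import accuracy_score
print(accuracy_score(y_test, y_pred))  # fraction of test rows where prediction matches actual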
[Output figures: scatter-plot visualisations of the classification results. Matplotlib warned repeatedly that a single RGB/RGBA sequence was passed via the 'c' argument, and that a 2-D array with a single row should be used to give all points the same colour.]
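Those warnings come from scatter calls that pass a single RGB tuple as 'c'. A minimal sketch of a decision-boundary plot for the test set that avoids the issue (the colours, title and axis labels are illustrative assumptions):

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

X_set, y_set = X_test, y_test.values  # scaled features and true labels
x1, x2 = np.meshgrid(
    np.arange(X_set[:, 0].min() - 1, X_set[:, 0].max() + 1, 0.01),
    np.arange(X_set[:, 1].min() - 1, X_set[:, 1].max() + 1, 0.01))
# colour the plane by the class predicted at each grid point
plt.contourf(x1, x2,
             classifier.predict(np.c_[x1.ravel(), x2.ravel()]).reshape(x1.shape),
             alpha=0.4, cmap=ListedColormap(('salmon', 'lightgreen')))
for label, colour in ((0, 'red'), (1, 'green')):
    # named colours avoid the RGB-tuple warning quoted above
    plt.scatter(X_set[y_set == label, 0], X_set[y_set == label, 1],
                color=colour, label=label, edgecolors='black')
plt.title('Logistic Regression (Test set)')
plt.xlabel('Feature 1 (scaled)')
plt.ylabel('Feature 2 (scaled)')
plt.legend()
plt.show()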