Logistic Regression Explained

Logistic regression is used when the dependent variable is binary. It estimates the probability of success using a logistic curve. The document discusses the process of performing logistic regression in Python including importing data, splitting into training and test sets, feature scaling, training the model, predicting results, and evaluating using a confusion matrix.

Uploaded by

Vinay Kumar

Logistic Regression Explained

Mohammad Arshad
Logistic Regression – Introduction
In linear regression, the outcome variable is continuous and the predictor variables can be a mix of numeric and categorical. But often we wish to evaluate the effects of multiple explanatory variables on a binary outcome variable.

For example, we may study the effects of a number of factors on whether or not a disease develops. A patient may be cured or not; a prospect may respond or not; a loan may be granted to a particular person or not.

When the outcome or dependent variable is binary, and we wish to measure the effects of several independent variables on it, we use logistic regression.

► The binary outcome variable can be coded as 0 or 1.


► The logistic curve is shown in the figure below:

[Figure: logistic curve]

We estimate the probability of success by the equation:
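
In its standard form, the logistic model writes the probability of success as shown here. The notation is an assumption rather than taken from the slides: $p$ is the probability that the outcome equals 1, $x_1, \dots, x_k$ are the independent variables, and $\beta_0, \dots, \beta_k$ are the coefficients to be estimated.

```latex
p = P(y = 1 \mid x_1, \dots, x_k)
  = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
\qquad\Longleftrightarrow\qquad
\ln\frac{p}{1 - p} = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k
```

The right-hand form shows why the model is linear in the log-odds even though the probability itself follows an S-shaped curve bounded between 0 and 1.
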
Process Flow

1. Data in Python
▪ Data is obtained in a pandas DataFrame

2. Data Preparation/Cleaning
▪ Missing Value Imputation
▪ Trash value
▪ Outlier Treatment

3. Identification of Variables
▪ Independent and dependent variables should be identified

4. Factor Analysis or Correlation
▪ FA is done in order to get the variables into groups
▪ Good to choose factor solution near the eigenvalue of 1
▪ As a further check, Correlation Analysis is done

5. Creation of Modeling Dataset
▪ Divide data into Development and Validation Sample in ratio 70:30 or 80:20

6. Logistic Regression in Python
▪ Assume all assumptions hold
▪ Check for the significance of the variables
▪ Run Regression on Development sample

7. Validate KS Statistic Output
▪ Validate the output on the Validation sample, by running the same model on the Validation sample

8. Output
▪ Results will be presented in PowerPoint
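
The flow mentions the KS statistic without showing how it is obtained. A minimal sketch follows, assuming the usual scoring definition: the largest gap between the cumulative share of events and the cumulative share of non-events when observations are ranked by predicted probability. The function name `ks_statistic` and the sample values are hypothetical and only serve as illustration.

```python
def ks_statistic(y_true, scores):
    """Largest gap between the cumulative score distributions of the two classes."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    # Rank observations by predicted probability, highest first
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    cum_pos = cum_neg = 0
    ks = 0.0
    i = 0
    while i < len(ranked):
        # Consume tied scores together so ties do not distort the gap
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                cum_pos += 1
            else:
                cum_neg += 1
            j += 1
        ks = max(ks, abs(cum_pos / n_pos - cum_neg / n_neg))
        i = j
    return ks

# Hypothetical validation-sample labels and predicted probabilities
y_valid = [1, 1, 0, 1, 0, 0, 1, 0]
p_valid = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
print(ks_statistic(y_valid, p_valid))  # 0.5
```

Computing the same quantity on the Development and Validation samples and comparing the two values is one way to check that the model holds up on data it was not fitted to.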
Python code
Step 1: Importing the dataset
import pandas as pd
# Features are all columns except the last; the label is the last column
dataset = pd.read_csv('car_purchase_Ads.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

Step 2: Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
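
To make the roles of `test_size` and `random_state` concrete, here is a simplified sketch of a holdout split written with the standard library only. It is not scikit-learn's actual algorithm; `holdout_split` and the 400-row dataset are hypothetical.

```python
import random

def holdout_split(rows, test_size=0.25, seed=0):
    """Shuffle row indices with a fixed seed and hold out a fraction for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)   # same seed gives the same split every run
    n_test = int(round(len(rows) * test_size))
    test_idx = set(indices[:n_test])
    train = [row for i, row in enumerate(rows) if i not in test_idx]
    test = [row for i, row in enumerate(rows) if i in test_idx]
    return train, test

rows = list(range(400))                    # hypothetical 400 observations
train, test = holdout_split(rows)
print(len(train), len(test))               # 300 100
```

Fixing the seed matters because it makes the reported accuracy reproducible; a different seed puts different rows in the test set and can shift the result.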

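Step 3: Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
# Learn the mean and standard deviation from the training set only
X_train = sc.fit_transform(X_train)
# Apply the same training-set parameters to the test set
X_test = sc.transform(X_test)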
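
The difference between `fit_transform` on the training set and `transform` on the test set can be sketched by hand. The example below assumes standardization as (value - mean) / standard deviation with the population standard deviation; the helper names and the age values are hypothetical.

```python
from statistics import mean, pstdev

def fit_scaler(column):
    """Learn the mean and population standard deviation from training data."""
    return mean(column), pstdev(column)

def apply_scaler(column, mu, sigma):
    """Standardize values using previously learned parameters."""
    return [(value - mu) / sigma for value in column]

train_age = [20, 30, 40, 50]   # hypothetical training values
test_age = [35, 60]            # hypothetical test values

mu, sigma = fit_scaler(train_age)                  # parameters come from training data only
train_scaled = apply_scaler(train_age, mu, sigma)
test_scaled = apply_scaler(test_age, mu, sigma)    # reused, not refitted on the test data
print(test_scaled[0])                              # 0.0, since 35 is the training mean
```

Refitting the scaler on the test set would leak information about the test data into preprocessing and would put the two sets on slightly different scales.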

Step 4: Training the Logistic Regression model on the Training set
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state = 0)
# Fit the classifier on the scaled training data
classifier.fit(X_train, y_train)
Step 5: Predicting a new result
print(classifier.predict(sc.transform([[30,87000]])))
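
What a prediction involves once the model is fitted can be sketched as follows: the scaled features are combined linearly, passed through the logistic function, and the resulting probability is compared with 0.5. The coefficients and intercept below are hypothetical, not values learned from the dataset.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, coefficients, intercept):
    """Probability of class 1 for one observation of already-scaled features."""
    z = intercept + sum(w * x for w, x in zip(coefficients, features))
    return sigmoid(z)

def predict(features, coefficients, intercept, threshold=0.5):
    """Class label: 1 if the probability exceeds the threshold, else 0."""
    return 1 if predict_proba(features, coefficients, intercept) > threshold else 0

# Hypothetical coefficients for two standardized features
coefficients = [2.0, 1.0]
intercept = -1.0

print(predict([0.0, 0.0], coefficients, intercept))  # 0
print(predict([1.0, 0.5], coefficients, intercept))  # 1
```

This is also why the new observation is passed through `sc.transform` first: the coefficients were estimated on scaled features, so raw values such as 30 and 87000 would be on the wrong scale.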

Step 6: Predicting the Test set results
import numpy as np
y_pred = classifier.predict(X_test)
# Print predicted and actual labels side by side, one row per test observation
print(np.concatenate((y_pred.reshape(len(y_pred),1), y_test.reshape(len(y_test),1)),1))

Step 7: Making the Confusion Matrix
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
print(cm)
# Print the accuracy explicitly; a bare expression is only displayed in a notebook
print(accuracy_score(y_test, y_pred))
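
To read the printed matrix, it helps to see how the counts are built. The sketch below follows the layout used by scikit-learn for binary labels, with rows for actual classes and columns for predicted classes, giving [[TN, FP], [FN, TP]]. The function names and sample labels are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]]: rows are actual classes, columns are predicted."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix

def accuracy(matrix):
    """Share of observations on the diagonal, i.e. predicted correctly."""
    correct = matrix[0][0] + matrix[1][1]
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical actual and predicted labels
actual_labels = [0, 0, 1, 1, 1, 0, 1, 0]
predicted_labels = [0, 1, 1, 1, 0, 0, 1, 0]

counts = confusion_counts(actual_labels, predicted_labels)
print(counts)            # [[3, 1], [1, 3]]
print(accuracy(counts))  # 0.75
```

The off-diagonal cells separate the two kinds of error, which accuracy alone hides: one false positive and one false negative in this example.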
Practice

For location of code and dataset:

https://github.com/arshad831/Modelling-Exercise/blob/main/logistic_regression.ipynb