0% found this document useful (0 votes)

25 views11 pages

Logistic Regression in Machine Learning

Uploaded by

aishwary srivastav

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

25 views11 pages

Logistic Regression in Machine Learning

Uploaded by

aishwary srivastav

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 11

Logistic Regression (For Classification Problem)

(Code: Subhajit Das)

What is Logistic Regression?

Logistic Regression is a supervised machine learning algorithm mainly used for classification
tasks where the goal is to predict the probability that an instance belongs to a given class. It
is a kind of statistical algorithm, which analyzes the relationship between a set of
independent variables and the dependent binary variables.

Logistic regression predicts the output of a categorical dependent variable. Therefore the
outcome must be a categorical or discrete value. It can be either Yes or No, 0 or 1, true or
False, etc. but instead of giving the exact value as 0 and 1, it gives the probabilistic values
which lie between 0 and 1.

Where we can use Logistic Regression:

Logistic regression is commonly used in various fields to predict the likelihood of an event
occurring. Here are some of the applications of logistic regression:

1. Fraud Detection: Logistic regression can be used to predict whether a transaction is

fraudulent or not.
2. Disease Prediction: In healthcare, logistic regression can be used to predict the
likelihood of a patient having a certain disease based on their symptoms.
3. Churn Prediction: In marketing and finance, logistic regression can be used to predict
whether a customer will churn or not.
4. Credit Scoring: Logistic regression can be used to predict the probability of a customer
defaulting on a loan.
5. Spam Detection: Logistic regression can be used to predict whether an email is spam
or not.
6. Customer Churn Prediction: Logistic regression can be used to predict whether a
customer will leave a service or continue to use it.
7. Predicting Mortality in Injured Patients: Logistic regression can be used to predict the
likelihood of survival in injured patients.
8. Artificial Neural Networks: Logistic regression is the foundation for many advanced
machine learning algorithms, including neural networks.
These are just a few examples. The use of logistic regression is vast and varied across
different fields.

Logistic Regression vs Linnear Regression

How Logistic Regression works:

1. Binary Outcome: Logistic regression is used when the dependent variable is binary
(0/1, Yes/No, True/False) in nature.
2. Logistic Function: It uses a logistic function to transform this binary outcome into a
continuous variable that can range between 0 and 1.
3. Dependent Variable: This continuous variable is used as the dependent variable in the
regression analysis.
4. Independent Variables: The independent variables in the analysis are the factors that
you think may influence the outcome.
5. Estimating Effects: The regression analysis estimates the effect of each independent
variable on the probability of the outcome.
6. Predicting Probability: These estimates can then be used to predict the probability of
the outcome given a set of values for the independent variables.
7. Sigmoid Function: The sigmoid function maps any real-valued number into another
value between 0 and 1. In machine learning, we use sigmoid to map predictions to
probabilities.
8. Decision Boundary: Logistic Regression algorithm also uses a concept of threshold
value. By default, the threshold value is 0.5 which means if the probability of an
observation is greater than 0.5 then the observation is classified as Class-1(Y=1), else it
is classified as Class-0(Y=0).

Remember, the goal of logistic regression is to find the best fitting model to describe the
relationship between the dichotomous characteristic of interest and a set of independent
variables.

Sigmoid Function in Logistic Regression:

The Sigmoid Function is a crucial part of Logistic Regression in machine learning. It is a

mathematical function that can take any real value and map it between 0 and 1, forming a
curve shaped like the letter "S".

In Logistic Regression, instead of fitting a regression line, we fit an "S" shaped logistic
function, which predicts two maximum values (0 or 1). The curve from the logistic function
indicates the likelihood of something such as whether the cells are cancerous or not, a
mouse is obese or not based on its weight, etc.

The Sigmoid Function is used to map the predicted values to probabilities. It ensures that the
value of the logistic regression must be between 0 and 1, which cannot go beyond this limit.
This makes it a powerful tool for decision-making in classification tasks.
In [1]: import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

SUV Data Analysis

In [2]: suv_df = pd.read_csv("/content/drive/MyDrive/ML and DL DataSets/11_SUV_data.
suv_df.head()

Out[2]: User ID Gender Age EstimatedSalary Purchased

0 15624510 Male 19 19000 0

1 15810944 Male 35 20000 0

2 15668575 Female 26 43000 0

3 15603246 Female 27 57000 0

4 15804002 Male 19 76000 0

In [3]: suv_df.info

Out[3]: <bound method DataFrame.info of User ID Gender Age EstimatedSalar

y Purchased
0 15624510 Male 19 19000 0
1 15810944 Male 35 20000 0
2 15668575 Female 26 43000 0
3 15603246 Female 27 57000 0
4 15804002 Male 19 76000 0
.. ... ... ... ... ...
395 15691863 Female 46 41000 1
396 15706071 Male 51 23000 1
397 15654296 Female 50 20000 1
398 15755018 Male 36 33000 0
399 15594041 Female 49 36000 1

[400 rows x 5 columns]>

LabelEncoder (To convert Gender)

In [4]: from sklearn.preprocessing import LabelEncoder

In [5]: Le = LabelEncoder()

In [6]: suv_df['Gender'] = Le.fit_transform(suv_df['Gender'])

suv_df['Gender'].head(8)

Out[6]: 0 1
1 1
2 0
3 0
4 1
5 1
6 0
7 0
Name: Gender, dtype: int64

Separating features and labels

In [7]: x = suv_df.drop(['User ID', 'Purchased'], axis = 1)
x

Out[7]: Gender Age EstimatedSalary

0 1 19 19000

1 1 35 20000

2 0 26 43000

3 0 27 57000

4 1 19 76000

... ... ... ...

395 0 46 41000

396 1 51 23000

397 0 50 20000

398 1 36 33000

399 0 49 36000

400 rows × 3 columns

In [8]: y = suv_df['Purchased']
y

Out[8]: 0 0
1 0
2 0
3 0
4 0
..
395 1
396 1
397 1
398 0
399 1
Name: Purchased, Length: 400, dtype: int64

Spliting train and test datasets

In [9]: from sklearn.model_selection import train_test_split

In [10]: x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)

In [11]: # length or sample of train dataset

len(x_train)

Out[11]: 280

In [12]: # length or sample of test dataset

len(x_test)

Out[12]: 120

Using Logistic Regression

Parameters used in Logistic Regression:

The parameters used in Logistic Regression include:

1. Explanatory Variables: These are the independent variables or predictors in the model.
2. Response Variable: This is the dependent variable that we are trying to predict.
3. Logistic Function: This is the formula used to represent how the independent and
dependent variables relate to one another.
4. penalty: This specifies the norm used in the penalization.
5. dual: This is a boolean parameter that decides whether to solve the primal or dual
optimization problem.
6. tol: This is the tolerance for stopping criteria.
7. C: This is the inverse of regularization strength.
8. fit_intercept: This specifies if a constant (a.k.a. bias or intercept) should be added to
the decision function.
9. intercept_scaling: This is useful only when the solver ‘liblinear’ is used and
self.fit_intercept is set to True.
10. class_weight: This is an optional parameter that represents weights associated with
classes.
11. random_state: This parameter is used for shuffling the data.
12. solver: This parameter specifies the algorithm to use in the optimization problem.
13. max_iter: This is the maximum number of iterations for the solver to converge.
14. multi_class: This parameter specifies if the algorithm should use a one-vs-rest or
multinomial approach.
15. verbose: This is used for the verbosity of the output.
16. warm_start: This is a boolean parameter that decides whether to reuse the solution of
the previous call to fit as initialization.
17. n_jobs: This is the number of CPU cores used during the cross-validation loop.
18. l1_ratio: This parameter is used only when penalty is 'elasticnet'.

In [13]: from sklearn.linear_model import LogisticRegression

In [14]: log = LogisticRegression()

In [15]: # Fit the model

log.fit(x_train, y_train)

Out[15]: LogisticRegression()
In a Jupyter environment, please rerun this cell to show the HTML representation or
trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page
with nbviewer.org.

In [16]: y_test.head(8)

Out[16]: 208 1
227 1
268 1
9 0
327 0
119 0
134 0
349 0
Name: Purchased, dtype: int64
In [17]: y_pred = log.predict(x_test)
print(y_pred)

[0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 1 0 0]

Metrices used in Linear Regression:

There are several metrics that can be used to evaluate the performance of a Logistic
Regression model:

1. Accuracy: This is the ratio of the total number of correct predictions to the total number
of input samples.
2. Precision: This is the ratio of the total number of correct positive predictions to the total
number of positive predictions.
3. Recall (Sensitivity): This is the ratio of the total number of correct positive predictions
to the total number of actual positives.
4. F1-Score: This is the Harmonic Mean of Precision and Recall.
5. AUC-ROC (Area Under the Receiver Operating Characteristics Curve): This is a
performance measurement for classification problem at various thresholds settings.
6. Confusion Matrix: A table used to describe the performance of a classification model.
7. Log-Loss: It's a performance metric for evaluating the predictions of probabilities of
membership to a particular class.
8. AIC (Akaike Information Criteria): It's a method used to compare different models and
find out the best fitting model from the given different models.
9. Null Deviance and Residual Deviance: These are measures of goodness of fit for a
logistic regression model.
10. Mean Absolute Error (MAE): This is the mean of the absolute value of the errors.
11. Root Mean Square Error (RMSE): This is the square root of the average of squared
differences between prediction and actual observation.
12. Coefficient of Determination or R2: This metric provides an indication of the goodness
of fit of a set of predictions to the actual values.
13. Adjusted R2: This is a modified version of R-squared that has been adjusted for the
number of predictors in the model.

Remember, the choice of metric depends on your specific problem and the business context.
It's always a good idea to understand the assumptions and implications of each metric
before choosing one.

Viewing the prediction score

In [18]: log.score(x_test, y_test)

Out[18]: 0.65
In [19]: from sklearn import metrics

# Accuracy
accuracy = metrics.accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')

# Precision
precision = metrics.precision_score(y_test, y_pred, zero_division=0) # The z
print(f'Precision: {precision}') # 0, means that out of all the positive pre

# Recall
recall = metrics.recall_score(y_test, y_pred)
print(f'Recall: {recall}') # 0, means that out of all the positive predictio

# F1 Score
f1_score = metrics.f1_score(y_test, y_pred)
print(f'F1 Score: {f1_score}') # F1 score is the harmonic mean of precision

# AUC-ROC
auc_roc = metrics.roc_auc_score(y_test, y_pred)
print(f'AUC-ROC: {auc_roc}')

# Confusion Matrix
confusion_matrix = metrics.confusion_matrix(y_test, y_pred)
print(f'Confusion Matrix: \n{confusion_matrix}')
print()

# Mean Absolute Error (MAE)
mae = metrics.mean_absolute_error(y_test, y_pred)
print(f'MAE: {mae}') # 0.4167, means that on average, your model’s predictio

# Root Mean Square Error (RMSE)
rmse = np.sqrt(metrics.mean_squared_error(y_test, y_pred))
print(f'RMSE: {rmse}') # 0.6455, means that on average, the square root of t

# R2 Score
r2 = metrics.r2_score(y_test, y_pred) # R2 score of 1 indicates that the mod
print(f'R2 Score: {r2}') # A negative R2 score indicates that your model is

# Adjusted R2 Score
n = len(y_test) # Number of observations
k = 1 # Number of predictors
adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(f'Adjusted R2 Score: {adjusted_r2}') # A negative R2 score indicates t

Accuracy: 0.65
Precision: 0.3333333333333333
Recall: 0.07692307692307693
F1 Score: 0.125
AUC-ROC: 0.5014245014245013
Confusion Matrix:
[[75 6]
[36 3]]

MAE: 0.35
RMSE: 0.5916079783099616
R2 Score: -0.5954415954415957
Adjusted R2 Score: -0.6089622869283888
In [20]: # Printing AUC-ROC Curve
fpr, tpr, _ = metrics.roc_curve(y_test, y_pred) # '_' This will prevent the
# In Python, the underscore _ is often used as a “throwaway” variable to ind
auc = metrics.roc_auc_score(y_test, y_pred)

plt.plot(fpr, # Increasing false positive rates such that element i is the f
tpr, # Increasing true positive rates such that element i is the tr
label="AUC = "+str(auc))
plt.legend(loc=4)
plt.show()

Predict Purchased

In [21]: log_gender = LabelEncoder()

categories = ['Male', 'Female']
log_gender.fit(categories)

Out[21]: LabelEncoder()
In a Jupyter environment, please rerun this cell to show the HTML representation or
trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page
with nbviewer.org.

In [22]: purchase_predict = LabelEncoder()

categories = ['Yes', 'No']
purchase_predict.fit(categories)

Out[22]: LabelEncoder()
In a Jupyter environment, please rerun this cell to show the HTML representation or
trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page
with nbviewer.org.
In [23]: age = int(input("Enter the Age: "))

gender = input("Enter the Gender: ")
gender_var = log_gender.transform([gender])[0] # The [0] at the end is an in

salary = int(input("Enter the Salary: "))

label_map = {0: 'No', 1: 'Yes'}
predict_value = log.predict([[age, gender_var, salary]])

# Map the numerical prediction back to a string label
label_value = label_map[predict_value[0]]

print(label_value) # 0: 'Yes', 1: 'No'

Enter the Age: 21

Enter the Gender: Male
Enter the Salary: 35000
Yes

/usr/local/lib/python3.10/dist-packages/sklearn/base.py:439: UserWarning:
X does not have valid feature names, but LogisticRegression was fitted wit
h feature names
warnings.warn(

Machine Learning Lab Manual 06
100% (1)
Machine Learning Lab Manual 06
8 pages
4. Logistic Regression
No ratings yet
4. Logistic Regression
21 pages
Wa0004.
No ratings yet
Wa0004.
9 pages
ML-Unit 4
No ratings yet
ML-Unit 4
29 pages
Misc 5
No ratings yet
Misc 5
1 page
Logistic Regression[2]
No ratings yet
Logistic Regression[2]
36 pages
Logistic Regression
No ratings yet
Logistic Regression
14 pages
Practical - Logistic Regression
No ratings yet
Practical - Logistic Regression
84 pages
Module-2_Logistic Regression in Machine Learning
No ratings yet
Module-2_Logistic Regression in Machine Learning
28 pages
Logistics Regression
No ratings yet
Logistics Regression
8 pages
Logistic Regression
100% (1)
Logistic Regression
10 pages
Logistic Regression
No ratings yet
Logistic Regression
13 pages
Report Logistic Regression
No ratings yet
Report Logistic Regression
21 pages
09_23ECE216_LogisticRegression
No ratings yet
09_23ECE216_LogisticRegression
40 pages
Logistic Regression
No ratings yet
Logistic Regression
16 pages
Session9-LogisticRegression_a6c5bc556df30fa3eb779e22e464a08a - Copy
No ratings yet
Session9-LogisticRegression_a6c5bc556df30fa3eb779e22e464a08a - Copy
33 pages
Logistic Regression
No ratings yet
Logistic Regression
14 pages
Linear and Logistic Regression
No ratings yet
Linear and Logistic Regression
21 pages
ENCODING & Logistic Regression
No ratings yet
ENCODING & Logistic Regression
3 pages
Logistic Regression
No ratings yet
Logistic Regression
16 pages
Intro to Linear and Logistic Reg
No ratings yet
Intro to Linear and Logistic Reg
5 pages
Chp2 Logistic Regression
No ratings yet
Chp2 Logistic Regression
6 pages
Logistic Regression
No ratings yet
Logistic Regression
6 pages
Task1
No ratings yet
Task1
7 pages
B24 ML Exp-1
No ratings yet
B24 ML Exp-1
10 pages
AIML_Lab7_Manual (Model Eval-Cross Validation)
No ratings yet
AIML_Lab7_Manual (Model Eval-Cross Validation)
6 pages
ml (08-08-2024)
No ratings yet
ml (08-08-2024)
5 pages
Logistic Regression
No ratings yet
Logistic Regression
25 pages
Sonia Jessica - 2022 - How Does Logistic Regression Work
No ratings yet
Sonia Jessica - 2022 - How Does Logistic Regression Work
4 pages
DMML Unit4
No ratings yet
DMML Unit4
77 pages
FALLSEM2024-25_BCSE209L_TH_VL2024250101695_2024-08-12_Reference-Material-II
No ratings yet
FALLSEM2024-25_BCSE209L_TH_VL2024250101695_2024-08-12_Reference-Material-II
19 pages
Logistic Regression Presentation (1)
No ratings yet
Logistic Regression Presentation (1)
10 pages
MACHINE LEARNING Presentation Logistic Regression
No ratings yet
MACHINE LEARNING Presentation Logistic Regression
18 pages
Logistic Regression
No ratings yet
Logistic Regression
17 pages
Experiment No 3
No ratings yet
Experiment No 3
7 pages
Lecture Note #9_PEC-CS701E
No ratings yet
Lecture Note #9_PEC-CS701E
41 pages
Logistic Regression in Machine Learning
No ratings yet
Logistic Regression in Machine Learning
3 pages
Logistic regression
No ratings yet
Logistic regression
12 pages
Logistic Regression
No ratings yet
Logistic Regression
10 pages
FALLSEM2024-25 MMAT501L TH VL2024250107615 2024-09-24 Reference-Material-I
No ratings yet
FALLSEM2024-25 MMAT501L TH VL2024250107615 2024-09-24 Reference-Material-I
12 pages
Aiml Unit 3 1
No ratings yet
Aiml Unit 3 1
9 pages
logisticregression-190726150723 (1)
No ratings yet
logisticregression-190726150723 (1)
19 pages
Dav Exp4 66
No ratings yet
Dav Exp4 66
5 pages
logisticregression
No ratings yet
logisticregression
22 pages
Ml Assignment Kv2
No ratings yet
Ml Assignment Kv2
10 pages
Logistic Regression
No ratings yet
Logistic Regression
9 pages
Logistic Regression
No ratings yet
Logistic Regression
8 pages
05 Logistic - Regression
No ratings yet
05 Logistic - Regression
7 pages
ML Assignment
No ratings yet
ML Assignment
20 pages
week 8
No ratings yet
week 8
38 pages
AI lab8
No ratings yet
AI lab8
8 pages
2-Logistic Regression
No ratings yet
2-Logistic Regression
15 pages
MLStackCafe QAS 1672810525772
No ratings yet
MLStackCafe QAS 1672810525772
12 pages
Logistic Regression Algorithm
No ratings yet
Logistic Regression Algorithm
8 pages
Logistic Regression in Machine Learning
No ratings yet
Logistic Regression in Machine Learning
10 pages
Logistic Regressions
No ratings yet
Logistic Regressions
11 pages
29 LogisticRegression
No ratings yet
29 LogisticRegression
15 pages
03 Logistic Regression
No ratings yet
03 Logistic Regression
23 pages
Logistic Regression Report
No ratings yet
Logistic Regression Report
39 pages
MCS-011: Problem Solving and Programming
From Everand
MCS-011: Problem Solving and Programming
Dr. DK Sukhani
No ratings yet
PCA
No ratings yet
PCA
32 pages
neural network
No ratings yet
neural network
18 pages
Reinforcement Learning Notes ?
No ratings yet
Reinforcement Learning Notes ?
40 pages
clustering
No ratings yet
clustering
16 pages
C++ Programming: From Problem Analysis To Program Design, Fifth Edition
No ratings yet
C++ Programming: From Problem Analysis To Program Design, Fifth Edition
38 pages
Diamond Chip
No ratings yet
Diamond Chip
18 pages
2022 年下二笔三笔实务真题
No ratings yet
2022 年下二笔三笔实务真题
5 pages
BUS299 FinalExamAnswer
No ratings yet
BUS299 FinalExamAnswer
2 pages
Know Yourself and Your Destination Workshop
No ratings yet
Know Yourself and Your Destination Workshop
34 pages
Leading A Multigenerational Nursing Workforce - Issues, Challenges and Strategies
No ratings yet
Leading A Multigenerational Nursing Workforce - Issues, Challenges and Strategies
10 pages
Free Virtual Assistant Checklist
No ratings yet
Free Virtual Assistant Checklist
5 pages
Soil Analysis Report
No ratings yet
Soil Analysis Report
37 pages
Writing Essay (Smartphone in Class)
No ratings yet
Writing Essay (Smartphone in Class)
1 page
Global Sensitivity Analysis of WLS State Estimation For Distribution Systems
No ratings yet
Global Sensitivity Analysis of WLS State Estimation For Distribution Systems
164 pages
Factor of Safety
100% (1)
Factor of Safety
4 pages
MD400 en Userguide
No ratings yet
MD400 en Userguide
16 pages
HC-H120-PRO
No ratings yet
HC-H120-PRO
3 pages
Table of Contents
No ratings yet
Table of Contents
14 pages
Diffusion Of innovation theory
No ratings yet
Diffusion Of innovation theory
5 pages
CIS Microsoft Windows Server 2012 R2 Benchmark v2.2.1
No ratings yet
CIS Microsoft Windows Server 2012 R2 Benchmark v2.2.1
20 pages
Grade 4 DLL Quarter 2 Week 8 (Sir Bien Cruz)
No ratings yet
Grade 4 DLL Quarter 2 Week 8 (Sir Bien Cruz)
41 pages
A Hybrid CNN-LSTM: A Deep Learning Approach For Consumer Sentiment Analysis Using Qualitative User-Generated Contents
No ratings yet
A Hybrid CNN-LSTM: A Deep Learning Approach For Consumer Sentiment Analysis Using Qualitative User-Generated Contents
15 pages
Tandy's Septic Plan
No ratings yet
Tandy's Septic Plan
1 page
Solutions Short Notes - Learning Tales ND 7
No ratings yet
Solutions Short Notes - Learning Tales ND 7
2 pages
Bid 1323433
No ratings yet
Bid 1323433
2 pages
Unit 3
No ratings yet
Unit 3
28 pages
Iso 13366-1-2008
No ratings yet
Iso 13366-1-2008
24 pages
124
No ratings yet
124
20 pages
Optimal Drum Filtration Strategy in Enzyme Recovery 06-11-20 JPR
No ratings yet
Optimal Drum Filtration Strategy in Enzyme Recovery 06-11-20 JPR
76 pages
The History of Information Communication Technology
No ratings yet
The History of Information Communication Technology
8 pages
Portfolio: Teacher Induction Program
No ratings yet
Portfolio: Teacher Induction Program
14 pages
Website_Design_Trust_and_Culture_An_Eigh
No ratings yet
Website_Design_Trust_and_Culture_An_Eigh
5 pages
Chapter 1 Why Do Research
No ratings yet
Chapter 1 Why Do Research
12 pages
(Lecture 24) Result Oriented Task Management - Revised
No ratings yet
(Lecture 24) Result Oriented Task Management - Revised
34 pages

Logistic Regression in Machine Learning

Uploaded by

Logistic Regression in Machine Learning

Uploaded by

Logistic Regression (For Classification Problem)

(Code: Subhajit Das)

What is Logistic Regression?

Where we can use Logistic Regression:

1. Fraud Detection: Logistic regression can be used to predict whether a transaction is

Logistic Regression vs Linnear Regression

How Logistic Regression works:

Sigmoid Function in Logistic Regression:

The Sigmoid Function is a crucial part of Logistic Regression in machine learning. It is a

SUV Data Analysis

Out[2]: User ID Gender Age EstimatedSalary Purchased

0 15624510 Male 19 19000 0

1 15810944 Male 35 20000 0

2 15668575 Female 26 43000 0

3 15603246 Female 27 57000 0

4 15804002 Male 19 76000 0

Out[3]: <bound method DataFrame.info of User ID Gender Age EstimatedSalar

[400 rows x 5 columns]>

LabelEncoder (To convert Gender)

In [4]: from sklearn.preprocessing import LabelEncoder

In [6]: suv_df['Gender'] = Le.fit_transform(suv_df['Gender'])

Separating features and labels

Out[7]: Gender Age EstimatedSalary

... ... ... ...

400 rows × 3 columns

Spliting train and test datasets

In [9]: from sklearn.model_selection import train_test_split

In [10]: x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)

In [11]: # length or sample of train dataset

In [12]: # length or sample of test dataset

Using Logistic Regression

Parameters used in Logistic Regression:

In [13]: from sklearn.linear_model import LogisticRegression

In [14]: log = LogisticRegression()

In [15]: # Fit the model

Metrices used in Linear Regression:

Viewing the prediction score

In [18]: log.score(x_test, y_test)

In [21]: log_gender = LabelEncoder()

In [22]: purchase_predict = LabelEncoder()

Enter the Age: 21

You might also like