0Loan_Eligibility_prediction_Python.ipynb - Colab

Uploaded by

K arun kumar Arun

12/3/24, 1:51 PM 0Loan_Eligibility_prediction_Python.ipynb - Colab

Loan Eligibility prediction using Machine Learning Models in Python

Have you ever thought about the apps that can predict whether you will get your loan approved or not? In this article, we are going to develop
one such model that can predict whether a person will get his/her loan approved or not by using some of the background information of the
applicant like the applicant’s gender, marital status, income, etc.

Importing Libraries

In this step, we will be importing libraries like NumPy, Pandas, Matplotlib, etc.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sb
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn import metrics
from sklearn.svm import SVC
from imblearn.over_sampling import RandomOverSampler

import warnings
warnings.filterwarnings('ignore')

Loading Dataset

df = pd.read_csv('/content/loan_data.csv')
df.head()

   Loan_ID   Gender  Married  ApplicantIncome  LoanAmount  Loan_Status
0  LP001002  Male    No       5849             NaN         Y
1  LP001003  Male    Yes      4583             128.0       N
2  LP001005  Male    Yes      3000             66.0        Y
3  LP001006  Male    Yes      2583             120.0       Y
4  LP001008  Male    No       6000             141.0       Y


To see the shape of the dataset (number of rows and columns), we can use the shape attribute.

df.shape

(598, 6)

To print a summary of the dataset (column names, non-null counts and dtypes), we can use the info() method.

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 598 entries, 0 to 597
Data columns (total 6 columns):
 #   Column           Non-Null Count  Dtype
---  ------           --------------  -----
 0   Loan_ID          598 non-null    object
 1   Gender           598 non-null    object
 2   Married          598 non-null    object
 3   ApplicantIncome  598 non-null    int64
 4   LoanAmount       577 non-null    float64
 5   Loan_Status      598 non-null    object
dtypes: float64(1), int64(1), object(4)
memory usage: 28.2+ KB

To get values like the count, mean and min of each numeric column, we can use the describe() method.

df.describe()

       ApplicantIncome  LoanAmount
count       598.000000  577.000000
mean       5292.252508  144.968804
std        5807.265364   82.704182
min         150.000000    9.000000
25%        2877.500000  100.000000
50%        3806.000000  127.000000
75%        5746.000000  167.000000
max       81000.000000  650.000000

Exploratory Data Analysis

EDA refers to the detailed analysis of the dataset using plots such as distribution plots and bar plots.

Let’s start by plotting a pie chart for the Loan_Status column.

temp = df['Loan_Status'].value_counts()
plt.pie(temp.values,
        labels=temp.index,
        autopct='%1.1f%%')
plt.show()

Here we have an imbalanced dataset. We will have to balance it before training any model on this data.
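The imbalance can also be quantified numerically rather than read off the chart. The following is a minimal sketch, assuming a small hypothetical `demo` frame in place of the real `df`; the labels in it are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for df; these labels are invented for illustration.
demo = pd.DataFrame({'Loan_Status': ['Y', 'Y', 'Y', 'N', 'Y', 'N', 'Y', 'Y']})

# normalize=True returns class shares instead of raw counts.
shares = demo['Loan_Status'].value_counts(normalize=True)
print(shares)

# Ratio of the largest class share to the smallest;
# a value of 1.0 would mean perfectly balanced classes.
imbalance_ratio = shares.max() / shares.min()
print(imbalance_ratio)
```

Running the same two lines on `df['Loan_Status']` would give the shares that the pie chart displays as percentages.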

We specify the DataFrame df as the data source for the sb.countplot() function. The x parameter is set to the column name from which the
count plot is to be created, and hue is set to ‘Loan_Status’ to create count bars based on the ‘Loan_Status’ categories.

plt.subplots(figsize=(15, 5))
for i, col in enumerate(['Gender', 'Married']):
    plt.subplot(1, 2, i+1)
    sb.countplot(data=df, x=col, hue='Loan_Status')
plt.tight_layout()
plt.show()

One of the main observations we can draw here is that the chances of getting a loan approved for married people are quite low compared to
those who are not married.
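A count plot shows raw counts, which can be hard to compare when the groups differ in size. One way to check an observation like this is to compute the approval rate within each group. The sketch below assumes a small hypothetical `demo` frame (invented rows, not the real dataset) and uses `pd.crosstab` with `normalize='index'` so that each row sums to 1.

```python
import pandas as pd

# Hypothetical rows for illustration only; not taken from loan_data.csv.
demo = pd.DataFrame({
    'Married':     ['Yes', 'Yes', 'Yes', 'Yes', 'No', 'No', 'No', 'No'],
    'Loan_Status': ['Y',   'N',   'Y',   'Y',   'Y',  'N',  'N',  'Y'],
})

# Row-normalized cross-tabulation: share of each Loan_Status within each group.
rates = pd.crosstab(demo['Married'], demo['Loan_Status'], normalize='index')
print(rates)

# Approval rate (share of 'Y') per marital status.
approval_rate = rates['Y']
print(approval_rate)
```

Applied to `df`, the `Y` column of this table gives the approval rate per marital status directly, which is the quantity the observation is about.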

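plt.subplots(figsize=(15, 5))
for i, col in enumerate(['ApplicantIncome', 'LoanAmount']):
    plt.subplot(1, 2, i+1)
    # sb.distplot is deprecated in recent seaborn releases; histplot with a KDE overlay is the replacement.
    sb.histplot(df[col], kde=True, stat='density')
plt.tight_layout()
plt.show()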

There are some extreme outliers in the data that we need to remove.

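df = df[df['ApplicantIncome'] < 25000]
# LoanAmount peaks at 650 in describe(), so this threshold removes no real values.
# The comparison is False for NaN, so it does drop rows with a missing LoanAmount.
df = df[df['LoanAmount'] < 400000]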
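The effect of these two filters can be sketched on a tiny frame. The example below assumes a hypothetical `demo` frame: some values echo the head() and describe() outputs, and the remaining ones are invented. It illustrates that a boolean mask built with `<` is False for NaN, so missing values are filtered out along with rows above the threshold.

```python
import numpy as np
import pandas as pd

# Hypothetical rows for illustration; the 360.0 loan amount is invented.
demo = pd.DataFrame({
    'ApplicantIncome': [5849, 4583, 81000, 3000],
    'LoanAmount':      [np.nan, 128.0, 360.0, 66.0],
})

# Income filter: removes the single row with income 81000.
demo = demo[demo['ApplicantIncome'] < 25000]
rows_after_income_filter = len(demo)

# Loan amount filter: NaN < 400000 evaluates to False, so the NaN row is dropped.
demo = demo[demo['LoanAmount'] < 400000]
rows_after_amount_filter = len(demo)

print(rows_after_income_filter, rows_after_amount_filter)
print(demo['LoanAmount'].isna().sum())
```

If dropping missing values is the intent, `df.dropna(subset=['LoanAmount'])` states it more explicitly than a threshold comparison.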

Let’s see the mean loan amount for males as well as females. For that, we will use the groupby() method.

df.groupby('Gender').mean(numeric_only=True)['LoanAmount']

Gender    LoanAmount
Female    126.697248
Male      146.872294

The loan amount requested by males is higher than what is requested by females.

df.groupby(['Married', 'Gender']).mean(numeric_only=True)['LoanAmount']

Married  Gender    LoanAmount
No       Female    116.115385
         Male      135.959677
Yes      Female    153.322581
         Male      150.875740

Here is one more interesting observation: the loan amount requested by married applicants is generally higher than that requested by unmarried applicants. This may be one of the reasons for the earlier observation that married applicants have a lower chance of loan approval than unmarried applicants.

# Function to apply label encoding
def encode_labels(data):
    for col in data.columns:
        if data[col].dtype == 'object':
            le = LabelEncoder()
            data[col] = le.fit_transform(data[col])

    return data

# Apply the function to every object-typed column of the DataFrame
df = encode_labels(df)

# Generating Heatmap
sb.heatmap(df.corr() > 0.8, annot=True, cbar=False)
plt.show()
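LabelEncoder assigns integer codes in sorted order of the distinct values, which matters later when reading outputs that refer to classes 0 and 1. A minimal sketch, assuming a column that contains only the labels 'Y' and 'N' (the sample list is made up):

```python
from sklearn.preprocessing import LabelEncoder

# Made-up sample of Loan_Status values for illustration.
sample = ['Y', 'N', 'Y', 'Y', 'N']

le = LabelEncoder()
codes = le.fit_transform(sample)

# classes_ holds the original labels; a label's position is its integer code.
mapping = {label: code for code, label in enumerate(le.classes_)}
print(mapping)
print(list(codes))
```

Under that assumption, 'N' becomes 0 and 'Y' becomes 1. Note that `encode_labels` creates a new encoder per column and does not keep it, so the mapping for each column would have to be recovered this way if it is needed later.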

Data Preprocessing

In this step, we will split the data for training and testing. After that, we will preprocess the training data.

features = df.drop('Loan_Status', axis=1)
target = df['Loan_Status'].values

X_train, X_val, Y_train, Y_val = train_test_split(features, target,
                                                  test_size=0.2,
                                                  random_state=10)

# As the data was highly imbalanced we will balance
# it by adding repetitive rows of minority class.
ros = RandomOverSampler(sampling_strategy='minority',
                        random_state=0)
X, Y = ros.fit_resample(X_train, Y_train)

X_train.shape, X.shape

((456, 5), (638, 5))
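The shapes are consistent with how random oversampling works: the minority class is resampled with replacement until it matches the majority class, so the balanced set has twice the majority count. Since 638 = 2 × 319, the training split would contain 319 majority rows and 456 − 319 = 137 minority rows. The sketch below reproduces that arithmetic with plain NumPy on hypothetical data. It illustrates the idea only and is not imbalanced-learn's implementation; which class is the majority (assumed here to be class 1) is also an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical labels with the class counts inferred above (319 vs 137).
y_demo = np.array([1] * 319 + [0] * 137)
X_demo = np.arange(len(y_demo)).reshape(-1, 1)  # dummy single feature

classes, counts = np.unique(y_demo, return_counts=True)
minority = classes[np.argmin(counts)]
n_extra = counts.max() - counts.min()

# Draw minority rows with replacement until both classes have equal size.
minority_idx = np.flatnonzero(y_demo == minority)
extra_idx = rng.choice(minority_idx, size=n_extra, replace=True)

X_bal = np.vstack([X_demo, X_demo[extra_idx]])
y_bal = np.concatenate([y_demo, y_demo[extra_idx]])

print(X_demo.shape, X_bal.shape)
print(np.unique(y_bal, return_counts=True))
```

Because the resampling is applied to the training split only, the validation data keeps its original class distribution.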

We will now use standard scaling (StandardScaler) to normalize the features. The scaler is fitted on the training data only and then applied to the validation data.

# Normalizing the features for stable and fast training.
scaler = StandardScaler()
X = scaler.fit_transform(X)
X_val = scaler.transform(X_val)
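What standard scaling computes can be written out by hand: each feature is shifted by the training mean and divided by the training standard deviation, and the same training statistics are reused for the validation rows. A minimal NumPy sketch, assuming small illustrative arrays (the validation row is invented):

```python
import numpy as np

# Illustrative training rows: (ApplicantIncome, LoanAmount).
train = np.array([[2583.0, 120.0],
                  [3000.0,  66.0],
                  [4583.0, 128.0],
                  [6000.0, 141.0]])
# Hypothetical validation row.
val = np.array([[5849.0, 100.0]])

# Statistics come from the training data only.
mean = train.mean(axis=0)
std = train.std(axis=0)  # population standard deviation (ddof=0), as StandardScaler uses

train_scaled = (train - mean) / std
val_scaled = (val - mean) / std  # reuses training mean and std

print(train_scaled.mean(axis=0))  # approximately 0 per feature
print(train_scaled.std(axis=0))   # approximately 1 per feature
print(val_scaled)
```

Fitting on the training data alone keeps information about the validation set out of the preprocessing step.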

Model Development

We will use Support Vector Classifier for training the model.

from sklearn.metrics import roc_auc_score

model = SVC(kernel='rbf')
model.fit(X, Y)

# Note: despite the 'Accuracy' labels, these lines print ROC AUC scores
# computed from hard class predictions, not accuracy.
print('Training Accuracy : ', metrics.roc_auc_score(Y, model.predict(X)))
print('Validation Accuracy : ', metrics.roc_auc_score(Y_val, model.predict(X_val)))
print()

Training Accuracy : 0.6300940438871474

Validation Accuracy : 0.48198198198198194
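The values printed above come from `roc_auc_score` applied to the output of `model.predict`, that is, to hard 0/1 labels rather than continuous scores. In that situation the ROC AUC reduces to the mean of the true positive rate and the true negative rate (balanced accuracy), which is a different quantity from plain accuracy. The pure-Python sketch below checks this on made-up labels; the helper names are hypothetical.

```python
def pairwise_auc(y_true, scores):
    """ROC AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def balanced_accuracy(y_true, y_pred):
    """Mean of the true positive rate and the true negative rate."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


# Made-up labels and hard predictions for illustration.
y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 1, 1, 1, 1, 1, 0, 1]

auc_hard = pairwise_auc(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
print(auc_hard, bal_acc)
```

A value near 0.5 on the validation split therefore means the classifier separates the two classes about as well as random guessing would.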

Model Evaluation

Model evaluation can be done using a confusion matrix.

The SVC model has already been trained on the training data X and Y. Here we calculate the ROC AUC scores for both the training and validation datasets. The confusion matrix is built for the validation data using the confusion_matrix function from sklearn.metrics. Finally, we plot the confusion matrix as a seaborn heatmap.

from sklearn.metrics import confusion_matrix, roc_auc_score

# The model was already fitted above; here we only score it.
training_roc_auc = roc_auc_score(Y, model.predict(X))
validation_roc_auc = roc_auc_score(Y_val, model.predict(X_val))
print('Training ROC AUC Score:', training_roc_auc)
print('Validation ROC AUC Score:', validation_roc_auc)
print()
# Confusion matrix for the validation split (rows: true labels, columns: predictions).
cm = confusion_matrix(Y_val, model.predict(X_val))

Training ROC AUC Score: 0.6300940438871474

Validation ROC AUC Score: 0.48198198198198194
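ROC AUC is usually computed from continuous scores rather than from predicted labels, because the ranking of the scores is what the curve measures. For an SVC, `decision_function` returns such scores. The sketch below is one possible way to do this, shown on synthetic data from `make_classification` rather than the loan dataset; all names ending in `_demo` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary classification data, used only for illustration.
X_demo, y_demo = make_classification(n_samples=300, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo,
                                          test_size=0.2, random_state=10)

# Fit the scaler on the training split only, then fit the classifier.
scaler_demo = StandardScaler().fit(X_tr)
svc_demo = SVC(kernel='rbf').fit(scaler_demo.transform(X_tr), y_tr)

X_te_scaled = scaler_demo.transform(X_te)
scores = svc_demo.decision_function(X_te_scaled)  # signed distance to the decision boundary
labels = svc_demo.predict(X_te_scaled)            # hard 0/1 predictions

auc_from_scores = roc_auc_score(y_te, scores)
auc_from_labels = roc_auc_score(y_te, labels)
print('AUC from scores:', auc_from_scores)
print('AUC from labels:', auc_from_labels)
```

The two numbers generally differ, since the label-based value discards the ranking information within each predicted class.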

A confusion matrix is a table that compares predicted values to actual values for a dataset to evaluate the performance of a classification model.

plt.figure(figsize=(6, 6))
sb.heatmap(cm, annot=True, fmt='d', cmap='Blues', cbar=False)
plt.title('Confusion Matrix')
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.show()

from sklearn.metrics import classification_report

print(classification_report(Y_val, model.predict(X_val)))

              precision    recall  f1-score   support

           0       0.30      0.30      0.30        37
           1       0.67      0.67      0.67        78

    accuracy                           0.55       115
   macro avg       0.48      0.48      0.48       115
weighted avg       0.55      0.55      0.55       115
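The numbers in this report can be traced back to confusion matrix counts. The matrix itself is not printed as text in the notebook, so the counts below are inferred from the supports and rounded recalls in the report (an assumption, not the notebook's output). With those counts, the sketch recomputes precision, recall, F1 and accuracy in plain Python.

```python
# Inferred counts (assumption): rows are true labels, columns are predicted labels.
cm_inferred = [[11, 26],
               [26, 52]]


def per_class_metrics(cm, k):
    """Precision, recall and F1 for class k from a square confusion matrix."""
    tp = cm[k][k]
    predicted_k = sum(row[k] for row in cm)  # column sum
    actual_k = sum(cm[k])                    # row sum (support)
    precision = tp / predicted_k
    recall = tp / actual_k
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


for k in (0, 1):
    p, r, f = per_class_metrics(cm_inferred, k)
    print(k, round(p, 2), round(r, 2), round(f, 2), sum(cm_inferred[k]))

total = sum(sum(row) for row in cm_inferred)
accuracy = (cm_inferred[0][0] + cm_inferred[1][1]) / total
mean_recall = 0.5 * (cm_inferred[0][0] / sum(cm_inferred[0])
                     + cm_inferred[1][1] / sum(cm_inferred[1]))
print('accuracy:', round(accuracy, 2))
print('mean recall:', mean_recall)
```

The mean of the two recalls computed from these counts agrees with the validation ROC AUC score printed earlier, which is what one would expect when that score is computed from hard predictions.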


