Trainity Data Analytics Training project 6

Data Analytics (Devi Ahilya Vishwavidyalaya)


Trainity Data Analytics Training

Project 6
Bank Loan Case Study
Date: 13-12-2023 Arpit Paliwal
[email protected]

Project Description: Conduct Exploratory Data Analysis (EDA) as a data analyst at a finance
company specializing in lending to urban customers. The company faces the challenge of
customers with insufficient credit history exploiting the system and defaulting on loans. The goal
is to use EDA to analyze patterns in the data and ensure that qualified applicants are not
rejected.

The dataset includes information on loan applications, categorized into customers with payment
difficulties (late payments on installments) and those without payment issues. Four possible
outcomes of a loan application are Approved, Canceled, Refused, and Unused Offer.

The business objectives are to identify patterns indicating if a customer will struggle with
installment payments. This information can be used to make decisions such as denying loans,
reducing loan amounts, or lending at higher interest rates to risky applicants. The company aims
to understand key factors behind loan defaults for better decision-making in loan approval.

The context of risk analytics in banking and financial services is crucial to understanding the
project, including the significance of various variables in predicting and mitigating loan default
risks.

Approach: The focus is on mitigating default risks, particularly from customers with insufficient
credit history. The dataset comprises two categories: customers with payment difficulties and
those without. Four possible loan application outcomes exist: Approved, Canceled, Refused,
and Unused Offer.

The primary business objectives are to identify patterns that signal potential payment difficulties
and to comprehend the key factors influencing loan defaults. Through EDA, the aim is to
optimize decision-making in loan approval by avoiding rejections for capable applicants while
mitigating financial losses associated with defaults. A foundational understanding of risk
analytics in banking and financial services is recommended to navigate the significance of
variables in this context.

Tech Stack Used: Microsoft Excel 2007 was the principal tool. The project relied on Excel's
built-in functions, data handling capabilities, and charting tools in both the analysis and
reporting phases.

Data Analysis Tasks:

A. Identify Missing Data and Deal with it Appropriately: As a data analyst, you come across
missing data in the loan application dataset. It is essential to handle missing data effectively to
ensure the accuracy of the analysis.
Task: Identify the missing data in the dataset and decide on an appropriate method to deal with
it using Excel built-in functions and features.
Primary Data Set: application_data.csv

Explanation: To handle missing data I did the following:

1. Calculated the percentage of blank cells for each column in a new row (50001) using the
function =(COUNTBLANK()/COUNT())*100. Note that COUNT counts only numeric cells, so ROWS()
is the safer denominator when a column holds text or many blanks.
2. With the help of conditional formatting, identified and deleted all columns in which more
than 40% of the cells were missing.
3. Filled the remaining missing cells with the median of that column (calculated in row 50002);
the median was chosen because the mean is distorted by outliers.

After cleaning, the data was left with 73 columns and 50002 rows, with 0 blank cells and no
duplicates.
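
The steps above can also be sketched outside Excel. The following is a minimal Python illustration, not the workbook the author used; the column names and values are hypothetical, and blanks are assumed to be represented as None.

```python
from statistics import median


def clean_missing(columns, max_missing_pct=40.0):
    """Drop columns with too many blanks, then median-fill the rest.

    `columns` maps a column name to a list of values; None marks a blank cell.
    """
    cleaned = {}
    for name, values in columns.items():
        # Step 1: percentage of blank cells, using the row count as denominator
        missing_pct = 100.0 * sum(v is None for v in values) / len(values)
        if missing_pct > max_missing_pct:
            continue  # Step 2: drop columns that are more than 40% blank
        # Step 3: fill the remaining blanks with the column median
        fill = median(v for v in values if v is not None)
        cleaned[name] = [fill if v is None else v for v in values]
    return cleaned


# Hypothetical sample, not taken from application_data.csv
sample = {
    "COL_A": [100.0, None, 300.0, 200.0, 10000.0],  # 20% blank: kept and filled
    "COL_MOSTLY_BLANK": [None, None, None, 4.0, 9.0],  # 60% blank: dropped
}
result = clean_missing(sample)
print(result)
```

In the sample, the single very large value in COL_A would pull a mean-based fill far upward, while the median-based fill stays near the typical values, which is the reason given above for preferring the median.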

Graph:

Graph of column with missing values <40%

B. Identify Outliers in the Dataset: Outliers can significantly impact the analysis and distort the
results. You need to identify outliers in the loan application dataset.
Task: Detect and identify outliers in the dataset using Excel statistical functions and features,
focusing on numerical variables.
Primary Data Set: application_data.csv
Explanation: To identify the outliers, the QUARTILE function was used as follows:
1. Calculated the first and third quartiles using the functions =QUARTILE(ARRAY,1) and
=QUARTILE(ARRAY,3).
2. Calculated the interquartile range (IQR) by subtracting the first quartile from the third quartile.
3. Calculated the lower and upper bounds using the formulas lower bound = Q1 - 1.5*IQR and upper
bound = Q3 + 1.5*IQR.
4. Created another column that checks whether each value lies between the lower and upper
bounds: it returns TRUE if the value is within the bounds and FALSE if the value is an outlier.
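
As a rough cross-check of the four steps above, here is a small Python sketch of the same IQR rule. It assumes that `statistics.quantiles` with `method="inclusive"` interpolates the same way as Excel's QUARTILE; the income values are hypothetical and the function names are illustrative only.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) bounds as Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def within_bounds(values, k=1.5):
    """Mirror the helper column: True inside the bounds, False for an outlier."""
    lower, upper = iqr_bounds(values, k)
    return [lower <= v <= upper for v in values]


# Hypothetical income values, not taken from application_data.csv
incomes = [90000, 112500, 135000, 157500, 180000, 202500, 1350000]
lower, upper = iqr_bounds(incomes)
flags = within_bounds(incomes)
print(lower, upper, flags)
```

Only the last value falls outside the bounds here, so it is the only one flagged False.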

Graphs: The scatter plots here visualize the outliers (only 15,000 rows were plotted because
Excel was freezing on a larger number of rows).

(Outliers were identified on these key amount/income columns in order to find candidates unfit
for the loan.)
However, hardly any outliers were found in the given dataset.

C. Analyze Data Imbalance: Data imbalance can affect the accuracy of the analysis, especially
for binary classification problems. Understanding the data distribution is crucial for building
reliable models.
Task: Determine if there is data imbalance in the loan application dataset and calculate the ratio
of data imbalance using Excel functions.
Primary Data: application_data

Explanation: To check the data imbalance in the dataset I created different pivot tables for
columns in which the data imbalance was to be checked and generated column charts for each
of them in a different sheet.
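
The ratio of data imbalance that the task asks for can be computed from the same counts a pivot table produces. A minimal Python sketch follows; the label values are hypothetical, and the coding 1 = payment difficulties, 0 = all other cases is an assumption not stated in the report.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return per-class counts and the majority-to-minority ratio."""
    counts = Counter(labels)
    return counts, max(counts.values()) / min(counts.values())


# Hypothetical target column: assumed 0 = no payment issues, 1 = payment difficulties
target = [0] * 23 + [1] * 2
counts, ratio = imbalance_ratio(target)
print(counts, ratio)
```

A ratio well above 1 means that raw counts in the later charts will be dominated by the majority class, so proportions within each class are usually easier to compare.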

Graphs:
Target:

Most of the people paid their installments on time; comparatively few had difficulties.

Contract Type:

Higher number of cash loans among clients than revolving loans.

Gender:

Significantly higher number of female applicants than male applicants.

Owning Realty:

Majority of the applicants are realty owners

Count of Children:

Most of the applicants are childless, which may suggest that young, career-focused applicants
are in the majority.

Organization Type:

Many of the applicants either have business entities or are self-employed.

D. Perform Univariate, Segmented Univariate, and Bivariate Analysis: To gain insights into the
driving factors of loan default, it is important to conduct various analyses on consumer and loan
attributes.
Task: Perform univariate analysis to understand the distribution of individual variables,
segmented univariate analysis to compare variable distributions for different scenarios, and
bivariate analysis to explore relationships between variables and the target variable using Excel
functions and features.
Primary Data: application_data
Secondary Data: previous_data

Analysis of application_data :
Univariate analysis: In this type of analysis the data consists of only one variable. It is
the simplest form of analysis, since the information deals with only one quantity that changes.
It does not deal with causes or relationships; its main purpose is to describe the data and
find patterns that exist within it.

Segmented univariate analysis: Segmented univariate analysis is an extension of univariate
analysis in which the variable is analyzed in subsets (here, as ranges).

I generated the frequency distribution histograms by creating classes (from the maximum and
minimum values) and bins, and then using Data Analysis > Histogram with an input range, an
output range, and the chart output option.
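
The class-and-bin construction can be illustrated with a short Python sketch. The starting value 25,650 and class width 250,000 are taken from the income ranges reported in this section; the income values themselves are hypothetical. Note one difference from Excel: this sketch uses half-open classes [lower, upper), whereas Excel's Histogram tool treats each bin value as an inclusive upper limit.

```python
from collections import Counter


def frequency_table(values, start, width):
    """Count values per fixed-width class [start + i*width, start + (i+1)*width)."""
    counts = Counter((v - start) // width for v in values)
    return {
        (start + i * width, start + (i + 1) * width): counts[i]
        for i in sorted(counts)
    }


# Hypothetical income values, not taken from application_data.csv
incomes = [45000, 112500, 135000, 270000, 300000, 540000]
table = frequency_table(incomes, start=25650, width=250000)
for (low, high), n in table.items():
    print(f"{low:>7} to {high:>7}: {n}")
```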

The majority of applicants fall within the income range of 25,650 to 275,650.

Loans are most prevalent in the lower credit range of 45,000 to 345,000, and as the credit
amount increases, the number of loans tends to decrease.

The majority of loans are obtained by individuals aged between 31 and 51. Loan counts across
the 21 to 61 age bracket are roughly similar, suggesting a fairly balanced distribution.
The number of individuals taking loans declines at higher age ranges.

Individuals in the working category are the most frequent borrowers, with commercial
associates following closely behind.

People who have been employed for 0-10 years apply for the most loans.

The highest number of loans is taken by married individuals, with singles coming in second.

The majority of individuals reside in apartments, with a comparatively lower number choosing to
live with their parents.

Bivariate analysis: Bivariate analysis is a statistical analysis in which two variables are
observed together. One variable is treated as dependent and the other as independent; they are
usually denoted by Y and X. Here we analyze how the two variables change together and to what
extent. Besides bivariate analysis, there are univariate analysis (one variable) and
multivariate analysis (multiple variables).

Amount income & target:

The highest incidence of defaults occurs among individuals with incomes in the range of 25,650
to 275,650. As income levels rise, both the number of loan applicants and the instances of
defaults decrease.
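
When reading a chart like this, it helps to separate the number of defaults in a class from the default rate within that class, because the largest class will usually also contain the most defaults. The sketch below is a hypothetical Python illustration of that distinction; the rows are invented, the class boundaries reuse the 25,650 start and 250,000 width reported above, and target = 1 is assumed to mean payment difficulties.

```python
def defaults_by_class(rows, start, width):
    """Summarise (amount, target) pairs per fixed-width income class.

    target is assumed to be 1 for payment difficulties and 0 otherwise.
    """
    stats = {}
    for amount, target in rows:
        i = (amount - start) // width
        key = (start + i * width, start + (i + 1) * width)
        total, defaults = stats.get(key, (0, 0))
        stats[key] = (total + 1, defaults + target)
    return {
        key: {"applicants": t, "defaults": d, "default_rate": d / t}
        for key, (t, d) in sorted(stats.items())
    }


# Hypothetical (income, target) rows, not taken from application_data.csv
rows = [
    (45000, 0), (67500, 0), (90000, 1), (112500, 0), (135000, 0),
    (157500, 1), (202500, 0), (270000, 0), (315000, 1), (450000, 0),
]
summary = defaults_by_class(rows, start=25650, width=250000)
for income_class, values in summary.items():
    print(income_class, values)
```

In this invented sample the lowest class has the most defaults by count but the lower default rate, which shows why the two measures should be reported separately.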

Amount credit:

The majority of loans are concentrated within the low credit range of 45,000 to 345,000, with a
notable occurrence of defaults. The highest default rates are observed in the credit range of
345,000 to 645,000. A decrease in both the number of loans and default occurrences is noted as
credit levels increase.

Age/Target:

▪ The 31 to 51 age range accounts for the most loans and the most defaults.

▪ Age groups between 21 and 61 have roughly similar counts,
indicating a balanced distribution.
▪ As the age range increases, both the number of people taking loans
and the number of defaults decrease.

Count of children/target:

People with 0 children default the most, followed by people with 1 and 2 children (people with
0 children are also the largest group of applicants).

Family Status/target:

The majority of defaulters are married, followed by singles.

previous_data univariate and bivariate analysis:

After cleaning the data by deleting duplicate rows, blank rows, and columns with more than
40% blank cells (as was done for application_data), the sheet is ready for analysis:
- 32 of the 37 columns remain after deleting the columns with mostly blank cells.
Univariate/Segmented Univariate analysis:
Amt_Goods_price:

The majority of applicants seek loans for goods priced between 0 and 2 lakh (200,000), and
the number of applications falls as the goods price rises.
Client type:

Most of the applicants are repeaters.

Payment Type:

Most applicants prefer to pay in cash through the bank, followed by the XNA category.

Bivariate analysis:
Contract Status/loan purpose:

Most XAP loans are approved, while most XNA loans are rejected.

Contract Status/client Type:

Repeat applicants have the highest approval rate, with new applications following closely
behind. Repeat applicants face nearly equal chances of either being canceled or refused.
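
A pivot of contract status by client type can be turned into within-group shares so that groups of different sizes are comparable. The following Python sketch shows one way to do that; the records are hypothetical, and the status labels follow the four outcomes named in the project description.

```python
from collections import Counter, defaultdict


def status_shares(records):
    """Return, for each client type, the share of each contract status."""
    counts = defaultdict(Counter)
    for client_type, status in records:
        counts[client_type][status] += 1
    shares = {}
    for client_type, status_counts in counts.items():
        total = sum(status_counts.values())
        shares[client_type] = {s: n / total for s, n in status_counts.items()}
    return shares


# Hypothetical (client type, contract status) records, not taken from previous_data
records = (
    [("Repeater", "Approved")] * 4
    + [("Repeater", "Canceled")] * 2
    + [("Repeater", "Refused")] * 2
    + [("New", "Approved"), ("New", "Refused")]
)
shares = status_shares(records)
print(shares)
```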

Contract Status/contract Type:

Consumer loans have a very low likelihood of being canceled. Cash loans have the highest
proportion of cancellations.
Contract Status/Payment Type:

The majority of applicants have their loans approved through cash via the bank, and
cancellations are rare in this category. On the other hand, most applicants with the designation
XNA experience cancellations.

E. Identify Top Correlations for Different Scenarios: Understanding the correlation between
variables and the target variable can provide insights into strong indicators of loan default.
Task: Segment the dataset based on different scenarios (e.g., clients with payment difficulties
and all other cases) and identify the top correlations for each segmented data using Excel
functions.

Primary Dataset: application_data

Explanation: To calculate the correlations for the different scenarios (using numeric data
only), I copied the columns with numeric data to a different sheet and calculated their
correlation matrix using Data > Data Analysis > Correlation.

Correlation matrix:

1. There is a strong correlation of 0.880 between CNT_CHILDREN and CNT_FAM_MEMBERS: the
number of family members rises with the number of children.
2. AMT_CREDIT and AMT_GOODS_PRICE have a very high positive correlation of 0.987,
signifying a close relationship between them.
3. The positive correlation of 0.769 between AMT_ANNUITY and AMT_CREDIT suggests
that the annuity is closely linked to the loan amount.
4. AGE has a weak negative correlation of -0.242 with YEAR_EMPLOYED,
indicating that older individuals tend to have slightly fewer years of employment.
5. REGION_RATING_CLIENT and REGION_RATING_CLIENT_W_CITY are negatively correlated with
REGION_POPULATION_RELATIVE (-0.532 and -0.530, respectively), indicating that people living
in more populated regions have lower ratings of their living regions.
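
The segmentation part of the task (top correlations separately for clients with payment difficulties and for all other cases) can be sketched in Python as follows. This is an illustrative outline under stated assumptions, not the Excel procedure used above: the data values are hypothetical, target = 1 is assumed to mean payment difficulties, and the helper names are invented for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def segment(columns, target, value):
    """Keep only the rows whose target equals `value`."""
    return {
        name: [v for v, t in zip(vals, target) if t == value]
        for name, vals in columns.items()
    }


def top_correlations(columns, k=3):
    """Return the k column pairs with the largest absolute correlation."""
    pairs = [
        (a, b, pearson(columns[a], columns[b]))
        for a, b in combinations(columns, 2)
    ]
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)[:k]


# Hypothetical rows; target is assumed to be 1 for payment difficulties
target = [0, 0, 0, 0, 1, 1, 1, 1]
data = {
    "CNT_CHILDREN": [0, 1, 2, 0, 0, 1, 2, 3],
    "CNT_FAM_MEMBERS": [2, 3, 4, 1, 1, 3, 4, 5],
    "AMT_CREDIT": [300000, 450000, 200000, 500000, 250000, 400000, 350000, 150000],
}
top_no_difficulty = top_correlations(segment(data, target, 0))
top_difficulty = top_correlations(segment(data, target, 1))
print(top_no_difficulty[0])
print(top_difficulty[0])
```

Sorting by absolute value keeps strong negative correlations in the ranking alongside strong positive ones.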

Insights: Based on the analysis conducted on the provided data, several observations can be
made. The majority of loan applicants have zero children, own real estate, and own a business
entity. Additionally, a significant portion of the applicants are female, and the majority
fall within the income range of 25,650 to 275,650. Furthermore, individuals with a working
income, 0-10 years of employment, and married status appear more likely to make timely
payments on the loan.

Result: This project gave me a deeper understanding of various Excel functionalities. Working
with histograms, correlation coefficients, and both univariate and bivariate analysis improved
my comprehension of statistical principles, and handling a larger and more complex dataset
improved my overall approach to solving data analysis problems.

Drive Links:
Application_data:
https://docs.google.com/spreadsheets/d/1a2dkSdjpqA1yosCl-q_HBArZ80VCkmhc/edit?usp=drive_link&ouid=103303747027981242683&rtpof=true&sd=true
previous_data:
https://docs.google.com/spreadsheets/d/1PXPDoJynWRnDlQR2bpBmMwUNwVax_n9Y/edit?usp=drive_link&ouid=103303747027981242683&rtpof=true&sd=true
