Case Study
(Final Project – 2)
BY HARSHA YADAV
Project Description:
TECH-STACK USED
Software And The Version Used While Making The Project :
1. MS Excel (For working, analysing and reporting insights)
2. Microsoft PowerPoint (For presenting the detailed analysis)
Data Understanding:
Both CSV files will be checked for unnecessary data and unwanted
columns/rows, which will be cleaned or removed where necessary.
They will then be checked for outliers, if any, to find whether there
is skewness in the given columns that would affect the final
visualizations and insights. Data imbalance will also be checked.
Different types of analysis will be done to understand the
relationships between the variables and to find the driving factors.
Different visualizations will be examined to understand these
relationships.
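The outlier and imbalance checks described above can be illustrated outside Excel. The following is a minimal Python sketch, not the method used in this report: it assumes the common 1.5 x IQR rule for outliers and a 0/1 target column, and the income values are toy data invented for the example.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def imbalance_ratio(targets):
    """Ratio of non-defaulters (0) to defaulters (1) in a 0/1 target column."""
    defaults = sum(targets)
    return (len(targets) - defaults) / defaults

# Toy income column (assumed values): one very large income among typical ones.
incomes = [90_000, 112_500, 135_000, 157_500, 180_000, 202_500, 1_350_000]
outliers = iqr_outliers(incomes)

# Toy target column (assumed values): nine repaid loans for every default.
ratio = imbalance_ratio([0, 0, 0, 0, 0, 0, 0, 0, 0, 1])

print(outliers, ratio)
```

A large ratio signals that defaulters are a small minority, so default rates rather than raw counts should be compared across groups.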
AFTER CLEANING THE TABLES
Task 2 : Identify the missing data and use an appropriate method to deal
with it. (Remove columns or replace the missing values with an appropriate value.)
In Applicant_data.csv
Before cleaning, the numbers of columns and rows are 122 and 3075124, respectively.
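One possible way to express the "remove columns or replace values" step in code is sketched below. It is an illustration under stated assumptions, not the procedure applied in Excel for this report: the 40% drop threshold, the median/mode fill rule, the `clean_missing` helper, and the `MOSTLY_EMPTY` column are all hypothetical.

```python
from statistics import median, mode

def clean_missing(rows, drop_threshold=0.4):
    """Drop columns that are mostly missing; fill the rest with median or mode.

    rows: list of dicts with identical keys, where None marks a missing value.
    drop_threshold: assumed cut-off for the share of missing values.
    """
    cleaned = [dict(r) for r in rows]  # work on copies, keep the input intact
    for col in list(rows[0].keys()):
        present = [r[col] for r in rows if r[col] is not None]
        missing_share = 1 - len(present) / len(rows)
        if missing_share > drop_threshold:
            for r in cleaned:
                del r[col]  # too sparse to repair, remove the column
            continue
        if missing_share == 0:
            continue
        numeric = all(isinstance(v, (int, float)) for v in present)
        fill = median(present) if numeric else mode(present)
        for r in cleaned:
            if r[col] is None:
                r[col] = fill
    return cleaned

# Toy rows (assumed values); MOSTLY_EMPTY is a made-up column name.
rows = [
    {"AMT_INCOME_TOTAL": 100000, "OCCUPATION_TYPE": "Drivers", "MOSTLY_EMPTY": None},
    {"AMT_INCOME_TOTAL": None, "OCCUPATION_TYPE": "Laborers", "MOSTLY_EMPTY": None},
    {"AMT_INCOME_TOTAL": 200000, "OCCUPATION_TYPE": None, "MOSTLY_EMPTY": 1},
    {"AMT_INCOME_TOTAL": 300000, "OCCUPATION_TYPE": "Laborers", "MOSTLY_EMPTY": None},
]
cleaned = clean_missing(rows)
print(cleaned)
```

The median is used for numeric columns because it is less sensitive to outliers than the mean, which matters for skewed income and credit amounts.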
UNIVARIATE ANALYSIS :
• Individuals with higher incomes are less likely to apply for loans.
• The credit amount of a bank loan is typically in the range of 45000 to 1045000.
• The majority of loan applications have come from people between the ages of 35 and 50.
• Those with 0 to 8 years of work experience are the most likely to seek loans.
• Individuals who own homes are more likely to apply for loans than others.
• Those who are married have taken out more loans.
• More loans have been requested by working people.
• Unaccompanied minors have requested more loans.
SEGMENTED UNIVARIATE ANALYSIS
BIVARIATE ANALYSIS :
• Customers who live in low-rating areas will have higher defaults.
• Individuals with lower incomes are more likely to default.
• Young people are more likely to default, and the trend of defaulters
declines with age.
• Women are less likely than men to default.
• Clients on maternity leave and unemployed clients show more defaults.
• Customers with more than five family members are more likely to default
on their bank loan.
• Customers with fewer educational qualifications are more likely to default on a
bank loan.
• Customers with little work experience are more likely to default.
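Each bivariate finding above compares the share of defaulters across the categories of one variable. A minimal Python sketch of that comparison follows; it assumes the target column is named `TARGET` with 1 marking a default, and the rows are toy data rather than values from the report.

```python
from collections import defaultdict

def default_rate_by(rows, column, target="TARGET"):
    """Return {category: share of rows with target == 1} for one column."""
    counts = defaultdict(lambda: [0, 0])  # category -> [defaults, total]
    for row in rows:
        counts[row[column]][0] += row[target]
        counts[row[column]][1] += 1
    return {cat: d / n for cat, (d, n) in counts.items()}

# Toy data (assumed values), only to show the shape of the calculation.
rows = [
    {"CODE_GENDER": "M", "TARGET": 1},
    {"CODE_GENDER": "M", "TARGET": 0},
    {"CODE_GENDER": "F", "TARGET": 0},
    {"CODE_GENDER": "F", "TARGET": 0},
    {"CODE_GENDER": "M", "TARGET": 1},
    {"CODE_GENDER": "F", "TARGET": 1},
]
rates = default_rate_by(rows, "CODE_GENDER")
print(rates)
```

In Excel the same figure can be obtained with a PivotTable that averages the target column per category.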
Task 6 : Find the top 10 correlations for the clients with
payment difficulties and all other cases (Target variable).
Top 10 driving factors in current application.csv
1. Income type
2. Count of Family Members
3. Children count
4. External source
5. Region rating of client
6. Age
7. Months Employed
8. Amount credit
9. Amount Goods Price
10. Amount total income
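One possible reading of this task is to rank numeric columns by the absolute value of their Pearson correlation with the target. The sketch below illustrates that reading only; it is not necessarily how the list above was produced. The helper names and the toy values are assumptions, and categorical columns such as income type would need to be encoded numerically before they could be ranked this way.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def top_correlations(columns, target, k=10):
    """Rank columns by absolute correlation with the target, strongest first."""
    scores = {name: pearson(vals, target) for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)[:k]

# Toy data (assumed values): 1 in the target marks a payment difficulty.
target = [0, 0, 1, 1, 0, 1]
columns = {
    "AMT_CREDIT": [500000, 300000, 450000, 400000, 350000, 420000],
    "CNT_CHILDREN": [0, 1, 2, 3, 0, 2],
}
ranked = top_correlations(columns, target)
for name, r in ranked:
    print(f"{name}: {r:.3f}")
```

Sorting by absolute value keeps strong negative correlations in the ranking, since a factor that lowers default risk is as informative as one that raises it.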
Insights
• NAME_EDUCATION_TYPE: Clients with an academic degree have fewer defaults.
• NAME_INCOME_TYPE: Students and businessmen have no defaults.
• REGION_RATING_CLIENT: Rating 1 is safer.
• ORGANIZATION_TYPE: Clients with Trade type 4 and 5 and Industry type 8 have a default rate below 3%.
• DAYS_BIRTH: People above the age of 50 have a low probability of defaulting.
• DAYS_EMPLOYED: Clients with 40+ years of experience have a default rate below 1%.
• AMT_INCOME_TOTAL: Applicants with an income of more than 700,000 are less likely to default.
• NAME_CASH_LOAN_PURPOSE: Loans taken for a hobby or for buying a garage are mostly repaid.
• CNT_CHILDREN: People with zero to two children tend to repay their loans.
• CODE_GENDER: Men have a relatively higher default rate.
• NAME_FAMILY_STATUS: People in a civil marriage or who are single default more often.
• NAME_EDUCATION_TYPE: People with Lower Secondary & Secondary education default more often.
• NAME_INCOME_TYPE: Clients who are on maternity leave or unemployed default more often.
• REGION_RATING_CLIENT: People who live in Rating 3 areas have the highest defaults.
• OCCUPATION_TYPE: Avoid Low-skill Laborers, Drivers, Waiters/barmen staff, Security staff, Laborers and Cooking staff, as their default rate is high.
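The age-based insight relies on turning DAYS_BIRTH into an age group. The sketch below shows one way to do that, assuming DAYS_BIRTH stores the client's age as a negative count of days before the application date and approximating a year as 365 days. Both points, along with the `age_band` helper and the 10-year band width, are assumptions for illustration.

```python
def age_band(days_birth, width=10):
    """Map a day count (assumed negative, days before application) to an age band."""
    age = abs(days_birth) // 365  # approximate age in whole years
    low = (age // width) * width  # lower edge of the band
    return f"{low}-{low + width - 1}"

# Toy values (assumed): roughly 50 and 32 years old.
bands = [age_band(-18250), age_band(-12000)]
print(bands)
```

The resulting bands can then be used as the category when comparing default rates across age groups.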
Result
• After performing the analysis, we can assess whether a client is
likely to repay the loan or not.
• The people who are most likely to face problems in loan repayment are
labourers.
• People with Secondary / secondary special education might face
problems in loan repayment.
• Moreover, those who live in a house/apartment are facing
difficulty in loan repayment (possibly because of an additional home
loan, EMIs and so on).
***End of report***