Project 2
Campaign
Bank Loan (AI/ML)
Date : 20 July 2023
Proprietary content. © Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited.
Contents / Agenda
● Executive Summary
Executive Summary
● AllLife Bank is a US bank with a growing customer base. Most customers are
liability customers (depositors); only a small number are also borrowers
(asset customers).
● AllLife Bank ran one marketing campaign to convert liability customers into
borrowers while keeping them as depositors; its success rate was 9%. This gives the
bank a basis for increasing its lending and earning more interest.
● AllLife Bank needs to introduce promotional offers that encourage liability customers
to take up securities accounts, mortgages, credit cards, and personal loans, so that
it can earn more interest.
● Study customer profiles and base new promotional loan offers on them.
● Most clients use online banking, so approach them through that channel and promote
the different products for sign-up.
Business Problem Overview and Solution Approach
● The bank has many depositors (liability customers) but comparatively few
borrowers (asset customers). Its income comes from interest, so its primary
goal is to increase the number of borrowers.
● Use the online platform to introduce promotional offers for credit cards, mortgages,
personal loans, and securities accounts.
● Study each client's profile and offer personal loans based on income,
family size, credit card usage, and mortgage.
EDA Results
● Age: min 23, max 67, mean 45 years. The distribution looks roughly uniform.
● Experience: min 0 years, max 43 years, mean approximately 20 years. The distribution
again looks roughly uniform.
● Income: min 46K, max 224K, mean 64K. The distribution is right-skewed.
● Average credit card spending (CCAvg): min 0K (which may reflect customers who do not
own a credit card), max 10K, mean approx. 1.9K. There are a few outliers.
● Only 1,538 of the 5,000 customers have a mortgage. Among those 1,538, the mortgage
distribution is right-skewed, with a minimum of approx. 99K, a maximum of 635K, and a
mean of USD 180-200K.
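The summary statistics above can be reproduced with a few pandas calls. The following is a minimal sketch on synthetic data, not the bank's dataset; the column names (Age, Income, Mortgage) and the generated values are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the customer data (NOT the real dataset).
# Column names and distributions are assumptions for illustration only.
rng = np.random.default_rng(1)
n = 5000
df = pd.DataFrame({
    "Age": rng.integers(23, 68, size=n),                  # integers in 23..67
    "Income": rng.lognormal(mean=4.0, sigma=0.6, size=n),  # right-skewed
    "Mortgage": np.where(rng.random(n) < 0.3,
                         rng.lognormal(5.0, 0.5, n), 0.0), # 0 means no mortgage
})

# Min, mean, and max per column, as reported in the bullets above.
summary = df.describe().T[["min", "mean", "max"]]

# Positive skewness indicates a right-skewed distribution.
skewness = df.skew()

# Describe mortgage values only for customers who actually have one.
has_mortgage = df.loc[df["Mortgage"] > 0, "Mortgage"]

print(summary)
print(skewness)
print(len(has_mortgage), has_mortgage.mean())
```

Filtering out zero-valued mortgages before describing the column avoids letting the many customers without a mortgage dominate the mean and the skew.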
EDA Results
● Family size: 29.4% of customers are size 1, 25.9% are size 2, 20.2% are size 3, and
24.4% are size 4.
● Education: 41.9% of customers are 1: Undergrad, 28.1% are 2: Graduate, and 30.0%
are 3: Advanced/Professional.
● ZIP code: the most common first two digits are 94.
● More than 4000 (90%) of customers did not accept a personal loan.
● More than 4000 (94%) of customers do not have a CD account.
● More than 4000 (89.6%) of customers do not have a Securities account.
● More than 3000 (70%) of customers do not use a credit card issued by a different bank.
● 59.7% of customers use the online banking services
EDA Results
● Customers who accepted a personal loan are concentrated as follows:
● Income has a strong effect on the decision to accept a personal loan: the higher the
income, the more likely the client is to accept.
● As mortgage value increases, the customer is more likely to accept a personal loan.
● As family size grows, clients are more willing to accept personal loans.
Data Preprocessing
● The data is split into train, validation, and test sets to evaluate the trained model.
● False Positive: predicting that a customer will accept a loan when the customer would
not. This is a loss of resources.
● False Negative: predicting that a customer will not accept a loan when the customer
would have accepted. This is a loss of opportunity and the more important error,
because the bank loses a potential customer.
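Because false negatives are the costlier error here, recall (TP / (TP + FN)) is the natural metric to watch. The sketch below shows one way to make a stratified train/validation/test split and to read FN and recall from a confusion matrix. The synthetic data, the 60/20/20 proportions, and the toy predictions are assumptions; the deck does not state the split sizes actually used.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix, recall_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in data (about 10% positives); NOT the bank data.
X, y = make_classification(n_samples=5000, n_features=8,
                           weights=[0.9, 0.1], random_state=1)

# Two stratified splits give 60% train, 20% validation, 20% test (assumed sizes).
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=0.25, stratify=y_temp, random_state=1)

# Toy labels and predictions to show how the two error types are counted.
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])
y_pred = np.array([1, 0, 1, 0, 1, 0, 0, 1])

# For binary labels, ravel() returns the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# Recall falls as false negatives (missed borrowers) increase.
recall = tp / (tp + fn)
print(f"FP={fp} (wasted effort), FN={fn} (lost opportunity), recall={recall:.2f}")
```

Stratifying both splits keeps the share of loan acceptors roughly the same in each subset, which matters when only about one customer in ten is a positive case.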
Model Building
Model Performance Summary
● The model is overfitting the data: the confusion matrix shows 0% false negatives (FN)
and 0% false positives (FP).
● DecisionTreeClassifier(class_weight={0: 0.15, 1: 0.85}, random_state=1)
1. Income
2. Education-Graduate
3. CCAvg
4. Education-Advanced/Professional
5. Family
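A class-weighted tree like the one above, and a ranking of features by importance, can be sketched as follows. This runs on synthetic data with placeholder feature names (feature_0 to feature_4), so the ranking it prints is illustrative only and unrelated to the list above; only the classifier settings are taken from the slide.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data; NOT the bank data.
X, y = make_classification(n_samples=2000, n_features=5, n_informative=3,
                           n_redundant=0, weights=[0.9, 0.1], random_state=1)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # placeholder names

# Same settings as on the slide: the minority class (1) gets the larger weight.
model = DecisionTreeClassifier(class_weight={0: 0.15, 1: 0.85}, random_state=1)
model.fit(X, y)

# Impurity-based importances sum to 1; sort them to get a ranking.
importances = pd.Series(model.feature_importances_, index=feature_names)
importances = importances.sort_values(ascending=False)

# An unpruned tree typically fits its own training data almost perfectly,
# which is why a perfect training confusion matrix hints at overfitting.
train_accuracy = model.score(X, y)

print(importances)
print(f"training accuracy: {train_accuracy:.3f}")
```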
Model Performance Improvement
● The original tree is quite complex and overfits the training data set, so
pre-pruning and post-pruning are considered to improve model performance.
● For post-pruning, train a decision tree for each effective alpha. The last value in
ccp_alphas (0.17) is the alpha that prunes the whole tree, leaving the tree, clfs[-1],
with one node.
● We tried three models: Decision Tree sklearn, Decision Tree (Pre-Pruning),
and Decision Tree (Post-Pruning).
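The post-pruning step can be sketched with scikit-learn's cost-complexity pruning path. The example below uses synthetic data, so its alpha values will differ from the 0.17 reported above. Choosing the alpha with the best validation recall is an assumed selection rule, not necessarily the one used in this project.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data; NOT the bank data.
X, y = make_classification(n_samples=2000, n_features=8,
                           weights=[0.9, 0.1], random_state=1)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1)

weights = {0: 0.15, 1: 0.85}
base = DecisionTreeClassifier(class_weight=weights, random_state=1)

# Effective alphas at which successive subtrees get pruned away.
path = base.cost_complexity_pruning_path(X_train, y_train)
# Guard against tiny negative values caused by floating-point error.
ccp_alphas = np.clip(path.ccp_alphas, 0, None)

# Fit one tree per alpha; larger alphas give smaller trees.
clfs = []
for alpha in ccp_alphas:
    clf = DecisionTreeClassifier(ccp_alpha=alpha, class_weight=weights,
                                 random_state=1)
    clf.fit(X_train, y_train)
    clfs.append(clf)
node_counts = [clf.tree_.node_count for clf in clfs]

# Skip the last tree (pruned down to the root) and choose the alpha
# with the highest recall on the validation set (assumed criterion).
recalls = [recall_score(y_val, clf.predict(X_val)) for clf in clfs[:-1]]
best_index = max(range(len(recalls)), key=recalls.__getitem__)
best_alpha = ccp_alphas[best_index]

print(f"alphas tried: {len(ccp_alphas)}, nodes: {node_counts[0]} -> {node_counts[-1]}")
print(f"best alpha: {best_alpha:.5f}, validation recall: {recalls[best_index]:.3f}")
```

Selecting alpha on a validation set rather than on the training set is what lets pruning reduce overfitting; training scores alone always favour the largest tree.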
Model Performance Improvement
● Accuracy: 1.0
● Recall: 1.0
● Precision: 1.0
● F1: 1.0
Model Performance Improvement
The training performance comparison for the Decision Tree models on the loan data shows:
All three models (Decision Tree sklearn, Decision Tree (Pre-Pruning), and Decision Tree (Post-Pruning))
achieved perfect accuracy, recall, precision, and F1-score, with each metric being 1.0.
This means the models classified every instance in the training set correctly, capturing all
positive cases without any false positives. Perfect training scores do not by themselves show that
a model generalizes; performance on held-out data is needed to judge overfitting.
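A train-versus-test comparison of the three model types could be tabulated as sketched below. The data is synthetic, and the pre-pruning settings (max_depth=4, min_samples_leaf=10) and the post-pruning ccp_alpha=0.005 are assumed values chosen for illustration, not the hyperparameters used in this project.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data; NOT the bank data.
X, y = make_classification(n_samples=3000, n_features=8,
                           weights=[0.9, 0.1], random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1)

weights = {0: 0.15, 1: 0.85}
# Hyperparameters for the pruned trees are assumptions for illustration.
models = {
    "Decision Tree sklearn": DecisionTreeClassifier(
        class_weight=weights, random_state=1),
    "Decision Tree (Pre-Pruning)": DecisionTreeClassifier(
        max_depth=4, min_samples_leaf=10, class_weight=weights, random_state=1),
    "Decision Tree (Post-Pruning)": DecisionTreeClassifier(
        ccp_alpha=0.005, class_weight=weights, random_state=1),
}

def scores(model, X_part, y_part):
    """Return the four metrics reported in the deck for one data subset."""
    pred = model.predict(X_part)
    return {
        "Accuracy": accuracy_score(y_part, pred),
        "Recall": recall_score(y_part, pred),
        "Precision": precision_score(y_part, pred, zero_division=0),
        "F1": f1_score(y_part, pred, zero_division=0),
    }

rows = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    rows[(name, "train")] = scores(model, X_train, y_train)
    rows[(name, "test")] = scores(model, X_test, y_test)

# One row per (model, subset), one column per metric.
comparison = pd.DataFrame(rows).T
print(comparison.round(3))
```

Placing train and test rows side by side makes any gap between them visible, and that gap is the practical sign of overfitting.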
APPENDIX
Data Background and Contents
Happy Learning !