
Bank Customer Segmentation

Banking customer segmentation

Uploaded by

Raibhush
Copyright
© All Rights Reserved

Banking

Customer
Segmentation

Name Bhushan Rai


PGP-DSBA Online
January’ 21
Date: 27/06/2021

Proprietary content. ©Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited
Table of Contents

Executive Summary
Introduction
Data Description
Sample of the dataset
Exploratory Data Analysis
Let us check the types of variables in the data frame
Check for missing values in the dataset
Descriptive Statistics
1.1 Read the data, do the necessary initial steps, and exploratory data analysis (Univariate, Bi-variate, and multivariate analysis)
Histplot, Univariate Analysis
Skewness in data, distplot
Bivariate Analysis, pairplot
Correlation Plot
Check Outliers
1.2 Do you think scaling is necessary for clustering in this case? Justify
1.3 Apply hierarchical clustering to scaled data. Identify the number of optimum clusters using Dendrogram and briefly describe them
1.4 Apply K-Means clustering on scaled data and determine optimum clusters. Apply elbow curve and silhouette score. Explain the results properly. Interpret and write inferences on the finalized clusters
Visual representation of clusters
Recommendation / Conclusion

List of Figures
Fig.1 – Histplot, Distplot
Fig.2 – Histplot, Distplot
Fig.3 – Pair plot
Fig.4 – Heatmap
Fig.5 – Boxplot
Fig.6 – Dendrogram
Fig.7 – Elbow plot

List of Tables
Table 1. Dataset Sample
Table 2. Descriptive Statistics
Table 3. Skewness of data
Table 4. Correlation between observation
Table 5. Scaled data
Table 6. Number of clusters and frequency table
Table 7. Kmeans and sil width
Table 8. Grouping as per clusters

Executive Summary
A leading bank wants to develop a customer segmentation to give promotional offers to its customers. They
collected a sample that summarizes the activities of users during the past few months. You are given the task to
identify the segments based on credit card usage.

Introduction

The purpose is to explore the data set and identify the spending patterns of the customers in accordance with
their credit profile, so that promotional offers can be provided based on their transaction history.

Data Description

1: spending: Amount spent by the customer per month (in 1000s)
2: advance_payments: Amount paid by the customer in advance by cash (in 100s)
3: probability_of_full_payment: Probability of payment done in full by the customer to the bank
4: current_balance: Balance amount left in the account to make purchases (in 1000s)
5: credit_limit: Limit of the amount in credit card (in 10000s)
6: min_payment_amt: Minimum paid by the customer while making payments for purchases made monthly (in 100s)
7: max_spent_in_single_shopping: Maximum amount spent in one purchase (in 1000s)

Import the necessary libraries and load the dataset.

Sample of the dataset:

Table 1. Dataset Sample

Exploratory Data Analysis

Let us check the types of variables in the data frame

Check for missing values in the dataset:

Descriptive Statistics:
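
The three checks above (variable types, missing values, descriptive statistics) can be sketched in pandas as follows. This is only an illustration: the rows are made up for the example and are not values from the bank's dataset, and the variable names other than the seven column names are assumptions.

```python
import pandas as pd

# Made-up rows for illustration only; these are NOT values from the bank's dataset.
df = pd.DataFrame({
    "spending": [19.9, 16.0, 18.9, 10.8],
    "advance_payments": [16.9, 14.9, 16.4, 13.0],
    "probability_of_full_payment": [0.88, 0.91, 0.88, 0.81],
    "current_balance": [6.7, 5.4, 6.2, 5.3],
    "credit_limit": [3.8, 3.6, 3.7, 2.6],
    "min_payment_amt": [3.3, 2.5, 3.6, 5.2],
    "max_spent_in_single_shopping": [6.6, 5.1, 6.2, 5.2],
})

variable_types = df.dtypes          # data type of each variable
missing_counts = df.isnull().sum()  # number of missing values per variable
summary = df.describe().T           # count, mean, std, min, quartiles, max per variable

print(variable_types)
print(missing_counts)
print(summary)
```

Transposing the output of describe() gives one row per variable, which is easier to read when there are seven columns.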

1.1 Read the data, do the necessary initial steps, and exploratory data analysis
(Univariate, Bi-variate, and multivariate analysis)

Univariate Analysis:

Calculate the skewness in the dataset:

The data is right-skewed for all variables except probability_of_full_payment, which is left-skewed.
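
As a small illustration of how the sign of the skewness is read, the sketch below computes it with pandas on two made-up columns (the column names and values are assumptions chosen for the example, not data from the report). A positive value indicates a longer right tail and a negative value a longer left tail.

```python
import pandas as pd

# Made-up columns: one with a long right tail, one with a long left tail.
df = pd.DataFrame({
    "long_right_tail": [1.0, 2.0, 2.0, 3.0, 10.0],
    "long_left_tail": [0.50, 0.85, 0.88, 0.90, 0.91],
})

# DataFrame.skew() returns the sample skewness of each column.
skewness = df.skew()
print(skewness)
```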

Bivariate Analysis

All variables are highly correlated with each other.

Correlation Plot

Fig.4 – Correlation Heatmap
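
The matrix behind a correlation heatmap can be computed as in the following sketch. The data here is synthetic (generated with a fixed random seed) and the column names are assumptions for the example; only the pattern of the calculation is meant to carry over.

```python
import numpy as np
import pandas as pd

# Synthetic data: y_related is built from x, z_unrelated is independent noise.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
df = pd.DataFrame({
    "x": x,
    "y_related": 2.0 * x + 0.1 * rng.normal(size=50),
    "z_unrelated": rng.normal(size=50),
})

# Pairwise Pearson correlation (the pandas default) between all columns.
corr = df.corr()
print(corr.round(2))
```

If seaborn is installed, a matrix like this can be passed to seaborn.heatmap(corr, annot=True) to draw the plot.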

Check Outliers:

1. No missing values were found.
2. Outliers are present in only 2 variables: min_payment_amt and probability_of_full_payment.
3. The outliers are few and mild, so no treatment is needed.
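
A boxplot marks as outliers the points lying more than 1.5 times the interquartile range beyond the quartiles. The sketch below counts such points per column. The helper name count_iqr_outliers and the sample values are assumptions for the example, not part of the original analysis.

```python
import pandas as pd

def count_iqr_outliers(frame: pd.DataFrame, whisker: float = 1.5) -> pd.Series:
    """Count values outside [Q1 - whisker*IQR, Q3 + whisker*IQR] in each column."""
    q1 = frame.quantile(0.25)
    q3 = frame.quantile(0.75)
    iqr = q3 - q1
    outside = (frame < q1 - whisker * iqr) | (frame > q3 + whisker * iqr)
    return outside.sum()

# Made-up values: the first column contains one extreme point, the second none.
df = pd.DataFrame({
    "with_outlier": [1, 2, 3, 4, 5, 100],
    "without_outlier": [1, 2, 3, 4, 5, 6],
})
outlier_counts = count_iqr_outliers(df)
print(outlier_counts)
```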

1.2 Do you think scaling is necessary for clustering in this case? Justify

Yes, scaling is necessary. The variables are measured in different units and ranges (100s, 1000s, 10000s and a
probability), so without rescaling the variables with the largest ranges would dominate the distance calculations
used in clustering.
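
One common way to rescale is the z-score, which subtracts the column mean and divides by the column standard deviation. The sketch below assumes that choice (the report does not state which scaler was used) and applies it to made-up values. Using ddof=0 follows the same population standard deviation convention as scikit-learn's StandardScaler.

```python
import pandas as pd

# Made-up values on very different scales.
df = pd.DataFrame({
    "spending": [12.0, 15.0, 19.0, 11.0],                       # in 1000s
    "probability_of_full_payment": [0.85, 0.88, 0.91, 0.82],    # between 0 and 1
})

# z-score: each column ends up with mean 0 and (population) standard deviation 1.
scaled = (df - df.mean()) / df.std(ddof=0)
print(scaled.round(3))
```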

1.3 Apply hierarchical clustering to scaled data. Identify the number of optimum
clusters using Dendrogram and briefly describe them

Cluster Frequency:

The optimum number of clusters based on the dendrogram is 3. The hierarchical clustering shows a pattern
of high, medium and low spending, visible in variables such as max_spent_in_single_shopping and
probability_of_full_payment.
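
The step of cutting the hierarchy into 3 clusters and tabulating the cluster frequency can be sketched with SciPy as below. The Ward linkage method is an assumption (the report does not say which linkage was used), and the data is three synthetic, well-separated groups rather than the scaled bank data.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Synthetic stand-in for scaled data: three well-separated groups of 20 points.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(20, 2)) for c in (0.0, 10.0, 20.0)])

# Build the hierarchy (Ward linkage is an assumed choice).
Z = linkage(X, method="ward")

# Cut the tree so that at most 3 clusters remain; labels start at 1.
labels = fcluster(Z, t=3, criterion="maxclust")

# Cluster frequency table.
cluster_ids, frequencies = np.unique(labels, return_counts=True)
print(dict(zip(cluster_ids.tolist(), frequencies.tolist())))
```

With matplotlib installed, scipy.cluster.hierarchy.dendrogram(Z) draws the tree from the same linkage matrix.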

1.4 Apply K-Means clustering on scaled data and determine optimum clusters. Apply
elbow curve and silhouette score. Explain the results properly. Interpret and write
inferences on the finalized clusters.
Within-cluster sum of squares (WSS) for k = 1 to 14 clusters:

[1469.9999999999998,
659.171754487041,
430.6589731513006,
371.38509060801096,
327.21278165661346,
289.31599538959495,
262.98186570162267,
241.81894656086033,
223.91254221002725,
206.39612184786694,
193.2835133180646,
182.97995389115258,
175.11842017053073,
166.02965682631788]

The elbow is observed at around 3 to 4 clusters; however, we will go with 3 clusters for this
calculation.
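
A list of within-cluster sums of squares like the one above can be produced with scikit-learn's KMeans, whose inertia_ attribute holds the WSS of a fitted model. The sketch below runs on synthetic standardized data, and the n_init and random_state values are assumptions. For data standardized with the population standard deviation, the WSS at k = 1 equals rows × columns; the first value in the list, about 1470, would be consistent with 210 rows of 7 variables if the data was scaled that way.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in: three groups of 20 points, then z-scored (np.std uses ddof=0).
rng = np.random.default_rng(0)
raw = np.vstack([rng.normal(loc=c, scale=0.5, size=(20, 2)) for c in (0.0, 10.0, 20.0)])
X = (raw - raw.mean(axis=0)) / raw.std(axis=0)

# Fit K-Means for k = 1..14 and record the within-cluster sum of squares.
wss = []
for k in range(1, 15):
    model = KMeans(n_clusters=k, n_init=10, random_state=1).fit(X)
    wss.append(model.inertia_)

print([round(value, 2) for value in wss])
```

Plotting wss against k gives the elbow curve; the elbow is the point after which adding clusters reduces the WSS only slightly.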

The optimal number of clusters here would be 3.
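
The silhouette score can be used to compare candidate values of k: it lies between -1 and 1, and higher values mean tighter, better-separated clusters. The following sketch, on synthetic data with three groups and with assumed n_init and random_state values, picks the k with the highest average score.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in: three well-separated groups of 20 points.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(20, 2)) for c in (0.0, 10.0, 20.0)])

# Average silhouette score for each candidate k (it is undefined for k = 1).
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=1).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores, best_k)
```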

The KMeans group 2 is the high spending group

The KMeans group 1 is the medium spending group

The KMeans group 0 is the low spending group
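
Labelling groups as high, medium or low spending is usually done by profiling each cluster, for example by averaging the variables per group. The sketch below shows the idea on made-up rows; the column name kmeans_group and all values are hypothetical.

```python
import pandas as pd

# Made-up rows with hypothetical cluster labels in "kmeans_group".
df = pd.DataFrame({
    "spending": [18.5, 19.0, 14.5, 14.0, 11.5, 12.0],
    "probability_of_full_payment": [0.88, 0.89, 0.88, 0.87, 0.85, 0.84],
    "kmeans_group": [2, 2, 1, 1, 0, 0],
})

# Mean of every variable per group, plus the number of customers in each group.
profile = df.groupby("kmeans_group").mean()
profile["frequency"] = df["kmeans_group"].value_counts()

# Order the groups from highest to lowest average spending.
profile = profile.sort_values("spending", ascending=False)
print(profile)
```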

Conclusion and Recommendation:

There are 3 clusters, corresponding to high, medium and low spending.

The promotional strategy for each group can be:

group2: high spending

The credit limit can be raised and reward points offered, as the probability of full payment is also high.
Loans can be offered to users with a good track record.

group1: medium spending

These customers are maintaining their accounts. Offers such as a loyalty bonus, an increased credit limit and
more customer points can be provided to increase spending.

group0: low spending

More payment plans can be given so that these customers can catch up with their balance, and offers such as
daily transaction points should be provided.
