Credit Card Segmentation

1) The document discusses using k-means, DBSCAN, and hierarchical clustering algorithms to segment customers into groups based on their credit card usage behavior from a dataset of about 9000 credit card holders. 2) K-means clustering was applied after preprocessing the data through mean imputation and standardization. DBSCAN overcomes some limitations of k-means by forming clusters based on density rather than distances from means. 3) Hierarchical clustering grouped similar data objects into distinct clusters where objects within each cluster are broadly similar to each other. 4) The analysis identified 4 main customer clusters characterized by different income levels and purchasing behaviors.

Uploaded by

Elizebeth Shiju

CLASS ACTIVITY 3 Part A

CREDIT CARD SEGMENTATION

Elizabeth Shiju C0777014


Gayathri Sasidharan C0777013

In this assessment, we implement customer segmentation based on credit card
usage behavior with three different approaches: K-means, DBSCAN, and
hierarchical clustering. We used the credit card user segmentation data set from
Kaggle, which summarizes the usage behavior of about 9000 active credit card
holders during the last 6 months.

It includes the following variables:

● CUST_ID: Identification of the credit card holder
● BALANCE: Balance amount left in their account to make purchases
● BALANCE_FREQUENCY: How frequently the balance is updated, score between 0 and 1
● PURCHASES: Amount of purchases made from the account
● ONEOFF_PURCHASES: Maximum purchase amount done in one go
● INSTALLMENTS_PURCHASES: Amount of purchases done in installments
● CASH_ADVANCE: Cash in advance given by the user
● PURCHASES_FREQUENCY: How frequently purchases are being made, score
between 0 and 1 (1 = frequently purchased, 0 = not frequently purchased)
● ONEOFF_PURCHASES_FREQUENCY: How frequently purchases are happening in one go
(1 = frequently purchased, 0 = not frequently purchased)
● PURCHASES_INSTALLMENTS_FREQUENCY: How frequently purchases in installments are
being done (1 = frequently done, 0 = not frequently done)
● CASHADVANCE_FREQUENCY: How frequently the cash in advance is being paid
● CASHADVANCE_TRX: Number of transactions made with "Cash in Advanced"
● PURCHASES_TRX: Number of purchase transactions made
● CREDIT_LIMIT: Limit of the credit card for the user
● PAYMENTS: Amount of payment done by the user
● MINIMUM_PAYMENTS: Minimum amount of payments made by the user
● PRC_FULL_PAYMENT: Percent of full payment paid by the user
● TENURE: Tenure of credit card service for the user

K-Means Clustering
First, to preprocess the data, we replaced the missing values in the data set
with the mean value of the corresponding variable. We also standardized the data
and applied the PCA technique for dimensionality reduction.
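
The mean imputation and standardization steps can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in this activity. The helper names `impute_mean` and `standardize` and the toy values are assumptions made up for the example, and the PCA step is not shown.

```python
from statistics import mean, pstdev


def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def standardize(column):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]


# Hypothetical toy values standing in for one numeric column of the data set.
column = [100.0, None, 300.0, 200.0]

filled = impute_mean(column)   # the missing entry becomes the column mean
scaled = standardize(filled)   # zero mean, unit standard deviation

print(filled)
print([round(v, 3) for v in scaled])
```

Standardizing matters here because the variables are on very different scales (amounts versus scores between 0 and 1), and distance-based clustering would otherwise be dominated by the largest-valued variables.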
[Figure: graph plotted using the k-means algorithm]

One of the problems with k-means is that the number of clusters has to be
specified before running it, and in most cases this number is difficult to know
in advance.
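
One common way to approach the choice of the number of clusters is to run k-means for several values of k and compare the within-cluster sum of squared distances (often called inertia). The sketch below is a plain-Python version of the standard k-means iteration on made-up 2-D points. The function names, the toy data, and the use of this comparison are assumptions for illustration and are not taken from the analysis above.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=100, seed=0):
    """Basic k-means; returns (centroids, labels, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct data points as initial centroids
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

centroids, labels, inertia = kmeans(points, k=2)

# Compare inertia across candidate values of k.
inertia_by_k = {k: kmeans(points, k)[2] for k in range(1, 5)}
for k, value in inertia_by_k.items():
    print(k, round(value, 3))
```

On data like this, the inertia drops sharply from k = 1 to k = 2 and changes little afterwards, which suggests two clusters. On real data the bend is usually less clear-cut.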

DBSCAN Clustering
The density-based spatial clustering of applications with noise (DBSCAN)
algorithm forms clusters based on density rather than on distances from cluster
means. In k-means, clusters depend on the mean value of the cluster elements, so
a slight change in the data points might affect the clustering outcome. DBSCAN
overcomes this issue because of the way its clusters are formed.

In this case we used two parameters of DBSCAN: minpts, i.e. the minimum number
of points clustered together, and eps, a distance measure used to find points
which are nearby.
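
To illustrate how these two parameters interact, here is a minimal plain-Python sketch of the DBSCAN procedure on made-up 2-D points. The function names, the toy data, and the parameter values are assumptions for the example. They are not the settings used in this activity, which are not stated.

```python
from math import dist

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point; NOISE (-1) marks points in no cluster."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        # points[i] is a core point: start a new cluster and grow it.
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels


# Hypothetical toy data: two dense groups and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
          (5.0, 5.0)]

labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

Unlike k-means, the number of clusters is not given as input. It follows from eps and the minimum number of points, and the isolated point is reported as noise instead of being forced into a cluster.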

Agglomerative clustering
Using agglomerative clustering, we grouped similar data objects into clusters.
The endpoint is a set of clusters, where each cluster is distinct from each
other cluster, and the objects within each cluster are broadly similar to each other.
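
The bottom-up merging behind agglomerative clustering can be sketched as below. This is a minimal plain-Python illustration on made-up 2-D points. It assumes average linkage (the mean pairwise distance between two clusters) as the merge criterion, which is a choice made for the example because the linkage used in this activity is not stated. The function names are likewise hypothetical.

```python
from math import dist


def average_linkage(points, a, b):
    """Mean pairwise distance between two clusters given as lists of indices."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerative(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start: every point on its own
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average distance.
        a, b = min(
            ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))),
            key=lambda pair: average_linkage(points, clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a].extend(clusters[b])  # merge cluster b into cluster a
        del clusters[b]                  # b > a, so index a is unaffected
    return clusters


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

clusters = agglomerative(points, n_clusters=2)
print(sorted(sorted(c) for c in clusters))  # [[0, 1, 2], [3, 4, 5]]
```

Stopping the merges at a chosen number of clusters gives the final set of clusters. Recording every merge instead would give the full hierarchy.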

Conclusion
Four main clusters were identified:

● CLUSTER 1 - PEOPLE WITH LOW LEVEL OF INCOME. Infrequent purchases.
● CLUSTER 2 - PEOPLE WITH MEDIUM LEVEL OF INCOME. Highly frequent purchases.
● CLUSTER 3 - PEOPLE WITH HIGH LEVEL OF INCOME. Infrequent purchases and the
highest cash advance level.
● CLUSTER 4 - PEOPLE WITH LOW LEVEL OF INCOME. Frequent purchases.
