SQL Project

Analysis done with SQL queries and visualization in Power BI
Jaipuria Institute of Management,
Vineet Khand, Gomti Nagar
Lucknow – 226010

Academic Year: 2024-25

Batch: 2023-25
Trimester: IV
Programme (PGDM / PGDM-FS / PGDM-RM): PGDM-FS
Name of Course: Business Intelligence and Decision Making (BIADM)
Section: A
Name of Faculty: Dr. Purnendu Shekhar Pandey
Topic of Individual Assignment / Project: Individual Assignment
Deadline for Submission: 14th August 2024
Maximum Marks Allotted: 15

Tick on appropriate

I hereby declare that this project/assignment does not contain any AI generated content
(e.g. ChatGPT etc.)

I hereby declare that this project/assignment has used AI generated content (e.g. ChatGPT
etc.) as supporting resource for the completion of project/assignment.

Name of the Student: Supriya Singh Signature of Student

Date of receiving at PMC: Signature of PMC Staff:


Penalty [Marks to be deducted (if any)]:

Contents
Introduction
Problem Statement
Data Description
Methodology
  Data collection
  Data Filtering with SQL
  Data Visualization with Power BI
Analysis and Insights
  1. Total Customers (Top Center)
  2. Customers by Card Type (Top Left)
  3. Customers by Card Type and Credit Score Range (Top Right)
  4. Customers by Geography (Bottom Right)
  5. Active vs. Inactive Members (Bottom Left)
  6. Churn Rate by Age Group (Top Left)
  7. Churn Rate by Number of Products (Top Right)
  8. Churn Rate by Credit Score Range (Bottom)
  9. Churn Rate by Geography (Top Left)
  10. Churn Rate by Card Type (Top Right)
  11. Relationship Between Complaints and Churn (Bottom Left)
  12. Churn Rate by Customer Satisfaction Score (Bottom Right)
Recommendations
Conclusion
References

INTRODUCTION

Customer churn poses a significant challenge for the banking industry, directly impacting
profitability and growth. As financial institutions operate in a highly competitive
environment, where products and services often appear similar, retaining customers
becomes crucial to maintaining a stable revenue stream.

This project aims to analyse customer churn data to identify the key factors influencing
customer attrition and develop targeted retention strategies. The study emphasizes the
importance of understanding the various drivers behind customer churn, which can include
demographic characteristics, financial behaviours, product usage, and customer
satisfaction levels. To achieve this, the project leverages data analysis and supervised
learning techniques to predict which customers are at risk of leaving.

By applying models such as logistic regression and decision trees to historical data, the
analysis seeks to uncover patterns that can inform proactive measures for customer
retention. Additionally, the project explores how different attributes, such as credit scores,
geographic locations, and customer engagement levels, affect churn rates.

The dataset utilized in this analysis comprises 2,304 entries and 16 attributes,
encompassing a rich mix of demographic, behavioural, and financial information. This
comprehensive dataset allows for a nuanced examination of customer profiles and their
corresponding churn behaviour.

The insights generated from this analysis are intended to provide actionable
recommendations for banks, enabling them to tailor their retention strategies to address the
specific needs of high-risk customers effectively.

Ultimately, this project aspires to contribute to the broader understanding of customer
churn in the banking sector, offering valuable insights that can enhance customer
satisfaction, loyalty, and long-term success. By reducing churn rates, banks can not only
protect their revenue streams but also strengthen their competitive position in the market,
ensuring sustainable growth in an increasingly challenging landscape.

PROBLEM STATEMENT

This study emphasizes how critical it is to predict and mitigate customer churn, which
directly relates to the project's goal of analyzing customer churn data to enhance retention
strategies.

The banking industry faces a critical challenge in retaining customers, as customer churn
can lead to significant revenue loss and increased costs for acquiring new customers. In a
highly competitive market, where financial institutions offer similar products, retaining
customers is essential for maintaining profitability and growth. Customer churn is often
driven by various factors, including demographic characteristics, financial behaviors,
product usage, and customer satisfaction levels.

This project aims to analyze customer churn data to identify key factors that influence
customer attrition. By understanding the reasons behind customer churn, banks can
develop targeted retention strategies that address specific pain points and reduce churn
rates. The primary objective is to leverage data analysis and supervised learning techniques
to predict which customers are at risk of leaving. This predictive capability allows banks
to take proactive measures to retain customers before they churn.

Supervised learning models, such as logistic regression and decision trees, will be applied
to historical data to identify patterns and predict future churn. Additionally, this analysis
will explore how different attributes, like credit scores, geographic locations, and customer
engagement levels, impact churn rates. By gaining a granular understanding of these
factors, banks can tailor their retention strategies to address the specific needs of high-risk
customers.

Ultimately, this project seeks to provide actionable insights that can be translated into
effective customer retention strategies. By reducing churn, the bank can protect its
revenue streams and strengthen its competitive position in the market. The insights
generated from this study will be critical in helping the bank enhance customer
satisfaction, loyalty, and long-term success.

DATA DESCRIPTION
The dataset consists of 2,304 entries and 16 columns. Below is a brief description of the
key columns:

Surname: Customer's last name (non-predictive, categorical).

CreditScore: Credit score of the customer (numerical).

Geography: Country where the customer resides (categorical).

Gender: Gender of the customer (categorical).

Age: Age of the customer (numerical).

Tenure: Number of years the customer has been with the bank (numerical).

Balance: Account balance of the customer (numerical).

NumOfProducts: Number of products the customer holds with the bank (numerical).

HasCrCard: Whether the customer has a credit card (binary: 1 = Yes, 0 = No).

IsActiveMember: Whether the customer is an active member (binary: 1 = Yes, 0 = No).

EstimatedSalary: Estimated salary of the customer (numerical).

Exited: Whether the customer has churned (binary: 1 = Yes, 0 = No).

Complain: Whether the customer has filed a complaint (binary).

Satisfaction Score: Customer's satisfaction score (numerical).

Card Type: Type of card held by the customer (categorical: e.g., DIAMOND, GOLD).

Point Earned: Reward points earned by the customer (numerical).

This dataset provides a good mix of demographic, behavioral, and financial data for
customer churn analysis.

METHODOLOGY
Data collection
The research paper "Churning of Bank Customers Using Supervised Learning", obtained
from ResearchGate, provides additional context and insights for the customer churn
analysis project. The data collection for this study involved utilizing a comprehensive
dataset comprising 2,304 entries and 16 attributes. This dataset includes a mix of
demographic, behavioural, and financial data, such as customer surnames, credit scores,
geographic locations, age, account balance, and customer satisfaction scores. Key variables
in the dataset, such as whether a customer has exited (indicating churn), the number of
products held, and the presence of complaints, provide critical insights into customer
behaviour and potential churn factors.

Data Filtering with SQL


SQL queries are used to filter, aggregate, and analyze the data. This step helps in
extracting relevant insights, such as customer demographics, churn rates, and relationships
between different variables. Key steps include:

1. Data Extraction: Utilize SQL queries to select relevant fields from the customer
database, including customer demographics, account details, churn status, and other
pertinent attributes.

2. Data Aggregation: Perform aggregation to summarize data based on specific
dimensions such as geography, age, credit score, and product ownership.

3. Churn Analysis: Calculate churn rates across different segments to identify
patterns and potential risk factors associated with customer attrition.

4. Data Filtering: Apply conditions to filter out unnecessary data and focus on
specific customer segments.

5. Data Preparation: Ensure data is clean and formatted correctly for visualization,
handling null values, standardizing formats, and creating calculated fields.
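The data preparation step can be illustrated with a minimal sketch that runs SQL through Python's built-in sqlite3 module. The table name customer_churn, the made-up rows, and the derived BalanceBand field are assumptions for illustration only; they are not taken from the project's database.

```python
import sqlite3

# In-memory database with a hypothetical table and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_churn "
    "(Surname TEXT, Geography TEXT, Balance REAL, Exited INTEGER)"
)
conn.executemany(
    "INSERT INTO customer_churn VALUES (?, ?, ?, ?)",
    [
        ("Rossi", " france ", 1200.0, 0),   # inconsistent spacing and case
        ("Weber", "Germany", None, 1),      # missing balance
        ("Garcia", "SPAIN", 0.0, 0),
    ],
)

# Standardize Geography, replace NULL balances with 0,
# and add a calculated field that labels each balance.
prepared = conn.execute(
    """
    SELECT Surname,
           UPPER(TRIM(Geography)) AS Geography,
           COALESCE(Balance, 0) AS Balance,
           CASE WHEN COALESCE(Balance, 0) > 0 THEN 'Has balance'
                ELSE 'Zero balance' END AS BalanceBand
    FROM customer_churn
    ORDER BY Surname
    """
).fetchall()

for row in prepared:
    print(row)
```

Whether a missing balance should really be treated as zero depends on the data source, so that choice is an assumption of the sketch.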

1. Retrieve all records

Retrieves all records from the customer 1 table.

2. Total number of customers

Counts the total number of customers in the dataset.

3. Number of Customers by Geography

Counts the number of customers by geography (country).

4. Churn Rate by Geography

Calculates the churn rate by geography, showing total customers, churned customers, and
churn rate percentage.
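A query of this kind might look like the following sketch, run here through Python's built-in sqlite3 module. The table name customer_churn and the sample rows are made up for illustration; only the Geography and Exited column names come from the data description.

```python
import sqlite3

# In-memory database with a hypothetical table and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_churn (Geography TEXT, Exited INTEGER)")
conn.executemany(
    "INSERT INTO customer_churn VALUES (?, ?)",
    [
        ("France", 1), ("France", 0), ("France", 0), ("France", 0),
        ("Germany", 1), ("Germany", 0),
        ("Spain", 0), ("Spain", 0),
    ],
)

# Exited is 1 for churned customers, so SUM(Exited) counts them.
# Multiplying by 100.0 forces floating-point division.
churn_by_geography = conn.execute(
    """
    SELECT Geography,
           COUNT(*) AS total_customers,
           SUM(Exited) AS churned_customers,
           ROUND(100.0 * SUM(Exited) / COUNT(*), 2) AS churn_rate_pct
    FROM customer_churn
    GROUP BY Geography
    ORDER BY Geography
    """
).fetchall()

for row in churn_by_geography:
    print(row)
```

Reporting the total and churned counts next to the percentage makes it easier to tell a high churn rate apart from a large customer base.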

5. Distribution of Credit Scores

Displays the distribution of credit scores among customers.

6. Average Balance by Geography

Calculates the average account balance of customers by geography.
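As a rough sketch of the average balance calculation, the following uses Python's built-in sqlite3 module. The table name customer_churn and the balances are made up for illustration.

```python
import sqlite3

# In-memory database with a hypothetical table and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_churn (Geography TEXT, Balance REAL)")
conn.executemany(
    "INSERT INTO customer_churn VALUES (?, ?)",
    [
        ("France", 100.0), ("France", 300.0),
        ("Germany", 0.0), ("Germany", 500.0),
        ("Spain", 50.0),
    ],
)

# AVG ignores NULL balances; ROUND keeps the output readable.
avg_balance_by_geography = conn.execute(
    """
    SELECT Geography,
           ROUND(AVG(Balance), 2) AS average_balance
    FROM customer_churn
    GROUP BY Geography
    ORDER BY Geography
    """
).fetchall()

for row in avg_balance_by_geography:
    print(row)
```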

7. Number of Products Owned by Customers

Counts the number of customers based on the number of products they own.

8. Churn Rate by Number of Products

Calculates the churn rate by the number of products owned by customers.

9. Churn Rate by Age Group

Categorizes customers into age groups and calculates churn rates for each group.
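The age grouping can be sketched with a CASE expression, shown here through Python's built-in sqlite3 module. The group boundaries follow the age groups used in the dashboard; the table name customer_churn and the rows are made up for illustration.

```python
import sqlite3

# In-memory database with a hypothetical table and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_churn (Age INTEGER, Exited INTEGER)")
conn.executemany(
    "INSERT INTO customer_churn VALUES (?, ?)",
    [(18, 1), (25, 1), (27, 0), (35, 0), (35, 1), (45, 0), (52, 0), (67, 0)],
)

# CASE assigns each customer to the first matching age group;
# ORDER BY MIN(Age) lists the groups from youngest to oldest.
churn_by_age_group = conn.execute(
    """
    SELECT CASE
             WHEN Age < 20 THEN 'Below 20'
             WHEN Age < 30 THEN '20-29'
             WHEN Age < 40 THEN '30-39'
             WHEN Age < 50 THEN '40-49'
             WHEN Age < 60 THEN '50-59'
             ELSE '60 and above'
           END AS age_group,
           COUNT(*) AS total_customers,
           SUM(Exited) AS churned_customers,
           ROUND(100.0 * SUM(Exited) / COUNT(*), 2) AS churn_rate_pct
    FROM customer_churn
    GROUP BY age_group
    ORDER BY MIN(Age)
    """
).fetchall()

for row in churn_by_age_group:
    print(row)
```

SQLite accepts the age_group alias in GROUP BY; other database engines may require repeating the CASE expression there.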

8
10. Churn Rate by Credit , Score Range

Groups customers by credit score , ranges and calculates churn rates. .

11. Churn Rate by Customer , Satisfaction Score

Analyzes churn rates by customer , satisfaction scores.

12. Number of Active Members , by Geography

9
Counts active and inactive members , by geography.
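One way to count active and inactive members side by side is conditional aggregation, sketched below with Python's built-in sqlite3 module. The table name customer_churn and the rows are made up for illustration.

```python
import sqlite3

# In-memory database with a hypothetical table and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_churn (Geography TEXT, IsActiveMember INTEGER)"
)
conn.executemany(
    "INSERT INTO customer_churn VALUES (?, ?)",
    [
        ("France", 1), ("France", 1), ("France", 0),
        ("Germany", 0),
        ("Spain", 1), ("Spain", 0),
    ],
)

# Each SUM(CASE ...) counts only the rows matching its condition,
# giving one row per geography with both counts as columns.
members_by_geography = conn.execute(
    """
    SELECT Geography,
           SUM(CASE WHEN IsActiveMember = 1 THEN 1 ELSE 0 END) AS active_members,
           SUM(CASE WHEN IsActiveMember = 0 THEN 1 ELSE 0 END) AS inactive_members
    FROM customer_churn
    GROUP BY Geography
    ORDER BY Geography
    """
).fetchall()

for row in members_by_geography:
    print(row)
```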

13. Distribution of Card Types

Shows the distribution of customers based on the type of card they own.

14. Churn Rate by Card Type

Calculates churn rates by card type.

15. Relationship Between Complaints and Churn

Examines the relationship between customer complaints and churn rates.
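A simple way to examine this relationship is to cross-tabulate the two binary columns, as in the sketch below using Python's built-in sqlite3 module. The table name customer_churn and the rows are made up for illustration.

```python
import sqlite3

# In-memory database with a hypothetical table and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_churn (Complain INTEGER, Exited INTEGER)")
conn.executemany(
    "INSERT INTO customer_churn VALUES (?, ?)",
    [(0, 0), (0, 0), (0, 0), (0, 1), (1, 1), (1, 1)],
)

# One row per (Complain, Exited) combination that occurs in the data.
complaints_vs_churn = conn.execute(
    """
    SELECT Complain,
           Exited,
           COUNT(*) AS customers
    FROM customer_churn
    GROUP BY Complain, Exited
    ORDER BY Complain, Exited
    """
).fetchall()

for row in complaints_vs_churn:
    print(row)
```

Combinations with no customers, such as complainants who stayed in this toy data, do not appear in the output.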

16. Customers by Card Type and Credit Score Range

Combines card types with credit score ranges to analyze the distribution of customers.
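The combination of card type and credit score range could be queried along the lines of the sketch below, using Python's built-in sqlite3 module. The score ranges follow those shown in the dashboard; the table name customer_churn and the rows are made up for illustration.

```python
import sqlite3

# In-memory database with a hypothetical table and made-up rows.
# "Card Type" contains a space, so it is quoted as an identifier.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE customer_churn ("Card Type" TEXT, CreditScore INTEGER)'
)
conn.executemany(
    "INSERT INTO customer_churn VALUES (?, ?)",
    [
        ("GOLD", 650), ("GOLD", 680), ("GOLD", 720),
        ("DIAMOND", 380), ("DIAMOND", 710),
        ("SILVER", 550),
    ],
)

# Bucket credit scores with CASE, then count customers
# for every card type and score range that occurs.
card_by_score_range = conn.execute(
    """
    SELECT "Card Type" AS card_type,
           CASE
             WHEN CreditScore < 400 THEN 'Below 400'
             WHEN CreditScore < 500 THEN '400-499'
             WHEN CreditScore < 600 THEN '500-599'
             WHEN CreditScore < 700 THEN '600-699'
             ELSE '700 and above'
           END AS credit_score_range,
           COUNT(*) AS customers
    FROM customer_churn
    GROUP BY card_type, credit_score_range
    ORDER BY card_type, MIN(CreditScore)
    """
).fetchall()

for row in card_by_score_range:
    print(row)
```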

Data Visualization with Power BI
For data visualization, Power BI can be used to create the following interactive
visualizations:

1. Customer Distribution by Geography: A bar or pie chart showing the number of
customers by geography.

2. Churn Rate by Geography: A map or bar chart depicting churn rates across
different regions.

3. Distribution of Credit Scores: A histogram displaying the distribution of credit
scores among customers.

4. Average Balance by Geography: A bar chart comparing the average account
balance across different geographies.

5. Churn Rate by Number of Products: A line or bar chart showing how churn rates
vary based on the number of products customers own.

6. Churn Rate by Age Group: A bar chart visualizing churn rates for different age
groups.

7. Churn Rate by Credit Score Range: A bar chart showing churn rates across
different credit score ranges.

8. Churn Rate by Satisfaction Score: A bar chart illustrating the correlation between
satisfaction scores and churn rates.

9. Active vs. Inactive Members by Geography: A stacked bar chart comparing
active and inactive members across different regions.

10. Card Type Distribution: A pie or bar chart displaying the distribution of card
types among customers.

11. Churn Rate by Card Type: A bar chart comparing churn rates for different card
types.

12. Complaints and Churn: A bar chart showing the relationship between customer
complaints and churn rates.

ANALYSIS AND INSIGHTS
Dashboard PART 1

1. Total Customers (Top Center)

• Insight: The chart shows that there are a total of 2,304 customers in the dataset.
This is the base number for understanding customer distribution across different
dimensions.

• Interpretation: This number provides a reference point for analyzing the
distribution of customers by card type, geography, and activity status. It is essential
for calculating percentages or rates in other analyses.

2. Customers by Card Type (Top Left)


• Insight: The bar chart shows the distribution of customers across four card types:
Gold, Silver, Diamond, and Platinum. The distribution of customers across different
card types is relatively balanced, with GOLD cards holding the highest number of
customers (586), followed by SILVER (584), DIAMOND (579), and PLATINUM
(555).

• Interpretation: This indicates that the Gold and Silver card types are marginally the
most popular among customers, possibly due to benefits or accessibility associated with
these cards. Understanding why certain card types are more popular can help in
designing targeted marketing strategies.

3. Customers by Card Type and Credit Score Range (Top Right)


• Insight: This clustered bar chart breaks down the number of customers for each
card type by different credit score ranges:

  o 600-699 and 700 and above: The most common credit score ranges across
  all card types.

  o Below 400 and 400-499: Least common credit score ranges.

• Interpretation: Higher credit scores (600 and above) are more common among all
card types, especially for higher-end cards like Diamond and Platinum. This could
imply that customers with higher credit scores are more likely to qualify for and
obtain these cards. The bank might consider focusing on customers within these
higher credit score ranges for premium products.

4. Customers by Geography (Bottom Right)


• Insight: France has the highest number of customers (1119), followed by Spain
(665) and Germany (580).

• Interpretation: France has a dominant share of the bank's customers, followed by
Spain and Germany. This geographical concentration could suggest a strong market
presence or tailored services in France. The bank may explore reasons behind the
higher customer base in France and apply successful strategies to other regions.

5. Active vs. Inactive Members (Bottom Left)

• Insight: The donut chart shows a nearly equal split between active (51.43%) and
inactive members (48.57%).

• Interpretation: A high proportion of inactive members might be a concern for
customer retention. The bank could investigate the reasons for inactivity and
develop engagement strategies to convert inactive members into active ones. Since
the split is almost even, focusing on re-engaging inactive customers could
significantly impact overall customer activity levels.

The visualizations provide a clear overview of the customer distribution across various
dimensions. Key insights include the popularity of certain card types, the correlation
between credit score and card type, geographical distribution of customers, and the
balance between active and inactive members. These insights can guide the bank in making
data-driven decisions related to marketing strategies, customer engagement, and regional
focus.

Dashboard PART 2

6. Churn Rate by Age Group (Top Left)


• Insight: This horizontal bar chart illustrates the churn rate across different age
groups:

  o 20-29: Highest churn rate at 92.59%.

  o Below 20: High churn rate of 78.12%.

  o 30-39: Significantly lower churn rate at 33.67%.

  o 40-49: Lower churn rate at 14.04%.

  o 50-59: Low churn rate at 8.93%.

  o 60 and above: Lowest churn rate at 7.41%.

• Interpretation: Younger customers (under 30) have a significantly higher churn
rate compared to older customers. This could indicate that younger customers are
either more price-sensitive, less satisfied with the bank's services, or have more
options available to them. The bank may need to focus on retention strategies
specifically targeted at younger customers, possibly by offering products that are
more appealing to this demographic or improving customer service.

7. Churn Rate by Number of Products (Top Right)

• Insight: The pie chart shows how churned customers are distributed by the number of
products a customer has:

  o 1 Product: Largest segment with 76.32% of churned customers.

  o 2 Products: Accounts for 16.03% of churned customers.

  o 3 Products: A smaller share of 7.32%.

  o 4 Products: The smallest segment at 0.33%.

• Interpretation: Customers with only one product account for the large majority of
churned customers, suggesting that single-product customers might not be as
engaged or loyal to the bank. In contrast, customers with multiple products make up a
much smaller share of churn, likely due to higher engagement and satisfaction. The bank
should consider strategies to cross-sell additional products to single-product customers,
which could help reduce the churn rate.

8. Churn Rate by Credit Score Range (Bottom)

• Insight: The treemap shows churn distribution across different credit score ranges:

  o 700 and above: Largest segment, though with a lower churn rate.

  o 600-699: Also a significant segment with a notable churn rate.

  o 500-599: Smaller but still notable churn rate.

  o 400-499: Very small segment with higher churn rates.

  o Below 400: Very small but critical group with 100% churn rate.

• Interpretation: The highest churn is seen among customers with lower credit
scores (below 600), which could suggest that these customers are either less
satisfied or more likely to face financial difficulties that lead to churn. On the
other hand, customers with higher credit scores (600 and above) form the majority
but have lower churn rates, indicating they are more stable and satisfied. The bank
could benefit from focusing on improving services and support for lower credit
score customers to reduce churn in this segment.

The second set of visualizations provides a deeper understanding of the churn dynamics
across different customer segments. High churn rates among younger customers, those
with fewer products, and lower credit scores suggest areas where the bank can focus its
efforts to improve customer retention. Strategies such as targeted retention programs,
improved customer engagement for younger customers, and cross-selling opportunities for
single-product customers could help reduce churn rates.

Dashboard PART 3

9. Churn Rate by Geography (Top Left)

• Insight: This horizontal bar chart highlights churn across different geographies,
shown as the number of churned customers:

  o France: The highest churn with 201 customers.

  o Germany: Moderate churn with 173 customers.

  o Spain: Lowest churn with 110 customers.

• Interpretation: France has the highest number of churned customers among the
three geographies, which partly reflects its larger customer base. These figures are
counts rather than rates: set against the customer numbers reported in Dashboard
Part 1, Germany's 173 churned customers out of about 580 represent a noticeably
larger share than France's 201 out of about 1,119. Spain, on the other hand, has the
lowest churn, indicating higher customer retention. The bank might need to investigate
the reasons for churn in France and Germany, such as customer satisfaction, service
issues, or competitive pressures, and develop targeted strategies to improve retention
in those regions.

10. Churn Rate by Card Type (Top Right)

• Insight: The stacked bar chart displays churn rates based on the type of card
customers hold:

  o Diamond: Highest churn with 127 churned customers out of 579 total.

  o Gold: 125 churned customers out of 586 total.

  o Silver: 123 churned customers out of 584 total.

  o Platinum: Lowest churn with 109 churned customers out of 555 total.

• Interpretation: The churn rate appears fairly consistent across different card
types, with Platinum customers having a slightly lower churn rate. This suggests
that the type of card may not be a significant factor in churn, but the slight
difference for Platinum cardholders might indicate higher satisfaction or better
services for those customers. The bank could explore if the benefits associated
with Platinum cards contribute to better retention and consider extending similar
benefits to other card types.
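The churned and total counts above can be converted into churn rates with a short calculation. The sketch below, written in Python, uses only the counts reported for each card type; the variable names are illustrative.

```python
# Churned and total customers per card type, as reported above.
card_counts = {
    "Diamond": (127, 579),
    "Gold": (125, 586),
    "Silver": (123, 584),
    "Platinum": (109, 555),
}

# Churn rate (%) = churned customers / total customers * 100.
churn_rate_pct = {
    card: round(100 * churned / total, 2)
    for card, (churned, total) in card_counts.items()
}

# Overall churn rate across all card types.
total_churned = sum(churned for churned, _ in card_counts.values())
total_customers = sum(total for _, total in card_counts.values())
overall_rate_pct = round(100 * total_churned / total_customers, 2)

for card, rate in churn_rate_pct.items():
    print(f"{card}: {rate}%")
print(f"Overall: {total_churned} of {total_customers} = {overall_rate_pct}%")
```

The rates come out at about 21.93% for Diamond, 21.33% for Gold, 21.06% for Silver, and 19.64% for Platinum, against an overall rate of about 21.01% (484 of 2,304 customers), which supports the reading that card type makes only a small difference.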

11. Relationship Between Complaints and Churn (Bottom Left)

• Insight: The pie chart illustrates the relationship between customer complaints and
churn:

  o Customers with Complaints (3.42%): A small segment of churned
  customers made complaints.

  o Customers without Complaints (79.79%): The majority of churned
  customers did not make complaints.

• Interpretation: Surprisingly, most churned customers did not file complaints,
suggesting that dissatisfaction may not always be formally expressed before
customers decide to leave. This indicates a potential gap in the bank's ability to
identify and address issues before customers churn. The bank should consider
implementing proactive measures to gauge customer satisfaction, such as surveys
or feedback mechanisms, to catch potential issues early.

12. Churn Rate by Customer Satisfaction Score (Bottom Right)

• Insight: This donut chart shows the churn rate based on customer satisfaction
scores:

  o Low Satisfaction (3.61%): A small segment of churned customers had low
  satisfaction scores.

  o High Satisfaction (79.23%): Most churned customers had high satisfaction
  scores.

• Interpretation: It's concerning that a large proportion of churned customers had
high satisfaction scores, which contradicts the expectation that higher satisfaction
should correlate with lower churn. This might indicate that satisfaction scores alone
are not sufficient to predict churn, or that other factors are at play. The bank
should analyze whether these satisfaction scores are accurate and consider
additional factors such as service usage patterns or external influences that might
contribute to churn despite high satisfaction.

The third set of visualizations provides insight into how geography, card type, complaints,
and satisfaction scores relate to customer churn. France stands out as the region with the
most churned customers, and while card type does not show significant differences in churn,
Platinum cardholders fare slightly better. The relationship between complaints and churn
suggests that many dissatisfied customers do not formally express their concerns before
leaving, highlighting the need for more proactive customer engagement. Additionally, the
high churn among satisfied customers raises questions about the accuracy of satisfaction
scores or the presence of other contributing factors. These insights can guide the bank's
efforts to reduce churn and improve customer retention strategies.

RECOMMENDATIONS
Based on the insights gained from the data analysis and visualization, the following
recommendations can be made to reduce customer churn:

1. Develop targeted retention strategies for younger customers: Implement
programs and offers specifically tailored to younger customers (under 30) to
address their unique needs and preferences, as this segment has the highest churn
rates.

2. Focus on cross-selling and bundling products: Encourage single-product
customers to acquire additional products, as customers with multiple products
account for a much smaller share of churn. Offer attractive bundles or discounts to
incentivize customers to expand their product portfolio.

3. Improve services and support for customers with lower credit scores:
Customers with credit scores below 600 have higher churn rates. Analyze the
specific challenges faced by this segment and implement measures to provide better
support, such as financial education, flexible payment options, or personalized
assistance.

4. Investigate reasons for higher churn in France: France has the highest number of
churned customers among the three geographies. Conduct further research to
understand the underlying causes, such as competitive pressures, customer
satisfaction issues, or service-related problems. Develop targeted strategies to
address the unique challenges in the French market.

5. Implement proactive customer engagement measures: Many dissatisfied
customers do not formally express their concerns before leaving. Introduce
proactive measures to gauge customer satisfaction, such as regular surveys,
feedback mechanisms, or personalized outreach, to identify and address issues
early.

CONCLUSION
By analysing customer churn data using SQL and Power BI, this project has provided
valuable insights into the factors influencing customer attrition in the banking industry.
The findings highlight the importance of targeted retention strategies, cross-selling
opportunities, and proactive customer engagement to reduce churn rates. By implementing
the recommended strategies, banks can enhance customer satisfaction, loyalty, and
long-term success.

REFERENCES
The research paper used: "Churning of Bank Customers Using Supervised Learning", ResearchGate,
https://www.researchgate.net/publication/340855263_Churning_of_Bank_Customers_Using_Supervised_Learning
