SQL Project
Management,
Vineet Khand, Gomti Nagar
Lucknow – 226010
Tick on appropriate
I hereby declare that this project/assignment does not contain any AI generated content
(e.g. ChatGPT etc.)
I hereby declare that this project/assignment has used AI generated content (e.g. ChatGPT
etc.) as supporting resource for the completion of project/assignment.
Contents
Introduction
Problem Statement
Data Description
Methodology
Data collection
Data Filtering with SQL
Data Visualization with Power BI
Analysis and Insights
1. Total Customers (Top Center)
2. Customers by Card Type (Top Left)
3. Customers by Card Type and Credit Score Range (Top Right)
4. Customers by Geography (Bottom Right)
5. Active vs. Inactive Members (Bottom Left)
6. Churn Rate by Age Group (Top Left)
7. Churn Rate by Number of Products (Top Right)
8. Churn Rate by Credit Score Range (Bottom)
9. Churn Rate by Geography (Top Left)
10. Churn Rate by Card Type (Top Right)
11. Relationship Between Complaints and Churn (Bottom Left)
12. Churn Rate by Customer Satisfaction Score (Bottom Right)
Recommendations
Conclusion
References
INTRODUCTION
Customer churn poses a significant challenge for the banking industry, directly impacting
profitability and growth. As financial institutions operate in a highly competitive
environment, where products and services often appear similar, retaining customers
becomes crucial to maintaining a stable revenue stream.
This project aims to analyse customer churn data to identify the key factors influencing
customer attrition and develop targeted retention strategies. The study emphasizes the
importance of understanding the various drivers behind customer churn, which can include
demographic characteristics, financial behaviours, product usage, and customer
satisfaction levels. To achieve this, the project leverages data analysis and supervised
learning techniques to predict which customers are at risk of leaving.
By applying models such as logistic regression and decision trees to historical data, the
analysis seeks to uncover patterns that can inform proactive measures for customer
retention. Additionally, the project explores how different attributes, such as credit scores,
geographic locations, and customer engagement levels, affect churn rates.
The dataset utilized in this analysis comprises 2,304 entries and 16 attributes,
encompassing a rich mix of demographic, behavioural, and financial information. This
comprehensive dataset allows for a nuanced examination of customer profiles and their
corresponding churn behaviour.
The insights generated from this analysis are intended to provide actionable
recommendations for banks, enabling them to tailor their retention strategies to address the
specific needs of high-risk customers effectively.
PROBLEM STATEMENT
This study emphasizes how critical it is to predict and mitigate customer churn, which
directly relates to the project's goal of analysing customer churn data to enhance
retention strategies.
The banking industry faces a critical challenge in retaining customers, as customer churn
can lead to significant revenue loss and increased costs for acquiring new customers. In a
highly competitive market, where financial institutions offer similar products, retaining
customers is essential for maintaining profitability and growth. Customer churn is often
driven by various factors, including demographic characteristics, financial behaviours,
product usage, and customer satisfaction levels.
This project aims to analyse customer churn data to identify key factors that influence
customer attrition. By understanding the reasons behind customer churn, banks can
develop targeted retention strategies that address specific pain points and reduce churn
rates. The primary objective is to leverage data analysis and supervised learning techniques
to predict which customers are at risk of leaving. This predictive capability allows banks
to take proactive measures to retain customers before they churn.
Supervised learning models, such as logistic regression and decision trees, will be applied
to historical data to identify patterns and predict future churn. Additionally, this analysis
will explore how different attributes, like credit scores, geographic locations, and customer
engagement levels, impact churn rates. By gaining a granular understanding of these
factors, banks can tailor their retention strategies to address the specific needs of high-risk
customers.
Ultimately, this project seeks to provide actionable insights that can be translated into
effective customer retention strategies. By reducing churn, the bank can protect its
revenue streams and strengthen its competitive position in the market. The insights
generated from this study will be critical in helping the bank enhance customer
satisfaction, loyalty, and long-term success.
DATA DESCRIPTION
The dataset consists of 2,304 entries and 16 columns. Below is a brief description of the
key columns:
Tenure: Number of years the customer has been with the bank (numerical).
NumOfProducts: Number of products the customer holds with the bank (numerical).
HasCrCard: Whether the customer has a credit card (binary: 1 = Yes, 0 = No).
Card Type: Type of card held by the customer (categorical: e.g., DIAMOND, GOLD).
This dataset provides a good mix of demographic, behavioural, and financial data for
customer churn analysis.
METHODOLOGY
Data collection
The research paper "Churning of Bank Customers Using Supervised Learning", obtained
from ResearchGate, provides additional context and insights for the customer churn
analysis project. The data collection for this study involved utilizing a comprehensive
dataset comprising 2,304 entries and 16 attributes. This dataset includes a mix of
demographic, behavioural, and financial data, such as customer surnames, credit scores,
geographic locations, age, account balance, and customer satisfaction scores. Key variables
in the dataset, such as whether a customer has exited (indicating churn), the number of
products held, and the presence of complaints, provide critical insights into customer
behaviour and potential churn factors.
1. Data Extraction: Utilize SQL queries to select relevant fields from the customer
database, including customer demographics, account details, churn status, and other
pertinent attributes.
4. Data Filtering: Apply conditions to filter out unnecessary data and focus on
specific customer segments.
5. Data Preparation: Ensure data is clean and formatted correctly for visualization,
handling null values, standardizing formats, and creating calculated fields.
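The extraction, filtering, and preparation steps listed above can be sketched as a single query. This is a minimal illustration rather than the project's actual query: it runs the SQL through Python's built-in `sqlite3` module against an in-memory table, and the table name `customers`, the sample rows, and every column name other than Tenure, NumOfProducts and HasCrCard (including `CardType` written without a space) are assumptions made for the example.

```python
import sqlite3

# Hypothetical sample rows (not taken from the project's dataset):
# CustomerId, Geography, Age, Tenure, NumOfProducts, HasCrCard, CardType, Exited
rows = [
    (1, "France", 42, 2, 1, 1, "DIAMOND", 1),
    (2, "Spain", 41, 1, 1, 1, "gold", 0),
    (3, "Germany", None, 8, 3, 1, "GOLD", 1),
    (4, "France", 39, 1, 2, 1, None, 0),
    (5, "Spain", 50, 4, 2, 0, "SILVER", 0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE customers (
           CustomerId INTEGER, Geography TEXT, Age INTEGER, Tenure INTEGER,
           NumOfProducts INTEGER, HasCrCard INTEGER, CardType TEXT, Exited INTEGER)"""
)
conn.executemany("INSERT INTO customers VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Extraction: select only the fields needed downstream.
# Filtering: keep credit-card holders and drop rows with a missing age.
# Preparation: replace a NULL card type, standardize its casing,
# and add a calculated ChurnStatus field.
query = """
    SELECT CustomerId,
           Geography,
           Age,
           UPPER(COALESCE(CardType, 'UNKNOWN')) AS CardType,
           CASE WHEN Exited = 1 THEN 'Churned' ELSE 'Retained' END AS ChurnStatus
    FROM customers
    WHERE HasCrCard = 1
      AND Age IS NOT NULL
    ORDER BY CustomerId
"""
prepared = conn.execute(query).fetchall()
for row in prepared:
    print(row)
```

Which rows to drop and which defaults to substitute for nulls would depend on the real data; the choices here are only placeholders.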
2. Total number of customers
Calculates the churn rate by geography, showing total customers, churned customers, and
churn rate percentage.
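A query of the kind described above might look like the following sketch. The original query is not reproduced in this text, so the table name `customers`, the column names `Geography` and `Exited`, and the sample rows are all assumptions.

```python
import sqlite3

# Hypothetical rows: (Geography, Exited), where Exited = 1 marks a churned customer.
rows = (
    [("France", 1)] * 2 + [("France", 0)] * 2
    + [("Germany", 1)] * 2 + [("Germany", 0)] * 3
    + [("Spain", 1)] * 1 + [("Spain", 0)] * 3
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (Geography TEXT, Exited INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)

# Total customers, churned customers, and churn rate (%) per geography.
# Multiplying by 100.0 forces floating-point division.
query = """
    SELECT Geography,
           COUNT(*) AS total_customers,
           SUM(Exited) AS churned_customers,
           ROUND(100.0 * SUM(Exited) / COUNT(*), 2) AS churn_rate_pct
    FROM customers
    GROUP BY Geography
    ORDER BY churn_rate_pct DESC, Geography
"""
churn_by_geo = conn.execute(query).fetchall()
for row in churn_by_geo:
    print(row)
```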
7. Number of Products Owned by Customers
Counts the number of customers based on the number of products they own.
Categorizes customers into age groups and calculates churn rates for each group.
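The age-group calculation can be illustrated with a `CASE` expression that assigns each customer to a bucket before aggregating. This is a sketch under assumptions: only the "under 30" boundary appears in this report, so the other bucket edges, the table name `customers`, the column names `Age` and `Exited`, and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical rows: (Age, Exited)
rows = [
    (22, 1), (25, 1), (28, 0), (29, 0),
    (31, 0), (33, 0), (35, 1), (38, 0),
    (45, 0), (47, 0),
    (55, 0), (62, 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (Age INTEGER, Exited INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)

# Bucket each customer by age, then compute the churn rate per bucket.
# Ordering by MIN(Age) keeps the buckets in ascending age order.
query = """
    SELECT CASE WHEN Age < 30 THEN 'Under 30'
                WHEN Age < 40 THEN '30-39'
                WHEN Age < 50 THEN '40-49'
                ELSE '50 and above' END AS age_group,
           COUNT(*) AS total_customers,
           SUM(Exited) AS churned_customers,
           ROUND(100.0 * SUM(Exited) / COUNT(*), 2) AS churn_rate_pct
    FROM customers
    GROUP BY age_group
    ORDER BY MIN(Age)
"""
churn_by_age = conn.execute(query).fetchall()
for row in churn_by_age:
    print(row)
```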
10. Churn Rate by Credit Score Range
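A churn-rate-by-credit-score-range query could be sketched as below. The ranges "Below 400", "600-699" and "700 and above" are named in this report; the intermediate "400-499" and "500-599" buckets, the table name `customers`, the column names `CreditScore` and `Exited`, and the sample rows are assumptions.

```python
import sqlite3

# Hypothetical rows: (CreditScore, Exited)
rows = [
    (350, 1),
    (450, 1), (480, 0),
    (550, 1), (580, 0),
    (620, 0), (650, 1), (680, 0), (690, 0),
    (720, 0), (750, 0), (800, 1), (810, 0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (CreditScore INTEGER, Exited INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)

# Assign each customer to a credit score range, then aggregate churn per range.
query = """
    SELECT CASE WHEN CreditScore < 400 THEN 'Below 400'
                WHEN CreditScore < 500 THEN '400-499'
                WHEN CreditScore < 600 THEN '500-599'
                WHEN CreditScore < 700 THEN '600-699'
                ELSE '700 and above' END AS score_range,
           COUNT(*) AS total_customers,
           SUM(Exited) AS churned_customers,
           ROUND(100.0 * SUM(Exited) / COUNT(*), 2) AS churn_rate_pct
    FROM customers
    GROUP BY score_range
    ORDER BY MIN(CreditScore)
"""
churn_by_score = conn.execute(query).fetchall()
for row in churn_by_score:
    print(row)
```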
Counts active and inactive members by geography.
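One way to count active and inactive members side by side is conditional aggregation, sketched here. The table name `customers`, the column names `Geography` and `IsActiveMember`, and the sample rows are assumptions, not the project's actual schema.

```python
import sqlite3

# Hypothetical rows: (Geography, IsActiveMember)
rows = [
    ("France", 1), ("France", 1), ("France", 0),
    ("Germany", 0), ("Germany", 1),
    ("Spain", 1), ("Spain", 0), ("Spain", 0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (Geography TEXT, IsActiveMember INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)

# Each SUM(CASE ...) counts only the rows matching its condition,
# giving one row per geography with both counts as columns.
query = """
    SELECT Geography,
           SUM(CASE WHEN IsActiveMember = 1 THEN 1 ELSE 0 END) AS active_members,
           SUM(CASE WHEN IsActiveMember = 0 THEN 1 ELSE 0 END) AS inactive_members
    FROM customers
    GROUP BY Geography
    ORDER BY Geography
"""
activity_by_geo = conn.execute(query).fetchall()
for row in activity_by_geo:
    print(row)
```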
Shows the distribution of customers based on the type of card they own.
15. Relationship Between Complaints and Churn
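The relationship between complaints and churn can be examined with a simple cross-tabulation, as in this sketch. The table name `customers`, the column names `Complain` and `Exited`, and the sample rows are assumptions for illustration only.

```python
import sqlite3

# Hypothetical rows: (Complain, Exited); 1 = filed a complaint / churned.
rows = [(0, 0)] * 4 + [(0, 1)] * 2 + [(1, 0), (1, 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (Complain INTEGER, Exited INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)

# Count customers in every (complaint, churn) combination.
query = """
    SELECT Complain, Exited, COUNT(*) AS customers
    FROM customers
    GROUP BY Complain, Exited
    ORDER BY Complain, Exited
"""
complaints_vs_churn = conn.execute(query).fetchall()
for row in complaints_vs_churn:
    print(row)
```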
Combines card types with credit score ranges to analyze the distribution of customers.
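Combining card type with credit score range amounts to grouping on two keys. The sketch below is only an illustration: the table name `customers`, the column names `CardType` and `CreditScore`, the "400-499" and "500-599" bucket edges, and the sample rows are assumptions.

```python
import sqlite3

# Hypothetical rows: (CardType, CreditScore)
rows = [
    ("DIAMOND", 720), ("DIAMOND", 650),
    ("GOLD", 710), ("GOLD", 730), ("GOLD", 580),
    ("PLATINUM", 690),
    ("SILVER", 390),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (CardType TEXT, CreditScore INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)

# Group by card type and by a derived credit score range at the same time.
query = """
    SELECT CardType,
           CASE WHEN CreditScore < 400 THEN 'Below 400'
                WHEN CreditScore < 500 THEN '400-499'
                WHEN CreditScore < 600 THEN '500-599'
                WHEN CreditScore < 700 THEN '600-699'
                ELSE '700 and above' END AS score_range,
           COUNT(*) AS customers
    FROM customers
    GROUP BY CardType, score_range
    ORDER BY CardType, MIN(CreditScore)
"""
card_by_score = conn.execute(query).fetchall()
for row in card_by_score:
    print(row)
```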
Data Visualization with Power BI
For data visualization, Power BI can be used to create the following interactive
visualizations:
2. Churn Rate by Geography: A map or bar chart depicting churn rates across
different regions.
5. Churn Rate by Number of Products: A line or bar chart showing how churn rates
vary based on the number of products customers own.
6. Churn Rate by Age Group: A bar chart visualizing churn rates for different age
groups.
7. Churn Rate by Credit Score Range: A bar chart showing churn rates across
different credit score ranges.
8. Churn Rate by Satisfaction Score: A bar chart illustrating the correlation between
satisfaction scores and churn rates.
10. Card Type Distribution: A pie or bar chart displaying the distribution of card
types among customers.
11. Churn Rate by Card Type: A bar chart comparing churn rates for different card
types.
12. Complaints and Churn: A bar chart showing the relationship between customer
complaints and churn rates.
ANALYSIS AND INSIGHTS
Dashboard PART 1
• Interpretation: This indicates that the Gold and Silver card types are the most
popular among customers, possibly due to benefits or accessibility associated with
these cards. Understanding why certain card types are more popular can help in
designing targeted marketing strategies.
o 600-699 and 700 and above: The most common credit score ranges across
all card types.
• Interpretation: Higher credit scores (600 and above) are more common among all
card types, especially for higher-end cards like Diamond and Platinum. This could
imply that customers with higher credit scores are more likely to qualify for and
obtain these cards. The bank might consider focusing on customers within these
higher credit score ranges for premium products.
The visualizations provide a clear overview of the customer distribution across various
dimensions. Key insights include the popularity of certain card types, the correlation
between credit score and card type, geographical distribution of customers, and the
balance between active and inactive members. These insights can guide the bank in making
data-driven decisions related to marketing strategies, customer engagement, and regional
focus.
Dashboard PART 2
• Interpretation: Younger customers (under 30) have a significantly higher churn
rate compared to older customers. This could indicate that younger customers are
either more price-sensitive, less satisfied with the bank's services, or have more
options available to them. The bank may need to focus on retention strategies
specifically targeted at younger customers, possibly by offering products that are
more appealing to this demographic or improving customer service.
• Interpretation: Customers with only one product are most likely to churn,
suggesting that single-product customers might not be as engaged or loyal to the
bank. In contrast, customers with multiple products have a much lower churn rate,
likely due to higher engagement and satisfaction. The bank should consider
strategies to cross-sell additional products to single-product customers, which could
help reduce the churn rate.
o 700 and above: Largest segment, though with a lower churn rate.
o Below 400: Very small but critical group with 100% churn rate.
• Interpretation: The highest churn is seen among customers with lower credit
scores (below 600), which could suggest that these customers are either less
satisfied or more likely to face financial difficulties that lead to churn. On the
other hand, customers with higher credit scores (600 and above) form the majority
but have lower churn rates, indicating they are more stable and satisfied. The bank
could benefit from focusing on improving services and support for lower credit
score customers to reduce churn in this segment.
The second set of visualizations provides a deeper understanding of the churn dynamics
across different customer segments. High churn rates among younger customers, those
with fewer products, and lower credit scores suggest areas where the bank can focus its
efforts to improve customer retention. Strategies such as targeted retention programs,
improved customer engagement for younger customers, and cross-selling opportunities for
single-product customers could help reduce churn rates.
Dashboard PART 3
o Spain: Lowest churn with 110 customers.
• Interpretation: France has the highest churn rate among the three geographies,
suggesting that customers in France might be less satisfied or have more
competitive banking options. Spain, on the other hand, has the lowest churn rate,
indicating higher customer retention. The bank might need to investigate the
reasons for higher churn in France, such as customer satisfaction, service issues,
or competitive pressures, and develop targeted strategies to improve retention in
that region.
o Diamond: Highest churn with 127 churned customers out of 579 total.
o Platinum: Lowest churn with 109 churned customers out of 555 total.
• Interpretation: The churn rate appears fairly consistent across different card
types, with Platinum customers having a slightly lower churn rate. This suggests
that the type of card may not be a significant factor in churn, but the slight
difference for Platinum cardholders might indicate higher satisfaction or better
services for those customers. The bank could explore if the benefits associated
with Platinum cards contribute to better retention and consider extending similar
benefits to other card types.
• Insight: The pie chart illustrates the relationship between customer complaints and
churn:
• Interpretation: Surprisingly, most churned customers did not file complaints,
suggesting that dissatisfaction may not always be formally expressed before
customers decide to leave. This indicates a potential gap in the bank's ability to
identify and address issues before customers churn. The bank should consider
implementing proactive measures to gauge customer satisfaction, such as surveys
or feedback mechanisms, to catch potential issues early.
• Insight: This donut chart shows the churn rate based on customer satisfaction
scores:
The third set of visualizations provides insight into how geography, card type, complaints,
and satisfaction scores relate to customer churn. France stands out as a region with higher
churn, and while card type does not show significant differences in churn, Platinum
cardholders fare slightly better. The relationship between complaints and churn suggests
that many dissatisfied customers do not formally express their concerns before leaving,
highlighting the need for more proactive customer engagement. Additionally, the high
churn among satisfied customers raises questions about the accuracy of satisfaction scores
or the presence of other contributing factors. These insights can guide the bank's efforts to
reduce churn and improve customer retention strategies.
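The card-type counts quoted above (127 churned out of 579 Diamond customers, 109 out of 555 Platinum customers) can be turned into churn rates with a short calculation. The helper name `churn_rate_pct` is introduced here only for illustration; the four counts are the ones given in this report.

```python
def churn_rate_pct(churned: int, total: int) -> float:
    """Return churned customers as a percentage of the total, rounded to 2 places."""
    return round(100 * churned / total, 2)


diamond = churn_rate_pct(127, 579)   # Diamond: 127 churned of 579
platinum = churn_rate_pct(109, 555)  # Platinum: 109 churned of 555
gap = round(diamond - platinum, 2)   # difference in percentage points

print(f"Diamond:  {diamond}%")
print(f"Platinum: {platinum}%")
print(f"Gap:      {gap} percentage points")
```

On these counts the two rates come out at roughly 21.9% and 19.6%, a gap of about 2.3 percentage points, which is consistent with the reading that card type makes only a small difference to churn.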
RECOMMENDATIONS
Based on the insights gained from the data analysis and visualization, the following
recommendations can be made to reduce customer churn:
3. Improve services and support for customers with lower credit scores:
Customers with credit scores below 600 have higher churn rates. Analyze the
specific challenges faced by this segment and implement measures to provide better
support, such as financial education, flexible payment options, or personalized
assistance.
4. Investigate reasons for higher churn in France: France has the highest churn
rate among the three geographies. Conduct further research to understand the
underlying causes, such as competitive pressures, customer satisfaction issues, or
service-related problems. Develop targeted strategies to address the unique
challenges in the French market.
CONCLUSION
By analysing customer churn data using SQL and Power BI, this project has provided
valuable insights into the factors influencing customer attrition in the banking industry.
The findings highlight the importance of targeted retention strategies, cross-selling
opportunities, and proactive customer engagement to reduce churn rates. By implementing
the recommended strategies, banks can enhance customer satisfaction, loyalty, and
long-term success.
REFERENCES
The research paper used: "Churning of Bank Customers Using Supervised Learning", ResearchGate.
https://www.researchgate.net/publication/340855263_Churning_of_Bank_Customers_Using_Supervised_Learning