Bank Customer Churn Analysis - Jupyter Notebook

The document is a Jupyter notebook analyzing customer churn for a bank. It imports necessary libraries, reads in a dataset on bank customers, and drops unnecessary columns. It then performs exploratory data analysis on the data, including plotting pie charts of categorical variables, bar plots of churn rates by geography and gender, and counts of churned vs not churned customers. It also encodes categorical variables as numeric and checks for null values before assigning variables as dependent (churn) and independent for modeling.

Uploaded by

akash.050501


11/12/2023, 10:49 Bank Customer Churn Analysis - Jupyter Notebook

Importing necessary libraries

In [1]: import pandas as pd
        import numpy as np
        import matplotlib.pyplot as plt
        import seaborn as sns

Read dataset
In [2]: df = pd.read_csv('Bank Churn_Modelling.csv')
        df.head()

Out[2]:
   RowNumber  CustomerId   Surname  CreditScore Geography  Gender  Age  Tenure    Balance  ...
0          1    15634602  Hargrave          619    France  Female   42       2       0.00  ...
1          2    15647311      Hill          608     Spain  Female   41       1   83807.86  ...
2          3    15619304      Onio          502    France  Female   42       8  159660.80  ...
3          4    15701354      Boni          699    France  Female   39       1       0.00  ...
4          5    15737888  Mitchell          850     Spain  Female   43       2  125510.82  ...

In [3]: df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype
---  ------           --------------  -----
 0   RowNumber        10000 non-null  int64
 1   CustomerId       10000 non-null  int64
 2   Surname          10000 non-null  object
 3   CreditScore      10000 non-null  int64
 4   Geography        10000 non-null  object
 5   Gender           10000 non-null  object
 6   Age              10000 non-null  int64
 7   Tenure           10000 non-null  int64
 8   Balance          10000 non-null  float64
 9   NumOfProducts    10000 non-null  int64
 10  HasCrCard        10000 non-null  int64
 11  IsActiveMember   10000 non-null  int64
 12  EstimatedSalary  10000 non-null  float64
 13  Exited           10000 non-null  int64
dtypes: float64(2), int64(9), object(3)
memory usage: 1.1+ MB

Dropping unwanted columns

localhost:8888/notebooks/KINGS LABS/4. Customer churn analysis/Bank Customer Churn Analysis.ipynb 1/11

In [4]: unnecessary_cols = ['RowNumber', 'CustomerId', 'Surname']
        # Pass the column names directly instead of a DataFrame slice
        df = df.drop(columns=unnecessary_cols)

EDA

In [5]: for column in df.columns:
            unique_values = df[column].value_counts()
            if df[column].nunique() < 6:
                plt.figure()
                plt.pie(unique_values, labels=unique_values.index, autopct='%1.
                plt.title(f'Distribution of {column}')
                plt.axis('equal')

        # Display all the pie charts
        plt.show()
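
The pie-chart cell is cut off at the autopct argument in this export, so the exact label format is not recoverable. As a rough sketch of the selection rule only, the helper below computes the percentage shares that such a pie chart would display for every column with fewer than six distinct values. The function name `low_cardinality_shares` and the toy frame are hypothetical, not part of the notebook.

```python
import pandas as pd

def low_cardinality_shares(frame, max_unique=6):
    """Return {column: percentage share per value} for columns with
    fewer than `max_unique` distinct values (same filter as the cell above)."""
    shares = {}
    for column in frame.columns:
        if frame[column].nunique() < max_unique:
            # normalize=True gives fractions; multiply by 100 for percent
            shares[column] = frame[column].value_counts(normalize=True) * 100
    return shares

# Toy data (made up), standing in for the bank dataset
toy = pd.DataFrame({
    "Geography": ["France", "France", "France", "France",
                  "Germany", "Germany", "Spain", "Spain"],
    "Exited": [0, 0, 0, 1, 1, 0, 0, 1],
    "Age": [42, 41, 39, 43, 44, 50, 29, 27],  # 8 distinct values, so skipped
})

shares = low_cardinality_shares(toy)
for column, share in shares.items():
    print(column)
    print(share.round(2))
```

Continuous columns such as Age are skipped automatically because they have too many distinct values to be readable as pie slices.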

In [6]: # Churn rate by geography and gender

        plt.figure(figsize=(10, 6))

        churn_rate_geo_gender = df.groupby(['Geography','Gender'])['Exited'].me
        sns.barplot(data=churn_rate_geo_gender, x= 'Geography', y= 'Churn Rate'
        plt.xlabel('Geography')
        plt.ylabel('Churn Rate')
        plt.title('Churn Rate by Geography & Gender')
        plt.show()
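
Two lines of the cell above are truncated in this export (after `.me` and after `y= 'Churn Rate'`). One plausible reading, offered only as an assumption, is that the group mean of `Exited` is taken and the result is reset into a column named `Churn Rate`, since the bar plot refers to that column name. A minimal sketch on made-up rows:

```python
import pandas as pd

# Toy rows standing in for the bank data (values are made up)
toy = pd.DataFrame({
    "Geography": ["France", "France", "France",
                  "Germany", "Germany", "Germany", "Germany"],
    "Gender": ["Female", "Female", "Male",
               "Female", "Female", "Male", "Male"],
    "Exited": [1, 0, 0, 1, 1, 0, 1],
})

# Exited is 0/1, so its mean per group is the churn rate of that group.
# Assumption: the truncated call was .mean() followed by a reset_index
# that names the value column 'Churn Rate'.
churn_rate_geo_gender = (
    toy.groupby(["Geography", "Gender"])["Exited"]
       .mean()
       .reset_index(name="Churn Rate")
)
print(churn_rate_geo_gender)
```

A frame in this long format can be passed to `sns.barplot(data=churn_rate_geo_gender, x='Geography', y='Churn Rate', hue='Gender')`; the `hue='Gender'` argument is likewise a guess at the truncated part of the call.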

In [7]: churn_counts = df['Exited'].value_counts()

        colors = ['#7B68EE', '#483D8B']
        plt.figure(figsize=(8, 6))
        plt.bar(churn_counts.index, churn_counts.values, color=colors)
        plt.xlabel('Churn (Exited)')
        plt.ylabel('Count')
        plt.xticks(churn_counts.index, labels=['Not Churned', 'Churned'])
        plt.title('Count of Customers Churned vs. Not Churned')
        plt.show()

In [8]: plt.figure(figsize=(10, 6))

        sns.countplot(data=df, x='IsActiveMember', hue='Exited', palette='Set1'
        plt.xlabel('Active Membership')
        plt.ylabel('Count')
        plt.title('Active Membership Distribution by Churn')
        plt.legend(['Not Churned', 'Churned'])
        plt.xticks([0, 1], ['Inactive', 'Active'])
        plt.show()

Categorical values
In [9]: for i in df:
            if df[i].dtypes == object:
                print(df[i].value_counts(), "\n")

Geography
France 5014
Germany 2509
Spain 2477
Name: count, dtype: int64

Gender
Male 5457
Female 4543
Name: count, dtype: int64

Encoding categorical variables

In [10]: from sklearn.preprocessing import LabelEncoder

         le = LabelEncoder()

In [11]: df['Geography'] = le.fit_transform(df['Geography'])
         for i, j in enumerate(le.classes_):
             print(i, j)

0 France
1 Germany
2 Spain

In [12]: df['Gender'] = le.fit_transform(df['Gender'])
         for i, j in enumerate(le.classes_):
             print(i, j)

0 Female
1 Male
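
The printed codes follow alphabetical order because LabelEncoder assigns each class its position among the sorted unique labels. The plain-Python sketch below reproduces that rule and keeps one mapping per column, which may be useful because the notebook's single `le` object only remembers the last column it was fitted on. The names `label_mapping`, `geography_codes`, `gender_codes` and `inverse_geography` are hypothetical.

```python
def label_mapping(values):
    """Map each distinct value to its rank in sorted order."""
    classes = sorted(set(values))
    return {label: code for code, label in enumerate(classes)}

# A few sample labels; the full columns would give the same mappings
geography_codes = label_mapping(["France", "Spain", "France", "Germany"])
gender_codes = label_mapping(["Female", "Male", "Female"])

print(geography_codes)
print(gender_codes)

# Keeping a mapping per column makes the integer codes reversible later
inverse_geography = {code: label for label, code in geography_codes.items()}
print(inverse_geography[2])
```

Note that the resulting integers carry an order (France < Germany < Spain) that has no real meaning; tree-based models are usually tolerant of this, while linear models may treat it as a numeric scale.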

In [13]: df.head()

Out[13]:
   CreditScore  Geography  Gender  Age  Tenure    Balance  NumOfProducts  HasCrCard  ...
0          619          0       0   42       2       0.00              1          1  ...
1          608          2       0   41       1   83807.86              1          0  ...
2          502          0       0   42       8  159660.80              3          1  ...
3          699          0       0   39       1       0.00              2          0  ...
4          850          2       0   43       2  125510.82              1          1  ...

In [14]: df.isnull().sum()

Out[14]: CreditScore        0
         Geography          0
         Gender             0
         Age                0
         Tenure             0
         Balance            0
         NumOfProducts      0
         HasCrCard          0
         IsActiveMember     0
         EstimatedSalary    0
         Exited             0
         dtype: int64

Assigning dependent and independent variables

In [15]: X = df.drop('Exited', axis=1)
         y = df['Exited']

Splitting dataset to training and testing set

In [16]: from sklearn.model_selection import train_test_split

         X_train, X_test, y_train, y_test = train_test_split(X,y, train_size=.70
         X_train.shape, y_test.shape

Out[16]: ((7000, 10), (3000,))

Models
In [17]: from sklearn.metrics import classification_report, confusion_matrix

         def print_metrics(model, X_train=X_train,y_train = y_train, X_test = X_
             model.fit(X_train, y_train)
             y_pred = model.predict(X_test)
             print(classification_report(y_test,y_pred))
             print(confusion_matrix(y_test,y_pred))

In [18]: from sklearn.linear_model import LogisticRegression

         print_metrics(LogisticRegression())

              precision    recall  f1-score   support

           0       0.81      0.97      0.89      2416
           1       0.44      0.08      0.14       584

    accuracy                           0.80      3000
   macro avg       0.63      0.53      0.51      3000
weighted avg       0.74      0.80      0.74      3000

[[2354   62]
 [ 536   48]]
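
The report figures for the churn class can be re-derived from the confusion matrix alone, which scikit-learn lays out as [[TN, FP], [FN, TP]] for labels 0 and 1. The sketch below does that arithmetic for the logistic regression output and also computes the accuracy of always predicting the majority class; the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(matrix):
    """Metrics for the positive class (label 1) from a 2x2 confusion
    matrix laid out as [[TN, FP], [FN, TP]]."""
    (tn, fp), (fn, tp) = matrix
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "accuracy": (tn + tp) / total,
        # Accuracy obtained by always predicting the larger true class
        "majority_baseline": max(tn + fp, fn + tp) / total,
    }

# Confusion matrix copied from the logistic regression output above
logreg = binary_metrics([[2354, 62], [536, 48]])
for name, value in logreg.items():
    print(f"{name}: {value:.2f}")
```

Read this way, the 0.80 accuracy is no better than labelling every customer "not churned" (2416 of 3000 test rows), and only 48 of the 584 churners are caught, so recall on class 1 is the more telling number here.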

In [19]: from sklearn.tree import DecisionTreeClassifier

         print_metrics(DecisionTreeClassifier())

              precision    recall  f1-score   support

           0       0.88      0.87      0.87      2416
           1       0.48      0.50      0.49       584

    accuracy                           0.80      3000
   macro avg       0.68      0.68      0.68      3000
weighted avg       0.80      0.80      0.80      3000

[[2096  320]
 [ 294  290]]

In [20]: from xgboost import XGBClassifier

         print_metrics(XGBClassifier())

              precision    recall  f1-score   support

           0       0.88      0.95      0.92      2416
           1       0.70      0.48      0.57       584

    accuracy                           0.86      3000
   macro avg       0.79      0.71      0.74      3000
weighted avg       0.85      0.86      0.85      3000

[[2299  117]
 [ 305  279]]

In [21]: from sklearn.ensemble import RandomForestClassifier

         print_metrics(RandomForestClassifier())

              precision    recall  f1-score   support

           0       0.88      0.97      0.92      2416
           1       0.78      0.46      0.58       584

    accuracy                           0.87      3000
   macro avg       0.83      0.72      0.75      3000
weighted avg       0.86      0.87      0.86      3000

[[2338   78]
 [ 314  270]]
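
To compare the four models side by side, the confusion matrices printed in the model outputs can be collected and ranked on the churn class. The sketch below is one way to do that, ranking by F1 for label 1; the choice of F1 as the ranking key and the names `results`, `churn_scores` and `ranking` are assumptions made for illustration.

```python
# Confusion matrices copied from the model outputs, as [[TN, FP], [FN, TP]]
results = {
    "LogisticRegression": [[2354, 62], [536, 48]],
    "DecisionTreeClassifier": [[2096, 320], [294, 290]],
    "XGBClassifier": [[2299, 117], [305, 279]],
    "RandomForestClassifier": [[2338, 78], [314, 270]],
}

def churn_scores(matrix):
    """Return (precision, recall, f1) for the churn class (label 1)."""
    (tn, fp), (fn, tp) = matrix
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall, 2 * precision * recall / (precision + recall)

scores = {name: churn_scores(matrix) for name, matrix in results.items()}

# Rank models from best to worst churn-class F1
ranking = sorted(scores, key=lambda name: scores[name][2], reverse=True)

for name in ranking:
    precision, recall, f1 = scores[name]
    print(f"{name:<24} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

On these numbers the random forest and XGBoost lead on F1, while the decision tree finds slightly more churners (recall about 0.50) at the cost of many more false alarms. Since none of the models was given a fixed random seed in the cells shown, the exact figures may differ between runs.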

In [22]: # Factors contributing to customer attrition:
         # 1. Female (Gender)
         # 2. Germany (Geography)
