Interim Report

Uploaded by

Anisha Gheever


INTERIM REPORT: CUSTOMER CHURN PREDICTION IN SUBSCRIPTION-BASED SERVICES

1. Introduction of the Business Problem

Defining the Problem Statement: This project analyzes customer churn in subscription-based
services, which is vital to a company's future revenue and customer loyalty. The task is to
identify the major variables involved in churn, build machine learning models that forecast
customer behavior, and derive business strategies to reduce churn.

Need of the Study/Project: Customer churn is one of the most important measures of business
profitability. Because churn reflects the likelihood of customers leaving, predicting it lets
businesses deploy proactive retention measures, which helps stabilize income and reduces the
cost of acquiring replacement customers.

Understanding Business/Social Opportunity: Besides identifying customers' reasons for churning,
predictive churn modeling provides actionable knowledge that can be used to intervene and
improve customer satisfaction, which keeps customers loyal to the firm and strengthens the
firm's value proposition.

2. Data Report

Data Collection: The data used is secondary customer subscription data covering demographic
information, service usage, preferred mode of payment, and interaction with the customer
service department. It includes both monetary and non-monetary attributes describing customers
and their characteristics, their use of services, and their experience with the customer
support of the company under analysis. The records were originally collected through surveys
of a sample drawn from that company's customer database.

Visual Inspection of Data: The dataset consists of 11,260 rows and 19 attributes, including
customer tenure, monthly revenue, and interaction details. Each attribute was explored
graphically using histograms, box plots, and count plots to understand its distribution and to
check for outliers.

Understanding of Attributes: The dataset attributes fall into four groups: demographic data
(Gender, Marital_Status); service usage details (Tenure, Account_user_count); payment details
(Payment, coupon_used_for_payment); and customer support details (CC_Contacted_LY,
Day_Since_CC_connect). Some features, such as identification numbers, were removed before
processing. Missing data was found in measures such as Tenure, Payment, and Service_Score and
was handled through imputation.

3. Exploratory Data Analysis (EDA)

Output of EDA:

• Shape of the Dataset: The dataset contains 11,260 rows and 19 columns.
• Column Headings: ['AccountID', 'Churn', 'Tenure', 'City_Tier', 'CC_Contacted_LY',
'Payment', 'Gender', 'Service_Score', 'Account_user_count', 'account_segment',
'CC_Agent_Score', 'Marital_Status', 'rev_per_month', 'Complain_ly', 'rev_growth_yoy',
'coupon_used_for_payment', 'Day_Since_CC_connect', 'cashback', 'Login_device']
• First Few Rows: The displayed sample rows include information such as AccountID, Churn
status, Tenure, and Payment method.
• Last Few Rows: The displayed sample rows show the same attributes and give an overview
of the data.
• Missing Data: Several columns, including Tenure, Payment, and Service_Score, contain
missing values.
• Data Types: The attributes are a mix of numerical and categorical variables, which require
different preprocessing techniques.
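
The inspection steps listed above can be reproduced with a few pandas calls. The sketch below is illustrative only: it assumes the data is held in a pandas DataFrame, the column names follow the report, and the four rows of values are invented for the example.

```python
import numpy as np
import pandas as pd

# Tiny stand-in frame: column names follow the report, values are invented.
df = pd.DataFrame({
    "AccountID": [20000, 20001, 20002, 20003],
    "Churn": [1, 0, 0, 1],
    "Tenure": [4.0, np.nan, 12.0, 0.0],
    "Payment": ["Debit Card", "UPI", None, "Credit Card"],
    "Service_Score": [3.0, 3.0, np.nan, 2.0],
})

print(df.shape)            # (rows, columns)
print(list(df.columns))    # column headings
print(df.head(2))          # first few rows
print(df.tail(2))          # last few rows

# Count missing values per column and list the data types.
missing_counts = df.isna().sum()
print(missing_counts)
print(df.dtypes)
```

On the real dataset the same calls would report the shape, headings, missing-value counts, and types summarized in the list above.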

Univariate Analysis:

• Distribution of Customer Tenure: The histogram (Figure 1) shows the spread of customers by
tenure: most have a tenure below 20 months, and fewer customers lie above this point. This
suggests that a significant number of customers may churn comparatively early. If early
churn is driven by dissatisfaction, a targeted retention strategy could focus on improving
the customer experience within the first year.

Figure 1: Distribution of Customer Tenure

• Churn Rate by Account Segment: The bar plot (Figure 2) shows that some account segments,
notably segments 2 and 3, experienced higher churn rates. This suggests that customers in
these segments may have unmet needs or problems, which could be addressed through enhanced
service delivery or reward programs.

Figure 2: Churn Rate by Account Segment
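
A per-segment churn rate like the one plotted in Figure 2 is the mean of the binary Churn column within each segment. The following is a minimal sketch, assuming 'account_segment' is held as integer codes and using invented rows; it is not the report's actual computation.

```python
import pandas as pd

# Invented rows; 'account_segment' holds illustrative integer segment codes.
df = pd.DataFrame({
    "account_segment": [1, 1, 2, 2, 2, 3, 3, 3],
    "Churn":           [0, 0, 1, 0, 1, 1, 1, 0],
})

# Mean of a 0/1 target within each group is that group's churn rate.
churn_rate = (
    df.groupby("account_segment")["Churn"]
      .mean()
      .sort_values(ascending=False)
)
print(churn_rate)
```

Sorting in descending order puts the segments with the highest churn rate first, which is the ordering a retention team would typically want to read.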

Bivariate Analysis:

• Customer Service Agent Score by Churn Status: The box plot (Figure 3) compares retained
and churned customers and shows that churned customers have a higher customer service
agent score. The report reads this as evidence that much of the churn is linked to the
customer support experience, which calls for greater commitment by companies to improving
the quality of the customer relations department.

Figure 3: Customer Service Agent Score by Churn Status

• Monthly Revenue by Churn Status: The box plot (Figure 4) of customers' monthly revenue
shows that the two groups do not differ significantly, so churn cannot be explained by
revenue alone. It can still be valuable to assess revenue together with other value
drivers, such as enhanced service quality and customer satisfaction, when constructing a
more effective retention agenda.

Figure 4: Monthly Revenue by Churn Status

• Number of Complaints Last Year by Churn Status: The count plot (Figure 5) shows that
customers with more complaints tend to churn. This underlines the need for timely handling
of complaints in order to minimize dissatisfaction and, consequently, churn. Businesses
should put efficient complaint-handling processes in place and analyze complaints
continuously.

Figure 5: Number of Complaints Last Year by Churn Status
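
The comparisons behind Figures 3 to 5 can also be checked numerically: group medians mirror the centre line of a box plot, and a row-normalized cross-tabulation mirrors a count plot split by churn status. This is a hedged sketch on invented values, with 'Complain_ly' treated as a 0/1 flag purely for illustration.

```python
import pandas as pd

# Invented rows using the report's column names.
df = pd.DataFrame({
    "Churn":          [0, 0, 0, 0, 1, 1, 1, 1],
    "CC_Agent_Score": [2, 3, 3, 4, 3, 4, 5, 5],
    "rev_per_month":  [5, 6, 7, 8, 5, 7, 6, 8],
    "Complain_ly":    [0, 0, 1, 0, 1, 1, 0, 1],
})

# Median per churn group: the numeric counterpart of the box plots.
medians = df.groupby("Churn")[["CC_Agent_Score", "rev_per_month"]].median()
print(medians)

# Share of churned / retained customers within each complaint group.
complaint_churn = pd.crosstab(df["Complain_ly"], df["Churn"], normalize="index")
print(complaint_churn)
```

In this toy data the revenue medians are identical across groups while the complaint groups differ sharply, which is the kind of pattern the bivariate analysis describes.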

Data Cleaning and Transformation:

• Handling Missing Values for 'Account_user_count': The 'Account_user_count' column was
converted to a numeric type, with values that could not be parsed coerced to NaN. Missing
values were then filled with the column mean.
• Removal of Unwanted Variables: Columns deemed irrelevant, repetitive, or derived without
adding significant value were removed. These include:
   o AccountID: A unique identifier with no predictive value.
   o Login_device: The type of device used for login (e.g., mobile or computer),
     considered unlikely to be a strong predictor of churn.
   o Marital_Status: Considered unlikely to have a direct influence on customer churn in
     this context.
   o coupon_used_for_payment: Limited influence on churn and not a strong predictor.
   o rev_growth_yoy: Derived feature that could be redundant with other revenue-related
     attributes.
   o Day_Since_CC_connect: Number of days since the last customer care interaction,
     judged not to provide significant additional predictive power.
   o cashback: Promotional feature that may add noise rather than predictive value.
• Encoding Categorical Columns: The categorical columns ('Payment', 'Gender',
'account_segment') were label encoded to prepare them for modeling.
• Class Imbalance Check: The distribution of the target variable ('Churn') was checked
for imbalance and visualized using a bar plot (Figure 6), which shows that there are
significantly more non-churn (class 0) customers than churn (class 1) customers. This
will require techniques like SMOTE to balance the dataset for modeling.

Figure 6: Class Distribution of Target Variable (Churn)
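
The cleaning steps listed above map onto a short pandas and scikit-learn sequence. The sketch below assumes a pandas DataFrame with the report's column names; the rows, including the unparseable '@' entry, are invented to show the coercion step and are not taken from the dataset.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Invented rows; '@' stands in for an entry that cannot be parsed as a number.
df = pd.DataFrame({
    "AccountID": [1, 2, 3, 4],
    "Churn": [0, 0, 0, 1],
    "Account_user_count": ["3", "@", "5", "4"],
    "Payment": ["UPI", "Debit Card", "UPI", "Credit Card"],
    "Gender": ["Male", "Female", "Female", "Male"],
    "account_segment": ["Super", "HNI", "Super", "Regular"],
    "Login_device": ["Mobile", "Computer", "Mobile", "Mobile"],
    "cashback": [120.0, 95.5, 130.2, 110.0],
})

# 1. Convert to numeric, coercing unparseable entries to NaN, then mean-impute.
df["Account_user_count"] = pd.to_numeric(df["Account_user_count"], errors="coerce")
df["Account_user_count"] = df["Account_user_count"].fillna(df["Account_user_count"].mean())

# 2. Drop the columns the report lists as unwanted (absent ones are ignored).
drop_cols = ["AccountID", "Login_device", "Marital_Status", "coupon_used_for_payment",
             "rev_growth_yoy", "Day_Since_CC_connect", "cashback"]
df = df.drop(columns=drop_cols, errors="ignore")

# 3. Label encode the categorical columns named in the report.
for col in ["Payment", "Gender", "account_segment"]:
    df[col] = LabelEncoder().fit_transform(df[col])

# 4. Check the class balance of the target.
class_counts = df["Churn"].value_counts()
print(df)
print(class_counts)
```

Label encoding assigns integer codes in sorted order of the category labels, so the codes carry no meaning beyond identity; that is usually acceptable for tree-based models but can mislead linear ones.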

4. Business Insights from EDA

Data Imbalance: The dataset is highly skewed, as the majority of customers did not churn.
Although churn is comparatively uncommon, it is essential to address because each churned
customer represents lost value to the business. Balancing the classes with a method such as
SMOTE also reduces modelling bias toward the majority class, so that the model learns to
predict both the churn and non-churn classes.
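
To make the balancing idea concrete, the sketch below implements a simplified SMOTE-style oversampler with NumPy only: each synthetic minority row is an interpolation between a minority point and one of its nearest minority neighbours. This is an illustration of the principle under stated assumptions, not the report's implementation; the function name, parameters, and data are hypothetical.

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic minority rows by interpolating between neighbours."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    synthetic = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        j = rng.integers(len(X_min))                  # pick a minority point
        dists = np.linalg.norm(X_min - X_min[j], axis=1)
        neighbours = np.argsort(dists)[1:k + 1]       # skip the point itself
        neighbour = X_min[rng.choice(neighbours)]
        gap = rng.random()                            # position along the segment
        synthetic[i] = X_min[j] + gap * (neighbour - X_min[j])
    return synthetic

# Invented, imbalanced toy data: 12 non-churn rows and 4 churn rows.
rng = np.random.default_rng(42)
X_majority = rng.normal(0.0, 1.0, size=(12, 2))
X_minority = rng.normal(3.0, 1.0, size=(4, 2))

new_rows = smote_like_oversample(X_minority, n_new=len(X_majority) - len(X_minority))
X_balanced = np.vstack([X_majority, X_minority, new_rows])
y_balanced = np.array([0] * len(X_majority) + [1] * (len(X_minority) + len(new_rows)))
print(X_balanced.shape, np.bincount(y_balanced))
```

Oversampling of this kind is normally applied to the training split only, so that the evaluation data keeps the real class distribution.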
High Churn Rate in Specific Account Segments: The analysis shows that churn is higher in some
segments than in others, most notably segment 2 and segment 3. Customers in these segments may
have unmet needs or other issues that need to be addressed. These segments could be targeted
through business interventions such as specialized services or loyalty programs that help
reduce churn rates.

Customer Interaction Frequency as a Churn Indicator: High levels of interaction with
customer care, for instance repeated complaints, are also indicators of churn. It is not
enough to attend to customers' complaints; the resolutions must also leave them satisfied.
Customer retention can be improved if businesses actively identify and resolve these pain
points.

Importance of Early Tenure Experience: The tenure distribution suggests that customers are
more inclined to churn in the earlier stages of their relationship with the company, so first
impressions are crucial for customer loyalty. Customers on monthly subscriptions are more
likely to churn within the first weeks or months, which gives firms an opportunity to focus on
the onboarding process and ensure high satisfaction during the first months of interaction.
Customer Service Quality Impact on Churn: The analysis of the score given to customer service
agents indicates that customers who rated their service experience low are the ones most
likely to churn. Firms should therefore direct resources towards practical training for
customer service staff and towards improving the quality of service delivery.
Revenue Insights: Monthly revenue is distributed similarly for customers who churned and
those who did not. Churn is therefore not solely related to revenue; firms need to look deeper
at high-revenue customers and at their general experience and satisfaction in order to retain
them. Offering exclusive services or privileges may help retain high-spending customers.

Complaint Management: Last year's complaint count is a strong indicator of churn. Customers
who have lodged more complaints than others are at higher risk of leaving. Automating
complaint handling, adopting a positive attitude towards customers, and monitoring customer
dissatisfaction are effective ways to manage churn. A closed feedback loop should be in place
to act on complaints and to assess customer support and feedback.

Clustering Insights for Customer Segmentation: The initial cluster analysis makes it possible
to identify separate customer segments, such as high-risk churners. This helps businesses
address the high-risk segments within their customer base with loyalty programs, offers, or
targeted messaging that increase customer retention.

5. Model Building and Interpretation

Algorithms Selected: Given the nature of the churn prediction problem, three supervised
learning algorithms were selected: Logistic Regression, Random Forest, and SVC.
• Logistic Regression was chosen for its simplicity and interpretability.
• Random Forest was selected because it combines multiple decision trees to reach its final
decision and can therefore handle feature interactions.
• SVC was used for its ability to capture non-linear relationships, which can help separate
churned customers from the rest.
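
A baseline comparison of the three algorithms can be sketched with scikit-learn as follows. The data is synthetic (generated with make_classification and an assumed class imbalance), and the split ratio, random seeds, and default hyperparameters are illustrative choices rather than the report's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the churn data, with an assumed minority churn class.
X, y = make_classification(n_samples=600, n_features=8, n_informative=4,
                           weights=[0.83, 0.17], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=42),
    "SVC": SVC(),
}

# Fit each model and record accuracy plus recall on the churn class (label 1).
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "recall_churn": recall_score(y_test, y_pred, pos_label=1, zero_division=0),
    }

for name, scores in results.items():
    print(name, scores)
```

Reporting churn-class recall next to accuracy matters on imbalanced data, because a model can score high accuracy while missing most churners.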

Base Model Performance and Interpretation:

Figure 7: Base Model Performance
Logistic Regression: Overall accuracy was good, but the model was poor at classifying
customers with a propensity to churn, as evidenced by its lower recall. It was therefore
unlikely to capture true churn cases and not efficient for retention efforts (Figure 8).

Figure 8: Logistic Regression - Confusion Matrix

Random Forest Classifier: The model registers very good accuracy and maintains a good
balance between precision and recall. It was generally able to separate churned and
non-churned customers, which makes it well suited to churn prediction (Figure 9).

Figure 9: Random Forest Classifier - Confusion Matrix

Support Vector Classifier (SVC): Although the model had good overall accuracy, it faced
significant challenges in predicting churned customers due to a low recall score. The model
often failed to identify customers at risk of churning, which limits its usefulness for
proactive retention (Figure 10).

Figure 10: Support Vector Classifier - Confusion Matrix

Feature Selection and Re-Training:

Feature selection was performed to improve the performance of the machine learning algorithms
and, at the same time, the interpretability of the results. Reducing the number of input
features makes the models less prone to overfitting and lowers the computational cost, which
matters when memory is limited.
Recursive Feature Elimination (RFE): RFE was used as the feature selection method. It works
recursively: at each step the model is refitted and the least important feature is removed,
until the desired number of features remains. The aim is to find the features that matter
most, since keeping only those improves model generalization and interpretability.
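
The RFE procedure described above is available in scikit-learn as `sklearn.feature_selection.RFE`. The sketch below runs it on synthetic data with the eleven feature names that remain after the report's column removal; the Logistic Regression estimator and the target of five features are assumptions chosen to mirror the five-feature lists reported below.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Feature names follow the report; the data itself is synthetic.
feature_names = ["Tenure", "City_Tier", "CC_Contacted_LY", "Payment", "Gender",
                 "Service_Score", "Account_user_count", "account_segment",
                 "CC_Agent_Score", "rev_per_month", "Complain_ly"]
X, y = make_classification(n_samples=500, n_features=len(feature_names),
                           n_informative=5, n_redundant=2, random_state=0)

# Repeatedly fit the estimator and drop the weakest feature until 5 remain.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5)
selector.fit(X, y)

selected = [name for name, keep in zip(feature_names, selector.support_) if keep]
print(selected)
print(dict(zip(feature_names, selector.ranking_)))  # rank 1 = kept
```

Because RFE ranks features using the wrapped estimator's coefficients or importances, running it with a different estimator can yield a different subset, which is consistent with each model in the report ending up with its own feature list.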
Selected Features for Each Model:

• Logistic Regression: ['Tenure', 'City_Tier', 'Account_user_count', 'CC_Agent_Score',
'Complain_ly']
• Random Forest Classifier: ['Tenure', 'CC_Contacted_LY', 'Payment',
'Account_user_count', 'rev_per_month']
• Support Vector Classifier (SVC): ['Tenure', 'CC_Contacted_LY', 'Account_user_count',
'account_segment', 'Complain_ly']

Model Performance with Selected Features

Figure 11: Model Performance with Selected features
The table in Figure 11 summarizes the machine learning models used for churn prediction and
the results obtained after feature selection. The values shown are accuracy together with
precision, recall, and F1-score for each model, broken down by churn class (yes/no).

Here is an overview of what each column means:

Model: The machine learning model evaluated: Logistic Regression, Random Forest, or SVC.
Accuracy (%): The ratio of correct predictions to the total number of predictions made by the
model. For instance, Random Forest has an accuracy of 88.51%, meaning it predicted the correct
churn status for 88.51% of customers.
Precision (Churn: No) (%): The percentage of "no churn" predictions that were correct. A high
value means that the majority of the "no churn" predictions were accurate.
Precision (Churn: Yes) (%): The percentage of "churn" (class 1) predictions that were correct.
A low precision means that many of the customers labelled "churn" did not actually churn; SVC
recorded 0% here.
Recall (Churn: No) (%): The share of actual non-churn customers that the model classified
correctly. A high value, such as SVC's 100%, indicates that the model captured the non-churn
cases.
Recall (Churn: Yes) (%): The share of customers who actually churned that the model
identified. A low recall, such as SVC's 0%, indicates that the model identified none of the
true churn cases.
F1-Score (Churn: No) (%): The harmonic mean of precision and recall for non-churned
customers. It gives a single balanced measure of how effectively the model predicts the
absence of churn.
F1-Score (Churn: Yes) (%): The same measure for churned customers. A low F1-score here means
that the model was not effective at identifying churned customers because precision, recall,
or both were low.
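
The column definitions above can be written out directly from confusion-matrix counts. The helper below is a hypothetical illustration in plain Python, and the counts are invented; the second call shows how a model that never predicts churn can still reach high accuracy while scoring 0 on churn precision, recall, and F1 (precision is set to 0 by convention when no churn predictions are made).

```python
def class_metrics(tp, fp, fn, tn):
    """Accuracy plus precision, recall and F1 for the positive (churn) class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # 0 by convention if no positives predicted
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Invented counts: a model that finds 60 of 100 churners with 20 false alarms.
balanced_model = class_metrics(tp=60, fp=20, fn=40, tn=880)

# Invented counts: a model that predicts "no churn" for every customer.
always_no_churn = class_metrics(tp=0, fp=0, fn=100, tn=900)

print(balanced_model)
print(always_no_churn)
```

The second case is why accuracy alone is a poor yardstick for churn models on imbalanced data.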
Summary of Insights:
Logistic Regression: Achieved fairly high accuracy and high recall for the "no churn" class
but failed to identify churn cases properly, with poor precision and recall for class 1.
Random Forest Classifier: Obtained the best values on all measures, with high performance for
both the churned and non-churned classes. Its accuracy of 88.51% was the highest, supporting
its ability to identify both types of customers.
Support Vector Classifier (SVC): Had good accuracy but performed very poorly in identifying
churn cases, as the confusion matrix above shows. Its recall for churn was 0%, meaning that it
did not correctly identify a single churned customer. This is a major disadvantage for churn
prediction.
Hence, the Random Forest algorithm proved to be a reliable model for customer churn
prediction, as it maintained good, balanced accuracy and recall across the two classes.
Model Interpretation
Comparing the results overall, the Random Forest Classifier showed the most consistently high
accuracy and the highest figures, whether or not feature selection was applied. Its handling
of feature interactions and its balance of precision and recall for both churned and
non-churned customers are why this model is preferred for customer churn prediction.

6. Model Tuning and Business Implications

Hyperparameter Tuning:
To further improve performance, GridSearchCV was used to optimize the hyperparameters of
every model. Key parameters were tuned to improve accuracy and, in particular, recall for
churned customers.
Hyperparameter Tuning Results:
Logistic Regression: The best results were obtained with C=1 and penalty='l2'. After tuning,
the model achieved 86.53% accuracy. However, recall for churn stayed low, indicating a need
for further improvement.
Random Forest Classifier: The best parameters were n_estimators=50, max_depth=None, and
min_samples_split=2. The tuned Random Forest achieved an accuracy of 88.40% across churn and
non-churn customer records.
Support Vector Classifier (SVC): The highest accuracy was achieved with C=0.01 and a linear
kernel. For churn, the tuned SVC performed as poorly as the untuned model, with a recall of
0%.
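
A grid search of this kind can be sketched with scikit-learn's `GridSearchCV`. The report states only the best Random Forest values (n_estimators=50, max_depth=None, min_samples_split=2), so the candidate grid, the recall scoring, the three folds, and the synthetic data below are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic, imbalanced stand-in for the churn data.
X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           weights=[0.83, 0.17], random_state=42)

# Hypothetical candidate grid; only the winning values appear in the report.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 5],
    "min_samples_split": [2, 5],
}

# Score on recall of the churn class (label 1) with 3-fold cross-validation.
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid,
                      scoring="recall", cv=3)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Choosing recall as the scoring metric steers the search towards models that catch more churners; scoring on accuracy instead can favour models that simply predict the majority class.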

Figure 12: Hyperparameter Tuning – results

As per Figure 12:
Accuracy: The Random Forest model achieved 88.40% accuracy, meaning the model was right about
customer behavior in 88.40% of instances.
Precision (Churn: No/Yes): The Random Forest model had a precision of 70% for "Churn: Yes",
meaning that 70 percent of the customers predicted to churn actually did.
Recall: The Random Forest model had a recall of 60% for "Churn: Yes", meaning that it
correctly detected 60 percent of the actual churn cases.
F1-Score (Churn: No/Yes): The Random Forest model had an F1-score of 64% for "Churn: Yes",
indicating a reasonable balance between correct churn predictions and false alarms.
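
As a quick consistency check, the F1-score can be recomputed from the rounded precision and recall quoted above. With P = 0.70 and R = 0.60:

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.70 \times 0.60}{0.70 + 0.60}
    = \frac{0.84}{1.30}
    \approx 0.646
```

This is close to the reported 64%; the small gap would be consistent with the precision and recall having been rounded to whole percentages before being quoted.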
Key Observations (Figure 12):
Random Forest Classifier: Produced a well-fitted model with the best precision, recall, and
F1-score balance, making it the best model for detecting churned and non-churned customers.

Logistic Regression: Performed fairly well overall but missed some churned customers because
of its low recall.
Support Vector Classifier (SVC): Failed to capture any churned customers, as indicated by its
recall and F1-score of 0% for "Churn: Yes". This model may not be well suited to this problem
without further changes.
Business Implications:
Improved Customer Retention Strategies: The tuned Random Forest Classifier demonstrated the
most accurate performance, so businesses can use the information from this model to formulate
timely retention strategies. Knowing the top drivers of churn makes it easier to target
customers who are likely to churn and to address their issues.
Resource Allocation: Churn can be managed well when providers identify the types of clients
most likely to leave and direct their efforts where they are needed, in marketing, customer
care, and so on. This supports customer loyalty programs and ensures that the organisation
pays attention to customers who are considering leaving.
Personalized Interventions: The knowledge gained from the tuned models is useful for
designing individual interventions. For example, customers with high risk scores can be
offered loyalty bonuses, promotions on the products they chose, or improved customer service.
Onboarding Improvements: Onboarding is an important area for decreasing churn rates because
many customers churn in the first few months of interacting with the company. Owners and
managers should pay close attention to customers' first interactions with their goods and
services.
Customer Support Quality: The tuned models pointed to customer service quality as an
important factor. Reducing this source of churn requires improvements in customer support:
staff serving clients should be trained in proper communication, and mechanisms for
receiving customer feedback should be developed.
Data-Driven Decision Making: Through predictive modeling, firms and organizations can better
understand which customers they should focus on keeping. This involves determining which
segments are most likely to churn and then developing programs or improvements suited to
those customers.
Conclusion
This interim report focused on customer churn prediction in subscription-based service
industries using standard ML algorithms. It assessed the performance of Logistic Regression,
the Random Forest Classifier, and the Support Vector Classifier (SVC) in differentiating
churned from non-churned clients. Based on the analysis:

• The Random Forest Classifier turned out to be the best model, achieving high accuracy and
the best balance of recall and precision for the churn and non-churn classes.
• Logistic Regression performed reasonably well overall but had difficulty identifying
churned customers because of its very low recall.
• The Support Vector Classifier (SVC) still failed to identify churned customers, even after
hyperparameter optimization.
Business Implications:
The findings suggest that firms need to understand and anticipate patterns of customer
attrition in order to develop effective retention strategies. The information produced by the
models enables firms to identify high-risk customer segments, deploy resources effectively,
and decide on measures that improve customer satisfaction and hence loyalty. Attention to
customer acquisition and onboarding, targeted sales offers, and effective complaint handling
are among the measures needed to reduce churn and improve overall business profitability.
Next Steps:
• Model Optimization: Ensemble methods or more sophisticated techniques could be explored
to improve the models further.
• Data Enrichment: The model may benefit from additional customer behavioral attributes,
such as the frequency of posting about the product on social media or past purchase
behaviour.
• Implementation: Use the tuned model to predict churn in real time and feed the insights
into the customer relationship management (CRM) system so that early and adequate
retention action can be taken.
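
The implementation step can be sketched as scoring customers with the tuned model and flagging those above a risk threshold for follow-up. The example below uses synthetic data, the Random Forest settings reported as best in the tuning section, and a hypothetical threshold of 0.5; how the flagged list reaches a CRM system is outside the scope of the sketch.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in: first 300 rows play "history", last 100 play "current customers".
X, y = make_classification(n_samples=400, n_features=6,
                           weights=[0.83, 0.17], random_state=1)
X_history, y_history = X[:300], y[:300]
X_current = X[300:]

# Tuned settings as reported: n_estimators=50, max_depth=None, min_samples_split=2.
model = RandomForestClassifier(n_estimators=50, max_depth=None,
                               min_samples_split=2, random_state=1)
model.fit(X_history, y_history)

# Probability of the churn class (label 1) for each current customer.
churn_probability = model.predict_proba(X_current)[:, 1]

RISK_THRESHOLD = 0.5  # hypothetical cut-off; would be set from business costs
high_risk_idx = np.where(churn_probability >= RISK_THRESHOLD)[0]
print(len(high_risk_idx), "customers flagged for retention follow-up")
```

Lowering the threshold flags more customers and raises recall at the cost of more false alarms, so the cut-off is a business decision as much as a modelling one.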
