Phase 3

The document outlines a model planning and building phase aimed at predicting customer churn in a telecom company. It highlights key factors influencing churn, such as contract type, tenure, payment methods, and additional services, and describes the development of a Decision Tree Model to analyze these factors. The analysis concludes with recommendations to target high-risk customer segments and improve service offerings to enhance customer retention.

PHASE 3

MODEL PLANNING AND BUILDING

Prepared by: shouq alrahma 202102114, mariam alhammadi 202118366, maitha alateeqi 202117547

Prepared for: Mohammad Tubaishat


Importing data and downloading necessary files:

Calling `head` on the telecom data shows the first few rows, which
already hint at factors that may influence customer churn. The customers
with the shortest tenure (1-2 months) have a churn status of "Yes",
whereas a customer who has been with the company for 45 months has not
churned. In these rows, clients with month-to-month agreements churn
more often than customers with one-year contracts. The payment method
also appears to matter: three customers who used Electronic check as
their payment method have left, suggesting a possible connection between
this payment method and customer turnover. Clients without extra
features such as Technical Support, Device Protection, and Streaming TV
also appear more likely to cancel, suggesting that bundling services
could improve customer retention. A handful of rows cannot establish
these patterns, but they point to tenure, type of contract, payment
method, and extra services as the factors to examine in the analysis
that follows.

In this step, we prepared the data for analysis by converting the target
variable Churn into a factor, which classification tasks require. The
Churn column initially held the strings "Yes" and "No"; as a factor, it
is treated as a categorical variable with two levels, so the model can
distinguish customers who have left ("Yes") from those who have stayed
("No"). Setting Churn as a factor ensures the model handles it as a
binary outcome, which keeps the predictions interpretable as churn
classes.
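
The idea behind a factor can be pictured outside R as well. The following is a minimal Python sketch, not the report's R code, of what a factor stores: a sorted list of levels plus an integer code for each row. The function name `to_factor` and the sample values are hypothetical, and the codes here start at 0, whereas R numbers factor levels from 1.

```python
def to_factor(values):
    """Return (levels, codes): the sorted distinct labels and each value's index."""
    levels = sorted(set(values))
    index = {level: i for i, level in enumerate(levels)}
    codes = [index[v] for v in values]
    return levels, codes


# Hypothetical Churn column as it might look when read as plain strings.
churn = ["Yes", "No", "No", "Yes", "No"]
levels, codes = to_factor(churn)
print(levels)  # ['No', 'Yes']
print(codes)   # [1, 0, 0, 1, 0]
```

With only two levels, the column becomes a binary outcome, which is what a classification model expects as its target.
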

In this phase, we proceed with Model Planning and Building by first
dividing our data into training and testing sets, which is essential for
evaluating the model's performance. We set a random seed
(set.seed(123)) to ensure reproducibility, so that the split is the same
each time the code runs. Using a 70-30 split ratio, we assign 70% of the
data to the training set (used to build the model) and the remaining 30%
to the testing set (used to evaluate the model's accuracy). Training the
model on one portion of the data and testing it on another gives a more
reliable measure of how the model will perform on unseen data.
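
The split logic can be sketched as follows. This is an illustrative Python version under stated assumptions, not the report's R code: the helper name `train_test_split_indices` is hypothetical, the row count of 7,043 is inferred from the 4,930 training rows and 2,113 test rows reported in this document, and Python's seed 123 will not reproduce the rows selected by R's `set.seed(123)`.

```python
import random


def train_test_split_indices(n_rows, train_fraction=0.7, seed=123):
    """Shuffle row indices with a fixed seed and cut them into two parts."""
    rng = random.Random(seed)          # fixed seed: same shuffle on every run
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_train = int(n_rows * train_fraction)
    return indices[:n_train], indices[n_train:]


n_rows = 4930 + 2113  # assumed total, inferred from the reported set sizes
train_idx, test_idx = train_test_split_indices(n_rows)
print(len(train_idx), len(test_idx))  # 4930 2113
```

Because the two index lists never overlap, no test row is seen during training, which is what makes the test accuracy a fair estimate.
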

In this step, we constructed a Decision Tree Model for classification to
predict customer churn based on various factors. Using the rpart
function, the model relates the target variable (Churn) to 19
independent variables such as gender, tenure, contract type, payment
method, and additional services like Tech Support and Streaming TV. The
output shows the tree structure starting at the root node with 4,930
training observations, of which 1,308 are churners ("Yes"); because the
remaining 3,622 did not churn, the root node itself predicts "No." The
model then splits on key factors like contract type, tenure, and
Internet Service, progressively narrowing down groups of customers to
make more accurate predictions. For example, customers with a
month-to-month contract have a higher churn probability, while those
with longer contracts (e.g., one-year or two-year agreements) are less
likely to churn. Similarly, customers using Fiber optic services or
lacking Tech Support are identified as high-risk groups.
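
To make the root node concrete, here is a small Python sketch of the Gini impurity, which is rpart's default splitting criterion for classification. It assumes that the 1,308 figure at the root is the number of churners among the 4,930 training customers; the function name `gini_impurity` is a hypothetical helper, not part of the report.

```python
def gini_impurity(class_counts):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = sum(class_counts)
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


n_root = 4930          # training observations at the root node
n_yes = 1308           # assumed to be the churners ("Yes") at the root
n_no = n_root - n_yes  # remaining customers ("No")

root_gini = gini_impurity([n_no, n_yes])
baseline_accuracy = max(n_no, n_yes) / n_root  # always predicting the majority class

print(round(root_gini, 4))          # 0.3898
print(round(baseline_accuracy, 4))  # 0.7347
```

Each split the tree chooses is one that lowers this impurity in the resulting child nodes, and the baseline shows the accuracy a model would reach on the training set by predicting "No" for everyone.
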

In this step, we visualize our decision tree model for predicting customer
churn by plotting it. Each internal node in the tree represents a
decision point based on a feature, such as "Contract,"
"InternetService," or "Tenure," and splits customers by that
characteristic to predict their likelihood of churn ("Yes" or "No").

For example, the top node (root) splits on the "Contract" feature, where
customers with month-to-month contracts are more likely to churn than
those with one-year or two-year contracts. Specifically, out of 2,706
customers with month-to-month contracts, 1,539 have a churn status of
"No," while 1,167 have a churn status of "Yes" (about 43%), a much
higher churn share than the roughly 27% (1,308 of 4,930) in the training
set as a whole. The label in each node gives the predicted outcome
("Yes" or "No"), and the numbers within each node display the customer
count and the distribution of churn outcomes. Blue nodes generally
indicate a prediction of "No" (not churning), while green nodes
indicate "Yes" (churning).

This visual representation of the decision tree helps us understand which
factors contribute most to customer churn and how customer segments
differ based on these attributes.

This boxplot visualizes "Monthly Charges" for customers who stayed
("No") versus those who left ("Yes"). Customers who churned tend to
have higher monthly charges: their median monthly charge is slightly
above $70, compared with around $60 for those who did not churn. The
spread for churned customers is narrower, indicating more consistent
monthly charges among this group. This aligns with the decision tree
results, where higher monthly charges were a factor contributing to
customer churn, and emphasizes the importance of pricing strategies for
retaining high-paying customers.

This boxplot shows the "Tenure" (in months) of customers who stayed
("No") compared to those who left ("Yes"). Customers who stayed
generally have longer tenures, with a median tenure close to 40 months.
In contrast, customers who left have shorter tenures, with a median
closer to 10 months. This supports the findings from the decision tree,
where shorter tenure was a key factor for predicting churn: newer
customers are more likely to leave.

In this step, we evaluate the model's performance by generating a
confusion matrix to understand its classification accuracy. The confusion
matrix compares the model's predictions with the actual test data to
measure how well it distinguishes between customers who churn ("Yes")
and those who do not ("No").

According to the matrix, the model correctly predicted 1,396 instances
of non-churning customers and 262 instances of churning customers.
However, it misclassified 299 churners as not churning (false
negatives) and 156 non-churners as churning (false positives). Read this
way, the test set contains 561 actual churners, about 27% of its 2,113
customers, which matches the churn share in the training set. These
results show where the model needs improvement, in particular in
reducing false negatives, to enhance its predictive performance for
customer churn.

In this step, the overall accuracy of the decision tree model is calculated
to assess its performance. Accuracy is the number of correctly
classified instances (both "Yes" and "No") divided by the total number
of test cases in the dataset: (1,396 + 262) / 2,113.

The accuracy score obtained is 0.7847, which means the model correctly
classifies 78.47% of the customers in the test set.
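
The accuracy figure can be checked directly from the four confusion-matrix counts reported in this document. The sketch below is a Python illustration (the helper name `accuracy_from_counts` is hypothetical); it only needs the number of correct and incorrect predictions, so it does not depend on which error type each off-diagonal count represents.

```python
def accuracy_from_counts(correct, incorrect):
    """Share of predictions that match the actual label."""
    return correct / (correct + incorrect)


correct = 1396 + 262    # correctly predicted "No" plus correctly predicted "Yes"
incorrect = 156 + 299   # the two off-diagonal cells of the confusion matrix
total = correct + incorrect

accuracy = accuracy_from_counts(correct, incorrect)
print(total)               # 2113
print(round(accuracy, 4))  # 0.7847
```

The remaining 455 customers, roughly 21.5% of the test set, are the misclassified cases.
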

This t-test evaluates the "Contract Tenure Hypothesis" by comparing the
mean tenure of customers who churned ("Yes") against those who did
not churn ("No"). The results show a statistically significant
difference between the two groups, with a t-value of -34.824 and a
p-value less than 2.2e-16. The mean tenure is 17.98 months for churned
customers and 37.57 months for non-churned customers, a difference of
-19.59 months with a 95% confidence interval ranging from -20.69 to
-18.49. These findings strongly support the hypothesis that customers
with shorter tenure are more likely to churn, reinforcing the importance
of early engagement strategies to retain new customers.
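
The reported t-test figures can be cross-checked against one another. The Python sketch below uses only the numbers quoted in this document; the variable names are illustrative, and the standard error and critical value are back-calculated here rather than taken from the original output.

```python
mean_yes, mean_no = 17.98, 37.57   # mean tenure in months, churned vs. not churned
t_value = -34.824
ci_low, ci_high = -20.69, -18.49   # 95% confidence interval for the difference

diff = mean_yes - mean_no                # difference in means
ci_mid = (ci_low + ci_high) / 2          # should coincide with the difference
std_error = diff / t_value               # since t = difference / standard error
half_width = (ci_high - ci_low) / 2
critical = half_width / std_error        # implied critical value of the interval

print(round(diff, 2), round(ci_mid, 2))  # -19.59 -19.59
print(round(std_error, 4))               # 0.5625
print(round(critical, 2))                # 1.96
```

The interval is centered on the difference in means, and the implied critical value is close to 1.96, as expected for a 95% interval with a large sample, so the reported figures are mutually consistent.
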

This Chi-Squared test evaluates the "Additional Services Hypothesis" by
analyzing the relationship between customers having technical support
and their likelihood of churning ("Yes" or "No"). The test results show a
significant association between these factors, with an X-squared value of
828.2, 2 degrees of freedom, and a p-value of less than 2.2e-16. These
results indicate that the availability of technical support is strongly
associated with churn behavior.

The findings reveal that customers who do not have technical support
are far more likely to churn than those who have it, emphasizing the
importance of offering or improving technical support services. This
analysis strongly supports the hypothesis and suggests that providing
technical support can help retain customers and reduce churn.
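
For readers unfamiliar with the test, the following Python sketch shows how a chi-squared statistic and its degrees of freedom are computed from a contingency table. The counts are hypothetical toy values, NOT the report's data (the underlying table is not reproduced in this document), and the function name `chi_squared` is an illustrative helper.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a 2-D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: 3 service categories (rows) x 2 churn outcomes (columns).
toy_table = [[30, 10], [20, 20], [10, 30]]
statistic, dof = chi_squared(toy_table)
print(statistic, dof)  # 20.0 2
```

The degrees of freedom equal (rows - 1) x (columns - 1), so a test with 2 degrees of freedom against a two-level churn variable implies three categories on the other variable, and one with 3 degrees of freedom implies four.
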

This Chi-Squared test evaluates the "Billing and Payment Methods
Hypothesis" by analyzing the relationship between customers' payment
methods and their likelihood of churning ("Yes" or "No"). The test
results reveal a significant association between these factors, with an
X-squared value of 648.14, 3 degrees of freedom, and a p-value of less
than 2.2e-16. These findings demonstrate that the payment method a
customer uses is significantly associated with their churn behavior.

The analysis shows that customers using automatic payment methods,
such as Bank Transfer and Credit Card, are far less likely to churn, with
churn rates of 16.7% and 15.2%, respectively. In contrast, customers
using Electronic Check show a much higher churn rate of 45.3%,
making this group the most likely to leave. Customers paying by Mailed
Check have a comparatively low churn rate of 19.1%, even though this
method is not automatic.

These findings strongly support the hypothesis and highlight the
importance of promoting automatic payment methods as a strategy to
reduce churn. Additionally, the high churn rate for electronic check
users suggests a need to address issues related to this payment method,
such as user experience or satisfaction, to improve customer retention.
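
To put the gap between payment methods in proportion, the reported churn rates can be compared with the lowest one. The Python sketch below uses only the four percentages quoted in this document; the dictionary and variable names are illustrative.

```python
# Churn rates (%) by payment method, as reported in this document.
churn_rate = {
    "Bank Transfer": 16.7,
    "Credit Card": 15.2,
    "Electronic Check": 45.3,
    "Mailed Check": 19.1,
}

lowest = min(churn_rate, key=churn_rate.get)  # method with the lowest churn rate
ratios = {method: rate / churn_rate[lowest] for method, rate in churn_rate.items()}

# List methods from highest to lowest relative churn.
for method, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{method}: {ratio:.2f}x the {lowest} rate")
```

On these figures, Electronic Check customers churn at roughly three times the rate of Credit Card customers, while the other two methods stay within about 1.3 times that rate.
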

Proposal and Recommendations:

Target Month-to-Month Contract Customers: Since these customers have
shown higher churn rates, offer incentives such as discounts or
loyalty programs to encourage them to switch to longer contracts.

Focus on Early Tenure Customers: Customers in the early stages of their
tenure are at higher risk of churn. Implementing onboarding programs,
personalized support, or early loyalty incentives could improve
retention among new customers.

Improve Service for Fiber Optic Customers: The decision tree identified
fiber optic users as a high-risk group, so investigate service quality
or pricing issues and consider offering tailored support or premium
features to improve satisfaction among these customers.

Enhance Support Services: The availability of tech support is
associated with lower churn. Investing in improved customer support,
especially for customers who initially decline tech support, could help
reduce churn.
