Capstone Project
PROJECT REPORT
for
Submitted By
Specialization    SAP ID       Name
CC&VT             500083753    Mansha Batra
AI&ML             500082980    Aayushman Gusain
Submitted to:
Ashish Pratap Singh,
Assistant Professor (SG)
School of Business, UPES
School of Computer Science
UPES, Dehradun
1. Project Title
Real-time Churn Prediction with Amazon Connect and Amazon SageMaker.
2. Abstract
The Real-time Churn Prediction project combines Artificial Intelligence and Machine Learning
(AIML) with Cloud Computing Practices to develop a predictive model for customer churn with
Amazon Connect, a cloud-based contact centre solution. The model is built, trained, and
deployed with Amazon SageMaker, a fully managed machine learning service. The project aims to
give businesses a real-time churn prediction solution that helps them address customer churn
proactively, thereby increasing customer retention and improving overall business
performance. It also employs Business Frameworks to keep the solution aligned with the
business's objectives, strategies, and processes. By integrating AIML, Cloud Computing
Practices, and Business Frameworks, the project offers a comprehensive approach to predicting
and mitigating customer churn, giving businesses a competitive edge in the market.
3. Introduction
Customer churn is a significant challenge faced by businesses in today's highly competitive
market. It refers to the loss of customers who discontinue using a company's products or
services. Customer churn can lead to a significant loss of revenue, profitability, and market
share for businesses. Therefore, it is crucial for businesses to identify and address the factors
that contribute to customer churn.
The Real-time Churn Prediction project applies Artificial Intelligence and Machine Learning
(AIML) and Cloud Computing Practices to predict customer churn in real time. It uses Amazon
Connect, a cloud-based contact centre solution, and Amazon SageMaker, a fully managed machine
learning service, to build, train, and deploy a predictive model for customer churn. The aim is
to give businesses a real-time churn prediction solution that helps them address customer
churn proactively, thereby increasing customer retention and improving overall business
performance.
In addition to AIML and Cloud Computing Practices, the project also employs Business
Frameworks to ensure that the solution is aligned with the business's objectives, strategies, and
processes. By integrating these three key components, the project offers a comprehensive
solution for predicting and mitigating customer churn, thereby providing businesses with a
competitive edge in the market. This report presents a detailed overview of the Real-time
Churn Prediction project, including its objectives, methodology, implementation, and
evaluation. The report also discusses the benefits and limitations of the project and provides
recommendations for future research and development.
4. Problem Statement
Businesses need to predict customer churn effectively and efficiently, and doing so requires
advanced technologies and frameworks. A real-time churn prediction system gives businesses
valuable insights into customer behaviour, lets them identify at-risk customers, and supports
targeted retention strategies that reduce churn rates and improve customer satisfaction.
5. Motivation
The motivation behind the Real-time Churn Prediction project is to address the critical
challenge of customer churn faced by businesses. Customer churn, which refers to the loss of
customers who discontinue using a company's products or services, can lead to significant
revenue loss, decreased profitability, and a negative impact on market share. By leveraging
Artificial Intelligence, Machine Learning (AIML), and Cloud Computing Practices, this
project aims to develop a predictive model for customer churn using Amazon Connect and
Amazon SageMaker.
The project is motivated by the need to predict customer churn in real-time, enabling
businesses to proactively identify at-risk customers, address customer pain points, and
implement strategies to lower churn rates and increase customer retention. By utilizing
machine learning and data analysis, the project seeks to provide businesses with a powerful
tool to forecast customer churn, thereby helping them improve customer satisfaction, increase
profitability, and maintain a competitive edge in the market.
The project also emphasizes the importance of real-time insights, which enable companies to
take immediate action to prevent churn. By proactively reaching out to at-risk customers with
targeted support or product enhancements, businesses can reduce churn rates and retain
valuable customers.
Overall, the Real-time Churn Prediction project is motivated by the need to provide businesses
with a comprehensive solution for predicting and mitigating customer churn, thereby
improving customer satisfaction, increasing profitability, and maintaining a competitive edge
in the market.
6. Objectives
Main Objective:
1. Develop a real-time churn prediction system using Amazon Connect and Amazon
SageMaker that can accurately predict customer churn in various industries.
Sub-Objectives:
1. Data Collection and Preparation: Collect and process large amounts of customer
data, including demographic information, purchase history, customer behaviour, and
social media activity, using AWS cloud services.
2. Feature Engineering: Identify and select the most predictive factors that indicate
potential customer churn, such as recent complaints, lack of usage, and job or life
changes.
3. Model Development and Validation: Develop and validate a predictive model using
advanced machine learning algorithms, such as decision trees, random forests, and
neural networks, to ensure accurate and reliable predictions.
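The validation part of sub-objective 3 can be made concrete with a small holdout check. The sketch below is illustrative only: it assumes the model outputs a churn probability per customer, uses a hypothetical decision threshold of 0.5, and runs on made-up labels and scores, not project data. The function name churn_validation_metrics is likewise an assumption.

```python
from typing import Sequence


def churn_validation_metrics(y_true: Sequence[int], y_prob: Sequence[float],
                             threshold: float = 0.5) -> dict:
    """Precision, recall and F1 for the churn class (label 1) on a holdout set."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    # Convert predicted probabilities into hard churn / no-churn decisions.
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Illustrative holdout labels (1 = churned) and model scores; not project data.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1]
metrics = churn_validation_metrics(labels, scores)
print(metrics)
```

Recall is usually the metric to watch for churn, since a missed churner (false negative) is a customer who receives no retention effort at all.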
7. Methodology
The project builds a real-time churn prediction system using Amazon Connect and Amazon
SageMaker that can accurately predict customer churn in various industries. Its methodology
covers data collection and preparation, feature engineering, model development and
validation, real-time insights and intervention, cost savings and resource allocation,
enhanced customer experience, competitive advantage, continuous improvement, addressing data
challenges, use case tailoring, and design of the churn prediction workflow.
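The real-time part of this methodology depends on scoring a customer while a contact is still in progress. The following is a minimal sketch of that call, not the project's actual implementation. It assumes the trained model is hosted on a SageMaker endpoint that accepts one text/csv feature row and returns a single number; the endpoint name and feature values are hypothetical. In an Amazon Connect deployment, a function like this would typically run inside an AWS Lambda function invoked from the contact flow.

```python
from typing import Any, Sequence


def score_customer(runtime_client: Any, endpoint_name: str,
                   features: Sequence[float]) -> float:
    """Send one feature row to a SageMaker endpoint and return the churn score.

    Assumes the endpoint accepts a text/csv row and answers with a single
    number (for example, a binary classifier returning a probability).
    """
    payload = ",".join(str(value) for value in features)
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=payload,
    )
    # The response body is a stream of bytes; decode and parse the number.
    return float(response["Body"].read().decode("utf-8").strip())


# Usage with real AWS credentials (the endpoint name is hypothetical):
#   import boto3
#   client = boto3.client("sagemaker-runtime")
#   probability = score_customer(client, "churn-endpoint", [3, 0.12, 45])
```

Passing the client in as an argument keeps the function testable without AWS access, because a stub object with an invoke_endpoint method can stand in for the real client.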
Each Step:
1. Data Collection and Preparation: Large amounts of customer data are collected from
various sources, including demographic information, purchase history, customer
behaviour, and social media activity. The data is then processed and prepared for
analysis using AWS cloud services.
2. Feature Engineering: Feature engineering involves identifying and selecting the most
predictive factors that indicate potential customer churn, such as recent complaints, lack
of usage, and job or life changes. These features are then used to develop a predictive
model.
5. Cost Savings and Resource Allocation: Cost savings and resource allocation involve
reducing customer acquisition costs by focusing on retaining existing customers,
improving customer retention strategies, and allocating resources more efficiently. This
is achieved by using the predictive model to identify at-risk customers and implementing
targeted retention strategies.
10. Use Case Tailoring: Use case tailoring involves identifying the most relevant use cases
for churn prediction, such as telecommunications, software-as-a-service providers,
retail, subscription-based businesses, financial institutions, marketing, and human
resource management. The prediction model and application are then tailored to the
specific use case so that they meet the company's needs, goals, and expectations.
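The feature engineering step above names recent complaints and lack of usage as churn signals. The sketch below shows one way such signals might be derived from a raw customer record. The CustomerRecord fields, the 30-day windows, and the sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class CustomerRecord:
    """Hypothetical raw record; field names are illustrative only."""
    customer_id: str
    last_usage: date
    complaint_dates: List[date]
    sessions_last_30d: int
    sessions_prev_30d: int


def build_features(record: CustomerRecord, today: date,
                   complaint_window_days: int = 30) -> dict:
    """Turn one raw record into the churn signals named in the text."""
    # Count complaints filed within the recent window.
    recent_complaints = sum(
        1 for d in record.complaint_dates
        if 0 <= (today - d).days <= complaint_window_days
    )
    # Relative change in usage; 0.0 when there is no earlier usage to compare.
    if record.sessions_prev_30d:
        usage_change = ((record.sessions_last_30d - record.sessions_prev_30d)
                        / record.sessions_prev_30d)
    else:
        usage_change = 0.0
    return {
        "days_since_last_usage": (today - record.last_usage).days,
        "recent_complaints": recent_complaints,
        "usage_change": usage_change,
    }


record = CustomerRecord("C-001", date(2024, 3, 1),
                        [date(2024, 3, 20), date(2024, 1, 5)], 4, 10)
features = build_features(record, today=date(2024, 3, 31))
print(features)
```

Job or life changes, the third signal mentioned, usually come from external or self-reported data and are left out of this sketch.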
8. Expected Result
1. Not tracking the product champion: Failing to monitor the behaviour of the most
influential user of the product can lead to a false reading of customer health and a
missed chance to identify a potential churn signal.
2. Not normalizing usage rate for population: Failing to adjust the usage rate for the
number of people using the product at a given company can result in an incorrect
interpretation of customer health and a missed churn signal.
3. Ignoring customer feedback: Disregarding customer complaints or not investigating
customer pain points can lead to a failure to address the root cause of churn and a lack
of improvement in customer satisfaction and retention.
4. Using inaccurate data: Using inaccurate data to measure customer churn can result in
incorrect conclusions and a failure to take appropriate action to reduce churn.
5. Not comparing churn rates to industry averages: Failing to compare churn rates to
industry averages can make it difficult to determine whether the churn rate is acceptable
or whether action needs to be taken to reduce it.
6. Not taking into account the customer's lifetime value: Measuring churn as a
percentage of customers who cancel their service within a certain period of time without
considering the revenue generated by those customers can result in an inaccurate
measure of churn.
7. Choosing the wrong measurement method: Choosing the wrong method for
measuring churn, such as looking at the churn rate as a percentage of total accounts lost
instead of the amount of revenue lost, can result in incorrect conclusions and a failure
to take appropriate action to reduce churn.
8. Not taking action on churn data: Measuring churn without taking action based on the
data can result in a failure to reduce churn and a loss of revenue in the long run.
10. Using new customers to reduce churn: Overlooking lost customers and focusing only
on total customer and/or revenue growth can result in a failure to address the root cause
of churn and a lack of improvement in customer retention.
11. Not analysing churn rates properly: Failing to identify a churn rate benchmark for
the business can make it difficult to determine whether the churn rate is acceptable or
whether action needs to be taken to reduce it.
13. Not using machine learning techniques or survival analysis: Failing to use machine
learning techniques or survival analysis, which are the two most popular broad
approaches to churn modelling, can result in a failure to accurately predict which
customers are likely to churn.
14. Ignoring domain knowledge, skill, and creativity in feature construction: Failing
to use domain knowledge, skill, and creativity in constructing a robust feature set with
information that is predictive of a churn event can result in a less effective churn model.
15. Not addressing target leakage, unavailable or missing information, or the need for
optimal feature transformations: These roadblocks can arise during feature
construction and can impact the effectiveness of the churn model.
To avoid these mistakes, it is important to use accurate data, compare churn rates to industry
averages, take into account the customer's lifetime value, choose the right measurement
method, take action based on churn data, pair churn with engagement metrics, use machine
learning techniques or survival analysis, and use domain knowledge, skill, and creativity in
feature construction.
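Two of the mistakes above are arithmetic ones: measuring churn by accounts when revenue tells a different story, and reading raw usage without adjusting for the number of users at a company. A minimal sketch of the three calculations follows; the function names and all figures are made up for illustration.

```python
def customer_churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Share of accounts lost during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


def revenue_churn_rate(revenue_at_start: float, revenue_lost: float) -> float:
    """Share of recurring revenue lost during the period."""
    if revenue_at_start <= 0:
        raise ValueError("revenue_at_start must be positive")
    return revenue_lost / revenue_at_start


def usage_per_seat(total_sessions: int, active_seats: int) -> float:
    """Usage normalized by the number of people using the product."""
    if active_seats <= 0:
        raise ValueError("active_seats must be positive")
    return total_sessions / active_seats


# Illustrative period: few accounts lost, but they were large ones.
account_churn = customer_churn_rate(customers_at_start=200, customers_lost=10)
revenue_churn = revenue_churn_rate(revenue_at_start=100_000, revenue_lost=12_000)

# Same raw usage, very different health once seats are considered.
small_team = usage_per_seat(total_sessions=500, active_seats=50)
large_team = usage_per_seat(total_sessions=500, active_seats=250)
```

In this example the account-based rate (10 of 200 accounts) looks modest, while the revenue-based rate (12,000 of 100,000) is more than twice as high, which is why the choice of measurement method matters.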
1. Churn Rate: The percentage of customers who stop using a company's product or
service within a given time frame. It is a fundamental metric that directly reflects
customer satisfaction and loyalty.
2. Customer Health Score: A metric that assesses the overall health of customers based
on various factors like usage patterns, engagement levels, and satisfaction indicators,
helping to identify at-risk customers and prioritize retention efforts.
3. Customer Lifetime Value (CLV): The predicted net profit a company expects to earn
from a customer over their entire relationship. It is crucial for understanding the
value of retaining customers and for guiding retention strategies.
4. Red Flag Metrics: Specific behaviours or actions that indicate a high risk of churn,
such as unfollowing on social media, uninstalling an app, or reduced usage, are essential
for early identification of potential churners.
By monitoring these key metrics closely, businesses can gain valuable insights into customer
behaviour, predict churn risk accurately, and implement targeted strategies to reduce churn and
increase customer retention effectively.
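The Customer Health Score and CLV metrics can be approximated with simple formulas. The sketch below rests on explicit assumptions that do not come from the project: the health score is a weighted average of three indicators already scaled to the range 0 to 1, with hypothetical weights, and CLV uses the common simplification of margin per period divided by churn rate, which assumes constant margin, constant churn, and no discounting.

```python
from typing import Sequence


def health_score(usage: float, engagement: float, satisfaction: float,
                 weights: Sequence[float] = (0.4, 0.3, 0.3)) -> float:
    """Weighted average of three 0..1 indicators, returned on a 0..100 scale.

    The default weights are hypothetical and would need tuning per business.
    """
    indicators = (usage, engagement, satisfaction)
    if any(not 0.0 <= value <= 1.0 for value in indicators):
        raise ValueError("indicators must be scaled to the range 0..1")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return 100.0 * sum(w * v for w, v in zip(weights, indicators))


def simple_clv(margin_per_period: float, churn_rate: float) -> float:
    """Margin per period divided by churn rate per period.

    Under constant churn the expected customer lifetime is 1 / churn_rate
    periods, so this is margin multiplied by expected lifetime.
    """
    if not 0.0 < churn_rate <= 1.0:
        raise ValueError("churn_rate must be in the range (0, 1]")
    return margin_per_period / churn_rate


# Illustrative values only.
score = health_score(usage=0.5, engagement=0.8, satisfaction=0.9)
clv_low_churn = simple_clv(margin_per_period=50.0, churn_rate=0.05)
clv_high_churn = simple_clv(margin_per_period=50.0, churn_rate=0.10)
```

Halving the churn rate doubles the simplified CLV, which illustrates why retention has such a direct effect on customer value.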
11. Case Study:
Churn Case Study: Identifying Flight Risks with Airlines
• The Problem: Airlines constantly face customer churn, where passengers defect to
competitors. Acquiring new customers is expensive, so retaining existing ones is
crucial.
• The Company: Consider a hypothetical major airline, "Sky High Airlines" (SHA), whose
revenue is affected by customer churn.
• The Solution: Churn Modelling. SHA implemented a data science approach to predict
customer churn.
Taking Action:
After selecting the best model, SHA can:
1. Identify high-risk customers predicted to churn soon.
2. Develop targeted retention strategies for these at-risk customers.
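The first action, identifying high-risk customers, amounts to filtering and ranking model scores. A minimal sketch follows, assuming the selected model outputs a churn probability per passenger. The 0.7 threshold, the passenger IDs, and the scores are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def high_risk_customers(churn_scores: Dict[str, float],
                        threshold: float = 0.7,
                        limit: Optional[int] = None) -> List[Tuple[str, float]]:
    """Return customers at or above the threshold, highest risk first."""
    flagged = [(cid, score) for cid, score in churn_scores.items()
               if score >= threshold]
    flagged.sort(key=lambda item: item[1], reverse=True)
    # Optionally cap the list, e.g. to match a retention team's capacity.
    return flagged if limit is None else flagged[:limit]


# Illustrative churn probabilities per passenger; not real data.
scores = {"P-101": 0.91, "P-102": 0.35, "P-103": 0.74, "P-104": 0.68}
at_risk = high_risk_customers(scores)
print(at_risk)
```

The resulting list is what a retention team would work through when applying the targeted strategies in the second action.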
The Benefits:
By effectively using churn models, SHA can:
1. Reduce customer churn, leading to higher customer lifetime value.
2. Optimize marketing campaigns by targeting high-value customers.
3. Improve customer satisfaction through proactive interventions.
Real-World Examples:
1. Airlines like American Airlines [US airlines churn reduction] and Emirates [Emirates
churn case study] have reportedly used churn modelling to reduce customer defection.
Conclusion:
1. Churn modelling empowers companies like SHA to predict and prevent customer
churn. By analysing customer data and taking strategic actions, businesses can retain
valuable customers and ensure long-term success.