Chapter 2

The document discusses RFM (Recency, Frequency, Monetary) segmentation for customer analysis using Python. It outlines the process of calculating RFM metrics, grouping customers into segments based on percentiles, and assigning labels to these segments. Additionally, it demonstrates how to analyze and categorize customers into segments like Gold, Silver, and Bronze based on their RFM scores.


Recency, frequency, monetary (RFM) segmentation
CUSTOMER SEGMENTATION IN PYTHON

Karolis Urbonas
Head of Data Science, Amazon
What is RFM segmentation?
Behavioral customer segmentation based on three metrics:

Recency (R)

Frequency (F)
Monetary Value (M)


Grouping RFM values
The RFM values can be grouped in several ways:

Percentiles e.g. quantiles

Pareto 80/20 cut


Custom - based on business knowledge

We are going to implement percentile-based grouping.
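
For contrast with the percentile approach used in the rest of this chapter, a Pareto-style cut might be sketched as below. The spend values are made up, and the reading of "80/20" used here (flag the highest spenders who together account for the first 80% of total spend) is one common interpretation, not necessarily the one the course has in mind.

```python
import pandas as pd

# Hypothetical spend per customer (values invented for illustration)
spend = pd.Series([400, 300, 90, 80, 60, 40, 20, 10],
                  index=pd.Index(range(1, 9), name='CustomerID'),
                  name='Spend')

# Rank customers from highest to lowest spend
ordered = spend.sort_values(ascending=False)
# Running share of total spend
cum_share = ordered.cumsum() / ordered.sum()
# Customers whose cumulative share stays within 80% form the top group
pareto_group = cum_share.le(0.8).map({True: 'Top', False: 'Rest'})
```

Unlike quartiles, the two groups here are usually very unequal in size, which is the point of the Pareto view.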


Short review of percentiles
Process of calculating percentiles:

1. Sort customers based on that metric

2. Break customers into a pre-defined number of groups of equal size


3. Assign a label to each group
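
The three steps above can be sketched by hand before reaching for a library helper. The DataFrame below is hypothetical (eight customers with invented spend values), and the column name Spend_Group is chosen only for this illustration.

```python
import pandas as pd

# Hypothetical data: eight customers with made-up spend values
data = pd.DataFrame({
    'CustomerID': range(1, 9),
    'Spend': [137, 335, 172, 355, 303, 233, 244, 229],
})

# 1. Sort customers by the metric
ranked = data.sort_values('Spend').reset_index(drop=True)

# 2. Break them into a pre-defined number of equal-sized groups
n_groups = 4
group_size = len(ranked) // n_groups  # two customers per group here

# 3. Assign a label (1 = lowest spend, 4 = highest) to each group
ranked['Spend_Group'] = ranked.index // group_size + 1
```

This only works cleanly when the number of customers divides evenly into the groups and there are no ties; pd.qcut handles the general case.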


Calculate percentiles with Python
Example data: eight customers (CustomerID), each with a randomly generated Spend value.

spend_quartiles = pd.qcut(data['Spend'], q=4, labels=range(1,5))
data['Spend_Quartile'] = spend_quartiles
data.sort_values('Spend')


Assigning labels
Give the highest score to the best value of the metric. Best is not always highest: for recency, fewer days since the last purchase is better.
In this case the labels are inverted, so the more recent the customer, the higher the score.

# Create numbered labels
r_labels = list(range(4, 0, -1))
# Divide into groups based on quartiles
recency_quartiles = pd.qcut(data['Recency_Days'], q=4, labels=r_labels)
# Create new column
data['Recency_Quartile'] = recency_quartiles
# Sort recency values from lowest to highest
data.sort_values('Recency_Days')
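
To make the snippet above runnable end to end, here is a minimal sketch with invented Recency_Days values for eight customers. Only the data is an assumption; the pd.qcut call mirrors the slide.

```python
import pandas as pd

# Hypothetical data: days since each customer's last purchase
data = pd.DataFrame({
    'CustomerID': range(1, 9),
    'Recency_Days': [37, 235, 396, 72, 255, 393, 203, 133],
})

# Labels run 4, 3, 2, 1 so the lowest recency (most recent) gets the top score
r_labels = list(range(4, 0, -1))
data['Recency_Quartile'] = pd.qcut(data['Recency_Days'], q=4, labels=r_labels)

# Most recent customers appear first and carry the highest label
result = data.sort_values('Recency_Days')
```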

In the output, the quartile labels are reversed, since more recent customers are more valuable.


Custom labels
We can define a list of strings, or of any other values, depending on the use case.

# Create string labels
r_labels = ['Active', 'Lapsed', 'Inactive', 'Churned']
# Divide into groups based on quartiles
recency_quartiles = pd.qcut(data['Recency_Days'], q=4, labels=r_labels)
# Create new column
data['Recency_Quartile'] = recency_quartiles
# Sort values from lowest to highest
data.sort_values('Recency_Days')

Result: the custom labels are assigned to each quartile.


Let's practice with percentiles!

Calculating RFM metrics
Definitions
Recency - days since last customer transaction
Frequency - number of transactions in the last 12 months

Monetary Value - total spend in the last 12 months


Dataset and preparations
Same online dataset as in the previous lessons
Needs some data preparation

New TotalSum column = Quantity x UnitPrice
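
As a minimal sketch of that preparation step, assuming the online DataFrame has one row per invoice line with Quantity and UnitPrice columns (the three rows below are invented):

```python
import pandas as pd

# Hypothetical invoice lines standing in for the course's online dataset
online = pd.DataFrame({
    'InvoiceNo': ['A1', 'A1', 'B1'],
    'Quantity': [2, 1, 10],
    'UnitPrice': [3.0, 10.0, 0.5],
})

# Line-level revenue: quantity times unit price
online['TotalSum'] = online['Quantity'] * online['UnitPrice']
```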


Data preparation steps
We're starting with a pre-processed online DataFrame that contains only the latest 12 months of data:

print('Min:{}; Max:{}'.format(min(online.InvoiceDate),
                              max(online.InvoiceDate)))

Min:2010-12-10; Max:2011-12-09

Let's create a hypothetical snapshot_date, one day after the latest invoice, as if we were running the analysis right after the data ends.

import datetime
snapshot_date = max(online.InvoiceDate) + datetime.timedelta(days=1)
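
The effect of adding one day can be seen on a small example: the most recent customer ends up with a recency of 1 rather than 0. The rows below are invented, and the vectorized subtraction is just one way to compute the value, not the approach used on the next slide.

```python
import datetime
import pandas as pd

# Hypothetical transactions for two customers
online = pd.DataFrame({
    'CustomerID': [1, 1, 2],
    'InvoiceDate': pd.to_datetime(['2011-11-01', '2011-12-09', '2010-12-10']),
})

# Snapshot is the day after the latest invoice in the data
snapshot_date = online['InvoiceDate'].max() + datetime.timedelta(days=1)

# Days between the snapshot and each customer's last purchase
last_purchase = online.groupby('CustomerID')['InvoiceDate'].max()
recency = (snapshot_date - last_purchase).dt.days
```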


Calculate RFM metrics
# Aggregate data on a customer level
datamart = online.groupby(['CustomerID']).agg({
    'InvoiceDate': lambda x: (snapshot_date - x.max()).days,
    'InvoiceNo': 'count',
    'TotalSum': 'sum'})
# Rename columns for easier interpretation
datamart.rename(columns={'InvoiceDate': 'Recency',
                         'InvoiceNo': 'Frequency',
                         'TotalSum': 'MonetaryValue'}, inplace=True)
# Check the first rows
datamart.head()
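
One detail worth checking in your own data: 'count' counts rows, so if the dataset has one row per invoice line (an assumption here, since the slides do not show the raw table), Frequency is the number of lines purchased rather than the number of distinct invoices. The hypothetical sketch below shows the difference; which one you want depends on how you define a transaction.

```python
import pandas as pd

# Hypothetical data: customer 1 has two invoices, one of them with two lines
online = pd.DataFrame({
    'CustomerID': [1, 1, 1, 2],
    'InvoiceNo': ['A1', 'A1', 'A2', 'B1'],
})

# 'count' counts rows; 'nunique' counts distinct invoice numbers
freq = online.groupby('CustomerID')['InvoiceNo'].agg(['count', 'nunique'])
```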


Final RFM values
Our table for RFM segmentation is completed!


Let's practice calculating RFM values!

Building RFM segments
Data
The dataset we created previously
We will calculate a quartile value for each column and name them R, F, M


Recency quartile
r_labels = range(4, 0, -1)
r_quartiles = pd.qcut(datamart['Recency'], 4, labels = r_labels)
datamart = datamart.assign(R = r_quartiles.values)


Frequency and monetary quartiles
f_labels = range(1,5)
m_labels = range(1,5)
f_quartiles = pd.qcut(datamart['Frequency'], 4, labels = f_labels)
m_quartiles = pd.qcut(datamart['MonetaryValue'], 4, labels = m_labels)
datamart = datamart.assign(F = f_quartiles.values)
datamart = datamart.assign(M = m_quartiles.values)
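
A practical caveat, sketched under the assumption that many customers share the same Frequency value (the series below is invented): pd.qcut raises a ValueError when tied values make two bin edges identical. One common workaround is to rank the values first so that ties are broken by order of appearance.

```python
import pandas as pd

# Hypothetical frequencies with many ties at 1
frequency = pd.Series([1, 1, 1, 1, 1, 2, 3, 10], name='Frequency')
f_labels = list(range(1, 5))

# Direct quartiles fail here because several bin edges equal 1
try:
    pd.qcut(frequency, 4, labels=f_labels)
    direct_qcut_failed = False
except ValueError:
    direct_qcut_failed = True

# Ranking with method='first' gives every customer a distinct rank,
# so the quartile edges are unique and the groups are equal in size
f_quartiles = pd.qcut(frequency.rank(method='first'), 4, labels=f_labels)
```

The trade-off is that customers with identical values can land in different quartiles, so this is a convenience rather than a more accurate grouping.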


Build RFM segment and RFM score
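Concatenate RFM quartile values to RFM_Segment
Sum RFM quartile values to RFM_Score

def join_rfm(x):
    # Cast to int so each digit prints as '4' even if the row holds floats
    return str(int(x['R'])) + str(int(x['F'])) + str(int(x['M']))

datamart['RFM_Segment'] = datamart.apply(join_rfm, axis=1)
# pd.qcut returns categorical columns; convert to int before summing
datamart['RFM_Score'] = datamart[['R','F','M']].astype(int).sum(axis=1)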
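
The same two columns can also be built without a row-wise apply. The sketch below uses a tiny hypothetical datamart with the R, F and M quartiles already filled in; plain integers are used here, and .astype(int) would perform the same conversion on the categorical columns that pd.qcut returns.

```python
import pandas as pd

# Hypothetical quartile values for three customers
datamart = pd.DataFrame(
    {'R': [4, 1, 3], 'F': [4, 1, 2], 'M': [4, 1, 3]},
    index=pd.Index([101, 102, 103], name='CustomerID'),
)

rfm = datamart[['R', 'F', 'M']].astype(int)

# Segment: the three quartile digits side by side, e.g. '444'
datamart['RFM_Segment'] = (rfm['R'].astype(str)
                           + rfm['F'].astype(str)
                           + rfm['M'].astype(str))
# Score: the sum of the three quartile values, from 3 to 12
datamart['RFM_Score'] = rfm.sum(axis=1)
```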


Final result


Let's practice building RFM segments

Analyzing RFM segments
Largest RFM segments
datamart.groupby('RFM_Segment').size().sort_values(ascending=False)[:10]


Filtering on RFM segments
Select the bottom RFM segment "111" and view the first 5 rows

datamart[datamart['RFM_Segment']=='111'][:5]


Summary metrics per RFM score
datamart.groupby('RFM_Score').agg({
    'Recency': 'mean',
    'Frequency': 'mean',
    'MonetaryValue': ['mean', 'count']}).round(1)


Grouping into named segments
Use RFM score to group customers into Gold, Silver and Bronze segments.

def segment_me(df):
    if df['RFM_Score'] >= 9:
        return 'Gold'
    elif (df['RFM_Score'] >= 5) and (df['RFM_Score'] < 9):
        return 'Silver'
    else:
        return 'Bronze'

datamart['General_Segment'] = datamart.apply(segment_me, axis=1)
datamart.groupby('General_Segment').agg({
    'Recency': 'mean',
    'Frequency': 'mean',
    'MonetaryValue': ['mean', 'count']}).round(1)
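
The same thresholds can be expressed with pd.cut instead of a row-wise function. This is an alternative sketch, not the course's method; the scores below are invented, and the bin edges assume RFM_Score is an integer between 3 and 12.

```python
import pandas as pd

# Hypothetical RFM scores covering the boundaries of each segment
scores = pd.DataFrame({'RFM_Score': [3, 4, 5, 8, 9, 12]})

# Right-closed bins: (2, 4] Bronze, (4, 8] Silver, (8, 12] Gold
scores['General_Segment'] = pd.cut(
    scores['RFM_Score'],
    bins=[2, 4, 8, 12],
    labels=['Bronze', 'Silver', 'Gold'],
)
```

Keeping the thresholds in one list makes them easier to adjust when the business definition of Gold, Silver and Bronze changes.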


New segments and their values


Practice building custom segments