Applied Recommender Systems with Python
Build Recommender Systems with Deep Learning, NLP and Graph-Based Techniques

Akshay Kulkarni
Adarsha Shivananda
Anoosh Kulkarni
V Adithya Krishnan
Applied Recommender Systems with Python: Build Recommender Systems with
Deep Learning, NLP and Graph-Based Techniques

Akshay Kulkarni, Bangalore, Karnataka, India
Adarsha Shivananda, Hosanagara tq, Shimoga dt, Karnataka, India
Anoosh Kulkarni, Bangalore, India
V Adithya Krishnan, Navi Mumbai, India
About the Authors
Akshay R. Kulkarni is an artificial intelligence (AI) and
machine learning (ML) evangelist and a thought leader. He
has consulted with several Fortune 500 and global enterprises to
drive AI and data science–led strategic transformations. He is
a Google developer, an author, and a regular speaker at major
AI and data science conferences (including the O'Reilly Strata
Data & AI Conference and the Great International Developer
Summit (GIDS)). He is a visiting faculty member at some of
the top graduate institutes in India. In 2019, he was featured
as one of India’s “top 40 under 40” data scientists. In his spare
time, Akshay enjoys reading, writing, coding, and helping
aspiring data scientists. He lives in Bangalore with his family.
About the Technical Reviewer
Krishnendu Dasgupta is co-founder of DOCONVID AI. He
is a computer science and engineering graduate with a
decade of experience building solutions and platforms on
applied machine learning. He has worked with NTT DATA,
PwC, and Thoucentric and is now working on applied AI
research in medical imaging and decentralized privacy-preserving
machine learning in healthcare. Krishnendu is
an alumnus of the MIT Entrepreneurship and Innovation
Bootcamp and devotes his free time as an applied AI and
ML research volunteer for various research NGOs and
universities across the world.
Preface
This book is dedicated to data scientists who are starting new recommendation engine
projects from scratch but don’t have prior experience in this domain. They can easily
learn concepts and gain practical knowledge with this book. Recommendation engines
have recently gained a lot of traction and popularity in different domains and have a
proven track record for increasing sales and revenue.
This book is divided into eleven chapters. The first section, Chapters 1 and 2, covers
basic approaches. The following section, which consists of Chapters 3, 4, 5, and 6, covers
popular methods, including collaborative filtering-based, content-based, and hybrid
recommendation systems. The next section, Chapters 7 and 8, discusses implementing
systems using state-of-the-art machine learning algorithms. Chapters 9, 10, and 11
discuss trending and emerging techniques in recommendation systems.
The code for the implementations in each chapter and the required datasets are
available on GitHub at github.com/apress/applied-recommender-systems-python.
To successfully perform all the projects in this book, you need Python 3.x or higher
running on any Windows- or Unix-based operating system with a processor of 2.0 GHz
or higher and a minimum of 4 GB RAM. You can download Python from Anaconda and
leverage a Jupyter notebook for all coding purposes. This book assumes you know the
basics of Keras and how to install basic machine learning and deep learning libraries.
Please upgrade or install the latest versions of all the libraries.
CHAPTER 1
Introduction to Recommendation Systems
In today’s world, customers are faced with multiple choices for every decision. Let’s
assume that a person is looking for a book to read without any specific idea of what
they want. There’s a wide range of possibilities for how their search might pan out. They
might waste a lot of time browsing the Internet and trawling through various sites hoping
to strike gold. They might look for recommendations from other people.
But if there was a site or app that could recommend books to this customer based
on what they’d read previously, that would save time that would otherwise be spent
searching for books of interest on various sites. In short, our main goal is to recommend
things based on the user’s interests. And that’s what recommendation engines do.
A recommendation engine, also known as a recommender system or a
recommendation system, is one of the most widely used machine learning applications;
for example, it is used by companies like Amazon, Netflix, Google, and Goodreads.
This chapter explains recommendation systems and presents various
recommendation engine algorithms and the fundamentals of creating them in Python
3.8 or greater using a Jupyter notebook.
The following are the recommendation engine approaches covered in this book.
• Content-based filtering
• Collaborative-based filtering
• Hybrid systems
• ML clustering
• ML classification
included butter and chips. For a product recommendation, 50% confidence may be
perfectly acceptable, but this level may not be high enough in a medical situation.
Lift is the ratio of observed support to expected support if the two rules were
independent. As a rule of thumb, a lift value close to 1 means that the rules are
completely independent. Lift values greater than 1 are more "interesting" and could
indicate a useful rule pattern. Figure 1-1 illustrates how support, confidence, and lift are calculated.
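As a quick illustration (this snippet is not from the book's code, and the counts are hypothetical), the three metrics can be computed directly from transaction counts:

# Hypothetical counts for two items A and B across N transactions
N = 1000          # total transactions
count_A = 120     # transactions containing A
count_B = 200     # transactions containing B
count_AB = 60     # transactions containing both A and B

support_A = count_A / N                       # 0.12
support_B = count_B / N                       # 0.20
support_AB = count_AB / N                     # 0.06
confidence_A_to_B = support_AB / support_A    # 0.50
lift = confidence_A_to_B / support_B          # 2.5 (> 1, an "interesting" rule)
print(support_AB, confidence_A_to_B, lift)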
Content-Based Filtering
The content-based filtering method is a recommendation algorithm that suggests
items similar to the ones the user has previously selected or shown interest in. It
recommends based on the actual content of the item. For example, as shown in
Figure 1-2, a new article is recommended based on the text present in the articles.
Let’s look at the popular example of Netflix and its recommendations to explore the
workings in detail. Netflix saves all user viewing information in a vector-based format,
known as the profile vector, which contains information on past viewings, liked and
disliked shows, most frequently watched genres, star ratings, and so forth. Then there
is another vector that stores all the information regarding the titles (movies and shows)
available on the platform, known as the item vector. This vector stores information like
the title, actors, genre, language, length, crew info, synopsis, and so forth.
The content-based filtering algorithm uses the concept of cosine similarity. In it, you
find the cosine of the angle between two vectors—the profile and item vectors in this
case. Suppose A is the profile vector and B is the item vector, then the (cosine) similarity
between them is calculated as follows.
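In standard form, for profile vector A and item vector B:

$$ \text{sim}(A,B) = \cos\theta = \frac{A \cdot B}{\|A\|\,\|B\|} $$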
The outcome (i.e., the cosine value) always ranges between –1 and 1, and this
value is calculated for multiple item vectors (movies), keeping the profile vector (user)
constant. The items/movies are then ranked in descending order of similarity, and either
of the two following approaches is used for recommendations.
• In a top N approach, the top N movies are recommended, where N is
a threshold on the number of titles recommended.
The following are other methods popularly used in calculating the similarity.

• Euclidean distance

• Manhattan distance
The major downside to this recommendation engine is that all suggestions fall into
the same category, and it becomes somewhat monotonous. As the suggestions are based
on what the user has seen or liked, we’ll never get new recommendations that the user
has not explored in the past. For example, if the user has only seen mystery movies, the
engine will only recommend more mystery movies.
To improve on this, you need a recommendation engine that not only gives
suggestions based on the content but also on the behavior of users and on what other
like-minded users are watching.
Collaborative-Based Filtering
In collaborative-based filtering recommendation engines, a user-to-user
similarity is also considered, along with item similarities, to address some of the
drawbacks of content-based filtering. Simply put, a collaborative filtering
system recommends an item to user A based on the interests of a similar user B.
Figure 1-4 shows a simple working mechanism of collaborative-based filtering.
The similarity between users can again be calculated using any of the techniques mentioned
earlier. A user-item matrix is created individually for each customer, which stores the
user’s preference for an item. Taking the same example of Netflix’s recommendation
engine, the user aspects like previously watched and liked titles, ratings provided (if any)
by the user, frequently watched genres, and so on are stored and used to find similar
users. Once these similar users are found, the engine recommends titles that the user has
not yet watched but users with similar interests have watched and liked.
This type of filtering is quite popular because it is only based on a user’s past
behavior, and no additional input is required. It’s used by many major companies,
including Amazon, Netflix, and American Express.
One drawback of this method is that an item cannot be recommended until some
ratings have been provided for it. And reliable recommendations can be
tough to get if a user has only rated a few items.
Hybrid Systems
So far, you have seen how content-based and collaborative-based recommendation
engines work and their respective pros and cons. But the hybrid recommendation system
combines content and collaborative-based filtering methods.
Hybrid recommendation systems can overcome the drawbacks of both content-based
and collaborative-based filtering to form one powerful recommendation system. Both
individual methods fail to perform well when there is a lack of data to learn the
relation between users and items; the hybrid approach overcomes this.
Figure 1-5 shows a simple working mechanism of the hybrid
recommendation system.
ML Clustering
In today's world, AI has become an integral part of all automation and technology-based
solutions, and the area of recommendation systems is no different. Machine learning-
based methods are promising and are quickly becoming the
norm as more and more companies start adopting AI.
Machine learning methods are of two types: unsupervised and supervised. This
section discusses the unsupervised learning method, which is the ML clustering–based
method. The unsupervised learning technique uses ML algorithms to find hidden patterns
in data to cluster them without human intervention (unlabeled data). Clustering is the
grouping of similar objects into clusters. On average, an object belonging to one cluster is
more similar to an object within that cluster than to an object belonging to another cluster.
• fuzzy mapping
ML Classification
Again, clustering comes with its disadvantages. That’s where a classification-based
recommendation system comes into play.
In the classification-based approach, the algorithm uses features of both items and users to predict
whether a user will like a product. An application of the classification-based
method is the buyer propensity model.
Propensity modeling predicts the likelihood that a customer will buy a particular item
or perform an equivalent action. For example, propensity modeling can help predict the
likelihood that a sales lead will convert to a customer based on various features.
The propensity score or probability is used to take action.
The following are some of the limitations of classification-based algorithms.
• Collecting a combination of data about different users and items is
sometimes difficult.
• Classification is challenging.
• Training the models in real time is challenging.
Deep Learning
Deep learning is a branch of machine learning that is more powerful than traditional ML
algorithms and tends to produce better results. Of course, there are limitations, such as the
need for huge amounts of data and limited explainability, which must be overcome.
Various companies use deep neural networks (DNNs) to enhance the customer
experience, especially if it’s unstructured data like images and text.
The following are three types of deep learning–based recommender systems.
• Restricted Boltzmann machine–based
• Autoencoder based
• Neural attention–based
Later chapters explore how machine learning and deep learning can be leveraged to
build powerful recommender systems.
Now that you have a good understanding of the concepts, let’s start with a
simple rule-based recommender system in this chapter before proceeding to the
implementation in upcoming chapters.
Popularity
A popularity-based rule is the simplest form: a product is recommended based on its
popularity (most sold, most clicked, etc.). Let's implement a quick one. For example, a
song listened to by many people is popular; it is recommended to other users without
any other intelligence being part of it.
Let’s take a retail dataset and implement the same logic.
Fire up a Jupyter notebook and import the necessary packages.
Note Refer to the data in this book’s data section. Download the dataset from the
GitHub link of this book.
# importing the required library (assumed; only pandas is needed here)
import pandas as pd

#import data
df = pd.read_csv('data.csv', encoding='unicode_escape')
df.head()
Figure 1-7 shows the output of the top 5 rows from the dataset.
Figure 1-8 shows that the quantity has some negative values that are part of the
incorrect data, so let's drop them using the following code.
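The dropping code itself did not survive extraction; a minimal sketch, mirroring the filtering used in Chapter 2:

# keep only rows with a positive quantity (assumed reconstruction)
df = df[df['Quantity'] > 0]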
Figure 1-9 shows the output after removing the negative values.
Now that we have cleaned up the data, let's build some basic types of recommendation
systems. These are not intelligent, yet they are effective in some cases. A popularity-based
recommendation could be a trending song, a fast-selling item
required by everyone, a recently released movie that gets traction, or a news article
many users have read.
Sometimes it’s important to keep it simple because it gets you the most revenue.
Let’s build a popularity-based system in the data we are using.
Figure 1-10 shows that PAPER CRAFT is the most bought item across all regions. It’s a
very popular item.
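The aggregation that produces global_popularity did not survive extraction; a plausible sketch, where the imports and the aggregation logic are assumptions:

# assumed imports for the plots in this section
import seaborn as sns
import matplotlib.pyplot as plt

# total quantity sold per item, in descending order (assumed reconstruction)
global_popularity = (df.groupby('Description')[['Quantity']].sum()
                       .sort_values('Quantity', ascending=False))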
global_popularity.reset_index(inplace=True)
sns.barplot(y='Description', x='Quantity', data=global_popularity.head(10))
plt.title('Top 10 Most Popular Items Globally', fontsize=14)
plt.ylabel('Item')
Figure 1-12 shows that PAPER CRAFT, LITTLE BIRDIE is the most purchased item.
It’s very popular only in the United Kingdom.
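The country-level aggregation was lost; a sketch assuming the same logic filtered to one country (uk_popularity is a hypothetical name):

# popularity within a single country (assumed reconstruction)
uk_popularity = (df[df['Country'] == 'United Kingdom']
                 .groupby('Description')[['Quantity']].sum()
                 .sort_values('Quantity', ascending=False))
uk_popularity.head(10)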
Figure 1-13 shows that RABBIT NIGHT LIGHT is the most purchased item. It’s very
popular in the Netherlands.
Buy Again
Now let's discuss buy again. It's another simple recommendation, calculated at
the customer/user level. You might have seen "Watch again" on streaming platforms; it's
the same concept. A certain set of actions is done repeatedly by a customer,
and we recommend the same action the next time.
This is very useful in online grocery platforms because customers come back and
buy the same item again and again.
Let’s implement it.
# Fetching the items bought by the customer for the provided customer ID
items_bought = df_new[df_new['CustomerID'] == customerid].Description
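Only this fragment of the function survived; a minimal end-to-end sketch, assuming df_new holds the cleaned transactions and that the most frequently repeated items are recommended:

def buy_again(customerid):
    # fetch the items bought by the given customer
    items_bought = df_new[df_new['CustomerID'] == customerid].Description
    # count how often each item was bought and recommend the most frequent ones
    most_frequent = items_bought.value_counts().head(10)
    return most_frequent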
buy_again(17850)
Figure 1-14 recommends the holder and the lantern to customer 17850, given that they
often buy these items.
Summary
In this chapter, you learned about recommender systems—how they work, their
applications, and the various implementation types. You also learned about implicit and
explicit types and the differences between them. The chapter also explored market basket
analysis (association rule mining), content-based and collaborative-based filtering, hybrid
systems, ML clustering-based and classification-based methods, and deep learning and
NLP-based recommender systems. Finally, you implemented simple recommender
systems. Many other complex algorithms are explored in upcoming chapters.
CHAPTER 2
Market Basket Analysis (Association Rule Mining)
This chapter explores the implementation of market basket analysis with the help
of an open source e-commerce dataset. You start with the dataset in exploratory data
analysis (EDA) and focus on critical insights. You then learn about the implementation
of various techniques in MBA, plot a graphical representation of the associations, and
draw insights.
Implementation
Let's import the required libraries.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.style
%matplotlib inline
from mlxtend.frequent_patterns import apriori, association_rules
from collections import Counter
from IPython.display import Image
Data Collection
Let’s look at an open source dataset from a Kaggle e-commerce website. Download the
dataset from www.kaggle.com/carrie1/ecommerce-data?select=data.csv.
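The loading code was lost to extraction; a sketch, assuming the Kaggle file is saved locally as data.csv:

# reading the e-commerce transactions (assumed path and encoding)
data = pd.read_csv('data.csv', encoding='unicode_escape')
data.shape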
(541909, 8)
data.head()
data.isnull().sum().sort_values(ascending=False)
CustomerID 135080
Description 1454
Country 0
UnitPrice 0
InvoiceDate 0
Quantity 0
StockCode 0
InvoiceNo 0
dtype: int64
data1 = data.dropna()
data1.describe()
The Quantity column has some negative values, which are part of the incorrect data,
so let’s drop these entries.
The following selects only data in which the quantity is greater than 0.
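The filtering statement itself did not survive; a minimal sketch:

# keep only rows with a positive quantity (assumed reconstruction)
data1 = data1[data1['Quantity'] > 0]
data1.head()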
Figure 2-4 shows the output after filtering the data in the Quantity column.
Loyal Customers
Let’s create a new Amount feature/column, which is the product of the quantity and its
unit price.
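A sketch of the assumed feature creation:

# Amount = Quantity x UnitPrice (assumed reconstruction)
data1['Amount'] = data1['Quantity'] * data1['UnitPrice']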
Now let’s use the group by function to highlight the customers with the greatest
number of orders.
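The aggregation itself was lost; a sketch, assuming the number of unique invoices per customer:

# orders per customer (assumed reconstruction)
orders = data1.groupby('CustomerID')['InvoiceNo'].nunique().reset_index()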
plt.subplots(figsize=(15,6))
plt.style.use('bmh')
The x axis indicates the customer ID, and the y axis indicates the number of orders.
plt.plot(orders.CustomerID, orders.InvoiceNo)
plt.xlabel('Customers ID')
plt.ylabel('Number of Orders')
Let’s use the group by function again to get the customers with the highest amount
spent (invoices).
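A sketch of the assumed aggregation:

# total amount spent per customer (assumed reconstruction)
money_spent = data1.groupby('CustomerID')['Amount'].sum().reset_index()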
plt.subplots(figsize=(15,6))
The x axis indicates the customer ID, and y axis indicates the amount spent.
plt.plot(money_spent.CustomerID, money_spent.Amount)
plt.style.use('bmh')
plt.xlabel('Customers ID')
plt.ylabel('Money spent')
import datetime
data1['InvoiceDate'] = pd.to_datetime(data1.InvoiceDate,
format='%m/%d/%Y %H:%M')
Create new features for the date parts; for example, for the day, Monday=1 ... Sunday=7, as sketched below.
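The feature-engineering statements did not survive extraction; a sketch consistent with the outputs in this section:

# assumed reconstruction of the date-part features used below
data1['year_month'] = data1['InvoiceDate'].dt.year * 100 + data1['InvoiceDate'].dt.month
data1['day'] = data1['InvoiceDate'].dt.dayofweek + 1   # Monday=1 ... Sunday=7
data1['hour'] = data1['InvoiceDate'].dt.hour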
plt.style.use('bmh')
Let’s use group by to extract the number of invoices per year and month.
ax = data1.groupby('InvoiceNo')['year_month'].unique().value_counts().sort_index().plot(kind='bar', figsize=(15,6))
ax.set_xlabel('Month', fontsize=15)
ax.set_ylabel('Number of Orders', fontsize=15)
ax.set_xticklabels(('Dec_10','Jan_11','Feb_11','Mar_11','Apr_11','May_11','Jun_11','July_11','Aug_11','Sep_11','Oct_11','Nov_11','Dec_11'), rotation='horizontal', fontsize=13)
plt.show()
The following checks the number of orders placed on day 6 (Saturday); it returns zero, which is why Saturday is missing from the chart below.

data1[data1['day']==6].shape[0]
ax = data1.groupby('InvoiceNo')['day'].unique().value_counts().sort_index().plot(kind='bar', figsize=(15,6))
ax.set_xlabel('Day', fontsize=15)
ax.set_ylabel('Number of Orders', fontsize=15)
ax.set_xticklabels(('Mon','Tue','Wed','Thur','Fri','Sun'), rotation='horizontal', fontsize=15)
plt.show()
ax = data1.groupby('InvoiceNo')['hour'].unique().value_counts().iloc[:-1].sort_index().plot(kind='bar', figsize=(15,6))
ax.set_xlabel('Hour', fontsize=15)
ax.set_ylabel('Number of Orders', fontsize=15)
Provide X tick labels (all orders are placed between hours 6 and 20).
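A sketch of the assumed tick labels:

ax.set_xticklabels(range(6, 21), rotation='horizontal', fontsize=15)
plt.show()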
data1.UnitPrice.describe()
count 397924.000000
mean 3.116174
std 22.096788
min 0.000000
25% 1.250000
50% 1.950000
75% 3.750000
max 8142.750000
Name: UnitPrice, dtype: float64
Since the minimum unit price = 0, there are either incorrect entries or free items.
Let’s check the distribution of unit prices.
plt.subplots(figsize=(12,6))
sns.set_style('darkgrid')
sns.boxplot(data1.UnitPrice)
plt.show()
Items with UnitPrice = 0 are not outliers. These are the “free” items.
Create a new DataFrame for free items.
free_items_df = data1[data1['UnitPrice'] == 0]
free_items_df.head()
Figure 2-13 shows the filtered data output (unit price = 0).
Let’s count the number of free items given away by month and year.
free_items_df.year_month.value_counts().sort_index()
201012 3
201101 3
201102 1
201103 2
201104 2
201105 2
201107 2
201108 6
201109 2
201110 3
201111 14
Name: year_month, dtype: int64
There is at least one free item every month except June 2011.
Let’s count the number of free items per year and month.
ax = free_items_df.year_month.value_counts().sort_index().plot(kind='bar', figsize=(12,6))
ax.set_xlabel('Month', fontsize=15)
ax.set_ylabel('Frequency', fontsize=15)
ax.set_xticklabels(('Dec_10','Jan_11','Feb_11','Mar_11','Apr_11','May_11','July_11','Aug_11','Sep_11','Oct_11','Nov_11'), rotation='horizontal', fontsize=13)
plt.show()
The greatest number of free items were given out in November 2011. The greatest
number of orders were also placed in November 2011.
Use the bmh style.
plt.style.use('bmh')
Use groupby to count the unique number of invoices by year and month.
ax = data1.groupby('InvoiceNo')['year_month'].unique().value_counts().sort_index().plot(kind='bar', figsize=(15,6))
ax.set_xlabel('Month', fontsize=15)
ax.set_ylabel('Number of Orders', fontsize=15)
ax.set_xticklabels(('Dec_10','Jan_11','Feb_11','Mar_11','Apr_11','May_11','Jun_11','July_11','Aug_11','Sep_11','Oct_11','Nov_11','Dec_11'), rotation='horizontal', fontsize=13)
plt.show()
Compared to May, sales in August declined, indicating a slight effect from the number
of free items given away.
Use the bmh style.
plt.style.use('bmh')
Let’s use groupby to sum the amount spent per year and month.
ax = data1.groupby('year_month')['Amount'].sum().sort_index().plot(kind='bar', figsize=(15,6))
ax.set_xlabel('Month', fontsize=15)
ax.set_ylabel('Amount', fontsize=15)
ax.set_xticklabels(('Dec_10','Jan_11','Feb_11','Mar_11','Apr_11','May_11','Jun_11','July_11','Aug_11','Sep_11','Oct_11','Nov_11','Dec_11'), rotation='horizontal', fontsize=13)
plt.show()
Figure 2-16 shows the output of revenue generated for different months.
Item Insights
This segment answers questions like the following.
• What are the “first choice” items for the greatest number of invoices?
most_sold_items_df = data1.pivot_table(index=['StockCode','Description'], values='Quantity', aggfunc='sum').sort_values(by='Quantity', ascending=False)
most_sold_items_df.reset_index(inplace=True)
sns.set_style('white')
Figure 2-17 shows the output of the top ten items based on sales.
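The filter that creates product_white_df (used next) was lost to extraction; a minimal sketch:

# all transactions for the top-selling item (assumed reconstruction)
product_white_df = data1[data1['Description'] == 'WHITE HANGING HEART T-LIGHT HOLDER']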
product_white_df.shape
(2028, 13)
It denotes that WHITE HANGING HEART T-LIGHT HOLDER has been ordered
2028 times.
len(product_white_df.CustomerID.unique())
856
This means 856 customers ordered WHITE HANGING HEART T-LIGHT HOLDER.
Create a pivot table that displays the number of unique customers who bought a
particular item.
most_bought = data1.pivot_table(index=['StockCode','Description'], values='CustomerID', aggfunc=lambda x: len(x.unique())).sort_values(by='CustomerID', ascending=False)
most_bought
Figure 2-18 shows the output of unique customers who bought a particular item.
Since the WHITE HANGING HEART T-LIGHT HOLDER count matches length 856,
the pivot table looks correct for all items.
most_bought.reset_index(inplace=True)
sns.set_style('white')
Create a bar plot of description (or the item) on the y axis and the sum of unique
customers on the x axis.
Plot only the ten most frequently purchased items.
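The plotting call itself was lost; a sketch (the title text is an assumption):

sns.barplot(y='Description', x='CustomerID', data=most_bought.head(10))
plt.title('Top 10 Items by Number of Unique Customers', fontsize=14)
plt.ylabel('Item')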
Figure 2-19 shows the top ten items by number of unique customers.
l = data1['InvoiceNo']
l = l.to_list()
len(l)
397924
Use the set function to find unique invoice numbers only and store them in the
invoices list.
invoices_list = list(set(l))
The following finds the length of the invoices (or the count of unique invoice
numbers).
len(invoices_list)
18536
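The first-choice outputs below show item names joined with underscores, implying an items column was created earlier; a sketch of the assumed step:

# item names with spaces replaced by underscores (assumed reconstruction)
data1['items'] = data1['Description'].str.replace(' ', '_')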
first_choices_list = []
for i in invoices_list:
    # first item listed on the invoice
    first_purchase_list = data1[data1['InvoiceNo']==i]['items'].reset_index(drop=True)[0]
    # Appending
    first_choices_list.append(first_purchase_list)
first_choices_list[:5]
['ROCKING_HORSE_GREEN_CHRISTMAS_',
'POTTERING_MUG',
'JAM_MAKING_SET_WITH_JARS',
'TRAVEL_CARD_WALLET_PANTRY',
'PACK_OF_12_PAISLEY_PARK_TISSUES_']
The length of the first choices matches the length of the invoices.
len(first_choices_list)
18536
count = Counter(first_choices_list)
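A sketch of the assumed conversion of the counter into a DataFrame (consistent with the rename call below):

# counts of each first-choice item (assumed reconstruction)
df_first_choices = pd.DataFrame.from_dict(count, orient='index').reset_index()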
df_first_choices.rename(columns={'index':'item', 0:'count'},inplace=True)
df_first_choices.sort_values(by='count',ascending=False)
Figure 2-21 shows the output of the top ten first choices.
plt.subplots(figsize=(20,10))
sns.set_style('white')
Figure 2-22 plots the top ten first choices.
Let's use the group by function to create a market basket DataFrame, which specifies whether
an item is present in a particular invoice, for all items and all invoices.
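The construction itself was lost; a sketch mirroring the user-item matrices built elsewhere in this book:

# invoice x item matrix of total quantities (assumed reconstruction)
market_basket = (data1.groupby(['InvoiceNo','Description'])['Quantity'].sum().unstack().reset_index().fillna(0).set_index('InvoiceNo'))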
These values denote the quantity ordered per invoice, which must be fixed.
Figure 2-23 shows the output of total quantity, grouped by invoice and description.
This output gets the quantity ordered (e.g., 48,24,126), but we just want to know if an
item was purchased or not.
So, let’s encode the units as 1 (if purchased) or 0 (not purchased).
def encode_units(x):
    if x < 1:
        return 0
    if x >= 1:
        return 1
market_basket = market_basket.applymap(encode_units)
market_basket.head(10)
Let’s look at an example. If 10 out of 100 users purchase milk, support for milk is
10/100 = 10%. The calculation formula is shown in Figure 2-26.
Suppose you are looking to build a relationship between milk and bread. If 7 out of
40 milk buyers also buy bread, then confidence = 7/40 = 17.5%.
Figure 2-27 explains confidence.
Association Rules
Association rule mining finds interesting associations and relationships among large
sets of data items. This rule shows how frequently an item set occurs in a transaction. A
market basket analysis is performed based on the rules created from the dataset.
Figure 2-30 explains the association rule.
49
Chapter 2 Market Basket Analysis (Association Rule Mining)
Figure 2-30 shows that out of the five transactions in which a mobile phone was
purchased, three included a mobile screen guard. Thus, it should be recommended.
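The mining step that produces itemsets_frequent did not survive extraction; a minimal sketch (the support threshold is an assumption):

# frequent itemsets from the encoded market basket matrix
itemsets_frequent = apriori(market_basket, min_support=0.02, use_colnames=True)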
prod_wooden_star_rules = association_rules(itemsets_frequent, metric="lift", min_threshold=1)
prod_wooden_star_rules.sort_values(['lift','support'], ascending=False).reset_index(drop=True).head()
Figure 2-31 shows the output of the apriori algorithm.
Creating a Function
Create a new function to pass an item name. It returns the items that are bought together
frequently. In other words, it returns the items that are likely to be bought by the user
because they bought the item passed into the function.
def bought_together_frequently(item):
    # reconstructed sketch: mine rules from the invoices containing the given item
    item_invoices = market_basket[market_basket[item] == 1]
    itemsets_frequent = apriori(item_invoices, min_support=0.05, use_colnames=True)
    a_rules = association_rules(itemsets_frequent, metric="lift", min_threshold=1)
    # returning rules sorted by lift and support
    return a_rules.sort_values(['lift','support'], ascending=False).reset_index(drop=True)
Example 1 is as follows.
Example 2 is as follows.
Example 3 is as follows.
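The calls themselves were lost; for Example 3, the call was presumably the following.

bought_together_frequently('JAM MAKING SET WITH JARS')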
Items frequently bought together with JAM MAKING SET WITH JARS
array([frozenset({'JAM MAKING SET WITH JARS'}),
frozenset({'JAM MAKING SET PRINTED'}),
Validation
JAM MAKING SET PRINTED is a part of invoice 536390, so let’s print all the items from
this invoice and cross-check it.
data1[data1 ['InvoiceNo']=='536390']
There are some common items between the recommendations from the
bought_together_frequently function and the invoice.
Thus, the recommender is performing well.
support=prod_wooden_star_rules.support.values
confidence=prod_wooden_star_rules.confidence.values
import networkx as nx
import random
import matplotlib.pyplot as plt
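The loop that builds the rule graph did not survive; a plausible sketch, assuming one node per rule ('R0', 'R1', ...) with antecedents pointing into the rule node and the rule node pointing to consequents:

Graph1 = nx.DiGraph()
colors = np.random.rand(len(prod_wooden_star_rules))   # one random color per rule
for i in range(min(10, len(prod_wooden_star_rules))):
    Graph1.add_node("R" + str(i))
    for a in prod_wooden_star_rules.iloc[i]['antecedents']:
        Graph1.add_nodes_from([a])
        Graph1.add_edge(a, "R" + str(i), color=colors[i], weight=2)
    for c in prod_wooden_star_rules.iloc[i]['consequents']:
        Graph1.add_nodes_from([c])
        Graph1.add_edge("R" + str(i), c, color=colors[i], weight=2)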
edges = Graph1.edges()
colors = [Graph1[u][v]['color'] for u, v in edges]
weights = [Graph1[u][v]['weight'] for u, v in edges]
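A sketch of the assumed drawing call:

pos = nx.spring_layout(Graph1, k=16, scale=1)
nx.draw(Graph1, pos, edge_color=colors, width=weights, node_color='lightblue', with_labels=True)
plt.show()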
support = a_rules.support.values
confidence = a_rules.confidence.values
color_map = []
N = 50
colors = np.random.rand(N)
strs = ['R0', 'R1', 'R2', 'R3', 'R4', 'R5', 'R6', 'R7', 'R8', 'R9', 'R10', 'R11']
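As with the first graph, the construction loop was lost; a sketch under the same assumptions:

Graph2 = nx.DiGraph()
for i in range(min(len(strs), len(a_rules))):
    Graph2.add_node(strs[i])
    for a in a_rules.iloc[i]['antecedents']:
        Graph2.add_nodes_from([a])
        Graph2.add_edge(a, strs[i], color=colors[i], weight=2)
    for c in a_rules.iloc[i]['consequents']:
        Graph2.add_nodes_from([c])
        Graph2.add_edge(strs[i], c, color=colors[i], weight=2)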
print('Visualization of Rules:')
edges = Graph2.edges()
colors = [Graph2[u][v]['color'] for u,v in edges]
weights = [Graph2[u][v]['weight'] for u,v in edges]
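A sketch of the assumed drawing call:

pos = nx.spring_layout(Graph2, k=16, scale=1)
nx.draw(Graph2, pos, edge_color=colors, width=weights, node_color='lightblue', with_labels=True)
plt.show()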
Example 1 is as follows.
Figure 2-35 shows items frequently bought along with WOODEN STAR CHRISTMAS
SCANDINAVIAN.
Example 2 is as follows.
Figure 2-37 shows the items frequently bought together with JAM MAKING SET
WITH JARS.
Summary
In this chapter, you learned how to build a recommendation system based on market
basket analysis. You also learned how to fetch items that are frequently purchased
together and offer suggestions to users. Most e-commerce sites use this method to
showcase items bought together. This chapter implemented this method in Python using
an e-commerce example.
CHAPTER 3
Content-Based Recommender Systems
Content-based filtering is used to recommend products or items very similar to those
being clicked or liked. User recommendations are based on the description of an item
and a profile of the user’s interests. Content-based recommender systems are widely
used in e-commerce platforms. It is one of the basic algorithms in a recommendation
engine. Content-based filtering can be triggered by any event; for example, a click, a
purchase, or an add-to-cart action.
If you use any e-commerce platform, for example, Amazon.com, a product’s page
shows recommendations in the “related products” section. How to generate these
recommendations is discussed in this chapter.
Approach
The following steps build a content-based recommender engine.
1. Do the data collection (should have a complete item description).

2. Preprocess the text data.

3. Convert the text to features.

4. Compute similarity measures.

5. Recommend products.
Implementation
The following imports the required libraries.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity, manhattan_distances, euclidean_distances
from sklearn.feature_extraction.text import TfidfVectorizer
import re
from gensim import models
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.style
%matplotlib inline
from gensim.models import FastText as ft
from IPython.display import Image
import os
• Word2vec: https://drive.google.com/uc?id=0B7XkCwpI5KDYNlNUTTlSS21pQmM

• GloVe: https://nlp.stanford.edu/data/glove.6B.zip

• fastText: https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.en.300.bin.gz
Content_df = pd.read_csv("Rec_sys_content.csv")
# Data Info
Content_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3958 entries, 0 to 3957
Data columns (total 6 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 StockCode 3958 non-null object
1 Product Name 3958 non-null object
2 Description 3958 non-null object
3 Category 3856 non-null object
4 Brand 3818 non-null object
5 Unit Price 3943 non-null float64
dtypes: float64(1), object(5)
memory usage: 185.7+ KB
Content_df.shape
(3958, 6)
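The null counts below presumably come from a call like the following.

Content_df.isnull().sum()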
StockCode 0
Product Name 0
Description 0
Category 102
Brand 140
Unit Price 15
dtype: int64
There are a few null values present in the dataset. However, let’s focus on Product
Name and Description to build a content-based recommendation engine. Removing
nulls from Category, Brand, and Unit Price is not required.
Now, let’s load the pre-trained models.
#Importing Word2Vec
word2vecModel = models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin.gz', binary=True)

#Importing FastText
fasttext_model = ft.load_fasttext_format("cc.en.300.bin.gz")

#Import Glove
glove_df = pd.read_csv('glove.6B.300d.txt', sep=" ", quoting=3, header=None, index_col=0)
glove_model = {key: value.values for key, value in glove_df.T.items()}
As discussed, the Product Name and Description columns of text data are used to
build a content-based recommendation engine. Text preprocessing is mandatory. It is
followed by text-to-feature conversion.
The following describes the preprocessing steps (a sketch of the first steps follows this list).

1. Remove duplicates.

2. Convert the text to lowercase.

3. Remove punctuation.
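The first preprocessing lines were lost to extraction; a sketch mirroring the equivalent code in the co-occurrence section later in this chapter:

# combining product name and description, dropping duplicates, lowercasing
df = Content_df.copy()
df['Description'] = df['Product Name'] + ' ' + df['Description']
unique_df = df.drop_duplicates(subset=['Description'], keep='first')
unique_df['desc_lowered'] = unique_df['Description'].apply(lambda x: x.lower())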
unique_df['desc_lowered'] = unique_df['desc_lowered'].apply(lambda x: re.sub(r'[^\w\s]', '', x))
unique_df = unique_df.reset_index(drop=True)
desc_list = list(unique_df['desc_lowered'])   # assumed; the functions below rely on this list
Text to Features
Once the text preprocessing is done, let’s focus on converting the preprocessed text into
features.
There are several methods to convert text to features.
• CountVectorizer
• TF-IDF
• Word2vec
• fastText
• GloVe
Since machines or algorithms cannot understand the text, a key task in natural
language processing (NLP) is converting text data into numerical data called features.
There are different techniques to do this. Let’s discuss them briefly.
Table 3-1 demonstrates one-hot encoding for the phrases "One Hot," "Hot," and "Encoding."

Text        one  hot  encoding
One Hot      1    1    0
Hot          0    1    0
Encoding     0    0    1
CountVectorizer
The drawback to the OHE approach is if a word appears multiple times in a sentence, it
gets the same importance as any other word that appears only once. CountVectorizer
helps overcome this because it counts the tokens present in an observation instead of
tagging everything as 1 or 0.
Table 3-2 demonstrates CountVectorizer.
Word Embeddings
Even though TF-IDF is widely used, it doesn’t capture the context of a word or sentence.
Word embeddings solve this problem. Word embeddings help capture context and
the semantic and syntactic similarity between the words. They generate a vector that
captures the context and semantics using shallow neural networks.
In recent years, there have been many advancements in this field, including the
following tools.
• Word2vec
• GloVe
• fastText
• ELMo
• SentenceBERT
• GPT
For more information about these concepts, please refer to our second edition
book on NLP, Natural Language Processing Recipes: Unlocking Text Data with Machine
Learning and Deep Learning Using Python (Apress, 2021).
The pre-trained models (word embeddings)—GloVe, Word2vec, and fastText—have
been imported/loaded. Now let’s import the CountVectorizer and TF-IDF.
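Only the TF-IDF statement survived extraction; the CountVectorizer counterpart (used as cnt_vec later) was presumably similar:

# Importing CountVectorizer (assumed instantiation)
cnt_vec = CountVectorizer(stop_words='english', analyzer='word', ngram_range=(1,3))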
# Importing TF-IDF
tfidf_vec = TfidfVectorizer(stop_words='english', analyzer='word', ngram_range=(1,3))
Similarity Measures
Once text is converted to features, the next step is to build a content-based model. The
similarity measures are then used to fetch the most similar vectors.
There are three types of similarity measures.
• Euclidean distance
• Cosine similarity
• Manhattan distance
Note We have not yet converted text to features; we only loaded all the methods.
They are used later.
Euclidean Distance
Euclidean distance is calculated by taking the sum of the squared differences between
two vectors and then applying the square root.
Figure 3-3 explains the Euclidean distance.
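In symbols, for two vectors A and B:

$$ d(A,B) = \sqrt{\sum_{i=1}^{n}(A_i - B_i)^2} $$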
Cosine Similarity
Cosine similarity is the cosine of the angle between two n-dimensional vectors in an
N-dimensional space. It is the dot product of the two vectors divided by the product of
the two vectors’ lengths (or magnitudes).
Figure 3-4 explains cosine similarity.
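In symbols:

$$ \text{sim}(A,B) = \frac{A \cdot B}{\|A\|\,\|B\|} $$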
Manhattan Distance
The Manhattan distance is the sum of the absolute differences between two vectors.
Figure 3-5 explains the Manhattan distance.
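In symbols:

$$ d(A,B) = \sum_{i=1}^{n} \lvert A_i - B_i \rvert $$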
#Euclidean distance
def find_euclidean_distances(sim_matrix, index, n=10):
    # reconstructed sketch: the smallest distances are the closest matches
    result = sim_matrix[index].argsort()[1:n+1]
    similar_products = list(unique_df.loc[result]['Product Name'])
    return similar_products

#Cosine similarity
def find_similarity(cosine_sim_matrix, index, n=10):
    # reconstructed sketch: the largest similarities are the closest matches
    result = cosine_sim_matrix[index].argsort()[::-1][1:n+1]
    similar_products = list(unique_df.loc[result]['Product Name'])
    return similar_products

#Manhattan distance
def find_manhattan_distance(sim_matrix, index, n=10):
    # reconstructed sketch: the smallest distances are the closest matches
    result = sim_matrix[index].argsort()[1:n+1]
    similar_products = list(unique_df.loc[result]['Product Name'])
    return similar_products
# reconstructed sketch: vectorize the descriptions, then branch on the metric
def get_recommendation_cv(product_id, df, similarity, n=10):
    index = df[df['StockCode'] == product_id].index[0]
    count_vector = cnt_vec.fit_transform(desc_list)
    if similarity == "cosine":
        sim_matrix = cosine_similarity(count_vector)
        products = find_similarity(sim_matrix, index, n)
    elif similarity == "manhattan":
        sim_matrix = manhattan_distances(count_vector)
        products = find_manhattan_distance(sim_matrix, index, n)
    else:
        sim_matrix = euclidean_distances(count_vector)
        products = find_euclidean_distances(sim_matrix, index)
    return products
# Cosine Similarity
get_recommendation_cv(product_id, unique_df, similarity = "cosine", n=10)
Figure 3-6 shows the output of cosine similarity for CountVectorizer features.
#Manhattan Similarity
get_recommendation_cv(product_id, unique_df, similarity =
"manhattan", n=10)
Figure 3-7 shows the output of Manhattan similarity for CountVectorizer features.
#Euclidean Similarity
get_recommendation_cv(product_id, unique_df, similarity =
"euclidean", n=10)
# reconstructed sketch: same pattern as get_recommendation_cv, using TF-IDF features
def get_recommendation_tfidf(product_id, df, similarity, n=10):
    index = df[df['StockCode'] == product_id].index[0]
    tfidf_matrix = tfidf_vec.fit_transform(desc_list)
    if similarity == "cosine":
        sim_matrix = cosine_similarity(tfidf_matrix)
        products = find_similarity(sim_matrix, index, n)
    elif similarity == "manhattan":
        sim_matrix = manhattan_distances(tfidf_matrix)
        products = find_manhattan_distance(sim_matrix, index, n)
    else:
        sim_matrix = euclidean_distances(tfidf_matrix)
        products = find_euclidean_distances(sim_matrix, index)
    return products
The input for this function is the same as what was used in the previous section. The
recommendations are for the same product.
Next, get recommendations using cosine similarity for TF-IDF features.
# Cosine Similarity
get_recommendation_tfidf(product_id, unique_df, similarity =
"cosine", n=10)
#Manhattan Similarity
get_recommendation_tfidf(product_id, unique_df, similarity =
"manhattan", n=10)
#Euclidean Similarity
get_recommendation_tfidf(product_id, unique_df, similarity =
"euclidean", n=10)
# reconstructed sketch: average the Word2vec vectors of each description's words,
# then branch on the metric as before
def get_recommendation_word2vec(product_id, df, similarity, n=10):
    input_index = df[df['StockCode'] == product_id].index[0]
    vector_matrix = np.zeros((len(desc_list), 300))
    for index, each_sentence in enumerate(desc_list):
        sentence_vector = np.zeros(300)
        count = 0
        for each_word in each_sentence.split():
            try:
                sentence_vector += word2vecModel[each_word]
                count += 1
            except KeyError:
                continue
        vector_matrix[index] = sentence_vector
    if similarity == "cosine":
        sim_matrix = cosine_similarity(vector_matrix)
        products = find_similarity(sim_matrix, input_index, n)
    elif similarity == "manhattan":
        sim_matrix = manhattan_distances(vector_matrix)
        products = find_manhattan_distance(sim_matrix, input_index, n)
    else:
        sim_matrix = euclidean_distances(vector_matrix)
        products = find_euclidean_distances(sim_matrix, input_index)
    return products
The input for this function is the same as what was used in the previous section. The
recommendations are for the same product.
Let’s get recommendations using Manhattan similarity for Word2vec features.
#Manhattan Similarity
get_recommendation_word2vec(product_id, unique_df, similarity =
"manhattan", n=10)
# Cosine Similarity
get_recommendation_word2vec(product_id, unique_df, similarity =
"cosine", n=10)
#Euclidean Similarity
get_recommendation_word2vec(product_id, unique_df, similarity =
"euclidean", n=10)
# reconstructed sketch: same pattern as get_recommendation_word2vec, using fastText vectors
def get_recommendation_fasttext(product_id, df, similarity, n=10):
    input_index = df[df['StockCode'] == product_id].index[0]
    vector_matrix = np.zeros((len(desc_list), 300))
    for index, each_sentence in enumerate(desc_list):
        sentence_vector = np.zeros(300)
        count = 0
        for each_word in each_sentence.split():
            try:
                sentence_vector += fasttext_model.wv[each_word]
                count += 1
            except KeyError:
                continue
        vector_matrix[index] = sentence_vector
    if similarity == "cosine":
        sim_matrix = cosine_similarity(vector_matrix)
        products = find_similarity(sim_matrix, input_index, n)
    elif similarity == "manhattan":
        sim_matrix = manhattan_distances(vector_matrix)
        products = find_manhattan_distance(sim_matrix, input_index, n)
    else:
        sim_matrix = euclidean_distances(vector_matrix)
        products = find_euclidean_distances(sim_matrix, input_index)
    return products
The input for this function is the same as what was used in the previous section. The
recommendations are for the same product.
Let’s get recommendations using cosine similarity for fastText features.
# Cosine Similarity
get_recommendation_fasttext(product_id, unique_df, similarity =
"cosine", n=10)
#Manhattan Similarity
get_recommendation_fasttext(product_id, unique_df, similarity =
"manhattan", n=10)
#Euclidean Similarity
get_recommendation_fasttext(product_id, unique_df, similarity =
"euclidean", n=10)
# Comparing similarity to get the top matches using GloVe pretrained model
# reconstructed sketch: same pattern, using GloVe vectors from glove_model
def get_recommendation_glove(product_id, df, similarity, n=10):
    input_index = df[df['StockCode'] == product_id].index[0]
    vector_matrix = np.zeros((len(desc_list), 300))
    for index, each_sentence in enumerate(desc_list):
        sentence_vector = np.zeros(300)
        count = 0
        for each_word in each_sentence.split():
            try:
                sentence_vector += glove_model[each_word]
                count += 1
            except:
                continue
        vector_matrix[index] = sentence_vector
    if similarity == "cosine":
        sim_matrix = cosine_similarity(vector_matrix)
        products = find_similarity(sim_matrix, input_index, n)
    elif similarity == "manhattan":
        sim_matrix = manhattan_distances(vector_matrix)
        products = find_manhattan_distance(sim_matrix, input_index, n)
    else:
        sim_matrix = euclidean_distances(vector_matrix)
        products = find_euclidean_distances(sim_matrix, input_index)
    return products
The input for this function is the same as what was used in the previous section. The
recommendations are for the same product.
Next, get recommendations using Euclidean similarity for GloVe features.
#Euclidean Similarity
get_recommendation_glove(product_id, unique_df, similarity =
"euclidean", n=10)
To get recommendations using cosine similarity for GloVe features, change similarity
to “cosine”.
# Cosine Similarity
get_recommendation_glove(product_id, unique_df, similarity =
"cosine", n=10)
#Manhattan Similarity
get_recommendation_glove(product_id, unique_df, similarity =
"manhattan", n=10)
The purpose of a co-occurrence matrix is to present the number of times each word
appears in the same context.
“Roses are red. The sky is blue.” Figure 3-13 shows these words in a co-
occurrence matrix.
#preprocessing
df = df.head(250)
# Combining Product and Description
df['Description'] = df['Product Name'] + ' ' +df['Description']
unique_df = df.drop_duplicates(subset=['Description'], keep='first')
84
Chapter 3 Content-Based Recommender Systems
unique_df['desc_lowered'] = unique_df['Description'].apply(lambda x: x.lower())
unique_df['desc_lowered'] = unique_df['desc_lowered'].apply(lambda x: re.sub(r'[^\w\s]', '', x))
desc_list = list(unique_df['desc_lowered'])
co_ocr_vocab = []
for i in desc_list:
    [co_ocr_vocab.append(x) for x in i.split()]
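The construction of the co-occurrence matrix itself was lost; a plausible sketch, assuming two words co-occur when they appear in the same description:

# assumed reconstruction of the co-occurrence matrix over the vocabulary
co_ocr_vocab = list(set(co_ocr_vocab))
word_to_idx = {w: i for i, w in enumerate(co_ocr_vocab)}
co_occur_vector_matrix = np.zeros((len(co_ocr_vocab), len(co_ocr_vocab)))
for sent in desc_list:
    words = sent.split()
    for w1 in words:
        for w2 in words:
            if w1 != w2:
                co_occur_vector_matrix[word_to_idx[w1]][word_to_idx[w2]] += 1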
# reconstructed sketch: average the co-occurrence vectors of each description's words
def get_recommendation_coccur(product_id, df, similarity, n=10):
    index = df[df['StockCode'] == product_id].index[0]
    vector_matrix = np.zeros((len(desc_list), len(co_ocr_vocab)))
    for i, each_sentence in enumerate(desc_list):
        sentence_vector = np.zeros(len(co_ocr_vocab))
        count = 0
        for each_word in each_sentence.split():
            try:
                sentence_vector += co_occur_vector_matrix[co_ocr_vocab.index(each_word)]
                count += 1
            except:
                continue
        vector_matrix[i] = sentence_vector / count
    if similarity == "cosine":
        sim_matrix = cosine_similarity(vector_matrix)
        products = find_similarity(sim_matrix, index, n)
    elif similarity == "manhattan":
        sim_matrix = manhattan_distances(vector_matrix)
        products = find_manhattan_distance(sim_matrix, index, n)
    else:
        sim_matrix = euclidean_distances(vector_matrix)
        products = find_euclidean_distances(sim_matrix, index)
    return products
#Euclidean Similarity
get_recommendation_coccur(product_id, unique_df, similarity =
"euclidean", n=10)
Figure 3-14 shows the Euclidean output for the co-occurrence matrix.
# Cosine Similarity
get_recommendation_coccur(product_id, unique_df, similarity =
"cosine", n=10)
#Manhattan Similarity
get_recommendation_coccur(product_id, unique_df, similarity =
"manhattan", n=10)
Summary
In this chapter, you learned how to build a content-based model using text data from
the data preparation to the recommendations to the users. You saw models built using
several NLP techniques. Using word embeddings is a much better option, given they
have the power to capture context and semantics.
CHAPTER 4
Collaborative Filtering
Collaborative filtering is a very popular method in recommendation engines. It is the
predictive process behind the suggestions provided by these systems. It processes and
analyzes customers’ information and suggests items they will likely appreciate.
Collaborative filtering algorithms use a customer’s purchase history and ratings to
find similar customers and then suggest items that they liked.
Figure 4-1 explains collaborative filtering at a high level.
For example, to find a new movie or show to watch, you can ask your friends
for suggestions since you all share similar tastes in content. The same concept is
used in collaborative filtering, where user-user similarity finds similar users to get
recommendations based on each other’s likes.
There are two types of collaborative filtering methods: user-to-user and
item-to-item. They are explored in the upcoming sections. This chapter looks at the
implementation of these two methods using cosine similarity before diving into
implementing the more popularly used KNN-based algorithm for collaborative filtering.
Implementation
The following installs the surprise library.
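The install command itself was lost; the surprise package is distributed as scikit-surprise:

# assumed install command
!pip install scikit-surprise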
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
import random
from IPython.display import Image
The following imports the KNN algorithm and csr_matrix for KNN data preparation.
Finally, import accuracy to get metrics such as root-mean-square error (RMSE) and
mean absolute error (MAE).
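The import statements themselves were lost; a sketch consistent with the code that follows:

from sklearn.neighbors import NearestNeighbors   # KNN model
from scipy.sparse import csr_matrix              # sparse-matrix preparation
from sklearn.metrics.pairwise import cosine_similarity
from surprise import accuracy                    # RMSE and MAE metrics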
Data Collection
This chapter uses a custom dataset that has been masked. Download the dataset from
the GitHub link.
The following reads the data.
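A sketch of the assumed loading step:

data = pd.read_csv('data.csv')   # assumed filename from the book's GitHub data
data.head()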
data.shape
(272404, 9)
The dataset has a total of 272,404 unique transactions in its nine columns.
Let’s check if there are any null values because a clean dataset is required for further
analysis.
data.isnull().sum().sort_values(ascending=False)
CustomerID 0
ShippingCost 0
ShipMode 0
Discount% 0
DeliveryDate 0
InvoiceDate 0
Quantity 0
StockCode 0
InvoiceNo 0
dtype: int64
The data is clean with no nulls in any columns. Further preprocessing is not required
in this case.
If there were any NaNs or nulls in the data, they were dropped using the following.
data1 = data.dropna()
Now let’s check for any data abnormalities by describing the data.
data1.describe()
There aren’t any negative values in the Quantity column, but if there were, those
records would need to be dropped since it’s a data abnormality.
Let’s change the StockCode column datatype to string to maintain the same type
across all rows.
data1.StockCode = data1.StockCode.astype(str)
Memory-Based Approach
Let’s examine the most basic approach to implementing collaborative filtering: the
memory-based approach. This approach uses simple arithmetic operations or metrics to
calculate the similarities between two users or two items to group them. For example, to
find user-user relations, both users' historically liked items are used to compute a similarity
metric that measures how similar the two users are.
Cosine similarity is a common similarity metric. Euclidean distance and Pearson’s
correlation are other popular metrics. A metric is considered geometric if the row
(column) of a given user (item) is treated as a vector or a matrix. In cosine similarity, the
similarity of two users (say) is measured as the cosine of the angle between the vectors
of the two users. For users A and B, the cosine similarity is given by the formula shown in
Figure 4-4.
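In standard form:

$$ \text{sim}(A,B) = \cos\theta = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{\sum_{i} A_i B_i}{\sqrt{\sum_{i} A_i^2}\,\sqrt{\sum_{i} B_i^2}} $$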
Here, a matrix is formed to describe the behavior of all the users (purchase history
in our example) corresponding to all the items. Using this matrix, you can calculate the
similarity metrics (cosine similarity) between users to formulate user-user relations.
These relations help find users similar to a given user and recommend the items bought
by these similar users.
Implementation
Let’s first create a data matrix covering purchase history. It contains all customer IDs for
all items (whether a customer has purchased an item or not).
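The construction of purchase_df was lost to extraction; a sketch mirroring the item-to-item version later in this chapter, with users as rows:

# user x item matrix of total quantities (assumed reconstruction)
purchase_df = (data1.groupby(['CustomerID','StockCode'])['Quantity'].sum().unstack().reset_index().fillna(0).set_index('CustomerID'))
purchase_df.head()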
The data matrix shown in Figure 4-6 reveals the total quantity purchased by each
user against each item. Only information about whether the item was bought or not by
the user is needed, not the quantity.
Thus, an encoding of 0 or 1 is used, where 0 is not purchased, and 1 is purchased.
Let’s first write a function for encoding the data matrix.
def encode_units(x):
    if x < 1:    # If the quantity is less than 1
        return 0   # Not purchased
    if x >= 1:   # If the quantity is greater than or equal to 1
        return 1   # Purchased
purchase_df = purchase_df.applymap(encode_units)
purchase_df.head()
The purchase data matrix reveals the behavior of customers across all items. This
matrix finds the user similarity scores matrix, and the similarity metric uses cosine
similarity. The user similarity score matrix has user-to-user similarity for each user pair.
First, let’s apply cosine_similarity to the purchase data matrix.
user_similarities = cosine_similarity(purchase_df)
Now, let’s store the user similarity scores in a DataFrame (i.e., the similarity scores
matrix).
user_similarity_data = pd.DataFrame(user_similarities, index=purchase_df.index, columns=purchase_df.index)
user_similarity_data.head()
The similarity score values are between 0 to 1, where values closer to 0 represent less
similar, and values closer to 1 represent more similar customers.
Using this user similarity scores data, let’s get recommendations for a given user.
Create a function for this.
def fetch_similar_users(user_id, k=5):
    # separating the data row for the entered user id
    user_similarity = user_similarity_data[user_similarity_data.index == user_id]
    # all remaining users (assumed reconstruction)
    other_users_similarities = user_similarity_data[user_similarity_data.index != user_id]
    # calculate cosine similarity between the user and each other user
    similarities = cosine_similarity(user_similarity, other_users_similarities)[0].tolist()
    user_indices = other_users_similarities.index.tolist()
    # pair each user id with its similarity score and sort, highest first (assumed)
    sorted_index_similarity_pair = sorted(zip(user_indices, similarities), key=lambda x: x[1], reverse=True)
    top_k_users_similarities = sorted_index_similarity_pair[:k]
    similar_users = [u[0] for u in top_k_users_similarities]
    return similar_users
This function separates the selected user from all other users and then takes a cosine
similarity of the selected user with all users to find similar users. Return the top k similar
users (by CustomerID) to our selected user.
For example, let's find the users similar to user 12347.
similar_users = fetch_similar_users(12347)
similar_users
def simular_users_recommendation(userid):
    # reconstructed sketch, following the description below
    similar_users = fetch_similar_users(userid)
    simular_users_recommendation_list = []
    for j in similar_users:
        # all items bought by each similar user
        item_list = data1[data1['CustomerID'] == j]['StockCode'].tolist()
        simular_users_recommendation_list.append(item_list)
    # flattening and deduplicating the list of items
    flat_list = [item for sublist in simular_users_recommendation_list for item in sublist]
    final_list = list(dict.fromkeys(flat_list))
    # ten randomly chosen recommendations (assumes at least ten candidates)
    return random.sample(final_list, 10)
This function gets the similar users for the given customer (ID) and obtains a list
of all the items bought by these similar users. This list is then flattened to get a final list
of unique items, from which ten randomly chosen items are recommended to the
given user.
Using this function on user 12347 to get recommendations results in the following
suggestions.
simular_users_recommendation(12347)
User 12347 had ten suggestions from the items bought by similar users.
Implementation
Following the initial steps used in user-to-user collaborative filtering methods, let’s first
create the data matrix, which contains all the item IDs across their purchase history (i.e.,
quantity purchased by each customer).
items_purchase_df = (data1.groupby(['StockCode','CustomerID'])['Quantity'].sum().unstack().reset_index().fillna(0).set_index('StockCode'))
items_purchase_df.head()
This data matrix shows the total quantity purchased by each user against each item.
But the only information needed is whether the user bought the item.
Thus, an encoding of 0 or 1 is used, where 0 is not purchased, and 1 is purchased.
Use the same encode_units function created earlier.
items_purchase_df = items_purchase_df.applymap(encode_units)
The items purchase data matrix reveals the behavior of customers across all items.
Let’s use this matrix to find item similarity scores with the cosine similarity metric. The
item similarity score matrix has item-to-item similarity for each item pair.
First, let’s apply cosine_similarity to the item purchase data matrix.
item_similarities = cosine_similarity(items_purchase_df)
Now, let’s store the item similarity scores in a DataFrame (i.e., the similarity scores
matrix).
item_similarity_data = pd.DataFrame(item_similarities, index=items_purchase_df.index, columns=items_purchase_df.index)
item_similarity_data.head()
The similarity score values are between 0 and 1, where values closer to 0 represent
less similarity and values closer to 1 represent more similar items.
Using this item similarity score data, let’s get recommendations for a given user.
The following creates a function for this.
def fetch_similar_items(item_id, k=10):
    # separating the data row of the selected item
    item_similarity = item_similarity_data[item_similarity_data.index == item_id]
    # all remaining items (assumed reconstruction)
    other_items_similarities = item_similarity_data[item_similarity_data.index != item_id]
    # calculate cosine similarity between the selected item and the other items
    similarities = cosine_similarity(item_similarity, other_items_similarities)[0].tolist()
    item_indices = other_items_similarities.index.tolist()
    # pair each item with its similarity score, sort, and keep the top k (assumed)
    sorted_index_similarity_pair = sorted(zip(item_indices, similarities), key=lambda x: x[1], reverse=True)
    similar_items = [i[0] for i in sorted_index_similarity_pair[:k]]
    return similar_items
This function separates the selected item from all other items and then takes a cosine
similarity of the selected item with all items to find the similarities. Return the top k
similar items (StockCodes) to our selected item.
For example, let's find the items similar to item 10002.
similar_items = fetch_similar_items('10002')
similar_items
As expected, you see the default ten similar items to item 10002.
Now let’s get the recommendations by showing similar items to those bought by a
particular user.
Write another function to get similar item recommendations.
def simular_item_recommendation(userid):
    # reconstructed sketch, following the description below
    simular_items_recommendation_list = []
    # items previously bought by the given customer
    item_list = data1[data1['CustomerID'] == userid]['StockCode'].unique().tolist()
    for item in item_list:
        simular_items_recommendation_list.append(fetch_similar_items(item))
    # flattening and deduplicating, then sampling ten recommendations
    flat_list = [i for sublist in simular_items_recommendation_list for i in sublist]
    final_list = list(dict.fromkeys(flat_list))
    return random.sample(final_list, 10)
This function gets the list of similar items for all items previously bought by the given
customer (ID). This list is then flattened to get a final list of unique items, from which
ten randomly chosen items are shown as recommendations for the given user.
Again, trying this function on user 12347 to get the recommendations for that user
results in the following suggestions.
simular_item_recommendation(12347)
'21041',
'23316',
'22550']
User 12347 has ten suggestions that are similar to items previously bought.
KNN-based Approach
You have learned the basics of collaborative filtering and implementing user-to-user and
item-to-item filtering. Now let’s dive into machine learning-based approaches, which
are more robust and popular in building recommendation systems.
Machine Learning
Machine learning is a machine’s capability to learn from experience (data) and make
meaningful predictions without being explicitly programmed. It is a subfield of artificial
intelligence that deals with building systems that can learn from data. The objective is to
make computers learn on their own without any intervention from humans.
There are three primary machine learning categories.
Supervised Learning
In supervised learning, labeled training data is leveraged to derive the pattern or
function and make a model or machine learn. Data consists of a dependent variable
(Target label) and the independent variables or predictors. The machine tries to learn
the function of labeled data and predict the output of unseen data.
Unsupervised Learning
In unsupervised learning, a machine learns hidden patterns without leveraging labeled data, so there is no supervised training signal. These algorithms capture patterns based on similarities or distances between data points.
Reinforcement Learning
Reinforcement learning is the process of maximizing a reward by taking action. The
algorithms learn how to reach a goal through experience.
Supervised Learning
There are two types of supervised learning: regression and classification.
Regression
Regression is a statistical predictive modeling technique that finds the relationship
between a dependent variable and one or more independent variables. Regression
is used when the dependent variable is continuous; prediction can take any
numerical value.
Popular regression algorithms include linear regression, decision tree, random
forest, SVM, LightGBM, and XGBoost.
Classification
Classification is a supervised machine learning technique in which the dependent or
output variable is categorical; for example, spam/ham, churn/not churned, and so on.
K-Nearest Neighbor
The k-nearest neighbor (KNN) algorithm is a supervised machine learning model that is
used for both classification and regression problems. It is a very robust algorithm that is
easy to implement and interpret and uses less calculation time. Labeled data is needed
since it’s a supervised learning algorithm.
Figure 4-12 explains KNN algorithms.
Now let's try implementing a simple KNN model on purchase_df, created in user-to-user filtering. This approach follows similar steps to those you have seen before (i.e., recommendations are based on the list of items purchased by similar users). The difference is that here a KNN model finds the similar users for a given user.
Implementation
Before passing our sparse matrix (i.e., purchase_df ) into KNN, it must be converted into
a CSR matrix.
CSR divides a sparse matrix into three separate arrays.
• values (the nonzero entries)
• extent of rows (index pointers)
• index of columns
from scipy.sparse import csr_matrix

purchase_matrix = csr_matrix(purchase_df.values)
Next, create the KNN model using the Euclidean distance metric (the exact model settings are not shown in the text, so treat them as assumptions).
from sklearn.neighbors import NearestNeighbors

knn_model = NearestNeighbors(metric='euclidean', algorithm='brute')
knn_model.fit(purchase_matrix)
Now that the KNN model is in place, let’s write a function to fetch similar users using
the model.
def fetch_similar_users_knn(purchase_df, query_index, n_neighbors=5):
    # creating an empty list to store the user ids of similar users
    similar_users_knn = []
    # distances and indices of the nearest neighbors of the query row (reconstructed line)
    distances, indices = knn_model.kneighbors(purchase_df.iloc[query_index, :].values.reshape(1, -1), n_neighbors=n_neighbors)
    for i in range(1, len(distances.flatten())):  # skip i=0, the query user itself
        similar_users_knn.append(purchase_df.index[indices.flatten()[i]])
    return similar_users_knn
This function first calculates the distances and indices of the five nearest neighbors using the KNN model's kneighbors function. The output is then processed, and a list of the similar users alone is returned. Note that the function takes the row index in the DataFrame as input, not the CustomerID.
Let's test this out for index 1497.
similar_users_knn = fetch_similar_users_knn(purchase_df, 1497)
similar_users_knn
Now that we have similar users, let’s get the recommendations by showing the items
bought by these similar users.
def knn_recommendation(similar_users_knn):
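    # the body is reconstructed from the description below (a sketch; it mirrors
    # the user-to-user filtering logic and requires `import random`)
    knn_recommendation_list = []
    for user in similar_users_knn:
        # items bought by each similar user
        bought = purchase_df.loc[user][purchase_df.loc[user] > 0].index.tolist()
        knn_recommendation_list.append(bought)
    # flatten, dedupe, and recommend any random ten items
    flat_set = {item for sublist in knn_recommendation_list for item in sublist}
    return random.sample(list(flat_set), 10)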
This function replicates the logic used in user-to-user filtering. Next, let’s get the final
list of items that similar users purchased and recommend any random ten items from it.
Using this function on the previously generated similar users list gets the following
recommendations.
knn_recommendation(similar_users_knn)
'22926',
'22921',
'22605',
'23298',
'22916',
'22470',
'22927',
'84978']
User 14729 has ten suggestions from the products bought by similar users.
Summary
This chapter covered collaborative filtering-based recommendation engines and
implementing the two types of filtering methods—user-to-user and item-to-item—using
basic arithmetic operations. The chapter also explored the k-nearest neighbor algorithm
(along with some machine learning basics). It ended by implementing user-to-user-
based collaborative filtering using the KNN approach. The next chapter explores other
popular methods to implement collaborative filtering-based recommendation engines.
CHAPTER 5
Collaborative Filtering Using Matrix Factorization, Singular Value Decomposition, and Co-Clustering
Chapter 4 explored collaborative filtering and using the KNN method. A few more
important methods are covered in this chapter: matrix factorization (MF), singular
value decomposition (SVD), and co-clustering. These methods (along with KNN) fall
into the model-based collaborative filtering approach. The basic arithmetic method of
calculating cosine similarity to find similar users falls into the memory-based approach.
Each approach has pros and cons; depending on the use case, you must select the
suitable approach.
Figure 5-1 explains the two types of approaches in collaborative filtering.
The memory-based approach is much easier to implement and explain, but its performance often suffers with sparse data. Model-based approaches, like MF, handle sparse data well but are usually less intuitive, harder to explain, and more complex to implement. The model-based approach also performs better with large datasets and is therefore quite scalable.
This chapter focuses on a few popular model-based approaches, such as
implementing matrix factorization using the same data from Chapter 4, SVD, and
co-clustering models.
Implementation
Matrix Factorization, Co-Clustering, and SVD
The following implementation is a continuation of Chapter 4 and uses the same dataset.
Let’s look at the data.
data1.head()
Let's reuse items_purchase_df from Chapter 4. It is the matrix containing the items and the information on whether customers bought them.
items_purchase_df.head()
This chapter uses the Python package called surprise for modeling purposes. It has
implementations of popular methods in collaborative filtering, like matrix factorization,
SVD, co-clustering, and even KNN.
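The exact import list is not shown in the text; the following covers everything used in this chapter (treat it as an assumption).
from surprise import Reader, Dataset, NMF, SVD, CoClustering, accuracy
from surprise.model_selection import train_test_split, cross_validate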
First, let’s format the data into the proper format required by the surprise package.
Start by stacking the DataFrame/matrix.
data3 = items_purchase_df.stack().to_frame()
#Renaming the column as Quantity
data3 = data3.reset_index().rename(columns={0:"Quantity"})
data3
print(items_purchase_df.shape)
print(data3.shape)
(3538, 3647)
(12903086, 3)
As you can see, items_purchase_df has 3538 unique items (rows) and 3647 unique
users (columns). The stacked DataFrame is 3538 × 3647 = 12,903,086 rows, which is too
big to pass into any algorithm.
Let’s shortlist some customers and items based on the number of orders.
First, put all the IDs in a list.
The following imports the counter to count the number of orders made by each
customer and for each item.
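A sketch of the counting step, assuming data1 is the raw orders table.
from collections import Counter

count_customers = Counter(data1['CustomerID'])
count_items = Counter(data1['StockCode'])
# storing the customer counts in a DataFrame
customer_count_df = pd.DataFrame.from_dict(count_customers, orient='index').reset_index().rename(columns={0:"Quantity"})
Then drop all customers with 120 orders or fewer.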
customer_count_df = customer_count_df[customer_count_df["Quantity"]>120]
customer_count_df.rename(columns={'index':'CustomerID'},inplace=True)
customer_count_df
Similarly, repeat the same process for items (i.e., counting the number of orders
placed per item and storing it in a DataFrame).
# storing the count and item description in a dataframe
item_count_df = pd.DataFrame.from_dict(count_items, orient='index').reset_index().rename(columns={0:"Quantity"})
Drop all items that were ordered 120 times or fewer.
item_count_df = item_count_df[item_count_df["Quantity"]>120]
item_count_df.rename(columns={'index':'StockCode'},inplace=True)
item_count_df
Next, apply a join on both DataFrames with stacked data to create the shortlisted
final DataFrame.
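A sketch of the join, assuming the stacked data3 has CustomerID and StockCode columns.
data4 = data3.merge(customer_count_df[['CustomerID']], on='CustomerID')
data4 = data4.merge(item_count_df[['StockCode']], on='StockCode')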
Figure 5-7 shows the shortlisted DataFrame output.
Now that the size of the data has been reduced, let’s describe it and view the stats.
data4.describe()
You can see from the output that the count has been significantly reduced to 385,672 records from 12,903,086. This DataFrame must be formatted further with the surprise package's built-in functions before it can be used.
Read the data in a format supported by the surprise library.
reader = Reader(rating_scale=(0,5095))
The rating scale is set to (0, 5095) because the maximum quantity value is 5095.
Load the dataset in a format supported by the surprise library.
formated_data = Dataset.load_from_df(data4, reader)
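Next, split the formatted data into train and test sets (an 80/20 split is assumed here; the book's exact ratio is not shown).
train_set, test_set = train_test_split(formated_data, test_size=0.2)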
Implementing NMF
Let’s start by modeling the non-negative matrix factorization method.
Figure 5-9 explains matrix factorization (multiplication).
# model instantiation and fitting (algo1 is assumed to be NMF with default parameters)
algo1 = NMF()
algo1.fit(train_set)
# model prediction
pred1 = algo1.test(test_set)
Using built-in functions, you can calculate performance metrics like RMSE (root mean squared error) and MAE (mean absolute error).
# RMSE
accuracy.rmse(pred1)
#MAE
accuracy.mae(pred1)
RMSE: 428.3167
MAE: 272.6909
The RMSE and MAE are moderately high for this model, so let’s try the other two and
compare them at the end.
You can also cross-validate (using built-in functions) to further validate these values.
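A minimal call, assuming five folds; the same call validates algo2 and algo3 later in the chapter.
cross_validate(algo1, formated_data, measures=['RMSE', 'MAE'], cv=5, verbose=True)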
The cross-validation shows that the average RMSE is 427.774, and MAE is
approximately 272.627, which is moderately high.
Implementing Co-Clustering
Co-clustering (also known as bi-clustering) is commonly used in collaborative filtering. It is a data-mining technique that simultaneously clusters the columns and rows of a DataFrame/matrix. It differs from normal clustering, where each object is compared with other objects on a single entity or type of comparison. In co-clustering, two different entities are co-grouped simultaneously for each object as a pairwise interaction.
Let’s try modeling with the co-clustering method.
# model instantiation and fitting (algo2 is assumed to be CoClustering with default parameters)
algo2 = CoClustering()
algo2.fit(train_set)
# model prediction
pred2 = algo2.test(test_set)
Calculate the RMSE and MAE performance metrics using built-in functions.
# RMSE
accuracy.rmse(pred2)
#MAE
accuracy.mae(pred2)
RMSE: 6.7877
MAE: 5.8950
The RMSE and MAE are very low for this model. Until now, this has performed the
best (better than NMF).
Cross-validate (using built-in functions) to further validate these values.
Figure 5-11 shows the cross-validation output for co-clustering.
The cross-validation shows that the average RMSE is 14.031, and MAE is
approximately 6.135, which is quite low.
Implementing SVD
Singular value decomposition is a linear algebra concept generally used as a
dimensionality reduction method. It is also a type of matrix factorization. It works
similarly in collaborative filtering, where a matrix with rows and columns as users and
items is reduced further into latent feature matrixes. An error equation is minimized to
get to the prediction.
Let’s try modeling using the SVD method.
# model instantiation and fitting (algo3 is assumed to be SVD with default parameters)
algo3 = SVD()
algo3.fit(train_set)
# model prediction
pred3 = algo3.test(test_set)
Calculate the RMSE and MAE performance metrics using built-in functions.
# RMSE
accuracy.rmse(pred3)
#MAE
accuracy.mae(pred3)
The following is the output.
RMSE: 4827.6830
MAE: 4815.8341
The RMSE and MAE are significantly high for this model. Until now, this has
performed the worst (worse than NMF and co-clustering).
Cross-validate (using built-in functions) to further validate these values.
The cross-validation shows that the average RMSE is 4831.928 and MAE is
approximately 4821.549, which is very high.
For further validation, let's pick an item-customer pair and compute the actual total quantity ordered.
data1[(data1['StockCode']=='47590B') & (data1['CustomerID']==15738)].Quantity.sum()
78
Let’s get the prediction for the same combination to see the estimation or prediction.
algo2.test([['47590B',15738,78]])
The following is the output.
The predicted value given by the model is 133.01, while the actual was 78. It is reasonably close to the actual, which further validates the model's performance.
The predictions are from the co-clustering model.
pred2
Now let's use these predictions to find the best and the worst predictions. First, load the final output into a DataFrame, and then add important information, like the number of item orders and customer orders for each record, using the following functions.
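The predictions from surprise are (uid, iid, r_ui, est, details) tuples; in this dataset, the item occupies the uid slot (see the test call above). A sketch of loading them into a DataFrame (the column names are assumptions):
predictions_data = pd.DataFrame(pred2, columns=['item_id', 'customer_id', 'quantity', 'estimate', 'details'])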
def get_item_orders(item_id):
    try:
        # for an item, return the no. of orders made
        # (items occupy the "user" slot in this dataset)
        return len(train_set.ur[train_set.to_inner_uid(item_id)])
    except ValueError:
        # item not present in training
        return 0

def get_customer_orders(customer_id):
    try:
        # for a customer, return the no. of orders made
        return len(train_set.ir[train_set.to_inner_iid(customer_id)])
    except ValueError:
        # customer not present in training
        return 0

predictions_data['item_orders'] = predictions_data.item_id.apply(get_item_orders)
predictions_data['customer_orders'] = predictions_data.customer_id.apply(get_customer_orders)
Calculate the error component to get the best and worst predictions.
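A minimal sketch, assuming the error is the absolute difference between the estimate and the actual quantity.
predictions_data['error'] = abs(predictions_data['estimate'] - predictions_data['quantity'])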
best_predictions = predictions_data.sort_values(by='error')[:10]
best_predictions
worst_predictions = predictions_data.sort_values(by='error')[-10:]
worst_predictions
Figure 5-15 shows the worst predictions.
You can now use the predictions data to get to the recommendations. First, find the
customers that have bought the same items as a given user, and then from the other
items they have bought, to fetch the top items and recommend them.
Let’s again use customer 12347 and create a list of the items this user bought.
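A sketch of building that list from the predictions data (the exact code is not shown in the text).
item_list = predictions_data[predictions_data['customer_id'] == 12347]['item_id'].unique().tolist()
item_list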
['82494L',
'84970S',
'47599A',
'84997B',
'85123A',
'84997C',
'85049A']
Get the list of customers who bought the same items as user 12347.
# Getting list of unique customers who also bought same items (item_list)
customer_list = predictions_data[predictions_data['item_id'].isin(item_list)]['customer_id'].values
customer_list = np.unique(customer_list).tolist()
customer_list
[12347,
12362,
12370,
12378,
...,
12415,
12417,
12428]
Now let’s filter these customers (customer_list) from predictions data, remove the
items already bought, and recommend the top items (prediction).
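A sketch of these steps, assuming the predictions_data columns defined earlier.
# filter to the similar customers and drop items user 12347 already bought
recommend_df = predictions_data[predictions_data['customer_id'].isin(customer_list)]
recommend_df = recommend_df[~recommend_df['item_id'].isin(item_list)]
# rank the remaining items by predicted quantity and keep the top ten
recommend_df.sort_values('estimate', ascending=False)['item_id'].unique()[:10].tolist()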
['16156S',
'85049E',
'47504K',
'85099C',
'85049G',
'85014B',
'72351B',
'84536A',
'48173C',
'47590A']
Summary
This chapter continued the discussion of collaborative filtering-based recommendation
engines. Popular methods like matrix factorization, SVD, and co-clustering were
explored with a focus on implementing all three models. For the given data, the
co-clustering method performed the best, but you need to try all the different
methods available to see which best fits your data and use case in building a
recommendation system.
CHAPTER 6
Hybrid Recommender Systems
The previous chapters implemented recommendation engines using content-based and collaborative filtering methods. Each method has its pros and cons. Collaborative filtering suffers from cold start: when there is a new customer or item in the data, recommendations aren't possible. Content-based filtering tends to recommend items similar to those purchased or liked before, which becomes repetitive; there is no personalization effect in this case. Hybrid recommender systems combine both approaches to overcome these shortfalls.
Figure 6-1 explains hybrid recommendation systems.
Reference: https://www.researchgate.net/profile/Xiangjie-Kong-2/publication/330077673/figure/fig5/AS:710433577107459@1546391972632/A-hybrid-paper-recommendation-system.png
Implementation
Let’s import all the required libraries.
import pandas as pd
import numpy as np
from scipy.sparse import coo_matrix # for constructing sparse matrix
from lightfm import LightFM # for model
from lightfm.evaluation import auc_score
import time
import sklearn
from sklearn import model_selection
Data Collection
This chapter uses the same custom e-commerce dataset used in previous chapters. It can
be found at github.com/apress/applied-recommender-systems-python.
The following reads the data.
#orders data
order_df = pd.read_excel('Rec_sys_data.xlsx','order')
#customers data
customer_df = pd.read_excel('Rec_sys_data.xlsx','customer')
#products data
product_df = pd.read_excel('Rec_sys_data.xlsx','product')
order_df.head()
customer_df.head()
product_df.head()
Data Preparation
Before building the recommendation model, the required data must be in the proper
format so that the model can take input. Let’s get the user-to-product interaction matrix
and product-to-features interaction mappings.
Start with getting the list of unique users and unique products. Write two functions
to get the unique lists.
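The exact helpers are not shown in the text; a minimal sketch follows, assuming the orders hold CustomerID and the products hold Product Name.
def get_user_list(df, column):
    # unique, sorted user ids
    return np.sort(df[column].unique())

def get_item_list(df, column):
    # unique product names
    return df[column].unique()

user_list = get_user_list(order_df, 'CustomerID')
item_list = get_item_list(product_df, 'Product Name')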
user_list
item_list
array(['Ganma Superheroes Ordinary Life Case For Samsung Galaxy Note 5 Hard Case Cover',
       'Eye Buy Express Prescription Glasses Mens Womens Burgundy Crystal Clear Yellow Rounded Rectangular Reading Glasses Anti Glare grade',
       ...,
       'Mediven Sheer and Soft 15-20 mmHg Thigh w/ Lace Silicone Top Band CT Wheat II - Ankle 8-8.75 inches',
       'Union 3" Female Ports Stainless Steel Pipe Fitting',
       'Auburn Leathercrafters Tuscany Leather Dog Collar',
       '3 1/2"W x 32"D x 36"H Traditional Arts & Crafts Smooth Bracket, Douglas Fir'])
Let’s create a function to get the total list of unique values given three feature names
from a DataFrame. It gets the total unique list for three features: Customer Segment, Age,
and Gender.
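A minimal sketch of the helper.
def features_to_add(customer, column1, column2, column3):
    # concatenate the unique values of the three feature columns into one list
    customer1 = customer[column1].unique()
    customer2 = customer[column2].unique()
    customer3 = customer[column3].unique()
    return np.concatenate([customer1, customer2, customer3], axis=0)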
feature_unique_list = features_to_add(customer_df, 'Customer Segment', "Age", "Gender")
feature_unique_list
Now that we have the unique list for users, products, and features, we need to create
ID mappings to convert user_id, item_id, and feature_id into integer indices because
LightFM can’t read any other data types.
Let’s write a function for that.
def mapping(user_list, item_list, feature_unique_list):
    # create id mappings to convert user_id
    user_to_index_mapping = {}
    index_to_user_mapping = {}
    for user_index, user_id in enumerate(user_list):
        user_to_index_mapping[user_id] = user_index
        index_to_user_mapping[user_index] = user_id
    # create id mappings to convert item_id
    item_to_index_mapping = {}
    index_to_item_mapping = {}
    for item_index, item_id in enumerate(item_list):
        item_to_index_mapping[item_id] = item_index
        index_to_item_mapping[item_index] = item_id
    # create id mappings to convert feature_id
    feature_to_index_mapping = {}
    index_to_feature_mapping = {}
    for feature_index, feature_id in enumerate(feature_unique_list):
        feature_to_index_mapping[feature_id] = feature_index
        index_to_feature_mapping[feature_index] = feature_id
    return (user_to_index_mapping, index_to_user_mapping,
            item_to_index_mapping, index_to_item_mapping,
            feature_to_index_mapping, index_to_feature_mapping)
user_to_index_mapping, index_to_user_mapping, \
item_to_index_mapping, index_to_item_mapping, \
feature_to_index_mapping, index_to_feature_mapping = mapping(user_list, item_list, feature_unique_list)

user_to_index_mapping
{12346: 0,
12347: 1,
12348: 2,
12350: 3,
12352: 4,
...}
Now let’s fetch the user-to-product relationship and calculate the total quantity
per user.
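A sketch of that aggregation, assuming the order sheet's column names.
user_to_product = order_df[['CustomerID', 'Product Name', 'Quantity']].groupby(['CustomerID', 'Product Name']).sum().reset_index()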
Let’s split the user-to-product relationship into train and test data.
user_to_product_train, user_to_product_test = model_selection.train_test_split(user_to_product, test_size=0.33, random_state=42)
Now that the data and the ID mappings are in place, to get the user-to-product
and product-to-features interaction matrix, let’s first create a function that returns the
interaction matrix.
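A minimal sketch of the helper: it maps the raw ids to integer indices and builds a sparse COO matrix.
def interactions(data, row_name, col_name, value_name, row_map, col_map):
    # convert ids to integer indices
    row_indices = data[row_name].apply(lambda x: row_map[x]).values
    col_indices = data[col_name].apply(lambda x: col_map[x]).values
    values = data[value_name].values
    return coo_matrix((values, (row_indices, col_indices)), shape=(len(row_map), len(col_map)))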
Then let’s generate user_item_interaction_matrix for train and test data using the
preceding function.
#for train
user_to_product_interaction_train = interactions(user_to_product_train, "CustomerID", "Product Name", "Quantity", user_to_index_mapping, item_to_index_mapping)
#for test
user_to_product_interaction_test = interactions(user_to_product_test, "CustomerID", "Product Name", "Quantity", user_to_index_mapping, item_to_index_mapping)
print(user_to_product_interaction_train)
(2124, 230) 10
(1060, 268) 16
: :
(64, 8) 24
(3406, 109) 1
(3219, 12) 12
Model Building
The data is in the correct format, so let’s begin the modeling process. This chapter uses
the LightFM model, which can incorporate user and item metadata to form robust
hybrid recommendation models.
Let’s try multiple models and then choose the one with the best performance. These
models have different hyperparameters, so this is part of the hyperparameter tuning
stage of modeling.
The loss function used in the model is one of the parameters to tune. The three
values are warp, logistic, and bpr.
Let’s start the model-building experiment.
Attempt 1 is loss = warp, epochs = 1, and num_threads = 4.
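The model object is not instantiated in the text; a hedged sketch (product_to_feature_interaction is assumed to be built with the same interactions helper).
model_with_features = LightFM(loss="warp")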
start = time.time()
#===================
# fitting the model with hybrid collaborative filtering + content-based (product + features)
model_with_features.fit_partial(user_to_product_interaction_train,
user_features=None,
item_features=product_to_feature_interaction,
sample_weight=None,
epochs=1,
num_threads=4,
verbose=False)
#===================
end = time.time()
print("time taken = {0:.{1}f} seconds".format(end - start, 2))
Calculate the area under the curve (AUC) score for validation.
start = time.time()
#===================
# Getting the AUC score using in-built function
auc_with_features = auc_score(model = model_with_features,
                              test_interactions = user_to_product_interaction_test,
                              train_interactions = user_to_product_interaction_train,
                              item_features = product_to_feature_interaction,
                              num_threads = 4, check_intersections=False)
#===================
end = time.time()
print("time taken = {0:.{1}f} seconds".format(end - start, 2))
The subsequent attempts repeat the same fit-and-evaluation pattern with different hyperparameters (the loss function and, in the final attempt, the epochs and thread count).
start = time.time()
#===================
# fitting the model with hybrid collaborative filtering + content-based (product + features)
model_with_features.fit_partial(user_to_product_interaction_train,
user_features=None,
item_features=product_to_feature_interaction,
sample_weight=None,
epochs=1,
num_threads=4,
verbose=False)
#===================
end = time.time()
print("time taken = {0:.{1}f} seconds".format(end - start, 2))
start = time.time()
#===================
# Getting the AUC score using in-built function
auc_with_features = auc_score(model = model_with_features,
                              test_interactions = user_to_product_interaction_test,
                              train_interactions = user_to_product_interaction_train,
                              item_features = product_to_feature_interaction,
                              num_threads = 4, check_intersections=False)
#===================
end = time.time()
print("time taken = {0:.{1}f} seconds".format(end - start, 2))
start = time.time()
#===================
# fitting the model with hybrid collaborative filtering + content-based (product + features)
model_with_features.fit_partial(user_to_product_interaction_train,
user_features=None,
item_features=product_to_feature_interaction,
sample_weight=None,
epochs=1,
num_threads=4,
verbose=False)
#===================
end = time.time()
print("time taken = {0:.{1}f} seconds".format(end - start, 2))
start = time.time()
#===================
# Getting the AUC score using in-built function
auc_with_features = auc_score(model = model_with_features,
                              test_interactions = user_to_product_interaction_test,
                              train_interactions = user_to_product_interaction_train,
                              item_features = product_to_feature_interaction,
                              num_threads = 4, check_intersections=False)
#===================
end = time.time()
print("time taken = {0:.{1}f} seconds".format(end - start, 2))
start = time.time()
#===================
# fitting the model with hybrid collaborative filtering + content-based (product + features)
model_with_features.fit_partial(user_to_product_interaction_train,
user_features=None,
item_features=product_to_feature_interaction,
sample_weight=None,
epochs=10,
num_threads=20,
verbose=False)
#===================
end = time.time()
print("time taken = {0:.{1}f} seconds".format(end - start, 2))
time taken = 0.77 seconds
start = time.time()
#===================
# Getting the AUC score using in-built function
auc_with_features = auc_score(model = model_with_features,
                              test_interactions = user_to_product_interaction_test,
                              train_interactions = user_to_product_interaction_train,
                              item_features = product_to_feature_interaction,
num_threads = 4, check_intersections=False)
#===================
end = time.time()
print("time taken = {0:.{1}f} seconds".format(end - start, 2))
The last model (logistic) performed the best overall (highest AUC score). Let’s merge
the train and test and do a final training by using the parameters from the logistic model,
which gave 0.89 AUC.
Merge the train and test with the following function.
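Only the tail of the function appears in the text; the head below is a hedged reconstruction that stacks the coordinates and values of the two COO matrices.
def train_test_merge(training_data, testing_data):
    # collect the coordinates and values of both interaction matrices
    row_list = list(training_data.row) + list(testing_data.row)
    col_list = list(training_data.col) + list(testing_data.col)
    data_list = list(training_data.data) + list(testing_data.data)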
    row_list = np.array(row_list)
    col_list = np.array(col_list)
    data_list = np.array(data_list)
    # rebuild a single COO interaction matrix (reconstructed line)
    return coo_matrix((data_list, (row_list, col_list)), shape=training_data.shape)
Call the preceding function to get the final (full) data to build the final model.
user_to_product_interaction = train_test_merge(user_to_product_interaction_train, user_to_product_interaction_test)
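The final model is assumed to use the best-performing loss from the experiments above.
final_model = LightFM(loss="logistic")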
start = time.time()
#===================
#final model fitting
final_model.fit(user_to_product_interaction,
user_features=None,
item_features=product_to_feature_interaction,
sample_weight=None,
epochs=10,
num_threads=20,
verbose=False)
#===================
end = time.time()
print("time taken = {0:.{1}f} seconds".format(end - start, 2))
Getting Recommendations
Now that the hybrid recommendation model is ready, let’s use it to get the
recommendations for a given user.
Let’s write a function for getting those recommendations given a user id as input.
def get_recommendations(model, user, items, user_to_product_interaction_matrix, user2index_map, product_to_feature_interaction_matrix):
    # convert the user id to its integer index (reconstructed lines)
    userindex = user2index_map.get(user, None)
    if userindex is None:
        return None
    users = userindex
    # items already bought by the user (known positives)
    known_positives = items[user_to_product_interaction_matrix.tocsr()[userindex].indices]
    print('User index =', users)
    # score every item for this user and rank them (reconstructed line)
    scores = model.predict(user_ids=users, item_ids=np.arange(len(items)), item_features=product_to_feature_interaction_matrix)
    top_items = items[np.argsort(-scores)]
    print("Known positives:")
    for x in known_positives[:10]:
        print("  %s" % x)
    print("Recommended:")
    for x in top_items[:10]:
        print("  %s" % x)
This function calculates a user’s prediction score (the likelihood to buy) for all items,
and the ten highest scored items are recommended. Let’s print the known positives or
items bought by that user for validation.
Call the following function for a random user (CustomerID 17017) to get
recommendations.
get_recommendations(final_model, 17017, item_list, user_to_product_interaction, user_to_index_mapping, product_to_feature_interaction)
Known positives:
Ganma Superheroes Ordinary Life Case For Samsung Galaxy Note 5 Hard
Case Cover
MightySkins Skin Decal Wrap Compatible with Nintendo Sticker Protective
Cover 100's of Color Options
Mediven Sheer and Soft 15-20 mmHg Thigh w/ Lace Silicone Top Band CT Wheat
II - Ankle 8-8.75 inches
MightySkins Skin Decal Wrap Compatible with OtterBox Sticker Protective
Cover 100's of Color Options
MightySkins Skin Decal Wrap Compatible with DJI Sticker Protective Cover
100's of Color Options
MightySkins Skin Decal Wrap Compatible with Lenovo Sticker Protective Cover
100's of Color Options
Ebe Reading Glasses Mens Womens Tortoise Bold Rectangular Full Frame Anti
Glare grade ckbdp9088
Window Tint Film Chevy (back doors) DIY
Union 3" Female Ports Stainless Steel Pipe Fitting
Ebe Women Reading Glasses Reader Cheaters Anti Reflective Lenses
TR90 ry2209
Recommended:
Mediven Sheer and Soft 15-20 mmHg Thigh w/ Lace Silicone Top Band CT Wheat
II - Ankle 8-8.75 inches
MightySkins Skin Decal Wrap Compatible with Apple Sticker Protective Cover
100's of Color Options
MightySkins Skin Decal Wrap Compatible with DJI Sticker Protective Cover
100's of Color Options
3 1/2"W x 20"D x 20"H Funston Craftsman Smooth Bracket, Douglas Fir
MightySkins Skin Decal Wrap Compatible with HP Sticker Protective Cover
100's of Color Options
Owlpack Clear Poly Bags with Open End, 1.5 Mil, Perfect for Products,
Merchandise, Goody Bags, Party Favors (4x4 inches)
Ebe Women Reading Glasses Reader Cheaters Anti Reflective Lenses
TR90 ry2209
Handcrafted Ercolano Music Box Featuring "Luncheon of the Boating Party" by
Renoir, Pierre Auguste - New YorkNew York
A6 Invitation Envelopes w/Peel & Press (4 3/4 x 6 1/2) - Baby Blue
(1000 Qty.)
MightySkins Skin Decal Wrap Compatible with Lenovo Sticker Protective Cover
100's of Color Options
Many recommendations align with the known positives. This provides further
validation. This hybrid recommendation engine can now get recommendations for all
other users.
Summary
This chapter discussed hybrid recommendation engines and how they can overcome the
shortfalls of other types of engines. It also showcased the implementation with the help
of LightFM.
CHAPTER 7
Clustering-Based Recommender Systems
Recommender systems based on unsupervised machine learning algorithms are
very popular because they overcome many challenges that collaborative, hybrid, and
classification-based systems face. A clustering technique is used to recommend the
products/items based on the patterns and behaviors captured within each segment/
cluster. This technique is good when data is limited, and there is no labeled data to
work with.
Unsupervised learning is a machine learning category in which labeled data is not leveraged, but inferences are still discovered from the data at hand. The goal is to find patterns without a dependent variable and use them to solve business problems. Figure 7-1 shows the clustering outcome.
Grouping similar things into segments is called clustering; here, "things" are not single data points but collections of observations.
Two clustering algorithms are widely used in the industry. Before getting into the project, let's briefly examine how they work.
Approach
The following basic steps build a model based on similar users’ recommendations.
1. Data collection
2. Data preprocessing
3. Exploratory data analysis
4. Model building
5. Recommendations
Implementation
Let’s install and import the required libraries.
import pandas as pd
import numpy as np
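The rest of the chapter also relies on the following libraries and data reads (a hedged list inferred from the code that follows).
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.preprocessing import LabelEncoder
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

#orders, customers, and products data (same workbook as in earlier chapters)
df_order = pd.read_excel('Rec_sys_data.xlsx', 'order')
df_customer = pd.read_excel('Rec_sys_data.xlsx', 'customer')
df_product = pd.read_excel('Rec_sys_data.xlsx', 'product')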
Figure 7-3 shows the output of the first five rows of records data.
Figure 7-4 shows the output of the first five rows of customer data.
Figure 7-5 shows the output of the first five rows of product data.
The key insight from this chart is that data is not biased based on gender.
Let’s create buckets of age columns and plot them against the number of customers.
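The bucket variables are assumed to be defined as follows (hypothetical ranges).
age18_25 = df_customer.Age[(df_customer.Age >= 18) & (df_customer.Age <= 25)]
age26_35 = df_customer.Age[(df_customer.Age >= 26) & (df_customer.Age <= 35)]
age36_45 = df_customer.Age[(df_customer.Age >= 36) & (df_customer.Age <= 45)]
age46_55 = df_customer.Age[(df_customer.Age >= 46) & (df_customer.Age <= 55)]
age55above = df_customer.Age[df_customer.Age > 55]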
x = ["18-25","26-35","36-45","46-55","55+"]
y = [len(age18_25.values), len(age26_35.values), len(age36_45.values), len(age46_55.values), len(age55above.values)]
plt.figure(figsize=(15,6))
sns.barplot(x=x, y=y, palette="rocket")
plt.title("Number of Customer and Ages")
plt.xlabel("Age")
plt.ylabel("Number of Customer")
plt.show()
Figure 7-9 shows the age column buckets plotted against the number of customers.
This analysis shows that there are fewer customers ages 18 to 25.
Label Encoding
Let’s encode all categorical variables.
gender_encoder = LabelEncoder()
segment_encoder = LabelEncoder()
income_encoder = LabelEncoder()
df_customer['age'] = df_customer.Age
df_customer['gender'] = gender_encoder.fit_transform(df_customer['Gender'])
df_customer['customer_segment'] = segment_encoder.fit_transform(df_customer['Customer Segment'])
df_customer['income_segment'] = income_encoder.fit_transform(df_customer['Income'])
print("gender_encoder", df_customer['gender'].unique())
print("segment_encoder", df_customer['customer_segment'].unique())
print("income_encoder", df_customer['income_segment'].unique())
gender_encoder [1 0]
segment_encoder [2 0 1]
income_encoder [0 1 2]
df_customer.iloc[:,6:]
Figure 7-10 shows the output of the DataFrame after encoding the values.
Model Building
This phase builds clusters using k-means clustering. To define an optimal number of
clusters, you can also consider the elbow method or the dendrogram method.
K-Means Clustering
k-means clustering is an efficient and widely used technique that groups the data based
on the distance between the points. Its objective is to minimize total variance within the
cluster, as shown in Figure 7-11.
The distance between points can be measured with several metrics.
• Euclidean distance
• Manhattan distance
• Cosine distance
• Hamming distance
The algorithm works as follows.
1. Choose the number of clusters, k, and initialize k cluster centroids.
2. Assign each data point to its nearest centroid.
3. Recompute each centroid as the mean of the points assigned to it.
4. Reassign every point to the nearest new centroid.
Repeat steps 2, 3, and 4 until the same points are assigned to each cluster, and the cluster centroids are stabilized.
Hierarchical Clustering
Hierarchical clustering is another type of clustering technique that also uses distance to
create the groups. The following steps generate clusters.
1. Treat each data point as an individual cluster at the start.
2. Compute the distance between every pair of clusters and find the two most similar clusters.
3. Combine these two most similar clusters to form one cluster.
4. Repeat until all the clusters are merged and form a final single cluster.
Usually, the distance between two clusters is computed using Euclidean distance, though many other distance metrics can be leveraged. Let's build a k-means model for this use case. Before building the model, let's run the elbow method and the dendrogram method to find the optimal number of clusters.
The following is an elbow method implementation.
# Elbow method
wcss = []
for k in range(1,15):
kmeans = KMeans(n_clusters=k, init="k-means++")
kmeans.fit(df_customer.iloc[:,6:])
wcss.append(kmeans.inertia_)
plt.figure(figsize=(12,6))
plt.grid()
plt.plot(range(1,15),wcss, linewidth=2, color="red", marker ="8")
plt.xlabel("K Value")
plt.xticks(np.arange(1,15,1))
plt.ylabel("WCSS")
plt.show()
# Dendrogram method: fit agglomerative clustering, then build the linkage matrix
model = AgglomerativeClustering(distance_threshold=0, n_clusters=None)
model = model.fit(df_customer.iloc[:,6:])
# 'counts' (the number of leaves under each node) is assumed to be computed
# as in the scikit-learn plot_dendrogram example
linkage_matrix = np.column_stack(
    [model.children_, model.distances_, counts]
).astype(float)
The optimal or least number of clusters for both methods is two. But let’s consider 15
clusters for this use case.
Note You can consider any number of clusters for implementation, but it should
be greater than the optimal, or the least, number of clusters from k-means
clustering or dendrogram.
# K-means
# Perform k-means with 15 clusters and attach the labels to the customer data
km = KMeans(n_clusters=15)
clusters = km.fit_predict(df_customer.iloc[:,6:])
df_customer['cluster'] = clusters
df_customer
Figure 7-15 shows the output of df_customer after creating the clusters.
df_customer
Figure 7-16 shows the output of df_customer after selecting particular columns.
# 'g' is assumed to be a seaborn catplot of customer segment versus cluster created above
# Loop through each bar in the graph and add the percentage value
for p in g.ax.patches:
    txt = str(p.get_height().round(1)) + '%'
    txt_x = p.get_x()
    txt_y = p.get_height()
    g.ax.text(txt_x, txt_y, txt)
Figure 7-17 shows the plot for the customer segment against clusters.
Let’s plot a chart that gives the average age per cluster.
df_customer.groupby('cluster').Age.mean().plot(kind='bar')
Figure 7-20 shows the plot for average age per cluster.
Until now, all the data preprocessing, EDA, and model building have been
performed on customer data.
Next, join customer data with the order data to get the product ID for each record.
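A sketch of the join, assuming the order sheet was read as df_order.
order_cluster_mapping = pd.merge(df_order[['CustomerID', 'StockCode']], df_customer[['CustomerID', 'cluster']], on='CustomerID', how='inner')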
order_cluster_mapping
Figure 7-21 shows the output after merging customer data with order data.
Now, let’s create score_df using groupby on 'cluster' and 'StockCode', and count it.
score_df = order_cluster_mapping.groupby(['cluster','StockCode']).count().reset_index()
score_df = score_df.rename(columns={'CustomerID':'Score'})
score_df
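The helper is not defined in the text; a minimal sketch follows.
def missing_zero_values_table(df):
    # count zeros and missing values per column
    zero_val = (df == 0).sum()
    miss_val = df.isnull().sum()
    miss_pct = 100 * miss_val / len(df)
    table = pd.concat([zero_val, miss_val, miss_pct], axis=1)
    table.columns = ['Zero Values', 'Missing Values', '% of Total Values']
    return table[table['Missing Values'] > 0]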
missing_zero_values_table(df_product)
So, there are discrepancies present in the product data. Let’s clean it and
check again.
df_product = df_product.dropna()
missing_zero_values_table(df_product)
Let’s work on the Description column since we’re dealing with similar items.
The Description column contains text, so preprocessing and converting text to
features are required.
df_product['Description'] = df_product['Description'].replace({"[^A-Za-z0-9
]+": ""}, regex=True)
vectorizer = TfidfVectorizer(stop_words='english')
X = vectorizer.fit_transform(df_product['Description'])
The text preprocessing and text-to-features conversion are done. Now let’s build a
k-means model using 15 clusters.
km_des = KMeans(n_clusters=15,init='k-means++')
clusters = km_des.fit_predict(X)
df_product['cluster'] = clusters
df_product
Figure 7-25 shows the output after creating clusters for product data.
def cosine_similarity_T(df, query):
    # vectorize the candidate descriptions and the query with the fitted TF-IDF vectorizer (reconstructed lines)
    vec_train = vectorizer.transform(df['Description'])
    vec_query = vectorizer.transform([query])
    within_cosine_similarity = []
    for i in range(vec_train.shape[0]):
        within_cosine_similarity.append(cosine_similarity(vec_train[i,:].toarray(), vec_query.toarray())[0][0])
    df['Similarity'] = within_cosine_similarity
    return df
def recommend_product(customer_id):
    # filter scores for the cluster this customer belongs to
    cluster_score_df = score_df[score_df.cluster == order_cluster_mapping[order_cluster_mapping.CustomerID == customer_id]['cluster'].iloc[0]]
    # top-scoring items in that cluster (reconstructed line)
    top_clusters = cluster_score_df.nlargest(5, 'Score')
    # orders made by this customer (reconstructed line)
    cust_orders = order_cluster_mapping[order_cluster_mapping.CustomerID == customer_id]
    top_orders = cust_orders.groupby(['StockCode']).count().reset_index()
    top_orders = top_orders.rename(columns={'CustomerID': 'Counts'})
    top_orders['CustomerID'] = customer_id
    top_5_bought = top_orders.nlargest(5, 'Counts')
    print(top_5_bought)
    # similar-item candidates: products in the same description cluster as the top item
    df = df_product[df_product['cluster'] == df_product[df_product.StockCode == top_clusters.StockCode.iloc[0]]['cluster'].iloc[0]]
    query = df_product[df_product.StockCode == top_clusters.StockCode.iloc[0]]['Description'].iloc[0]
    print("\nquery\n")
    print(query)
    recommendation = cosine_similarity_T(df, query)
    print(recommendation.nlargest(3, 'Similarity'))
recommend_product(13137)
The first set highlights similar user recommendations. The second set highlights
similar item recommendations.
Summary
In this chapter, you learned how to build a recommendation engine using an unsupervised ML algorithm: clustering. Customer and order data were used to recommend products/items based on similar users, and the product data was used to recommend products/items using similar items.
CHAPTER 8
Classification Algorithm–Based Recommender Systems
A classification algorithm-based recommender system is also known as the buying
propensity model. The goal here is to predict the propensity of customers to buy a
product using historical behavior and purchases.
The more accurately you predict future purchases, the better the recommendations and, in turn, the sales. This kind of recommender system is often used to drive conversions from users who are likely to purchase with a certain probability. Promotions are offered on those products, enticing users to make a purchase.
Approach
The following basic steps build a classification algorithm-based recommender engine.
1. Data collection
2. Data preprocessing and cleaning
3. Feature engineering
4. Exploratory data analysis
5. Model building
6. Evaluation
Figure 8-1 shows the steps for building a classification algorithm-based model.
Implementation
Let’s install and import the required libraries.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import Image
import os
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score,confusion_
matrix,classification_report
from sklearn.linear_model import LogisticRegression
from imblearn.combine import SMOTETomek
from collections import Counter
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier
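The three data sheets are read as in earlier chapters (variable names assumed).
record_df = pd.read_excel('Rec_sys_data.xlsx', 'order')
customer_df = pd.read_excel('Rec_sys_data.xlsx', 'customer')
prod_df = pd.read_excel('Rec_sys_data.xlsx', 'product')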
Figure 8-2 shows the output of the first five rows of records data.
Figure 8-3 shows the output of the first five rows of customer data.
Figure 8-4 shows the output of the first five rows of product data.
Figure 8-5 shows the output of grouping by stock code and customer ID and summing the quantity.
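A sketch of that aggregation.
group = record_df.groupby(['StockCode', 'CustomerID'])['Quantity'].sum().reset_index()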
Next, check for null values for customers and records datasets.
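A minimal check (variable names assumed).
print(record_df.isnull().sum())
print('--------------')
print(customer_df.isnull().sum())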
InvoiceNo 0
StockCode 0
Quantity 0
InvoiceDate 0
DeliveryDate 0
Discount% 0
ShipMode 0
ShippingCost 0
CustomerID 0
dtype: int64
--------------
CustomerID 0
Gender 0
Age 0
Income 0
Zipcode 0
Customer Segment 0
dtype: int64
There are no null values present in the datasets. So, dropping or treating them is not
required.
Let’s load CustomerID and StockCode into different variables and create a cross-
product for further usage.
Now, let's merge 'group' and 'a' on 'CustomerID' and 'StockCode'.
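A sketch of both steps; the variable names 'a' and 'final_data' are assumptions.
# cross product of every customer with every stock code
customer_ids = customer_df['CustomerID'].unique()
stock_codes = record_df['StockCode'].unique()
a = pd.MultiIndex.from_product([customer_ids, stock_codes], names=['CustomerID', 'StockCode']).to_frame(index=False)
# left-join the summed purchase quantities onto the cross product
final_data = pd.merge(a, group, on=['CustomerID', 'StockCode'], how='left')
final_data.shape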
As you can see, null values are present in the Quantity column.
Let’s check for nulls.
779771
(810000, 3)
Let’s treat missing values by replacing null with zeros and checking for
unique values.
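A minimal sketch.
final_data['Quantity'] = final_data['Quantity'].fillna(0)
final_data['Quantity'].unique()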
Let’s extract the first hierarchy level from the category column and join the product_
data table.
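A hedged sketch, assuming the category strings are '>'-separated hierarchies.
product_data = prod_df[['StockCode', 'Brand', 'Unit Price', 'Category']].copy()
product_data['Category'] = product_data['Category'].astype(str).str.split('>').str[0].str.strip()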
StockCode 0
Brand 0
Unit Price 0
Category 0
dtype: int64
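A sketch of the joins; the column subsets are assumed from the output below.
final_data1 = pd.merge(final_data, customer_df[['CustomerID', 'Gender', 'Age', 'Income', 'Customer Segment']], on='CustomerID', how='inner')
final_data1 = pd.merge(final_data1, product_data, on='StockCode', how='inner')
final_data1.head()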
Figure 8-11 shows the output of the first five rows after merging.
print(final_data1.shape)
# Check for null values in each columns
final_data1.isnull().sum()
(61200, 10)
CustomerID 0
Gender 0
Age 0
Income 0
Customer Segment 0
StockCode 0
Quantity 0
Brand 0
Unit Price 0
Category 0
dtype: int64
print(final_data1['Category'].unique())
print('------------\n')
print(final_data1['Income'].unique())
print('------------\n')
print(final_data1['Brand'].unique())
print('------------\n')
print(final_data1['Customer Segment'].unique())
print('------------\n')
print(final_data1['Gender'].unique())
print('------------\n')
print(final_data1['Quantity'].unique())
['male' 'female']
------------
[ 0 1 3 5 15 2 4 8 6 24 7 30 9 10
62 20 18 12 72 50 400 36 27 242 58 25 60 48
22 148 16 152 11 31 64 147 42 23 43 26 14 21
1200 500 28 112 90 128 44 200 34 96 140 19 160 17
100 320 370 300 350 32 78 101 66 29]
From this output, you can see some special characters in the brand column. Let’s
remove them.
## text cleaning: remove special characters from the brand column
final_data1['Brand'] = final_data1['Brand'].str.replace('?', '', regex=False)
final_data1['Brand'] = final_data1['Brand'].str.replace('&', 'and', regex=False)
final_data1['Brand'] = final_data1['Brand'].str.replace('(', '', regex=False)
final_data1['Brand'] = final_data1['Brand'].str.replace(')', '', regex=False)
print(final_data1['Brand'].unique())
All the datasets have been merged, and the required data preprocessing and cleaning are complete.
Feature Engineering
Once the data is preprocessed and cleaned, the next step is to perform feature engineering.
Let’s create a flag column, using the Quantity column, that indicates whether the
customer has bought the product or not.
If the Quantity column is 0, the customer has not bought the product.
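A minimal sketch of the flag.
final_data1['flag_buy'] = np.where(final_data1['Quantity'] > 0, 1, 0)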
Figure 8-12 shows the first five rows' output after creating the target column.
A new flag_buy column is created. Let’s do some basic exploration of that column.
final_data1['flag_buy'].unique()
array([0, 1])
final_data1.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 61200 entries, 0 to 61199
Data columns (total 11 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 CustomerID 61200 non-null int64
1 Gender 61200 non-null object
2 Age 61200 non-null int64
3 Income 61200 non-null object
4 Customer Segment 61200 non-null object
5 StockCode 61200 non-null object
6 Quantity 61200 non-null int64
7 Brand 61200 non-null object
8 Unit Price 61200 non-null float64
9 Category 61200 non-null object
10 flag_buy 61200 non-null int64
dtypes: float64(1), int64(4), object(6)
memory usage: 5.6+ MB
Exploratory Data Analysis
You can get more business insights by looking at the historical data itself. Let's start exploring the data by plotting a chart of the brand column.
plt.figure(figsize=(50,20))
sns.set_theme(style="darkgrid")
sns.countplot(x = 'Brand', data = final_data1)
The key insight from this chart is that the Mightyskins brand has the highest sales.
Let’s plot the Income column.
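A hedged sketch of the chart.
plt.figure(figsize=(10,5))
sns.set_theme(style="darkgrid")
sns.countplot(x='Income', data=final_data1, hue='flag_buy')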
The key takeaway insight from this chart is that low-income customers are buying
more products. However, there is not a major difference between medium and high-
income customers.
Only a few charts are shown here; for more information, please refer to the notebook. Plot a histogram to show the age distribution.
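A minimal sketch.
plt.figure(figsize=(10,5))
sns.histplot(data=final_data1, x="Age")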
plt.figure(figsize=(10,5))
sns.set_theme(style="darkgrid")
sns.histplot(data=final_data1, x="Age", hue="Category", element= "poly")
It looks like this particular use case has a data imbalance. Let’s build the model after
sampling the data.
Model Building
Let’s encode all the categorical variables before building the model. Also, store the stock
code for further usage.
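A hedged sketch of the encoding loop; 'mappings' stores each column's value-to-code dictionary so the stock codes can be decoded later.
mappings = {}
for col in ['Gender', 'Income', 'Customer Segment', 'StockCode', 'Brand', 'Category']:
    le = preprocessing.LabelEncoder()
    final_data1[col] = le.fit_transform(final_data1[col])
    mappings[col] = dict(zip(le.classes_, le.transform(le.classes_)))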
Train-Test Split
The data is split into two parts: one for training the model, which is the training set, and
another for evaluating the model, which is the test set. The train_test_split library from
sklearn.model_selection is imported to split the DataFrame into two parts.
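A sketch of the split; features are assumed to exclude the target and the raw Quantity (which the flag is derived from), and the 60/40 ratio matches the 24,480 test rows reported below.
X = final_data1.drop(['flag_buy', 'Quantity'], axis=1)
y = final_data1['flag_buy']
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=0)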
Logistic Regression
Linear regression is needed to predict a numerical value. But you also encounter classification problems where the dependent variable is binary, like yes or no, 1 or 0, true or false, and so on. In that case, logistic regression is needed. It is a classification algorithm and an extension of linear regression, where log odds are used to restrict the dependent variable between 0 and 1.
Figure 8-20 shows the logistic regression formula.
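In its simplest form, the log odds are modeled as a linear function, log(p / (1 - p)) = b0 + b1*x, which is equivalent to p = 1 / (1 + exp(-(b0 + b1*x))).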
Linear and logistic regression are the traditional way of using statistics as a base to predict the dependent variable. But these algorithms have a few drawbacks.
• They face challenges when the relationship between the data and the target feature is nonlinear; complex patterns are hard to decode.
Advanced machine learning concepts like decision trees, random forests, SVMs, and neural networks can be used to overcome these limitations.
Implementation
## training using logistic regression
logistic = LogisticRegression()
logistic.fit(x_train, y_train)
# calculate score
pred = logistic.predict(x_test)
print(confusion_matrix(y_test, pred))
print(accuracy_score(y_test, pred))
print(classification_report(y_test, pred))
[[23633 0]
[ 2 845]]
0.9999183006535948
precision recall f1-score support
0 1.00 1.00 1.00 23633
1 1.00 1.00 1.00 847
accuracy 1.00 24480
macro avg 1.00 1.00 1.00 24480
weighted avg 1.00 1.00 1.00 24480
This chapter’s “Exploratory Data Analysis” section discussed the target distribution
and its imbalances. Let’s apply a sampling technique, make it balanced data, and then
build the model.
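A sketch using SMOTETomek (imported earlier) to balance the training data and refit the model.
smk = SMOTETomek(random_state=42)
X_res, y_res = smk.fit_resample(x_train, y_train)
print('Resampled dataset shape {}'.format(Counter(y_res)))
logistic.fit(X_res, y_res)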
# Calculate Score
y_pred=logistic.predict(x_test)
print(confusion_matrix(y_test,y_pred))
print(accuracy_score(y_test,y_pred))
print(classification_report(y_test,y_pred))
[[23633 0]
[ 0 847]]
1.0
precision recall f1-score support
0 1.00 1.00 1.00 23633
1 1.00 1.00 1.00 847
accuracy 1.00 24480
macro avg 1.00 1.00 1.00 24480
weighted avg 1.00 1.00 1.00 24480
Decision Tree
The decision is a type of supervised learning in which the data is split into similar groups
based on the most important variable to the least. It looks like a tree-shaped structure
when all the variables split hence the name tree-based models.
The tree comprises a root node, a decision node, and a leaf node. A decision node
can have two or more branches, and a leaf node represents a decision. Decision trees
handle any type of data, be it quantitative or qualitative. Figure 8-24 shows how the
decision tree works.
Let’s examine how tree splitting happens, which is the key concept in decision trees.
The core of the decision tree algorithm is the process of splitting the tree. It uses different
algorithms to split the node and is different for classification and regression problems.
The following are for classification problems.
• The Gini index is a probabilistic way of splitting the trees. It uses the
sum of the probability square for success and failure and decides the
purity of the nodes. CART (classification and regression tree) uses the
Gini index to create splits.
• Overfitting occurs when an algorithm tightly fits the given training data but is inaccurate in predicting the outcomes of untrained or test data. The same is the case with decision trees: it occurs when the tree is grown to perfectly fit all samples in the training dataset, which hurts accuracy on test data.
Implementation
##Training model using decision tree
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier
dt = DecisionTreeClassifier()
dt.fit(X_res, y_res)
y_pred = dt.predict(x_test)
print(dt.score(x_train, y_train))
print(confusion_matrix(y_test,y_pred))
print(accuracy_score(y_test,y_pred))
print(classification_report(y_test,y_pred))
1.0
[[23633 0]
[ 0 847]]
1.0
precision recall f1-score support
0 1.00 1.00 1.00 23633
1 1.00 1.00 1.00 847
accuracy 1.00 24480
macro avg 1.00 1.00 1.00 24480
weighted avg 1.00 1.00 1.00 24480
Random Forest
Random forest is the most widely used machine learning algorithm because of its
flexibility and ability to overcome the overfitting problem. A random forest is an
ensemble algorithm that is an ensemble of multiple decision trees. The higher the
number of trees, the better the accuracy.
The random forest can perform both classification and regression tasks. The
following are some of its advantages.
• Randomly selects the square root of m features and about two-thirds of the data (a bootstrap sample with replacement) to train each decision tree, then predicts the outcome
• Computes the votes for each predicted target and takes the mode as the final prediction for classification
Implementation
##Training model using Random forest
from sklearn.ensemble import RandomForestClassifier
rf = RandomForestClassifier()
rf.fit(X_res, y_res)
# Calculate Score
y_pred=rf.predict(x_test)
print(confusion_matrix(y_test,y_pred))
print(accuracy_score(y_test,y_pred))
print(classification_report(y_test,y_pred))
[[23633 0]
[ 0 847]]
1.0
precision recall f1-score support
0 1.00 1.00 1.00 23633
1 1.00 1.00 1.00 847
accuracy 1.00 24480
macro avg 1.00 1.00 1.00 24480
weighted avg 1.00 1.00 1.00 24480
KNN
For more information on the algorithm, please refer to Chapter 4.
Implementation
#Training model using KNN
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.neighbors import KNeighborsClassifier
model1 = KNeighborsClassifier(n_neighbors=3)
model1.fit(X_res,y_res)
y_predict = model1.predict(x_test)
# Calculate Score
print(model1.score(x_train, y_train))
print(confusion_matrix(y_test,y_predict))
print(accuracy_score(y_test,y_predict))
print(classification_report(y_test,y_predict))
In the preceding models, the logistic regression performance is better than all
other models.
So, using that model, let’s recommend the products to one customer.
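A hedged sketch: score every row for customer 17315 and keep the predicted buys (the variable recomm_one_cust is reused below).
recomm_one_cust = final_data1[final_data1['CustomerID'] == 17315].copy()
recomm_one_cust['pred'] = logistic.predict(recomm_one_cust[x_train.columns])
recomm_one_cust = recomm_one_cust[recomm_one_cust['pred'] == 1]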
# the StockCode column was encoded to build the model; decode the codes now and recommend
items = []
for item_id in recomm_one_cust['StockCode'].unique().tolist():
prod = {v: k for k, v in mappings['StockCode'].items()}[item_id]
items.append(str(prod))
items
These are the product IDs that should be recommended for customer 17315.
If you want recommendations with product names, filter these IDs in the
product table.
recommendations = []
for i in items:
    recommendations.append(prod_df[prod_df['StockCode'] == i]['Product Name'])
recommendations
You can also do this recommendation using the probability output from the model
by sorting them.
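A sketch of the probability-based ranking.
probs = logistic.predict_proba(recomm_one_cust[x_train.columns])[:, 1]
recomm_one_cust.assign(buy_proba=probs).sort_values('buy_proba', ascending=False).head(10)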
Summary
In this chapter, you learned how to recommend products/items to customers using various classification algorithms, from data cleaning to model building. These recommendations are an add-on to an e-commerce platform. With the output of a classification-based algorithm, you can surface products the user has not yet discovered, and the customer is more likely to be interested in those products/items. The conversion rate of these recommendations is high compared to other recommender techniques.
CHAPTER 9
Deep Learning–Based Recommender System
So far, you have learned various methods for building recommender systems and saw
their implementation in Python. The book began with basic and intuitive methods, like
market basket analysis, arithmetic-based content, and collaborative filtering methods,
and then moved on to more complex machine learning methods, like clustering, matrix
factorizations, and machine learning classification-based methods. This chapter
continues the journey by implementing an end-to-end recommendation system using
advanced deep learning concepts.
Deep learning techniques utilize recent and rapidly growing network architectures
and optimization algorithms to train on large amounts of data and build more expressive
and better-performing models. Graphics Processing Units (GPUs) and deep learning
have been driving advances in recommender systems for the past few years. Due to
their massively parallel architecture, using GPUs for computation provides higher
performance and cost savings. Let’s first explore the basics of deep learning and then
look at the deep learning–based collaborative filtering method (neural collaborative
filtering).
Figure 9-1 shows a neural network, the building blocks of any deep learning
algorithm.
Each node’s output transforms the input based on the weights provided to nodes
and edges, and as you progress through the network, from input to output layer, the
prediction is optimized and refined further by each layer. This is known as forward
propagation. Another important process, known as backpropagation, uses loss
optimization algorithms, such as gradient descent, to calculate and reduce the losses in
prediction by adjusting the weights of the nodes and edges for each layer while moving
backward from the output layer toward the input layer. These two processes work
together to build the final network that gives accurate predictions.
This was a simple explanation of basic neural networks, which are typically the
building blocks of every deep learning algorithm.
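To make forward propagation and backpropagation concrete, here is a toy example (not from the book) of a single weight trained with gradient descent on a squared-error loss:

# Toy network with one weight: forward propagation makes a prediction,
# backpropagation computes the gradient, gradient descent updates the weight
x, y_true = 2.0, 1.0   # one training example
w = 0.1                # initial weight
lr = 0.1               # learning rate
for step in range(3):
    y_pred = w * x                    # forward propagation
    loss = (y_pred - y_true) ** 2     # squared-error loss
    grad = 2 * (y_pred - y_true) * x  # dLoss/dw via the chain rule (backpropagation)
    w -= lr * grad                    # gradient-descent update
    print(step, round(loss, 4), round(w, 4))

Each pass of the loop moves w closer to the value (0.5) that makes the prediction match the target.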
Multilayer perceptron (MLP) is a neural network with multiple layers that are fully
connected (i.e., all nodes of the previous layer are connected to all nodes in the next
layer). Every node in an MLP usually uses a sigmoid function as its activation function.
The sigmoid function takes real values as input and returns a real value between 0
and 1 using this formula: sigmoid(x) = 1 / (1 + exp(–x)), where x is the input. In NCF, the
activation function is the rectified linear unit (ReLU), which returns the input
unchanged if it is positive and outputs 0 if it is negative. The formula for
ReLU is relu(x) = max(0, x), where x is the input.
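Both activation functions are one-liners in Python; this small illustration (not from the book) shows how they behave on sample inputs:

import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))   # squashes any real input into (0, 1)

def relu(x):
    return max(0, x)                # passes positives through, clips negatives to 0

print(sigmoid(0), sigmoid(2))   # 0.5, ~0.88
print(relu(3.5), relu(-3.5))    # 3.5, 0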
Figure 9-3 shows a multilayer perceptron (MLP).
Implementation
The following installs and imports the required libraries.
%load_ext autoreload
%autoreload 2
import sys
import pandas as pd
import tensorflow as tf
Data Collection
Let’s consider an e-commerce dataset. Download the dataset from the GitHub link.
print(record_df.head())
print(customer_df.head())
print(prod_df.head())
Figure 9-4 shows the output of the first five rows of records data.
Figure 9-5 shows the output of the first five rows of customer data.
Figure 9-6 shows the output of the first five rows of product data.
Data Preprocessing
Let’s select the required columns from records_df and drop nulls, if any. Also, dropping
the string item IDs (StockCode) for getting desired input data fa or modeling.
# selecting columns
df = record_df[['CustomerID','StockCode','Quantity','DeliveryDate']]
# coercing the string StockCodes (item ids) to NaN, as NCF only takes integer ids
df["StockCode"] = df["StockCode"].apply(lambda x: pd.to_numeric(x, errors='coerce'))
# dropping nulls
df = df.dropna()
print(df.shape)
df
Figure 9-7 shows the output order data after selecting the required columns.
df = df.rename(columns={
    'CustomerID': "userID", 'StockCode': "itemID",
    'Quantity': "rating", 'DeliveryDate': "timestamp"
})
Next, change the userID and itemID datatypes to integer since that is the
required format for NCF.
df["userID"] = df["userID"].astype(int)
df["itemID"] = df["itemID"].astype(int)
Train-Test Split
The data is split into two parts: one for training the model, which is the training set, and
another for evaluating the model, which is the test set.
Let’s split the data using the Spark chronological splitter provided in the utilities.
Save the train and test data into two separate files, which are later loaded into the
model init function.
train_file = "./train.csv"
test_file = "./test.csv"
train.to_csv(train_file, index=False)
test.to_csv(test_file, index=False)
TOP_K = 10
# Model parameters
EPOCHS = 50
BATCH_SIZE = 256
SEED = 42
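The model definition and training code is elided in this extract. The following is a minimal sketch of the missing steps, assuming the Microsoft recommenders package; the hyperparameters are illustrative, and the Dataset constructor arguments may vary by package version:

from recommenders.models.ncf.ncf_singlenode import NCF
from recommenders.models.ncf.dataset import Dataset as NCFDataset

# load the saved train/test files into the NCF data wrapper
data = NCFDataset(train_file=train_file, test_file=test_file, seed=SEED)

model = NCF(
    n_users=data.n_users,
    n_items=data.n_items,
    model_type="NeuMF",      # combines the MLP and matrix-factorization branches
    n_factors=4,
    layer_sizes=[16, 8, 4],
    n_epochs=EPOCHS,
    batch_size=BATCH_SIZE,
    seed=SEED,
)
model.fit(data)

# score every (user, item) pair from the training data
users, items, preds = [], [], []
item = list(train.itemID.unique())
for user in train.userID.unique():
    user = [user] * len(item)
    users.extend(user)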
    items.extend(item)
    preds.extend(list(model.predict(user, item, is_list=True)))
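The step that assembles all_predictions is also elided; a sketch following the recommenders NCF example (pairs already present in the training data are dropped so that only unseen items are ranked), together with the metric imports used below:

from recommenders.evaluation.python_evaluation import (
    map_at_k, ndcg_at_k, precision_at_k, recall_at_k)

all_predictions = pd.DataFrame(
    data={"userID": users, "itemID": items, "prediction": preds})
merged = pd.merge(train, all_predictions, on=["userID", "itemID"], how="outer")
all_predictions = merged[merged["rating"].isnull()].drop("rating", axis=1)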
# Evaluate model
eval_map = map_at_k(test, all_predictions, col_prediction='prediction', k=TOP_K)
eval_ndcg = ndcg_at_k(test, all_predictions, col_prediction='prediction', k=TOP_K)
eval_precision = precision_at_k(test, all_predictions, col_prediction='prediction', k=TOP_K)
eval_recall = recall_at_k(test, all_predictions, col_prediction='prediction', k=TOP_K)

print("MAP:\t%f" % eval_map,
      "NDCG:\t%f" % eval_ndcg,
      "Precision@K:\t%f" % eval_precision,
      "Recall@K:\t%f" % eval_recall, sep='\n')
MAP:         0.020692
NDCG:        0.064364
Precision@K: 0.047777
Recall@K:    0.051526
# read data
df_order = pd.read_excel('Rec_sys_data.xlsx', 'order')
df_customer = pd.read_excel('Rec_sys_data.xlsx', 'customer')
df_product = pd.read_excel('Rec_sys_data.xlsx', 'product')
The all_predictions object, containing the recommendations given by the model, has
been created. Select the required columns and rename them.
# select columns
all_predictions = all_predictions[['userID','itemID','prediction']]
# rename columns
all_predictions = all_predictions.rename(columns={
    "userID": 'CustomerID', "itemID": 'StockCode',
    "rating": 'Quantity', 'prediction': 'probability'
})
Now let’s write a function to recommend the products by giving the customer ID
as input.
The function uses the all_predictions object to recommend the products.
def recommend_product(customer_id):
    print(df_order[df_order['CustomerID']==customer_id][['CustomerID','StockCode','Quantity']].nlargest(5,'Quantity'))
    top_5_bought = df_order[df_order['CustomerID']==customer_id][['CustomerID','StockCode','Quantity']].nlargest(5,'Quantity')
    print(df_product[df_product.StockCode.isin(top_5_bought.StockCode)]['Product Name'])
    print(all_predictions[all_predictions['CustomerID']==customer_id].nlargest(5,'probability'))
    recommend = all_predictions[all_predictions['CustomerID']==customer_id].nlargest(5,'probability')
    print(df_product[df_product.StockCode.isin(recommend.StockCode)]['Product Name'])
The function prints the following:
• The top five bought stock codes (item IDs) with the product names for a given customer
• The top five recommendations for the same customer from NCF
Let’s use the function to recommend products for customers 13137 and 15127.
recommend_product(13137)
recommend_product(15127)
Summary
This chapter covered deep learning and how deep learning–based recommendation
engines work. You saw this by implementing an end-to-end deep learning–based
recommender system using NCF. Deep learning–based recommender systems are a
relatively new but relevant field that has shown promising results in recent times.
Given sufficient data and computational resources, deep learning–based techniques
can outperform most other techniques, which makes them an important concept to
have in your repertoire.
CHAPTER 10
Graph-Based
Recommender Systems
The previous chapter covered deep learning-based recommender systems and
explained how to implement end-to-end neural collaborative filtering. This chapter
explores another recent advanced method: graph-based recommendation systems
powered by knowledge graphs.
Figure 10-1 illustrates a graph-based recommendation system for movie
recommendations.
A knowledge graph represents information by illustrating the relationships between
multiple entities. The graph structure, when visualized, has three primary
components: nodes, edges, and labels. An edge (or link) defines the relationship
between two nodes/entities, where each node can be any object: a user, an item, a
place, and so forth. The underlying semantics provide additional dynamic context to
the defined relationships, enabling more complex decision-making.
Figure 10-2 shows a simple one-to-one relationship in a knowledge graph structure.
This chapter uses Neo4j to implement the knowledge graphs. Neo4j is one of the
leading graph databases on the market today: a high-performance, highly scalable,
and robust graph store with a user-friendly query language.
The knowledge graph is used to fetch similar users, from whom the recommendations
are derived.
Implementation
The following installs and imports the required libraries.
import pandas as pd
from neo4j import GraphDatabase, basic_auth
from py2neo import Graph
import re
import neo4jupyter
Before establishing the connection between Neo4j and the notebook, create a new
sandbox in Neo4j at https://neo4j.com/sandbox/.
Once the sandbox is created, you must change the URL and the password.
You can find them in the connection details, as shown in Figure 10-4.
driver = GraphDatabase.driver(
    "bolt://44.192.55.13:7687",
    auth=basic_auth("neo4j", "butter-ohms-chairman"))

def execute_transactions(transaction_execution_commands):
    # Establishing connection with the database
    data_base_connection = GraphDatabase.driver(
        "bolt://44.192.55.13:7687",
        auth=basic_auth("neo4j", "butter-ohms-chairman"))
    # Creating a session
    session = data_base_connection.session()
    for i in transaction_execution_commands:
        session.run(i)
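The py2neo handle g used by the later cells is not created in this extract; a minimal sketch, assuming py2neo's Graph constructor and the same sandbox credentials:

# py2neo connection used for running Cypher queries below
g = Graph("bolt://44.192.55.13:7687", auth=("neo4j", "butter-ohms-chairman"))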
# This dataset contains detailed information about each stock, which is used
# to link stockcodes with their description/title
df1 = pd.read_excel('Rec_sys_data.xlsx','product')
df1.head()
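The next loop builds one Cypher CREATE statement per customer. The cell that reads the order data and extracts the unique customer and stock IDs is elided here; a plausible reconstruction (the sheet name mirrors its use elsewhere in the book):

# read the order data and collect the unique ids to create nodes for
df = pd.read_excel('Rec_sys_data.xlsx', 'order')
customerids = df['CustomerID'].unique().tolist()
stockcodes = df['StockCode'].unique().tolist()
create_customers = []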
for i in customerids:
    statement = "create (c:customer{cid:"+ '"' + str(i) + '"' +"})"
    create_customers.append(statement)
execute_transactions(create_customers)
Once the customer nodes are done, create nodes for the stock.
create_stockcodes = []
for i in stockcodes:
    # example of a create statement: "create (m:entity {property_key : 'XYZ'})"
    statement = "create (s:stock{stockcode:"+ '"' + str(i) + '"' +"})"
    create_stockcodes.append(statement)
execute_transactions(create_stockcodes)
Next, create a link between the stock codes and the titles, which is needed to
recommend items.
For this, let's add another property key called 'Title' to the existing stock
entities in the Neo4j database.
# This cell adds all the unique stockcodes along with their titles to df2
stockcodes = df['StockCode'].unique().tolist()
df2 = pd.DataFrame()
for i in range(len(stockcodes)):
    dict_temp = {}
    dict_temp['StockCode'] = stockcodes[i]
    dict_temp['Title'] = df1[df1['StockCode']==stockcodes[i]]['Product Name'].values
    temp_Df = pd.DataFrame([dict_temp])
    df2 = df2.append(temp_Df)
df2 = df2.reset_index(drop=True)
# Doing some data preprocessing so these queries can be run in neo4j
df2['Title'] = df2['Title'].apply(str)
df2['Title'] = df2['Title'].map(lambda x: re.sub(r'\W+', ' ', x))
df2['Title'] = df2['Title'].apply(str)

# This query adds the 'title' property key to each stock entity in the neo4j database
for i in range(len(df2)):
    query = """
    MATCH (s:stock {stockcode:""" + '"' + str(df2['StockCode'][i]) + '"' + """})
    SET s.title = """ + '"' + str(df2['Title'][i]) + '"' + """
    RETURN s.stockcode, s.title
    """
    g.run(query)
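Next, create the bought relationships between customers and stocks. The cell that builds transaction_list is elided; a plausible reconstruction in which each order row becomes a list of its column values (so i[8] below is the 9th column):

transaction_list = df.values.tolist()
relation = []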
for i in transaction_list:
    # the 9th column in df is CustomerID and the 2nd column is StockCode,
    # which are appended into the statement
    statement = """MATCH (a:customer),(b:stock) WHERE a.cid = """ + '"' + str(i[8]) + '"' + """ AND b.stockcode = """ + '"' + str(i[1]) + '"' + """ CREATE (a)-[:bought]->(b) """
    relation.append(statement)
execute_transactions(relation)
Next, let’s find similarities between users using the relationship created.
The Jaccard similarity can be calculated as the ratio between the intersection and the
union of two sets. It is a measure of similarity, and as it is a percentage value, it ranges
between 0% to 100%. More similar sets have a higher value.
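For example, a minimal illustration (not from the book) of the Jaccard index using Python sets:

def jaccard_index(a, b):
    # shared items divided by all distinct items across both sets
    return len(a & b) / len(a | b)

cart1 = {'85123A', '71053', '84406B'}
cart2 = {'71053', '84406B', '21730'}
print(jaccard_index(cart1, cart2))  # 2 shared / 4 distinct = 0.5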
def similar_users(id):
    # This query finds users who have bought stocks in common with the given
    # customer, computes the Jaccard index for each of them, and returns the
    # neighbors sorted by Jaccard index in descending order
    query = """
    MATCH (c1:customer)-[:bought]->(s:stock)<-[:bought]-(c2:customer)
    WHERE c1 <> c2 AND c1.cid = """ + '"' + str(id) + '"' + """
    WITH c1, c2, COUNT(DISTINCT s) as intersection
    MATCH (c:customer)-[:bought]->(s:stock)
    WHERE c in [c1, c2]
    WITH c1, c2, intersection, COUNT(DISTINCT s) as union
    WITH c1, c2, intersection, union, (intersection * 1.0 / union) as jaccard_index
    ORDER BY jaccard_index DESC, c2.cid
    WITH c1, COLLECT([c2.cid, jaccard_index, intersection, union])[0..15] as neighbors
    WHERE SIZE(neighbors) = 15 // return users with enough neighbors
    RETURN c1.cid as customer, neighbors
    """
    neighbors = pd.DataFrame([['CustomerID','JaccardIndex','Intersection','Union']])
    for i in g.run(query).data():
        neighbors = neighbors.append(i["neighbors"])
    return neighbors
similar_users('12347')
similar_users('17975')
def recommend(id):
    # The query below is the same as in the similar_users function;
    # it returns the most similar customers
    query1 = """
    MATCH (c1:customer)-[:bought]->(s:stock)<-[:bought]-(c2:customer)
    WHERE c1 <> c2 AND c1.cid = """ + '"' + str(id) + '"' + """
    WITH c1, c2, COUNT(DISTINCT s) as intersection
    MATCH (c:customer)-[:bought]->(s:stock)
    WHERE c in [c1, c2]
    WITH c1, c2, intersection, COUNT(DISTINCT s) as union
    WITH c1, c2, intersection, union, (intersection * 1.0 / union) as jaccard_index
    ORDER BY jaccard_index DESC, c2.cid
    WITH c1, COLLECT([c2.cid, jaccard_index, intersection, union])[0..15] as neighbors
    WHERE SIZE(neighbors) = 15 // return users with enough neighbors
    RETURN c1.cid as customer, neighbors
    """
    neighbors = pd.DataFrame([['CustomerID','JaccardIndex','Intersection','Union']])
    neighbors_list = {}
    for i in g.run(query1).data():
        neighbors = neighbors.append(i["neighbors"])
        neighbors_list[i["customer"]] = i["neighbors"]
    print(neighbors_list)
    # From the neighbors_list returned, fetch the customer ids of those
    # neighbors to recommend items
    nearest_neighbors = [neighbors_list[id][i][0] for i in range(len(neighbors_list[id]))]

    # The query below fetches all the items bought by the nearest neighbors,
    # removes the items already bought by the target customer, counts how many
    # times each remaining item appears within the shopping carts of the
    # nearest neighbors, and returns that list sorted by count in descending order
    query2 = """
    // get top n recommendations for customer from their nearest neighbors
    MATCH (c1:customer),(neighbor:customer)-[:bought]->(s:stock)
    // all items bought by neighbors
    WHERE c1.cid = """ + '"' + str(id) + '"' + """
    AND neighbor.cid in $nearest_neighbors
    AND not (c1)-[:bought]->(s) // filter for items that our user hasn't bought before
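    // The remainder of query2 is elided in this extract; what follows is a
    // plausible completion based on the comments above: count each remaining
    // item across the neighbors' carts and return the most frequent ones
    WITH s, COUNT(DISTINCT neighbor) as num_neighbors
    ORDER BY num_neighbors DESC
    RETURN s.stockcode as stockcode, s.title as title, num_neighbors
    LIMIT 8
    """
    # run the query with the neighbor ids bound to the $nearest_neighbors parameter
    print("\n---------- Top recommendations for customer " + str(id) + " -----------\n")
    print(pd.DataFrame(g.run(query2, nearest_neighbors=nearest_neighbors).data()))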
    # Also print the items bought earlier by the target customer
    print("\n---------- Top 8 StockCodes bought by customer " + str(id) + " -----------\n")
    print(df[df['CustomerID']==id][['CustomerID','StockCode','Quantity']].nlargest(8,'Quantity'))
    bought = df[df['CustomerID']==id][['CustomerID','StockCode','Quantity']].nlargest(8,'Quantity')
    print((df1[df1.StockCode.isin(bought.StockCode)]['Product Name']).to_string())
The function prints the following:
• The top eight stock codes and product names bought by a particular customer
• The top recommended items for that customer, derived from the nearest neighbors
To summarize, the recommend function does the following:
1. Find the customers most similar to the target customer (the nearest neighbors), ranked by Jaccard index.
2. Fetch all the items bought by the nearest neighbors and remove the items the target customer has already bought.
3. From the filtered set of items, count the number of times each item repeats within the shopping carts of the nearest neighbors, then sort that list by the count of repetitions and return it in descending order.
recommend('17850')
recommend('12347')
Summary
This chapter briefly covered knowledge graphs and how graph-based recommendation
engines work. You saw an actual implementation of an end-to-end graph-based
recommender system using Neo4j knowledge graphs. These concepts are new and
advanced and have recently become popular. Big players like Netflix and Amazon are
shifting to graph-based systems for their recommendations; hence, the approach is
very relevant and worth knowing.
CHAPTER 11
Emerging Areas
and Techniques
in Recommender Systems
This book has shown you multiple implementations of recommender systems (also
known as recommendation systems) using various techniques. You have gained a holistic
view of all these methods. Topics like deep learning and graph-based approaches are
still improving. Recommender systems have been a major research interest for a long
time. Newer, more complex, and more interesting avenues have been discovered, and
research continues in the same direction.
This chapter looks at real-time, context-aware, conversational, and multi-task
recommenders to showcase the vast potential for research and growth in this field.
Real-Time Recommendations
In general, batch recommendations are computationally inexpensive and
preferred because they can be generated daily (for example) and are much simpler
to operationalize. But recently, more focus has been on developing real-time
recommendations. Real-time recommendations are generally more computationally
expensive since they must be generated on-demand and are based on live user
interactions. Operationalizing real-time recommendations is also more complex.
Then why are real-time recommendations needed? They are essential when a
time-based and mission-centric customer journey depends on context. In most
scenarios, real-time demand needs to be met before the user loses interest and the
demand fades. Also, real-time analysis of a customer's journey leads to better
recommendations in the moment.
Conversational Recommendations
Recent years have seen a lot of research and effort in developing more
conversational systems, which many believe will revolutionize how human-computer
interactions happen in the future. This influence can also be seen in recent
developments in recommender systems in the form of conversational recommenders.
Figure 11-1 shows the system design of conversational recommenders.
Context-Aware Recommenders
Researchers and practitioners have recognized the importance of contextual
information in many fields, such as e-commerce personalization, information
retrieval, ubiquitous and mobile computing, data mining, marketing, and
management. Despite the large body of research on recommender systems, many
existing approaches do not consider contextual information, such as the time, the
place, or the company of other people, when recommending items to users (such as
watching a movie or eating out). There is a growing understanding that relevant
contextual information matters in recommender systems and that it is important to
take it into account when making recommendations.
Context-aware recommender systems represent an emerging area of
experimentation and research aimed at delivering more accurate content based on the
user’s context at any given moment. For example, is the user at home or on the go? Are
they using a large or small screen? Morning or evening? Given the data available to a
particular user, the contextual system may provide recommendations that the user will
likely accept in these scenarios.
Figure 11-2 shows different types of contextual recommenders.
Multi-task Recommenders
In many domains, there are several rich and important sources of feedback to draw from
while building a recommender system. For example, e-commerce websites generally
record user visits (to product pages), user clicks (click-stream data), additions to carts,
and purchases made at every user and item level. Post-purchase inputs like reviews and
returns are also recorded.
Integrating these different forms of feedback is critical to building systems that
yield better results than a single task-specific model. This is especially true where
some data is sparse (purchases, returns, and reviews) and some data is abundant (such
as clicks). In these scenarios, a joint model may use representations obtained from an
abundant task to improve its predictions on a sparse one through a phenomenon known
as transfer learning.
Among its other benefits, multi-task learning helps the model avoid overfitting.
One example is joint representation learning (JRL), a multimodal deep learning
architecture in which each type of information source (textual reviews, product
images, rating points, etc.) is used to learn appropriate user and item
representations. Figure 11-4 shows the representation of JRL.
Conclusion
Recommender systems have been gaining traction since the start of the e-commerce
era, but they have been around far longer. The first recommender system was
developed in 1979: Grundy, a computer-based librarian that offered suggestions on
books to read. The first commercial use of recommender systems came in the early
1990s. Since then, they have taken off because the financial incentives and
time savings that recommender systems provide are unmatched. Recommender systems
have become essential to a better user experience in many domains. The most popular
example is Netflix and its recommendation engine, which receives heavy investment
in research and development.
The constant need for recommender systems and their importance in various
domains have led to a huge demand for building good, reliable, and robust systems.
This calls for more research and innovation in developing them. Such systems benefit
businesses and help users save time in making decisions by surfacing the most
suitable option, which would be missed in most scenarios if the search were done
manually.