2024 - SPR - Recommender Systems Algorithms and Their Applications - Kar-Roy-Datta
Pushpendu Kar
Monideepa Roy
Sujoy Datta
Recommender Systems: Algorithms and their Applications
Transactions on Computer Systems and
Networks
Series Editor
Amlan Chakrabarti, Director and Professor, A. K. Choudhury School of
Information Technology, Kolkata, West Bengal, India
Editorial Board
Jürgen Becker, Institute for Information Processing–ITIV, Karlsruhe Institute of
Technology—KIT, Karlsruhe, Germany
Yu-Chen Hu, Department of Computer Science and Information Management,
Providence University, Taichung City, Taiwan
Anupam Chattopadhyay , School of Computer Science and Engineering,
Nanyang Technological University, Singapore, Singapore
Gaurav Tribedi, Department of Electronics and Electrical Engineering, Indian
Institute of Technology Guwahati, Guwahati, India
Sriparna Saha, Department of Computer Science and Engineering, Indian Institute
of Technology Patna, Patna, India
Saptarsi Goswami, A. K. Choudhury School of Information Technology, Kolkata,
India
Transactions on Computer Systems and Networks is a unique series that aims
to capture advances in the evolution of computer hardware and software systems
and progress in computer networks. Computing systems today span from miniature
IoT nodes and embedded computing systems to large-scale cloud infrastructures,
which necessitates developing systems architecture, storage infrastructure, and
process management that work at various scales. Present-day networking
technologies provide pervasive global coverage and enable a multitude of
transformative technologies. The new landscape of computing comprises self-aware
autonomous systems built upon a software-hardware collaborative framework. These
systems are designed to execute critical and non-critical tasks involving a
variety of processing resources, such as multi-core CPUs, reconfigurable
hardware, GPUs, and TPUs, which are managed through virtualisation, real-time
process management, and fault tolerance. While AI, machine learning, and deep
learning tasks increasingly dominate the application space, computing systems
research aims at efficient means of data processing, memory management,
real-time task scheduling, and scalable, secure, and energy-aware computing. The
paradigm of computer networks also extends its support to this evolving
application scenario through various advanced protocols, architectures, and
services. This series aims to present leading works on advances in the theory,
design, behaviour, and applications of computing systems and networks.
The Series accepts research monographs, introductory and advanced textbooks,
professional books, reference works, and select conference proceedings.
Pushpendu Kar · Monideepa Roy · Sujoy Datta
Recommender Systems: Algorithms and their Applications
Pushpendu Kar
School of Computer Science
University of Nottingham Ningbo China
Ningbo, China

Monideepa Roy
School of Computer Engineering
KIIT Deemed University
Bhubaneswar, Odisha, India

Sujoy Datta
School of Computer Engineering
KIIT Deemed University
Bhubaneswar, Odisha, India
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2024
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Preface
Recommendation systems were introduced in the 1990s but have gradually become
an indispensable tool with the advent of numerous e-commerce companies. Recent
years have seen a huge jump in the number of such web services, and they rely
heavily on recommendation systems to gain an advantage over their competitors.
Recommendation systems gather information about the likes and dislikes of a user
and use various types of complex algorithms to predict what a user may be
interested in, sending personalized recommendations accordingly. Brands like
Netflix, Amazon, Facebook, Spotify, and YouTube collect information about users
and try to predict user preferences. If a person buys a certain product, then
suggestions for similar products are sent to that user. If a user likes a
particular type of music or movie, the system will try to predict and recommend
similar music or movies. It is a vast and interesting area of research, but in
this book we have taken up some of the most important topics which form the
basis of recommender systems, along with some case studies, applications, and
suggestions for future research directions.
This book will be useful to users who are new to the topic and wish to learn it. It
will also be useful to advanced users who know the theory but want to implement or
design a system from scratch and can learn from the different types of algorithms.
This book consists of 12 chapters.
Chapter 1 is a general introduction to the importance of recommender systems,
with an overview of the scope of the book, its audience, and the motivation
behind writing it.
Chapter 2 is a general overview of all possible types of algorithms for
recommendation systems.
Chapter 3 discusses two of the most widely used types of recommender algorithms,
content-based systems and collaborative filtering methods, and their features and
suitability for implementation.
Chapter 4 discusses the decomposition of matrices used in clustering.
Chapter 5 discusses how to learn to rank users based on various factors and how
to detect profiles of false users, along with the Shilling attack example.
I would like to thank my parents, Mihir Kumar Kar and Pratima Kar, my wife,
Sangita, my daughter, Ritosmita, and my son, Ritanshu, for their continuous
support, guidance, and encouragement. I would like to express my sincere grati-
tude to Tianyi Ma and Chenyu Yang for helping in writing Chap. 8, Zhihang Zhu
for helping in writing Chap. 10, and Xinyi Wang for helping in writing Chap. 12.
—Pushpendu Kar
I would like to express my sincere thanks to my Mom (late Hasi Roy) and Dad
(late Sunil K. Roy) for their blessings even when they are no more physically here
to guide me, but I’m sure they are watching with satisfaction from above. I would
like to thank my sister Madhumita Roy for her constant support and inspiration
throughout this assignment. I am also thankful to her for kindly designing the cover
for the book. I would like to thank my B.Tech. students Aishi Paul and Divyansi
Mishra for their help in drawing the diagrams for the book. Thanks to each and
every one of you for your timely support and help.
—Monideepa Roy
Chapter 1
Introduction to Recommendation Systems
Abstract With the rapid growth of e-commerce, the web has become a very popular
medium for companies to do business. Customers also find it an attractive
proposition, as it saves them the time needed to go out and shop for what they
need, and gives them access to a huge array of choices to buy from. Since it is
a very tough and competitive market, and companies have realized that people
usually tend to buy similar types of products or watch similar types of movies,
they have now turned to modern technology to make it easier for customers to
make their choices. This led to the advent of various recommendation algorithms,
with which companies are now able to predict the choices and personal
preferences of their customers and accordingly push appropriate suggestions or
recommendations for products that a person is likely to purchase. With the huge
success of recommendation systems, they are being adopted by more and more
brands and for more varieties of applications. This chapter gives an overview of
the reasons why recommendation systems have become so popular.
1.1 Introduction
Consumers today are faced with huge numbers of choices, in terms of new products
to buy or new movies to watch, and have less time on their hands, so it is
difficult for them to select the most relevant options on their own. Whenever a
person buys a new product or wants to watch a new movie, he/she prefers to
consult ratings or recommendations from past users to make the choice faster and
easier (Abowd et al. 1999). However, even that is time-consuming given the huge
volumes of data. This has led to the emergence of recommendation systems, which
use algorithms to predict and find the best matches for a person based on
various parameters. They also form the basis of many machine learning
algorithms. In this book, we take a look at the different types of algorithms
that are used for generating recommendations.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 1
P. Kar et al., Recommender Systems: Algorithms and their Applications, Transactions on
Computer Systems and Networks, https://doi.org/10.1007/978-981-97-0538-2_1
1.3 Who Can Benefit from Them?
Although any business can benefit from implementing recommendation systems, the
two main factors which determine the extent of that benefit are:
Breadth of data—If the business has only a few customers, and they behave in
different ways, then an automated recommendation system will not be of much use;
it will be easier to let employees use their own judgment to predict the
preferences of individual customers.
Depth of data—If the business has only a single data point for each of its
customers, recommendation systems will not have sufficient training data to base
their predictions on.
Organizations that can benefit from automated recommendation systems range from
e-commerce, retail, and media to banking, telecom, and other utilities. There
are of course many more areas which can benefit from implementing recommendation
systems, but here we describe some of the most popular ones. A more detailed
discussion of specific brands appears in later chapters.
E-commerce is one of the first areas where recommendation systems were used.
Because these companies have access to online data of millions of customers, they
can easily use that data to generate accurate recommendations.
Retail is another area which can benefit greatly from recommendation systems.
Since retailers have direct access to huge volumes of shopping data, they have a
very good idea of customers' intent and can make accurate predictions.
The media industry was also among the first to implement recommendation systems.
Almost all news channels use recommendation engines.
Banking is also a very important application, where the financial situations and
past preferences of millions of customers make up a very comprehensive data bank.
The telecom industry has dynamics similar to those of banking: service providers
have access to a wide variety of customer data, call and usage preferences, and
past data for a huge volume of customers. The telecom industry has the
additional advantage of having a limited number of products on which it needs
information to make its predictions (Fig. 1.1).
Fig. 1.1 Increasing importance of personalization in the post-pandemic market. Source: McKinsey
Recommender systems need two types of information to work on. One is
characteristics information, i.e., information about items (like keywords and
categories) and users (like preferences and profiles). The system finds out the
personal preferences of users and maintains a profile of its customers so that
it can make suggestions to new users with similar profiles. The other is
user-item interactions, like ratings, number of purchases, likes, etc., where
the user rates a product he/she has experienced. Based on this, the algorithms
in recommender systems can be broadly divided into three categories:
content-based, collaborative filtering, and hybrid systems (Adomavicius et al.
2011). Content-based systems use characteristics information, while
collaborative filtering uses interactions between the users and the items.
Hybrid systems are a combination of both. A more detailed study of the above
types of algorithms is given in Chap. 3.
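These two information types can be illustrated with a toy sketch. All item,
user, and attribute names below are hypothetical, not examples from the book:

```python
# Toy illustration of the two information types a recommender works on.
# All names and values here are hypothetical.

# Characteristics information: attributes of items and profiles of users.
items = {
    "movie_a": {"genre": "sci-fi", "keywords": ["space", "robots"]},
    "movie_b": {"genre": "drama",  "keywords": ["family"]},
}
users = {
    "alice": {"preferred_genres": ["sci-fi"]},
    "bob":   {"preferred_genres": ["drama"]},
}

# User-item interactions: explicit ratings on a 1-5 scale (None = unrated).
ratings = {
    "alice": {"movie_a": 5, "movie_b": None},
    "bob":   {"movie_a": 2, "movie_b": 4},
}

def content_match(user, item):
    """Content-based signal: does the item's genre match the user's profile?"""
    return items[item]["genre"] in users[user]["preferred_genres"]

print(content_match("alice", "movie_a"))  # True
```

A content-based system would reason over `items` and `users`, while a
collaborative system would reason over `ratings` alone.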
The first recommendation engine was built in 1992 at the Xerox Palo Alto
Research Center. It was mainly designed to allow users to rate the messages and
documents of an experimental mail system called Tapestry. The recommendation
engine used a method called collaborative filtering to tell the user which
documents were the most read or most loved. It proved to be an efficient process
and gave very good results for Tapestry. It was later developed further to
perform more complex operations like filtering, retrieval, and browsing of
e-documents. Figure 1.2 shows the three generations of recommendation engines.
The most successful recommendation engine was probably built by Amazon, which
made it to the list of the top 10 retailers in 2012. Amazon was in 10th position
with a revenue of 34.4 billion USD. As per McKinsey, 35% of Amazon's revenue in
2012 came from recommendation engines.
The reason their recommendation engine was so successful compared to the others
was that it adapted to the challenges that came with an increase in the number
of customers. Instead of focusing on each customer individually and giving them
recommendations based on their past activities, Amazon made clusters of
customers who had similar choices. As a result, they found that the end results
were more accurate and that email recommendations were the best way to convince
a customer to buy a product.
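The clustering idea described above can be sketched in a few lines. This is a
toy illustration with made-up ratings and a simple greedy grouping rule, not
Amazon's actual algorithm:

```python
from math import sqrt

# Hypothetical rating vectors (rows: users, columns: three products).
ratings = {
    "u1": [5, 4, 0],
    "u2": [4, 5, 1],
    "u3": [0, 1, 5],
    "u4": [1, 0, 4],
}

def cosine(a, b):
    """Cosine similarity between two rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def cluster(users, threshold=0.8):
    """Greedy clustering: put a user in the first cluster whose
    representative (first member) is similar enough, else start a new one."""
    clusters = []
    for name, vec in users.items():
        for c in clusters:
            if cosine(users[c[0]], vec) >= threshold:
                c.append(name)
                break
        else:
            clusters.append([name])
    return clusters

print(cluster(ratings))  # [['u1', 'u2'], ['u3', 'u4']]
```

Users with similar taste land in one cluster, and recommendations can then be
generated once per cluster instead of once per customer.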
Content-based filtering is a method which focuses on the preferences, likes, and
dislikes of a single user. The collaborative filtering method focuses more on
analyzing the preferences of a group of people.
A hybrid filtering method utilizes both of the above methods and focuses more
on what a customer might need instead of what he/she wants.
So how does one choose the most suitable recommendation engine? (Adomavicius
and Kwon 2007).
There are primarily two factors to be considered:
The first point is why a particular business needs a recommendation engine. If
it has a loyal customer group with a limited number of people, then content
filtering is good enough. However, if there is some complexity in the data, then
either collaborative filtering or hybrid filtering should be chosen.
The second point is whether there will be a need to scale up in the future. If
there is a possibility of scale up, then the recommendation engine should be chosen
accordingly.
LinkedIn—Like many other social media channels, LinkedIn uses "you may also
know" types of recommendations, based on the number of common connections
between any two persons.
This book is mainly aimed as hands-on help for those who wish to learn about
recommendation systems from scratch and want to implement them for some
applications. The book gives an overview of the different types of algorithms
and application scenarios. It also deals with security: the types of attacks
that are usually faced by a recommendation system and ways to detect them and
safeguard a system against them. There are also two case studies that show the
importance and widespread applicability of recommendation systems, and a chapter
that discusses some novel and diverse applications of recommender systems.
The book consists of eleven other chapters. Chapter 2 gives a broad overview of
all types of recommendation algorithms. Chapter 3 deals with the main types of
algorithms: content-based, collaborative filtering, and hybrid algorithms.
Chapter 4 is about the decomposition of matrices for clustering. Chapter 5 deals
with the problem
of how to rank the choices correctly and how to safeguard against attacks and fake
profiles. Chapter 6 is about knowledge-based and ensemble-based recommender
systems. Chapter 7 deals with the big data behind recommender systems. Chapter 8
discusses the importance of trust-centric and attack-resistant recommender systems.
Chapter 9 shows the steps in building a recommendation system. Chapters 10 and 11
are applications of recommender systems in healthcare and surveillance, respectively.
Chapter 12 discusses some novel applications of recommender systems as well as
scopes and ideas for their improvement.
1.10 Summary
In this chapter, we have seen why recommender systems have become so popular and
why a study of the various types of algorithms is so important for a clear
understanding of how the recommendation process works. The chapter defines the
scope of this book and gives an overview of its contents. The next chapter is a
formal introduction to recommendation systems and gives an overview of the
various types of algorithms used.
Think Tank
References
Abowd G, Dey A, Brown P, Davies N, Smith M, Steggles P (1999) Towards a better understanding
of context and context-awareness. In: Gellersen H-W (ed) Handheld and ubiquitous computing.
Springer, Berlin, pp 304–307
Adamopoulos P, Bellogin A, Castells P, Cremonesi P, Steck H (2014) REDD 2014—International
Workshop on Recommender Systems Evaluation: Dimensions and Design. Held in conjunction
with ACM Conference on Recommender systems
Adomavicius G, Tuzhilin A (2011) Context-aware recommender systems. In: Ricci F, Rokach L,
Shapira B, Kantor PB (eds) Recommender systems handbook. Springer, New York, pp 217–253
Adomavicius G, Manouselis N, Kwon Y (2011) Multi-criteria recommender systems. In: Ricci F,
Rokach L, Shapira B, Kantor PB (eds) Recommender systems handbook. Springer, New York,
pp 769–803
Adomavicius G, Kwon Y (2007) New recommendation techniques for multi-criteria rating systems.
IEEE Intell Syst 22(3):48–55
Chapter 2
Overview of Recommendation Systems
2.1 Introduction
With more and more companies using the Web as a medium for their business,
recommendation systems have become a very important tool for keeping ahead of
competitors by providing the best personalized recommendations for a wide
variety of items to users (Adomavicius et al. 2005; Adomavicius and Tuzhilin
2005a). One of the main factors behind the popularity of recommendation systems
is that it is very easy to get user feedback about an item through online
services. A company can easily find out the likes and dislikes of a person
through various feedback mechanisms. For example, on Netflix a user can rate a
movie that he/she has watched with a simple mouse click. This is explicit
feedback, where a user gives ratings in numerical or other formats. But there
are also implicit types of feedback. Even when a user is simply browsing for
some products, it usually means that the customer is interested in such types of
products. These types of feedback are used by online sellers like Amazon, Nykaa,
TataCliq, etc., and the data is collected effortlessly from the activity of the
customer. Recommender systems basically utilize such data to predict customer
preferences.
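As a toy sketch, implicit feedback of this kind can be derived by weighting
browsing events. The event types and weights below are illustrative assumptions:

```python
# Sketch: deriving an implicit-feedback score from browsing activity.
# Event weights are illustrative assumptions, not values from the book.
EVENT_WEIGHTS = {"view": 1, "add_to_cart": 3, "purchase": 5}

def implicit_score(events):
    """Sum weighted events per item to approximate interest."""
    scores = {}
    for item, event in events:
        scores[item] = scores.get(item, 0) + EVENT_WEIGHTS[event]
    return scores

log = [("phone", "view"), ("phone", "view"), ("phone", "add_to_cart"),
       ("case", "view")]
print(implicit_score(log))  # {'phone': 5, 'case': 1}
```

The resulting scores can then be fed into a recommender in place of explicit
ratings.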
There are a variety of applications that use recommender systems at present.
Some famous brands are discussed below (Baccigalupo and Plaza 2006; Bailey 2008).
Amazon.com Recommender System
Amazon was one of the pioneers of recommender systems and realized their
benefits very early. It is an online retail brand that sells a variety of
products through its web portal. It started originally as a book retailer but
gradually expanded to many other categories like software, electronics, games,
tools, gifting, household, movies, cosmetics, and food. Amazon has an explicit
rating facility that allows the user to rate items on a 5-point scale. It also
tracks buying behavior, previous items purchased, and browsing history.
Netflix Movie Recommender System
Netflix is a portal that provides movies and web series to users. Users can rate
a movie that they have watched, and based on these ratings the system can
provide suggestions to other customers who have watched similar movies.
Similarly, the target user will also get suggestions for movies that are similar
to what he/she has watched or rated highly in the past.
Google News Personalization System
In this case (Arazy et al. 2009), the various news articles are the items, and
there are no explicit ratings as such. It is a sort of unary feedback: if a user
clicks on a particular news item, it is assumed that the user is interested in
that news, and the click is taken as positive feedback. So here the feedback
mechanism is implicit, and based on it, similar news items are suggested to the
user.
Facebook Friend Recommendations
This social networking site suggests potential friends to users so that there is an
increase in the number of social connections on that site. However, the aim of this
type of recommendation is slightly different from those of product recommendations.
For retailers or merchants, product recommendations increase sales but here there is
an increase in social connections. When the number of social connections increases,
then this leads to an enhanced experience for a user on that social network. So the
company actually depends on this to increase their advertising revenues. So the basis
for recommendations for friends or links is actually a link prediction problem in the
field of social network analysis. So these systems rely more on structural relationships
rather than on the rating data.
In the next section, we take a look at the various categories into which
recommender algorithms can be classified.
Recommender systems can be broadly classified into three major categories (Ahn
et al. 2006; Anand and Mobasher 2005; Balabanovic and Shoham 1997), namely
content-based systems, collaborative systems and hybrid systems, as shown in
Fig. 2.1.
In the content-based filtering method, similarities in products, services or
content features and information gathered about the user are used to make the
recommendations.
The advantages of the system are as follows:
• Independent user: There is no need to prepare a similarity index of users for
building a personalization system. The recommendations can be made by examining
the attributes of the items and the profiles of the users.
• Enough information to avoid cold start: Even if there is very little rating
information, new items can be recommended to the other users in the population.
• Transparent behavior: The method provides the attributes of the items on the
basis of which the recommendations have been made.
• It faces the cold-start item problem, i.e., if the system has not encountered
an item in the training phase, then it will be difficult for the system to
suggest it in the final personalization list.
• It can turn out to be a complex and expensive system in cases involving very
high-dimensional datasets, because calculating the similarity index of millions
of users is a tough task for the system.
• Since a majority of the datasets available in real-life scenarios are sparse,
generating recommendations in such cases may lead the system to recommend in the
wrong direction.
In hybrid systems, various existing models, like content-based and
collaborative-based methods or any other personalization technique, are
combined. This was done mainly to overcome the bottlenecks faced by the
collaborative system. Its advantages are:
• It is very effective because it combines the benefits of various recommender
systems.
• It provides scope for optimization of the recommendation model.
• The major drawbacks of the content-based and collaborative-based methods, like
the cold start problem, the sparsity problem, and the gray sheep problem, are
overcome in this model.
But it has some disadvantages also:
• It is costly to implement.
• It has high complexity in terms of time and space.
• It uses explicit information, which might pose a problem in data collection
due to privacy issues.
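A minimal sketch of a weighted hybrid, assuming each component recommender
produces a normalized score per item. The weights and scores are illustrative:

```python
# Sketch of a weighted hybrid recommender: blend a content-based score and
# a collaborative-filtering score per item. Weights and scores are
# illustrative assumptions, not values from the book.
def hybrid_rank(content_scores, cf_scores, w_content=0.6):
    """Return items sorted by the blended score (highest first)."""
    blended = {
        item: w_content * content_scores[item]
              + (1 - w_content) * cf_scores[item]
        for item in content_scores
    }
    return sorted(blended, key=blended.get, reverse=True)

content = {"a": 0.9, "b": 0.2, "c": 0.5}
cf      = {"a": 0.1, "b": 0.8, "c": 0.6}
print(hybrid_rank(content, cf))  # ['a', 'c', 'b']
```

Tuning `w_content` is one simple way to trade off the two components; more
elaborate hybrids switch or cascade between them.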
A more detailed explanation of the other subcategories is given in later chapters.
In domains involving temporal data, location-based data, and social data, the
context of a recommendation plays a very crucial role (Aimeur et al. 2008;
Aimeur and Vezeau 2000).
Context-based recommender systems consider many types of information before
making a recommendation.
For time-sensitive recommender systems, the ratings of an item may vary with
time, because user likes and dislikes evolve with time; e.g., preferences for
mobile configurations, houses, car specifications, etc. change frequently over
time.
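A common way to make ratings time-sensitive is to down-weight old ratings, for
example with exponential decay. This is a sketch under assumed parameters; the
half-life is a tuning choice, not a value from the book:

```python
# Sketch of time-aware rating aggregation using exponential decay.
def decayed_weight(age_days, half_life_days=30.0):
    """Weight halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)

def time_weighted_average(ratings_with_age):
    """ratings_with_age: list of (rating, age_in_days) pairs."""
    num = sum(r * decayed_weight(a) for r, a in ratings_with_age)
    den = sum(decayed_weight(a) for _, a in ratings_with_age)
    return num / den

# A fresh 5-star rating dominates a year-old 1-star rating.
print(round(time_weighted_average([(5, 0), (1, 365)]), 2))  # 5.0
```

With a 30-day half-life, a year-old rating carries almost no weight, so the
aggregate tracks the user's current taste.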
For location-based recommender systems, two types of spatial locality are to be
considered—user-specific locality and item-specific locality. So the recommenda-
tions for the best places to visit nearby, or places to shop will vary based on the
locality that a user is currently in.
16 2 Overview of Recommendation Systems
Social recommender systems are based on social cues, network structures, tags,
or a combination of all three. They are slightly different from the other types of
recommendation systems.
2.6 Summary
Think Tank
References
Abstract In this chapter, the two most widely used types of recommender systems,
namely the collaborative filtering method and the content-based system, are
discussed along with a few of their important sub-types. There are two types of
collaborative methods, namely neighborhood-based and model-based methods. The
chapter discusses the features of and differences between the two. The basic
components of content-based systems are also covered. Both systems have their
advantages and disadvantages, which are discussed here as well.
3.1 Introduction
The previous chapter gave an overview of the various types of algorithms used in
recommendation systems. Since the two broad categories of recommendation
algorithms are the collaborative filtering model and content-based recommender
systems, this chapter explains the two methods in more detail. In content-based
systems (CBS), the ratings and buying patterns of users are combined with the
descriptive attributes of the items to arrive at recommendations. Here the
descriptions of the items are given ratings and fed as training data for the
creation of a regression model to classify users.
The system stores the descriptions of all the items bought or rated by a
particular user, and this information is used to predict whether the user will
be interested in a new product or not. In the collaborative filtering method,
the system depends on previous interactions between users and items to generate
new suggestions. It groups similar users into clusters and then provides
suggestions to users based on the preferences of the people in their cluster,
the idea being that users in one cluster are likely to have similar preferences
in products, movies, etc.
We discuss the details of the models in the next sections (Aslanian et al. 2016; Beel
et al. 2013; Bellogin et al. 2011).
Collaborative filtering methods (Fig. 3.1) (Bobadilla et al. 2011; Bogers and
Bosch 2008; Das et al. 2007) can be broadly classified into two categories:
neighborhood-based collaborative filtering and model-based collaborative
filtering. We describe each of them separately below.
• Neighborhood-based collaborative filtering—These are also called
memory-based algorithms and are among the earliest algorithms used for CF. They
are based on the assumption that users with similar profiles rate items in
similar patterns and that similar items get similar ratings. There are two
types—user-based CF and item-based CF (Silva et al. 2017b; Su and Khoshgoftaar
2009). In user-based CF methods, the system stores the ratings of products by
users who have profiles similar to the target user and then suggests these
products to the target user. In item-based CF, a set of items with features
similar to a target item is selected first. Then the weighted average of the
user's ratings on those items is used to predict that user's rating for the
target item. The main difference between user-based and item-based CF is that in
the first case the ratings are predicted using the ratings of neighboring users,
whereas in the second case the predictions are based on the ratings of the same
user on neighboring items. The algorithms for this method can be formulated in
either of the following ways: by predicting the rating value of a user-item
combination, or by determining the top k items or top k users. The ratings in a
rating matrix can be of five types: continuous ratings, interval-based ratings,
ordinal ratings, binary ratings, and unary ratings. In continuous ratings, there
is a continuous scale that corresponds to the degree of liking or disliking of
an item. In interval-based ratings, the ratings are usually on a 10-point or
20-point scale. Ordinal ratings are similar to interval-based ratings, but
predefined ordered categories are available, e.g., strongly accept, accept, weak
accept, neutral, reject. In binary ratings, as the name suggests, only two
options are available, yes or no. In unary ratings, a user is allowed to specify
only a positive preference but not a negative preference, e.g., the like button
on Facebook.
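A minimal user-based CF sketch, predicting an unseen rating as the
similarity-weighted average of other users' ratings on that item. The ratings
and names are made up, and cosine similarity over co-rated items is one common
choice of similarity measure:

```python
from math import sqrt

# Toy user-based CF: predict alice's rating for item "m4" from the
# similarity-weighted ratings of bob and carol. All data is hypothetical.
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob":   {"m1": 4, "m2": 3, "m3": 5, "m4": 4},
    "carol": {"m1": 1, "m2": 5, "m3": 1, "m4": 2},
}

def cosine_sim(u, v):
    """Cosine similarity over the items both users have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    nu = sqrt(sum(ratings[u][i] ** 2 for i in common))
    nv = sqrt(sum(ratings[v][i] ** 2 for i in common))
    return dot / (nu * nv)

def predict(user, item):
    """Similarity-weighted average of neighbors' ratings on the item."""
    num = den = 0.0
    for other in ratings:
        if other == user or item not in ratings[other]:
            continue
        s = cosine_sim(user, other)
        num += s * ratings[other][item]
        den += abs(s)
    return num / den if den else 0.0

print(round(predict("alice", "m4"), 2))  # 3.2
```

Because alice's ratings align more closely with bob's (similarity 0.98) than
with carol's (0.65), the prediction lands above the unweighted mean of 3,
pulled toward bob's rating of 4.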
There are mainly two principles underlying neighborhood models: user-based
models and item-based models (Goldberg et al. 1992; Herlocker et al. 2004). In the
user-based model, users with similar profiles usually give similar ratings on the same
items, so the ratings of one user for an item may be used as a basis for recommending
it to other users with similar profiles. In the item-based model, if some items are
similar, then a user is likely to give similar ratings to those items.
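The user-based principle can be sketched in a few lines of Python (the toy rating matrix below is illustrative, with 0 marking a missing rating; this is not code from any production system): the predicted rating of a target user for an item is the similarity-weighted average of the other users' ratings for that item.

```python
import numpy as np

def cosine_sim(a, b):
    # Similarity computed over co-rated items only (0 marks a missing rating)
    mask = (a > 0) & (b > 0)
    if not mask.any():
        return 0.0
    a, b = a[mask], b[mask]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict_user_based(R, user, item):
    # Similarity-weighted average of the other users' ratings for `item`
    num = den = 0.0
    for v in range(R.shape[0]):
        if v == user or R[v, item] == 0:
            continue
        s = cosine_sim(R[user], R[v])
        num += s * R[v, item]
        den += abs(s)
    return num / den if den else 0.0

R = np.array([[5, 3, 0, 1],
              [4, 0, 4, 1],
              [1, 1, 0, 5],
              [1, 0, 5, 4]], dtype=float)
print(round(predict_user_based(R, 0, 2), 2))  # prints 4.3
```

Item-based CF follows the same template with the roles of the rows and columns of R exchanged.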
• Model-based collaborative filtering—Here the recommendations are provided by
first learning a model of the users' ratings. The algorithms typically take a
probabilistic approach and compute the expected value of a user's rating for an
item, given the ratings that user has assigned to other items. The following are
the advantages of the model-based system over the neighborhood-based system.
Space efficiency—The size of the learned model is usually much smaller than the
original rating matrix, so the space requirements of a model-based system are
low, whereas the item-based or user-based methods have space complexity on the
order of O(n²), where n is the number of users or items.
Training speed and prediction speed—Preprocessing is faster in model-based
systems than in neighborhood models, where it is quadratic in the number of users
or items. In a majority of cases, the compact, summarized model can make
predictions efficiently.
Avoiding overfitting—A lot of machine learning algorithms suffer from the
problem of overfitting, where random artifacts in the data overly influence the
process of prediction. This issue also arises in classification and
regression models. To avoid the problem of overfitting, a summarization approach may
22 3 Collaborative Filtering and Content-Based Systems
be used. In addition, regularization methods can be used to build more robust models.
The hybrid approach (Burke 2002; Chowdhury 2010; Glauber et al. 2013) is basically
a combination of various existing models, like the content-based, collaborative filtering,
or any of the personalization techniques (Fig. 3.3). This method came up as a solution to
overcome the bottlenecks that were encountered by the collaborative system, which
was the most used system. So a hybrid system is basically a combination of two
or more techniques; e.g., a system can use matrix factorization to reduce the
dimensions of a large data set, and collaborative filtering can then be applied to
generate personalized lists.
The pros and cons have been described in the previous chapter. So at this point a
definition of the various hybrid methods (Liu et al. 2018) is given here.
In the weighted hybridization method, the decision is made based on the scores
obtained from different recommender systems. The results of each of the recommender
systems are collated into a single numerical score that decides the final
recommendation list.
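As a minimal sketch of this idea (the system names, items, and weights below are invented for illustration), each component system's scores are scaled by a weight and summed into a single number per item:

```python
def weighted_hybrid(scores_by_system, weights):
    # Collate each system's score for an item into a single number
    combined = {}
    for system, scores in scores_by_system.items():
        w = weights[system]
        for item, s in scores.items():
            combined[item] = combined.get(item, 0.0) + w * s
    # Final recommendation list: items ordered by combined score
    return sorted(combined, key=combined.get, reverse=True)

ranked = weighted_hybrid(
    {"content": {"a": 0.9, "b": 0.4, "c": 0.7},
     "collab":  {"a": 0.2, "b": 0.8, "c": 0.6}},
    weights={"content": 0.5, "collab": 0.5})
```

Here `ranked` comes out as `["c", "b", "a"]`; in practice the weights would be tuned on held-out data rather than fixed by hand.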
In the cascade method of hybridization, the basis of the recommendation is a chain
of recommendations, which means that the results of one recommendation system
are fine-tuned based on the results of another recommendation system.
In the switched method of hybridization, the method chooses a suitable recom-
mendation system from the set of the recommendation systems.
In the mixed hybridization method, as the name suggests, different recommen-
dation techniques work together to create a collaborative decision on the final
personalized list.
In the meta level method, the output from one recommendation system is taken
as the input for another recommendation system.
In the feature combination method, various types of knowledge source features
are aggregated together to form a single domain.
In the feature augmentation method, the features of one knowledge source are
transformed so that they become compatible as input for another
recommendation algorithm.
Apart from the three major approaches described above, there are also various
other personalized services, like demography based systems, knowledge-based
systems and community-based systems (Wu et al. 2015).
In the demography-based system, the users are categorized based on the demo-
graphic data like gender, age, qualification, location, etc. This type of system is
difficult to implement in real-life scenarios, because it is very difficult to gather
correct and complete demographic data of users.
In the knowledge-based system, the recommendations are based on the needs of
the user. It uses knowledge about the user and the items to decide which of the
items will fulfill the user's needs, so it provides services based on user preferences
(Konstan and Riedl 2012; Lathia et al. 2010).
In the community-based systems, communities are made by the recommendation
system based on people who share common interests. It relies on a user-item interac-
tion within a community, and recommendations of items are made after an aggregate
decision is obtained from the community.
3.5 Similarity Measures Used by a Recommender System
where z1 and z2 are any two points or objects in Euclidean space whose
similarity needs to be evaluated.
ii. Manhattan distance—This method computes the distance along gridlines. The
calculation is done by summing the absolute differences between the coordinates
of the two points. So the Manhattan distance between any two points z1 and z2 is
given by the following formula:

$$\operatorname{dist}(z_1, z_2) = \operatorname{dist}(z_2, z_1) = \sum_{i=1}^{n} |z_{1i} - z_{2i}| \quad (3.2)$$
iii. Adjusted cosine similarity—This variant of cosine similarity takes into consideration
the differing rating scales of the users. The adjusted cosine similarity
between any two items I1 and I2 can be calculated with the following formula:

$$\operatorname{Adj.Cos}(I_1, I_2) = \frac{\sum_{u \in U_{I_1,I_2}} \left(r_{uI_1} - \bar{r}_u\right)\left(r_{uI_2} - \bar{r}_u\right)}{\sqrt{\sum_{u \in U_{I_1,I_2}} \left(r_{uI_1} - \bar{r}_u\right)^2}\,\sqrt{\sum_{u \in U_{I_1,I_2}} \left(r_{uI_2} - \bar{r}_u\right)^2}} \quad (3.6)$$

where $U_{I_1,I_2}$ represents the set of users who have rated both items I1 and I2,
$r_{uI_1}$ and $r_{uI_2}$ denote the ratings given by user u to I1 and I2
respectively, and $\bar{r}_u$ is the average rating given by user u.
iv. Jaccard similarity—This index is used for calculating the similarity
and diversity between sets of objects. It is defined as the size of the intersection
divided by the size of the union. The similarity value varies between 0 and 1, where 0
represents low similarity and 1 represents high similarity. The Jaccard similarity
between any two objects z1 and z2 is given by the following formula:

$$J(z_1, z_2) = \frac{|z_1 \cap z_2|}{|z_1 \cup z_2|}$$
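The measures above can be sketched as follows (a small illustration assuming a user-item rating matrix R with 0 for missing ratings; the function names are not from the text):

```python
import numpy as np

def manhattan(z1, z2):
    # Eq. (3.2): sum of absolute component-wise differences
    return float(np.sum(np.abs(np.asarray(z1) - np.asarray(z2))))

def adjusted_cosine(R, i1, i2):
    # Eq. (3.6): subtract each co-rating user's average rating first (0 = missing)
    mask = (R[:, i1] > 0) & (R[:, i2] > 0)
    means = np.where(R > 0, R, np.nan)
    means = np.nanmean(means, axis=1)        # average rating per user
    d1 = R[mask, i1] - means[mask]
    d2 = R[mask, i2] - means[mask]
    denom = np.sqrt((d1 ** 2).sum()) * np.sqrt((d2 ** 2).sum())
    return float((d1 * d2).sum() / denom) if denom else 0.0

def jaccard(s1, s2):
    # Size of the intersection divided by the size of the union
    s1, s2 = set(s1), set(s2)
    return len(s1 & s2) / len(s1 | s2) if s1 | s2 else 0.0
```

For example, `manhattan([1, 2], [4, 6])` is 7.0 and `jaccard({1, 2, 3}, {2, 3, 4})` is 0.5.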
3.6 Evaluation Metrics of Recommender Systems (Ge et al. 2010)

3.6.1 Mean Absolute Error (MAE)
This method is a measurement of the deviation of the predicted value from the actual
value and is calculated according to the following formula:
$$\mathrm{MAE} = \frac{\sum_{i=1}^{n} |A_i - P_i|}{n} \quad (3.7)$$
3.6.2 Root Mean Square Error (RMSE)

This is used for the calculation of the error during the prediction of the value of an
object and is the square root of the mean of the squared differences between the
predicted and the actual values. It is calculated according to the following formula:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (A_i - P_i)^2}{n}} \quad (3.8)$$
where Ai and Pi are the actual and predicted values respectively and n represents the
total number of items for which predictions have been made.
3.6.3 Precision
This is defined as the total number of relevant items in the recommendation list
divided by the total number of items in that list and is calculated by the following
formula:
$$\text{Precision}(P) = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}} \quad (3.9)$$
where true positive is the number of items which are relevant and present in the list
and false positive is the number of items which are not relevant but still present in
the list.
3.6.4 Recall
This is defined as the ratio of the relevant items in the recommendation list divided by
the total number of relevant items in the population and is calculated by the following
formula:
$$\text{Recall}(R) = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}} \quad (3.10)$$
The F1 score is the harmonic mean of precision and recall, calculated with the following
formula:

$$F_1 \text{ Score} = \frac{2 \cdot P \cdot R}{P + R} \quad (3.11)$$
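The metrics in Eqs. (3.7)-(3.11) can be sketched directly in Python (a minimal illustration; the list-based inputs are an assumption, not an API from the text):

```python
import math

def mae(actual, predicted):
    # Eq. (3.7): mean absolute deviation of predictions from actual values
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    # Eq. (3.8): square root of the mean squared deviation
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def precision_recall_f1(recommended, relevant):
    # Eqs. (3.9)-(3.11): recommendation list vs. the set of truly relevant items
    recommended, relevant = set(recommended), set(relevant)
    tp = len(recommended & relevant)
    p = tp / len(recommended) if recommended else 0.0
    r = tp / len(relevant) if relevant else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```

For instance, a four-item list containing two of three relevant items gives precision 0.5, recall 2/3, and F1 4/7.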
3.7 Summary
So in this chapter, the concepts of two widely used systems, namely collaborative
filtering and content-based recommendation systems were introduced (Shardanand
and Maes 1995; Silva et al. 2017a). This chapter explains the two main types of
collaborative filtering methods and their features. The next method is the content-
based method which differs from the collaborative filtering method in that it depends
on the past ratings and likings of similar items by the target user itself instead of
collecting the ratings of other users with similar profiles for those items. It also
describes the hybrid methods and some personalized methods. After that, the methods
for calculating similarity measures have been discussed, followed by the
various evaluation metrics of a recommendation system. The next chapter discusses
the matrix decomposition and clustering process.
References
Aslanian E, Radmanesh M, Jalili M (2016) Hybrid recommender systems based on content feature
relationship. IEEE Transactions on Industrial Informatics
Beel J, Genzmehr M, Langer S, Nürnberger A, Gipp B (2013) A comparative analysis of offline
and online evaluations and discussion of research paper recommender system evaluation. In:
Proceedings of the international workshop on reproducibility and replication in recommender
systems evaluation. ACM, pp 7–14
Bellogin A, Castells P, Cantador I (2011) Precision-oriented evaluation of recommender systems:
an algorithmic comparison. In: Proceedings of RECSYS. ACM, pp 333–336
Bobadilla J, Ortega F, Hernando A, Alcalá J (2011) Improving collaborative filtering recommender
system results and performance using genetic algorithms. Knowl Based Syst 24(8):1310–1316
Bogers T, Van den Bosch A (2008) Recommending scientific articles using citeulike. In: Proceedings
of RECSYS. ACM, pp 287–290
Burke R (2002) Hybrid recommender systems: survey and experiments. User Model User-Adapt
Interact 12(4):331–370
Chowdhury G (2010) Introduction to modern information retrieval. Facet Publishing, Abingdon
Das AS, Datar M, Garg A, Rajaram S (2007) Google news personalization: scalable online
collaborative filtering. In: Proceedings of 7 WWW. ACM, pp 271–280
Di Noia T, Mirizzi R, Ostuni VC, Romito D, Zanker M (2012) Linked open data to support
content-based recommender systems. In: Proceedings of Semantics. ACM, pp 1–8
Fernandes BB, Sacenti JA, Willrich R (2017) Using implicit feedback for neighbors selection:
alleviating the sparsity problem in collaborative recommendation systems. In: Proceedings of
WEBMEDIA. ACM, pp 341–348
Ge M, Delgado-Battenfeld C, Jannach D (2010) Beyond accuracy: evaluating recommender systems
by coverage and serendipity. In: Proceedings of RECSYS. ACM, pp 257–260
Glauber R, Loula A, Rocha-Junior JB (2013) A mixed hybrid recommender system for given names.
ECML PKDD Discov Challenge 2013:25–36
Goldberg D, Nichols D, Oki BM, Terry D (1992) Using collaborative filtering to weave an
information tapestry. Commun ACM 35(12):61–70
Herlocker JL, Konstan JA, Terveen LG, Riedl JT (2004) Evaluating collaborative filtering
recommender systems. ACM Trans Inf Syst 22(1):5–53
Konstan J, Riedl J (2012) Recommender systems: from algorithms to user experience. User Model
User-Adapt Interact 22(1):101–123
Lathia N, Hailes S, Capra L, Amatriain X (2010) Temporal diversity in recommender systems. In:
Proceedings of ACM SIGIR. ACM, pp 210–217
Liu Y, Wang S, Khan MS, He J (2018) A novel deep hybrid recommender system based on auto-
encoder with neural collaborative filtering. Big Data Mining and Analytics 1(3):211–221
30 3 Collaborative Filtering and Content-Based Systems
McNee SM, Riedl J, Konstan JA (2006) Being accurate is not enough: how accuracy metrics have
hurt recommender systems. In: Proceedings of CHI. ACM, pp 1097–1101
Said A, Bellogín A (2014) Comparative recommender system evaluation: benchmarking recom-
mendation frameworks. In: Proceedings of RECSYS. ACM, pp 129–136
Santana LL, Souza AB, Santana DL, Dourado WA, Durão FA (2017) Evaluating ensemble strategies
for recommender systems under metadata reduction. In: Proceedings of WEBMEDIA. ACM,
pp 125–132
Shani G, Gunawardana A (2011) Evaluating recommendation systems. In: Recommender systems
handbook. Springer, Berlin, pp 257–297
Shardanand U, Maes P (1995) Social information filtering: algorithms for automating “word of
mouth”. In: Proceedings of SIGCHI. ACM Press/Addison-Wesley Publishing Co., Boston, MA,
pp 210–217
Silva DV, Silva RD, Durão FA (2017a) RecStore: recommending stores for shopping mall customers.
In: Proceedings of WEBMEDIA. ACM, pp 117–124
Silva N, Carvalho D, Pereira AC, Mourão F, Rocha L (2017b) Evaluating different strategies to
mitigate the ramp-up problem in recommendation domains. In: Proceedings of WEBMEDIA.
ACM, pp 333–340
Su X, Khoshgoftaar TM (2009) A survey of collaborative filtering techniques. Adv Artif Intell
2009:4
Wu D, Zhang G, Lu J (2015) A fuzzy preference tree-based recommender system for personalized
business-to-business e-services. IEEE Trans Fuzzy Syst 23(1):29–43
Chapter 4
Matrix Decomposition for Clustering
and Collaborative Filtering
Abstract Since the consumers of today are flooded with choices for various products
like movies on OTT platforms, online music, and other online shopping sites,
retailers and content providers need to find ways to match users with their most
preferred products in order to increase user satisfaction and maintain loyalty. So they
use recommender systems which have been very successful in providing accurate
suggestions of items to customers. The two main strategies used by recommender
systems are content-based models and collaborative filtering. Matrix factorization is
a collaborative filtering method that finds the relationships between item and user
entities. Latent features, which capture the associations between the user and item
matrices, are determined to measure similarity and make predictions based on both
item and user entities. Matrix factorization is a way to generate latent features when
multiplying two different kinds of entities. Since not every user rates all the items
they use, there are many missing values in the matrix, which results in a sparse matrix.
Hence, the null values are filled with 0 so that values are available for the
multiplication. It has been observed that matrix
factorization models are superior to the nearest neighbor technique for generating
product recommendations, because they incorporate additional factors like implicit
feedback, temporal effects, and confidence levels into the recommendation process.
Therefore, in this chapter, we see in detail how the process of matrix
decomposition works.
4.1 Introduction
At a time when consumers are faced with a huge variety of options of products
to choose from, the success of the retailers depends on how accurately they can
predict user preferences and choices, and suggest new products which the user has
a very high probability of buying. Recommendations can be generated by a wide
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 31
P. Kar et al., Recommender Systems: Algorithms and their Applications, Transactions on
Computer Systems and Networks, https://doi.org/10.1007/978-981-97-0538-2_4
variety of algorithms. As we have already seen, the two main strategies used by
recommender systems are collaborative filtering and content-based methods. The
user-based or item-based collaborative filtering methods are simple and intuitive,
but matrix factorization techniques are generally more effective because they allow
the discovery of the latent features underlying the interactions between users and
items. The two main areas of collaborative filtering are the neighborhood methods
and the latent factor models. The neighborhood model computes the relationships
between items or between users. The first category finds the preferences of a user
for a particular item, by finding the ratings of “neighboring” items by the same user.
A product's neighbors are products that tend to get similar ratings when reviewed
by the same user. For example, if we consider the movie "Black Hawk
Down", the neighbors of that particular movie will include other war movies or
movies directed by Ridley Scott. So if we want to predict the rating that a
particular user has given to the movie “Black Hawk Down”, then we need to search
for the ratings that the user has given to similar movies that the user has watched.
An alternative approach, latent factor models, tries to predict ratings by characterizing
items and users through factors inferred from the rating patterns. These
factors are a computerized alternative to attributes created by humans (Sarwar
et al. 2000; Funk 2006). For example, if we consider movies, there can be several
additional factors, like comedy versus drama, the amount of action, orientation
toward children, and other such dimensions which are usually not very
well defined. Latent factor models rely on matrix decomposition to discover latent
features to find the underlying interactions between any two types of entities. So in
the next section, an overview of the matrix decomposition process is given (Koren
2008; Paterek 2007; Takács et al. 2007).
D1 D2 D3 D4 D5
U1 5 3 - 1 3
U2 4 - - 1 2
U3 1 4 3 - -
U4 - 3 2 4 -
U5 - - 5 3 4
Fig. 4.1 Matrix for user ratings for movies with blanks/hyphens
is rated by a user. Suppose two users give high ratings to a particular movie; then
there might be a common factor behind both of them liking it, e.g., both of them like
the same actor/actress, or both of them prefer action movies as their favorite genre.
If we are able to find these hidden features, then it would be easier to predict the
ratings of a particular user for a particular item, as we can match the features of the
user with those of the item. When we try to uncover these features, we assume that
the number of features is smaller than the number of users and items. The assumption
is reasonable, since otherwise each user would have a unique feature, which, although
not entirely impossible, is a rare phenomenon. Moreover, if this situation really
occurred, making recommendations would be fairly useless, as one user's ratings
would tell us nothing about the preferences of any other user (Salakhutdinov and
Mnih 2008; Bell and Koren 2007).
$$R \approx P \times Q^{T} = \hat{R} \quad (4.1)$$
Here the rows of P represent the strengths of the associations between a user and the
features, whereas the rows of Q represent the strengths of the associations between
an item and the features. So to predict the rating of an item d j by ui , the dot product
of the vectors which correspond to ui and d j , is calculated.
$$\hat{r}_{ij} = p_i^{T} q_j = \sum_{k=1}^{K} p_{ik} q_{kj} \quad (4.2)$$
The objective is to find P and Q. One way of doing this is to first
initialize the two matrices with some values, calculate how their product
differs from R, and then minimize the difference iteratively. This process is
known as gradient descent and is aimed to find a local minimum of the difference.
The difference is also called the error between the estimated rating and the real rating
and can be calculated by the equation shown below for each pair of user-item:
$$e_{ij}^{2} = \left(r_{ij} - \hat{r}_{ij}\right)^{2} = \left(r_{ij} - \sum_{k=1}^{K} p_{ik} q_{kj}\right)^{2} \quad (4.3)$$
Here the squared error is considered as the estimated rating can be higher or lower
than the real rating. For error minimization, it is necessary to know the direction in
which the values of pik and qkj have to be modified, i.e., the gradient at the current
values needs to be known. So the above equation is differentiated with respect to
these variables separately:
$$\frac{\partial}{\partial p_{ik}} e_{ij}^{2} = -2\left(r_{ij} - \hat{r}_{ij}\right) q_{kj} = -2 e_{ij} q_{kj} \quad (4.4)$$

$$\frac{\partial}{\partial q_{kj}} e_{ij}^{2} = -2\left(r_{ij} - \hat{r}_{ij}\right) p_{ik} = -2 e_{ij} p_{ik} \quad (4.5)$$
After getting the gradient, the update rules for both pik and qkj can be formulated
as follows:
$$p_{ik} \leftarrow p_{ik} - \alpha \frac{\partial}{\partial p_{ik}} e_{ij}^{2} = p_{ik} + 2\alpha e_{ij} q_{kj} \quad (4.6)$$

$$q_{kj} \leftarrow q_{kj} - \alpha \frac{\partial}{\partial q_{kj}} e_{ij}^{2} = q_{kj} + 2\alpha e_{ij} p_{ik} \quad (4.7)$$
Here α is a constant whose value gives the rate of approaching the minimum; it
is normally taken to be a small value like 0.0002. The reason behind this is that
if the steps toward the minimum are too large, there is a chance
of overshooting the minimum, which leads to oscillations around it. The
update rules are applied iteratively until the error converges
to its minimum. The overall error can be checked by the equation given below, which
tells when to stop the process:
$$E = \sum_{(u_i, d_j, r_{ij}) \in T} e_{ij}^{2} = \sum_{(u_i, d_j, r_{ij}) \in T} \left(r_{ij} - \sum_{k=1}^{K} p_{ik} q_{kj}\right)^{2} \quad (4.8)$$
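One sweep of the update rules over the observed ratings can be sketched as follows (a simplified stochastic version without a regularization term, matching Eqs. (4.6)-(4.7); variable names follow the equations above):

```python
import numpy as np

def sgd_epoch(R, P, Q, alpha=0.0002):
    # One pass over the observed entries (R > 0), applying Eqs. (4.6)-(4.7).
    # Updates are sequential: the Q update uses the already-updated P row.
    for i, j in zip(*np.nonzero(R)):
        e_ij = R[i, j] - P[i] @ Q[:, j]      # error term of Eq. (4.3)
        P[i] += 2 * alpha * e_ij * Q[:, j]   # Eq. (4.6)
        Q[:, j] += 2 * alpha * e_ij * P[i]   # Eq. (4.7)
    # Overall squared error E of Eq. (4.8), used as the stopping criterion
    mask = R > 0
    return float((((R - P @ Q) ** 2)[mask]).sum())
```

Here P is N×K and Q is K×M; repeating `sgd_epoch` until the returned error stops decreasing reproduces the iteration described above.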
The matrix that is obtained from implementing the above algorithm is as follows:
As we see here, the approximations obtained are very close to the actual ratings,
and some predictions about the unknown values can also be made.
4.4 The Netflix Example
The online movie rental company Netflix announced a contest in 2006 for improving
their recommender system. The company released a training set consisting of more
than 100 million ratings given by about 480,000 anonymous customers to more than
17,000 movies, where each movie was rated on a scale of 1–5 stars, for the teams to
work on. The teams that took part submitted the predicted ratings
for a test set of approximately 3 million ratings. Netflix calculated the RMSE (root-
mean-square error) based on the held-out truth. The challenge was that whichever
team was the first to improve on Netflix’s algorithm’s RMSE performance by 10%
or more would win prize money of $1 million.
If none of the teams succeeded in reaching the 10% goal, then a prize of $50,000
was awarded to the leading team, i.e., the one with the lowest RMSE, at the end of
each year of the competition. This competition generated a lot of interest in the field
of collaborative filtering because, until that point in time, the data that was publicly
available for research in collaborative filtering was many magnitudes smaller than
what was released by Netflix. So the release of this data created a flurry of activities
and research worldwide. As per the Netflix website, more than 48,000
teams from 182 countries downloaded the data.
A team consisting of Yehuda Koren from Yahoo Research and Robert Bell and
Chris Volinsky from AT&T Labs—Research took the top spot in 2007 with their
entry named BellKor, winning the Progress Prize for 2007 with what was the best
score at that time: 8.43% better than Netflix. They later joined with the team BigChaos
to win the 2008 Progress Prize with a 9.46% improvement.
The winning entries included more than 100 different predictor sets, most of which
were factorization models or variants of the model discussed above. When the Netflix
user-movie matrix is factorized, it yields the most descriptive parameters for the
prediction of user preferences for movies. The first two factors from the factorization
of the Netflix data matrix are shown in Fig. 4.4, where movies are placed according to
their factor vectors. The first factor vector, on the x-axis, has comedies and
horror movies targeted at male or teenage audiences on one side, and movies with
serious undertones and strong female leads on the other. The second factor vector,
on the y-axis, has independent, critically acclaimed, quirky films at the top and
mainstream formulaic films at the bottom. Various films lie at the intersections of
these dimensions. Thus matrix factorization is a very crucial method in collaborative
filtering. By applying it successfully to the Netflix Prize data, it has been found that
such models offer much better accuracy as compared to the nearest neighbor technique.
The property that makes them even more convenient is that these models can naturally
integrate many important aspects of the data, like multiple forms of feedback, temporal
dynamics, and confidence levels (Figs. 4.3, 4.4).
import numpy

def matrix_factorization(R, P, Q, K, steps=5000, alpha=0.0002, beta=0.02):
    Q = Q.T
    for step in range(steps):
        # gradient-descent updates for every observed rating (0 = missing)
        for i in range(len(R)):
            for j in range(len(R[i])):
                if R[i][j] > 0:
                    eij = R[i][j] - numpy.dot(P[i,:], Q[:,j])
                    for k in range(K):
                        P[i][k] = P[i][k] + alpha * (2 * eij * Q[k][j] - beta * P[i][k])
                        Q[k][j] = Q[k][j] + alpha * (2 * eij * P[i][k] - beta * Q[k][j])
        # total regularized squared error over the observed ratings
        e = 0
        for i in range(len(R)):
            for j in range(len(R[i])):
                if R[i][j] > 0:
                    e = e + pow(R[i][j] - numpy.dot(P[i,:], Q[:,j]), 2)
                    for k in range(K):
                        e = e + (beta/2) * (pow(P[i][k], 2) + pow(Q[k][j], 2))
        if e < 0.001:
            break
    return P, Q.T
(a)
R = [
[5,3,0,1],
[4,0,0,1],
[1,1,0,5],
[1,0,0,4],
[0,1,5,4],
]
R = numpy.array(R)
N = len(R)
M = len(R[0])
K = 2
P = numpy.random.rand(N,K)
Q = numpy.random.rand(M,K)
nP, nQ = matrix_factorization(R, P, Q, K)
nR = numpy.dot(nP, nQ.T)
(b)
Fig. 4.2 a Code snippets for matrix factorization in Python. b Code snippet to be run on the above
algorithm, containing many zero values
Fig. 4.4 First two vectors from a matrix decomposition of the Netflix Prize data. The selected
movies were placed at the appropriate spot based on their factor vectors in two dimensions (Zhou
et al. 2008; Koren et al. 2009)
4.5 Summary
References
Bell R, Koren Y (2007) Scalable collaborative filtering with jointly derived neighborhood interpo-
lation weights. In: Proceedings on IEEE International Conference Data Mining (ICDM 07).
IEEE CS Press, pp 43–52
Funk S (2006) Netflix update: try this at home. http://sifter.org/~simon/journal/20061211.html
Hu YF, Koren Y, Volinsky C (2008) Collaborative filtering for implicit feedback datasets. In:
Proceedings on IEEE International Conference Data Mining (ICDM 08). IEEE CS Press, pp
263–272
Koren Y (2008) Factorization meets the neighborhood: a multifaceted collaborative filtering model.
In: Proceedings on 14th ACM SIGKDD Int’l Conference Knowledge Discovery and Data
Mining. ACM Press, pp 426–434
Koren Y, Bell R, Volinsky C (2009) Matrix factorization techniques for recommender systems.
Computer 42:42–49
Paterek A (2007) Improving regularized singular value decomposition for collaborative filtering.
In: Proceedings on KDD Cup and Workshop. ACM Press, pp 39–42
Salakhutdinov R, Mnih A (2008) Probabilistic matrix factorization. In: Proceedings on Advances
in Neural Information Processing Systems 20 (NIPS 07). ACM Press, pp 1257–1264
Sarwar BM et al. (2000) Application of dimensionality reduction in recommender system—a case
study. In: Proceedings on KDD Workshop on Web Mining for e-Commerce: Challenges and
Opportunities (WebKDD). ACM Press
Takács G et al (2007) Major components of the gravity recommendation system. SIGKDD Explor
9:80–84
Zhou Y et al. (2008) Large-scale parallel collaborative filtering for the Netflix prize. In: Proceedings
on 4th International Conference Algorithmic Aspects in Information and Management, LNCS
5034. Springer, pp 337–348
Chapter 5
Learning How to Rank and Collecting
User Behavior
Keywords LtR algorithms · Rank · Filter · Fake user profile · Biased feedback ·
Forward selection · Backward selection · Shilling attack · Obfuscated attack ·
Detection algorithms
5.1 Introduction
Most of the techniques discussed in the previous chapters treat the recommendation
problem as a prediction problem and rarely present all the ratings to the users. The
system usually suggests the top n items to the user. Moreover, a user normally pays
more attention to the results at the top of the list as compared to the items which
are ranked lower. So some predicted values may never be displayed to the user, and
optimizing the predicted values may not always provide the best recommendations. The
main reason for this is that the objective functions of prediction-based methods are
not fully aligned with the experience of the end user (Adomavicius and Tuzhilin
2005; Hu et al. 2008).
“Learning to rank” combines different types of data sources like distance, popu-
larity, or the outputs from a recommender system. The main thing here is that the
rank does not always need to be a part of the recommender system. So basically,
during ranking one looks at the input sources that can help in the ordering of the
objects. In this chapter, we take the example of a popular app named FourSquare
to show how the learning-to-rank technique is used for ordering suggestions. After
that, we take a look at the various types of LtR algorithms (Koren and Sill 2011; Liu
2009; Radlinski et al. 2008). Finally, we see how to detect a fake
profile and review the various types of attacks on a recommender system and
how to detect them.
Foursquare (which merged with Factual in 2020) has a website and a mobile app that is basically a guide to
cities. It is a city search platform, which provides personalized recommendations to
users about nearby places based on the location, to find the best matches for nearby
places to visit, the best places to shop, or the best restaurants based on the preferences
of that particular user. A person can search for information and reviews on various
places and events in a particular geographical area. It also learns the preferences of
a user over time and can predict and suggest the places the person is likely to go to,
even when the person is visiting another place anywhere in the world. The user can
search for information and reviews about various facilities and events in any part of
the world.
Suppose a person is in a new place and is searching for the list of nearby coffee
shops around him/her. The person will get a list of recommendations of coffee shops
around him/her pushed to the phone or shown in the web application. Usually, these
suggestions are not arranged in order of nearest distance or restaurant rating.
So how does the system in Foursquare work? It uses features
like spatial score, timeliness, popularity, here-now, the personal history of the user, etc.
to arrive at the rankings, but these do not include parameters like the distance or ratings of
the places. In order to incorporate these and obtain revised rankings, one needs
to design an appropriate weighted function and train a machine learning algorithm
with it. So basically the machine is now trained to rank the places based on distance
and ratings to get the optimized rankings. A simplified view of the Foursquare
problem is shown in Fig. 5.1. Other relevant parameters may also be chosen to
optimize the ranking. In the next section, we take a look at some LtR (Learning to
Rank) algorithms. One thing that needs to be kept in mind here is that while a hybrid
recommender system predicts ratings, LtR algorithms produce orderings.
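A pointwise flavor of this idea can be sketched as follows (the feature values, relevance labels, and the linear scoring function are all hypothetical illustrations; Foursquare's actual ranking model is not described in the text): learn one weight per feature from past behavior, then order venues by the learned score.

```python
import numpy as np

# Hypothetical venue features: [spatial score, popularity, rating, distance (km)]
X = np.array([[0.9, 0.2, 4.5, 0.3],
              [0.4, 0.9, 3.9, 1.2],
              [0.7, 0.5, 4.8, 0.8],
              [0.2, 0.1, 3.0, 2.5]])
# Hypothetical relevance labels derived from past user behavior
y = np.array([1.0, 0.6, 0.9, 0.1])

# Fit a linear scoring function score(v) = w . x_v (a pointwise LtR sketch)
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Venues ordered by learned score, best first
ranking = np.argsort(-(X @ w))
```

With more venues than features, the same `lstsq` call returns the least-squares fit instead of an exact solve, which is the usual training regime; pairwise and listwise LtR methods replace this squared-error objective with one defined on orderings.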
5.3 Feature Selection in Recommender Systems
Feature selection (Manning et al. 2008) is a very important process in the design
of efficient recommendation algorithms. So what is a feature? A feature is basically
an X-variable in a dataset and is usually defined by a column. Nowadays datasets
can have more than 100 features, which makes them very difficult to work with
directly. In such cases feature selection techniques come in very handy. What
feature selection does is reduce the number of features included in a
model without sacrificing the model's predictive power. Features that are
redundant or irrelevant can actually affect the performance of the model negatively,
so it is useful to identify these features and remove them.
The main benefit of feature selection is that it prevents overfitting: by removing
extraneous data, it helps the model focus on the important aspects of the data
and increases the accuracy of the predictions made by the model. There are
three types of feature selection methods: Wrapper methods (forward, backward,
and stepwise selection), Filter methods (ANOVA, Pearson correlation, variance
thresholding), and Embedded methods (Lasso, Ridge, Decision Tree).
• Wrapper methods—These methods start with a particular subset of features and calculate the importance of each of those features. The model then iterates, trying different subsets of features, until an optimal subset is found. The challenge with this approach is that for datasets with a very large number of features it requires very high computation time, and it can overfit the model when the number of data points is small. The important wrapper methods for feature selection are forward selection, backward selection, and stepwise selection.
Forward selection initially starts with zero features; for each candidate feature it runs a model and finds the p-value of the t-test or F-test performed on it. The feature with the lowest p-value is then selected and added to the working model. Next, it takes this first feature and runs models with a second feature added, selecting the second feature with the lowest p-value. Similarly, it takes the two previously selected features and runs the model with a third feature, and so on. Therefore only those features with significant p-values are added to the model, and any feature with a high (insignificant) p-value will not be included in the final model.
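A minimal numpy sketch of forward selection follows. Instead of computing the t-test p-values explicitly, it ranks candidate features by the drop in residual sum of squares, which yields the same ordering as the partial F-test at each step; this is an illustrative simplification, not a full statistical implementation:

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of an OLS fit of y on X (with intercept)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

def forward_select(X, y, k):
    """Greedily add the feature whose inclusion lowers the RSS the most
    (equivalent to adding the feature with the lowest partial F-test p-value)."""
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(k):
        best = min(remaining, key=lambda j: rss(X[:, selected + [j]], y))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: only features 0 and 2 actually influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3 * X[:, 0] + 0.5 * X[:, 2] + 0.1 * rng.normal(size=200)
chosen = forward_select(X, y, 2)
```

A production version would additionally stop adding features once no remaining candidate reaches a chosen significance level.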
The backward selection process starts with all the features in the dataset and then runs the model to calculate the p-value associated with the t-test or F-test for each feature. The feature with the largest insignificant p-value is removed from the model, and the process is repeated iteratively until all features with insignificant p-values have been removed from the model.
The stepwise selection method is a hybrid of the forward and backward selection methods. The process starts with zero features and adds the feature with the lowest significant p-value, then finds the second feature with the lowest significant p-value. In each subsequent iteration, it adds the next feature with the lowest significant p-value and also removes any previously added features that now have insignificant p-values.
The ANOVA (Analysis of Variance) method, as the name suggests, examines the variation within the treatments of a feature as well as between treatments. These variances are useful for this particular method because they let us find out whether a feature properly accounts for variation in the dependent variable.
from sklearn.feature_selection import f_regression

def ANOVA(X, y):
    """Quick linear model for sequentially testing the effect of many regressors."""
    F, pvalues = f_regression(X, y)
    return F, pvalues
The Pearson correlation coefficient measures the similarity of two features and ranges between −1 and 1. If two features have a coefficient close to −1 or 1, it implies that they may be related or highly correlated with each other. The cutoff between high and low correlation depends on the range of the correlation coefficients in a dataset.
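As a sketch, a simple correlation-based filter can be written as follows; the 0.9 cutoff is an illustrative choice, not a universal rule:

```python
import numpy as np

def drop_correlated(X, names, cutoff=0.9):
    """Greedily keep features, dropping any whose absolute Pearson
    correlation with an already-kept feature exceeds the cutoff."""
    corr = np.corrcoef(X, rowvar=False)
    keep = []
    for j in range(X.shape[1]):
        if all(abs(corr[j, k]) < cutoff for k in keep):
            keep.append(j)
    return [names[j] for j in keep]

# f1 is (almost) a linear copy of f0, so it should be dropped.
rng = np.random.default_rng(1)
a = rng.normal(size=100)
X = np.column_stack([a, 2 * a + 0.01 * rng.normal(size=100),
                     rng.normal(size=100)])
kept = drop_correlated(X, ["f0", "f1", "f2"])
```

Which member of a highly correlated pair to keep is a design choice; this sketch simply keeps the one encountered first.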
from sklearn.feature_selection import VarianceThreshold

def variance_threshold_select(data, threshold):
    """Keep only the DataFrame columns whose variance exceeds the threshold."""
    selector = VarianceThreshold(threshold)
    selector.fit(data)
    return data[data.columns[selector.get_support(indices=True)]]
Ridge Regression is a regularization method used to reduce overfitting. It adds a penalty term, lambda*slope^2, to the sum of squared residuals in the cost function. When lambda is zero, the penalty is also zero, and therefore only the sum of squared residuals is minimized. As lambda increases asymptotically, we arrive at a slope which is close to zero; so the larger the value of lambda, the less sensitive the prediction becomes to the independent variable (Fig. 5.2).
The Lasso Regression is also a regularization method to reduce overfitting. Like Ridge, it puts a penalty on the beta coefficients by adding a penalty term to the cost function of the model, with a lambda value that has to be tuned. It is similar to Ridge regression with one very major difference: the penalty function is now lambda*|slope|. The main practical difference between Ridge regression and Lasso Regression is that Lasso can force a beta coefficient to zero, so that the corresponding feature is removed from the model. So if we are looking to reduce model complexity, Lasso regression is the preferred method. The result of Lasso Regression is otherwise similar to that of Ridge regression; both can be used for logistic regression, regression with discrete values, and regression with interactions. The difference between the two methods becomes clear when we increase the value of lambda: Ridge can only shrink the slope asymptotically close to zero, whereas Lasso can shrink the slope all the way to zero (Fig. 5.3).
The advantage is evident when we have many parameters in the model. In Ridge, if we increase the value of lambda, the most important parameters may shrink a little, while the less important parameters stay at relatively high values. In contrast, in Lasso, when the value of lambda is increased, the most important parameters shrink a little, but the less important parameters go to nearly zero. In this way Lasso can exclude the unimportant parameters from the model.
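The contrast can be seen in a small experiment with scikit-learn; the synthetic data and the alpha values (scikit-learn's name for lambda) are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Only the first two features actually matter in this toy data.
y = 4 * X[:, 0] + 2 * X[:, 1] + 0.5 * rng.normal(size=200)

ridge = Ridge(alpha=10.0).fit(X, y)   # penalty: lambda * slope^2
lasso = Lasso(alpha=1.0).fit(X, y)    # penalty: lambda * |slope|

# Ridge shrinks every coefficient but leaves them all nonzero;
# Lasso drives the coefficients of the irrelevant features exactly to zero.
n_zero_ridge = int(np.sum(ridge.coef_ == 0))
n_zero_lasso = int(np.sum(lasso.coef_ == 0))
```

Inspecting `ridge.coef_` and `lasso.coef_` side by side shows the effect described above: only Lasso performs feature selection by zeroing coefficients.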
The Decision Tree is another method for feature selection and uses a regression tree or a classification tree depending on whether the response variable is continuous or discrete, respectively. A decision tree regressor builds a tree incrementally by splitting the dataset into subsets, giving rise to a tree with decision nodes and leaf nodes. A decision node has two or more branches, each representing a value of the attribute tested. A leaf node represents a decision on the numerical target. The topmost node is called the root node and corresponds to the best predictor.
The tree is built by creating splits on certain features in order to predict the response variable. At each split, the function used to build the tree checks all possible splits over all features and chooses the feature that splits the data into the most homogeneous groups. In other words, at each point in the tree it selects the feature that best predicts what the response variable will be.
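As an illustrative sketch, the importance scores of a fitted scikit-learn tree can be read off to rank features; the data here is synthetic:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = 5 * X[:, 2] + 0.1 * rng.normal(size=300)  # feature 2 drives the target

tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)
# feature_importances_ reflects how much each feature reduced impurity
# across all the splits in which it was used.
most_important = int(np.argmax(tree.feature_importances_))
```

Features with near-zero importance are candidates for removal.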
The LtR (Learning to Rank) algorithms can be broadly classified into three categories based on how they evaluate the ranked list during the training phase—pointwise, pairwise, and listwise, as shown in Fig. 5.4.
• Pointwise—In this approach, a score is produced for each item and the items are then ranked accordingly. This is similar to the approaches in the recommendation systems of previous chapters. Ranking differs from rating prediction in that ranking does not care whether the utility score of an item is even one million, as long as the score induces a valid rank in the system.
• Pairwise—As the name suggests, this is a binary classifier that uses a function taking two items as input and returning the ordering of the two items. The training data consists of pairs of items for which a user has expressed a preference; each pair only carries the information of whether the first item is preferred to the second, encoded as +1 or −1. In this binary classification problem, the learning method implicitly aims to minimize the number of pairwise inversions in the training data.
• Listwise—This is the best LtR approach, as it takes the entire ranked list and optimizes it. Listwise ranking is preferred because it recognizes that ordering matters more at the top of a list than at the bottom. Pointwise and pairwise algorithms cannot differentiate where an item sits on the ranked list.
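One standard way to realize the pairwise idea, sketched below, is to reduce ranking to binary classification on feature differences (a RankNet-style reduction); the data is synthetic and the setup is illustrative:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
items = rng.normal(size=(100, 3))              # item feature vectors
true_w = np.array([2.0, -1.0, 0.5])            # hidden "relevance" weights
utility = items @ true_w

# Build preference pairs: label +1 when item i is preferred to item j.
i = rng.integers(0, 100, size=500)
j = rng.integers(0, 100, size=500)
mask = i != j
i, j = i[mask], j[mask]
X_pairs = items[i] - items[j]                  # classify feature differences
y_pairs = np.where(utility[i] > utility[j], 1, -1)

clf = LogisticRegression().fit(X_pairs, y_pairs)
scores = items @ clf.coef_.ravel()             # score items, then sort
ranking = np.argsort(-scores)
```

The learned weight vector plays the role of the pairwise ordering function: sorting by the induced scores approximately minimizes pairwise inversions on the training pairs.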
With the gradual advent of deep learning, ranking algorithms are increasingly being integrated with deep learning. Four types of algorithms are typically used for ranking now: logistic regression (LR), factorization machines (FM), gradient boosted decision trees (GBDT), and DeepFM models.
The logistic regression (LR) model is the most classic binary algorithm; it is easy to use and needs little computation power. The factorization machine (FM) has been applied in recent years to various customer scenarios with promising results; it uses the inner product for feature representation. The third method is a logistic regression method that uses gradient boosted decision trees (GBDTs) and feature encoding to increase the interpretability of the data features. The fourth algorithm, DeepFM, uses a combination of deep learning and classic learning algorithms.
Apart from collecting content about the items for a recommendation, it is also very important to obtain information about the likes and dislikes of a particular user in order to complete the recommendation process. While the collection of data is done during the offline phase, the recommendations are generated during the online phase, when a particular user is interacting with the system. An active user is one for whom the prediction is being made at a given point in time. During the online phase, the personal preferences of the user are combined with the content to create predictions. The data related to likes and dislikes can be in any of the following forms:
Ratings—Here the users will specify ratings that will indicate their preferences
for a particular item. The ratings may be binary, interval based, ordinal, or even real-
valued. The choice of the rating type usually has a big impact on the type of model
that is to be used for learning about the user profiles.
Implicit feedback—This refers to user actions such as buying or browsing an item. In a majority of cases, only the positive preferences of the user are captured by implicit feedback, while negative preferences are not collected.
Text Opinions—In some instances, the opinions expressed by the user might be in the form of text descriptions. In that case, implicit ratings can be extracted from these opinions. This type of rating extraction involves opinion mining and sentiment analysis.
Cases—Users may also sometimes specify examples or cases of items that they are interested in. These cases can then be utilized as implicit feedback by some algorithms.
In all of the above cases, the likes and dislikes of a user about an item are converted
to unary, binary, interval-based, or real ratings.
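The conversion step can be sketched with a few small helpers; the function names and the choice of which implicit actions count as positive are illustrative assumptions, not a standard API:

```python
def normalize_interval(rating, lo, hi):
    """Map an interval-based rating on [lo, hi] to the unit interval."""
    return (rating - lo) / (hi - lo)

def binary_to_rating(thumb):
    """Map binary feedback (+1 like / -1 dislike) to {1.0, 0.0}."""
    return 1.0 if thumb > 0 else 0.0

def implicit_to_unary(events, positive_actions=frozenset({"buy", "view"})):
    """Treat implicit positive actions as unary 'likes' per (user, item)."""
    return {(e["user"], e["item"])
            for e in events if e["action"] in positive_actions}
```

Once all feedback forms are on a common scale, the same learning machinery can consume them uniformly.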
5.7 Detection of Fake/Malicious Profiles
With the present situation of information overload, consumers have a tough time finding and selecting the information that is most relevant to them. Recommender systems are a boon for potential customers because they use information filtering techniques to help consumers choose items. However, with the rise in the number of users and items featured on recommender systems, new challenges have also come up. Collaborative filtering is one of the most widely used recommendation techniques at present, but unfortunately it is also extremely prone to shilling/profile injection attacks. These attacks manipulate the recommendation process in order to promote or demote a particular item. In this section, we give an overview of the various types of shilling attacks and some detection algorithms (Bhaumik et al. 2006, 2007). A more comprehensive study is outside the scope of this book.
As we have already discussed in Chap. 3, recommendation systems can be broadly classified into two types: collaborative filtering-based and content-based. The content-based approach recommends products to users by comparing the products to the users' profiles.
The collaborative filtering recommender system, on the other hand, analyses the past behavior of a user to find the best matches. It is based on the assumption that users with similar behaviors will have similar interests, so it essentially depends on the relationships between users and items. Unfortunately, because of its openness and dependency on user ratings, collaborative filtering is very often prone to shilling attacks, or profile injection attacks.
A shilling attack is a type of attack, where a malicious user profile is deliberately
inserted into an existing collaborative filtering data set so that the outcome of the
recommendation system gets changed. The result is that these injected profiles will
explicitly rate the items in such a manner that the main target item will either get
promoted or demoted.
We explain the effect of a shilling attack with the following example. Suppose there are only two users A and B in a system, and they have given similar ratings to some products, say p1, p2, and p4. Now if user B gives a high rating to product p3, then p3 will also be recommended to user A. Collaborative filtering basically finds the top x users who are similar to the target user u; the ratings of products for user u are then predicted from these similar users' ratings, and the top few highly rated products that user u has not yet rated are recommended to u. So whenever a new user with a similar profile gives a high rating to a particular product, that product will be recommended to the other users with similar profiles. In the same way, if some new users give a low rating to a particular product, the chances of that product being recommended to other users with similar profiles become low.
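The example can be made concrete with a tiny user-based CF sketch (cosine similarity over co-rated items, k nearest neighbors); the numbers are invented for illustration:

```python
import numpy as np

def predict(ratings, user, item, k=3):
    """User-based CF: predict `user`'s rating of `item` as the mean rating
    of the k most similar users (cosine similarity) who rated the item."""
    raters = [u for u in range(len(ratings))
              if u != user and not np.isnan(ratings[u, item])]
    def sim(u):
        a, b = ratings[user], ratings[u]
        m = ~np.isnan(a) & ~np.isnan(b)
        return float(a[m] @ b[m] /
                     (np.linalg.norm(a[m]) * np.linalg.norm(b[m]) + 1e-9))
    top = sorted(raters, key=sim, reverse=True)[:k]
    return float(np.mean([ratings[u, item] for u in top]))

# Items p1..p4 as columns; user A (row 0) has not rated p3 (column 2).
base = np.array([[5.0, 4.0, np.nan, 5.0],
                 [5.0, 4.0, 2.0,    5.0]])
before = predict(base, user=0, item=2)          # only honest neighbor B

# Push attack: two injected profiles mimic A's tastes and rate p3 with 5.
shills = np.array([[5.0, 4.0, 5.0, 5.0],
                   [5.0, 4.0, 5.0, 5.0]])
after = predict(np.vstack([base, shills]), user=0, item=2)
```

Before the attack the prediction for p3 follows the one honest neighbor; after the injection the shill profiles, being maximally similar to user A, pull the predicted rating upward.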
In Fig. 5.5, product X gets promoted and recommended to other users based on the
high ratings of a malicious user who has injected his profile into the recommendation
system. Shilling attacks can be classified into two categories, a push attack or a nuke
attack, depending on what purpose it is being used for. If it is being used to gain
promotion for an item then it is a push attack and if it is being used to demote an
item then it is termed a nuke attack, and both types are used to gain an edge or profit
over a competitor.
There are different types of shilling attacks, as shown in Fig. 5.6, and they are broadly classified as standard attacks and obfuscated attacks (Bryan et al. 2008; Burke et al. 2006). Standard attacks make no special attempt to avoid detection in a recommender system, so detection algorithms have a higher chance of catching these types of shilling/profile injection attacks. Some examples of this type of attack are the random attack, average attack, bandwagon attack, reverse bandwagon attack, segmented attack, probe attack, and love/hate attack. Obfuscated attacks, on the other hand, try to avoid detection by obfuscating their attack signatures. Most of these methods make small modifications to the standard techniques to obtain obfuscation. The obfuscation may sometimes reduce the impact of the attack, but on the plus side such attacks have a lower chance of getting detected. Some techniques of this type are noise injection, user shifting, target shifting, average over popular, the mixed attack, the power item attack, the power user attack, and SAShA.
• Standard Attacks
The Random Attack, or RandomBot attack, is the simplest type of shilling attack, where the items rated by the attack profile are chosen randomly, except for the target item. The ratings for these filler items are set around the overall system mean, and the target item is given the maximum or minimum rating depending on whether it is a push or a nuke attack. It is easy to implement but not very effective; it is used more to disrupt a recommendation system than to actually promote an item.
The Average Attack is similar to the random attack as far as selecting the items is concerned, but each filler item is rated around that item's own mean rating rather than the overall system mean, which makes the injected profiles harder to distinguish from genuine ones.
In the Bandwagon Attack, the profiles that are generated by the attackers are filled
with popular items with high ratings, and the target item is given the highest ratings.
The Reverse Bandwagon Attack is the opposite of the Bandwagon Attack where
the target product is given the lowest ratings and is used for nuke attacks.
In the Segmented Attack, a specific group of users who are likely to buy an item
in an e-commerce setup are targeted. This attack has a high impact as it is aimed at
a particular segment.
The Probe Attack is usually not generalized across all systems. Here the attacker exploits the predicted rating scores projected by some recommendation systems and gives genuine ratings to a few items; when the recommendation system suggests further items, the attacker builds the rated list of items based on these suggestions.
The Love/Hate Attack is a very effective type of nuke attack. In this attack, filler items are chosen randomly by the attacker and given the highest ratings, while the target items are given the lowest ratings. Although it seems like a simple model, it is highly effective. Although it was designed primarily for nuke attacks, it can also be applied to push attacks, but it is not as effective in that case.
• Obfuscated Attacks
The Noise Injection method adds a Gaussian-distributed random number, multiplied by a constant, to each rating in a chosen subset of the injected profiles.
In the User Shifting technique, a subset of the rated items of each of the injected profiles is changed.
In Target Shifting, the ratings of the target items are shifted to one level below the highest possible rating in push attacks.
The Average over Popular technique obfuscates the average attack: the filler items are chosen, with equal probability, from among the top x% most popular items.
The Mixed Attack is achieved by applying the random, average, bandwagon, and
segmented attacks in equal proportions.
The Power Item Attack uses power items that are chosen by some particular method, where power items are defined as a set of items capable of influencing the largest number of other items.
The Power User Attack is similar to the above attack, but here the set of users
who have the maximum influence on the broadest group of users is chosen.
SAShA is a technique that uses semantic features extracted from a knowledge graph to strengthen the usual attack models against collaborative filtering.
There are various types of detection algorithms, broadly classified as supervised and unsupervised detection methods. Since supervised techniques need labeled data during the training process, and labeled data is scarce in recommendation systems, unsupervised methods are used more often here. A majority of these detection algorithms target a particular trait of a shilling attack. Even though obfuscation makes it possible to evade detection to some degree, some innate features must be present in the attack for it to be effective to a certain extent. These traits can be user-based or item-based (Fig. 5.7).
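As one concrete example of a user-based trait, the Rating Deviation from Mean Agreement (RDMA) measure from the shilling-detection literature can be sketched as follows; profiles whose ratings deviate strongly from the item means, especially on sparsely rated items, receive high scores:

```python
import numpy as np

def rdma(ratings):
    """RDMA: for each user, the mean of |r_ui - item_mean_i| / (number of
    ratings of item i) over the items the user rated. Unusually high
    values flag potentially malicious profiles."""
    item_mean = np.nanmean(ratings, axis=0)
    n_raters = np.sum(~np.isnan(ratings), axis=0)
    scores = []
    for row in ratings:
        rated = ~np.isnan(row)
        dev = np.abs(row[rated] - item_mean[rated]) / n_raters[rated]
        scores.append(float(dev.mean()))
    return np.array(scores)

ratings = np.array([[4.0, 3.0, 4.0],
                    [4.0, 3.0, 4.0],
                    [4.0, 3.0, 4.0],
                    [1.0, 5.0, 1.0]])   # the last profile rates against consensus
suspect = int(np.argmax(rdma(ratings)))
```

In practice such a trait score is combined with others (e.g., filler size, similarity to other profiles) and thresholded or clustered, rather than used alone.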
5.10 Summary
In this chapter, we have discussed the crucial topic of how to properly rank the items in a recommendation system and given an overview of the types of ranking algorithms. The chapter also discusses the problem of fake profiles and how the presence of malicious users can adversely affect the suggestions made by a recommendation system and reduce its credibility. An overview of shilling attacks and their types has also been given, along with techniques to detect such attacks and the corresponding actions to be taken. In Chap. 8, we discuss how to build trust-centric and attack-resistant recommendation systems.
The recommendation algorithms described in the previous chapters used past ratings of users, and the preferences of users with similar profiles, to arrive at their suggestions of products for new users. But in some situations sufficient past ratings are not available, or there are complex variations in the combinations of preferences for particular products, and the aforementioned methods are not very effective. This is especially true for products that are not bought frequently, for high-end luxury products with high levels of personal customization, and for products for which the preferences of users evolve over time. To handle such situations, knowledge-based, hybrid, and ensemble-based techniques are highly useful for giving accurate suggestions. These are discussed in the next chapter.
Think Tank
References
Adomavicius G, Tuzhilin A (2005) Toward the next generation of recommender systems: a survey
of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17(6):734–749
Bhaumik R, Williams C, Mobasher B, Burke R (2006) Securing collaborative filtering against
malicious attacks through anomaly detection. In: Workshop on Intelligent Techniques for Web
Personalization (ITWP)
Bhaumik R, Burke R, Mobasher B (2007) Crawling attacks against web-based recommender
systems. In: International Conference on Data Mining (DMIN), pp 183–189
Bryan K, O’Mahony M, Cunningham P (2008) Unsupervised retrieval of attack profiles in
collaborative recommender systems. In: ACM Conference on Recommender Systems, pp
155–162
Burke R, Mobasher B, Williams C, Bhaumik R (2006) Classification features for attack detection
in collaborative recommender systems. In: ACM KDD Conference, pp 542–547
Hu Y, Koren Y, Volinsky C (2008) Collaborative filtering for implicit feedback datasets. In: ICDM
’08. IEEE Computer Society, pp 263–272
Koren Y, Sill J (2011) OrdRec: an ordinal model for predicting personalized item rating distributions. In: Proceedings of the fifth ACM conference on recommender systems, RecSys '11. ACM, pp 117–124
Liu T-Y (2009) Learning to rank for information retrieval. Found Trends Inf Retr 3(3):225–331
Manning CD, Raghavan P, Schütze H (2008) Introduction to information retrieval, 1st edn. Cambridge University Press, Cambridge
Radlinski F, Kleinberg R, Joachims T (2008) Learning diverse rankings with multi-armed bandits. In: Proceedings of the 25th international conference on machine learning, ICML '08. ACM, New York, NY, pp 784–791
Rendle S, Freudenthaler C, Gantner Z, Schmidt-Thieme L (2009) BPR: Bayesian personalized ranking from implicit feedback. In: UAI '09. AUAI Press, pp 452–461
Chapter 6
Knowledge-Based, Ensemble-Based,
and Hybrid Recommender Systems
6.1 Introduction
Sometimes users require information on categories of items that are not bought very frequently. Examples of such products are houses, cars, tourism plans, financial services, and other costly luxury products. There may not be a sufficient number of ratings available for the recommendation process to work on (Aggarwal et al. 2001). Since such items are purchased less frequently and come in many configurations and combinations, it is difficult to gather sufficient ratings for a particular combination of parameters for a product. The cold start problem is a similar problem encountered in recommendation systems when sufficient ratings are not available. In addition, the preferences of consumers may change over time; for example, preferences for the models and specifications of cars evolve, so stored past user ratings may no longer be useful. Also, different combinations of parameters may be relevant to different users, like color, engine capacity, fuel efficiency, brand, interior
options, etc. It is thus a very complex task to generate sufficient ratings for such a varied combination of parameters. These cases are dealt with by knowledge-based recommender systems, which do not use ratings as the basis of their recommendation process. Here the recommendation is based on the similarities between the requirements of the users and the descriptions of the products, or on constraints specified by the user. The process uses a knowledge base, hence the name of this approach. While content-based systems and collaborative filtering are based entirely on the past actions or ratings of a user, or the actions or ratings of people with similar profiles, knowledge-based systems differ in that they ask the users to explicitly specify their requirements (Bobadilla et al. 2013; Bohnert et al. 2008).
Sometimes, when the inputs have a wide variety, a designer has the flexibility to use various types of recommender systems for performing the same task. This gives rise to the option of hybridization, where a combination of different types of systems is implemented to get the best of both worlds. Hybrid recommender systems are closely related to the area of ensemble analysis, where multiple machine learning algorithms are combined to create a more robust model. Ensemble-based recommender systems not only use multiple data sources but can also increase the effectiveness of a particular type of recommender system by combining multiple models of the same type. In this chapter the knowledge-based, ensemble-based, and hybrid recommendation systems are discussed. A brief introduction to the cold start problem is also given, since it is an interesting problem and a popular topic for research.
The cold start problem is very well known in recommendation systems and is also an interesting aspect for research. This problem is usually faced in computer-based information systems that involve some amount of data modeling. More precisely, it is the problem whereby the system cannot draw any inferences for users or items about which it has not yet gathered sufficient information. Recommender systems perform information filtering to present items that may be of interest to the user. To do this, a recommendation system compares the profile of a user to some reference characteristics, which may be characteristics of the item or the past behavior of the user. Depending on what the system is designed to do, the user can be linked to different types of interactions, like ratings, purchases, number of page visits, etc. There are three cases of the cold start problem: new community, new item, and new user (Fig. 6.1).
The new community case, also known as systemic bootstrapping, occurs when the system is at the point of startup, when there is almost no user interaction and no information on which the system can depend. In this case, the disadvantages of both the new user and new item cases are present, because of which the techniques used to deal with those two cases are not applicable to system bootstrapping.
In the new item case, whenever a new item is added to a system or catalogue, it may have content information but will not have any interactions; this is the item cold start problem. The problem mainly affects collaborative filtering algorithms, because they use item interactions to make recommendations: if there are no interactions, a purely collaborative algorithm cannot recommend the item. The new user case is analogous: a user who has just joined the system has no interaction history yet, so personalized recommendations cannot be generated until sufficient preferences have been collected.
A number of solutions have been devised to tackle the cold start problem. Here
in the following sections, we discuss three new types of recommendation systems,
namely knowledge-based, ensemble-based, and hybrid systems.
6.3 Knowledge-Based Recommender Systems
Since content-based and collaborative systems both need a lot of information about previous purchases and user ratings, they are unable to make good recommendations when there is a lack of available data. This is also referred to as the cold start problem. They are therefore not suitable for applications in areas with highly customized products, like real estate, cars, or other luxury goods. These items are generally bought quite rarely, and therefore sufficient ratings are unavailable in many cases. For example, a person may be searching for a house with a specific number of rooms, facing a particular direction, or in a particular area; it is unlikely that a sufficient number of past ratings exists for such a specific combination of parameters. Similarly, ratings and preferences of cars based on build, engine, color, etc. evolve over time, and past ratings may not suit present scenarios, as constant improvements are going on in the automobile industry. In such circumstances, knowledge-based recommender systems work very well for products that are not bought regularly, because they rely on explicit
user solicitations of requirements. The difference in concept between collaborative filtering, content-based, and knowledge-based systems is shown in Fig. 6.2 (Boldi et al. 2008; Bridge et al. 2005).
Knowledge-based recommender systems (Burke 2000; Burke et al. 1996;
Felfernig et al. 2007) work well in the following types of scenarios:
• In situations where a user wishes to explicitly specify his/her requirements; interaction is therefore very important in such systems. This type of detailed feedback is not possible in collaborative or content-based systems.
• In cases where obtaining ratings for a particular type of product is difficult, because the product domain is complex in terms of item types and available options.
• In cases where the ratings may be time-sensitive: ratings on older models of cars, computers, or mobile phones will not be useful beyond a particular time, because more advanced versions have been rolled out in the market.
So in knowledge-based systems, the user has greater control over the guidance of the recommendation process, because the user is able to give specific details for a complex problem domain. Knowledge-based systems can be divided on the basis of their methods of interaction with the users and the respective knowledge bases needed for these interactions: constraint-based and case-based recommender systems.
• Constraint-based recommender systems—In this type of system, the user normally gives some constraints, like an upper or lower limit on the attributes of the item. Some domain-specific rules are implemented to match the requirements of the user with the attributes of the items; e.g., a user may search for a diesel car with cruise control and a manual gearbox. User attributes can also be used in the searching process. Depending on how many results the search returns, the user may modify or relax the constraints, for example if too few results are returned. The above process can be repeated a number of times until the user gets the desired results.
• Case-based recommender systems—In this system the user specifies particular cases to be used as targets or anchoring points. The defined item attributes are then used to retrieve similar items based on similarity metrics that are defined specifically for the domain. So it is the similarity metrics that form the basis of the domain knowledge used for the recommendations here. Sometimes the results returned to the users are interactively modified to be used as new targets. Suppose a user gets a result that is almost, but not quite, what he or she is searching for; then the user may retry the query with some modifications to the attributes. Sometimes a directional critique is also used: a method where items whose specific attributes are less than or greater than those of a particular item are pruned away, to guide the user toward the final recommendations.
The interactive processes of both cases are shown in Fig. 6.3a, b.
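The constrain-then-relax loop of the constraint-based approach can be sketched as follows. The catalog, attribute names, and thresholds are all hypothetical, invented for illustration rather than taken from any particular system:

```python
# Hypothetical mini-catalog in the spirit of Table 6.1 (prices in lakhs).
HOUSES = [
    {"id": 1, "beds": 3, "price": 63},
    {"id": 2, "beds": 5, "price": 90},
    {"id": 3, "beds": 4, "price": 75},
]

def search(houses, min_beds=0, max_price=float("inf")):
    """Return only the items that satisfy the user's hard constraints."""
    return [h for h in houses
            if h["beds"] >= min_beds and h["price"] <= max_price]

# The initial query is too strict and returns nothing...
first_try = search(HOUSES, min_beds=4, max_price=70)
# ...so the user relaxes the price constraint and retries.
results = search(HOUSES, min_beds=4, max_price=80)
```

Here `first_try` comes back empty, prompting the relaxation step; a real system would repeat this loop interactively until the user is satisfied.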
The interactions that users have with the recommender system can be conversational,
search-based, or navigation-based. They are explained
further as follows:
• Conversational system—Here the preferences of the user are found through a
feedback loop. This system is useful because when dealing with a complex item
domain, the preferences of the user can only be found through an iteration of
conversations.
• Search-based system—In this system, the user's preferences are found by asking
a preset sequence of queries.
• Navigation-based system—Here, when the user gets a recommended item, he/she
specifies a number of change requests to it, and after an iteration of such change
requests can finally arrive at the desired item. Such systems are also called
critiquing recommender systems.
Given the overview, a more detailed discussion of the constraint-based and case-
based systems is given in the next sections.
This system allows the users to specify hard requirements or constraints on the
item attributes. In addition, there will be a set of rules for matching the customer
requirements with the item attributes. But customers do not necessarily
specify their queries in terms of the same attributes that describe the items. So there
also needs to be an additional set of rules that relate the customer requirements
to the item attributes. For example, taking Table 6.1, the following are the
customer-specified attributes:
Marital-Status (categorical), Family-size (numerical), Suburban-or-city (binary),
Min-Bedrooms (numerical), Max-Bedrooms (numerical), Max-Price (numerical)
These attributes can be either inherent customer properties or they may be customer
requirements for the product. These requirements are often specified interactively
during a conversation between a customer and a recommendation system. Some of
these attributes are also not included in Table 6.1. But while some of the customer
requirements like max price may be mapped easily, other mappings like suburban or
rural may not be as obvious.
Table 6.1 Examples of attributes for a recommendation app for buying houses
Item id Beds Baths Locality Type Floor area Price
1 3 2 Pune Townhouse 1600 63 L
2 5 2.5 Chennai Split level 3600 90 L
3 4 2 Delhi Ranch 2600 75 L
4 2 1.5 Bangalore Condo 1500 60 L
5 4 2 Kolkata Colonial 2700 80 L
This is done by using something called knowledge bases. They contain additional
rules that help in the mapping of customer requirements/attributes with product
attributes.
Suburban-or-rural = Suburban ⇒ Locality = <List of relevant localities>
These rules are called filter conditions, because they map the requirements of the
user to the item attributes and then this mapping is used for filtering the retrieved
results.
Some compatibility constraints relate customer attributes to one another and are
useful when customers give their personal information during an interaction. One
such example is as follows:
Marital-status = single ⇒ Min-Bedrooms ≤ 5
So it has been inferred, either through data mining of historical data or through
domain-specific experience, that single individuals do not prefer buying large
houses. In the same way, large families will not prefer small houses.
So this constraint can be modeled with the following rule:
Family-Size ≥ 5 ⇒ Min-Bedrooms ≥ 3
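The two kinds of rules can be encoded directly as small functions. This is only a sketch: the locality list below is a placeholder that a real knowledge base would supply, and the attribute names are invented:

```python
# Hypothetical filter condition: map the customer requirement
# "Suburban-or-rural = Suburban" onto an item-attribute constraint.
SUBURBAN_LOCALITIES = {"Pune", "Bangalore"}       # placeholder list

def filter_condition(requirements):
    if requirements.get("suburban_or_rural") == "Suburban":
        return lambda item: item["locality"] in SUBURBAN_LOCALITIES
    return lambda item: True                      # no restriction otherwise

# Compatibility constraint: Family-Size >= 5 implies Min-Bedrooms >= 3.
def apply_compatibility(requirements):
    reqs = dict(requirements)
    if reqs.get("family_size", 0) >= 5:
        reqs["min_bedrooms"] = max(reqs.get("min_bedrooms", 0), 3)
    return reqs
```

The filter condition produces a predicate over items, while the compatibility constraint rewrites the customer's own requirements before the search runs.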
Similarity metrics are used here for retrieving examples that are similar to the
specified cases. For example, in Table 6.1, a user can specify a price, a number of
bedrooms, and a preferred locality as the set of attributes. But here
there are no hard constraints, like minimum or maximum values, enforced on the
attributes, unlike in constraint-based systems. A similarity function is used to
retrieve the cases that are most similar to the cases specified by the users. So if
no homes match the user specifications exactly, the similarity
function is used for retrieving and ranking the items that are as similar as possible to
the user query. So these types of systems do not face the problem of retrieving
empty result sets. There are also many differences between constraint-based and case-based
recommender systems in how the results are refined. While the former
uses requirement relaxation, modification, and tightening to refine the
results, the latter uses repeated modification of the user's query requirements
until a suitable solution is found. This led to the method of critiquing. The basic
principle of critiquing is that a user can select one or more of the
retrieved results and then specify further queries like:
Give me more items like X, but differing in attribute(s) Y according to guidance Z.
The basic aim of critiquing is to support interactive browsing of the item space so
that a user slowly becomes aware of the further options available from the examples
that have been retrieved. The advantage of interactive browsing of the item space is
that the user gradually learns during the process of formulating interactive queries.
In many cases a user may be able to arrive, through repeated and interactive
exploration, at choices that could not have been reached at the beginning
of the search. If we consider the example of Table 6.1, a user can specify a preferred
price, the number of bedrooms and a preferred locality. But it is also possible that a
user enters a target address for asking for examples of possible options in houses he/
she may be interested in.
By repeating the process of critiquing, the results that a user finally gets are
sometimes very different from what the user's query had initially specified. This is
because quite often a user may not be able to easily articulate all of the preferred
features at the beginning. For example, a user may be unaware of the prices of houses
with a desired set of features when first starting the query process. So with the
help of this interaction process, the gap between the item availability and the user's
perceptions is gradually narrowed down.
So for the efficient working of a case-based recommender system, there are two
main points that need to be considered while designing the system:
• Similarity metrics, where the importance of the various attributes needs to be
incorporated into the similarity function for the effective working of the system.
• Critiquing methods, of which different types are used to support the various
goals of exploration.
Similarity Metrics
Suppose there is an application which has d attributes. We need to find the similarity
between two partial attribute vectors defined on a subset S of the d attributes (i.e.,
|S| = s ≤ d).
Suppose X = (x1, ..., xd) and T = (t1, ..., td) are two d-dimensional vectors, which
are only partially specified here, and T is the target. We assume that the attribute
subset S ⊆ {1, ..., d} is specified in both vectors. Partial attribute vectors are used
here because the queries are usually defined only on a small subset of the attributes,
as specified by the user. Then the similarity function f(T, X) between the two
vectors is:

f(T, X) = (Σi∈S wi · Sim(ti, xi)) / (Σi∈S wi)
where Sim(ti, xi) is the similarity between the values ti and xi, and wi is the
weight of the ith attribute, which regulates the relative importance of that attribute.
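The weighted similarity function can be transcribed almost directly. This is a minimal sketch; the per-attribute `sim_num()` used here (one minus a normalized absolute difference) is only one plausible choice, not one prescribed by the text:

```python
def similarity(target, case, weights, sim):
    """f(T, X): weighted average of per-attribute similarities over the
    subset S of attributes specified in both partial vectors."""
    shared = [i for i in target if i in case]       # the subset S
    num = sum(weights[i] * sim(target[i], case[i]) for i in shared)
    den = sum(weights[i] for i in shared)
    return num / den

def sim_num(t, x):
    # Illustrative numeric similarity in [0, 1].
    return 1.0 - abs(t - x) / max(abs(t), abs(x), 1)

target = {"beds": 3, "price": 63}                   # partial query vector T
case = {"beds": 4, "price": 63}                     # candidate item X
score = similarity(target, case, {"beds": 2.0, "price": 1.0}, sim_num)
```

Attributes absent from either partial vector are simply ignored, which mirrors the restriction of the sum to the shared subset S.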
Critiquing Methods
The basic idea behind the use of critiques is that in a lot of cases, users often do
not know exactly how to state their query initially. In a complex domain, it may
become even more difficult for them to translate their requirements in a semantically
meaningful way so that they match the attribute values of the products. So
after seeing the results of a query, a user may understand how to phrase the query
differently.
Once the users have received the results, there is a feedback mechanism using
critiques. Although most interfaces support critiquing the most similar
matching item, a user can also critique any item from the list of retrieved items.
Here the users request changes to one or more of the attributes of an item which
they like. In the context of the house-buying example in Table 6.1, a user may be
interested in a specific house but might want it in a different area or with a different
number of bedrooms. So the user can request changes to the features of any one of the
products he/she likes. It may be a directional critique (like “less expensive”) or a
replacement critique (like “different color”). Then the examples which do not meet
the user-specified critiques are eliminated, and a different set of more similar items is
retrieved. If there are multiple critiques, the recent ones are given higher precedence.
Critiques can be of three types: simple, compound, and dynamic.
In a simple critique, only a single change to one of the features of a recommended
item is done by the user. In a compound critique, a user can specify modification
of multiple features in a single cycle. In dynamic critiquing, data mining is applied
to the retrieved results to find the most effective directions of exploration to present
to the user. So dynamic critiques are basically compound critiques,
as they almost always present combinations of changes to the
user, with the major difference being that only a subset of the most relevant critiques
is shown, on the basis of the recently retrieved results.
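A directional critique can be sketched as a simple pruning step over the retrieved items. The items and the attribute below are again hypothetical:

```python
def directional_critique(items, anchor, attr, direction):
    """Keep only items whose value on attr is below/above the anchor item's."""
    if direction == "less":
        return [it for it in items if it[attr] < anchor[attr]]
    return [it for it in items if it[attr] > anchor[attr]]

houses = [{"id": 1, "price": 63}, {"id": 2, "price": 90}, {"id": 3, "price": 75}]
anchor = houses[2]          # the user likes house 3 but wants "less expensive"
cheaper = directional_critique(houses, anchor, "price", "less")
```

A compound critique would simply chain several such pruning steps within a single cycle.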
As already explained in the earlier chapters, different systems have different sources
of data and different advantages and disadvantages. While knowledge-based systems
need explicit user specifications, content-based and collaborative filtering are based
on past ratings and preferences. So knowledge-based systems address the cold start
problem in a much better way. But all the models are restrictive when there are
multiple sources of data available. If different types of recommender systems are
used along with different data sources, then the predictions may be more robust.
This led to the design of Hybrid recommender systems. There are primarily three
ways to create such hybrid systems:
• Ensemble Design—Here, the results from various algorithms are combined into a
single and more robust output. For example, the ratings of content-based and
collaborative methods might be combined into a single output. But there are
differences in the ways the methods are combined (Ma et al. 2009; Yu et al. 2003).
• Monolithic Design—In this system, an integrated recommendation engine is
created using various data types. In some cases, the existing CF or CBF methods
have to be modified to be fitted to the overall approach, although the two methods
are different from each other. It also integrates the data sources very tightly.
• Mixed System—Like the ensemble method, these systems use multiple recom-
mender algorithms, but the difference is that the products that are suggested
by the different systems are presented all together and beside each other. For
example, the list of programs on television for an entire day is seen as a whole
with multiple suggestions. So it is basically the combination of the items that creates
the recommendation.
So hybrid systems are used in a broader context and while all ensemble systems are
hybrid systems, the reverse may not always be true. Figure 6.4 shows the taxonomy
of the hybrid systems.
Hybrid recommender systems (Tang et al. 2003; Tran and Cohen 2000; Satten
2005) can be classified as follows:
• Weighted—In the weighted system, the scores from various recommender systems
are combined into a single composite score by using a weighted aggregate of the
scores of the separate components of the ensemble. The weights of the components
can be decided either by heuristics or by formal statistical models.
• Switching—In this method the algorithm switches among different types of
recommender systems, depending on the requirements of the system at the time.
For example, a knowledge-based recommender system may be used in the early
phases to combat the cold start problem. Then, as the system gradually gathers
more ratings, it might switch to another approach like a CF or CB algorithm,
whichever is more suited at that point in time.
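The weighted design described above can be sketched as follows. The component scores and the weights are invented purely for illustration:

```python
def weighted_hybrid(score_lists, weights):
    """Combine per-item scores from several recommenders into one
    composite score using a fixed weighted aggregate."""
    items = score_lists[0].keys()
    return {i: sum(w * s[i] for w, s in zip(weights, score_lists))
            for i in items}

content_scores = {"A": 0.9, "B": 0.2}    # e.g., from a content-based model
collab_scores = {"A": 0.4, "B": 0.8}     # e.g., from collaborative filtering
combined = weighted_hybrid([content_scores, collab_scores], [0.7, 0.3])
```

In practice the weights would be tuned by heuristics or a formal statistical model, as the text notes.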
6.5 Ensemble Methods from the Classification Perspective
Ensemble methods are applied in the area of data classification to increase the
robustness of the learning algorithms, but they can also be applied to recommender
systems. Collaborative filtering and classification differ in that in collaborative
filtering the class variables are not clearly defined and there may be missing entries
in any column or row. The missing entries also mean that training and test instances
are not clearly defined. One may ask whether the bias-variance theory for
classification is also applicable to recommender systems. It has been observed that
combining different collaborative recommender systems gives a higher degree of
accuracy in the results. The reason for this is that
the bias-variance theory, which was designed for classification, is also applicable for
collaborative filtering areas. Therefore most of the traditional ensemble techniques
from classification may be generalized to collaborative filtering. The only problem
is that when missing entries can occur in any row or column of the data, it may be
algorithmically challenging to generalize an ensemble algorithm from classification
to collaborative filtering. Let us consider a classification or regression model where
we need to predict a specific field. Then the classifier error for the prediction of the
dependent variable can be broken into three parts, namely the bias, variance, and the
noise:
Bias—Every classifier makes its own modeling assumptions regarding the
type of decision boundary between the classes. If a classifier has a high
bias, it will make consistently incorrect predictions for test
instances near the incorrectly modeled decision boundary, and this holds true even
if the training data samples are different during the learning process.
Variance—If there are random variations in the selection of the training data,
then it will lead to dissimilar models. In that case there may be inconsistencies in the
prediction of the dependent variable for a test case, for different selections of training
data sets. The variance of a model is also closely linked to overfitting. If a classifier
has a tendency to overfit, then the predictions it makes will also be inconsistent for
the same test case but with different sets of training data.
Noise—The intrinsic errors in the labeling of the target class are called noise.
Not much can be done to correct it, because noise is an intrinsic property of the
data quality. So ensemble analysis normally focuses on the reduction of bias
and variance.
The expected mean squared error (MSE) of a classifier can accordingly be decomposed as:

E[MSE] = Bias² + Variance + Noise
The total error of a classifier can be reduced by reducing either the bias or the
variance. Classification and collaborative filtering differ in that missing entries
may be present in any column, as against only in the class variable. But the
bias-variance result remains valid when applied to the prediction of a particular
column, irrespective of whether the columns are incompletely specified. So the
rules of ensemble analysis hold good for collaborative filtering.
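The decomposition can be checked numerically with a deliberately biased estimator, here a constant predictor that ignores the input entirely; all numbers below are synthetic:

```python
import random

random.seed(0)
true_f = lambda x: 2.0 * x          # the unknown target function
noise_sd = 0.5                      # intrinsic label noise
x_test, trials, preds = 0.5, 2000, []
for _ in range(trials):
    # Draw a fresh training sample and fit the "constant" model: predict
    # the mean training label regardless of x (high bias, low variance).
    xs = [random.uniform(0.0, 2.0) for _ in range(20)]
    ys = [true_f(x) + random.gauss(0.0, noise_sd) for x in xs]
    preds.append(sum(ys) / len(ys))
mean_pred = sum(preds) / trials
bias_sq = (mean_pred - true_f(x_test)) ** 2
variance = sum((p - mean_pred) ** 2 for p in preds) / trials
expected_error = bias_sq + variance + noise_sd ** 2   # the decomposition
```

At x_test = 0.5 the constant predictor's expected output is about 2.0 while the true value is 1.0, so the squared bias term dominates the total error.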
6.6 Summary
In this chapter we have seen that for items that are bought infrequently, or are
highly customized or of high value, the methods in the previous chapters were not
very helpful. We have analyzed the types of applications
that need user interactions for the recommender systems to arrive at their suggestions
instead of using previous user ratings or finding users with similar preferences. An
overview of knowledge-based recommender systems and their ideal application areas
are discussed along with the various types of hybrid and ensemble-based systems.
Since recommender systems rely on huge amounts of data to be able to make
more accurate predictions, a successful system should be able to efficiently handle
huge volumes and varieties of data, or what we now call Big Data. The next chapter
gives an overview of the big data behind such recommendation systems, its roles,
and challenges.
Think Tank
1. What are knowledge-based recommender systems?
2. What are ensemble-based systems?
3. What are the different types of hybrid recommender systems?
4. What is the relation between variance, bias, and noise?
References
Aggarwal C, Procopiuc C, Yu PS (2001) Finding localized associations in market basket data. IEEE
Trans Knowl Data Eng 14(1):51–62
Bobadilla J, Ortega F, Hernando A, Gutierrez A (2013) Recommender systems survey. Knowl-Based
Syst 46:109–132
Bohnert F, Zukerman I, Berkovsky S, Baldwin T, Sonenberg L (2008) Using interest and transition
models to predict visitor locations in museums. AI Commun 2(2):195–202
Boldi P, Bonchi F, Castillo C, Donato D, Gionis A, Vigna S (2008) The queryflow graph: model and
applications. In: ACM Conference on Information and Knowledge Management, pp 609–618
Bridge D, Goker M, McGinty L, Smyth B (2005) Case-based recommender systems. Knowl Eng
Rev 20(3):315–320
Burke R (2000) Knowledge-based recommender systems. In: Encyclopedia of library and
information systems, pp 175–186
Burke R, Hammond K, Young B (1996) Knowledge-based navigation of complex information
spaces. In: National Conference on Artificial Intelligence, pp 462–468
Felfernig A, Teppan E, Gula B (2007) Knowledge-based recommender technologies for marketing
and sales. Int J Pattern Recogn Artif Intell 21(02):333–354
Ma H, Lyu M, King I (2009) Learning to recommend with social trust ensemble. In: ACM SIGIR
Conference, pp 203–210
Tang T, Winoto P, Chan KCC (2003) On the temporal analysis for improved hybrid recommenda-
tions. In: International Conference on Web Intelligence, pp 214–220
Tran T, Cohen R (2000) Hybrid recommender systems for electronic commerce. In: Knowledge-
Based Electronic Markets, Papers from the AAAI Workshop, Technical Report WS-00-04, pp
73–83
van Satten M (2005) Supporting people in finding information: hybrid recommender systems and
goal-based structuring. Ph.D. Thesis, Telemetica Instituut, University of Twente
Yu K, Schwaighofer A, Tresp V, Ma W-Y, Zhang H (2003) Collaborative ensemble learning:
combining collaborative and content-based filtering via hierarchical Bayes. In: Conference
on Uncertainty in Artificial Intelligence, pp 616–623
Chapter 7
Big Data Behind Recommender Systems
Abstract For recommender systems to make accurate predictions about the prefer-
ences of users, they rely on varieties of information and feedback from the customers.
So this naturally involves dealing with and processing huge volumes of data every
day. So here we introduce the concept of big data and why it is so important. We also
see how recommender systems can benefit from using big data, what types of data
are stored, and what the challenges are. Finally, some examples show exactly how it
is used by recommender systems, taking the example of Twitter.
7.1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 69
P. Kar et al., Recommender Systems: Algorithms and their Applications, Transactions on
Computer Systems and Networks, https://doi.org/10.1007/978-981-97-0538-2_7
Although the concept of big data is relatively new, large data sets originated way back
in the 1960s and 70s, with the setting up of data centers and relational databases. It was
around 2005 that people started to realize the huge volumes of data that were being
generated through Facebook, YouTube, and other online services. It was around this
time that Hadoop (Dobrea and Xhafa 2014; Sheikh 2013) was developed which was
an open source framework for storing and analyzing big data sets. Another similar
framework was NoSQL. With the advent of such open source frameworks, it became
easier to store and work with big data (Philip Chen and Zhang 2014; Chardonnens
2013). But it is not only humans who are generating such huge amounts of data. After
the rise of the Internet of Things (IoT), there were many objects and devices which
were connected to the Internet which were gathering data on customer usage patterns
and product performances. The introduction of machine learning has brought in even
more volumes of data.
Big data is defined as data that contains greater variety, arriving in increasing
volumes and with more velocity. These are also known as the three Vs in big data. Big
data consists of larger and more complex data sets, especially from new data sources.
Since these data sets are so voluminous, therefore the traditional data processing
software can’t manage them. But if these massive volumes of data can be harnessed
in some way, they could be hugely successful in addressing business problems that
we could not handle before.
Big data is characterized by the 3Vs, volume, velocity, and variety, as mentioned
in Fig. 7.1. The first parameter, volume, implies that there are huge volumes of data
involved. Most of the data is low-density and unstructured. A lot of this data is
also of unknown value, and for many organizations it can run into hundreds of
petabytes.
The velocity of the data means the rate at which the data is received and, in a lot
of cases, how quickly it can be processed and acted upon. Usually the highest-velocity
data streams directly into memory instead of being written to disk.
Recommender systems are one of the most common and easily understandable appli-
cations of big data. They are based on well-defined and logical phases of data collec-
tion, ratings, and filtering. We can find applications of recommendation systems in
eCommerce, entertainment, gaming, education, advertising, home decor, and some
other industries. The applications differ based on the types of recommendation
services they provide but the central goal of all of them is to personalize content
and offers.
In order to achieve this, machine learning (ML) engineers have designed recom-
mender systems that redefine the ways customers search for products or services or
learn about new opportunities and goods they may be interested in. However, the
driving force behind all these systems is big data. There are at present numerous
types of recommender systems designed to offer a variety of personalized content,
but the work of all of them is based on voluminous datasets.
As already mentioned in Chap. 3, there are three major types of recommender
systems:
• Content-based filtering
• Collaborative filtering
• Hybrid recommender systems
All of these are reliant on user behavior data, including activities, preferences, and
likes, or can take into account the description of the items that users prefer, or both.
It has been widely observed that incorporating recommender systems into
businesses increases the number of items sold, sells more diverse items,
increases user satisfaction, and helps service providers gain a better
understanding of what the user wants. It also helps users make customized
and informed decisions, thereby increasing customer loyalty and retention. However,
even the most advanced of these recommender systems can be rendered irrelevant
and ineffective without the presence of big data. A recommendation system can’t do
its work if it is not supplied with sufficient data for the algorithms it uses, because
such systems rely heavily on information about past purchases, browsing history,
and feedback from a huge number of customers. Such huge volumes of data can only
be provided with the help of big data.
In the next section, we discuss the diverse types of data that are required by a
majority of recommendation systems.
7.5 Challenges
There are, however, several types of challenges (Río and López 2014; Alejandro Zarate
Santovena 2013; Vemuganti 2013) that organizations encounter in using this big data
to their advantage. One of the primary concerns is protecting the user's privacy.
Many privacy issues have been raised when companies collect and process a user's
personal data, because the identity of the user could be traced from the data
collected. Some other major challenges are:
Data Capture and Storage—The size of data sets is growing every day,
because the sources of data are now manifold, e.g., mobile devices,
sensors, remote sensing, software logs, cameras, microphones, RFIDs, and many
more. Around 2.5 quintillion bytes of data are created every day, and this is increasing
exponentially. Moreover, the data is mostly unstructured and comes from a variety
of sources. Collecting all this data is an expensive process, and in many cases
the data simply has to be deleted, mainly because of insufficient space to store it.
To store and analyze the data properly, new techniques and frameworks have to be
designed, as the existing relational databases, etc. are not adequate to handle them
anymore.
Data Transmission—Cloud data storage is the most popular way of storing huge
volumes of data. However network bandwidth capacity is a major bottleneck when
it comes to accessing this data from the cloud, especially when the volume of data is
large. Moreover, storing the data in the clouds also poses several security risks which
also need to be dealt with.
Data Curation—Data curation is aimed at data discovery and retrieval, data quality
assurance, value addition, reuse, and preservation over time. It involves authentica-
tion, archiving, management, preservation, retrieval, and representation. However,
the existing database management tools are inadequate for processing this big data.
In addition, the volume of the data will only increase in the future, as more and more
organizations are realizing the benefits of big data in analyzing business trends,
preventing diseases, and combatting crime. So newer technologies are being used to
tackle this.
Data Analysis—In some applications, like navigation, social networks, finance,
biomedicine, and intelligent transport systems, the time taken to analyze the data
should be as short as possible, because results are needed almost in real time. But
when dealing with huge volumes of data, reducing the latency in the analysis is a big
challenge.
Data Visualization—The primary aim of data visualization is to represent knowl-
edge more intuitively and effectively through the use of various types of graphs.
Presenting the data in a schematic form is much more intuitive and easier to compre-
hend. However, for big data applications, it is difficult to perform data visualization,
because of its large size and high dimension. The existing big data visualization tools
perform poorly in terms of functionalities, scalability, and response time.
Sparsity—Data sparsity is a major challenge for recommender systems. In such
a situation, the number of items yet to be rated is much larger than the number of
items the user has already rated. Thus only a few entries of the user-item matrix
are filled with values, which makes the matrix sparse and degrades the
recommendations. One possible solution to this problem is to provide suggestions
to a user by checking his or her profile and analyzing similarities, so that if two
users share a common interest in a product, then one user's recommendation can be
given to the other. The sparsity problem is also addressed by the Singular Value
Decomposition (SVD) technique, which reduces the dimensionality of the sparse
rating matrix.
Scalability—With the increase in the numbers of users, products, and ratings,
the scalability issue arises in recommender systems. When the product information
and the number of items increase, and recommender systems are expected to quickly
generate recommendations for customers, the system requires increased scalability.
But the execution of such systems becomes strenuous and expensive. So it is essential
to design an efficient and effective data model that can adapt itself to a growing
dataset. One possible solution to the scalability issue is to perform the computation
on multiple machines in parallel using distributed algorithms.
Overspecialization—Due to the overspecialization issue of recommender systems,
a highly rated item may be suggested which has already been purchased or experienced
by the user. As this does not match the user's preferences, the user loses interest in
the system. Neighborhood collaborative filtering, introducing randomness using genetic
algorithms, or removing overly similar items have been proposed to handle
overspecialization.
Serendipity—Every recommender system should achieve the crucial objective of
serendipity, which focuses on achieving user trust and loyalty. Recommender
systems should provide suggestions that are significantly novel and relevant in
contrast to the user's previous ratings of items. It is challenging to capture the idea
of serendipity completely, due to its subjective nature, and it is rarely seen in real-life
scenarios. Solutions like re-ranking the accuracy-based results have been introduced
to achieve serendipity.
Coverage—As the types of items for cataloging increase, the systems need to
have a high coverage but maintain low latency.
Diversity—The recommendation engine should be able to give its users a variety
of recommendations.
Adaptability—The system should be able to adapt quickly to the continually
changing world of content.
7.6 An Example of the Role of Big Data in Twitter
Data sparsity is a major issue in recommender systems. Here we discuss a solution
to deal with the sparsity issue in recommender systems with an example.
Given a user-item rating matrix M of size k × l and rank r, its singular value decomposition is

M = U · S · V^T (7.1)
where
U is an orthogonal matrix of size k × r that holds the left singular vectors of M in its
columns. This means the r columns of U hold eigenvectors of the r nonzero eigenvalues
of MM^T.
S is a diagonal matrix of size r × r that holds the singular values of M in its
diagonal entries in decreasing order, such that s1 ≥ s2 ≥ ... ≥ sr; these are the
nonnegative square roots of the eigenvalues of MM^T.
V is an orthogonal matrix of size l × r that holds the right singular vectors of
M in its columns, which means its r columns hold eigenvectors of the r nonzero
eigenvalues of M^T M.
Additionally, S can be reduced by keeping only the largest n singular values, thus
obtaining S_n of size n × n. Similarly, U and V can be reduced by keeping the
first n singular vectors and discarding the rest, resulting in U_n of size k × n and V_n
of size l × n. As a result, M_n = U_n · S_n · V_n^T and M_n ≈ M, where M_n is the closest
rank-n approximation to M.
R_n = U_n · S_n · V_n^T

where R_n is of size k × l, U_n is k × n, S_n is n × n, and V_n^T is n × l.
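The rank-n truncation can be carried out directly with a numerical library; the small rating matrix below is invented, with zeros standing in for imputed missing ratings:

```python
import numpy as np

# Hypothetical k x l user-item rating matrix.
M = np.array([[5.0, 3.0, 0.0, 1.0],
              [4.0, 0.0, 0.0, 1.0],
              [1.0, 1.0, 0.0, 5.0],
              [1.0, 0.0, 0.0, 4.0]])
U, s, Vt = np.linalg.svd(M, full_matrices=False)   # M = U diag(s) Vt
n = 2                                 # keep the 2 largest singular values
M_n = U[:, :n] @ np.diag(s[:n]) @ Vt[:n, :]        # closest rank-n approximation
```

M_n has the same shape as M but rank at most n, discarding the smallest singular values as the text describes.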
7.7 Singular Value Decomposition-Based Recommender Systems
The relationship between users and items, and the similarity between them, can be
induced from some lower-dimensional structure in the data by applying SVD in the
recommender system. As an example, the rating provided by a certain user to a
particular item, such as a movie, depends on some implicit factors, like the preferences
of that user across different movie genres (Reeve 2013; Popescu and Etzioni 2007).
The SVD-based
recommender system considers users and items as unknown feature vectors to be
learnt by applying SVD to user–item matrix and breaking it down into three smaller
matrices: U, V, and S. This is done by constructing the sparse user-item matrix from
the input data set and then imputing it by some values to fill the missing ratings and
reduce its sparseness before computing its SVD. There are several imputation tech-
niques: impute by Zero, impute each column by its Item Average, impute each row
by its User Average, or impute each missing cell by the mean value of User Average
and item average. This is resulted in a filled matrix M fl that could be normalized by
subtracting the average rating of each user from its corresponding row resulting in
M nr , which is useful in offsetting the difference in rating scale between the different
users. Here, SVD could be applied to M nr to compute U n (this holds users’ features),
S n (holds the strength of the hidden features) and V n (holds items’ features) such
that their inner product will give the closest rank-k approximation to M nr . The SVD
removes noise from the user-item relationship by discarding the small singular values
from S through lower-rank approximation of the user-item matrix that is better than
the original technique. Hence, the dot product of the corresponding feature vectors
are computed to predict the reference of user i to item j. This means, compute the
dot product of the ith row of (U n ·S n ) and jth column of VkT and add back the user
average rating that was subtracted while normalizing M fl . This is presented as:
where P_ij is the predicted rating for user i and item j, r_a is the user's average rating,
(V_n^T)_{·,j} is the jth column of V_n^T, and (U_n · S_n)_{i,·} is the ith row of the matrix
resulting from multiplying U_n and S_n. In fact, the dot product of two vectors measures
the cosine similarity between them (up to their norms). Thus, the above formula can be
interpreted as finding the similarity between the user-i and item-j feature vectors and
then adding the user's average rating to predict the missing rating P_ij. Algorithm 7.1
presents this technique of movie recommendation by a large-scale SVD-based Recommender
System.
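A minimal sketch of the pipeline just described, using item-average imputation and NumPy; the helper name `svd_predict` and the matrix values are hypothetical, not the book's Algorithm 7.1:

```python
import numpy as np

def svd_predict(M, n):
    """Sketch of the SVD prediction pipeline: impute, normalize, decompose, predict.

    M: user-item matrix with np.nan marking missing ratings.
    n: rank of the truncated SVD.
    """
    # 1. Impute missing cells (here: by each item's column average).
    item_avg = np.nanmean(M, axis=0)
    M_fl = np.where(np.isnan(M), item_avg, M)

    # 2. Normalize: subtract each user's average rating from their row.
    user_avg = M_fl.mean(axis=1, keepdims=True)
    M_nr = M_fl - user_avg

    # 3. Truncated SVD of the normalized matrix.
    U, s, Vt = np.linalg.svd(M_nr, full_matrices=False)
    Un, Sn, Vnt = U[:, :n], np.diag(s[:n]), Vt[:n, :]

    # 4. Predict: dot product of user features and item features,
    #    plus the user average subtracted during normalization.
    return user_avg + (Un @ Sn) @ Vnt

# Hypothetical 4x3 matrix; np.nan marks unrated items.
M = np.array([[5.0, 3.0, np.nan],
              [4.0, np.nan, 1.0],
              [1.0, 1.0, 5.0],
              [np.nan, 1.0, 4.0]])
P = svd_predict(M, n=2)  # P[i, j] is the predicted rating for user i, item j
```

Other imputation choices from the text (zero, user average, or the mean of user and item averages) only change step 1.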
7.8 Summary
Thus, in this chapter, we have seen what Big Data is and what its features are. We have
also seen the different types of big data and the challenges involved, and how big data
plays a very important role in the development of a successful and relevant
recommendation system. In the next chapter, we discuss how to build trust-centric
and attack-resistant recommendation systems.
Think Tank
References
Chardonnens T (2013) Big data analytics on high velocity streams: specific use cases with storm.
Software Engineering Group, Department of Informatics, University of Fribourg, Switzerland
del Río S, López V, Benítez JM, Herrera F (2014) On the use of MapReduce for imbalanced big
data using Random Forest. Research Center on Information and Communications Technology,
University of Granada, Granada
Dobrea C, Xhafa F (2014) Intelligent services for Big Data science. Future Generat Comput Syst
37:267–281
Philip Chen CL, Zhang C-Y (2014) Data-intensive applications, challenges, techniques and
technologies: a survey on Big Data. Inform Comput Sci Intell Syst Appl 275:314–347
Popescu A-M, Etzioni O (2007) Extracting product features and opinions from reviews. In: Natural
Language Processing and Text Mining. Springer, London
Reeve A (2013) Big data integration: data integration best practice techniques and technologies.
Morgan Kaufmann, pp 141–156
Santovena AZ (2013) Big data: evolution, components, challenges and opportunities. Ms thesis,
Massachusetts Institute of Technology, Sloan School of Management, Cambridge
Sheikh N (2013) Big data, Hadoop, and cloud computing, implementing analytics. Morgan
Kaufmann
Tran VT (2013) Scalable data management systems for Big Data. Ph.D. thesis, École Normale
Supérieure de Cachan, Antenne de Bretagne, INRIA, Rennes
Vemuganti G (2013) Challenges and opportunities. Infosys Labs Briefings 11(1)
Xhafa F, Barolli L (2014) Semantics, intelligent processing and services for big data. Int J Grid
Comput Sci 37:201–202
Chapter 8
Trust-Centric and Attack-Resistant
Recommender System
Abstract In the information age where social media and online platforms are
popular, recommendation systems are being used more and more widely. As a result,
the evaluation of recommendation systems has become more rigorous, and a better-
quality recommendation system can largely improve the competitiveness of products.
This chapter proposes three different ways to improve the performance of recom-
mendation systems based on both attack and trust: usage of the Simulated Annealing
Algorithm in the score propagation model, evaluation and usage of users’ activeness,
and some administrative measures.
8.1 Introduction
The recent several decades have witnessed the prosperous development of informa-
tion and communication technologies, which contribute to the interconnection of
the world. As a consequence, the volume of daily online activity is growing exponentially
and a large amount of data is generated every second. Meanwhile,
heterogeneous online items including commodities, short videos, music, news, etc.,
are emerging at an unprecedented speed. When all kinds of data about the users and
the items are fed into a system, the system will be confronted with the problem of
information overload when it is trying to recommend items to the users. To mitigate
this problem, many recommender systems are being developed. The two most repre-
sentative recommender systems are the collaborative filtering recommender system
and the content-based filtering recommender system (Sridevi et al. 2016). Besides
these, knowledge-based and hybrid recommendation techniques are also frequently
applied. However, some recommender systems may be vulnerable to attacks, espe-
cially collaborative filtering recommender systems. Therefore, some techniques are
required to improve the efficiency of the recommender systems in the presence of
attacks.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 81
P. Kar et al., Recommender Systems: Algorithms and their Applications, Transactions on
Computer Systems and Networks, https://doi.org/10.1007/978-981-97-0538-2_8
Attack and trust are two significant concepts in recommender systems. With the
advent of the information age, the amount of data is gradually increasing, and people
need to spend a lot of time filtering useful information, which is called Informa-
tion Overload. Although collaborative filtering alleviates the information overload
phenomenon to a large extent, its own openness makes it vulnerable to attacks leading
to inaccurate recommendation results. Therefore, it becomes critical to accurately
identify the attacker and determine the level of trust in the target user.
Behind the explosion of recommendation systems lies the current situation of
information overload: people need algorithms to help them find the right content.
The ability to accurately identify user needs is at the core of evaluating
a recommendation system, and one quantitative way to assess this is the
robustness of the system. Flexibly adjusting the trust ratings within the recommender
system's network and identifying attackers in a timely manner both increase robustness.
The definition of trust falls into different categories, and in many cases its exact
definition is quite ambiguous. The quantitative study of trust has proliferated since
1992, when Marsh pioneered a formal mathematical model of trust and introduced the
concept of computational trust (Marsh 1994). Golbeck and Hendler (2005) defined
trust in computer science as "a commitment to believe in the smooth running of the
future actions of another entity". In other words, entity A trusts entity B if B
satisfies the requirements or passes the tasks of A (Golbeck 2006).
In social networks, trust has the following properties (Golbeck and Hendler 2005):
• Asymmetric: if user A trusts user B, this doesn’t mean that B trusts A.
• Not distributive: if user A trusts users B and C as a group, this doesn't mean that
A trusts B and C individually.
• Not generic: if user A trusts user B in field X, this doesn't mean that A trusts B in
field Y.
• Transitive (by assumption): trust can be considered transitive in the recommender
system under certain constraints.

8.2 Literature Review
Trust-based recommender system models can be classified into two categories based
on whether they use a trust-context approach. Context refers to the user's domain of
expertise, and contextual approaches consider trust relationships between users with
respect to a user's skills in a given context (Selmi et al. 2016).
• Approach with trust contextualization
1. Model of Abdul-Rahman
This model defines trust as a subjective measure or conviction about
an individual's experience in a given context. This belief takes one of
four values based on the user's opinion: very bad, bad, good, and very good.
For Abdul-Rahman, the trust between two users is determined only by their
direct interaction, since transferability is not considered (Abdul-Rahman and Hailes
1998). Figure 8.1 shows the meanings and descriptions of the different values
defined by the model of Abdul-Rahman.
2. Model of Charif Alchiekh Haydar
In this in-context model, User X decides whether or not to collaborate with
the target user based on the target user’s reputation. The keywords collected
from the question to which the target offers the proper response determine the
user’s reputation score. As a result, the user profile is built on keywords, and
the user's reputation score isn't fixed. A link is generated between a user and
each term associated with a query when that user offers an appropriate answer. The
model aims to order the list of users after establishing the reputation scores
of the different users answering a question; the user with the greatest
reputation score delivers the answer, which is the most likely to be reliable
under this forecast (Haydar et al. 2013).
tr(X, Z) = (d − n + 1) / d   if n ≤ d;   tr(X, Z) = 0   if n > d   (8.1)
2. TidalTrust
This model is used in social networks. Its purpose is to recommend movies
to users. In this model, a user can evaluate his trust in another user using a
discrete value in [1, 10]; in addition, each user can rate a movie with
1–5 stars. The trust network between users is represented by a directed graph
(Golbeck and Hendler 2005).
The recommendation score r_sm calculated by a source s for a movie m is computed
using Eq. 8.2, where S is the set of users i whose ratings r_im of movie m are
weighted by the trust t_si that s places in i:

r_sm = ( Σ_{i∈S} t_si · r_im ) / ( Σ_{i∈S} t_si )   (8.2)
3. Model of O’Donovan
This model is mainly based on collaborative filtering. In this model,
consumers and producers refer to users and their neighborhoods, respectively.
Its main principle is to add a new layer of trust to collaborative filtering by
changing the keywords used. There are three methods of adding a trust
layer: weighting, filtering, and combination (Haydar 2014).
The first method replaces the similarity in collaborative filtering
with the value w(c, p, i), where c stands for a consumer, p for a producer, and i
for an item. It is formalized using Eq. 8.3:
w(c, p, i) = ( 2 × similarity(c, p) × reputation(p, i) ) / ( similarity(c, p) + reputation(p, i) )   (8.3)
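Equation 8.3 is a harmonic mean, which can be written as a one-line helper (the function name is illustrative, not from the cited work):

```python
def odonovan_weight(similarity, reputation):
    """Harmonic mean of similarity and reputation (Eq. 8.3).

    Replaces plain similarity in user-based collaborative filtering;
    both inputs are assumed to be positive (e.g. in (0, 1]).
    """
    return 2 * similarity * reputation / (similarity + reputation)
```

Because the harmonic mean is pulled toward the smaller operand, a producer contributes strongly only when both its similarity and its reputation are high; for example, `odonovan_weight(0.9, 0.1)` is only 0.18.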
4. Model of Simon
This model is a social recommendation system based on social relationships
between users. In this system, users only communicate with trusted users.
Trust is therefore an explicit value. The system aims to predict the missing
portions between users and objects, which are called scores (Meyffret et al.
2012a, b).
The system evaluates fitness using three statistical measures: Coverage, Root Mean
Square Error (RMSE), and F-Measure. Equations 8.4, 8.5, and 8.6 show how RMSE,
Precision, and F-Measure are calculated. The coverage is the percentage of anticipated
ratings out of all possible ratings. It doesn’t say if the ratings were predicted accu-
rately, but it does highlight how many forecasts an algorithm can make. The RMSE
stands for the prediction’s average error. It’s essentially the standard deviation of the
error without the mean. The RMSE is a measure of how accurate a forecast is. The
lower the RMSE, the more accurate the prediction (Meyffret et al. 2012a, b).
The paper uses the classical leave-one-out method for the evaluation of the dataset,
i.e., for the whole dataset, removing one of the ratings at a time and trying to predict
it using other data.
RMSE = sqrt( (1/N) · Σ_{n=1}^{N} (p_n − r_n)^2 )   (8.4)

Precision = 1 − RMSE / range   (8.5)

F1 = ( 2 × Precision × Coverage ) / ( Precision + Coverage )   (8.6)
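The three measures translate directly into code; the predicted and true ratings below are hypothetical leave-one-out values on a 1-5 scale:

```python
import math

def rmse(predicted, actual):
    """Root Mean Square Error (Eq. 8.4)."""
    n = len(predicted)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, actual)) / n)

def precision_from_rmse(rmse_value, rating_range):
    """Precision derived from RMSE (Eq. 8.5); rating_range is r_max - r_min."""
    return 1 - rmse_value / rating_range

def f1(precision, coverage):
    """Harmonic mean of precision and coverage (Eq. 8.6)."""
    return 2 * precision * coverage / (precision + coverage)

# Hypothetical leave-one-out predictions on a 1-5 rating scale.
pred, true = [4.2, 3.1, 4.9], [4.0, 3.0, 5.0]
e = rmse(pred, true)
p = precision_from_rmse(e, rating_range=4)  # range of a 1-5 scale is 4
score = f1(p, coverage=0.8)
```

Note that coverage enters only through Eq. 8.6: a model can reach a low RMSE while predicting few ratings, and the F1 score penalizes exactly that.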
O’Mahony et al. (2005) divided the task of building attack profiles into two sub-tasks.
The first concerns the selection of items that together with the target item constitute
the profile and the second relates to the ratings given to the selected items. This group
proposed two attack strategies which are popular attack and probe attack. Popular
attack selects popular items which are prevalently liked or disliked by the public. By
choosing the popular items to build the attack profiles, the cost of an attack can be
minimized since such a profile is highly possible to be located in the neighborhood of
many genuine users. A probe attack involves probing the recommender system and
selecting the items based on the recommendations provided by the system. This type
of attack requires less knowledge about the system and is easier to implement. The
attacker only needs to select a small number of items as an initial seed to derive
recommendations from the system, and attack profiles can then be built progressively
from those recommendations. Profiles created by probe attacks tend to have high
similarity with genuine profiles and are therefore difficult to distinguish from genuine users.
Burke et al. (2006) investigated the attack strategies more meticulously. They
divided the attack profile into four categories, namely selected items, filler items,
unrated items, and the target item. Different attack models treat these items in
different ways. These attack models include random attack, average attack, band-
wagon attack, segment attack, and love/hate attack. The concrete rating methods
for some of these attack models are provided in Cao et al. (2013). Aggarwal (2016)
explained these attacks in more detail; their attack models involve null items, filler
items, and the target item, which differ slightly from those of the above two research
groups. The following are concrete explanations of these different types of
attacks. Table 8.1 describes the generation methods for filler items for these attack
models.
• Random Attack—Filler items are selected randomly and ratings complying with
a probability distribution around the global mean are assigned to the filler items.
This type of attack requires the knowledge of the global mean which is the mean
value of all ratings across all items.
• Average Attack—Filler items are selected randomly and for each specific selected
item, the average value of ratings on this item given by other users are assigned
in the attack profile.
• Bandwagon Attack—Filler items consist of two parts. The first part consists of
popular items which are widely liked; these are assigned the maximum allowed
rating value. The second part incorporates randomly selected items, which are rated
randomly. This type of attack does not require knowledge of the rating matrix, but
the attacker needs to know which items are popularly liked by the users.
Table 8.1 Filler-item selection and rating rules for different attack models

Attack model               Generation method for filler items
Random attack              Global mean rating → randomly selected items
Average attack             Average rating value of the specific item → randomly selected items
Bandwagon attack           r_max → popularly liked items; random rating → randomly selected items
Popular attack (push)      r_min → widely disliked items; r_min + 1 → widely liked items
Popular attack (nuke)      r_max → widely liked items; r_max − 1 → widely disliked items
Love/Hate attack           r_min → nuked item; r_max → other items
Reverse bandwagon attack   r_min → widely disliked items
Probe attack               r_max → items recommended by the system based on the seed profile
Segment attack             r_max → the pushed item and the items of the same segment (category)
Attack detection aims to distinguish between attack profiles and genuine profiles.
Therefore, attack detection can be treated as a classification problem. Classification
features can be extracted from the rating matrix, such as Rating Deviation from
Mean Agreement (RDMA) and Weighted Deviation from Mean Agreement (WDMA)
(Burke et al. 2006). Typical solutions for a classification problem include Bayes
learning, K-nearest neighbors (KNN), decision trees, and support vector machines (SVM).
Williams et al. (2007) compared the performances of KNN, C4.5, and SVM for
attack detection and concluded that SVM adds significant robustness to the recom-
mender system. Cao et al. (2013) proposed Semi-SAD which is a semi-supervised
learning-based shilling attack detection algorithm. This algorithm first trains a naive
Bayes classifier on a small set of labeled users and then incorporates unlabeled users
into the classifier. Two more algorithms were used for performance comparison. The
first was a supervised naive Bayes classifier called Bayes-SAD, and the second was an
unsupervised approach based on a principal component analysis called PCA-SAD.
The result of this research demonstrated that Semi-SAD can better detect various
kinds of shilling attacks than the other two algorithms.
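As a sketch of how such a classification feature is extracted, one common formulation of RDMA averages a user's absolute deviation from each item's mean rating, down-weighted by the item's popularity (the cited papers also use variants such as WDMA; the matrix below is hypothetical):

```python
import numpy as np

def rdma(R):
    """Rating Deviation from Mean Agreement for every user (sketch).

    R: user-item matrix with np.nan for missing ratings. Attack
    profiles, which rate far from consensus, tend to score high.
    """
    item_mean = np.nanmean(R, axis=0)           # average rating per item
    item_count = np.sum(~np.isnan(R), axis=0)   # number of ratings per item
    dev = np.abs(R - item_mean) / item_count    # deviation, weighted by popularity
    return np.nanmean(dev, axis=1)              # averaged over each user's rated items

# Hypothetical matrix: the last profile rates every item against consensus.
R = np.array([[5.0, 1.0, 2.0],
              [4.0, 2.0, 1.0],
              [5.0, 1.0, np.nan],
              [1.0, 5.0, 5.0]])
scores = rdma(R)  # the deviant profile gets the largest RDMA score
```

A classifier such as the naive Bayes or SVM models compared above would consume features like these, not the raw rating rows.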
The score propagation system introduces randomness and can provide novel recommendations
for specific users, while maintaining a low frequency to prevent noise.
However, this randomness cannot continuously converge or diverge after a change
in user preferences. For example, a specific group of fan users may be interested in
only a very small fraction of the content. A more ideal model would let the novelty
of the recommendation results converge gradually as the user interacts with the
system, until it stays at a low frequency; after the user's interests change, it would
gradually diverge again and increase the amount of novel recommended content nonlinearly.
8.3 Challenges in Previous Research
Previous research made many assumptions about the attacks with no appropriate
proof. For example, Aggarwal (2016) claimed that attack profiles tend to have high
self-similarity and therefore a cluster with a smaller radius is considered the attack
cluster. However, as the number of users surges, the aforementioned assumption
may not always hold. If the number of users is large enough, it may be inevitable
that an unexpected number of genuine users fall into the cluster that is
considered the attack cluster, which makes it more difficult to distinguish between
attackers and genuine users. Attack profiles and real profiles can be mixed in the
same cluster, and there is no proof that the proportion of attack profiles in a cluster
with a smaller radius is high.
Another assumption made by many research groups which may not always be
practical is that the attackers are allowed to create abundant accounts to perform
profile injection attacks. This assumption may hold when there is no restriction
on account creation for most of the applications. However, since more and more
applications impose rigorous restrictions on account creation, especially in China,
this assumption will fail in many cases. Real name registration becomes compul-
sory for many applications in China, especially for e-commerce applications, which
means the users must register with their real name and ID number to activate their
account, otherwise they are not allowed to do a transaction or even enter the appli-
cation. In this scenario, profile injection attacks or shilling attacks become impos-
sible, and as a consequence, it is not always necessary to defend the recommender
system against these attacks by detecting and removing fake profiles when such
administrative measures are enforced.
Apart from the impractical assumptions, the evolution of attacks is also making
the attack-resistant mechanisms less effective. While most of the previous research
focused on traditional attacks like random attacks, average attacks, etc., new types of
attacks are appearing. Some genuine users can be utilized by the attacker to perform
the attack. Specifically, the attacker may distribute some money or give some other
benefits to a group of genuine users and hire them to give ratings to the target item.
In this way, these genuine users perform the attacks under the guidance of the real
attacker, and there will be no clear boundary between a fake profile and a genuine
profile, which means fake ratings may exist in a genuine user’s profile. This type of
attack requires no knowledge about the recommender system since the attacker does
not need to create profiles. It can be easily implemented with some economic cost
and can be highly effective. If detection has to be performed, every single rating in
the rating matrix should be investigated and the ratings should be categorized into
genuine ratings and fake ratings, which is almost impossible to be done. Therefore,
some countermeasures other than detecting and removing the attack profiles or attack
ratings can be designed to improve the performance of the recommender system and
minimize the effect of the attacks.
Another defect is that most of the previous research only considered the rating
matrix and the recommender algorithm when creating the attack profiles or doing
the detection. However, many other factors can be taken into account. For example,
a recommender system can be time-sensitive and make use of the time at which a
rating is given to analyze a user’s behaviors.
In conclusion, the complexity of the attack strategies and lack of a comprehen-
sive perspective become a big challenge for the relevant research topics. The attack
strategies and defense strategies are improving in an adversarial way. While more
robust defense strategies are being developed, researchers should also learn about
new attack strategies to ensure the effectiveness of their proposed defense strategies.
In the meantime, the assumptions made to support the research should adapt to
the real scenarios which tend to be mutable.
The following two strategies show the divergence method and the convergence
method, respectively. Convergence indicates that users will receive increasingly accu-
rate tweets, while the divergent approach ensures that users have access to certain
other domains of information.
In the score propagation model, when we use weighted weight to determine the
influence of a user there is an upper bound to avoid enormous effects. However, in
some specific areas like fan groups and academic forums, users are specific groups
and most likely do not need diverse project recommendations. Gradually converging
items and information may be more in line with demand.
In this case, allowing the weights to exceed the upper limit leads to better
results. Extended analysis of user behavior means that user feedback is collected
implicitly and explicitly, utilizing click-stream and user-activity data for the
recommender system extension with weighted multi-attributes (Akcayol et al. 2018).
The system can then identify accurate user profiles and allow those users' item
weights to exceed the limit, thus obtaining progressively converging recommendation results.
However, to avoid converging to a local minimum, the system should reserve a small
random probability of recommending items that are not related to the domain.
When a user click event occurs on such an item, it means that the user has become
interested in items outside this domain, in which case the weights should revert to
the upper limit.
Algorithm 1 SA
Input: T0, , x0 and UserSize
Output: BestSln x*
1: function SimulatedAnnealing(defaultValue)
2:   for i = 1 → UserSize do
3:     Generate r ∼ U(0, 1)
4:     if r < 0.5 then
5:       Use the default weighted weight to generate x0
6:     else
7:       Use 0 as the weighted weight to generate x0
8:     end if
9:     Calculate δT
10:    if δT < 0 then
11:      Accept x* as the new solution
12:    else
13:      Generate p ∼ U(0, 1)
14:      if p < exp(−δT / T) then
15:        Accept x* as the new solution
16:      end if
17:    end if
18:  end for
19:  T = T0
20: end function
user, the second layer is then the friends of the original user, and the third layer is
the friends of the friends of the original user.
Each layer has a weighted value to the next layer. Root Mean Square Error (RMSE)
is chosen to be the fitness value that reflects the accuracy of the system. With the
system at a higher temperature, the probability of the system receiving a less accurate
solution increases. This causes the system to try different combinations of weights,
thus avoiding the system from falling into a local optimum.
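A minimal, self-contained sketch of such a simulated-annealing loop is shown below, with a toy quadratic fitness standing in for the system's RMSE; the function name, perturbation size, cooling rate, and step count are all hypothetical:

```python
import math
import random

def simulated_annealing(fitness, x0, t0=1.0, cooling=0.95, steps=200):
    """Minimize a fitness function (e.g. RMSE) over a weight vector.

    fitness: maps a weight vector to a scalar, lower is better.
    x0: initial weight vector; all parameter defaults are hypothetical.
    """
    x, fx, t = list(x0), fitness(x0), t0
    best, fbest = list(x0), fitness(x0)
    for _ in range(steps):
        # Perturb one randomly chosen weight.
        cand = list(x)
        i = random.randrange(len(cand))
        cand[i] += random.uniform(-0.1, 0.1)
        fc = fitness(cand)
        delta = fc - fx
        # Always accept improvements; accept worse solutions with
        # probability exp(-delta / t), which shrinks as t cools.
        if delta < 0 or random.random() < math.exp(-delta / t):
            x, fx = cand, fc
            if fx < fbest:
                best, fbest = list(x), fx
        t *= cooling  # cooling schedule
    return best, fbest

# Toy fitness standing in for the system's RMSE, minimized at (0.5, 0.3).
fit = lambda w: (w[0] - 0.5) ** 2 + (w[1] - 0.3) ** 2
w, err = simulated_annealing(fit, [0.0, 0.0])
```

The acceptance rule is the key step: at high temperature the loop explores worse weight combinations, which is exactly what keeps it out of local optima.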
To defend a recommender system against attacks, the most intuitive way is to detect
and remove the attack profiles. However, if only the rating matrix is used for attack
profile detection, the detection may not always be effective since the attacks are
becoming more and more complex. Therefore, some other features derived from
user behaviors can be used when designing the recommender algorithms and some
countermeasures other than detection and removal can be taken to mitigate the effect
of attacks and increase the cost of performing them.
One factor which is neglected by many research groups is the time at which a
rating is given. By investigating the distribution of time at which a specific user gives
ratings to different items (which will be referred to as rating time in the following
contents) and the total number of ratings given by this user, the activeness of this user
can be analyzed, which can be used as a feature when designing the recommender
algorithms.
One feasible way we propose to quantify a user’s activeness makes use of the
information entropy contained by the distribution of the rating times of this specific
user. Information entropy is a measure of histogram dispersion. The distribution of
the rating times of a specific user can be described by a histogram, which contains
bins separated on a daily, weekly, or monthly basis. Every bin represents the number
of ratings given by this specific user within the period of that bin. The entropy can
be calculated as Eq. 8.7:
H(X) = − Σ_{k=0}^{L−1} p(r_k) · log2( p(r_k) )   (8.7)
where H(X) is the information entropy of the distribution of the rating times of a
specific user, L is the number of bins in the histogram, and p(r_k) is the probability
of a rating time belonging to the kth bin. p(r_k) is calculated as the quotient of the
value of the kth bin divided by the sum of the values of all bins (with the bin index
starting at 0).
A more dispersed histogram will have a larger entropy and a less dispersed one
will have a smaller entropy. When all the instances fall in one single bin and all
the other bins are empty, the entropy for this histogram will become 0. When the
instances fall uniformly in every bin, which means each bin has the same value, the
entropy of the histogram reaches its maximum value. In more detail, when a histogram
on a daily basis containing 7 days is used to calculate the entropy, a user
who only gives ratings on one single day and is inactive on the other days will
have an entropy of −1 × log2(1) = 0 (in this case only one bin is nonempty, and its
p(r_k) is 1). If a user gives the same number of ratings every day, the entropy
will be −7 × (1/7) × log2(1/7) = log2(7) ≈ 2.81.
Figure 8.5 is an example of the histogram of the distribution of rating times of
a single user over one week on a daily basis. In this specific case, the entropy is
calculated as −0.32·log2(0.32) − 0.12·log2(0.12) − 0.16·log2(0.16) − 0.24·log2(0.24)
− 0 − 0.12·log2(0.12) − 0.04·log2(0.04) ≈ 2.36.
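Equation 8.7 can be computed directly from the raw bin counts. The code below reproduces the two worked examples; for Fig. 8.5 we assume counts of 8, 3, 4, 6, 0, 3, 1 ratings out of 25, which match the stated proportions (any counts with those proportions give the same entropy):

```python
import math

def rating_time_entropy(bins):
    """Information entropy (Eq. 8.7) of a histogram of rating times.

    bins: number of ratings the user gave in each period (day, week, ...).
    Empty bins contribute nothing, following the convention 0*log2(0) = 0.
    """
    total = sum(bins)
    probs = (b / total for b in bins if b > 0)
    return -sum(p * math.log2(p) for p in probs)

one_day = rating_time_entropy([9, 0, 0, 0, 0, 0, 0])  # all ratings on one day: 0
uniform = rating_time_entropy([3] * 7)                # same count every day: log2(7)
fig85 = rating_time_entropy([8, 3, 4, 6, 0, 3, 1])    # assumed Fig. 8.5 counts
```

Working on raw counts rather than probabilities avoids a separate normalization pass.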
The entropy only considers the possibilities of the rating time distributions but
neglects the quantity. Therefore, apart from the entropy which reflects the dispersion
of a user's rating times, two further factors should also be considered to quantify the
user's activeness: the total number of ratings given by this specific user, and
the average (or median) number of ratings given by some or all of the other users. We
divide the user's activeness further into two types, global activeness act_global(u)
and local activeness act_local(u), which are respectively defined as Eq. 8.8:

act_global(u) = ( n_u / n(u) ) · H(R_u)   (8.8)
Fig. 8.5 An example of the histogram of the distribution of rating times of a single user over one
week on a daily basis
where nu is the total number of ratings given by user u, n(u) is the average number
(which can be substituted with a median number) of ratings given by his near regis-
tered users who register their accounts on the same day or in the same week or month
as this specific user, and H(Ru ) is the information entropy of all the rating times of
user u on a basis of the same period as the former measure, and Eq. 8.9:
act_local(u) = ( n_u / n ) · H(R_u)   (8.9)
where nu is the number of ratings given by user u within a specific period such as a
month, n is the average number (which can be substituted with a median number) of
ratings given by all the other users within the same month, and H(R_u) is the information
entropy of the rating times of user u over this period on a daily or weekly basis.
For the local activeness, the whole period of the histogram and the span of a single
bin can vary if necessary. Local activeness may be preferred when considering a
user’s activeness as a feature in the recommender algorithms since recent data tend
to have higher reference values. In addition, the global activeness may be slightly
biased due to the following two reasons. Firstly, the concrete registration time for a
specific user tends to be slightly different from his near registered users but they are
categorized into the same group and are considered to have registered in the same
period. Secondly, near registered users from different groups may not have the same
number of members, and groups registered in different periods may not always have
similar behaviors. However, if we ignore the slight discrepancies in the registration
times of the users in the same group and assume that the numbers of the members in
every near registered group are the same and they behave quite similarly, the global
activeness can be considered unbiased.
Users’ activeness can be used in several different ways. It can be used as a factor
of the weight of the influence of other users when predicting a rating for a target
user on a specific item in a user-based collaborative filtering recommender system.
In such a system, the weight of the influence of one user on another user tends to be
the similarity between the two users, which is usually calculated using the Pearson
correlation coefficient based on their rating matrix. However, we propose that the
activeness of user j be incorporated into the weight. Here we take the local activeness
as the example so that the weight becomes Eq. 8.10:
$$w_{i,j} = \text{sim}(i,j) \cdot \frac{n_j}{n} \cdot H(R_j) \tag{8.10}$$
where sim(i, j) is the similarity between user i and user j. However, such a weight
will be dominated by some extraordinarily active users. Therefore, some variation
of the rectified linear unit can be utilized to mitigate the effects of such users. After
applying the rectification, the weight becomes Eq. 8.11:
$$w_{i,j} = \text{sim}(i,j) \cdot f\!\left(\frac{n_j}{n}\right) \cdot H(R_j) \tag{8.11}$$
where f is a variation of the rectified linear unit, which is defined on [0, +∞) and
has the form of Eq. 8.12:
$$f(x) = \begin{cases} x, & \text{if } x < \alpha \\ \alpha, & \text{if } x \ge \alpha \end{cases} \tag{8.12}$$
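As a sketch, the activeness-capped weight of Eqs. 8.10–8.12 might be computed as follows; the entropy helper, the weekly time bins, and the default cap `alpha` are illustrative assumptions, not prescriptions from the text:

```python
import math

def entropy(counts):
    """Shannon entropy H(R_j) of a user's rating counts per time bin (e.g., per week)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)

def capped(x, alpha=2.0):
    """Variation of the rectified linear unit (Eq. 8.12): identity below alpha, capped at alpha."""
    return x if x < alpha else alpha

def weight(sim_ij, n_j, n_avg, rating_bins_j, alpha=2.0):
    """Activeness-adjusted weight of user j's influence on user i (Eq. 8.11)."""
    return sim_ij * capped(n_j / n_avg, alpha) * entropy(rating_bins_j)
```

For a hyperactive user with three times the average rating count, `capped` limits the activeness factor to `alpha`, so the weight no longer grows without bound.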
Apart from the above-mentioned strategies, some administrative measures can also
be important for enhancing the robustness of the recommender system. For example,
real-name registration is one feasible proposal, which has been discussed in Sect. 8.4.2.
If real-name registration is implemented, the developers must not only defend the
system against recommender system attacks but also prevent
network attacks to protect users’ privacy. Therefore, some more complicated tech-
nical measures and administrative measures should be designed to ensure server
security and protect users’ sensitive personal information, which should be compul-
sory for real-name registration. The following sections briefly describe some admin-
istrative measures concerning server security, which may pave the way for real-name
registration.
Password Policy
A good password policy should be formulated to protect the user’s account, which
contributes to the security of the user’s personal information. Passwords tend to
be stored in a hashed format using one-way functions rather than in plaintext, so the
stored passwords are unreadable. However, it is still possible to crack hashed
passwords and recover the originals. Brute-force attacks and dictionary attacks are
two typical offline password-cracking strategies. Such offline attacks use leaked
hashed-password files and try numerous character combinations until a hash
collision is found. A brute-force attack tries an extensive number of possible
passwords and can be accelerated by GPUs with fast parallel computation.
However, brute-force attacks are inefficient at cracking long passwords. A dictionary
attack tries different combinations from several dictionaries as possible passwords. A
dictionary can be a general dictionary or a special one that may contain a set of leaked
plaintext passwords or user information such as names, birthdays, etc. The following
is an example password policy that can prevent these attacks to some extent:
• The length of the password should be between 8 and 16 characters.
• The password should contain at least one lowercase letter, one uppercase letter,
one punctuation character, and one digit.
• No dictionary words or keyboard sequences should be used.
• No personal information should be contained such as phone number, name,
birthday, etc.
• The password should be changed on at least a semiannual basis.
• The password should not be changed to a previously used one.
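A policy like the example above could be enforced with a simple validator; the mini dictionary, keyboard sequences, and rule messages below are placeholders for illustration (a real deployment would use full wordlists, leaked-password sets, and a password history store):

```python
import re

# Hypothetical mini-dictionary and keyboard sequences, for illustration only.
DICTIONARY_WORDS = {"password", "welcome", "dragon"}
KEYBOARD_SEQUENCES = ("qwerty", "asdf", "12345")

def check_password(pw, personal_info=()):
    """Return a list of violated rules from the example policy (empty list = OK)."""
    problems = []
    if not 8 <= len(pw) <= 16:
        problems.append("length must be 8-16 characters")
    if not re.search(r"[a-z]", pw):
        problems.append("needs a lowercase letter")
    if not re.search(r"[A-Z]", pw):
        problems.append("needs an uppercase letter")
    if not re.search(r"\d", pw):
        problems.append("needs a digit")
    if not re.search(r"[^\w\s]", pw):
        problems.append("needs a punctuation character")
    low = pw.lower()
    if any(w in low for w in DICTIONARY_WORDS) or any(s in low for s in KEYBOARD_SEQUENCES):
        problems.append("contains a dictionary word or keyboard sequence")
    # personal_info: e.g., the user's name, phone number, birthday strings.
    if any(info and info.lower() in low for info in personal_info):
        problems.append("contains personal information")
    return problems
```

The semiannual-change and no-reuse rules require server-side state (timestamps and a hash history) and are not checkable from the password string alone.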
Apart from the password policy, some additional measures can be implemented
for further authentication. Two-step authentication can be used to enhance account
security. The first step is to check the user-defined static password which may follow
the above rules. A one-time password can be utilized for the second step and it
can be implemented in different ways. For example, many companies use short
message services to provide users with a PIN code through cellphone communication.
Hardware tokens such as microprocessor-based smart cards or pocket-size key fobs
can also be used to generate a one-time password. With the two-step authentication,
the user’s account will be much more secure.
Device lock can be another authentication measure. With a device lock, the users
can only log in to their accounts on authorized devices. The users are required to bind
their accounts with their phone numbers or emails. Then, when a user needs to log
in to his account on a new device, a PIN code will be sent through a short message
service on the phone or sent via email. In this way, even if the passwords are leaked,
the users’ accounts remain secure.
IP address monitoring can also be one administrative measure to defend the system
against recommender system attacks. When an abnormal number of profiles are
created by the same IP address, these profiles may be created by an attacker. There-
fore, these suspicious profiles can be removed from the system and a blacklist firewall
can be constructed to drop the packets from this IP address.
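The IP-monitoring idea can be sketched as a simple counter over profile-creation events; the threshold value is an arbitrary illustrative choice that a real system would tune to its traffic volume:

```python
from collections import Counter

def suspicious_ips(registrations, threshold=20):
    """Flag IP addresses that created an abnormal number of profiles.

    registrations: iterable of (profile_id, ip_address) pairs.
    threshold: hypothetical cutoff for "abnormal"; tune per deployment.
    """
    counts = Counter(ip for _, ip in registrations)
    return {ip for ip, n in counts.items() if n >= threshold}
```

The flagged addresses could then feed a firewall blacklist, and the profiles they created could be queued for removal or manual review.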
8.5 Summary
In this chapter, we introduce the concept of trust and attack in recommender systems.
Specifically, we describe the definitions, properties, and some models of trust, and we
explain how social scoring can be applied in recommender systems. We also briefly
talk about the evolution of trust-based recommender systems. As for the concept
of attack, some widely existing attack strategies and some innovative strategies are
discussed, and we briefly survey attack detection techniques. Furthermore, some
challenges in previous research on trust and attack in recommender systems are
pinpointed, and several possible solutions to these challenges are proposed.
Improvements in the score propagation model, the definition and use of user
activeness, and some administrative strategies are proposed to make recommender
systems more reliable and resilient. In conclusion, although the accuracy of
recommendations given by a recommender system is of paramount importance,
there are other heterogeneous factors
such as trust and attack which should not be neglected when designing an effective
and robust recommender system.
Think Tank
Chapter 9
Steps in Building a Recommendation
Engine
Abstract In this chapter, we discuss the steps one needs to keep in mind while
designing an efficient recommender system. We also examine the design parameters
for rating the efficiency of a recommender system. Then the steps to build such
a system are discussed along with a generic architecture.
9.1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 101
P. Kar et al., Recommender Systems: Algorithms and their Applications, Transactions on
Computer Systems and Networks, https://doi.org/10.1007/978-981-97-0538-2_9
ways of designing an efficient system, Sect. 9.4 discusses the steps to design a recom-
mendation engine, Sect. 9.5 discusses the architecture, and finally the summary in
Sect. 9.6.
All predictive models or recommender systems rely very heavily on large volumes of
data. The more data a machine learning algorithm has, the better the results it
returns. This is the reason why the organizations with the best recommender
systems are those with access to huge volumes of data, like Amazon, Google, and
Netflix. In this section, we discuss the main parameters for
evaluating the efficiency of a recommender system. While some of the parameters
can be quantified concretely, some other parameters are more subjective and rely on
the user (Fig. 9.1). The factors are as follows:
a. Accuracy—Accuracy is one of the most important parameters for measuring
the efficacy of a recommender system. Since most ratings are numerical data,
the metrics for accuracy are similar to those used in regression modeling.
There can be errors at various stages of the system. If a rating matrix R contains
a known user rating r(act), and a recommendation system estimates this rating
as r(est), the entry-specific error of the system is r(est) − r(act). The overall
error may be calculated using various methods. Moreover, not all the entries of
a rating matrix can be used both for training the model and for accuracy metrics,
as that would lead to overfitting. However, for a recommender system to be
successful, accuracy is not everything; many other performance metrics must be
considered. Focusing only on accuracy can actually
be detrimental to the system. Suppose a person is using a travel recommenda-
tion system. If the system returns with recommendations of places that have
already been visited by that person, then it would probably be rated as a poor
recommendation system.
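The entry-specific errors described above are typically aggregated into overall accuracy metrics such as MAE and RMSE over held-out ratings, as in this small sketch:

```python
import math

def mae(actual, estimated):
    """Mean absolute error over known (r_act, r_est) rating pairs."""
    return sum(abs(a - e) for a, e in zip(actual, estimated)) / len(actual)

def rmse(actual, estimated):
    """Root mean squared error; penalizes large errors more heavily than MAE."""
    return math.sqrt(sum((a - e) ** 2 for a, e in zip(actual, estimated)) / len(actual))
```

Both are computed only on ratings held out from training, in line with the overfitting caveat above.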
Collaborative Filtering
This model is based on the assumption that people like things similar to what they
have already liked and also that people with similar preferences are likely to choose
similar things (Guo 2013; Cai-Nicolas 2005).
There are two types of collaborative filtering models—nearest neighbor and matrix
factorization.
Nearest neighbor collaborative filtering
Here the nearest neighbor (Fig. 9.3) approach is used to find similar users or similar
products. There are two basic ways to filter the information for users—namely Item-
based collaborative filtering and User-based collaborative filtering.
User-based collaborative filtering finds users who have the same preferences and
tastes in products as the current customer and similar purchasing behavior. The
choices of those similar users are then recommended to the target user.
The item-based collaborative filtering has a different approach. It will recommend
products that are similar to the product the user has already purchased, e.g., if a user
has already liked a movie X, then a movie recommender system will try to find
movies with similar characteristics and recommend those movies. The parameters to
be considered for the matching could be producer name, actors, genre, release date,
etc.
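An item-based neighborhood can be sketched with cosine similarity over item rating vectors; the data layout below (one dict of user ratings per item) is an illustrative choice, not the book's:

```python
import math

def cosine(u, v):
    """Cosine similarity between two item rating vectors (dicts: user -> rating)."""
    common = set(u) & set(v)
    num = sum(u[k] * v[k] for k in common)
    den = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return num / den if den else 0.0

def similar_items(target, item_ratings, k=2):
    """Rank the other items by similarity to `target` (item-based CF neighborhood)."""
    scores = [(other, cosine(item_ratings[target], vec))
              for other, vec in item_ratings.items() if other != target]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:k]
```

In practice content attributes (producer, actors, genre, release date) would be combined with these rating-based similarities.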
Matrix Factorization
The matrix factorization method is another class of collaborative filtering algorithms
that are implemented in recommender systems. This method decomposes the user-
item interaction matrix into the product of two lower-dimensionality rectangular
matrices. This category of algorithms became very popular during the Netflix Prize
challenge, when Simon Funk (2006) shared his findings with the research community
and showed how effective it was. The results of the prediction
can be improved if different regularization weights are assigned to the latent factors
based on the item popularity and user activeness.
In this method, when a user gives feedback about any particular product, the
feedback is collected and stored in the form of a user-item matrix. The rows of the
matrix represent the different users and the columns of the matrix represent the
different products. The resulting matrix is mostly sparse because not every person
will buy every product (Fig. 9.4).
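A minimal Funk-style stochastic-gradient factorization of the sparse user-item matrix might look like the sketch below; the latent dimension, learning rate, and regularization weight are illustrative defaults, not values from the text:

```python
import random

def factorize(ratings, n_users, n_items, k=2, steps=2000, lr=0.02, reg=0.02):
    """Plain SGD matrix factorization in the spirit of Funk's approach:
    approximate the m x n rating matrix R by P (m x k) times Q^T (k x n),
    training only on the observed entries."""
    random.seed(0)
    P = [[random.random() * 0.1 for _ in range(k)] for _ in range(n_users)]
    Q = [[random.random() * 0.1 for _ in range(k)] for _ in range(n_items)]
    for _ in range(steps):
        for u, i, r in ratings:              # observed (user, item, rating) triples
            pred = sum(P[u][f] * Q[i][f] for f in range(k))
            err = r - pred
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)   # regularized gradient step
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating of user u for item i from the learned factors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))
```

The single regularization weight `reg` here is uniform; the improvement mentioned above would assign different weights per latent factor based on item popularity and user activeness.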
This is the first step while designing a successful system. We need to identify what
are the goals of the system and what is the type of business and its special or typical
features. For this, there need to be inputs from the operations team, the products
team, and the advertising team. The points that need to be discussed are: what the
end goal of the business is, whether it will benefit from recommendations, at which
point the recommendations will occur, the availability of the data on which the
recommendations will be based, whether all the contents or products should be
treated equally, and whether we can segment users with similar tastes.
Since recommendation systems rely on data to make accurate predictions, the amount
of data for most successful systems runs to terabytes. So the more data they have,
the more accurate the predictions will be. This data is mostly about user preferences
and is based on feedback, which can be of two types—explicit and implicit user
feedback. While explicit feedback is clear, as it states the likes and dislikes of a user,
implicit user feedback is something a user has not stated in his/her profile and is
more complicated to interpret.
One needs to consider the changing tastes of users while exploring and cleaning
the data. To keep up with a user's current tastes, older data that may no longer be
relevant should be eliminated periodically, or a weight factor should be given to more
recent activities. Datasets for recommendation systems are challenging to work with
because they are usually high-dimensional and many of their entries are missing, so
clustering and outlier detection are difficult.
Based on the previous steps, one can build a recommendation engine simply by
ranking the scores of users to get product recommendations. There is no need to
apply machine learning here, and this suffices for some simpler types of use cases.
For more complex ones, more sub-tasks need to be done to refine the system further.
This can be done by combining the recommendations from different types of systems,
using multiple algorithms in parallel, or using pure machine learning approaches to
combine multiple recommendation systems.
The final stage is the deployment of the model, with feedback included in the loop,
so that a number of iterations can be run to improve and refine the model.
9.5 Architecture
Usually, most recommender systems have three major components, an input, a recom-
mendation algorithm, and an output. The general structure of a recommender system
is shown in Fig. 9.5. The input consists of two steps. At first, we find out what is
the history of the user’s interaction concerning various items. The representation of
this data depends on the recommendation algorithm being used and can be a vector,
a matrix, or a tensor. In the majority of the recommendation algorithms, the input
is represented as a matrix of ratings. It is an m × n table, where the rows represent
the users and the columns represent the various products, and the intersection of the
rows and columns represents the rating given by a user for that particular item in the
column. If the slot is empty, it means that particular product has not yet been rated
by that particular user.
The second step involves the calculation of the distance between the target user
and the other users to find their nearest neighbors. This results in an m × m
matrix, where m is the number of users and the content of any cell (i, j) is the trust
entity between users i and j. While in the traditional approaches the neighbors are
chosen based on similarity, now most of them are chosen based on the trust entity.
After that a suitable recommendation algorithm is applied, where the objective is
to find the missing entries in the rating matrix.
Finally, the output contains the list of recommendations of products that are
predicted to be of the highest preference to the user.
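The second and third steps above (an m × m neighbor/trust matrix, then filling the missing entries of the rating matrix) can be sketched as follows; the weighted-average predictor is one simple choice among many:

```python
def predict_missing(R, weights):
    """Fill missing entries (None) of an m x n rating matrix R using a weighted
    average over the other users; `weights` is the m x m neighbor/trust matrix."""
    m, n = len(R), len(R[0])
    filled = [row[:] for row in R]
    for u in range(m):
        for i in range(n):
            if R[u][i] is None:
                num = sum(weights[u][v] * R[v][i]
                          for v in range(m) if v != u and R[v][i] is not None)
                den = sum(abs(weights[u][v])
                          for v in range(m) if v != u and R[v][i] is not None)
                filled[u][i] = num / den if den else None
    return filled
```

Ranking each user's filled-in row then yields the output list of highest-preference recommendations.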
9.6 Summary
Thus in this chapter, we have given an overview of the steps to be followed while
developing a new recommender system from scratch. We have also reviewed the
parameters for the evaluation of an efficient recommender system. Depending
on the type of application for the recommendation, it may sometimes be necessary
to shuffle the order of priorities of the parameters to get a balanced, accurate, and
relevant output.
A brief discussion of the most commonly used recommendation algorithms is also
given here so that a new developer can decide the best method for his/her application.
Finally, the general architecture of a recommender system has been given, and it
may be customized to add more details based on the specific requirements of the
application.
Think Tank
References
Breese JS, Heckerman D, Kadie C (1998) Empirical analysis of predictive algorithms for collabora-
tive filtering. In: Proceedings of the fourteenth conference on uncertainty in artificial intelligence,
pp 43–52
Cai-Nicolas Z (2005) Towards decentralized recommender systems. PhD thesis, University of
Freiburg
Chein-Shung H, Yu-Pin C (2007) Using trust in collaborative filtering recommendation. In: New
trends in applied artificial intelligence
Gediminas A, Alexander T (2005) Toward the next generation of recommender systems: a survey
of the state of the art and possible extensions. IEEE Trans Knowl Data Eng 17:734–749
Guibing G, Jie Z, Neil YS (2013) A novel Bayesian similarity measure for recommender systems.
In: Proceedings of the 23rd international joint conference on artificial intelligence (IJCAI), pp
2619–2625
Guo G (2013) Integrating trust and similarity to ameliorate the data sparsity and cold start for
recommender systems. In: Proceedings of the 7th ACM conference on recommender systems
(RecSys)
Jennifer AG (2005) Computing and applying trust in web-based social networks. Thesis
Mano P, Dimitris P, Themistoklis K (2005) Alleviating the sparsity problem of collaborative filtering
using trust inferences. In: Trust management
Paolo M, Paolo A (2007) Trust-aware recommender systems. In Proceedings of the 2007 ACM
conference on recommender systems, pp 17–24
Roger C, James HD, Schoorman FD (1995) An integrative model of organizational trust. Acad
Manag Rev 709–734
Viet-An N, Ee-Peng L, Jing J, Aixin S (2008) To trust or not to trust? Predicting online trusts using
trust antecedent framework
Young AK, Rasik P (2012) A trust prediction framework in rating-based experience sharing social
networks without a web of trust. Inf Sci 191:128–145
Chapter 10
Recommender System for Health Care
Abstract As recommender systems are applied in more and more areas, healthcare
recommender systems (HRS) have been produced and have been in use for decades.
In the healthcare or medical area, advice or suggestions are given on different topics
like diagnosis, medicine, food, and exercise. While an HRS can perform much
medical and fitness suggestion work accurately, several defects and optimizable
functions exist at the current stage. Several methods and techniques can be applied
to handle these drawbacks.
10.1 Introduction
In recent decades, with the improvement of people’s living standards and the rapid
development of medical treatment levels, the medical information available on the
Internet has increased significantly. In such a trend, more and more people demand
better medical care or health care. To meet these popular requirements and avoid
issues caused by information overload, healthcare search engines and recommendation
systems have been invented. Search engines filter and retrieve information along
directions known to the user, while recommender systems generate information along
directions unknown to the user. To some extent, these two methods can cope with
most of the user’s requirements and issues.
Healthcare recommender systems have been widely applied in many areas of
medical treatment and support like diagnosis, medicine, and lifestyle (food, exer-
cise, daily routine) recommendations. HRS includes several aspects of people’s
requirements at different levels such as therapy decisions and food and medicine
suggestions. After surveying and analyzing several different kinds of recommender
systems, although those available on the market can meet their target users’ require-
ments well and perform effectively and precisely, some challenges and gaps still
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 113
P. Kar et al., Recommender Systems: Algorithms and their Applications, Transactions on
Computer Systems and Networks, https://doi.org/10.1007/978-981-97-0538-2_10
exist between their expected performance and current conditions. Briefly speaking,
these issues form the main content of the systems' further improvement and updates.
In many people's view, the ideal healthcare recommender system can handle
any emergency or daily healthcare-related case and meet every requirement
perfectly. But most existing healthcare recommender systems can only give
recommendations for a specified purpose, like recommending medicine, food, or
exercise, or giving a diagnosis and related therapies. Modern medical or healthcare
suggestion procedures include several steps. Take disease diagnosis and treatment
as an example: the first step is to give a diagnosis and generate an accurate therapy.
This therapy includes several aspects like curative medicine, follow-up lifestyle
adjustment, and post-cure diagnosis. Apart from the accuracy issues and the tradi-
tional recommender system problems, the connection among several different cate-
gories of recommender systems can affect the user experience significantly. In this
example, different recommender systems work independently without effective data
communication which means that some parts of the recommendation may be too
generic and cannot consider a specific patient's conditions. Also, the work efficiency
and accuracy would be rather unreliable, which means that more work must be done
manually by healthcare workers or ordinary users. Beyond these connection
problems, the cold start, data sparsity, and other accuracy challenges should be identified
and solved.
This chapter mainly discusses the challenges that exist in these systems and how
developers can improve their systems' performance and user experience. The first
part describes and analyzes current HRS and the technologies and approaches
behind them. The second part focuses on issues in current HRS and suggestions
on how to handle them.
prescription (Zhang et al. 2015), symptom analysis, and diagnosis (Gräßer et al.
2017). Also, a cooperative relationship exists between them: when giving therapy
for a specific patient case, the whole recommendation procedure involves diagnosis
recommendation, medicine recommendation, and lifestyle recommendation, while
the therapy itself has different aspects of suggestions and prescriptions.
Medical recommendation systems often provide users with information such as
dietary recommendations, exercise advice, and recommendations for medications
and treatments (Zhang et al. 2015). In addition to systems for the public, there are
also recommendation systems for medical professionals (Tran et al. 2021); physicians
can make better decisions with the help of such systems (Stark et al. 2019).
10.2.1.1 Overview
The maturity of online access technology and the gradual improvement of medical
information have facilitated the development of recommendation systems for diag-
nosis and treatment. In areas where medical care is less developed, doctors often
do not have enough experience and knowledge to deal with various diseases, and a
system like this can help local patients get better diagnostic resources. In addition,
different departments in hospitals sometimes do not communicate and cooperate
effectively with each other, and a recommendation system can fix this deficiency.
The audience for such systems consists of two types of people: healthcare
professionals who need assistive technology to help with diagnosis, and the general
population who want to perform self-diagnosis.
Data mining is an important technique in this part, which is a technique to discover
and extract the hidden pattern information in the dataset. Data mining has been widely
used in the field of medical diagnosis to identify valid diagnostic knowledge from
a large number of electronic medical record databases to assist medical personnel.
Commonly used data mining algorithms include KNN, decision trees, and artificial
neural networks. Medvedeva et al. (2017) proposed a disease diagnosis and
treatment recommendation system (DDTRS) for data extraction using DPCA; they
first group reports to obtain clustering centers and then use the Apriori algorithm
for association analysis. Electronic health records (EHRs), electronic medical
records (EMRs), personal health records (PHRs), etc., can be used to mine patients’
cases and personal information (Stark et al. 2019). In addition, expert advice and
knowledge are important parts of the data source. Some systems use sensors to
collect key physiological data from patients, such as blood pressure, heart rate, and
other information.
Taking the task of judging the probability of cardiovascular disease from its related
symptoms as an example, the recommender system builds a probability table by
applying the Bayes network methodology. Figure 10.2 shows the relationship between
the symptoms and heart diseases and gives the calculation procedures of the operating
principles of the Bayes network for HRS.
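The probability-table idea can be illustrated with a tiny naive-Bayes-style sketch; the symptom names and probabilities below are invented for illustration and are not taken from Fig. 10.2:

```python
# Hypothetical prior and per-symptom likelihoods, for illustration only.
P_DISEASE = 0.1
P_SYMPTOM_GIVEN_DISEASE = {"chest_pain": 0.7, "fatigue": 0.6}
P_SYMPTOM_GIVEN_HEALTHY = {"chest_pain": 0.05, "fatigue": 0.3}

def posterior(symptoms):
    """P(disease | observed symptoms), assuming symptoms independent given the class."""
    pd, ph = P_DISEASE, 1 - P_DISEASE
    for s in symptoms:
        pd *= P_SYMPTOM_GIVEN_DISEASE[s]
        ph *= P_SYMPTOM_GIVEN_HEALTHY[s]
    return pd / (pd + ph)
```

A full Bayes network would model dependencies between symptoms rather than assuming independence, but the table-lookup-and-multiply structure is the same.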
Patient similarity analysis is also an essential part of the process. Using collabora-
tive filtering techniques to find similarities between patients based on symptoms has
good predictive power. Equation (10.1) shows the similarity calculation procedure,
based on the Pearson correlation coefficient, used when giving a diagnosis to the
target patient (Jain et al. 2020). The previous patient case with the highest similarity
score will be considered the most suitable reference when giving a diagnosis.
$$\text{Sim}(a,b) = \frac{\sum_{p \in P} (r_{a,p} - \bar{r}_a)(r_{b,p} - \bar{r}_b)}{\sqrt{\sum_{p \in P} (r_{a,p} - \bar{r}_a)^2}\,\sqrt{\sum_{p \in P} (r_{b,p} - \bar{r}_b)^2}} \tag{10.1}$$
where
P is the set of attributes that contains the useful diagnosis data,
patient a is the target patient and patient b is a recorded patient case, and
r_{a,p} and r_{b,p} are attribute p's values for patients a and b, with their mean
values over P denoted by the barred symbols.
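A direct implementation of Eq. 10.1 might look like the following; the dict-based patient representation and the attribute names in the usage example are illustrative assumptions:

```python
import math

def pearson_sim(a, b, attrs):
    """Pearson correlation (Eq. 10.1) between two patients' attribute values.

    a, b: dicts mapping attribute name -> numeric value;
    attrs: the set P of attributes carrying useful diagnosis data.
    """
    xs = [a[p] for p in attrs]
    ys = [b[p] for p in attrs]
    ma = sum(xs) / len(xs)
    mb = sum(ys) / len(ys)
    num = sum((x - ma) * (y - mb) for x, y in zip(xs, ys))
    den = (math.sqrt(sum((x - ma) ** 2 for x in xs))
           * math.sqrt(sum((y - mb) ** 2 for y in ys)))
    return num / den if den else 0.0
```

Ranking all recorded cases by this score and taking the maximum selects the reference case described above.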
When making therapy decisions for patients, precision and accuracy must be
guaranteed, or the system may cause misdiagnoses and lead to serious consequences.
In this case, the recommender system should collect several kinds of data: basic
information about the patients (age, gender, BMI, family anamnesis), therapy
description information (the medicine used in the therapy, curative period, dietary
considerations), past patient records used for comparison (for therapies that have
been widely used in curing patients), and clinical trial data (for therapies that have
not yet been used and can be considered newly produced). These data are commonly
used when making therapy decisions because most diseases can
be estimated through these indexes. In addition to these data, more specific
information needs to be collected. For example, when estimating the severity of a
patient's psoriasis (a kind of skin disease) (Gräßer et al. 2017), the system needs
to collect the health status of the patient's skin, since these data are the main factors
for the judgment. Also, the system will not collect other irrelevant data, in order to
save memory resources.
Table 10.1 shows an example of psoriasis diagnosis data collection. The first
dataset consists of user attributes, which can be used in the diagnosis of many
diseases; the middle dataset can be changed to other kinds of specific measurement
indexes according to the actual application scenario; and the last dataset is designated
for psoriasis only, such as PASI scores and severity, in order to give an accurate
diagnosis of this specific disease.
After collecting all the needed data, the information is converted into different
categories of values that can be easily processed by the recommender system program.
All the categories of data can be used as components of data mining. For example,
the data in the above table include some general information, like gender and weight,
which may have potential effects on the final diagnosis. They also contain some
direct data, like the affected parts of the body's skin, which play an important role
in generating the list of therapy choices.
This HRS has an obvious drawback: therapy recommendation requires knowing
the patient's disease in advance, or the RS cannot make any accurate decision.
Before applying this recommender system, a disease diagnosis (symptom analysis
and diagnosis) RS should be applied to recognize the disease as accurately as
possible. Also, the most widely used RS at the current stage is classification-based,
which may give an overbroad selective list of therapies and lead to poor curative
efficiency and low accuracy. To handle this issue, traditional data mining techniques
should be improved, and the restriction of descriptive data should be overcome.
Besides, the traditional issues of RSs also occur in this category of HRS. Data sparsity is one of the most important problems of such a system; it can be compensated by a demographic-based approach, which has broader coverage. One reason this problem exists is that patients hold back information about themselves due to privacy concerns; another is that doctors do not order every physical test when diagnosing a patient. The cold-start problem can be effectively mitigated by importing many users' prior cases. Merging home healthcare data is also difficult, as it involves the limitations of high-dimensional data.
When giving a diagnosis to patients, the most important goal is to make no mistakes. In existing recommender systems, the most widely applied measurement methods depend on keyword analysis rather than quantitative indexes, even though a recommender system handles numbers better than keywords. For example, in the Cardiovascular Risk Computation Recommender System (Guzmán et al. 2018), the developers built a decision tree to judge whether a patient has cardiovascular disease. The decision tree shown in Fig. 10.3 has several keyword nodes such as heart frequency, lifestyle, cardiovascular risk, and current exercises, and under each of these nodes there are two to four keywords, such as sedentary, normal, and active, for describing the patient's lifestyle. Compared with quantitative indexes such as blood pressure and heart rate, these keywords are too broad for doctors to base a judgment on. The method may work reasonably well for data mining when a large enough dataset exists for the recommender system to train on; if the number of cases is small, the system is more likely to give inaccurate judgments.
To handle these issues, the main point is to use more quantifiable factors, such as heart rate and several categories of blood indexes, which a recommender system can manage directly, rather than vague keywords. The recommendation will then rest mainly on quantifiable values rather than simple Boolean operators or multiple-choice answers. Besides, for rare diseases with little or no data in the background database, the recommender system cannot give valuable suggestions
and these cases should be handled by doctors themselves. Table 10.2 gives some
commonly used body indexes that can measure users’ health status.
All the patients' cases (clinical information) are stored in the background database, and each risk factor is assigned a score representing the degree of its impact on the patient. The patient's record may contain other kinds of data not shown in this table, because they are irrelevant and may have no effect on estimating the patient's health status. Besides the information given above, every case is clearly labeled as to whether the patient has diabetes or not. These data can then be treated as a training set and fed into the recommender system so that it gives a more accurate recommendation (diagnosis).
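The scoring-and-labeling idea can be sketched as follows: each quantitative risk factor carries a weight, the labeled past cases form a training set, and a simple decision threshold is fitted on their scores. All weights, feature names, and values here are illustrative assumptions, not clinical guidance.

```python
# Sketch: weighted risk score over quantitative indexes, with a
# threshold fitted from labeled past cases. All numbers are made up.

def risk_score(patient, weights):
    return sum(weights[k] * patient[k] for k in weights)

def fit_threshold(cases, weights):
    """Midpoint between the highest-scoring negative case and the
    lowest-scoring positive case (assumes the classes are separable)."""
    pos = [risk_score(p, weights) for p, label in cases if label]
    neg = [risk_score(p, weights) for p, label in cases if not label]
    return (max(neg) + min(pos)) / 2

weights = {"fasting_glucose": 0.6, "bmi": 0.3, "systolic_bp": 0.1}
cases = [
    ({"fasting_glucose": 9.1, "bmi": 31, "systolic_bp": 150}, True),
    ({"fasting_glucose": 8.4, "bmi": 29, "systolic_bp": 140}, True),
    ({"fasting_glucose": 5.0, "bmi": 22, "systolic_bp": 120}, False),
    ({"fasting_glucose": 5.6, "bmi": 24, "systolic_bp": 125}, False),
]
t = fit_threshold(cases, weights)
new_patient = {"fasting_glucose": 8.8, "bmi": 30, "systolic_bp": 145}
print(risk_score(new_patient, weights) >= t)  # True -> flag for review
```

A real system would of course use a learned classifier and validated clinical thresholds; this only shows how labeled quantitative cases become a decision rule.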
10.2.2.1 Overview
With the rapid development of medical technology and pharmaceutical science, more and more medicines are invented and applied in different areas, such as medicines for specific diseases and medicines to prevent them. While the variety of medicines has increased significantly, the probability of accidents has also increased. According
to the FDA report, more than 42% of medication errors are caused by physicians.
The human brain can sometimes make mistakes. A drug recommendation system
can be a good help to solve this problem. However, drug recommendation systems
require access to very specialized medical information. Data specialization in the
drug field may be more stringent than in other fields. A movie recommendation
list can occasionally contain content that is not of interest to the user, but a drug
recommendation list should never contain drugs that cause illness or death. The
machine must recommend the right drug and needs to identify multiple correct drug
interactions and adverse reactions to related drugs. Some ontology- and rule-based drug recommendation systems make recommendations to patients by analyzing detailed information about the drug itself. González et al. (2009) used an ontological approach to diagnose specific diseases, using only three variables: type of disease, drug, and allergy rules.
To some extent, the HRS for medicine can work cooperatively with the HRS for diagnosis when giving therapy to patients. For commonly used medicines, the HRS background database clearly labels each with attributes such as target disease, curative period, and market price. For newly produced medicines, the HRS analyzes their clinical data and ingredients to extract useful information and then stores them in the background database. When searching for the most suitable medicine for patients with a specific disease, the HRS compares the patients' profiles with the medicine database and generates a selective list for the attending doctor to decide from.
As the level of modern pharmaceuticals improves significantly, there are many kinds of drugs for different purposes on the market today, which means that the HRS for medicine should give appropriate recommendations according to the medical or healthcare purpose. To handle this challenge, the medical records stored in the background database are clearly labeled with medical purpose, usage cautions, potential or direct side effects, and other necessary attributes that may influence the final recommendation. Table 10.3 shows the background dataset structure, Table 10.4 shows the relative recommendation procedures used to cope with the challenge, and Table 10.5 gives a simple description of these three categories. The medicine database can be divided into three main types according to the user group.
Also, prediction of drug response is an important research direction, and prediction
of drug side effects and interactions or prediction of patient response to specific drugs
might take the drug recommendation system to a new level. A combined sparse linear
recommendation and logistic regression model (SlimLogR) was used in Chiang et al.
(2018) to solve the recommendation problem, which contains a drug recommendation
component and a label prediction component that efficiently identifies decisions that
may cause adverse drug reactions. The effects of drugs for some cancers or rare
diseases may vary from person to person, and in the era of precision medicine, one
can predict a patient’s response to a specific drug by analyzing data on the patient’s
gene expression, rather than relying solely on the large number of cases that existed
in the past to make predictions. Suphavilai et al. (2018) combined different cell lines with specific drugs and explored their potential relationships, which leads to understanding drug mechanisms and recommending accurate drug regimens more effectively. Drug side effects are a potential threat to public safety and one of the major contributors to illness and death in health care (Tatonetti et al. 2012). Galeano and Paccanaro (2018) proposed a collaborative filtering model for the large-scale prediction of drug side effects that can be used for the early detection of adverse drug events.
This method is based on an authoritative medicine database that stores all the information about medicine attributes, such as target disease, ingredients, and latent side effects. For example, the Side Effect Resource (SIDER) version 4.1 is a mature healthcare database which contains information about medicines on the market and their recorded side effects (Kuhn et al. 2015).
Besides, similarity measurement methodology, such as the Jaccard similarity index, can be used in this category of recommendation. To predict a drug's side effects, the first step is to find the group of drugs with similar curative effects, ingredients, and other attributes. These features should be indexed and given related weights during the calculation step. The similarity can be stated as the weight of the features that two or more drugs share, divided by the total weight of the features that all the drugs in the group have. The similarity equation and estimation equation are given in Eqs. 10.2 and 10.3, respectively.
\mathrm{Sim}_{\mathrm{drugs}} = \frac{\sum_{n=1}^{N} w_n \prod_{m=1}^{M} a_{m,n}}{\sum_{n=1}^{N} w_n}, \quad (10.2)

where w_n is the weight of feature n and a_{m,n} indicates (1 or 0) whether drug m in the group has feature n.
After collecting the dataset, the second step is to predict the side effects of the target (newly produced) drug. The calculation result will show the side effect with the highest probability, which is considered the potential side effect of the target drug. The equation can be written as:
\mathrm{Prob}_{\mathrm{drug}} = \frac{\sum_{m=1}^{M} \sum_{n=1}^{N} \mathrm{Sim}_m\, w_n}{\sum_{n=1}^{N} w_n}, \quad (10.3)
where \mathrm{Sim}_m is the similarity index between the target drug and the searched drug m, and w_n is the weight of side effect n.
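Under one reading of Eqs. 10.2 and 10.3, the computation can be sketched in Python; drugs are modeled as sets of features, and all feature names, side-effect names, and weights are illustrative assumptions (not drawn from SIDER).

```python
# Sketch of Eqs. 10.2 and 10.3 under one reading of them.
# Drugs are sets of indexed features; data below is made up.

def group_similarity(drugs, feature_weights):
    """Eq. 10.2: weight of the features shared by every drug in the
    group, divided by the total weight of features any of them has."""
    shared = sum(w for f, w in feature_weights.items()
                 if all(f in d for d in drugs))
    total = sum(w for f, w in feature_weights.items()
                if any(f in d for d in drugs))
    return shared / total

def predict_side_effect(neighbor_effects, sims, effect_weights):
    """Eq. 10.3: score each candidate side effect by its
    similarity-weighted occurrence among the similar drugs,
    normalized by the total side-effect weight."""
    total_w = sum(effect_weights.values())
    score = {e: w * sum(s for eff, s in zip(neighbor_effects, sims)
                        if e in eff) / total_w
             for e, w in effect_weights.items()}
    return max(score, key=score.get)

feature_weights = {"statin": 5, "oral": 1, "film-coated": 2}
d1 = {"statin", "oral"}
d2 = {"statin", "oral", "film-coated"}
print(group_similarity([d1, d2], feature_weights))   # (5+1)/(5+1+2) = 0.75

effect_weights = {"nausea": 2, "rash": 1}
neighbors = [{"nausea"}, {"nausea", "rash"}]
sims = [0.8, 0.5]
print(predict_side_effect(neighbors, sims, effect_weights))  # nausea
```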
When developing this kind of HRS, the system should not only focus on the
users’ profiles to collect their health status but also take their preferences and further
requirements into account. For example, if the target user is a normal person and
wants to keep fit, the recommender system should give a healthcare medicine list
for users to choose from rather than compulsory medical suggestions. Figure 10.5
is the flowchart that shows how the recommender system copes with the user’s
requirements with different identities and authorities.
To solve this issue, note that if the recommended medicine is for curing or preventing disease, the recommender system will mainly be used by healthcare professionals, who will know the purpose for which the target medicine should be used. When giving personalized medicine prescriptions, the objective purpose should be confirmed, which requires the background medical database to store the purpose of each medicine. When the purpose of the medicine is instead keeping fit or daily health care, the main consideration for generating the recommendation is the user's preference. Based on this idea, the Naïve Bayes classifier can be applied and optimized to produce the recommendation list.
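A minimal hand-rolled Naïve Bayes sketch of this preference step, assuming illustrative user attributes and a history of accepted/rejected suggestions:

```python
# Sketch: estimate the probability that a new user will accept a
# healthcare-medicine suggestion, from labeled past users, with a
# tiny naive-Bayes model. Attributes and history are made up.

def nb_accept_prob(history, user):
    """history: list of (attribute dict, accepted bool) pairs.
    Laplace smoothing keeps unseen values from zeroing the product."""
    p = {}
    for lab in (True, False):
        rows = [a for a, l in history if l == lab]
        like = len(rows) / len(history)          # class prior
        for k, v in user.items():
            count = sum(1 for a in rows if a.get(k) == v)
            like *= (count + 1) / (len(rows) + 2)
        p[lab] = like
    return p[True] / (p[True] + p[False])

history = [
    ({"age": "young", "goal": "fitness"}, True),
    ({"age": "young", "goal": "fitness"}, True),
    ({"age": "old", "goal": "therapy"}, False),
]
print(round(nb_accept_prob(history, {"age": "young", "goal": "fitness"}), 2))
```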
10.2.3.1 Overview
HRS for lifestyle can be divided into two categories, HRS for food and HRS for
exercise. The food recommendation system or diet recommendation system performs
like the drug RS to a certain extent. When recommending food for certain diseases,
the recommendation system should take the user’s lifestyle and health conditions
into account. This part of the recommendation is mainly based on the user’s dietary
habits and current health conditions like BMI, blood pressure, blood glucose, and
blood fat. According to these data or figures, the RS can recommend specified food
with suitable ingredients and sufficient nutrition. In some cases, food can have the
same effects as drugs do. The food categories would not change regularly like drugs,
so building up the RS about food can be easier. When giving a recommendation,
there are several categories of recommendation-generating patterns depending on
the applied technologies, and four of them are described here.
The first type of system is based on a personalized model of users (Galeano and Paccanaro 2018). It asks for inputs such as the user's past eating habits (e.g., yesterday's recipes) or the user's preferred foods. Typically, such a system displays a series of individual foods, such as beef, lamb, and fish, and asks the user to rate them (Yera Toledo et al. 2019). Such systems can also recommend restaurants that match the user's tastes (Tung and Soo 2004).
The second type of system tends to study the available nutritional information
and process the information according to the recommendations of professionals,
rather than prioritizing individualized modeling (Galeano and Paccanaro 2018). Such
systems use existing healthcare recommendations and process information through
genetic algorithms (Syahputra et al. 2017), ant colony optimization (Rehman et al.
2017), or bacterial foraging optimization methods (Chouhan et al. 2018). In this case,
the system asks the user to enter information such as age, gender, and occupation
instead of dietary habits. NutElcare (Espín et al. 2016) is a dietary recommendation
system that takes recommendations from nutrition or dietetic databases to generate
reliable dietary recommendations for older adults. GPS and pedometers are used to
estimate how much nutritional intake the user needs for the day and then generate a
ranked list of recommendations (Nag et al. 2017).
10.2 Analysis of Healthcare Recommender System 127
The third meal recommendation system blends the first two approaches, which
effectively combines user preferences and nutritional recommendations. Trattner and
Elsweiler (2017) presented a pioneering approach that investigated the possibility of
including nutritional factors in recipes. Ge et al. (2015) and McCarthy et al. (2016)
have made great efforts in this regard, being able to balance the user’s taste and
healthy dietary requirements.
The fourth type of recommendation system recommends a common diet to a group of people (Galeano and Paccanaro 2018). For example, when a group of users is planning a party, this kind of system can consider everyone's eating habits when making recommendations, instead of considering only one user (Kuhn et al. 2015).
Another category of this HRS is physical exercise recommendation. Compared with other kinds of healthcare recommender systems, its criticality is much lower. When recommending exercise, the RS takes the user's physical condition into account. Unlike the medicine and food recommendation systems, the enforceability of the physical exercise recommender system is much lower, since the recommendation mainly depends on the user's personal willingness. The only factor that needs attention is the user's health condition. For example, users with leg injuries should not be recommended long-distance races or other leg-intensive exercises, and users with cardiopathy or asthma should not engage in vigorous exercise. To cope with this requirement, a "do-not" list can be built up to record all the chronic diseases that need special attention; this information should be stored in the database.
With people's growing pursuit of a healthy diet, the HRS for healthy food and dietary management arose in response to the rising demand for physical health. As mentioned before, this kind of recommender system lacks enforcement: people can choose their dietary plans and decide whether to follow them or not. Also, the variety of food recipes and ingredients has increased significantly, so helping people choose food properly and wisely is important in today's world.
The HRS for healthcare food recommendation works slightly differently from other HRSs. It first collects users' basic information, such as preferences, to reduce the chance that users are unsatisfied with the dietary plan and refuse to follow it. Then, the system takes the users' health status into account when choosing which foods or ingredients to recommend: for example, low-glycemic foods or sugar substitutes for diabetics, and food with less oil and salt for users with high blood pressure. This step generates a recommendation list of detailed dietary plans containing guidance on what to eat, when to eat, and how to eat for better effects. After that, the recommender system asks users to evaluate the dietary plans by scoring them, so that for later users the system can compare their profiles with the records collected before and save a lot of execution time when generating a recommendation list.
[Fig. 10.6 appears here: a cluster diagram plotting Food Taste (vertical axis, 0–1) against Dinner Period (horizontal axis, 0–25 hours).]
It is obvious that in this kind of recommender system, the user's preferences and requirements have a higher priority and the system has low coerciveness. But sometimes the user's health status should also be considered, and the recommended lifestyle may not be accepted by users because of their personalities and life experience. To handle this issue, target group positioning should be done when generating the recommendation result, and the KNN algorithm can be applied here. For example, consider two features of users' eating lifestyles: food taste (0 is a light flavor and 1 is a heavy flavor) and dinner starting time. After all the data are inserted into the cluster diagram, Fig. 10.6 shows the final result, and the clusters are easy to find.
Figure 10.6 contains only two features; in an actual deployment there will be more user information and more features to consider. Behind the diagram, every point carries case information, such as the recommendation list given to the user and the user's satisfaction with the plan, which can serve as the training set.
Before starting the recommender system, the program will collect the user’s daily
lifestyle information and insert the data point into the cluster diagram. Then, the
system will find the user group that the target user belongs to and give the recom-
mendation list like other members in the user group have taken. After that, the system
will ask about the user’s satisfaction with the result for further system training and
provide more accurate suggestions.
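The k-NN step described above can be sketched as follows, with made-up past users described by (food taste, dinner starting hour); the hour is scaled so neither feature dominates the distance.

```python
# Sketch: place a new user among past users by (taste, dinner hour)
# and recommend the plan most of the k nearest neighbours received.
import math

def knn_plan(users, target, k=3):
    """users: list of ((taste, hour), plan_id) pairs."""
    def scale(p):
        return (p[0], p[1] / 24.0)   # hour scaled into [0, 1]
    ranked = sorted(users, key=lambda u: math.dist(scale(u[0]), scale(target)))
    top = [plan for _, plan in ranked[:k]]
    return max(set(top), key=top.count)   # majority plan among neighbours

past = [((0.2, 18), "light-early"), ((0.3, 17), "light-early"),
        ((0.1, 19), "light-early"), ((0.9, 21), "rich-late"),
        ((0.8, 22), "rich-late")]
print(knn_plan(past, (0.25, 18)))  # light-early
```

After the user rates the suggested plan, the rated case can simply be appended to `past`, which is the further-training loop the text describes.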
10.4 Summary
Think Tank
References
Chiang W-H, Shen L, Li L, Ning X (2018) Drug recommendation toward safe polypharmacy. ArXiv
abs/1803.03185
Chouhan SS, Kaul A, Singh UP, Jain S (2018) Bacterial foraging optimization based radial basis
function neural network (BRBFNN) for identification and classification of plant leaf diseases:
an automatic approach towards plant pathology. IEEE Access
Espín V, Hurtado MV, Noguera M (2016) Nutrition for elder care: a nutritional semantic
recommender system for the elderly. Expert Syst J Knowl Eng 33:201–210
Galeano D, Paccanaro A (2018) A recommender system approach for predicting drug side effects.
In: 2018 International joint conference on neural networks (IJCNN), pp 1–8
Ge M, Ricci F, Massimo D (2015) Health-aware food recommender system. In: Proceedings of the
9th ACM conference on recommender systems
Abstract Cybercrime activities are increasing all over the world at an alarming rate
and pose a serious threat to a nation and its citizens. For this reason, surveillance
and monitoring of online browsing activities of individuals become necessary to
prevent terrorist and criminal activities, in the interests of national security. There
are at present several such surveillance projects in implementation already in India
as well as abroad. The cyber-surveillance tools monitor data stored on hard drives,
as well as data transferred over computer networks, such as the Internet through
emails, or through mobile phones in the form of calls or messages. In this chapter, we propose a sparse matrix-based framework that tracks the browsing activities of individuals over time; if a person is found to be surfing a chain of websites categorized as potentially harmful over a period of time, it raises an alert to the governing authorities. The framework also uses recommender systems and browsing-history tracking algorithms. We expect that it can be of assistance to, and utilized by, countries' authorized monitoring agencies as a threat analysis tool.
11.1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 131
P. Kar et al., Recommender Systems: Algorithms and their Applications, Transactions on
Computer Systems and Networks, https://doi.org/10.1007/978-981-97-0538-2_11
132 11 A Surveillance Framework of Suspicious Browsing Activities …
There are at present several mass government surveillance projects in India, all of
which are monitored by the various core security agencies authorized by the GOI.
There are various other mass surveillance projects in place all over the world also.
Here, we briefly mention three such major projects in India (http://trackingobserver.
cs.washington.edu; Acar et al. 2014).
The National Intelligence Grid (NATGRID) contains integrated data in the form of a master database and is used for counter-terrorism operations. It integrates the databases of various important security agencies under the GOI and was formed after the 2008 Mumbai attacks. The whole master database was scheduled to go live by December 31, 2020, and very few people will have access to it, on a case-to-case basis. NATGRID is a counter-terrorism measure that collects varied data from various standalone agencies and Indian government ministries. The collected data collates all kinds of information from these databases, such as details of bank accounts, taxes, credit or debit card transactions, visa and immigration records, travel itineraries, etc. With the help of NATGRID, security agencies can locate and extract relevant information about terror suspects by pooling data from datasets all over the country. Basically, it is aimed at helping identify and capture suspected terrorists and thereby preventing planned terror attacks.
11.3 Web User Tracking of Browsing Patterns and Populating Recommender … 133
The Central Monitoring System (CMS) works on the principle of telephone inter-
ception provisioning and is a centralized system. It was developed by C-DOT, which
is a GOI-owned telecommunication technology development center. It monitors all
the communications that take place on mobile phones, landlines, and the Internet
in the country. It is capable of keeping track of the persons who have initiated or
received calls on mobile or landline numbers, the time, date, and durations of the
calls, as well as the location of the targets, failed calls, the call data records (CDRs)
of roaming subscribers, and forwarded telephone numbers by target subscribers. It
was designed primarily to strengthen security in the country.
The Center for Artificial Intelligence and Robotics (CAIR) has developed a soft-
ware network named NEtwork TRaffic Analysis (NETRA) in a DRDO laboratory
and is used by various counter-intelligence agencies like the Intelligence Bureau
and RAW. Initially, the program is being tested by the national security agencies on a small scale, but it is planned to be deployed at a pan-India level soon. It has been designed to monitor Internet traffic on a real-time basis because of the increasing threats posed by criminals and terrorist groups who use data communication among themselves. NETRA is capable of analyzing voice traffic that passes through a variety of software such as Skype and Google Talk, and it can also detect
and intercept messages which contain keywords like “attack”, “bomb”, “blast”, or
“kill” in real-time from a huge number of tweets, status updates, emails, Internet
calls, blogs, forums, etc. It can also detect images that are generated on the Internet
for obtaining the intended intelligence information. Together with RAW, the system analyzes a large amount of international data that passes through the Internet networks in India.
The methods mentioned above all deal with the Internet or network communi-
cation where they track tweets, emails, messages, and other electronic documents.
But to the best of our knowledge, there is at present no framework to track the browsing patterns of users in order to predict and raise alerts when they visit links to malicious or potentially harmful websites. In this chapter, we propose a framework that traces the browsing patterns of various users using various parameters and alerts the authorities if any suspicious activity is detected. The authorities can then
further investigate the flagged users for possible threats. In the next section, we give
an overview of how the web tracking of the browsing patterns is usually done along
with an introduction to the concept of recommender systems.
App and service providers do extensive web user tracking to collect all information
about their users, in order to provide them with a superior product experience. Infor-
mation about the location, browsing tendencies, communication records, financial
information, and general preferences regarding users’ online and offline activities
can give significant insights into a person’s activities. A lot of this access is often
directly granted from the user when he/she is using a particular service for browsing
particular sites. In many cases, a lot of this private information is captured by online
services even when the direct consent or knowledge of the user is not there. So their
party services follow users in order to track the users across the different websites
which they access. When a user surfs the web, they leave traces of their identity
in the form of the patterns of their activities and many more such unstructured data
which creates the users’ online footprints (Acar et al. 2013, 2014). The fact that users
carry around a host of personal computers and other communication devices makes
them locatable, identifiable, and trackable across different locations, networks, and
services. Therefore, this information, arising from users’ activities, along with other
technologies, can effectively enhance the surveillance capabilities and lead to an
effective monitoring system in the interest of national security. Figure 11.1 shows
how service providers track the chain of websites browsed by a user, to provide
advertisements the user may be interested in.
Recommender systems (Adomavicius and Tuzhilin 2005; Ahn 2006; Bailey 2008)
are software tools that use agents to help the user to find the most suitable pages
of their interest. The algorithms used by the recommender systems are mainly of
three types: collaborative filtering, content-based, and hybrid methods. In a content-
based system, recommendations are made by collecting information about the profile
features of a user. The idea is that if a user has shown interest in the past for a particular
thing, then it is very likely that the user will also be interested in that object again in
the future. Usually, objects of similar types are put in a group based on the similarity
of their features (Balcan et al. 2006; Boutilier et al. 2003; Bridge and Ricci 2007).
The profiles of users are created either by the use of historical interactions or by
Fig. 11.1 Advertisers respond with corresponding advertisements based on user web search activity
(Puglisi et al. 2017)
explicitly asking users about their interests. Figure 11.1 explains how recommender
systems work.
In a collaborative filtering system, user interactions are utilized to filter out the
objects which are of interest. The set of interactions is visualized as a matrix, where
the interactions between the users i and items j are represented by the entries (i, j). It
can be thought of as a generalization of classification and regression. While in the former cases a designated target variable is predicted from the other variables, in collaborative filtering no such distinction is made between the feature variables and the class variables: the problem is visualized as a matrix, and instead of predicting the values of a single column, the value of any given entry is predicted. At present, this
is one of the approaches being used most frequently and normally provides results
that are better than content-based recommendations. The recommendation systems
of YouTube, Netflix, and Spotify use this type of system (Box et al. 2005; Breiman
and Breiman 1996; Puglisi et al. 2017).
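As a toy illustration of this matrix view, the sketch below predicts a missing entry by a similarity-weighted average over other users' rows (user-based collaborative filtering); the ratings are invented.

```python
# Sketch: rows are users, columns items, 0 means "no interaction".
# A missing entry is predicted from similar users' ratings.
import math

R = [
    [5, 4, 0, 1],   # user 0 has not rated item 2
    [4, 5, 3, 1],   # user 1
    [1, 1, 0, 5],   # user 2
]

def cosine(u, v):
    return sum(a * b for a, b in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))

def predict(R, user, item):
    """Similarity-weighted average of other users' ratings of `item`,
    skipping users who have not rated it."""
    num = den = 0.0
    for other, row in enumerate(R):
        if other == user or row[item] == 0:
            continue
        s = cosine(R[user], row)
        num += s * row[item]
        den += s
    return num / den

# Only user 1 rated item 2, so the prediction equals that rating.
print(round(predict(R, 0, 2), 2))  # 3.0
```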
Hybrid systems are a combination of both types of information and are aimed to
overcome the issues that come up while working with just one kind.
11.4 Sparse Matrices

In general, most large matrices are sparse in nature, and a recommender system matrix is usually a sparse matrix. A matrix, as we know, is a two-dimensional data object made of m rows and n columns, so the total number of values is m × n. If most of the elements of the matrix are zero, then the matrix is called a sparse matrix. The advantages of a sparse matrix representation over a normal dense one are as follows:
Storage: The number of nonzero elements is far smaller than the total number of elements, and therefore the amount of memory needed to store only the nonzero elements is smaller.
Computing time: Computing time may be reduced by a data structure designed to traverse only the nonzero elements.
Suppose we take the following sparse matrix in Fig. 11.2.
If we represent it as a dense 2D array, we waste a lot of memory, as the zeros in the matrix are usually not needed. Therefore, instead of also storing the zeros, we store only the nonzero elements, each as a triple (row, column, value).
Although sparse matrices can be represented in various ways, the two most
frequent ways to store them are as:
a. Array representations.
b. Linked list representations.
In the array representation, the sparse matrix is represented as a 2D array in which the following three rows are used:
Row: the row index at which a nonzero element is located.
Column: the column index at which a nonzero element is located.
Value: the value of the nonzero element located at index (row, column).
So, the above matrix is represented as follows in Fig. 11.3.
In the linked list representation, each node contains four fields. These fields are
defined as:
Row: This is the index of the row, which contains the location of the nonzero element.
Column: This is the index of a column, which contains the location of the nonzero
element.
Value: This is the value of the nonzero element which is located at the index (row,
column).
Next node: This contains the address of the next node.
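The triple representation can be sketched in Python as follows; the matrix is made up, and the round-trip shows that no information is lost.

```python
# Sketch of the array (triple) representation: store only the
# nonzero entries of a sparse matrix as (row, column, value).

def to_triples(matrix):
    return [(i, j, v)
            for i, row in enumerate(matrix)
            for j, v in enumerate(row) if v != 0]

def from_triples(triples, rows, cols):
    m = [[0] * cols for _ in range(rows)]
    for i, j, v in triples:
        m[i][j] = v
    return m

M = [
    [0, 0, 3],
    [0, 7, 0],
    [0, 0, 0],
    [5, 0, 0],
]
t = to_triples(M)
print(t)                          # [(0, 2, 3), (1, 1, 7), (3, 0, 5)]
print(from_triples(t, 4, 3) == M) # True -- the round-trip is lossless
```

Here 12 stored values shrink to 3 triples; the saving grows with the fraction of zeros, which is exactly the storage advantage described above.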
11.5 Our Proposed Framework

The amount of information that can be extracted for surveillance is difficult to determine, because its accuracy depends on four factors: the web structure, how web resources are mapped onto the topology of the global Internet, a typical user's web-browsing behavior, and the technical capabilities and policy restrictions of the adversary/authority.
The data for the tracking reports comes from the three following sources:
• The HTTP request of the user.
• The browser/system information.
• First-/third-party cookies.
So the tracker is capable of inspecting packet contents and can either track an individual target or surveil a large group of users. A major challenge here is the dearth of persistent device identifiers, but this can be overcome to a large extent by observing third-party cookies. Since there are multiple unrelated third-party cookies on most web pages, they can be tied together to cover most of a user's web traffic, even without the IP address. The network traffic can thus be separated into clusters, where each cluster corresponds to exactly one user.
The following are the steps for the targeted surveillance process:
• Either the target identity is scanned in plaintext HTTP traffic, or some auxiliary method is used to get the target's cookie ID on some first-party page.
• Then, the target's known first-party cookie can be transitively connected to the target's other third-party and first-party cookies. In the case of en masse surveillance, all the HTTP traffic can be clustered first and individual identities then attached to these clusters.
• Once the identities have been attached, various types of information and activities
can be extracted or predicted. Firstly, the browsing history itself may provide
primary information about the interests of the user, e.g., terror attacks. Secondly,
it can also provide sensitive information in unencrypted web content like purchase
history, address, etc.
Our proposed framework tracks the web browsing of users to build a sparse matrix
similar to a recommendation system, based on the tracking patterns. The values of
the matrix and the sites are monitored to see if a user is suspected to traverse the
Internet with malicious intentions. Then an alert system is in place to raise flags to
the concerned authorities.
Figure 11.4 shows the process of tracking a user's browsing information. The user sends HTTP requests to various websites. All these websites tell the user's browser to send a request to the same tracker for an ID. The browser sends requests to the tracker, and the tracker replies with an instruction to set a cookie with an ID. For single-website trackers, this is a first-party cookie, while cross-website trackers store their ID in a third-party cookie. In this case the tracker sets a single third-party cookie with an ID that can be accessed across all of the websites. This information, set by the trackers for multiple users, can be used to build and populate a recommender system sparse matrix. When the number of hits to a particular page that has been flagged as potentially dangerous goes beyond a particular threshold value, the system sends an alert to the surveillance system administrator.
138 11 A Surveillance Framework of Suspicious Browsing Activities …
A list of users and the list of websites they browse are tracked by a third-party tracker using the following algorithm. The browsing data and other related information are shared by the tracker to build the recommender alert system matrix.
Whenever access to harmful sites is detected in the matrix, it will send alerts to the
concerned authorities.
11.6 The Proposed Algorithm for the Threat Analysis and Alert 139
FindThreat( )
Input: userIDList[]; websiteList[]; cookieID[]; trackers_3P[]; threshold_value
Output: threatAlert( ); user_ID
for each user in userIDList[] do
  for each website in websiteList[] do
    send httpRequest( ) to website
    website responds with webpage data and tells the browser to request a cookie ID
    browser requests cookie_ID[] from tracker_3P[]
    tracker responds with cookieID[] to user[]   // generateCookieID( )
    share user browser data with website
    for i = 1 to n do
      for j = 1 to m do
        buildRecommenderSystem( ) R[i][j]
      end for
    end for
  end for
end for
for i = 1 to n do
  for j = 1 to m do
    if R[i][j] >= threshold_value then
      send threatAlert( ), user_ID
    end if
  end for
end for
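The final alert loop of the algorithm can be sketched in Python. The matrix layout, the set of flagged sites, and the function name below are illustrative assumptions, not part of the book's specification.

```python
def find_threats(R, flagged_sites, threshold):
    """Return (user, site) pairs whose hit count R[user][site] on a
    flagged site meets or exceeds the threshold."""
    alerts = []
    for user, counts in enumerate(R):
        for site, hits in enumerate(counts):
            if site in flagged_sites and hits >= threshold:
                alerts.append((user, site))  # would trigger threatAlert()
    return alerts

R = [[0, 4, 1],   # user 0's hit counts on sites 0..2
     [7, 0, 0]]   # user 1's hit counts
print(find_threats(R, flagged_sites={0, 1}, threshold=5))  # [(1, 0)]
```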
11.8 Summary
In this chapter, we have proposed a framework for the surveillance and tracking of
suspicious browsing activities by a user. It maintains a sparse matrix of the chain of
websites browsed by a user as well as a list of probable sites they may access. Based
on this sparse matrix, if the number of hits to harmful crosses a threshold value, then
an alert is sent to the authorities. It can also be an effective tool for the threat analysis
for potentially malicious users on the website. In the future, we plan to extend this
work to deal with cases where the users are browsing in incognito mode.
Think Tank
Abstract This chapter gives a brief introduction to the recommender system and
provides details of six different applications of the recommender system (health care,
security, tourism, e-commerce, e-learning, and social network). It mainly discusses the reasons to use recommender systems and real-life examples, as well as the techniques behind them, such as collaborative filtering, content-based filtering, and hybrid filtering. Based on the existing problems within each recommender system, some possible solutions are given to improve the current recommender systems.
12.1 Introduction
Nowadays, the rapid development of the Internet enables a large influx of information
provided to users through networks and different platforms. However, this leads to information-overload problems: users cannot find exactly what they want in a short time among a large number of dazzling choices, making it hard for them to make decisions (Kunaver and Požrl 2017). Therefore, in order to deal
with this problem, it is necessary to come up with solutions to help users quickly find
out what they want and provide them with the most appropriate products or services
(Lu et al. 2020). That is how recommender systems came into being to solve this
problem (Lu et al. 2015). Recommender system (RS) is a filtering system that uses
complex algorithms to select the most relevant items, content, or services that users
most frequently search for, tailored to their preferences (Das et al. 2016; Isinkaye
et al. 2015). It has now been widely used in all parts of our life, such as e-commerce,
traveling, and health care. For example, Amazon uses recommender systems to help
users find books or other products they like. Moreover, introducing recommender
systems not only makes life much easier for users but also benefits companies and
providers. This documentation is divided into three parts: algorithms (techniques,
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 143
P. Kar et al., Recommender Systems: Algorithms and their Applications, Transactions on
Computer Systems and Networks, https://doi.org/10.1007/978-981-97-0538-2_12
144 12 Some Novel Applications of Recommender System and Road Ahead
12.2.1 Security
In daily life, people tend to consult their friends, or people they trust, about an unfamiliar problem, and make their own choices based on these judgments and opinions. A typical collaborative filtering algorithm is the user-based collaborative filtering algorithm. Its basic principle is to find a user's neighbors using historical rating data and to recommend items to the target user according to the ratings of those nearest neighbors (Yubo et al. 2010).
12.2 Applications of Recommender System 145
Therefore, the target user's scores for unrated items can be predicted through a weighted average over the nearest neighbors, to produce recommendations. This process comprises three steps: data presentation, finding the nearest neighbors, and producing recommendations (Yubo et al. 2010).
This recommendation method involves the use of k-nearest neighbors. The KNN algorithm is used to reduce the exposure of user privacy and protect user files from privacy threats, without reducing the quality of the content recommended to users (Katarya and Verma 2017).
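The neighborhood prediction described above can be sketched as follows. The toy rating matrix and the choice of k are illustrative, and cosine similarity stands in for whatever similarity measure an implementation would choose.

```python
import math

def cosine(u, v):
    """Cosine similarity between two rating vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def predict(ratings, target, item, k=2):
    """Predict the target user's rating for an item as the similarity-weighted
    average over the k nearest neighbours who have rated that item."""
    sims = sorted(
        ((cosine(ratings[target], r), u)
         for u, r in enumerate(ratings)
         if u != target and r[item] > 0),
        reverse=True)[:k]
    num = sum(s * ratings[u][item] for s, u in sims)
    den = sum(s for s, _ in sims)
    return num / den if den else 0.0

ratings = [
    [5, 3, 0],   # target user; item 2 is unrated (0 = missing)
    [5, 3, 4],
    [1, 5, 2],
]
print(predict(ratings, target=0, item=2))  # a weighted value between 2 and 4
```

The prediction follows the three-step process above: represent the data, find the nearest neighbors, then produce a weighted-average recommendation.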
Web service recommendation methods are also based on collaborative filtering, recommending services from Mashup call history, user similarity, or service similarity. In the early days, this method was generally used for Quality of Service (QoS) prediction, selecting high-quality services in web service recommendation.
• Statistics-based recommendation
A statistical method is a mature privacy protection method used in the data calculation
stage (such as similarity calculation). It can process sensitive information in data files
by removing features, obfuscation, adding noise, and other methods. This is often
referred to as the anonymization algorithm. At present, K-anonymity, I-Diversity,
and T-Total are the most common authorization and acceptance methods (Weiming
et al. 2019). Although these three statistical methods are very efficient and relatively
simple to calculate, sometimes they do not play a substantial role in user privacy
theft with common characteristics (Sweeney 2020).
For example, if the dimensionality of the information obtained about users is sufficient, then even though these techniques anonymize a single column of information, an attacker can still compare and categorize records, retrieve all of a user's information, or re-identify the user (Wong et al. 2006).
• Cryptography-based recommendation
A hybrid technique combines two or more filtering methods, and thus gains an increase in accuracy and performance.
• User identification
Because of proxy servers and firewalls, users cannot be distinguished by IP address alone, so heuristic rules are adopted. Different IP addresses represent different users. When users share the same IP address, different operating systems or browsers can distinguish different users. When the IP addresses, operating systems, and browsers are all the same, the website topology is used to separate users: when a page cannot be reached from the history of previously visited pages, the request is considered to come from a new user (Zhang and Wang 2013).
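The first two heuristic rules can be sketched as a predicate. The log-record fields (ip, os, browser) are assumptions about what an access log provides, and the topology rule is omitted for brevity.

```python
def same_user(rec_a, rec_b):
    """Two requests are attributed to the same user only when IP address,
    operating system, and browser all match (heuristic, not definitive)."""
    return (rec_a["ip"] == rec_b["ip"]
            and rec_a["os"] == rec_b["os"]
            and rec_a["browser"] == rec_b["browser"])

a = {"ip": "10.0.0.1", "os": "Linux", "browser": "Firefox"}
b = {"ip": "10.0.0.1", "os": "Windows", "browser": "Firefox"}
print(same_user(a, b))  # False: same IP but different operating systems
```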
• Session identification
Session identification breaks the record of web pages that a user has visited into separate sessions, treating all the web pages a user visited in one sitting as one user session. Because it is difficult to determine whether the user has left the website, the easiest way to decide is to use a maximum timeout: if the period between two page requests exceeds a certain time limit, the session is considered finished and a new session starts. A large number of statistical studies have shown that 30 minutes is a relatively standard timeout (Zhang and Wang 2013).
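The maximum-timeout rule can be sketched as follows, splitting one user's request timestamps (in seconds) with the 30-minute threshold mentioned above; the sample timestamps are invented for illustration.

```python
TIMEOUT = 30 * 60  # 30 minutes, the commonly used standard

def split_sessions(timestamps, timeout=TIMEOUT):
    """Group sorted request timestamps into sessions: a gap larger than
    the timeout ends the current session and starts a new one."""
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= timeout:
            sessions[-1].append(t)   # same session: gap within the timeout
        else:
            sessions.append([t])     # gap too long: start a new session
    return sessions

hits = [0, 600, 1200, 5000]          # last gap is 3800 s > 1800 s
print(split_sessions(hits))          # [[0, 600, 1200], [5000]]
```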
• The matrix factorization algorithm cannot effectively capture the complex interaction information between Mashups and web services in the sparse Mashup–web service call matrix, which may result in lower recommendation performance.
12.2.2 Tourism
This is about making a recommendation to the target user based on items with
high ratings made by users who have similar preferences to the target user (Fararni
et al. 2021). Nowadays, more factors can be taken into consideration, such as weather conditions and location (latitude, longitude, and GPS coordinates), as shown in Fig. 12.1.
Fig. 12.2 User-generated content in tourism recommender system (Bayati et al. 2022)
This is based on the user preferences and similarity analysis between items and
then recommending to the target users (Fararni et al. 2021). Multi-user profiles and
TR-service profiles can be used in this algorithm (shown below).
In order to provide the best recommendation to the target user, the target user’s
profile must be compared with the profile of other users in the TR service. Therefore,
it is also important to calculate the similarity between the distribution of the target
user, θc , and the distribution of the user profile layer in the TR service, denoted by
θs , which is positive feedback from other users. The relevance score is calculated by
Eq. 12.1.
r̂u,s,c = 1 / DKL(θc ||θs ), (12.1)
where DKL(θc ||θs ) refers to the Kullback–Leibler divergence between these two probability distributions, which can be interpreted as Eq. 12.2:

DKL(θc ||θs ) = Σw P(w|Rs ) log [P(w|Rs ) / P(w|Rc )]. (12.2)
The higher the score, the more similar the target user is to the users in the TR service. Therefore, recommendations can be made more accurately based on the relevance score (Sondess et al. 2019). However, this is not always logically sound, since the target user may not want to go to the same place twice (Fararni et al. 2021).
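Eqs. 12.1 and 12.2 can be sketched directly in Python. The word distributions below are invented for illustration, and real profiles would need smoothing so the ratio inside the logarithm is always defined.

```python
import math

def divergence(p_s, p_c):
    """Eq. 12.2 as given: sum over words of P(w|Rs) * log(P(w|Rs) / P(w|Rc)).
    Assumes both distributions share support (smoothing needed otherwise)."""
    return sum(ps * math.log(ps / p_c[w]) for w, ps in p_s.items() if ps > 0)

def relevance_score(p_s, p_c):
    """Eq. 12.1: the reciprocal of the divergence, so closer profiles
    score higher (identical profiles would need a guard against 1/0)."""
    return 1.0 / divergence(p_s, p_c)

# hypothetical word distributions: a TR-service profile and two target users
service = {"beach": 0.6, "museum": 0.4}
close_user = {"beach": 0.5, "museum": 0.5}
far_user = {"beach": 0.1, "museum": 0.9}
print(relevance_score(service, close_user) > relevance_score(service, far_user))  # True
```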
• Hybrid filtering
The idea of this approach is to maximize the potential of collaborative filtering and
content-based filtering (Fararni et al. 2021), making the recommender system more
robust. Big data systems, deep learning techniques, and social networks can be used
to help make the filtering algorithm and recommender system more robust.
• Knowledge-based
• Collect user feedback frequently to see if there are things that need to be improved.
For example, we can collect feedback from the user once a month to see if there
are things that need to be improved.
• Ask the new user to fill in his/her information before browsing the web pages
about tourism. Then by building the user profile, find a similar user profile pattern
in the database and make recommendations based on this.
• As Fig. 12.3 shows, update the user profile frequently to keep track of the user's latest preferences and make more accurate recommendations.
12.2.3 E-commerce
As the Internet and smart mobile devices evolve rapidly, online shopping has become
an indispensable part of our lives. It seems to benefit us a lot and make our life
much easier since we can buy things that we want without stepping out of our
homes. However, the great number of items makes it hard for the customer to
make choices, which may discourage customers from buying reliable products,
thus hurting the economy (Duo and Su 2015). In addition, companies rely on the
users’ shopping data to make analyses (Hidayatullah and Anugerah 2018). There-
fore, recommender systems in e-commerce are introduced and needed to increase
customers’ online shopping experience by using complex algorithms to make accu-
rate recommendations to customers and satisfy their requirements (Fu and Leng
2018).
This technique is one of the most popular techniques that are used in e-commerce.
It can be divided into a user-based clustering model and an item-based clustering
model (Zhao and Ji 2013). It focuses on the other users who share similar shopping
behaviors (i.e., buying the same product) with the target user (Ouaftouh et al. 2019).
Similarity calculation is used in this technique to make the recommendation more
persuasive. It is described as Eq. 12.3.
sim(x, y) = cos(x, y) = (x · y) / (||x|| ||y||). (12.3)
Cosine similarity is used to find the similarity between two users x and y (Zhao 2019). Therefore, it can effectively filter out users who do not share any shopping similarities with the target user. The recommender system will then recommend to the target user items that similar users have bought but that he/she has not. This can significantly enhance the accuracy of the recommender system.
• Content-based filtering
datasets. Association rules or frequent item sets can be used to describe the relationship (Zhao 2019). Two steps are needed to derive association rules. Firstly, set a minimum threshold probability for the item sets: say minPro = m; any frequent item set with a probability greater than m can be chosen. Secondly, assume the confidence threshold of the rule is c, and X and Y are two frequent item sets (X ∩ Y = ∅). If P(Y |X ) ≥ c, then X → Y can be classified as an association rule.
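The two steps can be sketched with a tiny Apriori-style pass over example baskets. The transactions, thresholds, and restriction to item sets of size one and two are illustrative simplifications.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_sup):
    """Step 1: keep item sets whose support exceeds min_sup
    (only singletons and pairs, for brevity)."""
    items = {i for t in transactions for i in t}
    freq = {}
    for size in (1, 2):
        for combo in combinations(sorted(items), size):
            sup = sum(set(combo) <= t for t in transactions) / len(transactions)
            if sup > min_sup:
                freq[combo] = sup
    return freq

def rules(transactions, min_sup, min_conf):
    """Step 2: accept X -> Y when confidence P(Y|X) meets min_conf."""
    freq = frequent_itemsets(transactions, min_sup)
    out = []
    for (x, y), sup_xy in ((k, v) for k, v in freq.items() if len(k) == 2):
        for a, b in ((x, y), (y, x)):
            conf = sup_xy / freq[(a,)]   # P(b | a)
            if conf >= min_conf:
                out.append((a, b, round(conf, 2)))
    return out

baskets = [{"milk", "bread"}, {"milk", "bread"}, {"milk"}, {"bread", "eggs"}]
print(rules(baskets, min_sup=0.4, min_conf=0.6))
```

On these baskets both bread → milk and milk → bread qualify, each with confidence about 0.67.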
• Hybrid filtering
The idea of this approach is to maximize the potential of collaborative filtering and
content-based filtering (Fararni et al. 2021), making the recommender system more
robust. Big data systems, deep learning techniques, and social networks can be used
to help make the filtering algorithm and recommender system more robust.
• User clustering models
where user refers to the user profile attribute, and q refers to the corresponding value
of the attribute.
Another similarity can be used to calculate the similarity between users U1 and
U2 in the same clustering group (shown in Eq. 12.5):
similarityInSameGroup(U1 , U2 ) = Σi wi × simi (xi , yi ), (12.5)
where wi refers to the weights of different attributes in the user profile and simi (xi , yi )
is calculated based on a similarity metric (Zhao 2019).
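Eq. 12.5 can be sketched as a weighted sum of per-attribute similarities. The attribute names, weights, and per-attribute metrics below are illustrative assumptions.

```python
def weighted_similarity(u1, u2, weights, sim_fns):
    """Eq. 12.5 sketch: sum of w_i * sim_i(x_i, y_i) over profile attributes."""
    return sum(w * sim_fns[attr](u1[attr], u2[attr])
               for attr, w in weights.items())

exact = lambda a, b: 1.0 if a == b else 0.0              # categorical match
age_sim = lambda a, b: max(0.0, 1.0 - abs(a - b) / 50.0)  # scaled age gap

u1 = {"age": 30, "city": "Pune"}
u2 = {"age": 40, "city": "Pune"}
weights = {"age": 0.4, "city": 0.6}
print(weighted_similarity(u1, u2, weights, {"age": age_sim, "city": exact}))
```

With these choices the result is 0.4 × 0.8 + 0.6 × 1.0 = 0.92.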
• There are some existing gaps in collaborative filtering. For example, some users
would not like to rate or make comments on the items they bought. This may
affect the user profile and similarity value, thus lowering the accuracy of the
recommendation. Another example is that the cold start is still to be solved, which
means that when a new user is shopping on the e-commerce website for the first
time, there will be no recommendation to the user because there is no shopping
history for this user (Zhao 2019).
• Although collaborative filtering is successfully used most of the time, there still
exist some potential situations that need to be considered. For example, customers
may change their preferences over time (Zhao 2019).
• Even though some recommender systems can provide users with relatively accurate recommendations, the quality of the product cannot be guaranteed, since some users are employed by a shop to give high ratings to its products so that more people browse the shop and buy its products.
• If a new user logs in to the shopping app for the first time, we may list a variety of styles of clothes and daily necessities and then let the user choose his/her preferences from these. Also, if something is not listed in the interface, we may give the user a chance to type in what he/she may buy or be interested in. Thus, after the user enters the app, items matching his/her preferences are recommended. This might solve the cold-start problem.
For example, if a user uses Taobao for the first time, he/she should be given a list
of items that he/she might be interested in, such as clothes and shoes. If the items
are not given in the list, the user should also be able to write down the items that he/she would like to buy, such as watches. This is presented in Fig. 12.5.
12.2.4 E-learning
With the advent of more advanced technology, individuals can learn online instead of
offline, making individuals' timetables more flexible. E-learning systems exist to provide students and learners with virtual educational environments in which they do not need others' assistance in the process of learning. Through e-learning,
they can have access to a wide variety of learning resources (Ansari et al. 2016).
However, due to information overload, many learners are experiencing challenges in
retrieving relevant and useful learning resources that meet their needs. It seems that
the core component of a working and efficient e-learning system is its recommender
system (Ansari et al. 2016). The recommender system in an e-learning context tries
to intelligently recommend learning resources to a learner based on the task already
done by the learner (Nath 2018).
Fig. 12.6 Collaborative filtering in recommender system in e-learning (Isinkaye et al. 2015)
This technique is based on a comparison of the content of the learning object and
a learner profile. The two classes of content-based recommendation are case-based
reasoning techniques and attribute-based techniques. A case-based reasoning technique recommends learning objects that have the highest correlation with objects the learner liked in the past. This technique does not require a content analysis. The quality of the recommendation rises as the learner rates more learning objects. The new-learner problem also affects case-based reasoning techniques. The limitation of this technique is overspecialization, because it recommends only the learning objects that are highly correlated with the learner profile or interest. In
of their attributes to the learner profile. Attributes could be weighted for their rele-
vance to the learner. This technique is sensitive to changes in the learner profile (Nath
2018).
• Demographic filtering
This technique classifies the learners based on their personal attributes and the recom-
mendations are based on the demographic classes. This approach assumes that all
learners belonging to a certain demographic class have alike interests or preferences.
It uses demographic data about the learner and their point of view for the recom-
mended learning objects. It forms people-to-people correlations like collaborative
ones. But they use different data, such as the same age group. The benefit of this
approach is that it is independent of learner rating history (Nath 2018).
• Context-aware systems
(a) Traditional recommender systems deal with two types of entities: users and items. A context-aware recommender system includes additional information about the learner's context, such as available time, location, and people nearby.
These data can be used to change recommendations based on the individual
learner’s characteristics. Context is information that can be used to clas-
sify the situation of an entity. An entity is an object, person, or place that
can be considered relevant to the interaction between an application and a
user. The context data consists of different attributes, like physical location,
date, season, emotional state, physiological state, personal history, etc. For
example, a website may recommend songs to a user by asking about the
current mood of the user. This system automatically uses context data to run
the system that is suitable for a specific time, place, or event. It is neces-
sary to combine the context data into the recommender systems to recom-
mend learning objects to the learners under some circumstances. It covers the
understanding of the learner’s objective with objects that learners might find
interesting by knowing the wide area of contextual attributes (Nath 2018).
• When a new learner enters the system with no prior rating in the rating table, it is difficult to predict a learning object for the new learner, because calculating the similarity that determines the neighbors requires the learner's rating history.
• A cold-start problem for a new learning object occurs when not enough previous ratings related to that learning object exist (Nath 2018).
• Data sparsity occurs when the number of learners who have rated a learning object is too small compared to the number of available learning objects. If there is no overlap in ratings with the target learner, it is difficult to generate appropriate recommendations (Nath 2018).
• Overspecialization is a major problem faced by the content-based recommender system. Learners are recommended learning objects that they are already familiar with, which prevents them from finding new learning objects and other alternatives. Additional techniques must be added to the system to make suggestions outside the scope of the learner's interests; by integrating such methods, the learner will be provided with a wide range of different options (Nath 2018).
• In the context of a demographic recommender, privacy is considered a major issue. To provide a more accurate recommendation to the learner, the learner's most sensitive data must be acquired, including demographic information and the location of the specific learner, which may violate the learner's privacy (Nath 2018).
Collect user feedback frequently to see if there are changes in the user’s needs and
then changes need to be made accordingly. For example, if a user does not like
fashionable clothes anymore, then the app should reduce the recommendation of
fashionable clothes.
Nowadays, many people use social media to communicate with others, share their
interests, and obtain information. Recommender systems are surely the applications that can take the most immediate and evident advantage, leveraging in different ways user experiences and interactions within a social community to suggest multimedia objects of interest (Amato et al. 2017).
Nowadays, recommender applications and services have been introduced to
support effectively and efficiently the intelligent browsing of items’ collections,
assisting users to find “what they need” within this ocean of information and thus
realizing the well-known transition in the web from the “search” to the “discovery”
paradigm (Amato et al. 2017).
Just as a real example, each minute thousands of tweets are sent on Twitter, several
hundreds of hours of videos are uploaded to YouTube, and a huge quantity of photos
are shared on Instagram or uploaded to Flickr.
In the case of image-sharing social media, some recommender systems were developed for Flickr, aiming to perform personalized POI recommendations based on the target user's images. The author topic-based collaborative filtering (ATCF)
method is proposed to enable POI recommendations when the target user visits
a new place by discovering topics from the images’ metadata. Besides, a Visual-
enhanced Probabilistic Matrix Factorization model (VPMF) was proposed, which
adds visual features of the images into the collaborative filtering model. Some recom-
mender systems have been developed based on Instagram. One of them utilizes
an external knowledge base to build relationships between hashtags and perform
picture recommendations based on the correlations. Another method is developed
to discover topical authorities related to the target user by inferring topical inter-
ests from the user’s biography, propagating interests over the follower graph, and
assigning topics to authorities. Also, CNN-based methods have been proposed to
extract the visual features from the images and perform visual content-enhanced
POI recommendations.
The model-based RS requires a learning phase in advance for finding out the optimal
model parameters before making a recommendation. Once the learning phase is
finished, the model-based RS can predict the ratings of users very quickly. Among
them, the latent factor model (LFM) is very competitive and widely adopted to imple-
ment RS, which factorizes the user-item rating matrix into two low-rank matrices:
the user feature and item feature matrices. It can alleviate data sparsity using dimen-
sionality reduction techniques and usually produce more accurate recommenda-
tions than the memory-based CF approach, while drastically decreasing the memory
requirement and computation complexity.
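A toy sketch of the latent factor model (LFM) described above: it factorizes the user-item rating matrix into user-feature and item-feature matrices by stochastic gradient descent on the observed entries only. The rating matrix, rank k, learning rate, and regularization constant are illustrative choices, not values from the text.

```python
import random

def factorize(R, k=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Learn user features P (n x k) and item features Q (m x k)
    so that P[i] . Q[j] approximates each observed rating R[i][j]."""
    rnd = random.Random(seed)
    n, m = len(R), len(R[0])
    P = [[rnd.uniform(0, 0.1) for _ in range(k)] for _ in range(n)]
    Q = [[rnd.uniform(0, 0.1) for _ in range(k)] for _ in range(m)]
    for _ in range(steps):
        for i in range(n):
            for j in range(m):
                if R[i][j] == 0:          # 0 marks a missing rating
                    continue
                err = R[i][j] - sum(P[i][f] * Q[j][f] for f in range(k))
                for f in range(k):        # regularized SGD update
                    P[i][f] += lr * (err * Q[j][f] - reg * P[i][f])
                    Q[j][f] += lr * (err * P[i][f] - reg * Q[j][f])
    return P, Q

R = [[5, 3, 0], [4, 0, 1], [1, 1, 5]]
P, Q = factorize(R)
pred = sum(P[0][f] * Q[2][f] for f in range(2))  # fills the missing R[0][2]
```

Once trained, predicting any missing rating is a single dot product, which is why the model-based approach answers queries quickly after the learning phase.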
Memory-based and trust-aware collaborative filtering—trust relationships
between users have been introduced into RS as an effective approach to overcome
the problems of data sparsity and cold start (Chen et al. 2018). The hybrid approach
builds an active user’s trust network using trust statements between the users to
improve the accuracy of similarities between users. One of the core roles of the
trusted network is to resolve the neighbor selection between a user’s trust statements
and its similarity values.
• Content-based filtering
This is based on the user preferences and similarity analysis between items and then
recommending to the target users.
Content-based filtering offers support for message filtering. Specifically, users interact with the system via a GUI to set up and manage their filtering rules and blacklists (FRs/BLs) (Thilagavathi and Taarika 2014). Machine learning-based text categorization techniques are used to automatically assign each short text message a set of categories based on its content. A short-text classifier is built to accurately extract a set of discriminating features from the message, and a neural learning model is employed for efficient text classification. In addition, the neural model is embedded within a hierarchical two-level classification: short messages are first classified as neutral or non-neutral, and are then classified according to their appropriateness to each of the considered categories (Thilagavathi and Taarika 2014).
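The two-level scheme can be sketched as follows. Keyword rules stand in for the trained neural classifiers, and the category keywords are invented for illustration.

```python
# Level-2 category keywords (illustrative stand-in for a learned model)
LEVEL2 = {"violence": {"fight", "attack"}, "hate": {"hate"}}

def classify(message, blacklist_categories):
    """Level 1: tag the message neutral or non-neutral.
    Level 2: map non-neutral messages to blocked categories."""
    words = set(message.lower().split())
    cats = {c for c, kws in LEVEL2.items() if words & kws}
    if not cats:
        return "neutral", set()
    return "non-neutral", cats & blacklist_categories

label, blocked = classify("I hate this", blacklist_categories={"hate"})
# label == "non-neutral", blocked == {"hate"}
```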
• Hybrid filtering
The idea of this approach is to maximize the potential of collaborative filtering and
content-based filtering (Fararni et al. 2021), making the recommender system more
robust. Big data systems, deep learning techniques, and social networks can be used
to help make the filtering algorithm and recommender system more robust.
• Context-aware recommendation
• For an RS, recommending only popular and highly rated items to the active user often yields good accuracy. However, the user can easily obtain such item information from other sources, so the actual value of such recommendations is not high. Therefore, a good RS should be able to discover items that are difficult for users to find spontaneously but that still fit their interests (Chen et al. 2018).
• Recommending items to users based solely on accuracy not only wastes resources but also brings little benefit. If the system cannot explain the recommended results well, users cannot determine whether the recommended items meet their needs, which reduces the system's reliability. If the RS can provide some explanatory information when generating recommendations, the reliability of the recommended results may be greatly improved, and the recommendations will more readily attract users' attention (Chen et al. 2018).
• Update the user profile frequently to keep track of the user's latest preferences, collect feedback from the user on using the social media platform, and then adjust the recommendations to better satisfy the user's needs.
12.3 Summary
Background information on the recommender system is given in this chapter, and six applications are described (health care, security, tourism, e-commerce, e-learning, and social network). Within each application, some basic techniques used in the corresponding recommender system are covered, among which collaborative filtering and content-based filtering are the most popular. The figures, charts, and mathematical equations in this chapter may help the reader better understand the recommender system. There are also some unavoidable problems, such as cold start, that need to be tackled; possible ideas are provided to mitigate these problems as much as possible and to maximize the potential of the recommender system.
Think Tank
References
Allen RB (2019) User models: theory, method practice. Int J Man-Mach Stud 32:511–543
Amato F, Moscato V, Picariello A, Sperlí G (2017) Recommendation in social media networks. In:
IEEE third international conference on multimedia big data (BigMM), 213–216
Ansari MH, Moradi M, NikRah O, Kambakhsh KM (2016) CodERS: a hybrid recommender system
for an E-learning system. In: 2nd international conference of signal processing and intelligent
systems (ICSPIS), 1–5
Bayati M, Harounabadi A, Akbari D (2022) Developing a location-based recommender system
using collaborative filtering technique in the tourism industry. Tehnički glas 16(1)
Chen R, Hua Q, Zhang L, Kong X (2018) A survey of collaborative filtering-based recommender
systems: from traditional methods to hybrid methods based on social networks. IEEE Access 6
Cordero P, Enciso M, López D et al (2020) A conversational recommender system for diagnosis using fuzzy rules. Expert Syst Appl
Das D, Sahoo L, Datta S (2016) A survey on recommender system. (IJCSIS) Int J Comput Sci Inf
Secur 14(5)
Duo L, Su JT (2015) A recommender system based on contextual information of click and purchase
data to items for e-commerce. In: Third international conference on cyberspace technology
(CCT 2015)
Fararni KA, Nafis F, Aghoutane B et al (2021) Hybrid recommender system for tourism based on
big data and AI: a conceptual framework. Big Data Min Analyt 4(1):47–55
Felix G, Falko T, Jochen S et al (2020) A pharmaceutical therapy recommender system enabling
shared decision-making. User Model User-Adapted Interact
Fu CJ, Leng ZH (2018) A framework for recommender systems in E-commerce based on distributed
storage and data mining. Int Conf E-Bus E-Govern
Haider MH, Al-Azawei A, Al-A’araji N (2019) Developing a healthcare recommender system
using an enhanced symptoms-based collaborative filtering technique. J Comput Theor Nanosci
16:920–926
Hidayatullah A, Anugerah MA (2018) A recommender system for E-commerce using multi-
objective ranked bandits algorithm. In: International conference on computing, engineering,
and design (ICCED), 170–174
Isinkaye FO, Folajimi YO, Ojokuh BA (2015) Recommender systems: principles, methods and
evaluation. Egypt Inform J 16:261–273
Kamran M, Javed A (2015) A survey of recommender systems and their application in healthcare.
Techn J 20(4). https://prdb.pk/article/a-survey-of-recommender-systems-and-their-application-
in-hea-7462
Katarya R, Verma O (2017) Privacy-preserving and secure recommender system enhance with
K-NN and social tagging, 52–57. https://doi.org/10.1109/CSCloud.2017.24
Kunaver M, Požrl T (2017) Diversity in recommender systems—a survey. Knowl-Based Syst
123:154–162
Lu J et al (2015) Recommender system application developments: a survey. Decis Support Syst
74:12–32
Lu J, Zhang Q, Zhang GQ (2020) Recommender systems in intelligent information system, 6
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
P. Kar et al., Recommender Systems: Algorithms and their Applications, Transactions on Computer Systems and Networks, https://doi.org/10.1007/978-981-97-0538-2