Recommendation System
Cairo University
REVIEW
a
Department of Mathematical Science, Ekiti State University, Ado Ekiti, Nigeria
b
Department of Computer Science, University of Ibadan, Ibadan, Nigeria
c
Department of Computer Science, Federal University of Technology, Akure, Nigeria
KEYWORDS
Collaborative filtering; Content-based filtering; Hybrid filtering technique; Recommendation systems; Evaluation

Abstract
On the Internet, where the number of choices is overwhelming, there is need to filter, prioritize and efficiently deliver relevant information in order to alleviate the problem of information overload, which has created a potential problem to many Internet users. Recommender systems solve this problem by searching through large volume of dynamically generated information to provide users with personalized content and services. This paper explores the different characteristics and potentials of different prediction techniques in recommendation systems in order to serve as a compass for research and practice in the field of recommendation systems.
© 2015 Production and hosting by Elsevier B.V. on behalf of Faculty of Computers and Information, Cairo University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Contents
1. Introduction
2. Related work
3. Phases of recommendation process
   3.1. Information collection phase
      3.1.1. Explicit feedback
      3.1.2. Implicit feedback
      3.1.3. Hybrid feedback
   3.2. Learning phase
* Corresponding author.
E-mail address: [email protected] (F.O. Isinkaye).
Peer review under responsibility of Faculty of Computers and
Information, Cairo University.
http://dx.doi.org/10.1016/j.eij.2015.06.005
1110-8665 2015 Production and hosting by Elsevier B.V. on behalf of Faculty of Computers and Information, Cairo University.
262 F.O. Isinkaye et al.
other systems that use content-based filtering to help users find information on the Internet include Letizia [16]. The system makes use of a user interface that assists users in browsing the Internet; it is able to track the browsing pattern of a user to predict the pages that they may be interested in. Pazzani et al. [17] designed an intelligent agent that attempts to predict which web pages will interest a user by using a naive Bayesian classifier. The agent allows a user to provide training instances by rating different pages as either hot or cold. Jennings and Higuchi [18] describe a neural network that models the interests of a user in a Usenet news environment.

Despite the success of these two filtering techniques, several limitations have been identified. Some of the problems associated with content-based filtering techniques are limited content analysis, overspecialization and sparsity of data [12]. Also, collaborative approaches exhibit cold-start, sparsity and scalability problems. These problems usually reduce the quality of recommendations. In order to mitigate some of the problems identified, hybrid filtering, which combines two or more filtering techniques in different ways in order to increase the accuracy and performance of recommender systems, has been proposed [19,20]. These techniques combine two or more filtering approaches in order to harness their strengths while leveling out their corresponding weaknesses [21]. They can be classified based on their operations into weighted hybrid, mixed hybrid, switching hybrid, feature-combination hybrid, cascade hybrid, feature-augmented hybrid and meta-level hybrid [22]. Collaborative filtering and content-based filtering approaches are widely combined today, either by implementing content-based and collaborative techniques separately and later combining the results of their predictions, or by adding the characteristics of content-based filtering to collaborative filtering and vice versa. Finally, a general unified model which incorporates both content-based and collaborative filtering properties could be developed [12].

The problem of sparsity of data and cold-start was addressed by combining the ratings, features and demographic information about items in a cascade hybrid recommendation technique in [23]. In Ziegler et al. [24], a hybrid collaborative filtering approach was proposed to exploit bulk taxonomic information designed for exacting product classification to address the data sparsity problem of CF recommendations, based on the generation of profiles via inference of super-topic score and topic diversification. A hybrid recommendation technique is also proposed in Ghazantar and Pragel-Benett [23], and this uses the content-based profile of an individual user to find similar users, which are used to make predictions. In Sarwar et al. [25], collaborative filtering was combined with an information filtering agent. Here, the authors proposed a framework for integrating the content-based filtering agents and collaborative filtering. A hybrid recommender algorithm is employed by many applications as a result of the new user problem of content-based filtering techniques and the average user problem of collaborative filtering [26]. A simple and straightforward method for combining content-based and collaborative filtering was proposed by Cunningham et al. [27]. A music recommendation system which combined tagging information, play counts and social relations was proposed in Konstas et al. [28]. In order to determine the number of neighbors that can be automatically connected on a social platform, Lee and Brusilovsky [29] [...] ratings, user and item features in a single unified framework was proposed by Condiff et al. [30].

3. Phases of recommendation process

3.1. Information collection phase

This phase collects relevant information about users to generate a user profile or model for the prediction tasks, including the user's attributes, behaviors or the content of the resources the user accesses. A recommendation agent cannot function accurately until the user profile/model has been well constructed. The system needs to know as much as possible from the user in order to provide reasonable recommendations right from the onset. Recommender systems rely on different types of input, such as the most convenient high quality explicit feedback, which includes explicit input by users regarding their interest in an item, or implicit feedback, obtained by inferring user preferences indirectly through observing user behavior [31]. Hybrid feedback can also be obtained through the combination of both explicit and implicit feedback. In an E-learning platform, a user profile is a collection of personal information associated with a specific user. This information includes cognitive skills, intellectual abilities, learning styles, interest, preferences and interaction with the system. The user profile is normally used to retrieve the needed information to build up a model of the user. Thus, a user profile describes a simple user model. The success of any recommendation system depends largely on its ability to represent the user's current interests. Accurate models are indispensable for obtaining relevant and accurate recommendations from any prediction technique.

3.1.1. Explicit feedback

The system normally prompts the user through the system interface to provide ratings for items in order to construct and improve his model. The accuracy of recommendation depends on the quantity of ratings provided by the user. The only shortcoming of this method is that it requires effort from the users, and users are not always ready to supply enough information. Despite the fact that explicit feedback requires more effort from the user, it is still seen as providing more reliable data, since it does not involve extracting preferences from actions, and it also provides transparency into the recommendation process that results in a slightly higher perceived recommendation quality and more confidence in the recommendations [32].

[Figure: phases of the recommendation process: information collection phase, learning phase and prediction/recommendation phase, linked by feedback.]
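The explicit, implicit and hybrid feedback described in Section 3.1 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the paper: the `UserProfile` class, the 1 to 5 rating scale, the use of view counts as the observed behavior, and the linear mapping from view counts to a score.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class UserProfile:
    """Hypothetical user profile that stores both kinds of feedback."""
    explicit: Dict[str, float] = field(default_factory=dict)  # item -> rating (1 to 5)
    implicit: Dict[str, int] = field(default_factory=dict)    # item -> observed views

    def rate(self, item: str, rating: float) -> None:
        """Explicit feedback: the user states a rating through the interface."""
        if not 1.0 <= rating <= 5.0:
            raise ValueError("rating must be between 1 and 5")
        self.explicit[item] = rating

    def observe_view(self, item: str) -> None:
        """Implicit feedback: the system observes the user viewing an item."""
        self.implicit[item] = self.implicit.get(item, 0) + 1

    def preference(self, item: str, views_for_top_score: int = 5) -> Optional[float]:
        """Hybrid feedback: use the explicit rating if present, else infer one."""
        if item in self.explicit:
            return self.explicit[item]
        views = self.implicit.get(item, 0)
        if views == 0:
            return None  # nothing is known about this item yet
        # Assumed mapping: scale capped view counts linearly onto the 1-5 range.
        capped = min(views, views_for_top_score)
        return 1.0 + 4.0 * capped / views_for_top_score
```

In this sketch an explicit rating always overrides the inferred score, which reflects the text's remark that explicit feedback is seen as the more reliable source; a real system might weight the two differently.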
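[Figure: recommendation techniques. Recoverable labels: recommender system; clustering techniques, association techniques, Bayesian networks, neural networks; user-based, item-based.]

[Figure: collaborative filtering process. Recoverable labels: User1, Useri, Userm; CF model; prediction; recommendation.]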
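The two outputs named in the collaborative filtering figure, a prediction for a single user-item pair and a top-N recommendation list, can be illustrated with a minimal sketch. The model used here is a deliberately trivial stand-in chosen for illustration (an item is scored by the mean rating that other users gave it); it is an assumption of this sketch and not a method taken from the paper. The function names and the toy ratings are likewise hypothetical.

```python
from typing import Dict, List, Optional

Ratings = Dict[str, Dict[str, float]]  # user -> {item -> rating}


def predict(ratings: Ratings, user: str, item: str) -> Optional[float]:
    """Prediction: one numerical score for a (user, item) pair.

    Stand-in model: the mean rating given to the item by the other users.
    """
    others = [r[item] for u, r in ratings.items() if u != user and item in r]
    if not others:
        return None  # no other user has rated this item
    return sum(others) / len(others)


def recommend(ratings: Ratings, user: str, n: int) -> List[str]:
    """Recommendation: the top-N items the user has not rated yet."""
    seen = set(ratings.get(user, {}))
    all_items = {item for r in ratings.values() for item in r}
    scored = []
    for item in all_items - seen:
        score = predict(ratings, user, item)
        if score is not None:
            scored.append((score, item))
    # Highest predicted score first; item name breaks ties deterministically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in scored[:n]]


ratings: Ratings = {
    "User1": {"A": 5.0, "B": 3.0, "D": 1.0},
    "Useri": {"A": 4.0, "D": 1.0},
    "Userm": {"B": 1.0, "C": 5.0, "D": 4.0},
}
```

With these toy ratings, `predict(ratings, "Useri", "B")` averages the ratings 3 and 1 given by the other two users, and `recommend(ratings, "Useri", 2)` ranks the two items that Useri has not yet rated.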
4.1.1. Pros and Cons of content-based filtering techniques

CB filtering techniques overcome the challenges of CF. They have the ability to recommend new items even if there are no ratings provided by users. So even if the database does not contain user preferences, recommendation accuracy is not affected. Also, if the user preferences change, they have the capacity to adjust their recommendations in a short span of time. They can manage situations where different users do not share the same items, but only identical items according to their intrinsic features. Users can get recommendations without sharing their profile, and this ensures privacy [39]. CBF techniques can also provide explanations on how recommendations are generated to users. However, the techniques suffer from various problems as discussed in the literature [12]. Content-based filtering techniques are dependent on items' metadata. That is, they require a rich description of items and a very well organized user profile before recommendations can be made to users. This is called limited content analysis. So, the effectiveness of CBF depends on the availability of descriptive data. Content overspecialization [40] is another serious problem of the CBF technique. Users are restricted to getting recommendations similar to items already defined in their profiles.

4.1.2. Examples of content-based filtering systems

News Dude [41] is a personal news system that utilizes synthesized speech to read news stories to users. A TF-IDF model is used to describe news stories in order to determine the short-term recommendations, which are then compared with the Cosine Similarity Measure and finally supplied to a learning algorithm (NN). CiteSeer is an automatic citation indexing system that uses various heuristics and machine learning algorithms to process documents. Today, CiteSeer is among the largest and most widely used research paper repositories on the web.

LIBRA [42] is a content-based book recommendation system that uses information about books gathered from the Web. It implements a Naïve Bayes classifier on the information extracted from the web to learn a user profile and to produce a ranked list of titles based on training examples supplied by an individual user. The system is able to provide explanations on any recommendations made to users by listing the features that contribute to the highest ratings, hence allowing the users to have total confidence in the recommendations provided to users by the system.

4.2. Collaborative filtering

Collaborative filtering is a domain-independent prediction technique for content that cannot easily and adequately be described by metadata, such as movies and music. The collaborative filtering technique works by building a database (user-item matrix) of preferences for items by users. It then matches users with relevant interests and preferences by calculating similarities between their profiles to make recommendations [43]. Such users build a group called a neighborhood. A user gets recommendations for those items that he has not rated before but that were already positively rated by users in his neighborhood. The output produced by CF can be either a prediction or a recommendation. Prediction is a numerical value, $R_{ij}$, expressing the predicted score of item j for the user i, while Recommendation is a list of the top N items that the user will like the most, as shown in Fig. 3. The technique of collaborative filtering can be divided into two categories: memory-based and model-based [35,44].

4.2.1. Memory based techniques

The items that were already rated by the user before play a relevant role in searching for a neighbor that shares appreciation with him [45,46]. Once a neighbor of a user is found, different algorithms can be used to combine the preferences of neighbors to generate recommendations. Due to the effectiveness of these techniques, they have achieved widespread success in real life applications. Memory-based CF can be achieved in two ways, through user-based and item-based techniques. The user-based collaborative filtering technique calculates similarity between users by comparing their ratings on the same item, and it then computes the predicted rating for an item by the active user as a weighted average of the ratings of the item by users similar to the active user, where the weights are the similarities of these users with the active user. Item-based filtering techniques compute predictions using the similarity between
items and not the similarity between users. It builds a model of item similarities by retrieving all items rated by an active user from the user-item matrix; it determines how similar the retrieved items are to the target item, then it selects the k most similar items, and their corresponding similarities are also determined. Prediction is made by taking a weighted average of the active user's ratings on the k similar items. Several types of similarity measures are used to compute similarity between items or users. The two most popular similarity measures are correlation-based and cosine-based. The Pearson correlation coefficient is used to measure the extent to which two variables linearly relate with each other and is defined as [47,48]

$$s(a,u) = \frac{\sum_{i=1}^{n} (r_{a,i} - \bar{r}_a)(r_{u,i} - \bar{r}_u)}{\sqrt{\sum_{i=1}^{n} (r_{a,i} - \bar{r}_a)^2}\,\sqrt{\sum_{i=1}^{n} (r_{u,i} - \bar{r}_u)^2}} \tag{1}$$

From the above equation, $s(a,u)$ denotes the similarity between two users a and u, $r_{a,i}$ is the rating given to item i by user a, $\bar{r}_a$ is the mean rating given by user a, while n is the total number of items in the user-item space. Also, prediction for an item is made from the weighted combination of the selected neighbors' ratings, which is computed as the weighted deviation from the neighbors' mean. The general prediction formula is

$$p(a,i) = \bar{r}_a + \frac{\sum_{u} (r_{u,i} - \bar{r}_u)\, s(a,u)}{\sum_{u} s(a,u)} \tag{2}$$

where the sums run over the selected neighbors u of user a.

Cosine similarity is different from the Pearson-based measure in that it is a vector-space model which is based on linear algebra rather than a statistical approach. It measures the similarity between two n-dimensional vectors based on the angle between them. The cosine-based measure is widely used in the fields of information retrieval and text mining to compare two text documents; in this case, documents are represented as vectors of terms. The similarity between two items u and v can be defined as follows [12,43,48]:

$$s(\vec{u},\vec{v}) = \frac{\vec{u}\cdot\vec{v}}{|\vec{u}|\,|\vec{v}|} = \frac{\sum_{i} r_{u,i}\, r_{v,i}}{\sqrt{\sum_{i} r_{u,i}^{2}}\,\sqrt{\sum_{i} r_{v,i}^{2}}} \tag{3}$$

A similarity measure is also referred to as a similarity metric; these are methods used to calculate the scores that express how similar users or items are to each other. These scores can then be used as the foundation of user- or item-based recommendation generation. Depending on the context of use, similarity metrics can also be referred to as correlation metrics or distance metrics [12].

4.2.2. Model-based techniques

This technique employs the previous ratings to learn a model in order to improve the performance of the collaborative filtering technique. The model building process can be done using machine learning or data mining techniques. These techniques can quickly recommend a set of items because they use a pre-computed model, and they have proved to produce recommendation results that are similar to neighborhood-based recommender techniques. Examples of these techniques include Dimensionality Reduction techniques such as Singular Value Decomposition (SVD), Matrix Completion Technique, Latent Semantic methods, and Regression and Clustering. Model-based techniques analyze the user-item matrix to identify relations between items; they use these relations to compare the list of top-N recommendations. Model-based techniques resolve the sparsity problems associated with recommendation systems.

The use of learning algorithms has also changed the manner of recommendations from recommending what to consume by users to recommending when to actually consume a product. It is therefore very important to examine other learning algorithms used in model-based recommender systems:

Association rule: Association rule mining algorithms [49] extract rules that predict the occurrence of an item based on the presence of other items in a transaction. For instance, given a set of transactions, where each transaction is a set of items, an association rule takes the form A ⇒ B, where A and B are two sets of items [50]. Association rules can form a very compact representation of preference data that may improve efficiency of storage as well as performance. Also, the effectiveness of association rules for uncovering patterns and driving personalized marketing decisions has been known for some time [2]. However, although there is a clear relation between this method and the goal of a recommendation system, association rules have not become mainstream.

Clustering: Clustering techniques have been applied in different domains such as pattern recognition, image processing, statistical data analysis and knowledge discovery [51]. A clustering algorithm tries to partition a set of data into a set of sub-clusters in order to discover meaningful groups that exist within them [52]. Once clusters have been formed, the opinions of other users in a cluster can be averaged and used to make recommendations for individual users. A good clustering method will produce high quality clusters in which the intra-cluster similarity is high, while the inter-cluster similarity is low. In some clustering approaches, a user can have partial participation in different clusters, and recommendations are then based on the average across the clusters of participation, weighted by degree of participation [53]. K-means and Self-Organizing Map (SOM) are the most commonly used among the different clustering methods. K-means takes an input parameter, K, and then partitions a set of n items into K clusters [54]. The Self-Organizing Map (SOM) is a method for unsupervised learning, based on an artificial neuron clustering technique [55]. Clustering techniques can be used to reduce the candidate set in collaborative-based algorithms.

Decision tree: A decision tree is based on the methodology of tree graphs and is constructed by analyzing a set of training examples for which the class labels are known. It is then applied to classify previously unseen examples. If trained on very high quality data, decision trees have the ability to make very accurate predictions [56]. Decision trees are more interpretable than other classifiers such as Support Vector Machine (SVM) and Neural Networks because they combine simple questions about data in an understandable manner. Decision trees are also flexible in handling items with a mixture of real-valued and categorical features as well as items that have some specific missing features.

Artificial Neural network: ANN is a structure of many connected neurons (nodes) which are arranged in layers in systematic ways. The connections between neurons have weights associated with them depending on the amount of
influence one neuron has on another. There are some advantages in using neural networks in some special problem situations. For example, because it contains many neurons and assigns a weight to each connection, an artificial neural network is quite robust with respect to noisy and erroneous data sets [57]. ANN has the ability to estimate nonlinear functions and capture complex relationships in data sets; it can also be efficient and even operate if part of the network fails. The major disadvantage is that it is hard to come up with the ideal network topology for a given problem, and once the topology is decided this will act as a lower bound for the classification error.

Link analysis: Link analysis is the process of building up networks of interconnected objects in order to explore patterns and trends [58]. It has presented great potential in improving the accomplishment of web search. Link analysis consists of the PageRank and HITS algorithms. Most link analysis algorithms handle a web page as a single node in the web graph [59].

Regression: Regression analysis is used when two or more variables are thought to be systematically connected by a linear relationship. It is a powerful and diverse process for analyzing associative relationships between a dependent variable and one or more independent variables. Uses of regression include curve fitting, prediction, and testing systematic hypotheses about relationships between variables. The curve can be useful to identify a trend within a dataset, whether it is linear, parabolic, or of some other form.

Bayesian Classifiers: These are a probabilistic framework for solving classification problems, based on the definition of conditional probability and Bayes' theorem. Bayesian classifiers [36] consider each attribute and class label as random variables. Given a record of N features $(A_1, A_2, \ldots, A_N)$, the goal of the classifier is to predict class $C_k$ by finding the value of $C_k$ that maximizes the posterior probability of the class given the data, $P(C_k \mid A_1, A_2, \ldots, A_N)$, by applying Bayes' theorem, $P(C_k \mid A_1, A_2, \ldots, A_N) \propto P(A_1, A_2, \ldots, A_N \mid C_k)\,P(C_k)$. The most commonly used Bayesian classifier is known as the Naive Bayes Classifier. In order to estimate the conditional probability, $P(A_1, A_2, \ldots, A_N \mid C_k)$, a Naive Bayes Classifier assumes the probabilistic independence of the attributes, that is, the presence or absence of a particular attribute is unrelated to the presence or absence of any other. This assumption leads to $P(A_1, A_2, \ldots, A_N \mid C_k) = P(A_1 \mid C_k)\,P(A_2 \mid C_k) \cdots P(A_N \mid C_k)$. The main benefits of Naive Bayes classifiers are that they are robust to isolated noise points and irrelevant attributes, and they handle missing values by ignoring the instance during probability estimate calculations. However, the independence assumption may not hold for some attributes as they might be correlated. In this case, the usual approach is to use Bayesian Networks. Bayesian classifiers may prove practical for environments in which knowledge of user preferences changes slowly with respect to the time needed to build the model, but are not suitable for environments in which user preference models must be updated rapidly or frequently. It is also successful in model-based recommendation systems because it is often used to derive a model for content-based recommendation systems.

Matrix completion techniques: The essence of the matrix completion technique is to predict the unknown values within the user-item matrices. Correlation based K-nearest neighbor is one of the major techniques employed in collaborative filtering recommendation systems [60]. They depend largely on the historical rating data of users on items. Most of the time, the rating matrix is very big and sparse due to the fact that users do not rate most of the items represented within the matrix [61]. This problem always leads to the inability of the system to give reliable and accurate recommendations to users. Different variations of low rank models have been used in practice for matrix completion, especially toward application in collaborative filtering [62]. Formally, the task of the matrix completion technique is to estimate the entries of a matrix, $M \in \mathbb{R}^{m \times n}$, when only a subset, $\Omega \subset \{(i,j) : 1 \le i \le m,\ 1 \le j \le n\}$, of the entries is observed, using a particular set of low rank matrices, $\hat{M} = UV^{T}$, where $U \in \mathbb{R}^{m \times k}$, $V \in \mathbb{R}^{n \times k}$ and $k \ll \min(m,n)$. The most widely used algorithm in practice for recovering M from a partially observed matrix using the low rank assumption is Alternating Least Square (ALS) minimization, which involves optimizing over U and V in an alternating manner to minimize the square error over observed entries while keeping the other factors fixed. Candes and Recht [63] proposed the use of the matrix completion technique in the Netflix problem as a practical example for the utilization of the technique. Keshavan et al. [64] used the SVD technique in an OptSpace algorithm to deal with the matrix completion problem. The result of their experiment showed that SVD is able to provide a reliable initial estimate for the spanning subspace, which can be further refined by gradient descent on a Grassmannian manifold. Model-based techniques solve the sparsity problem. The major drawback of the techniques is that the model building process is computationally expensive and the capacity of memory usage is highly intensive. Also, they do not alleviate the cold-start problem.

4.2.3. Pros and Cons of collaborative filtering techniques

Collaborative Filtering has some major advantages over CBF in that it can perform in domains where there is not much content associated with items and where content is difficult for a computer system to analyze (such as opinions and ideas). Also, the CF technique has the ability to provide serendipitous recommendations, which means that it can recommend items that are relevant to the user even without the content being in the user's profile [65]. Despite the success of CF techniques, their widespread use has revealed some potential problems such as follows.

4.2.3.1. Cold-start problem. This refers to a situation where a recommender does not have adequate information about a user or an item in order to make relevant predictions [66]. This is one of the major problems that reduce the performance of a recommendation system. The profile of such a new user or item will be empty since he has not rated any item; hence, his taste is not known to the system.

4.2.3.2. Data sparsity problem. This is the problem that occurs as a result of lack of enough information, that is, when only a few of the total number of items available in a database are rated by users [34,67]. This always leads to a sparse user-item matrix, inability to locate successful neighbors and finally, the generation of weak recommendations. Also, data sparsity
always leads to coverage problems, where coverage is the percentage of items in the system that recommendations can be made for [68].

4.2.3.3. Scalability. This is another problem associated with recommendation algorithms because computation normally grows linearly with the number of users and items [67]. A recommendation technique that is efficient when the dataset is limited may be unable to generate a satisfactory number of recommendations when the volume of the dataset is increased. Thus, it is crucial to apply recommendation techniques which are capable of scaling up in a successful manner as the amount of data in a database increases. Methods used for solving the scalability problem and speeding up recommendation generation are based on Dimensionality reduction techniques, such as the Singular Value Decomposition (SVD) method, which has the ability to produce reliable and efficient recommendations.

4.2.3.4. Synonymy. Synonymy is the tendency of very similar items to have different names or entries. Most recommender systems find it difficult to make a distinction between closely related items, such as the difference between baby wear and baby cloth. Collaborative Filtering systems usually find no match between the two terms to be able to compute their similarity. Different methods, such as automatic term expansion, the construction of a thesaurus, and Singular Value Decomposition (SVD), especially Latent Semantic Indexing, are capable of solving the synonymy problem. The shortcoming of these methods is that some added terms may have different meanings from what is intended, which sometimes leads to rapid degradation of recommendation performance.

4.2.4. Examples of collaborative systems

Ringo [69] is a user-based CF system which makes recommendations of music albums and artists. In Ringo, when a user initially enters the system, a list of 125 artists is given to the user to rate according to how much he likes listening to them. The list is made up of two different sections. The first section consists of the most often rated artists, and this affords the active user the opportunity to rate artists which others have equally rated, so that there is a level of similarity between different users' profiles. The second section is generated upon a random selection of items from the entire user-item matrix, so that all artists and albums are eventually rated at some point in the initial rating phases.

GroupLens [70] is a CF system that is based on a client/server architecture; the system recommends Usenet news, which is a high volume discussion list service on the Internet. The short lifetime of Netnews and the underlying sparsity of the rating matrices are the two main challenges addressed by this system. Users and Netnews are clustered based on the existing news groups in the system, and the implicit ratings are computed by measuring the time the users spend reading Netnews.

Amazon.com is an example of an e-commerce recommendation engine that uses scalable item-to-item collaborative filtering techniques to recommend online products for different users. The computational algorithm scales independently of the number of users and items [53] within the database. Amazon.com uses an explicit information collection technique to obtain information from users. The interface is made up of the following sections: your browsing history, rate these items, improve your recommendations, and your profile. The system predicts a user's interest based on the items he/she has rated. The system then compares the user's browsing pattern on the
system and decides the item of interest to recommend to the user [71]. Amazon.com popularized the feature of "people who bought this item also bought these items". An example of the Amazon.com item-to-item contextual recommendation interface is shown in Fig. 4.

4.2.5. Trust in collaborative filtering recommendation systems

Trust in RS is defined as the correlation between similar preferences toward the items that are commonly rated or liked by two users [72]. Trust improves RS by combining similarity and trust between users. That is, the way neighbors are selected is modified by introducing trust in order to develop new relationships between users, so that it can increase connectivity and alleviate the challenges of data sparsity and cold start associated with traditional collaborative filtering techniques. Some of the empirical studies conducted by Ziegler et al. [24] revealed that correlation exists between trust and user similarity when a community's trust network is bound to some specific application. Following the studies, it can be deduced that computational trust models can act as appropriate means to supplement or completely replace the current collaborative filtering technique [73].

Different trust metrics are used in RS to measure and calculate the value between users in a network. These metrics are of two types, local and global trust metrics. Local trust metrics use the subjective opinion of the active user to predict the [...]

[...] collaborative approach, utilizing some collaborative filtering in content-based approach, creating a unified recommendation system that brings together both approaches.

4.3.1. Weighted hybridization

Weighted hybridization combines the results of different recommenders to generate a recommendation list or prediction by integrating the scores from each of the techniques in use by a linear formula. An example of a weighted hybridized recommendation system is P-tango [76]. The system consists of a content-based and a collaborative recommender. They are given equal weights at first, but weights are adjusted as predictions are confirmed or otherwise. The benefit of a weighted hybrid is that all the recommender system's strengths are utilized during the recommendation process in a straightforward way.

4.3.2. Switching hybridization

The system swaps to one of the recommendation techniques according to a heuristic reflecting the recommender's ability to produce a good rating. The switching hybrid has the ability to avoid problems specific to one method, e.g. the new user problem of a content-based recommender, by switching to a collaborative recommendation system. The benefit of this strategy is that the system is sensitive to the strengths and weaknesses of its constituent recommenders. The main disadvantage of
trustworthiness of other users from the active user perspective. switching hybrids is that it usually introduces more complexity
The trust value represents the amount of trust that the active to recommendation process because the switching criterion,
user puts on another user. Based on this technique, different which normally increases the number of parameters to the rec-
users trust the active user differently and therefore their trust ommendation system has to be determined [34]. Example of a
value is different from each other. Global trust metrics repre- switching hybrid recommender is the DailyLearner [77] that
sents an entire community’s opinion regarding the current uses both content-based and collaborative hybrid where a
user; therefore, every user receives only one value that repre- content-based recommendation is employed first before collab-
sents her level of trustworthiness in the community. Trust orative recommendation in a situation where the content-
scores in global trust metrics are calculated by the aggregation based system cannot make recommendations with enough
of all users’ opinions as regards the current user. Users’ repu- evidence.
tation on ebay.com is an example of using global trust in an
online shopping website. ebay.com calculates user reputation 4.3.3. Cascade hybridization
based on the number of users who left positive, negative, or The cascade hybridization technique applies an iterative refine-
neutral feedback for the items sold by the current user. ment process in constructing an order of preference among dif-
When the user does not have a specific opinion regarding ferent items. The recommendations of one technique are
another user, she usually relies on these aggregated trust refined by another recommendation technique. The first rec-
scores. Global trust can be further divided into two parts ommendation technique outputs a coarse list of recommenda-
namely profile level and item-level The profile-level trust refers tions which is in turn refined by the next recommendation
to the general definition of global trust metrics in which it technique. The hybridization technique is very efficient and
assigns one trust score to every user. tolerant to noise due to the coarse-to-finer nature of the itera-
tion. EntreeC [34] is an example of cascade hybridization
4.3. Hybrid filtering method that used a cascade knowledge-based and collabora-
tive recommender.
Hybrid filtering technique combines different recommendation
techniques in order to gain better system optimization to avoid 4.3.4. Mixed hybridization
some limitations and problems of pure recommendation sys- Mixed hybrids combine recommendation results of different
tems [74,75]. The idea behind hybrid techniques is that a com- recommendation techniques at the same time instead of having
bination of algorithms will provide more accurate and effective just one recommendation per item. Each item has multiple rec-
recommendations than a single algorithm as the disadvantages ommendations associated with it from different recommenda-
of one algorithm can be overcome by another algorithm [65]. tion techniques. In mixed hybridization, the individual
Using multiple recommendation techniques can suppress the performances do not always affect the general performance
weaknesses of an individual technique in a combined model. of a local region. Example of recommender system in this
The combination of approaches can be done in any of the fol- category that uses the mixed hybridization is the PTV system
lowing ways: separate implementation of algorithms and com- [78] which recommends a TV viewing schedule for a user by
bining the result, utilizing some content-based filtering in combining recommendations from content-based and
collaborative systems to form a schedule. Profinder [79] and PickAFlick [80] are also examples of mixed hybrid systems.

4.3.5. Feature-combination

The features produced by a specific recommendation technique are fed into another recommendation technique. For example, the rating of similar users, which is a feature of collaborative filtering, is used in a case-based reasoning recommendation technique as one of the features to determine the similarity between items. Ripper is an example of a feature-combination technique that used the collaborative filter's ratings in a content-based system as a feature for recommending movies [81]. The benefit of this technique is that it does not always exclusively rely on the collaborative data.

4.3.6. Feature-augmentation

The technique makes use of the ratings and other information produced by the previous recommender and it also requires additional functionality from the recommender systems. For example, the Libra system [42] makes content-based recommendation of books on data found in Amazon.com by employing a naïve Bayes text classifier. Feature-augmentation hybrids are superior to feature-combination methods in that they add a small number of features to the primary recommender.

4.3.7. Meta-level

The internal model generated by one recommendation technique is used as input for another. The model generated is always richer in information when compared to a single rating. Meta-level [17] hybrids are able to solve the sparsity problem of collaborative filtering techniques by using the entire model learned by the first technique as input for the second technique. An example of a meta-level technique is LaboUr [82], which uses instance-based learning to create a content-based user profile that is then compared in a collaborative manner.

5. Evaluation metrics for recommendation algorithms

The quality of a recommendation algorithm can be evaluated using different types of measurement, which can be accuracy or coverage. The type of metrics used depends on the type of filtering technique. Accuracy is the fraction of correct recommendations out of total possible recommendations, while coverage measures the fraction of objects in the search space the system is able to provide recommendations for. Metrics for measuring the accuracy of recommendation filtering systems are divided into statistical and decision support accuracy metrics [83]. The suitability of each metric depends on the features of the dataset and the type of tasks that the recommender system will do [36].

Statistical accuracy metrics evaluate the accuracy of a filtering technique by comparing the predicted ratings directly with the actual user rating. Mean Absolute Error (MAE) [84], Root Mean Square Error (RMSE) and Correlation are usually used as statistical accuracy metrics. MAE is the most popular and commonly used; it is a measure of the deviation of a recommendation from the user's specific value. It is computed as follows [76]:

$$\mathrm{MAE} = \frac{1}{N} \sum_{u,i} \left| p_{u,i} - r_{u,i} \right| \qquad (4)$$

where $p_{u,i}$ is the predicted rating for user $u$ on item $i$, $r_{u,i}$ is the actual rating and $N$ is the total number of ratings on the item set. The lower the MAE, the more accurately the recommendation engine predicts user ratings. Also, the Root Mean Square Error (RMSE) is given by Cotter et al. [85] as

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{u,i} \left( p_{u,i} - r_{u,i} \right)^{2}} \qquad (5)$$

Root Mean Square Error (RMSE) puts more emphasis on larger absolute errors, and the lower the RMSE is, the better the recommendation accuracy.

Decision support accuracy metrics that are popularly used are Reversal rate, Weighted errors, Receiver Operating Characteristics (ROC) and Precision Recall Curve (PRC), Precision, Recall and F-measure. These metrics help users in selecting items that are of very high quality out of the available set of items [86]. The metrics view the prediction procedure as a binary operation which distinguishes good items from those items that are not good. ROC curves are very successful when performing comprehensive assessments of the performance of some specific algorithms. Precision is the fraction of recommended items that is actually relevant to the user, while recall can be defined as the fraction of relevant items that are also part of the set of recommended items [87]. They are computed as

$$\text{Precision} = \frac{\text{Correctly recommended items}}{\text{Total recommended items}} \qquad (6)$$

$$\text{Recall} = \frac{\text{Correctly recommended items}}{\text{Total useful items}} \qquad (7)$$

F-measure, defined below, helps to simplify precision and recall into a single metric. The resulting value makes comparison between algorithms and across data sets very simple and straightforward [83].

$$F\text{-measure} = \frac{2PR}{P + R} \qquad (8)$$

where $P$ is precision and $R$ is recall.

Coverage has to do with the percentage of items and users for which a recommender system can provide predictions. Prediction may be practically impossible to make if no users or few users rated an item. Coverage can be reduced by defining small neighborhood sizes for users or items [88].

6. Conclusion

Recommender systems open new opportunities of retrieving personalized information on the Internet. They also help to alleviate the problem of information overload, which is a very common phenomenon with information retrieval systems, and enable users to have access to products and services which are not readily available to users on the system. This paper discussed the two traditional recommendation techniques and highlighted their strengths and challenges with diverse kinds of hybridization strategies used to improve their performances. Various learning algorithms used in generating recommendation models and evaluation metrics used in measuring the quality and performance of recommendation algorithms were discussed. This knowledge will empower researchers and serve as a road map to improve the state of the art recommendation techniques.
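As an illustration of the accuracy metrics of Section 5 (MAE, RMSE, precision, recall and F-measure, Eqs. (4)-(8)), the following is a minimal Python sketch rather than a reference implementation. The function names and the toy ratings are assumptions made for this example only; recall is computed against the full set of relevant items, following the prose definition of recall given in Section 5.

```python
import math


def mae(predicted, actual):
    """Mean Absolute Error, Eq. (4): mean of |p - r| over all rated pairs."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two equal-length, non-empty rating lists")
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(predicted)


def rmse(predicted, actual):
    """Root Mean Square Error, Eq. (5): square root of the mean squared error."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two equal-length, non-empty rating lists")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


def precision_recall_f(recommended, relevant):
    """Precision, recall and F-measure, Eqs. (6)-(8), for one recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)  # correctly recommended items
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical predicted and actual ratings for four (user, item) pairs.
    predicted = [4.0, 3.0, 5.0, 2.0]
    actual = [5.0, 3.0, 4.0, 4.0]
    print("MAE :", mae(predicted, actual))
    print("RMSE:", rmse(predicted, actual))

    # Hypothetical recommendation list and set of relevant items.
    print(precision_recall_f(["a", "b", "c", "d"], ["a", "c", "e"]))
```

On the toy ratings the absolute errors are 1, 0, 1 and 2, so the MAE is 1.0 while the RMSE is the square root of 1.5 (about 1.22); the gap reflects the extra weight that RMSE places on the single large error.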
[42] Mooney RJ, Roy L. Content-based book recommending using learning for text categorization. In: Proceedings of the fifth ACM conference on digital libraries. ACM; 2000. p. 195–204.
[43] Herlocker JL, Konstan JA, Terveen LG, Riedl JT. Evaluating collaborative filtering recommender systems. ACM Trans Inform Syst 2004;22(1):5–53.
[44] Breese J, Heckerman D, Kadie C. Empirical analysis of predictive algorithms for collaborative filtering. In: Proceedings of the 14th conference on uncertainty in artificial intelligence (UAI-98); 1998. p. 43–52.
[45] Zhao ZD, Shang MS. User-based collaborative filtering recommendation algorithms on Hadoop. In: Proceedings of 3rd international conference on knowledge discovering and data mining, (WKDD 2010), IEEE Computer Society, Washington DC, USA; 2010. p. 478–81. doi: 10.1109/WKDD.2010.54.
[46] Zhu X, Ye HW, Gong S. A personalized recommendation system combining case-based reasoning and user-based collaborative filtering. In: Control and decision conference (CCDC 2009), Chinese; 2009. p. 4026–28.
[47] Melville P, Mooney-Raymond J, Nagarajan R. Content-boosted collaborative filtering for improved recommendation. In: Proceedings of the eighteenth national conference on artificial intelligence (AAAI), Edmonton, Canada; 2002. p. 187–92.
[48] Jannach D, Zanker M, Felfernig A, Friedrich G. Recommender systems – an introduction. Cambridge University Press; 2010.
[49] Mobasher B, Jin X, Zhou Y. Semantically enhanced collaborative filtering on the web. In: Web mining: from web to semantic web. Berlin Heidelberg: Springer; 2004. p. 57–76.
[50] Yoon HC, Jae KK, Soung HK. A personalized recommender system based on web usage mining and decision tree induction. Expert Syst Appl 2002;23:329–42.
[51] Kużelewska U. Advantages of information granulation in clustering algorithms. In: Agents and artificial intelligence. NY: Springer; 2013. p. 131–45.
[52] McSherry D. Explaining the pros and cons of conclusions in CBR. In: Calero PAG, Funk P, editors. Proceedings of the European conference on case-based reasoning (ECCBR-04). Madrid (Spain): Springer; 2004. p. 317–30.
[53] Linden G, Smith B, York J. Amazon.com recommendation: item-to-item collaborative filtering. IEEE Internet Comput 2003;7(1):76–80.
[54] Michael JA, Berry A, Gordon S, Linoff L. Data mining techniques. 2nd ed. Wiley Publishing Inc.; 2004.
[55] Hosseini-Pozveh M, Nemartbakhsh M, Movahhedinia N. A multimedia approach for context-aware recommendation in mobile commerce. Int J Comput Sci Inform Secur 2009;3(1).
[56] Caruana R, Niculescu-Mizil A. An empirical comparison of supervised learning algorithms. In: Cohen W, Moore AW, editors. Machine Learning, Proceedings of the twenty-third international conference, ACM, New York; 2003. p. 161–8.
[57] Larose TD. Discovering knowledge in data. Hoboken (New Jersey): John Wiley; 2005.
[58] Berry MJA, Linoff G. Data mining techniques: for marketing, sales, and customer support. New York: Wiley Computer Publishing; 1997.
[59] Deng C, Xiaofe H, Ji-Rong W, Wei-Ying M. Block-level link analysis. In: Proceedings of the 27th annual international ACM SIGIR conference on research and development in information retrieval; 2004. p. 440–7.
[60] Koren Y, Bell R, Volinsky C. Matrix factorization techniques for recommender systems. IEEE Comput 2009;8:30–7.
[61] Bojnordi E, Moradi P. A novel collaborative filtering model based on combination of correlation method with matrix completion technique. In: 16th CSI international symposium on artificial intelligence and signal processing (AISP), IEEE; 2012.
[62] Takács G, István P, Bottyán N, Tikk D. Investigation of various matrix factorization methods for large recommender systems. In: IEEE international conference on data mining workshops. ICDMW'08. IEEE; 2008. p. 553–62.
[63] Candès EJ, Recht B. Exact matrix completion via convex optimization. Found Comput Math 2009;9(6):717–72.
[64] Keshavan RH, Montanari A, Sewoong O. Matrix completion from a few entries. IEEE Trans Inform Theor 2010;56(6):2980–98.
[65] Schafer JB, Frankowski D, Herlocker J, Sen S. Collaborative filtering recommender systems. In: Brusilovsky P, Kobsa A, Nejdl W, editors. The Adaptive Web, LNCS 4321. Berlin Heidelberg (Germany): Springer; 2007. p. 291–324. http://dx.doi.org/10.1007/978-3-540-72079-9_9.
[66] Burke R. Web recommender systems. In: Brusilovsky P, Kobsa A, Nejdl W, editors. The Adaptive Web, LNCS 4321. Berlin Heidelberg (Germany): Springer; 2007. p. 377–408. http://dx.doi.org/10.1007/978-3-540-72079-9_12.
[67] Park DH, Kim HK, Choi IY, Kim JK. A literature review and classification of recommender systems research. Expert Syst Appl 2012;39(11):10059–72.
[68] Su X, Khoshgoftaar TM. A survey of collaborative filtering techniques. Adv Artif Intell 2009;4:19.
[69] Shardanand U, Maes P. Social information filtering: algorithms for automating “word of mouth”. In: Proceedings of the SIGCHI conference on human factors in computing systems. ACM Press/Addison-Wesley Publishing Co.; 1995. p. 210–7.
[70] Konstan JA, Miller BN, Maltz D, Herlocker JL, Gordon LR, Riedl J. Applying collaborative filtering to usenet news. Commun ACM 1997;40(3):77–87.
[71] Lee TQ, Young P, Yong-Tae P. A time-based approach to effective recommender systems using implicit feedback. Expert Syst Appl 2008;34(4):3055–62.
[72] Shambour Q, Lu J. A trust-semantic fusion-based recommendation approach for e-business applications. Decis Support Syst 2012;54(1):768–80.
[73] O'Donovan J, Smyth B. Trust in recommender systems. In: Proceedings of the 10th international conference on intelligent user interfaces, ACM; 2005. p. 167–74.
[74] Adomavicius G, Zhang J. Impact of data characteristics on recommender systems performance. ACM Trans Manage Inform Syst 2012;3(1).
[75] Stern DH, Herbrich R, Graepel T. Matchbox: large scale online bayesian recommendations. In: Proceedings of the 18th international conference on World Wide Web. ACM, New York, NY, USA; 2009. p. 111–20.
[76] Claypool M, Gokhale A, Miranda T, Murnikov P, Netes D, Sartin M. Combining content-based and collaborative filters in an online newspaper. In: Proceedings of ACM SIGIR workshop on recommender systems: algorithms and evaluation, Berkeley, California; 1999.
[77] Billsus D, Pazzani MJ. A hybrid user model for news story classification. In: Kay J, editor. Proceedings of the seventh international conference on user modeling, Banff, Canada. Springer-Verlag, New York; 1999. p. 99–108.
[78] Smyth B, Cotter P. A personalized TV listings service for the digital TV age. J Knowl-Based Syst 2000;13(2–3):53–9.
[79] Wasfi AM. Collecting user access patterns for building user profiles and collaborative filtering. In: Proceedings of the 1999 international conference on intelligent user, Redondo Beach, CA; 1999. p. 57–64.
[80] Burke R, Hammond K, Young B. The FindMe approach to assisted browsing. IEEE Expert 1997;12(4):32–40.
[81] Basu C, Hirsh H, Cohen W. Recommendation as classification: using social and content-based information in recommendation. In: Proceedings of the 15th national conference on artificial intelligence, Madison, WI; 1998. p. 714–20.
[82] Schwab I, Kobsa A, Koychev I. Learning user interests through positive examples using content analysis and collaborative filtering. Draft from Fraunhofer Institute for Applied Information Technology, Germany; 2001.
[83] Sarwar B, Karypis G, Konstan J, Riedl J. Item-based collaborative filtering recommendation algorithms. In: Proceedings of the 10th international conference on World Wide Web, ACM, Hong Kong, China; 2001. p. 295.
[84] Goldberg K, Roeder T, Gupta D, Perkins C. Eigentaste: a constant time collaborative filtering algorithm. Inform Retrieval J 2001;4(2):133–51.
[85] Cotter P, Smyth B. PTV: Intelligent personalized TV guides. In: Twelfth conference on innovative applications of artificial intelligence; 2000. p. 957–64.
[86] Sarwar BM, Konstan JA, Herlocker JL, Miller B, Riedl JT. Using filtering agents to improve prediction quality in the grouplens research, collaborative filtering system. In: Proceedings of the ACM conference on computer supported cooperative work. New York (NY, USA): ACM; 1998. p. 345–54.
[87] Drosou M, Pitoura E. Search result diversification. SIGMOD Rec 2010;39(1):41–7.
[88] Papagelis M, Plexousakis D. Qualitative analysis of user-based and item-based prediction algorithms for recommendation agents. Int J Eng Appl Artif Intell 2005;18(4):781–9.