0% found this document useful (0 votes)

62 views16 pages

Enhancing Collaborative Filtering by User Interest Expansion Via Personalized Ranking

facet search

Uploaded by

Pavani Reddy

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

62 views16 pages

Enhancing Collaborative Filtering by User Interest Expansion Via Personalized Ranking

facet search

Uploaded by

Pavani Reddy

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 16

218 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 42, NO.

1, FEBRUARY 2012

Enhancing Collaborative Filtering by User Interest

Expansion via Personalized Ranking
Qi Liu, Enhong Chen, Senior Member, IEEE, Hui Xiong, Senior Member, IEEE,
Chris H. Q. Ding, Member, IEEE, and Jian Chen, Fellow, IEEE

Abstract—Recommender systems suggest a few items from automatically recommend the few optimal items, which users
many possible choices to the users by understanding their past might like or have interests to buy by learning the user profiles,
behaviors. In these systems, the user behaviors are influenced users’ previous transactions, the content of items, etc. [2]. In the
by the hidden interests of the users. Learning to leverage the
information about user interests is often critical for making better recent 20 years, many different types of recommender systems,
recommendations. However, existing collaborative-filtering-based such as collaborative-filtering-based methods [36], content-
recommender systems are usually focused on exploiting the in- based approaches [12], and hybrid approaches [46], have been
formation about the user’s interaction with the systems; the in- developed.
formation about latent user interests is largely underexplored.
To that end, inspired by the topic models, in this paper, we pro-
pose a novel collaborative-filtering-based recommender system by A. Collaborative Filtering
user interest expansion via personalized ranking, named iExpand.
The goal is to build an item-oriented model-based collaborative- Since collaborative-filtering methods only require the infor-
filtering framework. The iExpand method introduces a three- mation about user interactions and do not rely on the content
layer, user–interests–item, representation scheme, which leads to information of items or user profiles, they have more broad ap-
more accurate ranking recommendation results with less compu-
tation cost and helps the understanding of the interactions among plications [14], [16], [20], and more and more research studies
users, items, and user interests. Moreover, iExpand strategically on collaborative filtering have been reported [15], [26], [27].
deals with many issues that exist in traditional collaborative- These methods filter or evaluate items through the opinions
filtering approaches, such as the overspecialization problem and of other users [41]. They are usually based on the assumption
the cold-start problem. Finally, we evaluate iExpand on three that the given user will prefer the items which other users with
benchmark data sets, and experimental results show that iExpand
can lead to better ranking performance than state-of-the-art meth- similar preferences liked in the past [2].
ods with a significant margin. In the literature, there are model-based and memory-based
methods for collaborative filtering. Model-based approaches
Index Terms—Collaborative filtering, latent Dirichlet allocation
(LDA), personalized ranking, recommender systems, topic model. learn a model to make recommendation. Algorithms of this
category include the matrix factorization [38], the graph-based
approaches [14], etc. The common procedure of memory-based
I. I NTRODUCTION approaches is first to select a set of neighbor users for a
given user based on the entire collection of previously rated
T HE DEVELOPMENT of recommender systems has been
stimulated by the rapid growth of information on the
Internet. For information filtering, recommender systems can
items by the users. Then, the recommendations are made based
on the items that neighbor users like. Indeed, these methods
are referred to as user-oriented memory-based approaches. In
addition, an analogous procedure, which builds item similar-
Manuscript received November 8, 2010; revised March 24, 2011 and
June 24, 2011; accepted July 12, 2011. Date of current version December 7,
ity groups using corating history, is known as item-oriented
2011. This research was supported in part by the Natural Science Foundation memory-based collaborative filtering [40].
of China under Grants 60775037 and 71028002, by the Key Program of However, existing collaborative-filtering methods often di-
National Natural Science Foundation of China under Grant 60933013, and by
the Research Fund for the Doctoral Program of High Education of China under
rectly exploit the information about the users’ interaction with
Grant 20093402110017. A preliminary version of this work has been published the systems. In other words, they make recommendations by
in the Association for Computing Machinery Conference on Information and learning a “user–item” dualistic relationship. Therefore, exist-
Knowledge Management 2010. This paper was recommended by Associate
Editor J. Liu.
ing methods often neglect an important fact that there are many
Q. Liu and E. Chen are with the School of Computer Science and latent user interests which influence user behaviors. To that end,
Technology, University of Science and Technology of China, Hefei 230026, in this paper, we propose a three-layer, user–interests–item,
China (e-mail: [email protected]; [email protected]).
H. Xiong is with the Management Science and Information Systems representation scheme. Specifically, we interpret an interest as a
Department, Rutgers Business School, Rutgers University, Newark, NJ 07102 requirement from the user to items, while for the corresponding
USA (e-mail: [email protected]). item, the interest can be considered as one of its characteristics.
C. H. Q. Ding is with the Department of Computer Science and Engineering,
University of Texas at Arlington, Arlington, TX 76019 USA (e-mail: Indeed, it is necessary to leverage this three-layer representation
[email protected]). for enhancing collaborative filtering, since this representation
J. Chen is with the Department of Management Science and Engineering, leads to better explanation of why recommended items are
School of Economics and Management, Tsinghua University, Beijing 100084,
China (e-mail: [email protected]). chosen and helps the understanding of the interactions among
Digital Object Identifier 10.1109/TSMCB.2011.2163711 users, items, and user interests.
1083-4419/$26.00 © 2011 IEEE
LIU et al.: ENHANCING COLLABORATIVE FILTERING BY USER INTEREST EXPANSION 219

Indeed, the key challenges are how to model latent user

interests and the potential correlations and changes between
them. It is relatively easy to extract user interests in content-
based or hybrid recommender systems by tracking the text
information, such as “keywords” or “tags” [43]. However, for
collaborative-filtering systems, it is difficult to identify the user
latent interests, since the only information available is the user
interaction information with the system.

C. Contributions
To address the aforementioned challenges, in our prelim-
inary work [31], we proposed an item-oriented model-based
collaborative-filtering method named iExpand. In iExpand,
Fig. 1. Simple example of a movie recommender system (where the photos we assume that each user’s rating behavior depends on an
are downloaded from IMDB [http://www.imdb.com/)]. (a) When users decide underlying set of hidden interests and we use a three-layer,
to watch a movie, there are some latent interests that affect their choices.
(b) Users’ interests may change after they watch a movie. user–interests–item, representation scheme to generate recom-
mendations. Specifically, each user interest is first captured
by a latent factor which corresponds to a “topic” in topic
B. Motivating Example
models. Then, we learn the transition probabilities between
Fig. 1(a) shows an example of a movie recommender sys- different latent interests. Moreover, to deal with the cold-
tem. In the figure, user a is interested in kung fu movies, start and “overspecialization” problems, we model the possible
while user b likes Oscar movies. While both of them have expansion process of user interests by personalized ranking.
watched the movie Crouching Tiger, Hidden Dragon, which In other words, we exploit a personalized ranking strategy on
was recommended by the system, they have different reasons a latent interest correlation graph to predict the next possible
for watching this movie. Thus, if we can identify user latent interest for each user. At last, iExpand generates the recom-
interests, we will have a better understanding about the users’ mendation list by ranking the candidate items according to the
requirements, since user interests can better connect users and expanded user interests. We should note that, compared with
items. Also, when leveraging the information of user inter- previous topic-model-based collaborative-filtering approaches,
ests for developing recommender systems, we must be aware discovering the correlation between latent interests and using
that user interests can change from time to time under the personalized ranking to expand user current interests are the
influence of many internal and external factors. For instance, main advantages of iExpand.
as shown in Fig. 1(b), after watching the movie Crouching In addition, in many previous model-based recommender
Tiger, Hidden Dragon, user interests may be affected by it. systems, there are many parameters which are assigned default
For user a, while he is a fan of kung fu movies, he may start values. However, the best values for them should be determined
watching other movies directed by Ang Lee. Also, user b may in each particular scenario. In iExpand, we develop a model
become a fan of kung fu movies after her first-time exposure to select parameter values by combining Minka’s fixed-point
to this kung fu movie. If recommender systems cannot cap- iterations and an evaluation method for topic models.
ture these changes and only make recommendations according In this paper, we further explain why topic models can be
to the user’s past interests rather than exploring his/her new used to simulate the user latent interests and we demonstrate
preferences, then they are prone to the “overspecialization” the way of extracting these interests from the latent Dirichlet
problem [2]. allocation (LDA) model by the Gibbs sampling method. In
In addition, in real scenarios, the training data are far less addition, we illustrate how to use iExpand for making online
than plentiful and most of the items/users only have a few recommendations in the real-world applications. Finally, we
rating/buying records. At this time, typical measures fail to provide systematic experiments on three data sets selected from
capture actual similarities between items/users and the system a wide and diverse range of domains, and we use multiple
is unable to make meaningful recommendations. This situation evaluation metrics to evaluate the performance of iExpand.
is summarized as the cold-start problem [41]. Let us take Since iExpand views collaborative filtering as a ranking prob-
user b in Fig. 1(a) as an example. If she has only watched lem and aims to make recommendations by directly ranking the
one movie Crouching Tiger, Hidden Dragon and it has been candidate items, we report the ranking prediction accuracy. As
watched by few people before the rating of user b, for traditional shown in the experimental results, iExpand outperforms four
collaborative-filtering systems, it is difficult to find out the benchmark methods: two graph-based algorithms and two algo-
similar items or users for both Crouching Tiger, Hidden Dragon rithms based on dimension reduction. As many other algorithms
and user b. However, if we have identified that Crouching Tiger, formulate collaborative filtering as a regression problem (i.e.,
Hidden Dragon belongs to kung fu movies and Oscar movies, rating prediction) [30], we also report the comparison results of
then the system could recommend user b the movies that belong the rating predictions. In addition to this, these new experiments
to these two interests or some related interests. Thus, the cold- provide more insights into the iExpand model, such as the effect
start problem can be alleviated. of the parameters and the low computational cost.
220 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 42, NO. 1, FEBRUARY 2012

D. Outline
The rest of this paper is organized as follows. Section II gives
the detail of the iExpand method for effective recommendation.
In Section III, we show the experimental results and many
discussions. In Section IV, we introduce the related work.
Finally, Section V concludes this paper.

II. U SER I NTEREST E XPANSION

In this section, we first introduce the framework of the
iExpand model. Then, we describe each step of the model in
detail. In addition, we show how to select parameters. Finally,
we address the computational complexity issue. Fig. 2. Framework of the iExpand model. Gray arrows show the general
process of the model, while black arrows show the procedure of online
recommendations.
A. The Framework of the iExpand Model
TABLE I
First of all, the iExpand model assumes that, in recom- M ATHEMATICAL N OTATIONS
mender systems, a user’s rating behavior depends on an under-
lying set of hidden interests. Inspired by the topic models, in
iExpand, each user is represented as a probability distribution
over interests and each interest is a probability distribution
over items. Fig. 3(a) shows the three-layer representation,
user–interests–item. What is more is that the iExpand model
assumes that the order of items in a user’s rating record can
be neglected and the users’ order in a user set can also be
neglected, which means both items and users are exchangeable.
In correspondence with the LDA model [8], a topic model that
we use in iExpand for extracting user interests, the users are
documents, the items are words, and the latent interests are
topics, respectively.
We should note that, in terms of items, a latent interest can
be viewed as one specific characteristic of the items and the
users who have this latent interest will prefer the items with this characteristics. In this way, the latent topics can be used to
characteristic. Since an item may have multiple characteristics, simulate the real-world interests.
it belongs to many latent interests (i.e., polysemy). At the same Fig. 2 shows the framework of the iExpand model. From
time, different items may have a similar characteristic; they may Fig. 2 we can see that, when a user comes, the learning and rec-
at least refer to one same latent interest (i.e., synonymy). Let us ommendation process of the iExpand model generally consists
take the movies in Fig. 1 for example; the movie Crouching of four steps. In the first step, the information about user latent
Tiger, Hidden Dragon belongs to multiple interests, Oscar, interests is extracted by the inference of the LDA model. In
kung fu, Yun-Fat Chow, etc., and there are also many movies the second step, the correlation graph/matrix of latent interests
that can be denoted by the interest kung fu. However, in most is established by an item–interest bipartite graph projection.
cases, it is impossible to understand the item characteristics In the third step, for a given user, his/her interest distribution
clearly (e.g., in collaborative-filtering scenario). Fortunately, as is expanded by letting the current interest vector perform a
a simulation tool, the topic model (e.g., LDA) can be used to random walk on the interest correlation graph/matrix. Finally,
learn the meaning and characteristics of items in a data-driven the candidate items are ranked using expanded user interests
fashion, i.e., from given rating records, possibly without further and the recommendation list is generated.
content or prior knowledge of these items. Each step of the iExpand model is introduced in the follow-
Topic models are a type of statistical models, which were ing sections. For better illustration, Table I lists all mathemati-
firstly proposed in machine learning and natural language cal notations used in this paper.
processing for discovering the hidden topics (e.g., Basketball,
Travel, and Cooking) that occur in a collection of documents.
B. Extracting User Interests From the LDA Model
In terms of collaborative filtering, the documents can be viewed
as the users, the words are items, and topics become the hidden In this section, we show how to extract the information about
interests. Based on the hypothesis of topic models, the co- user latent interests from the LDA model. The information
occurrence structure of items in the rating records can be used about latent interests include the probability distribution of each
to recover the latent interest structure and the items that often user over interests, the probability distribution of each interest
appear together in one rating record may tend to have the same over items, and the distribution of each interest.
LIU et al.: ENHANCING COLLABORATIVE FILTERING BY USER INTEREST EXPANSION 221

Fig. 3. (a) Three-layer representation scheme. (b) Graphical model represen-

tation of LDA.

For collaborative filtering, the LDA model can be represented

by a probabilistic graphical model, as shown in Fig. 3(b),
where shaded and unshaded variables indicate observed and
latent (i.e., unobserved) variables, respectively. In Fig. 3(b),
each user in M users is represented as a bag of item tokens
Mu , and each token is viewed as an observed variable i.
Because LDA can provide an intuitive description of each
observed variable i, it is a type of generative probabilistic
model. Specifically, this item token is generated from a multino-
mial distribution over items φt , specific to an interest t, and Fig. 4. Example for the LDA inference process based on Gibbs sampling.
interest t is chosen from a multinomial distribution over in-
Ii (φij ) and user Ui ’s distribution over interest Tj (θij ) can be
terests θu , specific to this user. Both θ and φ are modeled by
estimated by
the Dirichlet distribution, with the hyperparameters α and β,
respectively. NK
Cij +β
Extracting user interests θ from LDA is a latent variable φij = P (Ii |Tj ) =

N
NK + N β
inference process, which is to “invert” the generative model Cnj
n=1
and “generate” latent variables (i.e., the interests’ distribution
MK
over items φ and the users’ interest distributions θ) from Cij +α
θij = P (Tj |Ui ) = (1)
given observations. After inference, the value of these latent
K
MK
variables should maximize the posterior distribution of the Cik + Kα
k=1
entire user rating records (i.e., given observations). However,
learning these latent variables is intractable in general [8]. where C N K and C M K are matrices with dimensions N × K
Thus, many approximations have been proposed, including and M × K, respectively. Cij NK
denotes the number of times
Gibbs sampling [19], variational inference [8], and so on. that item Ii is sampled from interest Tj , and Cij MK
refers to
The previous research has found that main differences among the number of times that interest Tj is assigned to the items
these approaches could be explained by the different settings in user Ui ’s rating record. Fig. 4 shows a simple example for
of two hyperparameters [5]. In this paper, we choose the the LDA inference process by extracting two users’ interest
Gibbs sampling technique, a form of Markov chain Monte distributions from Gibbs sampling. From Fig. 4, we can see that
Carlo, which is easy to implement and provides a relatively users with different preferences will finally get different interest
efficient method for extracting a set of interests from a large distributions.
rating set. In addition, in iExpand, we further extract the probability
The Gibbs sampling algorithm begins with the assignment of distribution of each latent interest Ti (ϑi ), and ϑi can be
each item token in users’ rating records to a random interest, estimated by
determining the initial state of the Markov chain. In each of
the following iterations of the chain, for each item token, the
M
MK
Cmi +α
Gibbs sampling method estimates the conditional distribution m=1
ϑi = P (Ti ) = . (2)
of assigning this token to each interest, conditioned on the
K M
M K + Kα
interest assignments to all other item tokens. An interest is Cmk
k=1 m=1
sampled from this conditional distribution and then stored as
the new interest assignment for this token. After an enough It is worth distinguishing between our user interests and the
number of iterations for the Markov chain, the interest as- latent topics in topic models, like Probabilistic Latent Semantic
signment for each item token will converge and each token in Analysis (PLSA) or LDA. In iExpand, each user has a distri-
the rating records is assigned to a “stable” interest. According bution on the spectrum of interests, whereas in PLSA/LDA,
to the assignment, the distribution of interest Tj over item a topic is a latent variable and the distributions are specified
222 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 42, NO. 1, FEBRUARY 2012

just deal with user current interests, the systems will suf-
fer from the overspecialization problem and the cold-start
problem.
To address this issue, we use PageRank [34], a personalized
ranking strategy on the user interest correlation graph. We
choose this strategy not only because it can create personalized
views of interest importance but also because it can predict user
interest expansion by exploiting the structure of the interest
Fig. 5. Example of interest–item bipartite graph. For simplicity, not all the correlation graph. Given a user interest (a vector), we do repeat
edges between each pair of item and interest are shown. PageRank iterations (i.e., guided random walks) until conver-
gence. The final converged PageRank score vector contains the
by the specific topic, i.e., they are class (topic)-conditional
expanded user interests. One can also view this as predicting the
distributions. Thus, the model representation of iExpand and
next possible interest for each user. Thus, we can make diverse
PLSA/LDA are significantly different from each other.
recommendations in a systematic way.
The algorithmic approach here is the personalized ranking
C. Correlation Graph of User Interests [25]. First, for user Ui , we represent his/her current interest
(0) (0)
In this section, we describe how to compute the transition model through vector θi in which the jth entry θi (j)
probabilities between latent interests by the correlation graph. corresponds to a latent interest Tj and is initialized as θij . From
(0)
In order to construct the correlation graph of latent interests, we Section II-C, we know θi is normalized and it represents
use the items as intermediary entities. ϕ is created to estimate the probability distribution on each latent interest when random
each item’s probability distribution over interests and ϕij can walk starts.
be estimated by Next, let θi perform Random Walk with Restart (RWR)
[18] (a specific implementation of the personalized ranking)
P (Tj , Ii ) φij ϑj
ϕij = P (Tj |Ii ) = = K . (3) on the correlation graph. Let us consider a random walk that
P (Ii ) (0)
ϑk φik starts from θi ; when arriving at Tj , it randomly chooses
k=1 Tj ’s neighbors and keeps walking. For each step, in ad-
Although the interests may be correlated to each other in dition to making such decisions, the random walker goes
reality, in LDA, when α is given, the distributions of interests back to the starting point with a certain probability c, so
are independent. Unlike the Correlated Topic Model [7], in as to counteract the dependence on far-away parts of the
iExpand, we model those correlations in the form of probabil- graph.
ities. Specifically, we first use a bipartite graph G = X, E to For example, the process of one step random walk of the user
represent the relationships between items and interests, with the Ui from step s to step (s + 1) can be formalized as
vertex set X = I ∪ T , as shown in Fig. 5. In the bipartite graph, (s+1) (s) (0)
the weight of the edge from interest Tj to item Ii is φij and the θi = (1 − c)θi ψ + cθi (5)
weight of the edge from Ii to Tj is ϕij .
while, for all the users, their one-step updates can be formal-
Then, by projecting G, we get the relationships between
ized as
interests, and we use ψ to represent them. Also, ψij indicates

the recommending strength of interest Ti for Tj , and it can be θ(s) = θ, s=0
computed by (6)
θ(s+1) = (1 − c)θ(s) ψ + cθ, s 0

N
N
(s)
ψij = P (Tj |Ti ) = P (Tj |In )P (In |Ti ) = ϕnj φni . (4) where θi serves as the interest vector for Ui after s steps of
n=1 n=1 random walk have completed. All users’ interest vectors form
(s)
At last, the bipartite graph is transformed into a correlation a matrix θ(s) where θij means the steady-state probability that
graph which describes the relations between interests, and ψ a random walk starts from user Ui and stops at interest Tj after
becomes its correlation matrix. It can be proven easily that s steps, meanwhile it implies the affinity of Tj with respect
(s)
each entry in ψ is equal to or greater than zero and ψ is row to Ui . The bigger θij , the closer Ui and Tj .
normalized. In terms of a correlation matrix, ψij means the The personalized ranking is run for all users simultaneously,
coefficient of correlation between Ti and Tj , from Ti ’s view. and it only takes several steps on average before θ(s) converges.
However, in terms of random walk, ψij is the probability that The parameter c indicates the restart probability, and (1 − c) is
current state jumps from Ti to Tj . the decay factor used to represent how much relationship is lost
in each step.
D. User Interest Expansion
E. From Expanded User Interests to the Item Recommendation
In this section, we describe the solution for the expan-
sion of user interests. As discussed previously, user interests In this section, we describe the last process of iExpand, the
often change from one to another. If recommender systems ranking of the items and the generation of recommendation
LIU et al.: ENHANCING COLLABORATIVE FILTERING BY USER INTEREST EXPANSION 223

lists. In iExpand, the items are ranked by their relevance with F. Estimating the Parameters
any given user. The user’s possible distribution on latent inter-
In this section, we present a method of selecting val-
ests serves as intermediary entities
ues for the parameters of iExpand. There are three para-

K
K meters: the hyperparameters α and β and the number of
(s)
P (Ij |Ui ) = P (Ij |t = k)Ps (t = k|Ui ) = φjk θik . (7) interests K.
k=1 k=1 First of all, we select the values for α and β. Previous
research works have found that α = 50/K and β = 0.01 work
It is easy to obtain the top K recommendations by well with different text collections, and they are often used
ranking the candidate items. Thus, iExpand directly gener- as the default values [10], [49]. However, Steyvers et al. [45]
ates recommendations without the step of predicting rating pointed out that good choices for these values depend on the
scores. number of interests and the item size. Furthermore, Asuncion
In addition, if the user rating has been taken into consid- et al. [5] suggested that hyperparameters play an important role
eration, iExpand can be used as a rating prediction method, in learning accurate topic models. Therefore, finding the best
such as the traditional memory-based collaborative-filtering α, β settings for each scenario is important. There are many
methods. Here, Pearson Correlation on expanded user interest ways for learning them [48], among which Minka’s fixed-point
vectors can be used to compute user similarities Sim(Ui , Uh ). iteration is widely used. It was proposed by Minka in [33] and
Therefore, the neighborhood N eighbor(Ui ) can be formed was carefully studied by Wallach [48]. In iExpand, each step of
for user Ui . Then, the rating from user Ui to item Ij can be fixed-point iteration is formalized as
predicted by

M K
α MK
Ψ Cmk + α − Ψ(α)
Sim(Ui , Uh ) ∗ (rh,j − rh )
α∗ ←−
m=1 k=1
Uh ∈Neighbor(Ui )

K
r̂i,j = ri +
|Sim(Ui , Uh )|
(8)
M MK
K Ψ Cmk + Kα − Ψ(Kα)
Uh ∈Neighbor(Ui ) m=1 k=1

K NK

N
where r̄i and r̄h are the average rating values for user Ui and β Ψ Cnk + β − Ψ(β)
Uh , respectively. rh,j refers to the rating value for item Ij from β ∗ ←− k=1 n=1

N (11)
user Uh .
K NK
N Ψ Cnk + N β − Ψ(N β) .
What we discussed earlier is about how to make recommen- k=1 n=1
dations in a general iExpand process. However, in real-world
applications, we face the challenge of online recommendations. Next, in addition to α and β, we choose the right value for
Since users’ interest distributions may change quickly from the interest number K. In previous works, if categories of the
time to time, while the correlation of interests evolves slowly, data sets are known, then K will be set equal to that number [9].
we can update both the inference process and the correlation However, in most scenarios, the category is unknown and how
graph periodically offline while renewing the user interests to set K becomes a problem. In most cases, K is randomly
whenever he/she rates. For example, when user Uu rates a new chosen or given a default value [5], [10], [51]. Until now,
item, Uu ’s interests can be resampled by Gibbs sampling. In one possible approach for setting this value is to compute the
each iteration, the interest assignment for every item in Uu ’s likelihood of the test data under different K values, then the
rating record is sampled by best one is chosen by a grid search. Exact computation of the
posterior probability is intractable, since it requires summing
P (tui = j|tu¬i , Uu , . . .) over all possible assignments. However, we can approximate it
by an estimator. In this paper, we refer to an approach proposed
CINuK u u
Cj¬i + α
j + Cj¬i + β by Wallach et al. [47] named Chib-style estimation which was
∝ i
(9)

N u
K u initially proposed as one evaluation method for topic models.
NK
Cnj + Cj + N β − 1 Ck + Kα − 1 The main idea of this approach is first to choose a special set
n=1 k=1
of latent topic assignments and then use Bayes’ rule to estimate
where tui = j means the interest assignment of item Iiu to the posterior probability.
u is a vector, and Cj u denotes the number of times Finally, as the posterior probability depends on α, β, and
interest Tj , C
K, we combine these factors together and propose a parameter
that interest Tj is assigned to the items in Uu ’s rating record.
learning algorithm, as shown in Algorithm 1. In Algorithm 1,
Also, ¬i refers to the interest assignments of all other items,
inputting the initial values of α and β, we first use Gibbs
not including the current instance. After performing interest
sampling and Minka’s fixed-point iteration to learn optimal
resampling, each interest distribution component of Uu can be
values for α and β specific to each number of interest K.
computed by
Then, Chib-style estimation is used to compute the posterior
Cju + α probability of the test data, under current parameter setting.
θuj = P (Tj |Uu ) = . (10) Lastly, the parameters with the best posterior probability are

K
chosen for the model, and they are used as the default settings
Cku + Kα
k=1 for performance comparison in the experimental part.
224 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 42, NO. 1, FEBRUARY 2012

TABLE II
D ESCRIPTION OF T HREE DATA S ETS
Algorithm 1: Estimating Parameters (a, b)

input: a is the initial value of α;

b is the initial value of β;
output: the best values for α, β, and K
for all candidate K do
Initialize α = a; detailed information about these three data sets are described
Initialize β = b; in Table II.
for loop ← 1 to M AX_LOOP do For each user’s rating record, we split it into a training set
Gibbs sampling; and a test set, by randomly selecting some percentage of the
Update α, β by (11); ratings to be part of the training set and the remaining ones to
posterior = log(Chib − style estimation(α, β, K)); be part of the test set. To observe how each algorithm behaves at
Record the maximum posterior and the corresponding different sparsity levels, we construct different sizes of training
α, β, and K; sets from 10% to 90% of the ratings with the increasing step at
Return the best values for α, β, and K 10%. In total, we construct nine pairs of training and test sets,
and each split named as x − (100 − x) means x percent ratings
are selected to be the training data and the remaining (100 − x)
percent ratings for testing.
G. Computational Complexity
Benchmark Methods. In order to demonstrate the effective-
In this section, we analyze the computational complexity ness of iExpand, we compare it with many other benchmark
issues for iExpand. Specifically, the time cost for the inference methods for both the ranking prediction accuracy and the rating
of LDA is O(M · N · K · l), where l is the iteration number prediction accuracy. For the ranking purpose, we compare it
of Gibbs sampling. For the bipartite graph projection, most of with two graph-based algorithms, ItemRank [18] and L+̇ [14],
the time is used to construct the correlation matrix ψ and the as well as two algorithms based on dimension reduction, LDA
time cost in this phase is O(N · K 2 ). For each user, the cost for and SVD [39], both of which do not take user interests into
random walk is O(s · K 2 ) on average. Thus, for all the users, consideration. Among them, ItemRank is a personalized rank-
it costs O(s · M · K 2 ). Since K
M and K
N and the ing strategy on the item correlation graph for alleviating the
time cost for ranking the items and making recommendations cold-start problem and L+̇ is widely used for measuring node
can be neglected, the total computational complexity for the similarities in a graph. Similar to iExpand, both LDA and
general iExpand process is O(M · N · K · l). As we discussed SVD consider the latent factors. However, they are just used
in Section II-D, in real-world applications, both the inference for dimension reduction and do not consider the correlation
process and the correlation graph can be updated periodically between latent factors. All the aforementioned four methods
offline; thus, for online computing, we just need to run less can be seen as the related methods for iExpand.
than 30 iterations of Gibbs sampling [37] and one personalized For the rating purpose, we also compare it with four existing
ranking or rating prediction for the current user, both of which methods. For the memory-based method, we implemented the
can be done efficiently. The online recommendations can be user-based collaborative filtering (UCF) [36]. For the model-
followed by the black arrows shown in Fig. 2. based method, we chose the RSVD [15] and LDA. In ad-
dition to this, we also implemented the graph-based algo-
III. E XPERIMENTAL R ESULTS rithm ItemRank [18]. Both UCF and RSVD are state-of-the-art
collaborative-filtering algorithms, and they are widely used for
In this section, we present the experimental results to eval- baselines.
uate the performance of iExpand. Specifically, we demonstrate Among all these methods, RSVD, UCF, and SVD are orig-
the following: 1) the results of parameter selection based on inally rating-oriented algorithms and the rest of the methods,
Algorithm 1; 2) a performance comparison between iExpand including iExpand, are ranking-oriented algorithms. All these
and many other benchmark methods; 3) an analysis of the methods have been chosen as the baseline methods.
parameters in personalized ranking; 4) the understanding of
interests and interest expansion; and 5) the discussion about the
advantages and limitations of the iExpand model. B. Evaluation Metrics
For the purpose of evaluation, we adopted Degree of Agree-
A. Experimental Setup
ment (DOA) [14], Top-K [26], and Recall [20], [39] as the eval-
All the experiments were performed on three real-world uation metrics for ranking prediction accuracy. All of them are
data sets: MovieLens [1], Book-Crossing [55], and Jester [17]. commonly used for ranking accuracy, and these three metrics
The first one is collected by the GroupLens Research Project try to characterize the recommendation results from different
and has become a benchmark for evaluating recommender perspectives.
systems. For the last two data sets, we only choose part of DOA measures the percentage of item pairs ranked in the
them, considering the scalability problem of many of our correct order with respect to all pairs [14], [18]. Let N WUi =
benchmark methods (i.e., the graph-based algorithms). The I − (LUi ∪ EUi ) denote the set of items that do not occur in
LIU et al.: ENHANCING COLLABORATIVE FILTERING BY USER INTEREST EXPANSION 225

the training and test sets for Ui , where LUi and EUi mean the
item set that Ui rated in the training and test sets, respectively.
Furthermore, we define check_order as

1, if P RIj ≥ P RIk
check_orderUi (Ij , Ik ) =
0, otherwise
where P RIj denotes the predicted rank of item Ij in the
recommendation list. Then, the individual DOA for user Ui is
defined as

j∈EUi ,k∈N WUi check_orderUi (Ij , Ik )
DOAUi = .
|EUi | × |N WUi |
An ideal ranking corresponds to a 100% DOA, and we use
DOA to stand for the average of each individual DOA.
Top-K indicates the precision of the selected top K items,
and Recall measures the ratio of the number of hits to the size of
each user’s test data [39]. For each user Ui , these two measures
are defined as follows:
#hits #hits
T op − KUi = , RecallUi = .
K |EUi |
Fig. 6. Results of parameter selection for MovieLens. (a) Best α for different
For the purpose of evaluating the rating effectiveness, we number of interests. (b) Best β for different number of interests. (c) Log-
likelihood of posterior for different number of interests.
also choose the Mean Absolute Error (MAE) and the Root
Mean Squared Error (RMSE) as the evaluation metrics. Both TABLE III
of them are commonly used in traditional collaborative-filtering PARAMETER S ETTINGS
systems [2], [15], [20], [27].

C. Parameters in LDA
In this section, we investigate the learning performances of
two parameters, namely, hyperparameters and the number of mark approaches: ItemRank [18], L+̇ [14], UCF [36], SVD
interests, by Algorithm 1. Here, the first 893 users in Movie- [39],1 LDA, and RSVD [15]. For the purpose of comparison,
Lens are used as training data and the remaining 50 users form we record the best performance of each algorithm by tuning
the test set. Similarly, for Book-Crossing, the first 900 users their parameters. The training models of all these algorithms
are treated as training samples and the remaining users as test are learned only once, and ratings in the test set have never
data. Also, for Jester data set, the first 1800 users are treated as been used in the training process. Therefore, in order to make
training data and the remaining 200 users for testing. For each a clearer and fairer comparison, we do not take the online
run of Algorithm 1, we initialize the parameters as a = 0.5 and recommendation into consideration.
b = 0.5 and turn on Minka’s updates after 15 iterations, and First of all, we show a comparison of the effectiveness
these settings are similar to the ones in [5]. of all the algorithms. Tables IV and V and Fig. 7 show the
The estimation of posterior for the test set is computed for K performances of their recommendations with respect to dif-
sizes from 50 to 800 for both MovieLens and Book-Crossing ferent splits and different evaluation metrics. Table IV(a)–(c)
and K from 20 to 100 for Jester. The Gibbs sampling algorithm illustrates the evaluation results of the DOA/Recall measures.
runs 1000 iterations each time. Let us take the MovieLens Fig. 7 demonstrates the top K results on the three data sets, and
data set as an example. The results of parameter selection are Table V shows the evaluation results of the rating prediction
shown in Fig. 6. The results suggest that the test set are best accuracy on the MovieLens data set. Note that we did not report
accounted for by an LDA model incorporating 300 interests the rating prediction results on the Book-Crossing and Jester
and the corresponding best hyperparameter settings are α = data sets because the rating scale is too big for Jester and most
0.001 and β = 0.08. In Fig. 6, we can observe that the best of the ratings are 0 in Book-Crossing.
hyperparameters for collaborative filtering are different from DOA/Recall. In terms of DOA/Recall measures, from Ta-
those of text applications based on topic models. Finally, the ble IV, we can see that iExpand outperforms the other four
results of parameter selection are summarized in Table III. algorithms in each split. Also, the sparser the data, the more
significant improvement can be made. Indeed, both Item-
D. Performance Comparison
1 In our implementation, we rank the items by computing their Pearson
In this section, we present a performance comparison of both correlation with each user. This is slightly different from the implementation
effectiveness and efficiency between iExpand and the bench- in [39]; however, this way can yield better results for our situation.
226 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 42, NO. 1, FEBRUARY 2012

TABLE IV
P ERFORMANCE C OMPARISON OF D IFFERENT A LGORITHMS BASED ON DOA/R ECALL R ESULTS. (a) P ERFORMANCE C OMPARISON ON THE M OVIE L ENS
DATA S ET [(L EFT ) DOA IN P ERCENT. (R IGHT ) R ECALL IN P ERCENT.]. (b) P ERFORMANCE C OMPARISON ON THE B OOK -C ROSSING
DATA S ET [(L EFT ) DOA IN P ERCENT. (R IGHT ) R ECALL IN P ERCENT.]. (c) P ERFORMANCE C OMPARISON
ON THE J ESTER DATA S ET [(L EFT ) DOA IN P ERCENT. (R IGHT ) R ECALL IN P ERCENT.]

TABLE V
P ERFORMANCE C OMPARISON OF D IFFERENT A LGORITHMS BASED ON R ATING R ESULTS [(L EFT ) MAE. (R IGHT ) RMSE]

Rank and iExpand aim at alleviating the sparsity problem iExpand and LDA is interest expansion or not and because
and the cold-start problem, and they perform better than iExpand can expand user interests and increase the diversity
L+̇ , SVD, and LDA (except for Jester) when the training in a properly controlled manner, it performs much better than
sets are sparse, such as the 10–90 and 20–80 splits. How- LDA in all the cases. This means interest expansion can lead
ever, iExpand performs much better than ItemRank. For ex- to a better performance than only exploiting the current user
ample, in the three 10–90 splits, iExpand achieves nearly interests. Another interesting observation is that the smaller
two points of improvement on DOA values with respect to and sparser the training set, the more significant improve-
ItemRank. ment is made by iExpand compared with LDA, and when
In addition, both LDA and iExpand reduce data dimensions, the training set becomes larger and denser, the improvement
so they perform better when the data are dense, while SVD, becomes less obvious. The reason is that, when there are
another algorithm based on dimension reduction, does not enough interactions between a user and the system, the user
perform well. This may because of the use of different de- has experienced various types of items and his/her preference
composing techniques. Finally, as the main difference between has been decided. Hence, there will be not much difference
LIU et al.: ENHANCING COLLABORATIVE FILTERING BY USER INTEREST EXPANSION 227

Fig. 7. Performance comparison based on top K results. (a) 10-90, MovieLens. (b) 30-70, MovieLens. (c) 50-50, MovieLens. (d) 70-30, MovieLens. (e) 90-10,
MovieLens. (f) 10-90, Book-Crossing. (g) 30-70, Book-Crossing. (h) 50-50, Book-Crossing. (i) 70-30, Book-Crossing. (j) 90-10, Book-Crossing. (k) 10-90, Jester.
(l) 30-70, Jester. (m) 50-50, Jester. (n) 70-30, Jester. (o) 90-10, Jester.

from the current interest distribution to the next possible interest MAE/RMSE. From Table V, we can see that iExpand
distribution.2 performs the best on the two sparsest splits, while in gen-
Top-K. For better illustration, we select five splits from each eral, RSVD outperforms the other methods in terms of the
data set, and we only show the results of the three algorithms MAE/RMSE. On the sparse splits, the methods that can dis-
with the best top K performances. Fig. 7 shows the comparative cover the indirect correlations and deal with the cold-start
results of ItemRank, LDA, and iExpand, where the performance problem (i.e., iExpand and ItemRank) get better results than
of ItemRank is chosen as the baseline and the comparative other algorithms (i.e., RSVD, LDA, and UCF). However, on
results of LDA and iExpand against ItemRank on each k (k the remaining splits, the rating-oriented methods (i.e., RSVD
ranges from 5 to 25) are demonstrated. In Fig. 7, we can see that and UCF) generally perform better than the ranking-oriented
iExpand performs better than the baseline on almost every split, methods (i.e.,ItemRank, LDA, and iExpand). Another inter-
while there are more than five splits where LDA performs worse esting observation is that these two types of evaluation met-
than the baseline. Also, iExpand outperforms LDA, only except rics DOA/Recall/top K and MAE/RMSE lead to inconsistent
for the last two splits of Book-Crossing. In all, in terms of the judgements on the algorithms. The same observation has been
top K measure, in most cases, iExpand performs better than reported in many previous works [20], [30].
other methods. Finally, the sparser the data, the more significant Note that we chose SVD instead of RSVD for the ranking
improvement can be seen. This is similar to the results of comparison. The reason is that RSVD led to very bad results
DOA/Recall. which are not comparable with other methods in our ranking ex-
periments. In addition, the question about whether the ranking
prediction accuracy or the rating prediction accuracy is more
important is beyond the scope of this paper.
2 Only one exception is for the Jester data, where LDA performs nearly as
Runtime. Next, we compare the computational efficiency of
well as iExpand on the first two splits. This is because Jester data are a very
dense data, which can be seen from the data description in Table II, and this many algorithms. Fig. 8 shows the execution time of these algo-
alleviates the advantages of interest expansion. rithms on each data set. Without a surprise, on both MovieLens
228 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 42, NO. 1, FEBRUARY 2012

Fig. 8. Comparison of the execution time on each data set. (a) MovieLens data set. (b) Book-Crossing data set. (c) Jester data set.

sets is often bigger than that for smaller data sets. On the one
hand, when the training set is large, the correlation graph is
dense and there are plenty of direct contacts between vertices.
In this scenario, few indirect similarities needs to be considered,
and the random walk regresses to one step random walk or
there is no need for random walk. On the other hand, when
the correlation graph is sparse, random walk does not need to
restart frequently for lack of direct contacts. In this scenario, the
indirect contacts should be considered, and multistep random
walk will perform better than one-step random walk.
As an example, Fig. 9(d) shows the effect of the step of
random walk s on the performances of iExpand for two splits.
We can see that both curves converge after a few (no more than
ten) steps. The results show that random walk does enhance
the performance of iExpand and the best performance can be
achieved by just a few steps of random walk.

F. Understanding of Interests and Interest Expansion

In this section, we first show the interrelationships between
Fig. 9. Effect of parameters in personalized ranking. (a) Best c for Movie-
Lens. (b) Best c for Book-Crossing. (c) Best c for Jester. (d) Steps of random
latent interests and explicit interests, and then, we explain the
walk for two splits. advantages of interest expansion by examples.
In the previous sections, we do not distinguish latent interests
and Book-Crossing data sets, among these five model-based and explicit interests deliberately. As we have mentioned, the
collaborative filterings, LDA costs the least time, and with former is a latent factor extracted by the topic model, while the
dimension reduction, the execution time of iExpand is almost latter is the one identified in the real world. In this paper, we
no longer than that of LDA. Both of them are much faster than use latent interests to simulate explicit interests, and research
three other algorithms with respect to each split. For Jester data works have shown their one-to-one correspondence [8], [9],
set, where there are only 100 items, ItemRank method costs less [51]. Moreover, Mei et al. [32] proposed general approaches
time than the other approaches. However, in most real-world for interpreting the meaning of each latent topic. The question
applications, the number of items are more than thousands, and is whether every latent interest has a real meaning for use in
the time cost for ItemRank will be relatively very high. At the iExpand. A positive answer is critical for the effectiveness of
same time, with the increase of the item numbers, its time cost iExpand for collaborative filtering.
will rise rapidly. To this end, we consider the first three latent interests ex-
tracted from the MovieLens data set. Table VI lists the top five
movies for each latent interest identified. As can be seen, all five
E. Analysis of Parameters in Personalized Ranking
movies in the first latent interest have the same genres which
In this section, we provide an analysis of two parameters: the can be tagged as Action, Adventure, and Fantasy3 or they can
restart probability c and the step of random walk s. be labeled “Harrison Ford” (and contain one mistake), while
To study the effect of c, we let it vary in the range of [0, 1). movies in the second column all fall into Comedy and Drama.
When it is 0, random walk will never restart. When c is close However, there are several types of movie genres for the third
to 1, the performance of iExpand will be similar to the LDA one. After a closer look, we find that all of these movies are
algorithm. Fig. 9(a)–(c) shows the relationships between the generally recognized as classic movies and they all have won
best value of c with regard to Recall/DOA metrics and the size more than one Oscar award. Another observation is that the
of training data set for iExpand on three benchmark data sets.
In the figure, we can observe that the best c for larger data 3 This information can be obtained in IMDB. URL: http://www.imdb.com/.
LIU et al.: ENHANCING COLLABORATIVE FILTERING BY USER INTEREST EXPANSION 229

TABLE VI
T OP M OVIES IN THE F IRST T HREE L ATENT U SER I NTERESTS

TABLE VII
E XAMPLE OF U SER U140 AND THE C ORRESPONDING R ECOMMENDATION R ESULTS

movie Star Wars is given high probability in both latent interests Tables IV and V and Fig. 7, while this does not mean it will
1 and 3. This verifies that topic models can capture the multiple work for every single user and there may exist users whose
characteristics of each movie, and each characteristic can be interest expansion are different from the majority.
resolved by other movies in the corresponding latent interest.
The aforementioned analysis helps to map each latent interest
into explicit interests. This means that, even for collaborative G. Discussion
filtering, every latent factor still has a real meaning, although In this section, we analyze the advantages and limitations of
the interpretation may not be as easy and precise as that the iExpand method. From the experimental results, we can see
in text applications. Furthermore, this indicates that, in real that there are many key advantages of iExpand. First, iExpand
applications, if we only get several interest information input models the implicit relations between users and items through a
by the new user, we can still find out the possible items that a set of latent user interests. This three-layer representation leads
given user may like by the item–interest relationship described to more accurate ranking recommendation results. Second,
in iExpand model and thus mitigate the cold-start problem. iExpand can save the computational cost by reducing the num-
In the previous sections, we have showed that interest ex- ber of item dimensions. This dimensionality reduction can also
pansion can lead to a better performance than the method of help to alleviate the sparseness problem which is inherent to
only exploiting the current user interests. In the following, we many traditional collaborative-filtering systems. Third, iExpand
will illustrate the difference between these two recommending enables diverse recommendations by the interest expansion.
strategies by a user case. Let us consider the user U140 in This can help to avoid the overspecialization problem. Finally,
the MovieLens data set. The ratings of this user can be well iExpand can deal with the cold-start recommendations. This
divided into two types, thriller and nonthriller. According to means we only need several items or interests input by the new
this classification, we select the thriller movies to be the training user, and then, the corresponding items this user may like can
data and eight of the nonthriller movies to be the test set. Then, be predicted and recommended.
we run the two recommending strategies one by one, and we The main limitation of iExpand lies in its “bag of items”
get two types of recommendations in the end. The results are assumption, where in each user’s rating record, the rating
shown in Table VII. contextual information (e.g., rating time) is totally ignored.
In Table VII, we can see that the top eight recommendations However, Ding et al. [13] demonstrated that the ratings pro-
from the algorithm with interest expansion and the method duced at different times have different impacts on the prediction
without interest expansion are different from each other. First, of future user behaviors. Furthermore, Adomavicius et al. [3]
the method with interest expansion achieves a better result with presented a systematic discussion on the importance of contex-
more correctly predicted movies. Second, the recommendation tual information when providing recommendations. Thus, it is
results from the method with interest expansion are more possible for iExpand to further improve the recommendations
diversified.4 In other words, the interest expansion is more by considering the contextual information, such as time stamp
proper to capture the diversified interests and find potential and the rating orders.
interests for the users. Finally, we would like to point out that
this advantage is meaningful to most of the users which can be
seen from the results of the performance comparisons shown in IV. R ELATED W ORK
In general, related work can be grouped into four cat-
4 We should note that, in this case, we choose the movie genres as the criterion egories. The first category has a focus on the graph-
for diversity, and there may be other appropriate criterions. based collaborative-filtering methods. Here, the graph-based
230 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 42, NO. 1, FEBRUARY 2012

collaborative-filtering methods refer to those approaches which The third category of related work has a focus on solving
use the similarity of graph vertices to make recommendations the overspecialization problem in recommender systems. This
[14], [18], [44], [50], [52]. In these methods, users and items happens when the user is limited to being recommended the
are treated as vertices of a correlation graph and graph theory is items that are “similar” (with respect to content) to those
exploited for characterizing the relationship of user–item pairs. already rated [2]. In other words, at this time, users’ new or
The recommendation list is generated by considering how close latent interests will never be explored. This problem bothers
the candidate items are to a given user. The correlation graph most of the existing recommender systems, particularly for the
may consist of all users [44], all items [18], [50], [52], or both content-based approaches, where many studies have attempted
user and item vertices [14]. to find the solutions, for instance, filtering out the items which
While these graph-based collaborative-filtering methods are too similar to something the user has seen before [6] or
have elegant design ideas, they typically require more memory introducing some kind of serendipity [24].
and have high computational costs due to a large number of Since the overspecialization problem can be somewhat alle-
vertices. Moreover, most of these methods cannot explain why viated by the use of similar user interests, this problem has been
the items are chosen, and they provide limited understanding of largely ignored by most of the collaborative-filtering works.
the interactions among users, items, and user interests. However, some efforts have been dedicated to the solutions of
The second category includes the research work related to this issue. Among them, one possible approach is to consider
topic models, which are based upon the idea that documents are the transitive similarities in item-based collaborative filterings
mixtures of topics, where a topic is a probability distribution [18], [52]. However, directly computing the transitive simi-
over words. Many kinds of topic models have been proposed, larities between items will increase both the space and time
among which PLSA [21] and LDA [8] are most widely used costs. Another approach is to introduce diverse recommenda-
and studied. tions. For instance, Ziegler et al. [55] determined the overall
Before we describe topic models, we first introduce Latent diversity of the collaborative recommendations by introducing
Semantic Index (LSI), which was first proposed as a method the content information. Zhang et al. [54] modeled the goals of
for automatic indexing and retrieval [11]. LSI uses a technique maximizing the diversity of the recommendations while main-
called Singular Value Decomposition (SVD) to find the “latent taining adequate similarity to the user query as an optimization
semantic space” by decomposing the original matrix. LSI/SVD problem, and they applied this technique to an item-based
have been used for making recommendations probably since recommendation algorithm. Furthermore, a survey about some
2000 [28], [39]. Also, many SVD-based rating prediction meth- diversity enhancement algorithms was made in [53]. While the
ods are actually one of the successful competitors for the performances of these systems can be improved by introducing
Netflix prize [15], [27], [28]. These low rank recommenders diversity, most of them suffer from a tradeoff between diversity
usually treat collaborative filtering as a regression problem of and the recommending accuracy. A key reason is that they
user ratings. Although they perform well in rating predictions, neglect the fact that diversity should be made by exploiting
their effectiveness in generating recommendation lists should users’ possible interest expansion instead of randomly choosing
be further explored, since the rating prediction accuracy is not some explicit interests.
always consistent with the ranking accuracy [20], [30]. The fourth category of related work is focused on solving
The following PLSA topic model can be viewed as an the cold-start problem. Cold-start problem will happen when
enhancement of LSI. PLSA has a sound statistical foundation the recommender systems try to give recommendations to the
and has defined a proper generative model of the data [21]. users whose preference are underexplored or try to recommend
Also, PLSA is based on the observation that user preferences the new items whose characteristics are also unclear [2]. Thus,
and item characteristics are governed by a few latent semantics. it can be further classified as the item-side cold-start problem
As a statistical model, PLSA is able to capture the complex [42] and the user-side cold-start problem [29].
dependences among different factors using well-defined proba- For the content-based or the hybrid recommender systems,
bilistic semantics [30]. PLSA has been used both for automatic where there are profile descriptions, this problem can be al-
question recommendation [51] and collaborative filtering [22]. leviated by understanding items or users with such content
While PLSA has been successfully developed, it suffers from information. For instance, to deal with the item-side cold-
the overspecialization problem. start problem, Schein et al. proposed a probabilistic model
Compared with PLSA, the LDA model possesses fully gen- that combines item content and the collaborative information
erative semantics and has also been widely researched [5], [9], for recommendation [42]. To address the user-side cold-start
[47]. LDA is heavily cited in many text-related tasks, such as problem, Lam et al. proposed a User-Info Aspect Model by
finding scientific topics [19] and the information retrieval tasks using information of users, such as age and gender [29].
[49], but its feasibility and effectiveness in collaborative filter- However, for collaborative filtering, where there are no
ing is largely underexplored. Sometimes, topic models were content information, the only way to address the cold-start
only used to reduce the dimensionality of the data [10], [22], problem is to understand both users and items better from
like the function of principal component analysis [17]. In pre- the limited and sparse rating records. For instance, in order
vious topic-model-based collaborative-filtering algorithms, the to improve the recommendation performance under cold-start
correlation between latent factors has never been considered; conditions, Ahn [4] designed a heuristic similarity measure
thus, they easily suffer from the overspecialization problem and based on the minute meanings (i.e., proximity, impact, and
the cold-start problem. popularity) of coratings. Aside from exploring information
LIU et al.: ENHANCING COLLABORATIVE FILTERING BY USER INTEREST EXPANSION 231

from the direct relations among items (i.e., coratings), other sions,” IEEE Trans. Knowl. Data Eng., vol. 17, no. 6, pp. 734–749,
methods consider the indirect similarities. For instance, Huang Jun. 2005.
[3] G. Adomavicius and A. Tuzhilin, “Context-aware recommender systems,”
et al. [23] applied associative retrieval techniques to generate in Recommender Systems Handbook. New York: Springer-Verlag, 2011,
transitive associations in the user–item bipartite graph. In [35], pp. 217–253.
for alleviating the sparsity and the cold-start problems, the [4] H. J. Ahn, “A new similarity measure for collaborative filtering to alleviate
the new user cold-starting problem,” Inf. Sci., vol. 178, no. 1, pp. 37–51,
authors proposed a method using the trust inferences, which are Jan. 2008.
also transitive associations between users. Meantime, similar to [5] A. Asuncion, M. Welling, P. Smyth, and Y. W. Teh, “On
this paper, many random-walk-based similarity methods have smoothing and inference for topic models,” in Proc. Int. Conf. UAI,
2009, pp. 27–34.
been used in [14], [18], and [52]. However, these methods [6] D. Billsus and M. J. Pazzani, “User modeling for adaptive news access,”
consider the relationship between items or user–item pairs, User Model. User-Adapted Interaction, vol. 10, no. 2, pp. 147–180,
rather than the correlation between latent interests. Meanwhile, 2000.
[7] D. M. Blei and J. D. Lafferty, “A correlated topic model of science,” Ann.
as mentioned previously, with the increase of new items, users, Appl. Statist., vol. 1, no. 1, pp. 17–35, 2007.
or rating records, both their space and time costs will rise [8] D. M. Blei, Y. N. Andrew, and I. J. Michael, “Latent Dirichlet allocation,”
rapidly. J. Mach. Learn. Res., vol. 3, pp. 993–1022, 2003.
[9] K. R. Canini, L. Shi, and T. L. Griffiths, “Online inference of topics
with Latent Dirichlet allocation,” in Proc. 12th Int. Conf. AISTATS, 2009,
V. C ONCLUDING R EMARKS vol. 5, pp. 65–72.
[10] W. Chen, J. C. Chu, J. Luan, H. Bai, Y. Wang, and E. Y. Chang, “Collabo-
In this paper, we exploited user latent interests for devel- rative filtering for orkut communities: Discovery of user latent behavior,”
in Proc. 18th Int. Conf. WWW, 2009, pp. 681–690.
oping an item-oriented model-based collaborative framework, [11] S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, and
named iExpand. Specifically, in iExpand, a topic-model-based R. Harshman, “Indexing by latent semantic analysis,” J. Amer. Soc. Inf.
method is first used to capture each user’s interests. Then, Sci., vol. 41, no. 6, pp. 391–407, 1990.
[12] S. Debnath, N. Ganguly, and P. Mitra, “Feature weighting in content based
a personalized ranking strategy is developed for predicting a recommendation system using social network analysis,” in Proc. 17th Int.
user’s possible interest expansion. Moreover, a diverse recom- Conf. WWW, 2008, pp. 1041–1042.
mendation list is generated by using user latent interests as an [13] Y. Ding and X. Li, “Time weight collaborative filtering,” in Proc. 14th
ACM Int. CIKM, 2005, pp. 485–492.
intermediate layer between the user layer and the item layer. [14] F. Fouss, A. Pirotte, J.-M. Renders, and M. Saerens, “Random-walk
There are two key benefits of iExpand. First, the three-layer computation of similarities between nodes of a graph with application to
representation enables a better understanding of the interactions collaborative recommendation,” IEEE Trans. Knowl. Data Eng., vol. 19,
no. 3, pp. 355–369, Mar. 2007.
among users, items, and user interests and leads to more ac- [15] S. Funk, Netflix Update: Try This at Home, 2006. [Online]. Available:
curate ranking recommendation results. Second, since the user http://sifter.org/~simon/journal/20061211.html
interests and the change of the interests have been taken into [16] Y. Ge, H. Xiong, A. Tuzhilin, K. Xiao, M. Gruteser, and M. J. Pazzani,
“An energy-efficient mobile recommender system,” in Proc. 16th ACM
the consideration, iExpand can keep track of these changes and SIGKDD Int. Conf. KDD, 2010, pp. 899–908.
significantly mitigate the overspecialization problem and the [17] K. Goldberg, T. Roeder, D. Gupta, and C. Perkins, “Eigentaste: A con-
cold-start problem. stant time collaborative filtering algorithm,” Inf. Retrieval, vol. 4, no. 2,
pp. 133–151, Jul. 2001.
Finally, an empirical study has been conducted on three [18] M. Gori and A. Pucci, “A random-walk based scoring algorithm applied
benchmark data sets, namely, MovieLens, Book-Crossing, and to recommender engines,” in Proc. 8th Int. Workshop Knowl. Discov. Web
Jester. The corresponding experimental results demonstrate (WebKDD)—Advances in Web Mining and Web Usage Analysis, 2006,
pp. 127–146.
that iExpand can lead to better ranking performances than state- [19] T. L. Griffiths and M. Steyvers, “Finding scientific topics,” Proc. Nat.
of-the-art methods including two graph-based collaborative- Acad. Sci. U.S.A. (PNAS), vol. 101, pp. 5228–5235, 2004.
filtering algorithms and two dimension-reduction-based al- [20] J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl, “Evaluat-
ing collaborative filtering recommender systems,” ACM Trans. Inf. Syst.
gorithms. Due to an intellectual use of dimension-reduction (TOIS), vol. 22, no. 1, pp. 5–53, Jan. 2004.
techniques, iExpand also has low computational cost and is [21] T. Hofmann, “Probabilistic latent semantic analysis,” in Proc. 15th Conf.
highly scalable for a large number of users, items, and rating UAI, 1999, pp. 289–296.
[22] T. Hofmann, “Latent semantic models for collaborative filtering,” ACM
records. In the future, we plan to overcome the limitations of Trans. Inf. Syst. (TOIS), vol. 22, no. 1, pp. 89–115, Jan. 2004.
the current model and extend it to go beyond the usual recom- [23] Z. Huang, H. Chen, and D. Zeng, “Applying associative retrieval
mendations. In particular, we want to refine the iExpand model techniques to alleviate the sparsity problem in collaborative filter-
ing,” ACM Trans. Inf. Syst. (TOIS), vol. 22, no. 1, pp. 116–142,
so as to deal with the context-aware user–interests mining Jan. 2004.
problem. [24] L. Iaquinta, M. de Gemmis, P. Lops, and G. Semeraro, “Introducing
serendipity in a content-based recommender system,” in Proc. HIS, 2008,
pp. 168–173.
ACKNOWLEDGMENT [25] G. Jeh and J. Widom, “Scaling personalized web search,” in Proc. 12th
Int. Conf. WWW, 2003, pp. 271–279.
The authors are would like to thank the anonymous reviewers [26] Y. Koren, “Factorization meets the neighborhood: A multifaceted collab-
for their constructive comments. Q. Liu would like to thank the orative filtering model,” in Proc. 14th ACM SIGKDD Int. Conf. KDD,
2008, pp. 426–434.
China Scholarship Council for their support. [27] Y. Koren, “Collaborative filtering with temporal dynamics,” in Proc. 15th
ACM SIGKDD Int. Conf. KDD, 2009, pp. 447–456.
[28] M. Kurucz, A. A. Benczur, and K. Csalogany, “Methods
R EFERENCES for large scale SVD with missing values,” in Proc. KDDCup, 2007,
[1] Movielens Datasets, 2007. [Online]. Available: http://www.grouplens.org/ pp. 31–38.
node/73#attachments [29] X. N. Lam, T. Vu, T. D. Le, and A. D. Duong, “Addressing cold-start
[2] G. Adomavicius and A. Tuzhilin, “Toward the next generation of rec- problem in recommendation systems,” in Proc. 2nd Int. Conf. Ubiquitous
ommender systems: A survey of the state-of-the-art and possible exten- Inf. Manage. Commun., 2008, pp. 208–211.
232 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 42, NO. 1, FEBRUARY 2012

[30] N. N. Liu, M. Zhao, and Q. Yang, “Probabilistic latent preference Qi Liu received the B.E. degree in computer science
analysis for collaborative filtering,” in Proc. 18th ACM CIKM, 2009, from Qufu Normal University, Shandong, China, in
pp. 759–766. 2007. He is currently working toward the Ph.D.
[31] Q. Liu, E. Chen, H. Xiong, and C. H. Q. Ding, “Exploiting user interests degree from the School of Computer and Technol-
for collaborative filtering: Interests expansion via personalized ranking,” ogy, University of Science and Technology of China,
in Proc. 19th ACM CIKM, 2010, pp. 1697–1700. Hefei, China.
[32] Q. Mei, X. Shen, and C. Zhai, “Automatic labeling of multinomial He is currently supported by the China Scholar-
topic models,” in Proc. 13th ACM SIGKDD Int. Conf. KDD, 2007, ship Council and will stay for a year in Rutgers, The
pp. 490–499. State University of New Jersey, as a Visiting Re-
[33] T. Minka, Estimating a Dirichlet Distribution, 2000. [Online]. search Student in the Data Mining Group. His main
Available: http://research.microsoft.com/en-us/um/people/minka/papers/ research interests include intelligent data analysis,
dirichlet/minka-dirichlet.pdf recommender systems, and Web data mining. During his Ph.D. study, he has
[34] L. Page, S. Brin, R. Motwani, and T. Winograd, “The pagerank citation published several papers in refereed conference proceedings and journals.
ranking: Bringing order to the web,” Comput. Sci. Dept., Stanford Univ.,
Stanford, CA, Tech. Rep. 1999-0120, 1998.
[35] M. Papagelis, D. Plexousakis, and T. Kutsuras, “Alleviating the sparsity
problem of collaborative filtering using trust inferences,” in Proc. Trust Enhong Chen (SM’07) received the Ph.D. degree
Manage., 2005, pp. 224–239. from the University of Science and Technology of
[36] R. Paul, I. Neophytos, S. Mitesh, B. Peter, and R. John, “GroupLens: An China (USTC), Hefei, China.
open architecture for collaborative filtering of netnews,” in Proc. ACM He is a Professor and the Vice Dean of the School
Conf. CSCW, 1994, pp. 175–186. of Computer Science and Technology, USTC. His
[37] X. H. Phan, L. M. Nguyen, and S. Horiguchi, “Learning to classify short general areas of research are data mining, personal-
and sparse text & web with hidden topics from large-scale data collec- ized recommendation systems, and Web information
tions,” in Proc. 17th Int. Conf. WWW, 2008, pp. 91–100. processing. He has published more than 100 papers
[38] R. Salakhutdinov and A. Mnih, “Probabilistic matrix factorization,” in in refereed conferences and journals. His research is
Proc. Adv. Neural Inf. Process. Syst., 2000, pp. 1257–1264. supported by the National Natural Science Founda-
[39] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, “Application of dimen- tion of China, National High Technology Research
sionality reduction in recommender systems—A case study,” in Proc. and Development Program 863 of China, etc. He is the program committee
ACM WebKDD Workshop, 2000, pp. 82–90. member of more than 20 international conferences and workshops.
[40] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, “Item-based collaborative
filtering recommendation algorithms,” in Proc. 10th Int. Conf. WWW,
2001, pp. 285–295.
[41] J. B. Schafer, D. Frankowski, J. Herlocker, and S. Sen, “Collaborative
filtering recommender systems,” in Proc. Adapt. Web, Lecture Notes in Hui Xiong (SM’07) received the B.E. degree
Computer Science, 2007, pp. 291–324. from the University of Science and Technology
[42] A. I. Schein, A. Popescul, L. H. Ungar, and D. M. Pennock, “Methods of China, Hefei, China, the M.S. degree from the
and metrics for cold-start recommendations,” in Proc. 25th Annu. National University of Singapore, Singapore, and
Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval (SIGIR), 2002, the Ph.D. degree from the University of Minnesota,
pp. 253–260. Minneapolis, MN.
[43] S. Sen, J. Vig, and J. Riedl, “Tagommenders: Connecting He is currently an Associate Professor and the
users to items through tags,” in Proc. 18th Int. Conf. WWW, 2009, Vice Department Chair of the Management Science
pp. 671–680. and Information Systems Department, Rutgers Uni-
[44] X. Song, B. L. Tseng, C. Y. Lin, and M. T. Sun, “Personalized versity, NJ. His general area of research is data and
recommendation driven by information flow,” in Proc. 29th Annu. knowledge engineering, with a focus on developing
Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval (SIGIR), 2006, effective and efficient data analysis techniques for emerging data-intensive
pp. 509–516. applications. He has published over 90 technical papers in peer-reviewed
[45] M. Steyvers and T. Griffiths, “Probabilistic topic models,” in Handbook journals and conference proceedings. He is a Coeditor of Clustering and
of Latent Semantic Analysis, vol. 427. Mahwah, NJ: Lawrence Erlbaum Information Retrieval (Kluwer Academic Publishers, 2003) and a Co-Editor-
Associates, 2007, pp. 1–15. in-Chief of Encyclopedia of GIS (Springer, 2008). He is an Associate Editor of
[46] P. Symeonidis, A. Nanopoulos, and Y. Manolopoulos, “Providing justi- the Knowledge and Information Systems Journal and has served regularly in the
fications in recommender systems,” IEEE Trans. Syst., Man, Cybern. A, organization and program committees of a number of international conferences
Syst., Humans, vol. 38, no. 6, pp. 1262–1272, Nov. 2008. and workshops.
[47] H. M. Wallach, I. Murray, R. Salakhutdinov, and D. M. Mimno, “ Dr. Xiong is a senior member of the Association for Computing Machinery
Evaluation methods for topic models,” in Proc. 26th Annu. ICML, 2009, (ACM).
pp. 1105–1112.
[48] H. M. Wallach, “Structured topic models for language,”
Ph.D. dissertation, Univ. Cambridge, Cambridge, U.K., 2008.
[49] X. Wei and W. B. Croft, “LDA-based document models for ad-hoc Chris H. Q. Ding (M’09) received the Ph.D. degree
retrieval,” in Proc. 29th Annu. Int. ACM SIGIR Conf. Res. Develop. Inf. from Columbia University, New York, NY.
Retrieval (SIGIR), 2006, pp. 178–185. He is currently a Professor with the Department
[50] D. T. Wijaya and S. Bressan, “A random walk on the red carpet: Rating of Computer Science and Engineering, University
movies with user reviews and pagerank,” in Proc. 17th ACM CIKM, 2008, of Texas, Arlington (UTA). Prior to joining UTA,
pp. 951–960. he was in the Lawrence Berkeley National Labora-
[51] H. Wu, Y. Wang, and X. Cheng, “Incremental probabilistic latent semantic tory, University of California, Berkeley, and, prior
analysis for automatic question recommendation,” in Proc. ACM Conf. to that, with the California Institute of Technology,
RecSys, 2008, pp. 99–106. Pasadena. His general research areas are machine
[52] H. Yildirim and M. S. Krishnamoorthy, “A random walk method for learning/data mining and bioinformatics. He also
alleviating the sparsity problem in collaborative filtering,” in Proc. ACM works on information retrieval, Web link analysis,
Conf. RecSys, 2008, pp. 131–138. and high-performance computing. His research is supported by National Sci-
[53] M. Zhang, “Enhancing diversity in top-N recommendation,” in Proc. ence Foundation grants and by the University of Texas Regents STARS Award.
ACM Conf. RecSys, 2009, pp. 397–400. He has published over 150 research papers in peer-reviewed journals and
[54] M. Zhang and N. Hurley, “Avoiding monotony: Improving the conference proceedings, and these papers have been cited more than 5000
diversity of recommendation lists,” in Proc. ACM Conf. RecSys, 2008, times. He serves on many program committees of international conferences
pp. 123–130. and gave tutorials on spectral clustering and matrix models. He is an Associate
[55] C. Ziegler, S. M. McNee, J. A. Konstan, and G. Lausen, “Improving Editor of the journal Data Mining and Bioinformatics and is writing a book on
recommendation lists through topic diversification,” in Proc. 14th Int. spectral clustering to be published by Springer.
Conf. WWW, 2005, pp. 22–32. Dr. Ding is a member of the IEEE Computer Society since 2000.
LIU et al.: ENHANCING COLLABORATIVE FILTERING BY USER INTEREST EXPANSION 233

Jian Chen (M’95–SM’96–F’08) received the B.Sc.

degree in electrical engineering and the M.Sc. and
Ph.D. degrees in systems engineering from Tsinghua
University, Beijing, China, in 1983, 1986, and 1989,
respectively.
He is a Professor and the Chairman of the Man-
agement Science Department and the Director of
the Research Center for Contemporary Management,
Tsinghua University. His main research interests in-
clude supply chain management, E-commerce, deci-
sion support systems, and modeling and control of
complex systems. He has published over 100 papers in refereed journals and
has been a principal investigator for over 30 grants or research contracts with
the National Science Foundation of China, governmental organizations, and
companies. He has presented several plenary lectures.
Dr. Chen is a Ministry of Education Changjiang Scholar and the recipient of
the Fudan Management Excellence Award (3rd) and the Science and Technol-
ogy Progress Award from the Beijing Municipal Government; the Outstanding
Contribution Award from the IEEE Systems, Man, and Cybernetics Society; the
Science and Technology Progress Award from the State Educational Commis-
sion; and the Science and Technology Award for Chinese Youth.

Corrosion
100% (4)
Corrosion
11 pages
Looting in Kenya-Kroll Report (Hapa Kenya Version)
100% (7)
Looting in Kenya-Kroll Report (Hapa Kenya Version)
101 pages
Cosmic Journal
No ratings yet
Cosmic Journal
1 page
Recommendation System
No ratings yet
Recommendation System
17 pages
Peng 2013
No ratings yet
Peng 2013
4 pages
IJE - Volume 29 - Issue 6 - Pages 788-796
No ratings yet
IJE - Volume 29 - Issue 6 - Pages 788-796
9 pages
AI Recommendation System
No ratings yet
AI Recommendation System
20 pages
Wang 2007
No ratings yet
Wang 2007
6 pages
Unit Iii Collaborative Filtering
No ratings yet
Unit Iii Collaborative Filtering
51 pages
Module4 RecommenderSystem
No ratings yet
Module4 RecommenderSystem
11 pages
Item-Based Top-N Recommendation Algorithms: Mukund Deshpande and George Karypis University of Minnesota
No ratings yet
Item-Based Top-N Recommendation Algorithms: Mukund Deshpande and George Karypis University of Minnesota
35 pages
Recommender Systems
No ratings yet
Recommender Systems
12 pages
A Survey For Personalized Item Based Recommendation System
No ratings yet
A Survey For Personalized Item Based Recommendation System
3 pages
Book Recommendation System Project
No ratings yet
Book Recommendation System Project
14 pages
CAIM: Cerca I Anàlisi D'informació Massiva: FIB, Grau en Enginyeria Informàtica
No ratings yet
CAIM: Cerca I Anàlisi D'informació Massiva: FIB, Grau en Enginyeria Informàtica
36 pages
Survey Paper On Recommendation Engine
No ratings yet
Survey Paper On Recommendation Engine
9 pages
Ijaret: International Journal of Advanced Research in Engineering and Technology (Ijaret)
No ratings yet
Ijaret: International Journal of Advanced Research in Engineering and Technology (Ijaret)
8 pages
Recommendation Systems: A Review
No ratings yet
Recommendation Systems: A Review
6 pages
Notes On Recommender Systems
No ratings yet
Notes On Recommender Systems
72 pages
Movie Recommendation Report
No ratings yet
Movie Recommendation Report
27 pages
Review of Clustering-Based Recommender Systems
No ratings yet
Review of Clustering-Based Recommender Systems
22 pages
Collaborative Filtering-Based Recommender System: Approaches and Research Challenges
No ratings yet
Collaborative Filtering-Based Recommender System: Approaches and Research Challenges
6 pages
Article 34
No ratings yet
Article 34
8 pages
DM Lect 6 - Recommender Systems
No ratings yet
DM Lect 6 - Recommender Systems
46 pages
10 26599@bdma 2018 9020012
No ratings yet
10 26599@bdma 2018 9020012
9 pages
LITERATURE SURVEY ON RECOMMENDATION ENGINEaper
No ratings yet
LITERATURE SURVEY ON RECOMMENDATION ENGINEaper
9 pages
Recommender Systems Asanov
No ratings yet
Recommender Systems Asanov
7 pages
Book Recommendation System
No ratings yet
Book Recommendation System
8 pages
An Item-Based Collaborative Filtering Method Using Item-Based Hybrid Similarity
No ratings yet
An Item-Based Collaborative Filtering Method Using Item-Based Hybrid Similarity
4 pages
2023 Scopus Kids Hobby Prediction
No ratings yet
2023 Scopus Kids Hobby Prediction
6 pages
Cheng 2016
No ratings yet
Cheng 2016
17 pages
Online Book Recommendation System
100% (1)
Online Book Recommendation System
21 pages
A Research of Job Recommendation System Based On Collaborative Filtering
No ratings yet
A Research of Job Recommendation System Based On Collaborative Filtering
6 pages
Survey On Collaborative Filtering Technique in Recommendation System
No ratings yet
Survey On Collaborative Filtering Technique in Recommendation System
7 pages
Movie Recommender Engine Using Collaborative Filtering: Smart Innovation October 2018
No ratings yet
Movie Recommender Engine Using Collaborative Filtering: Smart Innovation October 2018
9 pages
An Optimized Item-Based Collaborative Filtering Recommendation Algorithm
No ratings yet
An Optimized Item-Based Collaborative Filtering Recommendation Algorithm
5 pages
Web Crawling Based Context Aware Recommender Syste
No ratings yet
Web Crawling Based Context Aware Recommender Syste
25 pages
Reference Paper
No ratings yet
Reference Paper
5 pages
4 - IEEE - DM - Collabrative Filtering User Intrest
No ratings yet
4 - IEEE - DM - Collabrative Filtering User Intrest
1 page
1697mining Web Graphs For Recommendations
No ratings yet
1697mining Web Graphs For Recommendations
12 pages
Journal of Advanced Zoology: Content Based Filtering and Collaborative Filtering: A Comparative Study
No ratings yet
Journal of Advanced Zoology: Content Based Filtering and Collaborative Filtering: A Comparative Study
5 pages
Types of Recommendation Systems
No ratings yet
Types of Recommendation Systems
13 pages
Recommender: An Analysis of Collaborative Filtering Techniques
No ratings yet
Recommender: An Analysis of Collaborative Filtering Techniques
5 pages
Collaborative Filtering Using A Regression-Based Approach: Slobodan Vucetic
No ratings yet
Collaborative Filtering Using A Regression-Based Approach: Slobodan Vucetic
22 pages
Dynmic Trust Based Two Layer
No ratings yet
Dynmic Trust Based Two Layer
10 pages
Mindsight Codex
No ratings yet
Mindsight Codex
87 pages
Guest Editors Introduction Recommender S
No ratings yet
Guest Editors Introduction Recommender S
4 pages
RecSys Updated
No ratings yet
RecSys Updated
37 pages
Unit..1 Rs
No ratings yet
Unit..1 Rs
16 pages
Implementation and Comparison of Recommender Systems Using Various Models
100% (1)
Implementation and Comparison of Recommender Systems Using Various Models
13 pages
N2VSCDNNR: A Local Recommender System
No ratings yet
N2VSCDNNR: A Local Recommender System
11 pages
Machine Learning Based Food Recipe Recommendation System: January 2023
No ratings yet
Machine Learning Based Food Recipe Recommendation System: January 2023
10 pages
Toward The Next Generation of Recommender Systems - A Survey of The State-Of-The-Art and Possible Extensions
No ratings yet
Toward The Next Generation of Recommender Systems - A Survey of The State-Of-The-Art and Possible Extensions
16 pages
Module 5
No ratings yet
Module 5
8 pages
UNIT I - Introduction-Recommender Systems
No ratings yet
UNIT I - Introduction-Recommender Systems
24 pages
An Effective Recommender System Based On
No ratings yet
An Effective Recommender System Based On
17 pages
Paper 23-An Automated Recommender System For Course Selection
No ratings yet
Paper 23-An Automated Recommender System For Course Selection
10 pages
10 Recommender Systems
No ratings yet
10 Recommender Systems
35 pages
3.related Works
No ratings yet
3.related Works
7 pages
Including Item Characteristics in The Probabilistic Latent Semantic Analysis Model For Collaborative Filtering
No ratings yet
Including Item Characteristics in The Probabilistic Latent Semantic Analysis Model For Collaborative Filtering
27 pages
Image Retrieval: Fundamentals and Applications
From Everand
Image Retrieval: Fundamentals and Applications
Fouad Sabry
No ratings yet
Automatic Image Annotation: Fundamentals and Applications
From Everand
Automatic Image Annotation: Fundamentals and Applications
Fouad Sabry
No ratings yet
Automatic Image Annotation: Enhancing Visual Understanding through Automated Tagging
From Everand
Automatic Image Annotation: Enhancing Visual Understanding through Automated Tagging
Fouad Sabry
No ratings yet
Sudhakar An 2017
No ratings yet
Sudhakar An 2017
6 pages
ISSN: 2230-7109 (ONLINE) ISSN: 2230-9543 (PRINT) : Official Acceptance of Research Paper
No ratings yet
ISSN: 2230-7109 (ONLINE) ISSN: 2230-9543 (PRINT) : Official Acceptance of Research Paper
1 page
Index Terms-Hand Gestured Based Wheel Chair Movement Control With Zigbee, Neural
No ratings yet
Index Terms-Hand Gestured Based Wheel Chair Movement Control With Zigbee, Neural
1 page
Machine Learning in Video Surveillance For Fall Detection: Lesya Anishchenko
No ratings yet
Machine Learning in Video Surveillance For Fall Detection: Lesya Anishchenko
4 pages
Water Marking Journal
No ratings yet
Water Marking Journal
7 pages
Standard PDI G102
No ratings yet
Standard PDI G102
8 pages
The Process of Photosynthesis
No ratings yet
The Process of Photosynthesis
2 pages
Mahbubur Rahman Ticket
No ratings yet
Mahbubur Rahman Ticket
1 page
TOS TLE 8 Agricrop For Sharing
No ratings yet
TOS TLE 8 Agricrop For Sharing
2 pages
Chapter 1 Capstone
No ratings yet
Chapter 1 Capstone
2 pages
Full Download The Future of HRD, Volume I: Innovation and Technology Mark Loon PDF
100% (2)
Full Download The Future of HRD, Volume I: Innovation and Technology Mark Loon PDF
76 pages
Tax Problems
No ratings yet
Tax Problems
3 pages
Math 6 March 23 Quarter 3 Speed
No ratings yet
Math 6 March 23 Quarter 3 Speed
34 pages
NS & Tech - Grade 4 - Terminology List - IsiZulu
No ratings yet
NS & Tech - Grade 4 - Terminology List - IsiZulu
11 pages
Besongntor Orockakwa
No ratings yet
Besongntor Orockakwa
37 pages
Diagnostic Report: ENGINE #1 - J1939 Active Fault Codes
No ratings yet
Diagnostic Report: ENGINE #1 - J1939 Active Fault Codes
4 pages
Hardening
No ratings yet
Hardening
7 pages
Updated Resume
No ratings yet
Updated Resume
3 pages
991.20 Nitrogeno Total en Leche - Kjeldahl
No ratings yet
991.20 Nitrogeno Total en Leche - Kjeldahl
2 pages
Shriman Tapasviji Maharaj.
No ratings yet
Shriman Tapasviji Maharaj.
40 pages
XARIOS 400.: Superior Versatility and Reliability For Large-Sized Delivery Vehicles
No ratings yet
XARIOS 400.: Superior Versatility and Reliability For Large-Sized Delivery Vehicles
2 pages
Types of Sensor and Their Application
50% (2)
Types of Sensor and Their Application
6 pages
Name: - Date: - Grade & Section: - Score
No ratings yet
Name: - Date: - Grade & Section: - Score
2 pages
6 Month MCQs (Oct To May 25) English
No ratings yet
6 Month MCQs (Oct To May 25) English
197 pages
MSS 064 Rev.00 Final
No ratings yet
MSS 064 Rev.00 Final
33 pages
14 Hes
No ratings yet
14 Hes
2 pages
Our Annual List of Must-Have Wines.: by The Editors of Wine Enthusiast Magazine
100% (1)
Our Annual List of Must-Have Wines.: by The Editors of Wine Enthusiast Magazine
10 pages
Effect of Feed Rate On The Generation of Surface Roughness in Turning
No ratings yet
Effect of Feed Rate On The Generation of Surface Roughness in Turning
7 pages
THEONE ? Sentence Improvement Pre 4th Oct Level Up Your English
No ratings yet
THEONE ? Sentence Improvement Pre 4th Oct Level Up Your English
145 pages
Briandavidphillips - Core Skills Hypnosis DVD Course
No ratings yet
Briandavidphillips - Core Skills Hypnosis DVD Course
6 pages
The Classification of Stocks With Basic Financial Indicators An Application of Cluster Analysis On The BIST 100 Index
No ratings yet
The Classification of Stocks With Basic Financial Indicators An Application of Cluster Analysis On The BIST 100 Index
29 pages
PJS Damansara Qtr4 2022 - Invoices
No ratings yet
PJS Damansara Qtr4 2022 - Invoices
3 pages
Cooling Tower Motor Type
No ratings yet
Cooling Tower Motor Type
1 page