A Hybrid Approach for Personalized Recommender System Using Weighted TFIDF on RSS Contents

International Journal of Computer Applications Technology and Research
Volume 5, Issue 12, 764-774, 2016, ISSN: 2319-8656

Rebecca A. Okaka, Waweru Mwangi, George Okeyo
SCIT, JKUAT, Nairobi, Kenya
Abstract: Recommender systems are gaining great popularity with the emergence of e-commerce and
social media on the internet. These recommender systems enable users to access products or services that they
would otherwise not be aware of due to the wealth of information on the internet. Two traditional methods
used to develop recommender systems are content-based and collaborative filtering. While both methods
have their strengths, they also have weaknesses, such as sparsity and the new item and new user problems, which lead
to poor recommendation quality. Some of these weaknesses can be overcome by combining two or more
methods to form a hybrid recommender system. This paper deals with issues related to the design and
evaluation of a personalized hybrid recommender system that combines content-based and collaborative
filtering methods to improve the precision of recommendation. Experiments done using the MovieLens dataset
show that the personalized hybrid recommender system outperforms the two traditional methods implemented
separately.
Keywords: recommender systems; collaborative filtering; content-based filtering; hybrid recommender system; vector space model; term frequency inverse document frequency.
1. INTRODUCTION
Changes in information seeking behavior can be observed globally [1]. The rapid increase in blogs and websites has led to information overload, and it has become extremely difficult for users to locate current, relevant information. With only vague ideas of where to get information, users often get lost or feel uncertain when seeking information on their own. This gives rise to the need for systems that can process the existing information on one side and, on the other side, help users by suggesting products, services or articles that match their tastes and preferences. Recommender systems (RS) are promising tools to deal with these issues.

There are many taxonomies of RS. They can be divided according to whether the created recommendation is personalized or non-personalized [2]. Some research distinguishes three main categories of personalized RS: collaborative filtering (CF), content-based filtering (CBF), and hybrid filtering (HF) [3]. Adomavicius and Tuzhilin claim that these three categories are the most popular and significant recommendation methods. However, they pinpoint the shortcomings of these methods when used individually, such as limited content analysis, the new item problem, the new user problem, sparsity and scalability, which lead to poor recommendation quality. They also propose possible improvements, such as combining two or more filtering methods using different hybridization techniques to overcome the challenges of single recommender systems.

In CF, a user gets recommendations of items that he or she has not rated or liked before, but that were already positively rated by users in his or her neighborhood. In CBF, a user gets recommendations of items he or she has not seen or rated but that are similar to the ones he or she rated or liked earlier. HF combines two or more filtering methods to overcome the limitations of each method. According to Tuzhilin et al. [4], the combination of two or more filtering methods proceeds in different ways: creating a unified recommender model that brings all approaches together; utilizing some rules of one approach in a different approach and vice versa; implementing the algorithms separately and then joining the results; or developing one model that applies the characteristics of both methods.

The hybrid approach presented in this paper uses the weighted hybridization technique, which is probably the most straightforward architecture for a hybrid system. Weighted hybridization was successfully used by the winners of the Netflix Prize competition [5]. Our approach implements the algorithms separately and then joins the results. It is based on the idea of merging the predicted ratings computed by the individual recommenders to form a ranked list of items, from which the top items (top k, k=5) are selected and presented to the user as recommendations.

This hybrid approach combines CBF and CF methods. While CBF is able to make predictions on any item, CF only scores an item if there are peer users who have rated it. The combination of the two methods therefore also helps eliminate the new item problem in CF and the new user problem in CBF. The approach adapts the Vector Space Model (VSM) in both CBF and CF, and uses the Term Frequency Inverse Document Frequency (TFIDF) ranking algorithm and the cosine similarity measure to find the relationships among users U, items I and attributes A.

Generally, in a recommender system there exists a large number m of items, $I = \{i_1, i_2, \ldots, i_m\}$, which are described by a set of l attributes, $A = \{a_1, a_2, \ldots, a_l\}$, where each item is described by one attribute or more; a number n of users, $U = \{u_1, u_2, \ldots, u_n\}$; and, for each user u, a set of rated items $IR_u = \{i_1, i_2, \ldots, i_k\}$. For $u \in U$ and $i \in I$, the recommender system predicts the rating $\hat{r}_{u,i}$, called the predicted rating of the user u on the item i, such that $r_{u,i}$ is unknown. From this formulation, the main problem is predicting the rating a user would give an item he or she has not seen, then computing the accuracy of the predicted rating.

The main contribution of this work is that it provides a very straightforward hybrid architecture that can be used to improve recommendation precision as well as provide the most relevant items to users as recommendations. Because two methods are used, content-based and collaborative filtering, the new user and new item problems are eliminated: the new user problem in content-based filtering is eliminated by collaborative filtering, and the new item problem in collaborative filtering is eliminated by content-based filtering. This hybrid approach uses the most widely used and effective information retrieval model, the VSM, and a very simple, efficient ranking algorithm, tfidf.

The rest of this paper is organized as follows: section 2 reviews related work. Section 3 presents the hybrid model and experimental results are presented in section 4. Section 5 presents conclusions and outlines of future research.

2. RELATED WORK
Hybrid recommender systems combine two or more recommender systems. Depending on the hybridization approach, different types of systems can be found [6]. There have been some works on using boosting algorithms for hybrid recommendations [7, 8]. These works attempt to generate new synthetic ratings in order to improve recommendation quality. The personalized hybrid recommender system combines collaborative and content-based information.

Spiegel [9] proposed a framework that combines CBF, CF and demographic information for recommending information sources such as web pages or news articles. The author used home HTML pages to gather demographic information of users. The recommender system is tested on very few users and items, which cannot guarantee the efficiency of the proposed system.
The author also does not give an explanation of how the model is built.

Melville [10] proposed a model in which a content-based algorithm is used to enhance the existing user data and collaborative filtering is then used for rating prediction, but fails to justify how combining both approaches improves prediction accuracy. Another researcher [11] used a number of collaborative filtering algorithms, such as Singular Value Decomposition (SVD), Asymmetric Factor Model and neighborhood based approaches, to build a recommender system. The author shows that linearly combining these algorithms increases the accuracy of prediction, but the use of all these models leads to a significant increase in training time.

Basu et al. [12] use Ripper, a rule induction system, to learn a function that takes a user and a movie and predicts whether the movie will be liked or disliked. They combine collaborative and content information by creating features such as comedies liked by user and users who liked movies of genre X. They however do not show how that approach improved recommendation quality.

Several other hybrid approaches are based on traditional CF, but also maintain a content-based profile for each user. These content-based profiles, rather than co-rated items, are used to find similar users. In Pazzani's approach [13], each user profile is represented by a vector of weighted words derived from positive training examples using the Winnow algorithm. Predictions are made by applying CF directly to the matrix of user profiles as opposed to the user ratings matrix. An alternative approach, Fab [14], uses relevance feedback to simultaneously mold a personal filter along with a communal topic filter. Documents are initially ranked by the topic filter and then sent to a user's personal filter. The user's relevance feedback is used to modify both the personal filter and the originating topic filter. Good et al. [15] use collaborative filtering along with a number of personalized information filtering agents. Predictions for a user are made by applying CF on the set of other users and the active user's personalized agents.

The proposed hybrid approach adapts some interesting features of the above systems, namely the use of collaborative and content information. It however uses the VSM, tfidf and the cosine similarity measure, which are very simple, efficient algorithms that enable item ranking based on weights. Prediction accuracy is computed by getting the deviation of the predicted rating from the actual rating. Other works on hybrid recommender systems can be found in [16].

3. THE HYBRID FILTERING MODEL

[Figure 1. The hybrid filtering model. The Items Database supplies item features and similarities to content-based filtering and user behavior and similarities to collaborative filtering. Each method produces rated results; the ratings are merged into a ranked item list, the top K items are selected, and these are presented to the user as recommendations.]

This section presents the HF model approach. Item database refers to the large amounts of data
available on different domains; the model implements both CBF and CF methods separately. The two methods used (CBF and CF) complement each other and contribute to each other's effectiveness [17]. This hybrid approach uses the VSM on CBF and CF methods, together with tfidf and the cosine similarity measure, to compute relationships among items and users. The CF and CBF methods are used to obtain separate ratings or scores for every item. The multiple ratings for each item are merged into a single value. The items are then ranked by score to form a single list, and the set of items topping the list (e.g. top K, where K equals 5 or 10), that is, those with the highest scores or ratings, is finally presented to the user as recommendations.

3.1. The Vector Space Model
The vector space model [18] (VSM) is a standard algebraic model commonly used in information retrieval (IR). It treats a textual document as a bag of words, disregarding grammar and even word order. It represents both documents and queries by term sets and compares global similarities between documents and queries. The VSM typically uses tfidf (or a variant weighting scheme) to weight the terms. Each document is then represented as a vector of tfidf weights. Queries are also considered as documents. Cosine similarity is used to compute the similarity between document vectors and the query vector. The term frequency $TF_{t,d}$ of term t in document d is defined as the number of times that t occurs in d. Note that:

$$TF_{t,d} = 1 \text{ if } t \text{ exists in } d \tag{1}$$

$$TF_{t,d} = 0 \text{ if } t \text{ does not exist in } d \tag{2}$$

It positively contributes to the relevance of d to t. The inverse document frequency $IDF_t$ of term t measures the rarity of t in a given corpus. If t is rare, then the documents containing t are more relevant to t. $IDF_t$ is obtained by dividing N by $DF_t$ and then taking the logarithm of that quotient, where N is the total number of documents and $DF_t$ is the document frequency of t, that is, the number of documents containing t. Formally:

$$IDF_t = \log_{10}\frac{N}{DF_t} \tag{3}$$

The TFIDF value of a term is commonly defined as the product of its TF and IDF values.

$$TFIDF_{t,d} = TF_{t,d} \times IDF_t \tag{4}$$

The TF-IDF weight W for each term in a document d is given by:

$$W_{t,d} = (1 + \log_{10} TF_{t,d}) \times \log_{10}\frac{N}{DF_t} \tag{5}$$

Generally:

$$W_{t,d} = (1 + \log_{10} TF_{t,d}) \times \log_{10}\frac{N}{DF_t} \quad \text{if } TF_{t,d} > 0 \tag{6}$$

$$W_{t,d} = 0, \quad \text{otherwise} \tag{7}$$

3.1.1 The Vector Space Model in Content-based filtering
Suppose a user profile is denoted by U and item profiles by I. $TF_{i,j}$ is the number of times the term $t_i$ occurs in item $I_j \in I$, and the inverse document frequency of a term $t_i \in I_j \in I$ is calculated as:

$$IDF_i = \log_{10}\frac{I}{DF_i} \tag{8}$$

where $DF_i$ is equal to the number of items containing $t_i$ and I is equal to the total number of items being considered. Therefore:

$$TFIDF = TF_{i,j} \times IDF_i \tag{9}$$

The TFIDF of each term is then calculated, and the vectors of the user profile and the item profiles are constructed based on their included terms. These vectors have the same length, so the similarity of these profiles can be calculated as:

$$Sim(U, I) = \frac{U \cdot I}{|U||I|} = \frac{\sum_{t} tfidf_{t,U}\, tfidf_{t,I}}{\sqrt{\sum_{t} tfidf_{t,U}^2}\,\sqrt{\sum_{t} tfidf_{t,I}^2}} \tag{10}$$

The resulting similarity should range from 0 to 1. If Sim(U,I) = 0, then the two profiles are independent, and if Sim(U,I) > 0, the profiles have some similarity. Information about a set of items
with similar rating patterns compared to the item under consideration is the basis for predicting the rating a user $U_i$ would give the item. The prediction formula is:

$$Pred(U_i, I_a) = \frac{\sum similarity(U_i, I_b)\, r_{U_i,I_a}}{\sum similarity(U_i, I_b)} \tag{11}$$

Normally, the predicted rating of a user u for an item i in CBF is the average rating of the user on items viewed; therefore equation 11 can also be written as:

$$\hat{r}_{u,i}|_{CBF} = \frac{\sum similarity(U_i, I_b)\, r_{U_i,I_a}}{\sum similarity(U_i, I_b)} \tag{12}$$

$$\hat{r}_{u,i}|_{CBF} = \bar{r}_{u,i} \tag{13}$$

where $\bar{r}_{u,i}$ is the average rating of $U_i$ on items already viewed, and $\hat{r}_{u,i}|_{CBF}$ is the predicted rating of a user on an item in CBF.

3.1.2 The Vector Space Model in Collaborative filtering
The user profiles are represented as both documents and queries in an n-dimensional matrix. The weight for each term t in a user profile p is given by $W_{i,j} = TF_{i,j} \times IDF_i$, which can also be written as:

$$W_{i,j} = TF_{i,j} \times \log_{10}\frac{P}{p_i} \tag{14}$$

$$IDF_i = \log_{10}\frac{P}{p_i} \tag{15}$$

where $TF_{i,j}$ is the frequency of a term t in a profile p, P is the total number of profiles, $p_i$ is the total number of profiles containing term t, and $W_{i,j}$ is the weight of the ith term in a profile j. The similarity between user $U_i$ and user $U_j$ is calculated using the cosine similarity measure. The equation for calculating the similarity is as follows:

$$Sim(U_i, U_j) = \frac{U_i \cdot U_j}{|U_i||U_j|} = \frac{\sum_{k=1}^{n} tfidf_{k,i}\, tfidf_{k,j}}{\sqrt{\sum_{k=1}^{n} tfidf_{k,i}^2}\,\sqrt{\sum_{k=1}^{n} tfidf_{k,j}^2}} \tag{16}$$

Again, the resulting similarity should range from 0 to 1. If $Sim(U_i,U_j) = 0$, then the two users are independent, and if $Sim(U_i,U_j) = 1$, the users are similar. The information about a set of users with a similar rating behavior compared to the current user is the basis for predicting the rating a user $U_i$ would give an item he or she has not rated. Based on the nearest neighbor of user $U_i$ it is easy to determine the prediction of user $U_i$.

$$Pred(U_i, I) = \bar{r}_i + \frac{\sum similarity(U_i, U_j)\,(r_{j,item} - \bar{r}_j)}{\sum similarity(U_i, U_j)} \tag{17}$$

where $U_j$ is the nearest neighbor of $U_i$, $\bar{r}_i$ is the average rating of $U_i$, $r_{j,item}$ is the rating of $U_j$ on the given item and $\bar{r}_j$ is the average rating of $U_j$. Also, given that the predicted rating of a user u on an item i in CF is written $\hat{r}_{u,i}|_{CF}$, equation 17 can be written as:

$$\hat{r}_{u,i}|_{CF} = \bar{r}_i + \frac{\sum similarity(U_i, U_j)\,(r_{j,item} - \bar{r}_j)}{\sum similarity(U_i, U_j)} \tag{18}$$

3.2 Hybridization Process

Table 1. Extended user-item, user-user matrix
                              Item                      User profile-Attribute tf-idf   User-User cosine similarity
                              i1    i2    i3    im      a1    a2    a3    al            u1    u2    u3    un
User                    u1    -     -     4     3       0.04  0     0     0             1     0.1   0     0.2
                        u2    4     2     -     5       0     0.01  0.02  0.02          0.1   1     0.1   0
                        u3    -     -     3     -       0.04  0     0     0.02          0     0.1   1     0
                        ...
                        un    -     3     -     -       0.04  0.01  0.02  0.02          0.2   0     0     1
Item-Attribute tf-idf   a1    0.0   0.0   0.0   0.0
                        a2    0.0   0.01  0.0   0.01
                        a3    0.02  0.02  0.0   0.0
                        ...
                        al    0.02  0.0   0.0   0.0
User-Item cosine        u1    0.3   0.2   0.1   0.1
similarity              u2    0.3   0.0   0.0   0.0
                        u3    0.1   0.5   0.3   0.4
                        ...
                        un    0.0   0.2   0.4   0.2

Legend: a numeric entry in the Item columns of the User rows marks a rated item; "-" marks an unrated item.
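The tf-idf weights, cosine similarities and neighbour-based prediction that underlie Table 1 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the genre-term profiles are hypothetical, the ratings and user-user similarities are the sample values shown in Table 1, and summing over every user who rated the item (rather than a single nearest neighbour) is an assumption about how Eq. 17 is applied.

```python
import math
from collections import Counter


def tfidf_vectors(profiles):
    """Weight terms with (1 + log10 TF) * log10(N / DF), following Eq. 5."""
    n = len(profiles)
    # DF: number of profiles that contain each term at least once.
    df = Counter(term for terms in profiles for term in set(terms))
    return [
        {t: (1 + math.log10(c)) * math.log10(n / df[t])
         for t, c in Counter(terms).items()}
        for terms in profiles
    ]


def cosine(u, v):
    """Cosine similarity of two sparse weight vectors (Eq. 10 and Eq. 16)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty profile is treated as similar to nothing
    return dot / (norm_u * norm_v)


def _mean(rated):
    """Average of the ratings one user has given."""
    return sum(rated.values()) / len(rated)


def predict_cf(user, item, ratings, sims):
    """Mean-centred neighbour prediction in the form of Eq. 17."""
    num = den = 0.0
    for other, rated in ratings.items():
        if other != user and item in rated:
            s = sims[user][other]
            num += s * (rated[item] - _mean(rated))
            den += s
    base = _mean(ratings[user])
    # Fall back to the user's own average when no neighbour rated the item.
    return base + num / den if den else base


# Hypothetical term profiles (not taken from MovieLens).
profiles = [["comedy", "comedy", "romance"], ["comedy", "drama"], ["thriller", "drama"]]
vecs = tfidf_vectors(profiles)
sim_01 = cosine(vecs[0], vecs[1])  # profiles share the term "comedy"
sim_02 = cosine(vecs[0], vecs[2])  # profiles share no term

# Sample ratings and user-user similarities as shown in Table 1.
ratings = {
    "u1": {"i3": 4, "im": 3},
    "u2": {"i1": 4, "i2": 2, "im": 5},
    "u3": {"i3": 3},
    "un": {"i2": 3},
}
sims = {"u1": {"u2": 0.1, "u3": 0.0, "un": 0.2}}
pred_u1_i2 = predict_cf("u1", "i2", ratings, sims)
```

Sparse dictionaries are used here so that terms absent from a profile carry an implicit weight of 0, which matches Eq. 7 without storing zeros.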
As stated earlier, the HF model combines CBF and CF, which use the user-item matrix and the user-user matrix respectively. Table 1 shows the model matrix with sample tfidf and cosine similarity scores among users and items. This model is based on the idea of deriving recommendation items by combining the predictions computed by the individual recommenders, CBF (Eq. 13) and CF (Eq. 18). Here the separate scores of the individual recommenders on an item $i \in I$ recommended to a user $u \in U$ are merged into a single unit.

To take into account the difference in the contribution of each predictor to the final rating prediction, each predictor is assigned a parameter, such that the resulting rating prediction $\hat{r}_{u,i}|_{HF}$ of a user u on an item i from HF is computed as follows:

$$\hat{r}_{u,i}|_{HF} = \alpha\, \hat{r}_{u,i}|_{CBF} + \beta\, \hat{r}_{u,i}|_{CF} \tag{19}$$

where $\hat{r}_{u,i}|_{CBF}$ and $\hat{r}_{u,i}|_{CF}$ are the predicted ratings of an item $i \in I$ for user $u \in U$ in CBF and CF respectively.

To compute the value for each parameter, a function S(n) that gives the weight of a user's rating count n ($n = |IR_u|$) is used. The sigmoid function satisfies these constraints for S(n). The parameters $\alpha$ and $\beta$ can be computed using the sigmoid function as follows:

$$\alpha = \frac{1}{1 + e^{-n}} \tag{20}$$

$$\beta = 1 - \frac{1}{1 + e^{-n}} \tag{21}$$
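The weighted merge and top-k selection can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the authors' code: the first weight is read as a sigmoid of the user's rating count n applied to the CBF prediction and the second as its complement applied to the CF prediction, the `scale` parameter is a hypothetical knob that does not appear in the paper, and the predicted ratings are made-up values.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def hybrid_weights(n_rated, scale=1.0):
    """Return (alpha, beta) with alpha = S(n) and beta = 1 - alpha.

    `scale` is a hypothetical parameter, not from the paper; scale=1.0
    applies the sigmoid to the raw rating count n = |IR_u|.
    """
    alpha = sigmoid(n_rated / scale)
    return alpha, 1.0 - alpha


def hybrid_top_k(cbf_pred, cf_pred, n_rated, k=5, scale=1.0):
    """Merge per-item predictions as a weighted sum and return the k best."""
    alpha, beta = hybrid_weights(n_rated, scale)
    merged = {
        item: alpha * cbf_pred[item] + beta * cf_pred[item]
        for item in cbf_pred.keys() & cf_pred.keys()
    }
    # Rank by merged score, highest first; ties are broken by item id.
    ranked = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


# Hypothetical predicted ratings for one user on four items.
cbf = {"i1": 3.0, "i2": 4.0, "i3": 2.0, "i4": 5.0}
cf = {"i1": 5.0, "i2": 3.0, "i3": 2.0, "i4": 1.0}

# With n = 0 the sigmoid gives both predictors an equal weight of 0.5.
top2 = hybrid_top_k(cbf, cf, n_rated=0, k=2)
```

Because the two weights always sum to 1, the merged score stays on the same rating scale as the individual predictions, so the ranked list can be compared directly with actual ratings.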
The parameters $\alpha$ and $\beta$ represent the weight (confidence level) given to CBF and CF respectively. The rating predictions from the hybrid approach are ranked by prediction score, and the top-scoring items in the ranked list (the top k items) are selected and provided to the user as recommendations.

4. HYBRID DESIGN
4.1 System Physical Architecture
Figure 2 below shows the physical architecture of the proposed hybrid recommender system; it shows a set of simpler systems, each with its own local context that is independent of but not inconsistent with the context of the larger system as a whole. Both servers could still be physically implemented in a single network node.

[Figure 2. The Physical architecture. Nodes shown: User Workstation, Web Application Server, Hybrid Recommender Engine Server, and two Database nodes.]

4.2 System Component Diagram
Figure 3 shows a simple component viewpoint of the Hybrid Recommender system. The Hybrid Recommender module, while calculating the recommendation, uses the data stored in the Database module via a RESTful API (Representational State Transfer Application Program Interface). The Hybrid Recommender module executes the methods in the background. It is connected to the User Interface module via the RecommendationRetrieval interface, which enables the resulting recommendations to be shown to the user.

[Figure 3. The Component Diagram]

4.3 System Activity Diagram
The following activity diagram shows the flow of events within the proposed hybrid approach. It shows how the user interacts with the system.
[Figure 4. The Activity Diagram]

5. EXPERIMENTS AND RESULTS
5.1 Dataset
The MovieLens (http://www.grouplense.org) 100k dataset was used. This data was collected by the GroupLens Research Project at the University of Minnesota during a seven-month period between 19th September 1997 and 22nd April 1998. MovieLens is used mainly because it is publicly available and has been used in many hybrid recommender systems, and is therefore considered a good benchmark for this purpose. The dataset contains 943 users, 1682 movie items and 100000 ratings. Each user rates a minimum of 20 movies using integer values 1 to 5, and not all movies are rated by all users. There are 19 movie genres, and a movie can belong to more than one genre. A binary value of 1 or 0 is used to indicate whether a movie belongs to a specific genre or not. The dataset is split into 5 subsets, each having 80% training and 20% test sets.

5.2 Evaluation metrics
The evaluation was done using a prediction accuracy metric, Mean Absolute Error (MAE), which represents how accurately an RS estimates a user's preference for an item. MAE is calculated by averaging the absolute deviation between a user's predicted score and actual score. The smaller the MAE, the more precise the RS.

$$MAE = \frac{1}{n}\sum_{i=1}^{n} |s_i - p_i| \tag{22}$$

where n is the total number of items, i is the current item, $s_i$ is the actual score a user expressed for item i, and $p_i$ is the RS's predicted score for the user on item i.

In this experiment, 5-fold cross validation was performed on sub datasets 1 to 5 provided by the MovieLens 100k dataset, with 80% training data and 20% test data in each sub dataset. The experiment compares the results of the hybrid approach to CBF and CF methods implemented separately.

5.3 Results
Even though the 5 sub datasets used have almost the same number of users and items, they have different rating patterns; therefore a standard number of users and items was used for the experiment across all the datasets. Results presented here are the average MAE across all the sub datasets given the specified number of users and items.
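The MAE of Eq. 22 reduces to a few lines of Python. The sketch below is only illustrative; the actual and predicted scores are hypothetical values chosen so the arithmetic is easy to follow, not results from the experiments.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute deviation between actual and predicted scores (Eq. 22)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one score is required")
    return sum(abs(s - p) for s, p in zip(actual, predicted)) / len(actual)


# Hypothetical scores for four test items of one user.
actual = [4, 3, 5, 2]
predicted = [3.5, 3.0, 4.0, 2.5]

# Absolute deviations are 0.5, 0.0, 1.0 and 0.5, so the MAE is 2.0 / 4 = 0.5.
mae = mean_absolute_error(actual, predicted)
```

In a 5-fold setting such as the one described above, this function would be applied to the test split of each sub dataset and the five values averaged.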
Table 2. Average MAE given 100 items

              Filtering Methods
No of Users   CF       CBF      HF
100           0.3686   0.3828   0.3433
350           0.3374   0.3632   0.3162
500           0.3398   0.3659   0.3161
800           0.3258   0.3555   0.3081

[Figure 5. MAE given 100 items. MAE (axis from 0 to 0.4) against Number of Users (100, 350, 500, 800) for CF, CBF and HF.]

Table 3. MAE given 500 items

              Filtering Methods
No of Users   CF       CBF      HF
100           0.3396   0.3588   0.3043
350           0.3110   0.3560   0.2998
500           0.3016   0.3544   0.2954
800           0.2971   0.3519   0.2953

[Figure 6. MAE given 500 items. MAE (axis from 0 to 0.4) against Number of Users (100, 350, 500, 800) for CF, CBF and HF.]

Table 4. MAE given 700 items

              Filtering Methods
No of Users   CF       CBF      HF
100           0.3326   0.3554   0.3029
350           0.3324   0.3690   0.3167
500           0.3203   0.3676   0.3122
800           0.3020   0.3564   0.2986

[Figure 7. MAE given 700 items. MAE (axis from 0 to 0.4) against Number of Users (100, 350, 500, 800) for CF, CBF and HF.]
Table 5. MAE given 1200 items

              Filtering Methods
No of Users   CF       CBF      HF
100           0.3374   0.3539   0.305
350           0.3386   0.3859   0.3245
500           0.3294   0.3745   0.3145
800           0.2997   0.3442   0.2883

[Figure 8. MAE given 1200 items. MAE (axis from 0 to 0.4) against Number of Users (100, 350, 500, 800) for CF, CBF and HF.]

Collaborative filtering contributes greatly to the results of this approach, more so where there are large numbers of items; its performance becomes better with an increasing number of items and users, but it does not perform as well with a large number of users and a small number of items. On the other hand, content-based filtering does not make much contribution to this approach; its performance worsens as the number of items increases, and its prediction is worse in cases where there are small numbers of users and items.

However, across all of the evaluations, results show that the hybrid filtering model achieves better prediction accuracy than each of the traditional filtering methods implemented separately. Collaborative filtering and content-based filtering perform on average 6% and 17% worse than the hybrid approach respectively; the hybrid approach achieves an average MAE of 0.3084, whereas collaborative and content-based filtering achieve 0.3258 and 0.3622 respectively.

6. CONCLUSIONS AND FURTHER WORK
In this paper a hybrid approach that combines content-based and collaborative filtering methods has been used to improve recommendation accuracy. Both methods use the effective information retrieval model, the VSM, a very simple, efficient ranking algorithm, TFIDF, and the cosine similarity measure to find the relationships among users, items and attributes. The evaluation of the proposed hybrid model using real data has shown that it achieves better prediction accuracy compared to a single content-based and a single collaborative recommender system. Because of this good performance, this hybrid recommendation approach and the information retrieval methods can be adapted in different domains for recommendation purposes.

Possible future work related to this study is, first, to test the efficiency of this approach on other, larger datasets and, secondly, to explore the possibilities of experimenting with other variants of tfidf, similarity measures and the vector space model to see how well they perform in this kind of hybrid recommender environment.

7. REFERENCES
[1] Gavgani V.Z. Health Information Need and Seeking Behavior of Patients in Developing Countries' Context; an Iranian Experience, Proceedings of the 1st ACM International Health Informatics Symposium, 2010, paper 1112, pp. 575-579.
[2] Kazienko P., Kołodziejski P. 2005. WindOwls - Adaptive Systems for the Integration of Recommendation Methods in E-commerce, Springer Verlag, 218-224.
[3] Adomavicius G., Tuzhilin A. 2005. Toward the next generation of recommender systems:
A survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. (2005); 17:734-749.
[4] Tuzhilin A., Adomavicius G., 2005. Towards the next generation of recommender systems. A survey of the state of the art and possible extensions. IEEE Trans Knowl Data Eng, 17:734-749.
[5] Bell R., Koren Y., and Volinsky Ch. Chasing $1,000,000: How we won the Netflix Progress Prize. ASA Statistical and Computing Graphics Newsletter, 18(2):4-12, 2007.
[6] Burke R. Hybrid recommender systems: survey and experiments, User Modeling and User-Adapted Interaction 12 (4) (2002) 331-370.
[7] Melville P., Mooney R., Nagarajan R. Content-boosted collaborative filtering for improved recommendations, in: 18th National Conference on Artificial Intelligence (AAAI-02), 2002, pp. 187-192.
[8] Park S. T., Pennock D., Madani O., Good N., DeCoste D. Naïve filterbots for robust cold start recommendations, in: KDD '06: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2006, pp. 669-705.
[9] Spiegel S., Kunegis J., Li F. Hydra: a hybrid recommender system [cross-linked rating and content information] in CIKM-CNIKM, 2009, pp. 75-80.
[10] Pazzani M. J. A framework for collaborative, content-based and demographic filtering, Artif. Intell. Rev., vol. 13, no. 5-6, 1999, pp. 393-408.
[11] Melville P., Mooney R. J., Nagarajan R. Content-boosted collaborative filtering for improved recommendation, in Proceedings of AAAI/IAAI, 2002, pp. 187-193.
[12] Basu C., Hirsh H., Cohen C. Recommendation as classification: Using social and content-based information in recommendation. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI-98), 1998, pp. 714-720.
[13] Pazzani A., Michael J. A framework for collaborative, content-based and demographic filtering. Artificial Intelligence Review, 1999, 13(5-6):393-408.
[14] Marko B., Yoav S. Fab: Content-based, collaborative recommendation. Communications of the Association for Computing Machinery, 1997, 40(3):66-72.
[15] Good N., Schafer J. B., Konstan J.A., Borchers A., Sarwar B., Herlocker J., Riedl J. Combining collaborative filtering with personal agents for better recommendations. In Proceedings of the Sixteenth National Conference on Artificial Intelligence (AAAI-99), 1999, pp. 439-446.
[16] Jahrer M., Töscher A., Legenstein R. Combining predictions for accurate recommender systems, in Proceedings of the SIGKDD conference. New York, NY, USA: ACM 2010, pp. 693-702.
[17] Burke R. D. Hybrid recommender systems: survey and experiments, User Model. User-Adapt. Interact., vol. 12, no. 4, 2002, pp. 331-370.
[18] Montaner M., Lopez B. and De la Rosa J.L. 2003. A Taxonomy of Recommender Agents on the Internet, Artificial Intelligence Review, Kluwer Academic Publisher, 19, 285-330.
