An Item Based Collaborative Filtering Recommendation Algorithm Using Rough Set Prediction

This paper presents an item-based collaborative filtering recommendation algorithm that utilizes rough set theory to address the sparsity problem in user ratings, which negatively impacts the accuracy of predictions in recommender systems. The proposed method fills vacant ratings in the user-item matrix and employs collaborative filtering to generate recommendations, demonstrating improved accuracy compared to traditional user-based collaborative filtering methods. Experiments conducted on the MovieLens dataset show that the combination of rough set theory and item-based filtering yields better decision-support accuracy metrics.

2009 International Joint Conference on Artificial Intelligence

An Item Based Collaborative Filtering Recommendation Algorithm Using Rough Set Prediction

Ping SU
Zhejiang Business Technology Institute, Ningbo 315012, P. R. China
e-mail: [email protected]

HongWu YE
Zhejiang Textile & Fashion College, Ningbo 315211, P. R. China
e-mail: [email protected]

978-0-7695-3615-6/09 $25.00 © 2009 IEEE
DOI 10.1109/JCAI.2009.155

Abstract: Recommender systems are personalized services that aim to predict users' interest in the information items available in an application domain. Collaborative filtering has proved to be one of the most successful techniques in recommender systems in recent years. Poor recommendation quality is a major challenge for collaborative filtering recommender systems, and the sparsity of users' ratings is its main cause. To solve this problem, this paper proposes an item based collaborative filtering recommendation algorithm using rough set theory prediction. The method employs rough set theory to fill the vacant ratings of the user-item matrix where necessary, and then uses item based collaborative filtering to produce the recommendations. Experiments were carried out on a common data set using different filtering algorithms. The results show that the proposed algorithm, which combines rough set theory with item based collaborative filtering, can improve the accuracy of a collaborative filtering recommendation system.

Keywords: recommender system; item based collaborative filtering; rough set; sparsity

I. INTRODUCTION
While the rapid growth and wide application of the Internet and information technology have provided an unprecedented abundance of information resources, they have also led to the problem of information overload. Methods that help users find resources of interest have therefore attracted much attention from both researchers and vendors. In dealing with this problem, personalized recommendation systems play an increasingly important role [1,2].

Recommender systems are particularly important in electronic commerce, where they serve as a new marketing strategy. Although a wide variety of recommendation techniques have been developed recently, collaborative filtering (CF) is known as one of the most successful and has been used in a number of different applications, such as recommending web pages, movies, tapes and products. CF assumes that a good way to find content that interests a certain user is to find other people who have similar interests. CF methods operate on user ratings of observed items to make predictions about users' interest in unobserved items. However, in most real-world applications, the ratio of rated items to the total number of available items is very low. The absence of a sufficient number of ratings significantly affects CF methods and reduces the accuracy of prediction. The sparsity problem is particularly important in domains with a large or continuously updated list of items as well as a large number of users. It may occur when no ratings or only a few ratings are available for the target user, for the target item that the prediction refers to, or for the entire database on average. Different treatments are required and different prediction techniques must be employed depending on the sparsity conditions, which makes the selection of an appropriate approach a cumbersome task [3,4]. Current CF approaches are limited in the sense that they address only specific aspects of this problem.

To solve this problem, in this paper we propose an item based collaborative filtering recommendation algorithm using rough set theory prediction. The method employs rough set theory to fill the vacant ratings of the user-item matrix where necessary, and then uses item based collaborative filtering to produce the recommendations. Experiments were carried out on a common data set using different filtering algorithms. The results show that the proposed algorithm, which combines rough set theory with item based collaborative filtering, can improve the accuracy of a collaborative filtering recommendation system.

II. USING ROUGH SET THEORY FOR PREDICTION WHERE NECESSARY
In our proposed approach, we employ rough set theory [5,6,7] to predict the vacant ratings where necessary.

A. Basic definition
Definition 1. $S = (U, A, \{V_a\}, a)$ is an information system, where $U$ is a non-empty finite set called the universe of discourse and $A$ is a non-empty finite set called the attribute set. $V_a$ is the value domain of attribute $a \in A$, and $a: U \to V_a$ is a mapping that assigns to every element of $U$ a unique value of attribute $a$ from $V_a$. If $A$ is composed of a condition attribute set $C$ and a decision attribute set $D$ such that $C \cup D = A$ and $C \cap D = \emptyset$, then $S$ is a decision making system. To put it
simply, $(U, C \cup \{d\})$ can be used to denote a decision making system.

Definition 2. For a knowledge representation system $S = (U, A, \{V_a\}, a)$, suppose $R \subseteq A$ and $X \subseteq U$. Then
\[ POS_R(X) = \underline{R}X, \qquad NEG_R(X) = U - \overline{R}X, \qquad BN_R(X) = \overline{R}X - \underline{R}X \]
are called the positive region, the negative region and the boundary region of $X$ with respect to $R$, respectively.

Definition 3.
\[ \mu_B(x, X) = \frac{\mathrm{card}([x]_B \cap X)}{\mathrm{card}([x]_B)} \]
is the reliance degree of element $x$ on set $X$, where card denotes the cardinality of a set.

Definition 4. Let $R$ be a family of equivalence relations and $r \in R$. When $ind(R) = ind(R - \{r\})$, $r$ is omissible in $R$; otherwise $r$ is not omissible. The set of all non-omissible relations in $R$ is called the core of $R$, denoted $CORE(R)$. Therefore $CORE(R) = \bigcap RED(R)$, where $RED(R)$ is the family of all reducts of $R$.

Definition 5. For an information system $S = (U, A, V, f)$, define the matrix $M = (M(i, j))_{n \times n}$ by
\[ M(i, j) = \{ a_k \mid a_k(x_i) \neq a_k(x_j) \wedge a_k(x_i) \neq * \wedge a_k(x_j) \neq * \} \]
where $*$ denotes a missing value.

Definition 6. For an information system $S = (U, A, V, f)$, define the missing attribute set of object $x_i$ as
\[ MAS_i = \{ a_k \mid a_k(x_i) = *,\ k = 1, 2, \ldots, p \}, \]
the nearest set of object $x_i$ as
\[ NS_i = \{ x_j \mid M(i, j) = \emptyset,\ i \neq j,\ j = 1, 2, \ldots, n \}, \]
the missing object set of the information system $S$ as
\[ MOS = \{ x_i \mid MAS_i \neq \emptyset,\ i = 1, 2, \ldots, n \}, \]
and the set $LS_i$ of object $x_i$ as
\[ LS_i = \{ x_j \mid P(i, j) = \max_{x_k \in NS_i} P(i, k),\ x_j \in NS_i,\ i \neq j,\ j = 1, 2, \ldots, n \}, \]
where $P(i, j)$ is the similarity of $x_i$ and $x_j$ on the attribute set $A$.

B. Prediction where necessary
Based on the definitions above, each object that has missing (null) attribute values is filled in from its most similar objects, that is, from the objects best able to supply those values. In this way a complete information system is formed. The specific algorithm is as follows:

Input: vacant user rating matrix
Output: complete user rating matrix
Begin
    calculate the discernibility matrix M^0 and the sets MAS_i^0, MOS^0
    n = 0
    while (S^(n+1) ≠ S^n)
        for each x_i in MOS^n
            calculate LS_i^n
        for each x_i not in MOS^n
            for (k = 1; k < p + 1; k++)
                a_k(x_i^(n+1)) = a_k(x_i^n)
        for each x_i in MOS^n
            for (k = 1; k < p + 1; k++)
                if (|LS_i^n| == 1)
                    if (x_j in LS_i^n) a_k(x_i^(n+1)) = a_k(x_i^n)
                else
                    if ((x_i, x_j in LS_i^n) ∧ (a_k(x_i^n) ≠ a_k(x_j^n)) ∧ (a_k(x_i^n) ≠ *) ∧ (a_k(x_j^n) ≠ *))
                        a_k(x_i^(n+1)) = *
                    else a_k(x_i^(n+1)) = a_k(x_i^n)
            end for
        n = n + 1
    end while
    fill the remaining empty values with the value of maximum frequency
End

III. USING ITEM BASED COLLABORATIVE FILTERING TO PRODUCE RECOMMENDATIONS

A. Measuring the item rating similarity
Several similarity measures have been used [2,3,4]: Pearson correlation, cosine vector similarity, adjusted cosine vector similarity, mean-squared difference and Spearman correlation.

Pearson's correlation, given by the following formula, measures the linear correlation between the rating vectors of the target item t and the remaining item r.
\[ sim(t, r) = \frac{\sum_{i=1}^{m} (R_{it} - A_t)(R_{ir} - A_r)}{\sqrt{\sum_{i=1}^{m} (R_{it} - A_t)^2}\,\sqrt{\sum_{i=1}^{m} (R_{ir} - A_r)^2}} \]
where $R_{it}$ is the rating of the target item t by user i, $R_{ir}$ is the rating of the remaining item r by user i, $A_t$ is the average rating of the target item t over all the co-rated users, $A_r$ is the average rating of the remaining item r over all the co-rated users, and m is the number of users who rated both item t and item r.

The cosine measure, given by the following formula, looks at the angle between the rating vectors of the target item t and the remaining item r.
\[ sim(t, r) = \frac{\sum_{i=1}^{m} R_{it} R_{ir}}{\sqrt{\sum_{i=1}^{m} R_{it}^2}\,\sqrt{\sum_{i=1}^{m} R_{ir}^2}} \]
where $R_{it}$ is the rating of the target item t by user i, $R_{ir}$ is the rating of the remaining item r by user i, and m is the number of users who rated both item t and item r.

The adjusted cosine, given by the following formula, is used for similarity among items where the difference in each user's use of the rating scale is taken into account.
\[ sim(t, r) = \frac{\sum_{i=1}^{m} (R_{it} - A_i)(R_{ir} - A_i)}{\sqrt{\sum_{i=1}^{m} (R_{it} - A_i)^2}\,\sqrt{\sum_{i=1}^{m} (R_{ir} - A_i)^2}} \]
where $R_{it}$ is the rating of the target item t by user i, $R_{ir}$ is the rating of the remaining item r by user i, $A_i$ is the average rating of user i over all the co-rated items, and m is the number of users who rated both item t and item r.
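The three item similarity measures of Section III.A can be sketched in Python as below. This is a minimal illustration rather than the authors' implementation: the toy matrix `RATINGS`, the function names, and the use of `None` for a vacant rating are all assumptions made for the example. In particular, `user_mean` takes $A_i$ as the mean over every item the user has rated, which is one possible reading of the definition above.

```python
from math import sqrt

# Toy user-item matrix (illustrative data): rows are users, columns are items,
# None marks a vacant rating.
RATINGS = [
    [5, 1, 4],
    [4, 2, None],
    [3, 3, 2],
    [2, 4, 5],
    [1, 5, 3],
]

def co_rated(ratings, t, r):
    """Row indices of the users who rated both item t and item r."""
    return [i for i, row in enumerate(ratings)
            if row[t] is not None and row[r] is not None]

def normalised_dot(xs, ys):
    """sum(x*y) / (sqrt(sum(x^2)) * sqrt(sum(y^2))), or 0.0 when undefined."""
    den = sqrt(sum(x * x for x in xs)) * sqrt(sum(y * y for y in ys))
    return sum(x * y for x, y in zip(xs, ys)) / den if den else 0.0

def pearson(ratings, t, r):
    """Pearson correlation: ratings centred on the item means A_t and A_r."""
    users = co_rated(ratings, t, r)
    if not users:
        return 0.0
    a_t = sum(ratings[i][t] for i in users) / len(users)
    a_r = sum(ratings[i][r] for i in users) / len(users)
    return normalised_dot([ratings[i][t] - a_t for i in users],
                          [ratings[i][r] - a_r for i in users])

def cosine(ratings, t, r):
    """Cosine of the angle between the two raw rating vectors."""
    users = co_rated(ratings, t, r)
    return normalised_dot([ratings[i][t] for i in users],
                          [ratings[i][r] for i in users])

def user_mean(row):
    """A_i, assumed here to be the mean of all ratings given by the user."""
    rated = [v for v in row if v is not None]
    return sum(rated) / len(rated)

def adjusted_cosine(ratings, t, r):
    """Adjusted cosine: ratings centred on each user's own mean A_i."""
    users = co_rated(ratings, t, r)
    return normalised_dot(
        [ratings[i][t] - user_mean(ratings[i]) for i in users],
        [ratings[i][r] - user_mean(ratings[i]) for i in users])

if __name__ == "__main__":
    for name, sim in [("pearson", pearson), ("cosine", cosine),
                      ("adjusted cosine", adjusted_cosine)]:
        print(name, round(sim(RATINGS, 0, 1), 4))
```

In the toy matrix, items 0 and 1 are rated in exactly opposite order by every user, so the Pearson value is -1 up to rounding, while the cosine of the raw vectors is still positive (35/55) because all ratings are positive. This is the kind of difference that motivates centring the ratings.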

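Returning to the filling step of Section II.B, the idea of completing an object from its most similar, non-conflicting neighbours can be sketched as follows. This is a simplified reading and not the authors' exact procedure. The text does not give a formula for $P(i, j)$, so `similarity` assumes it is the number of attributes on which both objects are known and equal. A missing value is copied only when all members of $LS_i$ that know the attribute agree on it, and values still missing at the fixed point are filled with the most frequent value of that attribute. All names and the toy data are hypothetical.

```python
from collections import Counter

MISSING = None  # plays the role of "*" in Definitions 5 and 6

def discernible(x, y):
    """M(i, j): attributes known in both objects whose values differ."""
    return {k for k, (a, b) in enumerate(zip(x, y))
            if a is not MISSING and b is not MISSING and a != b}

def similarity(x, y):
    """Assumed P(i, j): count of attributes known in both and equal."""
    return sum(1 for a, b in zip(x, y) if a is not MISSING and a == b)

def most_similar(table, i):
    """LS_i: members of NS_i (empty M(i, j)) with maximal P(i, j)."""
    ns = [j for j in range(len(table))
          if j != i and not discernible(table[i], table[j])]
    if not ns:
        return []
    best = max(similarity(table[i], table[j]) for j in ns)
    return [j for j in ns if similarity(table[i], table[j]) == best]

def fill_once(table):
    """One pass S^n -> S^(n+1) over every object with a missing value."""
    new = [list(row) for row in table]
    for i, row in enumerate(table):
        if MISSING not in row:
            continue
        ls = most_similar(table, i)
        for k, value in enumerate(row):
            if value is not MISSING:
                continue
            candidates = {table[j][k] for j in ls
                          if table[j][k] is not MISSING}
            if len(candidates) == 1:  # the neighbours agree on a_k
                new[i][k] = candidates.pop()
    return new

def fill(table):
    """Iterate to a fixed point, then fill leftovers by attribute mode."""
    current = [list(row) for row in table]
    while True:
        nxt = fill_once(current)
        if nxt == current:
            break
        current = nxt
    for k in range(len(current[0])):
        known = [row[k] for row in current if row[k] is not MISSING]
        if not known:
            continue  # nothing to copy from; leave the column vacant
        mode = Counter(known).most_common(1)[0][0]
        for row in current:
            if row[k] is MISSING:
                row[k] = mode
    return current

if __name__ == "__main__":
    ratings = [[5, 3, None, 1], [5, 3, 4, 1], [2, None, 4, 1], [1, 1, None, 5]]
    for row in fill(ratings):
        print(row)
```

In the usage example, the first user is indiscernible from the second on every known rating, so the missing value is copied from that neighbour. The other two users have no indiscernible neighbour and fall back to the most frequent rating of the item.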
B. Prediction using item-based CF
Once the neighbours of the target item have been identified, we can calculate the weighted average of the neighbours' ratings, weighted by their similarity to the target item. The predicted rating of the target user u for the target item t is:
\[ P_{ut} = \frac{\sum_{i=1}^{c} R_{ui} \times sim(t, i)}{\sum_{i=1}^{c} sim(t, i)} \]
where $R_{ui}$ is the rating of the target user u for the neighbour item i, $sim(t, i)$ is the similarity of the target item t and the neighbour item i, and c is the number of neighbours.

IV. EXPERIMENTAL EVALUATION AND RESULTS
In this section, we describe the data set, metrics and methodology used to compare the traditional and proposed CF algorithms, and present the results of our experiments.

A. Data set
We use the MovieLens collaborative filtering data set to evaluate the performance of the proposed algorithm [8]. The MovieLens data sets were collected by the GroupLens Research Project at the University of Minnesota. MovieLens is a web-based research recommender system that debuted in Fall 1997. Each week hundreds of users visit MovieLens to rate and receive recommendations for movies. The site now has over 45000 users who have expressed opinions on 6600 different movies. We randomly selected enough users to obtain 100,000 ratings from 1000 users on 1680 movies, with every user having at least 20 ratings; simple demographic information for the users is included. The ratings are on a numeric five-point scale, with 1 and 2 representing negative ratings, 4 and 5 representing positive ratings, and 3 indicating ambivalence.

B. Performance measurement
The metrics for evaluating the accuracy of a prediction algorithm can be divided into two main categories [8,9]: statistical accuracy metrics and decision-support metrics. Statistical accuracy metrics evaluate the accuracy of a predictor by comparing predicted values with user-provided values. Decision-support accuracy measures how well predictions help users select high-quality items. In this paper, we use decision-support accuracy measures.

Decision-support accuracy metrics evaluate how effective a prediction engine is at helping a user select high-quality items from the set of all items. The receiver operating characteristic (ROC) sensitivity is an example of a decision-support accuracy metric. The metric indicates how effectively the system can steer users towards highly rated items and away from low-rated ones. We use the ROC-4 measure as the evaluation metric. Assume that $p_1, p_2, p_3, \ldots, p_n$ are the predicted user ratings and that the corresponding real user ratings are $q_1, q_2, q_3, \ldots, q_n$. ROC-4 is defined as follows:
\[ \text{ROC-4} = \frac{\sum_{i=1}^{n} u_i}{\sum_{i=1}^{n} v_i}, \qquad u_i = \begin{cases} 1, & p_i \ge 4 \text{ and } q_i \ge 4 \\ 0, & \text{otherwise} \end{cases} \qquad v_i = \begin{cases} 1, & p_i \ge 4 \\ 0, & \text{otherwise} \end{cases} \]
The larger the ROC-4, the more accurate the predictions, allowing better recommendations to be formulated.

C. Comparing the proposed CF with the user based CF
In this paper, we compare the proposed CF, which combines rough set theory with item based CF, against user based CF alone. Figure 1 shows the ROC-4 decision-support accuracy of the two methods for different numbers of recommender items. The results indicate that the combined method performs better than user based CF alone.

[Figure 1: plot of ROC-4 (vertical axis, 0.70 to 0.82) against the number of recommender items (horizontal axis, 10 to 30) for the user-based CF and the proposed CF.]
Figure 1. Comparing the proposed CF with the user based CF

V. CONCLUSIONS
Recommender systems are personalized services that aim to predict users' interest in the information items available in an application domain. Collaborative filtering has proved to be one of the most successful techniques in recommender systems in recent years. Poor recommendation quality is a major challenge for collaborative filtering recommender systems, and the sparsity of users' ratings is its main cause. To solve this problem, in this paper, we proposed an item based
collaborative filtering recommendation algorithm using the rough set theory prediction. This method employs rough set theory to fill the vacant ratings of the user-item matrix where necessary. Then it utilizes the item based collaborative filtering to produce the recommendation. The experiments were made on a common data set using different filtering algorithms. The results show that the proposed recommender algorithm combining rough set theory and item based collaborative filtering can improve the accuracy of the collaborative filtering recommendation system.

REFERENCES
[1] SongJie Gong, The Collaborative Filtering Recommendation Based on Similar-Priority and Fuzzy Clustering, In: Proceeding of 2008 Workshop on Power Electronics and Intelligent Transportation System (PEITS2008), IEEE Computer Society Press, 2008, pp. 248-251.
[2] Jong-Seok Lee, Chi-Hyuck Jun, Jaewook Lee, Sooyoung Kim, Classification-based collaborative filtering using market basket data, Expert Systems with Applications 29 (2005) 700-704.
[3] Hyung Jun Ahn, A new similarity measure for collaborative filtering to alleviate the new user cold-starting problem, Information Sciences 178 (2008) 37-51.
[4] George Lekakos, George M. Giaglis, Improving the prediction accuracy of recommendation algorithms: Approaches anchored on human factors, Interacting with Computers 18 (2006) 410-431.
[5] Chong-Ben Huang, Song-Jie Gong, Employing rough set theory to alleviate the sparsity issue in recommender system, In: Proceeding of the Seventh International Conference on Machine Learning and Cybernetics (ICMLC2008), IEEE Press, 2008, pp. 1610-1614.
[6] Yee Leung, Wei-Zhi Wu, Wen-Xiu Zhang, Knowledge acquisition in incomplete information systems: A rough set approach, European Journal of Operational Research 168 (2006) 164-180.
[7] ZHANG wei, LIU lu, GE Jian, Collaborative Filtering Algorithm based on Rough Set, MINI-MICRO SYSTEMS, Vol. 26, No. 11, 2005.
[8] Huang qin-hua, Ouyang wei-min, Fuzzy collaborative filtering with multiple agents, Journal of Shanghai University (English Edition), 2007, 11(3): 290-295.
[9] Gao Fengrong, Xing Chunxiao, Du Xiaoyong, Wang Shan, Personalized Service System Based on Hybrid Filtering for Digital Library, Tsinghua Science and Technology, Volume 12, Number 1, February 2007, 1-8.
