
Recommendation System Techniques and Related Issues A Survey

This paper surveys recommendation system techniques, focusing on collaborative filtering, content-based recommendations, and hybrid approaches. It discusses various algorithms, metrics, and challenges faced by recommendation systems, such as cold-start, data sparsity, and privacy issues. The review highlights the importance of recommendation systems in e-commerce for enhancing user experience and increasing organizational revenue.


Int. j. inf. tecnol.

https://doi.org/10.1007/s41870-018-0138-8

ORIGINAL RESEARCH

Recommendation system techniques and related issues: a survey


Pushpendra Kumar1 • Ramjeevan Singh Thakur1

Received: 14 November 2017 / Accepted: 2 April 2018


© Bharati Vidyapeeth's Institute of Computer Applications and Management 2018

1 Department of Computer Applications, Maulana Azad National Institute of Technology, Bhopal, Madhya Pradesh, India
Corresponding author: Pushpendra Kumar, [email protected]

Abstract Nowadays, e-commerce websites are emerging as a new market and offer millions of products to users for sale. Selecting a product from millions of products requires an additional tool called a recommendation system. The recommendation system (RS) helps users find the items they are looking for. Collaborative filtering is one of the techniques used in RSs and is widely studied and used to make recommendations. In this paper, the methods and algorithms used in recommender systems, the metrics used in RSs, and the challenges of recommendation systems, such as cold-start, data sparsity, scalability and privacy, are reviewed.

Keywords Recommendation system (RS) · Collaborative filtering (CF) · Cold-start · Data sparsity

1 Introduction

With the rapid growth of information on the Internet, finding relevant information has become a pervasive problem. In this era of the Internet, e-commerce has been growing rapidly and offers millions of products for sale. E-commerce users struggle to select a product from among these millions. A recommendation system helps the e-commerce user select items from millions of items [1]. A recommender system (RS) collects information from a customer about the items he/she is interested in and recommends those items or products [2]. Nowadays, RSs are used on almost every e-commerce website, assisting millions of users. Sites such as Netflix and MovieLens [3] for movies, Amazon [4] for books, CDs and many other products, Entree for restaurants and Jester [5] for jokes use recommender systems to assist their consumers. The output of a recommender system benefits both the e-commerce organization and its users [6], i.e., an RS not only assists the customer in getting the preferred item but also increases the revenue of the organization by selling more products.

Recommendation systems can be classified into three categories based on how the recommendation is performed: content-based recommendation, collaborative filtering (CF), and hybrid approaches. Collaborative filtering is one of the most widely used and successful techniques for recommending items. It recommends an item to a particular user based on the ratings of other users in the system. Content-based approaches make predictions based on the characteristics of the items in a user's past history, for example recommending a movie that has been categorized as "Action Movies" to a user who likes action movies. A hybrid approach combines content-based and collaborative filtering techniques in different ways. Figure 1 shows the framework of a memory-based collaborative filtering recommendation system. In this framework, millions of individual users use e-commerce sites and their reviews of products are stored on the corresponding server. The e-commerce organization collects the data from the server and preprocesses it into the required format (user rating matrix). The rating matrix is used to compute the similarity between users with a similarity measure such as the Pearson correlation coefficient, cosine similarity, adjusted cosine similarity or Jaccard similarity; then the rating for a particular item that the active

user has not yet rated is predicted, and the top N predicted items are recommended to the active user. The rest of the paper is organized as follows. Section 2 discusses related work. Section 3 discusses recommendation system techniques. Section 4 discusses issues related to recommendation systems. Section 5 discusses evaluation metrics of recommendation systems. Finally, Sect. 6 gives the conclusion.

Fig. 1 Framework of memory based collaborative filtering recommendation system (diagram labels: Data Collection, Data Preprocessing, Database, Processed Data, Server, Rating Matrix, Compute Similarity, Predict Missing Rating, Rank Items, Recommend, Top N Item, New User)

2 Related work

Rodrigues et al. [6] proposed a framework that combines item-based collaborative filtering (CF) with user demographic information in a cluster-weighted mechanism to solve the cold-start and data sparsity issues. This system provides good recommendations to new users, which improves the user experience and also increases the organization's revenue. Better recommendations can be provided by building the clusters from cross-domain data. For example, if a user likes romantic songs, the system can recommend love story movies to that user.

Ji et al. [7] introduced a scalable CF algorithm based on matrix factorization that performs prediction using two decision matrices, user-category and user-keyword, instead of a single user-item rating matrix. The proposed algorithm is implemented on a real data set, and the results show that the model has good scalability for new items.

Gu et al. [8] observed that simple collaborative filtering suffers from the data sparsity problem because of the explosive growth of users and items in e-commerce. They introduced a dynamic-weighted CF technique (DWCF) to resolve data sparsity and adaptivity issues. In this approach, the similarity between users and items is computed, and then a weight-controlling method is proposed to find the impacts of user and item similarity. As a result, the method performs well under various situations of data sparsity.

Koohi et al. [9] addressed the data sparsity and high dimensionality problems of collaborative filtering by finding neighbor users with a subspace clustering approach. The authors construct different subsets of the rating matrix, labeled Interested (I), Neither Interested Nor Uninterested (NIU) and Uninterested (U). Based on these subsets, a tree with three levels is created for the neighbors of an active user. This method is efficient in dealing with sparse data.

Verma et al. [10] proposed a recommendation system using the collaborative filtering (CF) technique and the fuzzy c-means (FCM) clustering algorithm. FCM clustering is used for item clustering and CF is used for rating prediction. FCM performs better than K-means clustering because K-means restricts each item to a single cluster, whereas an item may be similar to more than one group of items.

Kumar et al. [11] proposed a hybrid collaborative filtering method to resolve the issues of sparsity and scalability and to provide more personalized recommendations. The proposed method works in two phases: the first phase resolves sparsity using case-based reasoning (CBR) followed by average filling, and the second phase resolves scalability by clustering users into groups with a self-organizing map optimized with a genetic algorithm.

Koohi et al. [12] proposed a collaborative filtering recommendation system using the fuzzy C-means clustering algorithm and evaluated its performance against the K-means and SOM clustering approaches. The experimental results show that fuzzy C-means clustering outperforms the other clustering approaches in terms of accuracy, precision and recall.
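The contrast drawn above between K-means (each item in exactly one cluster) and fuzzy c-means (an item may partly belong to several clusters) can be made concrete with a small sketch. It uses the standard fuzzy c-means membership formula for fixed cluster centers; the centers, the item vector and the fuzzifier m = 2 are hypothetical values chosen for illustration, and the code is not taken from any of the surveyed papers.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of one point in each cluster.

    u_k = 1 / sum_j (d_k / d_j) ** (2 / (m - 1)), where d_k is the Euclidean
    distance from the point to center k and m > 1 is the fuzzifier.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((dk / dj) ** exponent for dj in dists) for dk in dists]

def kmeans_assignment(point, centers):
    """Hard K-means style assignment: index of the nearest center."""
    dists = [math.dist(point, c) for c in centers]
    return dists.index(min(dists))

# Hypothetical 2-D item feature vectors (for example genre weights).
centers = [(1.0, 0.0), (0.0, 1.0)]
item = (0.6, 0.4)

memberships = fcm_memberships(item, centers)  # soft: one degree per cluster
hard = kmeans_assignment(item, centers)       # hard: a single cluster index
```

The soft memberships sum to one, so an item that sits between two groups keeps a nonzero weight in both, while the hard assignment discards the weaker association entirely.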


Table 1 Summarized information of literature review

| References | Author | Issues name | Types of filtering used | Technique used | Dataset | Advantages |
|---|---|---|---|---|---|---|
| [6] | C. M. Rodrigues, et al. 2016 | Cold start and real-time prediction problem | Hybrid collaborative filtering | Item-based CF algorithm combined with user demographic based CF algorithm in cluster-weighted mechanism | MovieLens | Saves resources and time by predicting the relevant data only |
| [7] | Ke Ji, et al. 2014 | Scalability problem | Collaborative filtering | Matrix factorization based approach | Real dataset published by KDD CUP 2012 | Good scalability for new items |
| [8] | Liang Gu, et al. 2014 | Data sparsity and adaptivity issues | Collaborative filtering | Dynamic-weighted collaborative filtering approach (DWCF) | MovieLens | Under various situations of data sparsity, DWCF gives better recommendations than the traditional hybrid approach |
| [9] | Hamidreza Koohi, et al. 2017 | Data sparsity and high dimensionality | Collaborative filtering | Finding neighbor users with a subspace clustering method; a new similarity measure is used to compute the similarity value | MovieLens and Jester | Efficient in dealing with sparse data and performs better than the conventional method in terms of accuracy, precision and recall |
| [10] | S. K. Verma, et al. 2013 | Sparsity and cold start problem | Collaborative filtering | Fuzzy C-means clustering | MovieLens | Cold-start problem is resolved very efficiently |
| [11] | N. P. Kumar, et al. 2015 | Data sparsity and scalability | Hybrid user-item based collaborative filtering | Case-based reasoning for data sparsity; self-organizing map optimized with GA for scalability | MovieLens | Enhanced personalized recommendations with better rating quality |
| [12] | Hamidreza Koohi, et al. 2016 | Recommendation accuracy | Collaborative filtering | Fuzzy C-means clustering algorithm | MovieLens | Fuzzy C-means clustering produced better results compared to SOM and K-means clustering |
| [13] | O-Joun Lee, et al. 2015 | Data sparsity and scalability | Collaborative filtering | Markov model and fuzzy clustering | MovieLens | Reduces performance instability |
| [14] | Kyoung-jae Kim, et al. 2008 | Data sparsity | Case based reasoning (CBR) | GA K-means clustering | Real-world data set from an internet shopping mall | Provides good solutions in less time and segments online shopping customers efficiently |
| [15] | Yilmaz Ar, et al. 2016 | Prediction error | Collaborative filtering | Genetic approach | MovieLens | Reduces the prediction error and helps in recommending better products |
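Most of the works in Table 1 build on the memory-based pipeline outlined in Sect. 1: compute user similarity from the rating matrix, predict the missing ratings of the active user, and recommend the top N items. The following is a minimal sketch of that pipeline, not the method of any surveyed paper. It assumes Pearson correlation over co-rated items and the common mean-centered weighted-sum prediction; the toy ratings, the user and item names, and the choice to ignore non-positively correlated neighbors are illustrative assumptions.

```python
import math

def pearson(a, b):
    """Pearson correlation between two users over their co-rated items."""
    common = [i for i in a if i in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = math.sqrt(sum((a[i] - mean_a) ** 2 for i in common)
                    * sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

def predict(ratings, user, item):
    """Mean-centered, similarity-weighted prediction of a missing rating.

    The result is not clipped to the rating scale in this sketch.
    """
    user_mean = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = pearson(ratings[user], other_ratings)
        if sim <= 0:
            continue  # assumption: keep only positively correlated neighbors
        other_mean = sum(other_ratings.values()) / len(other_ratings)
        num += sim * (other_ratings[item] - other_mean)
        den += sim
    return user_mean + num / den if den else user_mean

def top_n(ratings, user, n=2):
    """Rank the items the user has not rated by predicted rating."""
    items = {i for r in ratings.values() for i in r}
    unseen = sorted(items - set(ratings[user]))
    scored = [(predict(ratings, user, i), i) for i in unseen]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scored[:n]]

# Hypothetical user rating matrix stored as nested dicts (user -> item -> rating).
ratings = {
    "u1": {"i1": 5, "i2": 3, "i3": 4},
    "u2": {"i1": 5, "i2": 3, "i3": 4, "i4": 5, "i5": 2},
    "u3": {"i1": 1, "i2": 5, "i3": 2, "i4": 1, "i5": 5},
}
recommended = top_n(ratings, "u1", n=1)
```

In this toy matrix u2 rates the shared items exactly like u1, so u2 drives the prediction, while u3 is negatively correlated with u1 and is ignored. Every prediction requires a pass over all other users, which is the cost that the scalability-oriented works in Table 1 try to reduce.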

Lee et al. [13] introduced Predictive Clustering-based Collaborative Filtering (PCCF), which combines the Markov model and fuzzy clustering with clustering-based CF (CBCF). This method addresses the issues of reduced coverage and unstable performance by tracking the changes in user preferences and bridging the gap between the static model and the dynamic model.

Kim et al. [14] proposed a recommender system for the online shopping market using GA K-means clustering. In this system, the authors segment the online shopping users according to their buying behavior. GA is used to resolve the local optima problem found in K-means clustering and provides a method of finding the relevant groups more efficiently.

Ar et al. [15] proposed an approach that reduces the prediction error that occurs in collaborative filtering RSs. The conventional CF method uses similarity values directly for the rating prediction of an item, whereas the proposed approach applies a genetic algorithm before the prediction process to get a better result. A statistical analysis was performed on various similarity measures, such as Vector Cosine Similarity, Pearson's Correlation and the Extended Jaccard Coefficient, and the results show that the evolutionary approach reduces the prediction error. Table 1 summarizes the work done by different authors in the field of recommendation systems.

3 Recommendation system techniques

3.1 Collaborative filtering

The collaborative filtering (CF) technique recommends an item to a particular user based on the ratings or opinions of other users [16, 18]. A CF system performs recommendation by building a database of users' preferences for items. The system then finds users with similar interests and preferences by calculating similarities between the user profiles [17] and builds a group of similar users called the neighborhood. A user gets recommendations for products that he has not rated or purchased but that his neighbors have rated. Collaborative filtering produces predictions or recommendations: a prediction is a numerical value, and a recommendation is a list of the top N items that the user will like the most [17], as shown in Fig. 2. Collaborative filtering techniques can be classified into two broad categories: (a) memory-based techniques and (b) model-based techniques [16–18]. Memory-based techniques identify the similarity between the active user and all other users using similarity measures such as Pearson correlation, cosine similarity, the Jaccard coefficient, etc. Then the missing ratings of the active user are predicted and the top k rated items are recommended to the active user. In the model-based technique, previous ratings are used to develop a model using machine learning techniques. Once the model is developed, predictions can be made for an individual user.

Fig. 2 Collaborative filtering process (diagram labels: Rating Matrix of users U1 … Un by items I1 … In; CF Algo; r_iu: prediction on item 'i' for the user 'u'; recommends top N list of items for user)

3.2 Content-based filtering

The content-based (CB) approach recommends items that are similar in their characteristics to the items the user has used in the past. The CB approach relies on analyzing the attributes of the items in order to produce recommendations. The CB filtering (CBF) technique is most successful in recommending webpages, publications and news.

A CBF system automatically creates a personalized profile of the user based on his feedback and the types of items he likes. In order to generate meaningful recommendations, the collected user information is compared against the characteristics of the items examined [19], as shown in Fig. 3.

3.3 Hybrid filtering

A hybrid filtering system combines two or more recommendation techniques in order to get better performance than collaborative filtering or content-based filtering alone. CF and CBF techniques can be combined in different ways to obtain a hybrid filtering system, and the different combinations may produce different outputs. The hybridization process is categorized into seven types [17]: (1) weighted, (2) switching, (3) mixed, (4) feature combination, (5) feature augmentation, (6) cascade and (7) meta-level.

4 Issues related to recommendation system

4.1 Limited content analysis

Content-based filtering (CBF) techniques are restricted by the characteristics that are explicitly associated with the items they recommend. So, in order to obtain a sufficient number of characteristics, the content must be in a form that can

be parsed automatically, or the characteristics must be assigned manually [24]. CBF also faces another problem: two different items that have the same characteristics are not distinguishable to the system.

Fig. 3 Content-based filtering process (diagram labels: User, Item, Build, Match, Predict items not experienced, Recommend)

4.2 Cold-start problem

This refers to the situation where it is difficult to make recommendations for new users and new items. Because of the lack of sufficient rating information, it is difficult to find similarities between users and items. So, the taste of new users cannot be predicted, and new items are not rated or purchased by users, which leads to less accurate recommendations. The cold-start problem can be addressed in several ways, such as (a) asking the new user at the beginning to rate some items, (b) asking new users to state their taste explicitly, and (c) recommending items to the new user based on the collected demographic information.

4.3 Data sparsity problem

This problem occurs when the majority of users do not rate most of the items, and consequently the user-item matrix becomes very sparse. The chance of finding a set of users with similar ratings therefore decreases. Collaborative filtering uses the nearest neighbor approach to recommend items, and having few ratings makes it difficult to make accurate predictions about items.

4.4 Scalability problem

Scalability is one of the foremost issues recommender systems face with large real-world datasets. As the dataset grows with the number of users and items, the computation also grows linearly, i.e., an algorithm that works well on a small dataset may be unable to generate satisfactory results for a large one. Thus, it is very difficult to apply recommendation techniques to the huge and dynamic data sets produced by item-user interactions. The scalability problem can be addressed using dimensionality reduction, Bayesian networks, clustering, etc.

4.5 Privacy issue

Recommendation algorithms require input from the user population to produce quality personalized recommendations; this may lead to issues of data privacy and security. Thus, techniques need to be designed that use the user data reasonably and carefully and ensure that information about user-item ratings is not freely available to malicious users.

4.6 Synonymy

Synonymy refers to the situation in which similar items have different names or entries. Most RS algorithms cannot tell that differently named entries such as "comedy movie" and "comedy film" describe the same kind of item. Heavy use of synonymous words decreases the performance of collaborative filtering recommendation. The synonymy problem can be addressed by (a) construction of a thesaurus, (b) Singular Value Decomposition (SVD) and (c) Latent Semantic Indexing.

5 Evaluation metrics of RSs

The quality of a recommendation system algorithm can be assessed using different methods. The type of metric used depends on the type of filtering technique. The assessment of predictions and recommendations is considered essential so that the user can have the best experience with RSs. Evaluation metrics can be classified as follows:

5.1 Statistical accuracy metric

This type of metric evaluates accuracy by comparing the predicted ratings with the actual ratings. The commonly used metrics are Mean Absolute Error (MAE), Root Mean Squared Error (RMSE) and Correlation.

5.1.1 Mean absolute error

MAE is the average of the absolute deviations between predicted and actual ratings. A lower MAE value indicates better prediction [20]. Let $r_1, r_2, r_3, \ldots, r_n$ be the actual

ratings and let $p_1, p_2, p_3, \ldots, p_n$ be the corresponding predicted ratings. MAE is defined as follows:

$$\mathrm{MAE} = \frac{\sum_{i=1}^{n} |r_i - p_i|}{n} \qquad (1)$$

5.1.2 Root mean square error

RMSE is also used to measure model performance. It is obtained by squaring the differences between the predicted and actual ratings, adding them together, dividing the sum by the number of test points, and then taking the square root of the result [21]:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (r_i - p_i)^2}{n}} \qquad (2)$$

5.1.3 Correlation

Correlation analysis measures the linear relationship between two variables. A higher correlation value indicates more accurate rating predictions or recommendations:

$$\mathrm{corr}(p, r) = \frac{\sum_{i=1}^{n} (p_i - \bar{p})(r_i - \bar{r})}{\left[\sum_{i=1}^{n} (p_i - \bar{p})^2 \sum_{i=1}^{n} (r_i - \bar{r})^2\right]^{1/2}} \qquad (3)$$

where $\bar{p}$ and $\bar{r}$ are the means of the predicted and actual ratings, respectively.

5.2 Classification accuracy metrics

Classification accuracy metrics measure how frequently the recommendation system makes correct or incorrect decisions about whether an item is good [22]. Generally, a recommendation system uses a binary rating, i.e., an item is either relevant to the user's interest or not, because a rating dataset is extremely sparse compared to a binary selection dataset. For these metrics, the rating dataset is transformed into a binary dataset. Three classification accuracy metrics, Precision, Recall and F1 score, are often used to assess the relevance between recommendations and user interest [23]:

$$\mathrm{Precision} = \frac{\text{Relevant items recommended}}{\text{Total items recommended}} \qquad (4)$$

$$\mathrm{Recall} = \frac{\text{Relevant items recommended}}{\text{Total relevant items}} \qquad (5)$$

$$F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (6)$$

6 Conclusion

Recommendation systems have the ability to provide personalized information on the internet. In this era of the internet, many RSs have been developed that are based on content-based filtering, collaborative filtering and hybrid systems, and they help to reduce the problem of information overload. In this study, the authors found that CF recommendation systems provide better recommendations but still face the problems of scalability and sparsity. So, there is a possibility to improve the quality and performance of collaborative filtering based recommendation systems by using fuzzy clustering and optimization techniques.

References

1. Deshpande M, Karypis G (2004) Item-based top-N recommendation algorithms. ACM Trans Inf Syst 22(1):143–177
2. Sarwar B, Karypis G, Konstan J, Riedl J (2000) Analysis of recommendation algorithms for e-commerce. In: Proceedings of the 2nd ACM conference on Electronic commerce, pp 158–167
3. Miller BN, Albert I, Lam SK, Konstan JA, Riedl J (2003) MovieLens unplugged: experiences with an occasionally connected recommender system. In: Proceedings of the Int'l Conf. Intelligent user interfaces, Miami, Florida, USA, pp 223–266
4. Linden G, Smith B, York J (2003) Amazon.com recommendations: item-to-item collaborative filtering. IEEE Int Comput 7:76–80
5. Goldberg K, Roeder T, Gupta D, Perkins C (2001) Eigentaste: a constant time collaborative filtering algorithm. Inf Retr J 4:133–151
6. Rodrigues CM, Rathi S, Patil G (2016) An efficient system using item and user-based CF techniques to improve recommendation. In: Proc. of International Conf. on next generation computing technologies (NGCT), pp 569–574
7. Ji K, Shen H (2014) Using category and keyword for personalized recommendation: a scalable collaborative filtering algorithm. In: Proc. of sixth international symposium on parallel architectures, algorithms and programming, Beijing, pp 197–202
8. Gu L, Yang P, Dong Y (2014) A dynamic-weighted collaborative filtering approach to address sparsity and adaptivity issues. In: 2014 IEEE Congress on Evolutionary Computation (CEC), Beijing, pp 3044–3050
9. Koohi H, Kiani K (2017) A new method to find neighbor users that improves the performance of collaborative filtering. Expert Syst Appl 83:30–39
10. Verma SK, Mittal N, Agarwal B (2013) Hybrid recommender system based on fuzzy clustering and collaborative filtering. In: Proceedings of the 4th International Conference on Computer and Communication Technology (ICCCT), Allahabad, pp 116–120
11. Nitin PK, Fan Z (2015) Hybrid user-item based collaborative filtering. Procedia Comput Sci 60:1453–1461
12. Koohi H, Kiani K (2016) User based collaborative filtering using fuzzy C-means. Measurement 91:134–139
13. Lee OJ, Jung JJ, Eunsoon Y (2015) Predictive clustering for performance stability in collaborative filtering techniques. In: Proceedings of the IEEE 2nd International Conference on Cybernetics (CYBCONF), Gdynia, 2015, pp 48–55

14. Kim K-J, Ahn H (2008) A recommender system using GA K-means clustering in an online shopping market. Expert Syst Appl 34(2):1200–1209
15. Ar Y, Bostanci E (2016) A genetic algorithm solution to the collaborative filtering problem. Expert Syst Appl 61:122–128
16. Kumar B, Sharma N (2016) Approaches, issues and challenges in recommender systems: a systematic review. Ind J Sci Technol 9(47). https://doi.org/10.17485/ijst/2016/v9i47/94892
17. Isinkaye FO, Folajimi YO, Ojokoh BA (2015) Recommendation systems: principles, methods and evaluation. Egypt Inform J 16:261–273
18. Khusro S, Ali Z, Ullah I (2016) Recommender systems: issues, challenges, and research opportunities. In: Kim K, Joukov N (eds) Information science and applications (ICISA) 2016. Lecture Notes in Electrical Engineering, vol 376. Springer, Singapore, pp 1179–1189
19. Sharma L, Gera A (2013) A survey of recommendation system: research challenges. Int J Eng Trends Technol 4(5):1989–1992
20. Gong S (2010) A collaborative filtering recommendation algorithm based on user clustering and item clustering. J Softw 5(7):745–752
21. Chai T, Draxler RR (2014) Root mean square error (RMSE) or mean absolute error (MAE)? Arguments against avoiding RMSE in the literature. Geosci Model Dev Discuss 7:1247–1250
22. Herlocker JL, Konstan JA, Terveen LG, Riedl JT (2004) Evaluating collaborative filtering recommender systems. ACM Trans Inf Syst (TOIS) 22(1):5–53
23. Yang Z, Wu B, Zheng K, Wang X, Lei L (2016) A survey of collaborative filtering-based recommender systems for mobile internet applications. IEEE Access 4:3273–3287
24. Adomavicius G, Tuzhilin A (2005) Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17(6):734–749