Research Problems in Recommender Systems
Saumya Chaturvedi
Galgotias University, India
E-mail: [email protected]
Aanchal Vij
Galgotias University, India
E-mail: [email protected]
Sunita Tripathi
University of Allahabad, India
E-mail: [email protected]
Abstract. With the continuous growth of web applications around the globe, it is a challenge to
find the information a user needs in a limited time. The number of handheld mobile
devices is increasing, and most business revolves around searching the data correctly.
Without a proper recommender system, it is very difficult to get the required information from
web applications. Web applications use recommender systems to provide suitable data to users
based on their choices and interests. Different types of recommender systems have been proposed
for different kinds of needs. The two most basic types are the collaborative
filtering recommender system and the content-based recommender system. Sometimes these two
are combined to increase the efficiency of a recommender system. The
resulting system is known as a hybrid recommender system.
The purpose of this paper is to help readers understand the basics of recommender systems.
This paper identifies key areas of research that are open to new researchers. After reading
this paper, new researchers can understand the basic problems of recommender systems which need
improvement, and hence they can make those problems their area of research.
1. Introduction
Recommender systems help us get the data we need. They filter the information that
is needed by the user. Today we have a lot of data in any system.[1, 2] Examples
of systems in which recommenders are needed are YouTube, Netflix and e-commerce
platforms like Flipkart and Amazon. The scenario we face today is that data is
increasing while screen size is decreasing. By screen size we mean that systems were initially
used from desktops and laptops of around 15 inches, and are now used from mobile devices,
Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution
of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Published under licence by IOP Publishing Ltd 1
RASCC 2020 IOP Publishing
Journal of Physics: Conference Series 1717 (2021) 012002 doi:10.1088/1742-6596/1717/1/012002
which range from 4 to 7 inches.[3, 4, 5, 6, 7] When a user searches for an item and it is not
available in the first 5 or 10 results, he leaves the system and looks for that item on another system.
The biggest irony is that the item may be available but did not come to the top of the user's
search results. In this case he will buy that item from another e-commerce
platform rather than from the platform which was unable to provide a recommendation.
A recommendation system can increase the sales of an application. Many e-commerce
platforms failed because they lacked a good recommendation system. A good
recommender system also saves users' time and keeps the user engaged in the system, resulting
in higher revenue.[8]
Many top-notch companies use recommendation systems, including Google, YouTube,
Netflix, Flipkart, Amazon, Prime, gaana.com and many more. Every system comes with its
advantages and disadvantages, and recommendation systems also face many problems which are
yet to be solved effectively.
The purpose of this paper is to make the reader aware of recommender systems and
their major techniques. The paper also explores research problems in recommender systems,
based on our extensive study of the papers listed in the reference
section. In this paper, we have identified some key areas of research which are open to new
researchers. Masters and PhD students can take up this topic
as their area of research and contribute to the development and improvement of the new
generation of recommender systems.
We have included papers from 1997 to 2020. More than 50 papers have been included
in the study. We have not only identified problems; papers presenting the latest solutions
have also been added so that researchers can understand each problem in detail.
2. GENERAL CONCEPTS
A recommender system is a system that helps users choose items which they may need.
Different artificial intelligence and machine learning techniques are applied to achieve
this output.[9, 10]
Some examples of recommender systems are google.com, amazon.com, Netflix, and other
popular e-commerce, music and video portals available online. Since these systems have millions
of items, they cannot function properly without a good recommender system.
In a recommendation-system application, there are two classes of entities, which we shall
refer to as users and items. The formal definition of the recommender system is:
• C: The set of all users
• S: The set of all possible items that can be recommended, for example, video, songs and
books.
• U: A utility function that measures the usefulness of a specific item $s \in S$ to user $c \in C$, i.e.,
$U: C \times S \rightarrow R$, where $R$ is a totally ordered set.
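The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the user names, item names and ratings are invented, unknown utilities fall back to a default value (a real system would predict them), and recommending the item with the highest utility is one natural use of the ordering on R rather than something the definition itself prescribes.

```python
from typing import Dict, Hashable, Iterable, Optional, Tuple

Ratings = Dict[Tuple[Hashable, Hashable], float]

# Hypothetical known values of U(c, s): (user, item) -> rating on a 1-5 scale.
known_utility: Ratings = {
    ("alice", "book"): 3.0,
    ("alice", "song"): 5.0,
    ("bob", "video"): 4.0,
}


def utility(ratings: Ratings, user: Hashable, item: Hashable,
            default: float = 0.0) -> float:
    """Return U(user, item); unknown pairs fall back to `default` (an assumption)."""
    return ratings.get((user, item), default)


def best_item(ratings: Ratings, user: Hashable,
              items: Iterable[Hashable]) -> Optional[Hashable]:
    """Return the item in S with the highest utility for `user`, or None if S is empty."""
    candidates = list(items)
    if not candidates:
        return None
    return max(candidates, key=lambda s: utility(ratings, user, s))


# S: the set of items that can be recommended (toy example).
S = ["book", "song", "video"]
print(best_item(known_utility, "alice", S))
```

Because R is totally ordered, any two utility values can be compared, which is what makes the `max` call well defined.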
The space S of feasible items can be very large, ranging in the hundreds of thousands or even
millions of items in some applications, such as recommending books or CDs. Similarly, the
user space can also be very large, millions in some cases. [11]
In a recommender system, how useful an item is to a user is determined by its rating. A rating is a measure
of how much the user likes an item. Because it is given by the user, the reliability of this rating has extra
value in understanding the choices of the user. Ratings can be collected in different ways. A
common form is a 1 to 5 scale, as in the case of apps in Google Play.
A scale of 1 to 10 is also used by many rating methods, including many customer
service agencies. Whatever rating scale is used, one
end describes liking and the other end describes the extent of dislike.[12]
A user profile can be generated by storing traits such as age, gender, area, email, mobile number and other
details. An item profile can be generated from the features of the item. For a book, these are
the language, author, genre, cost, publisher, etc. For a television, they are the brand, features,
power consumption and many other attributes which can be used to create the profile. The way we
create the profile has an important impact on the recommendation system.[13] Ratings are given
on a subset of the data rather than on the entire data. A rating matrix is created between users and items,
and this becomes the heart of the recommender system. The way the rating matrix is analysed
defines the recommender system. Different domains use different techniques for extracting data
from the user-item matrix. [14, 15, 5]
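As a minimal sketch of the user-item rating matrix described above, the following Python builds the matrix from (user, item, rating) triples. The triples are invented for illustration, and using `None` to mark unrated cells is an assumption of this sketch. Since ratings cover only a subset of all user-item pairs, most cells stay empty.

```python
def build_rating_matrix(triples):
    """Build a user x item matrix; cells without a rating stay None."""
    users = sorted({user for user, _, _ in triples})
    items = sorted({item for _, item, _ in triples})
    row_of = {user: k for k, user in enumerate(users)}
    col_of = {item: k for k, item in enumerate(items)}
    matrix = [[None] * len(items) for _ in users]
    for user, item, rating in triples:
        matrix[row_of[user]][col_of[item]] = rating
    return matrix, users, items


def sparsity(matrix):
    """Fraction of cells that carry no rating (0.0 for an empty matrix)."""
    cells = [cell for row in matrix for cell in row]
    if not cells:
        return 0.0
    return sum(cell is None for cell in cells) / len(cells)


# Hypothetical ratings on a 1-5 scale.
triples = [("u1", "book", 5), ("u1", "tv", 2), ("u2", "book", 4), ("u3", "song", 1)]
matrix, users, items = build_rating_matrix(triples)
for user, row in zip(users, matrix):
    print(user, row)
print("sparsity:", round(sparsity(matrix), 2))
```

Rows correspond to users and columns to items, both in sorted order, so `matrix[0][0]` is the rating that `u1` gave to `book`.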
Recommender systems recommend different items to the user based on whatever machine
learning or artificial intelligence techniques they apply to the rating matrix.[15] Good
recommender systems improve with user feedback. A good recommender system also makes
good recommendations even when very few ratings are available. The accuracy of a good
recommender system increases as the user's history in the system grows. [16]
Some examples of collaborative filtering systems are movie recommendation systems and
social networks.[4]
• The system contains a big database of the items to be recommended. This database consists
of features of items. This database is known as an item profile database.
• Users provide a little information about their preferences, likes and dislikes to the system, and
with this little information the system builds a user profile.
• The recommendation is done on the basis of a comparison of item profile with user interest.
One can make better-personalised recommendations by utilising the features of items
and users. An item profile is defined by its essential features. For example, a book can
be described using its title, genre, language, publisher, cost etc. Using a weighting procedure,
similarity can be calculated between items. In some domains, we can represent features by
Boolean values, while in others we can represent them using a set of restricted
values. Consider the example of a newspaper, where we analyse the articles on the
basis of different kinds of topics. A Boolean value indicates whether a word is present
in the article or not. An integer value could instead record the number of times a
word appears in an article. This method gives successful recommendations in content-based
recommender systems without using explicit ratings.[19, 20]
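The newspaper example can be sketched as follows. The vocabulary and article texts are invented, and cosine similarity is used here as one common choice of similarity measure; the text above does not say which weighting or similarity procedure is intended, so treat that choice as an assumption.

```python
import math
from collections import Counter


def boolean_profile(text, vocabulary):
    """1 if the vocabulary word occurs in the article, else 0."""
    words = set(text.lower().split())
    return [1 if term in words else 0 for term in vocabulary]


def count_profile(text, vocabulary):
    """Number of times each vocabulary word occurs in the article."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two profiles; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical vocabulary and articles.
vocabulary = ["cricket", "election", "match", "budget"]
article_a = "cricket match report the match ended late"
article_b = "election budget debate"
article_c = "cricket match preview"

profile_a = boolean_profile(article_a, vocabulary)
profile_b = boolean_profile(article_b, vocabulary)
profile_c = boolean_profile(article_c, vocabulary)
print(cosine_similarity(profile_a, profile_c))  # same topics, close to 1
print(cosine_similarity(profile_a, profile_b))  # no shared words, 0.0
```

No ratings are involved: a reader who liked `article_a` could be offered `article_c` purely because the two item profiles are similar.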
4. RESEARCH PROBLEMS
Recommendation systems have a great future. Today, some problems are yet to be solved
by the research community to make recommender systems more efficient. Some of the problems which we
believe can be solved are listed below. Figure 5 shows all the research problems we
address in the later sections. These research problems can give ideas for work in the
area.
about user and item before recommending fails. We can identify three different subproblems in the
domain of cold start problems. The first problem arises when we do not have any information
about a new user who is entering the system. This only happens when a user joins the system
for the first time, for example when you join Amazon or Flipkart for the first time.
This problem is known as the new user cold-start problem.[26]
The second problem arises when we introduce a new item to the system. This item is
entirely new of its kind, and the recommender system is unable to find any ratings associated with
it. A collaborative filtering system, which needs the user-item rating matrix in order to give a
recommendation, is unable to start; this problem is known as the cold start item problem. The
third problem arises when we launch the system for the first time. In this case, we have
neither user information nor item information. In other words, we do not have any
user-item matrix of ratings, which is required for collaborative recommender systems to work
properly. This problem is known as the cold start system problem. For cold start problems,
well-known content-based solutions can be applied.[27, 28] Other solutions
use a combination of various machine learning techniques.[29, 30]
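The three cold-start cases can be told apart with a simple check on the available ratings. The sketch below is illustrative only: the function name, the label strings and the sample data are hypothetical, and it merely detects which case applies rather than solving it.

```python
def cold_start_kinds(ratings, user, item):
    """Return which cold-start cases apply to a (user, item) request.

    `ratings` is a collection of (user, item, rating) triples.
    """
    if not ratings:
        # No user-item matrix at all: the system has just been launched.
        return ["system"]
    known_users = {u for u, _, _ in ratings}
    known_items = {i for _, i, _ in ratings}
    kinds = []
    if user not in known_users:
        kinds.append("new user")
    if item not in known_items:
        kinds.append("new item")
    return kinds


# Hypothetical ratings already stored in the system.
ratings = [("u1", "book", 4), ("u2", "book", 5), ("u2", "tv", 3)]

print(cold_start_kinds([], "u1", "book"))        # freshly launched system
print(cold_start_kinds(ratings, "u9", "book"))   # user with no history
print(cold_start_kinds(ratings, "u1", "phone"))  # item with no ratings
print(cold_start_kinds(ratings, "u1", "book"))   # no cold start
```

A system could use such a check to decide when to fall back from collaborative filtering to a content-based method, which is the kind of remedy the paragraph above points to.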
4.4. Scalability
Scalability is the property of a system which defines whether the system will be able to cope
as it grows. [34] For example, in the case of recommender systems, scalability can be
understood through a situation where a recommender system performs very well with few
users, say 1000, but as the number of users grows to 10000 or 100000 it starts performing in a way which
is not desirable. When the system faces scalability issues it becomes slow, it starts failing, and it starts
giving problems which it never gave when the load of users and recommendations was smaller.
Scalability issues can be divided into two parts: hardware scalability and software
scalability. Hardware scalability is about adding hardware to solve the scalability
problem. For example, one can increase the processor, RAM and server configuration to solve the
problem. But an increase in hardware capacity alone cannot solve the problem.[34]
Software scalability is about writing algorithms and using methods which keep working well when
the hardware configuration is increased as needed in future. This is a major problem which
is not as easy as it seems, because there are algorithms which perform very well when
the amount of data on which they have to operate is small, but as the data increases they start
performing inefficiently. The accuracy of the prediction decreases as the data increases. Some
algorithms are not able to utilise the increased efficiency of the hardware and hence create a problem
of scalability.[32]
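A small worked calculation shows why an algorithm that is fine for 1000 users can struggle at 100000. It assumes, purely for illustration, a neighbourhood-based collaborative filter that computes a similarity score for every pair of users; the paper does not name a specific algorithm, so this is an assumption of the sketch.

```python
def pairwise_comparisons(n_users: int) -> int:
    """Number of distinct user pairs: n * (n - 1) / 2."""
    return n_users * (n_users - 1) // 2


for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} users -> {pairwise_comparisons(n):>13,} user pairs")
```

Under this assumption, multiplying the number of users by 10 multiplies the work by roughly 100, which is why adding hardware in proportion to the number of users does not keep up.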
So this is also an open area of research in recommender systems. With upcoming
technology we are moving towards parallel processing, Hadoop and other architectures, and big
data, so this problem needs to be addressed. It gives upcoming researchers a new area in which to
start research and solve this problem using innovative methods.[33]
Recommendation systems which are completely based on user preferences may make wrong
recommendations. For example, suppose today I am browsing books for myself, but tomorrow I may
browse sports items. Or a 10-year-old child may be searching for multiple
items without any thought, just scrolling without any intention of purchase. In such cases a
recommendation system based on user preferences may recommend the wrong items, so
keeping up with frequently changing user preferences is the most important issue in
recommendation systems[38].
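The books-then-sports example can be made concrete with a toy browsing history. The sketch below only illustrates the problem; the history, the function name and the idea of comparing the full history with a recent window are assumptions made for illustration, not a method proposed in this paper.

```python
from collections import Counter


def top_category(history, last_n=None):
    """Most frequent category in the history, or only in its last `last_n` events.

    `history` is a list of browsed categories, oldest first.
    """
    if last_n is not None and last_n <= 0:
        raise ValueError("last_n must be a positive integer")
    events = history if last_n is None else history[-last_n:]
    if not events:
        return None
    return Counter(events).most_common(1)[0][0]


# Hypothetical history: five book views yesterday, two sports views today.
history = ["books"] * 5 + ["sports"] * 2

print(top_category(history))            # profile built from the whole history
print(top_category(history, last_n=2))  # profile built from recent activity only
```

A profile built from the whole history still points to books, while the user's current interest is sports; how much weight to give recent behaviour is exactly the open question raised above.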
5. Conclusion
This paper has introduced recommender systems to new researchers. It has also
identified key problems in recommender systems which need research. This paper can help
PhD and Masters students in choosing their area of research. The research gap is
presented in this paper in the form of different problems of recommender systems.
Recommendation systems find utility in major areas of web applications. As these
problems get solved, recommendation systems will become more and more useful. With more
reliable recommendations, web applications will become more intelligent and usable.
References
[1] G. Adomavicius and A. Tuzhilin, “Toward the next generation of recommender systems: A survey of the
state-of-the-art and possible extensions,” IEEE transactions on knowledge and data engineering, vol. 17,
no. 6, pp. 734–749, 2005.
[2] D. Agarwal and B.-C. Chen, “Regression-based latent factor models,” in Proceedings of the 15th ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’09, (New York, NY,
USA), pp. 19–28, ACM, 2009.
[3] P. Adamopoulos and A. Tuzhilin, “On over-specialization and concentration bias of recommendations:
Probabilistic neighborhood selection in collaborative filtering systems,” in Proceedings of the 8th ACM
Conference on Recommender systems, pp. 153–160, ACM, 2014.
[4] M. Jiang, P. Cui, F. Wang, Q. Yang, W. Zhu, and S. Yang, “Social recommendation across multiple relational
domains,” in Proceedings of the 21st ACM international conference on Information and knowledge
management, pp. 1422–1431, ACM, 2012.
[5] X. Su and T. M. Khoshgoftaar, “A survey of collaborative filtering techniques,” Adv. in Artif. Intell.,
vol. 2009, pp. 4:2–4:2, Jan. 2009.
[6] A. Gunawardana and C. Meek, “A unified approach to building hybrid recommender systems,” in Proceedings
of the Third ACM Conference on Recommender Systems, RecSys ’09, (New York, NY, USA), pp. 117–124,
ACM, 2009.
[7] Y. Rong, X. Wen, and H. Cheng, “A Monte Carlo algorithm for cold start recommendation,” in Proceedings
of the 23rd international conference on World wide web, pp. 327–336, ACM, 2014.
[8] Y. Zhang, F. Sun, X. Yang, C. Xu, W. Ou, and Y. Zhang, “Graph-based regularization on embedding layers
for recommendation,” ACM Trans. Inf. Syst., vol. 39, Sept. 2020.
[9] Z. Huang and M. K. Ng, “A fuzzy k-modes algorithm for clustering categorical data,” IEEE Transactions
on Fuzzy Systems, vol. 7, no. 4, pp. 446–452, 1999.
[10] M. Papagelis, D. Plexousakis, and T. Kutsuras, “Alleviating the sparsity problem of collaborative filtering
using trust inferences,” in Trust management, pp. 224–239, Springer, 2005.
extensible Python framework for recommender systems,” in Proceedings of the 12th ACM Conference
on Recommender Systems, RecSys ’18, (New York, NY, USA), pp. 494–495, Association for Computing
Machinery, 2018.
[33] T. Kitazawa and M. Yui, “Query-based simple and scalable recommender systems with Apache Hivemall,” in
Proceedings of the 12th ACM Conference on Recommender Systems, RecSys ’18, (New York, NY, USA),
pp. 502–503, Association for Computing Machinery, 2018.
[34] W. Pan, E. W. Xiang, N. N. Liu, and Q. Yang, “Transfer learning in collaborative filtering for sparsity
reduction,” in AAAI, vol. 10, pp. 230–235, 2010.
[35] Z.-K. Zhang, C. Liu, Y.-C. Zhang, and T. Zhou, “Solving the cold-start problem in recommender systems
with social tags,” EPL (Europhysics Letters), vol. 92, no. 2, p. 28002, 2010.
[36] B. Lika, K. Kolomvatsos, and S. Hadjiefthymiades, “Facing the cold start problem in recommender systems,”
Expert Systems with Applications, vol. 41, no. 4, pp. 2065–2073, 2014.
[37] M. Yan, J. Sang, T. Mei, and C. Xu, “Friend transfer: Cold-start friend recommendation with cross-
platform transfer learning of social knowledge,” in Multimedia and Expo (ICME), 2013 IEEE International
Conference on, pp. 1–6, IEEE, 2013.
[38] N. Lathia, S. Hailes, L. Capra, and X. Amatriain, “Temporal diversity in recommender systems,” in
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information
retrieval, pp. 210–217, ACM, 2010.
[39] E. Herder and B. Zhang, “Unexpected and unpredictable: Factors that make personalized advertisements
creepy,” in Proceedings of the 23rd International Workshop on Personalization and Recommendation on
the Web and Beyond, ABIS ’19, (New York, NY, USA), pp. 1–6, Association for Computing Machinery,
2019.
[40] M. Sun, F. Li, J. Lee, K. Zhou, G. Lebanon, and H. Zha, “Learning multiple-question decision trees for
cold-start recommendation,” in Proceedings of the Sixth ACM International Conference on Web Search
and Data Mining, WSDM ’13, (New York, NY, USA), pp. 445–454, ACM, 2013.
[41] L. Zhang, D. Agarwal, and B.-C. Chen, “Generalizing matrix factorization through flexible regression priors,”
in Proceedings of the Fifth ACM Conference on Recommender Systems, RecSys ’11, (New York, NY, USA),
pp. 13–20, ACM, 2011.
[42] D. Agarwal and B.-C. Chen, “fLDA: Matrix factorization through latent Dirichlet allocation,” in Proceedings
of the Third ACM International Conference on Web Search and Data Mining, WSDM ’10, (New York,
NY, USA), pp. 91–100, ACM, 2010.
[43] Z. Gantner, L. Drumond, C. Freudenthaler, S. Rendle, and L. Schmidt-Thieme, “Learning attribute-to-
feature mappings for cold-start recommendations,” in Data Mining (ICDM), 2010 IEEE 10th International
Conference on, pp. 176–185, IEEE, 2010.
[44] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, “BPR: Bayesian personalized ranking from
implicit feedback,” in Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence,
pp. 452–461, AUAI Press, 2009.
[45] O. Moreno, B. Shapira, L. Rokach, and G. Shani, “Talmud: Transfer learning for multiple domains,” in
Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM
’12, (New York, NY, USA), pp. 425–434, ACM, 2012.
[46] B. Li, Q. Yang, and X. Xue, “Transfer learning for collaborative filtering via a rating-matrix generative
model,” in Proceedings of the 26th Annual International Conference on Machine Learning, ICML ’09,
(New York, NY, USA), pp. 617–624, ACM, 2009.
[47] H.-N. Kim, A. El-Saddik, and G.-S. Jo, “Collaborative error-reflected models for cold-start recommender
systems,” Decision Support Systems, vol. 51, no. 3, pp. 519–531, 2011.
[48] H. J. Ahn, “A new similarity measure for collaborative filtering to alleviate the new user cold-starting
problem,” Information Sciences, vol. 178, no. 1, pp. 37–51, 2008.
[49] J. Bobadilla, F. Ortega, A. Hernando, and J. Bernal, “A collaborative filtering approach to mitigate the new
user cold start problem,” Knowledge-Based Systems, vol. 26, pp. 225–238, 2012.
[50] M.-H. Nadimi-Shahraki and M. Bahadorpour, “Cold-start problem in collaborative recommender systems:
Efficient methods based on ask-to-rate technique,” Journal of computing and information technology,
vol. 22, no. 2, pp. 105–113, 2014.