Dsa Unit 1 PDF
1
Introduction to Data Science: Review, Challenges,
and Opportunities
G. R. Sinha
Myanmar Institute of Information Technology (MIIT), Mandalay, Myanmar
CONTENTS
1.1 Introduction
1.2 Data Science
1.2.1 Classification
1.2.2 Regression
1.2.3 Deep Learning
1.2.4 Clustering
1.2.5 Association Rules
1.2.6 Time Series Analysis
1.3 Applications of Data Science in Various Domains
1.3.1 Economic Analysis of Electric Consumption
1.3.2 Stock Market Prediction
1.3.3 Bioinformatics
1.3.4 Social Media Analytics
1.3.5 Email Mining
1.3.6 Big Data Analysis Mining Methods
1.4 Challenges and Opportunities
1.4.1 Challenges in Mathematical and Statistical Foundations
1.4.2 Challenges in Social Issues
1.4.3 Data-to-Decision and Actions
1.4.4 Data Storage and Management Systems
1.4.5 Data Quality Enhancement
1.4.6 Deep Analytics and Discovery
1.4.7 High-Performance Processing and Analytics
1.4.8 Networking, Communication, and Interoperation
1.5 Tools for Data Scientists
1.5.1 Cloud Infrastructure
1.5.2 Data/Application Integration
1.5.3 Master Data Management
1.5.4 Data Preparation and Processing
1.5.5 Analytics
1.5.6 Visualization
1.1 Introduction
Data science is a new area of research concerned with very large volumes of data; it involves
collecting, preparing, visualizing, managing, and preserving that data. Even though the term
data science suggests subject areas like computer science and databases, it also requires
other skills, including non-mathematical ones. Data science not only combines data analysis,
statistics, and other methods, but also includes the corresponding results. It is intended to
analyze and understand the phenomena behind the data by revealing hidden features of complex
social, human, and natural phenomena from a point of view other than that of traditional
methods.
Data science includes three stages: designing the data, collecting the data, and finally
analyzing the data. There is an exponential increase in the applicability of data science in
various areas because data science has been making enormous strides in data processing
and use. Business analytics, social media, data mining, and other disciplines have bene-
fited due to the advance in data science and have shown good results in the literature.
Data science has made remarkable advancements in the fields of ensemble machine
learning, hybrid machine learning, and deep learning. Machine learning (ML) methods can
learn from data with minimal human intervention. Deep learning (DL) is a subset of
ML that is applicable in different areas, like self-driving cars, earthquake prediction, and so
on. Much evidence in the literature shows the superiority of DL over conventional ML
methods, including artificial neural networks, k-nearest neighbors, and support vector
machines (SVM), in different disciplines, such as medicine, social media, and so on.
Torabi et al. developed a hybrid model that combines two predictive machine learning
algorithms [1]; an additional optimization-based method is used to maximize the prediction
function. Mosavi and Edalatifar illustrated that hybrid machine learning models can be more
accurate than single machine learning models [2].
This chapter presents a review of various data science methods and details how they are
used to deal with critical challenges that arise when working with big data analytics.
According to the literature, different classification, regression, clustering, and deep learning–
based methods have often been used. However, there is an opportunity to improve in new
areas, like temporal and frequent pattern discovery for load prediction. This chapter also
discusses the future trends of data science, to explore new tools and algorithms that are
capable of intelligently handling large datasets that are collected from various sources.
FIGURE 1.1
Data science process
and information technology—like email information privacy, market, stock data, data sci-
ence, and real-time monitoring—have also been a good influence.
It is well known that data science builds algorithms and systems for discovering knowl-
edge, detecting the patterns, and generating useful information from massive data. To do
so, it encompasses an entire data analysis process that starts with the extraction of data and
cleaning, and extends to data analysis, description, and summarization. Figure 1.1 depicts
the complete process. It starts with data collection. Next, the data is cleaned to select the
segment that has the most valuable information. To do so, the user filters the data or
formulates queries that remove unnecessary information. After the data is prepared, an
exploratory analysis that includes visualization tools helps decide which algorithms are
suitable for gaining the required knowledge. This complete process guides the user toward
the results that will help them make suitable decisions.
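The collection, cleaning, and exploration steps described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the record layout, the -999.0 placeholder for failed measurements, and the function names clean and explore are hypothetical and do not come from the chapter.

```python
from statistics import mean, median

# Hypothetical raw records, as they might arrive from the data collection step.
raw_records = [
    {"sensor": "a", "reading": 10.0},
    {"sensor": "a", "reading": None},     # missing value
    {"sensor": "b", "reading": 12.5},
    {"sensor": "b", "reading": -999.0},   # assumed placeholder for a failed measurement
    {"sensor": "a", "reading": 11.5},
]

def clean(records, invalid=-999.0):
    """Cleaning step: filter out records that carry no usable information."""
    return [r for r in records
            if r["reading"] is not None and r["reading"] != invalid]

def explore(records):
    """Exploratory step: summarize the cleaned data to guide algorithm choice."""
    values = [r["reading"] for r in records]
    return {"count": len(values), "mean": mean(values), "median": median(values)}

cleaned = clean(raw_records)
summary = explore(cleaned)
print(summary)
```

In practice the summary would be complemented by plots, and an expert would decide from it whether the cleaning rules or the chosen algorithm need to be revised.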
Depending on the primary outcomes, the complete process should be fine-tuned to
obtain improved results. This will involve changing the parameter values or making
changes to the datasets. These kinds of decisions are not made automatically, so the
involvement of an expert in result analysis is a crucial factor.
From a technical point of view, data science consists of a set of tools and techniques that
deals with various goals corresponding to multiple situations. Some of the recent methods
used are clustering, classification, deep learning, regression, association rule mining, and
time-series analysis. Even though these methods are often used in text mining and other
areas, anomaly detection and sequence analysis are also helpful to provide excellent results
for text mining problems.
1.2.1 Classification
Wu et al. describe classification as predicting the classes of a set of objects based on their
attributes. Decision trees (DTs) are used to perform and visualize that classification [3]. DTs
may be generated using various algorithms, such as ID3, CLS, CART, C4.5, and C5.0. Random
forest (RF) is another classifier: it constructs a set of DTs and then predicts by aggregating
the values generated by each DT. A classification model has also been developed using a
technique known as the Least Squares Support Vector Machine (LS-SVM). LS-SVM performs the
classification task by using a hyperplane in a multidimensional space to separate the dataset
into the target classes [4].
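The aggregation step of a random forest can be illustrated with a short sketch. It assumes majority voting as the aggregation rule and uses three hand-written decision stumps in place of trained trees; the feature names, thresholds, and class labels are hypothetical.

```python
from collections import Counter

# Hand-written decision stumps stand in for trained decision trees.
# Feature names and thresholds are illustrative assumptions.
def stump_height(x):
    return "tall" if x["height_cm"] > 175 else "short"

def stump_shoe(x):
    return "tall" if x["shoe_size"] > 43 else "short"

def stump_reach(x):
    return "tall" if x["reach_cm"] > 180 else "short"

def forest_predict(trees, x):
    """Aggregate the outputs of the individual trees by majority vote."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

trees = [stump_height, stump_shoe, stump_reach]
sample = {"height_cm": 178, "shoe_size": 42, "reach_cm": 183}
prediction = forest_predict(trees, sample)
print(prediction)
```

Two of the three stumps vote for the same class here, so the ensemble follows them even though one tree disagrees. A real random forest would additionally train each tree on a bootstrap sample with a random subset of features.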
1.2.2 Regression
Regression analysis aims at the numerical estimation of the relationship between variables.
This involves estimating whether or not the variables are independent. If a variable is not
independent, the first step is to determine the type of dependence. As described by
Chatterjee et al., regression analysis is often used for prediction and forecasting, and also
to understand how the dependent variable changes for fixed values of the independent
variables [5].
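As a small worked example of estimating such a relationship, the sketch below fits a straight line by ordinary least squares and uses it to forecast a new value. The data and the function name fit_line are made up for illustration; simple linear regression is only one of many regression models.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Illustrative data that lie exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
forecast = slope * 5.0 + intercept   # predicted value of y at x = 5
print(slope, intercept, forecast)
```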
1.2.4 Clustering
Jain et al. reviewed clustering methods based on the degree of similarity [8]. In clustering,
the objects are separated into groups called clusters. This type of learning is called
unsupervised learning, as there is no prior knowledge of the classes or of the group to which
each object belongs. Depending on the similarity criterion, cluster analysis has various
models: (i) connectivity models, which are built from connectivity distance, e.g.,
hierarchical clustering; (ii) centroid models, in which objects are assigned to the nearest
cluster center, e.g., k-means; (iii) distribution models, which are built from statistical
distributions, e.g., the expectation-maximization algorithm; (iv) density models, in which
clusters are defined by high-density areas in the data; (v) graph-based models, in which
graphs are used to express the dataset.
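The centroid model can be made concrete with a minimal k-means sketch. The two-dimensional points, the fixed starting centers, and the fixed number of iterations are assumptions chosen to keep the example deterministic; practical implementations usually pick random starting centers and stop when the assignments no longer change.

```python
def kmeans(points, centers, iterations=10):
    """Plain k-means on 2-D points with given starting centers."""
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else c
            for cl, c in zip(clusters, centers)
        ]
    return centers, clusters

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, clusters = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers)
```

No class labels are supplied anywhere: the two groups emerge only from the distances between the points, which is what makes the method unsupervised.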
FIGURE 1.2
Data Science Techniques
example, the generalized rule induction algorithm and its adaptations are often used, per
Tan et al. [10].
1.3.3 Bioinformatics
Bioinformatics is a new area that uses computers to understand biological data, such as
genomic and genetic data. This helps scientists understand the causes of disease,
physiological properties, and genetic properties. Baldi et al. [17] utilized various
techniques to estimate the applicability and efficiency of different predictive methods in
the classification task. Earlier error estimation techniques focused primarily on supervised
learning using microarray data. Michiels et al. [18] used multiple random datasets to predict
cancer using microarray data. Ambroise and McLachlan [19] addressed a gene selection problem
based on microarray data, using 10-fold cross-validation and 0.632 bootstrap error estimates
to deal with overfitted prediction rules. The accuracy of 0.632 bootstrap estimators for
microarray classification with small datasets is examined in Braga-Neto and Dougherty [20].
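For readers unfamiliar with the 0.632 bootstrap, the following sketch shows the idea under simplifying assumptions. It combines the optimistic resubstitution error with the average error on samples left out of each bootstrap draw, weighted 0.368 and 0.632. The toy one-dimensional nearest-mean classifier, the data, and the per-round averaging of out-of-bag errors are illustrative choices, not the procedure used in the cited studies.

```python
import random

def fit_nearest_mean(xs, ys):
    """Toy 1-D classifier (assumed for illustration): one mean per class."""
    means = {}
    for label in set(ys):
        values = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(values) / len(values)
    return means

def error_rate(means, xs, ys):
    """Fraction of samples whose nearest class mean has the wrong label."""
    wrong = 0
    for x, y in zip(xs, ys):
        predicted = min(means, key=lambda label: abs(x - means[label]))
        wrong += predicted != y
    return wrong / len(xs)

def bootstrap_632(xs, ys, rounds=50, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    # Resubstitution error: train and test on the same data (optimistic).
    resub = error_rate(fit_nearest_mean(xs, ys), xs, ys)
    oob_errors = []
    for _ in range(rounds):
        drawn = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        held_out = [i for i in range(n) if i not in drawn]
        # Skip degenerate draws: nothing held out, or only one class drawn.
        if not held_out or len({ys[i] for i in drawn}) < 2:
            continue
        model = fit_nearest_mean([xs[i] for i in drawn], [ys[i] for i in drawn])
        oob_errors.append(error_rate(model,
                                     [xs[i] for i in held_out],
                                     [ys[i] for i in held_out]))
    if not oob_errors:
        return resub
    oob = sum(oob_errors) / len(oob_errors)
    return 0.368 * resub + 0.632 * oob

# Two well-separated classes: every held-out sample is classified correctly.
separated = bootstrap_632([0.9, 1.0, 1.1, 1.2, 4.9, 5.0, 5.1, 5.2],
                          ["a"] * 4 + ["b"] * 4)
# Overlapping classes: the estimate lies somewhere between 0 and 1.
overlapping = bootstrap_632([1.0, 2.0, 3.0, 4.0, 2.5, 3.5, 4.5, 5.5],
                            ["a"] * 4 + ["b"] * 4)
print(separated, overlapping)
```

The weighting reflects that a bootstrap sample contains on average about 63.2% of the distinct original samples, so the held-out error alone tends to be pessimistic and the resubstitution error alone optimistic.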
study has been carried out by using maximum entropy, naïve Bayes, and positive-negative
word counting. Wolny [22] proposed a model to recognize the emotion in Twitter data and
performed an emotion analysis study. Here, the feelings and sentiments were discussed in
detail by explaining the existing methods.
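Of the techniques named above, positive-negative word counting is the simplest to illustrate. The sketch below is a minimal version under stated assumptions: the two tiny word lists are invented for the example, whereas real studies rely on much larger sentiment lexicons.

```python
# Tiny illustrative lexicons; real systems use far larger word lists.
POSITIVE = {"good", "great", "happy", "love"}
NEGATIVE = {"bad", "sad", "hate", "terrible"}

def sentiment(text):
    """Label a text by comparing counts of positive and negative words."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this great phone!"))
print(sentiment("What a terrible, sad day."))
print(sentiment("The parcel arrived on Tuesday."))
```

Word counting ignores negation and sarcasm, which is one reason it is typically compared against trained classifiers such as naïve Bayes or maximum entropy.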
The emotion and sentiment are classified based on symbols via an unsupervised classi-
fier, and the lexicon was explained by suggesting future research. Coviello et al. [23] have
analyzed the emotion contagion related to Facebook data. The instrumental variable
regression technique has been used to analyze the Facebook data. Here, the emotions of the
people, such as negative and positive emotions during rainy days, were detected. Roelens
et al. [24] explained that detecting the people who influence social networks is a difficult
research problem, but one of great interest, so that referral marketing and product
information can reach the largest possible network.
TABLE 1.1
An overview of data science methods used in different applications
S.no Applications Methods Source
1.5.5 Analytics
Analytics includes commercial tools like RapidMiner [37], MATLAB, IBM SPSS Modeler
and SPSS Statistics, SAS Enterprise Miner, and so on, in addition to some new tools, like
Google Cloud Prediction API, MLbase, BigML [38], DataRobot, and others.
1.5.6 Visualization
Commercial and free visualization software listed on KDnuggets [39] includes Miner3D,
IRIS Explorer, Interactive Data Language, Quadrigram, ScienceGL, and so on.
1.5.7 Programming
Additionally, Java, Python, SQL, SAS, and R languages have been used for data analytics.
Some data scientists have also included Go, Ruby, .NET, and JavaScript [40].
FIGURE 1.3
Data Science Programming Models
1.6 Conclusion
This chapter has surveyed the modern advances in information technology, and the influ-
ence these advances have had on big data analytics and its applications. The effectiveness
of different data science algorithms that can be applied to solve the challenges in big data
has been examined. Data science algorithms will be extensively used in the future to
address the problems and challenges in big data applications.
In various areas, the exploitation and discovery of meaningful insights from the dataset
will be very much required. Big data applications are necessary in different fields like
industry, government, and so on. This new perspective will challenge research groups to
develop better solutions for managing large, heterogeneous volumes of real-time data and for
dealing with the uncertainty associated with them. Data science techniques reveal important
tools that can extract and exploit the information and knowledge that exists in the user
dataset. In the coming days, big data techniques will increase possibilities, and may also
democratize them.
References
1. Torabi, M., Hashemi, S., Saybani, M. R., Shamshirband, S., & Mosavi, A. (2019). A Hybrid clus-
tering and classification technique for forecasting short-term energy consumption. Environmental
Progress & Sustainable Energy, 38(1), 66–76.
2. Mosavi, A., & Edalatifar, M. (2018). A hybrid neuro-fuzzy algorithm for prediction of reference
evapotranspiration. In International conference on global research and education (pp. 235–243).
Cham: Springer.
3. Wu, X., Kumar, V., Quinlan, J. R., Ghosh, J., Yang, Q., Motoda, H., … & Zhou, Z. H. (2008). Top
10 algorithms in data mining. Knowledge and information systems, 14(1), 1–37.
4. Suykens, J. A., Van Gestel, T., & De Brabanter, J (2002). Least squares support vector machines.
World Scientific.
5. Chatterjee, S., Hadi, A. S., & Price, B. (2000). Regression analysis by example. New York: John
Wiley & Sons Inc.
6. Fischer, T., & Krauss, C. (2018). Deep learning with long short-term memory networks for
financial market predictions. European Journal of Operational Research, 270(2), 654–669.
7. Tamura, K., Uenoyama, K., Iitsuka, S., & Matsuo, Y. (2018). Model for evaluation of stock values
by ensemble model using deep learning.
8. Jain, A. K., Murty, M. N., & Flynn, P. J. (1999). Data clustering: a review. ACM computing surveys
(CSUR), 31(3), 264–323.
9. Verma, M., Srivastava, M., Chack, N., Diswar, A. K., & Gupta, N. (2012). A comparative study
of various clustering algorithms in data mining. International Journal of Engineering Research and
Applications (IJERA), 2(3), 1379–1384.
10. Tan, P. N., Steinbach, M., & Kumar, V. (2016). Introduction to data mining. Delhi: Pearson
Education India.
11. Das, S. (1994). Time series analysis. (Vol 10). Princeton, NJ: Princeton University Press.
12. Hüllermeier, E. (2005). Fuzzy methods in machine learning and data mining: status and pros-
pects. Fuzzy Sets and Systems, 156(3), 387–406.
13. Bezdek, J. C., Ehrlich, R., & Full, W. (1984). FCM: the fuzzy c-means clustering algorithm.
Computers & Geosciences, 10(2–3), 191–203.
14. Chicco, G., Napoli, R., Piglione, F., Postolache, P., Scutariu, M., & Toader, C. (2004). Load pat-
tern-based classification of electricity customers. IEEE Transactions on Power Systems, 19(2),
1232–1239.
15. Figueiredo, V., Rodrigues, F., Vale, Z., & Gouveia, J. B. (2005). An electric energy consumer
characterization framework based on data mining techniques. IEEE Transactions on power sys-
tems, 20(2), 596–602.
16. Sharaff, A., & Srinivasarao, U. (2020). Towards classification of email through selection of infor-
mative features. In 2020 First International Conference on Power, Control and Computing Technologies
(ICPC2T) (pp. 316–320). IEEE.
17. Baldi, P., Brunak, S., Chauvin, Y., Andersen, C. A., & Nielsen, H. (2000). Assessing the accuracy
of prediction algorithms for classification: an overview. Bioinformatics, 16(5), 412–424.
18. Michiels, S., Koscielny, S., & Hill, C. (2005). Prediction of cancer outcome with microarrays: a
multiple random validation strategy. The Lancet, 365(9458), 488–492.
19. Ambroise, C., & McLachlan, G. J. (2002). Selection bias in gene extraction on the basis of micro-
array gene-expression data. Proceedings of the national academy of sciences, 99(10), 6562–6566.
20. Braga-Neto, U. M., & Dougherty, E. R. (2004). Is cross-validation valid for small-sample micro-
array classification? Bioinformatics, 20(3), 374–380.
21. Joshi, S., & Deshpande, D. (2018). Twitter sentiment analysis system. International Journal of
Computer Applications, 180(47), 0975–8887.
22. Wolny, W. (2016). Emotion analysis of twitter data that use emoticons and emoji ideograms.
23. Coviello, L., Sohn, Y., Kramer, A. D., Marlow, C., Franceschetti, M., Christakis, N. A., & Fowler,
J. H. (2014). Detecting emotional contagion in massive social networks. PloS one, 9(3), e90315.
24. Roelens, I., Baecke, P., & Benoit, D. F. (2016). Identifying influencers in a social network: the
value of real referral data. Decision Support Systems, 91, 25–36.
25. Gudkova, D., Vergelis, M., Demidova, N., & Shcherbakova, T. (2017). Spam and phishing in
Q2 2017. Securelist, Spam and phishing reports.
https://securelist.com/spam-and-phishing-in-q2-2017/81537/.
26. Caruana, G., & Li, M. (2008). A survey of emerging approaches to spam filtering. ACM
Computing Surveys (CSUR), 44(2), 1–27.
27. Dada, E. G., Bassi, J. S., Chiroma, H., Adetunmbi, A. O., & Ajibuwa, O. E. (2019). Machine learn-
ing for email spam filtering: review, approaches and open research problems. Heliyon, 5(6),
e01802.
28. Bhowmick, A., & Hazarika, S. M. (2016). Machine learning for e-mail spam filtering: review,
techniques and trends. arXiv preprint arXiv:1606.01042.
29. Aggarwal, C. C., & Zhai, C. (Eds.). (2012). Mining text data. Springer Science & Business Media.
30. Sharaff, A., & Nagwani, N. K. (2016). Email thread identification using latent Dirichlet alloca-
tion and non-negative matrix factorization based clustering techniques. Journal of Information
Science, 42(2), 200–212.
31. Laney, D. (2001). 3D data management: controlling data volume, velocity and variety. META
group research note, 6(70), 1.
32. Chen, M. M. S., & Liu, Y. (2014). Big Data: A Survey. Mobile Networks and Applications, 19,
171–209.
33. Liu, L. (2013). Computing infrastructure for big data processing. Frontiers of Computer Science,
7(2), 165–170.
34. Han, X., Li, J., Yang, D., & Wang, J. (2012). Efficient skyline computation on big data. IEEE
Transactions on Knowledge and Data Engineering, 25(11), 2521–2535.
35. Cao, L. (2017). Data science: challenges and directions. Communications of the ACM, 60(8),
59–68.
36. Stodder, D., & Matters, W. D. P. (2016). Improving data preparation for business analytics.
Applying technologies and methods for establishing trusted data assets for more productive
users. Best Practices Report Q, 3(2016), 19–21.
37. RapidMiner. (2016). RapidMiner. https://rapidminer.com/.
38. BigML. (2016). BigML. Retrieved from https://bigml.com/.
39. KDnuggets. (2015). Visualization Software. Retrieved from
http://www.kdnuggets.com/software/visualization.html.
40. Davis, J. (2016). 10 Programming Languages And Tools Data Scientists Used.
41. Wikipedia. (2016). Comparison of Cluster Software. Retrieved from
https://en.wikipedia.org/wiki/Comparison_of_cluster_software.
42. Capterra. (2016). Top Reporting Software Products. Retrieved from
http://www.capterra.com/reporting-software/.
43. Desale, D. (2015). Top 30 Social Network Analysis and Visualization Tools. KDnuggets.
https://www.kdnuggets.com/2015/06/top-30-social-network-analysis-visualization-tools.html.
2
Recommender Systems: Challenges and
Opportunities in the Age of Big Data and Artificial
Intelligence
Mehdi Elahi
University of Bergen, Bergen, Norway
CONTENTS
2.1 Introduction
2.2 Methods
2.2.1 Classical
2.2.2 Collaborative Filtering
2.2.3 Content-Based Recommendation
2.2.4 Hybrid FM
2.2.5 Modern Recommender Systems
2.2.6 Data-Driven Recommendations
2.2.7 Knowledge-Driven Recommendations
2.2.8 Cognition-Driven Recommendations
2.3 Application
2.3.1 Classic
2.3.1.1 Multimedia
2.3.1.2 Tourism
2.3.1.3 Food
2.3.1.4 Fashion
2.3.2 Modern
2.3.2.1 Financial Technology (Fintech)
2.3.2.2 Education
2.3.2.3 Recruitment
2.4 Challenges
2.4.1 Cold Start
2.4.2 Context Awareness
2.4.3 Style Awareness
2.5 Advanced Topics
2.5.1 AI-Enabled Recommendations
2.5.2 Cognition Aware
2.5.3 Intelligent Personalization
2.5.4 Intelligent Ranking
2.5.5 Intelligent Customer Engagement
2.1 Introduction
In the era of big data, choosing the right products is a challenge for consumers due to
the massive volume, velocity, and variety of related data produced online. Because of this,
users are increasingly overwhelmed when choosing among a seemingly unlimited set
of options. Recommender systems are support applications that can deal with this challenge
by assisting shoppers in deciding what to purchase (Jannach, Zanker, Felfernig, and
Friedrich, 2010; Resnick and Varian, 1997; Ricci, Rokach, and Shapira, 2015). Recommender
systems can learn from the particular preferences and tastes of users and build personalized
suggestions tailored to users’ preferences and needs, rather than offering suggestions
based on mainstream taste (Elahi, 2011; Elahi, Repsys, and Ricci, 2011).
Many recommender systems and algorithms have been proposed so far by the academic and
industrial communities. Most of these algorithms can take various types of input data and
exploit them to generate recommendations. These data types can describe either the item
content (e.g., category, brand, and tags) or the user preferences (e.g., ratings, likes, and
clicks). The data is collected, pre-processed, and cleaned, and then exploited to build a
model in which the items are represented as arrays of features. A recommendation list for a
specific user is then made by selecting the items whose features resemble those of the items
that the user liked or rated highly.
The enhanced capabilities of recommender techniques in understanding the varied categories
of user tastes and precisely tackling information overload have made them an important part
of any online shop that must cope with an expanding item catalog (Burke, 2002; Elahi, 2014).
Diverse categories of recommender engines have been built in order to generate personalized
selections and relevant recommendations of products and services, ranging from clothing and
outfits to movies and music. Such personalized selections and suggestions are usually made
based on the big data of a huge community of connected users, and by calculating the patterns
and relationships among their preferences (Chao, Huiskes, Gritti, and Ciuhu, 2009; Elahi,
2011; Elahi and Qi, 2020; He and McAuley, 2016; Nguyen, Almenningen, Havig, Schistad,
Kofod-Petersen, Langseth, and Ramampiaro, 2014; Quanping, 2015; Tu and Dong, 2010). The
strong performance of recommender systems has been validated in a diverse range of
e-commerce applications where a choice support mechanism is necessary to handle customers’
needs and help them when interacting with online shops. Such assistance improves the user
experience when shopping or browsing the system catalogue (He and McAuley, 2016; Tu and
Dong, 2010).
In this chapter, we will provide an outline of different types of real-world recommender
systems, along with challenges and opportunities in the age of big data and AI. We will
discuss the progress in cognitive technology, in addition to evolutionary development in
areas such as AI (with all relevant disciplines such as ML, DL, and NLP), KR, and HCI, and
how they can empower recommender systems to effectively support their users.
We discuss that modern recommendation systems require access to and the ability to
understand big data, in all different forms, and that big data generated on data islands can
2.2 Methods
2.2.1 Classical
Diverse recommendation approaches have already been developed and tested, which can
be classified into a number of categories. A well-adopted category of methods is called
content-based (Pazzani and Billsus, 2007). Methods within this category suggest items
based on their descriptors (Balabanović and Shoham, 1997). For example, book recommender
systems take terms within the text of a book as descriptors and suggest to the user
other books whose descriptors are similar to those of books the user liked in the past.
Another popular category is collaborative filtering (Desrosiers and Karypis, 2011; Koren and
Bell, 2011). Collaborative filtering methods predict the preferences (i.e., ratings) of users
by learning from the preferences that a set of users provided for items, and suggest to users
those items with the highest predicted preferences. Methods within the demographic (Wang,
Chan, and Ngai, 2012) category generate recommendations by identifying similar users
based on their demographics (Pazzani, 1999). These methods attempt to group
existing users by their personal descriptors and make relevant suggestions based on their
demographic descriptions. Knowledge-based (Felfernig and Burke, 2008) methods are
another category that tries to suggest items that are inferred from the needs and constraints
entered by users (Burke, 2000). Knowledge-based methods are distinguished by their
knowledge about how a specific item fulfills a particular user’s needs (Claypool, Gokhale,
Miranda, Murnikov, Netes, and Sartin, 1999). Hence, these methods can draw inferences
based on the connections between the user’s need and the possible recommendation. Hybrid
(Li and Kim, 2003) methods combine several of the individual methods noted earlier
in order to overcome the particular limitations of an individual method.
filtering can sort the items based on the predicted ratings and recommend those with the
highest ratings.
Classical methods in collaborative filtering systems are neighbor-based: they compute
user-to-user or item-to-item similarities based on the co-rating patterns of the users and
items. In item-based collaborative filtering, items are considered similar if the community
of interconnected users has rated those items in a similar way. Analogously, in user-based
collaborative filtering, users with similar rating patterns form neighborhoods that
are used for rating prediction. Hence, rating predictions are made based on how the
item has been rated by other users who are considered like-minded with respect to
the target user.
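The item-to-item similarity computation can be sketched as follows. The example assumes cosine similarity over the users who rated both items, which is one common choice rather than the only one; the users, items, and ratings are invented.

```python
from math import sqrt

# Hypothetical user-item ratings on a 1-5 scale.
ratings = {
    "alice": {"item1": 5, "item2": 4, "item3": 1},
    "bob":   {"item1": 4, "item2": 5, "item3": 2},
    "carol": {"item1": 1, "item2": 2, "item3": 5},
}

def item_vector(item):
    """Collect the ratings that each user gave to one item."""
    return {user: r[item] for user, r in ratings.items() if item in r}

def cosine_similarity(item_a, item_b):
    """Cosine similarity of two items over the users who rated both."""
    a, b = item_vector(item_a), item_vector(item_b)
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(a[u] ** 2 for u in common))
    norm_b = sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)

sim_12 = cosine_similarity("item1", "item2")
sim_13 = cosine_similarity("item1", "item3")
print(sim_12, sim_13)
```

Because the three users rate item1 and item2 in the same pattern and item3 in the opposite pattern, item2 ends up as the closer neighbor of item1. The user-based variant swaps the roles of users and items.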
Another category of collaborative filtering systems adopts latent factor models in order
to generate rating predictions. A well-adopted category of these methods is matrix
factorization (Koren, 2008b; Koren and Bell, 2011). Matrix factorization builds mathematical
models on top of ratings data and forms a vector of factors for each user and each item.
These vectors, all of equal length, are learned from the ratings elicited from users. Each
factor in an item’s vector represents the degree to which the item exhibits a particular
latent aspect of user preference. In the movie domain, as an example, item factors could be
interpreted as the genre of the movie, while user factors could describe the taste of the
users toward such genres.
In order to identify such factors, matrix factorization decomposes the rating matrix into
different matrices:
$$R \approx S M^{T} \tag{2.1}$$
$$\hat{r}_{ui} = \sum_{f=1}^{F} s_{uf}\, m_{if} \tag{2.2}$$
where $s_{uf}$ describes the level of user $u$'s preference toward factor $f$, and $m_{if}$
describes the strength of factor $f$ in item $i$ (Koren, 2008b).
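To make Eqs. (2.1) and (2.2) concrete, the sketch below learns the user factors S and the item factors M from a handful of observed ratings using stochastic gradient descent. The training procedure, the hyperparameters (learning rate, regularization, number of epochs), and the toy data are assumptions chosen for illustration; the cited works describe more refined variants.

```python
import random


def factorize(ratings, n_users, n_items, n_factors=2,
              lr=0.01, reg=0.02, epochs=1000, seed=0):
    """Learn S (users x F) and M (items x F) so that r_ui ~ sum_f s_uf * m_if."""
    rng = random.Random(seed)
    S = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    M = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(S[u][f] * M[i][f] for f in range(n_factors))
            err = r - pred
            for f in range(n_factors):
                s_uf, m_if = S[u][f], M[i][f]
                # Gradient step on the squared error with L2 regularization
                S[u][f] += lr * (err * m_if - reg * s_uf)
                M[i][f] += lr * (err * s_uf - reg * m_if)
    return S, M


def predict(S, M, u, i):
    """Eq. (2.2): inner product of the user and item factor vectors."""
    return sum(s * m for s, m in zip(S[u], M[i]))


# Observed (user, item, rating) triples; hypothetical toy data.
observed = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1), (2, 1, 1), (2, 2, 5)]
S, M = factorize(observed, n_users=3, n_items=3)

# Estimate a rating that was never observed (user 0, item 2).
print(predict(S, M, 0, 2))
```

Only the observed entries of the rating matrix contribute to the updates, so the learned factors can be used to estimate the missing entries.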
So far, a diverse spectrum of CBF approaches has been formulated and tested in the
context of recommender systems. A well-adopted method is K-nearest neighbors (KNN),
which exploits similarities in item content and builds suggestions on top of them. The
similarity scores between item j and all other items allow us to build a set of
nearest-neighbor items (i.e., $NN_j$) containing the items with the highest similarity scores
to item j. Accordingly, the preferences (e.g., likes/dislikes or star ratings) that the user
has provided for the items within the nearest-neighbor set are then used to predict the
preference $\hat{r}_{ij}$ for user i and item j:
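$$\hat{r}_{ij} = \frac{\sum_{j' \in NN_j,\; r_{ij'} > 0} r_{ij'}\, s_{jj'}}{\sum_{j' \in NN_j,\; r_{ij'} > 0} s_{jj'}} \tag{2.3}$$
where $s_{jj'}$ is the similarity score between items $j$ and $j'$, and $r_{ij'} > 0$ denotes the
elements of the preference matrix $R$, i.e., the user ratings included in the matrix of all ratings.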
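The prediction rule in Eq. (2.3) can be sketched in a few lines of Python. The similarity values below are hypothetical stand-ins for content-based scores, and the function and variable names are our own.

```python
def knn_predict(user_ratings, similarities, target_item, k=2):
    """Similarity-weighted average of a user's ratings over the k nearest items.

    user_ratings: {item: rating} for one user (rated items have rating > 0)
    similarities: {(item_a, item_b): score}, symmetric content-based similarities
    """
    def sim(a, b):
        return similarities.get((a, b), similarities.get((b, a), 0.0))

    # NN_j: the k items most similar to the target item
    candidates = {i for pair in similarities for i in pair if i != target_item}
    neighbors = sorted(candidates, key=lambda i: sim(target_item, i), reverse=True)[:k]

    # Keep only the neighbors that the user has actually rated
    rated = [i for i in neighbors if user_ratings.get(i, 0) > 0]
    denom = sum(sim(target_item, i) for i in rated)
    if denom == 0:
        return None
    return sum(user_ratings[i] * sim(target_item, i) for i in rated) / denom


# Hypothetical content similarities between film1 and three other films
similarities = {("film1", "film2"): 0.9, ("film1", "film3"): 0.6, ("film1", "film4"): 0.1}
user_ratings = {"film2": 5, "film3": 2, "film4": 1}

# (5 * 0.9 + 2 * 0.6) / (0.9 + 0.6), i.e. approximately 3.8
print(knn_predict(user_ratings, similarities, "film1", k=2))
```

With k = 2, film4 falls outside the neighbor set, so its low rating does not affect the prediction.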
2.2.4 Hybrid FM
While the collaborative filtering method and content-based method have both been largely
adopted by the recommender system community, they have a number of restrictions.
These restrictions will be explained later on in this chapter. In order to address such restric-
tions, hybrid methods have been developed by hybridizing these methods (Low, Bickson,
Gonzalez, Guestrin, Kyrola, and Hellerstein, 2012). While hybrid methods can also have
diverse forms, we briefly introduce one of the most recent methods, called
factorization machines (Burke, 2002; Rendle, 2012).
Factorization machines are a recommender method formed by extending the classical
matrix factorization method (TURI, 2018). Factorization machines hybridize matrix
factorization by combining it with a well-known machine learning method,
support vector machines (SVM). This hybridization enables factorization machines to
take advantage not only of user preferences (e.g., ratings), but also of item
descriptions and any additional data attributed by users. Factorization
machines can thus adopt a wide range of data, typically referred to as side information:
item descriptors (e.g., category, title, or tag) as well as user attributes (e.g., demographics,
emotion, mood, and personality). Hence, factorization machines build mathematical models
on top of user ratings, as well as item descriptors or user attributes, in order to make
preference predictions (Rendle, 2012).
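User preferences (e.g., likes and dislikes, or ratings) are predicted with
the following formula:
$$\hat{r}_{ij} = \mu + w_i + w_j + a^{T} x_i + b^{T} y_j + u_i^{T} v_j \tag{2.4}$$
where $\mu$ denotes the bias factor, $w_i$ is the user weight, $w_j$ is the item weight, and $x_i$ and $y_j$
are the feature sets for the user and the item, respectively; $a$ and $b$ are the weight vectors
for those features, and $u_i$ and $v_j$ are the latent factor vectors of the user and the item.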
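The scoring rule in Eq. (2.4) is a sum of a global bias, user and item weights, two linear terms over the side information, and a latent-factor interaction. The sketch below evaluates that sum for one user-item pair; all parameter values are hypothetical, whereas in practice they are learned from the rating data.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fm_score(mu, w_user, w_item, a, x_user, b, y_item, u, v):
    """r_hat = mu + w_i + w_j + a^T x_i + b^T y_j + u_i^T v_j."""
    return mu + w_user + w_item + dot(a, x_user) + dot(b, y_item) + dot(u, v)


score = fm_score(
    mu=3.0,                          # global bias
    w_user=0.2,                      # user weight w_i
    w_item=-0.1,                     # item weight w_j
    a=[0.5, 0.0], x_user=[1.0, 0.0], # user side features (e.g., demographics)
    b=[0.3], y_item=[1.0],           # item side features (e.g., category)
    u=[1.0, 0.5], v=[0.2, 0.4],      # latent factors of the user and the item
)
# 3.0 + 0.2 - 0.1 + 0.5 + 0.3 + 0.4, i.e. approximately 4.3
print(score)
```

Setting the side-information terms to zero recovers a biased matrix factorization model, which is one way to see factorization machines as an extension of it.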
There are other advanced models (such as Mooney and Roy, 2000; Ahn, Brusilovsky, Grady,
He, and Syn, 2007) that go beyond traditional methods by building probabilistic models
based on the user or item input data. For instance, in Fernández-Tobías and Cantador
(2014) and Manzato (2013), a model called gSVD++ has been developed that can take
advantage of content data incorporated into MF (Koren, 2008a).
FIGURE 2.1
The data-lake-as-a-service architecture (CoreDB Beheshti et al., 2017a).
FIGURE 2.2
CoreKG: knowledge-lake-as-a-service architecture (Beheshti et al., 2018).
[Figure labels: Security (authentication, access control, data encryption, etc.); Index & Search
(SPARQL query, SQL query, full-text search, elastic); Databases and NoSQL (MongoDB, CouchDB,
HBase, Hive); Meta-Data; CRUD.]
Ghafari, Goluguri, and Edrisi, 2020b). For example, a new line of research (Beheshti
et al., 2020b) uses crowdsourcing techniques to capture domain experts’ knowledge and
use it to provide accurate and personalized recommendations. Another line of work
leverages intelligent knowledge lakes (KLs) to address the following two challenges:
(i) The cold-start problem: intelligent knowledge lakes bring in informative data
from a crowd of people and use it to generate recommendations; (ii) Bias and
variance: intelligent knowledge lakes can guide recommender systems
to choose the best next steps by following the best practices learned from domain
experts. This is important, as features used for training recommenders may be gathered by
humans, which allows biases to enter the data preparation and training phases. To build an
intelligent KL, it is important to mimic domain experts’ knowledge. This can be done using
techniques such as collecting feedback, organizing interviews, and requesting surveys. To
achieve this goal, it is also important to capture significant events and entities (and the
relationships among them) that occur in real time in various disciplines and fields, such
as education and fintech.
2.3 Application
2.3.1 Classic
2.3.1.1 Multimedia
Multimedia is probably the most popular application domain in recommender systems.
Multimedia recommender systems can exploit different forms of preference data and can
use different types of multimedia descriptors when creating recommendations (Elahi,
Ricci, and Rubens, 2012; Hazrati and Elahi, 2020). While such features can have different
forms, we can classify them into two main categories: high-level and low-level forms of
descriptors (Cantador, Szomszor, Alani, Fernandez, and Castells, 2008; Hazrati and
Elahi, 2020).
FIGURE 2.3
A contextualized tweet (Beheshti et al., 2019).
High-level descriptors capture the semantic and syntactic characteristics of
multimedia items and can be aggregated either from structured forms of metadata, e.g., a
relational database or an ontology (Cantador et al., 2008; Mooney and Roy, 2000), or from
less structured forms of data, e.g., user reviews, film plots, and social tags (Ahn et al., 2007;
Hazrati and Elahi, 2020).
Low-level descriptors, on the other hand, are aggregated directly from multimedia
files (e.g., audio or visual files). In the music domain, for instance, low-level
descriptors can represent the acoustic properties of the songs (e.g., rhythm, energy,
and melody), which can be adopted by recommender systems to find similar songs and to
generate personalized recommendations for a user (Bogdanov and Herrera, 2011; Bogdanov,
Serra, Wack, Herrera, and Serra, 2011; Knees, Pohle, Schedl, and Widmer, 2007; Seyerlehner,
Schedl, Pohle, and Knees, 2010).
In the video domain, low-level descriptors can represent the visual aspects of the videos
and thus reflect an artistic style (Canini, Benini, and Leonardi, 2013; Lehinevych, Kokkinis-
Ntrenis, Siantikos, Dogruoz, Giannakopoulos, and Konstantopoulos, 2014; Yang, Mei,
Hua, Yang, Yang, and Li, 2007; Zhao, Li, Wang, Yuan, Zha, Li, and Chua, 2011).
Recommendation based on low-level features has not drawn much attention
in multimedia recommender systems. On the other hand, such features have received massive
attention in some related research fields, namely computer vision (Rasheed, Sheikh, and
Shah, 2005) and content-based video retrieval. Despite the differences in their overall aims,
these communities share objectives such as formulating informative descriptors of video and
movie items. Hence, they report outcomes and insights that can be beneficial in the context
of multimedia recommender systems (Brezeale and Cook, 2008; Hu, Xie, Li, Zeng, and
Maybank, 2011; Rasheed et al., 2005).
2.3.1.2 Tourism
Another well-studied domain in the research on recommender systems is tourism. This
is a domain where contextualization plays an important role. We can define contextualization
as the process of incorporating contextual factors (such as weather conditions, travel
goals, and means of transportation) in the generation of recommendations. The idea is to make
personal suggestions by incorporating diverse sources of user data, as well as the conditions
represented by contextual factors (Adomavicius and Tuzhilin, 2011). For example, a group
of tourists may be interested in visiting suggested indoor attractions (e.g., museums) during
bad weather, but in nice weather they may prefer outdoor activities (e.g., hiking).
Recommender systems that are capable of using such contextual factors are known as
context-aware recommender systems (CARS).
CARS exploit mathematical modeling in order to better learn user
preferences in different contextual situations based on diverse sources of data, e.g.,
temperature, season, geographical position, and even vehicle type. Due to the popularity
of this domain, a large body of research has already been conducted
(Baltrunas, Ludwig, Peer, and Ricci, 2012; Chen and Chen, 2014; Gallego,
Woerndl, and Huecas, 2013; Hariri, Mobasher, and Burke, 2012; Kaminskas, Ricci, and
Schedl, 2013; Natarajan, Shin, and Dhillon, 2013). The majority of these works exploit
the context experienced by the user in the recommendation process.
2.3.1.3 Food
Diverse categories of food recommendation systems have recently been
proposed by the community (Trevisiol, Chiarandini, and Baeza-Yates, 2014; West, White,
and Horvitz, 2013). For example, Freyne and Berkovsky (2010) built a food recommendation
system that, through an effective user interaction model, collects user preferences and
generates personalized suggestions. Their system converts the users’ preferences for
recipes into preferences for ingredients, and then merges these converted preferences to
form user suggestions.
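One plausible reading of this recipe-to-ingredient conversion is sketched below: each ingredient inherits the average rating of the rated recipes that contain it, and an unrated recipe is scored by averaging the preferences for its known ingredients. The averaging scheme, the recipe data, and the function names are assumptions made for illustration and are not taken from Freyne and Berkovsky (2010).

```python
from collections import defaultdict

# Hypothetical recipe catalog: recipe -> list of ingredients
recipes = {
    "pasta_pomodoro": ["pasta", "tomato", "basil"],
    "caprese": ["tomato", "mozzarella", "basil"],
    "mushroom_risotto": ["rice", "mushroom"],
    "tomato_soup": ["tomato", "onion"],
}

# Hypothetical ratings given by one user to some of the recipes
user_recipe_ratings = {"pasta_pomodoro": 5, "mushroom_risotto": 2, "tomato_soup": 3}


def ingredient_preferences(recipe_ratings):
    """Convert recipe ratings into ingredient preferences by averaging."""
    collected = defaultdict(list)
    for recipe, rating in recipe_ratings.items():
        for ingredient in recipes[recipe]:
            collected[ingredient].append(rating)
    return {ing: sum(vals) / len(vals) for ing, vals in collected.items()}


def score_recipe(recipe, ing_prefs):
    """Merge ingredient preferences back into a score for a recipe."""
    known = [ing_prefs[i] for i in recipes[recipe] if i in ing_prefs]
    return sum(known) / len(known) if known else None


prefs = ingredient_preferences(user_recipe_ratings)
# tomato -> (5 + 3) / 2 = 4.0, basil -> 5.0, mozzarella unknown, so caprese -> 4.5
print(score_recipe("caprese", prefs))
```

The unrated recipe receives a score even though the user never rated it, because it shares ingredients with recipes the user did rate.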
Elahi, Ge, Ricci, Massimo, and Berkovsky (2014) devised a different approach to food
recommendation that combines the predictions for food along diverse aspects (such as
user food preferences, nutrition, ingredients, and expenditure) to compute a score for a
potential food (or meal). The objective is to take into account factors that may influence
the user’s food choices in order to make a more beneficial set of recommendations (Teng,
Lin, and Adamic, 2012). In their next paper, the same authors assessed
the rating prediction method, which used a variant of MF. This method exploits more data
than ratings alone, such as subjective tags assigned to different recipes by users. It
was found that extra input data on the user preferences allows the technique to
outperform other baseline methods, including those developed in Freyne and Berkovsky
(2013).
Generally speaking, the preferences that are aggregated by a recommender system can
take two forms, i.e., long-term affinities or short-term affinities. While obtaining and
aggregating both forms of preferences is essential, research on recommender systems
rarely distinguishes between these two forms. Only a few research works
have considered such differences (e.g., Ricci and Nguyen, 2007). The noted example is one
of the few works that developed a recommender system that elicits both generic long-term
affinities and specific short-term affinities.
We would like to point out that the traditional line of research on recommender systems
typically underestimates the importance of the human-system interaction model as an essential
component for creating an industrial-grade system. Hence, such research mainly concentrates on
enhancing the core analytical models by supposing that the preference acquisition procedure
is conducted only at the beginning and then ends.
2.3.1.4 Fashion
Fashion traditionally refers to the prevailing form of clothing, and it is characterized
by the concept of change. Fashion includes diverse forms of self-fashioning,
ranging from styles on the street to the high fashion made by designers (Bollen,
Knijnenburg, Willemsen, and Graus, 2010; Person, 2019). One of the biggest issues for this
type of application is the growing diversity and expanding number of fashion products.
This effect can lead to choice overload for fashion consumers. It
is not necessarily negative, since the more options are available, the higher the likelihood
that consumers will find a desired product. However, such an effect may lead to the
impossibility of actually choosing a product, i.e., the problem of receiving too many
options, particularly when they are very diverse (Anderson, 2006).
Recommender techniques are powerful tools that can effectively tackle this issue by
making relevant suggestions of products tailored to the needs of the users. They can
build a filtering mechanism that eliminates uninteresting and irrelevant products from a
shortlist of recommendations. They can thoroughly mine the user data in order to learn
the particular preferences of each individual user. For instance, Amazon can
look into the purchase history of users and build predictive models that can ultimately
be used to make personalized recommendations for the purchaser. Hence, the smart
engine behind the recommender can actively understand the users’ behaviors, and
obtain diverse and informative forms of data describing the user tastes in order to obtain
knowledge on the individual requirements of every user (Rashid, Albert, Cosley, Lam,
Mcnee, Konstan, and Riedl, 2002; Rubens, Elahi, Sugiyama, and Kaplan, 2015; Su and
Khoshgoftaar, 2009).
2.3.2 Modern
2.3.2.1 Financial Technology (Fintech)
Financial technology (fintech) aims to use technology to provide financial services to busi-
nesses or consumers. Any form of recommendation method in this field will need to under-
stand three main dimensions to provide intelligent recommendations: (i) banking entities,
such as customers and products; (ii) banking domain knowledge, such as how different
banking segments operate across customers, sales and distribution, products and services,
people, processes, and technology; and (iii) banking processes, to help understand the best
practices learned by knowledge experts in processes such as fraud detection, customer
segmentation, managing customer data, risk modeling for investment banks, and more.
The main shortcoming of existing recommender systems is that they do not consider domain
experts’ knowledge, and hence may not exploit user-side information such as the cognitive
characteristics of the user. These aspects are vital to support intelligent and time-aware
recommendations.
To support data analytics focusing on customers’ cognitive activities, it is important to
understand customers’ dimensions both from banking and non-banking perspectives, as
depicted in Figure 2.4. Modern approaches, such as cognitive recommender systems (Beheshti
et al., 2020b), model customer behavior and activities with a graph-based data model
(Beheshti, Benatallah, and Motahari-Nezhad, 2016; Hammoud, Rabbou, Nouri, Beheshti,
and Sakr, 2015) over customers’ cognitive graphs to personalize the recommendations.
2.3.2.2 Education
One of the most popular application domains in recommender systems is the field of education.
Education allows individuals to reach their full potential and aids in the development
of societies by reducing poverty and decreasing social inequalities. Recently, the world
has experienced increasing growth in this domain, in both quantity and quality.
This, in turn, has generated several challenges in the education system, such as
instructors’ workload in dealing with assessments and providing recommendations based
on students’ performance and skills assessment.
In this context, recommender systems can be important tools for personalizing
teaching and learning by understanding and analyzing important indicators such as
knowledge, performance (e.g., cognitive, affective, and psychomotor indicators), and skills
(e.g., decision-making and problem-solving). An attractive direction for future work
would be to implement a time-aware deep learning model to construct and analyze learners’
profiles in order to better understand students’ performance and skills. The learning models
would enable recommender systems to identify similarly performing students, which
may facilitate personalizing the learning process, subject selection, and recruitment.
2.3.2.3 Recruitment
Talent acquisition and recruitment processes are examples of ad hoc processes that are
controlled by knowledge workers aiming to achieve a business objective/goal. Attracting
and recruiting the right talent is a key differentiator in modern organizations, and
recommender systems can play an important role in assisting recruiters in the recruitment
process. For example, consider a recommendation engine that has access to LinkedIn profiles,
is able to extract data and knowledge from business artifacts (e.g., candidates’ CV and
position descriptions), has access to curation algorithms to contextualize the data and
knowledge, and is able to link them to the facts in the recruitment domain knowledge base.
FIGURE 2.4
Users’ dimensions in a banking scenario (Beheshti et al., 2020b).
Artificial intelligence (AI) has enabled organizations to create business leverage by
applying cutting-edge automation techniques (Shahbaz, Beheshti, Nobari, Qu, Paik, and
Mahdavi, 2018): (i) improving the overall quality and effectiveness of the recruitment
process; (ii) extracting relevant information from a candidate’s CV automatically; (iii)
aggregating different candidate evaluations and relevant information; (iv) understanding
the best practices used by recruiters; and (v) extracting personality traits and applicant
attitudes from social media sites, something that was traditionally only possible
through interviews. All these techniques can be leveraged by recommender systems
to build effective ranking algorithms that optimize the recommendations and help
maintain a priority pool of talent. AI-enabled recommender systems would be able to
help match the behaviors of the most talented people in their organizations, and help
businesses recruit the right candidates for open jobs by aggregating information from
different sources and then ranking them based on their overall score. Another future line
of work would be to use computer vision algorithms to assess interviews of potential
candidates and compare them to the organization’s best talent in order to make recom-
mendations (Hirevue, 2019).
2.4 Challenges
Recommender systems typically exploit datasets that contain user feedback (e.g., likes and
dislikes, or ratings) representing the preferences expressed by a large crowd of interconnected
users for a large list of items (Desrosiers and Karypis, 2011). Exploiting such data empowers
recommender systems to learn the patterns and connections among users, and to use
them to estimate the missing assessments (likes and dislikes, or ratings) of users for
unexplored items and then suggest items that may be attractive to a target user (Koren and
Bell, 2011).
The above-mentioned procedure is a simplification, and there are many major concerns
that have not been fully addressed so far. Hereafter, we briefly explain some of these
concerns.
$$1 - \frac{\#\ \text{of existing feedbacks}}{\#\ \text{of all possible feedbacks}} \tag{2.5}$$
In acute situations of sparsity, the effectiveness of recommender systems
can deteriorate strongly, resulting in a significant decrease in the performance
of the system. In such a condition, the quantity of available user feedback is much
smaller than the quantity of missing feedback, yet the system still has to produce
predictions with a satisfactory level of quality (Adomavicius and Tuzhilin, 2005; Braunhofer,
Elahi, and Ricci, 2014).
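The quantity in Eq. (2.5) is straightforward to compute once the feedback is stored as a set of user-item pairs. The following sketch assumes that every user could, in principle, give feedback on every item, so that the number of all possible feedbacks is the number of users times the number of items; the data and names are hypothetical.

```python
def sparsity(feedback, n_users, n_items):
    """Eq. (2.5): 1 - (# of existing feedbacks) / (# of all possible feedbacks)."""
    return 1 - len(feedback) / (n_users * n_items)


# Hypothetical dataset: 4 users, 5 items, and only 3 observed feedbacks
feedback = {("u1", "i1"), ("u1", "i3"), ("u4", "i2")}

# 1 - 3 / 20, i.e. approximately 0.85
print(sparsity(feedback, n_users=4, n_items=5))
```

A value close to 1 means that almost all of the user-item matrix is missing.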
Different cold start conditions can take place in actual applications, namely extreme
cold start and mild cold start conditions.
• Extreme cold start conditions take place when a user starts using the system and asks
for a recommendation before producing any feedback. The problem can also happen
when a brand-new product is inserted into the catalog and has no associated data
that can represent that item. This, in turn, can lead to a failure in suggesting that new
product to existing users. Both of these situations are critical issues that have to be
handled actively by the system.
• Mild cold start conditions happen once a user has produced a small amount of feedback
on existing products, and the system can use this limited data to generate a
recommendation. This problem may also take place for a new product when only a small
amount of content data has been produced. Mild cold start can be seen as an
intermediate condition between extreme cold start and warm start. This can still lead to
a failure if not promptly addressed by the system.
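The distinction between these conditions can be expressed as a simple rule on the number of feedbacks available for a user (or an item). The threshold separating mild cold start from warm start is a hypothetical value chosen for illustration; the text does not prescribe one.

```python
def cold_start_level(n_feedbacks, mild_threshold=5):
    """Classify a user or item by the amount of feedback available.

    mild_threshold is an assumed cut-off, not a value given in the text.
    """
    if n_feedbacks == 0:
        return "extreme"  # no feedback at all
    if n_feedbacks < mild_threshold:
        return "mild"     # a small amount of feedback
    return "warm"         # enough feedback for regular prediction


for count in (0, 2, 10):
    print(count, cold_start_level(count))
```

A system could use such a rule to decide when to fall back on non-personalized suggestions or to actively ask the user for more feedback.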
in media have a common belief that the impact of colors becomes larger as they are predis-
posed in making a particular emotional goal.
A number of research works on recommender systems have reported that users’ preferences
can be affected more by low-level descriptors than by high-level descriptors
(expressing the semantic or syntactic forms in films) (Elahi, Deldjoo, Bakhshandegan
Moghaddam, Cella, Cereda, and Cremonesi, 2017; He, Fang, Wang, and McAuley, 2016;
Messina, Dominguez, Parra, Trattner, and Soto, 2018; Rimaz, Elahi, Bakhshandegan
Moghaddam, Trattner, Hosseini, and Tkalcic, 2019; Roy and Guntuku, 2016). Examples of
such low-level descriptors are color energy, shot duration, and lighting key (Wang and
Cheong, 2006), which have been shown to influence user mood and emotion (Roberts, Hager, and
Heron, 1994). In addition, various forms of motion (such as camera movement) can
play a significant role and are commonly adopted by filmmakers when aiming to affect the
perception of movie watchers (Heiderich, 2018). A range of methods and techniques have
been adopted to address the task of learning visual descriptors from films (Ewerth,
Schwalb, Tessmann, and Freisleben, 2004; Savian, Elahi, and Tillo, 2020; Tan, Saur, Kulkarni,
and Ramadge, 2000).
Despite the importance of low-level descriptors, their usage has not drawn
much consideration in recommendation systems (one example is Messina et al. [2018]).
However, these audiovisual descriptors are thoroughly investigated in related areas,
namely within the computer vision community (Naphade and Huang, 2001; Snoek and
Worring, 2005).