
Advances in Modelling and Analysis A

Vol. 61, No. 2, June, 2018, pp. 64-69

Journal homepage: http://iieta.org/Journals/AMA/AMA_A

Comparative study on traditional recommender systems and deep learning based recommender systems

N.L. Anantha 1*, Bhanu P. Bathula 2

1 Acharya Nagarjuna University, Department of IT, VFSTR University, Vadlamudi, Guntur 522213, India
2 Department of CSE, Thirumala Engineering College, Jonnalagadda (V), Narasaraopet-522601, Andhra Pradesh, India

Corresponding Author Email: [email protected]

https://doi.org/10.18280/ama_b.610202

Received: 17 April 2018
Accepted: 4 June 2018

Keywords:
recommender systems, deep learning, item-based collaborative filtering, user-based collaborative filtering, matrix factorization

ABSTRACT

Recommender systems are a big breakthrough for the field of e-commerce. Product recommendation is a challenging task for e-commerce companies. Traditional recommender systems provided solutions for recommending products, which in turn helps companies generate good revenue. Nowadays deep learning is used in every domain, and deep learning techniques can be applied directly in the field of recommender systems. Deep learning offers a large number of algorithms that can be used to recommend products for users to purchase. In this paper, the performance of traditional recommender systems and deep learning-based recommender systems is compared.

1. INTRODUCTION

A Recommender System (RS) is a filtering technique that forms a sub-domain of Information Retrieval. An RS filters data from the web and gives context-oriented data to the user. Recommender systems predict which products are relevant and present those products to users, providing a service to both producers and consumers. Recommender systems help improve the revenue of e-commerce websites; similarly, they help in recommending friends, music, documents and videos. Recommender systems are classified into three categories: content-based filtering, collaborative filtering and knowledge-based systems. These three categories constitute the traditional recommender systems, on which a lot of research has been done. The goal of a recommender system is either to perform prediction or ranking. Over the last couple of years, much research has also focused on deep learning based recommender systems. In this paper, the authors mainly discuss five deep learning techniques: Multi-Layer Perceptron (MLP), AutoEncoder (AE), Restricted Boltzmann Machine (RBM), Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN).
Recommender systems are used in many different areas. Table 1 [1] gives examples of products recommended by recommender systems, and Table 2 lists further traditional recommender systems.

Table 1. Examples of product recommendation

System          Product
Amazon.com      Books and other products
Netflix         DVDs, streaming video
Jester          Jokes
GroupLens       News
MovieLens       Movies
last.fm         Music
Google News     News
Google Search   Advertisements

Table 2. Traditional recommender systems

System          Product
Facebook        Friends, advertisements
Pandora         Music
YouTube         Online videos
Tripadvisor     Travel products
IMDb            Movies

A recommender system [2] has the following phases: the first phase is the information collection phase, the second is the learning phase, and the final phase is the prediction/recommendation phase.
The term recommendation system was first coined by Tapestry [3]. Recommender systems are also called recommendation systems; their purpose is to provide predicted products or to predict the ratings of products. Recommender systems are mainly categorized into three types:
1) Content-based filtering
2) Collaborative filtering
3) Knowledge-based systems
The "content" in content-based filtering refers to descriptions: descriptive attributes are used to give recommendations. Content-based filtering works as follows. Suppose a user has given a good rating to a movie and no other information is available, but the movie has a genre (movie type) such as Action, Animation, Comedy, Fiction or Thriller, and each movie falls under one or more of these categories. Based on the genre, similar movies are recommended to the user. The advantage of content-based recommendation is that when not much information about a user is available, it is the best approach for recommending products. A minimal sketch of this idea is given below.
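As a rough illustration of genre-based content filtering (a toy sketch in Python with a hypothetical three-movie catalogue, not the authors' implementation), each movie is represented as a binary genre vector and compared against a profile built from the movies the user rated highly:

```python
import numpy as np

# Hypothetical toy catalogue: one binary genre vector per movie
# (columns: Action, Animation, Comedy, Fiction, Thriller)
movies = {
    "Movie A": np.array([1, 0, 0, 1, 1]),
    "Movie B": np.array([0, 1, 1, 0, 0]),
    "Movie C": np.array([1, 0, 0, 0, 1]),
}

# Build a user profile from the genre vectors of the movies the user liked
liked = ["Movie A"]
profile = np.mean([movies[m] for m in liked], axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Rank the unseen movies by how close their genres are to the user profile
scores = {m: cosine(profile, v) for m, v in movies.items() if m not in liked}
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```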

News Dude is a personal news system which reads news with the help of synthesized speech. To give recommendations it uses a TF-IDF model to identify the descriptions of the news stories and a cosine similarity measure to identify similar news. LIBRA is a content-based book recommendation system that analyzes books gathered from the web; it uses a Naïve Bayes classifier to learn the user profile and predict books for that user.
Collaborative filtering is the most powerful technique in recommender systems. Its main focus is the ratings given by users on products, and it is mainly used to recommend products based on the user's interest. The big challenges for collaborative filtering are sparse data and the cold-start problem. Collaborative filtering is classified into two categories: item-based recommendation and user-based recommendation. Item-based recommendation identifies the similarity between items, while user-based recommendation identifies the similarity between users based on the products they have purchased or rated. After identifying the similarity between users or items, the system recommends products. To find similar products or similar users it uses K-nearest neighbour or matrix factorization approaches. In another sense, collaborative filtering is further divided into two families: memory-based collaborative filtering and model-based collaborative filtering. In model-based filtering, part of the data is supplied for training; the model learns from that data and is then applied to the test data for either recommendation or prediction. In memory-based filtering, the entire data is given to the model, which learns from it and uses that knowledge for recommendation or prediction.
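The matrix factorization approach mentioned above can be sketched in a few lines: each user and item gets a small latent vector, fitted by stochastic gradient descent on the observed ratings (a minimal sketch on made-up data, not the configuration used in the paper's experiments):

```python
import numpy as np

# Toy user x item rating matrix; 0 means "not rated" (illustrative data only)
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)

n_users, n_items, k = R.shape[0], R.shape[1], 2
rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(n_users, k))   # user latent factors
Q = rng.normal(scale=0.1, size=(n_items, k))   # item latent factors

lr, reg = 0.01, 0.02
observed = [(u, i) for u in range(n_users) for i in range(n_items) if R[u, i] > 0]

# SGD on the observed entries: minimise (r_ui - p_u . q_i)^2 plus L2 regularisation
for epoch in range(500):
    for u, i in observed:
        err = R[u, i] - P[u] @ Q[i]
        pu = P[u].copy()
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * pu - reg * Q[i])

print(np.round(P @ Q.T, 2))   # predicted ratings, including the unrated cells
```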
GroupLens is a client/server-based architecture that uses a CF system; it recommends Usenet news, a high-volume discussion list service on the Internet. Ringo [7] is a user-based CF system which recommends music artists and albums; Ringo asks every new user to rate a list of 125 artists.
Knowledge-based systems are used where products are not purchased frequently, a context in which the cold-start problem arises. Knowledge-based systems provide recommendations using a combination of user ratings, item attributes and domain knowledge. They are further divided into constraint-based recommender systems and case-based recommender systems. Figure 1 gives the classification of the different approaches to recommender systems.

Figure 1. Recommender systems
1.1 Results analysis on traditional recommender systems

To find the similarity between users and items, this paper uses the Pearson correlation coefficient, cosine similarity and Jaccard metrics.
The cosine similarity between two products i and j is calculated as

CS(i,j) = \frac{\sum_{u \in U_{ij}} r_{ui}\, r_{uj}}{\sqrt{\sum_{u \in U_i} r_{ui}^2}\ \sqrt{\sum_{u \in U_j} r_{uj}^2}}    (1)

where r_{ui} is the rating given by user u to item i, U_i is the set of users who rated item i, U_j is the set of users who rated item j, and U_{ij} is the set of users who rated both items i and j.
Another widespread measure used to compute similarity is the Pearson correlation similarity:

PS(i,j) = \frac{\sum_{u \in U_{ij}} (r_{ui} - \bar{r}_i)(r_{uj} - \bar{r}_j)}{\sqrt{\sum_{u \in U_{ij}} (r_{ui} - \bar{r}_i)^2}\ \sqrt{\sum_{u \in U_{ij}} (r_{uj} - \bar{r}_j)^2}}    (2)

where \bar{r}_i and \bar{r}_j are the average ratings of items i and j.
Jaccard similarity measures the similarity between two sets of elements. The Jaccard similarity between two items is computed as

JS(i,j) = \frac{|U_i \cap U_j|}{|U_i \cup U_j|}    (3)

where U_i is the set of users who rated item i and, similarly, U_j is the set of users who rated item j. A sketch of these three measures in code follows.
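A minimal numpy sketch of the three item-item similarity measures in Eqs. (1)-(3), assuming R is a user x item rating matrix in which 0 marks a missing rating (a convention chosen here for illustration; for Eq. (2) the item means are taken over the co-rated users):

```python
import numpy as np

def item_cosine(R, i, j):
    """Eq. (1): cosine similarity between the rating vectors of items i and j."""
    both = (R[:, i] > 0) & (R[:, j] > 0)              # users who rated both items
    if not both.any():
        return 0.0
    num = float(np.sum(R[both, i] * R[both, j]))
    den = np.sqrt(np.sum(R[R[:, i] > 0, i] ** 2)) * np.sqrt(np.sum(R[R[:, j] > 0, j] ** 2))
    return num / den if den else 0.0

def item_pearson(R, i, j):
    """Eq. (2): Pearson correlation over the users who rated both items."""
    both = (R[:, i] > 0) & (R[:, j] > 0)
    if both.sum() < 2:
        return 0.0
    di = R[both, i] - R[both, i].mean()
    dj = R[both, j] - R[both, j].mean()
    den = np.sqrt(np.sum(di ** 2)) * np.sqrt(np.sum(dj ** 2))
    return float(np.sum(di * dj) / den) if den else 0.0

def item_jaccard(R, i, j):
    """Eq. (3): Jaccard similarity between the sets of users who rated i and j."""
    ui, uj = set(np.nonzero(R[:, i])[0]), set(np.nonzero(R[:, j])[0])
    union = ui | uj
    return len(ui & uj) / len(union) if union else 0.0
```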
The following metrics are used to evaluate traditional recommender systems: accuracy, MAE, RMSE, precision and recall, and the ROC curve. Mean Absolute Error [8], Root Mean Square Error [9] and the precision-recall curve [10] are statistical accuracy metrics.
MAE measures the deviation between the user's actual rating and the predicted rating. It is formulated as

MAE = \frac{1}{N} \sum_{u,i} |P_{ui} - r_{ui}|    (4)

where P_{ui} is the predicted rating for item i by user u, r_{ui} is the original rating and N is the total number of ratings. The lower the MAE, the more accurately the recommender engine predicts the ratings. Similarly, the Root Mean Square Error (RMSE) is given by

RMSE = \sqrt{\frac{1}{N} \sum_{u,i} (P_{ui} - r_{ui})^2}    (5)

and a lower RMSE means the ratings predicted by the recommender engine are more accurate.
Precision is the fraction of good products among the total products recommended, and recall is the fraction of all useful products that appear among the recommended products. They are computed as

Precision = \frac{\text{good products recommended}}{\text{total products recommended}}    (6)

Recall = \frac{\text{good products recommended}}{\text{all useful products}}    (7)
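These evaluation metrics are straightforward to express in code (a short sketch; pred and actual are assumed to be user x item matrices with 0 for unrated entries):

```python
import numpy as np

def mae(pred, actual):
    """Eq. (4): mean absolute error over the rated (non-zero) entries."""
    mask = actual > 0
    return float(np.mean(np.abs(pred[mask] - actual[mask])))

def rmse(pred, actual):
    """Eq. (5): root mean square error over the rated entries."""
    mask = actual > 0
    return float(np.sqrt(np.mean((pred[mask] - actual[mask]) ** 2)))

def precision_recall(recommended, useful):
    """Eqs. (6)-(7): precision and recall of a recommendation list against
    the set of all useful (relevant) products for the user."""
    recommended, useful = set(recommended), set(useful)
    good = len(recommended & useful)
    precision = good / len(recommended) if recommended else 0.0
    recall = good / len(useful) if useful else 0.0
    return precision, recall
```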

In this paper the authors use the ROC curve to assess the performance of the algorithms, and the Jester and MovieLens datasets are used to test the traditional recommender system algorithms. The MovieLens dataset contains the movie name, genre, movie id, user id, the rating given by each user and other details; ratings range from 1 to 5. The Jester dataset similarly contains item ids, user ids, the rating given by each user and other details.
For the experiment, item-based collaborative filtering using Jaccard, Pearson and cosine similarity and user-based collaborative filtering using Jaccard, Pearson and cosine similarity were applied to both the Jester and MovieLens datasets. Figures 2 and 3 give the details of the performance of each algorithm on the MovieLens and Jester datasets. User-based collaborative filtering with the Jaccard coefficient performs well on the MovieLens dataset; similarly, on the Jester dataset, user-based collaborative filtering with Pearson correlation performs well. From the ROC curves the authors observe that as the data size increases, the model performance also increases.

Figure 2. ROC curve on the MovieLens dataset

Figure 3. ROC curve on the Jester dataset
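Such ROC curves can be produced, for example, with scikit-learn once each held-out rating is mapped to a binary relevance label and a predicted score (the threshold of 4 below is an assumption for illustration, not the paper's stated protocol):

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

actual = np.array([5, 2, 4, 5, 3, 1, 4, 3])                      # held-out true ratings
predicted = np.array([4.2, 2.9, 3.8, 4.6, 3.1, 2.2, 3.5, 3.9])   # model scores

y_true = (actual >= 4).astype(int)        # hypothetical relevance threshold
fpr, tpr, _ = roc_curve(y_true, predicted)
print("AUC =", auc(fpr, tpr))             # area under the ROC curve
```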
1.2 Recommender systems tools/frameworks

RS research is supported by many frameworks, tools, packages and libraries, which help to test the performance of algorithms. Easyrec is a Java-based personalized RS offering top-N recommendations. PREA is a Java-based toolkit of personalized recommendation algorithms that supports collaborative filtering. LibRec is also a Java-based RS offering only the collaborative filtering technique. Duine is a Java framework offering both collaborative filtering and top-N recommendations. LensKit is a Java toolkit for recommendations offering both collaborative filtering and top-N recommendation. Case Recommender is a Python framework offering the collaborative filtering technique, and Crab is a Python framework offering collaborative filtering. recommenderlab is an R package that provides infrastructure for developing recommendations, offering collaborative filtering and top-N recommendations. GraphLab is a machine learning platform offering collaborative filtering, matrix factorization and top-N recommendation. Scikit is a machine learning framework offering collaborative filtering, top-N recommendations and matrix factorization methods. MyMediaLite is a C# implementation of recommender algorithms offering the collaborative filtering technique.


2. DEEP LEARNING BASED TECHNIQUES

Deep learning is a subfield of machine learning. It is showing immense impact on the fields of image processing, natural language processing, computer vision and speech recognition, and it is also having a remarkable impact on recommender systems.
Deep learning techniques use neural networks and consist of activation functions; each neuron contains an activation function. Deep learning based recommender systems use non-linear functions. A linear function is a function whose graph is a straight line; linear functions have no exponents higher than 1, and their simplest form is y = mx + c, where m and c are constants. A non-linear function is a function whose graph is not a straight line. In this paper the authors discuss the popular activation functions used to introduce non-linearity into the network: sigmoid, tanh, ReLU, softmax, etc.

Activation functions:
The sigmoid function ranges from 0 to 1. It is also called the logistic function, and it is named sigmoid because of its S-shape. Its drawbacks are that sigmoids saturate and kill gradients, and that sigmoid outputs are not zero-centred. The sigmoid function is defined as

\sigma(x) = \frac{1}{1 + e^{-x}}    (8)

The hyperbolic tangent (tanh) function ranges from -1 to 1:

\tanh(x) = \frac{2}{1 + e^{-2x}} - 1    (9)

The softmax function is a generalization of the logistic function. Its output is a categorical distribution, and it is used in multi-class classification. The softmax function is defined as

softmax(x_i) = \frac{\exp(x_i)}{\sum_{j=1}^{n} \exp(x_j)}    (10)

The ReLU function is very useful in feedforward neural networks. ReLU is defined as

f(x) = \max\{0, x\}    (11)
The authors use these activation functions to drive the neural networks. With this notation in place, the functioning of the following deep learning techniques in recommender systems is described next.

Multilayer perceptron: the MLP is a basic model used in deep learning. It is a mathematical function that maps a set of input values to output values, realised as a feedforward neural network with multiple layers: one input layer, one output layer and, for processing the data, one or more hidden layers. An MLP contains perceptrons, and each perceptron has an activation function. Figure 4 shows an MLP with one hidden layer.

Figure 4. Multilayer perceptron
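A minimal forward pass of such an MLP with a single hidden layer might look as follows (random, untrained weights and illustrative layer sizes only):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

# Input -> hidden (ReLU) -> output, as in Figure 4
n_in, n_hidden, n_out = 4, 8, 1
W1, b1 = rng.normal(scale=0.1, size=(n_in, n_hidden)), np.zeros(n_hidden)
W2, b2 = rng.normal(scale=0.1, size=(n_hidden, n_out)), np.zeros(n_out)

def mlp_forward(x):
    h = relu(x @ W1 + b1)     # every hidden neuron applies its activation function
    return h @ W2 + b2        # linear output layer (could be sigmoid/softmax instead)

print(mlp_forward(np.array([0.5, -1.0, 2.0, 0.0])))
```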
Restricted Boltzmann Machines: a Boltzmann machine is a stochastic recurrent neural network consisting of binary neurons [11]. It consists of one visible layer, which takes the input, and one hidden layer, and each layer consists of a set of nodes. In a Boltzmann machine, each node in the visible layer has connections with every other node in the same layer (intra-layer connections) as well as with the nodes in the hidden layer (inter-layer connections). These connections make every node dependent on every other node, which causes inefficient sampling. To overcome these limitations, Paul Smolensky proposed the Restricted Boltzmann Machine [12]. An RBM also has two layers, one visible and one hidden; the only change is that the nodes within a layer have no connections to other nodes in the same layer (no intra-layer connections), while each node in the visible layer is connected to all the nodes in the hidden layer. If an RBM uses more hidden layers, it is called a deep belief network. RBMs are used in recommendation systems, classification, regression, clustering, anomaly detection, feature learning and dimensionality reduction; in recommender systems, RBMs are used for collaborative filtering. Deep Boltzmann machines and convolutional Boltzmann machines are other variants of Boltzmann machines. Figures 5 and 6 show a Boltzmann machine and an RBM with one hidden layer.

Figure 5. Boltzmann machine

Figure 6. Restricted Boltzmann machine
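The practical consequence of removing the intra-layer connections can be shown in a few lines: given the visible layer, every hidden unit is conditionally independent and can be sampled in one step (a toy, untrained sketch with made-up sizes):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_visible, n_hidden = 6, 3                      # illustrative sizes
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)

v = np.array([1, 0, 1, 1, 0, 0])                # one binary visible configuration

# No intra-layer connections: all hidden units can be sampled in parallel
p_h = sigmoid(v @ W + b_h)
h = (rng.random(n_hidden) < p_h).astype(int)

# The same holds in the other direction (one step of Gibbs sampling)
p_v = sigmoid(h @ W.T + b_v)
print(p_h, p_v)
```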
Auto Encoders: an autoencoder neural network is an unsupervised learning algorithm that applies backpropagation after setting the target values to be equal to the inputs. An autoencoder is similar to a multilayer perceptron and consists of an encoder and a decoder arranged in three parts: the first layer is the input layer, the second is one or more hidden layers, and the third is the output layer. The encoder's role is to produce a simplified representation of the data, and the decoder reconstructs the original data from it; an algorithm working on this encoded data learns more than it would from the raw data. Denoising autoencoders, contractive autoencoders and sparse autoencoders are other variants of autoencoders. Autoencoders are mainly useful in research areas such as information retrieval and dimensionality reduction. Figure 7 shows an autoencoder.

Figure 7. Autoencoder

Convolutional Neural Network: a CNN is a class of deep, feed-forward ANNs and a variation of the MLP designed to minimize preprocessing. A CNN is made up of the following layers: a convolution layer, a pooling layer and a fully connected layer. The convolution layer is not fully connected; it takes an image as input and generates a feature map (activation map) that contains information about the image. The pooling layer, also called downsampling, uses either max pooling or average pooling. In a fully connected layer, every node has connections with each node in the next layer. A convolutional neural network takes an image as input and gives the probabilities of the objects present in the image.

Figure 9. Convolutional neural network

Recurrent Neural Network: MLPs and other neural network architectures map an input vector to an output vector only, whereas recurrent neural networks maintain information about the history of previous inputs for each output. Before producing an output, an RNN [13] keeps the previous inputs persistent in memory. The best-known applications of RNNs are in natural language processing.

Figure 8. Recurrent neural network
3. DEEP LEARNING BASED RECOMMENDER SYSTEMS

3.1 MLP based recommender system

Neural collaborative filtering [14] is an MLP based technique which uses a matrix factorization approach and gives user-based and item-based recommendations. The Cross-domain Content-boosted Collaborative Filtering Neural Network [15] is an MLP based technique offering user-based and item-based recommendations. The Deep Factorization Machine (DeepFM) [16] is an MLP based technique that uses factorization machines to provide user-based and item-based recommendations.
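As a rough sketch of the MLP-style scoring shared by these models (illustrative shapes and random weights only; the cited papers define the exact architectures), a user and an item are each mapped to an embedding, the embeddings are concatenated, and an MLP produces the predicted preference:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_users, n_items, d = 100, 50, 8                      # toy sizes
user_emb = rng.normal(scale=0.1, size=(n_users, d))   # user embeddings
item_emb = rng.normal(scale=0.1, size=(n_items, d))   # item embeddings
W1 = rng.normal(scale=0.1, size=(2 * d, 16))
W2 = rng.normal(scale=0.1, size=(16, 1))

def score(u, i):
    """Predicted preference of user u for item i via an MLP over the embeddings."""
    x = np.concatenate([user_emb[u], item_emb[i]])
    return sigmoid(relu(x @ W1) @ W2).item()

print(score(3, 7))
```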

3.2 Restricted Boltzmann machine based recommender system

Restricted Boltzmann Machine Collaborative Filtering (RBM-CF) [17] uses an RBM for collaborative filtering and provides user-based recommendations. Hybrid RBM-CF [18] incorporates item features and offers both user-based and item-based recommendations.

3.3 AutoEncoder based recommender system

AutoRec [19] is an autoencoder based technique which offers user-based and item-based recommendation; there are separate implementations for item-based recommendation (I-AutoRec) and user-based recommendation (U-AutoRec). The Collaborative Filtering Neural network (CFN) [20-21] is also an autoencoder based technique and offers item-based and user-based recommendations through I-CFN and U-CFN.

3.4 CNN based recommender system

The Deep Cooperative Neural Network (DeepCoNN) [12] is a convolutional neural network that uses factorization machines to provide rating predictions for users. ConvMF [23] is a combined model of a convolutional neural network and probabilistic matrix factorization that offers item-based recommendations.
3.5 RNN based recommender system

The Recurrent Recommender Network (RRN) [9] is an RNN based technique that uses the changes in user preferences over time and the temporal evolution and seasonality of items to predict ratings.

3.6 Experimental results on deep learning based recommender systems

The authors have completed experiments on the MovieLens and Jester datasets using an autoencoder, which internally uses an MLP. In this experimentation, different activation functions are used. Table 3 gives the loss values for the different activation functions used in the autoencoder; among them, the autoencoder with the ReLU activation function gives the best results.

Table 3. Results using autoencoder

S.No   Dataset      Sigmoid   ReLU   TanH
1      MovieLens    0.18      0.16   0.19
2      Jester       0.17      0.15   1.19
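A rough sketch of an AutoRec-style setup that could produce numbers of this kind (toy random data, a single hidden layer, full-batch gradient descent and a masked squared-error loss over the observed ratings; the paper does not specify its experimental configuration at this level of detail, so everything below is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# (activation, derivative written in terms of the activation output h)
ACTS = {
    "Sigmoid": (lambda x: 1.0 / (1.0 + np.exp(-x)), lambda h: h * (1.0 - h)),
    "ReLU":    (lambda x: np.maximum(0.0, x),       lambda h: (h > 0).astype(float)),
    "TanH":    (np.tanh,                            lambda h: 1.0 - h ** 2),
}

# Toy item x user rating matrix scaled to [0, 1]; 0 means "not rated"
R = rng.integers(0, 6, size=(30, 20)).astype(float) / 5.0
mask = (R > 0).astype(float)

def train_autoencoder(act, dact, hidden=8, lr=0.5, epochs=1000):
    n = R.shape[1]
    W1 = rng.normal(scale=0.1, size=(n, hidden))     # encoder weights
    W2 = rng.normal(scale=0.1, size=(hidden, n))     # decoder weights
    for _ in range(epochs):
        H = act(R @ W1)                              # encode each item's rating vector
        out = H @ W2                                 # decode back to predicted ratings
        err = (out - R) * mask                       # only observed ratings contribute
        gW2 = H.T @ err
        gW1 = R.T @ ((err @ W2.T) * dact(H))         # backprop through the activation
        W1 -= lr * gW1 / mask.sum()
        W2 -= lr * gW2 / mask.sum()
    return float(np.sum(err ** 2) / mask.sum())

for name, (act, dact) in ACTS.items():
    print(name, "loss:", round(train_autoencoder(act, dact), 3))
```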
4. CONCLUSION & FUTURE WORK

In this paper, the authors have completed experiments on both traditional and deep learning based recommender system algorithms. In future work, the authors intend to propose a convolutional neural network based algorithm that gives better results than those reported here.


REFERENCES

[1] Aggarwal CC. (2016). Recommender systems. Springer International Publishing.
[2] Billsus D, Pazzani MJ. (2000). User modeling for adaptive news access. User Model User-Adapted Interact 10(2-3): 147-180.
[3] Wu CY, Ahmed A, Beutel A, Smola AJ, Jing H. (2017). Recurrent recommender networks. In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining. ACM, 495-503.
[4] Chen LS, Hsu FH, Chen MC, Hsu YC. (2008). Developing recommender systems with the consideration of product profitability for sellers. Int J Inform Sci 178(4): 1032-1048.
[5] Cotter P, Smyth B. (2008). PTV: Intelligent personalized TV guides. In: Twelfth Conference on Innovative Applications of Artificial Intelligence, 957-964.
[6] Donghyun K, Chanyoung P, Jinoh O, Lee SY, Yu H. (2016). Convolutional matrix factorization for document context-aware recommendation. In Proceedings of the 10th ACM Conference on Recommender Systems. ACM, 233-240.
[7] Drosou M, Pitoura E. (2010). Search result diversification. SIGMOD Rec 39(1): 41-47.
[8] Florian S, Jeremie M. (2015). Collaborative filtering with stacked denoising autoencoders and sparse inputs. In NIPS Workshop on Machine Learning for e-Commerce.
[9] Florian S, Romaric G, Jérémie M. (2016). Hybrid recommender system based on autoencoders. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems. ACM, 11-16.
[10] Goldberg K, Roeder T, Gupta D, Perkins C. (2015). Eigentaste: A constant time collaborative filtering algorithm. Inform Retrieval J 4(2): 133-151.
[11] Goldberg D, Nichols D, Oki BM, Terry D. (2014). Using collaborative filtering to weave an information tapestry. Commun ACM 35(12): 61-70.
[12] Haykin SS. (2016). Neural networks: A comprehensive foundation.
[13] Guo HF, Tang RM, Ye YM, Li ZG, He XQ. (2017). DeepFM: A factorization-machine based neural network for CTR prediction. 2782-2788.
[14] Isinkaye FO, Folajimi YO, Ojokoh BA. (2015). Recommendation systems: Principles, methods and evaluation. Egyptian Informatics Journal 16(3): 261-273.
[15] Lian JX, Zhang FZ, Xie X, Sun GZ. (2017). CCCFNet: A content-boosted collaborative filtering neural network for cross domain recommender systems. In Proceedings of the 26th International Conference on World Wide Web Companion. International World Wide Web Conferences Steering Committee, 817-818.
[16] Konstan JA, Miller BN, Maltz D, Herlocker JL, Gordon LR, Riedl J. (2017). Applying collaborative filtering to Usenet news. Commun ACM 40(3): 77-87.
[17] Zheng L, Noroozi V, Yu PS. (2017). Joint deep modeling of users and items using reviews for recommendation. In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining (WSDM '17). ACM, New York, NY, USA, 425-434.
[18] Mooney RJ, Roy L. (2016). Content-based book recommending using learning for text categorization. In Proceedings of the Fifth ACM Conference on Digital Libraries. ACM, 195-204.
[19] Rumelhart DE, Hinton GE, Williams RJ. (2017). Learning representations by back-propagating errors. Nature 323(6088): 533-536.
[20] Salakhutdinov R, Mnih A, Hinton G. (2017). Restricted Boltzmann machines for collaborative filtering. In Proceedings of the 24th International Conference on Machine Learning. ACM, 791-798.
[21] Smolensky P. (2015). On the comprehension/production dilemma in child language. Linguistic Inquiry 27(4): 720-731.
[22] Sedhain S, Krishna Menon A, Sanner S, Xie LX. (2015). AutoRec: Autoencoders meet collaborative filtering. In Proceedings of the 24th International Conference on World Wide Web. ACM, 111-112.
[23] He XN, Liao LZ, Zhang HW, Nie LQ, Hu X, Chua TS. (2017). Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 173-182.
[24] Liu XM, Ouyang YX, Rong WG, Xiong Z. (2015). Item category aware conditional restricted Boltzmann machine based recommendation. In International Conference on Neural Information Processing. Springer, 609-616.