Incorporating Service Proximity Into Web Service Recommendation Via Tensors Recomposition
Supervisor
Dr. Jian Yu
Dr. Sira Yongchareon
July 2019
By
Zhentao Wu
School of Engineering, Computer and Mathematical Sciences
Copyright
Copyright in text of this thesis rests with the Author. Copies (by any process) either
in full, or of extracts, may be made only in accordance with instructions given by the
Author and lodged in the library, Auckland University of Technology. Details may be
obtained from the Librarian. This page must form part of any such copies made. Further
copies (by any process) of copies made in accordance with such instructions may not
be made without the permission (in writing) of the Author.
The ownership of any intellectual property rights which may be described in this
thesis is vested in the Auckland University of Technology, subject to any prior agreement
to the contrary, and may not be made available for use by third parties without the
written permission of the University, which will prescribe the terms and conditions of
any such agreement.
Further information on the conditions under which disclosures and exploitation may
take place is available from the Librarian.
Declaration
I hereby declare that this submission is my own work and
that, to the best of my knowledge and belief, it contains no
material previously published or written by another person
nor material which to a substantial extent has been accepted
for the qualification of any other degree or diploma of a
university or other institution of higher learning.
Signature of candidate
Acknowledgements
Throughout the writing of this dissertation, I have received a great deal of support and
assistance.
I would first like to thank my primary supervisor, Dr. Jian Yu, whose expertise was
invaluable in the formulating of the research topic and methodology in particular. He
provided me with the tools that I needed to choose the right direction and successfully
complete my dissertation.
I would also like to thank my secondary supervisor, Dr. Sira Yongchareon, for their
valuable guidance.
In addition, I would like to thank my parents for their wise counsel and sympathetic ear.
You are always there for me. Finally, there are my friends, who were of great support in
deliberating over our problems and findings, as well as providing a happy distraction to
rest my mind outside of my research.
Abstract
The sparseness of the Mashup-API rating matrix, coupled with cold-start and scalability issues, has been identified as the most critical challenge affecting most collaborative-filtering-based Web-API recommendation solutions. Sparseness deteriorates rating prediction accuracy. Several Web-API recommendation approaches employ a basic collaborative filtering technique that operates on second-order matrices, decomposing the Mashup-API interaction matrix into two low-rank matrix approximations and then making predictions based on the factorized matrices. While most existing CF and matrix-factorization-based Web-API recommendation approaches have shown promising improvements in recommendation results, one limitation is that they focus only on a two-dimensional data model in which mainly the historical interactions between mashups and APIs are used. However, recent work in the recommendation domain shows that by incorporating additional information into the rating data, Web-API rating prediction accuracy can be enhanced. Inspired by these works, this research proposes a collaborative filtering method based on tensor factorization, an extension of matrix factorization that exploits the ternary relation among three key entities in the Web service ecosystem: Mashup, API and Proximity. Modelling the Web-API rating data with a tensor decomposition technique enables incorporating proximity information as a third entity into Web service recommendation applications to improve prediction accuracy. Specifically, we employ a Higher-Order Singular Value Decomposition approach with a regularization term to extend the traditional Mashup-API matrix into a Mashup-API-Proximity tensor.
Experimental analysis on the ProgrammableWeb dataset shows promising results compared with some state-of-the-art approaches.
Keywords: Tensor Factorization, ProgrammableWeb, Higher-Order Singular Value Decomposition, Recommendation, Mashups, Web-API recommendation
Contents
Copyright 2
Declaration 3
Acknowledgements 4
Abstract 5
1 Introduction 12
1.1 Aim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.2 Research Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.3 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.3.1 Recommendation systems and Big Data . . . . . . . . . . . . . 15
1.3.2 Recommending Web Services . . . . . . . . . . . . . . . . . . . 16
1.4 Research Motivation and Significance . . . . . . . . . . . . . . . . . . 20
1.5 Research Scope and Methodology . . . . . . . . . . . . . . . . . . . . . 23
1.6 Research contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.7 Thesis Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2 Literature Review 28
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.2 Web Service Recommendation . . . . . . . . . . . . . . . . . . . . . . . 30
2.2.1 Content-based Approach . . . . . . . . . . . . . . . . . . . . . . 31
2.2.2 QoS-based Approach . . . . . . . . . . . . . . . . . . . . . . . . 31
2.2.3 Context-Aware approach . . . . . . . . . . . . . . . . . . . . . . 33
2.2.4 Semantic-Based approach . . . . . . . . . . . . . . . . . . . . . 33
2.3 Collaborative Filtering Techniques . . . . . . . . . . . . . . . . . . . . 34
2.3.1 Model-Based Recommendation Methods . . . . . . . . . . . . 36
2.3.2 Memory-Based Recommendation Methods . . . . . . . . . . . 37
2.3.3 Challenges of collaborative filtering techniques . . . . . . . . 39
2.3.4 Common Solutions to collaborative filtering challenges . . . 41
2.3.5 Collaborative Filtering Methods for Web Service Recommend-
ation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.3.6 Clustering-Based Recommendation Approach . . . . . . . . . 43
2.3.7 Hybrid Service Recommendation . . . . . . . . . . . . . . . . . 45
2.3.8 Hybrid Web Service Recommendation . . . . . . . . . . . . . . 48
2.4 Matrix Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.5 Matrix Decomposition Models . . . . . . . . . . . . . . . . . . . . . . . 51
2.5.1 Eigenvalue Decomposition Method . . . . . . . . . . . . . . . . 52
2.5.2 Singular Value Decomposition . . . . . . . . . . . . . . . . . . 53
2.5.3 Principal Component Analysis . . . . . . . . . . . . . . . . . . 55
2.5.4 Probability Matrix Factorization . . . . . . . . . . . . . . . . . 55
2.5.5 Non-Negative Matrix Factorization . . . . . . . . . . . . . . . 57
2.5.6 Matrix Factorization in Web service Recommendation . . . . . 58
2.6 Tensors Decomposition Techniques . . . . . . . . . . . . . . . . . . . . 60
2.6.1 Tensors Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 61
2.6.2 Tucker Decomposition And Higher Order Singular Value De-
composition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
2.6.3 Parallel Factor Analysis (PARAFAC) . . . . . . . . . . . . . . . 64
2.7 Pairwise Interaction Tensor Factorization . . . . . . . . . . . . . . . . . 66
3 Research Method 67
3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.2 Data Acquisition and Processing . . . . . . . . . . . . . . . . . . . . . . 69
3.3 Mashup-Oriented MF-Based Recommendation . . . . . . . . . . . . . 69
3.4 Tensors Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3.4.1 Need for Tensors in Recommender Systems . . . . . . . . . . 74
3.4.2 Notations and Operations . . . . . . . . . . . . . . . . . . . . . 75
3.5 High-Order Singular Value Decomposition on Tensors . . . . . . . . . 77
3.5.1 HOSVD Algorithm Description . . . . . . . . . . . . . . . . . . 77
3.5.2 HOSVD Decomposition with Single Contextual Variable . . . 78
3.5.3 Loss Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.5.4 Regularization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3.5.5 Basic Optimization . . . . . . . . . . . . . . . . . . . . . . . . . 82
3.6 Geographical Distance Between Services . . . . . . . . . . . . . . . . 83
3.6.1 Estimating Geographical Proximity Score . . . . . . . . . . . . 83
4 Analysis 85
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.2 Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.2.1 Pandas module . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.2.2 Sktensor and Tensorly modules . . . . . . . . . . . . . . . . . . 87
4.2.3 Numpy module . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.2.4 Matplotlib module . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.3.1 Three-dimensional tensor construction . . . . . . . . . . . . . 87
4.3.2 Sampling the training set and testing set . . . . . . . . . . . . . 90
4.3.3 Web API Prediction With Tensor Decomposition . . . . . . . . 91
4.3.4 The performance of HOSVD . . . . . . . . . . . . . . . . . . . 93
4.3.5 Comparison with PMF . . . . . . . . . . . . . . . . . . . . . . . 94
4.4 Findings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.4.1 Data sparsity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.4.2 The performance of HOSVD and PMF with different size train-
ing set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.4.3 The denoising influence on performance of HOSVD . . . . . 99
4.4.4 Prediction accuracy comparison between HOSVD and PMF . 99
5 Discussion 102
5.1 Results Assessment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
5.1.1 Impact of Dimensionality . . . . . . . . . . . . . . . . . . . . . 103
5.1.2 Impact of HOSVD Tensor Density . . . . . . . . . . . . . . . . 103
5.1.3 Impact of Regularization Parameter . . . . . . . . . . . . . . . 104
5.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6 Conclusion 106
6.1 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
References 109
Appendices 115
List of Tables
4.1 The RMSE and MAE of HOSVD and PMF model with different size
training set. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
List of Figures
Chapter 1
Introduction
1.1 Aim
In the real world, interactions among objects usually involve more than two participating entities, and additional information about the state of the interaction, such as location, time, or the mood of the user, is often available (W. Wu et al., 2017). Recent techniques leverage the various auxiliary information currently available on the web to enhance rating prediction accuracy. However, while diverse side or auxiliary information is available to support the performance of recommendation applications, incorporating multifaceted information, such as ternary relations between independent objects, into the rating matrix is a challenging task. Moreover, many existing approaches for prediction and recommendation can neither handle heterogeneous, large-scale datasets nor deal with the cold-start problem.

In service-oriented computing, various web service recommendation approaches exist, including memory-based, content-based, social-based, context-based and other hybrid collaborative filtering approaches (Bobadilla, Ortega, Hernando & Gutiérrez, 2013). Many of these approaches explore the historical co-invocation and interaction information between Web service compositions (mashups) and component services such as Web-APIs to solve the discovery and recommendation/service selection problems that exist in the service computing domain (Yao, Wang, Sheng, Benatallah & Huang, 2018; B. Cao et al., 2017). Generally, most of these works focus on two entities in their applications: mashups (which play the role of users in the Web-API context) and Web-APIs (which are the items in the Web service recommendation context). Several other efforts also integrate Quality of Service (Z. Zheng, Zhang & Lyu, 2012; Z. Zheng, Ma, Lyu & King, 2012), location (X. Chen, Zheng, Yu & Lyu, 2013) and trust (Su, Xiao, Liu, Zhang & Zhang, 2017) information into their prediction/recommendation systems. However, most of these approaches consider two-dimensional matrix representation techniques and thus cannot effectively exploit the underlying latent features that exist within multi-object interactions. As a dominant and common implementation of the CF method, the basic matrix factorization technique
The above questions will address the application of multivariate models such as MF and its extensions in capturing the latent relationships that exist between Web-APIs.
1.3 Background
This section provides background knowledge on the key components of this research. First, it gives a brief overview of recommendation systems and Big Data; second, it provides background on collaborative filtering and matrix-factorization-based Web service recommendation systems; finally, it introduces the motivation behind the approach used in this research.
Companies such as Facebook, Amazon, Netflix, eBay, Google and Twitter employ
recommender systems in their services to improve customer intimacy and satisfaction.
The growth of recommender systems has been in parallel with the web. The emergence
of Web 2.0, coupled with the continuous development of various web services, have
led to a rapid increase in the amount of digital information generated on various web
platforms. For instance, with the increasing development of e-commerce platforms,
more auxiliary data that captures features of both users and items are now available for
enhancing the performance of recommender systems. Online social networks contain
gigabytes of data, which can be mined to provide recommendations support. The boom
of social media has also contributed to the development in recommendation, especially
in social recommendation. Generally, social recommendation focuses on modelling
social network information as regularization terms to constrain the matrix factorization
framework (J. Zheng et al., 2017). These methods estimate user similarity by leveraging rich
social interactions among users, for instance, friendships on Facebook or following relations on Twitter. Emerging technologies such as Big Data have played a significant role
in developing recommender systems (Y. Zhang, Chen, Mao, Hu & Leung, 2014). The
use of Big Data technologies enables the discovery of latent information within large-volume data, much of which is now used to support item recommendation.
For example, as of April 2019, the largest online Web-API repository, ProgrammableWeb 1, has over 20,320 Web-APIs belonging to more than 400 predefined
categories, and over 7,000 mashups (service compositions). Similarly, a popular Web
service marketplace, Mashape 2 currently has over 10,000 collections of public and
private APIs with about 100,000 records of engaged developers around the globe. The
continual growth of these service ecosystems conveys the popular emerging service
economy (Tan, Fan, Ghoneim, Hossain & Dustdar, 2016). This also reflects both
economic and social impacts of web services in the software development industry.
Diverse services can now be used for composing new, value-added applications, also known as "mashups", which combine several Web-APIs from different sources to
satisfy complex user requirements. Web-APIs have become ubiquitous, as the majority of software applications and services are currently offered as some form of Web-API (Adeleye, Yu, Yongchareon, Sheng & Yang, 2019).
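The regularization-constrained matrix factorization framework mentioned above can be sketched in a few lines. The following is a minimal illustration on a made-up rating matrix (all values and hyper-parameters are arbitrary), using plain L2 regularization rather than any specific social-network term:

```python
import numpy as np

# Toy mashup-API rating matrix; zeros mark unobserved entries.
R = np.array([[5., 3., 0.],
              [4., 0., 1.],
              [0., 1., 5.]])
mask = R > 0

rng = np.random.default_rng(42)
k, lam, lr = 2, 0.05, 0.02              # latent rank, L2 weight, learning rate
P = rng.standard_normal((3, k)) * 0.1   # latent factors for mashups (users)
Q = rng.standard_normal((3, k)) * 0.1   # latent factors for APIs (items)

for _ in range(5000):
    E = mask * (R - P @ Q.T)            # prediction error on observed entries only
    P += lr * (E @ Q - lam * P)         # gradient step on the regularized squared loss
    Q += lr * (E.T @ P - lam * Q)

E = mask * (R - P @ Q.T)
rmse = np.sqrt((E[mask] ** 2).mean())
print(round(rmse, 3))
```

A social recommender would add further penalty terms tying each user's factor vector to those of their friends; the skeleton above is the shared core that such terms constrain.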
The diversity and rapid increase in the number of Web-APIs on the internet poses a
great challenge to their consumption and reusability. While there are tens of thousands
of web services with diverse functionalities currently available in various repositories,
the discovery and selection of appropriate services capable of satisfying some specific
and complex user requirements is a great challenge. Software developers and other
service users still find it very challenging to select suitable services from a pool of
functionally similar and related services, especially for service composition purposes.
Recently, recommendation techniques have been employed to tackle service discovery
and selection complexities. Web service recommendation involves automatic identific-
ation of usefulness or suitability of web services, selecting candidate services based
on users’ requirements (or behaviour analysis) and proactively recommending most
suitable services to end users. Various researchers have made efforts to proactively
1. https://www.programmableweb.com
2. https://www.mashape.com
recommend services to users based on certain parameters and preferences. Thus, web service recommendation has become an active process of service search, discovery and selection that can be facilitated by various computer science and mathematical techniques such as data analysis, machine learning, deep learning, graph theory and probabilistic approaches, to enhance prediction accuracy (Yao, Sheng, Ngu, Yu & Segev, 2015).
Even though existing web service recommendation techniques show improvements, the
recommendation tasks continue to pose a great challenge for service engineers due to
rapid, continuous increase in the number of web services. Generally, recommender
systems provide support for selecting products and conventional services; however, their
direct application to web services is not straightforward due to the following challenges:
• Current scarcity of user feedback on web services (Lecue, 2010) and a high degree of uncertainty in the feedback, which is sometimes subjective (L. Chen, Wang, Yu, Zheng & Wu, 2013).
• Lack of specific quality of services (QoS) and context information (Su et al.,
2017).
• The need to fine-tune web services to the requirements of the intended user (L. Liu, Lecue & Mehandjiev, 2013).
• Inadequate formal semantic specifications for describing web services (Yao, Wang
et al., 2018).
Other recent methods include network-based and hybrid approaches. Hybrid recommendation approaches integrate two or more of the above methods to recommend appropriate services to users. In order to improve recommendation processes, some researchers have tried to leverage the capabilities of multiple techniques by combining them (both network-based and hybrid approaches are discussed later in the Literature Review chapter).
In this era of information overload and data explosion, internet and web users have to
struggle through a rapidly, continually increasing amount of both relevant and irrelevant
information. Specifically, in service-oriented computing, tens of thousands of web
services are currently available on web and this number has been continuously soaring.
Web services have had enormous impact on the Web as a potential means for supporting
a distributed service-based economy. However, the discovery and reuse of appropriate
web services on a global scale especially for complex service requirements task are still
very limited and challenging. Moreover, the diversity of web services presents a unique challenge to service composition; it is now more difficult to select appropriate services from a myriad of functionally similar services. While there are billions of web pages currently available on the web, comparatively few web services are publicly available to service consumers. Traditional approaches to searching the web using
keywords have proven inefficient. Thus, managing web service information overload
and data explosion requires the support of intelligent systems that can leverage the
available information, process and filter faster than humans and recommend services
based on user’s requirements. To satisfy these diverse needs, user-centric or personalized
recommender systems have emerged.
Conventional web service recommender systems deal with two major types of entities: users' service requirements and service profiles (e.g. service functional descriptions or historical information). A popular approach is to recommend
web services based on the semantic similarity between user’s service requirement and
a combination of service description and co-invocation information. Due to increas-
ing complexity of user’s service requirement and the surge in large-scale information
ecosystems, multi-layer information is required to improve service recommendation
quality. Generally, dominant frameworks for web service recommendation are logically two-dimensional, focusing on the interaction between web service compositions (which usually reflect the historical usage of web services) and web services. This exemplifies the sort of interaction that manifests between traditional users (consumers) and items (products), and is normally characterized by a single relation (Vahedian, Burke & Mobasher, 2017). However, the ever-growing expansion of social ecosystems, where multi-entity interactions and correlations occur across different domains and contexts, has created a multi-dimensional space that reflects complex heterogeneous relationships between various entities. Consequently, large-scale Heterogeneous Information Networks consisting of interconnected entities with multi-type relations can arise (Jamali & Lakshmanan, 2013).
Previous studies suggest that more effective utilization of side information or auxiliary data can help improve the quality of recommender systems (Yang, Lei, Liu & Li, 2017; X. Liu et al., 2012; X. Yu et al., 2014). Through open data initiatives and the rapid development of APIs, a huge amount of auxiliary data, which can be used to improve
• The challenge of modelling and utilizing these complex and heterogeneous data to enhance recommender systems (Shi, Hu, Zhao & Philip, 2019).
• The challenge of developing a generic approach to model these varying data (of different types and attributes) from different platforms.
The scope of this research extends to the study and review of recommender systems, with a focus on the various techniques and methods applicable to web service recommendation. In order to effectively leverage available online information to enhance the quality of web service recommendations, this research studies several collaborative filtering techniques that induce a model from a rating matrix and utilize the model to perform recommendation. Specifically, by modelling web service information together with location information as a third-order tensor, the correlations and interactions among web services, service compositions and locations can be captured by the core tensor in reduced-dimension form. This study also provides an overview of higher-order tensors and their application to recommender system implementation and enhancement. Tensor decomposition is applied to web service data arrays to extract service attributes.
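The three-order modelling described above amounts to filling a mashup × API × location array from interaction records. A minimal sketch with NumPy, where the records and identifiers are invented for illustration:

```python
import numpy as np

# Hypothetical interaction records: (mashup, API, location, rating).
records = [
    ("m1", "weatherAPI", "US", 1.0),
    ("m1", "mapAPI",     "US", 1.0),
    ("m2", "mapAPI",     "EU", 1.0),
]

# Build an index map for each mode of the tensor.
mashups   = sorted({r[0] for r in records})
apis      = sorted({r[1] for r in records})
locations = sorted({r[2] for r in records})
m_idx = {m: i for i, m in enumerate(mashups)}
a_idx = {a: i for i, a in enumerate(apis)}
l_idx = {l: i for i, l in enumerate(locations)}

# Third-order tensor: mashups x APIs x locations, zero where unobserved.
T = np.zeros((len(mashups), len(apis), len(locations)))
for m, a, l, rating in records:
    T[m_idx[m], a_idx[a], l_idx[l]] = rating

print(T.shape)  # (2, 2, 2)
```

Decomposition techniques such as HOSVD then operate on `T` to expose the latent interactions among the three modes.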
Recommender systems mainly base their suggestions on rating data of two entities
(users and items), which are often placed in a matrix with one representing users and the
other representing items of interest. These ratings are given explicitly by users creating
a sparse user-item rating matrix because an individual user is likely to rate only a small
fraction of the items that belong to the item set. Another challenging issue with this
user-item rating matrix is scalability of data (i.e., the large number of possible registered
users or inserted items), which may affect the time performance of a recommendation
algorithm.
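The sparsity problem described here is easy to quantify: it is the fraction of user-item cells holding no rating. A toy illustration (the matrix values are made up):

```python
import numpy as np

# Toy user-item rating matrix: rows are users, columns are items.
# A zero means the user has never rated that item.
R = np.array([
    [5, 0, 0, 1],
    [0, 3, 0, 0],
    [0, 0, 0, 4],
])

observed = np.count_nonzero(R)          # number of known ratings
sparsity = 1.0 - observed / R.size      # fraction of missing entries

print(f"observed ratings: {observed}")  # 4
print(f"sparsity: {sparsity:.2%}")      # 66.67%
```

Real service repositories are far sparser still, with well over 99% of mashup-API cells empty, which is what motivates the factorization methods discussed next.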
We can deal with all the aforementioned challenges by applying matrix decomposition
social tags according to their own cognitive level. It emphasizes the process and the information involved. Early websites with folksonomy features have strong "tagging" characteristics; for example, on MovieLens, people can classify the movies they are interested in according to different tags and rate them. Some other typical websites, such as the photo-sharing site Flickr and the video site YouTube, share the same characteristic.
The information contained in tags can be an effective basis for recommendations to other users. Therefore, personalized recommendation systems have developed greatly on folksonomy websites. In addition to tags, other social media data included on folksonomy websites are also common raw data for recommendation systems. Because of the relationships among users, tags and resources on public annotation websites, there is great scope for developing recommendation algorithms. From the more mature collaborative filtering systems to label-based tag recommendation systems, each method has its defects and shortcomings. How to make full use of the original data and develop a more efficient, accurate and personalized recommendation system for folksonomy websites has become a hot issue in academic research.
regularized terms into the weighted Frobenius norm in third-order matrix factorization.
Finally, the report starts from the tensor structure and studies how to decompose a tensor and achieve distributed storage of tensor data, avoiding the situation where a single machine cannot process the huge amount of data in a tensor. The SVD calculation is then performed on the distributed storage to decompose the data; the experiments show that this can alleviate the data sparsity problem and provide a basis for applying SVD in big data analysis. Building on this, an HOSVD method is provided; the experiments show that it saves storage space and improves accuracy without any other changes, making the acceleration advantage of HOSVD more distinct. This has important significance for improving the processing and running efficiency of HOSVD-based analysis applications when incremental data arrives.
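Setting aside the distributed-storage aspect, the core HOSVD computation itself can be sketched on a single machine: unfold the tensor along each mode, keep the leading left singular vectors, and project to obtain the core tensor. A minimal NumPy illustration on random data (ranks and sizes are arbitrary):

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    """Truncated HOSVD: per-mode factor matrices U_n and a core tensor G."""
    Us = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        Us.append(U[:, :r])              # leading left singular vectors
    G = T
    for mode, U in enumerate(Us):
        # Mode-n product with U^T projects mode `mode` onto the r-dim subspace.
        G = np.moveaxis(np.tensordot(U.T, G, axes=(1, mode)), 0, mode)
    return G, Us

rng = np.random.default_rng(0)
T = rng.standard_normal((4, 5, 3))
G, Us = hosvd(T, ranks=(2, 2, 2))
print(G.shape)  # (2, 2, 2)

# Reconstruct the low-rank approximation from the core and the factors.
approx = G
for mode, U in enumerate(Us):
    approx = np.moveaxis(np.tensordot(U, approx, axes=(1, mode)), 0, mode)
print(approx.shape)  # (4, 5, 3)
```

Because truncated HOSVD is an orthogonal projection, the reconstruction error never exceeds the norm of the original tensor; the regularized variant used in this thesis adds a penalty on top of this basic procedure.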
The rest of this thesis is organized into five parts. Chapter 2 presents a literature review that introduces and explores the fundamental concepts related to this research and surveys previous efforts in providing web service discovery and recommendation solutions; the chapter identifies the various models and characteristics likely to be employed in this research. Chapter 3 describes the research methods in detail and provides preliminary knowledge of tensors; it then presents the representation and construction of the tensor model, and finally illustrates the prediction algorithm based on HOSVD decomposition in the proposed recommendation framework. Chapter 4 presents the implementation, the examination of the hypotheses, and the corresponding measurements. Chapter 5 presents the analysis and discussion of our results. Chapter 6 concludes the research; the challenges encountered and future work are also discussed in this chapter.
Chapter 2
Literature Review
2.1 Introduction
Recently, several research works have been conducted in the service-oriented computing domain to tackle various issues and fill research gaps related to Web service discovery, composition and recommendation. Several techniques and models have
been explored to support these activities. This chapter introduces and explores the
fundamental concepts related to this research work, and explores previous efforts in
providing web service discovery and recommendation solutions. The first part of the
chapter presents the general overview of Web service recommendation and discovery
with respect to service computing. As shown in Figure 2.1, various categories of Web
service recommender systems are discussed, along with the different kinds of techniques used to implement these systems. In general, three types of recommender systems are considered: (i) content-based filtering techniques, (ii) collaborative filtering techniques and (iii) hybrid approaches. The collaborative filtering technique is further divided into two separate groups: (a) model-based approaches and (b) memory-based approaches.
As for model-based, various data-mining and machine-learning techniques are
employed to achieve the recommendation objectives. Most common techniques are
28
• Current scarcity of user feedback on web services (L. Liu et al., 2013) and a high degree of uncertainty in the feedback, which is sometimes subjective (X. Chen, Zheng & Lyu, 2014)
• Lack of specific quality of services (QoS) and context information (Su et al.,
2017)
• The need to fine-tune web services to the requirements of the intended user (L. Liu et al., 2013)
• Inadequate formal semantic specifications for describing web services (Yao et al.,
2015)
Currently, most service engineers and developers rely on manual, keyword-based searching of service registries or other public sites such as Google Developers, ProgrammableWeb and Yahoo Pipes to discover and select the required web services. Clearly, such a search is ineffective and time-consuming. Therefore, an effective recommendation approach is required to support service consumers in selecting suitable services for a
2.2.1 Content-based Approach

This group of service recommender solutions relies on syntactic and semantic information
of web services and users to facilitate service recommendation process. The key
idea of content-based methods is to exploit information about user’s preferences and
services content descriptions, which include semantic information of service interfaces,
functionality descriptions, QoS values and so on. They recommend web services based
on the similarity of user preferences and the descriptive information of web services
(Yao et al., 2015). We further describe this group of work under three sub-sections as follows.
2.2.2 QoS-based Approach

This approach relies on the use of various information that describes the nonfunctional
characteristics of web services such as cost, response time, availability, reliability
and throughput to recommend services to users (Su et al., 2017). Most QoS-based
approaches use collaborative filtering algorithms to rank or filter out the most suitable services using the historical QoS information of the user. Chen et al. (X. Chen et al., 2014) proposed a QoS-aware service recommendation approach that uses a user-collaborative mechanism for collecting past QoS information of web services from
different service users. Based on these data, the authors employed a collaborative filtering technique to predict service QoS values, which are in turn used to rank services. A similar approach is used in (Z. Zheng, Ma, Lyu & King, 2013). In (C. Yu & Huang, 2016),
the authors leverage the capabilities of memory-based and model-based CF algorithms to improve recommendation accuracy based on QoS information. The main limitation of
this approach is that QoS properties are subjective and vary widely among different service users due to various factors such as network conditions, location and taste; hence they are difficult to acquire or estimate. QoS properties are measured at the client side, which makes them susceptible to various uncertainties (L. Chen et al., 2013). Recent research efforts in this area have been channelled towards improving the reputation of QoS information.
(Su et al., 2017) address the problem of data credibility caused by dishonest users. The authors proposed a trust-aware approach for reliable personalized QoS prediction for service recommendation. The approach uses a K-means clustering algorithm to identify the cluster of honest users for each service and classifies the QoS feedback submitted by each user as positive or negative based on the cluster. Similarly, Wu et al.
(C. Wu, Qiu, Zheng, Wang & Yang, 2015) proposed a credibility-aware QoS prediction
approach to address unreliability of QoS data. The authors based their approach on
two-phase K-mean clustering to identify the dishonest users by creating cluster values
for untrustworthy index calculation in first phase and another cluster for users in second
K-mean phase. Differentiating between honest and dishonest users is not a trivial task
and considering the subjectivity of user’s information, there is a still lot of gap in this
area.
This group of service recommenders employs context and location information to cluster
users and services and make recommendations. Chen et al. (X. Chen et al., 2014) used
location and QoS information to recommend personalized services to users. In (Fan et
al., 2017), the authors proposed a context-aware service recommendation approach based
on temporal-spatial effectiveness. The authors model the spatial correlation between the user's
location and the web services' locations for user preference expansion, and then compute their
similarities. Based on this computation, services are ranked and recommended to the
users. In (Xu, Yin, Deng, Xiong & Huang, 2016), the authors employ the context information
of both services and users to improve the performance of QoS-based recommendation.
For users, the authors employ geographical information as the user context and identify similar
neighbours for each user based on their similarity. The authors mapped the relationships
between the similarity values and geographical distances, and for the services, affiliation
information was used as context. Recommendations are made based on the QoS records of
the user, the service and the neighbours.
This group of service recommenders is based on the use of formal ontologies to measure
similarity for recommendation. They are usually supported by a lightweight semantic
similarity assessment model that originated from ontology-based conceptual similarity
(H. Xia & Yoshida, 2007). Ontology-based comparison is the backbone of semantic-based
web service matching measures (W. Chen, Paik & Hung, 2015). Wang et al. (Wang,
Xu, Qi & Hou, 2008) discussed various semantic matching algorithms and their defects.
The ontologies are usually defined as sets of semantic attributes that denote services'
functionality, category, input and output parameters and so on. For instance, in Liu et al.
(L. Liu et al., 2013), a semantic content-based recommendation approach was introduced.
The approach estimates service similarities based on five different components: I/O,
functionality, category, precondition and effect. It measures semantic similarity along
these aspects and filters out services with different functionalities and categories. Lee
and Kim (Lee & Kim, 2011) proposed an ontology learning method for RESTful
web services, which allows web services to be grouped into concepts so as to capture
relationships between words using patterns. The model enables the automatic generation
of ontologies from Web Application Description Language (WADL) documents. Elmeleegy et al.
(Elmeleegy, Ivan, Akkiraju & Goodwin, 2008) exploited a repository of compositions
to estimate the popularity of a specific output, and make recommendations using
the conditional probability that an output will be included in a composition. The authors
use a semantic matching algorithm and a meter planner to modify the composition
to produce the suggested output. Building ontologies to support this approach is very
challenging, as it requires a massive amount of expert knowledge and the development of
multiple ontologies to cater for diverse users and domains.
Collaborative filtering (CF) techniques are widely used in recommender systems that
recommend items such as web services to users based on the similarity of different users.
CF is still an active and interesting research area (Bobadilla, Hernando, Ortega & Bernal,
2011; Bobadilla, Hernando, Ortega & Gutiérrez, 2012; Ekstrand, Riedl, Konstan et al.,
2011; Schafer, Frankowski, Herlocker & Sen, 2007). Unlike content-based techniques, which are
domain dependent, CF techniques utilize domain-independent prediction algorithms
for content that cannot be adequately represented or described using descriptive data
(metadata) (Isinkaye et al., 2015). Collaborative filtering algorithms are commonly
used techniques in data mining and information retrieval. They are based on
using the historical behaviour of past users to establish connections between users and items.
Examples of commonly used models include matrix factorization (Luo, Xia & Zhu, 2012),
Bayesian classifiers (M.-H. Park, Hong & Cho, 2007; Friedman, Geiger & Goldszmidt,
1997), latent feature models (J. Zhong & Li, 2010; Yao et al., 2015), and dimensionality
reduction techniques (Van Der Maaten, Postma & Van den Herik, 2009; Sarwar, Karypis,
Konstan & Riedl, 2000) such as Singular Value Decomposition (SVD) (S. Zhang, Wang,
Ford, Makedon & Pearlman, 2005; Vozalis & Margaritis, 2007), matrix completion
techniques, latent semantic methods, and regression and clustering. Model-based
techniques analyze the user-item matrix to establish relations between items (such
as web services). These relations are then in turn used to compute the list of top-N
recommendations.
The items already rated by a user play a relevant role in the search for a neighbour
who shares the user's preferences (Zhao & Shang, 2010; Zhu, Ye & Gong, 2009). Once the
neighbours of a user are found, different algorithms can be used to combine
their preferences to generate recommendations. Due to the effectiveness
of these techniques, they have achieved widespread success in real-life applications.
Memory-based CF can be achieved in two ways: through user-based and item-based
techniques. The user-based collaborative filtering technique calculates the similarity between
users by comparing their ratings on the same items; it then computes the predicted
rating for an item by the active user as a weighted average of the ratings given to the item by
users similar to the active user, where the weights are the similarities of these users with the
active user. Item-based filtering techniques compute predictions using the similarity
between items rather than the similarity between users. Such a technique builds a model of item
similarities by retrieving all the items rated by the active user from the user-item matrix,
determines how similar the retrieved items are to the target item, and then selects the k most similar
items together with their corresponding similarities. A prediction is made by
taking a weighted average of the active user's ratings on the k similar items. Several types
of similarity measures are used to compute the similarity between items or users. The two most
popular similarity measures are correlation-based and cosine-based. The Pearson correlation
coefficient is used to measure the extent to which two variables linearly relate to each
other and is defined as (Isinkaye et al., 2015):

Sim(a, u) = ∑ᵢ₌₁ⁿ (r_{a,i} − r̄_a)(r_{u,i} − r̄_u) / ( √(∑ᵢ₌₁ⁿ (r_{a,i} − r̄_a)²) ⋅ √(∑ᵢ₌₁ⁿ (r_{u,i} − r̄_u)²) )

In the above equation, Sim(a, u) denotes the similarity between two users a and
u, r_{a,i} is the rating given to item i by user a, r̄_a is the mean rating given by user a, and n
is the total number of items in the user-item space. A prediction for an item is then made
from the weighted combination of the selected neighbours' ratings, which is computed
as the weighted deviation from the neighbours' means. The general prediction formula is:

p(a, i) = r̄_a + ∑_{u∈K} (r_{u,i} − r̄_u) ⋅ Sim(a, u) / ∑_{u∈K} ∣Sim(a, u)∣

where K is the set of the active user's neighbours. The cosine-based measure computes the
similarity between two rating vectors as:

CoSim(u, v) = ∑ᵢ r_{u,i} r_{v,i} / ( √(∑ᵢ r²_{u,i}) ⋅ √(∑ᵢ r²_{v,i}) ) (2.3)
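To make the two measures concrete, they can be computed in a few lines of NumPy. The rating vectors below are toy values, not data from this research:

```python
import numpy as np

# ratings of users a and u over the same n co-rated items (toy values)
r_a = np.array([4.0, 3.0, 5.0, 1.0])
r_u = np.array([5.0, 2.0, 4.0, 1.0])

# Pearson correlation: mean-centre each user's ratings before comparing.
da, du = r_a - r_a.mean(), r_u - r_u.mean()
pearson = (da @ du) / (np.sqrt((da ** 2).sum()) * np.sqrt((du ** 2).sum()))

# Cosine similarity (equation 2.3): angle between the raw rating vectors.
cosine = (r_a @ r_u) / (np.sqrt((r_a ** 2).sum()) * np.sqrt((r_u ** 2).sum()))

print(round(pearson, 3), round(cosine, 3))
```

Note that Pearson is simply cosine similarity applied to mean-centred ratings, which is why it is less sensitive to users who rate systematically high or low.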
A similarity measure, also referred to as a similarity metric, is a method
used to calculate the scores that express how similar users or items are to each other.
These scores can then be used as the foundation of user- or item-based recommendation
generation. Depending on the context of use, similarity metrics can also be referred to
as correlation metrics or distance metrics (Adomavicius & Tuzhilin, 2005).
Table 2.3 provides a summary of the different CF techniques used in building recommendation
solutions, as discussed in (Isinkaye et al., 2015).
The new-item problem occurs due to a lack of ratings for new items. Initially, new
items do not normally have ratings, so, until the items are rated by a considerable
number of users, they are not likely to be recommended. Therefore,
most new items that lack ratings become isolated and unnoticed by item consumers.
If such items can be discovered via other means, the new-item issue has
less impact. The new-community problem (Lam, Vu, Le & Duong, 2008; Schein,
Popescul, Ungar & Pennock, 2002) is a common challenge that occurs when a
recommendation system is newly started without sufficient information or data for
making efficient and reliable predictions.
In general, the main challenges of collaborative filtering algorithms include the cold-start
issue, the sparsity of the rating matrix and the ever-growing nature of rating data (scalability).
These challenges are usually tackled with various Matrix Factorization (MF)
models such as Probabilistic Matrix Factorization (PMF), Singular Value Decomposition
(SVD) and Principal Component Analysis (PCA) (Bokde et al., 2015). Most of these
models are based on latent factor models (Koren, Bell & Volinsky, 2009). The basic
form of MF characterizes both users and items by vectors of factors deduced from item
rating patterns; high correlation between item and user factors results in a recommendation.
PCA and SVD techniques are particularly suitable for identifying latent semantic factors
in information retrieval, especially when dealing with CF challenges. Various
matrix decomposition methods are discussed further in Section 2.4.
Early web services recommender systems ranked services based on the services' QoS values. Such
recommendation approaches require the explicit specification of users' requirements to be
able to recommend appropriate services (Yao, Wang et al., 2018). On the other hand,
CF algorithms are capable of capturing users' implicit requirements. In (Z. Zheng et al.,
2010), the authors used a combination of user-based and item-based CF approaches to improve
the QoS value prediction used for their recommendation system. They estimated the similarity
between services and users using the Pearson correlation coefficient, predicted the
missing values in the service-user matrix and recommended the top-k services to users.
Hu et al. (Hu, Peng, Hu & Yang, 2015) proposed a QoS prediction approach based on the
temporal dynamics of QoS attributes and personalized factors of service consumers. The
authors combined an improved time-series forecasting method with collaborative filtering
to compensate for the shortcomings of ARIMA models. The authors of (Zhou, Wang, Guo
& Pan, 2015) used collaborative filtering to make web service recommendations by
exploiting the past usage experiences of service consumers.
Service clustering is a recent technique used to improve the quality of service
discovery and support service recommendation solutions. The method enables the creation
of clusters of web services with similar functionalities in order to reduce the service search
space during service discovery and recommendation. Generally, this technique
uses service documents, including tags, as the main information sources for clustering
(L. Chen, Yu, Philip & Wu, 2015). Existing methods focus on two aspects: (i) some
methods first analyze the user's service requirements and the service description documents,
and then create web service clusters based on their functional similarity (Platzer,
Rosenberg & Dustdar, 2009; Sun & Jiang, 2008); (ii) others utilize tags contributed by
users and perform clustering by combining the similarity of service documents and tags.
Generally, a hybrid recommendation technique (Burke, 2002; B. Cao et al., 2017; Porcel,
Tejeda-Lorente, Martínez & Herrera-Viedma, 2012) combines two or more different
recommendation techniques to enhance recommendation quality and performance,
and to gain better system optimization by avoiding some drawbacks of pure, traditional
recommendation techniques (Adomavicius & Zhang, 2012). The idea is to leverage the
advantages of the individual techniques to gain better recommendation performance, as the
disadvantages of one technique can be minimized or removed by another.
Using a hybrid recommendation approach can suppress or limit the weaknesses of an
individual method within an integrated recommendation model. In most cases, the collaborative
filtering approach is integrated with some other technique in an effort to avoid the ramp-up
problem (Burke, 2002). Table 2.1 shows a summary of the different hybridization
methods commonly used in building recommendation systems. The various ways in which
the integration or combination of pure recommendation techniques can be realized are
discussed as follows:
Switching Hybridization
A switching hybrid recommender system is very sensitive and responsive to the strengths and
deficiencies of its component recommender techniques. However, the strategy also suffers from
the complexity associated with the recommendation process, which is due to the switching
procedure and criterion (Isinkaye et al., 2015). The switching criterion normally leads to more
complexity in the recommendation process due to the increased number of parameters that have
to be determined by the recommender system (Burke, 2002). A popular example of a
switching hybrid recommender system, called DailyLearner, is discussed in (Billsus &
Pazzani, 1999). DailyLearner employs both content-based and collaborative techniques:
content-based recommendation is used first, and in a scenario where the content-based
technique cannot make recommendations, due to the cold-start problem or a lack of sufficient
information, the collaborative filtering approach is applied instead.
Mixed Hybridization
In a mixed hybrid, the weaknesses of an individual technique do not always have an impact
on the overall performance. The mixed hybrid approach is usually employed where it is
practical to make a large number of recommendations simultaneously. An example of the
mixed hybridization method is discussed in (Smyth & Cotter, 2000).
Cascade Hybridization
Matrix factorization techniques based on the latent factor model usually represent
recommendation ratings as the dot product of an item factor matrix and a user factor matrix.
MF models map the factors to a joint latent factor space of dimensionality f,
where user-item relationships are realized as inner products in that space. High similarity
between an item i and a user u results in a recommendation (Koren et al., 2009). Each user u
is mapped to a vector pᵤ ∈ R^f and, likewise, each item i is associated with a vector qᵢ ∈ R^f.
For a given item i, the elements of qᵢ estimate the degree to which the item possesses the
factors, positive or negative. For a given user u, the elements of pᵤ measure the extent of the
interest the user has in items that are high on the corresponding factors, again positive or
negative. The dot product qᵢᵀpᵤ captures the interaction between user u and item i, that is,
the user's overall interest in the item's properties. This fairly accurately estimates the
user u's rating of item i, which is denoted by rᵤᵢ, resulting in the estimate:

rᵤᵢ = qᵢᵀ pᵤ (2.4)
The main task here is estimating the mapping of each item and user to the factor vectors
qᵢ, pᵤ ∈ R^f. After completion of the mapping, the RS can easily compute the rating
a particular user will allocate to a particular item using equation 2.4. This method
is considered effective, performance-wise, in reducing problems related to the high level of
sparsity in recommender system databases, and some recent research efforts employ
dimensionality reduction techniques to tackle similar problems. MF methods have also
proven very efficient and flexible in handling large recommender system databases and
enhancing scalability. Another major advantage of MF is that it enables the incorporation
of additional information, especially when there is insufficient explicit information to
make recommendations. Recommender systems can use implicit information such as
historical information about user behavioural patterns, browsing history, web activities
and so on to infer user preferences. Implicit information or feedback usually represents the
presence or absence of events and is normally represented as a densely filled matrix.
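As a minimal numeric illustration of equation 2.4, with f = 2 latent factors (the factor values below are made up, not learned from any data):

```python
import numpy as np

# Equation 2.4 as code: the predicted rating is the dot product of the item's
# factor vector q_i and the user's factor vector p_u (illustrative values).
p_u = np.array([1.2, -0.4])   # user u: strong interest in factor 1, mild dislike of factor 2
q_i = np.array([0.9, 0.3])    # item i: high on factor 1, slightly high on factor 2
r_hat = q_i @ p_u             # 1.2*0.9 + (-0.4)*0.3 = 0.96
print(r_hat)
```

A high predicted value arises when the two vectors point in similar directions in the latent space, which is exactly the "high correlation between item and user factors" described above.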
Generally, recommendation tasks and many other real-world tasks are represented
with initial high-dimensional matrices that require decomposition or factorization
into two or more smaller matrices. The matrices resulting from the decomposition
(factors of the initial matrix) have several advantages due to their smaller dimensions. For
instance, the smaller dimensions reduce processing time and minimize
the amount of memory needed for storing the matrices, hence improving
the overall computational efficiency of the algorithms, which would otherwise have
performed less efficiently on the original matrix.
If the rank r of the matrix A is equal to its dimension n, then matrix A can be decomposed
as follows:

A = EΛE⁻¹ (2.7)

where E is the matrix whose columns are the eigenvectors of A and Λ is the diagonal
matrix of the corresponding eigenvalues.
In linear algebra, the Singular Value Decomposition (SVD) is an important tool used
to solve mathematical problems, including the factorization of a real or complex
matrix (Berry, Dumais & O'Brien, 1995; Symeonidis & Zioupos, 2016). SVD is one of
the common techniques used for matrix dimension reduction, and can be considered
the generalization of the eigendecomposition of a positive semi-definite normal
matrix to any m × n matrix through an extension of the polar decomposition. It has many useful
applications in recommender systems, statistics and signal processing. The major issue
in a decomposition based on SVD is to find a lower-dimensional feature space (Isinkaye
et al., 2015). Formally, the SVD of an m × n real or complex matrix A is the factorization:

SVD(A) = UΣVᵀ (2.8)

For a symmetric matrix A, the SVD coincides with the eigendecomposition:

A = USUᵀ = EΛE⁻¹ (2.9)

Given a matrix M, two symmetric matrices can be formed from it:

A₁ = MᵀM = VS²Vᵀ (2.10)

A₂ = MMᵀ = US²Uᵀ (2.11)

For both the A₁ and A₂ matrices, computations similar to equations 2.10 and 2.11
can be performed; that is, the SVD factorizations of A₁ and A₂ follow from applying SVD
to the initial matrix M. To decide which of A₁ and A₂ to use, the minimum dimension of
the matrix is selected: if the matrix is of size n × m with m ≪ n, then A₁ is chosen;
A₂ is chosen when n ≪ m.
Principal Component Analysis (PCA) is also one of the common statistical techniques
for data analysis and processing. It is a well-established technique for dimensionality
reduction, used to extract the dominant patterns from a high-dimensional dataset by
transforming a large set of variables into a smaller one.
Let I_ij equal 1 if R_ij is observed (that is, user i rated item j), and 0 otherwise.
In addition, let N(x ∣ µ, σ²) = f_X(x), where X ∼ N(µ, σ²). The conditional
distribution over the observed ratings for users and items can then be defined as follows:

p(R ∣ U, V, σ²) = ∏ᵢ₌₁ᴺ ∏ⱼ₌₁ᴹ [N(R_ij ∣ UᵢᵀVⱼ, σ²)]^{I_ij} (2.12)

where N(x ∣ µ, σ²) is the PDF (probability density function) of the Gaussian distribution
with mean µ and variance σ². Zero-mean spherical Gaussian priors are placed on
U and V with hyperparameters σ²_U and σ²_V:

p(U ∣ σ²_U) = ∏ᵢ₌₁ᴺ N(Uᵢ ∣ 0, σ²_U I),  p(V ∣ σ²_V) = ∏ⱼ₌₁ᴹ N(Vⱼ ∣ 0, σ²_V I)
To maximize the log posterior over U and V, the definition of N is substituted and the log is
taken:

ln p(U, V ∣ R, σ², σ²_U, σ²_V) = −(1/2σ²) ∑ᵢ₌₁ᴺ ∑ⱼ₌₁ᴹ I_ij (R_ij − UᵢᵀVⱼ)² − (1/2σ²_U) ∑ᵢ₌₁ᴺ UᵢᵀUᵢ − (1/2σ²_V) ∑ⱼ₌₁ᴹ VⱼᵀVⱼ − (1/2)((∑ᵢ₌₁ᴺ ∑ⱼ₌₁ᴹ I_ij) ln σ² + ND ln σ²_U + MD ln σ²_V) + C (2.13)

where C is a constant that does not depend on the parameters. Maximizing the
log posterior over the item and user features, with the observation noise variance and prior
variances (hyperparameters) kept constant, reduces the optimization to the minimization of
the sum-of-squared-errors objective function with quadratic regularization terms:

E = (1/2) ∑ᵢ₌₁ᴺ ∑ⱼ₌₁ᴹ I_ij (R_ij − UᵢᵀVⱼ)² + (λ_U/2) ∑ᵢ₌₁ᴺ ∣∣Uᵢ∣∣²_Fro + (λ_V/2) ∑ⱼ₌₁ᴹ ∣∣Vⱼ∣∣²_Fro (2.14)

where λ_U = σ²/σ²_U, λ_V = σ²/σ²_V, and ∣∣ ⋅ ∣∣²_Fro represents the Frobenius norm of a
matrix. A local minimum of the objective function of equation 2.14 can be found through
gradient descent in U and V. The PMF model can also be considered a probabilistic
extension of the SVD model discussed in the previous section (Mnih &
Salakhutdinov, 2008). Constrained PMF is further discussed in (Mnih & Salakhutdinov,
2008).
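A minimal sketch of minimizing the regularized objective of equation 2.14 by batch gradient descent, on toy synthetic ratings with illustrative hyperparameters (not the settings used in this research):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, D = 6, 5, 2
U_true = rng.normal(size=(N, D))
V_true = rng.normal(size=(M, D))
R = U_true @ V_true.T                          # toy rank-2 rating matrix
I = (rng.random((N, M)) < 0.7).astype(float)   # I_ij = 1 where a rating is observed

lam, lr = 0.01, 0.02                           # lambda_U = lambda_V, learning rate
U = rng.normal(scale=0.1, size=(N, D))
V = rng.normal(scale=0.1, size=(M, D))
for _ in range(3000):
    err = I * (R - U @ V.T)                    # residuals on observed entries only
    U += lr * (err @ V - lam * U)              # step along the negative gradient of E w.r.t. U
    V += lr * (err.T @ U - lam * V)            # step along the negative gradient of E w.r.t. V

E = 0.5 * (I * (R - U @ V.T) ** 2).sum() + lam / 2 * ((U ** 2).sum() + (V ** 2).sum())
print(E)
```

The mask I restricts the loss to observed entries, which is what distinguishes this objective from a plain SVD reconstruction; the λ terms keep the factor norms small, matching the Gaussian priors in the derivation above.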
Non-Negative Matrix Factorization (NMF) is also a widely used tool for processing
and analyzing high-dimensional data, because it automatically extracts sparse and easily
interpretable features from a set of non-negative data vectors (Gillis, 2014). The NMF
algorithm is one of the multivariate analysis and linear algebra algorithms: it
factorizes a matrix A into two matrices P and Q, with the property that none of the three
matrices has negative elements (Guan, Tao, Luo & Yuan, 2012). The non-negativity
makes the resulting matrices more suitable for object clustering applications
(Symeonidis & Zioupos, 2016).
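As an illustrative sketch on random toy data, the classic Lee-Seung multiplicative update rules (one common way to fit NMF, not necessarily the algorithm used by the cited works) preserve non-negativity at every step:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((8, 6))                    # non-negative data matrix
r, eps = 3, 1e-9                          # rank of the factorization, numerical guard
P, Q = rng.random((8, r)), rng.random((r, 6))

for _ in range(500):
    Q *= (P.T @ A) / (P.T @ P @ Q + eps)  # multiplicative updates keep all entries >= 0
    P *= (A @ Q.T) / (P @ Q @ Q.T + eps)

err = np.linalg.norm(A - P @ Q, 'fro')
print(P.min() >= 0, Q.min() >= 0, round(err, 3))
```

Because the updates only ever multiply non-negative quantities, no projection step is needed, which is part of why the resulting parts-based factors are easy to interpret.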
• They are strongly dependent on training data and predict rating values based on
the opinions expressed in past users' rating feedback
This section provides the preliminary knowledge of tensors, one of the key
techniques used in this research work. First, an overview of tensor factorization is
presented, followed by the related tensor decomposition methods: Tucker Decomposition
(TD), the underlying tensor decomposition technique for Higher-Order SVD
(HOSVD); PARAllel FACtor analysis (PARAFAC); and Low-Order Tensor Decomposition.
Tucker Decomposition
Tucker decomposition (Tucker, 1966) has a few variants. Higher-Order Singular Value
Decomposition (HOSVD) is a specific variant of Tucker decomposition that factorizes
a tensor into a set of matrices and one small core tensor; it is often regarded as Tucker 1
decomposition (Symeonidis & Zioupos, 2016; De Lathauwer, De Moor & Vandewalle,
2000). In order to apply the HOSVD technique to a 3rd-order tensor A, three matrix
unfoldings are constructed; these are matrix representations of tensor A with all the column
(row, ...) vectors stacked one after another. Figure 2.6 illustrates a
typical unfolding of a 3-order tensor A, where A₁, A₂, A₃ are known as the mode-1, mode-2
and mode-3 matrix unfoldings of A, respectively.
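Mode-n unfolding can be sketched in NumPy with a small helper (the function name `unfold` is ours, introduced only for illustration):

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: move axis `mode` to the front, then flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

A = np.arange(24).reshape(2, 3, 4)   # a small 3rd-order tensor
A1, A2, A3 = unfold(A, 0), unfold(A, 1), unfold(A, 2)
print(A1.shape, A2.shape, A3.shape)  # (2, 12) (3, 8) (4, 6)
```

Each unfolding keeps one mode as rows and stacks the fibres of the other two modes as columns, which is what allows ordinary matrix SVD to be applied mode by mode.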
Given the set Y of observed usage triples, the entries of the initial tensor are defined as:

a_{u,i,t} := 1 if (u, i, t) ∈ Y, and 0 otherwise.
Hence, the tensor Â is built as the product of the core tensor Ĉ and the mode products
of the three matrices Û, Î and T̂, as expressed below:

Â := Ĉ ×u Û ×i Î ×t T̂ (2.16)

Û, Î and T̂ are all low-rank feature matrices, each denoting one mode, that is users, items
and tags respectively, in terms of its small number of latent dimensions k_U, k_I, k_T,
and Ĉ ∈ R^{k_U × k_I × k_T} is the core tensor, which governs the relation between the latent
semantic factors. The model parameters to be optimized are represented by the quadruple
θ̂ := (Ĉ, Û, Î, T̂), as shown in Figure 2.7.
The parameters are learned by minimizing the squared reconstruction error over the
observed triples:

argmin_{θ̂} ∑_{(u,i,t)∈Y} (â_{u,i,t} − a_{u,i,t})² (2.17)

with the model prediction given by:

â_{u,i,t} := ∑_{ũ=1}^{k_U} ∑_{ĩ=1}^{k_I} ∑_{t̃=1}^{k_T} ĉ_{ũ,ĩ,t̃} ⋅ û_{u,ũ} ⋅ î_{i,ĩ} ⋅ t̂_{t,t̃} (2.18)

where indices over the feature dimensions of a feature matrix are marked with a tilde, and
elements of a feature matrix are marked with a hat, e.g. t̂_{t,t̃} (Symeonidis & Zioupos, 2016).
The Parallel Factor Analysis (PARAFAC) (Bro, 1997) is a special case of the Tucker
decomposition method (also known as canonical decomposition), which minimizes the
complexity of tensor decomposition by assuming a diagonal core tensor.
Figure 2.8: The graphical representation of PARAFAC with diagonal core tensor and
the factorization dimensionality (equal for the three modes)
The diagonal core tensor of the PARAFAC model is defined as:

c_{û,î,t̂} := 1 if û = î = t̂, and 0 otherwise.
For instance, consider a 3-way array with three loading matrices A, B, C for a PARAFAC
model, whose elements are a_{if}, b_{jf}, c_{kf} respectively. The trilinear
method amounts to minimizing the sum of squares of the residuals e_{ijk} in the model:

x_{ijk} = ∑_{f=1}^{F} a_{if} b_{jf} c_{kf} + e_{ijk} (2.19)

which can equivalently be written as a sum of outer products of the loading-matrix columns:

X = ∑_{f=1}^{F} a_f ⊗ b_f ⊗ c_f (2.20)

where a_f is the f-th column of loading matrix A, and b_f and c_f are defined analogously
for loading matrices B and C (Bro, 1997).
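The equivalence of the element-wise model (2.19, without the residual term) and the sum of outer products (2.20) can be checked numerically with random toy loading matrices:

```python
import numpy as np

# PARAFAC reconstruction from loading matrices A (I x F), B (J x F), C (K x F):
# x_ijk = sum_f a_if * b_jf * c_kf, written once with einsum and once as outer products.
I, J, K, F = 4, 5, 6, 2
rng = np.random.default_rng(1)
A, B, C = rng.random((I, F)), rng.random((J, F)), rng.random((K, F))

X = np.einsum('if,jf,kf->ijk', A, B, C)

# Equivalent sum of F rank-one terms (equation 2.20):
X2 = sum(np.multiply.outer(np.multiply.outer(A[:, f], B[:, f]), C[:, f])
         for f in range(F))
print(np.allclose(X, X2))
```

With F components, the reconstructed tensor is a sum of F rank-one tensors, which is what makes PARAFAC cheaper than a full Tucker model with a dense core.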
Research Method
This chapter provides the details of the construction of the proposed tensor factorization
models used for Web-API recommendation in this research. The work adopts
both exploratory and quantitative approaches to study the application of a 3-order
factorization technique in exploiting the latent features that exist among multi-dimensional
relations in the web service domain. Generally, the research method can be
divided into four key components: (i) the construction and description of the dataset
used in the research; (ii) the representation and construction of a regularization-based
tensor factorization optimization model for the Web-API recommendation application;
(iii) the prediction algorithm based on HOSVD decomposition for the proposed Web-API
recommendation framework; and (iv) a comparative procedure for traditional 2-order
Probabilistic Matrix Factorization and 3-order Tensor Factorization using the Web-API
dataset.
3.1 Overview
This research intends to explore both the Tensor Factorization (TF) technique and
Probabilistic Matrix Factorization (PMF) to tackle various issues related to both the large
Chapter 3. Research Method 68
This research work explores a publicly available dataset from the ProgrammableWeb
repository, currently the largest Web-API and mashup repository. The time-stamped raw
dataset crawled from the repository consists of the textual descriptions of 17,829 APIs and
5,691 mashups, and their historical invocations from June 2005 to January 2019. Considering
that the ProgrammableWeb backend database is not publicly accessible, only its web
pages can be used for collecting the data. Hence, a data scraping technique is
employed to crawl data from the repository web pages. After that, the web pages are
divided into two categories: Web-APIs and mashups. Each Web-API has features
including tags, name, description, users (as mashups), publication date, URL, endpoint,
portal and category. Likewise, every mashup has the above metadata plus
the set of Web-APIs it invokes. Table 3.1 describes some basic statistics of the acquired
ProgrammableWeb dataset.
Because the ProgrammableWeb dataset does not include ratings between
mashups and APIs, the mashup-Web-API invocation data is adopted as the rating here.
Several cleaning and preprocessing steps then follow. First, redundant APIs and mashups
are removed, leaving 5,691 mashups and only 1,170 Web-APIs. The resulting
mashup-API mapping is very sparse, with a density of 1.6 × 10⁻³. All the API
and mashup descriptions were put into two lists, respectively, and a tag was assigned to each
description.
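The cleaning steps can be sketched as follows, using a toy list of invocation pairs in place of the crawled ProgrammableWeb records (all mashup and API names below are hypothetical):

```python
import numpy as np

# Toy (mashup, api) invocation pairs standing in for the crawled records.
invocations = [("mapSearch", "GoogleMaps"), ("mapSearch", "Flickr"),
               ("newsMash", "Twitter"), ("newsMash", "GoogleMaps"),
               ("newsMash", "GoogleMaps")]            # redundant record to be removed

pairs = sorted(set(invocations))                      # drop redundant records
mashups = sorted({m for m, _ in pairs})
apis = sorted({a for _, a in pairs})

# Binary mashup-API "rating" matrix: 1 if the mashup invokes the API.
R = np.zeros((len(mashups), len(apis)), dtype=int)
for m, a in pairs:
    R[mashups.index(m), apis.index(a)] = 1

density = R.sum() / R.size
print(R.shape, density)
```

On the real dataset the same density computation yields roughly 1.6 × 10⁻³, i.e. the matrix is overwhelmingly zeros, which motivates the factorization models that follow.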
Table 3.2: Sample Specification of Web-API (a) and Mashup Profile (b) in PW dataset
Generally, MF models map both users and items to a joint latent factor space of some
dimensionality d, such that user-item interactions are modelled as inner products in that
space (Koren et al., 2009). This success and wide adoption can mainly be attributed to MF's
exceptional scalability and accuracy.
For instance, suppose two entities u and v, where u = {u₁, u₂, ..., u_m} is a set of users
and v = {v₁, v₂, ..., v_n} is the set of items. The primary idea here is to decompose the
2-dimensional user-item matrix R ∈ R^{m×n} into two low-order matrices representing the user
latent subspace matrix U ∈ R^{m×d} and the item latent subspace matrix V ∈ R^{n×d} respectively,
where the dimension of the shared latent space d ≪ min(m, n). The likelihood of a particular user
uᵢ interacting with item vⱼ is approximated through the computation of the following
optimization problem (Yao, Sheng et al., 2018):

L(U, V) = min_{U,V} ∑ᵢ₌₁ᵐ ∑ⱼ₌₁ⁿ (R_ij − UᵢVⱼᵀ)² (3.1)
p(R ∣ U, V, σ²) = ∏ᵢ₌₁ᵐ ∏ⱼ₌₁ⁿ [N(R_ij ∣ UᵢᵀVⱼ, σ²)]^{I_ij} (3.2)

p(U ∣ σ²_U) = ∏ᵢ₌₁ᵐ N(Uᵢ ∣ 0, σ²_U I),  p(V ∣ σ²_V) = ∏ⱼ₌₁ⁿ N(Vⱼ ∣ 0, σ²_V I) (3.3)
For both the user (U) and item (V) latent features, the log of the posterior distribution over
both entities is computed as follows, based on the elements of equations 3.2 and 3.3:

ln p(U, V ∣ R, σ², σ²_U, σ²_V) = −(1/2σ²) ∑ᵢ₌₁ᵐ ∑ⱼ₌₁ⁿ I_ij (R_ij − UᵢᵀVⱼ)² − (1/2σ²_U) ∑ᵢ₌₁ᵐ UᵢᵀUᵢ − (1/2σ²_V) ∑ⱼ₌₁ⁿ VⱼᵀVⱼ − (1/2)((∑ᵢ₌₁ᵐ ∑ⱼ₌₁ⁿ I_ij) ln σ² + md ln σ²_U + nd ln σ²_V) + C (3.4)

Maximizing this log posterior with the hyperparameters held fixed is equivalent to
minimizing the sum-of-squared-errors objective function with quadratic regularization
terms:

E = (1/2) ∑ᵢ₌₁ᵐ ∑ⱼ₌₁ⁿ I_ij (R_ij − UᵢᵀVⱼ)² + (λ_U/2) ∑ᵢ₌₁ᵐ ∣∣Uᵢ∣∣²_F + (λ_V/2) ∑ⱼ₌₁ⁿ ∣∣Vⱼ∣∣²_F (3.5)

where λ_U = σ²/σ²_U, λ_V = σ²/σ²_V, and ∣∣ ⋅ ∣∣²_F represents the Frobenius norm. By
performing gradient descent on both U and V in equation 3.5, a local minimum of
its objective function can be found.
Rn×k , where each element rij implies if or if not an API represented by ai is consumed
by a mashup (or a users) denoted by mj ( that is true if rij = 1 and false if rij = 0 ). The
main objective of P M F in this case is to map the mashups with respect to component
APIs into a shared a dimensional shared latent space d ≪ min{n, k}. The resulting
latent subspace matrices for mashup and Web-APIs are arranged in k × d for matrix M
and n × d for matrix A respectively,where API ai ∈ Rd and mashup mj ∈ Rd . Therefore,
the probability-based prediction that ai will be consumed by mj is calculated by:
If the latent factors of both mashups and Web-APIs are both represented as matrices M
and A, such that A ∈ Rn×d and M ∈ Rk×d respectively, these factors can be learned from
minimizing the sum-of-squared-errors Objective function with quadratic regularization
terms as follow:
1 n k 2 λA n λM k
L= ∑ ∑ ij ij
I (r − a T
i mj ) + ∑ ∣∣ A ∣∣
i F
2
+ ∑ ∣∣ Mj ∣∣2F (3.7)
2 i=1 j=1 2 i=1 2 j=1
σ2 σ2
Similar to equation in 3.7, λA = σ2 A , λM = σ2 M , and ∣∣ . ∣∣2F represents the the
terms introduced to reduce over-fitting the training data. The value of λ in this case is
data-dependent.where Iij equals 1 if API ai is invoked by mashup mj , and 0 otherwise.
The aim of the optimization is to minimize the sum-of-squared-errors loss function with
quadratic regularization terms, and gradient-decent approaches can be applied to find a
local minimum (Yao, Sheng et al., 2018)
For ternary relations, where standard matrix factorization cannot be applied, tensors are
used. A tensor can be regarded as a multi-dimensional matrix, a generalization of a matrix;
an order-N tensor is an N-dimensional tensor.
A square matrix A ∈ R^{N×N} is a matrix with the same number of rows and columns, N.
A is called non-singular if there is another matrix B ∈ R^{N×N} such that AB = I
and BA = I, where I ∈ R^{N×N} is the identity matrix. If A is not invertible, then it is
singular.
A square matrix A is called orthogonal if the column vectors of A form an
orthonormal set in R^N; that is, A is a matrix with real entries whose columns
and rows are orthogonal unit vectors, so that AAᵀ = AᵀA = I, where Aᵀ is the transpose of
matrix A.
Frobenius Norm ∣∣ ⋅ ∣∣_F

The Frobenius norm², also known as the Euclidean norm (a name also used for the vector
L2-norm), of an M × N matrix A is defined as the square root of the sum
of the absolute squares of its elements, as given in equation 3.8 below:

∣∣A∣∣_F = √( ∑ᵢ₌₁ᴹ ∑ⱼ₌₁ᴺ ∣a_ij∣² ) (3.8)

²http://mathworld.wolfram.com/FrobeniusNorm.html
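A quick numerical check of equation 3.8 against NumPy's built-in matrix norm:

```python
import numpy as np

A = np.array([[1.0, -2.0], [3.0, 4.0]])

# Frobenius norm from the definition in equation 3.8: sqrt of the sum of squared entries.
frob = np.sqrt((np.abs(A) ** 2).sum())
print(np.isclose(frob, np.linalg.norm(A, 'fro')))  # sqrt(1 + 4 + 9 + 16) = sqrt(30)
```

This is the norm that appears in the regularization terms of equations 3.5 and 3.7.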
Matrix Unfolding
This section describes the tensor factorization method employed in this work, HOSVD, which is an extension of the Singular Value Decomposition (SVD) method described in the literature review. The section presents the algorithm and a step-by-step implementation of HOSVD, and shows how the method can be employed to exploit the underlying latent semantic structure in a three-dimensional data model. An illustration of the Web-API (service-oriented) tensor model implementation is then presented with algorithms.
The algorithm description presented here shows how HOSVD operates on inferred latent associations in a three-dimensional space. The tensor decomposition technique initially builds a tensor from the usage-data triplets (u, i, t) of user (mashup), item (API), and proximity. The idea is to employ the three entities that relate in a location-based, context-aware recommender system. Consequently, a tensor A will be unfolded into three new matrices, and SVD will then be applied to each of them. The following steps summarize the procedure (Symeonidis et al., 2008):
• Step 1: Construct the initial tensor A from the integrated data triplets.
• Step 2: Unfold tensor A, constructing a three-mode matrix representation and resulting in the creation of three new matrices (one for each mode), as shown in Equation 2.15.
• Step 3: After unfolding, apply SVD to each of the three new matrices.
F = S ×1 U 1 ×2 U 2 (3.9)
A = S ×1 U 1 ×2 U 2 ×3 U 3 (3.10)
Note that the tensor-matrix multiplication operator ×_n indicates the mode of the tensor along which the matrix is multiplied.
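The matrix unfolding of Step 2 and the ×_n products in Equations 3.9 and 3.10 can be sketched in plain numpy. This follows the unfolding convention of the tensorly library used later in this work; the toy tensor and matrix shapes are illustrative assumptions:

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: bring the chosen mode to the front and flatten the
    remaining modes into columns (the convention used by tensorly.unfold)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def fold(M, mode, shape):
    """Inverse of unfold: restore the matrix to the original tensor shape."""
    full = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape(full), 0, mode)

def mode_dot(T, U, mode):
    """Tensor-matrix product T x_n U: multiply U into the n-th mode of T."""
    shape = list(T.shape)
    shape[mode] = U.shape[0]
    return fold(U @ unfold(T, mode), mode, tuple(shape))

T = np.arange(24.0).reshape(2, 3, 4)                 # toy order-3 tensor
U = np.random.default_rng(0).standard_normal((5, 3)) # maps mode 1: size 3 -> 5
```

Here `unfold(T, 1)` has shape (3, 8), and `mode_dot(T, U, 1)` has shape (2, 5, 4); chaining three such products with the factor matrices reproduces Equation 3.10.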
Since this research employs proximity as the only contextual information in the HOSVD tensor data model, an illustration of a typical HOSVD decomposition with a single contextual variable P is described in this section; hence Y, the tensor holding the ratings, will be three-dimensional. While multiple contextual variables could be used, the generalization to more dimensions with additional context variables is trivial (Karatzoglou et al., 2010). Assuming a five-star rating scale, Y is given in three dimensions as {0, . . . , 5}^{m×a×p} for a sparse tensor Y ∈ Y^{m×a×p}, where m is the number of users (mashups in our case), a the number of items (APIs in our case), and p the number of values of the contextual variable (i.e., p_i ∈ {1, . . . , p}; in our case proximity is used as the single contextual variable). In the rating tensor Y, the value 0 represents that a user (mashup) did not rate or consume an item (an API). It is worth noting that 0 in this case specifically indicates missing data and is not synonymous with dislike. Figure 3.3 shows the 3-dimensional tensor decomposed into three matrices M ∈ R^{m×d_M}, A ∈ R^{a×d_A} and P ∈ R^{p×d_P} and a core tensor S ∈ R^{d_M×d_A×d_P}. With respect to this representation, the decision function for a single user i (mashup), item j (API) and context k (e.g. proximity) becomes:

F_ijk = S ×1 M_i∗ ×2 A_j∗ ×3 P_k∗    (3.12)
M_i∗ represents the entries of the i-th row of matrix M. This factorization model enables full control over the dimensionality of the factors retrieved for the users, items and proximity (or any other context) by tuning the d_M, d_A and d_P parameters. This feature is especially valuable when dealing with large-scale real-world datasets, where both the user matrix and the item matrix can grow in size and cause storage problems (Karatzoglou et al., 2010).
In comparison with the 2-D matrix factorization approach described in Section 2.4, the loss function for a 3-order tensor is described as follows:

L(F, Y) := (1/∣∣D∣∣₁) ∑_{i,j,k} D_ijk l(F_ijk, Y_ijk)    (3.13)

where D ∈ {0, 1}^{m×a×p} is a binary tensor with non-zero entries D_ijk wherever Y_ijk is observed, and l : R × Y → R is a point-wise loss function penalizing the distance between the observation and the estimate; F_ijk was already described in Equation 3.12. It is worth noting that the overall loss L is meant to capture only the observed values in the sparse tensor Y, not the missing ones.
Generally, there are different possible choices for the loss function l. Some of the common ones are described below:
Squared Error
The squared error measures the squared difference between the estimated value and the observation; minimizing it yields an estimate of the conditional mean:

l(f, y) = (1/2)(f − y)²    (3.14)
Absolute Loss
l(f, y) = ∣f − y∣ (3.15)
While other loss functions are possible (Karatzoglou et al., 2010), this work focuses on the two described above.
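As a concrete check of Equations 3.14 and 3.15, the two point-wise losses can be written directly (a minimal sketch; the sample rating values are illustrative):

```python
import numpy as np

def squared_loss(f, y):
    # Equation 3.14: half the squared difference; minimizing it estimates
    # the conditional mean of the ratings
    return 0.5 * (f - y) ** 2

def absolute_loss(f, y):
    # Equation 3.15: the absolute difference; less sensitive to outliers
    return np.abs(f - y)

f, y = 3.5, 5.0   # predicted vs. observed rating
print(squared_loss(f, y))   # 1.125
print(absolute_loss(f, y))  # 1.5
```

The comparison shows the key difference: the squared loss penalizes the 1.5-point error superlinearly, while the absolute loss penalizes it linearly.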
3.5.4 Regularization
Minimizing the loss function alone tends to lead to overfitting (i.e., the model adapts itself too closely to the training data and thus does not generalize well to the test data). Therefore, in order to reduce the overfitting effect on the test data, a regularization term is usually introduced. For the model of Equation 3.12, where S, M, A and P constitute the data model, we can control the complexity of the model and ensure it does not grow without bound. Hence, an L2 regularization term, the l2 norm of the factors, is added. For a matrix, this norm is the Frobenius norm.
Ω[M, A, P] := (1/2) [λ_M ∣∣M∣∣²_F + λ_A ∣∣A∣∣²_F + λ_P ∣∣P∣∣²_F]    (3.17)
Similarly, the complexity of the core tensor S can also be restricted by imposing an l2-norm penalty:
Ω[S] := (1/2) λ_S ∣∣S∣∣²_F    (3.18)
To this extent, this framework will aim to optimize a regularized risk functional that aggregates Equations 3.13, 3.17 and 3.18, that is L(F, Y), Ω[M, A, P] and Ω[S]. Hence, the objective function for the optimization problem becomes:

R[F, Y] := L(F, Y) + Ω[M, A, P] + Ω[S]    (3.19)
There are various factors influencing the network performance between a target user and a target web service. The most critical are network distance and network bandwidth (J. Liu, Tang, Zheng, Liu & Lyu, 2015), which are highly relevant to the locations of the target user and the target service. Incorporating location information about users and candidate services has been found to be an important influence on the consumption and selection of services; the locations of the service consumer and the service can influence the observed quality rating of a service invocation. Chen et al. (X. Chen, Liu, Huang & Sun, 2010) and (J. Liu et al., 2015) emphasize how critical the location of a service relative to its users is in influencing the Quality of Service of the web service. Location here implies the network environment that can affect the QoS attributes (such as response time and throughput) of a service. For instance, if the network performance between a target user and the target Web service is high, the likelihood that the user will experience high QoS on the target service increases. Moreover, users who reside in the same network region usually observe the same response time with respect to a service. Therefore, we consider the geographical proximity between users and services as the single contextual variable in our recommendation framework, as it is capable of influencing the invocation preferences of services with respect to mashups. Intuitively, we assume higher preferences for API-mashup interactions with closer proximity.
In order to capture the proximity score between a particular Web-API a_i and mashup m_j, this work employs the Haversine formula³, which determines the great-circle distance between two points on a sphere given their longitudes and latitudes. The formula is defined below:
³ https://en.wikipedia.org/wiki/Haversine_formula
d_{u,i} = 2r · arcsin( √( sin²((ϕ_u − ϕ_i)/2) + cos(ϕ_i) cos(ϕ_u) sin²((γ_u − γ_i)/2) ) )    (3.20)
where r = 6371 km denotes the Earth's radius, ϕ_u, ϕ_i ∈ [−90, 90] denote the latitudes of the corresponding geolocations of u and i, and γ_u, γ_i ∈ (−180, 180] represent the respective longitudes. The geographical information required for the service proximity computations was acquired from the GeoIP database⁴ via a lookup of each Web service and mashup URL. The resulting proximity score is normalized to be within the range [0.1, 1]. A min-max normalization is performed on d as follows:
z_{u,i} = 0.8 · (d_{u,i} − min(d)) / (max(d) − min(d)) + 0.1    (3.21)
z_{u,i} represents the normalized value of d_{u,i}. The geographical proximity value between entities m_u and a_i can then be simplified as follows:
⁴ https://www.maxmind.com/en/geoip2-databases
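Equations 3.20 and 3.21 can be sketched in Python as follows. The coordinates in the usage example are illustrative, not values taken from the GeoIP lookups described above:

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def haversine(lat_u, lon_u, lat_i, lon_i, r=EARTH_RADIUS_KM):
    """Great-circle distance between two geolocations (Equation 3.20), in km."""
    phi_u, phi_i = np.radians(lat_u), np.radians(lat_i)
    dphi = np.radians(lat_u - lat_i)
    dlam = np.radians(lon_u - lon_i)
    a = np.sin(dphi / 2) ** 2 \
        + np.cos(phi_i) * np.cos(phi_u) * np.sin(dlam / 2) ** 2
    return 2 * r * np.arcsin(np.sqrt(a))

def normalize(d):
    """Min-max scale raw distances as in Equation 3.21."""
    d = np.asarray(d, dtype=float)
    return 0.8 * (d - d.min()) / (d.max() - d.min()) + 0.1

# illustrative example: Auckland to Sydney, roughly 2,150 km apart
d = haversine(-36.85, 174.76, -33.87, 151.21)
```

Applying `normalize` to the full set of pairwise distances yields the proximity scores used to populate the third mode of the tensor.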
Chapter 4
Analysis
4.1 Introduction
At present, large websites such as Last.fm, MovieLens and YouTube use social tag recommendation systems to classify items and share information among users, gradually realizing tag classification and building corresponding user groups. However, many users find typing tags tedious, which leads to the sparse-data problem of tagging systems caused by users' unwillingness to provide tags. In addition, tag recommendation systems still suffer from problems such as lexical differences and semantic ambiguity. Therefore, we need to find the kind of tag that users consider the most appropriate and comprehensive interpretation of an item's information, so that users can query, share and integrate item information more effectively. For these reasons, recent studies have been devoted to mining user tags (tag metadata) on specific items to improve tag recommendation algorithms. Traditional recommendation systems generally use collaborative filtering algorithms based on two-dimensional data (Herlocker et al., 2002; Sun et al., 2006; Karypis, 2001), while other algorithms combine tags with standard collaborative filtering to form three-dimensional association data (Tso-Sutter et al., 2008), which helps to mine the potential semantic associations among the three types of entities. For the latter, two questions need to be solved first: (i) the construction of the three-dimensional relationship among users, items and tags; and (ii) the sparsity of the metadata. In order to mine the potential semantic association between mashups and APIs, we first constructed a three-dimensional tensor (proximity × Mashup × APIs) from the raw data and decomposed it with HOSVD. Finally, we compared its performance with the PMF algorithm based on two-dimensional data. The results indicated that HOSVD is better than PMF at predicting strong associations.
4.2 Tools
The following function modules were used in a Python 3.6.3 environment (Oliphant, 2007).
The pandas module (Bernard, 2016) was used to obtain the number of columns and rows of data in dataframe format, and to perform dataframe format conversions and basic dataframe operations.
The sktensor and tensorly modules (Kossaifi et al., 2019) were used for three-dimensional tensor construction and decomposition.
The numpy module (Van Der Walt et al., 2011) was used to sample the training and test datasets from the raw data and to perform singular value decomposition of two-dimensional matrices.
The matplotlib module (Hunter, 2007) was used for visualization of the analysis, including heat-map and line-chart plotting.
4.3 Implementation
We select the first and fifth columns of mashup-data.csv and the seventh column of mashup-city.csv to build a new data frame, and then construct a three-dimensional tensor (Country × Mashup names × APIs).
from ast import literal_eval
import pandas as pd

def df2tb(df, name, tag, item):
    # resolve the column labels of the mashup-name, tag and item columns
    df_name = df.columns.values.tolist()[name]
    df_tag = df.columns.values.tolist()[tag]
    df_item = df.columns.values.tolist()[item]
    Users, Tags, Items = [], [], []
    for i in range(df.shape[0]):
        # the tag and item cells hold Python-literal lists stored as text
        tags_temp = literal_eval(df[df_tag][i])
        for j in range(len(tags_temp)):
            Users.append(df[df_name][i])
            Tags.append(tags_temp[j])
            Items.append(literal_eval(df[df_item][i])[0])
    tb = pd.DataFrame({df.columns[0]: Users, df.columns[1]: Tags,
                       df.columns[2]: Items})
    return tb
def tb2tc(tc, tb):
    # build a binary tensor with the same shape as tc from the triplets in tb
    N1, N2, N3 = tc.shape
    tb_users = tb.iloc[:, 0]
    tb_tags = tb.iloc[:, 1]
    tb_items = tb.iloc[:, 2]
    res = np.zeros([N1, N2, N3])
    for i in tb.index.tolist():
        res[tb_items[i]][tb_users[i]][tb_tags[i]] = 1
    return res
To show the sparsity of the dataset, a Mashup–APIs heat-map was plotted. Its density was calculated by the following formula:

Density = count(e_ij) / (ncol × nrow),  e_ij ∈ A and e_ij ≠ 0    (4.1)

In the code below, we (i) set the non-zero elements to one and calculate the sum of the matrix; (ii) divide this sum by the product of the number of rows and columns of the matrix; (iii) the sparsity is then 1 − density.
def tb2mat(tb, r, c):
    # binary co-occurrence matrix of two columns of the triplet table
    rows = tb.iloc[:, r]
    cols = tb.iloc[:, c]
    nr = len(np.unique(rows))
    nc = len(np.unique(cols))
    res = np.zeros([nr, nc])
    for i in tb.index.tolist():
        res[rows[i]][cols[i]] = 1
    return res

# Sparsity
mat1 = tb2mat(d0, 0, 1)
density = sum(sum(mat1)) / (mat1.shape[0] * mat1.shape[1])
# density = 0.0015
sparsity = 1 - density
# sparsity = 0.9985
Then, this matrix was converted into tensor A with the dtensor function in the sktensor module.
def TC(tb):
    # construct a binary items x users x tags tensor from the triplet table;
    # enumerate_fn is a helper returning the indices of matching values
    tb_users = tb.iloc[:, 0]
    tb_tags = tb.iloc[:, 1]
    tb_items = tb.iloc[:, 2]
    tb_tags_unique = np.unique(tb_tags)
    tb_items_unique = np.unique(tb_items)
    tb_users_unique = np.unique(tb_users)
    Tc = np.zeros((len(tb_items_unique), len(tb_users_unique),
                   len(tb_tags_unique)), float)
    for index, values in enumerate(tb_items_unique):
        items_temp = enumerate_fn(tb_items, values)
        for i in items_temp:
            r1 = enumerate_fn(tb_users_unique, tb_users[i])[0]
            r2 = enumerate_fn(tb_tags_unique, tb_tags[i])[0]
            print(index, "/", len(tb_items_unique), ": [", r1, ",", r2, "]")
            Tc[index][r1][r2] = 1
    return Tc
ratio = 0.6
train_tb = d0.loc[d0_cn[0:int(ratio * len(d0_cn))]]
test_tb = d0.loc[d0_cn[int(ratio * len(d0_cn)):len(d0_cn)]]
train_tc = tb2tc(otc, train_tb)
test_tc = tb2tc(otc, test_tb)
The Web-API prediction approach applies a tensor factorization algorithm based on HOSVD (as discussed in the method chapter) to the API data. In accordance with the HOSVD approach introduced in Section 3.5, the algorithm takes the web service data tensor A as input and outputs the reconstructed tensor Â. Tensor Â captures the latent associations among the APIs, mashups (users) and proximity (or location). Each element of tensor Â can be represented by {m, a, p, r}, where r measures the likelihood that mashup m will consume API a within proximity p. Hence, API a can be predicted/recommended based on the weight attributed to the {m, a} pair. The three-dimensional tensor was unfolded into three two-dimensional matrices A1, A2 and A3 along its three modes, using the unfold function in the tensorly module. Secondly, singular value decomposition (SVD) (Sun et al., 2005) was performed on these three matrices via the svd function in the numpy module. After SVD, we obtain the singular-value matrices (S1, S2 and S3), the left singular matrices (U1, U2 and U3) and the transposed right singular matrices (V(1)T, V(2)T and V(3)T).
We sorted the diagonal elements of each singular-value matrix in descending order. Elements were kept until their accumulated proportion reached 90% of the sum of all diagonal elements, while the rest were set to zero. Meanwhile, the corresponding columns of U1, U2 and U3 were also set to zero, yielding approximate matrices U1, U2 and U3 from the original matrices.
The core tensor S governs the relationships among the three entities (mashup, API and proximity). From the initial tensor A, we constructed the reduced core tensor S as follows:

S = A ×1 U1^T ×2 U2^T ×3 U3^T    (4.3)

Using the core tensor of Equation 4.3, we calculated the approximate tensor Â as the product of the core tensor S with the three matrices:

Â = S ×1 U1 ×2 U2 ×3 U3    (4.4)

The processes described above were performed with the following code:
def myHOSVD(train_tc, denoising):
    tc = dtensor(train_tc)
    # unfold the tensor along each of its three modes
    a0 = tl.unfold(tc, 0)
    a1 = tl.unfold(tc, 1)
    a2 = tl.unfold(tc, 2)
    # SVD of each mode unfolding
    U0, S0, V0 = la.svd(a0, full_matrices=False)
    U1, S1, V1 = la.svd(a1, full_matrices=False)
    U2, S2, V2 = la.svd(a2, full_matrices=False)
    # number of singular values covering the denoising threshold per mode
    best_rank = [minRank(S0, denoising), minRank(S1, denoising),
                 minRank(S2, denoising)]
    U0s = np.array(U0)[:, range(best_rank[0] + 1)]
    U1s = np.array(U1)[:, range(best_rank[1] + 1)]
    U2s = np.array(U2)[:, range(best_rank[2] + 1)]
    # core tensor (Equation 4.3) and reconstruction (Equation 4.4);
    # ttm is the tensor-times-matrix operation of the sktensor module
    s = ttm(tc, [U0s.T, U1s.T, U2s.T])
    d2_1 = ttm(s, [U0s, U1s, U2s])
    return d2_1

def minRank(S, ther):
    # smallest rank whose cumulative singular-value sum reaches the threshold
    T = sum(sorted(S, reverse=True)) * ther
    res = 0
    for index, value in enumerate(S):
        res = res + value
        if res >= T:
            return index
We calculated RMSE and MAE to assess the performance of the HOSVD prediction based on the top predictions (Karypis, 2001), according to the following two formulas:

RMSE = √( (1/N) ∑ (prediction − true)² )    (4.5)
def myRMSE(preds, top_N):
    T = sorted(preds, reverse=True)[:top_N]
    eui = [np.square(i - 1) for i in T]   # error against the true value 1
    return np.sqrt(np.mean(eui))

def myMAE(preds, top_N):
    T = sorted(preds, reverse=True)[:top_N]
    eui = [np.abs(i - 1) for i in T]
    return np.mean(eui)
We used the following code, provided by our teacher, to obtain the PMF (Bao et al., 2013; Yang et al., 2013) results.
import copy
import numpy as np
from numpy.random import RandomState

class PMF():
    '''
    a class for this Double Co-occurence Factorization model
    '''
    def __init__(self, R, lambda_alpha=1e-2, lambda_beta=1e-2,
                 latent_size=50, momuntum=0.8,
                 lr=0.001, iters=1000, seed=None):
        self.lambda_alpha = lambda_alpha
        self.lambda_beta = lambda_beta
        self.momuntum = momuntum
        self.R = R
        self.random_state = RandomState(seed)
        self.iterations = iters
        self.lr = lr
        self.I = copy.deepcopy(self.R)
        self.I[self.I != 0] = 1          # indicator of observed entries
        self.U = 0.1 * self.random_state.rand(np.size(R, 0), latent_size)
        self.V = 0.1 * self.random_state.rand(np.size(R, 1), latent_size)

    def loss(self):
        # squared error on observed entries plus L2 regularization
        loss = np.sum(self.I * (self.R - np.dot(self.U, self.V.T)) ** 2) \
            + self.lambda_alpha * np.sum(np.square(self.U)) \
            + self.lambda_beta * np.sum(np.square(self.V))
        return loss

    def predict(self, data):
        index_data = np.array([[int(ele[0]), int(ele[1])] for ele in data],
                              dtype=int)
        u_features = self.U.take(index_data.take(0, axis=1), axis=0)
        v_features = self.V.take(index_data.take(1, axis=1), axis=0)
        preds_value_array = np.sum(u_features * v_features, 1)
        return preds_value_array

    def train(self, train_data=None, vali_data=None):
        train_loss_list = []
        vali_rmse_list = []
        last_vali_rmse = 1000
        temp_U = np.zeros(self.U.shape)
        temp_V = np.zeros(self.V.shape)
        for it in range(self.iterations):
            # gradients of the regularized squared-error loss
            grads_u = np.dot(self.I * (self.R - np.dot(self.U, self.V.T)),
                             -self.V) + self.lambda_alpha * self.U
            grads_v = np.dot((self.I * (self.R - np.dot(self.U, self.V.T))).T,
                             -self.U) + self.lambda_beta * self.V
            # momentum updates
            temp_U = (self.momuntum * temp_U) + self.lr * grads_u
            temp_V = (self.momuntum * temp_V) + self.lr * grads_v
            self.U = self.U - temp_U
            self.V = self.V - temp_V
            train_loss = self.loss()
            train_loss_list.append(train_loss)
            # RMSE is an external helper comparing predictions to true values
            vali_preds = self.predict(train_data)
            vali_rmse = RMSE(train_data[:, 2], vali_preds)
            vali_rmse_list.append([it + 1, vali_rmse])
            print('training iteration:{:d}, loss:{:f}, vali_rmse:{:f}'
                  .format(it + 1, train_loss, vali_rmse))
        return self.U, self.V, train_loss_list, vali_rmse_list
R = np.zeros([train_tc.shape[1], train_tc.shape[2]])
for ele in range(train_tb.shape[0]):
    R[int(train_tb.iloc[ele, 0]), int(train_tb.iloc[ele, 1])] = \
        float(train_tb.iloc[ele, 2])

lambda_alpha = 0.01
lambda_beta = 0.01
latent_size = 15
lr = 0.0005
iters = 0
model = PMF(R=R, lambda_alpha=lambda_alpha, lambda_beta=lambda_beta,
            latent_size=latent_size, momuntum=0.9, lr=lr, iters=iters, seed=1)
preds = model.predict(data=np.array(test_tb))
4.4 Findings
Training set
To evaluate the performance of the HOSVD algorithm after removing 10% of the noise from the left singular matrices, we selected a certain proportion of the top predictions to calculate the RMSE and MAE on training datasets of different sizes sampled from the original dataset. We randomly resampled 10%, 20%, 30%, 40%, 50% and 60% of the data as training sets to fit the PMF and HOSVD models. The results are listed in Table 4.1 and shown in Figure 4.2. Table 4.1 shows that the RMSE and MAE of the PMF-based top-50 predictions changed only slightly with the size of the training set, i.e., they were largely independent of training-set size.
Table 4.1: The RMSE and MAE of HOSVD and PMF model with different size
training set.
As shown in Figure 4.2, the performance of HOSVD depended on the size of the training dataset; that is, performance improved as the training set grew. When the training set comprised less than 20% of the data, the HOSVD model (blue line) performed worse than the PMF model (red line) as indicated by the RMSE and MAE of the top-50 predictions. When the training set exceeded 20%, the HOSVD model performed better than the PMF model. These results reveal that the HOSVD model is more robust than the PMF model when datasets are highly sparse. Notably, when the training-set size was between 30% and 40%, the performance of the HOSVD model decreased.
Figure 4.2: The RMSE of HOSVD and PMF based training set with different size
In HOSVD, a new three-dimensional tensor is reconstructed after denoising the left singular matrices. We evaluated the influence of the denoising ratio on HOSVD performance based on the top-50 predictions. Figures 4.3 and 4.4 show that the RMSE and MAE decreased approximately linearly with an increasing denoising ratio. Increasing the denoising ratio causes some loss of information from the raw dataset, but the accuracy of the top predictions, which represent strong inner associations between mashups and APIs, still increased. This indicates that HOSVD is better at predicting true relationships between variables.
Based on the same training set, we trained the HOSVD and PMF models and calculated the RMSE on the 60% training dataset. The results show that the RMSE of HOSVD increased as more top predicted results (sorted by the weight value between mashups and APIs in descending order) were included (from 0.6 to 0.99), while PMF was stable at around 0.94. Notably, the RMSE of HOSVD was lower than that of PMF when fewer than the top 100 predicted results were considered, indicating that HOSVD performs better than PMF when predicting stronger relevance between mashups and APIs.
Chapter 5
Discussion
In this chapter, we discuss the results of the implementation with respect to the two research questions presented in Section 1.2. First, we discuss how we represent the Web-API recommendation problem as a 3-order data representation task by completing the unobserved entries in the rating matrix. Then we compare the impact of integrating context information into the ratings using HOSVD against the conventional 2-D Probabilistic Matrix Factorization (PMF), and we discuss the impact of dimensionality.
To recommend APIs to mashups, we constructed a recommender system based on the HOSVD algorithm with the three-dimensional tensor proximity × Mashup × APIs and compared its performance with the PMF algorithm. The algorithm had the task of predicting APIs with respect to the proximity of the mashup (user) location in the testing dataset.
Chapter 5. Discussion 103
The results confirm the effectiveness of the HOSVD-based rating data model: incorporating an additional dimension of contextual information (in this case proximity) into a three-dimensional tensor performed better than the two-dimensional PMF matrix because the former includes more information. Even though both HOSVD and PMF address the sparsity issue associated with the rating matrix, the RMSE results of both methods indicate that the HOSVD approach achieves a lower error than PMF. From the RMSE of both algorithms, we conclude that HOSVD is more capable and robust than PMF in predicting close relationships between variables, although its performance was similar to PMF when predicting weaker correlations. Thus, HOSVD is the better choice for mining potential correlations in sparse data. Notably, however, the robustness of HOSVD was significantly affected by the size of the training dataset, while it was not affected by the denoising coefficient. In particular, HOSVD performed worse than PMF when the training set was smaller than 20 percent of the whole dataset. Therefore, the HOSVD model is more applicable to predicting small samples from large and sparse datasets.
In order to gain clear insight into the impact of different tensor densities on rating prediction, we consider the accuracy of our approach under different densities and compare the prediction results. The density of the training set used in the experiment was varied between 10% and 90% in steps of 10%, and the prediction values were recorded under each density. As shown in Figures 4.2 and 4.3, the accuracy of our method improves gradually with the density of the training dataset, an indication that with an even denser training set the prediction accuracy could improve further.
5.2 Contributions
1. This work shows how to construct potential interactions between mashups and APIs with respect to their proximity using a three-dimensional tensor representation.
Chapter 6
Conclusion
6.1 Challenges
One of the challenges of this study was obtaining valid training datasets and deciding on the cut-off for effective top predictions, since the HOSVD model generates many invalid predictions and the robustness and accuracy of its predictions are affected by the size and sparsity of the training dataset. In reality, training sets sampled from the internet would be sparser and more fragmented than the datasets used in this study. Another challenge was memory management when running the HOSVD model. Before decomposing a high-order tensor, multiple matrices must first be constructed; this process is slow and needs a large amount of storage, so the HOSVD model does not apply to huge datasets. When dealing with large-scale data, especially non-linear data, the calculation can be greatly simplified by mapping the input space to a high-dimensional space through the non-linear transformation of a kernel function. Xiao (Xiao et al., 2016) found that kernel-based methods could be applied to the representation of low-rank matrices; the RKLRR, RKLRS and RKNLRS algorithms were proposed, the performance of clustering was improved several times over, and the clustering error was much smaller than that of traditional methods. The kernel-based representation of low-rank matrices has at present made progress only for low-rank matrices, and its application to high-order tensor data remains to be studied. This will inevitably become one of the research directions in fields such as community discovery, recommender systems and signal processing, and it has great research value and significance. In Tao et al. (2009), a non-negative tensor decomposition algorithm (GNTF) based on graphs and low-rank representation is proposed; compared with existing classification algorithms, it improves image classification. However, that method does not compare results across different choices of constraints, nor does it consider optimizing the algorithm's performance with kernel functions. Choosing appropriate constraints and applying kernel functions to this classifier would further improve its performance and benefit the algorithm.
In tensor decomposition, it is very important to make full and reasonable use of the structure of high-dimensional data when modelling a problem. In CP decomposition, studying the matrix rank-minimization problem and analyzing the internal structural information of the matrix is helpful for matrix completion and recovery. For large-scale data, given existing hardware conditions, exploring effective and parallelizable tensor decomposition algorithms has important practical value, for example in the nuclear-norm minimization problem and singular value decomposition, where the most time-consuming part is the solving process (for an m × n matrix, the time complexity is O(m·n²)). Proposing block-SVD signal-processing algorithms and segmentation algorithms can greatly reduce the processing time. From the examples described in the previous section, we have gained some understanding of the
References
Adeleye, O., Yu, J., Yongchareon, S., Sheng, Q. Z. & Yang, L. H. (2019). A fitness-based
evolving network for web-apis discovery. In Proceedings of the australasian
computer science week multiconference (p. 49).
Adomavicius, G. & Tuzhilin, A. (2005). Toward the next generation of recommender
systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions
on Knowledge & Data Engineering, 17(6), 734–749.
Adomavicius, G. & Zhang, J. (2012). Impact of data characteristics on recommender
systems performance. ACM Transactions on Management Information Systems
(TMIS), 3(1), 3.
Bensmail, H. & Celeux, G. (1996). Regularized gaussian discriminant analysis
through eigenvalue decomposition. Journal of the American statistical Asso-
ciation, 91(436), 1743–1748.
Berry, M. W., Dumais, S. T. & O’Brien, G. W. (1995). Using linear algebra for
intelligent information retrieval. SIAM review, 37(4), 573–595.
Billsus, D. & Pazzani, M. J. (1999). A hybrid user model for news story classification.
In Um99 user modeling (pp. 99–108). Springer.
Blei, D., Ng, A. & Jordan, M. (2003). Latent dirichlet allocation. Journal of Machine
Learning Research, 3, 993–1022.
Bobadilla, J., Hernando, A., Ortega, F. & Bernal, J. (2011). A framework for collabor-
ative filtering recommender systems. Expert Systems with Applications, 38(12),
14609–14623.
Bobadilla, J., Hernando, A., Ortega, F. & Gutiérrez, A. (2012). Collaborative filtering
based on significances. Information Sciences, 185(1), 1–17.
Bobadilla, J., Ortega, F., Hernando, A. & Gutiérrez, A. (2013). Recommender systems
survey. Knowledge-based systems, 46, 109–132.
Bokde, D., Girase, S. & Mukhopadhyay, D. (2015). Matrix factorization model in
collaborative filtering algorithms: A survey. Procedia Computer Science, 49,
136–146.
Bro, R. (1997). PARAFAC. Tutorial and applications. Chemometrics and intelligent
laboratory systems, 38(2), 149–171.
Burke, R. (2002). Hybrid recommender systems: Survey and experiments. User
modeling and user-adapted interaction, 12(4), 331–370.
Burke, R. (2007). Hybrid web recommender systems. In The adaptive web (pp.
377–408). Springer.
Cao, B., Liu, X., Li, B., Liu, J., Tang, M., Zhang, T. & Shi, M. (2016). Mashup service
clustering based on an integration of service content and network via exploiting
a two-level topic model. In 2016 ieee international conference on web services
(icws) (pp. 212–219).
Cao, B., Liu, X., Rahman, M. M., Li, B., Liu, J. & Tang, M. (2017). Integrated content
and network-based service clustering and web apis recommendation for mashup
development. IEEE Transactions on Services Computing.
Cao, J., Wu, Z., Wang, Y. & Zhuang, Y. (2013). Hybrid collaborative filtering algorithm
for bidirectional web service recommendation. Knowledge and information
systems, 36(3), 607–627.
Chen, L., Wang, Y., Yu, Q., Zheng, Z. & Wu, J. (2013). Wt-lda: user tagging augmented
lda for web service clustering. In International conference on service-oriented
computing (pp. 162–176).
Chen, L., Yu, Q., Philip, S. Y. & Wu, J. (2015). Ws-hfs: A heterogeneous feature selec-
tion framework for web services mining. In 2015 ieee international conference
on web services (pp. 193–200).
Chen, S., Wang, F. & Zhang, C. (2007). Simultaneous heterogeneous data clustering
based on higher order relationships. In Seventh ieee international conference on
data mining workshops (icdmw 2007) (pp. 387–392).
Chen, W., Paik, I. & Hung, P. C. (2015). Constructing a global social service network for
better quality of web service discovery. IEEE transactions on services computing,
8(2), 284–298.
Chen, X., Liu, X., Huang, Z. & Sun, H. (2010). Regionknn: A scalable hybrid
collaborative filtering algorithm for personalized web service recommendation.
In 2010 ieee international conference on web services (pp. 9–16).
Chen, X., Zheng, Z. & Lyu, M. R. (2014). Qos-aware web service recommendation via
collaborative filtering. In Web services foundations (pp. 563–588). Springer.
Chen, X., Zheng, Z., Yu, Q. & Lyu, M. R. (2013). Web service recommendation
via exploiting location and qos information. IEEE Transactions on Parallel and
distributed systems, 25(7), 1913–1924.
De Lathauwer, L., De Moor, B. & Vandewalle, J. (2000). A multilinear singular
value decomposition. SIAM journal on Matrix Analysis and Applications, 21(4),
1253–1278.
Do, P., Pham, P., Phan, T. & Nguyen, T. (2018). T-mpp: A novel topic-driven meta-
path-based approach for co-authorship prediction in large-scale content-based
heterogeneous bibliographic network in distributed computing framework by
spark. In International conference on intelligent computing & optimization (pp.
87–97).
Ekstrand, M. D., Riedl, J. T., Konstan, J. A. et al. (2011). Collaborative filtering re-
commender systems. Foundations and Trends® in Human–Computer Interaction,
4(2), 81–173.
Elmeleegy, H., Ivan, A., Akkiraju, R. & Goodwin, R. (2008). Mashup advisor:
A recommendation tool for mashup development. In 2008 ieee international
conference on web services (pp. 337–344).
Fan, X., Hu, Y., Zheng, Z., Wang, Y., Brezillon, P. & Chen, W. (2017). Casr-tse: context-
aware web services recommendation for modeling weighted temporal-spatial
effectiveness. IEEE Transactions on Services Computing.
Friedman, N., Geiger, D. & Goldszmidt, M. (1997). Bayesian network classifiers.
Machine learning, 29(2-3), 131–163.
Gao, W., Chen, L., Wu, J. & Gao, H. (2015). Manifold-learning based api recommend-
ation for mashup creation. In 2015 ieee international conference on web services
(pp. 432–439).
Gillis, N. (2014). The why and how of nonnegative matrix factorization. Regularization,
Optimization, Kernels, and Support Vector Machines, 12(257).
Goldberg, D., Nichols, D., Oki, B. M. & Terry, D. (1992). Using collaborative filtering
to weave an information tapestry. Communications of the ACM, 35(12), 61–71.
Golub, G. H. & Van Loan, C. F. (2012). Matrix computations (Vol. 3). JHU press.
Guan, N., Tao, D., Luo, Z. & Yuan, B. (2012). Nenmf: An optimal gradient method
for nonnegative matrix factorization. IEEE Transactions on Signal Processing,
60(6), 2882–2898.
Herlocker, J. L., Konstan, J. A., Terveen, L. G. & Riedl, J. T. (2004). Evaluating
collaborative filtering recommender systems. ACM Transactions on Information
Systems (TOIS), 22(1), 5–53.
Hu, Y., Peng, Q., Hu, X. & Yang, R. (2015). Web service recommendation based
on time series forecasting and collaborative filtering. In 2015 ieee international
conference on web services (pp. 233–240).
Huang, C., Yao, L., Wang, X., Benatallah, B., Zhang, S. & Dong, M. (2018). Expert
recommendation via tensor factorization with regularizing hierarchical topical
relationships. In International conference on service-oriented computing (pp.
373–387).
Isinkaye, F., Folajimi, Y. & Ojokoh, B. (2015). Recommendation systems: Principles,
methods and evaluation. Egyptian Informatics Journal, 16(3), 261–273.
Jamali, M. & Lakshmanan, L. (2013). Heteromf: recommendation in heterogeneous
information networks using context dependent factor models. In Proceedings of
the 22nd international conference on world wide web (pp. 643–654).
Kang, G., Tang, M., Liu, J., Liu, X. F. & Cao, B. (2016). Diversifying web service
recommendation results via exploring service usage history. IEEE Transactions
on Services Computing, 9(4), 566–579.
Karatzoglou, A., Amatriain, X., Baltrunas, L. & Oliver, N. (2010). Multiverse recom-
mendation: n-dimensional tensor factorization for context-aware collaborative
filtering. In Proceedings of the fourth acm conference on recommender systems
(pp. 79–86).
Koren, Y., Bell, R. & Volinsky, C. (2009). Matrix factorization techniques for recom-
mender systems. Computer, 42(8), 30–37.
Lam, X. N., Vu, T., Le, T. D. & Duong, A. D. (2008). Addressing cold-start problem in
recommendation systems. In Proceedings of the 2nd international conference on
ubiquitous information management and communication (pp. 208–211).
Lecue, F. (2010). Combining collaborative filtering and semantic content-based
Van Der Maaten, L., Postma, E. & Van den Herik, J. (2009). Dimensionality reduction:
a comparative review. J Mach Learn Res, 10(66-71), 13.
Vozalis, M. G. & Margaritis, K. G. (2007). Using svd and demographic data for
the enhancement of generalized collaborative filtering. Information Sciences,
177(15), 3017–3037.
Wang, G., Xu, D., Qi, Y. & Hou, D. (2008). A semantic match algorithm for web ser-
vices based on improved semantic distance. In 2008 4th international conference
on next generation web services practices (pp. 101–106).
Watkins, D. S. (2004). Fundamentals of matrix computations (Vol. 64). John Wiley &
Sons.
Wu, C., Qiu, W., Zheng, Z., Wang, X. & Yang, X. (2015). Qos prediction of web
services based on two-phase k-means clustering. In 2015 ieee international
conference on web services (pp. 161–168).
Wu, W., Zhao, J., Zhang, C., Meng, F., Zhang, Z., Zhang, Y. & Sun, Q. (2017). Improv-
ing performance of tensor-based context-aware recommenders using bias tensor
factorization with context feature auto-encoding. Knowledge-Based Systems, 128,
71–77.
Xia, B., Fan, Y., Tan, W., Huang, K., Zhang, J. & Wu, C. (2015). Category-aware api
clustering and distributed recommendation for automatic mashup creation. IEEE
Transactions on Services Computing, 8(5), 674–687.
Xia, H. & Yoshida, T. (2007). Web service recommendation with ontology-based
similarity measure. In Second international conference on innovative computing,
information and control (icicic 2007) (pp. 412–412).
Xu, Y., Yin, J., Deng, S., Xiong, N. N. & Huang, J. (2016). Context-aware qos
prediction for web service recommendation and selection. Expert Systems with
Applications, 53, 75–86.
Yang, B., Lei, Y., Liu, J. & Li, W. (2017). Social collaborative filtering by trust. IEEE
transactions on pattern analysis and machine intelligence, 39(8), 1633–1647.
Yao, L., Sheng, Q. Z., Ngu, A. H., Yu, J. & Segev, A. (2015). Unified collaborative
and content-based web service recommendation. IEEE Transactions on Services
Computing, 8(3), 453–466.
Yao, L., Sheng, Q. Z., Wang, X., Zhang, W. E. & Qin, Y. (2018). Collaborative location
recommendation by integrating multi-dimensional contextual information. ACM
Transactions on Internet Technology (TOIT), 18(3), 32.
Yao, L., Wang, X., Sheng, Q. Z., Benatallah, B. & Huang, C. (2018). Mashup
recommendation by regularizing matrix factorization with api co-invocations.
IEEE Transactions on Services Computing.
Yu, C. & Huang, L. (2016). A web service qos prediction approach based on time-
and location-aware collaborative filtering. Service Oriented Computing and
Applications, 10(2), 135–149.
Yu, X., Ren, X., Sun, Y., Gu, Q., Sturt, B., Khandelwal, U., . . . Han, J. (2014).
Personalized entity recommendation: A heterogeneous information network
approach. In Proceedings of the 7th acm international conference on web search
and data mining (pp. 283–292).
Yu, X., Ren, X., Sun, Y., Sturt, B., Khandelwal, U., Gu, Q., . . . Han, J. (2013). Recom-
mendation in heterogeneous information networks with implicit user feedback. In
Proceedings of the 7th acm conference on recommender systems (pp. 347–350).
Zhang, S., Wang, W., Ford, J., Makedon, F. & Pearlman, J. (2005). Using singular
value decomposition approximation for collaborative filtering. In Seventh ieee
international conference on e-commerce technology (cec’05) (pp. 257–264).
Zhang, Y., Chen, M., Mao, S., Hu, L. & Leung, V. C. (2014). Cap: Community activity
prediction based on big data analysis. Ieee Network, 28(4), 52–57.
Zhao, Z.-D. & Shang, M.-S. (2010). User-based collaborative-filtering recommendation
algorithms on hadoop. In 2010 third international conference on knowledge
discovery and data mining (pp. 478–481).
Zheng, J., Liu, J., Shi, C., Zhuang, F., Li, J. & Wu, B. (2017). Recommendation in het-
erogeneous information network via dual similarity regularization. International
Journal of Data Science and Analytics, 3(1), 35–48.
Zheng, Z., Ma, H., Lyu, M. R. & King, I. (2010). Qos-aware web service recommenda-
tion by collaborative filtering. IEEE Transactions on services computing, 4(2),
140–152.
Zheng, Z., Ma, H., Lyu, M. R. & King, I. (2013). Collaborative web service qos
prediction via neighborhood integrated matrix factorization. IEEE Transactions
on Services Computing, 6(3), 289–299.
Zheng, Z., Zhang, Y. & Lyu, M. R. (2012). Investigating qos of real-world web services.
IEEE transactions on services computing, 7(1), 32–39.
Zhong, J. & Li, X. (2010). Unified collaborative filtering model based on combination
of latent features. Expert Systems with Applications, 37(8), 5666–5672.
Zhong, Y., Fan, Y., Huang, K., Tan, W. & Zhang, J. (2015). Time-aware service
recommendation for mashup creation. IEEE Transactions on Services Computing,
8(3), 356–368.
Zhou, Z., Wang, B., Guo, J. & Pan, J. (2015). Qos-aware web service recommendation
using collaborative filtering with pgraph. In 2015 ieee international conference
on web services (pp. 392–399).
Zhu, X., Ye, H. & Gong, S. (2009). A personalized recommendation system combining
case-based reasoning and user-based collaborative filtering. In 2009 chinese
control and decision conference (pp. 4026–4028).
Zou, B., Li, C., Tan, L. & Chen, H. (2015). Gputensor: Efficient tensor factorization
for context-aware recommendations. Information Sciences, 299, 159–177.
Appendix A
Database schema
Appendix B
The database scripts and source code used in this research are available at the following link:
https://drive.google.com/drive/folders/1SsEAid5-Zm4X6oXLivJzudlhQPHI1wWa?usp=sharing