Article
A Survey on Knowledge Graph Embedding:
Approaches, Applications and Benchmarks
Yuanfei Dai 1 , Shiping Wang 1,2 , Neal N. Xiong 1,3 and Wenzhong Guo 1,2, *
1 College of Mathematics and Computer Sciences, Fuzhou University, Fuzhou 350108, China;
[email protected] (Y.D.); [email protected] (S.W.); [email protected] (N.N.X.)
2 Key Laboratory of Network Computing and Intelligent Information Processing, Fuzhou University,
Fuzhou 350108, China
3 Department of Mathematics and Computer Science, Northeastern State University,
Tahlequah, OK 003161, USA
* Correspondence: [email protected]
Received: 4 April 2020; Accepted: 29 April 2020; Published: 2 May 2020
Abstract: A knowledge graph (KG), also known as a knowledge base, is a particular kind of network structure in which nodes indicate entities and edges represent relations. However, with the explosion of network volume, the problem of data sparsity, which makes large-scale KGs difficult to compute and manage, has become more significant. To alleviate this issue, knowledge graph embedding has been proposed to embed the entities and relations of a KG into a low-dimensional, dense, and continuous feature space, and to endow the resulting model with abilities of knowledge inference and fusion. In recent years, many researchers have paid much attention to this approach, and we will systematically
introduce the existing state-of-the-art approaches and a variety of applications that benefit from these
methods in this paper. In addition, we discuss future prospects for the development of techniques
and application trends. Specifically, we first introduce the embedding models that only leverage
the information of observed triplets in the KG. We illustrate the overall framework and specific
idea and compare the advantages and disadvantages of such approaches. Next, we introduce
the advanced models that utilize additional semantic information to improve the performance of
the original methods. We divide the additional information into two categories, including textual
descriptions and relation paths. The extension approaches in each category are described, following
the same classification criteria as those defined for the triplet fact-based models. We then describe
two experiments for comparing the performance of listed methods and mention some broader
domain tasks such as question answering, recommender systems, and so forth. Finally, we collect
several hurdles that need to be overcome and provide a few future research directions for knowledge
graph embedding.
1. Introduction
Numerous large-scale knowledge graphs, such as SUMO [1], YAGO [2], Freebase [3], Wikidata [4],
and DBpedia [5], have been released in recent years. These KGs have become a significant resource for
many natural language processing (NLP) applications, from named entity recognition [6,7] and entity
disambiguation [8,9] to question answering [10,11] and information extraction [12,13]. In addition,
as an applied technology, a knowledge graph also supports specific applications in many industries.
For instance, it can provide visual knowledge representation for drug analysis and disease diagnosis in the field of medicine [14,15]; in the field of e-commerce, it can be used to construct a product knowledge
graph to accurately match the user’s purchase intention and product candidate set [16,17]; it also
can be employed in public security to analyze the relations between entities and obtain clues [18].
A knowledge graph stores objective real-world information as RDF-style triplets (http://www.w3.org/TR/rdf11-concepts/) (h, r, t), where h and t are the head and tail entities, respectively, and r represents a relation between h and t. For instance, Figure 1 shows two triplets in which each entity has a corresponding description. However, with the explosion of network volume,
this traditional graph structure usually makes KGs hard to manipulate. The drawback of traditional
KGs mainly includes the following two aspects: (i) Computational efficiency issues. When using
the knowledge graph to calculate the semantic relations between entities, it is often necessary to
design a special graph algorithm to achieve it. However, this graph algorithm has high computational
complexity and poor scalability. While the knowledge graph reaches a large scale, it is difficult to meet
the needs of real-time computing. (ii) Data sparsity problem. Similar to other large-scale data, the
large-scale knowledge graph is also faced with a serious problem of data sparsity, which makes the
calculation of semantic or inferential relations of entities extremely inaccurate.
Figure 1. An example of two triplets, (Elon Musk, FounderOf, SpaceX) and (SpaceX, LocatedIn, Hawthorne).
To tackle these challenges, knowledge graph embedding has been proposed and has attracted much attention, as it can map a knowledge graph into a dense and low-dimensional feature space [19–25], in which the semantic relations between entities can be computed efficiently, thereby effectively addressing the problems of computational complexity and data sparsity.
This method can further be used to explore new knowledge from existing facts (link prediction [19,23]),
disambiguate entities (entity resolution [22,24]), extract relations (relation classification [26,27]), etc.
The embedding procedure is described as follows. Given a KG, the entities and relations are first
randomly represented in a low-dimensional vector space, and an evaluation function is defined to measure the
plausibility of each fact triplet. At each iteration, the embedding vectors of entities and relations can
then be updated by maximizing the global plausibility of facts with some optimization algorithm. Even
though there are a large number of successful researches in modeling relational facts, most of them
can only train an embedding model on the observed triplets. Consequently, an increasing number of studies focus on learning more generalizable KG embedding models by absorbing additional
information, such as entity types [28,29], relation paths [30–32], and textual descriptions [33–35].
Generally, knowledge graph embedding can utilize a distributed representation technology
to alleviate the issue of data sparsity and computational inefficiency. This approach has
three crucial advantages.
• The data sparsity problem has been effectively mitigated, because all elements in KGs including
entities and relations are embedded into a continuous low-dimensional feature space.
• Compared with traditional one-hot representation, KG embedding employs a distributed
representation method to transform the original KG. As a result, it is effective to improve the
efficiency of semantic computing.
• Representation learning uses a unified feature space to connect heterogeneous objects to each other,
thereby achieving fusion and calculation between different types of information.
In this paper, we provide a detailed analysis of the current KG embedding technologies and
applications. We systematically describe how the existing techniques address data sparsity and
computation inefficiency problems, including the thoughts and technical solutions offered by the
respective researchers. Furthermore, we introduce a wide variety of applications that benefit from KG
embedding. Although a few surveys about KG representation learning have been published [36,37],
we focus on a different aspect compared with these articles. Cai et al. [36] performed a survey of
graph embedding, including homogeneous graphs [38–40], heterogeneous graphs [41–43], graphs with
auxiliary information [28,44,45], and graphs constructed from non-relational data [46–48]. Compared
with their work, we focus more specifically on KG embedding, which falls under heterogeneous
graphs. In contrast to the survey completed by Wang et al. [37], we describe various applications to
which KG embedding applies and compare the performance of the methods in these applications.
The rest of this article is organized as follows. In Section 2, we introduce the basic symbols
and formal problem definition of knowledge graph embedding and discuss embedding techniques.
We illustrate the general framework and training process of the model. In Section 3, we will explore
the applications supported by KG embedding, and then compare the performance of the above
representation learning model in the same application. Finally, we present our conclusions in Section 4
and look forward to future research directions.
Notations Explanations
h, r, t Head entity h, tail entity t, and relation r
h, r, t The embedding vectors corresponding to h, r, t
xi The i-th element in vector x
A A numerical matrix
Aij The i-th row and j-th column element in matrix A
d The dimensionality of entity in embedding space
k The dimensionality of relation in embedding space
Knowledge graph embedding aims to map a KG into a dense, low-dimensional feature space, which is
capable of preserving as much structure and property information of the graph as possible and aiding
in calculations of the entities and relations. In recent years, it has become a research hotspot, and many
researchers have put forward a variety of models. The differences between the various embedding
algorithms are related to three aspects: (i) how they represent entities and relations, or in other words,
how they define the representation form of entities and relations, (ii) how they define the scoring
function, and (iii) how they optimize the ranking criterion that maximizes the global plausibility of
the existing triplets. The different models have different insights and approaches with respect to
these aspects.
We have broadly classified these existing methods into two categories: triplet fact-based representation learning models and description-based representation learning models. In this section, we first
clarify the thought processes behind the algorithms in these two types of graph embedding models,
as well as the procedures by which they solve the representation problem. After that, the training
procedures for these models are discussed in detail. It is worth noting that due to study limitations,
we cannot enumerate all relevant knowledge graph embedding methods. Therefore, we only describe
some representative, highly cited and code-implemented algorithms.
where w is the vector of word w learned by the word2vec model. This result means that the word
representation model can capture some of the same implicit semantic relationship between the words
“King” (“Man”) and “Queen” (“Woman”). Mikolov et al. proved experimentally that the property of
translation invariance exists widely in the semantic and syntactic relations of vocabulary.
Inspired by word2vec, Bordes et al. [19] introduced the idea of translation invariance into the knowledge graph embedding field and proposed the TransE embedding model. TransE represents all entities and relations in a uniform continuous, low-dimensional feature space R^d, and the relations can be regarded as translations connecting the entities. Let E and R indicate the sets of entities and relations, respectively. For each triplet (h, r, t), the head and tail entities h, t and the relation r are embedded to
the embedding vectors h, t, and r. As illustrated in Figure 2a, for each triplet (h, r, t), TransE follows
a geometric principle: h + r ≈ t.
Figure 2. Illustrations of translation-based models. (a) TransE, (b) TransH, (c) TransR, (d) TransD,
(e) TransA, (f) KG2E, (g) TransG.
The authenticity of a given triplet (h, r, t) is computed via a score function. This score function is defined as the distance between h + r and t under an ℓ1-norm or ℓ2-norm constraint. In mathematical form, it is shown as follows:

f_r(h, t) = ||h + r − t||_{ℓ1/ℓ2}   (3)
In order to learn an effective embedding model which has the ability to discriminate the authenticity of triplets, TransE minimizes a margin-based hinge ranking loss function over the training process:

L = Σ_{(h,r,t)∈S} Σ_{(h′,r,t′)∈S′} max(0, γ + f_r(h, t) − f_r(h′, t′))   (4)
where S and S′ denote the sets of correct triplets and corrupted triplets, respectively, and γ indicates the margin hyperparameter. During training, TransE stochastically replaces the head or tail entity of each triplet with other candidate entities to generate the corrupted triplet set S′. The construction
formula is shown in Equation (5).
S′ = {(h′, r, t) | h′ ∈ E, (h′, r, t) ∉ S} ∪ {(h, r, t′) | t′ ∈ E, (h, r, t′) ∉ S}   (5)
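To make the procedure concrete, the following minimal NumPy sketch computes the TransE score of Equation (3) and samples a corrupted triplet according to Equation (5). The entity counts, dimensions, and toy triplets are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations, dim = 1000, 50, 100   # illustrative sizes

# Randomly initialized embedding tables (one row per entity / relation).
E = rng.normal(size=(n_entities, dim))
R = rng.normal(size=(n_relations, dim))

def transe_score(h, r, t, norm=1):
    """Distance ||h + r - t|| under the L1 or L2 norm; lower means more plausible."""
    return np.linalg.norm(E[h] + R[r] - E[t], ord=norm)

def corrupt(h, r, t, train_set):
    """Replace the head or tail with a random entity, avoiding observed triplets (Eq. 5)."""
    while True:
        if rng.random() < 0.5:
            h2, t2 = rng.integers(n_entities), t
        else:
            h2, t2 = h, rng.integers(n_entities)
        if (h2, r, t2) not in train_set:
            return h2, r, t2

train_set = {(0, 3, 7), (2, 3, 9)}              # toy observed triplets
print(transe_score(0, 3, 7), corrupt(0, 3, 7, train_set))
```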
Although TransE has achieved a great advancement in large-scale knowledge graph embedding,
it still has difficulty in dealing with complex relations, such as 1 − N, N − 1, and N − N [51,52].
For instance, consider a 1−N relation in which the head entity has multiple corresponding tail entities, i.e., ∀ i ∈ {1, 2, ..., n}, (h, r, t_i) ∈ S. According to the principle h + r ≈ t_i that TransE follows, all embedding vectors of these tail entities should be approximately similar, i.e., t_1 ≈ t_2 ≈ ... ≈ t_n. More visually, consider the two triplet facts (Elon_Musk, Founder_of, SpaceX) and (Elon_Musk, Founder_of, Tesla), in which Founder_of is a 1−N relation as mentioned above. Following TransE, the embedding vectors of SpaceX and Tesla should be very similar in the feature space. However, this result is clearly irrational because SpaceX and Tesla are two companies in entirely different fields, apart from the fact that Elon_Musk is their founder. In addition, other complex relations such as N−1 and N−N raise the same problem.
To handle this issue with complex relations, TransH [51] extends the original TransE model; it enables each entity to have different embedding representations when the entity is involved in different relations. In other words, TransH allows each relation to hold its own relation-specific hyperplane. Therefore, an entity has different embedding vectors on different relation hyperplanes.
As shown in Figure 2b, for a relation r, TransH employs the relation-specific translation vector dr and
the normal vector of hyperplane wr to represent it. For each triplet fact (h, r, t), the embedding vectors
of h and t are firstly projected onto the relation-specific hyperplane along the direction of the normal vector w_r, and h⊥ and t⊥ indicate the projections:

h⊥ = h − w_r^T h w_r,   t⊥ = t − w_r^T t w_r   (6)
Afterwards, h⊥ and t⊥ are connected by the relation-specific translation vector d_r. Similar to TransE, a small score is expected when (h, r, t) holds. The score function is formulated as follows:

f_r(h, t) = ||h⊥ + d_r − t⊥||_2^2   (7)
Here, ||·||_2^2 is the squared Euclidean distance. By utilizing this relation-specific hyperplane, TransH can project an entity to different feature vectors depending on different relations, and thus alleviates the issue of complex relations.
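As a brief sketch, the TransH projection and score of Equations (6) and (7) can be written as follows, assuming the normal vector w_r is normalized to unit length; the dimension is an illustrative assumption.

```python
import numpy as np

def transh_score(h, t, w_r, d_r):
    """Project h and t onto the relation hyperplane with normal w_r, then translate by d_r."""
    w_r = w_r / np.linalg.norm(w_r)          # keep the normal vector unit length
    h_p = h - np.dot(w_r, h) * w_r           # h_perp = h - (w_r^T h) w_r
    t_p = t - np.dot(w_r, t) * w_r
    return np.sum((h_p + d_r - t_p) ** 2)    # squared Euclidean distance

rng = np.random.default_rng(1)
h, t, w_r, d_r = (rng.normal(size=50) for _ in range(4))
print(transh_score(h, t, w_r, d_r))
```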
Following this idea, TransR [52] extended the original TransH algorithm. Although TransH
enables each entity to obtain a different representation corresponding to its different relations,
the entities and relations in this model are still represented in the same feature space Rd . In fact,
an entity may contain various semantic meanings and different relations may concentrate on entities’
diverse aspects; therefore, entities and relations in the same semantic space might make the model
insufficient for graph embedding.
TransR expands the concept of relation-specific hyperplanes proposed by TransH to
relation-specific spaces. In TransR, for each triplet (h, r, t), entities are embedded as h and t into
an entity vector space Rd , and relations are represented as a translation vector r into a relation-specific
space Rk . As illustrated in Figure 2c, TransR projects h and t from the entity space into the relation space.
This operation can render those entities (denoted as triangles with color) that are similar to head or tail
entities (denoted as circles with color) in the entity space as distinctly divided in the relation space.
More specifically, for each relation r, TransR defines a projection matrix Mr ∈ Rk×d to transform
the entity vectors into the relation-specific space. The projected entity vectors are signified by h⊥ and
t⊥ , and the scoring function is similar to that of TransH:
h⊥ = M_r h,   t⊥ = M_r t   (8)
f_r(h, t) = ||h⊥ + r − t⊥||_2^2   (9)
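Equations (8) and (9) map directly to a few lines of code; the dimensions d and k below are illustrative assumptions.

```python
import numpy as np

def transr_score(h, t, r, M_r):
    """Project entity vectors (dim d) into the relation space (dim k) and apply TransE there."""
    h_p = M_r @ h                          # Eq. (8): h_perp = M_r h
    t_p = M_r @ t
    return np.sum((h_p + r - t_p) ** 2)    # Eq. (9)

rng = np.random.default_rng(2)
d, k = 100, 50
h, t = rng.normal(size=d), rng.normal(size=d)
r, M_r = rng.normal(size=k), rng.normal(size=(k, d))
print(transr_score(h, t, r, M_r))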
Compared with TransE and TransH, TransR has made significant progress in performance.
However, it also has some deficiencies: (i) For a relation r, the head and tail entities share the same
projection matrix Mr , whereas it is intuitive that the types or attributes between head and tail entities
may be essentially different. For instance, in the triplet ( Elon_Musk, Founder_o f , SpaceX ), Elon_Musk
is a person and SpaceX is a company; they are two different types of entities. (ii) The projection
from the entity space to the relation-specific space is an interactive process between entities and
relations; it cannot capture integrated information when the projection matrix is generated only related
to relations. (iii) Owing to the application of the projection matrix, TransR requires a large amount of computing resources; its memory complexity is O(N_e d + N_r dk), compared to O(N_e d + N_r k) for TransE and TransH.
To eliminate the above drawbacks, an improved method, TransD [22], was proposed. It optimizes TransR by using two vectors for each entity and relation to construct dynamic mapping matrices that substitute for the projection matrix in TransR; an illustration is given in Figure 2d. Specifically,
given a triplet (h, r, t), each entity and relation is represented to two embedding vectors. The first vector
represents meanings of the entity/relation, denoted as h, t ∈ Rd and r ∈ Rk , and the second vector
(defined as h p , t p ∈ Rd and r p ∈ Rk ) is used to form two dynamic projection matrices Mrh , Mrt ∈ Rk×d .
These two matrices are calculated as:

M_rh = r_p h_p^T + I^{k×d},   M_rt = r_p t_p^T + I^{k×d}   (10)

where I^{k×d} is an identity matrix. Therefore, the projection matrices involve both the entity and the relation, and the projected vectors of h and t are defined as:

h⊥ = M_rh h,   t⊥ = M_rt t   (11)
Finally, the score function is the same as that of TransR in Equation (9). By constructing the dynamic mapping matrices from two projection vectors, TransD effectively reduces the memory complexity to O(N_e d + N_r k).
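The dynamic mapping of Equations (10) and (11) can be applied without ever materializing the k × d matrix, which is exactly where the complexity reduction comes from. A sketch under assumed dimensions:

```python
import numpy as np

def transd_project(e, e_p, r_p):
    """M_re e = r_p (e_p^T e) + I^{k x d} e, where I^{k x d} e keeps the first min(k, d) dims."""
    k, d = r_p.shape[0], e.shape[0]
    resized = e[:k] if d >= k else np.concatenate([e, np.zeros(k - d)])
    return r_p * np.dot(e_p, e) + resized

def transd_score(h, h_p, t, t_p, r, r_p):
    h_proj = transd_project(h, h_p, r_p)
    t_proj = transd_project(t, t_p, r_p)
    return np.sum((h_proj + r - t_proj) ** 2)

rng = np.random.default_rng(3)
d, k = 100, 50
h, h_p, t, t_p = (rng.normal(size=d) for _ in range(4))
r, r_p = rng.normal(size=k), rng.normal(size=k)
print(transd_score(h, h_p, t, t_p, r, r_p))
```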
All the methods described above, including Trans(E, H, R, and D), ignore two properties of existing KGs: heterogeneity (some relations connect many entity pairs while others connect very few), which causes underfitting on complex relations or overfitting on simple relations, and imbalance (for a given relation, the numbers of distinct head and tail entities can differ greatly), which indicates that the model should treat head and tail entities differently. TranSparse [24]
overcomes the heterogeneity and imbalance by applying two model versions: TranSparse(share)
and TranSparse(separate).
TranSparse (share) leverages adaptive sparse matrices M_r(θ_r) to replace the dense projection matrices for each relation r. The sparse degree θ_r is linked to the number of entity pairs connected by relation r; it is defined as follows:

θ_r = 1 − (1 − θ_min) N_r / N_r∗

where θ_min (0 ≤ θ_min ≤ 1) is a hyperparameter, N_r denotes the number of entity pairs connected by the relation, and N_r∗ represents the maximum of them. Therefore, the projected vectors are formed by:

h⊥ = M_r(θ_r) h,   t⊥ = M_r(θ_r) t
TranSparse (separate) employs two separate sparse mapping matrices, M_r^h(θ_r^h) and M_r^t(θ_r^t), for each relation, where M_r^h(θ_r^h) projects the head entities and the other projects the tail entities. The sparse degrees and the projected vectors are extended as follows:

θ_r^l = 1 − (1 − θ_min) N_r^l / N_{r∗}^{l∗}  (l = h, t),   h⊥ = M_r^h(θ_r^h) h,   t⊥ = M_r^t(θ_r^t) t
A simpler version of this method was proposed by Nguyen et al. [53], called STransE. In that approach, the sparse projection matrices M_r^h(θ_r^h) and M_r^t(θ_r^t) are replaced by mapping matrices M_r^h and M_r^t, such that the projected vectors are transformed to:

h⊥ = M_r^h h,   t⊥ = M_r^t t
The methods introduced so far merely modify the definition of the projection vectors or matrices,
but they do not consider other aspects to optimize TransE. TransA [54] also boosted the performance
of the embedding model from another view point by modifying the distance measure of the score
function. It introduces adaptive Mahalanobis distance as a better indicator to replace the traditional
Euclidean distance because Mahalanobis distance shows better adaptability and flexibility [55]. Given
a triplet (h, r, t), M_r is defined as a symmetric non-negative weight matrix associated with the relation r; the score function of TransA is formulated as:

f_r(h, t) = (|h + r − t|)^T M_r (|h + r − t|)
As shown in Figure 2e, the two arrows represent the same relation HasPart,
and ( Room, HasPart, Wall ) and (Sleeping, HasPart, Dreaming) are true facts. If we use the isotropic
Euclidean distance, which is utilized in traditional models, to distinguish the authenticity of a triplet, it could yield erroneous triplets such as (Room, HasPart, Goniff). Fortunately, TransA has the capability
of discovering the true triplet via introducing the adaptive Mahalanobis distance because the true one
has shorter distances in the x or y directions.
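A sketch of the adaptive Mahalanobis score of TransA, assuming a symmetric non-negative weight matrix M_r is supplied for the relation:

```python
import numpy as np

def transa_score(h, r, t, M_r):
    """Adaptive Mahalanobis distance: |h + r - t|^T M_r |h + r - t|."""
    e = np.abs(h + r - t)
    return e @ M_r @ e

rng = np.random.default_rng(4)
d = 50
h, r, t = (rng.normal(size=d) for _ in range(3))
W = rng.random((d, d))
M_r = (W + W.T) / 2            # symmetric, non-negative weight matrix
print(transa_score(h, r, t, M_r))
```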
The above methods embed entities and relations into a deterministic real-valued space. KG2E [56] proposed a novel approach that introduces uncertainty to construct a probabilistic knowledge graph embedding model. KG2E takes advantage of multivariate Gaussian distributions to embed entities and relations; each entity and relation is represented as a Gaussian distribution, in which the mean of
this Gaussian distribution is the position of the entity or relation in a semantic feature space, and the
covariance signifies the uncertainty of the entity or relation, i.e.,
h ∼ N ( µ h , Σ h ), r ∼ N ( µr , Σr ), t ∼ N ( µ t , Σ t ) (18)
The transformation between the head and tail entities, h − t, can then be expressed by the distribution N(µ_h − µ_t, Σ_h + Σ_t) and compared with the distribution of the relation r. This model employs two categories of approaches to estimate the similarity between two probability distributions: Kullback–Leibler divergence [57] and expected likelihood [58].
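As an illustration, the KL-divergence variant of this idea can be sketched for the simplified case of diagonal covariances; the diagonal restriction and the toy parameters are assumptions made here for brevity.

```python
import numpy as np

def kl_gaussian_diag(mu1, var1, mu2, var2):
    """KL( N(mu1, diag(var1)) || N(mu2, diag(var2)) ) for diagonal covariances."""
    d = mu1.shape[0]
    return 0.5 * (np.sum(var1 / var2)
                  + np.sum((mu2 - mu1) ** 2 / var2)
                  - d
                  + np.sum(np.log(var2) - np.log(var1)))

def kg2e_kl_score(mu_h, var_h, mu_r, var_r, mu_t, var_t):
    """Compare the entity transformation (h - t) with the relation distribution r."""
    mu_e, var_e = mu_h - mu_t, var_h + var_t
    return kl_gaussian_diag(mu_e, var_e, mu_r, var_r)

rng = np.random.default_rng(5)
d = 20
mus = [rng.normal(size=d) for _ in range(3)]
vrs = [rng.random(d) + 0.1 for _ in range(3)]
print(kg2e_kl_score(mus[0], vrs[0], mus[1], vrs[1], mus[2], vrs[2]))
```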
Figure 2f displays an illustration of KG2E. Each entity is represented as a circle without underline
and the relations are the circles with underline. These circles with the same color indicate an observed
triplet, where the head entity of all triplets is Hillary Clinton. The area of a circle denotes the
uncertainty of the entity or relation. As we can see in Figure 2f, there are three triplets, and the uncertainty of the relation "spouse" is lower than that of the others.
TransG [59] addresses a further situation of multiple relation semantics, that is, when a relation is associated with different entity pairs, it may carry multiple semantic meanings. It also
uses a Gaussian distribution to embed entities, but it is significantly different from KG2E because it
leverages a Gaussian mixture model to represent relations:
where πr,m is the weight of distinct semantics and I indicates an identity matrix. As shown in Figure 2g,
dots are correct entities related to the relation “Has part” and the triangles represent corrupted entities.
In the traditional models (left), the corrupted entities do not have the ability to distinguish from the
entire set of entities because all semantics are confused in the relation “Has part.” In contrast, TransG
(right) can find the incorrect entities by utilizing multiple semantic components.
TransE only has the ability to handle simple relations, and it is incompetent for complex
relations. The extensions of TransE, including TransH, TransR, TransD, TransA, TransG, and so
forth, proposed many thoughtful and insightful models to address the issue of complex relations.
Extensive experiments in public benchmark datasets, which are generated from WordNet [60] and
Freebase, show that these modified models achieve significant improvements with respect to the
baseline, and verify the feasibility and validity of these methods. A comparison of these models in
terms of their scoring functions and space size is shown in Table 2.
Table 2. Comparison of translation-based models in terms of scoring functions and memory complexity.
In KG2E, µ = µh − µr − µt and Σ = Σh + Σr + Σt ; m indicates the number of semantic components for
each relation in TransG.
Figure 3. A knowledge graph represented as a three-way tensor, with entity modes e_1, e_2, ..., e_n and relation slices r_1, r_2, ..., r_m.
RESCAL [61] applies a tensor to express the inherent structure of a KG and uses rank-d factorization to obtain its latent semantics. The principle that this method follows is formulated as:

X_k ≈ A R_k A^T   (21)
where A ∈ Rn×d is a matrix that captures the latent semantic representation of entities and Rk ∈ Rd×d
is a matrix that models the pairwise interactions in the k-th relation. According to this principle,
the scoring function f_r(h, t) for a triplet (h, r, t) is defined as:

f_r(h, t) = h^T M_r t   (22)
Here, h, t ∈ Rd are the embedding vectors of entities in the graph and the matrix Mr ∈ Rd×d
represents the latent semantic meanings in the relation. It is worth noting that hi and t j , which
represent the embedding vectors of the i-th and j-th entity, are actually assigned by the values of the
i-th and j-th row of matrix A in Equation (21). A more complex version of f r (h, t) was proposed by
García-Durán et al. [62], which extends RESCAL by introducing well-controlled two-way interactions
into the scoring function.
However, this method requires O(N_e d + N_r k^2) (d = k) parameters. To simplify the computational complexity of RESCAL, DistMult [63] restricts M_r to be a diagonal matrix, i.e., M_r = diag(r), r ∈ R^d. The scoring function is transformed to:

f_r(h, t) = h^T diag(r) t   (23)
DistMult not only reduces the memory complexity to O(N_e d + N_r k) (d = k), but the experimental results also indicate that this simple formulation achieves a remarkable improvement over the other methods.
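The contrast between the two scoring functions is easiest to see in code; a minimal sketch with illustrative dimensions:

```python
import numpy as np

def rescal_score(h, M_r, t):
    """Bilinear score h^T M_r t with a full d x d relation matrix (Eq. 22)."""
    return h @ M_r @ t

def distmult_score(h, r, t):
    """DistMult restricts M_r to diag(r), reducing the score to a three-way inner product (Eq. 23)."""
    return np.sum(h * r * t)

rng = np.random.default_rng(6)
d = 50
h, r, t = (rng.normal(size=d) for _ in range(3))
M_r = rng.normal(size=(d, d))
print(rescal_score(h, M_r, t), distmult_score(h, r, t))
```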
HolE [64] employs a circular correlation operation ⋆ to compose the head and tail entity vectors:

[h ⋆ t]_k = Σ_{i=0}^{d−1} h_i t_{(k+i) mod d}   (24)
The circular correlation operation has the significant advantage that it can reduce the complexity
of the composite representation compared to the tensor product. Moreover, its computational process
can be accelerated via:
h ⋆ t = F^{−1}( conj(F(h)) ⊙ F(t) )   (25)

Here, F(·) indicates the fast Fourier transform (FFT) [66], F^{−1}(·) denotes its inverse, ⊙ is the element-wise product, and conj(a) denotes the complex conjugate of a. Thus, the scoring function corresponding to HolE is defined as:

f_r(h, t) = r^T (h ⋆ t)   (26)

ComplEx [67] instead embeds entities and relations as complex-valued vectors h, r, t ∈ C^d, and its scoring function is:

f_r(h, t) = Re( h^T diag(r) conj(t) )   (27)

where Re(·) denotes the real part of a complex value and conj(t) represents the complex conjugate of t.
By using this scoring function, triplets that have asymmetric relations can obtain different scores
depending on the sequences of entities.
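The FFT identity in Equation (25) can be checked directly against the definition in Equation (24); a short NumPy sketch (the dimension is an illustrative assumption):

```python
import numpy as np

def circular_correlation(h, t):
    """[h * t]_k = sum_i h_i t_{(k+i) mod d}, computed via the FFT identity of Eq. (25)."""
    return np.real(np.fft.ifft(np.conj(np.fft.fft(h)) * np.fft.fft(t)))

def hole_score(h, r, t):
    """HolE scores a triplet by matching the relation vector against the correlated entities."""
    return np.dot(r, circular_correlation(h, t))

rng = np.random.default_rng(7)
d = 64
h, r, t = (rng.normal(size=d) for _ in range(3))

# Sanity check against the naive definition for a single index k.
k = 5
naive = sum(h[i] * t[(k + i) % d] for i in range(d))
assert np.allclose(naive, circular_correlation(h, t)[k])
print(hole_score(h, r, t))
```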
In order to settle the independence issue of entity embedding vectors in Canonical Polyadic (CP)
decomposition, SimplE [68] proposes a simple enhancement of CP which introduces the reverse of
relations and computes the average CP score of (h, r, t) and (t, r^{−1}, h):

f_r(h, t) = (1/2)( (h ◦ r) · t + (t ◦ r′) · h )   (28)

where ◦ is the element-wise multiplication and r′ indicates the embedding vector of the reverse relation.
Inspired by Euler's identity e^{iθ} = cos θ + i sin θ, RotatE [69] introduces a rotational Hadamard product; it regards the relation as a rotation between the head and tail entity in complex space. The score function is defined as follows:

f_r(h, t) = ||h ◦ r − t||   (29)

where h, r, t ∈ C^d and each element of r is constrained to have a modulus of 1. QuatE [70] further generalizes this idea to quaternion space, rotating the head entity by the normalized relation quaternion:

f_r(h, t) = h ⊗ (r / |r|) · t   (30)
where ⊗ denotes the Hamilton product. Table 3 summarizes the scoring function and memory
complexity for all tensor factorization-based models.
Figure 4. Illustrations of neural network-based models. (a) SME, (b) NTN, (c) MLP, (d) NAM, (e) RMNN, (f) ConvKB.
where M ∈ R^{d×d} is the weight matrix and b denotes the bias vector. Finally, g_left(h, r) and g_right(t, r) are concatenated to obtain the energy score f_r(h, t) via a fully connected layer.
NTN [20] proposes a neural tensor network to calculate the energy score f r (h, t). It replaces the
standard linear layer, which is in the traditional neural network, by employing a bilinear tensor layer.
As shown in Figure 4b, given an observed triplet (h, r, t), the first layer firstly embeds entities to their
embedding feature space. There are three inputs in the second nonlinear hidden layer including the
head entity vector h, the tail entity vector t and a relation-specific tensor Tr ∈ Rd×d×k . The entity
vectors h, t are embedded to a high-level representation via projection matrices Mr1 , Mr2 ∈ Rk×d
respectively. These three elements are then fed to the nonlinear layer to combine semantic information.
Finally, the energy score is obtained by providing a relation-specific linear layer as the output layer.
The score function is defined as follows:

f_r(h, t) = u_r^T f( h^T T_r t + M_r1 h + M_r2 t + b_r )

where f(x) = tanh(x) indicates an activation function, u_r is the relation-specific output layer, and b_r ∈ R^k denotes a bias, which belongs to a standard neural network layer. Meanwhile, a simpler version of this model is proposed in the same paper,
called a single layer model (SLM), as shown in Figure 4c. This is a special case of NTN, in which
T_r = 0. The scoring function is simplified to the following form:

f_r(h, t) = u_r^T f( M_r1 h + M_r2 t )
NTN requires a relation-specific tensor Tr for each relation, such that the number of parameters
in this model is huge and it would be impossible to apply to large-scale KGs. MLP [75] provides
a lightweight architecture in which all relations share the same parameters. The entities and relation in
a triplet fact (h, r, t) are synchronously projected into the embedding space in the input layer, and they
are involved in higher representation to score the plausibility by applying a nonlinear hidden layer.
The scoring function f_r(h, t) is formulated as follows:

f_r(h, t) = w^T tanh( M^1 h + M^2 r + M^3 t )

where M^1, M^2, and M^3 are weight matrices and w is a global weight vector. The neural association model (NAM), illustrated in Figure 4d, adopts a deeper architecture: the head entity vector h and the relation vector r are concatenated as the input z^0 = [h; r] and passed through L hidden layers:

a^ℓ = M^ℓ z^{ℓ−1} + b^ℓ,   z^ℓ = ReLU(a^ℓ)   (ℓ = 1, 2, ..., L)   (37)
where M^ℓ is the weight matrix and b^ℓ is the bias for layer ℓ. Finally, a score is calculated by combining the output of the last hidden layer with the tail entity vector t:

f_r(h, t) = σ( z^L · t )   (38)
where σ(·) is a sigmoid activation function. A more complicated model, called relation-modulated neural networks (RMNN), is proposed in the same paper. Figure 4e shows an illustration of this model. Compared with NAM, it adds a knowledge-specific connection (i.e., the relation embedding r) to all hidden layers in the neural network. The layers are defined as follows:

a^ℓ = M^ℓ z^{ℓ−1} + B^ℓ r,   z^ℓ = ReLU(a^ℓ)   (ℓ = 1, 2, ..., L)   (39)
where M^ℓ and B^ℓ denote the weight and bias matrices for layer ℓ, respectively. After the feed-forward process, RMNN yields a final score using the last hidden layer's output and the concatenation of the tail entity vector t and the relation vector r:

f_r(h, t) = σ( z^L · [t; r] )   (40)
ConvKB [80] captures latent semantic information in the triplets by introducing a convolutional
neural network (CNN) for knowledge graph embedding. In the model, each triplet fact (h, r, t) is
represented to a three-row matrix, in which each element is transformed as a row vector. The matrix is
fed to a convolution layer to yield multiple feature maps. These feature maps are then concatenated
and projected to a score that is used to estimate the authenticity of the triplet via the dot product
operation with the weight vector.
More specifically, as illustrated in Figure 4f, γ = 3. First, the embedding vectors h, r, and t ∈ R^d are viewed as a matrix A = [h; r; t] ∈ R^{3×d}, and A_{i:} ∈ R^{3×1} indicates the i-th column of matrix A. After that, each filter m ∈ R^{3×1} is slid over the input matrix to explore the local features and obtain a feature map a = [a_1, a_2, ..., a_d] ∈ R^d, such that:

a_i = g( m · A_{i:} + b )   (41)
where g(·) signifies the ReLU activation function and b indicates the bias. For this instance, there are
three feature maps corresponding to three filters. Finally, these feature maps are concatenated as
a representation vector ∈ R3d , and calculated with a weight vector w ∈ R3d by a dot product operation.
The scoring function is defined as follows:

f_r(h, t) = C( g(A ∗ Ω) ) · w   (42)
Here, Ω is the set of filters, and A ∗ Ω denotes that a convolution operation is applied to matrix A
via the filters in the set Ω. C is the concatenation operator. It is worth mentioning that Ω and w are
shared parameters; they are generalized for all entities and relations.
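A compact sketch of the ConvKB score of Equation (42), with the number of filters and the dimension chosen arbitrarily for illustration:

```python
import numpy as np

def convkb_score(h, r, t, filters, w, b=0.0):
    """Slide each 3 x 1 filter over the columns of A = [h; r; t], concatenate the
    ReLU feature maps, and take a dot product with the weight vector w."""
    A = np.stack([h, r, t])                      # shape (3, d)
    maps = []
    for m in filters:                            # each filter has shape (3,)
        a = np.maximum(m @ A + b, 0.0)           # feature map of length d (Eq. 41 with ReLU)
        maps.append(a)
    features = np.concatenate(maps)              # length len(filters) * d
    return features @ w

rng = np.random.default_rng(9)
d, n_filters = 20, 3
h, r, t = (rng.normal(size=d) for _ in range(3))
filters = rng.normal(size=(n_filters, 3))
w = rng.normal(size=n_filters * d)
print(convkb_score(h, r, t, filters, w))
```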
In recent years, graph neural networks (GNNs) have captured more attention due to their great
ability in representation of graph structure. R-GCN [81] is an improved model, which provides
relation-specific transformation to represent knowledge graphs. The forward propagation is
formulated as follows:
x_i^{(l+1)} = σ( Σ_{r∈R} Σ_{j∈N_i^r} (1 / c_{i,r}) W_r^{(l)} x_j^{(l)} + W_o^{(l)} x_i^{(l)} )   (43)

where x_i^{(l)} ∈ R^{d^{(l)}} signifies the hidden state of entity i in the l-th layer, N_i^r indicates the set of neighbors connected to entity i by relation r ∈ R, W_r^{(l)} and W_o^{(l)} are the weight matrices, and c_{i,r} denotes a normalization constant such as c_{i,r} = |N_i^r|.
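One propagation step of Equation (43) can be sketched with dense per-relation adjacency matrices; this is a simplification assumed here for clarity, whereas practical implementations use sparse operations and basis decomposition of the relation weights.

```python
import numpy as np

def rgcn_layer(X, adj_per_relation, W_r, W_o):
    """One R-GCN propagation step (Eq. 43): relation-specific neighbor aggregation
    plus a self-connection, followed by a ReLU nonlinearity."""
    out = X @ W_o.T                                      # self-loop term W_o x_i
    for A, W in zip(adj_per_relation, W_r):              # one adjacency matrix per relation
        deg = A.sum(axis=1, keepdims=True)               # c_{i,r} = |N_i^r|
        norm = np.divide(1.0, deg, out=np.zeros_like(deg), where=deg > 0)
        out += norm * (A @ (X @ W.T))                    # (1/c_{i,r}) * sum_j W_r x_j
    return np.maximum(out, 0.0)                          # sigma = ReLU

rng = np.random.default_rng(10)
n, d_in, d_out, n_rel = 6, 8, 4, 2
X = rng.normal(size=(n, d_in))
adj = [rng.integers(0, 2, size=(n, n)).astype(float) for _ in range(n_rel)]
W_r = [rng.normal(size=(d_out, d_in)) for _ in range(n_rel)]
W_o = rng.normal(size=(d_out, d_in))
print(rgcn_layer(X, adj, W_r, W_o).shape)   # (6, 4)
```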
Inspired by burgeoning generative adversarial networks (GANs), Cai et al. [82] proposed
a generative adversarial learning framework to improve the performance of existing knowledge graph representation models and named it KBGAN. KBGAN's innovative idea is to apply a KG
embedding model as the generator to obtain plausible negative samples, and leverage the positive
samples and generated negative samples to train the discriminator, which is the embedding model
we desire.
A simple overview introducing the framework is shown in Figure 5. There is a ground truth
triplet (Microsoft, LocatedIn, Redmond), which is corrupted by disposing of its tail entity, giving (Microsoft, LocatedIn, ?). The corrupted triplet is fed as an input into a generator (G) that produces a probability distribution over the candidate negative triplets. Afterwards, the triplet with the highest probability, (Microsoft, LocatedIn, San Francisco), is sampled as the output of the generator.
The discriminator (D) utilizes the generated negative triplet and original truth triplet as input to train
the model, and computes their score d, which indicates the plausibility of the triplet. The two dotted
lines in the figure denote the error feedback in the generator and discriminator. One more point to
note is that each triplet fact-based representation learning model mentioned above can be employed
as the generator or discriminator in the KBGAN framework to improve the embedding performance.
Table 4 illustrates the scoring function and memory complexity of each neural network-based model.
Figure 5. A simple overview of the KBGAN framework: the generator G assigns probabilities to candidate negative triplets for (Microsoft, LocatedIn, ?) and samples (Microsoft, LocatedIn, San Francisco), which the discriminator D scores together with the ground truth triplet (Microsoft, LocatedIn, Redmond).
The above three types of triplet fact-based models focus on modifying the calculation formula
of the scoring function f r (h, t), and their other routine training procedures are roughly uniform.
The detailed optimization process is illustrated in Algorithm 1. First, all entity and relation embedding
vectors are randomly initialized. In each iteration, a subset of triplets sbatch is constructed by sampling
from the training set S, and it is fed into the model as inputs of the minibatch. For each triplet in
sbatch , a corrupted triplet is generated by replacing the head or tail entity with other ones. After that,
the original triplet set S and all generated corrupted triplets are incorporated as a batch training
set Tbatch . Finally, the parameters are updated by utilizing certain optimization methods.
Here, the set of corrupted triplets S′ is generated according to S′(h, r, t) = {(h′, r, t) | h′ ∈ E} ∪ {(h, r, t′) | t′ ∈ E}, and there are two alternative versions of the loss function. The margin-based loss function is defined as follows:

L = Σ_{(h,r,t)∈S} Σ_{(h′,r,t′)∈S′} max(0, γ + f_r(h, t) − f_r(h′, t′))

and the logistic loss function is defined as:

L = Σ_{(h,r,t)∈S∪S′} log(1 + exp(−y_hrt · f_r(h, t)))
where γ > 0 is a margin hyperparameter and yhrt ∈ {−1, 1} is a label that indicates the category
(negative or positive) to which a given training triplet (h, r, t) belongs. These two loss functions can both be minimized by the stochastic gradient descent (SGD) [83] or Adam [84] methods.
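Putting the pieces together, a toy version of the training loop in Algorithm 1 can be sketched with the TransE score, the margin-based loss, and plain SGD with hand-derived gradients; all hyperparameters and the random training set below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(11)
n_ent, n_rel, dim, margin, lr = 100, 10, 32, 1.0, 0.01
E = rng.normal(scale=0.1, size=(n_ent, dim))
R = rng.normal(scale=0.1, size=(n_rel, dim))
train = [(rng.integers(n_ent), rng.integers(n_rel), rng.integers(n_ent)) for _ in range(500)]

def score(h, r, t):
    return np.linalg.norm(E[h] + R[r] - E[t])

for epoch in range(10):
    batch = [train[i] for i in rng.choice(len(train), size=64)]     # minibatch s_batch
    for (h, r, t) in batch:
        # Corrupt the head or tail to build a negative sample.
        if rng.random() < 0.5:
            h2, t2 = rng.integers(n_ent), t
        else:
            h2, t2 = h, rng.integers(n_ent)
        pos, neg = score(h, r, t), score(h2, r, t2)
        if pos + margin > neg:                     # hinge active: push pos down, neg up
            g_pos = (E[h] + R[r] - E[t]) / (pos + 1e-9)
            g_neg = (E[h2] + R[r] - E[t2]) / (neg + 1e-9)
            E[h]  -= lr * g_pos
            R[r]  -= lr * (g_pos - g_neg)
            E[t]  += lr * g_pos
            E[h2] += lr * g_neg
            E[t2] -= lr * g_neg
```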
The extensions of translation-based methods. TKRL [85] incorporates hierarchical entity type information into translation-based models. It suggests that an entity may have multiple hierarchical types, and different hierarchical types
should be transformed to different type-specific projection matrices.
In TKRL, for each fact (h, r, t), the entity embedding vectors h and t are first represented by using
type-specific projection matrices. Let h⊥ and t⊥ denote the projected vectors:

h⊥ = M_rh h,   t⊥ = M_rt t
where Mrh and Mrt indicate the projection matrices related to h and t. Finally, with a translation
r between the two mapped entities, the scoring function is defined as the general form of the translation-based methods:

f_r(h, t) = ||h⊥ + r − t⊥||
To capture multiple-category semantic information in entities, the projection matrix Mrh /Mrt
(we use Mrh to illustrate the approach) is generated as the weighted summation of all possible type
matrices, i.e.,:
M_rh = ( Σ_{i=1}^{n} α_i M_{c_i} ) / ( Σ_{i=1}^{n} α_i ),   α_i = 1 if c_i ∈ C_rh,  α_i = 0 if c_i ∉ C_rh   (48)
where n is the number of types to which the head entity belongs, ci indicates the i-th type, Mci
represents the projection matrix of ci , αi signifies the corresponding weight, and Crh denotes the
collection of types that the head entity can be connected to with a given relation r. To further mine
the latent information stored in hierarchical categories, the matrix Mci is designed by two types
of operations:
M_{c_i} = Π_{l=1}^{m} M_{c_i^{(l)}} = M_{c_i^{(1)}} M_{c_i^{(2)}} ... M_{c_i^{(m)}},   or
M_{c_i} = Σ_{l=1}^{m} β_l M_{c_i^{(l)}} = β_1 M_{c_i^{(1)}} + β_2 M_{c_i^{(2)}} + ... + β_m M_{c_i^{(m)}}   (49)

where m is the number of subcategories for the parent category c_i in the hierarchical structure, c_i^{(l)} is the l-th subcategory, M_{c_i^{(l)}} is the projection matrix corresponding to c_i^{(l)}, and β_l denotes the corresponding weight of c_i^{(l)}.
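A sketch of how the type-specific projection matrix of Equations (48) and (49) can be assembled; the matrices, weights, and type assignments below are illustrative assumptions.

```python
import numpy as np

def hierarchical_type_matrix(sub_matrices, weights=None):
    """Eq. (49): combine sub-category matrices either by product or by weighted sum."""
    if weights is None:                                   # product (recursive) formulation
        M = np.eye(sub_matrices[0].shape[0])
        for M_l in sub_matrices:
            M = M @ M_l
        return M
    return sum(b * M_l for b, M_l in zip(weights, sub_matrices))   # weighted-sum formulation

def head_projection_matrix(type_matrices, type_in_C_rh):
    """Eq. (48): average the matrices of the types compatible with relation r (alpha_i = 1)."""
    active = [M for M, ok in zip(type_matrices, type_in_C_rh) if ok]
    return sum(active) / len(active)

rng = np.random.default_rng(12)
d = 8
subs = [rng.normal(size=(d, d)) for _ in range(3)]
types = [hierarchical_type_matrix(subs),
         hierarchical_type_matrix(subs, weights=[0.5, 0.3, 0.2])]
M_rh = head_projection_matrix(types, type_in_C_rh=[True, True])
print(M_rh.shape)
```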
TKRL introduces the entity type to enhance the embedding models; another aspect of textual
information is entity description. Wang et al. [86] proposed a text-enhanced knowledge graph
embedding model, named TEKE, which is also an extension of translation-based models.
Given a knowledge graph, TEKE first builds an entity description text corpus by utilizing an entity linking tool, i.e., AIDA [87], to annotate all entities, and then constructs a co-occurrence network that records the co-occurrence frequencies between entities and words. The paper argues that the rich textual information of adjacent words can effectively enrich the representation of a knowledge graph. Therefore, given an entity e, the model defines its valid textual context n(e) as its neighbors in the co-occurrence network, and the pairwise textual context n(h, t) = n(h) ∩ n(t) as the common neighbors of the head and tail entities for each relation r.
Then, the pointwise textual context embedding vector n(e) is obtained through a word embedding toolkit, and the pairwise textual context vector n(h, t) is obtained in a similar way.
Finally, these feature vectors that capture the textual information are employed to calculate the score,
such as TransE:
h⊥ = n(h)A + h
t⊥ = n(t)A + t   (50)
r⊥ = n(h, t)B + r
where A and B denote the weight matrices, and h, r, and t are the bias vectors. The score function f_r(h, t) is formulated as follows:

f_r(h, t) = ||h⊥ + r⊥ − t⊥||
The extensions of tensor factorization-based methods. Apart from taking advantage of triplet
information, Krompaß et al. [88] proposed an improved representation learning model for tensor
factorization-based methods. It integrates prior type-constraint knowledge with triplet facts in original
models, such as RESCAL, and achieves impressive performance in link prediction tasks.
Entities in large-scale KGs generally have one or multifarious predefined types. The types of head
and tail entities in a specific relation are constrained, which is referred to as type-constraints. For instance, the relation MarryTo is only reasonable when both the head and tail entities belong to the type Person. In this model, head_k is an indication vector that denotes the head entity types satisfying the type-constraints of relation k; tail_k is the corresponding indication vector for the tail entity constraints of relation k.
The main difference between this model and RESCAL is that the novel model indexes only those
latent embeddings of entities related to the relation type k, compared with indexing the whole matrix
A, as shown in Equation (21), in RESCAL.
X_k[head_k, tail_k] ≈ A[head_k, :] R_k A[tail_k, :]^T

Here, A[head_k, :] indicates the indexing of the head_k rows from the matrix A, and A[tail_k, :] is the corresponding indexing for tail_k. As a result of this simplification, this model shows a shorter iteration time and is more suitable for large-scale KGs.
The extensions of neural network-based methods. Detailed description information associated with entities and relations exists in most practical large-scale knowledge graphs. For instance, the entity Elon Musk has the particular description: Elon Musk is an entrepreneur, investor, and engineer.
He holds South African, Canadian, and U.S. citizenship and is the founder, CEO, and lead designer of SpaceX.
Xie et al. [89] provided a description-embodied knowledge graph embedding method (DKRL), which
can integrate textual information into the presentation model. It used two embedding models to
encode the semantic descriptions of the entity to enhance the representation learning model.
In this model, each head and tail entity is transformed into two vector representations: one is the structure-based embedding vector h_s/t_s ∈ R^d, which represents its name or index information, and the other is the description-based vector h_d/t_d ∈ R^d, which captures the descriptive text information of the entity. The relation is also embedded into the R^d feature space. DKRL introduces two types of encoders to build the description-based embedding model, including a continuous bag-of-words
(CBOW) method and a CNN method. The score function is expressed as a modified version of TransE:

f_r(h, t) = ||h_s + r − t_s|| + ||h_s + r − t_d|| + ||h_d + r − t_s|| + ||h_d + r − t_d||

Another text-enhanced method, AATE [35], extracts a textual description for each entity and one or more hyponym/synonym words corresponding to relation r, called mention extraction.
Then, the obtained entity text descriptions and mentions are fed into a two-layer bidirectional recurrent
neural network (Bi-RNN) to yield high-level text representations. After that, a mutual attention
layer [90] that achieves success in various tasks is employed to refine these two representations. Finally,
the structure-based representations that previous models generated are associated with the learned
textual representations to calculate the final embedding vectors:

h_final = α h_s + (1 − α) h_d,   r_final = α r_s + (1 − α) r_d,   t_final = α t_s + (1 − α) t_d
where α ∈ [0, 1] denotes the weight factor, hs , rs and ts ∈ Rd are the embedding vectors learned from
the structural information, hd , rd and td ∈ Rd indicate the distributional representations of textual
descriptions, and h f inal , r f inal , and t f inal ∈ Rd are the text-enhanced representations forming the final
output of this model.
Figure 6. The framework of the text-enhanced representation model: relation mentions extracted from WordNet synsets and entity descriptions are encoded by an LSTM with an attention mechanism, and then combined with the triplet fact-based embeddings to form the final model.
Apart from textual descriptions, multi-step relation paths between entities also carry rich semantic information. PTransE [30] extends TransE by modeling relation paths; for each triplet (h, r, t), its energy combines a direct triplet score and a path-based score:

f(h, r, t) = f_r(h, t) + f_P(h, t)   (55)

where f_r(h, t) indicates the normal triplet-based energy score function, which is equal to Equation (3).
f_P(h, t) reflects the plausibility between h and t through multi-step relation paths, and it can be obtained by the following equation:

f_P(h, t) = (1/Z) Σ_{p∈P(h,t)} R(p|h, t) f_p(h, t)   (56)
Here, R( p|h, t) indicates the relation path confidence level of p that is acquired via a network-based
resource allocation algorithm [91]. Z = ∑ p∈ P(h,t) R( p|h, t) is a normalization factor and f p (h, t) signifies
the score function for (h, p, t). Therefore, the final problem is how to integrate various relations on the
multi-step relation paths p to a uniform embedding vector p, as illustrated in Figure 7.
For this challenge, PTransE applies three representative types of semantic composition operations to incorporate the relations of a multi-step path r_1 → r_2 → ... → r_l into an embedding vector p, including addition
(ADD), multiplication (MUL), and recurrent neural network (RNN):
ADD: p = r_1 + r_2 + ... + r_l
MUL: p = r_1 · r_2 · ... · r_l   (57)
RNN: c_i = f(W[c_{i−1}; r_i])

where r_l denotes the embedding vector of relation r_l, c_i represents the combined relation vector at the i-th relation, W indicates a composition matrix, and [c_{i−1}; r_i] signifies the concatenation of c_{i−1} and r_i.
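The three composition operations of Equation (57) can be sketched as follows; the element-wise interpretation of MUL and the tanh activation for the RNN variant are assumptions of this sketch.

```python
import numpy as np

def compose_path(relation_vectors, mode="ADD", W=None):
    """Compose the relations along a path r1 -> r2 -> ... -> rl into one vector p (Eq. 57)."""
    if mode == "ADD":
        return np.sum(relation_vectors, axis=0)
    if mode == "MUL":
        p = relation_vectors[0].copy()
        for r in relation_vectors[1:]:
            p = p * r                                     # element-wise product
        return p
    if mode == "RNN":                                     # c_i = f(W [c_{i-1}; r_i])
        c = relation_vectors[0]
        for r in relation_vectors[1:]:
            c = np.tanh(W @ np.concatenate([c, r]))
        return c
    raise ValueError(mode)

rng = np.random.default_rng(13)
d = 16
path = [rng.normal(size=d) for _ in range(3)]
W = rng.normal(size=(d, 2 * d))
print(compose_path(path, "ADD").shape, compose_path(path, "RNN", W).shape)
```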
The ADD and MUL versions of PTransE are extensions of translation-based methods; the RNN version of PTransE and a similar approach proposed by Neelakantan et al. [92] can be considered as
the neural network methods’ extension in multiple relation paths. Next, we introduce an extension
method based on tensor factorization.
Compared with RESCAL, Guu et al. [93] extended the model by importing multiple relation paths
as additional information to enhance the performance. It leverages the multiplication composition
to combine different semantic inference information from the relations. The evolved f p (h, t) scoring
function corresponding to (h, p, t) is defined as follows:

f_p(h, t) = h^T M_{r_1} M_{r_2} ... M_{r_l} t

where the matrices M_{r_i} encode the latent semantic meanings of the relations along the path. The other training processes are the same
as previous works and it achieved better performance in answering path queries.
• Overall, knowledge graph embedding approaches have made impressive progress in the
development of these years. For instance, HITS@10(%) in WN18 has improved from the initial
52.8% that RESCAL yielded to 96.4% that R-GCN obtained.
• R-GCN achieves the best performance on the WN18 dataset, but on the other dataset it is not one of the best models. The reason is that R-GCN has to collect all information about the neighbors that connect to a specific entity with one or more relations. WN18 contains only 18 types of relations, so this aggregation is easy to compute and generalize. However, FB15K has 1345 types of relations, so the computational cost of R-GCN grows substantially, which is why its performance declines.
• QuatE is superior to all existing methods on the FB15K dataset, and is also the second best performing on WN18. It demonstrates that capturing the hidden inter-dependency between entities and relations in four-dimensional quaternion space is a benefit for knowledge graph representation.
• Compared with the triplet-based models, these description-based models do not yield higher
performance in this task. It reveals that external textual information is not fully utilized and
exploited; researchers can take advantage of this external information to improve performance in
the future.
• In the past two years, the performance of models has not improved much on these two datasets.
The most likely reason is that existing methods have already reached the upper bound of
performance, so this field needs to introduce new evaluation indicators or benchmark datasets to
solve this problem.
The following results list, for each method, the Mean Rank (Raw, Filter) and Hits@10 (%) (Raw, Filter) on WN18, followed by the same metrics on FB15K.
TransE [19] 263 251 75.4 89.2 243 125 34.9 47.1
TransH [51] 401 388 73.0 82.3 212 87 45.7 64.4
TransR [52] 238 225 79.8 92.0 198 77 48.2 68.7
TransD [22] 224 212 79.6 92.2 194 91 53.4 77.3
Transparse [24] 223 221 80.1 93.2 190 82 53.7 79.9
STransE [53] 217 206 80.9 93.4 219 69 51.6 79.7
TransA [23] 405 392 82.3 94.3 155 74 56.1 80.4
KG2E [56] 362 348 80.5 93.2 183 69 47.5 71.5
TransG [59] 357 345 84.5 94.9 152 50 55.9 88.2
RESCAL [61] 1180 1163 37.2 52.8 828 683 28.4 44.1
DistMult [63] – – – 94.2 – – – 58.5
HOLE [64] – – – 94.9 – – – 73.9
Complex [67] – – – 94.7 – – – 84.0
SimplE [68] – – – 94.7 – – – 83.8
RotatE [69] – 309 – 95.9 – 40 – 88.4
QuatE [70] – 162 – 95.9 – 17 – 90.0
SME [21] 526 509 54.7 61.3 284 158 31.3 41.3
NTN [20] – – – 66.1 – – – 41.4
R-GCN [81] – – – 96.4 – – – 84.2
KBGAN [82] – – – 89.2 – – – –
TKRL [85] – – – – 184 68 49.2 69.4
DKRL [89] – – – – 181 91 49.6 67.4
TEKE [86] 140 127 80.0 93.8 233 79 43.5 67.6
AATE [35] – 179 – 94.9 – 52 – 88.0
PTransE [30] – – – – 207 58 51.4 84.6
• In summary, these knowledge graph representation learning models have achieved a greater improvement on the WN11 dataset than on FB13, because there are twice as many training samples in FB13 as in WN11, while the numbers of relations in the two datasets are similar. This also means that FB13 has more data to train embedding models, which improves their generalization ability and makes the performance gap between models smaller.
• In the triplet-based models, TransG outperforms all existing methods in the benchmark datasets.
It reveals that multiple semantics for each relation would refine the performance of models.
• Similar to the previous task, the description-based models do not yield impressive improvements in the triplet classification application. Especially in recent years, few articles have utilized additional textual or path information to improve the performance of models. There is still considerable room for improvement to be achieved with additional information for knowledge graph embedding.
Table 7. Evaluation results on triplets classification accuracy (%) for different embedding methods.
answer. For the past few years, the QA systems based on KGs have received much attention and
some studies have been proposed in this direction [97,98]. The core idea of these methods is to embed
both a KG and the question into a low-dimensional vector space to make the embedding vector of the question and
its corresponding answer as close as possible. This technology also can be used in recommender
systems [99,100], which are systems capable of advising users regarding items they want to purchase
or hold, and other promising application domains.
However, the applications based on KG embedding are still in their initial stages; in particular,
there are few related studies in external applications, such as question answering, and the domains
in which researchers are concerned are very limited. Thus, this direction holds great potential for
future research.
Author Contributions: Conceptualization, Y.D. and S.W.; methodology, W.G.; software, Y.D.; validation, Y.D.
and S.W.; formal analysis, Y.D., W.G. and N.N.X.; investigation, Y.D.; data curation, Y.D.; writing—original draft
preparation, Y.D.; writing—review and editing, W.G. and N.N.X.; supervision, N.N.X.; project administration,
S.W., W.G. and N.N.X.; funding acquisition, W.G. All authors have read and agreed to the published version of
the manuscript.
Funding: This work is supported by the Guiding Project of Fujian Province under Grant No. 2018H0017.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Pease, A.; Niles, I.; Li, J. The suggested upper merged ontology: A large ontology for the semantic web and
its applications. In Proceedings of the Working Notes of the AAAI-2002 Workshop on Ontologies and the
Semantic Web, Edmonton, AB, Canada, 28–29 July 2002; Volume 28, pp. 7–10.
2. Suchanek, F.M.; Kasneci, G.; Weikum, G. Yago: A core of semantic knowledge. In Proceedings of the 16th
International Conference on World Wide Web, Banff, AB, Canada, 8–12 May 2007; pp. 697–706.
3. Bollacker, K.; Evans, C.; Paritosh, P.; Sturge, T.; Taylor, J. Freebase: A collaboratively created graph database
for structuring human knowledge. In Proceedings of the 2008 ACM SIGMOD International Conference on
Management of Data, Vancouver, BC, Canada, 9–12 June 2008; pp. 1247–1250.
4. Vrandečić, D.; Krötzsch, M. Wikidata: A free collaborative knowledgebase. Commun. ACM 2014, 57, 78–85.
[CrossRef]
5. Auer, S.; Bizer, C.; Kobilarov, G.; Lehmann, J.; Cyganiak, R.; Ives, Z. Dbpedia: A nucleus for a web of open
data. In Proceedings of the Semantic Web, International Semantic Web Conference, Asian Semantic Web
Conference, ISWC 2007 + ASWC 2007, Busan, Korea, 11–15 November 2007; pp. 722–735.
6. Shen, W.; Wang, J.; Luo, P.; Wang, M. Linden: Linking named entities with knowledge base via semantic
knowledge. In Proceedings of the 21st International Conference on World Wide Web, Lyon, France,
16–20 April 2012; pp. 449–458.
7. Shen, W.; Wang, J.; Luo, P.; Wang, M. Linking named entities in tweets with knowledge base via user interest
modeling. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and
Data Mining, Chicago, IL, USA, 11–14 August 2013; pp. 68–76.
8. Zheng, Z.; Si, X.; Li, F.; Chang, E.Y.; Zhu, X. Entity disambiguation with freebase. In Proceedings of the 2012
IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology,
Macau, China, 4–7 December 2012; pp. 82–89.
9. Damljanovic, D.; Bontcheva, K. Named entity disambiguation using linked data. In Proceedings of the 9th
Extended Semantic Web Conference, Crete, Greece, 27–31 May 2012; pp. 231–240.
10. Dong, L.; Wei, F.; Zhou, M.; Xu, K. Question answering over freebase with multi-column convolutional neural
networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics
and the 7th International Joint Conference on Natural Language Processing, Beijing, China, 26–31 July 2015;
pp. 260–269.
11. Xu, K.; Reddy, S.; Feng, Y.; Huang, S.; Zhao, D. Question Answering on Freebase via Relation Extraction
and Textual Evidence. In Proceedings of the 54th Annual Meeting of the Association for Computational
Linguistics, Berlin, Germany, 7–12 August 2016; pp. 2326–2336.
12. Hoffmann, R.; Zhang, C.; Ling, X.; Zettlemoyer, L.; Weld, D.S. Knowledge-based weak supervision for
information extraction of overlapping relations. In Proceedings of the 49th Annual Meeting of the Association
for Computational Linguistics, Portland, OR, USA, 19–24 June 2011; pp. 541–550.
13. Fei, W.; Daniel, W. Open information extraction using Wikipedia. In Proceedings of the 48th Annual Meeting
of the Association for Computational Linguistics, Uppsala, Sweden, 11–16 July 2010; pp. 118–127.
14. Sang, S.; Yang, Z.; Wang, L.; Liu, X.; Lin, H.; Wang, J. SemaTyP: A knowledge graph based literature mining
method for drug discovery. BMC Bioinform. 2018, 19, 193. [CrossRef] [PubMed]
15. Abdelaziz, I.; Fokoue, A.; Hassanzadeh, O.; Zhang, P.; Sadoghi, M. Large-scale structural and textual
similarity-based mining of knowledge graph to predict drug–drug interactions. J. Web Semant. 2017,
44, 104–117. [CrossRef]
16. Li, F.L.; Qiu, M.; Chen, H.; Wang, X.; Gao, X.; Huang, J.; Ren, J.; Zhao, Z.; Zhao, W.; Wang, L.; et al. Alime
assist: An intelligent assistant for creating an innovative e-commerce experience. In Proceedings of the
2017 ACM on Conference on Information and Knowledge Management, Singapore, 6–10 November 2017;
pp. 2495–2498.
17. Xu, D.; Ruan, C.; Korpeoglu, E.; Kumar, S.; Achan, K. Product Knowledge Graph Embedding for E-commerce.
In Proceedings of the 13th International Conference on Web Search and Data Mining, Houston, TX, USA,
3–7 February 2020; pp. 672–680.
18. Xu, Z.; Zhang, H.; Hu, C.; Mei, L.; Xuan, J.; Choo, K.K.R.; Sugumaran, V.; Zhu, Y. Building knowledge base
of urban emergency events based on crowdsourcing of social media. Concurr. Comput. Pract. Exp. 2016,
28, 4038–4052. [CrossRef]
19. Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling
multi-relational data. Adv. Neural Inf. Process. Syst. 2013, 2, 2787–2795.
20. Socher, R.; Chen, D.; Manning, C.D.; Ng, A. Reasoning with neural tensor networks for knowledge base
completion. Adv. Neural Inf. Process. Syst. 2013, 1, 926–934.
21. Bordes, A.; Glorot, X.; Weston, J.; Bengio, Y. A semantic matching energy function for learning with
multi-relational data. Mach. Learn. 2014, 94, 233–259. [CrossRef]
22. Ji, G.; He, S.; Xu, L.; Liu, K.; Zhao, J. Knowledge graph embedding via dynamic mapping matrix.
In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th
International Joint Conference on Natural Language Processing, Beijing, China, 26–31 July 2015; pp. 687–696.
23. Jia, Y.; Wang, Y.; Lin, H.; Jin, X.; Cheng, X. Locally Adaptive Translation for Knowledge Graph Embedding.
In Proceedings of the 30th AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February
2016; pp. 992–998.
24. Ji, G.; Liu, K.; He, S.; Zhao, J. Knowledge Graph Completion with Adaptive Sparse Transfer
Matrix. In Proceedings of the 30th AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA,
12–17 February 2016; pp. 985–991.
25. Dai, Y.; Wang, S.; Chen, X.; Xu, C.; Guo, W. Generative adversarial networks based on Wasserstein distance
for knowledge graph embeddings. Knowl.-Based Syst. 2020, 190, 105165. [CrossRef]
26. Weston, J.; Bordes, A.; Yakhnenko, O.; Usunier, N. Connecting Language and Knowledge Bases with
Embedding Models for Relation Extraction. In Proceedings of the 2013 Conference on Empirical Methods in
Natural Language Processing, Seattle, WA, USA, 18–21 October 2013; pp. 1366–1371.
27. Riedel, S.; Yao, L.; McCallum, A.; Marlin, B.M. Relation extraction with matrix factorization and universal
schemas. In Proceedings of the 2013 Conference of the North American Chapter of the Association for
Computational Linguistics, Atlanta, GA, USA, 9–14 June 2013; pp. 74–84.
28. Guo, S.; Wang, Q.; Wang, B.; Wang, L.; Guo, L. Semantically Smooth Knowledge Graph Embedding.
In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th
International Joint Conference on Natural Language Processing, Beijing, China, 26–31 July 2015; pp. 84–94.
29. Ouyang, X.; Yang, Y.; He, L.; Chen, Q.; Zhang, J. Representation Learning with Entity Topics for
Knowledge Graphs. In Proceedings of the International Conference on Knowledge Science, Engineering and
Management, Melbourne, Australia, 19–20 August 2017; pp. 534–542.
30. Lin, Y.; Liu, Z.; Luan, H.; Sun, M.; Rao, S.; Liu, S. Modeling Relation Paths for Representation Learning
of Knowledge Bases. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language
Processing, Lisbon, Portugal, 17–21 September 2015; pp. 705–714.
31. Toutanova, K.; Lin, V.; Yih, W.T.; Poon, H.; Quirk, C. Compositional learning of embeddings for relation
paths in knowledge base and text. In Proceedings of the 54th Annual Meeting of the Association for
Computational Linguistics, Berlin, Germany, 7–12 August 2016; pp. 1434–1444.
32. Zhang, M.; Wang, Q.; Xu, W.; Li, W.; Sun, S. Discriminative Path-Based Knowledge Graph Embedding for
Precise Link Prediction. In Proceedings of the European Conference on Information Retrieval, Grenoble,
France, 25–29 March 2018; pp. 276–288.
33. Zhong, H.; Zhang, J.; Wang, Z.; Wan, H.; Chen, Z. Aligning Knowledge and Text Embeddings by Entity
Descriptions. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing,
Lisbon, Portugal, 17–21 September 2015; pp. 267–272.
34. Xiao, H.; Huang, M.; Meng, L.; Zhu, X. SSP: Semantic Space Projection for Knowledge Graph Embedding
with Text Descriptions. In Proceedings of the 31st AAAI Conference on Artificial Intelligence, San Francisco,
CA, USA, 4–9 February 2017; pp. 3104–3110.
35. An, B.; Chen, B.; Han, X.; Sun, L. Accurate Text-Enhanced Knowledge Graph Representation Learning.
In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational
Linguistics: Human Language Technologies, New Orleans, LA, USA, 1–6 June 2018; pp. 745–755.
36. Cai, H.; Zheng, V.W.; Chang, K. A comprehensive survey of graph embedding: Problems, techniques and
applications. IEEE Trans. Knowl. Data Eng. 2018, 30, 1616–1637. [CrossRef]
37. Wang, Q.; Mao, Z.; Wang, B.; Guo, L. Knowledge graph embedding: A survey of approaches and applications.
IEEE Trans. Knowl. Data Eng. 2017, 29, 2724–2743. [CrossRef]
38. Perozzi, B.; Al-Rfou, R.; Skiena, S. Deepwalk: Online learning of social representations. In
Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,
New York, NY, USA, 24–27 August 2014; pp. 701–710.
39. Cao, S.; Lu, W.; Xu, Q. Grarep: Learning graph representations with global structural information.
In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management,
Melbourne, Australia, 19–23 October 2015; pp. 891–900.
40. Tang, J.; Qu, M.; Wang, M.; Zhang, M.; Yan, J.; Mei, Q. Line: Large-scale information network embedding.
In Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015;
pp. 1067–1077.
41. Wu, F.; Song, J.; Yang, Y.; Li, X.; Zhang, Z.M.; Zhuang, Y. Structured Embedding via Pairwise Relations and
Long-Range Interactions in Knowledge Base. In Proceedings of the 29th AAAI Conference on Artificial
Intelligence, Austin, TX, USA, 25–30 January 2015; pp. 1663–1670.
42. Zhao, Y.; Liu, Z.; Sun, M. Representation Learning for Measuring Entity Relatedness with Rich Information.
In Proceedings of the 24th International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina,
25–31 July 2015; pp. 1412–1418.
43. Liu, Z.; Zheng, V.W.; Zhao, Z.; Zhu, F.; Chang, K.C.C.; Wu, M.; Ying, J. Semantic Proximity Search on
Heterogeneous Graph by Proximity Embedding. In Proceedings of the 31st AAAI Conference on Artificial
Intelligence, San Francisco, CA, USA, 4–9 February 2017; pp. 154–160.
44. Nikolentzos, G.; Meladianos, P.; Vazirgiannis, M. Matching Node Embeddings for Graph Similarity.
In Proceedings of the 31st AAAI Conference on Artificial Intelligence, San Francisco, CA, USA,
4–9 February 2017; pp. 2429–2435.
45. Guo, S.; Wang, Q.; Wang, B.; Wang, L.; Guo, L. SSE: Semantically smooth embedding for knowledge graphs.
IEEE Trans. Knowl. Data Eng. 2017, 29, 884–897. [CrossRef]
46. Zhang, C.; Zhang, K.; Yuan, Q.; Peng, H.; Zheng, Y.; Hanratty, T.; Wang, S.; Han, J. Regions, periods,
activities: Uncovering urban dynamics via cross-modal representation learning. In Proceedings of the 26th
International Conference on World Wide Web, Perth, Australia, 3–7 April 2017; pp. 361–370.
47. Han, Y.; Shen, Y. Partially Supervised Graph Embedding for Positive Unlabelled Feature Selection.
In Proceedings of the 25th International Joint Conference on Artificial Intelligence, New York, NY, USA,
9–15 July 2016; pp. 1548–1554.
48. Defferrard, M.; Bresson, X.; Vandergheynst, P. Convolutional neural networks on graphs with fast localized
spectral filtering. In Proceedings of the 30th International Conference on Neural Information Processing
Systems, Barcelona, Spain, 5–10 December 2016; pp. 3844–3852.
49. Mikolov, T.; Sutskever, I.; Chen, K.; Corrado, G.; Dean, J. Distributed Representations of Words and Phrases
and their Compositionality. Adv. Neural Inf. Process. Syst. 2013, 26, 3111–3119.
50. Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space.
arXiv 2013, arXiv:1301.3781.
51. Wang, Z.; Zhang, J.; Feng, J.; Chen, Z. Knowledge graph embedding by translating on hyperplanes.
In Proceedings of the 28th AAAI Conference on Artificial Intelligence, Québec City, QC, Canada,
27–31 July 2014; pp. 1112–1119.
52. Lin, Y.; Liu, Z.; Sun, M.; Liu, Y.; Zhu, X. Learning entity and relation embeddings for knowledge graph
completion. In Proceedings of the 29th AAAI Conference on Artificial Intelligence, Austin, TX, USA,
25–30 January 2015; pp. 2181–2187.
53. Nguyen, D.Q.; Sirts, K.; Qu, L.; Johnson, M. STransE: A novel embedding model of entities and relationships
in knowledge bases. In Proceedings of the 2016 Conference of the North American Chapter of the Association
for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016;
pp. 460–466.
54. Xiao, H.; Huang, M.; Hao, Y.; Zhu, X. TransA: An adaptive approach for knowledge graph embedding.
arXiv 2015, arXiv:1509.05490.
55. Wang, F.; Sun, J. Survey on distance metric learning and dimensionality reduction in data mining. Data Min.
Knowl. Discov. 2015, 29, 534–564. [CrossRef]
56. He, S.; Liu, K.; Ji, G.; Zhao, J. Learning to represent knowledge graphs with gaussian embedding.
In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management,
Melbourne, Australia, 19–23 October 2015; pp. 623–632.
57. Kullback, S. Information Theory and Statistics; Courier Corporation: North Chelmsford, MA, USA, 1997.
58. Jebara, T.; Kondor, R.; Howard, A. Probability product kernels. J. Mach. Learn. Res. 2004, 5, 819–844.
59. Xiao, H.; Huang, M.; Zhu, X. TransG: A generative model for knowledge graph embedding.
In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany,
7–12 August 2016; Volume 1, pp. 2316–2325.
60. Miller, G.A. WordNet: A lexical database for English. Commun. ACM 1995, 38, 39–41. [CrossRef]
61. Nickel, M.; Tresp, V.; Kriegel, H.P. A three-way model for collective learning on multi-relational data.
In Proceedings of the 28th International Conference on International Conference on Machine Learning,
Bellevue, WA, USA, 28 June–2 July 2011; pp. 809–816.
62. García-Durán, A.; Bordes, A.; Usunier, N. Effective blending of two and three-way interactions for modeling
multi-relational data. In Proceedings of the Joint European Conference on Machine Learning and Knowledge
Discovery in Databases, Nancy, France, 15–19 September 2014; pp. 434–449.
63. Yang, B.; Yih, W.T.; He, X.; Gao, J.; Deng, L. Embedding entities and relations for learning and inference
in knowledge bases. In Proceedings of the 2015 International Conference on Learning Representations,
San Diego, CA, USA, 7–9 May 2015.
64. Nickel, M.; Rosasco, L.; Poggio, T. Holographic Embeddings of Knowledge Graphs. In Proceedings of the
30th AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; pp. 1955–1961.
65. Plate, T.A. Holographic reduced representations. IEEE Trans. Neural Netw. 1995, 6, 623–641. [CrossRef]
66. Brigham, E.O. The Fast Fourier Transform and Its Applications; Pearson: Upper Saddle River, NJ,
USA, 1988; Volume 448.
67. Trouillon, T.; Welbl, J.; Riedel, S.; Gaussier, É.; Bouchard, G. Complex embeddings for simple link prediction.
In Proceedings of the International Conference on Machine Learning, New York, NY, USA, 19–24 June 2016;
pp. 2071–2080.
68. Kazemi, S.M.; Poole, D. SimplE embedding for link prediction in knowledge graphs. In Proceedings
of the 32nd International Conference on Neural Information Processing Systems, Montreal, QC, Canada,
3–8 December 2018; pp. 4289–4300.
69. Sun, Z.; Deng, Z.H.; Nie, J.Y.; Tang, J. RotatE: Knowledge Graph Embedding by Relational Rotation in
Complex Space. In Proceedings of the International Conference on Learning Representations, New Orleans,
LA, USA, 6–9 May 2019.
70. Zhang, S.; Tay, Y.; Yao, L.; Liu, Q. Quaternion knowledge graph embeddings. In Proceedings of
the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada,
8–14 December 2019; pp. 2731–2741.
71. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [CrossRef]
72. Wang, S.; Guo, W. Robust co-clustering via dual local learning and high-order matrix factorization.
Knowl.-Based Syst. 2017, 138, 176–187. [CrossRef]
73. Wang, S.; Guo, W. Sparse multigraph embedding for multimodal feature representation. IEEE Trans.
Multimed. 2017, 19, 1454–1466. [CrossRef]
74. Ke, X.; Zou, J.; Niu, Y. End-to-end automatic image annotation based on deep cnn and multi-label data
augmentation. IEEE Trans. Multimed. 2019, 21, 2093–2106. [CrossRef]
75. Dong, X.; Gabrilovich, E.; Heitz, G.; Horn, W.; Lao, N.; Murphy, K.; Strohmann, T.; Sun, S.; Zhang, W.
Knowledge vault: A web-scale approach to probabilistic knowledge fusion. In Proceedings of the 20th
ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA,
24–27 August 2014; pp. 601–610.
76. Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117. [CrossRef]
77. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
78. Liu, Q.; Jiang, H.; Evdokimov, A.; Ling, Z.H.; Zhu, X.; Wei, S.; Hu, Y. Probabilistic reasoning via deep
learning: Neural association models. arXiv 2016, arXiv:1603.07704.
79. Nair, V.; Hinton, G.E. Rectified linear units improve restricted boltzmann machines. In Proceedings of the
27th International Conference on Machine Learning, Haifa, Israel, 21–24 June 2010; pp. 807–814.
80. Nguyen, D.Q.; Nguyen, T.D.; Nguyen, D.Q.; Phung, D. A Novel Embedding Model for Knowledge Base
Completion Based on Convolutional Neural Network. In Proceedings of the 2018 Conference of the
North American Chapter of the Association for Computational Linguistics: Human Language Technologies,
New Orleans, LA, USA, 1–6 June 2018; Volume 2, pp. 327–333.
81. Schlichtkrull, M.; Kipf, T.N.; Bloem, P.; Van Den Berg, R.; Titov, I.; Welling, M. Modeling relational data with
graph convolutional networks. In Proceedings of the European Semantic Web Conference, Anissaras, Greece,
3–7 June 2018; pp. 593–607.
82. Cai, L.; Wang, W.Y. KBGAN: Adversarial Learning for Knowledge Graph Embeddings. In Proceedings
of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics:
Human Language Technologies, New Orleans, LA, USA, 1–6 June 2018; Volume 1, pp. 1470–1480.
83. Robbins, H.; Monro, S. A stochastic approximation method. Herbert Robbins Sel. Pap. 1985, 22, 102–109.
84. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
85. Xie, R.; Liu, Z.; Sun, M. Representation learning of knowledge graphs with hierarchical types. In Proceedings
of the 25th International Joint Conference on Artificial Intelligence, New York, NY, USA, 9–15 July 2016;
pp. 2965–2971.
86. Wang, Z.; Li, J. Text-enhanced representation learning for knowledge graph. In Proceedings of the 25th
International Joint Conference on Artificial Intelligence, New York, NY, USA, 9–15 July 2016; pp. 1293–1299.
87. Yosef, M.A.; Hoffart, J.; Bordino, I.; Spaniol, M.; Weikum, G. AIDA: An online tool for accurate disambiguation
of named entities in text and tables. Proc. VLDB Endow. 2011, 4, 1450–1453.
88. Krompaß, D.; Baier, S.; Tresp, V. Type-Constrained Representation Learning in Knowledge Graphs.
In Proceedings of the 14th International Conference on The Semantic Web-ISWC, Bethlehem, PA, USA,
11–15 October 2015; pp. 640–655.
89. Xie, R.; Liu, Z.; Jia, J.; Luan, H.; Sun, M. Representation Learning of Knowledge Graphs with Entity
Descriptions. In Proceedings of the 30th AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA,
12–17 February 2016; pp. 2659–2665.
90. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention
is all you need. In Proceedings of the 31st International Conference on Neural Information Processing
Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008.
91. Zhou, T.; Ren, J.; Medo, M.; Zhang, Y.C. Bipartite network projection and personal recommendation.
Phys. Rev. E 2007, 76, 046115. [CrossRef]
92. Neelakantan, A.; Roth, B.; McCallum, A. Compositional Vector Space Models for Knowledge Base
Completion. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics
and the 7th International Joint Conference on Natural Language Processing, Beijing, China, 26–31 July 2015;
pp. 156–166.
93. Guu, K.; Miller, J.; Liang, P. Traversing Knowledge Graphs in Vector Space. In Proceedings of the
2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015;
pp. 318–327.
94. Jiang, T.; Liu, T.; Ge, T.; Sha, L.; Li, S.; Chang, B.; Sui, Z. Encoding temporal information for time-aware link
prediction. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing,
Austin, TX, USA, 1–5 November 2016; pp. 2350–2354.
95. Trivedi, R.; Dai, H.; Wang, Y.; Song, L. Know-Evolve: Deep Temporal Reasoning for Dynamic Knowledge
Graphs. In Proceedings of the International Conference on Machine Learning, Sydney, Australia,
6–11 August 2017; pp. 3462–3471.
96. Feng, J.; Huang, M.; Yang, Y.; Zhu, X. GAKE: Graph aware knowledge embedding. In Proceedings
of the COLING 2016 the 26th International Conference on Computational Linguistics: Technical Papers,
Osaka, Japan, 11–16 December 2016; pp. 641–651.
97. Bordes, A.; Chopra, S.; Weston, J. Question Answering with Subgraph Embeddings. In Proceedings of the
2014 Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, 25–29 October 2014;
pp. 615–620.
98. Bordes, A.; Weston, J.; Usunier, N. Open question answering with weakly supervised embedding models.
In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in
Databases, Nancy, France, 15–19 September 2014; pp. 165–180.
99. Zhang, F.; Yuan, N.J.; Lian, D.; Xie, X.; Ma, W.Y. Collaborative knowledge base embedding for recommender
systems. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and
Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 353–362.
100. Fu, C.; Zhou, M.; Xuan, Q.; Hu, H.X. Expert recommendation in oss projects based on knowledge embedding.
In Proceedings of the 2017 International Workshop on Complex Systems and Networks, Doha, Qatar,
8–10 December 2017; pp. 149–155.
101. Lu, J.; Behbood, V.; Hao, P.; Zuo, H.; Xue, S.; Zhang, G. Transfer learning using computational intelligence:
A survey. Knowl.-Based Syst. 2015, 80, 14–23. [CrossRef]
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).