
PATH RANKING MODEL FOR ENTITY PREDICTION

Xiao Long∗†, MingHong Yao∗†, Liansheng Zhuang∗, Houqiang Li∗, Shafei Wang‡

∗ University of Science and Technology of China

† Peng Cheng Laboratory

‡ Northern Institute of Electronic Equipment, Beijing, China
[email protected]; [email protected]

ABSTRACT
Knowledge graphs (KGs) often suffer from incompleteness, which creates a demand for KG completion. Path-based methods are among the most important approaches to this task. However, since the number of entities is much larger than the number of relations in a knowledge graph, existing path-based methods are only used to predict the relations between entity pairs and are rarely applied to the entity prediction task. To address this issue, this paper proposes a new framework called the Path Ranking Model (PRM) for the knowledge graph completion task. Our key idea is to exploit both the observable patterns and the latent semantic information in relation paths to predict entities. Extensive experiments on popular public datasets demonstrate the effectiveness of the proposed framework on the entity prediction task.

Index Terms— Knowledge graph, KGC, path ranking

Fig. 1. An example of the two forms of KGC tasks. The left side of the figure describes the entity prediction task and the right side describes the relation prediction task.

This work was supported in part by the Next Generation AI Project of China No.2018AAA0100602, in part to Dr. Liansheng Zhuang by NSFC under contracts No.U20B2070 and No.61976199, and in part to Dr. Houqiang Li by NSFC under contract No.61836011. Dr. Liansheng Zhuang is the corresponding author.

1. INTRODUCTION

Knowledge graphs (KGs) have emerged as an effective way to integrate disparate data sources and model underlying relationships for applications such as natural language processing [1], question answering [2] and recommender systems [3]. They are usually stored as triples of the form (head entity, relation, tail entity), called facts. Typical KGs such as Freebase [4], DBpedia [5] and NELL [6] may contain millions of facts, but they are still far from complete due to the complexity of the real world. For example, 75% of the 3 million person entities in Freebase lack a nationality, and 60% of the person entities in DBpedia lack a birthplace. Such knowledge graphs are difficult to use in real applications because questions posed against an incomplete knowledge graph may have no correct answer. Knowledge graph completion (KGC) [7] tries to remedy this incompleteness by inferring new triples from the existing ones, and has attracted much attention recently.

Many methods have been proposed to infer the missing parts of a knowledge graph, among which knowledge graph embeddings (KGEs) are popular ones. KGEs such as TransE [8], DistMult [9] and RotatE [10] learn low-dimensional representations of the entities and relations in a knowledge graph, and provide a generalizable context about the overall KG that can be used to infer relations. The learned embeddings contain rich semantic information and can benefit a broad range of downstream applications [1, 2]. However, KGE models treat the triples in a KG independently and rely heavily on their ability to model the connectivity patterns of the relations.

Many efforts have been devoted to path-based methods, which aim to use paths to infer relations between an entity pair [11, 12]. Different from KGEs, which model the direct path between an entity pair, multi-hop paths pass through intermediate nodes between the source and destination nodes. When routing between two nodes, a multi-hop path consisting of a sequence of triples can provide more informative knowledge than a direct path. For example, consider the two triples (Donald Trump, BornIn, New York City) and (New York City, LocatedIn, the USA), which constitute the path Donald Trump −→ BornIn −→ New York City −→ LocatedIn −→ USA via the intermediate node New York City. A machine is then expected to reason over this multi-hop path to conclude that the missing relation Nationality exists between
Donald Trump and the USA. To make full use of the relation path information in a KG, many works are devoted to mining observable patterns in knowledge graphs for reasoning. In early path-based methods such as PRA [11], each path is treated as an atomic feature and used to predict the relation in a binary classifier. As a result, a single classifier could contain millions of distinct paths, and the number grows dramatically with the number of relations in the KG. To solve this problem, Path-RNN [12] feeds multi-hop relation paths into an RNN to construct a vector representation of each path; the predictability of a path for a query relation is then calculated by the dot product of their representations.

Though achieving promising performance, path-based methods are only used to predict the relation between an entity pair. Since the number of entities in a KG is far greater than the number of relations, using path-based methods to predict entities is more challenging than predicting relations. Fig. 1 illustrates the differences between relation prediction and entity prediction. In fact, multi-hop paths contain rich semantic cues that are extremely useful for entity prediction. For example, from the knowledge contained in the path (Donald Trump −→ BornIn −→ New York City −→ LocatedIn −→ USA), one can be confident that Donald Trump's father also lives in the USA, because the relation path (BornIn −→ LocatedIn) constrains the tail entity selection under the path (Father −→ LivesIn).

Inspired by these insights, this paper proposes a novel framework to solve the entity prediction task, namely the Path Ranking Model (PRM). Our key idea is to use both the latent semantic information and the observable patterns in relation paths for entity prediction in the knowledge graph. Specifically, our framework first uses depth-first search (DFS) to find all the effective paths (meta-paths) that represent each relation. Then, we embed the paths into a continuous vector space. Finally, by combining the path features linearly and applying an activation, we obtain the probability that a triple appears in the KG. Extensive experiments on three popular benchmark datasets demonstrate the effectiveness of the PRM on the entity prediction task.

In summary, our main contributions are as follows:

• This paper proposes a novel PRM framework for the entity prediction task, which simultaneously exploits both the latent semantic information and the observable patterns in relation paths. To the best of our knowledge, this is the first work to use path-based methods to solve the entity prediction task.

• Extensive experiments are conducted on three popular benchmark datasets. Experimental results show that the PRM achieves state-of-the-art performance on the entity prediction task.

2. RELATED WORK

In this section, we describe the related work and its key differences from ours. Roughly speaking, knowledge graph completion models can be divided into two categories: KG embedding models and path-based models.

KG embedding models: Knowledge graph embedding aims to represent entities and relations as low-dimensional vectors. It falls into three major categories. (1) Translational distance models: inspired by the translation-invariance principle from word embedding [13], TransE [8] describes relations as translations from head entities to tail entities, which means that entities and relations satisfy the formula h + r ≈ t, where h, r, t ∈ Rⁿ. (2) Tensor decomposition based models: DistMult [9] and ComplEx [14] both use tensor decomposition to represent each relation as a matrix and each entity as a vector. (3) Neural network based models: ConvE [15] employs convolutional neural networks to define score functions; recently, graph convolutional networks have also been introduced, since knowledge graphs have graph structure [16]. However, most KGE methods consider only the facts immediately observed in the KG and ignore extra prior knowledge that could enhance the embeddings.

Path-based models: Paths in a KG have gained attention as a complement to KG embedding, because multi-hop paths can reveal relations between seemingly unconnected entities. The Path Ranking Algorithm (PRA) [11] is one of the early studies; it searches for paths by random walks in the KG and uses the paths as features in a per-target-relation binary classifier. [12] proposed a compositional vector space model with a recurrent neural network to model relational paths for knowledge base completion. [17] proposed PTransE, which obtains path embeddings by composing all the relations in each path.

Compared with KG embedding models, path-based models fully consider the diversity of relation paths in a KG and capture the rich information they carry. However, previous studies only verified that path-based methods can decide whether a given query relation exists between an entity pair, ignoring the entity prediction task.

3. METHOD

3.1. Overview of the PRM

In this subsection, we give an overview of the PRM framework and the details of its feature generation. When the traditional path ranking algorithm predicts an unknown relation, it first extracts the useful meta-paths that can represent the relation, and then generates the path features by random walks. However, this procedure is very slow for a large knowledge graph. Therefore, the proposed PRM is divided into two stages, online and offline. The offline stage will store the corresponding paths'
Fig. 2. Pipeline of the PRM for solving the entity prediction task.

features and scoring weights, while the online stage will use them for the entity prediction task. Fig. 2 illustrates the pipeline of the PRM for solving the entity prediction task.

Path Extraction Module

Given a knowledge graph, we write a fact triple as (e, R, e′), and let T = {(ei, Ri, e′i), · · · } denote the set of all triples in the KG. We also write (e, R, e′) as R(e, e′) if e and e′ are related by R. A relation path P is a sequence of relations R1, · · · , Rℓ with the constraint that range(Ri) = dom(Ri+1) for all 1 ≤ i ≤ ℓ − 1. For each relation Ri in the KG, we use depth-first search (DFS) over all triples to find the meta-paths P = {P1, · · · , Pn} of bounded length ℓ, i.e., the maximum length of a meta-path is ℓ. In practice, ℓ is a hyper-parameter that needs to be tuned.

Path Vectorization Module

To emphasize the types associated with each step, a path P = R1, · · · , Rℓ can be written as

    E0 −R1→ · · · −Rℓ→ Eℓ,    (1)

where Ei = range(Ri) = dom(Ri+1), using dom(R) to denote the domain of a relation R and range(R) its range. We also define dom(P) ≡ E0 and range(P) ≡ Eℓ. One can recursively define a distribution, denoted hEe,P(e′), describing the probability that the head entity e is connected to the tail entity e′ through the path P. For any relation path P = R1, · · · , Rℓ and set of seed entities Ee ⊂ dom(P), if P is the empty path we define

    hEe,P(e′) = 1/|Ee| if e′ ∈ Ee, and 0 otherwise.    (2)

If P = R1, · · · , Rℓ is nonempty, then let P′ = R1, · · · , Rℓ−1 and define

    hEe,P(e′) = Σ_{q ∈ range(P′)} hEe,P′(q) · P(e′ | q; Rℓ),    (3)

where P(e′ | q; Rℓ) = Rℓ(q, e′)/|Rℓ(q, ·)| is the probability of reaching node e′ from node q by a one-step random walk with edge type Rℓ, and Rℓ(q, e′) indicates whether there exists an edge of type Rℓ connecting q to e′.

More generally, given a set of paths P1, · · · , Pn, one can treat each hEe,Pi(e′) as a path feature for the candidate node e′ and rank nodes by the linear model

    θ1 hEe,P1(e′) + θ2 hEe,P2(e′) + · · · + θn hEe,Pn(e′),    (4)

where the θi are weights for the paths. In this paper, we consider learning such linear weighting schemes over all relation paths of bounded length ℓ (e.g., ℓ ≤ 4). One can easily generate P(e, ℓ) = {P}, the set of all type-correct relation paths with range Te and length ≤ ℓ. This gives a ranking of the nodes e′ ∈ I(Te) by the scoring function

    s(e′; θ) = Σ_{P ∈ P(e,ℓ)} hEe,P(e′) θP.    (5)

In matrix form this can be written as s = Aθ, where s is a sparse column vector of scores and θ is a column vector of weights for the corresponding paths P. We call A the feature matrix and denote its i-th row by Ai.

Finally, we store the corresponding path features in a database, which accelerates prediction in the online stage.

Training Module

Given a relation R and a set of node pairs {(ei, e′i)}, we construct a training dataset D = {(xi, li)} by (3) and
(4), where xi is a vector of all the path features for the pair (ei, e′i), i.e., the j-th component of xi is hei,Pj(e′i), and li indicates whether R(ei, e′i) is true:

    li = 1 if (ei, R, e′i) ∈ T, and 0 otherwise.    (6)

With the training data D, we define oi(θ) as the per-instance objective function in (7), where the parameter θ is the weight vector to be estimated. In this paper, we use the binomial log-likelihood, which has the advantage of being easy to optimize while not penalizing outlier samples too harshly [18]:

    oi(θ) = wi [li ln pi + (1 − li) ln(1 − pi)],    (7)

where pi is the probability of the triple appearing in the KG, calculated by (8):

    p(li = 1 | xi; θ) = exp(θᵀxi) / (1 + exp(θᵀxi)).    (8)

Parameter estimation can then be formulated as maximizing the regularized objective function

    O(θ) = Σi oi(θ) − λ‖θ‖²,    (9)

where λ is a parameter that controls the L2 regularization to prevent overfitting. After the training stage, we obtain the scoring weights used for online prediction.

3.2. PRM for the Entity Prediction Task

In this subsection, we describe how to formulate entity prediction on a knowledge graph as a ranking task. Since the missing-head and missing-tail cases are equivalent, we only discuss the latter here.

Given a triple (ei, Ri, ?) that lacks its tail entity, we use all the candidate entities in the KG to expand it into a set of triples T = {(ei, Ri, e′1), · · · , (ei, Ri, e′M)}, where M is the number of entities in the KG. Then, we retrieve the path features xi of the corresponding triples from the database generated in the offline stage. Next, we calculate the probability pi of each candidate entity from xi by (8), and select

    ind = arg maxi {pi}.    (10)

Finally, selecting the entity index ind with the highest score by (10), e′ind is the most likely tail entity.

4. EXPERIMENT

In this section, we evaluate the performance of the PRM on the task of entity prediction. All the algorithms are implemented in Python and PyTorch, and the experiments are run on 2 Quadro RTX 6000 GPUs and 96 Intel Xeon Platinum 8268 cores.

4.1. Experimental Settings

Datasets. We evaluate the PRM on three popular benchmark datasets for KGC: FB15k-237 [7], YAGO3-10 [7], and WN18RR [7]. They are subsets of FB15k, YAGO3, and WN18 respectively. Recent studies [7] found that FB15k and WN18 contain inverse relations; under these circumstances, FB15k-237 and WN18RR, which remove the reverse relations from FB15k and WN18, were proposed and are regarded as more challenging datasets. We therefore use FB15k-237, WN18RR, and YAGO3-10 as the benchmark datasets. Their statistics are summarized in Table 1.

Table 1. Statistics of the three datasets. Rel denotes relation and Ent denotes entity.

Dataset      #Ent      #Rel   #Train      #Valid   #Test
FB15k-237    14,541    237    272,115     17,535   20,466
YAGO3-10     123,182   37     1,079,040   5,000    5,000
WN18RR       40,493    11     86,835      3,034    3,134

Baselines. To demonstrate the effectiveness of the PRM on the link prediction task, we select several state-of-the-art models for comparison, covering two types of baselines: (1) KGE methods, which are widely used for entity prediction, such as TransE [8], DistMult [9], ComplEx [14], RotatE [10], ConvE [15], R-GCN [16], HAKE [19], InteractE [20] and TuckER [21]; and (2) rule-enhanced models, which use first-order logic rules in the KG for reasoning [7], such as DRUM [22]. We use the best results reported in the original papers for comparison.

Evaluation Metric. Following Bordes et al. [8], for each triple (h, r, t) in the test dataset, we replace either the head entity h or the tail entity t with each candidate entity to create a set of candidate triples, and then rank the candidate triples in descending order of their scores. Note that we use the "filtered" setting as in [8], which excludes all other known valid triples from the ranking. We choose Mean Reciprocal Rank (MRR, the mean of the reciprocals of the predicted ranks) and Hits at N (H@N, the proportion of ranks not larger than N) as the evaluation metrics. Higher MRR or H@N indicates better performance.

Parameters Setting. During the training stage, we use the adaptive moment (Adam) algorithm to optimize the model, and we search for the best hyper-parameters of all models according to performance on the validation set. Note that the validation set does not participate in training. In detail, we search the learning rate α in {0.001, 0.005, 0.01, 0.1} and the meta-path length ℓ per relation in {2, 3, 4}. Finally, for all the datasets we set α = 0.01, ℓ = 4, and the maximum number of meta-paths per relation to 500. All the training parameters are randomly initialized.
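Before turning to the results, the filtered ranking protocol described under Evaluation Metric can be sketched as follows. This is a minimal illustration with our own variable names, not the authors' code; `score` stands in for any triple-scoring function, such as the probability of Eq. (8):

```python
# Minimal sketch of the "filtered" evaluation protocol described above.
# `all_true` holds every known-valid triple (train/valid/test) so that
# competing candidates which are themselves true are skipped at ranking.

def filtered_rank(h, r, t, entities, all_true, score):
    """1-based rank of the true tail t among filtered candidate tails."""
    t_score = score(h, r, t)
    better = 0
    for e in entities:
        if e == t or (h, r, e) in all_true:
            continue  # filtered setting: ignore other valid triples
        if score(h, r, e) > t_score:
            better += 1
    return better + 1

def mrr_and_hits(test_triples, entities, all_true, score, ns=(1, 3, 10)):
    """Mean Reciprocal Rank and Hits@N over a set of test triples."""
    ranks = [filtered_rank(h, r, t, entities, all_true, score)
             for (h, r, t) in test_triples]
    mrr = sum(1.0 / rk for rk in ranks) / len(ranks)
    hits = {n: sum(rk <= n for rk in ranks) / len(ranks) for n in ns}
    return mrr, hits
```

For head prediction, the same routine applies with the head entity replaced instead of the tail.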
Table 2. Entity prediction results on FB15k-237 and WN18RR.

                                       FB15k-237                   WN18RR
Model                              MRR    @1    @3    @10    MRR    @1    @3    @10
TransE (Bordes et al. 2013)       .294     -     -   .465   .226     -     -   .501
DistMult (Yang et al. 2015)       .241   .155  .263  .419   .430   .391  .442  .490
ComplEx (Trouillon et al. 2016)   .247   .158  .275  .428   .440   .158  .275  .428
R-GCN (Schlichtkrull et al. 2018) .249   .151  .264  .417   .249   .151  .264  .417
ConvE (Dettmers et al. 2018)      .316   .239  .350  .491   .461   .390  .430  .481
RotatE (Sun et al. 2019)          .338   .241  .375  .533   .476   .428  .492  .571
DRUM (Sadeghian et al. 2019)      .343   .255  .378  .516   .486   .425  .513  .586
TuckER (Balazevic et al. 2019)    .358   .266  .394  .544   .470   .443  .482  .526
HAKE (Zhang et al. 2020)          .346   .250  .381  .542   .497   .452  .516  .582
InteractE (Vashishth et al. 2020) .354   .263     -  .535   .463   .430     -  .528
PRM                               .364   .255  .388  .580   .498   .431  .733  .766

Table 3. Entity prediction results on YAGO3-10.

Model                              MRR    @1    @3    @10
TransE (Bordes et al. 2013)          -     -     -     -
DistMult (Yang et al. 2015)       .340   .240  .380  .540
ComplEx (Trouillon et al. 2016)   .360   .260  .400  .550
ConvE (Dettmers et al. 2018)      .520   .450  .560  .660
RotatE (Sun et al. 2019)          .495   .402  .550  .670
HAKE (Zhang et al. 2020)          .545   .462  .596  .694
InteractE (Vashishth et al. 2020) .541   .462     -  .687
PRM                               .698   .526  .692  .723

4.2. Results and Analysis

Experimental results are shown in Table 2 and Table 3, from which we have the following findings.

1) The results indicate that PRM significantly and consistently outperforms the state-of-the-art competitors on the three benchmark datasets. In both Table 2 and Table 3, PRM achieves the best results on most metrics, which clearly demonstrates its effectiveness on the entity prediction task.

2) Specifically, on the YAGO3-10 dataset, PRM exceeds HAKE's MRR by 15.3%, InteractE's Hits@1 by 6.4%, HAKE's Hits@3 by 9.6%, and HAKE's Hits@10 by 2.9%. On the WN18RR dataset, PRM improves upon DRUM's Hits@10 and Hits@3 by large margins of 18% and 22% respectively, while its Hits@1 and MRR are basically on par with the other methods. On the FB15k-237 dataset, PRM also achieves good results on all evaluation metrics. Although its Hits@1 and Hits@3 on FB15k-237 are not as strong, PRM is consistently better than the KGE methods on the MRR metric, which shows that PRM can guarantee the average accuracy of the prediction results. These results illustrate that a path-based model can capture both the latent semantic information and the observable patterns in a KG, which greatly helps the entity prediction task.

Due to the differences between the FB15k-237 dataset and the other two datasets, the performance of different types of algorithms may differ. For example, the PRM performs much better than the KGE models on the YAGO3-10 and WN18RR datasets, whereas on FB15k-237 its improvement on the entity prediction task is less significant. A possible reason is that FB15k-237 has many more relations than the other two datasets, which means more complex relation path patterns and less training data per relation. Besides, the above results show that the path-based method achieves greater improvements on Hits@N (N > 1) than on Hits@1, which means it can usually predict the best solution within a tolerable error range.

5. CONCLUSION

In this paper, we propose the PRM framework to solve the entity prediction task. Compared with KGE models, which treat triples independently and use only the direct relations between entities, PRM simultaneously exploits both the latent semantic information and the observable patterns in relation paths, which greatly helps in predicting the missing entities in a KG. The extensive experimental results verify that the path features with the weights learned by PRM significantly improve performance on the entity prediction task. In the future, we will investigate how to use structure information and richer entity information (e.g., entity types) to help the path-based model achieve better results.
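As a compact illustration of the scoring pipeline in Section 3, the sketch below combines the random-walk path feature of Eqs. (2)-(3), the linear path-weight model of Eq. (4) passed through a sigmoid as in Eq. (8), and the arg-max tail selection of Eq. (10). The tiny in-memory graph and all names are our own simplification, not the paper's implementation, which precomputes and stores path features offline:

```python
import math
from collections import defaultdict

# Illustrative sketch only: `kg` maps an entity to its outgoing
# (relation, tail) edges, `theta` plays the role of the learned path
# weights, and features are recomputed on the fly.

def path_feature(kg, seeds, path):
    """h_{Ee,P}(e'): random-walk probability along relation path P."""
    dist = {e: 1.0 / len(seeds) for e in seeds}  # empty-path base case, Eq. (2)
    for rel in path:
        nxt = defaultdict(float)
        for q, p in dist.items():
            targets = [t for (r, t) in kg.get(q, []) if r == rel]
            for t in targets:
                nxt[t] += p / len(targets)  # P(t | q; rel) = 1/|rel(q, .)|
        dist = nxt
    return dist

def score_tail(kg, head, paths, theta):
    """Linear path-feature model (Eq. (4)) passed through a sigmoid (Eq. (8))."""
    s = defaultdict(float)
    for w, path in zip(theta, paths):
        for e, p in path_feature(kg, [head], path).items():
            s[e] += w * p
    # Only candidates reachable by at least one path receive a score.
    return {e: 1.0 / (1.0 + math.exp(-v)) for e, v in s.items()}

def predict_tail(kg, head, paths, theta):
    """Eq. (10): return the candidate tail with the highest probability."""
    probs = score_tail(kg, head, paths, theta)
    return max(probs, key=probs.get) if probs else None
```

On the running example of the paper, the single path (BornIn, LocatedIn) from Donald Trump concentrates all random-walk mass on USA, which the arg-max step then selects.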
6. REFERENCES

[1] Zhengyan Zhang, Xu Han, Zhiyuan Liu, Xin Jiang, Maosong Sun, and Qun Liu, "ERNIE: Enhanced language representation with informative entities," arXiv preprint arXiv:1905.07129, 2019.

[2] Yanchao Hao, Hao Liu, Shizhu He, Kang Liu, and Jun Zhao, "Pattern-revising enhanced simple question answering over knowledge bases," in Proceedings of the 27th International Conference on Computational Linguistics, 2018, pp. 3272–3282.

[3] Fuzheng Zhang, Nicholas Jing Yuan, Defu Lian, Xing Xie, and Wei-Ying Ma, "Collaborative knowledge base embedding for recommender systems," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 353–362.

[4] Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, and Jamie Taylor, "Freebase: a collaboratively created graph database for structuring human knowledge," in Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, 2008, pp. 1247–1250.

[5] Jens Lehmann, Robert Isele, Max Jakob, Anja Jentzsch, Dimitris Kontokostas, Pablo N. Mendes, Sebastian Hellmann, Mohamed Morsey, Patrick Van Kleef, Sören Auer, et al., "DBpedia – a large-scale, multilingual knowledge base extracted from Wikipedia," Semantic Web, vol. 6, no. 2, pp. 167–195, 2015.

[6] Tom Mitchell, William Cohen, Estevam Hruschka, Partha Talukdar, Bishan Yang, Justin Betteridge, Andrew Carlson, Bhavana Dalvi, Matt Gardner, Bryan Kisiel, et al., "Never-ending learning," Communications of the ACM, vol. 61, no. 5, pp. 103–115, 2018.

[7] Quan Wang, Zhendong Mao, Bin Wang, and Li Guo, "Knowledge graph embedding: A survey of approaches and applications," IEEE Transactions on Knowledge and Data Engineering, vol. 29, no. 12, pp. 2724–2743, 2017.

[8] Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko, "Translating embeddings for modeling multi-relational data," Advances in Neural Information Processing Systems, vol. 26, pp. 2787–2795, 2013.

[9] Bishan Yang, Wen-tau Yih, Xiaodong He, Jianfeng Gao, and Li Deng, "Embedding entities and relations for learning and inference in knowledge bases," arXiv preprint arXiv:1412.6575, 2014.

[10] Zhiqing Sun, Zhi-Hong Deng, Jian-Yun Nie, and Jian Tang, "RotatE: Knowledge graph embedding by relational rotation in complex space," arXiv preprint arXiv:1902.10197, 2019.

[11] Ni Lao, Tom Mitchell, and William Cohen, "Random walk inference and learning in a large scale knowledge base," in Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 2011, pp. 529–539.

[12] Arvind Neelakantan, Benjamin Roth, and Andrew McCallum, "Compositional vector space models for knowledge base completion," arXiv preprint arXiv:1504.06662, 2015.

[13] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean, "Distributed representations of words and phrases and their compositionality," in Advances in Neural Information Processing Systems, 2013, pp. 3111–3119.

[14] Théo Trouillon, Johannes Welbl, Sebastian Riedel, Éric Gaussier, and Guillaume Bouchard, "Complex embeddings for simple link prediction," in International Conference on Machine Learning (ICML), 2016.

[15] Tim Dettmers, Pasquale Minervini, Pontus Stenetorp, and Sebastian Riedel, "Convolutional 2D knowledge graph embeddings," arXiv preprint arXiv:1707.01476, 2017.

[16] Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne Van Den Berg, Ivan Titov, and Max Welling, "Modeling relational data with graph convolutional networks," in European Semantic Web Conference. Springer, 2018, pp. 593–607.

[17] Yankai Lin, Zhiyuan Liu, Huanbo Luan, Maosong Sun, Siwei Rao, and Song Liu, "Modeling relation paths for representation learning of knowledge bases," arXiv preprint arXiv:1506.00379, 2015.

[18] Karl Breitung, "Probability approximations by log likelihood maximization," Journal of Engineering Mechanics, vol. 117, no. 3, pp. 457–477, 1991.

[19] Zhanqiu Zhang, Jianyu Cai, Yongdong Zhang, and Jie Wang, "Learning hierarchy-aware knowledge graph embeddings for link prediction," in Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020, pp. 3065–3072.

[20] Shikhar Vashishth, Soumya Sanyal, Vikram Nitin, Nilesh Agrawal, and Partha Talukdar, "InteractE: Improving convolution-based knowledge graph embeddings by increasing feature interactions," in Proceedings of the 34th AAAI Conference on Artificial Intelligence, 2020, pp. 3009–3016.

[21] Ivana Balažević, Carl Allen, and Timothy M. Hospedales, "TuckER: Tensor factorization for knowledge graph completion," arXiv preprint arXiv:1901.09590, 2019.

[22] Ali Sadeghian, Mohammadreza Armandpour, Patrick Ding, and Daisy Zhe Wang, "DRUM: End-to-end differentiable rule mining on knowledge graphs," in Advances in Neural Information Processing Systems, 2019, pp. 15347–15357.