Exploring Large Language Models For Knowledge Graph Completion
Abstract

In this study, we explore utilizing Large Language Models (LLMs) for knowledge graph completion. We consider triples in knowledge graphs as text sequences and introduce an innovative framework called Knowledge Graph LLM (KG-LLM) to model these triples. Our technique employs entity and relation descriptions of a triple as prompts and utilizes the response for predictions. Experiments on various benchmark knowledge graphs demonstrate that our method attains state-of-the-art performance in tasks such as triple classification and relation prediction. We also find that fine-tuning relatively smaller models (e.g., LLaMA-7B, ChatGLM-6B) outperforms recent ChatGPT and GPT-4.

1 Introduction

Large knowledge graphs (KG) like FreeBase (Bollacker et al., 2008), YAGO (Suchanek et al., 2007), and WordNet (Miller, 1995) serve as a powerful foundation for numerous critical AI tasks, including semantic search, recommendation (Zhang et al., 2016), and question answering (Cui et al., 2017). A KG is generally a multi-relational graph with entities as nodes and relations as edges. Each edge is depicted as a triplet (head entity, relation, tail entity), abbreviated as (h, r, t), signifying the relationship between two entities, for instance, (Steve Jobs, founded, Apple Inc.). Despite their effectiveness, knowledge graphs remain incomplete. This issue leads to the challenge of knowledge graph completion, which aims to evaluate the plausibility of triples that are not present in a knowledge graph.

A significant amount of research has been dedicated to knowledge graph completion. One prevalent method is knowledge graph embedding (Wang et al., 2017). However, most knowledge graph embedding models solely rely on structural information from observed triple facts, leading to issues with the sparsity of knowledge graphs. A number of studies have incorporated textual information to enhance knowledge representation (Wang and Li, 2016; Xu et al., 2017; An et al., 2018). Our previous work KG-BERT (Yao et al., 2019) first employs the pre-trained language model BERT (Devlin et al., 2019) to encode prior knowledge and contextual information. The KG-BERT model was extended by several recent studies (Wang et al., 2021, 2022; Lovelace and Rose, 2022) on both efficiency and performance, but the language models used in these works are relatively small.

Recently, large language models (Zhao et al., 2023) like ChatGPT and GPT-4 (OpenAI, 2023) have gained significant attention. Researchers find that scaling pre-trained language models often leads to improved model capacity on downstream tasks. These large-sized models show different behaviors from smaller models like BERT and display surprising abilities in solving a series of complex tasks.

In this study, we propose a novel method for knowledge graph completion using large language models. Specifically, we treat entities, relations, and triples as textual sequences and model knowledge graph completion as a sequence-to-sequence problem. We perform instruction tuning with open LLMs (LLaMA (Touvron et al., 2023) and ChatGLM (Du et al., 2022)) on these sequences for predicting the plausibility of a triple or a candidate entity/relation. The method achieves stronger performance in several KG completion tasks. Our source code is available at: https://github.com/yao8839836/kg-llm. Our contributions are summarized as follows:

• We propose a new language modeling method for knowledge graph completion. To the best of our knowledge, this is the first study to systematically investigate large language models for KG completion tasks.
• Results on several benchmarks show that our method achieves state-of-the-art results in triple classification and relation prediction. We also find that fine-tuning relatively smaller models (e.g., LLaMA-7B, ChatGLM-6B) can outperform recent ChatGPT and GPT-4.

2 Related Work

2.1 Knowledge Graph Completion

Comprehensive reviews of knowledge graph completion techniques have been carried out by Wang et al. (2017) and Ji et al. (2021). These techniques can be grouped into two categories based on their scoring functions for a triple (h, r, t): translational distance models like TransE (Bordes et al., 2013) and semantic matching models like DistMult (Yang et al., 2015). Convolutional neural networks have also demonstrated promising results in knowledge graph completion (Dettmers et al., 2018; Nguyen et al., 2018; Nathani et al., 2019).

The methods mentioned above perform knowledge graph completion using only the structural information found in triples. However, incorporating various types of external information, such as entity types, logical rules, and textual descriptions, can enhance performance (Wang et al., 2017; Ji et al., 2021). For textual descriptions, Socher et al. (2013) initially represented entities by averaging the word embeddings in their names, with the embeddings learned from an external corpus. Wang et al. (2014a) suggested embedding entities and words in the same vector space by aligning Wikipedia anchors with entity names. Xie et al. (2016) employed convolutional neural networks (CNN) to encode word sequences in entity descriptions. There are also a number of studies in this line of work (Xiao et al., 2017; Wang and Li, 2016; Xu et al., 2017; An et al., 2018). Yao et al. (2019) proposed KG-BERT, which improves the above methods with pre-trained language models (PLMs). Recently, Wang et al. (2021, 2022) and Lovelace and Rose (2022) extended the cross-encoder in KG-BERT to a bi-encoder, which enhances performance and inference efficiency. Similar to this work, KGT5 (Saxena et al., 2022) and KG-S2S (Chen et al., 2022) treat KG completion as a sequence-to-sequence task. However, the pre-trained language models used in these studies are relatively small. Compared with these methods, our method utilizes more powerful large language models with emergent abilities not present in small models, such as in-context learning, instruction following, and step-by-step reasoning. These abilities are helpful for KG completion tasks.

2.2 LLMs with KG Completion

Zhao et al. (2023) recently presented a comprehensive survey of LLMs that describes knowledge completion as a basic evaluation task for LLMs. Two closely related studies (Xie et al., 2023; Zhu et al., 2023) evaluate ChatGPT and GPT-4 on a link prediction task in KG. Our study is inspired by these works, but we further provide more comprehensive results for KG completion and perform instruction tuning on three tasks.

3 Method

3.1 Knowledge Graph Completion Tasks

In this section, we describe the three tasks in knowledge graph completion: triple classification, relation prediction, and entity (link) prediction, and how we transform them into simple prompt questions for an LLM to complete. The entire process is depicted in Figure 1.

Triple Classification. Given a triple (h, r, t), the task is to classify it as correct or incorrect. For example, given the triple <Steve Jobs, founded, Apple Inc.>, the task is to classify it as correct. The prompt would be "Is this true: Steve Jobs founded Apple Inc.?", and the ideal output of the LLM would be "Yes, this is true."

Relation Prediction. Given a head entity and a tail entity, the task is to predict the relationship between them. For example, given the head entity "Steve Jobs" and the tail entity "Apple Inc.", the task is to predict that their relationship is "founded". The prompt would be "What is the relationship between Steve Jobs and Apple Inc.? Please choose your answer from: was born in | founded | is citizen of | ... | plays for.", and the desired response would be "Steve Jobs founded Apple Inc."

Entity (link) Prediction. Given a head entity and a relationship, the task is to predict the tail entity related to the head entity; given a tail entity and a relationship, the task is to predict the head entity. For example, given the head entity "Steve Jobs" and the relationship "founded", the task is to predict the tail entity "Apple Inc.". The prompt would be "Steve Jobs founded" for asking the tail entity and "What/Who/When/Where/Why founded Apple Inc.?" for asking the head entity.
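As a concrete illustration of these prompt formats, the sketch below constructs the three kinds of prompts from a raw triple in Python. It is a minimal example assuming plain string templates; the function names and exact wording are illustrative and not taken verbatim from the released kg-llm code.

```python
# Minimal sketch of the three prompt formats described in Section 3.1.
# Function names and template wording are illustrative assumptions,
# not necessarily identical to the released kg-llm code.

def triple_classification_prompt(head, relation, tail):
    # e.g. "Is this true: Steve Jobs founded Apple Inc.?"
    return f"Is this true: {head} {relation} {tail}?"

def relation_prediction_prompt(head, tail, candidate_relations):
    # Candidate relations are listed in the prompt, separated by " | ".
    choices = " | ".join(candidate_relations)
    return (f"What is the relationship between {head} and {tail}? "
            f"Please choose your answer from: {choices}.")

def tail_entity_prediction_prompt(head, relation):
    # e.g. "Steve Jobs founded" -> the model completes the tail entity.
    return f"{head} {relation}"

def head_entity_prediction_prompt(relation, tail):
    # e.g. "What/Who/When/Where/Why founded Apple Inc.?" -> the model names the head entity.
    return f"What/Who/When/Where/Why {relation} {tail}?"

if __name__ == "__main__":
    print(triple_classification_prompt("Steve Jobs", "founded", "Apple Inc."))
    print(relation_prediction_prompt("Steve Jobs", "Apple Inc.",
                                     ["was born in", "founded", "is citizen of", "plays for"]))
    print(tail_entity_prediction_prompt("Steve Jobs", "founded"))
    print(head_entity_prediction_prompt("founded", "Apple Inc."))
```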
Figure 1: Illustrations of Large Language Models (LLMs) for Knowledge Graph (KG) Completion. (The figure shows the input triple <Steve Jobs, founded, Apple Inc.> converted into example prompts and ideal responses for triple classification and entity prediction.)
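As described in the introduction, such prompt-response pairs are used to instruction-tune open LLMs (LLaMA and ChatGLM). The snippet below is a minimal sketch of how triple classification training records might be assembled; the JSON field names (instruction/output) and the tail-corruption negative sampling are common conventions assumed here for illustration, not details confirmed by the paper.

```python
# Sketch: converting KG triples into instruction-tuning records for triple classification.
# The record layout and negative-sampling scheme are assumptions for illustration.
import json
import random

def build_triple_classification_records(triples, all_entities, negatives_per_positive=1):
    """Create positive records from observed triples and negatives by corrupting the tail."""
    records = []
    for head, relation, tail in triples:
        records.append({"instruction": f"Is this true: {head} {relation} {tail}?",
                        "output": "Yes, this is true."})
        for _ in range(negatives_per_positive):
            corrupted_tail = random.choice(all_entities)
            if corrupted_tail == tail:
                continue  # skip accidental positives
            records.append({"instruction": f"Is this true: {head} {relation} {corrupted_tail}?",
                            "output": "No, this is not true."})
    return records

if __name__ == "__main__":
    triples = [("Steve Jobs", "founded", "Apple Inc.")]
    entities = ["Apple Inc.", "Microsoft", "Pixar"]
    records = build_triple_classification_records(triples, entities)
    with open("triple_classification_train.json", "w") as f:
        json.dump(records, f, indent=2)
```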
Table 6: Example outputs from different LLMs. The first line is taken from FB13-100 and the second line is from YAGO3-10-100.
Haoyu Wang, Vivek Kulkarni, and William Yang Wang. 2018. Dolores: Deep contextualized knowledge graph embeddings. arXiv preprint arXiv:1811.00147.

Zhao Zhang, Fuzhen Zhuang, Meng Qu, Fen Lin, and Qing He. 2018. Knowledge graph embedding with hierarchical relation structure. In EMNLP, pages 3198–3207.

Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, et al. 2023. A survey of large language models. arXiv preprint arXiv:2303.18223.

Yuqi Zhu, Xiaohan Wang, Jing Chen, Shuofei Qiao, Yixin Ou, Yunzhi Yao, Shumin Deng, Huajun Chen, and Ningyu Zhang. 2023. LLMs for knowledge graph construction and reasoning: Recent capabilities and future opportunities.
A Example Input
An example input for LLM relation prediction from YAGO3-10: What is the relationship between Sergio Padt and Jong Ajax? Please choose your answer from: is known for|is citizen of|has currency|has child|deals with|has academic advisor|has gender|wrote music for|acted in|died in|has capital|works at|lives in|is affiliated to|has musical role|is located in|happened in|has official language|created|has won prize|influences|is politician of|is connected to|owns|graduated from|was born in|is leader of|exports|is interested in|participated in|directed|imports|edited|has neighbor|has website|is married to|plays for.