
Kejriwal Knowledge Graph Tutorial - 2020-12-asonam-tutorial-KG

The document provides a comprehensive introduction to Knowledge Graphs (KGs), defining them as sets of triples that represent relationships between entities. It discusses the construction, completion, and applications of KGs, emphasizing the importance of information extraction techniques and entity resolution. Additionally, it highlights the relevance of KGs across various disciplines, including web and information retrieval, semantic web, and knowledge discovery.


Knowledge Graphs: A Practical Introduction across Disciplines

Mayank Kejriwal
University of Southern California
December, 2020
Agenda
• Knowledge Graphs: Definitions and Examples
• Knowledge Graph Construction
• Cross-disciplinary Perspectives and Applications
• Summary and Research Trends
• Landscape of Important Findings

What is a Knowledge Graph?

Set of triples, where each triple (h, r, t) represents a relationship r between head entity h and tail entity t

(Barack Obama, wasBornOnDate, 1961-08-04),
(Barack Obama, hasGender, male),
...
(Hawaii, hasCapital, Honolulu),
...
(Michelle Obama, livesIn, United States)
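
The triple definition above can be sketched directly in code. The following is a minimal illustration, assuming a knowledge graph small enough to hold in memory as a Python set of (head, relation, tail) tuples; the helper names `tails` and `relations_of` are hypothetical and are not part of the tutorial.

```python
# Toy knowledge graph: a set of (head, relation, tail) triples taken from the slide.
kg = {
    ("Barack Obama", "wasBornOnDate", "1961-08-04"),
    ("Barack Obama", "hasGender", "male"),
    ("Hawaii", "hasCapital", "Honolulu"),
    ("Michelle Obama", "livesIn", "United States"),
}


def tails(kg, head, relation):
    """Return every tail t such that (head, relation, t) is in the graph."""
    return {t for (h, r, t) in kg if h == head and r == relation}


def relations_of(kg, head):
    """Return the set of relation labels on edges leaving the given head entity."""
    return {r for (h, r, _) in kg if h == head}


capital = tails(kg, "Hawaii", "hasCapital")
obama_relations = relations_of(kg, "Barack Obama")
```

Because the graph is a set, adding the same triple twice has no effect, which matches the "set of triples" definition.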
What is a Knowledge Graph?

Technically, a multi-relational directed labeled graph with semantics

Both edges and nodes have labels, but not all labels are equal (literals vs. identifiers)

Where do the semantics come from?
• Complex question, only starting to be understood
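
One possible way to make the "directed labeled graph" view concrete is sketched below. It assumes two hypothetical wrapper classes, `Identifier` and `Literal`, to mark the two kinds of node labels; neither class comes from the tutorial or from any particular library.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """A node that names an entity and can have outgoing edges of its own."""
    name: str


@dataclass(frozen=True)
class Literal:
    """A plain value (date, string, number) that only appears as an edge target."""
    value: str


obama = Identifier("Barack Obama")
michelle = Identifier("Michelle Obama")
hawaii = Identifier("Hawaii")
honolulu = Identifier("Honolulu")
usa = Identifier("United States")

triples = [
    (obama, "wasBornOnDate", Literal("1961-08-04")),
    (hawaii, "hasCapital", honolulu),
    (michelle, "livesIn", usa),
]

# Directed adjacency structure: head -> list of (edge label, tail).
# Using a list allows several differently labeled edges between the same pair of nodes.
out_edges = defaultdict(list)
for head, relation, tail in triples:
    out_edges[head].append((relation, tail))


def entity_neighbors(node):
    """Follow outgoing edges, keeping only tails that are identifiers."""
    return [tail for _, tail in out_edges[node] if isinstance(tail, Identifier)]
```

Here `entity_neighbors(obama)` is empty because the only outgoing edge ends in a literal, which is one practical consequence of labels not all being equal.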
More on semantics
Traditionally, semantics are believed to come from ontology
• An ontology is a ‘formal, explicit specification of a shared conceptualization’ (we will go deeper into this in a while)
• In philosophy, ontology is the ‘study of what there is’ including the study of the ‘most general features of what there is, and how the things there are relate to each other in the metaphysically most general ways’
Source: https://plato.stanford.edu/entries/logic-ontology/

More recently, in AI, we have started to recognize a more commonsense view of semantics guided by findings in linguistics and distributional semantics
Have I seen this before?
• Knowledge panel
• Recognition of user intent
• Recommendations
• Exploration-suggestions
A typical KGC workflow starts from corpus acquisition and ends with applications
• Corpus (usually documents, but also webpages, tables, reviews, social media…)
• KG Construction: information extraction, co-reference resolution, …
• KG
• KG Completion: entity resolution, knowledge graph embeddings, …
• Applications
INFORMATION EXTRACTION (IE)
Named Entity Recognition (NER)

Source: Named Entity Recognition and Classification with Scikit-Learn. https://towardsdatascience.com/named-entity-recognition-and-classification-with-scikit-learn-f05372f07ba2
Demo: displaCy
https://explosion.ai/demos/displacy-ent
NER workflows
Many methods proposed over the previous 3-4 decades:
• Rule-based
• Dictionary-based
• Simple machine learning
• Sequence labeling (e.g., using conditional random fields or, before that, hidden Markov models)

Today, deep learning methods designed for sequences (such as RNNs and, more recently, transformers) are state-of-the-art

Much research still remains (especially for social media!)
Source: Cho, H., and H. Lee. Biomedical named entity recognition using deep neural networks with contextual information. BMC Bioinformatics. 2019.
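
To illustrate the simplest family in the list above, here is a rough sketch of dictionary-based NER. The gazetteer contents, the label names, and the function `dictionary_ner` are all assumptions made for illustration; a real system would use a far larger dictionary and would need to handle ambiguity and spelling variation.

```python
import re

# Hypothetical gazetteer mapping known surface forms to entity types.
GAZETTEER = {
    "Barack Obama": "PERSON",
    "Michelle Obama": "PERSON",
    "Hawaii": "LOCATION",
    "Honolulu": "LOCATION",
}


def dictionary_ner(text, gazetteer=GAZETTEER):
    """Return (start, end, surface form, label) for each gazetteer match in text."""
    # Try longer names first so a longer entry wins over a shorter one it contains.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b")
    return [
        (m.start(), m.end(), m.group(0), gazetteer[m.group(0)])
        for m in pattern.finditer(text)
    ]


mentions = dictionary_ner("Barack Obama was born in Honolulu, Hawaii.")
```

Exact string lookup like this misses anything outside the dictionary, which is one reason the field moved toward sequence labeling and, later, deep learning methods.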
Other kinds of IE: Relation Extraction

Source: Stanford TACRED


Other kinds of IE: Open Information Extraction

Source: Zhu et al. Open Information Extraction with Global Structure Constraints.
ACM WWW Conference. 2018.
Is IE a solved problem?

Source: Liang et al. BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision. KDD Conference. 2020.
Other NLP steps: Coreference Resolution, Entity Linking…

Sources:
https://aryamccarthy.github.io/wiseman2016learning/ (Wiseman, Rush, and Shieber, 2016) at NAACL
Alokaili and Menai. SVM ensembles for named entity disambiguation. Computing. 2019.
KNOWLEDGE GRAPH COMPLETION
Entity Resolution
Algorithmically identifying and linking/grouping different manifestations of the same real-world object

Problem has existed for 50 years in many communities (databases, graphs, networks, tables…)

Source: Entity Resolution: Tutorial. Getoor and Machanavajjhala. VLDB, 2012
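
The "linking/grouping" idea can be sketched as pairwise matching followed by clustering. This is only a toy illustration, not the method from the cited tutorial: it assumes string similarity from Python's standard `difflib` module, an arbitrary threshold of 0.85, and a small union-find structure to group matched mentions. Practical systems typically add blocking and learned similarity functions.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Character-level similarity in [0, 1] after light normalization."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def resolve(mentions, threshold=0.85):
    """Group mentions whose pairwise similarity reaches the (assumed) threshold."""
    parent = {m: m for m in mentions}

    def find(m):
        # Walk up to the cluster representative, compressing the path on the way.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in combinations(mentions, 2):
        if similarity(a, b) >= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for m in mentions:
        clusters.setdefault(find(m), set()).add(m)
    return list(clusters.values())


clusters = resolve([
    "Barack Obama",
    "barack obama",
    "Barack H. Obama",
    "Michelle Obama",
    "Honolulu",
])
```

With these inputs the three spellings of Barack Obama end up in one cluster, while "Michelle Obama" stays separate despite the shared surname, which shows why the threshold choice matters.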


In the world of knowledge graphs
Representation Learning on Knowledge Graphs aka Knowledge Graph Embeddings

Knowledge graph embeddings:
• TransE, H…
• Neural tensor networks
• Graph convolutional networks (or their variants)
• Matrix factorization
• …
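
As a small illustration of the first item in the list, TransE treats a relation as a translation in embedding space: a triple (h, r, t) is considered plausible when h + r lies close to t. The sketch below scores triples with the negative Euclidean distance (TransE can also use the L1 norm). The two-dimensional vectors are hand-picked assumptions chosen so the example is easy to check; real embeddings are learned from data.

```python
import math


def transe_score(h, r, t):
    """Negative Euclidean distance between h + r and t; higher means more plausible."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


# Hand-picked toy embeddings (assumed for illustration, not learned).
embeddings = {
    "Hawaii": [1.0, 0.0],
    "Honolulu": [1.0, 1.0],
    "United States": [3.0, -2.0],
}
has_capital = [0.0, 1.0]


def rank_tails(head, relation, candidates):
    """Order candidate tail entities from most to least plausible."""
    return sorted(
        candidates,
        key=lambda t: transe_score(embeddings[head], relation, embeddings[t]),
        reverse=True,
    )


ranking = rank_tails("Hawaii", has_capital, ["United States", "Honolulu"])
```

Ranking candidate tails in this way is how embedding models are commonly used for KG completion: the top-ranked entity is proposed as the missing tail.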
KGEs (results)
Useful resources:
• OpenKE: http://139.129.163.161//index/toolkits#pretrained-embeddings
• StarSpace: https://github.com/facebookresearch/StarSpace
• Recent transformer-based models could potentially be adapted, including BERT and RoBERTa: https://ai.facebook.com/blog/roberta-an-optimized-method-for-pretraining-self-supervised-nlp-systems/
Other proposals: knowledge graph identification using probabilistic soft logic

Examples of ontological constraints


Open-source KGs that have been built
CROSS-DISCIPLINARY PERSPECTIVES: WEB AND INFORMATION RETRIEVAL
Google Knowledge Graph

Domain-specific search (DSS)

Emerging opportunities for DSS
• Fighting human trafficking
• Predicting cyberattacks
• Accurate geopolitical forecasting
• Stopping Penny Stock Fraud
DARPA/IARPA programs
• DARPA Memex
• IARPA Hybrid Forecasting Competition
• DARPA AIDA
• DARPA Causal Exploration
• DARPA LORELEI
• IARPA CAUSE
Application areas shown: fighting human trafficking, predicting cyberattacks, accurate geopolitical forecasting, stopping penny stock fraud
Research Question
General Search: Google Knowledge Graph
DSS: Domain-Specific Knowledge Graphs

How do we construct domain-specific knowledge graphs over web data for powerful DSS applications?

Knowledge Graphs for DSS
Domain-specific Insight Graphs
Many examples in industry and non-profit
Commercial domains: Amazon Product Graph

Source: Dong, Luna. Building a Broad Knowledge Graph for Products. Keynote at ICDE. 2019
Another example: COVID-19
Other COVID-19 KG examples

Source: CovidGraph
Source: Verizon Media https://github.com/yahoo/covid-19-dashboard

Further reading: Kejriwal, M. (2020). Knowledge Graphs and COVID-19: Opportunities, Challenges, and Implementation. Harvard Data Science Review.
CROSS-DISCIPLINARY PERSPECTIVES: SEMANTIC WEB
What is (or even isn’t) a domain?

Some dictionary definitions
• (Merriam Webster) A sphere of knowledge, influence or activity
• (Oxford) A specified sphere of activity or knowledge

Specifying the sphere
• Rules
• Scope (e.g., the legal system)
• Syllabi (for classrooms)
• Examples

How do domain experts specify the sphere?
• Examples
• Ontology
Modeling domains: Ontologies

Source: “Ontologies and semantic web.” Stanley Wang. https://www.slideshare.net/stanleywanguni/ontologies-and-semantic-web

Examples of ontologies
• Agency domain
• Friend-of-a-friend

Ontologies are big in Science
Representation of knowledge graphs (and ontologies)
An RDF graph is a set of triples, where each triple is of the form (subject, predicate, object):
• Subjects must be URIs (technically, internationalized resource identifiers; in practice, just Uniform Resource Locators)
• Predicates (also called ‘properties’) must be URIs
• Objects can be either URIs or literals (strings, numbers, dates…)

In the Semantic Web, RDF is the ‘building block’ of higher-order vocabularies (such as RDF Schema and OWL) that can be used to represent ontologies
Example of RDF KG

As a set of triples (the slide also shows the same data drawn as a graph):
(http://www.example.org/~joe/contact.rdf#joesmith, http://www.w3.org/1999/02/22-rdf-syntax-ns#type, http://xmlns.com/foaf/0.1/Person)
(http://www.example.org/~joe/contact.rdf#joesmith, http://xmlns.com/foaf/0.1/givenname, “Joe”)
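
The RDF rules and the Joe Smith example can be sketched without any RDF library. The classes `URI` and `Literal` and the helpers `make_triple` and `to_ntriples_line` are hypothetical names introduced here; the serialization follows the basic N-Triples pattern (angle brackets around URIs, double quotes around literals, a trailing dot) but handles only a simplified subset of literal escaping and ignores datatypes and language tags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


def make_triple(subject, predicate, obj):
    """Build a triple, enforcing the rules stated on the slide."""
    if not isinstance(subject, URI):
        raise TypeError("subject must be a URI")
    if not isinstance(predicate, URI):
        raise TypeError("predicate must be a URI")
    if not isinstance(obj, (URI, Literal)):
        raise TypeError("object must be a URI or a literal")
    return (subject, predicate, obj)


def to_ntriples_line(triple):
    """Render one triple as an N-Triples-style line (simplified escaping only)."""
    def render(term):
        if isinstance(term, URI):
            return f"<{term.value}>"
        escaped = term.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return " ".join(render(term) for term in triple) + " ."


joe = URI("http://www.example.org/~joe/contact.rdf#joesmith")
type_triple = make_triple(
    joe,
    URI("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"),
    URI("http://xmlns.com/foaf/0.1/Person"),
)
name_triple = make_triple(joe, URI("http://xmlns.com/foaf/0.1/givenname"), Literal("Joe"))
graph = {type_triple, name_triple}
name_line = to_ntriples_line(name_triple)
```

In practice a dedicated RDF library would be the usual choice; the point here is only that the subject/predicate/object constraints are simple enough to check mechanically.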


Web Ontology Language (OWL)
OWL builds on RDF (and another layer called RDF Schema or RDFS) to provide a systematic vocabulary for defining ontologies

Because OWL builds on RDF, every OWL ontology is also an RDF graph, but not necessarily vice-versa

https://www.w3.org/TR/owl-features/
Reasoning over knowledge graphs

Source: Bergman. Platforms and Knowledge Management. 2018
Example tool for reasoning and ontologies: Protege

Source: https://protege.stanford.edu/
Putting it all together: Semantic Web Layer Cake
CROSS-DISCIPLINARY PERSPECTIVES: KNOWLEDGE DISCOVERY & DATA MINING
Source: Dong, Luna. Building a Broad Knowledge Graph for Products. Keynote at ICDE. 2019
Others
Scientific Text Mining
Jiang, M., & Shang, J. (2020, August). Scientific Text Mining and Knowledge Graphs. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 3537-3538).

Question Answering
Hixon, B., Clark, P., & Hajishirzi, H. (2015). Learning knowledge graphs for question answering through conversational dialog. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 851-861).

Recommendation Systems
Oramas, S., Ostuni, V. C., Noia, T. D., Serra, X., & Sciascio, E. D. (2016). Sound and music recommendation with knowledge graphs. ACM Transactions on Intelligent Systems and Technology (TIST), 8(2), 1-21.

Summarization
Gunaratna, K., Yazdavar, A. H., Thirunarayan, K., Sheth, A., & Cheng, G. (2017, August). Relatedness-based multi-entity summarization. In IJCAI: proceedings of the conference (Vol. 2017, p. 1060). NIH Public Access.

Truth/fact-checking
Shiralkar, P., Flammini, A., Menczer, F., & Ciampaglia, G. L. (2017, November). Finding streams in knowledge graphs to support fact checking. In 2017 IEEE International Conference on Data Mining (ICDM) (pp. 859-864). IEEE.
Open Knowledge Network (OKN)
Technology companies develop proprietary knowledge networks as key business technologies today. However, because these networks are proprietary and expensive to construct, government, academia, small businesses, and nonprofits do not have access to them. In contrast, an open knowledge network (OKN) would be available to all stakeholders, including the researchers who will help push this technology further. An OKN requires a nonproprietary, public–private development effort that spans the entire data science community and will result in an open, shared infrastructure.

https://www.nitrd.gov/pubs/Open-Knowledge-Network-Workshop-Report-2018.pdf
https://www.nitrd.gov/news/Open-Knowledge-Network-Workshop-Report-2018.aspx
Knowledge, semantics and context: what are they and how do we better define/represent them?
Explainable AI

Source: Knowledge Graphs For eXplainable AI. On the Integration of Semantic Technologies and Symbolic Systems into Deep Learning Models for a More Comprehensible Artificial Intelligence. https://towardsdatascience.com/knowledge-graphs-for-explainable-ai-dcd73c5c016
WRAPUP
Many applications and open research areas!

Information retrieval
Semantic Web
Recommender systems
Knowledge discovery/data mining
?
Numerous surveys, some more technical/field-specific
• Ehrlinger, L., & Wöß, W. (2016). Towards a Definition of Knowledge Graphs. SEMANTiCS (Posters, Demos, SuCCESS), 48, 1-4.
• Noy, N., Gao, Y., Jain, A., Narayanan, A., Patterson, A., & Taylor, J. (2019). Industry-scale knowledge graphs: lessons and challenges. Queue, 17(2), 48-75.
• Nickel, M., Murphy, K., Tresp, V., & Gabrilovich, E. (2015). A review of relational machine learning for knowledge graphs. Proceedings of the IEEE, 104(1), 11-33.
• Ji, S., Pan, S., Cambria, E., Marttinen, P., & Yu, P. S. (2020). A survey on knowledge graphs: Representation, acquisition and applications. arXiv preprint arXiv:2002.00388.
• Paulheim, H. (2017). Knowledge graph refinement: A survey of approaches and evaluation methods. Semantic Web, 8(3), 489-508.

Upcoming:
Knowledge Graphs: Fundamentals, Techniques, and Applications (Adaptive Computation and Machine Learning series). Kejriwal, Knoblock and Szekely.
Q&A
[email protected]
