Knowledge Graph Tutorial
Knowledge Graph Primer
TOPICS:
What is a knowledge graph?
Why are knowledge graphs important?
Where do knowledge graphs come from?
Knowledge representation choices
Problem overview
What is a knowledge graph?
Essentially, a KG is a semantic network that models entities (including their properties) and the relations between them.
• Knowledge in graph form!
• Captures entities, attributes, and relationships
• Nodes are entities
• Nodes are labeled with attributes (e.g., types)
• Typed edges between two nodes capture a relationship between entities
[Figure: entity nodes E1, E2, E3, each labeled with attributes (A1, A2) and connected by typed edges]
Example knowledge graph
• Nodes are entities: John Lennon, Beatles, Liverpool
• Nodes are labeled with attributes (e.g., types): John Lennon (person), Beatles (band), Liverpool (place)
• Typed edges between two nodes capture a relationship between entities
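A minimal sketch of how such a graph could be held in memory: nodes carry type labels and edges are typed triples. The class name, method names, and the two edge labels (memberOf, bornIn) are assumptions made for illustration; the slide figure does not name its edges.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    node_types: dict = field(default_factory=dict)  # entity -> set of type labels
    edges: set = field(default_factory=set)         # (head, relation, tail) triples

    def add_entity(self, name, *types):
        self.node_types.setdefault(name, set()).update(types)

    def add_edge(self, head, relation, tail):
        # Both endpoints must already be entities (nodes) in the graph.
        if head not in self.node_types or tail not in self.node_types:
            raise KeyError("unknown entity")
        self.edges.add((head, relation, tail))

    def neighbors(self, head, relation):
        return sorted(t for h, r, t in self.edges if h == head and r == relation)


kg = KnowledgeGraph()
kg.add_entity("John Lennon", "person")
kg.add_entity("Beatles", "band")
kg.add_entity("Liverpool", "place")
kg.add_edge("John Lennon", "memberOf", "Beatles")  # hypothetical edge label
kg.add_edge("John Lennon", "bornIn", "Liverpool")  # hypothetical edge label
```

Storing edges as a set of triples keeps the structure close to the (subject, predicate, object) view used by RDF.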
Why knowledge graphs?
• Humans:
• Combat information overload
• Explore via intuitive structure
• Tool for supporting knowledge-driven tasks
• AIs:
• Key ingredient for many AI tasks
• Bridge from data to human semantics
• Use decades of work on graph analysis
Interdisciplinary Research
• Database: RDF databases, data integration, knowledge fusion
• Knowledge engineering: KB construction, rule-based reasoning
Knowledge Graphs & Industry
• Google Knowledge Graph
  ◦ Google Knowledge Vault
• Amazon Product Graph
• Facebook Graph API
• IBM Watson
• Microsoft Satori
  ◦ Project Hanover/Literome
• LinkedIn Knowledge Graph
• Yandex Object Answer
• Diffbot, GraphIQ, Maana, ParseHub, Reactor Labs, SpazioDati
Where do knowledge graphs come from?
• Structured Text
◦ Wikipedia Infoboxes, tables,
databases, social nets
• Unstructured Text
◦ WWW, news, social media,
reference articles
• Images
• Video
◦ YouTube, video feeds
Knowledge Representation
• Decades of research into knowledge representation
RDF and Semantic Web
• April ’14: 1091 datasets, ??? triples
• … can be defined
• Resources can be contributed by …
• Example triple (from figure): y:Abraham Lincoln, diedIn, y:Washington DC
RDF Data Model
• Triple: Subject, Predicate (Property), Object, written (s, p, o)
  ◦ Subject: the entity that is described (URI or blank node)
  ◦ Predicate: a feature of the entity (URI)
  ◦ Object: value of the feature (URI, blank node or literal)
• $(s, p, o) \in (U \cup B) \times U \times (U \cup B \cup L)$
  ◦ U: set of URIs
  ◦ B: set of blank nodes
  ◦ L: set of literals
• A set of RDF triples is called an RDF graph
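The membership condition for (s, p, o) can be checked mechanically. The sketch below is an illustration in plain Python, not an RDF library; the class names, the function name, and the exact identifier spellings (such as y:Abraham_Lincoln) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    value: str


def is_valid_triple(s, p, o):
    """Check that (s, p, o) lies in (U ∪ B) × U × (U ∪ B ∪ L)."""
    return (isinstance(s, (URI, BlankNode))
            and isinstance(p, URI)
            and isinstance(o, (URI, BlankNode, Literal)))


lincoln = URI("y:Abraham_Lincoln")  # identifier spelling is illustrative
# An RDF graph is simply a set of triples.
graph = {
    (lincoln, URI("y:diedIn"), URI("y:Washington_DC")),
    (lincoln, URI("y:diedOnDate"), Literal("1865-04-15")),
}
```

A literal may appear only in the object position, so a triple such as (Literal, URI, URI) is rejected.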
[Figure: example RDF graph. Resources: y:Marilyn Monroe, y:Franklin Roosevelt, y:Washington D.C., y:United States, y:Reese Witherspoon, y:New Orleans LA, y:Hyde Park NY. Predicates: bornOnDate, diedOnDate, diedIn, bornIn, locatedIn, hasCapital, hasName, gender, title, foundingYear. Literals include “Marilyn Monroe”, “1962-08-05”, “Franklin D. Roosevelt”, “Male”, “Female”, “President”, “Actress”, “1976-03-22”, “1776”, “1790”, “1718”, “1810”.]
A Distributed RDF Graph
[Figure: an RDF graph spread over four sources (prefixes s1: to s4:). Film, actor, director, place and award resources are connected by actedIn, directed, isMarriedTo, livesIn and hasWonPrize edges, with rdfs:label literals such as Woody Allen, Louise Lasser, Hank Azaria, Hugh Grant, Nancy Kelly, Edmond O’Brien, Mystery Men, Sleeper, Small Time Crooks, Slither, Man-Trap, Mary Hartman, New York and Cesar Award.]
Representative graph processing systems
System         Property graphs  Online query  Data sharding  In-memory storage  Atomicity & transaction
Neo4j          Yes              Yes           No             No                 Yes
Trinity        Yes              Yes           Yes            Yes                Atomicity
Horton         Yes              Yes           Yes            Yes                No
HyperGraphDB   No               Yes           No             No                 Yes
FlockDB        No               Yes           Yes            No                 Yes
TinkerGraph    Yes              Yes           No             Yes                No
InfiniteGraph  Yes              Yes           Yes            No                 Yes
Cayley         Yes              Yes           SB             SB                 Yes
Titan          Yes              Yes           SB             SB                 Yes
MapReduce      No               No            Yes            No                 No
PEGASUS        No               No            Yes            No                 No
Pregel         No               No            Yes            No                 No
Giraph         No               No            Yes            No                 No
GraphLab       No               No            Yes            No                 No
GraphChi       No               No            No             No                 No
GraphX         No               No            Yes            No                 No
DB-Engines Ranking of Graph DBMS
• The Cypher query language is used by Neo4j.
• Gremlin is used by most graph DBMSs.
• GSQL is used by TigerGraph.
https://db-engines.com/en/ranking/graph+dbms
Basic problems
• Who are the entities (nodes) in the graph?
• What are their attributes and types (labels)?
• How are they related (edges)?
Knowledge Graph Construction
Knowledge Extraction → Graph Construction
Two perspectives
• Extraction graph
• Knowledge graph
What is NLP?
• Input text: unstructured; ambiguous; lots and lots of it!
• Information extraction turns it into “knowledge”: structured; precise, actionable; specific to the task
Knowledge Extraction
Text: John was born in Liverpool, to Julia and Alfred Lennon.
Information extraction produces the extraction graph:
• birthplace(John Lennon, Liverpool)
• childOf(John Lennon, Julia Lennon)
• childOf(John Lennon, Alfred Lennon)
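As a rough illustration of going from text to an extraction graph, the sketch below uses one hand-written regular expression for this single sentence shape. It is a toy, not the tutorial's extractor: the pattern, the function name, and the assumption that the child shares the parents' surname are all hypothetical.

```python
import re

# One hand-written pattern for sentences shaped like the example; real
# systems rely on parsing and learned extractors instead.
PATTERN = re.compile(
    r"(?P<child>\w+) was born in (?P<place>\w+), "
    r"to (?P<p1>\w+) and (?P<p2>\w+) (?P<surname>\w+)\."
)


def extract(sentence):
    """Return a set of (subject, relation, object) triples, or an empty set."""
    m = PATTERN.search(sentence)
    if m is None:
        return set()
    surname = m["surname"]
    child = f"{m['child']} {surname}"  # assumes the child shares the surname
    return {
        (child, "birthplace", m["place"]),
        (child, "childOf", f"{m['p1']} {surname}"),
        (child, "childOf", f"{m['p2']} {surname}"),
    }


triples = extract("John was born in Liverpool, to Julia and Alfred Lennon.")
```

A pattern this specific breaks on almost any rephrasing, which is one reason extracted graphs tend to be incomplete and noisy.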
Breaking it Down
Information extraction builds on dependency parsing, part-of-speech tagging, named entity recognition…
John/NNP was/VBD born/VBD in/IN Liverpool/NNP to/TO Julia/NNP and/CC Alfred/NNP Lennon/NNP
Tagging the Parts of Speech
Detecting Named Entities
NLP annotations → features for IE
Combine tokens, dependency paths, and entity types to define rules.
[Figure: dependency parse with arc labels such as appos, nmod, det, case]
Entity Names: Two Main Problems
• Entities with the same name
• Different names for the same entity
Entity Linking Approach
Example: Washington drops 10 points after game with UCLA Bruins.
Information Extraction
[Figure: extraction graph over John Lennon, Liverpool, Julia Lennon and Alfred Lennon with birthplace, childOf and spouse edges]
Information Extraction
3 CONCRETE SUB-PROBLEMS
• Defining domain
• Learning extractors
• Scoring the facts
3 LEVELS OF SUPERVISION
• Supervised
• Semi-supervised
• Unsupervised
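IE systems in practice
• Stages: defining domain → learning extractors → scoring candidate facts → fusing extractors
• Systems compared: ConceptNet, Knowledge Vault, OpenIE
[Figure: per-system choices for each stage; the only surviving cell label is "Classifier"]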
Knowledge Extraction: Key Points
• Built on the foundation of NLP techniques
  ◦ Part-of-speech tagging, dependency parsing, named entity recognition, coreference resolution…
  ◦ Challenging problems with very useful outputs
• Information extraction techniques use NLP to:
  ◦ define the domain
  ◦ extract entities and relations
  ◦ score candidate outputs
• Trade-off between manual & automatic methods
Knowledge Graph Construction
TOPICS:
Problem setting
Probabilistic models
Embedding techniques
Reminder: Basic problems
• Who are the entities (nodes) in the graph?
Graph Construction Issues
Extracted knowledge is:
• ambiguous
  ◦ Ex: Beetles, beetles, Beatles
  ◦ Ex: citizenOf, livedIn, bornIn
• incomplete
  ◦ Ex: missing relationships
  ◦ Ex: missing labels
  ◦ Ex: missing entities
• inconsistent
  ◦ Ex: Cynthia Lennon, Yoko Ono (two spouse edges)
  ◦ Ex: exclusive labels (alive, dead)
  ◦ Ex: domain-range constraints
Graph Construction approach
• Graph construction cleans and completes the extraction graph
Graph Construction: Probabilistic Models
TOPICS:
Overview
Graphical models
Random walk methods
Beyond Pure Reasoning
Graphical Models: Overview
•Define joint probability distribution on knowledge graphs
Knowledge Graph Identification
Define a graphical model to perform all three of these tasks simultaneously!
• Who are the entities (nodes) in the graph?
• What are their attributes and types (labels)?
• How are they related (edges)?
PUJARA+ISWC13
Probabilistic Models
• Use dependencies between facts in KG
What determines probability?
• Statistical signals from text extractors and classifiers
  ◦ P(R(John,Spouse,Yoko))=0.75; P(R(John,Spouse,Cynthia))=0.25
  ◦ LevenshteinSimilarity(Beatles, Beetles) = 0.9
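One way a string-similarity signal of this kind could be computed is sketched below. The normalization (one minus edit distance divided by the longer length) is an assumption; the slides do not say how their LevenshteinSimilarity is defined, and under this definition Beatles/Beetles scores 6/7, close to the quoted 0.9.

```python
def levenshtein(a, b):
    """Edit distance with unit-cost insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


sim = similarity("Beatles", "Beetles")
```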
Example: The Fab Four
Illustration of KG Identification
Uncertain extractions:
  .5: Lbl(Fab Four, novel)
  .7: Lbl(Fab Four, musician)
  .9: Lbl(Beatles, musician)
  .8: Rel(Beatles, AlbumArtist, Abbey Road)
Ontology:
  Dom(albumArtist, musician)
  Mut(novel, musician)
Entity resolution:
  SameEnt(Fab Four, Beatles)
(Annotated) extraction graph: nodes Fab Four and Beatles joined by SameEnt, with label nodes musician and novel and the entity Abbey Road
Facts shown in the final figure: Lbl(Beatles, musician); Lbl(Fab Four, musician); Rel(Fab Four, AlbumArtist, Abbey Road); Lbl(Fab Four, novel)
PUJARA+ISWC13; PUJARA+AIMAG15
Defining graphical models
•Many options for defining a graphical model
•We focus on two approaches, MLNs and PSL, that use rules
Rules for KG Model
100: Subsumes(L1,L2) & Label(E,L1) -> Label(E,L2)
100: Exclusive(L1,L2) & Label(E,L1) -> !Label(E,L2)
Example ground rule:
[φ3] SameEnt(Beatles, FabFour) ∧ Lbl(Beatles, musician) ⇒ Lbl(FabFour, musician)
PUJARA+ISWC13; PUJARA+AIMAG15
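To make the idea of a weighted rule over soft truth values concrete, here is a small sketch in the style of PSL, which relaxes logical rules with Łukasiewicz operators and penalizes a ground rule by its distance to satisfaction. The function names, the truth values, and the weight are illustrative assumptions, not values from the cited papers.

```python
def body_truth(values):
    """Lukasiewicz conjunction of soft truth values in [0, 1]."""
    return max(0.0, sum(values) - (len(values) - 1))


def distance_to_satisfaction(body, head):
    """How far the ground rule body -> head is from being satisfied."""
    return max(0.0, body_truth(body) - head)


def weighted_penalty(weight, body, head):
    return weight * distance_to_satisfaction(body, head)


# Hypothetical soft truth values for the ground rule
# SameEnt(Beatles, FabFour) & Lbl(Beatles, musician) -> Lbl(FabFour, musician)
same_ent = 0.8
lbl_beatles = 0.9

satisfied = distance_to_satisfaction([same_ent, lbl_beatles], head=0.7)
violated = distance_to_satisfaction([same_ent, lbl_beatles], head=0.5)
penalty = weighted_penalty(1.0, [same_ent, lbl_beatles], head=0.5)
```

With these numbers the body has truth 0.7, so a head value of 0.7 satisfies the rule and a head value of 0.5 leaves a distance of about 0.2. Inference then searches for truth values that minimize the total weighted penalty.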
How do we get a knowledge graph?
• Have: P(KG) for all KGs
• Need: best KG
[Figure: a probability P(·) assigned to a candidate graph over entities E1, E2, E3 with attributes A1, A2, next to the selected best graph]
Inference and KG optimization
• Finding the best KG satisfying weighted rules: NP-hard
Graphical Models Experiments
Data: ~1.5M extractions, ~70K ontological relations, ~500 relation/label types
Task: Collectively construct a KG and evaluate on 25K target facts
Comparisons:
• Extract: average confidences of extractors for each fact in the NELL candidates
• Rules: default, rule-based heuristic strategy used by the NELL project
• MLN: Jiang+, ICDM12; estimates marginal probabilities with MC-SAT
• PSL: Pujara+, ISWC13; convex optimization of continuous truth values with ADMM
JIANG+ICDM12; PUJARA+ISWC13
Graphical Models: Pros/Cons
BENEFITS
• Define probability distribution over KGs
DRAWBACKS
• Requires optimization over all KG facts - overkill
Random Walk Overview
•Given: a query of an entity and relation
Random Walk Illustration
Query Q: R(Lennon, PlaysInstrument, ?)
[Figure: graph around Lennon with edge types albumArtist, hasInstrument, playsInstrument]
• Path π = <coworker, playsInstrument>: P(Q | π) · W_π
• Path π = <albumArtist, hasInstrument>: P(Q | π) · W_π
Recent Random Walk Methods
PRA: Path Ranking Algorithm
• Performs random walk of imperfect knowledge graph
• Estimates transition probabilities using KG
• For each relation, learns parameters for paths through the KG
PRA in a nutshell
$$\mathrm{score}(q.s \to e;\, q) = \sum_{\pi_i \in \Pi_b} P(q.s \to e;\, \pi_i)\, W_{\pi_i}$$
LAO+EMNLP11
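The score is a weighted sum, over relation paths, of the probability that a random walk following that path from the query source ends at the candidate entity. The sketch below computes it exactly on a toy graph; the edges, the path weights, and the uniform choice among matching edges are all assumptions for illustration (PRA learns the weights from data).

```python
from collections import defaultdict


def path_probability(edges, source, path):
    """Distribution over end nodes of a walk from `source` that follows the
    relation sequence `path`, choosing uniformly among matching edges."""
    out = defaultdict(list)
    for h, r, t in edges:
        out[(h, r)].append(t)
    dist = {source: 1.0}
    for relation in path:
        nxt = defaultdict(float)
        for node, p in dist.items():
            targets = out.get((node, relation), [])
            for t in targets:
                nxt[t] += p / len(targets)
        dist = dict(nxt)
    return dist


def pra_score(edges, source, target, weighted_paths):
    """Sum over paths of P(source -> target; path) * W_path."""
    return sum(w * path_probability(edges, source, path).get(target, 0.0)
               for path, w in weighted_paths.items())


edges = [  # toy graph; every fact and weight here is illustrative
    ("Lennon", "coworker", "McCartney"),
    ("Lennon", "coworker", "Starr"),
    ("McCartney", "playsInstrument", "bass"),
    ("Starr", "playsInstrument", "drums"),
    ("Lennon", "albumArtist", "Imagine"),
    ("Imagine", "hasInstrument", "piano"),
]
weights = {("coworker", "playsInstrument"): 0.5,
           ("albumArtist", "hasInstrument"): 2.0}

bass_score = pra_score(edges, "Lennon", "bass", weights)
piano_score = pra_score(edges, "Lennon", "piano", weights)
```

Only the neighborhood reachable along the chosen paths is touched, which is why a single query does not depend on the size of the whole KG.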
ProPPR-ized PRA example
Query Q: R(Lennon, PlaysInstrument, ?)
Two branches of the proof graph (entity icons from the figure did not survive extraction and are shown as _):
• R(_, Coworker, X), R(X, PlaysInstrument, Y)
• R(_, AlbumArtist, J), R(J, HasInstrument, K)
[Figure: successive steps bind X and then Y in the first branch, and J and then K in the second branch, to entities in the graph]
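ProPPR in a nutshell
$$\min_{w}\; -\Big(\sum_{k \in +} \log p_{\nu_0}[u_k^{+}] + \sum_{k \in -} \log\big(1 - p_{\nu_0}[u_k^{-}]\big)\Big) + \mu \lVert w \rVert_2^2$$
[Figure: example with scores 0.95 and 0.92 and nodes Google, Beatles, Baseball]
WANG+MLJ15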
Random Walks: Pros/Cons
BENEFITS
• KG query estimation independent of KG size
DRAWBACKS
• Full KG completion task inefficient
Two classes of Probabilistic Models
GRAPHICAL MODELS
◦ Possible facts in KG are variables
RANDOM WALK METHODS
◦ Possible facts posed as queries
MATRICES, TENSORS, AND NEURAL NETWORKS
Probabilistic Models: Downsides
Embeddings
Two Related Tasks
[Figure: Relation Extraction, shown as entity pairs linked by surface-pattern edges and relation edges]
What is NLP?
John was born in Liverpool, to Julia and Alfred Lennon.
Natural Language
Processing
What is Information Extraction?
John was born in Liverpool, to Julia and Alfred Lennon.
• Part-of-speech tags: NNP VBD VBD IN NNP TO NNP CC NNP NNP
• Named entities: John (Person), Liverpool (Location), Julia (Person), Alfred Lennon (Person)
• Coreferent mentions: John Lennon… he; the Pool; his mother; Lennon.. Mrs. Lennon; his father.. Alfred
• Information extraction output: a graph over John Lennon, Liverpool, Julia Lennon and Alfred Lennon with birthplace, childOf and spouse edges
Relation Extraction From Text
John was born in Liverpool, to Julia and Alfred Lennon.
[Figure: entity pairs among John Lennon, Liverpool and Alfred Lennon, connected by surface patterns (“was born in”, “was born to”, “born in __, to”, “and”) and by relations (birthplace, childOf, livedIn)]
“Distant” Supervision
[Figure: the pair (John Lennon, Liverpool) is connected both by the surface pattern “was born in” and by the relation birthplace]
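A minimal sketch of the distant supervision idea, assuming the usual formulation: any sentence mentioning an entity pair that the KB already relates is treated as a (possibly noisy) training example for that relation. The sentences, the KB contents, and the helper names are invented for illustration.

```python
kb = {("John Lennon", "Liverpool"): "birthplace"}  # fact already in the KB

sentences = [  # invented example sentences with pre-identified entity pairs
    ("John Lennon", "Liverpool", "John Lennon was born in Liverpool."),
    ("John Lennon", "Liverpool", "John Lennon gave a concert in Liverpool."),
    ("Ringo Starr", "London", "Ringo Starr moved to London."),
]


def surface_pattern(sentence, e1, e2):
    """Text between the two entity mentions, used as the pattern feature."""
    start = sentence.index(e1) + len(e1)
    end = sentence.index(e2, start)
    return sentence[start:end].strip()


def distant_labels(kb, sentences):
    """Pair each surface pattern with the KB relation of its entity pair."""
    examples = []
    for e1, e2, text in sentences:
        relation = kb.get((e1, e2))
        if relation is not None:
            examples.append((surface_pattern(text, e1, e2), relation))
    return examples


examples = distant_labels(kb, sentences)
```

The second sentence receives the birthplace label even though it does not express that relation, which shows the kind of label noise this form of supervision introduces.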
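Relation Extraction as a Matrix
John was born in Liverpool, to Julia and Alfred Lennon.
[Figure: an n × m matrix (n entity pairs, m relations) is approximated by the product of an n × k matrix of pair factors and a k × m matrix of relation factors; example cell: bornIn(John, Liverpool)]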
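Under a factorization of this shape, the score of a cell is a function of the dot product between a k-dimensional pair vector and a k-dimensional relation vector. The sketch below assumes a logistic link and hand-picked k = 2 vectors; both are illustrative assumptions, and in practice the vectors are learned from the observed cells.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def cell_score(pair_vec, rel_vec):
    """Predicted probability that the relation holds for the entity pair."""
    return sigmoid(sum(p * r for p, r in zip(pair_vec, rel_vec)))


# Hypothetical k = 2 factors; real values would be learned from data.
pair_emb = {("John", "Liverpool"): [1.0, 2.0]}
rel_emb = {"bornIn": [1.0, 1.0], "spouseOf": [-2.0, -1.0]}

born_in = cell_score(pair_emb[("John", "Liverpool")], rel_emb["bornIn"])
spouse_of = cell_score(pair_emb[("John", "Liverpool")], rel_emb["spouseOf"])
```

Here the dot products are 3 and -4, so bornIn scores high and spouseOf scores low for the same pair.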
Embeddings ~ Logical Relations
Relation Embeddings, w
◦ Similar embeddings for two relations indicate that they are paraphrases
  ◦ is married to, spouseOf(X,Y), /person/spouse
◦ One embedding can be contained by another
  ◦ w(topEmployeeOf) ⊂ w(employeeOf)
  ◦ topEmployeeOf(X,Y) → employeeOf(X,Y)
◦ Can capture logical patterns, without needing to specify them!
[Figure: entity pairs with similar embeddings, drawn from Time, Inc; Volvo; Scania A.B.; Campeau; Federated Dept Stores; Apple; HP]
[Figure: patterns “X professor at Y” and “X historian at Y”; (Freeman, Harvard) → (Boyle, Ohio State), for R. Freeman and Kevin Boyle]
Graph Completion
[Figure: the extraction graph with surface-pattern edges (“born in __, to”, “was born to”, “was born in”, “and”) becomes a completed graph over Alfred Lennon, Julia Lennon, John Lennon and Liverpool with edges livedIn, childOf, spouse and birthplace]
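Tensor Formulation of KG
• The KG is an |E| × |E| × |R| tensor; each cell is indexed by (e1, r, e2)
• Does an unseen relation exist?
Factorize that Tensor
[Figure: the |E| × |E| × |R| tensor is factorized into rank-k factors over entities and relations]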
Many Different Factorizations
• CANDECOMP/PARAFAC-Decomposition
• Model E
• Holographic Embeddings
• Figure annotation: "Not tensor factorization (per se)"
HOLE: Nickel et al, AAAI (2016), Model E: Riedel et al, NAACL (2013), RESCAL: Nickel et al, WWW (2012), CP: Harshman (1970), Tucker2: Tucker (1966)
Translation Embeddings
• Models: TransE, TransH, TransR
• Example triples (e1, r, e2): (Barack Obama, birthplace, Honolulu); (John Lennon, birthplace, Liverpool)
TransE: Bordes et al. XXX (2011), TransH: Bordes et al. XXX (2011), TransR: Bordes et al. XXX (2011)
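TransE is commonly described as treating a relation as a translation in embedding space, so that e1 + r lands near e2 for true triples. The sketch below scores triples by that distance; the 2-dimensional vectors are hand-picked assumptions chosen so the two example triples fit exactly, not learned embeddings.

```python
import math


def transe_distance(e1, r, e2):
    """Euclidean distance between e1 + r and e2; smaller means more plausible."""
    return math.sqrt(sum((a + b - c) ** 2 for a, b, c in zip(e1, r, e2)))


# Hypothetical embeddings, chosen by hand for illustration only.
entity = {
    "John Lennon": [0.0, 1.0],
    "Liverpool": [1.0, 1.0],
    "Barack Obama": [2.0, 5.0],
    "Honolulu": [3.0, 5.0],
}
relation = {"birthplace": [1.0, 0.0]}

good = transe_distance(entity["John Lennon"], relation["birthplace"], entity["Liverpool"])
bad = transe_distance(entity["John Lennon"], relation["birthplace"], entity["Honolulu"])
```

A single birthplace vector serves both example triples, which is the sense in which the relation acts as a shared translation.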
Parameter Estimation
• Cells of the |E| × |E| × |R| tensor are indexed by (e1, r, e2)
• Unobserved cell: decrease score
Matrix vs Tensor Factorization
What they can, and can’t, do..
[Figure: Joint Model over the Relation Extraction setting, combining surface-pattern edges and relation edges]
Compositional Neural Models
So far, we’re learning vectors for each entity/surface pattern/relation..
But learning vectors independently ignores “composition”
• Every surface pattern is not unique
  ◦ Synonymy: "A is B’s spouse." and "A is married to B."
  ◦ Can the representation learn this?
• Every relation path is not unique
  ◦ Explicit: A parent B, B parent C ⇒ A grandparent C
  ◦ Can the representation capture this?
Composing Dependency Paths
• “… was born to …”, “… ‘s parents are …”, \parentsOf (never appears in training data)
• But we don’t need linked data to know they mean similar things…
• Use neural networks to produce the embeddings from text!
[Figure: neural networks (NN) composing embeddings along a path; example relation stateBasedIn]
Relation Extraction:
• Matrix Factorization Approaches
Graph Completion:
• Tensor Factorization Approaches
Logic-centric serving via Symbolic Reasoning
[Figure: knowledge reasoning over a logic rule base, ConceptNet (common sense base), and Freebase, etc. (facts base)]
Knowledge in symbolic logic form
• Symbols are abstract identifiers that can be manipulated in an algebra system
• Variables
• Functions
[Figure: ConceptNet (common sense base) and Satori (facts base) connected by logic rules; nodes dog, Pal, Lassie, person, actor, Jan Clayton; edge IsA, 0.98]
Functions and relations are just hyperedges!
• … is just a hyperedge connecting three nodes.
• Symbolic transformation is just graph pattern matching and graph transformation!
[Figure: the same graph extended with the logic rule base and nodes Animal and bark; edges CapableOf, 0.8 and IsA, 0.98]
Use graph transformation to do logic deduction
[Figure: Albert Einstein, person, think; edges IsA and HasPrerequisite]
Multimodal KB Embeddings
• Object encoders: entity → lookup; images → CNN; text → LSTM
[Figure: loop between user, learning algorithm, and model update]
Many different options
- Generalized Expectation
- Posterior Regularization
- Labeling functions in SNORKEL
(2) Future research directions: Online KG Construction
• One-shot KG construction → online KG construction
• Consume online stream of data
• Temporal scoping of facts
• Discovering new concepts automatically
• Self-correcting systems
• Continuously learning and self-correcting systems
  ◦ [Selecting Actions for Resource-bounded Information Extraction using Reinforcement Learning, Kanani and McCallum, WSDM 2012]
  ◦ Presented a reinforcement learning framework for budget-constrained information extraction