Knowledge Graph Tutorial

Jay Pujara Sameer Singh


jaypujara.org sameersingh.org
[email protected] [email protected]
@jay_mlr @sameer_

Knowledge Graph Primer
TOPICS:
What is a Knowledge Graph?
Why are Knowledge Graphs Important?
Where do Knowledge Graphs Come From?
Knowledge Representation Choices
Problem Overview
Essentially, a KG is a semantic network that models entities (including their properties) and the relations between them.
What is a knowledge graph?
• Knowledge in graph form!
• Captures entities, attributes, and relationships
• Nodes are entities
• Nodes are labeled with attributes (e.g., types)
• Typed edges between two nodes capture a relationship between entities

[Figure: a small graph with entity nodes E1-E3, each labeled with attributes (A1, A2, ...) and connected by typed edges]
Example knowledge graph
• Knowledge in graph form!
• Captures entities, attributes, and relationships
• Nodes are entities
• Nodes are labeled with attributes (e.g., types)
• Typed edges between two nodes capture a relationship between entities

[Figure: example graph with nodes John Lennon (person), Beatles (band), and Liverpool (place), connected by typed edges]
Knowledge Graph Primer
TOPICS:
What is a Knowledge Graph?
Why are Knowledge Graphs Important?
Where do Knowledge Graphs Come From?
Knowledge Representation Choices
Problem Overview

22
Why knowledge graphs?
• Humans:
• Combat information overload
• Explore via intuitive structure
• Tool for supporting knowledge-driven tasks

• AIs:
• Key ingredient for many AI tasks
• Bridge from data to human semantics
• Use decades of work on graph analysis

23
Interdisciplinary Research
• Database: RDF databases, data integration, knowledge fusion
• Natural Language Processing: information extraction, semantic parsing
• Machine Learning: knowledge representation (graph embedding)
• Knowledge Engineering: KB construction, rule-based reasoning
5
Knowledge Graphs & Industry
•Google Knowledge Graph
• Google Knowledge Vault
•Amazon Product Graph
•Facebook Graph API
•IBM Watson
•Microsoft Satori
• Project Hanover/Literome
•LinkedIn Knowledge Graph
•Yandex Object Answer
•Diffbot, GraphIQ, Maana, ParseHub, Reactor Labs,
SpazioDati
27
Knowledge Graph Primer
TOPICS:
What is a Knowledge Graph?
Why are Knowledge Graphs Important?
Where do Knowledge Graphs Come From?
Knowledge Representation Choices
Problem Overview

28
Where do knowledge graphs come from?
• Structured Text
◦ Wikipedia infoboxes, tables, databases, social nets

• Unstructured Text
◦ WWW, news, social media, reference articles

• Images

• Video
◦ YouTube, video feeds
33
Knowledge Representation
• Decades of research into knowledge representation
• Most knowledge graph implementations use RDF triples
◦ <rdf:subject, rdf:predicate, rdf:object> : r(s,p,o)
◦ Temporal scoping, reification, and skolemization...
• ABox (assertions) versus TBox (terminology)
• Common ontological primitives
◦ rdfs:domain, rdfs:range, rdf:type, rdfs:subClassOf, rdfs:subPropertyOf, ...
◦ owl:inverseOf, owl:TransitiveProperty, owl:FunctionalProperty, ...
36
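To make the triple and primitive notation concrete, here is a minimal sketch using the Python rdflib library; the EX namespace and the entities in it are hypothetical examples, not part of the tutorial:

    from rdflib import Graph, Namespace
    from rdflib.namespace import RDF, RDFS

    EX = Namespace("http://example.org/")  # hypothetical namespace
    g = Graph()

    # ABox: assertions about individuals
    g.add((EX.JohnLennon, RDF.type, EX.Person))
    g.add((EX.JohnLennon, EX.bornIn, EX.Liverpool))

    # TBox: terminology, expressed with the ontological primitives above
    g.add((EX.bornIn, RDFS.domain, EX.Person))
    g.add((EX.bornIn, RDFS.range, EX.Place))
    g.add((EX.Musician, RDFS.subClassOf, EX.Person))

    for s, p, o in g.triples((EX.JohnLennon, None, None)):
        print(s, p, o)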
RDF
• RDF is a de facto standard for knowledge graphs (KGs)
• RDF is a language for the conceptual modeling of information about web resources
• A building block of the semantic web
• Makes the information on the web, and the interrelationships among resources, "machine understandable"
3
RDF and Semantic Web
• RDF is a language for the conceptual modeling of information about web resources
• A building block of the semantic web
• Facilitates exchange of information
◦ Search engines can retrieve more relevant information
◦ Facilitates data integration (mashups)
• Machine understandable
◦ Understand the information on the web and the interrelationships among resources

RDF Uses
• Yago and DBpedia extract facts from Wikipedia & represent them as RDF → structural queries
• Communities build RDF data
◦ E.g., biologists: Bio2RDF and UniProt RDF
• Web data integration
◦ Linked Data Cloud
• ...
RDF Data Volumes...
• ...are growing, and fast
• The linked data cloud currently consists of 325 datasets with >25B triples
• Size almost doubling every year

April '14: 1091 datasets, ??? triples

Max Schmachtenberg, Christian Bizer, and Heiko Paulheim: Adoption of Linked Data Best Practices in Different Topical Domains. In Proc. ISWC, 2014.
RDF Introduction
• Everything is a uniquely named resource
• Namespaces can be used to scope the names, e.g., xmlns:y=http://en.wikipedia.org/wiki
• Properties of resources can be defined
• Relationships with other resources can be defined
• Resources can be contributed by different people/groups and can be located anywhere on the web
• Integrated web "database"

[Figure: resource y:Abraham Lincoln with properties hasName "Abraham Lincoln", BornOnDate "1809-02-12", DiedOnDate "1865-04-15", and relationship DiedIn → y:Washington DC]
RDF Data Model
• Triple: Subject, Predicate (Property), Object: (s, p, o)
◦ Subject: the entity that is described (URI or blank node)
◦ Predicate: a feature of the entity (URI)
◦ Object: value of the feature (URI, blank node, or literal)
• (s, p, o) ∈ (U ∪ B) × U × (U ∪ B ∪ L), where U is the set of URIs, B the set of blank nodes, and L the set of literals
• A set of RDF triples is called an RDF graph

[Figure: Subject --Predicate--> Object]

Subject | Predicate | Object
Abraham Lincoln | hasName | "Abraham Lincoln"
Abraham Lincoln | BornOnDate | "1809-02-12"
Abraham Lincoln | DiedOnDate | "1865-04-15"
RDF Example Instance
Prefix: y=http://en.wikipedia.org/wiki
(Objects in quotes are literals; objects with the y: prefix are URIs.)

Subject | Predicate | Object
y:Abraham Lincoln | hasName | "Abraham Lincoln"
y:Abraham Lincoln | BornOnDate | "1809-02-12"
y:Abraham Lincoln | DiedOnDate | "1865-04-15"
y:Abraham Lincoln | bornIn | y:Hodgenville KY
y:Abraham Lincoln | DiedIn | y:Washington DC
y:Abraham Lincoln | title | "President"
y:Abraham Lincoln | gender | "Male"
y:Washington DC | hasName | "Washington D.C."
y:Washington DC | foundingYear | "1790"
y:Hodgenville KY | hasName | "Hodgenville"
y:United States | hasName | "United States"
y:United States | hasCapital | y:Washington DC
y:United States | foundingYear | "1776"
y:Reese Witherspoon | bornOnDate | "1976-03-22"
y:Reese Witherspoon | bornIn | y:New Orleans LA
y:Reese Witherspoon | hasName | "Reese Witherspoon"
y:Reese Witherspoon | gender | "Female"
y:Reese Witherspoon | title | "Actress"
y:New Orleans LA | foundingYear | "1718"
y:New Orleans LA | locatedIn | y:United States
y:Franklin Roosevelt | hasName | "Franklin D. Roosevelt"
y:Franklin Roosevelt | bornIn | y:Hyde Park NY
y:Franklin Roosevelt | title | "President"
y:Franklin Roosevelt | gender | "Male"
y:Hyde Park NY | foundingYear | "1810"
y:Hyde Park NY | locatedIn | y:United States
y:Marilyn Monroe | gender | "Female"
y:Marilyn Monroe | hasName | "Marilyn Monroe"
y:Marilyn Monroe | bornOnDate | "1926-07-01"
y:Marilyn Monroe | diedOnDate | "1962-08-05"
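As a hedged illustration (assuming the rdflib package; underscores stand in for the spaces in the resource names), a few rows of the example instance can be loaded and queried with SPARQL:

    from rdflib import Graph, Literal, Namespace

    y = Namespace("http://en.wikipedia.org/wiki/")
    g = Graph()
    g.add((y.Abraham_Lincoln, y.hasName, Literal("Abraham Lincoln")))
    g.add((y.Abraham_Lincoln, y.bornIn, y.Hodgenville_KY))
    g.add((y.Reese_Witherspoon, y.bornIn, y.New_Orleans_LA))
    g.add((y.New_Orleans_LA, y.locatedIn, y.United_States))
    g.add((y.Hodgenville_KY, y.locatedIn, y.United_States))  # assumed for the demo

    q = """
    PREFIX y: <http://en.wikipedia.org/wiki/>
    SELECT ?person ?place WHERE {
        ?person y:bornIn ?place .
        ?place y:locatedIn y:United_States .
    }"""
    for person, place in g.query(q):
        print(person, place)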
RDF Graph

[Figure: the example triples drawn as a graph: y:Abraham Lincoln, y:Marilyn Monroe, y:Reese Witherspoon, and y:Franklin Roosevelt, together with y:Washington D.C., y:Hodgenville KY, y:New Orleans LA, y:Hyde Park NY, and y:United States, connected by hasName, bornOnDate, diedOnDate, bornIn, diedIn, locatedIn, hasCapital, foundingYear, title, and gender edges]
A Distributed RDF Graph

[Figure: an RDF graph about actors and films (Woody Allen, Hank Azaria, Louise Lasser, Hugh Grant, Nancy Kelly, Edmond O'Brien; films such as Mystery Men, Sleeper, Small Time Crooks, Slither, A Very Young Lady, Man-Trap) with rdfs:label, actedIn, directed, isMarriedTo, livesIn, and hasWonPrize edges, whose triples are distributed across sources s1-s4]
Representative graph processing systems

System | Property graphs | Online query | Data sharding | In-memory storage | Atomicity & Transactions
Neo4j | Yes | Yes | No | No | Yes
Trinity | Yes | Yes | Yes | Yes | Atomicity
Horton | Yes | Yes | Yes | Yes | No
HyperGraphDB | No | Yes | No | No | Yes
FlockDB | No | Yes | Yes | No | Yes
TinkerGraph | Yes | Yes | No | Yes | No
InfiniteGraph | Yes | Yes | Yes | No | Yes
Cayley | Yes | Yes | SB | SB | Yes
Titan | Yes | Yes | SB | SB | Yes
MapReduce | No | No | Yes | No | No
PEGASUS | No | No | Yes | No | No
Pregel | No | No | Yes | No | No
Giraph | No | No | Yes | No | No
GraphLab | No | No | Yes | No | No
GraphChi | No | No | No | No | No
GraphX | No | No | Yes | No | No
DB-Engines Ranking of Graph DBMS
• The Cypher query language is used by Neo4j.
• Gremlin is used by most graph DBMSs.
• GSQL is used by TigerGraph.

https://db-engines.com/en/ranking/graph+dbms
118
Knowledge Graph Primer
TOPICS:
What is a Knowledge Graph?
Why are Knowledge Graphs Important?
Where do Knowledge Graphs Come From?
Knowledge Representation Choices
Problem Overview

39
What is a knowledge graph?
• Knowledge in graph form!
• Captures entities, attributes, and relationships
• Nodes are entities
• Nodes are labeled with attributes (e.g., types)
• Typed edges between two nodes capture a relationship between entities

[Figure: graph with entity nodes E1-E3 and attribute labels A1, A2]

40
Basic problems
• Who are the entities (nodes) in the graph?
• What are their attributes and types (labels)?
• How are they related (edges)?

[Figure: graph with entity nodes E1-E3 and attribute labels A1, A2]
45
Knowledge Graph Construction

Knowledge Graph
Extraction Construction

46
Two perspectives

Question | Extraction graph | Knowledge graph
Who are the entities? (nodes) | Named Entity Recognition | Entity Linking, Entity Resolution, Entity Coreference
What are their attributes? (labels) | Entity Typing | Collective classification
How are they related? (edges) | Semantic role labeling, Relation Extraction | Link prediction
8
What is NLP?

Information Extraction: from unstructured text to structured "knowledge"

Text:
• Unstructured
• Ambiguous
• Lots and lots of it!
• Humans can read it, but... very slowly, can't remember it all, and can't answer questions over it

Knowledge:
• Structured
• Precise, actionable
• Specific to the task
• Can be used for downstream applications, such as creating Knowledge Graphs!
4
Knowledge Extraction

Text: John was born in Liverpool, to Julia and Alfred Lennon.

NLP annotations:
• Coreference: Lennon.. Mrs. Lennon.. his father... John Lennon... the Pool.. his mother.. he
• Entity types: John [Person] was born in Liverpool [Location], to Julia [Person] and Alfred Lennon [Person].
• POS tags: NNP VBD VBN IN NNP TO NNP CC NNP NNP

Information Extraction (extraction graph):
• birthplace(John Lennon, Liverpool)
• childOf(John Lennon, Julia Lennon)
• childOf(John Lennon, Alfred Lennon)
5
Breaking it Down

Document → Information Extraction (entity resolution, entity linking, relation extraction...) → extraction graph:
birthplace(John Lennon, Liverpool); childOf(John Lennon, Julia Lennon); childOf(John Lennon, Alfred Lennon); spouse(Julia Lennon, Alfred Lennon)

Document → Coreference Resolution:
Lennon.. Mrs. Lennon.. his father... John Lennon... the Pool.. his mother.. he

Sentence → Dependency parsing, part-of-speech tagging, named entity recognition:
Person: John, Julia, Alfred Lennon; Location: Liverpool
NNP VBD VBN IN NNP TO NNP CC NNP NNP
John was born in Liverpool, to Julia and Alfred Lennon.
6
Tagging the Parts of Speech

NNP VBD VBN IN NNP TO NNP CC NNP NNP
John was born in Liverpool, to Julia and Alfred Lennon.

Nouns are entities

Verbs are relations

• Common approaches include CRFs, CNNs, LSTMs
7
Detecting Named Entities

Person Location Person Person


John was born in Liverpool, to Julia and Alfred Lennon.

• Structured prediction approaches


• Capture entity mentions and entity types

8
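The tags and entity mentions above can be reproduced with an off-the-shelf NLP pipeline. A minimal sketch, assuming spaCy and its small English model are installed (spaCy's label inventory differs slightly, e.g., GPE rather than Location):

    import spacy

    nlp = spacy.load("en_core_web_sm")
    doc = nlp("John was born in Liverpool, to Julia and Alfred Lennon.")

    for token in doc:
        print(token.text, token.tag_)   # Penn Treebank tags, e.g., John/NNP, born/VBN

    for ent in doc.ents:
        print(ent.text, ent.label_)     # e.g., Liverpool/GPE, Alfred Lennon/PERSON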
NLP annotations → features for IE
Combine tokens, dependency paths, and entity types to define rules.

[Figure: dependency path (appos, nmod, det, case) over the pattern: Argument 1 [Person] , the CEO of Argument 2 [Organization]]

Bill Gates, the CEO of Microsoft, said ...
Mr. Jobs, the brilliant and charming CEO of Apple Inc., said ...
... announced by Steve Jobs, the CEO of Apple.
... announced by Bill Gates, the director and CEO of Microsoft.
... mused Bill, a former CEO of Microsoft.
and many other possible instantiations...
9
Entity Names: Two Main Problems

Entities with the Same Name
• Same type of entities share names: Kevin Smith, John Smith, Springfield, ...
• Things named after each other: Clinton, Washington, Paris, Amazon, Princeton, Kingston, ...
• Partial reference: first names of people, location instead of team name, nicknames

Different Names for the Same Entity
• Nicknames: Bam Bam, Drumpf, ...
• Typos/misspellings: Baarak, Barak, Barrack, ...
• Inconsistent references: MSFT, APPL, GOOG, ...
12
Entity Linking Approach
Washington drops 10 points after game with UCLA Bruins.

• Candidate Generation: Washington DC, George Washington, Washington state, Lake Washington, Washington Huskies, Denzel Washington, University of Washington, Washington High School, ...
• Entity Types (LOC/ORG) rule out people such as George Washington and Denzel Washington
• Coreference (UWashington, Huskies) narrows toward University of Washington / Washington Huskies
• Coherence with other linked entities (UCLA Bruins, USC Trojans) selects Washington Huskies

Vinculum: Ling, Singh, Weld, TACL (2015)
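A toy sketch of that candidate-generation-then-filtering recipe; the alias dictionary, type labels, and coherence signal below are hypothetical stand-ins, not the Vinculum system:

    CANDIDATES = {
        "Washington": [
            ("Washington_DC", "LOC"), ("George_Washington", "PER"),
            ("Washington_state", "LOC"), ("Lake_Washington", "LOC"),
            ("Washington_Huskies", "ORG"), ("Denzel_Washington", "PER"),
            ("University_of_Washington", "ORG"),
        ],
    }
    SPORTS_TEAMS = {"Washington_Huskies", "UCLA_Bruins", "USC_Trojans"}

    def link(mention, allowed_types, context_entities):
        # 1. Candidate generation from an alias dictionary
        candidates = CANDIDATES.get(mention, [])
        # 2. Filter by the mention's predicted entity type
        typed = [e for e, t in candidates if t in allowed_types]
        # 3. Prefer candidates coherent with other entities in the document
        def coherence(entity):
            return sum(1 for c in context_entities
                       if c in SPORTS_TEAMS and entity in SPORTS_TEAMS)
        return max(typed, key=coherence, default=None)

    print(link("Washington", {"LOC", "ORG"}, ["UCLA_Bruins"]))  # Washington_Huskies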


Information Extraction

From the annotated text ("John was born in Liverpool, to Julia and Alfred Lennon." with coreference, entity types, and POS tags) to an extraction graph:
• birthplace(John Lennon, Liverpool)
• childOf(John Lennon, Julia Lennon)
• childOf(John Lennon, Alfred Lennon)
• spouse(Julia Lennon, Alfred Lennon)
14
Information Extraction
3 CONCRETE SUB-PROBLEMS
• Defining the domain
• Learning extractors
• Scoring the facts

3 LEVELS OF SUPERVISION
• Supervised
• Semi-supervised
• Unsupervised

15
IE systems in practice
Dimensions: defining the domain, learning extractors, scoring candidate facts, fusing extractors.

[Table: how ConceptNet, NELL, Knowledge Vault, and OpenIE realize each dimension; e.g., heuristic rules versus a learned classifier for scoring candidate facts]
27
Knowledge Extraction: Key Points
• Built on the foundation of NLP techniques
• Part-of-speech tagging, dependency parsing, named
entity recognition, coreference resolution…
• Challenging problems with very useful outputs
• Information extraction techniques use NLP to:
• define the domain
• extract entities and relations
• score candidate outputs
• Trade-off between manual & automatic methods

28
Knowledge Graph Construction

Knowledge Graph
Extraction Construction

46
Knowledge Graph
Construction
TOPICS:
Problem Setting
Probabilistic Models
Embedding Techniques

4
Knowledge Graph
Construction
TOPICS:
Problem Setting
Probabilistic Models
Embedding Techniques

5
Reminder: Basic problems
• Who are the entities (nodes) in the graph?
• What are their attributes and types (labels)?
• How are they related (edges)?

[Figure: graph with entity nodes E1-E3 and attribute labels A1, A2]

6
Graph Construction Issues
Extracted knowledge is:
• ambiguous
◦ Ex: Beetles, beetles, Beatles
◦ Ex: citizenOf, livedIn, bornIn

• incomplete
◦ Ex: missing relationships
◦ Ex: missing labels
◦ Ex: missing entities

• inconsistent
◦ Ex: Cynthia Lennon and Yoko Ono both extracted as spouse
◦ Ex: exclusive labels (alive, dead)
◦ Ex: domain-range constraints
10
Graph Construction approach
•Graph construction cleans and completes extraction graph

•Incorporate ontological constraints and relational patterns

•Discover statistical relationships within knowledge graph

11
Knowledge Graph
Construction
TOPICS:
Problem Setting
Probabilistic Models
Embedding Techniques

12
Graph Construction
Probabilistic Models
TOPICS:
Overview
Graphical Models
Random Walk Methods

13
Graph Construction
Probabilistic Models
TOPICS:
Overview
Graphical Models
Random Walk Methods

14
Beyond Pure Reasoning

• Classical AI approach to knowledge: reasoning
Lbl(Socrates, Man) & Sub(Man, Mortal) -> Lbl(Socrates, Mortal)
• Reasoning is difficult when extracted knowledge has errors
• Solution: probabilistic models
P(Lbl(Socrates, Mortal) | Lbl(Socrates, Man)) = 0.9
17
Graph Construction
Probabilistic Models
TOPICS:
Overview
Graphical Models
Random Walk Methods

18
Graphical Models: Overview
•Define joint probability distribution on knowledge graphs

•Each candidate fact in the knowledge graph is a variable

•Statistical signals, ontological knowledge and rules


parameterize the dependencies between variables

•Find most likely knowledge graph by optimization/sampling

19
Knowledge Graph Identification
Define a graphical model to perform all three of these tasks simultaneously!
• Who are the entities (nodes) in the graph?
• What are their attributes and types (labels)?
• How are they related (edges)?

PUJARA+ISWC13
Knowledge Graph Identification

P(Who, What, How | Extractions)

PUJARA+ISWC13
Probabilistic Models
• Use dependencies between facts in the KG
• Probability is defined jointly over facts

[Figure: three candidate knowledge graphs with P=0, P=0.25, and P=0.75]
22
What determines probability?
• Statistical signals from text extractors and classifiers
◦ P(R(John, Spouse, Yoko)) = 0.75; P(R(John, Spouse, Cynthia)) = 0.25
◦ LevenshteinSimilarity(Beatles, Beetles) = 0.9

• Ontological knowledge about the domain
◦ Functional(Spouse) & R(A,Spouse,B) -> !R(A,Spouse,C)
◦ Range(Spouse, Person) & R(A,Spouse,B) -> Type(B, Person)

• Rules and patterns mined from data
◦ R(A, Spouse, B) & R(A, Lives, L) -> R(B, Lives, L)
◦ R(A, Spouse, B) & R(A, Child, C) -> R(B, Child, C)
29
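The Levenshtein signal above is easy to compute. A self-contained sketch (our own, with a simple max-length normalization, which gives about 0.86 for Beatles/Beetles; the slide's 0.9 presumably reflects a different normalization):

    def levenshtein(a: str, b: str) -> int:
        # Classic dynamic-programming edit distance
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                 # deletion
                               cur[j - 1] + 1,              # insertion
                               prev[j - 1] + (ca != cb)))   # substitution
            prev = cur
        return prev[-1]

    def similarity(a: str, b: str) -> float:
        return 1 - levenshtein(a, b) / max(len(a), len(b))

    print(similarity("Beatles", "Beetles"))  # 0.857...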
Example: The Fab Four

Illustration of KG Identification

Uncertain Extractions:
.5: Lbl(Fab Four, novel)
.7: Lbl(Fab Four, musician)
.9: Lbl(Beatles, musician)
.8: Rel(Beatles, AlbumArtist, Abbey Road)

Ontology:
Dom(albumArtist, musician)
Mut(novel, musician)

Entity Resolution:
SameEnt(Fab Four, Beatles)

[Figure: annotated extraction graph with candidate nodes Fab Four and Beatles (linked by SameEnt), candidate labels novel and musician, and candidate relation AlbumArtist to Abbey Road]

After Knowledge Graph Identification:
Fab Four = Beatles, with Lbl musician and Rel(AlbumArtist) Abbey Road

PUJARA+ISWC13; PUJARA+AIMAG15
Probabilistic graphical model for KG

[Figure: factor graph over the candidate facts Lbl(Beatles, novel), Lbl(Beatles, musician), Lbl(Fab Four, novel), Lbl(Fab Four, musician), Rel(Beatles, AlbumArtist, Abbey Road), and Rel(Fab Four, AlbumArtist, Abbey Road)]
Defining graphical models
•Many options for defining a graphical model

•We focus on two approaches, MLNs and PSL, that use rules

•MLNs treat facts as Boolean, use sampling for satisfaction

•PSL infers a “truth value” for each fact via optimization

37
Rules for KG Model
100: Subsumes(L1,L2) & Label(E,L1) -> Label(E,L2)
100: Exclusive(L1,L2) & Label(E,L1) -> !Label(E,L2)

100: Inverse(R1,R2) & Relation(R1,E,O) -> Relation(R2,O,E)


100: Subsumes(R1,R2) & Relation(R1,E,O) -> Relation(R2,E,O)
100: Exclusive(R1,R2) & Relation(R1,E,O) -> !Relation(R2,E,O)

100: Domain(R,L) & Relation(R,E,O) -> Label(E,L)


100: Range(R,L) & Relation(R,E,O) -> Label(O,L)

10: SameEntity(E1,E2) & Label(E1,L) -> Label(E2,L)


10: SameEntity(E1,E2) & Relation(R,E1,O) -> Relation(R,E2,O)

1: Label_OBIE(E,L) -> Label(E,L)


1: Label_OpenIE(E,L) -> Label(E,L)
1: Relation_Pattern(R,E,O) -> Relation(R,E,O)
1: !Relation(R,E,O)
1: !Label(E,L)

JIANG+ICDM12; PUJARA+ISWC13, PUJARA+AIMAG15 38


Rules to Distributions
• Rules are grounded by substituting literals into formulas
w_r : SameEnt(Fab Four, Beatles) ∧ Lbl(Beatles, musician) ⇒ Lbl(Fab Four, musician)
• Each ground rule has a weighted satisfaction derived from the formula's truth value
P(G | E) = (1/Z) exp( Σ_{r∈R} w_r φ_r(G, E) )
• Together, the ground rules provide a joint probability distribution over knowledge graph facts, conditioned on the extractions

JIANG+ICDM12; PUJARA+ISWC13
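As a toy illustration of how one weighted ground rule feeds the distribution, here is a sketch using PSL-style Lukasiewicz semantics with soft truth values in [0, 1]; the numbers are illustrative, not the authors' code:

    import math

    truth = {
        "SameEnt(FabFour,Beatles)": 1.0,
        "Lbl(Beatles,musician)": 0.9,
        "Lbl(FabFour,musician)": 0.7,
    }

    # Lukasiewicz logic: AND(a, b) = max(0, a + b - 1); (body -> head) = min(1, 1 - body + head)
    body = max(0.0, truth["SameEnt(FabFour,Beatles)"] + truth["Lbl(Beatles,musician)"] - 1.0)
    head = truth["Lbl(FabFour,musician)"]
    phi = min(1.0, 1.0 - body + head)   # satisfaction of the ground rule
    w = 10.0                            # rule weight
    print("unnormalized contribution:", math.exp(w * phi))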
Probability Distribution over KGs

P(G | E) = (1/Z) exp( −Σ_{r∈R} w_r φ_r(G) )

Example ground rules:
[φ1] CandLbl_struct(FabFour, novel) ⇒ Lbl(FabFour, novel)
[φ2] CandRel_pat(Beatles, AlbumArtist, AbbeyRoad) ⇒ Rel(Beatles, AlbumArtist, AbbeyRoad)
[φ3] SameEnt(Beatles, FabFour) ∧ Lbl(Beatles, musician) ⇒ Lbl(FabFour, musician)
[φ4] Dom(AlbumArtist, musician) ∧ Rel(Beatles, AlbumArtist, AbbeyRoad) ⇒ Lbl(Beatles, musician)
[φ5] Mut(musician, novel) ∧ Lbl(FabFour, musician) ⇒ ¬Lbl(FabFour, novel)

[Figure: factor graph connecting Lbl(Fab Four, novel), Lbl(Beatles, novel), Lbl(Fab Four, musician), Lbl(Beatles, musician), and Rel(Beatles, albumArtist, Abbey Road) through the factors φ1-φ5]

PUJARA+ISWC13; PUJARA+AIMAG15
How do we get a knowledge graph?
Have: P(KG) for all KGs. Need: the best KG.

[Figure: the probability assigned to one candidate knowledge graph]

MAP inference: optimizing over the distribution to find the best knowledge graph
42
Inference and KG optimization
•Finding the best KG satisfying weighted rules: NP Hard

•MLNs [discrete]: Monte Carlo sampling methods


• Solution quality dependent on burn-in time, iterations, etc.

•PSL [continuous]: optimize convex linear surrogate


•Fast optimization, ¾-optimal MAX SAT lower bound

43
Graphical Models Experiments
Data: ~1.5M extractions, ~70K ontological relations, ~500 relation/label types
Task: Collectively construct a KG and evaluate on 25K target facts

Comparisons:
Extract: average confidences of extractors for each fact in the NELL candidates
Rules: default, rule-based heuristic strategy used by the NELL project
MLN: Jiang+, ICDM12 – estimates marginal probabilities with MC-SAT
PSL: Pujara+, ISWC13 – convex optimization of continuous truth values with ADMM

Running Time: Inference completes in 10 seconds, producing values for 25K facts

Method | AUC | F1
Extract | .873 | .828
Rules | .765 | .673
MLN (Jiang, 12) | .899 | .836
PSL (Pujara, 13) | .904 | .853

JIANG+ICDM12; PUJARA+ISWC13
Graphical Models: Pros/Cons
BENEFITS
• Define probability distribution over KGs
• Easily specified via rules
• Fuse knowledge from many different sources
DRAWBACKS
• Requires optimization over all KG facts - overkill
• Dependent on rules from ontology/expert
• Requires probabilistic semantics, which may be unavailable
45
Graph Construction
Probabilistic Models
TOPICS:
Overview
Graphical Models
Random Walk Methods

46
Random Walk Overview
•Given: a query of an entity and relation

•Starting at the entity, randomly walk the KG

•Random walk ends when reaching an appropriate goal

•Learned parameters bias choices in the random walk

•Output relative probabilities of goal states


Random Walk Illustration
Query Q: R(Lennon, PlaysInstrument, ?)

[Figure: animated random walk over the knowledge graph, following edge types such as coworker, albumArtist, hasInstrument, and playsInstrument]

Each path type π that reaches a goal state contributes its walk probability, weighted by a learned path weight:
P(Q | π=<coworker, playsInstrument>) · W_π
P(Q | π=<albumArtist, hasInstrument>) · W_π
60
Recent Random Walk Methods
PRA: Path Ranking Algorithm
• Performs random walk of imperfect knowledge graph
• Estimates transition probabilities using KG
• For each relation, learns parameters for paths through the KG

ProPPR: Programming with Personalized PageRank


• Constructs proof graph
• Nodes are partially-ground clauses with one or more facts
• Edges are proof-transformations

• Parameters are learned for each ground entity and rule

61
PRA in a nutshell

score(q.s → e; q) = Σ_{π_i ∈ Π_b} P(q.s → e; π_i) · W_{π_i}

Filter paths based on HITS and accuracy

Estimate probabilities efficiently with dynamic programming

Path weights are learned with logistic regression

LAO+EMNLP11
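A minimal sketch of that last step, learning the path weights W_π with logistic regression; the feature matrix of random-walk probabilities is synthetic here, not PRA's actual pipeline:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.random((200, 4))   # 200 (source, target) pairs x 4 path types: P(q.s -> e; pi)
    true_w = np.array([2.0, -1.0, 0.5, 0.0])
    y = (X @ true_w + 0.1 * rng.standard_normal(200) > 0.7).astype(float)

    w = np.zeros(4)
    lr = 0.5
    for _ in range(2000):      # batch gradient descent on the logistic loss
        p = 1 / (1 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)

    print("learned path weights:", w.round(2))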
Recent Random Walk Methods
PRA: Path Ranking Algorithm
• Performs random walk of imperfect knowledge graph
• Estimates transition probabilities using KG
• For each relation, learns parameters for paths through the KG

ProPPR: ProbLog + Personalized PageRank


• Constructs proof graph
• Nodes are partially-ground clauses with one or more facts
• Edges are proof-transformations

• Parameters are learned for each ground entity and rule

67
ProPPR-ized PRA example
Query Q: R(Lennon, PlaysInstrument, ?)

Rules:
R(Lennon, Coworker, X) ∧ R(X, PlaysInstrument, Y)
R(Lennon, AlbumArtist, J) ∧ R(J, HasInstrument, K)

Unbound variables in the proof tree!

[Figure: proof graph whose nodes are partially ground clauses; each edge is a proof transformation that binds one more variable, until both rule bodies are fully ground]
75
ProPPR in a nutshell

min_w Σ_{k∈+} −log p_{ν0}[u_k^+] + Σ_{k∈−} −log(1 − p_{ν0}[u_k^−]) + µ‖w‖₂²

• Input: queries, positive answers, negative answers
• Goal: p_{ν0}[u_k^+] ≫ p_{ν0}[u_k^−] (PageRank from the random walk)
• Learn: random walk weights
• Train via stochastic gradient descent

WANG+MLJ15
Results from PRA and ProPPR
• Task:
◦ 1M extractions for 3 domains
◦ ~100s of training queries
◦ ~1000s of test queries
◦ AUC of extractions alone is 0.7

[Chart: Relation Prediction AUC (0.92-0.96) for PRA (1M) and ProPPR (1M) on the Google, Beatles, and Baseball domains]

WANG+MLJ15
Random Walks: Pros/Cons
BENEFITS
• KG query estimation independent of KG size
• Model training produces interpretable, logical rules
• Robust to noisy extractions through probabilistic form
DRAWBACKS
• Full KG completion task inefficient
• Training data difficult to obtain at scale
• Input must follow probabilistic semantics
78
Two classes of Probabilistic Models
GRAPHICAL MODELS
◦ Possible facts in KG are variables
◦ Logical rules relate facts
◦ Probability ∝ satisfied rules
◦ Universally quantified
RANDOM WALK METHODS
◦ Possible facts posed as queries
◦ Random walks of the KG constitute "proofs"
◦ Probability ∝ path lengths/transitions
◦ Locally grounded
79
MATRICES, TENSORS, AND NEURAL NETWORKS

Probabilistic Models: Downsides
• Limited to logical relations
◦ Representation restricted by manual design
◦ Clustering? Asymmetric implications?
◦ Difficult to generalize to unseen entities/relations
• Computational complexity of algorithms
◦ Complexity depends on explicit dimensionality
◦ Often NP-hard in the size of the data
◦ More rules, more expensive inference
◦ Query-time inference is sometimes NP-hard
◦ Not trivial to parallelize, or to use GPUs

Embeddings
• Everything as dense vectors
◦ Can capture many relations
◦ Information flows through these relations
◦ Learned from data
• Complexity depends on latent dimensions
◦ Learning uses stochastic gradient descent and back-propagation
◦ Querying is often cheap
◦ GPU-parallelism friendly
2
Two Related Tasks

[Figure: Relation Extraction maps surface patterns between entity pairs to relations; Graph Completion predicts missing relation edges within the graph]
4
What is NLP?
John was born in Liverpool, to Julia and Alfred Lennon.

Natural Language Processing
• Coreference: Lennon.. Mrs. Lennon.. his father... John Lennon... the Pool.. his mother.. he
• Entity types: John [Person] was born in Liverpool [Location], to Julia [Person] and Alfred Lennon [Person].
• POS tags: NNP VBD VBN IN NNP TO NNP CC NNP NNP
5
What is Information Extraction?
From annotated text (coreference, entity types, POS tags) to an extraction graph:

Information Extraction
• birthplace(John Lennon, Liverpool)
• childOf(John Lennon, Alfred Lennon)
• childOf(John Lennon, Julia Lennon)
• spouse(Julia Lennon, Alfred Lennon)
6
Relation Extraction From Text
John was born in Liverpool, to Julia and Alfred Lennon.

Surface patterns between entity pairs:
• (John Lennon, Liverpool): "was born in"
• (John Lennon, Julia Lennon): "was born to"
• (John Lennon, Alfred Lennon): "was born to"
• (Liverpool, Julia Lennon) and (Liverpool, Alfred Lennon): "born in __, to"
• (Julia Lennon, Alfred Lennon): "and"
7
Relation Extraction From Text
John was born in Liverpool, to Julia and Alfred Lennon.

Surface patterns are mapped to KG relations:
• "was born in" → birthplace(John Lennon, Liverpool)
• "was born to" → childOf(John Lennon, Julia Lennon), childOf(John Lennon, Alfred Lennon)
• "born in __, to" → livedIn(Julia Lennon, Liverpool), livedIn(Alfred Lennon, Liverpool)
8
"Distant" Supervision
"was born in": birthplace(John Lennon, Liverpool)

No direct supervision gives us this information.
Supervised: too expensive to label sentences
Rule-based: too much variety in language
Both only work for a small set of relations, i.e. 10s, not 100s

Distant supervision: any pattern observed with (Barack Obama, Honolulu) becomes evidence for birthplace: "is native to", "was born in", but also noisy patterns like "visited", "met the senator from"

9
Relation Extraction as a Matrix
John was born in Liverpool, to Julia and Alfred Lennon.

[Matrix: rows are entity pairs such as (John Lennon, Liverpool), (John Lennon, Julia Lennon), (John Lennon, Alfred Lennon), (Julia Lennon, Alfred Lennon), (Barack Obama, Hawaii), (Barack Obama, Michelle Obama); columns are surface patterns and KG relations; observed cells are 1, and ? cells are to be predicted]

Universal Schema: Riedel et al., NAACL (2013)


Matrix Factorization

[Figure: the n × m (pairs × relations) matrix X is approximated by the product of an n × k entity-pair matrix and a k × m relation matrix; a cell such as bornIn(John, Liverpool) is scored by the dot product of its pair and relation vectors]

Universal Schema: Riedel et al., NAACL (2013)


Training: Stochastic Updates

[Figure: pairs × relations matrix with pair embeddings v and relation embeddings w]

Pick an observed cell (pair p, relation r):
◦ Update v_p & w_r such that the score v_p · w_r is higher
Pick any random cell, assume it is negative:
◦ Update v_p & w_r such that the score is lower
12
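A compact sketch of this update scheme (the hyperparameters and data are made up, not from Universal Schema): raise scores of observed cells, lower scores of randomly sampled negatives.

    import numpy as np

    n_pairs, n_relations, k = 50, 20, 8
    rng = np.random.default_rng(0)
    V = 0.1 * rng.standard_normal((n_pairs, k))       # entity-pair embeddings v
    W = 0.1 * rng.standard_normal((n_relations, k))   # relation embeddings w
    observed = [(0, 1), (0, 2), (3, 1)]               # toy observed (pair, relation) cells
    lr = 0.05

    def sgd_step(p, r, label):
        prob = 1 / (1 + np.exp(-(V[p] @ W[r])))       # logistic score of the cell
        grad = prob - label                           # gradient of log loss w.r.t. score
        V[p], W[r] = V[p] - lr * grad * W[r], W[r] - lr * grad * V[p]

    for _ in range(1000):
        p, r = observed[rng.integers(len(observed))]
        sgd_step(p, r, 1.0)                                              # observed: push up
        sgd_step(rng.integers(n_pairs), rng.integers(n_relations), 0.0)  # sampled negative

    print("score of an observed cell:", V[0] @ W[1])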
Relation Embeddings

13
Embeddings ~ Logical Relations
Relation Embeddings, w
◦ Similar embeddings for two relations denote that they are paraphrases
◦ is married to, spouseOf(X,Y), /person/spouse
◦ One embedding can be contained by another
◦ w(topEmployeeOf) ⊂ w(employeeOf)
◦ topEmployeeOf(X,Y) → employeeOf(X,Y)
◦ Can capture logical patterns, without needing to specify them!

Entity Pair Embeddings, v
◦ Similar entity pairs denote similar relations between them
◦ Entity pairs may describe multiple "relations", e.g., independent foundedBy and employeeOf relations

From Sebastian Riedel


Similar Embeddings

[Matrix: surface patterns "X own percentage of Y" and "X buy stake in Y" receive similar underlying embeddings; observed entity pairs include (Time, Inc.; Amer. Tel. and Comm.), (Volvo; Scania A.B.), (Campeau; Federated Dept Stores), (Apple; HP)]

Successfully predicts "Volvo owns percentage of Scania A.B." from "Volvo bought a stake in Scania A.B."

From Sebastian Riedel


Implications
X historian at Y → X professor at Y

[Matrix: (R. Freeman, Harvard) observed with "X historian at Y"; (Kevin Boyle, Ohio State) observed with "X professor at Y"; the model generalizes from (Freeman, Harvard) to (Boyle, OhioState)]

Learns asymmetric entailment:
PER historian at UNIV → PER professor at UNIV
But not the reverse:
PER professor at UNIV ↛ PER historian at UNIV

From Sebastian Riedel


Two Related Tasks

[Figure: Relation Extraction maps surface patterns between entity pairs to relations; Graph Completion predicts missing relation edges within the graph]
17
Graph Completion

[Figure: extraction graph around John Lennon, Liverpool, Julia Lennon, and Alfred Lennon, containing surface patterns ("was born in", "was born to", "born in __, to", "and") and relations livedIn, childOf, birthplace]

Graph completion fills in the missing relation edges, e.g., spouse(Julia Lennon, Alfred Lennon).
19
Tensor Formulation of KG

[Figure: a |E| × |E| × |R| binary tensor; the cell (e1, r, e2) asks: does an unseen relation exist?]
20
Factorize that Tensor

[Figure: the |E| × |E| × |R| tensor is factorized into a |E| × k entity embedding matrix (shared by e1 and e2) and a k × k × |R| relation core tensor]
21
Many Different Factorizations
• CANDECOMP/PARAFAC (CP) decomposition
• Tucker2 and RESCAL decompositions
• Model E
• Holographic Embeddings (not tensor factorization per se)

HolE: Nickel et al., AAAI (2016); Model E: Riedel et al., NAACL (2013); RESCAL: Nickel et al., WWW (2012); CP: Harshman (1970); Tucker2: Tucker (1966)
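To make the tensor view concrete, here is a RESCAL-style scoring sketch (the shapes and data are illustrative, not any paper's released code): each relation r gets a k × k matrix W_r, each entity a k-vector, and score(e1, r, e2) = e1ᵀ W_r e2.

    import numpy as np

    k, n_entities, n_relations = 8, 10, 3
    rng = np.random.default_rng(0)
    E = rng.standard_normal((n_entities, k))       # entity embeddings
    W = rng.standard_normal((n_relations, k, k))   # one core matrix per relation

    def score(e1: int, r: int, e2: int) -> float:
        return float(E[e1] @ W[r] @ E[e2])

    # Rank all tail entities for a (head, relation) query
    head, rel = 0, 1
    ranking = sorted(range(n_entities), key=lambda t: -score(head, rel, t))
    print("best tail candidates:", ranking[:3])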
Translation Embeddings
TransE models a relation as a translation in embedding space: e1 + r ≈ e2, e.g., Barack Obama + birthplace ≈ Honolulu, John Lennon + birthplace ≈ Liverpool.
TransH and TransR are variants that first project entities onto relation-specific hyperplanes or spaces.

TransE: Bordes et al., NIPS (2013); TransH: Wang et al., AAAI (2014); TransR: Lin et al., AAAI (2015)
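A minimal TransE training sketch (the margin, learning rate, and triples are assumed): score a triple by −‖h + r − t‖ and apply a margin ranking update against corrupted triples.

    import numpy as np

    k, n_entities, n_relations = 16, 20, 4
    rng = np.random.default_rng(0)
    E = rng.normal(scale=0.1, size=(n_entities, k))   # entity embeddings
    R = rng.normal(scale=0.1, size=(n_relations, k))  # relation embeddings
    triples = [(0, 1, 2), (3, 1, 4)]                  # toy (head, relation, tail) facts
    margin, lr = 1.0, 0.01

    def dist(h, r, t):
        return np.linalg.norm(E[h] + R[r] - E[t])

    for _ in range(500):
        h, r, t = triples[rng.integers(len(triples))]
        t_bad = int(rng.integers(n_entities))          # corrupt the tail
        if dist(h, r, t) + margin > dist(h, r, t_bad): # margin violated: update
            g_pos = (E[h] + R[r] - E[t]) / (dist(h, r, t) + 1e-9)
            g_neg = (E[h] + R[r] - E[t_bad]) / (dist(h, r, t_bad) + 1e-9)
            E[h] -= lr * (g_pos - g_neg)
            R[r] -= lr * (g_pos - g_neg)
            E[t] += lr * g_pos
            E[t_bad] -= lr * g_neg

    print("score(0, 1, 2):", -dist(0, 1, 2))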
Parameter Estimation

[Figure: the |E| × |E| × |R| tensor with one highlighted cell (e1, r, e2)]

• Observed cell: increase score
• Unobserved cell: decrease score
24
Matrix vs Tensor Factorization

Matrix factorization:
• Vectors for each entity pair
• Can only predict for entity pairs that appear in text together
• No sharing for the same entity in different entity pairs

Tensor factorization:
• Vectors for each entity
• Assumes entity pairs are "low-rank", but many relations are not!
◦ Spouse: you can have only ~1
• Cannot learn pair-specific information
25
What they can, and can't, do..

From Singh et al. VSM (2015), http://sameersingh.org/files/papers/mftf-vsm15.pdf


Joint Extraction+Completion

[Figure: a joint model couples relation extraction (surface patterns → relations) with graph completion (predicting missing relation edges)]
27
Compositional Neural Models
So far, we're learning vectors for each entity/surface pattern/relation..
But learning vectors independently ignores "composition"

Composition in Surface Patterns
• Every surface pattern is not unique
• Synonymy: "A is B's spouse." / "A is married to B."
• Inverse: "X is Y's parent." / "Y is one of X's children."
• Can the representation learn this?

Composition in Relation Paths
• Every relation path is not unique
• Explicit: A parent B, B parent C ⇒ A grandparent C
• Implicit: X bornInCity Y, Y cityInState Z ⇒ X "bornInState" Z
• Can the representation capture this?
28
Composing Dependency Paths
"... was born to ..." and "... 's parents are ..." (\parentsOf) mean similar things, even if one never appears in the training data.
But we don't need linked data to know they mean similar things...
Use neural networks to produce the embeddings from text!

[Figure: an NN encoder maps each textual pattern to its embedding]

Verga et al. (2016), https://arxiv.org/pdf/1511.06396v2.pdf


Composing Relational Paths

[Figure: an RNN composes the path isBasedIn(Microsoft, Seattle), stateLocatedIn(Seattle, Washington), countryLocatedIn(Washington, USA) into the implied relations stateBasedIn and countryBasedIn]

Neelakantan et al. (2015), http://www.aaai.org/ocs/index.php/SSS/SSS15/paper/viewFile/10254/10032
Lin et al., EMNLP (2015), https://arxiv.org/pdf/1506.00379.pdf
Review: Embedding Techniques
Two Related Tasks:
• Relation Extraction from Text
• Graph (or Link) Completion

Relation Extraction:
• Matrix Factorization Approaches

Graph Completion:
• Tensor Factorization Approaches

Compositional Neural Models


• Compose over dependency paths
• Compose over relation paths

31
The evolution of knowledge representation
• Facts-centric serving via entity serving: knowledge indexes
• Relation-centric serving via graph serving: knowledge graph
• Logic-centric serving via symbolic reasoning: knowledge reasoning


Why is a big knowledge graph not enough?
• Large knowledge graphs have billions of facts

• However, they don't provide much help with logical reasoning:

o The knowledge is not symbolized logical knowledge

o They lack the reasoning rules that would allow machines to reason automatically

o More importantly, they lack common sense


The pyramid of knowledge (concrete at the base, abstract at the top)
• Facts base: Freebase, etc.
• Common sense base: ConceptNet
• Logic rule base
Knowledge in symbolic logic form
• Symbols are abstract identifiers that can be manipulated in an algebra system
◦ Variables
◦ Functions

• A symbolic expression is a finite combination of symbols

• Symbolic transformation: a symbolic expression can be transformed into another symbolic expression according to the rules of a predefined reasoning algebra
◦ An inference engine tries to derive answers for a logic question by performing logical deductions
Represents Satori facts and common sense knowledge in RHHG

[Figure: ConceptNet (common sense base: dog IsA animal, 0.98; dog CapableOf bark, 0.8) layered over Satori (facts base: Lassie, Pal, Jan Clayton as person and actor), connected by logic rules]
Functions and relations are just hyperedges!
• A relation instance such as IsA(dog, animal) is just a hyperedge connecting its nodes.

• A logical expression can be written as a graph of such hyperedges.

• Symbolic transformation is just graph pattern matching and graph transformation!

[Figure: the Satori/ConceptNet graph from the previous slide (dog CapableOf bark, dog IsA animal; Lassie, Pal, Jan Clayton) connected to a logic rule base through hyperedges]
Use graph transformation to do logic deduction

Pal IsA Dog, Dog IsA Animal
⇒ Pal IsA Animal
(the logical deduction of a transitive relation)

Graph transformation: whenever we see a subgraph matching a certain pattern, replace it with another graph.
Our "shallow" yet reasonable answer
• Why can Albert Einstein think, while a computer can't?
◦ [brain] is CapableOf [think]
◦ [person] has a [brain]
◦ [Albert Einstein] IsA [person]
◦ [think] HasPrerequisite [brain]
◦ [computer] does not have a [brain]

[Figure: query pattern ?x IsA ?y, ?y PartOf ?z, ?z CapableOf think; Albert Einstein IsA person matches the chain, computer does not]
Multimodal KB Embeddings

Object encoders, one per modality:
• Entities: embedding lookup
• Images: CNN
• Text: LSTM
• Numbers, etc.: feed-forward network


Knowledge as Supervision
✔ spouseOf(Barack, Michelle): the learning algorithm updates the learned model
• Problem 1: Each annotation takes time
• Problem 2: Each annotation is a drop in the ocean

Instead, supervise with knowledge:
✔ X husband of Y => spouseOf(X,Y): the learning algorithm updates the learned model

Many different options
- Generalized Expectation
- Posterior Regularization
- Labeling functions in SNORKEL
45
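A library-free sketch of the labeling-function idea popularized by Snorkel (the rules and aggregation below are our own toy stand-ins, not Snorkel's API):

    import re

    SPOUSE, ABSTAIN = 1, 0

    def lf_husband_of(sentence):
        # Encodes the rule: "X husband of Y" => spouseOf(X, Y)
        return SPOUSE if re.search(r"husband of", sentence) else ABSTAIN

    def lf_married(sentence):
        return SPOUSE if "married" in sentence else ABSTAIN

    sentences = [
        "Barack, husband of Michelle, spoke today.",
        "They married in 1992.",
        "Barack met a senator.",
    ]
    for s in sentences:
        votes = [lf(s) for lf in (lf_husband_of, lf_married)]
        label = SPOUSE if any(votes) else ABSTAIN   # toy aggregation; Snorkel learns one
        print(label, s)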
(2) Future research directions:
Online KG Construction
• From one-shot KG construction to online KG construction
• Consume an online stream of data
• Temporal scoping of facts
• Discovering new concepts automatically
• Self-correcting systems

62
(2) Future research directions:
Online KG Construction
• Continuously learning and self-correcting systems
• [Selecting Actions for Resource-bounded Information Extraction using
Reinforcement Learning, Kanani and McCallum, WSDM 2012]
• Presented a reinforcement learning framework for budget constrained information extraction

• [Never-Ending Learning, Mitchell et al. AAAI 2015]


• Tom Mitchell says “Self reflection and an explicit agenda of learning subgoals” is an important
direction of future research for continuously learning systems.

63
