Towards Knowledge Graphs Validation Through Weighted Knowledge Sources

Knowledge graphs (KGs) have been shown to be an important asset of large companies, which expect them to provide correct and reliable knowledge. To that end, a critical task is knowledge validation, which measures whether statements from KGs are semantically correct and correspond to the so-called "real" world. We propose a Knowledge Graph Validation Framework.

Towards Knowledge Graphs Validation

through Weighted Knowledge Sources


Elwin Huaman, Amar Tauqeer, and Anna Fensel
Semantic Technology Institute (STI) Innsbruck
Department of Computer Science,
University of Innsbruck, Austria

KGSWC 2021 Next Generation


Outline

● What?
Basics - Research questions

● How?
Approach - Solution

● Why?
Use cases

Elwin Huaman | KGSWC2021 | 23/11/2021 2


What?
Basics - Research questions



What?
Weighted Knowledge Sources
Weighted knowledge sources are data sources that have different weights (or degrees of
importance) for different application scenarios.

Which KG is best for me?


● Quality - fitness for use
● Whether data complies with the user's needs
● Dependent on tasks



What?
Knowledge Graphs
Knowledge Graphs are very large semantic nets that integrate various and heterogeneous
information sources to represent knowledge about certain domains of discourse.
[Figure: example graph with the entities :anna, :luis, :carol, :cs101, :cs102, and :Puno, connected by :name, :age, :knows, :birthPlace, :enrolledIn, :subject, and sameAs. Legend: Entity, Literal, Relationship, sameAs relationship.]

● Basic statement (or triple)
● We can add more statements
● … and more statements can be added
● Forming a graph
● Graphs can be created independently
● … and can be integrated

prefix : <http://example.org/>



What?
Knowledge Graphs Validation
The Knowledge Graph Validation task aims to measure whether statements from KGs are
semantically correct and correspond to the so-called "real" world.

The University of Innsbruck is located in the city of Innsbruck

A simple statement or triple.

A triple = (subject, predicate, object)

A triple: (University of Innsbruck, is located in, City of Innsbruck)

An RDF triple: <http://example.com/University_of_Innsbruck> <http://schema.org/containedInPlace> <http://example.com/Innsbruck>
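
As a minimal sketch of the (subject, predicate, object) structure above, the statement can be held in a small Python tuple type and written out as one N-Triples line. The names `Triple` and `to_ntriples` are illustrative assumptions, not part of the presented framework, and the serializer only covers the case where all three terms are IRIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A statement as (subject, predicate, object), each given as an IRI string."""
    subject: str
    predicate: str
    object: str


def to_ntriples(t: Triple) -> str:
    """Serialize a triple whose three terms are all IRIs as one N-Triples line."""
    return f"<{t.subject}> <{t.predicate}> <{t.object}> ."


# The example statement from the slide.
uni_in_innsbruck = Triple(
    subject="http://example.com/University_of_Innsbruck",
    predicate="http://schema.org/containedInPlace",
    object="http://example.com/Innsbruck",
)
print(to_ntriples(uni_in_innsbruck))
```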



What?
Knowledge Graphs Validation
[Figure: the example graph annotated with the types so:Course, so:Product, and so:Place; entities :anna, e:luis, :carol, :cs101, :Puno, and e:Puno; properties so:name, so:age, so:knows, so:birthPlace, so:teaches, so:courseCode, and sameAs. Legend: Type, Entity, Literal, Relationship, sameAs relationship.]

What needs to be fixed?

● Wrong instance assertion
E.g. :anna is a Person, not a Product
● Wrong property value assertion
E.g. so:knows is semantically wrong
● Wrong equality assertion
E.g. :Puno and e:Puno are related, but not the same
● …

prefix : <http://example.org/>
prefix e: <http://example.com/>
prefix so: <http://schema.org/>



What?
Towards Knowledge Graphs Validation through Weighted
Knowledge Sources
Compute a confidence score for every triple (or statement) and instance in KGs. The computed score is
based on finding the same instances across different weighted knowledge sources and comparing their
features.

[Figure: the Validator with the KG, the Reliable KGs, and the Weights; output labeled [0.1].]

How?
Approach - Solution



How?
Towards Knowledge Graphs Validation through Weighted Knowledge Sources

Input: The user has two options: a) provide a SPARQL endpoint from which to fetch the data, or b)
load a dataset in Turtle format.
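
A hedged sketch of how the two input options could be modeled. `ValidatorInput`, `sparql_get_url`, and the endpoint URL are hypothetical names chosen for illustration, and the rule that exactly one option is given is an assumption. The only external convention relied on is that a SPARQL Protocol GET request carries the query in a `query` parameter.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode


@dataclass(frozen=True)
class ValidatorInput:
    """Hypothetical container for the two input options named on the slide."""
    sparql_endpoint: Optional[str] = None  # option a)
    turtle_path: Optional[str] = None      # option b)

    def __post_init__(self) -> None:
        # Assumption: exactly one of the two options is given.
        if (self.sparql_endpoint is None) == (self.turtle_path is None):
            raise ValueError("give exactly one of sparql_endpoint or turtle_path")


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL; the query goes in the 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


# Hypothetical endpoint; no request is sent here.
source = ValidatorInput(sparql_endpoint="http://example.org/sparql")
url = sparql_get_url(source.sparql_endpoint, "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10")
print(url)
```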

How?
Towards Knowledge Graphs Validation through Weighted Knowledge Sources

Mapping: The validator maps the input KG and the external sources to a common format.
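
One way to picture the mapping step is a per-source lookup table that renames source-specific properties to a shared vocabulary. This is only a sketch: the source names, property names, and the `map_to_common` helper are assumptions made for illustration, and the slides do not say which common format the Validator uses.

```python
from typing import Dict

# Hypothetical per-source mappings from source property names to common names.
PROPERTY_MAPPINGS: Dict[str, Dict[str, str]] = {
    "user_kg": {"schema:name": "name", "schema:telephone": "phone"},
    "source_a": {"label": "name", "phoneNumber": "phone"},
}


def map_to_common(source: str, record: Dict[str, str]) -> Dict[str, str]:
    """Rename the mapped properties of a record; unmapped properties are dropped."""
    mapping = PROPERTY_MAPPINGS[source]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


# Two records for the same made-up hotel end up with identical keys.
from_kg = map_to_common("user_kg", {"schema:name": "Hotel X", "schema:telephone": "+43 1"})
from_a = map_to_common("source_a", {"label": "Hotel X", "phoneNumber": "+43 1", "rating": "4"})
print(from_kg == from_a)
```

After mapping, records from different sources can be compared property by property.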

How?
Towards Knowledge Graphs Validation through Weighted Knowledge Sources

Instance Matching: The Validator asks the user to define at least two properties (e.g., name and geo
coordinates) to be used for the instance matching process.
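
A minimal sketch of matching on the two example properties, name and geo coordinates. The similarity measure (`difflib.SequenceMatcher`), the haversine distance, the thresholds, and the sample records are all assumptions for illustration; the slides do not state which measures the Validator applies.

```python
import math
from difflib import SequenceMatcher
from typing import Dict


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def is_match(a: Dict, b: Dict, min_name_sim: float = 0.85, max_km: float = 0.2) -> bool:
    """Two instances match if their names are similar AND they are close together."""
    close = haversine_km(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_km
    return name_similarity(a["name"], b["name"]) >= min_name_sim and close


# Made-up records: same hotel in two sources, and a namesake far away.
kg_hotel = {"name": "Hotel Alpenblick", "lat": 47.2692, "lon": 11.4041}
ext_hotel = {"name": "hotel alpenblick", "lat": 47.2693, "lon": 11.4042}
namesake = {"name": "Hotel Alpenblick", "lat": 48.2082, "lon": 16.3738}
print(is_match(kg_hotel, ext_hotel), is_match(kg_hotel, namesake))
```

Requiring both properties to agree is what keeps the namesake from being matched on name alone.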

How?
Towards Knowledge Graphs Validation through Weighted Knowledge Sources

Confidence Measurement / Triple validation: Calculates a confidence score for whether a property value
in the user’s KG matches the corresponding property value in various external sources.
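
One plausible reading of the triple score is a weighted mean of value similarities over the external sources. The sketch below assumes that formula; the exact formula of the framework is not given on the slides, and `triple_confidence`, the source names, and the weights are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict


def value_similarity(a: str, b: str) -> float:
    """String similarity in [0, 1], ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def triple_confidence(kg_value: str, source_values: Dict[str, str],
                      weights: Dict[str, float]) -> float:
    """Weighted mean similarity between the KG value and each source's value.

    Only sources that actually provide a value contribute to the score.
    """
    total_weight = sum(weights[source] for source in source_values)
    if total_weight == 0:
        return 0.0
    weighted = sum(weights[source] * value_similarity(kg_value, value)
                   for source, value in source_values.items())
    return weighted / total_weight


# Hypothetical weights: source_a counts three times as much as source_b.
weights = {"source_a": 3.0, "source_b": 1.0}
score = triple_confidence("abc", {"source_a": "abc", "source_b": "xyz"}, weights)
print(score)
```

Changing the weights per application scenario changes which source dominates the score, which is the role the weighted knowledge sources play in the approach.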

How?
Towards Knowledge Graphs Validation through Weighted Knowledge Sources

Confidence Measurement / Instance validation: Computes the aggregated score from the attribute
space of an instance.
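
A sketch of the aggregation, assuming the instance score is a (optionally weighted) mean of the per-property triple scores. The slides only say the score is aggregated from the attribute space, so the mean, the helper name `instance_confidence`, and the property weights are assumptions.

```python
from typing import Dict, Optional


def instance_confidence(triple_scores: Dict[str, float],
                        property_weights: Optional[Dict[str, float]] = None) -> float:
    """Aggregate per-property triple scores into one instance score.

    Without property weights this is the plain mean; properties missing from
    the weights dictionary default to a weight of 1.0.
    """
    if not triple_scores:
        return 0.0
    weights = property_weights or {}
    total = sum(weights.get(prop, 1.0) for prop in triple_scores)
    if total == 0:
        return 0.0
    return sum(weights.get(prop, 1.0) * score
               for prop, score in triple_scores.items()) / total


scores = {"name": 1.0, "address": 0.5}
print(instance_confidence(scores))
print(instance_confidence(scores, {"name": 3.0, "address": 1.0}))
```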

How?
Towards Knowledge Graphs Validation through Weighted Knowledge Sources

Output: The computed scores for triples and instances are shown in a graphical user interface.

[Figure: Validator pipeline. Inputs: KG, Reliable KGs, and Weights. Steps: Mapping (DS), Instance matching, and Confidence Measurement (Triple validation, Instance validation). Outputs labeled [0.1].]

How?
Towards Knowledge Graphs Validation through Weighted Knowledge Sources

Evaluation I:

● Dataset: A subset of the Tirol Knowledge Graph (~15 billion statements).
○ 50 Hotel instances
● Baseline: We performed a manual validation.
○ Precision, Recall, and F-measure
● Result: F-measure of at least 75% on address, name, and phone properties.

[Figure: Comparison of precision, recall, and F-measure scores over the manual and semi-automatic validation.]
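
For reference, the three evaluation measures can be computed as follows. This is a generic sketch using the standard definitions; the identifiers and the toy sets are made up and do not reproduce the evaluation data. Here `predicted` stands for the statements a validator accepts and `gold` for those the manual baseline accepts.

```python
from typing import Set, Tuple


def precision_recall_f1(predicted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Precision, recall, and F-measure (F1) of predicted items against a gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: 4 accepted statements, 6 in the gold set, 3 in common.
predicted = {"t1", "t2", "t3", "t4"}
gold = {"t2", "t3", "t4", "t5", "t6", "t7"}
precision, recall, f1 = precision_recall_f1(predicted, gold)
print(precision, recall, f1)
```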

How?
Towards Knowledge Graphs Validation through Weighted Knowledge Sources

Evaluation II:

● Dataset: Pantheon dataset (11,341 famous biographies)
○ 2,530 politician instances
● Setup: We defined two external sources.
○ Wikidata and DBpedia.
● Result: ~15 minutes.
○ Overall recall scores are
■ 0.36% (DBpedia)
■ 0.49% (Wikidata)

[Figure: The recall score results of the validation of politician instances.]

Why?
Use cases

Why?
Towards Knowledge Graphs Validation through Weighted Knowledge Sources

Use cases:
Semantic correctness of a triple.
E.g. validating whether the data shown for a person or business is correct, based on different sources.

Linking different Knowledge Sources.
E.g. linking an instance of the user’s KG with the matched instance in Wikidata.

Validating static data.
E.g. checking whether the addresses of hotels are up to date and correctly shown by external sources.



Insights & Limitations
❏ Assessment
❏ Automation
❏ Cost-effectiveness
❏ Dynamic-data
❏ Scalability



Summary
● A Validation framework
○ Mapping
○ Instance Matching
○ Confidence Measurement
■ Triple validation
■ Instance Validation
○ GUI
● Use cases
● Insights and limitations



Acknowledgement
STI
Univ.-Prof. Dr. Fensel Dieter
Assoc.-Prof. Dr. Fensel Anna
Tauqeer Amar M.Sc.

Projects
MindLab (mindlab.ai)
Next Generation WordLiftNG (wordlift.io/ng/)
