Introducing inference-driven OWL ABox enrichment
Proceedings of International Conference on Information Integration and Web …, 2013•dl.acm.org
Publicly available text-based documents (e.g. news, meeting transcripts) are an important source of knowledge for organizations and individuals. These documents refer to domain entities such as persons, places, professional positions, decisions and actions. Querying these documents (instead of browsing, searching and finding) is a relevant task for anyone, and particularly for professionals dealing with knowledge-intensive tasks. Querying the data in text-based documents, however, is not supported by common technology. For that, the content of such documents has to be explicitly and formally captured as knowledge base facts. Using automatic NLP processes to capture such facts is a common approach, but their relatively low precision and recall give rise to data quality problems. Further, the facts present in the documents are often insufficient to answer complex queries, so it is often necessary to enrich the captured facts with facts from third-party repositories (e.g. public LOD, private IS databases). This paper describes the process adopted to identify which data is currently missing from the knowledge base repository and is desirable to collect from external repositories. The proposed process both fosters and is driven by OWL DL inference-based instance (ABox) classification, which is supported by the constraints of the TBox.
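The idea of letting classification decide which facts are worth collecting can be illustrated with a minimal sketch. It is not the process from the paper and does not use an OWL DL reasoner: it assumes a toy TBox in which each class is defined only by properties that must have at least one value, and every class, property and individual name (`Mayor`, `governsPlace`, `alice`, and so on) is hypothetical. Note that a real OWL reasoner works under the open-world assumption, so "missing" here only means "not yet asserted in the ABox".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefinedClass:
    """Toy stand-in for an OWL class defined by 'some value' restrictions."""
    name: str
    required: frozenset  # properties that need at least one asserted value


# Hypothetical TBox; the names are illustrative, not taken from the paper.
TBOX = [
    DefinedClass("Mayor", frozenset({"holdsPosition", "governsPlace"})),
    DefinedClass("Attendee", frozenset({"attendedMeeting"})),
]

# Hypothetical ABox facts captured from documents: (subject, property, object).
ABOX = {
    ("alice", "holdsPosition", "mayorOffice"),
    ("alice", "attendedMeeting", "meeting42"),
}


def asserted_properties(abox, individual):
    """Return the properties that have at least one value for the individual."""
    return {prop for subj, prop, _ in abox if subj == individual}


def classify(abox, tbox, individual):
    """Return the classes whose required properties are all asserted."""
    have = asserted_properties(abox, individual)
    return sorted(c.name for c in tbox if c.required <= have)


def missing_facts(abox, tbox, individual):
    """For each class not yet inferred, list the properties still lacking."""
    have = asserted_properties(abox, individual)
    return {
        c.name: sorted(c.required - have)
        for c in tbox
        if not c.required <= have
    }


if __name__ == "__main__":
    print(classify(ABOX, TBOX, "alice"))
    print(missing_facts(ABOX, TBOX, "alice"))
```

In this sketch, the properties reported by `missing_facts` are the candidates to look up in an external repository; once such a fact is added to the ABox, `classify` can place the individual in the corresponding class.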