DBpedia: A Nucleus for a Web of Open Data
1 Introduction
It is now almost universally acknowledged that stitching together the world’s
structured information and knowledge to answer semantically rich queries is
one of the key challenges of computer science, and one that is likely to have
tremendous impact on the world as a whole. This has led to almost 30 years
of research into information integration [15,19] and ultimately to the Semantic
Web and related technologies [1,11,13]. Such efforts have generally only gained
traction in relatively small and specialized domains, where a closed ontology,
vocabulary, or schema could be agreed upon. However, the broader Semantic
Web vision has not yet been realized, and one of the biggest challenges facing such
efforts has been how to get enough “interesting” and broadly useful information
into the system to make it useful and accessible to a general audience.
A challenge is that the traditional “top-down” model of designing an ontology
or schema before developing the data breaks down at the scale of the Web: both
data and metadata must constantly evolve, and they must serve many different
communities. Hence, there has been a recent movement to build the Seman-
tic Web grass-roots-style, using incremental and Web 2.0-inspired collaborative
approaches [10,12,13]. Such a collaborative, grass-roots Semantic Web requires
a new model of structured information representation and management: first
and foremost, it must handle inconsistency, ambiguity, uncertainty, data prove-
nance [3,6,8,7], and implicit knowledge in a uniform way.
Perhaps the most effective way of spurring synergistic research along these
directions is to provide a rich corpus of diverse data. This would enable re-
searchers to develop, compare, and evaluate different extraction, reasoning, and
uncertainty management techniques, and to deploy operational systems on the
Web.
The DBpedia project has derived such a data corpus from the Wikipedia
encyclopedia. Wikipedia is heavily visited and under constant revision (e.g.,
according to alexa.com, Wikipedia was the 9th most visited website in the third
quarter of 2007). Wikipedia editions are available in over 250 languages, with
the English one accounting for more than 1.95 million articles. Like many other
web applications, Wikipedia has the problem that its search capabilities are
limited to full-text search, which only allows very limited access to this valuable
knowledge base. As has been highly publicized, Wikipedia also exhibits many
of the challenging properties of collaboratively edited data: it has contradictory
data, inconsistent taxonomical conventions, errors, and even spam.
The DBpedia project focuses on the task of converting Wikipedia content
into structured knowledge, such that Semantic Web techniques can be employed
against it — asking sophisticated queries against Wikipedia, linking it to other
datasets on the Web, or creating new applications or mashups. We make the
following contributions:
The DBpedia datasets can either be imported into third-party applications or accessed online through a variety of DBpedia user interfaces. Figure 1 gives an overview of the DBpedia information extraction process and shows how extracted data is published on the Web. The main DBpedia interfaces currently use Virtuoso [9] and MySQL as storage back-ends.
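For illustration only, the following sketch queries such a Virtuoso-backed store over HTTP. It assumes the public SPARQL endpoint at http://dbpedia.org/sparql and Virtuoso's query/format request parameters; neither the endpoint URL nor the example query is taken from this paper.

# Sketch: ask a Virtuoso-backed SPARQL endpoint for the English abstract of Busan.
# Endpoint URL, request parameters, and query are illustrative assumptions.
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"  # assumed public endpoint

query = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?comment WHERE {
  <http://dbpedia.org/resource/Busan> rdfs:comment ?comment .
  FILTER (lang(?comment) = "en")
}
"""

params = urllib.parse.urlencode({
    "query": query,
    "format": "application/sparql-results+json",
})
with urllib.request.urlopen(ENDPOINT + "?" + params) as response:
    results = json.load(response)

for binding in results["results"]["bindings"]:
    print(binding["comment"]["value"])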
The paper is structured as follows: We give an overview of the DBpedia information extraction techniques in Section 2. The resulting datasets are described in Section 3. We present methods for programmatic access to the DBpedia dataset in Section 4. In Section 5 we present our vision of how the DBpedia datasets can be a nucleus for a Web of open data. We showcase several user interfaces for accessing DBpedia in Section 6 and finally review related work in Section 7.

Figure 1. Overview of the DBpedia information extraction and publishing process: Wikipedia dumps (article texts and DB tables) are loaded into the DBpedia datasets, which are stored in Virtuoso and MySQL and published via Web 2.0 mashups, Semantic Web browsers, and traditional web browsers.
2 Extracting Structured Information from Wikipedia

Wikipedia articles consist mostly of free text, but also contain different types of
structured information, such as infobox templates, categorisation information,
images, geo-coordinates, links to external Web pages and links across different
language editions of Wikipedia.
MediaWiki4 is the software used to run Wikipedia. Due to the nature of this wiki system, essentially all editing, linking, and annotating with metadata is done inside article texts by adding special syntactic constructs. Hence, structured information can be obtained by parsing article texts for these syntactic constructs.
Since MediaWiki exploits some of this information itself for rendering the user
interface, some information is cached in relational database tables. Dumps of the
crucial relational database tables (including the ones containing the article texts)
for different Wikipedia language versions are published on the Web on a regular
basis5. Based on these database dumps, we currently use two different methods of extracting semantic relationships: (1) we map the relationships that are already stored in relational database tables onto RDF, and (2) we extract additional information directly from the article texts and infobox templates within the articles.

4 http://www.mediawiki.org
5 http://download.wikimedia.org/
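To make method (1) concrete, here is a minimal sketch that maps rows of a simplified, hypothetical page-link table onto RDF triples. It is not the project's actual PHP code, and both the table layout and the property namespace are assumptions.

# Sketch of method (1): map rows of a (hypothetical, simplified) relational
# dump table onto RDF triples. The real extraction operates in PHP on the
# actual MediaWiki table layout.
from rdflib import Graph, Namespace

DBPEDIA = Namespace("http://dbpedia.org/resource/")
DBPROP = Namespace("http://dbpedia.org/property/")  # assumed property namespace

# Hypothetical rows from a page-link table: (source article, target article)
pagelink_rows = [
    ("Busan", "South_Korea"),
    ("Busan", "Korean_language"),
]

g = Graph()
for source, target in pagelink_rows:
    g.add((DBPEDIA[source], DBPROP["wikilink"], DBPEDIA[target]))

print(g.serialize(format="turtle"))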
Figure 2. Example of a Wikipedia template and rendered output (excerpt).

We illustrate the extraction of semantics from article texts with a Wikipedia infobox template example. Figure 2 shows the infobox template (encoded within a Wikipedia article) and the rendered output for the South Korean city of Busan.
san. The infobox extraction algorithm detects such templates and recognizes
their structure using pattern matching techniques. It selects significant tem-
plates, which are then parsed and transformed to RDF triples. The algorithm
uses post-processing techniques to increase the quality of the extraction: MediaWiki links are recognized and transformed to suitable URIs; common units are detected and transformed to data types. Furthermore, the algorithm can detect lists of objects, which are transformed to RDF lists. Details about the infobox extraction algorithm (including issues like data type recognition, cleansing heuristics, and identifier generation) can be found in [2]. All extraction algorithms are implemented in PHP and are available under an open-source license6.
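As an illustration only (the actual extraction framework is the PHP code mentioned above), a much-simplified sketch of the infobox idea could look as follows. The property namespace, the template excerpt, and the attribute values are assumptions made for the example.

# Much-simplified sketch of infobox extraction: parse "| key = value" lines of
# an infobox template and emit RDF triples. The real implementation additionally
# handles units, lists, cleansing heuristics, and identifier generation [2].
import re

wikitext = """{{Infobox City
| name       = Busan
| population = 3635389
| country    = [[South Korea]]
}}"""

RESOURCE = "http://dbpedia.org/resource/"
PROPERTY = "http://dbpedia.org/property/"   # assumed property namespace

def infobox_triples(article_title, text):
    triples = []
    for key, value in re.findall(r"^\|\s*(\w+)\s*=\s*(.+)$", text, re.MULTILINE):
        value = value.strip()
        link = re.fullmatch(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]", value)
        if link:  # MediaWiki link -> resource URI
            obj = f"<{RESOURCE}{link.group(1).replace(' ', '_')}>"
        elif value.isdigit():  # very naive data type detection
            obj = f'"{value}"^^<http://www.w3.org/2001/XMLSchema#integer>'
        else:
            obj = f'"{value}"'
        triples.append(f"<{RESOURCE}{article_title}> <{PROPERTY}{key}> {obj} .")
    return triples

print("\n".join(infobox_triples("Busan", wikitext)))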
Linked Data. Linked Data is a method of publishing RDF data on the Web
that relies on http:// URIs as resource identifiers and the HTTP protocol to
retrieve resource descriptions [4,5]. The URIs are configured to return mean-
ingful information about the resource—typically, an RDF description contain-
ing everything that is known about it. Such a description usually mentions re-
lated resources by URI, which in turn can be accessed to yield their descrip-
tions. This forms a dense mesh of web-accessible resource descriptions that can
span server and organization boundaries. DBpedia resource identifiers, such as
http://dbpedia.org/resource/Busan, are set up to return RDF descriptions
when accessed by Semantic Web agents, and a simple HTML view of the same in-
formation to traditional web browsers (see Figure 3). HTTP content negotiation
is used to deliver the appropriate format.
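The content negotiation just described can be exercised with any HTTP client; the following sketch (our own illustration, using only the Python standard library) requests the same DBpedia URI once as RDF and once as HTML.

# Sketch: retrieve one DBpedia resource as RDF and as HTML purely by varying
# the Accept header; the server is assumed to answer with a redirect to the
# appropriate representation.
import urllib.request

uri = "http://dbpedia.org/resource/Busan"

for accept in ("application/rdf+xml", "text/html"):
    request = urllib.request.Request(uri, headers={"Accept": accept})
    with urllib.request.urlopen(request) as response:  # redirects are followed
        print(accept, "->", response.geturl(), response.headers.get("Content-Type"))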
Web agents that can access Linked Data include: 1. Semantic Web browsers like Disco7, Tabulator [17] (see Figure 3), or the OpenLink Data Web Browser8; 2. Semantic Web crawlers like SWSE9 and Swoogle10; 3. Semantic Web query agents like the Semantic Web Client Library11 and the SemWeb client for SWI-Prolog12.
Figure 4. Datasets that are interlinked with DBpedia.

<http://dbpedia.org/resource/Busan>
owl:sameAs <http://sws.geonames.org/1838524/> .
Agents can follow this link, retrieve RDF from the Geonames URI, and
thereby get hold of additional information about Busan as published by the
Geonames server, which again contains further links deeper into the Geonames
data. DBpedia URIs can also be used to express personal interests, places of
residence, and similar facts within personal FOAF profiles:
<http://richard.cyganiak.de/foaf.rdf#cygri>
foaf:topic_interest <http://dbpedia.org/resource/Semantic_Web> ;
foaf:based_near <http://dbpedia.org/resource/Berlin> .
Another use case is categorization of blog posts, news stories and other doc-
uments. The advantage of this approach is that all DBpedia URIs are backed
with data and thus allow clients to retrieve more information about a topic:
<http://news.cnn.com/item1143>
dc:subject <http://dbpedia.org/resource/Iraq_War> .
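Since the topic URI is dereferenceable, a client can simply load the returned description into an RDF graph, as in the following sketch (the use of rdflib is our choice for illustration; it assumes the server answers the request with RDF via content negotiation).

# Sketch: dereference a DBpedia topic URI and inspect the triples behind it.
from rdflib import Graph, URIRef

topic = URIRef("http://dbpedia.org/resource/Iraq_War")

g = Graph()
g.parse(str(topic))  # rdflib sends RDF Accept headers and follows the redirect

print(len(g), "triples describe", topic)
for _, predicate, obj in list(g.triples((topic, None, None)))[:10]:
    print(predicate, obj)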
6 User Interfaces
User interfaces for DBpedia can range from a simple table within a classic web page, through browsing interfaces, to different types of query interfaces. This section gives an overview of the different user interfaces that have been implemented so far.
6.1 Simple Integration of DBpedia Data into Web Pages
7 Related Work
Another project that works on extracting structured information from Wikipedia is the YAGO project [16]. YAGO extracts only 14 relationship types, such as subClassOf, type, familyNameOf, and locatedIn, from different sources of information in Wikipedia. One source is the Wikipedia category system (for subClassOf, locatedIn, diedInYear, bornInYear), and another is Wikipedia redirects. YAGO does not perform infobox extraction as our approach does. For determining (sub-)class relationships, YAGO does not use the full Wikipedia category hierarchy, but links leaf categories to the WordNet hierarchy.
The Semantic MediaWiki project [14,18] also aims at enabling the reuse of
information within Wikis as well as at enhancing search and browse facilities.
Semantic MediaWiki is an extension of the MediaWiki software that allows structured data to be added to wikis using a specific syntax. Ultimately, the DBpedia and Semantic MediaWiki projects have similar goals. Both want to deliver the benefits of structured information in Wikipedia to its users, but they use different approaches to achieve this aim. Semantic MediaWiki requires authors to deal with a new syntax, and covering all structured information within Wikipedia would require converting all existing information into this syntax. DBpedia exploits the structure that already exists within Wikipedia and hence does not require deep technical or methodological changes. However, DBpedia is not as tightly integrated into Wikipedia as is planned for Semantic MediaWiki and is thus limited in its ability to constrain Wikipedia authors towards syntactic and structural consistency and homogeneity.
Another interesting approach is followed by Freebase17. The project aims at building a huge online database which users can edit in a similar fashion to how they edit Wikipedia articles today. The DBpedia community cooperates with Metaweb, and we will interlink data from both sources once Freebase is public.

17 http://www.freebase.com
As future work, we will first concentrate on improving the quality of the DB-
pedia dataset. We will further automate the data extraction process in order to
increase the currency of the DBpedia dataset and synchronize it with changes
in Wikipedia. In parallel, we will keep on exploring different types of user inter-
faces and use cases for the DBpedia datasets. Within the W3C Linking Open Data community project18, we will interlink the DBpedia dataset with further datasets as they get published as Linked Data on the Web. We also plan to exploit synergies between Wikipedia versions in different languages in order to further increase DBpedia coverage and provide quality assurance tools to the Wikipedia community. Such a tool could, for instance, notify a Wikipedia author about contradictions between the content of infoboxes contained in the different language versions of an article. Interlinking DBpedia with other knowledge bases such as Cyc (and their use as background knowledge) could lead to further methods for (semi-)automatic consistency checks for Wikipedia content.

18 http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
DBpedia is a major source of open, royalty-free data on the Web. We hope
that by interlinking DBpedia with further data sources, it could serve as a nucleus
for the emerging Web of Data.
Acknowledgments
We are grateful to the members of the growing DBpedia community, who are
actively contributing to the project. In particular we would like to thank Jörg
Schüppel and the OpenLink team around Kingsley Idehen and Orri Erling.
References
1. Karl Aberer, Philippe Cudré-Mauroux, and Manfred Hauswirth. The chatty web:
Emergent semantics through gossiping. In 12th World Wide Web Conference, 2003.
2. Sören Auer and Jens Lehmann. What have Innsbruck and Leipzig in common? Extracting semantics from wiki content. In Enrico Franconi, Michael Kifer, and
Wolfgang May, editors, ESWC, volume 4519 of Lecture Notes in Computer Science,
pages 503–517. Springer, 2007.
3. Omar Benjelloun, Anish Das Sarma, Alon Y. Halevy, and Jennifer Widom. ULDBs:
Databases with uncertainty and lineage. In VLDB, 2006.
4. Tim Berners-Lee. Linked data, 2006. http://www.w3.org/DesignIssues/
LinkedData.html.
5. Christian Bizer, Richard Cyganiak, and Tom Heath. How to publish linked
data on the web, 2007. http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/
LinkedDataTutorial/.
6. Peter Buneman, Sanjeev Khanna, and Wang Chiew Tan. Why and where: A
characterization of data provenance. In ICDT, volume 1973 of Lecture Notes in
Computer Science, 2001.
7. Christian Bizer. Quality-Driven Information Filtering in the Context of Web-Based
Information Systems. PhD thesis, Freie Universität Berlin, 2007.
8. Yingwei Cui. Lineage Tracing in Data Warehouses. PhD thesis, Stanford Univer-
sity, 2001.
9. Orri Erling and Ivan Mikhailov. RDF support in the Virtuoso DBMS. Volume P-113 of GI-Edition - Lecture Notes in Informatics (LNI), ISSN 1617-5468. Bonner Köllen Verlag, September 2007.
10. Alon Halevy, Oren Etzioni, AnHai Doan, Zachary Ives, Jayant Madhavan, and
Luke McDowell. Crossing the structure chasm. In CIDR, 2003.
11. Alon Y. Halevy, Zachary G. Ives, Dan Suciu, and Igor Tatarinov. Schema mediation
in peer data management systems. In ICDE, March 2003.
12. Zachary Ives, Nitin Khandelwal, Aneesh Kapur, and Murat Cakir. Orchestra:
Rapid, collaborative sharing of dynamic data. In CIDR, January 2005.
13. Anastasios Kementsietsidis, Marcelo Arenas, and Renée J. Miller. Mapping data in
peer-to-peer systems: Semantics and algorithmic issues. In SIGMOD, June 2003.
14. Markus Krötzsch, Denny Vrandecic, and Max Völkel. Wikipedia and the Semantic
Web - The Missing Links. In Jakob Voss and Andrew Lih, editors, Proceedings of
Wikimania 2005, Frankfurt, Germany, 2005.
15. John Miles Smith, Philip A. Bernstein, Umeshwar Dayal, Nathan Goodman, Terry
Landers, Ken W.T. Lin, and Eugene Wong. MULTIBASE – integrating hetero-
geneous distributed database systems. In Proceedings of 1981 National Computer
Conference, 1981.
16. Fabian M. Suchanek, Gjergji Kasneci, and Gerhard Weikum. Yago: A Core of
Semantic Knowledge. In 16th international World Wide Web conference (WWW
2007), New York, NY, USA, 2007. ACM Press.
17. Tim Berners-Lee et al. Tabulator: Exploring and analyzing linked data on the se-
mantic web. In Proceedings of the 3rd International Semantic Web User Interaction
Workshop, 2006. http://swui.semanticweb.org/swui06/papers/Berners-Lee/
Berners-Lee.pdf.
18. Max Völkel, Markus Krötzsch, Denny Vrandecic, Heiko Haller, and Rudi Studer.
Semantic Wikipedia. In Les Carr, David De Roure, Arun Iyengar, Carole A. Goble,
and Michael Dahlin, editors, Proceedings of the 15th international conference on
World Wide Web, WWW 2006, pages 585–594. ACM, 2006.
19. Gio Wiederhold. Intelligent integration of information. In SIGMOD, 1993.