Linked Open Data: The Essentials. A Quick Start Guide For Decision Makers
Imprint
Published by: edition mono/monochrom, Vienna, Austria
ISBN: 978-3-902796-05-9
Print: DGS Druck- u. Graphikservice GmbH
Design & Layout: Susan Härtig (Semantic Web Company)
Production Editor: Thomas Thurner (Semantic Web Company)
Proofreading: Jena Wuu & Vince Reardon
PDF Version: A PDF version of Linked Open Data: The Essentials is available for download here: http://www.semantic-web.at/LOD-TheEssentials.pdf
Copyright: Unless stated otherwise, any material in this book is licensed under a Creative Commons licence BY 3.0 Austria: http://creativecommons.org/licenses/by/3.0/at
Introductory Remarks
by Martin Schöpe (BMU), Martin Hiller (REEEP) and Martin Kaltenböck (SWC)
The actual technology that drives and enables OGD is known as Linked Open Data (LOD). To accelerate knowledge sharing in the field of clean energy, the BMU is sponsoring Linking Open Data to Accelerate Low-Carbon Development, a workshop for decision makers in clean energy organisations that will be held at the Masdar Institute in January 2012, and organised by the Renewable Energy and Energy Efficiency Partnership (REEEP). To accompany this workshop, you have in your hands a useful publication, Linked Open Data: The Essentials, which provides a succinct introduction to the topic for decision makers and project developers. I hope you will find in it the inspiration for developing your own data and information management strategy.
Martin Hiller
Director General, Renewable Energy and Energy Efficiency Partnership (REEEP)
Today, the internet makes the wealth of human knowledge available to anyone, anywhere. From a clean energy perspective, this makes the internet one of the most potent capacity building tools possible. The challenge is how to sort through and effectively use the ever-increasing volume of information available. Linked Open Data (LOD) is a growing movement for organisations to make their existing data available in a machine-readable format. This enables users to create and combine data sets and to make their own interpretations of data available in digestible formats and applications. With the aim of accelerating this movement in the clean energy arena, the Renewable Energy and Energy Efficiency Partnership (REEEP) is sponsoring a workshop entitled Linking Open Data to Accelerate Low-Carbon Development at the Masdar Institute in January 2012, with the kind support of the German Federal Ministry for the Environment, Nature Conservation and Nuclear Safety (BMU). This workshop seeks to assist key energy and development organisations in taking necessary steps to open up their data sets and to maximise the interconnection between knowledge brokers and providers. Linked Open Data: The Essentials was developed to accompany this workshop and to give decision makers a quick overview of the LOD concept and how to accelerate the process in their respective organisations. We trust you will find the publication to be useful reading.
Martin Kaltenböck
Managing Partner & CFO, Semantic Web Company GmbH, Austria
Data management has become a crucial factor for business success and innovation. Efficient handling of Linked (Open) Data and metadata in the fields of public administration and industry is key. With a combination of social software methods and technologies, organisations can benefit and gain a competitive advantage. The publication Linked Open Data: The Essentials gives decision makers a good overview of Open Government, Open Government Data, Open Data and Linked Open Data (LOD). It highlights the potential and benefits of LOD, providing a quick guide with the most important steps for LOD publishing, a consumption strategy for your organisation and three best practice examples of LOD in use.
Table of Contents
Imprint
Introductory Remarks
1. From Open Data to Linked Open Data
2. The Power of Linked Open Data
3. Linked Open Data Start Guide
3.1. Publishing Linked Open Data
4.
4.1.
5. Appendix
5.1. Authors
Mexico, Norway, Philippines, South Africa, United Kingdom, United States) endorsed an Open Government declaration, announced their countries' action plans, and welcomed the commitment of 38 governments to join the partnership. In September 2011, there were 46 national government commitments to Open Government worldwide.

Some of the most important enablers for Open Government are free access to information and the possibility to freely use and re-use this information (e.g. data, content, etc.). After all, without information it is not possible to establish a culture of collaboration and participation among the relevant stakeholders. Therefore, Open Government Data (OGD) is often seen as a crucial aspect of Open Government.

OGD is a worldwide movement to open up government/public administration data, information and content in both human- and machine-readable non-proprietary formats for re-use by civil society, economy, media and academia as well as by politicians and public administrators. This applies only to data and information produced or commissioned by government or government-controlled entities and does not cover data on individuals. Being open means lowering barriers to ensure the widest possible re-use by anyone. With OGD, a new paradigm came into being for publishing government data that invites everyone to look, take and play!

The often-used term Open Data refers to data and information beyond just governmental institutions and includes those from other relevant stakeholder groups such as business/industry, citizens, NPOs and NGOs, science or education. Some of the best-known institutions currently undertaking Open Data activities include the World Bank (4), the United Nations (5), REEEP (6), the New York Times (7), The Guardian (8) and the Open Knowledge Foundation (OKFN) (9).
In 2007, 30 Open Government advocates came together in Sebastopol, California, USA to develop a set of OGD principles (10) that underscored why OGD is essential for democracy. In 2010, the Sunlight Foundation (11) expanded these to 10 principles. Even though these principles are neither set in stone nor legally binding, the global open (government) data community widely regards them as general guidelines on the topic. Government data shall be considered open if the data is made public in a way that complies with the principles below:

1. Data must be complete. All public data is made available. The term data refers to electronically stored information or recordings, including but not limited to documents, databases, transcripts, and audio/visual recordings. Public data is data that is not subject to valid privacy, security or privilege limitations, as governed by other statutes.
2. Data must be primary. Data is published as collected at the source, with the finest possible level of granularity, and not in aggregate or modified forms.
3. Data must be timely. Data is made available as quickly as necessary to preserve the value of the data.
4. Data must be accessible. Data is available to the widest range of users for the widest range of purposes.
5. Data must be machine-processable. Data is structured so that it can be processed in an automated way.
6. Access must be non-discriminatory. Data is available to anyone, with no registration requirement.
7. Data formats must be non-proprietary. Data is available in a format over which no entity has exclusive control.
8. Data must be license-free. Data is not subject to any copyright, patent, trademark or trade secrets regulation. Reasonable privacy, security and privilege restrictions may be allowed as governed by other statutes.

Compliance with these principles must be reviewable through the following means:
- A contact person must be designated to respond to people trying to use the data; or
- A contact person must be designated to respond to complaints about violations of the principles; or
- An administrative or judicial court must have the jurisdiction to review whether the agency has applied these principles appropriately.
The two principles added by the Sunlight Foundation are as follows:

9. Permanence. Permanence refers to the capability of finding information over time.
10. Usage costs. One of the greatest barriers to access to ostensibly publicly available information is the cost imposed on the public for access, even when the cost is de minimis.

It has been acknowledged that the worldwide OGD movement originated in Australia, New Zealand, Europe and North America, but today we also see strong OGD engagement and activity in Asia, South America and Africa. For example, Kenya started Africa's first data portal (12) in July 2011. The European Commission (EC) has also put the issue high up on its agenda and is actively pushing OGD forward in Europe. Neelie Kroes, Vice-President of the European Commission responsible for the Digital Agenda, has stated a strong commitment to OGD by announcing an EC data portal by early 2012 and a Pan-European data portal, acting as a single point of access for all European national data portals, by 2013. Open Data is an important part of both the Digital Agenda for Europe (13) and the European e-government Action Plan 2011-2015 (14). In December 2011 the EC furthermore announced its Open Data Strategy for Europe: Turning Government Data into Gold (15).

The countries currently leading in national Open Data activities and initiatives are the United States of America (16), Australia (17), the Scandinavian countries and the United Kingdom (18). All of these countries have a high political commitment to both Open Data and central Open Data portals, and they all have a strong Open Data community. These innovative countries and the people behind them can be considered the pioneers of OGD.

Two very good resources about the worldwide OGD movement are:
- SWC world map of Open Data initiatives, activities and portals: http://bit.ly/open-data-map
- OKFN comprehensive list of data catalogs curated by experts from around the world: http://datacatalogs.org
For a good example of a national OGD process, please refer to the following UK Open Government Data Timeline by Tim Davies:
Links
(1) The Memorandum on Transparency and Open Government: http://www.whitehouse.gov/the_press_office/TransparencyandOpenGovernment
(2) e-government, Wikipedia: http://en.wikipedia.org/wiki/EGovernment
(3) Open Government Partnership: http://www.opengovpartnership.org
(4) Open Data World Bank: http://data.worldbank.org
(5) Open Data United Nations: http://data.un.org
(6) Open Data REEEP: http://data.reegle.info
(7) Open Data New York Times: http://data.nytimes.com
(8) Open Data The Guardian: http://www.guardian.co.uk/worldgovernment-data
(9) Open Knowledge Foundation: http://okfn.org
(10) 8 Principles of Open Government Data: http://www.opengovdata.org/home/8principles
(11) Sunlight Foundation: 10 principles of Open Government Data: http://sunlightfoundation.com/policy/documents/ten-open-dataprinciples
(12) Kenya Open Data Portal: http://opendata.go.ke
(13) Digital Agenda for Europe: http://ec.europa.eu/information_society/digital-agenda
(14) eGovernment Action Plan Europe 2011-2015: ec.europa.eu/information_society/activities/egovernment/action_plan_2011_2015
(15) Announcement: Open Data Strategy for Europe: http://bit.ly/s5FiQo
(16) Open Data Catalogue United States of America: http://data.gov
(17) Open Data Catalogue of Australia: http://data.gov.au
(18) Open Data Catalogue United Kingdom: http://data.gov.uk
Further Reading
Open Government, Wikipedia: http://en.wikipedia.org/wiki/Open_government
Open Knowledge Foundation, OGD website: http://opengovernmentdata.org
Open Data, Wikipedia: http://en.wikipedia.org/wiki/Open_data
Open Knowledge Foundation Blog: http://blog.okfn.org
2. Open Government: transparency, democracy, participation and collaboration
3. Legal issues
4. Impact on society
5. Innovation and knowledge society
6. Impact on economy and industry
7. Licenses, models for exploitation, terms of use
8. Data relevant aspects
9. Data governance
10. Applications and use cases
11. Technological aspects

When considering how to fully benefit from OGD in concrete cases, it is clear that interoperability and standards are key. This is where LOD principles come into play.
To fully benefit from Open Data, it is crucial to put information and data into a context that creates new knowledge and enables powerful services and applications. As LOD facilitates innovation and knowledge creation from interlinked data, it is an important mechanism for information management and integration. There are two equally important viewpoints to LOD: publishing and consuming. Throughout this guide, we will always address LOD from both the publishing and consumption perspectives.

The path from open (government) data to linked open (government) data was best described by Sir Tim Berners-Lee (1) when he first presented his 5 Stars Model at the Gov 2.0 Expo in Washington DC in 2010. Since then, Berners-Lee's model has been adapted and explained in several ways; the following adaptation of the 5 Stars Model (2) by Michael Hausenblas (3) explains the costs and benefits for both publishers and consumers of LOD.

1 star: Information is available on the Web (any format) under an open license
2 stars: Information is available as structured data (e.g. Excel instead of an image scan of a table)
3 stars: Non-proprietary formats are used (e.g. CSV instead of Excel)
4 stars: URI identification is used so that people can point at individual data
5 stars: Data is linked to other data to provide context
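The five levels can be read as a cumulative checklist: a data set only earns a star if it also meets all lower levels. The following Python sketch illustrates that reading. It is not part of the model itself; the `Dataset` fields and the `star_rating` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published data set."""
    on_web_open_license: bool  # 1 star: on the web under an open license
    structured: bool           # 2 stars: structured data (e.g. a spreadsheet)
    non_proprietary: bool      # 3 stars: open format (e.g. CSV)
    uses_uris: bool            # 4 stars: items are identified by URIs
    linked: bool               # 5 stars: linked to other data


def star_rating(ds: Dataset) -> int:
    """Count stars cumulatively: a level only counts if all lower levels hold."""
    levels = [
        ds.on_web_open_license,
        ds.structured,
        ds.non_proprietary,
        ds.uses_uris,
        ds.linked,
    ]
    stars = 0
    for satisfied in levels:
        if not satisfied:
            break
        stars += 1
    return stars


# A CSV file published under an open license, without URIs or links
csv_file = Dataset(True, True, True, False, False)
print(star_rating(csv_file))
```

A checklist like this can help a publisher see which step comes next, for instance assigning URIs to move a CSV publication from three to four stars.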
What are the costs and benefits of 1-star web data?
As a consumer ...
- You can see it.
- You can print it.
- You can store it locally (on your hard drive or on a USB stick).
- You can enter the data manually into another system.
As a publisher ...
- It is easy to publish.

What are the costs and benefits of 2-star web data?
As a consumer, you can do everything that you could do with 1-star web data, plus:
- You can directly process it with proprietary software to aggregate it, perform calculations, visualise it, etc.
- You can export it into another (structured) format.
As a publisher ...
- It is easy to publish.

What are the costs and benefits of 3-star web data?
As a consumer, you can do everything that you could do with 2-star web data, plus:
- You do not have to pay for a format over which a single entity has exclusive control.
As a publisher ...
- It is easy to publish.

What are the costs and benefits of 4-star web data?
As a consumer, you can do everything that you could do with 3-star web data, plus:
- You can link to it from any other place, either on the web or locally.
- You can bookmark it.
- You can re-use parts of the data.
As a publisher ...
- You will need to invest some time slicing and dicing your data.
- You will need to assign URIs to data items and think about how to represent the data.
- You have fine-granular control over the data items and can optimise their access (e.g. load balancing, caching, etc.).

What are the costs and benefits of 5-star web data?
As a consumer, you can do everything that you could do with 4-star web data, plus:
- You can discover new data of interest while consuming other information.
- You have access to the data schema.
As a publisher ...
- You will need to invest resources to link your data to other data on the web.
- You make your data discoverable.
- You increase the value of your data.
LOD is becoming increasingly important in the fields of state-of-the-art information and data management. It is already being used by many well-known organisations, products and services to create portals, platforms, internet-based services and applications. LOD is domain-independent and penetrates various areas and domains, thus proving its advantage over traditional data management. For example, the project LOD2 (4), Creating Knowledge Out of Interlinked Data, which is funded by the European Commission under the 7th Framework Programme, develops powerful LOD mechanisms and tools based on three real use cases: OGD, linked enterprise data and LOD for media and publishers. For further reading on linked open (government) data, please refer to the Government Linked Data (GLD) W3C working group (5).

The following chapters discuss the benefits of LOD, as well as basic LOD consuming and publishing principles for creating powerful and innovative services for knowledge management, decision making and general data management. The best practice examples reegle.info (6) and OpenEI (7) show how LOD can have a great impact on their respective target groups. Another popular example of applied OGD is legislation.gov.uk.
Links
(1) Sir Tim Berners-Lee (Wikipedia): http://en.wikipedia.org/wiki/Tim_Berners-Lee
(2) 5 Stars Model on Open Government Data by Michael Hausenblas: http://lab.linkeddata.deri.ie/2010/star-scheme-by-example
(3) Michael Hausenblas: http://semanticweb.org/wiki/Michael_Hausenblas
(4) LOD2 Creating Knowledge Out of Interlinked Data: http://www.lod2.eu
(5) GLD W3C Working Group: http://www.w3.org/2011/gld/charter
(6) Clean energy info portal reegle.info: http://www.reegle.info
(7) Open Energy Info (OpenEI): http://en.openei.org
Further reading
Linked Data, Wikipedia: http://en.wikipedia.org/wiki/Linked_data
Linked Data Connect Distributed Data Across the Web: http://linkeddata.org
Linked Data: Evolving the Web into a Global Data Space, Heath and Bizer: http://linkeddatabook.com
Linking Government Data, David Wood (Editor), Springer; 2011 edition (November 12, 2011), ISBN-10: 146141766X, ISBN-13: 978-1461417668
W3C Linking Open Data Community Project: http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
Linked Data?
Nowadays, the idea of linking web pages by using hyperlinks is obvious, but this was a groundbreaking concept 20 years ago. We are in a similar situation today since many organizations do not understand the idea of publishing data on the web, let alone why data on the web should be linked. The evolution of the web can be seen as follows:
Although the idea of Linked Open Data (LOD) has yet to be recognised as mainstream (like the web we all know today), a lot of LOD is already available. The so-called LOD cloud (3) covers more than an estimated 50 billion facts from many different domains like geography, media, biology, chemistry, economy, energy, etc. The data is of varying quality and most of it can also be re-used for commercial purposes.
Please see a current version of the LOD Cloud diagram of 2011 as follows:
In some ways, we are all open to the web, but not all of us know how to deal with this rather new way of thinking. Most often the digital natives and digital immigrants who have learned to work and live with the social web have developed the best strategies to make use of this kind of openness. Whereas the idea of Open Data is built on the concept of a social web, the idea of Linked Data is a descendant of the semantic web.

The basic idea of a semantic web is to provide cost-efficient ways to publish information in distributed environments. To reduce costs when it comes to transferring information among systems, standards play the most crucial role. Either the transmitter or the receiver has to convert or map its data into a structure that the other party can understand. This conversion or mapping must be done on at least three different levels: the syntax, the schemas and the vocabularies used to deliver meaningful information. It becomes even more time-consuming when information is provided by multiple systems. An ideal scenario would be a fully harmonised internet where all of those layers are based on exactly one single standard, but the fact is that we face too many standards or de facto standards today.

How can we overcome this chicken-and-egg problem? There are at least three possible answers:
- Provide valuable, agreed-upon information in a standard, open format.
- Provide mechanisms to link individual schemas and vocabularies in a way so that people can note if their ideas are similar and related, even if they are not exactly the same.
- Bring all this information to an environment which can be used by most, if not all of us. For example: don't make users install proprietary software or lock them into one single social network or web application!
growing trust in the growing semantic web; and maintaining a global cultural graph of information that is both reliable and persistent.
Linked Data in biomedicine (6): establishing a set of principles for ontology/vocabulary development with the goal of creating a suite of orthogonal interoperable reference ontologies in the biomedical domain; tempering the explosive proliferation of data in the biomedical domain; creating a coordinated family of ontologies that are interoperable and logical; and incorporating accurate representations of biological reality.
Linked government data: re-using public sector information (PSI); improving internal administrative processes by integrating data based on Linked Data; and interlinking government and non-government information.
Whereas most of the current momentum can be observed in the government & NGO sectors, more and more media companies are jumping on the bandwagon. Their assumption is that more and more industries will perceive Linked Data as a cost-efficient way to integrate data.
Linking information from different sources is key for further innovation. If data can be placed in a new context, more and more valuable applications and therefore knowledge will be generated.
Links
(1) The World Wide Web Consortium (W3C) is an international community that develops open standards to ensure the long-term growth of the Web: http://www.w3.org
(2) Resource Description Framework (RDF): http://www.w3.org/RDF
(3) Linked Open Data Cloud: http://www.lod-cloud.net
(4) DBpedia: http://dbpedia.org
(5) Jan Hannemann, Jürgen Kett (German National Library): Linked Data for Libraries (2010) http://www.ifla.org/files/hq/papers/ifla76/149-hannemann-en.pdf
(6) OBO Foundry: http://obofoundry.org
resource identifiers (URIs) (1) as names for each of your objects. To ensure sustainability, remember to develop data models for data that change over time.
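As a minimal sketch of what naming objects with URIs can look like in practice, the Python snippet below derives an HTTP URI from an object type and a label. The namespace `http://data.example.org/id/` and the helper `mint_uri` are assumptions made for this illustration, not part of any real portal.

```python
import re
from urllib.parse import quote

# Hypothetical namespace; a real publisher would use a domain it controls
BASE_URI = "http://data.example.org/id/"


def mint_uri(kind: str, label: str) -> str:
    """Build an HTTP URI from an object type and a human-readable label."""
    # Lowercase the label and collapse every run of other characters to a hyphen
    slug = re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")
    return f"{BASE_URI}{quote(kind)}/{quote(slug)}"


print(mint_uri("country", "South Africa"))
# prints http://data.example.org/id/country/south-africa
```

Labels can change over time, so a production scheme would more likely build URIs from stable identifiers (such as codes or database keys) than from names; the label-based slug here only keeps the example readable.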
Specify license(s)
To ensure broad and efficient re-use of your data, evaluate, specify and provide a clear license for your data to avoid its re-use in a legal vacuum. If possible, specify an existing license that people already know. This enables interoperability with other data sets in the field of licensing. For example, Creative Commons (2) licenses are commonly used for OGD.
sets by providing and updating the meta-information about your data sets on the data hub (5). Remember to always provide human-readable descriptions of your data sets to make the data sets self-describing for easy and efficient re-use. For a similar approach, we recommend the Ingredients for high quality Linked (Open) Data by the W3C Linked Data Cookbook (6). The essential steps to publishing your own LOD are:

1. Model and link the data
2. Name things with URIs
3. Re-use vocabularies whenever possible
4. Publish human- and machine-readable descriptions
5. Convert data to RDF
6. Specify an appropriate license
7. Announce the new Linked Data Set(s)

The following life cycle of Linked Open (Government) Data by Bernadette Hyland (7) visualises the path for LOD publishing:
The Four Rules of Linked Data (W3C Design Issues for Linked Data (8)) are also a good place to start understanding LOD principles: The semantic web isn't just about putting data on the web; that is the old web of pages. It is about making links, so that a person or machine can explore the semantically connected web of data. With Linked Data, you can find more related data. Like the web of hypertext, the web of data is constructed with documents on the web. However, unlike the web of hypertext, where links are relationship anchors in hypertext documents written in HTML, LOD functions through links between arbitrary things described by RDF. The URIs identify any kind of object or concept, but regardless of HTML or RDF, the same expectations apply to make the web grow:

1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the established standards (e.g. RDF, SPARQL).
4. Include links to other URIs, so that more things can be discovered.

Furthermore, it is crucial to provide high quality information for developers and data workers about your data. Provide information about data provenance as well as its data collection to guarantee smooth and efficient work with your data. To ensure widest possible re-use, provide a (web) API (9) on top of the published data sets that allows users to query your data and to fetch data and information from your data collection tailored to their needs. A web API enables web developers to easily work with your data.

Here are some best practice examples for publishing LOD:
- UK official National Open (Government) Data Portal, Linked Data Area: http://data.gov.uk/linked-data
- Official UK Legislation: http://www.legislation.gov.uk
- reegle.info LOD portal: http://data.reegle.info
- EU project LATC (LOD around the clock): http://latc-project.eu
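Rules 2 and 3 are often realised through HTTP content negotiation: the same URI is looked up by browsers and by programs, and the `Accept` header tells the server which representation to return. The Python sketch below only prepares such requests and does not send them; the URI is a hypothetical placeholder, and whether a given server honours the `Accept` header is an assumption that has to be checked per data set.

```python
from urllib.request import Request

# Hypothetical URI of a "thing"; replace with a URI from a real data set
THING_URI = "http://data.example.org/id/country/south-africa"


def build_lookup(uri: str, want_rdf: bool) -> Request:
    """Prepare an HTTP GET for a URI, asking for RDF (Turtle) or HTML."""
    accept = "text/turtle" if want_rdf else "text/html"
    return Request(uri, headers={"Accept": accept})


machine_request = build_lookup(THING_URI, want_rdf=True)
human_request = build_lookup(THING_URI, want_rdf=False)

print(machine_request.get_method(), machine_request.full_url)
print(machine_request.get_header("Accept"))
```

Passing one of these objects to `urllib.request.urlopen` would perform the actual lookup against a live server.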
Links
(1) Uniform Resource Identifier, URI on Wikipedia: http://en.wikipedia.org/wiki/Uniform_resource_identifier
(2) Creative Commons: http://creativecommons.org
(3) Resource Description Framework (RDF): http://www.w3.org/RDF/; RDF on Wikipedia: http://en.wikipedia.org/wiki/Resource_Description_Framework
(4) The LOD Cloud: http://richard.cyganiak.de/2007/10/lod
(5) The Data Hub (formerly CKAN): http://thedatahub.org
(6) W3C Linked (Open) Data Cookbook: http://www.w3.org/2011/gld/wiki/Linked_Data_Cookbook
(7) Bernadette Hyland: http://3roundstones.com/about-us/leadership-team/bernadette-hyland
(8) W3C Design Issues for Linked Data: http://www.w3.org/DesignIssues/LinkedData.html
(9) Web API: http://en.wikipedia.org/wiki/Web_API or Web Service: http://en.wikipedia.org/wiki/Web_service
Further Reading
How to publish Linked Data on the Web, Bizer et al: http://www4.wiwiss.fu-berlin.de/bizer/pub/linkeddatatutorial
Linked Data Connect Distributed Data across the Web: http://linkeddata.org
Linked Data: Evolving the Web into a Global Data Space, Heath and Bizer: http://linkeddatabook.com
Designing URI Sets for the UK Public Sector: http://www.cabinetoffice.gov.uk/resource-library/designing-uri-sets-uk-public-sector
Linked Data Patterns, Dodds & Davies: http://patterns.dataincubator.org/book/linked-data-patterns.pdf
Linking Government Data, David Wood (Editor), Springer; 2011 edition (November 12, 2011), ISBN-10: 146141766X, ISBN-13: 978-1461417668
search engine like Sindice (1) or one of the globally available Open Data catalogues such as The Data Hub (2). Also consider data set update cycles and when the data was last updated.
Links
(1) Sindice, the semantic web index: http://sindice.com/
(2) The Data Hub: http://thedatahub.org
(3) Vocabulary / Ontology Alignment on Wikipedia: http://en.wikipedia.org/wiki/Ontology_alignment
Further Reading
Second International Workshop on Consuming Linked Data: http://km.aifb.kit.edu/ws/cold2011
Semantic Web for the Masses, paper by Lisa Goddard: http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/3120/2633
Linked Data: The Future of Knowledge Organization on the Web: http://www.iskouk.org/events/linked_data_sep2010.htm
Linked Data: Evolving the Web into a Global Data Space, Heath & Bizer: http://linkeddatabook.com
A good resource for LOD projects and suppliers (in the domain of eGovernment) is W3C's community directory, which was launched in late 2011: http://dir.w3.org
An RDF triple is an expression that represents a single relationship between objects in a dataset. A triple has three parts: subject, predicate and object.
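To make the three parts concrete, the following Python sketch models a triple as a small record and writes it out as one line of N-Triples, a simple line-based RDF syntax. The `Triple` class and `to_ntriples` helper are hypothetical names for this illustration, and the rule "anything starting with http:// or https:// is a URI, everything else is a plain literal" is a deliberate simplification. The DBpedia URIs are used as illustrative values only.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the thing being described
    predicate: str  # URI of the property or relationship
    obj: str        # URI of another thing, or a plain literal value


def to_ntriples(t: Triple) -> str:
    """Serialise one triple as a line of N-Triples (simplified)."""
    if t.obj.startswith(("http://", "https://")):
        obj = f"<{t.obj}>"  # object is another resource
    else:
        # Object is a literal: escape backslashes and double quotes
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    return f"<{t.subject}> <{t.predicate}> {obj} ."


facts = [
    Triple("http://dbpedia.org/resource/Vienna",
           "http://www.w3.org/2000/01/rdf-schema#label",
           "Vienna"),
    Triple("http://dbpedia.org/resource/Vienna",
           "http://dbpedia.org/ontology/country",
           "http://dbpedia.org/resource/Austria"),
]

for fact in facts:
    print(to_ntriples(fact))
```

The second triple shows the linking aspect: its object is itself a URI, so a consumer can follow it to find further statements about that resource.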
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch, September 2011. http://lod-cloud.net
As you can see in the cloud diagram, some of the data providers have become well-known and established as popular linking hubs in the web of data. Prominent examples are DBpedia (4), a community effort to extract structured information from Wikipedia, and Geonames (5), which provides RDF descriptions of millions of geographical locations worldwide.

The next chapters highlight three successful examples of pioneering websites and applications that already use LOD to enrich their own content and to publish their data sets in RDF as LOD for free re-use by external parties:
- reegle.info: a clean energy information portal focusing on high quality information on renewable energy, energy efficiency and climate-compatible development
- OpenEI: a portal providing various energy data sets in a semantic wiki
- Legislation.gov.uk and Legislate Mobile: a portal with data from the official national government archive of the United Kingdom
All three best practices are part of the LOD cloud diagram shown above, and their data sets are interlinked with other data providers to maximise the benefits of using LOD technology. More examples can be found on the CKAN directory Data Hub (6), a registry of open knowledge data sets and projects.
Links:
(1) Linking Open Data Project: http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
(2) W3C Semantic Web Education and Outreach Group: http://www.w3.org/2001/sw/sweo
(3) The Linked Open Data (LOD) Cloud: http://lod-cloud.net
(4) DBpedia: http://dbpedia.org/About
(5) Geonames: http://www.geonames.org/ontology
(6) The Data Hub: http://thedatahub.org
Further Reading:
Linked Data: The Story So Far, Christian Bizer, Tom Heath, Tim Berners-Lee, International Journal on Semantic Web and Information Systems (IJSWIS) (2009): http://tomheath.com/papers/bizer-heath-berners-lee-ijswis-linked-data.pdf
Linking Government Data, David Wood (Editor), Springer; 2011 edition (November 12, 2011), ISBN-10: 146141766X, ISBN-13: 978-1461417668
The Joy of Data - A Cookbook for Publishing Linked Government Data on the Web, Bernadette Hyland and David Wood: http://www.springerlink.com/content/n30nq362wr678101
This best practice example on how to produce and consume Linked Data is based on the country profiles of the clean energy information gateway reegle.info (1). These comprehensive dossiers focus heavily on energy-related issues, and the entire content is fetched from different Open Data sources in a self-maintaining way that ensures that reegle users can always access the latest high quality information in a visually appealing presentation. Reegle has already established itself as a popular information portal in the fields of renewable energy and energy efficiency. With more than 200,000 users a month (October 2011), its services are widely used. Reegle was launched in 2006 but underwent a complete re-design in 2010, not only in style but also in content, technology and services. Reegle now offers all of its data under W3C standards, i.e. as open and Linked Data in a non-proprietary format (Resource Description Framework, RDF). As a consumer and provider of Open Data, reegle has collected its information from the most highly-rated sources and offers this data in 243 individual country profiles.
Reegle offers comprehensive energy-related information on 243 countries and regions, and enriches the external Open Data it retrieves with REEEP's own policy and regulatory overview. Each country profile displays relevant energy statistics from established sources such as the UN and the World Bank, as well as all reegle stakeholders (actors) active in that country. Project outputs are also offered wherever available and round out this unique information output. Maps, tools and programs from NREL were integrated as well, since NREL's data portal OpenEI.org also uses Linked Data technology, offers data sets in RDF format, and provides a SPARQL endpoint for accessing its data sets.
multiple data sets and provide added value by combining different data sets. Without machine-readable data sets, a huge amount of manpower would be needed to provide this service. Reegle has also established itself in the LOD cloud as a provider of Open Data and is listed in CKAN's directory of Open Data sets. As a provider of an ever-increasing amount of data, reegle has also taken on the challenge of building the SKOS-based Renewable Energy and Climate Compatible Development Thesaurus, with full Linked Data capacities, to structure and retrieve energy and climate data.
Links:
(1) Clean energy info portal reegle: http://www.reegle.info
(2) reegle data portal: http://data.reegle.info
(3) REEEP, the Renewable Energy and Energy Efficiency Partnership: http://www.reeep.org
(4) REN21, the Renewable Energy Policy Network for the 21st Century: http://www.ren21.net
Further reading:
Open Data info: http://blog.reegle.info/blog/tag/open-data
Developer Guide: http://data.reegle.info/developers/guide
Thesaurus Guide: http://data.reegle.info/thesaurus/guide
Wherever possible, the OpenEI Glossary features related terms and definitions collected from other sources. This is made possible by the Linked Data services provided by other organisations such as DBpedia and reegle. OpenEI obtains this information through RDF or SPARQL endpoints in real time, ensuring that the information provided to the user is always current.
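To make the idea of querying a SPARQL endpoint more concrete, the following minimal Python sketch builds the kind of HTTP request a glossary service might send to fetch a definition. It is an illustration only, not OpenEI's actual implementation: the choice of DBpedia's public endpoint, the dbo:abstract property, the example resource URI and both function names are assumptions made for this example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: DBpedia's public SPARQL endpoint is used as the example source.
# The text does not say which endpoints or queries OpenEI actually uses.
ENDPOINT = "http://dbpedia.org/sparql"


def build_definition_query(term_uri: str, lang: str = "en") -> str:
    """Return a SPARQL query asking for the abstract of one resource."""
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?definition WHERE {\n"
        f"  <{term_uri}> dbo:abstract ?definition .\n"
        f'  FILTER (lang(?definition) = "{lang}")\n'
        "}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as an HTTP GET URL using the standard 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


if __name__ == "__main__":
    # Hypothetical glossary term, chosen for illustration.
    query = build_definition_query("http://dbpedia.org/resource/Solar_energy")
    print(build_request_url(ENDPOINT, query))
```

Sending the resulting URL with an HTTP client, together with an Accept header naming the desired result format, would return the matching definitions. Because the request is made each time a page is built, the displayed text follows the source data without manual updates.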
Links:
(1) OpenEI Portal: http://en.openei.org
(2) OpenEI Glossary: http://en.openei.org/wiki/Glossary
Further reading:
LOD on Open Energy Information: http://en.openei.org/lod
FOAF Project: http://www.foaf-project.org
The main objective of legislation.gov.uk is to deliver a high-quality public service for people who need to consult, cite, and use legislation as part of the traditional web of documents. Publishing the UK's Statute Book as data, for people to take, use, and re-use for whatever purpose or application they may need, is a service made possible by the new web of data. Research has also shown that users of the OPSI website and the UK Statute Law Database were often confused about the relevance of the legislation they accessed. Therefore, it is a primary goal to make legislation.gov.uk intuitive and to make it obvious whether a certain piece is current or historical. Legislation can be searched by year; data sets from the last few decades tend to be complete, whereas historical data sets are often partial. A timeline is now available for advanced users to navigate through particular Acts over the centuries, which makes legislation.gov.uk a worthy successor to previous portals. A map view is also available to quickly see which legislation applies to all of the UK or to just England, Scotland, Wales or Northern Ireland.
by statute, it is possible to use these definitions to connect terms with appropriate statutes in a thesaurus or ontology. For example, the ESD Toolkit (4) created a controlled vocabulary of services offered by different authorities and linked them to the corresponding identifier URIs of powers and duties from legislation.gov.uk. A controlled vocabulary, like a thesaurus or an ontology, makes a database far more efficient to search. By drawing different ways of describing a legislative concept, such as synonyms or even misspellings, under a single term, the ESD Toolkit's controlled vocabulary improves the searchability of the data and the whole user experience. In addition, connecting governing legislation and enacting authorities adds concrete value to public information for citizens. Relying on LOD standards also makes it possible for legislation.gov.uk to give accurate information about when a section is repealed, by what piece of legislation, and when that repeal comes into force. Again, this is possible by connecting, or linking, the right pieces of information through persistent URIs. Publishing UK legislation in this way has made the UK statute book an important contribution to both the web of data and the old-fashioned web of documents.
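The way a controlled vocabulary draws synonyms and misspellings under a single term can be sketched in a few lines of Python. This is a simplified illustration, not the ESD Toolkit's actual data model: the concept URI, the labels, the placeholder legislation URI and the function names are all invented for this example.

```python
# Hypothetical vocabulary: one concept with a preferred label, alternative
# labels (synonyms and a common misspelling) and a placeholder link to the
# governing legislation. All URIs and labels are invented for illustration.
VOCABULARY = {
    "http://example.org/service/waste-collection": {
        "preferred": "Waste collection",
        "alternatives": ["refuse collection", "rubbish collection", "waste colection"],
        "legislation": "http://example.org/legislation/placeholder-act",
    },
}


def normalise(text: str) -> str:
    """Lower-case the text and collapse whitespace so labels compare reliably."""
    return " ".join(text.lower().split())


def build_index(vocabulary: dict) -> dict:
    """Map every known label, preferred or alternative, to its concept URI."""
    index = {}
    for uri, entry in vocabulary.items():
        for label in [entry["preferred"], *entry["alternatives"]]:
            index[normalise(label)] = uri
    return index


def lookup(index: dict, term: str):
    """Return the concept URI for a search term, or None if it is unknown."""
    return index.get(normalise(term))


if __name__ == "__main__":
    index = build_index(VOCABULARY)
    uri = lookup(index, "Rubbish Collection")
    print(uri, "->", VOCABULARY[uri]["legislation"])
```

Whichever wording a user types, the search resolves to the same persistent URI, and from there to the linked legislation.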
should appear in the app for browsing within 24 hours of it being released publicly. Sharing noteworthy pieces of legislation on Facebook is also made easy through this app.
Accessibility details
Technologies used on legislation.gov.uk include XHTML + RDFa, CSS and JavaScript. Pages can be converted to PDF, and Braille copies may be obtained. The whole legislation.gov.uk website is based on an open API, which makes the raw data behind it available to other users. To see this underlying data, just append /data.xml or /data.rdf to the URL. All data is freely available under the Open Government Licence (6).
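The URL convention described above can be expressed as a small Python helper. This is a minimal sketch: the function name is hypothetical, and the example path is used purely for illustration and is not taken from this guide.

```python
def data_url(page_url: str, fmt: str = "xml") -> str:
    """Return the raw-data URL for a legislation.gov.uk page.

    Follows the convention stated in the text: append /data.xml or
    /data.rdf to the page URL. Only these two formats are accepted here.
    """
    if fmt not in ("xml", "rdf"):
        raise ValueError(f"unsupported format: {fmt!r}")
    # Strip any trailing slash so the result never contains '//data.'.
    return f"{page_url.rstrip('/')}/data.{fmt}"


if __name__ == "__main__":
    # Illustrative page URL (an assumption, not quoted from the guide).
    page = "http://www.legislation.gov.uk/ukpga/2010/15"
    print(data_url(page))         # XML representation
    print(data_url(page, "rdf"))  # RDF representation
```

A client that already knows a page's URL can therefore reach the machine-readable version without any separate lookup step.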
Links
(1) The Official Home of UK Legislation: http://www.legislation.gov.uk
(2) Office of Public Sector Information: http://en.wikipedia.org/wiki/Office_of_Public_Sector_Information
(3) Statute Law Database: http://en.wikipedia.org/wiki/UK_Statute_Law_Database
(4) Education for Sustainable Development Toolkit: http://www.esdtoolkit.org
(5) Mobile Legislate App: http://mobilelegislate.com
(6) Open Government Licence: http://www.nationalarchives.gov.uk/doc/open-government-licence
Further reading
Linking UK Government Data (John Sheridan): http://www.slideshare.net/semwebcompany/linking-uk-government-data-john-sheridan
5. APPENDIX
5.1. Authors
Martin Kaltenböck, CMC
Author of Chapters 1 & 3
Martin Kaltenböck studied communication, psychology and marketing at the University of Vienna. In 2000, he co-founded punkt.netServices, an Austrian company specialising in information and knowledge management and Enterprise 2.0 solutions. He is Managing Partner and CFO at Semantic Web Company (SWC), where he is responsible for finance and operations. He also leads numerous projects in national and international research, industry and public administration. His regular speaking engagements and publications cover the fields of Enterprise 2.0, social semantic web, Linked Open Data and Open Government Data. He is a certified management consultant, a member of the Executive Board of the Austrian Chapter of the Open Knowledge Foundation (OKFO) and an invited expert of OGD Austria, a governmental cooperation. He is also currently working as an invited expert at W3C.
Publications
ZukunftsWebBuch 2010 - Chances and Risks of the Future Web
Enterprise 2.0 - Introduction, Principles, Use Cases and Tools
Open Government Data (OGD) White Book Austria 2011
Contact
Tel: +43 (0)1 402 12 35-25
E-Mail: [email protected]
Slideshare: http://www.slideshare.net/MartinKaltenboeck
LinkedIn: http://www.linkedin.com/in/martinkaltenboeck
Publications
Environmental Software Systems. Frameworks of eEnvironment - data.reegle.info: A New Key Portal for Open Energy Data
Information Technology Safety-Concepts in Europe
Contact
Tel: +43 (0)1 260 26 37 14
E-Mail: [email protected]
Skype: reeep_f.bauer
LinkedIn: http://www.linkedin.com/in/florianbauer
Publications
Social Semantic Web - Web 2.0, was nun?
Using Linked Data in Thesaurus Management
Contact
Tel: +43 (0)1 402 12 35-27
E-Mail: [email protected]
Slideshare: http://www.slideshare.net/ABLVienna
LinkedIn: http://at.linkedin.com/pub/andreas-blumauer/6/46/b
Sponsoring Partners
Federal Ministry for the Environment, Nature Conservation and Nuclear Safety (BMU): http://www.bmu.de
Renewable Energy & Energy Efficiency Partnership (REEEP): http://www.reeep.org
Semantic Web Company (SWC), Vienna, Austria: http://www.semantic-web.at
LOD2 - Creating Knowledge out of Interlinked Data: http://lod2.eu
Event Partners/Sponsors
This book was created during the preparation of the REEEP- and SWC-organised event, Linking Open Data to Accelerate Low-Carbon Development, held in January 2012 in Abu Dhabi, with the goal of providing an easy-to-understand guide to the first steps in consuming and producing LOD. The event was kindly supported by:
Federal Ministry for the Environment, Nature Conservation and Nuclear Safety (BMU): http://www.bmu.de
International Renewable Energy Agency (IRENA): http://www.irena.org
National Renewable Energy Laboratory (NREL): http://www.nrel.gov
Masdar Institute: http://www.masdar.ac.ae
Acknowledgments
This book would not have been possible without the help of all those who have contributed ideas, discussions, demos and texts. We would like to thank everyone who helped us (names listed in random order): Martin Schöpe, Marianne Osterkorn, Martin Hiller, Anna Florowski, John Sheridan, José Manuel Alonso, Bitange Ndemo, Jon Weers, Thomas Thurner, Denise Recheis, Günther Friesinger, Vince Reardon, Jena Wuu.
This is a quick start guide for decision makers who need to quickly get up to speed with the Linked Open Data (LOD) concept, and who want to make their organization a part of this movement. It gives a quick overview of all key aspects of LOD, and gives practical answers to many pertinent questions including:
What do the terms Open Data, Open Government Data and Linked Open Data actually mean, and what are the differences between them?
What do I need to take into account in developing a LOD strategy for my organization?
What does my organization need to do technically in order to open up and publish its data sets?
How can I make sure the data is accessible and digestible for others?
How can I add value to my own data sets by consuming LOD from other sources?
What can be learned from three case studies of best practices in LOD?
  REEEP's clean energy information portal reegle.info
  NREL's Open Energy Information Portal
  The official home of UK legislation: legislation.gov.uk
What are the potentials offered by this fundamental step-change in the way data is shared and consumed via the web?
edition mono
ISBN: 978-3-902796-05-9