Public Data
Hintikka
Public data
an introduction to opening information resources
Case: Tax tree
Cover image: Peter Tattersall
Peter Tattersall was the winner of the idea category in the 2009 Apps for Democracy Finland contest. Peter's idea is rather simple. Even for professionals, the Budget of Finland is heavy reading. The budgetary materials have been available for some time, and the media has been trying to make them more visible. The tax tree is an idea for an Internet service that would make the revenues and expenditures of an organisation run by the state, a municipality or the public administration even more visible. Revenues are the roots of the tree. They then become part of the stem and finally branch out as expenditures. The attained profits are represented as leaves and fruit. The thickness of each root and branch corresponds to the size of the source of income or item of expenditure it represents. In order to work, the tax tree and other similar applications need open data in a machine-readable format. The Netra service (www.netra.fi) is run by the State Treasury of Finland, and it provides information on the operation, resources and profitability of the state. However, Netra is designed specifically to serve the needs of the Finnish Government. The aim of this handbook is to answer one simple question: what would we need to do to put the tax tree into practice in Finland?
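The proportional thickness of roots and branches can be illustrated with a few lines of code. The following is only a minimal sketch of the idea, not part of the contest entry: the function name, the item names and all amounts are hypothetical placeholders, and a real service would read the figures from machine-readable budget data.

```python
# Minimal sketch of the tax tree idea. All item names and amounts are
# hypothetical placeholders, not real budget figures.

def relative_widths(items, max_width=40.0):
    """Scale each amount to a drawing width proportional to its size."""
    largest = max(items.values())
    return {name: max_width * amount / largest for name, amount in items.items()}

# Revenues feed the roots, expenditures form the branches.
revenues = {"income tax": 120.0, "value added tax": 160.0, "other revenue": 40.0}
expenditures = {"education": 100.0, "health": 150.0, "transport": 50.0}

root_widths = relative_widths(revenues)
branch_widths = relative_widths(expenditures)

# The stem carries the total of all revenues.
trunk = sum(revenues.values())

for name, width in root_widths.items():
    print(f"root   {name:<16} width {width:5.1f}")
for name, width in branch_widths.items():
    print(f"branch {name:<16} width {width:5.1f}")
```

The largest item in each group gets the full drawing width and the others are scaled against it, which is one simple way to keep very different amounts comparable on screen.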
Foreword
Public administration is in possession of data sets, the opening and free use of which are supported by international examples and public debate. The importance of the availability of public data for productivity, competitiveness, and well-being has been emphasised by the Ubiquitous Information Society Board in Finland. It has been suggested that it should be made easier for companies to access information in digital web services. In addition, Finland has been seen as a future forerunner among open data societies. The role of data as basic infrastructure is the central theme of the National Digital Agenda in Finland. Ministry-appointed working groups are trying to find solutions for improving the availability of public data. In March 2011, the Finnish Government accepted an agreement in principle, according to which data sets have to be openly available for everyone to reuse and marked with uniform and clear terms of use. Data transfers are viewed in light of their overall benefits to the national economy, which, as a general rule, means non-chargeability. In addition, the European Commission has stated that the Member States need to take action to improve the utilisation of public data. However, there is still rather little practical know-how. Prior to this guidebook, there was no comprehensive guide available on the topic in Finland. This book aims to offer guidance on how to open data sets in a controlled way. Services and producer communities are emerging around the data sets. Introducing the building blocks of this ecosystem is at the centre of what this book offers. In the Finnish information society, people are beginning to see openness of data as an answer to societal and economic challenges. In addition to new opportunities for business, the openness of public data increases the efficiency of the government and the participation of citizens.
The benefits are not limited to web services; the effects can be seen widely in our surroundings, for example in environmental planning. At its best, open data creates a culture of doing things together, a culture that is enabled by the communal and technological development of the Internet. During this change the legal and administrative practices need to be updated. For now, we are only able to estimate the effects of open data, but we cannot stand still and wait. It has been said that Finland has the possibility to become the leader among open information societies because of our high-quality public data sets and technical know-how. In order to become the leader, we need open-minded trial runs. The three authors of this guidebook are well versed in domestic and international expertise and have therefore impressively fulfilled the goal of this book: to let people know how to use data sets innovatively and productively. The purpose of this book is thus to support the producers and the users of data in creating these practices.
Helsinki, April 20, 2011. Taru Rastas, Ministry of Transport and Communications
Matti Mattila / Flickr.com | Creative Commons
Abstract
Information resources produced and held by public administration have recently been widely discussed both in Finland and in the European Union. At the beginning of 2010, several working groups were formed in Finland and many opinions were heard, all calling for the current legislation and methods to be revised. The current policy in Finland regarding PSI (Public Sector Information) is based on the Act on Criteria for Charges Payable to the State, which was passed in 1992. According to the Act in question, using data produced in the public administration is, more often than not, chargeable. The Act on Criteria for Charges Payable to the State was passed before the Internet was in widespread use. Nowadays, in the Internet era, the costs of offering data are substantially lower than in the 1990s. Free distribution of government-held data would be profitable overall for Finnish businesses and civic activities. In addition, it would help make the government more effective. No accurate calculations are available on the topic yet. However, studies and reports show that currently most of the income from transferring data comes from within the government. In our view, opening the public sector's data for free would be more profitable overall than the current system. This book gives an overall picture of the process of opening the administration's data resources for free and open use by everyone. Opening government data has already been made part of operational policy and strategies in the US and Great Britain. In this handbook, we take a look at opening government data in a wider societal frame of reference. Opening the data begins with evaluating the organisation's own information resources. This might be a long process, depending on the size and nature of the organisation. However, not everything has to be done immediately.
The opening process could proceed phase by phase, starting from the easier data and gradually moving on to more complex data sets. During the data inventory, organisations may come across data they had no knowledge of or did not know how to utilise. During the inventory, organisations can create their own strategies and goals on how to utilise their data. Possible benefits include new ways to use the data, collaboration with new partners, or the development of the organisation's role. This guidebook is a toolbox, giving you the necessary tools to estimate the usability of the data. After the inventory, all data should be converted into a machine-readable format. More and more often, data is used in Internet and mobile applications. These applications offer extra value to the users by allowing them to access certain information without browsing the Internet. Finland has produced some high-quality data resources but, in most cases, the information is published solely in PDF format, which makes it harder to add value to the data. Many laws, Directives and recommendations need to be considered during the process of opening public data. These include, among others, the Freedom of Information Act, data protection legislation, the Act on Criteria for Charges Payable to the State, the Copyright Act, international recommendations, competition legislation and EU Directives. None of the laws and regulations mentioned above prevents opening data. There are, however, parts of the regulations and the legislation that one should be familiar with to ensure a controlled opening process. Opening data can be seen as an interactive process, mainly because the best ways to use data are often developed outside the organisation. In this book, we compare the process of opening data to an ecosystem, where various actors offer data without expecting reciprocity, in a way which benefits all parties involved. Another way of looking at open data is to see it as something infrastructural. Open data can be perceived as part of the infrastructure, as both the enabler and the content, much like streets and electricity. To date, there are no institutions coordinating the opening of data in Finland. In order to ensure coordinated progress that is as effortless for an organisation as possible, a clearing house of open public sector data could be set up in Finland. This clearing house would coordinate practical issues, offer guidance to the government, and solve problems, much like the Consumer Agency. In addition, a data catalogue could be developed in Finland, one that would list all public sector data.
Contents
Foreword
Abstract
Introduction
Open Data as a Source for Interdisciplinary Benefits
The Concept of Data
Data as Part of the Ecosystem
1. Extensive Use of Data as a Goal
1.1 Where to Use Open Data
1.2 User-driven Innovation
1.3 Agile Methods of Opening Data
1.4 Organisation as an Enabler
1.4.1 Open Communication with the Reusers
1.4.2 Innovation Contest as an Incentive for Action
2. Organisation's Views on Openness
2.1 Data Resources, Actors and Roles
2.2 Indicators of Data Usability
2.2.1 Accessibility
2.2.2 Completeness
2.2.3 Equality of Terms of Use
2.2.4 Timely and Original
2.2.5 Legal and Free Reusability
2.2.6 Non-Chargeability
2.2.7 Machine-Readability
2.2.8 Openness of the Format
2.2.9 Good Documentation
2.3 Inventory and Publishing Processes of Data Resources
2.3.1 Reporting the Data
2.3.2 Publishing the Pilot Material
2.3.3 Documentation and the Use Cases
2.3.4 Terminology and the Development of Information Architecture
3. Permission to Republish and Reuse
3.1 Legislation Linked to the Publicity of Data
3.2 Legislation and Regulations Regarding the Chargeability of Data
3.3 Legislation Regarding Access Rights of Data
3.4 Competition Legislation
3.5 EU Directives Regarding Public Sector Data
3.5.1 Financial Utilisation of Public Sector Data and the PSI Directive
3.5.2 INSPIRE Directive and the Legislation on Spatial Data
4. Financial Views on Open Data
4.1 The Changed Operational Environment
4.2 The Pre-Internet Legislation on Payable Charges
4.3 EU's Proposal for the Upper Limit of Charges: Marginal Costs
4.3.1 The Effects of Pricing on the Reuse of Data
4.4 Data as Public Good
4.5 Time Consumption
4.6 The Overall Profits of Non-chargeability
5. Technical Preparations
5.1 Planning
5.2 What is Machine-readability?
5.2.1 Interfaces and Formats According to Data
5.2.2 Machine-readable Licences
5.3 Web Architecture
5.4 Content Formats
5.4.1 Notifications on Updates through Feeds
5.4.2 Real-time Web
5.4.3 Spatial Data in a Reusable Format
5.4.4 Publishing Documents
5.5 Interfaces, Applications and Services
5.6 Linked Data
6. Open Data Infrastructure
6.1 Data Catalogue: All Public Data Available at One Place
6.1.1 Technical Compatibility of Data Catalogues
6.2 Work Behind the Data Catalogue
6.2.1 Unlocking Service of Public Data
References
The government continuously produces large amounts of data. Taking into consideration the quality of Finnish data resources, that data could be put to use even more effectively.
ACB / Flickr.com | Creative Commons
Introduction
The government produces, holds and administers wide information resources with great financial and societal value. At the moment, only those who have access to the information resources are able to use this raw data. According to estimates, only a small portion of available data ends up being reused. The development of the communal and technological characteristics of the Internet opens up new possibilities for creating more open data policies. Currently, allowing the national data resources to be used free of charge and changing the current culture are the most important steps in further developing the Finnish information society. These changes would bring new ways to face the current challenges. Free public data could offer Finnish institutions an opportunity to renew themselves and learn the cooperation skills of the network era. It could also steer the development of Finnish society towards a new culture of cooperation. Recently, the idea of allowing private companies, research institutes and other interested parties to access the public administration's data has been unanimously supported. The matter has been widely discussed, thanks to our national objectives and international examples. Therefore, it is our understanding that conversation on the principles of open data is no longer needed; the pressure is now on improving know-how and creating practices through guidance.
Chapter 6: Open data infrastructure presents the national infrastructure of open data in Finland. The infrastructure includes data catalogues and cross-administrative actors who help other actors open and utilise their data resources.
Large-scale utilisation of data creates new services, research and information, some of which have commercial value. The process also promotes democracy and education and makes people's everyday lives easier, even where no financial benefit is involved. The increase in utilising data has a positive effect on producing data and continuously improves the quality and the usability of data resources. In the ecosystem model, government organisations, citizens and corporations are all both users and producers of data. Nowadays, as the Internet and knowledge work become more and more common, the significance of production outside traditional business models and the monetary economy has become greater. This is manifested in new practices, such as Open Source, Wikipedia, and social media. In this guide, we see the collection, improvement, publication, and reuse of data as an entity and as interaction between different actors, not only as a business or trade. The word ecosystem evokes an image of the well-being of the entity and, on the other hand, of fulfilling one's own needs through the richness and vitality of the ecosystem. Another useful term to help the reader fully understand the field is open data infrastructure. Open data infrastructure includes all organisations and systems operating with open data, in other words, the whole operational environment. This model is suitable for analysing the field of open data at state and municipal levels. For instance, the Spatial Data Infrastructure Act represents well the understanding of data as a base material for infrastructure: "Spatial data infrastructure refers to provided metadata, geodata and spatial data services, web services and web technologies, distribution of data, contracts regarding availability and use, as well as co-ordination and follow-up mechanisms." The role of the government should include producing infrastructure for everyone to use and thereby functioning as an enabler for wider utilisation of data.
With the ecosystem, we wish to highlight not only the technological systems and institutionalised organisations, but also the living, dynamically changing network of interaction. Individual citizens and government organisations are all part of this network. The concept of infrastructure is in the background throughout the entire book and we shall take a closer look at it in chapter 6: Open Data Infrastructure. Chapter 6 deals with open data projects that cross the boundaries of organisations, such as the national data catalogue.
Many functional web services have emerged from the users' needs and perceptions.
dalbera / Flickr.com | Creative Commons
1. Extensive Use of Data as a Goal
Case: Apps for Democracy (Washington, DC)
"The Apps for Democracy contest produced more gains for the government of Washington, DC than any other project."
Vivek Kundra, former Chief Technology Officer of Washington, DC, current Chief Information Officer of the United States of America.
The data catalogue maintained by Washington, DC (http://data.octo.dc.gov) was established as early as 2006, and is perhaps the first extensive public data catalogue. The catalogue contains hundreds of high-quality data sets, e.g. live data feeds on public transport, school ratings and regional demographics. However, few notable applications emerged in the first few years after the catalogue was published. In addition, the catalogue has been used mainly by the administration itself. The Apps for Democracy contest was introduced as an incentive to further the wider use of the catalogue. Organising the contest cost the city 50,000 US dollars, out of which 20,000 were given out as prize money. The competition resulted in 47 functioning services, including mobile, Internet, Facebook and Twitter applications. According to calculations, producing these applications through traditional channels would have cost over 2 million US dollars. A large part of the expenses would have consisted of internal project management and procurement procedures. It was estimated that, using conventional procedures, it would have taken over two years to provide citizens with this many applications. With the contest, it took only a couple of months. Free social media tools were used in organising and promoting the contest. Through social media, the target audience was reached efficiently and quickly. (See chapter 1.4.2: Innovation Contest as an Incentive for Action.)
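The figures quoted for the contest can be put side by side with some simple arithmetic. The sketch below uses only the numbers given above; the variable names are our own, and because the conventional cost is stated as "over 2 million", the resulting ratio should be read as a lower bound rather than an exact value.

```python
# Arithmetic on the figures quoted for the Apps for Democracy contest.
contest_cost_usd = 50_000
prize_money_usd = 20_000
applications = 47
# The text says "over 2 million", so this is treated as a lower bound.
estimated_conventional_cost_usd = 2_000_000

cost_per_app = contest_cost_usd / applications
value_ratio = estimated_conventional_cost_usd / contest_cost_usd
prize_share = prize_money_usd / contest_cost_usd

print(f"Cost per application: about {cost_per_app:,.0f} USD")
print(f"Estimated value per dollar spent: at least {value_ratio:.0f}x")
print(f"Share of the budget paid out as prizes: {prize_share:.0%}")
```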
Education, research and product development
In research organisations, easy access to data supports high-quality research, and in education data can be used to demonstrate certain facts. The GapMinder service, developed by Professor Hans Rosling of the Karolinska Institute in Stockholm, is a fine example of the power of visualisation. An early version of the software was developed when Rosling needed to show students that the 1960s idea of global polarisation based on life expectancy and family size was no longer valid. Research and product development offer greater scope for mining, combining and visualising information than mashups do. In these cases, the aim is either to produce new information or to optimise a certain large data set, and not merely to make life easier or further the transparency of governments. For example, optimisation models of a city's transportation system can be created based on traffic measurements, public transport user statistics and different regional statistics. Nowadays many organisations create these types of optimisations and simulations using their own data resources. Open data would enable the use of additional sources and the data sets of other organisations.
Automation
Data can be used for automation, where the data helps to guide a process or make it easier. For example, filling in and updating address forms in web applications can be made automatic using address and postal code data. In addition, heating and air conditioning systems could benefit from weather data and data on the capacity of power-distribution networks. This data could then help automate the systems in a way that lowers the consumption of electricity and evens out the spikes in consumption. This type of automation is yet to materialise, but having access to more data could speed up progress.
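The address form example can be sketched in a few lines. This is only an illustration under stated assumptions: the lookup table is a tiny hand-made sample standing in for an open postal code data set, and the function and field names are hypothetical.

```python
# Tiny hand-made sample standing in for an open postal code data set.
# A real service would load the full published file instead.
POSTAL_CODES = {
    "00100": "Helsinki",
    "33100": "Tampere",
    "20100": "Turku",
}

def autofill_city(form):
    """Return a copy of the form with the city filled in from the postal code, if known."""
    filled = dict(form)
    code = filled.get("postal_code", "").strip()
    # Only fill the field when the user has left it empty and the code is known.
    if not filled.get("city") and code in POSTAL_CODES:
        filled["city"] = POSTAL_CODES[code]
    return filled

print(autofill_city({"street": "Example Street 1", "postal_code": "33100"}))
```

The function never overwrites a city the user has typed in, so the open data only helps where the form is incomplete.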
Crowdsourcing
Another benefit of open data is the improvement of quality and the collective mustering of resources, as well as reducing the overlap of operations with a joint data resource. Jeff Howe (2006) coined the term crowdsourcing to describe new ways of organising work, made possible by the Internet. The term simply means outsourcing the work to an anonymous crowd on the Internet. Crowdsourcing varies in form. Typically, one either searches for the best possible solution or, alternatively, data is collected, classified, sorted, produced, and developed collectively. The best-known examples of crowdsourcing are Wikipedia and OpenStreetMap but, for some reason, it is often forgotten that citizens and companies could produce data for the public administration to use. An example of crowdsourcing utilised by the government in Finland is the pilot service Fillarikanava. This service is for cyclists, who can mark their cycling-related observations on a map. These observations are then saved as geodata, which is taken into consideration when planning improvement projects in the area. A slightly different application is the Times Educational Supplement (TES), created in Great Britain. TES is a platform for sharing and jointly producing teaching materials. It has been estimated that TES will save 1 billion GBP in teachers' working hours in two years (UK 2009) after the teachers start sharing their materials with each other. This way, not all teachers have to produce their own teaching material.
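Saving map observations as geodata, as in the Fillarikanava example above, could look roughly like the following. This is a sketch under assumptions: we do not know which format the pilot service actually uses, so GeoJSON is chosen here only because it is a widely used open format; the coordinates, notes and function name are invented for illustration.

```python
import json

def observation_to_feature(lon, lat, note):
    """Wrap one observation as a GeoJSON Feature (coordinate order is longitude, latitude)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"note": note},
    }

# Invented example observations: (longitude, latitude, free-text note).
observations = [
    (24.94, 60.17, "pothole on the cycle lane"),
    (24.95, 60.18, "missing signpost"),
]

collection = {
    "type": "FeatureCollection",
    "features": [observation_to_feature(*o) for o in observations],
}

geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

Once the observations are stored in a structured form like this, planners can load them into standard mapping tools alongside their own data.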
Picture 1.1: OpenStreetMap was produced collectively by volunteers. It is more accurate than Google Maps in, for instance, naming the parks and pavements in the city centre of Tampere. In Finland, the community has voluntarily drawn up the map and taken measurements with GPS devices. The OpenStreetMap community accepts geodata donations.
Information provider; Enabler; Platform provider; Facilitator / co-ordinator; Online consultancy provider (Hintikka 2009)
In the Introduction, we presented the concept of open data as an ecosystem, where several actors have different roles and the data is seen as a common, renewable raw material. In the open data ecosystem, the government acts as both the enabler and the provider of information. Once data is offered to be used openly and free of charge, the development and specialisation of roles becomes possible.
One might think of open data as building the infrastructure of an information society. It enables the development of advanced services and social innovations. In 2008, the board of the Finnish National Broadcasting Company YLE added the enabler strategy to their general company strategy. The essential message of the enabler strategy is that an organisation no longer solely executes its own operations, but also offers others the opportunity to act. The term "government as a platform" is used by Tim O'Reilly to describe the changing paradigm of government in the world of Internet technology and open data. The idea is that governments should focus on building infrastructure and thereby enable the development of a sustainable ecosystem of private actors. This is not a new way to operate within a government. For example, creating road networks has been seen as something naturally organised by the government (instead of the government organising the shipping of goods and/or people). Creating road networks has enabled a variety of private activities related to roads and transport. In the context of open data, this means that information might be better mediated by reducing the government's role in delivering the information. Currently, many government organisations see providing information through their own websites as more important than the technical infrastructure of open data. This infrastructure would allow others to use the data. Following Tim O'Reilly's train of thought, we talk about the government as an enabler, which is slightly more than a mere platform. Being an enabler is active, whereas platform creates an image of something passive. Regarding open data, it is our understanding that it is well-grounded for the government to offer data in a reusable format and function as a platform but, in addition, actively urge companies, citizens and other parties to utilise the data
regardless of programming skills, whether it is about the use of open data or how to fix concrete problems. However, the programmers have an important role: transforming data into information requires creating user interfaces, programmatic manipulation, visualisation, combining, or other illustration. The preparations for opening data can be made, in part, publicly on the Internet. Reaching out through social media services is a natural way of getting in touch with the programming communities and other interested parties. Among others, the Finnish Innovation Fund Sitra has coordinated different community panels on the Internet. Expert-centred events that used to be closed have now been put on the Internet for anyone to see. Social media has been discovered by other organisations, as well. According to several studies, the government in Finland has been more active than the business sector in using the possibilities provided by the Internet. Internet presence allows new contacts that previously did not exist. For example, the Finnish police have operated on Facebook and other social media services popular among the youth.
a3sthetix / Flickr.com | Creative Commons
The goal is to gain experience in open data distribution and ecosystem participation faster than would be possible through any renewal cycle of an IT system.
2. Organisation's Views on Openness
Case: Data is everywhere: the tree register
Public data is everywhere, if you know where to look. A practical example comes from New York: the Trees Near You service, created by Brett Camper, won the Best Application Honorable Mention in the New York BigApps competition in 2010. Trees Near You is a free iPhone application, giving information on more than 500,000 living trees in New York. This application combines the GPS coordinates from the phone, Wikipedia articles on tree species and street tree census data, offered publicly by the city. This application is a great example of all the available data one might never come to think of.
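The core of such an application, combining a phone's position with a public tree register, can be sketched as follows. This is not the actual implementation of Trees Near You: the tree records, coordinates and function names are invented for illustration, and the distance is computed with the standard haversine formula.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    r = 6_371_000  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Invented sample rows standing in for a public street tree census.
TREES = [
    {"species": "London plane", "lat": 40.7130, "lon": -74.0060},
    {"species": "Honey locust", "lat": 40.7140, "lon": -74.0050},
    {"species": "Pin oak", "lat": 40.7300, "lon": -73.9900},
]

def trees_near(lat, lon, radius_m=200):
    """Return the species of trees within radius_m of the position, nearest first."""
    hits = [(haversine_m(lat, lon, t["lat"], t["lon"]), t["species"]) for t in TREES]
    return [species for dist, species in sorted(hits) if dist <= radius_m]

print(trees_near(40.7131, -74.0059))
```

A species name returned this way could then be used to look up the matching encyclopedia article, which is the kind of combination of sources the case describes.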
Government data can be structured based on producer organisations, the content of the data sets or the assumed use. So far, spatial data (MMM 2005) and data produced in organisations within the Central Government (Kuronen 1998) have been the focus of analysis. In this guidebook, we apply the analysis to all government-produced information that is legally public. It is essential for organisations to identify their data resources and give out information regarding those resources. In Finland, there is an ongoing project for developing the government-level information architecture (VM 2009). In connection with this project, an introductory chart was created to demonstrate the overall field of Finnish data resources. In this guidebook, we look at the government as an important data producer. However, from the ecosystem's perspective, it does not matter who produces or uses the information. Currently, it is possible that private companies produce nearly as much data as government organisations. Individual citizens are increasingly taking part in producing data and developing it into information and knowledge. Within the open data ecosystem, government organisations, citizens and companies are all producers as well as users of information and, in many cases, both at the same time. The ecosystem sees organisations and private people as actors, interacting with each other to mould the conventions of the ecosystem. Many government processes, such as the preparation of legislative proposals, involve communication and exchanging information between different organisations. Law-drafting in Finland, for example, may include using information from Statistics Finland or assessing the budget effects in cooperation with the Ministry of Finance. It is a safe assumption that all government organisations a) produce new data, b) process, c) handle and d) utilise data produced by some other party.
Even though the field is wide and the producing organisations cannot be clearly identified, the actors' roles in relation to data can be pointed out quite accurately (Table 2.1).
Table 2.1
collecting and saving raw material
managing and processing raw material
combining and editing data from different sources
standardising and homogenising data from different sources (same terminology means the same thing)
Updater: updating information
Publisher: publishing data
Register keeper: administration of data resources
Application developer as the final user of data: utilising the data as part of the service
Interpreter as the final user of data: interpreting the data, e.g. researchers, companies or democracy activists
User of data-based services: an individual, company, or organisation using open data applications and interpretations
An organisation's data sets will be identified later in this chapter, when we discuss the inventory of an organisation's data resources. The above-mentioned separation of roles is meant as a tool to help with the inventory.
The usability of data can be estimated using the following criteria: Accessibility (2.2.1), Completeness (2.2.2), Equality of Terms of Use (2.2.3), Timely and Original (2.2.4), Legal and Free Reusability (2.2.5), Non-Chargeability (2.2.6), Machine-Readability (2.2.7), Openness of the Format (2.2.8) and Good Documentation (2.2.9). Aiming for complete usability and utilisation according to all criteria is not cost-effective with all data sets. Often, the reusability of resources can be significantly improved by decisions that further some of the above-mentioned criteria (e.g. using more permissive licences or offering the data free of charge).
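During an inventory, the nine criteria can be used as a simple checklist for each data set. The following sketch is only one possible way to record such an assessment; the function name and the example assessment are hypothetical, and the criteria are treated here as plain yes/no values although in practice they are matters of degree.

```python
# The nine usability criteria from this chapter, used as a checklist.
CRITERIA = [
    "accessibility",
    "completeness",
    "equality of terms of use",
    "timely and original",
    "legal and free reusability",
    "non-chargeability",
    "machine-readability",
    "openness of the format",
    "good documentation",
]

def unmet_criteria(assessment):
    """List the criteria not marked as met; missing entries count as unmet."""
    return [c for c in CRITERIA if not assessment.get(c, False)]

# Hypothetical assessment of a report published only as a PDF file.
pdf_report = {
    "accessibility": True,
    "completeness": True,
    "non-chargeability": True,
    "machine-readability": False,
}

for criterion in unmet_criteria(pdf_report):
    print("To improve:", criterion)
```

A list of unmet criteria like this makes it easier to pick the few decisions that improve reusability the most, rather than aiming for every criterion at once.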
2.2.1 Accessibility
Easy to use: The existence and location of the data are well known. The information and the licence terms allowing reuse are easily found on the Internet by both people and search engines.
Difficult to use: A data set only exists in an operational system of an organisation and no one outside the organisation has knowledge of it.
The Google Maps interface and the contents of Wikipedia are fine examples of data whose existence and usability are widely known. The availability of data resources can be improved by adding them to a well-kept data catalogue, optimising the metadata of the data sets for search engines, and publishing the data according to the linked data paradigm. Letting potential reusers know about the catalogue on the Internet, in publications, and at various events can further the general visibility of the data resource (see chapter 6: Open Data Infrastructure).
2.2.2 Completeness
Easy to use: Data, in its entirety, is free for downloading on the Internet. The accessibility and the potential use are not indirectly restricted by allowing access to only a certain part of the data set at a time.
Difficult to use: Only part of the entire data set is freely available and a separate contract is required to access the complete data set.
Typically, access to the complete information resource is restricted, intentionally or unintentionally, by offering the data only through a query interface and making it impossible to download the entire data set. If the data resource is available in its entirety, anyone can technically copy it and start redistributing it to others. Restricting access to the complete data set may therefore be a way to prevent copying. On the other hand, the restrictions prevent any use based on an extensive analysis and burden the query interface, which could be avoided by offering a copy of the data set (see chapter 5: Technical Preparations).
2.2.3 Equality of Terms of Use
Difficult to use: Access to the data resource is restricted based on the user or the purpose of use. For example, data can be offered solely for research and product development, or for non-commercial purposes.
2. Organisations Views on Openness
In practice, equality is achieved when everyone has access to the data set and no registration is required. In that case, anyone can use the data set under standard licence terms. The licence does not rule out any particular field of use. In particular, commercial use is allowed because there are high hopes that commercial actors will participate in the ecosystem. Equality means letting go of anticipatory control. One is allowed to use data unskilfully and for political purposes (see chapter 3: Permission to Publish and Reuse).
2.2.4 Timely and Original
In addition to raw data, combined and processed versions of the data set can also be made available. In some cases, data sets that might otherwise violate privacy can be published by generalising the data and lowering its level of accuracy. However, one has to be extremely careful with generalisations and with anonymising the data (see chapter 5: Technical Preparations).
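The idea of generalising data and lowering its accuracy can be made concrete with a small sketch. The records, field names and thresholds below are invented for illustration and are not taken from any real register. The sketch coarsens ages into ten-year bands, truncates postal codes and rounds coordinates, and then counts how many records share each combination of coarsened values.

```python
from collections import Counter

# Hypothetical records; every value here is made up for illustration.
records = [
    {"age": 34, "postcode": "00530", "lat": 60.18712, "lon": 24.95933},
    {"age": 37, "postcode": "00510", "lat": 60.18941, "lon": 24.95011},
    {"age": 62, "postcode": "33100", "lat": 61.49811, "lon": 23.76031},
    {"age": 68, "postcode": "33180", "lat": 61.50402, "lon": 23.74377},
    {"age": 35, "postcode": "00560", "lat": 60.21422, "lon": 24.97519},
]


def generalise(record):
    """Lower the accuracy of one record so it is harder to link to a person."""
    decade = record["age"] // 10 * 10
    return {
        "age_band": f"{decade}-{decade + 9}",
        "postcode_area": record["postcode"][:2] + "xxx",
        # One decimal of a degree covers several kilometres, not an address.
        "lat": round(record["lat"], 1),
        "lon": round(record["lon"], 1),
    }


def smallest_group(rows, keys):
    """Size of the smallest group of rows that share the same values for keys."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return min(counts.values())


generalised = [generalise(r) for r in records]
k = smallest_group(generalised, ["age_band", "postcode_area"])
print(k)
```

A larger smallest group means that fewer records stand out on their own. It is only a rough indicator, not proof of anonymity, which is why the caution expressed above still applies.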
2.2.5 Legal and Free Reusability
Permissive licences include, for example, Creative Commons and Open Database licences. The copyrights of open public data should be waived using, for example, the Creative Commons Zero licence in order to avoid any ambiguities later in the processing chain. Regarding the terms of use, the most common wish expressed during the interviews was to find out who is using the data and for what purpose. Often, there were no reasons for restrictions, but the creators wanted to know so that they could further develop their operations. The use of the data can be monitored without separate contracts or restrictive terms of use by, for example, user registration on the Internet (see chapter 3: Permission to Publish and Reuse).
2.2.6 Non-Chargeability
Easy to use: The data is free of charge.
Difficult to use: The data is offered for a charge and the profits from the sales are used for covering other expenses in the producing organisation.
Any data that is offered for a fee no larger than the marginal costs can be considered open. Often, the costs of producing and maintaining a data set are many times higher than the charged marginal costs. Even a small fee may limit the use of the information because of the contracts it requires. In fact, it is possible that most of the marginal costs consist of the bureaucracy related to billing. If that is the case, then there are no grounds for charging the marginal costs. If, for any particular reason, charges are collected, it should be possible to make the payment on the Internet and receive the data set immediately without burdening the authorities (see chapter 4: Economic Viewpoints).
2.2.7 Machine-Readability
Easy to use: The data resources have a permanent location on the Internet, allowing automated and programmatic access. The data is structured to enable automatic processing. The terms of use are machine-readable, they can be accepted on the Internet, and the data is received immediately without burdening the authorities.
Difficult to use: The data is published in a non-structured format, making it readable only for people (e.g. PDF documents and HTML pages).
As a ground rule for machine-readability, a capable programmer can, in a relatively short period of time, create a programme that automatically retrieves the data from the Internet, processes it in the machine's memory, and presents it in a new format, for example, on the screen of an iPhone. If a data set is not machine-readable at any stage, it is rather difficult to convert it into such a format. Despite that, many organisations offer data in a non-machine-readable format on their webpages. However, the same information can often be found elsewhere in a machine-readable format, which makes it a lot easier to publish the data (see chapter 5: Technical Preparations).
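The ground rule above can be illustrated with a minimal sketch of such a programme. The CSV content, column names and figures are invented for illustration. In practice the text would be retrieved from the data set's permanent address, for example with `urllib.request.urlopen`; here an in-memory string stands in for the download so that the example runs on its own.

```python
import csv
import io
import json

# Stand-in for a published file. The columns and figures are invented;
# a real programme would fetch this text from the data set's address.
PUBLISHED_CSV = """item,type,million_eur
Income tax,revenue,120
Value added tax,revenue,80
Education,expenditure,90
Health care,expenditure,110
"""


def parse(text):
    """Turn CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def totals_by_type(rows):
    """Sum the amounts for each type of budget item."""
    totals = {}
    for row in rows:
        totals[row["type"]] = totals.get(row["type"], 0) + int(row["million_eur"])
    return totals


rows = parse(PUBLISHED_CSV)
summary = totals_by_type(rows)

# Present the result in a new format, here JSON for another application.
print(json.dumps(summary, indent=2))
```

The same three steps (retrieve, process, present) would be far harder if the figures were only available as a PDF document, because the structure would first have to be reconstructed by hand.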
2.2.8 Openness of the Format
This can be achieved by offering the data in an open format. The definition of an open format is freely and publicly available, and its use is not restricted financially or otherwise. If possible, it might be worthwhile to offer the same data in several formats. Using open formats is not always realistic. For example, some spatial data systems use a producer-specific format, which means that switching to open formats would only be possible after a system renewal (see chapter 5: Technical Preparations).
2.2.9 Good Documentation
The reusability of data can be significantly improved with metadata, documentation, user examples, and quality definitions. The only downside of good documentation is the work it takes. Regulation may notably slow down the publication of data sets. On the other hand, documentation can initially be done lightly and improved later. For example, including column headings in a tab-separated file is sufficient (see chapter 5: Technical Preparations).
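The point about column headings can be shown with a short sketch. The file contents below are hypothetical; the sketch reads the same tab-separated row once with a heading row and once without it.

```python
import csv
import io

# Two versions of the same hypothetical tab-separated file.
WITH_HEADINGS = "station\tdate\ttemperature_c\nKaisaniemi\t2011-03-01\t-4.5\n"
WITHOUT_HEADINGS = "Kaisaniemi\t2011-03-01\t-4.5\n"

# With a heading row, each value arrives labelled.
labelled = list(csv.DictReader(io.StringIO(WITH_HEADINGS), delimiter="\t"))

# Without it, the re-user gets bare positions and has to guess their meaning.
unlabelled = list(csv.reader(io.StringIO(WITHOUT_HEADINGS), delimiter="\t"))

print(labelled[0]["temperature_c"])
print(unlabelled[0][2])
```

Both calls return the same value, but only the first tells the re-user what the value means and in which unit it is given.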
knowledge of their data resources and new data is published for the ecosystem to use. In addition, the organisation gathers more and more information on opening data during each phase. The phases of the process are a) analysing the data set, b) publishing the information and c) learning from opening the data.
Phase 1: Reporting the data
The announcement regarding information held by an organisation and the opening of data are done as quickly and lightly as possible. In this phase, the following questions need answering:
1. What data sets are we in possession of?
2. Which of them are public?
3. How open are different data sets?
Phase 2: Publishing pilot material
Once your own data resources have been identified, it is not difficult to pick the ones that are the easiest to open. The following iteration round includes pilot materials that are technically, legally and economically the easiest to open. In this phase, the following questions need answering:
4. Which data sets are easy to open?
5. Which terms of use should be applied?
6. Who, within the organisation, is in charge of the contents of data sets and technical systems?
Phase 3: Documentation and use cases
The amount of information being published can be increased phase by phase. However, one should also improve and clarify documentation for the re-users and collect information on use cases within the organisation. In this phase, the following questions need answering:
7. What is the contextual description of the data and how was it documented technically?
8. What kind of needs for information, user groups, and use cases are related to current use of different data sets?
9. Have any requests been made in or outside the organisation regarding the availability or usability of data?
Phase 4: Information architecture and terminology
The first pilots give the organisation enough experience to create a strategy for furthering the openness of data resources. This strategy would function as a guideline on how to systematically open registers and interfaces. After that, it is possible to create cross-organisational use of data resources. In this phase, the following questions need answering:
10. What are the needs and possibilities for cross-organisational standardisation of data?
11. What type of system renewal is the improvement of the openness of data resources related to?
12. How could an organisation's data resources be better organised together with other actors?
As was mentioned previously, this is an indicative process and its execution may vary from organisation to organisation. However, the starting point is quite clear. Opening data always starts with identifying one's own data resources and evaluating their current state. After the evaluation, it is possible to participate in the ecosystem of open data, first experimentally and later by including open data as a permanent part of working methods. One can start small, and the experiences gathered along the way form a great basis for an organisation's own open data strategy.
Picture 2.1 Opening data is a gradual process, one that progresses in interaction with its users.
A) Analysing the data
Phase 1: A list of all data in possession of the organisation
Phase 2: Analysis of the data (technology, legislation, responsibilities)
Phase 3: Interviews and use cases regarding the use of information resources
Phase 4: Development plan for information architecture
B) Publishing the data
Phase 1: Publishing the Information Asset Registry of the organisation
Phase 2: Publishing the pilot material in its raw form
Phase 3: Defining the interface and publishing documentation
Phase 4: Creating a search service or a data portal
C) Learning from opening data
Phase 1: Following the statistics on data use and downloads
Phase 2: First applications, the word spreads
Phase 3: Meeting with the users, for example in a user workshop
Phase 4: Active use and development of the organisation's data resources in the ecosystem
Name of the data set
Short description of the content
How open is the data, is it available for users and if so, how?
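The three items above can be kept as a simple machine-readable list. The following is a minimal sketch, assuming a record type and field names (`DataSetEntry`, `name`, `description`, `openness`) that are chosen here for illustration only; the example entries are invented.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class DataSetEntry:
    """One row of an organisation's list of data sets (hypothetical layout)."""
    name: str
    description: str
    openness: str  # how open the data is and how users can get it


# Invented example entries.
inventory = [
    DataSetEntry("Road network", "Geometry and attributes of public roads",
                 "Open; downloadable as a single file"),
    DataSetEntry("Customer register", "Contact details of service users",
                 "Not public; contains personal data"),
]


def to_csv(entries):
    """Serialise the list as CSV text with a heading row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(DataSetEntry)],
        lineterminator="\n",
    )
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


listing = to_csv(inventory)
print(listing)
```

A list in this form can be posted on the web as it is, and the heading row doubles as light documentation for re-users.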
Once the list of public data resources has been posted on the net, it is time to send out messages to potential re-users, create contacts and gather opinions. An organisation can achieve this by, for example, contacting the developer community through social media or at big events. Communication with the re-users should encourage further use of the data. If the decision on opening data has not been made and the organisation is only looking for ideas, it should let the re-users know that. It is possible that the users will ignore the organisation at first. In the best-case scenario, the mere publishing of a list may provoke discussion between the organisation and the users of the data. Finally, the organisation should gather and evaluate the feedback and experiences regarding opening the data. What was good? What was not? This prepares the ground for the next iteration phase.
find targets for development. Organisations can ask the users what kind of needs they have, what problems, if any, they have encountered and what kind of needs they think other users may have. The goal is to find out whether the information sources are readily available and whether the format of the data is acceptable. The format needs to be changed if it causes problems or technical obstacles to utilisation.
Sometimes documents and files that seem to have no creators, or that are filed inappropriately, can be found in the information architecture. These types of documents are usually a sign of a) a non-functioning policy that has been overlooked (e.g. a difficult practice), b) a procedure that has not been instructed well enough (lack of instructions) or c) a new need that was not anticipated (a new unit or a project team).
Many challenges regarding information architecture revolve around the same issue: information is produced inconsistently and in great quantities. The original tools were, in all likelihood, not designed to handle the current masses of information. The standardisation of data is a key issue, both within organisations and in co-operation with other organisations. In the worst-case scenario, no one knows what information is stored in which system. Every actor looks at things from their own perspective, data is scattered across different systems, and the information is gathered and updated in several places.
Describing data sets and defining methods are meant to solve and prevent compatibility problems, as well as to create an overall view of the data sets in different organisations. The goal is to standardise information architecture and find common ways of describing information both within and between organisations. Compiling an information architecture requires time, hard work and co-operation between the parties. The wider the intended view, the harder the work.
An example of an information architecture project in Finland is the KuntaGML project. The project attempts to create standardised interfaces for spatial data, such as basic zoning information. Despite the hard work, a good architecture is rewarding: systems are easier to update, working hours are saved, the data resources are of better quality and, first and foremost, data can be reused even more widely.
Amy Palko / flickr.com | Creative Commons
Unclear or missing Terms of Use limit the reuse of the data much more than the possessor had intended.
3. Permission to Republish and Reuse
Freedom of Information legislation refers to the Act on the Openness of Government Activities (621/1999) and the Decree on the Openness of Government Activities and on Good Practice in Information Management (1030/1999).
Data protection legislation refers primarily to the Personal Data Act (523/1999) but also to the Act on the Protection of Privacy in Working Life (759/2004) and section 24 of the Criminal Code of Finland. Recent topics for discussion in Finland have included the Act on the Protection of Privacy in Electronic Communications (516/2004) and especially its reform from 2008, known in Finland as Lex Nokia.
Act on Criteria for Charges Payable to the State (150/1992)
OECD Recommendation of the Council concerning Access to Research Data from Public Funding (OECD 2006)
OECD Recommendation of the Council for Enhanced Access and More Effective Use of Public Sector Information (OECD 2008)
Copyright Act (404/1961)
The basis for competition legislation is the primary nature of financial competition compared to market regulation. The currently valid law is the Act on Competition Restrictions (480/1992).
EU Directives
The INSPIRE Directive aims at enhancing the use of spatial data, furthering the cooperation between authorities and creating diverse services for citizens.
The PSI (Public Sector Information) Directive (2003/98/EC) on the re-use of public sector information aims to increase the commercial utilisation of data.
government-held documents are public, unless their publicity has specifically been limited by legislation. Giving out a public document is free of charge in many cases, e.g. when the document is electronically stored and sent to the receiver via e-mail, or when handing out the document is part of the authority's obligation to consult and its duty to provide advice and information. If a fee is collected, the costs of finding the document and omitting the classified information are not to be included in the fee.
Data protection is not about protecting information; it is about the citizens' right to live without the fear of information regarding their private lives becoming public. The legislation protects privacy (such as information about an individual's financial status, health or political views) as well as other rights and freedoms, such as freedom of movement and freedom of assembly. For example, when the travel card system was first introduced in Finland, stampings on different bus lines were saved on the card. It was even possible to receive a print-out later to see where the holder of the travel card had been. This provoked intense discussions about confidentiality and the freedom of movement. Later, it was decided that the stampings would not stay on record.
The cornerstone of data protection legislation is the definition of personal data, and the aim is to protect personal data from unjust use that is potentially harmful to an individual. Defining personal data is undeniably challenging. The European Union has defined the term in a 26-page document (EU 2007). Personal data is easier to understand if one thinks of anonymous data as its opposite. Anonymous data is information that cannot be connected to an individual. When opening data, it is easiest to start the publishing with anonymous data. The Personal Data Act contains specific regulations on registering and using personal data.
Any data that enables the recognition of an individual is not to be made public without the permission of the individual in question. This also applies to data enabling indirect recognition. In some cases, personal data may be given out to trusted parties after signing a separate contract. Separate contracts can be drawn up to use personal data for research, for example. When separate contracts are drawn up, it is vital to make them as transparent as possible, so that anyone can explore the reasons for such a contract.
The legislation regarding the publicity of data is quite straightforward and easy to construe. But, in some cases, it might be in order to limit the legal publicity of data sets, if, for instance, the accessibility of the information becomes drastically easier through Internet distribution. Currently, it is possible to obtain quite a lot of personal data, but it takes time and effort. If all personal data was on the Internet, merely a press of a button away, it might endanger the right to privacy. For example, the National Land Survey of Finland was in a position where it had information about properties, their owners, lots and forests in the same service. The interpretation was that the name of the property's owner was not to be revealed in the same service, even though the owner's name was public information. The reason for this was that the service enabled estimating the financial value of the owner's forest property. The value of one's property falls under the protection of privacy. These types of situations should be dealt with in advance, and that is why the authorities in charge should be included in the process of planning new ways to utilise data.
Data protection is part of information security, which generally refers to protecting systems, data, services and data communications, also in areas where people's basic rights are not under threat.
The Data Protection Ombudsman in Finland has suggested that the current, scattered data protection obligations should be gathered under one general information security law (YLE 2007). The need for a general and technology-neutral information security law becomes increasingly greater as cloud services (services provided via the Internet) keep growing. When companies and private citizens start commonly storing their own files on the Internet and begin using cloud services, they prefer services provided by companies operating in countries that people consider trustworthy. If the Data Protection Ombudsman's suggestion is supported, Finland could help implement such legislation at the EU level. However, information security is not examined any further in this guidebook.
information. Conversely, even though the data is protected by copyright, it does not mean that one has to control or license its use. The author can waive all rights by simply stating so in connection with distribution. For example, the Creative Commons Zero licence allows the data to be distributed to everyone through the Internet. In both cases, the producer of the data set holds all the keys to encourage the reuse of the data. It is important to be familiar with the current state of government-produced data in relation to copyright, because copyrights arise automatically and may limit the use of the data far more than the producer would have hoped. Different open licences, such as Creative Commons, Science Commons, and the Open Database Licence, set the lowest restrictions on the use of the data.
There are diverse contract practices between different actors in the public administration regarding the redistribution of data sets. Sentences like "distribution of data is allowed, but not in large quantities" are not uncommon. Contracts are often out-dated, and the Internet may not be considered as a distribution channel for digital information. Instead, the data has been conveyed, by virtue of contract, to be used, for example, for publication purposes. Many ambiguities could be avoided if the stance on intellectual property rights was clearly defined in both municipal and governmental organisations.
3.5.1 Financial Utilisation of Public Sector Data and the PSI Directive
Compared to the INSPIRE spatial data Directive, the PSI Directive is fairly little known. According to the Commission's estimate, the Directive has significantly contributed to the formation of an internal European market throughout the entire European Union. However, the implementation of the Directive should still be carefully monitored (EC Commission 2009). If the goals set in the 2003 PSI Directive were properly promoted in Finland, this guidebook would no longer be needed. Promoting the financial utilisation of data resources would also support non-financial use and further the development of inter-organisational exchange of information.
If implemented, the Directive's articles would significantly clarify licensing practice and improve the availability of public data. For example, article 9, Practical Arrangements, of the Directive states: "Member States shall ensure that practical arrangements are in place that facilitate the search for documents available for re-use, such as assets lists, accessible preferably online, of main documents, and portal sites that are linked to decentralised assets lists."
When it comes to pricing data transfers, the Directive sets a maximum fee and suggests that the pricing be based on the technical costs of extracting and transferring data. With digital data transmission or copying, billing for publication may result in higher expenses than the actual extraction. Already when the Directive was published, the instructions for pricing were a compromise. Since then, arguments for free distribution of public sector data have become stronger and stronger across Europe (see chapter 4).
According to a report from the Ministry (VM 2007), the implementation of the PSI Directive in Finland has resulted in changing section 34 of the Act on the Openness of Government Activities so that organisations are required to determine the fees for transferring data beforehand and publish them. Other measures supporting the goals of the Directive, such as the assets lists described in article 9, have not, so far, been taken in Finland. The authorities have created direct application-to-application connections to some producers of value-added services. However, these connections are not being actively offered to other operators. Authorities' webpages seldom offer any information on the data resources available for reuse.
According to the law, spatial data governing authorities should describe the information by using metadata and interlink them with a service provider. In addition, data sets which are meant for joint use should be placed visibly on the Internet, free for downloading. On the chargeability of the data, the law refers to the Act on Criteria for Charges Payable to the State, which means that possible changes to the Act affect spatial data, as well. However, using metadata is free for all and the possible charges need to be specifically explained. Electronic services and online payment methods need to be implemented before collecting fees. The terms of use and the contract model should be available on the web. Spatial data legislation and the procedures of its implementation could function as a model for opening other public sector data.
Unfortunately, chargeability leads to incomplete usage of high-quality data resources and gathering the same information repeatedly in several locations.
Eoghan OLionnain / flickr.com | Creative Commons
4. Financial Views on Open Data
trades and small marginal costs. Collecting even the smallest fees requires bureaucracy, which is not free to maintain. In addition to the financial aspect, free data equals democracy. People would have free and equal access to data to support their arguments, and they would be free to refine and use it as they see fit.
Case: Intra-governmental data transfers
"Making business with public records has moved money from one pocket to another, but there has been no increase in net income" (Kuronen 1998a).
Most intra-governmental data transfers are actually gratuitous. This is because the legislation governing the receiving authority states that it should have free access to another authority's data. Such provisions can be found, for example, in the Act on National Pension (568/2007), the Police Act (493/1995), the Statistics Act (280/2004) and the Customs Act (1466/1994). Regardless of this, almost half of the income from an authority's data transfers comes from another authority.
In two separate reports (2003 and 2004), the Ministry of Finance in Finland looked into the pricing of intra-governmental data transfers. The latter report includes five suggestions for action, which have not been implemented. The first suggestion was to switch to pricing based solely on marginal costs. For the report, a questionnaire was sent to the 17 largest producers and distributors of digital data within the government. According to the survey, these organisations earned over 28 million euro in data transfer fees in 2002. The costs of data transfers were reported at 13.5 million euro, which would leave 14.6 million euro as net income. The significance of data transfer income differed between organisations (table 4.1).
Table 4.1 The income from digital data transfers in some organisations in 2002 (VM 2004).
State
Municipality 1223
Population Register Centre; Finnish Meteorological Institute; Finnish Vehicle Administration; National Land Survey of Finland; Tax Administration
2528
4454  66  2127  6647  86,3%
330  229  5017  5576  10,0%
2728  236  1332  4296  9,0%
757  151  87  995  27,0%
In 2007, the earlier reports were revised and the following was noted: "Since 2004, changes to the pricing of data transfers have been made in the administration of the Ministry of Transport and Communications. Marginal cost pricing is applied to data transfers from the Digiroad system and the vehicle traffic system. Otherwise, the grounds for payments have stayed the same" (VM 2007).
4.3 EU's Proposal for the Upper Limit of Charges: Marginal Costs
Currently things change fast, and it is sometimes difficult for legislation to keep up. The PSI Directive (see 3.5.1), adopted in 2003, suggests giving up the criteria for charges and states that charges should not exceed the marginal costs. At the time the Directive was prepared, this was reasonable, since, for example, CD-ROMs were a common medium for data publication. The Directive does not specify the marginal costs. Instead, it uses terms such as "costs of reproducing and disseminating". Terms like these show just how old-fashioned the thinking was when the Directive was prepared: "The upper limit for charges set in this Directive is without prejudice to the right of Member States or public sector bodies to apply lower charges or no charges at all, and Member States should encourage public sector bodies to make documents available at charges that do not exceed the marginal costs for reproducing and disseminating the documents."
The main motive of the Directive is to add growth in the PSI market and, especially, to get companies to re-use the information produced in the public administration. In euro, the size of the information market may grow as prices are reduced, allowing new user companies to enter the field. The PSI Directive is on the right path, but it is a product of its time. In the development of the digital world, seven years equals an eternity. In the time span from the first webpage to the present day, the Directive falls somewhere in the middle. The qualitative change was not taken into consideration when the Directive was drawn up; instead, it aimed to increase, speed up and facilitate the re-use of public sector data resources using the same process as in the past. What this meant in practice was that companies could gain access to data sets only by contract.
Nowadays one can gain access to the Digiroad data set, a national road and street database in Finland, for marginal costs, which are approximately a few hundred euros for the entire database. There is a dramatic difference between this and the registers priced according to the criteria for charges. For example, 3.50 euro is charged for every digital inquiry from the Vehicle Registry in Finland, which means that the cost of the entire register would add up to millions of euro. Currently, the Digiroad data is delivered on a DVD-ROM disc. Using the cloud services of the web, it would be possible to transfer this amount of data to the subscriber at a cost of less than one euro. In addition, no start-up investments or maintenance costs are required. The price does not, however, include the possible costs of automating the updates or the registration of users. One has to wonder whether it would not be possible to finish what has been started, as we are so close to free data distribution. It is not hard to imagine that the last small fee, and the written contracts related to it, could easily limit new and creative use of data.
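The difference between per-inquiry pricing and a one-off transfer can be worked through as a rough calculation. The 3.50 euro fee and the one-euro transfer cost come from the text above; the number of records is an assumption made purely for illustration, and the one-euro figure is the text's estimate for Digiroad, borrowed here only as a point of comparison.

```python
PRICE_PER_INQUIRY_EUR = 3.50    # from the text: fee per digital inquiry
ASSUMED_RECORDS = 3_000_000     # hypothetical register size, not a real figure
BULK_TRANSFER_COST_EUR = 1.00   # from the text: upper bound for a cloud transfer


def cost_by_inquiry(records, price_per_inquiry):
    """Cost of obtaining every record one paid inquiry at a time."""
    return records * price_per_inquiry


per_inquiry_total = cost_by_inquiry(ASSUMED_RECORDS, PRICE_PER_INQUIRY_EUR)
ratio = per_inquiry_total / BULK_TRANSFER_COST_EUR

print(f"{per_inquiry_total:,.0f} EUR by inquiry")
print(f"{ratio:,.0f} times the assumed bulk transfer cost")
```

Under these assumptions the complete register would cost about ten million euro through paid inquiries. Any register with several hundred thousand records or more reaches the millions mentioned above.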
to Finnish universities to use for research and education. This, however, requires a definition of research use. Defining it can be frustrating when the line between research, its results and their financial utilisation is blurry. In addition, the possibilities for start-up companies to utilise data should be improved. For this purpose, the so-called start-off contracts were introduced. The contract would allow the distribution of data to new companies, and the payments would not fall due until the data sets started to generate profits (Hermans 2009).
In the long run, the previously mentioned models for charge discrimination are not recommended, no matter how good the intentions. They do not support comprehensively flexible utilisation of data, and maintaining several different contract practices simultaneously only burdens both the provider and the user of the data. The PSI Directive aims to dissolve exclusive and discriminating practices regarding pricing and contracts.
In many cases, it is difficult to say how much income actually comes from selling data or from charges based on marginal costs, or how much of the expenses come from the production and distribution of data. Separating the expenses and income specific to each data set would require highly developed and transparent cost accounting. In order to continue the discussion on the pricing models for the government, it would be beneficial to examine the relation between production, ownership, distribution and income in practice.
where a community is renting its own real estate to itself. Money is being moved from one unit to another, but in practice the transfer of money only creates expenses.
Ownership
The ownership of information is often perceived in the same way as the ownership of any other possession. In the previous chapter, we examined copyright and noted that where copyright arises automatically, it is also possible to waive it. Information in itself is not covered by copyright, only the form in which it is presented. If data produced in the public sector is seen as a public good, ownership is hard to determine. Therefore, producing a data set does not automatically lead to ownership. That is why we talk about data held by public administration instead of data owned by public administration.
Provision and marginal costs
Once the data has been collected and organised into databases, from which it is published to users in a downloadable format and through an interface, the marginal cost of one extra user is minimal or non-existent. This does not take away the fact that it costs money to produce the data set and that initial investments have to be made in a functioning distribution system. Public administration's data is by no means free to produce but, if so desired, it can be handed over to users for free. If the organisations collecting the data and maintaining the publication infrastructure are compensated, the result, raw data, can be distributed for free as a public good. As mentioned previously, data sets are usually collected and maintained for other reasons and with budget funds. The remaining question is the maintenance of the distribution infrastructure: whether it, too, should be paid for with budget funds or covered by charges based on marginal costs. Prior to the Internet, producing data was tied to its distribution, but that is no longer the case. The Internet offers an effective and economical way to circulate data as a public good. Both the profit and the utility value of public data might be significantly larger if it were provided for everyone to use for free.
In the name of equality and simplicity, the same conditions should apply to handing out data to government organisations, ordinary citizens and companies.
time consumption. In the Netherlands, the government surveys the time citizens spend on their tax forms, on reading government letters and on other interaction with the government (Den Hurk 2008). The idea is that time spent interacting with the government means less time to do something else. These surveys help create a measurable quantity, which can be used to compare alternative solutions and decrease the burden imposed by the government.
There is a strong force of the smallest common denominator in the web. Versatile, simple and easily adoptable solutions covering wide application areas may spread surprisingly widely.
diebmx / flickr.com | Creative Commons
5. Technical Preparations
This chapter gives an overview of the technical terminology related to opening data. We believe everyone working on open data projects should be able to understand the general technological basics of opening data. We hope that definitions clarifying the technology will help in practice when technical experts, other professionals and decision makers discuss the policies of opening data.
In previous chapters we offered guidance on the inventory of government data resources and examined data distribution from financial and legal perspectives. Hopefully, we have provided the necessary tools for decision-making. The next phase is to define the technical framework for publishing data, which is highly relevant to the utilisation and processing of data. If utilising data is too difficult, the possibilities of open data will not materialise.
In the same way as we introduced the most important laws and regulations regarding open data in chapter 3, we begin this chapter with a table listing all the technologies and standards introduced in it (Table 5.1). One should not be scared by the number of standards, protocols and formats. Memorising or fully comprehending them is not necessary, since it is easy enough to refer back to the list. In this context, they are used mostly as a means of referring to the technical solutions for opening data. This is not a comprehensive list, but we believe it contains the most important modern technologies.
XML: Extensible Markup Language is a set of rules for encoding documents; it can be extended for different uses with new markup elements.
CSV: Comma Separated Values is a file format that uses commas to separate values from one another. Files can be opened with spreadsheet programmes.
JSON: JavaScript Object Notation is a language-independent, lightweight, text-based open standard.
RDF: Resource Description Framework is a standard for the linked data paradigm, where individual information resources are described through inter-linked vocabularies. (In this chapter, the abbreviation RDF refers to RDF files in XML format, most often named using the suffix .rdf.)
RSS and GeoRSS: Really Simple Syndication is an XML-based format for the transmission of feeds. If geographical coordinates are attached to the feed, it is a GeoRSS feed.
ATOM: Atom refers to two closely related standards. Atom Syndication Format is an XML-based markup language for presenting feeds, and Atom Publishing Protocol (AtomPub) is a simple HTTP-based standard describing a programming interface, meant for blog updates.
KML: Keyhole Markup Language is an XML-based language for marking up spatial data and using it with map services.
HTML: Hypertext Markup Language is the key file format of the www; it enables both presenting the structure of webpages (but not the structure of their contents) and linking webpages to each other, forming a net of hypertexts.
PDF: Portable Document Format is a common file format on the Internet; its focus is on good printability of the document.
HTTP: Hypertext Transfer Protocol is one of the key standards of the Internet.
PuSH: PubSubHubbub is a transfer protocol used mainly for quick notifications of information updates.
Microformats: a collection of small formats, composed of HTML elements, used to incorporate machine-readable information into www pages.
RDFa: a format used to incorporate machine-readable meanings into www pages.
URI: Uniform Resource Identifier is an identifier for Internet resources.
REST: Representational State Transfer is an architecture model based on the HTTP protocol, used for implementing programming interfaces.
Transfer protocol
5.1 Planning
Making data machine-readable is not a straightforward mechanical procedure. The following questions about different implementation options are typical when a data set is published on the web:
What is the preferred format for publication?
Which standards should be used?
How should we describe the published information, and what metadata should we offer?
Should we create an interface, or put the data up for download as a file?
Open source code, open data and open formats are strategic tools for data administration, designed to prevent dependence on a single provider. Open source code frees government software development from dependence on a single supplier, and open data ensures that the government does not hold an unintentional monopoly on creating and developing new services. Transferring data, whether open or closed, calls for transfer formats that are independent of platforms and software.
Standards are important. Open standards are a prerequisite for an open market. One might not think about standards daily, but without them buyers would always be dependent on manufacturers. In other words, there would be no possibility of purchasing add-ons or other compatible components for previously bought items. We all take it for granted that the energy-saving light bulb we bought is compatible with the light fitting at home. This type of thinking is relatively new to the field of information technology. For example, it was not until recently that, with help from the EU, mobile phone manufacturers agreed on how to attach the charger to the phone. In future, the buyer of a phone will no longer be dependent on the manufacturer if the charger gets lost (EU 2009).
Open standardisation is not particularly high on system suppliers' list of priorities. A strong supplier may benefit from its customers being dependent on its products because of, for example, the supplier's use of non-standard file formats. Once the market changes, the supplier may make concessions. It might publish its standard for others to use, but still hold on to its development and charge for its use.
used for publishing more complex data structures, or one can define one's own XML structure. To harmonise different markup languages, the linked data paradigm presents records in RDF format (see 5.6). Compared to XML, JSON is a lighter way of presenting data in a format that is easy to transfer. With JSON syntax, one can present simple name-value pairs as well as more complex data structures in a way that makes them easy to process with common web programming languages.
In the initial stages, the data provider should publish the data in whichever way the provider finds easiest. Later, the usability of the data can be improved by converting it to other file formats. As a ground rule, the conversions should be done by the provider; otherwise every user has to make the same conversions themselves. The same information can be provided in XML, JSON and RDF formats (see 5.5).
Once the data is available in a machine-readable format, it has to be documented. Documentation explains exactly what each piece of information means. For example, comma or tab separation makes it possible to present tabular data in a machine-readable way, but it does not tell which columns the table contains. Many tables would be difficult to work with if there were no column headings. Therefore, in addition to machine-readability, it is important to offer documentation on what the structure contains. In our example, addresses and opening hours make it easy to figure out what the content is, but most cases are far more complex and a clear manual for the data is needed.
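The idea that a provider converts one data set into several formats can be sketched with the Python standard library. The sample rows, the column names and the element names `places` and `place` below are hypothetical, chosen only to echo the addresses and opening hours mentioned above. The header row of the CSV file is what carries the column names into the JSON and XML output.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data: two service points with addresses and opening hours.
CSV_TEXT = """name,address,opening_hours
Main Library,Example Street 1,Mon-Fri 09:00-20:00
Health Centre,Sample Road 5,Mon-Fri 08:00-16:00
"""


def read_rows(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV text; the header row supplies the column names."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def to_json(rows: list[dict[str, str]]) -> str:
    """Serialise the rows as a JSON array of name-value objects."""
    return json.dumps(rows, indent=2, ensure_ascii=False)


def to_xml(rows: list[dict[str, str]]) -> str:
    """Serialise the rows as XML, one <place> element per row."""
    root = ET.Element("places")
    for row in rows:
        place = ET.SubElement(root, "place")
        for column, value in row.items():
            # Column names become element names, so they must be valid XML names.
            ET.SubElement(place, column).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    rows = read_rows(CSV_TEXT)
    print(to_json(rows))
    print(to_xml(rows))
```

Without the header row, the same file would still be machine-readable, but neither conversion could label the values, which is the documentation gap described above.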
Picture 5.1: Hypertext is read with a browser, which fetches HTML documents (webpages) from web servers and shows them in a human-readable format.
of the Web. In 2004, W3C finished a recommendation that lists the central parts of Web architecture and its design principles (W3C 2004). Unfortunately, these recommendations are not always followed, for example when URI identifiers are generated programmatically. The recommendation offers a good framework for publishing data on the web. Web architecture enables gradual development in a distributed web environment. It has proven adaptable and has demonstrated the force of a single, organically growing, linked information space. Later in this chapter we will discuss linked data. It gives us an idea of how to publish data on the Internet, linking it to the worldwide www information space. REST-style service interfaces, introduced in connection with interfaces, are based on Web architecture and are therefore good platforms for publishing data on the open web.
There is a strong force of the smallest common denominator on the Internet. Versatile, simple and easily adoptable solutions covering wide application areas may spread surprisingly widely. One can save both time and effort when technology components, know-how and practices can be reused for problem-solving. This is exactly what happened with the www after it became a common tool for implementing different services. Practically the entire Internet is based on a few widely adopted standards (TCP/IP, HTTP, HTML, CSS, JavaScript etc.). These standards are well known by developers, administrators and software architects. In addition, there is a great deal of open source software and there are advanced tools for implementing systems based on Web architecture. One of the most important tools is the browser (picture 5.1). The features of the original browser, developed for browsing hypertext, have improved, and browsers are now used as user interfaces to other Internet services, such as email, data transfers and instant messaging in social media. Even the term www is beginning to fade from spoken language.
The central concept in Web architecture is the global information space, which consists of clearly identifiable, interlinked resources. A resource can be a document or an online computer programme that has an unambiguous identifier. The URI identifiers known to all users of the web include webpage addresses, such as http://www.suomi.fi. An HTTP URI is the recommended notation for things brought into the scope of Web architecture.
A resource, for example the results of a vote in the parliament, can be presented in different formats, such as HTML, CSV, XML or RDF (see 5.4). At the very core of Web architecture lies the Hypertext Transfer Protocol (HTTP), which defines the possible functions (GET, PUT, POST and DELETE) for the interaction between browsers and www servers. This basic structure has proven functional for both the users of the web and software developers.
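The four HTTP functions named above can be illustrated without any network code. The following is a simplified sketch, not a web server: the class `ResourceStore`, its `handle` method and the example paths are hypothetical, and the status codes are the standard HTTP ones used in a reduced way.

```python
from typing import Optional


class ResourceStore:
    """In-memory stand-in for a web server holding resources by URI path."""

    def __init__(self) -> None:
        self._resources: dict[str, str] = {}
        self._next_id = 1

    def handle(self, method: str, path: str, body: Optional[str] = None) -> tuple[int, str]:
        """Return (status code, response body) for one request."""
        if method == "GET":      # read a representation of the resource
            if path in self._resources:
                return 200, self._resources[path]
            return 404, ""
        if method == "PUT":      # create or replace the resource at this URI
            created = path not in self._resources
            self._resources[path] = body or ""
            return (201 if created else 200), ""
        if method == "POST":     # add a new member under a collection URI
            new_path = f"{path.rstrip('/')}/{self._next_id}"
            self._next_id += 1
            self._resources[new_path] = body or ""
            return 201, new_path
        if method == "DELETE":   # remove the resource
            if self._resources.pop(path, None) is None:
                return 404, ""
            return 204, ""
        return 405, ""           # method not supported by this sketch


if __name__ == "__main__":
    store = ResourceStore()
    print(store.handle("PUT", "/votes/2010-03", "yes=101;no=92"))
    print(store.handle("GET", "/votes/2010-03"))
```

The point of the design is that every resource has its own URI and the same small set of functions applies to all of them, which is what REST-style interfaces build on.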
updates by sending update enquiries to the publisher, even though 99 % of the time the answer is no. Protocols for creating real-time services have only recently become more common. These protocols operate according to a reversed logic: the publisher sends the updates to the subscribers who have asked for them. Currently, the most popular protocol for the real-time web is PubSubHubbub (PuSH), which sends new content directly to the subscribers.
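The reversed logic can be shown with a toy example. This sketch is not the PubSubHubbub protocol itself, which works over HTTP with callback addresses; it only illustrates, inside one process, how a hub pushes content to subscribers instead of waiting to be polled. The names `Hub`, `subscribe` and `publish` are hypothetical.

```python
from typing import Callable

Callback = Callable[[str, str], None]


class Hub:
    """Toy in-process hub: subscribers register once, publishers push updates."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callback]] = {}

    def subscribe(self, topic: str, callback: Callback) -> None:
        """Register interest in a topic (for example, a feed address)."""
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic: str, content: str) -> int:
        """Push new content to every subscriber; return how many were notified."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, content)
        return len(callbacks)


if __name__ == "__main__":
    hub = Hub()
    received: list[str] = []
    hub.subscribe("roadworks", lambda topic, content: received.append(content))
    hub.publish("roadworks", "Lane closed on Example Road")
    print(received)
```

With this arrangement, no work is done while nothing changes, in contrast to polling, where almost every enquiry is answered with "no updates".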
Decision-making: Alteration proposals for zoning, motions and decisions on a specific region or address, etc.
Services: Commercial and public services: libraries and hospitals and their opening hours, shops, restaurants, adventure services, etc.
Traffic: Information on traffic jams, roadwork, public transportation stops and timetables, real-time locations of vehicles, etc.
Road weather, forecasts, etc.
Photos, videos, stories linked to a specific location, etc.
From a technical point of view, adding spatial data to RSS feeds is fairly simple. A feed containing location markup is called a GeoRSS feed. The interface for Google Maps contains features for depicting GeoRSS feeds on a map. KML Network Link allows an even more versatile publication of spatial data. Google Maps and Google Earth can both utilise KML feeds. Information from a specific area is provided to the user, and the data is updated whenever the view changes. For this reason, feeds should contain information that does not update automatically but becomes relevant for the users when they are viewing a specific location. The same GeoRSS or KML feeds can be utilised in several map services (or any service utilising spatial data) and vice versa: GeoRSS and KML feeds from several sources can be attached to a single service.
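To show how little is needed to turn an RSS feed into a GeoRSS feed, the following sketch builds a minimal RSS 2.0 document with the Python standard library and attaches a `georss:point` element to each item. It assumes the GeoRSS-Simple encoding, where a point is written as latitude and longitude separated by a space. The feed title, links and item text are hypothetical.

```python
import xml.etree.ElementTree as ET

GEORSS_NS = "http://www.georss.org/georss"
ET.register_namespace("georss", GEORSS_NS)


def build_feed(title: str, link: str, description: str, items: list[dict]) -> str:
    """Build an RSS 2.0 document whose items carry a georss:point element."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for entry in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
        # GeoRSS-Simple writes a point as "latitude longitude".
        point = ET.SubElement(item, f"{{{GEORSS_NS}}}point")
        point.text = f"{entry['lat']} {entry['lon']}"
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical item located in central Helsinki.
    print(build_feed(
        "Roadworks",
        "http://example.org/roadworks",
        "Hypothetical roadworks feed",
        [{"title": "Lane closed", "link": "http://example.org/roadworks/1",
          "lat": 60.17, "lon": 24.94}],
    ))
```

Apart from the namespace declaration and the one extra element per item, the feed is an ordinary RSS feed, so readers that do not understand GeoRSS can still display it.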
In technical terms
1. Use URIs (Uniform Resource Identifiers) as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
4. Include links to other URIs, so that they can discover more things.
The purpose
1. To form a concept of the thing, one that can be talked about and referred to.
2. To offer information related to the concept in the place where one would naturally look for it.
3. To make it easy to find additional information on named objects and resources.
4. To form relationships between concepts, creating an ever-growing web instead of separate pieces of information.
Regardless of the format in which the information is published and the format in which it originally existed, RDF is a useful model for connecting data resources together via the Internet. As a technology, RDF makes it easy to link things and concepts to each other, and later to link independent, separately designed systems to each other. Since RDF allows the same data to be described using different vocabularies, terminology can be harmonised to increase compatibility wherever it is most cost-efficient.
When publishing open data, it is not necessary to search for joint standards or to describe all the data there is, for example all information about a single school. The viewpoints of individuals and organisations differ greatly. For example, the school can be found in certain registers as a tenant, the Ministry of Education could have a lot of information about the same school, and the school itself holds a lot of information, including opening hours. There is no reason to assume that all these actors would start using the same standardised terminology to describe the school. Even if a term list were created, it would be the result of a compromise and would therefore no longer serve the needs of any of the actors. Nevertheless, commonly accepted terms for description could significantly increase the integrated use of data reserves, and creating such lists should therefore be supported.
The force of the smallest common denominator (see 5.3) operating on the Internet has not in any way prevented individuals and communities from creating more detailed practices on joint platforms. For example, the markup language for www sites, HTML, is a widely accepted standard. Yet narrower practices, such as microformats, have been created on top of HTML to allow address information to be marked up in HTML in a machine-readable format. If these narrow practices are applied, it is crucial to do so in a way that by no means restricts the use of the rest of the site. The site is visible even if the browser does not support the microformat, and using a microformat in one part of the site does not require the whole website to use all microformats.
In terms of terminology, RDF offers a balance between ease of use and the benefits of standardisation. The individuals and organisations publishing data have the right to choose how to describe their data. In addition, they have the chance to distribute their vocabularies and reuse parts of vocabularies created by others. Another point in favour of using RDF is that vocabularies and term lists are easy to add later.
The Linked Data web is designed to grow and develop organically, in the same way as the web of linked documents has done. It grows when people and organisations, independently of each other, spontaneously add their own resources to the web and link them together. Self-organisation and incoherence are part of the development: links are broken, new links are made, and term lists and vocabularies are merged and separated.
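The school example can be sketched in a few lines to show why a shared URI matters more than a shared vocabulary. This is a simplified illustration that uses plain Python tuples as subject-predicate-object triples instead of an RDF library; all URIs, property names and values are hypothetical and use the example.org domain, which is reserved for documentation.

```python
Triple = tuple[str, str, str]

# Hypothetical URI that all three publishers use as the name of the school.
SCHOOL = "http://example.org/id/school/42"

# Each publisher describes the same school with its own vocabulary.
property_register: set[Triple] = {
    (SCHOOL, "http://example.org/property#tenantOf", "http://example.org/id/building/7"),
}
ministry_data: set[Triple] = {
    (SCHOOL, "http://example.org/edu#pupilCount", "312"),
}
school_site: set[Triple] = {
    (SCHOOL, "http://example.org/hours#opens", "08:00"),
}


def merge(*graphs: set[Triple]) -> set[Triple]:
    """Merging RDF-style graphs is a set union of their triples."""
    merged: set[Triple] = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph: set[Triple], subject: str) -> dict[str, list[str]]:
    """Collect everything the merged graph says about one URI."""
    facts: dict[str, list[str]] = {}
    for s, predicate, obj in sorted(graph):
        if s == subject:
            facts.setdefault(predicate, []).append(obj)
    return facts


if __name__ == "__main__":
    combined = merge(property_register, ministry_data, school_site)
    for predicate, values in describe(combined, SCHOOL).items():
        print(predicate, values)
```

None of the three publishers had to agree on terminology in advance; the data sets combine because they name the school with the same URI, and the vocabularies can be aligned later where it pays off.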
Wouldn't it be great if all public sector data could be easily found and available in one location?
Teknostadi, Jussi Seppänen | Photo: Kirmo Kivelä
6. Open Data Infrastructure
Case: data.gov.uk
So far, the most impressive government data catalogue is data.gov.uk, released on January 19, 2010 as a public beta version. This site, put together in six months under the supervision of Tim Berners-Lee, beats the US data.gov by a country mile. At the time of publication, the UK catalogue contained approximately 3000 data sets, which is three times as many as data.gov. Besides being larger, the UK data catalogue can be considered more interesting than the American one. Since its release, data.gov has faced serious criticism for containing only information that does not provoke political conversation. In the UK catalogue, one can find statistics on the deaths of soldiers on duty and other information that is interesting from the point of view of transparency.
The UK catalogue was designed to support machine-readability and the characteristics of the semantic web. One can search the database using the SPARQL query language. The results (for example, British schools) can be examined in the browser, and nothing has to be downloaded to the reader's computer. The idea behind the solution used in the catalogue is that its technology is easily adaptable.
In other words, a clearing house organisation would be responsible for making the raw data available in a machine-readable format to every interested party both inside and outside the government. It would have no obligation to process data or produce any information services for ordinary citizens. Currently, the authorities are obligated to create information portals for citizens, and the creation of machine-readable interfaces often receives very little attention. Offering the raw material and the interfaces could be outsourced to a clearing house, and the agencies would be free to focus on their current functions: providing basic services and processing information. Naturally, processing creates new information that can no longer be called raw data but can still be distributed through interfaces for free. In these cases, the agencies could outsource the distribution of the information they produce to a clearing house.
References
Benson, Yrjö (2009). Valtion IT-toiminta. Valtion IT-johtaja Yrjö Benson alustus Kuntaliiton seminaarissa 4.11.2009. Helsinki: Kuntaliitto. qsb.webcast.fi/k/kuntaliitto/kuntaliitto_2009_1104_07_benson/. Viewed 10.3.2010.
Berners-Lee, Tim (2006). Linked Data. www.w3.org/DesignIssues/LinkedData.html. Viewed 10.3.2010.
Cloud, John (2006). The Gurus of YouTube. 16.12.2006. Time magazine. www.time.com/time/printout/0,8816,1570721,00.html. Viewed 8.3.2010.
DCMS, & BIS (2009). Digital Britain. Final report. Building Britain's future. DCMS (Department of culture, media and sport) and BIS (Department of Business Innovations and Skills). www.culture.gov.uk/images/publications/digitalbritain-finalreport-jun09.pdf. Viewed 15.12.2009.
Dekkers, Makx, Polman, Femke, de Velde, Robbin, & de Vries, Marc (2006). MEPSIR Measuring European Public Sector Information Resources. Final Report of Study on Exploitation of public sector information benchmarking of EU framework conditions. ec.europa.eu/information_society/policy/psi/docs/pdfs/mepsir/final_report.pdf. Viewed 10.3.2010.
EU (2007). Lausunto 4/2007 henkilötietojen käsitteestä. 20.6.2007. ec.europa.eu/justice_home/fsj/privacy/docs/wpdocs/2007/wp136_fi.pdf. Viewed 8.3.2010.
EU (2007b). Communication from the Commission. Pre-commercial Procurement: Driving innovation to ensure sustainable high quality public services in Europe. SEC (2008) 1668. http://ec.europa.eu/invest-in-research/pdf/download_en/com_2007_799.pdf. Viewed 17.3.2010.
EU (2009). Cellphone charger harmonization. ec.europa.eu/enterprise/sectors/rtte/chargers/index_en.htm. Viewed 8.3.2010.
EU (2010). EUROPE 2020: A strategy for smart, sustainable and inclusive growth. http://ec.europa.eu/eu2020/pdf/COMPLET%20EN%20BARROSO%20%20%20007%20-%20Europe%202020%20-%20EN%20version.pdf. Viewed 15.3.2010.
Eups20 (2010). An Open Declaration on European Public Services. http://eups20.wordpress.com/the-open-declaration/. Viewed 18.3.2010.
EY Komissio (2009). Julkisen sektorin hallussa olevien tietojen uudelleenkäyttö - direktiivin 2003/98/EY uudelleentarkastelu. Komission tiedonanto KOM(2009) 212 lopullinen. www.epsiplatform.eu/media/files/com09_212_fi. Viewed 5.2.2010.
EY (2003). Euroopan parlamentin ja neuvoston direktiivi 2003/98/EY, annettu 17. päivänä marraskuuta 2003, julkisen sektorin hallussa olevien tietojen uudelleenkäytöstä. www.diges.info/pdf/PSI-direktiivi_suomi.pdf. Viewed 8.3.2010.
Forss, Marko (2009). Virtuaalinen lähipoliisiryhmä. Poliisin kotisivut. www.poliisi.fi/irc-galleria. Viewed 8.3.2010.
García-García, J., & Alonso de Magdaleno, I. (2010). Commons-based innovation, The Linux kernel case. In 2nd European Conference on Corporate R&D. iri.jrc.ec.europa.eu/concord-2010/posters/Garcia-Garcia.ppt. Viewed 9.3.2010.
HS (2010). Ministeri pohtii viranomaistietojen vapauttamista verkkoon. Kansalaisista virkatiedon jalostajia. Lehtiartikkeli 7.2.2010. Helsingin Sanomat.
HS (2010). Suomi elää velaksi 2010. Lehtiartikkeli 9.1.2010. Helsingin Sanomat.
Hermans, Outi, & Hermans, Raine (2009). Paikkatietojen yhteiskäyttö ja jakeluperiaatteet. Hinnoitteluperiaatteiden analyysi ja kansantaloudellisten vaikutusten simulointi. Keskusteluaiheita 1194. www.etla.fi/files/2372_Dp1194.pdf. Viewed 20.12.2009.
Hintikka, Kari A. (1993). Tieto - neljäs tuotannontekijä: tehtaasta televirtuaalisuuteen. Helsinki: Painatuskeskus.
Hintikka, Kari A. (2009). Toimintatapojen muutoksia eli kollektiiviset ja hajautetut toimintamallit. Alustus Suomi.fi:n seminaarissa 7.4.2009. www.slideshare.net/ubiq/suomifi-seminaari-742009-alustus-kari-a-hintikka. Viewed 7.3.2010.
Howe, Jeff (2006). The Rise of Crowdsourcing. Wired magazine. www.wired.com/wired/archive/14.06/crowds.html. Viewed 8.3.2010.
Joey Van Den Hurk, Peter Rem, M. J. (2008). Standard Cost Model for Citizens - User's guide for measuring administrative burdens for Citizens. Ministry of the Interior and Kingdom Relations. whatarelief.eu/publications/standard-cost-model. Viewed 10.3.2010.
Kansallinen ennakointiverkosto (2008). Internet ja vuorovaikutuksen uudet muodot. (Nieminen-Sundell, Riitta, toim.). www.foresight.fi/wp-content/uploads/2009/08/Internet ja vuorovaikutuksen uudet muodot.pdf. Viewed 19.1.2010.
Karvonen, Erkki (2000). Elämmekö tieto- vai informaatioyhteiskunnassa? Teoksessa Vuorensyrjä, Matti & Savolainen, Reijo (toim.) Tieto ja tietoyhteiskunta. Helsinki: Gaudeamus.
Kuntaliitto (2010). Kuntien verkkoviestintäohje. Kuntaliiton verkkojulkaisu. Kuntaliitto. hosted.kuntaliitto.fi/intra/julkaisut/pdf/p20100127150612668.pdf. Viewed 10.3.2010.
Kuronen, Timo (1998). Tietovarantojen hyödyntäminen ja demokratia. Helsinki: Suomen itsenäisyyden juhlarahasto. www.sitra.fi/julkaisut/tietoyhteiskunta/sitra174.pdf?download=Lataa+pdf. Viewed 12.12.2009.
Kuronen, Timo (1998b). Tietovarantojen hyödyntäminen ja demokratia. Esimerkkejä tiedon prosesseista. SITRA 174. Helsinki: Suomen itsenäisyyden juhlarahasto.
LVM, & VM (2010). Työryhmät edistämään julkisen tiedon saatavuutta. Tiedote 11.02.2010. Valtiovarainministeriö ja Liikenne- ja viestintäministeriö. www.lvm.fi/web/fi/tiedote/view/1130523. Viewed 8.3.2010.
LVM (2009). Sähköisesti nouseva Suomi. Viestinnän elinkeinopoliittisen työryhmän loppuraportti 22.10.2009. www.lvm.fi/c/document_library/get_file?folderId=339549&name=DLFE-9534.pdf. Viewed 3.3.2010.
Lasica, Joseph Daniel (2009). Identity in the Age of Cloud Computing. www.aspeninstitute.org/sites/default/files/content/docs/pubs/Identity_in_the_Age_of_Cloud_Computing.pdf. Viewed 5.3.2010.
Luoto, Karoliina (2009). Suomen tietoyhteiskunta 2020. Foresight.fi -blogimerkintä 17.12.2009. Sitra. www.foresight.fi/2009/12/17/suomen-tietoyhteiskunta-2020/. Viewed 15.2.2010.
Mannermaa, Mika (2006). Demokratia tulevaisuuden myllerryksessä. Yhteiskunnallinen vaikuttaminen uudessa viitekehyksessä. Helsinki, Eduskunta, tulevaisuusvaliokunta. www.eduskunta.fi. Viewed 10.3.2010.
MMM (2004). Kansallinen paikkatietostrategia 2005-2010, Maa- ja metsätalousministeriö MMM:n julkaisuja 10/2004. Retrieved from www.mmm.fi/fi/index/etusivu/maanmittaus_paikkatiedot/paikkatietojenyhteiskaytto/kansallinenpaikkatietostrategia.html. Viewed 10.3.2010.
Mokka, Roope & Neuvonen, Aleksi (2006). Yksilön ääni. Hyvinvointivaltio yhteisöjen ajalla. Helsinki: Sitra. www.sitra.fi/julkaisut/raportti69.pdf. Viewed 8.3.2010.
Neuvottelukunta (2009). Tietoyhteiskunnan vauhdittamiseen löydettävä ratkaisuja, Lehdistötiedote 24.11.2009. www.lvm.fi/web/fi/tiedote/view/994565. Viewed 10.3.2010.
Niiniluoto, Ilkka (1989). Informaatio, tieto ja yhteiskunta. Filosofinen käsiteanalyysi. Helsinki: Valtionhallinnon kehittämiskeskus.
Niiniluoto, Ilkka (2000). Uskomuksia ilman tietoa? www.uta.fi/~attove/niini.pdf. Viewed 7.3.2010.
O'Reilly, Tim (2010). Open Government: Collaboration, Transparency, and Participation in Practice. (D. Lathrop & L. Ruma) (p. 432). O'Reilly Media; 1 edition.
OECD (2006). OECD Principles and Guidelines for Access to Research Data from Public Funding. www.oecd.org/dataoecd/9/61/38500813.pdf. Viewed 20.1.2010.
OECD (2008). OECD Recommendation of the Council for Enhanced Access and More Effective Use of Public Sector Information. www.oecd.org/dataoecd/0/27/40826024.pdf. Viewed 10.3.2010.
Open Knowledge Foundation (2006). Open Knowledge Definition (OKD). Open Knowledge Foundation. www.opendefinition.org/okd/. Viewed 10.3.2010.
Otakantaa.fi (2010). Miten suomalaista tietoyhteiskuntaa pitäisi kehittää? Liikenne- ja viestintäministeriön kysely Otakantaa.fi -palvelussa. www.otakantaa.fi/forum.cfm?group=428&sort=Otsikko&id=&ref=Kaikki keskustelut&pg=4. Viewed 10.3.2010.
Paukku, Paul (2009). Julkiset sähköiset palvelut / kehittämisen pullonkaulat. Selvitysmiehen raportti 12.6.2009. www.lvm.fi/web/fi/julkaisu/view/906227. Viewed 7.3.2010.
Pursiainen, Harri (2009). Kansallinen älyliikenteen strategia. Selvitysmiehen ehdotus 30.10.2009. www.lvm.fi/web/fi/tyoryhmat/tyoryhma/view/845060. Viewed 10.3.2010.
Shirky, Clay (2008). Here Comes Everybody: The Power of Organizing Without Organizations. Penguin Press HC.
TEM (2008). Kansallinen innovaatiostrategia. http://www.tem.fi/files/19704/Kansallinen_innovaatiostrategia_12062008.pdf. Viewed 15.3.2010.
TEM (2009). Kilpailulaki 2010 -työryhmän mietintö. www.tem.fi/files/21617/TEM4.pdf. Viewed 10.3.2010.
TEM (2010). Kysyntä- ja käyttäjälähtöisen innovaatiopolitiikan toimenpideohjelma. www.tem.fi/files/26093/OSA_2_final.pdf. Viewed 20.2.2010.
Torkington, Nat (2010). Rethinking Open Data. Lessons learned from the Open Data front lines. O'Reilly Radar blog. radar.oreilly.com/2010/02/rethinking-open-data.html. Viewed 20.1.2010.
Turkki, Teppo (2009). Nykyaikaa etsimässä. Suomen digitaalinen tulevaisuus. www.eva.fi/files/2573_nykyaikaa_etsimassa.pdf. Viewed 8.3.2010.
UK (2009). Putting the Frontline First: smarter government. www.hmg.gov.uk/media/52788/smarter-government-final.pdf. Viewed 8.3.2010.
VM (2003). Hallinnon sisäisten tietoluovutusten hinnoittelu, Hallinnon tietoluovutusten hinnoitteluhankkeen raportti 30.5.2003. Työryhmämuistioita 16/2003. www.vm.fi/vm/fi/04_julkaisut_ja_asiakirjat/01_julkaisut/04_hallinnon_kehittaminen/37893/40524_fi.pdf. Viewed 26.1.2010.
VM (2004). Hallinnon sisäisten tietoluovutusten maksukäytäntöjä selvittäneen työryhmän muistio. Työryhmämuistioita 11/2004. www.vm.fi/vm/fi/04_julkaisut_ja_asiakirjat/01_julkaisut/04_hallinnon_kehittaminen/89111_fi.pdf. Viewed 5.3.2010.
VM (2005). Direktiivi 2003/98/EY julkisen sektorin hallussa olevien tietojen uudelleenkäytöstä. Kansallinen implementointi ja vastaavuus Suomen lainsäädäntöön. Muistio 26.5.2005. www.vm.fi/vm/fi/04_julkaisut_ja_asiakirjat/03_muut_asiakirjat/95083.pdf. Viewed 8.3.2010.
VM (2009). Valtiotason arkkitehtuurit -hanke. Julkishallinnon tietoarkkitehtuuri. Määrittely 0.3 21.12.2009. www.vm.fi/vm/fi/04_julkaisut_ja_asiakirjat/03_muut_asiakirjat/20100105Valtio/04_Julkishallinnon_tietoarkkitehtuuri_v0.3.pdf. Viewed 15.1.2010.
VM (2009b). Sähköisen asioinnin ja demokratian vauhdittamisohjelman (SADe) toteuttamissuunnitelma 2009-2014. Valtiovarainministeriön muistio 16.6.2009. www.vm.fi/vm/fi/04_julkaisut_ja_asiakirjat/03_muut_asiakirjat/suunnitelma_SADe_160609.pdf. Viewed 10.3.2010.
VM (2009c). Valtiotason arkkitehtuurit -hanke. Valtionhallinnon arkkitehtuuriperiaatteet ja -linjaukset. Määrittely 0.62 Päiväys 31.12.2009. http://www.vm.fi/vm/fi/04_julkaisut_ja_asiakirjat/03_muut_asiakirjat/20100105Valtio/01_Valtionhallinnon_arkkitehtuuriperiaatteet_ja_-linjaukset_v0.62.pdf. Viewed 15.3.2010.
VM (2010). Tilastotoimen tehostamista ja alueellistamista valmistelleen työryhmän loppuraportti. http://www.vm.fi/vm/fi/04_julkaisut_ja_asiakirjat/01_julkaisut/04_hallinnon_kehittaminen/20100302Tilast/Loppuraportti_2_3_2010_liitteet.pdf. Viewed 10.3.2010.
VNK (2006). Kansallinen tietoyhteiskuntastrategia 2007-2015. Uudistuva, ihmisläheinen ja kilpailukykyinen Suomi. www.arjentietoyhteiskunta.fi/files/34/Kansallinen_tietoyhteiskuntastrategia.pdf. Viewed 10.3.2010.
Voutilainen, Tomi (2009). ICT-oikeus sähköisessä hallinnossa. Väitöskirja Joensuun Yliopisto (p. 360). Edita Publishing Oy.
W3C (2004). Architecture of the World Wide Web, Volume One. W3C Recommendation 15 December 2004. W3C. www.w3.org/TR/webarch/. Viewed 10.3.2010.
YK (2010). Green Growth. UNESCAP. www.greengrowth.org/.
YLE (2008). Yleisradion hallituksen toimintakertomus v. 2008. yle.fi/yleista/kuvat/2008tilinpaatos.pdf.
YLE (2009). Tietosuojavaltuutettu vaatii yleistä tietoturvalakia. 20.2.2009. YLE Uutiset verkkosivu. yle.fi/uutiset/kotimaa/2009/02/tietosuojavaltuutettu_vaatii_yleista_tietoturvalakia_560232.html. Viewed 8.3.2010.
YLE (2009). Ylen mahdollistajastrategia. Yle. yle.fi/yleista/kuvat/YLE_mahdollistajastrategia.pdf. Viewed 8.3.2010.