Microdata documentation
"From the archivist's and the end user's perspective, a ‘good’ dataset is one that is easy to use. Its documentation is clear and easy to understand, the data contain no surprises, and users are able to access the dataset with relatively little start-up time." (Guide to Social Science Data Preparation and Archiving, ICPSR)
The data documentation, or metadata, helps the researcher to:
- Find the data they are interested in. Without names, abstracts, keywords, and other important metadata elements, it might be difficult for a researcher to locate specific datasets and variables. Any cataloguing and resource location system, whether manual or digital, is based on metadata.
- Understand what the data are measuring and how the data have been created. Without proper descriptions of the survey design and the methods used when collecting and processing the data, the risk is high that the user will misunderstand and even misuse them.
- Assess the quality of the data. To know whether data are useful for a research project, researchers need information about the data collection standards, as well as any deviations from the planned standards.
Rich metadata also eases the burden on the data producer by reducing the need to provide regular support to users of the data.
Traditionally, data producers and data archives produced text-based codebooks. Today's alternative is XML-based codebooks, produced according to international metadata standards such as the Data Documentation Initiative (DDI) and the Dublin Core. To facilitate the documentation of microdata, the IHSN distributes the Microdata Management Toolkit, and promotes the adoption of international best practices.
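To make the idea of an XML-based codebook more concrete, the following is a minimal sketch of how a few study-level descriptors could be written as Dublin Core elements using Python's standard library. It is an illustration only, not the output of the Microdata Management Toolkit: the `build_dc_record` function, the `metadata` wrapper element, and the survey details are all hypothetical. Only the Dublin Core namespace and the element names (`title`, `creator`, `subject`, `description`) come from the standard itself.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_dc_record(fields):
    """Return an XML string holding Dublin Core elements for one dataset.

    `fields` maps a Dublin Core element name to a string or a list of
    strings; a list yields one repeated element per value.
    """
    ET.register_namespace("dc", DC_NS)
    # Hypothetical wrapper element; real catalogues define their own container.
    root = ET.Element("metadata")
    for name, values in fields.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Made-up survey details, for illustration only.
record = build_dc_record({
    "title": "Example Household Survey 2010",
    "creator": "Example National Statistics Office",
    "subject": ["poverty", "education"],
    "description": "Illustrative abstract describing the survey.",
})
print(record)
```

Because the descriptors are stored as named elements rather than free text, a catalogue can index fields such as `subject` directly, which is what makes keyword-based discovery of datasets possible.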
As recommended by the Generic Statistical Business Process Model (GSBPM), metadata must be captured "in real time" during the entire life cycle of the survey; documenting the design or implementation of a survey and the resulting data should not be seen as a last step in the implementation of a project.
In this section, we provide information or guidelines on:
- The expected scope of survey documentation: What kind of information makes good metadata
- Metadata standards and models: Selected metadata standards and models of particular relevance for the documentation of microdata
- Tools: Free software applications for the documentation of survey microdata