Anonymization practices

No international standard defines the methods for anonymizing data, acceptable levels of risk, or recommended measures of information loss. The amount and type of protection required are specific to each dataset, depending on the sensitivity and “commercial value” of its content, and to each legal and cultural environment. It is therefore useful to document some practices. This is not an easy task, however, because agencies that anonymize their datasets disclose little about the methods they implement or the levels of risk in the data they disseminate.

This limited access to knowledge, combined with a lack of experience in using the tools and methods, makes it difficult for many agencies to implement “optimal” solutions. By optimal we mean solutions that meet their obligations towards privacy protection as well as their obligation to release data useful for policy monitoring and evaluation. To bridge this gap in practical guidelines, the World Bank completed a project funded by the Knowledge for Change Program II, which sought to build a knowledge base through experimentation on a diverse set of microdata. This knowledge was then translated into a practice guide for public release. The practice guide fills a critical gap by (i) documenting research conducted at the World Bank through a large-scale evaluation of anonymization techniques, and (ii) translating these results into practical guidelines. The practice guide was released in 2015. It is updated regularly to add new methods and to keep up with new features in the open source R package sdcMicro, which the guide uses for its practical examples. A current version of the guide is listed under Related Resources below.

Related Resources

Documents

Anonymisation: managing data protection risk code of practice

Author(s)

UK Information Commissioner's Office

Description

The code explains the issues surrounding the anonymisation of personal data, and the disclosure of data once it has been anonymised. It explains the relevant legal concepts and tests in the UK Data Protection Act 1998 (DPA). The code provides good practice advice that will be relevant to all organisations that need to convert personal data into a form in which individuals are no longer identifiable.

Date

November 2012

URL

http://ico.org.uk/for_organisations/data_protection/topic_guides/~/media/documents/library/Data_Protection/Practical_application/anonymisation-codev2.pdf

NCHS Staff Manual on Confidentiality

Author(s)

National Center for Health Statistics (NCHS)

Description

The confidentiality of records is a matter of primary concern to the NCHS. To elicit health information from the American people and from the health care providers through our surveys, the NCHS must be able to assure them that this information will be protected from all unauthorized persons. This means that NCHS must have strong laws to protect these records, and must establish and follow established procedures. This manual outlines the NCHS’s policies that implement federal law and ensure that all confidential information will be fully protected. It should be viewed in unison with the NCHS data release policies addressing access to data and NCHS Research Ethics Review Board Requirements.

Date

2004

URL

http://www.cdc.gov/nchs/data/misc/staffmanual2004.pdf

sdcApp Reference Manual

Author(s)

Thijs Benschop, Matthew Welch

Description

This is documentation and guidance for using sdcApp, a graphical user interface for the sdcMicro R package. sdcMicro provides tools for Statistical Disclosure Control (SDC) for microdata, also known as microdata anonymization. For an overview of the theory of SDC for microdata, we suggest reading Statistical Disclosure Control for Microdata: A Theory Guide.

Date

2019-11-12

URL

https://sdcappdocs.readthedocs.io/en/latest/

Statistical Disclosure Control for Microdata: A Practice Guide

Author(s)

Thijs Benschop, Matthew Welch

Description

Releasing data in a safe way is required to protect the integrity of the statistical system, by ensuring agencies honor their commitment to respondents to protect their identity. Agencies do not widely share with other agencies, in substantial detail, their knowledge and experience of using SDC and of the processes for creating safe data. This makes it difficult for agencies new to the process to implement solutions. We consolidated knowledge from literature as well as from our own experience to inform our discussion of the processes and methods presented in this guide. This guide focuses on the implementation of methods and uses the free R-based package sdcMicro for its examples. If you are interested in reading in detail about the theory behind the methods used, we suggest reading our accompanying guide: Statistical Disclosure Control for Microdata: Theory.

Date

June 2016

URL

https://sdcpractice.readthedocs.io/en/latest/

Statistical Disclosure Control for Microdata: Theory

Author(s)

Thijs Benschop, Matthew Welch

Description

This guide provides an introduction to the theory of Statistical Disclosure Control (SDC) for microdata. It includes an overview of the most commonly applied methods in SDC, a step-by-step overview of the complete SDC process, and many examples from practice in National Statistics Offices (NSOs).

For guidance on the technical implementation of the theory mentioned in the guide, please refer to our guides:

- Statistical Disclosure Control for Microdata: A Practice Guide for guidance on the application of methods and on using sdcMicro from the command-line
- sdcApp manual for guidance on the application of methods and on using the GUI sdcApp available for sdcMicro

Date

11/12/2019

URL

https://sdctheory.readthedocs.io/en/latest/

Statistical Policy Working Paper 22 (Second version, 2005) - Report on Statistical Disclosure Limitation Methodology

Author(s)

Federal Committee on Statistical Methodology

Description

The Report on Statistical Disclosure Limitation Methodology, Statistical Policy Working Paper 22, discusses both tables and microdata and describes current practices of the principal Federal statistical agencies. The original report includes a tutorial, guidelines, and recommendations for good practice; recommendations for further research; and an annotated bibliography. In 2004, the Confidentiality and Data Access Committee (CDAC) revised Statistical Policy Working Paper 22 to include research and new methodologies that were developed over the past ten years, and to reflect current agency practices.

Date

December 2005

URL

http://www.fcsm.gov/working-papers/SPWP22_rev.pdf
