Data Versioning Principles Best Practices
Version: 1.1
DOI: 10.15497/RDA00042
Authors: Jens Klump, Lesley Wyborn, Mingfang Wu, Robert Downs, Ari Asmi, Gerry Ryder, Julia
Martin (Research Data Alliance Data Versioning Working Group)
Published: 06 April 2020
Abstract: The demand for better reproducibility of research results is growing. With more data
becoming available online, it will become increasingly important for a researcher to be able to cite
the exact extract of the dataset that was used to underpin their research publication. However, while
the means to identify datasets using persistent identifiers have been in place for more than a decade,
systematic data versioning practices are currently not available. Without these, it is very hard for
researchers to gain attribution and credit for their actual contributions to the collection, creation and
publishing of individual datasets. Versioning procedures and best practices are well established for
scientific software and can be used to enable reproducibility of scientific results.
The Research Data Alliance (RDA) Data Versioning Working Group produced this Final Report to
document 39 use cases and current practices, and to make recommendations for the versioning of
research data. To further adoption of the outcomes, the Data Versioning Working Group then
contributed selected use cases and recommended data versioning practices to other groups in RDA
and W3C. The outcomes of the RDA Data Versioning Working Group add a central element to the
systematic management of research data at any scale by providing recommendations for standard
practices in the versioning of research data.
This revised report incorporates the feedback to the version 1.0 received from the RDA community
review.
Language: English
Citation and Download: Klump, J., Wyborn, L., Downs, R., Asmi, A., Wu, M., Ryder, G., Martin, J.
(2020). Principles and best practices in data versioning for all data sets big and small. Version 1.1.
Research Data Alliance. DOI: 10.15497/RDA00042.
Revised Report of the Research Data Alliance Data Versioning
Working Group
Jens Klump, Lesley Wyborn, Mingfang Wu, Robert Downs, Ari Asmi, Gerry
Ryder, Julia Martin
Version Information
Version | Release Date | Description
1.1 | 2020-04-06 | Updated with minor changes following community review
Recommended Citation:
Klump, J., Wyborn, L., Downs, R., Asmi, A., Wu, M., Ryder,
G., & Martin, J. (2020). Principles and best practices in data
versioning for all data sets big and small. Version 1.1.
Research Data Alliance. DOI: 10.15497/RDA00042.
Executive Summary
The demand for better reproducibility of research results is growing. With more data
becoming available online, it will become increasingly important for a researcher to
be able to cite the exact extract of the dataset that was used to underpin their
research publication. However, while the means to identify datasets using persistent
identifiers have been in place for more than a decade, systematic data versioning
practices are currently not available. Without these, it is very hard for researchers to
gain attribution and credit for their actual contributions to the collection, creation and
publishing of individual datasets. Versioning procedures and best practices are well
established for scientific software and can be used to enable reproducibility of
scientific results.
The Research Data Alliance (RDA) Data Versioning Working Group produced this
Final Report to document 39 use cases and current practices, and to make
recommendations for the versioning of research data. To further adoption of the
outcomes, the Data Versioning Working Group then contributed selected use cases
and recommended data versioning practices to other groups in RDA and W3C. This
revised report incorporates the feedback received from the RDA community review.
From initial data acquisition, multiple levels of processing can be applied, and each level of processing can then have multiple versions. This Final Report applies the Functional Requirements for Bibliographic Records (FRBR) to provide a conceptual framework with a set of data versioning principles developed around the FRBR concepts of the ‘work’, the ‘expression’, the ‘manifestation’ and the ‘item’.
¹ Designated User Community: An identified group of potential Consumers who should be able to understand a particular set of information. The Designated Community may be composed of multiple user communities. A Designated Community is defined by the Archive and this definition may change over time. (CCSDS, 2012, p. 1–11)
Revision and release:
● The release of a new version of a dataset should be accompanied by a
description of the nature and the significance of the change.
● The significance of this change will depend on the intended use of the data by
its designated user community.
● Each new release of a data product should have a new identifier.
Granularity (aggregates, composites, collections and time series):
● Data may be aggregated and combined into collections or time series.
● The collection should be identified and versioned, as should each of its constituent datasets.
● Entire time series should be identified, as should time-stamped revisions.
Manifestation (data formats and encodings):
● The same dataset may be expressed in different file formats or character encodings without differences in content. While these datasets will have different checksums, the work expressed in them does not differ; they are manifestations of the same work.
● Manifestations of the same work should be individually identified and related
to their parent work.
Provenance (derived products):
● Defining revisions and releases signals that a dataset has been derived from a precursor and forms part of the description of its lineage, or provenance.
● Provenance can be more complex than following a linear path. Information
accompanying a dataset release should therefore contain information on the
provenance of a dataset.
Citation:
● Include information about the release in the citation. DataCite recommends using semantic versioning, issuing a new identifier for major releases, and using the “alternate identifier” and “related identifier” metadata elements to identify releases and describe how they relate to other datasets, e.g. whether a dataset was derived from a precursor.
● Updating the metadata does not create a new version; it only changes the catalogue entry.
1 Introduction
The demand for better reproducibility of research results is growing. More and more
data is becoming available online. In some cases, the datasets have become so
large that downloading the data is no longer feasible. Data can also be offered
through web services and accessed on demand. This means that parts of the data
are accessed at a remote source when needed. In this scenario, it will become
increasingly important for a researcher to be able to cite the exact extract of the
dataset that was used to underpin their research publication. However, while the
means to identify datasets using persistent identifiers have been in place for more
than a decade, systematic data versioning practices are currently not available.
Versioning procedures and best practices are well established for scientific software (e.g. Fitzpatrick et al., 2009; Preston-Werner, 2013). The related Wikipedia article gives an overview of software versioning practices (Wikipedia, 2019). The codebase of a large software project bears some resemblance to a large dynamic dataset. Are versioning practices for code therefore also suitable for datasets, or do we need a separate suite of practices for data versioning? How can we apply our knowledge of versioning code to improve data versioning practices? The Data Versioning Working Group investigated the extent to which these practices can be used to enhance the reproducibility of scientific results (e.g. Bryan, 2018).
The Research Data Alliance (RDA) Data Versioning Working Group produced this
Final Report to document use cases and practices, and to make recommendations
for the versioning of research data. To further adoption of the outcomes, the Working
Group contributed selected use cases and recommended data versioning practices
to other groups in RDA and W3C. The outcomes of the RDA Data Versioning
Working Group add a central element to the systematic management of research
data at any scale by providing recommendations for standard practices in the
versioning of research data. These practice guidelines are illustrated by a collection
of use cases.
This revised report incorporates the feedback received from the RDA community
review.
Versioning procedures and standard practices are well established for scientific software, and these concepts could be applied to other use cases to facilitate the reproducibility of scientific results. The Data Versioning Working Group worked with other groups, both within RDA and externally, on topics where data versioning is important, towards developing a common understanding of data versioning and recommended practices.
Within RDA the Data Versioning Working Group worked with the Data Citation
Working Group to include its outputs (Rauber et al., 2016) into the collection of use
cases, and with the Data Foundations and Terminology Interest Group, the Use
Cases Coordination Group, the Research Data Provenance Interest Group, the
Provenance Patterns Working Group, and the Software Source Code Interest Group
to align data versioning concepts. The Data Versioning Working Group also worked
closely with the W3C DXWG to introduce selected use cases collected by the RDA
Data Versioning Working Group into the W3C Working Group’s collection of use
cases and align versioning concepts. Additionally, the RDA Data Versioning Working
Group worked closely with the AGU Enabling FAIR Data Project, in particular, Task
Group E on Data Workflows.
An important driver to take a closer look at data versioning came from the work of the RDA Working Group on Data Citation, whose final report recognises the need for systematic data versioning practices (Rauber et al., 2016). The RDA Data Citation Working Group aimed to address issues of identifying and citing a subset of a large and dynamic data collection. It recommends versioning and timestamping any updates to data items and assigning identifiers to timestamped queries, which then allow the retrieval of specific data subsets at any given point in time. These recommendations are well suited to relational databases that are accessed using database queries, but less well suited to file-based data. This gap was discussed at a BoF meeting held at the RDA Plenary in September 2016 in Denver, resulting in the formation of an Interest Group on data versioning. A review of the recommendations by the RDA Data Versioning Interest Group (the precursor to this group) concluded that systematic data versioning practices are currently not available.
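The WGDC approach of timestamping every update and resolving queries against a point in time can be sketched as follows. This is an illustrative toy store, not the WGDC reference implementation; all names are ours:

```python
from datetime import datetime, timezone

class VersionedStore:
    """Toy illustration of timestamped data versioning: every update to a
    data item is recorded with a timestamp, so a query pinned to a point in
    time retrieves the data exactly as it was at that moment."""

    def __init__(self):
        # item id -> list of (timestamp, value), in insertion order
        self._history = {}

    def put(self, item_id, value, ts=None):
        """Record a new timestamped revision of an item."""
        ts = ts or datetime.now(timezone.utc)
        self._history.setdefault(item_id, []).append((ts, value))
        return ts

    def get_as_of(self, item_id, ts):
        """Return the value of item_id as it was at time ts, or None if the
        item did not yet exist."""
        result = None
        for when, value in self._history.get(item_id, []):
            if when <= ts:
                result = value
        return result
```

Assigning an identifier to the pair (query, timestamp) then makes any retrieved subset citable and repeatable.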
The FRBR model aims “to provide a clearly defined, structured framework for relating the data that are recorded in bibliographic records to the needs of the users of those records” (IFLA Study Group on the Functional Requirements for Bibliographic Records, 1998).
In the digital era, the FRBR model is proving useful not only for distinguishing multiple derivatives (versions) of an original dataset, but also for establishing transparent provenance chains that show how a particular dataset evolved from the initial collection of the original data through to its publication, curation and archiving, and, more importantly, for providing attribution and credit to the researchers, institutions, funders and others involved in the creation and subsequent preservation of each version.
2. It helps to improve interoperability across catalogues under the same model.
We also disagree with Hourclé’s application of the Work entity in the FRBR model to
research data because in the FRBR definition the Work is an abstract entity. As a
generalisation of Hourclé’s work, we suggest an alternative mapping of the FRBR
entities to data that takes into account concepts developed for the Observations and
Measurements model (ISO 19156) (Cox, 2015) as follows:
1) A Work is the observation that results in the estimation of the value of a
feature property, and involves application of a specified procedure, such as a
sensor, instrument, algorithm or process chain;
2) An Expression of a work is the realisation of a work in the form of a logical
data product. Any change in the data model or content constitutes a change in
expression;
3) A Manifestation is the embodiment of an expression of a work, e.g. as a file
in a specific structure and encoding. Any changes in its form (e.g. file
structure, encoding) is considered a new manifestation; and
4) An Item is a concrete entity, representing a single exemplar of a
manifestation, e.g. a specific data file in an individual, named data repository.
An Item can be one or more than one object (e.g. a collection of files bundled
in a container object).
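The mapping above can be illustrated as a simple object model. The class and field names below are our own illustration, not part of FRBR or ISO 19156:

```python
from dataclasses import dataclass, field

@dataclass
class Work:
    """The observation resulting in the estimation of a feature property,
    via a specified procedure (sensor, instrument, algorithm, chain)."""
    title: str
    expressions: list = field(default_factory=list)

@dataclass
class Expression:
    """Realisation of a work as a logical data product; any change in the
    data model or content is a new Expression."""
    work: Work
    data_model: str

@dataclass
class Manifestation:
    """Embodiment of an expression in a specific structure and encoding;
    any change in form (file structure, encoding) is a new Manifestation."""
    expression: Expression
    file_format: str  # e.g. "NetCDF", "GeoTIFF"

@dataclass
class Item:
    """A single exemplar of a manifestation, e.g. a specific file held in
    an individual, named data repository."""
    manifestation: Manifestation
    repository: str
```

Walking an Item back through its Manifestation and Expression to the Work gives exactly the provenance and attribution chain the FRBR mapping is meant to support.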
multiple organisational websites as either file downloads for local processing and/or
in situ access as web services.
Considering the complexity of the ASTER use case, in particular the different formats of each data product, combined with the multiple sites a product is released from (each with a different access mechanism), the FRBR model was applied to help ensure reproducibility (knowing the source of any version used in any subsequent analysis), provenance (knowing the sequential history of any evolved data product) and attribution (knowing which organisation or individual funded, produced and/or sustained the release of any version).
In detail, the various entities along the Full-path of ASTER data use are as follows:
1) The Work: the work is all observations taken by the ASTER sensor on board
the Terra (EOS AM-1) satellite.
more importantly, enabling attribution for and identification of any person, institution or organisation involved in the development of any version.
4 Use cases
The Data Versioning Working Group collected 39 use cases from about 33 organisations and working groups, covering different research domains (e.g. social and economic sciences, earth science, environmental science, molecular bioscience) and different data types (Klump et al., 2020). The use cases describe current practices from data providers. These descriptions are useful for identifying differences in data versioning practices between data providers and for highlighting the issues encountered. We analysed the use cases in the context of the Data Versioning Working Group, but also registered them with the RDA Use Cases Group² for analysis that can potentially be carried out by other Interest Groups and Working Groups with different interests, e.g. data management analysis.
² RDA Use Cases Group: https://www.rd-alliance.org/groups/use-cases-group.html
● Issue 6: For a collection with multiple versions, a landing page may point to the latest version, to all published versions, or to all published and archived versions. (#BCO-DMO, #NASA, #AAO, #GA-EMC, #Molecular).
4.2 Use Cases for W3C Dataset Exchange Working Groups (DXWG)
The W3C DXWG has documented lists of use cases, including four related to data versioning³, namely: version definition, version identifier, version release date, and version delta, for the purpose of identifying current shortcomings and motivating the extension of the Data Catalog Vocabulary (DCAT) (Albertoni et al., 2019). To align with the W3C DXWG goal, we summarise six use cases related to metadata entity, attribute and vocabulary scope definition, as either additional use cases or more concrete requirements for existing W3C DXWG use cases. We discussed the following six use cases with the W3C DXWG for consideration in their further iteration and prioritisation of use cases.
1. When changes are made to released data, the data should be versioned. A previous version should not be overwritten by the latest version; each version should be identifiable and retrievable. (#DIACHRON, #USGS, #BCO-DMO).
2. When data has a new version, it should be easy for users to judge what kinds of changes have been made, so that users can 1) select the appropriate version, and 2) assess whether the changes would affect a research conclusion based on data from previous versions. (#DIACHRON, #USGS, #BCO-DMO).
4. The W3C working group recommends creating a data revision when data are corrected, added (with or without changing the data structure) or removed. (#USGS, #BCO-DMO, #CSIRO, #Molecular).
Consider also the following situations:
a. new analytical and/or processing techniques are applied to a select number of attributes/components of the existing dataset;
b. models and derivative products are revised with new data;
c. the data itself is revised as processing methods are improved;
d. there is no change to the data itself, but the data structure, format or schema changes; and
³ https://www.w3.org/TR/dcat-ucr/#RVSDF
e. data is processed with a different calibration.
In each of the situations above, should a new version or a new dataset be recommended?
6. Several terms are used for a revision, e.g. Version, Collection, Release, Edition. Should all (or some) of these terms be unified under the single name “version”? (#NASA, #Molecular, #GA-EMC).
5 Versioning Principles
The recommendations given by the RDA Data Citation Working Group (Rauber et
al., 2016) start with a key concept for data versioning: “Apply versioning to ensure
earlier states of datasets can be retrieved” (R1 - Data Versioning). Fundamental to
this recommendation is the requirement for unambiguous references to specific
versions of data used to underpin research results. In this concept, any change to
the data creates a new version of the dataset. A simple way to determine whether
two datasets differ would be to calculate and compare a checksum (R6 - Result Set
Verification). However, just knowing that the bit streams of two datasets differ does
not give us other essential information that we might need to know.
The Data Versioning Working Group analysed the versioning use cases (Klump et
al., 2020) outlined in this report and compared these practices to the
recommendations of the RDA Data Citation Working Group. In addition to
differences in the bitstream between two versions, we found a number of additional
questions that versioning practices try to address:
● What constitutes a change in a dataset? (Revision) (Issue 1, 2);
● What are the magnitude and significance of the change? (Release) (Issue 1);
● Are the differences in the bitstream due to different representation forms?
(Manifestation) (Issue 1);
● If the data are part of a collection, which elements of the collection have changed? (Granularity) (Issue 1, 5);
● How do two versions relate to each other? (Provenance) (Issue 3, 4, 6); and
● How can we express information on versioning when citing data? (Issue 4, 7).
Data versioning communicates not only that a dataset has been changed, but also
refers to the significance and magnitude of change and other aspects. This finding
corresponds to the work published by the W3C DXWG (Albertoni, et al., 2019).
Version control (Revision)
As noted above, the recommendations given by the RDA Data Citation Working Group already state that any change to a dataset creates a new version of the dataset that needs to be identified. This may also require the minting of a persistent identifier for this new version. This practice of fine-grained identification of versions is derived from version control as commonly applied to the management of software code, where every change to the code is identified as a separate version, often called a “revision” or “build” (Fitzpatrick et al., 2009). In software versioning, the revision or build number can change far more frequently than the version number of a “released” version.
Formats (Manifestation)
The format of software code is tightly coupled to its syntax rules and its compilation into executable software. Yet the same software, e.g. a word processing package, may be available for different platforms. In the same way, data may be manifested in different formats for use in different workflows while all remain “manifestations” of the same “expression” of a “work”. Different “manifestations” may be identified separately in addition to identifying the “work”.
The history of a piece of information is known as “provenance”. Using provenance, it should be possible to understand how a piece of information has changed, and whether it is fit for the intended purpose or whether the information should be trusted (Taylor et al., 2015).
6 Recommendations
The key lesson from the working group’s activities was to distinguish between versioning based on changes in a dataset (data revisions) and communicating the significance of these changes (data releases) as part of the data lifecycle. Thus, the two key concepts in data versioning are (1) being clear about which dataset is to be identified and (2) deciding what we want to communicate about it to its designated user community.
content of the dataset. Concepts such as Semantic Versioning (Preston-Werner,
2013) describe a commonly used practice to communicate the significance of a
version change in a dataset release and have been widely adopted in software
development.
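One way to map Semantic Versioning (MAJOR.MINOR.PATCH) onto data releases could look like this. The mapping of change categories to version components is our illustrative reading, not a formal rule from the SemVer specification:

```python
def bump(version, change):
    """Return the next release number for a dataset, given the kind of
    change (an assumed mapping, for illustration):

    'breaking'   - schema or data-model change (MAJOR: users must adapt)
    'additive'   - new records added, structure unchanged (MINOR)
    'correction' - errata fixed in existing records (PATCH)
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "breaking":
        return f"{major + 1}.0.0"
    if change == "additive":
        return f"{major}.{minor + 1}.0"
    if change == "correction":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")
```

The release number alone thus signals to the designated user community whether a new release is likely to affect conclusions drawn from an earlier one.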
Provenance of datasets
Defining revisions and releases to record that a dataset has been derived from a precursor helps to describe its lineage, or provenance. Semantic versioning and related versioning schemes encode in their release numbers information about a dataset and its precursors. Provenance, however, can be more complex than following a linear path. Information accompanying a dataset release should therefore include the provenance of the dataset.
12) elements to identify releases and how they relate to other datasets, e.g. whether
it was derived from a precursor. Note that this is the minimum required for data
citation by DataCite; repositories may opt to offer a richer description of release
history and provenance of a dataset through other channels.
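As a sketch, release relations might be encoded in DataCite-style metadata as follows. The property names follow the DataCite Metadata Schema 4.x “version”, “relatedIdentifiers” and identifier properties; the DOIs themselves are placeholders:

```python
import json

# Placeholder DOIs; only the structure is meant to be illustrative.
record = {
    "identifier": {"identifier": "10.1234/example.v2",
                   "identifierType": "DOI"},
    "version": "2.0.0",
    "relatedIdentifiers": [
        {   # link to the previous release of the same dataset
            "relatedIdentifier": "10.1234/example.v1",
            "relatedIdentifierType": "DOI",
            "relationType": "IsNewVersionOf",
        },
        {   # link to the precursor the dataset was derived from
            "relatedIdentifier": "10.1234/precursor",
            "relatedIdentifierType": "DOI",
            "relationType": "IsDerivedFrom",
        },
    ],
}
print(json.dumps(record, indent=2))
```

A richer release history, as suggested above, would then live alongside this minimum citation metadata rather than inside it.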
Resources
This section provides links and further information about the resources and examples discussed in this report.
● Data Citation Working Group:
https://www.rd-alliance.org/groups/data-citation-wg.html
● Data Foundations and Terminology Interest Group:
https://www.rd-alliance.org/groups/data-foundations-and-terminology-ig.html
● Use Cases Coordination Group:
https://www.rd-alliance.org/groups/use-cases-group.html
● Research Data Provenance Interest Group:
https://www.rd-alliance.org/groups/research-data-provenance.html
● Provenance Patterns Working Group
https://www.rd-alliance.org/groups/provenance-patterns-wg
● Software Source Code Interest Group:
https://rd-alliance.org/groups/software-source-code-ig
● W3C Dataset Exchange Working Group: https://www.w3.org/2017/dxwg/
● W3C Dataset Exchange Working Group: Collection of use cases:
https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space
● W3C Dataset Exchange Working Group: Versioning concepts:
https://www.w3.org/2017/dxwg/wiki/General_versioning_considerations
● NASA Earth Observing System Data and Information System (EOSDIS) data processing levels:
https://earthdata.nasa.gov/collaborate/open-data-services-and-software/data-information-policy/data-levels
Acknowledgements
The chairs of the RDA Data Versioning WG would like to thank all who contributed
use cases to the WG and joined the discussions at the plenary sessions and along
the way. We also thank the members of the community for their constructive
comments during the community review process.
Use cases were contributed by Natalia Atkins (IMOS), Catherine Brady (ARDC), Jeff
Christiansen (QCIF), Martin Capobianco, Andrew Marshall and Margie Smith (GA),
Bob Downs (Columbia University), Kirsten Elger and Damian Ulbricht (GFZ
Potsdam), Ben Evans, Nigel Rees, Kate Snow and Lesley Wyborn (NCI),
Siddeswara Guru (TERN), Julia Hickie (NLA), Dominic Hogan (CSIRO), Leslie Hsu
(USGS), Paul Jessop (International DOI Foundation), Dave Jones (StormCenter
Communications Inc.), Danie Kinkaide (BCO-DMO), Heather Leasor (ADA, ANU),
Benno Lee (Rensselaer Polytechnic Institute), Simon
Oliver (Digital Earth Australia), Andreas Rauber (Vienna University of Technology),
Simon O’Toole (AAO), Martin Schweitzer (BoM).
Special thanks go to the Australian Research Data Commons for their support.
We would also like to thank our RDA Secretariat and TAB Liaisons, Stefanie Kethers and Tobias Weigel, for their guidance and support.
References
Albertoni, R., Browning, D., Cox, S. J. D., Gonzalez-Beltran, A., Perego, A.,
Winstanley, P., et al. (2019). Data Catalog Vocabulary (DCAT) - Version 2 (W3C
Proposed Recommendation). Cambridge, MA: World Wide Web Consortium (W3C).
Retrieved from https://www.w3.org/TR/vocab-dcat-2/ 25 March 2020.
Bryan, J. (2018). Excuse Me, Do You Have a Moment to Talk About Version
Control? The American Statistician, 72(1), 20–27. Retrieved from
https://doi.org/10.1080/00031305.2017.1399928 25 March 2020.
CCSDS. (2012). Reference Model for an Open Archival Information System (OAIS).
Magenta Book (Recommended Practice No. CCSDS 650.0-M-2). 135 pp. Greenbelt,
MD: Consultative Committee for Space Data Systems. Retrieved from
http://public.ccsds.org/publications/archive/650x0m2.pdf 25 March 2020.
Cox, S. J. D. (2015). Ontology for observations and sampling features, with
alignments to existing models. Semantic Web Journal, 8(3), 453–570. Retrieved
from
http://www.semantic-web-journal.net/content/ontology-observations-and-sampling-features-alignments-existing-models-0
Cudahy, T. (2012). Satellite ASTER Geoscience Product Notes for Australia (No.
EP125895) (p. 26). Canberra, Australia: Commonwealth Scientific and Industrial
Research Organisation. Retrieved from https://doi.org/10.4225/08/584d948f9bbd1 25
March 2020.
DataCite Metadata Working Group. (2018). DataCite Metadata Schema
Documentation for the Publication and Citation of Research Data (Version 4.2). 69
pp. Hannover, Germany: DataCite e.V. Retrieved from
https://doi.org/10.5438/bmjt-bx77 25 March 2020.
Fitzpatrick, B., Pilato, C. M., & Collins-Sussman, B. (2009). Version Control with
Subversion. Sebastopol, CA: O’Reilly Media, Inc. Retrieved from
http://svnbook.red-bean.com/ 25 March 2020.
Hourclé, J. A. (2009). FRBR applied to scientific data. Proceedings of the American
Society for Information Science and Technology, 45(1), 1–4. Retrieved from
https://doi.org/10.1002/meet.2008.14504503102 25 March 2020.
IFLA Study Group on the Functional Requirements for Bibliographic Records.
(1998). Functional Requirements for Bibliographic Records (IFLA Series on
Bibliographic Control No. 19) (p. 142). Munich, Germany: International Federation of
Library Associations and Institutions. Retrieved from
http://www.ifla.org/publications/functional-requirements-for-bibliographic-records 25
March 2020.
Klump, J., Huber, R., & Diepenbroek, M. (2016). DOI for geoscience data - how early
practices shape present perceptions. Earth Science Informatics, 9(1), 123–136.
Retrieved from https://doi.org/10.1007/s12145-015-0231-5 25 March 2020.
Klump, J., Wyborn, L., Downs, R., Asmi, A., Wu, M., Ryder, G., & Martin, J. (2020).
Compilation of Data Versioning Use cases from the RDA Data Versioning Working
Group. Version 1.1. Research Data Alliance. DOI: 10.15497/RDA00041.
NASA (2019). Earth Observing System Data and Information System (EOSDIS) data
processing levels. Retrieved from
https://earthdata.nasa.gov/collaborate/open-data-services-and-software/data-information-policy/data-levels 23 March 2019.
Paskin, N. (2003). On Making and Identifying a “Copy.” D-Lib Magazine, 9(1).
Retrieved from https://doi.org/10.1045/january2003-paskin 25 March 2020.
Preston-Werner, T. (2013). Semantic Versioning 2.0.0. Retrieved March 7, 2019,
from https://semver.org/spec/v2.0.0.html (Original work published May 29, 2011)
Rauber, A., Asmi, A., van Uitvanck, D., & Pröll, S. (2016). Data Citation of Evolving
Data: Recommendations of the Working Group on Data Citation (WGDC) (Technical
Report). Denver, CO: Research Data Alliance. Retrieved from
https://doi.org/10.15497/RDA00016 25 March 2020.
Razum, M., Schwichtenberg, F., Wagner, S., & Hoppe, M. (2009). eSciDoc
Infrastructure: A Fedora-Based e-Research Framework. In Research and Advanced
Technology for Digital Libraries (Vol. 5714, pp. 227–238). Heidelberg, Germany:
Springer Verlag. Retrieved from http://dx.doi.org/10.1007/978-3-642-04346-8_23 25
March 2020.
Rees, N., Evans, B., Conway, D., Seillé, H., Goleby, B., Wyborn, L., (2019).
Capturing (via automation) the Sequential Processing Levels along multiple
Full-paths of Magnetotellurics Data Use. Extended Abstract, eResearch Australasia
Conference, Brisbane–Australia, 21-25 October. Retrieved from
https://conference.eresearch.edu.au/wp-content/uploads/2019/10/2019-eResearch-Rees_et_al.pdf 21 December 2019.
Taylor, K., Woodcock, R., Cuddy, S., Thew, P., & Lemon, D. (2015). A Provenance
Maturity Model. In R. Denzer, R. M. Argent, G. Schimak, & J. Hřebíček (Eds.),
Environmental Software Systems. Infrastructures, Services and Applications (Vol.
448, pp. 1–18). Cham, Switzerland: Springer International Publishing. Retrieved from
http://doi.org/10.1007/978-3-319-15994-2_1 25 March 2020.
Wikipedia. (2019). Software Versioning. Wikipedia. Retrieved from
https://en.wikipedia.org/w/index.php?title=Software_versioning&oldid=886437916
March 11, 2019.
W3C Dataset Exchange Working Group (DXWG). (2017). Retrieved from
https://www.w3.org/2017/dxwg/wiki/Main_Page March 20, 2019.