
Semantic Versioning Versus Breaking Changes A Study of The Maven Repository

This study analyzes seven years of release history from over 22,000 Java libraries in the Maven repository to evaluate adherence to semantic versioning principles. It examines the frequency of breaking changes in minor and major releases, use of deprecation tags, and other metrics to understand compatibility in library upgrades and how well practices align with semantic versioning guidelines.


2014 14th IEEE International Working Conference on Source Code Analysis and Manipulation

Semantic Versioning versus Breaking Changes:


A Study of the Maven Repository

Steven Raemaekers Arie van Deursen Joost Visser


Software Improvement Group Technical University Delft Software Improvement Group
Amsterdam, The Netherlands Delft, The Netherlands Amsterdam, The Netherlands
Email: [email protected] Email: [email protected] Email: [email protected]

Abstract: For users of software libraries or public programming interfaces (APIs), backward compatibility is a desirable trait. Without compatibility, library users will face increased risk and cost when upgrading their dependencies. In this study, we investigate semantic versioning, a versioning scheme which provides strict rules on major versus minor and patch releases. We analyze seven years of library release history in Maven Central, and contrast version identifiers with actual incompatibilities. We find that around one third of all releases introduce at least one breaking change, and that this figure is the same for minor and major releases, indicating that version numbers do not provide developers with information about the stability of interfaces. Additionally, we find that the adherence to semantic versioning principles has only marginally increased over time. We also investigate the use of deprecation tags and find that methods get deleted without applying deprecated tags, and methods with deprecated tags are never deleted. We conclude the paper by arguing that the adherence to semantic versioning principles should increase because it provides users of an interface with a way to determine the amount of rework that is expected when upgrading to a new version.

Keywords: Semantic versioning, Software libraries

I. INTRODUCTION

For users of software libraries or public programming interfaces (APIs), backward compatibility is a desirable trait. Without compatibility, library users will face increased risk and cost when upgrading their dependencies. In spite of these costs and risks, library upgrades may be desirable or even necessary, for example if the newer version contains required additional functionality or critical security fixes. To conduct the upgrade, the library user will need to know whether there are incompatibilities, and, if so, which ones.

Determining whether there are incompatibilities, however, is hard to do for the library user (it is, in fact, undecidable in general). Therefore, it is the library creator’s responsibility to indicate the level of compatibility of a library update.

One way to inform library users about incompatibilities is through version numbers. As an example, semantic versioning¹ (semver) suggests a versioning scheme in which three-part version numbers MAJOR.MINOR.PATCH have the following semantics:

• MAJOR: This number should be incremented when incompatible API changes are made;
• MINOR: This number should be incremented when functionality is added in a backward-compatible manner;
• PATCH: This number should be incremented when backward-compatible bug fixes are made.

These principles were formulated in 2010 by (GitHub founder) Tom Preston-Werner.² As argued in the semantic versioning specification, “these rules are based on but not necessarily limited to pre-existing widespread common practices in use in both closed and open-source software.”

But how common are these practices in reality? Are such changes just harmless, or do they actually hurt by causing rework? Do breaking changes mostly occur in major releases, or do they occur in minor releases as well? Do major and minor releases differ in terms of typical size? Furthermore, for the breaking changes that do occur, to what extent are they signalled through, e.g., deprecation tags? Finally, does the presence of breaking changes affect the time (delay) between library version release and actual adoption of the new release in clients?

In this paper, we seek to answer questions like these. To do so, we make use of seven years of versioning history as present in the collection of Java libraries available through Maven’s central repository.³ Our dataset comprises around 150,000 binary jar files, corresponding to around 22,000 different libraries for which we have 7 versions on average. Furthermore, our dataset includes cross-usage of libraries (libraries use other libraries in the dataset), permitting us to study the impact of incompatibilities in concrete clients as well.

As an approximation of the (undecidable) notion of backward compatibility, we use binary compatibility as defined in the Java language specification. This is an underestimation, since binary incompatibilities are certainly breaking, but there are likely to be other (semantic) incompatibilities as well.

¹ http://semver.org
² GitHub actively promotes semver and encourages all 10,000,000 projects hosted by GitHub to adopt it.
³ http://search.maven.org/
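The MAJOR.MINOR.PATCH semantics listed in the introduction can be illustrated with a small sketch. It assumes version strings made of exactly three dot-separated integers; the helper names (`parse_version`, `release_type`, `allowed_by_semver`) are hypothetical and are not taken from the semver specification or from the tooling used in this study.

```python
import re

# Hypothetical helpers for illustration; only plain MAJOR.MINOR.PATCH strings are handled.
_VERSION = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def parse_version(text):
    """Return (major, minor, patch) as integers, or raise ValueError."""
    match = _VERSION.fullmatch(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return tuple(int(part) for part in match.groups())


def release_type(old, new):
    """Name the most significant version component that changed."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        raise ValueError("new version must be greater than old version")
    for name, before, after in zip(("major", "minor", "patch"), old_v, new_v):
        if before != after:
            return name


def allowed_by_semver(old, new, has_breaking_change):
    """Under the rules above, only a major release may contain breaking changes."""
    return release_type(old, new) == "major" or not has_breaking_change


print(release_type("1.2.3", "1.3.0"))
print(allowed_by_semver("1.2.3", "1.3.0", has_breaking_change=True))
```

In this sketch a breaking change shipped in the step from 1.2.3 to 1.3.0 is reported as not allowed, because the step is classified as a minor release.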

978-1-4799-6148-1/14 $31.00 © 2014 IEEE


978-0-7695-5304-7/14 215
DOI 10.1109/SCAM.2014.30
Authorized licensed use limited to: POLITECHNIKI WARSZAWSKIEJ. Downloaded on December 05,2023 at 13:16:43 UTC from IEEE Xplore. Restrictions apply.
As a measure of the amount of changed functionality in a release, we will use the edit script size between two subsequent releases. Equipped with this, we will study versioning practices in the Maven dataset, and contrast them with the idealized guidelines as expressed in the semver specification.

This paper is structured as follows. We start out, in Section II, by sketching related work in the area of version analysis. In Section III, we formulate the research questions we seek to answer. Then, in Section IV, we describe our approach to answer these questions, and methods of measurement. In Section V we provide descriptive statistics of the Maven dataset. In Sections VI–IX we present our analysis in full detail. We discuss the wider implications and the threats to the validity of our findings in Sections X and XI. We conclude the paper in Section XII.

II. RELATED WORK

To the best of our knowledge, our work is the first systematic study of versioning principles in a large collection of Java libraries. However, several case studies on backward compatible and incompatible changes in public interfaces as appearing in these libraries have been performed [2], [7], [9]. For instance, Cossette et al. [2] investigate binary incompatibilities introduced in five different libraries and aim to detect the correct adaptations to upgrade to the newer version of the library. Similarly, Dig et al. [9] investigate binary incompatibilities in five other libraries and conclude that most of the backward incompatible API changes are behavior-preserving refactorings. Dietrich et al. [7] have performed an empirical study into evolution problems caused by library upgrades. They manually detect different kinds of source and binary incompatibilities, and conclude that although incompatibility issues do occur in practice, the selected set of issues does not appear very often.

Another area of active research is to automatically detect refactorings based on changes in public interfaces [3], [4], [8]. The idea behind these approaches is that these refactorings can automatically be “replayed” to update to a newer version of a library. This way, an adaptation layer between the old and the new version of the library can automatically be created, thus shielding the system using that library from backward incompatible changes.

While our work investigates backward incompatibilities for given version string changes, Bauml et al. [1] take the opposite approach, in the sense that they propose a method to generate version number changes based on changes in OSGi bundles. A comparable approach in the Maven repository would be to create a plugin that automatically determines the correct subsequent version number based on backward incompatibilities and the amount of new functionality present in the new release as compared to the previous one.

The Maven repository has been used in other work as well. Davies et al. [5] use the same dataset to investigate the provenance of a software library, for instance, if the source code was copied from another library. They deploy several different techniques to uniquely identify a library, and find out its history, much like a crime scene containing a fingerprint. Ossher et al. [11] also use the Maven repository to reconstruct a repository structure with directories and versions based on a collection of libraries of which the groupId, artifactId and version are not known.

III. RESEARCH QUESTIONS

The overall goal of this paper is to understand to what degree versioning conventions are adhered to in the development of software libraries. This leads to a better understanding of how developers use versioning schemes to identify expected amounts of rework for users of the interfaces they offer.

We regard semver as a formalization of principles that developers already implicitly embraced, even before the manifesto was released in 2010. We use the explicit rules of semver as a way to formally test these principles. We want to find out if developers actually mean to give a signal, for instance, that a library contains only backward-compatible bug fixes when releasing a new patch version, or that a library introduces a substantial number of backward-incompatible changes to its public interface when releasing a new major version.

To achieve our overall goal, we seek to answer the following research questions in this paper:

• RQ1: How are semantic versioning principles applied in practice in the Maven repository in terms of binary (in)compatible changes?
• RQ2: Has the adherence to semantic versioning principles increased over time?
• RQ3: How are dependencies to newer versions updated, and what are factors causing systems not to include the latest versions of dependencies?
• RQ4: How are deprecation tags applied to methods in the Maven repository?

In the next section, we discuss our research method.

IV. METHOD

In this paper, we analyze a snapshot of Maven’s Central Repository, dated July 11, 2011.⁴ Maven is an automated build system that manages the entire “build cycle” of software projects. To use Maven in a software project, a pom.xml file is created that specifies the project structure, settings for different build steps (e.g. compile, package, test) as well as libraries that the project depends on. These libraries are automatically downloaded by Maven from specified repositories. These repositories can be private as well as public.

⁴ Obtained from http://juliusdavies.ca/2013/j.emse/bertillonage/maven.tar.gz based on [5], [6]

For open source systems, the Central Repository is typically used, which contains jar files and sources for the most widely used open source Java libraries. Our dataset extracted from this central repository contains 148,253 Java binary jar files and 101,413 Java source jar files for a total of 22,205 different libraries. This gives an average of 6.7 releases per library. For more information on our dataset, which includes resolved and versioned dependencies at the method level, we refer to [12].

A. Determining backward incompatible API changes

Determining full backward compatibility amounts to determining equivalence of functions, which in general is undecidable. Instead of such semantic compatibility, we will rely on binary incompatibilities.

Binary incompatible changes, in this paper also called breaking changes, are formally defined by the Java Language Specification as follows: “a change to a type is binary compatible with (equivalently, does not break binary compatibility with) pre-existing binaries if pre-existing binaries that previously linked without error will continue to link without error.”⁵ In this paper, we will use the following working definition: breaking changes are any changes to a library interface that require recompilation of systems using the changed functionality. Examples of breaking changes are method removals and return type changes.⁶

To detect breaking changes between each subsequent pair of library versions, we use Clirr.⁷ Clirr is a tool that takes two jar files as input and returns a list of changes in the public API. Clirr is capable of detecting 43 API changes in total, of which 23 are considered breaking and 20 are considered non-breaking. Clirr does not detect all binary incompatibilities that exist, but it does detect the most common ones (see Table 2). We executed Clirr on the complete set of all subsequent versions of releases in the Maven repository. The approach to determine subsequent versions is described next.

Whenever Clirr finds a binary incompatibility between two releases, those releases are certainly not compatible. However, if Clirr fails to find a binary incompatibility, the releases can still be semantically incompatible. As such, our reports on, e.g., the percentage of releases introducing breaking changes are an underestimation: the actual situation may be worse, but not better.

B. Determining subsequent versions and update types

In the Maven repository, each library version (a single jar file) is uniquely identified by its groupId, artifactId, and version, for instance “org.springframework”, “spring-core” and “2.5.6”. To determine subsequent version pairs, we sort all versions with the same groupId and artifactId based on their version string. We used the Maven Artifact API⁸ to compare version strings with each other, taking into account the proper sorting given the major, minor, patch and prerelease in a given version string. For each subsequent pair of releases from this sorted list, the release type is determined according to the change in version number. For instance, a change in version number from “1.0” to “1.1” was marked as a minor release. We do not check whether version numbers are incremented properly, i.e. whether there are no gaps in version numbers.

Since semver applies only to version numbers containing a major, minor and patch version number, we only investigate pairs of library versions which are both structured according to the format “MAJOR.MINOR.PATCH” or “MAJOR.MINOR”. In the latter case, we assume an implicit patch version number of 0.

Semantic versioning also permits prereleases, such as 1.2.3-beta1 or (as commonly used in a Maven setting) 1.2.3-SNAPSHOT. We exclude prereleases from our analysis since semver does not provide any rules regarding breaking changes or new functionality in these release types.

C. Detecting changed functionality

In order to compare major, minor, and patch releases in terms of size, we look at the amount of changed functionality between releases. To do so, we look at the edit script between each pair of subsequent versions, and measure the size of these scripts. We do so by calculating differences between abstract syntax trees (ASTs) of the two versions. Hence, we can see, for example, the total number of statements that need to be inserted, deleted, updated or moved to convert the first version of the library into the second. We use the static code analysis tool ChangeDistiller⁹ to calculate edit scripts between library versions. For more information on ChangeDistiller, we refer to [10].

D. Obtaining release intervals and dependencies

To calculate release intervals, we collect upload dates for each jar file in the Maven Central Repository. Unfortunately, a valid upload date is not available for all libraries. Ultimately, for 129,183 out of 144,934 (89.1%) libraries we could identify a valid release date.

E. Obtaining deprecation patterns

For API developers, the Java language offers the possibility to warn about future incompatibilities by means of the “@Deprecated” annotation.¹⁰ Old methods can be marked as deprecated, but as they are not removed, backward compatibility is retained.

⁵ http://docs.oracle.com/javase/specs/jls/se7/html/jls-13.html
⁶ For an overview of different types of binary incompatibilities and a detailed explanation, see http://wiki.eclipse.org/Evolving_Java-based_APIs
⁷ http://clirr.sourceforge.net
⁸ http://maven.apache.org/ref/3.1.1/maven-artifact
⁹ https://bitbucket.org/sealuzh/tools-changedistiller
¹⁰ http://docs.oracle.com/javase/1.5.0/docs/guide/javadoc/deprecation/deprecation.html
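The relation between method removals (breaking changes in the sense of Section IV-A) and deprecation can be illustrated with a rough sketch. It assumes that each release is summarized as a mapping from method signature to a deprecated flag; this representation and the function name `diff_api` are assumptions made for illustration, not Clirr's data model, and only method additions and removals are covered, a small subset of the change types Clirr reports.

```python
# Assumed representation: {method signature: True if marked @Deprecated}.
# This is an illustrative simplification, not the output format of Clirr or JDT.

def diff_api(old_api, new_api):
    """Compare two API snapshots and group the differences."""
    removed = sorted(set(old_api) - set(new_api))
    added = sorted(set(new_api) - set(old_api))
    kept = set(old_api) & set(new_api)
    return {
        # A removed method breaks every client that still calls it.
        "breaking": removed,
        # An added method leaves existing clients untouched.
        "non_breaking": added,
        # Removals that were never announced through a deprecation tag.
        "removed_without_deprecation": [s for s in removed if not old_api[s]],
        # Methods that are still present but newly marked as deprecated.
        "newly_deprecated": sorted(
            s for s in kept if new_api[s] and not old_api[s]
        ),
    }


old_release = {
    "Parser.parse(String)": False,
    "Parser.reset()": True,
    "Parser.close()": False,
}
new_release = {
    "Parser.parse(String)": True,
    "Parser.parse(String,Charset)": False,
}
report = diff_api(old_release, new_release)
for category, signatures in report.items():
    print(category, signatures)
```

In the example, `Parser.reset()` was deprecated before it was removed, whereas `Parser.close()` disappears without warning; both removals count as breaking.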

Also in semver, the use of such annotations is required before methods are actually removed.

We detect deprecated methods in the following way. We extract the source code from source jar files for each library and, for performance reasons, textually search for occurrences of the string “@Deprecated” first. Only when at least one deprecated tag is found, we parse the complete source code of the library using the JDT (Java Development Tools) Core library.¹¹

Using JDT, we create an abstract syntax tree for each source file, and apply a visitor to find out which methods have deprecation tags. Next versions of the same method are connected using method header (name and parameters) matching. Combining this information with the update types from Section IV-B makes it possible to distinguish between different types of deprecation patterns.

V. DESCRIPTIVE STATISTICS

Before answering our research questions, we provide an overview of the actual use of version strings that comply with semver, and of the most common types of breaking changes in the Maven dataset.

A. Version string patterns

Table 1 shows the six most common version string patterns that occur in the Maven repository. For each pattern, the table shows the number of libraries with version strings that match that pattern (#Single) and the number of subsequent versions that both follow the same pattern (#Pairs); we will use the latter to identify breaking changes between subsequent releases.

#   Pattern                       Example       #Single    #Pairs   Incl.
1   MAJOR.MINOR                   2.0            20,680    11,559   yes
2   MAJOR.MINOR.PATCH             2.0.1          65,515    50,020   yes
3   #1 or #2 with nonnum. chars   2.0.D1          3,269     2,150   yes
4   MAJOR.MINOR-prerelease        2.0-beta1      16,115    10,756   no
5   MAJOR.MINOR.PATCH-pre.        2.0.1-beta1    12,674     8,939   no
6   Other versioning scheme       2.0.1.5.4      10,930     8,307   no
    Total                                       129,183    91,731
Table 1. Version string patterns and frequencies of occurrence in the Maven repository.

The first three versioning schemes correspond to actual semver releases, whereas the remaining ones correspond to prereleases. Since prereleases can be more tolerant in terms of breaking changes (semver does not state what the relationship between prereleases and non-prereleases in terms of breaking changes and new functionality is)¹² we exclude prereleases from our analysis.

The table shows that the majority of the version strings (69.3%) is formatted according to the first three schemes, and 22.3% of the version strings contain a prerelease label (patterns 4 and 5). The difference between the single and the pair frequency is due to two reasons: (1) the second version string of an update can follow a different pattern than the first; and (2) a large number of libraries only has a single release (6,442 out of 22,205 libraries, 29%).

B. Breaking and non-breaking changes

Table 2 shows the top 10 breaking and non-breaking changes in the Maven repository as detected by Clirr. The most frequently occurring breaking change is the method removal, with 177,480 occurrences. A method removal is considered to be a breaking change because the removal of a method leads to compilation errors in all places where this method is used. The most frequently occurring non-breaking change as detected by Clirr is the method addition, with 518,690 occurrences.

Breaking changes
#    Change type                               Frequency
1    Method has been removed                     177,480
2    Class has been removed                      168,743
3    Field has been removed                      126,334
4    Parameter type change                        69,335
5    Method return type change                    54,742
6    Interface has been removed                   46,852
7    Number of arguments changed                  42,286
8    Method added to interface                    28,833
9    Field type change                            27,306
10   Field removed, previously constant           12,979

Non-breaking changes
#    Change type                               Frequency
1    Method has been added                       518,690
2    Class has been added                        216,117
3    Field has been added                        206,851
4    Interface has been added                     32,569
5    Method removed, inherited still exists       25,170
6    Field accessibility increased                24,954
7    Value of compile-time constant changed       16,768
8    Method accessibility increased               14,630
9    Addition to list of superclasses             13,497
10   Method no longer final                        9,202
Table 2. The most common breaking and non-breaking changes in the Maven repository as detected by Clirr.

Table 3 shows the number of major, minor and patch releases containing at least one breaking change. The table shows that 35.8% of major releases contain at least one breaking change, which is in accordance with guidelines such as semver. We also see that 35.7% of minor releases and 23.8% of patch releases contain at least one breaking change. This is in sharp contrast to the requirement that minor and patch releases should be backward compatible. The overall number of releases that contain at least one breaking change is 30.0%.

The table shows that there is no large difference between the percentage of major and minor releases that contain breaking changes. This indicates that semver is not adhered to in practice with respect to breaking changes. If semver were adhered to, the number of minor and patch releases containing breaking changes would be 0 in the table.

¹¹ http://www.eclipse.org/jdt/core
¹² Pre-releases in Maven correspond to -SNAPSHOT releases, which should not be distributed via Maven’s Central Repository (see https://docs.sonatype.org/display/Repository/Sonatype+OSS+Maven+Repository+Usage+Guide)
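The percentages quoted above can be re-derived from the release counts reported in Table 3. The following sketch is only an arithmetic check; the counts are copied from that table, while the function names and the extra "breaking and non-major" figure are illustrative additions rather than part of the original analysis.

```python
# Counts copied from Table 3:
# (releases with at least one breaking change, all releases of that type).
TABLE_3 = {
    "major": (4268, 11892),
    "minor": (10690, 29957),
    "patch": (9239, 38740),
}


def percentage(part, whole):
    return 100.0 * part / whole


def breaking_share_per_type(table):
    """Share of releases of each type that contain a breaking change."""
    return {
        kind: percentage(breaking, total)
        for kind, (breaking, total) in table.items()
    }


def overall_breaking_share(table):
    """Share of all releases that contain a breaking change."""
    breaking = sum(b for b, _ in table.values())
    total = sum(t for _, t in table.values())
    return percentage(breaking, total)


def non_major_breaking_share(table):
    """Share of all releases that are minor or patch AND contain a breaking change."""
    breaking = sum(b for kind, (b, _) in table.items() if kind != "major")
    total = sum(t for _, t in table.values())
    return percentage(breaking, total)


for kind, value in breaking_share_per_type(TABLE_3).items():
    print(f"{kind}: {value:.1f}%")
print(f"overall: {overall_breaking_share(TABLE_3):.1f}%")
print(f"breaking and non-major: {non_major_breaking_share(TABLE_3):.1f}%")
```

The minor, patch and overall shares reproduce the reported 35.7%, 23.8% and 30.0%. The major share comes out at about 35.9%, 0.1 point above the 35.8% printed in Table 3, presumably a rounding difference. Minor and patch releases with at least one breaking change account for roughly a quarter of all analyzed releases.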

The total number of updates in Table 3 (80,589) differs from the total number of pairs in Table 1 because of missing or corrupt jar files, which have a version string but cannot be analyzed by Clirr.

              Contains at least 1 breaking change
Update type   Yes       %        No        %        Total
Major          4,268    35.8%     7,624    64.2%    11,892
Minor         10,690    35.7%    19,267    64.3%    29,957
Patch          9,239    23.8%    29,501    76.2%    38,740
Total         24,197    30.0%    56,392    70.0%    80,589
Table 3. The number of major, minor and patch releases that contain breaking changes.

VI. RQ1: MAJOR VS MINOR VS PATCH RELEASES

To understand the adherence to semantic versioning principles for major, minor, and patch releases, Table 4 shows the average number of breaking changes, non-breaking changes, edit script size and number of days for the different release types. Each release is compared to its immediate previous release, regardless of the release type of this previous release.

        #Breaking        #Non-break.      Edit script      Days
Type    μ       σ²       μ       σ²       μ       σ²       μ       σ²
Major   58.3    337.3    90.7    582.1    50.0    173.0    59.8    169.8
Minor   27.4    284.7    52.2    255.5    52.7    190.5    76.5    138.3
Patch   30.1    204.6    42.8    217.8    22.7    106.5    62.8     94.4
Total   32.0    264.3    52.2    293.3    37.2    152.3    67.4    122.9
Table 4. Analysis of the number of breaking and non-breaking changes, edit script size, and release intervals of major, minor, and patch releases.

As the table shows, on average there are 58 breaking changes in a major release. Minor and patch releases introduce fewer breaking changes (around half as many as the major releases), but 27 and 30 on average is still a substantial number (and clearly not 0 as semantic versioning requires). The differences between the three update types are significant with F = 7.31 and p = 0, tested with a nonparametric Kruskal-Wallis test, since the data is not normally distributed.¹³

In terms of size, major releases are somewhat smaller than minor releases (average edit script size of 50 and 52, respectively), with patch releases substantially smaller (22), with F = 117.49 and p = 0. This provides support for the rule in semver stating that patch releases should contain only bug fixes, which overall would lead to smaller edit script sizes than new functionality.

With respect to release intervals, these are on average 2 (for major and patch releases) to 2.5 months (for minor releases), with F = 115.47 and p = 0. It is interesting to see that minor, and not major, updates take the longest time to release.

Care must be taken when interpreting the mean for skewed data. All data in this table follows a strong power law, in which most observations are close to 0 and there is a relatively small number of large outliers. Nonetheless, a larger mean indicates that there are more large outliers present in the data.

Thus, to answer RQ1: The strict principles of semantic versioning regarding breaking changes are not adhered to in practice. Instead of being free of breaking changes, minor and patch releases include 27 and 30 breaking changes on average, respectively.

In Section X we will get back to these results and try to provide explanations. We first continue with an analysis of adherence to semver through time.

VII. RQ2: SEMANTIC VERSIONING ADHERENCE OVER TIME

To find out if the adherence to semver has changed over time, we plot the number of major, minor and patch releases through time and the number of releases containing breaking changes over time. This plot is shown in Figure 5.

[Figure 5: line plot; x-axis: Year (2006 to 2011); y-axis: Percentage (0% to 60%); series: Major, Minor, Patch, Breaking, Breaking if non-major]
Figure 5. The percentage of major, minor, patch, breaking, and breaking if non-major releases through time.

The figure shows that the ratio of major, minor and patch releases is relatively stable and around 15%, 30% and 50%, respectively. The percentage of major releases per year seems to decrease slightly in later years.

Regardless of release type, one in every three releases contains breaking changes. This percentage is relatively stable but slightly decreasing in later years. One out of every four releases violates semver (“breaking if non-major”), but this percentage also slightly decreases in later years: from 28.4% in 2006 to 23.7% in 2011.

To answer RQ2: The adherence to semantic versioning principles has increased over time with a moderate decrease of breaking changes in non-major releases from 28.4% in 2006 to 23.7% in 2011.

VIII. RQ3: UPDATE BEHAVIOR

The key reason to investigate breaking changes is that they complicate upgrading a library to its latest version.

¹³ Even if the data is not normally distributed, we still summarize the data with a mean and standard deviation to provide insight in the data.

To what extent is this visible in the Maven dataset? What delay is there typically between a library release and the usage of that release by other systems? Is this delay affected by breaking changes?

To investigate the actual update behavior of systems using libraries, we collected all updates from the Maven repository that update one of their dependencies. Thus, we investigate usage scenarios within the Maven dataset.

We obtained a list of 2,984 updates from the Maven repository of the form (S_x, S_{x+1}, L_y, L_{y+1}), where L is a dependency of S which was updated from version y to version y+1 in the update of S from x to x+1. For example, when the Spring framework included version 3.8.1 of JUnit in version 2.0, but included version 3.8.2 in version 2.1, Spring framework performed a patch update of JUnit in a minor release.

Table 6 shows the number of updates of different types of S and L in the Maven repository. The table shows that most major updates of dependencies (543) are performed in major updates of S, and most minor updates of dependencies (791) are performed in minor updates of S. The same is true for patch updates of dependencies, which are most frequently updated in patch updates of S (297).

           Update L
Update S   Major    Minor    Patch    Total
Major        543      189       82      814
Minor        651      791      227    1,669
Patch        150       54      297      501
Total      1,344    1,034      606    2,984
Table 6. The number of updates of different types of S and simultaneous updates of dependency L.

To further investigate update behavior of dependencies, we calculate the number of versions of L that S lags behind, as illustrated in Figure 7. The figure shows an example of three versions of S, and a dependency L of S. On January 1, L1, a patch update, is released. S1 decides to use this version in its system. On March 1, a major update of L is released, L2. The next release of S, S2, happens on April 1. This release still includes L1, although L2 was already available to include in S2. The same is true for S3, which could have included L3 but still includes L2. The period that S has been using L1 is from February 1 to April 1. The total time that S has a dependency on L is from February 1 to August 1.

[Figure 7: timeline with date marks Jan 1, Feb 1, Mar 1, Apr 1, May 1 and Aug 1, showing releases S1, S2, S3 of system S and releases L1 (patch), L2 (major), L3 (minor) of library L; arrows mark which version of L each release of S uses, the next available version, and the S3-L update lag of one minor release]
Figure 7. An example of a timeline with a system S updating library L.

This example illustrates that there can exist a lag between the release of a new version of L and the inclusion in S. In this example, S3 lags one minor release behind, and could have included L3. The time S3 theoretically could update to L3 is between May 1 and August 1.

For each system S and each of its dependencies L, we calculate the number of major, minor and patch releases that version of S lags behind. The release dates of S_x and L_y are used to determine the number of releases after L_y but before S_x.

Table 8 shows percentiles for the number of major, minor and patch versions that dependencies L of system S are lagging as compared to the latest releases of L at the release date of S. For instance, when a system released a new version on January 1, 2013 and that release included a library with version 4.0.1 but there have been 10 minor releases of that library before January 1 and after the release date of version 4.0.1 that could have been included in that release of S, the number of minor releases lagging is 10 for that system-library combination. These numbers are calculated for each system-library combination separately.

        min   p25   p50   p75   p90   p95   p99   max
Major     0     0     0     0     1     1     4    22
Minor     0     0     0     1     2     4     6   101
Patch     0     0     0     1     5     6    13    46
Table 8. Percentiles for the number of major, minor and patch dependency versions lagging.

The table shows that the number of major releases that S lags on average tends to be smaller than the number of minor and patch releases lagging. The distributions are highly skewed, with a median of 0 for all three release types and a 75th percentile of 1 for minor and patch releases, indicating that the majority of library developers include the latest releases of dependencies in their own libraries. The numbers also indicate that developers tend to better keep up with the latest major releases than with minor and patch releases, as indicated by the 90th percentile of 1 for major releases and a 90th percentile of 5 for patch releases.

To better understand the reasons underlying the update lag, we investigate two properties of libraries that could influence the number of releases that systems are lagging: the edit script size and the number of breaking changes of these dependencies. We hypothesize that people are reluctant to update to a newer version of a dependency when it introduces a large number of breaking changes or introduces a large amount of new or changed functionality. To test this, we investigate whether a positive correlation exists between the number of major, minor and patch releases lagging in libraries using a dependency and the number of breaking changes and changed functionality in new releases of that

dependency. We calculate Spearman correlations between the number of versions lagging and the number of breaking changes and edit script size in these versions.

The results are shown in Table 9. The table shows Spearman correlations, which are calculated on 13,945 observations and all have a p-value of 0. The correlations are generally very weak, with the maximum correlation being 0.1440 between the number of minor versions lagging and the number of breaking changes in these dependencies. The results indicate that although the number of breaking changes and the edit script size of a library do seem to have some influence on the number of library releases systems are lagging, the influence generally is not very large.

                          Breaking changes   Edit script size
Major versions lagging    0.0772             -0.0701
Minor versions lagging    0.1440             0.1272
Patch versions lagging    0.0190             0.0199

Table 9. Spearman correlations between the size of the update lag of L and breaking changes and the edit script size in the next version of L.

To answer RQ3: Updates of dependencies to major releases are most often performed in major library updates. There exists a lag between the latest versions of dependencies and the versions actually included, with the gap being the largest for patch releases and the smallest for major releases. There exists a small influence of the number of backward incompatibilities and of the amount of change in new versions on this lag.

IX. RQ4: DEPRECATION PATTERNS

As we have seen, breaking changes are common. To deal with breaking changes, the Java language offers deprecation annotations. For the use of such annotations, semantic versioning provides the following rules for deprecation of methods in public interfaces: “a new minor release should be issued when a new deprecation tag is added. Before the functionality is removed completely in a new major release, there should be at least one minor release that contains the deprecation so that users can smoothly transition to the new API.”14 Thus, whenever there is a breaking change (which must be in a major release), this should be preceded by a deprecation (which can be in a minor release).

14 http://semver.org/spec/v2.0.0.html

In this section, we investigate whether this principle is adhered to in practice. We investigate how many libraries actually deprecate methods, and if they do, how many releases it takes before these methods get deleted, if at all. We also find out if there is indeed at least one minor release in between before the method is removed, as semver prescribes.

In total, 1196 out of 22,205 artifacts (5.4%) contain at least one method deprecation tag. Given our observation that 1 in 3 releases introduces breaking changes, this number immediately appears too low.

Table 10 shows different possible deprecation patterns. The table uses a typical library with 4 releases (two major, two minor). For each pattern in the table, we count its frequency in the maven data set. As the table shows, there are several different ways to deprecate and delete methods in major or minor releases, some of which are correct according to semver (column c).

Cases 1 and 2 in Table 10 show an example of a private method with and without deprecation tags. As the table shows, the first case occurs in 24.34% of all methods. Since semver is only about versioning and changes in public interfaces, these cases are not investigated further. Case 3 shows a public method that is neither deleted nor deprecated, which is the most common life cycle for a method (42% of the cases). Case 4 shows a public method that is deprecated, but is never removed in later versions. According to the principles regarding deprecation as stated in semver, this is correct behavior. As the table shows, this is the most common use of the deprecation tag, even though it is used in just 793 methods. Case 5 shows a public method that is removed from the interface but never declared deprecated, which is not correct: this is the typical case of introducing a breaking change in a minor release. Case 6 deprecates the method, but deletes it in a minor release, which would not be correct. This case does not occur. Case 7 declares the method deprecated in a major release, which would also be incorrect (and which does not occur). Case 8 shows an example of deprecation by the book, exactly as prescribed by semver. The method is declared deprecated in a minor release, there is another minor release that also declares the method deprecated and in the next major release, the method is removed. This correct pattern does not occur at all in the maven data set. Case 9 shows a method that is undeprecated, about which semver does not explicitly contain a statement.

As the table further shows, public methods without a deprecated tag in their entire history are in the majority with 42.27%. Surprisingly, the number of public methods that ever get deprecated in their entire history is only 793, or 0.30%. The number of public methods that get deleted without a deprecated tag is 86,449, or 33.03%. The number of methods that get deleted after adding a deprecated tag to an earlier version is 0 (cases 6 and 8). Finally, the share of methods that get “undeprecated” is 0.01% (16 methods).

These results are surprising since they suggest that developers do not apply deprecation patterns in the way that semver proposes. In fact, developers do not seem to use the deprecated tag for methods very often at all. Most public methods that are deleted are removed without a deprecated tag being applied first (case 5), and methods that do get a deprecated tag are almost never deleted (case 4). This suggests that developers are reluctant to remove deprecated functionality from new releases, possibly because they are afraid to break backward compatibility. Case 8 is, according to semver, the only

#  v1 (maj.)  v2 (min.)  v3 (min.)  v4 (maj.)  c  i  Freq.    %
1  pr m1      pr m1      pr m1      pr m1      y  n  63,698   24.34
2  pr m2      pr m2      pr @d m2   pr @d m2   y  n  113      0.04
3  pu m3      pu m3      pu m3      pu m3      y  n  110,613  42.27
4  pu m4      pu @d m4   pu @d m4   pu @d m4   y  y  793      0.30
5  pu m5      pu m5      -          -          n  y  86,449   33.03
6  pu m6      pu @d m6   -          -          n  y  0        0
7  pu m7      pu m7      pu m7      pu @d m7   n  y  0        0
8  pu m8      pu @d m8   pu @d m8   -          y  y  0        0
9  pu m9      pu @d m9   pu m9      pu m9      n  y  16       0.01

Table 10. Possible method deprecation patterns. @d = deprecated tag; c = correct; i = interesting; pr = private; pu = public; - = method deleted.
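
As an illustration only, the "correct" column (c) of Table 10 can be reproduced with a few rules applied to a method's life cycle. The sketch below is not the tooling used in the study; the function name `is_semver_correct`, the string encoding of states (taken from the table notation, with method names omitted) and the exact rule set are assumptions chosen so that the nine patterns come out as listed in column c. In particular, treating an "undeprecated" method as not correct follows the table, since semver itself makes no statement about that case.

```python
from typing import List

# Release types of v1..v4, as used in Table 10.
RELEASE_TYPES = ["major", "minor", "minor", "major"]


def is_semver_correct(history: List[str],
                      release_types: List[str] = RELEASE_TYPES) -> bool:
    """Return True if a method life cycle is 'correct' in the sense of column c.

    Each entry of `history` is the state of the method in one release:
    "pr" / "pu" (private / public), optionally followed by " @d"
    (deprecated tag), or "-" (method deleted).
    """
    # Private methods are outside the public interface, so any pattern is fine.
    if history[0].startswith("pr"):
        return True

    for i in range(1, len(history)):
        prev, cur = history[i - 1], history[i]
        if prev == "-":
            continue  # method was already deleted in an earlier release
        was_deprecated = "@d" in prev

        if cur == "-":
            # Assumed rule: deletion only in a major release, after deprecation.
            if release_types[i] != "major" or not was_deprecated:
                return False
        elif "@d" in cur and not was_deprecated:
            # Assumed rule: a new deprecation tag is introduced in a minor release.
            if release_types[i] != "minor":
                return False
        elif was_deprecated and "@d" not in cur:
            # "Undeprecation" is marked as not correct in Table 10 (case 9).
            return False
    return True


# The nine patterns of Table 10.
CASES = {
    1: ["pr", "pr", "pr", "pr"],
    2: ["pr", "pr", "pr @d", "pr @d"],
    3: ["pu", "pu", "pu", "pu"],
    4: ["pu", "pu @d", "pu @d", "pu @d"],
    5: ["pu", "pu", "-", "-"],
    6: ["pu", "pu @d", "-", "-"],
    7: ["pu", "pu", "pu", "pu @d"],
    8: ["pu", "pu @d", "pu @d", "-"],
    9: ["pu", "pu @d", "pu", "pu"],
}

if __name__ == "__main__":
    for case, history in CASES.items():
        print(case, "y" if is_semver_correct(history) else "n")
```

Encoding the states in the same notation as the table keeps the sketch easy to compare with the rows above; a real analysis would of course derive these states from parsed library versions.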

proper way to deprecate and delete methods. However, the pattern was not found in the entire Maven repository.

To answer RQ4: Developers do not follow deprecation guidelines as suggested by semantic versioning. Most public methods are deleted without applying a deprecated tag first, and when these tags are applied to methods, these methods are never deleted in later versions.

X. DISCUSSION

The results of this study indicate that the stability of interfaces and the mechanisms to signal instability to developers leave much to be desired. One in every three interfaces contains breaking changes, and additionally, one in three interfaces that should not contain breaking changes actually does. The usage of the deprecation tag and the deletion of methods in the Maven repository show that the average developer tends to disregard the effects their actions have on clients of a library.

A. Low adherence explained

Even though the version numbering schemes used by a large number of libraries conform to the scheme endorsed by semver, developers apparently do not conform to the actual rules set out by this standard. If developers adhered completely to these principles and their releases contained the same amount of breaking changes as found in the Maven repository, the number of major releases would be much larger than is currently the case. This low adherence is surprising since there are no other mechanisms available, except versioning schemes and deprecation tags, to signal interface instability. We argue that the principles set out by semver should be followed by every developer of software libraries, or of any piece of software that is used by external developers.

We argue that ultimately, better designed and more stable interfaces lead to a lower maintenance burden of software in general. When a library user, or a user of any piece of publicly available functionality, knows that there are expected changes when upgrading to a newer version, the developer can anticipate this and choose to postpone or include the update. Strict adherence to semantic versioning principles also forces library developers to think hard about the functionality they release, and about the design of the public interface they are releasing. It is increasingly hard for library developers to change the overall design of their interface after it has been published. This problem becomes worse the more users actually use the interface. Releasing a new major release can effectively signal that continuity of the old interface should not be expected and that radical changes may be present. However, when this mechanism is only partially used, which we have shown is the case in the Maven repository, it becomes unclear what exactly a major release means.

A possible explanation for the low adherence to semver is that the Java modularization mechanism is not suited to provide all visibility levels as desired by developers. For instance, developers sometimes release “internal” packages. These are packages that should be hidden from outside developers and are only meant to be used by the developers themselves. The problem with internal packages is that they are publicly visible, meaning that outside developers have complete access to these packages, just like regular packages. What is missing from the Java language is another layer of visibility, which hides internal packages from outside users. An example of a mechanism that does provide this level of visibility is the modularization structure of the OSGi framework. Additionally, entire libraries are sometimes released that are only meant to be used by the developers themselves, even without the use of internal packages. Neither Java nor the Maven repository provides support to prevent external users from using these libraries. In fact, these libraries should never have been released in the Maven repository to begin with.

The low number of methods that use the deprecation tag in the entire Maven repository was surprising. A possible explanation for this is that classes can also be deprecated completely, without individually deprecating all methods in that class. Our analysis will not detect these cases. Future work could further investigate whether developers deprecate entire classes instead of deprecating only single methods.

B. Actual usage frequencies

In our research, we do not take into account the difference between internal and non-internal packages. We also do not take into account the actual usage of packages, classes and methods with breaking changes. It makes a difference

whether a public method in the interface of a library is used frequently by other developers, such as assertEquals in JUnit, or the method is not used at all by other developers. We consider the impact of breaking changes on libraries using that functionality outside the scope of this paper. However, semver does not state that breaking changes in major releases can only occur in parts of the library that are never used, but instead states that breaking changes should never be present in minor and patch releases, regardless of actual usage.

C. Release interval and edit script size

Table 4 showed that major releases have smaller release intervals and also contain less functional change. We expected that major releases would have larger release intervals instead. This could be explained by the fact that developers often start working on a major release alongside the minor or patch release (by creating a branch) of the previous version, which would decrease the actual release interval.

The table also shows that major releases generally contain less changed functionality than minor releases, as measured by edit script size. A possible explanation for this is that developers create a new major release especially for backward incompatible changes in its API, and new functionality is added later. Seen this way, a major release can be interpreted as a signal that gives information on significant changes in the interface of a library, while saying nothing about the amount of changed functionality in the release.

D. The birth of semantic versioning principles

The snapshot of the Maven repository that was analyzed in this paper contained releases until July 11, 2011. The commit history of the GitHub repository of semver.org15 showed that the first commit was performed on December 14, 2009. The question arises how widespread the knowledge about semver was before the first version of semver was online.

It is unclear when developers started to use semantic versioning principles, but we believe that the principles on semver.org are simply a summary of principles that were already known in the developer community, but had not been encoded in a comprehensive manifesto before. This hypothesis is supported by the fact that comparable semantic versioning principles have been encoded elsewhere, such as those of the OSGi Alliance16, which released its semantic versioning principles on May 6, 2010 and whose guidelines are comparable to the ones by semver.

Furthermore, there exist several alternative versioning approaches17, but the versioning schemes described in these approaches do not seem to be used in the Maven repository, as can be seen in Table 1. For this reason, only adherence to the principles stated by semver was checked in this paper.

15 https://github.com/mojombo/semver.org/commits/gh-pages?page=5
16 http://www.osgi.org/wiki/uploads/Links/SemanticVersioning.pdf
17 http://en.wikipedia.org/wiki/Software_versioning

E. Major version 0 releases

Semver states that “Major version zero (0.y.z) is for initial development. Anything may change at any time. The public API should not be considered stable.” We did not consider whether the effects as tested in this paper also hold for releases with a major version of zero. Releases with a major version of 0 account for 10.44% of all releases (13,162 / 126,070), which is a substantial part. Future work could investigate whether the principles as tested in this paper are also visible in releases with a major version of 0. We expect that the number of breaking changes in these releases will be considerably higher than in other releases.

XI. THREATS TO VALIDITY

A. Internal validity

The release dates of libraries as obtained from the central Maven repository are sometimes incorrect, as demonstrated by the disproportionally large number of libraries with a release date of November 5th, 2005 (2,321, 1.5%). These data points were excluded from our analysis, but we do not have absolute certainty of the correctness of the remaining release dates. Another indication that release dates were not always correct is the fact that an ordering based on release dates and an ordering based on version numbers of a single artifact do not always give the same rankings. In these cases, the ordering in version numbers was assumed to be correct. These possibly invalid data points do influence our analysis of the number of days between releases, but we assume that, on average, our statistical analyses remain robust. In a manually checked sample of 50 random library versions, the release dates all matched those on the corresponding websites. This sample gives us confidence in the overall reliability of the release dates in the repository.

The low number of deprecation tags detected in the Maven repository is surprising. However, we have confidence in our methodology to detect these tags since deprecation patterns were scanned in two different ways. First, a textual search was performed for literal occurrences of the string “@Deprecated”. Second, when a deprecated tag was found in a library, the complete library was parsed and ASTs were created. This approach therefore makes it impossible to miss a deprecated tag. In future work, we could further investigate causes for the low number of deprecated tags.

B. External validity

While our findings are directly based on an exploration of semantic versioning principles in Maven, we believe many of them will hold beyond this setting. For example, in other

ecosystems, such as .NET libraries, nuget18 packages (the .NET counterpart of Maven), OSGi bundles, or Ruby gems19, similar phenomena may be observed. In the domain of software services, versioning and compatibility play a role not just at compile time, but also at runtime, as services may be dynamically replaced with (hopefully compatible) updates. Here, again, a need for dealing with breaking changes will occur, as well as a need for managing this through deprecation tags.

18 http://www.nuget.org
19 http://www.rubygems.org

Future work could investigate to what degree the patterns found in our dataset are representative for software libraries outside the Maven repository, software libraries written in languages other than Java, or software systems in general. To test our hypothesis that other library repositories also show the same patterns, further research is needed. Future work could also try to replicate these patterns in a set of industrial software systems.

C. Reproducibility and reliability

Obtaining the data for this paper required substantial computing power: the data was obtained on a supercomputer with 100 processing nodes with an aggregated running time of almost six months. Without access to the same amount of computing power, the data will be very hard to reproduce.

XII. CONCLUSION

In this paper, we have looked at versioning as adopted by over 22,000 open source libraries distributed through Maven Central. In particular, we investigated whether the principles formulated by semantic versioning, which specifies rules about the introduction of breaking changes in relation to version number increments, are adhered to.

Our findings are as follows:
• The introduction of breaking changes is widespread: Around one third of all releases introduce at least one breaking change.
• While semantic versioning prescribes that breaking changes are only permitted in major releases, we see little difference between major and minor releases: One third of the major as well as one third of the minor releases introduce at least one breaking change.
• The presence of breaking changes has little influence on the actual delay between the availability of a library and the use of the newer version of that library.
• Deprecation tags are used very little, and never in the way as strictly suggested by semantic versioning.

The results indicate that the current mechanisms to signal interface instability are not used properly.
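
The rule underlying these findings, that breaking changes may only appear in major releases, can be sketched in a few lines. This is a minimal illustration, not the analysis pipeline of the paper: the helper names `parse_version`, `release_type` and `violates_semver` are hypothetical, and the sketch assumes purely numeric version strings of at most three dot-separated parts, whereas real Maven versions may carry qualifiers that it does not handle. The number of breaking changes is taken as given.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH'; missing parts are assumed to be 0."""
    parts = [int(p) for p in version.split(".")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unsupported version string: {version!r}")
    parts += [0] * (3 - len(parts))
    return parts[0], parts[1], parts[2]


def release_type(old: str, new: str) -> str:
    """Classify the step from `old` to `new` as major, minor or patch."""
    o, n = parse_version(old), parse_version(new)
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    return "patch"


def violates_semver(old: str, new: str, breaking_changes: int) -> bool:
    """True if a non-major release introduces at least one breaking change."""
    return breaking_changes > 0 and release_type(old, new) != "major"


if __name__ == "__main__":
    # Version numbers from the JUnit / Spring example in the paper.
    print(release_type("3.8.1", "3.8.2"))
    print(release_type("2.0", "2.1"))
    print(violates_semver("2.0", "2.1", breaking_changes=3))
```

With the version numbers from the earlier JUnit and Spring example, the step from 3.8.1 to 3.8.2 is classified as a patch update and the step from 2.0 to 2.1 as a minor update; a minor update with any breaking change counts as a violation.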
