On the Diffuseness and the Impact on Maintainability of Code Smells
Fabio Palomba
Delft University of Technology, The Netherlands
Gabriele Bavota
Università della Svizzera italiana (USI), Switzerland
Massimiliano Di Penta
University of Sannio, Italy
Fausto Fasano, Rocco Oliveto
University of Molise, Italy
Andrea De Lucia
University of Salerno, Italy
2 Fabio Palomba et al.
1 Introduction
Bad code smells (also known as “code smells” or “smells”) were defined as
symptoms of poor design and implementation choices applied by programmers
during the development of a software project (Fowler, 1999). As a form of
technical debt (Cunningham, 1993), they could hinder the comprehensibility
and maintainability of software systems (Kruchten et al, 2012). An example
of code smell is the God Class, a large and complex class that centralizes the
behavior of a portion of a system and only uses other classes as data holders.
God Classes can rapidly grow out of control, making it harder and harder for
developers to understand them, to fix bugs, and to add new features.
The research community has been studying code smells from different per-
spectives. On the one side, researchers developed methods and tools to detect
code smells. Such tools exploit different types of approaches, including metrics-
based detection (Lanza and Marinescu, 2010; Moha et al, 2010; Marinescu,
2004; Munro, 2005), graph-based techniques (Tsantalis and Chatzigeorgiou,
2009), mining of code changes (Palomba et al, 2015a), textual analysis of
source code (Palomba et al, 2016b), or search-based optimization techniques
(Kessentini et al, 2010; Sahin et al, 2014). On the other side, researchers inves-
tigated how relevant code smells are for developers (Yamashita and Moonen,
2013; Palomba et al, 2014), when and why they are introduced (Tufano et al,
2015), how they evolve over time (Arcoverde et al, 2011; Chatzigeorgiou and
Manakos, 2010; Lozano et al, 2007; Ratiu et al, 2004; Tufano et al, 2017), and
whether they impact software quality properties, such as program
comprehensibility (Abbes et al, 2011), fault- and change-proneness (Khomh et al,
2012, 2009a; D’Ambros et al, 2010), and code maintainability (Yamashita and
Moonen, 2012, 2013; Deligiannis et al, 2004; Li and Shatnawi, 2007; Olbrich
et al, 2010; Sjoberg et al, 2013).
Similarly to some previous work (Khomh et al, 2012; Li and Shatnawi,
2007; Olbrich et al, 2010; Gatrell and Counsell, 2015), this paper investigates
the relationship existing between the occurrence of code smells in software
projects and software change- and fault-proneness. Specifically, while previous
work shows a significant correlation between smells and code change/fault-
proneness, the empirical evidence provided so far is still limited because of:
– Limited size of previous studies: the study by Khomh et al (2012)
was conducted on four open source systems, while the study by D’Ambros
et al (2010) was performed on seven systems. Furthermore, the studies by
Li and Shatnawi (2007), Olbrich et al (2010), and Gatrell and Counsell
(2015) were conducted considering the change history of only one software
project.
– Detected smells vs. manually validated smells: Previous work study-
ing the impact of code smells on change- and fault-proneness, including the
one by Khomh et al (2012), relied on data obtained from automatic smell
detectors. Although such smell detectors are often able to achieve a good
level of accuracy, it is still possible that their intrinsic imprecision affects
the results of the study.
On the Diffuseness and the Impact on Maintainability of Code Smells 3
2 Related work
The research community has been highly active in the definition of code smell
detection methods and tools, as well as in the investigation of the impact of
code smells on software maintenance properties. In this section we report the
literature related to (i) empirical studies aimed at understanding to what ex-
tent code smells are diffused in software systems and how they evolve over
time, (ii) the impact of code smells on change- and fault-proneness, and (iii)
user studies conducted in order to comprehend the phenomenon from a de-
veloper’s perspective. A complete overview of code smell detection techniques
can be found in related papers by Palomba et al (2015c) and Fernandes et al
(2016).
source code, but they do not invest time in performing refactoring activities
aimed at removing them. A partial explanation for this behavior is provided
by Arcoverde et al (2011), who studied the longevity of code smells showing
that they often survive for a long time in the source code. The authors point
to the desire to avoid API changes as one of the main reasons behind this
result (Arcoverde et al, 2011).
The evolution of code smells has also been studied by Olbrich et al (2009),
who analyzed the evolution of God Class and Shotgun Surgery, showing that
there are periods in which the number of smells increases and periods in which
this number decreases. They also show that the increase/decrease of the num-
ber of instances does not depend on the size of the system.
Vaucher et al (2009) conducted a study on the evolution of the God Class
smell, aimed at understanding whether its instances affect software systems for
long periods of time or, instead, are refactored while the system evolves. Their
goal was to define a method able to discriminate between God Class instances
that are introduced by design and God Class instances that are introduced
unintentionally. Recently, Tufano et al (2015) investigated when code smells
are introduced by developers, and the circumstances and reasons behind their
introduction. They showed that most of the time code artifacts are affected by
smells since their creation and that developers introduce them not only when
implementing new features or enhancing existing ones, but sometimes also
during refactoring. A similar study was also conducted on test smells (Tufano
et al, 2016). Furthermore, Tufano et al (2017) also found that almost 80% of
the code smells are never removed from software systems, and the main cause
for their removal is the removal of the smelly artifact, rather than refactoring
activities. In a closely related field, Bavota et al (2012) and Palomba et al
(2016a) provided evidence that test smells are also widely diffused in test code
and impact the maintainability of JUnit test classes.
Historical information in general, and the evolution of code smells in
particular, have also been used in the past to identify components affected by code
smells. Ratiu et al (2004) proposed an approach to detect smells based on
evolutionary information of code components over their life-time. The aim is
to analyze the persistence of the problem and the effort spent to maintain
these components. Historical information has also been used by Lozano et al
(2007) to assess the impact of code smells on software maintenance. Gîrba et al
(2007) exploited formal concept analysis (FCA) to detect co-change patterns.
In other words, they identified code components that change in the same way
and at the same time. Palomba et al (2015b) use association rule discovery to
detect some code smell types, showing that the evolutionary-based approach
outperforms approaches based on static and dynamic analysis and could also
successfully complement them.
Our investigation about the diffuseness of code smells (RQ1) is closely
related to the empirical studies discussed above. However, our goal is to analyze
whether the results achieved in previous work hold on the set of software
systems used in this paper in order to (i) corroborate previous findings on a
much larger dataset (both in terms of number of software systems and code
smells), and (ii) understand the confidence level for the generalizability of the
results provided through the analysis of the impact of code smells on change-
and fault-proneness.
The main goal of this paper is to analyze the change- and fault-proneness of
classes affected (and not) by code smells. Such a relationship has already been
investigated by previous research. In particular, Khomh et al (2009a) showed
that the presence of code smells increases the code change proneness. Also,
they showed that code components affected by code smells are more fault-
prone than non-smelly components (Khomh et al, 2012). Our work confirms
the results achieved by Khomh et al (2012) on a larger set of code smells
and software systems, and provides some complementary hints about the
phenomenon. In particular, other than studying the change- and fault-proneness
of smelly and non-smelly classes, we analyzed how such indicators vary when
the smells identified are removed. Also, we use the SZZ algorithm (Sliwer-
ski et al, 2005) to better investigate the temporal relationship between the
presence of code smells and fault introduction.
Gatrell and Counsell (2015) conducted an empirical study aimed at quan-
tifying the effect of refactoring on class change- and fault-proneness. In partic-
ular, they monitored a commercial C# system for twelve months identifying
the refactorings applied during the first four months. They examined the same
classes for the second four months in order to determine whether the refactor-
ing results in a decrease of change- and fault-proneness. They also compared
such classes with the classes of the system that were not refactored in the
same period. Results revealed that classes subject to refactoring have a lower
change- and fault-proneness. It is worth noting that Gatrell and Counsell did
not focus their attention on well-known design problems (i.e., code smells),
but analyzed refactored classes regardless of the presence of a design
problem. Instead, our study investigates the actual impact of code smells on
change- and fault-proneness. Moreover, their study was conducted on a sin-
gle software system, while we analyzed 395 software releases of 30 software
systems.
Li and Shatnawi (2007) empirically evaluated the correlation between the
presence of code smells and the probability that the class contains errors. They
studied the post-release evolution process showing that many code smells are
positively correlated with class errors. Olbrich et al (2010) conducted a study
on the God Class and Brain Class code smells, reporting that classes affected
by these code smells were changed less frequently and had fewer defects than
non-smelly classes. D’Ambros et al (2010) also studied the correlation between
the Feature Envy and Shotgun Surgery smells and the defects in a system,
reporting no consistent correlation between them. In our empirical study, we
do not consider correlation between the presence of smells and the number of
defects, but we investigate the release history of software systems in order to
measure the actual change- and fault-proneness of classes affected (and not)
by design flaws.
Finally, Saboury et al (2017) conducted an empirical investigation on the
impact of code smells on the fault-proneness of JavaScript modules, confirming
the negative effect smells have on the maintainability of source code. Similarly
to our study, Saboury et al (2017) used the SZZ algorithm to identify which
bugs were introduced after the introduction of the smells.
Abbes et al (2011) studied the impact of two code smell types, i.e., Blob
and Spaghetti Code, on program comprehension. Their results show that the
presence of a code smell in a class does not have an important impact on
developers’ ability to comprehend the code. Instead, a combination of more
code smells affecting the same code components strongly decreases developers’
ability to deal with comprehension tasks.
The interaction between different smell instances affecting the same code
components has also been studied by Yamashita and Moonen (2013), who confirmed
that developers experience more difficulties when working on classes
affected by more than one code smell. The same authors also analyzed the
impact of code smells on maintainability characteristics (Yamashita and Moo-
nen, 2012). They identified which maintainability factors are reflected by code
smells and which ones are not, basing their results on (i) expert-based main-
tainability assessments, and (ii) observations and interviews with professional
developers.
Sjoberg et al (2013) investigated the impact of twelve code smells on the
maintainability of software systems. In particular, the authors conducted a
study with six industrial developers involved in three maintenance tasks on
four Java systems. The amount of time spent by each developer in performing
the required tasks was measured through an Eclipse plug-in, while a regression
analysis was used to measure the maintenance effort on source
code files having specific properties, including the number of smells affecting
them. The achieved results show that smells do not always constitute a prob-
lem, and that often class size impacts maintainability more than the presence
of smells.
Deligiannis et al (2004) also performed a controlled experiment showing
that the presence of the God Class smell negatively affects the maintainability of
source code. Also, the authors highlight an influence played by these smells in
the way developers apply the inheritance mechanism.
Recently, Palomba et al (2014) investigated how the developers perceive
code smells, showing that smells characterized by long and complex code are
those most often perceived by developers as design problems. In this paper we
provide a complementary contribution to the previous work by Palomba et al
(2014). Rather than looking at developers’ perception, this paper observes the
possible effect of smells in terms of change- and fault-proneness.
The goal of this study is to analyze the diffuseness of 13 code smell types in
real software applications and to assess their impact on code change- and fault-
proneness. It is worth remarking that the term “diffuseness”, when associated
to a code smell type, refers to the percentage of code components in a system
affected by at least one instance of the smell type. Analyzing the diffuseness of
code smells is a preliminary analysis needed to better interpret their effect on
change- and fault-proneness. Indeed, some smells might be highly correlated
with fault-proneness but rarely diffused in software projects or vice versa. The
13 code smell types considered in this study are listed in Table 1 together with
a short description.
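The definition of diffuseness given above can be illustrated with a minimal sketch. The data layout (a mapping from component name to the list of smell instances found in it) and all names below are hypothetical and are not taken from the study's tooling.

```python
def diffuseness(components_smells, smell_type):
    """Percentage of code components affected by at least one instance of smell_type.

    components_smells: dict mapping a component name to the list of smell
    instances detected in it (hypothetical layout, for illustration only).
    """
    if not components_smells:
        return 0.0
    affected = sum(
        1 for smells in components_smells.values() if smell_type in smells
    )
    return 100.0 * affected / len(components_smells)


# Hypothetical release with four components.
release = {
    "A": ["God Class", "Long Method"],
    "B": [],
    "C": ["Long Method", "Long Method"],  # two instances still count once
    "D": ["Lazy Class"],
}

long_method_pct = diffuseness(release, "Long Method")
```

Note that a component with several instances of the same smell type contributes only once, in line with the "at least one instance" wording of the definition.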
Table 3: The rules used by our tool to detect candidate code smells
Name | Description
CDSBP | A class having at least one public field.
Complex Class | A class having at least one method for which McCabe cyclomatic complexity is higher than 10.
Feature Envy | All methods having more calls with another class than the one they are implemented in.
God Class | All classes having (i) cohesion lower than the average of the system AND (ii) LOCs > 500.
Inappropriate Intimacy | All pairs of classes having a number of method’s calls between them higher than the average number of calls between all pairs of classes.
Lazy Class | All classes having LOCs lower than the first quartile of the distribution of LOCs for all system’s classes.
Long Method | All methods having LOCs higher than the average of the system.
LPL | All methods having a number of parameters higher than the average of the system.
Message Chain | All chains of methods’ calls longer than three.
Middle Man | All classes delegating more than half of the implemented methods.
Refused Bequest | All classes overriding more than half of the methods inherited by a superclass.
Spaghetti Code | A class implementing at least two long methods interacting between them through method calls or shared fields.
Speculative Generality | A class declared as abstract having less than three children classes using its methods.
a consensus on the detected code smells. To ensure high recall, our detection
tool uses very simple rules that overestimate the presence of code smells.
The rules for the 13 smell types considered in the study are reported in
Table 3 and are inspired by the rule cards proposed by Moha et al (2010) in
DECOR. The metrics’ thresholds used to discriminate whether a class/method
is affected or not by a smell are lower than the thresholds used by Moha et al
(2010). Again, this was done in order to detect as many code smell instances
as possible. For example, in the case of the Complex Class smell we considered
as candidates all the classes having a cyclomatic complexity higher than 10.
Such a choice was driven by recent findings reported by Lopez and Habra
(2015), who found that “a threshold lower than 10 is not significant in
Object-Oriented programming when interpreting the complexity of a method”. As for
the other smells we relied on (i) simple filters, e.g., in the cases of CDSBP
(where we discarded from the manual validation all the classes having no
public attributes) and Feature Envy (we only considered the methods having
more relationships toward another class than with the class they are contained
in), (ii) the analysis of the metrics’ distribution (like in the cases of Lazy Class,
Inappropriate Intimacy, Long Method, and Long Parameter List), or (iii) very
conservative thresholds (e.g., a God Class should not have less than 500 LOCs).
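A few of the rules in Table 3 can be sketched as follows. This is only an illustration under stated assumptions: the `ClassInfo` and `MethodInfo` records, the way cohesion is represented, and the quartile method used by `statistics.quantiles` are hypothetical choices, not a description of the authors' tool. Long Method is a method-level smell; here it is reported on the class that contains the long method.

```python
from dataclasses import dataclass, field
from statistics import mean, quantiles


@dataclass
class MethodInfo:  # hypothetical record of per-method metrics
    name: str
    loc: int
    cyclomatic: int


@dataclass
class ClassInfo:  # hypothetical record of per-class metrics
    name: str
    loc: int
    cohesion: float
    methods: list = field(default_factory=list)


def candidate_smells(classes):
    """Map each class name to the set of candidate smells flagged by simple rules."""
    avg_cohesion = mean(c.cohesion for c in classes)
    # First quartile of class LOCs (quartile method is an assumption).
    first_quartile_loc = quantiles([c.loc for c in classes], n=4)[0]
    all_methods = [m for c in classes for m in c.methods]
    avg_method_loc = mean(m.loc for m in all_methods) if all_methods else 0.0

    result = {}
    for c in classes:
        found = set()
        if c.cohesion < avg_cohesion and c.loc > 500:
            found.add("God Class")
        if c.loc < first_quartile_loc:
            found.add("Lazy Class")
        if any(m.cyclomatic > 10 for m in c.methods):
            found.add("Complex Class")
        if any(m.loc > avg_method_loc for m in c.methods):
            found.add("Long Method")
        result[c.name] = found
    return result


# Hypothetical system with four classes.
system = [
    ClassInfo("Parser", 900, 0.2, [MethodInfo("parse", 300, 25), MethodInfo("reset", 10, 1)]),
    ClassInfo("Token", 40, 0.9, [MethodInfo("text", 5, 1)]),
    ClassInfo("Lexer", 200, 0.6, [MethodInfo("next", 60, 8)]),
    ClassInfo("Util", 120, 0.7, [MethodInfo("join", 15, 2)]),
]
candidates = candidate_smells(system)
```

Because the rules are deliberately permissive, the output is a list of candidates to be inspected, not a final verdict on each class.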
We chose not to use existing detection tools (Marinescu, 2004; Khomh
et al, 2009b; Sahin et al, 2014; Tsantalis and Chatzigeorgiou, 2009; Moha
et al, 2010; Oliveto et al, 2010; Palomba et al, 2015a, 2016b) because (i) none
of them has ever been applied to detect all the studied code smells and (ii)
their detection rules are generally more restrictive to ensure a good compromise
between recall and precision and thus may miss some smell instances. To verify
this claim, we evaluated the behavior of three existing tools, i.e., DECOR
commits by mining regular expressions containing issue IDs in the change log
of the versioning system, e.g., “fixed issue #ID” or “issue ID”. Secondly, for
each issue ID related to a commit, we downloaded the corresponding issue re-
ports from their issue tracking system and extracted the following information
from them: (i) product name; (ii) issue type, i.e., whether an issue is a bug,
enhancement request, etc.; (iii) issue status, i.e., whether an issue was closed
or not; (iv) issue resolution, i.e., whether an issue was resolved by fixing it,
or whether it was a duplicate bug report, or a “works for me” case; (v) issue
opening date; (vi) issue closing date, if available.
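The linking of commits to issues described above can be sketched with a regular expression over commit messages. The pattern, the issue record layout, and the filter on type, status, and resolution are assumptions made for illustration; the exact expressions and field names used in the study are not given in the text.

```python
import re

# Hypothetical pattern covering messages such as "fixed issue #42" or "issue 42".
ISSUE_RE = re.compile(r"\bissue\s+#?(\d+)", re.IGNORECASE)


def linked_issue_ids(commit_message):
    """Return the sorted, de-duplicated issue IDs mentioned in a commit message."""
    return sorted({int(found) for found in ISSUE_RE.findall(commit_message)})


def is_fixed_bug(issue):
    """Assumed filter: keep issues that are bugs, closed, and resolved as fixed.

    `issue` is a dict with hypothetical keys 'type', 'status', and 'resolution'.
    """
    return (
        issue.get("type") == "bug"
        and issue.get("status") == "closed"
        and issue.get("resolution") == "fixed"
    )


ids = linked_issue_ids("Fixed issue #42 and issue 7 in the parser")
```

Duplicate reports and “works for me” cases would be rejected by `is_fixed_bug` because their resolution differs from "fixed".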
To estimate the date in which a bug was likely introduced³, we exploited the
SZZ algorithm⁴ (Sliwerski et al, 2005), which is based on the annotation/blame
feature of versioning systems. In summary, given a bug-fix identified by the
bug ID, $k$, the approach works as follows:
1. For each file $f_i$, $i = 1 \ldots m_k$, involved in the bug-fix $k$ ($m_k$ is the
number of files changed in the bug-fix $k$), and fixed in its revision
$\textit{rel-fix}_{i,k}$, we extract the file revision just before the bug fixing
($\textit{rel-fix}_{i,k} - 1$).
2. Starting from the revision $\textit{rel-fix}_{i,k} - 1$, for each source line in $f_i$
changed to fix the bug $k$ the blame feature of Git is used to identify the file
revision where the last change to that line occurred. In doing that, blank lines
and lines that only contain comments are identified using an island grammar
parser (Moonen, 2001). This produces, for each file $f_i$, a set of $n_{i,k}$
fix-inducing revisions $\textit{rel-bug}_{i,j,k}$, $j = 1 \ldots n_{i,k}$. Thus, more than
one commit can be indicated by the SZZ algorithm as responsible for inducing a bug.
By adopting the process described above we were able to approximate the
time periods in which each class was affected by one or more bugs. We excluded
from our analysis all the bugs occurring in a class $C_i$ before it became smelly.
Note that we also excluded bug-introducing changes that were recorded after
the bug was reported, since they represent false positives.
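The two SZZ steps and the exclusion of changes recorded after the bug report can be sketched on in-memory data. This is a simplified illustration, not the implementation used in the study: the blame output is assumed to be already available as a dictionary, a crude prefix check stands in for the island grammar parser, and treating blank and comment-only lines as skipped is an assumption.

```python
from datetime import datetime


def is_code_line(text):
    """Crude stand-in for the island grammar parser: not blank, not comment-only."""
    stripped = text.strip()
    return bool(stripped) and not stripped.startswith(("//", "/*", "*"))


def fix_inducing_commits(changed_lines, blame, commit_dates, report_date):
    """Return the set of commits flagged as fix-inducing for one bug-fix.

    changed_lines: {file path: line numbers changed by the fix, numbered as in
                    the revision just before the fix}
    blame:         {file path: {line number: (commit id, line text)}} for that
                    pre-fix revision (hypothetical pre-computed blame output)
    commit_dates:  {commit id: datetime of the commit}
    report_date:   datetime at which the bug was reported
    """
    inducing = set()
    for path, line_numbers in changed_lines.items():
        for number in line_numbers:
            commit, text = blame[path][number]
            if not is_code_line(text):
                continue  # blank or comment-only line (assumed to be skipped)
            if commit_dates[commit] > report_date:
                continue  # recorded after the report: treated as a false positive
            inducing.add(commit)
    return inducing


# Hypothetical bug-fix touching four lines of one file.
blame = {
    "A.java": {
        10: ("c1", "int x = 0;"),
        11: ("c2", "// temporary workaround"),
        12: ("c3", "x++;"),
        13: ("c4", ""),
    }
}
dates = {
    "c1": datetime(2010, 1, 1),
    "c2": datetime(2010, 1, 5),
    "c3": datetime(2010, 6, 1),
    "c4": datetime(2010, 1, 7),
}
found = fix_inducing_commits(
    {"A.java": [10, 11, 12, 13]}, blame, dates, datetime(2010, 3, 1)
)
```

In this example only `c1` survives: `c2` touches a comment, `c4` a blank line, and `c3` was committed after the bug was reported.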
It is worth noting that in the context of RQ2 we considered all the classes
of the analyzed systems: if a class was smelly in some releases and non-smelly
in other releases, it contributes to both sets of smelly and non-smelly classes.
Also, in this research question we did not discriminate the specific kind of
smell affecting a class (i.e., a class is considered smelly if it contains any kind
of code smell). A fine-grained analysis of the impact of the different smell
types on class change- and fault-proneness is presented in the next research
question.
3 The right terminology is “when the bug induced the fix” because of the intrinsic
limitations of the SZZ algorithm, which cannot precisely identify whether a change
actually introduced the bug.
4 SZZ stands for the last-name initials of the three authors of the algorithm.
In RQ3, we exploited the code smells’ oracle we built (i.e., the one reporting
the code smells affecting each class in each of the 395 considered releases) to
identify in which releases of each system a class was smelly or not smelly. Then,
we focused only on classes affected by at least one smell instance in at least
one of the analyzed software releases but not in all of them. In this way, we
could compare their change- and fault-proneness when they were affected and
not affected by smells. To effectively investigate the effect of smell removal
on maintainability, we considered each smell type in isolation, i.e., we took
into account only the classes affected by a single smell rather than considering
classes affected by more smells. For example, suppose that a class $C$ was first
affected by the God Class smell between releases $r_i$ and $r_{i+1}$. Then, the smell
was not detected between releases $r_{i+1}$ and $r_{i+2}$. Finally, the smell re-appeared
between releases $r_{i+2}$ and $r_{i+3}$. We computed the change-proneness of $C$ when
it was smelly by summing up the change-proneness of $C$ in the periods between
$r_i$ and $r_{i+1}$ and between $r_{i+2}$ and $r_{i+3}$. Similarly, we computed the
change-proneness of $C$ when it was non-smelly by computing the change-proneness
of $C$ in the period between $r_{i+1}$ and $r_{i+2}$. Following the same procedure, we
compared the fault-proneness of classes when they were affected and not by a
code smell. As done for RQ2, the comparison was performed by using boxplots
and statistical tests for significance (Mann-Whitney test) and effect size (Cliff’s
Delta).
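The per-class aggregation and the effect-size computation described above can be sketched as follows. The input layout (one `(is_smelly, n_changes)` pair per release interval) is hypothetical, and the magnitude thresholds (0.147, 0.33, 0.474) are commonly used conventions assumed here; they are not quoted from the text.

```python
def split_change_proneness(periods):
    """Sum the changes of a class over its smelly and non-smelly release intervals.

    periods: list of (is_smelly, n_changes) pairs, one per interval between
    consecutive releases (hypothetical layout).
    """
    smelly = sum(n for is_smelly, n in periods if is_smelly)
    non_smelly = sum(n for is_smelly, n in periods if not is_smelly)
    return smelly, non_smelly


def cliffs_delta(xs, ys):
    """Cliff's delta: (#pairs with x > y minus #pairs with x < y) / (|xs| * |ys|)."""
    greater = sum(1 for x in xs for y in ys if x > y)
    smaller = sum(1 for x in xs for y in ys if x < y)
    return (greater - smaller) / (len(xs) * len(ys))


def magnitude(delta):
    """Label an effect size using assumed, commonly used thresholds."""
    d = abs(delta)
    if d < 0.147:
        return "negligible"
    if d < 0.33:
        return "small"
    if d < 0.474:
        return "medium"
    return "large"


# Class C from the example: smelly, then non-smelly, then smelly again.
smelly_changes, clean_changes = split_change_proneness(
    [(True, 5), (False, 2), (True, 7)]
)
```

If SciPy is available, `scipy.stats.mannwhitneyu` could supply the significance test on the same two samples; it is left out here to keep the sketch dependency-free.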
Fig. 1 shows the boxplot reporting (i) the absolute number of code smell
instances, (ii) the percentage of affected code components (i.e., percentage of
affected classes/methods⁵), and (iii) the code smell density (i.e., number of
code smells per KLOC) affecting the software systems considered in our study.
For the sake of clarity, we aggregated the results considering all the systems as a
single dataset. Detailed results are reported in the appendix at the end of the
paper.
The boxplots highlight significant differences in the diffuseness of code
smells. The first thing that stands out is that code smells like Feature
Envy, Message Chain, and Middle Man are poorly diffused in the analyzed
systems. For instance, across the 395 system releases the highest number of
Feature Envy instances in a single release (a Xerces release) is 17, leading to
a percentage of affected methods of only 2.3%. We found instances of Feature
Envy in 50% of the analyzed 395 releases.
The Message Chain smell is also poorly diffused. It affects 13% of the
analyzed releases and in the most affected release (a release of HSQLDB) only
four out of the 427 classes (0.9%) are instances of this smell. It is worth noting
that in previous work Message Chain turned out to be the smell having the
highest correlation with fault-proneness (Khomh et al, 2012). Therefore, the
observed results indicate that, although the Message Chain smell is potentially
harmful, its diffusion is fairly limited.
Finally, the last poorly diffused code smell is the Middle Man. Only 30%
of the 395 analyzed releases are affected by this smell type and the high-
est number of instances of this smell type in a single release (a release of
Cassandra) is eight. In particular, the classes affected by the Middle Man
in Cassandra 0.6 were 8 out of 261 (3%). In this case, all identified Middle
Man instances affect classes belonging to the org.apache.cassandra.utils
package, grouping together classes delegating most of their work to classes
5 Depending on the code smell granularity, we report the percentage of affected classes or
methods.
[Fig. 1: boxplots, per code smell type (CDSBP, CC, FE, II, GC, LC, LM, LPL, MC, MM, RB, SC, SG), of the absolute number of instances, the percentage of affected code components, and the code smell density]
plex and long (on average, 259 LOC). For a similar reason, we found several
instances of Long Method in Eclipse Core. Indeed, it contains a high number
of classes implementing methods dealing with code parsing in the IDE. While
we cannot draw any clear conclusion based on the manual analysis of these two
systems, our feeling is that the inherent complexity of such parsing methods
makes it difficult for developers to (i) write the code in a more concise way
to avoid Long Method code smells, or (ii) remove the smell, for instance by
applying extract method refactoring.
Another quite diffused code smell is Spaghetti Code, which affects 83%
of the analyzed releases (327 out of 395), with the highest number of instances
(54) found in a JBoss release. Other diffused code smells are Speculative
Generality (80% of affected releases), Class Data Should Be Private (77%),
Inappropriate Intimacy (71%), and God Class (65%).
Interestingly, the three smallest systems considered in our study (i.e.,
Hibernate, jSL, and Sax) do not present any instance of code smell in any
of the 31 analyzed releases. This result might indicate that in small systems
software developers are generally better able to keep code quality under
control, thus avoiding the introduction of code smells. To further investigate
this point we computed the correlation between system size (in terms of
#Classes, #Methods, and LOCs) and the number of instances of each code
smell (see Table 4). As expected, some code smells have a positive correla-
tion with the size attributes, meaning that the larger the system the higher
the number of code smell instances in it. There are also several code smells
for which this correlation does not hold (i.e., Feature Envy, Inappropriate In-
timacy, Long Parameter List, Message Chain, and Middle Man). With the
exception of Long Parameter List, all these smells are related to “suspicious”
interactions between the classes of the system (e.g., the high coupling
represented by the Inappropriate Intimacy smell). It is reasonable to assume that
the interactions of such classes are independent of the system size and mainly
related to correct/wrong design decisions.
We also computed the code smell density as the number of smell instances
per KLOC in each of the 395 analyzed releases (see bottom part of Fig. 1). The
results confirm that the Long Method is the most diffused smell, having the
highest average density (i.e., 28 instances per KLOC). Also Refused Bequest
and Complex Class smells, i.e., the code smells having the highest percentage
of affected code components, are confirmed to be quite diffused in the studied
systems. All the other smells seem to have diffuseness trends similar to the
ones previously discussed.
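The density measure used in this analysis is a simple normalization, sketched below. The function name and the sample numbers are hypothetical and chosen only to show the unit conversion from LOC to KLOC.

```python
def smell_density(n_instances, loc):
    """Number of smell instances per KLOC (thousand lines of code)."""
    if loc <= 0:
        raise ValueError("loc must be positive")
    return n_instances / (loc / 1000.0)


# Hypothetical release: 56 instances of a smell in 2,000 lines of code.
density = smell_density(56, 2000)
```

Normalizing by size makes releases of very different sizes comparable, which the absolute number of instances alone does not allow.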
Table 5 classifies the studied code smells on the basis of their diffuseness in
the releases under study. The “% of affected releases” column reports
the percentage of analyzed releases in which we found at least one instance
of a specific smell type. For example, a smell like Long Method affects 84% of
releases, i.e., 395 × 0.84 ≈ 332 releases.
Summary for RQ1. Most of the analyzed smells are quite diffused, especially
the ones characterized by long and/or complex code (e.g., Long
Method, Complex Class). On the contrary, Feature Envy, Lazy Class, Mes-
sage Chain, and Middle Man are poorly diffused.
[Figure: boxplots of the change-proneness (# changes) of smelly and non-smelly classes]
affected by code smells (32) is almost three times higher with respect to the me-
dian change-proneness of the other classes (12). As an example, the Eclipse
class IndexAllProject affected by the Long Method smell (in its method
execute) was modified 77 times during the time period between the release
8 (2.1.3) and 9 (3.0), while the median value of changes for classes not af-
fected by any code smell is 12. Moreover, during the change history of the
system the number of lines of code of the method execute of this class varied
between 671 and 968 due to the addition of several features. The results of
the Mann-Whitney and Cliff tests highlight a statistically significant difference
in the change-proneness of classes affected and not affected by code smells
(p-value<0.001) with a large effect size (d=0.68).
Concerning the fault-proneness, the results also show important differences
between classes affected and not affected by code smells, even if such differences
are less marked than those observed for the change-proneness (see Fig. 3). The
median value of the number of bugs fixed on classes not affected by smells is
3 (third quartile=5), while the median for classes affected by code smells is
9 (third quartile=12). The results confirm what was already observed by Khomh
et al (2012). The observed difference is statistically significant (p-value<0.001)
with a medium effect size (d=0.41).
When considering only the bugs induced after the smell introduction, the
results still confirm previous findings. Indeed, as shown in Fig. 4, smelly classes
still have a much higher fault-proneness with respect to non-smelly classes. In
particular, the median value of the number of bugs fixed in non-smelly classes
is 2 (third quartile=5), while it is 9 for smelly classes (third quartile=12). The
difference is statistically significant (p-value<0.001) with a large effect size
(d=0.82).
[Fig. 3: boxplots of the fault-proneness (# defects) of non-smelly and smelly classes]
[Boxplots of the number of defects (# defects) for non-smelly and smelly classes]
Fig. 4: Fault-proneness of classes affected and not affected by code smells when
considering the bugs introduced after the smell introduction only.
This result can be explained by the findings reported in the work by Tufano
et al (2017), where the authors showed that most of the smells are introduced
during the very first commit involving the affected class (i.e., when the class
is added for the first time to the repository). As a consequence, most of the
bugs are introduced after the code smell appearance. This conclusion is also
supported by the fact that in our dataset only 21% of the bugs related to
smelly classes are introduced before the smell introduction.
While the analysis carried out until now clearly highlighted a trend in
terms of change- and fault- proneness of smelly and non-smelly classes, it is
20 Fabio Palomba et al.
important to note that a smelly class could be affected by one or more smells.
For this reason, we performed an additional analysis to verify how change- and
fault-proneness of classes very when considering classes affected by zero, one,
two, and three code smells. In our dataset there are no classes affected by
more than three smells in the same system release. Moreover, if a class was
affected by two code smells in release $r_{j-1}$ and by three code smells in release
$r_j$, its change- (fault-) proneness between releases $r_{j-1}$ and $r_j$ contributed to
the distribution representing the change- (fault-) proneness of classes affected
by two smells, while its change- (fault-) proneness between releases $r_j$ and $r_{j+1}$
contributed to the distribution representing the change- (fault-) proneness of
classes affected by three smells. Fig. 5 reports the change-proneness of the four
considered sets of classes, while Fig. 6 and Fig. 7 depict the results achieved
for fault-proneness.
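The assignment rule described above attributes each interval between two consecutive releases to the number of smells observed at the start of that interval. A minimal sketch of the rule follows; the function name and the input layout (one list of smell counts and one list of per-interval change counts for a single class) are assumptions made for illustration.

```python
from collections import defaultdict


def group_by_smell_count(smells_per_release, changes_per_interval):
    """Group per-interval change counts by the smell count at the interval start.

    smells_per_release: number of smells of one class in each release.
    changes_per_interval: changes between each release and the next one,
        so it holds one element fewer than smells_per_release.
    """
    if len(changes_per_interval) != len(smells_per_release) - 1:
        raise ValueError("expected one change count per pair of consecutive releases")
    groups = defaultdict(list)
    # zip stops at the shorter list, so the last release (no following
    # interval) contributes nothing.
    for smells, changes in zip(smells_per_release, changes_per_interval):
        groups[smells].append(changes)
    return dict(groups)


# Hypothetical class with two smells in one release and three in the next two.
distributions = group_by_smell_count([2, 3, 3], [10, 25])
```

In this hypothetical case the first interval (10 changes) feeds the two-smell distribution and the second (25 changes) feeds the three-smell distribution. The same grouping applies to bug-fix counts.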
In terms of change-proneness, the trend depicted in Fig. 5 shows that the
higher the number of smells affecting a class the higher its change-proneness. In
particular, the median number of changes goes from 12 for non-smelly classes,
to 22 for classes affected by one smell (+83%), 32 for classes affected by two
smells (+167%), and up to 54 for classes affected by three smells (+350%).
Table 6 reports the results of the Mann-Whitney test and of the Cliff’s delta
obtained when comparing the change-proneness of these four categories of
classes. Since we performed multiple tests, we adjusted our p-values using the
Holm’s correction procedure (Holm, 1979). This procedure sorts the p-values
resulting from n tests in ascending order, multiplying the smallest by n, the
next by n − 1, and so on.
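The adjustment just described can be sketched in a few lines. The version below also enforces that adjusted values never decrease along the sorted order and caps them at 1; both steps are part of the usual formulation of Holm's procedure but are not spelled out in the text above, and the function name is an assumption.

```python
def holm_adjust(p_values):
    """Holm step-down adjustment; returns adjusted p-values in input order."""
    n = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Smallest p-value is multiplied by n, the next by n - 1, and so on.
        candidate = (n - rank) * p_values[idx]
        # Keep adjusted values monotone and never above 1.
        running_max = max(running_max, candidate)
        adjusted[idx] = min(1.0, running_max)
    return adjusted


# Hypothetical raw p-values from three pairwise comparisons.
adjusted = holm_adjust([0.01, 0.04, 0.03])
```

For these hypothetical inputs the adjusted values are approximately 0.03, 0.06, and 0.06: the largest raw p-value inherits the adjusted value of the one ranked before it.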
The achieved results show that (i) classes affected by a lower number of
code smells always exhibit a statistically significant lower change-proneness
than classes affected by a higher number of code smells and (ii) the effect
size is always large with the only exception of the comparison between classes
affected by one smell and classes affected by two smells, for which the effect
size is medium.
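For readers unfamiliar with the effect size used here, Cliff's delta is the probability that a value from one group exceeds a value from the other, minus the reverse probability. The sketch below computes it by direct pair counting; the magnitude thresholds (0.147, 0.33, 0.474) are the conventional ones and are assumed here rather than taken from this section.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs (x, y)."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x in xs for y in ys if x > y)
    smaller = sum(1 for x in xs for y in ys if x < y)
    return (greater - smaller) / (len(xs) * len(ys))


def magnitude(delta):
    """Label |delta| using conventional thresholds (assumed, see lead-in)."""
    d = abs(delta)
    if d < 0.147:
        return "negligible"
    if d < 0.33:
        return "small"
    if d < 0.474:
        return "medium"
    return "large"


# Hypothetical change counts for classes with two smells and with one smell.
delta = cliffs_delta([30, 35, 40], [20, 22, 31])
```

Pair counting is quadratic in the sample sizes, which is acceptable for a sketch but may be slow on distributions with many thousands of observations.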
Similar observations hold for fault-proneness.
Fig. 6 depicts the boxplots reporting the fault-proneness of classes affected by
zero, one, two, and three code smells. When increasing the number of code
smells, the median fault-proneness of the classes grows from 3 for the non-
smelly classes up to 12 (+300%) for the classes affected by three code smells.
The results of the statistical analysis reported in Table 7 confirm the signif-
icant difference in the fault-proneness of classes affected by a different number
of code smells, with a large effect size in most of the comparisons.
Previous findings are also confirmed when looking at the boxplots of Fig.
7, which refers to the analysis of the fault-proneness performed considering
only the bugs introduced after the smell introduction. Indeed, the higher the
number of code smells affecting a class the higher its fault-proneness. The
significant differences are also confirmed by the statistical tests reported in
Table 8.
4.3 Change- and fault-proneness of classes when code smells are introduced
and removed (RQ3)
For each considered code smell type, Fig. 8 shows a pair of boxplots reporting
the change-proneness of the same set of classes during the time period in which
they were affected (S in Fig. 8) and not affected (NS in Fig. 8) by that specific
code smell.
In all pairs of boxplots a recurring pattern can be observed: when the classes
are affected by the code smell they generally have a higher change-proneness
than when they are not affected. This result holds for all code smells but
Middle Man (MM), Lazy Class (LC), Feature Envy (FE), and Class Data
Should Be Private (CDSBP).
For classes affected by a God Class (GC) smell we can observe an increase
of +283% of the change-proneness median value (46 vs 12). The case of the
Base64 class belonging to the Elastic Search system is particularly represen-
tative: when affected by the God Class smell, the developers modified it 87
times on average (the average is computed across the 5 releases in which this
class was smelly); instead, when the class was not affected by the code smell,
the developers modified it only 10 times on average (the class was not smelly
in 3 releases).
Similar results can be observed for the Complex Class (CC) smell: the
median change-proneness of classes is 55 in the time period in which they are
affected by this smell, while it is 34 when they are non-smelly. For example,
when the Scanner class of the Eclipse Core project was affected by this smell,
it was modified 95 times on average (across the 18 releases in which the class
was smelly), as opposed to the 27 changes observed on average across the 11
releases in which it was not smelly.
The discussion is similar for code smells related to errors in the
application of object-oriented principles. For example, for classes affected
by Refused Bequest (RB) the median change-proneness goes from 43 (in the
presence of the smell) down to 26 (in the absence of the smell). The case of the
class ScriptWriterBase of the HSQLDB project is particularly interesting.
On average this class was involved in 52 changes during the time period in
which it was affected by RB (13 releases), while the average number of changes
decreased to 9 during the time period in which it was not smelly (4 releases).
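The per-class comparisons above average the number of changes over the releases in which a class was smelly and over those in which it was not. A minimal sketch of that computation, together with the percentage increase used when comparing medians, is given below; the function names and the sample history are hypothetical.

```python
from statistics import mean


def mean_changes_by_state(history):
    """Average changes in smelly and non-smelly releases of one class.

    history: list of (is_smelly, n_changes) pairs, one per release.
    Returns (mean when smelly, mean when not smelly); None for an empty side.
    """
    smelly = [changes for is_smelly, changes in history if is_smelly]
    clean = [changes for is_smelly, changes in history if not is_smelly]
    return (mean(smelly) if smelly else None, mean(clean) if clean else None)


def relative_increase(with_smell, without_smell):
    """Percentage increase of with_smell over without_smell."""
    return (with_smell - without_smell) / without_smell * 100


# Hypothetical class: smelly in two releases, not smelly in two others.
averages = mean_changes_by_state([(True, 80), (True, 100), (False, 12), (False, 8)])
```

With the God Class medians reported above, `relative_increase(46, 12)` evaluates to roughly 283, which matches the +283% figure.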
It is also interesting to understand why some code smells reduce the change-
proneness. For the Lazy Class smell this result is quite expected. Indeed, by
definition this smell arises when a class has a small size, few methods, and low
complexity, and is rarely used by other classes; in other words, as stated by
Fowler, “the class isn’t doing enough to pay for itself” (Fowler, 1999). Removing
this smell could mean increasing the usefulness of the class, for example
by implementing new features in it. This likely increases the class change-
proneness. Also, the removal of a Middle Man (a class delegating most of its
responsibilities) is expected to increase the change-proneness of classes, since
the non-smelly class will implement (without delegation) a set of responsi-
bilities that are likely to be maintained by developers, thus triggering new
changes.
Results of the fault-proneness are shown in Fig. 9. Here, the differences be-
tween the time periods the classes are affected and not by code smells are less
evident, but still present, especially for Refused Bequest (RB), Inappropriate
Intimacy (II), God Class (GC), and Feature Envy (FE). The most interesting
case is the FE, for which we observed that the fault-proneness increases by
a factor of 8 when this code smell affects the classes. A representative example
is the method internalGetRowKeyAtOrBefore of the class
Memcache of the project Apache HBase. This method did not present faults
when it was not affected by any smell (i.e., the method was not affected by
smells in 4 releases of the system). However, when the method became
too coupled with the class HStoreKey, it was affected by up to 7 faults. This
growth is due to the increasing coupling of the method with
the class HStoreKey. Indeed, an HBase developer commented on the evolution
of this method in the issue tracker⁶: “Here’s a go at it. The logic is much more
complicated, though it shouldn’t be too impossible to follow”.
For all other smells we did not observe any strong difference in the fault-
proneness of the classes when comparing the time periods during which they
were affected and not affected by code smells. While this result might seem
6 https://issues.apache.org/jira/browse/HBASE-514
between the first and the third quartile of the distribution, i.e., medium
size; and (iii) the group composed by the smelly classes having a size larger
than the third quartile of the distribution of the size of the classes, i.e.,
large size;
2. we applied the same strategy for grouping small, medium, and large non-
smelly classes; and
3. we computed the change- and the fault-proneness for each class belonging
to the six groups, in order to investigate whether smelly classes are more
change- and fault-prone regardless of their size.
The obtained results are consistent with those discussed above. The interested
reader can find them in our online appendix (Palomba et al, 2017).
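The size-based grouping can be sketched as follows. Classes below the first quartile are labeled small, those above the third quartile large, and the rest medium. The treatment of values that fall exactly on a quartile and the choice of the inclusive quantile method are assumptions, as is the function name; the function would be applied once to the smelly classes and once to the non-smelly ones.

```python
from statistics import quantiles


def size_groups(sizes):
    """Map each class name to 'small', 'medium', or 'large' by size quartiles.

    sizes: dict of class name -> size (e.g., lines of code), at least two entries.
    """
    q1, _, q3 = quantiles(sizes.values(), n=4, method="inclusive")
    groups = {}
    for name, size in sizes.items():
        if size < q1:
            groups[name] = "small"
        elif size > q3:
            groups[name] = "large"
        else:
            # Values on a quartile boundary are treated as medium (assumption).
            groups[name] = "medium"
    return groups


# Hypothetical class sizes.
groups = size_groups({"A": 10, "B": 20, "C": 30, "D": 40, "E": 50})
```

Comparing smelly and non-smelly classes within the same size group is what allows size to be ruled out as the sole explanation of the observed differences.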
Summary for RQ3. While the class change-proneness can benefit from
code smell removal, the presence of code smells in many cases is not neces-
sarily the direct cause of the class fault-proneness, but rather a co-occurring
phenomenon.
5 Threats to Validity
This section discusses the threats that might affect the validity of our study.
The main threats related to the relationship between theory and observa-
tion (construct validity) are due to imprecisions/errors in the measurements
we performed. Above all, we relied on a tool we built and made publicly avail-
able in our online appendix (Palomba et al, 2017) to detect candidate code
smell instances. Our tool exploits conservative detection rules aimed at en-
suring high recall at the expense of low precision. Then, two of the authors
manually validated the identified code smells to discard false positives. Still,
we cannot exclude the presence of false positives/negatives in our dataset.
We assessed the change- and fault-proneness of a class $C_i$ in a release $r_j$
as the number of changes and the number of bug fixes $C_i$ was subject to in
the time period $t$ between the $r_j$ and the $r_{j+1}$ release dates. This implies that
the length of t could play a role in the change- and fault-proneness of classes
(i.e., the longer t the higher the class change- and fault-proneness). However,
it is worth noting that:
1. This holds for both smelly and non-smelly classes, thus reducing the bias
of t as a confounding factor.
2. To mitigate such a threat we completely re-ran our analyses by considering
a normalized version of class change- and fault-proneness. In particular, we
computed the change-proneness of a class $C_i$ in a release $r_j$ as:
\[
\mathit{change\_proneness}(C_i, r_j) = \frac{\#\mathit{Changes}(C_i)_{r_{j-1} \rightarrow r_j}}{\#\mathit{Changes}(r_{j-1} \rightarrow r_j)}
\]
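where $\#\mathit{Changes}(C_i)_{r_{j-1} \rightarrow r_j}$ is the number of changes performed to $C_i$ by
developers during the evolution of the system between the $r_{j-1}$'s and the
$r_j$'s release dates, and $\#\mathit{Changes}(r_{j-1} \rightarrow r_j)$ is the total number of
changes performed on the whole system during the same time period. Similarly,
we computed the fault-proneness of a class $C_i$ in a release $r_j$ as:
\[
\mathit{fault\_proneness}(C_i, r_j) = \frac{\mathit{NOBF}(C_i)_{r_{j-1} \rightarrow r_j}}{\mathit{NOBF}(r_{j-1} \rightarrow r_j)}
\]
where $\mathit{NOBF}(C_i)_{r_{j-1} \rightarrow r_j}$ is the number of bug fixing activities performed
on $C_i$ by developers between the $r_{j-1}$'s and the $r_j$'s release dates and
$\mathit{NOBF}(r_{j-1} \rightarrow r_j)$ is the total number of bugs fixed in the whole system
during the same time period.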
The achieved results are reported in our online appendix (Palomba et al,
2017) and are consistent with those reported in Section 4.
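The normalization divides the activity observed on a class by the activity observed on the whole system in the same interval, so that longer intervals do not inflate the measure. A minimal sketch follows; the function name and the handling of intervals with no system-wide activity are assumptions.

```python
def normalized_proneness(class_count, system_count):
    """Share of the system-wide activity in an interval that touched one class.

    class_count: changes (or bug fixes) to the class between two releases.
    system_count: changes (or bug fixes) to the whole system in the same interval.
    """
    if class_count > system_count:
        raise ValueError("a class cannot exceed the system-wide count")
    if system_count == 0:
        # Assumption: an interval without any activity yields zero proneness.
        return 0.0
    return class_count / system_count


# Hypothetical interval: 12 of the 480 system-wide changes touched the class.
ratio = normalized_proneness(12, 480)
```

The same function covers both measures: pass change counts for the normalized change-proneness and bug-fix counts for the normalized fault-proneness.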
In addition, we cannot exclude imprecisions in the measurement of the
fault-proneness of classes due to misclassification of issues (e.g., an enhance-
ment classified as a bug) in the issue-tracking systems (Antoniol et al, 2008).
At least, the systems we consider use an explicit classification of bugs, distin-
guishing them from other issues.
We relied on the SZZ algorithm (Sliwerski et al, 2005) to investigate whether
there is a temporal relationship between the occurrence of a code smell and
a bug induction. We are aware that such an algorithm only gives a rough ap-
proximation of the set of commits inducing a fix, because (i) the line-based
differencing of git has intrinsic limitations, and (ii) in some cases a bug can be
fixed without modifying the lines inducing it, e.g., by adding a workaround or
in general changing the control-flow elsewhere.
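The core idea of SZZ, and the second limitation mentioned above, can be illustrated with a simplified in-memory sketch. It is not the implementation used in the study: the blame information is passed in as a plain dictionary rather than read from git, commits made after the bug report are simply discarded, and all names are hypothetical.

```python
def szz_candidates(fix_touched_lines, blame, commit_times, report_time):
    """Return candidate bug-inducing commits for one bug fix.

    fix_touched_lines: line numbers (pre-fix revision) deleted or modified by the fix.
    blame: dict line number -> id of the last commit that changed that line.
    commit_times: dict commit id -> comparable timestamp.
    report_time: timestamp at which the bug was reported.
    """
    candidates = set()
    for line in fix_touched_lines:
        commit = blame.get(line)
        # A commit made after the report cannot have induced the reported bug.
        if commit is not None and commit_times[commit] < report_time:
            candidates.add(commit)
    return candidates


# Hypothetical history: line 10 last changed by c1, line 11 by c2.
blame = {10: "c1", 11: "c2"}
commit_times = {"c1": 100, "c2": 300}
inducing = szz_candidates([10, 11], blame, commit_times, report_time=200)
workaround = szz_candidates([], blame, commit_times, report_time=200)
```

Here `inducing` contains only `c1`, since `c2` postdates the report. The `workaround` call returns an empty set: a fix that only adds lines leaves nothing to trace back, which is exactly the case in which the approximation misses the inducing commit.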
The main threats related to the relationship between the treatment and
the outcome (conclusion validity) might be represented by the analysis method
exploited in our study. We discussed our results by presenting descriptive
statistics and using proper non-parametric correlation tests (p-values were
properly adjusted when multiple comparisons were performed by applying the
Holm’s correction procedure previously described). In addition, the practical
relevance of the differences observed in terms of change- and fault-proneness
is highlighted by effect size measures.
Threats to internal validity concern factors that could influence our observations.
The fact that code smells disappear may or may not be related to
refactoring activities that occurred between the observed releases. In other words,
other changes might have produced such effects. We are aware that we cannot
claim a direct cause-effect relation between the presence of code smells and
fault- and change-proneness of classes, which can be influenced by several other
factors. In particular, our observations may be influenced by the different de-
velopment phases encountered over the change history as well as by developer-
related factors (e.g., experience and workload). Also, we acknowledge that such
measures could simply reflect the “importance” of classes in the analyzed sys-
tems and in particular their central role in the software evolution process. For
example, we expect classes controlling the business logic of a system to also be
the ones more frequently modified by developers (high change-proneness) and
This paper reported a large study conducted on 395 releases of 30 Java open
source projects, aimed at understanding the diffuseness of code smells in Java
open source projects and their relation with source code change- and fault-
proneness. The study considered 17,350 instances of 13 different code smell
types, firstly detected using a metric-based approach and then manually vali-
dated.
The results highlighted the following findings:
– Diffuseness of smells. The most diffused smells are the ones related to size
and complexity such as Long Method, Spaghetti Code, and to some extent
Complex Class or God Class. This seems to suggest that a simple metric-based
monitoring of code quality could already give enough indications
about the presence of poor design decisions or in general of poor code
quality. Smells not related to size like Message Chains and Lazy Class
are less diffused, although there are also cases of such smells with high
diffuseness, see for example Class Data Should Be Private and Speculative
Generality.
– Relation with change- and fault-proneness. Generally speaking, our
results confirm the results of the previous study by Khomh et al (2012),
i.e., classes affected by code smells tend to be more change- and fault-prone
than others, and this is even more evident when classes are affected by
multiple smells. At the same time, if we analyze the fault-proneness results
for specific types of smells, we can also notice that high fault-proneness is
particularly evident for smells such as Message Chain that are not highly
diffused.
we plan to further analyze other factors influencing the change- and fault-
proneness of classes.
References
DOI 10.1109/TSE.2017.2653105
Vaucher S, Khomh F, Moha N, Gueheneuc YG (2009) Tracking design smells:
Lessons from a study of god classes. In: Proceedings of the 2009 16th Work-
ing Conference on Reverse Engineering (WCRE’09), pp 145–158
Yamashita AF, Moonen L (2012) Do code smells reflect important maintain-
ability aspects? In: 28th IEEE International Conference on Software Main-
tenance, ICSM 2012, Trento, Italy, September 23-28, 2012, pp 306–315
Yamashita AF, Moonen L (2013) Exploring the impact of inter-smell relations
on software maintainability: an empirical study. In: 35th International Con-
ference on Software Engineering, ICSE ’13, San Francisco, CA, USA, May
18-26, 2013, pp 682–691
Appendix
Table 11 shows the diffuseness of the analyzed code smells in the subject
systems.
Table 11: Code smell diffuseness in the subject systems: min-max instances in the systems’ releases.
System, CDSBP, Complex Class, Feature Envy, God Class, Inappropriate Intimacy, Lazy Class, Long Method, LPL, Message Chain, Middle Man, Refused Bequest, Spaghetti Code, Speculative Generality
ArgoUML 5-19 (0.6-1.3) 4-10 (0.5-0.7) 1-2 (0.1-0.1) 2-6 (0.2-0.4) 0-8 (0.0-0.5) 0-0 (0.0-0.0) 18-30 (2.3-2.1) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 1-2 (0.1-0.1) 0-4 (0.0-0.3) 14-21 (1.8-1.5) 12-29 (1.5-2.0)
Ant 1-7 (1.2-0.9) 0-4 (0.0-0.5) 0-6 (0.0-0.9) 0-6 (0.0-0.7) 2-22 (2.4-2.7) 0-0 (0.0-0.0) 6-38 (7.2-4.6) 0-17 (0.0-2.0) 0-3 (0.0-0.4) 0-2 (0.0-0.2) 0-14 (0.0-1.7) 4-30 (5.0-3.7) 1-4 (1.2-0.5)
aTunes 3-12 (2.0-1.8) 0-1 (0.0-0.2) 0-6 (0.0-0.9) 0-1 (0.0-0.2) 0-8 (0.0-1.2) 1-9 (0.7-1.4) 6-31 (4.3-3.2) 0-11 (0.0-1.7) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 1-8 (0.7-1.3) 6-21 (4.3-3.4) 0-10 (0.0-1.6)
Cassandra 5-14 (1.6-2.4) 0-4 (0.0-0.6) 0-2 (0.0-0.3) 0-2 (0.0-0.4) 2-24 (0.0-4.0) 0-1 (0.0-0.2) 3-22 (1.0-4.0) 0-16 (0.0-2.7) 0-2 (0.0-0.3) 2-8 (0.7-1.4) 0-2 (0.0-0.3) 0-6 (0.0-0.9) 0-16 (0.0-2.9)
Derby 20-40 (1.3-1.0) 21-25 (1.5-1.3) 1-1 (0.6-0.5) 20-26 (1.3-1.4) 0-0 (0.0-0.0) 1-1 (0.4-0.5) 176-212 (0.8-0.8) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 2-2 (0.1-0.1) 10-19 (0.7-0.9) 12-16 (0.8-0.8) 19-26 (1.3-1.4)
Eclipse Core 15-32 (2.1-2.7) 8-35 (1.1-2.9) 0-6 (0.0-0.5) 3-15 (0.4-1.3) 0-16 (0.0-1.4) 2-17 (0.2-1.4) 36-180 (4.8-15.2) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 2-2 (0.3-0.2) 7-31 (0.9-2.6) 12-25 (1.6-2.1) 4-15 (0.5-1.8)
Elastic Search 3-5 (0.2-0.2) 0-5 (0.0-0.2) 0-0 (0.0-0.0) 1-3 (0.1-0.1) 4-4 (0.2-0.1) 4-7 (0.2-0.3) 11-27 (0.7-1.9) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-3 (0.0-0.1) 3-8 (0.2-0.4) 3-9 (0.2-0.4)
FreeMind 0-5 (0.0-0.9) 0-6 (0.0-1.2) 0-1 (0.0-0.2) 0-2 (0.0-0.4) 0-6 (0.0-0.7) 0-3 (0.0-0.6) 0-13 (0.0-2.6) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-1 (0.0-0.1) 0-3 (0.0-0.6) 0-9 (0.0-1.9) 0-2 (0.0-0.4)
Hadoop 0-3 (0.0-0.1) 0-2 (0.0-0.1) 0-4 (0.0-0.1) 0-2 (0.0-0.1) 2-10 (1.6-0.1) 3-9 (2.3-0.1) 5-17 (3.8-0.1) 0-12 (0.0-0.1) 0-0 (0.0-0.1) 0-1 (0.0-0.1) 0-0 (0.0-0.1) 6-7 (4.6-0.1) 3-5 (2.5-0.1)
HSQLDB 0-7 (0.0-1.5) 0-5 (0.0-1.1) 0-3 (0.0-0.7) 0-11 (0.0-1.5) 8-24 (14.8-5.4) 0-9 (0.0-2.1) 10-124 (1.4-14.1) 0-13 (0.0-2.9) 0-4 (0.0-0.9) 0-0 (0.0-0.0) 0-13 (0.0-2.8) 2-29 (3.7-6.5) 0-3 (0.0-0.7)
Hbase 5-16 (3.1-2.3) 2-7 (1.2-1.0) 1-9 (0.7-1.3) 1-8 (0.7-1.1) 2-14 (1.2-2.0) 2-21 (1.4-3.1) 12-42 (7.5-6.4) 3-45 (1.9-6.5) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-22 (0.0-3.2) 3-5 (1.8-0.7) 2-10 (1.3-1.4)
Hibernate 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0)
Hive 3-7 (0.7-0.6) 1-3 (0.1-0.3) 4-11 (0.9-0.9) 0-2 (0.0-0.2) 0-8 (0.0-0.7) 1-1 (0.1-0.1) 11-83 (2.7-7.4) 14-77 (3.4-6.9) 0-1 (0.0-0.1) 0-1 (0.0-0.1) 4-6 (0.9-0.5) 2-4 (0.5-0.4) 6-31 (1.5-2.8)
Incubating 12-16 (4.8-5.0) 6-6 (2.5-1.9) 3-10 (1.1-3.1) 6-6 (2.8-1.7) 12-30 (4.8-9.5) 1-4 (0.4-1.3) 89-110 (3.5-3.5) 27-35 (10.8-11.1) 0-0 (0.0-0.0) 0-1 (0.0-0.3) 10-17 (4.0-5.3) 6-8 (2.4-2.5) 6-8 (2.3-2.5)
Ivy 1-1 (0.4-0.3) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-2 (0.0-0.6) 0-6 (0.0-1.7) 0-0 (0.0-0.0) 2-22 (0.7-6.3) 13-21 (4.6-6.0) 0-0 (0.0-0.0) 1-1 (0.3-0.3) 0-0 (0.0-0.0) 1-4 (0.4-1.2) 4-5 (1.4-1.4)
Lucene 39-47 (2.2-2.0) 3-5 (0.2-0.2) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 8-14 (0.5-0.6) 0-10 (0.0-0.5) 61-74 (3.5-3.2) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 7-9 (0.4-0.4) 10-16 (0.6-0.7) 29-38 (1.6-1.7)
JEdit 0-7 (0.0-1.3) 4-21 (1.8-4.0) 0-2 (0.0-0.4) 0-6 (0.0-1.2) 0-8 (0.0-1.5) 0-0 (0.0-0.0) 8-33 (3.5-6.4) 0-9 (0.0-1.7) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-3 (0.0-0.6) 3-18 (1.3-3.5) 4-14 (1.8-2.7)
JHotDraw 0-0 (0.0-0.0) 0-4 (0.0-0.6) 0-0 (0.0-0.0) 0-2 (0.0-0.3) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0)
JFreeChart 0-9 (0.0-1.2) 0-3 (0.0-0.4) 0-0 (0.0-0.0) 0-9 (0.0-1.4) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 2-63 (2.3-8.1) 8-67 (9.3-8.6) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 1-3 (1.1-0.4) 2-5 (2.3-0.6)
JBoss 18-65 (0.8-1.4) 9-23 (0.4-0.5) 0-1 (0.0-0.1) 1-16 (0.1-0.4) 0-4 (0.0-0.1) 0-6 (0.1-0.0) 45-135 (1.9-2.8) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-3 (0.0-0.1) 24-55 (1.0-1.1) 23-54 (1.0-1.1) 23-65 (1.0-1.4)
JVlt 0-2 (0.0-0.9) 0-0 (0.0-0.0) 0-4 (0.0-1.8) 0-1 (0.0-0.0) 2-4 (1.2-0.5) 1-5 (0.7-2.2) 5-7 (3.0-3.1) 0-2 (0.0-0.9) 0-0 (0.0-0.0) 0-1 (0.0-0.5) 1-3 (0.7-1.4) 3-4 (1.8-1.8) 1-3 (0.7-1.3)
jSL 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0)
Karaf 0-2 (0.0-0.4) 0-0 (0.0-0.0) 1-1 (0.4-0.2) 0-2 (0.0-0.3) 0-2 (0.0-0.3) 0-0 (0.0-0.0) 1-5 (0.5-1.1) 0-2 (0.0-0.3) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 5-9 (2.0-1.9) 2-2 (2.1-0.4)
Nutch 3-12 (1.6-4.6) 0-0 (0.0-0.0) 1-5 (0.5-1.9) 0-0 (0.0-0.0) 4-12 (2.1-4.9) 1-9 (0.6-3.5) 7-17 (3.8-6.6) 0-16 (0.0-6.1) 0-0 (0.0-0.0) 0-6 (0.0-6.3) 0-2 (0.0-0.7) 0-4 (0.0-1.5) 0-3 (0.0-1.2)
Pig 3-5 (1.2-0.5) 0-7 (0.0-0.8) 0-1 (0.0-0.12) 0-3 (0.0-0.3) 4-10 (1.6-1.1) 0-1 (0.0-0.1) 0-43 (0.0-4.7) 0-3 (0.0-0.3) 0-1 (0.0-0.1) 0-2 (0.0-0.2) 0-13 (0.0-1.4) 0-7 (0.0-0.8) 4-20 (1.7-2.2)
Qpid 11-18 (1.1-1.9) 4-10 (0.4-1.0) 0-2 (0.0-0.2) 4-6 (0.6-0.7) 4-10 (0.4-1.1) 0-1 (0.0-0.1) 21-33 (2.1-3.6) 22-39 (2.2-4.2) 0-1 (0.0-0.1) 0-1 (0.0-0.1) 1-8 (0.1-0.9) 3-8 (0.4-0.8) 32-38 (3.3-4.1)
Sax 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0)
Struts 7-12 (1.1-1.1) 0-4 (0.0-0.4) 0-1 (0.0-0.1) 0-2 (0.0-0.3) 6-12 (0.9-1.0) 0-0 (0.0-0.0) 3-14 (0.5-1.4) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 1-2 (0.2-0.2) 3-9 (0.5-0.9) 0-6 (0.0-0.6)
Wicket 0-0 (0.0-0.0) 2-2 (0.3-0.2) 0-0 (0.0-0.0) 4-4 (0.5-0.5) 4-6 (0.5-0.7) 0-0 (0.0-0.0) 4-4 (0.5-0.5) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 0-0 (0.0-0.0) 3-7 (0.4-0.8) 11-11 (1.4-1.3) 18-18 (2.7-2.2)
Xerces 10-42 (6.0-5.7) 4-10 (2.4-1.4) 0-17 (0.0-2.3) 5-11 (3.0-1.5) 2-34 (1.2-4.6) 0-4 (0.0-0.5) 48-123 (2.7-6.9) 4-29 (2.5-4.0) 2-3 (1.3-0.4) 0-0 (0.0-0.0) 0-15 (0.0-2.0) 4-9 (2.4-1.3) 2-11 (1.2-1.5)
Overall 0-65 (0.0-5.0) 0-35 (0.0-4.0) 0-17 (0.0-3.1) 0-26 (0.0-1.7) 0-34 (0.0-9.5) 0-21 (0.0-3.5) 0-212 (0.0-15.2) 0-77 (0.0-11.0) 0-4 (0.0-0.9) 0-8 (0.0-6.3) 0-55 (0.0-3.2) 0-54 (0.0-6.5) 0-65 (0.0-4.1)