2023 - Dependent or Not - Detecting and Understanding Collections of Refactorings
6, JUNE 2023
Authorized licensed use limited to: Institute of Software. Downloaded on June 07,2024 at 06:51:20 UTC from IEEE Xplore. Restrictions apply.
FERREIRA et al.: DETECTING AND UNDERSTANDING COLLECTIONS OF REFACTORINGS 3345
recommendations of JDeodorant [9] where, similar to existing automated refactoring tools, the dependencies between the refactorings are not revealed, thus leaving the challenging task of interpreting the sequence of refactoring recommendations to the programmers.

Knowing the dependencies between refactorings is critical when developers select which refactorings to apply, as it helps them understand how the refactorings in a sequence depend on one another and what the impact is of changing some of the recommended refactorings. The integration of refactoring dependencies in recommender tools is still lacking in existing research.

To close this gap, in this paper we describe a theory for reasoning about collections of refactorings through a definition of ordering dependencies among refactorings and an algorithm for identifying these dependencies. We aim to improve the accuracy of refactoring recommendation tools by detecting refactoring dependencies, which allows developers to interact efficiently with such tools. We propose defining refactoring recommendations as sets of refactoring graphs rather than as refactoring sequences. We illustrate these concepts with a tool for visualizing refactoring dependencies and refactoring graphs.

Refactorings, when formalized, have clear pre-conditions defining the circumstances in which they can be applied and post-conditions defining the effects of applying them. For instance, one of the pre-conditions of a Move Method refactoring is that the method exists in the class from which it will be moved, and one of its post-conditions is that the method must exist in the target class afterward. Therefore, a refactoring dependency exists when any post-condition of one refactoring matches any pre-condition of another refactoring (e.g., the method exists in the relevant class). These linked refactorings can then be organized into groups based upon their dependencies. We represent these groups as directed acyclic graphs, where the nodes are the refactorings and the edges are the dependencies. This approach offers three main benefits: 1) developers can quickly and intuitively understand the dependencies among refactorings that constrain recommendations (e.g., which refactorings must be done together); 2) developers can more easily compare recommendations by focusing on essential, rather than accidental, differences; and 3) tool builders can easily integrate new features to detect invalid refactorings and improve their recommendation algorithms.

We validated our proposed theory based on 1,457,873 refactorings recommended for 9,595 Java projects publicly available on GitHub. We considered the 14 types of refactorings that are most commonly used in practice [10], [11]. We also developed a web tool, DPRef, that implements the proposed ordering dependency detection algorithm. It transforms a refactoring sequence recommended by existing refactoring tools [9], [12] into refactoring graphs based on the detected dependencies. We conducted experiments to evaluate the correctness of detected dependencies, discover what portion of refactorings in recommendations are actually dependent rather than independent, and estimate the potential impact of dependent refactorings on several quality attribute metrics. Finally, we conducted human validation with 27 developers to manually evaluate the correctness of the detected dependencies and their relevance.

Our implemented algorithm achieved 100% correctness in detecting all dependencies between refactorings and identifying invalid refactorings. Furthermore, our findings demonstrate that 43% of the 1,457,873 recommended refactorings are part of dependent refactoring graphs. This finding confirms that refactorings are commonly involved in dependent relations and cannot be applied truly independently. In addition, dependent refactorings improve all six QMOOD quality attribute metrics [13] in our experiments more than independent refactorings do. The manual validation of the refactorings by 27 developers shows that all the identified dependencies are correct for a sample of 233 refactorings after applying them directly on the code of 61 open source projects based on the order proposed by DPRef. The post-study survey with the developers confirmed the relevance of detecting the dependencies to help them understand the sequence of recommended refactorings.

We also provide a replication package¹ that includes the refactoring dependency detection tool and the necessary data for our large-scale validation. The replication package will enable researchers and tool builders to integrate the refactoring dependency feature into existing refactoring recommendation and detection tools and further investigate the relationships among refactorings.

¹ https://sites.google.com/umich.edu/refactoring-dependency/home

The rest of the paper is organized as follows: Section II discusses the related work and presents a motivating example; Section III provides our definitions of refactoring dependencies and an algorithm to detect them; Section IV presents and discusses the obtained results; Section V highlights the threats to validity; Section VI summarizes our research agenda; and Section VII concludes.

II. RELATED WORK AND MOTIVATING EXAMPLE

Catalogs like Fowler’s [4] have identified many types of refactorings (e.g., Move Method, Extract Class, Pull Up Field), each of which is a semantics-preserving code transformation that improves code structure. Developers routinely apply such refactorings in their day-to-day work, with modern development environments providing limited support for applying selected refactorings as directed by a developer. In this section, we survey two categories of refactoring research: recommendation tools and refactoring dependencies. We then summarize the challenges addressed in this paper through a motivating example.

A. Related Work

1) Refactoring Recommendations: There has been both industry and research interest in developing automated and semi-automated refactoring tools to support developers [14]. One representative example is JDeodorant, the tool proposed by
3346 IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. 49, NO. 6, JUNE 2023
TABLE I
REFACTORING TYPES AND THEIR PRE- AND POST-CONDITION RULES
Tsantalis and Chatzigeorgiou [9]. JDeodorant and similar recommendation tools [11], [15], [16], [17], [18], [19] generate recommendations as sequences of refactoring instances. The experiments described in this paper take this form of refactoring recommendation as input. Thus, our discussion in this section focuses on this category of studies. We point the interested reader to the survey by Bavota et al. [20] for an overview of approaches supporting code refactoring recommendations. In another refactoring recommendation tool, O’Keeffe and Cinnéide [21] formulate refactoring tasks as a multi-objective search problem to generate alternative designs by applying a sequence of refactoring operations. Such a search is guided by a quality evaluation function based on eleven object-oriented design metrics that reflect refactoring goals. Harman and Tratt [17] were the first to introduce the concept of Pareto optimality to search-based refactoring. They used it to combine two metrics, namely CBO (Coupling Between Objects) and SDMPC (Standard Deviation of Methods Per Class), into a fitness function and showed its superior performance as compared to a mono-objective technique [17]. The two aforementioned studies [17], [21] paved the way for several search-based approaches aimed at recommending refactorings [12], [15], [22], [23], [24], [25].

A representative example of these search-based refactoring techniques is the work by Ouni et al. [12], who propose a multi-criteria code refactoring approach aimed at optimizing contrasting objectives: (i) minimizing the number of code smells; (ii) minimizing the refactoring cost (i.e., the number of recommended refactorings); (iii) preserving the design semantics (that is, considering textual information embedded in code identifiers and comments in the refactoring recommendation); and (iv) maximizing the consistency with code changes performed over the system’s change history. In this study, we use the refactoring recommendations generated by this tool based on 1) its superior performance compared to the state of the art [12]; 2) the large number of supported refactoring types; and 3) its public availability. The contribution of this paper is not generating refactoring recommendations. Any refactoring recommendation approach can be used to generate the refactorings (the input of our proposed approach) if it supports some or all of the refactoring types summarized in Table I.

2) Refactoring Dependencies: Chavez et al. [26] investigated how refactoring types affect five quality attributes based on the version history of twenty-three open source projects. They found that 94% of refactorings are applied to code with at least one low quality attribute value, with 65% of refactorings improving attributes and 35% of all refactorings being neutral on the system. Similarly, Cinnéide et al. [8] studied the impact of individual refactorings on quality attributes, such as using
Move Method to reduce the coupling of a class. None of these studies considered the impact of a sequence of refactorings on the quality attributes.

Murphy-Hill et al. [7] investigated refactoring tool usage by sampling developers’ code and manually checking whether their refactorings were performed with tool support, and looked at 240,000 tool-assisted refactorings to examine assumptions about how developers, in general, refactor code. They ultimately concluded that refactoring tools are rarely used by developers in practice, with 90% of refactorings being performed manually, and that 40% of refactorings occur in batches.

Bibiano et al. [6] analyzed batch refactoring characteristics and their effects on code smells in open and closed source projects and concluded that 57% of batches/patterns are simple compositions of only two types of refactorings. They highlight the lack of tool support to automatically detect refactoring dependencies as a barrier. However, this study is based on the assumption that refactorings are only related if applied to the same code location, which often is not the case for types of refactorings that modify multiple code fragments.

Mens et al. [27], [28] define and detect mutual exclusions, sequential dependencies, and asymmetric conflicts between refactorings. These studies analyze dependencies at the model level, working with UML, and they use graph transformation techniques to detect invalid refactorings. The detection of conflicts between refactorings at the model level (UML) is based on a manually defined set of rules (a matrix whose rows and columns are model refactoring types). The types of refactorings are different from, and simplified compared to, code-level ones. Furthermore, the authors were looking for mutually exclusive UML refactorings rather than detecting dependencies.

Liu et al. [29], [30] propose a conflict-aware scheduling approach, which schedules refactorings according to the conflict matrix of refactorings and the effects of each individual refactoring using a multi-objective optimisation model. In this work, the authors focused on identifying the best schedule to apply refactorings, where the conflicts are defined based on which code smells are to be fixed. Thus, the same refactorings fixing different code smells or applied in the same locations are grouped together. In our work, the notion of dependencies is defined differently than in Liu et al.: it concerns the conflicts between the refactorings themselves rather than their goals.

Sousa et al. [31] identify and analyze composite refactorings within and across commits from the commit history of 48 GitHub software projects. The concept of dependency in that work differs from ours: there, a dependency is about grouping the refactorings applied within the same commits/locations.

Overall, existing studies do not provide a rigorous definition of ordering dependencies among refactorings. They mainly define what might be better considered similarity relations, such as a collection of refactorings that have similar effects (fixing a code smell) or similar context (applied by the same developer or to the same code location) [32], [33].

B. Motivating Example

The key to applying refactorings successfully is the decision of which refactorings to apply and where to apply them. In essence, this requires a developer to instantiate a refactoring by supplying the parameters that allow a type of refactoring to be unambiguously applied to code. For example, to instantiate a Move Method, a developer must supply parameters that indicate which method to move and where to move it. Throughout this paper, when we talk about refactorings and dependencies among refactorings, we are talking about refactoring instances.

While refactoring recommendations generated by tools to mimic this activity are typically represented as sequences, not all orderings in these sequences are significant. That is, the same code could be generated by two solutions that contain the same refactorings but simply apply them in a different order. This is because, while many refactorings are independent of one another, other refactorings are dependent on each other such that removing a refactoring from a solution, or reordering it, could make other refactorings invalid. These refactoring recommendation tools typically do not intend to imply a strict ordering of dependencies among elements of the sequence. The sequence is simply a concise means to communicate multiple steps. In that sense, many tools that report a sequence of refactorings today may communicate an unintended meaning (strict sequential order) that this work can help clarify for users of such tools. With the current growth of interactive tools to support refactoring [11], [18], developers are offered solutions that contain dozens to hundreds of refactorings and the option to selectively apply elements of a solution. Without a theory for reasoning about refactoring dependencies, developers can inadvertently make decisions (e.g., ignoring part of a solution) that result in failure (code cannot be successfully refactored) and enter a tedious trial-and-error loop. Making refactoring dependencies visible improves developers’ understanding of how refactorings work together and allows them to make sound inferences regarding their application.

Fig. 2 shows a simplified example of a solution composed of 5 refactorings to be applied to the JFreeChart project. Three of the refactorings (#3, #4, #5) depend on another refactoring (#2) because the Extract Super Class refactoring (#2) creates a new class (Class_7), on which refactorings #3, #4, and #5 operate. If the new class is not created first, then refactorings #3, #4, and #5 will fail. Thus, there exists an ordering dependency from each of #3, #4, #5 to #2. Refactoring #1, however, has distinct parameters, indicating that it operates on different code elements; thus it has no ordering dependencies on any others in this solution. Presenting these dependencies to a developer clarifies the options that the developer has. For example, the developer could choose not to apply any refactoring except for refactoring #2 without consequences; if the developer chooses not to apply refactoring #2, then refactorings #3, #4, and #5 cannot be applied either. Detecting ordering dependency relationships among refactorings is essential to applying refactorings more effectively.
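The Fig. 2 scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not DPRef's implementation: the dictionary `DEPENDS_ON` and the helpers `applicable` and `is_valid_order` are hypothetical names, and the edges are transcribed from the description of Fig. 2 (#3, #4, and #5 each depend on #2; #1 depends on nothing).

```python
# Ordering dependencies as described for Fig. 2: each key can only be applied
# after every refactoring in its value set has been applied.
# (Hypothetical encoding; the integers are the refactoring numbers in Fig. 2.)
DEPENDS_ON = {1: set(), 2: set(), 3: {2}, 4: {2}, 5: {2}}


def applicable(selected):
    """Return the refactorings in `selected` whose prerequisites are also applicable."""
    selected = set(selected)
    valid = set()
    changed = True
    while changed:  # iterate to a fixed point so chains of prerequisites are honored
        changed = False
        for rf in selected - valid:
            if DEPENDS_ON[rf] <= valid:
                valid.add(rf)
                changed = True
    return valid


def is_valid_order(order):
    """Check that every refactoring appears after all of its prerequisites."""
    seen = set()
    for rf in order:
        if not DEPENDS_ON[rf] <= seen:
            return False
        seen.add(rf)
    return True
```

Under these assumptions, `applicable({1, 3, 4, 5})` keeps only #1 because #2 was dropped, while both `[1, 2, 3, 4, 5]` and `[2, 5, 4, 3, 1]` pass `is_valid_order`. This mirrors the point that only part of the order in a recommended sequence is significant.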
III. REFACTORING DEPENDENCY THEORY

The refactoring dependency theory for reasoning about collections of refactorings is built upon two concepts. The first is the definition of an ordering dependency relation among refactorings in a collection of refactorings. Pre- and post-conditions for refactoring types, i.e., a set of predicates associated with each refactoring that reflect changes, are used to detect refactoring dependencies. The second is the organization of a collection of refactorings as a set of refactoring graphs. Together, these concepts improve our ability to understand the meaning of collections of refactorings, allowable operations on them, and their composition in practice.

In this section we describe the elements of our proposed theory, the algorithm for detecting refactoring dependencies, and an associated web tool that implements this detection algorithm.

A. Definitions

Our proposed dependency relation captures an ordering dependency between pairs of refactoring instances. Specifically, an ordering dependency (rf2 → rf1) between two refactoring instances (rf1 and rf2) exists when rf2 can only be successfully applied after rf1 has been applied. That is, rf1 makes a change to code that is necessary in order to apply rf2. This condition can be evaluated based on the combination of pre- and post-conditions of the types of refactorings involved and the parameters of each refactoring instance. For example, to apply Move Method (a type of refactoring) to move method m1 from class c1 to class c2 (m1, c1, and c2 being the parameters of the refactoring instance), several pre-conditions must hold (e.g., m1, c1, and c2 must all exist and m1 must be defined on c1). The pre- and post-conditions of each type of refactoring will be described in the next sub-section.

Building on this ordering dependency definition, we organize collections of refactorings as sets of refactoring graphs rather than as sequences of refactorings. A refactoring graph is a weakly connected directed acyclic graph composed of refactoring instance vertices and ordering dependency edges. Using the ordering dependencies as the basis for forming refactoring graphs (Algorithm 1) results in a set of graphs with the following traits:
• Each refactoring instance is an element of exactly one refactoring graph.
• Some graphs contain a single refactoring instance because that refactoring is truly independent of all others. We call these trivial graphs, each comprising a single node for one refactoring instance.
• The remaining graphs contain multiple refactoring instances, each of which is part of one or more dependencies. We call these non-trivial graphs.
• Each refactoring graph is independent of every other graph in the solution.

A refactoring recommendation typically comprises a collection of compatible refactorings, and as such positive dependencies are more relevant to common use cases. The idea of negative refactorings would be more applicable if a recommendation contained mutually exclusive advice (e.g., three Move Method refactorings that move the same method to three different locations). This is not the common use case, but this work could be easily adapted to it. Identifying a refactoring that precludes (or invalidates) another could be performed using the same pre- and post-conditions, but with a modification to check for differences rather than commonalities (e.g., refactoring #1’s post-condition moves the location of a method to class A and refactoring #2’s pre-condition requires that same method to reside in class B). It may require additional work to consider the initial state of a program, but the same principles would likely apply. existing tools that do not use refactoring dependencies.

In this paper, we plan to use refactoring dependencies to present recommendations in terms of refactoring graphs that clearly convey when an order is required and when it is not. Most recommendations based on this work will not imply a strict sequential order for all refactorings. For instance, consider the non-trivial graph shown in Fig. 2. In an approach based on refactoring dependencies, #4 and #5 can only be applied after #2, but no ordering is implied between #4 and #5. The sequential order is critical when the refactorings are dependent on each other.
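The two concepts above can be sketched together in Python. This is a minimal illustration under stated assumptions and is neither the paper's Algorithm 1 nor the rules of Table I: the `Refactoring` class, the tuple encoding of predicates, the functions `detect_dependencies` and `refactoring_graphs`, and the refactoring types assigned to #3, #4, and #5 are all hypothetical. The sketch also ignores the initial program state and simply treats a matching post-condition/pre-condition pair as an ordering dependency.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Refactoring:
    """A refactoring instance with hypothetical pre-/post-condition predicates."""
    rid: int
    kind: str
    pre: frozenset   # predicates that must hold before applying
    post: frozenset  # predicates that hold after applying


def detect_dependencies(refactorings):
    """Return edges (rf2, rf1) meaning rf2 can only be applied after rf1.

    Simplified rule: an edge exists when some post-condition of rf1
    matches some pre-condition of rf2.
    """
    edges = set()
    for rf1 in refactorings:
        for rf2 in refactorings:
            if rf1.rid != rf2.rid and rf1.post & rf2.pre:
                edges.add((rf2.rid, rf1.rid))
    return edges


def refactoring_graphs(refactorings, edges):
    """Group refactoring IDs into weakly connected components via union-find."""
    parent = {rf.rid: rf.rid for rf in refactorings}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:  # edge direction is ignored for weak connectivity
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for rid in parent:
        groups[find(rid)].add(rid)
    return sorted(groups.values(), key=min)


# Toy solution modeled on the Fig. 2 description; all predicates are illustrative.
solution = [
    Refactoring(1, "Move Method", frozenset({("exists", "Class_1")}),
                frozenset({("method_in", "m", "Class_2")})),
    Refactoring(2, "Extract Super Class", frozenset({("exists", "Class_3")}),
                frozenset({("exists", "Class_7")})),
    Refactoring(3, "Pull Up Field", frozenset({("exists", "Class_7")}),
                frozenset({("field_in", "f", "Class_7")})),
    Refactoring(4, "Pull Up Method", frozenset({("exists", "Class_7")}),
                frozenset({("method_in", "g", "Class_7")})),
    Refactoring(5, "Pull Up Method", frozenset({("exists", "Class_7")}),
                frozenset({("method_in", "h", "Class_7")})),
]

edges = detect_dependencies(solution)
graphs = refactoring_graphs(solution, edges)
trivial = [g for g in graphs if len(g) == 1]
non_trivial = [g for g in graphs if len(g) > 1]
```

With this toy input, the detected edges point from #3, #4, and #5 to #2, and the solution splits into one trivial graph containing #1 and one non-trivial graph containing #2 through #5. Union-find is used here only because weak connectivity ignores edge direction; any connected-components routine would serve equally well.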
TABLE II
QMOOD QUALITY METRICS
totaling 233 refactorings for 5 open source projects²,³,⁴,⁵,⁶ that contained at least 5K LOC and involved significant refactorings in the last 2 years. The graphs are sequences of refactorings, and some of them are large. During the sampling process, we used the following criteria to avoid any bias in the refactorings selected for the manual validation: refactoring types, project size, project domain, and locations of the refactorings (files).

² https://github.com/phunware/maas-ads-android-sdk
³ https://github.com/solita/query-utils
⁴ https://github.com/forge/roaster
⁵ https://github.com/goobi/goobi-ugh
⁶ https://github.com/kongchen/swagger-maven-example

The participants were asked to use our tool to identify refactoring dependencies, assess the correctness of those dependencies, and apply and compile the refactorings. First, developers checked that the refactorings were applicable and compiled properly without producing any conflicts. Each refactoring that caused the code to fail to compile was deemed invalid and was discarded. Then, among all the applicable and compilable refactorings, they identified those for which the set of dependencies was both correct and complete. When checking correctness, developers were asked to evaluate each refactoring independently to determine whether each identified dependency was necessary and whether any dependencies were missing, in order to evaluate the accuracy of both tail-related dependencies and those not connected to the tail. The participants looked at the generated dependency graph of the refactorings using our visual representation in the DPRef tool. Then, they reviewed the code before and after applying any selected refactorings. In case of doubt, the participants could select refactorings from the graph and have them automatically executed on the code. Conflicting refactorings may generate errors in the code, which may confirm missing dependencies. In this case, the refactoring is considered invalid in this exercise. Otherwise, a refactoring for which the set of dependencies was both correct and complete is considered a valid refactoring. We define a manual correctness score (MC) as

$$\mathit{MC} = \frac{\#\ \text{of Valid Refactorings}}{\#\ \text{of Evaluated Refactorings}}. \tag{2}$$

To answer RQ2, we calculated the number of dependencies (edges) and graphs (trivial and non-trivial) for all projects. We also counted the number of refactorings in non-trivial graphs and the most frequently occurring refactoring types in them, as well as the Non-Trivial Rate (NTR) defined as follows:

$$\mathit{NTR} = \frac{\#\ \text{of Refactorings in Non-trivial Graphs}}{\#\ \text{of Refactorings}}. \tag{3}$$

These evaluation metrics allow us to understand the extent of refactoring dependencies. Furthermore, we can evaluate the refactoring types that are less commonly applied in isolation and also understand the complexity of the non-trivial graphs based on their sizes. To answer RQ3, we consider all the trivial and non-trivial graphs to evaluate their impact on the quality attributes and the design metrics from QMOOD. We compared the number of graphs that improve the quality attributes and design metrics from QMOOD [13] for both trivial and non-trivial graphs. We also considered the rates of improvement, in percentage, for all graphs, taking into consideration the reusability, flexibility, understandability, functionality, extendibility, and effectiveness quality attributes captured by the QMOOD metrics listed in Table II, as well as basic metrics such as coupling and cohesion. We also calculated a Total Quality Index (TQI), aggregating all the metrics, after normalization, with equal weights into one metric.

These evaluation metrics are useful to understand the impact of collections versus individual refactorings on improving quality, and which quality attributes are more likely to be significantly improved using non-trivial graphs or independent refactorings. We also note that differences observed between non-trivial and trivial graphs could be the result of the number of refactoring instances rather than of dependency.

B. Experimental Settings

We considered a total of 9,595 open-source Java projects provided by [35] to address the above research questions. The selection process limited consideration to projects with ≥ 5K LOC and at least 2 collaborators. We also eliminated any duplicate (cloned) projects from consideration. We applied these criteria to the list of one million GitHub projects. We performed this selection process in an attempt to eliminate small projects, such as student projects and small hobby/learner programs, that were not likely to be good candidates for refactoring.

Table III shows the minimum, average, and maximum for the number of collaborators, code size (in LOC), number of classes, and number of recommended refactorings generated. The list of subject projects is also available in the replication package, along with all results (e.g., refactorings, quality metrics, refactoring graphs).

TABLE III
STATISTICS OF THE SUBJECT PROJECTS

The total number of refactorings collected from recommendations for these 9,595 projects is almost 1.5 million (1,457,873 refactorings). We used the parameter settings recommended by the authors of the refactoring recommendation tool [12]: Single Point crossover with probability = 0.7, Bit Flip mutation with probability = 0.4, and a stopping criterion of 100,000 evaluations. We also set the initial population size to 100 and utilized a tournament selection operator with n=2. The minimum and maximum numbers of refactorings per solution are limited to 150 and 200, respectively.

For the manual validation of the refactoring dependencies, we recruited 27 full-time developers from our networks, each of
TABLE IV
SELECTED PARTICIPANTS
Fig. 11. The number of graphs that improved the quality metrics.
can be explained independently, the cognitive burden on a developer is much lower (e.g., contrast with determining which refactorings scattered across a sequence of dozens or hundreds of refactorings are related).
• comparability: search-based refactoring recommendation tools typically generate multiple recommendations on a Pareto front, leaving developers to choose one. Identifying common elements of different recommendations is simplified by comparing sets of graphs that do not contain the spurious orderings found in sequence representations.
• search efficiency: search-based refactoring recommendation tools that use genetic algorithms gain new options. Specifically, crossover operations can be more reliable (reducing failures) when using dependency analysis; graphs may also be better genomes for crossover than individual refactorings.

Consequently, there are several directions for future work:

Refactoring Pattern Extraction. One important implication of the proposed refactoring dependency theory is the ability to extract common refactoring patterns by mining software repositories using tools such as RefMiner [16]. These patterns are the common non-trivial graphs that can be extracted from different commits/pull requests of the same project or of multiple projects. Such patterns of non-trivial graphs can be linked to refactoring opportunities, such as repeatedly resolving different types of code smells. In the future, we plan to use the refactoring dependencies to understand the common refactoring patterns in the history of commits and pull requests of software repositories using existing refactoring detection tools such as RefMiner.

Refactoring Collaborations Between Developers. Studying the collaborations among multiple developers when refactoring code is a promising next step. Refactoring graphs extracted from commit histories can be linked to the authors of those commits. Then, a graph of collaborations among developers can be generated based on the dependencies among the applied refactorings. This can lead to new insights into why and when developers collaborate on refactoring.

Change Operator in Search-Based Refactoring. Random selection and application of crossover and mutation when evolving a population of solutions is a challenge in search-based refactoring. Refactoring dependency analysis can be used to avoid destroying good patterns in refactoring solutions and make change operators more intelligent, which can lead to better solutions and faster convergence.

Interactive Refactoring Tool Support. Developers can more

recommendation/detection tools treat refactorings in isolation. In this paper, we proposed a definition for ordering dependencies among refactorings and an algorithm for detecting these dependencies. We also proposed defining refactoring recommendations as sets of refactoring graphs rather than as refactoring sequences and illustrated these concepts with a tool for visualizing refactoring dependencies and sets of refactoring graphs. We elaborated our research agenda for future work in Section VI.

We validated the proposed approach on 1,457,873 refactorings recommended for 9,595 projects. Our results show that the proposed approach achieved 100% correctness in detecting all dependencies among refactorings. Furthermore, we found that 43% of the 1,457,873 recommended refactorings are part of dependent refactoring graphs, which confirms that refactorings are commonly involved in dependent relations and cannot be applied truly independently. These concepts advance a theory for reasoning about refactorings collectively, rather than individually, and offer clear benefits to developers applying refactoring recommendations (explainability and comparability) and to authors of tools for recommending refactorings (search efficiency and improved correctness of recommendations).

ACKNOWLEDGMENTS

Copyright 2021 IEEE. References herein to any specific commercial product, process, or service by trade name, trade mark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by Carnegie Mellon University or its Software Engineering Institute. DM21-0546

REFERENCES

[1] E. Tom, A. Aurum, and R. Vidgen, “An exploration of technical debt,” J. Syst. Softw., vol. 86, no. 6, pp. 1498–1516, 2013.
[2] M. Kuutila, M. Mäntylä, U. Farooq, and M. Claes, “Time pressure in software engineering: A systematic review,” Inf. Softw. Technol., vol. 121, 2020, Art. no. 106257.
[3] S. A. Slaughter, D. E. Harter, and M. S. Krishnan, “Evaluating the cost of software quality,” Commun. ACM, vol. 41, no. 8, pp. 67–73, 1998.
[4] M. Fowler, Refactoring: Improving the Design of Existing Code. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 1999.
[5] G. Bavota, A. D. Lucia, M. D. Penta, R. Oliveto, and F. Palomba, “An experimental investigation on the innate relationship between quality and refactoring,” J. Syst. Softw., vol. 107, pp. 1–14, 2015.
[6] A. C. Bibiano et al., “A quantitative study on characteristics and effect of batch refactoring on code smells,” in Proc. IEEE/ACM 13th Int. Symp. Empir. Softw. Eng. Meas., Brazil, 2019, pp. 1–11.
easily understand the implications of selecting which refactor- [7] E. Murphy-Hill, C. Parnin, and A. P. Black, “How we refactor, and how we
know it,” IEEE Trans. Softw. Eng., vol. 38, no. 1, pp. 5–18, Jan./Feb. 2012.
ings from a recommendation to apply, improving the interactive [8] M. Ó. Cinnéide, L. Tratt, M. Harman, S. Counsell, and I. H. Moghadam,
process and increasing their confidence in the recommendation “Experimental assessment of software metrics using automated refactor-
tool. The only restriction in applying non-trivial refactoring ing,” in Proc. IEEE-ACM Int. Symp. Empir. Softw. Eng. Meas., Lund,
Sweden, 2012, pp. 49–58.
graphs is that a refactoring can only be applied if every other [9] N. Tsantalis and A. Chatzigeorgiou, “Identification of move method refac-
refactoring that it depends on (transitively) is also applied. Thus, toring opportunities,” IEEE Trans. Softw. Eng., vol. 35, no. 3, pp. 347–367,
invalid refactorings can be detected and highlighted on the fly. May/Jun. 2009.
[10] M. Kim, T. Zimmermann, and N. Nagappan, “An empirical study of
refactoring challenges and benefits at Microsoft,” IEEE Trans. Softw. Eng.,
vol. 40, no. 7, pp. 633–649, Jul. 2014.
VII. CONCLUSION [11] V. Alizadeh, M. Kessentini, W. Mkaouer, M. O. Cinnéide, A. Ouni, and
Y. Cai, “An interactive and dynamic search-based approach to software
Although manually applying a collection of refactorings is refactoring recommendations,” IEEE Trans. Softw. Eng., vol. 46, no. 9,
common practice, existing empirical studies and refactoring pp. 932–961, Sep. 2020.
[12] A. Ouni, M. Kessentini, H. Sahraoui, K. Inoue, and K. Deb, “Multi-criteria code refactoring using search-based software engineering: An industrial case study,” ACM Trans. Softw. Eng. Methodol., vol. 25, no. 3, 2016, Art. no. 23.
[13] J. Bansiya and C. G. Davis, “A hierarchical model for object-oriented design quality assessment,” IEEE Trans. Softw. Eng., vol. 28, no. 1, pp. 4–17, Jan. 2002.
[14] C. Abid, V. Alizadeh, M. Kessentini, T. D. N. F. Ferreira, and D. Dig, “30 years of software refactoring research: A systematic literature review,” 2020, arXiv:2007.02194.
[15] M. W. Mkaouer, M. Kessentini, S. Bechikh, K. Deb, and M. O. Cinnéide, “Recommendation system for software refactoring using innovization and interactive dynamic optimization,” in Proc. IEEE-ACM 29th Int. Conf. Autom. Softw. Eng., Vasteras, Sweden, 2014, pp. 331–336.
[16] N. Tsantalis, M. Mansouri, L. M. Eshkevari, D. Mazinanian, and D. Dig, “Accurate and efficient refactoring detection in commit history,” in Proc. ACM 40th Int. Conf. Softw. Eng., Gothenburg, Sweden, 2018, pp. 483–494.
[17] M. Harman and L. Tratt, “Pareto optimal search based refactoring at the design level,” in Proc. 9th ACM Annu. Conf. Genet. Evol. Comput., London, England, 2007, pp. 1106–1113.
[18] Y. Lin, X. Peng, Y. Cai, D. Dig, D. Zheng, and W. Zhao, “Interactive and guided architectural refactoring with search-based recommendation,” in Proc. ACM SIGSOFT Int. Symp. Found. Softw. Eng., Seattle, USA, 2016, pp. 535–546.
[19] T. Sharma and D. Spinellis, “A survey on software smells,” J. Syst. Softw., vol. 138, pp. 158–173, 2018.
[20] G. Bavota, A. D. Lucia, A. Marcus, and R. Oliveto, “Recommending refactoring operations in large software systems,” in Recommendation Systems in Software Engineering, Berlin, Germany: Springer, 2014, pp. 387–419.
[21] M. O’Keeffe and M. O. Cinnéide, “A stochastic approach to automated design improvement,” in Proc. ACM 2nd Int. Conf. Princ. Pract. Program. Java, Kilkenny City, Ireland, 2003, pp. 59–62.
[22] O. Seng, J. Stammel, and D. Burkhart, “Search-based determination of refactorings for improving the class structure of object-oriented systems,” in Proc. ACM 8th Annu. Conf. Genet. Evol. Comput., Seattle, USA, 2006, pp. 1909–1916.
[23] M. Kessentini, W. Kessentini, H. Sahraoui, M. Boukadoum, and A. Ouni, “Design defects detection and correction by example,” in Proc. IEEE 19th Int. Conf. Prog. Comprehension, Kingston, Canada, 2011, pp. 81–90.
[24] A. Ouni, M. Kessentini, and H. Sahraoui, “Search-based refactoring using recorded code changes,” in Proc. IEEE 17th Eur. Conf. Softw. Maintenance Reengineering, Genova, Italy, 2013, pp. 221–230.
[25] W. Mkaouer et al., “Many-objective software remodularization using NSGA-III,” ACM Trans. Softw. Eng. Methodol., vol. 24, no. 3, pp. 17:1–17:45, 2015.
[26] A. Chávez, I. Ferreira, E. Fernandes, D. Cedrim, and A. Garcia, “How does refactoring affect internal quality attributes? A multi-project study,” in Proc. ACM 31st Braz. Symp. Softw. Eng., Fortaleza, Brazil, 2017, pp. 74–83.
[27] T. Mens, G. Taentzer, and O. Runge, “Analysing refactoring dependencies using graph transformation,” Softw. Syst. Model., vol. 6, no. 3, pp. 269–285, 2007.
[28] T. Mens, G. Taentzer, and O. Runge, “Detecting structural refactoring conflicts using critical pair analysis,” Electron. Notes Theor. Comput. Sci., vol. 127, no. 3, pp. 113–128, 2005.
[29] H. Liu, G. Li, Z. Ma, and W. Shao, “Conflict-aware schedule of software refactorings,” IET Softw., vol. 2, no. 5, pp. 446–460, 2008.
[30] H. Liu, Z. Ma, W. Shao, and Z. Niu, “Schedule of bad smell detection and resolution: A new way to save effort,” IEEE Trans. Softw. Eng., vol. 38, no. 1, pp. 220–235, Jan./Feb. 2012.
[31] L. Sousa et al., “Characterizing and identifying composite refactorings: Concepts, heuristics and patterns,” in Proc. 17th Int. Conf. Mining Softw. Repositories, 2020, pp. 186–197.
[32] H. Melton and E. Tempero, “Identifying refactoring opportunities by identifying dependency cycles,” in Proc. ACM 29th Australas. Comput. Sci. Conf., Australia, 2006, pp. 35–41.
[33] N. Yoshida, Y. Higo, T. Kamiya, S. Kusumoto, and K. Inoue, “On refactoring support based on code clone dependency relation,” in Proc. IEEE 11th Int. Softw. Metrics Symp., Como, Italy, 2005, pp. 10–pp.
[34] DPRef, 2022. [Online]. Available: https://github.com/iselab-dearborn/dpref-refactoring-dependencies
[35] N. Munaiah, S. Kroh, C. Cabrey, and M. Nagappan, “Curating github for engineered software projects,” Empir. Softw. Eng., vol. 22, no. 6, pp. 3219–3253, 2017.
[36] J. H. McDonald, Handbook of Biological Statistics, vol. 2. Baltimore, MD, USA: Sparky House Publishing, 2009.
[37] J. R. Koehler and A. B. Owen, “Computer experiments,” in Handbook of Statistics, Amsterdam, Netherlands: Elsevier Science, 1996, pp. 261–308.
[38] M. Alshayeb and Mohammad, “Empirical investigation of refactoring effect on software quality,” Inf. Softw. Technol., vol. 51, pp. 1319–1326, Sep. 2009.
[39] F. Palomba, A. Zaidman, R. Oliveto, and A. De Lucia, “An exploratory study on the relationship between changes and refactoring,” in Proc. IEEE/ACM 25th Int. Conf. Prog. Comprehension, 2017, pp. 176–185.
[40] C. Vassallo, G. Grano, F. Palomba, H. C. Gall, and A. Bacchelli, “A large-scale empirical exploration on refactoring activities in open source software projects,” Sci. Comput. Program., vol. 180, pp. 1–15, 2019.
[41] G. Szőke, G. Antal, C. Nagy, R. Ferenc, and T. Gyimóthy, “Empirical study on refactoring large-scale industrial systems and its effects on maintainability,” J. Syst. Softw., vol. 129, no. C., pp. 107–126, Jul. 2017, doi: 10.1016/j.jss.2016.08.071.

Thiago Ferreira received the PhD degree in computer science from the Federal University of Parana, in 2019. He is an assistant professor with the College of Innovation & Technology (CIT), University of Michigan-Flint. His research interests focus on the use of user preferences, optimization algorithms, and artificial intelligence techniques to address several software engineering problems such as software requirements, software testing, and software refactoring. For more information, see [email protected].

James Ivers is the lead of the Carnegie Mellon University Software Engineering Institute’s software architecture group, which develops and matures tools and practices to support software architects. He is also the co-author of the Documenting Software Architectures book. For more information, see [email protected].

Jeffrey J. Yackley is currently working toward the PhD degree with the University of Michigan-Dearborn. He is co-advised by Dr. Marouane Kessentini in the ISE Lab and Dr. Bruce R. Maxim in the GAME Lab. His research focuses on search based software engineering and machine learning in order to address problems with software architecture, refactoring, and testing in addition to his research on computer science education where he focuses on applying active learning techniques in the classroom. For more information, see [email protected].

Marouane Kessentini received the PhD degree from the University of Montreal, Canada, in 2012. He is a full professor, chair of the CSE Department, Oakland University, and director of the NSF IUCRC Center on Pervasive AI. He is a recipient of the prestigious 2018 President of Tunisia distinguished research award, the University of Michigan-Dearborn distinguished teaching award, the University of Michigan-Dearborn distinguished digital education award, the University of Michigan-Dearborn/College of Engineering and Computer Science distinguished research award, 4 best paper awards, and the prestigious IEEE 10 Year Most Influential Paper award (2011–2021). His AI-based software refactoring invention, licensed and deployed by Fortune 500 companies, was selected as one of the Top 8 inventions with the University of Michigan for 2018 (including the three campuses), among more than 500 inventions, by the UM Technology Transfer Office. He received various multi-million grants from both industry and federal agencies and published more than 180 papers in top journals and conferences. He has extensive collaborations with the industry in different areas related to Edge AI, AI/MLOps, AI and cyber-physical systems, intelligent software bots, etc. He is the co-founder of many workshops, general chair of SSBSE16 and ASE22, and PC chair of MODELS19, SANER 2021, GECCO, etc. He served as a keynote speaker with various venues including ICSR, SSBSE, GECCO, WCCI, etc. He graduated more than 18 PhD students and served as associate editor in 7 journals and PC member of more than 200 conferences.

Khouloud Gaaloul received the PhD degree from The University of Luxembourg. She is a postdoctoral researcher with Oakland University under the supervision of Dr. Marouane Kessentini in the ISELab. She held the position of a post-doctoral researcher with The SnT Centre for Security, Reliability, and Trust, University of Luxembourg and the position of a post-doctoral researcher with the University of Michigan-Dearborn. Her research interests include model-based software development and analysis of Cyber-Physical Systems, search-based testing and machine learning. She has been conducting her research in close collaboration with industry partners in the aerospace sector. For more information, see [email protected]