A Survey of Software Refactoring
Tom Mens, Member, IEEE, and Tom Tourwé
FEBRUARY 2004
INTRODUCTION
does not alter the external behavior of the code, yet improves its
internal structure [7]. The key idea here is to redistribute
classes, variables, and methods across the class hierarchy in
order to facilitate future adaptations and extensions.
In the context of software evolution, restructuring and
refactoring are used to improve the quality of the software
(e.g., extensibility, modularity, reusability, complexity,
maintainability, efficiency). Refactoring and restructuring
are also used in the context of reengineering [9], which is "the
examination and alteration of a subject system to reconstitute it in a new form and the subsequent implementation
of the new form" [8]. In this context, restructuring is needed
to convert legacy or deteriorated code into a more
modular or structured form [10], or even to migrate code to
a different programming language or language
paradigm [11].
The remainder of this paper is structured as follows:
Section 2 explains general ideas of refactoring by means of
an illustrative example. Section 3 identifies and explains the
different refactoring activities. Section 4 provides an overview of various formalisms and techniques that can be used
to support these refactoring activities. Section 5 summarizes
different types of software artifacts for which refactoring
support has been provided. Section 6 discusses essential
issues that have to be considered in developing refactoring
tools. Section 7 discusses how refactoring fits in the
software development process. Finally, Section 8 concludes.
RUNNING EXAMPLE
REFACTORING ACTIVITIES
3.2
is a research area in its own right [34], [35], [36], we will not
treat it in detail here. We only discuss a few approaches that
relate consistency maintenance to refactoring.
Bottoni et al. propose maintaining consistency between
the program and design models by describing refactoring as
coordinated graph transformation schemes [37]. These
schemes have to be instantiated according to the specific
code modification and applied to the design models
affected by the change.
Within the same level of abstraction, there is also a need
to maintain consistency. For example, if we want to refactor
source code, we have to ensure that the corresponding unit
tests are kept consistent [23]. Similarly, if we have different
kinds of UML design models and any of these is being
refactored, the others have to be kept consistent. Van Der
Straeten et al. suggest doing this by means of logic rules [39].
Rajlich uses the technique of change propagation to cope
with inconsistencies between different software artifacts
[38]. This technique deals with the phenomenon that, when
one part of a software system is changed, dependent parts of the
system may need to be changed as well.
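As a rough illustration of the idea behind change propagation, the following Python sketch walks a dependency map to collect every artifact that may need revisiting after one artifact changes. The artifact names and the `dependents` map are hypothetical, and the worklist traversal is only a simplified stand-in, not the technique as defined in [38].

```python
from collections import deque

def propagate_change(dependents, changed):
    """Return every artifact that may need revisiting after `changed` is modified.

    `dependents` maps an artifact to the artifacts that depend on it directly.
    """
    affected = []
    seen = {changed}
    worklist = deque([changed])
    while worklist:
        current = worklist.popleft()
        for dep in dependents.get(current, []):
            if dep not in seen:
                seen.add(dep)
                affected.append(dep)
                worklist.append(dep)  # a dependent may have dependents of its own
    return affected

# Hypothetical artifacts: a class, its unit test, and two design models.
dependents = {
    "Document.java": ["DocumentTest.java", "ClassDiagram"],
    "ClassDiagram": ["SequenceDiagram"],
}
affected = propagate_change(dependents, "Document.java")
```

In this sketch, a change to the class reaches the sequence diagram only indirectly, through the class diagram. In practice, a developer would still have to decide for each reported artifact whether a change is really required.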
REFACTORING TECHNIQUES AND FORMALISMS
TABLE 1: Correspondence between Refactoring and Graph Transformation
4.3
TABLE 2: Restructuring Support in Different Programming Languages
TYPES OF SOFTWARE ARTIFACTS
5.1 Programs
Support for program restructuring and refactoring has
been provided in a variety of different programming
languages and programming paradigms. This is summarized in Table 2.
5.2 Designs
A recent research trend is to deal with refactoring at design
level, for example in the form of UML models [93], [37],
[94]. Boger et al. developed a refactoring browser integrated
with a UML modeling tool [95]. It supports refactoring of
class diagrams, statechart diagrams, and activity diagrams.
For each of these diagrams, the user can apply refactorings
that cannot easily or naturally be expressed in other
diagrams or in the source code. Van Gorp et al. propose a
UML extension to express the pre- and postconditions of
source code refactorings using OCL [96]. The proposed
extension allows an OCL-empowered CASE tool to verify
nontrivial pre- and postconditions, to compose sequences of
refactorings, and to use the OCL query engine to detect bad
code smells. Such an approach is desirable as a way to
refactor designs independently of the underlying programming language.
Design patterns provide a means to describe the program
structure at a high level of abstraction [12]. Often,
refactorings are used to introduce new design pattern
instances into the software [88], [90], [91]. We already
illustrated this in our running example of Section 2, where
refactorings were used to introduce a Visitor design pattern.
Design patterns also impose constraints on the software
structure, which may limit the applicability of certain refactorings. To detect this, Mens and Tourwé resort to logic
reasoning [109]. Jahnke and Zündorf use graph transformation techniques to restructure or replace occurrences of poor
design patterns in a legacy program with good design
patterns [97].
Object-oriented database schemas can be seen as the
predecessor of UML class diagrams. Because their main
focus is on how data should be structured, they are an ideal
candidate for refactoring. In fact, the research area of object-oriented software refactoring originates in the research on
how to restructure object-oriented database schemas [40],
[66], [69].
TOOL SUPPORT
6.1 Automation
The degree of automation of a refactoring tool varies
depending on which of the refactoring activities of Section 3
are supported by the tool, as well as the extent to which
support for each of these activities is automated.
For example, contemporary IDEs often include a
refactoring browser that supports a semiautomatic approach
to refactoring. While it remains the task of the developer to
identify which part of the software needs to be refactored
and to select the most appropriate refactoring to apply, the
actual application of the refactoring is automated. As
indicated by Tokuda and Batory [28], a semiautomatic
approach can drastically increase the productivity (in terms
of coding and debugging time) when compared to refactoring by hand. Based on two nontrivial case studies, they
estimate this to be a factor of 10 or more. Similarly, one can
expect developer productivity to improve after the software
has been refactored because the software generally is more
understandable, maintainable, and evolvable. Another
main advantage of refactoring tools from the viewpoint of
the developer is that their behavior-preserving nature
significantly reduces the need for debugging and testing,
two activities that are known to be very time-consuming
and labor intensive.
As an alternative to this semiautomatic approach, some
researchers demonstrated the feasibility of fully automated
3. For an extensive and up-to-date overview of refactoring tools, we refer
to http://www.refactoring.com/.
6.2 Reliability
The reliability of a refactoring tool mainly depends on the
ability to guarantee that its provided refactoring transformations are truly behavior preserving. As we have seen in
Section 4.3, it is only possible to guarantee this in very
specific cases (e.g., for simple languages, for a limited
number of refactorings, given a clearly defined notion of
semantics). Because of these restrictions, most tools check
the preconditions of a refactoring before applying it and run tests afterward.
In the absence of a full guarantee of behavior preservation, it is essential that a refactoring tool provide an undo
mechanism to roll back undesired changes [41].
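The workflow described here (check preconditions, apply, test, and undo if needed) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than the design of any actual tool: the program is modeled as a hypothetical mapping from class names to sets of method names, `RefactoringTool` and `rename_method` are illustrative names, and `tests` is an assumed caller-supplied callable that returns True when the refactored program still passes.

```python
import copy

class RefactoringTool:
    """Toy tool: checks preconditions, applies a change, and supports undo."""

    def __init__(self, program):
        self.program = program   # hypothetical model: {class name: set of method names}
        self._history = []       # snapshots taken before each applied refactoring

    def rename_method(self, cls, old, new, tests=None):
        methods = self.program.get(cls, set())
        # Preconditions: the method exists and the new name is not already taken.
        if old not in methods or new in methods:
            return False
        self._history.append(copy.deepcopy(self.program))
        methods.remove(old)
        methods.add(new)
        # Postcheck: if the supplied tests fail, roll the change back.
        if tests is not None and not tests(self.program):
            self.undo()
            return False
        return True

    def undo(self):
        """Restore the snapshot taken before the most recent applied refactoring."""
        if not self._history:
            return False
        self.program = self._history.pop()
        return True

tool = RefactoringTool({"Document": {"print", "preview"}})
applied = tool.rename_method("Document", "print", "render")     # preconditions hold
rejected = tool.rename_method("Document", "preview", "render")  # name clash, refused
undone = tool.undo()                                            # back to the original
```

Snapshotting the whole program keeps the sketch short. A realistic tool would more likely record an inverse operation per refactoring, since copying a large program for every step does not scale.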
6.3 Configurability and Openness
There is a tendency to integrate refactoring tools directly
into industrial strength IDEs. This is typically achieved
using the built-in extensibility mechanisms of these tools
(e.g., plug-ins, APIs, or wizards). Unfortunately, these
extensibility mechanisms are often inadequate for the
purpose of configuring the tools with user-specific or
domain-specific information.
There are a variety of ways in which a user (or a group of
users) should be able to configure a refactoring tool for a
particular usage:
.
6.4 Coverage
As mentioned in Section 3, there is a wide range of
refactoring activities that can be covered by a tool. An ideal
refactoring tool should be as complete as possible, i.e., it
should cover most of these activities. Unfortunately, most
commercial refactoring tools only provide support for
automatically applying refactorings, whereas the other
activities of the refactoring process are neglected.
6.5 Scalability
Contemporary software development tools only support
primitive refactorings. As illustrated in the example of
Section 2, refactoring even the simplest design already
requires applying a large number of primitive refactorings.
To increase the scalability and performance of a refactoring
tool, frequently used sequences of primitive refactorings
should be combined into composite refactorings.
The use of composite refactorings has several advantages. First of all, they better capture the specific intent of
the software change induced by the refactoring. As such, it
becomes easier to understand how the software has been
refactored. Second, composite refactorings result in a
performance gain because the tool needs to check the
preconditions only once for the composite refactoring,
rather than for each primitive refactoring in the sequence
separately [41], [102]. A third advantage of composite
refactorings is that they can weaken the behavior preservation requirements of their primitive constituents. The primitive refactorings in a sequence do not have to be behavior
preserving as long as the net effect of their composition is
behavior preserving. This interesting idea is referred to as
transactional refactoring by Tokuda and Batory [28]. As an
example, they show that the refactoring DelegateMethodAcrossObjectBoundary
is a sequence of two primitive
refactorings, MoveMethodAcrossObjectBoundary (which
removes the method entirely from its original class) and
CreateMethodAccessor (which reintroduces the method to
the original class and delegates its execution to the moved
method). While the net result of applying both refactorings
in sequence is behavior preserving, the primitive refactorings are not. If clients of the original class reference the
target method, the enabling conditions of the move method
refactoring will prevent the method from being moved.
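To make the transactional idea concrete, the Python sketch below applies a sequence of primitive steps to a copy of the program and checks preservation only for the net effect. It is a simplified sketch under stated assumptions, not Tokuda and Batory's implementation: the program is a hypothetical mapping from class names to methods, the class names `Document` and `Printer` are invented, and "behavior preserving" is approximated by the much weaker check that every class still offers all the methods it offered before.

```python
import copy

def move_method(program, src, dst, name):
    """Primitive step: removes `name` from `src`, so alone it breaks clients of `src`."""
    program[dst][name] = program[src].pop(name)

def create_method_accessor(program, src, dst, name):
    """Primitive step: reintroduces `name` in `src` as a delegate to `dst`."""
    program[src][name] = f"delegate to {dst}.{name}"

def keeps_interface(before, after):
    """Simplified stand-in for behavior preservation: no class loses a method."""
    return all(set(before[cls]) <= set(after[cls]) for cls in before)

def apply_transaction(program, steps, preserved):
    """Apply all steps to a copy; commit only if `preserved` holds for the net effect."""
    candidate = copy.deepcopy(program)
    for step in steps:
        step(candidate)
    return candidate if preserved(program, candidate) else program

program = {"Document": {"print": "body"}, "Printer": {}}
steps = [
    lambda p: move_method(p, "Document", "Printer", "print"),
    lambda p: create_method_accessor(p, "Document", "Printer", "print"),
]
delegated = apply_transaction(program, steps, keeps_interface)      # committed
only_move = apply_transaction(program, steps[:1], keeps_interface)  # discarded
```

Here the move alone is rejected because `Document` loses `print`, while the two-step sequence is accepted because the accessor restores it. Checking once at the end, instead of after every step, is also what yields the performance gain mentioned above.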
Similar to the use of composite refactorings, Ó Cinnéide
and Nixon [103] propose using refactorings to introduce
design patterns by first splitting up the design pattern into a
sequence of minipatterns and then applying a sequence of
PROCESS SUPPORT
7.3
CONCLUSIONS
ACKNOWLEDGMENTS
This research was funded by the FWO Project G.0452.03 "A
formal foundation for software refactoring" and was carried
out in the context of the scientific networks "Formal
Foundations of Software Evolution" and "Research Links
to Explore and Advance Software Evolution," financed by
the Fund for Scientific Research-Flanders and the European Science Foundation, respectively. We thank Jean-Marc Jézéquel and the anonymous reviewers for their
excellent reviews that turned this paper into a far better
paper than it would have been otherwise.
REFERENCES
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
[19]
[47] M. Weiser, "Program Slicing," IEEE Trans. Software Eng., vol. 10,
no. 4, pp. 352-357, 1984.
[48] F. Lanubile and G. Visaggio, "Extracting Reusable Functions by
Flow Graph-Based Program Slicing," IEEE Trans. Software Eng.,
vol. 23, no. 4, pp. 246-258, Apr. 1997.
[49] A. Lakhotia and J.-C. Deprez, "Restructuring Programs by
Tucking Statements into Functions," Information and Software
Technology, special issue on program slicing, vol. 40, pp. 677-689,
1998.
[50] R. Komondoor and S. Horwitz, "Semantics-Preserving Procedure
Extraction," technical report, Computer Sciences Dept., Univ. of
Wisconsin-Madison, 2000.
[51] L. Larsen and M.J. Harrold, "Slicing Object-Oriented Software,"
Proc. Int'l Conf. Software Eng., pp. 495-505, Mar. 1996.
[52] B. Ganter and R. Wille, Formal Concept Analysis: Mathematical
Foundations. Springer-Verlag, 1999.
[53] G. Snelting and F. Tip, "Reengineering Class Hierarchies Using
Concept Analysis," Proc. Foundations of Software Eng., pp. 99-110,
1998.
[54] P. Tonella, "Concept Analysis for Module Restructuring," IEEE Trans.
Software Eng., vol. 27, no. 4, pp. 351-363, Apr. 2001.
[55] A. van Deursen and T. Kuipers, "Identifying Objects Using
Cluster and Concept Analysis," Proc. 21st Int'l Conf. Software Eng.,
pp. 246-255, 1999.
[56] J. Philipps and B. Rumpe, "Roots of Refactoring," Proc. 10th
OOPSLA Workshop Behavioral Semantics, 2001.
[57] N. Wirth, "Program Development by Stepwise Refinement,"
Comm. ACM, vol. 14, pp. 221-227, 1971.
[58] R.-J. Back, "Correctness Preserving Program Refinements,"
technical report, Math. Centre Tracts #131, Mathematisch Centrum Amsterdam, 1980.
[59] J. Philipps and B. Rumpe, "Refinement of Information Flow
Architectures," Proc. Int'l Conf. Formal Eng. Methods, 1997.
[60] S. Demeyer, S. Ducasse, and O. Nierstrasz, "Finding Refactorings
via Change Metrics," Proc. Object-Oriented Programming, Systems,
Languages, Applications Conf. 2000, vol. 35, no. 10, pp. 166-177, Oct.
2000.
[61] D. Coleman, P. Arnold, S. Bodoff, H. Gilchrist, F. Hayes, and P.
Jeremaes, Object-Oriented Development: The Fusion Method. Prentice
Hall, 1994.
[62] W.G. Griswold, M.I. Chen, R.W. Bowdidge, and J.D. Morgenthaler, "Tool Support for Planning the Restructuring of Data
Abstractions in Large Systems," Proc. SIGSOFT Symp. Foundations
of Software Eng., Oct. 1996.
[63] D. Gupta, P. Jalote, and G. Barua, "A Formal Framework for On-Line Software Version Change," IEEE Trans. Software Eng., vol. 22,
no. 2, pp. 120-131, Feb. 1996.
[64] I. Moore, "Automatic Inheritance Hierarchy Restructuring and
Method Refactoring," Proc. Object-Oriented Programming, Systems,
Languages, Applications Conf., pp. 235-250, 1996.
[65] R.E. Mortimer and K.H. Bennett, "Maintenance and Abstraction of
Program Data Using Formal Transformations," Proc. Int'l Conf.
Software Maintenance, pp. 301-311, 1996.
[66] P.L. Bergstein, "Object-Preserving Class Transformations,"
SIGPLAN Notices, vol. 26, no. 11, pp. 299-313, Nov. 1991.
[67] P.L. Bergstein, "Maintenance of Object-Oriented Systems during
Structural Evolution," Theory and Practice of Object Systems, vol. 3,
no. 3, pp. 185-212, 1991.
[68] S.H. Hwang, Y. Tsujino, and N. Tokura, "A Reorganization
Framework of the Object-Oriented Class Hierarchy," Proc. Asia
Pacific Conf. Software Eng., pp. 117-126, 1995.
[69] W.L. Hürsch and L.M. Seiter, "Automating the Evolution of
Object-Oriented Systems," Proc. Symp. Object Technology for
Advanced Software, pp. 2-21, 1996.
[70] H. Tamaki and T. Sato, "Unfold/Fold Transformation of Logic
Programs," Proc. Int'l Conf. Logic Programming, pp. 127-138, 1984.
[71] T. Kawamura and T. Kanamori, "Preservation of Stronger
Equivalence in Unfold/Fold Logic Program Transformation,"
Proc. Int'l Conf. Fifth Generation Computer Systems, pp. 413-422,
1988.
[72] N. Jones and A. Mycroft, "Stepwise Development of Operational
and Denotational Semantics for Prolog," Proc. Int'l Symp. Logic
Programming, pp. 289-298, 1984.
[73] F. Bodin, "Sage++: An Object-Oriented Toolkit and Class Library
for Building Fortran and C++ Restructuring Tools," Proc. Conf.
Object-Oriented Numerics, 1994.