Software Evolution Class Note
REFACTORING
• As software evolves and strays away from its original design, three things happen.
– Decreased understandability
– Decreased reliability
– Out-of-date documentation
• Restructuring the software improves its internal quality and thereby decreases its complexity.
• Restructuring means reorganizing software (source code + documentation) to give it a different look, or
structure.
– Readability
– Extensibility
– Maintainability
– Modularity
– easier to understand;
– easier to change;
– Pretty printing
– Meaningful names for variables
• Developers and managers need to be aware of restructuring for the following reasons
– better understandability
– better reliability
– longer lifetime
– automated analysis
– The objective of restructuring and refactoring is to improve the internal and external values of
software.
– Maintain consistency.
– source code;
– requirements documents.
– Specific modules, functions, classes, methods, and data can be identified for refactoring.
• The concept of code smell is applied to source code to detect what should be refactored.
• A code smell is any symptom in source code that possibly indicates a deeper problem.
• Examples of code smell are:
– duplicate code;
– long methods;
– large classes;
– message chain.
– software architecture;
• class diagram;
• activity diagrams;
– database schemas.
– R2: Rename method print to process in class FileServer. (R1 and R2 are to be done together.)
– R4: Pull up method accept from PrintServer and FileServer to the superclass Server.
– R5: Move method accept from PrintServer to class Packet, so that data packets themselves will
decide what actions to take.
– R7: Encapsulate field receiver in Packet so that another class cannot directly access this field.
– R8: Add parameter p of type Packet to method print in PrintServer to print the contents of a
packet.
– R9: Add parameter p of type Packet to method save in class FileServer so that the contents of a
packet can be saved.
Figure : Class diagram of a Local Area Network (LAN) simulator
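As a hedged illustration of R7, the sketch below shows what encapsulating the field receiver could look like in Java. The field type (String) and the accessor names getReceiver and setReceiver are assumptions; the notes only name the class Packet and the field receiver.

```java
// Hypothetical sketch of R7 (Encapsulate Field); the field type and accessor names are assumed.
class Packet {
    // Before R7 this field would have been directly accessible to other classes.
    private String receiver;

    Packet(String receiver) {
        this.receiver = receiver;
    }

    // Other classes read the field only through this accessor.
    String getReceiver() {
        return receiver;
    }

    // Other classes update the field only through this mutator.
    void setReceiver(String receiver) {
        this.receiver = receiver;
    }
}
```

With the field private, a class such as PrintServer would have to call getReceiver() instead of reading receiver directly.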
• R1–R9 indicate that a large number of refactorings can be identified even for a small system.
• A subset of the entire set of refactorings needs to be carefully chosen for the following reasons.
– Some refactorings can be individually applied, but they must follow an order if applied together.
• The following two techniques can be used to analyze a set of refactorings to select a feasible subset.
• Given a set of refactorings, analyze each pair for conflicts. A pair is said to be
conflicting if both of them cannot be applied together.
• If one refactoring has already been applied, a mutually exclusive refactoring cannot be
applied anymore.
• Example: after applying R1, R2, and R3, R4 becomes applicable. Now, if R4 is
applied, then R6 is not applicable anymore.
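The two kinds of analysis above can be sketched as a small feasibility check over an ordered list of refactorings. The class RefactoringPlan, the method isFeasible, and the two maps are hypothetical names introduced for illustration; the dependency and conflict data must be supplied by the analyst.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical checker: can the refactorings be applied in the given order?
class RefactoringPlan {
    // requires.get(r): refactorings that must already be applied before r (assumed input).
    // conflicts.get(r): refactorings that are mutually exclusive with r (assumed input).
    static boolean isFeasible(List<String> order,
                              Map<String, Set<String>> requires,
                              Map<String, Set<String>> conflicts) {
        Set<String> applied = new HashSet<>();
        for (String r : order) {
            // Sequential dependency: every prerequisite must have been applied earlier.
            if (!applied.containsAll(requires.getOrDefault(r, Set.of()))) {
                return false;
            }
            // Mutual exclusion: r must not conflict with anything already applied.
            for (String done : applied) {
                if (conflicts.getOrDefault(r, Set.of()).contains(done)
                        || conflicts.getOrDefault(done, Set.of()).contains(r)) {
                    return false;
                }
            }
            applied.add(r);
        }
        return true;
    }
}
```

Encoding the example from the notes (R4 requires R1, R2, and R3; R4 conflicts with R6), the order R1, R2, R3, R4 is accepted, while appending R6 to it is rejected.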
• Temporal constraints: A temporal constraint over a sequence of operations is that the operations
occur in a certain order.
• Resource constraints: The software after refactoring does not demand more resources: memory,
energy, communication bandwidth, and so on.
• Safety constraints: It is important that the software does not lose its safety properties after
refactoring.
• There are two pragmatic ways of showing that refactoring preserves the software’s behavior.
• Testing
• Exhaustively test the software before and after applying refactorings, and compare the
observed behavior on a test-by-test basis.
• Ensure that the sequence(s) of method calls are preserved in the refactored program.
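A minimal sketch of the test-by-test comparison, assuming the unit under test can be modeled as a function from int to int. The class BehaviorCheck and the method sameBehavior are hypothetical names. Agreement on a finite test set is evidence of behavior preservation, not a proof.

```java
import java.util.List;
import java.util.function.IntUnaryOperator;

// Hypothetical harness: compares two versions of a function on the same test inputs.
class BehaviorCheck {
    // Returns true only if both versions produce the same output for every test input.
    static boolean sameBehavior(IntUnaryOperator original,
                                IntUnaryOperator refactored,
                                List<Integer> testInputs) {
        for (int input : testInputs) {
            if (original.applyAsInt(input) != refactored.applyAsInt(input)) {
                return false; // observed behavior differs on this test
            }
        }
        return true;
    }
}
```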
Figure : Applications of two refactorings
6.1.5 Evaluate the Impacts of the Refactorings on Quality
• Refactorings impact both internal and external qualities of software.
• In general, refactoring techniques are highly specialized, with one technique improving a small number
of quality attributes.
• For example,
• By measuring the impacts of refactorings on internal qualities, their impacts on external qualities can be
measured.
– Decreased coupling, increased cohesion, and decreased size are likely to make a software system
more maintainable.
– To assess the impact of a refactoring technique for better maintainability, one can evaluate the
metrics before refactoring and after refactoring, and compare them.
• Example: A soft-goal graph for a quality attribute (maintainability) is a hierarchical graph rooted at the
desired change in the attribute, for example, high maintainability.
• The internal nodes represent successive refinements of the attribute and are basically the soft goals.
• The leaf nodes represent refactoring transformations which contribute positively/negatively to soft-goals
which appear above them in the hierarchy.
• A partial example of a soft-goal graph with one leaf node, namely, Move, is illustrated in Fig.
• The dotted lines between the leaf node Move and three soft goals (High Modularity, High Module
Reuse, and Low Control Flow Coupling) imply that the Move transformation impacts those three soft
goals.
Figure : An example of a soft goal graph for maintainability, with one leaf node
6.2 FORMALISMS FOR REFACTORING
• Three key formalisms for refactoring are:
– assertions:
– graph transformation:
– metrics:
• Metrics are useful in quantifying to what extent the internal and external properties of
software entities have changed.
6.2.1 Assertions
• Programmers make assumptions about the behavior of programs at specific points, and those
assumptions can be tested by means of assertions.
– invariants;
– preconditions; and
– postconditions.
• Invariant
– A class invariant is an invariant that all instances of that class must satisfy.
• Precondition
• Postcondition
• Invariants, preconditions, and postconditions can be applied to test the behavior preserving property of
refactorings.
– All instance variables of a class, whether defined or inherited, have distinct names.
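The three kinds of assertions can be sketched with explicit runtime checks. The class Account, its field, and the chosen conditions are hypothetical and serve only to show where an invariant, a precondition, and a postcondition are checked.

```java
// Hypothetical class illustrating an invariant, a precondition, and a postcondition.
class Account {
    private int balance; // class invariant (assumed): balance >= 0

    // Checks the class invariant; called at the end of every state-changing method.
    private void checkInvariant() {
        if (balance < 0) {
            throw new IllegalStateException("invariant violated: balance < 0");
        }
    }

    void deposit(int amount) {
        // Precondition: the caller must pass a positive amount.
        if (amount <= 0) {
            throw new IllegalArgumentException("precondition violated: amount <= 0");
        }
        int oldBalance = balance;
        balance += amount;
        // Postcondition: the balance grew by exactly the deposited amount.
        if (balance != oldBalance + amount) {
            throw new IllegalStateException("postcondition violated");
        }
        checkInvariant();
    }

    int getBalance() {
        return balance;
    }
}
```

If the same checks pass before and after a refactoring of deposit, that supports (but does not prove) the claim that the refactoring preserved behavior.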
• Classes (C), method signatures (M), block structures (B), variables (V), parameters (P), and expressions
(E) are represented by typed nodes in a graph.
• The possible relationships among the nodes are:
– update (u).
• The Push-Down-Method refactoring has been applied to the method originate in the first Fig. to obtain the new
graph shown in the second Fig.
Figure : Program graph obtained after applying push-down-method
refactoring to the program graph of first Fig.
6.2.3 Software Metrics
• Software metrics can be used to quantify the internal and external qualities of software.
• A module consists of many components; each component provides a defined functionality used by other
components.
• Measure the strength of togetherness of components within a module to decide whether or not some
components should stay in the same module.
• cohesion; and
• coupling.
• Cohesion: This metric is used to represent the strength of togetherness in the same module.
• Coupling: This metric is used to represent the strength of dependency between separate modules.
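One very simple way to put numbers on these two ideas is to count dependency edges, as sketched below. This edge-counting definition is an assumption made for illustration; the metrics used in practice are more refined. ModuleMetrics, moduleOf, and deps are hypothetical names.

```java
import java.util.List;
import java.util.Map;

// Hypothetical edge-counting sketch; real cohesion and coupling metrics are more refined.
class ModuleMetrics {
    // moduleOf maps a component name to the name of its module.
    // Each dependency is a two-element array {from, to} of component names.

    // Counts the dependencies whose two ends both lie inside the given module.
    static int cohesion(String module, Map<String, String> moduleOf, List<String[]> deps) {
        int count = 0;
        for (String[] d : deps) {
            if (module.equals(moduleOf.get(d[0])) && module.equals(moduleOf.get(d[1]))) {
                count++;
            }
        }
        return count;
    }

    // Counts the dependencies that cross a module boundary.
    static int coupling(Map<String, String> moduleOf, List<String[]> deps) {
        int count = 0;
        for (String[] d : deps) {
            if (!moduleOf.get(d[0]).equals(moduleOf.get(d[1]))) {
                count++;
            }
        }
        return count;
    }
}
```

Evaluating both counts before and after moving a component between modules shows whether the move raised cohesion or lowered coupling under this simplified definition.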
– Substitute algorithm;
– Parameterize Methods;
• Substitute Algorithm
– Replace algorithm X with algorithm Y when: (i) the implementation of Y is clearer than that of X; (ii) Y
performs better than X; or (iii) standardization bodies want X to be replaced with Y.
– Algorithm substitution is easier if both X and Y have the same input-output behaviors.
• Replace Parameter with Method
Consider the following code segment, where the method bodyMassIndex has two formal parameters.
int person;
:
// person is initialized here;
:
int bodyMass = getMass(person);
int height = getHeight(person);
int BMI = bodyMassIndex(bodyMass, height);
:
The above code segment can be rewritten such that the new bodyMassIndex method accepts one
formal parameter, namely, person, and internally computes the values of bodyMass and height.
The refactored code segment has been shown in the following:
int person;
:
// person is initialized here;
:
int BMI = bodyMassIndex(person);
:
The advantage of this refactoring is that it reduces the number of parameters passed to methods.
Such reduction is important because one can easily make errors while passing long parameter
lists.
• Push Down Method
– Assume that Executive and Clerk are two subclasses of the superclass Employee, as shown in
Fig. (a).
– If overTimePay is used in the Clerk class, but not in the Executive class, then the programmer can push down
overTimePay to the Clerk class, as shown in Fig. (b).
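A sketch of the class hierarchy after the push-down, under assumptions: the field hourlyRate, the constructor signatures, and the 1.5x overtime formula are invented for illustration; only the class names and overTimePay come from the notes.

```java
// Hypothetical sketch after Push Down Method; the field and the pay formula are assumed.
class Employee {
    protected final double hourlyRate;

    Employee(double hourlyRate) {
        this.hourlyRate = hourlyRate;
    }
    // overTimePay no longer lives here, because not every subclass uses it.
}

class Executive extends Employee {
    Executive(double hourlyRate) {
        super(hourlyRate);
    }
}

class Clerk extends Employee {
    Clerk(double hourlyRate) {
        super(hourlyRate);
    }

    // Pushed down from Employee: in this sketch only clerks are paid for overtime.
    double overTimePay(int overtimeHours) {
        return 1.5 * hourlyRate * overtimeHours; // assumed 1.5x overtime rate
    }
}
```

After the refactoring, a call to overTimePay on an Executive no longer compiles, which makes the intended usage explicit.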
• Parameterize Methods
– Sometimes programmers may find multiple methods performing the same computations on
different input data sets.
– Those methods can be replaced with a new method with additional formal parameters, as
illustrated in Fig. .
– In Fig. (a), we have the Communication class with four methods: bluetoothInterface,
wifiInterface, threeGInterface, and fourGInterface.
– In Fig.(b), we have the Communication class with just one method, namely, wirelessInterface
with one parameter, namely, radio.
– The method wirelessInterface can be invoked with different values of radio so that the
wirelessInterface method can in turn invoke different radio interfaces.
Figure : An example of parameterizing a method. There are four methods in (a), whereas there is one method in
(b) with one parameter
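The parameterized version of the Communication class might look like the sketch below. The enum Radio, the return type, and the returned strings are assumptions; the notes give only the method names and the parameter name radio.

```java
// Hypothetical sketch; the enum values mirror the four original methods.
enum Radio { BLUETOOTH, WIFI, THREE_G, FOUR_G }

class Communication {
    // Replaces bluetoothInterface, wifiInterface, threeGInterface, and fourGInterface.
    String wirelessInterface(Radio radio) {
        switch (radio) {
            case BLUETOOTH: return "connecting over Bluetooth";
            case WIFI:      return "connecting over Wi-Fi";
            case THREE_G:   return "connecting over 3G";
            case FOUR_G:    return "connecting over 4G";
            default:        throw new IllegalArgumentException("unknown radio: " + radio);
        }
    }
}
```

A caller selects the radio through the argument, for example new Communication().wirelessInterface(Radio.WIFI).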
6.4 INITIAL WORK ON SOFTWARE RESTRUCTURING
• Software restructuring dates back to the mid-1960s, soon after programs began to be written in Fortran.
– Restructuring techniques
• Elimination-of-goto approach
• Clustering approach
• Software structure is a set of attributes of the software such that the programmer gets a good
understanding of software.
• Any factor that can influence the state of software or the programmer’s perception might influence
software structure.
• One view of the factors that influence software structure is shown in Fig.
– Code -- Documentation
– Tools -- Programmers
– Code quality at all levels of detail (e.g. variables, constants, statements, functions, and modules)
impacts code understanding.
• Documentation
– External documentation
• Requirements documents
• Design documents
• User manuals
• Test cases
• Tools – Programming environment
• Tracing of source code helps in understanding the dynamic behavior of the code.
• Tools can reformat code for better readability via pretty printing, highlighting of key
words, and color coding of source code.
• Programmers
• Individual capabilities
• Education
– Management can play an influential role in establishing a good initial structure and sustaining it by
designing policies and allocating resources.
– Examples
• Management can tie the annual performance review with the programmer’s adherence to
standards.
• Environment
6.4.2 Classification of Restructuring Approaches
– Upgrade documentation
• Update external documentations to make them consistent with code, accurate, and
complete.
• Incremental restructuring
• Goto-less approach
• Case-statement approach
• Clustering approach
– Tools
• Restructuring techniques
– Elimination-of-goto Approach
– Clustering Approach
• Elimination-of-goto Approach
– Before the onset of structured programming, much code was written in the ‘70s with goto
statements.
– Structured programming puts emphasis on the following control constructs: for, while, until,
and if-then-else.
– It has been shown that every flowchart program with goto statements can be transformed into a
functionally equivalent goto-less program by using while statements.
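The transformation can be sketched with one well-known construction: a variable plays the role of the program counter, and a single while loop dispatches on it. Java has no goto, so the original flowchart appears only in comments. The example program (summing 1..n) and the names GotoLess, sumUpTo, and label are assumptions made for illustration.

```java
// Hypothetical example of replacing gotos with a while loop and a label variable.
class GotoLess {
    // Original flowchart (pseudocode with gotos, shown only as comments):
    //   L1: if (i > n) goto L3;
    //   L2: sum = sum + i; i = i + 1; goto L1;
    //   L3: return sum;
    static int sumUpTo(int n) {
        int sum = 0;
        int i = 1;
        int label = 1; // plays the role of the program counter
        while (label != 3) {
            if (label == 1) {
                // L1: choose the next block instead of jumping to it
                label = (i > n) ? 3 : 2;
            } else {
                // L2: loop body, then "goto L1"
                sum = sum + i;
                i = i + 1;
                label = 1;
            }
        }
        return sum; // L3
    }
}
```

The result is functionally equivalent to the flowchart but not necessarily more readable; in practice this mechanical form is usually simplified further into an ordinary while loop.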
– Localization
– Information Hiding
• For example, a queue is a high level concept which can be implemented by means of a
variety of low level data structures.
• Arrays
• A programmer can design a function by using enqueue and dequeue calls without any
concern for their actual implementations.
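The queue example can be sketched as an interface plus one array-based implementation. IntQueue, ArrayIntQueue, and the fixed-capacity circular array are assumptions chosen for illustration; any other low-level data structure could sit behind the same interface.

```java
// Hypothetical queue abstraction; callers depend only on this interface.
interface IntQueue {
    void enqueue(int value);
    int dequeue();
    boolean isEmpty();
}

// One possible low-level implementation: a fixed-capacity circular array.
class ArrayIntQueue implements IntQueue {
    private final int[] items;
    private int head = 0; // index of the oldest element
    private int size = 0; // number of stored elements

    ArrayIntQueue(int capacity) {
        items = new int[capacity];
    }

    public void enqueue(int value) {
        if (size == items.length) {
            throw new IllegalStateException("queue is full");
        }
        items[(head + size) % items.length] = value;
        size++;
    }

    public int dequeue() {
        if (size == 0) {
            throw new IllegalStateException("queue is empty");
        }
        int value = items[head];
        head = (head + 1) % items.length;
        size--;
        return value;
    }

    public boolean isEmpty() {
        return size == 0;
    }
}
```

Code written against IntQueue keeps working if ArrayIntQueue is later replaced by, say, a linked-list implementation.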
• Localization of variables
• Organize global variables and functions which refer to those global variables
into package-like groups.
• Localization of functions
• Put locally called functions and the calling function in the same group.
• Those functions and variables which are only externally referable and visible to
other packages constitute the package specification.
– This approach is applied to software that cannot realistically be restructured but needs
to be retained for its outputs.
– As illustrated in Fig., write a new front-end interface and a new back-end data base so that:
Figure: System sandwich approach to software restructuring.
The arrows represent the flow of data and/or commands.
• Clustering Approach
– Clusters are defined as continuous regions of space containing a relatively high density of
points, separated from other such regions by regions containing a relatively low density of
points.
– Modularization is defined as the clustering of a large number of entities into groups in such a way
that the entities in one group are more closely related, based on some similarity metrics.
– While applying the idea of clustering, two factors are taken into account:
– Similarity metrics
• Distance measures
– Euclidean distance
– Manhattan distance
• Association coefficients
– Jaccard coefficient
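The three measures named above can be written down directly. Representing an entity as a numeric vector (for the distances) or as a set of feature names (for the Jaccard coefficient) is an assumption of this sketch, as is the class name Similarity.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the three measures; the entity representations are assumed.
class Similarity {
    // Euclidean distance: square root of the sum of squared coordinate differences.
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Manhattan distance: sum of absolute coordinate differences.
    static double manhattan(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    // Jaccard coefficient: size of the intersection divided by size of the union.
    static double jaccard(Set<String> a, Set<String> b) {
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 1.0; // convention assumed here for two empty sets
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return (double) intersection.size() / union.size();
    }
}
```

Note the direction of each measure: a smaller distance means more similar entities, whereas a larger Jaccard coefficient means more similar entities.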
• Construction algorithms
• Hierarchical algorithms
1. If there are N entities, begin with N clusters such that each
cluster contains a unique entity.
Compute the similarities between the clusters.
2. WHILE there is more than one cluster
DO
Find the most similar pair of clusters and merge them into a single cluster.
Recompute the similarities between the clusters.
END
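The two steps above can be sketched as follows. The notes do not fix a similarity measure or a rule for comparing multi-entity clusters, so this sketch assumes the Jaccard coefficient over feature sets and single linkage (the similarity of the closest pair of members). The class Agglomerative and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical agglomerative clustering sketch (Jaccard similarity, single linkage).
class Agglomerative {
    static double jaccard(Set<String> a, Set<String> b) {
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 1.0; // assumed convention for two empty feature sets
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return (double) intersection.size() / union.size();
    }

    // Single linkage (assumed): cluster similarity is that of the most similar member pair.
    static double similarity(Set<String> c1, Set<String> c2, Map<String, Set<String>> features) {
        double best = 0.0;
        for (String x : c1) {
            for (String y : c2) {
                best = Math.max(best, jaccard(features.get(x), features.get(y)));
            }
        }
        return best;
    }

    // Returns the cluster created by each merge, in merge order.
    static List<Set<String>> cluster(Map<String, Set<String>> features) {
        // Step 1: start with one cluster per entity.
        List<Set<String>> clusters = new ArrayList<>();
        for (String name : new TreeSet<>(features.keySet())) {
            clusters.add(new TreeSet<>(Set.of(name)));
        }
        List<Set<String>> merges = new ArrayList<>();
        // Step 2: merge the most similar pair until a single cluster remains.
        while (clusters.size() > 1) {
            int bestI = 0;
            int bestJ = 1;
            double best = -1.0;
            for (int i = 0; i < clusters.size(); i++) {
                for (int j = i + 1; j < clusters.size(); j++) {
                    // Similarities are recomputed on every pass.
                    double s = similarity(clusters.get(i), clusters.get(j), features);
                    if (s > best) {
                        best = s;
                        bestI = i;
                        bestJ = j;
                    }
                }
            }
            Set<String> merged = new TreeSet<>(clusters.get(bestI));
            merged.addAll(clusters.get(bestJ));
            clusters.remove(bestJ); // remove the higher index first
            clusters.remove(bestI);
            clusters.add(merged);
            merges.add(merged);
        }
        return merges;
    }
}
```

The returned merge order corresponds to reading a dendrogram from the leaves upward; cutting it at a chosen similarity level yields candidate modules.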
Figure: Illustration of entity level remodularization.
Bullets represent low level entities.
Dotted shapes represent modules.
• Program Slicing Approach
• Backward slicing: The set of statements that can affect the value of a variable at some
point of interest in a program is called a backward slice.
• Forward slicing: The set of statements that are likely to be affected by the value of a
variable at some point of interest in a program is called a forward slice.
– Therefore, if a module supports multiple functionalities, a portion of the code can be extracted to
form a new module.
– Large functions can be decomposed into smaller functions by means of program slicing to
restructure programs.
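A deliberately simplified sketch of backward slicing: it handles only straight-line code (no branches or loops), so only data dependencies matter, and the point of interest is the end of the program. The classes Stmt and Slicer and the statement model (one defined variable plus a set of used variables) are assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical statement model: each statement defines one variable from the variables it uses.
class Stmt {
    final String text;
    final String def;
    final Set<String> uses;

    Stmt(String text, String def, Set<String> uses) {
        this.text = text;
        this.def = def;
        this.uses = uses;
    }
}

class Slicer {
    // Backward slice with respect to `variable` at the end of a straight-line program.
    static List<String> backwardSlice(List<Stmt> program, String variable) {
        Set<String> relevant = new HashSet<>();
        relevant.add(variable);
        List<String> slice = new ArrayList<>();
        // Walk the program from the last statement to the first.
        for (int i = program.size() - 1; i >= 0; i--) {
            Stmt s = program.get(i);
            if (relevant.contains(s.def)) {
                slice.add(s.text);
                relevant.remove(s.def);  // earlier definitions of s.def are overwritten here
                relevant.addAll(s.uses); // the inputs of this statement become relevant
            }
        }
        Collections.reverse(slice); // restore program order
        return slice;
    }
}
```

For a function that computes both a sum and a product, slicing on the sum selects only the statements that feed it; those statements are a candidate for extraction into a smaller function.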
7.6 DOMAIN ENGINEERING
• The term domain engineering refers to a development-for-reuse process to create reusable software
assets (RSA).
• Domain engineering is the set of activities that are executed to create reusable software assets to be
used in specific software projects.
• For a software product family, the requirements of the family are identified and a reusable, generic
software structure is designed to develop members of the family.
• In the following slides, we explain analysis, design, and implementation activities of domain
engineering.
Domain Analysis
• Domain analysis comprises three main steps:
• The Feature Oriented Domain Analysis (FODA) method developed at the Software Engineering
Institute is a well-known method for domain analysis.
• The FODA method describes a process for domain analysis to discover, analyze, and document
commonality and differences within a domain.
Domain Design
• Domain design comprises two main steps:
– develop a generic software architecture for the family of products under consideration; and
• The common architecture becomes the basis for system construction and incremental growth.
• The design activities are supported by architecture description languages (ADLs), such as Acme, and
interface definition languages (IDLs), such as Facebook’s Thrift.
Domain Implementation
• Domain implementation involves the following broad activities:
– acquire and create reusable assets by applying the domain knowledge acquired in the process of
domain analysis and the generic software architecture constructed in the domain design phase;
• Development, management, and maintenance of a repository of reusable assets make up the core of
domain implementation.
7.7 Application Engineering
• Application engineering (a.k.a. product development) is complementary to domain engineering.
• It refers to a development-with-reuse process to create specific systems by using the fabricated assets
defined in domain engineering.
• Application engineering is fed with reusable assets from domain engineering, whereas domain engineering is fed
with new requirements from application engineering.
– Draco
– Feature-Oriented Reuse Method (FORM)
– Koala