Software Evolution Class Note

6. REFACTORING

• Developers continuously modify, enhance and adapt software.

• As software evolves and strays away from its original design, three things happen.

– Decreased understandability

– Decreased reliability

– Increased maintenance cost

• Decreased understandability is due to

– Increased complexity of code

– Out-of-date documentation

– Code not conforming to standards

• Restructuring the software improves its internal quality and thereby decreases its complexity.

• Restructuring applied to object-oriented software is called refactoring.

• Restructuring means reorganizing software (source code + documentation) to give it a different look, or
structure.

• Source code is restructured to improve some of its non-functional requirements:

– Readability

– Extensibility

– Maintainability

– Modularity

• Restructuring does not modify the software’s functionalities.

• Restructuring can be performed while adding new features.

• Software restructuring is informally defined as the modification of software to make it

– easier to understand;

– easier to change;

– easier to change its documentation;

– less susceptible to faults when changes are made to it.

• A higher level goal of restructuring is to increase the software value

– external software value: software with fewer faults is seen as better by customers

– internal software value: a well-structured system is less expensive to maintain

• Simple examples of restructuring

– Pretty printing

– Meaningful names for variables

– One statement per line of source code

• Developers and managers need to be aware of restructuring for the following reasons

– better understandability

– keep pace with new structures

– better reliability

– longer lifetime

– automated analysis

• Characteristics of restructuring and refactoring

– The objective of restructuring and refactoring is to improve the internal and external values of
software.

– Restructuring preserves the external behavior of the original program.

– Restructuring can be performed without adding new requirements.

– Restructuring generally produces a program in the same language.

• Example: a C program is restructured into another C program.

6.1 ACTIVITIES IN A REFACTORING PROCESS


• To restructure a software system, one follows a process with well-defined activities.

– Identify what to refactor.

– Determine which refactorings to apply.

– Ensure that refactoring preserves the software’s behavior.

– Apply the refactorings to the chosen entities.

– Evaluate the impacts of the refactorings.

– Maintain consistency.

6.1.1 Identify what to refactor


• The programmer identifies what to refactor from a set of high-level software artifacts.

– source code;

– design documents; and

– requirements documents.

• Next, focus on specific portions of the chosen artifact for refactoring.

– Specific modules, functions, classes, methods, and data can be identified for refactoring.

• The concept of code smell is applied to source code to detect what should be refactored.

• A code smell is any symptom in source code that possibly indicates a deeper problem.

• Examples of code smell are:

– duplicate code;

– long parameter list;

– long methods;

– large classes;

– message chain.

• Entities to be refactored at the design level

– software architecture;

• class diagram;

• statechart diagram; and

• activity diagrams;

– global control flow; and

– database schemas.

6.1.2 Determine which refactorings to apply


• Referring to the class diagram of the LAN simulator in the figure below, some candidate refactorings are

– R1: Rename method print to process in class PrintServer.

– R2: Rename method save to process in class FileServer. (R1 and R2 are to be done together.)

– R3: Create a superclass Server from PrintServer and FileServer.

– R4: Pull up method accept from PrintServer and FileServer to the superclass Server.

– R5: Move method accept from PrintServer to class Packet, so that data packets themselves will
decide what actions to take.

– R6: Move method accept from FileServer to Packet.

– R7: Encapsulate field receiver in Packet so that another class cannot directly access this field.

– R8: Add parameter p of type Packet to method print in PrintServer to print the contents of a
packet.

– R9: Add parameter p of type Packet to method save in class FileServer so that the contents of a
packet can be saved.

Figure: Class diagram of a Local Area Network (LAN) simulator

• Refactorings R1 to R9 indicate that a large number of refactorings can be identified even for a small system.

• A subset of the entire set of refactorings needs to be carefully chosen for the following reasons.

– Some refactorings must be applied together.

• Example: R1 and R2 are to be applied together.

– Some refactorings must be applied in certain orders.

• Example: R1 and R2 must precede R3.

– Some refactorings can be individually applied, but they must follow an order if applied together.

• Example: R1 and R8 can be applied in isolation. However, if both of them are to be
applied, then R1 must occur before R8.

– Some refactorings are mutually exclusive.

• Example: R4 and R6 are mutually exclusive.

• Tool support is needed to identify a feasible subset of refactorings.

• The following two techniques can be used to analyze a set of refactorings to select a feasible subset.

– Critical pair analysis

• Given a set of refactorings, analyze each pair for conflicts. A pair is said to be
conflicting if its two refactorings cannot both be applied.

• Example: R4 and R6 constitute a conflicting pair.

– Sequential dependency analysis

• Some refactorings become applicable only after one or more other refactorings have been applied.

• If one refactoring has already been applied, a refactoring that is mutually exclusive with it can
no longer be applied.

• Example: after applying R1, R2, and R3, R4 becomes applicable. If R4 is then
applied, R6 is no longer applicable.
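The constraints listed above can be checked mechanically. The following Python sketch encodes only the constraints stated in these notes for R1 to R9. The table names (TOGETHER, REQUIRES, ORDER_IF_BOTH, CONFLICTS) and the function check_sequence are illustrative assumptions, not the interface of any refactoring tool.

```python
# Constraints stated in the notes for the LAN example; the encoding is an assumption.
TOGETHER = [{"R1", "R2"}]                       # must be applied together
REQUIRES = {"R3": {"R1", "R2"}, "R4": {"R3"}}   # prerequisites (sequential dependency)
ORDER_IF_BOTH = [("R1", "R8")]                  # R1 before R8 when both are chosen
CONFLICTS = [{"R4", "R6"}]                      # mutually exclusive (critical pair)


def check_sequence(sequence):
    """Return the list of constraints violated by an ordered list of refactorings."""
    chosen = set(sequence)
    position = {name: index for index, name in enumerate(sequence)}
    problems = []
    for group in TOGETHER:
        if chosen & group and not group <= chosen:
            problems.append(f"{sorted(group)} must be applied together")
    for name, prerequisites in REQUIRES.items():
        if name in chosen:
            for pre in sorted(prerequisites):
                if pre not in chosen or position[pre] > position[name]:
                    problems.append(f"{pre} must be applied before {name}")
    for first, second in ORDER_IF_BOTH:
        if first in chosen and second in chosen and position[first] > position[second]:
            problems.append(f"{first} must be applied before {second}")
    for pair in CONFLICTS:
        if pair <= chosen:
            problems.append(f"{sorted(pair)} are mutually exclusive")
    return problems


feasible = check_sequence(["R1", "R2", "R3", "R4"])
infeasible = check_sequence(["R1", "R2", "R3", "R4", "R6"])
```

An empty result means the chosen sequence respects every encoded constraint; a real tool would derive these tables from the refactoring definitions rather than hard-code them.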

6.1.3 Ensure that refactoring preserves the software’s behavior.


• Ideally, the input/output behavior of a program after refactoring is the same as the behavior
before refactoring.

• In many applications, preservation of non-functional requirements is necessary.

• A non-exhaustive list of such non-functional requirements is as follows:

– Temporal constraints: A temporal constraint over a sequence of operations requires that the operations
occur in a certain order. For real-time systems, refactorings should preserve temporal constraints.

– Resource constraints: The software after refactoring does not demand more resources (memory,
energy, communication bandwidth, and so on) than before.

– Safety constraints: It is important that the software does not lose its safety properties after
refactoring.

• There are two pragmatic ways of showing that refactoring preserves the software’s behavior:

– Testing: Exhaustively test the software before and after applying refactorings, and compare the
observed behavior on a test-by-test basis.

– Verification of preservation of call sequence: Ensure that the sequence(s) of method calls are
preserved in the refactored program.
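The test-by-test comparison can be sketched in Python as follows. The two shipping-cost functions are hypothetical stand-ins for a program before and after refactoring, and compare_behavior is an assumed helper name. The sketch only shows the comparison idea and does not replace a full test suite.

```python
def shipping_cost_before(weight_kg, express):
    # Original version: nested conditionals with duplicated arithmetic.
    if express:
        if weight_kg > 10:
            return 20 + weight_kg * 2
        else:
            return 20 + weight_kg * 1
    else:
        if weight_kg > 10:
            return 5 + weight_kg * 2
        else:
            return 5 + weight_kg * 1


def shipping_cost_after(weight_kg, express):
    # Refactored version: same input/output behavior, simpler structure.
    base = 20 if express else 5
    rate = 2 if weight_kg > 10 else 1
    return base + weight_kg * rate


def compare_behavior(before, after, test_inputs):
    """Return the test inputs on which the two versions disagree."""
    return [args for args in test_inputs if before(*args) != after(*args)]


TEST_INPUTS = [(w, e) for w in (0, 1, 10, 11, 50) for e in (False, True)]
mismatches = compare_behavior(shipping_cost_before, shipping_cost_after, TEST_INPUTS)
```

An empty mismatch list supports, but does not prove, behavior preservation; it is only as strong as the chosen test inputs.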

6.1.4 Apply the refactorings to chosen entities


• The class diagram of Fig. 7.2(a) has been obtained from the class diagram of the LAN simulator by

– focusing on the classes FileServer, PrintServer, and Packet; and

– applying refactorings R1, R2, and R3.
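As a rough illustration of the result of R1, R2, and R3, the following Python sketch shows the two servers sharing a superclass and a common method name. The method bodies and the name attribute are placeholders assumed for the example; the notes do not give the actual implementations.

```python
class Server:
    """Superclass created by R3; the name attribute is an assumption."""

    def __init__(self, name):
        self.name = name

    def process(self):
        raise NotImplementedError("subclasses decide how to process")


class PrintServer(Server):
    def process(self):  # renamed to process by R1
        return f"{self.name}: printing"


class FileServer(Server):
    def process(self):  # renamed to process by R2
        return f"{self.name}: saving"


# After R1 to R3, client code can treat both servers uniformly.
servers = [PrintServer("ps1"), FileServer("fs1")]
results = [server.process() for server in servers]
```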

Figure: Applications of two refactorings

6.1.5 Evaluate the impacts of the refactorings on quality

• Refactorings impact both internal and external qualities of software.

• Some examples of internal qualities of software are

– size, complexity, coupling, cohesion, and testability

• Some examples of external qualities of software are

– performance, reusability, maintainability, extensibility, robustness, and scalability

• In general, refactoring techniques are highly specialized, with one technique improving a small number
of quality attributes.

• For example,

– some refactorings eliminate code duplication;

– some raise reusability;

– some improve performance; and

– some improve maintainability.

• By measuring the impacts of refactorings on internal qualities, their impacts on external qualities can be
estimated.

• Example of measuring external qualities

– Some examples of software metrics are coupling, cohesion, and size.

– Decreased coupling, increased cohesion, and decreased size are likely to make a software system
more maintainable.

– To assess the impact of a refactoring technique for better maintainability, one can evaluate the
metrics before refactoring and after refactoring, and compare them.

6.1.6 Maintain consistency


• Rather than evaluate the impacts after applying refactorings, one selects refactorings such that the
program after refactoring possesses better quality attributes.

• The concept of a soft-goal graph helps select refactorings.

• Example: A soft-goal graph for a quality attribute (maintainability) is a hierarchical graph rooted at the
desired change in the attribute, for example, high maintainability.

• The internal nodes represent successive refinements of the attribute and are basically the soft goals.

• The leaf nodes represent refactoring transformations which contribute positively/negatively to soft-goals
which appear above them in the hierarchy.

• A partial example of a soft goal graph with one leaf node, namely, Move, is illustrated in the figure below.

• The dotted lines between the leaf node Move and three soft goals – High Modularity, High Module
Reuse, and Low Control Flow Coupling imply that the Move transformation impacts those three soft
goals.

Figure: An example of a soft goal graph for maintainability, with one leaf node

6.2 FORMALISMS FOR REFACTORING

• Three key formalisms for refactoring are:

– assertions:

• Assertions are useful in verifying the assumptions made by programmers.

– graph transformation:

• Graph transformation is useful in viewing refactorings as applications of transformation
rules.

– metrics:

• Metrics are useful in quantifying to what extent the internal and external properties of
software entities have changed.

6.2.1 Assertions
• Programmers make assumptions about the behavior of programs at specific points, and those
assumptions can be tested by means of assertions.

• An assertion is specified as a Boolean expression which evaluates to true or false.

• Three kinds of assertions:

– invariants;

– preconditions; and

– postconditions.

• Invariant

– An invariant is an assertion that evaluates to true wherever in the program it is invoked.

– A class invariant is an invariant that all instances of that class must satisfy.

• Precondition

– A precondition is a condition that must be satisfied before a computation is performed.

• Postcondition

– A postcondition is a condition that must be satisfied after a computation is performed.

• Invariants, preconditions, and postconditions can be applied to test the behavior preserving property of
refactorings.

• Examples of invariants in the context of database schema transformation are:

– All instance variables of a class, whether defined or inherited, have distinct names.

– All methods of a class, whether defined or inherited, have distinct names.

• Note: Static checking of preconditions, postconditions, and invariants is computationally expensive.
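The three kinds of assertions can be combined to guard a refactoring at run time. The Python sketch below applies them to a Rename Method transformation on a toy class model. ClassModel, rename_method, and the sample classes are assumptions made for illustration and do not come from any refactoring tool.

```python
class ClassModel:
    """Toy model of a class: its own method names plus an optional parent."""

    def __init__(self, name, methods, parent=None):
        self.name = name
        self.methods = list(methods)
        self.parent = parent

    def all_methods(self):
        # Method names that are defined here or inherited from ancestors.
        inherited = self.parent.all_methods() if self.parent else []
        return inherited + self.methods


def has_distinct_method_names(cls):
    # Invariant from the notes: defined and inherited methods have distinct names.
    names = cls.all_methods()
    return len(names) == len(set(names))


def rename_method(cls, old, new):
    # Preconditions: the method exists in the class and the new name is unused.
    assert old in cls.methods, "precondition: method is defined in the class"
    assert new not in cls.all_methods(), "precondition: new name is not in use"
    cls.methods[cls.methods.index(old)] = new
    # Postcondition: the old name is gone and the new name is present.
    assert new in cls.methods and old not in cls.methods, "postcondition"
    # The invariant must still hold after the transformation.
    assert has_distinct_method_names(cls), "invariant"


server = ClassModel("Server", ["accept"])
print_server = ClassModel("PrintServer", ["print"], parent=server)
rename_method(print_server, "print", "process")
```

Renaming process to accept afterwards would fail the second precondition, because accept is already inherited from the parent class in this toy model.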

6.2.2 Graph Transformation


• Programs, class diagrams, and statecharts can be viewed as graphs, and refactorings can be viewed as
graph production rules.

• Classes (C), method signatures (M), block structures (B), variables (V), parameters (P), and expressions
(E) are represented by typed nodes in a graph.

• The possible relationships among the nodes are:

– method lookup (l);

– inheritance (i);

– membership (m);

– (sub)type (t);

– expression (e);

– actual parameter (ap);

– formal parameter (fp);

– cascaded expression (•);

– call (c);

– variable access (a); and

– update (u).
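A program graph of this kind can be stored as typed nodes plus labeled edges, and a refactoring then becomes a rule that rewrites edges. The Python sketch below is a minimal illustration under stated assumptions: the edge directions (member to class, subclass to superclass) and the class names Node and Workstation are hypothetical, and only the method name originate is taken from these notes.

```python
class ProgramGraph:
    """Typed nodes plus labeled, directed edges."""

    def __init__(self):
        self.node_types = {}   # node name -> type letter, e.g. "C" or "M"
        self.edges = set()     # (source, label, target) triples

    def add_node(self, name, node_type):
        self.node_types[name] = node_type

    def add_edge(self, source, label, target):
        assert source in self.node_types and target in self.node_types
        self.edges.add((source, label, target))


def push_down_method(graph, method, superclass, subclass):
    """Rewrite rule: move the membership edge of a method to a subclass."""
    assert graph.node_types[method] == "M"
    assert (method, "m", superclass) in graph.edges    # method belongs to superclass
    assert (subclass, "i", superclass) in graph.edges  # subclass inherits superclass
    graph.edges.remove((method, "m", superclass))
    graph.edges.add((method, "m", subclass))


graph = ProgramGraph()
graph.add_node("Node", "C")          # hypothetical superclass
graph.add_node("Workstation", "C")   # hypothetical subclass
graph.add_node("originate", "M")
graph.add_edge("Workstation", "i", "Node")
graph.add_edge("originate", "m", "Node")
push_down_method(graph, "originate", "Node", "Workstation")
```

The assertions inside push_down_method play the role of the left-hand side of a graph production rule: the rule applies only when the expected edges are present.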

• The first figure below shows an example program graph.

• The Push-Down-Method refactoring has been applied to method originate in the first figure to obtain the new
graph shown in the second figure.

Figure: An example of a program graph

Figure: Program graph obtained after applying the Push-Down-Method refactoring to the program graph of the first figure

6.2.3 Software Metrics
• Software metrics can be used to quantify the internal and external qualities of software.

• A module consists of many components; each component provides a defined functionality used by other
components.

• Measure the strength of togetherness of components within a module to decide whether or not some
components should stay in the same module.

• Two metrics considered are:

• cohesion; and

• coupling.

• Cohesion: This metric is used to represent the strength of togetherness in the same module.

• Coupling: This metric is used to represent the strength of dependency between separate modules.
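One simple way to put numbers on these two metrics is to count dependencies that stay inside a module (a proxy for cohesion) and dependencies that cross module boundaries (a proxy for coupling). The Python sketch below uses that counting scheme, which is an assumption rather than a definition given in these notes; the module and component names are hypothetical.

```python
def cohesion_and_coupling(modules, dependencies):
    """Count intra-module and inter-module dependencies.

    modules: dict mapping module name -> set of component names.
    dependencies: set of (user, used) component pairs.
    """
    owner = {comp: mod for mod, comps in modules.items() for comp in comps}
    intra = sum(1 for a, b in dependencies if owner[a] == owner[b])
    inter = len(dependencies) - intra
    return intra, inter


DEPENDENCIES = {
    ("invoice", "tax"),
    ("log_format", "log_write"),
    ("log_write", "log_rotate"),
    ("log_rotate", "log_format"),
}

# Before: log_format sits in the billing module although it works with logging code.
before = cohesion_and_coupling(
    {"billing": {"invoice", "tax", "log_format"}, "logging": {"log_write", "log_rotate"}},
    DEPENDENCIES,
)

# After a Move refactoring: log_format lives in the logging module.
after = cohesion_and_coupling(
    {"billing": {"invoice", "tax"}, "logging": {"log_format", "log_write", "log_rotate"}},
    DEPENDENCIES,
)
```

Comparing the two results shows more dependencies inside modules and fewer across them after the move, which is the before/after comparison described in Section 6.1.5.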

6.3 MORE EXAMPLES OF REFACTORING


• More examples are intuitively explained here.

– Substitute algorithm;

– Replace parameter with methods;

– Push Down Method;

– Parameterize Methods.

• Substitute algorithm

– Algorithm X may be replaced with algorithm Y because: (i) the implementation of Y is clearer than that of X; (ii) Y
performs better than X; or (iii) standardization bodies want X to be replaced with Y.

– Algorithm substitution is easier if both X and Y have the same input-output behaviors.

• Replace parameters with methods

Consider the following code segment, where the method bodyMassIndex has two formal parameters.
int person;
:
// person is initialized here;
:
int bodyMass = getMass(person);
int height = getHeight(person);
int BMI = bodyMassIndex(bodyMass, height);
:
The above code segment can be rewritten such that the new bodyMassIndex method accepts one
formal parameter, namely, person, and internally computes the values of bodyMass and height.
The refactored code segment has been shown in the following:

int person;
:
// person is initialized here;
:
int BMI = bodyMassIndex(person);
:
The advantage of this refactoring is that it reduces the number of parameters passed to methods.
Such reduction is important because one can easily make errors while passing long parameter
lists.
• Push Down Method

– Assume that Executive and Clerk are two subclasses of the superclass Employee, as shown in
Fig. (a).

– Method overTimePay has been defined in Employee class.

– If overTimePay is used in the Clerk class but not in the Executive class, then the programmer can push down
overTimePay to the Clerk class, as shown in Fig. (b).

Figure: Illustration of the Push Down Method refactoring: (a) the class diagram before refactoring;
(b) the class diagram after refactoring.
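The class structure after the Push Down Method refactoring can be sketched in Python as follows. The constructor parameters and the overtime formula (1.5 times the hourly rate) are assumptions made only to keep the example runnable; the notes do not specify them.

```python
class Employee:
    def __init__(self, name, hourly_rate):
        self.name = name
        self.hourly_rate = hourly_rate
    # over_time_pay used to be defined here, before the refactoring.


class Executive(Employee):
    """Executives never used over_time_pay, so they no longer carry it."""


class Clerk(Employee):
    def over_time_pay(self, hours):
        # Pushed down from Employee; the 1.5 multiplier is an assumed policy.
        return 1.5 * self.hourly_rate * hours


clerk_pay = Clerk("Asha", 20).over_time_pay(2)
executive_has_method = hasattr(Executive("Lee", 90), "over_time_pay")
```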
• Parameterize Methods

– Sometimes programmers may find multiple methods performing the same computations on
different input data sets.

– Those methods can be replaced with a new method with additional formal parameters, as
illustrated in Fig. .

– In Fig. (a), we have the Communication class with four methods: bluetoothInterface,
wifiInterface, threeGInterface, and fourGInterface.

– In Fig.(b), we have the Communication class with just one method, namely, wirelessInterface
with one parameter, namely, radio.

– The method wirelessInterface can be invoked with different values of radio so that the
wirelessInterface method can in turn invoke different radio interfaces.

Figure : An example of parameterizing a method. There are four methods in (a), whereas there is one method in
(b) with one parameter
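A minimal Python sketch of the refactored Communication class is shown below. The set of supported radio names and the returned strings are placeholders assumed for illustration; the real interfaces would contain radio-specific logic.

```python
class Communication:
    """After the refactoring: one method parameterized by the radio type."""

    SUPPORTED_RADIOS = ("bluetooth", "wifi", "3g", "4g")

    def wireless_interface(self, radio):
        # Dispatch on the parameter instead of having one method per radio.
        if radio not in self.SUPPORTED_RADIOS:
            raise ValueError(f"unsupported radio: {radio}")
        return f"link established over {radio}"


link = Communication().wireless_interface("wifi")
```

Supporting a further radio then means extending the accepted values of radio rather than adding another method to the class.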
6.4 INITIAL WORK ON SOFTWARE RESTRUCTURING
• Software restructuring dates back to the mid-1960s, almost as soon as programs were being written in Fortran.

• Topics of discussion in this section are:

– Factors influencing software structure

– Classification of restructuring approaches

– Restructuring techniques

• Elimination-of-goto approach

• Localization and information hiding approach

• System sandwich approach

• Clustering approach

• Program slicing approach

6.4.1 Factors Influencing Software Structure

• Software structure is a set of attributes of the software such that the programmer gets a good
understanding of software.

• Any factor that can influence the state of software or the programmer’s perception might influence
software structure.

• One view of the factors that influence software structure is shown in the figure below:

– Code

– Documentation

– Tools

– Programmers

– Managers and policies

– Environment

Figure : Factors which can influence software structure


• Code

– Code quality at all levels of detail (e.g. variables, constants, statements, functions, and modules)
impacts code understanding.

– Adherence to coding standards improves code quality.

– Adoption of common architectural styles enhances code understanding.

• Documentation

– Internal documentation (also known as in-line documentation)

– External documentation

• Requirements documents

• Design documents

• User manuals

• Test cases

• Tools – Programming environment

– Development tools help programmers better understand the code.

• Tracing of source code helps in understanding the dynamic behavior of the code.

• Animation of algorithms helps in understanding the dynamic strategy adopted in algorithms.

• Cross referencing of global variables reveals interactions among modules.

• Tools can reformat code for better readability via pretty printing, highlighting of key
words, and color coding of source code.

• Programmers

– Qualities of programmers influence their perception of structure.

– Examples of programmer qualities

• Individual capabilities

• Education

• Experience and training

• Managers and policies

– Management can play an influential role in establishing a good initial structure and sustaining it by
designing policies and allocating resources.

– Examples

• Management can design general policies to adhere to standards.

• Management can tie the annual performance review with the programmer’s adherence to
standards.

• Environment

– This refers to the general working environment of programmers.

– Example: Physical facilities and availability of resources when needed

6.4.2 Classification of Restructuring Approaches

• A broad classification of software restructuring approaches is shown in the figure below.

Figure: Broad classification of approaches to software restructuring.

• Approaches not involving code changes

– Train programmers in software engineering, including architectural styles and modularization
techniques.

– Upgrade documentation

• Make in-line comments more accurate and readable.

• Update comments to reflect code changes.

• Update external documentation to make it consistent with the code, accurate, and
complete.

• Approaches involving code changes

– Practices: Some examples of restructuring practices are:

• Restructuring code with preprocessors.

• Making code understandable by means of inspection.

• Formatting code by adhering to standards and style guidelines.

• Restructuring code for reusability.

– Techniques: Some approaches are based on defined techniques.

• Incremental restructuring

• Goto-less approach

• Case-statement approach

• Boolean flag approach

• Clustering approach

– Tools

• Eclipse IDE, IntelliJ IDEA, jFactor, Refactorit, and Clone Doctor

6.4.3 Restructuring Techniques

• Restructuring techniques

– These techniques were developed in the mid-1970s, before object-oriented programming.

– The techniques are applied at different levels of abstraction.

• Example of restructuring techniques

– Elimination-of-goto Approach

– Localization and Information Hiding Approach

– System Sandwich Approach

– Clustering Approach

– Program Slicing Approach

• Elimination-of-goto Approach

– Before the onset of structured programming, much code was written in the ‘70s with goto
statements.

– Structured programming puts emphasis on the following control constructs: for, while, until,
and if-then-else.

– Those constructs make occurrences of loop and branching clear.

– It has been shown that every flowchart program with goto statements can be transformed into a
functionally equivalent goto-less program by using while statements.
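The transformation can be illustrated with a small example. Python has no goto statement, so the goto version appears only as pseudocode in the comments; the first function shows the mechanical translation into a single while loop driven by a label variable, and the second shows the structured program it simplifies to. The greatest-common-divisor example and the function names are assumptions chosen for illustration.

```python
# Flowchart-style pseudocode with goto statements (not valid Python):
#   L1: if b == 0 goto L3
#   L2: a, b = b, a mod b; goto L1
#   L3: return a


def gcd_label_loop(a, b):
    """Mechanical translation: one while loop plus a variable naming the next block."""
    label = "L1"
    result = None
    while label != "END":
        if label == "L1":
            label = "L3" if b == 0 else "L2"
        elif label == "L2":
            a, b = b, a % b
            label = "L1"
        elif label == "L3":
            result = a
            label = "END"
    return result


def gcd_structured(a, b):
    """Hand-simplified structured version of the same computation."""
    while b != 0:
        a, b = b, a % b
    return a
```

The label-variable form preserves the original control flow exactly, which makes it a safe intermediate step before simplifying by hand.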

• Localization and Information Hiding Approach

– Localization

• It is a process of collecting the logically related computational resources in one physical module.

• Functions, procedures, operations, and data types are computational resources.

• By localizing computational resources into separate modules, programmers can restructure a
program into a loosely coupled system of sufficiently independent modules.

• Sometimes, localization is difficult to achieve.

• A variable may be imported into a module by means of the include statement.

• Data sharing among functions is not explicitly represented in source code.

– Information Hiding

• The details of implementations of computational resources can be hidden to make it
easier to understand the program.

• For example, a queue is a high level concept which can be implemented by means of a
variety of low level data structures.

• Singly linked list

• Doubly linked list

• Arrays

• A programmer can design a function by using enqueue and dequeue calls without any
concern for their actual implementations.
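The queue example can be sketched in Python as follows. The Queue class hides its representation (here a collections.deque, chosen only for illustration) behind enqueue and dequeue, and the client function serve_in_order, a hypothetical name, relies on those two calls alone.

```python
from collections import deque


class Queue:
    """Queue abstraction; callers never see the underlying data structure."""

    def __init__(self):
        self._items = deque()  # hidden detail: could be a linked list or an array

    def enqueue(self, item):
        self._items.append(item)

    def dequeue(self):
        if not self._items:
            raise IndexError("dequeue from an empty queue")
        return self._items.popleft()

    def __len__(self):
        return len(self._items)


def serve_in_order(requests):
    # Client code: written purely in terms of enqueue and dequeue.
    queue = Queue()
    for request in requests:
        queue.enqueue(request)
    served = []
    while len(queue) > 0:
        served.append(queue.dequeue())
    return served
```

Swapping the deque for another structure would change only the Queue class, not serve_in_order.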

– A restructuring process based on localization of variables and functions

• Localization of variables

• Organize global variables and functions which refer to those global variables
into package-like groups.

• This organization can be achieved by applying the concept of closure of
functions to a set of global variables.

• This leads to groups of functions and global variables referred to by those
functions.

• Localization of functions

• Put locally called functions and the calling function in the same group.

• Information hiding and hierarchical structuring

• Organize groups of functions into hierarchical package structures based on the
visibility of functions within groups.

• Only those functions and variables that are externally referable and visible to
other packages constitute the package specification.
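One possible reading of the localization-of-variables step is to group functions that are linked, directly or transitively, through the global variables they refer to. The Python sketch below implements that reading; it is an assumption about what the closure computation looks like, and the function and variable names in the example are hypothetical.

```python
def group_by_shared_globals(uses):
    """Group functions that share global variables, directly or transitively.

    uses: dict mapping function name -> set of global variables it refers to.
    Returns a list of (functions, variables) pairs, one per package-like group.
    """
    groups = []
    for func, variables in uses.items():
        funcs, merged_vars = {func}, set(variables)
        remaining = []
        for group_funcs, group_vars in groups:
            if group_vars & merged_vars:
                # The new function shares a global with this group: merge them.
                funcs |= group_funcs
                merged_vars |= group_vars
            else:
                remaining.append((group_funcs, group_vars))
        remaining.append((funcs, merged_vars))
        groups = remaining
    return groups


USES = {
    "open_log": {"log_file"},
    "write_log": {"log_file", "log_level"},
    "set_level": {"log_level"},
    "next_id": {"counter"},
}
packages = group_by_shared_globals(USES)
```

In this example open_log and set_level share no variable directly, but both end up in the same group because write_log refers to both log_file and log_level.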

• System Sandwich Approach

– This approach is applied to software that cannot be restructured with any hope of success, but needs
to be retained for its outputs.

– As illustrated in the figure below, write a new front-end interface and a new back-end database so that:

• it is easy to interface with the program; and

• the program’s outputs are recorded in a more structured way.

Figure: System sandwich approach to software restructuring.
The arrows represent the flow of data and/or commands.
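The sandwich structure can be sketched in Python as follows. Everything here is hypothetical: legacy_program stands in for the code that cannot be restructured, front_end builds its input format from ordinary arguments, and BackEnd records its text output in structured form.

```python
def legacy_program(line):
    # Stand-in for the untouched legacy code: comma-separated text in, text out.
    name, hours, rate = line.split(",")
    return f"{name}:{float(hours) * float(rate):.2f}"


def front_end(name, hours, rate):
    # New front end: builds the legacy input format from ordinary arguments.
    return f"{name},{hours},{rate}"


class BackEnd:
    """New back end: records the legacy output in a structured form."""

    def __init__(self):
        self.records = []

    def store(self, raw_output):
        name, pay = raw_output.split(":")
        self.records.append({"name": name, "pay": float(pay)})


def run_sandwich(back_end, name, hours, rate):
    # Data flows front end -> legacy program -> back end.
    back_end.store(legacy_program(front_end(name, hours, rate)))


database = BackEnd()
run_sandwich(database, "Ada", 10, 12.5)
```

The legacy function is called but never modified, which is the defining property of the approach.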
• Clustering Approach

– Software modularization is an important design step.

– A program can be remodularized in two ways.

• System level remodularization

– This is a top-down approach.

– Partition the program into smaller modules, as illustrated in Fig.

• Entity level remodularization

– This is a bottom-up approach.

– Group a program’s entities to form larger modules.

Figure: Illustration of system level remodularization. Bullets represent low level entities.
Dotted shapes represent modules. Arrows represent progression from one level to the next.

– The concept of clustering is key to modularization.

– Clusters are defined as continuous regions of space containing a relatively high density of
points, separated from other such regions by regions containing a relatively low density of
points.

– Modularization is defined as the clustering of a large number of entities into groups in such a way
that the entities in one group are more closely related to each other than to entities in other groups,
based on some similarity metric.

– While applying the idea of clustering, two factors are taken into account:

• What similarity metrics to consider?

• What clustering algorithm to use?

– Similarity metrics

• Distance measures

– Euclidean distance

– Manhattan distance

• Association coefficients

– Simple matching coefficient

– Jaccard coefficient

– Examples of association coefficients

• Let x and y be two entities. Let:

a = # of features present for both x and y.
b = # of features present for x but not y.
c = # of features present for y but not x.
d = # of features present for neither x nor y.

• Simple matching coefficient: simple(x, y) = (a + d)/(a + b + c + d).

• Jaccard coefficient: Jaccard(x, y) = a/(a + b + c).
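The two coefficients translate directly into code when each entity is represented by the set of features it has. The Python sketch below follows the definitions of a, b, c, and d above. The feature names are hypothetical, and returning 0.0 when a + b + c is zero is an assumed convention that the formulas do not specify.

```python
def feature_counts(x, y, all_features):
    """Return (a, b, c, d) for two entities given as sets of features."""
    a = len(x & y)                    # present for both
    b = len(x - y)                    # present for x only
    c = len(y - x)                    # present for y only
    d = len(all_features - (x | y))   # present for neither
    return a, b, c, d


def simple_matching(x, y, all_features):
    a, b, c, d = feature_counts(x, y, all_features)
    return (a + d) / (a + b + c + d)


def jaccard(x, y, all_features):
    a, b, c, _ = feature_counts(x, y, all_features)
    # Assumed convention: two entities without any features have similarity 0.
    return a / (a + b + c) if (a + b + c) else 0.0


ALL_FEATURES = {"uses_db", "uses_net", "uses_gui", "uses_log", "uses_crypto"}
x = {"uses_db", "uses_log"}
y = {"uses_db", "uses_net", "uses_log"}
counts = feature_counts(x, y, ALL_FEATURES)
```

Here a = 2, b = 0, c = 1, and d = 2, so the simple matching coefficient is 4/5 and the Jaccard coefficient is 2/3. The difference comes from d: Jaccard ignores features that both entities lack.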

• Clustering algorithms: the following broad techniques are applied.

– Graph theoretical algorithms

– Construction algorithms

– Optimization algorithms (also known as iterative or improvement algorithms)

– Hierarchical algorithms

• Divisive algorithms (See Figure)

• Agglomerative algorithms (See Figure)

• The clustering produced by a hierarchical algorithm can be visualized in a dendrogram.

• The general structure of an agglomerative algorithm

1. If there are N entities, begin with N clusters such that each
   cluster contains a unique entity.
   Compute the similarities between the clusters.
2. WHILE there is more than one cluster
   DO
      Find the most similar pair of clusters and merge them into a single cluster.
      Recompute the similarities between the clusters.
   END
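The general structure above can be turned into a short Python sketch. Two choices here are assumptions that the pseudocode leaves open: similarity between entities is the Jaccard coefficient over feature sets, and similarity between clusters is that of their closest pair of members (single linkage). The entity and feature names are hypothetical.

```python
from itertools import combinations


def jaccard(x, y):
    # a / (a + b + c) expressed with sets: shared features over all features seen.
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def cluster_similarity(c1, c2, features):
    # Single linkage (an assumption): similarity of the closest pair of members.
    return max(jaccard(features[a], features[b]) for a in c1 for b in c2)


def agglomerate(features):
    """Return the sequence of merges, most similar pair first."""
    clusters = [frozenset([entity]) for entity in sorted(features)]  # step 1
    merges = []
    while len(clusters) > 1:                                         # step 2
        # Similarities are recomputed from scratch on every pass.
        c1, c2 = max(
            combinations(clusters, 2),
            key=lambda pair: cluster_similarity(pair[0], pair[1], features),
        )
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((set(c1), set(c2)))
    return merges


FEATURES = {
    "parse": {"tokens", "grammar"},
    "lex": {"tokens", "chars"},
    "emit": {"bytes"},
    "link": {"bytes", "symbols"},
}
merges = agglomerate(FEATURES)
```

The recorded merge order is exactly the information a dendrogram displays, read from the leaves upward.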

Figure: Dendrogram representation

Figure: Illustration of entity level remodularization. Bullets represent low level entities.
Dotted shapes represent modules.

• Program Slicing Approach

– Two kinds of program slicing

• Backward slicing: The set of statements that can affect the value of a variable at some
point of interest in a program is called a backward slice.

• Forward slicing: The set of statements that can be affected by the value of a
variable at some point of interest in a program is called a forward slice.

– A key idea in program slicing

• Identify and extract a cohesive subset of statements from a program.

– Therefore, if a module supports multiple functionalities, a portion of the code can be extracted to
form a new module.

– Large functions can be decomposed into smaller functions by means of program slicing to
restructure programs.
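For straight-line code, a backward slice can be computed by walking the statements in reverse and tracking which variables are still relevant. The Python sketch below makes that simplifying assumption (no branches or loops), and the statement encoding as (defined variable, used variables) pairs is hypothetical. Real slicers work on control-flow and data-dependence graphs.

```python
def backward_slice(statements, target_var):
    """Indices of statements that can affect target_var at the end of the program.

    statements: list of (defined_variable, used_variables) in program order.
    Only straight-line code is handled.
    """
    relevant = {target_var}
    slice_indices = []
    for index in range(len(statements) - 1, -1, -1):
        defined, used = statements[index]
        if defined in relevant:
            slice_indices.append(index)
            relevant.discard(defined)   # earlier definitions are overwritten here
            relevant |= set(used)       # but the values used here now matter
    return sorted(slice_indices)


PROGRAM = [
    ("total", []),                    # 0: total = 0
    ("count", []),                    # 1: count = 0
    ("total", ["total", "price"]),    # 2: total = total + price
    ("count", ["count"]),             # 3: count = count + 1
    ("average", ["total", "count"]),  # 4: average = total / count
]
total_slice = backward_slice(PROGRAM, "total")
```

The slice for total contains only statements 0 and 2, a cohesive subset that could be extracted into its own function, while the slice for average contains all five statements.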

7.6 DOMAIN ENGINEERING

• The term domain engineering refers to a development-for-reuse process to create reusable software
assets (RSA).

• It is also referred to as product line development.

• Domain engineering is the set of activities that are executed to create reusable software assets to be
used in specific software projects.

• For a software product family, the requirements of the family are identified and a reusable, generic
software structure is designed to develop members of the family.

• In the following, we explain the analysis, design, and implementation activities of domain
engineering.

Domain Analysis
• Domain analysis comprises three main steps:

– identify the family of products to be constructed;

– determine the variable and common features in the family of products;

– develop the specifications of the product family.

• The Feature Oriented Domain Analysis (FODA) method developed at the Software Engineering
Institute is a well-known method for domain analysis.

• The FODA method describes a process for domain analysis to discover, analyze, and document
commonality and differences within a domain.
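The step of separating common from variable features can be illustrated with plain set operations. The Python sketch below is not the FODA method itself, only a minimal illustration of the commonality and variability idea; the product names and features are hypothetical.

```python
def analyze_family(products):
    """Split the features of a product family into common and variable ones.

    products: non-empty dict mapping product name -> set of features.
    """
    feature_sets = list(products.values())
    common = set.intersection(*feature_sets)        # present in every product
    variable = set.union(*feature_sets) - common    # present in only some products
    return common, variable


PRODUCTS = {
    "basic_phone": {"call", "sms"},
    "smart_phone": {"call", "sms", "wifi", "camera"},
    "rugged_phone": {"call", "sms", "gps"},
}
common_features, variable_features = analyze_family(PRODUCTS)
```

The common features are candidates for the shared, reusable core of the family, while the variable features mark the points where individual products differ.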

Domain Design
• Domain design comprises two main steps:

– develop a generic software architecture for the family of products under consideration; and

– develop a plan to create individual systems based on reusable assets.

• The design activity emphasizes a common architecture of related systems.

• The common architecture becomes the basis for system construction and incremental growth.

• The design activities are supported by architecture description languages (ADLs), such as Acme, and
interface definition languages (IDLs), such as Facebook’s Thrift.

Domain Implementation
• Domain implementation involves the following broad activities:

– identify reusable components based on the outcome of domain analysis;

– acquire and create reusable assets by applying the domain knowledge acquired in the process of
domain analysis and the generic software architecture constructed in the domain design phase;

– catalogue the reusable assets into a component library.

• Development, management, and maintenance of a repository of reusable assets make up the core of
domain implementation.

7.7 Application Engineering
• Application engineering (a.k.a. product development) is complementary to domain engineering.

• It refers to a development-with-reuse process to create specific systems by using the fabricated assets
defined in domain engineering.

• Application engineering composes specific application systems by:

(i) reusing existing assets;


(ii) developing any new components that are needed;
(iii) reengineering some existent software; and
(iv) testing the overall system.
• Similar to the standard practices in software engineering [27], it begins by eliciting requirements,
analyzing the requirements, and writing a specification.

Relationship Between Application & Domain Engineering


• Domain engineering and application engineering feed on each other, as illustrated in the figure below.

• Application engineering is fed with reusable assets from domain engineering, whereas domain engineering is fed
with new requirements from application engineering.

Figure: Feedback between domain and application engineering

7.8 Domain Engineering Approaches


• The following nine domain engineering approaches have been reported in the literature:

– Draco

– Domain Analysis and Reuse Environment (DARE)

– Family-Oriented Abstraction, Specification, and Translation (FAST)

– Feature-Oriented Reuse Method (FORM)

– “Komponentenbasierte Anwendungsentwicklung” (KobrA)

– Product Line UML-based Software Engineering (PLUS)

– Product Line Software Engineering (PuLSE)

– Koala

– Reuse-driven Software Engineering Business (RSEB)
