SOFTWARE TESTING

UNIT - I

LESSON 1

SOFTWARE DEVELOPMENT LIFE CYCLE MODELS


Contents
1.0 Aims and Objectives
1.1 Phases of Software Project - An Overview
1.2 Requirements Gathering and Analysis
1.3 Planning
1.4 Design
1.5 Development or Coding
1.6 Testing
1.7 Deployment and Maintenance
1.8 Let Us Sum Up

1.0 AIMS AND OBJECTIVES

This first lesson of the subject introduces the fundamental concepts of
software testing. A software project is organized into phases, and before
getting into software testing the reader must understand the phases of
software project development, since the testing task is associated with every
one of them.
At the end of the lesson, you should be able to understand
• Requirements Analysis
• How to Plan?
• The tools and methods of design
• Various Coding methods
• An Independent Testing Module
• The ways of deployment and further maintenance

1.1 PHASES OF SOFTWARE PROJECT - AN OVERVIEW

Software life cycle models describe phases of the software cycle and the
order in which those phases are executed. There are many models, and many
companies adopt their own, but all follow very similar patterns. A software
project is made up of a series of phases. Broadly, most software projects
comprise the following phases.

• Requirements Gathering and Analysis
• Planning
• Design
• Development or Coding
• Testing
• Deployment and Maintenance
Among the various models, consider the waterfall model: a sequential
software development model (a process for the creation of software) in which
development is seen as flowing steadily downwards (like a waterfall) through the
phases of requirements analysis, design, implementation, testing (validation),
integration, and maintenance. It is also called the common and classic
model, as it is a linear model, as depicted in Figure 1.1.

Figure 1.1 SDLC Waterfall Model (linear model)

Check your progress 1

Write down the phases of SDLC.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

1.2 REQUIREMENTS GATHERING AND ANALYSIS

During requirements gathering, the specific requirements of the software
to be built are gathered and documented. If the software is bespoke software,
then there is a single customer who can give these requirements. If the product
is general-purpose software, then a product marketing team within the
software product organization specifies the requirements by aggregating the
requirements of multiple potential customers. In either case, it is important to
ensure that the right requirements are captured at every stage. The
requirements get documented in the form of a System Requirements
Specification (SRS) document. This document acts as a bridge between the
customer and the designers chartered to build the product. Figure 1.2 lists
the names and abbreviations of some standard software documents.

IEEE Software Document Definitions

SQAP - Software Quality Assurance Plan IEEE 730

SCMP - Software Configuration Management Plan IEEE 828

STD - Software Test Documentation IEEE 829

SRS - Software Requirements Specification IEEE 830

SVVP - Software Validation & Verification Plan IEEE 1012

SDD - Software Design Description IEEE 1016

SPMP - Software Project Management Plan IEEE 1058

Figure 1.2 Software documents

1.3 PLANNING

The purpose of the planning phase is to come up with a schedule, the
scope, and the resource requirements for a release. A plan explains how the
requirements will be met and by which time. It needs to take into account the
requirements – what will be met and what will not be met – for the current
release to decide on the scope of the project, look at resource availability, and
come out with a set of milestones and a release date for the project. The planning
phase is applicable to both development and testing activities. At the end of
this phase, both the project plan and the test plan documents are delivered.

Check your progress 2

Define the Plan.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

1.4 DESIGN

The purpose of the design is to figure out how to satisfy the requirements
enumerated in the System Requirements Specification document. The design
phase produces a representation that will be used by the following phase, the
development phase. This representation should serve two purposes. First, from
this representation, it should be possible to verify that all the requirements are
satisfied. Second, this representation should give sufficient information for the
development phase to proceed with the coding and implementation of the
system. Design is usually split into two levels – high-level design and low-level
(or detailed) design. The design step produces the System Design Description
(SDD) document that will be used by development teams to produce the
programs that realize the design.

1.5 DEVELOPMENT OR CODING

Design acts as a blueprint for the actual coding to proceed. This
development or coding phase comprises coding the programs in the chosen
programming language. It produces the software that meets the requirements
the design was meant to satisfy. In addition to programming, this phase also
involves the creation of product documentation.

Check your progress 3

Describe Unified Modeling as applied to software design.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

1.6 TESTING

As the programs are coded (in the chosen programming language), they
are also tested. In addition, after the coding is (deemed) complete, the product
is subjected to testing. Testing is the process of exercising the software product
in pre-defined ways to check if the behavior is the same as expected behavior.
By testing the product, an organization identifies and removes as many defects
as possible before shipping it out.

1.7 DEPLOYMENT AND MAINTENANCE

Once a product is tested, it is given to the customers, who deploy it in
their environments. As the users start using the product in their environments,
they may observe discrepancies between the actual behavior of the product and
what they were given to expect (either by the marketing people or through the
product documentation). Such discrepancies could end up as product defects,
which need to be corrected. The product now enters the maintenance phase,
wherein the product is maintained or changed to satisfy the changes that arise
from customer expectations, environmental changes, etc. Maintenance is made
up of corrective maintenance (for example, fixing customer-reported problems),
adaptive maintenance (for example, making the software run on a new version
of an operating system or database), and preventive maintenance (for example,
changing the application program code to avoid a potential security hole in an
operating system code).
SDLC STRENGTHS AND WEAKNESSES
Table 1.1 lists the strengths and weaknesses of the Software Development
Life Cycle. Open source development and object-oriented programming are some
alternative techniques to the SDLC.

Table 1.1 Strengths and Weaknesses of SDLC

Strengths                                  Weaknesses
Control.                                   Increased development time.
Monitors large projects.                   Increased development cost.
Detailed steps.                            Systems must be defined up front.
Evaluates costs and completion targets.    Rigidity.
Documentation.                             Hard to estimate costs; project overruns.
Well-defined user input.                   User input is sometimes limited.
Ease of maintenance.
Development and design standards.
Tolerates changes in MIS staffing.

1.8 LET US SUM UP

In this lesson, we discussed briefly the phases of the SDLC and the activities
performed in each phase. Software development life cycle models such as the
waterfall model, the V model, and others are illustrated in detail in the third
lesson of the courseware.

Check Your Progress: Model Answers

1. The phases are Requirements Gathering and Analysis, Planning, Design,
Development or Coding, Testing, and Deployment and Maintenance.
2. A plan explains how the requirements will be met and by which time. It
needs to take into account the requirements – what will be met and what
will not be met.
3. Unified modeling has various diagrams to represent system design, such
as the class diagram, object diagram, use case diagram, package diagram,
and deployment diagram.

References:

1. Srinivasan Desikan & Gopalswamy Ramesh, Software Testing: Principles
and Practices, Pearson Education, 2006.
2. http://www.a2zdotnet.com/View.aspx?id=51

LESSON 2

QUALITY CONTROL, TESTING AND REPRESENTING DIFFERENT PHASES
Contents
2.0 Aims and Objectives
2.1 Quality
2.2 Quality Assurance and Quality Control
2.3 Testing, Verification and Validation
2.4 Process Model to Represent Different Phases
2.5 Let Us Sum Up

2.0 AIMS AND OBJECTIVES

The previous lesson introduced the phases of the SDLC. In this lesson we
discuss how the evaluation of quality and testing is employed in all SDLC
phases. At the end of the lesson, you should be able to know
Quality Assurance and Quality Control
Testing, Verification and Validation
Process Model for each phase

2.1 QUALITY

Quality is meeting the requirements expected of the software,
consistently and predictably. A software product is designed to satisfy certain
requirements of a given customer (or set of customers). How can we
characterize this phrase – “satisfying requirements”? Requirements get
translated into software features, each feature being designed to meet one or
more of the requirements. For each such feature, the expected behavior is
characterized by a set of test cases. Each test case is further characterized by
1. The environment under which the test case is to be executed;
2. Inputs that should be provided for that test case;
3. How these inputs should get processed;
4. What changes should be produced in the internal state or environment;
and
5. What outputs should be produced.
The actual behavior of given software for a given test case, under a given
set of inputs, in a given environment, and in a given internal state is
characterized by

1. How these inputs actually get processed;
2. What changes are actually produced in the internal state or
environment; and
3. What outputs are actually produced.
If the actual behavior and the expected behavior are identical in all their
characteristics, then that test case is said to be passed. If not, the given
software is said to have a defect on that test case. How do we increase the
chances of a product meeting the requirements expected of it, consistently and
predictably? There are two types of methods – quality control and quality
assurance.
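
The pass/fail rule above can be made concrete with a small sketch. This is a minimal illustration, not from the text; the names (test_case, unit_under_test, run_test) and the values are invented. A test case passes only when the actual behavior matches the expected behavior.

    #include <stdio.h>

    /* A test case pairs the input with the behavior the specification
     * demands; the field names are illustrative, not a standard API. */
    struct test_case {
        const char *name;
        int         input;            /* input provided for the test case */
        int         expected_output;  /* output the specification demands */
    };

    /* Hypothetical unit under test: doubles its argument. */
    static int unit_under_test(int x) { return x * 2; }

    /* Passes only when actual and expected behavior are identical. */
    static int run_test(const struct test_case *tc) {
        int actual = unit_under_test(tc->input);
        if (actual == tc->expected_output) {
            printf("PASS: %s\n", tc->name);
            return 1;
        }
        printf("FAIL: %s (expected %d, got %d)\n",
               tc->name, tc->expected_output, actual);
        return 0;
    }

    int main(void) {
        struct test_case tc = { "doubles a positive number", 21, 42 };
        return run_test(&tc) ? 0 : 1;
    }

In a fuller harness, the structure would also carry the environment, internal state changes, and processing expectations listed above; the comparison logic stays the same.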

2.2 QUALITY ASSURANCE AND QUALITY CONTROL

Quality control attempts to build a product, test it for expected behavior
after it is built, and, if the expected behavior is not the same as the actual
behavior of the product, fix the product as necessary and rebuild the
product. This iteration is repeated till the expected behavior of the product
matches the actual behavior for the scenarios tested. Thus quality control is
defect-detection and defect-correction oriented, and works on the product
rather than on the process.
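
In outline, quality control is thus a detect-and-correct loop. The sketch below is an invented illustration (the stub functions merely simulate the build-test-fix activities; none of the names come from the text):

    #include <stdbool.h>
    #include <stdio.h>

    /* Stubs standing in for real project activities; hypothetical names. */
    static bool all_tests_pass(int attempt) { return attempt >= 3; } /* simulated */
    static void fix_defects(void)     { puts("fixing reported defects"); }
    static void rebuild_product(void) { puts("rebuilding product"); }

    int main(void) {
        int attempt = 1;
        /* Quality control: test the built product; while actual behavior
         * differs from expected behavior, fix, rebuild, and test again. */
        while (!all_tests_pass(attempt)) {
            fix_defects();
            rebuild_product();
            attempt++;
        }
        puts("expected behavior matches actual behavior for tested scenarios");
        return 0;
    }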
Quality assurance, on the other hand, attempts defect prevention by
concentrating on the process of producing the product rather than working on
defect detection/correction after the product is built. For example, instead of
producing and then testing a program code for proper behavior by exercising
the built product, a quality assurance approach would be to first review the
design before the product is built and correct the design errors in the first place.
Similarly, to ensure the production of a better code a quality assurance process
may mandate coding standards to be followed by all programmers. As can be
seen from the above examples, quality assurance normally tends to apply to all
the products that use a process. Also, since quality assurance continues
throughout the life of the product it is everybody’s responsibility; hence it is a
staff function. In contrast, the responsibility for quality control is usually
localized to a quality control team. Table 2.1 summarizes the key distinctions
between quality control and quality assurance.

Quality Assurance                          Quality Control
Concentrates on the process of             Concentrates on specific products
producing the products
Defect-prevention oriented                 Defect-detection and correction oriented
Usually done throughout the life cycle     Usually done after the product is built
This is usually a staff function           This is usually a line function
Examples: reviews and audits               Examples: software testing at various levels

Table 2.1 Differences between Quality Assurance and Quality Control

Check your progress 1

How do Quality Assurance and Quality Control function with respect to defects?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

2.3 TESTING, VERIFICATION AND VALIDATION

Verification is the process of evaluating a system or component to
determine whether the products of a given phase satisfy the conditions imposed
at the start of that phase. The narrow definition of the term "testing" is the
phase that follows coding and precedes deployment. Testing is traditionally
used to mean testing of the program code. However, coding is a downstream
activity, as against requirements and design that occur much earlier in a
project life cycle. Given that the objective of a software project is to minimize
and prevent defects, testing of program code alone is not sufficient. As we saw
in the previous lesson, defects can creep in during any phase, and these defects
should be detected as close to the point of injection as possible, not left to wait
till the testing of programs. Hence, if each phase is "tested" separately
as and when the phase is completed (or, better still, as the phase is being
executed), then defects can be detected early, thereby reducing the overall
costs.
Timely testing increases the chances of a product or service meeting the
customer's requirements. When a product is tested with appropriate and
realistic tests that reflect typical usage patterns by the intended users, the
chances of the product satisfying the customer's requirements are much higher.
While testing does not guarantee zero defects, effective testing certainly
increases the chances of customer acceptance of the software.
The purpose of testing is to uncover defects in the system (and to have
someone fix the defects). Testing is done by a set of people within a software
product (or service) organization whose goal and charter is to uncover the
defects in the product before it reaches the customer (see Section 1.3). As we
saw in the previous chapter, the purpose of testing is NOT to prove that the
product has no defects. The purpose of software testing is to find defects in a
software product. The reward systems and the organization structures should
create and foster an environment that encourages this purpose of testing.

VALIDATION
Validation is the process of evaluating a system or component during or
at the end of the development process to determine whether it satisfies specified
requirements. Testing is NOT meant to replace other ways of ensuring quality
(like reviews). It is one of the methods to detect defects in a software product.
There are other methods that achieve the same function. For example, we will
see later that following well-defined processes and standards reduce the
chances of defects creeping into a software. We will also discuss other methods
like reviews and inspections, which actually attempt to prevent defects coming
into the product. To be effective, testing should complement, supplement, and
augment such quality assurance methods discussed in the previous section.
The idea of catching defects within each phase, without letting them reach the
testing phase, leads us to define two more terms: verification and validation.
During the requirements gathering phase, the requirements are faithfully
captured. The SRS document is the product of the requirements phase. To
ensure that requirements are faithfully captured, the customer verifies this
document. The design phase takes the SRS document as input and maps the
requirements to a design that can drive the coding. The SDD document is the
product of the design phase. The SDD is verified by the requirements team to
ensure that the design faithfully reflects the SRS, which imposed the conditions
at the beginning of the design phase.
Verification takes care of activities to focus on the question "Are we
building the product right?" and validation takes care of a set of activities to
address the question "Are we building the right product?"
To build the product right, certain activities/conditions/procedures are
imposed at the beginning of the life cycle. These activities are considered
"proactive" as their purpose is to prevent the defects before they take shape. The
process activities carried out during various phases for each of the product
releases can be termed as verification. Requirements review, design review, and
code review are some examples of verification activities.
To build the right product, certain activities are carried out during
various phases to validate whether the product is built as per specification.
These activities are considered "reactive" as their purpose is to find defects that
affect the product and fix them as soon as they are introduced. Some examples
of validation include unit testing performed to verify if the code logic works,
integration testing performed to verify the design, and system testing performed
to verify that the requirements are met.
To summarize, there are different terminologies that may stand for the
same or similar concepts. For all practical purposes in this study material, we
can assume verification and quality assurance to be one and the same.
Similarly, quality control, validation, and testing mean the same.
Quality Assurance = Verification
Quality Control = Validation = Testing

Check your progress 2

Define Validation.
Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

2.4 PROCESS MODEL TO REPRESENT DIFFERENT PHASES

A process model is a way to represent any given phase of software
development that effectively builds in the concepts of validation and verification
to prevent and minimize the delay between defect injection and defect detection
(and eventual correction). In this model, each phase of a software project is
characterized by the following.
• Entry criteria, which specify when that phase can be started. Also
included are the inputs for the phase.
• Tasks, or steps that need to be carried out in that phase, along
with measurements that characterize the tasks.
• Verification, which specifies methods of checking that the tasks
have been carried out correctly.
• Exit criteria, which stipulate the conditions under which one can
consider the phase as done. Also included are the outputs of the phase.
This model, known as the Entry Task Verification eXit or ETVX
model, offers several advantages for effective verification and validation.
1. Clear entry criteria make sure that a given phase does not start
prematurely.
2. The verification for each phase (or each activity in each phase) helps
prevent defects, or at least, minimizes the time delay between defect
injection and defect detection.
3. Documentation of the detailed tasks that comprise each phase
reduces the ambiguity in interpretation of the instructions and thus
minimizes the variations that can come from repeated executions of
these tasks by different individuals.
4. Clear exit criteria provide a means of validation of the phase, after the
phase is done but before handing over to the next phase.

An example of applying the ETVX model to the design phase is presented in
Figure 2.1.

Figure 2.1 ETVX model applied to design

Entry criteria: Approval of SRS by customer
Input: Approved SRS
Output:
• Architecture documents
• Design documents
• Program specifications
Exit criteria:
• Complete traceability between design and SRS
• Development team ready to start programming
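
Read as a gate around each phase, the ETVX model lends itself to a compact sketch. The code below is a minimal, invented rendering (etvx_phase, run_phase, and the simulated design-phase criteria are assumptions, not a standard implementation); it simply encodes the entry-task-verify-exit sequence of Figure 2.1.

    #include <stdbool.h>
    #include <stdio.h>

    /* One ETVX phase, modeled with function pointers; names invented. */
    struct etvx_phase {
        const char *name;
        bool (*entry_criteria_met)(void); /* may the phase start? */
        void (*perform_tasks)(void);      /* the work of the phase */
        bool (*verify_tasks)(void);       /* were the tasks done correctly? */
        bool (*exit_criteria_met)(void);  /* may the phase be declared done? */
    };

    static bool run_phase(const struct etvx_phase *p) {
        if (!p->entry_criteria_met()) {
            printf("%s: entry criteria not met, phase must not start\n", p->name);
            return false;
        }
        p->perform_tasks();
        if (!p->verify_tasks()) {
            printf("%s: verification failed, correct before proceeding\n", p->name);
            return false;
        }
        if (!p->exit_criteria_met()) {
            printf("%s: exit criteria not met, cannot hand over\n", p->name);
            return false;
        }
        printf("%s: done, outputs ready for the next phase\n", p->name);
        return true;
    }

    /* Simulated criteria for the design phase of Figure 2.1. */
    static bool srs_approved(void)     { return true; }
    static void produce_design(void)   { puts("producing SDD and program specs"); }
    static bool design_reviewed(void)  { return true; }
    static bool traceable_to_srs(void) { return true; }

    int main(void) {
        struct etvx_phase design = {
            "design", srs_approved, produce_design,
            design_reviewed, traceable_to_srs
        };
        return run_phase(&design) ? 0 : 1;
    }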
Check your progress 3

Discuss any two advantages of the ETVX system.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

2.5 LET US SUM UP

This lesson provided a hands-on view of quality, covering quality assurance
and quality control, and the validation and verification processes of a software
project. A model known as the Entry Task Verification eXit (ETVX) model offers
several advantages for effective verification and validation.

Check Your Progress: Model Answers

1. Quality assurance is defect-prevention oriented, while quality control is
defect-detection and defect-correction oriented.

2. Validation is the process of evaluating a system or component during or
at the end of the development process to determine whether it satisfies
specified requirements.
3. The ETVX system provides,
a. The verification for each phase (or each activity in each phase) helps
prevent defects, or at least, minimizes the time delay between defect
injection and defect detection.
b. Documentation of the detailed tasks that comprise each phase reduces
the ambiguity in interpretation of the instructions and thus minimizes
the variations that can come from repeated executions of these tasks by
different individuals.

LESSON 3

LIFE CYCLE MODELS


Contents
3.0 Aims and Objectives
3.1 Life Cycle Models
3.1.1 Waterfall Model
3.1.2 Prototyping and Rapid Application Development Models
3.1.3 Spiral or Iterative Model
3.1.4 The V model
3.1.5 Modified V model
3.2 Comparison of Various Life Cycle Models
3.3 Let Us Sum Up

3.0 AIMS AND OBJECTIVES

This lesson discusses software life cycle modeling, covering the waterfall
model, prototyping, the spiral model, the V model, and their modified versions.
At the end of the lesson, you should be able to choose a modeling
mechanism for your own software project development. The various attributes
of life cycle modeling are also presented.

3.1 LIFE CYCLE MODELS

The ETVX model characterizes a phase of a project. A life cycle model
describes how the phases combine together to form a complete project or life
cycle. Such a model is characterized by the following attributes.
The activities performed. In any given software project, apart from the
most common activities or phases – requirements gathering, design,
development, testing, and maintenance – there could be other activities as well.
Some of these activities could be technical activities (for example, porting) and
some could be non-technical (for example, hiring).
The deliverables from each activity. Each activity produces a set of
deliverables, which are the end products of that activity. For example, the
requirements gathering phase produces the SRS document, the design phase
produces the SDD document, and so on.
Methods of validation of the deliverables. The outputs produced by a
given activity represent the goal to be satisfied by that activity. Hence it is
necessary to have proper validation criteria for each output.

The sequence of activities. The different activities work together in
unison in a certain sequence of steps to achieve overall project goals. For
example, the process of requirements gathering may involve steps such as
interviews with customers, documentation of requirements, validation of
documented requirements with customers, and freezing of requirements.
These steps may be repeated as many times as needed to get the final frozen
requirements.
Methods of verification of each activity, including the mechanism
of communication amongst the activities. The different activities interact
with one another by means of communication methods. For example, when a
defect is found in one activity and is traced back to the causes in an earlier
activity, proper verification methods are needed to retrace steps from the point
of defect to the cause of the defect.
We will now look at some of the common life cycle models that are used
in software projects. For each model, we will look at:
1. a brief description of the model;
2. the relationship of the model to verification and validation activities;
and
3. typical scenarios where that life cycle model is useful.

Check your progress 1

What do you mean by deliverables?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

3.1.1 WATERFALL MODEL

In the Waterfall model, a project is divided into a set of phases (or
activities). Each phase is distinct; that is, there are clear lines of separation
between the phases, with very clear demarcation of the functions of each
phase.
A project starts with an initial phase, and upon completion of the phase,
moves onto the next phase. On the completion of this phase, the project moves
to the subsequent phase and so on. Thus the phases are strictly time
sequenced.
We depict one example of a project in the Waterfall model in Figure 3.1
The project goes through a phase of requirements gathering. At the end of
requirements gathering, a System Requirements Specification document is
produced. This becomes the input to the design phase. During the design
phase, a detailed design is produced in the form of a System Design

Description. With the SDD as input, the project proceeds to the development or
coding phase, wherein programmers develop the programs required to satisfy
the design. Once the programmers complete their coding tasks, they hand the
product to the testing team, who test the product before it is released.
If there is no problem in a given phase, then this method can work, going
in one direction (like a waterfall) as given in Figure 3.1. But what would happen
if there are problems after going to a particular phase? For example, suppose you
go into the design phase and find that it is not possible to satisfy the
requirements. Since the waterfall flows in only one direction, there is no easy
way to feed this problem back to the earlier phase.

Figure 3.1 The Waterfall Model


The main strength of waterfall model is its simplicity. The model is very
useful when a project can actually be divided into watertight compartments.
But very few software projects can be divided thus. The major drawback in the
waterfall model arises from the delay in feedback among the phases, and thus
the ineffectiveness of verification and validation activities. An error in one phase
is not detected till at least the next phase. When a given phase detects an error,
the communication is only to the immediately preceding phase. This sequential
nature of communication among the phases can introduce inordinate delays in
resolving the problem. The reduced responsiveness that is inherent in the
model and the fact that the segregation of phases is unrealistic severely
restricts the applicability of this method.

3.1.2 Prototyping and Rapid Application Development Models

Prototyping and Rapid Application Development (RAD) models recognize
and address the following issues.
1. Early and frequent user feedback will increase the chances of a software
project meeting the customer requirements.
2. Changes are unavoidable and the software development process must be
able to adapt itself to rapid changes.
The Prototyping model comprises the following activities.

1. The software development organization interacts with customers to
understand their requirements.
2. The software development organization produces a prototype to show
what the eventual software system would look like. This prototype
would have models of what the input screens and output reports
would look like, in addition to having some "empty can" functionality
to demonstrate the workflow and processing logic.
3. The customer and the development organization review the prototype
frequently so that the customer’s feedback is taken very early in the
cycle (that is, during the requirements gathering phase).
4. Based on the feedback and the prototyping that is produced, the
software development organization produces the System
Requirements Specification document.
5. Once the SRS document is produced, the prototype can be discarded.
6. The SRS document is used as the basis for further design and
development.
Thus, the prototype is simply used as a means of quickly gathering (the
right) requirements. This model has built-in mechanisms for verification and
validation of the requirements. As the prototype is being developed, the
customer’s frequent feedback acts as a validation mechanism. Once the SRS is
produced, it acts as a validation mechanism for the design and subsequent
steps. But the verification and validation activities of the subsequent phases are
actually dictated by the life cycle model that is followed after the SRS is
obtained.
The Rapid Application Development model is a variation of the
Prototyping Model. Like the Prototyping Model, the RAD Model relies on
feedback and interaction by the customers to gather the initial requirements.
However, the Prototyping model differs from the RAD Model on two counts.
First, in the RAD Model, it is not a prototype that is built but the actual
product itself. That is, the built application (prototype, in the previous model) is
not discarded. Hence, it is named Rapid Application Development Model.
Second, in order to ensure formalism in capturing the requirements in
the design and subsequent phases, a Computer Aided Software Engineering
(CASE) tool is used throughout the life cycle, right from requirements gathering.
Such CASE tools have
 Methodologies to elicit requirements
 Repositories to store the gathered requirements and all downstream
entities such as design objects; and
 Mechanisms to automatically translate the requirements stored in the
repositories to design and generate the code in the chosen programming
environment.

This method can have wider applicability for even general-purpose
products. The automatic generation of the design and programs produced by a
CASE tool makes this model more attractive.

Check your progress 2

Discuss the RAD model.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

3.1.3 SPIRAL OR ITERATIVE MODEL

The Spiral or Iterative model, given in Figure 3.2, follows a process in
which the requirements gathering, design, coding, and testing are performed
iteratively till all requirements are met. There is also a good amount of overlap
among the activities of requirements gathering, design, coding, and testing
following this model. What phase the product is in is difficult to conclude as
each requirement can be at a different phase. The only conclusion that can be
made is at what phase each of the requirements is in. If a defect is produced in
any phase of a given requirement, it may cause that requirement to revisit an
earlier phase. This model enables incremental development whereby the
product evolves, with requirements getting added to it dynamically. This
enables the product to be demonstrated, at any point of time, with the
functionality available at that point of time. It also enables the “increments” to
be sent to the customer for approval. The progress of the product can be seen
from the beginning of the project as the model delivers "increments" at regular
intervals. Even though it will be very difficult to plan a release date following
this model, it allows the progress to be tracked and the customer approvals to
be obtained at regular intervals, thereby reducing the risk of finding major
defects at a later point of time. Table 3.1 gives an example of phases for some of the
requirements in the product.
Table 3.1 Illustration of Phases in the Spiral Model

Requirement        Status / Phase currently in
Requirement-1      Coding
Requirement-2      Design
Requirement-3      Requirement
Requirement-4      Testing
Requirement-5      Released

Figure 3.2 The Spiral Model
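
The per-requirement bookkeeping implied by Table 3.1 can be captured with a simple data structure. The sketch below is an invented illustration (the enum, struct, and names are assumptions, not from the text): the product as a whole has no single phase; only each requirement's phase can be stated.

    #include <stdio.h>

    /* Phases a single requirement can be in; names follow Table 3.1. */
    enum phase { REQUIREMENT, DESIGN, CODING, TESTING, RELEASED };

    static const char *phase_name[] = {
        "Requirement", "Design", "Coding", "Testing", "Released"
    };

    struct requirement {
        const char *id;
        enum phase  current;  /* each requirement advances independently */
    };

    int main(void) {
        /* Snapshot mirroring Table 3.1. */
        struct requirement reqs[] = {
            { "Requirement-1", CODING      },
            { "Requirement-2", DESIGN      },
            { "Requirement-3", REQUIREMENT },
            { "Requirement-4", TESTING     },
            { "Requirement-5", RELEASED    },
        };
        for (size_t i = 0; i < sizeof reqs / sizeof reqs[0]; i++)
            printf("%s is in the %s phase\n",
                   reqs[i].id, phase_name[reqs[i].current]);
        return 0;
    }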

3.1.4 THE V MODEL

The Waterfall Model viewed testing as a post-development (that is, post-
coding) activity. The Spiral Model took this one step further and tried to break
up the product into increments, each of which can be tested separately. The V
Model starts off being similar to the Waterfall Model in that it envisages product
development to be made up of a number of phases or levels. However, the new
perspective that the V Model brings in is that different types of testing apply at
different levels, as illustrated in Figure 3.3. Thus, from a testing perspective, the
type of tests that need to be done at each level vary significantly.
Given these levels, what kind of tests apply in each of these levels? To
begin with, for overall business requirements, eventually whatever software is
developed should fit into and work in this overall context and should be
accepted by the end users, in their environment. This testing, the final prod of
the pudding, is acceptance testing. But, before the product is deployed m the
customer's environment, the product vendor should test it as an entire unit to
make sure that all the software requirements are satisfied by the product that
is developed. This testing of the entire software system can be called system
testing. Since high-level design views the system as being made up of
interoperating and integrated (software) subsystems, the individual
subsystems should be integrated and tested together before a full blown
system test can be done. This testing of high-level design corresponds to
integration testing. The components that are the outputs of the low-level design
have to be tested independently before being integrated. Thus, the testing
corresponding to the low-level design phase is component testing. Finally, since
coding produces several program units, each of these smaller program units
has to be tested independently before trying to combine them together to
form components. This testing of the program units forms unit testing.

Figure 3.3 The V-Process Model


Not only are the skill sets required for designing these different types of
tests different, but also, there is no reason to defer the designing of the tests till
the very end. As and when each activity on the left-hand side of the "V" is being
carried out, the design of the corresponding type of tests can be carried out. By
performing an early design of the tests and deferring only the test execution till
the end, we achieve three important gains.
• First, we achieve more parallelism and reduce the end-of-cycle time taken
for testing.
• Second, by designing tests for each activity upfront, we are building in
better upfront validation, thus again reducing last-minute surprises.
• Third, tests are designed by people with appropriate skill sets.
This is the basis for the V Model, which presents excellent advantages for
verification and validation.
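
As a small illustration of the lowest level of the V, the sketch below unit-tests a single program unit in isolation with the standard C assert macro. The function and the test values are invented for the example; the same idea scales up through component, integration, system, and acceptance tests.

    #include <assert.h>
    #include <stdio.h>

    /* Program unit under test: hypothetical leap-year check. */
    static int is_leap_year(int year) {
        return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
    }

    int main(void) {
        /* Tests designed on the left side of the V (during coding),
         * executed on the right side before units are combined. */
        assert( is_leap_year(2000));  /* divisible by 400 */
        assert(!is_leap_year(1900));  /* divisible by 100 but not by 400 */
        assert( is_leap_year(2024));  /* divisible by 4 only */
        assert(!is_leap_year(2023));  /* common year */
        puts("all unit tests passed");
        return 0;
    }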

Check your progress 3

Discuss the different types of testing performed in the V model.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

3.1.5 MODIFIED V MODEL

The V Model split the design and execution portion of the various types of
tests and attached the test design portion to the corresponding earlier phases of
the software life cycle.
An assumption made there was that even though the activity of test
execution was split into execution of tests of different types, the execution
cannot happen until the entire product is built. For a given product, I different
units and components can be in different stages of evolution. The V Model does
not explicitly address this natural parallelism commonly found in product
development.
Just as the V Model introduced various types of testing, the modified V
model introduces various phases of testing. A phase of testing has a one to-one
mapping to the types of testing, that is, there is a unit-testing phase,
component-testing phase, and so on. Once a unit has completed the unit -
testing phase, it becomes part of a component and enters the component
testing phase. It then moves to integration-testing phase and so on. Rather
than view the product as going through different types of tests (as the V model
does), the modified V Model views each part of the product to go through
different phases of testing. These are actually two sides of the same coin and
thus provide complementary views. The main advantage the modified V Model
brings to the table is the recognition of the parallelism present in different parts
of the product and assigning each part to the most appropriate phase of testing
that is possible. In Figure 2.6, the columns of the table represent one side of the V,
and the rows (which are test phases) represent the other side of the V.

3.2 COMPARISON OF VARIOUS LIFE CYCLE MODELS

As can be seen from the above discussion, each of the models has its
advantages and disadvantages. Each of them has applicability in a specific
scenario. Each of them also presents different issues, challenges, and
opportunities for verification and validation, as depicted in Figure 3.4.

Figure 3.4 Model applicability and relevance to verification and validation

3.3 LET US SUM UP

We are at the end of the lesson, in which you learned about the different types
of testing and the different life cycle models, including the RAD model.
Prototyping is a method of building a blueprint of a software project, i.e.,
implementing it for trial before making it a reality.

Check Your Progress: Model Answers

1. The deliverables from each activity are the end products of that activity.
For example, the requirements gathering phase produces the SRS
document, and the design phase produces the SDD (System Design
Description) document.
2. The Rapid Application Development model is a variation of the
Prototyping Model. Like the Prototyping Model, the RAD Model relies on
feedback and interaction by the customers to gather the initial
requirements. However, the Prototyping model differs from the RAD
Model on two counts.
3. Unit testing, component testing, integration testing, system testing,
and acceptance testing.

References:

1. http://en.wikipedia.org/wiki/Software_development_process
2. Srinivasan Desikan & Gopalswamy Ramesh, Software Testing: Principles
and Practices, Pearson Education, 2006.
3. http://www.stylusinc.com/Common/Concerns/SoftwareDevtPhilosophy.php

LESSON 4

WHITE BOX TESTING & STATIC TESTING


Contents
4.0 Aims and Objectives
4.1 White Box Testing
4.2 Static Testing
4.2.1 Static Testing by Humans
4.2.1.1 Desk Checking
4.2.1.2 Code Walkthrough
4.2.1.3 Formal Inspection
4.2.1.4 Combining Various Methods
4.2.2 Static Analysis Tools
4.2.2.1 Coding Review Checklist
4.3 Let Us Sum Up

4.0 AIMS AND OBJECTIVES

In this lesson, we discuss the foremost level of testing, i.e., white box
testing. Another interesting type of testing, called static testing, is also
discussed in detail with its sub-heads, namely static testing by humans and
automated static analysis tools.
At the end of this lesson, you should be able to understand the basic
testing methods available and the tools used for them.

4.1 WHITE BOX TESTING

White box testing is a way of testing the external functionality of the code
by examining and testing the program code that realizes the external
functionality. This is also known as clear box, glass box, or open box testing;
its classification is given in Figure 4.1.

Figure 4.1 Classification of White Box Testing
White box testing takes into account the program code, code structure
and internal design flow. A number of defects come about because of incorrect
translation of requirements and design into program code. Some other defects
are created by programming errors and programming language idiosyncrasies.
The different methods of white box testing reduce the delay between the
injection of a defect in the program code and its detection. Furthermore, since
the program code represents what the product actually does (rather than what
the product is intended to do), testing by looking at the program code makes us
get closer to what the product is actually doing.
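
Because white box tests are derived from the code structure itself, a common goal is to exercise every statement and branch of a unit. A minimal sketch (the function and the test values are invented for illustration):

    #include <assert.h>
    #include <stdio.h>

    /* Unit with two branches; the tests below are chosen so that
     * each branch is executed at least once. */
    static int abs_value(int x) {
        if (x < 0)      /* branch 1: negative input */
            return -x;
        return x;       /* branch 2: fall-through */
    }

    int main(void) {
        assert(abs_value(-5) == 5); /* drives the x < 0 branch */
        assert(abs_value(7)  == 7); /* drives the fall-through branch */
        puts("both branches covered");
        return 0;
    }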

Check your progress 1

Describe white box testing.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

4.2 STATIC TESTING

Static testing is a type of testing which requires only the source code of
the product, not the binaries or executables. Static testing does not involve
executing the programs on computers but involves select people going through
the code to find out whether
• The code works according to the functional requirements;
• The code has been written in accordance with the design developed
earlier in the project life cycle;
• The code for any functionality has been missed out; and
• The code handles errors properly.
Static testing can be done by humans or with the help of specialized tools.

4.2.1 STATIC TESTING BY HUMANS

These methods rely on the principle of humans reading the program code
to detect errors rather than computers executing the code to find errors. This
process has several advantages.
1. Sometimes humans can find errors that computers cannot. For example,
when there are two variables with similar names and the programmer
used a "wrong" variable by mistake in an expression, the computer will
not detect the error but execute the statement and produce incorrect
results, whereas a human being can spot such an error (a concrete
sketch of this appears after this list).
2. By making multiple humans read and evaluate the program, we can get
multiple perspectives and therefore have more problems identified
upfront than a computer could.
3. A human evaluation of the code can compare it against the specifications
or design and thus ensure that it does what it is intended to do. This may
not always be possible when a computer runs a test.
4. A human evaluation can detect many problems at one go and can even
try to identify the root causes of the problems. More often than not,
multiple problems can get fixed by attending to the same root cause.
Typically, in a reactive testing, a test uncovers one problem (or, at best, a
few problems) at a time. Often, such testing only reveals the symptoms
rather than the root causes. Thus, the overall time required to fix all the
problems can be reduced substantially by a human evaluation.
5. By making humans test the code before execution, computer resources
can be saved. Of course, this comes at the expense of human resources.
6. A proactive method of testing like static testing minimizes the delay in
identification of the problems. The sooner a defect is identified and
corrected, the lesser is the cost of fixing the defect.
7. From a psychological point of view, finding defects later in the cycle (for
example, after the code is compiled and the system is being put together)
creates immense pressure on programmers. They have to fix defects with
less time to spare. With this kind of pressure, there are higher chances of
other defects creeping in.
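
To make advantage 1 above concrete (as promised in that item), the sketch below compiles cleanly and runs, yet produces a wrong result; a human desk checker comparing the code against its intent spots the slip at once. All names are invented.

    #include <stdio.h>

    /* Intent: compute the total price as quantity * unit_price.
     * The two similarly named variables invite the slip below. */
    int main(void) {
        int unit_price  = 25;
        int units_price = 0;  /* similar name, accidentally left at 0 */
        int quantity    = 4;

        /* Defect: the "wrong" variable is used. The compiler accepts
         * this and the program runs, producing 0 instead of 100. */
        int total = quantity * units_price;

        printf("total = %d (expected %d)\n", total, quantity * unit_price);
        return 0;
    }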
There are multiple methods to achieve static testing by humans. They are (in
the increasing order of formalism) as follows.
1. Desk checking of the code
2. Code walkthrough
3. Code review
4. Code inspection

Since static testing by humans is done before the code is compiled and
executed, some of these methods can be viewed as process-oriented or defect
prevention-oriented or quality assurance-oriented activities rather than pure
testing activities. Especially as the methods become increasingly formal (for
example, Fagan Inspection), these traditionally fall under the “process” domain.
They find a place in formal process models such as ISO 9001, CMMI, and so on
and are seldom treated as part of the “testing” domain. Nevertheless, as
mentioned earlier in this study material, we take a holistic view of "testing" as anything
that furthers the quality of a product. These methods have been included in
this chapter because they have visibility into the program code.
4.2.1.1 Desk checking Normally done manually by the author of the code, desk
checking is a method to verify the portions of the code for correctness. Such
verification is done by comparing the code with the design or specifications to
make sure that the code does what it is supposed to do, effectively. This is
the desk checking that most programmers do before compiling and executing
the code. Whenever errors are found, the author applies the corrections on
the spot. This method of catching and correcting errors is characterized by:
1. No structured method or formalism to ensure completeness; and
2. No maintaining of a log or checklist.
In effect, this method relies completely on the author's thoroughness,
diligence, and skills. There is no process or structure that guarantees or verifies
the effectiveness of desk checking. This method is effective for correcting
"obvious" coding errors but will not be effective in detecting errors that arise
due to incorrect understanding of requirements or incomplete requirements.
This is because developers (or, more precisely, programmers who are doing the
desk checking) may not have the domain knowledge required to understand the
requirements fully.
The main advantage offered by this method is that the programmer who
knows the code and the programming language very well is well equipped to
read and understand his or her own code. Also, since this is done by one
individual, there are fewer scheduling and logistics overheads. Furthermore, the
defects are detected and corrected with minimum time delay.
Some of the disadvantages of this method of testing are as follows.
1. A developer is not the best person to detect problems in his or her own
code. He or she may be tunnel visioned and have blind spots to certain
types of problems.
2. Developers generally prefer to write new code rather than do any form of
testing! (We will see more details of this syndrome later in the section on
challenges as well as when we discuss people issues.)
3. This method is essentially person-dependent and informal and thus may
not work consistently across all developers.

Owing to these disadvantages, the next two types of proactive methods
are introduced. The basic principle of walkthroughs and formal inspections is to
involve multiple people in the review process.
4.2.1.2 Code walkthrough This method and formal inspection (described in
the next section) are group-oriented methods. Walkthroughs are less formal
than inspections. The line drawn in formalism between walkthroughs and
inspections is very thin and varies from organization to organization. The
advantage that walkthrough has over desk checking is that it brings multiple
perspectives. In walkthroughs, a set of people look at the program code and
raise questions for the author. The author explains the logic of the code, and
answers the questions. If the author is unable to answer some questions, he or
she then takes those questions and finds their answers. Completeness is
limited to the area where questions are raised by the team.
4.2.1.3 Formal inspection Code inspection-also called Fagan Inspection
(named after the original formulator) - is a method, normally with a high degree
of formalism. The focus of this method is to detect all faults, violations, and
other side-effects. This method increases the number of defects detected by
1. demanding thorough preparation before an inspection/review;
2. enlisting multiple diverse views;
3. assigning specific roles to the multiple participants; and
4. going sequentially through the code in a structured manner.
A formal inspection should take place only when the author has made
sure the code is ready for inspection by performing some basic desk checking
and walkthroughs. When the code is in such a reasonable state of readiness, an
inspection meeting is arranged. There are four roles in inspection. First is the
author of the code. Second is a moderator, who is expected to formally run the
inspection according to the process. Third are the inspectors. These are the
people who actually provide review comments for the code. There are typically
multiple inspectors. Finally, there is a scribe, who takes detailed notes during
the inspection meeting and circulates them to the inspection team after the
meeting.
The author or the moderator selects the review team. The chosen
members have the skill sets to uncover as many defects as possible. In an
introductory meeting, the inspectors get copies (these can be hard copies or
soft copies) of the code to be inspected along with other supporting documents
such as the design document, requirements document, and any documentation
of applicable standards. The author also presents his or her perspective of what
the program is intended to do along with any specific issues that he or she may
want the inspection team to put extra focus on. The moderator informs the
team about the date, time, and venue of the inspection meeting. The inspectors
get adequate time to go through the documents and program and ascertain
their compliance to the requirements, design and standards.
The inspection team assembles at the agreed time for the inspection
meeting (also called the defect logging meeting). The moderator takes the team
sequentially through the program code, asking each inspector if there are any
defects in that part of the code. If any of the inspectors raises a defect, then the
inspection team deliberates on the defect and, when it is agreed that there is a
defect, classifies it in two dimensions – minor/major and systemic/mis-execution.
A mis-execution defect is one which, as the name suggests, happens because of
an error or slip on the part of the author. It is unlikely to be repeated later, either
in this work product or in other work products. An example of this is using a
wrong variable in a statement. Systemic defects, on the other hand, can require
correction at a different level. For example, an error such as using some
machine-specific idiosyncrasies may have to be removed by changing the coding
standards. Similarly, minor defects are defects that may not substantially affect
a program, whereas major defects need immediate attention.
A scribe formally documents the defects found in the inspection meeting,
and the author takes care of fixing these defects. In case the defects are severe,
the team may optionally call for a review meeting to inspect the fixes, to ensure
that they address the problems. In any case, defects found through inspection
need to be tracked till completion, and someone in the team has to verify that
the problems have been fixed properly.
4.2.1.4 Combining various methods The methods discussed above are not
mutually exclusive. They need to be used in a judicious combination to be
effective in achieving the goal of finding defects early.
Formal inspections have been found very effective in catching defects
early. Some of the challenges to watch out for in conducting formal inspections
are as follows.
1. These are time consuming. Since the process calls for preparation as well
as formal meetings, these can take time.
2. The logistics and scheduling can become an issue since multiple people
are involved.
3. It is not always possible to go through every line of code, with several
parameters and their combinations in mind to ensure the correctness of
the logic, side-effects and appropriate error handling. It may also not be
necessary to subject the entire code to formal inspection.
In order to overcome the above challenges, it is necessary to identify,
during the planning stages, which parts of the code will be subject to formal
inspections. Portions of code can be classified on the basis of their criticality or
complexity as "high," "medium," and "low." High or medium complex critical
code should be subject to formal inspections, while those classified as "low" can
be subject to either walkthroughs or even desk checking.
Desk checking, walkthrough, review and inspection are not only used for
code but can be used for all other deliverables in the project life cycle such as
documents, binaries, and media.

Check your progress 2

What are the methods of static testing by humans?


Notes: a) Write your answer in the space given below

b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

4.2.2 Static Analysis Tools

The review and inspection mechanisms described above involve a
significant amount of manual work. There are several static analysis tools
available in the market that can reduce the manual work and perform analysis
of the code to find out errors such as those listed below (a small illustrative
fragment follows the list).
1. whether there are unreachable codes (usage of GOTO statements
sometimes creates this situation; there could be other reasons too)
2. variables declared but not used
3. mismatch in definition and assignment of values to variables
4. illegal or error prone typecasting of variables
5. use of non-portable or architecture-dependent programming constructs
6. memory allocated but not having corresponding statements for freeing
it up
7. calculation of cyclomatic complexity
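As an illustration, consider the following small C fragment (hypothetical code
written for this lesson, not taken from the text) that is seeded with several of
the defects listed above; a good static analysis tool, or a compiler run at a
high warning level, would flag each of them.

#include <stdlib.h>

/* Hypothetical fragment seeded with defects a static analyzer would flag. */
int process(int n)
{
    int unused;              /* 2: variable declared but not used        */
    char *buf = malloc(64);  /* 6: memory allocated but never freed      */

    if (buf == NULL)
        return -1;
    buf[0] = (char)n;        /* 4: error prone typecast from int to char */
    return 0;
    n = n + 1;               /* 1: unreachable code after the return     */
}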
These static analysis tools can also be considered as an extension of
compilers as they use the same concepts and implementation to locate errors. A
good compiler is also a static analysis tool. For example, most C compilers
provide different "levels" of code checking which will catch the various types of
programming errors given above.
Some of the static analysis tools can also check compliance for coding
standards as prescribed by standards such as POSIX. These tools can also
check for consistency in coding guidelines (for example, naming conventions,
allowed data types, permissible programming constructs, and so on).
While following any of the methods of human checking - desk checking,
walkthroughs, or formal inspections - it is useful to have a code review checklist.
Given below is a checklist that covers some of the common issues. Every
organization should develop its own code review checklist. The checklist should
be kept current with new learnings as they come about.
In a multi-product organization, the checklist may be at two levels-first,
an organization-wide checklist that will include issues such as organizational
coding standards, documentation standards, and so on; second, a product or
project-specific checklist that addresses issues specific to the product or
project.

4.2.2.1 CODING REVIEW CHECKLIST

Data Item Declaration Related


 Are the names of the variables meaningful?
 If the programming language allows mixed case names, are there
variable names with confusing use of lower case letters and capital
letters?
 Are the variables initialized?
 Are there similar sounding names (especially words in singular and
plural)? [These could be possible causes of unintended errors.]
 Are all the common structures, constants and flags to be used defined
in a header file rather than in each file separately?
Data Usage Related
 Are values of right data types being assigned to the variables?
 Is the access of data from any standard files, repositories, or databases
done through publicly supported interfaces?
 If pointers are used, are they initialized properly?
 Are bounds of array subscripts and pointers properly checked?
 Has the usage of similar-looking operators (for example, = and == or &
and && in C) been checked?
Standards Related
 Does the code follow the coding conventions of the organization?
 Does the code follow any coding conventions that are platform specific
(for example, GUI calls specific to Windows or Swing)?
Style Related
 Are unhealthy programming constructs (for example, global variables in
C, ALTER statement in COBOL) being used in the program?
 Is there usage of specific idiosyncrasies of a particular machine
architecture or a given version of an underlying product (for example,
using “undocumented” features)?
 Is sufficient attention being paid to readability issues like indentation of
code?
Miscellaneous
 Have you checked for memory leaks (for example, memory acquired but
not explicitly freed)?
Documentation Related
 Is the code adequately documented, especially where the logic is complex
or the section of code is critical for product functioning?
 Is appropriate change history documented?
 Are the interfaces and the parameters thereof properly documented?

4.3 LET US SUM UP

We stand at the end of this lesson, where you have understood the fundamentals
of white box testing and static testing. Static testing was categorized into static
testing by humans and static testing using tools, and the methods and practices
of each were also discussed.

Check Your Progress: Model Answers

1. White box testing is a way of testing the external functionality of the code
by examining and testing the program code that realizes the external
functionality. This is also known as clear box, or glass box or open box
testing.
2. The methods are
a. Desk checking of the code
b. Code walkthrough
c. Code review
d. Code inspection
3. The Software Development Lifecycle and its phases need software
components to face the market regime.
4. Software components enable code reuse through self-contained, binary
modules created by independent developers. A component can be
written in any computer language that supports the creation of
components. Components are plugged into the application at runtime.

LESSON 5

STRUCTURAL TESTING
Contents
5.0 Aims and Objectives
5.1 Structural Testing
5.2 Unit/Code Functional Testing
5.3 Code Coverage Testing
5.3.1 Statement Coverage
5.3.2 Path Coverage
5.3.3 Condition Coverage
5.3.4 Function Coverage
5.4 Code Complexity Testing
5.5 Challenges in White Box Testing
5.6 Let Us Sum Up

5.0 AIMS AND OBJECTIVES

This lesson is the end part of Unit I, where we will discuss structural
testing, code functional testing and code coverage testing.
At the end of this lesson, the reader might be able to know the challenges
in white box testing.

5.1 Structural Testing

Structural testing takes into account the code, code structure, internal
design, and how they are coded. The fundamental difference between structural
testing and static testing is that in structural testing tests are actually run by
the computer on the built product, whereas in static testing the product is
tested by humans using just the source code and not the executables or
binaries.
Structural testing entails running the actual product against some pre-
designed test cases to exercise as much of the code as possible or necessary. A
given portion of the code is exercised if a test case causes the program to
execute that portion of the code when running the test.

Check your progress 1

List the basic difference between structural testing and static testing.
Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.

--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

5.2 Unit/Code Functional Testing

This initial part of structural testing corresponds to some quick checks
that a developer performs before subjecting the code to more extensive code
coverage testing or code complexity testing. This can happen by several
methods.
1. Initially, the developer can perform certain obvious tests, knowing the
input variables and the corresponding expected output variables. This
can be a quick test that checks out any obvious mistakes. By repeating
these tests for multiple values of input variables, the confidence level of
the developer to go to the next level increases. This can be done prior to
formal reviews of static testing so that the review mechanism does not
waste time catching obvious errors.
2. For modules with complex logic or conditions, the developer can build a
"debug version" of the product by putting intermediate print statements
and making sure the program is passing through the right loops and
iterations the right number of times. It is important to remove the
intermediate print statements after the defects are fixed.
3. Another approach to do the initial test is to run the product under a
debugger or an Integrated Development Environment (IDE). These tools
allow single stepping of instructions (allowing the developer to stop at the
end of each instruction, view or modify the contents of variables, and so
on), setting break points at any function or instruction, and viewing the
various system parameters or program variable values.
All the above fall more under the "debugging" category of activities than
under the "testing" category of activities. All the same, these are intimately
related to the knowledge of code structure and hence we have included these
under the "white box testing" head. This is consistent with our view that testing
encompasses whatever it takes to detect and correct defects in a product.

5.3 Code Coverage Testing

Since a product is realized in terms of program code, if we can run test
cases to exercise the different parts of the code, then that part of the product
realized by the code gets tested. Code coverage testing involves designing and
executing test cases and finding out the percentage of code that is covered by
testing. The percentage of code covered by a test is found by adopting a
technique called instrumentation of code. There are specialized tools available
to achieve instrumentation. Instrumentation rebuilds the product, linking the
product with a set of libraries provided by the tool vendors. This instrumented
code can monitor and keep an audit of what portions of code are covered. The
tools also allow reporting on the portions of the code that are covered
frequently, so that the critical or most-often-used portions of code can be
identified.

Code coverage testing is made up of the following types of coverage.
1. Statement coverage
2. Path coverage
3. Condition coverage
4. Function coverage
5.3.1 Statement coverage Program constructs in most conventional
programming languages can be classified as
1. Sequential control flow
2. Two-way decision statements like if then else
3. Multi-way decision statements like Switch
4. Loops like while do, repeat until and for
Object-oriented languages have all of the above and, in addition, a number
of other constructs and concepts. We will take up issues pertaining to object-
oriented languages together. Statement coverage refers to writing test cases that
execute each of the program statements. One can start with the assumption
that the more the code covered, the better is the testing of the functionality, as
the code realizes the functionality. Based on this assumption, code coverage can
be achieved by providing coverage to each of the above types of statements. When
we consider a two-way decision construct like the if statement, then to cover all
the statements, we should also cover the then and else parts of the if statement.
This means we should have, for each if else, (at least) one test case to test the
then part and (at least) one test case to test the else part.
The multi-way decision construct such as a switch statement can be reduced to
multiple two-way if statements. Thus, to cover all possible switch cases, there
would be multiple test cases. Loop constructs present more variations to take
care of. A loop - in various forms such as for, while, repeat, and so on - is
characterized by executing a set of statements repeatedly until or while certain
conditions are met. A good percentage of the defects in programs come about
because of loops that do not function properly. More often, loops fail in what are
called "boundary conditions." One of the common looping errors is that the
termination condition of the loop is not properly stated. In order to make sure
that there is better statement coverage for statements within a loop, there
should be test cases that
1. Skip the loop completely, so that the situation of the termination
condition being true before starting the loop is tested.
2. Exercise the loop between once and the maximum number of times, to
check all possible “normal” operations of the loop.
3. Try covering the loop around the "boundary" of n - that is, just below n,
n, and just above n - where n is the maximum number of iterations (see
the sketch following this list).
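The sketch below makes these three kinds of loop tests concrete, using a
hypothetical routine written for this lesson (the function, and its documented
limit MAX, are assumptions, not from the text).

#define MAX 10   /* a[] is documented to hold at most MAX elements */

int sum_first(const int a[], int n)   /* sums a[0] .. a[n-1] */
{
    int sum = 0;
    for (int i = 0; i < n; i++)       /* termination condition: i < n */
        sum += a[i];
    return sum;
}

/* Loop-coverage test values for n:
 *   n = 0           -> the loop is skipped completely (case 1)
 *   n = 1 and n = 5 -> "normal" operation, between once and MAX (case 2)
 *   n = 9, 10, 11   -> around the boundary n = MAX (case 3); n = 11
 *                      exposes a read past the end of a[] if the
 *                      caller does not validate n.
 */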

5.3.2 Path Coverage In path coverage, we split a program into a number of
distinct paths. A program (or a part of a program) can start from the beginning
and take any of the paths to its completion.
Let us take an example of a date validation routine. The date is accepted
as three fields mm, dd and yyyy. We have assumed that prior to entering this
routine, the values are checked to be numeric. To simplify the discussion, we
have assumed the existence of a function called leapyear which will return
TRUE if the given year is a leap year. There is an array called DayofMonth which
contains the number of days in each month.
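Since the routine itself is not reproduced in this lesson, the following is a
minimal sketch reconstructed from the description above (the exact logic of the
original may differ).

int leapyear(int y)   /* returns TRUE (1) if the given year is a leap year */
{
    return (y % 4 == 0 && y % 100 != 0) || (y % 400 == 0);
}

/* Number of days in each month; index 0 is unused. */
int DayofMonth[13] = {0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

int valid_date(int mm, int dd, int yyyy)
{
    if (mm < 1 || mm > 12)            /* path A: invalid month   */
        return 0;
    int max = DayofMonth[mm];
    if (mm == 2 && leapyear(yyyy))    /* February of a leap year */
        max = 29;
    if (dd < 1 || dd > max)           /* path B: invalid day     */
        return 0;
    return 1;                         /* path C: a valid date    */
}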
Path coverage provides a stronger condition of coverage than statement
coverage as it relates to the various logical paths in the program rather than
just program statements.
5.3.3 Condition coverage In the above example, even if we have covered all the
paths possible, it would not mean that the program is fully tested. For example,
we can make the program take the path A by giving a value less than 1 (for
example, 0) to mm and find that we have covered the path A and the program
has detected that the month is invalid. But, the program may still not be
correctly testing for the other condition, namely mm > 12. Furthermore, most
compilers perform optimizations to minimize the number of Boolean operations
and all the conditions may not get evaluated, even though the right path is
chosen. For example, when there is an OR condition (as in the first IF
statement above), once the first part of the IF (for example, mm < 1) is found to
be true, the second part will not be evaluated at all as the overall value of the
Boolean is TRUE. Similarly, when there is an AND condition in a Boolean
expression, when the first condition evaluates to FALSE, the rest of the
expression need not be evaluated at all.
Condition coverage - the number of conditions exercised by the test cases,
expressed as a percentage of the total number of conditions - gives an
indication of how well a set of test cases covers the conditions. Condition
coverage is a much stronger criterion than path coverage, which in turn is a
much stronger criterion than statement coverage.
5.3.4 Function coverage This is a new addition to structural testing to identify
how many program functions (similar to functions in "C" language) are covered
by test cases.
The requirements of a product are mapped into functions during the
design phase and each of the functions form a logical unit. For example, in a
database software, "inserting a row into the database" could be a function. Or,
in a payroll application, "calculate tax" could be a function. Each function
could, in turn, be implemented using other functions. While providing function
coverage, test cases can be written so as to exercise each of the different
functions in the code. The advantages that function coverage provides over the
other types of coverage are as follows.
1. Functions are easier to identify in a program and hence it is easier to
write test cases to provide function coverage.

2. Since functions are at a much higher level of abstraction than code, it is
easier to achieve 100 percent function coverage than 100 percent
coverage in any of the earlier methods.
3. Functions have a more logical mapping to requirements and hence can
provide a more direct correlation to the test coverage of the product. In
the next chapter, we will be discussing the requirements traceability
matrix, which tracks a requirement through the design, coding, and testing
phases. Functions provide one means to achieve this traceability.
Function coverage provides a way of testing this traceability.
4. Since functions are a means of realizing requirements, the importance of
functions can be prioritized based on the importance of the requirements
they realize. Thus, it would be easier to prioritize the functions for
testing. This is not necessarily the case with the earlier methods of
coverage.
5. Function coverage provides a natural transition to black box testing.
We can also measure how many times a given function is called. This will
indicate which functions are used most often and hence these functions become
the target of any performance testing and optimization. As an example, if, in
networking software, we find that the function that assembles and disassembles
the data packets is being used most often, it is appropriate to spend extra effort
in improving the quality and performance of that function. Thus, function
coverage can help in improving the performance as well as the quality of the
product.
5.3.5 Summary Code coverage testing involves dynamic testing methods of
executing the product with pre-written test cases, and finding out how much of
the code has been covered. If a better coverage of the code is desired, several
iterations of testing may be required. For each iteration, one has to go through
the statistics and write a new set of test cases covering portions of the code
not covered by earlier test cases. To do this type of testing, not only does one
need to understand the code and its logic, but one also needs to know how to
write effective test cases that can cover good portions of the code. This type of
testing can also be referred to as "gray box testing" as it uses a combination of
"white box and black box methodologies" (white + black = gray) for effectiveness.
Performance analysis and optimization Code coverage tests can identify the
areas of a code that are executed most frequently. Extra attention can then be
paid to these sections of the code. If further performance improvement is no
longer possible, then other strategies like caching can be considered. Code
coverage testing provides information that is useful in making such
performance-oriented decisions.
Resource usage analysis White box testing, especially with instrumented code,
is useful in identifying bottlenecks in resource usage. For example, if a
particular resource like the RAM or network is perceived as a bottleneck, then
instrumented code can help identify where the bottlenecks are and point
towards possible solutions.
Checking of critical sections or concurrency related parts of code Critical
sections are those parts of a code that cannot have multiple processes

executing at the same time. Coverage tests with instrumented code are one of
the best means of identifying any violations of such concurrency constraints
through critical sections.
Identifying memory leaks Every piece of memory that is acquired or allocated
by a process (for example, by malloc in C) should be explicitly released (for
example, by free in C). If not, the acquired memory is "lost" and the amount of
available memory decreases correspondingly. Over time, there would be no
memory available for allocation to meet fresh memory requests and processes
start failing for want of memory. The various white box testing methods can
help identify memory leaks. Most debuggers or instrumented code can tally
allocated and freed memory.
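The classic leak pattern described above looks like this in C (a deliberately
broken fragment written for illustration):

#include <stdlib.h>
#include <string.h>

/* Each call acquires memory and never releases it; over many calls,
   available memory shrinks until fresh allocations start failing. */
void leaky(const char *msg)
{
    char *copy = malloc(strlen(msg) + 1);
    if (copy == NULL)
        return;
    strcpy(copy, msg);
    /* ... copy is used here ... */
    /* free(copy) is missing: a debugger or instrumented build that
       tallies malloc/free pairs will report this block as leaked. */
}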
Dynamically generated code White box testing can help identify security holes
effectively, especially in a dynamically generated code. In instances where a
piece of code is dynamically created and executed, the functionality of the
generated code should be tested on the fly. For example, when using web
services, there may be situations wherein certain parameters are accepted from
the users and html/java code may be generated and passed on to a remote
machine for execution. Since after the transaction or service is executed, the
generated code ceases to exist, testing the generated code requires code
knowledge. Hence, the various techniques of white box testing discussed in this
chapter come in handy.

Check your progress 2

What are the different types of code coverage methods?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

5.4 Code Complexity Testing

In previous sections, we saw the different types of coverage that can be provided
to test a program. Two questions that come to mind while using this coverage
are:
1. Which of the paths are independent? If two paths are not independent,
then we may be able to minimize the number of tests.
2. Is there an upper bound on the number of tests that must be run to
ensure that all the statements have been executed at least once?
Cyclomatic complexity is a metric that quantifies the complexity of a
program and thus provides answers to the above questions.
A program is represented in the form of a flow graph. A flow graph
consists of nodes and edges. In order to convert a standard flow chart into a
flow graph to compute cyclomatic complexity, the following steps can be taken.

1. Identify the predicates or decision points (typically the Boolean
conditions or conditional statements) in the program.
2. Ensure that the predicates are simple (that is, no and/or, and so on in
each predicate). Figure 5.1 shows how to break up a condition having an
OR into simple predicates. Similarly, if there are loop constructs, break
the loop termination checks into simple predicates.
3. Combine all sequential statements into a single node. The reasoning here
is that these statements all get executed, once started.
4. When a set of sequential statements are followed by a simple predicate
(as simplified in (2) above), combine all the sequential statements and the
predicate check into one node and have two edges emanating from this
one node. Such nodes with two edges emanating from them are called
predicate nodes.
5. Make sure that all the edges terminate at some node; add a node to
represent all the sets of sequential statements at the end of the program.
We have illustrated the above transformation rules of a conventional flow
chart to a flow diagram in Figure 5.1 (a) and 5.1 (b). We have color coded the
different boxes so that the reader can see the transformation more clearly. The
flow chart elements of a given color on the left-hand side get mapped to flow
graph elements of the corresponding nodes on the right-hand side. Intuitively, a
flow graph and the cyclomatic complexity provide indicators to the complexity of
the logic flow in a program and to the number of independent paths in a
program.
Figure 5.1 Flow graph translation of an OR into simple predicates: (a) a
predicate with a Boolean OR (b) an equivalent set of simple predicates
Calculating and using cyclomatic complexity For small programs cyclomatic
complexity can be calculated manually, but automated tools are essential as
several thousands of lines of code are possible in each program in a project. It
will be very difficult to manually create flow graphs for large programs. There
are several tools that are available in the market which can compute cyclomatic
complexity. But, we would like to caution that calculating the complexity of a
module after it has been built and tested may be too late - it may not be possible
to redesign a complex module after it has been tested. Thus, some basic
complexity checks must be performed on the modules before embarking upon
the testing (or even coding) phase. This can become one of the items to check
for in a code review. Based on the complexity number that emerges from using
the tool, one can conclude what actions need to be taken for the complexity
measure.
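As a worked illustration, here is the standard McCabe computation (assumed
here, since the lesson's own formula is not reproduced) applied to the
valid_date sketch of Section 5.3.2.

/* Cyclomatic complexity, per the standard McCabe definition:
 *     V(G) = E - N + 2      E = edges, N = nodes of the flow graph
 * or, equivalently for structured code with simple predicates,
 *     V(G) = number of predicate nodes + 1
 *
 * For the valid_date sketch, the simple predicates are
 *     mm < 1, mm > 12, mm == 2, leapyear(yyyy), dd < 1, dd > max
 * so V(G) = 6 + 1 = 7: there are seven independent paths, and at most
 * seven well-chosen test cases are needed to execute every statement
 * at least once.
 */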

5.5 CHALLENGES IN WHITE BOX TESTING

White box testing requires a sound knowledge of the program code and
the programming language. This means that the developers should get
intimately involved in white box testing. Developers, in general, do not like to
perform testing functions. This applies to structural testing as well as static
testing methods such as reviews. In addition, because of the timeline pressures,
the programmers may not "find time" for reviews (a euphemism for wanting to
do more coding). We will revisit this myth of dichotomy between testing and
development functions in the chapter on people issues (Chapter 13).
Human tendency of a developer being unable to find the defects in
his or her code As we saw earlier, most of us have blind spots in detecting
errors in our own products. Since white box testing involves programmers who
write the code, it is quite possible that they may not be most effective in
detecting defects in their own work products. An independent perspective could
certainly help.
Fully tested code may not correspond to realistic scenarios
Programmers generally do not have a full appreciation of the external
(customer) perspective or the domain knowledge to visualize how a product will
be deployed in realistic scenarios. This may mean that even after extensive
testing, some of the common user scenarios may get left out and defects may
creep in.
These challenges do not mean that white box testing is ineffective. But
when white box testing is carried out and these challenges are addressed by
other means of testing, there is a higher likelihood of more effective testing.

Check your progress 3

Discuss the challenges in white box testing.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

5.6 LET US SUM UP

Check Your Progress: Model Answers


1. The fundamental difference between structural testing and static testing
is that in structural testing tests are actually run by the computer on the
built product, whereas in static testing the product is tested by humans
using just the source code and not the executables or binaries.
2. Code coverage testing is made up of the following types of coverage.
1. Statement coverage
2. Path coverage
3. Condition coverage
4. Function coverage
3. Human tendency of a developer being unable to find the defects in his or
her code; fully tested code may not correspond to realistic scenarios.

References :

1. Software Testing: Principles and Practices – Srinivasan Desikan &
Gopalswamy Ramesh, 2006, Pearson Education.
2. http://www.softwaretestinghelp.com/white-box-testing/

UNIT - II
LESSON 6

BLACK BOX TESTING


Contents
6.0 Aims and Objectives
6.1 What is Black Box Testing?
6.2 Why Black Box Testing?
6.3 When to do Black Box Testing?
6.4 How to do Black Box Testing?
6.4.1 Requirements based Testing
6.4.2 Positive and Negative Testing
6.4.3 Boundary Value Analysis
6.5 Let Us Sum Up

6.0 AIMS AND OBJECTIVES

I hope you understood the various software development lifecycle models in
the earlier lessons. In this lesson, we will discuss black box testing and
the methods of black box testing. They are
• Requirements based
• Positive and Negative Testing
• Boundary Value Analysis

6.1 WHAT IS BLACK BOX TESTING

Black box testing involves looking at the specifications and does not
require examining the code of a program. Black box testing is done from the
customer's view point. The test engineer engaged in black box testing only
knows the set of inputs and expected outputs and is unaware of how those
inputs are transformed into outputs by the software. Black box tests are
convenient to administer because they use the complete finished product and
do not require any knowledge of its construction. Independent test laboratories
can administer black box tests to ensure functionality and compatibility.
Black-box test design treats the system as a "black-box", so it doesn't
explicitly use knowledge of the internal structure. Black-box test design is
usually described as focusing on testing functional requirements. Glass-box
test design allows one to peek inside the "box", and it focuses specifically on
using internal knowledge of the software to guide the selection of test data.

Black box testing thus requires a functional knowledge of the product to
be tested. It does not mandate the knowledge of the internal logic of the system
nor does it mandate the knowledge of the programming language used to build
the product. Our tests in the above example were focused towards testing the
features of the product (lock and key) and the different states; we already knew
the expected outcome. You may check if the lock works with some other key
(other than its own). You may also want to check with a hairpin or any thin
piece of wire if the lock works. We shall see in further sections, in detail, the
different kinds of tests that can be performed on a given product.

Check your progress 1

Define black box testing.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

6.2 WHY BLACK BOX TESTING?

Black box testing helps in the overall functionality verification of the
system under test.
Black box testing is done based on requirements It helps in identifying any
incomplete, inconsistent requirement as well as any issues involved when the
system is tested as a complete entity.
Black box testing addresses the stated requirements as well as implied
requirements Not all the requirements are stated explicitly, but are deemed
implicit. For example, inclusion of dates, page header, and footer may not be
explicitly stated in the report generation requirements specification. However,
these would need to be included while providing the product to the customer to
enable better readability and usability.
Black box testing encompasses the end user perspectives Since we want to
test the behavior of a product from an external perspective, end-user
perspectives are an integral part of black box testing.
Black box testing handles valid and invalid inputs It is natural for users to
make errors while using a product. Hence, it is not sufficient for black box
testing to simply handle valid inputs. Testing from the end-user perspective
includes testing for these error or invalid conditions. This ensures that the
product behaves as expected in a valid situation and does not hang or crash
when provided with an invalid input. These are called positive and negative test
cases.
The tester may or may not know the technology or the internal logic of the
product. However, knowing the technology and the system internals helps in
constructing test cases specific to the error-prone areas.

Test scenarios can be generated as soon as the specifications are ready. Since
requirements specifications are the major inputs for black box testing, test
design can be started early in the cycle.

6.3 WHEN TO DO BLACK BOX TESTING

Black box testing activities require involvement of the testing team from
the beginning of the software project life cycle, regardless of the software
development life cycle model chosen for the project.
Testers can get involved right from the requirements gathering and analysis
phase for the system under test. Test scenarios and test data are prepared
during the test construction phase of the test life cycle, when the software is in
the design phase.
Once the code is ready and delivered for testing, test execution can be
done. All the test scenarios developed during the construction phase are
executed. Usually, a subset of these test scenarios is selected for regression
testing.

6.4 HOW TO DO BLACK BOX TESTING

It is not possible to exhaustively test a product, however simple the
product is. Since we are testing external functionality in black box testing, we
need to arrive at a judicious set of tests that test as much of the external
functionality as possible, uncovering as many defects as possible, in as short a
time as possible. While this may look like a utopian wish list, the techniques we
will discuss in this section facilitate this goal.
This section deals with the various techniques to be used to generate
scenarios for effective black box testing. The various techniques we will discuss
are as follows
1. Requirements based testing
2. Positive and negative testing
3. Boundary value analysis
4. Decision tables
5. Equivalence partitioning
6. State based testing
7. Compatibility testing
8. User documentation testing
9. Domain testing

Check your progress 2

Write short notes on Decision tables.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.

--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

6.4.1 REQUIREMENTS BASED TESTING

Requirements testing deals with validating the requirements given in the
Software Requirements Specification (SRS) of the software system. Not all
requirements are explicitly stated; some of the requirements are implied or
implicit. Explicit requirements are stated and documented as part of the
requirements specification. Implied or implicit requirements are those that are
not documented but assumed to be incorporated in the system.
The precondition for requirements testing is a detailed review of the
requirements specification. Requirements review ensures that they are
consistent, correct, complete, and testable. This process ensures that some
implied requirements are converted and documented as explicit requirements,
thereby bringing better clarity to requirements and making requirements based
testing more effective.
Some organizations follow a variant of this method to bring more details
into requirements. All explicit requirements (from the Systems Requirements
Specifications) and implied requirements (inferred by the test team) are
collected and documented as "Test Requirements Specification" (TRS).
Requirements based testing can also be conducted based on such a TRS, as it
captures the testers' perspective as well. However, for simplicity, we will
consider SRS and TRS to be one and the same.
A Requirements Traceability Matrix (RTM) plays a valuable role in requirements
based testing.
1. Regardless of the number of requirements, ideally each of the
requirements has to be tested. When there are a large number of
requirements, it would not be possible for someone to manually keep
track of the testing status of each requirement. The RTM provides a tool
to track the testing status of each requirement without missing any (key)
requirements.
2. By prioritizing the requirements, the RTM enables testers to prioritize the
test case execution to catch defects in the high-priority areas as early as
possible. It is also used to find out whether there are adequate test cases
for high-priority requirements and to reduce the number of test cases for
low-priority requirements. In addition, if there is a crunch for time for
testing, the prioritization enables selecting the right features to test.
3. Test conditions can be grouped to create test cases or be represented as
unique test cases. The list of test case(s) that address a particular
requirement can be viewed from the RTM.
4. Test conditions/cases can be used as inputs to arrive at a size/effort/
schedule estimation of tests.

The Requirements Traceability Matrix provides a wealth of information for
various test metrics. Some of the metrics that can be collected or inferred from
this matrix are as follows.
• Requirements addressed priority wise - This metric helps in knowing the
test coverage based on the requirements: the number of tests created
for high-priority requirements versus tests created for low-priority
requirements.
• Number of test cases requirement wise - For each requirement, the total
number of test cases created.
• Total number of test cases prepared - Total of all the test cases prepared
for all requirements.
Once the test cases are executed, the test results can be used to collect
metrics such as
• Total number of test cases (or requirements) passed - Once execution is
completed, the total number of passed test cases and what percent of
requirements they correspond to.
• Total number of test cases (or requirements) failed - Once execution is
completed, the total number of failed test cases and what percent of
requirements they correspond to.
• Total number of defects in requirements - List of defects reported for each
requirement (defect density for requirements). This helps in doing an
impact analysis of which requirements have more defects and how they
will impact customers. A comparatively high defect density in low-priority
requirements is acceptable for a release. A high defect density in high-
priority requirements is considered a high-risk area, and may prevent a
product release.

Check your progress 3

What do you mean by TRS?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

6.4.2 POSITIVE AND NEGATIVE TESTING

Positive testing tries to prove that a given product does what it is
supposed to do. When a test case verifies the requirements of the product with
a set of expected outputs, it is called a positive test case. The purpose of positive
testing is to prove that the product works as per specification and expectations.

A product delivering an error when it is expected to give an error is also a part
of positive testing.
Positive testing can thus be said to check the product's behavior for
positive and negative conditions as stated in the requirement.
Negative testing is done to show that the product does not fail when an
unexpected input is given. The purpose of negative testing is to try and break
the system. Negative testing covers scenarios for which the product is not
designed and coded. In other words, the input values may not have been
represented in the specification of the product. These test conditions can be
termed as unknown conditions for the product as far as the specifications are
concerned. But, at the end-user level, there are multiple scenarios that are
encountered and that need to be taken care of by the product. It becomes even
more important for the tester to know the negative situations that may occur at
the end-user level so that the application can be tested and made foolproof. A
negative test would be a product not delivering an error when it should or
delivering an error when it should not.
The difference between positive and negative testing is in their coverage.
For positive testing, if all documented requirements and test conditions are
covered, then coverage can be considered to be 100 percent. If the specifications
are very clear, then coverage can be achieved. In contrast, there is no end to
negative testing, and 100 percent coverage in negative testing is impractical.
Negative testing requires a high degree of creativity among the testers to cover
as many “unknowns” as possible to avoid failure at a customer site.
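As a small illustration, the test fragment below (the specific values are our
own) exercises the valid_date sketch from Lesson 5 both positively and
negatively.

#include <assert.h>

int valid_date(int mm, int dd, int yyyy);   /* sketch from Lesson 5 */

int main(void)
{
    /* Positive tests: verify behavior stated in the specification. */
    assert(valid_date(1, 31, 2004) == 1);   /* a normal valid date   */
    assert(valid_date(13, 1, 2004) == 0);   /* an expected error is
                                               still a positive test */
    /* Negative test: input the routine was never designed for; the
       aim is to try and break it, not merely to confirm a result.   */
    assert(valid_date(-32768, 0, 0) == 0);
    return 0;
}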

6.4.3 BOUNDARY VALUE ANALYSIS

Conditions and boundaries are two major sources of defects in a software
product. Most of the defects in software products hover around conditions and
boundaries. By conditions, we mean situations wherein, based on the values of
various variables, certain actions would have to be taken; by boundaries, we
mean “limits” of values of the various variables.
The question from a testing perspective for the above problem is: what
test data is likely to reveal the most number of defects in the program?
Generally, it has been found that most defects in situations such as this happen
around the boundaries – for example, when buying 9, 10, 11, 19, 20, 21, 29,
30, 31 and similar number of items. While the reason for this phenomenon is
not entirely clear, some possible reasons are as follows.
• Programmers' tentativeness in using the right comparison operator, for
example, whether to use the <= operator or < operator when trying to
make comparisons.
• Confusion caused by the availability of multiple ways to implement loops
and condition checking. For example, in a programming language like C,
we have for loops, while loops and repeat loops. Each of these has
different terminating conditions for the loop and this could cause some
confusion in deciding which operator to use, thus skewing the defects
around the boundary conditions.

• The requirements themselves may not be clearly understood, especially
around the boundaries, thus causing even the correctly coded program
to not perform the correct way.
Another instance where boundary value testing is extremely useful in
uncovering defects is when there are internal limits placed on certain resources,
variables or data structures. Consider a database management system (or a file
system) which caches the recently used data blocks in a shared memory area.
Usually such a cached area is limited by a parameter that the user specifies at
the time of starting up the system. Assume that the database is brought up
specifying that the most recent 50 database buffers have to be cached. When
these buffers are full and a 51st block needs to be released, after storing it in
secondary memory. As you can observe, both the operations – inserting the new
buffer as well as freeing up the first buffer – happen at the “boundaries”.
To summarize boundary value testing
• Look for any kind of gradation or discontinuity in data values which
affect computation – the discontinuities are the boundary values, which
require thorough testing.
• Look for any internal limits such as limits on resources (as in the
example of buffers given above). The behavior of the product at these
limits should also be the subject of boundary value testing.
• Also include in the list of boundary values, documented limits on
hardware resources. For example, if it is documented that a product will
run with minimum 4MB of RAM, make sure you include test cases for
the minimum RAM (4MB in this case).
• The examples given above discuss boundary conditions for input data –
the same analysis needs to be done for output variables also. (A small
sketch follows this list.)
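To make the clustering of defects around limits concrete, consider a
hypothetical quantity-discount rule (the rates and limits below are our own,
chosen only to show where the boundary test values fall).

/* Hypothetical rule: 1-9 items full price, 10-19 items 5% off,
   20-29 items 10% off, 30 or more items 15% off. */
int discount_percent(int qty)
{
    if (qty >= 30) return 15;
    if (qty >= 20) return 10;
    if (qty >= 10) return 5;
    return 0;
}

/* Boundary value test inputs cluster around each limit:
 *   qty = 9, 10, 11    first discount boundary
 *   qty = 19, 20, 21   second boundary
 *   qty = 29, 30, 31   third boundary
 * plus qty = 0 and a negative quantity as invalid inputs. A single
 * >= mistyped as > would be caught only by these boundary values.
 */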
Boundary value analysis discussed in context of black box testing applies to
white box testing also. Internal data structures like arrays, stacks and queues
need to be checked for boundary or limit conditions; when there are linked lists
used as internal structures, the behavior of the list at the beginning and end
have to be tested thoroughly.
Boundary values and decision tables help identify the test cases that are
most likely to uncover defects. A generalization of both these concepts is the
concept of equivalence classes.

6.5 LET US SUM UP

Now that we are at the end of the lesson, you should be able to understand
what black box testing is.
Advantages of Black Box Testing
• more effective on larger units of code than glass box testing
• tester needs no knowledge of implementation, including specific
programming languages

• tester and programmer are independent of each other
• tests are done from a user's point of view
• will help to expose any ambiguities or inconsistencies in the
specifications
• test cases can be designed as soon as the specifications are complete
Disadvantages of Black Box Testing
• only a small number of possible inputs can actually be tested, to test
every possible input stream would take nearly forever
• without clear and concise specifications, test cases are hard to design
• there may be unnecessary repetition of test inputs if the tester is not
informed of test cases the programmer has already tried
• may leave many program paths untested
• cannot be directed toward specific segments of code which may be very
complex (and therefore more error prone)
• most testing related research has been directed toward glass box testing
Check Your Progress: Model Answers
1. Black box testing is done from the customer's view point. The test
engineer engaged in black box testing only knows the set of inputs and
expected outputs and is unaware of how those inputs are transformed
into outputs by the software.
Black-box test design treats the system as a "black-box", so it doesn't
explicitly use knowledge of the internal structure. Black-box test design
is usually described as focusing on testing functional requirements.
Glass-box test design allows one to peek inside the "box", and it focuses
specifically on using internal knowledge of the software to guide the
selection of test data.
2. The decision-table-based testing technique
A decision table has two parts: the conditions part and the actions part.
The decision table specifies under what conditions a test action must be
performed. Each condition expresses a relationship among variables that
must be resolvable as true or false. All the possible combinations of
conditions define a set of alternatives. For each alternative, a test action
should be considered. The number of alternatives increases exponentially
with the number of conditions, which may be expressed as
2^(number of conditions). When the decision table becomes too complex, a
hierarchy of new decision tables can be constructed.

Figure 6.1 Example of a decision table
Because some alternatives specified might be unrealistic, a test strategy
should 1) verify that all alternatives can actually be reached and 2) describe how
the AUT will behave under all alternative conditions. With a decision
table, it is easy to add and remove conditions, depending on the test
strategy. It is easy to increase test coverage by adding new test actions
from iteration to iteration, according to the test strategy.
As illustrated in Figure 6.1, decision tables are useful when specifying,
analyzing, and testing complex logic. They are efficient for describing
situations where varying conditions produce different test actions. They
are powerful for finding faults both in implementation and specifications.
3. All explicit requirements (from the Systems Requirements Specifications)
and implied requirements (inferred by the test team) are collected and
documented as "Test Requirements Specification" (TRS).

LESSON 7

OTHER ESSENTIAL TECHNIQUES OF TESTING


Contents
7.0 Aims and Objectives
7.1 Decision Tables
7.2 Equivalence Partitioning
7.3 State based or Graph Based Testing
7.4 Compatibility Testing
7.5 User Documentation Testing
7.6 Domain Testing
7.7 Let Us Sum Up

7.0 AIMS AND OBJECTIVES

In this lesson, we will discuss some of the important methodologies of
testing such as decision tables, equivalence partitioning, state based or graph
based testing, compatibility testing, user documentation testing and domain
testing.
It would be an essential requirement of study to understand the
forthcoming lessons. At the end of the lesson, you might be able to understand
all the above titles.

7.1 DECISION TABLES

A decision table lists the various decision variables, the conditions (or
values) assumed by each of the decision variables, and the actions to take in
each combination of conditions. The variables that contribute to the decision
are listed as the columns of the table. The last column of the table is the action
to be taken for the combination of values of the decision variables. In cases
when the number of decision variables is many (say, more than five or six) and
the number of distinct combinations of variables is few (say, four or five), the
decision variables can be listed as rows.
The reader would have noticed that there are a number of entries
marked "-" in the decision table. The values of the appropriate decision variables
in these cases do not affect the outcome of the decision. For example, the status
of the spouse is relevant only when the filing status is "Married filing separate
return." Similarly, the age of spouse and whether spouse is blind or not comes
into play only when the status is "Married, filing joint return." Such entries are
called don't cares (sometimes represented by the Greek character phi, Φ). These
don't cares significantly reduce the number of tests to be performed. For
example, in case there were no don’t cares, there would be eight cases for the
status of "Single": four with status of spouse as claimed standard deduction
and four with spouse status being not claiming standard deduction. Other than

this one difference, there is no material change in the expected result
of the standard deduction amount. We leave it as an exercise for the reader to
enumerate the number of rows in the decision table, should we not allow don't
cares and have to explicitly specify each case. There are formal tools like
Karnaugh Maps which can be used to arrive at a minimal Boolean expression
that represents the various Boolean conditions in a decision table. The
references given at the end of this chapter discuss these tools and techniques.
Thus, decision tables act as invaluable tools for designing black box tests
to examine the behavior of the product under various logical conditions of
input variables. The steps in forming a decision table are as follows.
1. Identify the decision variables.
2. Identify the possible values of each of the decision variables.
3. Enumerate the combinations of the allowed values of each of the
variables.
4. Identify the cases when values assumed by a variable (or by sets of
variables) are immaterial for a given combination of other input variables.
Represent such variables by the don't care symbol.
5. For each combination of values of decision variables (appropriately
minimized with the don't care scenarios), list out the action or expected
result.
6. Form a table, listing in each but the last column a decision variable. In
the last column, list the action item for the combination of variables in
that row (including don't cares, as appropriate).
Once a decision table is formed, each row of the table acts as the
specification for one test case. Identification of the decision variables makes
these test cases extensive, if not exhaustive. Pruning the table by using don't
cares minimizes the number of test cases. Thus, decision tables are usually
effective in arriving at test cases in scenarios which depend on the values of the
decision variables.
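Decision table logic can even be carried into the test harness as data. The
following is a minimal sketch (the conditions, actions, and names are all
hypothetical) of a decision table held in a C array, with don't cares pruning
the rules.

#include <stddef.h>

#define DC (-1)   /* a "don't care" entry */

/* One row per rule; each row is effectively one test case. */
struct rule {
    int cond_a;          /* 1 = true, 0 = false, DC = immaterial */
    int cond_b;
    const char *action;  /* expected action for this combination */
};

static const struct rule table[] = {
    { 1, DC, "action-1" },   /* cond_a true: cond_b is immaterial */
    { 0, 1,  "action-2" },
    { 0, 0,  "action-3" },
};

static int matches(int want, int have)
{
    return want == DC || want == have;
}

const char *decide(int a, int b)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (matches(table[i].cond_a, a) && matches(table[i].cond_b, b))
            return table[i].action;
    return "no-rule";    /* a gap in the table is itself a defect */
}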
WORKING WITH DECISION AND DATA-DRIVEN TABLES
Since the logic is defined in the decision table, the tester does not need to
hard code any testing logic. The decision script just performs the verifications
during execution, compares the result of the verifications with the alternatives
provided by the decision table, and returns the next test script to run if a
solution is found.
A test suite script contains several decision scripts and test scripts. All
the elements of a test suite are defined in a driver table that specifies an un-
ordered set of test segments. Each test segment consists of a collection of test
scripts that are executed sequentially between two decision scripts. For each
test segment, the driver table specifies the transition between a source test
script and a target test script.
As the decision is computed dynamically by the decision script during
execution, a mechanism of notification must be implemented for the test suite

script to be notified by the decision script about the next test script to run.
When the decision script notifies the test suite script about the next test script
to run, the test suite script queries the driver table to find the next test segment
to run. The process is illustrated in Figure 7.1.

Figure 7.1 The elements of a test suite


Check your progress 1
Define data driven table.
Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

7.2 EQUIVALENCE PARTITIONING

Equivalence partitioning is a software testing technique that involves
identifying a small set of representative input values that produce as many
different output conditions as possible. This reduces the number of
permutations and combinations of input and output values used for testing,
thereby increasing the coverage and reducing the effort involved in testing.

The set of input values that generate one single expected output is called
a partition. When the behavior of the software is the same for a set of values

then the set is termed as an equivalence class or a partition. In this case, a
representative sample from each partition (also called a member of the
equivalence class) is picked up for testing. One sample from the partition is
enough for testing as the result of picking up some more values from the set
will be the same and will not yield any additional defects. Since all the values
produce equal and same output they are termed as equivalence partition.
Testing by this technique involves (a) identifying all partitions for the
complete set of input and output values for a product and (b) picking up one
member value from each partition for testing to maximize complete coverage.
From the results obtained for a member of an equivalence class or
partition, this technique extrapolates the expected results for all the values in
that partition. The advantage of using this technique is that we gain good
coverage with a small number of test cases. For example, if there is a defect in
one value in a partition, then it can be extrapolated to all the values of that
particular partition. By using this technique, redundancy of tests is minimized
by not repeating the same tests for multiple values in the same partition.
Let us consider the example below of an insurance company that has
the following premium rates based on the age group.
Life Insurance Premium Rates
A life insurance company has a base premium of $0.50 for all ages. Based
on the age group, an additional monthly premium has to be paid, as
listed in the table below. For example, a person aged 34 has to pay a premium =
base premium + additional premium = $0.50 + $1.65 = $2.15.
Age Group    Additional Premium
Under 35     $1.65
35-59        $2.87
60+          $6.00
Based on the equivalence partitioning technique, the equivalence
partitions that are based on age are given below:
• Below 35 years of age (valid input)
• Between 35 and 59 years of age (valid input)
• Above 60 years of age (valid input)
• Negative age (invalid input)
• Age as 0 (invalid input)
• Age as any three-digit number (valid input)
We need to pick up representative values from each of the above
partitions. You may have observed that even though we have only a small table
of valid values, the equivalence classes should also include samples of invalid
inputs. This is required so that these invalid values do not cause unforeseen

errors. You can see that the above cases include both positive and negative test
input values.
The test cases for the example are based on the equivalence partitions. The
equivalence partitions table has the following columns (a sketch tying them
together follows the list):
• Partition definition
• Type of input (valid / invalid)
• Representative test data for that partition
• Expected results
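Tying these columns together, here is a sketch of the premium computation
with one representative test per partition (the function name and the error
convention are assumptions made for illustration).

/* Premium = base ($0.50) + additional premium by age group;
   returns -1.0 for invalid input. */
double premium(int age)
{
    if (age <= 0)  return -1.0;   /* invalid: negative or zero */
    if (age < 35)  return 0.50 + 1.65;
    if (age <= 59) return 0.50 + 2.87;
    return 0.50 + 6.00;
}

/* One representative per partition:
 *   age = 30  (valid, under 35)      -> 2.15
 *   age = 40  (valid, 35-59)         -> 3.37
 *   age = 65  (valid, 60+)           -> 6.50
 *   age = 100 (valid, three digits)  -> 6.50
 *   age = -5  (invalid, negative)    -> error
 *   age = 0   (invalid, zero)        -> error
 * Any other value from the same partition behaves identically, so
 * one sample per partition suffices.
 */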
As another example, consider partitioning a set of input numbers. One way to
divide the set is by
1. Prime numbers
2. Composite numbers
3. Numbers with decimal point
These three classes divide the set of numbers into three valid classes. In
addition, to account for any input a user may give, we will have to add an
invalid class - strings with alphanumeric characters. As in the previous case, we
can construct an equivalence partitions table for this example.
Thus, as in the first example on life insurance premiums, here we have
reduced a potentially infinite input data space to a finite one, without losing the
effectiveness of testing. This is the power of using equivalence classes: choosing
a minimal set of input values that are truly representative of the entire
spectrum and uncovering a higher number of defects.
The steps to prepare an equivalence partitions table are as follows (a
code sketch of these steps follows the list).
• Choose criteria for doing the equivalence partitioning (range, list of
values, and so on)
• Identify the valid equivalence classes based on the above criteria
(number of ranges, allowed values, and so on)
• Select sample data from each partition
• Write the expected result based on the requirements given
• Identify special values, if any, and include them in the table
• Check that expected results are available for all the cases prepared
• If the expected result is not clear for any particular test case, mark it
appropriately and escalate for corrective action. If you cannot answer a
question, or find an inappropriate answer, consider whether you want to
record this issue in your log and clarify with the team that
arbitrates/dictates the requirements.
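A minimal sketch of these steps in code, using the life insurance example
above (the premium function is a hypothetical implementation under test; the
expected values follow the premium table):

    # Equivalence partitions for the life insurance premium example.
    def premium(age):
        if age <= 0 or age > 999:             # assumption: up to three
            raise ValueError("invalid age")   # digits, per the partition list
        base = 0.50
        if age < 35:
            return base + 1.65
        if age <= 59:
            return base + 2.87
        return base + 6.00

    # One representative member per partition, with the expected result.
    partitions = [
        ("below 35 (valid)",       30, 2.15),
        ("35-59 (valid)",          40, 3.37),
        ("60 and above (valid)",   65, 6.50),
        ("negative age (invalid)", -1, ValueError),
        ("age zero (invalid)",      0, ValueError),
    ]

    for name, age, expected in partitions:
        try:
            outcome = "PASS" if abs(premium(age) - expected) < 1e-9 else "FAIL"
        except ValueError:
            outcome = "PASS" if expected is ValueError else "FAIL"
        print(outcome, name)

One member per partition is enough; testing more values from the same
partition would only repeat the same checks.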

Check your progress 2

Describe equivalence partitioning.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

7.3 STATE BASED OR GRAPH BASED TESTING

State or graph based testing is very useful in situations where
1. The product under test is a language processor (for example, a compiler),
wherein the syntax of the language automatically lends itself to a state
machine or a context free grammar represented by a railroad diagram.
2. Workflow modeling where, depending on the current state and
appropriate combinations of input variables, specific workflows are
carried out, resulting in new output and a new state.
3. Dataflow modeling, where the system is modeled as a set of dataflows,
leading from one state to another.
Consider an application that is required to validate a number according
to the following simple rules.
1. A number can start with an optional sign.
2. The optional sign can be followed by any number of digits.
3. The digits can be optionally followed by a decimal point, represented by a
period.
4. If there is a decimal point, then there should be two digits after the
decimal.
5. Any number, whether or not it has a decimal point, should be
terminated by a blank. (A sketch of these rules as code follows.)
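A minimal sketch of these rules as a state machine (the state names and the
code itself are illustrative assumptions; only the five rules above are encoded):

    # States: START (optional sign), DIGITS, FRAC1/FRAC2 (the two digits
    # required after a decimal point), END (only a blank may follow).
    def is_valid_number(s: str) -> bool:
        state = "START"
        for ch in s:
            if state == "START":
                if ch in "+-" or ch.isdigit():
                    state = "DIGITS"
                else:
                    return False
            elif state == "DIGITS":
                if ch.isdigit():
                    pass
                elif ch == ".":
                    state = "FRAC1"
                elif ch == " ":
                    return True              # rule 5: terminated by a blank
                else:
                    return False
            elif state == "FRAC1":
                if not ch.isdigit():
                    return False
                state = "FRAC2"
            elif state == "FRAC2":
                if not ch.isdigit():
                    return False
                state = "END"
            elif state == "END":
                return ch == " "
        return False                         # no terminating blank seen

    print(is_valid_number("+12.34 "))        # True
    print(is_valid_number("12. "))           # False: two digits must follow "."

Each state corresponds to a node in the state transition diagram and each
character class to an edge; test cases are then chosen to cover every
transition, including the invalid ones.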
A graph like the one above can also be converted to a state transition
diagram using the same notation and method illustrated in the previous
example. Graph based testing such as in this example is applicable when
1. The application can be characterized by a set of states.
2. The data values (screens, mouse clicks, and so on) that cause the
transition from one state to another are well understood.
3. The methods of processing within each state to handle the input
received are also well understood.

7.4 COMPATIBILITY TESTING

In the above sections, we looked at several techniques to test product
features and requirements. It was also mentioned that the test case results are
compared with expected results to conclude whether the test is a success or
not. The test case results not only depend on the product for proper
functioning; they depend equally on the infrastructure for delivering
functionality. When infrastructure parameters are changed, the product is
expected to still behave correctly and produce the desired or expected results.
The infrastructure parameters could be hardware, software, or other
components. These parameters are different for different customers. Black box
testing that does not consider the effects of these parameters on the test case
results will necessarily be incomplete and ineffective, as it does not truly
reflect the behavior at a customer site. Hence, there is a need for compatibility
testing. This testing ensures the working of the product with different
infrastructure components. The techniques used for compatibility testing are
explained in this section.

The parameters that generally affect the compatibility of the product are
• Processor (CPU) (Pentium III, Pentium IV, Xeon, SPARC, and so on) and
the number of processors in the machine
• Architecture and characteristics of the machine (32 bit, 64 bit, and so
on)
• Resource availability on the machine (RAM, disk space, network card)
• Equipment that the product is expected to work with (printers, modems,
routers, and so on)
• Operating system (Windows, Linux, and so on and their variants) and
operating system services (DNS, NIS, FTP, and so on)
• Middle-tier infrastructure components such as web server, application
server, network server
• Backend components such as database servers (Oracle, Sybase, and so on)
• Services that require special hardware-cum-software solutions (cluster
machines, load balancing, RAID array, and so on)
• Any software used to generate product binaries (compiler, linker, and so
on and their appropriate versions)
• Various technological components used to generate components (SDK,
JDK, and so on, and their different versions)
The above are just a few of the parameters. There are many more
parameters that can affect the behavior of the product features. In the above
example, we have described ten parameters. If each of the parameters can take
four values, then there are forty different values to be tested. But that is not all.
Not only can the individual values of the parameters affect the features, but
also the permutations and combinations of the parameters. Taking these
combinations into consideration, the number of times a particular feature has
to be tested for those combinations may go into thousands or even millions. In
the above assumption of ten parameters, each taking on four values, the total
number of combinations to be tested is 4^10 (more than a million), which is a
large number and impossible to test exhaustively.
Some of the common techniques that are used for performing
compatibility testing, using a compatibility table are
1. Horizontal combination All values of parameters that can coexist with the
product for executing the set of test cases are grouped together as a row in
the compatibility matrix. The values of parameters that can coexist
generally belong to different layers/types of infrastructure pieces such as
operating system, web server, and so on. Machines or environments are
set up for each row and the set of product features are tested using each
of these environments.
2. Intelligent sampling In the horizontal combination method, each feature
of the product has to be tested with each row in the compatibility matrix.
This involves huge effort and time. To solve this problem, combinations
of infrastructure parameters are combined with the set of features
intelligently and tested. When there are problems due to any of the
combinations then the test cases are executed, exploring the various
permutations and combinations. The selection of intelligent samples is
based on information collected on the set of dependencies of the product
with the parameters. If the product results are less dependent on a set of
parameters, then they are removed from the list of intelligent samples.
All other parameters are combined and tested. This method significantly
reduces the number of permutations and combinations of test cases, as the
sketch below illustrates.
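A small sketch of both techniques (the parameter names and values below are
illustrative assumptions, not an actual compatibility matrix from the text):

    # Enumerating a compatibility matrix, then sampling it intelligently.
    import itertools

    parameters = {
        "os":         ["Windows", "Linux"],
        "cpu":        ["Pentium IV", "Xeon", "SPARC"],
        "database":   ["Oracle", "Sybase"],
        "web_server": ["Apache", "IIS"],
    }

    # Horizontal combination: every row of the full matrix.
    all_rows = list(itertools.product(*parameters.values()))
    print(len(all_rows))    # 2 * 3 * 2 * 2 = 24 environments

    # Intelligent sampling: drop a parameter the product is known to be
    # insensitive to (here, assumed to be cpu), then recombine the rest.
    reduced = {k: v for k, v in parameters.items() if k != "cpu"}
    sampled = list(itertools.product(*reduced.values()))
    print(len(sampled))     # 2 * 2 * 2 = 8 environments

Each remaining row becomes one environment to set up, against which the
set of product features is then tested.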
Compatibility testing not only includes parameters that are outside the
product, but also some parameters that are a part of the product. For
example, two versions of a given database may depend on a set of APIs that
are part of the same database. These parameters are also an added part of the
compatibility matrix and tested. The compatibility testing of a product
involving parts of itself can be further classified into two types.
1. Backward compatibility testing There are many versions of the same
product that are available with the customers. It is important for the
customers that the objects, object properties, schema, rules, reports, and
so on, that are created with an older version of the product continue to
work with the current version of the same product. The testing that
ensures the current version of the product continues to work with the
older versions of the same product is called backward compatibility
testing. The product parameters required for the backward compatibility
testing are added to the compatibility matrix and are tested.
2. Forward compatibility testing There are some provisions for the product
to work with later versions of the product and other infrastructure
components, keeping future requirements in mind. For example, IP
network protocol version 6 uses a 128-bit addressing scheme (IP version
4 uses only 32 bits). The data structures can now be defined to
accommodate 128-bit addresses, and be tested with a prototype
implementation of the IPv6 protocol stack that is yet to become a
completely implemented product. The features that are part of IPv6 may
not yet be available to end users, but this kind of implementation and
testing for the future helps in avoiding drastic changes at a later point of
time. Such requirements are tested as part of forward compatibility
testing. Testing the product with a beta version of the operating system,
an early access version of the developers' kit, and so on are examples of
forward compatibility testing. This type of testing ensures that the risk
involved in the product meeting future requirements is minimized. (A
small illustration of the IPv6 example follows.)
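As a minimal illustration of the IPv6 example (the helper function is an
assumption for illustration; Python's standard ipaddress module does the
parsing), a data structure sized for 128-bit addresses naturally accommodates
32-bit IPv4 addresses as well:

    # A field sized for IPv6's 128-bit addresses also holds IPv4's 32 bits.
    import ipaddress

    def packed_address(text: str) -> bytes:
        addr = ipaddress.ip_address(text)    # accepts both IPv4 and IPv6
        return addr.packed                   # 4 bytes for v4, 16 for v6

    print(len(packed_address("192.168.0.1")))    # 4
    print(len(packed_address("2001:db8::1")))    # 16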
For compatibility testing and to use the techniques mentioned above, an
in-depth internal knowledge of the product may not be required. Compatibility
testing begins after validating the product in the basic environment. It is a type
of testing that involves a high degree of effort, as there are a large number of
parameter combinations. Following some of the techniques mentioned above
may help in performing compatibility testing more effectively.

Check your progress 3

What are the various methods to perform compatibility testing?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------

7.5 USER DOCUMENTATION TESTING

User documentation covers all the manuals, user guides, installation
guides, setup guides, read me files, software release notes, and online help
that are provided along with the software to help the end user understand the
software system.
User documentation testing should have two objectives.
1. To check if what is stated in the document is available in the product.
2. To check if what is there in the product is explained correctly in the
document.
Testing these documents attains importance due to the fact that the users
will have to refer to these manuals, installation, and setup guides when they
start using the software at their locations. Most often the users are not aware of
the software and need hand-holding until they feel comfortable. Since these
documents are the first interactions the users have with the product, they tend
to create lasting impressions. A badly written installation document can put off
a user and bias him or her against the product, even if the product offers rich
functionality.
Some of the benefits that ensue from user documentation testing are:
1. User documentation testing aids in highlighting problems overlooked
during reviews.
2. High quality user documentation ensures consistency of documentation
and product, thus minimizing possible defects reported by customers. It
also reduces the time taken for each support call; sometimes the best way
to handle a call is to point the customer to the relevant section of the
manual. Thus the overall support cost is minimized.
3. Results in less difficult support calls. When a customer faithfully follows
the instructions given in a document but is unable to achieve the desired
(or promised) results, it is frustrating and often frustration shows up on
the support staff. Ensuring that a product is tested to work as per the
document and that it works correctly contributes to better customer
satisfaction and better morale of support staff.
4. New programmers and testers who join a project group can use the
documentation to learn the external functionality of the product.
5. Customers need less training and can proceed more quickly to advanced
training and product usage if the documentation is of high quality and is
consistent with the product. Thus high-quality user documentation can
result in a reduction of overall training costs for user organizations.
Defects found in user documentation need to be tracked to closure like
any regular software defect. In order to enable an author to close a
documentation defect, information about the defect/comment description,
paragraph/page number reference, document version number reference, name
of reviewer, name of author, reviewer's contact number, priority, and severity
of the comment need to be passed to the author, as the sketch below
illustrates.
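A minimal sketch of such a documentation defect record (the class and field
names are illustrative assumptions derived from the list above):

    # The record a reviewer hands to an author to close a doc defect.
    from dataclasses import dataclass

    @dataclass
    class DocDefect:
        description: str        # defect/comment description
        page_reference: str     # paragraph/page number reference
        document_version: str   # document version number reference
        reviewer: str
        author: str
        reviewer_contact: str
        priority: str           # e.g. "high" -- illustrative values
        severity: str           # e.g. "minor"

    defect = DocDefect("Step 3 of the install guide omits the license screen",
                       "p. 12, para 2", "v1.1", "R. Reviewer", "A. Author",
                       "x1234", "high", "minor")
    print(defect)

Tracking such records to closure works exactly like tracking regular software
defects.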

7.6 DOMAIN TESTING

White box testing required looking at the program code. Black box testing
performed testing without looking at the program code but looking at the
specifications. Domain testing can be considered as the next level of testing in
which we do not look even at the specifications of a software product but are
testing the product, purely based on domain knowledge and expertise in the
domain of application. This testing approach requires critical understanding of
the day-to-day business activities for which the software is written. This type of
testing requires business domain knowledge rather than the knowledge of what
the software specification contains or how the software is written. Thus domain
testing can be considered as an extension of black box testing. As we move from
white box testing through black box testing to domain testing we know less and
less about the details of the software product and focus more on its external
behavior.
The test engineers performing this type of testing are selected because
they have in-depth knowledge of the business domain. Since depth in the
business domain is a prerequisite for this type of testing, it is sometimes easier
to hire testers from the domain area (such as banking, insurance, and so on)
and train them in software, rather than to take software professionals and
train them in the business domain. This reduces the effort and time required
for training the testers in domain testing and also increases the effectiveness
of domain testing.

7.7 LET US SUM UP

Check Your Progress: Model Answers


1. A test suite script contains several decision scripts and test scripts. All
the elements of a test suite are defined in a driver table that specifies an
un-ordered set of test segments. Each test segment consists of a
collection of test scripts that are executed sequentially between two
decision scripts. For each test segment, the driver table specifies the
transition between a source test script and a target test script.
2. Equivalence partitioning is a software testing technique that involves
identifying a small set of representative input values that produce as
many different output conditions as possible.
3. Horizontal combination, Intelligent Sampling, Forward compatibility
testing and backward compatibility testing.

LESSON 8

INTEGRATION TESTING
Contents
8.0 Aims and Objectives
8.1 What is Integration Testing?
8.2 Integration Testing as a Type of Testing
8.2.1 Top-Down Integration
8.2.2 Bottom-up Integration
8.2.3 Bi-Directional Integration
8.2.4 System Integration
8.2.5 Choosing Integration Method
8.3 Let Us Sum Up

8.0 AIMS AND OBJECTIVES

In this lesson, we are going to discuss integration testing, which is an
essential category of testing for component-based programming. We discuss
various methods of this testing, such as top-down, bottom-up, bi-directional,
and system integration.
At the end of this lesson, you should be able to understand components
and their interactions, the directions of interaction, and the relevant test
mechanisms.

8.1 WHAT IS INTEGRATION TESTING?

A system is made up of multiple components or modules that can
comprise hardware and software. Integration is defined as the set of interactions
among components. Testing the interaction between the modules and
interaction with other systems externally is called integration testing.
Integration testing starts when two of the product components are available and
ends when all component interfaces have been tested. The final round of
integration involving all components is called Final Integration Testing (FIT), or
system integration.
Integration testing is both a type of testing and a phase of testing. As
integration is defined to be a set of interactions, all defined interactions among
the components need to be tested. The architecture and design can give the
details of interactions within systems; however, testing the interactions between
one system and another requires a detailed understanding of how they
work together. This knowledge of integration (that is, how the system or
modules work together) depends on many modules and systems. These diverse
modules could have different ways of working when integrated with other
systems. This introduces complexity in procedures and in what needs to be
done. Recognizing this complexity, a phase in testing is dedicated to test these
interactions, resulting in the evolution of a process. This ensuing phase is
called the integration testing phase.
Since integration testing is aimed at testing the interactions among the
modules, this testing, just like white box, black box, and other types of testing,
comes with a set of techniques and methods, which we will see in the following
sections. Hence integration testing is also viewed as a type of testing (and thus
fits into the canvas of this part of the book).

Check your progress 1

Define Integration.
Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

8.2 INTEGRATION TESTING AS A TYPE OF TESTING

Integration testing means testing of interfaces. When we talk about
interfaces, there are two types of interfaces that have to be kept in mind for
proper integration testing. They are internal interfaces and exported or external
interfaces.
Internal interfaces are those that provide communication across two
modules within a project or product, internal to the product, and not exposed to
the customer or external developers. Exported interfaces are those that are
visible outside the product to third party developers and solution providers.
One of the methods of achieving interfaces is by providing Application
Programming Interfaces (APIs). APIs enable one module to call another module.
The calling module can be internal or external. For example, JDBC is an
API used by a Java program to make certain SQL calls. Even though APIs and
interfaces appear to be similar, it is important to realize that integration is the
purpose to be achieved, while an API is a means of achieving that purpose. An
API is just one of the means of providing an interface
between two modules. One can think of other means of integration among the
various modules: Some of these could be simple function calls, public
functions, and some could be facets of programming language constructs like
global variables and some could be operating system constructs like
semaphores and shared memory. In this chapter, we will not discuss the details
of the vehicles used for integration (as it is primarily a development issue), but
rather look at how we can test the interfaces (which is the focus for testing).
Not all the interfaces may be available at the same time for testing
purposes, as different interfaces are usually developed by different development
teams, each having their own schedules. In order to test the interfaces, when
the full functionality of the component being introduced is not available, stubs
are provided. A stub simulates the interface by providing the appropriate values
in the appropriate format as would be provided by the actual component being
integrated.
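As a minimal sketch (the billing component and its interface are hypothetical
names, not from the text), a stub returns canned values in the agreed format
so that the calling module's side of the interface can be tested before the real
component exists:

    # Stub standing in for a billing component that is not yet available.
    class BillingStub:
        def compute_invoice(self, order_id: int) -> dict:
            # Canned response in the agreed interface format.
            return {"order_id": order_id, "amount": 100.0, "currency": "USD"}

    # The module under test calls the interface; it cannot tell stub from real.
    def place_order(order_id: int, billing) -> dict:
        invoice = billing.compute_invoice(order_id)
        assert invoice["order_id"] == order_id
        return invoice

    print(place_order(42, BillingStub()))

When the real billing component becomes available, the stub is discarded and
the same test cases are re-run against the actual interface.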
Integration testing is done with test cases, which go through the
internal and exported interfaces, and test the functionality of the software.
Internal interfaces are for other developers inside an organization and external
interfaces are for third party developers or other users outside the group.
Testing for internal interfaces requires a complete understanding of
architecture and high-level design (HLD) and how they impact the software
functionality. In cases where exported interfaces are provided from the software,
one needs to understand the purpose of those interfaces, why they are
provided, and how they are actually used by developers and solution
integrators. Hence knowledge of design, architecture, and usage is a must for
integration testing.
Initially, the exported (or external) interfaces were provided through APIs
and Software Development Kits (SDKs). The use of SDKs required an
understanding of the programming language on which the API/SDK is provided.
Later, the interfaces became available through scripting languages, without the
need for SDKs. (Some of the popular scripting languages include Perl, Tcl/Tk).
These scripting languages eliminated or minimized the effort in learning the
languages in which the API was written. This also made it possible for the
interfaces to be called from programming language environments different from
the one in which the interface was originally written. This significantly
simplified the usage of exported interfaces. For testing interfaces, we now have
dynamically created scripts, which can be changed at run time, by a few clicks
of the mouse.
All these have made the use of interfaces a lot more widespread. The
number of purposes for which the interfaces are provided have been on the
increase. These interfaces are becoming increasingly generic in nature not
getting tied to a specific application or language. This has resulted in increasing
the permutations and combinations of scenarios of usage of the interfaces.
Thus, the complexity of integration testing-that is, testing of the various
scenarios of usage of interfaces - has also increased significantly.
While discussing interfaces, we need to keep in mind that not all
interactions between the modules are known and explained through interfaces.
Some of the interfaces are documented and some are not. This gives rise to
another classification of interfaces, that is, implicit and explicit interfaces.
Explicit interfaces are documented interfaces and implicit interfaces are those
which are known internally to the software engineers but are not documented.
The testing (white box/black box) should look for both implicit and explicit
interfaces and test all those interactions.
A question that often arises in the mind of a test engineer is whether
integration testing is a black box or a white box testing approach. In most
cases, the most appropriate answer is to say that integration testing is a black
box testing approach. However, in situations where architecture or design
documents do not clearly explain all interfaces among components, the
approach can include going through the code, generating some additional
test cases, and mixing them with the test cases generated using black box
testing approaches. This approach could be termed the "gray box testing"
approach.
There are several methodologies available to decide the order for
integration testing. These are as follows.
1. Top-down integration
2. Bottom-up integration
3. Bi-directional integration
4. System integration

8.2.1 TOP-DOWN INTEGRATION

Top-down integration testing involves testing the topmost component
interface with other components in the same order as you navigate from top to
bottom, till you cover all the components, as depicted in Figure 8.1.

Figure 8.1 Tree-structure for component integration

In an incremental product development, where one or two components
get added to the product in each increment, the integration testing
methodology pertains only to those new interfaces that are added and those
related interfaces that are impacted by the changes or increment in the
product. Therefore, not all the steps in the above table need to be repeated for
integration testing in that case.
To optimize the number of steps in integration testing, steps 6 and 7 can
be combined and executed as a single step. Similarly, steps 8 and 9 also can be
combined and tested in a single step. Combining steps does not mean a
reduction in the number of interfaces tested. It just means an optimization in
the elapsed time, as we do not have to wait for steps 6 and 8 to get over to start
with testing steps 7 and 9 respectively.
If a set of components and their related interfaces can deliver
functionality without expecting the presence of other components or with
minimal interface requirement in the software/product, then that set of
components and their related interfaces is called a "sub-system." Each sub-
system in a product can work independently with or without other sub-
systems. This makes integration testing easier and enables focus on the
required interfaces rather than worrying about each and every combination
of components.
The top-down integration explanation above assumes that each
component provides all the interface requirements of other components even
while the other components are getting ready, and does not require
modification at a later stage (that is, after the other components have been
developed). This approach reflects the Waterfall or V model of software
development.
If a component at a higher level requires a modification every time a
module gets added at the bottom, then for each component addition,
integration testing needs to be repeated starting from step 1. This may be a
requirement for an iterative model of software development. Hence, whatever
the software development model may be, top-down integration can still be
applied with appropriate repetition in integration testing.
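As an illustration (the tree shape below is an assumption standing in for
Figure 8.1), the top-down order of interface testing can be derived by walking
the component tree from the root downwards; reversing the resulting list gives
a bottom-up order over the same interface pairs:

    # Deriving a top-down integration order from a component tree.
    from collections import deque

    tree = {1: [2, 3, 4], 2: [5], 3: [6, 7], 4: [8]}    # assumed shape

    def top_down_pairs(tree, root=1):
        order, queue = [], deque([root])
        while queue:                          # breadth-first: top layers first
            node = queue.popleft()
            for child in tree.get(node, []):
                order.append((node, child))   # one interface to test per edge
                queue.append(child)
        return order

    print(top_down_pairs(tree))
    # [(1, 2), (1, 3), (1, 4), (2, 5), (3, 6), (3, 7), (4, 8)]
    print(list(reversed(top_down_pairs(tree))))    # a bottom-up order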

Check your progress 2

What do you mean by subsystem?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

8.2.2 BOTTOM-UP INTEGRATION

Bottom-up integration is just the opposite of top-down integration, where
the components for a new product development become available in reverse
order, starting from the bottom.
order, starting from the bottom. In Figure 8.2, the components are assumed to
be available in the order of their number.
Double arrows in Figure 8.2 denote both the logical flow of components
and integration approach. Logic flow is from top to bottom, and integration path
is from bottom to top.
The navigation in bottom-up integration starts from component 1 in Figure
8.2, covering all sub-systems, till component 8 is reached; this defines the
order in which the interfaces have to be tested.

The number of steps in the bottom-up approach can be optimized into
four steps, by combining steps 2 and 3 and by combining steps 5-8.

Figure 8.2 Logical flow of components


The arrows from top to bottom (that is, downward-pointing arrows)
indicate interaction or control flow. The arrows from bottom to top (that is,
upward-pointing arrows) indicate the integration approach or integration path.
What it means is that the logic flow of the product can be different from the
integration path. While the logic flow of the interaction in this case is going from
up to down, the integration path is going from down to up. This approach
allows a combination of integration approaches to be followed for the same
product.
The iterative and agile model is an example to explain the different path
for logic flow and integration approach. In this model, a product development
organization needs to demonstrate the product functionalities at regular
intervals to customers for getting their feedback. One way to increase the
frequency of delivery is to make each of the components independent. Some
portion of the code is duplicated across components to demonstrate each of the
components separately to the customer. The code duplication is not only for
demonstrating the product but also for independent testing. In the above
example, components 5-8 are developed as independent component with
duplicated common code.
After testing and approval from customer the common code gets pushed
into components at a higher level, say' components 5-7. After this iteration,
there is still a duplication of this code across components 5-7 which will
ultimately get removed by moving the code to component 8, in iteration. This is
one of the evolving models, to make the product complete after several
iterations of customer feedback and integration. The interface definitions are
done only at the time of moving the code to components at a higher level. The
common cc1J moves across components and interface definitions are made
only at the time of movement. Hence in this model, each iteration of integration
testing starts.
It may be easy to say that the top-down integration approach is best
suited for the Waterfall and V models, and the bottom-up approach for the
iterative and agile methodologies. This is only true for the examples taken
above, and from a process perspective. In a practical scenario, the approach
selected for integration depends more on the design and architecture of a
product and on associated priorities. Additionally, the selection of the right
integration approach needs to consider several other perspectives such as
availability of components, technology used, process, testing skills, and
resource availability.

8.2.3 BI-DIRECTIONAL INTEGRATION

Bi-directional integration is a combination of the top-down and bottom-up
integration approaches used together to derive integration steps.
The individual components 1, 2, 3, 4, and 5 are tested separately and bi-
directional integration is performed initially with the use of stubs and drivers.
Drivers are used to provide upstream connectivity while stubs provide
downstream connectivity. A driver is a function which redirects the requests to
some other component and stubs simulate the behavior of a missing
component. After the functionality of these integrated components are tested,
the drivers and stubs are discarded. Once components 6, 7, and 8 become
available, the integration methodology then focuses only on those components,
as these are the components which need focus and are new. This approach is
also called "sandwich integration."
An area where this approach comes in handy is when migrating from a
two-tier to a three-tier environment. In the product development phase when a
transition happens from two-tier architecture to three-tier architecture, the
middle tier (components 6-8) gets created as a set of new components from the
code taken from bottom-level applications and top-level services.

8.2.4 SYSTEM INTEGRATION

System integration means that all the components of the system are
integrated and tested as a single unit. Integration testing, which is testing of
interfaces, can be divided into two types:
• Components or sub-system integration
• Final integration testing or system integration
When looking at steps for each of the above integration methodologies it is
obvious that complete system integration is also covered as the last step. Thus,
system integration is actually a part of every methodology described above.
The salient point this testing methodology raises, is that of optimization.
Instead of integrating component by component and testing, this approach
waits till all components arrive and one round of integration testing is done.
This approach is also called big-bang integration. It reduces testing effort and
removes duplication in testing.
System integration using the big bang approach is well suited in product
development scenario where the majority of components are already available
and stable and very few components get added or modified. In this case, instead
of testing component interfaces one by one, it makes sense to integrate all the
components at one go and test once, saving effort and time for the multi-step
component integrations.

While this approach saves time and effort, it is also not without
disadvantages. Some of the important disadvantages that can have a bearing on
the release dates and quality of a product are as follows.
1. When a failure or defect is encountered during system integration, it is
very difficult to locate the problem and find out in which interface the
defect exists. The debug cycle may involve focusing on specific interfaces and
testing them again.
2. The ownership for correcting the root cause of the defect may be a
difficult issue to pinpoint.
3. When integration testing happens in the end, the pressure from the
approaching release date is very high. This pressure on the engineers
may cause them to compromise on the quality of the product.
4. A certain component may take an excessive amount of time to be ready.
This precludes testing other interfaces and wastes time till the end.
As a result of all these factors, the choice of the method of integration
testing becomes extremely crucial. A judicious combination of the above
methods would be needed to achieve effectiveness in the time and quality
of integration testing.

8.2.5 CHOOSING INTEGRATION METHOD

As mentioned in the above discussions, the integration method depends
not only on the process and development model, but also on various other
aspects.

Check your progress 3

What are the different types of system integration?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

8.3 LET US SUM UP

Check Your Progress: Model Answers


1. Integration is defined as the set of interactions among components.
Testing the interaction between the modules and interaction with other
systems externally is called integration testing. Integration testing starts
when two of the product components are available and ends when all
component interfaces have been tested.

2. If a set of components and their related interfaces can deliver
functionality without expecting the presence of other components, or with
minimal interface requirements in the software/product, then that set of
components and their related interfaces is called a "sub-system."
3. System integration means that all the components of the system are
integrated and tested as a single unit. Integration testing, which is
testing of interfaces, can be divided into two types:
• Components or sub-system integration
• Final integration testing or system integration

LESSON 9

INTEGRATION TESTING AS A PHASE OF TESTING


Contents
9.0 Aims and Objectives
9.1 Integration Testing As a Phase of Testing
9.2 Scenario Testing
9.2.1 System Scenarios
9.2.2 Use case Scenarios
9.3 Let Us Sum Up

9.0 AIMS AND OBJECTIVES

In the previous lesson, we discussed the fundamental concepts of
integration testing and the different types of integration testing. In this lesson,
we are going to discuss integration testing as one of the phases of testing. We
also discuss scenario testing and its types. At the end of this lesson, you will
understand system scenarios and use case scenarios.

9.1 INTEGRATION TESTING AS A PHASE OF TESTING

Integration testing as a phase of testing starts from the point where two
components can be tested together, to the point where all the components work
together as a complete system delivering system/product functionality. In the
integration testing phase, the focus is not only on whether functionality of the
components works well, but also on whether they work together and deliver
sub-system and system functionality.
The integration testing phase focuses on finding defects which
predominantly arise because of combining various components for testing,
rather than defects confined to a single component or a few components. Integration testing
as a type focuses on testing the interfaces. This is a subset of the integration
testing phase. When a sub-system or system components are put together (or
integrated), the defects not only arise because of interfaces, but also for various
other reasons such as usage, incomplete understanding of product domain,
user errors, and so on. Hence the integration testing phase needs to focus on
interfaces as well as usage flow. It is very important to note this point to avoid
confusion between integration testing type and integration testing phase.
Integration testing as a phase involves different activities and different
types of testing have to be done in that phase. This is a testing phase that
should ensure completeness and coverage of testing for functionality. To
achieve this, the focus should not only be on planned test case execution but
also on unplanned testing, which is termed "ad hoc testing." As a principle of
testing, there is no end to testing, and quality cannot depend only on pre-
written test cases; hence ad hoc testing becomes important in the integration
testing phase. There are different terminologies associated with ad hoc testing, such as
exploratory testing, monkey testing, out of the box testing, and so on. All these
tests perform the same functions during the integration testing phase, that is,
uncover or unearth those defects which are not found by planned test case
execution. This approach helps in locating some problems which are difficult to
find by test teams but also difficult to imagine in the first place. The approach
also helps in generating a comfort feeling on the software and getting an overall
acceptance of the product from all internal users of the system.
The integration testing phase involves developing and executing test
cases that cover multiple components and functionality. When the functionalities
of different components are combined and tested together for a sequence of
related operations, they are called scenarios. Scenario testing is a planned
activity to explore different usage patterns and combine them into test cases
called scenario test cases. We will see scenario testing in more detail in the next
section.

9.2 SCENARIO TESTING

Scenario testing is defined as a "set of realistic user activities that are
used for evaluating the product." It is also defined as the testing involving
customer scenarios.
There are two methods to evolve scenarios.
1. System scenarios
2. Use-case scenarios/role based scenarios

Check your progress 1

Describe scenario testing.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

9.2.1 SYSTEM SCENARIOS

System scenario is a method whereby the set of activities used for
scenario testing covers several components in the system. The following
approaches can be used to develop system scenarios.
Story line Develop a story line that combines various activities in the product
that may be executed by an end user. A user enters his or her office, logs into
the system, checks mail, responds to some mails, compiles some programs,
performs unit testing, and so on. All these typical activities, carried out in the
course of normal work, when joined together become a scenario.

Life cycle/state transition Consider an object, derive the different transitions/
modifications that happen to the object, and derive scenarios to cover them. For
example, in a savings bank account, you can start by opening an account with
a certain amount of money, make a deposit, perform a withdrawal, calculate
interest, and so on. All these activities are applied to the "money" object, and
the different transformations applied to the "money" object become different
scenarios.
Deployment/implementation stories from customer Develop a scenario from
known customer deployment/implementation details and create a set of
activities performed by various users in that implementation.
Business verticals Visualize how a product/software will be applied to different
verticals and create a set of activities as scenarios to address specific vertical
businesses. For example, take the purchasing function. It may be done
differently in different verticals like pharmaceuticals, software houses, and
government organizations. Visualizing these different types of tests makes the
product "multi-purpose."
Battle ground Create some scenarios to justify that "the product works" and
some scenarios to "try and break the system" to justify "the product doesn't
work." This adds flavor to the scenarios mentioned above.
The set of scenarios developed will be more effective if the majority of the
approaches mentioned above are used in combination, not in isolation.
A scenario should not be a set of disjointed activities which have no relation to
each other. Any activity in a scenario is always a continuation of the previous
activity, and depends on or is impacted by the results of previous activities.
Effective scenarios will have a combination of current customer
implementation, foreseeing future use of product, and developing ad hoc test
cases. Considering only one aspect (current customer usage or future customer
requirements, for instance) would make scenarios ineffective. If only current
customer usage is considered for testing, new features may not get tested
adequately. Considering only the future market for scenarios may make the
scenarios test only the new features and some of the existing functionality may
not get tested. A right mix of scenarios using the various approaches explained
above is very critical for the effectiveness of scenario testing.
Coverage is always a big question with respect to functionality in
scenario testing. This testing is not meant to cover different permutations and
combinations of features and usage in a product. However, by using a simple
technique, some comfort feeling can be generated on the coverage of activities
by scenario testing.

Check your progress 2

Justify that integration testing is a phase of testing.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

9.2.2 USE CASE SCENARIOS

A use case scenario is a stepwise procedure on how a user intends to use
a system, with different user roles and associated parameters. A use case
scenario can include stories, pictures, and deployment details. Use cases are
useful for explaining customer problems and how the software can solve those
problems without any ambiguity.
A use case can involve several roles or class of users who typically
perform different activities based on the role. There are some activities that are
common across roles and there are some activities that are very specific and
can be performed only by a user belonging to a particular role. Use case
scenarios term the users with different roles as actors. What the product
should do for a particular activity is termed the system behavior. Users with a
specific role who interact between the actors and the system are called agents.
To explain the concept of use case scenarios, let us take the example of
withdrawing cash from a bank. A customer fills up a check and gives it to an
official in the bank. The official verifies the balance in the account from the
computer and gives the required cash to the customer. The customer in this
example is the actor, the official the agent, and the response given by the
computer, which gives the balance in the account, is called the system
response.
This way of describing different roles in test cases helps in testing the
product without getting into the details of the product. In the above example,
the actor (who is the customer) need not know what the official is doing and
what command he is using to interact with the computer.

Fig 9.1 Sample Scenario


The actor is only concerned about getting the cash. The agent (who is the
official) is not concerned about the logic of how the computer works. He or she
is only interested in knowing from the computer whether he or she can give the
cash or not. However, the system behavior (computer logic) needs to be tested
before applying the sequence of agent activities and actor activities. In this
example depicted in Figure 9.1, the activities performed by the actor and the
agent can be tested by testers who do not have much knowledge of the product.
Testers who have in-depth knowledge of the product can perform the system
behavior part of testing. They need to know the logic of how the code works and
whether or not the system response is accurate.
As mentioned earlier, actor and agent are roles that represent different
types (classes) of users. Simulating different types of users again needs a clear
understanding of business and the system response for each of the user needs
a clear understanding of how the product is implemented. Hence, testers using
the use case model, with one person testing the actions and other person
testing the system response, complement each other's testing as well as testing
the business and the implementation' aspect of the product at the same time.
The agent part of the use cases is not needed in all cases. In a
completely automated system involving the customer and the system, use
cases can be written without considering the agent portion. Let us extend the earlier
example of cash withdrawal using an ATM. Table 9.1 illustrates how the actor
and system response can be described in the use case.
Table 9.1 Possible actor and system responses

Actor                                   System response

User wants to withdraw cash and         Request the password or Personal
inserts the card in the ATM machine     Identification Number (PIN)

User fills in the password or PIN       Validate the password or PIN;
                                        give a list containing types of accounts

User selects an account type            Ask the user for the amount to withdraw

User fills in the amount of cash        Check availability of funds;
required                                update account balance;
                                        prepare receipt;
                                        dispense cash

User retrieves cash from the ATM        Print receipt

This way of documenting a scenario and testing makes it simple and also
makes it realistic for customer usage. Use cases are not used only for testing. In
some product implementations, use cases are prepared prior to the design and
coding phases, and they are used as a set of requirements for design and
coding phases. All development activities are performed based on use case
documentation. In extreme programming models these are termed as user
stories and form the basis for architecture/design and coding phases. Hence,
use cases are useful in combining the business perspectives and
implementation detail and testing them together.
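A use case such as Table 9.1 can be turned almost directly into an executable
scenario test. The sketch below is illustrative only; the Atm class and its
methods are hypothetical stand-ins for the system under test:

    # The ATM use case of Table 9.1 as a scripted scenario test.
    class Atm:
        def __init__(self, balance):
            self.balance = balance
        def insert_card(self):
            return "enter PIN"                       # system response
        def enter_pin(self, pin):
            return ["savings", "checking"]           # account types offered
        def select_account(self, account):
            return "enter amount"
        def withdraw(self, amount):
            assert amount <= self.balance, "insufficient funds"
            self.balance -= amount                   # update account balance
            return {"cash": amount, "receipt": True}

    # Actor actions on the left; expected system responses checked in asserts.
    atm = Atm(balance=500)
    assert atm.insert_card() == "enter PIN"
    assert "savings" in atm.enter_pin("1234")
    assert atm.select_account("savings") == "enter amount"
    result = atm.withdraw(200)
    assert result["cash"] == 200 and result["receipt"]
    print("scenario passed; remaining balance:", atm.balance)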

Check your progress 3

Define use case scenario.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

9.3 LET US SUM UP

Check Your Progress: Model Answers


1. Scenario testing is defined as a "set of realistic user activities that are
used for evaluating the product." It is also defined as the testing involving
customer scenarios.
There are two methods to evolve scenarios.
• System scenarios
• Use-case scenarios/role based scenarios
2. The integration testing phase focuses on finding defects which
predominantly arise because of combining various components for
testing, rather than defects confined to a single component or a few
components. Integration testing as a type focuses on testing the
interfaces.
3. A use case scenario is a stepwise procedure on how a user intends to use
a system, with different user roles and associated parameters. A use case
scenario can include stories, pictures, and deployment details.

LESSON 10

DEFECT BASH
Contents
10.0 Aims and Objectives
10.1 Defect Bash
10.2 Choosing the Frequency and Duration of Defect Bash
10.3 Selecting the Right Product Build
10.4 Communicating the Objective of Defect Bash
10.5 Setting up and monitoring the Lab
10.6 Taking Actions and Fixing Issues
10.7 Optimizing the Effort Involved in Defect Bash
10.8 Let Us Sum Up

10.0 AIMS AND OBJECTIVES

In all previous lessons, we discussed various types of testing, such as
white box testing, black box testing, integration testing, and scenario testing. I
hope you have understood the basic ethics of testing and its procedures.
In this lesson, I am going to introduce an ad hoc testing method called
defect bash. The steps to perform a defect bash and the formulae to be applied
are elaborated in the sections below.

10.1 DEFECT BASH

Defect bash is an ad hoc testing approach where people performing
different roles in an organization test the product together at the same time.
This is very popular among application development companies, where the
product can be used by people who perform different roles. The testing by all
the participants
during defect bash is not based on written test cases. What is to be tested is left
to an individual's decision and creativity. They can also try some operations
which are beyond the product specifications. Defect bash brings together plenty
of good practices that are popular in testing industry. They are as follows.
1. Enabling people "Cross boundaries and test beyond assigned areas"
2. Bringing different people performing different roles together in the
organization for testing-"Testing isn't for testers alone"
3. Letting everyone in the organization use the product before delivery-"Eat
your own dog food"
4. Bringing fresh pairs of eyes to uncover new defects-"Fresh eyes have less
bias"
5. Bringing in people who have different levels of product understanding to
test the product together randomly-"Users of software are not the same"
6. Not letting testing wait for documentation to be completed-"Does testing
wait till all documentation is done?"
7. Enabling people to say "the system works" as well as enabling them to
"break the system"-"Testing isn't to conclude the system works or
doesn't work"
Even though it is said that defect bash is an ad hoc testing method, not
all activities of a defect bash are unplanned. All the activities in the defect
bash are planned activities, except for what is to be tested. It involves several
steps.
Step 1 Choosing the frequency and duration of defect bash
Step 2 Selecting the right product build
Step 3 Communicating the objective of each defect bash to everyone
Step 4 Setting up and monitoring the lab for defect bash
Step 5 Taking actions and fixing issues
Step 6 Optimizing the effort involved in defect bash
Bug bash is where all the developers, testers, program managers, usability
researchers, designers, documentation folks, and even sometimes marketing
people, put aside their regular day-to-day duties and pound on the product to
get as many eyes on the product as possible.
Bug bash sounds similar to "eat your own dog food" and is a tool used as part
of a test management approach. A bug bash is usually declared in advance to
the team. The test management team sends out the scope and assigns testers
as resources to assist in setup and also to collect bugs. Test management
might use this along with a small token prize for good bugs found and/or have
small socials at the end of the bug bash. Another interesting bug bash prize
has been pieing test management team members.
Ad hoc testing is a commonly used term for software testing performed
without planning and documentation. The tests are intended to be run only
once, unless a defect is discovered. Ad hoc testing is a part of exploratory
testing, being the least formal of test methods. In this view, ad hoc testing has
been criticized because it isn't structured, but this can also be a strength:
important things can be found quickly. It is performed with improvisation; the
tester seeks to find bugs with any means that seem appropriate. It contrasts
with regression testing, which looks for a specific issue with detailed
reproduction steps and a clear expected result. Ad hoc testing is most often
used as a complement to other types of testing.

Check your progress 1

Define Defect Bash.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

10.2 CHOOSING THE FREQUENCY AND DURATION OF DEFECT BASH

Defect bash is an activity involving a large amount of effort (since it
involves a large number of people) and huge planning (as is evident from the
above steps). Too-frequent defect bashes will incur a low return on investment,
and too few defect bashes may not meet the objective of finding all defects.
Duration is also an important factor. Keeping the duration short is a big
saving, as a large number of people are involved. On the other hand, if the
duration is too short, the amount of testing that is done may not meet the
objective.
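A back-of-the-envelope sketch of this trade-off (the formula and the numbers
are illustrative assumptions, not from the text):

    # Comparing the cost per defect of different bash durations.
    def cost_per_defect(people, hours, defects_found):
        effort_person_hours = people * hours
        return effort_person_hours / max(defects_found, 1)

    # A short bash versus one twice as long -- illustrative numbers only.
    print(cost_per_defect(people=30, hours=4, defects_found=40))    # 3.0
    print(cost_per_defect(people=30, hours=8, defects_found=48))    # 5.0

If doubling the duration yields only a few more defects, the longer bash has a
worse return on investment; such simple arithmetic can guide the choice of
frequency and duration.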

10.3 SELECTING THE RIGHT PRODUCT BUILD

Since the defect bash involves a large number of people, effort and
planning, a good quality build is needed for defect bash. A regression tested
build would be ideal as all new features and defect fixes would have been
already tested in such a build. An intermediate build where the code
functionality is evolving or an untested build will make the purpose and
outcome of a defect bash ineffective. Where a large number of people are
involved, a good quality product build gives confidence on the product and
progress. Also, when testers doing a defect bash uncover an excessive number
of defects or very severe defects, the confidence of the testers falls and the
perception of the product being unstable lingers on for long.

10.4 COMMUNICATING THE OBJECTIVE OF DEFECT BASH

Even though defect bash is an ad hoc activity, its purpose and objective
have to be very clear. Since defect bash involves people performing different
roles, the contribution they make has to be focused towards meeting the
purpose and objective of defect bash. The objective should be to find a large
number of uncovered defects or finding out system requirements (CPU,
memory, disk, and so on) or finding the non-reproducible or random defects,
which could be difficult to find through other planned tests. Defects that a test
engineer would find easily should not be the objective of a defect bash. Once
they are told in advance, the members of the defect bash team will be in a
better position to contribute towards the stated objectives.

Check your progress 2

How is a defect bash handled by a test engineer?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

10.5 SETTING UP AND MONITORING THE LAB

Since defect bashes are planned, short-term, and resource-intensive activities, it
makes sense to set up and monitor a laboratory for this purpose. Finding the
right configuration and resources (hardware, software, and the set of people to
perform the defect bash) are activities that have to be planned carefully before a
bash actually starts. Since the effort involved is high, it is critical to ensure
that the right setup is done, so that everyone can perform the desired set of
activities on the software. The majority of defect bashes fail due to inadequate
hardware, wrong software configurations, and perceptions related to
performance and scalability of the software. During a defect bash, the product
parameters and system resources (CPU, RAM, disk, network) need to be
monitored for defects and also corrected so that users can continue to use the
system for the complete duration of the defect bash.
There are two types of defects that will emerge during a defect bash. The
defects that are in the product, as reported by the users, can be classified as
functional defects. Defects that are unearthed while monitoring the system
resources, such as memory leak, long turnaround time, missed requests, high
impact and utilization of system resources, and so on are called non-functional
defects. Defect bash is a unique testing method which can bring out both
functional and non-functional defects. However, if the lab is not set up properly
or not monitored properly, there is a chance that some of the non-functional
defects may not get noticed at all.
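As a minimal illustration of such monitoring, the sketch below (in Python, using
the third-party psutil library) polls CPU, memory, and disk at a fixed interval
during a bash and flags readings that cross assumed thresholds. The thresholds,
interval, and duration are hypothetical choices, not values prescribed by this
lesson.

import time
import psutil  # third-party library: pip install psutil

# Hypothetical alert thresholds for a defect bash lab; tune these per setup.
THRESHOLDS = {"cpu_percent": 90.0, "memory_percent": 85.0, "disk_percent": 90.0}

def snapshot():
    """Collect one reading of the key system resources."""
    return {
        "cpu_percent": psutil.cpu_percent(interval=1),
        "memory_percent": psutil.virtual_memory().percent,
        "disk_percent": psutil.disk_usage("/").percent,
    }

def monitor(duration_s, poll_s):
    """Poll resources for the bash duration and report threshold breaches."""
    end = time.time() + duration_s
    while time.time() < end:
        reading = snapshot()
        breaches = {k: v for k, v in reading.items() if v >= THRESHOLDS[k]}
        if breaches:
            print("ALERT (possible non-functional defect):", breaches)
        time.sleep(poll_s)

monitor(duration_s=2 * 60 * 60, poll_s=60)  # a two-hour bash, polled every minute

Readings that breach a threshold are candidates for the non-functional defects
described above; correcting them during the bash keeps the system usable for its
full duration.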

Check your progress 3

Discuss the two types of defects that appear in a defect bash.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

10.6 TAKING ACTIONS AND FIXING ISSUES

The last step is to take the necessary corrective action after the defect
bash. Getting a large number of defects from users is the purpose and also the
normal end result from a defect bash. Many defects could be duplicate defects.
However, different interpretations of the same defect by different users, and the
impact of the same defect showing up differently in different places, make them
difficult to be called duplicates. Since there could be a large number of defects,
the approach to fix problems from a defect bash should not be at a per defect
level. It is difficult to solve all the problems if they are taken one by one and
fixed in code. The defects need to be classified into issues at a higher level, so
that a similar outcome can be avoided in future defect bashes. There could be
one defect associated with an issue, or there could be several defects that can
be grouped into one issue. An example of an issue can be "In all components, all
inputs for employee number have to be validated before using them in business

logic." This enables all defects from different components to be grouped and
classified as one issue. All the issues reported from a defect bash need to be
taken through complete code and design inspections, analyzed, and fixed
together in the places from which a defect could evolve. So the outcome of a defect
bash can also be used for preventing defects for future defect bashes.
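As a minimal sketch of this classification step, the Python fragment below groups
raw defect reports into higher-level issues by a shared root cause. The report
fields and sample data are hypothetical stand-ins for whatever a real
defect-tracking tool records.

from collections import defaultdict

# Hypothetical defect reports from a bash; the field names are illustrative.
defects = [
    {"id": 101, "component": "payroll", "root_cause": "unvalidated employee number"},
    {"id": 102, "component": "leave", "root_cause": "unvalidated employee number"},
    {"id": 103, "component": "payroll", "root_cause": "memory leak"},
]

def group_into_issues(defect_list):
    """Group defects sharing a root cause into one issue each."""
    issues = defaultdict(list)
    for defect in defect_list:
        issues[defect["root_cause"]].append(defect["id"])
    return dict(issues)

print(group_into_issues(defects))
# {'unvalidated employee number': [101, 102], 'memory leak': [103]}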

10.7 OPTIMIZING THE EFFORT INVOLVED IN DEFECT BASH

Since a defect bash involves a large number of people, spending significant
effort on conducting one is normal. There are several ways to optimize
the effort involved in a defect bash if a record of objectives and outcomes is kept.
Having a tested build, keeping the right setup, sharing the objectives, and so
on, to save effort and meet the purpose, have already been discussed. Another
approach to reduce the defect bash effort is to conduct "micro level" defect
bashes before conducting one on a large scale. Some of the more evident defects
will emerge at micro level bashes.
Since a defect bash is an integration testing phase activity, it can be
experimented with by the integration test team before they open it up for others. To
prevent component level defects emerging during integration testing, a micro
level defect bash can also be done to unearth feature level defects, before the
product can be taken into integration.
Hence, a defect bash can be further classified into
1. Feature/component defect bash
2. Integration defect bash
3. Product defect bash
To explain the effort saved by the defect bash classification, let us take
three product defect bashes conducted in two hours with 100 people. The total
effort involved is 3*2*100 = 600 person hours. If the feature/component test
team and the integration test team, with 10 people each, can participate in
doing two rounds of micro level bashes that find one third of the expected
defects (so that one product bash can be dropped), then the effort saving is
20%, as per the following calculation.
A. Effort involved in two rounds of product bashes (2*2*100) = 400 person hours
B. Effort involved in two rounds of feature bashes (2*2*10) = 40 person hours
C. Effort involved in two rounds of integration bashes (2*2*10) = 40 person hours
Effort saved = 600 - (A + B + C) = 600 - 480 = 120 person hours, or 20%
This is only an approximate calculation, as the effort involved in the
steps mentioned earlier in this section (Steps 1-6) also need to be included for
each defect bash. Those steps have to be repeated for each defect bash,
irrespective of whether they are at feature level or integration level or product
level.
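The arithmetic above is easy to check mechanically. The short Python sketch below
reproduces the 20% figure under the same assumptions: two-hour bashes, 100 people
for a product bash, and 10 people each on the feature and integration teams.

def bash_effort(rounds, hours, people):
    """Effort of a set of defect bash rounds, in person hours."""
    return rounds * hours * people

baseline = bash_effort(rounds=3, hours=2, people=100)      # 600 person hours
product = bash_effort(rounds=2, hours=2, people=100)       # A = 400
feature = bash_effort(rounds=2, hours=2, people=10)        # B = 40
integration = bash_effort(rounds=2, hours=2, people=10)    # C = 40

saved = baseline - (product + feature + integration)       # 600 - 480 = 120
print(f"Saved {saved} person hours ({saved / baseline:.0%})")  # Saved 120 person hours (20%)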
Integration testing is both a type of testing and a phase of testing.
Integration testing starts after each of the components are tested alone and
delivered, using black box testing approaches discussed. All components tested
and working do not mean they will continue to work the same way after they
are put together and integrated. This is an important phase/type of testing that

is often ignored by many organizations. Owing to project pressure and delays in
development schedules, integration testing may get diluted as it is performed in
between the component and system testing phases. A separate test team focusing
on integration testing is an initiative recently taken by several companies to
give integration testing the focus it has deserved for a long time.
Integration testing, if done properly, can reduce the number of defects that will
be found in the system testing phase, a phase that is explained in the following
chapter.

10.8 LETS SUM UP

We are at the end of the lesson; let us look at a glance at all that was
discussed above. The testing by the participants during a defect bash is not
based on written test cases. What is to be tested is left to an individual's
decision and creativity. Defect bash is an activity involving a large amount of
effort (since it involves a large number of people) and a large amount of
planning (as is evident from the above steps).
Integration testing is both a type of testing and a phase of testing. Integration
testing starts after each of the components has been tested alone and delivered,
using the black box testing approaches discussed.
Check Your Progress: Model Answers
1. Defect bash is an ad hoc testing where people performing different roles
in an organization test the product together at the same time. This is very
popular among application development companies, where the product
can be used by people who perform different roles.
2. Defects that a test engineer would find easily should not be the objective
of a defect bash. Once the objectives are told in advance, the members of
the defect bash team will be in a better position to contribute towards
the stated objectives.
3. There are two types of defects that will emerge during a defect bash. The
defects that are in the product, as reported by the users, can be
classified as functional defects. Defects that are unearthed while
monitoring the system resources, such as memory leak, long turnaround
time, missed requests, high impact and utilization of system resources,
and so on are called non-functional defects

UNIT - III
LESSON 11

SYSTEM AND ACCEPTANCE TESTING


Contents
11.0 Aims and Objectives
11.1 System Testing Overview
11.2 Why is System Testing Done?
11.3 Functional Versus Non-Functional Testing
11.4 Let Us Sum Up

11.0 AIMS AND OBJECTIVES

This is the entry point of Unit III, in which we will discuss System and
Acceptance Testing. In this lesson, we will introduce you to system testing,
along with functional and non-functional testing.
We hope the reader will be able to understand the various circumstances of
applying these test methodologies to real-world software.

11.1 SYSTEM TESTING OVERVIEW

System testing is defined as a testing phase conducted on the complete
integrated system, to evaluate the system's compliance with its specified
requirements. It is done after unit, component and integration testing phases.
A system is a complete set of integrated components that together deliver
product functionality and features. A system can also be defined as a set of
hardware, software and other parts that together provide product features and
solutions. In order to test the entire system, it is necessary to understand the
product’s behavior as a whole. System testing helps in uncovering the defects
that may not be directly attributable to a module or an interface. System testing
brings out issues that are fundamental to design, architecture and code of the
whole product.
System testing is the only phase of testing which tests both the
functional and non-functional aspects of the product. On the functional side,
system testing focuses on real-life customer usage of the product and solutions.
System testing simulates customer deployments. For a general-purpose
product, system testing also means testing it for different business verticals and
applicable domains such as insurance, banking, asset management and so on.
On the non-functional side, system brings in different testing types (also
called quality factors), some of which are as follows.
1. Performance/Load testing To evaluate the time taken or response time
of the system to perform its required functions, in comparison with
different versions of the same product(s) or different competitive
product(s), is called performance testing. (A minimal timing sketch of
this idea follows the list.)
2. Scalability testing A testing that requires an enormous amount of
resources to find out the maximum capability of the system parameters
is called scalability testing.
3. Reliability testing To evaluate the ability of the system or an
independent component of the system to perform its required functions
repeatedly for a specified period of time is called reliability testing.
4. Stress testing Evaluating a system beyond the limits of the specified
requirements or system resources (such as disk space, memory,
processor utilization) to ensure the system does not break down
unexpectedly is called stress testing.
5. Interoperability testing This testing is done to ensure that two or more
products can exchange information, use the information and work
closely.
6. Localization Testing Testing conducted to verify that the localized
product works in different languages is called localization testing.
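As a minimal timing sketch of the performance/load idea in item 1 above, the
Python fragment below fires a number of concurrent requests at an operation and
reports the response times. The operation, user count, and 50 ms delay are
hypothetical stand-ins for a real system under test.

import time
from concurrent.futures import ThreadPoolExecutor

def operation():
    """Hypothetical stand-in for one user request to the system under test."""
    time.sleep(0.05)  # simulate a 50 ms server round trip

def measure(n_users=20):
    """Run n_users concurrent requests and return each response time."""
    def timed_call(_):
        start = time.perf_counter()
        operation()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=n_users) as pool:
        return list(pool.map(timed_call, range(n_users)))

times = measure()
print(f"max: {max(times):.3f}s  avg: {sum(times) / len(times):.3f}s")

Comparing such numbers across versions of the same product, or against a
competitive product, is what turns a measurement like this into performance
testing as defined above.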
The definition of system testing can keep changing, covering wider and
more high-level aspects, depending on the context. A solution provided to a
customer may be an integration of multiple products. Each product may be a
combination of several components. A supplier of a component of a product can
assume the independent component as a system in its own right and do system
testing of the component. From the perspective of the product organization,
integrating those components is referred to as sub-system testing. When all
components, delivered by different component developers, are assembled by a
product organization, they are tested together as a system. At the next level,
there are solution integrators who combine products from multiple sources to
provide a complete integrated solution for a client. They put together many
products as a system and perform system testing of this integrated solution.
System testing is performed on the basis of written test cases according
to information collected from detailed architecture/design documents, module
specifications and system requirements specifications. System test cases are
created after looking at component and integration test cases, and are at the
same time designed to include the functionality that tests the system as a whole.
System test cases can also be developed based on user stories, customer
discussions, and points made by observing typical customer usage.
System testing may not include many negative scenario verifications,
such as testing for incorrect and negative values. This is because such negative
testing would have been already performed by component and integration
testing and may not reflect real-life customer usage.
System testing may be started once unit, component, and integration
testing are completed. This would ensure that the more basic program logic
errors and defects have been corrected. Apart from verifying the business
requirements of the product, system testing is done to ensure that the product
is ready for moving to the user acceptance test level.

Check your Progress 1

Define System Testing.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

11.2 WHY IS SYSTEM TESTING DONE?

An independent test team normally does system testing. This
independent test team is different from the team that does the component and
integration testing. The system test team generally reports to a manager other
than the project manager to avoid conflict of interest and to provide freedom to
individuals doing system testing. Testing the product with an independent
perspective and combining that with the perspectives of the customer makes
system testing unique, different and effective. System testing by an independent
team removes bias towards the product, and inducting a "fresh pair of eyes"
through an independent system test team helps in locating problems missed
by component and integration testing.
The behavior of the complete product is verified during system testing.
Tests that refer to multiple modules, programs and functionality are included in
system testing. Testing the complete product behavior is critical as it is wrong
to believe that individually tested components will work together when they are
put together.
System testing helps in identifying as many defects as possible before the
customer finds them in the deployment. This is the last chance for the test
team to find any remaining product defects before the product is handed over to
the customer.
System testing is conducted with an objective to find product level
defects and in building the confidence before the product is released to the
customer. Component and integration testing phases focus on finding defects. If
the same focus is provided in system testing and significant defects are found,
it may generate a feeling that the product is unstable (especially because
system testing is closer to product release than component or integration
testing). Contrary to this, if system testing uncovers few defects, it raises
questions on the effectiveness and value of system testing phase. Hence, system
testing strives to always achieve a balance between the objective of finding
defects and the objective of building confidence in the product prior to release.
Since system testing is the last phase of testing before the release, not all
defects can be fixed in code due to time and effort needed in development and
testing and due to the potential risk involved in any last-minute changes.
Hence, an impact analysis is done for those defects to reduce the risk of
releasing a product with defects. If the risk of the customers getting exposed to

the defects is high, then the defects are fixed before the release; else, the
product is released as such. The analysis of defects and their classification into
various categories also gives an idea about the kind of defects that will be found
by the customer after release. This information helps in planning workarounds,
alternative approaches, and so on. Hence, system testing helps in reducing the
risk of releasing a product.
System testing is highly complementary to other phases of testing. The
component and integration test phases are conducted taking inputs from
functional specification and design. The main focus during these testing phases
is on technology and product implementation. On the other hand, customer
scenarios and usage patterns serve as the basis for system testing. Thus system
testing phase complements the earlier phases with an explicit focus on
customers. The system testing phase helps in switching this focus of the
product development team towards customers and their use of the product.
To summarize, system testing is done for the following reasons.
1. Provide independent perspective in testing.
2. Bring in customer perspective in testing.
3. Provide a “fresh pair of eyes” to discover defects not found earlier by
testing.
4. Test product behavior in a holistic, complete and realistic environment
5. Test both functional and non-functional aspects of the product.
6. Build confidence in the product.
7. Analyze and reduce the risk of releasing the product.
8. Ensure all requirements are met and ready the product for acceptance
testing.

Check your progress 2

What is the objective of applying System Testing?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

11.3 FUNCTIONAL VERSUS NON-FUNCTIONAL TESTING

Functional testing involves testing a product's functionality and features.
Non-functional testing involves testing the product's quality factors. System
testing comprises both functional and non-functional test verification.

Functional testing helps in verifying what the system is supposed to do.
It aids in testing the product’s features or functionality. It has only two results
as far as requirements fulfillment is concerned: met or not met. If requirements
are not properly enumerated, functional requirements may be understood in
many ways. Hence, functional testing should have very clear expected results
documented in terms of the behavior of the product. Functional testing
comprises simple methods and steps to execute the test cases. Functional
testing results normally depend on the product, not on the environment. It uses
a pre-determined set of resources and configuration except for a few types of
testing such as compatibility testing where configurations play a role.
Functional testing requires in-depth customer and product knowledge as well
as domain knowledge so as to develop different test cases and find critical
defects, as the focus of the testing is to find defects. Failures in functional
testing normally result in fixes in the code to arrive at the right behavior.
Functional testing is performed in all phases of testing such as unit testing,
component testing, integration testing, and system testing. Having said that,
the functional testing done in the system testing phase (functional system
testing) focuses on product features as against component features and
interface features.
Non-functional testing is performed to verify the quality factors (such as
reliability, scalability etc.). These quality factors are also called non-functional
requirements. Non-functional testing requires the expected results to be
documented in qualitative and quantifiable terms. Non-functional testing
requires large amount of resources and the results are different for different
configurations and resources. Non-functional testing is very complex due to the
large amount of data that needs to be collected and analyzed. The focus of non-
functional testing is to qualify the product; it is not meant to be a defect-
finding exercise. Test cases for non-functional testing include clear pass/fail
criteria. However, test results are concluded both on pass/fail definitions and
on the experiences encountered in running the tests.
Apart from verifying the pass or fail status, non-functional tests results
are also determined by the amount of effort involved in executing them and any
problems faced during execution. For example, if a performance test met the
pass/fail criteria after 10 iterations, then the experience is bad and the test
result cannot be taken as a pass. Either the product or the non-functional
testing process needs to be fixed here.
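This decision rule can be written down directly: a run counts as a pass only if
the criteria are met within an acceptable number of iterations. The Python sketch
below is illustrative; the iteration limit of three is an assumed threshold, not
one prescribed by the lesson.

def nonfunctional_verdict(criteria_met, iterations, max_iterations=3):
    """
    Conclude a non-functional test result from both the pass/fail
    criteria and the effort (iterations) taken to meet them.
    """
    if not criteria_met:
        return "fail"
    if iterations > max_iterations:
        # Criteria eventually met, but the experience was bad: either the
        # product or the testing process needs fixing before this is a pass.
        return "investigate"
    return "pass"

print(nonfunctional_verdict(criteria_met=True, iterations=10))  # investigate
print(nonfunctional_verdict(criteria_met=True, iterations=2))   # pass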
Non-functional testing requires understanding the product behavior,
design, and architecture and also knowing what the competition provides. It
also requires analytical and statistical skills as the large amount of data
generated requires careful analysis. Failures in non-functional testing affect the
design and architecture much more than the product code. Since non-
functional testing is not repetitive in nature and requires a stable product, it is
performed in the system testing phase.
Some of the points mentioned in Table 11.1 may be seen as
judgmental and subjective. For example, design and architecture knowledge is
needed for functional testing also. Hence all the above points have to be taken
as guidelines, not dogmatic rules. Since both functional and non-functional

aspects are being tested in the system testing phase, the question that can be
asked is "What is the right proportion of test cases/effort for these two types of
testing?" Since functional testing is a focus area starting from the unit testing
phase while non-functional aspects get tested only in the system testing phase,
it is a good idea that a majority of system testing effort be focused on the non-
functional aspects. A 70%-30% ratio between non-functional and functional
testing can be considered good, and a 50%-50% ratio is a good starting point.
However, this is only a guideline, and the right ratio depends more on the
context, type of release, requirements, and products.
Table 11.1 Functional testing versus non-functional testing

Testing aspect           | Functional testing                | Non-functional testing
-------------------------+-----------------------------------+----------------------------------
Involves                 | Product features and              | Quality factors
                         | functionality                     |
Tests                    | Product behavior                  | Behavior and experience
Result conclusion        | Simple steps written to check     | Huge data collected and
                         | expected results                  | analyzed
Results vary due to      | Product implementation            | Product implementation,
                         |                                   | resources and configuration
Testing focus            | Defect detection                  | Qualification of product
Knowledge required       | Product and domain                | Product, domain, design,
                         |                                   | architecture, statistical skills
Failures normally due to | Code                              | Architecture, design and code
Testing phase            | Unit, component, integration,     | System
                         | system                            |
Test case repeatability  | Repeated many times               | Repeated only in case of
                         |                                   | failures and for different
                         |                                   | configurations
Configuration            | One-time setup for a set of       | Configuration changes for
                         | test cases                        | each test case

Check your progress 3

What do you mean by non-functional requirements?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------

11.4 LET US SUM UP

Now we are at the end of the lesson, and we hope the reader can understand
the concepts of system testing and its various subsections.
System testing is the only phase of testing which tests both the functional and
non-functional aspects of the product. On the functional side, system testing
focuses on real-life customer usage of the product and solutions.
Stress testing Evaluating a system beyond the limits of the specified
requirements or system resources (such as disk space, memory, processor
utilization) to ensure the system does not break down unexpectedly is called
stress testing.
Performance/Load testing, Scalability testing, Reliability testing, Stress testing,
Interoperability testing, Localization Testing are the non-functional testing
methods.
Check Your Progress: Model Answers
1. System testing is defined as a testing phase conducted on the complete
integrated system, to evaluate the system compliance with its specified
requirements. It is done after unit, component and integration testing
phases.
2. System testing is conducted with an objective to find product level
defects and in building the confidence before the product is released to
the customer. Component and integration testing phases focus on
finding defects. If the same focus is provided in system testing and
significant defects are found, it may generate a feeling that the product is
unstable (especially because system testing is closer to product release
than component or integration testing).
3. Non-functional testing is performed to verify the quality factors (such as
reliability, scalability etc.). These quality factors are also called non-
functional requirements. Non-functional testing requires the expected
results to be documented in qualitative and quantifiable terms. Non-
functional testing requires large amount of resources and the results are
different for different configurations and resources

LESSON 12

FUNCTIONAL SYSTEM TESTING


Contents
12.0 Aims and Objectives
12.1 Functional System Testing
12.2 Design / Architecture Verification
12.3 Business Vertical Testing
12.4 Deployment Testing
12.5 Beta Testing
12.6 Certification, Standards and Testing for Compliance
12.7 Let Us Sum Up

12.0 AIMS AND OBJECTIVES

Extending the previous lesson, we are going to discuss various functional
system testing methods such as business vertical testing, deployment testing,
design/architecture verification, beta testing, and so on.
At the end of the lesson the reader can understand the certification, standards
and testing for compliance mechanisms.

12.1 FUNCTIONAL SYSTEM TESTING

As explained earlier, functional testing is performed at different phases
and the focus is on product level features. As functional testing is performed at
various testing phases, there are two obvious problems. One is duplication and
the other is gray areas. Duplication refers to the same tests being performed
multiple times and gray area refers to certain tests being missed out in all the
phases. A small percentage of duplication across phases is unavoidable as
different teams are involved. Performing cross reviews (involving teams from
earlier phases of testing) and looking at the test cases of the previous phase
before writing system test cases can help in minimizing the duplication. A small
percentage of duplication is advisable, as different people from different teams
test the features with different perspectives; yielding new defects.
Gray areas in testing happen due to lack of product knowledge, lack of
knowledge of customer usage, and lack of co-ordination across test teams. Such
gray areas in testing make defects seep through and impact customer usage. A
test team performing a particular phase of testing may assume that a particular
test will be performed by the next phase. This is one of the reasons for such
gray areas. In such cases, there has to be a clear guideline for team interaction
to plan for the tests at the earliest possible phase. A test case moved from a
later phase to an earlier phase is a better alternative than delaying a test case
from an earlier phase to a later phase, as the purpose of testing is to find

defects as early as possible. This has to be done after completing all tests meant
for the current phase, without diluting the tests of the current phase.
There are multiple ways system functional testing is performed. There
are also many ways product level test cases are derived for functional testing.
Some of the common techniques are given below.
1. Design/architecture verification
2. Business vertical testing
3. Deployment testing
4. Beta testing
5. Certification, standards, and testing for compliance.

Check your progress 1

Describe duplication.
Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

12.2 DESIGN / ARCHITECTURE VERIFICATION

In this method of functional testing, the test cases are developed and
checked against the design and architecture to see whether they are actual
product-level test cases. Comparing this with integration testing, the test cases
for integration testing are created by looking at interfaces whereas system level
test cases are created first and verified with design and architecture to check
whether they are product-level or component-level test cases. The integration
test cases focus on interactions between modules or components whereas the
functional system test focuses on the behavior of the complete product. A side
benefit of this exercise is ensuring the completeness of the product implementation.
This technique helps in validating the product features that are written based
on customer scenarios and verifying them using product implementation. If
there is a test case that is a customer scenario but failed validation using this
technique, then it is moved appropriately to component or integration testing
phases. Since functional testing is performed at various test phases, it is
important to reject the test cases and move them to an earlier phase to catch
defects early and avoid any major surprises at the later phases. Some of the
guidelines used to reject test cases for system functional testing include the
following (a small routing sketch follows the list).
1. Is this focusing on code logic, data structures, and unit of the product?
(If yes, then it belongs to unit testing.)

2. Is this specified in the functional specification of any component? (If yes,
then it belongs to component testing.)
3. Is this specified in the design and architecture specification for integration
testing? (If yes, then it belongs to integration testing.)
4. Is it focusing on product implementation but not visible to customers?
(This is focusing on implementation-to be covered in
unit/component/integration testing.)
5. Is it the right mix of customer usage and product implementation?
(Customer usage is a prerequisite for system testing.)
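The five guidelines can be read as one routing rule applied in order. The Python
sketch below encodes them with hypothetical flag names; a real checklist would
derive these answers from the specifications mentioned above.

def route_test_case(focuses_on_code_logic, in_component_spec,
                    in_integration_design, implementation_only,
                    reflects_customer_usage):
    """Route a candidate system test case using the five guidelines above."""
    if focuses_on_code_logic:
        return "unit testing"                          # guideline 1
    if in_component_spec:
        return "component testing"                     # guideline 2
    if in_integration_design:
        return "integration testing"                   # guideline 3
    if implementation_only:
        return "unit/component/integration testing"    # guideline 4
    if reflects_customer_usage:
        return "system testing"                        # guideline 5
    return "reject: not customer usage and not traceable to any earlier phase"

print(route_test_case(False, False, False, False, True))  # system testing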

12.3 BUSINESS VERTICAL TESTING

General purpose products like workflow automation systems can be used
by different businesses and services. Using and testing the product for different
business verticals such as insurance, banking, asset management, and so on,
and verifying the business operations and usage, is called "business vertical
testing." For this type of testing, the procedure in the product is altered to suit
the process in the business. For example, in loan processing, the loan is
approved first by the officer and then sent to a clerk. In claim processing, the
claim is first worked out by a clerk and then sent to an officer for approval.
User objects such as clerk and officer are created by the product and associated
with the operations. This is one way of customizing the product to suit the
business. There are some operations that can only be done by some user
objects; this is called role based operations. It is important that the product
understands the business processes and includes customization as a feature so
that different business verticals can use the product. With the help of the
customization feature, a general workflow of a system is altered to suit specific
business verticals.
Another important aspect is called terminology. To explain this concept
let us take the example of e-mail. An e-mail sent in the insurance context may
be called a claim whereas when an e-mail is sent in a loan-processing system, it
is called a loan application. The users would be familiar with this terminology
rather than the generic terminology of "e-mail." The user interface should reflect
these terminologies rather than the generic term "e-mail," which may
dilute the purpose and may not be understood clearly by the users. An e-mail
sent to a blood bank service cannot take the same priority as an internal e-mail
sent to an employee by another employee. These differentiations need to be
made by the product using the profile of the sender and the mail contents.
Some e-mails or phone calls need to be tracked by the product to see whether
they meet Service Level Agreements (SLAs). For example, an e-mail to a blood
bank service needs as prompt a reply as possible. Some of the mails could even
be automated replies based on rules set in the e-mail management
system for meeting the SLAs. Hence the terminology feature of the product
should call the e-mail appropriately as a claim or a transaction and also
associate the profile and properties in a way a particular business vertical
works.
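A minimal sketch of the terminology and profile ideas: one mapping table per
business vertical renames the generic "e-mail" object, and a simple rule keyed
on the sender's profile assigns priority. All names and rules here are
illustrative assumptions, not part of any specific product.

# Hypothetical per-vertical terminology for the generic "e-mail" object.
TERMINOLOGY = {
    "insurance": "claim",
    "loan_processing": "loan application",
}

# Illustrative priority rule keyed on the sender's profile.
PRIORITY = {"blood_bank": "urgent", "employee": "normal"}

def label_message(vertical, sender_profile):
    """Present a generic e-mail using vertical-specific terminology and priority."""
    term = TERMINOLOGY.get(vertical, "e-mail")
    priority = PRIORITY.get(sender_profile, "normal")
    return f"{term} [{priority}]"

print(label_message("insurance", "blood_bank"))      # claim [urgent]
print(label_message("loan_processing", "employee"))  # loan application [normal]

Business vertical test cases would then check that each vertical sees its own
terms and that SLA-sensitive senders are prioritized accordingly.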

Yet another aspect involved in business vertical testing is syndication.
Not all the work needed for business verticals is done by product development
organizations. Solution integrators and service providers pay a license fee to a
product organization and sell the products and solutions using their own name
and image. In this case the product name, company name, technology names,
and copyrights may belong to the latter parties or associations, and the former
would like to change the names in the product. A product should provide
features for such syndication, and they are tested as part of business
vertical testing.
Business vertical testing can be done in two ways: simulation and
replication. In simulation of a vertical test, the customer or the tester assumes
requirements and the business flow is tested. In replication, customer data and
processes are obtained, the product is completely customized and tested, and the
customized product, as it was tested, is released to the customer.
As discussed in the chapters on integration testing, business verticals are
tested through scenarios. Scenario testing is only a method to evolve scenarios
and ideas, and is not meant to be exhaustive. It is done more from the
perspective of interfaces and their interaction. Having some business vertical
scenarios created by integration testing ensures quick progress in system
testing, which is done with a perspective of end-to-end scenarios. In the system
testing phase, the business verticals are completely tested in a real-life
customer environment using aspects such as customization, terminology, and
syndication described in the above paragraphs.

Check your progress 2

What do you mean by Integration test cases?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------

12.4 DEPLOYMENT TESTING

System testing is the final phase before product delivery. By this time, the
prospective customers and their configurations would be known and, in some
cases, the products would have been committed for sale. Hence system testing is
the right time to test the product for those customers who are waiting for it. The
short-term success or failure of a particular product release is mainly assessed
on the basis of how well these customer requirements are met. This type of
deployment (simulated) testing that happens in a product development company
to ensure that customer deployment requirements are met is called offsite
deployment.
Deployment testing is also conducted after the release of the product by
utilizing the resources and setup available in customer’s locations. This is a
combined effort by the product development organization and the organization
trying to use the product. This is called onsite deployment. Even though onsite
deployment is not conducted in the system testing phase, it is explained here to

set the context. It is normally the system testing team that is involved in
completing the onsite deployment test. Onsite deployment testing is considered
to be a part of acceptance testing and is an extension of offsite deployment
testing
Onsite deployment testing is done at two stages. In the first stage (Stage
1), actual data from the live system is taken and similar machines and
configurations are mirrored, and the operations from the users are rerun on the
mirrored deployment machine. This gives an idea whether the enhanced or
similar product can perform the existing functionality without affecting the user.
This also reduces the risk of a product not being able to satisfy existing
functionality, as deploying the product without adequate testing can cause
major business loss to an organization. Some deployments use intelligent
recorders to record the transactions that happen on a live system and commit
these operations on a mirrored system and then compare the results against
the live system.
The objective of the recorder is to help in keeping the mirrored and live
system identical with respect to business transactions. In the second stage
(Stage 2), after a successful first stage, the mirrored system is made a live
system that runs the new product. Regular backups are taken and alternative
methods are used to record the incremental transactions from the time
the mirrored system became live. The recorder that was used in the first stage can
also be used here. However, a different method to record the incremental
transactions is advised, since failures can sometimes happen due to the recorder itself.
This stage helps to avoid any major failures since some of the failures can be
noticed only after an extended period of time. In this stage, the live system that
was used earlier, and the transactions recorded from the time the mirrored system
became live, are preserved to enable going back to the old system if any major
failures are observed at this stage. If no failures are observed in this (second)
stage of deployment for an extended period (for example, one month), then the
onsite deployment is considered successful and the old live system is replaced
by the new system. Stages 1 and 2 of deployment testing are represented in
Figure 12.1.

Figure 12.1 A Live System


In Stage 1 of Figure 12.1, it can be seen that the recorder intercepts the
user and live system to record all transactions. All the recorded transactions

from the live system are then played back on the product under test under the
supervision of the test engineer (shown by dotted lines). In Stage 2, the test
engineer records all transactions using a recorder and other methods and plays
back on the old live system (shown again by dotted lines).
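The Stage 1 recorder can be sketched as record/replay/compare: transactions
captured on the live system are replayed on the mirrored system and the two sets
of results are diffed. The transaction shape and the mirrored-system function
below are hypothetical placeholders, not part of any specific deployment tool.

def replay_and_compare(recorded, live_results, mirrored_system):
    """
    Replay transactions recorded on the live system against the mirrored
    system and report any transaction whose result differs.
    """
    mismatches = []
    for txn, expected in zip(recorded, live_results):
        actual = mirrored_system(txn)  # apply the transaction to the product under test
        if actual != expected:
            mismatches.append((txn, expected, actual))
    return mismatches

# Illustrative use: both systems double the input value, so results agree.
recorded = [{"op": "double", "value": v} for v in (1, 2, 3)]
live_results = [2, 4, 6]
mirrored = lambda txn: txn["value"] * 2
print(replay_and_compare(recorded, live_results, mirrored))  # [] means the systems agree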

12.5 BETA TESTING

Developing a product involves a significant amount of effort and time.
Delays in product releases and the product not meeting the customer
requirements are common. A product rejected by the customer after delivery
means a huge loss to the organization. There are many reasons for a product
not meeting the customer requirements. They are as follows.
1. There are implicit and explicit requirements for the product. A product
not meeting the implicit requirements (for example, ease of use) may
mean rejection by the customer.
2. Since product development involves a good amount of time, some of the
requirements given at the beginning of the project would have become
obsolete or would have changed by the time the product is delivered.
Customers' business requirements keep changing constantly and a
failure to reflect these changes in the product makes the latter obsolete.
3. The requirements are high-level statements with a high degree of
ambiguity. Picking up the ambiguous areas and not resolving them with
the customer results in rejection of the product.
4. The understanding of the requirements may be correct but their
implementation could be wrong. This may mean reworking the design
and coding to suit the implementation aspects the customer wants. If
this is not done in time, it may result in rejection of the product.
5. Lack of usability and documentation makes it difficult for the customer
to use the product and may also result in rejection.
The list above is only a sub-set of the reasons and there could be many
more reasons for rejection. To reduce the risk, which is the objective of system
testing, periodic feedback is obtained on the product. One of the mechanisms
used is sending the product that is under test to the customers and receiving
the feedback. This is called beta testing. This testing is performed by the
customer and helped by the product development organization. During the
entire duration of beta testing, there are various activities that are planned and
executed according to a specific schedule. This is called a beta program. Some
of the activities involved in the beta program are as follows.
1. Collecting the list of customers and their beta testing requirements along
with their expectations on the product.
2. Working out a beta program schedule and informing the customers. Not
all the customers in the list need to agree to the start date and end date
of the beta program. The end date of a beta program should be
reasonably before the product release date so that the beta testing
defects can be fixed before the release.

3. Sending some documents for reading in advance and training the
customer on product usage.
4. Testing the product to ensure it meets "beta testing entry criteria." The
customers and the product development/management groups of the
vendor together prepare sets of entry/exit criteria for beta testing.
5. Sending the beta product (with known quality) to the customer and
enabling them to carry out their own testing.
6. Collecting the feedback periodically from the customers and prioritizing
the defects for fixing.
7. Responding to customers' feedback with product fixes or documentation
changes and closing the communication loop with the customers in a
timely fashion.
8. Analyzing and concluding whether the beta program met the exit criteria.
9. Communicating the progress and action items to customers and formally
closing the beta program.
10. Incorporating the appropriate changes in the product.
Deciding on the entry criteria of a product for beta testing and deciding
the timing of a beta test poses several conflicting choices to be made. Sending
the product too early, with inadequate internal testing, will make the customers
unhappy and may create a bad impression of the quality of the product. Sending
the product too late may mean too little time for beta defect fixes, and this
defeats the purpose of beta testing. The late integration testing phase and early
system testing phase is the ideal time for starting a beta program.
It is quite possible that customers discontinue the beta program after
starting it or remain passive, without adequately using the product and giving
feedback. From the customers' perspective, it is possible that beta testing is
normally just one of their activities and it may not be high on their priority list.
Constant communication with the customers is necessary to motivate them to
use the product and help them whenever they are facing problems with the
product. Defects reported in beta programs are also given the same priority and
urgency as that of normal support calls, with the only difference being that the
product development/engineering department is likely to have a more direct
interaction with the beta customers. Failure in meeting beta testing objectives
or in giving timely fixes may mean some customers rejecting the product.
One other challenge in beta programs is the choice of the number of beta
customers. If the numbers chosen are too few, then the product may not get a
sufficient diversity of test scenarios and test cases. If too many beta customers
are chosen, then the engineering organization may not be able to cope with
fixing the reported defects in time. Thus the number of beta customers should
be a delicate balance between providing a diversity of product usage scenarios
and the manageability of being able to handle their reported defects effectively.
Finally, the success of a beta program depends heavily on the willingness
of the beta customers to exercise the product in various ways, knowing fully
well that there may be defects. This is not an easy task. As mentioned earlier,
the beta customers must be motivated to see the benefits they can get. Only
customers who can be thus motivated and are willing to play the role of trusted
partners in the evolution of the product should participate in the beta program.

Check your progress 3

Define Deployment testing.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

12.6 CERTIFICATION, STANDARDS AND TESTING FOR COMPLIANCE

A product needs to be certified with the popular hardware, operating
system, database, and other infrastructure pieces. This is called certification
testing. A product not working with any of the popular hardware, software, or
equipment may be unsuitable for current and future use. The sales of a
product depend on whether it was certified with the popular systems or not.
The product has to work with the popular systems, as the customer would have
already invested heavily in those. Not only should the product co-exist and run
with the current versions of these popular systems, but the product
organization should also document a commitment (in the form of a roadmap) to
continue to work with the future versions of the popular systems. This is one
type of testing where there is equal interest from the product development
organization, the customer, and certification agencies to certify the product.
However, the onus of making sure the certification happens rests with the
product development organization. The certification agencies produce
automated test suites to help the product development organization. The
product development organization runs those certification test suites and
corrects the problems in the product to ensure that tests are successful. Once
the tests are successfully run, the results are sent to the certification agencies
and they give the certification for the product. The test suite may be rerun by
the certification agencies to verify the results, in which case the product under
test should be sent along with the test results.
There are many standards for each technology area and the product may
need to conform to those standards. This is very important as adhering to these
standards makes the product interact easily with other products. This also
helps the customer not to worry too much about the product's future
compatibility with other products. As explained, there are many standards for
each technology area and the product development companies select the
standards to be implemented at the beginning of the product cycle. Following
all the standards may not be possible, and sometimes there may be some non-
functional issues (for example, performance impact) because of which certain
standards may not get implemented. Standards can also be evolving (for
example, IPv6 in networking and 3G in mobile technologies), and finer details
are worked out as and when some implementations kick off. Some of the
standards are evolved by the open community and published as public domain
standards (for example, Open LDAP standard). Tools associated with those open
standards can be used free of cost to verify the standard's implementation.
Testing the product to ensure that these standards are properly implemented is

called testing for standards. Once the product is tested for a set of standards,
they are published in the release documentation for the information of the
customers so that they know what standards are implemented in the product.
There are many contractual and legal requirements for a product. Failing
to meet these may result in business loss and bring legal action against the
organization and its senior management. Some of these requirements could be
contractual obligations and some statutory requirements. Failing to meet these
could severely restrict the market for the product. For example, it may not be
possible to bid for US government organizations if usability guidelines (508
Accessibility Guidelines) are not met. Testing the product for contractual, legal,
and statutory compliance is one of the critical activities of the system testing
team. The following are some examples of compliance testing.
• Compliance to FDA This act by the Food and Drug Administration
requires that adequate testing be done for products such as cosmetics,
drugs, and medical devices. This also requires that all the test reports
along with complete documentation of test cases, execution information
for each test cycle along with supervisory approvals be preserved for
checking adequacy of tests by the FDA.
• 508 accessibility guidelines This accessibility set of guidelines requires
the product to meet some requirements for its physically challenged
users. These guidelines insist that the product should be as accessible to
physically challenged people as it is to people without those disabilities.
• SOX (Sarbanes-Oxley Act) This act requires that products and services
be audited to prevent financial fraud in the organization. The software is
required to go through all transactions and list out the suspected faulty
transactions for analysis. The testing for this act helps the top executives
by keeping them aware of financial transactions and their validity.
• OFAC and Patriot Act These require that the transactions of banking
applications be audited for misuse of funds for terrorism.
The terms certification, standards and compliance testing are used
interchangeably. There is nothing wrong in the usage of these terms as long as the
objective of the testing is met. For example, a certifying agency helping an
organization to meet standards can be called both certification testing and
standards testing (for example, Open LDAP is both a certification and a
standard).

12.7 LET US SUM UP

This is an elaborate lesson where we discussed functional system testing
methods like design/architecture verification and business vertical testing.
There are multiple ways system functional testing is performed. There are also
many ways product level test cases are derived for functional testing. Some of
the common techniques are Design/architecture verification, Business vertical
testing, Deployment testing, Beta testing and Certification, standards, and
testing for compliance.
Some of the standards are evolved by the open community and published as
public domain standards (for example, Open LDAP standard).

Check Your Progress: Model Answers

1. Duplication refers to the same tests being performed multiple times and
gray area refers to certain tests being missed out in all the phases. A
small percentage of duplication across phases is unavoidable as different
teams are involved
2. The integration test cases focus on interactions between modules or
components whereas the functional system test focuses on the behavior
of the complete product. A side benefit of this exercise is ensuring the
completeness of the product implementation.
3. Deployment testing is also conducted after the release of the product by
utilizing the resources and setup available in customers' locations.

LESSON 13

NON-FUNCTIONAL TESTING
Contents
13.0 Aims and Objectives
13.1 Non-Functional Testing
13.2 Setting up the Configuration
13.3 Coming up with Entry/Exit Criteria
13.4 Balancing key Resources
13.5 Scalability Testing
13.6 Reliability Testing
13.7 Stress Testing
13.8 Interoperability Testing
13.9 Functional Vs Non-functional Testing
13.10 Let Us Sum Up

13.0 AIMS AND OBJECTIVES

In this lesson, we will converse on the non-functional category of system
testing. It includes entry/exit criteria, scalability testing, reliability
testing, stress testing, and interoperability testing.
We expect the reader to be able to generate sample test cases for all the
possible types of non-functional testing.

13.1 NON-FUNCTIONAL TESTING

The process followed by non-functional testing is similar to that of
functional testing but differs in the aspects of complexity, knowledge
requirement, effort needed, and the number of times the test cases are repeated.
Since repeating non-functional test cases involves more time, effort, and
resources, the process for non-functional testing has to be more robust
than that for functional testing, to minimize the need for repetition. This is
achieved by having more stringent entry/exit criteria, better planning, and by
setting up the configuration with data population in advance for test execution.
Non-Functional Testing covers:
Load and Performance Testing
Ergonomics Testing
Stress & Volume Testing
Compatibility & Migration Testing
Data Conversion Testing
Security / Penetration Testing

Operational Readiness Testing
Installation Testing
Security Testing (Application Security, Network, System Security)
We are going to discuss only few of the above types in this lesson.

Check your progress 1

List down various non-functional testing methods.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

13.2 SETTING UP THE CONFIGURATION

The biggest challenge, setting up the configuration, is common to all
types of non-functional testing. There are two ways the setup is done: a
simulated environment and a real-life customer environment. Due to the varied
types of customers, resource availability, the time involved in getting the exact
setup, and so on, setting up a scenario that is exactly real-life is difficult.
Even though using a real-life customer environment is a crucial factor for the
success of this testing, due to the several complexities involved, a simulated
setup is used for non-functional testing where the actual configuration is
difficult to get. Setting up a configuration is a challenge for the following
reasons.
1. Given the high diversity of environments and variety of customers it is
very difficult to predict the type of environment that will be used
commonly by the customers.
2. Testing a product with different permutations and combinations of
configurations may not prove effective, since the same combination of
environment may not be used by the customer and testing for several
combinations involves effort and time. Furthermore, because of the
diversity of configurations, there is a combinatorial explosion in the
number of configurations to be tested.
3. The cost involved in setting up such environments is quite high.
4. Some of the components of the environment could be from competing
companies' products, and it may not be easy to get these.
5. The people involved may not have the skills to set up the environment.
6. It is difficult to predict the exact type and nature of data that customer
may use. Since confidentiality is involved in the data used by the
customer, such information is not passed on to the testing team.

In order to create a "near real-life" environment, the details regarding the
customer's hardware setup, deployment information, and test data are collected
in advance. Test data is built based on the sample data given by the customer;
if it is a new product, then information regarding similar or related products is
collected. These inputs help in setting up the test environment close to the
customer's so that the various quality characteristics of the system can be
verified more accurately.

13.3 COMING UP WITH ENTRY/EXIT CRITERIA

Coming up with entry and exit criteria is another critical factor in non-
functional testing. Sample entry/exit criteria for the various test phases are
given in Lesson 15, and similar criteria can be developed for a set of parameters
and for the various types of non-functional tests. Meeting the entry criteria is
the responsibility of the previous test phase (that is, the integration testing
phase), or it could be the objective of dry-run tests performed by the system
testing team before accepting the product for system testing.
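Where teams automate such gates, the criteria can be encoded as simple
checks over collected test metrics. The following is a minimal sketch in Python;
the metric names and threshold values (98% pass ratio, no outstanding critical
defects, a 2-second response-time limit) are illustrative assumptions, not
prescribed values.

# A minimal sketch of encoding exit criteria as automated checks.
# The threshold values and metric names below are illustrative assumptions.

def meets_exit_criteria(metrics):
    """Return (ok, reasons) for a hypothetical exit gate."""
    reasons = []
    if metrics["pass_ratio"] < 0.98:                  # e.g. 98% of test cases must pass
        reasons.append("pass ratio below 98%")
    if metrics["outstanding_critical_defects"] > 0:   # no critical defects may remain open
        reasons.append("critical defects still open")
    if metrics["avg_response_time_s"] > 2.0:          # sample performance threshold
        reasons.append("response time above 2 seconds")
    return (len(reasons) == 0, reasons)

ok, reasons = meets_exit_criteria({
    "pass_ratio": 0.99,
    "outstanding_critical_defects": 0,
    "avg_response_time_s": 1.4,
})
print("Exit criteria met" if ok else "Blocked: " + ", ".join(reasons))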

13.4 BALANCING KEY RESOURCES

This section intends to discuss the concepts of non-functional testing
with respect to four key resources: CPU, disk, memory, and network. The four
resources are related to each other and we need to completely understand their
relationship to implement the strategy for non-functional testing.
These four resources in a computer require equal attention as they need
to be judiciously balanced to enhance the quality factors of the product. All
these resources are interdependent. For example, if the memory requirements
in the system are addressed, the need for CPU may become more intensive. This
in turn may result in multiple cycles of upgrade as the requirements of the
customers keep increasing. The demand for all these resources tends to grow
when a new release of the product is produced as software becomes more and
more complex. Software is meant not only for computers but also for equipment
such as cell phones; hence upgrading the resources is not easy anymore.
The following are some examples of basic assumptions that can be made
about resources and configuration.
1. The CPU can be fully utilized as long as it can be freed when a high
priority job comes in.
2. The available memory can be completely used by the product as long as
the memory is relinquished when another job requires memory.
3. Adding CPU or memory is not as expensive as it was earlier. Hence
resources can be added easily to get better performance as long as
we can quantify and justify the benefits for each added resource.
4. The product can generate many network packets as long as the network
bandwidth and latency are available and do not cost much. One caveat
in this assumption is that most of the packets generated are for the
LAN and not for the WAN. In the case of a WAN or routes involving multiple
hops, the packets generated by the product need to be reduced.
5. More disk space or the complete I/O bandwidth can be used for the
product as long as they are available. While disk costs are getting
cheaper, I/O bandwidth is not.
6. The customer gets the maximum return on investment (ROI) only if the
resources such as CPU, disk, memory, and network are optimally used.
So there is intelligence needed in the software to understand the server
configuration and its usage.
7. Graceful degradation in non-functional aspects can be expected when
resources in the machine are also utilized for different activities in the
server.
8. Predictable variations in performance or scalability are acceptable for
different configurations of the same product.
9. Variation in performance and scalability is acceptable when some
parameters are tuned, as long as we know the impact of adjusting each
of those tunable parameters.
10. The product can behave differently for non-functional factors for different
configurations such as low-end and high-end servers as long as they
support return on investment. This in fact motivates the customers to
upgrade their resources.
Once such sample assumptions are validated by the development team
and the customers, non-functional testing is conducted.
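To observe how these four resources behave while non-functional tests run, a
simple monitor can sample them periodically. The sketch below assumes the
third-party psutil library is available (pip install psutil); the duration and
sampling interval are arbitrary choices for illustration.

# A sketch of sampling the four key resources during a non-functional test run,
# assuming the third-party psutil library is installed.
import psutil

def sample_resources(duration_s=10, interval_s=1):
    """Print CPU, memory, disk, and network usage once per interval."""
    last_net = psutil.net_io_counters()
    for _ in range(int(duration_s / interval_s)):
        cpu = psutil.cpu_percent(interval=interval_s)  # % CPU over the interval
        mem = psutil.virtual_memory().percent          # % physical memory in use
        disk = psutil.disk_usage("/").percent          # % disk space used
        net = psutil.net_io_counters()
        sent = net.bytes_sent - last_net.bytes_sent    # bytes sent this interval
        last_net = net
        print(f"cpu={cpu}% mem={mem}% disk={disk}% net_sent={sent}B")

sample_resources()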

13.5 SCALABILITY TESTING

The objective of scalability testing is to find out the maximum capability
of the product parameters. As the exercise involves finding the maximum, the
resources that are needed for this kind of testing are normally very high. For
example, one of the scalability test cases could be finding out how many client
machines can simultaneously log in to the server to perform some operations.
In Internet space, some of the services can get up to a million accesses to the
server. Hence, trying to simulate that kind of real-life scalability parameter is
very difficult but at the same time very important.
At the beginning of the scalability exercise, there may not be an obvious
clue about the maximum capability of the system. Hence a high-end
configuration is selected and the scalability parameter is increased step by step
to reach the maximum capability.
The design and architecture give the theoretical values, and the
requirements from the customers mention the maximum capability that is
expected. The scalability exercise first verifies the lower of these two numbers.
When the requirements from the customer are more than what the
design/architecture can provide, scalability testing is suspended, the design is
reworked, and scalability testing is resumed to check the scalability
parameters. Hence, the requirements, design, and architecture together provide
inputs to the scalability testing on what parameter values are to be tested.
Contrary to other types of testing, scalability testing does not end when
the requirements are met. The testing continues till the maximum capability of
a scalable parameter is found out for a particular configuration. Having a highly
scalable system that considers the future requirements of the customer helps a
product to have a long lifetime. Otherwise, each time there are new
requirements, a major redesign and overhaul takes place in the product and
some stable features may stop working because of those changes, thus creating
quality concerns. The cost and effort involved in such product developments are
very high.
Failures during scalability test include the system not responding, or the
system crashing, and so on. But whether the failure is acceptable or not has to
be decided on the basis of business goals and objectives. For example, a
product that is not able to respond to 100 concurrent users, while its objective
is to serve at least 200 users simultaneously, is considered a failure. When a
product expected to withstand only 100 users fails when the load is increased
to 200, then it is a passed test case and an acceptable situation.
Scalability tests help in identifying the major bottlenecks in a product.
When resources are found to be the bottlenecks, they are increased after
validating the assumptions mentioned earlier. If the bottlenecks are in the
product, they are fixed. However, sometimes the underlying infrastructure,
such as the operating system or technology, can also become the bottleneck. In
such cases, the product organization is expected to work with the OS and
technology vendors to resolve the issues.
Scalability tests are performed on different configurations to check the
product's behavior. For each configuration, data are collected and analyzed. On
completion of the tests, the data collected are analyzed and appropriate actions
are taken. For example, if CPU utilization approaches 100%, then another
server is set up to share the load or another CPU is added to the server. If the
results are successful, the tests are repeated for 200 users and more to find the
maximum limit for that configuration.
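As a rough illustration, the step-by-step increase of a scalability parameter can
be scripted as below. This is only a minimal sketch: login() is a hypothetical
stand-in for the real client operation, and the step sizes and 2-second
response-time target are assumptions for illustration.

# A minimal sketch of a scalability test that steps up concurrent "clients"
# until the system under test stops meeting its target.
import time
from concurrent.futures import ThreadPoolExecutor

def login(client_id):
    """Placeholder for one client logging in to the server under test."""
    start = time.time()
    time.sleep(0.01)              # stand-in for the real network call
    return time.time() - start

def max_supported_clients(start=100, step=100, limit=2000, target_s=2.0):
    for n in range(start, limit + 1, step):
        with ThreadPoolExecutor(max_workers=n) as pool:
            times = list(pool.map(login, range(n)))
        worst = max(times)
        print(f"{n} clients: worst response {worst:.3f}s")
        if worst > target_s:      # capability exceeded at this load level
            return n - step
    return limit

print("Maximum supported clients:", max_supported_clients())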

Check your progress 2

When Scalability tests are performed?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

13.6 RELIABILITY TESTING

As defined earlier, reliability testing is done to evaluate the product's
ability to perform its required functions under stated conditions for a specified
period of time or for a large number of iterations. Examples of reliability tests
include querying a database continuously for 48 hours and performing login
operations 10,000 times.
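A reliability test of this kind is essentially a long loop around an operation,
with every failure recorded for later analysis. The sketch below is a minimal
illustration; login_and_logout() is a hypothetical placeholder for the real
operation against the product under test.

# A minimal reliability-test sketch: perform an operation 10,000 times and
# record every failure for later analysis.
def login_and_logout(i):
    """Placeholder for one login/logout cycle; raises on failure."""
    pass

failures = []
ITERATIONS = 10_000
for i in range(ITERATIONS):
    try:
        login_and_logout(i)
    except Exception as exc:          # any error counts as a reliability defect
        failures.append((i, exc))

print(f"{ITERATIONS - len(failures)}/{ITERATIONS} iterations succeeded")
for iteration, exc in failures[:10]:  # show the first few failures
    print(f"iteration {iteration} failed: {exc}")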
The reliability of a product should not be confused with reliability testing.
Producing a reliable product requires sound techniques, good discipline, robust
processes, and strong management; and involves a whole gamut of activities for
every role or function in a product organization. The reliability of a product
deals with the different ways a quality product is produced, with very few
defects by focusing on all the phases of product development and the processes.
Reliability here is an all-encompassing term used to mean all the quality factors
and functionality aspects of the product. This perspective is related more to the
overall way the product is developed and has less direct relevance to testing.
This product reliability is achieved by focusing on the following activities.
1. Defined engineering processes Software reliability can be achieved by
following clearly defined processes. The team is mandated to understand
requirements for reliability right from the beginning and focuses on
creating a reliable design upfront. All the activities (such as design,
coding, testing and documentation) are planned, taking into
consideration the reliability requirements of the software.
2. Review of work products at each stage At the end of each stage of the
product development life cycle, the work products produced are
reviewed. This ensures early detection of errors and their fixing as soon
as they are introduced.
3. Change management procedures Many errors percolate to the product
due to improper impact analysis of changes made to the product.
Changes received late during the product development life cycle can
prove harmful. There may not be adequate time for regression testing
and hence the product is likely to have errors due to changes. Hence,
having a clearly defined change management procedure is necessary to
deliver reliable software.
4. Review of testing coverage Allocating time for the different phases and
types of testing can help in catching errors as and when the product is
being developed, rather than after the product is developed. All the
testing activities are reviewed for adequacy of time allotted, test cases,
and effort spent for each type of testing.
5. Ongoing monitoring of the product Once the product has been
delivered, it is analyzed proactively for any possibly missed errors. In this
case the process as well as the product is fixed for missed defects. This
prevents the same type of defects from reappearing.

13.7 STRESS TESTING

Stress testing is done to evaluate a system beyond the limits of its
specified requirements or resources, to ensure that the system does not break.
Stress testing is done to find out if the product's behavior degrades under
extreme conditions and when it is denied the necessary resources. The product
is deliberately overloaded to simulate a resource crunch and to find out its
behavior.
Stress testing helps in understanding how the system can behave under
extreme (insufficient memory, inadequate hardware) and realistic situations.
System resources upon being exhausted may cause such situations. This helps
to know the conditions under which these tests fail so that the maximum limits,
in terms of simultaneous users, search criteria, large number of transactions,
and so on can be known.
The following guidelines can be used to select the tests for stress testing.
1. Repetitive tests Executing repeated tests ensures that at all times the
code works as expected. There are some operations that are repeatedly
executed by the customer. A right mix of these operations and
transactions need to be considered for stress testing.
2. Concurrency Concurrent tests ensure that the code is exercised in
multiple paths and simultaneously. The operations that are used by
multiple users are selected and performed concurrently for stress testing.
3. Magnitude This refers to the amount of load to be applied to the product
to stress the system. It can be a single operation being executed for a
large volume of users or a mix of operations distributed over different
users. The operations that generate the amount of load needed are
planned and executed for stress testing.
4. Random variation As explained earlier, stress testing depends on
increasing/decreasing variable load. Tests that stress the system with
random inputs (in terms of number of users, size of data), at random
instances and random magnitude are selected and executed as part of
stress testing.
Defects that emerge from stress testing are usually not found from any
other testing. Defects like memory leaks are easy to detect but difficult to
analyze due to varying load and different types/mix of tests executed. Hence,
stress tests are normally performed after reliability testing. This testing helps in
finding out concurrency and synchronization issues like deadlocks, thread
leaks, and other synchronization problems.
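The four guidelines above can be combined into a simple load driver: a
repetitive loop that runs a chosen operation concurrently, with randomly
varying magnitude. This is only a sketch; operation() is a hypothetical stand-in
for a real business transaction, and the duration and user counts are
arbitrary.

# A sketch combining repetition, concurrency, magnitude, and random
# variation into one stress-test loop.
import random
import threading
import time

def operation(user_id):
    """Placeholder for one business transaction under stress."""
    time.sleep(random.uniform(0.001, 0.01))

def stress(duration_s=30, max_users=200):
    deadline = time.time() + duration_s
    while time.time() < deadline:                 # repetition until the deadline
        users = random.randint(1, max_users)      # random variation in magnitude
        threads = [threading.Thread(target=operation, args=(u,))
                   for u in range(users)]
        for t in threads:                         # concurrency across users
            t.start()
        for t in threads:
            t.join()

stress()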

Check your progress 3

Discuss the guidelines for stress testing.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

13.8 INTEROPERABILITY TESTING

Interoperability testing is done to ensure that two or more products can
exchange information, use the exchanged information, and work properly
together.
Systems can be interoperable unidirectionally (the exchange of information
is one way) or bi-directionally (the exchange of information is both ways). For
example, the text available in a text editor can be exported into a Microsoft
Word application using the "Insert->File" option. But a picture available in
Microsoft Word cannot be exported into a text editor. Two-way interoperability
is represented by the exchange of information between an email client (Microsoft
Outlook) and Microsoft Word, where information can be cut and pasted in both
directions.
The terms "interoperability" and "integration" are used interchange, but
this is incorrect. Integration is a method and interoperability is the result.
Integration pertains to only one product and defines interface two or more
components. Unless two or more products are designed exchanging
information, interoperability cannot be achieved. As explained in the chapter on
integration testing, there is only a thin line of difference between various types
of testing.
Interoperability attains more importance in the context of the Internet, which
is characterized by a seamless co-existence and inter-operation of multiple
computers and multiple pieces of software. Hence, it is essential for more and
more products to be interoperable so that they can communicate with almost
all the operating systems, browsers, development tools, compilers, applications,
and so on.
There are no real standard methodologies developed for interoperability
testing. There are different variants of interoperability testing: one system with
another, one to many, and multi-dimensional interoperability testing. Following
technical standards like SOAP (Simple Object Access Protocol) and eXtensible
Markup Language (XML), and some more from the W3C (World Wide Web
Consortium), typically aids in the development of products using common
standards and methods. But standards conformance is not enough for
interoperability testing, as standards alone do not ensure consistent
information exchange and workflow. Some popular products may not have
implemented all the standards. But the product under test needs to
interoperate with those popular products due to pressing business needs.
The following are some guidelines that help in improving interoperability.
1. Consistency of information flow across systems When an input is
provided to the product, it should be understood consistently by all
systems. This would enable a smooth, correct response to be sent back
to the user. For example, when data structures are used to pass
information across systems, the structure and interpretation of these
data structures should be consistent across the system.
2. Changes to data representation as per the system requirements
When two different systems are integrated to provide a response to the
user, data sent from the first system in a particular format must be
modified or adjusted to suit the next system's requirement. This would
help the request to be understood by the current system. Only then can
an appropriate response be sent to the user. For example, when a little-
endian machine passes data to a big-endian machine, the byte ordering
would have to be changed (see the sketch after this list).
3. Correlated interchange of messages and receiving appropriate
responses When one system sends an input in the form of a message,
the next system is in the waiting mode or listening mode to receive the
input. When multiple machines are involved in information exchange,
there could be clashes, wrong response, deadlocks, or delays in
communication. These aspects should be considered in
architecting/designing the product, rather than leave it to be found as a
surprise during the later phases.
4. Communication and messages When a message is passed on from
system A to system B and the message is lost or gets garbled, the
product should be tested to check how it responds to such erroneous
messages. The product must not crash or hang. It should give useful
error messages to the user, requesting him to wait for some time until it
recovers the connection. As multiple products are involved, a generic
error message such as "Error from remote machine" will be misleading
and not value adding. The user need not know where the message is
coming from but needs to understand the cause of the message and the
necessary corrective action.
5. Meeting quality factors When two or more products are put together,
there is an additional requirement of information exchange between
them. This requirement should not take away the quality of the products
that would have been already met individually by the products.
Interoperability testing needs to verify this perspective.
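To make the byte-ordering example in guideline 2 concrete, the sketch below
shows how the same 32-bit integer is laid out by little-endian and big-endian
senders, using Python's struct module; a receiver must unpack with the
sender's byte order, not its own.

# Byte ordering across interoperating systems: the same 32-bit integer
# serialized in little-endian and big-endian form.
import struct

value = 0x12345678
little = struct.pack("<I", value)   # bytes as a little-endian sender emits them
big = struct.pack(">I", value)      # bytes as a big-endian sender emits them
print(little.hex())                 # 78563412
print(big.hex())                    # 12345678

# The receiver must unpack with the sender's byte order, not its own:
(received,) = struct.unpack("<I", little)
assert received == value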
The responsibility for interoperability lies more on the architecture,
design, and standards of various products involved in the domain. Hence,
testing for interoperability yields better results only if the requirements are met
by development activities such as architecture, design, and coding.
Interoperability testing should be restricted to qualify the information exchange
rather than finding defects and fixing them one after another.
Interoperability among products is a collective responsibility and the effort of
many product organizations. All product organizations are expected to work
together to meet the purpose of interoperability. There are standards
organizations that focus on interoperability standards which help the product
organizations to minimize the effort involved in collaborations. They also assist
in defining, implementing, and certifying the standards implementation for
interoperability.

One of the fundamental objectives of a project is to collect both the functional
and non-functional requirements. These need to be kept in balance and
harmony, and most importantly not compromised as the project progresses.

13.9 FUNCTIONAL VS NON-FUNCTIONAL TESTING

Functional Requirements
The official definition for a functional requirement specifies what the system
should do:
"A requirement specifies a function that a system or component must be able to
perform."
Functional requirements specify specific behavior or functions, for example:
"Display the heart rate, blood pressure and temperature of a patient connected to
the patient monitor."
Typical functional requirements are:
• Business Rules
• Transaction corrections, adjustments, cancellations
• Administrative functions
• Authentication
• Authorization - functions the user is delegated to perform
• Audit Tracking
• External Interfaces
• Certification Requirements
• Reporting Requirements
• Historical Data
• Legal or Regulatory Requirements
Non-Functional Requirements
The official definition for a non-functional requirement specifies how the system
should behave:
"A non-functional requirement is a statement of how a system must behave, it is a
constraint upon the systems behavior."
Non-functional requirements specify all the remaining requirements not covered
by the functional requirements. They specify criteria that judge the operation of
a system, rather than specific behaviors, for example:
"Display of the patient's vital signs must respond to a change in the patient's
status within 2 seconds."
Typical non-functional requirements are:
Performance - Response Time, Throughput, Utilization, Static Volumetric
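The distinction shows up directly in how the two kinds of requirements are
tested. The following is a minimal sketch using the patient-monitor example
above; read_vitals() and its sample values are hypothetical, and the test
functions are plain Python assertions in the pytest style.

# A sketch contrasting a functional and a non-functional test for the
# patient-monitor example. read_vitals() is a hypothetical placeholder.
import time

def read_vitals():
    """Placeholder for querying the patient monitor under test."""
    return {"heart_rate": 72, "blood_pressure": "120/80", "temperature": 36.8}

def test_displays_vital_signs():               # functional: WHAT the system does
    vitals = read_vitals()
    assert {"heart_rate", "blood_pressure", "temperature"} <= vitals.keys()

def test_vitals_update_within_two_seconds():   # non-functional: HOW it behaves
    start = time.time()
    read_vitals()
    assert time.time() - start < 2.0

test_displays_vital_signs()
test_vitals_update_within_two_seconds()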

13.10 LET US SUM UP

Check Your Progress: Model Answers


1. Non-Functional Testing covers:
Load and Performance Testing, Ergonomics Testing, Stress & Volume
Testing, Compatibility & Migration Testing, Data Conversion Testing,
Security / Penetration Testing, Operational Readiness Testing, Installation
Testing, Security Testing (Application Security, Network, System Security)
2. Scalability tests are performed on different configurations to check the
product's behavior. For each configuration, data are collected and analyzed,
and appropriate actions are taken on completion of the tests.
3. Repetitive tests, concurrency, magnitude, and random variation are the
guidelines used to select the tests for stress testing.

LESSON 14

ACCEPTANCE TESTING
Contents
14.0 Aims and Objectives
14.1 Acceptance Testing
14.2 Acceptance Criteria
14.2.1 Acceptance criteria-Product acceptance
14.2.2 Acceptance criteria-Procedure acceptance
14.2.3 Acceptance criteria-Service level agreements
14.3 Selecting Test Cases for Acceptance Testing
14.4 Executing Acceptance Tests
14.5 Let Us Sum Up

14.0 AIMS AND OBJECTIVES

In the previous lesson, we discussed the functional and non-functional
categories and the testing methods belonging to them. Here in this lesson, we
are going to discuss acceptance testing, a significant phase of testing done by
customers.
‘Acceptance’ is a significant stage in the contractual process;
commercially, it is likely to operate as a payment milestone; legally it will affect
the application of any warranty provisions and potential remedies which may be
available to the customer. Principles of good commercial practice prescribe that
express provisions for any acceptance testing procedure should be clearly set
out, determining under which circumstances and upon what basis ‘acceptance’
of the software will be deemed to have occurred.
The scope and application of the acceptance testing procedure (the ATP)
will vary considerably, depending upon the type of project and the nature of the
software which is being supplied. For example, the supply of a standard off-
the-shelf software package to meet a straightforward software requirement,
such as the supply of a word processing package, may employ a relatively
simple acceptance test. In this instance, the acceptance test may be passed if
the software has been used in a live environment for a period of thirty days
without rejection. In contrast, a complex bespoke software development project
is likely to demand detailed acceptance testing. A bespoke software
development project is likely to require extensive testing against a series of
specified functions and facilities, each of which is to be achieved within
particular performance requirements. Testing the software against specific
acceptance criteria will enable the customer to determine whether the software
which has been delivered is in conformity with that for which he had
contracted.
At the end of this lesson, you should be able to understand what acceptance
testing is and the product and procedure acceptance criteria related to it.

14.1 ACCEPTANCE TESTING

Acceptance testing is a phase after system testing that is normally done
by the customers or representatives of the customer. The customer defines a set
of test cases that will be executed to qualify and accept the product. These test
cases are executed by the customers themselves to quickly judge the quality of
the product before deciding to buy the product. Acceptance test cases are
normally small in number and are not written with the intention of finding
defects. More detailed testing (which is intended to uncover defects) is expected
to have been completed in the component, integration, and system testing
phases, prior to product delivery to the customer. Sometimes, acceptance test
cases are developed jointly by the customers and product organization. In this
case, the product organization will have complete understanding of what will be
tested by the customer for acceptance testing. In such cases, the product
organization tests those test cases in advance as part of the system test cycle
itself to avoid any later surprises when those test cases are executed by the
customer.
In cases where the acceptance tests are performed by the product
organization alone, acceptance tests are executed to verify if the product meets
the acceptance criteria defined during the requirements definition phase of the
project. Acceptance test cases are black box type of test cases. They are directed
at verifying one or more acceptance criteria.
Acceptance tests are written to execute near real-life scenarios. Apart
from verifying the functional requirements, acceptance tests are run to verify
the non-functional aspects of the system also.
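As a rough illustration, an acceptance test case often reads as a scripted
business transaction run from end to end. In the sketch below, every step
function is a hypothetical placeholder; in a real acceptance test, each step
would drive the delivered product.

# A minimal sketch of an end-to-end acceptance test scenario. All step
# functions are hypothetical placeholders for real product operations.
def log_in(user, password):
    return {"user": user}                         # placeholder session

def add_to_cart(session, item, qty):
    return [{"item": item, "qty": qty}]           # placeholder cart

def checkout(session, cart):
    return {"status": "CONFIRMED", "cart": cart}  # placeholder order

def test_customer_places_an_order():
    """Business transaction tested as a whole, as a customer would run it."""
    session = log_in("customer01", "secret")
    cart = add_to_cart(session, item="A-100", qty=2)
    order = checkout(session, cart)
    assert order["status"] == "CONFIRMED"         # acceptance criterion

test_customer_places_an_order()
print("acceptance scenario passed")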
Acceptance test cases failing in a customer site may cause the product to
be rejected and may mean financial loss or may mean rework of product
involving effort and time.
THE CUSTOMER'S RESPONSIBILITY
The acceptance test is the last opportunity customers have to make sure
that the system is what they asked for. When this final test is complete, the
team expects that the customer will formally approve the system or point out
any problems that still need to be resolved. Therefore, unlike all the other tests
performed so far, acceptance testing is the customers' responsibility. Of course,
unless the customers are very savvy in testing techniques, they will still need
the participation of the IT team.

Check your progress 1

Define Acceptance Testing.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

14.2 ACCEPTANCE CRITERIA

14.2.1 Acceptance criteria-Product acceptance:


During the requirements phase, each requirement is associated with
acceptance criteria. It is possible that one or more requirements may be
mapped to form acceptance criteria (for example, all high priority requirements
should pass 100%). Whenever there are changes to requirements, the
acceptance criteria are accordingly modified and maintained.
Acceptance testing is not meant for executing test cases that have not
been executed before. Hence, the existing test cases are looked at and certain
categories of test cases can be grouped to form acceptance criteria (for example,
all performance test cases should pass meeting the response time
requirements).
Testing for adherence to any specific legal or contractual terms is
included in the acceptance criteria. Testing for compliance to specific laws like
Sarbanes-Oxley can be part of the acceptance criteria.

14.2.2 Acceptance criteria-Procedure acceptance:

Acceptance criteria can be defined based on the procedures followed for
delivery. An example of procedure acceptance could be documentation and
release media. Some examples of acceptance criteria of this nature are as
follows.
1. User, administration and troubleshooting documentation should be
part of the release.
2. Along with the binary code, the source code of the product with build
scripts is to be delivered on a CD.
3. A minimum of 20 employees are trained on the product usage prior to
deployment.
These procedural acceptance criteria are verified/tested as part of
acceptance testing.

14.2.3 Acceptance criteria-service level agreements:

Service level agreements (SLA) can become part of acceptance criteria.


Service level agreements are generally part of a contract signed by the customer
and product organization. The important contract items are taken and verified
as part of acceptance testing. For example, time limits to resolve defects
can be mentioned as part of the SLA, such as:
• All major defects that come up during first three months of deployment
need to be fixed free of cost;
• Downtime of the implemented system should be less than 0.1%;
• All major defects are to be fixed within 48 hours of reporting.

With some criteria as above (except for downtime), it may look as though
there is nothing to be tested or verified. But the idea of acceptance testing here
is to ensure that the resources are available for meeting those SLAs.
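Verifying an SLA item such as the downtime limit above reduces to simple
arithmetic over monitoring data. The sketch below uses illustrative sample
outage figures over a 30-day period.

# A sketch of checking the "downtime less than 0.1%" SLA against
# monitoring data; the outage list is illustrative sample data.
outages_minutes = [12, 5, 3]          # recorded outages in the period
period_days = 30
period_minutes = period_days * 24 * 60

downtime_pct = 100 * sum(outages_minutes) / period_minutes
print(f"downtime = {downtime_pct:.3f}%")
assert downtime_pct < 0.1, "SLA violated: downtime exceeds 0.1%"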

14.3 SELECTING TEST CASES FOR ACCEPTANCE TESTING

As mentioned, the test cases for acceptance testing are selected from the
existing set of test cases from different phases of testing. This section gives
some guidelines on what test cases can be included for acceptance testing.
1. End-to-end functionality verification Test cases that include the end-
to-end functionality of the product are taken up for acceptance testing.
This ensures that all the business transactions are tested as a whole and
those transactions are completed successfully. Real-life test scenarios are
tested when the product is tested end-to-end.
2. Domain tests Since acceptance tests focus on business scenarios, the
product domain tests are included. Test cases that reflect business
domain knowledge are included.
3. User scenario tests Acceptance tests reflect the real-life user scenario
verification. As a result, test cases that portray them are included.
4. Basic sanity tests Tests that verify the basic existing behavior of the
product are included. These tests ensure that the system performs the
basic operations that it was intended to do. Such tests may gain more
attention when a product undergoes changes or modifications. It is
necessary to verify that the existing behavior is retained without any
breaks.
5. New functionality When the product undergoes modifications or
changes, the acceptance test cases focus on verifying the new features.
6. A few non-functional tests Some non-functional tests are included and
executed as part of acceptance testing to double-check that the non-
functional aspects of the product meet the expectations.
7. Tests pertaining to legal obligations and service level agreements
Tests that are written to check if the product complies with certain legal
obligations and SLAs are included in the acceptance test criteria.
8. Acceptance test data Test cases that make use of customer real-life
data are included for acceptance testing.

Check your progress 2

Write about the three acceptance criteria.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

14.4 EXECUTING ACCEPTANCE TESTS

As explained before, sometimes the customers themselves do the
acceptance tests. In such cases, the job of the product organization is to assist
the customers in acceptance testing and resolve the issues that come out of it.
the customers in acceptance testing and resolve the issues that come out of it.
If the acceptance testing is done by the product organization, forming the
acceptance test team becomes an important activity.
An acceptance test team usually comprises members who are involved
in the day-to-day activities of the product usage or are familiar with such
scenarios. The product management, support and consulting team, who have
good knowledge of the customers, contribute to the acceptance testing
definition and execution. They may not be familiar with the testing process or
the technical aspect of the software. But they know whether the product does
what it is intended to do. An acceptance test team may be formed with 90% of
them possessing the required business process knowledge of the product and
10% being representatives of the technical testing team. The number of test
team members needed to perform acceptance testing is usually small, as the
scope and effort involved in acceptance testing is less when compared to other
phases of testing.
As mentioned earlier, acceptance test team members may or may not be
aware of testing or the process. Hence, before acceptance testing, appropriate
training on the product and the process needs to be provided to the team. This
training can be given to customers and other support functions irrespective of
who does the acceptance tests, as the effort involved is the same. The
acceptance test team may get the help of team members who developed/tested
the software to obtain the required product knowledge. There could also be in-
house training material that could serve the purpose.
The role of the testing team members during and prior to acceptance test
is crucial since they may constantly interact with the acceptance team
members. Test team members help the acceptance members to get the required
test data, select and identify test cases, and analyze the acceptance test results.
During test execution, the acceptance test team reports its progress regularly.
The defects reports are generated on a periodic basis.
Defects reported during the acceptance tests could be of different
priorities. Test teams help the acceptance test team report defects. Showstopper
and high-priority defects are necessarily fixed before the software is released. In
case major defects are identified during acceptance testing, then there is a risk
of missing the release date. When the defect fixes point to scope or requirement
changes, then it may either result in the extension of the release date to include
the feature in the current release or get postponed to subsequent releases. The
resolution of those defects (and any unresolved defects) is discussed with the
acceptance test team, and their approval is obtained for concluding the
completion of acceptance testing.

Check your progress 3

What is the role of team members to perform acceptance testing?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

14.5 LET US SUM UP

Check Your Progress: Model Answers


1. Acceptance testing is a phase after system testing that is normally done
by the customers or representatives of the customer. The customer
defines a set of test cases that will be executed to qualify and accept the
product. These test cases are executed by the customers themselves to
quickly judge the quality of the product before deciding to buy the
product. Acceptance test cases are normally small in number and are not
written with the intention of finding defects.
2. i) Acceptance criteria-Product acceptance ii) Acceptance criteria-
Procedure acceptance iii) Acceptance criteria-Service Level Agreements
(SLA)
3. The role of the testing team members during and prior to acceptance test
is crucial since they may constantly interact with the acceptance team
members. Test team members help the acceptance members to get the
required test data, select and identify test cases, and analyze the
acceptance test results. During test execution, the acceptance test team
reports its progress regularly. The defects reports are generated on a
periodic basis.

LESSON 15

SUMMARY OF TESTING PHASES


Contents
15.0 Aims and Objectives
15.1 Summary of Testing Phases
15.2 Multiphase Testing Model
15.3 Working across Multiple Releases
15.4 Who Does What and When?
15.5 Let Us Sum Up

15.0 AIMS AND OBJECTIVES

This lesson sums up all the testing methods discussed in the previous
chapters. We aim to consolidate the testing phases and their procedures. At the
end of this lesson, the reader should be able to understand testing phases, the
multiphase testing model, and working across multiple releases of a software
product.

15.1 SUMMARY OF TESTING PHASES

IEEE standards are the most accepted in the software testing industry. However,
it is not mandatory that all software testing processes follow the standard.
Software testing has many different phases, such as the test planning, test
specification, and test reporting phases.
Test plan is the most important phase in the software testing process. It sets
the process rolling and describes the scope of the testing assignment, the
approach methodology, the resource requirement for testing and the project
plan or time schedule. The test plan outlines the test items, system features
testing or checking out the functionality of the system, the testing tasks,
responsibility matrix and the risks associated with the process. The testing task
is achieved by testing different types of test data. The steps that are followed in
system testing are program testing, string testing, system testing, system
documentation and user acceptance testing.
Test specification document helps in refining the test approach that has been
planned for executing the test plan. It identifies the test cases, procedures and
the pass/fail criteria for the assignment. The test case specification document
outlines the actual values required as input parameters in the testing process
and the expected outputs of the testing results. It also identifies the various
constraints related to the test case. It is important to note that test cases are re-
usable components and one test case can be used in various test designs. The
test procedure outlines all the processes that are required to test the system
and implement the test cases.

During the testing phase all the activities that occur are documented. There are
various reasons why clear documentation is required during testing. It helps
the development team to understand the bugs and fix them quickly. In case
there is a change in the testing team, it will help the new team members to
quickly understand the process and ensure a quick transition. The overall
summary report of the testing process helps the entire project team to
understand the initial flaws in design and development and ensure that the
same errors are not repeated again. There are four types of testing documents.
The transmittal report which specifies the testing events being transmitted from
the development team to the testing team, the test log which is a very important
document and used to document the events that happened during execution,
test incident report which has a list of testing events that requires further
investigation and the test summary report which summarizes the overall testing
activities.
Many software testing companies follow the IEEE standard of software testing
when executing their testing projects. Software application development
companies may have their own testing templates which they use for their
testing requirements. Outsourcing the testing requirements to a third party
vendor helps in improving the quality of the software to a great extent. Also,
an unbiased view helps to find out the many different loopholes that exist
in the software system.

Check your progress 1

Define Test Plan


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

15.2 Multiphase Testing Model

Various phases of testing have been discussed in this chapter and
previous chapters. When these phases of testing are performed by different test
teams, the effectiveness of the model increases. However, the big question in
front of this model is in knowing when to start and finish each of the testing
phases. This section addresses some guidelines that can be used to start/
complete each of the testing phases. The transition to each of the testing phase
is determined by a set of entry and exit criteria. The objective of the entry and
exit criteria is to allow parallelism in testing and, at the same time, give
importance to the quality of the product to decide the transitions. Having very
mild entry criteria or very strict entry criteria have their own disadvantages.
When the criteria are too mild, all the testing phases start at the same time.
This represents one extreme, where the same problem is reported by multiple
teams, increasing the duplication of defects and making multiple teams wait for
bug fixes. This results in releasing a bad quality product and a lack of ownership
on issues. It also creates a repetition of test cases at various phases of testing
in case the quality requirements of that phase are not met. Having too strict
entry criteria solves this problem, but the lack of parallelism in this case creates
a delay in the release of the product. These two extreme situations are depicted
in figure 15.1.
The right approach is to allow product quality to decide when to start a
phase; the entry criteria should both enforce the quality requirements for a
particular phase and utilize the earliest opportunity for starting that phase.
The team performing the earlier phase has the ownership to meet the
entry criteria of the following phase.
Some sample entry and exit criteria are given in Tables 15.1-15.3. Please note
that there are no entry and exit criteria for unit testing, as it starts soon after
the code is ready to compile, and the entry criteria for component testing can
serve as the exit criteria for unit testing. However, unit test regression continues
till the product is released. The criteria given below enable the product quality
to decide on starting/completing test phases and create many avenues for
allowing parallelism among test phases. Figure 15.1 illustrates the three
possible entry criteria for testing phases.

[Figure 15.1 Possible entry criteria for testing phases: three time-line charts
showing when the unit, component, integration, system, and acceptance testing
phases start and end under (a) too strict, (b) too mild, and (c) optimized entry
criteria.]
The three time charts above indicate when a testing phase must be started and
when it must be ended. The entry criteria are classified into optimized, mild,
and strict. The software test engineer, depending on his requirements and time
schedule, can follow one among the above.

15.2.1 Entry/Exit Criteria for Testing Models

Table 15.1 Sample entry and exit criteria for component testing

Entry Criteria:
• Periodic unit test progress report showing 70% completion rate
• Stable build (installable) with basic features working
• Component test cases ready for execution

Exit Criteria:
• No extreme and critical outstanding defects in features
• All (100%) component test cases executed with at least 98% pass ratio
• Component test progress report (periodic) and defect trend sorted based on
features and analyzed
• Component-level performance and load testing report and analysis of the same

Table 15.2 Sample entry and exit criteria for integration testing

Entry Criteria:
• Periodic component test progress report (with at least 50% completion ratio)
with at least 70% pass rate
• Stable build (installable/upgradeable) with all features integrated
• Defect arrival showing downward trend

Exit Criteria:
• No extreme and critical outstanding defects in features
• All (100%) integration test cases executed with at least 98% pass ratio
• Integration test progress report showing good progress and defects showing
consistent downward trend
• Performance, load test report for all critical features within acceptable range
• Product in release format (including documents, media and so on)

Table 15.3 Sample entry and exit criteria for acceptance testing

Entry Criteria:
• Periodic integration test progress report with at least 50% pass rate for
starting system testing, 90% pass rate for starting acceptance testing
• Stable build (production format) with all features integrated
• Defect arrival trend showing downward movement
• No extreme and critical defects outstanding

Exit Criteria:
• All (100%) system test cases executed with at least 98% pass ratio
• All (100%) acceptance test cases executed with 100% pass rate
• Test summary reports of all phases consolidated (periodic), analyzed, and
defect trend showing downward movement for the last four weeks
• Metrics (quality and progress metrics) showing product readiness for release
• Performance, load test report for all critical features of the system

15.3 Working across Multiple Releases

As explained earlier, having separate test teams for each phase of testing
increases effectiveness. It also creates an opportunity for a test team to work on
multiple releases at the same time. This way the test teams can be utilized
completely. For example, when the exit criteria are met for component testing,
that team can get on to component testing of the next release while the
integration and system test teams are focusing on the current release. This
allows a part of the test team to work on the next release while the testing for
the current release is in progress. This is one way to reduce the overall elapsed
time for releases, exploiting the overlap and parallelism among the test phases.
Figure 15.2 depicts this concept. This figure takes into account only a few major
activities of a testing team, such as automation, test case creation, and test
execution, to explain the concept.

Figure 15.2 Testing with multiple releases

Check your progress 2

Describe component testing.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
15.4 Testing - Who Does What and When?
Table 15.4 gives an overview of which tests are executed in which phases.

[Table 15.4 Testing - who does what and when: maps each type of testing -
static analysis/memory leak/code complexity, internationalization,
compatibility (forward/backward), localization testing, interoperability,
API/interface testing, performance testing, load testing, and reliability -
against the unit, component, integration, and system/acceptance testing
phases in which it is executed.]

Check your progress 3

What do you mean by performance testing?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

15.5 LET US SUM UP

Check Your Progress: Model Answers


1. Test plan is the most important phase in the software testing process. It
sets the process rolling and describes the scope of the testing
assignment, the approach methodology, the resource requirement for
testing and the project plan or time schedule. The test plan outlines the
test items, system features testing or checking out the functionality of
the system, the testing tasks, responsibility matrix and the risks
associated with the process.

2. Specified components. A software component must have a specification
in order to be tested. Given any initial state of the component, in a
defined environment, for any fully-defined sequence of inputs and any
observed outcome, it shall be possible to establish whether or not the
component conforms to the specification.
Dynamic execution. Dynamic execution and analysis of the results of
execution must be the focus.
Techniques and measures. Define test case design techniques and test
measurement techniques. The techniques are defined to help users of
this Standard design test cases and to quantify the testing performed.
The definition of test case design techniques and measures provides for
common understanding in both the specification and comparison of
software testing.
Test process attributes. Describe attributes of the test process that
indicate the quality of the testing performed. These attributes are
selected to provide the means of assessing, comparing and improving test
quality.
Generic test process. Define a generic test process. A generic process is
chosen to ensure that this Standard is applicable to the diverse
requirements of the software industry.
3. Performance Testing covers a broad range of engineering or functional
evaluations where a material, product, system, or person is not specified
by detailed material or component specifications: rather, emphasis is on
the final measurable performance characteristics.
Performance testing can refer to the assessment of the performance of a
human examinee. For example, a behind-the-wheel driving test is a
performance test of whether a person is able to perform the functions of
a competent driver of an automobile.
In the computer industry, software performance testing is used to
determine the speed or effectiveness of a computer, network, software
program or device. This process can involve quantitative tests done in a
lab, such as measuring the response time or the number of MIPS
(millions of instructions per second) at which a system functions.
Qualitative attributes such as reliability, scalability and interoperability
may also be evaluated. Performance testing is often done in conjunction
with stress testing.

UNIT – IV
LESSON 16

PERFORMANCE TESTING
Contents
16.0 Aims and Objectives
16.1 Introduction
16.2 Factors governing Performance Testing
16.3 Performance Engineering
16.4 Let Us Sum Up

16.0 AIMS AND OBJECTIVES

We are now entering Unit IV, which explores performance testing, tools,
processes, and regression testing. In this first lesson, we will discuss the
fundamentals of performance testing and the factors that govern it.
The reader is expected to become thorough in writing test cases for all the
testing models required for her/his project specification. Choosing the right
testing method is also a quality expected of a good test engineer.

16.1 INTRODUCTION

In this Internet era, when more and more business is transacted online,
there is a big and understandable expectation that all applications run as fast
as possible. When applications run fast, a system can fulfill the business
requirements quickly and put it in a position to expand its business and handle
future needs as well. A system or a product that is not able to service business
transactions due to its slow performance is a big loss for the product
organization, its customers, and its customers' customers. For example, it is
estimated that 40% of online shopping for consumer goods in the USA happens
over the period of November-December. Slowness or lack of response during
this period may result in losses to the tune of several million dollars to
organizations. In yet another example, when examination results are published
on the Internet, several hundreds of thousands of people access the educational
web sites within a very short period. If a given website takes a long time to
complete the request or takes more time to display the pages, it may mean a
lost business opportunity, as the people may go to other web sites to find the
results. Hence, performance is a basic requirement for any product and is fast
becoming a subject of great interest in the testing community.
Performance testing is the process of determining the speed or effectiveness of a
computer, network, software program or device. This process can involve
quantitative tests done in a lab, such as measuring the response time or the
number of MIPS (millions of instructions per second) at which a system
functions. Qualitative attributes such as reliability, scalability and
interoperability may also be evaluated. Performance testing is often done in
conjunction with stress testing.
Performance testing can verify that a system meets the specifications claimed
by its manufacturer or vendor. The process can compare two or more devices or
programs in terms of parameters such as speed, data transfer rate, bandwidth,
throughput, efficiency or reliability.
Performance testing can also be used as a diagnostic aid in locating
communications bottlenecks. Often a system will work much better if a problem
is resolved at a single point or in a single component. For example, even the
fastest computer will function poorly on today's Web if the connection occurs at
only 40 to 50 Kbps (kilobits per second).

Check your progress 1

Define Performance Testing.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

16.2 FACTORS GOVERNING PERFORMANCE TESTING

There are many factors that govern performance testing. It is critical to


understand the definition and purpose of these factors prior to understanding
the methodology for performance testing and for analyzing the results.
As explained in the previous section, a product is expected, to handle
multiple transactions in a given period. The capability of the system or the
product in handling multiple transactions is determined by a factor called
throughput. Throughput represents the number of requests/business
transactions processed by the product in specified time duration. It is
important to understand that the throughput (that is, the number of
transactions serviced by the product per unit time) varies according to the load
the product is subjected to. Figure 16.1 is an example of the throughput of a
system at various load conditions. The load on the product can be increased by
increasing the number of users or by increasing the number of concurrent
operations of the product.
In the above example, it can be noticed that initially the throughput
keeps increasing as the user load increases. This is the ideal situation for any
product and indicates that the product is capable of delivering more when there
are more users trying to use the product. In the second part of the graph,
beyond certain user load conditions (after the bend), it can be noticed that the
throughput comes down. This is the period when the users of the system notice
a lack of satisfactory response and the system starts taking more time to
complete business transactions. The "optimum throughput" is represented by
the saturation point and is the one that represents the maximum throughput
for the product.

Figure 16.1 A view on Throughput


Throughput represents how many business transactions can be serviced
in a given duration for a given load. It is equally important to find out how
much time each of the transactions took to complete. As was explained in the
first section, customers might go to a different website or application if a
particular request takes more time on this website or application. Hence
measuring "response time" becomes an important activity of performance
testing. Response time can be defined as the delay between the point of request
and the first response from the product. In a typical client-server environment,
throughput represents the number of transactions that can be handled by the
server and response time represents the delay between the request and
response.
In reality, not all the delay that happens between the request and the
response is caused by the product. In the networking scenario, the network or
other products which are sharing the network resources, can cause the delays.
Hence, it is important to know what delay the product causes and what delay
the environment causes. This brings up yet another factor for performance:
latency. Latency is the delay caused by the application, the operating system,
and the environment, each of which can be calculated separately. To explain
latency, let us take
an example of a web application providing a service by talking to a web server
and a database server connected in the network.
In the above example, latency can be calculated for the product that is
running on the client and for the network that represents the infrastructure
available for the product. Thus, using the figure above, latency and
response time can be calculated as
Network latency = N1 + N2 + N3 + N4
Product latency = A1 + A2 + A3
Actual response time = Network latency + Product latency
The discussion about the latency in performance is very important, as
any improvement that is done in the product can only reduce the response time
by the improvements made in A1, A2, and A3. If the network latency is large
relative to the product latency and is affecting the response time, then
there is little point in improving the product performance alone. In such a
case, it is worthwhile looking at improving the network infrastructure. In cases
where network latency is high or cannot be improved, the product can use intelligent
approaches of caching and sending multiple requests in one packet and
receiving responses as a bunch.
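
The latency arithmetic above is easy to script. The following is a minimal
Python sketch of it, assuming hypothetical timings (in milliseconds) for the
four network segments N1-N4 and the three product components A1-A3; the values
are chosen only for illustration.

# Sketch: computing latency and response time from measured segment delays.
# All timings are hypothetical illustrative values, in milliseconds.

network_segments = {"N1": 12.0, "N2": 8.5, "N3": 9.2, "N4": 11.3}  # network hops
product_segments = {"A1": 20.0, "A2": 35.5, "A3": 15.0}            # app components

network_latency = sum(network_segments.values())   # N1 + N2 + N3 + N4
product_latency = sum(product_segments.values())   # A1 + A2 + A3
response_time = network_latency + product_latency

print(f"Network latency : {network_latency:.1f} ms")
print(f"Product latency : {product_latency:.1f} ms")
print(f"Response time   : {response_time:.1f} ms")

# If network latency dominates, tuning the product (A1-A3) yields little;
# infrastructure improvements, caching, or request batching pay off more.
if network_latency > product_latency:
    print("Network-bound: consider caching or batching requests.")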
The next factor that governs the performance testing is tuning. Tuning is
a procedure by which the product performance is enhanced by setting different
values to the parameters (variables) of the product, operating system, and other
components. Tuning improves the product performance without having to touch
the source code of the product. Each product may have certain parameters or
variables that can be set at run time to gain optimum performance. The default
values that are assumed by such product parameters may not always give
optimum performance for a particular deployment. This necessitates the need
for changing the values of parameters or variables to suit the deployment or for
a particular configuration. During performance testing, tuning of parameters is
an important activity that needs to be done before collecting actual numbers.
Another factor that needs to be considered for performance testing is
performance of competitive products. A very well-improved performance of a
product makes no business sense if that performance does not match up to the
competitive products. Hence it is very important to compare the throughput
and response time of the product with those of the competitive products. This
type of performance testing wherein competitive products are compared is
called benchmarking. No two products are the same in features, cost, and
functionality. Hence, it is not easy to decide which parameters must be
compared across two products. A careful analysis is needed to chalk out the list
of transactions to be compared across products, so that an apples-to-apples
comparison becomes possible. This produces meaningful analysis to improve
the performance of the product with respect to competition.
One of the most important factors that affect performance testing is the
availability of resources. A right kind of configuration (both hardware and
software) is needed to derive the best results from performance testing and for
deployments.
The exercise to find out what resources and configurations are needed is
called capacity planning. The purpose of a capacity planning exercise is to help
customers plan for the set of hardware and software resources prior to
installation or upgrade of the product. This exercise also sets the expectations
on what performance the customer will get with the available hardware and
software resources.
To summarize, performance testing is done to ensure that a product
• processes the required number of transactions in any given interval
(throughput);
• is available and running under different load conditions (availability);
• responds fast enough for different load conditions (response time);
• delivers a worthwhile return on investment for the resources (hardware
and software) and helps decide what kind of resources are needed for the
product for different load conditions (capacity planning); and
• is comparable to and better than that of the competitors for different
parameters (competitive analysis and benchmarking).

Check your progress 2

How will you calculate network latency and response time?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

16.3 PERFORMANCE ENGINEERING

Performance, a key component in today's competitive and dynamic marketplace,
is highly critical to meeting the ever-increasing customer demands for
service quality. CSS' proven product engineering expertise ties together
architecture and planning with performance, scalability, and sizing to reduce
operational costs and eliminate risk, by assuring that the product will exceed
every functional, reliability, availability, and operational objective during peak
hours.
By acting early to remove performance bottlenecks, we prevent our customers
from paying a greater price at a later date.

16.3.1 Performance Engineering Approach

This approach to performance engineering is a systematic and deliberate plan
to meet users' performance objectives. The engagement kicks off with
understanding the performance goals of the client. The expected load and
concurrent users are identified.
A quick system study of the existing infrastructure and network topology helps
in creating test beds and test environments that replicate real-world scenarios
as in Figure 16.2.
More than 200 application-specific performance parameters at the client and
server end are monitored, and the scalability and stability of the application
under load are studied. Using a knowledge base of common and not-so-common
performance problems, a root cause analysis of the performance bottlenecks is
done and recommendations are provided.

Figure 16.2 Performance Engineering approach

Check your progress 3

Write notes on some performance testing tools.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

16.4 LET US SUM UP

1. Performance testing is the process of determining the speed or
effectiveness of a computer, network, software program or device. This
process can involve quantitative tests done in a lab, such as measuring
the response time or the number of MIPS (millions of instructions per
second) at which a system functions.
2. Network latency = N1 + N2 + N3 + N4
Product latency = A1 + A2 + A3
Actual response time = Network latency + Product latency
3. The following are some of the open source performance testing tools with
descriptions.
a) Apache JMeter
Description: Apache JMeter is a 100% pure Java desktop application
designed to load test functional behavior and measure performance. It
was originally designed for testing Web Applications but has since
expanded to other test functions. Apache JMeter may be used to test
performance both on static and dynamic resources (files, Servlets, Perl
scripts, Java Objects, Data Bases and Queries, FTP Servers and more). It
can be used to simulate a heavy load on a server, network or object to
test its strength or to analyze overall performance under different load
types. You can use it to make a graphical analysis of performance or to
test your server/script/object behavior under heavy concurrent load.
b) benerator
Description: benerator is a framework for creating realistic and valid high-volume test
data, used for (unit/integration/load) testing and showcase setup.
Metadata constraints are imported from systems and/or configuration
files. Data can be imported from and exported to files and systems,
anonymized or generated from scratch. Domain packages provide
reusable generators for creating domain-specific data as names and
addresses internationalizable in language and region. It is strongly
customizable with plugins and configuration options.

LESSON 17

PERFORMANCE TESTING METHODOLOGY


Contents
17.0 Aims and Objectives
17.1 Methodology for Performance Testing
17.2 Collecting Requirements
17.3 Writing Test Cases
17.4 Automating Performance Test Cases
17.5 Executing Performance Test Cases
17.6 Analyzing the Performance Test Results
17.7 Performance Tuning
17.8 Performance Benchmarking
17.9 Capacity Planning
17.10 Let Us Sum Up

17.0 AIMS AND OBJECTIVES

In this lesson, further topics of performance testing are discussed: the
methodology of performance testing, how to collect requirements, how to write
good test cases, automation of test cases, executing and analyzing test
results, performance tuning, performance benchmarking, and capacity planning.
In the previous lesson, some of the open source performance testing tools
were discussed in the progress section. The reader can recollect them; we are
going to discuss tools in detail in the next lesson.

17.1 METHODOLOGY FOR PERFORMANCE TESTING

Performance testing is complex and expensive due to large resource
requirements and the time it takes. Hence, it requires careful planning and a
robust methodology. Performance testing can also be ambiguous, because the
different people performing the various roles have different expectations.
Additionally, a good number of defects that get uncovered during performance
testing may require design and architecture change. Finally, a fix for a
performance defect may even cause some functionality to stop working, thereby
requiring more effort during regression. For these reasons, this section focuses
on various steps and guidelines for doing performance testing in a methodical
way. A methodology for performance testing involves the following steps.
1. Collecting requirements
2. Writing test cases
3. Automating performance test cases
4. Executing performance test cases
5. Analyzing performance test results
6. Performance tuning
7. Performance benchmarking
8. Recommending right configuration for the customers (Capacity Planning)

17.2 COLLECTING REQUIREMENTS

Collecting requirements is the first step in planning the performance
testing. Typically, functionality testing has a definite set of inputs and outputs,
with a clear definition of expected results. In contrast, performance testing
generally needs elaborate documentation and environment setup and the
expected results may not be well known in advance. As a result of these
differences, collecting requirements for performance testing presents some
unique challenges.
Firstly, a performance testing requirement should be testable; not all
features/functionality can be performance tested. For example, a feature
involving a manual intervention cannot be performance tested as the results
depend on how fast a user responds with inputs to the product. A performance
test can only be carried out for a completely automated product.
Secondly, a performance-testing requirement needs to clearly state what
factors need to be measured and improved. As discussed in the previous
section, performance has several factors such as response time, latency,
throughput, resource utilization, and others. Hence, a requirement needs to
associate the factors or combination of factors that have to be measured and
improved as part of performance testing.
Lastly, a performance testing requirement needs to be associated with
the actual number or percentage of improvement that is desired. For example,
if a business transaction, say ATM money withdrawal, should be completed within
two minutes, the requirement needs to document the actual response time
expected. Only then can the pass/fail status of a performance test be
concluded. Not having the expected numbers for the appropriate parameter
(response time, throughput, and so on) renders performance testing completely
futile, as there is no quantitative measure of success and nothing will be
concluded or improved in the end.
Given the above challenges, a key question is how requirements for
performance testing can be derived. There are several sources for performance
requirements. Some of them are as follows.
1. Performance compared to the previous release of the same product
A performance requirement can be something like "an ATM withdrawal
transaction will be faster than in the previous release by 10%."
2. Performance compared to the competitive products A performance
requirement can be documented as "ATM withdrawal will be as fast as or
faster than that of competitive product XYZ."
3. Performance compared to absolute numbers derived from need A
requirement can be documented such as "the ATM should be capable of
handling 1000 transactions per day, with each transaction taking not
more than a minute."
4. Performance numbers derived from architecture and design The
architect or a designer of a product would normally be in a much better
position than anyone else to say what is the performance expected out of
the product. The architecture and design goals are based on the
performance expected for a particular load. Hence, there is an
expectation that the source code is written in such a way that those
numbers are met.
There are two types of requirements that performance testing focuses on:
generic requirements and specific requirements. Generic requirements are
those that are common across all products in the product domain area. All
products in that area are expected to meet those performance expectations. For
some of the products they are mandated by SLAs (Service Level Agreements)
and standards. The time taken to load a page, initial response when a mouse is
clicked, and times taken to navigate between screens are some examples of
generic requirements. Specific requirements are those that depend on
implementation for a particular product and differ from one product to another
in a given domain. An example of specific performance requirement is the time
taken to withdraw cash in an ATM. During performance testing both generic
and specific requirements need to be tested.
As discussed earlier, the requirements for performance testing also
include the load pattern and resource availability and what is expected from the
product under different load conditions. Hence, while documenting the expected
response time, throughput, or any other performance factor, it is equally
important to map them to the different load conditions.
Beyond a particular load, any product shows some degradation in
performance. While it is easy to understand this phenomenon, it is very
difficult to do a performance test without knowing the degree of degradation
with respect to load conditions. Massive degradation in performance beyond a
degree is not acceptable to users. For example, an ATM cash withdrawal taking
one hour to complete (regardless of reason or load) is not acceptable. In such
a case, the customer who requested the transaction would have waited and left
the ATM, and the money may get disbursed to the person who reaches the ATM
next. Performance values that stay within acceptable limits as the load
increases are denoted by the term "graceful performance degradation." A
performance test case for a product needs to validate this graceful degradation
as one requirement; a small sketch of such a check follows.
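
One way to make such requirements concrete is to record, for each load level,
the maximum acceptable value of the performance factor, including the degraded
(but still graceful) limits for higher loads. The following minimal Python
sketch illustrates the idea; the user counts and response-time limits are
hypothetical values chosen only for illustration.

# Sketch: documenting testable performance requirements per load condition.
# The load levels and limits below are hypothetical examples, in seconds.

requirements = [
    # (maximum concurrent users, maximum acceptable response time)
    (10, 1.0),
    (100, 1.5),
    (1000, 3.0),   # degradation is allowed here, but it must stay graceful
]

def meets_requirement(users, measured_response):
    """Return True if the measured response time meets the documented limit."""
    for max_users, limit in requirements:
        if users <= max_users:
            return measured_response <= limit
    return False  # beyond the documented load range, no commitment is made

print(meets_requirement(50, 1.2))    # True  - within the 100-user limit
print(meets_requirement(800, 2.8))   # True  - graceful degradation
print(meets_requirement(800, 5.0))   # False - degradation not acceptable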

Check your progress 1

Discuss the sources of performance requirements.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

17.3 WRITING TEST CASES

The next step involved in performance testing is writing test cases. As
briefly discussed earlier, a test case for performance testing should have the
following details defined.
1. List of operations or business transactions to be tested
2. Steps for executing those operations/transactions
3. List of product and OS parameters that impact the performance, and
their values
4. Loading pattern
5. Resources and their configurations (network, hardware, software
configurations)
6. The expected results (that is, expected response time, throughput,
latency)
7. The product versions/competitive products to be compared and
related information such as their corresponding fields
Performance test cases are repetitive in nature. These test cases are normally
executed repeatedly for different values of parameters, different load conditions,
different configurations, and so on. Hence, the details of what tests are to be
repeated for what values should be part of the test case documentation.
While testing the product for different load patterns, it is important to
increase the load or scalability gradually to avoid unnecessary failures. For
example, if an ATM withdrawal fails for ten concurrent operations, there is no
point in trying it for 10,000 operations; the effort involved in testing for
10,000 concurrent operations is several times that of testing for 10 concurrent
operations. Hence, a methodical approach is to gradually increase the
concurrent operations by, say, 10, 100, 1000, 10,000, and so on, rather than
attempting 10,000 concurrent operations in the first iteration itself. The test
case documentation should clearly reflect this approach, as the sketch below
illustrates.
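
As a rough illustration of this ramp-up, the Python sketch below steps the
concurrency through 10, 100, 1000, and 10,000 and stops at the first failing
level; run_transactions is a hypothetical stand-in for the real test harness.

# Sketch: ramping up load gradually instead of starting at 10,000 users.
# run_transactions is a hypothetical placeholder for the actual harness.

def run_transactions(concurrency):
    """Pretend harness: succeeds up to 1000 concurrent operations."""
    return concurrency <= 1000

for level in (10, 100, 1000, 10000):
    ok = run_transactions(level)
    print(f"{level:>6} concurrent operations: {'PASS' if ok else 'FAIL'}")
    if not ok:
        # No point spending effort on higher loads once a level fails.
        print(f"Stopping ramp-up: the product failed at {level} operations.")
        break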
Performance testing is a laborious process involving time and effort. Not
all operations/business transactions can be included in performance testing.
Hence, all test cases that are part of performance testing have to be assigned
different priorities so that high-priority test cases can be completed before
others. The priority can be absolute, as indicated by the customers, or relative,
within the test cases considered for performance testing. Absolute priority
is indicated by the requirements, and the test team normally assigns the
relative priority. While executing the test cases, the absolute and relative
priorities are looked at and the test cases are sequenced accordingly.

Check your progress 2

What are the contents of a performance testing test case?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

17.4 AUTOMATING PERFORMANCE TEST CASES

Automation is an important step in the methodology for performance
testing. Performance testing naturally lends itself to automation due to the
following characteristics.
1. Performance testing is repetitive.
2. Performance test cases cannot be effective without automation
and in most cases it is, in fact, almost impossible to do
performance testing without automation.
3. The results of performance testing need to be accurate, and
manually calculating the response time, throughput, and so on
can introduce inaccuracy.
4. Performance testing takes into account several factors. There are
far too many permutations and combination of those factors and it
will be difficult to remember all these and use them if the tests are
done manually.
5. The analysis of performance results and failures needs to take into
account related information such as resource utilization, log files,
trace files, and so on that are collected at regular intervals. It is
impossible to do this testing and perform the book-keeping of all
related information and analysis manually.
As we will see in the chapter on test automation, there should not be any
hard-coded data in automated scripts for performance testing. Such hard
coding may impact the repeatability of test cases and may require changes in
the automation script, taking more time and effort.
End-to-end automation is required for performance testing. Not only the
steps of the test cases, but also the setup required for the test cases, setting
different values to parameters, creating different load conditions, setting up and
executing the steps for operations/transactions of competitive product, and so
on have to be included as part of the automation script. While automating
performance test cases, it is important to use standard tools and practices.
Since some of the performance test cases involve comparison with the
competitive product, the results need to be consistent, repeatable, and accurate
due to the high degree of sensitivity involved.
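
To make these points concrete, here is a minimal Python sketch of an automated
load driver with no hard-coded data: the target URL, user count, and iteration
count come from a configuration dictionary, and each virtual user's response
times are recorded for later analysis. The endpoint and load values are
placeholders; a real script would read them from a configuration file and add
setup, teardown, warm-up, and result storage.

# Sketch: a minimal automated load driver with externalized parameters.
# The URL and load values are hypothetical placeholders.
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

config = {
    "url": "http://localhost:8080/withdraw",   # placeholder endpoint
    "virtual_users": 20,
    "iterations_per_user": 5,
}

def one_user(user_id):
    """Run the transaction repeatedly, returning response times in seconds."""
    timings = []
    for _ in range(config["iterations_per_user"]):
        start = time.perf_counter()
        try:
            urllib.request.urlopen(config["url"], timeout=10).read()
        except OSError:
            pass   # a real script would log the failure for later analysis
        timings.append(time.perf_counter() - start)
    return timings

with ThreadPoolExecutor(max_workers=config["virtual_users"]) as pool:
    results = list(pool.map(one_user, range(config["virtual_users"])))

all_times = [t for user in results for t in user]
print(f"{len(all_times)} requests, average {sum(all_times)/len(all_times):.3f} s")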

17.5 EXECUTING PERFORMANCE TEST CASES

Performance testing generally involves less effort for execution but more
effort for planning, data collection, and analysis. As discussed earlier, 100%
end-to-end automation is desirable for performance testing and if that is
achieved, executing a performance test case may just mean kicking off certain
automated scripts. However, the most effort-consuming part of execution is
usually data collection. Data corresponding to the following points needs to be
collected while executing performance tests.
1. Start and end time of test case execution
2. Log and trace/audit files of the product and operating system (for
future debugging and repeatability purposes)
3. Utilization of resources (CPU, memory, disk, network and so on) on a
periodic basis
4. Configuration of all environmental factors (hardware, software, and
other components)
5. The response time, throughput, latency, and so on, as specified in the
test case documentation, at regular intervals
Another aspect involved in performance test execution is scenario
testing. A set of transactions/operations that are usually performed by the user
forms the scenario for performance testing. This particular testing is done to
ensure that the mix of operations/transactions performed concurrently by
different users/machines meets the performance criteria. In real life, not all
users perform the same operation all the time, and hence these tests are
performed. For example, not all users withdraw cash from an ATM; some of
them query for account balance; some make deposits, and so on. In this case,
this scenario (with different users executing different transactions) is executed
with the existing automation that is available, and related data is collected
using the existing tools; a small sketch of such a transaction mix follows.
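
The transaction mix for such a scenario can be generated with a weighted
random choice, so that the load reflects realistic usage instead of a single
repeated operation. The sketch below assumes a hypothetical mix of 60%
balance queries, 30% withdrawals, and 10% deposits.

# Sketch: generating a weighted mix of transactions for scenario testing.
# The operation names and weights are hypothetical illustrative values.
import random

mix = {"balance_query": 0.6, "withdrawal": 0.3, "deposit": 0.1}

def next_operation():
    """Choose the next transaction according to the documented mix."""
    return random.choices(list(mix), weights=list(mix.values()), k=1)[0]

# Simulate 1000 user actions and confirm the mix roughly holds.
counts = {op: 0 for op in mix}
for _ in range(1000):
    counts[next_operation()] += 1
print(counts)   # e.g. {'balance_query': 607, 'withdrawal': 291, 'deposit': 102}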
What performance a product delivers for different configurations of
hardware and network setup is another aspect that needs to be included
during execution. This requirement mandates the need for repeating the tests
for different configurations, and is referred to as configuration performance
tests. These tests ensure that the performance of the product is compatible with
different hardware, utilizing the special nature of those configurations and
yielding the best performance possible. For a given configuration, the product
has to give the best possible performance, and if the configuration is better, it
has to get even better. The performance test case is repeated for each
configuration, and factors such as response time and throughput are recorded
and analyzed.
Once performance tests are executed and various data points are
collected, the next step is to plot them. As explained earlier, performance test
cases are repeated for different configurations and different values of
parameters. Hence, it makes sense to group them and plot them in the form of
graphs and charts. Plotting the data helps in making a quick analysis which
would otherwise be difficult to do with only the raw data.
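
A few lines of plotting code make such grouped comparisons immediate. The
sketch below plots hypothetical throughput curves for two configurations
against user load, assuming the matplotlib library is available; the data
points are placeholders for values recorded during test execution.

# Sketch: plotting throughput against load for two configurations.
# The data points are hypothetical; real values come from test execution.
import matplotlib.pyplot as plt

users = [10, 50, 100, 200, 400, 800]
throughput_config_a = [95, 460, 880, 1500, 1550, 1300]   # transactions/sec
throughput_config_b = [90, 440, 850, 1600, 2100, 2050]

plt.plot(users, throughput_config_a, marker="o", label="Configuration A")
plt.plot(users, throughput_config_b, marker="s", label="Configuration B")
plt.xlabel("Concurrent users")
plt.ylabel("Throughput (transactions/sec)")
plt.title("Throughput under increasing load")
plt.legend()
plt.savefig("throughput.png")   # or plt.show() in an interactive session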

Check your progress 3

What are the characteristics of a performance testing test case?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

17.6 ANALYZING THE PERFORMANCE TEST RESULTS

Analyzing the performance test results requires multi-dimensional
thinking. This is the most complex part of performance testing, where product
knowledge, analytical thinking, and statistical background are all absolutely
essential. Before analyzing the data, some calculations of data and organization
of the data are required. The following come under this category.
1. Calculating the mean of the performance test result data
2. Calculating the standard deviation
3. Removing the noise (noise removal), re-plotting the data, and
re-calculating the mean and standard deviation
4. Separating the data coming from the cache from the data that gets
processed by the product itself, where caching and other such
technologies are implemented in the product
5. Differentiating the performance data collected when the resources were
completely available from the data collected when some background
activities were going on
For publishing the performance numbers, there is one basic expectation:
performance numbers are to be reproducible for the customers. To ensure
this, all performance tests are repeated multiple times and the average/mean
of those values is taken. This increases the chance that the performance data can be
reproduced at a customer site for the same configuration and load condition.
Repeatability does not depend only on taking the average/mean of
performance data. It also depends on how consistently the product delivers
those performance numbers. Standard deviation can help here. It may indicate
whether the performance numbers can be reproduced at the customer side. The
standard deviation represents how much the data varies from the mean. For
example, if the average response time of 100 people withdrawing money from an
ATM is 100 seconds and the standard deviation is 2, then there is greater
chance that this performance data is repeatable than in a case where the
standard deviation is 30. Standard deviation close to zero means the product
performance is highly repeatable and performance values are consistent. The
higher the standard deviation, the greater the variability of the product performance.
When there are a set of performance numbers that came from multiple
runs of the same test, there could be situations where in a few of the iterations,
some errors were committed by the scripts, software, or a human. Taking such
erroneous executions into account may not be appropriate and such values
need to be ignored. Moreover, when a set of values is plotted on a chart, one or
two values that are out of range may cause the chart to be cluttered and prevent
meaningful analysis. Such values can be ignored to produce a smooth
curve/graph. The process of removing some unwanted values from a set is
called noise removal. When some values are removed from the set, the mean
and standard deviation need to be re-calculated, as the sketch below shows.
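
These calculations are straightforward to script. The sketch below computes
the mean and standard deviation of a hypothetical set of response times,
discards values more than two standard deviations from the mean (one common
noise-removal rule of thumb; the text does not prescribe a specific cut-off),
and recomputes both figures.

# Sketch: mean, standard deviation, and simple noise removal for
# performance data. The response times are hypothetical, in seconds.
import statistics

samples = [1.02, 0.98, 1.05, 1.01, 0.99, 7.40, 1.03, 1.00]   # one outlier

mean = statistics.mean(samples)
stdev = statistics.stdev(samples)
print(f"Raw     : mean={mean:.2f} s, stdev={stdev:.2f} s")

# Treat values more than 2 standard deviations from the mean as noise.
cleaned = [x for x in samples if abs(x - mean) <= 2 * stdev]

print(f"Cleaned : mean={statistics.mean(cleaned):.2f} s, "
      f"stdev={statistics.stdev(cleaned):.2f} s")
# A near-zero stdev after cleaning suggests repeatable performance numbers.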
The majority of the server-client, Internet, and database applications
store the data in a local high-speed buffer when a query is made. This enables
them to present the data quickly when the same request is made again; this is
called caching. The performance data need to be differentiated according to
where the result is coming from - the server or the cache. The data points can
be kept as two different data sets: one for the cache and one for the server.
Keeping them
as two different data sets enables the performance data to be extrapolated in
future, based on the hit ratio expected in deployments.
For example, assume that data in a cache can produce a response time
of 1000 microseconds and a server access takes 1 microsecond and 90% of the
time a request is satisfied by the cache. Then the average response time is
(0.9 x 1000) + (0.1 x 1) = 900.1 microseconds. The mean response time is thus calculated as a
weighted average rather than a simple mean.
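
The weighted-average calculation from the example above can be written
directly as a small function of the hit ratio and the two response times.

# Sketch: weighted mean response time with caching, using the figures
# from the example above (times in microseconds).
def weighted_response(hit_ratio, cache_time, server_time):
    """Average response time for a given fraction of cache hits."""
    return hit_ratio * cache_time + (1 - hit_ratio) * server_time

print(weighted_response(0.9, 1000, 1))   # 900.1, as in the text
# Keeping cache and server data sets separate makes it easy to re-run this
# for the hit ratio expected in a particular deployment.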
Some "time initiated activities" of the product or background activities of
the operating system and network may have an effect on the performance data.
An example of one such activity is garbage collection/defragmentation in
memory management of the operating system or a compiler. When such
activities are initiated in the background, degradation in the performance may
be observed. Finding out such background events and separating those data
points and making an analysis would help in presenting the right performance
data.
Once the data sets are organized (after appropriate noise removal and
after appropriate refinement as mentioned above), the analysis of performance
data is carried out to conclude the following.
1. Whether performance of the product is consistent when tests are
executed multiple times
2. What performance can be expected for what type of configuration and
resources (both hardware and software)
3. What parameters impact performance and how they can be used to
derive better performance (Please refer to the section on performance
tuning)
4. What is the effect of scenarios involving a mix of several operations on
the performance factors
5. What is the effect of product technologies such as caching on
performance improvements (Please refer to the section on performance
tuning)
6. Up to what load the product performs acceptably, and whether the
performance degradation beyond that load is graceful ("graceful degradation")
7. What is the optimum throughput/response time of the product for a set
of factors such as load, resources, and parameters
8. What performance requirements are met and how the performance looks
when compared to the previous version or the expectations set earlier or
the competition
9. Sometimes a high-end configuration may not be available for performance
testing. In that case, using the current set of performance data and the
charts that are available through performance testing, the performance
numbers to be expected from a high-end configuration should be
extrapolated or predicted.

17.7 PERFORMANCE TUNING

Analyzing performance data helps in narrowing down the list of parameters
that really impact the performance results and in improving product performance.
Once the parameters are narrowed down to a few, the performance test cases
are repeated for different values of those parameters to further analyze their
effect in getting better performance. This performance-tuning exercise needs a
high degree of skill in identifying the list of parameters and their contribution to
performance. Understanding each parameter and its impact on the product is
not sufficient for performance tuning. The combination of parameters too
causes changes in performance. The relationship among various parameters
and their impact too become very important to performance tuning.
There are two steps involved in getting the optimum mileage out of
performance tuning. They are
1. Tuning the product parameters and
2. Tuning the operating system parameters
There are a set of parameters associated with the product where
administrators or users of the product can set different values to obtain
optimum performance. Some of the common practices are providing a number
of forked processes for performing parallel transactions, caching and memory
size, creating background activities, deferring routine checks to a later point of
time, providing better priority to a highly used operation/ transaction, disabling
low-priority operations, changing the sequence or clubbing a set of operations
to suit the resource availability, and so on. Setting different values to these
parameters enhances the product performance. The product parameters in
isolation as well as in combination have an impact on product performance.
Hence it is important to
1. Repeat the performance tests for different values of each parameter that
impact performance (when changing one parameter you may want to
keep the values of other parameters unchanged).
2. Sometimes when a particular parameter value is changed, it needs
changes in other parameters (as some parameters may be related to each
other). Repeat the performance tests for a group of parameters and their
different values.
3. Repeat the performance tests for default values of all parameters (called
factory settings tests).
4. Repeat the performance tests for low and high values of each parameter
and their combinations (a sweep sketch follows this list).
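
A minimal sketch of such a sweep follows; the parameter names, their values,
and run_perf_test are hypothetical placeholders for product-specific details.

# Sketch: sweeping product parameters one at a time and in combination.
# Parameter names, values, and run_perf_test are hypothetical placeholders.
import itertools

parameters = {
    "cache_size_mb": [64, 256, 1024],
    "worker_processes": [2, 4, 8],
}
defaults = {"cache_size_mb": 256, "worker_processes": 4}

def run_perf_test(settings):
    """Placeholder for the real test; returns throughput in transactions/sec."""
    return 100.0 * settings["worker_processes"] + settings["cache_size_mb"] / 10

# 1. Factory settings first, as a baseline.
print("defaults:", defaults, "->", run_perf_test(defaults))

# 2. Vary one parameter at a time, keeping the others at their defaults.
for name, values in parameters.items():
    for value in values:
        settings = dict(defaults, **{name: value})
        print(f"{name}={value}:", run_perf_test(settings))

# 3. Then try combinations, since related parameters interact.
for combo in itertools.product(*parameters.values()):
    settings = dict(zip(parameters, combo))
    print(settings, "->", run_perf_test(settings))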
There is one important point that needs to be noted while tuning the
product parameters. Performance tuning provides better results only for a
particular configuration and for certain transactions. It would have achieved
the performance goals, but it may have a side-effect on functionality or on some
non-functional aspects. Therefore, tuning may be counter-productive to other
situations or scenarios. This side-effect of tuning product parameters needs to
be analyzed and such side-effects also should be included as part of the
analysis of this performance-tuning exercise.
Tuning the OS parameters is another step towards getting better
performance. There are various sets of parameters provided by the operating
system under different categories. Those values can be changed using the
appropriate tools that come along with the operating system (for example, the
Registry in MS-Windows can be edited using regedit.exe). These parameters in
the operating system are grouped under different categories to explain their
impact, as given below.
1. File system related parameters (for example, number of open files
permitted)
2. Disk management parameters (for example, simultaneous disk
reads/writes)
3. Memory management parameters (for example, virtual memory page size
and number of pages)
4. Processor management parameters (for example, enabling/ disabling
processors in multiprocessor environment)
5. Network parameters (for example, setting TCP/IP time out)
As explained earlier, not only each of the OS parameters but also their
combinations have different effects on product performance. As before, the
performance tests have to be repeated for different values of each OS parameter
and for combinations of OS parameters. While repeating the tests, the OS
parameters need to be tuned before application/product tuning is done.
There is one important point that needs to be remembered when tuning
the OS parameters for improving product performance. The machine on which
the parameter is tuned may have multiple products and applications running.
Hence, tuning an OS parameter may give better results for the product
under test, but may heavily impact the other products that are running on the
same machine. Hence, OS parameters need to be tuned only when the complete
impact on all applications running in the machine is known, or when it is
absolutely necessary, giving big performance advantages. Tuning OS
parameters for small gains in performance is not the right thing to
do.
The products are normally supported on more than one platform. Hence,
the performance tuning procedure should consider the OS parameters and their
effect on all supported platforms for the product.

17.8 PERFORMANCE BENCHMARKING

Performance benchmarking is about comparing the performance of
product transactions with that of the competitors. No two products can have
the same architecture, design, functionality, and code. The customers and
types of deployments can also be different. Hence, it will be very difficult to
compare two products on those aspects. End-user transactions/scenarios could
be one approach for comparison. In general, an independent test team or an
independent organization not related to the organizations of the products being
compared does performance benchmarking. This does away with any bias in
the test. The person doing the performance benchmarking needs to have the
expertise in all the products being compared for the tests to be executed
successfully. The steps involved in performance benchmarking are the
following:
1. Identifying the transactions/scenarios and the test configuration
2. Comparing the performance of different products
3. Tuning the parameters of the products being compared fairly to deliver
the best performance
4. Publishing the results of performance benchmarking
As mentioned earlier, as the first step, comparable (apples-to-apples)
transactions/scenarios are selected for performance benchmarking. Normally,
the configuration details are determined well in advance and hence test cases
are not repeated for different configurations. Generally, the test cases for all the
products being compared are executed in the same test bed. However, two to
three configurations are considered for performance benchmarking just to
ensure that the testing provides the breadth required to cover realistic
scenarios.
Once the tests are executed, the next step is to compare the results. This
is where the understanding of the products being compared becomes essential.
Equal expertise level in all the products is desirable for the person doing the
tests. The tunable parameters for the various products may be completely
different and understanding those parameters and their impact on performance
is very important in doing a fair comparison of results. This is one place where
bias can come in. A well tuned product, A, may be compared with a product B
with no parameter tuning, to prove that the product A performs better than B.
It is important that in performance benchmarking all products should be tuned
to the same degree.
From the point of view of a specific product there could be three
outcomes from performance benchmarking. The first outcome can be positive,
where it is found that a set of transactions/scenarios outperforms the
competition. The second outcome can be neutral, where a set of transactions
is comparable with that of the competition. The third outcome can be negative,
where a set of transactions under-performs compared to the competition. The
last outcome may be detrimental to the success of the
product; hence, the performance tuning exercise described in the previous
section needs to be performed for this set of transactions using the same
configuration internally by the product organization. If tuning helps in this
case, it at least helps in bringing down the criticality of the failure; else it
requires the performance defects to be fixed and a subset of test cases for
performance benchmarking to be repeated again. Even though it was said that
tuning as an exercise needs to be repeated for the third outcome, it need not be
limited only to that situation. Tuning can be repeated for all situations of
positive, neutral, and negative results to derive the best performance results.
Repeating the performance tuning may not always be possible. If neutral
agencies (through whom audited benchmarks are done) are involved, then they
may just bring out the apples-to-apples comparison and may not do tuning. In
such cases, the test
teams will take care of repeating the tests.
The results of performance benchmarking are published. There are three
types of publications involved. One is an internal, confidential publication to
product teams, containing all the three outcomes above and the recommended
set of actions. The positive outcomes of performance benchmarking are
normally published as marketing collateral, which helps as a sales tool for the
product. Also benchmarks conducted by independent organizations are
published as audited benchmarks.

17.9 CAPACITY PLANNING

If performance tests are conducted for several configurations, the huge
volume of data and analysis that is available can be used to predict
configurations needed for a particular set of transactions and load pattern. This
reverse process is the objective of capacity planning. Performance configuration
tests are conducted for different configurations and performance data are
obtained. In capacity planning, the performance requirements and performance
results are taken as input requirements and the configuration needed to satisfy
that set of requirements is derived.
Capacity planning necessitates a clear understanding of the resource
requirements for transactions/scenarios. Some transactions of the product
associated with certain load conditions could be disk intensive, some could be
CPU intensive, some of them could be network intensive, and some of them
could be memory intensive. Some transactions may require a combination of
these resources for performing better. This understanding of what resources are
needed for each transaction is a prerequisite for capacity planning.
If capacity planning has to identify the right configuration for the
transactions and particular load patterns, then the next question that arises is
how to decide the load pattern. The load can be the actual requirement of the
customer for immediate need (short term) or the requirements for the next few
months (medium term) or for the next few years (long term). Since the load
pattern changes according to future requirements, it is critical to consider those
requirements during capacity planning. Capacity planning corresponding to
short-, medium-, and long-term requirements are called
1. Minimum required configuration;
2. Typical configuration; and
3. Special configuration.
A minimum required configuration denotes that with anything less than
this configuration, the product may not even work. Thus, configurations below
the minimum required configuration are usually not supported. A typical
configuration denotes that under that configuration the product will work fine
for meeting the performance requirements of the required load pattern and can
also handle a slight increase in the load pattern. A special configuration
denotes that capacity planning was done considering all future requirements.
There are two techniques that play a major role in capacity planning.
They are load balancing and high availability. Load balancing ensures that the
multiple machines available are used equally to service the transactions. This
ensures that by adding more machines, more load can be handled by the
product. Machine clusters are used to ensure availability. In a cluster there are
multiple machines with shared data so that in case one machine goes down, the
transactions can be handled by another machine in the cluster. When doing
capacity planning, both load balancing and availability factors are included to
prescribe the desired configuration.
The majority of capacity planning exercises are only interpretations of
data and extrapolation of the available information. A minor mistake in the
analysis of performance results or in extrapolation may cause a deviation in
expectations when the product is used in deployments. Moreover, capacity
planning is based on performance test data generated in the test lab, which is
only a simulated environment. In real-life deployment, there could be several
other parameters that may impact product performance. As a result of these
unforeseen reasons, apart from the skills mentioned earlier, experience is
needed to know real-world data and usage patterns for the capacity planning
exercise.
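
As a rough illustration of such extrapolation, the sketch below fits a straight
line through measured throughput figures for different server counts and
predicts the configuration needed for a target load. The data points and the
assumption of linear scaling are hypothetical simplifications; as noted above,
real capacity planning must also weigh load balancing, availability, and
future load growth.

# Sketch: extrapolating a configuration from measured performance data.
# The data points and the linear-scaling assumption are hypothetical.
import math

# (number of servers, measured throughput in transactions/sec)
measurements = [(1, 480), (2, 950), (4, 1900)]

# Least-squares slope through the origin: throughput ~ slope * servers.
slope = sum(s * t for s, t in measurements) / sum(s * s for s, _ in measurements)

target_load = 5000   # transactions/sec required by the customer
servers_needed = math.ceil(target_load / slope)
print(f"About {slope:.0f} tps per server; "
      f"{servers_needed} servers needed for {target_load} tps")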

17.10 LET US SUM UP

Check Your Progress: Model Answers
1. Performance compared to the previous release of the same product ,
Performance compared to the competitive products, Performance
compared to absolute numbers derived from need, Performance
numbers derived from architecture and design
2. List of operations or business transactions to be tested
Steps for executing those operations/transactions
List of product and OS parameters that impact the performance, and
their values
Loading pattern
Resources and their configurations (network, hardware, software
configurations)
The expected results (that is, expected response time, throughput, latency)
The product versions/competitive products to be compared and related
information such as their corresponding fields
3. Performance testing is repetitive.
Performance test cases cannot be effective without automation and in
most cases it is, in fact, almost impossible to do performance testing
without automation.
The results of performance testing need to be accurate, and manually
calculating the response time, throughput, and so on can introduce
inaccuracy.
Performance testing takes into account several factors. There are far too
many permutations and combination of those factors and it will be
difficult to remember all these and use them if the tests are done
manually.
The analysis of performance results and failures needs to take into
account related information such as resource utilization, log files, trace
files, and so on that are collected at regular intervals. It is impossible to
do this testing and perform the book-keeping of all related information
and analysis manually.

LESSON 18

TOOLS AND PROCESS FOR PERFORMANCE TESTING
Contents
18.0 Aims and Objectives
18.1 Tools for Performance Testing
18.2 Process for Performance Testing
18.3 Challenges
18.4 Let Us Sum Up

18.0 AIMS AND OBJECTIVES

In this lesson, we are going to discuss in detail the tools and processes of
performance testing. In the previous lesson, we discussed performance tuning,
benchmarking, and capacity planning.
The main objective of this lesson is to bring out a practical approach to
performance testing.

18.1 TOOLS FOR PERFORMANCE TESTING

There are two types of tools that can be used for performance testing:
functional performance tools and load testing tools.
Functional performance tools help in recording and playing back the
transactions and obtaining performance numbers. This test generally involves
very few machines. Load testing tools simulate the load condition for
performance testing without having to keep that many users or machines. The
load testing tools simplify the complexities involved in creating the load and
without such load tools it may be impossible to perform these kinds of tests. As
was mentioned earlier, this is only a simulated load and real-life experience
may vary from the simulation.
We list below some popular performance tools:
• Functional performance tools
o WinRunner from Mercury
o QA Partner from Compuware
o SilkTest from Segue
• Load testing tools
o LoadRunner from Mercury
o QALoad from Compuware
o SilkPerformer from Segue
There are many vendors who sell these performance tools. The references
at the end of the book point to some of the popular tools.
Performance and load tools can only help in getting performance
numbers. The utilization of resources is another important parameter that
needs to be collected. "Windows Task Manager" and "top" in Linux are examples
of tools that help in collecting resource utilization. Network performance
monitoring tools are available with almost all operating systems today to collect
network data.

Check your progress 1

Discuss WinRunner.
Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

18.2 PROCESS FOR PERFORMANCE TESTING

Performance testing follows the same process as any other testing type.
The only difference is in getting more details and analysis. As mentioned earlier,
the effort involved in performance testing is more and tests are generally
repeated several times. The increased effort reflects in increased costs, as the
resources needed for performance testing are quite high. A major challenge
involved in performance testing is getting the right process so that the effort can
be minimized. A simple process for performance testing tries to address these
aspects.
Ever-changing requirements for performance are a serious threat to the
product, as performance can only be improved marginally by fixing the code.
As mentioned earlier, a majority of performance issues require rework or
changes in architecture and design. Hence, it is important to collect the
requirements for performance early in the life cycle and address them,
because changes to architecture and design late in the cycle are expensive.
While collecting requirements for performance testing, it is important to decide
whether they are testable, that is, to ensure that performance requirements are
quantified and validated in an objective way. If so, the quantified expectation of
performance is documented. Making the requirements testable and measurable
is the first activity needed for the success of performance testing.

Figure 18.1 Process of performance testing
The next step in the performance testing process in Figure 18.1 is to
create a performance test plan. This test plan needs to have the following
details.
1. Resource requirements All additional resources that are specifically
needed for performance testing need to be planned and obtained.
Normally these resources are obtained, used for performance test, and
released after performance testing is over. Hence, the resources need to
be included as part of the planning and tracked.
2. Test bed (simulated and real life), test-lab setup The test lab, with all
required equipment and software configuration, has to be set up prior to
execution. Performance testing requires a large number of resources and
requires special configurations. Hence, setting up both the simulated
and real-life environment is time consuming and any mistake in the test-
bed setup may mean that the complete performance tests have be
repeated. Hence, it has to be a part of the planning exercise and tracked.
3. Responsibilities Performance defects, as explained earlier, may cause
changes to architecture, design, and code. Additionally, the teams facing
the customers normally communicate requirements for performance.
Multiple teams are involved in the successful execution of performance
tests and all the teams and people performing different roles need to
work together if the objectives of performance have to be met. Hence, a
matrix containing responsibilities must be worked out as part of the
performance test plan and communicated across all teams.
4. Setting up product traces, audit trails, and logs (external and internal)
Performance test results need to be associated with traces and audit
trails to analyze the results and defects. What traces and audit trails
have to be collected is planned in advance and is an associated part of
the test plan. This is to be planned in advance, because enabling too
many traces and audit trails may start impacting the performance
results.
5. Entry and exit criteria Performance tests require a stable product due
to its complexity and the accuracy that is needed. Changes to the
product affect performance numbers and may mean that the tests have
to be repeated. It will be counter-productive to execute performance test
cases before the product is stable or when changes are being made.
Hence, the performance test execution normally starts after the product
meets a set of criteria. The set of criteria to be met are defined well in
advance and documented as part of the performance test plan. Similarly,
a set of exit criteria is defined to conclude the results of performance
tests.
Designing and automating the test cases form the next step in the
performance test process. Automation deserves a special mention at this step
because it is almost impossible to perform performance testing without
automation.
Entry and exit criteria play a major role in the process of performance
test execution. At regular intervals during product development, the entry
criteria are evaluated and the test is started if those criteria are met. There can
be a separate set of criteria for each of the performance test cases. The entry
criteria need to be evaluated at regular intervals since starting the tests early is
counter-productive and starting late may mean that the performance objective
is not met on time before the release. At the end of performance test execution,
the product is evaluated to see whether it met all the exit criteria. If some of the
criteria are not met, improvements are made to the product and the test cases
corresponding to the exit criteria are re-executed with an objective to fill the
gap. This process is repeated till all the exit criteria are met.
Each of the process steps for the performance tests described above are
critical because of the factors involved (that is; cost, effort, time, and
effectiveness). Hence, keeping a strong process for performance testing provides
a high return on investment.

18.3 CHALLENGES

Performance testing is not a very well understood topic in the testing
community. There are several interpretations of performance testing. Some
organizations separate performance testing and load testing and conduct them
at different phases of testing. While it may be successful in some situations,
sometimes separating these two causes complications. When there is a need to
compare these functional performance numbers with load testing numbers, it
becomes difficult as the build used is different and the timeline (when the
timeline is different, the quality of product may also be different) is also
different as they were performed at two different phases. In this case, an
apples-to-apples comparison is not possible.
The availability of skills is a major problem facing performance testing.
As discussed at several places of this chapter, product knowledge, knowledge of
competition, tools usage, automation, process, knowledge on statistics and
analytical skills are needed to do performance testing. This is one of the longest
lists of skills required for any type of testing discussed till now. Training
the engineers on these skills and making them available for a long duration for
doing performance testing will help in meeting these skill needs.
Performance testing requires a large number and amount of resources
such as hardware, software, effort, time, tools, and people. Even large
organizations find these resources that are needed to meet the objectives of
performance testing scarce. Even if they are available, it is so only for a short
duration. This is yet another challenge in performance testing. Looking at the
resources available and trying to meet as many objectives as possible is what is
expected from the teams executing performance tests.
Performance test results need to reflect real-life environment and
expectations. But due to the nature of tools which only simulate the
environment, the test lab that works in a controlled environment, and data sets
which may not have all fields populated the same way as the customer has,
repeating the performance test results in the real-life customer deployments is a
big challenge. Taking adequate care to create a test bed as close as possible
to a customer deployment is another expectation for performance tests.
Selecting the right tool for the performance testing is another challenge.
There are many tools available for performance testing but not all of them meet
all the requirements. Moreover, performance test tools are expensive and
require additional resources to install and use. Performance tools also expect
the test engineers to learn additional meta-languages and scripts. This throws
up another challenge for performance testing.
Interfacing with different teams, including a set of customers, is yet
another challenge in performance testing. Not only the customers but also the
technologists and the development teams give performance test requirements.
Performance testing is conducted to meet the expectations of customers,
architects, and development team. As a business case, the performance of the
product needs to match up with the competition. As expectations keep growing
from all directions, it will be difficult to meet all of them at one go. Sustained
effort is needed if the majority of performance expectations have to be met.
Lack of seriousness on performance tests by the management and
development team is another challenge. Once all functionalities are working fine
in a product, it is assumed that the product is ready to ship. For the various
reasons specified earlier, performance tests are conducted after the features are
stable, and the defects that come out of these tests need to be looked into very
seriously by the management. Because it may be too late to fix some defects,
because of release pressures, because fixes needed in design and architecture
may demand a big regression effort, or for various other reasons, some of the
defects from these tests are generally postponed to the next release. This defeats the
purpose of performance tests. A high degree of management commitment and
directive to fix performance defects before product release are needed for
successful execution of performance tests.

Check your progress 2

How will you select a tool for performance testing?


Notes: a) Write your answer in the space given below

b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
18.4 BENEFITS OF AUTOMATED TESTING
1. Fast - WinRunner runs tests significantly faster than
human users.
2. Reliable - Tests perform precisely the same operations each
time they are run, thereby eliminating human error.
3. Repeatable - You can test how the software reacts under repeated
execution of the same operations.
4. Programmable - You can program sophisticated tests that bring out
hidden information from the application.
5. Comprehensive - You can build a suite of tests that covers every
feature in your application.
6. Reusable - You can reuse tests on different versions of an
application, even if the user interface changes. (The sketch
below illustrates some of these properties.)
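The sketch below, written in Python's standard unittest library rather than
WinRunner, illustrates some of these properties in miniature: the same
operations are performed precisely the same way each run, they can be repeated
without human error, and the test can be reused as long as the interface it
exercises is stable. The discount() function is a hypothetical stand-in for the
application under test.

import unittest

def discount(price, percent):
    # Hypothetical stand-in for the application under test.
    return round(price * (1 - percent / 100), 2)

class DiscountTests(unittest.TestCase):
    def test_basic_discount(self):
        # Reliable: the same operation and the same check on every run.
        self.assertEqual(discount(100.0, 10), 90.0)

    def test_repeated_execution(self):
        # Repeatable: how does the software react to repeated execution?
        for _ in range(1000):
            self.assertEqual(discount(80.0, 25), 60.0)

if __name__ == "__main__":
    unittest.main()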

18.5 LET US SUM UP

Check Your Progress: Model Answers


1. WinRunner, Mercury Interactive’s enterprise functional testing tool. It is
used to quickly create and run sophisticated automated tests on your
application. Winrunner helps you automate the testing process, from test
development to execution. You create adaptable and reusable test scripts
that challenge the functionality of your application. Prior to a software
release, you can run these tests in a single overnight run- enabling you
to detect and ensure superior software quality.
2. Selecting the right tool for the performance testing is another challenge.
There are many tools available for performance testing but not all of
them meet all the requirements. Moreover, performance test tools are
expensive and require additional resources to install and use.
Performance tools also expect the test engineers to learn additional meta-
languages and scripts. This throws up another challenge for performance
testing.

LESSON 19

REGRESSION TESTING
Contents
19.0 Aims and Objectives
19.1 Introduction
19.2 What is Regression Testing?
19.3 Types of Regression Testing
19.4 When to do Regression Testing?
19.5 Strategies of Regression Testing
19.6 Let Us Sum Up

19.0 AIMS AND OBJECTIVES

In this lesson, we introduce regression testing and its types. When regression
testing must be done is covered in a separate section. The reader is expected to
be able to decide on regression testing and to write test cases.

19.1 INTRODUCTION

Regression testing means rerunning test cases from existing test suites to build
confidence that software changes have no unintended side-effects. The “ideal”
process would be to create an extensive test suite and run it after each and
every change. Unfortunately, for many projects this is just impossible because
test suites are too large, because changes come in too fast, because humans are
in the testing loop, because scarce, highly in-demand simulation laboratories
are needed, or because testing must be done on many different hardware and
OS platforms.
Researchers have tried to make regression testing more effective and efficient by
developing regression test selection (RTS) techniques, but many problems
remain, such as:
• Unpredictable performance. RTS techniques sometimes save time and
money, but they sometimes select most or all of the original test cases.
Thus, developers using RTS techniques can find themselves worse off for
having done so.
• Incompatible process assumptions. Testing time is often limited (e.g.,
must be done overnight). RTS techniques do not consider such
constraints and, therefore, can and do select more test cases than can be
run.
• Inappropriate evaluation models. RTS techniques try to maximize
average regression testing performance rather than optimize aggregate
performance over many testing sessions. However, companies that test
frequently might accept less effective, but cheaper individual testing

sessions if the system would, nonetheless, be well-tested over some short
period of time.
These and other issues have not been adequately considered in current
research, yet they strongly affect the applicability of proposed regression testing
processes. Moreover, we believe that solutions to these problems can be
exploited, singly and in combination, to dramatically improve the costs and
benefits of the regression testing process.

Figure 19.1 Agile Testing Life-cycle


Figure 19.1 illustrates the regression testing performed on relational
databases in a testing life cycle.

19.2 WHAT IS REGRESSION TESTING?

Software undergoes constant changes. Such changes are necessitated


because of defects to be fixed, enhancements to be made to functionality, or

new functionality to be added. Anytime such changes are made, it is important to
ensure that
1. The changes or additions work as designed; and
2. The changes or additions do not break something that is already working
and should continue to work.
Regression testing is designed to address the above two purposes. Let us
illustrate this with a simple example.
Assume that in a given release of a product, there were three defects –
D1, D2, and D3. When these defects are reported, presumably the development
team will fix these defects and the testing team will perform tests to ensure that
these defects are indeed fixed. When the customers start using the product
(modified to fix defects D1, D2, and D3), they may encounter new defects, D4 and D5.
Again, the development and testing teams will fix and test these new defect
fixes. But, in the process of fixing D4 and D5, as an unintended side-effect, D1
may resurface. Thus, the testing team should not only ensure that the fixes
take care of the defects they are supposed to fix but also that they do not break
anything else that was already working.
Regression testing enables the test team to meet this objective.
Regression testing is important in today's context since software is being
released very often to keep up with the competition and increasing customer
awareness. It is essential to make quick and frequent releases and also deliver
stable software. Regression testing ensures that any new feature introduced to
the existing product does not adversely affect the current functionality.
Regression testing follows a selective re-testing technique. Whenever
defect fixes are done, a set of test cases that need to be run to verify the defect
fixes is selected by the test team. An impact analysis is done to find out what areas
may get impacted by those defect fixes. Based on the impact analysis, some
more test cases are selected to take care of the impacted areas. Since this
testing technique focuses on reuse of existing test cases that have already been
executed, the technique is called selective re-testing. There may be situations
where new test cases need to be developed to take care of some impacted areas.
However, by and large, regression testing reuses the test cases that are
available, as it focuses on testing the features that are already available and
tested at least once already.
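A minimal sketch of this selective re-testing idea is given below. The
defect-to-area and area-to-test-case mappings are hypothetical; in practice the
impact analysis behind them is a manual, knowledge-driven activity.

FIX_TO_AREAS = {                 # areas touched by each defect fix
    "D4": {"billing"},
    "D5": {"billing", "reports"},
}

AREA_TO_TESTS = {                # existing, already-executed test cases per area
    "billing": {"TC-101", "TC-102"},
    "reports": {"TC-201"},
    "login":   {"TC-301"},
}

def select_regression_tests(fixed_defects):
    impacted_areas = set()
    for defect in fixed_defects:
        impacted_areas |= FIX_TO_AREAS.get(defect, set())
    selected = set()
    for area in impacted_areas:
        selected |= AREA_TO_TESTS[area]
    return selected              # reuses tests that were already executed

print(select_regression_tests(["D4", "D5"]))   # TC-101, TC-102, TC-201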

Check your progress 1

Define regression testing.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

19.3 TYPES OF REGRESSION TESTING

Before going into the types of regression testing, let us understand what
a "build" means. When internal or external test teams or customers begin using
a product, they report defects. These defects are analyzed by each developer
who makes individual defect fixes. The developers then do appropriate unit
testing and check the defect fixes into a Configuration Management (CM)
System. The source code for the complete product is then compiled and these
defect fixes along with the existing features get consolidated into the build. A
build thus becomes an aggregation of all the defect fixes and features that are
present in the product.
There are two types of regression testing in practice.
1. Regular regression testing
2. Final regression testing
A regular regression testing is done between test cycles to ensure that the
defect fixes that are done and the functionality that were working with the
earlier test cycles continue to work. A regular regression testing can use more
than one product build for the test cases to be executed.
A "final regression testing" is done to validate the final build before
release. The CM engineer delivers the final build with the media and other
contents exactly as it would go to the customer. The final regression test cycle
is conducted for a specific period of duration, which is mutually agreed upon
between the development and testing teams. This is called the" cook time" for
regression testing. Cook time is necessary to keep testing the product for
certain duration, since some of the defects (for example, Memory leaks) can be
unearthed only after the product has been used for certain time duration. The
product is continuously exercised for the complete duration of the cook time to
ensure that such time-bound defects are identified. Some of the test cases are
repeated to find out whether there are failures in the final product that will
reach the customer. All the defect fixes for the release should have been
completed for the build used for the final regression test cycle. The final
regression test cycle is more critical than any other type or phase of testing, as
this is the only testing that ensures the same build of the product that was
tested reaches the customer.

Check your progress 2

What do you mean by final regression testing?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

19.4 WHEN TO DO REGRESSION TESTING?

Whenever changes happen to software, regression testing is done to
ensure that these changes do not adversely affect the existing functionality. A
regular regression testing can use multiple builds for the test cases to be
executed. However, an unchanged build is highly recommended for final
regression testing. The test cases that failed due to the defects should be
included for future regression testing.
Regression testing is done between test cycles to find out if the software
delivered is as good or better than the builds received in the past. As testing
involves large amount of resources (hardware, software, and people), a quick
testing is needed to assess the quality of build and changes to software. Initially
a few test engineers with very few machines do regression testing. This prevents
a huge loss of effort in situations where the defect fixes or build process affects
the existing/working functionality, taking the quality or progress in a negative
direction. These kinds of side-effects need to be fixed immediately before a large
number of people get involved in testing. Regression testing is done whenever
such a requirement arises.
It is necessary to perform regression testing when
1. A reasonable amount of initial testing is already carried out.
2. A good number of defects have been fixed.
3. Defect fixes that can produce side-effects are taken care of.
Regression testing may also be performed periodically, as a pro-active
measure.
A defect tracking system is used to communicate the status of defect
fixes amongst the various stake holders. When a developer fixes a defect, the
defect is sent back to the test engineer for verification using the defect tracking
system. The test engineer needs to take the appropriate action of closing the
defect if it is fixed or reopening it if it has not been fixed properly. In this
process what may get missed out are the side-effects, where a fix would have
fixed the particular defect but some functionality which was working before has
stopped working now. Regression testing needs to be done when a set of defect
fixes are provided. To ensure that there are no side-effects, some more test
cases have to be selected and defect fixes verified in the regression test cycle.
Thus, before a tester can close the defect as fixed, it is important to ensure that
appropriate regression tests are run and the fix produces no side-effects. It is
always a good practice to initiate regression testing and verify the defect fixes.
Else, when there is a side-effect or loss of functionality observed at a later point
of time through testing, it will become very difficult to identify which defect fix
has caused it.
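The sketch below captures this verification step in code form. The status
values and fields are assumptions for illustration only, not those of any
particular defect tracking system.

def verify_fix(defect, fix_works, side_effects_found):
    # Action taken by the test engineer when a defect fix comes back.
    if not fix_works:
        defect["status"] = "reopened"      # the fix did not work
    elif side_effects_found:
        defect["status"] = "reopened"      # something that worked before broke
        defect["note"] = "side-effect found during regression"
    else:
        defect["status"] = "closed"        # fixed, and regression is clean

d1 = {"id": "D1", "status": "fixed"}
verify_fix(d1, fix_works=True, side_effects_found=False)
assert d1["status"] == "closed"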
From the above discussion it is clear that regression testing is both a
planned test activity and a need-based activity and it is done between builds
and test cycles. Hence, regression testing is applicable to all phases in a software
development life cycle (SDLC) and also to component, integration, system, and
acceptance test phases.

19.5 STRATEGIES OF REGRESSION TESTING

Any time you modify an implementation within a program, you should also do
regression testing. You can do so by rerunning existing tests against the
modified code to determine whether the changes break anything that worked
prior to the change and by writing new tests where necessary. Adequate
coverage without wasting time should be a primary consideration when
conducting regression tests. Try to spend as little time as possible doing
regression testing without reducing the probability that you will detect new
failures in old, already tested code.
Some strategies and factors to consider during this process include the
following:
• Test fixed bugs promptly. The programmer might have handled the
symptoms but not have gotten to the underlying cause.
• Watch for side effects of fixes. The bug itself might be fixed but the fix
might create other bugs.
• Write a regression test for each bug fixed.
• If two or more tests are similar, determine which is less effective and get
rid of it.
• Identify tests that the program consistently passes and archive them.
• Focus on functional issues, not those related to design.
• Make changes (small and large) to data and find any resulting
corruption.
• Trace the effects of the changes on program memory.

Check your progress 3

Write about defect tracking system.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

19.6 LET US SUM UP

Check Your Progress: Model Answers


1. Regression testing is important in today's context since software is being
released very often to keep up with the competition and increasing
customer awareness. It is essential to make quick and frequent releases
and also deliver stable software. Regression testing ensures that any new

feature introduced to the existing product does not adversely affect the
current functionality.
2. A "final regression testing" is done to validate the final build before
release. The CM engineer delivers the final build with the media and
other contents exactly as it would go to the customer. The final
regression test cycle is conducted for a specific period of duration, which
is mutually agreed upon between the development and testing teams.
This is called the" cook time" for regression testing.
3. A defect tracking system is used to communicate the status of defect
fixes amongst the various stake holders. When a developer fixes a defect,
the defect is sent back to the test engineer for verification using the
defect tracking system. The test engineer needs to take the appropriate
action of closing the defect if it is fixed or reopening it if it has not been
fixed properly.

LESSON 20

BEST PRACTICES IN REGRESSION TESTING


Contents
20.0 Aims and Objectives
20.1 How to do Regression Testing?
20.1.1 Performing an Initial “Smoke” or “Sanity” Test
20.1.2 Understanding the Criteria for Selecting the Test Cases
20.1.3 Classifying Test Cases
20.1.4 Methodology for Selecting Test Cases
20.1.5 Resetting the Test Cases for Regression Testing
20.1.6 Concluding the Results of Regression Testing
20.2 Best Practices in Regression Testing
20.3 Let Us Sum Up

20.0 AIMS AND OBJECTIVES

In this lesson, we are going to discuss the best practices of regression
testing. In recent years regression testing has been rapidly applied to RDBMS
products. Here we discuss the classification of test cases and the methodologies
for selecting them.

20.1 HOW TO DO REGRESSION TESTING

A well-defined methodology for regression testing is very important, as
this is among the final types of testing that are normally performed just before
release. If regression testing is not done right, it will allow defects to seep
through and may result in customers facing some serious issues not found by
test teams.
There are several methodologies for regression testing that are used by
different organizations. The objective of this section is to explain a methodology
that encompasses the majority of them. The methodology here is made up of the
following steps.
1. Performing an initial "Smoke" or "Sanity" test
2. Understanding the criteria for selecting the test cases
3. Classifying the test cases into different priorities
4. A methodology for selecting test cases
5. Resetting the test cases for test execution
6. Concluding the results of a regression cycle

Check your progress 1

Write down the methods to do regression testing.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------

20.1.1 PERFORMING AN INITIAL “SMOKE” OR “SANITY” TEST

Whenever changes are made to a product, it should first be made sure


that nothing basic breaks. For example, if you are building a database, then
any build of the database should be able to start it up; perform basic operations
such as queries, data definition, and data manipulation; and shut down the
database. In addition, you may want to ensure that the key interfaces to other
products also work properly. This has to be done before performing any of the
other more detailed tests on the product. If, for example, a given build fails to
bring up a database, then it is of no use at all. The code has to be corrected to
solve this (and any other such basic) problem first, before one can even
consider testing other functionality.
Smoke testing consists of
1. Identifying the basic functionality that a product must satisfy;
2. Designing test cases to ensure that this basic functionality works, and
packaging them into a smoke test suite (a minimal sketch of such a
suite follows this list);
3. Ensuring that every time a product is built, this suite is run successfully
before anything else is run; and
4. If this suite fails, escalating to the developers to identify the changes and
perhaps change or roll back the changes to a state where the smoke test
suite succeeds.
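A minimal sketch of such a smoke test suite for the database example is given
below. The db object and its start(), query(), and stop() methods are
hypothetical stand-ins for the product's real interfaces.

SMOKE_TESTS = []

def smoke_test(func):
    # Package the basic checks into a single smoke test suite.
    SMOKE_TESTS.append(func)
    return func

@smoke_test
def database_starts(db):
    assert db.start(), "build unusable: the database does not come up"

@smoke_test
def basic_query_works(db):
    assert db.query("SELECT 1") == [1]

@smoke_test
def database_shuts_down(db):
    assert db.stop()

def run_smoke_suite(db):
    # Run on every build, before any detailed testing; a failure here
    # is escalated to the developers immediately.
    for test in SMOKE_TESTS:
        test(db)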
To make sure that problems in smoke testing are detected up front, some
organizations mandate that anytime a developer makes a change, he or she
should run the smoke test suite successfully on that build before checking the
code into the Configuration Management repository.
Defects in the product can get introduced not only by the code, but also
by the build scripts that are used for compiling and linking the programs.
Smoke testing enables the uncovering of errors introduced by the build
procedures as well. This is important, as research conducted in the past
revealed that 15% of defects are introduced by Configuration Management or
build-related procedures.

20.1.2 UNDERSTANDING THE CRITERIA FOR SELECTING THE TEST
CASES

Having performed a smoke test, the product can be assumed worthy of


being subjected to further detailed tests. The question now is what tests should
be run to achieve the dual objective of ensuring that the fixes work and that
they do not cause unintended side-effects.
There are two approaches to selecting the test cases for a regression run.
First, an organization can choose to have a constant set of regression tests that
are run for every build or change. In such a case, deciding what tests to run is
simple. But this approach is likely to be sub-optimal because
1. In order to cover all fixes, the constant set of tests will encompass all
features, and tests which are not required may be run every time; and
2. A given set of defect fixes or changes may introduce problems for which
there may not be ready-made test cases in the constant set. Hence, even
after running all the regression test cases, the introduced defects may
continue to exist.
A second approach is to select the test cases dynamically for each build
by making judicious choices of the test cases. The selection of test cases for
regression testing requires knowledge of
1. The defect fixes and changes made in the current build;
2. The ways to test the current changes;
3. The impact that the current changes may have on other parts of the
system; and
4. The ways of testing the other impacted parts.
Some of the criteria to select test cases for regression testing are as follows.
1. Include test cases that have produced the maximum defects in the past
2. Include test cases for a functionality in which a change has been made
3. Include test cases in which problems are reported
4. Include test cases that test the basic functionality or the core features of
the product which are mandatory requirements of the customer
5. Include test cases that test the end-to-end behavior of the application or
the product
6. Include test cases to test the positive test conditions
7. Include test cases for areas that are highly visible to the users
When selecting test cases, do not select test cases that are bound
to fail and have little relevance to the defect fixes. Select more positive
test cases than negative test cases for the final regression test cycle. Selecting
negative test cases - that is, test cases introduced afresh with the intent of
breaking the system - may create some confusion with respect to pinpointing

the cause of the failure. It is also recommended that the regular test cycles
before regression testing should have the right mix of both positive and negative
test cases.
The selection of test cases for regression testing depends more on the
impact of defect fixes than the criticality of the defect itself. A minor defect can
result in a major side-effect and a defect fix for a critical defect can have little or
minor side-effect. Hence the test engineer needs to balance these aspects while
selecting test cases for regression testing.
Selecting regression test cases is a continuous process. Each time a set
of regression tests (also called regression test bed) is to be executed, the test
cases need to be evaluated for their suitability, based on the above conditions.

Check your progress 2

How to select a regression test case?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

20.1.3 CLASSIFYING TEST CASES

When the test cases have to be selected dynamically for each regression
run, it would be worthwhile to plan for regression testing from the beginning of
the project, even before the test cycles start. To enable choosing the right tests
for a regression run, the test cases can be classified into various priorities based
on importance and customer usage. As an example, we can classify the test cases
into three categories (a small sketch after the list shows one way to tag them).
 Priority-0 These test cases can be called sanity test cases, which check
basic functionality and are run for accepting the build for further testing.
They are also run when a product goes through a major change. These test
cases deliver a very high project value both to product development
teams and to the customers.
 Priority-1 These test cases use the basic and normal setup and deliver
high project value to both the development team and the customers.
 Priority-2 These test cases deliver moderate project value. They are
executed as part of the testing cycle and selected for regression testing
on a need basis.
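One simple way to tag test cases with these priorities is sketched below; the
names and fields are illustrative only.

from dataclasses import dataclass
from enum import IntEnum

class Priority(IntEnum):
    P0 = 0    # sanity: checks basic functionality, run to accept a build
    P1 = 1    # basic and normal setup, high project value
    P2 = 2    # moderate value, selected for regression on a need basis

@dataclass
class TestCase:
    identifier: str
    priority: Priority

suite = [
    TestCase("TC-001", Priority.P0),
    TestCase("TC-045", Priority.P1),
    TestCase("TC-210", Priority.P2),
]

sanity_suite = [tc for tc in suite if tc.priority == Priority.P0]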

20.1.4 METHODOLOGY FOR SELECTING TEST CASES

Once the test cases are classified into different priorities, the test cases
can be selected. There could be several right approaches to regression testing

which need to be decided on a "case-to-case" basis. There are several
methodologies available in the industry for selecting regression test cases. The
methodology discussed in this section takes into account the criticality and
impact of defect fixes after test cases are classified into several priorities as
explained in the previous section.
Case 1 If the criticality and impact of the defect fixes are low, then it is enough
that a test engineer selects a few test cases from the test case database (TCDB) (a
repository that stores all the test cases that can be used for testing a product)
and executes them. These test cases can fall under any priority (0, 1, or 2).
Case 2 If the criticality and the impact of the defect fixes are medium, then we
need to execute all Priority-0 and Priority-1 test cases. If the defect fixes need
a few additional test cases from Priority-2, then those test cases can also be
selected and used for regression testing. Selecting Priority-2 test cases in this
case is desirable but not necessary.
Case 3 If the criticality and impact of the defect fixes are high, then we need to
execute all Priority-0, Priority-1, and a carefully selected subset of Priority-2
test cases.
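These three cases can be summarized in a small decision function, sketched
below. The mapping of criticality and impact to priority levels is a
simplification of the text, not a standard algorithm.

RANK = {"low": 0, "medium": 1, "high": 2}

def priorities_to_run(criticality, impact):
    # Take the more severe of the two dimensions, per Cases 1-3 above.
    level = max(criticality, impact, key=RANK.__getitem__)
    if level == "low":
        return "a few test cases from the TCDB, of any priority"
    if level == "medium":
        return "all Priority-0 and Priority-1; a few Priority-2 if needed"
    return "all Priority-0 and Priority-1, plus a selected subset of Priority-2"

print(priorities_to_run("low", "high"))   # the high impact dominates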
The above methodology requires that the impact of defect fixes be
analyzed for all defects. This can be a time-consuming procedure. If, for some
reason, there is not enough time and the risk of not doing an impact analysis is
low, then the alternative methodologies given below can be considered.
 Regress all For regression testing, all priority 0, 1, and 2 test cases are
rerun. This means all the test cases in the regression test bed/ suite are
executed.
 Priority based regression For regression testing based on this priority,
all priority 0, 1, and 2 test cases are run in order, based on the
availability of time. Deciding when to stop the regression testing is based
on the availability of time.
 Regress changes For regression testing using this methodology code
changes are compared to the last cycle of testing and test cases are
selected based on their impact on the code (gray box testing)
 Random regression Random test cases are selected and executed for
this regression methodology.
 Context based dynamic regression A few Priority-0 test cases are
selected, and based on the context created by the analysis and outcome of
those test cases after execution (for example, new defects found or
boundary-value issues), additional related test cases are selected for
continuing the regression testing.
An effective regression strategy is usually a combination of all of the
above and not necessarily any of these in isolation.

Check your progress 3

What do you mean by context based dynamic regression?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

20.1.5 RESETTING THE TEST CASES FOR REGRESSION TESTING

After selecting the test cases using the above methodology, the next step
is to prepare the test cases for execution. For proceeding with this step, a "test
case result history" is needed.
In a large product release involving several rounds of testing, it is very
important to record what test cases were executed in which cycle, their results,
and related information. This is called test case result history. This is part of
the test case database.
In many organizations, not all the types of testing or all the test cases are
repeated for each cycle. As mentioned, test case result history provides a wealth
of information on what test cases were executed and when. A method or
procedure that uses the test case result history to indicate which test cases are
to be selected for regression testing is called a reset procedure. Resetting a test
case is nothing but setting a flag, called not run or execute again, in the test
case database (TCDB). The reset procedure also hides the test case results of
previous builds for the test cases, so that the test engineer executing the test
cases may not be biased by the result history.
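A minimal sketch of such a reset procedure against a TCDB is shown below; the
record layout is an assumption, but it captures the two effects described:
flagging the test case for re-execution and hiding its result history from the
tester.

tcdb = {
    "TC-101": {"state": "rerun", "history": ["pass", "pass"]},
    "TC-102": {"state": "rerun", "history": ["fail", "pass"]},
}

def reset(test_case_id):
    record = tcdb[test_case_id]
    record["state"] = "reset"                         # not run / execute again
    record["hidden_history"] = record.pop("history")  # removed to avoid bias
    record["history"] = []

def visible_history(test_case_id):
    # The executing engineer sees no past results for a reset test case.
    return tcdb[test_case_id]["history"]

reset("TC-102")
assert visible_history("TC-102") == []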
Resetting test cases reduces the risk involved in testing defect fixes by
making the testers go through all the test cases and selecting appropriate test
cases based on the impact of those defect fixes. If there are defect fixes that are
done just before the release, the risk is more; hence, more test cases have to be
selected.
Resetting of test cases is not expected to be done often, and it needs to
be done with the following considerations in mind.
1. When there is a major change in the product.
2. When there is a change in the build procedure which affects the product.
3. Large release cycle where some test cases were not executed for a long
time.
4. When the product is in the final regression test cycle with a few selected
test cases.

5. Where there is a situation, that the expected results of the test cases
could be quite different from the previous cycles.
6. The test cases relating to defect fixes and production problems need to be
evaluated release after release. In case they are found to be working fine,
they can be reset.
7. Whenever existing application functionality is removed, the related test
cases can be reset.
8. Test cases that consistently produce a positive result can be removed.
9. Test cases relating to a few negative test conditions (not producing any
defects) can be removed.
When the above guidelines are not met, we may want to rerun the test
cases rather than reset the results of the test cases. There are only a few
differences between the rerun and reset states in test cases. In both instances,
the test cases are executed but in the case of "reset" we can expect a different
result from what was obtained in the earlier cycles. In the case of rerun, the
test cases are expected to give the same test result as in the past; hence, the
management need not be unduly worried because those test cases are executed
as a formality and are not expected to reveal any major problem.
Test cases belonging to the "rerun" state help to gain confidence in the
product by testing for more time. Such test cases are not expected to fail or
affect the release. Test cases belonging to the "reset" state say that the test
results can be different from the past, and only after these test cases are
executed can we know the result of regression and the release status.
For example, if there is a change in the installation of a product, which
does not affect product functionality, then the change can be tested
independently by rerunning some test cases and the test cases do not have to
be "reset." Similarly, if there is a functionality that underwent a major change
(design or architecture or code revamp), then all the related test cases for that
functionality need to be "reset," and these test cases have to be executed again.
By resetting test cases, the test engineer has no way of knowing their past
results. This removes bias and forces the test engineer to pick up those test
cases and execute them.
A rerun state in a test case indicates low risk, and a reset status
represents medium to high risk for a release. Hence, close to the product
release, it is a good practice to execute the "reset" test cases first before
executing the "rerun" test cases.
Reset is also decided on the basis of the stability of the functionalities. If
you are in Priority-1 and have reached a stage of comfort level in Priority-0 (say,
for example, more than a 95% pass rate), then you do not reset Priority-0 test
cases unless there is a major change. This is true with Priority-1 test cases
when you are in the Priority-2 test phase.
We will now see illustrate the use of the "reset" flag for regression testing
in the various phases.

Component test cycle phase Regression testing between component test
cycles uses only Priority-0 test cases. For each build that enters the test, the
build number is selected and all test cases in Priority-0 are reset. The test cycle
starts only if all Priority-0 test cases pass.
Integration testing phase After component testing is over, if regression is
performed between integration test cycles, Priority-0 and Priority-1 test cases are
executed. Priority-1 testing can use multiple builds. In this phase, the test cases
are "reset" only if the criticality and impact of the defect fixes and feature
additions are high. A "reset" procedure during this phase may affect all Priority-
0 and Priority-1 test cases.
System test phase Priority-2 testing starts after all test cases in Priority-1 are
executed with an acceptable pass percentage as defined in the test plan. In this
phase, the test cases are "reset" only if the criticality and impact of the defect
fixes and feature additions are very high. A "reset" procedure during this phase
may affect Priority-0, Priority-1, and Priority-2 test cases.
Why reset test cases Regression testing uses a good number of test cases
which have already been executed and are associated with some results and
assumptions on the result. A "reset" procedure gives a clear picture of how
much of testing still remains, and reflects the status of regression testing.
If test cases are not "reset," then the test engineers tend to report a
completion rate and other results based on previous builds. This is because of
the basic assumption that multiple builds are used in each phase of the testing
and a gut feeling that if something passed in the past builds, it will pass in
future builds also. Regression testing does not go with an assumption that
"Future is an extension of the past." Resetting as a procedure removes any bias
towards test cases because resetting test case results prevents the history of
test cases being viewed by testers.

20.1.6 CONCLUDING THE RESULTS OF REGRESSION TESTING

Apart from test teams, regression test results are monitored by many
people in an organization as it is done after test cycles and sometimes very
close to the release date. Developers also monitor the results from regression as
they would like to know how well their defect fixes work in the product. Hence,
there is a need to understand a method for concluding the results of regression.
Since regression uses test cases that have already been executed more than
once, it is expected that 100% of those test cases pass using the same build, if
the defect fixes are done right. In situations where the pass percentage is not 100,
the test manager can compare the current result of each test case with its
previous results to conclude whether regression was successful or not, using
rules such as the following (a small sketch after the list summarizes them).
 If the result of a particular test case was a pass using the previous builds
and a fail in the current build, then regression has failed. A new build is
required and the testing must start from scratch after resetting the test
cases.

 If the result of a particular test case was a fail using the previous builds
and a pass in the current build, then it is safe to assume the defect fixes
worked.
 If the result of a particular test case was a fail using the previous builds
and a fail in the current build and if there are no defect fixes for this
particular test case, it may mean that the result of this test case should
not be considered for the pass percentage. This may also mean that such
test cases should not be selected for regression.
 If the result of a particular test case is a fail using the previous builds
but works with a documented workaround and if you are satisfied with
the workaround, then it should be considered as a pass for both the system
test cycle and the regression test cycle.
 If you are not satisfied with the workaround, then it should be
considered as a fail for a system test cycle but may be considered as a
pass for regression test cycle.
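The sketch below condenses these rules into a single function; the input values
and verdict strings are illustrative only.

def conclude(prev, curr, has_fix=True, workaround_ok=None):
    # prev/curr are "pass" or "fail" for the previous and current builds.
    if prev == "pass" and curr == "fail":
        return "regression failed: new build needed; reset and restart"
    if prev == "fail" and curr == "pass":
        return "defect fix worked"
    if prev == "fail" and curr == "fail" and not has_fix:
        return "exclude from pass percentage; reconsider selecting this test"
    if prev == "fail" and workaround_ok is True:
        return "pass for both the system test cycle and the regression cycle"
    if prev == "fail" and workaround_ok is False:
        return "fail for the system test cycle; may be a pass for regression"
    return curr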

20.2 BEST PRACTICES IN REGRESSION TESTING

Regression methodology can be applied when


1. We need to assess the quality of product between test cycles (both
planned and need based);
2. We are doing a major release of a product, have executed all test cycles,
and are planning a regression test cycle for defect fixes; and
3. We are doing a minor release of a product (support packs, patches, and
so on) having only defect fixes, and we can plan for regression test cycles
to take care of those defect fixes.
There can be multiple cycles of regression testing that can be planned for
every release. This applies if defect fixes come in phases or to take care of some
defect fixes not working with a specific build.
When assigning a fail result to a test case during test execution, it is a
good practice to enter the defect identifier(s) (from the defect tracking system)
along with it, so that you will know which test cases are to be executed when a
defect fix arrives. Please note that there can be multiple defects that come out of a
particular test case and a particular defect can affect more than one test case.
Even though ideally one would like to have a mapping between test cases
and defects, the choice of test cases that are to be executed for taking care of
side-effects of defect fixes may still remain largely a manual process as this
requires knowledge of the interdependences amongst the various defect fixes.
As time passes and with each release of the product, the number of
regression test cases to be executed grows. It has been found that some of
the defects reported by customers in the past were due to last-minute defect
fixes creating side-effects. Hence, selecting the test case for regression testing is
really an art and not that easy. To add to this complexity, most people want
maximum returns with minimum investment on regression testing.
To solve this problem, as and when there are changes made to a product,
regression test cases are added or removed from an existing suite of test cases.
This suite of test cases, called regression suite or regression test bed, is run
when a new change is introduced to an application or a product. The

automated test cases in the regression test bed can be executed along with
nightly builds to ensure that the quality of the product is maintained during
product development phases.
It was mentioned earlier that the knowledge of defects, products, their
interdependences and a well-structured methodology are all very important to
select test cases. These points stress the need for selecting the right person for
the right job. The most experienced person in the team or the most talented
person in the team may do a much better job of selecting the right test cases for
regression than someone with less experience. Experience and talent can bring
in knowledge of fragile areas in the product and of the impact analysis of defects.
Consider two pictures that illustrate the point. In the first picture, a tiger
has been put in a cage to prevent harm to humankind. In the second picture, the
members of a family lie inside a mosquito net as protection from mosquitoes.
The same strategy has to be adopted for regression. Like the tiger in the
cage, all defects in the product have to be identified and fixed. This is what
"detecting defects in your product" means. All the testing types discussed in the
earlier chapters and regression testing adopt this technique to find each defect
and fix it.
The photograph of the family under the mosquito net signifies "protecting
your product from defects." The strategy followed here is of defect prevention.
There are many verification and quality assurance activities such as reviews
and inspections that try to do this.
Another aspect related to regression testing is "protecting your product
from defect fixes." As discussed earlier, a defect that is classified as a minor
defect may create a major impact on the product when it gets fixed into the
code. It is similar to what a mosquito can do to humans (impact), even though
its size is small. Hence, it is a good practice to analyze the impact of defect
fixes, irrespective of size and criticality, before they are incorporated into the
code. The analysis of an impact due to defect fixes is difficult due to lack of time
and the complex nature of the products. Hence, it is a good practice to limit the
amount of changes in the product when close to the release date. This will
prevent the product from defects that may seep in through the defect fixes
route, just as mosquitoes can get into the mosquito net through a small hole
there. If you make a hole for a mosquito to get out of the net, it also opens the
doors for new mosquitoes to come into the net. Fixing a problem without
analyzing the impact can introduce a large number of defects in the product.
Hence, it is important to insulate the product from defects as well as defect
fixes.
If defects are detected and the product is protected from defects and
defect fixes, then regression testing becomes effective and efficient.

20.3 LET US SUM UP

Check Your Progress: Model Answers


1.
• Performing an initial "Smoke" or "Sanity" test
• Understanding the criteria for selecting the test cases
• Classifying the test cases into different priorities

• A methodology for selecting test cases
• Resetting the test cases for test execution
• Concluding the results of a regression cycle
2. Selecting regression test cases is a continuous process. Each time a set
of regression tests (also called regression test bed) is to be executed, the
test cases need to be evaluated for their suitability, based on the above
conditions.
3. Context based dynamic regression A few Priority-0 test cases are
selected, and based on the context created by the analysis and outcome of
those test cases after execution (for example, new defects found or
boundary-value issues), additional related test cases are selected for
continuing the regression testing.

UNIT - V
LESSON 21

TEST PLANNING
Contents
21.0 Aims and Objectives
21.1 Test Planning
21.1.1 Preparing a Test Plan
21.1.2 Scope Management: Deciding Features to be Tested/Not Tested
21.1.3 Deciding Test Approach/Strategy
21.1.4 Setting up Criteria for Testing
21.1.5 Identifying Responsibilities, Staffing and Training Needs
21.1.6 Identifying Resource Requirements
21.1.7 Identifying Test Deliverables
21.1.8 Testing Tasks: Size and Effort Estimation
21.1.9 Activity Breakdown and Scheduling
21.1.10 Communication Management
21.1.11 Risk Management
21.2 Let Us Sum Up

21.0 AIMS AND OBJECTIVES

In this chapter, we will look at some of the project management aspects


of testing. The Project Management Institute [PMI-2004] defines a project
formally as a temporary endeavor to create a unique product or service. This
means that every project has a definite beginning and a definite end and that
the product or service is different in some distinguishing way from all similar
products or services.

21.1 TEST PLANNING

Testing is integrated into the endeavor of creating a given product or
service; each phase and each type of testing has different characteristics, and
what is tested in each version could be different. Hence, testing satisfies this
definition of a project fully.
Given that testing can be considered as a project on its own, it has to be
planned, executed, tracked, and periodically reported on. We will look at test
planning aspects in the next section. We will then look into the process that
drives a testing project. Subsequently, we will look at the execution of tests and
the various types of reporting that take place during a testing project. We will

conclude this chapter by sharing some of the best practices in test management
and execution.

Check your progress 1

Define a Test Plan.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

21.1.1 PREPARING A TEST PLAN

Testing - like any project - should be driven by a plan. The test plan acts
as the anchor for the execution, tracking, and reporting of the entire testing
project and covers
1. What needs to be tested - the scope of testing, including clear
identification of what will be tested and what will not be tested;
2. How the testing is going to be performed - breaking down the testing into
small and manageable tasks and identifying the strategies to be used for
carrying out the tasks;
3. What resources are needed for testing - computer as well as human
resources;
4. The time lines by which the testing activities will be performed; and
5. Risks that may be faced in all of the above, with appropriate mitigation
and contingency plans (a skeleton sketch follows this list).
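As an illustration, the skeleton below sketches these five elements as a simple
data structure; the field names and sample values are hypothetical.

test_plan = {
    "scope": {
        "features_to_test": ["costing options", "supply chain integration"],
        "features_not_to_test": ["legacy report formats"],
    },
    "approach": {
        "tasks": ["unit", "integration", "performance", "acceptance"],
        "strategy": {"performance": "automated load simulation"},
    },
    "resources": {
        "human": ["1 test lead", "3 test engineers"],
        "hardware_software": ["2 test servers", "load generation tool"],
    },
    "timelines": {"start": "week 1", "end": "week 6"},
    "risks": [
        {"risk": "late build delivery",
         "mitigation": "prioritize Priority-0 test cases",
         "contingency": "negotiate an extended test window"},
    ],
}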

21.1.2 SCOPE MANAGEMENT: DECIDING FEATURES TO BE TESTED/NOT TESTED

As was explained in the earlier chapters, various testing teams do testing for
various phases of testing. One single test plan can be prepared to cover all
phases and all teams or there can be separate plans for each phase or for each
type of testing. For example, there need to be plans for unit testing, integration
testing, performance testing, and acceptance testing, and so on. They can all be
part of a single plan or could be covered by multiple plans. In situations where
there are multiple test plans, there should be one test plan, which covers the
activities common for all plans. This is called the master test plan.
Scope management pertains to specifying the scope of a project. For
testing, scope management entails
1. Understanding what constitutes a release of a product;
2. Breaking down the release into features;

3. Prioritizing the features for testing;
4. Deciding which features will be tested and which will not be; and
5. Gathering details to prepare for estimation of resources for testing.
It is always good to start from the end-goal or product-release perspective
and get a holistic picture of the entire product to decide the scope and priority
of testing. Usually, during the planning stages of a release, the features that
constitute the release are identified. For example, a particular release of an
inventory control system may introduce new features to automatically integrate
with supply chain management and to provide the user with various options of
costing. The testing teams should get involved early in the planning cycle and
understand the features. Knowing the features and understanding them from
the usage perspective will enable the testing team to prioritize the features for
testing.
The following factors drive the choice and prioritization of features to be
tested.
Features that are new and critical for the release The new features of a
release set" the expectations of the customers and must perform properly.
These new features result in new program code and 'thus have a higher
susceptibility and exposure to defects. Furthermore, these are likely to be areas
where both the development and testing teams will have to go through a
learning curve. Hence, it makes sense to put these features on top of the
priority list to be tested. This will ensure that these key features get enough
planning and learning time for testing and do not go out with inadequate
testing. In order to get this prioritization right, the product marketing team and
some select customers participate in identification of the features to be tested.
Features whose failures can be catastrophic Regardless of whether a feature
is new or not, any feature the failure of which can be catastrophic or produce
adverse business impact has to be high on the list of features to be tested. For
example, recovery mechanisms in a database will always have to be among the
most important features to be tested.
Features that are expected to be complex to test Early participation of the
testing team can help identify features that are difficult to test. This can help in
starting the work on these features early and line up appropriate resources in
time.
Features which are extensions of earlier features that have been defect
prone As we have seen in the lessons on regression testing, certain areas of code
tend to be defect prone and such areas need very thorough testing so that old
defects do not creep in again. Such features that are defect prone should be
included ahead of more stable features for testing.
A product is not just a heterogeneous mixture of these features. These
features work together in various combinations and depend on several
environmental factors and execution conditions. The test plan should clearly
identify these combinations that will be tested.

Given the limitations on resources and time, it is likely that it will not be
possible to test all the combinations exhaustively. During planning time, a test
manager should also consciously identify the features or combinations that will
not be tested. This choice should balance the requirements of time and
resources while not exposing the customers to any serious defects. Thus, the
test plan should contain clear justifications of why certain combinations will
not be tested and what are the risks that may be faced by doing so.
Check your progress 2
Write notes on Scope Management.
Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

21.1.3 DECIDING TEST APPROACH/STRATEGY

Once we have this prioritized feature list, the next step is to drill down
into some more details of what needs to be tested, to enable estimation of size,
effort, and schedule. This includes identifying
1. What type of testing would you use for testing the functionality?
2. What are the configurations or scenarios for testing the features?
3. What integration testing would you do to ensure these features work
together?
4. What localization validations would be needed?
5. What "non-functional" tests would you need to do?
We have discussed various types of tests in earlier chapters of this book.
Each of these types has applicability and usefulness under certain conditions.
The test approach/strategy part of the test plan identifies the right type of
testing to effectively test a given feature or combination. The test strategy or
approach should result in identifying the right type of test for each of the
features or combinations. There should also be objective criteria for measuring
the success of a test.

21.1.4 SETTING UP CRITERIA FOR TESTING

As we have discussed in earlier chapters (especially chapters on system


and acceptance testing), there must be clear entry and exit criteria for different
phases of testing. The test strategies for the various features and combinations
determine how these features and combinations would be tested. Ideally, tests
must be run as early as possible so that the last-minute pressure of running

tests after development delays (see the section on Risk Management below) is
minimized. However, it is futile to run certain tests too early. The entry criteria
for a test specify threshold criteria for each phase or type of test. There may
also be entry criteria for the entire testing activity to start. The completion/exit
criteria specify when a test cycle or a testing activity can be deemed complete.
Without objective exit criteria, it is possible for testing to continue beyond the
point of diminishing returns.
A test cycle or a test activity will not be an isolated, continuous activity
that can be carried out in one go. It may have to be suspended at various points
of time because it is not possible to proceed further. When it is possible to
proceed further, it will have to be resumed. Suspension criteria specify when a
test cycle or a test activity can be suspended. Resumption criteria specify when
the suspended tests can be resumed. Some of the typical suspension criteria
include
1. Encountering more than a certain number of defects, causing frequent
stoppage of testing activity
2. Hitting show stoppers that prevent further progress of testing (for
example, if a database does not start, further tests of query, data
manipulation, and so on are simply not possible to execute); and
3. Developers releasing a new version which they advise should be used in
lieu of the product under test (because of some critical defect fixes).
When such conditions are addressed, the tests can resume.
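
To make such criteria objective, they can be reduced to simple checks. The
following is a minimal sketch in Python; the threshold value and the
parameter names are illustrative assumptions, not prescribed values.

# Hypothetical sketch: objective suspension and resumption checks.
MAX_OPEN_DEFECTS = 25   # assumed limit beyond which testing is suspended

def should_suspend(open_defects, show_stopper_open, new_build_advised):
    """Return True if any of the suspension criteria above is met."""
    return (open_defects > MAX_OPEN_DEFECTS   # frequent stoppages
            or show_stopper_open              # e.g., database does not start
            or new_build_advised)             # developers advise a new version

def can_resume(open_defects, show_stopper_open, new_build_installed):
    """Resumption criteria: the suspending conditions have been addressed."""
    return (open_defects <= MAX_OPEN_DEFECTS
            and not show_stopper_open
            and new_build_installed)

# A single show stopper suspends the cycle until it is fixed.
assert should_suspend(open_defects=5, show_stopper_open=True,
                      new_build_advised=False)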

21.1.5 IDENTIFYING RESPONSIBILITIES, STAFFING AND TRAINING NEEDS
Scope management identifies what needs to be tested. The test strategy
outlines how to do it. The next aspect of planning is the who part of it.
Identifying responsibilities, staffing, and training needs addresses this aspect.
A testing project requires different people to play different roles. As
discussed in the previous two chapters, there are the roles of test engineers,
test leads, and test managers. There is also role definition on the dimensions of
the modules being tested or the type of testing. These different roles should
complement each other. The different role definitions should
1. Ensure there is clear accountability for a given task, so that each person
knows what he or she has to do;
2. Clearly list the responsibilities for various functions to various people, so
that everyone knows how his or her work fits into the entire project;
3. Complement each other, ensuring no one steps on another's toes; and
4. Supplement each other, so that no task is left unassigned.
Role definitions should not only address technical roles, but also list the
management and reporting responsibilities. This includes frequency, format,
and recipients of status reports and other project-tracking mechanisms. In
addition, responsibilities in terms of SLAs for responding to queries should also
be addressed during the planning stage.
Staffing is done based on estimation of effort involved and the availability
of time for release. In order to ensure that the right tasks get executed, the
features and tasks are prioritized on the basis of effort, time, and importance.
People are assigned to tasks that achieve the best possible fit between
the requirements of the job and skills and experience levels needed to perform
that function. It may not always be possible to find the perfect fit between the
requirements and the skills available. In case there are gaps between the
requirements and availability of skills, they should be addressed with
appropriate training programs. It is important to plan for such training
programs upfront as they are usually de-prioritized under project pressures.

21.1.6 IDENTIFYING RESOURCE REQUIREMENTS

As a part of planning for a testing project, the project manager (or test
manager) should provide estimates for the various hardware and software
resources required. Some of the following factors need to be considered.
1. Machine configuration (RAM, processor, disk, and so on) needed to run
the product under test
2. Overheads required by the test automation tool, if any
3. Supporting tools such as compilers, test data generators, configuration
management tools, and so on
4. The different configurations of the supporting software (for example, OS)
that must be present
5. Special requirements for running machine-intensive tests such as load
tests and performance tests
6. Appropriate number of licenses of all the software
In addition to all of the above, there are also other implied environmental
requirements that need to be satisfied. These include office space, support
functions (like HR), and so on.
Underestimation of these resources can lead to considerable slowing
down of the testing efforts and this can lead to delayed product release and to
de-motivated testing teams. However, being overly conservative and "safe" in
estimating these resources can prove to be unnecessarily expensive. Proper
estimation of these resources requires co-operation and teamwork among
different groups: the product development team, testing team, system
administration team, and senior management.

21.1.7 IDENTIFYING TEST DELIVERABLES

The test plan also identifies the deliverables that should come out of the
test cycle/testing activity. The deliverables include the following, all reviewed
and approved by the appropriate people.
1. The test plan itself (master test plan, and various other test plans for the
project)

2. Test case design specifications
3. Test cases, including any automation that is specified in the plan
4. Test logs produced by running the tests
5. Test summary reports
As we will see in the next section, a defect repository gives the status of
the defects reported in a product life cycle. Part of the deliverables of a test cycle
is to ensure that the defect repository is kept current. This includes entering
new defects in the repository and updating the status of defect fixes after
verification. We will see the contents of some of these deliverables in the later
part of this chapter.

21.1.8 TESTING TASKS: SIZE AND EFFORT ESTIMATION

The scope identified above gives a broad overview of what needs to be
tested. This understanding is quantified in the estimation step. Estimation
happens broadly in three phases.
happens broadly in three phases.
1. Size estimation
2. Effort estimation
3. Schedule estimation
We will cover size estimation and effort estimation in this sub-section
and address schedule estimation in the next sub-section. Size estimate
quantifies the actual amount of testing that needs to be done. Several factors
contribute to the size estimate of a testing project.
Size of the product under test This obviously determines the amount of
testing that needs to be done. The larger the product, in general, the greater the
size of testing to be done. Some of the measures of the size of the product under
test are as follows.
1. Lines of code (LOC) is a somewhat controversial measure as it depends on
the language, style of programming, compactness of programming, and
so on. Furthermore, LOC represents a size estimate only for the coding
phase and not for the other phases such as requirements, design, and so
on. Notwithstanding these limitations, LOC is still a popular measure for
estimating size.
2. A function point (FP) is a popular method to estimate the size of an
application. Function points provide a representation of application size,
independent of programming language. The application features (also
called functions) are classified as inputs, outputs, interfaces, external
data files, and enquiries. These are increasingly complex and hence are
assigned increasingly higher weights. The weighted average of functions
(number of functions of each type multiplied by the weight for that
function type) gives an initial estimate of size or complexity. In addition,
the function point methodology of estimating size also provides for 14
environmental factors such as distributed processing, transaction rate,
and so on.

This methodology of estimating size or complexity of an application is
comprehensive in terms of taking into account realistic factors. The
major challenge in this method is that it requires formal training and is
not easy to use. Furthermore, this method is not directly suited to the
systems software type of projects. (A sketch of the function point
computation appears after this list.)
3. A somewhat simpler representation of application size is the number of
screens, reports, or transactions. Each of these can be further classified
as "simple," "medium," or "complex." This classification can be based on
intuitive factors such as the number of fields on the screen, the number of
validations to be done, and so on.
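
As an illustration of the function point computation mentioned in item 2 of
the list above, the following Python sketch applies the commonly cited
average complexity weights to hypothetical counts. Both the weights and the
counts are assumptions for illustration; a formal count requires the trained
approach described earlier.

AVERAGE_WEIGHTS = {"inputs": 4, "outputs": 5, "enquiries": 4,
                   "internal_files": 10, "external_interfaces": 7}
counts = {"inputs": 20, "outputs": 15, "enquiries": 10,
          "internal_files": 5, "external_interfaces": 3}

# Weighted average of functions: count of each type times its weight.
unadjusted_fp = sum(counts[k] * AVERAGE_WEIGHTS[k] for k in counts)

# The 14 environmental factors (distributed processing, transaction rate,
# and so on) are each rated 0 to 5; their sum adjusts the raw count.
environmental_ratings = [3] * 14            # assumed ratings
value_adjustment = 0.65 + 0.01 * sum(environmental_ratings)

adjusted_fp = unadjusted_fp * value_adjustment
print(unadjusted_fp, round(adjusted_fp, 1))  # 266 and 284.6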
Extent of automation required When automation is involved, the size of work
to be done for testing increases. This is because, for automation, we should first
perform the basic test case design (identifying input data and expected results
by techniques like condition coverage, boundary value analysis, equivalence
partitioning, and so on.) and then scripting them into the programming
language of the test automation tool.
Number of platforms and inter-operability environments to be tested If a
particular product is to be tested under several different platforms or under
several different configurations, then the size of the testing task increases. In
fact, as the number of platforms or touch points across different environments
increases, the amount of testing increases almost exponentially.
All the above size estimates pertain to "regular" test case development.
Estimation of size for regression testing involves considering the changes in the
product and other similar factors.
In order to have a better handle on the size estimate, the work to be done
is broken down into smaller and more manageable parts called work breakdown
structure (WBS) units. For a testing project, WBS units are typically test cases
for a given module, test cases for a given platform, and so on. This
decomposition breaks down the problem domain or the product into simpler
parts and is likely to reduce the uncertainty and unknown factors.
Size estimate is expressed in terms of any of the following.
1. Number of test cases
2. Number of test scenarios
3. Number of configurations to be tested
Size estimate provides an estimate of the actual ground to be covered for
testing. This acts as a primary input for estimating effort. Estimating effort is
important because often effort has a more direct influence on cost than size.
The other factors that drive the effort estimate are as follows.
Productivity data Productivity refers to the speed at which the various
activities of testing can be carried out. This is based on historical data available
in the organization. Productivity data can be further classified into the number
of test cases that can be developed per day (or some unit time), the number of
test cases that can be run per day, the number of pages of
documentation that can be tested per day, and so on. Having these fine-grained
productivity data enables better planning and increases the confidence level
and accuracy of the estimates.
Reuse opportunities If the test architecture has been designed keeping reuse
in mind, then the effort required to cover a given size of testing can come down.
For example, if the tests are designed in such a way that some of the earlier
tests can be reused, then the effort of test development decreases.
Robustness of processes Reuse is a specific example of process maturity of an
organization. Existence of well-defined processes will go a long way in reducing
the effort involved in any activity. For example, in an organization with higher
levels of process maturity, there are likely to be
1. Well-documented standards for writing test specifications, test scripts,
and so on;
2. Proven processes for performing functions such as reviews and audits;
3. Consistent ways of training people; and
4. Objective ways of measuring the effectiveness of compliance to processes.
All these reduce the need to reinvent the wheel and thus enable
reduction in the effort involved.
Effort estimate is derived from size estimate by taking the individual
WBS units and classifying them as "reusable," "modifications," and "new
development." For example, if parts of a test case can be reused from existing
test cases, then the effort involved in developing these would be close to zero. If,
on the other hand, a given test case is to be developed fully from scratch, it is
reasonable to assume that the effort would be the size of the test case divided
by productivity.
Effort estimate is given in person days, person months, or person years.
The effort estimate is then translated to a schedule estimate. We will address
scheduling in the next sub-section.
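
The classification described above can be expressed as a small computation.
In the sketch below, each WBS unit carries a size (in test cases) and a
classification; the productivity figure and the discount for modifications are
assumed values, not standard ones.

DEV_PRODUCTIVITY = 5.0   # assumed: test cases developed per person-day

def effort_person_days(wbs_units):
    """Derive effort from size, per the reusable/modification/new split."""
    total = 0.0
    for size, kind in wbs_units:
        if kind == "reusable":
            total += 0.0                              # close to zero effort
        elif kind == "modification":
            total += 0.3 * size / DEV_PRODUCTIVITY    # assumed 30% of new cost
        else:                                         # "new" development
            total += size / DEV_PRODUCTIVITY
    return total

units = [(40, "reusable"), (20, "modification"), (50, "new")]
print(effort_person_days(units))   # 0 + 1.2 + 10.0 = 11.2 person-days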

21.1.9 ACTIVITY BREAKDOWN AND SCHEDULING

Activity breakdown and schedule estimation entail translating the effort
required into specific time frames. The following steps make up this translation.
1. Identifying external and internal dependencies among the activities
2. Sequencing the activities, based on the expected duration as well as on
the dependencies.
3. Identifying the time required for each of the WBS activities, taking into
account the above two factors.
4. Monitoring the progress in terms of time and effort
5. Rebalancing schedules and resources as necessary
During the effort estimation phase, we have identified the effort required
for each of the WBS units, factoring in the effect of reuse. This effort was
expressed in terms of person months. If the effort for a particular WBS unit is
estimated as, say, 40 person months, it is not possible to trade the "persons" for
"months," that is, we cannot indefinitely increase the number of people working
on it, expecting the duration to come down proportionally. As stated, adding
more people to an already delayed project is a sure way of delaying the project
even further. This is because, when new people are added to a project, it
increases the communication overheads and it takes some time for the new
members to gel with the rest of the team. Furthermore, these WBS units cannot
be executed in any random order because there will be dependencies among the
activities. These dependencies can be external dependencies or internal
dependencies. External dependencies of an activity are beyond the control and
purview of the manager/person performing the activity. Some of the common
external dependencies are
1. Availability of the product from developers;
2. Hiring;
3. Training;
4. Acquisition of hardware/software required for training; and
5. Availability of translated message files for testing.
Internal dependencies are fully within the control of the manager/person
performing that activity. For example, some of the internal dependencies could
be the following.
1. Completing the test specification
2. Coding/scripting the tests
3. Executing the tests
The testing activities will also face parallelism constraints that will
further restrict the activities that can be done at a time. For example, certain
tests cannot be run together because of conflicting conditions (for example,
requiring different versions of a component for testing) or a high-end machine
may have to be multiplexed across multiple tests.
Based on the dependencies and the parallelism possible, the test
activities are scheduled in a sequence that helps accomplish the activities in the
minimum possible time, while taking care of all the dependencies.
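
Sequencing activities under such dependency constraints is essentially a
topological sort. The sketch below uses Python's standard graphlib module;
the activity names and dependencies are illustrative.

from graphlib import TopologicalSorter

# Each activity maps to the set of activities it depends on. The first two
# have no mutual dependency and can proceed in parallel, subject to the
# parallelism constraints discussed above.
dependencies = {
    "write test specification": set(),
    "receive build from development": set(),   # an external dependency
    "script the tests": {"write test specification"},
    "execute the tests": {"script the tests",
                          "receive build from development"},
}

order = list(TopologicalSorter(dependencies).static_order())
print(order)   # a valid execution sequence honoring all dependencies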

Check your progress 3

What do you mean by activity breakdown?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

21.1.10 COMMUNICATIONS MANAGEMENT

Communications management consists of evolving and following
procedures for communication that ensure that everyone is kept in sync with
the right level of detail. Since this is intimately connected with test execution
and the progress of the testing project, it is taken up along with test execution
and reporting later.

21.1.11 RISK MANAGEMENT

Just like every project, testing projects also face risks. Risks are events
that could potentially affect a project's outcome. These events are normally
beyond the control of the project manager. Risk management entails
1. Identifying the possible risks;
2. Quantifying the risks;
3. Planning how to mitigate the risks; and
4. Responding to risks when they become a reality.
As some risks are identified and resolved, other risks may surface. Hence,
as risks can happen at any time, risk management is essentially a cycle, which
goes through the above four steps repeatedly.
Risk identification consists of identifying the possible risks that may hit a
project. Although there could potentially be many risks that can hit a project,
the risk identification step should focus on those risks that are more likely to
happen. The following are some of the common ways to identify risks in testing.
1. Use of checklists Over time, an organization may find new gleanings on
testing that can be captured in the form of a checklist. For example, if
during installation testing, it is found that a particular step of the
installation has repeatedly given problems, then the checklist can have
an explicit line item to check that particular problem. When checklists
are used for risk identification, there is also a great risk of the checklist
itself being out of date, thereby pointing to red herrings instead of risks!
2. Use of organizational history and metrics When an organization
collects and analyzes the various metrics, the information can provide
valuable insights into what possible risks can hit a project. For example,
the past effort variance in testing can give pointers to how much
contingency planning is required.
3. Informal networking across the industry The informal networking
across the industry can help in identifying risks that other organizations
have encountered.
Risk quantification deals with expressing the risk in numerical terms.
There are two components to the quantification of risk. One is the probability of
the risk happening and the other is the impact of the risk, if the risk happens.
For example, the occurrence of a low-priority defect may have a high
probability, but a low impact. However, a show stopper may have (hopefully!) a
low probability, but a very high impact (for both the customer and the vendor
organization). To quantify both these into one number, risk exposure is used.
This is defined as the product of risk probability and risk impact. To make
comparisons easy, risk impact is expressed in monetary terms (for example, in
dollars).
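
The computation is straightforward, as the short sketch below shows; the
sample risks, probabilities, and dollar impacts are illustrative assumptions.

risks = [
    ("low-priority defect slips through", 0.8, 1_000),     # high probability
    ("show stopper found at release", 0.05, 200_000),      # high impact
]
for name, probability, impact_dollars in risks:
    exposure = probability * impact_dollars   # risk exposure, as defined above
    print(f"{name}: exposure = ${exposure:,.0f}")
# Prints $800 and $10,000: the show stopper dominates and so deserves
# the stronger mitigation plan.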
Risk mitigation planning deals with identifying alternative strategies to
combat a risk event, should that risk materialize. For example, a couple of
mitigation strategies for the risk of attrition are to spread the knowledge to
multiple people and to introduce organization-wide processes and standards. To
be better prepared to handle the effects of a risk, it is advisable to have multiple
mitigation strategies.
When the above three steps are carried out systematically and in a timely
manner, the organization would be in a better position to respond to the risks,
should the risks become a reality. When sufficient care is not given to these
initial steps, a project may find itself under immense pressure to react to a risk.
In such cases, the choices made may not be the most optimal or prudent, as
the choices are made under pressure.
The following are some of the common risks encountered in testing
projects and their characteristics.
Unclear requirements The success of testing depends a lot on knowing what
the correct expected behavior of the product under test is. When the
requirements to be satisfied by a product are not clearly documented, there is
ambiguity in how to interpret the results of a test. This could result in wrong
defects being reported or in the real defects being missed out. This will, in turn,
result in unnecessary and wasted cycles of communication between the
development and testing teams and consequent loss of time. One way to
minimize the impact of this risk is to ensure upfront participation of the testing
team during the requirements phase itself.
Schedule dependence The schedule of the testing team depends significantly
on the schedules of the development team. Thus, it becomes difficult for the
testing team to line up resources properly at the right time. The impact of this
risk is especially severe in cases where a testing team is shared across multiple
product groups or in a testing services organization. A possible mitigation
strategy against this risk is to identify a backup project for a testing resource.
Such a backup project may be one that could use an additional resource to
speed up execution but would not be unduly affected if the resource were not
available. An example of such a backup project is chipping in for speeding up
test automation.
Insufficient time for testing Throughout the book, we have stressed the
different types of testing and the different phases of testing. Though some of
these types of testing-such as white box testing-can happen early in the cycle,
most of the tests tend to happen closer to the product release. For example,
system testing and performance testing can happen only after the entire
product is ready and close to the release date. Usually these tests are resource
intensive for the testing team and, in addition, the defects that these tests
uncover are challenging for the developers to fix. As discussed in performance
testing chapter, fixing some of these defects could lead to changes in
architecture and design. Carrying out such changes late in the cycle may be
expensive or even impossible. Once the developers fix the defects, the testing
team would have even less time to complete the testing and is under even
greater pressure. The use of the V model to at least shift the test design part of
the various test types to the earlier phases of the project can help in
anticipating the risks of tests failing at each level in a better manner. This in
turn could lead to a reduction in the last-minute crunch. The metric days
needed for release, when captured and calculated properly, can help in planning
the time required for testing better.
"Show stopper" defects When the testing team reports defects, the dev-
elopment team has to fix them. Certain defects which are show stoppers may
prevent the testing team to proceed further with testing, until development fixes
such show stopper defects. Encountering this type of defects will have a double
impact on the testing team: Firstly, they will not be able to continue with the
testing and hence end up with idle time. Secondly, when the defects do get fixed
and the testing team restarts testing, they would have lost valuable time and
will be under tremendous pressure with the deadline being nearer. This risk of
show stopper defects can pose a big challenge to scheduling and resource
utilization of the testing teams. The mitigation strategies for this risk are similar
to those for the risk of dependence on development schedules.
Availability of skilled and motivated people for testing As we saw in the
discussion on People Issues in Testing, hiring and motivating people in testing
is a major challenge.
Hiring, retaining and constant skill upgrade of testers in an organization is
vital. This is especially important for testing functions because of the tendency
of people to look for development positions.
Inability to get a test automation tool Manual testing is error prone and
labor intensive. Test automation alleviates some of these problems. However,
test automation tools are expensive. An organization may face the risk of not
being able to afford a test automation tool. This risk can in turn lead to less
effective and efficient testing as well as more attrition. One of the ways in which
organizations may try to reduce this risk is to develop in-house tools. However,
this approach could lead to an even greater risk of having a poorly written or
inadequately documented in-house tool.
These risks are not only potentially dangerous individually, but even
more dangerous when they occur in tandem. Unfortunately, often, these risks
do happen in tandem! A testing group plans its schedules based on
development schedules, development schedules slip, testing team resources get
into an idle time, pressure builds, schedules slip, and the vicious cycle starts all
over again. It is important that these risks be caught early or before they create
serious impact on the testing teams. Hence, we need to identify the symptoms
for each of these risks. These symptoms and their impacts need to be tracked
closely throughout the project.

21.2 LET US SUM UP

Check Your Progress: Model Answers


1. A test plan documents the strategy that will be used to verify and ensure
that a hardware product or system meets its design specifications and
other requirements. A test plan is usually prepared by or with significant
input from Test Engineers.
Depending on the product and the responsibility of the organization to
which the test plan applies, a test plan may include one or more of the
following:
• Design Verification or Compliance test - to be performed during the
development or approval stages of the product, typically on a small
sample of units.
• Manufacturing or Production test - to be performed during preparation or
assembly of the product in an ongoing manner for purposes of
performance verification and quality control.
• Acceptance or Commissioning test - to be performed at the time of delivery
or installation of the product.
• Service and Repair test - to be performed as required over the service life
of the product.
A complex system may have a high level test plan to address the overall
requirements and supporting test plans to address the design details of
subsystems and components.
Test plan document formats can be as varied as the products and
organizations to which they apply, but there are three major elements of
a test strategy that should be described in the test plan: Test Coverage,
Test Methods, and Test Responsibilities.
2. Scope management pertains to specifying the scope of a project. For
testing, scope management entails
• Understanding what constitutes a release of a product;
• Breaking down the release into features;
• Prioritizing the features for testing;
• Deciding which features will be tested and which will not be; and
• Gathering details to prepare for estimation of resources for testing.
3. Activity breakdown and schedule estimation entail translating the effort
required into specific time frames. The following steps make up this
translation.
• Identifying external and internal dependencies among the activities
• Sequencing the activities, based on the expected duration as well as on
the dependencies.

• Identifying the time required for each of the WBS activities, taking into
account the above two factors.
• Monitoring the progress in terms of time and effort
• Rebalancing schedules and resources as necessary

LESSON 22

TEST MANAGEMENT
Contents
22.0 Aims and Objectives
22.1 Test Management
22.2 Choice of Standards
22.3 Test Infrastructure Management
22.4 Test People Management
22.5 Integration with Product Release
22.6 Let Us Sum Up

22.0 AIMS AND OBJECTIVES

In this lesson, we are going to introduce Test Management and its fundamental
concepts.
At the end of this lesson, you will be able to understand the choice of standards,
test infrastructure management, test people management, and the integration of
testing with a product release.

22.1 TEST MANAGEMENT

In the previous section, we considered testing as a project in its own
right and addressed some of the typical project management issues in testing.
In this section, we will look at some of the aspects that should be taken care of
in planning such a project. These planning aspects are proactive measures that
can have an across-the-board influence on all testing projects.

22.2 CHOICE OF STANDARDS

Standards comprise an important part of planning in any
organization. Standards are of two types: external standards and internal
standards. External standards are standards that a product should comply
with, are externally visible, and are usually stipulated by external consortia.
From a testing perspective, these standards include standard tests supplied by
external consortia and acceptance tests supplied by customers. Compliance to
external standards is usually mandated by external parties.
Internal standards are standards formulated by a testing organization to
bring in consistency and predictability. They standardize the processes and
methods of working within the organization. Some of the internal standards
include
1. Naming and storage conventions for test artifacts;
2. Document standards;

3. Test coding standards; and
4. Test reporting standards.
Naming and storage conventions for test artifacts Every test artifact (test
specification, test case, test results, and so on) has to be named appropriately
and meaningfully. Such naming conventions should enable
1. Easy identification of the product functionality that a set of tests are
intended for; and
2. Reverse mapping to identify the functionality corresponding to a given set
of tests.
This two-way mapping between tests and product functionality through
appropriate naming conventions will enable identification of appropriate tests to
be modified and run when product functionality changes.
In addition to file-naming conventions, the standards may also stipulate
the conventions for directory structures for tests. Such directory structures can
group logically related tests together (along with the related product
functionality). These directory structures are mapped into a configuration
management repository (discussed later in the chapter).
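
As an illustration, suppose an organization adopts the (hypothetical)
convention <module>_<feature>_<serial>.tc for test case files. The Python
sketch below shows how such a convention supports the two-way mapping
described above.

def feature_of(test_name):
    """Reverse mapping: from a test artifact name to its module and feature."""
    module, feature, _serial = test_name.removesuffix(".tc").split("_")
    return module, feature

def tests_for(module, feature, all_tests):
    """Forward mapping: from product functionality to the tests covering it."""
    prefix = f"{module}_{feature}_"
    return [t for t in all_tests if t.startswith(prefix)]

tests = ["billing_invoice_001.tc", "billing_invoice_002.tc",
         "network_login_001.tc"]
print(feature_of("network_login_001.tc"))      # ('network', 'login')
print(tests_for("billing", "invoice", tests))  # both billing invoice tests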
Documentation standards Most of the discussion on documentation and
coding standards pertains to automated testing. In the case of manual testing,
documentation standards correspond to specifying the user and system
responses at the right level of detail that is consistent with the skill level of the
tester.
While naming and directory standards specify how a test entity is
represented externally, documentation standards specify how to capture
information about the tests within the test scripts themselves. Internal
documentation of test scripts is similar to internal documentation of program
code and should include the following.
1. Appropriate header level comments at the beginning of a file that outlines
the functions to be served by the test.
2. Sufficient in-line comments, spread throughout the file, explaining the
functions served by the various parts of a test script. This is especially
needed for those parts of a test script that are difficult to understand or
have multiple levels of loops and iterations.
3. Up-to-date change history information, recording all the changes made to
the test file.
Without such detailed documentation, a person maintaining the test
scripts is forced to rely only on the actual test code or script to guess what the
test is supposed to do or what changes happened to the test scripts. This may
not give a true picture. Furthermore, it may place an undue dependence on the
person who originally wrote the tests.

Test coding standards Test coding standards go one level deeper into the tests
and enforce standards on how the tests themselves are written. The standards
may
1. Enforce the right type of initialization and clean-up that the test should
do to make the results independent of other tests;
2. Stipulate ways of naming variables within the scripts to make sure that
a reader understands consistently the purpose of a variable (for
example, instead of using generic names such as i, j, and so on, the
names can be meaningful, such as network_init_flag);
3. Encourage reusability of test artifacts (for example, all tests should call
an initialization module init_env first, rather than use their own
initialization routines); and
4. Provide standard interfaces to external entities like the operating system,
hardware, and so on. For example, if it is required for tests to spawn
multiple OS processes, rather than have each of the tests directly spawn
the processes, the coding standards may dictate that they should all call
a standard function, say, create_os_process. By isolating the external
interfaces separately, the tests can be reasonably insulated from changes
to these lower-level layers. (A sketch of a test script following these
standards is given below.)
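
The sketch below shows what a test script following these standards might
look like. The helper functions stand in for a shared test library and are
stubbed here so that the example is self-contained; all names are assumptions.

import subprocess
import sys

def init_env():
    """Standard initialization that every test calls first (stub)."""
    return {"initialized": True}

def cleanup_env(env):
    """Standard clean-up, keeping results independent of other tests (stub)."""
    env.clear()

def create_os_process(command):
    """Standard interface isolating tests from OS-level details (stub)."""
    return subprocess.Popen(command)

def test_concurrent_clients():
    env = init_env()
    try:
        # Meaningful variable names (not i or j), per the coding standard.
        client_processes = [
            create_os_process([sys.executable, "-c", "print('client up')"])
            for _ in range(3)
        ]
        exit_codes = [p.wait() for p in client_processes]
        assert all(code == 0 for code in exit_codes)
    finally:
        cleanup_env(env)

test_concurrent_clients()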
Test reporting standards Since testing is tightly interlinked with product
quality, all the stakeholders must get a consistent and timely view of the
progress of tests. Test reporting standards address this issue. They provide
guidelines on the level of detail that should be present in the test reports, their
standard formats and contents, recipients of the report, and so on. We will
revisit this in more detail later in this chapter.
Internal standards provide a competitive edge to a testing organization
and act as a first-level insurance against employee turnover and attrition.
Internal standards help bring new test engineers up to speed rapidly. When
such consistent processes and standards are followed across an organization, it
brings about predictability and increases the confidence level one can have on
the quality of the final product. In addition, any anomalies can be brought to
light in a timely manner.

Check your progress 1

List down various testing internal standards.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

22.3 TEST INFRASTRUCTURE MANAGEMENT

Testing requires a robust infrastructure to be planned up front. This
infrastructure is made up of three essential elements.
1. A test case database (TCDB)
2. A defect repository
3. Configuration management repository and tool
A test case database captures all the relevant information about the test
cases in an organization, including the various entities in such a TCDB and the
attributes of each of those entities.
A defect repository captures all the relevant details of defects reported for
a product. The defect repository is an important vehicle of communication that
influences the work flow within a software organization. It also provides the
base data in arriving at several of the metrics discussed in Metrics and
Measurements. In
particular, most of the metrics classified as testing defect metrics and
development defect metrics are derived out of the data in defect repository.
Yet another infrastructure that is required for a software product
organization (and in particular for a testing team) is a software configuration
management (SCM) repository. An SCM repository (also known as a CM
repository) keeps track of change control and version control of all the
files/entities that make up a software product. A particular case of such
files/entities is test files. Table 22.1 summarizes the information typically
captured in a defect repository.
Table 22.1 Information in a defect repository.

Change control ensures that
1. Changes to test files are made in a controlled fashion and only with
proper approvals.
2. Changes made by one test engineer are not accidentally lost or
overwritten by other changes.
3. Each change produces a distinct version of the file that is recreatable at
any point of time.
4. At any point of time, everyone gets access to only the most recent version
of the test files (except in exceptional cases).
Version control ensures that the test scripts associated with a given release of a
product are base lined along with the product files. Base lining is akin to taking
a snapshot of the set of related files of a version, assigning a unique identifier to
this set. In future, when anyone wants to recreate the environment for the given
release, this label would enable him or her to do so.
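
Conceptually, base lining amounts to freezing a set of file versions under a
unique label, as in the minimal sketch below; the file names and version
numbers are illustrative assumptions.

baselines = {}

def baseline(label, file_versions):
    """Freeze a snapshot of related file versions under a unique label."""
    baselines[label] = dict(file_versions)

current = {"product.c": 12, "smoke_tests.tc": 7, "test_plan.doc": 3}
baseline("RELEASE_2_0", current)

# Later, the label recreates the exact environment of that release.
print(baselines["RELEASE_2_0"])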
The TCDB, defect repository, and SCM repository should complement each
other and work together in an integrated fashion. For example, the defect
repository links the
defects, fixes, and tests. The files for all these will be in the SCM. The meta data
about the modified test files will be in the TCDB. Thus, starting with a given
defect, one can trace all the test cases that test the defect (from the TCDB) and
then find the corresponding test case files and source files from the SCM
repository.
Similarly, in order to decide which tests to run for a given regression run (a
sketch of this selection logic appears after the list below),
1. The defects recently fixed can be obtained from the defect repository and
tests for these can be obtained from the TCDB and included in the
regression tests.
2. The list of files changed since the last regression run can be obtained
from the SCM repository and the corresponding test files traced from the
TCDB.
3. The set of tests not run recently can be obtained from the TCDB and
these can become potential candidates to be run at certain frequencies.
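
The selection logic in the list above can be sketched as follows. The three
repositories are modeled as plain dictionaries, which is an assumption; real
systems would query a defect tracker, a TCDB, and an SCM tool.

defect_repo = {"D101": "fixed", "D102": "open"}          # defect -> status
tcdb = {"D101": ["billing_invoice_001.tc"],              # defect -> tests
        "src/billing.c": ["billing_invoice_002.tc"]}     # file -> tests
scm_changed_files = ["src/billing.c"]                    # since last run

regression_tests = set()

# 1. Tests for recently fixed defects.
for defect, status in defect_repo.items():
    if status == "fixed":
        regression_tests.update(tcdb.get(defect, []))

# 2. Tests for files changed since the last regression run.
for changed_file in scm_changed_files:
    regression_tests.update(tcdb.get(changed_file, []))

print(sorted(regression_tests))   # both billing tests are selected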

Check your progress 2

What do you mean by version control?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

22.4 TEST PEOPLE MANAGEMENT

People management is an integral part of any project management.
Often, it is a difficult chasm for engineers-turned-managers to cross. As an
individual contributor, a person relies only on his or her own skills to
accomplish an assigned activity; the person is not necessarily trained on how to
document what needs to be done so that it can be accomplished by someone
else. Furthermore, people management also requires the ability to hire,
motivate, and retain the right people. These skills are seldom formally taught
(unlike technical skills). Project managers often learn these skills in a "sink or
swim" mode, being thrown head-on into the task.
Most of the above gaps in people management apply to all types of
projects. Testing projects present several additional challenges. We believe that
the success of a testing organization (or an individual in a testing career)
depends vitally on judicious people management skills. Since the people and
team-building issues are significant enough to be considered in their own right,
we have covered these in detail, on People Issues in Testing, and on
Organization Structures for Testing Teams. These chapters address issues
relevant to building and managing a good global testing team that is effectively
integrated into product development and release.
We would like to stress that these team-building exercises should be
ongoing and sustained, rather than be done in one burst. The effects of these
exercises tend to wear out under the pressure of deadlines of delivery and
quality. Hence, they need to be periodically recharged. The important point is
that the common goals and the spirit of teamwork have to be internalized by all
the stakeholders. Once this internalization is achieved, then they are unlikely to
be swayed by operational hurdles that crop up during project execution. Such
an internalization and upfront team building has to be part of the planning
process for the team to succeed.

22.5 INTEGRATION WITH PRODUCT RELEASE

Ultimately, the success of a product depends on the effectiveness of
integration of the development and testing activities. These job functions have
to work in tight unison between themselves and with other groups such as
product support, product management, and so on. The schedules of testing
have to be linked directly to product release. Thus, project planning for the
entire product should be done in a holistic way, encompassing the project plan
for testing and development. The following are some of the points to be decided
for this planning.
1. Sync points between development and testing as to when different types
of testing can commence. For example, when integration testing could
start, when system testing could start and so on. These are governed by
objective entry criteria for each phase of testing (to be satisfied by
development).
2. Service level agreements between development and testing as to how long
it would take for the testing team to complete the testing. This will
ensure that testing focuses on finding relevant and important defects
only.
3. Consistent definitions of the various priorities and severities of the
defects. This will bring in a shared vision between development and
testing teams, on the nature of the defects to focus on.
4. Communication mechanisms to the documentation group to ensure that
the documentation is kept in sync with the product in terms of known
defects, workarounds, and so on.
The purpose of the testing team is to identify the defects in the product
and the risks that could be faced by releasing the product with the existing
defects. Ultimately, the decision to release or not is a management decision,
dictated by market forces and weighing the business impact for the
organization and the customers.

Check your progress 3

Define People management.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

22.6 LET US SUM UP

Check Your Progress: Model Answers


1. Some of the internal standards include
• Naming and storage conventions for test artifacts;
• Document standards;
• Test coding standards; and
• Test reporting standards.
2. Version control ensures that the test scripts associated with a given
release of a product are base lined along with the product files. Base
lining is akin to taking a snapshot of the set of related files of a version,
assigning a unique identifier to this set. In future, when anyone wants to
recreate the environment for the given release, this label would enable
him or her to do so.
3. People management is an integral part of any project management.
Often, it is a difficult chasm for engineers-turned-managers to cross. As
an individual contributor, a person relies only on his or her own skills to
accomplish an assigned activity; the person is not necessarily trained on
how to document what needs to be done so that it can be accomplished
by someone else

LESSON 23

TEST PROCESS
Contents
23.0 Aims and Objectives
23.1 Test Process
23.2 Putting Together and Base lining a Test plan
23.3 Test Case Specification
23.4 Update of Traceability Matrix
23.5 Identifying possible Candidates for Automation
23.6 Developing and Base lining Test Cases
23.7 Executing Test Cases and keeping Traceability Matrix Current
23.8 Collecting and Analyzing Metrics
23.9 Preparing Test Summary Report
23.10 Recommending Product Release Criteria
23.11 Let Us Sum Up

23.0 AIMS AND OBJECTIVES

In this lesson, we discuss the process of testing: how to baseline a test plan,
test case specification, the traceability matrix, developing and executing test
cases, and preparing a test summary report. At the end of this lesson, you will
be able to understand the test case specification and all the above contents.

23.1 TEST PROCESS

Software testing process implementation is essential for reliable and
predictable software. If you do not test to see how a program will perform under
adverse conditions, then you are deploying software that has the potential to be
highly unstable. Compuware's DevPartner Fault Simulator is a test process
tool that gives developers accountability for software performance under
adverse conditions.

23.2 PUTTING TOGETHER AND BASE LINING A TEST PLAN

A test plan combines all the points discussed above into a single
document that acts as an anchor point for the entire testing project. A template
of a test plan is provided in Appendix B at the end of this chapter. Appendix A
gives a check list of questions that are useful to arrive at a Test Plan.
An organization normally arrives at a template that is to be used across
the board. Each testing project puts together a test plan based on the template.
Should any changes be required in the template, then such a change is made
only after careful deliberations (and with appropriate approvals). The test plan
is reviewed by a designated set of competent people in the organization. It then
is approved by a competent authority, which is independent of the project
manager directly responsible for testing. After this, the test plan is base lined
into the configuration management repository. From then on, the base lined
test plan becomes the basis for running the testing project. Any significant
changes in the testing project should thereafter be reflected in the test plan and
the changed test plan base lined again in the configuration management
repository. In addition, periodically, any changes needed to the test plan
templates are discussed among the different stakeholders, and the template is
kept current and applicable to the testing teams.

Check your progress 1

Describe Software Test Process.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

23.3 TEST CASE SPECIFICATION

Using the test plan as the basis, the testing team designs test case
specifications, which then become the basis for preparing individual test
cases. We have been using the term test cases freely throughout this book.
Formally, a test case is nothing but a series of steps executed on a product,
using a pre-defined set of input data, expected to produce a pre-defined set of
outputs, in a given environment. Hence, a test case specification should clearly
identify the following (a sample specification appears after this list).
1. The purpose of the test: This lists what feature or part the test is
intended for. The test case should follow the naming conventions (as
discussed earlier) that are consistent with the feature/module being
tested.
2. Items being tested, along with their version/release numbers as
appropriate.
3. Environment that needs to be set up for running the test case: This can
include the hardware environment setup, supporting software
environment setup (for example, setup of the operating system, database,
and so on), setup of the product under test (installation of the right
version, configuration, data initialization, and so on).
4. Input data to be used for the test case: The choice of input data will be
dependent on the test case itself and the technique followed in the test
case (for example, equivalence partitioning, boundary value analysis, and
so on). The actual values to be used for the various fields should be
specified unambiguously (for example, instead of saying "enter a three-
digit positive integer," it is better to say "enter 789"). If automated testing
is to be used, these values should be captured in a file and used, rather
than having to enter the data manually every time.
5. Steps to be followed to execute the test: If automated testing is used,
then, these steps are translated to the scripting language of the tool. If
the testing is manual, then the steps are detailed instructions that can
be used by a tester to execute the test. It is important to ensure that the
level of detail in documenting the steps is consistent with the skill and
expertise level of the person who will execute the tests.
6. The expected results that are considered to be "correct results." These
expected results can be what the user may see in the form of a GUI,
report, and so on and can be in the form of updates to persistent storage
in a database or in files.
7. A step to compare the actual results produced with the expected results:
This step should do an "intelligent" comparison of the expected and
actual results to highlight any discrepancies. By "intelligent" comparison,
we mean that the comparison should take care of "acceptable
differences" between the expected results and the actual results, like
terminal ID, user ID, system date, and so on.
8. Any relationship between this test and other tests: This can be in the
form of dependencies among the tests or the possibility of reuse across
the tests.
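
Putting the eight elements together, a test case specification might be
captured as the following structure. All field names and values are
hypothetical, chosen only to illustrate the level of detail expected.

test_case_spec = {
    "purpose": "Verify order entry accepts a valid quantity (billing module)",
    "items_under_test": {"product": "OrderManager", "version": "2.1.3"},
    "environment": ["Supported OS installed", "database seeded with test data",
                    "OrderManager 2.1.3 installed with default configuration"],
    "input_data": {"quantity": 789},    # a concrete value, not "a 3-digit int"
    "steps": ["Log in as clerk", "Open the New Order form",
              "Enter quantity 789", "Click Submit"],
    "expected_results": {"screen_message": "Order accepted",
                         "db_row_added": True},
    "comparison": "ignore terminal ID, user ID, and system date fields",
    "related_tests": ["order_entry_002 (reuses the login setup)"],
}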

23.4 UPDATE OF TRACEABILITY MATRIX

As we have discussed earlier in Black Box Testing, a requirements
traceability matrix ensures that the requirements make it through the
subsequent life cycle phases and do not get orphaned mid-course. In particular,
the traceability matrix is a tool to validate that every requirement is tested. The
traceability matrix is created during the requirements gathering phase itself by
filling up the unique identifier for each requirement. Subsequently, as the
project proceeds through the design and coding phases, the unique identifier
for design features and the program file name is entered in the traceability
matrix. When a test case specification is complete, the row corresponding to the
requirement which is being tested by the test case is updated with the test case
specification identifier. This ensures that there is a two-way mapping between
requirements and test cases.

23.5 IDENTIFYING POSSIBLE CANDIDATES FOR AUTOMATION

The test case design forms the basis for writing the test cases. Before writing
the test cases, a decision should be taken as to which tests are to be automated
and which should be run manually. Suffice it to say here, some of the criteria
that will be used in deciding which scripts to automate include the following (a
scoring sketch appears after this list).
1. Repetitive nature of the test;
2. Effort involved in automation;
3. Amount of manual intervention required for the test; and
4. Cost of automation tool.
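
One way to apply these criteria objectively is a weighted score, as in the
hypothetical sketch below; the weights and the 0-to-5 rating scale are
assumptions, not an established formula.

def automation_score(repetitiveness, automation_effort,
                     manual_intervention, tool_cost):
    """Higher scores favor automation; each input is rated 0 (low) to 5 (high)."""
    return (3 * repetitiveness          # repetitive tests repay automation
            - 2 * automation_effort     # costly scripting argues against it
            - 2 * manual_intervention   # human steps resist automation
            - 1 * tool_cost)            # expensive tools raise the bar

# A daily smoke test: highly repetitive and cheap to script.
print(automation_score(5, 1, 0, 1))    # 12: a strong automation candidate
# A one-off exploratory scenario needing heavy manual judgement.
print(automation_score(1, 3, 5, 1))    # -14: better kept manual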
The need for efficient software testing has always existed, because the users
expect software that works. The need for process improvement is apparent, due
to the number of defects delivered, and the time and money consumed in
testing.
We expect that the future will present even more extensive demands to the
testing process. We believe that users will demand better technical quality, and
we expect development organisations to demand less expensive testing.
The solution is not more people equipped with more tools. As already
mentioned, the solution is process improvement. The Testing Maturity Model
(TMM) was developed with this purpose; it is an extension to the CMM.
Choosing the TMM is a perfectly good choice, and we will work with this and
other SPI models in the future.

Check your progress 2

Expand TMM.
Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

23.6 DEVELOPING AND BASE LINING TEST CASES

Based on the test case specifications and the choice of candidates for
automation, test cases have to be developed. The development of test cases
entails translating the test specifications to a form from which the tests can be
executed. If a test case is a candidate for automation, then, this step requires
writing test scripts in the automation language. If the test case is a manual test
case, then test case writing maps to writing detailed step-by-step instructions
for executing the test and validating the results. In addition, the test case
should also capture the documentation for the changes made to the test case
since the original development. Hence, the test cases should also have change
history documentation, which specifies the following (a sample header appears
after this list).
1. What was the change;

2. Why the change was necessitated;
3. Who made the change;
4. When was the change made;
5. A brief description of how the change has been implemented; and
6. Other files affected by the change.
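
In practice, this change history often takes the form of a comment header at
the top of the test case file, along the following (hypothetical) lines.

# Change history:
#   Change    : Widened quantity range in step 3 from 1-99 to 1-999
#   Why       : Requirement REQ-001 revised in release 2.1
#   Who       : A. Tester
#   When      : 14 March 2008
#   How       : Updated the input data file and the expected-results table
#   Files hit : order_entry_data.txt, order_entry_expected.txt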
All the artifacts of test cases (the test scripts, inputs, expected
outputs, and so on) should be stored in the test case database and SCM, as
described earlier. Since these artifacts enter the SCM, they have to be reviewed
and approved by appropriate authorities before being base lined.

23.7 EXECUTING TEST CASES AND KEEPING TRACEABILITY MATRIX CURRENT

The prepared test cases have to be executed at the appropriate times
during a project. For example, test cases corresponding to smoke tests may be
run on a daily basis. System testing test cases will be run during system
testing.
As the test cases are executed during a test cycle, the defect repository is
updated with
1. Defects from the earlier test cycles that are fixed in the current build;
and
2. New defects that get uncovered in the current run of the tests.
The defect repository should be the primary vehicle of communication
between the test team and the development team. As mentioned earlier, the
defect repository contains all the information about defects uncovered by
testing (and defects reported by customers). All the stakeholders should be
referring to the defect repository for knowing the current status of all the
defects. This communication can be augmented by other means like emails,
conference calls, and so on.
As discussed in the test plan, a test may have to be suspended during its
run because of certain show stopper defects. In this case, the suspended test
case should wait till the resumption criteria are satisfied. Likewise, a test
should be run only when the entry criteria for the test are satisfied and should
be deemed complete only when the exit criteria are satisfied.
During test design and execution, the traceability matrix should be kept
current. As and when tests get designed and executed successfully, the
traceability matrix should be updated. The traceability matrix itself should be
subject to configuration management, that is, it should be subject to version
control and change control.

Check your progress 3

Define a successful test case.

Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

23.8 COLLECTING AND ANALYZING METRICS

When tests are executed, information about test execution gets collected
in test logs and other files. The basic measurements from running the tests are
then converted to meaningful metrics by the use of appropriate transformations
and formulae, as discussed in Metrics and Measurements.
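
As a minimal illustration, the sketch below turns raw test-log entries into a
pass-rate metric; the log format (test name and status) is an assumption.

test_log = [("billing_invoice_001", "pass"), ("billing_invoice_002", "fail"),
            ("network_login_001", "pass"), ("network_login_002", "pass")]

executed = len(test_log)
passed = sum(1 for _, status in test_log if status == "pass")
pass_rate = 100.0 * passed / executed

print(f"{passed}/{executed} passed ({pass_rate:.0f}%)")   # 3/4 passed (75%)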

23.9 PREPARING TEST SUMMARY REPORT

At the completion of a test cycle, a test summary report is produced. This
report gives insights to the senior management about the fitness of the product
for release.
IEEE 829-1998, also known as the 829 Standard for Software Test
Documentation, is an IEEE standard that specifies the form of a set of
documents for use in eight defined stages of software testing, each stage
potentially producing its own separate type of document. The standard specifies
the format of these documents but does not stipulate whether they all must be
produced, nor does it include any criteria regarding adequate content for these
documents. These are a matter of judgement outside the purview of the
standard. The documents are:
• Test Plan: a management planning document that shows:
• How the testing will be done (including SUT configurations).
• Who will do it
• What will be tested
• How long it will take (although this may vary, depending upon
resource availability).
• What the test coverage will be, i.e. what quality level is required
• Test Design Specification: detailing test conditions and the
expected results as well as test pass criteria.
• Test Case Specification: specifying the test data for use in
running the test conditions identified in the Test Design
Specification
• Test Procedure Specification: detailing how to run each test,
including any set-up preconditions and the steps that need to be
followed
• Test Item Transmittal Report: reporting on when tested software
components have progressed from one stage of testing to the next
• Test Log: recording which test cases were run, who ran them, in what order, and whether each test passed or failed

• Test Incident Report: detailing, for any test that failed, the actual
versus expected result, and other information intended to throw
light on why a test has failed. This document is deliberately
named as an incident report, and not a fault report. The reason is
that a discrepancy between expected and actual results can occur
for a number of reasons other than a fault in the system. These
include the expected results being wrong, the test being run
wrongly, or inconsistency in the requirements meaning that more
than one interpretation could be made. The report consists of all
details of the incident such as actual and expected results, when
it failed, and any supporting evidence that will help in its
resolution. The report will also include, if possible, an assessment
of the impact upon testing of an incident.
• Test Summary Report: A management report providing any
important information uncovered by the tests accomplished, and
including assessments of the quality of the testing effort, the
quality of the software system under test, and statistics derived
from Incident Reports. The report also records what testing was
done and how long it took, in order to improve any future test
planning. This final document is used to indicate whether the
software system under test is fit for purpose according to whether
or not it has met acceptance criteria defined by project
stakeholders.

23.10 RECOMMENDING PRODUCT RELEASE CRITERIA

One of the purposes of testing is to decide the fitness of a product for release. Testing can never conclusively prove the absence of defects in a software product. What it provides is evidence of what defects exist in the product, their severity, and impact. As we discussed earlier, the job of the testing team is to articulate to the senior management and the product release team
1. What defects the product has;
2. What is the impact/severity of each of the defects; and
3. What would be the risks of releasing the product with the existing
defects.
The senior management can then take a meaningful business decision
on whether to release a given version or not.
TEST REPORTING
Testing requires constant communication between the test team and
other teams (like the development team). Test reporting is a means of achieving
this communication. There are two types of reports or communication that are
required: test incident reports and test summary reports (also called test
completion reports).
Test incident report A test incident report is a communication that happens
through the testing cycle as and when defects are encountered. Earlier, we
described the defect repository. A test incident report is nothing but an entry

made in the defect repository. Each defect has a unique ID and this is used to
identify the incident. The high-impact test incidents (defects) are highlighted in
the test summary report.
Test cycle report As discussed, test projects take place in units of test cycles.
A test cycle entails planning and running certain tests in cycles, each cycle
using a different build of the product. As the product progresses through the
various cycles, it is to be expected to stabilize. A test cycle report, at the end of
each cycle, gives
1. A summary of the activities carried out during that cycle;
2. Defects that were uncovered during that cycle, based on their severity
and impact;
3. Progress from the previous cycle to the current cycle in terms of defects
fixed;
4. Outstanding defects that are yet to be fixed in this cycle; and
5. Any variations observed in effort or schedule (that can be used for future
planning).
Test summary report The final step in a test cycle is to recommend the
suitability of a product for release. A report that summarizes the results of a
test cycle is the test summary report.
There are two types of test summary reports.
1. Phase-wise test summary, which is produced at the end of every phase
2. Final test summary report (which has all the details of all testing done by all phases and teams; also called the "release test report")

23.11 LET US SUM UP

Check Your Progress: Model Answers


1. Software testing process implementation is essential for reliable and
predictable software. If you do not test to see how a program will perform
under adverse conditions, then you are deploying software that has the
potential to be highly unstable.
2. ’Testing Maturity Model’ (TMM)
3. A successful test case is one that furthers progress in this direction by causing the program to fail. Program testing thus establishes some degree of confidence that a program does what it is supposed to do and does not do what it is not supposed to do, and this purpose is best achieved by a diligent exploration for errors.

LESSON 24

TESTING METRICS
Contents
24.0 Aims and Objectives
24.1 What are Metrics and Measurements?
24.2 Why Metrics in Testing?
24.3 Types of Metrics
24.4 Project Metrics
24.4.1 Effort Variance (Planned vs Actual)
24.4.2 Schedule Variance (Planned vs Actual)
24.4.3 Effort Distribution across Phases
24.5 Progress Metrics
24.5.1 Test Defect Metrics
24.5.2 Development Defect Metrics
24.6 Let Us Sum Up

24.0 AIMS AND OBJECTIVES

In this lesson, we are going to discuss testing metrics, types of testing metrics, project metrics, and progress metrics. We will also see how planned and actual values are compared using these metrics.

24.1 WHAT ARE METRICS AND MEASUREMENTS?

All significant activities in a project need to be tracked to ensure that the project is going as per plan and to decide on any corrective actions. The measurement of key parameters is an integral part of tracking. Measurements first entail collecting a set of data. But raw data by itself may not throw light on why a particular event has happened. The collected data have to be analyzed in totality to draw the appropriate conclusions. Consider an example with two data points: the boss had gone on vacation, and the profits zoomed in the previous quarter. However (hopefully!) the two events are not directly linked to each other. So a conclusion from the raw data alone would not be useful for decision making.
Metrics derive information from raw data with a view to help in decision
making.
Some of the areas that such information would shed light on are
1. Relationship between the data points;
2. Any cause and effect correlation between the observed data points; and

3. Any pointers to how the data can be used for future planning and
continuous improvements.
Metrics are thus derived from measurements using appropriate formulae or calculations. Obviously, the same set of measurements can help produce different sets of metrics, of interest to different people.
From the above discussion, it is obvious that in order that a project
performance be tracked and its progress monitored effectively,
1. The right parameters must be measured; the parameters may pertain
to product or to process.
2. The right analysis must be done on the data measured, to draw
correct conclusions about the health of the product or process within
a project or organization.
3. The results of the analysis must be presented in an appropriate form
to the stakeholders to enable them to make the right decisions on
improving product or process quality (or any other relevant business
drivers).
Since the focus of this book is on testing and products under test, only
metrics related to testing and product are discussed in this chapter and not
those meant for process improvements.
The metrics and analysis of metrics may convey the reason when data
points are combined. Relating several data points and consolidating the result
in terms of charts and pictures simplifies the analysis and facilitates the use of
metrics for decision making.
Effort is the actual time that is spent on a particular activity or a phase.
Elapsed days is the difference between the start of an activity and the
completion of the activity. For example, ordering a product through the web
may involve five minutes of effort and three elapsed days. It is the packaging
and shipping that takes that much duration, not the time spent by the person
in ordering. However, in the schedule, this latency or delay needs to be entered
as three days. Of course, during these three days, the person who ordered the
product can get on to some other activity and do it simultaneously. In
general, effort is derived from productivity numbers, and elapsed days are the
number of days required to complete the set of activities. Elapsed days for a
complete set of activities become the schedule for the project. Collecting and
analyzing metrics involves effort and several steps.
The first step involved in a metrics program is to decide what
measurements are important and collect data accordingly. The effort spent on
testing, number of defects, and number of test cases, are some examples of
measurements. Depending on what the data is used for, the granularity of
measurement will vary.
While deciding what to measure, the following aspects need to be kept in
mind.
1. What is measured should be of relevance to what we are trying to
achieve. For testing functions, we would obviously be interested in the

200
effort spent on testing, number of test cases, number of defects
reported from test cases, and so on.
2. The entities measured should be natural and should not involve too
many overheads for measurements. If there are too many overheads
in making the measurements or if the measurements do not follow
naturally from the actual work being done, then the people who
supply the data may resist giving the measurement data (or even give
wrong data).
3. What is measured should be at the right level of granularity to satisfy
the objective for which the measurement is being made.
Let us look at the last point on granularity of data in more detail. The
different people who use the measurements may want to make inferences on
different dimensions. The level of granularity of data obtained depends on the
level of detail required by a specific audience. Hence the measurements - and the metrics derived from them - will have to be at different levels for different people. An approach involved in getting the granular detail is called data drilling. A data drilling exercise typically proceeds as a conversation in which metrics/test reports are presented and progressively finer-grained questions are asked; this is what happens in many organizations when such reports are presented, and it shows how different granularity of data is relevant for decision making at different levels.
Such a conversation continues till all questions are answered or till the defects in focus become small in number and can be traced to certain root causes. The depth to which data drilling happens depends on the focus area of the discussion or need. Hence, it is important to provide as much granularity in measurements as possible. In such an exercise, the measurement in question is typically "number of defects."
Not all conversations involve just one measurement as in the example. A
set of measurements can be combined to generate metrics that will be explained
in further sections of this chapter. An example question involving multiple
measurements is "How many test cases produced the 40 defects in data
migration involving different schema?" There are two measurements involved in
this question: the number of test cases and the number of defects. Hence, the
second step involved in metrics collection is defining how to combine data
points or measurements to provide meaningful metrics. A particular metric can
use one or more measurements.
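As an illustration, the following sketch combines two measurements - test cases executed and defects found - per functional area, in the spirit of the data drilling discussed above. The areas, test case IDs, and counts are made-up sample data.

    from collections import defaultdict

    # Hypothetical raw measurements: (area, test_case_id, defects_found_by_that_case)
    executions = [
        ("data-migration", "TC-101", 3),
        ("data-migration", "TC-102", 0),
        ("reporting", "TC-201", 1),
        ("reporting", "TC-202", 0),
    ]

    # Combine the two measurements per area to answer drill-down questions
    # like "how many test cases produced the defects in data migration?"
    cases = defaultdict(int)
    defects = defaultdict(int)
    for area, _tc, n_defects in executions:
        cases[area] += 1
        defects[area] += n_defects

    for area in cases:
        print(f"{area}: {defects[area]} defects from {cases[area]} test cases")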
Knowing the ways in which a measurement is going to be used and
knowing the granularity of measurements leads us to the third step in the
metrics program-deciding the operational requirement for measurements. The
operational requirement for a metrics plan should lay down not only the
periodicity but also other operational issues such as who should collect
measurements, who should receive the analysis, and so on. This step helps to
decide on the appropriate periodicity for the measurements as well as assign
operational responsibility for collecting, recording, and reporting the
measurements and dissemination of the metrics information. Some
measurements need to be made on a daily basis (for example, how many test
cases were executed, how many defects found, defects fixed, and so on). But the

metrics involving a question like the one above ("how many test cases produced
40 defects") is a type of metric that needs to be monitored at extended periods
of time, say, once in a week or at the end of a test cycle. Hence, planning
metrics generation also needs to consider the periodicity of the metrics.
The fourth step involved in a metrics program is to analyze the metrics to
identify both positive areas and improvement areas on product quality. Often,
only the improvement aspects pointed to by the metrics are analyzed and
focused; it is important to also highlight and sustain the positive areas of the
product. This will ensure that the best practices get institutionalized and also
motivate the team better.
The final step involved in a metrics plan is to take necessary action and
follow up on the action. The purpose of a metrics program will be defeated if the
action items are not followed through to completion. This is especially true of
testing, which is the penultimate phase before release. Any delay in analysis
and following through with action items to completion can result in undue
delays in product release.
Any metrics program, as described above, is a continuous and ongoing
process. As we make measurements, transform the measurements into metrics,
analyze the metrics, and take corrective action, the issues for which the
measurements were made in the first place will become resolved. Then, we
would have to continue the next iteration of metrics programs, measuring
(possibly) a different set of measurements, leading to more refined metrics
addressing (possibly) different issues.

Check your progress 1

What do you mean by Effort?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

24.2 WHY METRICS IN TESTING?

Since testing is the penultimate phase before product release, it is essential to measure the progress of testing and product quality. Tracking test progress and product quality can give a good idea about the release, that is, whether it will be met on time with known quality. Measuring and producing metrics to determine the progress of testing is thus very important.
Knowing only how much testing got completed does not answer the
question on when the testing will get completed and when the product will be
ready for release. To answer these questions, one needs to know how much
more time is needed for testing. To judge the remaining days needed for testing,

two data points are needed - remaining test cases yet to be executed and how many test cases can be executed per elapsed day. The test cases that can be executed per person day are calculated based on a measure called test case execution productivity. This productivity number is derived from the previous test cycles and can be expressed as
Test case execution productivity = Total test cases executed / Total person days spent in execution
Thus, metrics are needed to know test case execution productivity and to estimate the test completion date.
It is not testing alone that determines the date at which the product can be released. The number of days needed to fix all outstanding defects is another crucial data point. The number of days needed for defect fixes needs to take into account the "outstanding defects waiting to be fixed" and a projection of "how many more defects will be unearthed from testing in future cycles." The defect trend collected over a period of time gives a rough estimate of the defects that will come through future test cycles. Hence, metrics help in predicting the number of defects that can be found in future test cycles.
The defect-fixing trend collected over a period of time gives another estimate - the defect-fixing capability of the team. This measure gives the number of defects that can be fixed in a particular duration by the development team. Combining defect prediction with defect-fixing capability produces a rough estimate of the days needed for defect fixes:
Days needed for defect fixes = (Outstanding defects yet to be fixed + Defects projected from future test cycles) / Defects that can be fixed per day
Hence, metrics help in estimating the total days needed for fixing defects. Once the time needed for testing and the time for defect fixes are known, the release date can be estimated. Testing and defect fixing are activities that can be executed simultaneously, as long as regression testing is planned to verify the outstanding defect fixes and their side-effects. If a product team follows the model of separate development and testing teams, the release date is arrived at on the basis of which one (days needed for testing or days needed for defect fixes) is on the critical path:
Days needed for release = Max (Days needed for testing, Days needed for defect fixes)
The defect fixes may arrive after the regular test cycles are completed. These defect fixes will have to be verified by regression testing before the product can be released. Hence, the formula for days needed for release is to be modified as
Days needed for release = Max (Days needed for testing, Days needed for defect fixes + Days needed for regression testing)
The above formulas can be further tuned to provide more accuracy, as they do not include various other activities such as documentation, meetings, and so on. The idea of discussing the formulas here is to explain that metrics are important and help in arriving at the release date for the product.
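The original text presents these formulas in the margin, which is not reproduced here, so the sketch below encodes one plausible reading of them in Python. The productivity and capability numbers are purely illustrative.

    def days_for_testing(remaining_test_cases, test_cases_per_person_day):
        # Execution productivity comes from previous test cycles.
        return remaining_test_cases / test_cases_per_person_day

    def days_for_defect_fixes(outstanding, projected, fixes_per_day):
        return (outstanding + projected) / fixes_per_day

    def days_for_release(testing_days, fixing_days, regression_days):
        # Late defect fixes must still be verified by regression testing.
        return max(testing_days, fixing_days + regression_days)

    testing = days_for_testing(200, 20)           # 10 days of testing left
    fixing = days_for_defect_fixes(30, 25, 5)     # 11 days of fixing left
    print(days_for_release(testing, fixing, 3))   # -> 14.0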
The measurements collected during the development and test cycle are
not only used for release but also used for post-release activities. Looking at the
defect trend for a period helps in arriving at approximate estimates for the
number of defects that may get reported post release. This defect trend is used
as one of the parameters to increase the size of the maintenance/sustenance
team to take care of defects that may be reported post release. Knowing the type
of defects that are found during a release cycle and having an idea of all

outstanding defects and their impact helps in training the support staff, thereby
ensuring they are well equipped and prepared for the defects that may get
reported by the customers.
Metrics are not only used for reactive activities. Metrics and their
analysis help in preventing the defects proactively, thereby saving cost and
effort. For example, if there is a type of defect (say, coding defects) that is
reported in large numbers, it is advisable to perform a code review and prevent
those defects, rather than finding them one by one and fixing them in the code.
Metrics help in identifying these opportunities.
Metrics are used in resource management to identify the right size of
product development teams. Since resource management is an important
aspect of product development and maintenance, metrics go a long way in
helping in this area.
There are various other areas where metrics can help; ability of test
cases in finding defects is one such area. We discussed test case result history
in Chapter 8, Regression Testing. When this history is combined with the
metrics of the project, it provides detailed information on what test cases have
the capabilities to produce more/less defects in the current cycle.
To summarize, metrics in testing help in identifying
• When to make the release.
• What to release - Based on defect density (formally defined later) across modules, their importance to customers, and impact analysis of those defects, the scope of the product can be decided to release the product on time. Metrics help in making this decision.
• Whether the product is being released with known quality - The idea of metrics is not only for meeting the release date but also to know the quality of the product and ascertaining the decision on whether we are releasing the product with the known quality and whether it will function in a predictable way in the field.

24.3 TYPES OF METRICS

Metrics can be classified into different types based on what they measure
and what area they focus on. At a very high level, metrics can be classified as
product metrics and process metrics. As explained earlier, process metrics are
not discussed in this chapter.
Product metrics can be further classified as
1. Project metrics A set of metrics that indicates how the project is
planned and executed.
2. Progress metrics A set of metrics that tracks how the different activities
of the project are progressing. The activities include both development
activities and testing activities. Since the focus of this book is testing,
only metrics applicable to testing activities are discussed in this book
(and in this chapter). Progress metrics is monitored during testing

phases. Progress metrics helps in finding out the status of test activities
and they are also good indicators of product quality. The defects that
emerge from testing provide a wealth of information that help both
development team and test team to analyze and improve. For this reason,
progress metrics in this chapter focus only on defects. Progress metrics,
for convenience, is further classified into test defect metrics and
development defect metrics.
3. Productivity metrics A set of metrics that takes into account various
productivity numbers that can be collected and used for planning and
tracking testing activities. These metrics help in planning and estimating
of testing activities.

Check your progress 2

What do you mean by progress metrics?


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

24.4 PROJECT METRICS

A typical project starts with requirements gathering and ends with product release. All the phases that fall in between these points need to be planned and tracked. In the planning cycle, the scope of the project is finalized. The project scope gets translated to size estimates, which specify the quantum of work to be done. This size estimate gets translated to effort estimates for each of the phases and activities by using the available productivity data. This initial effort is called the baselined effort.
As the project progresses, if the scope of the project changes or if the available productivity numbers turn out to be incorrect, then the effort estimates are re-evaluated; this re-evaluated effort estimate is called the revised effort. The estimates can change based on the frequency of changing requirements and other parameters that impact the effort.
Over- or under-estimating the effort is a normal experience faced by many organizations. The perils of such wrong estimates are financial losses, delay in release, and a wrong brand image among the customers. Right estimation comes with experience and by having the right productivity numbers from the team.
Effort and schedule are two factors to be tracked for any phase or
activity. Tracking the activities in the SDLC phase is done by two means, that
is, effort and schedule. In an ideal situation, if the effort is tracked closely and
met, then the schedule can be met. The schedule can also be met by adding
more effort to the project (additional resources or asking the engineers to work

late). If the release date (that is, schedule) is met by putting more effort, then
the project planning and execution cannot be considered successful. In fact,
such an idea of adding effort may not be possible always as the resources may
not be available in the organization every time and engineers working late may
not be productive beyond a certain point.
At the same time, if planned effort and actual effort are the same but if
the schedule is not met then too the project cannot be considered successful.
Hence, it is a good idea to track both effort and schedule in project metrics.
The basic measurements that are very natural, simple to capture, and
form the inputs to the metrics in this section are
1. The different activities and the initial baselined effort and schedule for
each of the activities; this is input at the beginning of the project/phase.
2. The actual effort and time taken for the various activities; this is entered
as and when the activities take place.
3. The revised estimate of effort and schedule; these are re-calculated at
appropriate times in the project life.

24.4.1 EFFORT VARIANCE (PLANNED VS ACTUAL)

When the baselined effort estimates, revised effort estimates, and actual effort are plotted together for all the phases of the SDLC, they provide many insights about the estimation process. As different sets of people may get involved in different phases, it is a good idea to plot these effort numbers phase-wise. Normally, this variation chart is plotted at the point when revised estimates are made or at the end of a release.
If there is a substantial difference between the baselined and revised effort, it points to incorrect initial estimation. Calculating effort variance for each of the phases provides a quantitative measure of the relative difference between the revised and actual efforts:
Effort variance (%) = [(Actual effort - Revised estimate) / Revised estimate] * 100
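As a small illustration, the variance calculation above can be applied phase-wise against both the revised and the baselined estimates. The effort numbers below are invented for illustration.

    def variance_percent(actual, estimate):
        # Positive variance: under-estimation; negative: over-estimation.
        return (actual - estimate) / estimate * 100

    # phase: (baselined, revised, actual) effort in person days (sample data)
    phases = {
        "design":  (40, 45, 50),
        "coding":  (80, 90, 85),
        "testing": (60, 60, 72),
    }

    for phase, (baselined, revised, actual) in phases.items():
        print(f"{phase}: {variance_percent(actual, revised):+.1f}% vs revised, "
              f"{variance_percent(actual, baselined):+.1f}% vs baselined")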
If variance takes into account only the revised estimate and actual effort, then a question arises: what is the use of the baselined estimate? As mentioned earlier, the effort variation chart provides input to the estimation process. When estimates are going wrong (or right), it is important to find out where we are going wrong (or right). Many times the revised estimates are done in a hurry, to respond fast enough to changing or unclear requirements. If this is the case, the right parameter for variance calculation is the baselined estimate, and the analysis should point out the problems in the revised estimation process. Similarly, there could be a problem in the baseline estimation process that can be brought out by variance calculation. Hence, all the baselined estimates, revised estimates, and actual effort are plotted together for each of the phases.
The variance can be negative also. A negative variance is an indication of
an over estimate. These variance numbers along with analysis can help in
better estimation for the next release or the next revised estimation cycle.

24.4.2 SCHEDULE VARIANCE (PLANNED VS ACTUAL)

Most software projects are not only concerned about the variance in
effort, but are also concerned about meeting schedules. This leads us to the
schedule variance metric. Schedule variance, like effort variance, is the
deviation of the actual schedule from the estimated schedule. There is one
difference, though. Depending on the SDLC model used by the project, several
phases could be active at the same time. Further, the different phases in SDLC
are interrelated and could share the same set of individuals. Because of all
these complexities involved, schedule variance is calculated only at the overall
project level, at specific milestones, not with respect to each of the SDLC
phases.
Using such phase-wise data, the variance percent can be calculated with a formula similar to the one in the previous section, considering the estimated schedule and the actual schedule.
Schedule variance is calculated at the end of every milestone to find out how well the project is doing with respect to the schedule. To get a real picture of the schedule in the middle of project execution, it is important to calculate the "remaining days yet to be spent" on the project and plot it along with the "actual schedule spent." The "remaining days yet to be spent" can be calculated by adding up the durations of all remaining activities. If this number is not calculated and plotted, the chart gives no value in the middle of the project, because the deviation cannot be inferred visually from it. The remaining days in the schedule become zero when the release is met.
Effort and schedule variance have to be analyzed in totality, not in
isolation. This is because while effort is a major driver of the cost, schedule
determines how best a product can exploit market opportunities. Variance can
be classified into negative variance, zero variance, acceptable variance, and
unacceptable variance. Generally 0-5% is considered as acceptable variance.
Probable causes and outcomes can be tabulated under the various scenarios, but such a table may not reflect all possible causes and outcomes. For example, a negative variance in one phase/module may have nullified a positive variance in another phase or module. Hence, it is important to look at the "why and how" in metrics rather than just focusing on "what" was achieved. The data drilling exercise discussed earlier will help in this analysis. Some of the typical questions one should ask to analyze effort and schedule variances are given below.
• Did the effort variance take place because of poor initial estimation or poor execution?
• If the initial estimation turns out to be off the mark, is it because of lack of availability of the supporting data to enable good estimation?
• If the effort or schedule in some cases is not in line with what was estimated, what changes caused the variation? Was there a change in technology of what was tested? Was there a new tool introduced for testing? Did some key people leave the team?
• If the effort was on target, but the schedule was not, did the plan take into account appropriate parallelism? Did it explore the right multiplexing of the resources?
• Can any process or tool be enhanced to improve parallelism and thereby speed up the schedules?
• Whenever we get negative variance in effort or schedule (that is, we are completing the project with lesser effort and/or faster than what was planned), do we know what contributed to the efficiency and if so, can we institutionalize the efficiencies to achieve continuous improvement?

24.4.3 EFFORT DISTRIBUTION ACROSS PHASES

Variance calculation helps in finding out whether commitments are met on time and whether the estimation method works well. In addition, some indications on product quality can be obtained if the effort distribution across the various phases is captured and analyzed. For example,
• Spending very little effort on requirements may lead to frequent changes, but one should also leave sufficient time for the development and testing phases.
• Spending less effort in testing may cause defects to crop up at the customer place, but spending more time in testing than what is needed may make the product lose the market window.
The distribution percentage across the different phases can be estimated at the time of planning, and these can be compared with the actuals at the time of release for gaining confidence in the release and the estimation methods.

24.5 PROGRESS METRICS

Any project needs to be tracked from two angles. One, how well the
project is doing with respect to effort and schedule. This is the angle we have
been looking at so far in this chapter. The other equally important angle is to
find out how well the product is meeting the quality requirements for the
release. There is no point in producing a release on time and within the effort
estimate but with a lot of defects, causing the product to be unusable. One of
the main objectives of testing is to find as many defects as possible before any
customer finds them. The number of defects that are found in the product is
one of the main indicators of quality. Hence in this section, we will look at
progress metrics that reflect the defects (and hence the quality) of a product.
Defects get detected by the testing team and get fixed by the development team. In line with this thought, defect metrics are further classified into test defect metrics (which help the testing team in analysis of product quality and testing) and development defect metrics (which help the development team in analysis of development activities).
How many defects have already been found and how many more defects
may get unearthed are two parameters that determine product quality and its

assessment. For this assessment, the progress of testing has to be understood.
If only 50% of testing is complete and if 100 defects are found, then, assuming
that the defects are uniformly distributed over the product (and keeping all
other parameters same), another 80-100 defects can be estimated as residual
defects.
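As a rough sketch of this projection (assuming a strictly uniform defect distribution; the 80-100 band quoted above allows for other parameters), the arithmetic might look like:

    def residual_defect_estimate(defects_found, fraction_of_testing_complete):
        # Uniform-distribution assumption: defects scale with testing coverage.
        total_projected = defects_found / fraction_of_testing_complete
        return total_projected - defects_found

    # 50% of testing complete, 100 defects found so far:
    print(residual_defect_estimate(100, 0.5))   # -> 100.0 (the text quotes 80-100)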
A test progress chart gives the pass rate and fail rate of executed test cases, pending test cases, and test cases that are waiting for defects to be fixed. Representing testing progress in this manner makes it easy to understand the status and to carry out further analysis. In a healthy scenario, the pass percentage increases and the fail percentage decreases as the weeks progress, showing positive progress of testing and product quality. The defects that are blocking the execution of certain test cases also get reduced in number as the weeks progress. Such a progress chart shows that not only is testing progressing well, but also that the product quality is improving (which in turn means that the testing is effective). If, on the other hand, the chart shows that as the weeks progress the "not run" cases are not reducing in number, or the "blocked" cases are increasing in number, or the "pass" cases are not increasing, then it clearly points to quality problems in the product that prevent it from being ready for release.

Check your progress 3

Write short notes on scheduled variance.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

24.5.1 TEST DEFECT METRICS

The test progress metrics discussed in the previous section capture the
progress of defects found with time. The next set of metrics helps us
understand how the defects that are found can be used to improve testing and
product quality. Not all defects are equal in impact or importance. Some
organizations classify defects by assigning a defect priority (for example, P1, P2, P3, and so on). The priority of a defect provides a management perspective for the order of defect fixes. For example, a defect with priority P1 indicates that it should be fixed before another defect with priority P2. Some organizations use defect severity levels (for example, S1, S2, S3, and so on). The severity of defects provides the test team a perspective of the impact of that defect on product functionality. For example, a defect with severity level S1 means that either the major functionality is not working or the software is crashing. S2 may mean a failure or functionality not working. From the above example it is clear that priority is a management perspective and priority levels are relative. This means that the priority of a defect can change dynamically once assigned. Severity is absolute and does not change often, as it reflects the state and quality of the product. Some organizations use a combination of priority and severity to classify the defects.

24.5.2 DEVELOPMENT DEFECT METRICS

So far, our focus has been on defects and their analysis to help in
knowing product quality and in improving the effectiveness of testing. We will
now take a different perspective and see how metrics can be used to improve
development activities. The defect metrics that directly help in improving
development activities are discussed in this section and are termed as
development defect metrics. While defect metrics focus on the number of
defects, development defect metrics try to map those defects to different
components of the product and to some of the parameters of development such
as lines of code.
While it is important to count the number of defects in the product, for development it is important to map them to different components of the product so that they can be assigned to the appropriate developers to fix them. The project manager in charge of development maintains a module ownership list, where all product modules and their owners are listed. Based on the number of defects existing in each of the modules, the effort needed to fix them, and the availability of skill sets for each of the modules, the project manager assigns resources accordingly.
As an example, such a mapping might show four components (install, reports, client, and database) with over 20 defects each, indicating that more focus and resources are needed for these components. In such a chart, the number of defects and their classification are denoted in different colors and shading as mentioned in the legend. The defect classification as well as the total defects corresponding to each component in the product helps the project manager in assigning and resolving those defects.
There is another aspect of release, that is, what to release. If there is an independent component which is producing a large number of defects, and if all other components are stable, then the scope of the release can be reduced: remove the component producing the defects and release the other, stable components, thereby meeting the release date with release quality, provided the functionality of that component is not critical to the release. The above classification of defects into components helps in making such decisions.

24.6 LET US SUM UP

Check Your Progress: Model Answers


1. Effort is the actual time that is spent on a particular activity or a phase.
Elapsed days is the difference between the start of an activity and the
completion of the activity. For example, ordering a product through the
web may involve five minutes of effort and three elapsed days. It is the
packaging and shipping that takes that much duration, not the time
spent by the person in ordering.

2. Progress metrics A set of metrics that tracks how the different activities
of the project are progressing. The activities include both development
activities and testing activities. Since the focus of this book is testing,
only metrics applicable to testing activities are discussed in this book
(and in this chapter). Progress metrics is monitored during testing
phases. Progress metrics helps in finding out the status of test activities
and they are also good indicators of product quality.
3. Schedule variance is calculated at the end of every milestone to find out
how well the project is doing with respect to the schedule. To get a real
picture on schedule in the middle of project execution, it is important to
calculate "remaining days yet to be spent" on the project and plot it along
with the "actual schedule spent" as in the above chart. "Remaining days
yet to be spent" can be calculated by adding up all remaining activities

LESSON 25

PRODUCTIVITY METRICS
Contents
25.0 Aims and Objectives
25.1 Productivity Metrics
25.1.1 Defects per 100 Hours of Testing
25.1.2 Test Cases Executed per 100 Hours of Testing
25.1.3 Test Cases Developed per 100 Hours of Testing
25.1.4 Defects per 100 Test Cases
25.1.5 Defects per 100 Failed Test Cases
25.1.6 Test Phase Effectiveness
25.1.7 Closed Defect Distribution
25.2 Release Metrics
25.3 Let Us Sum Up

25.0 AIMS AND OBJECTIVES

In this lesson, we are going to discuss productivity metrics and how to improve the quality of software by finding defects. Normalizing metrics per 100 hours of testing, and using them to identify defects, is a challenging process that is also discussed in this lesson.

25.1 PRODUCTIVITY METRICS

Productivity metrics combine several measurements and the effort spent on the product. They help in finding out the capability of the team as well as serving other purposes, such as
1. Estimating for the new release.
2. Finding out how well the team is progressing, and understanding reasons for (both positive and negative) variations in results.
3. Estimating the number of defects that can be found.
4. Estimating the release date and quality.
5. Estimating the cost involved in the release.
A conceptual framework is required for understanding the relative merits and utility of alternate computer systems for high performance: their architecture, execution model, and software methods for programming and operation. The general concept of "productivity" has been articulated as that attribute of the entire process of computing that delivers ultimate value to the end user mission. Productivity is defined to reflect the rate, value, and costs of producing computational results in contribution to achieving the mission objectives of the using institution. While there is general consensus as to the importance and abstract nature of a productivity value of merit, there is not a common viewpoint of the specific metric and its formulation by which to quantitatively characterize it. One approach is to propose a conceptual framework within which to consider productivity as a parameter, along with a formulation by which to quantify it. The objective of such a framework is to establish a set of mutually consistent and complementary metrics that satisfy the discipline of dimensional analysis and that together provide a rigorous definition of productivity. Such a framework can serve as a tool for evaluation and comparison of alternate high performance computing systems.

Check your progress 1

Describe productivity metrics.


Notes: a) Write your answer in the space given below
b) Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

25.1.1 DEFECTS PER 100 HOURS OF TESTING

Program testing can only prove the presence of defects, never their
absence. Hence, it is reasonable to conclude that there is no end to testing and
more testing may reveal more new defects. But there may be a point of
diminishing returns when further testing may not reveal any defects. If
incoming defects in the product are reducing, it may mean various things.
1. Testing is not effective.
2. The quality of the product is improving.
3. Effort spent in testing is falling.
The first two aspects have been adequately covered in the metrics discussed above. The metric defects per 100 hours of testing covers the third point and normalizes the number of defects found in the product with respect to the effort spent. It is calculated as given below:
Defects per 100 hours of testing = (Total defects found in the product for a period / Total hours spent to get those defects) * 100
Effort plays an important role in judging quality. Suppose constant effort is spent in all the weeks and the defect-arrival chart produces a bell curve; that would indicate readiness for the release.
However, in real life, this assumption may not be true and effort is not spent equally on testing week by week. Sometimes the charts and analysis can be misleading: if the effort spent towards the end of the release reduces, the downward trend in defect arrival may be because of less focus on testing, not because of improved quality.
Now suppose instead that 15 hours are spent in weeks 9 and 10 and 120 hours in all other weeks. This assumption, which could well be the reality, actually suggests that the quality of the product has fallen and more defects were found by investing less effort in testing in weeks 9 and 10. This example clearly shows that the product is not ready for release at all.
It may be misleading to judge the quality of a product without looking at effort, because a downward trend in defect counts assumes that the effort spent is equal across all weeks. Normalizing by effort provides the insight that people were pulled out of testing or fewer people were available for testing, and that this is what made the defect count come down. Defects per 100 hours of testing provides this important perspective for making the right decision for the release.
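A short sketch of this normalization, using invented weekly numbers similar to the scenario above; note how the raw defect count falls in weeks 9 and 10 while the normalized rate rises:

    # (week, defects_found, testing_hours) -- invented sample data
    weekly = [(8, 18, 120), (9, 6, 15), (10, 5, 15)]

    for week, defects, hours in weekly:
        per_100 = defects / hours * 100
        print(f"week {week}: {defects} defects raw, "
              f"{per_100:.0f} defects per 100 hours of testing")
    # Raw counts fall in weeks 9-10, but the normalized rate rises,
    # revealing that less testing effort (not better quality) cut the counts.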

25.1.2 TEST CASES EXECUTED PER 100 HOURS OF TESTING

The number of test cases executed by the test team for a particular
duration depends on team productivity and quality of product. The team
productivity has to be calculated accurately so that it can be tracked for the
current release and be used to estimate the next release of the product. If the
quality of the product is good, more test cases can be executed, as there may
not be defects blocking the tests. Also, there may be few defects and the effort
required in filing, reproducing, and analyzing defects could be minimized.
Hence, test cases executed per 100 hours of testing helps in tracking
productivity and also in judging the product quality. It is calculated using the
formula
Test cases executed per 100 hours of testing = (Total test cases
executed for a period/Total hours spent in test execution) * 100

Check your progress 2

Define productivity.
Notes: a) Write your answer in the space given below
6. Check your answer with the one given at the end of this lesson.
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

25.1.3 TEST CASES DEVELOPED PER 100 HOURS OF TESTING

Both manual execution of test cases and automating test cases require
estimating and tracking of productivity numbers. In a product scenario, not all
test cases are written afresh for every release. New test cases are added to
address new functionality and for testing features that were not tested earlier.
Existing test cases are modified to reflect changes in the product. Some test

cases are deleted if they are no longer useful or if corresponding features are
removed from the product. Hence the formula for test cases developed uses the
count corresponding to added/modified and deleted test cases.
Test cases developed per 100 hours of testing = (Total test cases developed for a period / Total hours spent in test case development) * 100

25.1.4 DEFECTS PER 100 TEST CASES

Since the goal of testing is to find out as many defects as possible, it is appropriate to measure the "defect yield" of tests, that is, how many defects get uncovered during testing. This is a function of two parameters - one, the effectiveness of the tests in uncovering defects and two, the effectiveness of choosing tests that are capable of uncovering defects. The ability of a test case to uncover defects depends on how well the test cases are designed and developed. But, in a typical product scenario, not all test cases are executed for every test cycle. Hence, it is better to select test cases that produce defects. A measure that quantifies these two parameters is defects per 100 test cases. Yet another parameter that influences this metric is the quality of the product. If product quality is poor, it produces more defects per 100 test cases compared to a good quality product. The formula used for calculating this metric is
Defects per 100 test cases = (Total defects found for a period / Total test cases executed for the same period) * 100

25.1.5 DEFECTS PER 100 FAILED TEST CASES

Defects per 100 failed test cases is a good measure to find out how granular the test cases are. It indicates
1. How many test cases need to be executed when a defect is fixed;
2. What defects need to be fixed so that an acceptable number of test cases reach the pass state; and
3. How the fail rate of test cases and defects affect each other for release readiness analysis.
Defects per 100 failed test cases = (Total defects found for a period / Total test cases failed due to those defects) * 100
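Both of the per-100 metrics above follow the same normalization pattern. A minimal sketch with sample counts for one period:

    def per_100(count, base):
        # Generic "per 100" normalization used by both metrics.
        return count / base * 100 if base else 0.0

    total_defects, executed, failed = 24, 400, 60   # illustrative counts
    print("defects per 100 test cases:", per_100(total_defects, executed))        # 6.0
    print("defects per 100 failed test cases:", per_100(total_defects, failed))   # 40.0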

25.1.6 TEST PHASE EFFECTIVENESS

In Principles of Testing we saw that testing is not the job of testers alone.
Developers perform unit testing and there could be multiple testing teams
performing component, integration, and system testing phases. The idea of
testing is to find defects early in the cycle and in the early phases of testing. As
testing is performed by various teams with the objective of finding defects early
at various phases, a metric is needed to compare the defects filed by each of the
phases in testing. A phase-wise chart can show the defects found in the various phases, such as unit testing (UT), component testing (CT), integration testing (IT), and system testing (ST).

The following few observations can be made.
1. A good proportion of defects were found in the early phases of testing
(UT and CT).
2. Product quality improved from phase to phase (shown by less percent
of defects found in the later test phases-IT and ST).
Extending this data, some projections on post-release defects can be arrived at. CT found 32% of defects and IT found 17% of defects. This is approximately a 45% reduction in the number of defects. Similarly, approximately a 35% reduction in the number of defects was found going from IT to ST. A post-release projection can now assume a further 35% reduction in the number of defects, which amounts to about 7.5% of the total defects. A conservative estimate thus indicates that close to 7.5% of the total defects will be found by customers. This may not be an accurate estimate but can be used for staffing and planning of support activities.
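The projection arithmetic above can be sketched as follows. The ST percentage is an assumption here (taken as about 35% lower than IT, consistent with the text), so the final figure comes out near the quoted 7.5%.

    ct, it_ = 32.0, 17.0                  # percent of total defects found in CT and IT
    reduction_ct_to_it = (ct - it_) / ct  # ~0.47, "approximately 45%" in the text
    st = it_ * (1 - 0.35)                 # assume ST is ~35% lower than IT
    post_release = st * (1 - 0.35)        # apply the same ~35% reduction once more
    print(f"ST ~= {st:.1f}%, post-release ~= {post_release:.1f}% of total defects")
    # -> ST around 11%, post-release about 7.2% (close to the 7.5% quoted above)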

25.1.7 CLOSED DEFECT DISTRIBUTION

The objective of testing is not only to find defects. The testing team also
has the objective to ensure that all defects found through testing are fixed so
that the customer gets the benefit of testing and the product quality improves.
To ensure that most of the defects are fixed, the testing team has to track the
defects and analyze how they are closed. The closed defect distribution helps in
this analysis.

25.2 RELEASE METRICS

We discussed several metrics and how they can be used to determine whether the product is ready for release. The decision to release a product would need to consider several perspectives and several metrics. All the metrics that were discussed in the previous sections need to be considered in totality for making the release decision. The purpose of this section is to provide certain guidelines that will assist in making this decision. These are only a set of guidelines, and the exact number and nature of criteria can vary from release to release, product to product, and organization to organization.
The Center for Internet Security plans to release a set of security metrics to help
organizations gauge their security posture. The effort involved more than 80 IT
security experts from government, academia and business. The metrics help
organizations check on how effectively they have deployed security technologies
and policies.

25.3 LET US SUM UP

Check Your Progress: Model Answers


1. Productivity metrics combine several measurements and effort spent on
the product.
2. The general concept of “productivity” has been articulated as that
attribute of the entire process of computing that delivers ultimate value

to the end user mission. Productivity is defined to reflect the rate, value,
and costs of producing computational results in contribution to
achieving mission objectives of the using institution.

