software testing notes
As the programs are coded (in the chosen programming language), they are also tested.
In addition, after the coding is (deemed) complete, the product is subjected to testing.
Testing is the process of exercising the software product in pre-defined ways to check if the
behavior is the same as expected behavior.
By testing the product, an organization identifies and removes as many defects as possible
before shipping it out.
Deployment and Maintenance
Once a product is tested, it is given to the customers who deploy it in their environments.
The product now enters the maintenance phase, wherein the product is maintained or changed
to satisfy the changes that arise from customer expectations, environmental changes, etc.
Maintenance is made up of
corrective maintenance (for example, fixing customer-reported problems),
adaptive maintenance (for example, making the software run on a new version of an operating
system or database), and
preventive maintenance (for example, changing the application program code to avoid a
potential security hole in an operating system code).
QUALITY, QUALITY ASSURANCE, AND QUALITY CONTROL
Quality is meeting the requirements expected of the software, consistently and predictably.
The behavior of the software on a test is characterized by the outputs produced and by the changes actually produced in the internal state or environment.
If the actual behavior and the expected behavior are identical in all their characteristics, then that test
case is said to have passed. If not, the given software is said to have a defect on that test case.
There are two types of methods for ensuring quality: quality control and quality assurance.
Quality control attempts to build a product, test it for expected behavior after it is built, and if
the expected behavior is not the same as the actual behavior of the product, fixes the product as
is necessary and rebuilds the product.
This iteration is repeated till the expected behavior of the product matches the actual behavior
for the scenarios tested.
Thus quality control is defect-detection and defect-correction oriented, and works on the
product rather than on the process.
Quality assurance, on the other hand, attempts defect prevention by concentrating on the
process of producing the product rather than working on defect detection/correction after the
product is built.
Difference between quality assurance and quality control:
Quality assurance: defect-prevention oriented; usually done throughout the life cycle.
Quality control: defect-detection and defect-correction oriented; usually done after the product is built.
Verification is the process of evaluating a system or component to determine whether the products of
a given phase satisfy the conditions imposed at the start of that phase.
Timely testing increases the chances of a product or service meeting the customer's requirements.
When a product is tested with appropriate and realistic tests that reflect typical usage patterns by the
intended users, the chances of the product satisfying the customer's requirements are much higher.
While testing does not guarantee zero defects, effective testing certainly increases the chances of
customer acceptance of the software.
The purpose of testing is to uncover defects in the system (and to have someone fix the defects).
Testing is done by a set of people within a software product (or service) organization whose goal and
charter is to uncover the defects in the product before it reaches the customer.
Validation is the process of evaluating a system or component during or at the end of the
development process to determine whether it satisfies specified requirements.
Testing is NOT meant to replace other ways of ensuring quality (like reviews). It is one of the
methods to detect defects in a software product. There are other methods that achieve the same
function.
The idea of catching defects within each phase leads us to define two more terms—verification
and validation.
During the requirements gathering phase, the requirements are faithfully captured. The SRS
document is the product of the requirements phase. To ensure that requirements are faithfully
captured, the customer verifies this document.
The design phase takes the SRS document as input and maps the requirements to a design that
can drive the coding. The SDD document is the product of the design phase.
The SDD is verified by the requirements team to ensure that the design faithfully reflects the
SRS, which imposed the conditions at the beginning of the design phase.
Verification takes care of activities to focus on the question “Are we building the product right?” and
validation takes care of a set of activities to address the question “Are we building the right product?”
To build the product right, certain activities/conditions/procedures are imposed at the beginning of the
life cycle. These activities are considered “proactive” as their purpose is to prevent the defects before
they take
shape. The process activities carried out during various phases for each of the product releases can be
termed as verification. Requirements review, design review, and code review are some examples of
verification activities.
To build the right product, certain activities are carried out during various phases to validate
whether the product is built as per specifications. These activities are considered “reactive” as
their purpose is to find defects that affect the product and fix them as soon as they are introduced.
Some examples of validation include unit testing performed to verify if the code logic works,
integration testing performed to verify the design, and system testing performed to verify that the
requirements are met.
A process model is a way to represent any given phase of software development that effectively
builds in the concepts of validation and verification to prevent and minimize the delay between defect
injection and defect detection (and eventual correction). In this model, each phase of a software
project is characterized by the following.
Entry criteria, which specify when that phase can be started. Also included are the inputs
for the phase.
Tasks, or steps that need to be carried out in that phase, along with measurements that
characterize the tasks.
Verification, which specifies methods of checking that the tasks have been carried out correctly.
Exit criteria, which stipulate the conditions under which one can consider the phase as
done. Also included are the outputs of the phase.
This model, known as the Entry Task Verification eXit or ETVX model, offers several advantages
for effective verification and validation.
1. Clear entry criteria make sure that a given phase does not start prematurely.
2. The verification for each phase (or each activity in each phase) helps prevent defects, or
at least, minimizes the time delay between defect injection and defect detection.
3. Documentation of the detailed tasks that comprise each phase reduces the ambiguity in
interpretation of the instructions and thus minimizes the variations that can come from repeated
executions of these tasks by different individuals.
4. Clear exit criteria provide a means of validation of the phase, after the phase is done but
before handing over to the next phase.
An example of applying the ETVX model to the design phase is presented in the figure.
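To make the four elements of a phase concrete, the sketch below shows one way a phase could be captured as a simple checklist structure in C, using the design phase as the example. The structure, field names, and sample values are illustrative assumptions and not part of the ETVX model's definition.

#include <stdio.h>

/* Illustrative checklist for one ETVX phase; field names are assumptions. */
struct etvx_phase {
    const char *name;
    const char *entry_criteria;  /* when the phase can start, plus its inputs  */
    const char *tasks;           /* steps to be carried out, with measurements */
    const char *verification;    /* checks that the tasks were done correctly  */
    const char *exit_criteria;   /* when the phase is done, plus its outputs   */
};

int main(void) {
    struct etvx_phase design = {
        .name           = "Design",
        .entry_criteria = "Approved SRS available",
        .tasks          = "Map requirements to high-level and low-level design",
        .verification   = "Design review against the SRS",
        .exit_criteria  = "Reviewed SDD handed over to the coding phase",
    };
    printf("Phase: %s\nEntry: %s\nTasks: %s\nVerification: %s\nExit: %s\n",
           design.name, design.entry_criteria, design.tasks,
           design.verification, design.exit_criteria);
    return 0;
}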
The ETVX model characterizes a phase of a project. A Life Cycle model describes how the phases
combine together to form a complete project or life cycle. Such a model is characterized by the
following attributes.
The activities performed: In any given software project, apart from the most common activities or
phases—requirements gathering, design, development, testing, and maintenance—there could be
other activities as well. Some of these activities could be technical activities (for example, porting)
and some could be non-technical (for example, hiring).
The deliverables from each activity: Each activity produces a set of deliverables, which are the end
products of that activity. For example, the requirements gathering phase produces the SRS document,
the design phase produces the SDD document, and so on.
Methods of validation of the deliverables: The outputs produced by a given activity represent the
goal to be satisfied by that activity. Hence it is necessary to have proper validation criteria for each
output.
The sequence of activities: The different activities work together in unison in a certain sequence of
steps to achieve overall project goals. For example, the process of requirements gathering may
involve steps such as interviews with customers, documentation of requirements, validation of
documented requirements with customers, and freezing of requirements. These steps may be
repeated as many times as needed to get the final frozen requirements.
Waterfall Model
In the Waterfall model, a project is divided into a set of phases (or activities).
A project starts with an initial phase, and upon completion of the phase, moves on to the next
phase. On the completion of this phase, the project moves to the subsequent phase and so on.
Thus the phases are strictly time sequenced.
Waterfall model.
A Waterfall model is characterized by three attributes.
1. The project is divided into separate, distinct phases.
2. Each phase communicates to the next through pre-specified outputs.
3. When an error is detected, it is traced back one previous phase at a time, until it gets
resolved at some earlier phase.
The main strength of the Waterfall Model is its simplicity. The model is very useful when a
project can actually be divided into watertight compartments. But very few software projects
can be divided thus.
The major drawback in the Waterfall model arises from the delay in feedback among the
phases, and thus the ineffectiveness of verification and validation activities. An error in one
phase is not detected till at least the next phase.
When a given phase detects an error, the communication is only to the immediately preceding
phase. This sequential nature of communication among the phases can introduce inordinate
delays in resolving the problem.
The reduced responsiveness that is inherent in the model and the fact that the segregation of
phases is unrealistic severely restrict the applicability of this model.
Prototyping and Rapid Application Development (RAD) models recognize and address the following issues.
1. Early and frequent user feedback will increase the chances of a software project
meeting the customers’ requirements.
2. Changes are unavoidable and the software development process must be able to adapt itself to
rapid changes.
1. A Prototyping model uses constant user interaction, early in the requirements gathering
stage, to produce a prototype.
2. The prototype is used to derive the system requirements specification and can be discarded
after the SRS is built.
3. An appropriate life cycle model is chosen for building the actual product after the user accepts the
SRS.
The Prototyping model comprises the following activities.
1. The software development organization interacts with customers to understand their requirements.
2. The software development organization produces a prototype to show how the eventual
software system would look. This prototype would have models of how the input
screens and output reports would look, in addition to having some “empty can
functionality” to demonstrate the workflow and processing logic.
3. The customer and the development organization review the prototype frequently so that the
customer's feedback is taken very early in the cycle (that is, during the requirements gathering
phase).
4. Based on the feedback and the prototype that is produced, the software development
organization produces the System Requirements Specification document.
5. The SRS document is used as the basis for further design and development.
Thus, the prototype is simply used as a means of quickly gathering (the right) requirements.
The Rapid Application Development model
The Rapid Application Development model is a variation of the Prototyping Model. Like the
Prototyping Model, the RAD Model relies on feedback and interaction by the customers to gather the
initial requirements. However, the Prototyping model differs from the RAD Model on two counts.
First, in the RAD Model, it is not a prototype that is built but the actual product itself. That is,
the built application (prototype, in the previous model) is not discarded. Hence, it is named
Rapid Application Development model.
Second, in order to ensure formalism in capturing the requirements and proper reflection of the
requirements in the design and subsequent phases, a Computer Aided Software Engineering (CASE)
tool is used throughout the life cycle, right from requirements gathering. Such CASE tools have
built-in support for capturing requirements and carrying them forward into design and, in some cases, program code.
The methodologies provided by a CASE tool can provide inbuilt means of verification and
validation. For example, the tool may be able to automatically detect and resolve inconsistencies in
data types or dependencies. Since the design (and, perhaps, even the program code) can be
automatically generated from the requirements, the validation can be very complete, extending to all
the downstream phases, unlike the Prototyping model.
This method can have wider applicability for even general-purpose products. The automatic generation
of the design and programs produced by a CASE tool makes this model more attractive. The cost of
such CASE tools is a factor that an organization would have to consider before deciding on the use of
this model for a given project. In addition, CASE tools and this model are generally more suited for
application projects rather than systems-type projects.
The Spiral or Iterative Model
The Spiral or Iterative model follows a process in which the requirements gathering, design,
coding, and testing are performed iteratively till all requirements are met.
There is also a good amount of overlap among the activities of requirements gathering,
design, coding, and testing following this model.
What phase the product is in is difficult to conclude as each requirement can be at a different
phase.
The only conclusion that can be made is at what phase each of the requirements is in. If a
defect is produced in any phase of a given requirement, it may cause that requirement to
revisit an earlier phase.
This model enables incremental development whereby the product evolves, with requirements
getting added to it dynamically. This enables the product to be demonstrated, at any point of time, with the functionality
available at that point of time.
It also enables the “increments” to be sent to the customer for approval. The progress of the
product can be seen from the beginning of the project as the model delivers “increments” at
regular intervals. Even though it will be very difficult to plan a release date following this
model, it allows the progress to be tracked and the customer approvals to be obtained at
regular intervals, thereby reducing the risk of finding major defects at a later point of time.
The table below gives an example of the phases that some of the requirements in the product are in.
Requirement      Phase currently in
Requirement-2    Design
Requirement-3    Requirements gathering
Requirement-4    Testing
Requirement-5    Released
The figure depicts the Spiral model and the phases involved in the model, for the example in the table. As can be seen,
each requirement is “spiraling outwards” through the different phases as the entire project evolves.
Spiral model.
The V Model
The V Model starts off being similar to the Waterfall Model in that it envisages product development
to be made up of a number of phases or levels. However, the new perspective that the V Model brings
in is that different types of testing apply at different levels. Thus, from a testing perspective, the types
of tests that need to be done at each level vary significantly.
In order to realize the software requirements, the proposed software system is envisaged as a set of
subsystems that work together.
This high-level design (of breaking the system into subsystems with identified interfaces) then
gets translated to a more detailed or low-level design.
This detailed design goes into issues like data structures, algorithm choices, table layouts,
processing logic, exception conditions, and so on. It results in the identification of a number of
components, each component realized by program code written in appropriate programming
languages.
Given these levels, what kind of tests apply in each of these levels?
To begin with, for overall business requirements, eventually whatever software is developed
should fit into and work in this overall context and should be accepted by the end users, in their
environment.
This testing, the final proof of the pudding, is acceptance testing.
But, before the product is deployed in the customer's environment, the product vendor should
test it as an entire unit to make sure that all the software requirements are satisfied by the
product that is developed. This testing of the entire software system can be called system
testing.
Since high-level design views the system as being made up of interoperating and integrated
(software) subsystems, the individual subsystems should be integrated and tested together before
a full blown system test can be done. This testing of high-level design corresponds to
integration testing.
The components that are the outputs of the low-level design have to be tested independently
before being integrated. Thus, the testing corresponding to the low-level design phase is component testing.
Finally, since coding produces several program units, each of these smaller program units has to be
tested independently before trying to combine them together to form components. This testing of the
program units forms unit testing.
The figure depicts the different types of testing that apply to each of the steps. In other words, the step
called “testing” is now broken down into different sub-steps called acceptance testing, system testing,
and so on. So, it is still the case that all the testing execution related activities are done only at the end
of the life cycle.
2. Test design is done early, while test execution is done in the end.
3. There are different types of tests for each phase of life cycle.
Not only are the skill sets required for designing these different types of tests different, but also, there
is no reason to defer the designing of the tests till the very end. As and when each activity on the
left-hand side of the “V” is being carried out, the design of the corresponding type of tests can be carried out. By
performing an early design of the tests and deferring only the test execution till the end, we achieve
three important gains.
First, we achieve more parallelism and reduce the end-of-cycle time taken for testing.
Second, by designing tests for each activity upfront, we are building in better upfront
validation, thus again reducing last-minute surprises.
This is the basis for the V Model, which presents excellent advantages for verification and validation.
As shown in the figure below, for each type of test, we move the design of the tests upstream, along with the
actual activities, and retain the test execution downstream, after the product is built.
V Model.
Modified V Model
The V Model split the design and execution portion of the various types of tests and attached the test
design portion to the corresponding earlier phases of the software life cycle.
In the modified V Model, this parallelism is exploited. When each unit or component or module is
given explicit exit criteria to pass on to the subsequent stage, the units or components or modules that
satisfy a given phase of testing move to the next phase of testing where possible, without necessarily
waiting for all the units or components or modules to move in unison from one phase of testing to
another.
Modified V Model.
1. The modified V model recognizes that different parts of a product are in different stages of evolution.
2. Each part enters the appropriate testing phase (such as unit testing, component testing, and
so on) when the appropriate entry criteria are met.
Just as the V Model introduced various types of testing, the modified V model introduces
various phases of testing.
A phase of testing has a one-to-one mapping to the types of testing, that is, there is a unit-
testing phase, component-testing phase, and so on.
Once a unit has completed the unit-testing phase, it becomes part of a component and enters the
component-testing phase.
It then moves to integration-testing phase and so on. Rather than view the product as going
through different types of tests (as the V model does), the modified V Model views each part of
the product to go through different phases of testing.
These are actually two sides of the same coin and thus provide complementary views. The main
advantage the modified V model brings to the table is the recognition of the parallelism present
in different parts of the product and assigning each part to the most appropriate phase of testing
that is possible. In the figure, the columns of the table represent one side of the V, and the rows (which are
test phases) represent the other side of the V.
While starting a phase of testing, it is important to look at whether the product is ready for
testing. This readiness is determined by a set of entry criteria.
The entry criteria denote the earliest point at which the next phase of testing can start; the earlier
phase of testing need not have been completed in full before the next phase starts.
Each testing phase is also associated with a set of exit criteria, which determine when the test
activities for that phase are complete.
The entry and exit criteria for each of the phases ensure that a product of the right quality is delivered
for starting the tests and that the right amount of testing is completed for the release.
Even though the picture indicates that all of the test phases finish at the same time, in practice
they can have different timelines. The longest phase determines the release date.
WHITE BOX TESTING
Every software product is realized by means of program code. White box testing is a way of
testing the external functionality of the code by examining and testing the program code that
realizes the external functionality. This is also known as clear box, glass box, or open box testing.
STATIC TESTING
Static testing is a type of testing which requires only the source code of the product, not the binaries
or executables. Static testing does not involve executing the programs on computers but involves
select people going through the code to find out whether the code has been written in accordance
with the design developed earlier in the project life cycle, whether applicable coding standards and
guidelines have been followed, and so on.
Static testing can be done by humans or with the help of specialized tools.
These methods rely on the principle of humans reading the program code to detect errors
rather than computers executing the code to find errors. This process has several advantages.
1. Sometimes humans can find errors that computers cannot. For example, when there are two
variables with similar names and the programmer uses a “wrong” variable by mistake in an
expression, the computer will not detect the error but will execute the statement and produce
incorrect results, whereas a human being can spot such an error (a small illustration follows this list).
2. By making multiple humans read and evaluate the program, we can get multiple
perspectives and therefore have more problems identified upfront than a computer could.
3. A human evaluation of the code can compare it against the specifications or design and thus
ensure that it does what is intended to do. This may not always be possible when a computer
runs a test.
4. A human evaluation can detect many problems at one go and can even try to identify the root
causes of the problems. More often than not, multiple problems can get fixed by attending to
the same root cause. Typically, in reactive testing, a test uncovers one problem (or, at best, a
few problems) at a time. Often, such testing only reveals the symptoms rather than the root causes. Thus, the
overall time required to fix all the problems can be reduced substantially by a human
evaluation.
5. By making humans test the code before execution, computer resources can be saved. Of
course, this comes at the expense of human resources.
6. A proactive method of testing like static testing minimizes the delay in identification of the
problems.
7. From a psychological point of view, finding defects later in the cycle (for example, after the
code is compiled and the system is being put together) creates immense pressure on
programmers. They have to fix defects with less time to spare. With this kind of pressure, there
are higher chances of other defects creeping in.
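As an illustration of the first advantage above, the short C fragment below shows the kind of slip a compiler happily accepts but a human reader catches when comparing the code against its intent; the variable names are made up for the example.

#include <stdio.h>

/* total_count and total_cost have similar names. Using the wrong one still
 * compiles and runs, but silently produces a wrong result; only a human
 * reading the code against the specification is likely to spot it. */
int main(void) {
    int total_count = 4;        /* number of items        */
    double total_cost = 100.0;  /* sum of the item prices */

    /* Intended: average = total_cost / total_count; */
    double average = total_cost / total_cost;   /* wrong variable used */

    printf("average = %.2f\n", average);        /* prints 1.00, not 25.00 */
    return 0;
}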
There are multiple methods to achieve static testing by humans. They are (in the increasing
order of formalism) as follows.
1. Desk checking of the code
2. Code walkthrough
3. Code review
4. Code inspection
Desk checking
Normally done manually by the author of the code, desk checking is a method to verify portions of
the code for correctness. Such verification is done by comparing the code with the design or
specifications to make sure that the code does what it is supposed to do, and does so effectively. This is the
desk checking that most programmers do before compiling and executing the code. Whenever errors
are found, the author applies the corrections on the spot. This method of catching and
correcting errors is characterized by its informality.
In effect, this method relies completely on the author's thoroughness, diligence, and skills. There is no
process or structure that guarantees or verifies the effectiveness of desk checking. This method is
effective for
correcting “obvious” coding errors but will not be effective in detecting errors that arise due to
incorrect understanding of requirements or incomplete requirements. This is because developers (or,
more precisely, programmers who are doing the desk checking) may not have the domain knowledge
required to understand the requirements fully.
The main advantage offered by this method is that the programmer who knows the code and the
programming language very well is well equipped to read and understand his or her own code. Also,
since this is done by one individual, there are fewer scheduling and logistics overheads. Furthermore,
the defects are detected and corrected with minimum time delay.
However, this method of testing suffers from some disadvantages.
1. A developer is not the best person to detect problems in his or her own code. He or she may be
tunnel-visioned and have blind spots to certain types of problems.
2. Developers generally prefer to write new code rather than do any form of testing.
3. This method is essentially person-dependent and informal and thus may not work consistently
across all developers.
Owing to these disadvantages, the next two types of proactive methods are introduced. The basic
principle of walkthroughs and formal inspections is to involve multiple people in the review process.
Code walkthrough
This method and formal inspection are group-oriented methods. Walkthroughs are less formal
than inspections.
The line drawn in formalism between walkthroughs and inspections is very thin and varies
from organization to organization.
The advantage that walkthrough has over desk checking is that it brings multiple perspectives.
In walkthroughs, a set of people look at the program code and raise questions for the author.
The author explains the logic of the code, and answers the questions.
If the author is unable to answer some questions, he or she then takes those questions and finds
their answers. Completeness is limited to the area where questions are raised by the team.
Formal inspection
Code inspection—also called Fagan Inspection (named after the original formulator)—is a method,
normally with a high degree of formalism. The focus of this method is to detect all faults, violations,
and other side-effects. This method increases the number of defects detected by demanding thorough
preparation before the inspection and by enlisting multiple reviewers, each with a defined role.
A formal inspection should take place only when the author has made sure the code is ready
for inspection by performing some basic desk checking and walkthroughs.
When the code is in such a reasonable state of readiness, an inspection meeting is arranged.
There are four roles in inspection.
First is the author of the code.
Second is a moderator who is expected to formally run the inspection according to the process.
Third are the inspectors. These are the people who actually provide review comments on the
code. There are typically multiple inspectors.
Finally, there is a scribe, who takes detailed notes during the inspection meeting and circulates
them to the inspection team after the meeting.
The author or the moderator selects the review team. The chosen members have the skill sets
to uncover as many defects as possible.
In an introductory meeting, the inspectors get copies (these can be hard copies or soft
copies) of the code to be inspected along with other supporting documents such as the design
document, requirements document, and any documentation of applicable standards.
The author also presents his or her perspective of what the program is intended to do along
with any specific issues that he or she may want the inspection team to put extra focus on.
The moderator informs the team about the date, time, and venue of the inspection meeting.
The inspectors get adequate time to go through the documents and program and ascertain
their compliance to the requirements, design, and standards.
The inspection team assembles at the agreed time for the inspection meeting (also called the
defect logging meeting). The moderator takes the team sequentially through the program code,
asking each inspector if there are any defects in that part of the code.
If any of the inspectors raises a defect, then the inspection team deliberates on the defect and,
when agreed that there is a defect, classifies it in two dimensions: minor/major and
systemic/mis-execution.
A mis-execution defect is one which, as the name suggests, happens because of an error or slip
on the part of the author.
It is unlikely to be repeated later, either in this work product or in other work products. An
example of this is using a wrong variable in a statement. Systemic defects, on the other hand,
can require correction at a different level.
A scribe formally documents the defects found in the inspection meeting and the author takes
care of fixing these defects. In case the defects are severe, the team may optionally call for a
review meeting to inspect the fixes to ensure that they address the problems. In any case,
defects found through inspection need to be tracked till completion and someone in the team
has to verify that the problems have been fixed properly.
The review and inspection mechanisms described above involve a significant amount of manual work.
There are several static analysis tools available in the market that can reduce the manual work and
perform analysis of the code to find out errors such as those listed below.
5. Memory allocated but not having corresponding statements for freeing up the memory.
Some of the static analysis tools can also check compliance for coding standards as prescribed by
standards such as POSIX. These tools can also check for consistency in coding guidelines (for
example, naming conventions, allowed data types, permissible programming constructs, and so on).
The following are examples of the kinds of questions a code review checklist may cover.
If the programming language allows mixed case names, are there variable names with
confusing use of lower case letters and capital letters?
Are there similar sounding names (especially words in singular and plural)? [These could be
possible causes of unintended errors.]
Are all the common structures, constants, and flags to be used defined in a header file rather
than in each file separately?
Has the usage of similar-looking operators (for example, = and == or & and && in C) been checked? (A short illustration follows this group of questions.)
If there is a nested IF statement, are the THEN and ELSE parts appropriately delimited?
In the case of a multi-way branch like a SWITCH/CASE statement, is a default clause
provided? Are the breaks after each CASE appropriate?
Are there any loops where the final condition will never be met and hence cause the
program to go into an infinite loop?
What is the level of nesting of the conditional statements? Can the code be simplified to
reduce complexity?
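As a small illustration of the similar-looking-operator question above, the C fragment below shows how = versus == and & versus && slip through compilation; the example is contrived for demonstration.

#include <stdio.h>

int main(void) {
    int code = 0;

    /* Intended: compare code with 5. Written with '=', this assigns 5 to
     * code and the condition is always true; it compiles cleanly, which is
     * why a review checklist item is what usually catches it. */
    if (code = 5) {
        printf("branch taken: code is now %d\n", code);
    }

    /* A similar trap with '&' versus '&&': '&' is a bitwise AND, so
     * 2 & 1 evaluates to 0 and this branch is (perhaps unexpectedly) skipped. */
    if (2 & 1) {
        printf("this line is never printed\n");
    }
    return 0;
}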
STANDARDS RELATED
Does the code follow any coding conventions that are platform specific (for example, GUI
calls specific to Windows or Swing)?
STYLE RELATED
Are unhealthy programming constructs (for example, global variables in C, ALTER statement
in COBOL) being used in the program?
MISCELLANEOUS
Have you checked for memory leaks (for example, memory acquired but not explicitly freed)?
DOCUMENTATION RELATED
Is the code adequately documented, especially where the logic is complex or the section of
code is critical for product functioning?
Is appropriate change history documented? Are the interfaces and the parameters thereof
properly documented?
STRUCTURAL TESTING
Structural testing takes into account the code, code structure, internal design, and how they are
coded. The fundamental difference between structural testing and static testing is that in
structural testing tests are actually run by the computer on the built product, whereas in static
testing, the product is tested by humans using just the source code and not the executables or
binaries.
Structural testing entails running the actual product against some predesigned test cases to
exercise as much of the code as possible or necessary. A given portion of the code is exercised
if a test case causes the program to execute that portion of the code when running the test.
Structural testing can be further classified into unit/code functional testing, code
coverage, and code complexity testing.
This initial part of structural testing corresponds to some quick checks that a developer performs
before subjecting the code to more extensive code coverage testing or code complexity testing. This
can happen by several methods.
1. Initially, the developer can perform certain obvious tests, knowing the input variables and
the corresponding expected output variables. This can be a quick test that checks out any
obvious mistakes. By repeating these tests for multiple values of input variables, the
confidence level of the developer to go to the next level increases. This can even be done
prior to formal reviews of static testing so that the review mechanism does not waste time
catching obvious errors.
2. For modules with complex logic or conditions, the developer can build a “debug version” of
the product by putting in intermediate print statements and making sure the program is passing
through the right loops and iterations the right number of times (a minimal sketch follows this list). It is important to remove the
intermediate print statements after the defects are fixed.
3. Another approach to do the initial test is to run the product under a debugger or an Integrated
Development Environment (IDE). These tools allow single stepping of instructions (allowing
the developer to stop at the end of each instruction, view or modify the contents of variables,
and so on), setting break points at any function or instruction, and viewing the various system
parameters or program variable values.
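A minimal sketch of the “debug version” idea from point 2 above: intermediate print statements are guarded by a DEBUG flag so they can be compiled out once the defects are fixed. The loop itself is only a placeholder computation.

#include <stdio.h>

#define DEBUG 1   /* set to 0 (or remove the prints) once the defects are fixed */

/* Placeholder computation: sum of the first n natural numbers. */
static int sum_up_to(int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++) {
        sum += i;
#if DEBUG
        /* Intermediate print to confirm the loop runs the right number of
         * times and accumulates the expected partial results. */
        printf("iteration %d: sum = %d\n", i, sum);
#endif
    }
    return sum;
}

int main(void) {
    printf("sum_up_to(5) = %d\n", sum_up_to(5));   /* expected: 15 */
    return 0;
}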
Code coverage testing involves designing and executing test cases and finding out the percentage of
code that is covered by testing. The percentage of code covered by a test is found by adopting a
technique called instrumentation of code.
There are specialized tools available to achieve instrumentation. Instrumentation rebuilds the product,
linking the product with a set of libraries provided by the tool vendors.
This instrumented code can monitor and keep an audit of what portions of code are covered. The tools
also allow reporting on the portions of the code that are covered frequently, so that the critical or
most-often-used portions of code can be identified.
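To show the idea behind instrumentation (without any particular tool), the sketch below hand-rolls the kind of bookkeeping a coverage tool injects when it rebuilds the product: each block of interest increments a counter, and blocks whose counter stays at zero were never covered by the tests. Real tools add such probes automatically through their libraries.

#include <stdio.h>

/* Hand-rolled stand-in for tool-based instrumentation: one counter per
 * block of interest. */
enum { BLOCK_THEN, BLOCK_ELSE, NUM_BLOCKS };
static unsigned long hits[NUM_BLOCKS];

static const char *classify(int value) {
    if (value >= 0) {
        hits[BLOCK_THEN]++;          /* probe for the then part */
        return "non-negative";
    } else {
        hits[BLOCK_ELSE]++;          /* probe for the else part */
        return "negative";
    }
}

int main(void) {
    /* "Tests": only non-negative inputs, so the else block is never covered. */
    classify(10);
    classify(0);

    printf("then-block hits: %lu\n", hits[BLOCK_THEN]);
    printf("else-block hits: %lu (zero means this code was never covered)\n",
           hits[BLOCK_ELSE]);
    return 0;
}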
Code coverage testing is made up of the following types of coverage.
1. Statement coverage
2. Path coverage
3. Condition coverage
4. Function coverage
Statement coverage
Program constructs in most conventional programming languages fall into a few broad types: sequentially
executed statements, two-way decision statements (if then else), multi-way decision statements (switch),
and loop constructs (for, while, repeat, and so on). Object-oriented languages have all of the above and,
in addition, a number of other constructs and concepts.
Statement coverage refers to writing test cases that execute each of the program statements. One can
start with the assumption that the more the code covered, the better is the testing of the functionality, as
the code realizes the functionality. Based on this assumption, code coverage can be achieved by
providing coverage to each of the above types of statements.
When we consider a two-way decision construct like the if statement, then to cover all the statements,
we should also cover the then and else parts of the if statement. This means we should have, for each
if then else, (at least) one test case to test the Then part and (at least) one test case to test the else part.
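For example, consider the small C function below: to get full statement coverage of its if then else, at least one test input must drive execution into the then part and one into the else part. The function and the chosen inputs are made up for illustration.

#include <stdio.h>

/* Illustrative function with a two-way decision. */
static double apply_discount(double amount) {
    double discounted;
    if (amount > 1000.0)
        discounted = amount * 0.90;   /* then part: 10 percent discount */
    else
        discounted = amount;          /* else part: no discount         */
    return discounted;
}

int main(void) {
    /* Test case 1 exercises the then part, test case 2 the else part;
     * together they cover every statement in the function. */
    printf("apply_discount(1500) = %.2f\n", apply_discount(1500.0));
    printf("apply_discount(500)  = %.2f\n", apply_discount(500.0));
    return 0;
}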
The multi-way decision construct such as a switch statement can be reduced to multiple two-way
if statements. Thus, to cover all possible switch cases, there would be multiple test cases.
Loop constructs present more variations to take care of. A loop—in various forms such as for,
while, repeat, and so on—is characterized by executing a set of statements repeatedly until or while
certain conditions are met. A good percentage of the defects in programs come about because of
loops that do not function properly. More often, loops fail in what are called “boundary conditions.”
One of the common looping errors is that the termination condition of the loop is not properly
stated. In order to make sure that there is better statement coverage for statements within a loop,
there should be test cases that
1. Skip the loop completely, so that the situation of the termination condition being true before
starting the loop is tested.
2. Exercise the loop between once and the maximum number of times, to check all possible
“normal” operations of the loop.
3. Try covering the loop around the “boundary” of n, the maximum number of times the loop can execute—that is, just below n, at n, and just above n.
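A sketch of what these three kinds of loop tests can look like for a simple counted loop; the function, the data, and the chosen maximum are assumptions made for the example.

#include <stdio.h>

/* Illustrative loop: counts how many of the first n readings exceed a
 * threshold. The array is assumed to hold at most MAX_READINGS entries. */
#define MAX_READINGS 5

static int count_over(const int *readings, int n, int threshold) {
    int count = 0;
    for (int i = 0; i < n; i++)       /* termination condition under test */
        if (readings[i] > threshold)
            count++;
    return count;
}

int main(void) {
    int data[MAX_READINGS] = { 3, 9, 12, 1, 20 };

    /* 1. Skip the loop completely: n = 0, termination is true on entry.   */
    printf("n=0 -> %d\n", count_over(data, 0, 5));
    /* 2. Exercise the loop a "normal" number of times: 1 <= n < maximum.  */
    printf("n=2 -> %d\n", count_over(data, 2, 5));
    /* 3. Around the boundary of the maximum: just below it and at it (a
     *    further test just above it would probe out-of-range handling).   */
    printf("n=4 -> %d\n", count_over(data, 4, 5));
    printf("n=5 -> %d\n", count_over(data, 5, 5));
    return 0;
}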
The statement coverage for a program, which is an indication of the percentage of statements
actually executed in a set of tests, can be calculated by the formula given below.
Statement Coverage = (Total statements exercised / Total number of executable statements in program) * 100
It is clear from the above discussion that as the type of statement progresses from a simple
sequential statement to if then else and through to loops, the number of test cases required to
achieve statement coverage increases.
Even if we were to achieve a very high level of statement coverage, it does not mean that the
program is defect-free. First, consider a hypothetical case when we achieved 100 percent code
coverage. If the program implements wrong requirements and this wrongly implemented code is
“fully tested,” with 100 percent code coverage, it still is a wrong program and hence the 100 percent
code coverage does not mean anything.
/* Reconstructed sketch of the example; the variable names Total, value, and
   percent are illustrative. */
Total = 0;
if (code == "M") {
    stmt1;
    stmt2;
    stmt3;
    stmt4;
    stmt5;
    stmt6;
    stmt7;
}
else
    percent = value / Total * 100;   /* divide by zero, since Total is 0 here */
In the above program, when we test with code=“M,” we will get 80 percent code coverage. But if
the data distribution in the real world is such that 90 percent of the time, the value of code is
not=“M,” then, the program will fail 90 percent of the time (because of the divide by zero in the
highlighted line). Thus, even with a code coverage of 80 percent, we are left with a defect that hits
the users 90 percent of the time.
Path coverage
In path coverage, we split a program into a number of distinct paths. A program (or a part of a
program) can start from the beginning and take any of the paths to its completion.
Path Coverage = (Total paths exercised / Total number of paths in program) * 100
Let us take an example of a date validation routine. The date is accepted as three fields: mm, dd, and
yyyy. We have assumed that prior to entering this routine, the values are checked to be numeric. To
simplify the discussion, we have assumed the existence of a function called leapyear which will
return TRUE if the given year is a leap year. There is an array called DayofMonth which contains the
number of days in each month. A simplified flow chart is given below.
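The flow chart itself is not reproduced here, but a C sketch of the kind of routine being described is given below. The exact structure of the original routine is an assumption; the leapyear helper and the DayofMonth array follow the description above, and each decision point gives rise to the distinct paths discussed next.

#include <stdio.h>

/* Returns 1 if the given year is a leap year, 0 otherwise. */
static int leapyear(int yyyy) {
    return (yyyy % 4 == 0 && yyyy % 100 != 0) || (yyyy % 400 == 0);
}

/* Days in each month for a non-leap year; February is adjusted below. */
static const int DayofMonth[12] = { 31, 28, 31, 30, 31, 30,
                                    31, 31, 30, 31, 30, 31 };

/* Sketch of a date validation routine: the fields are assumed to be numeric
 * already, as stated above. Returns 1 if the date is valid, 0 otherwise. */
static int valid_date(int mm, int dd, int yyyy) {
    if (mm < 1 || mm > 12)               /* month-range decision */
        return 0;

    int max_dd = DayofMonth[mm - 1];
    if (mm == 2 && leapyear(yyyy))       /* leap-year decision   */
        max_dd = 29;

    if (dd < 1 || dd > max_dd)           /* day-range decision   */
        return 0;

    return 1;
}

int main(void) {
    printf("29/02/2024 valid? %d\n", valid_date(2, 29, 2024));  /* 1 */
    printf("29/02/2023 valid? %d\n", valid_date(2, 29, 2023));  /* 0 */
    printf("31/04/2023 valid? %d\n", valid_date(4, 31, 2023));  /* 0 */
    return 0;
}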
As can be seen from the figure, there are different paths that can be taken through the program. Each
part of the path is shown in red. Some of the paths are
A
B-D-G
B-D-H
B-C-E-G
B-C-E-H
B-C-F-G
B-C-F-H
Regardless of the number of statements in each of these paths, if we can execute these paths, then we
would have covered most of the typical scenarios.
Path coverage provides a stronger condition of coverage than statement coverage as it relates to the
various logical paths in the program rather than just program statements.
Condition coverage
Condition Coverage = (Total decisions exercised / Total number of decisions in program ) * 100
The condition coverage, as defined by the formula above, gives an indication of the percentage of
conditions covered by a set of test cases. Condition coverage is a much stronger criterion than path
coverage, which in turn is a much stronger criterion than statement coverage.
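To see why covering each condition matters, consider the contrived C decision below: a single decision contains two atomic conditions, and a test suite can exercise both outcomes of the decision without ever making the second condition the deciding factor.

#include <stdio.h>

/* The decision (is_member || amount > 1000.0) is one branch point but holds
 * two atomic conditions. A test with is_member = 1 takes the true branch
 * without the amount condition ever mattering; condition coverage asks for
 * tests in which each atomic condition is exercised as both true and false. */
static int gets_discount(int is_member, double amount) {
    if (is_member || amount > 1000.0)
        return 1;
    return 0;
}

int main(void) {
    /* These two tests already cover both outcomes of the decision ...      */
    printf("%d\n", gets_discount(1, 10.0));     /* decision true  */
    printf("%d\n", gets_discount(0, 10.0));     /* decision false */
    /* ... condition coverage also wants the amount condition to decide:    */
    printf("%d\n", gets_discount(0, 2000.0));   /* second condition drives it */
    return 0;
}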
Function coverage
This is a new addition to structural testing to identify how many program functions (similar to
functions in “C” language) are covered by test cases.
The advantages that function coverage provides over the other types of coverage are as follows.
1. Functions are easier to identify in a program and hence it is easier to write test cases to
provide function coverage.
2. Since functions are at a much higher level of abstraction than code, it is easier to achieve 100
percent function coverage than 100 percent coverage in any of the earlier methods.
3. Functions have a more logical mapping to requirements and hence can provide a more direct
correlation to the test coverage of the product.
4. Since functions are a means of realizing requirements, the importance of functions can be
prioritized based on the importance of the requirements they realize. Thus, it would be easier
to prioritize the functions for testing. This is not necessarily the case with the earlier methods
of coverage.
Thus, function coverage can help in improving the performance as well as quality of the product.
Function Coverage = (Total functions exercised / Total number of functions in program) * 100
Two questions that come to mind while using these coverage measures are:
1. Which of the paths are independent? If two paths are not independent, then we may be
able to minimize the number of tests.
2. Is there an upper bound on the number of tests that must be run to ensure that all the
statements have been executed at least once?
Cyclomatic complexity is a metric that quantifies the complexity of a program and thus provides
answers to the above questions.
A program is represented in the form of a flow graph. A flow graph consists of nodes and edges. In
order to convert a standard flow chart into a flow graph to compute cyclomatic complexity, the
following steps can be taken.
1. Identify the predicates or decision points (typically the Boolean conditions or conditional
statements) in the program.
2. Ensure that the predicates are simple (that is, no compound conditions using and, or, and so on in each predicate).
3. Combine all sequential statements into a single node. The reasoning here is that these
statements all get executed, once started.
4. When a set of sequential statements are followed by a simple predicate (as simplified in (2)
above), combine all the sequential statements and the predicate check into one node and have
two edges emanating from this one node. Such nodes with two edges emanating from them
are called predicate nodes.
5. Make sure that all the edges terminate at some node; add a node to represent all the
sets of sequential statements at the end of the program.
Figure: Converting a conventional flow chart to a flow graph.
Intuitively, a flow graph and the cyclomatic complexity provide indicators to the complexity of the
logic flow in a program and to the number of independent paths in a program. The primary
contributors to both the complexity and independent paths are the decision points in the program.
Consider a hypothetical program with no decision points. The flow graph of such a program would
have two nodes, one for the code and one for the termination node. Since all the sequential steps are
combined into one node (the first node), there is only one edge, which connects the two nodes. This
edge is the only independent path. Hence, for this flow graph, cyclomatic complexity is equal to one.
This graph has no predicate nodes because there are no decision points. Hence, the cyclomatic
complexity is also equal to the number of predicate nodes (0) + 1.
Note that in this flow graph, the edges (E) = 1 and nodes (N) = 2. The cyclomatic complexity is also
equal to E – N + 2 = 1 – 2 + 2 = 1.
When a predicate node is added to the flow graph, there are obviously two independent paths, one
following the path when the Boolean condition is TRUE and one when the Boolean condition is
FALSE. Thus, the cyclomatic complexity of the graph is 2.
Incidentally, this number of independent paths, 2, is again equal to the number of predicate nodes (1) +
1. When we add a predicate node (a node with two edges emanating from it), the complexity increases by 1,
since each such node adds one more edge than it adds nodes, increasing the value of E – N by one. As a
result, the complexity using the formula E – N + 2 also works out to 2.
From the above reasoning, the reader would hopefully have got an idea about the two different ways
to calculate cyclomatic complexity and the relevance of cyclomatic complexity in identifying
independent paths through a program. These formulae are summarized below. We are not going to
formally prove them; suffice to say that they are extremely useful.
Cyclomatic Complexity = Number of predicate nodes + 1
Cyclomatic Complexity = E – N + 2
The above two formulae provide an easy means to calculate cyclomatic complexity, given a flow
graph. In fact the first formula can be used even without drawing the flow graph, by simply counting
the number of the basic predicates. There are other formulations of cyclomatic complexity derived
from the foundations of Graph Theory, which we have not covered here. The references given at the
end can provide pointers for the interested reader.
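Both formulae are easy to apply once the flow graph has been counted; the small sketch below applies them to the two graphs discussed above, with the edge, node, and predicate counts taken from that discussion.

#include <stdio.h>

/* Cyclomatic complexity from the two equivalent formulae discussed above. */
static int cc_from_graph(int edges, int nodes)     { return edges - nodes + 2; }
static int cc_from_predicates(int predicate_nodes) { return predicate_nodes + 1; }

int main(void) {
    /* Straight-line program: 2 nodes, 1 edge, 0 predicate nodes.           */
    printf("no decisions: E-N+2 = %d, predicates+1 = %d\n",
           cc_from_graph(1, 2), cc_from_predicates(0));

    /* One two-way decision added: 3 nodes, 3 edges, 1 predicate node.      */
    printf("one decision: E-N+2 = %d, predicates+1 = %d\n",
           cc_from_graph(3, 3), cc_from_predicates(1));
    return 0;
}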
Using the flow graph, an independent path can be defined as a path in the flow graph that has at least
one edge that has not been traversed before in other paths. A set of independent paths that cover all
the edges is a basis set. Once the basis set is formed, test cases should be written to execute all the
paths in the basis set.
For small programs cyclomatic complexity can be calculated manually, but automated tools are
essential as several thousands of lines of code are possible in each program in a project. It will be very
difficult to manually create flow graphs for large programs. There are several tools that are available
in the market which can compute cyclomatic complexity. But, we would like to caution that
calculating the complexity of a module after it has been built and tested may be too late—it may not
be possible to redesign a complex module after it has been tested. Thus some basic complexity checks
must be performed on the modules before embarking upon the testing (or even coding) phase. This
can become one of the items to check for in a code review.
Based on the complexity number that emerges from using the tool, one can conclude what actions
need to be taken, as indicated in the table below.
Table
Complexity    What it means
>40           Not testable; any amount of money/effort to maintain may not be enough
White box testing requires a sound knowledge of the program code and the programming language.
This means that the developers should get intimately involved in white box testing. Developers, in
general, do not like to perform testing functions. This applies to structural testing as well as static
testing methods such as
reviews. In addition, because of the timeline pressures, the programmers may not “find time” for
reviews (a euphemism for wanting to do more coding).
Human tendency of a developer being unable to find the defects in his or her code: As we saw
earlier, most of us have blind spots in detecting errors in our own products. Since white box testing
involves programmers who write the code, it is quite possible that they may not be most effective in
detecting defects in their own work products. An independent perspective could certainly help.
Fully tested code may not correspond to realistic scenarios: Programmers generally do not have
a full appreciation of the external (customer) perspective or the domain knowledge to visualize how a
product will be deployed in realistic scenarios. This may mean that even after extensive testing, some
of the common user scenarios may get left out and defects may creep in.
These challenges do not mean that white box testing is ineffective. But when white box testing is
carried out and these challenges are addressed by other means of testing, there is a higher likelihood
of more effective testing. Black box testing, to be discussed in the following chapter addresses some
of these challenges.
UNIT-II
BLACK BOX TESTING
WHAT IS BLACK BOX TESTING?
Black box testing involves looking at the specifications and does not require examining the
code of a program.
Black box testing is done from the customer's viewpoint.
The test engineer engaged in black box testing only knows the set of inputs and expected
outputs and is unaware of how those inputs are transformed into outputs by the software.
Black box testing is done without the knowledge of the internals of the system under test.
Black box tests are convenient to administer because they use the complete finished product
and do not require any knowledge of its construction.
Black box testing handles valid and invalid inputs: It is natural for users to make errors while using
a product. Hence, it is not sufficient for black box testing to simply handle valid inputs. Testing from
the end-user perspective includes testing for these error or invalid conditions. This ensures that the
product behaves as expected in a valid situation and does not hang or crash when provided with an
invalid input. These are called positive and negative test cases.
The tester may or may not know the technology or the internal logic of the product. However,
knowing the technology and the system internals helps in constructing test cases specific to the
error-prone areas.
Test scenarios can be generated as soon as the specifications are ready. Since requirements
specifications are the major inputs for black box testing, test design can be started early in the cycle.
Testers can get involved right from the requirements gathering and analysis phase for the system
under test. Test scenarios and test data are prepared during the test construction phase of the test life
cycle, when the software is in the design phase.
Once the code is ready and delivered for testing, test execution can be done. All the test scenarios
developed during the construction phase are executed. Usually, a subset of these test scenarios is
selected for regression testing.
Black box testing techniques include (among others):
5. State based testing
6. Compatibility testing
8. Domain testing
Requirements testing deals with validating the requirements given in the Software Requirements
Specification (SRS) of the software system.
The precondition for requirements testing is a detailed review of the requirements specification.
Requirements review ensures that they are consistent, correct, complete, and testable. This process
ensures that some implied requirements are converted and documented as explicit requirements,
thereby bringing better clarity to requirements and making requirements based testing more
effective.
Some organizations follow a variant of this method to bring more details into requirements. All
explicit requirements (from the Systems Requirements Specifications) and implied requirements
(inferred by the test team) are collected and documented as “Test Requirements Specification”
(TRS). Requirements based testing can also be conducted based on such a TRS, as it captures the
testers’ perspective as well. However, for simplicity, we will consider SRS and TRS to be one and
the same.
A requirements specification for the lock and key example explained earlier can be documented as
given in the table
Table Sample requirements specification for lock and key system.
Requirements (like the ones given above) are tracked by a Requirements Traceability Matrix
(RTM). An RTM traces all the requirements from their genesis through design, development,
and testing. This matrix evolves through the life cycle of the project. To start with, each
requirement is given a unique id along with a brief description.
The requirement identifier and description can be taken from the Requirements Specification
(above table) or any other available document that lists the requirements to be tested for the
product.
In the above table, the naming convention uses a prefix “BR” followed by a two-digit number.
BR indicates the type of testing—“black box, requirements testing.”
The two-digit numeral gives the serial number of the requirement.
In systems that are more complex, an identifier representing a module and a running serial
number within the module (for example, INV-01, AP-02, and so on) can identify a
requirement.
Each requirement is assigned a requirement priority, classified as high, medium or low. This
not only enables prioritizing the resources for development of features but is also used to
sequence and run tests.
Tests for higher priority requirements will get precedence over tests for lower priority
requirements. This ensures that the functionality that has the highest risk is tested earlier in the
cycle. Defects reported by such testing can then be fixed as early as possible.
Requirements and test cases can have the following kinds of mapping.
One to one—For each requirement there is one test case (for example, BR-01)
One to many—For each requirement there are many test cases (for example, BR-03)
Many to many—Many requirements can be tested by many test cases (these kinds of test
cases are normal with integration and system testing; however, an RTM is not meant for
this purpose)
One to none—A requirement can have no test cases. The test team can take a decision
not to test a requirement due to non-implementation or the requirement being low priority (for
example, BR-08)
1. Regardless of the number of requirements, ideally each of the requirements has to be tested.
When there is a large number of requirements, it would not be possible for someone to
manually keep track of the testing status of each requirement. The RTM provides a tool to
track the testing status of each requirement, without missing any (key) requirements.
2. By prioritizing the requirements, the RTM enables testers to prioritize the test cases execution
to catch defects in the high-priority area as early as possible. It is also used to find out whether
there are adequate test cases for high-priority requirements and to reduce the number of test
cases for low-priority requirements. In addition, if there is a crunch for time for testing, the
prioritization enables selecting the right features to test.
3. Test conditions can be grouped to create test cases or can be represented as unique test cases.
The list of test case(s) that address a particular requirement can be viewed from the RTM.
4. Test conditions/cases can be used as inputs to arrive at a size / effort / schedule estimation of tests.
The Requirements Traceability Matrix provides a wealth of information on various test metrics.
Some of the metrics that can be collected or inferred from this matrix are as follows.
Requirements addressed priority wise—This metric helps in knowing the test coverage
based on the requirements: the number of tests covered for high-priority requirements
versus the tests created for low-priority requirements.
Number of test cases requirement wise—For each requirement, the total number of test cases created.
Total number of test cases prepared—Total of all the test cases prepared for all
requirements. Once the test cases are executed, the test results can be used to collect metrics
such as
Total number of test cases (or requirements) passed—Once test execution is completed,
the total number of passed test cases and the percentage of requirements they correspond to.
Total number of test cases (or requirements) failed—Once test execution is completed,
the total number of failed test cases and the percentage of requirements they correspond to.
Requirements based testing tests the product's compliance to the requirements
specifications.
To illustrate the metrics analysis, let us assume the test execution data as given in table
From the above table, the following observations can be made with respect to the requirements.
83 percent passed test cases correspond to 71 percent of requirements being met (five out of
seven requirements met; one requirement not implemented). Similarly, from the failed test
cases, outstanding defects affect 29 percent (= 100 - 71) of the requirements.
There is a high-priority requirement, BR-03, which has failed. There are three corresponding
defects that need to be looked into, some of them fixed, and test case Lock_04 needs to
be executed again for this requirement to be met. Please note that not all three defects may
need to be fixed, as some of them could be cosmetic or low-impact defects.
A medium-priority requirement, BR-04, has also failed. Test case Lock_06 has to be re-
executed after the defects (five of them) corresponding to this requirement are fixed.
The requirement BR-08 is not met; however, because of its low priority, this can be ignored for
the release even though there is a defect outstanding against it.
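The pass metrics used in the observations above can be computed mechanically from test execution results recorded against the RTM. The sketch below is a rough illustration; the result data and the function name are assumptions and do not reproduce the actual table referred to in the text.

# Illustrative execution results recorded against the RTM:
# requirement id -> list of test case results ("pass" or "fail").
results = {
    "BR-01": ["pass"],
    "BR-02": ["pass", "pass"],
    "BR-03": ["fail"],          # high-priority requirement with outstanding defects
    "BR-04": ["fail"],
    "BR-05": ["pass"],
    "BR-06": ["pass"],
    "BR-07": ["pass"],
}

def requirement_metrics(results):
    """Percent of test cases passed and percent of requirements fully met."""
    total_tests = sum(len(r) for r in results.values())
    passed_tests = sum(r.count("pass") for r in results.values())
    met = [req for req, r in results.items() if r and all(x == "pass" for x in r)]
    return {
        "percent_tests_passed": round(100.0 * passed_tests / total_tests, 1),
        "percent_requirements_met": round(100.0 * len(met) / len(results), 1),
    }

print(requirement_metrics(results))  # 5 of 7 requirements met, roughly 71 percent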
Positive testing tries to prove that a given product does what it is supposed to do. When a test case
verifies the requirements of the product with a set of expected outputs, it is called a positive test case. The
purpose of positive testing is to prove that the product works as per specification and expectations. A
product delivering an error when it is expected to give an error is also a part of positive testing.
Positive testing can thus be said to check the product's behavior for positive and negative conditions
as stated in the requirement.
Let us take the first row in the table below. When the lock is in an unlocked state and you use key
123—156 and turn it clockwise, the expected outcome is that the lock gets locked. During test execution, if
the test results in locking, then the test is passed. This is an example of a “positive test condition” for
positive testing.
In the fifth row of the table, the lock is in locked state. Using a hairpin and turning it clockwise
should not cause a change in state or cause any damage to the lock. On test execution, if there are no
changes, then this positive test case is passed. This is an example of a “negative test condition” for
positive testing.
Positive testing is done to verify the known test conditions and negative testing is done to break the
product with unknowns.
Negative testing is done to show that the product does not fail when an unexpected input is
given. The purpose of negative testing is to try and break the system.
Negative testing covers scenarios for which the product is not designed and coded. In other
words, the input values may not have been represented in the specification of the product.
These test conditions can be termed as unknown conditions for the product as far as the
specifications are concerned. But, at the end-user level, there are multiple scenarios that are
encountered and that need to be taken care of by the product. It becomes even more important
for the tester to know the negative situations that may occur at the end-user level so that the
application can be tested and made foolproof.
A negative test would be a product not delivering an error when it should or delivering an
error when it should not.
The table below gives some of the negative test cases for the lock and key example.
Table Negative test cases.
In the above table, unlike what we have seen in positive testing, there are no requirement numbers.
This is because negative testing focuses on test conditions that lie outside the specification. Since all
the test conditions are outside the specification, they cannot be categorized as positive and negative
test conditions. Some people consider all of them as negative test conditions, which is technically
correct.
The difference between positive testing and negative testing is in their coverage. For positive testing if
all documented requirements and test conditions are covered, then coverage can be considered to be
100 percent. If the specifications are very clear, then coverage can be achieved. In contrast there is no
end to negative testing, and 100 percent coverage in negative testing is impractical. Negative testing
requires a high degree of creativity among the testers to cover as many “unknowns” as possible to
avoid failure at a customer site.
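To make the distinction concrete, the pytest-style sketch below models the lock and key example. The Lock class and its turn() method are hypothetical stand-ins invented for illustration; only the test intent (positive conditions, negative conditions, and out-of-specification inputs) comes from the discussion above.

# Hypothetical lock model used only to illustrate the test categories.
class Lock:
    VALID_KEYS = {"123-156"}

    def __init__(self, state="unlocked"):
        self.state = state

    def turn(self, key, direction):
        # Only a valid key turned clockwise locks an unlocked lock; anything
        # else (wrong key, hairpin, wrong direction) leaves the state unchanged.
        if self.state == "unlocked" and key in self.VALID_KEYS and direction == "clockwise":
            self.state = "locked"
        return self.state

# Positive testing: behaviour stated in the requirements, covering both
# positive and negative conditions.
def test_valid_key_locks_the_lock():          # positive test condition
    assert Lock("unlocked").turn("123-156", "clockwise") == "locked"

def test_hairpin_does_not_change_state():     # negative test condition
    assert Lock("locked").turn("hairpin", "clockwise") == "locked"

# Negative testing: inputs the specification does not talk about at all.
def test_empty_key_is_handled_gracefully():
    assert Lock("unlocked").turn("", "anticlockwise") == "unlocked"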
Boundary Value Analysis
Most of the defects in software products hover around conditions and boundaries. By conditions, we
mean situations wherein, based on the values of various variables, certain actions would have to be
taken. By boundaries, we mean “limits” of values of the various variables.
We will now explore boundary value analysis (BVA), a method useful for arriving at tests that are
effective in catching defects that happen at boundaries. Boundary value analysis is based on the
observation that the density of defects is higher towards the boundaries.
To illustrate the concept of errors that happen at boundaries, let us consider a billing system that
offers volume discounts to customers.
Most of us would be familiar with the concept of volume discounts when we buy goods —buy one
packet of chips for $1.59 but three for $4. It becomes economical for the buyer to buy in bulk. From
the seller's point of view also, it is economical to sell in bulk because the seller incurs less of storage
and inventory costs and has a better cash flow. Let us consider a hypothetical store that sells certain
commodities and offers different pricing for people buying in different quantities—that is, priced in
different “slabs.”
From the above table, it is clear that if we buy 5 units, we pay 5*5 = $25. If we buy 11 units, we pay
5*10 = $50 for the first ten units and $4.75 for the eleventh item. Similarly, if we buy 15 units, we will
pay 10*5 + 5*4.75 =$73.75.
The question from a testing perspective for the above problem is: what test data is most likely to reveal
defects in the program? Generally it has been found that most defects in situations such
as this happen around the boundaries—for example, when buying 9, 10, 11, 19, 20, 21, 29, 30, 31,
and similar numbers of items. While the reasons for this phenomenon are not entirely clear, some
possible reasons are as follows.
Programmers’ tentativeness in using the right comparison operator, for example, whether to use the
<= operator or the < operator when making comparisons.
Confusion caused by the availability of multiple ways to implement loops and condition
checking. For example, in a programming language like C, we have for loops, while loops,
and do-while loops. Each of these has a different way of expressing the terminating condition,
and this could cause some confusion in deciding which operator to use, thus skewing the defects
around the boundary conditions.
The requirements themselves may not be clearly understood, especially around the
boundaries, thus causing even a correctly coded program to behave incorrectly at the boundaries.
In the above case, the tests that should be performed and the expected values of the output variable
(the cost of the units ordered) are given in the table below. This table only includes the positive test cases.
Negative test cases, such as non-numeric input, are not included here. The circled rows are the boundary
values, which are more likely to uncover defects than the rows that are not circled.
Table Boundary values for the volumes discussed in the example.
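Since the table itself is not reproduced here, the sketch below captures the same idea in code: a cost function for the slab pricing implied by the worked figures, and boundary-value test data around the slab limits. The slab limits (10 and 20 units) and the prices $5.00 and $4.75 come from the example; the behaviour beyond 20 units is not specified in the text, so the cut-off in the function is an assumption.

import pytest

def order_cost(units):
    """Slab pricing implied by the example: first 10 units at $5.00 each,
    units 11 to 20 at $4.75 each. Pricing beyond 20 units is not given in
    the text, so this sketch simply refuses such quantities (an assumption)."""
    if units < 0 or units > 20:
        raise ValueError("quantity outside the range covered by the example")
    first_slab = min(units, 10)
    second_slab = max(units - 10, 0)
    return first_slab * 5.00 + second_slab * 4.75

# Boundary value analysis: test just below, at, and just above each slab limit.
@pytest.mark.parametrize("units, expected", [
    (9, 45.00), (10, 50.00), (11, 54.75),   # around the first slab boundary
    (15, 73.75),                            # worked example from the text
    (19, 92.75), (20, 97.50),               # around the second slab boundary
])
def test_order_cost_at_boundaries(units, expected):
    assert order_cost(units) == pytest.approx(expected)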
Boundary value analysis is useful to generate test cases when the input (or output) data is made up of
clearly identifiable boundaries or ranges.
Look for any kind of gradation or discontinuity in data values which affect
computation—the discontinuities are the boundary values, which require thorough
testing.
Look for any internal limits such as limits on resources (as in the example of buffers given
above). The behavior of the product at these limits should also be the subject of boundary
value testing.
Also include in the list of boundary values, documented limits on hardware resources. For
example, if it is documented that a product will run with minimum 4MB of RAM, make sure
you include test cases for the minimum RAM (4MB in this case).
The examples given above discuss boundary conditions for input data—the same analysis
needs to be done for output variables also.
Boundary value analysis, discussed here in the context of black box testing, applies to white box testing
as well. Internal data structures like arrays, stacks, and queues need to be checked for boundary or limit
conditions; when linked lists are used as internal structures, the behavior of the list at the
beginning and the end has to be tested thoroughly.
Boundary values and decision tables help identify the test cases that are most likely to uncover
defects. A generalization of both these concepts is the concept of equivalence classes, discussed later
in this chapter.
Decision Tables
To illustrate the use of conditions (and decision tables) in testing, let us take a simple example of
calculation of standard deduction on taxable income. The example is meant to illustrate the use of
decision tables and not to be construed as tax advice or a realistic tax scenario in any specific country.
A decision table lists the various decision variables, the conditions (or values) assumed by each of the
decision variables, and the actions to take in each combination of conditions.
The variables that contribute to the decision are listed as the columns of the table.
The last column of the table is the action to be taken for the combination of values of the decision
variables.
In cases when the number of decision variables is many (say, more than five or six) and the number of
distinct combinations of variables is few (say, four or five), the decision variables can be listed as
rows (instead of columns) of the table.
The general steps in forming a decision table are as follows.
1. Identify the decision variables.
2. Identify the possible values of each of the decision variables.
3. Enumerate the combinations of the allowed values of each of the variables.
4. Identify the cases when values assumed by a variable (or by sets of variables) are
immaterial for a given combination of other input variables. Represent such variables by
the don't care symbol.
5. For each combination of values of decision variables (appropriately minimized with the
don't care scenarios), list out the action or expected result.
6. Form a table, listing in each but the last column a decision variable. In the last column, list
the action item for the combination of variables in that row (including don't cares, as
appropriate).
A decision table is useful when input and output data can be expressed as Boolean conditions
(TRUE, FALSE, and DON'T CARE).
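Because a decision table is just data, it can be written down and executed directly. The sketch below is generic and does not attempt to reproduce the standard deduction example; the decision variables, the don't care marker, and the actions are illustrative assumptions.

DONT_CARE = None  # matches any value of that decision variable

# Each row: (value of variable A, value of variable B, action to take).
# The variables and actions here are hypothetical; only the structure matters.
DECISION_TABLE = [
    (True,  True,      "action-1"),
    (True,  False,     "action-2"),
    (False, DONT_CARE, "action-3"),  # B is immaterial when A is False
]

def decide(a, b):
    """Return the action of the first row whose conditions match (a, b)."""
    for cond_a, cond_b, action in DECISION_TABLE:
        if (cond_a is DONT_CARE or cond_a == a) and (cond_b is DONT_CARE or cond_b == b):
            return action
    raise ValueError("no matching row: the decision table is incomplete for this input")

# Each row of the table yields at least one test case.
assert decide(True, True) == "action-1"
assert decide(False, True) == "action-3"   # exercises the don't care row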
Equivalence Partitioning
Equivalence partitioning is a software testing technique that involves identifying a small set of
representative input values that produce as many different output conditions as possible. This reduces
the number of permutations and combinations of input, output values used for testing, thereby
increasing the coverage and reducing the effort involved in testing.
The set of input values that generate one single expected output is called a partition.
When the behavior of the software is the same for a set of values, then the set is termed an
equivalence class or a partition.
In this case, one representative sample from each partition (also called a member of the
equivalence class) is picked up for testing.
One sample from the partition is enough for testing, as the result of picking up some more values
from the set will be the same and will not yield any additional defects. Since all the values
produce the same output, they are termed an equivalence partition.
Testing by this technique involves (a) identifying all partitions for the complete set of input and output
values for a product and (b) picking one member value from each partition for testing, to maximize
coverage.
From the results obtained for a member of an equivalence class or partition, this technique
extrapolates the expected results for all the values in that partition. The advantage of using this
technique is that we gain good coverage with a small number of test cases. For example, if there is a
defect in one value in a partition, then it can be extrapolated to all the values of that particular
partition. By using this technique, redundancy of tests is minimized by not repeating the same tests for
multiple values in the same partition.
For example, consider the set of real numbers. One way to divide this set is by
1. Prime numbers
2. Composite numbers
3. Numbers with decimal points
These three classes divide the set of numbers into three valid classes. In addition, to account for any
input a user may give, we will have to add an invalid class—strings with alphanumeric characters.
As in the previous case, we can construct an equivalence partitions table for this example as shown
in the table below.
Identify the valid equivalence classes based on the above criteria (number of ranges, allowed
values, and so on).
If the expected result is not clear for any particular test case, mark appropriately and escalate
for corrective actions. If you cannot answer a question, or find an inappropriate answer,
consider whether you want to record this issue on your log and clarify with the team that
arbitrates/dictates the requirements.
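The sketch below shows how one representative member per partition can drive the tests. The classify_number function is a hypothetical program under test invented for illustration; the partitions follow the example above (prime, composite, numbers with decimal points, and an invalid alphanumeric string).

import math

def classify_number(text):
    """Hypothetical program under test: classify an input string."""
    try:
        value = float(text)
    except ValueError:
        return "invalid"
    if value != int(value):
        return "decimal"
    n = int(value)
    if n > 1 and all(n % d for d in range(2, math.isqrt(n) + 1)):
        return "prime"
    return "composite"

# One representative member per equivalence partition is enough; the result is
# extrapolated to the rest of the partition.
REPRESENTATIVES = {
    "prime":     "7",
    "composite": "8",
    "decimal":   "7.25",
    "invalid":   "12ab",   # invalid class: alphanumeric string
}

for expected, member in REPRESENTATIVES.items():
    assert classify_number(member) == expected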
State or graph based testing is useful in situations where
1. The product under test is a language processor (for example, a compiler), wherein the
syntax of the language automatically lends itself to a state machine or a context free
grammar represented by a railroad diagram.
2. Workflow modeling where, depending on the current state and appropriate combinations of
input variables, specific workflows are carried out, resulting in new output and new state.
3. Dataflow modeling, where the system is modeled as a set of dataflow, leading from one
state to another.
In the above, (2) and (3) are somewhat similar. We will give one example for (1) and one
example for (2). Consider an application that is required to validate a number according to the
following simple rules.
1. A number can start with an optional sign (+ or -).
2. The optional sign is followed by one or more digits.
3. The digits can optionally be followed by a decimal point.
4. If there is a decimal point, then there should be two digits after the decimal.
5. The number ends with a blank.
The above rules can be represented in a state transition diagram as shown below.
Current state    Input            Next state
1                +                2
1                -                2
2                Digit            2
2                Blank            6
2                Decimal point    3
3                Digit            4
4                Digit            5
5                Blank            6
The above state transition table can be used to derive test cases to test valid and invalid numbers.
Valid test cases can be generated by:
1. Start from the start state (State 1 in this example).
2. Choose a path that leads to the next state (for example, +/-/digit to go from State 1 to State 2).
3. If you encounter an invalid input in a given state (for example, encountering an alphabetic
character in State 2), generate an error condition test case.
4. Repeat the process till you reach the final state (State 6 in this example).
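The transition table above can be coded directly and then used both to validate numbers and to derive valid and error-condition test cases. In the sketch below, State 6 is treated as the accepting state, the digit transition out of State 1 follows the +/-/digit path mentioned in step 2, and the helper names are our own.

# State transition table from the text: (current state, input class) -> next state.
# State 1 is the start state and State 6 the accepting (final) state.
TRANSITIONS = {
    (1, "sign"): 2, (1, "digit"): 2,   # the digit path follows step 2 above (+/-/digit)
    (2, "digit"): 2, (2, "blank"): 6, (2, "decimal"): 3,
    (3, "digit"): 4,
    (4, "digit"): 5,
    (5, "blank"): 6,
}

def classify(ch):
    if ch in "+-":
        return "sign"
    if ch.isdigit():
        return "digit"
    if ch == ".":
        return "decimal"
    if ch == " ":
        return "blank"
    return "other"

def is_valid_number(text):
    """Drive the input through the state machine; an input with no transition
    from the current state is an error condition and rejects the string."""
    state = 1
    for ch in text:
        state = TRANSITIONS.get((state, classify(ch)))
        if state is None:
            return False
    return state == 6

# Valid test cases follow a path from State 1 to State 6; invalid ones hit a
# missing transition, as in step 3 of the procedure above.
assert is_valid_number("+12.34 ")
assert is_valid_number("987 ")
assert not is_valid_number("+1.2 ")    # only one digit after the decimal point
assert not is_valid_number("12a4 ")    # alphabetic character in State 2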
Compatibility Testing
Testing done to ensure that the product features work consistently with different infrastructure
components is called compatibility testing.
The parameters that generally affect the compatibility of the product are
Processor (CPU) (Pentium III, Pentium IV, Xeon, SPARC, and so on) and the number of
processors in the machine
Architecture and characteristics of the machine (32 bit, 64 bit, and so on)
Equipment that the product is expected to work with (printers, modems, routers, and so on)
Operating system (Windows, Linux, and so on and their variants) and operating system
services (DNS, NIS, FTP, and so on)
Middle-tier infrastructure components such as web server, application server, network server
Any software used to generate product binaries (compiler, linker, and so on and their
appropriate versions)
Various technological components used to generate product components (SDK, JDK, and so on,
and their appropriate versions)
Compatibility matrix for a mail application.
The table below is only an example; it does not cover all parameters and their combinations.
Some of the common techniques used for performing compatibility testing, using a
compatibility table, are as follows.
1. Horizontal combination All values of parameters that can coexist with the product for
executing the set of test cases are grouped together as a row in the compatibility matrix.
2. Intelligent sampling In the horizontal combination method, each feature of the product has to be
tested with each row in the compatibility matrix. This involves huge effort and time. To solve this
problem, combinations of infrastructure parameters are sampled intelligently and combined with the
features to be tested, so that each feature is tested against only a representative subset of rows (as
sketched below).
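A compatibility matrix can be kept as structured data and used to drive execution row by row, which is the horizontal combination technique. The rows, parameter names, and values below are illustrative assumptions, not the actual matrix for the mail application referred to above.

# Each row of the compatibility matrix is one combination of infrastructure
# parameters that can coexist; the values here are only examples.
COMPATIBILITY_MATRIX = [
    {"os": "Windows", "database": "Oracle",     "mail_server": "Exchange"},
    {"os": "Linux",   "database": "PostgreSQL", "mail_server": "Postfix"},
]

def run_mail_application_tests(environment):
    """Placeholder for executing the product's test cases in one environment."""
    print(f"running mail application test cases on {environment}")

# Horizontal combination: the full set of test cases is executed once per row.
for row in COMPATIBILITY_MATRIX:
    run_mail_application_tests(row)

# Intelligent sampling would instead run each feature against only a
# representative subset of rows, trading some coverage for effort.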
The compatibility testing of a product involving parts of itself can be further classified into two types.
1. Backward compatibility testing There are many versions of the same product that are
available with the customers. It is important for the customers that the objects, object
properties, schema, rules, reports, and so on, that are created with an older version of the
product continue to work with the current version of the same product.
2. Forward compatibility testing There are some provisions for the product to work with later
versions of the product and other infrastructure components, keeping future requirements in
mind.
User Documentation Testing
User documentation covers all the manuals, user guides, installation guides, setup guides, read me
file, software release notes, and online help that are provided along with the software to help the
end user to understand the software system.
In addition, user documentation testing also checks the language aspects of the document, like
spelling and grammar.
User documentation testing is done to ensure that the documentation matches the product and vice versa.
Some of the benefits that ensue from user documentation testing are:
1. User documentation testing aids in highlighting problems overlooked during reviews.
2. High quality user documentation ensures consistency of documentation and product, thus
minimizing possible defects reported by customers. It also reduces the time taken for each
support call— sometimes the best way to handle a call is to alert the customer to the relevant
section of the manual. Thus the overall support cost is minimized.
3. Results in less difficult support calls. When a customer faithfully follows the instructions
given in a document but is unable to achieve the desired (or promised) results, it is frustrating
and often this frustration shows up on the support staff. Ensuring that a product is tested to
work as per the document and that it works correctly contributes to better customer
satisfaction and better morale of support staff.
4. New programmers and testers who join a project group can use the documentation to
learn the external functionality of the product.
5. Customers need less training and can proceed more quickly to advanced training and product
usage if the documentation is of high quality and is consistent with the product. Thus high-
quality user documentation can result in a reduction of overall training costs for user
organizations.
Defects found in user documentation need to be tracked to closure like any regular software defect.
In order to enable an author to close a documentation defect, information about the defect/comment
description, paragraph/page number reference, document version number reference, name of reviewer, name of
author, reviewer's contact number, priority, and severity of the comment needs to be passed to the
author.
Domain Testing
Domain testing can be considered as the next level of testing, in which we do not even look at the
specifications of a software product but test the product purely based on domain knowledge
and expertise in the domain of application. This testing approach requires critical understanding of the
day-to-day business activities for which the software is written. This type of testing requires business
domain knowledge rather than knowledge of what the software specification contains or how the
software is written. Thus domain testing can be considered as an extension of black box testing.
The test engineers performing this type of testing are selected because they have in-depth knowledge
of the business domain. Since the depth in business domain is a prerequisite for this type of testing,
sometimes it is easier to hire testers from the domain area (such as banking, insurance, and so on) and
train them in software, rather than take software professionals and train them in the business domain.
This reduces the effort and time required for training the testers in domain testing and also increases
the effectiveness of domain testing.
Let us further understand this testing using an example of cash withdrawal functionality in an ATM,
extending the earlier example on banking software. The user performs the following actions.
INTEGRATION TESTING
Integration testing as a type focuses on testing interfaces that are “implicit and explicit” and
“internal and external.”
The solid lines represent explicit interfaces and the dotted lines represent implicit interfaces, based
on the understanding of architecture, design, or usage.
There are several methodologies available to decide the order for integration testing. These are as follows.
1. Top-down integration
2. Bottom-up integration
3. Bi-directional integration
4. System integration
Top-Down Integration
Step    Interfaces tested
1       1-2
2       1-3
3       1-4
4       1-2-5
5       1-3-6
6       1-3-6-(3-7)
7       (1-2-5)-(1-3-6-(3-7))
8       1-4-8
9       (1-2-5)-(1-3-6-(3-7))-(1-4-8)
In an incremental product development, where one or two components get added to the product in
each increment, the integration testing methodology pertains only to those new interfaces that are
added and those related interfaces that are impacted by the changes or increment in the product.
Bottom-up Integration
Bottom-up integration is just the opposite of top-down integration, where the components for a
new product development become available in reverse order, starting from the bottom.
Arrows pointing down depict logic flow; arrows pointing up indicate integration paths.
Step    Interfaces tested
1       1-5
2       2-6, 3-6
3       2-6-(3-6)
4       4-7
5       1-5-8
6       2-6-(3-6)-8
7       4-7-8
8       (1-5-8)-(2-6-(3-6)-8)-(4-7-8)
Bi-Directional Integration
Bi-directional integration.
Step    Interfaces tested
1       6-2
2       7-3-4
3       8-5
4       (1-6-2)-(1-7-3-4)-(1-8-5)
An area where this approach comes in handy is when migrating from a two-tier to a three-tier
environment. In the product development phase when a transition happens from two-tier
architecture to three-tier architecture, the middle tier (components 6–8) gets created as a set of
new components from the code taken from bottom-level applications and top-level services.
System Integration
System integration means that all the components of the system are integrated and tested as
a single unit. Integration testing, which is testing of interfaces, can be divided into two
types:
components or sub-system integration, and
final integration testing or system integration.
System integration using the big bang approach is well suited in a product development
scenario where the majority of components are already available and stable and very few
components get added or modified. In this case, instead of testing component interfaces one by
one, it makes sense to integrate all the components at one go and test once, saving effort and
time for the multi-step component integrations.
While this approach saves time and effort, it is also not without disadvantages. Some of
the important disadvantages that can have a bearing on the release dates and quality of a
product are as follows.
1. When a failure or defect is encountered during such integration, it is very difficult to
locate the problem and find out in which interface the defect lies.
2. The ownership for correcting the root cause of the defect may be a difficult issue to pinpoint.
3. When integration testing happens in the end, the pressure from the approaching
release date is very high. This pressure on the engineers may cause them to
compromise on the quality of the product.
The integration testing phase focuses on finding defects which predominantly arise because
of combining various components for testing; it should not focus on defects that belong to
a single component or to just a few components in isolation.
Integration testing as a type focuses on testing the interfaces. This is a subset of the
integration testing phase.
When sub-systems or system components are put together (or integrated), defects arise
not only because of interfaces, but also for various other reasons such as usage,
incomplete understanding of the product domain, user errors, and so on.
Hence the integration testing phase needs to focus on interfaces as well as usage flow.
It is very important to note this point to avoid confusion between integration testing
type and integration testing phase.
All testing activities that are conducted from the point where two components are
integrated to the point where all system components work together, are considered a
part of the integration testing phase.
Integration testing as a phase involves different activities and different types of testing
have to be done in that phase. This is a testing phase that should ensure completeness
and coverage of testing for functionality. To achieve this, the focus should not only be
on planned test case execution but also on unplanned testing, which is termed as “ad hoc
testing.”
The integration testing phase involves developing and executing test cases that cover
multiple components and functionality. When the functionalities of different components are
combined and tested together for a sequence of related operations, they are called scenarios.
Scenario testing is a planned activity to explore different usage patterns and combine them
into test cases called scenario test cases. We will see scenario testing in more detail in the
next section.
SCENARIO TESTING
Scenario testing is defined as a “set of realistic user activities that are used for evaluating
the product.” It is also defined as the testing involving customer scenarios.
1. System scenarios
2. Use case scenarios
System Scenarios
System scenario is a method whereby the set of activities used for scenario testing covers
several components in the system. The following approaches can be used to develop system
scenarios.
Story line Develop a story line that combines various activities of the product that may be
executed by an end user. A user enters his or her office, logs into the system, checks mail,
responds to some mails, compiles some programs, performs unit testing, and so on. All these
typical activities, carried out in the course of normal work, when strung together become a
scenario (a sketch of such a scenario follows this list).
Business verticals Visualize how a product/software will be applied to different verticals and
create a set of activities as scenarios to address specific vertical businesses.
Battle ground Create some scenarios to justify that “the product works” and some scenarios to
“try and break the system” to justify “the product doesn't work.” This adds flavor to the
scenarios mentioned above.
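A story-line scenario can be turned into a single scenario test case that strings the activities together in order. In the sketch below, the user-facing operations (login, check_mail, and so on) are hypothetical stand-ins for the product's real operations.

# Hypothetical product operations; in a real scenario test these would call
# into the product under test.
def login(user):
    return {"user": user, "session": "open"}

def check_mail(session):
    return ["mail-1", "mail-2"]

def reply_to(session, mail):
    return "sent"

def compile_program(session, name):
    return "ok"

def test_morning_at_the_office_scenario():
    """Story line: log in, check mail, respond to a mail, compile a program."""
    session = login("alice")
    inbox = check_mail(session)
    assert inbox, "the scenario expects at least one mail to respond to"
    assert reply_to(session, inbox[0]) == "sent"
    assert compile_program(session, "payroll.c") == "ok"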
Coverage is always a big question with respect to functionality in scenario testing. This testing
is not meant to cover different permutations and combinations of features and usage in a
product.
Coverage of activities by scenario testing.
A use case scenario is a stepwise procedure on how a user intends to use a system, with
different user roles and associated parameters. A use case scenario can include stories,
pictures, and deployment details. Use cases are useful for explaining customer problems and
how the software can solve those problems without any ambiguity.
Use case scenarios term the users with different roles as actors. What the product should do for
a particular activity is termed system behavior. Users with a specific role who mediate between
the actors and the system are called agents.
This way of describing different roles in test cases helps in testing the product without getting
into the details of the product.
DEFECT BASH
Defect bash is an ad hoc testing where people performing different roles in an organization test
the product together at the same time. This is very popular among application development
companies, where the product can be used by people who perform different roles. The testing
by all the participants during defect bash is not based on written test cases. What is to be tested
is left to an individual's decision and creativity. They can also try some operations which are
beyond the product specifications. Defect bash brings together several good practices that are
popular in the testing industry. They are as follows.
2. Bringing different people performing different roles together in the organization for
testing—“Testing isn't for testers alone”
3. Letting everyone in the organization use the product before delivery—“Eat your own dog food”
4. Bringing fresh pairs of eyes to uncover new defects—“Fresh eyes have less bias”
5. Bringing in people who have different levels of product understanding to test the
product together randomly—“Users of software are not the same”
6. Not letting testing wait for documentation to be completed—“Does
testing wait till all documentation is done?”
7. Enabling people to say “system works” as well as enabling them to “break the system”
— “Testing isn't to conclude the system works or doesn't work”
Even though defect bash is said to be ad hoc testing, not all activities of a defect bash are
unplanned. All the activities in the defect bash are planned activities, except for what is to be
tested. It involves several steps.
Step 1 Choosing the frequency and duration of defect bash
Defect bash is an activity involving a large amount of effort (since it involves a large number
of people) and considerable planning (as is evident from the steps involved).
Too frequent defect bashes will incur a low return on investment, and too few defect bashes may not
meet the objective of finding all defects. Duration is also an important factor. Even a small
reduction in duration is a big saving, as a large number of people are involved. On the other hand, if
the duration is too small, the amount of testing that is done may not meet the objective.
Since the defect bash involves a large number of people, effort, and planning, a good quality
build is needed for the defect bash.
Even though defect bash is an ad hoc activity, its purpose and objective have to be very clear.
Since defect bash involves people performing different roles, the contribution they make has to
be focused towards meeting the purpose and objective of the defect bash.
Since defect bashes are planned, short-term, and resource-intensive activities, it makes sense to
set up and monitor a laboratory for this purpose.
There are two types of defects that will emerge during a defect bash. The defects that are in the
product, as reported by the users, can be classified as functional defects.
Defects that are unearthed while monitoring the system resources, such as memory leak, long
turnaround time, missed requests, high impact and utilization of system resources, and so on are
called non-functional defects.
Defect bash is a unique testing method which can bring out both functional and non-functional
defects. However, if the lab is not set up properly or not monitored properly, there is a chance
that some of the non-functional defects may not get noticed at all.
Optimizing the Effort Involved in Defect Bash
Since a defect bash involves a large number of people, spending much effort is normal for
conducting defect bashes. There are several ways to optimize the effort involved in a defect
bash if a record of objectives and outcome is kept.Hence, a defect bash can be further
classified into
To explain the effort saved by the defect bash classification, let us take three product defect
bashes conducted in two hours with 100 people. The total effort involved is 3*2*100=600
person hours.
If the feature/component test team and the integration test team, which have 10 people each, can
participate in doing two rounds of micro level bashes each, and these can find one third of the defects
that are expected, then the effort saving is 20 percent, as per the following calculation.
Total effort involved in two rounds of product bashes (2*2*100)—400 person hours
Effort involved in two rounds of feature bash (2*2*10)—40 person hours
Effort involved in two rounds of integration bash (2*2*10)—40 person hours
Total effort with this split—400 + 40 + 40 = 480 person hours, a saving of 600 - 480 = 120
person hours, or 20 percent
This is only an approximate calculation, as the effort involved in the steps mentioned earlier
in this section (Steps 1–6) also needs to be included for each defect bash. Those steps have to
be repeated for each defect bash, irrespective of whether it is at the feature level, integration
level, or product level.