Lecture Notes on
SOFTWARE TESTING (17IS63)
Prepared by
Prof. Ramya S & Prof. SmithaShree K P
Maharaja Institute of Technology Mysore
Department of Information Science and Engineering
Course Overview
SUBJECT: SOFTWARE TESTING
SUBJECT CODE: 17IS63
This course provides an introduction to the process of validating and verifying software during the
software life cycle. Software testing also helps to identify errors, gaps, or missing requirements
contrary to the actual requirements. It can be done either manually or using automated tools.
Software testing techniques will be taught.
In this course, students will learn to verify and validate software by using different software
testing techniques. Students will be able to analyze a problem and apply the proper testing
techniques.
At the end of the course, students will be able to apply the different testing techniques to create
real-time applications, plan and monitor the process, and document analysis and test, design
specifications, and test and analysis reports.
Course Objectives
Course Outcomes
CO's DESCRIPTION OF THE OUTCOMES
17IS63.1 Apply the basic concepts of testing to derive different test cases.
17IS63.2 Apply the different testing techniques to validate the software product.
17IS63.3 Analyze the testing techniques and tools to classify problems into a suitable testing model.
17IS63.4 Analyze the different ways of creating documentation for software artifacts.
Syllabus
SUBJECT: SOFTWARE TESTING
SUBJECT CODE: 17IS63
Topics Covered as per Syllabus (Teaching Hours)
MODULE-1 (10 Hours):
Basics of Software Testing: Basic definitions, Software Quality, Requirements, Behaviour and
Correctness, Correctness versus Reliability, Testing and Debugging, Test cases, Insights from a Venn
diagram, Identifying test cases, Test-generation Strategies, Test Metrics, Error and fault taxonomies,
Levels of testing, Testing and Verification, Static Testing. Problem Statements: Generalized pseudocode,
the triangle problem, the NextDate function, the commission problem, the SATM (Simple Automatic
Teller Machine) problem, the currency converter, Saturn windshield wiper.
MODULE-2 (10 Hours):
Functional Testing: Boundary value analysis, Robustness testing, Worst-case testing, Robust worst-case
testing for the triangle problem, NextDate problem and commission problem, Equivalence classes,
Equivalence test cases for the triangle problem, NextDate function, and the commission problem,
Guidelines and observations, Decision tables, Test cases for the triangle problem, NextDate function, and
the commission problem, Guidelines and observations. Fault Based Testing: Overview, Assumptions in
fault based testing, Mutation analysis, Fault-based adequacy criteria, Variations on mutation analysis.
MODULE-3 (10 Hours):
Structural Testing: Overview, Statement testing, Branch testing, Condition testing, Path testing: DD
paths, Test coverage metrics, Basis path testing, guidelines and observations, Data-Flow testing:
Definition-Use testing, Slice-based testing, Guidelines and observations. Test Execution: Overview of test
execution, from test case specification to test cases, Scaffolding, Generic versus specific scaffolding,
Test oracles, Self-checks as oracles, Capture and replay.
MODULE-4 (10 Hours):
Process Framework: Basic principles: Sensitivity, redundancy, restriction, partition, visibility, feedback,
the quality process, Planning and monitoring, Quality goals, Dependability properties, Analysis, Testing,
Improving the process, Organizational factors. Planning and Monitoring the Process: Quality and
process, Test and analysis strategies and plans, Risk planning, Monitoring the process, Improving the
process, The quality team. Documenting Analysis and Test: Organizing documents, Test strategy
document, Analysis and test plan, Test design specifications documents, Test and analysis reports.
MODULE-5 (10 Hours):
Integration and Component-Based Software Testing: Overview, Integration testing strategies, Testing
components and assemblies. System, Acceptance and Regression Testing: Overview, System testing,
Acceptance testing, Usability, Regression testing, Regression test selection techniques, Test case
prioritization and selective execution. Levels of Testing, Integration Testing: Traditional view of testing
levels, Alternative life-cycle models, The SATM system, Separating integration and system testing, A
closer look at the SATM system, Decomposition-based, call graph-based, Path-based integrations.
List of Text Books
1. Paul C. Jorgensen: Software Testing, A Craftsman's Approach, 3rd Edition, Auerbach Publications, 2008.
(Listed topics only from Chapters 1, 2, 5, 6, 7, 9, 10, 12, 13)
2. Mauro Pezze, Michal Young: Software Testing and Analysis – Process, Principles and Techniques, Wiley
India, 2009. (Listed topics only from Chapters 3, 4, 16, 17, 20, 21, 22, 24)
3. Aditya P Mathur: Foundations of Software Testing, Pearson Education, 2008. (Listed topics only
from Sections 1.2, 1.3, 1.4, 1.5, 1.8, 1.12, 6.2.1, 6.2.4)
List of Reference Books
1. Software Testing Principles and Practices – Gopalaswamy Ramesh, Srinivasan Desikan, 2nd Edition,
Pearson, 2007.
Software Testing
Module 1
Basic Definitions
Error—People make errors. A good synonym is mistake. When people make mistakes while
coding, we call these mistakes bugs.
Fault—A fault is the result of an error. It is more precise to say that a fault is the representation of
an error; another name for a fault is a defect. A fault of commission occurs when we enter something
into a representation that is incorrect. A fault of omission occurs when we fail to enter information
that should be present in the representation. Of these two types, faults of omission are more
difficult to detect and resolve.
Failure—A failure occurs when the code corresponding to a fault executes. Two subtleties arise
here: one is that failures only occur in an executable representation, which is usually taken to be
source code, or more precisely, loaded object code; the second subtlety is that this definition relates
failures only to faults of commission.
Incident: An incident is the symptom associated with a failure that alerts the user to the
occurrence of a failure.
Test—Testing is obviously concerned with errors, faults, failures, and incidents. A test is the act of
exercising software with test cases. A test has two distinct goals: to find failures or to demonstrate
correct execution.
Test case—A test case has an identity and is associated with a program behavior. It also has a set
of inputs and expected outputs.
During the development phases, three opportunities arise for errors to be made, resulting in faults that may
propagate through the remainder of the development process. The fault resolution step is another
opportunity for errors (and new faults). When a fix causes formerly correct software to misbehave,
the fix is deficient. We will revisit this when we discuss regression testing. From this sequence of
terms, we see that test cases occupy a central position in testing. The process of testing can be
subdivided into separate steps: test planning, test case development, running test cases, and
evaluating test results.
Test case
The essence of software testing is to determine a set of test cases for the item to be tested. A
complete test case will contain a test case identifier, a brief statement of purpose (e.g., a business
rule), a description of preconditions, the actual test case inputs, the expected outputs, a description
of expected postconditions, and an execution history. The execution history is primarily for test
management use—it may contain the date when the test was run, the person who ran it, the version
on which it was run, and the pass/fail result.
Test case execution entails establishing the necessary preconditions, providing the test case inputs,
observing the outputs, comparing these with the expected outputs, and then ensuring that the
expected postconditions exist to determine whether the test passed. From all of this, it becomes
clear that test cases are valuable—at least as valuable as source code. Test cases need to be
developed, reviewed, used, managed, and saved.
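As an illustration (not from the text), the ingredients of a complete test case listed above can be captured in a small record type; the field and method names below are assumptions chosen to mirror that list.

from dataclasses import dataclass, field
from typing import List

@dataclass
class TestCase:
    # Identity and purpose of the test case
    test_id: str
    purpose: str                        # e.g., the business rule being checked
    preconditions: List[str]
    inputs: dict
    expected_outputs: dict
    expected_postconditions: List[str]
    # Execution history, primarily for test management use
    execution_history: List[dict] = field(default_factory=list)

    def record_run(self, date, tester, version, passed):
        # Append one execution record (date, person, version, pass/fail)
        self.execution_history.append(
            {"date": date, "tester": tester, "version": version, "passed": passed})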
Testing is fundamentally concerned with behavior, and behavior is orthogonal to the code-based
view common to software (and system) developers. In this section, we develop a simple Venn
diagram that clarifies several questions about testing.
Consider a universe of program behaviors. Given a program and its specification, consider the set S
of specified behaviors and the set P of programmed behaviors. Figure 1.2 shows the relationship
between the specified and programmed behaviors. Of all the possible program behaviors, the
specified ones are in the circle labeled S and all those behaviors actually programmed are in P.
What if certain specified behaviors have not been programmed? In our earlier terminology, these
are faults of omission. Similarly, what if certain programmed (implemented) behaviors have not
been specified? These correspond to faults of commission and to errors that occurred after the
specification was complete.
The intersection of S and P (the football-shaped region) is the “correct” portion, that is,
behaviors that are both specified and implemented.
The new circle in Figure 1.3 is for test cases. Now, consider the relationships among sets S, P, and
T.
There may be specified behaviors that are not tested (regions 2 and 5), specified behaviors
that are tested (regions 1 and 4), and test cases that correspond to unspecified behaviors
(regions 3 and 7).
There may be programmed behaviors that are not tested (regions 2 and 6), programmed
behaviors that are tested (regions 1 and 3), and test cases that correspond to behaviors that
were not implemented (regions 4 and 7).
Two fundamental approaches are used to identify test cases; traditionally, these have been
called functional and structural testing. Specification-based and code-based are more descriptive
names, and they will be used here.
Specification-Based Testing
The reason that specification-based testing was originally called “functional testing” is that any
program can be considered to be a function that maps values from its input domain to values in its
output range. It is also called black-box testing: the content (implementation) of the black
box is not known, and the function of the black box is understood completely in terms of its inputs
and outputs.
Advantages:
(1) They are independent of how the software is implemented, so if the implementation
changes, the test cases are still useful;
(2) Test case development can occur in parallel with the implementation, thereby reducing the
overall project development interval.
Disadvantages:
(1) Significant redundancies may exist among test cases.
(2) This can be compounded by gaps of untested functionality.
Figure 1.5 shows the results of test cases identified by two specification-based methods. Method
A identifies a larger set of test cases than does method B. Notice that, for both methods, the set of
test cases is completely contained within the set of specified behavior. Because specification-based
methods are based on the specified behavior, it is hard to imagine these methods identifying
behaviors that are not specified.
Code-Based Testing
Code-based testing is the other fundamental approach to test case identification. It is also called
white-box testing; its main focus is the implementation, and the ability to "see inside" the black box
allows the tester to identify test cases on the basis of how the function is actually implemented.
Figure 1.6 shows the results of test cases identified by two code-based methods. Method A
identifies a larger set of test cases than does method B. Notice that, for both methods, the set of test
cases is completely contained within the set of programmed behavior. Because code-based methods
are based on the program, it is hard to imagine these methods identifying behaviors that are not
programmed.
Fault Taxonomies
Our definitions of error and fault hinge on the distinction between process and product: process
refers to how we do something, and product is the end result of a process. The point at which
testing and Software Quality Assurance (SQA) meet is that SQA typically tries to improve the
product by improving the process. Some of the more useful anomalies are given in Tables 1.1
through 1.5; most of these are taken from the IEEE standard.
Levels of Testing
A diagrammatic variation of the waterfall model, known as the V-Model in ISTQB parlance, is
given in Figure 1.8; this variation emphasizes the correspondence between testing and design
levels.
In specification-based testing, the three levels of definition (specification, preliminary design,
and detailed design) correspond directly to three levels of testing—system, integration, and
unit testing.
A practical relationship exists between levels of testing and specification-based versus
code-based testing. Most practitioners agree that code-based testing is most appropriate at the
unit level, whereas specification-based testing is most appropriate at the system level.
The constructs defined for code-based testing make the most sense at the unit level, and
similar constructs are only now becoming available for the integration and system levels of
testing.
Examples
Generalized Pseudocode
Pseudocode provides a language-neutral way to express program source code. This version is
loosely based on Visual Basic and has constructs at two levels: unit and program components.
Units can be interpreted either as traditional components (procedures and functions) or as object-
oriented components (classes and objects). We illustrate this by allowing natural language phrases
in place of more formal, complex conditions ( Table 2.1).
The Triangle Problem
Simple version:
The triangle program accepts three integers, a, b, and c, as input. These are taken to be sides of a
triangle. The output of the program is the type of triangle determined by the three sides: Equilateral,
Isosceles, Scalene, or Not A Triangle.
Improved version:
The triangle program accepts three integers, a, b, and c, as input. These are taken to be sides of a
triangle. The integers a, b, and c must satisfy the following conditions:
c1. 1 ≤ a ≤ 200
c2. 1 ≤ b ≤ 200
c3. 1 ≤ c ≤ 200
c4. a < b + c
c5. b < a + c
c6. c < a + b
Traditional Implementation
The traditional implementation of this grandfather of all examples has a rather FORTRAN-like
style. The flowchart for this implementation appears in Figure 2.1. Figure 2.2 is a flowchart for the
improved version. The variable “match” is used to record equality among pairs of the sides. A
classic intricacy of the FORTRAN style is connected with the variable “match”: notice that all
three tests for the triangle inequality do not occur. If two sides are equal, say a and c, it is only
necessary to compare a + c with b.
Structured Implementations
Simple version
Improved Version
Program triangle3
Dim a, b, c As Integer
Dim c1, c2, c3, IsATriangle As Boolean
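Because only a fragment of the pseudocode appears above, here is a minimal sketch of the improved version in Python; the function name and the range check against 1..200 are assumptions based on conditions c1 through c6.

def triangle_type(a: int, b: int, c: int) -> str:
    # Conditions c1-c3: each side must be in the range 1..200
    if not all(1 <= side <= 200 for side in (a, b, c)):
        return "Value out of range"
    # Conditions c4-c6: the triangle inequality must hold for every pair of sides
    if not (a < b + c and b < a + c and c < a + b):
        return "Not a Triangle"
    # Classify by the number of equal pairs of sides
    if a == b == c:
        return "Equilateral"
    if a == b or a == c or b == c:
        return "Isosceles"
    return "Scalene"

# Example: triangle_type(5, 5, 5) returns "Equilateral"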
The NextDate Function
NextDate is a function of three variables: month, day, and year. It returns the date of the day after
the input date. The month, day, and year variables have integer values subject to these conditions
(the year range ending in 2012 is arbitrary, and is from the first edition):
c1. 1 ≤ month ≤ 12
c2. 1 ≤ day ≤ 31
c3. 1812 ≤ year ≤ 2012
If any of conditions c1, c2, or c3 fails, NextDate produces an output indicating the corresponding
variable has an out-of-range value—for example, “Value of month not in the range 1...12.” Because
numerous invalid day–month–year combinations exist, NextDate collapses these into one message:
“Invalid Input Date.”
Simple version
Two sources of complexity exist in the NextDate function: the complexity of the input domain
discussed previously, and the rule that determines when a year is a leap year. A year is 365.2422
days long; therefore, leap years are used to absorb the "extra day" problem. If we declared a leap year
every fourth year, a slight error would accumulate. Another complexity is handling the end of
December, when both the month and the year must change.
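A minimal sketch of the leap-year rule and the day-increment logic in Python; the helper names are assumptions, and the century rule shown is the standard Gregorian one the text alludes to.

def is_leap(year: int) -> bool:
    # A year is a leap year if divisible by 4, except century years,
    # which are leap years only when divisible by 400 (e.g., 2000 but not 1900).
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def days_in_month(month: int, year: int) -> int:
    if month == 2:
        return 29 if is_leap(year) else 28
    return 30 if month in (4, 6, 9, 11) else 31

def next_date(month: int, day: int, year: int):
    # Roll the day, then the month, then the year (the December 31 case).
    if day < days_in_month(month, year):
        return month, day + 1, year
    if month < 12:
        return month + 1, 1, year
    return 1, 1, year + 1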
Improved version
If any of conditions c1, c2, or c3 fails, NextDate produces an output indicating the corresponding
variable has an out-of-range value—for example, "Value of month not in the range 1...12." Because
numerous invalid day–month–year combinations exist, NextDate collapses these into one message:
"Invalid Input Date."
The Commission Problem
A rifle salesperson in the former Arizona Territory sold rifle locks, stocks, and barrels made by a
gunsmith in Missouri. Locks cost $45, stocks cost $30, and barrels cost $25. The salesperson had to
sell at least one lock, one stock, and one barrel (but not necessarily one complete rifle) per month,
and production limits were such that the most the salesperson could sell in a month was 70 locks,
80 stocks, and 90 barrels. The gunsmith then knew the sales for the month were complete and
computed the salesperson’s commission as follows: 10% on sales up to (and including) $1000, 15%
on the next $800, and 20% on any sales in excess of $1800.
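The commission rule can be expressed directly. A sketch in Python, with the prices and tiered percentages taken from the problem statement; the function name is an assumption.

def commission(locks: int, stocks: int, barrels: int) -> float:
    # Monthly sales in dollars: locks $45, stocks $30, barrels $25
    sales = 45 * locks + 30 * stocks + 25 * barrels
    # 10% on the first $1000, 15% on the next $800, 20% above $1800
    if sales <= 1000:
        return 0.10 * sales
    if sales <= 1800:
        return 0.10 * 1000 + 0.15 * (sales - 1000)
    return 0.10 * 1000 + 0.15 * 800 + 0.20 * (sales - 1800)

# Example: commission(1, 1, 1) gives 0.10 * 100 = 10.0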
The SATM System
The SATM system communicates with bank customers via the 15 screens shown in Figure 2.4.
SATM customers can select any of three transaction types: deposits, withdrawals, and balance
inquiries. For simplicity, these transactions can only be done on a checking account.
When a bank customer arrives at an SATM station, screen 1 is displayed. The bank
customer accesses the SATM system with a plastic card encoded with a personal account
number (PAN), which is a key to an internal customer account file, containing, among other
things, the customer’s name and account information.
If the customer's PAN matches the information in the customer account file, the system
presents screen 2 to the customer. If the customer's PAN is not found, screen 4 is displayed,
and the card is kept.
At screen 2, the customer is prompted to enter his or her personal identification number
(PIN). If the PIN is correct (i.e., matches the information in the customer account file), the
system displays screen 5; otherwise, screen 3 is displayed. The customer has three chances
to get the PIN correct; after three failures, screen 4 is displayed, and the card is kept.
On entry to screen 5, the customer selects the desired transaction from the options shown on
screen. If balance is requested, screen 14 is then displayed. If a deposit is requested, the
status of the deposit envelope slot is determined from a field in the terminal control file. If
no problem is known, the system displays screen 7 to get the transaction amount.
If a problem occurs with the deposit envelope slot, the system displays screen 12. Once the
deposit amount has been entered, the system displays screen 13, accepts the deposit
envelope, and processes the deposit. The system then displays screen 14.
If a withdrawal is requested, the system checks the status (jammed or free) of the
withdrawal chute in the terminal control file. If jammed, screen 10 is displayed; otherwise,
screen 7 is displayed so the customer can enter the withdrawal amount.
Once the withdrawal amount is entered, the system checks the terminal status file to see if it
has enough currency to dispense. If it does not, screen 9 is displayed; otherwise, the
withdrawal is processed.
If the funds in the account are insufficient, screen 8 is displayed. If the account balance is
sufficient, screen 11 is displayed and the money is dispensed. The balance is printed on the
transaction receipt as it is for a balance request transaction. After the cash has been
removed, the system displays screen 14.
When the “No” button is pressed in screens 10, 12, or 14, the system presents screen 15 and
returns the customer’s ATM card. Once the card is removed from the card slot, screen 1 is
displayed. When the “Yes” button is pressed in screens 10, 12, or 14, the system presents
screen 5 so the customer can select additional transactions.
The Currency Converter
The currency conversion program is another event-driven program that emphasizes code associated
with a GUI. A sample GUI is shown in Figure 2.5.
The application converts US dollars to any of four currencies: Brazilian reals, Canadian dollars,
European Union euros, and Japanese yen. Currency selection is governed by the radio buttons
(option buttons), which are mutually exclusive. When a country is selected, the system responds by
completing the label;
Either before or after currency selection, the user inputs an amount in US dollars. Once both tasks
are accomplished, the user can click on the Compute button, the Clear button, or the Quit button.
Clicking on the Compute button results in the conversion of the US dollar amount to the equivalent
amount in the selected currency.
Saturn Windshield Wiper Controller
The windshield wiper on some Saturn automobiles is controlled by a lever with a dial. The lever
has four positions: OFF, INT (for intermittent), LOW, and HIGH; and the dial has three positions,
numbered simply 1, 2, and 3. The dial positions indicate three intermittent speeds, and the dial
position is relevant only when the lever is at the INT position. The decision table below shows the
windshield wiper speeds (in wipes per minute) for the lever and dial positions.
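The lever and dial combinations form a small decision table. The wiper speeds used below (0, 4, 6, 12, 30, and 60 wipes per minute) follow the values commonly quoted for this example in Jorgensen's text and should be treated as illustrative.

def wiper_speed(lever: str, dial: int) -> int:
    # The dial matters only when the lever is at INT.
    if lever == "OFF":
        return 0
    if lever == "INT":
        return {1: 4, 2: 6, 3: 12}[dial]   # intermittent speeds
    if lever == "LOW":
        return 30
    if lever == "HIGH":
        return 60
    raise ValueError("unknown lever position")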
SOFTWARE QUALITY
Software quality is a multidimensional quantity and is measurable.
Quality Attributes
These can be divided into static and dynamic quality attributes.
Static quality attributes
It refers to the actual code and related documents.
Example: A poorly documented piece of code will be harder to understand and hence difficult to
modify. A poorly structured code might be harder to modify and difficult to test.
Reliability:
It refers to the probability of failure free operation.
Correctness:
Refers to the correct operation and is always with reference to some artefact.
For a tester, correctness is with respect to the requirements.
For a user, correctness is with respect to the user manual.
Completeness:
Refers to the availability of all the features listed in the requirements or in the user manual.
An incomplete software product is one that does not fully implement all features required.
Consistency:
Refers to adherence to a common set of conventions and assumptions.
Ex: All buttons in the user interface might follow a common-color coding convention.
Usability:
Refer to ease with which an application can be used. This is an area in itself and there exist
techniques for usability testing.
Psychology plays an important role in the design of techniques for usability testing.
Usability testing is a testing done by its potential users.
The development organization invites a selected set of potential users and asks them to test
the product.
Users in turn test for ease of use, functionality as expected, performance, safety and
security.
Users thus serve as an important source of tests that developers or testers within the
organization might not have conceived.
Usability testing is sometimes referred to as user-centric testing.
Performance:
Refers to the time the application takes to perform a requested task. Performance is considered as a
non-functional requirement.
Reliability:
Software reliability can vary from one operational profile to another. An implication is that one
might say “this program is lousy” while another might sing praises for the same program. Software
reliability is the probability of failure free operation of software in its intended environments.
During the development of the product, the requirement might have changed from what was stated
originally. Regardless of any change, the expected behaviour of the product is determined by the
tester’s understanding of the requirements during testing.
Requirement 1: It is required to write a program that inputs two integers and outputs the maximum of
these. Requirement 2: It is required to write a program that inputs a sequence of integers and
outputs the sorted version of this sequence.
Suppose now that the tester wants to know whether the two integers are to be input to the program
on one line, or on two lines with a carriage return typed in after each number. The requirement as
stated above fails to provide an answer to this question. This example illustrates the
incompleteness of Requirement 1.
The second requirement in the above example is ambiguous. It is not clear from this
requirement whether the input sequence is to be sorted in ascending or descending order.
To establish correctness via testing would imply testing a program on all elements in the
input domain, which is impossible to accomplish in most cases that are encountered in
practice. Thus, correctness is established via mathematical proofs of programs.
While correctness attempts to establish that the program is error-free, testing attempts to
find if there are any errors in it. Thus, completeness of testing does not necessarily
demonstrate that a program is error-free.
Testing, debugging and the error-removal process together increase confidence in the
correct functioning of the program under test.
When the error is removed by changing the condition x < y to x ≤ y, the program fails again
when the input values are the same. The latter failure is due to the error in function f. In this
program, when the error in f is also removed, the program will be correct assuming that all
other code is correct.
An operational profile is a numerical description of how a program is used. In accordance with the
above definition, a program might have several operational profiles depending on its users.
Testing: Testing is the process of determining whether a program behaves as expected. In the process,
one may discover errors in the program under test.
Debugging: When testing reveals an error, the process used to determine its cause and to remove it
is known as debugging.
As illustrated in figure, testing and debugging are often used as two related activities in a cyclic
manner. Steps are
1. Preparing a test plan
2. Constructing test data
3. Executing the program
4. Specifying program behaviour
5. Assessing the correctness of program behaviour
6. Construction of oracle
The test data is a set of values, one for each input variable.
A test set is a collection of zero or more test cases.
Execution of a program under test is the next significant step in the testing. Execution of this step
for the sort program is most likely a trivial exercise. The complexity of actual program execution is
dependent on the program itself. The output generated by the program may be saved in a file for
subsequent examination by a tester.
State vector: Collecting the current values of program variables into a vector yields what is known as the state
vector. An indication of where the control of execution is at any instant of time can be given by
using an identifier associated with the next program statement.
Construction of oracles:
Construction of an automated oracle, such as one to check a matrix multiplication program or a
sort program, requires determination of the input/output relationship.
Example: Consider a program named Hvideo that allows one to keep track of home videos. In the
data entry mode, it displays a screen in which the user types in information about a DVD. In search
mode, the program displays a screen into which a user can type some attribute of the video being
searched for and set up a search criterion. To test Hvideo, we need to create an oracle that checks
whether the program functions correctly in data entry and search modes. The input generator
generates a data entry request. The input generator then requests the oracle to test whether Hvideo
performed its task correctly on the input given for data entry.
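For a program whose input/output relationship is easy to state, an automated oracle is straightforward. A sketch for the sort program mentioned earlier (the function name is an assumption): the oracle checks that the output is ordered and is a permutation of the input.

from collections import Counter

def sort_oracle(input_seq, output_seq) -> bool:
    # The output must be in non-decreasing order ...
    ordered = all(output_seq[i] <= output_seq[i + 1]
                  for i in range(len(output_seq) - 1))
    # ... and must contain exactly the same elements as the input.
    same_elements = Counter(input_seq) == Counter(output_seq)
    return ordered and same_elements

# Example: sort_oracle([3, 1, 2], [1, 2, 3]) returns True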
TEST METRICS
The term metric refers to a standard of measurement. In software testing, there exist a variety of
metrics; they can be classified into four categories: schedule, quality, resources, and size.
Schedule related metrics: Measure actual completion times of various activities and compare these
with estimated time to completion.
Quality related metrics: Measure quality of a product or a process
Resource related metrics: Measure items such as cost in dollars, man power and test executed.
Size-related metrics: Measure size of various objects such as the source code and number of tests
in a test suite
Organizational metrics: Metrics at the level of an organization are useful in overall project
planning and management. Ex: the number of defects reported after product release, averaged over
a set of products developed and marketed by an organization, is a useful metric of product quality
at the organizational level.
Project metrics:
Project metrics relate to a specific project, for example the I/O device testing project or a compiler
project. These are useful in the monitoring and control of a specific project.
1. The ratio of actual to planned system test effort is one project metric. Test effort could be measured
in terms of tester man-months.
2. Another project metric is the ratio of the number of successful tests to the total number of tests in
the system test phase.
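As a worked example with assumed numbers: if 470 of the 500 tests executed in the system test phase are successful, the second metric is 470/500 = 0.94.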
Process metrics:
The test process consists of several phases such as unit test, integration test, and system test; one can measure
how many defects were found in each phase. It is well known that the later a defect is found, the
costlier it is to fix.
Module 2
Boundary Value Testing
Any program can be considered to be a function in the sense that program inputs form its domain
and program outputs form its range. Input domain testing (also called "boundary value testing") is
the best-known specification-based testing technique. There are two independent considerations that
apply to input domain testing.
The first asks whether or not we are concerned with invalid values of variables.
The second consideration is whether we make the “single fault” assumption common to
reliability theory.
Taken together, the two considerations yield four variations of boundary value testing: normal
boundary value testing, robust boundary value testing, worst-case testing, and robust worst-case testing.
Consider a function, F, of two variables x1 and x2. When the function F is implemented as a program, the
input variables x1 and x2 will have some (possibly unstated) boundaries:
The input space (domain) of our function F is shown in Figure 5.1. Any point within the shaded
rectangle, including the boundaries, is a legitimate input to the function F. The intervals [a, b] and
[c, d] are referred to as the ranges of x1 and x2, so right away we have an overloaded term.
The normal boundary value analysis test cases for our function F of two variables (illustrated in
Figure 5.2) are
{<x1nom, x2min>, <x1nom, x2min+>, <x1nom, x2nom>, <x1nom, x2max–>, <x1nom, x2max>,
<x1min, x2nom>, <x1min+, x2nom>, <x1max–, x2nom>, <x1max, x2nom>}
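A minimal sketch in Python of how the 4n + 1 normal boundary value test cases can be generated mechanically; the five-value set (min, min+, nom, max-, max) is taken from the text, while the function names are assumptions.

def five_values(lo, hi):
    # min, min+, nominal, max-, max for an integer-valued variable
    return [lo, lo + 1, (lo + hi) // 2, hi - 1, hi]

def normal_bva_cases(ranges):
    # ranges: list of (min, max) pairs, one per input variable.
    # Hold every variable at its nominal value and let one variable at a time
    # take its five boundary values, giving 4n + 1 distinct test cases.
    nominals = [(lo + hi) // 2 for lo, hi in ranges]
    cases = {tuple(nominals)}
    for i, (lo, hi) in enumerate(ranges):
        for v in five_values(lo, hi):
            case = list(nominals)
            case[i] = v
            cases.add(tuple(case))
    return sorted(cases)

# Example: the triangle problem with sides in 1..200
# len(normal_bva_cases([(1, 200)] * 3)) is 13 (= 4*3 + 1)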
Most of the discussion of boundary value analysis applies directly to robustness testing, especially
the generalizations and limitations. The most interesting part of robustness testing is not with the
inputs but with the expected outputs.
What happens when a physical quantity exceeds its maximum? If it is the angle of attack of an
airplane wing, the aircraft might stall. If it is the load capacity of a public elevator, we hope nothing
special would happen. If it is a date, like May 32, we would expect an error message. The main
value of robustness testing is that it forces attention on exception handling.
With strongly typed languages, robustness testing may be very awkward. In Pascal, for example, if
a variable is defined to be within a certain range, values outside that range result in run-time errors
that abort normal execution.
Both forms of boundary value testing make the single-fault assumption of reliability theory.
Rejecting the single-fault assumption means that we are interested in what happens when more than
one variable has an extreme value. In electronic circuit analysis, this is called "worst-case analysis";
we use that idea here to generate worst-case test cases.
For each variable, we start with the five-element set that contains the min, min+, nom, max–, and
max values. We then take the Cartesian product (see Chapter 3) of these sets to generate test cases.
The result of the two-variable version of this is shown in Figure 5.4.
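A sketch of worst-case test generation using the Cartesian product, reusing the hypothetical five_values helper from the earlier sketch; for n variables this yields 5^n test cases.

from itertools import product

def worst_case_cases(ranges):
    # Cartesian product of the five-value sets of all variables: 5**n cases.
    value_sets = [five_values(lo, hi) for lo, hi in ranges]
    return list(product(*value_sets))

# Example: 3 variables give 5**3 = 125 worst-case test cases,
# matching the 125 cases cited for the triangle and NextDate problems.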
We could go further to robust worst-case testing. This involves the Cartesian product of the seven-
element sets we used in robustness testing, resulting in 7^n test cases. Figure 5.5 shows the
robust worst-case test cases for our two-variable function.
For the triangle problem, the lower bounds of the ranges are all 1, and we arbitrarily take 200 as an upper
bound. For each side, the test values are {1, 2, 100, 199, 200}. Robust boundary value test cases
will add {0, 201}. Table 5.1 contains boundary value test cases using these ranges. Notice that test
cases 3, 8, and 13 are identical; two should be deleted.
The cross-product of test values will have 125 test cases (some of which will be repeated)—too
many to list here. The full set is available as a spreadsheet in the set of student exercises. Table 5.2
only lists the first 25 worst-case boundary value test cases for the triangle problem.
All 125 worst-case test cases for NextDate are listed in Table 5.3. Take some time to examine it for
gaps of untested functionality and for redundant testing.
Instead of going through 125 test cases again, we will look at some more interesting test cases for
the commission problem. This time, we will look at boundary values derived from the output range,
especially near the threshold points of $1000 and $1800 where the commission percentage changes.
The output space of the commission is shown in Figure 5.6.
Module-2
Equivalence Class Testing
The use of equivalence classes as the basis for functional testing has two motivations:
1) We would like to have a sense of complete testing.
2) We would hope to avoid redundancy.
In boundary value testing—looking at the tables of test cases, it is easy to see massive
redundancy, and looking more closely, serious gaps exist. Equivalence class testing echoes the two
deciding factors of boundary value testing: robustness and the single/multiple fault assumption.
Equivalence class testing has four distinct forms based on these two assumptions.
The single versus multiple fault assumption yields the weak/strong distinction.
The second distinction, the focus on invalid data, yields the normal versus robust distinction.
Two problems occur with robust forms.The first is that, very often, the specification does not
define what the expected output for an invalid input should be.The second problem is that strongly
typed languages eliminate the need for the consideration of invalid inputs.
Equivalence Classes
Equivalence classes form a partition of a set, where partition refers to a collection of
mutually disjoint subsets whose union is the entire set.
This has two important implications for testing—the fact that the entire set is represented
provides a form of completeness.
Disjointedness ensures a form of nonredundancy.
The idea of equivalence class testing is to identify test cases by using one element from each
equivalence class. If the equivalence classes are chosen wisely, this greatly reduces the potential
redundancy among test cases.
E.g.: In the triangle problem, for example, we would certainly have a test case for an equilateral
triangle, and we might pick the triple (5, 5, 5) as inputs for a test case. If we did this, we would not
expect to learn much from test cases such as (6, 6, 6) and (100, 100, 100). Thus, they would be
redundant.
The four forms of equivalence class testing all address the problems of gaps and
redundancies that are common to the four forms of boundary value testing.
There will be one point of overlap—this occurs when equivalence classes are defined by
bounded variables.
In such cases, a hybrid of boundary value and equivalence class testing is appropriate. The
International Software Testing Qualifications Board (ISTQB) syllabi refer to this as “edge
testing.”
Traditional equivalence class testing is nearly identical to weak robust equivalence class testing.
This traditional form focuses on invalid data values. In the early years, it was the program user's
responsibility to provide valid data. There was no guarantee about results based on invalid data. The
phenomenon soon became known as GIGO ("Garbage In, Garbage Out"). The usual response to GIGO
was extensive input validation sections of a program.
Modern programming languages, especially those that feature strong data typing, and
graphical user interfaces (GUIs) have obviated much of the need for input data validation. Indeed,
good use of user interface devices such as drop-down lists and slider bars reduces the likelihood
of bad input data.
Traditional equivalence class testing echoes the process of boundary value testing. Figure 6.1
shows test cases for a function F of two variables x1 and x2. The extension to more realistic cases of
n variables proceeds as follows:
One clear advantage of this process is that it focuses on finding faults due to invalid data. Since the
GIGO concern was on invalid data, the kinds of combinations that we saw in the worst-case
variations of boundary value testing were ignored. Figure 6.1 shows the five test cases for this
process for our continuing function F of two variables.
When F is implemented as a program, the input variables x1 and x2 will have the following
boundaries, and intervals within the boundaries:
a ≤ x1 ≤ d, with intervals [a, b), [b, c), [c, d]
e ≤ x2 ≤ g, with intervals [e, f), [f, g]
where square brackets and parentheses denote, respectively, closed and open interval
endpoints. These ranges are equivalence classes. Invalid values of x1 and x2 are x1 < a, x1 > d, and
x2 < e, x2 > g. The valid equivalence classes are
V1 = {x1: a ≤ x1 < b}, V2 = {x1: b ≤ x1 < c}, V3 = {x1: c ≤ x1 ≤ d}, V4 = {x2: e ≤ x2 < f}, V5 = {x2: f ≤ x2 ≤ g}
and the invalid equivalence classes are
NV1 = {x1: x1 < a}, NV2 = {x1: d < x1}, NV3 = {x2: x2 < e}, NV4 = {x2: g < x2}
The equivalence classes V1, V2, V3, V4, V5, NV1, NV2, NV3, and NV4 are disjoint, and their
union is the entire plane.
Weak normal equivalence class testing is accomplished by using one variable from each
equivalence class (interval) in a test case.
For the running example, we would end up with the three weak equivalence class test cases shown
in Figure 6.2. These three test cases use one value from each equivalence class.
The test case in the lower left rectangle corresponds to a value of x1 in the class [a, b), and
to a value of x2 in the class [e, f ).
The test case in the upper center rectangle corresponds to a value of x1 in the class [b, c)
and to a value of x2 in the class [f, g].
The third test case could be in either rectangle on the right side of the valid values.
There could be a problem with x1, or a problem with x2, or maybe an interaction between the two.
This ambiguity is the reason for the "weak" designation. If the expectation of failure is low, as it is
for regression testing, this can be an acceptable choice. When more fault isolation is needed, the
stronger forms are required.
Strong equivalence class testing is based on the multiple fault assumption, so we need test cases
from each element of the Cartesian product of the equivalence classes, as shown in Figure 6.3. The
Cartesian product guarantees that we have a notion of "completeness" in two senses: we cover all
the equivalence classes, and we have one of each possible combination of inputs.
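A sketch contrasting the two forms: weak normal testing takes one representative per class per variable (single fault assumption), while strong normal testing takes the Cartesian product of the representatives (multiple fault assumption). The representative values below are assumed placeholders.

from itertools import product, zip_longest

# One representative value chosen from each valid equivalence class (assumed values).
x1_classes = [5, 15, 25]     # representatives of [a, b), [b, c), [c, d]
x2_classes = [8, 18]         # representatives of [e, f), [f, g]

# Weak normal: one test case per class, pairing classes until all are covered.
weak_normal = list(zip_longest(x1_classes, x2_classes,
                               fillvalue=x2_classes[-1]))
# gives 3 test cases: (5, 8), (15, 18), (25, 18)

# Strong normal: Cartesian product of the class representatives.
strong_normal = list(product(x1_classes, x2_classes))
# gives 3 * 2 = 6 test cases, one for every combination of classes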
How can something be both weak and robust? The robust part comes from consideration of
invalid values, and the weak part refers to the single fault assumption. The process of weak
robust equivalence class testing is a simple extension of that for weak normal equivalence class
testing—pick test cases such that each equivalence class is represented.
The two additional test cases that cover all four classes of invalid values are shown in Figure 6.4. The
process is similar to that for boundary value testing:
1. For valid inputs, use one value from each valid class (as in what we have called weak normal
equivalence class testing). (Note that each input in these test cases will be valid.)
2. For invalid inputs, a test case will have one invalid value and the remaining values will all
be valid. (Thus, a “single failure” should cause the test case to fail.)
There is a potential problem with these test cases. Consider the test cases in the upper left and
lower right corners. Each of these test cases represents values from two invalid equivalence
classes. Failure of either of these could be due to the interaction of two variables. Figure 6.5
presents a compromise between "pure" weak normal equivalence class testing and its robust
extension.
The robust part comes from consideration of invalid values, and the strong part refers to the
multiple fault assumption. We obtain test cases from each element of the Cartesian product of all
the equivalence classes, both valid and invalid, as shown in Figure 6.6.
In the problem statement, we note that four possible outputs can occur: NotATriangle, Scalene,
Isosceles, and Equilateral. We can use these to identify output (range) equivalence classes as
follows.
Four weak normal equivalence class test cases, chosen arbitrarily from each class are as follows:
Because no valid subintervals of variables a, b, and c exist, the strong normal equivalence class
test cases are identical to the weak normal equivalence class test cases.
Considering the invalid values for a, b, and c yields the following additional weak robust
equivalence class test cases. (The invalid values could be zero, any negative number, or
any number greater than 200.)
Here is one "corner" of the cube in three-space of the additional strong robust equivalence class
test cases:
Equivalence class testing is clearly sensitive to the equivalence relation used to define classes.
Here is another instance of craftsmanship.
D1 = {<a, b, c>: a = b = c}
D2 = {<a, b, c>: a = b, a ≠ c}
D3 = {<a, b, c>: a = c, a ≠ b}
D4 = {<a, b, c>: b = c, a ≠ b}
D5 = {<a, b, c>: a ≠ b, a ≠ c, b ≠ c}
Invalid (Not a Triangle) cases:
D6 = {<a, b, c>: a ≥ b + c}
D7 = {<a, b, c>: b ≥ a + c}
D8 = {<a, b, c>: c ≥ a + b}
The NextDate function illustrates very well the craft of choosing the underlying equivalence
relation. NextDate is a function of three variables: month, day, and year, and these have intervals of
valid values defined as follows:
M1 = {month: 1 ≤ month ≤ 12}
D1 = {day: 1 ≤ day ≤ 31}
Y1 = {year: 1812 ≤ year ≤ 2012}
Because the number of valid classes equals the number of independent variables, only oneweak
normal equivalence class test case occurs, and it is identical to the strong normal equivalence class
test case:
As with the triangle problem, here is one “corner” of the cube in three-space of the additional
strong robust equivalence class test cases:
The problem of leap year makes determining the last day of a month interesting. With all this in
mind, we might postulate the following equivalence classes:
These classes yield the following weak normal equivalence class test cases. As before, the inputs
are mechanically selected from the approximate middle of the corresponding class:
The strong normal equivalence class test cases for the revised classes are as follows:
The input domain of the commission problem is “naturally” partitioned by the limits on locks,
stocks, and barrels. These equivalence classes are exactly those that would also be identified by
traditional equivalence class testing. The first class is the valid input; the other two are invalid.
The salesperson's commission is as follows: 10% on sales up to (and including) $1000, 15%
on the next $800, and 20% on any sales in excess of $1800.
The fact that our input is a triplet means that we no longer take test cases from a Cartesian product.
Here is one “corner” of the cube in 3-space of the additional strong robust equivalence class test
cases:
Module-2
Fault-Based Testing
Overview
Experience with common software faults sometimes leads to improvements in design methods and
programming languages. The basic concept of fault-based testing is to select test cases that would
distinguish the program under test from alternative programs that contain hypothetical
faults. This is usually approached by modifying the program under test to actually produce the
hypothetical faulty programs. Fault seeding can be used to evaluate the thoroughness of a test suite
(that is, as an element of a test adequacy criterion), or for selecting test cases to augment a test
suite, or to estimate the number of faults in a program.
The effectiveness of fault-based testing depends on the quality of the fault model and on
some basic assumptions about the relation of the seeded faults to faults that might actually
be present. In practice, the seeded faults are small syntactic changes, like replacing one
variable reference by another in an expression, or changing a comparison from < to <=.
Fault-based testing relies on the competent programmer hypothesis, an assumption
that the program under test is "close to" (in the sense of textual difference) a correct
program.
Some program faults are indeed simple typographical errors; others, involving an
error of logic, result in much more complex differences in program text. This may not
invalidate fault-based testing with a simpler fault model, provided test cases sufficient for
detecting the simpler faults are also sufficient for detecting the more complex faults. This is
known as the coupling effect.
Fault-based testing can guarantee fault detection only if the competent programmer
hypothesis and the coupling effect hypothesis hold.
Mutation Analysis
Mutation analysis is the most common form of software fault-based testing. A fault model is used
to produce hypothetical faulty programs by creating variants of the program under test. Variants
are created by “seeding” faults, that is, by making a small change to the program under test
following a pattern in the fault model. The patterns for changing program text are called mutation
operators, and each variant program is called a mutant.
We say a mutant is valid if it is syntactically correct. A mutant obtained from the program
of Figure 16.1 by substituting while for switch in the statement at line 13 would not be
valid, since it would result in a compile-time error.
We say a mutant is useful if, in addition to being valid, its behavior
differs from the behavior of the original program for no more than a small subset of
program test cases. A mutant obtained by substituting 0 for 1000 in the statement at line 4
would be valid, but not useful, since the mutant would be distinguished from the program
under test by all inputs and thus would not give any useful information on the effectiveness
of a test suite.
Defining mutation operators that produce valid and useful mutations is a nontrivial
task.
Given a program and a test suite T, mutation analysis consists of the following steps:
1. Select mutation operators: If we are interested in specific classes of faults, we may select
a set of mutation operators relevant to those faults.
2. Generate mutants: Mutants are generated mechanically by applying mutation operators to
the original program.
3. Distinguish mutants: Execute the original program and each generated mutant with the test
cases in T. A mutant is killed when it can be distinguished from the original program.
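A toy illustration of these steps in Python, using a single hypothetical mutation operator (replacing < with <=) applied to a small function; real mutation tools apply such operators to the program text in the same spirit.

def original(x, y):
    return x < y                      # program under test: strict comparison

def mutant(x, y):
    return x <= y                     # seeded fault: '<' replaced by '<='

test_suite = [(1, 2), (5, 3), (4, 4)]

def killed(p, m, tests):
    # Step 3: a mutant is killed if some test case distinguishes it from the original.
    return any(p(*t) != m(*t) for t in tests)

print(killed(original, mutant, test_suite))   # True: the case (4, 4) kills the mutant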
Figure 16.3 shows a sample of mutants for program Transduce, obtained by applying the mutation
operators in Figure 16.2. Test suite TS
TS = {1U, 1D, 2U, 2D, 2M, End, Long}
kills Mj, which can be distinguished from the original program by test cases 1D, 2U, 2D, and 2M.
Mutants Mi, Mk, and Ml are not distinguished from the original program by any test in TS.
We say that mutants not killed by a test suite are live.
The mutant can be distinguished from the original program, but the test suite T does not
contain a test case that distinguishes them (i.e., the test suite is not adequate with respect to
the mutant).
The mutant cannot be distinguished from the original program by any test case (i.e., the
mutant is equivalent to the original program).
The adequacy of the test suite TS evaluated with respect to the four mutants of Figure 16.3 is 25%.
However, we can easily observe that mutant Mi is equivalent to the original program (i.e., no input
would distinguish it). Conversely, mutants Mk and Ml seem to be nonequivalent to the original
program: There should be at least one test case that distinguishes each of them from the original
program. Thus the adequacy of TS, measured after eliminating the equivalent mutant Mi, is 33%.
Mutant Ml is killed by test case Mixed, which represents the unusual case of an input file
containing both DOS- and Unix-terminated lines.
1. Strong mutation: the output of the program is compared at the end of the program
execution.
The mutation analysis process described in the preceding sections, which kills mutants
based on the outputs produced by execution of test cases, is known as strong mutation.
Disadvantage: The time and space required for compiling all mutants and for executing all
test cases for each mutant may be impractical.
2. Weak mutation: the state of the program is compared after every execution of the
component.
Weak mutation analysis decreases the number of tests to be executed by killing mutants
when they produce a different intermediate state, rather than waiting for a difference in
the final result or observable program behavior.
3. Meta-mutant: The program is divided into segments containing original and mutated source
code, with a mechanism to select which segments to execute. Two copies of the meta-mutant
are executed in parallel, one with only original program code selected and the
other with a set of live mutants selected. Execution is paused after each segment to
compare the program state of the two versions.
If the state is equivalent, execution resumes with the next segment of original and
mutated code.
If the state differs, the mutant is marked as dead, and execution of original and
mutated code is restarted with a new selection of live mutants.
4. Statistical mutation: Used to estimate the number of faults remaining in a program. Usually
we know only the number of faults that have been detected, and not the number that
remain. However, to the extent that the fault model is a valid statistical model of
actual fault occurrence, we can estimate that the ratio of actual faults found to those still
remaining should be similar to the ratio of seeded faults found to those still remaining.
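A worked sketch of the fault-seeding estimate implied above; the numbers are assumed for illustration.

def estimate_total_natural_faults(seeded, seeded_found, natural_found):
    # Assume natural faults are found at the same rate as seeded faults:
    #   natural_found / total_natural is approximately seeded_found / seeded
    return natural_found * seeded / seeded_found

# Example (assumed numbers): 100 faults seeded, 60 of them found,
# and 45 natural faults found so far, giving an estimate of about 75 natural
# faults in total, i.e., roughly 30 still remaining.
print(estimate_total_natural_faults(100, 60, 45))   # 75.0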
MODULE-3
(STRUCTURAL TESTING)
Path Testing
Program Graphs
Definition
Given a program written in an imperative programming language, its program graph is a directed
graph in which nodes are statement fragments, and edges represent flow of control. If i and j are
nodes in the program graph, an edge exists from node i to node j if and only if the statement
fragment corresponding to node j can be executed immediately after the statement fragment
corresponding to node i.
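A program graph is simply a directed graph over statement fragments. A minimal sketch in Python representing one as an adjacency mapping; the node numbers are illustrative, not those of Figure 8.2.

# Adjacency-list representation of a small program graph:
# node 1 is an if-test, nodes 2 and 3 are the two branches, node 4 is the join/exit.
program_graph = {
    1: [2, 3],    # the decision node has outdegree 2
    2: [4],
    3: [4],
    4: [],        # sink node: outdegree 0
}

def edges(graph):
    # Every (i, j) pair such that j can execute immediately after i.
    return [(i, j) for i, succs in graph.items() for j in succs]

print(edges(program_graph))   # [(1, 2), (1, 3), (2, 4), (3, 4)]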
Constructing the program graph from a given program is an easy process. It is illustrated here with four of the basic
structured programming constructs (Figure 8.1), and also with our pseudocode implementation of
the triangle program.
A program graph of the second version of the triangle problem is given in Figure 8.2.
Nodes 4 through 8 are a sequence, nodes 9 through 12 are an if–then–else construct, and
nodes 13 through 22 are nested if–then–else constructs. Nodes 4 and 23 are the program
source and sink nodes, corresponding to the single-entry, single-exit criteria. No loops exist.
The importance of the program graph is that program executions correspond to paths from
the source to the sink node. Because test cases force the execution of some such program
path, we now have a very explicit description of the relationship between a test case and the
part of the program it exercises.
There are detractors of path-based testing. Figure 8.3 is a graph of a simple (but
unstructured!) program; in this program, five paths lead from node B to node F in the interior
of the loop. If the loop may have up to 18 repetitions, some 4.77 trillion distinct program
execution paths exist.
DD-Paths
We will define DD-paths in terms of paths of nodes in a program graph. In graph theory, these
paths are called chains, where a chain is a path in which the initial and terminal nodes are distinct,
and every interior node has indegree = 1 and outdegree = 1.
Notice that the initial node is 2-connected to every other node in the chain, and no instances of 1- or
3-connected nodes occur, as shown in Figure 8.4. The length (number of edges) of the chain in
Figure 8.4 is 6.
Definition
A DD-path is a sequence of nodes in a program graph such that
Case 1: it consists of a single node with indeg = 0,
Case 2: it consists of a single node with outdeg = 0,
Case 3: it consists of a single node with indeg ≥ 2 or outdeg ≥ 2,
Case 4: it consists of a single node with indeg = 1 and outdeg = 1,
Case 5: it is a maximal chain of length ≥ 1.
Cases 1 and 2 establish the unique source and sink nodes of the program graph of a
structured program as initial and final DD-paths.
Case 3 deals with complex nodes; it assures that no node is contained in more than one DD-
path.
Case 4 is needed for “short branches”; it also preserves the one-fragment, one DD-path
principle.
Case 5 is the “normal case,” in which a DD-path is a single entry, single-exit sequence of
nodes (a chain). The “maximal” part of the case 5 definition is used to determine the final
node of a normal (nontrivial) chain.
Definition
Given a program written in an imperative language, its DD-path graph is the directed graph in
which nodes are DD-paths of its program graph, and edges represent control flow between
successor DD-paths.
Node 4 is a case 1 DD-path; we will call it “first.” Similarly, node 23 is a case 2 DD-path,
and we will call it “last.”
Nodes 5 through 8 are case 5 DD-paths. We know that node 8 is the last node in this DD-
path because it is the last node that preserves the 2-connectedness property of the chain. If
we go beyond node 8 to include node 9, we violate the indegree = outdegree = 1 criterion of
a chain.
If we stop at node 7, we violate the “maximal” criterion. Nodes 10, 11, 15, 17, 18, and 21
are case 4 DD-paths. Nodes 9, 12, 13, 14, 16, 19, 20, and 22 are case 3 DD-paths. Finally,
node 23 is a case 2 DD-path. All this is summarized in Figure 8.5.
Test coverage metrics are a device to measure the extent to which a set of test cases covers (or
exercises) a program.
Given a program graph, we can define the following set of test coverage metrics. We will use them
to relate to other published sets of coverage metrics.
Definition-1
Given a set of test cases for a program, they constitute node coverage if, when executed on the
program, every node in the program graph is traversed. Denote this level of coverage as Gnode,
where the G stands for program graph.
Definition-2
Given a set of test cases for a program, they constitute edge coverage if, when executed on the
program, every edge in the program graph is traversed. Denote this level of coverage as Gedge.
Definition-3
Given a set of test cases for a program, they constitute chain coverage if, when executed on the
program, every chain of length greater than or equal to 2 in the program graph is traversed.
Denote this level of coverage as Gchain.
Definition-4
Given a set of test cases for a program, they constitute path coverage if, when executed on the
program, every path from the source node to the sink node in the program graph is traversed.
Denote this level of coverage as Gpath.
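As an illustration (not from the text), Gnode and Gedge coverage can be computed mechanically from the paths executed by a set of test cases; the small graph and path below are assumed, reusing the adjacency-mapping idea from the earlier sketch.

def node_and_edge_coverage(graph, executed_paths):
    # graph: dict mapping node -> list of successor nodes
    all_nodes = set(graph)
    all_edges = {(i, j) for i, succs in graph.items() for j in succs}

    covered_nodes, covered_edges = set(), set()
    for path in executed_paths:            # each path is a list of nodes
        covered_nodes.update(path)
        covered_edges.update(zip(path, path[1:]))

    return (len(covered_nodes) / len(all_nodes),     # Gnode coverage
            len(covered_edges) / len(all_edges))     # Gedge coverage

# Example: one test forcing the path 1-2-4 through the small graph
# gives 75% node coverage and 50% edge coverage.
graph = {1: [2, 3], 2: [4], 3: [4], 4: []}
print(node_and_edge_coverage(graph, [[1, 2, 4]]))    # (0.75, 0.5)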
Several widely accepted test coverage metrics are used; most of those in Table 8.1 are due to the
early work of Miller (1977).
Statement Testing
Because our formulation of program graphs allows statement fragments to be individual nodes, Miller's
C0 metric is subsumed by our Gnode metric. If some statements have not been executed by the set of
test cases, there is clearly a severe gap in the test coverage. Although less adequate than DD-path
coverage, the statement coverage metric (C0) is still widely accepted.
DD-Path Testing
When every DD-path is traversed (the C1 metric), we know that each predicate outcome has been
executed; this amounts to traversing every edge in the DD-path graph (or program
graph). Therefore, the C1 metric is exactly our Gchain metric. For if–then and if–then–else
statements, this means that both the true and the false branches are covered (C1p coverage).
The C2 metric requires DD-path coverage (the C1 metric) plus loop testing. The simple view of
loop testing is that every loop involves a decision, and we need to test both outcomes of the
decision: one is to traverse the loop, and the other is to exit (or not enter) the loop. Notice that
this is equivalent to the Gedge test coverage.
Miller’s Cik metric extends the loop coverage metric to include full paths from source to sink nodes
that contain loops.
Concatenated loops are simply a sequence of disjoint loops, while nested loops are such that one
is contained inside another. Knotted (Beizer calls them “horrible”) loops cannot occur when the
structured programming precepts are followed, but they can occur in languages like Java with
try/catch.
Miller's CMCC metric addresses the question of testing decisions made by compound conditions.
The Cstat metric is awkward: what constitutes a statistically significant set of full program paths? Maybe this refers to a comfort level on the part of the customer/user.
The subscript in Miller's C∞ metric says it all: the number of full program paths can be enormous for programs with loops.
Modified Condition Decision Coverage (MCDC)
"A Boolean expression evaluates to one of two possible (Boolean) outcomes traditionally known as False and True." A Boolean expression may be a simple Boolean variable, or a compound expression containing one or more Boolean operators. Chilenski classifies Boolean operators into four categories:
In mathematical logic, Boolean expressions are known as logical expressions, where a logical
expression can be
“A condition is an operand of a Boolean operator. Generally this refers to the lowest level
conditions, which are normally the leaves of an expression tree. Note that a condition is a Boolean
(sub)expression.”
Coupled Conditions (per Chilenski): Two (or more) conditions are coupled if changing one also
changes the other(s).When conditions are coupled, it may not be possible to vary individual
conditions, because the coupled condition(s) might also change.
Software Testing(17IS63) Page 83
Maharaja Institute of Technology Mysore Department of ISE
In a strongly coupled pair, changing one condition always changes the other.
In a weakly coupled triplet, changing one condition may change one other coupled
condition, but not the third one.
"The process of masking conditions involves setting one operand of an operator to a value such that changing the other operand of that operator does not change the value of the expression."
1. For an AND operator, masking of one operand can be achieved by holding the other
operand False.
(X AND False = False AND X = False no matter what the value of X is.)
2. For an OR operator, masking of one operand can be achieved by holding the other operand
True.
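A quick sanity check of the two masking rules, using plain Python Boolean operators (nothing here is specific to the text; it simply evaluates the identities quoted above for both values of x):

# Masking: an AND is pinned by a False operand, an OR by a True operand.
for x in (False, True):
    assert (x and False) == False      # AND: holding the other operand False masks x
    assert (False and x) == False
    assert (x or True) == True         # OR: holding the other operand True masks x
    assert (True or x) == True
print("masking rules hold for both values of x")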
MCDC has three variations: Masking MCDC, Unique-Cause MCDC, and Unique-Cause + Masking
MCDC.
Definition
MCDC requires
2. Every program entry point and exit point must be invoked at least once.
3. All possible outcomes of every control statement are taken at least once.
4. Every nonconstant Boolean expression has been evaluated to both true and false outcomes.
5. Every nonconstant condition in a Boolean expression has been evaluated to both true and
false outcomes.
6. Every nonconstant condition in a Boolean expression has been shown to independently
affect the outcomes (of the expression).
“Unique-Cause MCDC [requires] a unique cause (toggle a single condition and change the
expression result) for all possible (uncoupled) conditions.”
“Unique-Cause + Masking MCDC [requires] a unique cause (toggle a single condition and
change the expression result) for all possible (uncoupled) conditions. In the case of strongly
coupled conditions, masking [is allowed] for that condition only, i.e., all other (uncoupled)
conditions will remain fixed.”
“Masking MCDC allows masking for all conditions, coupled and uncoupled (toggle a single
condition and change the expression result) for all possible (uncoupled) conditions. In the case of
strongly coupled conditions, masking [is allowed] for that condition only (i.e., all other
(uncoupled) conditions will remain fixed).”
Examples
Consider the program fragment in Figure 8.7. It is deceptively simple, with a cyclomatic
complexity of 2.
To attain MCDC, each condition must be evaluated to both true and false while holding the other
conditions constant, and the change must be visible at the outcome. Rules 1 and 5 toggle condition
a; rules 2 and 4 toggle condition b; and rules 3 and 4 toggle condition c.
In the Chilenski (2001) paper (p. 9), it happens that the Boolean expression used is
(a AND (b OR c))
In its expanded form, (a AND b) OR (a AND c), the Boolean variable a cannot be subjected to
unique cause MCDC testing because it appears in both AND expressions.
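The independence pairs for a AND (b OR c) can be found mechanically. The sketch below (illustrative only; it enumerates its own truth table rather than the text's rule numbering) toggles one condition at a time and keeps the pairs of value assignments whose outcomes differ, which is exactly the "change must be visible at the outcome" requirement:

from itertools import product

def expr(a, b, c):
    # The expression discussed above, from Chilenski's paper.
    return a and (b or c)

rows = list(product([False, True], repeat=3))   # all 8 assignments to (a, b, c)

for pos, name in enumerate("abc"):
    pairs = []
    for r1 in rows:
        r2 = list(r1)
        r2[pos] = not r2[pos]                   # toggle only this condition
        r2 = tuple(r2)
        if expr(*r1) != expr(*r2):              # change is visible at the outcome
            pairs.append((r1, r2))
    # Each pair appears in both orders; that is harmless for illustration.
    print(name, "independence pairs:", pairs)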
In our continuing NextDate problem, suppose we have some code checking for valid inputs of the day,
month, and year variables. A code fragment for this and its program graph are in Figure 8.8.
Table 8.3 is a decision table for the NextDate code fragment. Since the day, month, and year variables
are all independent, each can be either true or false. The cyclomatic complexity of the program graph in
Figure 8.8 is 5.
Decision coverage is attained by exercising any pair of rules such that each action is executed at least
once. Test cases corresponding to rule 1 and any one of rules 2–8 provide decision coverage.
Multiple condition coverage requires exercising a set of rules such that each condition is evaluated to both True and False. The eight test cases corresponding to all eight rules are necessary to provide multiple condition coverage. To attain MCDC, each condition must be evaluated to both true and false while holding the other conditions constant, and the change must be visible at the outcome. Rules 1 and 2 toggle condition yearOK; rules 1 and 3 toggle condition monthOK; and rules 1 and 5 toggle condition dayOK.
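A minimal sketch of the kind of validity check Figure 8.8 describes, assuming the usual NextDate ranges (day 1–31, month 1–12, year 1812–2012); the actual fragment in the figure may differ in detail:

# Hedged sketch of a NextDate-style input check; the ranges are assumptions.
def valid_inputs(day, month, year):
    day_ok = 1 <= day <= 31
    month_ok = 1 <= month <= 12
    year_ok = 1812 <= year <= 2012
    if day_ok and month_ok and year_ok:
        return True           # proceed to compute the next date
    return False              # real code would report which input is out of range

# MCDC-style cases: toggle one condition at a time from an all-true baseline.
print(valid_inputs(15, 6, 2000))    # True  (baseline: all three conditions true)
print(valid_inputs(32, 6, 2000))    # False (only dayOK toggled)
print(valid_inputs(15, 13, 2000))   # False (only monthOK toggled)
print(valid_inputs(15, 6, 1800))    # False (only yearOK toggled)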
This example is included to show important differences between it and the first two examples. The
code fragment in Figure 8.9 is the part of the triangle program that checks to see if the values of
sides a, b, and c constitute a triangle.
The dependence among a, b, and c is the cause of the four impossible rules in the decision table for
the fragment in Table 8.4; this is proved next.
This code fragment avoids the numerically impossible combinations of a, b, and c. There are four
distinct paths through its program graph, and these correspond to rules 1, 2, 3, and 5 in the decision
table.
Basis Path Testing
Mathematicians usually define a basis in terms of a structure called a "vector space," which is a set of elements (called vectors) together with operations that correspond to multiplication and addition defined for the vectors.
Figure 8.10 is taken from McCabe (1982). It is a directed graph that we might take to be the
program graph (or the DD-path graph) of some program.The program does have a single entry (A)
and a single exit (G). McCabe based his view of testing on a major result from graph theory, which
states that the cyclomatic number (see Chapter 4) of a strongly connected graph is the number of
linearly independent circuits in the graph.
We can always create a strongly connected graph by adding an edge from the (every) sink node to the (every) source node. The right side of Figure 8.10 shows the result of doing this; it also contains edge labels that are used in the discussion that follows. The number of linearly independent paths from the source node to the sink node of the graph on the left side of Figure 8.10 is V(G) = e – n + 2p; the number of linearly independent circuits of the (strongly connected) graph on the right side of Figure 8.10 is V(G) = e – n + p, where e is the number of edges, n the number of nodes, and p the number of connected components.
The cyclomatic complexity of the strongly connected graph in Figure 8.10 is 5; thus, there are five linearly independent circuits. If we now delete the added edge from node G to node A, these five circuits become five linearly independent paths from node A to node G. In small graphs, we can visually identify independent paths. Here, we identify paths as sequences of nodes:
Table 8.5 shows the edges traversed by each path, and also the number of times an edge is
traversed. We can force this to begin to look like a vector space by defining notions of addition and
scalar multiplication:
His illustration of the basis part of this framework is that the path A, B, C, B, E, F, G is the linear combination p2 + p3 – p1, and the path A, B, C, B, C, B, C, G is the linear combination 2p2 – p1.
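One way to make the vector-space view concrete is to represent each path by its vector of edge-traversal counts, so that addition and scalar multiplication become componentwise. The sketch below uses assumed node sequences for p1, p2, and p3 (standing in for Table 8.5, which is not reproduced here) and checks the combinations 2p2 – p1 and p2 + p3 – p1; under these assumed paths the plain sum p2 + p3 does not reproduce the first path, which is why the –p1 term appears:

from collections import Counter

# Illustrative sketch: a path becomes a vector of edge-traversal counts.
def edge_vector(path):
    return Counter(zip(path, path[1:]))

def combine(*terms):
    """Integer linear combination of paths; terms are (coefficient, path) pairs."""
    total = Counter()
    for coeff, path in terms:
        for edge, count in edge_vector(path).items():
            total[edge] += coeff * count
    return {e: c for e, c in total.items() if c != 0}

p1 = ["A", "B", "C", "G"]                  # assumed basis paths (not the text's table)
p2 = ["A", "B", "C", "B", "C", "G"]
p3 = ["A", "B", "E", "F", "G"]

target1 = dict(edge_vector(["A", "B", "C", "B", "E", "F", "G"]))
target2 = dict(edge_vector(["A", "B", "C", "B", "C", "B", "C", "G"]))
print(combine((1, p2), (1, p3), (-1, p1)) == target1)   # p2 + p3 - p1 -> True
print(combine((2, p2), (-1, p1)) == target2)            # 2*p2 - p1    -> True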
McCabe next develops an algorithmic procedure (called the baseline method) to determine a set of
basis paths.
The method begins with the selection of a baseline path, which should correspond to some
“normal case” program execution.
McCabe advises choosing a path with as many decision nodes as possible.
Next, the baseline path is retraced, and in turn each decision is “flipped”; that is, when a
node of outdegree ≥ 2 is reached, a different edge must be taken.
Here we follow McCabe’s example, in which he first postulates the path through nodes A,
B, C, B, E, F, G as the baseline.
Notice that this set of basis paths is distinct from the one in Table 8.6: this is not problematic because a unique basis is not required.
Time for a reality check: if you follow paths p2 and p3, you find that they are both infeasible. Path
p2 is infeasible because passing through node D means the sides are not a triangle; so the outcome
of the decision at node F must be node G. Similarly, in p3, passing through node C means the sides
do form a triangle; so node G cannot be traversed. Paths p4 and p5 are both feasible and correspond
to equilateral and isosceles triangles, respectively. Notice that we do not have a basis path for the
NotATriangle case.
Another approach is to reason about logical dependencies. If we think about this problem, we can identify two rules:
The triangle problem is atypical in that no loops occur. The program has only eight topologically possible paths; and of these, only the four basis paths listed above are feasible. Thus, for this special case, we arrive at the same test cases as we did with special value testing and output range testing.
Essential Complexity
Part of McCabe’s work on cyclomatic complexity does more to improve programming than testing.
In this section, we take a quick look at this elegant blend of graph theory, structured programming,
and the implications these have for testing.
We condense the program graph around the structured programming constructs, which are repeated as Figure 8.11.
McCabe (1976) went on to find elemental "unstructures" that violate the precepts of structured programming. These are shown in Figure 8.13. Each of these violations contains three distinct paths, as opposed to the two paths present in the corresponding structured programming constructs.
The basic idea is to look for the graph of one of the structured programming constructs, collapse it
into a single node, and repeat until no more structured programming constructs can be found.
This process is followed in Figure 8.12, which starts with the DD-path graph of the pseudocode
triangle program. The if–then–else construct involving nodes B, C, D, and E is condensed into node
a, and then the three if–then constructs are condensed onto nodes b, c, and d. The remaining if–
then–else (which corresponds to the IF IsATriangle statement) is condensed into node e, resulting
in a condensed graph with cyclomatic complexity V(G) = 1. In general, when a program is well
structured, it can always be reduced to a graph with one path.
McCabe (1982) was partly right when he observed, "It is important to understand that these are purely criteria that measure the quality of testing, and not a procedure to identify test cases." He was referring to the DD-path coverage metric and his basis path heuristic based on the cyclomatic complexity metric. Basis path testing therefore gives us a lower bound on how much testing is necessary.
Path-based testing also provides us with a set of metrics that act as crosschecks on specification-based testing. We can use these metrics to resolve the gaps and redundancies question.
MODULE 3
Data Flow Testing (Chapter 2)
Data flow testing refers to forms of structural testing that focus on the points at which variables
receive values and the points at which these values are used (or referenced).
Early data flow analyses often centered on a set of faults that are now known as define/reference anomalies: a variable that is defined but never used (referenced), a variable that is used before it is defined, and a variable that is defined twice before it is used.
Define/Use Testing
The following definitions refer to a program P that has a program graph G(P) and a set of program variables V. The program graph G(P) is constructed as before, with statement fragments as nodes and edges that represent node sequences. G(P) has a single-entry node and a single-exit node; paths, subpaths, and cycles are as defined earlier.
Definition
Node n ∈ G(P) is a defining node of the variable v ∈ V, written as DEF(v, n), if and only if the value of variable v is defined at the statement fragment corresponding to node n.
Input statements, assignment statements, loop control statements, and procedure calls are all
examples of statements that are defining nodes. When the code corresponding to such statements
executes, the contents of the memory location(s) associated with the variables are changed.
Definition
Node n ∈ G(P) is a usage node of the variable v ∈ V, written as USE(v, n), if and only if the value of the variable v is used at the statement fragment corresponding to node n.
Output statements, assignment statements, conditional statements, loop control statements, and
procedure calls are all examples of statements that are usage nodes. When the code corresponding
to such statements executes, the contents of the memory location(s) associated with the
variables remain unchanged.
Definition
A usage node USE(v, n) is a predicate use (denoted as P-use) if and only if the statement n is a
predicate statement; otherwise, USE(v, n) is a computation use (denoted C-use). The nodes
corresponding to predicate uses always have an outdegree ≥ 2, and nodes corresponding to
computation uses always have an outdegree ≤ 1.
Definition
A definition/use path with respect to a variable v (denoted du-path) is a path in PATHS(P) such
that, for some v ∈ V, there are define and usage nodes DEF(v, m) and USE(v, n) such that m
and n are the initial and final nodes of the path.
Definition
A du-path with respect to a variable v is definition clear (denoted dc-path) if and only if its initial and final nodes DEF(v, m) and USE(v, n) are such that no other node in the path is a defining node of v.
Example
We will use the commission problem and its program graph to illustrate these definitions. The
numbered pseudocode and its corresponding program graph are shown in Figure 9.1. This program
computes the commission on the sales of the total numbers of locks, stocks, and barrels sold.
The while loop is a classic sentinel controlled loop in which a value of –1 for locks signifies the
end of the sales data. The totals are accumulated as the data values are read in the while loop. After
printing this preliminary information, the sales value is computed, using the constant item prices
defined at the beginning of the program. The sales value is then used to compute the commission
in the conditional portion of the program.
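For readers who want runnable code alongside Figure 9.1, here is a compressed Python sketch of the computation just described. The item prices and commission breakpoints are the textbook's usual values and are stated here only as assumptions; the numbered pseudocode in Figure 9.1 remains the reference for node numbering.

# Hedged sketch of the commission computation (prices/breakpoints assumed).
def commission(monthly_sales):
    lock_price, stock_price, barrel_price = 45.0, 30.0, 25.0
    total_locks = total_stocks = total_barrels = 0          # DEF of the totals
    for locks, stocks, barrels in monthly_sales:
        if locks == -1:                                      # sentinel ends the sales data
            break
        total_locks += locks                                 # each total is USEd, then re-DEFined
        total_stocks += stocks
        total_barrels += barrels
    sales = (lock_price * total_locks + stock_price * total_stocks
             + barrel_price * total_barrels)                 # USE of all three totals
    if sales > 1800.0:
        comm = 0.10 * 1000.0 + 0.15 * 800.0 + 0.20 * (sales - 1800.0)
    elif sales > 1000.0:
        comm = 0.10 * 1000.0 + 0.15 * (sales - 1000.0)
    else:
        comm = 0.10 * sales
    return comm

print(commission([(10, 10, 10), (-1, 0, 0)]))                # sales = 1000.0 -> 100.0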
Figure 9.2 shows the decision-to-decision path (DD-path) graph of the program graph in Figure 9.1.
More compression exists in this DD-path graph because of the increased computation in the
commission problem. Table 9.1 details the statement fragments associated with DD-paths.
Table 9.2 lists the define and usage nodes for the variables in the commission problem. We use this
information in conjunction with the program graph in Figure 9.1 to identify various definition/ use
and definition-clear paths.
Tables 9.3 and 9.4 present some of the du-paths in the commission problem; they are named by
their beginning and ending nodes (from Figure 9.1). The third column in Table 9.3 indicates
whether the du-paths are definition clear.
First, let us look at a simple path: the du-path for the variable stocks. We have DEF(stocks, 15) and
USE(stocks, 17), so the path <15, 17> is a du-path with respect to stocks. No other defining nodes
are used for stocks; therefore, this path is also definition clear.
Two defining and two usage nodes make the locks variable more interesting: we have DEF(locks,
13), DEF(locks, 19), USE(locks, 14), and USE(locks, 16). These yield four du-paths; they are
shown in Figure 9.3.
p1 = <13, 14>
p2 = <13, 14, 15, 16>
p3 = <19, 20, 14>
p4 = <19, 20, 14, 15, 16>
The du-paths for totalLocks will lead us to typical test cases for computations. With two defining nodes (DEF(totalLocks, 10) and DEF(totalLocks, 16)) and three usage nodes (USE(totalLocks, 16), USE(totalLocks, 21), USE(totalLocks, 24)), we might expect six du-paths.
Let us take a closer look.
Definition
The set T satisfies the All-Defs criterion for the program P if and only if for every variable v ∈ V, T contains definition-clear paths from every defining node of v to a use of v.
Definition
The set T satisfies the All-Uses criterion for the program P if and only if for every variable v ∈ V, T contains definition-clear paths from every defining node of v to every use of v, and to the successor node of each USE(v, n).
Definition
The set T satisfies the All-P-Uses/Some C-Uses criterion for the program P if and only if for every
variable v ∈ V, T contains definition-clear paths from every defining node of v to every predicate
use of v; and if a definition of v has no P-uses, a definition-clear path leads to at least one
computation use.
Definition
The set T satisfies the All-C-Uses/Some P-Uses criterion for the program P if and only if for every variable v ∈ V, T contains definition-clear paths from every defining node of v to every computation use of v; and if a definition of v has no C-uses, a definition-clear path leads to at least one predicate use.
These relationships are shown in Figure 9.5. We now have a more refined view of structural testing
possibilities between the extremes of the (typically unattainable) All-Paths metric and the
generally accepted minimum, All-Edges.
From Test Case Specifications to Test Cases
If the test case specifications produced in test design already include concrete input values and expected results, then producing a complete test case may be as simple as filling a template with those values.
A more general test case specification, for example one that calls for "a sorted sequence, length greater than 2, with items in ascending order with no duplicates," may designate many possible concrete test cases, and it may be desirable to generate just one instance or many.
Automatic generation of concrete test cases from more abstract test case specifications reduces the
impact of small interface changes in the course of development. Corresponding changes to the test
suite are still required with each program change, but changes to test case specifications are likely
to be smaller and more localized than changes to the concrete test cases.
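A tiny generator for the abstract specification quoted above might look like the following sketch; the length bound, value range, and seed are arbitrary choices made only for illustration:

import random

# Turn the abstract specification into concrete inputs.
def generate_sorted_no_duplicates(rng, max_len=10, max_value=100):
    length = rng.randint(3, max_len)                 # "length greater than 2"
    values = rng.sample(range(max_value), length)    # "no duplicates"
    return sorted(values)                            # "items in ascending order"

rng = random.Random(42)                              # fixed seed for repeatable suites
for _ in range(3):
    print(generate_sorted_no_duplicates(rng))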
Scaffolding
During much of development, only a portion of the full system is available for testing. In modern
development methodologies, the partially developed system is likely to consist of one or more
runnable programs and may even be considered a version or prototype of the final system from
very early in construction, so it is possible at least to execute each new portion of the software as it
is constructed.
Code developed to facilitate testing is called scaffolding. Scaffolding may include test drivers (substituting for a main or calling program), test harnesses (substituting for parts of the deployment environment), and stubs (substituting for functionality called or used by the software under test). The purposes of scaffolding are to provide controllability to execute test cases and observability to judge the outcome of test execution.
The simplest form of scaffolding is a driver program that runs a single, specific test case. If, for
example, a test case specification calls for executing method calls in a particular sequence, this is
easy to accomplish by writing the code to make the method calls in that sequence.
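A minimal driver in this spirit might look like the sketch below; the Cart class and its methods are invented purely so the driver has something to call in a fixed sequence, and Python's unittest stands in for whatever harness a project actually uses:

import unittest

# Hypothetical module under test, invented for this illustration.
class Cart:
    def __init__(self):
        self.items = []
    def add(self, name, price):
        self.items.append((name, price))
    def total(self):
        return sum(price for _, price in self.items)

class CartDriver(unittest.TestCase):
    def test_add_then_total(self):
        # The driver makes the calls in the sequence the test case
        # specification prescribes, then checks the result.
        cart = Cart()
        cart.add("lock", 45.0)
        cart.add("stock", 30.0)
        self.assertEqual(cart.total(), 75.0)

if __name__ == "__main__":
    unittest.main()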
At the very least, one will want to factor out some of the common driver code into reusable modules. Sometimes it is worthwhile to write more generic test drivers that essentially interpret test case specifications.
At least some level of generic scaffolding support can be used across a fairly wide class of
applications. Such support typically includes, in addition to a standard interface for executing a set
of test cases, basic support for logging test execution and results. Figure 17.1 illustrates use of
generic test scaffolding in the JFlex lexical analyzer generator.
Figure 17.1: Excerpt of JFlex 1.4.1 source code (a widely used open-source scanner generator) and accompanying JUnit test cases. JUnit is typical of basic test scaffolding libraries, providing support for test execution, logging, and simple result checking.
Test Oracles
It is of little use to execute a test suite automatically if execution results must be manually inspected to apply a pass/fail criterion. Relying on human intervention to judge test outcomes is not merely expensive, but also unreliable. Software that applies a pass/fail criterion to a program execution is called a test oracle, often shortened to oracle.
Support for comparison-based test oracles is often included in a test harness program or testing framework. A harness typically takes two inputs: (1) the input to the program under test (or something that can be mechanically transformed into a well-formed input) and (2) the predicted output. Frameworks for
writing test cases as program code likewise provide support for comparison-based oracles. The
assertEquals method of JUnit, illustrated in Figure 17.1, is a simple example of comparison-based
oracle support.
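A comparison-based oracle can be sketched as a small harness that runs each (input, predicted output) pair and reports a verdict; the sorting routine used here as the program under test is only a placeholder:

# Sketch of a comparison-based oracle around a stand-in program under test.
def program_under_test(sequence):
    return sorted(sequence)

def run_with_oracle(test_cases):
    """test_cases is a list of (input, predicted_output) pairs."""
    results = []
    for test_input, predicted in test_cases:
        actual = program_under_test(test_input)
        verdict = "PASS" if actual == predicted else "FAIL"
        results.append((test_input, verdict))
        print(f"{verdict}: input={test_input} expected={predicted} got={actual}")
    return results

run_with_oracle([
    ([3, 1, 2], [1, 2, 3]),
    ([], []),
    ([2, 2, 1], [1, 2, 2]),
])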
Self-Checks as Oracles
A program or module specification describes all correct program behaviors, so an oracle based on a
specification need not be paired with a particular test case. Instead, the oracle can be incorporated
into the program under test, so that it checks its own work (see Figure 17.3).
Self-checks in the form of assertions embedded in program code are useful primarily for checking
module and subsystem-level specifications, rather than overall program behavior. Devising
program assertions that correspond in a natural way to specifications (formal or informal) poses
two main challenges: bridging the gap between concrete execution values and abstractions used in
specification, and dealing in a reasonable way with quantification over collections of values.
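A self-check oracle, by contrast, lives inside the module and asserts its own postconditions, so any input can serve as a test case without a per-case predicted output. The routine and its postconditions below are illustrative only:

from collections import Counter

# Sketch of a self-check: the module asserts its own postconditions.
def sort_with_self_check(sequence):
    result = sorted(sequence)
    # Postcondition 1: output is in nondecreasing order.
    assert all(result[i] <= result[i + 1] for i in range(len(result) - 1))
    # Postcondition 2: output is a permutation of the input (same multiset).
    assert Counter(result) == Counter(sequence)
    return result

print(sort_with_self_check([5, 3, 4, 3]))   # any input exercises the embedded checks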
Capture and Replay
The first time such a test case is executed, the oracle function is carried out by a human, and the
interaction sequence is captured. Provided the execution was judged (by the human tester) to be
correct, the captured log now forms an (input, predicted output) pair for subsequent automated
retesting.
Module IV (Chapter 1)
Planning and monitoring the process
Overview:
Planning involves scheduling activities, allocating resources, and devising observable,
unambiguous milestones against which progress and performance can be monitored. Monitoring
means answering the question, "How are we doing?"
Quality planning is one aspect of project planning, and quality processes must be closely
coordinated with other development processes. Coordination among quality and development
tasks may constrain ordering
Formulation of the plan involves risk analysis and contingency planning. Execution of the plan
involves monitoring, corrective action, and planning for subsequent releases and projects
An appropriate quality process follows a form similar to the overall software process in which it
is embedded
A general principle, across all software processes, is that the cost of detecting and repairing a
fault increases as a function of time between committing an error and detecting the resultant
faults.
The particular verification or validation step at each stage depends on the nature of the
intermediate work product and on the anticipated defects
The evolving collection of work products can be viewed as a set of descriptions of different parts
and aspects of the software system, at different levels of detail. Portions of the implementation
have the useful property of being executable in a conventional sense, and are the traditional
subject of testing, but every level of specification and design can be both the subject of
verification activities and a source of information for verifying other artifacts. A typical
intermediate artifact - say, a subsystem interface definition or a database schema - will be subject
to the following steps:
Internal consistency check: Check the artifact for compliance with structuring rules that define
"well-formed" artifacts of that type. An important point of leverage is defining the syntactic and
semantic rules thoroughly and precisely enough that many common errors result in detectable
violations. This is analogous to syntax and strong-typing rules in programming languages, which
are not enough to guarantee program correctness but effectively guard against many simple
errors.
External consistency check: Check the artifact for consistency with related artifacts. Often this
means checking for conformance to a "prior" or "higher-level" specification, but consistency
checking does not depend on sequential, top-down development - all that is required is that the
related information from two or more artifacts be defined precisely enough to support detection
of discrepancies
A body of explicit knowledge, shared and refined by the group, is more valuable than islands of
individual competence. Organizational knowledge in a shared and systematic form is more
amenable to improvement and less vulnerable to organizational change, including the loss of key
individuals
Capturing the lessons of experience in a consistent and repeatable form is essential for avoiding
errors, maintaining consistency of the process, and increasing development efficiency
Cleanroom
The Cleanroom process involves two cooperating teams, the development and the quality teams,
and five major activities: specification, planning, design and verification, quality certification,
and feedback
In the specification activity, the development team defines the required behavior of the system,
while the quality team defines usage scenarios that are later used for deriving system test suites.
The planning activity identifies incremental development and certification phases
After planning, all activities are iterated to produce incremental releases of the system. Each
system increment is fully deployed and certified before the following step. Design and code
undergo formal inspection ("Correctness verification") before release. One of the key premises
underpinning the Cleanroom process model is that rigorous design and formal inspection produce "nearly fault-free software."
Another key assumption of the Cleanroom process model is that usage profiles are sufficiently accurate that statistical testing will provide an accurate measure of quality as perceived by users. Reliability is measured in terms of mean time between failures (MTBF) and is
constantly controlled after each release. Failures are reported to the development team for
correction, and if reliability falls below an acceptable range, failure data is used for process
improvement before the next incremental release.
Test and analysis strategies capture commonalities across projects and provide guidelines for maintaining consistency among quality plans. A strategy is an asset of the organization and prescribes solutions to problems specific to that organization. Among the factors that particularize the strategy are:
Overall process
We have already noted the intertwining of quality process with other aspects of an overall
software process, and this is of course reflected in the quality strategy. Notations, standard
process steps, and even tools can be reflected in the quality strategy to the extent they are
consistent from project to project.
Application domain
The domain may impose both particular quality objectives (e.g., privacy and security in medical
records processing), and in some cases particular steps and documentation required to obtain
certification from an external authority.
SRET
The software reliability engineered testing (SRET) approach, developed at AT&T in the early
1990s, assumes a spiral development process and augments each coil of the spiral with rigorous
testing activities. SRET identifies two main types of testing: development testing, used to find
and remove faults in software at least partially developed in-house, and certification testing, used
to either accept or reject outsourced software.
Define "Necessary" Reliability Determine operational models, that is, distinct patterns of
system usage that require separate testing, classify failures according to their severity, and
engineer the reliability strategy with fault prevention, fault removal, and fault tolerance
activities.
Develop Operational Profiles Develop both overall profiles that span operational models and
operational profiles within single operational models.
Interpret Failure Data Interpretation of failure data depends on the type of testing. In
development testing, the goal is to track progress and compare present failure intensities with
objectives. In certification testing, the goal is to determine if a software component or system
should be accepted or rejected.
Extreme Programming (XP)
Test cases suitable for batch execution are part of the system code base and are implemented prior to the implementation of features they check ("test-first"). Developers work in pairs, incrementally developing and testing a module. Pair programming effectively conflates a review
activity with coding. Each release is checked by running all the tests devised up to that point of
development, thus essentially merging unit testing with integration and system testing
Each of these issues is addressed to some extent in the quality strategy, but must be elaborated
and particularized. This is typically the responsibility of a quality manager, who should
participate in the initial feasibility study to identify quality goals and estimate the contribution of
test and analysis tasks to project cost and schedule
To produce a quality plan that adequately addresses the questions above, the quality manager
must identify the items and features to be verified, the resources and activities that are required,
the approaches that should be followed, and criteria for evaluating the results
Items and features to be verified circumscribe the target of the quality plan. While there is an
obvious correspondence between items to be developed or modified and those to undergo testing,
they may differ somewhat in detail. For example, overall evaluation of the user interface may be
the purview of a separate human factors group.
Defining quality objectives and process organization in detail requires information that is not all
available in the early stages of development. Test items depend on design decisions; detailed
approaches to evaluation can be defined only after examining requirements and design
specifications; tasks and schedule can be completed only after the design; new risks and
contingencies may be introduced by decisions taken during development
After capturing goals as well as possible, the next step in construction of a quality plan is to
produce an overall rough list of tasks. The quality strategy and past experience provide a basis
for customizing the list to the current project and for scaling tasks appropriately
The manager can start noting dependencies among the quality activities and between them and
other activities in the overall project, and exploring arrangements of tasks over time. The main
objective at this point is to schedule quality activities so that assessment data are provided
continuously throughout the project, without unnecessary delay of other development activities
Risks:
While the project plan shows the expected schedule of tasks, the arrangement and ordering of
tasks are also driven by risk. The quality plan, like the overall project plan, should include an
explicit risk plan that lists major risks and contingencies
A key tactic for controlling the impact of risk in the project schedule is to minimize the
likelihood that unexpected delay in one task propagates through the whole schedule and delays
project completion
Critical paths are chains of activities that must be completed in sequence and that have
maximum overall duration. Tasks on the critical path have a high priority for early scheduling,
and likewise the tasks on which they depend (which may not themselves be on the critical path)
should be scheduled early enough to provide some schedule slack and prevent delay in the
inception of the critical tasks
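The critical path itself is just the longest chain of dependent tasks. The sketch below computes it for a small, invented task graph; the task names and durations are placeholders, and a real plan would come from a scheduling tool rather than code like this:

# Sketch: longest-duration chain through an (invented) task dependency graph.
durations = {"analysis": 4, "code": 6, "design_tests": 3,
             "execute_tests": 4, "user_docs": 3}
depends_on = {"analysis": [], "code": ["analysis"], "design_tests": ["analysis"],
              "execute_tests": ["code", "design_tests"], "user_docs": ["analysis"]}

def longest_finish(task, memo={}):
    """Finish time of a task along its longest dependency chain."""
    if task not in memo:
        start = max((longest_finish(d) for d in depends_on[task]), default=0)
        memo[task] = start + durations[task]
    return memo[task]

# Walk back from the latest-finishing task to reconstruct one critical path.
path = [max(durations, key=longest_finish)]
while depends_on[path[-1]]:
    path.append(max(depends_on[path[-1]], key=longest_finish))
print("project length (weeks):", longest_finish(path[0]))
print("critical path:", list(reversed(path)))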
A critical dependence occurs when a task on a critical path is scheduled immediately after some
other task on the critical path, particularly if the length of the critical path is close to the length of
the project. Critical dependence may occur with tasks outside the quality plan part of the overall
project plan
The primary tactic available for reducing the schedule risk of a critical dependence is to
decompose a task on the critical path, factoring out subtasks that can be performed earlier
Fig below shows alternative schedules for a simple project that starts at the beginning of January
and must be completed by the end of May. In the top schedule, indicated as CRITICAL
SCHEDULE, the tasks Analysis and design, Code and Integration, Design and execute
subsystem tests, and Design and execute system tests form a critical path that spans the duration
of the entire project. A delay in any of the activities will result in late delivery. In this schedule,
only the Produce user documentation task does not belong to the critical path, and thus only
delays of this task can be tolerated
In the middle schedule, marked as UNLIMITED RESOURCES, the test design and execution
activities are separated into distinct tasks. Test design tasks are scheduled early, right after
analysis and design, and only test execution is scheduled after Code and integration. In this way
the tasks Design subsystem tests and Design system tests are removed from the critical path,
which now spans 16 weeks with a tolerance of 5 weeks with respect to the expected termination
of the project
The LIMITED RESOURCES schedule at the bottom of Figure 20.1 rearranges tasks to meet
resource constraints. In this case we assume that test design and execution, and production of
user documentation share the same resources and thus cannot be executed in parallel. We can see
that, despite the limited parallelism, decomposing testing activities and scheduling test design
earlier results in a critical path of 17 weeks, 4 weeks earlier than the expected termination of the
project. Notice that in the example, the critical path is formed by the tasks Analysis and design,
Design subsystem tests, Design system tests, Produce user documentation, Execute subsystem
tests, and Execute system tests. In fact, the limited availability of resources results in
dependencies among Design subsystem tests, Design system tests and Produce user
documentation that last longer than the parallel task Code and integration
Risk Planning
Risk is an inevitable part of every project, and so risk planning must be a part of every plan.
Risks cannot be eliminated, but they can be assessed, controlled, and monitored.
Risk Management in the Quality Plan: Risks Generic to Process Management
The quality plan must identify potential risks and define appropriate control tactics. Some risks and control tactics are generic to process management, while others are specific to the quality process. Here we provide a brief overview of some risks generic to process management.
Monitoring produces a surfeit of detail about individual activities. Managers need to make
decisions based on an overall understanding of project status, so raw monitoring information
must be aggregated in ways that provide an overall picture.
Accurate classification schemata can improve monitoring and may be used in very large projects,
where the amount of detailed information cannot be summarized in overall data. The orthogonal
defect classification (ODC) approach has two main steps: (1) fault classification and (2) fault
analysis
ODC fault classification is done in two phases: when faults are detected and when they are fixed.
At detection time, we record the activity executed when the fault is revealed, the trigger that
exposed the fault, and the perceived or actual impact of the fault on the customer
Fault type, which can be: missing, that is, the fault is due to an omission, as in a missing statement; incorrect, as in the use of a wrong parameter; or extraneous, that is, due to something not relevant or pertinent to the document or code, as in a section of the design document that is not pertinent to the current product and should be removed. The source of the fault indicates the origin of the faulty modules: in-house, library, ported from other platforms, or outsourced code.
Distribution of fault types versus activities Different quality activities target different classes
of faults. For example, algorithmic (that is, local) faults are targeted primarily by unit testing,
and we expect a high proportion of faults detected by unit testing to be in this class.
Distribution of triggers over time during field test Faults corresponding to simple usage
should arise early during field test, while faults corresponding to complex usage should arise
late. In both cases, the rate of disclosure of new faults should asymptotically decrease
Age distribution over target code Most faults should be located in new and rewritten code,
while few faults should be found in base or re-fixed code, since base and re-fixed code has
already been tested and corrected
Distribution of fault classes over time The proportion of missing-code faults should gradually decrease, while the percentage of extraneous faults may slowly increase, because missing functionality should be revealed with use and repaired, while extraneous code or documentation may be produced by updates.
What are the faults? The goal of this first step is to identify a class of important faults. Faults are
categorized by severity and kind. The severity of faults characterizes the impact of the fault on
the product
Backward Compatibility A difference between the current and earlier versions of an artifact
that could be perceived by the customer as a failure.
Internal Document An internal inconsistency in the artifact (e.g., inconsistency between code
and comments).
Lateral Compatibility An incompatibility between the artifact and some other system or
module with which it should interoperate.
Side Effects A potential undesired interaction between the reviewed artifact and some other part
of the system.
Rare Situation An inappropriate response to a situation that is not anticipated in the artifact.
(Error handling as specified in a prior artifact is design conformance, not rare situation.)
Complex Path The fault is detected by a test case derived to cover a combination of program
elements.
Variation The fault is detected by a test case derived to exercise a particular combination of
parameters for a single procedure.
Sequencing The fault is detected by a test case derived for testing a sequence of procedure calls.
Interaction The fault is detected by a test case derived for testing procedure interactions.
System Test
Workload/Stress The fault is detected during workload or stress testing.
Recovery/Exception The fault is detected while testing exceptions and recovery procedures.
Startup/Restart The fault is detected while testing initialization conditions during start up or
after possibly faulty shutdowns.
Hardware Configuration The fault is detected while testing specific hardware configurations.
Software Configuration The fault is detected while testing specific software configurations.
Performance The perceived and actual impact of the software on the time required for the
customer and customer end users to complete their tasks.
Maintenance The ability to correct, adapt, or enhance the software system quickly and at
minimal cost.
Serviceability Timely detection and diagnosis of failures, with minimal customer impact.
Migration Ease of upgrading to a new system release with minimal disruption to existing
customer data and operations.
Documentation Degree to which provided documents (in all forms, including electronic)
completely and correctly describe the structure and intended uses of the software.
Usability The degree to which the software and accompanying documents can be understood
and effectively employed by the end user.
Standards The degree to which the software complies with applicable standards.
Reliability The ability of the software to perform its intended function without unplanned
interruption or failure.
Accessibility The degree to which persons with disabilities can obtain the full benefit of the
software system.
Capability The degree to which the software performs its intended functions consistently with
documented system requirements.
Requirements The degree to which the system, in complying with document requirements,
actually meets customer expectations
Checking Procedure parameters or variables were not properly validated before use.
When did faults occur, and when were they found? It is typical of mature software processes
to collect fault data sufficient to determine when each fault was detected (e.g., in integration test
or in a design inspection). In addition, for the class of faults identified in the first step, we
attempt to determine when those faults were introduced (e.g., was a particular fault introduced in
coding, or did it result from an error in architectural design?).
Why did faults occur? In this core RCA step, we attempt to trace representative faults back to
causes, with the objective of identifying a "root" cause associated with many faults in the class.
Analysis proceeds iteratively by attempting to explain the error that led to the fault, then the
cause of that error, the cause of that cause, and so on
How could faults be prevented? The final step of RCA is improving the process by removing
root causes or making early detection likely. The measures taken may have a minor impact on
the development process (e.g., adding consideration of exceptional conditions to a design
inspection checklist), or may involve a substantial modification of the process. ODC and RCA are two examples of feedback and improvement, which are an important dimension of most good software processes.
The Quality Team
The quality plan must assign roles and responsibilities to people. As with other aspects of planning, assignment of responsibility occurs at a strategic level and a tactical level. The tactical level, represented directly in the project plan, assigns responsibility to individuals in accordance with the general strategy. It involves balancing level of effort across time and carefully managing personal interactions. The strategic level of organization is represented not only in the quality strategy document, but in the structure of the organization itself.
The strategy for assigning responsibility may be partly driven by external requirements. For example, independent quality teams may be required by certification agencies or by a client organization.
When quality tasks are distributed among groups or organizations, the plan should include specific checks to ensure successful completion of quality activities. For example, when module testing is performed by developers and integration and system testing is performed by an independent quality team, the quality team should check the completeness of module tests performed by developers.
Many variations and hybrid models of organization can be designed. Some organizations have obtained a good balance of benefits by rotating responsibilities. For example, a developer may move into a role primarily responsible for quality in one project and move back into a regular development role in the next. In organizations large enough to have a distinct quality or testing group, an appropriate balance between independence and integration typically varies across levels of project organization.
The plan must clearly define milestones and delivery for outsourced activities, as well as checks on the quality of delivery in both directions: test organizations usually perform quick checks to verify the consistency of the software to be tested with respect to some minimal "testability" requirements; clients usually check the completeness and consistency of test results.
CHAPTER 2
Documenting Analysis and Test
Documentation can be inspected to verify progress against schedule and quality goals and to
identify problems, supporting process visibility, monitoring, and replicability.
Documents are divided into three main categories: planning, specification, and reporting.
Planning documents describe the organization of the quality process and include strategies and
plans for the division or the company, and plans for individual projects. Specification documents
describe test suites and test cases. A complete set of analysis and test specification documents
include test design specifications, test case specification, checklists, and analysis procedure
specifications. Reporting documents include details and summary of analysis and test results
Organizing Documents
In larger projects, it is common practice to produce and regularly update a global guide for
navigating among individual documents.
Naming conventions help in quickly identifying documents. A typical standard for document
names would include keywords indicating the general scope of the document, its nature, the
specific document, and its version
History
Version description
Summary
Summarize the contents of the document. The summary should clearly explain the relevance of
the document to its possible uses.
Goals of the document Describe the purpose of this document: Who should read it, and why?
Required documents and references Provide a reference to other documents and artifacts
needed for understanding and exploiting this document. Provide a rationale for the provided
references.
Section 1
Section N
Documentation quality: The strategy document sets out requirements on other quality
documents, typically including an analysis and test plan, test design specifications, test case
specifications, test logs, and test summary reports.
Features to be tested: The features considered in the plan.
Features not to be tested: Features not considered in the current plan.
Approach: The overall analysis and test approach, sufficiently detailed to permit identification
of the major test and analysis tasks and estimation of time and resources.
Pass/Fail criteria: Rules that determine the status of an artifact subjected to analysis and test.
Suspension and resumption criteria: Conditions to trigger suspension of test and analysis
activities (e.g., an excessive failure rate) and conditions for restarting or resuming an activity.
Risks and contingencies: Risks foreseen when designing the plan and a contingency plan for
each of the identified risks.
Deliverables: A list of all A&T artifacts and documents that must be produced.
Task and schedule: A complete description of analysis and test tasks, relations among them,
and relations between A&T and development tasks, with resource allocation and constraints. A
task schedule usually includes GANTT and PERT diagrams.
Staff and responsibilities: Staff required for performing analysis and test activities, the required
skills, and the allocation of responsibilities among groups and individuals. Allocation of
resources to tasks is described in the schedule.
Environmental needs: Hardware and software required to perform analysis or testing activities
Features to Be Tested Functional test for check configuration, module specification WB02-15.32.
Procedure Designed for conditional inclusion in nightly test run. Build target T02 15 32 11
includes JUnit harness and oracles, with test reports directed to standard test log. Test
environment includes table MDB 15 32 03 for loading initial test database state.
Test cases
WB0715.01.C09 valid model number with all legal required slots and some legal optional
slots … …
Pass/Fail Criterion Successful completion requires correct execution of all test cases with no
violations in test log.
Test items Module check configuration of the Chipmunk Web presence system, business logic
subsystem.
Input specification
No. of optional components with select ≠ empty < No. of optional slots
Test case:
#SMRS 5
Screen 13"
Hard disk 30 GB
RAM 512 MB
#SMOS 4
Environment Needs Execute with ChipmunkDBM v3.4 database initialized from table MDB 15
32 03.
MODULE V
Integration and component-based software testing
Overview
Testing is divided into four main levels of granularity: module, integration, system, and acceptance test. Module or unit test checks module behavior against specifications or expectations; integration test checks module compatibility; system and acceptance tests check behavior of the whole system with respect to specifications and user needs, respectively.
While integration testing may to some extent act as a process check on module testing (i.e., faults
revealed during integration test can be taken as a signal of unsatisfactory unit testing), thorough
integration testing cannot fully compensate for sloppiness at the module level. In fact, the quality
of a system is limited by the quality of the modules and components from which it is built, and
even apparently noncritical modules can have widespread effects.
Integration faults
Inconsistent interpretation of parameters or values Each module's interpretation may be reasonable, but they are incompatible. Example: unit mismatch: a mix of metric and British measures (meters and yards) is believed to have led to the loss of the Mars Climate Orbiter in September 1999.
Side-effects on parameters or resources Example: A module often uses resources that are not
explicitly mentioned in its interface. Integration problems arise when these implicit effects of one
module interfere with those of another. For example, using a temporary file "tmp" may be
invisible until integration with another module that also attempts to use a temporary file "tmp" in
the same directory of scratch files
Missing or misunderstood functionality Example: Counting hits on a Web site may be done in many different ways: per unique IP address, per hit, including or excluding spiders, and so on. Problems arise if the interpretation assumed in the counting module differs from that of its clients.
Dynamic mismatches Many languages and frameworks allow for dynamic binding. Problems may be caused by failures in matching when modules are integrated. Example: polymorphic calls may be dynamically bound to incompatible methods.
Big bang testing One extreme approach is to avoid the cost of scaffolding by waiting until all
modules are integrated, and testing them together - essentially merging integration testing into
system testing. In this big bang approach, neither stubs nor drivers need be constructed, nor must
the development be carefully planned to expose well-specified interfaces to each subsystem.
These savings are more than offset by losses in observability, diagnosability, and feedback
Structural integration test strategy: Among strategies for incrementally testing partially
assembled systems, we can distinguish two main classes: structural and feature oriented. In a
structural approach, modules are constructed, assembled, and tested together in an order based on
hierarchical structure in the design. Structural approaches include bottom-up, top-down, and a
combination sometimes referred to as sandwich or backbone strategy. Feature oriented strategies
derive the order of integration from characteristics of the application, and include threads and
critical modules strategies.
Top-down and bottom up testing: A top-down integration strategy begins at the top of the uses
hierarchy, including the interfaces exposed through a user interface or top-level application
program interface (API)
Bottom-up integration similarly reduces the need to develop stubs, except for breaking circular relations. Referring again to the example in Figure 21.1, we can start bottom-up by integrating Slot with Component, using drivers for Model and Order. We can then incrementally add Model and Order. We can finally add either Package or Account and Customer, before integrating CustomerCare, without constructing stubs.
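The stub/driver distinction can be made concrete with a small sketch; the Model and Slot names echo Figure 21.1, but the interfaces here are invented for illustration:

# A stub (for top-down integration) and a driver (for bottom-up integration).
class SlotStub:
    """Stub standing in for the not-yet-integrated Slot module."""
    def __init__(self, canned_price):
        self.canned_price = canned_price
    def price(self):
        return self.canned_price            # canned answer instead of real logic

class Model:
    """Module under test: computes a total from its slots."""
    def __init__(self, slots):
        self.slots = slots
    def total_price(self):
        return sum(slot.price() for slot in self.slots)

# Driver: exercises Model in isolation, using stubs for its dependencies.
def drive_model_test():
    model = Model([SlotStub(100.0), SlotStub(250.0)])
    assert model.total_price() == 350.0
    print("Model integrates correctly with stubbed Slots")

drive_model_test()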
Sandwich or backbone: Integration may combine elements of the two approaches, starting from
both ends of the hierarchy and proceeding toward the middle. An early top-down approach may
result from developing prototypes for early user feedback, while existing modules may be
integrated bottom-up. This is known as the sandwich or backbone strategy
Thread testing: The thread integration testing strategy integrates modules according to system
features. Test designers identify threads of execution that correspond to system features, and they
incrementally test each thread. The thread integration strategy emphasizes module interplay for
specific functionality
Critical module: Critical module integration testing focuses on modules that pose the greatest
risk to the project. Modules are sorted and incrementally integrated according to the associated
risk factor that characterizes the criticality of each module. Both external risks (such as safety)
and project risks (such as schedule) can be considered.
When reusing a component that has been in use in other applications for some time, one obtains the benefit not only of test and analysis by component developers, but also of actual operational use.
These advantages are balanced against two considerable disadvantages. First, a component
designed for wide reuse will usually be much more complex than a module designed for a single
use; a rule of thumb is that the development effort (including analysis and test) for a widely
usable component is at least twice that for a module that provides equivalent functionality for a
single application.
The interface specification of a component should provide all the information required for
reusing the component, including so-called nonfunctional properties such as performance or
capacity limits, in addition to functional behavior
The main problem facing test designers in the organization that produces a component is lack of
information about the ways in which the component will be used. A component may be reused in
many different contexts, including applications for which its functionality is an imperfect fit
Test designers cannot anticipate all possible uses of a component under test, but they can design
test suites for classes of use in the form of scenarios. Test scenarios are closely related to
scenarios or use cases in requirements analysis and design
Components are often confused with objects, and a component can be encapsulated by an object
or a set of objects, but they typically differ in many respects:
Components typically use persistent storage, while objects usually have only local state.
Component contract or interface The component contract describes the access points and
parameters of the component, and specifies functional and nonfunctional behavior and any
conditions required for using the component.
Frameworks and design patterns Patterns are logical design fragments, while frameworks are
concrete elements of the application. Frameworks often implement patterns.
COTS The term commercial off-the-shelf, or COTS, indicates components developed for sale to other organizations.
CHAPTER 2
System, Acceptance, and Regression Testing
Overview
System, acceptance, and regression testing are all concerned with the behavior of a software
system as a whole, but they differ in purpose.
System testing
System testing can be considered the culmination of integration testing, and passing all system
tests is tantamount to being complete and free of known bugs. The system test suite may share
some test cases with test suites used in integration and even unit testing, particularly when a
thread-based or spiral model of development has been taken and subsystem correctness has been
tested primarily through externally visible features and behavior
The appropriate notions of thoroughness in system testing are with respect to the system
specification and potential usage scenarios, rather than code or design. Each feature or specified
behavior of the system should be accounted for in one or several test cases
Some system properties, including performance properties like latency between an event and
system response and reliability properties like mean time between failures, are inherently global
Global properties like performance, security, and safety are difficult to specify precisely and
operationally, and they depend not only on many parts of the system under test, but also on its
environment and use.
Acceptance Testing
The purpose of acceptance testing is to guide a decision as to whether the product in its current
state should be released. The decision can be based on measures of the product or process.
Quantitative goals for dependability, including reliability, availability, and mean time between
failure
Systematic testing, which includes all of the testing techniques presented heretofore in this book, does not draw statistically representative samples. Its purpose is not to fail at a "typical" rate, but to exhibit as many failures as possible. It is thus unsuitable for statistical testing.
A less formal, but frequently used approach to acceptance testing is testing with users. An early
version of the product is delivered to a sample of users who provide feedback on failures and
usability. Such tests are often called alpha and beta tests. The two terms distinguish between
testing phases. Often the early or alpha phases are performed within the developing organization,
while the later or beta phases are performed at users' sites.
In alpha and beta testing, the user sample determines the operational profile.
Usability
A usable product is quickly learned, allows users to work efficiently, and is pleasant to use.
Usability involves objective criteria such as the time and number of operations required to
perform tasks and the frequency of user error, in addition to the overall, subjective satisfaction of
users.
Even if usability is largely based on user perception and thus is validated based on user feedback,
it can be verified early in the design and through the whole software life cycle. The process of
verifying and validating usability includes the following main steps:
Testing early prototypes with end users to explore their mental model (exploratory test),
evaluate alternatives (comparison test), and validate software usability. A prototype for early
assessment of usability may not include any functioning software; a cardboard prototype may be
as simple as a sequence of static images presented to users by the usability tester.
Testing incremental releases with both usability experts and end users to monitor progress and
anticipate usability problems.
System and acceptance testing that includes expert-based inspection and testing, user-based
testing, comparison testing against competitors, and analysis and checks often done
automatically, such as a check of link connectivity and verification of browser compatibility.
User-based testing (i.e., testing with representatives of the actual end-user population) is
particularly important for validating software usability. It can be applied at different stages, from
early prototyping through incremental releases of the final system, and can be used with different
goals: exploring the mental model of the user, evaluating design alternatives, and validating
against established usability requirements and standards.
Exploratory testing: The purpose of exploratory testing is to investigate the mental model of
end users. It consists of asking users about their approach to interactions with the system. For
example, during an exploratory test for the Chipmunk Web presence, we may provide users with
a generic interface for choosing the model they would like to buy, in order to understand how
users will interact with the system.
Comparison testing: The purpose of comparison testing is evaluating options. It consists of observing user reactions to alternative interaction patterns. During a comparison test we can, for example, provide users with different facilities for assembling the desired Chipmunk laptop configuration, in order to identify patterns that facilitate users' interactions.
Validation testing: The purpose of validation testing is assessing overall usability. It includes
identifying difficulties and obstacles that users encounter while interacting with the system, as
well as measuring characteristics such as error rate and time to perform a task.
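A hedged sketch of how such measurements might be summarised is given below; the task names, timings, and error counts are invented for illustration and are not part of the text:

# Minimal sketch (hypothetical data) of summarising a usability validation
# session: for each observed task we record completion time and user errors,
# then report the averages compared against usability requirements.
from statistics import mean

# (task name, seconds to complete, user errors) noted by the usability tester
observations = [
    ("select laptop model", 42.0, 0),
    ("configure memory",    75.5, 2),
    ("place order",         61.0, 1),
]

avg_time = mean(t for _, t, _ in observations)
error_rate = sum(e for _, _, e in observations) / len(observations)

print(f"average task time = {avg_time:.1f} s, errors per task = {error_rate:.2f}")
# average task time = 59.5 s, errors per task = 1.00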
Regression Testing
Testing activities that focus on regression problems are called (non) regression testing. Usually
"non" is omitted and we commonly say regression testing.
A simple approach to regression testing consists of reexecuting all test cases designed for
previous versions. Even this simple retest all approach may present nontrivial problems and
costs. Former test cases may not be re-executable on the new version without modification, and rerunning all test cases may be too expensive and unnecessary. A good-quality test suite must be maintained across system versions.
Test case maintenance: Changes in the new software version may impact the format of inputs
and outputs, and test cases may not be executable without corresponding changes. Even simple modifications of the data structures, such as the addition of a field or a small change of data types, may invalidate former test cases or make their recorded outputs no longer comparable with the new ones.
Scaffolding that interprets test case specifications, rather than fully concrete test data, can reduce
the impact of input and output format changes on regression testing.
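For instance (all names and formats below are illustrative, not from the text), scaffolding of this kind might map an abstract test specification onto the concrete input format of whichever version is under test, so that a format change is absorbed in one place rather than in every stored test case:

# Hedged sketch: scaffolding that interprets abstract test case specifications.
# A format change in version 2 is absorbed once, in build_input, instead of
# requiring every stored test case to be rewritten.
def build_input(spec, version):
    """Turn an abstract specification into a concrete input string."""
    if version >= 2:
        # version 2 added a currency field with a default value
        return f"{spec['item']};{spec['quantity']};{spec.get('currency', 'USD')}"
    return f"{spec['item']};{spec['quantity']}"

def run_test(spec, expected, system_under_test, version):
    return system_under_test(build_input(spec, version)) == expected

# the same abstract test case drives both the old and the new version
abstract_case = {"item": "C20", "quantity": 3}
print(build_input(abstract_case, 1))   # C20;3
print(build_input(abstract_case, 2))   # C20;3;USD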
High-quality test suites can be maintained across versions by identifying and removing obsolete
test cases, and by revealing and suitably marking redundant test cases. Redundant test cases differ from obsolete ones: they are still executable, but no longer important with respect to the considered testing criteria.
Prioritization can be based on the specification and code-based regression test selection
techniques described later in this chapter. In addition, test histories and fault-proneness models
can be incorporated in prioritization schemes. For example, a test case that has previously
revealed a fault in a module that has recently undergone change would receive a very high
priority, while a test case that has never failed (yet) would receive a lower priority, particularly if
it primarily concerns a feature that was not the focus of recent changes.
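A minimal sketch of such a scheme follows; the weights, test names, and module names are invented assumptions used only to illustrate how fault history and recent changes might be combined into a priority score:

# Hedged sketch of a priority scheme combining test history and recent changes:
# a test that previously revealed a fault and exercises a recently changed
# module gets the highest priority; one that has never failed and touches only
# unchanged modules gets the lowest.
def priority(test, changed_modules):
    score = 0
    if test["failed_before"]:
        score += 10                        # past fault-revealing ability
    if test["modules"] & changed_modules:
        score += 5                         # exercises recently changed code
    if not test["recently_executed"]:
        score += 1                         # favour cases not run for a while
    return score

tests = [
    {"name": "t1", "failed_before": True,  "modules": {"billing"}, "recently_executed": False},
    {"name": "t2", "failed_before": False, "modules": {"ui"},      "recently_executed": True},
]
ordered = sorted(tests, key=lambda t: priority(t, {"billing"}), reverse=True)
print([t["name"] for t in ordered])   # t1 is scheduled before t2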
Regression test selection techniques are based on either code or specifications. Code-based selection techniques select a test case for execution if it exercises a portion of the code that has been modified. Specification-based criteria select a test case for execution if it is relevant to a portion of the specification that has been changed. Code-based regression test techniques can be supported by relatively simple tools. They work even when specifications are not properly maintained.
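A hedged sketch of the code-based idea, assuming statement-level coverage records from the previous version are available (test names, line numbers, and the changed-line set are all invented):

# Hedged sketch of code-based regression test selection: each test case carries
# the statements (line numbers) it covered on the old version; a test is
# reselected only if it touches at least one modified line.
coverage = {
    "t_valid_sequence":   {10, 11, 12, 20},
    "t_empty_input":      {10, 30},
    "t_bad_hex_sequence": {10, 11, 12, 25, 26},
}
modified_lines = {25, 26}   # lines changed between the old and new version

selected = [t for t, lines in coverage.items() if lines & modified_lines]
print(selected)             # only t_bad_hex_sequence needs to be rerun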
Control flow graph (CFG) regression techniques are based on the differences between the CFGs
of the new and old versions of the software. Let us consider, for example, the C function cgi_decode from Chapter 12. Figure 22.1 shows the original function as presented in Chapter 12,
while Figure 22.2 shows a revision of the program. We refer to these two versions as 1.0 and 2.0,
respectively. Version 2.0 adds code to fix a fault in interpreting hexadecimal sequences ‘%xy’.
The fault was revealed by testing version 1.0 with input terminated by an erroneous subsequence
‘%x’, causing version 1.0 to read past the end of the input buffer and possibly overflow the
output buffer. Version 2.0 contains a new branch to map the unterminated sequence to a question
mark.
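A rough sketch of this idea is given below; the node identifiers and execution traces are invented and do not reproduce the cgi_decode graphs of Figures 22.1 and 22.2:

# Hedged sketch: select test cases whose recorded path through the old CFG
# traverses a node modified in the new version. Node F stands for the new
# branch (mapping an unterminated '%x' to '?'), and node D for the old node
# whose outgoing edges changed to reach it.
old_nodes = {"A", "B", "C", "D", "E"}
new_nodes = {"A", "B", "C", "D", "E", "F"}
changed = (new_nodes - old_nodes) | {"D"}    # new nodes plus modified old ones

traces = {
    "t_plain_text": ["A", "B", "E"],
    "t_percent_x":  ["A", "B", "C", "D", "E"],
}
reselected = [t for t, path in traces.items() if set(path) & changed]
print(reselected)    # only t_percent_x reaches the modified region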
Random sampling is a simple way to reduce the size of the regression test suite. In priority-based random sampling, high-priority test cases are selected more often than low-priority test cases, but with a good selection strategy all test cases are executed sooner or later.
Execution history priority schema Priorities can be assigned in many ways. A simple priority
scheme assigns priority according to the execution history: Recently executed test cases are
given low priority, while test cases that have not been recently executed are given high priority.
In the extreme, heavily weighting execution history approximates round robin selection.
Fault revealing priority schema Other history-based priority schemes predict fault detection
effectiveness. Test cases that have revealed faults in recent versions are given high priority.
Faults are not evenly distributed, but tend to accumulate in particular areas of the code, so test cases that have revealed faults before are likely to do so again.
Structural priority schema Structural coverage leads to a set of priority schemes based on the
elements covered by a test case.
Structural priority schemes produce several criteria depending on which elements we consider:
statements, conditions, decisions, functions, files, and so on. The choice of the element of
interest is usually driven by the testing level. Fine-grain elements such as statements and
conditions are typically used in unit testing, while in integration or system testing one can
consider coarser grain elements such as methods, features, and files.
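A hedged sketch of one such scheme at the statement level is shown below: test cases are ordered greedily by the number of not-yet-covered statements they add (an "additional coverage" ordering); the coverage sets are invented for illustration.

# Hedged sketch of a structural priority scheme: greedily schedule the test
# case that adds the most not-yet-covered statements, repeating until no test
# adds new coverage.
coverage = {
    "t1": {1, 2, 3, 4},
    "t2": {3, 4, 5},
    "t3": {6},
}

remaining = set().union(*coverage.values())
order = []
while remaining and coverage:
    best = max(coverage, key=lambda t: len(coverage[t] & remaining))
    if not coverage[best] & remaining:
        break                      # no test adds new coverage
    order.append(best)
    remaining -= coverage[best]
    del coverage[best]

print(order)   # t1 first (4 new statements), then t2 (adds 5), then t3 (adds 6)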