
Software Testing (18IS62) Module 5

MODULE – 5
CHAPTER – 1
INTEGRATION AND COMPONENT-BASED SOFTWARE TESTING

5.1 Overview
The traditional V model introduced in Chapter 2 divides testing into four main levels
of granularity: module, integration, system, and acceptance test. Module or unit test checks
module behavior against specifications or expectations; integration test checks module
compatibility; system and acceptance tests check behavior of the whole system with respect
to specifications and user needs, respectively. An effective integration test is built on a
foundation of thorough module testing and inspection. Module test maximizes controllability
and observability of an individual unit, and is more effective in exercising the full range of
module behaviors, rather than just those that are easy to trigger and observe in a particular
context of other modules.

While integration testing may to some extent act as a process check on module testing
(i.e., faults revealed during integration test can be taken as a signal of unsatisfactory unit
testing), thorough integration testing cannot fully compensate for sloppiness at the module
level. In fact, the quality of a system is limited by the quality of the modules and components
from which it is built, and even apparently noncritical modules can have widespread effects.
For example, in 2004 a buffer overflow vulnerability in a single, widely used library for
reading Portable Network Graphics (PNG) files caused security vulnerabilities in Windows,
Linux, and Mac OS X Web browsers and email clients.

On the other hand, some unintended side-effects of module faults may become
apparent only in integration test, and even a module that satisfies its interface specification
may be incompatible because of errors introduced in design decomposition. Integration tests
therefore focus on checking compatibility between module interfaces.

Integration faults are ultimately caused by incomplete specifications or faulty
implementations of interfaces, resource usage, or required properties. For example, it may
be very difficult to anticipate interactions between remote and apparently unrelated modules
that share a temporary hidden file that just happens to be given the same name by two
modules, particularly if the name clash appears rarely and only in some installation
configurations.

The official investigation of the Ariane 5 accident that led to the loss of the rocket on
June 4, 1996, concluded that the accident was caused by incompatibility of a software module
with the Ariane 5 requirements. The software module was in charge of computing the
horizontal bias, a value related to the horizontal velocity sensed by the platform that is
calculated as an indicator of alignment precision. The module had functioned correctly for
Ariane 4 rockets, which were smaller than the Ariane 5, and thus had a substantially lower
horizontal velocity. It produced an overflow when integrated into the Ariane 5 software. The
overflow started a series of events that terminated with self-destruction of the launcher. The
problem was not revealed during testing because of incomplete specifications: the module
had been tested against Ariane 4 flight parameters rather than those of the Ariane 5.

As with most software problems, integration problems may be attacked at many
levels. Good design and programming practice and suitable choice of design and
programming environment can reduce or even eliminate some classes of integration
problems. For example, in applications demanding management of complex, shared
structures, choosing a language with automatic storage management and garbage collection
greatly reduces memory disposal errors such as dangling pointers and redundant
deallocations ("double frees").
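To make these fault classes concrete, here is a minimal C fragment (invented for
illustration; not taken from any real system) that commits both errors at once. In a language
with automatic garbage collection, neither faulty line could be written:

#include <stdlib.h>

int main(void)
{
    char *buf = malloc(16);
    char *alias = buf;   /* a second pointer to the same block */

    free(buf);           /* the block is released here */
    /* 'alias' is now a dangling pointer; any use of it is undefined */
    free(alias);         /* redundant deallocation: a "double free" */
    return 0;
}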

Even if the programming language choice is determined by other factors, many errors
can be avoided by choosing patterns and enforcing coding standards across the entire code
base; the standards can be designed in such a way that violations are easy to detect manually
or with tools. For example, many projects using C or C++ require use of "safe" alternatives to
unchecked procedures, such as strncpy or strlcpy (string copy procedures less
vulnerable to buffer overflow), in place of strcpy. Checking for the mere presence of strcpy is
much easier (and more easily automated) than checking for its safe use. These measures do
not eliminate the possibility of error, but integration testing is more effective when focused
on finding faults that slip through these design measures.
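A sketch of what such a coding standard buys (the helper name and its contract are
invented for illustration): the bounded copy can still truncate, but it can never write past the
destination buffer, and a checker only has to search the code base for any occurrence of
strcpy rather than prove each individual use safe.

#include <string.h>

/* Copy src into dst without ever writing more than dstsize bytes.
   Assumes dstsize >= 1. strcpy(dst, src) would overrun dst whenever
   strlen(src) >= dstsize. */
void copy_bounded(char *dst, size_t dstsize, const char *src)
{
    strncpy(dst, src, dstsize - 1);
    dst[dstsize - 1] = '\0';   /* strncpy alone does not guarantee termination */
}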

5.2 Integration Testing Strategies

Integration testing proceeds incrementally with assembly of modules into
successively larger subsystems. Incremental testing is preferred, first, to provide the earliest
possible feedback on integration problems. In addition, controlling and observing the
behavior of an integrated collection of modules grows in complexity with the number of
modules and the complexity of their interactions. Complex interactions may hide faults, and
failures that are manifested may propagate across many modules, making fault localization
difficult. Therefore it is worthwhile to thoroughly test a small collection of modules before
adding more.

A strategy for integration testing of successive partial subsystems is driven by the
order in which modules are constructed (the build plan), which is an aspect of the system
architecture. The build plan, in turn, is driven partly by the needs of test. Design and
integration testing are so tightly coupled that in many companies the integration and
testing groups are merged into a single group in charge of both integration and integration
testing.

Since incremental assemblies of modules are incomplete, one must often construct
scaffolding - drivers, stubs, and various kinds of instrumentation - to effectively test them.
This can be a major cost of integration testing, and it depends to a large extent on the order in
which modules are assembled and tested.
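A minimal sketch of both kinds of scaffolding in C (the module and function names are
hypothetical, invented only to make the roles concrete): the module under test depends on a
module that is not yet integrated, so a stub supplies a canned answer, and a driver invokes
the module and checks its result.

#include <assert.h>
#include <math.h>

/* Module under test: depends on tax_rate(), which belongs to a
   module that has not yet been integrated. */
extern double tax_rate(const char *region);

double price_with_tax(double base, const char *region)
{
    return base * (1.0 + tax_rate(region));
}

/* Stub: stands in for the missing module with a canned value. */
double tax_rate(const char *region)
{
    (void)region;   /* this stub ignores its argument */
    return 0.10;    /* fixed rate chosen by the test designer */
}

/* Driver: exercises the module under test and checks its output. */
int main(void)
{
    assert(fabs(price_with_tax(100.0, "CA") - 110.0) < 1e-9);
    return 0;
}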
One extreme approach is to avoid the cost of scaffolding by waiting until all modules
are integrated, and testing them together - essentially merging integration testing into system
testing. In this big bang approach, neither stubs nor drivers need be constructed, nor must the
development be carefully planned to expose well-specified interfaces to each subsystem.
These savings are more than offset by losses in observability, diagnosability, and feedback.
Delaying integration testing hides faults whose effects do not always propagate outward to
visible failures (violating the principle that failing always is better than failing sometimes)
and impedes fault localization and diagnosis because the failures that are visible may be far
removed from their causes. Requiring the whole system to be available before integration
does not allow early test and feedback, and so faults that are detected are much more costly to
repair. Big bang integration testing is less a rational strategy than an attempt to recover from
a lack of planning; it is therefore also known as the desperate tester strategy.

Memory Leaks
Memory leaks are typical of program faults that often escape module testing. They
may be detected in integration testing, but often escape further and are discovered only in
actual system operation.
The Apache Web server, version 2.0.48, contained the following code for reacting to
normal Web page requests that arrived on the secure (https) server port:

static void ssl_io_filter_disable(ap_filter_t *f)
{
    bio_filter_in_ctx_t *inctx = f->ctx;
    inctx->ssl = NULL;               /* clears the pointer to the SSL structure */
    inctx->filter_ctx->pssl = NULL;  /* ... but never frees it: the leak */
}

This code fails to reclaim some dynamically allocated memory, causing the
Web server to "leak" memory at run-time. Over a long period of use, or over a shorter period
if the fault is exploited in a denial-of-service attack, this version of the Apache Web server
will allocate and fail to reclaim more and more memory, eventually slowing to the point of
unusability or simply crashing.
The fault is nearly impossible to see in this code. The memory that should be
deallocated here is part of a structure defined and created elsewhere, in the SSL (secure
sockets layer) subsystem, written and maintained by a different set of developers. Even
reading the definition of the ap_filter_t structure, which occurs in a different part of the
Apache Web server source code, doesn't help, since the ctx field is an opaque pointer (type
void * in C). The repair, applied in version 2.0.49 of the server, is:

static void ssl_io_filter_disable(SSLConnRec *sslconn, ap_filter_t *f)
{
    bio_filter_in_ctx_t *inctx = f->ctx;
    SSL_free(inctx->ssl);            /* release the SSL structure first */
    sslconn->ssl = NULL;
    inctx->ssl = NULL;
    inctx->filter_ctx->pssl = NULL;
}

This memory leak illustrates several properties typical of integration faults. In
principle, it stems from incomplete knowledge of the protocol required to interact with some
other portion of the code, either because the specification is (inevitably) incomplete or
because it is not humanly possible to remember everything. The problem is due at least in
part to a weakness of the programming language - it would not have occurred in a language
with automatic garbage collection, such as Java.

Finally, although the fault would be very difficult to detect with conventional unit
testing techniques, there do exist both static and dynamic analysis techniques that could
have detected it.

Among strategies for incrementally testing partially assembled systems, we can
distinguish two main classes: structural and feature oriented. In a structural approach,
modules are constructed, assembled, and tested together in an order based on hierarchical
structure in the design. Structural approaches include bottom-up, top-down, and a
combination sometimes referred to as the sandwich or backbone strategy. Feature-oriented
strategies derive the order of integration from characteristics of the application, and include
threads and critical modules strategies.

Top-down and bottom-up strategies are classic alternatives in system construction and
incremental integration testing as modules accumulate.

A top-down integration strategy begins at the top of the uses hierarchy, including the
interfaces exposed through a user interface or top-level application program interface (API).
The need for drivers is reduced or eliminated while descending the hierarchy, since at each
stage the already tested modules can be used as drivers while testing the next layer. For
example, referring to the excerpt of the Chipmunk Web presence shown in Figure 21.1, we can
start by integrating CustomerCare with Customer, while stubbing Account and Order. We
could then add either Account or Order and Package, stubbing Model and Component in the
last case.

Figure 21.1: An excerpt of the class diagram of the Chipmunk Web presence. Modules are
sorted from the top to the bottom according to the use/include relation. The topmost modules
are not used or included in any other module, while the bottommost modules do not include
or use other modules.

Bottom-up integration similarly reduces the need to develop stubs, except for
breaking circular relations. Referring again to the example in Figure 21.1, we can start
bottom-up by integrating Slot with Component, using drivers for Model and Order. We can
then incrementally add Model and Order. We can finally add either Package or Account and
Customer, before integrating CustomerCare, without constructing stubs.

Top-down and bottom-up approaches to integration testing can be applied early in the
development if paired with similar design strategies: If modules are delivered following the
hierarchy, either top-down or bottom-up, they can be integrated and tested as soon as they are
delivered, thus providing early feedback to the developers. Both approaches increase
controllability and diagnosability, since failures are likely caused by interactions with the
newly integrated modules.

In practice, software systems are rarely developed strictly top-down or bottom-up.
Design and integration strategies are driven by other factors, like reuse of existing modules or
commercial off-the-shelf (COTS) components, or the need to develop early prototypes for
user feedback. Integration may combine elements of the two approaches, starting from both
ends of the hierarchy and proceeding toward the middle.

An early top-down approach may result from developing prototypes for early user
feedback, while existing modules may be integrated bottom-up. This is known as the
sandwich or backbone strategy. For example, referring once more to the small system of
Figure 21.1, let us imagine reusing existing modules for Model, Slot, and Component, and
developing CustomerCare and Customer as part of an early prototype. We can start
integrating CustomerCare and Customer top-down, while stubbing Account and Order.
Meanwhile, we can integrate Model, Slot, and Component with Order bottom-up, using
drivers for Customer and Package. We can then integrate Account with Customer, and
Package with Order, before finally integrating the whole prototype system.

The price of flexibility and adaptability in the sandwich strategy is complex planning
and monitoring. While top-down and bottom-up are straightforward to plan and monitor, a
sandwich approach requires extra coordination between development and test. In contrast to
structural integration testing strategies, feature-driven strategies select an order of integration
that depends on the dynamic collaboration patterns among modules regardless of the static
structure of the system. The thread integration testing strategy integrates modules according
to system features. Test designers identify threads of execution that correspond to system
features, and they incrementally test each thread.

The thread integration strategy emphasizes module interplay for specific functionality.
Referring to the Chipmunk Web presence, we can identify feature threads for assembling
models, finalizing orders, completing payments, packaging and shipping, and so on. Feature
thread integration fits well with software processes that emphasize incremental delivery of
user-visible functionality. Even when threads do not correspond to usable end-user features,
ordering integration by functional threads is a useful tactic to make flaws in integration
externally visible.

Incremental delivery of usable features is not the only possible consideration in
choosing the order in which functional threads are integrated and tested. Risk
reduction is also a driving force in many software processes. Critical module integration
testing focuses on modules that pose the greatest risk to the project. Modules are sorted and
incrementally integrated according to the associated risk factor that characterizes the
criticality of each module. Both external risks (such as safety) and project risks (such as
schedule) can be considered.

A risk-based approach is particularly appropriate when the development team does
not have extensive experience with some aspect of the system under development. Consider
once more the Chipmunk Web presence. If Chipmunk has not previously constructed
software that interacts directly with shipping services, those interface modules will be critical
because of the inherent risks of interacting with externally provided subsystems, which may
be inadequately documented or misunderstood and which may also change.

Feature-driven test strategies usually require more complex planning and management
than structural strategies. Thus, we adopt them only when their advantages exceed the extra
management costs. For small systems a structural strategy is usually sufficient, but for large
systems feature-driven strategies are usually preferred. Often large projects require
combinations of development strategies that do not fit any single integration testing strategy. In
these cases, quality managers combine different strategies: top-down, bottom-up, and
sandwich strategies for small subsystems, and a blend of threads and critical module
strategies at a higher level.

5.3 Testing Components and Assemblies


Many software products are constructed, partly or wholly, from assemblies of prebuilt
software components. A key characteristic of software components is that the organization
that develops a component is distinct from the (several) groups of developers who use it to
construct systems. The component developers cannot completely anticipate the uses to which
a component will be put, and the system developers have limited knowledge of the
component. Testing components (by the component developers) and assemblies (by system
developers) therefore brings some challenges and constraints that differ from testing other
kinds of modules.

Reusable components are often more dependable than software developed for a single
application. More effort can be invested in improving the quality of a component when the
cost is amortized across many applications. Moreover, when reusing a component that has
been in use in other applications for some time, one obtains the benefit not only of test and
analysis by component developers, but also of actual operational use.

The advantages of component reuse for quality are not automatic. They do not apply
to code that was developed for a single application and then scavenged for use in another.
The benefit of operational experience as a kind of in vivo testing, moreover, is obtained only
to the extent that previous uses of the component are quite similar to the new use. These
advantages are balanced against two considerable disadvantages. First, a component designed
for wide reuse will usually be much more complex than a module designed for a single use; a
rule of thumb is that the development effort (including analysis and test) for a widely usable
component is at least twice that for a module that provides equivalent functionality for a
single application. In addition, a reusable component is by definition developed without full
knowledge of the environment in which it will be used, and it is exceptionally difficult to
fully and clearly describe all the assumptions, dependencies, and limitations that might
impinge upon its use in a particular application.

In general, a software component is characterized by a contract or application
program interface (API) distinct from its implementation. Where a mature market has
developed for components addressing a particular need, a single interface specification (e.g.,
SQL for database access or document object model (DOM) for access and traversal of XML
data) can have several distinct implementations. The contract describes the component by
specifying access points of the component, such as procedures (methods) and their
parameters, possible exceptions, global variables, and input and output network connections.
Even when the interface specification is bound to a single implementation, the logical
distinction between interface and implementation is crucial to effective use and testing.

Component A software component is a reusable unit of deployment and composition that is
deployed and integrated multiple times and usually by different teams. Components are
characterized by a contract or interface and may or may not have state.
Components are often confused with objects, and a component can be encapsulated
by an object or a set of objects, but they typically differ in many respects:

 Components typically use persistent storage, while objects usually have only local
state.
 Components may be accessed by an extensive set of communication mechanisms,
while objects are activated through method calls.
 Components are usually larger grain subsystems than objects.

Component contract or interface The component contract describes the access points and
parameters of the component, and specifies functional and nonfunctional behavior and any
conditions required for using the component.
Framework A framework is a micro-architecture or a skeleton of an application, with hooks
for attaching application-specific functionality or configuration-specific components. A
framework can be seen as a circuit board with empty slots for components.
Frameworks and design patterns Patterns are logical design fragments, while frameworks
are concrete elements of the application. Frameworks often implement patterns.
Component-based system A component-based system is a system built primarily by
assembling software components (and perhaps a small amount of application-specific
code) connected through a framework or ad hoc "glue code."
COTS The term commercial off-the-shelf, or COTS, indicates components developed
for sale to other organizations.

The interface specification of a component should provide all the information
required for reusing the component, including so-called nonfunctional properties such as
performance or capacity limits, in addition to functional behavior. All dependence of the
component on the environment in which it executes should also be specified. In practice, few
component specifications are complete in every detail, and even details that are specified
precisely can easily be overlooked or misunderstood when embedded in a complex
specification document.
The main problem facing test designers in the organization that produces a component
is lack of information about the ways in which the component will be used. A component
may be reused in many different contexts, including applications for which its functionality is
an imperfect fit. A general component will typically provide many more features and options
than are used by any particular application.
A good deal of functional and structural testing of a component, focused on finding
and removing as many program faults as possible, can be oblivious to the context of actual
use. As with system and acceptance testing of complete applications, it is then necessary to
move to test suites that are more reflective of actual use. Testing with usage scenarios places
a higher priority on finding faults most likely to be encountered in use and is needed to gain
confidence that the component will be perceived by its users (that is, by developers who
employ it as part of larger systems) as sufficiently dependable.

Test designers cannot anticipate all possible uses of a component under test, but
they can design test suites for classes of use in the form of scenarios. Test scenarios are
closely related to scenarios or use cases in requirements analysis and design.

Sometimes different classes of use are clearly evident in the component specification.
For example, the W3 Document Object Model (DOM) specification has parts that deal
exclusively with HTML markup and parts that deal with XML; these correspond to different
uses to which a component implementing the DOM may be put. The DOM specification
further provides two "views" of the component interface. In the flat view, all traversal and
inspection operations are provided on node objects, without regard to subclass. In the
structured view, each subclass of node offers traversal and inspection operations specific to
that variety of node. For example, an Element node has methods to get and set attributes, but
a Text node (which represents simple textual data within XML or HTML) does not.
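The difference between the two views can be sketched with hypothetical C types (an
illustration of the idea only, not the actual DOM bindings): the flat view offers traversal and
inspection on every node alike, while the structured view ties some operations to particular
node kinds.

typedef enum { ELEMENT_NODE, TEXT_NODE } NodeType;

typedef struct Node {
    NodeType type;
    struct Node *first_child;   /* flat view: every node can be traversed */
    struct Node *next_sibling;
} Node;

/* Flat view: inspection that works uniformly on any node. */
NodeType node_type(const Node *n) { return n->type; }

/* Structured view: attribute access applies only to Element nodes;
   a Text node has no such operation at all. */
const char *element_get_attribute(const Node *n, const char *name)
{
    if (n->type != ELEMENT_NODE)
        return 0;    /* not an Element: no attributes */
    (void)name;      /* attribute lookup omitted in this sketch */
    return 0;
}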

Open Research Issues


Ensuring quality of components and of component-based systems remains a
challenging problem and a topic of current research. One research thread considers how
dynamic analysis of components and component-based systems in one environment can
produce useful information for assessing likely suitability for using some of the same
components in another environment (by characterizing the contexts in which a component
has been used successfully).

A related approach of characterizing a set of behaviors and recognizing changes or
differences (whether or not those differences are failures) may be applicable in the
increasingly important context of dynamically configurable and field-upgradable systems,
which pose all the problems of component-based systems with the additional complication of
performing integration in deployed systems rather than in the development environment.

Software design for testability is an important factor in the cost and effectiveness of
test and analysis, particularly for module and component integration.

5.4 System, Acceptance, and Regression Testing

5.4.1 Overview
System, acceptance, and regression testing are all concerned with the behavior of a
software system as a whole, but they differ in purpose. System testing is a check of
consistency between the software system and its specification (it is a verification activity).
Like unit and integration testing, system testing is primarily aimed at uncovering faults, but
unlike testing activities at finer granularity levels, system testing focuses on system-level
properties. System testing together with acceptance testing also serves an important role in
assessing whether a product can be released to customers, which is distinct from its role in
exposing faults to be removed to improve the product.

Flaws in specifications and in development, as well as changes in users' expectations,
may result in products that do not fully meet users' needs despite passing system tests.
Acceptance testing, as its name implies, is a validation activity aimed primarily at the
acceptability of the product, and it includes judgments of actual usefulness and usability
rather than conformance to a requirements specification. Regression testing is specialized to
the problem of efficiently checking for unintended effects of software changes. New
functionality and modification of existing code may introduce unexpected interactions and
lead latent faults to produce failures not experienced in previous releases.

5.5 System Testing


The essential characteristics of system testing are that it is comprehensive, based on a
specification of observable behavior, and independent of design and implementation
decisions. System testing can be considered the culmination of integration testing, and
passing all system tests is tantamount to being complete and free of known bugs.

The system test suite may share some test cases with test suites used in integration
and even unit testing, particularly when a thread-based or spiral model of development has
been taken and subsystem correctness has been tested primarily through externally visible
features and behavior. However, the essential characteristic of independence implies that test
cases developed in close coordination with design and implementation may be unsuitable.
The overlap, if any, should result from using system test cases early, rather than reusing unit
and integration test cases in the system test suite.

Independence in system testing avoids repeating software design errors in test design.
This danger exists to some extent at all stages of development, but always in trade for some
advantage in designing effective test cases based on familiarity with the software design and
its potential pitfalls. The balance between these considerations shifts at different levels of
granularity, and it is essential that independence take priority at some level to obtain a
credible assessment of quality.

In some organizations, responsibility for test design and execution shifts at a discrete
point from the development team to an independent verification and validation team that is
organizationally isolated from developers. More often the shift in emphasis is gradual,
without a corresponding shift in responsible personnel.

Particularly when system test designers are developers or attached to the development
team, the most effective way to ensure that the system test suite is not unduly influenced by
design decisions is to design most system test cases as early as possible. Even in agile
development processes, in which requirements engineering is tightly interwoven with
development, it is considered good practice to design test cases for a new feature before
implementing the feature. When the time between specifying a feature and implementing it is
longer, early design of system tests facilitates risk-driven strategies that expose critical
behaviors to system test cases as early as possible, avoiding unpleasant surprises as
deployment nears.

For example, in the (imaginary) Chipmunk development of Web-based purchasing,
some questions were raised during requirements specification regarding the point at which a
price change becomes effective. For example, if an item's catalog price is raised or lowered
between the time it is added to the shopping cart and the time of actual purchase, which price
is the customer charged? The requirement was clarified and documented with a set of use
cases in which outcomes of various interleavings of customer actions and price changes were
specified, and each of these scenarios became a system test case specification. Moreover,
since this was recognized as a critical property with many opportunities for failure, the
system architecture and build plan for the Chipmunk Web presence were structured with
interfaces that could be artificially driven through various scenarios early in development,
and with several of the system test scenarios simulated in earlier integration tests.

The appropriate notions of thoroughness in system testing are with respect to the
system specification and potential usage scenarios, rather than code or design. Each feature or
specified behavior of the system should be accounted for in one or several test cases. In
addition to facilitating design for test, designing system test cases together with the system
requirements specification document helps expose ambiguity and refine specifications.

The set of feature tests passed by the current partial implementation is often used as a
gauge of progress, though interpreting a count of failing feature-based system tests requires
care, since test cases vary in importance. Additional test cases can be devised during
development to check for observable symptoms of failures that were not anticipated in the
initial system specification. They may also be based on failures observed and reported by
actual users, either in acceptance testing or from previous versions of a system. These are in
addition to a thorough specification-based test suite, so they do not compromise
independence of the quality assessment.

Some system properties, including performance properties like latency between an
event and system response and reliability properties like mean time between failures, are
inherently global. While one certainly should aim to provide estimates of these properties as
early as practical, they are vulnerable to unplanned interactions among parts of a complex
system and its environment. The importance of such global properties is therefore magnified
in system testing.

Global properties like performance, security, and safety are difficult to
specify precisely and operationally, and they depend not only on many parts of the system
under test, but also on its environment and use. For example, U.S. HIPAA regulations
governing privacy of medical records require appropriate administrative, technical, and
physical safeguards to protect the privacy of health information, further specified as follows:
Implementation specification: safeguards. A covered entity must reasonably safeguard
protected health information from any intentional or unintentional use or disclosure that is in
violation of the standards, implementation specifications or other requirements of this
subpart. [Uni00, sec. 164.530(c)(2)]

It is unlikely that any precise operational specification can fully capture the HIPAA
requirement as it applies to an automated medical records system. One must consider the
whole context of use, including, for example, which personnel have access to the system and
how unauthorized personnel are prevented from gaining access. Some global properties may
be defined operationally, but parameterized by use. For example, a hard-real-time system
must meet deadlines, but cannot do so in a completely arbitrary environment; its performance
specification is parameterized by event frequency and minimum inter-arrival times. An e-
commerce system may be expected to provide a certain level of responsiveness up to a
certain number of transactions per second and to degrade gracefully up to a second, higher rate. A
key step is identifying the "operational envelope" of the system, and testing both near the
edges of that envelope (to assess compliance with specified goals) and well beyond it (to
ensure the system degrades or fails gracefully). Defining borderline and extreme cases is
logically part of requirements engineering, but as with precise specification of features, test
design often reveals gaps and ambiguities.

Not all global properties will be amenable to dynamic testing at all, at least in the
conventional sense. One may specify a number of properties that a secure computer system
should have, and some of these may be amenable to testing. Others can be addressed only
through inspection and analysis techniques, and ultimately one does not trust the security of a
system at least until an adversarial team has tried and failed to subvert it. Similarly, there is
no set of test cases that can establish software safety, in part because safety is a property of a
larger system and environment of which the software is only part. Rather, one must consider
the safety of the overall system, and assess aspects of the software that are critical to that
overall assessment. Some but not all of those claims may be amenable to testing.

Testing global system properties may require extensive simulation of the execution
environment. Creating accurate models of the operational environment requires substantial
human resources, and executing them can require substantial time and machine resources.
Usually this implies that "stress" testing is a separate activity from frequent repetition of
feature tests. For example, a large suite of system test cases might well run each night or
several times a week, but a substantial stress test to measure robust performance under heavy
load might take hours to set up and days or weeks to run.

A test case that can be run automatically with few human or machine resources should
generally focus on one purpose: to make diagnosis of failed test executions as clear and
simple as possible. Stress testing alters this: If a test case takes an hour to set up and a day to
run, then one had best glean as much information as possible from its results. This includes
monitoring for faults that should, in principle, have been found and eliminated in unit and
integration testing, but which become easier to recognize in a stress test (and which, for the
same reason, are likely to become visible to users). For example, several embedded system
products ranging from laser printers to tablet computers have been shipped with slow
memory leaks that became noticeable only after hours or days of continuous use. In the case
of the tablet PC whose character recognition module gradually consumed all system memory,
one must wonder about the extent of stress testing the software was subjected to.

5.6 Acceptance Testing

The purpose of acceptance testing is to guide a decision as to whether the product in
its current state should be released. The decision can be based on measures of the product or
process. Measures of the product are typically some inference of dependability based on
statistical testing. Measures of the process are ultimately based on comparison to experience
with previous products.

Although system and acceptance testing are closely tied in many organizations,
fundamental differences exist between searching for faults and measuring quality. Even when
the two activities overlap to some extent, it is essential to be clear about the distinction, in
order to avoid drawing unjustified conclusions. Quantitative goals for dependability,
including reliability, availability, and mean time between failures, are essentially
statistical measures and depend on a statistically valid approach to drawing a representative
sample of test executions from a population of program behaviors. Systematic testing, which
includes all of the testing techniques presented heretofore in this book, does not draw
statistically representative samples. Its purpose is not to fail at a "typical" rate, but to
exhibit as many failures as possible. It is thus unsuitable for statistical testing.

The first requirement for valid statistical testing is a precise definition of what is being
measured and for what population. If system operation involves transactions, each of which
consists of several operations, a failure rate of one operation in a thousand is quite different
from a failure rate of one transaction in a thousand. In addition, the failure rate may vary
depending on the mix of transaction types, or the failure rate may be higher when one million
transactions occur in an hour than when the same transactions are spread across a day.
Statistical modeling therefore necessarily involves construction of a model of usage, and the
results are relative to that model.

Suppose, for example, that a typical session using the Chipmunk Web sales facility
consists of 50 interactions, the last of which is a single operation in which the credit card is
charged and the order recorded. Suppose the Chipmunk software always operates flawlessly
up to the point that a credit card is to be charged, but on half the attempts it charges the
wrong amount. What is the reliability of the system? If we count the fraction of individual
interactions that are correctly carried out, we conclude that only one operation in 100 fails, so
the system is 99% reliable. If we instead count entire sessions, then it is only 50% reliable,
since half the sessions result in an improper credit card charge.
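The arithmetic behind the two figures, which differ only in the unit being counted:

\[
\text{operation reliability} = 1 - \frac{0.5 \text{ failures}}{50 \text{ operations}} = 1 - \frac{1}{100} = 99\%,
\qquad
\text{session reliability} = 1 - \frac{1}{2} = 50\%.
\]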

Statistical models of usage, or operational profiles, may be available from
measurement of actual use of prior, similar systems. For example, use of a current telephone
handset may be a reasonably good model of how a new handset will be used. Good models
may also be obtained in embedded systems whose environment is primarily made up of
predictable devices rather than unpredictable humans. In other cases one cannot justify high
confidence in a model, but one can limit the uncertainty to a small number of parameters.
One can perform sensitivity testing to determine which parameters are critical. Sensitivity
testing consists of repeating statistical tests while systematically varying parameters to note
the effect of each parameter on the output. A particular parameter may have little effect on
outcomes over the entire range of plausible values, or there may be an effect that varies
smoothly over the range. If the effect of a given parameter is either large or varies
discontinuously (e.g., performance falls precipitously when system load crosses some
threshold), then one may need to make distinct predictions for different value ranges.

A second problem faced by statistical testing, particularly for reliability, is that it may
take a very great deal of testing to obtain evidence of a sufficient level of reliability. Consider
that a system that executes once per second, with a failure rate of one execution in a million,
or 99.9999% reliability, fails about 31 times each year; this may require a great testing effort
and still not be adequate if each failure could result in death or a lawsuit. For critical systems,
one may insist on software failure rates that are an insignificant fraction of total failures. For
many other systems, statistical measures of reliability may simply not be worth the trouble.
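The figure of about 31 failures per year is simple arithmetic, given the stated rate of one
execution per second:

\[
3600 \times 24 \times 365 \ \text{executions/year} \times 10^{-6} \ \text{failures/execution} \approx 31.5 \ \text{failures/year}.
\]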

A less formal, but frequently used approach to acceptance testing is testing with users.
An early version of the product is delivered to a sample of users who provide feedback on
failures and usability. Such tests are often called alpha and beta tests. The two terms
distinguish between testing phases. Often the early or alpha phases are performed within the
developing organization, while the later or beta phases are performed at users' sites.

In alpha and beta testing, the user sample determines the operational profile. A good
sample of users should include representatives of each distinct category of users, grouped by
operational profile and significance. Suppose, for example, Chipmunk plans to provide Web-
based sales facilities to dealers, industrial customers, and individuals. A good sample should
include both users from each of those three categories and a range of usage in each category.
In the industrial user category, large customers who frequently issue complex orders as well
as small companies who typically order a small number of units should be represented, as the
difference in their usage may lead to different failure rates. We may weigh differently the
frequency of failure reports from dealers and from direct customers, to reflect either the
expected mix of usage in the full population or the difference in consequence of failure.

5.7 Usability

A usable product is quickly learned, allows users to work efficiently, and is pleasant
to use. Usability involves objective criteria such as the time and number of operations
required to perform tasks and the frequency of user error, in addition to the overall, subjective
satisfaction of users.

Manjushree TL, Asst Prof., Dept of ISE, SVIT Page 20


Software Testing(18IS62) Module 5

For test and analysis, it is useful to distinguish attributes that are uniquely associated
with usability from other aspects of software quality (dependability, performance, security,
etc.). Other software qualities may be necessary for usability; for example, a program that
often fails to satisfy its functional requirements or that presents security holes is likely to
suffer poor usability as a consequence. Distinguishing primary usability properties from other
software qualities allows responsibility for each class of properties to be allocated to the most
appropriate personnel, at the most cost-effective points in the project schedule.

Even if usability is largely based on user perception and thus is validated based on
user feedback, it can be verified early in the design and through the whole software life cycle.
The process of verifying and validating usability includes the following main steps:

 Inspecting specifications with usability checklists. Inspection provides early feedback
on usability.
 Testing early prototypes with end users to explore their mental model (exploratory test),
evaluate alternatives (comparison test), and validate software usability. A prototype for early
assessment of usability may not include any functioning software; a cardboard prototype may
be as simple as a sequence of static images presented to users by the usability tester.
 Testing incremental releases with both usability experts and end users to monitor
progress and anticipate usability problems.
 System and acceptance testing that includes expert-based inspection and testing, user-based
testing, comparison testing against competitors, and analysis and checks often done
automatically, such as a check of link connectivity and verification of browser compatibility.

User-based testing (i.e., testing with representatives of the actual end-user population) is
particularly important for validating software usability. It can be applied at different stages,
from early prototyping through incremental releases of the final system, and can be used with
different goals: exploring the mental model of the user, evaluating design alternatives, and
validating against established usability requirements and standards.

The purpose of exploratory testing is to investigate the mental model of end users. It
consists of asking users about their approach to interactions with the system. For example,
during an exploratory test for the Chipmunk Web presence, we may provide users with a
generic interface for choosing the model they would like to buy, in order to understand how
users will interact with the system. A generic interface could present information about all
laptop computer characteristics uniformly to see which are examined first by the sample
users, and thereby to determine the set of characteristics that should belong to the summary in
the menu list of laptops. Exploratory test is usually performed early in design, especially
when designing a system for a new target population.

The purpose of comparison testing is evaluating options. It consists of observing user
reactions to alternative interaction patterns. During comparison test we can, for example,
provide users with different facilities to assemble the desired Chipmunk laptop configuration,
in order to identify patterns that facilitate users' interactions. Comparison test is usually applied
when the general interaction patterns are clear and need to be refined. It can substitute for
exploratory testing if initial knowledge about target users is sufficient to construct a range of
alternatives, or otherwise follows exploratory testing.
The purpose of validation testing is assessing overall usability. It includes identifying
difficulties and obstacles that users encounter while interacting with the system, as well as
measuring characteristics such as error rate and time to perform a task.
A well-executed design and organization of usability testing can produce results that
are objective and accurately predict usability in the target user population. The usability test
design includes selecting suitable representatives of the target users and organizing sessions
that guide the test toward interpretable results. A common approach is divided into
preparation, execution, and analysis phases. During the preparation phase, test designers
define the objectives of the session, identify the items to be tested, select a representative
population of end users, and plan the required actions. During execution, users are monitored
as they execute the planned actions in a controlled environment. During analysis, results are
evaluated, and changes to the software interfaces or new testing sessions are planned, if
required.

Users are asked to execute a planned set of actions that are identified as typical uses
of the tested feature. For example, the Chipmunk usability assessment team may ask users to
configure a product, modify the configuration to take advantage of some special offers, and
place an order with overnight delivery.
Users should perform tasks independently, without help or influence from the testing
staff. User actions are recorded, and comments and impressions are collected with a
post-activity questionnaire. Activity monitoring can be very simple, such as recording sequences
of mouse clicks to perform each action. More sophisticated monitoring can include recording
mouse or eye movements. Timing should also be recorded and may sometimes be used for
driving the sessions (e.g., fixing a maximum time for the session or for each set of actions).

An important aspect of usability is accessibility to all users, including those with
disabilities. Accessibility testing is legally required in some application domains. For
example, some governments impose specific accessibility rules for Web applications of
public institutions. The set of Web Content Accessibility Guidelines (WCAG) defined by the
World Wide Web Consortium is becoming an important standard reference.

Web Content Accessibility Guidelines (WCAG)


1. Provide equivalent alternatives to auditory and visual content that convey essentially the
same function or purpose.
2. Ensure that text and graphics are understandable when viewed without color.
3. Mark up documents with the proper structural elements, controlling presentation with style
sheets rather than presentation elements and attributes.
4. Use markup that facilitates pronunciation or interpretation of abbreviated or foreign text.
5. Ensure that tables have necessary markup to be transformed by accessible browsers and
other user agents.
6. Ensure that pages are accessible even when newer technologies are not supported or are
turned off.
7. Ensure that moving, blinking, scrolling, or auto-updating objects or pages may be paused
or stopped.
8. Ensure that the user interface, including embedded user interface elements, follows
principles of accessible design: device-independent access to functionality, keyboard
operability, self-voicing, and so on.
9. Use features that enable activation of page elements via a variety of input devices.
10. Use interim solutions so that assistive technologies and older browsers will operate
correctly.
11. Where technologies outside of W3C specifications are used (e.g., Flash), provide alternative
versions to ensure accessibility to standard user agents and assistive technologies (e.g., screen
readers).
12. Provide context and orientation information to help users understand complex pages or
elements.

13. Provide clear and consistent navigation mechanisms to increase the likelihood that a
person will find what they are looking for at a site.
14. Ensure that documents are clear and simple, so they may be more easily understood.

5.8 Regression Testing

When building a new version of a system (e.g., by removing faults, changing or
adding functionality, porting the system to a new platform, or extending interoperability), we
may also change existing functionality in unintended ways. Sometimes even small changes
can produce unforeseen effects that lead to new failures. For example, a guard added to an
array to fix an overflow problem may cause a failure when the array is used in other contexts,
or porting the software to a new platform may expose a latent fault in creating and modifying
temporary files.

When a new version of software no longer correctly provides functionality that should be
preserved, we say that the new version regresses with respect to former versions. The
non-regression of new versions (i.e., preservation of functionality) is a basic quality requirement.
Disciplined design and development techniques, including precise specification and
modularity that encapsulates independent design decisions, improve the likelihood of
achieving non-regression. Testing activities that focus on regression problems are called
(non)regression testing.

A simple approach to regression testing consists of re-executing all test cases
designed for previous versions. Even this simple retest-all approach may present nontrivial
problems and costs. Former test cases may not be re-executable on the new version without
modification, and rerunning all test cases may be too expensive and unnecessary. A good
quality test suite must be maintained across system versions.

Changes in the new software version may impact the format of inputs and outputs,
and test cases may not be executable without corresponding changes. Even simple
modifications of the data structures, such as the addition of a field or a small change of data
types, may invalidate former test cases or make their outputs incomparable with the new
ones. Moreover, some test cases may be obsolete, since they test features of the software that
have been modified, substituted, or removed from the new version.

Scaffolding that interprets test case specifications, rather than fully concrete test data,
can reduce the impact of input and output format changes on regression testing.

High-quality test suites can be maintained across versions by identifying and
removing obsolete test cases, and by revealing and suitably marking redundant test cases.
Redundant test cases differ from obsolete ones in that they remain executable but are not
important with respect to the considered testing criteria. For example, test cases that cover the
same path are mutually redundant with respect to structural criteria, while test cases that
match the same partition are mutually redundant with respect to functional criteria. Redundant
test cases may be introduced in the test suites due to concurrent work of different test
designers or to changes in the code. Redundant test cases do not reduce the overall
effectiveness of tests, but they affect the cost-benefit trade-off: They are unlikely to reveal
faults, but they add to the costs of test execution and maintenance. Obsolete test cases are
removed because they are no longer useful, while redundant test cases are kept because they
may become helpful in successive versions of the software.

5.9 Regression Test Selection Techniques


Even when we can identify and eliminate obsolete test cases, the number of tests to be
re-executed may be large, especially for legacy software. Executing all test cases for large
software products may require many hours or days of execution and may depend on scarce
resources such as an expensive hardware test harness. For example, some mass-market
software systems must be tested for compatibility with hundreds of different hardware
configurations and thousands of drivers. Many test cases may have been designed to exercise
parts of the software that cannot be affected by the changes in the version under test. Test
cases designed to check the behavior of the file management system of an operating system
are unlikely to provide useful information when re-executed after changes of the window
manager. The cost of re-executing a test suite can be reduced by selecting a subset of test
cases to be re-executed, omitting irrelevant test cases or prioritizing execution of subsets of
the test suite by their relation to changes.


Test case prioritization orders the frequency of test case execution: all test cases are
executed eventually, but those deemed least likely to reveal faults by some criterion are
executed less frequently. Alternate execution is a variant of prioritization for environments
with frequent releases and small incremental changes; it selects a different subset of
regression test cases for each software version. Prioritization can be based on specification-
and code-based regression test selection techniques. In addition, test histories and
fault-proneness models can be incorporated into prioritization schemes.
For example, a test case that has previously revealed a fault in a module that has
recently undergone change would receive a very high priority, while a test case that has never
failed (yet) would receive a lower priority, particularly if it primarily concerns a
feature that was not the focus of recent changes.

Regression test selection techniques are based on either code or specifications.
Code-based selection techniques select a test case for execution if it exercises a portion of the
code that has been modified. Specification-based criteria select a test case for execution if it
is relevant to a portion of the specification that has been changed. Code-based regression test
techniques can be supported by relatively simple tools, and they work even when
specifications are not properly maintained. However, like code-based test techniques in
general, they do not scale well from unit testing to integration and system testing.

Among code-based test selection techniques, control-based techniques rely on a
record of the program elements executed by each test case, which may be gathered from an
instrumented version of the program. The structures of the new and old versions of the
program are compared, and test cases that exercise added, modified, or deleted elements are
selected for re-execution. Different criteria are obtained depending on the program model on
which the version comparison is based (e.g., control flow or data flow graph models).
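
A minimal sketch of the selection step, assuming per-test traces of executed elements (nodes or arcs) recorded on the old version, and a set of elements reported as changed by the version comparison; all names below are hypothetical:

```python
# Hedged sketch of control-based regression test selection:
# re-run a test case iff it executed at least one changed element.

def select_for_reexecution(traces, changed):
    """traces: test name -> program elements (nodes/arcs) it executed
    on the OLD version; changed: elements added, modified, or deleted."""
    return [t for t, elems in traces.items() if elems & changed]

traces = {
    "TC1": {"A", "B", "C"},
    "TC2": {"A", "D", "G"},
}
print(select_for_reexecution(traces, changed={"D", "G"}))   # ['TC2']
```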


CFG regression testing techniques compare the annotated control flow graphs of the
two program versions to identify a subset of test cases that traverse modified parts of the
graphs. The graph nodes are annotated with corresponding program statements, so that
comparison of the annotated CFGs detects not only new or missing nodes and arcs, but also
nodes whose changed annotations correspond to small, but possibly relevant, changes in
statements.
The CFG for version 2.0 of cgi_decode is given in Figure 22.4. Differences between
version 2.0 and 1.0 are indicated in gray. In the example, we have new nodes, arcs and paths.
In general, some nodes or arcs may be missing (e.g., when part of the program is removed in
the new version), and some other nodes may differ only in the annotations (e.g., when we
modify a condition in the new version).
CFG criteria select all test cases that exercise paths through changed portions of the
CFG, including CFG structure changes and node annotations. In the example, we would
select all test cases that pass through node D and proceed toward node G and all test cases
that reach node L, that is, all test cases except TC1. In this example, the criterion is not very
effective in reducing the size of the test suite because modified statements affect almost all
paths.

Figure: The control flow graph of function cgi_decode version 2.0. Gray background
indicates the changes from the former version.


If we consider only the corrective modification (nodes X and Y), the criterion is more
effective. The modification affects only the paths that traverse the edge between D and G, so
the CFG regression testing criterion would select only test cases traversing those nodes (i.e.,
TC2, TC3, TC4, TC5, TC8, and TC9). In this case the test suite to be re-executed
includes two-thirds of the test cases of the original test suite. In general, the CFG regression
testing criterion is effective only when the changes affect a relatively small subset of the
paths of the original program, as in the latter case. It becomes almost useless when the
changes affect most paths, as in version 2.0.
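
The changed-node set itself can come from a comparison of the annotated CFGs. A rough sketch, where each CFG is reduced to a map from node label to statement annotation; the labels and annotations below are loosely modeled on the cgi_decode example, not taken from it verbatim:

```python
# Hedged sketch: diff two annotated CFGs to find nodes whose statement
# annotation changed, plus nodes added or removed in the new version.

def changed_nodes(cfg_old, cfg_new):
    """Each CFG is {node_label: statement_annotation}."""
    changed = set()
    for node in cfg_old.keys() | cfg_new.keys():
        if cfg_old.get(node) != cfg_new.get(node):
            changed.add(node)        # new, deleted, or re-annotated node
    return changed

v1 = {"D": "if (c == '+')", "G": "*dptr = ' ';"}
v2 = {"D": "if (c == '+')", "G": "*dptr = ' ';",
      "X": "case 1:", "Y": "ok = 0;"}
print(changed_nodes(v1, v2))   # nodes X and Y (set order may vary)
```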

DF regression selection techniques re-execute test cases that, when executed on the original
program, exercise DU pairs that were deleted or modified in the revised program. Test cases
that executed a conditional statement whose predicate was altered are also selected, since the
changed predicate could alter some old definition-use associations.
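
A sketch of DF selection along the same lines, with each DU pair represented as a (variable, definition node, use node) triple; all data below is hypothetical:

```python
# Hedged sketch of data-flow (DF) regression selection: re-run tests that
# exercised DU pairs deleted or modified in the revised program.

def df_select(exercised, old_pairs, new_pairs):
    """exercised: test -> DU pairs it covered on the old version.
    A DU pair is a (variable, def_node, use_node) triple."""
    affected = old_pairs - new_pairs        # pairs deleted or modified
    return [t for t, pairs in exercised.items() if pairs & affected]

old = {("x", "n1", "n4"), ("y", "n2", "n5")}
new = {("x", "n1", "n4")}                   # the y pair was removed
print(df_select({"TC1": {("y", "n2", "n5")},
                 "TC2": {("x", "n1", "n4")}}, old, new))   # ['TC1']
```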

In contrast to code-based techniques, specification-based test selection techniques do
not require recording the control flow paths executed by tests. Regression test cases can be
identified from the correspondence between test cases and specification items. For example,
when using category partition, test cases correspond to sets of choices, while in finite state
machine model-based approaches, test cases cover states and transitions.


Where test case specifications and test data are generated automatically from a
specification or model, generation can simply be repeated each time the specification or
model changes.

Code-based regression test selection criteria can be adapted for model-based regression test
selection. Consider, for example, the control flow graph derived from the process shipping
order specification. We add the following item to that specification:

Restricted countries: A set of restricted destination countries is maintained, based on current
trade restrictions. If the shipping address contains a restricted destination country, only credit
card payments are accepted for that order, and shipping proceeds only after approval by a
designated company officer responsible for checking that the goods ordered may be legally
exported to that country. The new requirement can be added to the flow graph model of the
specification.

We can identify regression test cases with the CFG criterion that selects all cases that
correspond to international shipping addresses (i.e., test cases TC-1 and TC-5 from the
functional test suite). These test cases were derived using the category-partition
method.
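
The selection step behind this example can be sketched as follows; the test-to-item mapping shown is a hypothetical reconstruction, since only TC-1 and TC-5 are stated to involve international addresses:

```python
# Hedged sketch of specification-based regression selection: no execution
# traces needed, only the mapping from tests to specification items.

def spec_select(tests_to_items, changed_items):
    """Select tests tagged with at least one changed specification item."""
    return sorted(t for t, items in tests_to_items.items()
                  if items & changed_items)

mapping = {
    "TC-1": {"international-address", "credit-card"},
    "TC-2": {"domestic-address", "invoice"},
    "TC-5": {"international-address", "invoice"},
}
# The "restricted countries" change affects international shipping:
print(spec_select(mapping, {"international-address"}))  # ['TC-1', 'TC-5']
```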


5.10 Test Case Prioritization and Selective Execution


Regression testing criteria may select a large portion of a test suite. When a regression
test suite is too large, we must further reduce the set of test cases to be executed. Random
sampling is a simple way to reduce the size of the regression test suite. Better approaches
prioritize test cases to reflect their predicted usefulness. In a continuous cycle of retesting as
the product evolves, high-priority test cases are selected more often than low-priority test
cases. With a good selection strategy, all test cases are executed sooner or later, but the
varying periods result in an efficient rotation in which the cases most likely to reveal faults
are executed most frequently.
Priorities can be assigned in many ways. A simple priority scheme assigns
priority according to execution history: recently executed test cases are given low
priority, while test cases that have not been executed recently are given high priority. In the
extreme, heavily weighting execution history approximates round-robin selection.
Other history-based priority schemes predict fault detection effectiveness. Test
cases that have revealed faults in recent versions are given high priority. Faults are not evenly
distributed, but tend to accumulate in particular parts of the code or around particular
functionality. Test cases that exercised faulty parts of the program in the past often exercise
faulty portions of subsequent revisions.
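
A minimal sketch combining these two history-based signals; the weights and counters below are arbitrary placeholders, not recommended values:

```python
# Hedged sketch: history-based test case prioritization.
# Higher score = run sooner. Signals: staleness (runs since last execution)
# and faults revealed recently; the weights are illustrative only.

def priority(history):
    """history has 'runs_since_executed' and 'recent_faults' counts."""
    return 1.0 * history["runs_since_executed"] + \
           5.0 * history["recent_faults"]

suite = {
    "TC-A": {"runs_since_executed": 0, "recent_faults": 2},
    "TC-B": {"runs_since_executed": 7, "recent_faults": 0},
}
ordered = sorted(suite, key=lambda t: priority(suite[t]), reverse=True)
print(ordered)   # ['TC-A', 'TC-B']: past faults outweigh staleness here
```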
Structural coverage leads to a set of priority schemes based on the elements covered
by a test case. We can give high priority to test cases that exercise elements that have not
recently been exercised. Both the number of elements covered and the "age" of each element
(time since that element was covered by a test case) can contribute to the prioritization.
Structural priority schemes produce several criteria depending on which elements we
consider: statements, conditions, decisions, functions, files, and so on. The choice of the
element of interest is usually driven by the testing level. Fine-grain elements such as
statements and conditions are typically used in unit testing, while in integration or system
testing one can consider coarser-grain elements such as methods, features, and files.
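
A sketch of such a coverage-"age" scheme at one granularity (here, functions); the age bookkeeping is assumed to be updated after every test session, and all data is hypothetical:

```python
# Hedged sketch: prioritize tests that cover elements not exercised recently.
# age[e] = number of test sessions since element e was last covered.

def coverage_age_score(covered_elements, age):
    return sum(age.get(e, 0) for e in covered_elements)

age = {"funcA": 9, "funcB": 0, "funcC": 4}
tests = {"TC1": {"funcA", "funcC"}, "TC2": {"funcB"}}
ranked = sorted(tests, key=lambda t: coverage_age_score(tests[t], age),
                reverse=True)
print(ranked)   # ['TC1', 'TC2']: TC1 touches the stalest elements
```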


CHAPTER -2
LEVELS OF TESTING, INTEGRATION TESTING

1. Traditional view of testing levels

The traditional model of software development is the waterfall model, which is
illustrated in Figure 11.1. It is sometimes drawn as a V, as in Figure 11.2, to emphasize how
the basic levels of testing reflect the early waterfall phases. In this view, information
produced in one of the development phases constitutes the basis for test case identification at
that level. Nothing is controversial here: we certainly would hope that system test cases are
clearly correlated with the requirements specification, and that unit test cases are derived
from the detailed design of the unit. On the upper left side of the waterfall, the tight what/how
cycles are important. They underscore the fact that the predecessor phase defines what is
to be done in the successor phase. When complete, the successor phase states how it
accomplishes "what" was to be done.
Of the three main levels of testing (unit, integration, and system), unit testing is best
understood. System testing is understood better than integration testing, but both need
clarification. The bottom–up approach sheds some insight: test the individual components,
and then integrate these into subsystems until the entire system is tested. System testing
should be something that the customer (or user) understands, and it often borders on customer
acceptance testing. Generally, system testing is functional instead of structural; this is mostly
due to the lack of higher-level structural notations.

Figure: The waterfall life cycle


The waterfall model is closely associated with top-down development and design by
functional decomposition. The end result of preliminary design is a functional decomposition
of the entire system into a tree-like structure of functional components. Figure 1.2 contains a
partial functional decomposition of our automated teller machine (ATM) system.
With this decomposition, top-down integration would begin with the main program,
checking the calls to the three top-level procedures (Terminal I/O, ManageSession, and
ConductTransaction). Following the tree, the ManageSession procedure would be tested
next, and then the CardEntry, PIN Entry, and SelectTransaction procedures. In top-down
integration, the actual code for low-level units is replaced by a stub, which is a throwaway
piece of code that takes the place of the actual code. Bottom-up integration is the opposite
sequence, starting with the CardEntry, PIN Entry, and SelectTransaction procedures and
working up toward the main program; here, units at higher levels are replaced by drivers that
emulate the procedure calls. The "big bang" approach simply puts all the units together at
once, with no stubs or drivers. Whichever approach is taken, the goal of traditional
integration testing is to integrate previously tested units with respect to the functional
decomposition tree.
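
To make the stub/driver distinction concrete, here is a hedged sketch in Python; the unit names are loose stand-ins for the SATM design units, which exist only at the design level:

```python
# Hedged sketch of stubs (top-down) and drivers (bottom-up).
# All names are illustrative, not actual SATM code.

# --- Top-down: test ManageSession before the real PinEntry exists ---
def pin_entry_stub(card):
    """Throwaway stand-in for the real PinEntry unit."""
    return "PIN_OK"            # canned answer, just enough for the caller

def manage_session(card, pin_entry=pin_entry_stub):
    if pin_entry(card) != "PIN_OK":
        return "EJECT_CARD"
    return "SELECT_TRANSACTION"

assert manage_session(card="valid") == "SELECT_TRANSACTION"

# --- Bottom-up: test a real low-level unit through a driver ---
def select_transaction(button):
    return {"B1": "BALANCE", "B2": "DEPOSIT", "B3": "WITHDRAW"}[button]

def driver():
    """Throwaway caller emulating the units above SelectTransaction."""
    for button, expected in [("B1", "BALANCE"), ("B3", "WITHDRAW")]:
        assert select_transaction(button) == expected

driver()
```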

2. Alternative Life Cycle Models


Since the early 1980s, practitioners have devised alternatives in response to
shortcomings of the traditional waterfall model just mentioned. Common to all of these
alternatives is the shift away from the functional decomposition to an emphasis on iteration
and composition. Decomposition is a perfect fit both to the top-down progression of the
waterfall model and to the bottom-up testing order, but it relies on one of the major
weaknesses of waterfall development cited by Agresti (1986): the need for "perfect
foresight." Functional decomposition can only be well done when the system is completely
understood, and it promotes analysis to the near exclusion of synthesis. The result is a very
long separation between requirements specification and a completed system, and during this
interval, no opportunity is available for feedback from the customer. Composition, on the
other hand, is closer to the way people work: start with something known and understood,
then add to it gradually, and maybe remove undesired portions.

2.2.1 Waterfall Spin-Offs

Figure: Life cycle with build sequence

There are three mainline derivatives of the waterfall model: incremental development,
evolutionary development, and the spiral model (Boehm, 1988). Each of these involves a
series of increments or builds, as shown in Figure 11.3. Within a build, the normal waterfall
phases from detailed design through testing occur, with one important difference: system
testing is split into two steps, regression and progression testing.


The main impact of the series of builds is that regression testing becomes necessary.
The goal of regression testing is to ensure that things that worked correctly in the previous
build still work with the newly added code. Regression testing can either precede or follow
integration testing, or possibly occur in both places. Progression testing assumes that
regression testing was successful and that the new functionality can be tested. (We like to
think that the addition of new code represents progress, not a regression.) Regression testing
is an absolute necessity in a series of builds because of the well-known ripple effect of
changes to an existing system. (The industrial average is that one change in five introduces a
new fault.)
The differences among the three spin-off models are due to how the builds are
identified. In incremental development, the motivation for separate builds is usually to flatten
the staff profile. With pure waterfall development, there can be a huge bulge of personnel for
the phases from detailed design through unit testing. Many organizations cannot support such
rapid staff fluctuations, so the system is divided into builds that can be supported by existing
personnel. In evolutionary development, the presumption of a build sequence is still made,
but only the first build is defined. On the basis of that, later builds are identified, usually in
response to priorities set by the customer/user, so the system evolves to meet the changing
needs of the user.
This foreshadows the customer-driven tenet of the agile methods. The spiral model is
a combination of rapid prototyping and evolutionary development, in which a build is defined
first in terms of rapid prototyping and then is subjected to a go/no-go decision based on
technology-related risk factors. From this, we see that keeping preliminary design as an
integral step is difficult for the evolutionary and spiral models. To the extent that this cannot
be maintained as an integral activity, integration testing is negatively affected. System testing
is not affected.

2.2.2 Specification-Based Life Cycle Models


Rapid prototyping has no new implications for integration testing; however, it has
very interesting implications for system testing. Where are the requirements? Is the last
prototype the specification? How are system test cases traced back to the prototype? One
good answer to questions such as these is to use the prototyping cycles as information-
gathering activities and then produce a requirements specification in a more traditional
manner. Another possibility is to capture what the customer does with the prototypes, define
these as scenarios that are important to the customer, and then use these as system test cases.
These could be precursors to the user stories of the agile life cycles. The main contribution of
rapid prototyping is that it brings the operational (or behavioral) viewpoint to the
requirements specification phase. Usually, requirements specification techniques emphasize
the structure of a system, not its behavior. This is unfortunate because most customers do not
care about the structure, and they do care about the behavior.

Figure 1.5: Rapid Prototyping life cycle


Executable specifications (Figure 1.6) are an extension of the rapid prototyping
concept. With this approach, the requirements are specified in an executable format (such as
finite state machines, StateCharts, or Petri nets). The customer then executes the
specification to observe the intended system behavior and provides feedback as in the rapid
prototyping model.


Once again, this life cycle has no implications for integration testing. One big
difference is that the requirements specification document is explicit, as opposed to a
prototype. More important, it is often a mechanical process to derive system test cases from
an executable specification. Although more work is required to develop an executable
specification, this is partially offset by the reduced effort to generate system test cases. Here
is another important distinction: when system testing is based on an executable specification,
we have an interesting form of structural testing at the system level.
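
To see why deriving system test cases can be mechanical, suppose the executable specification is a finite state machine: candidate system test cases are simply event paths through it. A toy sketch (states and events are invented, not the real SATM machine):

```python
# Hedged sketch: enumerate event sequences (candidate system test cases)
# as bounded-length paths through an executable FSM specification.

FSM = {  # state -> {event: next_state}; a toy spec for illustration
    "Idle": {"card_inserted": "AwaitPIN"},
    "AwaitPIN": {"pin_ok": "AwaitTrans", "pin_bad": "Idle"},
    "AwaitTrans": {"done": "Idle"},
}

def paths(state, depth, prefix=()):
    """Yield event sequences of the given length starting from 'state'."""
    if depth == 0:
        yield prefix
        return
    for event, nxt in FSM.get(state, {}).items():
        yield from paths(nxt, depth - 1, prefix + (event,))

for p in paths("Idle", 3):
    print(p)   # each tuple of events is a candidate system test case
```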

Figure 1.6: Executable specification

3 The SATM System


• As a higher-level example, consider the Simple Automatic Teller Machine (SATM)
system. It is built around the fifteen screens shown in the figure below.


• The SATM terminal is sketched in the figure below; in addition to the display screen,
there are function buttons B1, B2, and B3, a digit keypad with a cancel key, slots for
printer receipts and ATM cards, and doors for deposits and cash withdrawals.
• The SATM system is described here in two ways: with a structured analysis
approach, and with an object-oriented approach.

Figure: Screens for the SATM System


The structured analysis approach to requirements specification is still widely used. It
enjoys extensive CASE tool support as well as commercial training, and is described in
numerous texts. The technique is based on three complementary models: function, data, and
control. Here we use a dataflow diagram for the functional model, the entity/relationship
model for data, and finite state machine models for the control aspect of the SATM system.
The functional and data models were drawn with the Deft CASE tool from Sybase, Inc. That
tool identifies external devices with lowercase letters. Elements of the functional
decomposition are identified with numbers. The open and filled arrowheads on flow arrows
signify whether the flow item is simple or compound. The portion of the SATM system
shown here pertains generally to the personal identification number (PIN) verification
portion of the system.

Figure: The SATM Terminal


The Deft CASE tool distinguishes between simple and compound flows, where
compound flows may be decomposed into two other flows, which may themselves be
compound. Graphically, simple flows have filled arrowheads while compound flows have
open arrowheads.


Figure: Context diagram of the SATM System


Figure: Level 1 Dataflow diagram of the SATM System

The compound flow Screen is composed of:
Screen 1: Welcome
Screen 2: Enter PIN
Screen 3: Wrong PIN
Screen 4: PIN failed, card retained
Screen 5: Select transaction type
Screen 6: Select account type
Screen 7: Enter amount
Screen 8: Insufficient funds
Screen 9: Cannot dispense that amount
Screen 10: Cannot process withdrawals


Screen 11: Take your cash
Screen 12: Cannot process deposits
Screen 13: Put deposit envelope in slot
Screen 14: Another transaction?
Screen 15: Thanks; take card and receipt

Figure: Entity/relationship model of the SATM system

 The figure is an (incomplete) entity/relationship diagram of the major data structures in
the SATM system: Customers, Accounts, Terminals, and Transactions.
 Good data modeling practice dictates postulating an entity for each portion of the
system that is described by data that is retained (and used by functional components).
 Among the data the system would need for each customer are the customer's
identification and personal account number (PAN); these are encoded into the
magnetic strip on the customer's ATM card.


 We would also want to know information about a customer's account(s), including
the account numbers, the balances, the type of account (savings or checking), and the
personal identification number (PIN) of the account.
 At this point, we might ask why the PIN is not associated with the customer, and the
PAN with an account.
 Part of the E/R model describes relationships among the entities: a customer HAS
account(s), a customer conducts transaction(s) in a SESSION, and, independent of
customer information, transaction(s) OCCUR at an ATM terminal.
 The single and double arrowheads signify the singularity or plurality of these
relationships: one customer may have several accounts, and may conduct none or
several transactions.
 This is problematic for testers, because test cases are concerned with behavior, not
with structure.
 As a supplement, the functional and data information are linked by a control model;
here we use a finite state machine.
 Control models represent the point at which structure and behavior intersect; as such,
they are of special utility to testers.

Figure: Upper-level SATM finite state machine



 The upper level finite state machine in Figure 4.12 divides the system into states that
correspond to stages of customer usage.
 Other choices are possible; for instance, we might choose states to be screens being
displayed (this turns out to be a poor choice).
 Finite state machines can be hierarchically decomposed in much the same way as
dataflow diagrams.
 The decomposition of the Await PIN state is shown in Figure 4.13. In both of these
figures, state transitions are caused either by events at the ATM terminal (such as a
keystroke) or by data conditions (such as the recognition that a PIN is correct)

Figure: PIN Entry Finite State Machine
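
A simplified, assumed reading of the PIN entry machine as executable code shows how such a control model drives test identification; the three-attempt limit and the screen numbers follow the SATM screen list above, while the state names are invented:

```python
# Hedged sketch: the PIN-entry control model rendered as code.
# Simplified: three attempts allowed, then the card is retained.

def pin_entry(attempts, correct_pin):
    """attempts: the PIN strings the customer types, in order."""
    for tries, entered in enumerate(attempts, start=1):
        if entered == correct_pin:
            return "AWAIT_TRANSACTION"        # Screen 5
        if tries == 3:
            return "CARD_RETAINED"            # Screen 4
        # otherwise show Screen 3 (wrong PIN) and wait for re-entry
    return "AWAIT_PIN"                        # customer stopped typing

assert pin_entry(["1111", "2468"], correct_pin="2468") == "AWAIT_TRANSACTION"
assert pin_entry(["1", "2", "3"], correct_pin="9") == "CARD_RETAINED"
```

Each distinct path through this machine (correct first try, correct after a retry, three failures) is a natural candidate test case, which is exactly why testers find the control model so useful.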

Here is the functional decomposition carried further in outline form; the numbering
scheme preserves the levels of the components.


 During design, some of the original decisions may be revised based on additional
insights and more detailed requirements (for example, performance or reliability
goals).
 The end result is a functional decomposition such as the partial one shown in the
structure chart in the figure below.
 Notice that the original first-level decomposition into four subsystems is continued:
the functionality has been decomposed to lower levels of detail.

Figure: A decomposition tree for the SATM System


4. Separating Integration and System Testing

We are now almost in a position to make a clear distinction between integration and
system testing.
We need this distinction to avoid gaps and redundancies across levels of testing, to clarify
appropriate goals for these levels, and to understand how to identify test cases at different
levels.
• This whole discussion is facilitated by a concept essential to all levels of testing: the
notion of a "thread."
• A thread is a construct that refers to execution-time behavior; when we test a system,
we use test cases to select (and execute) threads.
• We can speak of levels of threads: system threads describe system-level behavior,
integration threads correspond to integration-level behavior, and unit threads
correspond to unit-level behavior.
• A thread is determined by a data context and a sequence of port-level input events,
such as those in the context diagram of the SATM system.


• We could also think of a thread as a sequence of source statements, or as a sequence
of machine instructions.
• The point is, threads are a generic concept, and they exist independently of how a
system is described and developed.
• We certainly expect that test cases at various levels can be traced back to
developmental information.
• While this is necessary, it fails to be sufficient: we will finally make our desired
separation in terms of behavioral constructs.

4.1 Structural Insights

 If we follow the definition of the SATM system, we could first postulate that system
testing should make sure that all fifteen display screens have been generated. (This is
an output-domain-based, functional view of system testing.)
 The entity/relationship model also helps: the one-to-one and one-to-many
relationships help us understand how much testing must be done.
 The control model (in this case, a hierarchy of finite state machines) is the most
helpful.
 Expressing system test cases in terms of paths through the finite state machine(s)
yields a system-level analog of structural testing.
 The functional models (dataflow diagrams and structure charts) move in the direction
of levels because both express a functional decomposition.
 The structure chart helps identify where system testing ends and integration testing
starts.
 For instance, the following threads are all clearly at the system level (thread 2 is
sketched as a port event sequence after the list):
1. Insertion of an invalid card. (This is probably the "shortest" system thread.)
2. Insertion of a valid card, followed by three failed PIN entry attempts.
3. Insertion of a valid card, a correct PIN entry attempt, followed by a balance
inquiry.
4. Insertion of a valid card, a correct PIN entry attempt, followed by a deposit.
5. Insertion of a valid card, a correct PIN entry attempt, followed by a withdrawal.
6. Insertion of a valid card, a correct PIN entry attempt, followed by an attempt to
withdraw more cash than the account balance.
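
As promised above, thread 2 can be written down directly as a sequence of port events; the port names below are a loose reading of the SATM terminal description, and the level check anticipates the port boundary rule discussed in Section 4.2:

```python
# Hedged sketch: a system-level thread expressed purely as port events
# visible at the SATM port boundary (card slot, keypad, screen).

thread_2 = [
    ("card_slot", "insert valid card"),
    ("screen", "show Screen 2: enter PIN"),
    ("keypad", "wrong PIN"), ("screen", "show Screen 3: wrong PIN"),
    ("keypad", "wrong PIN"), ("screen", "show Screen 3: wrong PIN"),
    ("keypad", "wrong PIN"),
    ("screen", "show Screen 4: PIN failed, card retained"),
]

def is_system_level(thread, port_boundary):
    """Level check: every event in a system thread occurs at a port."""
    return all(port in port_boundary for port, _ in thread)

print(is_system_level(thread_2, {"card_slot", "keypad", "screen"}))  # True
```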


4.2 Behavioral Insights

• Every system has a port boundary; the port boundary of the SATM system includes
the digit keypad, the function buttons, the screen, the deposit and withdrawal doors,
the card and receipt slots, and so on.

• Each of these devices can be thought of as a "port," and events occur at system ports.

• The port input and output events are visible to the customer, and the customer very
often understands system behavior in terms of sequences of port events.

• This fits our understanding of a test case, in which we specify pre-conditions, inputs,
outputs, and post-conditions.

• With this mandate we can always recognize a level violation: if a test case (thread)
ever requires an input (or an output) that is not visible at the port boundary, the test
case cannot be a system-level test case (thread).
