Performance Testing of Software Systems: January 1998
4. The Role of Requirements and Specifications in Performance Testing

In order to do software performance testing in a meaningful way, it is also necessary to have performance requirements, provided in a concrete, verifiable manner. These should be explicitly included in a requirements or specification document, and might be provided in terms of throughput or stimulus-response time, and might also include system availability requirements. Frequently, unfortunately, no such requirements are provided, which means that there is no precise way of determining whether or not the performance is acceptable. In addition, one of the most serious problems with performance testing is making sure that the stated requirements can actually be checked to see whether or not they are fulfilled. Just as it is not very useful, when testing a software system for correct functionality, to select inputs for which it is not possible to determine whether or not the resulting output is correct, when doing performance testing it is just as important to write requirements that are verifiable. It is easy enough to write a performance requirement for a compiler such as: Every module must be compilable in less than one second. Although it might be possible to show that this requirement is not satisfied by compiling a module that takes longer than one second to compile, the fact that the compiler has been tested on many modules, all of which compiled correctly in less than one second, does not guarantee that the requirement has been satisfied. Thus, even plausible-sounding performance requirements might be unverifiable.

A more satisfactory type of performance requirement would state that the system should have a CPU utilization rate that does not exceed 50% when the system is run with an average workload. Assuming that a benchmark has been created that accurately reflects the average workload, it is possible to test whether or not this requirement has been satisfied. Another example of a verifiable performance requirement might be that the CPU utilization rate must not exceed 90% even when the system is being run with a stress load equal to the heaviest load ever encountered during any monitored period. Similarly, it is possible to test for things like maximum queue length or total throughput required under specified loads.

Since performance requirements must be included for average system loads and peak loads, it is important to specify those as early as possible, preferably in the requirements document. As discussed above, detailed operational profile data is frequently available, given that analysis of recorded system traffic data is performed. This is frequently a significant task that needs to be planned for, but one that is extremely important both for functionality testing and performance testing. If detailed traffic data is not available for the system or for newly-implemented features, then estimates should be made by the most knowledgeable team members. If a business case has been made for the addition of new features, then that might include the necessary estimates. Of course, as additional information is acquired, a refinement of this data should be made.

5. Previous Work

In an earlier paper [1], a software performance testing approach was presented. The goal in that work was to compare the performance of an existing production platform and a proposed replacement architecture, to assess whether the new platform would be adequate to handle the required workload. The traditional approach to such a comparison is to develop software for the proposed platform, build the new architecture, and collect performance measurements on both the existing system in production and the new system in the development environment.

In [1], in contrast, Avritzer and Weyuker introduced a new way to design an application-independent workload for doing such a performance evaluation. It was determined that for the given project, the quantity of software and the system availability requirements made it impossible to port the system to the new platform in order to do the performance testing. Therefore, a novel approach was designed in which no software was ported at all. Instead, a synthetic workload was devised by tuning commercially-available benchmarks to approximate the resource usage of the actual system.

Motivating this work was the insight that, given that the project had systematically collected system usage data, this information represented a major asset that could be exploited. In particular, it was not necessary to run the actual system on the new platform, only to run software that behaved the way the system did from the point of view of resource usage. It was recognized that it was totally irrelevant what the software was actually doing, provided that it used resources in ways similar to the software to be ported. The result of testing the software using this approach was a decision not to port the system to the proposed platform, because the testing indicated that although the new platform would be adequate to handle average workloads, it would not be able to handle the peak loads that the system was likely to encounter.

This technique proved to be a very efficient way to arrive at this decision, and saved the project significant resources, both in terms of personnel costs for porting software that would ultimately have had to have been back-ported to the existing platform, and in terms of the very significant savings made by not purchasing what this testing approach determined would have been inappropriate (and very expensive) hardware for the system. In what follows we will discuss a related approach to software performance testing that we have found useful when testing the performance of an existing system redeployed on a new platform.

6. A Case Study

In this section, we describe how we used our performance testing approach to test a gateway system that is the middle layer of a 3-tiered client/server transaction processing application. The system accepts input from a caller and returns information that resides on a host server. The input may be in the form of dual tone multifrequency signals (touch tones) or limited vocabulary speech. Output is in the form of prerecorded digitized or computer-generated speech.

The client, or first tier, of this architecture consists of Voice Response Units (VRUs). These are computer systems that terminate the call. The server, or third tier, consists of mainframe host computers that contain the information requested by end-users. The middle layer, or second tier, consists of the gateways: computer systems that allow the clients to interface with the servers. To be able to communicate with a variety of host servers, a gateway supports three different network protocols: SNA/3270, TN3270 and TCP/IP.

From an architectural standpoint, the purpose of the gateway is to allow concentration of network connections and to off-load some of the processing that would otherwise have to be performed by each VRU. The gateways and VRUs share a LAN. The host servers are remote systems. One gateway provides service to many VRUs. Conversely, a VRU may access the services provided by multiple gateways. One gateway may also connect to multiple remote hosts. The system architecture is shown in Figure 1.

[Figure 1: system architecture — customer host computers connected to the VRUs through the gateways (diagram not reproduced)]

An Application Script running on the VRU interacts with the end-user. When the application needs to perform a host transaction, an inter-process communications (IPC) message is sent to the appropriate server process running on the gateway. For our purposes, a transaction is defined as a request sent to the host and the reply received from the host. The server process on the gateway formats the request according to the host requirements and sends it to the host. When a reply is received, the process on the gateway parses the returned host data, formats an IPC message, and sends the reply to the application on the VRU. The high-level software architecture is shown in Figure 2.

[Figure 2: high-level software architecture — the VRU Application Script exchanges Transaction Messages over the LAN with the gateway's SNA/3270, TN3270, and TCP/IP host interfaces (diagram not reproduced)]

The existing gateway systems are based on PC hardware (Intel Pentium processors, ISA bus) running the UNIX operating system. To improve performance and reliability, and to reduce maintenance costs, it was decided to upgrade the hardware platform to mid-range systems that use multiple PA-RISC processors, have large amounts of RAM, and are capable of supporting many physical I/O interfaces.

7. Performance Testing Objectives

When the new platform was purchased, the only available information regarding its required performance came from an analytical study done by the project team. This study estimated the expected number of transactions that should be processable by the new platform when it is connected to host systems on the SNA/3270 and TN3270 protocols. It determined that 56 links, each capable of handling 56 Kbps, were needed, and that the system would have to function with 80% of the links active, each handling transactions averaging 1620 bytes. Although the vendor had given assurances that the new platform would be able to handle the required workload, they did not provide adequate data to substantiate these claims. What they did supply was performance data from the use of these systems as database servers. However, a gateway system provides functionality that is qualitatively different from that provided by a database server, and therefore we felt that the performance data supplied by the vendor would not be indicative of the behavior that we could expect in our target environment.

Given that the new platform contained new software components whose performance behavior was largely unknown, and given that the performance data supplied by the vendor was not directly applicable to a system used as a gateway, the project team decided to test the performance of the gateway prior to deployment, with a configuration similar to the one that would be used in production. An important first step was to identify the objectives of the performance testing activity. Initially we thought that the objective should be to validate the requirements set by the analytical study; that is, to directly address the question of whether or not the gateway could effectively handle the expected number of transactions per second. After careful consideration, however, we decided that performance testing should help us answer the following questions:

1. Does the new gateway have enough resources to support increased end-user volume within pre-specified budgets of resource consumption and latency?
2. At what volume of end-users do these resources become inadequate?
3. What components of the hardware and software architecture of the gateway limit performance? Where are the bottlenecks?

8. Designing the Performance Test Cases

As we mentioned earlier, the project team decided that it would be best to do performance testing on a configuration similar to the one used in production. To accomplish this, we had to borrow testing facilities from the vendor. Access to these facilities was strictly limited to one week. As a result, test cases had to be carefully planned.
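For context, the link-level figures from the analytical study (Section 7) imply a rough upper bound on the transaction rate the gateway was expected to sustain. The following back-of-envelope calculation is our own arithmetic, not taken from the study; it assumes the active links run at full capacity and ignores protocol overhead:

```python
# Rough upper bound implied by the analytical study's figures:
# 56 links at 56 Kbps, 80% active, transactions averaging 1620 bytes.
LINKS = 56
LINK_BPS = 56_000           # 56 Kbps per link
ACTIVE_FRACTION = 0.8
TX_BYTES = 1620

active_links = LINKS * ACTIVE_FRACTION            # 44.8 links carrying traffic
tx_per_link_per_sec = LINK_BPS / (TX_BYTES * 8)   # ~4.32 transactions/s per saturated link
total_tx_per_sec = active_links * tx_per_link_per_sec

print(f"~{total_tx_per_sec:.0f} transactions/second")  # prints "~194 transactions/second"
```

Any realistic target would sit below this ceiling, but it gives a scale against which the measured results in Section 9 can be read.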
[Figure 3: Gateway Software Architecture (diagram not reproduced)]
It was decided that our performance testing had to address the questions listed in the last section for both average and peak workloads. This meant that we had to determine what these loads were. Experience told us that a usage scenario should be defined in terms of those parameters that would most significantly affect the overall performance of the system. Therefore, our immediate priority was to identify such parameters. To do this, we first looked at the software architecture of the gateway and its relationship to the VRUs and to the customer host systems. We felt that the software architecture of a system [3] could provide important insights that could be useful when designing our performance test suite.

By studying the software architecture of the gateway, we observed that its performance is closely related to the performance of three processes. The first process was a new software component that did not exist on the old platform. There is one such process on the gateway, and its purpose is to communicate on one end with all 3270-based hosts and on the other end with the different SNA/3270 emulators. The second process was a modified version of one found on the old platform. Modifications were made so that this process could work with the new process mentioned above; no additional functionality was incorporated. There is one of these processes for each link to a host, and each of these processes may interact with up to 25 different emulators. Each emulator supports a maximum of 10 Logical Units (LUs). The third process interfaces directly with the VRU. There is one of these processes on the gateway, and it has been ported without any modifications from the old platform.

Since the first process deals with connections to customer host systems and with processes that communicate with the end-user, and the second also deals with those processes that communicate with the end-user, we reasoned that the number of simultaneously active connections to customer host systems and the number of simultaneous callers should be two of the parameters. Additional parameters that would noticeably affect the overall performance of the gateway were the number of transactions per call, transaction sizes, call holding times, call interarrival rate, and traffic mixture (that is, the percentage of SNA/3270 vs. TN3270 vs. TCP/IP).

To determine realistic values for these parameters, we collected and analyzed data from the current production gateways. We used tracing to record certain types of information for each transaction, including such things as time-stamps, response time, and LU number. We did this for several days with 6 different applications at two different sites. The data gave us a good feel for the usage levels. Based on this information, and using some estimated values from past experience, we determined that:

- Average usage was typically about forty LUs per link, with 24 simultaneous callers, 3 transactions per call, a 2.5 minute call holding time, and a 5 second call interarrival rate. This corresponds to approximately one transaction every three seconds for each link.
- Heavy usage was determined to be one hundred twenty LUs per link, with 120 simultaneous callers, 4 transactions per call, a 3.5 minute call holding time, and a 5 second call interarrival rate. This is equivalent to approximately 2.2 transactions per second for each link.

We were unable to obtain precise data regarding transaction sizes and traffic mixture. For transaction sizes, the only available information we had was that the minimum and maximum sizes were 100 and 1920 bytes, respectively. As a first approximation, our solution was to run the test cases for both usage scenarios assuming that each transaction was 1100 bytes (100 bytes sent to the host and 1000 bytes received from the host). This was the number assumed in the analytical study and, based on experience, it seemed to be a reasonable assumption. Alternatively, we could have assumed a non-uniform distribution of transaction sizes, such that different segments of the transaction population would have different transaction sizes. For example, for the typical usage scenario we could assume that 25% of the transactions were 100 bytes, 25% were 1920 bytes, and the remaining 50% were equally distributed among transactions of 500 bytes, 1000 bytes, and 1500 bytes. For the heavy usage scenario we could assume that 25% of the transactions were 1000 bytes, 25% were 1500 bytes, and 50% were 1920 bytes.

With respect to the traffic mixture, our primary concern was the SNA/3270 protocol. All the traffic on the existing gateway used the SNA/3270 protocol. Support for TN3270 was new, and the system engineers anticipated that, at least initially, only a very small number of customer host systems would be using that protocol. Furthermore, both TN3270 and SNA/3270 used some of the same processes, so we did not expect any performance surprises from TN3270 traffic. As for the TCP/IP protocol, its implementation does not require significant resources on the gateway, and we already had data indicating that the effect of the TCP/IP protocol on the overall performance of the gateway was negligible. Thus, we decided to focus our testing effort on SNA/3270. Had the sensitivity of the traffic distribution between SNA/3270 and TN3270 been significant, we would have created different test cases by varying the distributions: starting with SNA/3270 at 90% and TN3270 at 10%, and decreasing and increasing the distributions by 10% until SNA/3270 was at 10% and TN3270 at 90%. These nine tests, ordered in decreasing order of importance, would be run on both the average and high usage scenarios, and would provide us with valuable insights on the role that each protocol plays in the consumption of resources, as well as its overall effect on the performance of the gateway.

9. Performance Testing and Results

To implement the test cases representing these scenarios, the project team developed two programs to simulate callers. These programs send transaction request messages to the gateway server process and receive transaction replies. Both programs simulate a number of simultaneous callers, performing transactions of a size specified on the command line. One program sends transaction requests as quickly as possible; i.e., as soon as the reply to a request is received, another request is sent out. The other program more closely simulates human users by accepting parameters for call duration and number of transactions per call, and using them to add "think time" and simulate realistic call arrival rates. For a host application, we used an existing simple application that had been used in the past for functionality testing. Since all customer applications require similar processing cycles from the gateway, the choice of customer application was not important for our testing.

Testing was conducted on a hardware configuration similar to the one used in the field, and it revealed some very surprising results regarding the performance of the gateway. We learned that for both the average and heavy usage scenarios, the gateway could handle only a small fraction of the transactions estimated to be required by the analytical study. This was disturbing, since we had been assured by the vendor that the hardware was capable of handling this load. Luckily, we were able to quickly identify three problems that caused very high CPU utilization and contributed to the poor performance of the gateway. The first problem was related to the polling frequency of the gateway by the host. The second problem was related to the algorithm used by the gateway server process to search the various LUs for the arrival of new data. The third problem was related to the inefficient implementation of some library modules used by the SNA server process. The first two problems were resolved fairly quickly, and we were able to observe substantial improvement in the performance of the gateway. The solution to the third problem is much more subtle and involves software modifications that must be carefully designed. It is expected that, when completed, these modifications will further improve the performance of the gateway.

[...] with customers dependent on it, than finding out about them when users' needs cannot be met.
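The paper does not list the caller-simulation programs of Section 9; the sketch below is a minimal, hypothetical reconstruction of the closed-loop ("as fast as possible") driver. All names are our own, and the gateway round trip is stubbed out, since the real programs would send IPC messages to the gateway server process:

```python
import threading

def send_transaction(request: bytes) -> bytes:
    """Stub for the gateway round trip. A real driver would send an IPC
    message to the gateway server process and block for the host reply."""
    return b"\x00" * 1000  # pretend the host replied with 1000 bytes

def caller(tx_per_call: int, tx_size: int, counts: list, lock: threading.Lock) -> None:
    """One simulated caller: issue the next request as soon as a reply
    arrives. (The second program described in Section 9 would instead
    sleep here to model think time and call holding time.)"""
    request = b"\x00" * tx_size
    for _ in range(tx_per_call):
        reply = send_transaction(request)
        with lock:
            counts.append(len(reply))

def simulate(callers: int, tx_per_call: int, tx_size: int) -> int:
    """Run the given number of simultaneous callers; return the total
    number of completed transactions."""
    counts: list = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=caller, args=(tx_per_call, tx_size, counts, lock))
        for _ in range(callers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(counts)

# The average-usage scenario: 24 simultaneous callers, 3 transactions per call.
total = simulate(callers=24, tx_per_call=3, tx_size=100)
print(total)  # 72 transactions in all
```

In a real run, timing the `simulate` call and dividing by the transaction count gives the achieved throughput to compare against the scenario targets.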