Test Plan

Purpose of System Testing

The purpose of system testing is to ensure that a system meets its specification and
any non-functional requirements (such as stability and throughput) that have been
agreed with its users. In theory, R-GMA could be system tested independently
because both its specification and requirements are public. In reality, only we
currently have the in-depth understanding of the specification that is required to do it
properly.

Scope
This document covers only the R-GMA system as described in the current R-GMA
System Specification. It does not cover the R-GMA tools (Browser, Command Line,
Site Publisher, Service Tool, Flexible Archiver, Glue Archiver and GIN). Each of
these tools should have its own system tests, located in the same CVS module as the
rest of the tool's files, but these system tests are not discussed further here.

Overview
I think the R-GMA system tests should consist of a set of top-level command-line
scripts together with any supporting files, located in org.glite.rgma.system-tests. The
existing code used by rgma-client-check should be moved to org.glite.rgma.base
and any other old tests discarded (I think that’s already been done). Each test script
takes no parameters, runs without user intervention and reports a simple pass/fail to
standard output on completion. Any diagnostic output is written to standard error. The
scripts are consistently named and described in a Test Specification document (to be
written). It is extremely important that we keep the test structure and the test scripts as
simple, consistent and clean as possible, because they will have to be continually
maintained in line with any externally visible changes to the system. The scripts are
only for our use: I do not think that they should be distributed with R-GMA.
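
To make that contract concrete, here is a minimal sketch of what a test driver could look like (everything here is illustrative; none of the names are fixed yet):

    // Hypothetical skeleton of a test driver obeying the proposed contract:
    // no parameters, no user intervention, "PASS" or "FAIL" on standard output,
    // and all diagnostics on standard error.
    public class ExampleSystemTest {
        public static void main(String[] args) {
            try {
                runTest();                         // the real test body goes here
                System.out.println("PASS");
            } catch (Throwable t) {
                t.printStackTrace(System.err);     // diagnostics to stderr only
                System.out.println("FAIL");
            }
        }

        private static void runTest() throws Exception {
            // call the sub-system under test and throw if anything is wrong
        }
    }

Whether the top-level script is a shell wrapper around such a program or the program itself is a packaging detail; the important thing is the uniform reporting.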

For the purposes of system testing, R-GMA can be broken down into chunks each of
which can be tested independently without any modification to the installed system
(apart from user-configurable properties). These chunks are just the four User APIs,
the Java System API and the seven Services. In this document, I will refer to them as
sub-systems (I’m considering R-GMA in isolation from gLite here). Each test script
will test one or more complete sub-systems, with the remaining sub-systems
sometimes replaced by dummy versions to aid testing. It is essential that the sub-
systems being tested are installed exactly as a user would install them, and are not
modified in any way (and that no special "test paths" through them are used). The idea
is to start with simple single-sub-system tests and work up gradually to running the
full system: that way, silly bugs can be detected by the easy tests to reduce the
number of times the more complicated and time-consuming ones need to be run.

System testing needs to be a formal process for us to have any faith in it, so in the
next section, I propose a formal procedure for doing it.

Test Procedure
System testing needs one person to oversee the whole process and sign it off, so I
assume that someone is nominated as the test manager for each round of testing.
System testing needs to be planned into the development cycle, so I also assume that
we're working to a 3-month release cycle, and that sometime near the start of that
cycle we agree the enhancements and bug fixes that we plan to include in the next
release. Given that information, the test manager can identify which of the tests in the
test specification are going to need to be run, and if necessary, get them modified in
line with the changes being introduced, so they're ready in time. A set of clean
machines also needs to be identified by the test manager for the sole use of system
testing; probably about four machines will be adequate (we'll know once the tests are
designed).

When system testing is to begin, a new test release of the entire system is created. A
branch is created in CVS from the development HEAD and all the modules on the
new branch are tagged with the test release tag. The tag is simply something like
rgma_test_nnnn for some number nnnn that increases with each new test release. If
some modules are not ready for system testing, then the current production version of
those modules will need to be tagged instead – in all cases, the system is tagged and
built as a whole. The code is then built and installed onto the test machines using some
repeatable process (e.g. Yaim). The test environment (from org.glite.rgma.system-
tests) is also installed onto each machine.

The tests identified by the test manager are run, starting with the lowest numbered
ones. A test record is kept for each test release, with one entry for each test,
containing date, test release number, test name, pass/fail and a reference to a Savannah
bug if it failed. Testing of the release continues until either it is complete or the
failures being found make it unreasonable to continue. At that point, testing stops and
the test machines can be hacked if necessary to help diagnose bugs. The bugs are then
fixed and committed to the test branch; all of the code on the test branch HEAD is
tagged for a new test release and the new release is built, installed on clean machines
and tested as just described. Of course, tests unaffected by the changes don't need to
be re-run. The reason for tagging all modules for each test release is so that we can
revert to a previous test release if, for example, we're pressured into finishing early
and the test failures in that release are deemed to be acceptable. If late code is to be
added into the testing, it is copied to the test branch HEAD and a new test release is
built and tested in the usual way.

Once testing has been completed to the satisfaction of the test manager, the test
release tag (with no other changes sneaked in) is passed to the deployment team for
integration testing. There is no need to branch again as the HEAD of the test branch
will only now change if patches are required for the deployed code. We should find
that the need for patches reduces with proper system testing, because the only bugs
that get through should be minor and can be fixed in the normal development cycle.

Note that by creating the release branch at the start of testing, other development (for
the following release) can continue on the development HEAD.

Test Specification
The Test Specification is yet another document, but it’s a very useful one. It simply
names each test and describes in detail what it should do (and also how to run it,
although in our case, that will be the same for all of them). In the short term, the test
spec. is the basis for actually writing the tests (that job is much easier if someone else
has already named them and described exactly how they must run and what they must
do). Long term, the document is used to decide, for each release, what tests need to be
run to cover the changes that have been made. It's also useful if bug-fix passes to
Alastair contain a reference in Savannah to the tests that should be run (or added).

For the purposes of specifying them, I’ve split the tests into six groups. The first
group tests a single API in isolation. The second group tests a single Service (through
the Java API) in isolation. The third group tests a single functional component [1] – by
this I mean one well-defined bit of R-GMA’s behaviour (such as mediation or
streaming) that spreads across several services. The fourth group contains simple tests
for the whole system. The fifth group attempts to test the whole system under more
realistic loading. The final group tests the whole system for resilience in the face of
very high loading and partial failures. I’ll take each test group in turn, but first I’ll
describe dummy services that are used to allow sub-systems to be tested
independently.

Dummy Services
Testing will require dummy versions of all seven services. Each dummy service
implements the same interface as its corresponding real service, but none of its
functionality. Functionally, all dummy services are the same. A configuration file
loaded by the service specifies, for each operation, what value should be returned, or
what exception should be thrown. When called, each operation simply writes its
parameters to a log file then returns the prescribed value or throws the prescribed
exception. Each dummy service also has one extra call by means of which test scripts
can change the names of the configuration file and log file, without restarting the
service. Each test will have its own configuration files and create its own log files.
The test services are installed alongside the real R-GMA services, either as a different
Web application in the same Tomcat, or possibly in a second Tomcat instance running
on a different port. The test scripts just modify rgma.conf to switch in dummy
services as required [2]. The configuration file will probably be a simple XML file,
much as we used for testing the SOAP APIs.
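
As a rough sketch of the dummy behaviour (I've used a flat properties-style configuration just to keep the example short, whereas the real file will probably be XML as noted above, and all of the names are invented):

    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.io.PrintWriter;
    import java.util.Map;
    import java.util.Properties;

    // Hypothetical core of a dummy service: every operation logs its parameters,
    // then returns the configured value or throws the configured exception.
    public class DummyOperationHandler {
        private Properties config = new Properties();
        private String logFile;

        // The extra call used by test scripts to switch files without a restart.
        public void setTestFiles(String configFile, String logFile) throws IOException {
            Properties newConfig = new Properties();
            newConfig.load(new FileReader(configFile));
            this.config = newConfig;
            this.logFile = logFile;
        }

        public String handle(String operation, Map<String, String> parameters) throws Exception {
            try (PrintWriter log = new PrintWriter(new FileWriter(logFile, true))) {
                log.println(operation + " " + parameters);   // record the call for later diffing
            }
            String action = config.getProperty(operation, "");
            if (action.startsWith("throw:")) {
                throw new Exception(action.substring("throw:".length()));  // prescribed exception
            }
            return action;                                    // prescribed return value
        }
    }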

Group 1: Single-API tests (using dummy services)


These tests use a single machine.

Method calls with valid parameters (successful)


For each API, the test scripts run a test program in the corresponding language to call
each API method with valid parameters. Where parameters can take a fixed range of
values (e.g. query type), all values are tested. Dummy services are used and test
programs will check the APIs correctly pass back any return values as prescribed in
the tests' configuration files. The log files from the dummy services are compared
against model log files (just with a diff) to check that the API called the correct
operations with the correct parameters. The same configuration and model log files
are used for all four user APIs to ensure they remain consistent; different files will be
required for the Java System API.
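
The comparison itself can be trivial; a sketch (file names are hypothetical):

    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.List;

    // Compare the log written by the dummy services against the model log for this
    // test; any difference means the API did not make the expected calls.
    public class LogComparator {
        public static boolean matches(String actualLog, String modelLog) throws Exception {
            List<String> actual = Files.readAllLines(Paths.get(actualLog));
            List<String> expected = Files.readAllLines(Paths.get(modelLog));
            if (!actual.equals(expected)) {
                System.err.println("expected: " + expected);
                System.err.println("actual:   " + actual);
                return false;
            }
            return true;
        }
    }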

[1] Sorry about the jargon – if anyone can do better, I'll change it.
[2] I think R-GMA services also look at this file (although that should probably change). If that's true, we may need to run the tests on a separate machine from the services.

Method calls with valid parameters (server-side exceptions)
These are like the previous tests, except that the configuration file specifies that an
example of each type of server exception is thrown. The test program checks that a
corresponding exception (with the correct error message and number) is thrown by the
API.

Method calls with invalid parameters (client-side exceptions)


For each API, the test scripts run a test program in the corresponding language to call
each API method with any invalid parameters that the API should detect (e.g. out-of-
range values for query type if the language can’t prevent this) and check that an
appropriate exception is thrown.
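
These checks, and the server-side exception checks above, all follow the same try/expect pattern; a sketch (the call inside run() stands for whatever invalid API invocation a particular test makes):

    // Generic expected-exception check used by the exception tests in this group
    // (and by the server-side exception tests above).
    public class ExpectedExceptionCheck {
        interface Call { void run() throws Exception; }

        // Returns true if the call throws an exception whose message contains the
        // expected text; false if it throws nothing or the wrong thing.
        public static boolean throwsWithMessage(Call call, String expectedMessage) {
            try {
                call.run();
                System.err.println("no exception was thrown");
                return false;
            } catch (Exception e) {
                if (e.getMessage() != null && e.getMessage().contains(expectedMessage)) {
                    return true;
                }
                e.printStackTrace(System.err);
                return false;
            }
        }
    }

A test would then call, for example, throwsWithMessage(() -> consumer.someMethod(badValue), "out of range"), where consumer.someMethod stands for whatever invalid invocation that particular test makes.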

Group 2: Single-Service tests (with Java User/System APIs)


These tests use a single machine. Although they are nominally single-service tests,
there are some places where it will be necessary to use a real Registry and Schema –
tests that use them are responsible for re-initialising them at the start of each test.

Operation calls with valid parameters (successful)


These tests are limited to operations that don’t involve producer/consumer resources
and are not covered by subsequent tests. The test scripts use Java test programs that
contact the services through the Java User/System APIs (never directly via HTTP as
this is not the published interface) – they simply call the operation with valid
parameters and check for the expected result. For the Producer, Consumer and
Mediator [3] services, the operations tested here are getProperty, setProperty,
getVersion and ping (and possibly setDefaultVDB). All operations of the Registry and
Schema services are tested here, by adding entries then reading them back (the
Registry tests will require a dummy Consumer service to receive and ignore the
addProducer messages). Registry and Schema test scripts may (in fact, must) empty
out the Registry and Schema databases via MySQL before they begin, but
subsequently they may only read and write to them through the Registry and Schema
APIs, even for the purposes of checking, because system tests must never poke about
in the internal workings of the component being tested.
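
A sketch of the add-then-read-back shape of these tests (RegistryClient and its methods are invented stand-ins, not the real Java System API):

    // Illustrative add-then-read-back check. RegistryClient and its methods are
    // hypothetical stand-ins for whatever the Java System API actually provides;
    // the point is that the check goes back through the API, never into MySQL.
    public class RegistryReadBackCheck {
        interface RegistryClient {
            void addProducerEntry(String table, String endpoint) throws Exception;
            java.util.List<String> getProducerEndpoints(String table) throws Exception;
        }

        public static boolean check(RegistryClient registry) throws Exception {
            String table = "testVdb.someTable";                               // hypothetical test table
            String endpoint = "https://host:8443/R-GMA/ProducerService/XYZ";  // hypothetical endpoint
            registry.addProducerEntry(table, endpoint);
            return registry.getProducerEndpoints(table).contains(endpoint);
        }
    }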

Operation calls with invalid parameters (server-side exceptions)


For each Service, the test scripts run a Java test program to call each Service
operation with any invalid parameters that the Service should detect by itself [4],
and check that an appropriate exception is thrown by the service. Note that
server parameter checks (should) go much further than the API checks. As before, the
test scripts use Java test programs that contact the services through the Java
User/System APIs. The parameter checks that should be made in each call of each
service that uses them are:

Producer and Consumer services

• Termination interval out of permitted range for server
• Requested resource does not exist
• Requested resource is closed
• Attempting to declare a table that is not in the schema of the specified VDB
• Attempting to re-declare a table
• Invalid SQL syntax in a producer predicate
• Invalid column names (for table) in a producer predicate
• History or latest retention periods out of permitted range for server
• Invalid SQL syntax in INSERT statement
• Invalid table name or column names in INSERT statement (not in schema)
• Table in INSERT statement not previously declared
• Type mismatch in INSERT statement
• Read-only column name (Rgma*) in INSERT statement
• Tuple store (logical) name does not exist
• Invalid property name (get/setProperty)
• Bad URI for ODP query handler
• Invalid SQL syntax in SELECT statement (i.e. doesn't conform to SQL92 Entry Level) [5]
• Illegal SQL for query type (e.g. complex continuous query)
• Invalid VDB, table or column names in SELECT statement
• Type mismatches in SQL SELECT statement
• Query timeout out of range
• Invalid producer endpoint (in startDirected)
• No data available to pop
• Request by non-SP to create a Secondary Consumer (if it can be checked)
• Bad producer endpoint or type passed to add/removeProducer
• Unexpected connection to streaming port
• Bad data written to streaming port

[3] The Mediator's getPlansForQuery operation is tested extensively as part of the mediation tests (see later).
[4] I think we make one exception: the service can contact the schema (because a lot of parameter checking requires it).

Mediator service
• VDBs, table names, column names don't exist

Registry service
• VDB or registry service URL don't exist.
• Table name not in schema for this VDB (or should Registry not check this?)
• Invalid SQL syntax in a predicate
• Attempt to read/modify a producer/consumer entry that's not in the registry
(Registry replication of bad data is dealt with in later tests)

Schema service
• VDB or schema service URL don't exist
• Invalid SQL syntax in CREATE/DROP TABLE/VIEW/INDEX
• Illegal table or column name in CREATE TABLE (breaks R-GMA rules)
• Unsupported type in CREATE TABLE
• Attempt to create a table entry that's already in the schema
• Attempt to read/modify a table entry that's not in the schema
(Schema replication of bad data is dealt with in later tests)

Group 3: Functional components (with Java APIs)


Unless otherwise stated, these tests use a single machine.

[5] R-GMA's SQL Query processing is tested extensively in later tests.

Mediation and Query Planning
The mediation tests use real Mediator, Registry and Schema services. They use the
Java System API to register all reasonable combinations of producer/table entries in
the Registry and to check the plans returned by getPlansForQuery for a range of
queries and query types that exercise all of the mediation rules (this includes checking
history retention periods and generating warnings).

To check Query Planning, a real Consumer Service is added, together with dummy
Producer services to receive and log the start/stopStreaming messages, so that the test
script can check that the correct producers are contacted for each plan. The Consumer
Service is driven through the Java User API, by creating and starting suitable
consumers. The producer entries can however still be added directly to the Registry
through its Java System interface (or the Mediator could be simulated).

Termination Intervals
These tests check that:

• producer and consumer resources remain alive for their entire termination interval
• their lifetime can be extended by API calls, especially showSignOfLife, insert and pop
• they remain registered for as long as they remain alive [6]
• they are closed and unregistered fairly quickly after the termination interval is exceeded
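
A rough sketch of the keep-alive part of these checks (showSignOfLife is the operation mentioned above; the Resource type and the other calls are invented for illustration):

    // Rough shape of a termination-interval check: the resource must survive for
    // as long as showSignOfLife keeps being called, and must go away soon after
    // we stop calling it. Resource and its methods are hypothetical.
    public class TerminationIntervalCheck {
        interface Resource {
            void showSignOfLife() throws Exception;
            boolean isAlive() throws Exception;   // e.g. via ping/getProperty
        }

        public static boolean check(Resource resource, int terminationIntervalSecs) throws Exception {
            // Keep the resource alive over several termination intervals.
            for (int i = 0; i < 3; i++) {
                Thread.sleep(terminationIntervalSecs * 1000L / 2);
                resource.showSignOfLife();
                if (!resource.isAlive()) return false;
            }
            // Now let it time out and confirm it is closed "fairly quickly".
            Thread.sleep(terminationIntervalSecs * 2000L);
            return !resource.isAlive();
        }
    }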

Retention Periods and Close Processing


These tests check that:

• History retention periods are registered correctly in the Registry
• Tuples are retained in producers' tuple stores in line with the rules for latest and history retention periods
• No producer returns expired tuples in Latest queries
• The rules for remaining alive, registered and retaining tuples in the History tuple store after a close operation has been initiated, are respected by producers
• The rules for retaining or discarding tuples when the tuple stores fill up are also respected by producers.

These tests do not check that producers clean up expired tuples, as the producer is not
obliged to do so unless its tuple stores fill up.

Tuple metadata management


These tests check that:

• Primary Producers correctly insert all tuple metadata fields into new tuples
• Primary Producers retain RgmaTimestamp data already inserted by users
• Secondary Producers do not alter tuple metadata
• On-demand Producers add NULL values for all tuple metadata fields.

[6] The special behaviour caused by a call to close is dealt with in later tests.

SQL Query Processing
These tests check that all data types and SQL constructs that R-GMA claims to
support are correctly handled by the entire system. If it’s not possible to test
everything, R-GMA should only claim to support that which has been tested
successfully. This will include the new multi-VDB queries when they are supported
(and will then require multiple Registries and Schemas). It may be sensible to run the
tests first directly against the producer’s system interfaces (via the Java system API),
then against the whole system by creating consumers. Producers with all supported
types of storage must be tested in full.

Registry Replication
The tests check that the Registry Replication algorithm itself works by running a
number of Registries (probably three) on different machines with a fast replication
cycle, and by making modifications to all of them directly via their Java System
interface. This can all be done from one client machine by changing rgma.conf. The
Registries are then compared over a period of time to check that they remain
acceptably consistent (again, by querying them through the System interface rather
than directly reading the database). It is also necessary to test the behaviour of
replication when a Registry is shut down for a period during which the others change,
then brought back up, i.e. how quickly that replica recovers and how much damage it
does to the others in the meantime.
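
A sketch of the consistency comparison (RegistryQuery is an invented stand-in for the calls the test would really make through the Java System interface; a real test would also tolerate transient differences within one replication cycle):

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    // Periodically ask each replica the same question and check that the answers
    // agree. RegistryQuery is a hypothetical stand-in for the System interface calls.
    public class ReplicaConsistencyCheck {
        interface RegistryQuery {
            List<String> getProducerEndpoints(String table) throws Exception;
        }

        public static boolean remainConsistent(List<RegistryQuery> replicas, String table,
                                               int checks, long intervalMillis) throws Exception {
            for (int i = 0; i < checks; i++) {
                Thread.sleep(intervalMillis);
                Set<Set<String>> answers = new HashSet<>();
                for (RegistryQuery replica : replicas) {
                    answers.add(new HashSet<>(replica.getProducerEndpoints(table)));
                }
                if (answers.size() != 1) {
                    System.err.println("replicas disagree after check " + i + ": " + answers);
                    return false;
                }
            }
            return true;
        }
    }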

The impact of replication on the normal operation of R-GMA is checked in later tests.

Schema Replication
These will be much like the Registry replication tests, once Schema replication is
added.

Registry and Schema Discovery


Once multi-VDB support is added, tests will be added to ensure that Registry and
Schema services can correctly locate and contact remote Registry instances on the
basis of VDB name. Note that these tests are just concerned with the local file lookup,
not with the maintenance of that file (by whatever means).

Security
These tests check that:

• Valid real and proxy certificates are accepted by the services
• Invalid or revoked certificates are rejected by the services
• The access control list correctly allows or denies access to selected services (i.e. that the list is correctly interpreted and that no unwanted side-effects of blocking access to particular services are seen)
• Secure R-GMA services will not connect to insecure clients or services

When operation authorization, data authorization and encrypted streaming are added
to R-GMA, this set of tests will be expanded considerably.
Named tuple-store management
When this is added to R-GMA, we need to test creation, listing and destruction of
named tuple stores. We also need to check that compatible producers can re-use
existing stores and their tuples can be consumed, that only one producer at a time can
use a tuple store, and that incompatible producers (wrong predicate or authorization)
can’t reuse them at all.

Group 4: Simple full-system tests


These tests use a single machine installed with all services. Their purpose is to check
normal operation of the full system under light load, by creating producers and
consumers of every type and inserting/querying small amounts of data. The test
scripts call Java programs using the Java User API only: they do not use the Java
System interfaces directly. Most of these tests can be taken from the existing R-GMA
test suite (org.glite.testsuites.rgma – Java API version only). Certainly tests 1, 2, 3,
4A, 4B and 4C are worthwhile. Test 12 is not worthwhile. I think that unless we
decide to run these tests directly out of org.glite.testsuites.rgma, we should copy the
code into our own org.glite.rgma.system-tests module and maintain them
independently. We will need to add a test for On-demand producers and static queries,
and also for calls (on all resource types, where appropriate) to insertList,
getHistoryRetentionPeriod, getLatestRetentionPeriod, getTerminationInterval and
destroy, as they are not covered by the existing tests.

Group 5: Multi-node full-system tests


This group of tests aims to put R-GMA under a more realistic loading. They are not
testing failure scenarios, just how well R-GMA behaves on a fairly busy system.
These tests should ensure that at least the following scenarios are covered:

• a consumer streaming from multiple producers
• a producer streaming to multiple consumers
• multiple clients making simultaneous connections to the same service
• multiple clients making simultaneous connections to the same resource
• rapid creation of producers and closing of consumers to exercise the Registry and Consumer message queues
• registry replication running while all this is going on
• the R-GMA browser being used, while all this is going on, to run all types of query on the tables being used by the tests (this is not to test the Browser, just the impact on the system of all the short-lived consumers that it generates) - this may have to be done manually following some written prescription in the test spec.

I suggest four machines, one acting as a client and the other three as R-GMA servers
running all services. Registry replication should be running between all three
registries (and schema replication if it ever happens). A single test script running on
the client machine will run the whole test, switching MON boxes as necessary by
changing rgma.conf. Monitoring the system is difficult, but I suggest we use multiple
archivers, as these seem to be the best indicator of how the system is running as a
whole. In fact we could probably make the archivers the central feature of these tests
and get everything else almost for nothing. I haven’t really begun to think in detail
about the best way to put these tests together, but Rob’s fake WMS looks like a good
way to schedule lots of producers and consumers running in parallel, so tests 5, 6, 7,
8 in the test suite may be a starting point.
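
One way for the single test script to drive lots of producer and consumer clients in parallel is a plain thread pool; nothing here is R-GMA-specific and the client bodies are placeholders:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    // Generic skeleton for driving many client programs in parallel from one
    // machine; each task would wrap one producer or consumer client.
    public class ParallelLoadDriver {
        public static void main(String[] args) throws InterruptedException {
            ExecutorService pool = Executors.newFixedThreadPool(20);
            for (int i = 0; i < 100; i++) {
                final int id = i;
                pool.submit(() -> {
                    // Placeholder: run one producer or consumer client here,
                    // e.g. by exec-ing the relevant test program, and record
                    // its pass/fail on standard error.
                    System.err.println("client " + id + " finished");
                });
            }
            pool.shutdown();
            boolean finished = pool.awaitTermination(30, TimeUnit.MINUTES);
            System.out.println(finished ? "PASS" : "FAIL");
        }
    }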

Group 6: Resilience tests


The following errors can probably be considered to be fatal:

• Server configuration file is corrupted
• Database driver throws fatal exception
• File system I/O throws fatal exception
• Operating system resource (e.g. max number of open files) is exceeded
• Tomcat runs out of memory

We need to decide how R-GMA should behave in the face of these errors before we
can test that behaviour. If, for example, we decide that R-GMA is obliged to notice
these things and attempt to shut itself down cleanly, rather than soldier on, we can
then write tests that cause them and check this happens.

The more frequent error conditions that R-GMA services are expected to handle, and
which are not allowed to have an adverse effect on the rest of the system, are:

• Network connections failing
• Network connection hanging
• Tuple buffers and message buffers filling up
• Server clocks not being synchronized

I’m really not sure how to test this properly. One thought I had (although the Steves
were not impressed) is to consider simulating R-GMA by modelling the services just
by their external calls, internal queues and any synchronized code. We could then put
configurable delays on the simulated network links to see how well the queues and
connection timeouts deal with very high loads and isolated connection problems. I’m
open to better suggestions, but whatever we come up with must run on limited
hardware, in relatively short time, and be able to be monitored.
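
To give a flavour of what I mean, here is a toy model of one service as a bounded internal queue fed over a link with a configurable delay (all invented for illustration):

    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ScheduledThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    // Toy model of one service: requests arrive over a "network link" with a
    // configurable delay and wait in a bounded internal queue to be processed.
    // Watching the queue depth under load is the kind of thing the simulation
    // would be for.
    public class SimulatedService {
        private final LinkedBlockingQueue<String> queue;
        private final ScheduledThreadPoolExecutor network = new ScheduledThreadPoolExecutor(4);
        private final long linkDelayMillis;

        public SimulatedService(int queueCapacity, long linkDelayMillis) {
            this.queue = new LinkedBlockingQueue<>(queueCapacity);
            this.linkDelayMillis = linkDelayMillis;
        }

        // A remote caller sends a request; it reaches the queue after the link delay,
        // and is dropped (and logged) if the queue is already full.
        public void send(String request) {
            network.schedule(() -> {
                if (!queue.offer(request)) {
                    System.err.println("queue full, dropped: " + request);
                }
            }, linkDelayMillis, TimeUnit.MILLISECONDS);
        }

        public int queueDepth() {
            return queue.size();
        }
    }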

J.A.W. 29/07/05
