Bypass Testing Web Apps
1.1. Running Web application tests through HTML forms

HTML forms expect users to type their values and make their choices by using the keyboard and mouse. However, it turns out to be easy for users to bypass the HTML to send values directly to the server software. For example, if a GET request is expected, users can simply type the parameters into the URL box in their browsers. If a POST request is expected, a simple program can be written on the client that creates and submits the request. There are two reasons for bypassing HTML forms. One is for convenience; if a Web application is used a lot it might be more convenient to skip the relatively slow FORM interface. Another reason is for automation; when running multiple tests on a Web application, the test execution can be automated by bypassing the forms.

This ability to bypass form entry allows another strategy to be used. If the Web application uses client-side input

2.1. Semantic input validation

In an initial attempt at categorization, we have identified three types of semantic data input validation. A number is provided for each type to refer to later in the paper.

1. Data type conversion (2.1.A). Most inputs to HTML form elements are plain strings that are converted to other types on the server. The client can check whether the string can be converted correctly. For example, if the input is an integer, the client can check to ensure that all characters are numeric digits.

2. Data format validation (2.1.B). There are many more restrictive constraints on inputs that can be checked, and this is one of the most common ways to validate input on the Web. This includes checking the format of money, phone numbers, personal identification numbers, email addresses, and URLs.

3. Inter-value constraint validation (2.1.C). There are often constraint relationships among input values. For exam-
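As an illustration of the first two validation types, the following Python sketch shows the kind of check a client could perform before a form is submitted. The function names, the phone number layout, and the deliberately loose email pattern are assumptions made for this example; they are not definitions taken from this paper.

```python
import re

# Assumed formats, chosen only for illustration.
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")          # e.g. 703-555-0100
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")   # very loose check


def is_integer_string(value: str) -> bool:
    """Data type conversion check (2.1.A): every character is a numeric digit."""
    return value != "" and all(ch in "0123456789" for ch in value)


def has_format(value: str, pattern: "re.Pattern[str]") -> bool:
    """Data format validation (2.1.B): the whole string matches the pattern."""
    return pattern.fullmatch(value) is not None


if __name__ == "__main__":
    print(is_integer_string("2004"))                  # digits only
    print(is_integer_string("20O4"))                  # contains a letter
    print(has_format("703-555-0100", PHONE_PATTERN))
    print(has_format("7035550100", PHONE_PATTERN))
```

Checks of this kind run only on the client, so a request that is built without the form never passes through them, and the server cannot assume they were applied.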
generated page includes different input elements, for example, if an order entry form sometimes includes an input box to enter a discount coupon code.

3. Optional input value composition. Two input units $iu_1 = (S_1, D_1, T_1)$ and $iu_2 = (S_2, D_2, T_2)$ have optional input values if $S_1 = S_2$, $T_1 = T_2$ and there exist $(n_1, v_1) \in D_1$ and $(n_2, v_2) \in D_2$ such that $n_1 = n_2$ but $v_1 \neq v_2$. Then the two input units are merged, forming $iu = (S_1, D', T_1)$ where $D' = \{D_1 - (n_1, v_1)\} \cup \{D_1 - (n_2, v_2)\} \cup \{(n_1, v_1 \mid v_2)\}$. This happens when a dynamically generated page sometimes includes different

This type of bypass testing tries to verify whether a Web application adequately evaluates invalid inputs. This testing is based on the restrictions described in Section 2. Given a single input variable, invalid inputs can be generated according to the 14 types of input validation that are specified in Section 2.

Data type conversion violation (2.1.A). HTML inputs are initially strings, but they are often converted to other data types on the server. Data type conversion testing uses values of different types to evaluate the server-side processing, including general strings, integers, real numbers, and dates.

Built-in length restriction violation (2.2.A). The HTML tag input can have an attribute maxLength, as described in Section 2. Invalid values are generated to violate these restrictions.

Built-in value restriction violation (2.2.B). Pre-defined input restrictions from HTML select, check and radio boxes are violated by modifying the submission to submit values that are not in the pre-defined set.

Special input value (2.1.B, 2.3.A - 2.3.E). When data is stored into a database or XML document, and under certain kinds of processing, some special characters, as defined in Table 1, can corrupt the data or cause the software to fail. This data is often validated with client-side checking, but sometimes with server-side checking. Thus, following Wheeler’s suggestions [18], values for text fields are generated that contain special characters.

4.2. Parameter level bypass testing

This type of bypass testing tries to address issues related to inter-value constraint (2.1.C), built-in data access (2.2.D), and built-in input field selection (2.2.E).

It is relatively easy to enumerate possible invalid inputs for an input parameter. However, the restrictive relationships among different parameters are hard to identify, hard to validate and are thus often ignored during testing. There are many kinds of relationships. One type is invalid pair, where two parameters cannot both have values at the same time. For example, it is not reasonable to have a checking account number and a credit card expiration date in the same transaction. Another type is required pair, where if one parameter has a value, the other must also have a value. For example, if we have a credit card number, we must also have an expiration date. Parameter level bypass testing tries to test Web applications by executing test cases that violate restrictive relationships among multiple parameters.

Because the HTML files are very often generated dynamically, these relationships cannot always be obtained statically and must be identified dynamically. They are sometimes described in English-language instructions, and sometimes simply assumed. Nevertheless, if we can identify and follow all possible ways to send parameters to a server program, we can ensure conformance to the restrictive relationships, and then find values to violate the restrictive relationships. Thus, we define an input pattern as a particular set of parameters that can be used at the same time.

In the example that is shown in Figure 1, four different buttons (two search buttons and two all record buttons) send requests to the same server software component update_search_params.jsp.
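Once invalid pairs and required pairs have been identified, test cases that violate them can be produced mechanically. The Python sketch below is one possible way to do so. The parameter names (checking_account, card_number, card_expiry, amount) and the function parameter_level_tests are hypothetical and were chosen only to mirror the examples in the text. Each test case uses valid values and violates exactly one relationship.

```python
from typing import Dict, List, Tuple

Params = Dict[str, str]
Pair = Tuple[str, str]


def parameter_level_tests(
    base: Params,
    valid: Params,
    invalid_pairs: List[Pair],
    required_pairs: List[Pair],
) -> List[Params]:
    """Build requests that each violate one relationship among parameters.

    base           -- parameters sent in every request
    valid          -- one valid value for every parameter that may be added
    invalid_pairs  -- (a, b): a and b must not both have values
    required_pairs -- (a, b): if a has a value, b must also have one
    """
    tests: List[Params] = []

    # Violate an invalid pair by giving both parameters a valid value.
    for a, b in invalid_pairs:
        case = dict(base)
        case[a] = valid[a]
        case[b] = valid[b]
        tests.append(case)

    # Violate a required pair by sending a without the parameter it requires.
    for a, b in required_pairs:
        case = dict(base)
        case[a] = valid[a]
        case.pop(b, None)
        tests.append(case)

    return tests


if __name__ == "__main__":
    valid_values = {
        "checking_account": "12345678",
        "card_number": "4111111111111111",
        "card_expiry": "12/30",
    }
    for request in parameter_level_tests(
        {"amount": "10.00"},
        valid_values,
        invalid_pairs=[("checking_account", "card_expiry")],
        required_pairs=[("card_number", "card_expiry")],
    ):
        print(request)
```

Keeping every individual value valid means that a failure can be attributed to the violated relationship rather than to a malformed value.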
Step 1: Create a stack ST to retain all input units that need to be explored. Define an initial input unit $iu_s$ as the URL for I with no parameters. Initialize ST to $iu_s$. Create a set IUS to retain all input units that have been identified. Initialize IUS to empty.

Step 2: While ST is not empty, pop an input unit (defined in Section 3) from ST, generate data for the input unit and send it to the server. When a reply is returned, analyze the HTML content. For each input unit iu in the returned HTML document:

    if iu is a link input unit (A tag) and the URL has already been explored, do not push iu onto the stack.

    if $iu \in IUS$ (it has already been found), do not push iu onto the stack.

    if there exists an input unit $iu' \in IUS$ such that iu and $iu'$ have optional input elements, update the value of iu. Do not push iu onto the stack.

intent of the differential input pattern is to make subtle changes that are not likely to be identified by checks other than invalid input checking.

Parameter level bypass testing focuses on relationships among different parameters; therefore, all values of input parameters are selected from a set of valid values.

4.3. Control flow level bypass testing

The previous two types of bypass testing assume users follow the control flow that is defined by the software. However, users of Web applications can alter the control flow (built-in control flow restriction violation 2.2.F) by pressing the back button, pressing the refresh button, or by directly entering a URL into a browser. This ability adds uncertainty and threatens the reliability of Web applications.

Control flow level bypass testing tries to verify Web applications by executing test cases that break the normal execution sequence. As a first step, the “normal” control flow
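The stack-based exploration in Steps 1 and 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in this work. The InputUnit class, the fetch callback (a stand-in for generating data, sending the request, and parsing the returned HTML), and the rule used to detect optional input elements (same server component and transfer mode, different parameter names) are all assumptions. The steps do not say what happens to an input unit that passes all three checks; the sketch assumes it is added to IUS and pushed onto ST.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class InputUnit:
    server: str                        # S: target server component (URL)
    names: FrozenSet[str] = frozenset()  # parameter names from D (values omitted)
    method: str = "GET"                # T: transfer mode
    is_link: bool = False              # True for an A tag, False for a form


def explore(
    start_url: str,
    fetch: Callable[[InputUnit], Iterable[InputUnit]],
) -> Set[InputUnit]:
    """Return the set IUS of input units reachable from start_url."""
    stack: List[InputUnit] = [InputUnit(start_url)]   # ST, holding the start unit
    found: Set[InputUnit] = set()                     # IUS, initially empty
    explored_urls: Set[str] = set()

    while stack:
        current = stack.pop()
        explored_urls.add(current.server)
        for iu in fetch(current):          # assumed: send request, parse reply
            # Link whose URL has already been explored: do not push.
            if iu.is_link and iu.server in explored_urls:
                continue
            # Already in IUS: do not push.
            if iu in found:
                continue
            # Assumed test for optional input elements: same component,
            # same transfer mode, different parameter names.
            twin = next(
                (old for old in found
                 if (old.server, old.method, old.is_link)
                 == (iu.server, iu.method, iu.is_link)),
                None,
            )
            if twin is not None:
                found.discard(twin)
                found.add(InputUnit(iu.server, twin.names | iu.names,
                                    iu.method, iu.is_link))
                continue
            # Assumed default: record the new unit and explore it later.
            found.add(iu)
            stack.append(iu)
    return found
```

In a real tool, fetch would issue an HTTP request and extract forms and links from the reply; passing it in as a callback keeps the traversal logic separate and lets it be exercised against a small simulated site.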
6.2. GUI testing

HTML forms can be considered to offer a graphical user interface to run software that is deployed across the Web. Memon has developed techniques to test software through their GUIs by creating inputs that match the input specifications of the software [12, 13]. This approach focuses on the layout of graphical elements and the user’s interaction when supplying form data. Bypass testing relies on following the syntax of the GUI forms, but specifically finds ways to violate constraints imposed by the syntax. The two approaches are complementary; specifically, GUI testing could be used to develop values for bypass testing.

6.3. Web application testing

Most research in testing Web applications has focused on client-side validation and static server-side validation of links. An extensive listing of existing Web test support tools is on a Web site maintained by Hower [7]. The list includes link checking tools, HTML validators, capture/playback tools, security test tools, and load and performance stress tools. These are all static validation and measurement tools, none of which support functional testing or black box testing.

The Web Modeling Language (WebML) [4] allows Web sites to be conceptually described. WebML focuses primarily on the user’s view and on data modeling. Our model derived from the software is complementary to the solutions proposed by WebML.

More recent research has looked into testing software from a static view, but few researchers have addressed the problem of dynamic integration. Kung et al. [9, 11] have developed a model to represent Web sites as a graph, and provide preliminary definitions for developing tests based on the graph in terms of Web page traversals. Their model includes static link transitions and focuses on the client side with only limited use of the server software. They define intra-object testing, where test paths are selected for the variables that have def-use chains within the object, inter-object testing, where test paths are selected for variables that have def-use chains across objects, and inter-client testing, where tests are derived from a reachability graph that is related to the data interactions among clients.

Ricca and Tonella [16] proposed an analysis model and corresponding testing strategies for static Web page analysis. As Web technologies have developed, more and more Web applications are being built on dynamic content, and therefore strategies are needed to model these dynamic behaviors.

Benedikt, Freire and Godefroid [3] presented VeriWeb, a navigation testing tool for Web applications. VeriWeb explores sequences of links in Web applications by nondeterministically exploring “action sequences”, starting from a given URL. Excessively long sequences of links are limited by pruning paths in a derivative form of prime path coverage. VeriWeb creates data for form fields by choosing from a set of name-value pairs that are initialized by the tester. VeriWeb’s testing is based on graphs where nodes are Web pages and edges are explicit HTML links, and the size of the graphs is controlled by a pruning process. This is similar to our algorithm, but does not handle dynamically generated HTML pages.

Elbaum, Karre and Rothermel [5] proposed a method to use what they called “user session data” to generate test cases for Web applications. Their use of the term user session data was nonstandard for Web application developers. Instead of looking at the data kept in a J2EE servlet session, their definition of user session data was input data collected and remembered from previous user sessions. The user data was captured from HTML forms and included name-value pairs. Experimental results from comparing their method with existing methods show that user session data can help produce effective test suites with very little expense.