A Survey On Web Application Security: Xiaowei Li and Yuan Xue
Abstract: Web applications are one of the most prevalent platforms for information and services delivery over the Internet today. As they are increasingly used for critical services, web applications have become a popular and valuable target for security attacks. Although a large body of techniques has been developed to fortify web applications and mitigate the attacks toward web applications, there is little effort devoted to drawing connections among these techniques and building a big picture of web application security research.

This paper surveys the area of web application security, with the aim of systematizing the existing techniques into a big picture that promotes future research. We first present the unique aspects of web application development which bring inherent challenges for building secure web applications. Then we identify three essential security properties that a web application should preserve: input validity, state integrity and logic correctness, and describe the corresponding vulnerabilities that violate these properties along with the attack vectors that exploit these vulnerabilities. We organize the existing research works on securing web applications into three categories based on their design philosophy: security by construction, security by verification and security by protection. Finally, we summarize the lessons learnt and discuss future research opportunities in this area.

I. INTRODUCTION

The World Wide Web has evolved from a system that delivers static pages to a platform that supports distributed applications, known as web applications, and has become one of the most prevalent technologies for information and service delivery over the Internet. The increasing popularity of web applications can be attributed to several factors, including remote accessibility, cross-platform compatibility, fast development, etc. The AJAX (Asynchronous JavaScript and XML) technology also enhances the user experience of web applications with better interactiveness and responsiveness.

As web applications are increasingly used to deliver security-critical services, they become a valuable target for security attacks. Many web applications interact with back-end database systems, which may store sensitive information (e.g., financial, health), so the compromise of web applications would result in breaching an enormous amount of information, leading to severe economic losses and ethical and legal consequences. A breach report from Verizon [1] shows that web applications now reign supreme in both the number of breaches and the amount of data compromised.

The Web platform is a complex ecosystem composed of a large number of components and technologies, including the HTTP protocol, web server and server-side application development technologies (e.g., CGI, PHP, ASP), web browser and client-side technologies (e.g., JavaScript, Flash). A web application built and hosted upon such a complex infrastructure faces inherent challenges posed by the features of those components and technologies and the inconsistencies among them. Current widely-used web application development and testing frameworks, on the other hand, offer limited security support. Thus secure web application development is an error-prone process and requires substantial effort, which could be unrealistic under time-to-market pressure and for people with insufficient security skills or awareness. As a result, a high percentage of web applications deployed on the Internet are exposed to security vulnerabilities. According to a report by the Web Application Security Consortium, about 49% of the web applications being reviewed contain vulnerabilities of high risk level and more than 13% of the websites can be compromised completely automatically [2]. A recent report [3] reveals that over 80% of the websites on the Internet have had at least one serious vulnerability.

Motivated by the urgent need for securing web applications, a substantial amount of research effort has been devoted to this problem, with a number of techniques developed for hardening web applications and mitigating the attacks. Many of these techniques make assumptions on the web technologies used in the application development and only address one particular type of security flaw; their prototypes are often implemented and evaluated on limited platforms. A practitioner may wonder whether these techniques are suitable for their scenarios and, if they cannot be directly applied, whether these techniques can be extended and/or combined. Thus, it is desirable and urgent to provide a systematic framework for exploring the root causes of web application vulnerabilities, uncovering the connection between the existing techniques, and sketching a big picture of the current research frontier in this area. Such a framework would help both new and experienced researchers to better understand web application security challenges and assess existing defenses, and inspire them with new ideas and trends.

In this paper, we survey the state of the art in web application security, with the aim of systematizing the existing techniques into a big picture that promotes future research. Based on the conceptual security framework by Bau and Mitchell [4], we organize our survey using three components for assessing the security of a web application (possibly equipped with a defense mechanism): system model, threat model and security property. System model describes how a web application works and its unique characteristics; threat model
describes the power and resources attackers possess; security property defines the aspect of the web application behavior intended by the developers. Given a threat model, if a web application fails to preserve a certain security property under all scenarios, this application is insecure or vulnerable to the corresponding attacks.

This survey covers the techniques which consider the following threat model: 1) the web application itself is benign (i.e., not hosted or owned for malicious purposes) and hosted on a trusted and hardened infrastructure (i.e., the trusted computing base, including OS, web server, interpreter, etc.); 2) the attacker is able to manipulate either the contents or the sequence of web requests sent to the web application, but cannot directly compromise the infrastructure or the application code. We note here that although browser security ([5], [6]) is also an essential component in end-to-end web application security, research works on this topic usually have a different threat model, where web applications are considered as potentially malicious. This survey does not include the research works on browser security so that it can focus on the problem of building secure web applications and protecting vulnerable ones. The

Web ecosystem that enables dynamic information and service delivery. As shown in Fig. 1, a web application may consist of code on both the server side and the client side. The server-side code generates dynamic HTML pages either through execution (e.g., Java servlet, CGI) or interpretation (e.g., PHP, JSP). During the execution of the server-side code, the web application may interact with the local file system or a back-end database for storing and retrieving data. The client-side code (e.g., in JavaScript) is embedded in the HTML pages and executed within the browser. It can communicate with the server-side code (e.g., via AJAX) and dynamically update the HTML pages. In what follows, we describe three unique aspects of web application development, which differentiate web applications from traditional applications.

[Fig. 1: components of a web application. Client side: web browser running JavaScript, Flash, HTML, etc. Server side: web server hosting static HTML pages, dynamic pages (e.g., PHP, JSP) and executables (e.g., Java Servlet, CGI), a runtime/interpreter (e.g., JVM, Zend), and a database. The two sides communicate over HTTP.]
1) Security by construction: Chong et al. [17] develop a web application framework SIF (Servlet Information Flow), based on a security-typed language Jif [18], which extends Java with information flow control and access control. SIF is able to label user input, track the information flow and enforce the annotated security policies at both compile time and runtime. In addition, their parallel work Swift [19] is a unifying framework to enforce end-to-end information flow policies for distributed web applications. Jif source code can be automatically and securely partitioned into server-side and client-side code. SIF and Swift can be used for building secure web applications free of input validation vulnerabilities, as long as the security policies associated with the information flow of untrusted user data are specified correctly. We note that they can also be used to enforce other security policies that are relevant to application logic (e.g., authorization), which we will explain later.

Robertson et al. [20] propose a strongly typed development framework to build web applications that are robust against XSS and SQL injection. This framework leverages Haskell, a strongly typed language, to remedy the weak typing of scripting languages. Untrusted user data is reliably distinguished from trusted static web contents via static types and passed through type-specific sanitization routines. Identifying all different user input types and performing correct and accurate sanitization for each type still involves non-trivial manual effort.

There are also other security mechanisms and defensive programming practices that are proposed to assist developers in building web applications free of input validation vulnerabilities. For example, Prepared Statement [21] (or SQL DOM [22]) is recommended for defending against SQL injections, where the structure of the SQL query is explicitly specified by developers and enforced. HTML template systems (e.g., Google CTemplate) force developers to separate user data from HTML structure explicitly, so that auto-sanitization functions are performed before user data can be embedded in web responses. This feature addresses the completeness of user input sanitization, as long as the developers identify and mark all user data. However, the correctness of auto-sanitization routines was overlooked for a long time. A recent study [23] shows that there still exists a large gap between the correctness requirements and the actual capability of sanitization routines provided by template systems and web development frameworks. A very recent work by Samuel et al. [24] builds a reliable context-sensitive auto-sanitization engine into web template systems based on type qualifiers to address this problem.

2) Security by verification: There are broadly two approaches employed by the works in this class: program analysis and program testing. Program analysis techniques aim to identify insecure information flow within the web application. To do so, the set of sources, propagators, sinks and sanitizers has to be first manually specified by the developers, which has a great impact on the analysis precision. In contrast, program testing aims to construct input validation attack vectors to expose vulnerabilities within web applications. Both benign and malformed user inputs are fed into web applications and their responses are examined to see if there exist structural differences. There are two key challenges for employing testing techniques: (1) it is difficult to generate test cases to completely explore the paths through which user input can reach sinks; (2) it is difficult to generate specially crafted user input to expose subtle vulnerabilities within web applications, such as insufficient sanitization.

Program analysis. This class of techniques includes static, dynamic and hybrid analysis.

Static analysis includes several techniques, such as dataflow analysis, pointer analysis, string analysis, etc. Static analysis can conservatively identify all possible insecure information flows, but is limited by its capability of modeling dynamic features of scripting languages, such as code inclusion and object-oriented code. Complex alias analysis has to be employed, which makes static taint analysis inherently inaccurate, leading to a number of false positives.

Huang et al. [25] propose a tool WebSSARI that applies static analysis to identifying vulnerabilities within web applications. The tool employs flow-sensitive, intra-procedural analysis based on a lattice model. They extend the PHP language with two type-states, namely tainted and untainted, and track each variable’s type-state. In addition, runtime sanitization functions are inserted where the tainted data reaches the sinks to automatically harden the vulnerable web application. However, a number of language features are not supported, such as recursive functions, array elements, etc.
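To make the idea of tracking a tainted/untainted type-state per variable concrete, the following is a minimal sketch in Python. It is not WebSSARI's implementation: the toy statement format (`Stmt`), the function names `join` and `analyze`, and the choice to treat unknown variables as tainted are all assumptions made for illustration. It only handles straight-line code, which is enough to show what flow sensitivity means: the same variable can be tainted at one program point and untainted at a later one.

```python
from dataclasses import dataclass

TAINTED, UNTAINTED = "tainted", "untainted"


def join(a, b):
    """Least upper bound in the two-point lattice untainted < tainted."""
    return TAINTED if TAINTED in (a, b) else UNTAINTED


@dataclass
class Stmt:
    """Hypothetical toy statement: op is one of
    'source', 'const', 'assign', 'sanitize', 'sink'."""
    op: str
    target: str = ""
    args: tuple = ()


def analyze(stmts):
    """Walk straight-line code in order and return (index, variable)
    pairs for every sink argument that is tainted at that point."""
    state = {}      # variable name -> current type-state
    findings = []
    for i, s in enumerate(stmts):
        if s.op == "source":
            state[s.target] = TAINTED          # user input
        elif s.op in ("const", "sanitize"):
            state[s.target] = UNTAINTED        # literal or sanitizer output
        elif s.op == "assign":
            result = UNTAINTED
            for a in s.args:
                # Assumption: variables never seen before are treated as tainted.
                result = join(result, state.get(a, TAINTED))
            state[s.target] = result
        elif s.op == "sink":
            for a in s.args:
                if state.get(a, TAINTED) == TAINTED:
                    findings.append((i, a))
    return findings


program = [
    Stmt("source", "name"),                        # name = request parameter
    Stmt("const", "prefix"),                       # prefix = "SELECT ..."
    Stmt("assign", "query", ("prefix", "name")),   # query = prefix + name
    Stmt("sink", args=("query",)),                 # run_query(query): flagged
    Stmt("sanitize", "name", ("name",)),           # name = escape(name)
    Stmt("assign", "query", ("prefix", "name")),   # query rebuilt from clean data
    Stmt("sink", args=("query",)),                 # run_query(query): not flagged
]

print(analyze(program))  # [(3, 'query')]
```

Only the first sink is reported, because `name` has been sanitized before the second query is built. A real analysis additionally has to handle branches, loops, aliasing and function calls, which is where the precision problems discussed above arise.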
Xie and Aiken [26] perform a bottom-up analysis of basic blocks, procedures and the whole program to find SQL injection vulnerabilities. Their technique is able to automatically derive the set of variables that have to be sanitized before function invocation using symbolic execution. However, their static analysis is also limited to a certain set of language features.

Pixy [27], [28] is an open source tool that performs inter-procedural flow-sensitive data flow analysis on PHP web applications. Pixy first constructs a parse tree for PHP code, which is represented as a control-flow graph for each function. Then, it performs precise alias and literal analysis on the intermediate nodes. Pixy is the first to apply alias analysis over scripting languages, which greatly improves the analysis precision.

Wassermann et al. [29], [30] propose string-taint analysis, which enhances Minamide’s string analysis [31] with taint support. Their technique labels and tracks untrusted substrings from user input and ensures no untrusted scripts can be included in SQL queries and generated HTML pages. Their technique addresses not only missing sanitization but also weak sanitization performed over user input.

Instead of analyzing PHP web applications, Livshits and Lam [32] apply precise context-sensitive (but flow-insensitive) points-to analysis to the bytecode of Java web applications based on binary decision diagrams. In particular, they use a high-level language, Program Query Language (PQL), for specifying the information flow policy and automate the information flow analysis, distinct from traditional techniques based on type declaration or program assertions.

Similarly, Rubyx [33] requires developers to specify security policies as constraints using the notions of principal, secrecy and trust level and verifies those policies for Ruby-on-Rails web applications based on symbolic execution. It is able to identify a number of vulnerabilities, including XSS, CSRF, insufficient access control, as well as application-specific flaws. The completeness of Rubyx is the same as bounded model checking.

Dynamic analysis tracks the information flow of user input during runtime execution by instrumentation. Compared to static analysis, dynamic analysis doesn’t require complex code analysis, thus improving the analysis precision. On the other hand, the deep instrumentation may negatively affect the application’s performance and stability. Also, the completeness cannot be guaranteed.

Taint mode, which supports dynamic taint tracking, was first introduced into Perl, whose interpreter is extended to ensure that no external data can be used by critical functions. Nguyen-Tuong et al. [34] modify the PHP interpreter to precisely taint user data at the granularity of characters and track tainted user data at runtime. However, the sanitization of user data requires retrofitting the application source code to explicitly call a newly-defined function, which can be error-prone and affect the analysis precision. Haldar et al. [35] instrument Java system class bytecode to extend Java with taint tracking support.

Hybrid analysis combines strengths of both static and dynamic analysis to further improve the analysis precision. Balzarotti et al. [36] argue that faulty sanitization can introduce numerous subtle flaws into the web application, which cannot be identified by the above techniques. They present Saner, which employs hybrid analysis to validate the correctness of both built-in and custom sanitization routines. Saner first applies conservative static string analysis to model how user input is sanitized, then feeds a large set of malicious inputs into suspicious sanitization routines to identify weak or incorrect sanitization.

Based on [32], Monica et al. [37] present a holistic technique that combines static analysis, model checking, dynamic checking and runtime detection. In particular, they employ model checking to improve the accuracy of static analysis. Model checking can systematically explore the space of a finite-state system and verify the correctness of the system with respect to a given property or specification. Model checking is also able to automatically generate concrete attack vectors and exploit paths and produces no false positives.

Program testing. A number of black-box testing tools, also referred to as web application scanners, generate input vectors from a library of known attack patterns, including both open-source (e.g., Spike, Burp) and commercial (e.g., IBM AppScan) products. From the research community, WAVES [38] first applies penetration testing to identifying injection vulnerabilities within web applications and leverages machine learning to guide its test case generation. Secubat [39] is another black-box scanner targeting SQL injection and XSS attacks. McAllister et al. [40] focus on utilizing recorded user sessions for more comprehensive exploration. Black-box testing techniques are essential when the application code is not available, which is a common scenario. They are increasingly deployed as remote web services.

The traditional fuzzing method feeds random inputs into web applications. To improve testing effectiveness, random fuzzing can be enhanced with program analysis techniques. On one hand, fuzzing can generate concrete attack vectors that confirm the presence of a vulnerability, thus reducing false alerts. On the other hand, program analysis can guide test case generation for achieving better efficiency and coverage.

Martin et al. [41] apply model checking for systematically generating attack vectors. Similar to [32], the target vulnerability is specified as PQL queries and instrumented into the application. They leverage Java PathFinder to systematically explore the application via concrete execution. In particular, to address the state explosion problem, the inherent challenge of model checking, they apply static analysis to prune infeasible paths and generate more promising input vectors.

ARDILLA [42] first generates sample inputs, then symbolically tracks tainted inputs through the execution and mutates the inputs whose parameters flow into sensitive sinks. In particular, it is capable of tracking tainted data through database accesses, which enables it to precisely identify second-order XSS vulnerabilities.

FLAX [43] is a taint-enhanced black-box fuzzing technique that aims to discover CSV (client-side input validation) vulnerabilities within JavaScript code. Dynamic taint analysis extracts knowledge of the sinks, which is used to significantly prune the mutation space and direct more effective sink-aware fuzzing.

3) Security by protection: There are two approaches employed by the works in this class. One approach is to follow information flow specification, in which untrusted user inputs are identified and tracked, so that the trustworthiness of web contents can be evaluated. We refer to this approach as taint-based protection. In this approach, the corrupted web content (e.g., SQL queries) can be directly dropped. Alternatively, suspicious user input can be sanitized, filtered or quarantined, without dropping the entire web content (e.g., web responses). Instead of tracking user input, another approach aims to directly detect input validation attacks before they even reach the web application or after they trigger a vulnerability in the application (i.e., the structure of web contents has been tampered with), which we refer to as taint-free protection.

Taint-based protection. To defend against XSS attacks, ScriptGuard [44] addresses subtle sanitization errors (i.e., context-mismatched sanitization and inconsistent multiple sanitization) in large-scale and complex web applications. ScriptGuard instruments the web application with an inferred browser model (i.e., the context) when HTML is output and employs positive tainting to conservatively identify sanitization errors. At runtime, ScriptGuard leverages a training phase to learn correct sanitizers for different program paths and achieves auto-repairing of sanitization without incurring significant performance overhead.

However, pure server-side protections are susceptible to browser inconsistencies and cannot effectively handle client-side XSS attacks (e.g., DOM-based XSS), which are launched while web pages are dynamically updated within the browser. Thus, several techniques require client-server collaboration, in which the web application conveys certain security policies to be enforced at the browser side.

BEEP [45] embeds a whitelist of known-good scripts into each web page and enforces the policy by filtering suspicious scripts at the instrumented web browser. The whitelist works like tainting the trusted scripts so that untrusted ones can be identified. BEEP also protects the whitelist from being tampered with using a script key [46]. Although BEEP works efficiently, the whitelist is static and may not accurately differentiate injected scripts from trusted ones.

Matthew et al. [47] propose Noncespace, which annotates the elements and attributes within an HTML document into different trust classes using randomized XML namespaces through a modified web template engine. Different trust classes are associated with distinct permissions, specified in a policy file. They set up a proxy to verify the HTML document with the policy file before forwarding it to the web browser. Injected documents will be identified and dropped. Although Noncespace encodes the structure of web documents at a much finer granularity than BEEP [45], it still cannot detect sophisticated attacks that are dynamically launched within the web browser, due to the static policy.

Yacin et al. [48] propose to enforce the document structure integrity (DSI) of web pages via parser-level isolation (PLI). At the server side, web pages are instrumented (which the authors refer to as serialization), where all sections that contain user input data are surrounded with randomized delimiters, before they are sent to the clients. Then, at the client side, the static document structure can be robustly interpreted by the modified web browser, while the suspicious user contents are tracked and monitored during dynamic evaluation and code execution. This technique is robust to a large category of XSS attacks, including DOM-based XSS, etc. However, it relies on the instrumentation of web browsers.

Louw et al. [49] argue that the trust on the web browser for parsing web pages should be minimized, since the inconsistencies between web browser implementations may allow server-side defenses to be circumvented by the attackers. Thus, their proposed system Blueprint embeds context representations (i.e., models) of user input into the original web pages and parses the web pages by linking a reliable script library. Their method moves the functions of browser parsers to the server side to ensure that no malicious contents can be executed. However, context-dependent embedding faces the same challenge as context-sensitive sanitization.

There are also a number of pure client-side defenses against XSS attacks, including IE8 XSS filter [50], Firefox NoScript plugin [51], XSSDS [52], Noxes [53], BrowserShield [54], CoreScript [55], NoMoXSS [56], etc. However, as described in the introduction section, this line of work assumes a different threat model and is thus beyond the scope of this survey paper.

To defend against SQL injections, dynamically generated SQL queries are evaluated to see if user input data has changed the query structure. Following the idea of instruction-set randomization [57], Boyd et al. [58] propose SQLrand to preserve the intended structure of SQL queries and defend against SQL injections. SQLrand separates untrusted user input from SQL structures by randomizing SQL keywords with secret keys, so that the attackers cannot inject SQL keywords to tamper with the structure. It uses a SQL proxy to dynamically translate “encrypted” SQL queries and drop injected ones. However, managing randomization keys requires additional effort.

Su et al. give a formalization of SQL injection and propose SQLCheck [59], which taints untrusted user input with surrounding special brackets and propagates bracketed user input throughout the application. A SQL injection attack is detected if any bracketed data spans a SQL keyword. However, this technique may break some internal functions (e.g., loop, conditional statement, etc.) when the bracketed user input traverses the web application.

Similar to dynamic analysis techniques, Tadeusz et al. [60] propose CSSE to detect injection attacks by tracking user input through metadata assignment and metadata-preserving operations. CSSE performs context-aware string evaluation to ensure no tainted user data can be used as literals, SQL keywords or operators.

Instead of tracking untrusted user input, Halfond et al. [61] propose a novel technique “positive tainting”, which taints
and tracks trusted strings generated by the web application able to capture higher-level structure of web requests than
and performs syntax-aware string evaluation to detect SQL individual characters.
injections. The advantage of positive tainting is that it is To detect SQL injections, AMNESIA [71] models the
conservative, since the set of trusted data easily converges to structure of legitimate SQL queries. In particular, it builds non-
be complete, thus tends to be more accurate. deterministic finite automata (NDFA) models for SQL queries
Taint-free protection. This class of techniques usually by analyzing the application source code. SQL injections can
require an additional phase to establish detection models. To be detected if the runtime generated query violates its intended
do so, one way is to directly encode the malicious user input structure. However, the model accuracy is bounded by their
patterns (i.e., attack signature), which is referred to as misuse flow-insensitive static analysis. It may miss certain attacks if
detection. Another way is to characterize the benign user input the resulting SQL query matches to a legitimate one on a
pattern or the structure of web documents and SQL queries different path.
intended by the web application and identify the deviation CANDID [72] uses dynamic techniques to extract more
from established models as potential attacks, which is referred accurate structure of SQL queries by feeding benign can-
to as anomaly detection. didate inputs into the application. Then, the application is
Misuse detection employs a set of pre-defined attack signa- instrumented at each query generation point with a shadow
tures to identify known attacks toward web applications. Usu- query, which captures its legitimate structure and is compared
ally a proxy, also referred to as web application firewall, is set with runtime generated queries. It also monitors the executed
up for monitoring the HTTP interactions between the clients control path during dynamic execution, thus is more complete
and the application and stopping the attacks from reaching in modeling SQL queries than learning based technique [67].
the application. A number of application firewalls, both open The same technique is used for a different application, which
source (e.g., ModSecurity) and commercial (e.g., Imperva, automatically retrofits vulnerable SQL query generation into
Barracuda), are on the market. From the academia, David et prepared statements [73]. Their technique is also extended
al. [62] first propose a security proxy, which examines HTTP to model web responses and detect XSS attacks. XSS-Guard
requests in terms of parameter lengths, special characters, etc. [74] generates a shadow page to capture the web application’s
Signature-based detection is accurate and efficient within its intent for each web response, which contains only the autho-
capability range. However, it cannot detect zero-day attacks rized and expected scripts. Any differences between the real
and requires expertise to develop and update attack signatures. constructed page and the shadow page indicate potential script
Anomaly detection assumes that the attacks would cause injections.
the web application behavior to deviate sufficiently from that A black-box taint-inference technique is proposed by Sekar
under attack-free circumstances. The key is to establish a [75] for detecting a range of injection attacks, which doesn’t
model that characterizes the application’s normal behavior. require source code and avoids the negative effects intro-
Such behavior model needs to be accurate and sensitive. duced by deep instrumentation (e.g., taint tracking). First,
Otherwise, it suffers from false positives and false negatives, the events that traverse across different components/libraries
respectively. Depending on the target attack, different features are intercepted, from which data flows are identified through
of the web application behavior can be examined, such as approximate string matching. Then, data flows that contain
web request, response, SQL query, etc., and different modeling techniques can be applied.
Kruegel et al. [63], [64] are among the first to apply anomaly detection to web-based attacks. They derive multiple statistical models for normal web requests, in terms of attribute length, character distribution, attribute order, etc. In the detection phase, a web request is blocked if any anomaly score given by those models exceeds the trained threshold. They further reduce false positives by grouping anomalies into specific attack categories based on heuristics [65] and by addressing the concept drift phenomenon in web applications [66]. Valeur et al. [67] extract a similar set of features from normal SQL queries, specifically for detecting SQL injections.
Instead of examining web requests at the character level, several other works characterize normal web requests by first transforming them into a set of tokens. For example, Ingham et al. [68] employ deterministic finite automata, while Song et al. [69] use a mixture of Markov chains based on n-gram transitions. A comparative study [70] shows that token-based algorithms tend to be more accurate, since they are
untrusted user data are evaluated over a set of language-neutral syntax- and taint-aware uniform policies. Policy-based evaluation makes this technique much more accurate than anomaly detection techniques. However, it faces challenges when complex operations are performed over user input, in which case the data flow may not be identifiable.
4) Open Issues: Although substantial effort has been devoted to input validation, several open issues are still unaddressed or insufficiently addressed in securing web applications against the related attacks. First, although taint-based techniques (i.e., program analysis, taint-based protection) have been demonstrated to be very effective, tracking user input by program annotations still faces technical challenges. For static analysis, it is inherently difficult to handle dynamic and complex features of scripting languages (e.g., object-oriented code). Inaccurate approximation of the application behavior leads to a large number of false positives. Moreover, taint tracking is mostly limited to the application itself. The inability to track user input across multiple applications, external libraries, databases [76], etc., will miss certain subtle vulnerabilities and result in stored or second-order attacks.
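The loss of taint information across a storage boundary can be illustrated with a minimal sketch. It assumes a toy taint tracker that is not taken from any of the surveyed systems: the hypothetical `Tainted` class marks user input, `render` is a hypothetical output sink that rejects tainted data, and `NaiveStore` is a stand-in for a database. Because the store persists plain strings, the taint mark does not survive the round trip, so a payload that is blocked on the direct path reaches the sink unnoticed on the stored (second-order) path.

```python
class Tainted(str):
    """Hypothetical marker type: a string that originated from user input."""


def render(fragment: str) -> str:
    """Hypothetical output sink: refuse data that still carries the taint mark."""
    if isinstance(fragment, Tainted):
        raise ValueError("tainted data reached an output sink")
    return "<p>" + fragment + "</p>"


class NaiveStore:
    """Stand-in for a database that persists plain strings only."""

    def __init__(self):
        self._rows = {}

    def save(self, key, value):
        # Serializing to bytes and back yields a plain str, so the Tainted
        # subclass (the taint mark) is dropped, as it would be in a column
        # that does not record where a value came from.
        self._rows[key] = value.encode("utf-8").decode("utf-8")

    def load(self, key):
        return self._rows[key]


store = NaiveStore()
comment = Tainted("<script>alert(1)</script>")

# Direct flow: the tracker catches the tainted value at the sink.
try:
    render(comment)
    direct_blocked = False
except ValueError:
    direct_blocked = True

# Second-order flow: the value is stored first and read back later,
# at which point the sink no longer recognizes it as user input.
store.save("comment:1", comment)
second_order_output = render(store.load("comment:1"))
```

In this sketch the direct path is rejected while the stored path is rendered verbatim. A tracker that also persisted the taint mark alongside the stored value would close this particular gap.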
In terms of handling user input, sanitization, as the most common approach, surprisingly fails to achieve its desired functionality in many web development frameworks [23]. Thus, reasoning about the correctness of sanitization routines still requires substantial work ([36], [77], [44], [24]). Policy-based techniques, as another way of handling user input, are promising, since the abstraction of security policies from applications enables security mechanisms to be easily deployed for a number of applications and facilitates security review and verification [78]. However, the development of the policy needs non-trivial human involvement.
Black-box application testing is independent of the application source code and platforms and provides a promising scalable method for web application security. However, recent comparative studies [79], [80] show that most current black-box scanners can only offer security assurance at a certain level and have limited capabilities in several aspects, such as detecting “stored” vulnerabilities (e.g., stored XSS), handling active contents (e.g., Flash, Java Applet), deep crawling of the application state and identifying application-specific flaws.
To address the above open issues, relying on a single technique tends to be insufficient. We have seen an increasing number of works that combine two or several techniques and achieve better performance, such as hybrid taint analysis [36], string-taint analysis [29], [30], taint-enhanced fuzzing [42], etc. Another alternative is to apply one technique in a novel way, such as positive tainting [61], black-box inference [75], etc. How to combine existing techniques in a creative way to address the limits of single techniques is an interesting research direction.
B. Logic Correctness
We first recall the logic correctness property:
Users can only access authorized information and operations and are required to follow the intended workflow provided by the web application.
Different from input validation vulnerabilities, which originate from insecure information flow, logic vulnerabilities are multi-faceted and specific to web applications. Because logic is application specific, there are two scenarios for addressing this property: 1) security policies to be enforced are explicitly specified by developers; 2) security policies are not specified, in which case the specification has to be inferred from the application implementation. In the latter case, specification inference is the key challenge, especially due to the heterogeneity of logic implementations.
1) Security by construction: Both information flow and access control models can be applied to construct secure web applications that enforce authorization policies. Different from the information flow specification applied for input validation, which prevents untrusted user data from flowing into trusted web contents, the application of the information flow model to authorization prohibits sensitive information from flowing to unauthorized principals.
Security-typed languages, which usually implement a lattice-based type system, annotate data flows with specific labels and enforce security policies associated with different flows at both compile time (i.e., static checking) and runtime (i.e., dynamic checking). For example, SIF [17] can also be used to enforce authorization policies, in addition to addressing the input validity property. Similarly, SELinks [81] is a cross-tier programming framework for building secure and efficient multi-tier web applications, where security policies (e.g., access control, data provenance, information flow, etc.) are specified as customizable labels and a type system, Fable [82], is employed to ensure that labeled data/functions can only be accessed after checking policies. In particular, the SELinks compiler translates customized access control checks into SQL queries executable by the database engine, which greatly improves the efficiency of cross-tier policy enforcement.
Security-typed languages provide strong security assurance, since they guard both explicit and implicit flow channels. However, they require a lot of annotations, instrumentation, and even restructuring of the application to handle complex and dynamic security policies. Resin [83] is a much lighter-weight approach to ensuring application-specific data-flow security policies at runtime for mitigating both script injections and missing access control checks. Based on a modified language runtime, it attaches policy objects to variables, tracks the policy objects flowing through the web application, including persistent storage, and enforces policies through filter objects, which guard the boundary between the web application and the external environment. In particular, Resin reuses the original programming language and structure, which greatly facilitates the adoption of Resin by developers. As expected, Resin cannot track implicit flows, such as program control flow, data structure layout, etc., and thus may miss subtle bugs within applications.
Static checking adds no runtime overhead, while dynamic checking is able to handle complex and dynamic security policies. UrFlow [84] is designed to combine their strengths. In particular, since security policies usually co-locate with application data in the database, e.g., access control matrix, it requires developers to specify security policies in the form of SQL queries. UrFlow is able to perform sound static checking of logic correctness of the application and verify dynamic policies via a known predicate. However, it only supports a limited range of authorization policies.
An access control model can be implemented through a capability-based system to enforce authorization policies. Capsules [85] is a web development framework based on an object-capability language, Joe-E [86], for enforcing privilege separation. The web application is automatically partitioned into isolated components, each of which only exposes limited and explicitly specified privileges to others. Privilege separation can contain the damage caused by vulnerable components, especially third-party code, and facilitate security reviews and verification. However, it cannot guarantee that each application component is free of vulnerabilities.
2) Security by verification: To verify whether a web application follows a logic specification, the specification first has to be inferred from its implementation. Static analysis extracts the
specification by analyzing source code, while dynamic analysis observes the application behavior under normal execution. Then, the discrepancies between the inferred specification and the actual implementation are identified as logic vulnerabilities. Obviously, the quality of the inferred specification greatly affects the correctness and accuracy of logic verification.
Static analysis. MiMoSA [87] aims to identify vulnerabilities that are introduced by unintended navigation paths among multiple modules. First, each module (a PHP file in their case) is analyzed to extract the “state view”, which represents the influence of this module on state variables. Then, separate state views are concatenated to derive the intended workflow graph. They apply model checking on the workflow graph to identify possible violations of graph traversal, which indicate workflow violation vulnerabilities. However, MiMoSA cannot discover missing or faulty checks within each module.
Similar in purpose to MiMoSA, Sun et al. [88] perform role-specific analysis on PHP web applications to identify access control vulnerabilities. They first specify a set of roles and infer the implicit access control policies by collecting the set of allowed pages for each role, which are exposed through explicit links. Then, they try to directly access other unprivileged pages for each role to identify missing or incorrect access checks.
RoleCast [89] aims to identify missing access control checks at a finer granularity. It first automatically infers the set of user roles for the application by partitioning program files based on a statistical measure. Then, it extracts the set of critical variables that need to be checked for each role. Inconsistencies in checking critical variables in different contexts are reported as vulnerabilities. However, it only models queries that affect the database state (i.e., INSERT, DELETE, UPDATE) as security-sensitive operations and cannot identify faulty checks.
Doupé et al. [90] address a particular type of vulnerability called Execution After Redirect (EAR), where the application continues execution after a developer-intended redirection, thus resulting in violation of the intended control flow and unauthorized execution. They extract the control flow graph from application source code and identify control paths that lead to privileged code after calling redirection routines.
Dynamic analysis. Waler [91] aims to automatically discover application-specific logic flaws. First, they infer the application specification by deriving value-based likely invariants for session variables and function parameters at each program function via dynamic execution. Then, they perform model checking combined with symbolic execution over the application source code to identify violations of inferred invariants. In particular, they only make use of “reliable” invariants, which are supported by explicit checks along the control path within the code and capture the relationship between session variables and database objects.
Bisht et al. propose a black-box fuzzing approach, NoTamper [92], to detect a particular logic vulnerability within form processing functionalities of web applications, which is caused by inconsistent validation of form parameters between the client-side and server-side code. They extract the constraints over form parameters from client-side JavaScript code to generate benign inputs. They also construct malicious inputs by solving negated constraints and feed both into the web application. If their web responses are the same, one vulnerability is found. Their follow-up work WAPTEC [93] enhances the analysis precision by employing white-box analysis and automatically constructs concrete exploits.
3) Security by protection: Nemesis [94] implements dynamic information flow tracking by modifying the language runtime to enforce the authentication mechanism and authorization policies in legacy web applications. In particular, it provides reliable evidence of successful authentication when user input matches “known” credentials via a shadow authentication system, thus bypassing the potentially vulnerable authentication mechanisms in the application. It also keeps track of users’ credentials to enforce predefined access control policies over resources, including files, database objects, etc.
To provide robust user data segregation, CLAMP [95] employs virtualization technology to isolate the application components running on behalf of different users. CLAMP assigns a virtual web server instance to each user’s web session and ensures that the current user can only access his/her own data. Session-level separation provides a certain level of access control assurance. However, it cannot stop attacks within a single web session, especially in a shared-resource scenario.
Swaddler [96] applies anomaly detection to the detection of state violation attacks. It establishes statistical models of session variables for each program block through runtime execution, which indicate the application state when the block is executed. At runtime, the set of models, i.e., the specification, is evaluated to determine whether the execution of the current program block is an instance of a state violation attack.
Arjun et al. [97] extract a control-flow graph from client-side JavaScript code as the specification for well-behaved clients and then set up a proxy for monitoring client behavior and detecting malicious activities against server-side web applications. Ripley [98] is another technique for detecting malicious user behaviors within distributed Ajax web applications by leveraging replicated execution. The client-side computation is exactly emulated on the trusted server side and the discrepancies between computation results are flagged as exploits.
BLOCK [99] is a black-box approach for inferring the application specification and detecting state violation attacks. It observes the interactions between the clients and the application and extracts a set of invariants, which characterize the relationship between web requests, responses and session variables. Then, web requests and responses are evaluated at runtime with the inferred invariants. Compared to Swaddler, BLOCK is independent of the application source code.
4) Open Issues: Securing web applications from logic flaws and attacks still remains an under-explored area. Only a limited number of techniques have been proposed, and most of them only address one specific part of application logic flaws [90], [88], [92]. The fundamental difficulty for ensuring the application logic correctness property is the absence of application logic specification. As logic is application specific, there is no general model of application logic that is applicable for all applications. The absence of a general and automatic mechanism for characterizing the application logic may be the inherent reason for the inability of application scanners and firewalls to handle logic flaws and attacks [79], [80].
Several recent works try to develop a general and systematic method for automatically inferring the specifications for web applications, which in turn facilitates automatic and sound verification of application logic. One class of methods leverages the program source code [96], [91]. As a result, the inferred specification is highly dependent on how the application is structured and implemented (e.g., the definition of a program function or block). Implementation flaws may result in an inaccurate specification. Another class of methods infers the application specification by observing and characterizing the application’s external behavior. The noisy information observed from external behaviors may lead to an inaccurate specification in this case. Moreover, web applications maintain a large number of persistent states in the database. Correctly identifying these states to accurately characterize the application logic is extremely hard.
V. CONCLUSION AND FUTURE DIRECTIONS
This paper provided a comprehensive survey of recent research results in the area of web application security. We described unique characteristics of web application development, identified important security properties that secure web applications should preserve and categorized existing works into three major classes. We also pointed out several open issues that still need to be addressed.
Web applications have been evolving extraordinarily fast with new programming models and technologies emerging, resulting in an ever-changing landscape for web application security with new challenges, which requires substantial and sustained efforts from security researchers. We outline several evolving trends and point out several pioneering works as follows. First, an increasing amount of application code and logic is moving to the client side, which brings new security challenges. Since the client-side code is exposed, the attacker is able to gain more knowledge about the application and is thus more likely to compromise the server-side application state. Several works have been trying to address this problem [19], [43], [97], [98], [92], [93]. Second, the business logic of web applications is becoming more and more complex, which further exacerbates the absence of formal verification and robust protection mechanisms for application logic. For example, when multiple web applications are integrated through APIs, their interactions may expose logic vulnerabilities [100]. Third, an increasing number of web applications are embedding third-party programs or extensions, e.g., iGoogle gadgets, Facebook games, etc. Automatically verifying the security of third-party applications and securely integrating them is non-trivial [85]. Last but not least, new types of attacks are always emerging, e.g., HTTP parameter pollution attack [101], which require security professionals to react quickly without putting a huge number of web applications at risk.
REFERENCES
[1] Verizon 2010 Data Breach Investigations Report, “http://www.verizonbusiness.com/resources/reports/rp_2010-data-breach-report_en_xg.pdf.”
[2] Web Application Security Statistics, “http://projects.webappsec.org/w/page/13246989/WebApplicationSecurityStatistics.”
[3] WhiteHat Security, “WhiteHat website security statistic report 2010.”
[4] J. Bau and J. C. Mitchell, “Security modeling and analysis,” IEEE Security & Privacy, vol. 9, no. 3, pp. 18–25, 2011.
[5] H. J. Wang, C. Grier, A. Moshchuk, S. T. King, P. Choudhury, and H. Venter, “The multi-principal os construction of the gazelle web browser,” in USENIX’09: Proceedings of the 18th conference on USENIX security symposium, 2009, pp. 417–432.
[6] S. Tang, H. Mai, and S. T. King, “Trust and protection in the illinois browser operating system,” in OSDI’10: Proceedings of the 9th USENIX conference on Operating systems design and implementation, 2010, pp. 1–8.
[7] W. G. Halfond, J. Viegas, and A. Orso, “A Classification of SQL-Injection Attacks and Countermeasures,” in Proc. of the International Symposium on Secure Software Engineering, March 2006.
[8] MySpace Samy Worm, “http://namb.la/popular/tech.html,” 2005.
[9] A. Barth, J. Caballero, and D. Song, “Secure content sniffing for web browsers, or how to stop papers from reviewing themselves,” in Oakland’09: Proceedings of the 30th IEEE Symposium on Security and Privacy, 2009, pp. 360–371.
[10] Gmail CSRF Security Flaw, “http://ajaxian.com/archives/gmail-csrf-security-flaw,” 2007.
[11] M. Johns, “Sessionsafe: Implementing xss immune session handling,” in ESORICS’06: Proceedings of the 11th European Symposium On Research In Computer Security, 2006.
[12] A. Barth, C. Jackson, and J. C. Mitchell, “Robust defenses for cross-site request forgery,” in CCS’08: Proceedings of the 15th ACM conference on Computer and communications security, 2008, pp. 75–88.
[13] N. Jovanovic, E. Kirda, and C. Kruegel, “Preventing cross site request forgery attacks,” in SecureComm’06: 2nd International Conference on Security and Privacy in Communication Networks, 2006, pp. 1–10.
[14] M. Johns and J. Winter, “Requestrodeo: Client-side protection against session riding,” in OWASP AppSec Europe, 2006.
[15] Z. Mao, N. Li, and I. Molloy, “Defeating cross-site request forgery attacks with browser-enforced authenticity protection,” in FC’09: 13th International Conference on Financial Cryptography and Data Security, 2009, pp. 238–255.
[16] M. Cova, V. Felmetsger, and G. Vigna, “Vulnerability Analysis of Web Applications,” in Testing and Analysis of Web Services, L. Baresi and E. Dinitto, Eds. Springer, 2007.
[17] S. Chong, K. Vikram, and A. C. Myers, “Sif: Enforcing confidentiality and integrity in web applications,” in USENIX’07: Proceedings of the 16th conference on USENIX security symposium, 2007.
[18] L. Z. Andrew C. Myers, “Jif: Java information flow.” [Online]. Available: http://www.cs.cornell.edu/jif
[19] S. Chong, J. Liu, A. C. Myers, X. Qi, K. Vikram, L. Zheng, and X. Zheng, “Secure web applications via automatic partitioning,” in SOSP ’07: Proceedings of the 21st ACM SIGOPS symposium on Operating systems principles, 2007, pp. 31–44.
[20] W. Robertson and G. Vigna, “Static enforcement of web application integrity through strong typing,” in USENIX’09: Proceedings of the 18th conference on USENIX security symposium, 2009, pp. 283–298.
[21] H. Fisk, “Prepared Statements,” 2004. [Online]. Available: http://dev.mysql.com/tech-resources/articles/4.1/prepared-statements.html
[22] R. A. McClure and I. H. Krüger, “Sql dom: compile time checking of dynamic sql statements,” in ICSE’05: Proceedings of the 27th international conference on Software engineering, 2005, pp. 88–96.
[23] J. Weinberger, P. Saxena, D. Akhawe, M. Finifter, R. Shin, and D. Song, “A Systematic Analysis of XSS Sanitization in Web Application Frameworks,” in ESORICS’11: Proc. of 16th European Symposium on Research in Computer Security, 2011.
[24] M. Samuel, P. Saxena, and D. Song, “Context-sensitive auto-sanitization in web templating languages using type qualifiers,” in CCS’11: Proceedings of the 18th ACM conference on Computer and communications security, 2011, pp. 587–600.
[25] Y.-W. Huang, F. Yu, C. Hang, C.-H. Tsai, D.-T. Lee, and S.-Y. Kuo, “Securing web application code by static analysis and runtime protection,” in WWW’04: Proceedings of the 13th international conference on World Wide Web, 2004, pp. 40–52.
[26] Y. Xie and A. Aiken, “Static detection of security vulnerabilities in scripting languages,” in USENIX’06: Proceedings of the 15th conference on USENIX Security Symposium, 2006.
[27] N. Jovanovic, C. Kruegel, and E. Kirda, “Pixy: A static analysis tool for detecting web application vulnerabilities (short paper),” in Oakland’06: Proceedings of the 27th IEEE Symposium on Security and Privacy, 2006, pp. 258–263.
[28] N. Jovanovic, C. Kruegel, and E. Kirda, “Precise alias analysis for syntactic detection of web application vulnerabilities,” in ACM SIGPLAN Workshop on Programming Languages and Analysis for Security, 2006.
[29] G. Wassermann and Z. Su, “Sound and precise analysis of web applications for injection vulnerabilities,” in PLDI’07: Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation, 2007, pp. 32–41.
[30] G. Wassermann and Z. Su, “Static detection of cross-site scripting vulnerabilities,” in ICSE’08: ACM/IEEE 30th International Conference on Software Engineering, 2008.
[31] Y. Minamide, “Static approximation of dynamically generated web pages,” in WWW’05: Proceedings of the 14th international conference on World Wide Web, 2005, pp. 432–441.
[32] V. B. Livshits and M. S. Lam, “Finding security vulnerabilities in java applications with static analysis,” in USENIX’05: Proceedings of the 14th conference on USENIX Security Symposium, 2005, p. 18.
[33] A. Chaudhuri and J. S. Foster, “Symbolic security analysis of ruby-on-rails web applications,” in CCS ’10: Proceedings of the 17th ACM conference on Computer and communications security, 2010.
[34] A. Nguyen-tuong, S. Guarnieri, D. Greene, J. Shirley, and D. Evans, “Automatically hardening web applications using precise tainting,” in Proc. of the 20th IFIP International Information Security Conference, 2005, pp. 372–382.
[35] V. Haldar, D. Chandra, and M. Franz, “Dynamic taint propagation for java,” in ACSAC ’05: Proceedings of the 21st Annual Computer Security Applications Conference, 2005, pp. 303–311.
[36] D. Balzarotti, M. Cova, V. Felmetsger, N. Jovanovic, E. Kirda, C. Kruegel, and G. Vigna, “Saner: Composing static and dynamic analysis to validate sanitization in web applications,” in Oakland’08: Proceedings of the 29th IEEE Symposium on Security and Privacy, 2008, pp. 387–401.
[37] M. S. Lam, M. Martin, B. Livshits, and J. Whaley, “Securing web applications with static and dynamic information flow tracking,” in PEPM ’08: Proceedings of the 2008 ACM SIGPLAN symposium on Partial evaluation and semantics-based program manipulation, 2008, pp. 3–12.
[38] Y.-W. Huang, S.-K. Huang, T.-P. Lin, and C.-H. Tsai, “Web application security assessment by fault injection and behavior monitoring,” in WWW’03: Proceedings of the 12th international conference on World Wide Web, 2003, pp. 148–159.
[39] S. Kals, E. Kirda, C. Kruegel, and N. Jovanovic, “Secubat: a web vulnerability scanner,” in WWW’06: Proceedings of the 15th international conference on World Wide Web, 2006, pp. 247–256.
[40] S. Mcallister, E. Kirda, and C. Kruegel, “Leveraging user interactions for in-depth testing of web applications,” in RAID ’08: Proceedings of the 11th international symposium on Recent Advances in Intrusion Detection, 2008, pp. 191–210.
[41] M. Martin and M. S. Lam, “Automatic generation of xss and sql injection attacks with goal-directed model checking,” in USENIX’08: Proceedings of the 17th conference on USENIX Security symposium, 2008, pp. 31–43.
[42] A. Kieyzun, P. J. Guo, K. Jayaraman, and M. D. Ernst, “Automatic creation of sql injection and cross-site scripting attacks,” in ICSE ’09: Proceedings of the 31st International Conference on Software Engineering, 2009, pp. 199–209.
[43] P. P. Prateek Saxena, Steve Hanna and D. Song, “Flax: Systematic discovery of client-side validation vulnerabilities in rich web applications.” in NDSS’10: Proceedings of the 17th Annual Network and Distributed System Security Symposium, 2010.
[44] P. Saxena, D. Molnar, and B. Livshits, “Scriptguard: automatic context-sensitive sanitization for large-scale legacy web applications,” in CCS’11: Proceedings of the 18th ACM conference on Computer and communications security, 2011, pp. 601–614.
[45] T. Jim, N. Swamy, and M. Hicks, “Defeating script injection attacks with browser-enforced embedded policies,” in WWW ’07: Proceedings of the 16th international conference on World Wide Web, 2007, pp. 601–610.
[46] G. Markham, “Content restrictions.” 2006. [Online]. Available: http://www.gerv.net/security/content-restrictions/
[47] M. V. Gundy and H. Chen, “Noncespaces: Using randomization to enforce information flow tracking and thwart xss attacks,” in NDSS’09: Proceedings of the 16th Annual Network and Distributed System Security Symposium, 2009.
[48] Y. Nadji, P. Saxena, and D. Song, “Document structure integrity: A robust basis for cross-site scripting defense,” in NDSS’09: Proceedings of the 16th Annual Network and Distributed System Security Symposium, 2009.
[49] M. Ter Louw and V. Venkatakrishnan, “Blueprint: Precise browser-neutral prevention of cross-site scripting attacks,” in Oakland’09: Proceedings of the 30th IEEE Symposium on Security and Privacy, 2009.
[50] D. Ross, “IE 8 XSS filter architecture.” [Online]. Available: http://blogs.technet.com/swi/archive/2008/08/19/ie-8-xss-filter-architecture-implementation.aspx
[51] G. Maone, “NoScript features: Anti-XSS protection.” [Online]. Available: http://noscript.net/feature-xss
[52] M. Johns, B. Engelmann, and J. Posegga, “Xssds: Server-side detection of cross-site scripting attacks,” 2008, pp. 335–344.
[53] E. Kirda, C. Kruegel, G. Vigna, and N. Jovanovic, “Noxes: a client-side solution for mitigating cross-site scripting attacks,” in SAC ’06: Proceedings of the 2006 ACM symposium on Applied computing, 2006, pp. 330–337.
[54] C. Reis, J. Dunagan, H. J. Wang, O. Dubrovsky, and S. Esmeir, “Browsershield: vulnerability-driven filtering of dynamic html,” in OSDI ’06: Proceedings of the 7th symposium on Operating systems design and implementation, 2006, pp. 61–74.
[55] D. Yu, A. Chander, N. Islam, and I. Serikov, “Javascript instrumentation for browser security,” in POPL ’07: Proceedings of the 34th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages, 2007, pp. 237–249.
[56] F. Nentwich, N. Jovanovic, E. Kirda, C. Kruegel, and G. Vigna, “Cross-site scripting prevention with dynamic data tainting and static analysis,” in NDSS’07: Proceeding of the 14th Network and Distributed System Security Symposium, 2007.
[57] G. S. Kc, A. D. Keromytis, and V. Prevelakis, “Countering code-injection attacks with instruction-set randomization,” in CCS ’03: Proceedings of the 10th ACM conference on Computer and communications security, 2003, pp. 272–280.
[58] S. W. Boyd and A. D. Keromytis, “Sqlrand: Preventing sql injection attacks,” in ACNS’04: Proceedings of the 2nd Applied Cryptography and Network Security Conference, 2004, pp. 292–302.
[59] Z. Su and G. Wassermann, “The essence of command injection attacks in web applications,” in POPL’06: Conference record of the 33rd ACM SIGPLAN-SIGACT symposium on Principles of programming languages, 2006, pp. 372–382.
[60] T. Pietraszek, C. V. Berghe, C. V, and E. Berghe, “Defending against injection attacks through context-sensitive string evaluation,” in RAID’05: Proceedings of the 8th International Symposium on Recent Advances in Intrusion Detection, 2005.
[61] W. G. J. Halfond, A. Orso, and P. Manolios, “Using positive tainting and syntax-aware evaluation to counter sql injection attacks,” in SIGSOFT ’06/FSE-14: Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering, 2006, pp. 175–185.
[62] D. Scott and R. Sharp, “Abstracting application-level web security,” in WWW ’02: Proceedings of the 11th international conference on World Wide Web, 2002, pp. 396–407.
[63] C. Kruegel and G. Vigna, “Anomaly Detection of Web-based Attacks,” in CCS’03: Proceedings of the 10th ACM Conference on Computer and Communication Security, 2003, pp. 251–261.
[64] C. Kruegel, G. Vigna, and W. Robertson, “A Multi-model Approach to the Detection of Web-based Attacks,” Computer Networks, vol. 48, no. 5, pp. 717–738, August 2005.
[65] W. Robertson, G. Vigna, C. Kruegel, and R. Kemmerer, “Using Generalization and Characterization Techniques in the Anomaly-based Detection of Web Attacks,” in NDSS’06: Proceeding of the 13th Network and Distributed System Security Symposium, 2006.
[66] F. Maggi, W. Robertson, C. Kruegel, and G. Vigna, “Protecting a moving target: Addressing web application concept drift,” in RAID’09: Proceedings of the 12th International Symposium on Recent Advances in Intrusion Detection, 2009, pp. 21–40.
[67] F. Valeur, D. Mutz, and G. Vigna, “A Learning-Based Approach to the Detection of SQL Attacks,” in DIMVA’05: Proceedings of the Conference on Detection of Intrusions and Malware and Vulnerability Assessment, 2005, pp. 123–140.
[68] K. L. Ingham, A. Somayaji, J. Burge, and S. F. A. C, “Learning dfa representations of http for protecting web applications,” Computer Networks, vol. 51, pp. 1239–1255, 2007.
[69] A. D. K. Yingbo Song and S. J. Stolfo, “Spectrogram: A Mixture-of-Markov-Chains Model for Anomaly Detection in Web Traffic,” in NDSS’09: Proceedings of the 16th Annual Network and Distributed System Security Symposium, 2009.
[70] K. L. Ingham and H. Inoue, “Comparing anomaly detection techniques for http,” in RAID’07: Proceedings of the 10th international conference on Recent advances in intrusion detection, 2007, pp. 42–62.
[71] W. G. Halfond and A. Orso, “Amnesia: Analysis and monitoring for neutralizing sql-injection attacks,” in ASE’05: Proceedings of the 20th IEEE and ACM International Conference on Automated Software Engineering, 2005.
[72] S. Bandhakavi, P. Bisht, P. Madhusudan, and V. N. Venkatakrishnan, “Candid: preventing sql injection attacks using dynamic candidate evaluations,” in CCS ’07: Proceedings of the 14th ACM conference on Computer and communications security, 2007, pp. 12–24.
[73] P. Bisht, A. P. Sistla, and V. Venkatakrishnan, “Automatically preparing safe sql queries,” in FC’10: Proceedings of the 14th International Conference on Financial Cryptography and Data Security, 2010.
separation for web applications,” in WWW’10: Proceedings of the 19th international conference on World Wide Web, 2010, pp. 551–560.
[86] D. W. A. Mettler and T. Close, “Joe-e: A security-oriented subset of java,” in NDSS’10: Proceedings of the 17th Annual Network and Distributed System Security Symposium, 2010, pp. 357–374.
[87] D. Balzarotti, M. Cova, V. V. Felmetsger, and G. Vigna, “Multi-module vulnerability analysis of web-based applications,” in CCS ’07: Proceedings of the 14th ACM conference on Computer and communications security, 2007, pp. 25–35.
[88] F. Sun, L. Xu, and Z. Su, “Static detection of access control vulnerabilities in web applications,” in USENIX’11: Proceedings of the 20th USENIX Security Symposium, 2011.
[89] S. Son, K. S. McKinley, and V. Shmatikov, “Rolecast: finding missing security checks when you do not know what checks are,” in OOPSLA ’11: Proceedings of the 26th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, 2011, pp. 1069–1084.
[90] C. K. Adam Doupé, Bryce Boe and G. Vigna, “Fear the EAR: Discovering and Mitigating Execution After Redirect Vulnerabilities,” in CCS’11: Proceeding of the 18th ACM Conference on Computer and Communications Security, 2011.
[91] V. Felmetsger, L. Cavedon, C. Kruegel, and G. Vigna, “Toward Automated Detection of Logic Vulnerabilities in Web Applications,” in USENIX’10: Proceedings of the 19th USENIX Security Symposium, 2010.
[92] P. Bisht, T. Hinrichs, N. Skrupsky, R. Bobrowicz, and V. N. Venkatakrishnan, “Notamper: automatic blackbox detection of parameter tampering opportunities in web applications,” in CCS ’10: Proceedings of the 17th ACM conference on Computer and communications security, 2010.
[93] P. Bisht, T. Hinrichs, N. Skrupsky, and V. N. Venkatakrishnan, “Waptec: whitebox analysis of web applications for parameter tampering exploit construction,” in CCS’11: Proceedings of the 18th ACM conference on Computer and communications security, 2011, pp. 575–586.
[74] P. Bisht and V. Venkatakrishnan, “XSS-GUARD: Precise Dynamic [94] M. Dalton, C. Kozyrakis, and N. Zeldovich, “Nemesis: preventing au-
Prevention of Cross-Site Scripting Attacks,” in DIMVA’08: Proceedings thentication & access control vulnerabilities in web applications,”
of the 5th International Conference on Detection of Intrusions and in USENIX’09: Proceedings of the 18th conference on USENIX security
Malware, and Vulnerability Assesment, 2008. symposium, 2009, pp. 267–282.
[75] R. Sekar, “An efficient black-box technique for defeating web appli- [95] B. Parno, J. M. McCune, D. Wendlandt, D. G. Andersen, and A. Perrig,
cation attacks,” in NDSS’09: Proceedings of the 16th Annual Network “CLAMP: Practical prevention of large-scale data leaks,” in Oak-
and Distributed System Security Symposium, 2009. land’09: Proceedings of the 30th IEEE Symposium on Security and
[76] B. Davis and H. Chen, “Dbtaint: cross-application information flow Privacy, 2009.
tracking via databases,” in WebApps’10: Proceedings of the 2010 [96] M. Cova, D. Balzarotti, V. Felmetsger, and G. Vigna, “Swaddler: An
USENIX conference on Web application development, 2010. Approach for the Anomaly-based Detection of State Violations in
[77] P. Hooimeijer, B. Livshits, D. Molnar, P. Saxena, and M. Veanes, “Fast Web Applications,” in RAID’07: Proceedings of the 10th International
and precise sanitizer analysis with bek,” in USENIX’11: Proceedings Symposium on Recent Advances in Intrusion Detection, 2007, pp. 63–
of the 20th USENIX Security symposium, 2011. 86.
[78] J. Weinberger, A. Barth, and D. Song, “Towards client-side html [97] A. Guha, S. Krishnamurthi, and T. Jim, “Using static analysis for ajax
security policies,” in HotSec’11: Proc. of 6th USENIX Workshop on intrusion detection,” in WWW’09: Proceedings of the 18th international
Hot Topics in Security, 2011. conference on World Wide Web, 2009, pp. 561–570.
[79] A. Doupe, M. Cova, and G. Vigna, “Why Johnny Cant Pentest: An [98] K. Vikram, A. Prateek, and B. Livshits, “Ripley: automatically se-
Analysis of Black-box Web Vulnerability Scanners,” in DIMVA’10: curing web 2.0 applications through replicated execution,” in CCS
Proceedings of the Conference on Detection of Intrusions and Malware ’09: Proceedings of the 16th ACM conference on Computer and
and Vulnerability Assessment, 2010. communications security, 2009, pp. 173–186.
[80] J. Bau, E. Bursztein, D. Gupta, and J. Mitchell, “State of the art: Au- [99] X. Li and Y. Xue, “BLOCK: A Black-box Approach for Detection
tomated black-box web application vulnerability testing,” Oakland’10: of State Violation Attacks Towards Web Applications,” in ACSAC’11:
Proceedings of the 31st IEEE Symposium on Security and Privacy, pp. Proceedings of 27th Annual Computer Security Applications Confer-
332–345, 2010. ence, 2011.
[81] B. J. Corcoran, N. Swamy, and M. Hicks, “Cross-tier, label-based secu- [100] R. Wang, S. Chen, X. Wang, and S. Qadeer, “How to shop for free
rity enforcement for web applications,” in SIGMOD ’09: Proceedings online - security analysis of cashier-as-a-service based web stores,” in
of the 35th SIGMOD international conference on Management of data, Oakland’11: Proceedings of the 32nd IEEE Symposium on Security
2009, pp. 269–282. and Privacy, 2011.
[82] N. Swamy, B. J. Corcoran, and M. Hicks, “Fable: A language for [101] M. Balduzzi, C. T. Gimenez, D. Balzarotti, and E. Kirda, “Automated
enforcing user-defined security policies,” in Oakland ’08: Proceedings discovery of parameter pollution vulnerabilities in web applications.”
of the 29th IEEE Symposium on Security and Privacy. in NDSS’11: Proceedings of the 8th Annual Network and Distributed
[83] A. Yip, X. Wang, N. Zeldovich, and M. F. Kaashoek, “Improving ap- System Security Symposium, 2011.
plication security with data flow assertions,” in SOSP’09: Proceedings
of the ACM SIGOPS 22nd symposium on Operating systems principles,
2009, pp. 291–304.
[84] A. Chlipala, “Static checking of dynamically-varying security policies
in database-backed applications,” in OSDI’10: Proceedings of the 9th
USENIX conference on Operating systems design and implementation,
2010.
[85] A. Krishnamurthy, A. Mettler, and D. Wagner, “Fine-grained privilege