Detection and Prevention of SQL Injection Attacks

William G.J. Halfond and Alessandro Orso
Georgia Institute of Technology
{whalfond,orso}@cc.gatech.edu

Summary. We depend on database-driven web applications for an ever-increasing number of activities, such as banking and shopping. When performing such activities, we entrust our personal information to these web applications and their underlying databases. The confidentiality and integrity of this information are far from guaranteed; web applications are often vulnerable to attacks, which can give an attacker complete access to the application's underlying database. SQL injection is a type of code-injection attack in which an attacker uses specially crafted inputs to trick the database into executing attacker-specified database commands. In this chapter, we provide an overview of the various types of SQL injection attacks and present AMNESIA, a technique for automatically detecting and preventing SQL injection attacks. AMNESIA uses static analysis to build a model of the legitimate queries an application can generate and then, at runtime, checks that all queries generated by the application comply with this model. We also present an extensive empirical evaluation of AMNESIA. The results of our evaluation indicate that AMNESIA is, at least for the cases considered, highly effective and efficient in detecting and preventing SQL injection attacks.

5.1 Introduction
SQL Injection Attacks (SQLIAs) have emerged as one of the most serious threats to the security of database-driven applications. In fact, the Open Web Application Security Project (OWASP), an international organization of web developers, has placed SQLIAs among the top ten vulnerabilities that a web application can have [7]. Similarly, software companies such as Microsoft [3] and SPI Dynamics have cited SQLIAs as one of the most critical vulnerabilities that software developers must address. SQL injection vulnerabilities can be particularly harmful because they allow an attacker to access the database that underlies an application. Using SQLIAs, an attacker may be able to read, modify, or even delete database information. In many cases, this information is confidential or sensitive and its loss can lead to problems such as identity theft and fraud. The list of high-profile victims of SQLIAs includes Travelocity, FTD.com, Creditcards.com, Tower Records, Guess Inc., and the Recording Industry Association of America (RIAA).


The errors that lead to SQLIAs are well understood. As with most code-injection attacks, SQLIAs are caused by insufficient validation of user input. The vulnerability occurs when input from the user is used to directly build a query to the database. If the input is not properly encoded and validated by the application, the attacker can inject malicious input that is treated as additional commands by the database. Depending on the severity of the vulnerability, the attacker can issue a wide range of SQL commands to the database. Many interactive database-driven applications, such as web applications that use user input to query their underlying databases, can be vulnerable to SQLIAs. In fact, informal surveys of database-driven web applications have shown that almost 97% are potentially vulnerable to SQLIAs.

Like most security vulnerabilities, SQLIAs can be prevented by using defensive coding. In practice, however, this solution is very difficult to implement and enforce. As developers put new checks in place, attackers continue to innovate and find new ways to circumvent these checks. Since the state of the art in defensive coding is a moving target, it is difficult to keep developers up to date on the latest and best defensive coding practices. Furthermore, retroactively fixing vulnerable legacy applications using defensive coding practices is complicated, labor-intensive, and error-prone. These problems motivate the need for an automated and generalized solution to the SQL injection problem.

In this chapter we present AMNESIA (Analysis and Monitoring for NEutralizing SQL Injection Attacks), a fully automated technique and tool for the detection and prevention of SQLIAs.¹ AMNESIA was developed based on two key insights: (1) the information needed to predict the possible structure of all legitimate queries generated by a web application is contained within the application's code, and (2) an SQLIA, by injecting additional SQL statements into a query, would violate that structure. Based on these two insights we developed a technique against SQL injection that combines static analysis and runtime monitoring. In the static analysis phase, AMNESIA extracts from the web-application code a model that expresses all of the legitimate queries the application can generate. In the runtime monitoring phase, AMNESIA checks that all of the queries generated by the application comply with the model. Queries that violate the model are stopped and reported.

We also present an extensive empirical evaluation of AMNESIA. We evaluated AMNESIA on seven web applications, including commercial ones, and on thousands of both legitimate and illegitimate accesses to such applications. We modeled the illegitimate accesses after real attacks that are in use by hackers and penetration testing teams. In the evaluation, AMNESIA did not generate any false positives or negatives and had a very low runtime overhead. These results indicate AMNESIA is an effective and viable technique for detecting and preventing SQLIAs.

The rest of the chapter is organized as follows. Section 5.2 discusses SQLIAs and their various types. Section 5.3 illustrates our technique against SQLIAs. Section 5.4 presents an empirical evaluation of our technique. Section 5.5 compares our approach to related work. Section 5.6 concludes and discusses future directions for the work.

¹ An early version of this work was presented in [9].

5 Detection and Prevention of SQL Injection Attacks

5.2 SQL Injection Attacks Explained

The presence of an SQL injection vulnerability allows an attacker to issue commands directly to a web application's underlying database and to subvert the intended functionality of the application. Once an attacker has identified an SQLIA vulnerability, the vulnerable application becomes a conduit for the attacker to execute commands on the database and possibly the host system itself.

SQLIAs are a class of code injection attacks that take advantage of a lack of validation of user input. The vulnerabilities occur when developers combine hard-coded strings with user input to create dynamic queries. If the user input is not properly validated, it is possible for attackers to shape their input in such a way that, when it is included in the final query string, parts of the input are evaluated as SQL keywords or operators by the database.

5.2.1 Example of an SQLIA

To illustrate how an SQLIA can occur, we introduce an example web application that is vulnerable to a type of SQLIA that we call a tautology-based attack. The architecture of this web application is shown in Figure 5.1. In this example, the user interacts with a web form that takes a login name and pin as input and submits them to the web server. The web server passes the user-supplied credentials to a servlet (show.jsp, in the example), which resides on the application server. The servlet queries the database to check whether the credentials are valid and, based on the result of the query, generates a response for the user in the form of a web page.

The servlet, whose partial implementation is shown in Figure 5.2, uses the user-supplied credentials to dynamically build a database query. Method getUserInfo is called with the login and pin provided by the user. If both login and pin are empty, the method submits the following query to the database:
SELECT info FROM users WHERE login='guest'

Conversely, if login and pin are specified by the user, the method embeds the submitted credentials in the query. Therefore, if a user submits login and pin as "doe" and "123," the servlet dynamically builds the query:
SELECT info FROM users WHERE login='doe' AND pin=123

A web site that uses this servlet would be vulnerable to SQLIAs. For example, if a user enters "' OR 1=1 --" and "", instead of "doe" and "123", the resulting query is:
SELECT info FROM users WHERE login='' OR 1=1 --' AND pin=

The database interprets everything after the WHERE token as a conditional statement and the inclusion of the "OR 1=1" clause turns this conditional into a tautology. (The characters "--" mark the beginning of a comment, so everything after them is ignored.) As a result, the database would return information for all user entries. It is important to note that tautology-based attacks represent only a small subset of the different types of SQLIAs that attackers have developed. We present this type of attack as an example because it is fairly straightforward and intuitive. For this same reason, tautology-based attacks have been widely cited in literature and are often mistakenly viewed as the only type of SQLIAs. However, current attack techniques are not limited to only injecting tautologies. In the rest of this section, we first provide a general definition of SQLIAs and then present an overview of the currently known types of SQLIAs.

[Figure 5.1 (diagram): a browser form with login "doe" and pin "123" submits http://foo.com/show.jsp?login=doe&pin=123 over the Internet to show.jsp on the application server, which queries the database server (MySQL, Oracle, IBM DB2, ...).]

Fig. 5.1. Example of interaction between a user and a typical web application.

    public class Show extends HttpServlet {
      ...
    1.  public ResultSet getUserInfo(String login, String pin) {
    2.    Connection conn = DriverManager.getConnection("MyDB");
    3.    Statement stmt = conn.createStatement();
    4.    String queryString = "";
    5.    queryString = "SELECT info FROM users WHERE ";
    6.    if ((! login.equals("")) && (! pin.equals(""))) {
    7.      queryString += "login='" + login + "' AND pin=" + pin;
    8.    } else {
    9.      queryString += "login='guest'";
          }
    10.   ResultSet tempSet = stmt.execute(queryString);
    11.   return tempSet;
        }
      ...
    }

Fig. 5.2. Example servlet.

5.2.2 General Definition of SQLIA

An SQL injection attack occurs when an attacker changes the intended logic, semantics, or syntax of a SQL query by inserting new SQL keywords or operators. This definition includes all of the variants of SQLIAs discussed in the following subsections.
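The string concatenation in Figure 5.2 can be reproduced without a database to see exactly which query text a given input produces. The following is a minimal sketch, not part of the original servlet: the class name QueryStringDemo and the helper buildQuery are illustrative, and the code only builds strings and never executes them. Note that, as written in Figure 5.2, the servlet falls back to the guest query whenever either field is empty, so the sketch uses a non-empty pin for the tautology input.

```java
// Illustrative only: mirrors the string concatenation at lines 5-9 of
// Figure 5.2 without connecting to any database.
class QueryStringDemo {

    // Hypothetical helper: returns the query text the servlet would build.
    static String buildQuery(String login, String pin) {
        String queryString = "SELECT info FROM users WHERE ";
        if (!login.equals("") && !pin.equals("")) {
            queryString += "login='" + login + "' AND pin=" + pin;
        } else {
            queryString += "login='guest'";
        }
        return queryString;
    }

    public static void main(String[] args) {
        // Legitimate input: one string comparison AND one numeric comparison.
        System.out.println(buildQuery("doe", "123"));
        // Tautology input: the quote closes the string literal early, so
        // OR 1=1 becomes SQL syntax instead of data.
        System.out.println(buildQuery("' OR 1=1 --", "0"));
    }
}
```

The second query has a different token structure from the first (an extra OR clause and a comment marker), which is the kind of structural change the general definition in Section 5.2.2 refers to.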


5.2.3 Variants of SQLIA

Over the past several years, attackers have developed a wide array of sophisticated attack techniques that can be used to exploit SQL injection vulnerabilities. These techniques go beyond the commonly used tautology-based SQLIA examples and take advantage of esoteric and advanced SQL constructs. Ignoring the existence of these kinds of attacks leads to the development of solutions that address the SQLIA problem only partially.

For example, SQLIAs can be introduced into a program using several different types of input sources. Developers and researchers often assume that SQLIAs are only introduced via user input that is submitted as part of a web form or as a response to a prompt for input. This assumption misses the fact that any external string or input that is used to build a query string can be under the control of an attacker and represents a possible input channel for SQLIAs. It is common to see other external sources of input, such as fields from an HTTP cookie or server variables, being used to build a query. Since cookie values are under the control of the user's browser and server variables are often set via values from HTTP headers, these values represent external strings that can be manipulated by an attacker.

In addition, second-order injections use advanced knowledge of a vulnerable application to introduce an attack using otherwise properly secured input sources [1]. A developer may properly escape, type-check, and filter input that comes from the user and assume it is safe. Later on, when that data is used in a different context or to build a different type of query, the previously safe input becomes an injection attack. Because there are many input sources that could lead to an SQLIA, techniques that focus on simply checking user input or explicitly enumerating all untrusted input sources are often incomplete and still leave ways for malicious input to affect the generated query strings.
Once attackers have identified an input source that can be used to exploit an SQLIA vulnerability, there are many different types of attack techniques that they can employ. Depending on the type and extent of the vulnerability, the results of these attacks can include crashing the database, gathering information about the tables in the database schema, establishing covert channels, and open-ended injection of virtually any SQL command. We briefly summarize the main techniques for performing SQLIAs using the example code from Figure 5.2. Interested readers can refer to [10] for additional information and examples of how these techniques work.

Tautologies. The general goal of a tautology-based attack is to inject SQL tokens that cause the query's conditional statement to always evaluate to true. Although the results of this type of attack are application specific, the most common uses are to bypass authentication pages and extract data. In this type of injection, an attacker exploits a vulnerable input field that is used in the query's WHERE conditional. This conditional logic is evaluated as the database scans each row in the table. If the conditional represents a tautology, the database matches and returns all the rows in the table as opposed to matching only one row, as it would normally do in the absence of injection. We showed an example of this type of attack in Section 5.2.1.

Malformed Queries. This attack technique takes advantage of overly descriptive error messages that are returned by the database when a query is rejected. Database error messages often contain useful debugging information that also allows an attacker to accurately identify which parameters are vulnerable in an application and the complete schema of the underlying database. Attackers exploit this situation by injecting SQL tokens or garbage input that causes the query to contain syntax errors, type mismatches, or logical errors. In our example, an attacker could try to cause a type mismatch error by injecting the following text into the pin input field: "convert(int,(select top 1 name from sysobjects where xtype='u'))". The resulting query generated by the web application would be:
SELECT info FROM users WHERE login='' AND pin= convert(int,(select top 1 name from sysobjects where xtype='u'))

In the attack string, the injected select query extracts the name of the first user table (xtype='u') from the database's metadata table, sysobjects, which contains information on the structure of the database. It then converts this table name to an integer. Because the name of the table is a string, the conversion is illegal, and the database returns an error. For example, an SQL Server may return the following error: "Microsoft OLE DB Provider for SQL Server (0x80040E07) Error converting nvarchar value 'CreditCards' to a column of data type int."

There are two useful pieces of information in this message that aid an attacker. First, the attacker can see that the database is an SQL Server database, as the error message explicitly states this. Second, the error message reveals the string that caused the type conversion to occur (in this case, the name of the first user-defined table in the database, "CreditCards"). A similar strategy can be used to systematically extract the name and type of each column in the given table. Using this information about the schema of the database, an attacker can create more precise attacks that specifically target certain types of information. Attacks based on malformed queries are typically used as a preliminary information-gathering step for other attacks.

Union Query. The Union Query technique refers to injection attacks in which an attacker causes the application to return data from a table that is different from the one that was intended. To this end, attackers inject a statement of the form "UNION <injected query>". By suitably defining <injected query>, attackers can retrieve information from a specified table. The outcome of this attack is that the database returns a dataset that is the union of the results of the original query with the results of the injected query.
In our example, an attacker could perform a Union Query injection by injecting the text "' UNION SELECT cardNo from CreditCards where acctNo=10032 --" into the login field. The application would then produce the following query:
SELECT info FROM users WHERE login='' UNION SELECT cardNo from CreditCards where acctNo=10032 --' AND pin=

Assuming that there is no login equal to "" (the empty string), the original query returns the null set, and the injected query returns data from the "CreditCards" table. In this case, the database returns field "cardNo" for account "10032." The database takes the results of these two queries, unions them together, and returns them to the application. In many applications, the effect of this attack would be that the value for "cardNo" is displayed with the account information.

Piggy-backed Queries. In the Piggy-backed Query technique, an attacker tries to append additional queries to the original query string. If the attack is successful, the database receives and executes a query string that contains multiple distinct queries. The first query is generally the original, legal query, whereas subsequent queries are the injected, malicious queries. This type of attack can be especially harmful; attackers can use it to inject virtually any type of SQL command. In our example application, an attacker could inject the text "0; drop table users" into the pin input field. The application would then generate the query:
SELECT info FROM users WHERE login='doe' AND pin=0; drop table users

The database treats this query string as two queries separated by the query delimiter, ";", and executes both. The second, malicious query causes the database to drop the users table in the database, which would have the catastrophic consequence of deleting all user information. Other types of queries can be executed using this technique, such as insertion of new users into the database or execution of stored procedures. It is worth noting that many databases do not require a special character to separate distinct queries, so simply scanning for a special character is not an effective way to prevent this attack technique.

Stored Procedures. In this technique, attackers focus on the stored procedures that are present on the database system. Stored procedures are code that is stored in the database and run directly by the database engine. Stored procedures enable a programmer to code database or business logic directly into the database and provide an extra layer of abstraction. It is a common misconception that the use of stored procedures protects an application from SQLIAs. Stored procedures are just code and can be just as vulnerable as the application's code. Depending on the specific stored procedures that are available on a database, an attacker has different ways of exploiting a system.

The following example demonstrates how a parameterized stored procedure can be exploited via an SQLIA. In this scenario, we assume that the query string constructed at lines 5, 7, and 9 of our example has been replaced by a call to the stored procedure defined in Figure 5.3. The stored procedure returns a boolean value to indicate whether the user's credentials were authenticated by the database.

    CREATE PROCEDURE DBO.isAuthenticated
      @userName varchar2, @pin int
    AS
      EXEC("SELECT info FROM users WHERE login='" +@userName+ "' and pin=" +@pin);
    GO

Fig. 5.3. Stored procedure for checking credentials.

To perform an SQLIA that exploits this stored procedure, the attacker can simply inject the text "' ; SHUTDOWN; --" into the userName field. This injection causes the stored procedure to generate the following query:
SELECT info FROM users WHERE login='' ; SHUTDOWN; --' and pin=

This attack works like a piggy-back attack. When the second query is executed, the database is shut down.

Inference. Inference-based attacks create queries that cause an application or database to behave differently based on the results of the query. In this way, even if an application does not directly provide the results of the query to the attacker, it is possible to observe side effects caused by the query and deduce the results. These attacks allow an attacker to extract data from a database and detect vulnerable parameters. Researchers have reported that, using these techniques, they have been able to achieve a data extraction rate of one byte per second [2]. There are two well-known attack techniques that are based on inference: blind injection and timing attacks.

Blind Injection: In this variation, an attacker performs queries that have a boolean result. The queries cause the application to behave correctly if they evaluate to true, whereas they cause an error if the result is false. Because error messages are easily distinguishable from normal results, this approach provides a way for an attacker to get an indirect response from the database. One possible use of blind injection is to determine which parameters of an application are vulnerable to SQLIA. Consider again the example code in Figure 5.2. Two possible injections into the login field are "legalUser' and 1=0 --" and "legalUser' and 1=1 --". These injections result in the following two queries:
SELECT info FROM users WHERE login='legalUser' and 1=0 --' AND pin=
SELECT info FROM users WHERE login='legalUser' and 1=1 --' AND pin=

Now, let us consider two scenarios. In the first scenario, we have a secure application, and the input for login is validated correctly. In this case, both injections would return login error messages from the application, and the attacker would know that the login parameter is not vulnerable to this kind of attack. In the second scenario, we have a non-secure application in which the login parameter is vulnerable to injection. In this case, the first injection would evaluate to false, and the application would return a login-error message. Without additional information, attackers would not know whether the error occurred because the application validated the input correctly and blocked the attack attempt or because the attack itself caused the login error. However, when the attackers observe that the second query does not result in an error message, they know that the attack was successful and that the login parameter is vulnerable to injection.

Timing Attacks: A timing attack lets an attacker gather information from a database by observing timing delays in the database's responses. This attack is similar to blind injection, but uses a different type of observable side effect. To perform a timing attack, attackers structure their injected query in the form of an if-then statement whose branch condition corresponds to a question about the contents of the database. The attacker then uses the WAITFOR keyword along one of the branches, which causes the database to delay its response by a specified time. By measuring the increase or decrease in the database response time, attackers can infer which branch was taken and the answer to the injected question. Using our example, we illustrate how to use a timing-based inference attack to extract a table name from the database. In this attack, the following text is injected into the login parameter:
legalUser' and ASCII(SUBSTRING((select top 1 name from sysobjects),1,1)) > X WAITFOR 5 --

This injection produces the following query:

SELECT info FROM users WHERE login='legalUser' and ASCII(SUBSTRING((select top 1 name from sysobjects),1,1)) > X WAITFOR 5 --' AND pin=
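Each query of this form answers a single yes/no question ("is the character code greater than X?"), so a binary search over X pins down a 7-bit ASCII character in seven questions. The following is a minimal sketch of that search only, under stated assumptions: the class BinarySearchInference and the method recover are hypothetical names, and the database is replaced by an in-memory predicate, so no query is built or sent.

```java
import java.util.function.IntPredicate;

// Illustrative only: recovers a hidden 7-bit ASCII code from yes/no answers.
class BinarySearchInference {

    // isGreaterThan.test(x) must answer "is the hidden code greater than x?".
    // In the scenario above the answer would be inferred from a response
    // delay; here it is a plain predicate (an assumption of this sketch).
    static int recover(IntPredicate isGreaterThan) {
        int lo = 0;
        int hi = 127; // invariant: lo <= hidden code <= hi
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            if (isGreaterThan.test(mid)) {
                lo = mid + 1; // hidden code is above mid
            } else {
                hi = mid;     // hidden code is mid or below
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        char hidden = 'C'; // stand-in for the first character of a table name
        int code = recover(x -> hidden > x);
        System.out.println((char) code);
    }
}
```

Because the range 0..127 is halved on every question, the loop always finishes after exactly seven questions per character.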

In this attack, the SUBSTRING function is used to extract the first character of the first table's name. The attacker can then ask a series of questions about this character. In this example, the attacker is asking if the ASCII value of the character is greater than, or less than or equal to, the value of X. If the value is greater, the attacker will be able to observe an additional five-second delay in the database response. The attacker can continue in this way and use a binary-search strategy to identify the value of the first character, then the second character, and so on.

Alternate Encodings. Using alternate encoding techniques, attackers modify their injection strings in a way that avoids typical signature- and filter-based checks that developers put in their applications. Alternate encodings, such as hexadecimal, ASCII, and Unicode, can be used in conjunction with other techniques to allow an attack to escape straightforward detection approaches that simply scan for certain known "bad characters." Even if developers account for alternative encodings, this technique can still be successful because alternate encodings can target different layers in the application. For example, a developer may scan for a Unicode or hexadecimal encoding of a single quote and not realize that the attacker can leverage a database function (e.g., char(39)) to encode the same character. An effective code-based defense against alternate encodings requires developers to be aware of all of the possible encodings that could affect a given query string as it passes through the different application layers. Because developing such a complete protection is very difficult in practice, attackers have been very successful in using alternate encodings to conceal attack strings.

The following example attack (from [11]) shows the level of obfuscation that can be achieved using alternate encodings. In the attack, the pin field is injected with the following string: "0; exec(0x73687574646f776e)," and the resulting query is:
SELECT info FROM users WHERE login='' AND pin=0; exec(char(0x73687574646f776e))

This example makes use of the char() function and ASCII hexadecimal encoding. The char() function takes as a parameter an integer or hexadecimal encoding of one or more characters and replaces the function call with the actual character(s). The stream of numbers in the second part of the injection is the ASCII hexadecimal encoding of the attack string. This encoded string is inserted into a query using some other type of attack profile and, when it is executed by the database, translates into the shutdown command.
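The encoding can be checked by hand: each pair of hexadecimal digits is one ASCII character. The following is a minimal sketch that decodes the digit string from the example and shows why a filter that scans only for the literal keyword would miss it. The class HexDecodeDemo and the method decodeAsciiHex are illustrative names, not part of any tool described in this chapter.

```java
// Illustrative only: decodes ASCII hexadecimal text, two digits per character.
class HexDecodeDemo {

    static String decodeAsciiHex(String hex) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i + 1 < hex.length(); i += 2) {
            // Parse one byte (two hex digits) and append it as a character.
            out.append((char) Integer.parseInt(hex.substring(i, i + 2), 16));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String query = "SELECT info FROM users WHERE login='' AND pin=0; "
                + "exec(char(0x73687574646f776e))";
        // A naive keyword scan sees nothing suspicious in the raw text.
        System.out.println(query.toLowerCase().contains("shutdown"));
        // Decoding the digits reveals the hidden command.
        System.out.println(decodeAsciiHex("73687574646f776e"));
    }
}
```

The first line printed is false and the second is the decoded command, which illustrates why scanning the query text at a single layer is not sufficient.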

5.3 Detection and Prevention of SQL Injection Attacks

AMNESIA (Analysis and Monitoring for NEutralizing SQL Injection Attacks) is a fully automated and general technique for detecting and preventing all types of SQLIAs. The approach works by combining static analysis and runtime monitoring. Our two key insights behind the approach are that (1) the information needed to predict the possible structure of all legitimate queries generated by a web application is contained within the application's code, and (2) an SQLIA, by injecting additional SQL statements into a query, would violate that structure. In its static part, our technique uses program analysis to automatically build a model of the legitimate queries that could be generated by the application. In its dynamic part, our technique monitors the dynamically generated queries at runtime and checks them for compliance with the statically generated model. Queries that violate the model represent potential SQLIAs and are reported and prevented from executing on the database. The technique consists of four main steps. We first summarize the steps and then describe them in more detail in subsequent sections.

5.3.1 The AMNESIA Approach

Identify hotspots: Scan the application code to identify hotspots, that is, points in the application code that issue SQL queries to the underlying database.

Build SQL-query models: For each hotspot, build a model that represents all the possible SQL queries that may be generated at that hotspot. A SQL-query model is a non-deterministic finite-state automaton in which the transition labels consist of SQL tokens (SQL keywords and operators), delimiters, and placeholders for string values.

Instrument Application: At each hotspot in the application, add calls to the runtime monitor.

Runtime monitoring: At runtime, check the dynamically generated queries against the SQL-query model and reject and report queries that violate the model.

Fig. 5.4. SQL-query model for the servlet in Figure 5.2.

Identify Hotspots

In this step, AMNESIA performs a simple scan of the application code to identify hotspots. In the Java language, all interactions with the database are performed through a predefined API, so identifying all the hotspots is a trivial step. In the case of the example servlet in Figure 5.2, the set of hotspots contains a single element: the call to stmt.execute on line 10.

Build SQL-Query Models

In this step, we build the SQL-query model for each hotspot. We perform this step in two parts. In the first part, we use the Java String Analysis (JSA) developed by Christensen, Møller, and Schwartzbach [5] to compute all of the possible values for each hotspot's query string. JSA computes a flow graph that abstracts away the control flow of the program and only represents string-manipulation operations performed on string variables. For each string of interest, the library analyzes the flow graph and simulates the string-manipulation operations that are performed on the string. The result is a Non-Deterministic Finite Automaton (NDFA) that expresses, at the character level, all possible values that the considered string variable can assume. Because JSA is conservative, the NDFA for a given string variable is an overestimate of all of its possible values.

In the second part, we transform the NDFA computed by JSA into an SQL-query model. More precisely, we perform an analysis of the NDFA that produces another NDFA in which all of the transitions are labeled with SQL keywords, operators, or literal values. We create this model by performing a depth-first traversal of the character-level NDFA and grouping characters that correspond to SQL keywords, operators, or literal values. For example, a sequence of transitions labeled 'S', 'E', 'L', 'E', 'C', and 'T' would be recognized as the SQL keyword SELECT and grouped into a single transition labeled "SELECT". This step is configurable to recognize different dialects of SQL.
In the SQL-query model, we represent variable strings (i.e., strings that correspond to a variable related to some user input) using the symbol β. For instance, in our example, the value of the variable login is represented as β. This process is analogous to the one used by Gould, Su, and Devanbu [8], except that we perform it on NDFAs instead of DFAs. Figure 5.4 shows the SQL-query model for the single hotspot in our example. The model reflects the two different query strings that can be generated by the code depending on the branch followed after the if statement at line 6 in Figure 5.2.
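The grouping of character-level labels into SQL-level tokens can be illustrated with a minimal Java sketch. It is an assumption-laden simplification: it works on a single flattened path (a string) rather than on the transitions of an NDFA, the class name TokenGrouper and its tiny keyword list are hypothetical, and the placeholder β is written as the escape \u03b2.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Illustrative sketch only: groups the character labels of one path into SQL-level tokens. */
final class TokenGrouper {
    // Hypothetical, deliberately tiny keyword list; a real tool would cover a full SQL dialect.
    private static final Set<String> KEYWORDS = Set.of("SELECT", "FROM", "WHERE", "AND", "OR");

    /** Groups one path of character labels; the beta placeholder (\u03b2) ends up as its own token. */
    static List<String> group(String path) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < path.length()) {
            char c = path.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (isWordChar(c)) {
                int start = i;
                while (i < path.length() && isWordChar(path.charAt(i))) {
                    i++;
                }
                String word = path.substring(start, i);
                String upper = word.toUpperCase(Locale.ROOT);
                // Keywords are normalized to upper case; other words are kept as written.
                tokens.add(KEYWORDS.contains(upper) ? upper : word);
            } else {
                tokens.add(String.valueOf(c)); // operators and quotes become single-character tokens
                i++;
            }
        }
        return tokens;
    }

    private static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }
}
```

Applied to the first query string of the example, with β in place of the login and pin values, the sketch yields one token per keyword, identifier, operator, quote, and placeholder, which is the alphabet used by the SQL-query model.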

Instrument Application

In this step, we instrument the application by adding calls to the monitor that checks the queries at runtime. For each hotspot, the technique inserts a call to the monitor before the call to the database. The monitor is invoked with two parameters: the query string that is about to be submitted to the database and a unique identifier for the hotspot. Using the unique identifier, the runtime monitor is able to correlate the hotspot with the specific SQL-query model that was statically generated for that point and check the query against the correct model. Figure 5.5 shows how the example application would be instrumented by our technique. The hotspot, originally at line 10 in Figure 5.2, is now guarded by a call to the monitor at line 10a.

10a. if (monitor.accepts(<hotspot ID>, queryString))
     {
10b.     ResultSet tempSet = stmt.execute(queryString);
11.      return tempSet;
     }
Fig. 5.5. Example hotspot after instrumentation.

Runtime Monitoring

At runtime, the application executes normally until it reaches a hotspot. At this point, the query string is sent to the runtime monitor, which parses it into a sequence of tokens according to the specific SQL syntax considered. In our parsing of the query string, the parser identifies empty string and empty numeric literals by their syntactic position, and we denote them in the parsed query string using ε. Figure 5.6 shows how the last two queries discussed in Section 5.2.1 would be parsed during runtime monitoring.

It is important to point out that our technique parses the query string in the same way that the database would and according to the specific SQL grammar considered. In other words, our technique does not perform a simple keyword matching over the query string, which would cause false positives and problems with user input that happened to match SQL keywords. For example, a user-submitted string that contains SQL keywords but is syntactically a text field would be correctly recognized as a text field. However, if the user were to inject special characters, as in our example, to force part of the text to be evaluated as a keyword, the parser would correctly interpret this input as a keyword. Using the same parser as the database is essential because it guarantees that we are interpreting the query in the same way that the database will.
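The difference between keyword matching and syntactic parsing can be made concrete with a small Java sketch. It is not the parser used by AMNESIA and is far from a full SQL grammar: the class name QueryTokenizer is hypothetical, escaped quotes inside literals are not handled, and only empty string literals (not empty numeric literals) are mapped to ε, written here as \u03b5.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of query tokenization; not a full SQL parser. */
final class QueryTokenizer {
    static final String EMPTY = "\u03b5"; // epsilon: marks an empty string literal

    static List<String> tokenize(String query) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < query.length()) {
            char c = query.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '\'') {
                int close = query.indexOf('\'', i + 1);
                if (close < 0) {
                    tokens.add("'"); // unterminated literal: emit the lone quote and go on
                    i++;
                } else {
                    // Everything between the quotes is a single text token, keywords included.
                    String literal = query.substring(i + 1, close);
                    tokens.add("'");
                    tokens.add(literal.isEmpty() ? EMPTY : literal);
                    tokens.add("'");
                    i = close + 1;
                }
            } else if (query.startsWith("--", i)) {
                tokens.add("--"); // comment operator
                i += 2;
            } else if (isWordChar(c)) {
                int start = i;
                while (i < query.length() && isWordChar(query.charAt(i))) {
                    i++;
                }
                tokens.add(query.substring(start, i));
            } else {
                tokens.add(String.valueOf(c)); // single-character operator
                i++;
            }
        }
        return tokens;
    }

    private static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }
}
```

With this sketch, a login value such as a OR b stays inside one text token, whereas the input ' OR 1=1 -- closes the literal early, so OR appears as a token of its own.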

(a) SELECT info FROM users WHERE login='doe' AND pin=123
(b) SELECT info FROM users WHERE login='' OR 1=1 -- ' AND pin=

[Figure: each query is shown together with the sequence of tokens produced by the parser.]

Fig. 5.6. Example of parsed runtime queries.

After the query has been parsed, the runtime monitor checks it by assessing whether the query violates the SQL-query model associated with the current hotspot. An SQL-query model is an NDFA whose alphabet consists of SQL keywords, operators, literal values, and delimiters, plus the special symbol β. Therefore, to check whether a query is compliant with the model, the runtime monitor can simply check whether the model accepts the sequence of tokens derived from the query string. A string or numeric literal (including the empty string, ε) in the parsed query string can match either β or an identical literal value in the SQL-query model. If the model accepts the query, the monitor lets the execution of the query continue. Otherwise, the monitor identifies the query as an SQLIA. In this case, the monitor prevents the query from executing on the database and reports the attack.

To illustrate, consider again the queries shown in Figure 5.6 and recall that the first query is legitimate, whereas the second one corresponds to an SQLIA. When checking query (a), the analysis would start matching from token SELECT and from the initial state of the SQL-query model in Figure 5.4. Because the token matches the label of the only transition from the initial state, the automaton reaches the second state. Again, token info matches the only transition from the current state, so the automaton reaches the third state. The automaton continues to reach new states until it reaches the state whose two outgoing transitions are labeled "=". At this point, the automaton would proceed along both transitions. On the upper branch, the query is not accepted because the automaton does not reach an accept state. Conversely, on the lower branch, all the tokens in the query are matched with labels on transitions, and the automaton reaches the accept state after consuming the last token in the query ("'"). The monitor can therefore conclude that this query is legitimate.
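The acceptance check described above can be sketched in Java as follows. This is a simplified illustration under stated assumptions, not AMNESIA's implementation: the names QueryModel, Token, addPath, word, and lit are hypothetical, each legitimate query is added as a separate path from the initial state (rather than sharing a prefix as in Figure 5.4, although the accepted language is the same), and tokens carry an explicit flag saying whether they are literals.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch of checking a token sequence against an SQL-query model (an NDFA). */
final class QueryModel {
    static final String BETA = "\u03b2"; // placeholder label for input-related strings

    /** A parsed token; 'literal' is true for string and numeric literals (including epsilon). */
    static final class Token {
        final String text;
        final boolean literal;
        Token(String text, boolean literal) { this.text = text; this.literal = literal; }
    }

    private static final class Edge {
        final int from; final String label; final int to;
        Edge(int from, String label, int to) { this.from = from; this.label = label; this.to = to; }
    }

    private final List<Edge> edges = new ArrayList<>();
    private final Set<Integer> accepting = new HashSet<>();
    private int nextState = 1; // state 0 is the initial state

    static Token word(String text) { return new Token(text, false); }
    static Token lit(String text) { return new Token(text, true); }

    /** Adds one path of labels starting at the initial state; its last state is accepting. */
    void addPath(String... labels) {
        int state = 0;
        for (String label : labels) {
            int target = nextState++;
            edges.add(new Edge(state, label, target));
            state = target;
        }
        accepting.add(state);
    }

    /** Follows all matching transitions at once by tracking the set of current states. */
    boolean accepts(List<Token> tokens) {
        Set<Integer> current = new HashSet<>();
        current.add(0);
        for (Token token : tokens) {
            Set<Integer> next = new HashSet<>();
            for (Edge e : edges) {
                // A literal matches beta or an identical label; other tokens need an identical label.
                boolean matches = e.label.equals(token.text)
                        || (token.literal && e.label.equals(BETA));
                if (matches && current.contains(e.from)) {
                    next.add(e.to);
                }
            }
            if (next.isEmpty()) {
                return false; // no transition consumes this token: the query violates the model
            }
            current = next;
        }
        for (int state : current) {
            if (accepting.contains(state)) {
                return true;
            }
        }
        return false;
    }
}
```

If the two query shapes of the running example are added with addPath, the token sequence of query (a) is accepted, while the sequence of query (b) is rejected as soon as the token OR is reached, because the only available transition at that point is labeled AND.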
The checking of query (b) proceeds in an analogous way until token OR in the query is reached. Because the token does not match the label of the only outgoing transition from the current state (AND), the query is not accepted by the automaton, and the monitor identifies the query as an SQLIA.

Efficiency and limitations

For the technique to be practical, the runtime overhead of the monitoring must not affect the usability of the web application. We analyze the cost of AMNESIA's runtime monitoring in terms of both space and time. The space complexity of the monitoring is dominated by the size of the generated SQL-query models. In the worst case, the size of the query models is quadratic in the size of the application. This case corresponds to the unlikely situation of a program that branches and modifies the query string at each program statement. In typical programs, the generated automata are linear in the program size. In fact, our experience is that most automata are actually quite small with respect to the size of the corresponding application (see Table 5.1). The time complexity of the approach depends on the cost of the runtime matching of the query tokens against the models. Because we are checking a set of tokens against an NDFA, the worst case complexity of the matching is exponential in the number of tokens in the query (in the worst case, for each token all states are visited). In practice, however, the SQL-query models typically reduce to trees, and the cost of the matching is almost linear in the size of the query. Our experience shows that the cost of the runtime phase of the approach is negligible (see Section 5.4).

As far as limitations are concerned, our technique can generate false positives and false negatives. Although the string analysis that we use is conservative, false positives can be created in situations where the string analysis is not precise enough. For example, if the analysis cannot determine that a hard-coded string in the application is a keyword, it could assume that it is an input-related value and erroneously represent it as a β in the SQL-query model. At runtime, the original keyword would not match the placeholder for the variable, and AMNESIA would flag the corresponding query as an SQLIA. False negatives can occur when the constructed SQL-query model contains spurious queries, and the attacker is able to generate an injection attack that matches one of the spurious queries. For example, if a developer adds conditions to a query from within a loop, an attacker who inserts an additional condition of the same type would generate a query that does not violate the SQL-query model. We expect these cases to be rare in practice because of the peculiar structure of SQLIAs.
The attacker would have to produce an attack that directly matches either an imprecision of the analysis or a specific pattern. Moreover, in both cases, the type of attacks that could be exploited would be limited by the constraints imposed by the rest of the model that was used to match the query. It is worth noting that, in our empirical evaluation, neither false positives nor false negatives were generated (see Section 5.4).

5.3.2 Implementation

AMNESIA is the prototype tool that implements our technique for Java-based web applications. The technique is fully automated, requiring only the web application as input, and requires no extra runtime environment support beyond deploying the application with the AMNESIA library. We developed the tool in Java and its implementation consists of three modules:

Analysis module. This module implements Steps 1 and 2 of our technique. It inputs a Java web application and outputs a list of hotspots and an SQL-query model for each hotspot. For the implementation of this module, we leveraged the implementation of the Java String Analysis library by Christensen, Møller, and Schwartzbach [5]. The analysis module is able to analyze Java Servlets and JSP pages.

[Figure 5.7 diagram. Static Phase (Static Analysis): the AMNESIA Toolset (Instrumentation Module and Analysis Module) takes the Web Application as input and produces the SQL-Query Model. Dynamic Phase (Runtime Monitoring).]

Fig. 5.7. High-level overview of AMNESIA.

Instrumentation module. This module implements Step 3 of our technique. It inputs a Java web application and a list of hotspots and instruments each hotspot with a call to the runtime monitor. We implemented this module using INSECTJ, a generic instrumentation and monitoring framework for Java developed at Georgia Tech [23].

Runtime-monitoring module. This module implements Step 4 of our technique. The module takes as input a query string and the ID of the hotspot that generated the query, retrieves the SQL-query model for that hotspot, and checks the query against the model.

Figure 5.7 shows a high-level overview of AMNESIA. In the static phase, the Instrumentation Module and the Analysis Module take as input a web application and produce (1) an instrumented version of the application, and (2) an SQL-query model for each hotspot in the application. In the dynamic phase, the Runtime-Monitoring Module checks the dynamic queries while users interact with the web application. If a query is identified as an attack, it is blocked and reported.

Once an SQLIA has been detected, AMNESIA stops the query before it is executed on the database and reports relevant information about the attack in a way that can be leveraged by developers. In our implementation of the technique for Java, we throw an exception when the attack is detected and encode information about the attack in the exception. If developers want to access the information at runtime, they can simply leverage the exception-handling mechanism of the language and integrate their handling code into the application. Having this attack information available at runtime is useful because it allows developers to react to an attack right after it is detected and develop an appropriate customized response. For example, developers may decide to avoid any risk and shut down the part of the application involved in the attack. Alternatively, a developer could handle the attack by converting the information into a format that is usable by another tool, such as an Intrusion Detection System, and reporting it to that tool. Because this mechanism integrates with the application's language, it allows developers flexibility in choosing a response to SQLIAs.

Currently, the information reported by our technique includes the time of the attack, the location of the hotspot that was exploited, the attempted-attack query, and the part of the query that was not matched against the model. We are currently considering additional information that could be useful for the developer (e.g., information correlating program execution paths with specific parts of the query model) and investigating ways in which we can modify the static analysis to collect this information.

5.3.3 Implementation Assumptions

Our implementation makes one main assumption regarding the applications that it analyzes. The tool assumes that queries are created by manipulating strings in the application, that is, the developer creates queries by combining hard-coded strings and variables using operations such as concatenation, appending, and insertion.
Although this assumption precludes the use of AMNESIA on some applications (e.g., applications that externalize all query-related strings in files), it is not overly restrictive and, most importantly, can be eliminated with suitable engineering.
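The exception-based reporting described in Section 5.3.2 could be used by application code roughly as in the following Java sketch. The exception class, its field names, and the handler are all hypothetical, since the chapter does not give the actual types; the sketch only carries the four pieces of information listed there (time, hotspot location, attempted query, unmatched part) and turns a detected attack into a report string that could be forwarded to another tool.

```java
import java.time.Instant;
import java.util.function.Supplier;

/** Hypothetical exception type; the class and field names are assumptions, not AMNESIA's API. */
final class SqlInjectionException extends RuntimeException {
    final Instant time;      // time of the attack
    final String hotspot;    // location of the hotspot that was exploited
    final String query;      // attempted-attack query
    final String unmatched;  // part of the query not matched against the model

    SqlInjectionException(Instant time, String hotspot, String query, String unmatched) {
        super("SQLIA detected at " + hotspot);
        this.time = time;
        this.hotspot = hotspot;
        this.query = query;
        this.unmatched = unmatched;
    }
}

/** Hypothetical application-side handler illustrating a customized response. */
final class AttackHandler {
    /** Runs a guarded database access; a detected attack is converted into a report line. */
    static String runGuarded(Supplier<String> access) {
        try {
            return access.get();
        } catch (SqlInjectionException e) {
            // Customized response: here the information is only formatted for another tool.
            return "BLOCKED hotspot=" + e.hotspot + " unmatched=" + e.unmatched;
        }
    }
}
```

A stricter handler could instead disable the affected part of the application, as discussed in Section 5.3.2; the choice is left to the developer because the mechanism is ordinary exception handling.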

5.4 Empirical Evaluation

The goal of our empirical evaluation is to assess the effectiveness and efficiency of the technique presented in this chapter when applied to various web applications. We used our prototype tool, AMNESIA, to perform an empirical study on a set of subjects. The study investigates three research questions:

RQ1: What percentage of attacks can AMNESIA detect and prevent that would otherwise go undetected and reach the database?
RQ2: How much overhead does AMNESIA impose on web applications at runtime?
RQ3: What percentage of legitimate accesses does AMNESIA identify as attacks?

The following sections illustrate the setup for the evaluation, and discuss the two studies that we performed to address the research questions.


Table 5.1. Subject programs for the empirical study.

Subject             Description                         LOC     Servlets  Injectable  State   Hotspots  Automata Size
                                                                          Params      Params            (#nodes)
Checkers            Online checkers game                5,421   18 (61)   44          0       5         289 (2-772)
Office Talk         Purchase-order management           4,543   7 (64)    13          1       40        40 (8-167)
Employee Directory  Online employee directory           5,658   7 (10)    25          9       23        107 (2-952)
Bookstore           Online bookstore                    16,959  8 (28)    36          6       71        159 (2-5,269)
Events              Event tracking system               7,242   7 (13)    36          10      31        77 (2-550)
Classifieds         Management system for classifieds   10,949  6 (14)    18          8       34        91 (2-799)
Portal              Portal for a club                   16,453  3 (28)    39          7       67        117 (2-1,187)

5.4.1 Experiment Setup

To investigate our research questions, we leveraged a previously developed testbed for SQLIAs, which was presented in [9]. This testbed provides a set of web applications and a large set of both legitimate and malicious inputs for the applications. In the next two sections we briefly review the testbed, describe the applications it contains, and explain how the inputs were generated. Readers can refer to [9] for additional details.

Subjects

The testbed contains seven subjects. All of the subjects are typical web applications that accept user input via web forms and use that input to build queries to an underlying database. Five of the applications are commercial applications that we obtained from GotoCode (http://www.gotocode.com): Employee Directory, Bookstore, Events, Classifieds, and Portal. The last two applications, Checkers and OfficeTalk, were student-developed applications created for a class project. We consider them because they have been used in previous related studies [8].

In Table 5.1 we provide information about the subject applications. For each subject, the table shows: its name (Subject); a concise description (Description); its size in terms of lines of code (LOC); the number of accessible servlets (Servlets), with the total number of servlets in the application in parentheses; the number of injectable parameters (Injectable Params); the number of state parameters (State Params); the number of hotspots (Hotspots); and the average size of the SQL automata generated by AMNESIA (Automata Size), with the minimum-maximum range in parentheses.


The table distinguishes between injectable parameters and state parameters for each application. This distinction is necessary because each type of parameter plays a different role in the application. An injectable parameter is an input parameter whose value is used to build part of a query that is then sent to the database. A state parameter is a parameter that may affect the control flow within the web application but never becomes part of a query. Because, by definition, state parameters cannot result in SQL injection, we only focus on injectable parameters for our attacks.

We also distinguish between total and accessible servlets in the applications. An accessible servlet is a servlet that, to be accessed, only requires the user to be logged in or does not require sessions at all. Some servlets, conversely, must have specific session data (i.e., cookies) to function properly, which considerably complicates the automation of the evaluation. Because we were able to generate enough attacks considering accessible servlets only, we did not consider the remaining servlets.

Input Generation

The sets of inputs provided by the testbed framework represent normal and malicious usages of the applications. In this section we briefly review how these sets were generated and the types of inputs they contain.

In a preliminary step, we identified all of the servlets in each web application and the corresponding parameters that could be submitted to the servlet. Each parameter was identified as either an injectable or state parameter. State parameters must be handled specially because they often determine the behavior of the application. Without a correct and meaningful value assigned to them, the application fails and no attack can be successful. Lastly, we identified the expected type of each injectable parameter. This information helps us in identifying potential attacks that can be used on the parameter and in generating legitimate inputs.
The set of attack strings was generated independently using commercial penetration testing techniques. For this task, we leveraged the services of a Masters-level student at Georgia Tech who worked for a local software-security company. The student is an experienced programmer who has developed commercial-level penetration tools for detecting SQL-injection vulnerabilities. In addition, the student was not familiar with our technique, which reduced the risk of developing a set of attacks biased by the knowledge of the approach and its capabilities.

To define the initial set of attack strings, the student used a combination of sources, including (1) exploits developed by commercial penetration teams to take advantage of SQL-injection vulnerabilities, (2) online sources of vulnerability reports, such as US-CERT (http://www.us-cert.gov/) and CERT/CC Advisories (http://www.cert.org/advisories/), and (3) information extracted from several security-related mailing lists. The resulting set of attack strings contained thirty unique types of attacks. All types of attacks reported in the literature (e.g., [1]) were represented in this set with the exception of attacks that take advantage of overly descriptive database error messages and second-order injections. We excluded these kinds of attacks because they are multi-phase attacks that require intensive human intervention to interpret the attacks' partial results.


The student generated two sets of inputs for each application. The first set contained normal or legitimate inputs for the application. We call this set LEGIT. The second set contained malicious inputs, that is, strings that would result in an SQLIA. We call this set ATTACK. To populate the LEGIT set, the student generated, for each servlet, different combinations of legitimate values for each injectable parameter. State parameters were assigned a meaningful and correct value. To populate the ATTACK set, a similar process was used. For each accessible servlet in the application, the student generated the Cartesian product of its injectable parameters using values from the initial attack strings and legitimate values. This approach generated a large set of potentially malicious inputs, which we used as the ATTACK set.

5.4.2 Study 1: Effectiveness

In the first study, we investigated RQ1, the effectiveness of our technique in detecting and preventing SQLIAs. We analyzed and instrumented each application using AMNESIA and ran all of the inputs in each of the applications' ATTACK sets. For each application, we measured the percentage of attacks detected and reported by AMNESIA. (As previously discussed, when AMNESIA detects an attack, it throws an exception, which is in turn returned by the web application. Therefore, it is easy to accurately detect when an attack has been caught.)

The results for this study are shown in Table 5.2. The table shows, for each subject, the number of unsuccessful attacks (Unsuccessful),¹ the number of successful attacks (Successful), and the number of attacks detected and reported by AMNESIA (Detected) in absolute terms and, in parentheses, as a percentage over the total number of successful attacks. As the table shows, AMNESIA achieved a perfect score. For all subjects, it was able to correctly identify all attacks as SQLIAs, that is, it generated no false negatives.

Table 5.2. Results of Study 1.

Subject             Unsuccessful  Successful  Detected
Checkers            1195          248         248 (100%)
Office Talk         598           160         160 (100%)
Employee Directory  413           280         280 (100%)
Bookstore           1028          182         182 (100%)
Events              875           260         260 (100%)
Classifieds         823           200         200 (100%)
Portal              880           140         140 (100%)

¹ Because the applications performed input validation, they were able to block a portion of the attacks without the attack reaching AMNESIA's monitor.
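The Cartesian-product construction used to populate the ATTACK sets (Section 5.4.1) can be sketched in Java as follows. The class name InputCombiner and the sample values are hypothetical; the testbed's actual generator is not shown in this chapter. Each inner list holds the candidate values for one injectable parameter, mixing legitimate values and attack strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of building input combinations for one servlet. */
final class InputCombiner {
    /** Returns every combination that picks exactly one candidate value per parameter. */
    static List<List<String>> cartesianProduct(List<List<String>> candidatesPerParam) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>()); // start from the single empty combination
        for (List<String> candidates : candidatesPerParam) {
            List<List<String>> extended = new ArrayList<>();
            for (List<String> partial : result) {
                for (String value : candidates) {
                    List<String> combination = new ArrayList<>(partial);
                    combination.add(value);
                    extended.add(combination);
                }
            }
            result = extended;
        }
        return result;
    }
}
```

For two parameters with two candidate values each, the sketch produces four combinations, one of which contains only legitimate values; this is consistent with the ATTACK set being described as a set of potentially malicious inputs.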


5.4.3 Study 2: Efficiency and Precision

In the second study, we investigated RQ2 and RQ3. To investigate RQ2, the efficiency of our technique, we ran all of the inputs in the LEGIT sets on the uninstrumented web applications and measured the response time of the applications for each web request. We then ran the same inputs on the versions of the web applications instrumented by AMNESIA and again measured the response time. The difference in the two response times corresponds to the overhead imposed by our technique.

We found that the overhead imposed by our technique is negligible and, in fact, barely measurable, averaging about 1 millisecond. Note that this time should be considered an upper bound on the overhead, as our implementation was not optimized. These results confirm our expectations. Intuitively, the time for the network access and the database transaction completely dominates the time required for the runtime checking. As the results show, our technique is efficient and can be used without significantly affecting the response time of a web application.

To investigate RQ3, the rate of false positives generated by our technique, we simply assessed whether AMNESIA identified any legitimate query as an attack. The results of the assessment were that AMNESIA correctly identified all such queries as legitimate queries and reported no false positives.

5.4.4 Discussion

The results of our study are very encouraging. For all subjects, our technique was able to correctly identify all attacks as SQLIAs, while allowing all legitimate queries to be performed. In other words, for the cases considered, our technique generated no false positives and no false negatives. The lack of false positives and false negatives is promising and provides evidence of the viability of the technique.

In our study, we did not compare our results with alternative approaches against SQLIAs because most of the existing automated approaches address only a subset of the possible SQLIAs.
(For example, the approach in [8] is focused on type safety, and the one in [25] focuses only on tautologies.) Therefore, we can conclude analytically that such approaches would not be able to identify many of the attacks in our testbed.

As for all empirical studies, there are some threats to the validity of our evaluation, mostly with respect to external validity. The results of our study may be related to the specific subjects considered and may not generalize to other web applications. To minimize this risk, we used a set of real web applications (except for the two applications developed by student teams) and an extensive set of realistic attacks. Although more experimentation is needed before drawing definitive conclusions on the effectiveness of the technique, the results we obtained so far are promising.


5.5 Related Approaches

A wide range of techniques has been proposed to counter SQLIAs. However, when compared to AMNESIA, these solutions have several limitations and shortcomings. In this section we review and discuss the main approaches against SQLIAs.

Defensive Programming.

Developers have proposed a range of code-based development practices to counter SQLIAs. These techniques generally focus on proper input filtering, such as escaping potentially harmful characters and rigorous type-checking of inputs. Many of these approaches are summarized in Reference [11]. In general, a rigorous and systematic application of these techniques is an effective solution to the problem. However, in practice, the application of such techniques is human-based and is therefore less than ideal. For example, many SQLIA vulnerabilities that have been discovered in various applications correspond to cases where the applications contained input-validation operations, but the validation was inadequate. The situation is further complicated because attackers continue to find new attack strings or subtle variations on old attacks that are able to avoid the checks programmers put in place. Lastly, retroactively fixing vulnerable legacy applications using defensive coding practices is complicated, labor-intensive, and error-prone.

Two widely suggested "SQLIA remedies" merit specific mention. Both of them initially appear to offer viable solutions to the SQLIA problem, but do not correctly address it. The first remedy consists of simply checking user input for malicious keywords. This approach would clearly result in a high rate of false positives because an input field could legally contain words that match SQL keywords (e.g., "FROM", "OR", or "AND"). The second remedy is to use stored procedures for database access. The ability of stored procedures to prevent SQLIAs is dependent on their implementation. The mere fact of using stored procedures does not protect against SQLIA.
Interested readers may refer to Section 5.2 and to References [1, 15, 18, 19] for examples of how SQLIAs can be performed in the presence of stored procedures.

Two approaches, SQL DOM [17] and Safe Query Objects [6], use encapsulation of database queries to provide a safe and reliable way to access databases. These techniques offer an effective way to avoid the SQLIA problem by changing the query-building process from one that uses string concatenation to a systematic one that uses a type-checked API. (In this sense, SQL DOM and Safe Query Objects can be considered instances of defensive coding.) Although these techniques are as effective as AMNESIA, they have the drawback that they require developers to learn and use a new programming paradigm or query-development process.

In general, defensive coding has not been successful in completely preventing SQLIA. While improved coding practices can help mitigate the problem, they are limited by the developer's ability to generate appropriate input validation code and recognize all situations in which it is needed. AMNESIA, being fully automated, can provide stronger guarantees about the completeness and accuracy of the protections put in place.
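The false-positive problem of the first remedy, keyword checking, is easy to reproduce. The following Java sketch is a deliberately naive filter written for illustration; the class name NaiveKeywordFilter and its three-word keyword list are hypothetical, and no real product is being modeled.

```java
import java.util.Locale;
import java.util.Set;

/** Deliberately naive keyword filter, shown only to illustrate its false positives. */
final class NaiveKeywordFilter {
    // Hypothetical keyword list taken from the examples in the text.
    private static final Set<String> KEYWORDS = Set.of("FROM", "OR", "AND");

    /** Flags any input that contains one of the keywords as a separate word. */
    static boolean looksMalicious(String input) {
        for (String word : input.toUpperCase(Locale.ROOT).split("\\W+")) {
            if (KEYWORDS.contains(word)) {
                return true;
            }
        }
        return false;
    }
}
```

Such a filter flags the attack string ' OR 1=1 -- but also a harmless address field such as Portland OR, because it looks at words rather than at the syntactic role they play in the final query.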


General Techniques Against SQLIAs.

Security Gateway [22] uses a proxy filter to enforce input validation rules on the data that reaches a web application. Using a descriptor language, developers create filters that specify constraints and transformations to be applied to application parameters as they flow from the web page to the application server. By creating appropriate filters, developers can block or transform potentially malicious user input. The effectiveness of this approach is limited by the developer's ability to (1) identify all the input streams that can affect the query string and (2) determine what type of filtering rules should be placed on the proxy.

WAVES [12] is a penetration testing tool that attempts to discover SQLIA vulnerabilities in web applications. This technique improves over normal penetration-testing techniques by using machine learning to guide its testing. However, like all penetration testing techniques, it cannot provide guarantees of completeness.

Valeur and colleagues [24] propose the use of an Intrusion Detection System (IDS) to detect SQLIAs. Their IDS is based on a machine learning technique that is trained using a set of typical application queries. The technique builds models of normal queries and then monitors the application at runtime to identify queries that do not match the model. The fundamental limitation of learning-based techniques is that they cannot provide guarantees about their detection abilities because their success is dependent on the use of an optimal training set. Without such a set, this technique could generate a large number of false positives and negatives.

Boyd and Keromytis propose SQLrand, an approach that uses key-based randomization of SQL instructions [4]. In this approach, SQL code injected by an attacker would result in a syntactically incorrect query because it was not specified using the randomized instruction set.
While this technique can be very effective, there are several practical drawbacks to this approach. First, the security of the key may be compromised by looking at the error logs or messages. Furthermore, the approach imposes a significant infrastructure overhead because it requires the integration of an encryption proxy for the database.

Static Detection Techniques.

JDBC-Checker is a technique for statically checking the type correctness of dynamically generated SQL queries [8]. Although this technique was not originally intended to address SQLIA, it can detect one of the root causes of SQL-injection vulnerabilities: improper type checking of input. In this sense, JDBC-Checker is able to detect and help developers eliminate some of the code that allows attackers to exploit type mismatches. However, JDBC-Checker cannot prevent other types of SQLIAs that produce syntactically and type correct queries.

Wassermann and Su propose an approach that uses static analysis combined with automated reasoning to verify that the SQL queries generated in the application layer cannot contain a tautology [25]. The scope of this technique is limited, in that it can only address one type of SQLIAs, namely tautology-based attacks, whereas AMNESIA is designed to address all types of SQLIAs.

Taint-based Approaches.


Two similar approaches have been proposed by Nguyen-Tuong et al. [20] and Pietraszek and Berghe [21]. These approaches modify a PHP interpreter to track precise taint information about user input and use a context-sensitive analysis to detect and reject queries in which untrusted input has been used to create certain types of SQL tokens. In general, these taint-based techniques have shown much promise in their ability to detect and prevent SQLIAs. Their main drawback concerns practicality. First, identifying all sources of tainted user input in highly modular web applications introduces a problem of completeness. Second, accurately propagating taint information may result in high runtime overhead for the web application. Finally, the approach relies on a customized version of the runtime system, which affects portability.

Huang and colleagues define WebSSARI, a white-box approach for detecting input-validation-related errors that is based on information-flow analysis [13]. This approach uses static analysis to check information flows against preconditions for sensitive functions. The analysis detects where preconditions are not satisfied and suggests filters and sanitization functions that can be automatically added to the application to satisfy the preconditions. The primary drawbacks of this technique are the assumptions that (1) preconditions for sensitive functions can be adequately and accurately expressed using their type system and (2) forcing input to pass through certain types of filters is sufficient to consider it trusted. For many types of functions and applications, these assumptions do not hold.

Livshits and Lam [14] use a static taint analysis approach to detect code that is vulnerable to SQLIAs. This approach checks whether user input can reach a hotspot and flags this code for developer intervention. A further extension to this work, Securifly [16], detects vulnerable code and automatically adds calls to a sanitization function. This automated defensive coding practice, while effective in some cases, would not prevent all types of SQLIAs. In particular, it would not prevent SQLIAs that inject malicious text into numeric, non-quoted fields.
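The limitation for numeric, non-quoted fields can be illustrated with a small sketch. The sanitizer below (escapeQuotes), the pin column, and the class name are hypothetical and are not taken from Securifly or from any of the cited tools; the sketch assumes only that the automatically inserted sanitization escapes quote characters, which is the classic defense for string literals.

```java
final class NumericFieldInjection {
    // Hypothetical sanitizer: doubles single quotes, as is commonly done
    // to keep input inside a quoted SQL string literal.
    static String escapeQuotes(String input) {
        return input.replace("'", "''");
    }

    // Builds a query in which login is a quoted field and pin (an assumed
    // column) is a numeric, non-quoted field. Both inputs are sanitized.
    static String buildQuery(String login, String pin) {
        return "SELECT info FROM users WHERE login='" + escapeQuotes(login)
                + "' AND pin=" + escapeQuotes(pin);
    }

    // A type check is what the numeric field actually needs:
    // accept the value only if it is a plain integer.
    static boolean isInteger(String input) {
        return input.matches("-?\\d+");
    }

    public static void main(String[] args) {
        String attack = "0 OR 1=1"; // contains no quote, so escaping leaves it unchanged
        // prints: SELECT info FROM users WHERE login='doe' AND pin=0 OR 1=1
        System.out.println(buildQuery("doe", attack));
        // prints: false
        System.out.println(isInteger(attack));
    }
}
```

Because the injected text never needs a quote character to break out of its context, quote escaping has no effect on it, whereas a check on the expected type of the field rejects the input.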

5.6 Conclusion
SQLIAs have become one of the more serious and harmful attacks on database-driven web applications. They can allow an attacker to have unmitigated access to the database underlying an application and, thus, the power to access or modify its contents. In this chapter, we have discussed the various types of SQLIAs known to date and presented AMNESIA, a fully automated technique and tool for detecting and preventing SQLIAs. AMNESIA uses static analysis to build a model of the legitimate queries that an application can generate and runtime monitoring to check the dynamically generated queries against this model. Our empirical evaluation, performed on commercial applications using a large number of realistic attacks, shows that AMNESIA is a highly effective technique for detecting and preventing SQLIAs. Compared to other approaches, AMNESIA offers the benefit of being fully automated and is general enough to address all known types of SQLIAs.
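As a rough illustration of the model-checking idea summarized above, the following sketch compares a runtime query against a model of the legitimate query at the level of SQL tokens. It is a deliberately simplified stand-in and not the AMNESIA implementation: the model here is a single linear token sequence rather than an automaton derived by static analysis, and the class name, the VAR marker, and the tokenizer are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QueryModelSketch {
    // Hypothetical marker in the model for one user-supplied literal.
    static final String VAR = "?VAR";

    // One token: quoted string (with '' escapes), number, word,
    // two-character operator or comment marker, or any single symbol.
    private static final Pattern TOKEN = Pattern.compile(
            "'(?:[^']|'')*'|\\d+(?:\\.\\d+)?|[A-Za-z_][A-Za-z_0-9]*|--|<=|>=|<>|!=|\\S");

    static List<String> tokenize(String sql) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(sql);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // A literal is a complete quoted string or a number.
    static boolean isLiteral(String token) {
        return token.matches("'(?:[^']|'')*'|\\d+(?:\\.\\d+)?");
    }

    // Accepts the query only if it has exactly the token structure of the model,
    // with each VAR position filled by a single literal.
    static boolean accepts(List<String> model, String runtimeQuery) {
        List<String> tokens = tokenize(runtimeQuery);
        if (tokens.size() != model.size()) {
            return false;
        }
        for (int i = 0; i < model.size(); i++) {
            String expected = model.get(i);
            String actual = tokens.get(i);
            boolean ok = expected.equals(VAR)
                    ? isLiteral(actual)
                    : expected.equalsIgnoreCase(actual);
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> model = List.of(
                "SELECT", "info", "FROM", "users", "WHERE", "login", "=", VAR);
        // prints: true
        System.out.println(accepts(model, "SELECT info FROM users WHERE login='doe'"));
        // prints: false
        System.out.println(accepts(model, "SELECT info FROM users WHERE login='' OR 1=1 --'"));
    }
}
```

The injected query is rejected because the attacker's input adds SQL tokens (OR, 1, =, 1, and a comment marker) that the model does not contain, whereas a benign value occupies exactly one literal position.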


Acknowledgments
This material is based upon work supported by NSF award CCR-0209322 to Georgia Tech and by the Department of Homeland Security and United States Air Force under Contract No. FA8750-05-2-0214. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the United States Air Force. Jeremy Viegas developed our test bed infrastructure.

References
1. C. Anley. Advanced SQL Injection In SQL Server Applications. White paper, Next Generation Security Software Ltd., 2002.
2. C. Anley. (more) Advanced SQL Injection. White paper, Next Generation Security Software Ltd., 2002.
3. D. Aucsmith. Creating and maintaining software that resists malicious attack. http://www.gtisc.gatech.edu/aucsmith_bio.htm, September 2004. Distinguished Lecture Series.
4. S. W. Boyd and A. D. Keromytis. SQLrand: Preventing SQL injection attacks. In Proceedings of the 2nd Applied Cryptography and Network Security (ACNS) Conference, pages 292-302, June 2004.
5. A. S. Christensen, A. Møller, and M. I. Schwartzbach. Precise analysis of string expressions. In Proc. 10th International Static Analysis Symposium, SAS '03, volume 2694 of LNCS, pages 1-18. Springer-Verlag, June 2003. Available from http://www.brics.dk/JSA/.
6. W. R. Cook and S. Rai. Safe Query Objects: Statically Typed Objects as Remotely Executable Queries. In Proceedings of the 27th International Conference on Software Engineering (ICSE 2005), 2005.
7. T. O. Foundation. Top ten most critical web application vulnerabilities, 2005. http://www.owasp.org/documentation/topten.html.
8. C. Gould, Z. Su, and P. Devanbu. Static Checking of Dynamically Generated Queries in Database Applications. In Proceedings of the 26th International Conference on Software Engineering (ICSE 04), pages 645-654, 2004.
9. W. G. Halfond and A. Orso. AMNESIA: Analysis and Monitoring for NEutralizing SQL-Injection Attacks. In Proceedings of the IEEE and ACM International Conference on Automated Software Engineering (ASE 2005), Long Beach, CA, USA, Nov 2005.
10. W. G. Halfond, J. Viegas, and A. Orso. A Classification of SQL-Injection Attacks and Counter Techniques. Technical report, Georgia Institute of Technology, August 2005.
11. M. Howard and D. LeBlanc. Writing Secure Code. Microsoft Press, Redmond, Washington, second edition, 2003.
12. Y. Huang, S. Huang, T. Lin, and C. Tsai. Web Application Security Assessment by Fault Injection and Behavior Monitoring. In Proceedings of the 11th International World Wide Web Conference (WWW 03), May 2003.
13. Y. Huang, F. Yu, C. Hang, C. H. Tsai, D. T. Lee, and S. Y. Kuo. Securing Web Application Code by Static Analysis and Runtime Protection. In Proceedings of the 12th International World Wide Web Conference (WWW 04), May 2004.
14. V. B. Livshits and M. S. Lam. Finding Security Vulnerabilities in Java Applications with Static Analysis. In Usenix Security Symposium, August 2005.
15. O. Maor and A. Shulman. SQL Injection Signatures Evasion. White paper, Imperva, April 2004. http://www.imperva.com/application_defense_center/white_papers/sql_injection_signatures_evasion.html.
16. M. Martin, V. B. Livshits, and M. S. Lam. Finding Application Errors and Security Flaws Using PQL: a Program Query Language. In Proceedings of the ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), October 2005.
17. R. McClure and I. Krüger. SQL DOM: Compile Time Checking of Dynamic SQL Statements. In Proceedings of the 27th International Conference on Software Engineering (ICSE 05), pages 88-96, 2005.
18. S. McDonald. SQL Injection: Modes of attack, defense, and why it matters. White paper, GovernmentSecurity.org, April 2002. http://www.governmentsecurity.org/articles/SQLInjectionModesofAttackDefenceandWhyItMatters.php.
19. S. McDonald. SQL Injection Walkthrough. White paper, SecuriTeam, May 2002. http://www.securiteam.com/securityreviews/5DP0N1P76E.html.
20. A. Nguyen-Tuong, S. Guarnieri, D. Greene, J. Shirley, and D. Evans. Automatically Hardening Web Applications Using Precise Tainting Information. In Twentieth IFIP International Information Security Conference (SEC 2005), May 2005.
21. T. Pietraszek and C. V. Berghe. Defending Against Injection Attacks through Context-Sensitive String Evaluation. In Proceedings of Recent Advances in Intrusion Detection (RAID 2005), 2005.
22. D. Scott and R. Sharp. Abstracting Application-level Web Security. In Proceedings of the 11th International Conference on the World Wide Web (WWW 2002), pages 396-407, 2002.
23. A. Seesing and A. Orso. InsECTJ: A Generic Instrumentation Framework for Collecting Dynamic Information within Eclipse. In Proceedings of the eclipse Technology eXchange (eTX) Workshop at OOPSLA 2005, pages 49-53, San Diego, USA, October 2005.
24. F. Valeur, D. Mutz, and G. Vigna. A Learning-Based Approach to the Detection of SQL Attacks. In Proceedings of the Conference on Detection of Intrusions and Malware and Vulnerability Assessment (DIMVA), Vienna, Austria, July 2005.
25. G. Wassermann and Z. Su. An Analysis Framework for Security in Web Applications. In Proceedings of the FSE Workshop on Specification and Verification of Component-Based Systems (SAVCBS 2004), pages 70-78, 2004.

^ An early version of this work was presented in [9].
