Protocol-Level Evasion of Web Application Firewalls
Ivan Ristić, Black Hat USA 2012
Web application firewalls (WAFs) are security tools designed to provide an independent security
layer for web applications. Implemented as appliances, network sniffers, proxies or web server
modules, they analyze inbound and outbound data and they detect and protect against attacks.
At some point in the last couple of years, WAFs became an accepted security best practice. It
took a lot of time and a lot of struggle, and it was not going to happen until the PCI Council gave
the WAFs a serious push when they made them an integral part of PCI compliance [1]. (I
remember almost hearing a collective sigh of relief from the WAF vendors. Their daily lives
suddenly became a lot easier.) Today, we have a wide deployment of WAFs, although the
doubts and controversies remain.
Anyhow, I am not here today to talk about the slow adoption of WAFs. That would take too long
and would distract us from more interesting things. I want to talk about something else: how
good are WAFs at doing their job? More specifically, I want to focus on protocol-level evasion,
which is a fairly low-level aspect of WAF operation and one that is often forgotten.
My point is that these things need to be talked about. Because of the various realities of their
business existence, vendors cannot and will not build great security products alone. It can only
be done in effective collaboration with the users, with them building the products and us keeping
them in check. Unless we make it clear that technical quality is important to us, we're not going
to get it.
Thus, my main reason for being here today is my desire to expose the inner workings of WAFs,
increase transparency, and work to improve the effectiveness of WAFs. It all comes down to the
question I asked earlier: how good are WAFs at doing their job?
It does not help that, despite the continuous vocal opposition to WAFs, security researchers and
penetration testers aren't actually ensuring the technical weaknesses of WAFs are discussed
properly. Ask yourselves this: when did you last see a good technical comparison of WAFs? Or,
rather, did you ever see a good technical comparison? The answer here is a clear no. The
concept behind WAFs is sound, but we still have a way to go.
In terms of protection, we should accept WAFs for what they really are — a method of
increasing the cost of attacks, but not necessarily something that can repel everyone. I have a
feeling that WAFs could be much more useful if more organizations stopped treating them as
specialized IPS tools for HTTP. There are many other use cases with tremendous potential, for
example application security monitoring (ASM), attack surface reduction, application hardening,
policy enforcement, and so on. Unfortunately, application security budgets are not very big, and
these techniques require a significant time investment.
Virtually all of the information here applies not only to WAFs, but to IDS and IPS tools, and
deep-inspection firewalls. In fact, bypassing network-centric security tools is bound to be easier
because, in general, they perform less HTTP processing (parsing) than WAFs.
The job is not trivial. As any IDS vendor will tell you, interpreting traffic passively is fraught with
traps and problems. WAFs have the luxury of having more CPU cycles available to perform
traffic parsing and analysis, but the smarter you get in your parsing, the easier it is for the
complexity to overwhelm you, at which point you become a victim of evasion. Broadly speaking,
inspection approaches sit on a spectrum.
On one end is a simple byte stream inspection, where you treat a TCP stream or some major
part of a HTTP transaction as a series of bytes, and you try to match your signatures against
that. This approach is powerful, as it can treat any data (no protocol parsing is necessary), but it
can be easily evaded. For example, think about the support for header folding in HTTP. The
mechanism can be used to split a single header value across several lines. Further, simple
inspection does not enable advanced features.
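For illustration, here is what a folded header might look like; the header name and value are
made up. The second line begins with whitespace, which marks it as a continuation of the
previous line:

X-Example: first part of the value
 second part of the value

A byte-stream signature that expects the whole value on a single line will not match, whereas a
parser that unfolds headers sees the complete value.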
On the other end is smart in-context inspection, with full protocol parsing and evaluation of data
in the appropriate context. For example, you know that a piece of data is a value of a request
header, and you treat the data as such. This approach is very powerful, but very difficult to
implement successfully because you have to deal with dozens of different backend
implementations and their parsing quirks.
I experienced this problem first hand in the years of working on ModSecurity [9], which is a
popular open source WAF. I started ModSecurity in 2002, and worked on it until 2009, maybe
2010. I had always wanted ModSecurity to be very smart, but, every time I pushed it into that
direction, I discovered that being smart is not always the best approach. What I eventually
realised is that you need to be smart and dumb at the same time.
During my research of this topic, I used ModSecurity and ModSecurity Core Rule Set [10] (a
separate distribution of security rules) to test against. In the process, I discovered two previously
unaddressed issues. They were disclosed to Trustwave in June 2012 [2] and consequently fixed
in ModSecurity 2.6.6 and ModSecurity Core Rule Set 2.2.5. The details of the problems will
follow later in the text.
Impedance Mismatch
There's a term for what I am talking about; it's called impedance mismatch [3]. This is a very
important concept for security tools: you're interpreting the stream of data in one way, but
there's a danger that whatever you are protecting is interpreting the same data differently.
I'll give you one example: years ago I was testing a passive WAF. Back then, passive WAFs
were easy to sell, because people wanted security but without risk of downtime. To configure
this particular WAF to monitor a web site, you had to put it on a network somewhere, where it
could see all traffic, and input a web site's hostname.
Deciding whether a connection should be inspected is a decision point. Decision points are
critical for WAFs; every time they make a decision there's a risk that they'll do the wrong thing.
And it's a very tricky problem, too, because if you make a mistake you become completely blind
to attacks.
I succeeded in bypassing this device on my first attempt, and I did it by adding a single
character to the request payload. Here's the request I sent:
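A minimal reconstruction of that request (the hostname is illustrative; what matters is the trailing
dot):

GET / HTTP/1.1
Host: www.example.com.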
To make this request bypass the WAF, I added a single dot character to the end of the Host
request header. Yes, really. Apparently, the WAF developers had not considered the various
alternative representations of the hostname — they implemented only what worked for them.
There's a bunch of other things I could have tried here. Omitting the Host request header or
using a non-existing hostname often works. A WAF may be configured to select sites and
policies based on the hostname, whereas the backend server may simply always fall back to the
default site when the hostname is not recognized. In addition, for performance reasons WAFs
may stop monitoring a connection after determining that the first request is not intended for the
site, but you can often continue to submit further requests (with the correct hostname) when
persistent connections are enabled.
The main lesson here is that security products must be designed to use their most restrictive
policy by default, and relax policy only when there is a good reason to do so. In other words,
they must be designed to fail securely.
Exploitation of Decision Points for Universal Evasion
I want to go back to decision points because it's a very important concept to understand. A
decision point occurs at any place in the code where the implementation logic has to branch. In
the previous evasion example, the key decision point occurred when the WAF examined the
hostname and determined that it did not match the site that is being protected.
Any decision point can potentially be turned into an evasion point: find an input that causes the
WAF to make one decision while the backend makes a different one, and the attack payload
slips through unexamined. I am not going to attempt to enumerate every decision point here.
However, as part of this talk I am releasing a catalog of evasion techniques, and there the goal
is to enumerate everything. In addition, I am also releasing a number of tests along with a
simple tool that you can use to test these things yourselves. All these things are now on GitHub
[4], so feel free to try them out. I'd be delighted if some of you would find the topic so interesting
that you'll be compelled to join the project and contribute refinements, new tests and tools.
Virtual Patching
Virtual patching is the process of addressing security issues in web applications without making
changes to application code. For various reasons, organizations are often unable to address
security issues in a timely manner (e.g., no development resources, not allowed to modify the
source code, etc). In such situations, virtual patching can mitigate the problem by reducing the
window of opportunity for the attack. Further, because the application itself is left alone, virtual
patching can be used even with closed-source applications.
Virtual patching is probably one of the most loved WAF features because of its narrow focus
(breakage potential is limited) and potential high value (vulnerability is mitigated).
However, the same aspect that is making virtual patches so useful (precision, and the ability to
control exactly what is allowed through), is also making them prone to bypasses. A large
number of decision points are required to deliver the precision, but the more decision points
there are, the more opportunities for evasion there are.
Before we proceed further, I want to be clear what I mean when I say virtual patching, because
there isn't only one definition of it. Some people have low expectations of this technique and
they might say that they're using virtual patching even when they do things such as enable
blocking in the part the application that is vulnerable (they might be deploying in monitoring-only
mode elsewhere). Or they might be increasing the aggressiveness of their blocking in the
vulnerable spot. These approaches, although no doubt useful, are not what I have in mind.
My definition is much stricter. For me, a virtual patch is what you produce when you spend the
time to understand the part of the application that is vulnerable and the time to understand the
flaw, and in the end produce a patch in which you accept only data that you know to be valid.
This approach is also known as whitelisting, or a positive security model. It is powerful because
you don't need to know what attacks look like; you only need to know what good data looks like.
The catch is that good virtual patches require a great deal of knowledge. They are tricky to
implement correctly, as you shall soon see. A virtual patch typically consists of two steps:
1. Activation, where you examine the request path to determine if the patch should be
enforced. The site is vulnerable in only one location, so you need to ensure your patch
runs only there and does not interfere with the rest of your site.
2. Inspection, where you examine the vulnerable parameter(s) to determine if they are
safe to allow through
Looking at these two steps, ask yourselves this:
● What if I manipulate the path so that the patch is not activated, but the request is still
delivered to the correct location in the backend?
● What if I manipulate parameters so that my attack payload is missed by the WAF, but
the request is still normally processed by the backend?
Let's take a look.
Let's suppose that we have an application vulnerable to SQL injection. Normally, you would
exploit it by sending something like this:
/myapp/admin.php?userid=1PAYLOAD
ModSecurity is very good at giving you near-complete control over what is and isn't allowed. Not
all WAFs are able to do this. Those that have fewer controls might be easier to evade. Those
that have more controls are better in the hands of an expert, but also offer more room for
mistakes.

If the backend supports PATH_INFO, you can append arbitrary content to the path and the
request will still be routed to the same script:
/myapp/admin.php/xyz?userid=1PAYLOAD
Simply by appending some random content to the path you completely evade the virtual patch.
The attack does not work against the <Location> tag in Apache, because the value supplied
there is treated as a prefix.
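For example, a patch wrapped in <Location>, along the lines of the following sketch (the
disruptive actions are assumed to come from SecDefaultAction), remains active even when
extra path segments are appended:

<Location /myapp/admin.php>
SecRule ARGS:userid "!^\d+$"
</Location>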
Not all web servers support PATH_INFO, or at least not in all situations. In such cases, a similar
feature called path parameters (there's only a vague mention of it in section 3.3 of the URI RFC)
may come in handy. Consider the following URL:
/myapp/admin.php;param=value?userid=1PAYLOAD
In both cases the operation of this evasion technique is the same: we alter the path so that it is
not matched correctly by the WAF, but still works in the backend.
Take a look at the following example, which is representative of many real-life rules:
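A sketch of such a chain, using the vulnerable page and parameter from our example (the
actions are illustrative):

SecRule REQUEST_FILENAME "@streq /myapp/admin.php" "chain,phase:2,t:none,log,deny"
SecRule ARGS:userid "!^\d+$"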
This is what is known in ModSecurity as a chain of rules. The first rule will look at the path to
determine if we're in the right place. The second rule, which runs only if the first rule was
successful, will perform parameter inspection.
On the surface, the above chain works as expected, but, sadly, it suffers from more than one
problem. First, it is vulnerable to the PATH_INFO attack. Append anything to the path, and the
rule is bypassed. The mistake here is in using the @streq operator, which requires a complete
match.
Second, the above rule does not anticipate the use of any path obfuscation techniques. For
example, consider the following functionally equivalent paths:
/myapp//admin.php
/myapp/./admin.php
/myapp/xyz/../admin.php
When the <Location> tag is used, Apache handles path normalization and thus the above
attacks. But with self-contained ModSecurity rules, rule writers are on their own. There's a great
feature in ModSecurity for this, and it's called transformation pipelines [5]. The idea is that,
before a matching operation is attempted, input data is converted from the raw representation
into something that, ideally, abstracts away all the evasion issues.
In this case, all we need to do is apply the normalizePath transformation function to take care of
path evasion issues for us:
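An updated version of the chain might look like this (again a sketch, with illustrative actions):

SecRule REQUEST_FILENAME "@beginsWith /myapp/admin.php" "chain,phase:2,t:none,t:normalizePath,log,deny"
SecRule ARGS:userid "!^\d+$"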
In addition, the updated version of the rule uses @beginsWith, which is safer (@contains is a
good choice too), to counter PATH_INFO attacks. Unfortunately, we are not done: if the backend
runs on Windows, further path obfuscation is possible. Consider the following variations:
/myapp\admin.php
/myapp\ADMIN.PHP
These will actually work against either the Apache <Location> approach or the self-contained
ModSecurity rule used above. (You may not necessarily be able to execute these attacks from a
browser, because they sometimes normalize the request URI and in the process convert
backslashes to forward slashes.) The first attack will work because Windows will accept the
backslash character as path separator. The second attack will work because the backend
filesystem on Windows is not case sensitive. This attack is not Windows specific and will work
on any filesystem that is not case sensitive (e.g., HFS).
To counter these new evasion techniques in Apache, we have to ditch the comfort of the default
<Location> tag and use the more powerful and more difficult to use pattern-based location
matching. After some trial and error, this is what I came up with:
<Location ~ (?i)^[\x5c/]+myapp[\x5c/]+admin\.php>
SecRule ARGS:userid "!^\d+$"
</Location>
ModSecurity has a much better way to handle these attacks, whereby in the transformation
pipeline instead of normalizePath, which was designed for Unix platforms, you use
normalizePathWin, which was designed for Windows platforms. In addition, we are going to
convert the path to lowercase:
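A sketch of the rule with the extended transformation pipeline (actions illustrative):

SecRule REQUEST_FILENAME "@beginsWith /myapp/admin.php" "chain,phase:2,t:none,t:lowercase,t:normalizePathWin,log,deny"
SecRule ARGS:userid "!^\d+$"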
The first transformation will convert input to lowercase. The second will normalize the path, first
converting backslashes to forward slashes and then performing path normalization as
normalizePath does.
Path segment parameters can be used for evasion in yet another way, because they are
allowed in any path segment, not just the last one. Consider the following requests:

/myapp;param=value/admin.php?userid=1PAYLOAD
/myapp;/admin.php?userid=1PAYLOAD

To account for them in Apache, the pattern-based <Location> has to allow an optional
parameter after every segment:
<Location ~
(?i)^[\x5c/]+myapp(;[^\x5c/]*)?[\x5c/]+admin\.php(;[^\x5c/]*)?>
SecRule ARGS:userid "!^\d+$"
</Location>
This seems to work, but can we reasonably expect anyone to write and maintain such rules?
ModSecurity does not currently have a transformation function that removes path segment
parameters, which would be the ideal approach to keep virtual patches relatively simple. In the
absence of such a transformation function, the possible defenses are to use the above pattern
with an @rx operator, or to detect the presence of invalid path segment parameters (which may
not be easy, because there are legitimate uses for them, for example session management) and
reject the transaction on that basis alone.
This evasion technique cannot be used against IIS, but it works well against Apache when it is
running on Windows, presumably because Apache does not implement any Windows-specific
countermeasures. Thanks to Johannes Dahse, who pointed this out to me when he was
reviewing this document.
There are further evasion techniques that are specific to IIS:

● Encode slashes using %u encoding (does not work against Apache, which does not
support %u encoding)
Against some very old IIS installations:
● Bypass patch activation using Alternate Data Streams (e.g., append ::$DATA to the
path)
Fortunately, most of these issues were addressed in newer IIS releases (IIS7+). Unfortunately,
WAFs are often deployed to protect legacy systems that no one knows how to upgrade, so
chances are the problems are still rife. A good WAF should be able to handle all of the above
problems within its default protection rules.
Attacking Parameter Verification
Even after a patch is activated, it still needs to apply correct checks to the correct parameters. In
my experience, this is generally easy to bypass with one of the techniques presented in this
section.
For example, you can submit the same parameter more than once, in the hope that the WAF
will inspect one occurrence while the backend uses another:

/myapp/admin.php?userid=1&userid=1PAYLOAD
Another similar approach is to vary the case of the parameter name, in the expectation that
some mechanisms implement case-sensitive name matching.
PHP applications often retrieve input using the $_REQUEST superglobal, for example:

$_REQUEST['userid']

Because $_REQUEST can also be populated from cookies, the attack can be delivered in a
cookie rather than a query string parameter:

GET /myapp/admin.php HTTP/1.1
Cookie: userid=1PAYLOAD
With a vulnerable PHP application, the above will bypass a WAF that does not treat cookies as
parameters. Further, even though the Cookie specification defines cookie values as opaque,
PHP will URL-decode the supplied values.
HTTP Parameter Pollution (HPP) takes the duplicate-parameter idea further. Consider a request
that submits the same parameter twice:

/myapp/admin.php?userid=1&userid=2

How such a request is interpreted varies from platform to platform: PHP, for example, uses the
last occurrence, whereas classic ASP concatenates all occurrences with a comma (much more
information is available in Luca's and Stefano's presentation [6]).
This technique is especially useful for SQL injection because the comma that is the byproduct of
concatenation in the case of ASP applications can be arranged to work as part of SQL.
Because same-name parameters are often used by applications (selecting more than one
option in a list will produce them), HPP is difficult to detect without some knowledge of the
application.
To counter HPP, virtual patches should be written to restrict the number of same-name
parameters submitted to the application (there's an example of this later in the text). Assuming
individual parameter values are properly validated, it is difficult to use HPP for protocol-level
evasion. It's far more useful for the evasion of the signatures that attempt to detect specific
issues (e.g., SQL injection).
Some platforms ignore certain characters at the beginning of parameter names; PHP, for
example, strips leading whitespace [8]. By placing a URL-encoded space (+) in front of the
parameter name, we get a request in which the WAF sees a parameter called " userid" while the
backend sees userid:

/myapp/admin.php?+userid=1PAYLOAD
This problem emphasizes the often-used (and wrong) approach in virtual patching, where you
inspect only parts of the request. For example, if you remember our patch from earlier, it had
this rule:
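(As in the earlier sketches, the exact rule is illustrative.)

SecRule ARGS:userid "!^\d+$"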
Clearly, if the WAF does not see the userid parameter, it won't do anything. To address this
problem, virtual patches should be designed to reject any unknown parameters.
Invalid URL-Encoding
There's an interesting behavior on the ASP platform (and potentially elsewhere) that applies
here: if you supply invalid URL encoding, ASP will remove the % character from the string. As a
result, the backend sees a parameter named userid, while a WAF that preserves the original
name sees %userid:

/myapp/admin.php?%userid=1PAYLOAD
This reminds me of a very old problem from the time when many web applications were still
developed in C, and had their own custom URL decoders. Some applications would not check
the range of characters used in the encoding, and would proceed to decode payloads that are
not hexadecimal numbers. For example, normally you would encode i as %69:
/myapp/admin.php?user%69d=1PAYLOAD
But with an incorrectly implemented decode function, %}9 might work just the same:
/myapp/admin.php?user%}9d=1PAYLOAD
Here is a sketch of a stricter patch, which denies by default and, as promised earlier, restricts
the number of same-name parameters (the location and the counting rule are illustrative; the
rule requires exactly one userid parameter):

<Location /index.php>
SecDefaultAction phase:2,t:none,log,deny
SecRule &ARGS:userid "!@eq 1"
</Location>

A patch like this is only as good as the WAF's ability to see the parameters in the first place,
which brings us to request bodies and the Content-Type (C-T) header, which WAFs use to
decide how to parse them.
If you cannot identify the encoding, the best you can do is treat the request body as a stream of
bytes, and inspect that. That is actually an excellent second line of defense, but it does not work
very well in all cases, virtual patching in particular.
The WAF may be designed to reject transactions that do not have C-T set, in which case you
may attempt to submit some random value that does not match the type of the payload. With
such C-T the transaction will be valid, strictly speaking, yet the WAF will not be able to process
it correctly. The backend application, with its hard-coded functionality, will happily handle the
data.
In the most difficult case, that of a WAF rejecting unknown MIME types, the best approach is to
submit multipart/form-data in the Content-Type header, along with a request body that can be
parsed as both urlencoded and multipart format. The former will be designed for the application
to consume, and the latter for the WAF.
Consider Apache Commons FileUpload [11], a popular Java library for handling file uploads. Its
MIME type check is lax: any Content-Type value that begins with multipart/ is treated as a
multipart submission. One approach, then, might be to send multipart/whatever as the MIME
type, with the body in multipart/form-data. The application will decide that the request body does
indeed contain multipart/form-data and process it accordingly. But what will the WAF do?
If you're lucky, your WAF will block upon encountering an invalid or unknown MIME type. If it
does not, anything you put in the request body will potentially evade inspection.
Apache Commons FileUpload is not the only library with lax MIME type detection. A quick
inspection of other popular libraries reveals equally bad or worse implementations.
Evading ModSecurity
ModSecurity has only a small number of rules in its default configuration. The
purpose of these rules is to check that processing has been performed correctly. These rules do
not check that the supplied MIME type is valid, or known, so using multipart/whatever
against ModSecurity will work when it is used to protect applications relying on the FileUpload
library.
If you are using ModSecurity, but not the separate Core Rule Set package (see below), you
need to add custom rules to your configuration to ensure only known MIME types are allowed
through. ModSecurity does not perform bytestream inspection of unknown content, which
means that it will pass through whatever it does not recognize.
It is difficult for WAFs to be secure by default because different applications will use different
MIME types. If you start with strict configuration, usability suffers. If you start with a lax
configuration, it is the security that suffers. WAFs can't win here.
The Core Rule Set, on the other hand, does attempt to restrict request MIME types: it checks
the Content-Type header against a configurable list of allowed values. Unfortunately, the check
relies on the @within operator, which tests whether the supplied value appears as a substring of
the allowed list. That means any substring of that list would work, a single character even. So
now we know that we can bypass this control using invalid MIME types, which may be useful.
One such invalid MIME type is very useful in combination with Apache Commons FileUpload. If
you recall, its check is lax. If we supply multipart/ as the MIME type, the library will accept that
as multipart/form-data. At the same time, ModSecurity will not know what to do with the payload,
and will allow it through without any processing.
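A sketch of such a request (the path, host, boundary, and field name are illustrative; the
boundary parameter is kept so that the library can still parse the body):

POST /myapp/upload HTTP/1.1
Host: www.example.com
Content-Type: multipart/; boundary=xxxx

--xxxx
Content-Disposition: form-data; name="userid"

1PAYLOAD
--xxxx--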
If you are a CRS user, you should upgrade to 2.2.5, which should address this problem.
Multipart Evasion
Multipart parsing has always been my favourite area of evasion. Not only is the specification
very shallow and ambiguous, but the multipart/form-data encoding is also often neglected,
because it's needed only by sites that support file uploads. People tend to forget that even
applications that do not use file uploads can be forced into processing request bodies in
multipart/form-data format.
Multipart parsers commonly suffer from the following problems:

● Partial implementations, which cover only what is commonly used by major browsers,
but leave the edge cases unimplemented (e.g., HTTP request header folding)
● Lack of proper parsing, relying instead on crude substring matching or regular
expressions
● Not detecting invalid or ambiguous submissions
● Trying to recover from serious problems, in the name of interoperability
Let's now look at all processing steps, one at a time.
Content-Type evasion
As already discussed, the goal with this approach is to trick the WAF into treating a
multipart/form-data MIME type as something else, while the backend is operating normally.
There are a number of approaches we can try here; the invalid MIME types discussed earlier,
such as multipart/whatever and the truncated multipart/, are good examples.
Boundary evasion
The goal of this evasion technique is to get the backend application to use one value for the
boundary, while tricking the WAF to use another value. Once that's possible, you can craft
attack payload in such a way that it can be correctly processed no matter what boundary value
is used. The end result is that the WAF misses the attack payload.
Examples:
Content-Type: multipart/form-data; boundary =0000; boundary=1111
Content-Type: multipart/form-data; boundary=0000; boundary=1111
Content-Type: multipart/form-data; boundary=0000; BOUNDARY=1111
Content-Type: multipart/form-data; BOUNDARY=1111; boundary=0000
Content-Type: multipart/form-data; boundary=ABCD
Content-Type: multipart/form-data; boundary=0000'1111
Content-Type: multipart/form-data; boundary="0000"
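To illustrate the idea, consider a request that declares two boundaries, say boundary=0000 and
boundary=1111, combined with a body crafted to be valid under either one (the parameter
names and values are illustrative). A parser that picks 1111 sees only the harmless part,
because everything after the closing 1111 delimiter is epilogue and is ignored; a parser that
picks 0000 treats the 1111 section as preamble and sees only the part carrying the payload:

--1111
Content-Disposition: form-data; name="x"

harmless
--1111--
--0000
Content-Disposition: form-data; name="userid"

1PAYLOAD
--0000--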
Part evasion
The multipart format allows arbitrary data to appear before the first part, as well as after the last
part. In some cases this can be exploited. For example, PHP used to process all parts, even
those that appeared after the part marked as being last.
In 2009, Stefan Esser reported [7] that this problem could be used to bypass ModSecurity
running in front of a PHP application.
Content-Disposition evasion
Each part carries its metadata in the Content-Disposition (C-D) part header, and that header is
another place where the WAF and the backend can be made to disagree. One idea is to supply
more than one name parameter, hoping that the two sides will pick different ones. PHP, for
example, will always use the last name parameter it accepts, but it's also picky when it comes to
parsing, and may "miss" a value that is not in exactly the format it expects.
ModSecurity will detect this attack because it does not allow multiple same-name parameters in
a Content-Disposition part header. Similarly, it does not allow unknown parameters to be used.
If the filename parameter is present, the part is considered to contain a file. If the filename
parameter is not present, the part is considered to contain a normal parameter.
There was already a bypass in this area, which Stefan Esser discovered [7] in 2009 (that was
after I had left the project, and I was not involved in the patching effort). It applied to a
deployment of ModSecurity protecting a PHP application. Stefan determined that PHP
supported single quotes for escaping in the C-D header, whereas ModSecurity supported only
double quotes. ModSecurity was subsequently patched to add support for single-quote
escaping, and all was well.
Earlier this year, when I started researching evasion, I decided to re-visit all historic evasion
issues and double-check them. The logic is simple: where you discover one unusual behavior
you may discover another. After reading the PHP source code, I understood that the original
vulnerability was not in the code that parses key-value pairs, but in the code that executes
before that, and which breaks the C-D value into key-value pairs.
In such an attack, the quote characters (double quotes or single quotes) effectively shield the
semicolon, which is the termination character for a key-value pair. So, as far as PHP is
concerned, the semicolon is data. For everyone else, it's part of the C-D syntax.
Once you understand the core issue, it's clear that ModSecurity's patch was insufficient. All you
need to exploit the problem is to add a single non-quote character as the first character in the
parameter value, producing a C-D header along these lines (an illustrative reconstruction):
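Content-Disposition: form-data; name=x';filename="';name=userid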
PHP allows you to use quote characters anywhere on the line, but ModSecurity's parser is quite
strict, so the only place where the exploit can be used is in the parameter value.
This is how PHP will see the part:

name = userid

Although there are two name parameters, the first one is overwritten by the second one.
And this is what ModSecurity will see:

name = x'
filename = ';name=userid
The evasion messes up the parameter names as observed by ModSecurity. That alone can
interfere with any virtual patches that may be deployed. More importantly, the part is treated as
a file and bypasses all inspection.
Acknowledgements
Thanks to Johannes Dahse, Mario Heiderich, Krzysztof Kotowicz, Marc Stern, and Josh Zlatin
for providing feedback to the draft versions of this paper.
References
1. PCI Data Security Standard (PCI DSS): Application Reviews and Web Application
Firewalls Clarified (2008)
2. ModSecurity and ModSecurity Core Rule Set Bypasses, by Ivan Ristic (2012)
3. External Web Application Protection: Impedance Mismatch, by Ivan Ristic (2005)
4. IronBee WAF Research on GitHub
5. ModSecurity Reference Manual, Transformation functions
6. HTTP Parameter Pollution, by Luca Carettoni and Stefano di Paola (2009)
7. Shocking News in PHP Exploitation, by Stefan Esser (2009)
8. PHP Peculiarities for ModSecurity Users, by Ivan Ristic (2007)
9. ModSecurity open source web application firewall
10. ModSecurity Core Rule Set
11. Apache Commons FileUpload