XPath Cheat Sheet
What Is XPath?
A popular functionality in web development is automation: the hands-free manipulation of a
website's Document Object Model (DOM). If your target websites don’t support application
programming interface (API) calls directly or via Hootsuite, Buffer, or Zapier, how do you
write programs to locate web elements in your browser and act upon them?
Here is where XPath plays a role. XPath, short for “XML Path Language,” is a compact way
of teaching a web automation engine such as Selenium WebDriver to locate elements in an
HTML source code document.
To demonstrate its compactness, the XPath string is only 55% as long as the corresponding
selector path.
Prerequisites
As amazing as XPath appears to be, learning XPath requires a working knowledge of HTML,
CSS, and JavaScript/jQuery, and the ability to open the Inspector panel in your preferred
browser.
If you’re confident in the above, following the examples in this XPath cheat sheet is easier. If
not, bookmark this page and come back when you’re ready.
Expressions/Queries
XPath expressions include XPath queries, which typically refer to absolute and relative
XPaths, and XPath function calls, which can be independent of an XML or HTML document.
Make sure to distinguish XPath queries from XQuery, a query-based functional programming
language that supports XPath but is outside this cheat sheet’s scope.
This XPath cheat sheet differs from what we’re used to writing because we’ve found the best
way to learn XPath is by looking at multiple examples and intuitively deriving the XPath
pattern from them. When in doubt, use this website to test out XPath queries.
The table below presents static XPath examples, all extracted via the Inspector (see
Prerequisites) from functional websites at the time of writing. The general XPath syntax
follows later below.
Syntax
Hence the basic XPath syntax is as follows, reusing the to-do list example above:
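The symbol @ in XPath expressions is shorthand for the attribute axis: @id abbreviates
attribute::id. An XPath axis describes a relationship to the current node on the XML/HTML
hierarchy tree. The two-colon syntax (::) separates the axis name from the node test that
follows it, as in child::div.
A step is an XPath segment between consecutive forward slashes (/), such as html in
absolute paths. Each step consists of an axis (child by default), a node test, and optional
predicates in square brackets.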
Selectors
XPath selectors are where XPath expressions and CSS selectors intersect. The table below
illustrates the relationship between XPath axes and their corresponding CSS selectors:
Order selectors enclose ordinal numbers or last() with the selector constraint [ ]:
XPath                                  CSS selector
//input[@disabled]                     input:disabled
//button[@id="ok"][@type="submit"]     button#ok[type="submit"]
//section[.//h1[@id='intro']]          section:has(h1#intro)
//a[@target="_blank"]                  a[target="_blank"]
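The attribute predicates above can be tried without a browser. The following is a minimal sketch using Python's standard-library xml.etree.ElementTree, which supports only a subset of XPath; the markup is a hypothetical page written as well-formed XML, and paths start with ".//" because ElementTree searches relative to an element.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup (not from the cheat sheet), written as well-formed XML.
PAGE = """
<html><body>
  <form>
    <input name="old" disabled="disabled"/>
    <input name="new"/>
    <button id="ok" type="submit">OK</button>
    <button id="cancel" type="button">Cancel</button>
  </form>
  <a href="/docs" target="_blank">Docs</a>
  <a href="/">Home</a>
</body></html>
"""

root = ET.fromstring(PAGE)

# [@disabled] tests that the attribute exists, whatever its value.
disabled_inputs = root.findall(".//input[@disabled]")

# Two consecutive predicates must both hold.
ok_buttons = root.findall(".//button[@id='ok'][@type='submit']")

# [@target='_blank'] tests the attribute value.
new_tab_links = root.findall(".//a[@target='_blank']")

print([e.get("name") for e in disabled_inputs])  # ['old']
print([e.text for e in ok_buttons])              # ['OK']
print([e.get("href") for e in new_tab_links])    # ['/docs']
```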
Pro tip: You can chain XPath selectors with consecutive selector constraints, but the order
matters. For example, these two XPath queries have different meanings, as explained
below:
● //a[1][@href='/']
○ Get the first <a> tag and check its href has the value '/'.
● //a[@href='/'][1]
○ Get the first <a> with the given href.
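To see why predicate order matters, it may help to model each predicate as a filter applied to the result of the previous one. The sketch below is a plain-Python model, not an XPath engine; the list of links and both helper functions are hypothetical and stand in for the <a> children of a single parent element.

```python
# Model: the <a> children of one parent element, in document order.
# Each dict is a hypothetical link; only its href matters here.
links = [{"href": "/about"}, {"href": "/"}, {"href": "/"}]

def first(nodes):
    """Mimic the [1] predicate: keep only the first node of the current set."""
    return nodes[:1]

def href_is_root(nodes):
    """Mimic the [@href='/'] predicate: keep nodes whose href equals '/'."""
    return [n for n in nodes if n["href"] == "/"]

# //a[1][@href='/']: take the first link, then test its href.
first_then_filter = href_is_root(first(links))

# //a[@href='/'][1]: filter by href, then take the first survivor.
filter_then_first = first(href_is_root(links))

print(first_then_filter)  # []
print(filter_then_first)  # [{'href': '/'}]
```

In this model the first query returns nothing because the first link points to /about, while the second returns the first link that points to /.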
Predicates
You can use logical operators in XPath queries:
XPath                               Description
//span/text()                       Get the inner text of the <span> tag. For
                                    <span>Click here</span>, the result is "Click here".
//*/a[@id="attention"]/../name()    Find the name of the parent element of an <a> tag
                                    with id="attention".
//body//comment()                   Get all comments under the <body> tag.
Extracting data from multiple elements is straightforward. The following XPath expressions
apply to the same HTML example:
<div>
<a class="pink red" href="http://banks.io">oranges</a>
<a class="blue" href="http://crime.io">and lemons</a>
<a class="green" href="http://skyscraper.io">apple</a>
<a class="violet" href="http://leaks.io">honey</a>
<a class="amber" href="http://technology.io">mint</a>
<input type="submit" id="confirm">Go!</input>
</div>
XPath          Description
//a/@href      Get the URLs (the href string value) in all <a> tags:
               http://banks.io
               http://crime.io
               http://skyscraper.io
               http://leaks.io
               http://technology.io
//a/text()     Get the inner text of all <a> tags:
               oranges
               and lemons
               apple
               honey
               mint
//a/@class     Get the classes of all <a> tags:
               pink red
               blue
               green
               violet
               amber
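The three queries above can be approximated with Python's standard library. This is a sketch under one assumption worth stating: xml.etree.ElementTree does not support a final /@href or /text() step, so the code selects the <a> elements with a path and then reads the attribute or text from each element.

```python
import xml.etree.ElementTree as ET

# The HTML example from this section, parsed as XML.
SNIPPET = """
<div>
<a class="pink red" href="http://banks.io">oranges</a>
<a class="blue" href="http://crime.io">and lemons</a>
<a class="green" href="http://skyscraper.io">apple</a>
<a class="violet" href="http://leaks.io">honey</a>
<a class="amber" href="http://technology.io">mint</a>
<input type="submit" id="confirm">Go!</input>
</div>
"""

root = ET.fromstring(SNIPPET)
links = root.findall(".//a")

hrefs = [a.get("href") for a in links]     # stands in for //a/@href
texts = [a.text for a in links]            # stands in for //a/text()
classes = [a.get("class") for a in links]  # stands in for //a/@class

print(hrefs[0])   # http://banks.io
print(texts)      # ['oranges', 'and lemons', 'apple', 'honey', 'mint']
print(classes)    # ['pink red', 'blue', 'green', 'violet', 'amber']
```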
The table below shows ways to extract data from an element based on its attribute value—
note the mandatory use of @ in the final step of each XPath query:
green
//*[contains(@class, "red")]/@href    Get the URL (the href string value) in any tag
                                      with the class 'red':
                                      http://banks.io
//input[@id="confirm"]/@type          Get the type attribute of an <input> tag with
                                      id="confirm":
                                      submit
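Attribute-based extraction can be sketched the same way. The standard-library ElementTree module supports [@id='...'] predicates but not the contains() function, so the second query below is emulated with a Python substring test; the snippet is a shortened copy of the HTML example, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the HTML example from this section.
SNIPPET = """
<div>
<a class="pink red" href="http://banks.io">oranges</a>
<a class="blue" href="http://crime.io">and lemons</a>
<input type="submit" id="confirm">Go!</input>
</div>
"""

root = ET.fromstring(SNIPPET)

# //input[@id="confirm"]/@type: select the element, then read the attribute.
confirm_type = root.find(".//input[@id='confirm']").get("type")

# //*[contains(@class, "red")]/@href: contains() is a substring test,
# emulated here in Python because ElementTree lacks XPath functions.
red_hrefs = [
    elem.get("href")
    for elem in root.iter()
    if "red" in elem.get("class", "")
]

print(confirm_type)  # submit
print(red_hrefs)     # ['http://banks.io']
```

Because contains() matches substrings, a class such as "bored" would also match "red"; splitting the class string on whitespace is stricter.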
If you want to extract data from an element based on its position, check out these examples:
XPath query                                 Description
//table/tbody/tr[3]                         Get the third <tr> element in a table
//a[last()]                                 Get the last <a> tag under each parent element;
                                            use (//a)[last()] for the last <a> tag in the
                                            whole document
//main/article/section[position()>2]/h3     Get the <h3> tags in all <section> tags after
                                            the second instance of <section>
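Positional predicates are evaluated among siblings, which is easy to overlook. The sketch below uses Python's standard-library ElementTree on a hypothetical document to show that a [last()] predicate yields one match per parent; ElementTree supports integer positions and last(), but not position() comparisons or parenthesized paths, so the document-wide last link is taken with list indexing.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with one table and two groups of links.
DOC = """
<body>
  <table><tbody>
    <tr><td>one</td></tr>
    <tr><td>two</td></tr>
    <tr><td>three</td></tr>
  </tbody></table>
  <nav><a>home</a><a>blog</a></nav>
  <footer><a>terms</a><a>contact</a></footer>
</body>
"""

root = ET.fromstring(DOC)  # root is the <body> element

# tr[3]: the third <tr> child of <tbody>.
third_row = root.find("./table/tbody/tr[3]")
third_row_text = third_row.find("td").text

# a[last()]: the last <a> inside each parent, so two matches here.
last_per_parent = [a.text for a in root.findall(".//a[last()]")]

# Last <a> in the whole document: collect all links, then index.
last_in_document = root.findall(".//a")[-1].text

print(third_row_text)    # three
print(last_per_parent)   # ['blog', 'contact']
print(last_in_document)  # contact
```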
Now that you’ve made it to the last section of this cheat sheet, here are three real-life
examples of XPath in Selenium.
1. Absolute XPath expressions to get the “Accept All Cookies” footer bar out of the way:
● cookiespress="/html/body/div[1]/main/div/div/div/div/div[4]/div/div/div[2]/div/button[1]"
● loginwith="/html/body/div[1]/main/div/div/div/div/div[1]/div[1]/div[2]/form/div[1]/span"
2. This relative XPath expression maps to a pop-up triggered when a user successfully posts
to a certain social media platform: "//*[@class='Toastify__toast--success']"
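Expressions like these are typically handed to a WebDriver locator call. The helpers below are a hedged sketch, not the original author's code: the function names are hypothetical, driver is assumed to be a Selenium WebDriver instance, and the locator strategy is passed as the string "xpath", which is the value of Selenium's By.XPATH constant.

```python
# XPath taken from the example above.
SUCCESS_TOAST = "//*[@class='Toastify__toast--success']"

def click_by_xpath(driver, xpath):
    """Locate a single element by XPath and click it.

    `driver` is assumed to be a Selenium WebDriver (or any object with a
    compatible find_element method). "xpath" equals selenium's By.XPATH.
    """
    element = driver.find_element("xpath", xpath)
    element.click()
    return element

def toast_is_shown(driver, xpath=SUCCESS_TOAST):
    """Return True if at least one element currently matches the XPath."""
    return len(driver.find_elements("xpath", xpath)) > 0
```

With a real session, click_by_xpath(driver, cookiespress) would dismiss the cookie bar using the absolute path from the first example, and toast_is_shown(driver) could be polled after posting. Absolute paths tend to break when the page layout changes, so the relative form is usually the more durable choice.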