Jsluice

The document discusses how JavaScript has evolved to allow dynamic loading of data without page refreshes, making web crawling more complex. It introduces jsluice, an open-source tool that uses syntax trees to extract URLs and other information from JavaScript code in a robust manner, handling edge cases and providing context around matches. Jsluice outputs extracted data as JSON for further processing.

There’s Gold In Them Thar Files!

Presented by: Tom Hudson, Senior Security Engineer, Bishop Fox | Date: 2023-06-24

JSLUICE

Hello, BSides :)

✦ I’m Tom(NomNom)
✦ It’s been a while! Hello! 👋
✦ I make open-source tools (gron, anew, meg, fff, unfurl, gf, waybackurls, httprobe, assetfinder, qsrepla…
✦ I like questions, so have ‘em ready!
✦ I do security tooling R&D stuff at Bishop Fox
⎻ That means this slide-deck is branded and in light-mode
⎻ …and also lacks legally-questionable use of watermarked stock photography

The return of light-mode sheepy (:

© Bishop Fox. All rights reserved worldwide. 2


JSLUICE

Crawling Used To Be Easy

✦ The Old Web was pretty easy to crawl


✦ Links were links, marquees scrolled, and HTML was unsullied by JavaScript
✦ When JavaScript arrived it mostly made a trail of kitten gifs follow your cursor

<a href=/guestbook.html>Sign my guestbook!</a>

A guestbook is like a comments section, but for your whole site


2001: A Cyberspace Odyssey

✦ In about 2001 JavaScript got a new superpower: XMLHttpRequest


⎻ At the time you might have known it as: ActiveXObject("Microsoft.XMLHTTP")
✦ Now JavaScript could fetch new data and stuff it into the page without a page reload
✦ Fast-forward a couple of decades and we have ReangularJSQuery

Honestly, felt kind of magical to not hear the reload "click" every time a page changed


Dealing With The New Web

✦ One way to deal with JavaScript is to use a (headless) browser – a sort of dynamic analysis
⎻ It’s kinda slow and resource intensive
⎻ You only find out about things that are actually executed
✦ To do static analysis you could use regular expressions
⎻ Something something, then you have two problems…

fetch('/api/v2/guestbook', {
    method: "POST",
    headers: {
        "Content-Type": "application/json"
    },
    body: JSON.stringify({msg: "..."})
})

'fetch' is a modern alternative to XMLHttpRequest

Irregularly Regular

✦ Using regular expressions seems simple enough


✦ You have to deal with nested and escaped quotes, differing whitespace, random variance etc
⎻ At scale, edge-cases become commonplace
✦ Running several-dozen complex regular expressions across multi-megabyte-files isn't great
⎻ Maintaining several-dozen complex regular expressions is worse :(

'/api/v2/guestbook' => /fetch\('([^']+)'/


"/api/v2/guestbook" => /fetch\(['"]([^'"]+)['"]/
"/api/user/o'neill" => /fetch\((['"])([^\1]+)\1/

I stole this one from somewhere, but it's a real regex for finding URLs in JavaScript!

(?:"|'|\s)(((https?://[A-Za-z0-9_\-\.]+(:\d{1,5})?)+([\.]{1,2})?/[A-Za-z0-9/\-_\.\\%]+([\?|#][^"']+)?)|((\.{1,2}/)?[a-zA-Z0-9\-_/\\%]+\.(aspx?|js(on|p)?|html|php5?|html|action|do)([\?|#][^"']+)?)|((\.{0,2}/)[a-zA-Z0-9\-_/\\%]+(/|\\)[a-zA-Z0-9\-_]{3,}([\?|#][^"|']+)?)|((\.{0,2})[a-zA-Z0-9\-_/\\%]{3,}/))(?:"|'|\s)
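
To make the edge-case problem concrete, here is a small sketch that runs the three simple fetch patterns from this slide against a few hand-written inputs. Only the patterns come from the slide; the `capture` helper and the input strings are made up for illustration.

```javascript
// The three patterns from the slide, from most naive to least.
const singleOnly = /fetch\('([^']+)'/;
const eitherQuote = /fetch\(['"]([^'"]+)['"]/;
// Inside a character class, \1 is not a backreference in JavaScript
// (without the `u` flag it is a legacy octal escape), so [^\1]+ is
// effectively "one or more of almost any character", matched greedily.
const backref = /fetch\((['"])([^\1]+)\1/;

// Hypothetical helper: return one capture group, or null when nothing matches.
function capture(re, src, group) {
  const m = re.exec(src);
  return m ? m[group] : null;
}

const results = {
  // Double quotes defeat the single-quote-only pattern.
  doubleQuoted: capture(singleOnly, 'fetch("/api/v2/guestbook")', 1),
  // An apostrophe inside the path cuts the match short.
  apostrophe: capture(eitherQuote, `fetch("/api/user/o'neill")`, 1),
  // The backreference version copes with the apostrophe...
  apostropheBackref: capture(backref, `fetch("/api/user/o'neill")`, 2),
  // ...but runs on into the next string when there is a second argument.
  secondString: capture(backref, 'fetch("/a", {method: "POST"})', 2),
};

console.log(results);
```

Each fix handles one edge case and exposes another, which is the maintenance problem the slide is pointing at.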


Context could be another name for an SMS scam 🤔

✦ Extracting URLs and paths by themselves is nice


✦ Extracting the context around them is nicer
✦ We can do that with the power of Tree-sitter (https://tree-sitter.github.io/tree-sitter/)
⎻ Shout-out to @LewisArdern and @Semgrep for inspiration :)

fetch('/api/v2/guestbook', {
    method: "POST",
    headers: {
        "Content-Type": "application/json"
    },
    body: JSON.stringify({msg: "..."})
})


Sitting In A Tree: P, A, R, S, I, N, G

✦ Raw JavaScript source code is difficult to understand for humans, doubly so for programs
✦ Tree-sitter parses JavaScript (and dozens of other languages) into syntax trees
⎻ It's meant for tasks like syntax highlighting so it's tolerant of minor errors <3
✦ jsluice can show you the syntax tree for any JavaScript file

$ cat hello.js
console.log("Hello, world!")

$ jsluice tree hello.js
hello.js:
program
  expression_statement
    call_expression
      function: member_expression
        object: identifier (console)
        property: property_identifier (log)
      arguments: arguments
        string ("Hello, world!")

We're 8 slides in and jsluice has finally showed up (:

Meet jsluice: Extracting URLs

✦ There's a jsluice Go package, and also a command-line tool


⎻ We're going to focus mainly on the command-line tool :)
✦ The urls mode can extract URLs, paths, and (where possible) HTTP methods, headers, body data etc
⎻ From calls to fetch, uses of XMLHttpRequest, assignments to document.location, calls to jQuery's
$.get, $.post, and $.ajax, and a handful of other places

$ jsluice urls fetch.js
{
  "url": "/api/v2/guestbook",
  "method": "POST",
  "headers": {
    "Content-Type": "application/json"
  },
  "type": "fetch"
}

jsluice outputs JSON Lines; you might want to pipe it to jq :) 😍
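
Because the output is JSON Lines (one JSON object per line), it is also easy to consume from a script instead of jq. A minimal sketch, assuming the output has already been captured into a string: `parseJsonLines` and the sample text are illustrative, and the two sample records are trimmed-down, single-line versions of outputs shown in this deck.

```javascript
// Parse JSON Lines text into an array of objects, skipping blank lines.
function parseJsonLines(text) {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}

// Sample input: stands in for captured `jsluice urls` output.
const sample = [
  '{"url":"/api/v2/guestbook","method":"POST","headers":{"Content-Type":"application/json"},"type":"fetch"}',
  '{"url":"/api/EXPR?format=json","method":"GET","type":"XMLHttpRequest.open"}',
].join("\n");

const records = parseJsonLines(sample);

// Example follow-up: list only the URLs that are requested with POST.
const postUrls = records
  .filter((record) => record.method === "POST")
  .map((record) => record.url);

console.log(postUrls);
```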

XMLHttpRequest is tricksy

✦ XMLHttpRequest is especially annoying to deal with


⎻ The data we want is spread out between multiple function calls
✦ Note that jsluice understands string concatenation :)

function callAPI(method, callback){
    var xhr = new XMLHttpRequest();
    xhr.onreadystatechange = callback;
    xhr.open('GET', '/api/' + method + '?format=json');
    xhr.setRequestHeader('Accept', 'application/json');

    if (window.env != 'prod'){
        xhr.setRequestHeader('X-Env', 'staging')
    }
    xhr.send();
}

{
  "url": "/api/EXPR?format=json",
  "queryParams": ["format"],
  "method": "GET",
  "headers": {
    "Accept": "application/json",
    "X-Env": "staging"
  },
  "type": "XMLHttpRequest.open"
}

'EXPR' is the default placeholder, but you can change it with --placeholder
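
One way to picture the concatenation handling is to walk the expression tree, keep the string literals, and swap everything else for the placeholder. The sketch below is not jsluice's implementation. The node shape (`type`, `operator`, `left`, `right`, `value`) and the `collapse` function are hypothetical, loosely modelled on the node names that the tree mode prints.

```javascript
// Flatten a concatenation expression into a string, replacing anything
// that is not a string literal with a placeholder.
function collapse(node, placeholder = "EXPR") {
  switch (node.type) {
    case "string":
      return node.value;
    case "binary_expression":
      // Only `+` can build a string here; treat other operators as opaque.
      if (node.operator !== "+") return placeholder;
      return collapse(node.left, placeholder) + collapse(node.right, placeholder);
    default:
      // Identifiers, calls, and so on are unknown at analysis time.
      return placeholder;
  }
}

// Hand-built tree for: '/api/' + method + '?format=json'
const expr = {
  type: "binary_expression",
  operator: "+",
  left: {
    type: "binary_expression",
    operator: "+",
    left: { type: "string", value: "/api/" },
    right: { type: "identifier", name: "method" },
  },
  right: { type: "string", value: "?format=json" },
};

console.log(collapse(expr));         // /api/EXPR?format=json
console.log(collapse(expr, "FUZZ")); // /api/FUZZ?format=json
```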


Secret Sauce

✦ Modern web apps talk to lots of APIs, run in The Cloud™, and need secrets for stuff like that
✦ Sometimes those secrets end up in JavaScript files
✦ You can find secrets with jsluice too!

$ jsluice secrets awskey.js
{
  "kind": "AWSAccessKey",
  "data": {
    "key": "AKIAIOSFODNN7EXAMPLE",
    "secret": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
  },
  "filename": "awskey.js",
  "severity": "high",
  "context": {
    "awsKey": "AKIAIOSFODNN7EXAMPLE",
    "awsSecret": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    "bucket": "examplebucket",
    "server": "someserver.example.com"
  }
}

Look at that sweet context that was extracted! 🤫

Custom Secrets

✦ There are built-in matchers for AWS, GCP, GitHub, and a few other types of secrets
✦ The internet is awash with different secret types, and your target might use an obscure vendor
✦ You can provide your own patterns in a JSON file :)

custom.json:
[
  {
    "name": "genericSecret",
    "key": "(secret|private|apikey)",
    "value": "[%a-zA-Z0-9+/]+"
  },
  {
    "name": "firebaseConfig",
    "object": [
      {"key": "apiKey", "value": "^AIza.+"},
      {"key": "storageBucket"}
    ]
  }
]

$ jsluice secrets --patterns=custom.json firebase.js
{
  "kind": "firebaseConfig",
  "data": {
    "apiKey": "AIzaSyB47WKzDu9kkmFAsAYFlagkuJxdEXAMPLE",
    "appId": "1:586572527435:web:14c624679103dc3e74b755",
    "authDomain": "someauthdomain.firebaseapp.com",
    "projectId": "someprojectid",
    "storageBucket": "somebucketthatisnotthere.appspot.com"
  },
  "filename": "firebase.js",
  "severity": "info",
  "context": null
}

You can specify a severity too, to make triage easier
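
As a rough mental model for the "object" style of pattern, the sketch below checks that every listed key is present in an object and that any value regex matches. This is an assumption about how such a pattern could be interpreted, not jsluice's actual matching logic, which may differ in its details. `matchesObjectPattern` is a hypothetical name, and the sample objects are made up apart from the values copied from this slide.

```javascript
// Return true when every entry in the pattern is satisfied by the object:
// the key must exist, and if a value regex is given it must match.
function matchesObjectPattern(obj, objectPattern) {
  return objectPattern.every(({ key, value }) => {
    if (!Object.prototype.hasOwnProperty.call(obj, key)) return false;
    return value === undefined || new RegExp(value).test(String(obj[key]));
  });
}

// The "object" part of the firebaseConfig pattern from the slide.
const firebasePattern = [
  { key: "apiKey", value: "^AIza.+" },
  { key: "storageBucket" },
];

const firebaseLike = {
  apiKey: "AIzaSyB47WKzDu9kkmFAsAYFlagkuJxdEXAMPLE",
  storageBucket: "somebucketthatisnotthere.appspot.com",
};
const wrongKeyFormat = { apiKey: "not-a-google-key", storageBucket: "x" };
const missingBucket = { apiKey: "AIzaSyB47WKzDu9kkmFAsAYFlagkuJxdEXAMPLE" };

console.log(matchesObjectPattern(firebaseLike, firebasePattern));
console.log(matchesObjectPattern(wrongKeyFormat, firebasePattern));
console.log(matchesObjectPattern(missingBucket, firebasePattern));
```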

Queries

✦ Tree-sitter is super cool, it has its own query language for querying syntax trees
✦ The query mode lets you run queries, and massages the results into valid JSON
✦ Use the tree mode we saw earlier to help you write queries
⎻ Also the docs: https://tree-sitter.github.io/tree-sitter/using-parsers#query-syntax

$ jsluice query -q '(object) @m' fetch.js | jq
{
  "body": "JSON.stringify({id: 123})",
  "headers": {
    "Content-Type": "application/json"
  },
  "method": "POST"
}
{
  "Content-Type": "application/json"
}
{
  "id": 123
}

If jsluice can't convert something directly to JSON it makes it a string

A Neat Trick: Finding Common Keys

✦ Need a word-list for the most common object keys?


✦ Try out this one-liner :)

$ find . -type f -name '*.js' |        # Find JavaScript files
    jsluice query -q '(object) @m' |   # Extract the objects
    jq -r 'to_entries[] | .key' |      # Extract the keys
    sort | uniq -c | sort -nr          # Sort and rank them
      5 method
      4 headers
      3 url
      3 server
      3 secret
      3 data
      3 Content-Type
      ...

Maybe my testdata directory doesn't make for the most representative object keys (:
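
The jq, sort, and uniq stages of the one-liner can also be written as a few lines of script, which is handy if the ranking needs further processing. A sketch, assuming the matched objects have already been parsed into an array; `rankKeys` and the sample objects are made up for illustration.

```javascript
// Count how often each top-level key appears across a list of objects,
// and return [key, count] pairs ordered by count, highest first.
function rankKeys(objects) {
  const counts = new Map();
  for (const obj of objects) {
    for (const key of Object.keys(obj)) {
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  // Ties are broken by plain string comparison to keep the order stable.
  return [...counts.entries()].sort(
    (a, b) => b[1] - a[1] || (a[0] < b[0] ? -1 : 1)
  );
}

// Sample input: stands in for parsed output of the query mode.
const sampleObjects = [
  { method: "POST", headers: {}, body: "..." },
  { "Content-Type": "application/json" },
  { method: "GET", headers: {} },
  { method: "GET" },
];

const ranked = rankKeys(sampleObjects);
for (const [key, count] of ranked) {
  console.log(`${count} ${key}`);
}
```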


Where Good Things Come

✦ The command-line tool is nice, and you can use it for automation in shell scripts
✦ But if you want to get serious, use the Go package…

analyzer := jsluice.NewAnalyzer(sourceCode)

analyzer.AddURLMatcher(
    jsluice.URLMatcher{"string", func(n *jsluice.Node) *jsluice.URL {
        val := n.DecodedString()
        if !strings.HasPrefix(val, "mailto:") {
            return nil
        }

        return &jsluice.URL{URL: val, Type: "mailto"}
    }},
)

for _, match := range analyzer.GetURLs() {
    fmt.Println(match.URL)
}

You can make custom matchers using the full power of Tree-sitter :)


One Last One-liner

✦ Sometimes the most interesting things are in inline JavaScript


✦ Use htmlq to extract them, and some shell trickery to process them :)
⎻ https://github.com/mgdm/htmlq

$ find . -type f -exec file {} \; |   # Find files and check what type they are
    grep 'HTML document' |            # Take just the HTML files
    cut -d: -f1 |                     # Remove everything after the filename
    while read htmlfile; do           # Loop over each filename
        # Use htmlq to extract inline JavaScript
        jsluice secrets <(htmlq -f $htmlfile script --text)
    done

Maybe jsluice will get native support for HTML files soon :)



THANK YOU <3
Questions? :)
BISHOPFOX.COM
