Leaks API
Na Strzi 1702/65
140 00 Praha
Czech Republic
LEAKS API
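This document specifies the Leaks API, which is part of the Identity Portal. It is a separate API from the
regular Search API, which is part of intelx.io. The high-level use cases are searching leaked data for a
selector and exporting leaked accounts, as described in the sections below.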
Contents
LEAKS API
Document History
Authentication
Search data and return lines
Export Leaked Accounts
Terminate Search
Synchronous Export Leaked Accounts
Appendix 1: Buckets
Document History
Version  Date        Note
0.1      15.10.2020  Initial version.
0.2      26.10.2020  Data leak search: Return line in text encoding in new field “linea” (before it was only in byte encoding). Add field “positionsize”.
0.3      26.10.2020  New terminate function and parameter to terminate previous searches.
0.4      27.10.2020  Updated list of buckets. New function to synchronously export leaked accounts.
4        28.10.2020  New optional datefrom/dateto parameters.
5        16.12.2020  Authentication simplified.
Authentication
The API URL is https://3.intelx.io/.
The API key, specific to the end-user, must be specified as HTTP header “x-key”, or as &k=[key] in the
URL. Your key can be found at https://intelx.io/account?tab=developer. Note that your key must have the
“Identity Portal” license assigned to use this API.
The API will return HTTP 401 Unauthorized if the API key is invalid or not authorized.
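The header-based variant of the authentication can be sketched as follows. This is a minimal illustration using only the Python standard library. The helper names build_request and send, the placeholder key, and the example ID are hypothetical and not part of the API.

```python
import urllib.parse
import urllib.request

API_URL = "https://3.intelx.io"  # API URL as stated in this section


def build_request(path, params, api_key):
    """Build a GET request that carries the API key in the "x-key" header."""
    query = urllib.parse.urlencode(params)
    url = f"{API_URL}{path}?{query}" if query else f"{API_URL}{path}"
    return urllib.request.Request(url, headers={"x-key": api_key}, method="GET")


def send(request):
    """Send the request and return the raw body.

    urllib raises urllib.error.HTTPError for error responses, so an invalid
    or unauthorized key surfaces as an HTTPError with code 401.
    """
    with urllib.request.urlopen(request) as response:
        return response.read()


# Placeholder key and ID for illustration only; nothing is sent here.
request = build_request(
    "/live/search/result",
    {"id": "example-id", "format": 1},
    "00000000-0000-0000-0000-000000000000",
)
print(request.full_url)
```

Passing the key as a header keeps it out of URLs that may end up in logs. The &k=[key] form shown in the example requests of this document remains equally valid.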
Search data and return lines
This is the API used by the “Search Data Leaks” tab of the Identity Portal. It searches for data and returns
the lines where it appears.
• Selector: Must be an email address, domain, social security number (US based), or credit card
number.
• Limit: In some cases, the API might return more results than specified in limit. If an upper hard limit
is required, it must be enforced on the client side.
• Bucket: Optional filter for searching only in the target bucket. See Appendix 1 for the list.
• Terminate: If a user starts a new search and the previous one is to be discarded, the previous search ID
shall be specified in the “terminate” parameter to save system resources. Searches may consume gigabytes
of data, so any search that is no longer required shall be terminated. Searches can also be terminated
manually via the /live/search/terminate function.
• Dates: From/to dates (datefrom/dateto) may be used as a filter. Note that an item’s date is set to when
the original data was published, if available, or otherwise to when it was indexed. This means that newly
indexed items are often backdated.
The response is JSON data containing the search job ID, which can be used to retrieve the results.
Example request:
GET https://3.intelx.io/live/search/internal?selector=itsv.at&limit=100&bucket=&skipinvalid=true&analyze=false&k=[KEY]
Example response:
{
  "status": 0,
  "id": "84bc2d34-ff97-4acc-acc2-7c8737fef917"
}
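As a rough sketch, the search request can be assembled from the parameters named in this section. The function name search_url is hypothetical. The values for skipinvalid and analyze are simply copied from the example request, since this document does not describe them.

```python
from urllib.parse import urlencode

API_URL = "https://3.intelx.io"


def search_url(selector, api_key, limit=100, bucket="", terminate=None):
    """Build the URL for /live/search/internal.

    terminate: optional ID of a previous search that should be discarded.
    """
    params = {
        "selector": selector,
        "limit": limit,
        "bucket": bucket,          # empty string means no bucket filter
        "skipinvalid": "true",     # copied from the example request
        "analyze": "false",        # copied from the example request
    }
    if terminate:
        params["terminate"] = terminate
    params["k"] = api_key
    return f"{API_URL}/live/search/internal?{urlencode(params)}"


# First search, then a follow-up search that discards the first one.
first = search_url("itsv.at", "KEY")
second = search_url("example.com", "KEY",
                    terminate="84bc2d34-ff97-4acc-acc2-7c8737fef917")
print(first)
print(second)
```

Keeping the ID of the last search on the client and passing it as “terminate” with the next request avoids a separate call to terminate the previous search.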
Using the search job ID, the results can be fetched as JSON records for machine processing, or as HTML
encoded text for visualization to the end-user.
The function /live/search/result retrieves the actual search results.
Depending on the format parameter, the response uses the text field, the records field, or both. Text is
HTML encoded and intended for direct visualization to the end-user. Use format=1 for machine processing.
The response uses the following JSON structure. The status indicates whether results are available and
whether the client shall continue to fetch results:
• 0: There are results in the response.
• 1: There is currently no result in the response, but the client should continue to fetch results.
• 2 (Terminated): The client shall stop querying for results. Note that the response might still contain
the last results.
• 3 (Search ID Not Found): The client shall stop querying for results.
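The status handling can be sketched as a small polling loop. This is an illustration under stated assumptions, not a reference client: fetch is a hypothetical callable that performs one /live/search/result request with format=1 and returns the parsed JSON, status 0 is assumed to mean that the client keeps fetching, and the wait between polls is an arbitrary choice. The sketch also enforces the hard upper limit on the client side.

```python
import time

STATUS_RESULTS = 0      # results are in the response
STATUS_NO_RESULTS = 1   # nothing right now, keep fetching
STATUS_TERMINATED = 2   # stop; the response may hold the last results
STATUS_NOT_FOUND = 3    # stop; search ID not found


def collect_records(fetch, hard_limit, wait_seconds=1.0, sleep=time.sleep):
    """Poll until the search ends or hard_limit records were collected."""
    records = []
    while len(records) < hard_limit:
        response = fetch()
        # Status 2 may still carry the last results, so always collect first.
        records.extend(response.get("records") or [])
        status = response["status"]
        if status == STATUS_NO_RESULTS:
            sleep(wait_seconds)   # nothing yet, wait before the next poll
        elif status != STATUS_RESULTS:
            break                 # 2, 3, or a status unknown to this sketch
    # The API may return more results than requested, so cut on the client.
    return records[:hard_limit]


# Simulated responses instead of real HTTP calls.
fake = iter([
    {"status": 1},
    {"status": 0, "records": [{"linea": "a"}, {"linea": "b"}]},
    {"status": 2, "records": [{"linea": "c"}]},
])
waits = []
result = collect_records(lambda: next(fake), hard_limit=10, sleep=waits.append)
print([r["linea"] for r in result])
```

If the loop stops because the hard limit was reached, the search is still running on the server and should be terminated.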
For the documentation of all values in an item structure, please see the documentation of
/intelligent/search/result, page 10 ff:
https://github.com/IntelligenceX/SDK/blob/master/Intelligence%20X%20API.pdf
GET https://3.intelx.io/live/search/result?id=84bc2d34-ff97-4acc-acc2-7c8737fef917&format=0&k=[KEY]
Example response:
{
  "status": 0,
  "text": "\u003ca href=\"https://intelx.io?did=a7143fcd-5b26-429b-9841-0e26a02d4cad\" target=\"_blank\"\u003ea7143fcd-5b26-429b-9841-0e26a02d4cad\u003c/a\u003e Text File dumpster vip163@\u003cspan style=\"background-color: yellow;\"\u003eexample.com\u003c/span\u003e\n"
}
GET https://3.intelx.io/live/search/result?id=84bc2d34-ff97-4acc-acc2-7c8737fef917&format=1&k=[KEY]
Example response:
{
  "status": 2,
  "text": "",
  "records": [
    {
      "linea": "155159789,,[email protected],FQYuFpg7Usg=,",
      "lineraw": "MTU1MTU5Nzg5LCxnYWJyaWVsLm5pdHVAaXRzdi5hdCxGUVl1RnBnN1VzZz0s",
      "positionline": 11,
      "positionsize": 20,
      "item": {
        "systemid": "7f016419-3fac-4bfc-b30d-9f93901e5f76",
        "owner": "b37bf0f6-ab90-4737-804a-ccf0b1b6d6de",
        "storageid": "c6db12911c263f756608eca50fdede1921fa1c180b0ddbdcd27307a0072729fc283913de7211de8a4ae1fc0d550c8b9c211e2887e56daf2c0f60894baafc261c",
        "instore": true,
        "size": 4194275,
        "accesslevel": 0,
        "type": 1,
        "media": 24,
        "added": "2019-12-04T18:09:19.27645Z",
        "date": "2019-12-04T18:05:34.265767Z",
        "name": "Adobe October 2013.txt [Part 1337 of 1973]",
        "description": "",
        "xscore": 67,
        "simhash": 9832427935425476012,
        "bucket": "leaks.public.general",
        "keyvalues": null,
        "tags": [
          {
            "class": 4,
            "value": "email"
          },
          {
            "class": 0,
            "value": "fr"
          }
        ],
        "relations": null
      }
    },
    {
      "linea": "[email protected]:8098daaeb98a2e3238b0d956cf4ffc70348a9a35",
      "lineraw": "Z2FicmllbC5uaXR1QGl0c3YuYXQ6ODA5OGRhYWViOThhMmUzMjM4YjBkOTU2Y2Y0ZmZjNzAzNDhhOWEzNQ==",
      "positionline": 0,
      "positionsize": 20,
      "item": {
        "systemid": "5a598aca-4952-46e0-8aa5-5e5dee98fd84",
        "owner": "ad837f4f-8c83-409d-a389-337c0b62303c",
        "storageid": "73dfa0d808da612bb702ce184926b831e56f52ee556b0d3c1a12d67f4e35e16ca1f7ea361a8a6da9014bb9b9b38339813213b38284812acc73212cdce9d63713",
        "instore": true,
        "size": 4194303,
        "accesslevel": 0,
        "type": 1,
        "media": 24,
        "added": "2020-10-02T19:28:04.041776Z",
        "date": "2020-10-02T19:27:30.386528Z",
        "name": "Dropbox.com.rar/sha1.txt [Part 197 of 245]",
        "description": "",
        "xscore": 63,
        "simhash": 12638153115695167421,
        "bucket": "leaks.private.general",
        "keyvalues": null,
        "tags": [
          {
            "class": 0,
            "value": "fr"
          },
          {
            "class": 4,
            "value": "email"
          }
        ],
        "relations": null
      }
    }
  ]
}
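In the example response, “lineraw” appears to hold the Base64 encoding of the line bytes, with “linea” carrying the same line as text. A minimal sketch for decoding “lineraw” on the client follows. The function name decode_lineraw and the sample record are hypothetical, and UTF-8 with replacement of invalid bytes is an assumption, since this document does not state the text encoding of leaked lines.

```python
import base64


def decode_lineraw(record, encoding="utf-8"):
    """Decode the Base64 "lineraw" field of one record into text.

    Invalid byte sequences are replaced instead of raising, because leaked
    files may not be valid text in the assumed encoding.
    """
    raw = base64.b64decode(record["lineraw"])
    return raw.decode(encoding, errors="replace")


# Hypothetical record for illustration, not taken from real data.
line = "user@example.com:secret"
sample = {
    "linea": line,
    "lineraw": base64.b64encode(line.encode("utf-8")).decode("ascii"),
}
print(decode_lineraw(sample))
```

Working from “lineraw” is mainly useful when the exact bytes matter, for example when hashing or comparing lines. For display, “linea” can be used directly.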
Export Leaked Accounts
This is the API used by the “Export Leaked Accounts” tab of the Identity Portal. It only supports domains
and email addresses as input.
The response is the same as for /live/search/internal, returning a status and the search job ID. If a
user starts a new export and the previous one is to be discarded, the previous search ID shall be specified
in the “terminate” parameter to save system resources. Searches may consume gigabytes of data, so any
search that is no longer required shall be terminated. Searches can also be terminated manually via the
/live/search/terminate function.
Example request:
GET https://3.intelx.io/accounts/csv?selector=itsv.at&k=[KEY]
Example response:
{
  "status": 0,
  "id": "63272ebf-d906-4faf-855a-0772c5459f9a"
}
To fetch the results, use the same /live/search/result endpoint as described before. Note that the “records”
field is an array of records with a different structure:
GET https://3.intelx.io/live/search/result?id=63272ebf-d906-4faf-855a-0772c5459f9a&format=1&k=[KEY]
Example response:
{
  "status": 0,
  "text": "",
  "records": [
    {
      "user": "[email protected]",
      "password": "badin1990",
      "passwordtype": "Plaintext",
      "bucket": "leaks.public.general",
      "date": "2019-01-17T22:08:55+01:00",
      "sourceshort": "Collection 1",
      "sourcelong": "Collection 1/Collection #1_Games combos_Sharpening.tar.gz/Collection #1_Games combos_Sharpening/725.txt",
      "systemid": "fe1e7f1b-ff8d-4ab0-91c2-7de07fa093f8"
    },
    {
      "user": "[email protected]",
      "password": "badin1990",
      "passwordtype": "Plaintext",
      "bucket": "leaks.public.general",
      "date": "2019-01-17T20:09:45+01:00",
      "sourceshort": "Collection 1",
      "sourcelong": "Collection 1/Collection #1_EU combos.tar.gz/Collection #1_EU combos/1080.txt",
      "systemid": "59c153b9-51bd-40c3-a77d-632272b4a6bb"
    }
  ]
}
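The account records can be turned into CSV on the client roughly as follows. This is a sketch only: the function name records_to_csv and the sample data are hypothetical, and dropping repeated user/password pairs is a design choice of this example, motivated by the same credential appearing in several sources. It is not something the API does.

```python
import csv
import io

# Field names as they appear in the example response.
FIELDS = ["user", "password", "passwordtype", "bucket", "date",
          "sourceshort", "sourcelong", "systemid"]


def records_to_csv(records):
    """Write account records as CSV, keeping one row per user/password pair."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    seen = set()
    for record in records:
        key = (record.get("user"), record.get("password"))
        if key in seen:
            continue  # same credential already exported from another source
        seen.add(key)
        writer.writerow(record)  # missing fields are written as empty cells
    return buffer.getvalue()


# Hypothetical records for illustration, not taken from real data.
sample_records = [
    {"user": "a@example.com", "password": "pw1", "sourceshort": "Source A"},
    {"user": "a@example.com", "password": "pw1", "sourceshort": "Source B"},
    {"user": "b@example.com", "password": "pw2", "sourceshort": "Source A"},
]
csv_text = records_to_csv(sample_records)
print(csv_text)
```

If the source of each occurrence matters, the deduplication step can be removed so that every record becomes its own row.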
Terminate Search
To terminate an active search or export, use the /live/search/terminate function. Terminating a search that
is no longer needed saves system resources. Since searches may read and process gigabytes of data, it is
highly appreciated if users terminate searches that are no longer needed.
Synchronous Export Leaked Accounts
Note: You should prefer the asynchronous function /accounts/csv, because this synchronous function might
miss results that are not available within the given timeout. Searching for leaked accounts may take
minutes, especially when searching for domains that have thousands of results. Internally, the API must
fetch the entire data for each individual result, which often causes gigabytes of internal traffic and
potential delays.
The default timeout is 10 minutes. The client must make sure to allow for such high HTTP timeouts on the
client side. The timeout must not be higher than 1 hour, which is the HTTP server write timeout.
Example:
GET https://3.intelx.io/accounts/1?selector=itsv.at&timeout=10&k=[KEY]
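A client for the synchronous export has to wait at least as long as the server-side timeout. The sketch below derives a matching client HTTP timeout. It assumes that the “timeout” parameter is given in minutes, which the default of 10 minutes and the example value 10 suggest but this document does not state. The function name sync_export_request and the safety margin are hypothetical.

```python
from urllib.parse import urlencode

API_URL = "https://3.intelx.io"
MAX_TIMEOUT_MINUTES = 60  # upper bound: the HTTP server write timeout of 1 hour


def sync_export_request(selector, api_key, timeout_minutes=10, margin_seconds=30):
    """Return the request URL and a client HTTP timeout in seconds.

    ASSUMPTION: the "timeout" parameter is in minutes.
    """
    if not 1 <= timeout_minutes <= MAX_TIMEOUT_MINUTES:
        raise ValueError("timeout must be between 1 and 60 minutes")
    query = urlencode({"selector": selector,
                       "timeout": timeout_minutes,
                       "k": api_key})
    url = f"{API_URL}/accounts/1?{query}"
    # The client waits slightly longer than the server so that the server,
    # not the client, decides when the export ends.
    client_timeout_seconds = timeout_minutes * 60 + margin_seconds
    return url, client_timeout_seconds


url, client_timeout = sync_export_request("itsv.at", "KEY")
print(url, client_timeout)
```

The returned number of seconds can be passed as the timeout of whichever HTTP client is used, for example as the timeout argument of urllib.request.urlopen.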
Appendix 1: Buckets
This is the list of currently available buckets. Please note that access to certain buckets is restricted
according to your obtained license. Buckets may be added or removed at any time without prior notice.
Bucket                   Info
darknet.tor              Tor hidden services (.onion domains)
darknet.i2p              I2P eepsites (.i2p domains)
documents.public.scihub  Public documents from Sci-Hub
dumpster                 Any data potentially relevant but does not fit into any other category
dumpster.web.1           Dumpster: Interesting other websites with high-value information
dumpster.web.ssn         Dumpster: SSN related websites
leaks.private.general    Private Data Leaks
leaks.public.general     Public Data Leaks
leaks.public.wikileaks   WikiLeaks, Cryptome & Snowden data
pastes                   Pastes from various pastebin sites
web.public.de            Public Web: Germany
web.public.gov           Government US
web.public.kp            North Korean public websites
web.public.peer          Public Web: Decentralized blockchain based TLDs
web.public.ru            Public Web: Russia
web.public.com           Public Web: International TLDs
whois                    Whois data