Open-source Intelligence (OSINT) with OWASP Maryam

The document discusses the OWASP Maryam open-source intelligence (OSINT) framework. It provides an overview of the framework's features and capabilities, including modules for footprint analysis, OSINT collection, and searching. Example uses of modules like dbrute and entry_points are shown. The document encourages users to experiment with the framework's interactive help, command completion, reporting functions, workspaces, and other global options.

See discussions, stats, and author profiles for this publication at:
https://www.researchgate.net/publication/343601827

OWASP Maryam: Open-source Intelligence (OSINT) Framework

Article · May 2020

1 author:

Saeed Dehqan
OWASP
All content following this page was uploaded by Saeed Dehqan on 31 January 2022.

Open-source Intelligence (OSINT) with
OWASP Maryam

Open-source intelligence (OSINT) is the collection and analysis of information from
publicly available sources for a specific purpose. OSINT can be very helpful for hackers
who want to gather data about a particular organization. Today, gathering data from open
sources is one of the critical steps of reconnaissance, and it is a common task, so there
should be a tool to automate this routine. One of the best tools in this field is OWASP
Maryam. Maryam uses NLP and ML algorithms not only to gather information but also to
assess it.

INTRODUCTION
OWASP Maryam is a modular/optional open-source framework for OSINT and data
gathering. Maryam is written in Python and is designed to provide a powerful
environment for harvesting data from open sources and search engines quickly and
thoroughly. If you have experience with Metasploit or Recon-ng, you can use it without
any further preparation; if not, it is still easy to learn.

GETTING STARTED
Note: Currently this tool is optimized for Linux and Darwin, so it may not work on
other operating systems such as Windows.

Prerequisites

Maryam requires Python 3.8+ and uses pip, the Python package manager, to install
packages.
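
The two requirements above can be checked before installing. The following is a minimal sketch, not part of Maryam; the helper name check_environment is an assumption made for illustration.

```python
import platform
import sys

# Hypothetical preflight helper (not part of Maryam): it checks the two
# requirements stated above, Python 3.8+ and a Linux or Darwin platform.
SUPPORTED_SYSTEMS = {"Linux", "Darwin"}
MIN_PYTHON = (3, 8)


def check_environment(version=None, system=None):
    """Return a list of human-readable problems; an empty list means OK."""
    version = tuple(version or sys.version_info)[:2]
    system = system or platform.system()
    problems = []
    if version < MIN_PYTHON:
        problems.append("Python %d.%d found, 3.8+ required" % version)
    if system not in SUPPORTED_SYSTEMS:
        problems.append("%s is not one of the optimized platforms" % system)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("warning:", problem)
```

An empty result only means the stated requirements are met; it does not guarantee that every module works.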

Install

The project can be installed using the following command:

apt-get install maryam

It can also be installed using pip:

pip install maryam

Once the installation is finished, you can run it with:

maryam

(Figure 1). Launch Maryam

WHAT ARE THE MAIN FEATURES?

In this section, there are some helpful nuggets for getting started with the framework.
While not all features are covered, the following notes will help make sense of a few of
the framework's more helpful and complex features.

Interactive help

All modules and commands have interactive help documentation. For root commands,
typing help <command> will provide this documentation, which includes
documentation for invoking subcommands. For subcommands, providing the wrong
input will prompt the framework to display specific subcommand level documentation.
Typing <command-name> or <module-name> also displays its documentation.

(Figure 2). help commands

Note: Command and module names are case-sensitive, but the option names used with set
are not.

Command Completion

The entire framework is equipped with command completion. Whether exploring
standard commands or passing parameters to commands, tap the "tab" key several times
to be presented with all of the available options for that command or parameter.

Report Interaction

Gather is the default file in each workspace where the output of modules is saved in
JSON format. When running a module, users can save its output to Gather by setting the
--output option. With the report command, users can then export data from Gather in
different formats. For example, after using email_search to harvest emails, the
harvested emails can be exported with the report command:
report xml harvested_emails email_search target.com
Format of the report command:
report <format> <filename> <module-name> <target>
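
To make the four positional fields concrete, here is a small sketch that splits a report command line according to the format above. It only illustrates the argument order; the names ReportCommand and parse_report are hypothetical and are not taken from Maryam's source.

```python
import shlex
from typing import NamedTuple


class ReportCommand(NamedTuple):
    """Fields of 'report <format> <filename> <module-name> <target>'."""
    fmt: str
    filename: str
    module: str
    target: str


def parse_report(line):
    """Split a report command line into its four positional fields."""
    parts = shlex.split(line)
    if len(parts) != 5 or parts[0] != "report":
        raise ValueError(
            "expected: report <format> <filename> <module-name> <target>"
        )
    return ReportCommand(*parts[1:])


# The example command from the text above.
cmd = parse_report("report xml harvested_emails email_search target.com")
print(cmd.fmt, cmd.filename, cmd.module, cmd.target)
```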

(Figure 3). report command

Shell Commands

The shell command and its ! alias give users the ability to run system commands on the
local machine from within the framework. In addition, any input that is not one of the
framework commands is run as a shell command by default (framework commands take
priority). If history is set to true, the framework stores the commands.
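
The priority rule described above can be sketched as a small dispatcher. This is an assumption-laden illustration rather than Maryam's actual code: the command set lists only commands mentioned in this article, and classify is a hypothetical name. Nothing is executed; the function only decides where a line would go.

```python
# Hypothetical sketch of the dispatch priority described above; the command
# set and the function name are illustrative, not Maryam's real internals.
FRAMEWORK_COMMANDS = {"help", "load", "report", "set", "unset", "workspaces"}


def classify(line, framework_commands=FRAMEWORK_COMMANDS):
    """Return ('framework', line) or ('shell', system_command)."""
    line = line.strip()
    # The ! alias always means "run this on the local machine".
    if line.startswith("!"):
        return ("shell", line[1:].strip())
    name, _, rest = line.partition(" ")
    if name == "shell":
        return ("shell", rest.strip())
    # Framework commands take priority over system commands.
    if name in framework_commands:
        return ("framework", line)
    # Anything else falls through to the shell by default.
    return ("shell", line)


print(classify("help report"))
print(classify("!ls -la"))
print(classify("whoami"))
```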

(Figure 4). shell command

Optional interface
The optional interface lets users run a module directly and set its options with
switches. To use it, just type the module name without its directory: tldbrute, not
footprint/tldbrute.

Global Options

Pay attention to the global options, as they have changed over time and have a large
impact on the performance of the framework. Global options are available in the global
context of the framework and affect how the whole framework operates. Options such as
"VERBOSITY" and "PROXY" drastically change how the modules present feedback and make
web requests. Explore and understand the global options before diving into the
modules. For instance, if "VERBOSITY" is "2", the framework shows all errors, HTTP
headers and status.
To set an option, write set <OPTION-NAME> <VALUE>; to unset it, write unset
<OPTION-NAME>.
Note: Be careful with the rand_agent option. Set it to false if you want to use search
engines.
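
Since option names are not case-sensitive, set and unset behave roughly like operations on a dictionary with normalized keys. The class below is a minimal sketch of that idea under this assumption; Options is a hypothetical name and does not reflect how Maryam stores its options internally.

```python
class Options:
    """Hypothetical case-insensitive option store (not Maryam's real class)."""

    def __init__(self):
        self._values = {}

    def set(self, name, value):
        # Names are normalized so VERBOSITY and verbosity refer to one option.
        self._values[name.upper()] = value

    def unset(self, name):
        # Unsetting an option that was never set is silently ignored.
        self._values.pop(name.upper(), None)

    def get(self, name, default=None):
        return self._values.get(name.upper(), default)


opts = Options()
opts.set("verbosity", "2")
opts.set("RAND_AGENT", "false")
opts.unset("Rand_Agent")
print(opts.get("VERBOSITY"), opts.get("rand_agent"))
```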

(Figure 8). Global options

Workspaces

Workspaces help users conduct multiple simultaneous engagements without having
to repeatedly configure global options, variables and Gather data. All of the information
for each workspace is stored in its own directory underneath the
"~/.maryam/workspaces/" folder. Each workspace consists of its own instance of the
Maryam Gather, a configuration file for the storage of configuration options, reports
from reporting modules, and any loot that is gathered from other modules. To create a
new workspace, use the workspaces command: workspaces create <name>. Loading an
existing workspace is just as easy: workspaces load <name>. To view a list of available
workspaces, use the workspaces list command. To remove a workspace, use the
workspaces remove command: workspaces remove <name>.
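
Because each workspace is a directory under "~/.maryam/workspaces/", the available workspaces can also be inspected from outside the framework. The sketch below assumes only that layout; the function names are hypothetical, and the files inside each workspace are not touched.

```python
from pathlib import Path

# Default location taken from the text above; everything else is illustrative.
WORKSPACES_ROOT = Path.home() / ".maryam" / "workspaces"


def list_workspaces(root=WORKSPACES_ROOT):
    """Return workspace names, i.e. the subdirectories of the root folder."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def workspace_path(name, root=WORKSPACES_ROOT):
    """Return the directory that would hold one workspace's data."""
    return Path(root) / name


if __name__ == "__main__":
    for name in list_workspaces():
        print(name, "->", workspace_path(name))
```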

Note: Try to create a workspace for each target, purpose or company. It is a good way
to keep data organized.

(Figure 9). workspaces command

Flexible Loading

Even with command completion, module loading can be cumbersome because of the
directory structure of the module tree. To make module loading easier, the framework is
equipped with a smart loading feature. This feature allows modules to be loaded by
referring to a keyword unique to the desired module's name. For instance, load email
will load the "osint/email_search" module without requiring the full path since it is the
only module containing the string "email". Attempting to smart load with a string that
exists in more than one module name will result in a list of all possible modules for the
given keyword. For example, there are many modules whose names contain the string
"search". Therefore, the command load search would not load a module but return a list
of possible modules for the user to reference by full module name.
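
The smart loading behavior described above can be sketched in a few lines. The module list is an illustrative subset assembled from the modules named in this article (the directory of some entries is assumed from the section they appear in), and smart_load is a hypothetical name, not the framework's real function.

```python
# Illustrative subset of module paths; the real module tree is larger and
# some directories here are assumptions based on this article's sections.
MODULES = [
    "footprint/dbrute",
    "footprint/tldbrute",
    "footprint/entry_points",
    "footprint/crawl_pages",
    "osint/email_search",
    "osint/dns_search",
    "osint/docs_search",
    "osint/crawler",
]


def smart_load(keyword, modules=MODULES):
    """Return the single matching module path, or the list of candidates."""
    # Match the keyword against the module name, not its directory.
    matches = [m for m in modules if keyword in m.rsplit("/", 1)[-1]]
    if len(matches) == 1:
        return matches[0]
    return matches


print(smart_load("email"))   # unique keyword: a single module path
print(smart_load("search"))  # ambiguous keyword: a list of candidates
```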

GET STARTED WITH MODULES

Currently, Maryam has three types of modules, plus a fourth that is in beta:

● Footprint
● OSINT
● Search
● Iris: This is a beta version. We are going to use ML and NLP for threat analysis.

FOOTPRINT

(Figure 11). footprint modules

The Footprint section contains modules to crawl, identify, gather and analyze.

What can be done with footprint modules?

● Identify web applications, frameworks, plugins, languages and server operating
systems.
● Search web pages to find emails, usernames, errors, meta tags and anything
else that interests you.
● Search web content with your own regular expression and get the results. This is very
helpful for finding interesting things in web content, and it is not limited to one page.
● Run fast brute-force attacks to identify subdomains, files, directories and TLDs with
thread support.
● Find web entry points such as forms, inputs, URLs with parameters, GET requests and
POST requests for fuzzing attacks.
● Detect Web Application Firewalls (WAF) with 200 payloads.

Let's get familiar with some common footprint modules and learn how to use them.
DBRUTE

The dbrute module brute-forces DNS names quickly and easily with your own wordlist or
the default wordlist. To use the optional interface, just type the module name:

(Figure 12). dbrute optional help

Figure 12 shows the dbrute options. Next, we run a simple DNS brute-force attack on
yahoo.com with the default wordlist:
Command: dbrute -d yahoo.com
(Figure 14). launch dbrute -d yahoo.com (not fullscreen)

dbrute started with 790 payloads and, after 100 s, had tested all 790:

(Figure 15). finish dbrute -d yahoo.com

Next, let's use threads to show how effective they are. Here dbrute was started with 30
threads (which means 30 requests are sent per round), and after 25 s it had tested 489
payloads, roughly 20 payloads per second compared with about 8 without threads:
Command: dbrute -d yahoo.com -t 30
(Figure 16). dbrute -d yahoo.com -t 30

Note: Results depend on Internet speed. This was tested at 5.4 Mbps.
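
To show why threads help, here is a minimal sketch of a threaded DNS brute force. It is not dbrute's implementation: it uses Python's standard ThreadPoolExecutor, which keeps up to N lookups in flight rather than sending fixed rounds, and the names resolve and brute_subdomains are assumptions. Only run it against domains you are authorized to test.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def resolve(hostname):
    """Return the IPv4 address for hostname, or None if it does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None


def brute_subdomains(domain, words, threads=30, resolver=resolve):
    """Try word.domain for every word and return {hostname: address} for hits."""
    hosts = [f"{w.strip()}.{domain}" for w in words if w.strip()]
    # Lookups spend most of their time waiting on the network, so running
    # them concurrently shortens the total time considerably.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        addresses = list(pool.map(resolver, hosts))
    return {h: a for h, a in zip(hosts, addresses) if a}


if __name__ == "__main__":
    print(brute_subdomains("example.com", ["www", "mail", "dev"], threads=3))
```

The resolver parameter is injectable so the logic can be exercised without touching the network.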

If you want to use another wordlist (remote or local), use the -w option:

dbrute -d <DOMAIN> -w https://raw.githubusercontent.com/danielmiessler/SecLists/master/Discovery/DNS/namelist.txt
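
Accepting either a remote or a local wordlist comes down to checking whether the value looks like a URL. The sketch below shows one way to do that with the standard library; load_wordlist is a hypothetical helper, and skipping comment lines and duplicates is an assumption of this example, not a documented dbrute behavior.

```python
from urllib.request import urlopen


def load_wordlist(source, timeout=10):
    """Read one candidate name per line from a URL or a local file path."""
    if source.startswith(("http://", "https://")):
        with urlopen(source, timeout=timeout) as response:
            text = response.read().decode("utf-8", errors="replace")
    else:
        with open(source, encoding="utf-8", errors="replace") as handle:
            text = handle.read()
    # Drop blank lines and comments; keep the original order, no duplicates.
    words = (line.strip() for line in text.splitlines())
    return list(dict.fromkeys(w for w in words if w and not w.startswith("#")))
```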
There are some good wordlists of different sizes, which you can list with (Figure 17):
dbrute --wordlists

(Figure 17). dbrute --wordlists

ENTRY_POINTS

The entry_points module crawls web pages and finds entry points such as forms,
inputs (hidden, password, text, ..), textareas, upload inputs, GET URLs and anything
else that can be useful for fuzzing attacks.

(Figure 18). entry_points optional help
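
As an illustration of what "entry points" means for a single page, the sketch below collects forms, input fields, textareas and links that carry a query string using Python's standard HTML parser. It is a simplified stand-in, not the entry_points module itself; the class and function names are assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class EntryPointParser(HTMLParser):
    """Collect forms, input fields, textareas and parameterized links."""

    def __init__(self):
        super().__init__()
        self.forms = []        # (method, action) pairs
        self.inputs = []       # (tag, type, name) triples
        self.param_urls = []   # href values that carry a query string

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            method = (attrs.get("method") or "get").lower()
            self.forms.append((method, attrs.get("action") or ""))
        elif tag == "input":
            kind = (attrs.get("type") or "text").lower()
            self.inputs.append((tag, kind, attrs.get("name")))
        elif tag == "textarea":
            self.inputs.append((tag, None, attrs.get("name")))
        elif tag == "a":
            href = attrs.get("href") or ""
            if urlparse(href).query:
                self.param_urls.append(href)


def find_entry_points(html):
    """Parse one HTML document and return the populated parser."""
    parser = EntryPointParser()
    parser.feed(html)
    return parser
```

A crawler would call find_entry_points on every page it fetches and merge the results.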

Figure 18 shows the entry_points options. Let's run a search on github.com (the
'https?://' prefix is optional):
Command: entry_points -d github.com

(Figure 19). Launch entry_points

Next, let's crawl more pages, save the result to Gather and, at the end, get an XML
report:
Command: entry_points -d github.com -l 2 -t 3 --debug --output
Note: --output saves the output of a module to Gather so that a report can be generated
later. It is a global switch that can be used with all modules.

(Figure 20). Github crawling with the entry_points.

Crawling can take a long time, depending on the number of pages. At the end, the
module shows all of the entry points and saves them to Gather.

Command: report xml <FILENAME> footprint/entry_points github.com

(Figure 21). xml report

CRAWL_PAGES

Suppose you need a crawler to extract specific data from web pages. What do you do?
That's why we wrote a module that searches web pages and finds your specific data with
regular expressions. crawl_pages allows users to find emails, social accounts, errors,
etc., and to extract strings during crawling by specifying a regex pattern with the -r
option.

(Figure 22). crawl_pages optional help

Let's run a simple search to find the jQuery version:

Command: crawl_pages -d owasp.org -r '(jquery[\w\d.\-]+)'
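
To see what this pattern captures, it can be tried on its own with Python's re module. The HTML snippet below is made up for illustration and is not taken from owasp.org.

```python
import re

# Same pattern as in the command above.
PATTERN = re.compile(r'(jquery[\w\d.\-]+)')

# Made-up sample content for illustration only.
sample = '''
<script src="/assets/js/jquery-3.5.1.min.js"></script>
<script src="https://cdn.example.test/jquery.cookie.js"></script>
'''

matches = PATTERN.findall(sample)
print(matches)
```

The character class stops at the closing quote, so each match is the file name including any version number embedded in it.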

(Figure 23). Simple using crawl_pages

(Figure 24). crawl_pages with --more option and get the number of css files

OSINT

(Figure 25). OSINT modules

OSINT modules are used to find emails, documents, DNS names and social networks.
Almost all of the OSINT modules use search engines; the exception is crawler.

What can be done with OSINT modules?

● Extract emails (with 10 sources to search)
● Find documents (pdf, csv, txt, xlsx, ..) with 10 sources to search
● Find DNS names with 24 sources to search (all of the sources are free)
● Find social networks
● Extract links (in scope, out of scope), comments, CSS and JS files, CDN links,
emails, docs and media files from web pages

It is not possible to cover all of the OSINT modules in one article, so let's get
acquainted with three of them: dns_search, crawler and docs_search. The godork,
email_search, onion_search and social_nets modules are almost identical to docs_search,
so once you know docs_search you can easily use all of them.

DNS_SEARCH

This module is really powerful for bug hunters who need a tool to find DNS names
quickly and thoroughly, for example when hunting bugs such as subdomain takeover.
dns_search finds subdomains from 24 open sources that are completely free and require
no API key.
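
When names come from many sources, the same subdomain is usually reported more than once and in different forms. The sketch below shows one plausible way to merge such results; it is an assumption about the general technique, not a description of how dns_search works internally, and merge_subdomains is a hypothetical name.

```python
def merge_subdomains(domain, *source_results):
    """Combine hostnames from several sources into one sorted, unique list."""
    domain = domain.lower().strip(".")
    unique = set()
    for results in source_results:
        for name in results:
            name = name.lower().strip().strip(".")
            # Wildcard entries such as *.cloud.example.com are reduced
            # to their base name.
            if name.startswith("*."):
                name = name[2:]
            # Keep only names that fall under the target domain.
            if name == domain or name.endswith("." + domain):
                unique.add(name)
    return sorted(unique)


source_a = ["www.ibm.com", "WWW.IBM.COM."]
source_b = ["*.cloud.ibm.com", "notibm.com", "ibm.com.evil.test"]
print(merge_subdomains("ibm.com", source_a, source_b))
```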

(Figure 26). dns_search optional help

Here we use this module to find ibm.com's subdomains to see how fast it is and how
many subdomains it can find (verbosity 3):
Command: dns_search -d ibm.com --thread 5 --max --output

(Figure 27). dns_search in progress

After 1 min 41 s on a 512 kbps connection, it found 32155 subdomains for ibm.com:

(Figure 28). Ibm subdomain report

CRAWLER

Crawler is a powerful module for crawling web pages and extracting data.

The differences between this crawler and other crawlers are as follows:

● It finds all of the URLs (in scope and out of scope), not only those in a[href].
● It can identify hidden pages.
● It detects media files and metadata files such as .mp4, .exe, .zip, etc. and does not
open these links.
● Besides links and metadata files, it also finds emails, DNS names, phone numbers,
comments, CDNs, etc.
● Another very useful feature is identifying social network accounts on sites such as
Facebook, GitHub, LinkedIn, etc.

In fact, crawler is a class in core/util/web_scrap.py; you can open the file to see
how it works.

(Figure 29). crawler optional help

Almost all of the options are the same as in other modules:

● -d / --domain: domain name or URL string
● --debug: displays live links | default = False
● --output: saves the output of the module to the workspace | default = False
● -l / --limit: sets a recursion limit for crawling. For example, a depth of 2 means
the crawler will find all the links on the homepage (limit 1) and then crawl those
links as well (limit 2) | default = 1
● -t / --thread: sets the number of concurrent requests made to the target | default = 1
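
The meaning of the recursion limit can be sketched with a small level-by-level crawl. This is an illustration of the idea only, not the code in core/util/web_scrap.py: fetch_links stands for any function that returns the links found on a page, and the SITE dictionary is a made-up link graph used in place of real HTTP requests.

```python
def crawl(start, fetch_links, limit=1):
    """Collect links level by level, following them up to 'limit' levels."""
    seen = {start}
    frontier = [start]
    found = set()
    for _ in range(limit):
        next_frontier = []
        for url in frontier:
            for link in fetch_links(url):
                found.add(link)
                # Only pages not visited before are crawled at the next level.
                if link not in seen:
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return found


# Made-up link graph standing in for a real site.
SITE = {
    "home": ["about", "blog"],
    "about": ["team"],
    "blog": ["home"],
    "team": [],
}


def site_links(url):
    return SITE.get(url, [])


print(sorted(crawl("home", site_links, limit=1)))
print(sorted(crawl("home", site_links, limit=2)))
```

With limit 1 only the links on the start page are returned; with limit 2 those pages are opened as well, so links that appear only on them are found too.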
Crawl a single web page:
Command: crawler -d python.org

(Figure 30). Launch crawler

(Figure 31). crawl a single web page

DOCS_SEARCH

docs_search allows users to find documents on search engines for a specific company,
keywords or anything else.

(Figure 32). docs_search optional help

Main options:

● -q / --query: search query (it can be a hostname, company, keywords, ..)
● -f / --file: file type (one or more file types can be given, but it is better to use one)
● --output: saves the output of the module to the workspace | default = False
● -l / --limit: number of pages to open for the search | default = 1
● -c / --count: number of results per page | default = 50
● -e / --engines: names of the search engines to use (separated with commas)
| default = exalead,bing
● -t / --thread: the number of engines that run simultaneously | default = 2
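
The option list above maps naturally onto a command-line parser. The following sketch mirrors the switches and defaults listed here using Python's argparse; it is not Maryam's own parser, and marking --query as required is an assumption of this example.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the docs_search options listed above."""
    parser = argparse.ArgumentParser(prog="docs_search")
    # Treating --query as required is an assumption of this sketch.
    parser.add_argument("-q", "--query", required=True, help="search query")
    parser.add_argument("-f", "--file", help="file type, e.g. txt or pdf")
    parser.add_argument("--output", action="store_true",
                        help="save output to the workspace")
    parser.add_argument("-l", "--limit", type=int, default=1,
                        help="number of pages to open")
    parser.add_argument("-c", "--count", type=int, default=50,
                        help="results per page")
    parser.add_argument("-e", "--engines", default="exalead,bing",
                        help="comma-separated engine names")
    parser.add_argument("-t", "--thread", type=int, default=2,
                        help="engines run simultaneously")
    return parser


args = build_parser().parse_args(
    "-q amazon -f txt -e google,bing,metacrawler --thread 3".split()
)
engines = args.engines.split(",")
print(args.query, args.file, engines, args.thread, args.limit, args.count)
```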

To find text files related to amazon:

Command: docs_search -q amazon -f txt -e google,bing,metacrawler --thread 3
(Figure 33). Docs_search launch

SEARCH

(Figure 34). Search modules

Search modules have been created to reduce the time it takes to search free
resources. Users can search social networks, certificates, images, news, etc.

What can be done with search modules?

● Search the best search engines without API keys
● Search common social networks and find people, hashtags and statuses
● Search for images, news, websites, etc.

Let's run a search to find people talking about coronavirus on Twitter:
Command: twitter -q coronavirus -e google,bing,carrot2

(Figure 35). Twitter (not fullscreen)

Or find people named “John Doe” on LinkedIn:

Command: linkedin -q "John Doe" -e google,carrot2,bing -l 3 -c 50
(Figure 36). Linkedin (not fullscreen)

CRT

Next, we use crt to find certificates for telegram.org:

Command: crt -q telegram.org

(Figure 37). crt launch

WHO DEVELOPED OWASP MARYAM?

Maryam is developed by Saeed Dehqan, and there is currently no other partnership. I
started OWASP Maryam in 2017.

CONCLUSION

Maryam is under active development: new ideas are tested on a daily basis, and new
modules and features are added rapidly. It is an essential and fast tool for hackers
and penetration testers. It saves a lot of time in search and reconnaissance, and it is
open-source and free for all users and always will be. Users can use Maryam for the
first phase of pentesting.

OWASP Page

GitHub