Open Source Intelligence (OSINT) with OWASP Maryam
Saeed Dehqan
OWASP
All content following this page was uploaded by Saeed Dehqan on 31 January 2022.
Open-source intelligence (OSINT) is the collection and analysis of information from
publicly available sources for a specific purpose. OSINT can be very helpful for hackers
who need to gather data about a particular organization. Today, gathering data from open
sources is one of the critical steps of reconnaissance, and because it is such a common
task, it makes sense to automate it with a tool. One of the best tools in this field is
OWASP Maryam. Maryam uses NLP and ML algorithms not only to gather information but also
to assess it.
INTRODUCTION
OWASP Maryam is a modular/optional open-source framework based on OSINT and data
gathering. Maryam is written in Python and is designed to provide a powerful environment
for harvesting data from open sources and search engines quickly and thoroughly. If you
have experience with Metasploit or Recon-ng, you can use Maryam right away; if not, it is
still easy to learn.
GETTING STARTED
Note: Currently this tool is optimized for Linux and Darwin (macOS), so it may not work
on other operating systems such as Windows.
Prerequisites
Maryam requires Python 3.8+ and uses the Python package manager pip (PyPI) for package
installation.
Install
maryam
Interactive help
All modules and commands have interactive help documentation. For root commands,
typing help <command> will provide this documentation, which includes
documentation for invoking subcommands. For subcommands, providing the wrong
input will prompt the framework to display specific subcommand level documentation.
Also typing <command-name> or <module-name> will provide documentation.
(Figure 2). help commands
Note: Command and module names are case-sensitive, but the option names used with set are
not.
Command Completion
Report Interaction
Gather is a default file in each workspace that stores the output of the modules in JSON
format. When running a module, users can save its output to Gather by setting the
--output option. With the report command, users can then export data from Gather in
different formats. For example, after using email_search to harvest emails, users can
export the harvested emails with the report command:
report xml harvested_emails email_search target.com
Format of report command:
report <format> <filename> <module-name> <target>
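To make the four fields of the report format concrete, here is a minimal Python sketch that splits a report line into its parts. It only illustrates the syntax shown above; ReportRequest and parse_report are hypothetical names and are not part of Maryam.

```python
import shlex
from typing import NamedTuple


class ReportRequest(NamedTuple):
    """Hypothetical container for the four fields of a report command."""
    fmt: str
    filename: str
    module: str
    target: str


def parse_report(line: str) -> ReportRequest:
    """Split 'report <format> <filename> <module-name> <target>' into fields."""
    parts = shlex.split(line)
    if len(parts) != 5 or parts[0] != "report":
        raise ValueError("expected: report <format> <filename> <module-name> <target>")
    return ReportRequest(*parts[1:])


request = parse_report("report xml harvested_emails email_search target.com")
print(request.fmt, request.filename, request.module, request.target)
```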
Shell Commands
The shell command and its ! alias let users run system commands on the local machine from
within the framework. In addition, any input that is not a framework command is run as a
shell command by default (framework commands take priority). If the history option is set
to true, the framework stores the commands.
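The priority rule described above can be sketched in a few lines of Python. This is an illustration of the idea, not Maryam's actual dispatcher: dispatch, system_shell and the one-entry command table are hypothetical.

```python
import subprocess
from typing import Callable, Dict


def dispatch(line: str, commands: Dict[str, Callable[[str], str]],
             run_shell: Callable[[str], str]) -> str:
    """Run a framework command if one matches; otherwise fall back to the shell."""
    line = line.strip()
    # The "!" alias always goes straight to the system shell.
    if line.startswith("!"):
        return run_shell(line[1:].strip())
    name, _, rest = line.partition(" ")
    # So does the explicit "shell" command.
    if name == "shell":
        return run_shell(rest.strip())
    # Framework commands take priority over system commands of the same name.
    if name in commands:
        return commands[name](rest.strip())
    # Anything else is treated as a shell command.
    return run_shell(line)


def system_shell(command: str) -> str:
    """Execute a command in the local shell and return its standard output."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout


# Hypothetical framework command table with a single entry.
framework_commands = {
    "help": lambda args: "framework help for " + (args or "all commands"),
}

print(dispatch("help set", framework_commands, system_shell))
print(dispatch("!echo hello", framework_commands, system_shell))
```

Passing the shell runner as a parameter keeps the priority logic easy to test without executing real system commands.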
Optional interface
The optional interface can be used directly, with options set through switches. To use
it, write just the module name without its directory: tldbrute, not footprint/tldbrute.
Global Options
Pay attention to the global options, as they have changed over time and have a large
impact on the performance of the framework. Global options are the options that are
available in the global context of the framework and have a global effect on how the
framework operates. Global options such as "VERBOSITY" and "PROXY" drastically
change how the modules present feedback and make web requests. Explore and
understand the global options before diving into the modules. For instance, if
"VERBOSITY" is "2", the framework shows all errors, HTTP headers and statuses.
To set an option, write set <OPTION-NAME> <VALUE>; to unset it, write unset
<OPTION-NAME>.
Note: Be careful with the rand_agent option. Set it to false if you want to use search
engines.
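As a rough model of how set and unset behave with case-insensitive option names, consider the following Python sketch. The GlobalOptions class, the default values, and the choice to represent an unset option as None are all assumptions made for illustration; they do not describe Maryam's internals.

```python
class GlobalOptions:
    """Toy option store; option names are matched case-insensitively."""

    def __init__(self, defaults):
        # Normalize every name to upper case once, at construction time.
        self._values = {name.upper(): value for name, value in defaults.items()}

    def set(self, name, value):
        key = name.upper()
        if key not in self._values:
            raise KeyError(f"unknown option: {name}")
        self._values[key] = value

    def unset(self, name):
        # Assumption: an unset option is represented as None.
        self.set(name, None)

    def get(self, name):
        return self._values[name.upper()]


# Hypothetical defaults; only the option names come from the article.
options = GlobalOptions({"VERBOSITY": 1, "PROXY": None, "RAND_AGENT": True})
options.set("verbosity", 2)
options.set("rand_agent", False)
print(options.get("VERBOSITY"), options.get("Rand_Agent"))
```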
Workspaces
Note: Try to create a workspace for each target, purpose or company. It is a good way to
manage data.
Flexible Loading
Even with command completion, module loading can be cumbersome because of the
directory structure of the module tree. To make module loading easier, the framework is
equipped with a smart loading feature. This feature allows modules to be loaded by
referring to a keyword unique to the desired module's name. For instance, load email
will load the "osint/email_search" module without requiring the full path since it is the
only module containing the string "email". Attempting to smart load with a string that
exists in more than one module name will result in a list of all possible modules for the
given keyword. For example, there are many modules whose names contain the string
"search". Therefore, the command load search would not load a module but return a list
of possible modules for the user to reference by full module name.
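The smart loading rule can be illustrated with a short Python sketch. This is not the framework's code: smart_load is a hypothetical function, and the module list is a small assumed subset whose paths are inferred from the section names in this article (only osint/email_search and footprint/tldbrute appear with their full paths).

```python
from typing import List, Optional, Tuple

# Assumed subset of the module tree, for illustration only.
MODULES = [
    "footprint/dbrute",
    "footprint/tldbrute",
    "footprint/entry_points",
    "footprint/crawl_pages",
    "osint/email_search",
    "osint/dns_search",
    "osint/docs_search",
]


def smart_load(keyword: str, modules: List[str]) -> Tuple[Optional[str], List[str]]:
    """Return (module, []) on a unique match, otherwise (None, candidates)."""
    # A full module path always loads directly.
    if keyword in modules:
        return keyword, []
    matches = [module for module in modules if keyword in module]
    if len(matches) == 1:
        return matches[0], []
    # Zero or several matches: nothing is loaded, the candidates are listed.
    return None, matches


print(smart_load("email", MODULES))
print(smart_load("search", MODULES))
```

With this list, "email" identifies a single module, while "search" matches three and therefore only returns the candidates.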
● Footprint
● OSINT
● Search
● Iris: This is a beta version. We are going to use ML and NLP for threat analysis.
FOOTPRINT
The Footprint section contains modules to crawl, identify, gather and analyze.
Let's get familiar with some common Footprint modules and learn how to use them.
DBRUTE
The dbrute module brute-forces DNS names quickly and easily with your own wordlist or the
default one. To use the optional interface, just write the module name:
Figure 12 shows the dbrute options. Next, we run a simple DNS brute-force attack on
yahoo.com with the default wordlist:
Command: dbrute -d yahoo.com
(Figure 14). launch dbrute -d yahoo.com (not full screen)
dbrute started with 790 payloads. After 100 s it had tested all 790 payloads.
To use a custom wordlist, pass it with -w:
dbrute -d <DOMAIN> -w https://raw.githubusercontent.com/danielmiessler/SecLists/master/Discovery/DNS/namelist.txt
Several good wordlists of different sizes are available; list them with (Figure 18):
dbrute --wordlists
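For readers who want to see the idea behind a DNS brute force, here is a minimal Python sketch: each word from a wordlist is prefixed to the domain and the resulting hostname is resolved. It is not dbrute's implementation. The function names, the thread count and the stand-in resolver used in the demonstration are assumptions; the demonstration uses fake records so that no DNS traffic is sent.

```python
import socket
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional


def resolve(hostname: str) -> Optional[str]:
    """Return the IPv4 address of hostname, or None if it does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except (OSError, UnicodeError):
        return None


def brute_subdomains(domain: str, words: Iterable[str],
                     resolver: Callable[[str], Optional[str]] = resolve,
                     threads: int = 8) -> Dict[str, str]:
    """Try <word>.<domain> for every word and keep the names that resolve."""
    candidates = [f"{word.strip()}.{domain}" for word in words if word.strip()]
    # Resolve candidates concurrently; map() preserves the input order.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        addresses = list(pool.map(resolver, candidates))
    return {host: ip for host, ip in zip(candidates, addresses) if ip}


# Offline demonstration with a stand-in resolver (documentation addresses).
fake_records = {"www.example.com": "192.0.2.10", "mail.example.com": "192.0.2.25"}
found = brute_subdomains("example.com", ["www", "mail", "vpn"], resolver=fake_records.get)
print(found)
```

Leaving out the resolver argument makes the function use real DNS lookups; only do this against domains you are authorized to test.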
ENTRY_POINTS
The entry_points module crawls web pages and finds entry points such as forms, inputs
(hidden, password, text, ...), textareas, upload inputs, GET URLs and anything else that
can be useful for fuzzing attacks.
(Figure 18). entry_points optional help
Next, let's use the scraper to search more pages, save the result to Gather and finally
get an XML report:
Command: entry_points -d github.com -l 2 -t 3 --debug --output
Note: --output saves the output of a module to Gather so that a report can be generated
later. It is a global switch that can be used with all modules.
Crawling can take a long time, depending on the number of pages. At the end, the module
shows all of the entry points and saves them to Gather.
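To show what finding entry points means for a single page, the following Python sketch collects forms, inputs and textareas with the standard library's HTML parser. It is a simplified illustration, not the entry_points module: the EntryPointParser class and the sample page are hypothetical, and a real crawler would also follow links and collect GET URLs.

```python
from html.parser import HTMLParser


class EntryPointParser(HTMLParser):
    """Collect forms, inputs and textareas from one HTML page."""

    def __init__(self):
        super().__init__()
        self.forms = []      # list of (method, action)
        self.inputs = []     # list of (type, name)
        self.textareas = []  # list of names

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Attribute values may be None (for example <input name>), hence "or".
        if tag == "form":
            method = (attributes.get("method") or "get").lower()
            self.forms.append((method, attributes.get("action") or ""))
        elif tag == "input":
            kind = (attributes.get("type") or "text").lower()
            self.inputs.append((kind, attributes.get("name") or ""))
        elif tag == "textarea":
            self.textareas.append(attributes.get("name") or "")


# Hypothetical sample page.
SAMPLE_PAGE = """
<form action="/login" method="POST">
  <input type="text" name="user">
  <input type="password" name="pass">
  <input type="hidden" name="csrf" value="abc">
  <input type="file" name="avatar">
  <textarea name="bio"></textarea>
</form>
"""

parser = EntryPointParser()
parser.feed(SAMPLE_PAGE)
print(parser.forms)
print(parser.inputs)
print(parser.textareas)
```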
Suppose you need a crawler to extract specific data from web pages. What would you do?
That is why we wrote a module that crawls web pages and finds specific data with regular
expressions. crawl_pages allows users to find emails, social accounts, errors, etc., and
to extract strings during crawling by specifying a regex pattern with the -r option.
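The regex-based extraction can be sketched as follows, assuming the pages have already been downloaded. This is an illustration rather than the crawl_pages code: extract_matches, the email pattern and the sample pages are all hypothetical.

```python
import re
from typing import Dict, List


def extract_matches(pages: Dict[str, str], pattern: str) -> Dict[str, List[str]]:
    """Apply one regex to every page and keep the unique matches per URL."""
    regex = re.compile(pattern)
    results = {}
    for url, text in pages.items():
        found = sorted({match.group(0) for match in regex.finditer(text)})
        if found:
            results[url] = found
    return results


# A simple (not exhaustive) email pattern, used here as the custom regex.
EMAIL_PATTERN = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"

# Hypothetical page contents keyed by URL.
pages = {
    "https://example.com/": "Contact admin@example.com or sales@example.com.",
    "https://example.com/about": "No addresses here.",
}

matches = extract_matches(pages, EMAIL_PATTERN)
print(matches)
```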
OSINT
OSINT modules are used to find emails, documents, DNS names and social networks.
Almost all of the OSINT modules use search engines, except crawler.
It is not possible to cover all of the OSINT modules in one article, so let's get
acquainted with three of them: dns_search, crawler and docs_search. The godork,
email_search, onion_search and social_nets modules are almost identical, so once you
know docs_search you can easily use all of them.
DNS_SEARCH
This module is really powerful for bug hunters who need a tool that finds DNS names
quickly and thoroughly, for bugs such as subdomain takeover.
dns_search finds subdomains from 24 open sources that are completely free and require no
API key.
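When names come from many sources, the results have to be normalized and de-duplicated. The Python sketch below shows one way to do that; it is an assumption about the general approach, not a description of how dns_search works, and the source names and sample data are invented.

```python
from typing import Dict, Iterable, List


def merge_subdomains(domain: str, source_results: Dict[str, Iterable[str]]) -> List[str]:
    """Merge per-source name lists into one sorted list of unique subdomains."""
    suffix = "." + domain.lower()
    merged = set()
    for names in source_results.values():
        for name in names:
            # Normalize case, surrounding whitespace and a trailing root dot.
            host = name.strip().lower().rstrip(".")
            # Wildcard entries such as *.dev.example.com reveal dev.example.com.
            if host.startswith("*."):
                host = host[2:]
            # Keep only real subdomains of the target domain.
            if host.endswith(suffix):
                merged.add(host)
    return sorted(merged)


# Invented results from two hypothetical sources.
results = {
    "source_a": ["WWW.example.com", "mail.example.com."],
    "source_b": ["*.dev.example.com", "www.example.com", "evil-example.com", "example.org"],
}
print(merge_subdomains("example.com", results))
```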
CRAWLER
In fact, crawler is a class in core/util/web_scrap.py. You can open the file to see how it
works.
DOCS_SEARCH
docs_search allows users to find documents on search engines for a specific company,
keywords or anything else.
(Figure 32). docs_search optional help
Main options:
Search
Search modules were created to reduce the time it takes to search free resources. Users
can search social networks, certificates, images, news, etc.
What can be done with search modules?
Let's run a search to find people talking about coronavirus on Twitter:
Command: twitter -q coronavirus -e google,bing,carrot2
CRT
CONCLUSION
Maryam is under active development: new ideas are tested daily, and new modules and
features are added rapidly. It is an essential and fast tool for hackers and penetration
testers. It saves a lot of time in search and reconnaissance, and it is open-source and
free for all users and always will be. Users can use Maryam for the first phase of
pentesting.
OWASP Page
GitHub