Information Gathering - Web Edition Module Cheat Sheet
Web reconnaissance is the first step in any security assessment or penetration testing engagement. It's akin to a
detective's initial investigation, meticulously gathering clues and evidence about a target before formulating a plan of
action. In the digital realm, this translates to accumulating information about a website or web application to identify
potential vulnerabilities, security misconfigurations, and valuable assets.
The primary goals of web reconnaissance revolve around gaining a comprehensive understanding of the target's digital
footprint. This includes:
Identifying Assets: Discovering all associated domains, subdomains, and IP addresses provides a map of the
target's online presence.
Uncovering Hidden Information: Web reconnaissance aims to uncover directories, files, and technologies that
are not readily apparent and could serve as entry points for an attacker.
Analyzing the Attack Surface: By identifying open ports, running services, and software versions, you can
assess the potential vulnerabilities and weaknesses of the target.
Gathering Intelligence: Collecting information about employees, email addresses, and technologies used can
aid in social engineering attacks or identifying specific vulnerabilities associated with certain software.
Web reconnaissance can be conducted using either active or passive techniques, each with its own advantages and
drawbacks:
Active Reconnaissance: Involves directly interacting with the target system, such as sending probes or requests. Risk of detection: higher. Examples: port scanning, vulnerability scanning, network mapping.
Passive Reconnaissance: Gathers information without directly interacting with the target, relying on publicly available data. Risk of detection: lower. Examples: search engine queries, WHOIS lookups, DNS enumeration, web archive analysis, social media analysis.
WHOIS
WHOIS is a query and response protocol used to retrieve information about domain names, IP addresses, and other
internet resources. It's essentially a directory service that details who owns a domain, when it was registered, contact
information, and more. In the context of web reconnaissance, WHOIS lookups can be a valuable source of information,
potentially revealing the identity of the website owner, their contact information, and other details that could be used for
further investigation or social engineering attacks.
For example, if you wanted to find out who owns the domain example.com, you could run the following command in your
terminal:
whois example.com
This would return a wealth of information, including the registrar, the registration and expiration dates, the nameservers, and contact information for the domain owner.
However, it's important to note that WHOIS data can be inaccurate or intentionally obscured, so it's always wise to
verify the information from multiple sources. Privacy services can also mask the true owner of a domain, making it
more difficult to obtain accurate information through WHOIS.
DNS
The Domain Name System (DNS) functions as the internet's GPS, translating user-friendly domain names into the
numerical IP addresses computers use to communicate. Like GPS converting a destination's name into coordinates,
DNS ensures your browser reaches the correct website by matching its name with its IP address. This eliminates
memorizing complex numerical addresses, making web navigation seamless and efficient.
The dig command allows you to query DNS servers directly, retrieving specific information about domain names. For
instance, if you want to find the IP address associated with example.com, you can execute the following command:
dig example.com A
This command instructs dig to query the DNS for the A record (which maps a hostname to an IPv4 address) of
example.com. The output will typically include the requested IP address, along with additional details about the query
and response. By mastering the dig command and understanding the various DNS record types, you gain the ability to
extract valuable information about a target's infrastructure and online presence.
DNS servers store various types of records, each serving a specific purpose:
A: Maps a hostname to an IPv4 address.
AAAA: Maps a hostname to an IPv6 address.
CNAME: Creates an alias, pointing one hostname to another.
MX: Specifies mail servers responsible for handling email for the domain.
NS: Identifies the authoritative nameservers for the domain.
TXT: Holds arbitrary text, commonly used for domain verification and email policies such as SPF.
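For instance, to list the mail servers handling email for a domain, you could query its MX records with dig (example.com again serving as a placeholder):
dig example.com MX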
Subdomains
Subdomains are essentially extensions of a primary domain name, often used to organize different sections or services
within a website. For example, a company might use mail.example.com for their email server or blog.example.com for
their blog.
From a reconnaissance perspective, subdomains are incredibly valuable. They can expose additional attack surfaces,
reveal hidden services, and provide clues about the internal structure of a target's network. Subdomains might host
development servers, staging environments, or even forgotten applications that haven't been properly secured.
The process of discovering subdomains is known as subdomain enumeration. There are two main approaches to
subdomain enumeration:
Active Enumeration: Directly interacts with the target's DNS servers or utilizes tools to probe for subdomains. Examples: brute-forcing, DNS zone transfers.
Passive Enumeration: Collects information about subdomains without directly interacting with the target, relying on public sources. Examples: Certificate Transparency (CT) logs, search engine queries.
Active enumeration can be more thorough but carries a higher risk of detection. Conversely, passive enumeration is
stealthier but may not uncover all subdomains. Combining both techniques can significantly increase the likelihood of
discovering a comprehensive list of subdomains associated with your target, expanding your understanding of their
online presence and potential vulnerabilities.
Subdomain Brute-Forcing
Subdomain brute-forcing is a proactive technique used in web reconnaissance to uncover subdomains that may not be
readily apparent through passive methods. It involves systematically generating many potential subdomain names and
testing them against the target's DNS server to see if they exist. This approach can unveil hidden subdomains that may
host valuable information, development servers, or vulnerable applications.
One of the most versatile tools for subdomain brute-forcing is dnsenum. This powerful command-line tool combines
various DNS enumeration techniques, including dictionary-based brute-forcing, to uncover subdomains associated with
your target.
To use dnsenum for subdomain brute-forcing, you'll typically provide it with the target domain and a wordlist containing
potential subdomain names. The tool will then systematically query the DNS server for each potential subdomain and
report any that exist.
For example, the following command would attempt to brute-force subdomains of example.com using a wordlist named
subdomains.txt:
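dnsenum example.com -f subdomains.txt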
Zone Transfers
DNS zone transfers, also known as AXFR (Asynchronous Full Transfer) requests, offer a potential goldmine of
information for web reconnaissance. A zone transfer is a mechanism for replicating DNS data across servers. When a
zone transfer is successful, it provides a complete copy of the DNS zone file, which contains a wealth of details about
the target domain.
This zone file lists all the domain's subdomains, their associated IP addresses, mail server configurations, and other
DNS records. This is akin to obtaining a blueprint of the target's DNS infrastructure for a reconnaissance expert.
To attempt a zone transfer, you can use the dig command with the axfr (full zone transfer) option. For example, to
request a zone transfer from the DNS server ns1.example.com for the domain example.com, you would execute:
dig @ns1.example.com example.com axfr
However, zone transfers are not always permitted. Many DNS servers are configured to restrict zone transfers to
authorized secondary servers only. Misconfigured servers, though, may allow zone transfers from any source,
inadvertently exposing sensitive information.
Virtual Hosts
Virtual hosting is a technique that allows multiple websites to share a single IP address. Each website is associated
with a unique hostname, which is used to direct incoming requests to the correct site. This can be a cost-effective way
for organizations to host multiple websites on a single server, but it can also create a challenge for web
reconnaissance.
Since multiple websites share the same IP address, simply scanning the IP won't reveal all the hosted sites. You need
a tool that can test different hostnames against the IP address to see which ones respond.
Gobuster is a versatile tool that can be used for various types of brute-forcing, including virtual host discovery. Its vhost
mode is designed to enumerate virtual hosts by sending requests to the target IP address with different hostnames. If a
virtual host is configured for a specific hostname, Gobuster will receive a response from the web server.
To use Gobuster to brute-force virtual hosts, you'll need a wordlist containing potential hostnames. Here's an example
command:
gobuster vhost -u http://192.0.2.1 -w hostnames.txt
In this example, -u specifies the target IP address, and -w specifies the wordlist file. Gobuster will then systematically try each hostname in the wordlist and report any that result in a valid response from the web server.
Certificate Transparency Logs
Certificate Transparency (CT) logs are public, append-only records of issued SSL/TLS certificates, and the hostnames listed in those certificates frequently expose subdomains. The crt.sh website provides a searchable interface for CT logs. To efficiently extract subdomains using crt.sh within your terminal, you can use a command like this:
curl -s "https://crt.sh/?q=%25.example.com&output=json" | jq -r '.[].name_value' | sed 's/\*\.//g' | sort -u
This command fetches JSON-formatted data from crt.sh for example.com (the % is a wildcard), extracts domain names
using jq, removes any wildcard prefixes (*.) with sed, and finally sorts and deduplicates the results.
Web Crawling
Web crawling is the automated exploration of a website's structure. A web crawler, or spider, systematically navigates
through web pages by following links, mimicking a user's browsing behavior. This process maps out the site's
architecture and gathers valuable information embedded within the pages.
A crucial file that guides web crawlers is robots.txt. This file resides in a website's root directory and dictates which
areas are off-limits for crawlers. Analyzing robots.txt can reveal hidden directories or sensitive areas that the website
owner doesn't want to be indexed by search engines.
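To quickly review a target's robots.txt yourself, a simple fetch is usually enough (example.com used here as a placeholder):
curl -s http://example.com/robots.txt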
Scrapy is a powerful and efficient Python framework for large-scale web crawling and scraping projects. It provides a
structured approach to defining crawling rules, extracting data, and handling various output formats.
import scrapy

class ExampleSpider(scrapy.Spider):
    name = "example"
    start_urls = ['http://example.com/']

    def parse(self, response):
        # Record every link found on the page under the "file" key
        yield from ({'file': response.urljoin(href)} for href in response.css('a::attr(href)').getall())
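Assuming the spider above is saved as example_spider.py (a hypothetical filename), it could be run and its results exported to JSON with Scrapy's runspider command:
scrapy runspider example_spider.py -o example_data.json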
After running the Scrapy spider, you'll have a file containing scraped data (e.g., example_data.json). You can analyze
these results using standard command-line tools. For instance, to extract all links:
jq -r '.[] | select(.file != null) | .file' example_data.json | sort -u
This command uses jq to pull each non-null file entry out of the JSON, then sorts and deduplicates the results. Extending the pipeline with awk, sort, and uniq -c additionally lets you isolate file extensions and count their occurrences. By scrutinizing the extracted data, you can identify patterns, anomalies, or sensitive files that might be of interest for further investigation.
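A minimal sketch of that extension count, assuming the same example_data.json layout as above:
jq -r '.[] | select(.file != null) | .file' example_data.json | awk -F. '{print $NF}' | sort | uniq -c | sort -rn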
Search Engine Discovery
Search engines support advanced operators (often called Google dorks) that narrow results down to exactly what you are looking for:
inurl: searches for a specific term in the URL of a page. Example: inurl:admin login
intitle: searches for a term within the title of a page. Example: intitle:"index of" /backup
"search term" searches for the exact phrase within quotation marks. Example: "internal error" site:example.com
By creatively combining these operators and crafting targeted queries, you can uncover sensitive documents, exposed
directories, login pages, and other valuable information that may aid in your reconnaissance efforts.
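As a rough illustration (with example.com standing in for the target), a combined query might look like this:
site:example.com intitle:"login" inurl:admin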
Web Archives
Web archives are digital repositories that store snapshots of websites across time, providing a historical record of their
evolution. Among these archives, the Wayback Machine is the most comprehensive and accessible resource for web
reconnaissance.
The Wayback Machine, a project by the Internet Archive, has been archiving the web for over two decades, capturing
billions of web pages from across the globe. This massive historical data collection can be an invaluable resource for
security researchers and investigators.
Historical Snapshots: View past versions of websites, including pages, content, and design changes. Use case: identify past website content or functionality that is no longer available.
Hidden Directories: Explore directories and files that may have been removed or hidden from the current version of the website. Use case: discover sensitive information or backups that were inadvertently left accessible in previous versions.
Content Changes: Track changes in website content, including text, images, and links. Use case: identify patterns in content updates and assess the evolution of a website's security posture.
By leveraging the Wayback Machine, you can gain a historical perspective on your target's online presence, potentially
revealing vulnerabilities that may have been overlooked in the current version of the website.
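One practical way to pull this history from the terminal is the Wayback Machine's CDX API. A minimal sketch (example.com as a placeholder; parameters as commonly documented) that lists archived URLs for a domain:
curl -s "https://web.archive.org/cdx/search/cdx?url=example.com/*&fl=original&collapse=urlkey" | sort -u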