Information Gathering: Difference between revisions

From HackOps
Previous revision (wikitext):

= Information Gathering =

'''Information gathering''' is the initial phase of hacking and reconnaissance.
It focuses on collecting technical and contextual data about a target system, organization, or individual, before any exploitation is attempted.

It includes both '''passive methods''' (observing without interacting directly) and '''active methods''' (engaging with the target system to elicit responses).
The purpose is to establish a baseline understanding of the digital environment, reveal potential vulnerabilities, and map the attack surface.

== Techniques ==

Information gathering relies on a wide range of techniques and tools, depending on scope and approach:

=== Passive Reconnaissance ===
* Monitoring public data sources (search engines, social media, company websites)
* Collecting DNS and WHOIS records
* Reviewing public repositories, job postings, and metadata leaks

=== Active Reconnaissance ===
* Performing port scans
* Fingerprinting services and operating systems
* Querying DNS servers directly
* Testing server responses to crafted inputs

== Subcategories ==
* [[DNS Reconnaissance]] – Interrogate DNS to uncover subdomains, records, zones, and relationships.
* [[Network Scanning Tools]] – Use scanners like Nmap or Masscan to map open ports and services.
* [[OSINT Tools]] – Gather public data using platforms like theHarvester, SpiderFoot, and custom scripts.

== Purpose ==

The main objective is to reduce the unknowns in a system.
By compiling an accurate profile of a target, security professionals and researchers can make informed decisions about how to proceed.

This process is essential in both ethical penetration testing and adversarial threat modeling.

== Common Goals ==
* Discover live hosts and IP ranges
* Identify open ports and running services
* Map subdomains and infrastructure
* Determine software versions and potential vulnerabilities
* Extract metadata and leaked internal references
* Enumerate usernames, emails, or associated accounts

== Considerations ==
* Active scanning can generate detectable traffic; caution is advised when testing external targets.
* Passive techniques offer stealth but may return outdated or incomplete information.
* All data gathered should be documented clearly for later analysis and correlation.

== Related Concepts ==
* [[Footprinting]]
* [[Enumeration]]
* [[Recon-ng]]
* [[Threat Modeling]]

Revision as of 13:56, 11 May 2025 (wikitext):

== Passive Reconnaissance ==

Passive techniques involve no direct interaction with the target system. They rely on publicly available data, and are less likely to trigger detection mechanisms.

=== Common Techniques ===
* Monitoring public websites and content (company pages, blogs, changelogs)
* Analyzing social media presence of employees or departments
* Querying DNS and WHOIS records using tools like [[whois]], [[dnsdumpster]], [[crt.sh]]
* Reviewing pastebin dumps and breach databases
* Harvesting metadata from exposed documents and images
* Searching public repositories (GitHub leaks, internal code or config files)
* Mapping infrastructure using [[Shodan]] and [[Censys]]

=== Tools ===
* [[theHarvester]]
* [[Recon-ng]]
* [[SpiderFoot]]
* [[Maltego]]
* [[FOCA]] (for metadata extraction)
* [[GitHub Dorking Tools]]

== Active Reconnaissance ==

Active techniques involve sending packets to the target system and observing responses. This can reveal detailed technical data but may trigger logging or alerts.

=== Common Techniques ===
* Scanning open ports using [[Nmap]] or [[Masscan]]
* Banner grabbing to identify services
* OS fingerprinting using TCP/IP stack behavior
* DNS zone transfers and brute-forcing with [[dnsrecon]] or [[dnsenum]]
* Detecting WAFs, proxies, or CDNs
* Enumerating services like SMB, FTP, HTTP, SNMP

=== Tools ===
* [[Nmap]]
* [[Masscan]]
* [[Amass]]
* [[dnsenum]]
* [[whatweb]]
* [[Netcat]]
* [[Nikto]]
* [[Wappalyzer]]

== Hybrid / Semi-Passive Techniques ==

Some techniques blur the line between passive and active.

* Certificate Transparency Log monitoring (e.g. [[crt.sh]])
* Passive DNS databases
* Third-party subdomain enumeration (without DNS queries)
* Crawling public GitHub issues for leaked credentials
* Using APIs to gather external data (e.g. [[SecurityTrails]], [[Shodan API]])

== Structuring Your Recon ==
A common workflow combines both passive and active methods:

1. **Start passive:** collect domains, emails, tech stack, leaked info
2. **Enumerate targets:** subdomains, IPs, related infrastructure
3. **Engage actively:** scan ports, fingerprint services, probe for weaknesses
4. **Document everything:** maintain structured notes and timestamps

Revision as of 13:56, 11 May 2025

Passive Reconnaissance

Passive techniques involve no direct interaction with the target system. They rely on publicly available data and are therefore less likely to trigger detection mechanisms.

Common Techniques

  • Monitoring public websites and content (company pages, blogs, changelogs)
  • Analyzing social media presence of employees or departments
  • Querying DNS, WHOIS, and certificate records using tools like whois, dnsdumpster, and crt.sh
  • Reviewing pastebin dumps and breach databases
  • Harvesting metadata from exposed documents and images
  • Searching public repositories (GitHub leaks, internal code or config files)
  • Mapping infrastructure using Shodan and Censys
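
Much of the passive work above ends with a pile of harvested text (page content, paste dumps, document metadata) that has to be reduced to in-scope names. The following is a minimal sketch of that reduction, assuming the text has already been collected by other means. The function name `extract_indicators` and the two regular expressions are illustrative choices, not part of any tool listed here, and the patterns are deliberately loose.

```python
import re

# Loose illustrative patterns; real address and hostname syntax is broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
HOST_RE = re.compile(r"\b(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,}\b")


def extract_indicators(text: str, domain: str) -> dict[str, list[str]]:
    """Return sorted, de-duplicated emails and hostnames that belong to `domain`."""
    domain = domain.lower()
    suffix = "." + domain

    def in_scope(name: str) -> bool:
        # Match the domain itself or any subdomain, but not look-alikes
        # such as "notexample.com".
        return name == domain or name.endswith(suffix)

    emails = set()
    for match in EMAIL_RE.findall(text):
        address = match.lower()
        if in_scope(address.split("@", 1)[1]):
            emails.add(address)

    hosts = {h.lower() for h in HOST_RE.findall(text) if in_scope(h.lower())}
    return {"emails": sorted(emails), "hosts": sorted(hosts)}
```

Calling `extract_indicators(page_text, "example.com")` returns a dictionary with `emails` and `hosts` lists; names outside the given domain are dropped so that notes stay within scope.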

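Tools

  • theHarvester
  • Recon-ng
  • SpiderFoot
  • Maltego
  • FOCA (for metadata extraction)
  • GitHub Dorking Tools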

Active Reconnaissance

Active techniques involve sending packets to the target system and observing responses. This can reveal detailed technical data but may trigger logging or alerts.

Common Techniques

  • Scanning for open ports using Nmap or Masscan
  • Banner grabbing to identify services
  • OS fingerprinting using TCP/IP stack behavior
  • Attempting DNS zone transfers and brute-forcing subdomains with dnsrecon or dnsenum
  • Detecting WAFs, proxies, or CDNs
  • Enumerating services like SMB, FTP, HTTP, SNMP
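
Banner grabbing, listed above, can be illustrated with nothing more than a TCP connection: many services (SSH, FTP, SMTP) announce themselves as soon as a client connects. This is a minimal sketch using Python's standard `socket` module, intended for hosts you are authorized to test. The function name `grab_banner` and its default timeout are assumptions made for the example. Services that wait for the client to speak first, such as HTTP, return an empty string here.

```python
import socket


def grab_banner(host: str, port: int, timeout: float = 3.0, max_bytes: int = 1024) -> str:
    """Connect to host:port and return the first data the service sends.

    Returns an empty string if the service stays silent until the timeout.
    Connection errors (refused, unreachable) propagate as OSError.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            data = sock.recv(max_bytes)
        except socket.timeout:
            return ""
    # Banners are usually ASCII; replace anything undecodable rather than fail.
    return data.decode("utf-8", errors="replace").strip()
```

Unlike the passive techniques, this sends traffic directly to the target, so the connection is likely to appear in its logs.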

Tools

  • Nmap
  • Masscan
  • Amass
  • dnsenum
  • whatweb
  • Netcat
  • Nikto
  • Wappalyzer

Hybrid / Semi-Passive Techniques

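Some techniques blur the line between passive and active: they typically query third-party services that hold data about the target, so the target itself sees little or no traffic.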

  • Certificate Transparency Log monitoring (e.g. crt.sh)
  • Passive DNS databases
  • Third-party subdomain enumeration (without DNS queries)
  • Crawling public GitHub issues for leaked credentials
  • Using APIs to gather external data (e.g. SecurityTrails, Shodan API)
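
As an illustration of the Certificate Transparency item above, the sketch below turns a saved CT search result into a list of subdomains. It assumes the input is a JSON array of objects with a `name_value` field holding one or more newline-separated names, which is how crt.sh's JSON output is commonly shaped; that shape, and the function name `subdomains_from_ct`, are assumptions to verify against real output. Fetching the data is left out on purpose, so the code parses a string you already have.

```python
import json


def subdomains_from_ct(raw_json: str, domain: str) -> list[str]:
    """Extract unique in-scope hostnames from CT search results.

    Assumes `raw_json` is a JSON array of objects whose `name_value`
    field contains newline-separated certificate names.
    """
    domain = domain.lower().rstrip(".")
    suffix = "." + domain
    found = set()

    for entry in json.loads(raw_json):
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            # A wildcard certificate still reveals the parent name.
            if name.startswith("*."):
                name = name[2:]
            if name == domain or name.endswith(suffix):
                found.add(name)

    return sorted(found)
```

Because the lookup goes to the log operator rather than to the target, no DNS query or connection reaches the target's own infrastructure.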

Structuring Your Recon

A common workflow combines both passive and active methods:

1. Start passive: collect domains, emails, tech stack, leaked info
2. Enumerate targets: subdomains, IPs, related infrastructure
3. Engage actively: scan ports, fingerprint services, probe for weaknesses
4. Document everything: maintain structured notes and timestamps
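
The "document everything" step can be as simple as one timestamped record per finding, tagged with the workflow phase that produced it. The sketch below is one possible layout, not a prescribed format: the `Finding` class, its field names, and the phase labels are all hypothetical and chosen only to mirror the workflow above.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

# Phase labels mirror the first three workflow steps; the names are illustrative.
PHASES = ("passive", "enumerate", "active")


@dataclass
class Finding:
    phase: str    # which workflow step produced the finding
    target: str   # host, domain, or address the finding refers to
    detail: str   # what was observed
    source: str   # tool or data source that produced it
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def __post_init__(self) -> None:
        if self.phase not in PHASES:
            raise ValueError(f"unknown phase: {self.phase!r}")


def to_jsonl(findings: list[Finding]) -> str:
    """Serialize findings as JSON Lines, ordered by phase and then by time."""
    ordered = sorted(findings, key=lambda f: (PHASES.index(f.phase), f.timestamp))
    return "\n".join(json.dumps(asdict(f), sort_keys=True) for f in ordered)
```

Keeping the source and a UTC timestamp on every record makes it easier to correlate findings later and to tell fresh active results from older passive data.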