Every engagement I run starts the same way. Before I touch a client's network — before I run a single scan, before I send a single packet — I spend time doing exactly what a real attacker would do first: I look. I use publicly available tools and databases to build a picture of the target that is, in almost every case, far more detailed than the business owner ever imagined possible.
This phase is called passive reconnaissance, or OSINT — Open Source Intelligence. "Passive" means it leaves no fingerprint. No firewall log will show it. No IDS will alert on it. You can be fully, comprehensively profiled from the internet without your systems ever knowing it happened. The attacker then uses everything they've found to craft a targeted, personalised attack — a spear phishing email that references real people, real systems, real job titles — that is dramatically more convincing than a generic scam.
This article walks through every major OSINT source a real attacker uses. Each section shows you what the attacker learns and, more importantly, ends with a concrete defensive action you can take to shrink your exposure. The final section synthesises everything into a full spear phishing build, so you can see how these fragments combine into something genuinely dangerous. Then we cover how to fight back.
Category 01 — Your Internet-Facing Infrastructure
The first thing an attacker does is answer one question: what is this business actually running, and where is it exposed? They don't need to touch your systems to answer this. Several search engines, Shodan among them, have already indexed most of what is visible on the internet; the attacker just needs to query them.
IP:        196.22.xxx.xxx
Hostnames: mail.greyhatdemo.co.za, vpn.greyhatdemo.co.za
Country:   South Africa
Org:       Internet Solutions (Pty) Ltd

Open Ports:
  25/tcp    SMTP   Microsoft Exchange smtpd
  443/tcp   HTTPS  Fortinet FortiGate SSL-VPN 7.4.1   ← CVE-2024-21762 UNPATCHED
  3389/tcp  RDP    Windows Server 2019                ← EXPOSED TO INTERNET
  8443/tcp  HTTPS  Synology DiskStation DSM 7.1

Vulnerabilities:
  CVE-2024-21762  FortiOS RCE               CVSS 9.8   public PoC available
  CVE-2024-3400   PAN-OS command injection  CVSS 10.0
Every piece of data in that Shodan output was gathered passively by Shodan's crawlers — not by the attacker. The attacker simply queried a search engine. Your firewall never logged it. Your SIEM never saw it. You had no idea it happened. And now the attacker knows your unpatched FortiGate has a public PoC exploit with a CVSS score of 9.8.
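You can run the same triage against your own records before an attacker does. The sketch below is a minimal illustration, not a Shodan client: the record layout (a "services" list with "port", "product" and "vulns" keys), the RISKY_PORTS table and the triage_host function are all assumptions made for this example, so adapt them to whatever your data source actually returns.

```python
# Hypothetical triage helper: flags risky services in a Shodan-style host record.
# The record layout below is an assumption for this sketch, not a real API schema.

RISKY_PORTS = {
    23: "Telnet exposed to the internet",
    445: "SMB exposed to the internet",
    3389: "RDP exposed to the internet",
}


def triage_host(record):
    """Return a sorted list of human-readable findings for one host record."""
    findings = []
    for service in record.get("services", []):
        port = service.get("port")
        product = service.get("product", "unknown product")
        # Flag services that should almost never face the internet directly.
        if port in RISKY_PORTS:
            findings.append(f"{port}/tcp: {RISKY_PORTS[port]}")
        # Flag any CVE the data source has already attached to the service.
        for cve in service.get("vulns", []):
            findings.append(f"{port}/tcp: {cve} reported for {product}")
    return sorted(findings)


# Illustrative record mirroring the listing above (not real scan data).
sample = {
    "ip": "196.22.xxx.xxx",
    "services": [
        {"port": 25, "product": "Microsoft Exchange smtpd"},
        {
            "port": 443,
            "product": "Fortinet FortiGate SSL-VPN 7.4.1",
            "vulns": ["CVE-2024-21762"],
        },
        {"port": 3389, "product": "Windows Server 2019"},
        {"port": 8443, "product": "Synology DiskStation DSM 7.1"},
    ],
}

for finding in triage_host(sample):
    print(finding)
```

Anything this kind of check reports is something an attacker can also see, so each finding is a candidate for patching or for moving behind a VPN.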
Category 02 — DNS, Subdomains & Certificate Transparency
Your DNS records are public by design — they have to be for email delivery and website routing to work. But most businesses don't realise how much infrastructure topology is inadvertently disclosed through DNS, or that every SSL certificate ever issued for their domain is logged in a publicly queryable database.
In South Africa, fewer than 30% of .co.za domains have a properly enforced DMARC policy (p=reject or p=quarantine). The rest are either missing DMARC entirely or have it set to p=none, which means the policy is monitoring-only and does nothing to prevent spoofing. An attacker who looks up your DMARC record (a TXT record published at _dmarc.yourdomain.co.za) and finds no enforcement can send email with a forged sender address on your own domain that passes most mail client authenticity checks.
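Checking your own posture takes one DNS lookup (for example `dig +short TXT _dmarc.yourdomain.co.za`) and a look at the `p=` tag. The following is a minimal sketch of that classification; the function names and the status labels are choices made for this example, and it deliberately ignores finer points such as the `pct` and `sp` tags.

```python
def parse_dmarc(txt_record):
    """Split a DMARC TXT record into a dict of lowercase tag -> value."""
    tags = {}
    for part in txt_record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def dmarc_status(txt_record):
    """Classify a domain's DMARC posture from its _dmarc TXT record.

    Pass None or an empty string when the lookup returned no record.
    """
    if not txt_record:
        return "missing"
    tags = parse_dmarc(txt_record)
    if tags.get("v", "").upper() != "DMARC1":
        return "invalid"
    policy = tags.get("p", "").lower()
    if policy in ("reject", "quarantine"):
        return "enforced"
    if policy == "none":
        return "monitoring-only"
    return "invalid"


# Illustrative records only; the reporting address is a placeholder.
print(dmarc_status("v=DMARC1; p=none; rua=mailto:reports@example.com"))
print(dmarc_status("v=DMARC1; p=reject"))
print(dmarc_status(None))
```

A result of "missing" or "monitoring-only" is the condition described above: nothing instructs receiving servers to reject or quarantine mail that spoofs your domain.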
Category 03 — People, Email Addresses & Org Structure
Infrastructure enumeration tells the attacker what systems you're running. Personnel enumeration tells them who to target and how to reach them. These are the building blocks of a spear phishing campaign — the attacker needs real names, real job titles, and real email addresses to make their lure convincing.
[ Emails Found ]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected] ← IT contact = high value target
[ Hosts Found ]
mail.demo-practice.co.za 196.22.x.x
vpn.demo-practice.co.za 196.22.x.x
backup.demo-practice.co.za 196.22.x.x ← forgotten, unmonitored
[ LinkedIn Personnel ]
Dr David S. Principal · 8 years · posted last week
Sarah M. Practice Manager · 3 years
Kyle T. IT Administrator · 4 months ← NEW — higher social engineering risk
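The "forgotten, unmonitored" host in that output suggests a simple defensive check: compare what an outsider can discover against what you believe you run. A minimal sketch follows, assuming you keep some form of asset inventory; the function names and both hostname lists are illustrative.

```python
def normalise(hostname):
    """Lowercase a hostname and drop surrounding whitespace and a trailing dot."""
    return hostname.strip().lower().rstrip(".")


def find_unknown_hosts(discovered, inventory):
    """Return discovered hostnames that are absent from the asset inventory."""
    known = {normalise(host) for host in inventory}
    return sorted({normalise(host) for host in discovered} - known)


# Hostnames an outsider found (taken from the output above).
discovered = [
    "mail.demo-practice.co.za",
    "vpn.demo-practice.co.za",
    "backup.demo-practice.co.za",
]

# Hypothetical inventory: what the business believes it is running.
inventory = ["mail.demo-practice.co.za", "VPN.demo-practice.co.za."]

for host in find_unknown_hosts(discovered, inventory):
    print("Not in inventory:", host)
```

Any hostname this reports is either a gap in the inventory or a system nobody is patching, and both are worth resolving.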
Category 04 — Breached Credentials & the Dark Web
Data breaches happen constantly. Millions of credential pairs — email addresses and their corresponding passwords — are stolen from services your staff use every day and end up on dark web marketplaces and freely-shared breach databases. An attacker doesn't need to hack you directly if your staff member used the same password on LinkedIn that they use on your VPN.
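Defenders can check whether a password already appears in known breach data without disclosing it. One widely used approach is the k-anonymity range query offered by the Have I Been Pwned "Pwned Passwords" service: you send only the first five characters of the password's SHA-1 hash to `https://api.pwnedpasswords.com/range/<prefix>` and compare the returned suffixes locally. The sketch below covers the hashing and matching steps only and makes no network call; the choice of this service is an assumption of the example, and the function names and the sample response body (including its counts) are made up for illustration.

```python
import hashlib


def split_sha1(password):
    """Return (prefix, suffix) of the uppercase SHA-1 hex digest: 5 + 35 chars."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(password, range_response):
    """Look up the password's hash suffix in a range response body.

    range_response is the text returned for the password's 5-character
    prefix, one "SUFFIX:COUNT" entry per line. Returns 0 if not listed.
    """
    _, suffix = split_sha1(password)
    for line in range_response.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


prefix, suffix = split_sha1("password")
print("Prefix sent to the service:", prefix)

# Fabricated response body for demonstration; real counts will differ.
fake_body = "0018A45C4D1DEF81644B54AB7F969B88D65:3\r\n" + suffix + ":42\r\n"
print("Times seen in breaches (fake data):", breach_count("password", fake_body))
```

Only the five-character prefix ever leaves your machine, so the service cannot tell which password you were checking.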
Category 05 — GitHub Leaks, Metadata & Job Posts
Three sources that most businesses have never considered as OSINT risks — but which consistently deliver some of the highest-value intelligence an attacker can find.
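For the repository side of this category, a rough self-check is to scan your own files for strings that look like credentials before you publish them. The sketch below is illustrative only: the three patterns are a small, assumed sample (dedicated secret scanners ship far larger rule sets), the function name is hypothetical, and the access key shown is the placeholder value AWS uses in its own documentation.

```python
import re

# A small, assumed set of secret-like patterns; real scanners use many more.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private key block": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "hard-coded password": re.compile(
        r"(?:password|passwd|pwd)\s*[:=]\s*['\"]?[^\s'\"]{4,}",
        re.IGNORECASE,
    ),
}


def scan_text(text):
    """Return (line_number, label) pairs for lines matching a secret-like pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label))
    return hits


# Made-up file contents; none of these values are real credentials.
sample = "\n".join(
    [
        "DB_HOST=db.internal",
        "DB_PASSWORD=example-not-real",
        "aws_key = AKIAIOSFODNN7EXAMPLE",
    ]
)

for line_number, label in scan_text(sample):
    print(f"line {line_number}: possible {label}")
```

Pattern matching like this produces false positives and misses, so treat a hit as a prompt to review the file and to rotate anything that was ever committed.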
Category 06 — Google Dorks: The Free Vulnerability Scanner
Google's search operators can be chained together to create highly targeted queries — known as "Google Dorks" — that surface sensitive files, exposed admin panels, login pages, and confidential documents that have been inadvertently indexed. This requires nothing beyond a web browser and takes seconds per query.
# Exposed admin panels, login pages and portals
site:yourdomain.co.za inurl:admin OR inurl:login OR inurl:portal
# Sensitive file types indexed by Google
site:yourdomain.co.za filetype:pdf OR filetype:xlsx OR filetype:docx
# Configuration and environment files
site:yourdomain.co.za filetype:env OR filetype:cfg OR filetype:ini
# Error pages disclosing stack traces / software versions
site:yourdomain.co.za intext:"sql syntax" OR intext:"stack trace"
# Exposed backup files
site:yourdomain.co.za filetype:bak OR filetype:sql OR filetype:zip
# Staff email enumeration via indexed documents
site:yourdomain.co.za intext:"@yourdomain.co.za" filetype:pdf
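To audit your own domain with the queries above, it helps to generate them rather than retype them. The following is a small sketch that fills the same query templates with a domain and builds a search URL for each; the dictionary labels, the function name and the example domain are assumptions made for illustration.

```python
from urllib.parse import quote_plus

# The query templates listed above, keyed by a short descriptive label.
DORK_TEMPLATES = {
    "admin and login pages": "site:{domain} inurl:admin OR inurl:login OR inurl:portal",
    "indexed documents": "site:{domain} filetype:pdf OR filetype:xlsx OR filetype:docx",
    "config and environment files": "site:{domain} filetype:env OR filetype:cfg OR filetype:ini",
    "error pages": 'site:{domain} intext:"sql syntax" OR intext:"stack trace"',
    "backup files": "site:{domain} filetype:bak OR filetype:sql OR filetype:zip",
    "staff emails in documents": 'site:{domain} intext:"@{domain}" filetype:pdf',
}


def build_dorks(domain):
    """Return {label: (query, search_url)} with each template filled in."""
    results = {}
    for label, template in DORK_TEMPLATES.items():
        query = template.format(domain=domain)
        # quote_plus percent-encodes the query for use in a URL parameter.
        url = "https://www.google.com/search?q=" + quote_plus(query)
        results[label] = (query, url)
    return results


for label, (query, url) in build_dorks("example.co.za").items():
    print(f"{label}:\n  {query}\n  {url}")
```

Open the generated URLs in a browser against a domain you own and review each result by hand; anything that appears should be removed from the site or excluded from indexing.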
The GHDB — Google Hacking Database, maintained at exploit-db.com — catalogues thousands of proven Google Dork queries organised by vulnerability category. An attacker searches the GHDB for queries relevant to your software stack and runs them against your domain in minutes. The results have been weaponised in real attacks and documented repeatedly in incident response reports.
Putting It Together: How a Spear Phishing Attack Is Built
Here is a realistic composite of how everything above combines. This is drawn from documented attack patterns and our own red team engagements. The target is a fictional three-dentist practice — but the techniques apply to any small South African business.
That dossier was built entirely with free tools in under two hours. No network access. No hacking. The attack hasn't started yet. The attacker now sends a single email to a new staff member, appearing to come from the owner's own address, referencing the practice management software by name, with a link to a fake login page. The email is convincing because every detail in it is real.
Before we touch anything in a client's network, we run every source in this article against their domain. The dossier we build is usually more complete than what we've shown here. When we then send a simulated spear phishing email to their staff — crafted with all of this intelligence — the click rate on these targeted, contextual emails is typically 3–5× higher than generic phishing simulations. The intelligence gap is real, and it is routinely exploited.
How to Fight Back: Reducing Your OSINT Attack Surface
You cannot make yourself completely invisible. Some of what OSINT reveals (your MX records, your SSL certificates, your public-facing services) is necessary to run a business. But you can significantly reduce the signal quality an attacker gets, close the most dangerous exposures, and make targeted attacks dramatically harder to construct.
The Uncomfortable Truth About Open Source Intelligence
Everything in this article — every tool, every technique, every piece of intelligence described — is legal, publicly available, and being used right now by attackers scanning for their next target. No laws were broken to compile that dossier. No systems were accessed without permission. The data was simply found, because your business put it there — in DNS records, job posts, LinkedIn profiles, and certificate logs — as part of normal operations.
The appropriate response is not to become invisible — that's not achievable. It's to understand what you're disclosing and make deliberate choices about it. Close the exposures that provide direct attack paths (RDP, DMARC, unpatched CVEs). Reduce the signal quality of everything else (metadata, job post details, subdomain hygiene). And make sure your staff — especially new staff — understand that a convincing email doesn't mean a legitimate one.
If you want to know exactly what an attacker finds when they look at your business, Greyhat4Hire runs a full OSINT reconnaissance engagement as part of every penetration test — and we can run a standalone OSINT report if you want the picture without the full test. We show you the dossier. Then we help you close it.