What Is OSINT? A Complete Beginner's Guide

Open-source intelligence is the collection, verification, and analysis of information from publicly available sources to answer a specific question. The sources are ordinary — court filings, corporate registries, archived web pages, satellite imagery, social media posts, FOIA responses, domain registrations. The discipline is not.

What separates OSINT from a search engine session is method. An investigator defines a question, plans collection against it, documents every source, cross-references findings, and produces a report another analyst could reproduce. The raw material is public; the process is structured.

What OSINT Is Not

OSINT is not hacking. It is not social engineering. It is not buying leaked databases on criminal forums. The moment a technique requires unauthorized access, deception, or a purchase from a breach broker, it has crossed the line from OSINT into something else — something that usually carries criminal liability.

OSINT is also not a single tool. People sometimes equate OSINT with the OSINT Framework website (osintframework.com) or with Maltego. Those are resources. The practice is the investigator.

The Four Phases

Every competent investigation moves through four phases, regardless of domain:

Planning — define the question, scope, priorities, and legal constraints. See /methodology/planning/.
Collection — gather data against the plan from public sources. See /methodology/collection/.
Analysis — test claims, connect entities, resolve contradictions. See /methodology/analysis/.
Reporting — present findings with sources, confidence levels, and limitations. See /methodology/reporting/.

Skipping planning is the most common beginner mistake. Without a question, collection becomes hoarding, and analysis becomes pattern-matching against noise.
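
One way to make the planning phase concrete is to write the plan down as data before collecting anything. The sketch below is illustrative only: `InvestigationPlan`, its field names, and `ready_for_collection` are hypothetical and are not part of any methodology described on this site.

```python
from dataclasses import dataclass, field


@dataclass
class InvestigationPlan:
    """Hypothetical planning record; the field names are illustrative only."""
    question: str                                               # the question the work must answer
    scope: list[str] = field(default_factory=list)              # what is in bounds
    priorities: list[str] = field(default_factory=list)         # what to collect first
    legal_constraints: list[str] = field(default_factory=list)  # rules that limit collection

    def ready_for_collection(self) -> bool:
        # Assumed rule of thumb: no collection without a question and a scope.
        return bool(self.question.strip()) and bool(self.scope)


plan = InvestigationPlan(
    question="Is Meridian Holdings LLC linked to the named individual?",
    scope=["corporate registries", "court records"],
)
print(plan.ready_for_collection())
```

A plan with an empty question or an empty scope fails the check, which is the point: the question comes before the queries.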

Who Uses OSINT

Investigative journalists verifying claims, tracing corporate ownership, geolocating video evidence. See /domains/journalism/.
Academic researchers conducting digital ethnography or document analysis. See /domains/academic-research/.
Compliance officers running enhanced due diligence for KYC and sanctions screening.
Human rights researchers documenting abuses using social media and satellite imagery.
Corporate investigators tracing beneficial ownership through registries. See /domains/corporate/.

The toolkit overlaps across all of them. The ethics, legal exposure, and reporting conventions do not.

A Concrete Example

Suppose a source claims that a shell company called "Meridian Holdings LLC" is linked to a named individual. An OSINT workflow might look like this:

Search the Delaware Division of Corporations and OpenCorporates for "Meridian Holdings."
Pull the registered agent and formation date.
Run the registered agent through SEC EDGAR full-text search.
WHOIS the company's apparent domain:

whois meridianholdings.example

Check the Wayback Machine for historical snapshots of that domain:

https://web.archive.org/web/*/meridianholdings.example*

Cross-reference officer names in state filings against PACER for civil litigation.
Document every URL, timestamp, and retrieval method before writing anything.
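
The documentation step at the end of the list can be sketched as a small source log. This is a minimal sketch under stated assumptions, not a standard format: the function names, the record fields, and the JSON-lines layout are all hypothetical. The SHA-256 hash is included so a saved copy can later be shown to be unchanged.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def make_source_record(url: str, method: str, content: bytes,
                       retrieved_at: Optional[datetime] = None) -> dict:
    """Build one log entry for a retrieved public record (hypothetical schema)."""
    # Pass a timezone-aware datetime; the default is the current time in UTC.
    when = retrieved_at or datetime.now(timezone.utc)
    return {
        "url": url,
        "retrieved_at": when.astimezone(timezone.utc).isoformat(),
        "method": method,                                  # e.g. "browser", "whois CLI"
        "sha256": hashlib.sha256(content).hexdigest(),     # fingerprint of the saved copy
        "size_bytes": len(content),
    }


def append_to_log(path: str, record: dict) -> None:
    # One JSON object per line keeps the log append-only and easy to diff.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
```

Calling `make_source_record` once per retrieved page or filing, then `append_to_log`, gives a record another analyst could use to retrace each step.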

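Each step is a public record request or a query against a public index. None of it requires a subpoena, a confidential source, or anything you could not teach to a motivated undergraduate, though some indexes, such as PACER, require a registered account and charge per-page fees.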
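
As one illustration of a query against a public index, the Wayback Machine check can be scripted against the Internet Archive's availability API. The sketch below only builds the request URL and parses a response body, leaving the HTTP call to whatever client you prefer. The function names are hypothetical, and the response fields (`archived_snapshots`, `closest`, `available`, `url`, `timestamp`) are assumed from the API's public documentation, which may change.

```python
import json
from datetime import datetime, timezone
from typing import Optional, Tuple
from urllib.parse import urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(target: str, timestamp: str = "") -> str:
    """Build a query URL for the Wayback Machine availability API."""
    params = {"url": target}
    if timestamp:
        params["timestamp"] = timestamp  # e.g. "20200101"; the API returns the closest capture
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(response_text: str) -> Optional[Tuple[str, datetime]]:
    """Return (snapshot_url, capture_time) from a response body, or None if no capture."""
    closest = json.loads(response_text).get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    # Capture timestamps are assumed to be 14-digit UTC strings (YYYYMMDDhhmmss).
    captured = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    return closest["url"], captured.replace(tzinfo=timezone.utc)


print(availability_url("meridianholdings.example", "20200101"))
```

Keeping URL construction and parsing separate from the network call makes the logic easy to check against a saved response before running it live.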

The Legal and Ethical Floor

Public does not mean consequence-free. Scraping a site in violation of its terms of service, aggregating personal data in ways that violate GDPR, or republishing information that identifies a private person can all create liability even when every source is technically open. The /ethics/ page lays out the baseline rules; the short version is that OSINT practitioners respect both the letter of the law and the reasonable privacy expectations of non-public figures.

For structured document review that holds up under legal scrutiny, practitioners often pair OSINT with the Subthesis legal document analysis tool, which applies consistent methodology to large document sets — useful when your collection phase produces hundreds of filings you need to triage.

Tools You Will Meet Early

A beginner's bench is short. You do not need paid platforms to start:

Wayback Machine for archived pages
WHOIS and DNS lookup for domain attribution
Google dorking for targeted discovery
Reverse image search for image verification
Metadata extraction for document and image forensics
Company registries for corporate structure
FOIA for government records

These seven tools carry most entry-level work. The full tool directory catalogs the rest.
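
To show what targeted discovery looks like in practice, here is a minimal sketch of a helper that composes a search query from a few widely documented operators (`site:`, `filetype:`, `intitle:`, and a quoted exact phrase). The `build_dork` function is hypothetical, and the example target reuses the fictional company from the workflow above.

```python
def build_dork(phrase: str = "", site: str = "", filetype: str = "", intitle: str = "") -> str:
    """Compose a search query from common operators; empty arguments are skipped."""
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')            # exact-phrase match
    if site:
        parts.append(f"site:{site}")           # restrict results to one domain
    if filetype:
        parts.append(f"filetype:{filetype}")   # restrict results to one file type
    if intitle:
        parts.append(f'intitle:"{intitle}"')   # phrase must appear in the page title
    return " ".join(parts)


print(build_dork(phrase="Meridian Holdings LLC", site="sec.gov", filetype="pdf"))
# prints: "Meridian Holdings LLC" site:sec.gov filetype:pdf
```

Building queries from named parts also makes them easy to log, so the exact search that produced a finding can be rerun later.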

Common Misconceptions

"OSINT is mostly social media." Social media is one source. Court records, regulatory filings, and archived web content usually produce harder evidence. See /tools/social-media/ for the narrow cases where social media is load-bearing.

"Anyone can do it in an afternoon." Anyone can run a query. Producing an investigation that survives cross-examination takes training. The methodology framework exists because investigators keep making the same mistakes, and the mistakes are expensive.

"If it's online, it's reliable." The Wayback Machine exists because online content disappears, changes, and gets back-dated. Preservation and hashing are not optional. See /tools/wayback-machine/.

Where to Go Next

If you are new, read the methodology framework end to end, then pick a domain guide that matches your actual use case — journalism, academic research, financial, or corporate. Skim the case studies to see the framework applied to real investigations.

OSINT rewards patience and punishes guessing. The rest of this site is built around that premise.