OSINT for Journalism

Source verification, social media investigation, and standards of proof for published reporting.

Who this is for

Reporters working in investigative, political, foreign, or accountability beats who need to verify claims that originate on the open web, trace ownership and responsibility behind public events, and document their process to a standard that survives pre-publication legal review. This guide assumes you can write; it is about how to collect and corroborate before you write.

If you are newer to reporting, start with the core methodology. If you are experienced and looking for the translation into newsroom expectations, start here.

Core techniques

Source verification at origin. Most viral material carries a caption that drifts from the footage. Reverse-image search every visual before repeating the claim attached to it. Pull EXIF data where available, bearing in mind that most major platforms strip metadata on upload, so its absence proves nothing either way. Check weather, shadows, license plates, signage, and language against the claimed location and time. If the earliest instance you can find is a Telegram channel with no posting history, the item is unverified regardless of how many times it has been shared.

Identity verification. When a source surfaces online, the reportable question is not whether the account exists but whether the person behind it is who they claim to be. Cross-check with public records, past employment, archived web presence, and mutual connections. A genuine identity usually has a searchable footprint that predates the story; a fabricated one rarely does.

Public records and filings. Court dockets, corporate registries, property records, campaign-finance filings, and FOIA responses are the primary documents around which stories are built. Secondary reporting points you at them; once you have retrieved the primary record, cite it rather than the secondary reporting.

Archive discipline. Archive every source at the moment of collection. By the time a story has sat in editorial review for two weeks, roughly a tenth of its live citations may have moved or disappeared. The Wayback Machine tutorial covers submission and retrieval.
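As an illustration of what a capture log entry might hold, the sketch below records the URL, a UTC timestamp, and a SHA-256 hash of the bytes you saved locally. It is a minimal example under stated assumptions, not a prescribed format: the `CaptureRecord` fields and the `make_capture_record` helper are hypothetical names, the example URL and bytes are placeholders, and the lookup link uses the Wayback Machine's public availability endpoint as commonly documented. The code makes no network request; it only builds the record.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from urllib.parse import quote


@dataclass(frozen=True)
class CaptureRecord:
    """One row in a capture log (hypothetical structure)."""
    url: str
    captured_at: str     # UTC timestamp, ISO 8601
    sha256: str          # hash of the bytes saved locally
    wayback_lookup: str  # availability query to check once the page is submitted


def make_capture_record(url: str, content: bytes) -> CaptureRecord:
    """Build a log entry for content that has already been downloaded."""
    return CaptureRecord(
        url=url,
        captured_at=datetime.now(timezone.utc).isoformat(timespec="seconds"),
        sha256=hashlib.sha256(content).hexdigest(),
        wayback_lookup="https://archive.org/wayback/available?url=" + quote(url, safe=""),
    )


# Placeholder URL and bytes, for illustration only.
record = make_capture_record("https://example.org/filing.pdf", b"%PDF-1.7 example bytes")
print(json.dumps(asdict(record), indent=2))
```

Re-hashing the saved file later and comparing against the logged value shows whether the local copy has changed since collection.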

Pivoting. A name becomes a corporate registration becomes a domain registration becomes a second email becomes a second name. Pivots are the engine of investigative reporting; they are also the stage at which confirmation bias enters the work. Document each pivot and re-ask whether the new lead is still answering the original question.
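One way to make that documentation habit concrete is a small pivot log that forces the "is this still answering the question?" check at every step. The sketch below is illustrative only: the `Pivot` and `Investigation` classes, their field names, and every value in the example are hypothetical, and the `answers_question` flag is a judgement the reporter enters, not something the code can decide.

```python
from dataclasses import dataclass, field


@dataclass
class Pivot:
    from_item: str          # what you started with
    to_item: str            # what it led to
    basis: str              # the record that connects the two
    answers_question: bool  # reporter's judgement, re-asked at every step


@dataclass
class Investigation:
    question: str
    pivots: list[Pivot] = field(default_factory=list)

    def add_pivot(self, from_item: str, to_item: str, basis: str,
                  answers_question: bool) -> None:
        self.pivots.append(Pivot(from_item, to_item, basis, answers_question))

    def drifted(self) -> list[Pivot]:
        """Return pivots marked as no longer serving the original question."""
        return [p for p in self.pivots if not p.answers_question]


# Hypothetical entries, for illustration only.
inv = Investigation(question="Who controls Example Holdings LLC?")
inv.add_pivot("Example Holdings LLC", "example-holdings.test",
              basis="domain listed on registry filing", answers_question=True)
inv.add_pivot("example-holdings.test", "unrelated forum account",
              basis="same username on a forum", answers_question=False)

for p in inv.drifted():
    print(f"Review before pursuing: {p.from_item} -> {p.to_item} ({p.basis})")
```

Recording the basis for each step also gives an editor or lawyer a chain they can retrace without relying on the reporter's memory.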

Essential tools

The following are indicative; see the full tool directory for comparisons and alternatives.

Legal and ethical considerations

Newsroom-specific considerations worth recording in the investigation brief:

  • Standards of proof: most newsrooms require two independent corroborating sources for any non-trivial claim attributed to a named subject. Open-source evidence can satisfy one or both, but the independence test still applies.
  • Right of reply: the subject of a significant finding should be contacted with specific questions and a real window to respond, except where doing so creates a safety risk to sources.
  • Harm to bystanders: incidental presence in the evidence base (a victim's name in a court filing, a minor in a photo) should be redacted or anonymised unless their identity is integral to the finding.
  • Source protection: when collection involves tips from confidential sources, the capture log itself can become discoverable. Segregate confidential-source notes from open-source capture logs.
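The independence test in the first point above can be sketched as a count of distinct origins, not a count of items. This is a rough illustration under assumptions: the `origin` and `label` keys are hypothetical, the example entries are invented, and deciding what an item's true origin is remains an editorial judgement that no script makes for you.

```python
from collections import defaultdict


def independent_origins(evidence):
    """Group evidence labels by the origin each item ultimately derives from."""
    by_origin = defaultdict(list)
    for item in evidence:
        by_origin[item["origin"]].append(item["label"])
    return dict(by_origin)


def meets_two_source_rule(evidence):
    """True when at least two distinct origins support the claim."""
    return len(independent_origins(evidence)) >= 2


# Invented example: two articles that both rest on one press release,
# so they count as a single origin.
evidence = [
    {"label": "news article A", "origin": "company press release"},
    {"label": "news article B", "origin": "company press release"},
]
print(meets_two_source_rule(evidence))

evidence.append({"label": "registry extract", "origin": "state corporate registry"})
print(meets_two_source_rule(evidence))
```

The point of the grouping is that ten reposts of one document are still one source.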

Workflow example

A tip arrives suggesting that a small shell company is funnelling contributions into a local political committee. The reportable question, written at the planning phase, becomes: "Do the corporate registry and the committee's campaign-finance filings identify a natural person in common between the shell company's officers and the named committee's officers, during the relevant filing period?"

Collection starts with the state corporate registry for the shell company and the campaign-finance filing for the committee. Officers' names are extracted and normalised. Where names overlap, the candidate match is tested against LinkedIn history, property records, and prior filings to confirm the same person is meant in both. The Wayback Machine is used to retrieve historical versions of the committee's web presence, exposing an officer listing that has since been edited.
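The extract-and-normalise step can be sketched as follows. This is a minimal illustration, not a complete matching method: the `normalise_name` and `candidate_matches` helpers are hypothetical, the names are made up, and the rules (reorder "Last, First", strip accents and punctuation, lowercase) are assumptions that will not handle nicknames, initials, or transliteration variants. An overlap found this way is only a candidate match and still needs the record checks described above.

```python
import unicodedata


def normalise_name(raw: str) -> str:
    """Reduce a name to a lowercase, accent-free, punctuation-free form."""
    # Registries often list "LAST, First"; put it into "First Last" order.
    if "," in raw:
        last, first = raw.split(",", 1)
        raw = f"{first} {last}"
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", raw)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace punctuation with spaces, then collapse runs of whitespace.
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_accents)
    return " ".join(cleaned.lower().split())


def candidate_matches(registry_names, filing_names):
    """Return normalised names that appear in both lists."""
    registry = {normalise_name(n) for n in registry_names}
    filing = {normalise_name(n) for n in filing_names}
    return sorted(registry & filing)


# Made-up names, for illustration only.
registry_officers = ["SMITH, Jane A.", "Doe, John"]
committee_officers = ["Jane A. Smith", "Richard Roe"]
print(candidate_matches(registry_officers, committee_officers))
```

Keeping the raw spelling alongside the normalised form in your notes means the published story can quote each record exactly as filed.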

Analysis confirms a single natural person linking the two entities across the relevant period, with four independent public sources agreeing on the identification. The report cites the four sources, each backed by an archived capture and a hash, and flags two claims from the original tip that the evidence could not substantiate. The subject is contacted with specific questions. The published story names the person, quotes the records, and makes clear which of the original allegations are established and which are not.

This is unglamorous work. It is also how stories survive legal review.

Further reading