Google Dork

What is a Search Engine?

A search engine is a software system designed to help users find information on the internet. It indexes websites, analyzes their content, and returns relevant results in response to user queries. Popular examples include Google, Bing, and Yahoo.

How Search Engines Work

Crawling: Search engines use automated programs called crawlers (or bots/spiders) to explore the web by visiting websites and following links.
Indexing: Once a crawler visits a page, it gathers information and stores it in a massive database called the index. The index contains text, keywords, metadata, and other content.
Ranking: When a user searches, the search engine retrieves relevant information from the index, ranks it using algorithms, and presents the most relevant results.
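To make the indexing and ranking stages concrete, here is a minimal sketch of a toy inverted index. The PAGES corpus, the function names, and the word-count scoring are all assumptions made for illustration; real search engines use far more elaborate ranking signals.

```python
from collections import defaultdict

# Hypothetical mini-corpus standing in for pages a crawler has already fetched.
PAGES = {
    "https://example.com/a": "google dorking uses advanced search operators",
    "https://example.com/b": "search engines crawl index and rank pages",
    "https://example.com/c": "advanced search tips for search engines",
}


def build_index(pages):
    """Indexing: map each word to the URLs that contain it, with a count."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index


def search(index, query):
    """Ranking: score each URL by how often the query words occur in it."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties are broken by URL to keep the order stable.
    return sorted(scores, key=lambda url: (-scores[url], url))


index = build_index(PAGES)
results = search(index, "search engines")
print(results)
```

Page c ranks first here because it contains "search" twice and "engines" once, which gives it the highest toy score.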

How Crawlers Work

Crawlers systematically browse the web by following links from page to page. They analyze the content, metadata, and structure of the sites they visit and send this data back to the search engine's index. Crawlers regularly revisit websites to keep the index up to date with new content or changes.
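The link-following behavior described above can be sketched as a breadth-first traversal. This example runs against a hypothetical in-memory site (FAKE_WEB) rather than the real network, and the crawl function and LinkParser class are illustrative names, not part of any search engine's actual crawler.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "web" (URL -> HTML), so no network access is needed.
FAKE_WEB = {
    "https://example.com/": '<a href="/about">About</a> <a href="/blog">Blog</a>',
    "https://example.com/about": '<a href="/">Home</a>',
    "https://example.com/blog": '<a href="/blog/post-1">Post</a> <a href="/about">About</a>',
    "https://example.com/blog/post-1": "<p>No links here</p>",
}


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Visit pages breadth-first, following each link at most once."""
    seen = {start_url}
    queue = deque([start_url])
    order = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page missing from the fake web
            continue
        order.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return order


visited = crawl("https://example.com/", FAKE_WEB.get)
print(visited)
```

The seen set is what keeps the crawler from looping forever when pages link back to each other, as the home and about pages do here.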

Google Dorking

Google Dorking (or Google hacking) is a technique that uses Google's advanced search operators to find specific information or exposed resources that ordinary searches rarely surface. By crafting targeted queries, users can discover login pages, exposed directories, misconfigured systems, and sensitive files.

Example search query: site:example.com filetype:pdf "confidential"

This query searches for PDF files containing the word "confidential" on a specific domain.
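As a small illustration, the query above can be assembled programmatically and turned into a search URL. The build_dork and search_url helpers are hypothetical names for this sketch; it only constructs strings and does not send any request.

```python
from urllib.parse import urlencode


def build_dork(site=None, filetype=None, phrase=None):
    """Join the supplied operators into a single query string."""
    parts = []
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    if phrase:
        parts.append(f'"{phrase}"')  # quotes request an exact phrase match
    return " ".join(parts)


def search_url(query):
    """URL-encode the query as the q parameter of a Google search URL."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_dork(site="example.com", filetype="pdf", phrase="confidential")
print(query)
print(search_url(query))
```

Run such queries only against domains you own or are authorized to assess.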

Google hacking (Wikipedia)

Full List of Google Dorks

Google Dorks List and Updated Database in 2022 (Box Piper)

Misc. Techniques

Limit search results to a particular domain: Use site:[domain] to target a specific website and even enumerate subdomains.
Search website titles: With intitle:[keyword], you can filter pages by their titles for more relevant data.
Subdomain enumeration: Use site:*.web.com to uncover subdomains of a target. Tools like Sublist3r utilize this method for passive recon.
intitle vs. inurl: intitle:[keyword] matches keywords in the page title, while inurl:[keyword] matches keywords in the URL itself.
Search by file type: Looking for specific document types? Try filetype:[type], like PDFs, DOCs, or even configuration files.
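For the subdomain enumeration item, a common manual workflow is to collect the hostnames that appear in the results and then exclude them from the next search. The sketch below assumes a hypothetical list of result URLs (RESULTS) copied from a site:*.web.com search; subdomains_of and exclusion_query are illustrative helper names, not functions from Sublist3r or any other tool.

```python
from urllib.parse import urlsplit


def subdomains_of(domain, result_urls):
    """Return the unique hostnames in result_urls that sit under domain."""
    domain = domain.lower()
    found = set()
    for url in result_urls:
        host = (urlsplit(url).hostname or "").lower()
        # Require a leading dot so "notweb.com" is not mistaken for a subdomain.
        if host.endswith("." + domain):
            found.add(host)
    return sorted(found)


def exclusion_query(domain, known_hosts):
    """Build a follow-up query that hides hosts that were already found."""
    parts = [f"site:*.{domain}"] + [f"-site:{host}" for host in known_hosts]
    return " ".join(parts)


# Hypothetical result URLs, as if copied from a site:*.web.com search.
RESULTS = [
    "https://mail.web.com/login",
    "https://dev.web.com/index.html",
    "https://www.web.com/",
    "https://mail.web.com/help",
    "https://notweb.com/",
    "https://web.com/about",
]

subdomains = subdomains_of("web.com", RESULTS)
print(subdomains)
print(exclusion_query("web.com", subdomains))
```

Repeating the search with the exclusion query tends to surface hosts that the first page of results hid.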

Cached or Archived Website

Use cache:[website] to view Google's cached version of a site (Google retired this operator in 2024, so it may no longer return results), or check historical snapshots on the Wayback Machine.
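Wayback Machine snapshots can also be addressed directly by URL. The sketch below builds a snapshot URL and a query URL for the Internet Archive's availability endpoint; the helper names and the example timestamp are assumptions for illustration, and no request is actually sent.

```python
from urllib.parse import urlencode


def wayback_snapshot_url(url, timestamp):
    """Link to the capture closest to timestamp (YYYYMMDD or YYYYMMDDhhmmss)."""
    return f"https://web.archive.org/web/{timestamp}/{url}"


def wayback_availability_url(url, timestamp=None):
    """Query URL for the availability endpoint, which answers in JSON."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


print(wayback_snapshot_url("http://example.com/", "20220101"))
print(wayback_availability_url("example.com", "20220101"))
```

Opening the first URL in a browser redirects to the nearest stored capture, if one exists.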

Wayback Machine

Google Hacking Database (GHDB)

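The GHDB is a categorized, searchable collection of Google dorks for finding vulnerabilities and exposed data through Google searches, hosted by OffSec as part of the Exploit Database. It's a useful reference for anyone looking to enhance their OSINT skills.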

OffSec’s Exploit Database Archive

