@modular-intelligence/osint
v1.0.2
MCP server for OSINT reconnaissance (theHarvester, GitHub, HIBP, DNS)
OSINT MCP Server
A comprehensive open-source intelligence (OSINT) reconnaissance platform that provides automated data gathering and intelligence collection capabilities. This MCP (Model Context Protocol) server enables Claude to perform domain reconnaissance, breach searches, email verification, GitHub analysis, and passive DNS lookups using multiple data sources and APIs.
Overview
This server provides access to multiple OSINT data sources and techniques through a unified interface:
- theHarvester - Domain reconnaissance via CLI tool (subdomains, emails, IPs)
- GitHub API - Organization/user/repository reconnaissance and secret discovery
- Google Dork - Query generation for advanced Google search techniques
- MX Records - Email domain validation and MX record lookup
- Social Media - Username lookup across 25+ social platforms
- Certificate Transparency - Passive DNS via crt.sh and SecurityTrails
- Have I Been Pwned - Email breach detection and data class reporting
Perfect for security research, reconnaissance, incident response, threat intelligence, and OSINT investigations.
Tools
| Tool | Method | Description |
|------|--------|-------------|
| theharvester_search | theHarvester CLI | Domain reconnaissance (subdomains, emails, IPs) from 25+ sources |
| github_recon | GitHub API | Organization/user/repository analysis with secret pattern detection |
| google_dork | Query Generation | Generate Google dork queries for reconnaissance (manual execution) |
| email_verify | DNS MX Lookup | Verify email domains via MX record resolution |
| social_lookup | URL Generation | Generate social media profile URLs across 25 platforms |
| passivedns_lookup | crt.sh + SecurityTrails | Passive DNS history and subdomain enumeration via CT logs |
| breach_search | Have I Been Pwned API | Email breach detection with data class reporting |
theHarvester Domain Reconnaissance
Perform comprehensive domain OSINT reconnaissance using theHarvester CLI tool. Discovers emails, subdomains, and IP addresses from multiple data sources including DNS, certificate authorities, search engines, and threat intelligence feeds.
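As a rough illustration of how the parameters below could map onto a theHarvester command line, here is a minimal sketch. The helper name `buildHarvesterArgs` is hypothetical and not taken from this package's source; it only assumes theHarvester's `-d` (domain), `-b` (sources), and `-l` (limit) flags and the defaults documented in this section.

```typescript
// Hypothetical helper, not part of this package's published API.
interface HarvesterInput {
  domain: string;
  sources?: string[];
  limit?: number;
}

// Defaults mirror the documented parameter defaults.
const DEFAULT_SOURCES: string[] = ["crtsh", "dnsdumpster"];

function buildHarvesterArgs(input: HarvesterInput): string[] {
  const sources =
    input.sources && input.sources.length > 0 ? input.sources : DEFAULT_SOURCES;
  // Clamp the limit into the documented 1-500 range.
  const limit = Math.min(500, Math.max(1, Math.floor(input.limit ?? 100)));
  // -d: target domain, -b: comma-separated sources, -l: result limit.
  return ["-d", input.domain, "-b", sources.join(","), "-l", String(limit)];
}

const harvesterArgs = buildHarvesterArgs({
  domain: "example.com",
  sources: ["crtsh", "dnsdumpster", "bufferoverun"],
  limit: 100,
});
console.log(["theHarvester", ...harvesterArgs].join(" "));
```

Keeping the arguments as an array, rather than a single shell string, lets them be passed to an `execFile`-style API so the domain is never interpreted by a shell.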
Input Parameters:
{
domain: string // Target domain name
sources: string[] // Data sources: anubis, baidu, bing, bufferoverun, censys, certspotter, crtsh, dnsdumpster, github-code, google, hunter, intelx, linkedin, netcraft, otx, rapiddns, securitytrail, shodan, sublist3r, threatcrowd, threatminer, twitter, urlscan, virustotal, yahoo (default: crtsh, dnsdumpster)
limit: number // Maximum results (1-500, default: 100)
}
Example Request:
{
"domain": "example.com",
"sources": ["crtsh", "dnsdumpster", "bufferoverun"],
"limit": 100
}
Example Output:
{
"domain": "example.com",
"sources": ["crtsh", "dnsdumpster", "bufferoverun"],
"emails": [
"[email protected]",
"[email protected]",
"[email protected]",
"[email protected]"
],
"hosts": [
"mail.example.com",
"www.example.com",
"api.example.com",
"staging.example.com",
"cdn.example.com",
"vpn.example.com"
],
"ips": [
"192.0.2.1",
"192.0.2.2",
"192.0.2.3"
],
"total_emails": 4,
"total_hosts": 6,
"total_ips": 3,
"raw_output": "..."
}
GitHub Organization/User/Repository Reconnaissance
Analyze GitHub targets for public information and potential exposed secrets. Searches for configuration files, credentials, API keys, and sensitive patterns across repositories.
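The secret discovery step can be pictured as a set of scoped search queries. The sketch below is an assumption about how such queries might be assembled; `buildSecretQueries` and the pattern list are hypothetical, with the two patterns borrowed from the example output in this section. It relies only on GitHub's `org:`, `user:`, and `repo:` search qualifiers.

```typescript
type TargetType = "org" | "user" | "repo";

// Illustrative patterns only; the full list the server uses is not documented here.
const SECRET_PATTERNS: string[] = ["filename:.env", "filename:credentials"];

// Hypothetical helper: scope each pattern to the target with a search qualifier.
function buildSecretQueries(target: string, targetType: TargetType): string[] {
  return SECRET_PATTERNS.map((pattern) => `${targetType}:${target} ${pattern}`);
}

// Clamp max_results into the documented 1-100 range (default 20).
function clampMaxResults(maxResults?: number): number {
  return Math.min(100, Math.max(1, Math.floor(maxResults ?? 20)));
}

console.log(buildSecretQueries("microsoft", "org"));
console.log(clampMaxResults(500));
```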
Input Parameters:
{
target: string // GitHub username, organization name, or repository (owner/repo)
target_type: string // Type: "org", "user", or "repo"
max_results: number // Maximum results to return (1-100, default: 20)
}
Example Request:
{
"target": "microsoft",
"target_type": "org",
"max_results": 20
}
Example Output:
{
"target": "microsoft",
"target_type": "org",
"info": {
"login": "microsoft",
"name": "Microsoft",
"description": "The home of Microsoft Open Source Software",
"created_at": "2010-02-25T12:53:47Z",
"updated_at": "2024-01-15T10:30:00Z",
"public_repos": 3847,
"followers": 234567,
"following": 0,
"blog": "https://opensource.microsoft.com",
"email": null,
"location": "Redmond, WA",
"html_url": "https://github.com/microsoft"
},
"potential_secrets": [
{
"pattern": "filename:.env",
"count": 12,
"samples": [
{
"name": ".env",
"path": "src/.env",
"repository": "microsoft/repo1",
"url": "https://github.com/microsoft/repo1/blob/main/src/.env"
},
{
"name": ".env.example",
"path": "config/.env.example",
"repository": "microsoft/repo2",
"url": "https://github.com/microsoft/repo2/blob/main/config/.env.example"
}
]
},
{
"pattern": "filename:credentials",
"count": 5,
"samples": [
{
"name": "credentials",
"path": "docs/credentials",
"repository": "microsoft/repo3",
"url": "https://github.com/microsoft/repo3/blob/main/docs/credentials"
}
]
}
],
"warning": "Note: GitHub API has rate limits. Unauthenticated: 60 req/hour, Authenticated: 5000 req/hour"
}
Google Dork Query Generation
Generate Google dork queries for manual reconnaissance of a target domain. Returns search query strings that can be executed manually in Google Search to find files, admin panels, login pages, configuration, databases, and error messages.
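Query generation of this kind amounts to prefixing templates with `site:<domain>`. The following is a minimal sketch under that assumption; `generateDorks` and the template table are hypothetical, with one template per category copied from the example output in this section (the server's real list is longer).

```typescript
interface DorkQuery {
  category: string;
  query: string;
  description: string;
}

type DorkCategory = "files" | "admin" | "login" | "config" | "database" | "errors";
type DorkType = DorkCategory | "all";

// One illustrative template per category.
const DORK_TEMPLATES: { category: DorkCategory; suffix: string; description: string }[] = [
  { category: "files", suffix: "filetype:pdf", description: "Find PDF documents" },
  { category: "admin", suffix: "inurl:admin", description: "Find admin panels" },
  { category: "login", suffix: "inurl:login", description: "Find login pages" },
  { category: "config", suffix: "filetype:env", description: "Find .env configuration files" },
  { category: "database", suffix: "inurl:phpmyadmin", description: "Find phpMyAdmin installations" },
  { category: "errors", suffix: 'intitle:"Error" OR intitle:"Warning"', description: "Find error and warning pages" },
];

function generateDorks(domain: string, dorkType: DorkType, customDork?: string): DorkQuery[] {
  const queries: DorkQuery[] = DORK_TEMPLATES
    .filter((t) => dorkType === "all" || t.category === dorkType)
    .map((t) => ({
      category: t.category,
      query: `site:${domain} ${t.suffix}`,
      description: t.description,
    }));
  // An optional custom pattern is appended as its own query.
  if (customDork) {
    queries.push({
      category: "custom",
      query: `site:${domain} ${customDork}`,
      description: "Custom dork query",
    });
  }
  return queries;
}

console.log(generateDorks("example.com", "config", "credentials"));
```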
Input Parameters:
{
domain: string // Target domain name
dork_type: string // Query type: "files", "admin", "login", "config", "database", "errors", or "all"
custom_dork: string // Optional custom dork pattern to append
}
Example Request:
{
"domain": "example.com",
"dork_type": "all",
"custom_dork": "password"
}
Example Output:
{
"domain": "example.com",
"dork_type": "all",
"total_queries": 38,
"queries": [
{
"category": "files",
"query": "site:example.com filetype:pdf",
"description": "Find PDF documents"
},
{
"category": "files",
"query": "site:example.com filetype:doc OR filetype:docx",
"description": "Find Word documents"
},
{
"category": "files",
"query": "site:example.com filetype:sql",
"description": "Find SQL dump files"
},
{
"category": "admin",
"query": "site:example.com inurl:admin",
"description": "Find admin panels"
},
{
"category": "admin",
"query": "site:example.com intitle:\"admin panel\"",
"description": "Find admin panels in title"
},
{
"category": "login",
"query": "site:example.com inurl:login",
"description": "Find login pages"
},
{
"category": "config",
"query": "site:example.com filetype:env",
"description": "Find .env configuration files"
},
{
"category": "database",
"query": "site:example.com inurl:phpmyadmin",
"description": "Find phpMyAdmin installations"
},
{
"category": "errors",
"query": "site:example.com intitle:\"Error\" OR intitle:\"Warning\"",
"description": "Find error and warning pages"
},
{
"category": "custom",
"query": "site:example.com password",
"description": "Custom dork query"
}
],
"note": "These are Google dork query strings. Execute them manually in Google Search. Do not automate queries as it may violate Google's Terms of Service."
}
Email Domain Verification
Verify email address domains by performing MX (Mail Exchange) record lookups. Determines if a domain can receive email and returns mail server information.
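A sketch of the lookup logic, under stated assumptions: the resolver is injected so the example stays self-contained, and in Node.js or Bun a function such as `resolveMx` from `node:dns/promises` would fit the `MxResolver` shape. The names `summarizeMx` and `verifyEmailDomain` are hypothetical, and the address used is a placeholder.

```typescript
interface MxRecord {
  exchange: string;
  priority: number;
}

// Any function that resolves MX records for a domain (assumed shape).
type MxResolver = (domain: string) => Promise<MxRecord[]>;

function domainOf(email: string): string {
  return email.slice(email.lastIndexOf("@") + 1).toLowerCase();
}

function summarizeMx(email: string, records: MxRecord[]) {
  // Lower priority values are preferred, so sort ascending.
  const sorted = records.slice().sort((a, b) => a.priority - b.priority);
  return {
    email,
    domain: domainOf(email),
    has_mx_records: sorted.length > 0,
    mx_records: sorted,
    primary_mx: sorted.length > 0 ? sorted[0].exchange : null,
    total_mx_records: sorted.length,
  };
}

async function verifyEmailDomain(email: string, resolve: MxResolver) {
  try {
    return summarizeMx(email, await resolve(domainOf(email)));
  } catch {
    // Treat resolver failures (for example, no such domain) as "no MX records".
    return summarizeMx(email, []);
  }
}

const mxSummary = summarizeMx("user@example.com", [
  { exchange: "mail2.example.com", priority: 20 },
  { exchange: "mail1.example.com", priority: 10 },
]);
console.log(mxSummary.primary_mx);
```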
Input Parameters:
{
email: string // Email address to verify
}
Example Request:
{
"email": "[email protected]"
}
Example Output:
{
"email": "[email protected]",
"domain": "example.com",
"has_mx_records": true,
"mx_records": [
{
"exchange": "mail1.example.com",
"priority": 10
},
{
"exchange": "mail2.example.com",
"priority": 20
}
],
"is_valid_domain": true,
"primary_mx": "mail1.example.com",
"total_mx_records": 2,
"note": "MX records indicate the domain can receive email, but this does not verify if the specific email address exists."
}
Social Media Username Lookup
Search for a username across 25+ social media and online platforms. Returns probable profile URLs for manual verification.
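Since the tool generates URLs rather than probing sites, the core step is a template substitution. The sketch below assumes a pattern table keyed by platform name; `buildProfileUrls` is a hypothetical helper, and only the three patterns that appear in this section's example output are included.

```typescript
// Illustrative subset of platform URL patterns.
const PROFILE_PATTERNS: { platform: string; pattern: string }[] = [
  { platform: "GitHub", pattern: "https://github.com/{username}" },
  { platform: "Twitter/X", pattern: "https://twitter.com/{username}" },
  { platform: "LinkedIn", pattern: "https://www.linkedin.com/in/{username}" },
];

function buildProfileUrls(username: string, platforms?: string[]) {
  // With no platform filter, every known pattern is used.
  const wanted =
    platforms && platforms.length > 0 ? platforms.map((p) => p.toLowerCase()) : null;
  const profiles = PROFILE_PATTERNS
    .filter((p) => wanted === null || wanted.indexOf(p.platform.toLowerCase()) !== -1)
    .map((p) => ({
      platform: p.platform,
      url: p.pattern.replace("{username}", encodeURIComponent(username)),
      profile_url_pattern: p.pattern,
    }));
  return { username, total_platforms: profiles.length, profiles };
}

console.log(buildProfileUrls("john_doe", ["GitHub", "LinkedIn"]));
```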
Input Parameters:
{
username: string // Username to search
platforms: string[] // Optional list of specific platforms to check
}
Example Request:
{
"username": "john_doe",
"platforms": ["GitHub", "Twitter/X", "LinkedIn"]
}
Example Output:
{
"username": "john_doe",
"total_platforms": 3,
"profiles": [
{
"platform": "GitHub",
"url": "https://github.com/john_doe",
"profile_url_pattern": "https://github.com/{username}"
},
{
"platform": "Twitter/X",
"url": "https://twitter.com/john_doe",
"profile_url_pattern": "https://twitter.com/{username}"
},
{
"platform": "LinkedIn",
"url": "https://www.linkedin.com/in/john_doe",
"profile_url_pattern": "https://www.linkedin.com/in/{username}"
}
],
"note": "These are probable profile URLs. They may not all exist. Manual verification or automated checking (respecting rate limits) is required to confirm existence."
}
Passive DNS History Lookup
Retrieve passive DNS history and subdomain enumeration via Certificate Transparency logs or SecurityTrails API. Discovers subdomains without active scanning.
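To show how subdomains can be derived from Certificate Transparency data without touching the target, here is a minimal sketch. It assumes crt.sh's JSON output, where each entry carries a `name_value` field holding one or more newline-separated names; `extractSubdomains` is a hypothetical helper, not this server's actual parser.

```typescript
// Only the field this sketch needs from a crt.sh JSON entry.
interface CrtShEntry {
  name_value: string;
}

function extractSubdomains(domain: string, entries: CrtShEntry[]): string[] {
  const suffix = "." + domain.toLowerCase();
  const seen = new Set<string>();
  for (const entry of entries) {
    for (const raw of entry.name_value.split("\n")) {
      // Normalize case and drop a leading wildcard label.
      const name = raw.trim().toLowerCase().replace(/^\*\./, "");
      // Keep only true subdomains of the target.
      if (name.endsWith(suffix)) {
        seen.add(name);
      }
    }
  }
  return Array.from(seen).sort();
}

console.log(
  extractSubdomains("example.com", [
    { name_value: "www.example.com\n*.example.com" },
    { name_value: "API.example.com" },
    { name_value: "www.example.com" },
  ])
);
```

Deduplicating matters here because the same hostname usually appears on many certificates, which is why a certificate count can be much larger than the subdomain count.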
Input Parameters:
{
domain: string // Target domain name
ip: string // Optional IP address to lookup
}
Example Request:
{
"domain": "example.com"
}
Example Output:
{
"domain": "example.com",
"source": "crt.sh",
"subdomains": [
"api.example.com",
"app.example.com",
"cdn.example.com",
"dev.example.com",
"mail.example.com",
"staging.example.com",
"www.example.com"
],
"total_subdomains": 7,
"total_certificates": 45,
"note": "Data from crt.sh Certificate Transparency logs. This is free but may be less comprehensive than SecurityTrails."
}
Email Breach Detection
Search Have I Been Pwned for email addresses in known data breaches. Returns breach information including data classes exposed and verification status.
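The summary fields in the response (`total_breaches`, `total_exposed_accounts`, `unique_data_classes`) can be derived from the breach list itself. The sketch below shows one way to do that; `summarizeBreaches` is a hypothetical helper and the `Breach` interface covers only the fields it needs.

```typescript
interface Breach {
  name: string;
  pwn_count: number;
  data_classes: string[];
}

function summarizeBreaches(email: string, breaches: Breach[]) {
  // A Set keeps first-seen order while removing duplicate data classes.
  const classes = new Set<string>();
  let exposed = 0;
  for (const breach of breaches) {
    exposed += breach.pwn_count;
    for (const dataClass of breach.data_classes) {
      classes.add(dataClass);
    }
  }
  return {
    email,
    breached: breaches.length > 0,
    total_breaches: breaches.length,
    total_exposed_accounts: exposed,
    unique_data_classes: Array.from(classes),
    breaches,
  };
}

const breachSummary = summarizeBreaches("user@example.com", [
  { name: "ExampleBreak", pwn_count: 500000, data_classes: ["Email addresses", "Passwords", "Physical addresses"] },
  { name: "SampleLeaks", pwn_count: 750000, data_classes: ["Email addresses", "Passwords", "Phone numbers"] },
]);
console.log(breachSummary.total_breaches, breachSummary.total_exposed_accounts);
```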
Input Parameters:
{
email: string // Email address to search
}
Example Request:
{
"email": "[email protected]"
}
Example Output (Breached):
{
"email": "[email protected]",
"breached": true,
"total_breaches": 2,
"total_exposed_accounts": 1250000,
"unique_data_classes": [
"Email addresses",
"Passwords",
"Physical addresses",
"Phone numbers"
],
"breaches": [
{
"name": "ExampleBreak",
"title": "Example Data Breach",
"domain": "example.com",
"breach_date": "2023-06-15",
"added_date": "2023-07-01T00:00:00Z",
"pwn_count": 500000,
"description": "A large breach affecting customer data",
"data_classes": ["Email addresses", "Passwords", "Physical addresses"],
"is_verified": true,
"is_sensitive": false,
"is_fabricated": false,
"is_retired": false
},
{
"name": "SampleLeaks",
"title": "Sample Database Leaks",
"domain": "sample.com",
"breach_date": "2023-05-20",
"added_date": "2023-05-25T00:00:00Z",
"pwn_count": 750000,
"description": "Credential database exposure",
"data_classes": ["Email addresses", "Passwords", "Phone numbers"],
"is_verified": true,
"is_sensitive": true,
"is_fabricated": false,
"is_retired": false
}
],
"message": "Warning! This email address has been found in 2 data breach(es).",
"recommendation": "Consider changing passwords for affected services and enabling two-factor authentication."
}
Example Output (Not Breached):
{
"email": "[email protected]",
"breached": false,
"total_breaches": 0,
"breaches": [],
"message": "Good news! This email address has not been found in any known data breaches."
}
Configuration
Environment Variables
The server requires an API key for Have I Been Pwned. A SecurityTrails key is optional: it enhances passive DNS results, and without it lookups fall back to crt.sh.
# Required
export HIBP_API_KEY="your-have-i-been-pwned-api-key"
# Optional (enhances passive DNS with more comprehensive data)
export SECURITYTRAILS_API_KEY="your-securitytrails-api-key"
Getting API Keys
Have I Been Pwned (HIBP)
- Register at https://haveibeenpwned.com/API/v3
- Request API access and receive key via email
- Free tier provides breach search access
- Rate limit: 1 request per 1500ms
- Documentation: https://haveibeenpwned.com/API/v3
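Respecting the 1500 ms spacing stated above comes down to tracking when the last request was sent. The following is a minimal sketch for sequential callers; `msUntilNextRequest` and `Throttle` are hypothetical names, and how this server actually paces its requests is not documented here.

```typescript
// Minimum spacing between HIBP requests, per the rate limit listed above.
const HIBP_MIN_INTERVAL_MS = 1500;

// Pure helper: how long to wait before the next request may be sent.
function msUntilNextRequest(
  lastRequestAt: number | null,
  now: number,
  minIntervalMs: number = HIBP_MIN_INTERVAL_MS
): number {
  if (lastRequestAt === null) {
    return 0;
  }
  return Math.max(0, lastRequestAt + minIntervalMs - now);
}

// Simple wrapper intended for one-at-a-time use (no queueing of concurrent calls).
class Throttle {
  private lastRequestAt: number | null = null;

  async run<T>(task: () => Promise<T>): Promise<T> {
    const wait = msUntilNextRequest(this.lastRequestAt, Date.now());
    if (wait > 0) {
      await new Promise<void>((resolve) => setTimeout(resolve, wait));
    }
    this.lastRequestAt = Date.now();
    return task();
  }
}

console.log(msUntilNextRequest(1000, 1600));
```

Separating the arithmetic from the timer keeps the spacing rule easy to test without real delays.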
SecurityTrails (Optional)
- Sign up at https://securitytrails.com
- Navigate to Account -> API Key
- Free tier provides limited subdomain queries
- Rate limit: 100 requests per month (free tier)
- Documentation: https://docs.securitytrails.com
Rate Limits Summary
| Service | Tier | Rate Limit |
|---------|------|-----------|
| HIBP | Free | 1 req/1500ms |
| SecurityTrails | Free | 100 req/month |
| GitHub API | Unauthenticated | 60 req/hour |
| GitHub API | Authenticated | 5000 req/hour |
| crt.sh | Unlimited | No documented limit |
Prerequisites
Required
- Bun runtime (version 1.x or later) OR Node.js 18+
- HIBP_API_KEY environment variable (for breach search functionality)
Optional
- theHarvester - Install via pip for domain reconnaissance:
pip install theHarvester
  - Required only if using the theharvester_search tool
  - Learn more: https://github.com/laramies/theHarvester
- SecurityTrails API Key - For enhanced passive DNS capabilities (crt.sh is free fallback)
Installation
Steps
- Clone or download this repository:
git clone <repo-url>
cd osint
- Install dependencies:
bun install
- Build the project:
bun run build
- Set environment variables:
export HIBP_API_KEY="your-api-key"
export SECURITYTRAILS_API_KEY="your-api-key" # Optional
- Run the server:
bun run start
The server will start listening on stdio transport.
Usage
Running the Server
Start the server with Bun:
bun run src/index.ts
The server implements the Model Context Protocol (MCP) and communicates via stdio transport. It can be integrated with Claude or other MCP clients.
Claude Desktop Configuration
Add the server to your Claude Desktop configuration. On macOS the file is at ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"osint": {
"command": "bun",
"args": [
"run",
"/path/to/osint/src/index.ts"
],
"env": {
"HIBP_API_KEY": "your-api-key",
"SECURITYTRAILS_API_KEY": "your-api-key"
}
}
}
}
Claude Code MCP Settings
Configure the server in Claude Code's MCP settings (typically in .mcp.json or via settings UI):
{
"mcpServers": {
"osint": {
"type": "stdio",
"command": "bun",
"args": ["run", "/path/to/osint/src/index.ts"],
"env": {
"HIBP_API_KEY": "your-api-key",
"SECURITYTRAILS_API_KEY": "your-api-key"
}
}
}
}
Example Usage in Claude
Once configured, you can use the tools directly in conversations with Claude:
Request: "Perform OSINT reconnaissance on example.com. Get emails, subdomains, and check for breaches."
Claude will call:
{
"tool": "theharvester_search",
"input": {
"domain": "example.com",
"sources": ["crtsh", "dnsdumpster"],
"limit": 100
}
}
Request: "Search for potential secrets in the Microsoft GitHub organization."
Claude will call:
{
"tool": "github_recon",
"input": {
"target": "microsoft",
"target_type": "org",
"max_results": 20
}
}
Request: "Check if [email protected] has been in any data breaches."
Claude will call:
{
"tool": "breach_search",
"input": {
"email": "[email protected]"
}
}
Request: "Generate Google dork queries to find exposed configuration files on example.com"
Claude will call:
{
"tool": "google_dork",
"input": {
"domain": "example.com",
"dork_type": "config",
"custom_dork": "credentials"
}
}
Security
This server implements comprehensive input validation and security measures to prevent injection attacks and ensure responsible use:
Input Validation
Domain Validation
- Requires valid domain name format (RFC-compliant)
- Maximum length: 253 characters
- Validates character set (alphanumeric, dots, hyphens)
- Rejects invalid TLDs
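The domain rules above could be expressed roughly as follows. This is a sketch, not the server's validator: `isValidDomain` is a hypothetical name, and treating a TLD that is shorter than two characters or does not start with a letter as "invalid" is an assumption about what the last rule means.

```typescript
// Each label: 1-63 chars, alphanumeric or hyphen, no leading/trailing hyphen.
const DOMAIN_LABEL = /^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/;
// Assumed TLD rule: starts with a letter, at least two characters.
const DOMAIN_TLD = /^[a-z][a-z0-9-]+$/;

function isValidDomain(input: string): boolean {
  const domain = input.trim().toLowerCase();
  // Enforce the documented 253-character maximum.
  if (domain.length === 0 || domain.length > 253) {
    return false;
  }
  const labels = domain.split(".");
  if (labels.length < 2) {
    return false;
  }
  for (const label of labels) {
    if (!DOMAIN_LABEL.test(label)) {
      return false;
    }
  }
  return DOMAIN_TLD.test(labels[labels.length - 1]);
}

console.log(isValidDomain("example.com"));
console.log(isValidDomain("example.com; rm -rf /"));
```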
Email Validation
- Requires valid email format
- Validates domain portion independently
- Supports standard email formats
Username Validation
- Alphanumeric with dots, hyphens, underscores only
- Maximum length: 50 characters
- Prevents injection via special characters
Query Validation
- Maximum query length: 500 characters
- Blocks shell injection characters: ;, &, |, backticks, $, (), {}
- Prevents command injection via query parameters
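The username and query rules above lend themselves to short pattern checks. The sketch below is one possible reading of those rules; `isValidUsername` and `isValidQuery` are hypothetical names and the exact patterns the server uses are not documented here.

```typescript
// Alphanumeric plus dots, hyphens, and underscores; 1-50 characters.
const USERNAME_PATTERN = /^[A-Za-z0-9._-]{1,50}$/;
// The shell metacharacters listed above: ; & | ` $ ( ) { }
const SHELL_METACHARACTERS = /[;&|`$(){}]/;
const MAX_QUERY_LENGTH = 500;

function isValidUsername(username: string): boolean {
  return USERNAME_PATTERN.test(username);
}

function isValidQuery(query: string): boolean {
  return (
    query.length > 0 &&
    query.length <= MAX_QUERY_LENGTH &&
    !SHELL_METACHARACTERS.test(query)
  );
}

console.log(isValidUsername("john_doe"));
console.log(isValidQuery("$(id)"));
```

Rejecting input outright, rather than trying to escape it, keeps the check simple and avoids depending on any particular shell's quoting rules.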
API Security
- API keys are never logged or exposed
- Secure environment variable usage
- HTTPS-only API communications
- Proper error handling without credential leakage
- Rate limit awareness and handling
What Gets Blocked
The server rejects:
- Malformed domains and non-domain strings
- Invalid email formats
- Usernames with special characters
- Queries containing shell metacharacters
- Missing or invalid API keys
- Oversized inputs
Error Handling
- Invalid inputs return descriptive error messages
- API errors are caught and reported with status codes
- Missing API keys trigger helpful configuration messages
- Network timeouts are handled gracefully
- theHarvester errors provide installation guidance
License
ISC License - see LICENSE file for details
