@modular-intelligence/web-recon
v1.0.2
MCP server for web application reconnaissance
Web Recon MCP Server
A comprehensive Model Context Protocol (MCP) server for web application reconnaissance and security analysis. Perform HTTP header audits, SSL/TLS certificate inspection, technology fingerprinting, directory brute-forcing, CORS analysis, Content Security Policy evaluation, and web page capture through a unified Claude interface.
Overview
The Web Recon MCP server provides eight specialized tools for security researchers, penetration testers, and developers to gather reconnaissance data on web applications:
- HTTP Headers Audit: Security headers analysis with present/missing ratings
- SSL/TLS Inspection: Certificate details, cipher suites, and expiry tracking
- Technology Fingerprinting: Web server, framework, CMS, and library detection
- Directory Brute-Force: Path enumeration using gobuster or ffuf
- Robots.txt & Sitemap Parsing: Crawlable resource discovery
- CORS Analysis: Misconfiguration detection and vulnerability assessment
- Content Security Policy Evaluation: Header weakness identification and remediation
- Web Page Capture: Page metadata and resource analysis
Tools
Quick Reference
| Tool | Description | Input | CLI Dependency |
|------|-------------|-------|-----------------|
| http_headers | Fetch and audit security headers | URL + redirect option | None (fetch-based) |
| ssl_inspect | Inspect SSL/TLS certificates | Host + port | openssl |
| whatweb_scan | Fingerprint web technologies | URL + aggression level | whatweb |
| dir_brute | Enumerate directories and files | URL + wordlist config | gobuster or ffuf |
| robots_sitemap | Parse robots.txt and sitemap.xml | URL | None (fetch-based) |
| cors_check | Check CORS configuration | URL + test origins | None (fetch-based) |
| csp_analyze | Analyze Content Security Policy | URL | None (fetch-based) |
| screenshot_capture | Capture page metadata and resources | URL + viewport dimensions | None (fetch-based) |
http_headers
Audit HTTP security headers and gather technology hints.
Input Schema:
{
"url": "https://example.com",
"follow_redirects": true
}
Parameters:
- url (required): Target URL (HTTP/HTTPS)
- follow_redirects (optional, default: true): Follow HTTP 3xx redirects
Example Usage:
http_headers(
url: "https://example.com",
follow_redirects: true
)
Example Output:
{
"url": "https://example.com",
"status_code": 200,
"headers": {
"content-type": "text/html; charset=utf-8",
"server": "nginx/1.24.0",
"content-length": "4521",
"strict-transport-security": "max-age=31536000; includeSubDomains",
"content-security-policy": "default-src 'self'; script-src 'self' 'unsafe-inline'",
"x-content-type-options": "nosniff",
"cache-control": "public, max-age=3600"
},
"security_headers": {
"present": [
{
"name": "Strict-Transport-Security",
"value": "max-age=31536000; includeSubDomains",
"rating": "good"
},
{
"name": "Content-Security-Policy",
"value": "default-src 'self'; script-src 'self' 'unsafe-inline'",
"rating": "present"
},
{
"name": "X-Content-Type-Options",
"value": "nosniff",
"rating": "good"
}
],
"missing": [
{
"name": "X-Frame-Options",
"recommendation": "Add 'X-Frame-Options: DENY' or 'SAMEORIGIN' to prevent clickjacking"
},
{
"name": "X-XSS-Protection",
"recommendation": "Add 'X-XSS-Protection: 1; mode=block' (legacy browsers)"
},
{
"name": "Referrer-Policy",
"recommendation": "Add 'Referrer-Policy: strict-origin-when-cross-origin'"
},
{
"name": "Permissions-Policy",
"recommendation": "Add Permissions-Policy to control browser features"
}
]
},
"server": "nginx/1.24.0",
"technology_hints": [
"Server: nginx/1.24.0"
]
}
Security Headers Rated:
- Strict-Transport-Security: Ratings: "good" (max-age >= 1 year) or "weak"
- Content-Security-Policy: Ratings: "present"
- X-Content-Type-Options: Ratings: "good" (nosniff) or "weak"
- X-Frame-Options: Ratings: "good" (DENY/SAMEORIGIN) or "weak"
- X-XSS-Protection: Ratings: "present"
- Referrer-Policy: Ratings: "present"
- Permissions-Policy: Ratings: "present"
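The Strict-Transport-Security rule above can be expressed as a small helper. This is an illustrative sketch (the function name rateHsts is hypothetical), not the server's actual implementation:

```typescript
// Hypothetical sketch of the HSTS rating rule described above:
// "good" when max-age is at least one year (31536000 seconds), else "weak".
export function rateHsts(value: string): "good" | "weak" {
  const match = value.match(/max-age=(\d+)/i);
  if (!match) return "weak"; // no max-age directive at all
  const maxAge = parseInt(match[1], 10);
  return maxAge >= 31536000 ? "good" : "weak";
}
```

The other headers follow the same pattern, with "good"/"weak" or a simple "present" rating depending on whether their value is inspected.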
ssl_inspect
Inspect SSL/TLS certificate details, cipher suites, and certificate chain.
Input Schema:
{
"host": "example.com",
"port": 443
}
Parameters:
- host (required): Hostname or IP address (alphanumeric, dots, hyphens only)
- port (optional, default: 443): Port number (1-65535)
Example Usage:
ssl_inspect(
host: "example.com",
port: 443
)
Example Output:
{
"host": "example.com",
"port": 443,
"protocol": "TLSv1.3",
"cipher": "TLS_AES_256_GCM_SHA384",
"certificate": {
"subject": "CN = example.com",
"issuer": "C = US, O = DigiCert Inc, CN = DigiCert Global G5 TLS RSA SHA256 2021 CA1",
"not_before": "Jan 15 00:00:00 2024 GMT",
"not_after": "Feb 15 23:59:59 2025 GMT",
"serial_number": "0F:A2:C8:D4:E1:2B:5C:7F:9E:6D:3A:1C:4B:8F:2E:9A",
"san": [
"example.com",
"www.example.com",
"api.example.com"
],
"is_expired": false,
"days_until_expiry": 374
},
"chain": [
{
"subject": "CN = example.com",
"issuer": "C = US, O = DigiCert Inc, CN = DigiCert Global G5 TLS RSA SHA256 2021 CA1"
},
{
"subject": "C = US, O = DigiCert Inc, CN = DigiCert Global G5 TLS RSA SHA256 2021 CA1",
"issuer": "C = US, O = DigiCert Inc, CN = DigiCert Global Root CA"
}
]
}
Interpretation:
- days_until_expiry: A negative value indicates an expired certificate
- is_expired: Boolean flag for quick checking
- san: Subject Alternative Names (typically includes the www variant and subdomains)
- chain: Certificate chain from leaf to root
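The expiry fields can be derived from the certificate's not_after timestamp along these lines. This is a hypothetical sketch; the server's exact computation may differ:

```typescript
// Hypothetical sketch: derive days_until_expiry from the certificate's
// not_after date. A negative result means the certificate is already expired.
export function daysUntilExpiry(notAfter: string, now: Date = new Date()): number {
  const expiry = new Date(notAfter);
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.floor((expiry.getTime() - now.getTime()) / msPerDay);
}
```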
whatweb_scan
Fingerprint web technologies including web servers, frameworks, CMS, and libraries.
Input Schema:
{
"url": "https://example.com",
"aggression": 1
}
Parameters:
- url (required): Target URL (HTTP/HTTPS)
- aggression (optional, default: 1): Scanning intensity (1 = stealthy, 2 = normal, 3 = aggressive)
Example Usage:
whatweb_scan(
url: "https://example.com",
aggression: 1
)
Example Output:
{
"url": "https://example.com",
"technologies": [
{
"name": "Nginx",
"version": "1.24.0",
"category": "web-server"
},
{
"name": "Node.js",
"version": "18.17.0",
"category": "language"
},
{
"name": "Express.js",
"version": "4.18.2",
"category": "framework"
},
{
"name": "React",
"version": "18.2.0",
"category": "framework"
},
{
"name": "jQuery",
"version": "3.7.0",
"category": "library"
},
{
"name": "Bootstrap",
"version": "5.2.3",
"category": "library"
}
],
"raw_output": "[{\"http_server\":\"Nginx\",\"plugins\":{\"Nginx\":{...}},\"url\":\"https://example.com\"}]"
}
Aggression Levels:
- 1 (Stealthy): HTTP requests only, minimal probes
- 2 (Normal): Standard fingerprinting techniques
- 3 (Aggressive): Additional probes and JavaScript analysis
Technology Categories:
- web-server: Apache, Nginx, IIS, Lighttpd
- language: PHP, Python, Ruby, Node.js, ASP.NET
- framework: WordPress, Django, Rails, Laravel, React, Angular, Vue
- library: jQuery, Bootstrap
- database: MySQL, PostgreSQL, MongoDB
- other: Miscellaneous technologies
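A minimal sketch of how detected plugin names might map onto these categories. The CATEGORY_MAP table here is a small illustrative subset, not the server's real lookup:

```typescript
// Hypothetical sketch of mapping WhatWeb plugin names onto the categories
// listed above; the server's actual table covers many more technologies.
const CATEGORY_MAP: Record<string, string> = {
  Nginx: "web-server", Apache: "web-server", IIS: "web-server",
  "Node.js": "language", PHP: "language",
  React: "framework", Django: "framework",
  jQuery: "library", Bootstrap: "library",
  MySQL: "database",
};

// Unknown plugins fall back to the "other" category.
export function categorize(plugin: string): string {
  return CATEGORY_MAP[plugin] ?? "other";
}
```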
dir_brute
Enumerate directories and files using gobuster or ffuf.
Input Schema:
{
"url": "https://example.com",
"wordlist": "/path/to/wordlist.txt",
"extensions": "php,html,txt",
"threads": 10,
"timeout": 60,
"status_codes": "200,204,301,302,307,401,403"
}
Parameters:
- url (required): Target URL (HTTP/HTTPS)
- wordlist (optional): Path to a wordlist file (uses /usr/share/wordlists/dirb/common.txt if not specified)
- extensions (optional): File extensions to try (e.g., php,html,txt)
- threads (optional, default: 10): Number of concurrent threads (1-20)
- timeout (optional, default: 60): Timeout in seconds (10-300)
- status_codes (optional, default: "200,204,301,302,307,401,403"): HTTP status codes to report as found
Example Usage:
dir_brute(
url: "https://example.com",
extensions: "php,html",
threads: 15,
timeout: 120,
status_codes: "200,301,302,401,403"
)
Example Output (gobuster):
{
"url": "https://example.com",
"wordlist": "/usr/share/wordlists/dirb/common.txt",
"tool": "gobuster",
"found": [
{
"path": "https://example.com/admin",
"status": 200,
"size": 5324
},
{
"path": "https://example.com/api",
"status": 301,
"size": 185
},
{
"path": "https://example.com/admin/login.php",
"status": 403,
"size": 0
},
{
"path": "https://example.com/config.php",
"status": 403,
"size": 0
},
{
"path": "https://example.com/backup",
"status": 401,
"size": 234
}
],
"elapsed_seconds": 45
}
Notes:
- Tries gobuster first; falls back to ffuf if gobuster is unavailable
- Returns results early, with a total_checked field, when practical result limits are reached
- Custom wordlist paths must exist on the system
- Respects timeout constraints
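For reference, one way the found[] entries could be derived from gobuster's default one-result-per-line output (e.g. "/admin  (Status: 200) [Size: 5324]"). This is a hypothetical parser sketch; the tool's real parsing may differ:

```typescript
// Hypothetical sketch: turn one gobuster result line into a found[] entry.
export interface Found { path: string; status: number; size: number }

export function parseGobusterLine(base: string, line: string): Found | null {
  // Expected shape: "<path> (Status: <code>) [Size: <bytes>]"
  const m = line.match(/^(\S+)\s+\(Status:\s*(\d+)\)\s*\[Size:\s*(\d+)\]/);
  if (!m) return null; // progress lines, banners, etc.
  return { path: new URL(m[1], base).toString(), status: +m[2], size: +m[3] };
}
```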
robots_sitemap
Parse robots.txt and sitemap.xml for crawlable resources and SEO information.
Input Schema:
{
"url": "https://example.com"
}
Parameters:
- url (required): Target URL (HTTP/HTTPS)
Example Usage:
robots_sitemap(
url: "https://example.com"
)
Example Output:
{
"url": "https://example.com",
"robots": {
"found": true,
"user_agents": [
{
"agent": "*",
"rules": [
{
"type": "disallow",
"path": "/admin/"
},
{
"type": "disallow",
"path": "/private/"
},
{
"type": "disallow",
"path": "/*.json$"
},
{
"type": "allow",
"path": "/public/"
}
]
},
{
"agent": "Googlebot",
"rules": [
{
"type": "disallow",
"path": "/tmp/"
}
]
}
],
"sitemaps": [
"https://example.com/sitemap.xml",
"https://example.com/sitemap_posts.xml"
]
},
"sitemap": {
"found": true,
"urls": [
"https://example.com/",
"https://example.com/blog/",
"https://example.com/about/",
"https://example.com/contact/",
"https://example.com/privacy/"
],
"total_urls": 152
}
}
Output Details:
- robots.found: Boolean indicating whether robots.txt exists
- robots.user_agents: List of user-agent rule groups (limited to 10)
- robots.sitemaps: Sitemap URLs referenced in robots.txt
- sitemap.found: Boolean indicating whether sitemap.xml exists
- sitemap.urls: First 100 URLs from the sitemap (preview)
- sitemap.total_urls: Total number of URLs in the sitemap
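The user_agents structure above comes from grouping robots.txt rules under their User-agent lines. A minimal sketch of that parsing, assuming a simplified grammar (the real tool also caps groups at 10 and collects Sitemap: lines):

```typescript
// Hypothetical sketch of robots.txt parsing: group Allow/Disallow rules
// under the most recent User-agent line. Comments (#) are stripped.
export interface Rule { type: "allow" | "disallow"; path: string }

export function parseRobots(text: string): Map<string, Rule[]> {
  const groups = new Map<string, Rule[]>();
  let current: Rule[] | null = null;
  for (const raw of text.split("\n")) {
    const line = raw.split("#")[0].trim();
    const [key, ...rest] = line.split(":");
    const value = rest.join(":").trim();
    if (!value) continue; // blank lines and lines without a directive value
    const k = key.trim().toLowerCase();
    if (k === "user-agent") {
      current = groups.get(value) ?? [];
      groups.set(value, current);
    } else if ((k === "allow" || k === "disallow") && current) {
      current.push({ type: k, path: value });
    }
  }
  return groups;
}
```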
cors_check
Check CORS configuration for misconfigurations and vulnerabilities.
Input Parameters:
{
url: string // Target URL to check CORS configuration
test_origins: string[] // Origins to test against (optional, defaults to ["https://evil.com", "https://attacker.com", "null"])
}
Example Request:
{
"url": "https://example.com",
"test_origins": ["https://evil.com", "https://attacker.com"]
}
Example Output:
{
"url": "https://example.com",
"cors_header": "https://evil.com",
"has_vulnerabilities": true,
"findings": [
{
"origin": "https://evil.com",
"allowed": true,
"credentials": false,
"methods": "GET, POST, PUT, DELETE",
"headers": "Content-Type, Authorization",
"severity": "HIGH"
},
{
"origin": "https://attacker.com",
"allowed": false,
"credentials": false,
"methods": "",
"headers": "",
"severity": "INFO"
},
{
"origin": "null",
"allowed": true,
"credentials": false,
"methods": "GET, POST",
"headers": "",
"severity": "HIGH"
}
],
"recommendations": [
"Avoid using wildcard (*) for Access-Control-Allow-Origin",
"Never combine wildcard origin with Access-Control-Allow-Credentials: true",
"Do not reflect the Origin header without validation",
"Maintain a strict allowlist of trusted origins",
"Never allow the 'null' origin"
]
}
Severity Levels:
- CRITICAL: Wildcard origin (*) combined with Access-Control-Allow-Credentials: true
- HIGH: Misconfigured origins allowing evil.com, attacker origins, or null origin
- MEDIUM: Overly permissive wildcard usage without credentials
- INFO: No CORS vulnerabilities detected or CORS not configured
Interpretation:
- has_vulnerabilities: Indicates the presence of HIGH or CRITICAL findings
- allowed: Boolean indicating whether the test origin is allowed
- credentials: Boolean indicating whether credentials can be included in cross-origin requests
- methods and headers: Allowed HTTP methods and headers for the origin
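The severity rules above, applied to a single reflected Access-Control-Allow-Origin response, can be sketched like this (hypothetical helper, not the server's exact logic):

```typescript
// Hypothetical sketch of the CORS severity rules described above.
// allowOrigin is the Access-Control-Allow-Origin value the server returned
// (null if absent); testOrigin is the attacker-controlled Origin we sent.
export function corsSeverity(
  allowOrigin: string | null,
  allowCredentials: boolean,
  testOrigin: string,
): "CRITICAL" | "HIGH" | "MEDIUM" | "INFO" {
  if (allowOrigin === "*" && allowCredentials) return "CRITICAL"; // wildcard + credentials
  if (allowOrigin === testOrigin || allowOrigin === "null") return "HIGH"; // reflected or null origin
  if (allowOrigin === "*") return "MEDIUM"; // permissive wildcard without credentials
  return "INFO"; // origin not allowed
}
```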
csp_analyze
Analyze Content Security Policy header for weaknesses and provide remediation recommendations.
Input Parameters:
{
url: string // Target URL to analyze Content Security Policy
}
Example Request:
{
"url": "https://example.com"
}
Example Output:
{
"url": "https://example.com",
"has_csp": true,
"is_report_only": false,
"raw_csp": "default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' https://cdn.example.com; img-src *; object-src 'none'",
"directives": {
"default-src": "'self'",
"script-src": "'self' 'unsafe-inline'",
"style-src": "'self' https://cdn.example.com",
"img-src": "*",
"object-src": "'none'"
},
"overall_severity": "HIGH",
"total_findings": 2,
"findings": [
{
"directive": "script-src",
"issue": "Uses 'unsafe-inline' which allows inline scripts/styles",
"severity": "HIGH",
"recommendation": "Remove 'unsafe-inline' from script-src. Use nonces or hashes instead."
},
{
"directive": "img-src",
"issue": "Uses wildcard (*) which allows loading from any source",
"severity": "HIGH",
"recommendation": "Replace wildcard in img-src with specific trusted domains."
}
]
}
Example Output (No CSP):
{
"url": "https://example.com",
"has_csp": false,
"severity": "HIGH",
"message": "No Content-Security-Policy header found",
"recommendation": "Implement a Content-Security-Policy header to prevent XSS and data injection attacks"
}
Analyzed Issues:
- unsafe-inline: Allows inline scripts/styles (HIGH in script-src, MEDIUM in others)
- unsafe-eval: Allows eval() and similar functions (HIGH)
- Wildcard (*): Allows loading from any source (HIGH)
- http: protocol: Allows insecure HTTP loading (MEDIUM)
- data: URIs: Can be used for injection attacks in script-src or object-src (HIGH)
- Missing directives: Important directives like script-src, object-src, base-uri (HIGH/MEDIUM)
- Report-only mode: Policy not enforced, only reported (MEDIUM)
Overall Severity:
- HIGH: One or more high-severity findings present
- MEDIUM: Only medium-severity findings present
- LOW: No significant findings
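The directives map in the example output and the severity roll-up can be sketched as two small helpers. Both are hypothetical illustrations of the behavior described above, not the server's actual code:

```typescript
// Hypothetical sketch: split a CSP header into its directive map, as shown
// in the "directives" field of the example output above.
export function parseCsp(header: string): Record<string, string> {
  const directives: Record<string, string> = {};
  for (const part of header.split(";")) {
    const trimmed = part.trim();
    if (!trimmed) continue;
    const [name, ...values] = trimmed.split(/\s+/);
    directives[name.toLowerCase()] = values.join(" ");
  }
  return directives;
}

// Hypothetical sketch of the overall-severity rule: HIGH dominates,
// then MEDIUM, otherwise LOW.
export function overallSeverity(findings: { severity: string }[]): "HIGH" | "MEDIUM" | "LOW" {
  if (findings.some((f) => f.severity === "HIGH")) return "HIGH";
  if (findings.some((f) => f.severity === "MEDIUM")) return "MEDIUM";
  return "LOW";
}
```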
screenshot_capture
Capture page metadata and resource analysis for a URL.
Input Parameters:
{
url: string // URL to capture screenshot of
viewport_width: number // Viewport width in pixels (320-3840, default: 1280)
viewport_height: number // Viewport height in pixels (240-2160, default: 800)
}
Example Request:
{
"url": "https://example.com",
"viewport_width": 1280,
"viewport_height": 800
}
Example Output:
{
"url": "https://example.com",
"final_url": "https://example.com/",
"status_code": 200,
"title": "Example Domain",
"description": "Example Domain. This domain is for use in examples and documentation.",
"og_image": "https://example.com/og-image.jpg",
"favicon": "/favicon.ico",
"page_metrics": {
"html_size_bytes": 48756,
"script_tags": 5,
"stylesheet_links": 3,
"image_tags": 12,
"form_tags": 2,
"iframe_tags": 1
},
"viewport": {
"width": 1280,
"height": 800
},
"note": "Full screenshot capture requires a headless browser. This tool provides page metadata and resource analysis.",
"server": "nginx/1.14.0",
"content_type": "text/html; charset=utf-8"
}
Output Details:
- final_url: URL after redirects
- status_code: HTTP response status
- title: Page title from the <title> tag
- description: Meta description content
- og_image: Open Graph image URL if available
- favicon: Favicon URL if available
- page_metrics: Analysis of page resources:
  - html_size_bytes: Total HTML document size
  - script_tags: Number of <script> tags
  - stylesheet_links: Number of <link rel="stylesheet"> elements
  - image_tags: Number of <img> tags
  - form_tags: Number of <form> elements
  - iframe_tags: Number of <iframe> elements
- server: Server header value
- content_type: Content-Type header value
Note: This tool provides page metadata and resource analysis. Full screenshot capture requires a headless browser installation.
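The page_metrics counts can be produced with simple regex matching over the fetched HTML. A hypothetical sketch (a real HTML parser would be more robust against edge cases like tags inside comments):

```typescript
// Hypothetical sketch of the regex-based resource counting behind
// page_metrics; counts opening tags only, so </script> is not double-counted.
export function pageMetrics(html: string) {
  const count = (re: RegExp) => (html.match(re) ?? []).length;
  return {
    html_size_bytes: new TextEncoder().encode(html).length,
    script_tags: count(/<script\b/gi),
    stylesheet_links: count(/<link\b[^>]*rel=["']stylesheet["']/gi),
    image_tags: count(/<img\b/gi),
    form_tags: count(/<form\b/gi),
    iframe_tags: count(/<iframe\b/gi),
  };
}
```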
Prerequisites
The tools have minimal dependencies, split between pure fetch-based tools and those requiring CLI utilities.
Fetch-Based Tools (No Installation Required)
- http_headers: Uses native Fetch API
- robots_sitemap: Uses native Fetch API
- cors_check: Uses native Fetch API
- csp_analyze: Uses native Fetch API
- screenshot_capture: Uses native Fetch API
CLI-Based Tools (Installation Required)
OpenSSL
Required for ssl_inspect tool. Standard on most systems.
Check if installed:
openssl version
If not present on macOS:
brew install openssl
WhatWeb
Required for the whatweb_scan tool.
Install on macOS:
brew install whatweb
Install on Linux (Ubuntu/Debian):
sudo apt-get install whatweb
Install on Linux (RHEL/CentOS):
sudo yum install whatweb
Gobuster or Ffuf
Required for the dir_brute tool (at least one).
Install gobuster on macOS:
brew install gobuster
Install gobuster on Linux (Ubuntu/Debian):
sudo apt-get install gobuster
Install ffuf as fallback on macOS:
brew install ffuf
Install ffuf on Linux (Ubuntu/Debian):
git clone https://github.com/ffuf/ffuf.git && cd ffuf && go build
Installation
Requirements
- Bun runtime (version 1.x or later)
- Node.js compatible environment
Steps
- Navigate to the project directory:
cd /path/to/web-recon
- Install dependencies:
bun install
- Build the server:
bun run build
- Verify the build:
ls -la dist/
The compiled MCP server is available at dist/index.js.
Usage
Stdio Transport (Standard)
The server uses stdio transport for communication with Claude clients. Run the server directly:
bun run dist/index.js
The server listens on stdin/stdout and responds to MCP protocol messages.
Claude Desktop Configuration
Add the following to your Claude Desktop configuration file (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
{
"mcpServers": {
"web-recon": {
"command": "bun",
"args": [
"run",
"/absolute/path/to/web-recon/dist/index.js"
]
}
}
}
Replace /absolute/path/to/web-recon with the actual absolute path to your installation.
Claude Code MCP Settings
For Claude Code integration, configure in your .mcp/settings.json:
{
"mcpServers": {
"web-recon": {
"command": "bun",
"args": [
"run",
"/absolute/path/to/web-recon/dist/index.js"
],
"enabled": true
}
}
}
Example Usage in Claude
Once configured, use the tools naturally in conversation:
Please audit the security headers for https://example.com and show me what's missing.
Claude will invoke http_headers and present the results with analysis.
Check the SSL certificate for api.github.com on port 443 and tell me when it expires.
Claude will invoke ssl_inspect and provide certificate details.
Fingerprint the technologies used on https://github.com with aggression level 2.
Claude will invoke whatweb_scan and identify detected technologies.
Brute-force common directories on https://example.com using the default wordlist.
Claude will invoke dir_brute with standard parameters.
Parse the robots.txt and sitemap.xml for https://example.com to see what content is exposed.
Claude will invoke robots_sitemap and extract crawlable paths.
Check the CORS configuration on https://api.example.com for misconfigurations.
Claude will invoke cors_check and identify vulnerable origins.
Analyze the Content Security Policy on https://example.com and suggest improvements.
Claude will invoke csp_analyze and identify policy weaknesses.
Capture page metadata for https://example.com to see its structure and resources.
Claude will invoke screenshot_capture and provide page analysis.
Security
The Web Recon MCP server implements multiple security controls to prevent abuse:
URL Validation
All URL inputs are strictly validated:
- Must be valid HTTP or HTTPS URLs
- Only http:// and https:// protocols are allowed
- URLs are parsed and validated before use
Implementation in src/security.ts:
export function validateUrl(url: string): URL {
let parsed: URL;
try {
parsed = new URL(url);
} catch {
throw new Error("Invalid URL format");
}
if (!["http:", "https:"].includes(parsed.protocol)) {
throw new Error("Only HTTP and HTTPS URLs are supported");
}
// ... additional checks below
}
Private and Local IP Blocking
Scanning of private/internal networks is explicitly blocked to prevent internal network reconnaissance:
Blocked address ranges:
- Loopback: 127.0.0.1, localhost
- Unspecified: 0.0.0.0
- Private (RFC 1918): 10.x.x.x, 172.16-31.x.x, 192.168.x.x
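A minimal sketch of such a check for IPv4 literals (hypothetical helper; the server's validateUrl applies additional rules, and hostnames that resolve to private addresses need DNS-aware handling):

```typescript
// Hypothetical sketch of the private/local address check described above.
export function isPrivateAddress(host: string): boolean {
  if (host === "localhost" || host === "0.0.0.0") return true;
  const octets = host.split(".").map(Number);
  if (octets.length !== 4 || octets.some((o) => Number.isNaN(o) || o < 0 || o > 255)) {
    return false; // not a dotted-quad IPv4 literal
  }
  const [a, b] = octets;
  if (a === 127) return true;                       // loopback
  if (a === 10) return true;                        // RFC 1918
  if (a === 172 && b >= 16 && b <= 31) return true; // RFC 1918
  if (a === 192 && b === 168) return true;          // RFC 1918
  return false;
}
```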
Any attempt to scan these addresses will result in an error:
Error: Scanning private/local addresses is not allowed through this tool
Hostname Character Restrictions
The ssl_inspect tool validates hostname characters to prevent command injection:
Allowed characters:
- Alphanumeric (a-z, A-Z, 0-9)
- Period (.)
- Hyphen (-)
Blocked characters:
- Shell metacharacters: ; & | ` $ ( ) { }
Example validation:
export function validateHost(host: string): void {
if (!host || host.trim().length === 0) {
throw new Error("Host is required");
}
if (host.length > 253) {
throw new Error("Hostname too long");
}
if (/[;&|`$(){}]/.test(host)) {
throw new Error("Host contains disallowed characters");
}
}
Command Execution Safety
CLI tools are executed using execFile with strict argument handling:
- No shell interpretation of user input
- Arguments passed as array, not string concatenation
- 30-second timeouts on all CLI executions
- 5MB maximum output buffer to prevent memory exhaustion
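As an illustration of the argument-array approach, here is a hypothetical sketch of how dir_brute might assemble gobuster arguments (the exact flags used by the server are an assumption):

```typescript
// Hypothetical sketch: build the gobuster argv as an array so user input
// is passed verbatim to execFile and never interpreted by a shell.
export function buildGobusterArgs(
  url: string,
  wordlist: string,
  threads: number,
  statusCodes: string,
): string[] {
  return [
    "dir",
    "-u", url,
    "-w", wordlist,
    "-t", String(threads),
    "-s", statusCodes,
    "-q", // quiet output for easier parsing
  ];
}
```

These arguments would then be handed to something like execFile("gobuster", args, { timeout: 30_000, maxBuffer: 5 * 1024 * 1024 }), matching the timeout and buffer limits listed above.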
Network Isolation
All external connections (HTTP, HTTPS, TLS) are:
- Subject to 15-30 second timeouts
- Performed over standard protocols only
- Not routed through custom proxies
License
MIT License
Copyright (c) 2025
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE OR ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
