@modular-intelligence/web-recon
v1.0.2
MCP server for web application reconnaissance
Web Recon MCP Server
A comprehensive Model Context Protocol (MCP) server for web application reconnaissance and security analysis. Perform HTTP header audits, SSL/TLS certificate inspection, technology fingerprinting, directory brute-forcing, CORS analysis, Content Security Policy evaluation, and web page capture through a unified Claude interface.
Overview
The Web Recon MCP server provides eight specialized tools for security researchers, penetration testers, and developers to gather reconnaissance data on web applications:
- HTTP Headers Audit: Security headers analysis with present/missing ratings
- SSL/TLS Inspection: Certificate details, cipher suites, and expiry tracking
- Technology Fingerprinting: Web server, framework, CMS, and library detection
- Directory Brute-Force: Path enumeration using gobuster or ffuf
- Robots.txt & Sitemap Parsing: Crawlable resource discovery
- CORS Analysis: Misconfiguration detection and vulnerability assessment
- Content Security Policy Evaluation: Header weakness identification and remediation
- Web Page Capture: Page metadata and resource analysis
Tools
Quick Reference
| Tool | Description | Input | CLI Dependency |
|------|-------------|-------|-----------------|
| http_headers | Fetch and audit security headers | URL + redirect option | None (fetch-based) |
| ssl_inspect | Inspect SSL/TLS certificates | Host + port | openssl |
| whatweb_scan | Fingerprint web technologies | URL + aggression level | whatweb |
| dir_brute | Enumerate directories and files | URL + wordlist config | gobuster or ffuf |
| robots_sitemap | Parse robots.txt and sitemap.xml | URL | None (fetch-based) |
| cors_check | Check CORS configuration | URL + test origins | None (fetch-based) |
| csp_analyze | Analyze Content Security Policy | URL | None (fetch-based) |
| screenshot_capture | Capture page metadata and resources | URL + viewport dimensions | None (fetch-based) |
http_headers
Audit HTTP security headers and gather technology hints.
Input Schema:
{
"url": "https://example.com",
"follow_redirects": true
}
Parameters:
- url (required): Target URL (HTTP/HTTPS)
- follow_redirects (optional, default: true): Follow HTTP 3xx redirects
Example Usage:
http_headers(
url: "https://example.com",
follow_redirects: true
)
Example Output:
{
"url": "https://example.com",
"status_code": 200,
"headers": {
"content-type": "text/html; charset=utf-8",
"server": "nginx/1.24.0",
"content-length": "4521",
"strict-transport-security": "max-age=31536000; includeSubDomains",
"content-security-policy": "default-src 'self'; script-src 'self' 'unsafe-inline'",
"x-content-type-options": "nosniff",
"cache-control": "public, max-age=3600"
},
"security_headers": {
"present": [
{
"name": "Strict-Transport-Security",
"value": "max-age=31536000; includeSubDomains",
"rating": "good"
},
{
"name": "Content-Security-Policy",
"value": "default-src 'self'; script-src 'self' 'unsafe-inline'",
"rating": "present"
},
{
"name": "X-Content-Type-Options",
"value": "nosniff",
"rating": "good"
}
],
"missing": [
{
"name": "X-Frame-Options",
"recommendation": "Add 'X-Frame-Options: DENY' or 'SAMEORIGIN' to prevent clickjacking"
},
{
"name": "X-XSS-Protection",
"recommendation": "Add 'X-XSS-Protection: 1; mode=block' (legacy browsers)"
},
{
"name": "Referrer-Policy",
"recommendation": "Add 'Referrer-Policy: strict-origin-when-cross-origin'"
},
{
"name": "Permissions-Policy",
"recommendation": "Add Permissions-Policy to control browser features"
}
]
},
"server": "nginx/1.24.0",
"technology_hints": [
"Server: nginx/1.24.0"
]
}
Security Headers Rated:
- Strict-Transport-Security: Ratings: "good" (max-age >= 1 year) or "weak"
- Content-Security-Policy: Ratings: "present"
- X-Content-Type-Options: Ratings: "good" (nosniff) or "weak"
- X-Frame-Options: Ratings: "good" (DENY/SAMEORIGIN) or "weak"
- X-XSS-Protection: Ratings: "present"
- Referrer-Policy: Ratings: "present"
- Permissions-Policy: Ratings: "present"
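The Strict-Transport-Security rule above can be expressed as a small helper. This is an illustrative sketch (the function name rateHsts is hypothetical), not the server's actual implementation:

```typescript
// Hypothetical sketch of the HSTS rating rule described above:
// "good" when max-age is at least one year (31536000 seconds), else "weak".
export function rateHsts(value: string): "good" | "weak" {
  const match = value.match(/max-age=(\d+)/i);
  if (!match) return "weak"; // no max-age directive at all
  const maxAge = parseInt(match[1], 10);
  return maxAge >= 31536000 ? "good" : "weak";
}
```

The other headers follow the same pattern, with "good"/"weak" or a simple "present" rating depending on whether their value is inspected.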
ssl_inspect
Inspect SSL/TLS certificate details, cipher suites, and certificate chain.
Input Schema:
{
"host": "example.com",
"port": 443
}
Parameters:
- host (required): Hostname or IP address (alphanumeric, dots, hyphens only)
- port (optional, default: 443): Port number (1-65535)
Example Usage:
ssl_inspect(
host: "example.com",
port: 443
)
Example Output:
{
"host": "example.com",
"port": 443,
"protocol": "TLSv1.3",
"cipher": "TLS_AES_256_GCM_SHA384",
"certificate": {
"subject": "CN = example.com",
"issuer": "C = US, O = DigiCert Inc, CN = DigiCert Global G5 TLS RSA SHA256 2021 CA1",
"not_before": "Jan 15 00:00:00 2024 GMT",
"not_after": "Feb 15 23:59:59 2025 GMT",
"serial_number": "0F:A2:C8:D4:E1:2B:5C:7F:9E:6D:3A:1C:4B:8F:2E:9A",
"san": [
"example.com",
"www.example.com",
"api.example.com"
],
"is_expired": false,
"days_until_expiry": 374
},
"chain": [
{
"subject": "CN = example.com",
"issuer": "C = US, O = DigiCert Inc, CN = DigiCert Global G5 TLS RSA SHA256 2021 CA1"
},
{
"subject": "C = US, O = DigiCert Inc, CN = DigiCert Global G5 TLS RSA SHA256 2021 CA1",
"issuer": "C = US, O = DigiCert Inc, CN = DigiCert Global Root CA"
}
]
}
Interpretation:
- days_until_expiry: A negative value indicates an expired certificate
- is_expired: Boolean flag for quick checking
- san: Subject Alternative Names (typically includes the www variant and subdomains)
- chain: Certificate chain from leaf to root
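The expiry fields can be derived from the certificate's not_after timestamp along these lines. This is a hypothetical sketch; the server's exact computation may differ:

```typescript
// Hypothetical sketch: derive days_until_expiry from the certificate's
// not_after date. A negative result means the certificate is already expired.
export function daysUntilExpiry(notAfter: string, now: Date = new Date()): number {
  const expiry = new Date(notAfter);
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.floor((expiry.getTime() - now.getTime()) / msPerDay);
}
```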
whatweb_scan
Fingerprint web technologies including web servers, frameworks, CMS, and libraries.
Input Schema:
{
"url": "https://example.com",
"aggression": 1
}
Parameters:
- url (required): Target URL (HTTP/HTTPS)
- aggression (optional, default: 1): Scanning intensity (1 = stealthy, 2 = normal, 3 = aggressive)
Example Usage:
whatweb_scan(
url: "https://example.com",
aggression: 1
)
Example Output:
{
"url": "https://example.com",
"technologies": [
{
"name": "Nginx",
"version": "1.24.0",
"category": "web-server"
},
{
"name": "Node.js",
"version": "18.17.0",
"category": "language"
},
{
"name": "Express.js",
"version": "4.18.2",
"category": "framework"
},
{
"name": "React",
"version": "18.2.0",
"category": "framework"
},
{
"name": "jQuery",
"version": "3.7.0",
"category": "library"
},
{
"name": "Bootstrap",
"version": "5.2.3",
"category": "library"
}
],
"raw_output": "[{\"http_server\":\"Nginx\",\"plugins\":{\"Nginx\":{...}},\"url\":\"https://example.com\"}]"
}
Aggression Levels:
- 1 (Stealthy): HTTP requests only, minimal probes
- 2 (Normal): Standard fingerprinting techniques
- 3 (Aggressive): Additional probes and JavaScript analysis
Technology Categories:
- web-server: Apache, Nginx, IIS, Lighttpd
- language: PHP, Python, Ruby, Node.js, ASP.NET
- framework: WordPress, Django, Rails, Laravel, React, Angular, Vue
- library: jQuery, Bootstrap
- database: MySQL, PostgreSQL, MongoDB
- other: Miscellaneous technologies
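A minimal sketch of how detected plugin names might map onto these categories. The CATEGORY_MAP table here is a small illustrative subset, not the server's real lookup:

```typescript
// Hypothetical sketch of mapping WhatWeb plugin names onto the categories
// listed above; the server's actual table covers many more technologies.
const CATEGORY_MAP: Record<string, string> = {
  Nginx: "web-server", Apache: "web-server", IIS: "web-server",
  "Node.js": "language", PHP: "language",
  React: "framework", Django: "framework",
  jQuery: "library", Bootstrap: "library",
  MySQL: "database",
};

// Unknown plugins fall back to the "other" category.
export function categorize(plugin: string): string {
  return CATEGORY_MAP[plugin] ?? "other";
}
```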
dir_brute
Enumerate directories and files using gobuster or ffuf.
Input Schema:
{
"url": "https://example.com",
"wordlist": "/path/to/wordlist.txt",
"extensions": "php,html,txt",
"threads": 10,
"timeout": 60,
"status_codes": "200,204,301,302,307,401,403"
}
Parameters:
- url (required): Target URL (HTTP/HTTPS)
- wordlist (optional): Path to a wordlist file (uses /usr/share/wordlists/dirb/common.txt if not specified)
- extensions (optional): File extensions to try (e.g., php,html,txt)
- threads (optional, default: 10): Number of concurrent threads (1-20)
- timeout (optional, default: 60): Timeout in seconds (10-300)
- status_codes (optional, default: "200,204,301,302,307,401,403"): HTTP status codes to report as found
Example Usage:
dir_brute(
url: "https://example.com",
extensions: "php,html",
threads: 15,
timeout: 120,
status_codes: "200,301,302,401,403"
)
Example Output (gobuster):
{
"url": "https://example.com",
"wordlist": "/usr/share/wordlists/dirb/common.txt",
"tool": "gobuster",
"found": [
{
"path": "https://example.com/admin",
"status": 200,
"size": 5324
},
{
"path": "https://example.com/api",
"status": 301,
"size": 185
},
{
"path": "https://example.com/admin/login.php",
"status": 403,
"size": 0
},
{
"path": "https://example.com/config.php",
"status": 403,
"size": 0
},
{
"path": "https://example.com/backup",
"status": 401,
"size": 234
}
],
"elapsed_seconds": 45
}
Notes:
- Tries gobuster first; falls back to ffuf if gobuster is unavailable
- Returns results early, with a total_checked field, when practical result limits are reached
- Custom wordlist paths must exist on the system
- Respects timeout constraints
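For reference, one way the found[] entries could be derived from gobuster's default one-result-per-line output (e.g. "/admin  (Status: 200) [Size: 5324]"). This is a hypothetical parser sketch; the tool's real parsing may differ:

```typescript
// Hypothetical sketch: turn one gobuster result line into a found[] entry.
export interface Found { path: string; status: number; size: number }

export function parseGobusterLine(base: string, line: string): Found | null {
  // Expected shape: "<path> (Status: <code>) [Size: <bytes>]"
  const m = line.match(/^(\S+)\s+\(Status:\s*(\d+)\)\s*\[Size:\s*(\d+)\]/);
  if (!m) return null; // progress lines, banners, etc.
  return { path: new URL(m[1], base).toString(), status: +m[2], size: +m[3] };
}
```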
robots_sitemap
Parse robots.txt and sitemap.xml for crawlable resources and SEO information.
Input Schema:
{
"url": "https://example.com"
}
Parameters:
- url (required): Target URL (HTTP/HTTPS)
Example Usage:
robots_sitemap(
url: "https://example.com"
)
Example Output:
{
"url": "https://example.com",
"robots": {
"found": true,
"user_agents": [
{
"agent": "*",
"rules": [
{
"type": "disallow",
"path": "/admin/"
},
{
"type": "disallow",
"path": "/private/"
},
{
"type": "disallow",
"path": "/*.json$"
},
{
"type": "allow",
"path": "/public/"
}
]
},
{
"agent": "Googlebot",
"rules": [
{
"type": "disallow",
"path": "/tmp/"
}
]
}
],
"sitemaps": [
"https://example.com/sitemap.xml",
"https://example.com/sitemap_posts.xml"
]
},
"sitemap": {
"found": true,
"urls": [
"https://example.com/",
"https://example.com/blog/",
"https://example.com/about/",
"https://example.com/contact/",
"https://example.com/privacy/"
],
"total_urls": 152
}
}
Output Details:
- robots.found: Boolean indicating whether robots.txt exists
- robots.user_agents: List of user-agent rule groups (limited to 10)
- robots.sitemaps: Sitemap URLs referenced in robots.txt
- sitemap.found: Boolean indicating whether sitemap.xml exists
- sitemap.urls: First 100 URLs from the sitemap (preview)
- sitemap.total_urls: Total number of URLs in the sitemap
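The user_agents structure above comes from grouping robots.txt rules under their User-agent lines. A minimal sketch of that parsing, assuming a simplified grammar (the real tool also caps groups at 10 and collects Sitemap: lines):

```typescript
// Hypothetical sketch of robots.txt parsing: group Allow/Disallow rules
// under the most recent User-agent line. Comments (#) are stripped.
export interface Rule { type: "allow" | "disallow"; path: string }

export function parseRobots(text: string): Map<string, Rule[]> {
  const groups = new Map<string, Rule[]>();
  let current: Rule[] | null = null;
  for (const raw of text.split("\n")) {
    const line = raw.split("#")[0].trim();
    const [key, ...rest] = line.split(":");
    const value = rest.join(":").trim();
    if (!value) continue; // blank lines and lines without a directive value
    const k = key.trim().toLowerCase();
    if (k === "user-agent") {
      current = groups.get(value) ?? [];
      groups.set(value, current);
    } else if ((k === "allow" || k === "disallow") && current) {
      current.push({ type: k, path: value });
    }
  }
  return groups;
}
```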
cors_check
Check CORS configuration for misconfigurations and vulnerabilities.
Input Parameters:
{
url: string // Target URL to check CORS configuration
test_origins: string[] // Origins to test against (optional, defaults to ["https://evil.com", "https://attacker.com", "null"])
}
Example Request:
{
"url": "https://example.com",
"test_origins": ["https://evil.com", "https://attacker.com"]
}
Example Output:
{
"url": "https://example.com",
"cors_header": "https://evil.com",
"has_vulnerabilities": true,
"findings": [
{
"origin": "https://evil.com",
"allowed": true,
"credentials": false,
"methods": "GET, POST, PUT, DELETE",
"headers": "Content-Type, Authorization",
"severity": "HIGH"
},
{
"origin": "https://attacker.com",
"allowed": false,
"credentials": false,
"methods": "",
"headers": "",
"severity": "INFO"
},
{
"origin": "null",
"allowed": true,
"credentials": false,
"methods": "GET, POST",
"headers": "",
"severity": "HIGH"
}
],
"recommendations": [
"Avoid using wildcard (*) for Access-Control-Allow-Origin",
"Never combine wildcard origin with Access-Control-Allow-Credentials: true",
"Do not reflect the Origin header without validation",
"Maintain a strict allowlist of trusted origins",
"Never allow the 'null' origin"
]
}
Severity Levels:
- CRITICAL: Wildcard origin (*) combined with Access-Control-Allow-Credentials: true
- HIGH: Misconfigured origins allowing evil.com, attacker origins, or null origin
- MEDIUM: Overly permissive wildcard usage without credentials
- INFO: No CORS vulnerabilities detected or CORS not configured
Interpretation:
- has_vulnerabilities: Indicates the presence of HIGH or CRITICAL findings
- allowed: Boolean indicating whether the test origin is allowed
- credentials: Boolean indicating whether credentials can be included in cross-origin requests
- methods and headers: Allowed HTTP methods and headers for the origin
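The severity rules above, applied to a single reflected Access-Control-Allow-Origin response, can be sketched like this (hypothetical helper, not the server's exact logic):

```typescript
// Hypothetical sketch of the CORS severity rules described above.
// allowOrigin is the Access-Control-Allow-Origin value the server returned
// (null if absent); testOrigin is the attacker-controlled Origin we sent.
export function corsSeverity(
  allowOrigin: string | null,
  allowCredentials: boolean,
  testOrigin: string,
): "CRITICAL" | "HIGH" | "MEDIUM" | "INFO" {
  if (allowOrigin === "*" && allowCredentials) return "CRITICAL"; // wildcard + credentials
  if (allowOrigin === testOrigin || allowOrigin === "null") return "HIGH"; // reflected or null origin
  if (allowOrigin === "*") return "MEDIUM"; // permissive wildcard without credentials
  return "INFO"; // origin not allowed
}
```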
csp_analyze
Analyze Content Security Policy header for weaknesses and provide remediation recommendations.
Input Parameters:
{
url: string // Target URL to analyze Content Security Policy
}
Example Request:
{
"url": "https://example.com"
}
Example Output:
{
"url": "https://example.com",
"has_csp": true,
"is_report_only": false,
"raw_csp": "default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' https://cdn.example.com; img-src *; object-src 'none'",
"directives": {
"default-src": "'self'",
"script-src": "'self' 'unsafe-inline'",
"style-src": "'self' https://cdn.example.com",
"img-src": "*",
"object-src": "'none'"
},
"overall_severity": "HIGH",
"total_findings": 2,
"findings": [
{
"directive": "script-src",
"issue": "Uses 'unsafe-inline' which allows inline scripts/styles",
"severity": "HIGH",
"recommendation": "Remove 'unsafe-inline' from script-src. Use nonces or hashes instead."
},
{
"directive": "img-src",
"issue": "Uses wildcard (*) which allows loading from any source",
"severity": "HIGH",
"recommendation": "Replace wildcard in img-src with specific trusted domains."
}
]
}
Example Output (No CSP):
{
"url": "https://example.com",
"has_csp": false,
"severity": "HIGH",
"message": "No Content-Security-Policy header found",
"recommendation": "Implement a Content-Security-Policy header to prevent XSS and data injection attacks"
}
Analyzed Issues:
- unsafe-inline: Allows inline scripts/styles (HIGH in script-src, MEDIUM in others)
- unsafe-eval: Allows eval() and similar functions (HIGH)
- Wildcard (*): Allows loading from any source (HIGH)
- http: protocol: Allows insecure HTTP loading (MEDIUM)
- data: URIs: Can be used for injection attacks in script-src or object-src (HIGH)
- Missing directives: Important directives like script-src, object-src, base-uri (HIGH/MEDIUM)
- Report-only mode: Policy not enforced, only reported (MEDIUM)
Overall Severity:
- HIGH: One or more high-severity findings present
- MEDIUM: Only medium-severity findings present
- LOW: No significant findings
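The directives map in the example output and the severity roll-up can be sketched as two small helpers. Both are hypothetical illustrations of the behavior described above, not the server's actual code:

```typescript
// Hypothetical sketch: split a CSP header into its directive map, as shown
// in the "directives" field of the example output above.
export function parseCsp(header: string): Record<string, string> {
  const directives: Record<string, string> = {};
  for (const part of header.split(";")) {
    const trimmed = part.trim();
    if (!trimmed) continue;
    const [name, ...values] = trimmed.split(/\s+/);
    directives[name.toLowerCase()] = values.join(" ");
  }
  return directives;
}

// Hypothetical sketch of the overall-severity rule: HIGH dominates,
// then MEDIUM, otherwise LOW.
export function overallSeverity(findings: { severity: string }[]): "HIGH" | "MEDIUM" | "LOW" {
  if (findings.some((f) => f.severity === "HIGH")) return "HIGH";
  if (findings.some((f) => f.severity === "MEDIUM")) return "MEDIUM";
  return "LOW";
}
```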
screenshot_capture
Capture page metadata and resource analysis for a URL.
Input Parameters:
{
url: string // URL to capture screenshot of
viewport_width: number // Viewport width in pixels (320-3840, default: 1280)
viewport_height: number // Viewport height in pixels (240-2160, default: 800)
}
Example Request:
{
"url": "https://example.com",
"viewport_width": 1280,
"viewport_height": 800
}
Example Output:
{
"url": "https://example.com",
"final_url": "https://example.com/",
"status_code": 200,
"title": "Example Domain",
"description": "Example Domain. This domain is for use in examples and documentation.",
"og_image": "https://example.com/og-image.jpg",
"favicon": "/favicon.ico",
"page_metrics": {
"html_size_bytes": 48756,
"script_tags": 5,
"stylesheet_links": 3,
"image_tags": 12,
"form_tags": 2,
"iframe_tags": 1
},
"viewport": {
"width": 1280,
"height": 800
},
"note": "Full screenshot capture requires a headless browser. This tool provides page metadata and resource analysis.",
"server": "nginx/1.14.0",
"content_type": "text/html; charset=utf-8"
}
Output Details:
- final_url: URL after redirects
- status_code: HTTP response status
- title: Page title from the <title> tag
- description: Meta description content
- og_image: Open Graph image URL if available
- favicon: Favicon URL if available
- page_metrics: Analysis of page resources:
  - html_size_bytes: Total HTML document size
  - script_tags: Number of <script> tags
  - stylesheet_links: Number of <link rel="stylesheet"> elements
  - image_tags: Number of <img> tags
  - form_tags: Number of <form> elements
  - iframe_tags: Number of <iframe> elements
- server: Server header value
- content_type: Content-Type header value
Note: This tool provides page metadata and resource analysis. Full screenshot capture requires a headless browser installation.
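The page_metrics counts can be produced with simple regex matching over the fetched HTML. A hypothetical sketch (a real HTML parser would be more robust against edge cases like tags inside comments):

```typescript
// Hypothetical sketch of the regex-based resource counting behind
// page_metrics; counts opening tags only, so </script> is not double-counted.
export function pageMetrics(html: string) {
  const count = (re: RegExp) => (html.match(re) ?? []).length;
  return {
    html_size_bytes: new TextEncoder().encode(html).length,
    script_tags: count(/<script\b/gi),
    stylesheet_links: count(/<link\b[^>]*rel=["']stylesheet["']/gi),
    image_tags: count(/<img\b/gi),
    form_tags: count(/<form\b/gi),
    iframe_tags: count(/<iframe\b/gi),
  };
}
```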
Prerequisites
The tools have minimal dependencies, split between pure fetch-based tools and those requiring CLI utilities.
Fetch-Based Tools (No Installation Required)
- http_headers: Uses native Fetch API
- robots_sitemap: Uses native Fetch API
- cors_check: Uses native Fetch API
- csp_analyze: Uses native Fetch API
- screenshot_capture: Uses native Fetch API
CLI-Based Tools (Installation Required)
OpenSSL
Required for ssl_inspect tool. Standard on most systems.
Check if installed:
openssl version
If not present on macOS:
brew install openssl
WhatWeb
Required for the whatweb_scan tool.
Install on macOS:
brew install whatweb
Install on Linux (Ubuntu/Debian):
sudo apt-get install whatweb
Install on Linux (RHEL/CentOS):
sudo yum install whatweb
Gobuster or Ffuf
Required for the dir_brute tool (at least one).
Install gobuster on macOS:
brew install gobuster
Install gobuster on Linux (Ubuntu/Debian):
sudo apt-get install gobuster
Install ffuf as fallback on macOS:
brew install ffuf
Install ffuf on Linux (Ubuntu/Debian):
git clone https://github.com/ffuf/ffuf.git && cd ffuf && go build
Installation
Requirements
- Bun runtime (version 1.x or later)
- Node.js compatible environment
Steps
- Navigate to the project directory:
cd /path/to/web-recon
- Install dependencies:
bun install
- Build the server:
bun run build
- Verify the build:
ls -la dist/
The compiled MCP server is available at dist/index.js.
Usage
Stdio Transport (Standard)
The server uses stdio transport for communication with Claude clients. Run the server directly:
bun run dist/index.js
The server listens on stdin/stdout and responds to MCP protocol messages.
Claude Desktop Configuration
Add the following to your Claude Desktop configuration file (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
{
"mcpServers": {
"web-recon": {
"command": "bun",
"args": [
"run",
"/absolute/path/to/web-recon/dist/index.js"
]
}
}
}
Replace /absolute/path/to/web-recon with the actual absolute path to your installation.
Claude Code MCP Settings
For Claude Code integration, configure in your .mcp/settings.json:
{
"mcpServers": {
"web-recon": {
"command": "bun",
"args": [
"run",
"/absolute/path/to/web-recon/dist/index.js"
],
"enabled": true
}
}
}
Example Usage in Claude
Once configured, use the tools naturally in conversation:
Please audit the security headers for https://example.com and show me what's missing.
Claude will invoke http_headers and present the results with analysis.
Check the SSL certificate for api.github.com on port 443 and tell me when it expires.
Claude will invoke ssl_inspect and provide certificate details.
Fingerprint the technologies used on https://github.com with aggression level 2.
Claude will invoke whatweb_scan and identify detected technologies.
Brute-force common directories on https://example.com using the default wordlist.
Claude will invoke dir_brute with standard parameters.
Parse the robots.txt and sitemap.xml for https://example.com to see what content is exposed.
Claude will invoke robots_sitemap and extract crawlable paths.
Check the CORS configuration on https://api.example.com for misconfigurations.
Claude will invoke cors_check and identify vulnerable origins.
Analyze the Content Security Policy on https://example.com and suggest improvements.
Claude will invoke csp_analyze and identify policy weaknesses.
Capture page metadata for https://example.com to see its structure and resources.
Claude will invoke screenshot_capture and provide page analysis.
Security
The Web Recon MCP server implements multiple security controls to prevent abuse:
URL Validation
All URL inputs are strictly validated:
- Must be valid HTTP or HTTPS URLs
- Only http:// and https:// protocols are allowed
- URLs are parsed and validated before use
Implementation in src/security.ts:
export function validateUrl(url: string): URL {
let parsed: URL;
try {
parsed = new URL(url);
} catch {
throw new Error("Invalid URL format");
}
if (!["http:", "https:"].includes(parsed.protocol)) {
throw new Error("Only HTTP and HTTPS URLs are supported");
}
// ... additional checks below
}
Private and Local IP Blocking
Scanning of private/internal networks is explicitly blocked to prevent internal network reconnaissance:
Blocked address ranges:
- Loopback: 127.0.0.1, localhost
- Unspecified: 0.0.0.0
- Private (RFC 1918): 10.x.x.x, 172.16-31.x.x, 192.168.x.x
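A minimal sketch of such a check for IPv4 literals (hypothetical helper; the server's validateUrl applies additional rules, and hostnames that resolve to private addresses need DNS-aware handling):

```typescript
// Hypothetical sketch of the private/local address check described above.
export function isPrivateAddress(host: string): boolean {
  if (host === "localhost" || host === "0.0.0.0") return true;
  const octets = host.split(".").map(Number);
  if (octets.length !== 4 || octets.some((o) => Number.isNaN(o) || o < 0 || o > 255)) {
    return false; // not a dotted-quad IPv4 literal
  }
  const [a, b] = octets;
  if (a === 127) return true;                       // loopback
  if (a === 10) return true;                        // RFC 1918
  if (a === 172 && b >= 16 && b <= 31) return true; // RFC 1918
  if (a === 192 && b === 168) return true;          // RFC 1918
  return false;
}
```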
Any attempt to scan these addresses will result in an error:
Error: Scanning private/local addresses is not allowed through this tool
Hostname Character Restrictions
The ssl_inspect tool validates hostname characters to prevent command injection:
Allowed characters:
- Alphanumeric (a-z, A-Z, 0-9)
- Period (.)
- Hyphen (-)
Blocked characters:
- Shell metacharacters: ; & | ` $ ( ) { }
Example validation:
export function validateHost(host: string): void {
if (!host || host.trim().length === 0) {
throw new Error("Host is required");
}
if (host.length > 253) {
throw new Error("Hostname too long");
}
if (/[;&|`$(){}]/.test(host)) {
throw new Error("Host contains disallowed characters");
}
}
Command Execution Safety
CLI tools are executed using execFile with strict argument handling:
- No shell interpretation of user input
- Arguments passed as array, not string concatenation
- 30-second timeouts on all CLI executions
- 5MB maximum output buffer to prevent memory exhaustion
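As an illustration of the argument-array approach, here is a hypothetical sketch of how dir_brute might assemble gobuster arguments (the exact flags used by the server are an assumption):

```typescript
// Hypothetical sketch: build the gobuster argv as an array so user input
// is passed verbatim to execFile and never interpreted by a shell.
export function buildGobusterArgs(
  url: string,
  wordlist: string,
  threads: number,
  statusCodes: string,
): string[] {
  return [
    "dir",
    "-u", url,
    "-w", wordlist,
    "-t", String(threads),
    "-s", statusCodes,
    "-q", // quiet output for easier parsing
  ];
}
```

These arguments would then be handed to something like execFile("gobuster", args, { timeout: 30_000, maxBuffer: 5 * 1024 * 1024 }), matching the timeout and buffer limits listed above.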
Network Isolation
All external connections (HTTP, HTTPS, TLS) are:
- Subject to 15-30 second timeouts
- Performed over standard protocols only
- Not routed through custom proxies
License
MIT License
Copyright (c) 2025
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE OR ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
