selenium-mcp

v1.5.1

A Model Context Protocol (MCP) server that provides advanced screenshot capabilities using Selenium WebDriver. Perfect for AI agents, automated testing, visual regression testing, and content capture workflows.

Selenium Screenshot Server

📋 Issue Tracker - Report bugs, request features, or ask questions

Getting Started (For AI Agents)

Quick Setup in Cursor

  1. Clone and install the server:
git clone <repository-url>
cd selenium
npm install
npm test  # Verify installation
  2. Add to your Cursor MCP configuration: Create or edit ~/.cursor/mcp.json (macOS/Linux) or %APPDATA%\Cursor\mcp.json (Windows):
{
  "mcpServers": {
    "selenium-screenshot": {
      "command": "node",
      "args": ["/path/to/your/selenium/src/server.js"],
      "env": {
        "NODE_ENV": "production"
      }
    }
  }
}
  3. Restart Cursor and start using the screenshot tool!

Demo Commands

Try these commands in Cursor:

Take a screenshot of https://google.com
Take a full page screenshot of https://google.com in desktop viewport
Take a screenshot of the Google logo on https://google.com
Take a screenshot of https://google.com in mobile viewport

What You Can Do

  • Basic screenshots: "Take a screenshot of [URL]"
  • Full page capture: "Take a full page screenshot of [URL]"
  • Element-specific: "Take a screenshot of the [element] on [URL]"
  • Responsive testing: "Take screenshots of [URL] in mobile, tablet, and desktop viewports"
  • Debug elements: "Take a screenshot of the [element] on [URL] with highlighting"

For detailed setup instructions, see CURSOR_SETUP.md.

Features

  • Full Page Screenshots: Capture entire page content including areas below the fold
  • Element-Specific Screenshots: Target specific DOM elements with CSS selectors
  • Multiple Viewport Sizes: Support for mobile, tablet, and desktop presets
  • Custom Viewport Dimensions: Flexible viewport sizing for responsive testing
  • Wait Conditions: Wait for selectors or custom time periods
  • Element Highlighting: Debug mode for element-specific screenshots
  • Headless Mode: Configurable browser visibility (headless by default for efficiency)
  • High-Quality Output: PNG format with configurable quality
  • HTML Retrieval: Get page HTML content with structure analysis options

Installation

Prerequisites

  • Node.js 18+
  • Chrome browser installed
  • ChromeDriver (automatically managed by Selenium)

For AI Agents (Recommended)

Follow the Getting Started section above for quick setup in Cursor.

For Direct Usage

# Clone the repository
git clone <repository-url>
cd selenium

# Install dependencies
npm install

# Run tests to verify installation
npm test

Usage

Quick Reference for AI Agents

| Command | Description |
| --- | --- |
| Take a screenshot of [URL] | Basic screenshot |
| Take a full page screenshot of [URL] | Capture entire page |
| Take a screenshot of [URL] in mobile viewport | Mobile device testing |
| Take a screenshot of the [element] on [URL] | Element-specific capture |
| Take screenshots of [URL] in mobile, tablet, and desktop | Responsive testing |
| Take a screenshot of [URL] with element highlighting | Debug mode |
| Take a screenshot of [URL] with visible browser | Non-headless mode |
| Get the HTML content of [URL] | Basic HTML retrieval |
| Get the HTML structure of [URL] | Structure mode (default) |
| Get the full HTML of [URL] | Complete HTML content |
| Click the [element] on [URL] | Basic element click |
| Click the [element] on [URL] with visible browser | Non-headless click |
| Type [text] into [field] on [URL] | Basic text input |
| Type [text] into [field] on [URL] with visible browser | Non-headless text input |

Basic Screenshot

// Take a basic screenshot of a webpage
const result = await takeScreenshot({
  url: 'https://example.com',
  viewportPreset: 'desktop',
});

Full Page Screenshot

// Capture the entire page including scrollable content
const result = await takeScreenshot({
  url: 'https://example.com',
  fullPage: true,
  viewportPreset: 'desktop',
});

Element-Specific Screenshot

// Capture only a specific element
const result = await takeScreenshot({
  url: 'https://example.com',
  elementSelector: 'h1',
  highlightElement: true, // Optional: highlight the element for debugging
});

Mobile Viewport

// Test responsive design with mobile viewport
const result = await takeScreenshot({
  url: 'https://example.com',
  viewportPreset: 'mobile',
  fullPage: true,
});

Custom Viewport with Wait Conditions

// Custom viewport with wait conditions
const result = await takeScreenshot({
  url: 'https://example.com',
  viewportPreset: 'custom',
  width: 1200,
  height: 800,
  waitForSelector: '.content-loaded',
  waitTime: 2000,
  userInteractionTime: 3000,
});

Headless Mode Configuration

// Default: headless mode (efficient, no visible browser)
const result = await takeScreenshot({
  url: 'https://example.com',
  headless: true, // default
});

// Non-headless mode (visible browser for debugging)
const visibleResult = await takeScreenshot({
  url: 'https://example.com',
  headless: false, // browser will be visible
});

HTML Retrieval

🚀 PREFERRED METHOD: Use getPageHtml to save HTML to a temp file, then use standard command-line tools for processing.

HTML to File (Recommended)

// Get HTML content and save to temp file
const result = await getPageHtml({
  url: 'https://example.com',
  mode: 'structure', // or 'full'
});

console.log(result.filePath); // e.g., /tmp/page-html-abc123.html

Benefits:

  • File-based approach - LLM can use grep, sed, awk, etc. for any processing
  • No token limits - Content saved to files, process however you want
  • Better performance - No large content in responses
  • Maximum flexibility - Use any command-line tool to filter/analyze HTML
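
The same file-based workflow can be sketched in Node instead of grep. This assumes the saved file has already been read into a string (for example with `fs.readFileSync(result.filePath, 'utf8')`). `extractLinks` is a hypothetical helper, not part of this package, and the regular expression is a rough filter rather than a real HTML parser.

```javascript
// Hypothetical helper (not part of selenium-mcp): collect href values from an HTML string.
// Only quoted href attributes on <a> tags are handled; this is not a full HTML parser.
function extractLinks(html) {
  const links = [];
  const hrefPattern = /<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1/gi;
  let match;
  while ((match = hrefPattern.exec(html)) !== null) {
    links.push(match[2]);
  }
  return links;
}

const sample = '<a href="/docs">Docs</a> <a class="x" href=\'https://example.com\'>Home</a>';
console.log(extractLinks(sample)); // [ '/docs', 'https://example.com' ]
```

For anything beyond quick filtering, a dedicated HTML parser is likely the safer choice.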

With Wait Conditions

// Wait for specific element before getting HTML
const result = await getPageHtml({
  url: 'https://example.com',
  waitForSelector: '.content-loaded',
  waitTime: 2000,
});

Non-Headless Mode

const result = await getPageHtml({
  url: 'https://example.com',
  headless: false,
});

Click Element

Basic Click

// Click an element on a webpage
const result = await clickElement({
  url: 'https://example.com',
  selector: '#submit-button',
});

With Wait Conditions

// Wait for element to be present before clicking
const result = await clickElement({
  url: 'https://example.com',
  selector: '#submit-button',
  waitForSelector: '#form-loaded',
  waitTime: 2000,
});

Non-Headless Mode

// Click with visible browser for debugging
const result = await clickElement({
  url: 'https://example.com',
  selector: '#submit-button',
  headless: false,
});

Type Text

Basic Text Input

// Type text into an input field
const result = await typeText({
  url: 'https://example.com',
  selector: 'input[name="username"]',
  text: 'user@example.com',
});

With Clear First

// Clear field before typing (default behavior)
const result = await typeText({
  url: 'https://example.com',
  selector: 'input[name="username"]',
  text: 'user@example.com',
  clearFirst: true, // default
});

Without Clearing

// Type without clearing existing text
const result = await typeText({
  url: 'https://example.com',
  selector: 'input[name="username"]',
  text: ' @example.com',
  clearFirst: false,
});

With Wait Conditions

// Wait for element to be present before typing
const result = await typeText({
  url: 'https://example.com',
  selector: 'input[name="username"]',
  text: 'user@example.com',
  waitForSelector: '#login-form',
  waitTime: 1000,
});

Non-Headless Mode

// Type with visible browser for debugging
const result = await typeText({
  url: 'https://example.com',
  selector: 'input[name="username"]',
  text: 'user@example.com',
  headless: false,
});

API Reference

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| url | string | required | URL of the page to screenshot |
| viewportPreset | string | 'desktop' | Viewport size preset: 'mobile', 'tablet', 'desktop', 'custom' |
| width | number | 1920 | Custom viewport width (used with viewportPreset: 'custom') |
| height | number | 1080 | Custom viewport height (used with viewportPreset: 'custom') |
| elementSelector | string | - | CSS selector for element-specific screenshot |
| fullPage | boolean | false | Capture full page including scroll |
| waitForSelector | string | - | CSS selector to wait for before screenshot |
| waitTime | number | - | Time to wait after page load (ms) |
| userInteractionTime | number | 5000 | Time to wait for user login/navigation (ms) |
| highlightElement | boolean | false | Highlight target element for debugging |
| headless | boolean | true | Run browser in headless mode (default: true for efficiency) |
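
As a rough illustration of how the documented defaults combine with caller options, the sketch below merges an options object over the defaults listed in the table. `SCREENSHOT_DEFAULTS` and `withScreenshotDefaults` are hypothetical names introduced here; the server's actual merging logic may differ.

```javascript
// Defaults copied from the Parameters table; parameters with no default are left unset.
const SCREENSHOT_DEFAULTS = {
  viewportPreset: 'desktop',
  width: 1920,
  height: 1080,
  fullPage: false,
  userInteractionTime: 5000,
  highlightElement: false,
  headless: true,
};

// Hypothetical helper: require url, then layer caller options over the documented defaults.
function withScreenshotDefaults(options) {
  if (!options || typeof options.url !== 'string' || options.url === '') {
    throw new Error('url is required');
  }
  return { ...SCREENSHOT_DEFAULTS, ...options };
}

const merged = withScreenshotDefaults({ url: 'https://example.com', fullPage: true });
console.log(merged.fullPage, merged.headless, merged.viewportPreset); // true true desktop
```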

HTML Retrieval Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| url | string | required | URL of the page to get HTML from |
| mode | string | 'structure' | HTML retrieval mode: 'structure' (clean DOM) or 'full' (complete) |
| stripElements | string[] | ['script', 'svg', 'style'] | Element types to strip from HTML |
| waitForSelector | string | - | CSS selector to wait for before getting HTML |
| waitTime | number | - | Time to wait after page load (ms) |
| headless | boolean | true | Run browser in headless mode (default: true for efficiency) |

Click Element Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| url | string | required | URL of the page to interact with |
| selector | string | required | CSS selector for the element to click |
| waitForSelector | string | - | CSS selector to wait for before clicking |
| waitTime | number | - | Time to wait after page load (ms) |
| headless | boolean | true | Run browser in headless mode (default: true for efficiency) |

Type Text Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| url | string | required | URL of the page to interact with |
| selector | string | required | CSS selector for the input field |
| text | string | required | Text to type into the field |
| clearFirst | boolean | true | Clear the field before typing (default: true) |
| waitForSelector | string | - | CSS selector to wait for before typing |
| waitTime | number | - | Time to wait after page load (ms) |
| headless | boolean | true | Run browser in headless mode (default: true for efficiency) |

Viewport Presets

| Preset | Width | Height | Use Case |
| --- | --- | --- | --- |
| mobile | 375 | 667 | Mobile device testing |
| tablet | 768 | 1024 | Tablet device testing |
| desktop | 1920 | 1080 | Desktop testing (default) |
| custom | configurable | configurable | Custom dimensions |
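
The preset table can be read as a simple lookup. The sketch below shows one way to turn a preset name into concrete dimensions. `VIEWPORT_PRESETS` and `resolveViewport` are hypothetical names, and the fallback values for 'custom' are taken from the width and height defaults in the Parameters section.

```javascript
// Preset dimensions copied from the Viewport Presets table.
const VIEWPORT_PRESETS = {
  mobile: { width: 375, height: 667 },
  tablet: { width: 768, height: 1024 },
  desktop: { width: 1920, height: 1080 },
};

// Hypothetical helper: resolve options to a { width, height } pair.
function resolveViewport({ viewportPreset = 'desktop', width = 1920, height = 1080 } = {}) {
  if (viewportPreset === 'custom') {
    return { width, height };
  }
  if (!Object.hasOwn(VIEWPORT_PRESETS, viewportPreset)) {
    throw new Error(`Unknown viewportPreset: ${viewportPreset}`);
  }
  return { ...VIEWPORT_PRESETS[viewportPreset] };
}

console.log(resolveViewport({ viewportPreset: 'mobile' })); // { width: 375, height: 667 }
console.log(resolveViewport({ viewportPreset: 'custom', width: 1200, height: 800 })); // { width: 1200, height: 800 }
```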

Screenshot Return Format

{
  success: true,
  mimeType: 'image/png',
  data: 'base64-encoded-image-data',
  size: 12345 // bytes
}
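
A minimal sketch of consuming this return format: the base64 `data` field can be decoded into raw bytes with Node's built-in `Buffer`. `decodeScreenshot` is a hypothetical helper, and the sample below assumes that `size` reports the decoded byte count, which the source does not state explicitly.

```javascript
// Hypothetical helper: turn a screenshot result into raw PNG bytes.
function decodeScreenshot(result) {
  if (!result || result.success !== true) {
    throw new Error('Screenshot result does not indicate success');
  }
  if (result.mimeType !== 'image/png') {
    throw new Error(`Unexpected mimeType: ${result.mimeType}`);
  }
  return Buffer.from(result.data, 'base64');
}

// Stand-in result: the data is only the 8-byte PNG signature, not a real image.
const pngSignature = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const fakeResult = {
  success: true,
  mimeType: 'image/png',
  data: pngSignature.toString('base64'),
  size: 8, // assumed here to mean decoded bytes
};

const bytes = decodeScreenshot(fakeResult);
console.log(bytes.length === fakeResult.size); // true
```

The resulting buffer can then be written to disk, for example with `fs.writeFileSync('page.png', bytes)`.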

HTML Retrieval Return Format

{
  success: true,
  mode: 'structure', // or 'full'
  data: '<html>...</html>', // plain text HTML
  contentLength: 3720, // characters
  stripElements: ['script', 'svg', 'style']
}

Click Element Return Format

// Success
{
  success: true
}

// Error
{
  success: false,
  error: 'Element not found: Check if selector \'#submit\' is correct. The element may not exist on the page.',
  userMessage: 'Element not found: Check if selector \'#submit\' is correct. The element may not exist on the page.'
}

Type Text Return Format

// Success
{
  success: true
}

// Error
{
  success: false,
  error: 'Input field not found: Check if selector \'#username\' is correct. The field may not exist on the page.',
  userMessage: 'Input field not found: Check if selector \'#username\' is correct. The field may not exist on the page.'
}
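
Since both the click and type tools report failure through `success: false` rather than by throwing, callers may want to convert that into an exception. The helper below is a hypothetical sketch of that idea and is not part of the package.

```javascript
// Hypothetical helper: pass successful results through, throw on reported failures.
function ensureSuccess(result) {
  if (result && result.success === true) {
    return result;
  }
  // Prefer userMessage, then error, matching the error format shown above.
  const message = (result && (result.userMessage || result.error)) || 'Unknown tool failure';
  throw new Error(message);
}

try {
  ensureSuccess({ success: false, error: 'Element not found', userMessage: 'Element not found' });
} catch (err) {
  console.log(err.message); // Element not found
}
```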

MCP Server Usage

Starting the Server

# Run the MCP server
node src/server.js

Setting Up in Cursor

For detailed instructions on integrating this server with Cursor, see CURSOR_SETUP.md.

Quick Setup Example:

{
  "mcpServers": {
    "selenium-screenshot": {
      "command": "node",
      "args": ["/path/to/selenium/src/server.js"],
      "env": {
        "NODE_ENV": "production"
      }
    }
  }
}

MCP Tool Registration

The server registers a take_screenshot tool with the following schema:

{
  "name": "take_screenshot",
  "description": "Take a screenshot of a web page with advanced options",
  "inputSchema": {
    "type": "object",
    "properties": {
      "url": { "type": "string", "description": "URL to screenshot" },
      "viewportPreset": {
        "type": "string",
        "enum": ["mobile", "tablet", "desktop", "custom"],
        "default": "desktop"
      },
      "elementSelector": {
        "type": "string",
        "description": "CSS selector for element-specific screenshot"
      },
      "fullPage": {
        "type": "boolean",
        "default": false,
        "description": "Capture full page including scroll"
      },
      "headless": {
        "type": "boolean",
        "default": true,
        "description": "Run browser in headless mode (default: true for efficiency)"
      }
    },
    "required": ["url"]
  }
}
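
MCP clients such as Cursor normally build tool calls for you. For reference, the sketch below shows what such a request could look like, assuming the standard MCP `tools/call` JSON-RPC method. `buildScreenshotCall` and the `id` value are illustrative and not part of this package.

```javascript
// Build a JSON-RPC 2.0 request asking an MCP server to run take_screenshot.
// Only the required "url" property from the schema above is checked here.
function buildScreenshotCall(id, args) {
  if (!args || typeof args.url !== 'string' || args.url.length === 0) {
    throw new Error('url is required');
  }
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'take_screenshot', arguments: args },
  };
}

const request = buildScreenshotCall(1, { url: 'https://example.com', fullPage: true });
console.log(JSON.stringify(request, null, 2));
```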

Testing

Run All Tests

npm test

Run Tests with Coverage

npm run test:coverage

Test Categories

  • Unit Tests: Core functionality and edge cases
  • Integration Tests: Real website testing
  • Viewport Tests: Responsive design validation
  • Element Tests: Element-specific screenshot functionality
  • Full Page Tests: Scroll capture and stitching

Development

Project Structure

selenium/
├── src/
│   ├── server.js          # MCP server with DI pattern
│   ├── logger.js          # Logging utilities
│   └── tools/
│       └── screenshot.js  # Core screenshot functionality
├── test/                  # Test files
├── screenshots/           # Generated screenshots
└── docs/                  # Documentation

Dependency Injection Pattern

This app uses the getDeps pattern for dependency injection. New code should follow this pattern:

  • Define a getDeps function that returns all dependencies
  • main should accept a _getDeps argument (defaulting to getDeps)
  • This enables easy testing and swapping of dependencies
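
The three points above can be sketched as follows. The dependencies shown (`now`, `log`) are illustrative stand-ins chosen for this example, not the server's actual dependency set.

```javascript
// Default dependencies; the entries here are illustrative only.
function getDeps() {
  return {
    now: () => Date.now(),
    log: (message) => console.log(message),
  };
}

// main accepts _getDeps so tests can substitute fakes without touching globals.
function main(label, _getDeps = getDeps) {
  const { now, log } = _getDeps();
  const line = `${label} at ${now()}`;
  log(line);
  return line;
}

// In a test, inject deterministic fakes instead of the real clock and logger.
const logged = [];
const fakeDeps = () => ({ now: () => 1000, log: (message) => logged.push(message) });
console.log(main('capture', fakeDeps)); // capture at 1000
```

Production callers simply call `main('capture')` and get the real dependencies through the default argument.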

Adding New Features

  1. Follow the getDeps pattern for dependency injection
  2. Add comprehensive tests for new functionality
  3. Update documentation with usage examples
  4. Ensure backward compatibility

Error Handling

The server provides clear error messages for common scenarios:

  • Element not found: Returns error when CSS selector doesn't match
  • Page load timeout: Handles slow-loading pages gracefully
  • Invalid URLs: Validates URL format before processing
  • Browser errors: Captures and reports WebDriver errors
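
As an illustration of the URL validation step, the sketch below uses the WHATWG `URL` class built into Node. `isValidHttpUrl` is a hypothetical name, and restricting input to http and https is an assumption; the server's actual rules may be looser or stricter.

```javascript
// Hypothetical check: accept only absolute http(s) URLs.
function isValidHttpUrl(candidate) {
  try {
    const parsed = new URL(candidate);
    return parsed.protocol === 'http:' || parsed.protocol === 'https:';
  } catch {
    return false; // new URL throws a TypeError on unparseable input
  }
}

console.log(isValidHttpUrl('https://example.com')); // true
console.log(isValidHttpUrl('example.com'));         // false
console.log(isValidHttpUrl('file:///etc/hosts'));   // false
```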

Troubleshooting

MCP Logs in Cursor

When using this server with Cursor, you can view detailed logs to troubleshoot issues:

  1. Open Debug Console: In Cursor, go to View → Debug Console (or press Ctrl+Shift+Y / Cmd+Shift+Y)
  2. Look for MCP Logs: The server logs will appear in the Debug Console with timestamps
  3. Common Log Messages:
    • [INFO] Starting screenshot capture - Server is processing your request
    • [DEBUG] Headless mode enabled/disabled - Shows browser visibility setting
    • [ERROR] Screenshot capture failed - Indicates what went wrong
    • [DEBUG] WebDriver initialized/closed - Browser lifecycle events
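
If you copy log output out of the console, a small filter can help isolate errors. The sketch below assumes each line contains a bracketed level followed by the message, as in the examples above. The exact timestamp format is not documented here, so any prefix before the level is ignored. `parseLogLine` is a hypothetical helper.

```javascript
// Hypothetical helper: extract the level and message from a log line such as
// "[ERROR] Screenshot capture failed". Text before the bracketed level is ignored.
function parseLogLine(line) {
  const match = /\[(INFO|DEBUG|ERROR)\]\s*(.*)$/.exec(line);
  return match ? { level: match[1], message: match[2] } : null;
}

const lines = [
  '[INFO] Starting screenshot capture',
  '[DEBUG] Headless mode enabled',
  '[ERROR] Screenshot capture failed',
];
const errors = lines.map(parseLogLine).filter((entry) => entry && entry.level === 'ERROR');
console.log(errors.length, errors[0].message); // 1 Screenshot capture failed
```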

Common Issues

Headless Mode Problems: If headless mode fails in your environment:

  • Try setting headless: false to see the browser window
  • Check if Chrome is installed and accessible
  • Some CI/CD environments may not support headless mode

Timeout Issues: If screenshots are timing out:

  • Increase userInteractionTime for slow-loading pages
  • Use waitForSelector to wait for specific content
  • Check your internet connection

Element Not Found: If element-specific screenshots fail:

  • Verify the CSS selector is correct
  • Use browser dev tools to test the selector
  • Try highlightElement: true to debug element location

Performance Considerations

  • Timeout: 15-second timeout for all page operations
  • Memory: Optimized for large screenshots
  • Concurrency: Single browser instance (no concurrent requests)
  • Caching: No built-in caching (planned for future versions)
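
Because the server uses a single browser instance, a client that issues many requests may want to send them one at a time. The sketch below is a generic client-side promise queue and is not part of this package. `createSerialQueue` is a hypothetical name, and the tasks are stand-ins for real tool calls.

```javascript
// Hypothetical client-side wrapper: run async tasks one at a time, in submission order.
function createSerialQueue() {
  let tail = Promise.resolve();
  return function enqueue(task) {
    const run = tail.then(() => task());
    // Swallow rejections on the internal chain so one failed task does not block later ones.
    tail = run.catch(() => {});
    return run; // the caller still sees the task's own result or rejection
  };
}

const enqueue = createSerialQueue();
const order = [];
const first = enqueue(async () => { order.push('a-start'); await null; order.push('a-end'); });
const second = enqueue(async () => { order.push('b-start'); });
Promise.all([first, second]).then(() => console.log(order.join(','))); // a-start,a-end,b-start
```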

Alpha Usage Guidelines

What's Ready for Production

Core Functionality

  • Basic screenshots with viewport control
  • Full page screenshot capture
  • Element-specific screenshots
  • Multiple viewport presets
  • Wait conditions and timeouts
  • Error handling and logging

Testing & Quality

  • Test coverage of 71% overall
  • Real-world integration testing
  • Error scenario validation
  • Performance testing

Documentation

  • Complete API reference
  • Usage examples
  • Installation instructions
  • Development guidelines

Known Limitations

⚠️ Alpha Limitations

  • Single browser instance (no concurrent requests)
  • No built-in caching or browser pooling
  • Limited to Chrome browser
  • No PDF or video output (planned for Phase 2, Step 3)

Recommended Usage Patterns

  1. Start Simple: Begin with basic screenshots before using advanced features
  2. Test Responsively: Use viewport presets to test different device sizes
  3. Handle Errors: Implement proper error handling for production use
  4. Monitor Performance: Watch for timeout issues with complex pages
  5. Validate Output: Always verify screenshot quality and content

Roadmap

Phase 2, Step 2: Performance Optimizations

  • Browser pooling for concurrent requests
  • Caching mechanisms
  • Performance monitoring

Phase 2, Step 3: Advanced Features

  • PDF generation
  • Video capture
  • Batch processing

Phase 3: Production Readiness

  • Configuration management
  • Monitoring and observability
  • Deployment automation

Contributing

  1. Follow the getDeps pattern for dependency injection
  2. Add tests for new functionality
  3. Update documentation
  4. Ensure all tests pass before submitting

Changelog

Recent Changes

  • Updated demo URLs from Apple Music to example.com for better reliability
  • Added console.log mocking in Jest setup to reduce test verbosity
  • Removed legacy HTML mode documentation sections
  • Improved README structure and formatting

Version History

  • v1.0.0 - Initial release with core Selenium MCP functionality
  • v1.1.0 - Added filtered HTML retrieval capabilities
  • v1.2.0 - Enhanced browser pool management and error handling

License

[Add your license information here]