sentinel-mcp

v1.4.5

Sentinel MCP

MCP-to-browser bridge library for AI-driven test automation

Sentinel MCP is a browser automation library that bridges the Model Context Protocol (MCP) with Playwright, enabling AI agents to interact with web applications through 57 specialized tools. It is built with TypeScript and features intelligent element detection, visual feedback, and automatic Playwright code generation.

✨ Features

  • 57 MCP Tools - Comprehensive browser control across 7 categories
  • Smart Element Detection - 15-stage algorithm with compound components, shadow DOM, and scroll tracking
  • Advanced Element Detection - Optimized constants, detection heuristics, and filtering algorithms
  • Visual Feedback - Annotated screenshots with element highlighting (Retina/HiDPI display support)
  • Scroll Indicators - Page and element scroll positions in serialized output
  • Code Generation - Automatic Playwright code generation for every action
  • CDP Integration - Chrome DevTools Protocol for reliable browser communication
  • TypeScript First - Full type safety with comprehensive type definitions
  • Security Built-in - Input sanitization and XSS protection
  • Test Ready - 177 unit tests, 130 integration tests

📦 Installation

npm install sentinel-mcp

Requirements

  • Node.js >= 18.0.0
  • Playwright browsers (installed automatically)

🚀 Quick Start

import { BrowserOrchestrator } from 'sentinel-mcp';

// Initialize the orchestrator
const orchestrator = new BrowserOrchestrator();
await orchestrator.initialize();

// Navigate to a website
await orchestrator.navigate('https://example.com');

// Take a snapshot with element detection
const snapshot = await orchestrator.takeSnapshot({
  includeScreenshot: true,
  highlightElements: true
});

console.log(`Found ${snapshot.selectorMap.size} interactive elements`);
console.log(snapshot.screenshot); // Base64 screenshot with annotations

// Interact with elements by index (1-based)
await orchestrator.executeTool('fill', {
  index: 1,
  text: 'user@example.com'
});

const clickResult = await orchestrator.executeTool('click', {
  index: 2
});

// Get generated Playwright code (returned on each tool result)
console.log(clickResult.code); // Ready-to-use Playwright test code

// Clean up
await orchestrator.shutdown();

📚 API Documentation

Core Classes

BrowserOrchestrator

Main orchestration class for browser automation.

const orchestrator = new BrowserOrchestrator();

// Initialize browser
await orchestrator.initialize(options?: BrowserOptions);

// Navigate to URL
await orchestrator.navigate(url: string): Promise<void>;

// Take DOM snapshot
await orchestrator.takeSnapshot(options?: SnapshotOptions): Promise<DOMSnapshot>;

// Execute MCP tool
await orchestrator.executeTool(name: string, args: unknown): Promise<ToolResult>;

// Get tool registry
orchestrator.getToolRegistry(): ToolRegistry;

// Get page registry
orchestrator.getPageRegistry(): PageRegistry;

// Shutdown browser
await orchestrator.shutdown(): Promise<void>;

ToolRegistry

Manage and execute tools dynamically.

const registry = new ToolRegistry();

// Get all tools
registry.getAllTools(): ToolDefinition[];

// Get tool by name
registry.getTool(name: string): ToolDefinition | undefined;

// Get all tool names
registry.getToolNames(): string[];

// Execute tool
await registry.executeTool(name: string, args: unknown, context: ToolExecutionContext): Promise<ToolResult>;

ToolResponse

Format tool responses with code generation.

const response = new ToolResponse();

// Add text output
response.appendLine('Operation successful');

// Add generated code
response.addCode('await page.click("#button")');

// Set metadata
response.setMetadata('elementCount', 5);

// Format response
await response.format(): Promise<ToolResult>;

🛠️ Tools

Sentinel MCP provides 57 tools across 7 categories:

Actions (9 tools)

Navigate and interact with web elements:

  • navigate - Navigate to URLs with configurable wait options
  • click - Click elements (supports right-click, double-click)
  • fill - Fill input fields with optional Enter key
  • scroll - Scroll page or elements in any direction
  • hover - Hover over elements for tooltips/menus
  • rightClick - Trigger context menus
  • doubleClick - Double-click elements
  • dragAndDrop - Drag and drop between elements
  • pressKey - Keyboard input with modifier keys

Page Operations (14 tools)

Control page behavior and state:

  • getTitle - Get current page title
  • getUrl - Get current page URL
  • goBack - Navigate backward in history
  • goForward - Navigate forward in history
  • reload - Reload current page
  • getTabs - List all open tabs
  • selectTab - Switch between tabs
  • newTab - Open new tab with optional URL
  • closeTab - Close current or specified tab
  • evaluate - Execute JavaScript in page context (with security validation)
  • executePlaywright - Execute Playwright API code with accurate line numbers
  • getConsoleLogs - Retrieve browser console logs
  • setViewport - Change viewport dimensions
  • handleDialog - Configure alert/confirm/prompt handling

Forms (8 tools)

Specialized form interactions:

  • selectOption - Select dropdown options (by value, label, or index)
  • checkCheckbox - Check checkbox elements
  • uncheckCheckbox - Uncheck checkbox elements
  • uploadFile - Upload files to file inputs
  • clearInput - Clear input field values
  • submitForm - Submit forms
  • type - Type text with configurable delay for natural input
  • focus - Focus on input elements

Assertions (7 tools)

Verify page state and content:

  • assertVisible - Assert element is visible
  • assertText - Assert element text content
  • assertValue - Assert input element value
  • assertAttribute - Assert element attribute value
  • assertUrl - Assert current URL matches pattern
  • assertTitle - Assert page title
  • assertExists - Assert element exists in DOM

Inspection (8 tools)

Query and inspect elements:

  • getElementText - Get text content of elements
  • getElementAttribute - Get element attribute values
  • getElementValue - Get input element values
  • isVisible - Check if element is visible
  • isEnabled - Check if element is enabled
  • isChecked - Check checkbox/radio state
  • getElements - Query multiple elements
  • queryPage - Advanced element queries with filters

Waits (7 tools)

Synchronize with page state:

  • waitForElement - Wait for element to appear
  • waitForNavigation - Wait for page navigation to complete
  • waitForLoadState - Wait for specific load state (load, domcontentloaded, networkidle)
  • waitForTimeout - Wait for fixed duration
  • waitForFunction - Wait for custom JavaScript condition
  • waitForSelector - Wait for CSS selector to match
  • waitForUrl - Wait for URL to match pattern

Dialogs (4 tools)

Handle browser dialogs:

  • acceptDialog - Accept alerts/confirms/prompts
  • dismissDialog - Dismiss/cancel dialogs
  • getDialogMessage - Get dialog message text
  • typeIntoDialog - Type into prompt dialogs

💡 Usage Examples

Basic Navigation and Interaction

const orchestrator = new BrowserOrchestrator();
await orchestrator.initialize();

// Navigate to a page
await orchestrator.navigate('https://example.com');

// Take snapshot to see available elements
const snapshot = await orchestrator.takeSnapshot({
  includeScreenshot: true,
  highlightElements: true
});

// Elements are indexed starting from 1
snapshot.selectorMap.forEach((el, idx) => {
  console.log(`[${idx}] ${el.tagName} - ${el.meaningfulText || el.attributes.placeholder || el.ariaLabel}`);
});

// Interact with specific element by index
await orchestrator.executeTool('click', { index: 1 });

Form Filling Workflow

// Navigate to login page
await orchestrator.navigate('https://app.example.com/login');

// Fill login form
await orchestrator.executeTool('fill', {
  index: 1, // Email input
  text: 'user@example.com'
});

await orchestrator.executeTool('fill', {
  index: 2, // Password input
  text: 'secretPassword',
  pressEnter: true // Submit form after filling
});

// Wait for navigation
await orchestrator.executeTool('waitForNavigation', {
  timeout: 5000
});

// Verify login success
await orchestrator.executeTool('assertUrl', {
  pattern: '/dashboard'
});

Assertions and Testing

// Navigate to page
await orchestrator.navigate('https://example.com/profile');

// Verify page state
await orchestrator.executeTool('assertTitle', {
  expected: 'User Profile'
});

await orchestrator.executeTool('assertVisible', {
  index: 1 // Profile picture element
});

await orchestrator.executeTool('assertText', {
  index: 3,
  text: 'Welcome back!'
});

// Check input values
const result = await orchestrator.executeTool('getElementValue', {
  index: 2
});
console.log('Current value:', result.metadata?.value);

Advanced: Execute Custom Playwright Code

// Execute Playwright API code directly
const result = await orchestrator.executeTool('executePlaywright', {
  code: `
    // Access to 'page' object
    const title = await page.title();
    const cookies = await page.context().cookies();

    // Perform complex operations
    await page.evaluate(() => {
      localStorage.setItem('theme', 'dark');
    });

    // Return values
    return { title, cookieCount: cookies.length };
  `,
  timeout: 30000
});

console.log(result.metadata?.result); // { title: '...', cookieCount: 3 }

Working with Multiple Tabs

// Open new tab
await orchestrator.executeTool('newTab', {
  url: 'https://example.com/page2'
});

// List all tabs
const tabs = await orchestrator.executeTool('getTabs', {});
console.log(tabs.metadata?.tabs);

// Switch to second tab (the tab list lives in the result metadata)
await orchestrator.executeTool('selectTab', { tabId: tabs.metadata?.tabs[1].id });

// Close current tab
await orchestrator.executeTool('closeTab', {});

Handling Dialogs

// Configure dialog handling before triggering
await orchestrator.executeTool('handleDialog', {
  action: 'accept',
  promptText: 'Yes, I agree' // For prompt dialogs
});

// Trigger action that opens dialog
await orchestrator.executeTool('click', { index: 10 });

// Dialog is automatically handled based on configuration

🏗️ Architecture

Element Detection Algorithm

Sentinel MCP uses a 15-stage element detection algorithm:

  1. CDP DOM Snapshot - Capture full DOM state via DOMSnapshot.captureSnapshot
  2. Accessibility Integration - Include ARIA roles, labels, and attributes
  3. DevicePixelRatio Detection - Calculate via Page.getLayoutMetrics (deviceWidth / cssWidth)
  4. Coordinate Transformation - Convert CDP device pixels → CSS pixels
  5. Interactivity Detection - Tags, roles, cursors, event handlers (using optimized constant sets)
  6. Clickable Element Detection - Multi-factor detection using tags, roles, cursors, and attributes
  7. Compound Component Detection - Identify date pickers, color pickers, range sliders, custom selects
  8. Shadow DOM Traversal - Process shadow roots and shadow DOM elements
  9. Bounding Box Filtering - Remove elements 99% contained within propagating parents
  10. Paint Order Filtering - O(n) RectUnion algorithm to remove occluded elements
  11. Viewport Filtering - Remove elements outside visible viewport
  12. Scroll Position Tracking - Capture page and element scroll data
  13. Deduplication - Remove duplicate selectors
  14. DOM Order Sorting - Maintain document order
  15. Index Assignment - Assign stable 1-based indices for tool use
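
The last three stages (deduplication, DOM order sorting, index assignment) can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the `Candidate` shape, its field names, and the `assignIndices` helper are assumptions made for the example.

```typescript
// Hypothetical candidate shape; field names are assumptions for illustration.
interface Candidate {
  selector: string; // selector identifying the element
  domOrder: number; // position of the node in document order
}

interface IndexedElement extends Candidate {
  index: number; // 1-based index exposed to tools
}

function assignIndices(candidates: Candidate[]): IndexedElement[] {
  // Stage 13: keep only the first candidate seen for each selector.
  const seen = new Set<string>();
  const unique = candidates.filter((c) => {
    if (seen.has(c.selector)) return false;
    seen.add(c.selector);
    return true;
  });

  // Stage 14: restore document order.
  unique.sort((a, b) => a.domOrder - b.domOrder);

  // Stage 15: number the survivors starting from 1.
  return unique.map((c, i) => ({ ...c, index: i + 1 }));
}

const indexed = assignIndices([
  { selector: '#password', domOrder: 7 },
  { selector: '#email', domOrder: 4 },
  { selector: '#email', domOrder: 4 }, // duplicate, dropped in stage 13
]);
```

Here `indexed` holds `#email` with index 1 followed by `#password` with index 2, which is the numbering a tool call such as `{ index: 1 }` would refer to.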

Element Detection Implementation

Sentinel MCP uses optimized element detection with the following approach:

Constants

All detection constants are Sets for O(1) lookup performance:

  • INTERACTIVE_TAGS: button, input, select, textarea, a, details, summary, option, optgroup (excludes label)
  • INTERACTIVE_ROLES: button, link, menu, menuitem, option, radio, checkbox, tab, textbox, combobox, slider, spinbutton, search, searchbox, listbox
  • EVENT_HANDLER_ATTRIBUTES: onclick, onmousedown, onmouseup, onkeydown, onkeyup, tabindex (no touch events)
  • CLICKABLE_CURSORS: Only pointer (not grab, text, etc.)
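
As a rough sketch of how such Sets might be combined, the check below treats a node as interactive if any one of the four lookups matches. The Set contents come from the list above; the `NodeInfo` shape and the `isInteractive` function are assumptions for illustration, and the real multi-factor logic may weigh these signals differently.

```typescript
const INTERACTIVE_TAGS = new Set([
  'button', 'input', 'select', 'textarea', 'a',
  'details', 'summary', 'option', 'optgroup',
]);
const INTERACTIVE_ROLES = new Set([
  'button', 'link', 'menu', 'menuitem', 'option', 'radio', 'checkbox', 'tab',
  'textbox', 'combobox', 'slider', 'spinbutton', 'search', 'searchbox', 'listbox',
]);
const EVENT_HANDLER_ATTRIBUTES = new Set([
  'onclick', 'onmousedown', 'onmouseup', 'onkeydown', 'onkeyup', 'tabindex',
]);
const CLICKABLE_CURSORS = new Set(['pointer']);

// Hypothetical node description used only by this sketch.
interface NodeInfo {
  tagName: string;
  attributes: Record<string, string>;
  cursor?: string; // computed CSS cursor, if known
}

function isInteractive(node: NodeInfo): boolean {
  // Each check is a constant-time Set lookup.
  if (INTERACTIVE_TAGS.has(node.tagName.toLowerCase())) return true;

  const role = node.attributes['role'];
  if (role !== undefined && INTERACTIVE_ROLES.has(role)) return true;

  if (node.cursor !== undefined && CLICKABLE_CURSORS.has(node.cursor)) return true;

  // Any listed handler attribute (or tabindex) marks the node as interactive.
  return Object.keys(node.attributes).some((name) =>
    EVENT_HANDLER_ATTRIBUTES.has(name.toLowerCase()),
  );
}
```

With these Sets, a `<div role="button">` or a `<div onclick="...">` counts as interactive, while a `<label>` or a `<div>` with `cursor: grab` does not.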

Paint Order O(n) Algorithm

Uses RectUnion class for efficient occlusion detection:

  • Groups elements by paint order (z-index)
  • Processes from highest to lowest paint order
  • Tracks covered area using union of rectangles
  • Only adds opaque elements (opacity >= 0.8, non-transparent background)
  • O(n) complexity vs naive O(n²) approach
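
The sketch below illustrates the general idea of paint-order occlusion filtering in simplified form. It is not the `RectUnion` class and does not reach O(n): covered area is kept as a plain list of rectangles, and an element counts as hidden only when a single opaque rectangle from a higher paint order fully contains it. The `PaintedElement` shape and the `filterOccluded` function are assumptions for the example.

```typescript
interface Rect { x: number; y: number; width: number; height: number; }

// Hypothetical element description used only by this sketch.
interface PaintedElement {
  id: string;
  rect: Rect;
  paintOrder: number;     // higher values are painted on top
  opacity: number;
  hasBackground: boolean; // true when the background is not transparent
}

function contains(outer: Rect, inner: Rect): boolean {
  return inner.x >= outer.x && inner.y >= outer.y &&
    inner.x + inner.width <= outer.x + outer.width &&
    inner.y + inner.height <= outer.y + outer.height;
}

function filterOccluded(elements: PaintedElement[]): PaintedElement[] {
  // Group elements by paint order.
  const groups = new Map<number, PaintedElement[]>();
  for (const el of elements) {
    const group = groups.get(el.paintOrder) ?? [];
    group.push(el);
    groups.set(el.paintOrder, group);
  }

  // Process groups from highest to lowest paint order.
  const orders = Array.from(groups.keys()).sort((a, b) => b - a);
  const covering: Rect[] = []; // simplified stand-in for a rectangle union
  const visible: PaintedElement[] = [];

  for (const order of orders) {
    const group = groups.get(order) ?? [];
    // Elements in the same group do not occlude each other.
    for (const el of group) {
      if (!covering.some((r) => contains(r, el.rect))) visible.push(el);
    }
    // Only opaque elements add to the covered area.
    for (const el of group) {
      if (el.opacity >= 0.8 && el.hasBackground) covering.push(el.rect);
    }
  }
  return visible;
}
```

A button sitting entirely under an opaque modal is dropped, while the same button under a half-transparent overlay is kept, because the overlay never enters the covered area.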

Bounding Box Filtering

Removes redundant nested elements:

  • Filters out children 99% contained within propagating parents (<a>, <button>)
  • Reduces snapshot size significantly
  • Maintains interactive element hierarchy
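
One way to express the 99% containment test is to compare the overlap area with the child's own area. The following is a minimal sketch under that assumption; `containedFraction`, `isRedundantChild`, and the constant names are hypothetical and only illustrate the rule described above.

```typescript
interface Rect { x: number; y: number; width: number; height: number; }

// Parents whose interactivity covers their children, per the list above.
const PROPAGATING_TAGS = new Set(['a', 'button']);
const CONTAINMENT_THRESHOLD = 0.99;

// Fraction of the child's area that lies inside the parent (0 to 1).
function containedFraction(child: Rect, parent: Rect): number {
  const overlapWidth = Math.max(
    0,
    Math.min(child.x + child.width, parent.x + parent.width) - Math.max(child.x, parent.x),
  );
  const overlapHeight = Math.max(
    0,
    Math.min(child.y + child.height, parent.y + parent.height) - Math.max(child.y, parent.y),
  );
  const childArea = child.width * child.height;
  return childArea === 0 ? 0 : (overlapWidth * overlapHeight) / childArea;
}

function isRedundantChild(child: Rect, parentTag: string, parentRect: Rect): boolean {
  return PROPAGATING_TAGS.has(parentTag.toLowerCase()) &&
    containedFraction(child, parentRect) >= CONTAINMENT_THRESHOLD;
}
```

A `<span>` fully inside a `<button>` is filtered out, whereas a child that sticks halfway out of the button, or any child of a non-propagating parent such as a `<div>`, is kept.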

Snapshot Caching

Performance optimization with automatic expiration:

  • 5-second TTL cache prevents redundant DOM processing for same URL+viewport
  • Automatic expiration and deterministic pruning (every 10 snapshots)
  • Cache key based on MD5 hash of URL and viewport dimensions
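
A cache with these properties could look roughly like the sketch below. It is an illustration rather than the library's code: the `SnapshotCache` class and its parameters are assumptions, the clock is injectable so expiry can be demonstrated without waiting, and a plain `url|widthxheight` string stands in for the MD5 hash described above.

```typescript
interface Viewport { width: number; height: number; }
interface CacheEntry<T> { value: T; expiresAt: number; }

class SnapshotCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private writes = 0;
  private ttlMs: number;
  private pruneEvery: number;
  private now: () => number;

  constructor(ttlMs = 5000, pruneEvery = 10, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.pruneEvery = pruneEvery;
    this.now = now;
  }

  // Plain string key; the real cache is described as hashing URL + viewport with MD5.
  private key(url: string, viewport: Viewport): string {
    return `${url}|${viewport.width}x${viewport.height}`;
  }

  get(url: string, viewport: Viewport): T | undefined {
    const entry = this.entries.get(this.key(url, viewport));
    if (!entry || entry.expiresAt <= this.now()) return undefined; // missing or expired
    return entry.value;
  }

  set(url: string, viewport: Viewport, value: T): void {
    this.entries.set(this.key(url, viewport), { value, expiresAt: this.now() + this.ttlMs });
    this.writes += 1;
    // Deterministic pruning: sweep expired entries on every Nth write.
    if (this.writes % this.pruneEvery === 0) this.prune();
  }

  private prune(): void {
    const t = this.now();
    this.entries.forEach((entry, key) => {
      if (entry.expiresAt <= t) this.entries.delete(key);
    });
  }

  get size(): number {
    return this.entries.size;
  }
}

// Usage with a fake clock (milliseconds).
let clock = 0;
const cache = new SnapshotCache<string>(5000, 10, () => clock);
const viewport = { width: 1280, height: 720 };
cache.set('https://example.com', viewport, 'snapshot-a');
const fresh = cache.get('https://example.com', viewport); // within the 5-second TTL
clock = 6000;
const stale = cache.get('https://example.com', viewport); // past the TTL
```

In this run `fresh` is the stored snapshot and `stale` is `undefined`. A lookup with a different viewport size would also miss, since the viewport is part of the key.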

Display Scaling & Coordinate Systems

Proper handling of Retina/HiDPI displays (2x, 3x scale factors):

DevicePixelRatio Calculation

// Via Page.getLayoutMetrics (NOT Performance.getMetrics)
const devicePixelRatio = deviceWidth / cssWidth;
// Example: 2560 / 1280 = 2.0 (Retina display)

Coordinate Transformation

  1. CDP Returns: Bounds in device pixels (e.g., [0, 0, 2560, 1440])
  2. Parser Converts: Divide by devicePixelRatio → CSS pixels (e.g., [0, 0, 1280, 720])
  3. Screenshot: Playwright captures at CSS resolution (1280x720)
  4. Highlights: Draw using CSS pixel coordinates directly (no scaling needed)

This ensures pixel-perfect element highlighting on all display types.
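
The arithmetic behind the first two steps is small enough to show directly. The helper names below (`computeDevicePixelRatio`, `toCssPixels`) and the fallback to a ratio of 1 for a non-positive CSS width are assumptions made for this sketch.

```typescript
// [x, y, width, height]
type Bounds = [number, number, number, number];

function computeDevicePixelRatio(deviceWidth: number, cssWidth: number): number {
  // Assumed fallback: treat an unusable CSS width as a 1x display.
  if (cssWidth <= 0) return 1;
  return deviceWidth / cssWidth;
}

function toCssPixels(bounds: Bounds, devicePixelRatio: number): Bounds {
  const [x, y, width, height] = bounds;
  return [
    x / devicePixelRatio,
    y / devicePixelRatio,
    width / devicePixelRatio,
    height / devicePixelRatio,
  ];
}

const dpr = computeDevicePixelRatio(2560, 1280);      // 2 (Retina display)
const cssBounds = toCssPixels([0, 0, 2560, 1440], dpr); // [0, 0, 1280, 720]
```

Because the screenshot is captured at CSS resolution, `cssBounds` can be drawn onto it without any further scaling.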

Serialization Features

The serialized DOM output includes:

Element Markers

Interactive elements are marked with [index] in the output:

<button class="submit">[1]Submit</button>
<input type="email" placeholder="Email">[2]

Scroll Indicators

Page scroll position at the top:

PAGE_SCROLL: (V:100px/2000px)

Scrollable elements show scroll capability:

<div class="scrollable">[5] (scroll: 0px/500px)

Shadow DOM

Shadow roots are represented:

<custom-element>
  #shadow-root
    <button>[3]Click Me</button>
</custom-element>

Minimal Output

Only required nodes are serialized:

  • Interactive elements themselves
  • All ancestor elements (for hierarchy context)
  • Bounding box filtered (removes 99% contained children)
  • Paint order filtered (removes occluded elements)

Result: ~5KB DOM snapshots instead of full 500KB+ page HTML
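
To make the output format concrete, here is a small sketch that reproduces the element markers and scroll indicators shown earlier in this section. The `SimpleElement` shape, the function names, and the reading of the second scroll number as the total scrollable height are assumptions; the actual serializer is likely more involved.

```typescript
interface ScrollInfo { top: number; height: number; }

// Hypothetical element description used only by this sketch.
interface SimpleElement {
  tag: string;
  attributes: Record<string, string>;
  text?: string;
  index?: number;      // present only for interactive elements
  scroll?: ScrollInfo; // present only for scrollable elements
}

const VOID_TAGS = new Set(['input', 'img', 'br', 'hr']);

function formatPageScroll(top: number, height: number): string {
  return `PAGE_SCROLL: (V:${Math.round(top)}px/${Math.round(height)}px)`;
}

// Opening tag followed by the [index] marker and optional scroll indicator.
function serializeOpenTag(el: SimpleElement): string {
  const attrs = Object.entries(el.attributes)
    .map(([name, value]) => ` ${name}="${value}"`)
    .join('');
  const marker = el.index !== undefined ? `[${el.index}]` : '';
  const scroll = el.scroll
    ? ` (scroll: ${Math.round(el.scroll.top)}px/${Math.round(el.scroll.height)}px)`
    : '';
  return `<${el.tag}${attrs}>${marker}${scroll}`;
}

// Leaf element: opening tag, text, and closing tag (none for void elements).
function serializeLeaf(el: SimpleElement): string {
  const open = serializeOpenTag(el);
  if (VOID_TAGS.has(el.tag)) return open;
  return `${open}${el.text ?? ''}</${el.tag}>`;
}

const buttonLine = serializeLeaf({
  tag: 'button', attributes: { class: 'submit' }, text: 'Submit', index: 1,
});
// <button class="submit">[1]Submit</button>
```

The same helpers yield `<input type="email" placeholder="Email">[2]` for a void element and `<div class="scrollable">[5] (scroll: 0px/500px)` for a scrollable container.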

Code Generation

Every tool execution generates equivalent Playwright code:

const result = await orchestrator.executeTool('fill', {
  index: 1,
  text: 'user@example.com'
});

console.log(result.code);
// Output:
// await page.locator('#email').fill('user@example.com');

The serialized DOM output also includes [index] markers for easy identification:

PAGE_SCROLL: (V:0px/2000px)

<form class="login-form">
  <input type="email" id="email">[1]
  <input type="password" id="password">[2]
  <button type="submit">[3]Login</button>
</form>

Store and replay these generated commands for:

  • Test automation
  • Workflow recording
  • Debugging and inspection
  • Documentation generation
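
Conceptually, code generation maps a tool call plus a resolved selector onto a line of Playwright code. The sketch below shows one possible mapping for three tools. The `ToolCall` type and the `generatePlaywrightCode` and `quote` helpers are assumptions for illustration, while `page.goto`, `page.locator`, `click`, and `fill` are standard Playwright calls.

```typescript
// Hypothetical tool-call shapes; the selector is assumed to be resolved already.
type ToolCall =
  | { tool: 'navigate'; url: string }
  | { tool: 'click'; selector: string }
  | { tool: 'fill'; selector: string; text: string };

// Wrap a value in single quotes, escaping backslashes and quotes.
function quote(value: string): string {
  return `'${value.replace(/\\/g, '\\\\').replace(/'/g, "\\'")}'`;
}

function generatePlaywrightCode(call: ToolCall): string {
  switch (call.tool) {
    case 'navigate':
      return `await page.goto(${quote(call.url)});`;
    case 'click':
      return `await page.locator(${quote(call.selector)}).click();`;
    case 'fill':
      return `await page.locator(${quote(call.selector)}).fill(${quote(call.text)});`;
  }
}

const generated = generatePlaywrightCode({
  tool: 'fill',
  selector: '#email',
  text: 'user@example.com',
});
// await page.locator('#email').fill('user@example.com');
```

Collecting these lines in order gives a script that can be pasted into a Playwright test body for replay.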

CDP Integration

Uses Chrome DevTools Protocol for reliable browser communication:

  • Session Management - Automatic CDP session creation and recovery
  • DOM Snapshots - Fast, accurate DOM state capture
  • Event Handling - Real-time browser events
  • Console Capture - Intercept console logs
  • Network Monitoring - Track network requests (future feature)

⚙️ Configuration

BrowserOptions

interface BrowserOptions {
  headless?: boolean;          // Default: false
  viewport?: {
    width: number;            // Default: 1280
    height: number;           // Default: 720
  };
  slowMo?: number;            // Slow down operations (ms)
  devtools?: boolean;         // Open devtools
  timeout?: number;           // Default timeout (ms)
  userAgent?: string;         // Custom user agent
}

await orchestrator.initialize({
  headless: false,
  viewport: { width: 1920, height: 1080 },
  slowMo: 100
});

SnapshotOptions

interface SnapshotOptions {
  includeScreenshot?: boolean;    // Include base64 screenshot
  fullPage?: boolean;             // Full page vs viewport screenshot
  highlightElements?: boolean;    // Annotate elements on screenshot
}

const snapshot = await orchestrator.takeSnapshot({
  includeScreenshot: true,
  fullPage: false,
  highlightElements: true
});

🧪 Testing

# Run all tests
npm test

# Run unit tests only
npm run test:unit

# Run integration tests only
npm run test:integration

# Run tests in watch mode
npm run test:unit:watch

# Run tests with coverage
npm run test:coverage

Test Structure

  • Unit Tests (177) - Test individual components and utilities
  • Integration Tests (130) - Test real browser interactions
  • Test Coverage - High coverage across all modules

🔒 Security

Input Sanitization

The evaluate tool sanitizes JavaScript code to prevent:

  • Prototype pollution (__proto__, constructor.prototype)
  • Constructor access for code execution
  • Direct eval() calls
  • Unsafe patterns
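
A pattern-based pre-check along these lines could be sketched as follows. The patterns and the `findUnsafePatterns` function are assumptions chosen to mirror the list above, not the checks the `evaluate` tool actually runs, and regular expressions alone are not a substitute for a real sandbox.

```typescript
// Hypothetical deny-list; each entry pairs a label with a pattern.
const UNSAFE_PATTERNS: { name: string; pattern: RegExp }[] = [
  { name: 'prototype pollution (__proto__)', pattern: /__proto__/ },
  { name: 'prototype pollution (constructor.prototype)', pattern: /constructor\s*\.\s*prototype/ },
  { name: 'constructor access', pattern: /\.\s*constructor\s*\(/ },
  { name: 'direct eval', pattern: /\beval\s*\(/ },
];

// Returns the labels of every pattern found in the code; empty means no match.
function findUnsafePatterns(code: string): string[] {
  return UNSAFE_PATTERNS
    .filter(({ pattern }) => pattern.test(code))
    .map(({ name }) => name);
}

const safe = findUnsafePatterns('document.title');  // no matches
const flagged = findUnsafePatterns('eval("1 + 1")'); // matches 'direct eval'
```

A caller would reject the script when the returned list is non-empty and report the labels back to the agent.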

Safe Code Execution

The executePlaywright tool:

  • Executes code in isolated async function scope
  • Provides accurate error line numbers for debugging
  • Cleans up temporary files automatically
  • Enforces execution timeouts

📖 TypeScript Support

Full TypeScript support with comprehensive type definitions:

import {
  BrowserOrchestrator,
  ToolRegistry,
  ToolResponse,
  ToolResult,
  DOMSnapshot,
  ElementData,
  BrowserOptions
} from 'sentinel-mcp';

🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Development Setup

# Clone repository
git clone https://github.com/KylixMedusa/sentinel-mcp.git
cd sentinel-mcp

# Install dependencies
npm install

# Build
npm run build

# Run tests
npm test

# Run linter
npm run lint

# Type checking
npm run typecheck

📝 Changelog

See CHANGELOG.md for release history and breaking changes.

📄 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments

  • Playwright - Browser automation framework
  • MCP Protocol - Model Context Protocol specification

📊 Project Stats

  • 57 Tools across 7 categories
  • 15-Stage Detection Algorithm with advanced filtering
  • 177 Unit Tests with Vitest
  • 130 Integration Tests with Playwright
  • TypeScript - Full type safety
  • Node.js >= 18.0.0
  • Retina/HiDPI Support - 2x, 3x displays

Made with ❤️ for AI-driven browser automation