@mindstudio-ai/browser-agent

v0.1.42

Browser-side agent for MindStudio dev previews — captures logs, provides DOM snapshots, and enables remote interaction.

Browser-side agent for MindStudio dev previews. Injected into app preview pages via the dev tunnel proxy. Captures browser events, provides DOM snapshots, enables remote interaction by AI agents, and supports user annotations for visual feedback.

How it works

The dev tunnel proxy injects <script src> into every HTML response (default: ngrok dev URL, fallback: unpkg latest). This script runs inside the app's preview (either in the MindStudio IDE iframe or a standalone tab) and communicates with the tunnel via HTTP endpoints on the proxy.

AI Agent ──stdin──▶ Tunnel ──queue──▶ Proxy endpoint
                                          │
Browser agent ◀──GET /commands────────────┘
Browser agent ──POST /results──▶ Proxy ──stdout──▶ AI Agent
Browser agent ──POST /logs────▶ Proxy ──file──▶ .logs/browser.ndjson

Frontend ──postMessage──▶ Browser agent (notes mode, screenshots)

Features

Log capture (always active)

Captures browser events and POSTs them to /__mindstudio_dev__/logs, which the tunnel writes to .logs/browser.ndjson:

  • Console -- overrides console.log/info/warn/error/debug and passes each call through to the original method
  • JS errors -- window.addEventListener('error') with message, stack, source, line, column
  • Unhandled rejections -- window.addEventListener('unhandledrejection')
  • Network requests -- monkey-patches fetch and XMLHttpRequest to log all requests (method, URL, status, duration, response body for failures)
  • Click interactions -- capture-phase click listener with accessible element descriptions

Log entries are batched and flushed every 2 seconds, or immediately on errors. Uses navigator.sendBeacon on page unload.
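
The batching rule above can be sketched roughly as follows. This is a minimal illustration, not the package's actual transport code: `LogBuffer`, `LogEntry`, and the injectable `send` and `schedule` callbacks are hypothetical names. In the browser, `send` would presumably POST the batch to /__mindstudio_dev__/logs.

```typescript
// Hypothetical sketch: names and entry shape are assumptions, not the package's API.
interface LogEntry {
  type: string; // e.g. "console", "error", "network"
  level?: string;
  message: string;
}

type Send = (batch: LogEntry[]) => void;
type Schedule = (fn: () => void, ms: number) => void;

class LogBuffer {
  private entries: LogEntry[] = [];
  private timerArmed = false;
  private send: Send;
  private schedule: Schedule;
  private intervalMs: number;

  constructor(send: Send, schedule?: Schedule, intervalMs = 2000) {
    this.send = send;
    // Default scheduler is a plain timer; tests can inject their own.
    this.schedule = schedule ?? ((fn, ms) => { setTimeout(fn, ms); });
    this.intervalMs = intervalMs; // 2-second flush interval, as described above
  }

  push(entry: LogEntry): void {
    this.entries.push(entry);
    if (entry.type === "error" || entry.level === "error") {
      this.flush(); // errors are sent immediately
      return;
    }
    if (!this.timerArmed) {
      this.timerArmed = true;
      this.schedule(() => {
        this.timerArmed = false;
        this.flush();
      }, this.intervalMs);
    }
  }

  flush(): void {
    if (this.entries.length === 0) return; // nothing to send
    const batch = this.entries;
    this.entries = [];
    this.send(batch);
  }
}
```

Injecting the scheduler keeps the sketch testable without real timers; a production version would also need an unload hook for the sendBeacon path.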

All monkey-patches are guarded against stacking on HMR/reload (checked via __ms_patched flags on the patched objects).
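
As an illustration of the guard, here is a minimal sketch of a patch-guarded console override. The `__ms_patched` flag name comes from the description above; `patchConsole`, `ConsoleLike`, and the `onCall` callback are hypothetical, and the real module may be structured differently.

```typescript
// Hypothetical sketch of a patch guard; not the package's actual console.ts.
type ConsoleLevel = "log" | "info" | "warn" | "error" | "debug";
type ConsoleLike = Record<ConsoleLevel, (...args: unknown[]) => void> & {
  __ms_patched?: boolean;
};

const LEVELS: ConsoleLevel[] = ["log", "info", "warn", "error", "debug"];

function patchConsole(
  target: ConsoleLike,
  onCall: (level: ConsoleLevel, args: unknown[]) => void,
): boolean {
  // If the flag is already set (e.g. the script re-ran after HMR), do nothing.
  if (target.__ms_patched) return false;
  for (const level of LEVELS) {
    const original = target[level];
    target[level] = (...args: unknown[]): void => {
      onCall(level, args);          // record the call
      original.apply(target, args); // then pass it through to the original
    };
  }
  target.__ms_patched = true;
  return true;
}
```

Calling `patchConsole` a second time on the same object returns `false` and leaves the wrappers untouched, so each console call is recorded once no matter how often the script is re-injected.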

DOM snapshots

Compact, token-efficient accessibility-tree-style representation of the page. Designed for AI agent consumption (~200-400 tokens for a typical page).

navigation "Generate Collection" [ref=e1]
  button "Generate" [ref=e2]
  button "Collection" [ref=e3]
textbox [value=""] [placeholder="enter a topic..."] [ref=e4]
button "Generate" [disabled] [ref=e5]
paragraph "5 · 7 · 5"

Key design decisions:

  • Semantic roles and accessible names, not CSS classes -- handles styled-components/CSS-in-JS apps where class names are generated hashes
  • Transparent element collapsing -- generic <div>/<span> wrappers without roles disappear from the tree, children float up to the nearest semantic ancestor
  • Cursor-interactive detection -- elements with cursor: pointer or onclick are included even if they're generic divs
  • Block/inline spacing -- text from block-level children gets spaces between them (fixes concatenated text from nested components)
  • Network idle wait -- takeSnapshot() waits for all fetch/XHR requests to settle (200ms quiet period, 5s max) before walking the DOM
  • Stable refs -- interactive elements get [ref=eN] identifiers for command targeting
  • Form state -- shows [value="..."], [placeholder="..."], [disabled], [checked], [open]
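
To make the collapsing and ref assignment concrete, here is a small sketch that renders a plain-object tree in the format shown above. It does not touch the real DOM: `SketchNode` and `renderSnapshot` are hypothetical, and role detection, accessible-name computation, and the network idle wait are all left out.

```typescript
// Hypothetical sketch over plain objects; the real walker operates on DOM nodes.
interface SketchNode {
  role?: string;         // undefined for generic <div>/<span> wrappers
  name?: string;         // accessible name, if any
  attrs?: string[];      // form state such as "disabled" or 'value=""'
  interactive?: boolean; // interactive elements receive a ref
  children?: SketchNode[];
}

function renderSnapshot(root: SketchNode): string {
  let nextRef = 1;
  const lines: string[] = [];

  const walk = (node: SketchNode, depth: number): void => {
    let childDepth = depth;
    if (node.role) {
      let line = "  ".repeat(depth) + node.role;
      if (node.name) line += ` "${node.name}"`;
      for (const attr of node.attrs ?? []) line += ` [${attr}]`;
      if (node.interactive) line += ` [ref=e${nextRef++}]`;
      lines.push(line);
      childDepth = depth + 1;
    }
    // Nodes without a role emit nothing, so their children keep the parent's
    // depth and appear under the nearest semantic ancestor.
    for (const child of node.children ?? []) walk(child, childDepth);
  };

  walk(root, 0);
  return lines.join("\n");
}
```

Because wrappers contribute no line and no indentation, the output depth reflects semantic nesting rather than DOM nesting, which is what keeps the snapshot small.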

Command channel (iframe mode only)

When the page URL contains ?mode=iframe, the agent polls GET /__mindstudio_dev__/commands every 100ms for commands from the AI agent. This ensures only the preview iframe in the MindStudio IDE responds to commands, not standalone browser tabs.
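
One way the mode check could look, as a sketch only: `isIframeMode` is a hypothetical helper, and it assumes the check is an exact match on the `mode` query parameter, which the description above does not spell out.

```typescript
// Hypothetical helper; assumes an exact match on the "mode" query parameter.
function isIframeMode(pageUrl: string): boolean {
  return new URL(pageUrl).searchParams.get("mode") === "iframe";
}
```

A poller would call this once at startup (for example with `window.location.href`) and skip the 100ms polling loop entirely when it returns `false`.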

The AI agent sends commands via the tunnel's stdin:

{"action": "browser", "steps": [{"command": "click", "text": "Generate"}]}

The result comes back on the tunnel's stdout with a snapshot, logs captured during execution, and step results:

{"event": "browser-completed", "steps": [...], "snapshot": "...", "logs": [...], "duration": 250}

Commands execute sequentially with a visible animated cursor. Execution stops on first error.
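
The sequential, stop-on-first-error behaviour can be sketched as below. This is an illustration under assumptions: `Step`, `StepResult`, and `runSteps` are hypothetical names, the per-step result shape is not documented here, and the handlers are synchronous for brevity whereas real actions such as click or wait are presumably asynchronous.

```typescript
// Hypothetical sketch; the real executor also drives the cursor, captures
// logs, and appends a snapshot to the result.
interface Step {
  command: string;
  [field: string]: unknown;
}

interface StepResult {
  command: string;
  ok: boolean;
  result?: unknown;
  error?: string;
}

type Handler = (step: Step) => unknown;

function runSteps(steps: Step[], handlers: Record<string, Handler>): StepResult[] {
  const results: StepResult[] = [];
  for (const step of steps) {
    const handler = handlers[step.command];
    try {
      if (!handler) throw new Error(`Unknown command: ${step.command}`);
      results.push({ command: step.command, ok: true, result: handler(step) });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ command: step.command, ok: false, error: message });
      break; // stop on the first error; later steps are not attempted
    }
  }
  return results;
}
```

Returning the failed step as the last entry lets the caller see exactly where execution stopped.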

Available commands:

| Command | Description |
|---------|-------------|
| snapshot | Returns the compact DOM accessibility tree |
| click | Clicks an element (full pointer/mouse/click event sequence for React/Vue/Svelte) |
| type | Types text into an input/textarea (character-by-character with native value setter for React) |
| select | Selects an option from a <select> element |
| wait | Waits for an element to appear in the DOM (polls with timeout) |
| evaluate | Runs arbitrary JavaScript and returns the result (auto-wraps with return, handles async) |

Element targeting (for click, type, select, wait):

| Field | Example | Description |
|-------|---------|-------------|
| ref | "e5" | Ref from the last snapshot (most reliable) |
| text | "Create Board" | Match by accessible name or visible text |
| role + text | "button" + "Submit" | Match by ARIA role and name |
| label | "Board name" | Find input by its associated label text |
| selector | "#my-id" | CSS selector fallback |

Error messages include what IS on the page so the agent can self-correct (e.g., No button "Submit" found. Visible buttons: "Generate", "Collection").
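
A rough sketch of targeting with a self-correcting error message follows. It works on a plain list of elements rather than the DOM, so the `selector` field is omitted. `PageElement`, `Target`, and `resolveTarget` are hypothetical, and the lookup order (ref, then text with optional role, then label) is an assumption rather than documented behaviour.

```typescript
// Hypothetical sketch; the real resolver queries the live DOM.
interface PageElement {
  ref: string;
  role: string;
  name: string;
  label?: string;
}

interface Target {
  ref?: string;
  role?: string;
  text?: string;
  label?: string;
}

function resolveTarget(target: Target, elements: PageElement[]): PageElement {
  let match: PageElement | undefined;
  if (target.ref !== undefined) {
    match = elements.find((e) => e.ref === target.ref);
  } else if (target.text !== undefined) {
    match = elements.find(
      (e) => e.name === target.text && (target.role === undefined || e.role === target.role),
    );
  } else if (target.label !== undefined) {
    match = elements.find((e) => e.label === target.label);
  }
  if (match) return match;

  // Build an error that lists what is actually available for this role.
  const role = target.role ?? "element";
  const wanted = target.text ?? target.label ?? target.ref ?? "";
  const visible = elements
    .filter((e) => target.role === undefined || e.role === target.role)
    .map((e) => `"${e.name}"`)
    .join(", ");
  throw new Error(`No ${role} "${wanted}" found. Visible ${role}s: ${visible}`);
}
```

Listing the candidates in the error gives the AI agent enough context to retry with a valid name instead of requesting another snapshot.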

Screenshots

Screenshots are captured via SnapDOM (@zumer/snapdom) and can be triggered two ways:

  1. Via tunnel stdin -- {"action": "screenshot"} captures the viewport, uploads to S3 via the platform, and returns a CDN URL.
  2. Via postMessage -- the frontend sends notes-screenshot to capture with annotations (see Notes below).

Visible cursor

A Figma-style animated cursor (#DD2590 pink with "Remy" name tag) shows the AI agent's actions in real time:

  • Appears from a random viewport edge on first action
  • Glides smoothly to target elements (450ms ease)
  • Click animation with ripple effect
  • Fades out after 1.5s of inactivity
  • Reappears at last known position for subsequent actions
  • Only renders in iframe mode (?mode=iframe)

Annotation notes (postMessage API)

Users can add ephemeral visual annotations to the preview for AI feedback. Controlled by the frontend via postMessage.

Frontend → iframe messages (channel: 'mindstudio-browser-agent'):

| Command | Purpose |
|---------|---------|
| notes-enter | Enter notes mode (overlay, custom cursor, click/drag to annotate) |
| notes-exit | Exit notes mode, remove all notes |
| notes-screenshot | Capture screenshot including annotations, return base64 |
| notes-cursor-hide | Hide the notes cursor (call when mouse leaves iframe) |

Iframe → frontend responses:

| Command | Payload | Purpose |
|---------|---------|---------|
| screenshot-result | { image: string } or { error: string } | Base64 PNG screenshot |
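
A receiving side might validate incoming messages along these lines. This is a sketch under a loud assumption: the message is taken to be an object with `channel` and `command` fields. The channel value and command names come from the tables above, but the field names and the `parseNotesMessage` helper are hypothetical.

```typescript
// Hypothetical sketch; the { channel, command } field names are assumptions.
const CHANNEL = "mindstudio-browser-agent";
const NOTES_COMMANDS = [
  "notes-enter",
  "notes-exit",
  "notes-screenshot",
  "notes-cursor-hide",
] as const;

type NotesCommand = (typeof NOTES_COMMANDS)[number];

interface NotesMessage {
  channel: typeof CHANNEL;
  command: NotesCommand;
}

function parseNotesMessage(data: unknown): NotesMessage | null {
  if (typeof data !== "object" || data === null) return null;
  const record = data as Record<string, unknown>;
  if (record.channel !== CHANNEL) return null; // ignore unrelated postMessage traffic
  const command = NOTES_COMMANDS.find((c) => c === record.command);
  return command ? { channel: CHANNEL, command } : null;
}
```

Filtering on the channel first matters because a page can receive postMessage traffic from other sources, such as browser extensions or embedded widgets, and those messages should be ignored.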

Notes are pink (#DD2590) rounded bubbles with inline-editable text. Pin notes (click) have a dot at the click point. Area notes (drag) have a dashed border around the selected region. Notes support select → edit → move → delete lifecycle with Enter to confirm, Escape to cancel, and a × delete button.

Development

npm install
npm run build    # build dist/index.js (single IIFE, minified)
npm run dev      # watch mode + local HTTP server on port 8787
npm run serve    # serve dist/ on port 8787 (no watch)

The dev tunnel proxy defaults to loading the script from https://seankoji-msba.ngrok.io/index.js. Point ngrok at port 8787 to serve your local dev build to remote sandboxes. Falls back to https://unpkg.com/@mindstudio-ai/browser-agent/dist/index.js when no URL is configured.

Architecture

src/
  index.ts              -- entry point, idempotency guard, init all modules
  transport.ts          -- log entry buffer, batched POST to proxy, capture mode
  network-idle.ts       -- tracks in-flight requests for snapshot idle wait
  utils.ts              -- serialization, element description, sleep
  capture/
    console.ts          -- console.* override (patch-guarded)
    errors.ts           -- error + unhandledrejection listeners
    network.ts          -- fetch monkey-patch (patch-guarded, logs + idle tracking)
    xhr.ts              -- XMLHttpRequest monkey-patch (patch-guarded, logs + idle tracking)
    interactions.ts     -- click listener (patch-guarded)
  snapshot/
    walker.ts           -- DOM walker, takeSnapshot(), describeTarget()
    roles.ts            -- implicit ARIA role mapping, cursor-interactive detection
    name.ts             -- accessible name computation
  commands/
    poller.ts           -- polls proxy for commands (iframe mode only)
    executor.ts         -- dispatches steps, captures logs, appends snapshot
    actions.ts          -- click, type, select, wait, evaluate implementations
    resolve.ts          -- element resolution (ref, text, role, label, selector)
    screenshot.ts       -- SnapDOM viewport capture
  cursor/
    cursor.ts           -- animated Figma-style cursor with ripple
  notes/
    constants.ts        -- shared color constant
    messages.ts         -- postMessage handler (idempotent listener)
    notes-mode.ts       -- enter/exit lifecycle, screenshot orchestration
    note-layer.ts       -- overlay, pointer events, state machine
    note-element.ts     -- DOM creation for pin and area notes

Proxy endpoints

| Endpoint | Method | Purpose |
|----------|--------|---------|
| /__mindstudio_dev__/logs | POST | Receive browser log entries |
| /__mindstudio_dev__/commands | GET | Poll for pending commands |
| /__mindstudio_dev__/results | POST | Return command execution results |