npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

@mercuryo-ai/magicbrowse

v0.0.14

Published

LLM-first browser-automation runtime: planner + navigator agents driving Chrome over CDP

Downloads

2,310

Readme

@mercuryo-ai/magicbrowse

npm version License Node.js >= 18

Give your AI agent a real browser and a disciplined way to operate it.

MagicBrowse is an LLM-first browser automation runtime for systems that need to work with real web pages. It drives Chromium over CDP with a planner and navigator loop, while still exposing deterministic browser verbs for precise click, type, fill, select, and press steps.

Your application stays in charge of business logic, credentials, approvals, and orchestration. MagicBrowse owns the browser session, page observation, goal-driven browser work, guarded deterministic actions, and diagnostics.

The package has four main parts:

  • Browser sessions - launch(...), attach(...), status(...), close(...), and persisted current-session state under MAGICBROWSE_HOME.
  • Page understanding - observe(...) and screenshot(...) return a redacted view of the active page and the actionable/fillable targets on it.
  • LLM-driven action - act(...) runs the planner and navigator against an explicit LLM adapter and returns a typed terminal status.
  • Deterministic apply steps - click(...), type(...), fill(...), select(...), press(...), open-data fill helpers, and protected-fill helpers execute known actions without asking the LLM to invent values.

Typical workflow:

  1. open a browser with launch(...) or connect to one with attach(...);
  2. ask what is on the page with observe(...), or delegate a bounded goal with act(...);
  3. use deterministic verbs when you already know the target and value;
  4. stop at protected-data, approval, login, or human-verification edges and let the owning application decide what happens next;
  5. close the session when the workflow is done.

That shape fits backend workers, CLIs, agent runtimes, browser handoff flows, and application-specific checkout or form automation.

Key Terms

Three terms come up repeatedly in the API:

  • session - the managed browser session returned by launch(...) or attach(...). Top-level calls use the current persisted session; the returned session object exposes the same operations pre-bound to that session id.
  • targetRef - a stable reference returned by observe(...) for an actionable or fillable page element. Refs are tied to the page state that produced them. Navigation, route changes, or a major DOM re-render can make old refs stale, so call observe(...) again after meaningful page changes.
  • CDP - the Chrome DevTools Protocol. Chrome, Chromium, Browserbase, and other browser providers can expose a CDP endpoint that MagicBrowse can launch or attach to.

MagicBrowse does not load .env files in the library entrypoint. Host applications own configuration loading and pass credentials or LLM adapters explicitly.

Install

npm i @mercuryo-ai/magicbrowse

The library is ESM-only and requires Node.js >=18.

If you want the operator-facing CLI, install the separate CLI package:

npm i -g @mercuryo-ai/magicbrowse-cli@latest

@mercuryo-ai/magicbrowse is the library package for imports. It does not install the magicbrowse shell command.

Quick Start

This example launches Chromium, asks the browser agent to read a page, and then closes the session.

import {
  act,
  close,
  createDirectLlmAdapter,
  launch,
} from '@mercuryo-ai/magicbrowse';

const llmAdapter = createDirectLlmAdapter({
  provider: 'openrouter',
  apiKey: process.env.OPENROUTER_API_KEY,
  navigatorModel: 'google/gemini-2.5-flash',
  plannerModel: 'google/gemini-2.5-pro',
});

const session = await launch({
  url: 'https://news.ycombinator.com',
  headless: true,
});

try {
  const result = await act({
    goal: 'Report the title and current score of the front-page top story, then stop.',
    maxSteps: 5,
    llmAdapter,
    onEvent(event) {
      process.stderr.write(`[${event.actor}.${event.state}] ${event.details}\n`);
    },
  });

  if (result.status !== 'completed') {
    throw new Error(result.finalMessage ?? `MagicBrowse stopped with ${result.status}`);
  }

  process.stdout.write(`${result.finalMessage ?? 'Done.'}\n`);
} finally {
  await close({ sessionId: session.id });
}

act(...) requires an explicit LLM. Pass either a MagicBrowseLlmAdapter or a LangChain BaseChatModel. There is no package-level environment fallback.

The built-in direct adapter supports openai, azure-openai, anthropic, deepseek, gemini, xai, groq, cerebras, ollama, openrouter, llama, and custom provider families. Product CLIs and apps can pass their own backend adapter instead.

Observe First, Then Act Precisely

Use observe(...) when your host application wants to inspect the page before choosing a deterministic action.

import { click, launch, observe } from '@mercuryo-ai/magicbrowse';

await launch({ url: 'https://example.com', headless: true });

const page = await observe({
  includeOrchestration: true,
});

const firstButton = page.orchestration?.actionTargets?.descriptors.find(
  (target) => target.kind === 'button'
);

if (firstButton) {
  await click({ target: firstButton });
}

Deterministic verbs do not call the LLM. They take the observed target plus caller-supplied input and dispatch through the same browser executor used by the LLM-driven path.

Attach To An Existing Browser

If a browser already exposes a CDP endpoint, use attach(...) instead of launch(...).

Common CDP sources:

  • local Chrome or Chromium started with --remote-debugging-port;
  • Browserbase or another managed browser provider;
  • any browser runtime that gives you a CDP WebSocket endpoint.
import { attach, observe } from '@mercuryo-ai/magicbrowse';

const session = await attach({
  cdpUrl: 'ws://127.0.0.1:9222/devtools/browser/browser-id',
});

const page = await observe({ sessionId: session.id });
process.stdout.write(`${page.plannerView}\n`);

Attach is not a different API family. It is the second entry point into the same session manager, run store, and action executor used by launch(...).

Cloud Browsers

MagicBrowse can create or attach to Browserbase sessions when the host provides Browserbase credentials in the process environment.

import { launch } from '@mercuryo-ai/magicbrowse';

await launch({
  cloud: true,
  url: 'https://example.com',
});

Required Browserbase environment variables:

  • BROWSERBASE_API_KEY
  • BROWSERBASE_PROJECT_ID

Optional variables include BROWSERBASE_REGION and BROWSERBASE_TIMEOUT. Cloud session metadata is persisted in the MagicBrowse session state, while provider connection URLs are redacted from returned data.

What Each Main API Does

| API | Use it when | Typical result | | --- | --- | --- | | launch(options?) | You need a new owned browser session | managed session, current session file, run record | | attach(options) | You already have a running CDP browser | managed session bound to the external browser | | currentSession() | You want the current persisted session handle | session object or undefined | | status() | You want to know whether the current session is reachable | browser_alive, browser_not_running, or browser_mismatch | | observe(options?) | You want a redacted page inventory | page identity, signals, action targets, fillable targets | | act(options) | You want goal-driven browser work through planner + navigator | typed terminal status, steps, final message, optional handoff | | screenshot(options?) | You want a screenshot of the active page | screenshot bytes or saved file metadata | | click(options) | You know which observed target to click | deterministic action result | | type(options) | You want to type text into an observed target | deterministic action result | | fill(options) | You want to clear and fill an observed target | deterministic action result | | select(options) | You want to choose an option on an observed select target | deterministic action result | | press(options) | You want to send a key chord to the focused page | deterministic action result | | fillOpenDataTarget(options) | You want to apply non-protected user-provided data | field-level fill result | | fillProtectedGroup(options) | You want to apply protected data through a guarded reader | group-level protected fill result | | submitFormTarget(options) | You want to submit a known observed form target | submit result | | markCaptchaResolved(options?) | A human or approved solver resolved CAPTCHA on the current page | single-use trusted continuation marker | | close(options?) | You are done with the browser | owned browser closed or attached session cleared |

Use machine-readable fields for control flow. finalMessage is for humans and logs, not for parsing.

Act Statuses

MagicBrowseActResult.status is intentionally typed:

  • completed - the delegated browser task completed.
  • blocked - the task cannot continue without missing input, a different strategy, or a page state change.
  • needs_handoff - the task reached protected data, login/OTP, payment, identity/KYC data, CAPTCHA, or human verification.
  • needs_approval - the next useful action would commit a consequential external side effect and needs explicit approval.
  • failed - runtime, model, or browser-tool failure.
  • max_steps - the step budget ended before a terminal decision.
  • cancelled - caller or user cancellation.

For protected forms, needs_handoff may include handoff: { kind: 'protected_form', resumeObjective }. After the approved protected-data owner completes the fill, call act(...) again with that narrow resumeObjective to continue.

Open Data And Protected Data

MagicBrowse separates ordinary open-data fills from protected-data fills.

Open data is information the user has already provided clearly for the current task, such as a city, travel date, quantity, or public profile detail.

import { fillOpenDataTarget, observe } from '@mercuryo-ai/magicbrowse';

const page = await observe({ includeOrchestration: true });
const cityTarget = page.orchestration?.fillableTargets?.descriptors.find(
  (target) => target.label?.toLowerCase().includes('city')
);

if (cityTarget) {
  await fillOpenDataTarget({
    target: cityTarget,
    value: {
      value: 'Singapore',
      source: 'user',
    },
  });
}

Protected data is sensitive information such as passwords, OTPs, payment card data, identity documents, bank details, API keys, private keys, and secrets. MagicBrowse can apply protected values only through an explicit artifact reader provided by the caller. The LLM is not asked to invent or handle those values.

Core rules:

  • Protected values are not sent to the planner or navigator.
  • Observe output and run records are redacted after protected fills.
  • Login, identity, checkout, payment, CAPTCHA, and human-verification edges stop as needs_handoff unless the caller uses the dedicated guarded path.
  • Consequential external actions should go through the caller's approval flow before the browser continues.

Session Persistence And Diagnostics

MagicBrowse stores current-session state and run diagnostics under MAGICBROWSE_HOME.

Default location:

~/.magicbrowse/

Important files:

  • current-session.json - active session pointer, CDP endpoint, page identity, and run id.
  • runs/<runId>.json - session events, act starts/results, executor events, LLM trace markers, page identity changes, close/detach events, and redacted browser observations.
  • run-index.json - mapping from session ids to run ids, so repeated act(...) calls inside one session append to the same record.

Use a separate MAGICBROWSE_HOME per concurrent workflow. The default current session file is a singleton.

MAGICBROWSE_HOME=/tmp/my-workflow node ./my-browser-worker.mjs

Browser Safety Boundaries

MagicBrowse is designed for browser delegation, not unchecked autonomy.

  • It should not type passwords, OTPs, payment details, identity details, API keys, private keys, or secrets through the LLM-driven act(...) path.
  • It should not bypass CAPTCHA or human verification.
  • It should not submit consequential external actions without caller approval.
  • It should not invent missing user data.
  • It should fail closed when semantic field matching is unavailable, invalid, or uncertain.

These boundaries are part of the runtime contract, not only prompt wording.

CLI

The CLI lives in @mercuryo-ai/magicbrowse-cli.

npm i -g @mercuryo-ai/magicbrowse-cli@latest
magicbrowse --help

Common commands:

magicbrowse init <apiKey>
magicbrowse doctor
magicbrowse launch https://example.com
magicbrowse observe
magicbrowse act "Find the checkout button and stop before payment"
magicbrowse close

The CLI owns its credential loading and product integration behavior. The library package remains explicit and host-controlled.

Development

npm install
npm run check-types
npm run build
npm run lint
npm run pack:verify
npm run smoke:pack-install

Live browser-agent tests require provider credentials and are intentionally not part of the default public verification workflow.

License And Third Party Notices

Mercuryo-authored code is licensed under the MIT License. MagicBrowse also includes adapted Nanobrowser components under Apache-2.0; see THIRD_PARTY_NOTICES.md for provenance and license details.

Documentation

Vendor adaptation notes live in vendor-notes/ for engineers who need to understand the bundled browser/planner internals.