@aimount/browser

v0.1.0-alpha.0

Browser snapshot and scroll tools for embedded aimount integrations

Shared browser observation tools for embedded aimount integrations.

This README freezes the important stage-1 architecture decisions that led to the first package cut, so later work does not need to reconstruct them from chat history.

Why this package exists

aimount is an embedded assistant platform. Host apps already register project-specific browser tools next to domain tools, as seen in senler. The reusable value here is not a second browser-side agent loop. The reusable value is a shared browser engine plus standard tools that fit the existing aimount tool/runtime model.

_ref/page-agent/ is a donor, not a template. The part worth borrowing is the browser controller / snapshot extraction idea. The full page-agent loop, extension transport, and action lifecycle are intentionally not copied into aimount.

v1 public scope

The first public cut is intentionally small.

  • Export createBrowserEngine(...).
  • Export createPageSnapshotTool(engine).
  • Export createPageScrollTool(engine).
  • Do not ship navigate or reload in v1.

The validated reason for cutting navigate and reload is complexity versus shared value. They drag in page transition recovery, durable restart handling, and future runtime-session semantics across tabs. Those concerns remain important, but they do not belong in the first public package cut.

Public wiring

The host owns engine creation. Tools are explicit and separate.

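import {
  createBrowserEngine,
  createPageScrollTool,
  createPageSnapshotTool,
} from '@aimount/browser';

// The host creates the engine once and passes the same instance to every tool creator.
const browserEngine = createBrowserEngine({
  // Named layer ids mapped to the selectors that bound each layer.
  layers: {
    content: { selectors: ['#content'] },
    assistant: { selectors: ['#assistant'] },
  },
  // Layers used when a tool call omits `layers`.
  defaultLayers: ['content'],
});

// The export keys below become the aimount tool ids.
export const page_snapshot = createPageSnapshotTool(browserEngine);
export const page_scroll = createPageScrollTool(browserEngine);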

Important: aimount tool ids come from export keys, not from an internal name field. If the host wants the tool ids to be page_snapshot and page_scroll, the host should export them under exactly those keys.
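
The export-key rule can be illustrated with a small sketch. `ToolLike` and `collectToolIds` are hypothetical names invented for this example and are not part of aimount; the sketch only assumes that a registry reads the keys of the host's exports object.

```typescript
// Hypothetical illustration of deriving tool ids from export keys.
// `ToolLike` and `collectToolIds` are assumptions for this sketch, not aimount APIs.
type ToolLike = { run: (input: unknown) => unknown };

function collectToolIds(moduleExports: Record<string, ToolLike>): string[] {
  // The id is the export key itself; no internal `name` field is consulted.
  return Object.keys(moduleExports);
}

const snapshotTool: ToolLike = { run: () => 'snapshot' };
const scrollTool: ToolLike = { run: () => 'scroll' };

// Exporting under these exact keys yields the ids page_snapshot and page_scroll.
const toolIds = collectToolIds({
  page_snapshot: snapshotTool,
  page_scroll: scrollTool,
});
```

Renaming an export key in this sketch would rename the tool id, which is why the host should pick the keys deliberately.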

Shared engine contract

createBrowserEngine(...) exists because reads and actions must share one browser observer/controller state.

  • The engine is shared.
  • The host creates it once.
  • Each tool creator receives the same engine instance.
  • There is no hidden singleton.
  • AssistantWIProvider does not own browser engine lifecycle in v1.

This is the aimount analogue of borrowing the PageController boundary from page-agent without importing the rest of the agent runtime.
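
Why a shared instance matters can be shown with a stand-in engine. Everything below (`SketchEngine`, `createSketchEngine`, and the two sketch tool creators) is hypothetical and much simpler than the real package; it only demonstrates that a read sees the state an action left behind when both close over the same object.

```typescript
// Stand-in engine for illustration only; the real engine state is not documented here.
interface SketchEngine {
  scrollY: number;
}

function createSketchEngine(): SketchEngine {
  return { scrollY: 0 };
}

// Each tool creator receives the engine explicitly; there is no hidden singleton.
function createSketchSnapshotTool(engine: SketchEngine) {
  return () => ({ text: `Page state: scrollY=${engine.scrollY}` });
}

function createSketchScrollTool(engine: SketchEngine) {
  return (deltaPx: number) => {
    engine.scrollY += deltaPx;
  };
}

// The host creates the engine once and hands the same instance to both tools.
const sharedEngine = createSketchEngine();
const sketchSnapshot = createSketchSnapshotTool(sharedEngine);
const sketchScroll = createSketchScrollTool(sharedEngine);

sketchScroll(300);
const afterScroll = sketchSnapshot().text;
```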

page_snapshot contract

Public API:

page_snapshot({ layers?: string[] })

Locked decisions:

  • View is viewport-only, not full-document.
  • Result shape is text + meta.
  • layers are named layer ids, not raw selectors.
  • Layer selector mapping is configured in the engine.
  • If layers are omitted, the engine uses configured default layers.
  • Unknown layer ids fail explicitly.
  • Visibility boundaries are integrator-provided. The package does not try to autodetect the assistant container or any other hidden zones.
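
The layer rules above can be sketched as a small resolver. `resolveLayerSelectors` and its option types are hypothetical names for this example; the configuration shape mirrors the `layers` and `defaultLayers` options shown in the public wiring, but the package's internal resolution logic may differ.

```typescript
// Hypothetical resolver; mirrors the documented config shape, not the real internals.
interface LayerOptions {
  layers: Record<string, { selectors: string[] }>;
  defaultLayers: string[];
}

function resolveLayerSelectors(options: LayerOptions, requested?: string[]): string[] {
  // Omitted layers fall back to the configured defaults.
  const ids = requested ?? options.defaultLayers;
  const selectors: string[] = [];
  for (const id of ids) {
    // Callers pass named layer ids, never raw selectors; unknown ids fail explicitly.
    if (!Object.prototype.hasOwnProperty.call(options.layers, id)) {
      throw new Error(`Unknown layer id: ${id}`);
    }
    selectors.push(...options.layers[id].selectors);
  }
  return selectors;
}

const layerOptions: LayerOptions = {
  layers: {
    content: { selectors: ['#content'] },
    assistant: { selectors: ['#assistant'] },
  },
  defaultLayers: ['content'],
};

const defaultSelectors = resolveLayerSelectors(layerOptions);
const assistantSelectors = resolveLayerSelectors(layerOptions, ['assistant']);
```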

Snapshot content rules:

  • Result shape stays text + meta, but text is now a structural browser-state snapshot.
  • The top-level text format consists of a Page state: section plus a Visible structure: section.
  • Keep visible interactive nodes even when they do not have a reliable name.
  • Include nearby readable context, not only control names.
  • Add coarse 3x3 zones: top-left, top-center, top-right, middle-left, center, middle-right, bottom-left, bottom-center, bottom-right.
  • Do not emit semantic overlays; container-level text blobs were removed because they created noisy evidence for exact UI claims.
  • Include snapshotId and freshness metadata so later action flows do not have to invent identity after the fact.
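
One way to assign the coarse 3x3 zones is to bucket the center point of an element's bounding box into thirds of the viewport. This is a sketch under that assumption; `zoneFor` and `Box` are hypothetical names, and the package may use a different rule (for example, the box's top-left corner or its largest overlap).

```typescript
type Zone =
  | 'top-left' | 'top-center' | 'top-right'
  | 'middle-left' | 'center' | 'middle-right'
  | 'bottom-left' | 'bottom-center' | 'bottom-right';

// Viewport-relative box, shaped like the x/y/width/height of a DOMRect.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

function zoneFor(box: Box, viewportWidth: number, viewportHeight: number): Zone {
  // Assumption: the zone is decided by the center point of the box.
  const centerX = box.x + box.width / 2;
  const centerY = box.y + box.height / 2;
  // Bucket into thirds and clamp so partially off-screen boxes still get a zone.
  const col = Math.min(2, Math.max(0, Math.floor((centerX / viewportWidth) * 3)));
  const row = Math.min(2, Math.max(0, Math.floor((centerY / viewportHeight) * 3)));
  if (row === 1 && col === 1) return 'center';
  const rows = ['top', 'middle', 'bottom'];
  const cols = ['left', 'center', 'right'];
  return `${rows[row]}-${cols[col]}` as Zone;
}

const logoZone = zoneFor({ x: 0, y: 0, width: 10, height: 10 }, 900, 600);
const dialogZone = zoneFor({ x: 400, y: 250, width: 100, height: 100 }, 900, 600);
const chatButtonZone = zoneFor({ x: 850, y: 550, width: 40, height: 40 }, 900, 600);
```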

Item-level evidence rules:

  • Every meta.items[] entry includes name, nameSource, and nameStatus.
  • nameSource is one of text | aria-label | title | placeholder | value | unknown.
  • nameStatus is one of strong | weak | unknown.
  • Unknown or weakly named controls remain visible in the snapshot instead of being dropped.
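
The item-level evidence fields could be produced by a picker like the one below. The `nameSource` and `nameStatus` values come from the rules above; the priority order and the mapping of sources to `strong` or `weak` are assumptions made for this sketch, as are the names `NameCandidates` and `pickName`.

```typescript
type NameSource = 'text' | 'aria-label' | 'title' | 'placeholder' | 'value' | 'unknown';
type NameStatus = 'strong' | 'weak' | 'unknown';

interface NameCandidates {
  text?: string;
  ariaLabel?: string;
  title?: string;
  placeholder?: string;
  value?: string;
}

interface NameEvidence {
  name: string;
  nameSource: NameSource;
  nameStatus: NameStatus;
}

function pickName(candidates: NameCandidates): NameEvidence {
  // Assumed priority and strength: text and aria-label are strong, the rest weak.
  const ordered: Array<[NameSource, string | undefined, NameStatus]> = [
    ['text', candidates.text, 'strong'],
    ['aria-label', candidates.ariaLabel, 'strong'],
    ['title', candidates.title, 'weak'],
    ['placeholder', candidates.placeholder, 'weak'],
    ['value', candidates.value, 'weak'],
  ];
  for (const [nameSource, raw, nameStatus] of ordered) {
    const name = raw?.trim();
    if (name) return { name, nameSource, nameStatus };
  }
  // No usable name: the item is still reported rather than dropped.
  return { name: '', nameSource: 'unknown', nameStatus: 'unknown' };
}

const saveButton = pickName({ text: ' Save ' });
const searchInput = pickName({ placeholder: 'Search' });
const iconOnlyButton = pickName({});
```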

Observation honesty rules:

  • meta.observation.totalInteractive counts all visible distinct actionable nodes.
  • meta.observation.weakInteractive counts items whose best name comes from a weak source.
  • meta.observation.unknownInteractive counts visible interactives with no reliable name.
  • meta.observation.exactUiClaims communicates whether the snapshot is safe for exact button/icon/label claims:
    • safe when visible interactives are structurally preserved and meaningfully named,
    • partial when rough guidance is okay but exact UI claims should be limited to strong items,
    • unsafe when the page is too semantically weak for exact UI claims.
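
A minimal sketch of how the observation counters and `exactUiClaims` might be derived from item statuses. The three counter names come from the rules above; the thresholds that separate `safe`, `partial`, and `unsafe` are not specified in this README, so the policy below (all strong, at least half strong, otherwise unsafe) is purely an assumption, and `summarizeObservation` is a hypothetical name.

```typescript
type ItemStatus = 'strong' | 'weak' | 'unknown';
type ExactUiClaims = 'safe' | 'partial' | 'unsafe';

interface Observation {
  totalInteractive: number;
  weakInteractive: number;
  unknownInteractive: number;
  exactUiClaims: ExactUiClaims;
}

function summarizeObservation(items: Array<{ nameStatus: ItemStatus }>): Observation {
  const totalInteractive = items.length;
  const weakInteractive = items.filter((item) => item.nameStatus === 'weak').length;
  const unknownInteractive = items.filter((item) => item.nameStatus === 'unknown').length;
  const strongInteractive = totalInteractive - weakInteractive - unknownInteractive;

  // Assumed policy, not the package's documented thresholds:
  // safe if every item is strong (an empty page has nothing to misname),
  // unsafe if fewer than half are strong, partial otherwise.
  let exactUiClaims: ExactUiClaims;
  if (weakInteractive === 0 && unknownInteractive === 0) {
    exactUiClaims = 'safe';
  } else if (strongInteractive * 2 < totalInteractive) {
    exactUiClaims = 'unsafe';
  } else {
    exactUiClaims = 'partial';
  }

  return { totalInteractive, weakInteractive, unknownInteractive, exactUiClaims };
}

const mixedPage = summarizeObservation([
  { nameStatus: 'strong' },
  { nameStatus: 'strong' },
  { nameStatus: 'weak' },
]);
```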

Safety rules:

  • Minimal built-in redaction is required.
  • Password values and obvious secret/token-like values should not leak into snapshot text.
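
A minimal redaction pass might look like the sketch below. The README only requires that password values and obvious secret-like values stay out of the snapshot text; the specific token heuristic (a long unbroken run mixing letters and digits), the `[redacted]` placeholder, and the names `redactValue` and `looksLikeSecret` are assumptions for illustration.

```typescript
interface FieldValue {
  inputType: string;
  value: string;
}

const REDACTED = '[redacted]';

// Assumed heuristic: 24+ characters with no whitespace, mixing letters and digits.
function looksLikeSecret(value: string): boolean {
  return /^[A-Za-z0-9_\-.]{24,}$/.test(value) && /[A-Za-z]/.test(value) && /\d/.test(value);
}

function redactValue(field: FieldValue): string {
  // Password inputs are always redacted, whatever they contain.
  if (field.inputType.toLowerCase() === 'password') return REDACTED;
  // Token-like values are redacted even in ordinary text inputs.
  if (looksLikeSecret(field.value)) return REDACTED;
  return field.value;
}

const passwordText = redactValue({ inputType: 'password', value: 'hunter2' });
const tokenText = redactValue({ inputType: 'text', value: 'abc123def456ghi789jkl012mno' });
const plainText = redactValue({ inputType: 'text', value: 'Hello world' });
```

A heuristic like this will miss some secrets and flag some harmless ids, which is consistent with the "minimal" wording above rather than a guarantee.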

page_scroll v1 contract

Public API:

page_scroll({ direction, screens? })

Locked v1 decisions:

  • Scope is page-only vertical scroll.
  • Container scroll is out of v1.
  • Unit name is screens, both externally and internally.
  • Fractional values are allowed.
  • Default amount is 0.75 screens.
  • Scrolling should be smooth/animated.
  • Result shape is nested: action metadata plus snapshot.
  • Returned snapshot always uses engine default layers.
  • Action tools do not accept custom layers in v1.
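
The `screens` unit translates to pixels by multiplying with the viewport height. The sketch below assumes `direction` takes the values `'up'` and `'down'`, which this README does not spell out; `scrollDeltaPx` is a hypothetical helper, and rejecting non-positive amounts is a choice made for the sketch.

```typescript
// Assumption: page-only vertical scroll means direction is 'up' or 'down'.
type ScrollDirection = 'up' | 'down';

const DEFAULT_SCREENS = 0.75;

function scrollDeltaPx(
  direction: ScrollDirection,
  viewportHeight: number,
  screens: number = DEFAULT_SCREENS,
): number {
  // Fractional values are allowed; zero, negative, and non-finite values are rejected here.
  if (!Number.isFinite(screens) || screens <= 0) {
    throw new Error('screens must be a positive finite number');
  }
  const magnitude = screens * viewportHeight;
  return direction === 'down' ? magnitude : -magnitude;
}

const defaultDown = scrollDeltaPx('down', 800);
const upOneAndHalf = scrollDeltaPx('up', 800, 1.5);
```

In a browser, a delta like this could be applied with `window.scrollBy({ top: delta, behavior: 'smooth' })`, which fits the smooth-scroll decision above.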

Readiness rules:

  • assistant-wi intentionally differs from page-agent here.
  • page-agent mostly does short waits and expects the model to call a later observe step.
  • In this package, the action tool itself waits heuristically, then returns the post-action snapshot publicly.
  • Readiness is heuristic + timeout.
  • Heuristics are action-specific, not one universal settle rule.
  • Budgets are per-action.
  • On timeout the tool returns best-effort state with an incompleteness flag rather than blocking forever.
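
The heuristic-plus-timeout idea can be sketched as a polling loop with a budget. `waitUntilSettled`, `ReadinessResult`, and the `incomplete` field name are hypothetical; the real package uses action-specific heuristics, which the caller supplies here as the `isSettled` predicate.

```typescript
interface ReadinessResult {
  incomplete: boolean; // true when the budget ran out before the heuristic passed
  waitedMs: number;
}

async function waitUntilSettled(
  isSettled: () => boolean, // action-specific heuristic supplied by the caller
  budgetMs: number,         // per-action time budget
  pollMs: number = 10,
): Promise<ReadinessResult> {
  const start = Date.now();
  while (Date.now() - start < budgetMs) {
    if (isSettled()) {
      return { incomplete: false, waitedMs: Date.now() - start };
    }
    await new Promise<void>((resolve) => setTimeout(resolve, pollMs));
  }
  // Budget exhausted: check once more, then return best-effort state instead of blocking.
  const settled = isSettled();
  return { incomplete: !settled, waitedMs: Date.now() - start };
}
```

For a scroll action, the predicate might compare the scroll position across two consecutive polls; the post-action snapshot would then be taken regardless of the flag and returned alongside it.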

Deferred and intentionally out of v1

These topics were discussed and are intentionally not in the first package cut:

navigate and reload

They were explored in detail, including a recovery design based on:

  • package-owned automatic recovery,
  • localStorage,
  • per-tab ownership,
  • toolCallId,
  • TTL expiry,
  • fail-closed cleanup.
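
To make the listed ingredients concrete, here is a rough sketch of how they could fit together. It is not the deferred design itself: the record shape, the storage key, and the functions `writeRecovery` and `claimRecovery` are all invented for illustration. The store interface matches the `getItem`/`setItem`/`removeItem` subset of Web Storage, so `localStorage` could be passed in a browser.

```typescript
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical record shape combining toolCallId, per-tab ownership, and TTL expiry.
interface RecoveryRecord {
  toolCallId: string;
  tabId: string;
  expiresAt: number; // epoch milliseconds
}

const RECOVERY_KEY = 'sketch:pending-navigation'; // hypothetical key

function writeRecovery(store: KeyValueStore, record: RecoveryRecord): void {
  store.setItem(RECOVERY_KEY, JSON.stringify(record));
}

function claimRecovery(store: KeyValueStore, tabId: string, now: number): RecoveryRecord | null {
  const raw = store.getItem(RECOVERY_KEY);
  if (raw === null) return null;
  let parsed: Partial<RecoveryRecord>;
  try {
    parsed = JSON.parse(raw) as Partial<RecoveryRecord>;
  } catch {
    store.removeItem(RECOVERY_KEY); // fail closed: unreadable records are dropped
    return null;
  }
  if (
    parsed === null || typeof parsed !== 'object' ||
    typeof parsed.toolCallId !== 'string' ||
    typeof parsed.tabId !== 'string' ||
    typeof parsed.expiresAt !== 'number'
  ) {
    store.removeItem(RECOVERY_KEY); // fail closed: malformed records are dropped
    return null;
  }
  if (parsed.expiresAt <= now) {
    store.removeItem(RECOVERY_KEY); // TTL expired
    return null;
  }
  if (parsed.tabId !== tabId) return null; // another tab owns it; leave it alone
  store.removeItem(RECOVERY_KEY); // consume so the action is recovered at most once
  return { toolCallId: parsed.toolCallId, tabId: parsed.tabId, expiresAt: parsed.expiresAt };
}

// In-memory stand-in for localStorage so the sketch runs outside a browser.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  getItem(key: string): string | null { return this.data.get(key) ?? null; }
  setItem(key: string, value: string): void { this.data.set(key, value); }
  removeItem(key: string): void { this.data.delete(key); }
}
```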

That design context is still useful, but it is deferred rather than shipped in v1.

New-tab navigation

New-tab behavior is not the same thing as current-tab navigation. It likely needs explicit runtime session semantics such as fork/clone/handoff instead of silently reusing the same session.

This concern is tracked in the tasks backlog:

  • tasks/projects/browser-std-tools/tasks/model-new-tab-session-fork

Future page_action

The package name is browser, not page-snapshot, because the boundary should survive growth into future browser actions. Those actions are deliberately deferred until the shared snapshot model is stable.

Validation strategy

Playwright is mandatory for acceptance because the real browser is the source of truth for viewport, geometry, scroll, and overlay behavior.

The validation stack is:

  • fast unit tests for layer resolution, redaction, structural item preservation, honesty metadata, and nested action results;
  • Playwright acceptance tests for real browser behavior, including icon-only visible controls.

Donor notes from page-agent

Useful donor ideas:

  • browser-state extraction around a shared controller,
  • indexed interactive elements,
  • local readable context around actions,
  • visible-page guidance rather than raw DOM dumping,
  • preserving actionable structure even when semantics are weak.

Things intentionally not copied:

  • a second browser-side planner/agent loop,
  • extension-centric transport,
  • a separate public wait tool mental model,
  • weakly controlled execute-JS style behavior.