agent.libx.js

v0.94.37

Edge-native AI agent runtime — drives a virtual filesystem via any LLM (ai.libx.js). Same bytes run in node, browser, or edge.

A coding agent that matches Claude Code on correctness — then beats it on cost, tokens, and tool-efficiency, and runs where Claude Code can't (sandbox, browser/edge, database).

By default it's a full-strength terminal coding agent: real disk, real shell, and the same Read/Edit/Grep/permissions/streaming-DX surface you'd expect from Claude Code. The difference is its two host couplings are swappable seams:

  • LLM → any model via ai.libx.js (AIClient.chat, OpenAI-style tools/streaming).
  • Filesystem → a pluggable IFilesystem (real disk, in-memory, IndexedDB, a database, hybrid mounts) from wcli's headless core.

So the same agent loop also runs sandboxed (in-memory VFS, real disk untouched), on the edge / browser (no Node, no /bin/sh), or hybrid (mount real dirs + a database + remote storage side by side, with transactional overlays for checkpoint/rollback).
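To make the "swap the filesystem, not the agent" idea concrete, here is a minimal sketch. It assumes a tiny synchronous interface; `SketchFs`, `SketchMemFs`, `SketchOverlayFs`, and `fixAdd` are hypothetical names for illustration and are not the real IFilesystem from wcli, whose methods and signatures may differ.

```typescript
// Hypothetical, simplified stand-in for a pluggable filesystem seam.
// NOT the real IFilesystem: names and signatures here are assumptions.
interface SketchFs {
  readFile(path: string): string;
  writeFile(path: string, content: string): void;
  exists(path: string): boolean;
}

// In-memory backend: files live in a Map, so the real disk is never touched.
class SketchMemFs implements SketchFs {
  private files = new Map<string, string>();

  readFile(path: string): string {
    const content = this.files.get(path);
    if (content === undefined) throw new Error(`ENOENT: ${path}`);
    return content;
  }
  writeFile(path: string, content: string): void {
    this.files.set(path, content);
  }
  exists(path: string): boolean {
    return this.files.has(path);
  }
}

// Overlay backend: reads fall through to a base, writes are staged until
// commit(). A rough picture of an overlay used for checkpoint/rollback.
class SketchOverlayFs implements SketchFs {
  private pending = new Map<string, string>();
  private base: SketchFs;

  constructor(base: SketchFs) {
    this.base = base;
  }
  readFile(path: string): string {
    const staged = this.pending.get(path);
    return staged !== undefined ? staged : this.base.readFile(path);
  }
  writeFile(path: string, content: string): void {
    this.pending.set(path, content);
  }
  exists(path: string): boolean {
    return this.pending.has(path) || this.base.exists(path);
  }
  commit(): void {
    this.pending.forEach((content, path) => this.base.writeFile(path, content));
    this.pending.clear();
  }
  rollback(): void {
    this.pending.clear();
  }
}

// Code written against the interface does not care which backend it gets.
function fixAdd(fs: SketchFs, path: string): void {
  fs.writeFile(path, fs.readFile(path).replace("a - b", "a + b"));
}
```

The caller decides whether `fixAdd` touches a throwaway map, a staged overlay, or (with another implementation of the same interface) a real disk or database.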

Claude Code is the floor; running isolated, on the edge, or hybrid is the ceiling.

How it stacks up vs Claude Code

Correctness parity — efficiency, cost, and reach are the lead. Hard 7-task coding suite, Sonnet, denoised (each task ×3, no lucky run promotes; SUITE=hard bun compare/run.ts):

| Metric | agent.libx.js | Claude Code | Difference |
|---|---|---|---|
| Correctness | 7/7 | 7/7 | parity |
| Tool-calls | 16 | 28 | −43% |
| Tokens | 69k | 171k | 2.5× fewer |
| Wall-time | ~100s | 133s | ~25% faster |

Cost (9-task hard suite, USD-metered, vs CC-on-Opus): $0.49 single-tier Sonnet (5.4× cheaper) · $0.82 three-tier voice/duplex (3.3× cheaper) vs CC-Opus $2.67 — at quality parity (16/18 vs 17/18 passes).
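The ratios quoted above follow directly from the raw figures. The short check below recomputes them; only the numbers come from this README, while the helper names `percentFewer` and `timesFewer` are made up for the example.

```typescript
// Relative reduction as a whole percentage: (baseline - ours) / baseline.
function percentFewer(ours: number, baseline: number): number {
  return Math.round(((baseline - ours) / baseline) * 100);
}

// "N times fewer / cheaper" factor, rounded to one decimal: baseline / ours.
function timesFewer(ours: number, baseline: number): number {
  return Math.round((baseline / ours) * 10) / 10;
}

// Raw figures copied from the table and the cost line; nothing here is new data.
const summary = {
  toolCallsPercentFewer: percentFewer(16, 28),   // 16 vs 28 tool-calls
  tokensTimesFewer: timesFewer(69, 171),         // 69k vs 171k tokens
  wallTimePercentFaster: percentFewer(100, 133), // ~100s vs 133s
  sonnetTimesCheaper: timesFewer(0.49, 2.67),    // $0.49 vs $2.67
  duplexTimesCheaper: timesFewer(0.82, 2.67),    // $0.82 vs $2.67
};

console.log(summary);
```

This reproduces the quoted 43%, 2.5×, roughly 25%, 5.4×, and 3.3×.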

Plus things Claude Code simply doesn't do:

  • Runs where CC can't — the same agent loop runs on real disk, an in-memory sandbox, the browser/edge (no Node, no /bin/sh), or a database-backed workspace. Swap the filesystem, not the agent.
  • Keyless web search, built in: WebSearch works in any deployment with no API key (DuckDuckGo; auto-upgrades to Tavily if you set one). CC's search is Anthropic-server-bound.
  • Context-safe by default — a 1 MB Grep/Read/MCP result is auto-paginated and can't blow the window; buried detail is recovered via a cheap context-isolated Ask peek — ~5.3× cheaper and more accurate than re-fetching, in a head-to-head.
  • It improves its own efficiency — an autonomous evolution loop cut its own tool-use ~50% (32 → 15 on the core suite, denoised), self-discovered, not hand-tuned — the same lever behind the efficiency lead above.

Honest scope: the win is efficiency / cost / reach, not a claim of smarter reasoning — correctness is parity. All figures are denoised and reproducible (see Eval & compare); full boards in mind/09-outperform.md.
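The "context-safe by default" point above boils down to cropping an oversized tool result before it reaches the model. Here is a minimal sketch of that idea, assuming a plain byte ceiling and a text marker. `cropToBytes` and its return shape are hypothetical and say nothing about how the library implements pagination.

```typescript
// Hypothetical sketch of byte-ceiling cropping; not the library's implementation.
interface CroppedResult {
  text: string;        // what would be handed to the model
  truncated: boolean;  // true when the original exceeded the ceiling
  totalBytes: number;  // UTF-8 size of the original result
}

const utf8Encoder = new TextEncoder();
const utf8Decoder = new TextDecoder();

function cropToBytes(result: string, maxBytes: number): CroppedResult {
  const bytes = utf8Encoder.encode(result);
  if (bytes.length <= maxBytes) {
    return { text: result, truncated: false, totalBytes: bytes.length };
  }
  // Back off until the first excluded byte is not a UTF-8 continuation byte
  // (10xxxxxx), so a multi-byte character is never cut in half.
  let end = maxBytes;
  while (end > 0 && (bytes[end] & 0xc0) === 0x80) end--;
  const head = utf8Decoder.decode(bytes.subarray(0, end));
  const marker =
    `\n[output cropped: showing ${end} of ${bytes.length} bytes; ` +
    `refine the query or read further]`;
  return { text: head + marker, truncated: true, totalBytes: bytes.length };
}
```

The marker tells the model that more data exists, so it can narrow the query instead of assuming the first page is everything.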

Quickstart

Point it at your project — no clone needed (requires Bun):

export ANTHROPIC_API_KEY=…                              # or OPENAI_API_KEY / GOOGLE_API_KEY / GROQ_API_KEY
bunx agent.libx.js "find and fix the failing test"      # run once in the current directory
bunx agent.libx.js                                      # …or open the interactive REPL

Want a permanent command? bun add -g agent.libx.js, then just agentx (and agentx --duplex for voice). The agent has full real-disk + shell access by default (like Claude Code); add --sandbox to work on an in-memory copy instead. See The agentx CLI for flags, sessions, and slash commands.

Use it as a library

import { AIClient } from 'ai.libx.js';
import { Agent, MemFilesystem } from 'agent.libx.js';

const fs = new MemFilesystem();              // or NodeDiskFilesystem(dir) — interchangeable
await fs.createDir('/src');
await fs.writeFile('/src/x.ts', 'export const add = (a,b) => a - b;\n');

const ai = new AIClient({ apiKeys: { anthropic: process.env.ANTHROPIC_API_KEY } });
const res = await new Agent({ ai, fs, model: 'anthropic/claude-sonnet-4-6' })
  .run('Fix the add bug in /src/x.ts');

console.log(res.finishReason, await fs.readFile('/src/x.ts'));

Tools the agent gets

  • Shell (CLI disk mode) — a real /bin/sh: run any installed binary (git, bun, node, curl, scripts, …). bash (library / sandbox mode) — ls/cat/grep/find/head/tail/echo/mkdir/rm/mv/wc, pipes, redirects, chaining — over the VFS (wcli's sandboxed JS interpreter).
  • Read — 1-indexed numbered lines, offset/limit.
  • Edit — exact unique-substring replace, with a read-before-edit staleness guard.
  • Grep/Glob/Write/MultiEdit — structured, typed results straight from the VFS (no bash parsing). The selectable tool set the self-evolution loop mutates over.
  • TodoWrite — a planning scratchpad; Task — spawn a depth-limited child agent over the VFS (subagents: true); SlashCommand — reusable prompt templates from <dir>/*.md (commandsDir); plus a real MCP client (src/mcp.client.ts, node-only — stdio/HTTP JSON-RPC handshake + discovery) that feeds the edge-safe MCP adapter (mcpToolsToAgentTools), so any MCP server's tools become agent tools.
  • WebFetch/WebSearch — fetch a URL as readable text, or search the web. Keyless by default (WebSearch uses DuckDuckGo; auto-upgrades to Tavily when TAVILY_API_KEY is set) and auto-enabled in the CLI. Factory-built with an injectable fetch, so they stay edge-portable and testable. (In the library they're opt-in by name: tools: [...,'WebSearch'].)
  • Oversized-output pagination — any tool result over a byte ceiling (maxToolResultBytes, default 60k) is cropped to page 1 with a marker (refine the query / read further), so one big Grep/Read/MCP/web result can't blow the context window. In the CLI (on by default; --no-scratch to disable) the full output instead spills losslessly to a scratch file and the model recovers specifics via Grep/Read or Ask — a cheap, context-isolated peek that returns just the answer (the raw blob never re-enters context).
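The Edit bullet above combines two checks: the target text must occur exactly once, and the file must not have changed since it was last read. The sketch below shows one way such a guard could work. `SketchEditor` is a hypothetical class for illustration; this README does not show the real Edit tool's internals.

```typescript
// Hypothetical sketch of "exact unique-substring replace" plus a
// read-before-edit staleness guard. Not the library's Edit tool.
class SketchEditor {
  private files = new Map<string, string>();
  // Snapshot of what the caller last saw, per path.
  private lastRead = new Map<string, string>();

  // Simulates any write that bypasses the editor (another process, another tool).
  write(path: string, content: string): void {
    this.files.set(path, content);
  }

  read(path: string): string {
    const content = this.files.get(path);
    if (content === undefined) throw new Error(`no such file: ${path}`);
    this.lastRead.set(path, content);
    return content;
  }

  edit(path: string, oldText: string, newText: string): void {
    const current = this.files.get(path);
    if (current === undefined) throw new Error(`no such file: ${path}`);
    const seen = this.lastRead.get(path);
    if (seen === undefined) throw new Error(`read ${path} before editing it`);
    if (seen !== current) throw new Error(`${path} changed since it was last read`);
    if (oldText === "") throw new Error("old text must not be empty");

    const first = current.indexOf(oldText);
    if (first === -1) throw new Error("old text not found");
    if (current.indexOf(oldText, first + 1) !== -1) {
      throw new Error("old text is not unique; add surrounding context");
    }
    const updated =
      current.slice(0, first) + newText + current.slice(first + oldText.length);
    this.files.set(path, updated);
    this.lastRead.set(path, updated); // the editor has seen its own result
  }
}
```

Rejecting ambiguous or stale edits means a bad replacement fails loudly instead of silently changing the wrong occurrence.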

Agentic subsystems

Beyond file tools, the runtime ships the higher-altitude pieces too — each an AgentOptions/loop extension over the two seams (see mind/06):

  • Skills + memory — VFS-backed (skillsDir/memoryDir); persistence is just the backend choice.
  • Subagents (subagents; typed agents via agentsDir: <dir>/<name>.md defines a persona + model + scoped tools, selected with the Task agentType), hooks (hooks: preToolUse/postToolUse/onStop — block or audit any tool call), slash-commands (commandsDir), TodoWrite, MCP (mcpToolsToAgentTools).
  • Streaming (stream: true → text_delta via HostBridge.notify) and context compaction (compaction: { maxMessages } → edge-safe summarize-and-boundary). Defaults preserve the original non-stream, drop-oldest behavior.
  • Multi-turn + project context: Agent.send() continues a conversation across turns (vs run(), which starts fresh); project instructions (instructionFiles: AGENTS.md/CLAUDE.md at the FS root) inject into the system prompt.
  • DuplexAgent (src/duplex.ts) — voice-optimized three-tier engine (reflex/act/think): a fast reflex agent streams instant replies and self-selects escalation — Act for standard tool work (Sonnet-class), Think for deep reasoning (Opus-class, configurable, default on). Results are pushed back and re-voiced by the reflex (turn mutex, coalesced completions, TaskStatus/CancelTask). See mind/10.
  • Scheduler (src/scheduler.ts + cli/osScheduler.ts) — one-off ({at}), interval ({everyMs}), cron ({cron}) via ScheduleTask/ScheduleList/ScheduleCancel/Wakeup. In-session jobs fire while the session is alive (persisted, re-armed on --resume); far one-offs (or backend:'os') register with the OS scheduler (launchd / crontab / at) and survive quitting — the fired job headless-resumes the session (agentx -p … --resume <id> --yes). The PushNotification tool (osascript / notify-send) alerts the user out-of-band; Read on a .pdf returns extracted text (poppler's pdftotext, disk mode). RemoteTrigger invokes another agentx session on this machine: a session open in a live terminal receives the prompt as an injected turn (per-session unix socket, same-user only); otherwise it's resumed headless and the final answer comes back. See mind/12.
  • Budget kill-switches — always-on per-run guards (maxTokens/timeoutMs/maxRepeats/maxToolCalls/signal → finishReason budget/timeout/loop/max_tool_calls/aborted) protect the API spend against runaway loops. The enforceable billing cap is server-side in the web key-proxy: a VFS-backed budget config (/.agent/budget.json, USD-metered, hot-reloaded, $100/wk default) a browser client can't bypass. See web/ and mind/06.
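The per-run guards in the last bullet can be pictured as a small counter object consulted before every tool call. The sketch below borrows the option names and finish reasons listed above, but the semantics are assumptions: in particular, treating maxRepeats as "consecutive identical calls" is a guess, and `BudgetGuard` is a hypothetical class rather than part of the package.

```typescript
// Hypothetical sketch of per-run kill-switches; semantics are assumed.
type GuardFinishReason = "max_tool_calls" | "loop" | "timeout";

interface BudgetLimits {
  maxToolCalls: number; // total tool calls allowed in one run
  maxRepeats: number;   // ASSUMED: consecutive identical calls allowed
  timeoutMs: number;    // wall-clock budget for the run
}

class BudgetGuard {
  private toolCalls = 0;
  private repeats = 0;
  private lastSignature = "";
  private readonly startedAt: number;
  private readonly limits: BudgetLimits;
  private readonly now: () => number;

  // The clock is injectable so the guard can be tested without waiting.
  constructor(limits: BudgetLimits, now: () => number = () => Date.now()) {
    this.limits = limits;
    this.now = now;
    this.startedAt = now();
  }

  // Call before each tool invocation; a non-null result means "stop the run".
  check(toolName: string, args: unknown): GuardFinishReason | null {
    if (this.now() - this.startedAt > this.limits.timeoutMs) return "timeout";

    this.toolCalls += 1;
    if (this.toolCalls > this.limits.maxToolCalls) return "max_tool_calls";

    const signature = toolName + ":" + JSON.stringify(args);
    this.repeats = signature === this.lastSignature ? this.repeats + 1 : 1;
    this.lastSignature = signature;
    if (this.repeats > this.limits.maxRepeats) return "loop";

    return null;
  }
}
```

Because the guard runs inside the agent loop, it protects spend even when the model itself never decides to stop.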

The agentx CLI

A dependency-light readline REPL (plus headless -p mode) over the runtime:

agentx                      # interactive REPL in the current dir
agentx "fix the bug in x"   # run once and exit
agentx -c "keep going"      # continue the most recent session
agentx --resume <id> "…"    # resume a specific session

  • Filesystem + Shell — by default the CLI has full real-filesystem access like Claude Code (root / is the machine root, the launch dir is the working dir, absolute host paths and above-cwd reach both work) with a real /bin/sh (Shell tool) so the agent can run git, bun, node, curl, and any installed binary. Secrets (.env, .ssh, keys, .git) stay hidden by the jail; env secrets are scrubbed from the child shell. --sandbox instead operates over an in-memory copy of the working dir with a VFS-only bash — the real disk is never touched. --boddb <dir> runs over a persistent database workspace (a bod-db store at <dir>: meta.db tree + files/ bytes) that survives across runs while the real disk stays untouched; DB-native by default, or add --seed to hydrate it from cwd on the first run. --no-shell forces the VFS bash in disk mode. --harden OS-sandboxes the real shell (macOS sandbox-exec / Linux bwrap): writes confined to cwd+tmp, outbound network blocked (--harden-net keeps network); commands fail closed when no wrapper exists. (/sandbox shows the active mode.)
  • Sessions — every conversation persists to ./.agent/sessions/<id>.json, flushed at every tool step (a crash, hang, or Ctrl-C mid-turn loses at most the in-flight step, never the transcript); --continue/--resume (and /sessions, /resume) pick it back up, with memory across turns — a REPL turn sees the previous one. A global symlink index at ~/.agent/sessions/ enables cross-project lookup: --resume 090715-myproject resolves from any directory, and /sessions all lists every project's sessions in one picker.
  • Diffs — every Edit/Write/MultiEdit renders a colorized +/- diff (TTY-gated; plain when piped).
  • Slash commands: /help /tools /model /compact /copy /diff /memory /clear /sessions /resume /commands /init; /compact <focus> preserves matching lines from the folded span; /copy [code] puts the last reply (or its last code block) on the OS clipboard; /diff shows everything the session changed (oldest checkpoint → now); /memory opens the memory index in $EDITOR; user-defined ./.agent/commands/<name>.md are invokable directly as /<name> (the same registry the model's SlashCommand tool uses). Skills/commands created mid-session are picked up automatically each turn (delivered as a cache-friendly <system-reminder> delta, like Claude Code) and the Skill/SlashCommand tools rescan on a name miss; /reload forces a full catalog + system-prompt rebuild.
  • Live chrome — the thinking spinner shows elapsed seconds + esc to interrupt; the terminal tab title tracks the session topic; a bell rings when a long (>10s) turn finishes in a backgrounded tab; the footer warns at 80%/90% context pressure and auto-trims announce themselves.
  • /transcript [n] — the full session transcript including complete tool-result bodies (the past-turn equivalent of Ctrl-O live verbose), paged through less; /doctor — one-shot environment sanity check (keys, model pricing, config, session-store writability, memory, MCP mounts).
  • Syntax-highlighted code fences: ```ts (and js/py/sh/go/rust/…) blocks render with keywords bold, strings green, numbers cyan, comments dim; unknown languages keep the plain cyan body. TodoWrite plans pin a compact ☑ 2/5 · current step line into the idle footer.
  • /agents — list subagent types from ./.agent/agents (description, model, tool scope); /agents new <name> scaffolds a frontmatter'd definition for the Task tool's agentType. !<partial> + menu completes from past ! shell commands. @server:uri mentions inline an MCP resource body into the prompt. Transient network drops mid-step retry automatically (2 attempts, backoff) instead of failing the turn.
  • Project instructions: ./AGENTS.md (or CLAUDE.md) auto-loads into every run; /init scaffolds one.
  • Any provider — set ANTHROPIC_API_KEY / OPENAI_API_KEY / GOOGLE_API_KEY / GROQ_API_KEY; choose with -m provider/model.
  • @-file mentions & headless JSON — reference files inline in a prompt with @path (e.g. explain @src/Agent.ts; ~/ expands to the home directory; quote paths with spaces as @"…" — drag-dropped files, e.g. macOS screenshots, quote themselves automatically); script with -p --output-format json to get one machine-readable result object on stdout (activity stays on stderr).
  • Tab-completion: Tab completes /<command> names and @<path> file/dir references (descends subdirs, dotfiles hidden unless typed) straight from the working tree.
  • Duplex mode: agentx --duplex runs the full standard REPL (slash commands, sessions, postures, rewind, MCP) with the three-tier engine driving turns: a fast voice model (--voice-model, default groq/openai/gpt-oss-120b) answers every line instantly and delegates real work to background workers built with the same wiring as a normal run (fs mode, permissions, MCP); worker activity shows as dim chrome and results are re-voiced when ready. Switch any tier live with /model (opens a reflex/act/think picker), or the /voice-model · /think-model shortcuts. /tasks lists background tasks, inspects a task's live output tail, and cancels a running one from a picker (Esc mid-turn cancels the foreground turn; Esc again at the idle prompt cancels running workers).
  • MCP servers — declare mcpServers: { name: { command, args } | { url } } in config and they're auto-mounted at startup (in parallel, with an optional mountTimeoutMs deadline so one slow/dead server never blocks the rest): the client does the JSON-RPC handshake (stdio or HTTP) + tools/list, and the discovered tools appear as mcp__<name>__<tool> in /tools (inspect with /mcp). A bad server is logged and skipped, never blocking the agent. For large tool sets, deferred mode (makeMcpToolSearch / mountMcpDeferred) exposes just two bounded tools (ToolSearch + McpCall) instead of N defs — dodging the provider tool-cap and improving selection accuracy; the CLI applies this automatically past 12 mounted tools (a 42-tool server was costing ~80k tok/turn in schema alone), and permission rules written against the real mcp__<name>__<tool> names still match through McpCall. mountMcpCatalog goes further: a cached, hash-keyed catalog + lazy connect means a turn that uses no MCP tool opens zero connections, and one that uses a tool connects exactly that server — latency scales with tools-used, not servers-configured. A down server is negative-cached (failureCooldownMs) so it never re-floors a later turn at the deadline. For zero turn-path latency even on a cold process, call warmMcpCatalog at boot + on a timer (off-turn discovery) and mount with { discover: 'cache-only' } — the turn then never synchronously connects: it serves the warmed catalog and discovers any miss in the background.
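Two ideas from the MCP bullet lend themselves to a short illustration: the mcp__<name>__<tool> naming scheme and negative caching of a down server for a cooldown period. The sketch below is an assumption-laden illustration. `mcpToolName`, `shouldDeferTools`, and `NegativeCache` are hypothetical helpers, and only the name shape, the 12-tool threshold, and the cooldown concept (failureCooldownMs) come from the text above.

```typescript
// Tool names follow the mcp__<name>__<tool> shape quoted in the text.
function mcpToolName(server: string, tool: string): string {
  return `mcp__${server}__${tool}`;
}

// The text says the CLI switches to deferred mode "past 12 mounted tools".
const DEFERRED_THRESHOLD = 12;
function shouldDeferTools(mountedToolCount: number): boolean {
  return mountedToolCount > DEFERRED_THRESHOLD;
}

// Hypothetical negative cache: remember a failed server for cooldownMs so a
// later turn skips it immediately instead of waiting for the mount deadline.
class NegativeCache {
  private failedAt = new Map<string, number>();
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(cooldownMs: number, now: () => number = () => Date.now()) {
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  markFailed(server: string): void {
    this.failedAt.set(server, this.now());
  }

  markHealthy(server: string): void {
    this.failedAt.delete(server);
  }

  // True while the server is still inside its cooldown window.
  shouldSkip(server: string): boolean {
    const at = this.failedAt.get(server);
    if (at === undefined) return false;
    if (this.now() - at >= this.cooldownMs) {
      this.failedAt.delete(server); // cooldown over: allow one retry
      return false;
    }
    return true;
  }
}
```

A caller would consult `shouldSkip` before attempting a connection and call `markFailed` when a connect attempt errors or times out.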

🧬 It improves itself

The agent is a coding agent that operates over a swappable filesystem — so it can be pointed at its own repo and evolve its own configuration. evolve/ is an autonomous loop:

champion → propose patch → jailed + sandboxed eval → per-task no-regression gate → ledger → repeat

An LLM is the mutation operator; a behavioral fitness function (run the produced code) is natural selection. Correctness is a hard gate, the rule files are hash-pinned (the agent can't edit what judges it), and every candidate runs under two containment boundaries — a JailedFilesystem (secret denylist, symlink-escape defense) and a sandboxed grader (scrubbed env, nonce-authenticated result, default-on sandbox-exec). Those guardrails were hardened against a 22-agent adversarial red-team (14 findings fixed) before the loop was allowed to run.
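To give a feel for what a secret denylist inside a jailed filesystem involves, here is a minimal path-check sketch. It normalizes a requested path, rejects anything that climbs above the root, and rejects paths containing a denylisted segment. `resolveInJail` and `SECRET_SEGMENTS` are hypothetical. The entries are taken from the secrets this README mentions (.env, .ssh, .git). Symlink-escape defense needs real filesystem metadata and is deliberately left out.

```typescript
// Hypothetical denylist; exact-segment matching only (".env.local" would pass).
const SECRET_SEGMENTS = [".env", ".ssh", ".git"];

// Returns a normalized absolute path inside the jail, or null when the request
// escapes the root or touches a denylisted segment.
function resolveInJail(requested: string): string | null {
  const resolved: string[] = [];
  for (const segment of requested.split("/")) {
    if (segment === "" || segment === ".") continue; // skip empty and no-op parts
    if (segment === "..") {
      if (resolved.length === 0) return null; // would climb above the jail root
      resolved.pop();
      continue;
    }
    resolved.push(segment);
  }
  // Check after normalization so "a/../.env" cannot sneak past the filter.
  for (const segment of resolved) {
    if (SECRET_SEGMENTS.indexOf(segment) !== -1) return null;
  }
  return "/" + resolved.join("/");
}
```

Checking the normalized path rather than the raw string is the important part: a filter that inspects only the raw input is easy to bypass with `..` segments.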

Result (Sonnet 4.6): the loop autonomously drove baseline 32 → 15 tool-calls (53% fewer) with 5/5 pass held, reaching parity with Claude Code (head-to-head 15 vs 15 tools, 1.8× faster, 2.8× fewer tokens) and closing the efficiency gap we'd only described before. This is the denoised figure (each candidate averaged over 3 runs so no lucky run promotes); a single un-averaged run reached 14. It generalizes to held-out tasks (24 → 12, no overfit) and discovered the human-authored parity plan on its own: use structured Grep/MultiEdit, stop over-exploring.
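The promotion rule described above (average each candidate over several runs, never accept a correctness regression) can be sketched as follows. This is an illustration under assumptions: the fitness here is simply the summed mean tool-calls per task, and `TaskRun`, `denoise`, and `shouldPromote` are hypothetical names, not the contents of evolve/.

```typescript
// One graded execution of one task by one candidate configuration.
interface TaskRun {
  task: string;
  passed: boolean;
  toolCalls: number;
}

interface TaskScore {
  passed: boolean;       // true only if EVERY repeated run passed
  meanToolCalls: number; // average over the repeated runs
}

// Collapse repeated runs into one score per task so a single lucky run
// cannot carry a candidate.
function denoise(runs: TaskRun[]): Record<string, TaskScore> {
  const grouped: Record<string, TaskRun[]> = {};
  for (const run of runs) {
    (grouped[run.task] = grouped[run.task] || []).push(run);
  }
  const scores: Record<string, TaskScore> = {};
  for (const task of Object.keys(grouped)) {
    const group = grouped[task];
    scores[task] = {
      passed: group.every((r) => r.passed),
      meanToolCalls: group.reduce((sum, r) => sum + r.toolCalls, 0) / group.length,
    };
  }
  return scores;
}

// Per-task no-regression gate: correctness is a hard gate, efficiency is the
// fitness. ASSUMPTION: fitness = total of per-task mean tool-calls.
function shouldPromote(
  champion: Record<string, TaskScore>,
  candidate: Record<string, TaskScore>,
): boolean {
  let championTotal = 0;
  let candidateTotal = 0;
  for (const task of Object.keys(champion)) {
    const before = champion[task];
    const after = candidate[task];
    if (after === undefined) return false;              // task was not evaluated
    if (before.passed && !after.passed) return false;   // correctness regressed
    championTotal += before.meanToolCalls;
    candidateTotal += after.meanToolCalls;
  }
  return candidateTotal < championTotal;
}
```

With three runs per task, a candidate that once finishes in 2 tool-calls but otherwise needs 9 or 10 averages out to 7 and loses to a champion that is steady at 6.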

GENERATIONS=8 bun evolve/loop.ts     # evolve → evolve/champion.json + ledger.jsonl
bun evolve/report.ts                 # instant replay of the arc (no tokens)
EVOLVED=1 bun compare/run.ts         # evolved champion vs Claude Code
bun evolve/generalize.ts             # baseline vs champion on UNSEEN tasks

Full design + threat model + results: mind/08-self-evolve.md.

Status

v1 (done): loop + hybrid tools + Mem/Disk backends + deterministic FakeAIClient tests + real-model run. 5/5 pass@1 on the behavioral eval (Sonnet 4.6); the head-to-head started at correctness parity with Claude Code but ~2× the tool calls (≈28 vs 15) — a gap the self-evolution loop has now closed autonomously: it drove its own baseline from 32 → 15 tool-calls (denoised over 3 runs) and ties Claude Code in a fresh head-to-head (15 vs 15). 820+ tests green.

See mind/ for the full vision, architecture, decision journal, roadmap, eval + head-to-head results, the parity plan, and the self-evolution design.

Develop & evaluate

Hacking on the runtime itself (from a clone):

bun install                # links wcli (file:), ai.libx.js + libx.js (bun link)
bun test                   # 820+ unit/integration tests (offline via FakeAIClient, no key)
ANTHROPIC_API_KEY=… bun examples/run-sonnet.ts   # drive a real model end-to-end

Eval & head-to-head (real model):

bun eval/run.ts            # behavioral scorecard (our agent over MemFilesystem)
bun compare/seed-tasks.ts  # materialize task specs into .tmp/tasks/
bun compare/run.ts         # head-to-head vs Claude Code (needs the `claude` CLI)