kontex-proxy

v0.1.1

Local HTTP proxy + dashboard for AI agent developers. Intercept, inspect, replay, and fork every LLM call — no cloud required.

Kontex CLI

Local HTTP proxy + dashboard for AI agent developers.
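One command intercepts every LLM API call, saves a full snapshot locally, and opens a "Control Room" dashboard. There is no cloud service and no required configuration, and snapshots stay on your machine; requests are still forwarded to the upstream LLM API you configure.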


Dashboard


Why Kontex?

When you're building AI agents, you need to answer questions like:

  • Which LLM call caused the bad output?
  • What was the exact context when the agent went off-track?
  • Can I replay this run with a different response at step 3?

Kontex intercepts every call at the proxy layer, so you get full observability without changing your agent's logic: just point its base URL at localhost:8080.


What it does

Your agent  →  localhost:8080  →  OpenAI / Anthropic / Ollama / any LLM API
                    │
                    ├── Saves raw prompt + response to .kontex.db (SQLite)
                    ├── Optionally trims context (lossless, toggleable)
                    └── Serves dashboard at GET /

Key features

| Feature | Description |
|---|---|
| Proxy | Intercepts every POST /* call and forwards to your upstream LLM |
| Snapshots | Saves the full untrimmed prompt and response to SQLite; nothing is lost |
| Context trimmer | Structurally lossless trimming applied before the upstream call; toggleable from the dashboard |
| Session grouping | Groups related agent runs into sessions via a request header |
| Multi-agent graph | Swim-lane view showing every agent's trajectory and cross-agent links |
| Live pause | Pause a request mid-flight, inspect it, then resume with edited messages |
| Fork & replay | Branch from any snapshot with a human-edited response; downstream calls replay deterministically |
| Branch chain | Create a new agent task from any snapshot, staying in the same session |


Requirements

  • Node.js 18+
  • npm 9+

Installation

Option A — global install (recommended)

npm install -g kontex-proxy
kontex start

Option B — clone and build

git clone https://github.com/pankaj-agrawalla/kontex-cli.git
cd kontex-cli
npm install
cd web && npm install && cd ..
npm run build

Configuration

Copy .env.example and edit as needed:

cp .env.example .env

# .env
KONTEX_PORT=8080           # Port for the proxy + dashboard (default: 8080)
UPSTREAM_URL=https://api.openai.com   # LLM API to forward requests to

To use with Ollama locally:

UPSTREAM_URL=http://localhost:11434

To use with Anthropic:

UPSTREAM_URL=https://api.anthropic.com

Usage

Start the server

kontex start

The browser opens automatically at http://localhost:8080.

Or with a custom port:

kontex start --port 9000

Point your agents at Kontex

Change your agent's base URL from the LLM provider to the Kontex proxy:

http://localhost:8080

No other code changes are required. All requests are transparently proxied.

Example — OpenAI SDK:

import OpenAI from "openai"

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "http://localhost:8080/v1",   // ← point at Kontex
})

Example — LangChain:

import { ChatOpenAI } from "@langchain/openai"

const llm = new ChatOpenAI({
  openAIApiKey: process.env.OPENAI_API_KEY,
  configuration: {
    baseURL: "http://localhost:8080/v1",  // ← point at Kontex
  },
})

Example — raw fetch:

await fetch("http://localhost:8080/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json", "Authorization": `Bearer ${apiKey}` },
  body: JSON.stringify({ model: "gpt-4o", messages }),
})

Optional request headers

These headers unlock richer dashboard views. They are stripped before forwarding upstream — your LLM never sees them.

| Header | Purpose |
|---|---|
| X-Kontex-Task-Id | Groups snapshots into a named agent task (swim lane in the graph). Defaults to "default" if omitted. |
| X-Kontex-Session-Id | Groups all tasks from one run into a single session entry in the sidebar. |
| X-Kontex-Parent-Task-Id | Records a cross-agent link (draws an amber dashed edge). Send on the first turn only of a child agent. |
| X-Kontex-Fork-Id | Enables deterministic replay. Set to the task ID you forked from. |

Without any headers, everything still works — all snapshots land under the "default" task and appear in the dashboard.

With headers (recommended for multi-agent workflows):

const headers = {
  "X-Kontex-Task-Id": "coder-agent",
  "X-Kontex-Session-Id": "run-2024-001",
  // first turn of a child agent only; names the parent task, not this one:
  "X-Kontex-Parent-Task-Id": "planner-agent",
}
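
Because X-Kontex-Parent-Task-Id should be sent on the first turn only, it can help to build the header set per request rather than reuse one static object. The following is a minimal sketch: `kontexHeaders` and its option names are hypothetical and not part of Kontex; only the header names come from the table above.

```javascript
// Hypothetical helper (not part of Kontex): builds the optional X-Kontex-* headers.
// Only the header names are taken from the README; the option names are illustrative.
function kontexHeaders({ taskId, sessionId, parentTaskId, firstTurn = false } = {}) {
  const headers = {}
  if (taskId) headers["X-Kontex-Task-Id"] = taskId
  if (sessionId) headers["X-Kontex-Session-Id"] = sessionId
  // The parent link is only attached on the first turn of a child agent.
  if (parentTaskId && firstTurn) headers["X-Kontex-Parent-Task-Id"] = parentTaskId
  return headers
}

// First turn of the child agent: includes the parent link.
const first = kontexHeaders({ taskId: "coder", sessionId: "run-1", parentTaskId: "planner", firstTurn: true })
// Later turns: same task and session, no parent link.
const later = kontexHeaders({ taskId: "coder", sessionId: "run-1", parentTaskId: "planner" })
```

The result can be spread into the headers of a fetch call, for example `headers: { "Content-Type": "application/json", ...first }`.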

The Dashboard

Open http://localhost:8080 in your browser.

Sidebar (left)

  • Lists all sessions ordered newest-first
  • Each entry shows the session ID, timestamp, agent count, and snapshot count
  • Click a session to load its graph
  • Context trimmer toggle at the bottom — turn trimming on or off in real time

Graph (center)

  • One swim-lane column per agent task
  • Nodes = individual LLM calls (snapshots)
  • Gray edges = within the same agent
  • Amber dashed animated edges = cross-agent links (parent → child)
  • Amber-bordered nodes = human-edited snapshots
  • Click any node to open the snapshot drawer

Snapshot drawer (right)

Opens when you click a node. Shows:

  • The full conversation messages sent to the LLM
  • Live Pause — pauses the next request from this task mid-flight so you can inspect and edit messages before they reach the LLM
  • Fork & Edit — save a human-edited version of the messages; the next replay of this prompt hash will return your edited version instead of calling the LLM
  • Branch chain here — create a new agent task (in the same session) branching from this point, with an editable LLM response

Context trimmer

The trimmer applies three structurally lossless passes before forwarding to the upstream LLM:

  1. Tool result truncation — long tool/function responses are sliced to prevent runaway context growth
  2. Middle-turn compression — older assistant turns in the middle of a long conversation are shortened
  3. System prompt deduplication — repeated system content across turns is reduced

The raw untrimmed payload is always saved to the database — trimming only affects what is forwarded upstream.

Toggle it on/off live from the sidebar without restarting the server.
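
To illustrate the idea behind the first pass, the sketch below truncates long tool results in an OpenAI-style messages array. It is not Kontex's implementation: the function name, the default limit of 2000 characters, and the truncation marker are all assumptions, since the README does not state the actual thresholds or output format.

```javascript
// Illustrative only, not Kontex's trimmer. The limit and the marker text are assumed values.
function truncateToolResults(messages, maxChars = 2000) {
  return messages.map((msg) => {
    const isToolResult = msg.role === "tool" || msg.role === "function"
    if (!isToolResult || typeof msg.content !== "string" || msg.content.length <= maxChars) {
      return msg // other roles and short results pass through unchanged
    }
    const omitted = msg.content.length - maxChars
    // Return a copy so the original message (the payload that would be saved) is not mutated.
    return { ...msg, content: `${msg.content.slice(0, maxChars)}\n[... ${omitted} chars truncated]` }
  })
}

const original = [
  { role: "user", content: "List the files" },
  { role: "tool", tool_call_id: "call_1", content: "x".repeat(5000) },
]
const trimmed = truncateToolResults(original, 100)
```

Returning copies instead of mutating in place mirrors the behavior described above, where the raw payload is stored untouched and only the forwarded request is shortened.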


Multi-agent workflow example

const SESSION_ID = `run-${Date.now()}`

// Agent 1 — Planner
const plannerResponse = await fetch("http://localhost:8080/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${apiKey}`,
    "X-Kontex-Task-Id": "planner",
    "X-Kontex-Session-Id": SESSION_ID,
  },
  body: JSON.stringify({ model: "gpt-4o", messages: plannerMessages }),
})

// Agent 2 — Coder (links back to planner)
const coderResponse = await fetch("http://localhost:8080/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${apiKey}`,
    "X-Kontex-Task-Id": "coder",
    "X-Kontex-Session-Id": SESSION_ID,
    "X-Kontex-Parent-Task-Id": "planner",   // ← first turn only
  },
  body: JSON.stringify({ model: "gpt-4o", messages: coderMessages }),
})

This produces a dashboard with two swim lanes and an amber edge from Planner → Coder, grouped under one session.


Database

All data is stored in .kontex.db (SQLite) in the project root. The file is created automatically on first run.

To start completely fresh:

rm .kontex.db
kontex start

Schema

CREATE TABLE Snapshots (
  id                 TEXT PRIMARY KEY,   -- cuid
  task_id            TEXT NOT NULL,      -- from X-Kontex-Task-Id header
  parent_id          TEXT,               -- previous snapshot in the same task
  parent_task_id     TEXT,               -- from X-Kontex-Parent-Task-Id header
  session_id         TEXT,               -- from X-Kontex-Session-Id header
  prompt_hash        TEXT NOT NULL,      -- MD5 of messages array (for replay lookup)
  raw_prompt_payload TEXT NOT NULL,      -- original untrimmed JSON body
  llm_response       TEXT,              -- raw response from upstream
  is_human_edited    INTEGER DEFAULT 0, -- 1 if created via fork
  created_at         INTEGER NOT NULL   -- Unix ms
);
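
The parent_id column links each snapshot to the previous one in the same task, so a task's calls can be put back in order from the rows alone. The sketch below does that for rows already loaded from the Snapshots table. The function name is hypothetical, and it assumes a simple linear chain (one child per snapshot); forked snapshots may create several children of one parent, which this sketch does not handle.

```javascript
// Hypothetical helper: orders one task's rows by following parent_id links.
// Assumes a linear chain, i.e. at most one child per snapshot.
function orderTaskChain(rows) {
  const byParent = new Map()
  for (const row of rows) byParent.set(row.parent_id ?? null, row)

  const chain = []
  let current = byParent.get(null) // the first snapshot in a task has no parent
  // The length check guards against an accidental cycle in the data.
  while (current && chain.length < rows.length) {
    chain.push(current)
    current = byParent.get(current.id)
  }
  return chain
}

const rows = [
  { id: "c", task_id: "planner", parent_id: "b" },
  { id: "a", task_id: "planner", parent_id: null },
  { id: "b", task_id: "planner", parent_id: "a" },
]
const orderedIds = orderTaskChain(rows).map((row) => row.id)
```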

Internal API

These endpoints power the dashboard. You can also call them directly.

| Method | Path | Description |
|---|---|---|
| GET | /health | Health check |
| GET | /api/sessions | List all sessions |
| GET | /api/tasks | List all task IDs |
| GET | /api/graph?session=<id> | Combined graph (nodes + edges) for a session |
| GET | /api/tasks/:id/graph | Graph for a single task |
| GET | /api/snapshots/:id | Full snapshot detail |
| POST | /api/snapshots/:id/pause | Pause the next request on this snapshot |
| POST | /api/snapshots/:id/resolve | Resume a paused request with edited messages |
| POST | /api/snapshots/:id/fork | Create a human-edited snapshot (same task) |
| POST | /api/snapshots/:id/fork-chain | Create a new task branching from this snapshot |
| GET | /api/trimmer | Get trimmer state { enabled: boolean } |
| POST | /api/trimmer/toggle | Toggle trimmer on/off |
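
Since these are plain HTTP endpoints, they can be scripted. The sketch below builds the graph URL for a session and reads the trimmer state. The function names and the injectable `fetchImpl` parameter are illustrative; only the paths and the `{ enabled: boolean }` response shape come from the table above. It assumes Node.js 18+, where `fetch` is available globally.

```javascript
const KONTEX_URL = "http://localhost:8080" // default port from the Configuration section

// Builds the URL for GET /api/graph?session=<id>, escaping the session ID.
function graphUrl(sessionId, baseUrl = KONTEX_URL) {
  return `${baseUrl}/api/graph?session=${encodeURIComponent(sessionId)}`
}

// Resolves to the `enabled` flag reported by GET /api/trimmer.
// fetchImpl is injectable so the function can be exercised without a running server.
async function isTrimmerEnabled(baseUrl = KONTEX_URL, fetchImpl = fetch) {
  const res = await fetchImpl(`${baseUrl}/api/trimmer`)
  if (!res.ok) throw new Error(`GET /api/trimmer failed with status ${res.status}`)
  const { enabled } = await res.json()
  return enabled
}
```

With `kontex start` running, `isTrimmerEnabled().then(console.log)` prints the current trimmer state.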


Development

Run the backend and frontend separately with hot reload:

# Terminal 1 — backend
npm run dev

# Terminal 2 — frontend
cd web && npm run dev

The Vite dev server runs on port 5173 and proxies /api to localhost:8080.


E2E test

Requires Ollama running locally with llama3.2:1b:

ollama pull llama3.2:1b
npm run build
npm run e2e

The test simulates a 3-agent pipeline (Planner → Coder → Reviewer) and verifies snapshots, cross-agent edges, session grouping, fork/replay, and edge cases. It exits with code 0 only when every check passes.


Contributing

Issues and PRs are welcome. Please open an issue first for significant changes.


License

MIT — see LICENSE.