nc-pilot-debug-server
v0.2.0
Local debug sink + CLI for the NC-Pilot browser extension. Captures agent steps (model requests, responses, tool calls) and starts/stops from the terminal or the extension.
NC-Pilot Debug Server
A tiny, zero-dependency observability server for AI browser agents. It captures everything the NC-Pilot browser extension's agent does — every model request, model response, reasoning trace, tool call, and tool result — prints it live to your terminal, and stores it grouped by conversation so you can replay, analyze, and tune agent behavior.
Think of it as a flight recorder for an autonomous browser agent.
Pure Node.js. No npm dependencies. One file. Runs anywhere Node 18+ runs.
Why this exists
Autonomous browser agents (read the page → decide → click/type/navigate) are hard to debug. When a run goes wrong you usually only see the final answer, not why the model chose a wrong element or got stuck in a loop. This server records the full decision trace so you can see exactly what the model saw and did at every step — and hand that trace to an LLM to get fixes.
Features
- Live trace — color-coded, human-readable stream of every agent step in your terminal.
- Full fidelity capture — records the complete `messages` array the model received each step (system prompt + history + tool results), so prompt/tool-description problems are visible.
- Conversation grouping — every event is tagged with a UUID `conversationId`; events are stored per conversation under `conversations/<id>.ndjson`.
- Chat restore — the extension can fetch a conversation back from this server on startup, so reopening the side panel restores the previous chat (history + transcript).
- Reasoning capture — captures model "thinking" output when the model emits it.
- Simple HTTP API — list/read/delete conversations as JSON.
- NDJSON storage — one JSON object per line; trivially greppable, diffable, and feedable to any LLM or data pipeline.
- Zero dependencies, zero config — `node server.js` and you're recording.
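Because the storage format is one JSON object per line, reading a trace back takes only a few lines of Node. The following is a minimal sketch, not code from this package: `parseNdjson` is a hypothetical helper, the sample lines are hand-written, and skipping blank lines is an assumption about the files.

```javascript
// Minimal sketch: parse NDJSON text (one JSON object per line) into an array.
// Blank lines are skipped; that is an assumption, not documented behavior.
function parseNdjson(text) {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}

// Hypothetical sample lines shaped like the events this README describes.
const sample = [
  '{"type":"run_start","conversationId":"c1","ts":1,"userMessage":"hi"}',
  '{"type":"tool_result","conversationId":"c1","ts":2,"name":"get_dom","ok":true}',
  "",
].join("\n");

const events = parseNdjson(sample);
console.log(events.map((e) => e.type)); // [ 'run_start', 'tool_result' ]
```

To read a real capture, pass the file contents instead of the sample, for example `parseNdjson(require("node:fs").readFileSync(path, "utf8"))` with `path` pointing at a file under `conversations/`.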
Use cases
- Debugging agent failures — see the exact step where the agent picked the wrong element, reused a stale index, or stopped early.
- Prompt & tool tuning — the `model_request` event holds the full context the model saw; diff good vs. bad runs to refine the system prompt or tool descriptions.
- Crawling / scraping observability — when you drive the agent to crawl listings, fill forms, paginate, or extract data across pages, capture the whole multi-step run: which elements were scanned (`get_dom`), what was clicked, what each page returned (`read_page`). Replay it to confirm coverage or find where extraction broke.
- Dataset capture — build a labeled corpus of (page state → chosen action) pairs from real runs for evaluation or fine-tuning.
- Regression checks — keep traces of known-good tasks; re-run after changing the prompt or model and compare.
- LLM-assisted analysis — hand a conversation's NDJSON to Claude/GPT and ask "why did the agent loop here, and how should I fix the prompt?"
- Conversation persistence in dev — restore a chat after a service-worker restart without losing context.
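For the regression-check and good-vs-bad-run use cases above, one simple approach is to compare the order in which two runs executed tools. This is a rough sketch under stated assumptions: it relies only on the documented `name` field of `tool_result` events, the helper names `toolSequence` and `firstDivergence` are hypothetical, and the two traces are hand-written for illustration.

```javascript
// Sketch: list the tools a run executed, in order, from its tool_result events.
function toolSequence(events) {
  return events.filter((e) => e.type === "tool_result").map((e) => e.name);
}

// Index of the first position where the two sequences differ, or -1 if identical.
function firstDivergence(a, b) {
  const n = Math.max(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) return i;
  }
  return -1;
}

// Hypothetical traces: the "bad" run scans the DOM twice before opening a tab.
const good = [
  { type: "tool_result", name: "get_dom", ok: true },
  { type: "tool_result", name: "open_tab", ok: true },
];
const bad = [
  { type: "tool_result", name: "get_dom", ok: true },
  { type: "tool_result", name: "get_dom", ok: true },
  { type: "tool_result", name: "open_tab", ok: true },
];

const divergedAt = firstDivergence(toolSequence(good), toolSequence(bad));
console.log(divergedAt); // 1
```

A tool-name comparison is deliberately coarse; it flags where two runs part ways, after which the `model_request` event at that step shows what the model saw.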
Benefits
- See the model's actual input, not just its output — the #1 thing missing from most agent debugging.
- No infrastructure — no database, no cloud, no API keys. A single local file.
- Privacy by design — runs entirely on your machine; nothing is sent anywhere.
- Portable data — plain NDJSON works with `jq`, `grep`, pandas, or any LLM.
Requirements
- Node.js 18+
- The NC-Pilot browser extension (the source of the events)
Install & run
Quickest — npx (no clone)
npx nc-pilot-debug-server start # start in the background
npx nc-pilot-debug-server status # is it running?
npx nc-pilot-debug-server stop # stop it
CLI commands
| command | what it does |
|---|---|
| start | start the server detached (background); records a PID so stop can find it |
| start -f | start in the foreground (this terminal) |
| stop | stop the running server (graceful POST /shutdown, falls back to killing the PID) |
| status | show whether it's running + reachable |
| restart | stop then start |
| --port, -p <N> | listen on a custom port (default 8787; or env NC_DEBUG_PORT) |
From a clone
git clone <this-repo-url> nc-pilot-debug-server
cd nc-pilot-debug-server
node cli.js start # background · node cli.js start -f for foreground
# or the classic: node server.js
You'll see:
nc-pilot-debug-server on http://localhost:8787
POST /log
GET /conversations
GET /conversations/:id
DELETE /conversations/:id
POST /shutdown
per-conversation logs in ./conversations
The server can also be stopped from the extension — Options → Debug → Stop server (it calls `POST /shutdown`). Starting must be done from the terminal: browser extensions are sandboxed and cannot launch local processes.
It listens on http://localhost:8787 and writes:
- `debug-events.ndjson` — combined log of all events
- `conversations/<conversationId>.ndjson` — per-conversation logs
Connect the NC-Pilot extension
Debug logging is off by default (so it never phones home for end users). Turn it on from the extension's Options page → Debug server (dev):
- Start this server (`npx nc-pilot-debug-server start`).
- In NC-Pilot Options, check Enable debug server.
- Set Debug server URL if you changed the port (default `http://localhost:8787`).
- Click Test connection — it should report "Connected".
- Save. (To stop the server later: Stop server here, or `npx nc-pilot-debug-server stop`.)
The extension's manifest already allows http://localhost:* in its CSP, so any local port works without manifest editing.
Now open the side panel and run a task — events stream into your terminal and files, and the side panel gains a + New chat button and a ☰ chat history list (past conversations are restored from this server).
Uncheck Enable debug server in Options for normal use / before publishing the extension.
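If you want a reachability check from a script rather than the Options page, something like the sketch below may be enough. It is not how the extension implements Test connection; `isServerReachable` is a hypothetical helper, and treating any 2xx reply from `GET /` as "up" is an assumption. The `fetchImpl` parameter exists only so the function can be tried without a running server.

```javascript
// Sketch of a reachability check, assuming any 2xx reply from GET / means
// the debug server is up. Uses the global fetch available in Node 18+.
async function isServerReachable(baseUrl, fetchImpl = fetch) {
  try {
    const res = await fetchImpl(new URL("/", baseUrl));
    return res.ok;
  } catch {
    return false; // connection refused, DNS failure, and similar errors
  }
}

// Usage against a real server (prints true or false):
//   isServerReachable("http://localhost:8787").then(console.log);
```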
What a captured run looks like
run_start [11111111] ollama · qwen2.5:7b
user: find the top story on Hacker News and open it
model_response step 0
→ tool get_dom({})
tool_result get_dom 227 elements
model_response step 1
→ tool open_tab({"url":"https://news.ycombinator.com/item?id=..."})
tool_result open_tab {"openedTabId":42,...}
model_response step 2
text: Opened the top story "…" in a new tab.
run_end 3 steps · final
Event types
Each line in the NDJSON files is one event:
| type | key fields | meaning |
|---|---|---|
| run_start | conversationId, backend, model, permissionMode, userMessage | a new user task started |
| model_request | step, messageCount, lastRole, messages | the full conversation sent to the model this step |
| model_response | step, text, thinking, toolCalls | the model's reply + reasoning + chosen tools |
| tool_result | name, args, ok, summary | outcome of running a tool |
| error | step, error | a step failed |
| run_end | steps, reason | task finished (final / step_cap / stopped) |
| history_snapshot | history | full normalized messages; used to restore the chat |
All events also include ts (epoch ms) and conversationId.
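To make the table concrete, here is a sketch of how a client other than the extension might build events and send them. Everything beyond the field names in the table is an assumption: `buildEvent` and `sendEvent` are hypothetical helpers, the field values are made up, and the request format for `POST /log` (a single event as a JSON body) is a guess rather than documented behavior.

```javascript
const { randomUUID } = require("node:crypto");

// Build one event. Per the table above, every event carries a type,
// a conversationId, and ts (epoch ms); the remaining fields depend on the type.
function buildEvent(type, conversationId, fields = {}) {
  return { type, conversationId, ts: Date.now(), ...fields };
}

// Assumption: POST /log takes a single event as a JSON request body.
// fetchImpl is injectable so the function can be exercised without a server.
async function sendEvent(baseUrl, event, fetchImpl = fetch) {
  const res = await fetchImpl(new URL("/log", baseUrl), {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(event),
  });
  return res.ok;
}

// Hypothetical run; the values are illustrative only.
const conversationId = randomUUID();
const run = [
  buildEvent("run_start", conversationId, {
    backend: "ollama",
    model: "qwen2.5:7b",
    userMessage: "find the top story on Hacker News and open it",
  }),
  buildEvent("tool_result", conversationId, {
    name: "get_dom",
    args: {},
    ok: true,
    summary: "227 elements",
  }),
  buildEvent("run_end", conversationId, { steps: 1, reason: "final" }),
];

// One JSON object per line, the same layout as the NDJSON files.
const ndjson = run.map((e) => JSON.stringify(e)).join("\n") + "\n";
console.log(ndjson);
```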
HTTP API
| method & path | description |
|---|---|
| POST /log | ingest one event (used by the extension) |
| GET /conversations | list conversations: id, events count, model, firstUser, lastTs |
| GET /conversations/:id | full event list for one conversation |
| POST /conversations/:id/title | rename a conversation ({ "title": "..." }) |
| DELETE /conversations/:id | clear one conversation |
| POST /shutdown | stop the server (used by the CLI stop and the extension Stop button) |
| GET / | service info + conversation list |
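The endpoints in the table above can also be called from Node with the built-in `fetch`. The sketch below fetches one conversation and pulls out the model's chosen tools per step, the same selection this README's `jq` example performs. The helper names are hypothetical, the `events` wrapper is inferred from that `jq` filter, and the shape of each `toolCalls` entry in the sample is made up for illustration.

```javascript
// Sketch: fetch one conversation's full trace. fetchImpl is injectable so the
// function can be tried without a running server.
async function fetchConversation(baseUrl, id, fetchImpl = fetch) {
  const url = new URL(`/conversations/${encodeURIComponent(id)}`, baseUrl);
  const res = await fetchImpl(url);
  if (!res.ok) throw new Error(`GET ${url.pathname} failed: ${res.status}`);
  return res.json();
}

// Same selection as: jq '.events[] | select(.type=="model_response") | .toolCalls'
function toolCallsPerStep(body) {
  return body.events
    .filter((e) => e.type === "model_response")
    .map((e) => e.toolCalls);
}

// Hypothetical response body; the toolCalls entry shape is an assumption.
const sampleBody = {
  events: [
    { type: "run_start", userMessage: "open the top story" },
    { type: "model_response", step: 0, toolCalls: [{ name: "get_dom", args: {} }] },
    { type: "tool_result", name: "get_dom", ok: true },
    { type: "model_response", step: 1, toolCalls: [] },
  ],
};

console.log(JSON.stringify(toolCallsPerStep(sampleBody)));
// [[{"name":"get_dom","args":{}}],[]]
```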
Examples
# list conversations
curl http://localhost:8787/conversations
# fetch one conversation's full trace
curl http://localhost:8787/conversations/<uuid> | jq
# pull just the model's chosen tools across a run
curl -s http://localhost:8787/conversations/<uuid> \
| jq '.events[] | select(.type=="model_response") | .toolCalls'
# delete a conversation
curl -X DELETE http://localhost:8787/conversations/<uuid>
Analyzing a run with an LLM
The fastest way to improve an agent: give the trace to a capable model.
Grab the conversation file: `conversations/<id>.ndjson`.
Paste it (or the relevant events) to Claude/GPT with a prompt like:
"This is an NC-Pilot agent trace. The agent kept re-clicking the same element instead of finding the 'Apply' button. Look at the `model_request` and `tool_result` events and tell me what to change in the system prompt or tool descriptions."
The model_request events are the most valuable — they show the model's exact input.
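Long traces may not fit in a model's context window, so it can help to keep only the relevant events before pasting. The following sketch assembles a prompt from parsed events; `buildAnalysisPrompt` is a hypothetical helper, the default filter (keep `model_request` and `tool_result`) is one reasonable choice rather than a recommendation from this package, and the sample events are hand-written.

```javascript
// Sketch: keep only the event types of interest and append them, one JSON
// object per line, to a short instruction for the analyzing model.
function buildAnalysisPrompt(events, problem, keepTypes = ["model_request", "tool_result"]) {
  const trace = events
    .filter((e) => keepTypes.includes(e.type))
    .map((e) => JSON.stringify(e))
    .join("\n");
  return [
    "This is an NC-Pilot agent trace.",
    problem,
    "Look at the model_request and tool_result events and tell me what to change in the system prompt or tool descriptions.",
    "",
    trace,
  ].join("\n");
}

// Hypothetical events for illustration.
const events = [
  { type: "run_start", userMessage: "apply to the job" },
  { type: "model_request", step: 0, messageCount: 2, lastRole: "user" },
  { type: "tool_result", name: "get_dom", ok: true, summary: "227 elements" },
  { type: "run_end", steps: 1, reason: "step_cap" },
];

const prompt = buildAnalysisPrompt(
  events,
  "The agent kept re-clicking the same element instead of finding the 'Apply' button."
);
console.log(prompt);
```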
Privacy & data
- Everything runs locally. This server makes no outbound connections.
- `debug-events.ndjson` and `conversations/` can contain page content, URLs, and form text from whatever the agent read. Treat them as sensitive.
- Both are git-ignored by default. Do not commit or publish captured traces — scrub or delete them before sharing.
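As a starting point for scrubbing, the sketch below walks an event and replaces anything that looks like an http(s) URL. It is an illustration only: `scrub` is a hypothetical helper, the sample event is made up, and URL redaction alone does not remove page text or form input, so a manual review is still needed before sharing a trace.

```javascript
// Matches http(s) URLs up to the next whitespace or quote character.
const URL_PATTERN = /https?:\/\/[^\s"']+/g;

// Sketch: recursively copy a value, redacting URLs inside every string.
function scrub(value) {
  if (typeof value === "string") return value.replace(URL_PATTERN, "[redacted-url]");
  if (Array.isArray(value)) return value.map(scrub);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(Object.entries(value).map(([k, v]) => [k, scrub(v)]));
  }
  return value; // numbers, booleans, null are left as they are
}

// Hypothetical event for illustration.
const sample = {
  type: "tool_result",
  name: "open_tab",
  args: { url: "https://example.com/private?token=abc" },
  ok: true,
  summary: "opened https://example.com/private?token=abc in tab 42",
};

const cleaned = scrub(sample);
console.log(cleaned.summary); // opened [redacted-url] in tab 42
```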
Project layout
nc-pilot-debug-server/
server.js # the whole server (zero deps)
cli.js # start/stop/status/restart wrapper (the `bin`)
package.json # bin + npm scripts
README.md
conversations/ # generated: per-conversation NDJSON (git-ignored)
debug-events.ndjson # generated: combined log (git-ignored)
.debug-server.pid # generated: background PID (git-ignored)
License
MIT — do whatever you want. (Add a LICENSE file before publishing.)
