@coralai/nps-cli
v0.4.1
Local Linux daemon hosting per-DM Claude Code agents accessible via Matrix
nps-cli — Vision Doc (rev 5, aligned with CEO plan)
Status: v0.1.0-alpha.0 skeleton landed; engineering plan rev 5 ready to implement. Last updated 2026-05-14 by Coral + Claude after CEO + Eng review cycles.
Authoritative engineering plan:
~/.gstack/projects/coral-jarvis-skills/ceo-plans/2026-05-14-nps-cli-v0-1.md (rev 5)
This README captures the what + why. The CEO plan captures the how + effort.
1. What is nps-cli
nps is a Linux-only local daemon that hosts ephemeral Claude Code agent processes accessible via IM (Matrix DM in v0.1). Each agent is defined by a profile — a directory with its own personality, skills, and an external workspace pointer. Per message, nps spawns a Claude binary, dispatches the message, persists the result, and stops — no long-lived per-channel cache (codex F2 design lesson from rev 3 → rev 4 pivot).
Think of it as:
- Discord-bot framework, but: agent-first, ephemeral-per-message, project-skill-scoped.
- openclaw, but: agent profile abstraction with workspace routing + safe session resume.
- sps-cli, but: NOT a worker pipeline. Conversational, IM-driven.
The core runtime primitive is @coralai/claude-code-agent (Phase 1 package, v0.2+). nps-cli is the first true consumer of its public ClaudeCodeAgent class.
2. Why nps-cli (the gap)
After Phase 1, @coralai/claude-code-agent exists as a clean library. Beyond proving its API, the actual product value:
- Coral wants a personal AI presence reachable through Matrix DM, backed by skills he can iterate locally.
- Different conversational contexts (work, finance, study) deserve different agents with their own personalities and skills.
- sps-cli is a build pipeline, not a conversational front-end. Two different products with one shared runtime.
v0.1 is a validation vehicle, not a finished product. It proves three things before any platform feature is built on top:
- Workspace routing: agent persona/skills at A, task workspace at B, never conflated.
- Safe session resume: resumeSession works AND compatibility invalidation is done BEFORE resume.
- One real transport loop: Matrix DM in, reply out, no crash, no leaked process, with event-id dedupe.
Everything not directly serving those goals is deferred.
3. Relationship to other projects
┌───────────────────────────────────────────────────────────────┐
│ nps-cli │
│ (this project — Linux daemon + Matrix DM gateway + profiles) │
└─────────────────────────────┬─────────────────────────────────┘
│
▼
@coralai/claude-code-agent ^0.2.0 (Phase 1 package)
• v0.1: start / stop / prompt / cancelAll
• v0.2: + start({ resumeSession }) ← nps prereq, shipped
│
▼
@agentclientprotocol/sdk ^0.21
│
▼
claude-agent-acp shim 0.33.1
│
▼
claude binary (OAuth subscription)

- sps-cli stays a separate consumer of @coralai/claude-code-agent. nps-cli is a sibling, not a fork.
- Multi-vendor LLM: NOT via an LLMBackend abstraction inside nps-cli (decision D2). Instead, profiles can point ANTHROPIC_BASE_URL at an API-compat proxy (e.g. CLIProxyAPI, litellm); the claude binary still thinks it's talking to Anthropic. Backend abstraction = zero code in nps-cli.
4. Platform target
Linux only in v0.1. The implementation depends on systemd user units, journald for log aggregation, flock for file locking, Unix domain sockets, and loginctl enable-linger for user-mode service supervision. macOS launchd and Windows support are deferred to v0.1.x.
5. User scenario (concrete)
Coral runs nps daemon start as a systemd user service. He has three agent profiles:
~/.nps/
├── config.yaml      # global: Matrix homeserver, admin user ids
├── bindings.yaml    # channel → profile mapping (CLI-managed in v0.1)
└── agents/
    ├── work-companion/
    │   ├── nps.yaml     # workspace: /home/coral/jarvis-skills, env: ANTHROPIC_BASE_URL=...
    │   ├── CLAUDE.md    # "You are Coral's work assistant..."
    │   └── .claude/skills/
    │       ├── code-review/SKILL.md
    │       └── architecture-sketch/SKILL.md
    ├── personal-finance/
    │   ├── nps.yaml     # workspace: /home/coral/.finance, env: ...
    │   ├── CLAUDE.md
    │   └── .claude/skills/...
    └── study-buddy/
        └── ...

In a Matrix DM with @nps-work-companion:matrix.org, Coral types "review the latest PR on jarvis-skills." The nps daemon:
- Receives the message (Matrix event with event_id).
- Checks event-id dedupe (drop if already processed).
- Looks up channel binding → work-companion.
- Computes compatHash from profile artifacts.
- Checks stored hash; if mismatch → drop stored sessionId (fresh start); else use it for resume.
- Spawns ClaudeCodeAgent({ cwd: /home/coral/jarvis-skills, env: profile.env }).
- Calls agent.start({ resumeSession: validId }), or starts without resume on a fresh start.
- Calls agent.prompt(text); claude does its work in the actual repo.
- Atomically persists { sessionId, compatHash, lastEventId }.
- Calls agent.stop(); the process is gone and RAM returns to baseline.
- Sends reply (Matrix m.text final, no streaming in v0.1).
Tomorrow Coral restarts the daemon. Next message in the DM resumes from the stored sessionId because compatHash matches (no profile edits).
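The compatHash that gates this resume can be sketched in TypeScript. This is a minimal sketch under stated assumptions, not the shipped implementation: the `CompatInputs` shape and the `computeCompatHash` name are hypothetical, the profile YAML is assumed to arrive already dumped with sorted keys, and hashing each skill as filename plus content is only one possible reading of the section 6.3 spec.

```typescript
import { createHash } from "node:crypto";

// Hypothetical input shape (not from nps-cli): the canonical inputs, already read.
// `null` stands for a missing file.
interface CompatInputs {
  profileYaml: string | null; // assumed already dumped with sorted keys
  claudeMd: string | null;
  skills: Record<string, string>; // skill filename -> file content
  agentVersion: string; // claude-code-agent npm version
  claudeVersion: string; // output of `claude --version`
  shimVersion: string; // output of `claude-agent-acp --version`
}

const SEP = "\n--SEP--\n";

// Normalize CRLF and bare CR to LF so line-ending changes do not reset sessions.
const toLf = (s: string): string => s.replace(/\r\n?/g, "\n");

// Missing files contribute a "<MISSING>" marker plus their path.
const fileOrMarker = (content: string | null, path: string): string =>
  content === null ? `<MISSING>${path}` : toLf(content);

function computeCompatHash(inputs: CompatInputs): string {
  const skillParts = Object.entries(inputs.skills)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0)) // lexicographic by filename
    .map(([file, content]) => `${file}\n${toLf(content)}`);
  const parts = [
    fileOrMarker(inputs.profileYaml, "profile.yaml"),
    fileOrMarker(inputs.claudeMd, "CLAUDE.md"),
    ...skillParts,
    inputs.agentVersion,
    inputs.claudeVersion,
    inputs.shimVersion,
  ];
  // Every input ends with the separator; the joined string is hashed as UTF-8.
  const canonical = parts.map((p) => p + SEP).join("");
  return createHash("sha256").update(canonical, "utf8").digest("hex");
}
```

With a function like this, an unchanged profile yields the same hex digest after a daemon restart, so the stored sessionId stays eligible for resume; any edit to CLAUDE.md, a skill, or a version string changes the digest and forces a fresh start.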
6. Architecture
6.1 Process model
Single Node.js daemon on Linux. Spawns a fresh Claude binary per message, then exits.
nps daemon (one Node process)
├── ProfileRegistry — scan + chokidar watch ~/.nps/agents/
├── BindingsLoader — atomic read of ~/.nps/bindings.yaml
├── DispatchPipeline — dedupe → lookup → compat check → resume/fresh → prompt → persist → stop → send
├── MatrixGateway — DM-only client (unencrypted, final-only replies, event-id dedupe)
├── SessionStore — atomic-write JSON per channel: { sessionId, compatHash, lastEventId }
├── ControlSocket — JSON-RPC 2.0 on Unix socket + PID file + stale-socket detection
└── pino logger — stdout → journald

Notably absent (deferred to v0.2+ if dogfooding shows need):
- No AgentSupervisor (no per-channel cache)
- No idle eviction (nothing to evict)
- No health auto-restart of agents (ephemeral processes can't hang)
- No LLMBackend abstraction (D2)
6.2 DispatchPipeline — corrected order (codex F1, F11)
on MatrixGateway emits Message{ channelId, eventId, text, sender }:
  1. SessionStore.hasSeenEvent(eventId)? → silent drop on duplicate
  2. BindingsLoader.lookup(channelId) → profile → /help reply on miss
  3. Compute currentCompatHash from profile artifacts (spec below)
  4. SessionStore.get(channelId) → { storedHash?, sessionId? }
  5. ── COMPAT CHECK BEFORE RESUME ── ← codex F1: NOT after
     if storedHash !== currentCompatHash:
       resumeId = undefined # fresh start
       IM reply hint: "session reset (config changed)"
     else:
       resumeId = sessionId
  6. agent = new ClaudeCodeAgent({ cwd: profile.workspace, env: profile.env, ... })
  7. await agent.start({ resumeSession: resumeId })
  8. result = await agent.prompt(text)
  9. ── PERSIST BEFORE SEND ── ← codex F11
     await SessionStore.atomicWrite(channelId, {
       sessionId: result.sessionId,
       compatHash: currentCompatHash,
       lastEventId: eventId,
     })
  10. await agent.stop()
  11. await gateway.sendReply(channelId, result.text) (retry once on transient failure)

6.3 compatHash specification (codex F3, F4)
Inputs (in canonical order, each ending with separator "\n--SEP--\n"):
1. profile.yaml — yaml.dump with sorted keys, LF line endings
2. CLAUDE.md — utf-8, LF line endings
3. skills/*.SKILL.md — sorted by lexicographic filename
4. claude-code-agent npm version
5. claude binary version — `claude --version` cached at daemon boot
6. claude-agent-acp version — `claude-agent-acp --version` cached at boot
Algorithm:
sha256(canonical_bytes)
Edge cases:
- Missing file → include literal "<MISSING>" + path marker
- Symlinks → resolve to target's content
- File read error → throw COMPAT_HASH_FAILED, do NOT silently fresh-start
- Binary missing on boot → daemon refuses to start with actionable error
NOT in hash (deliberately):
- workspace directory contents (workspace is task layer, not identity layer)
- .claude/settings.json (per-run preference, not agent identity)

6.4 Atomic write contracts (codex F7, F8)
bindings.yaml (CLI-only writer):
- CLI acquires ~/.nps/.bindings.lock (file lock).
- Reads current bindings.yaml.
- Mutates in-memory.
- Writes to bindings.yaml.tmp mode 0600.
- rename(2) for atomic replace.
- Releases lock.
Daemon reader retries on transient EBUSY (max 3 attempts).
SessionStore (daemon writer):
- File per channel: ~/.nps/state/<channelId>.json mode 0600.
- Write → .tmp → fsync → rename.
- Corrupt JSON on read → log warn + treat as empty (fresh start).
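The SessionStore contract above can be sketched with Node's synchronous fs calls. The function names `atomicWriteJson` and `readSession` and the `SessionRecord` type are hypothetical illustrations, not the actual nps-cli code, and the sketch assumes the temp file lives on the same filesystem as the target so that the rename is atomic.

```typescript
import {
  closeSync, existsSync, fsyncSync, mkdirSync,
  openSync, readFileSync, renameSync, writeSync,
} from "node:fs";
import { dirname } from "node:path";

// Hypothetical record type mirroring { sessionId, compatHash, lastEventId }.
interface SessionRecord {
  sessionId: string;
  compatHash: string;
  lastEventId: string;
}

// Write -> fsync -> rename: a crash leaves either the old file or the new one.
function atomicWriteJson(file: string, record: SessionRecord): void {
  mkdirSync(dirname(file), { recursive: true });
  const tmp = `${file}.tmp`;
  const fd = openSync(tmp, "w", 0o600); // created owner read/write only
  try {
    writeSync(fd, JSON.stringify(record));
    fsyncSync(fd); // flush contents to disk before the rename makes them visible
  } finally {
    closeSync(fd);
  }
  renameSync(tmp, file); // atomic replace within one filesystem
}

// Missing or corrupt file -> undefined, which the caller treats as a fresh start.
function readSession(file: string): SessionRecord | undefined {
  if (!existsSync(file)) return undefined;
  try {
    return JSON.parse(readFileSync(file, "utf8")) as SessionRecord;
  } catch {
    console.warn(`session store unreadable at ${file}; treating as empty`);
    return undefined;
  }
}
```

A production version would probably also validate the parsed shape before trusting it; that check is left out here to keep the sketch short.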
Audit log:
- ~/.nps/audit.jsonl mode 0600, schema { v: 1, ts, sender, transport, channel, command_or_event, decision, profile? }.
- Rotation at 50MB → atomic rename to .0; shift .0..3 up; delete .4. Retention 5 × 50MB = 250MB max.
- ENOSPC → log [CRITICAL] audit log unavailable via pino, daemon continues.
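The rotation rule can be expressed as a pure planning step that returns the file operations in a safe order. This is a sketch only: `planRotation` and `RotationOp` are hypothetical names, and reading "50MB" as 50 × 1024 × 1024 bytes is an assumption.

```typescript
// Hypothetical operation type; a real implementation would map these to fs calls.
type RotationOp =
  | { kind: "delete"; path: string }
  | { kind: "rename"; from: string; to: string };

const ROTATE_AT_BYTES = 50 * 1024 * 1024; // assumed meaning of "50MB"
const MAX_SUFFIX = 4; // rotated files run .0 through .4

function planRotation(base: string, sizeBytes: number): RotationOp[] {
  if (sizeBytes < ROTATE_AT_BYTES) return [];
  // Drop the oldest file first, then shift from oldest to newest so that
  // no rename overwrites a file that still has to move.
  const ops: RotationOp[] = [{ kind: "delete", path: `${base}.${MAX_SUFFIX}` }];
  for (let i = MAX_SUFFIX - 1; i >= 0; i--) {
    ops.push({ kind: "rename", from: `${base}.${i}`, to: `${base}.${i + 1}` });
  }
  // Finally move the live log to .0; the writer then reopens a new file.
  ops.push({ kind: "rename", from: base, to: `${base}.0` });
  return ops;
}
```

Whoever executes the plan would need to tolerate missing files (for example, no `.3` yet on a young install) by ignoring ENOENT on those steps.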
7. Configuration model
7.1 Global config (~/.nps/config.yaml)
v0.2 unified format (recommended; introduced in v0.2.0):
daemon:
  controlSocket: ~/.nps/daemon.sock
  pidFile: ~/.nps/daemon.pid
  logLevel: info
  adminUserIds:
    - matrix:@coral:matrix.org
platforms:
  matrix:
    enabled: true
    token: ${MATRIX_TOKEN}  # accessToken; env var indirection supported
    extra:
      homeserver: https://matrix.org
      userId: '@coral:matrix.org'
      autoAcceptInvites: true  # v0.1.12 B10: auto-join invites from admin
  telegram:
    enabled: true
    token: ${TELEGRAM_BOT_TOKEN}

v0.1 legacy format (still supported through v0.2.x; auto-converted at load time, drop planned for v0.3):
im:
  matrix:
    homeserver: https://matrix.org
    user: '@coral:matrix.org'
    accessToken: ${MATRIX_TOKEN}
  telegram:
    botToken: ${TELEGRAM_BOT_TOKEN}

Resolver (gateway/platform-config.ts) precedence: platforms.<transport> wins when both shapes are present.
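The precedence rule can be illustrated with a small resolver. This is a hedged sketch and not the contents of gateway/platform-config.ts: the type names, the function names, and the exact legacy-to-unified field mapping (`user` → `extra.userId`, `accessToken` / `botToken` → `token`) are assumptions inferred from the two config examples.

```typescript
// Hypothetical types modelled on the two config shapes shown above.
interface PlatformEntry {
  enabled: boolean;
  token?: string;
  extra?: Record<string, unknown>;
}

interface RawConfig {
  platforms?: Record<string, PlatformEntry>;
  im?: {
    matrix?: { homeserver?: string; user?: string; accessToken?: string };
    telegram?: { botToken?: string };
  };
}

// Convert the legacy `im` shape into the unified shape (mapping is assumed).
function legacyToPlatforms(im: NonNullable<RawConfig["im"]>): Record<string, PlatformEntry> {
  const out: Record<string, PlatformEntry> = {};
  if (im.matrix) {
    out.matrix = {
      enabled: true,
      token: im.matrix.accessToken,
      extra: { homeserver: im.matrix.homeserver, userId: im.matrix.user },
    };
  }
  if (im.telegram) {
    out.telegram = { enabled: true, token: im.telegram.botToken };
  }
  return out;
}

// platforms.<transport> wins per transport; legacy entries only fill the gaps.
function resolvePlatforms(raw: RawConfig): Record<string, PlatformEntry> {
  const legacy = raw.im ? legacyToPlatforms(raw.im) : {};
  return { ...legacy, ...(raw.platforms ?? {}) };
}
```

The object spread puts unified entries last, so a transport defined in both shapes takes the unified value wholesale rather than being merged field by field.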
7.2 Per-agent profile (~/.nps/agents/<name>/)
<name>/
├── nps.yaml     # profile metadata (workspace, env, recording, idle settings)
├── CLAUDE.md    # personality / rules (read by claude binary natively)
└── .claude/skills/
    └── <skill>/SKILL.md

nps.yaml schema:
workspace: /home/coral/jarvis-skills  # SINGULAR: exactly one path per profile (D3=A)
                                      # claude binary cwd; need multiple? create multiple profiles.
env:                                  # passed through to claude binary via AgentOptions.env
  ANTHROPIC_BASE_URL: https://my-proxy.example/anthropic  # optional API-compat proxy
  # ANTHROPIC_API_KEY: ${MY_KEY}      # optional; omit to use OAuth subscription
recording: ~/.nps/recordings/work-companion.jsonl  # optional, ClaudeCodeAgent recording
permissionMode: auto                  # auto | readonly | manual

Skill loading is free: the claude binary reads cwd/CLAUDE.md + cwd/.claude/skills/ natively. But here cwd = workspace (task layer), while CLAUDE.md + skills live INSIDE the profile dir. Two candidate solutions: the nps daemon copies or symlinks the profile's CLAUDE.md + .claude/skills/ into the workspace before spawn, or the daemon passes a separate --system-prompt-file flag, if the claude binary supports one. TBD in M-WS implementation.
7.3 Channel bindings (~/.nps/bindings.yaml)
matrix:
  '!abc123:matrix.org': work-companion
  '!def456:matrix.org': personal-finance

CLI-managed only in v0.1 (codex F5 dodge: the daemon doesn't write):
nps bind add <channel-id> <profile>
nps bind remove <channel-id>
nps bind list

The IM commands /bind, /unbind, /persona, and /reset are deferred to v0.1.x.
Daemon CLI quick reference
nps daemon start # foreground (rare; usually use install-service)
nps daemon stop # graceful stop via control socket
nps daemon restart # v0.2.2: auto-detects launchd (Mac) /
# systemd (Linux), kicks the service,
# then runs `nps daemon status`
nps daemon status # health JSON
nps daemon logs [-f] [-n N] # v0.2.2: tails ~/.nps/daemon.{out,err}.log on
# Mac, journalctl --user -u nps on Linux
nps daemon install-service # generate + install launchd plist / systemd unit

Common test loop after upgrading nps:
npm i -g @coralai/nps-cli@latest
nps daemon restart # cross-platform; prints health after restart
nps daemon logs -f # follow logs until ^C

7.4 Interactive TUI chat (nps chat)
nps chat [profile]

Opens an Ink/React terminal chat against a profile, spawning ClaudeCodeAgent directly (no daemon required). Streams agent output token-by-token.
Profile resolution:
- [profile] arg (explicit)
- else daemon.defaultProfile from ~/.nps/config.yaml
- else the single profile if exactly one exists in ~/.nps/agents/
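The three-step fallback reads naturally as one function. This is a sketch with a hypothetical name (`resolveChatProfile`); the error thrown when no rule matches is an assumption, since the behavior for that case is not stated here.

```typescript
// Resolve which profile `nps chat` should open, in the documented order.
function resolveChatProfile(
  arg: string | undefined, // explicit [profile] argument
  defaultProfile: string | undefined, // daemon.defaultProfile from config.yaml
  available: string[], // profile names found under ~/.nps/agents/
): string {
  if (arg) return arg;
  if (defaultProfile) return defaultProfile;
  const [only] = available;
  if (available.length === 1 && only !== undefined) return only;
  // Assumed behavior: fail loudly rather than guess among zero or many profiles.
  throw new Error(
    `cannot pick a profile automatically (${available.length} found); pass one explicitly`,
  );
}
```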
In-chat slash commands: /help /reset /quit.
Keys:
- Tab: complete the current slash command (cycles when ambiguous)
- ↑/↓: navigate input history (within the session)
- Ctrl+C: cancel the current turn, or exit when idle
Each assistant reply shows [in: N · out: M · $cost] when the agent
reports usage data.
7.5 Web chat API (POST /api/chat, SSE)
The console server (enable with daemon.consolePort in config.yaml)
exposes a Server-Sent-Events chat endpoint for browser-side chat:
POST /api/chat body: { profile, message, sessionId? }
DELETE /api/chat/:id
GET /api/chat/sessions

POST /api/chat responds with a text/event-stream containing:
data: {"type":"session","sessionId":"<uuid>"}
data: {"type":"chunk","text":"<accumulated text so far>"}
…
data: {"type":"done","text":"<final>","sessionId":"<id>","usage":{...}}

Or data: {"type":"error","error":"..."} on failure. Closing the request mid-stream cancels the current turn (handle.cancel()).
Sessions are in-memory only (lost on daemon restart). Idle sessions are GC'd after 30 min. Localhost-bound; no auth — v0.2 adds tokens.
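A client consuming this stream has to split frames and remember that each chunk carries the accumulated text rather than a delta. The sketch below is illustrative only: `ChatEvent`, `parseSseBuffer`, and `applyEvent` are hypothetical names, and it assumes LF line endings with one complete JSON object per `data:` line.

```typescript
// Hypothetical event union derived from the payloads listed above.
type ChatEvent =
  | { type: "session"; sessionId: string }
  | { type: "chunk"; text: string }
  | { type: "done"; text: string; sessionId: string; usage?: unknown }
  | { type: "error"; error: string };

// Parse complete SSE frames out of a buffer; return events plus the unparsed tail.
function parseSseBuffer(buffer: string): { events: ChatEvent[]; rest: string } {
  const frames = buffer.split("\n\n");
  const rest = frames.pop() ?? ""; // last piece may be an incomplete frame
  const events: ChatEvent[] = [];
  for (const frame of frames) {
    for (const line of frame.split("\n")) {
      if (line.startsWith("data:")) {
        events.push(JSON.parse(line.slice(5).trim()) as ChatEvent);
      }
    }
  }
  return { events, rest };
}

// Chunks hold the text accumulated so far, so they replace the transcript
// instead of being appended to it.
function applyEvent(transcript: string, ev: ChatEvent): string {
  switch (ev.type) {
    case "chunk":
    case "done":
      return ev.text;
    case "error":
      throw new Error(ev.error);
    default:
      return transcript; // "session" carries no text
  }
}
```

A caller would feed each network read into `parseSseBuffer(rest + newData)` and carry `rest` over to the next read, so frames split across reads are still parsed once complete.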
The console page at http://127.0.0.1:<consolePort>/ also embeds a
browser chat panel that consumes this API: profile dropdown,
streaming transcript, Enter-to-send, Shift+Enter newline, Cancel
button. Session and profile choice persist in localStorage.
8. v0.1 milestones
See CEO plan rev 5 for "done when" criteria + effort estimates. Summary:
PROTOTYPE GATES (parallel, can run before main impl):
M0 Lifecycle gate — 3 concurrent claude sessions, distinct sessionIds, no auth collision (1-2d)
M0a Resume gate — resume sessionId, history retained, cleanly stop (0.5d)
M5a Matrix transport spike — DM detect, encrypted-room refuse, send msg (0.5-1d)
M-WS Workspace routing — singular workspace per profile, agent gets correct cwd (1d)
LANE A (daemon core, sequential):
M1 Daemon + control socket + JSON-RPC + PID file (2-3d)
M2 ProfileRegistry (scan + chokidar watch) (1d)
M3 BindingsLoader (atomic read + retry-on-EBUSY) (1d)
M4 DispatchPipeline (corrected order, codex F1+F11) (2d)
M6 SessionStore + compatHash + lastEventId dedupe (2d)
LANE B (gateway, after M5a + M6):
M5b Matrix gateway production (final-only, dedupe, reconnect backoff) (2-3d)
LANE C (CLI + ops, parallel with A):
M7 nps CLI (agent/daemon/bind subcommands) (2d)
M8 Slash commands (/help open, /status admin only) (1d)
M10 Boot runtime check (claude + shim presence) (0.5d)
M11 systemd user unit + install-service (1-1.5d)
M12 Audit log (rotation + 0600 + schema v=1) (1.5d)
M13 Structured logs (pino → journald) (0.5d)

Effort total: 18-23 human-days + 20% buffer = 22-28 human-days (3-4.5 CC-days, 3-5 weeks calendar).
Slash command contract (minimal — D6=A, D7-resolved)
| Command | Args | Side effect | Auth |
|---------|------|-------------|------|
| /help | — | Reply with command list + binding info | Open (any sender, including unbound DM first message) |
| /status | — | Reply with profile, lastActivity, sessionId presence | Admin only |
/bind /unbind /reset /persona → CLI only in v0.1.
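The auth column of this contract can be sketched as a small decision function. The names `SlashContext`, `SlashDecision`, and `authorizeSlash` are hypothetical, and the sketch assumes admin IDs are matched by exact string comparison against the transport-prefixed form (for example matrix:@coral:matrix.org).

```typescript
// Hypothetical context passed along with each incoming slash command.
interface SlashContext {
  transport: string; // e.g. "matrix"
  sender: string; // e.g. "@coral:matrix.org"
  adminUserIds: string[]; // transport-prefixed, e.g. "matrix:@coral:matrix.org"
}

type SlashDecision = "allow" | "deny" | "cli-only" | "unknown";

// Commands that exist only in the CLI in v0.1.
const CLI_ONLY = new Set(["/bind", "/unbind", "/reset", "/persona"]);

function isAdmin(ctx: SlashContext): boolean {
  return ctx.adminUserIds.includes(`${ctx.transport}:${ctx.sender}`);
}

function authorizeSlash(command: string, ctx: SlashContext): SlashDecision {
  if (command === "/help") return "allow"; // open to any sender
  if (command === "/status") return isAdmin(ctx) ? "allow" : "deny";
  if (CLI_ONLY.has(command)) return "cli-only"; // not handled over IM in v0.1
  return "unknown";
}
```

Returning a distinct "cli-only" decision is a design choice of this sketch; it would let the gateway answer with a pointer to the CLI instead of a generic unknown-command reply.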
9. Hard prerequisite
@coralai/claude-code-agent v0.2 — shipped 2026-05-14. Provides start({ resumeSession?: string }). nps-cli depends on ^0.2.0.
What we use:
- start({ resumeSession }): public option, calls ACP loadSession internally
- prompt(text) + stop(): v0.1.0 unchanged
What we DON'T need from claude-code-agent v0.2:
- Public loadSession method (codex F9 clarified: internal only)
- Replay protection idempotency keys (shim replay is informational, no tool re-execution observed)
- fork() (still v1.1)
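The three calls listed under "What we use" compose into one ephemeral turn. The sketch below codes against a hypothetical `AgentLike` interface rather than the real ClaudeCodeAgent types, which may differ; the `{ sessionId, text }` result shape follows the pseudocode in section 6.2, and `runTurn` and its `persist` callback are illustrative names.

```typescript
// Hypothetical minimal surface; the real ClaudeCodeAgent class may differ.
interface AgentLike {
  start(opts: { resumeSession?: string }): Promise<void>;
  prompt(text: string): Promise<{ sessionId: string; text: string }>;
  stop(): Promise<void>;
}

interface StoredSession {
  sessionId: string;
  compatHash: string;
}

async function runTurn(
  agent: AgentLike,
  stored: StoredSession | undefined,
  currentHash: string,
  text: string,
  persist: (next: StoredSession) => Promise<void>,
): Promise<{ text: string; resumed: boolean }> {
  // Compat check BEFORE resume: a hash mismatch means a fresh session.
  const resumeId =
    stored && stored.compatHash === currentHash ? stored.sessionId : undefined;
  try {
    await agent.start({ resumeSession: resumeId });
    const result = await agent.prompt(text);
    // Persist before the reply is sent, so a failed send never loses the session.
    await persist({ sessionId: result.sessionId, compatHash: currentHash });
    return { text: result.text, resumed: resumeId !== undefined };
  } finally {
    await agent.stop(); // the process is always torn down, even on error
  }
}
```

Putting `stop()` in a `finally` block is this sketch's way of expressing the "no leaked process" goal; it is not a statement about how the daemon is actually written.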
10. Risks & posture
R1 — Matrix protocol complexity
E2EE adds a fully separate code path. v0.1 ships with E2EE disabled — encrypted rooms refused at gateway with explicit error. Multi-participant rooms also refused (v0.1 is DM-only).
R2 — ~~Per-channel memory pressure~~ — DISSOLVED
Ephemeral spawn-per-message means baseline daemon RAM is ~50MB; each turn briefly spikes ~200MB during claude binary's lifetime, then back to baseline. No N-concurrent-claude-processes problem.
R3 — Single-user assumption
v0.1 is single-user: daemon.adminUserIds lists Coral's transport-prefixed IDs (e.g. matrix:@coral:matrix.org). Non-admin senders in a bound DM get rejected with audit log entry. Multi-user with roles → v0.1.x.
R4 — Skill conflicts across profiles
Non-risk (isolated profile dirs).
R5 — ~~LLM backend over-design~~ — dissolved
D2: no LLMBackend abstraction. Multi-vendor via ANTHROPIC_BASE_URL proxy at deployment, not at code level.
Cold-start UX (D1=A)
Ephemeral mode: 5-10s per message before first token. Gateway sends Matrix m.typing indicator during. Coral's IM rate (~human-paced) makes this acceptable. If dogfooding reveals pain, v0.2 may add a warm hot-pool (NOT residency — pool-of-1).
M0 prototype gate is the hinge
Concurrent claude binaries on a single OAuth subscription must work. If M0 reveals auth collision or rate-limit pathology, plan rev 5 is invalidated; rev 6 designs a queue-based serialization model. No daemon code lands before M0 passes.
11. Open decisions
| # | Question | Status |
|---|----------|--------|
| Q1 | LLM backend abstraction timing | Resolved (D2): never. Multi-vendor via ANTHROPIC_BASE_URL proxy. |
| Q2 | Streaming UX (Matrix m.replace vs final-only) | Resolved: final-only in v0.1 on both transports. Streaming → v0.1.x. |
| Q3 | resumeSession hard block vs stateless v0.1 | Resolved (D10): hard block; claude-code-agent v0.2 shipped 2026-05-14. |
| Q4 | Project name (does "nps" stand for something?) | Open. Owner: Coral. Deadline: before npm publish. Fallback codename: nps-cli. |
12. Comparison with sps-cli (so we don't accidentally rebuild it)
| Dimension | sps-cli | nps-cli |
|---|---|---|
| Form factor | Pipeline orchestrator | Conversational daemon |
| Concurrency | Single worker, serial (v0.37.2 deliberate) | Ephemeral process per message; no parallelism within a channel |
| Trigger | File changes in ~/.coral/projects/... | IM messages |
| Output | Card state machine + worker artifacts | Reply messages in IM channel |
| Skill scope | sps global + project | nps profile-local (cwd = workspace, with system prompt + skills resolved by M-WS) |
| Web UI | Yes (console) | Deferred to v0.1.x |
| Platform | Cross-platform | Linux-only (v0.1) |
| Persistence | Card state in files | Per-channel SessionStore + audit log |
Two different products with one shared runtime (@coralai/claude-code-agent). The Phase 1 extraction is what makes both possible.
13. Distribution
{
  "name": "@coralai/nps-cli",
  "version": "0.1.0",
  "type": "module",
  "bin": { "nps": "dist/main.js" },
  "files": ["dist", "systemd/", "README.md"],
  "engines": { "node": ">=18" },
  "os": ["linux"]
}

CI/CD is wired in two places:
- GitHub Actions (/.github/workflows/nps-cli-ci.yml): activates when the repo is mirrored to GitHub. Jobs: test (typecheck + vitest on push/PR), preflight (consumer-env smoke on tag), publish (npm publish on tag, needs NPM_TOKEN secret).
- GitLab CI (/.gitlab-ci.yml): canonical home at git.wymsn.com. Same three stages keyed on tag prefix nps-cli-v*. Needs masked NPM_TOKEN CI/CD variable.
Both workflows scope to nps-cli/** changes so sps-cli / claude-code-agent commits don't trigger them.
systemd ExecStart pattern (documented by nps daemon install-service):
ExecStart=/usr/bin/env node /home/<user>/.npm-global/bin/nps daemon start --foreground

14. NOT in scope (v0.1, rev 5 final)
- nps-cli's own LLMBackend abstraction (D2)
- Plugin extension model for nps-cli (D3)
- sps-cli chat-worker integration
- Pure library release without daemon
- Per-channel residency / process cache (codex F2)
- Multi-participant Matrix rooms (codex F8)
- Telegram gateway (codex F6 — hidden work; own milestone in v0.1.x)
- IM-driven binding commands /bind /unbind /persona /reset (codex F5)
- Web admin console (codex F7)
- Cross-agent context handoff (D8)
- Voice messages STT/TTS (D9)
- Cost tracking (D6)
- Streaming UX (Matrix m.replace) (Q2)
- macOS launchd / Windows (codex F20 — Linux-only)
- Public loadSession method in claude-code-agent (codex F9: internal only)
15. Implementation status
| Component | Status |
|-----------|--------|
| @coralai/claude-code-agent v0.2 (resumeSession) | ✅ Shipped 2026-05-14 |
| Skeleton jarvis-skills/nps-cli/ | ✅ Landed 2026-05-14 |
| M0 lifecycle prototype gate | ✅ PASSED 2026-05-14 (solo=4847ms, parallel=4712ms, ratio 0.97x; all 6 criteria green) |
| M0a resume prototype gate | ✅ PASSED 2026-05-14 (stored "42" → resumed → recalled "42"; bogus sessionId → SESSION_LOST; 5/5 criteria) |
| M-WS workspace prototype gate | ✅ PASSED 2026-05-14 (profile-as-cwd loads CLAUDE.md + skills; workspace via absolute path also works without additionalDirectories) |
| M5a Matrix transport spike | ✅ PASSED 2026-05-14 (im.wymsn.com sync 218ms, DM classification works, test send succeeded; lib decision: matrix-js-sdk@41 direct) |
| M1-M13 implementation | ✅ DONE 2026-05-14 (62/62 unit tests; agent invoke smoke green; daemon RPC works) |
| Real-world Matrix DM E2E | ✅ PASSED 2026-05-15 (dedicated @nps-bot:wymsn.com account registered via /register API; DM rooms with both @elon and @yuguo; T1 /help / T2 prompt / T3 /status all replied correctly; sessions persisted with compatHash) |
| systemd deployment | ✅ ACTIVE (nps daemon install-service writes wrapper at ~/.nps/run-daemon.sh + unit at ~/.config/systemd/user/nps.service; wrapper sources ~/.coral/env to handle export KEY=value lines + inherits PATH so claude + claude-agent-acp resolve; systemctl --user enable --now nps works; loginctl enable-linger set) |
| Project name (Q4) | ✅ Locked: nps-cli (D2=A 2026-05-15); package private flag removed; LICENSE + publishConfig + repository added |
| npm publish | ✅ @coralai/[email protected] on npm (2026-05-15). Release iteration: 0.1.0 had matrix-js-sdk in devDeps (fatal global install bug) → 0.1.1 moved to deps → 0.1.2 added profile-dir abs path to compatHash (caught cwd relocation case) → 0.1.3 generalizes by auto-falling-back to fresh start whenever ACP loadSession raises SESSION_LOST for ANY reason (shim crash, claude binary upgrade, manual cleanup); user sees [session recovered: prior session was lost, started fresh] instead of error. Also adds scripts/pre-release-smoke.sh + npm run preflight — simulates consumer env via npm pack + --omit=dev install + import-every-module check, catches devDep mis-classification before publishing. |
Vision rev 5 · 2026-05-14 · aligned with CEO plan rev 5 + eng plan rev 5
