@nevescloud/pip-relay
v0.11.0
Operator co-pilot for chat. Claude (via the operator's ai-bridge) writes the reply; the visitor's browser displays it. WebRTC carries one envelope between them. No proxy server.
pip-relay
Operator co-pilot for chat. Claude decides what to say and writes the reply; the visitor's browser displays it. WebRTC carries one envelope between them. No proxy server. Bring your own Claude key.
Why
The original problem: a small in-browser model (LFM2.5 350M) with a resume in its system prompt would dump the resume on greetings. The first fix was structural: take the resume out of the small model and give it a typed envelope to render from. The second fix, after observing that small open-weight models in 2026 still can't compose 1-2 sentences coherently, was to let Claude write the prose directly. As of 0.9 the schema collapsed to the minimum that earns its keep: an intent for routing/escalation and the final text Claude writes.
pip-relay's current architecture: Claude on the operator's machine, brokered via ai-bridge, receives the manifest as system prompt and emits a typed envelope per visitor turn:
{
  intent: "greet" | "answer" | "decline" | "clarify" | "redirect" | "escalate",
  text: string,       // final 1-2 sentence reply, ready to display
  citation: string?,  // optional URL or section anchor
}
The visitor's browser displays env.text directly. intent is for routing (escalation alerts, dashboard logging). Hosts that want to post-process text (rephrase in their own voice, etc.) can pass an onEnvelope hook.
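A host that post-processes or routes envelopes may want to check their shape first. The following is a minimal sketch of such a check, assuming only the three fields and six intents listed above; `validateEnvelope` and `needsOperator` are hypothetical names, not the helpers that ship in envelope.esm.js.

```javascript
// Hypothetical helpers for illustration; not the package's own API.
const INTENTS = ['greet', 'answer', 'decline', 'clarify', 'redirect', 'escalate'];

// Returns { ok, errors } describing whether env matches {intent, text, citation?}.
function validateEnvelope(env) {
  if (env === null || typeof env !== 'object') {
    return { ok: false, errors: ['envelope must be an object'] };
  }
  const errors = [];
  if (!INTENTS.includes(env.intent)) {
    errors.push(`unknown intent: ${String(env.intent)}`);
  }
  if (typeof env.text !== 'string' || env.text.trim() === '') {
    errors.push('text must be a non-empty string');
  }
  if (env.citation !== undefined && typeof env.citation !== 'string') {
    errors.push('citation, when present, must be a string');
  }
  return { ok: errors.length === 0, errors };
}

// Route on intent only after the shape checks out.
function needsOperator(env) {
  return validateEnvelope(env).ok && env.intent === 'escalate';
}
```

A caller could run `validateEnvelope` before displaying env.text and use `needsOperator` to decide whether to raise an alert; how the real dashboard does this is not specified here.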
Architecture
visitor browser                             operator browser (your Mac)
─────────────────────────                   ─────────────────────────────
Pip                                         pip-relay dashboard
├─ relay-provider                           ├─ session timeline
├─ envelope cache (localStorage)            ├─ manifest editor
└─ in-browser LM                            └─ intervention textarea
      ▲                                             │
      │                                             │ HTTP localhost:7337
      │                                             ▼
      │                                         ai-bridge
      │                                             │
      │                                             ▼ OAuth
      │                                          Claude
      │                                             │
      └─────────── WebRTC data channel ◀────────────┘
                   (signal.neevs.io for pairing,
                    proxy.neevs.io for TURN credentials)
Pairing happens once via signal's pair-request protocol over a fixed lobby room (pip-relay:<your-site>). After acceptance each visitor gets an ephemeral room and a peer-to-peer data channel to the dashboard. TURN credentials are minted on demand from proxy.neevs.io/cloudflare/turn.
Optimistic display
The visitor caches every Claude envelope keyed by the normalized visitor message. On the next message that matches a cached key, the relay-provider shows the cached envelope's text immediately while sending the message to Claude in the background. When Claude's envelope arrives:
- Match → leave the bubble alone.
- Diverge → replace the bubble with the authoritative text.
On a cache hit the visitor sees the reply instantly, without waiting on the network round-trip (the message still goes to Claude in the background). A cache miss is a normal turn: the visitor waits for Claude. The cache lives in localStorage per site (FIFO at 100 entries), so a returning visitor inherits learned predictions.
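The cache-then-reconcile behavior described above can be sketched roughly as follows. This is an illustration under assumptions, not the relay-provider's actual code: the normalization rule (lowercase, trim, collapse whitespace), the storage layout, and every function name here are hypothetical. An in-memory stand-in replaces localStorage so the sketch runs outside a browser.

```javascript
// Hypothetical sketch; names, normalization, and storage layout are assumptions.
const MAX_ENTRIES = 100;

// Assumed normalization: lowercase, trim, collapse runs of whitespace.
function normalize(message) {
  return message.toLowerCase().trim().replace(/\s+/g, ' ');
}

// storage needs getItem/setItem, like window.localStorage.
function createEnvelopeCache(storage, storageKey) {
  // Entries are stored as [key, envelope] pairs, oldest first.
  const load = () => JSON.parse(storage.getItem(storageKey) || '[]');
  const save = (entries) => storage.setItem(storageKey, JSON.stringify(entries));

  return {
    get(message) {
      const key = normalize(message);
      const hit = load().find(([k]) => k === key);
      return hit ? hit[1] : undefined;
    },
    set(message, envelope) {
      const key = normalize(message);
      const entries = load();
      const index = entries.findIndex(([k]) => k === key);
      if (index >= 0) entries[index][1] = envelope; // update in place, keep age
      else entries.push([key, envelope]);
      // FIFO eviction: keep only the newest MAX_ENTRIES entries.
      save(entries.slice(-MAX_ENTRIES));
    },
  };
}

// Decide what to do with the bubble once the authoritative envelope arrives.
function reconcile(shownText, authoritative) {
  return shownText === authoritative.text
    ? { action: 'keep' }
    : { action: 'replace', text: authoritative.text };
}

// In-memory stand-in for localStorage.
function createMemoryStorage() {
  const map = new Map();
  return {
    getItem: (k) => (map.has(k) ? map.get(k) : null),
    setItem: (k, v) => { map.set(k, String(v)); },
  };
}
```

In a browser, `createEnvelopeCache(window.localStorage, 'some-per-site-key')` would give the per-site persistence the text describes; the key name is a placeholder.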
Who this is for
A 1-person creator, consultant, or small team handling many incoming chats on their site. Routine turns are handled by AI; the human watches the live conversation and can type directly into the visitor's chat at any time. Claude can emit an escalate envelope when it's uncertain, which pings the operator.
You bring your own Claude subscription via ai-bridge; pip-relay runs on your Mac. No SaaS, no per-resolution fees, no third-party seeing your visitors' messages.
Visitor (site owner) — install
import { createPip } from 'https://cdn.jsdelivr.net/npm/@jonasneves/pip@2/pip-core.esm.js';
import { createRelayProvider } from 'https://cdn.jsdelivr.net/npm/@nevescloud/pip-relay@0/docs/relay-provider.esm.js';

let pip;
const provider = createRelayProvider({
  siteId: 'your-site-id',

  // Open the WebRTC session on construction (instead of on first submit).
  // Required if you want to surface operator presence in the UI before
  // the visitor types — onStatus will fire 'connecting' immediately and
  // either 'connected' or 'error' shortly after.
  eager: true,

  // While disconnected, retry connecting every reconnectIntervalMs. Lets
  // the visitor's UI come back to life on its own when the operator
  // opens the dashboard mid-visit.
  autoReconnect: true,
  reconnectIntervalMs: 30_000,

  // Tighten the lobby pair-request timeout (default 30s). With nothing on
  // the other side, the visitor would otherwise wait the full default
  // before falling back to onOffline.
  pairTimeoutMs: 10_000,

  // Connection state for the host UI. Status is one of:
  // 'connecting' | 'connected' | 'disconnected' | 'offline' | 'error'.
  onStatus({ status }) {
    document.body.classList.toggle('relay-offline',
      status === 'disconnected' || status === 'offline' || status === 'error');
  },

  // Operator-typed interventions arrive here when no submit is in flight.
  async onUnprompted(env, ctx) {
    if (!pip) return;
    const turnEl = pip.startTurn();
    pip.setReplyText(turnEl, env.text, true);
  },

  // Optional: wire onEnvelope to post-process env.text (rephrase in your
  // own voice, route by intent, decorate with metadata). The default
  // returns env.text unchanged.
  //
  // async onEnvelope(env, ctx) {
  //   return await yourLocalModel.rephrase(env.text);
  // },
});

pip = createPip({ onSubmit: provider.onSubmit });
Operator — run the dashboard
- Install ai-bridge on your Mac. The proxy listens on localhost:7337.
- Open: https://jonasneves.github.io/pip-relay/?site=<your-site-id>&manifest=<your-manifest-url>
- The dashboard advertises itself in the pip-relay:<your-site-id> lobby and pairs with any visitor that loads your site.
You get a live timeline per session, a manifest editor that broadcasts updates to connected visitors, and a textarea to type messages directly to a visitor.
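If the manifest URL itself contains query characters, it is safer to build the dashboard link programmatically than by hand. A small sketch, assuming only the base URL and the site and manifest query parameters shown in the steps above; `dashboardUrl` and `lobbyRoom` are hypothetical helper names.

```javascript
// Hypothetical helpers; only the base URL, the `site` and `manifest`
// parameters, and the lobby naming pattern come from the steps above.
function dashboardUrl(siteId, manifestUrl) {
  const url = new URL('https://jonasneves.github.io/pip-relay/');
  // searchParams.set percent-encodes values, so a manifest URL with its
  // own ?query or & characters survives intact.
  url.searchParams.set('site', siteId);
  url.searchParams.set('manifest', manifestUrl);
  return url.toString();
}

// Lobby room name following the pip-relay:<your-site-id> pattern.
function lobbyRoom(siteId) {
  return `pip-relay:${siteId}`;
}
```

The visitor-side siteId and the dashboard's site parameter presumably need to match so that both ends meet in the same lobby.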
Manifest
The closed set of facts Claude may quote verbatim when composing envelope.text. The same set drives the visitor's offline-fallback render.
{
  "facts": [
    "Jonas Neves builds agent systems for healthcare.",
    "Recent: Duke RadChat — DIHI-funded clinical decision support."
  ],
  "decline": "Not sure — Jonas might know better.",
  "escalate_when": [
    "Visitor asks about pricing or contract terms."
  ]
}
facts is the closed set Claude is allowed to quote in envelope.text. decline is the fallback string when Claude can't answer from the facts. escalate_when (optional) lists situations where Claude should emit intent: "escalate" so the operator gets pinged.
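Since the manifest is edited live and broadcast to visitors, a shape check before broadcasting may catch typos early. The sketch below assumes the three fields described above, with facts and decline required and escalate_when optional; treating facts and decline as required is an assumption, and `validateManifest` is a hypothetical name rather than part of the package.

```javascript
// Hypothetical validator; field names follow the example manifest above.
function validateManifest(manifest) {
  if (manifest === null || typeof manifest !== 'object' || Array.isArray(manifest)) {
    return { ok: false, errors: ['manifest must be a JSON object'] };
  }
  const isStringArray = (v) => Array.isArray(v) && v.every((s) => typeof s === 'string');
  const errors = [];
  if (!isStringArray(manifest.facts)) {
    errors.push('facts must be an array of strings');
  }
  if (typeof manifest.decline !== 'string' || manifest.decline === '') {
    errors.push('decline must be a non-empty string');
  }
  // escalate_when is optional, but must be a string array when present.
  if (manifest.escalate_when !== undefined && !isStringArray(manifest.escalate_when)) {
    errors.push('escalate_when, when present, must be an array of strings');
  }
  return { ok: errors.length === 0, errors };
}
```

Calling `validateManifest(JSON.parse(text))` on the editor contents would return an `errors` list that can be shown to the operator before anything is sent.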
Repo layout
pip-relay/
└── docs/
    ├── relay-provider.esm.js   ← visitor-side Pip provider (npm-published)
    ├── envelope.esm.js         ← typed schema + helpers (npm-published)
    ├── transport.esm.js        ← WebRTC pairing + data channels (npm-published)
    ├── manifest.example.json   ← offline-fallback example (npm-published)
    ├── signal/                 ← vendored from jonasneves/signal/src/client/
    ├── index.html              ← operator dashboard (Pages-only)
    ├── dashboard.esm.js        ← dashboard logic (Pages-only)
    └── visitor.html            ← visitor test page (Pages-only)
The flat docs/ layout is deliberate: GitHub Pages won't follow symlinks targeting outside the served folder, so every loadable file lives here. The npm package ships only the visitor-relevant subset via package.json's files field.
Composition
pip-relay is glue. The interesting bits are in the projects it composes:
- signal — WebRTC signaling rooms on Cloudflare's edge. Vendored: peer.js, peer-key.js, pair-request.js, room-lobby.js, discover.js.
- ai-bridge — Mac-resident proxy for Claude/OpenAI/Gemini using your subscription tokens. Removes the API-key-in-browser problem.
- Pip — the floating chat bubble + panel UI primitive. pip-relay implements a Pip provider; the visitor's host wires its own model.
What it's not
- Not a SaaS. Each operator runs their own dashboard with their own ai-bridge. "Multi-tenant" here means many operators each running their own instance, not a shared service.
- Not authenticated for visitors. Visitors are anonymous; the operator's acceptance of a pair-request is the trust boundary.
- Not a general-purpose chat agent. Hard scope: 1-2 sentence envelope rendering, fact-based routing, no tool use on the visitor side. For a streaming general agent, use Pip's runtime layer with a streaming provider directly.
Status
| Phase | Scope | Status |
|---|---|---|
| 1 | Round-trip plumbing: pair-request → ephemeral room → WebRTC | ✓ |
| 2 | Envelope schema + manifest pipeline | ✓ |
| 3 | Per-session focus + live manifest editor | ✓ |
| 4 | Operator intervention via onUnprompted | ✓ |
| 5 | Optimistic display via envelope cache | ✓ |
| 6 | Escalation: alerts on intent: escalate, configurable via manifest.escalate_when | ✓ |
| 7 | text field on the envelope; visitor displays it directly | ✓ |
| 8 | Schema collapsed to {intent, text, citation?} — facts, register, exemplars, LFM render hooks, AbortSignal speculation all removed | ✓ |
License
MIT.
