@mnemix-ai/vapi-kit
v0.1.0
Mnemix integration kit for Vapi voice assistants — assistant-request enrichment, per-call cache, post-call memory write-back.
@mnemix-ai/vapi-kit
Mnemix integration kit for Vapi voice agents. Assistant-request enrichment, per-turn cache reuse, and end-of-call memory write-back.
Why this exists
Vapi can ask your server.url for an assistant at call start and can call the same server during later LLM turns. Mnemix is the memory and caller enrichment layer behind that flow: every Vapi call can resolve the caller through Twilio Lookup, Trestle, and Baylio, return prompt variables before the assistant speaks, then write the transcript and outcome back after hangup.
The kit is intentionally thin. It verifies Vapi webhook signatures when configured, calls the supported Mnemix v1 endpoints, builds Vapi-ready assistant context, caches recall by call.id, and degrades to safe defaults when external data is unavailable.
Install
npm install @mnemix-ai/vapi-kit @mnemix-ai/client

Quickstart: Cloudflare Worker
import { Mnemix } from "@mnemix-ai/client";
import {
  handleVapiEndOfCallReport,
  handleVapiServerUrl,
  type VapiEndOfCallReportPayload,
  type VapiServerUrlEvent,
} from "@mnemix-ai/vapi-kit";

interface Env {
  MNEMIX_KEY: string;
  VAPI_WEBHOOK_SECRET?: string;
  MNEMIX_TENANT?: string;
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    if (req.method !== "POST") {
      return new Response("Method not allowed", { status: 405 });
    }

    // Read the body as text so the kit can verify the signature over the raw bytes.
    const rawBody = await req.text();
    const signature = req.headers.get("x-vapi-signature") ?? undefined;
    const payload = JSON.parse(rawBody) as VapiServerUrlEvent | VapiEndOfCallReportPayload;

    const mnemix = new Mnemix({ apiKey: env.MNEMIX_KEY });
    const opts = {
      mnemix,
      vapiWebhookSecret: env.VAPI_WEBHOOK_SECRET,
      tenantId: env.MNEMIX_TENANT,
    };

    // End-of-call reports take the write-back path.
    if (payload.message.type === "end-of-call-report") {
      return Response.json(
        await handleVapiEndOfCallReport(opts, payload as VapiEndOfCallReportPayload, {
          signature,
          rawBody,
        }),
      );
    }

    // Every other event type goes through the server URL handler.
    const response = await handleVapiServerUrl(opts, payload as VapiServerUrlEvent, {
      signature,
      rawBody,
      tone: "warm",
      agentName: "the assistant",
    });
    return Response.json(response);
  },
};

Point Vapi's assistant server.url at the Worker route that contains this handler, then use examples/vapi-assistant.json as a starting config.
Quickstart: Node Express
import express from "express";
import { Mnemix } from "@mnemix-ai/client";
import {
  handleVapiEndOfCallReport,
  handleVapiServerUrl,
  type VapiEndOfCallReportPayload,
  type VapiServerUrlEvent,
} from "@mnemix-ai/vapi-kit";

const app = express();
// Parse JSON bodies as plain text so the raw payload stays available for signature checks.
app.use(express.text({ type: "application/json" }));

const mnemix = new Mnemix({ apiKey: process.env.MNEMIX_KEY! });

app.post("/vapi/server", async (req, res) => {
  const rawBody = req.body as string;
  const signature = req.header("x-vapi-signature") ?? undefined;
  const payload = JSON.parse(rawBody) as VapiServerUrlEvent | VapiEndOfCallReportPayload;
  const opts = {
    mnemix,
    vapiWebhookSecret: process.env.VAPI_WEBHOOK_SECRET,
    tenantId: process.env.MNEMIX_TENANT,
  };

  // End-of-call reports take the write-back path.
  if (payload.message.type === "end-of-call-report") {
    const result = await handleVapiEndOfCallReport(opts, payload as VapiEndOfCallReportPayload, {
      signature,
      rawBody,
    });
    res.json(result);
    return;
  }

  const response = await handleVapiServerUrl(opts, payload as VapiServerUrlEvent, {
    signature,
    rawBody,
    tone: "professional",
  });
  res.json(response);
});

app.listen(process.env.PORT ? Number(process.env.PORT) : 8787);

Quickstart: Next.js App Router
// app/api/vapi/server/route.ts
import { Mnemix } from "@mnemix-ai/client";
import {
  handleVapiEndOfCallReport,
  handleVapiServerUrl,
  type VapiEndOfCallReportPayload,
  type VapiServerUrlEvent,
} from "@mnemix-ai/vapi-kit";

const mnemix = new Mnemix({ apiKey: process.env.MNEMIX_KEY! });

export async function POST(req: Request) {
  // Read the body as text so the kit can verify the signature over the raw bytes.
  const rawBody = await req.text();
  const signature = req.headers.get("x-vapi-signature") ?? undefined;
  const payload = JSON.parse(rawBody) as VapiServerUrlEvent | VapiEndOfCallReportPayload;
  const opts = {
    mnemix,
    vapiWebhookSecret: process.env.VAPI_WEBHOOK_SECRET,
    tenantId: process.env.MNEMIX_TENANT,
  };

  // End-of-call reports take the write-back path.
  if (payload.message.type === "end-of-call-report") {
    return Response.json(
      await handleVapiEndOfCallReport(opts, payload as VapiEndOfCallReportPayload, {
        signature,
        rawBody,
      }),
    );
  }

  return Response.json(
    await handleVapiServerUrl(opts, payload as VapiServerUrlEvent, {
      signature,
      rawBody,
      tone: "warm",
    }),
  );
}

Server URL Events
Vapi sends all events to the configured server.url as JSON with a message.type.
assistant-request is the cold-start path. The kit reads message.call.id as the cache key and message.call.customer.number as the canonical E.164 caller phone number, calls Mnemix once, builds variables, and returns a Vapi assistant response.
function-call is the per-turn path. Vapi's server.url fires per LLM turn. The kit caches the Mnemix recall by call.id (default 15 min TTL) so per-turn invocations don't re-hit Mnemix. Function handlers can read the cached caller context and return small tool results without repeating enrichment.
end-of-call-report is the write-back path. The kit maps Vapi's artifact.transcript, artifact.messages, artifact.recordingUrl, timestamps, and endedReason into POST /v1/calls/end, then invalidates the cache entry for the completed call.id.
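The three paths above can be pictured as a switch on message.type. The following is a minimal sketch only: the event shapes are trimmed-down, hypothetical stand-ins for the kit's VapiServerUrlEvent and VapiEndOfCallReportPayload types, and routeEvent is an illustrative name, not a function the kit exports.

```typescript
// Hypothetical minimal event shapes; the kit's real types carry more fields.
interface CallRef {
  id: string;
  customer?: { number?: string };
}

type ServerEvent =
  | { message: { type: "assistant-request"; call: CallRef } }
  | { message: { type: "function-call"; call: CallRef } }
  | { message: { type: "end-of-call-report"; call: CallRef; endedReason?: string } };

type RoutedEvent =
  | { path: "cold-start"; callId: string; phone: string | null }
  | { path: "per-turn"; callId: string }
  | { path: "write-back"; callId: string; endedReason: string | null };

// Illustrative router: picks a path from message.type and pulls out the
// fields each path needs (call.id as cache key, customer.number as caller).
function routeEvent(event: ServerEvent): RoutedEvent {
  const msg = event.message;
  switch (msg.type) {
    case "assistant-request":
      return {
        path: "cold-start",
        callId: msg.call.id,
        phone: msg.call.customer?.number ?? null,
      };
    case "function-call":
      return { path: "per-turn", callId: msg.call.id };
    case "end-of-call-report":
      return {
        path: "write-back",
        callId: msg.call.id,
        endedReason: msg.endedReason ?? null,
      };
  }
}
```

Modelling the events as a discriminated union lets the compiler check that every message.type is handled and that each branch only touches fields that exist on that event.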
Cache Layer
The cache exists because Vapi may invoke your server URL on every LLM turn. The first assistant-request stores Mnemix recall and prompt variables under call.id; later function-call messages with the same call.id reuse that context.
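The cache lives in memory, per Worker isolate. Requests for the same call.id are expected to stay on one isolate for the call's duration, which is what makes this workable in production, although isolate affinity is not guaranteed. If you scale across multiple processes, the cache hit rate may drop on long calls.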
Default TTL is 15 minutes. If a call outlives the TTL, the kit re-fetches from Mnemix on the next per-turn event and falls back to the last known context if the re-fetch fails.
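To make the TTL and fallback behavior concrete, here is a minimal sketch of a per-call cache with a stale fallback. CallCache and recallWithCache are hypothetical names chosen for illustration; the kit's internal cache may be structured differently.

```typescript
interface CacheEntry<T> {
  value: T;
  expiresAt: number;
}

// Hypothetical in-memory cache keyed by call.id; not the kit's actual class.
class CallCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The default TTL mirrors the 15 minutes described above; the clock is injectable for tests.
  constructor(ttlMs: number = 15 * 60 * 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(callId: string, value: T): void {
    this.entries.set(callId, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Returns the stored value and whether it is still inside its TTL.
  peek(callId: string): { value: T; fresh: boolean } | undefined {
    const entry = this.entries.get(callId);
    if (!entry) return undefined;
    return { value: entry.value, fresh: entry.expiresAt > this.now() };
  }

  invalidate(callId: string): void {
    this.entries.delete(callId);
  }
}

// Serve fresh entries from memory, re-fetch expired ones, and fall back to
// the last known context if the re-fetch fails.
async function recallWithCache<T>(
  cache: CallCache<T>,
  callId: string,
  fetchRecall: () => Promise<T>,
): Promise<T> {
  const cached = cache.peek(callId);
  if (cached?.fresh) return cached.value;
  try {
    const value = await fetchRecall();
    cache.set(callId, value);
    return value;
  } catch (err) {
    if (cached) return cached.value; // expired, but better than no context
    throw err;
  }
}
```

Expired entries are kept until they are overwritten or invalidated, which is what allows the stale fallback; calling invalidate(callId) on end-of-call-report keeps the map from growing.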
What You Get Back
handleVapiServerUrl() returns a Vapi-ready JSON response. For assistant requests, that response can include variable values and system prompt context derived from:
interface VapiCallVariables {
  caller_name: string;
  is_returning: boolean;
  last_intent: string | null;
  last_call_summary: string | null;
  carrier: string | null;
  line_type: string | null;
  company: string | null;
  role: string | null;
  industry: string | null;
  suggested_intent: string | null;
}

The variables are the prompt-safe subset. Keep full Mnemix responses in logs only when your retention policy allows it.
System Prompt Customization
import { buildSystemPrompt } from "@mnemix-ai/vapi-kit";

const systemPrompt = buildSystemPrompt(ctx.variables, {
  tone: "warm",
  agentName: "Avery",
  brandName: "Northstar Support",
  fallbackGreeting: "Hi there",
});

The builder does not invent missing history. If Mnemix returns no memory, the prompt tells the agent to ask rather than assume.
Webhook Signature Verification
Set VAPI_WEBHOOK_SECRET in your deployment and configure the same secret in Vapi. Vapi sends x-vapi-signature as sha256=<hex>. Pass the raw request body and signature header to the kit so verifyVapiSignature() can check the exact bytes Vapi signed.
If signature verification fails, do not return enriched context. The kit rejects the request or returns safe defaults depending on the handler path you call.
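As an illustration of what checking the exact bytes involves, the sketch below verifies a sha256=<hex> header with Node's crypto module. It assumes the signature is an HMAC-SHA256 of the raw body keyed with the shared secret, which this README does not spell out; verifySignatureSketch is a hypothetical stand-in, not the kit's verifyVapiSignature(). A Worker would need Web Crypto or Node compatibility enabled to run it.

```typescript
import { Buffer } from "node:buffer";
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical verifier. Assumption: signature = HMAC-SHA256(secret, rawBody), hex-encoded.
function verifySignatureSketch(
  rawBody: string,
  header: string | undefined,
  secret: string,
): boolean {
  const prefix = "sha256=";
  if (!header || !header.startsWith(prefix)) return false;

  // A SHA-256 digest is 32 bytes, so the hex part must be exactly 64 hex characters.
  const receivedHex = header.slice(prefix.length);
  if (!/^[0-9a-f]{64}$/i.test(receivedHex)) return false;

  // Hash the raw string, not a re-serialized JSON object, so the bytes match what was signed.
  const expected = createHmac("sha256", secret).update(rawBody, "utf8").digest();

  // Constant-time comparison; both buffers are 32 bytes at this point.
  return timingSafeEqual(Buffer.from(receivedHex, "hex"), expected);
}
```

Validating the header's length and character set before comparing matters because timingSafeEqual throws when the two buffers differ in length.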
Graceful Degradation
Voice agents should not block on enrichment. If Mnemix times out, returns a 5xx, is given a malformed phone number, or returns only partial enrichment, the kit fills the gaps with defaults such as caller_name: "there" and is_returning: false. Vapi can continue the call while Mnemix logs the failure path.
Partial enrichment is still useful. Twilio Lookup can supply carrier data while Trestle is unavailable, or Baylio can suggest a likely intent without a full profile.
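One way to picture this behavior is a timeout race followed by a merge onto defaults. The sketch below is illustrative only: withDefaults and recallOrDefaults are hypothetical helpers, the null defaults for fields other than caller_name and is_returning are an assumption, and the interface is copied from the variables listed earlier.

```typescript
interface VapiCallVariables {
  caller_name: string;
  is_returning: boolean;
  last_intent: string | null;
  last_call_summary: string | null;
  carrier: string | null;
  line_type: string | null;
  company: string | null;
  role: string | null;
  industry: string | null;
  suggested_intent: string | null;
}

// Assumed defaults: only caller_name and is_returning are stated in the text above.
const DEFAULT_VARIABLES: VapiCallVariables = {
  caller_name: "there",
  is_returning: false,
  last_intent: null,
  last_call_summary: null,
  carrier: null,
  line_type: null,
  company: null,
  role: null,
  industry: null,
  suggested_intent: null,
};

// Keep whatever enrichment arrived and fill every missing field with its default.
function withDefaults(partial: Partial<VapiCallVariables>): VapiCallVariables {
  return {
    caller_name: partial.caller_name?.trim() || DEFAULT_VARIABLES.caller_name,
    is_returning: partial.is_returning ?? DEFAULT_VARIABLES.is_returning,
    last_intent: partial.last_intent ?? DEFAULT_VARIABLES.last_intent,
    last_call_summary: partial.last_call_summary ?? DEFAULT_VARIABLES.last_call_summary,
    carrier: partial.carrier ?? DEFAULT_VARIABLES.carrier,
    line_type: partial.line_type ?? DEFAULT_VARIABLES.line_type,
    company: partial.company ?? DEFAULT_VARIABLES.company,
    role: partial.role ?? DEFAULT_VARIABLES.role,
    industry: partial.industry ?? DEFAULT_VARIABLES.industry,
    suggested_intent: partial.suggested_intent ?? DEFAULT_VARIABLES.suggested_intent,
  };
}

// Race the recall against a timer; any failure or timeout yields pure defaults.
async function recallOrDefaults(
  recall: () => Promise<Partial<VapiCallVariables>>,
  timeoutMs: number,
): Promise<VapiCallVariables> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("recall timed out")), timeoutMs);
  });
  try {
    return withDefaults(await Promise.race([recall(), timeout]));
  } catch {
    return withDefaults({});
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

With this shape, a response that only carries carrier data still reaches the prompt, while the greeting falls back to "there".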
Configuration
| Option | Required | Description |
| --- | --- | --- |
| mnemix | Yes | A Mnemix client constructed with your API key. |
| vapiWebhookSecret | No, recommended | Shared secret used to verify Vapi webhook signatures. |
| tenantId | No | Optional tenant override when one API key serves multiple tenants. |
| recallTimeoutMs | No | Maximum time to wait before falling back to default assistant context. |
| cacheTtlMs | No | Per-call cache TTL. Defaults to 15 minutes. |
| logger | No | Logger with info, warn, and error; defaults to console. |
The 3 v1 Endpoints This Kit Calls
POST /v1/recall_and_enrich for assistant-request memory and enrichment.
POST /v1/calls/end for end-of-call memory write-back.
GET /v1/caller/{phone_number} for optional out-of-band caller lookup.
The default API base URL is https://mnemix-api.sayeed965.workers.dev. The Vapi path is designed for sub-300 ms voice recall; no numeric Mnemix p95 or p99 figures are published.
Privacy
Phone numbers are normalized to E.164 before use. Audit logging stores a phone hash produced with HMAC-SHA256 from the tenant secret and the normalized phone number, not the number itself. The kit sends caller context to Mnemix only; it does not send PII to OpenAI or Anthropic. If your agent sends transcripts to a model provider, redact or filter those transcripts in your application before write-back.
Keep Vapi recordings, transcripts, and webhook payloads aligned with your own consent, retention, and data processing terms.
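The phone hash described above can be sketched as follows. This is an illustration under stated assumptions, not the kit's code: it assumes the tenant secret is the HMAC key and the normalized number is the message, and normalizeE164Sketch is a deliberately naive normalizer that treats bare 10-digit input as a US number. Both function names are hypothetical.

```typescript
import { createHmac } from "node:crypto";

// Naive E.164 normalizer for illustration only. Assumption: a bare 10-digit
// number belongs to the default country code ("1").
function normalizeE164Sketch(input: string, defaultCountryCode: string = "1"): string | null {
  const trimmed = input.trim();
  const digits = trimmed.replace(/\D/g, "");
  if (digits.length === 0) return null;

  const full =
    !trimmed.startsWith("+") && digits.length === 10 ? defaultCountryCode + digits : digits;

  // E.164 allows at most 15 digits with no leading zero; this sketch also requires at least 8.
  if (!/^[1-9]\d{7,14}$/.test(full)) return null;
  return "+" + full;
}

// Assumption: HMAC-SHA256 keyed with the tenant secret over the normalized number.
function phoneHashSketch(tenantSecret: string, phone: string): string | null {
  const normalized = normalizeE164Sketch(phone);
  if (normalized === null) return null;
  return createHmac("sha256", tenantSecret).update(normalized, "utf8").digest("hex");
}
```

Normalizing before hashing is what makes differently formatted inputs for the same number map to one audit key, and keying the hash per tenant keeps those keys from being comparable across tenants.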
License
MIT
