@cortexmemory/vercel-ai-provider v0.2.0
Cortex Memory Provider for Vercel AI SDK
Persistent memory for your AI applications powered by Cortex and Convex
Add long-term memory to any Vercel AI SDK application with a single import. Built on Cortex for TypeScript-native memory management with zero vendor lock-in.
✨ Features
- 🧠 Automatic Memory - Retrieves relevant context before each response, stores conversations after
- 🚀 Zero Configuration - Works out of the box with sensible defaults
- 📦 TypeScript Native - Built for TypeScript, not ported from Python
- 🔒 Self-Hosted - Deploy Convex anywhere, no API keys or vendor lock-in
- ⚡ Edge Compatible - Works in Vercel Edge Functions, Cloudflare Workers
- 🎯 Memory Spaces - Isolate memory by user, team, or project
- 🐝 Hive Mode - Share memory across multiple agents/applications
- 📊 ACID Guarantees - Never lose data with Convex transactions
- 🔍 Semantic Search - Find relevant memories with embeddings
- 🧬 Fact Extraction - Optional LLM-powered fact extraction for 60-90% storage savings
Quick Start
Installation
npm install @cortexmemory/vercel-ai-provider @cortexmemory/sdk ai convex

What's New in v0.2.0
The provider now uses the enhanced rememberStream() API, unlocking powerful streaming capabilities:
- Progressive Storage - Store partial responses during streaming for resumability
- Streaming Hooks - Monitor progress with onChunk, onProgress, and onComplete callbacks
- Comprehensive Metrics - Track latency, throughput, token usage, and costs
- Progressive Fact Extraction - Extract facts incrementally during streaming
- Error Recovery - Resume interrupted streams with checkpoints
- Adaptive Processing - Auto-optimize based on stream characteristics
All features are opt-in and fully backward compatible.
Setup
- Deploy Cortex Backend to Convex:
npx create-cortex-memories
# Follow the wizard to set up the Convex backend
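The wizard prints your Convex deployment URL; the provider reads it from CONVEX_URL. A hypothetical .env.local (the actual URL comes from your Convex dashboard):

# .env.local
CONVEX_URL=https://your-deployment.convex.cloud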
- Create Memory-Enabled Chat:

// app/api/chat/route.ts
import { createCortexMemory } from "@cortexmemory/vercel-ai-provider";
import { openai } from "@ai-sdk/openai";
import { streamText } from "ai";
const cortexMemory = createCortexMemory({
convexUrl: process.env.CONVEX_URL!,
memorySpaceId: "my-chatbot",
userId: "user-123", // Get from session/auth in production
});
export async function POST(req: Request) {
const { messages } = await req.json();
const result = await streamText({
model: cortexMemory(openai("gpt-5-nano")),
messages,
});
return result.toDataStreamResponse();
}

- Use in Your UI:
// app/page.tsx
'use client';
import { useChat } from 'ai/react';
export default function Chat() {
const { messages, input, handleInputChange, handleSubmit } = useChat();
return (
<div>
{messages.map(m => <div key={m.id}>{m.content}</div>)}
<form onSubmit={handleSubmit}>
<input value={input} onChange={handleInputChange} />
<button type="submit">Send</button>
</form>
</div>
);
}

That's it! Your AI now has persistent memory that works across sessions.
Enhanced Streaming Features (v0.2.0+)
The provider now includes powerful streaming enhancements powered by the rememberStream() API.
Progressive Storage
Store partial responses during streaming for resumability:
const cortexMemory = createCortexMemory({
convexUrl: process.env.CONVEX_URL!,
memorySpaceId: "demo-chat",
userId: "user-123",
streamingOptions: {
storePartialResponse: true,
partialResponseInterval: 3000, // Update every 3 seconds
},
});

Streaming Hooks
Monitor streaming progress in real-time:
const cortexMemory = createCortexMemory({
convexUrl: process.env.CONVEX_URL!,
memorySpaceId: "demo-chat",
userId: "user-123",
streamingHooks: {
onChunk: (event) => {
console.log(`Chunk ${event.chunkNumber}: ${event.chunk}`);
},
onProgress: (event) => {
console.log(`Progress: ${event.bytesProcessed} bytes`);
updateProgressBar(event.bytesProcessed);
},
onComplete: (event) => {
console.log(`Completed in ${event.durationMs}ms`);
console.log(`Facts extracted: ${event.factsExtracted}`);
},
onError: (error) => {
console.error("Stream error:", error.message);
},
},
});

Comprehensive Metrics
Automatic collection of streaming performance metrics:
const cortexMemory = createCortexMemory({
convexUrl: process.env.CONVEX_URL!,
memorySpaceId: "demo-chat",
userId: "user-123",
enableStreamMetrics: true, // Default: true
});

Metrics include:
- First chunk latency
- Total stream duration
- Chunks per second
- Estimated tokens and costs
- Performance bottlenecks and recommendations
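The documented hooks are enough to derive your own numbers as well. A minimal sketch that computes chunks per second from the onChunk and onComplete callbacks, assuming only the durationMs field shown in the hooks example above:

let chunkCount = 0;

const cortexMemory = createCortexMemory({
  convexUrl: process.env.CONVEX_URL!,
  memorySpaceId: "demo-chat",
  userId: "user-123",
  enableStreamMetrics: true,
  streamingHooks: {
    // Count chunks as they arrive
    onChunk: () => {
      chunkCount += 1;
    },
    // Derive throughput from durationMs (field shown in the hooks example)
    onComplete: (event) => {
      const seconds = event.durationMs / 1000;
      console.log(`${(chunkCount / seconds).toFixed(1)} chunks/sec (${chunkCount} chunks)`);
    },
  },
});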
Progressive Fact Extraction
Extract facts incrementally during streaming instead of waiting for completion:
const cortexMemory = createCortexMemory({
convexUrl: process.env.CONVEX_URL!,
memorySpaceId: "demo-chat",
userId: "user-123",
enableFactExtraction: true,
streamingOptions: {
progressiveFactExtraction: true,
factExtractionThreshold: 500, // Extract every 500 characters
},
});

Error Recovery
Handle interrupted streams with resume tokens:
const cortexMemory = createCortexMemory({
convexUrl: process.env.CONVEX_URL!,
memorySpaceId: "demo-chat",
userId: "user-123",
streamingOptions: {
partialFailureHandling: "store-partial", // or 'rollback', 'retry', 'best-effort'
maxRetries: 3,
generateResumeToken: true,
streamTimeout: 30000, // 30 seconds
},
});

Complete Example with All Features
import { createCortexMemory } from "@cortexmemory/vercel-ai-provider";
import { openai } from "@ai-sdk/openai";
import { streamText, embed } from "ai";
const cortexMemory = createCortexMemory({
convexUrl: process.env.CONVEX_URL!,
memorySpaceId: "advanced-chat",
userId: "user-123",
// Progressive storage
streamingOptions: {
storePartialResponse: true,
partialResponseInterval: 3000,
progressiveFactExtraction: true,
progressiveGraphSync: true,
enableAdaptiveProcessing: true,
},
// Real-time hooks
streamingHooks: {
onProgress: (event) => {
websocket.send(JSON.stringify({ type: "progress", data: event })); // push to your own WebSocket connection
},
onComplete: (event) => {
console.log(`Stream metrics:`, event);
},
},
// Semantic search
embeddingProvider: {
generate: async (text) => {
const { embedding } = await embed({
model: openai.embedding("text-embedding-3-small"),
value: text,
});
return embedding;
},
},
// Fact extraction
enableFactExtraction: true,
});

How It Works
Automatic Memory Flow
Every time your AI generates a response:
- 🔍 Search - Cortex searches past conversations for relevant context
- 💉 Inject - Relevant memories are injected into the prompt
- 🤖 Generate - LLM generates response with full context
- 💾 Store - Conversation is automatically stored for future reference
User: "Hi, my name is Alice"
Agent: "Nice to meet you, Alice!"
↓
[Stored in Cortex]
↓
[Refresh page / New session]
↓
User: "What's my name?"
↓
[Cortex searches memories]
↓
[Finds: "my name is Alice"]
↓
[Injects context into prompt]
↓
Agent: "Your name is Alice!"What's Happening Behind the Scenes
// When you call streamText with cortexMemory:
const result = await streamText({
model: cortexMemory(openai("gpt-5-nano")),
messages: [{ role: "user", content: "What did I tell you earlier?" }],
});
// Cortex automatically:
// 1. Searches memories: "What did I tell you earlier?"
// 2. Finds relevant memories from past conversations
// 3. Injects them into the system prompt:
// "Relevant context from past conversations:
// 1. User said their name is Alice
// 2. User prefers dark mode
// ..."
// 4. Calls OpenAI with augmented prompt
// 5. Stores new conversation turn for future reference

Configuration
Basic Configuration
const cortexMemory = createCortexMemory({
// Required
convexUrl: process.env.CONVEX_URL!,
memorySpaceId: "my-agent",
userId: "user-123",
// Optional
userName: "Alice",
conversationId: () => generateConversationId(), // user-supplied helper (see sketch below)
});
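The generateConversationId helper above is user-supplied, not part of the provider. A minimal sketch, assuming any stable unique ID per conversation is acceptable:

// Hypothetical helper for the conversationId option above.
function generateConversationId(): string {
  return `conv-${crypto.randomUUID()}`;
}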
With Embeddings

import { embed } from "ai";
import { openai } from "@ai-sdk/openai";
const cortexMemory = createCortexMemory({
convexUrl: process.env.CONVEX_URL!,
memorySpaceId: "my-agent",
userId: "user-123",
// Enable semantic search with embeddings
embeddingProvider: {
generate: async (text) => {
const { embedding } = await embed({
model: openai.embedding("text-embedding-3-small"),
value: text,
});
return embedding;
},
},
// Fine-tune memory retrieval
memorySearchLimit: 10,
minMemoryRelevance: 0.75,
});

With Fact Extraction
const cortexMemory = createCortexMemory({
convexUrl: process.env.CONVEX_URL!,
memorySpaceId: "smart-agent",
userId: "user-123",
// Enable automatic fact extraction
enableFactExtraction: true,
extractFacts: async (userMsg, agentResp) => {
// Use an LLM to extract structured facts (helper sketched below)
const facts = await extractFactsWithLLM(userMsg + " " + agentResp);
return facts;
},
});
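Similarly, extractFactsWithLLM is user-supplied. One possible sketch uses the AI SDK's generateObject with a Zod schema; returning plain strings is an assumption here, so adjust the shape to whatever your Cortex version expects:

import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

// Sketch of an LLM-backed fact extractor (not part of the provider).
async function extractFactsWithLLM(text: string): Promise<string[]> {
  const { object } = await generateObject({
    model: openai("gpt-5-nano"),
    schema: z.object({
      facts: z.array(z.string()).describe("Standalone, durable facts about the user"),
    }),
    prompt: `Extract durable facts worth remembering from this exchange:\n\n${text}`,
  });
  return object.facts;
}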
With Hive Mode (Cross-Application Memory)

const cortexMemory = createCortexMemory({
convexUrl: process.env.CONVEX_URL!,
memorySpaceId: "shared-workspace", // Shared across apps
userId: "user-123",
hiveMode: {
participantId: "web-assistant", // Track which agent/tool
},
});
// Now this agent's memories are visible to other agents
// in the same memory space (e.g., Cursor MCP, Claude Desktop)

API Reference
createCortexMemory(config)
Creates a memory-augmented model factory.
Parameters:
| Parameter                | Type                   | Required | Description                                      |
| ------------------------ | ---------------------- | -------- | ------------------------------------------------ |
| convexUrl                | string                 | ✅       | Convex deployment URL                            |
| memorySpaceId            | string                 | ✅       | Memory space for isolation                       |
| userId                   | string \| () => string | ✅       | User ID (static or function)                     |
| userName                 | string                 | ❌       | User name (default: 'User')                      |
| conversationId           | string \| () => string | ❌       | Conversation ID (auto-generated if not provided) |
| embeddingProvider        | object                 | ❌       | Custom embedding provider                        |
| memorySearchLimit        | number                 | ❌       | Max memories to retrieve (default: 5)            |
| minMemoryRelevance       | number                 | ❌       | Min score 0-1 (default: 0.7)                     |
| contextInjectionStrategy | 'system' \| 'user'     | ❌       | Where to inject context (default: 'system')      |
| enableFactExtraction     | boolean                | ❌       | Enable fact extraction (default: false)          |
| enableGraphMemory        | boolean                | ❌       | Sync to graph DB (default: false)                |
| hiveMode                 | object                 | ❌       | Enable cross-app memory                          |
| defaultImportance        | number                 | ❌       | Default importance 0-100 (default: 50)           |
| debug                    | boolean                | ❌       | Enable debug logging (default: false)            |
Returns: CortexMemoryModel - Function to wrap models + manual memory methods
Model Wrapping
const cortexMemory = createCortexMemory({
/* config */
});
// Wrap any Vercel AI SDK provider
const model1 = cortexMemory(openai("gpt-5-nano"));
const model2 = cortexMemory(anthropic("claude-3-opus"));
const model3 = cortexMemory(google("gemini-pro"));
// Use with streamText, generateText, generateObject, etc.
const result = await streamText({ model: model1, messages });

Manual Memory Control
// Search memories manually
const memories = await cortexMemory.search("user preferences", {
limit: 10,
minScore: 0.8,
});
// Store memory manually
await cortexMemory.remember(
"My favorite color is blue",
"Noted, I will remember that!",
{ conversationId: "conv-123" },
);
// Get all memories
const all = await cortexMemory.getMemories({ limit: 100 });
// Clear memories (requires confirmation)
await cortexMemory.clearMemories({ confirm: true });
// Get current configuration
const config = cortexMemory.getConfig();

Examples
Basic Chat
See examples/next-chat - Simple chat with memory (5 files, ~200 lines)
RAG Pattern
See examples/next-rag - Document search + conversation memory
Multi-Modal
See examples/next-multimodal - Images + text with memory
Hive Mode
See examples/hive-mode - Cross-application memory sharing
Multi-Tenant
See examples/memory-spaces - SaaS with tenant isolation
Comparison with mem0
| Feature           | Cortex                  | mem0                             |
| ----------------- | ----------------------- | -------------------------------- |
| Hosting           | ✅ Self-hosted (Convex) | ❌ Cloud only (API key required) |
| TypeScript        | ✅ Native               | ⚠️ Ported from Python            |
| Edge Runtime      | ✅ Full support         | ❌ Limited                       |
| Memory Spaces     | ✅ Built-in             | ❌ Not available                 |
| ACID Guarantees   | ✅ Full (Convex)        | ❌ Eventual consistency          |
| Real-time Updates | ✅ Reactive queries     | ❌ Polling/webhooks              |
| Hive Mode         | ✅ Cross-app sharing    | ❌ Not available                 |
| Versioning        | ✅ 10 versions auto     | ❌ No versioning                 |
| Cost              | 💰 Convex pricing       | 💰 mem0 API + LLM                |
| Data Sovereignty  | ✅ Your infrastructure  | ❌ mem0 cloud                    |
Migration from mem0
Before (mem0):
import { createMem0 } from "@mem0/vercel-ai-provider";
const mem0 = createMem0({
provider: "openai",
mem0ApiKey: process.env.MEM0_API_KEY!,
config: { apiKey: process.env.OPENAI_API_KEY! },
mem0Config: { user_id: "user-123" },
});
const result = await streamText({
model: mem0("gpt-5-nano"),
messages,
});

After (Cortex):
import { createCortexMemory } from "@cortexmemory/vercel-ai-provider";
import { openai } from "@ai-sdk/openai";
const cortexMemory = createCortexMemory({
convexUrl: process.env.CONVEX_URL!, // Self-hosted, no API key
memorySpaceId: "my-chatbot",
userId: "user-123",
});
const result = await streamText({
model: cortexMemory(openai("gpt-5-nano")),
messages,
});

Benefits of switching:
- ✅ No mem0 API key needed (one less dependency)
- ✅ Self-hosted (full control over data)
- ✅ Memory Spaces (better isolation)
- ✅ Real-time updates (Convex reactive queries)
- ✅ ACID guarantees (no data loss)
- ✅ Versioning (track changes over time)
Advanced Usage
Custom Context Injection
const cortexMemory = createCortexMemory({
convexUrl: process.env.CONVEX_URL!,
memorySpaceId: "custom-agent",
userId: "user-123",
// Custom context builder
customContextBuilder: (memories) => {
const important = memories.filter(
  (m) =>
    (("metadata" in m
      ? m.metadata?.importance
      : m.memory?.metadata?.importance) ?? 0) > 70,
);
return `Critical information:\n${important
.map((m) => ("content" in m ? m.content : m.memory?.content))
.join("\n")}`;
},
});

Dynamic User Resolution
import { auth } from "@clerk/nextjs";
const cortexMemory = createCortexMemory({
convexUrl: process.env.CONVEX_URL!,
memorySpaceId: "saas-app",
// Resolve from auth system
userId: async () => {
const { userId } = await auth();
if (!userId) throw new Error("Unauthorized");
return userId;
},
});

Per-Request Memory Spaces
export async function POST(req: Request) {
const { teamId } = await req.json();
// Create memory provider per request
const teamMemory = createCortexMemory({
convexUrl: process.env.CONVEX_URL!,
memorySpaceId: `team-${teamId}`, // Isolated per team
userId: currentUser.id, // from your auth/session
});
const result = await streamText({
model: teamMemory(openai("gpt-5-nano")),
messages,
});
return result.toDataStreamResponse();
}

Documentation
- Getting Started - Step-by-step tutorial
- API Reference - Complete API documentation
- Advanced Usage - Custom configurations
- Memory Spaces - Multi-tenancy guide
- Hive Mode - Cross-application memory
- Migration from mem0 - Switching guide
FAQ
Q: Does this work with other AI SDK providers (Anthropic, Google, etc.)? A: Yes! Wrap any Vercel AI SDK provider:
import { anthropic } from "@ai-sdk/anthropic";
import { google } from "@ai-sdk/google";
const model1 = cortexMemory(anthropic("claude-3-opus"));
const model2 = cortexMemory(google("gemini-pro"));

Q: Can I use this in Edge Functions? A: Yes! Cortex is fully edge-compatible:
// app/api/chat/route.ts
export const runtime = "edge";
export async function POST(req: Request) {
  // cortexMemory is created as in the Quick Start example
  const { messages } = await req.json();
const result = await streamText({
model: cortexMemory(openai("gpt-5-nano")),
messages,
});
return result.toDataStreamResponse();
}

Q: Do I need to manually buffer streams? A: No! Cortex v0.9.0+ handles streaming automatically:
// Cortex buffers the stream internally and stores after completion
const result = await streamText({
model: cortexMemory(openai("gpt-5-nano")),
messages,
});
// No manual buffering needed

Q: How much does it cost? A: Cortex uses Convex for storage:
- Free tier: 1GB storage, perfect for development
- Pro: $25/month for production apps
- No per-request fees - Unlike mem0, you only pay for storage
Q: Can I disable automatic memory for specific requests? A: Yes! Configure per instance:
const noMemory = createCortexMemory({
convexUrl: process.env.CONVEX_URL!,
memorySpaceId: "temp-space",
userId: "user-123",
enableMemorySearch: false,
enableMemoryStorage: false,
});

Q: How do I handle multiple users? A: Use dynamic user resolution:
const cortexMemory = createCortexMemory({
convexUrl: process.env.CONVEX_URL!,
memorySpaceId: "multi-user-chat",
userId: () => req.user.id, // Resolved per request
});

Q: Can I use this with LangChain? A: Not directly (LangChain has different interfaces), but Cortex SDK works standalone:
import { Cortex } from "@cortexmemory/sdk";
const cortex = new Cortex({ convexUrl: process.env.CONVEX_URL! });
// Search memories
const memories = await cortex.memory.search("user preferences");
// Store LangChain conversations
await cortex.memory.remember({
memorySpaceId: "langchain-agent",
conversationId: "conv-123",
userMessage: input,
agentResponse: output,
userId: "user-123",
userName: "User",
});

Troubleshooting
"Failed to connect to Convex"
Make sure:
- Convex is running: npx convex dev
- CONVEX_URL is set correctly
- Cortex backend is deployed to Convex
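A quick guard (a sketch, not part of the provider API) that fails fast when the variable is missing:

// Validate CONVEX_URL before constructing the provider.
const convexUrl = process.env.CONVEX_URL;
if (!convexUrl || !convexUrl.startsWith("https://")) {
  throw new Error("CONVEX_URL is missing or malformed - check your .env.local");
}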
"Memory search returns no results"
This is expected if:
- No prior conversations stored
- Using keyword search without embeddings (set up embeddingProvider)
- Running on local Convex (vector search not supported locally)
"Type errors with LanguageModelV1"
Make sure you're using compatible versions:
- ai: ^3.0.0
- @cortexmemory/sdk: ^0.9.0
- @cortexmemory/vercel-ai-provider: ^0.2.0
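A matching package.json dependencies block might look like this (versions taken from the constraints above):

"dependencies": {
  "ai": "^3.0.0",
  "@cortexmemory/sdk": "^0.9.0",
  "@cortexmemory/vercel-ai-provider": "^0.2.0"
}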
For more troubleshooting help, see Troubleshooting Guide.
Contributing
We welcome contributions! Please see CONTRIBUTING.md.
License
Apache 2.0 - See LICENSE.md
Complete Documentation
- Cortex Documentation - Full Cortex documentation
- Vercel AI SDK Integration - All integration docs
Built with ❤️ by the Cortex team
