# @mzhub/cortex

v0.1.3
Persistent memory for AI agents - A tiered memory system with fact extraction and conflict resolution
## The Problem
AI agents forget.
Not sometimes. Always.
Every conversation starts from zero. Every user has to re-explain themselves. Every preference is lost the moment the session ends.
```text
Monday   User: "I'm allergic to peanuts"
         Bot:  "Noted!"

Friday   User: "What snack should I get?"
         Bot:  "Try our peanut butter cups!"
```

This is the default behavior of every LLM. They have no memory. Only context windows that reset.
## Why Current Memory Systems Fail
The common solution is a vector database. Store everything as embeddings. Retrieve by similarity.
This fails silently when facts change.
```text
March  User: "I work at Google"
       → Stored as embedding ✓

June   User: "I just joined Microsoft"
       → Also stored as embedding ✓

July   User: "Where do I work?"
       → Vector search returns BOTH
       → LLM sees contradictory information
       → Hallucinates or hedges
```

The core issue:
| What vectors do    | What memory requires   |
| ------------------ | ---------------------- |
| Find similar text  | Track current truth    |
| Retrieve matches   | Replace outdated facts |
| Rank by similarity | Resolve contradictions |
Vector databases answer: "What text matches this query?"
They cannot answer: "What is true about this user right now?"
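The difference can be sketched in a few lines of TypeScript. This is an illustrative toy, not the cortex API: an append-only store keeps every version of a changed fact, while a keyed store overwrites the old value.

```typescript
// Toy comparison (not the cortex API): append-only storage vs. keyed facts.

// Append-only: both versions of the fact survive and both match a query.
const vectorStore: string[] = [];
vectorStore.push("User works at Google");
vectorStore.push("User works at Microsoft");
const matches = vectorStore.filter((t) => t.includes("works at"));
// matches holds BOTH statements, so the LLM sees contradictory "truths"

// Keyed facts: setting the same key replaces the old value.
const facts = new Map<string, string>();
facts.set("employer", "Google");
facts.set("employer", "Microsoft"); // overwrite, not append
// facts.get("employer") is the single current truth
```

Real vector retrieval ranks by embedding similarity rather than substring match, but the failure mode is the same: nothing in the store knows that one statement supersedes the other.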
## The Solution: Brain-Inspired Architecture
cortex doesn't just store facts. It thinks like a brain.
```text
┌─────────────────────────────────────────────────────────────┐
│                        User Message                         │
└──────────────────────────┬──────────────────────────────────┘
                           │
            ┌──────────────▼──────────────┐
            │        🧠 FAST BRAIN        │
            │         (Your LLM)          │
            │                             │
            │  • Reasoning                │
            │  • Conversation             │
            │  • Immediate responses      │
            └──────────────┬──────────────┘
                           │
            ┌──────────────▼──────────────┐
            │      Response to User       │ ◄── Returns immediately
            └──────────────┬──────────────┘
                           │
                           │ (async, non-blocking)
                           ▼
            ┌─────────────────────────────┐
            │        🔄 SLOW BRAIN        │
            │           (cortex)          │
            │                             │
            │  • Extract facts            │
            │  • Detect contradictions    │
            │  • Synthesize patterns      │
            │  • Consolidate memories     │
            └─────────────────────────────┘
```

### Built-In Brain Components
| Component               | Biological Equivalent  | What It Does                                            |
| ----------------------- | ---------------------- | ------------------------------------------------------- |
| Importance Scoring      | Amygdala               | Safety-critical facts (allergies) are never forgotten   |
| Episodic Memory         | Hippocampus            | Links facts to conversations ("when did I learn this?") |
| Hebbian Learning        | Neural Plasticity      | Frequently accessed facts get stronger                  |
| Deep Sleep              | Sleep Consolidation    | Synthesizes patterns across conversations               |
| Memory Stages           | Short/Long-term Memory | Facts progress from temporary → permanent               |
| Contradiction Detection | Prefrontal Cortex      | Flags conflicting information in real time              |
| Knowledge Graph         | Associative Cortex     | Links related facts together                            |
| Behavioral Prediction   | Pattern Recognition    | Detects user habits and preferences                     |
Learn about the brain architecture →
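The "Hebbian Learning" and "Deep Sleep" components can be pictured with a small sketch. The constants and field names here are hypothetical, not cortex internals: retrieval reinforces a memory trace, and a consolidation pass fades traces that were not used.

```typescript
// Hypothetical sketch of Hebbian strengthening and decay (not cortex internals).
interface MemoryTrace {
  value: string;
  strength: number;   // 0..1, determines how strongly the fact competes for context
  accessCount: number;
}

// Each retrieval reinforces the trace, capped at 1.0.
function access(trace: MemoryTrace): MemoryTrace {
  return {
    ...trace,
    accessCount: trace.accessCount + 1,
    strength: Math.min(1, trace.strength + 0.1),
  };
}

// Consolidation ("deep sleep") fades traces that were not reinforced.
function decay(trace: MemoryTrace): MemoryTrace {
  return { ...trace, strength: trace.strength * 0.9 };
}

let diet: MemoryTrace = { value: "vegan", strength: 0.5, accessCount: 0 };
diet = decay(access(diet)); // accessed once, then one sleep cycle
```

The effect is that frequently referenced facts stay near the top of the context budget while stale ones gradually drop out instead of accumulating forever.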
## Quick Start

### Install

```bash
npm install @mzhub/cortex
```

### Use
```js
import { MemoryOS, JSONFileAdapter } from "@mzhub/cortex";

const memory = new MemoryOS({
  llm: { provider: "openai", apiKey: process.env.OPENAI_API_KEY },
  adapter: new JSONFileAdapter({ path: "./.cortex" }),
});

async function chat(userId, message) {
  // 1. Ask: "What do I know about this user?"
  const context = await memory.hydrate(userId, message);

  // 2. Include it in your LLM call
  const response = await yourLLM({
    system: context.compiledPrompt,
    user: message,
  });

  // 3. Learn from this conversation (non-blocking)
  memory.digest(userId, message, response);

  return response;
}
```

That's it. The agent now remembers.
## Optional: Hierarchical Memory (HMM)
For advanced use cases, enable the Memory Pyramid — compressing thousands of facts into wisdom.
```js
import { HierarchicalMemory } from "@mzhub/cortex";

const hmm = new HierarchicalMemory(adapter, provider, { enabled: true });

// Top-down retrieval: wisdom first, details only if needed
const { coreBeliefs, patterns, facts } = await hmm.hydrateHierarchical(userId);

// Compress facts into patterns ("User is health-conscious")
await hmm.synthesizePatterns(userId);
```

The Memory Pyramid:
```text
Level 4: Core Beliefs (BIOS)
────────────────────────────
• Allergies, identity, safety rules
• ALWAYS loaded, never forgotten

Level 3: Patterns (Wisdom)
────────────────────────────
• "User is health-conscious"
• Synthesized from many facts
• 1 token instead of 50

Level 2: Facts (Knowledge)
────────────────────────────
• "User ate salad on Tuesday"
• Standard discrete facts

Level 1: Raw Logs (Stream)
────────────────────────────
• Ephemeral conversation buffer
• Auto-flushed after extraction
```

## Before and After
### Without cortex
```text
User: "Recommend a restaurant"
Bot:  "What kind of food do you like?"
User: "I told you last week, I'm vegan"
Bot:  "Sorry, I don't have memory of previous conversations"
```

- Token-heavy prompts (full history)
- Repeated clarifications
- Inconsistent behavior
- User frustration
### With cortex
```text
User: "Recommend a restaurant"
Bot:  "Here are some vegan spots near Berlin..."
```

- Preferences remembered
- Facts updated when they change
- Critical info never forgotten
- Predictable behavior
## What Gets Stored
cortex stores facts, not chat logs.
```text
┌─────────────────────────────────────────────────────────────┐
│ User: [email protected]                                     │
├───────────────┬─────────────────────────────────────────────┤
│ name          │ John (importance: 5)                        │
│ diet          │ vegan (importance: 7)                       │
│ location      │ Berlin (importance: 5)                      │
│ allergies     │ peanuts (importance: 10)                    │
│ PATTERN       │ health-conscious (importance: 7)            │
├───────────────┴─────────────────────────────────────────────┤
│ Memory Stage: long-term │ Access Count: 47 │ Sentiment: +   │
└─────────────────────────────────────────────────────────────┘
```

When facts change, they are replaced, not appended. Critical facts (importance ≥ 9) are always included in context.
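The inclusion rule can be sketched as follows. The threshold and data shapes are assumptions for illustration, not cortex internals: facts at importance ≥ 9 bypass the context budget entirely, and the rest compete on score.

```typescript
// Hypothetical sketch of importance-gated context selection (not the cortex API).
interface StoredFact {
  key: string;
  value: string;
  importance: number; // 1-10
}

function selectForContext(facts: StoredFact[], budget: number): StoredFact[] {
  // Critical facts are always included, regardless of budget.
  const critical = facts.filter((f) => f.importance >= 9);
  // Remaining facts compete for whatever budget is left.
  const rest = facts
    .filter((f) => f.importance < 9)
    .sort((a, b) => b.importance - a.importance)
    .slice(0, Math.max(0, budget - critical.length));
  return [...critical, ...rest];
}

const stored: StoredFact[] = [
  { key: "allergies", value: "peanuts", importance: 10 },
  { key: "diet", value: "vegan", importance: 7 },
  { key: "name", value: "John", importance: 5 },
  { key: "location", value: "Berlin", importance: 5 },
];
const ctx = selectForContext(stored, 2);
// "allergies" survives even with a budget of 2
```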
## Safety and Cost Considerations

### Security
| Risk                        | Mitigation                            |
| --------------------------- | ------------------------------------- |
| Prompt injection via memory | Content scanning, XML safety wrapping |
| PII storage                 | Detection and optional redaction      |
| Cross-user leakage          | Strict user ID isolation              |
| Forgetting critical info    | Importance scoring (amygdala pattern) |
Built-in Protections:
```js
// Prompt injection is mitigated automatically.
// Memory content is XML-escaped and wrapped with safety instructions.
const context = await memory.hydrate(userId, message);
// context.compiledPrompt contains:
// <memory_context type="data" trusted="false">
//   [escaped content - injection patterns are neutered]
// </memory_context>

// PII detection warns in debug mode
const memory = new MemoryOS({
  llm: { provider: "openai", apiKey: "..." },
  options: { debug: true }, // Enables PII warnings
});

// Path traversal attacks are blocked:
// userId "../../../etc/passwd" becomes safe "______etc_passwd"
```

### Cost Control
| Risk                     | Mitigation                                |
| ------------------------ | ----------------------------------------- |
| Runaway extraction costs | Daily token/call budgets                  |
| Token bloat from memory  | Hierarchical retrieval (patterns > facts) |
| Stale data accumulation  | Memory consolidation + automatic decay    |
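The first row of the table amounts to a simple guard. The field names and limits below are assumptions for illustration, not the library's internals: before each extraction, check the user's running daily totals against the configured caps.

```typescript
// Hypothetical sketch of a per-user daily budget check (not cortex internals).
interface DailyUsage {
  tokens: number;
  extractions: number;
}

function allowExtraction(
  usage: DailyUsage,
  estimatedTokens: number,
  limits = { maxTokens: 100_000, maxExtractions: 100 }
): boolean {
  return (
    usage.tokens + estimatedTokens <= limits.maxTokens &&
    usage.extractions + 1 <= limits.maxExtractions
  );
}

const today: DailyUsage = { tokens: 99_500, extractions: 10 };
// A 400-token extraction still fits under the cap; a 600-token one does not.
```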
```js
// Built-in budget limits
const budget = new BudgetManager({
  maxTokensPerUserPerDay: 100000,
  maxExtractionsPerUserPerDay: 100,
});
```

### Reliability
Provider Resilience:
```js
// All LLM providers include automatic:
// - 30 second timeout (configurable)
// - 3 retry attempts with exponential backoff
// - Retry on 429, 500, 502, 503, 504 status codes
const memory = new MemoryOS({
  llm: {
    provider: "openai",
    apiKey: process.env.OPENAI_API_KEY,
    // Optional: customize retry behavior
    retry: {
      timeoutMs: 60000, // 60 second timeout
      maxRetries: 5, // 5 attempts
      retryDelayMs: 2000, // Start with 2s delay
    },
  },
});
```

Configuration Validation:
```js
// Invalid config is caught immediately, not at runtime
new MemoryOS({
  llm: { provider: "fake", apiKey: "" },
});
// Throws: "MemoryOS: config.llm.provider 'fake' is not supported.
//          Valid providers: openai, anthropic, gemini, groq, cerebras."

new MemoryOS({
  llm: { provider: "openai", apiKey: "" },
});
// Throws: "MemoryOS: config.llm.apiKey is required.
//          Get your API key from your LLM provider..."
```

PostgreSQL Race Condition Protection:
```js
// A unique constraint prevents duplicate facts from concurrent digest() calls.
// It is created automatically on PostgresAdapter initialization.
```

## Who This Is For
**Good fit:**
- AI agents with recurring users
- Support bots that need context
- Personal assistants
- Workflow automation (n8n, Zapier)
- Any system where users expect to be remembered
**Not a fit:**
- One-time chat interactions
- Document search / RAG
- Stateless demos
- Replacing vector databases entirely
cortex complements vectors. It does not replace them.
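That division of labor can be sketched as follows. Every name here is hypothetical, not the cortex API: a keyword filter stands in for vector retrieval over documents, a keyed map stands in for current user facts, and the system prompt combines both.

```typescript
// Hypothetical sketch: retrieval (documents) plus memory (facts) in one prompt.

// Stand-in for a real vector similarity search over a document corpus.
function searchDocs(query: string, docs: string[]): string[] {
  const terms = query.toLowerCase().split(/\s+/);
  return docs.filter((d) => terms.some((t) => d.toLowerCase().includes(t)));
}

const docs = ["Vegan restaurants in Berlin", "Steakhouse guide", "Tax law FAQ"];

// Current user truth, keyed so updates replace old values.
const userFacts = new Map<string, string>([
  ["diet", "vegan"],
  ["location", "Berlin"],
]);

const factLine = [...userFacts].map(([k, v]) => `${k}=${v}`).join(", ");
const docLine = searchDocs("vegan restaurant berlin", docs).join("; ");
const systemPrompt = `User facts: ${factLine}\nRelevant docs: ${docLine}`;
```

Retrieval answers "what text is relevant to this query"; the fact store answers "what is true about this user right now". Neither component can do the other's job.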
## Documentation
- Why Vector Databases Fail
- Brain Architecture
- Hierarchical Memory (HMM)
- Cost Guide
- API Reference
- Storage Adapters
- Security
## Philosophy
- Memory should be explicit, not inferred from similarity
- Facts should be overwriteable, not append-only
- Critical information should never be forgotten
- Agents should think like brains, not databases
- Infrastructure should be boring and reliable
## Changelog

### v0.1.2
- Security: XML escaping in prompt safety wrapper prevents injection via `</memory_context>`
- Security: PII detection warnings in debug mode
- Reliability: Runtime config validation with helpful error messages
- Reliability: Provider timeout (30s) and retry (3x with exponential backoff)
- Reliability: Unique constraint on PostgreSQL prevents duplicate facts from race conditions
- Data Integrity: Importance scores clamped to valid 1-10 range
- Data Integrity: Sentiment validation on extracted operations
## License
MIT — Built by MZ Hub
