prism-mcp-server
v15.2.1
Prism Coder — Cognitive memory + tool-calling intelligence for AI agents. Mind Palace persistent memory (BFCL Gold Certified, 100% Tool-Call Accuracy, 54 Agent Skills, Zero-Search HDC/HRR retrieval, HIPAA-hardened local-first storage, SLERP-optimized GRPO).
🧠 Prism Coder
🌐 Read in your language: 🇬🇧 English · 🇪🇸 Español · 🇫🇷 Français · 🇵🇹 Português · 🇷🇴 Română · 🇺🇦 Українська · 🇷🇺 Русский · 🇩🇪 Deutsch · 🇯🇵 日本語 · 🇰🇷 한국어 · 🇨🇳 中文 · 🇸🇦 العربية
Persistent memory + tool-calling intelligence for AI agents. (formerly Prism MCP)
A Model Context Protocol server that gives Claude, Cursor, and other AI tools a Mind Palace — long-term memory that survives across sessions, with semantic search, cognitive routing, a visual dashboard, and the open-weights `prism-coder:7b` / `prism-coder:14b` LLM fleet for offline tool-calling (BFCL Gold Certified, 100% JSON validity).
Renamed in v14.0.0: the project is now Prism Coder to cover both the Mind Palace memory server and the `prism-coder:7b` / `prism-coder:14b` LLM fleet on HuggingFace + Ollama. The npm package stays `prism-mcp-server`, so existing install URLs and `mcp.json` entries keep working — the `prism-coder` binary has been the canonical entry point since v12.
What Prism Coder does
💾 Your AI remembers across sessions
Every conversation feeds the Mind Palace. Next session, your AI agent loads the right context automatically — no re-explaining.
🔍 Semantic search over your history
Ask "what did I decide about the auth flow last month?" and get the answer with citations. Vector search + keyword + graph traversal.
🧬 Cognitive routing
Different memory types live in different stores: episodic (what happened), semantic (what's true), procedural (how to do X). The router picks where to store and where to retrieve.
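A minimal sketch of what type-based routing looks like, assuming one simple in-memory store per memory kind. The store shapes and lookup logic here are illustrative, not Prism's actual router:

```typescript
// Illustrative memory router: each memory kind maps to its own store,
// and retrieval fans out the same way. Not Prism's real implementation.
type MemoryKind = "episodic" | "semantic" | "procedural";

interface MemoryRecord {
  kind: MemoryKind;
  text: string;
}

const stores: Record<MemoryKind, MemoryRecord[]> = {
  episodic: [],   // what happened (events, sessions)
  semantic: [],   // what's true (facts, decisions)
  procedural: [], // how to do X (workflows, conventions)
};

function routeStore(record: MemoryRecord): void {
  stores[record.kind].push(record);
}

function routeRetrieve(kind: MemoryKind, query: string): MemoryRecord[] {
  const q = query.toLowerCase();
  return stores[kind].filter(r => r.text.toLowerCase().includes(q));
}
```

In the real server, each store would be backed by SQLite and queried semantically; the point here is only that write and read paths are both keyed by memory type.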
🔄 Proactive session drift detection (new in v15)
Your AI agent can now detect when it has drifted from your original goals — mid-session, automatically — and self-correct before you notice the problem.
Three direct Prism calls:
- `session_save_ledger` — snapshot current state
- `session_cognitive_route` — compare current work against original goals, returns `on_track` / `minor_drift` / `major_drift`
- `session_compact_ledger` — if drifted, compress and reload only what matters
When major drift is detected, the alert routes to the Synalux portal so it's visible across sessions and devices — not just in the current conversation.
Real example it caught: A training session promised BFCL ≥90% for three AI models. The agent spent 3 hours debugging audio bugs instead. The drift check surfaced: "Training goal unmet. Layer3 corpus missing from all training sets. 0 BFCL scores measured." The session immediately re-aligned.
No scripts. No cron. No hooks. Three tool calls, Prism handles the rest.
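The verdict step can be sketched as follows. This is an illustrative stand-in for what `session_cognitive_route` returns, using a naive keyword-overlap heuristic with made-up thresholds; Prism's actual scoring is not shown here:

```typescript
// Hypothetical drift classifier: how much of the current work's vocabulary
// still overlaps the session's original goals? Thresholds are illustrative.
type DriftVerdict = "on_track" | "minor_drift" | "major_drift";

function classifyDrift(originalGoals: string[], currentWork: string[]): DriftVerdict {
  const goalTerms = new Set(
    originalGoals.flatMap(g => g.toLowerCase().split(/\W+/)).filter(Boolean),
  );
  const workTerms = currentWork.flatMap(w => w.toLowerCase().split(/\W+/)).filter(Boolean);
  const hits = workTerms.filter(t => goalTerms.has(t)).length;
  const overlap = workTerms.length > 0 ? hits / workTerms.length : 0;
  if (overlap >= 0.5) return "on_track";
  if (overlap >= 0.2) return "minor_drift";
  return "major_drift";
}
```

In the BFCL example above, goals about model training share almost no vocabulary with hours of audio-bug debugging, so a check like this flags `major_drift` and triggers the compact-and-reload step.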
🛡 Local-first
Free tier runs entirely on your machine — SQLite, local embedding model, no API keys, no cloud. Paid tier adds cloud sync via Synalux portal.
⚡ Zero-search retrieval
Holographic Reduced Representations (HRR) for instant similarity lookups without an index. ~5ms over 100K memories.
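For readers unfamiliar with HRR, the core trick is binding key/value vectors with circular convolution and decoding with circular correlation, so lookup is a single vector operation rather than an index scan. A minimal sketch of those primitives (illustrative only; Prism's HRR code is not reproduced here):

```typescript
// Circular convolution: the HRR "bind" operation.
function circularConvolve(a: number[], b: number[]): number[] {
  const n = a.length;
  const out = new Array<number>(n).fill(0);
  for (let i = 0; i < n; i++)
    for (let j = 0; j < n; j++)
      out[(i + j) % n] += a[i] * b[j];
  return out;
}

// Circular correlation: the approximate inverse ("unbind") operation.
function circularCorrelate(a: number[], b: number[]): number[] {
  const n = a.length;
  const out = new Array<number>(n).fill(0);
  for (let i = 0; i < n; i++)
    for (let j = 0; j < n; j++)
      out[(j - i + n) % n] += a[i] * b[j];
  return out;
}

// Cosine similarity, used to compare a decoded vector against candidates.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / Math.sqrt(na * nb);
}
```

With random high-dimensional keys, `circularCorrelate(key, circularConvolve(key, value))` recovers a noisy copy of `value` whose cosine similarity to the original is high; retrieval then reduces to one unbind plus similarity checks, which is where the "zero-search" latency comes from.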
🌐 Multi-agent Hivemind
Multiple AI agents share the same Mind Palace. Each agent has a role (dev / qa / pm / etc.) and sees scoped context. Heartbeat + roster for coordination.
Get started
```bash
# Install globally
npm install -g prism-mcp-server

# Or use npx (no install)
npx prism-mcp-server
```

Add to Claude Desktop / Cursor config:

```json
{
  "mcpServers": {
    "prism": {
      "command": "npx",
      "args": ["-y", "prism-mcp-server"]
    }
  }
}
```

That's it. Open Claude / Cursor and your AI now has memory.
More setup details in docs/SETUP_GEMINI.md.
How AI agents use it
| Tool | What it does |
|---|---|
| session_load_context | Recover prior session's state on boot |
| session_save_ledger | Append immutable session log entry |
| session_save_handoff | Save live state for the next session |
| knowledge_search | Semantic + keyword search over all memories |
| query_memory_natural | Natural-language Q&A over your Mind Palace |
| extract_entities | Pull people / projects / decisions from text |
| session_synthesize_edges | Auto-link related memories into a graph |
(35+ tools total — full TypeScript signatures in src/tools/. Architecture overview in docs/ARCHITECTURE.md.)
The LLM context window is treated as ephemeral scratch space. All durable state lives in Prism's persistent store (SQLite / Supabase). Context compaction is a non-event.
Boot protocol — every session (including post-compaction) begins with a mandatory session_load_context call, enforced via CLAUDE.md. The agent is fully oriented before writing a single byte of response.
Two persistent stores:
- `session_save_ledger` — immutable append-only work log (decisions, files changed, summaries)
- `session_save_handoff` — versioned live-state snapshot (current task, TODOs, open context)
Ledger compaction (session_compact_ledger) — when a project exceeds a threshold (default: 50 entries), Prism summarizes old entries via LLM into a rollup row, soft-archives originals, and links them via spawned_from graph edges. Runs on a 12-hour background scheduler.
→ Full details: docs/COMPACTION.md
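The compaction flow can be sketched roughly like this. It is illustrative only: `summarize` stands in for the LLM rollup call, and the direction of the `spawnedFrom` linkage is an assumption:

```typescript
// Sketch of threshold-triggered ledger compaction. When the ledger exceeds
// the threshold, older entries are summarized into one rollup row, the
// originals are soft-archived, and each is linked to the rollup.
interface LedgerEntry {
  id: number;
  text: string;
  archived?: boolean;
  spawnedFrom?: number; // assumed linkage direction: original -> rollup
}

const COMPACTION_THRESHOLD = 50; // default from the README

function compactLedger(
  entries: LedgerEntry[],
  summarize: (texts: string[]) => string, // stand-in for the LLM call
): LedgerEntry[] {
  if (entries.length <= COMPACTION_THRESHOLD) return entries;

  const old = entries.slice(0, entries.length - COMPACTION_THRESHOLD);
  const rollup: LedgerEntry = {
    id: Math.max(...entries.map(e => e.id)) + 1,
    text: summarize(old.map(e => e.text)),
  };

  // Soft-archive the originals and link them to the rollup row.
  for (const e of old) {
    e.archived = true;
    e.spawnedFrom = rollup.id;
  }
  return [rollup, ...entries.slice(-COMPACTION_THRESHOLD)];
}
```

In the real server this runs on the 12-hour background scheduler rather than inline, and archived rows stay queryable through the graph edges.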
Models
All Prism Coder inference uses only fine-tuned Prism Coder models — no Claude, no Gemini, no OpenRouter fallbacks. Models are exclusively accessible through the Synalux router (authentication + subscription required).
| Model | Where | Tier | Latency |
|---|---|---|---|
| Qwen3-1.7B (fine-tuned) | On-device — iOS CoreML / Android ONNX | Free | ~50ms offline |
| Qwen3-14B (fine-tuned) | RunPod A100 via Synalux | Standard+ | ~200ms |
| QwQ-32B (fine-tuned) | RunPod A100 80GB via Synalux | Pro/Enterprise | ~3–5s |
| Qwen3-30B-A3B (fine-tuned MoE) | RunPod via Synalux | Enterprise | ~2–3s |
Fine-tuned on the 3-layer corpus: AAC + BFCL tool-calling + clinical workflows. BFCL gate: ≥ 90% on all tiers before production promotion. Adapters stored at dcostenco/prism-coder-* (private HuggingFace).
Plans
| | Free | Standard $19/mo | Pro $49/mo | Enterprise $99/mo |
|---|---|---|---|---|
| Qwen3-1.7B on-device | ✅ unlimited | ✅ | ✅ | ✅ |
| Qwen3-14B cloud | — | ✅ 200 req/day | ✅ 2K req/day | ✅ unlimited |
| QwQ-32B reasoning | — | — | ✅ | ✅ priority |
| Qwen3-30B-A3B MoE | — | — | — | ✅ |
| Custom fine-tuning | — | — | — | ✅ |
| HIPAA BAA | — | — | — | ✅ |
What you can build with it
- Persistent coding assistant that remembers your codebase, your decisions, your team's conventions
- Research agent that builds knowledge over time — Auto-Scholar pipeline ingests papers / docs and synthesizes
- Clinical scribe that retains patient context across visits (HIPAA-compliant cloud + local)
- Customer support agent that learns from every ticket
- Writing assistant that knows your voice, your prior drafts, and what you've already published
Companions
Synalux — VS Code Extension
Memory-augmented AI inside VS Code, backed by Prism. 20 multimodal tools, multi-agent orchestration, 12-language support. Works offline (Ollama) or cloud (OpenRouter). HIPAA-compliant healthcare workflows.
```bash
# Install from terminal
code --install-extension synalux-ai.synalux
```

Or open VS Code → Extensions (⇧⌘X) → search "Synalux" → Install.
PrismAAC
AAC communication app for non-speaking users. Powered by Prism's spreading-activation phrase ranking + on-device 7B model. macOS / iOS / Android via web. → github.com/dcostenco/prism-aac
🆕 Prism as Foundation (v14.0.0)
As of v14.0.0, Prism's algorithm exports are a stable public contract under SemVer. External systems can port actrActivation.ts (ACT-R cognitive decay), spreadingActivation.ts (the 0.7 similarity + 0.3 activation hybrid score), routerExperience.ts (experience bias with MIN_SAMPLES=5 cold-start gate), compactionHandler.ts (the 25KB prompt-budget cap), and graphMetrics.ts (warning ratios) with citations and pin a Prism version.
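The hybrid score's weights are part of that documented contract. As a sketch, assuming both inputs are pre-normalized to the range [0, 1] (an assumption; only the constants come from the README):

```typescript
// The 0.7 similarity + 0.3 activation hybrid score documented for
// spreadingActivation.ts. Inputs are assumed pre-normalized to [0, 1].
const SIMILARITY_WEIGHT = 0.7;
const ACTIVATION_WEIGHT = 0.3;

function hybridScore(similarity: number, activation: number): number {
  return SIMILARITY_WEIGHT * similarity + ACTIVATION_WEIGHT * activation;
}
```

Consumers that port the algorithm are expected to pin these constants against a Prism version, since a weight change is a breaking change under the SemVer contract.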
Reference consumers
| Consumer | What it uses from Prism |
|---|---|
| Audit hooks framework | ACT-R decay (d=0.25 lesson rate), spreading activation hybrid score (0.7/0.3), experience bias (MIN_SAMPLES=5, MAX_BIAS_CAP=0.15), graph-metrics warning ratios (0.20 / 0.30 / 0.40), compaction's 25KB prompt-budget. 327 tests pin every constant — CI catches divergence automatically. |
| PrismAAC | Spreading-activation phrase ranking (recency × frequency × per-user history). Caregiver corrections auto-harvest into the personalization corpus via the audit-hooks postflight harvester. The on-device 7B model + this algorithm stack is what makes PrismAAC defensible. |
| Synalux portal | Tier-aware model routing using experience bias on prior outcomes per fingerprint. HIPAA-compliant clinical scribe with on-device-first privacy guarantees. |
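The ACT-R decay these consumers pin follows the standard base-level learning equation, A = ln(Σⱼ tⱼ⁻ᵈ), where tⱼ is the age of the j-th access. A sketch using the d = 0.25 lesson rate cited above (treating ages as hours is an assumed unit, not from the README):

```typescript
// Standard ACT-R base-level activation: recent and frequent accesses
// raise activation; each access's contribution decays as age^-d.
function actrActivation(agesInHours: number[], d = 0.25): number {
  const sum = agesInHours.reduce((acc, t) => acc + Math.pow(t, -d), 0);
  return Math.log(sum);
}
```

A memory touched an hour ago outranks one touched last week, and adding a second access always raises activation, which is the behavior the 327 constant-pinning tests would guard.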
Synalux Inference Router — Architecture (v16)
All Prism AAC model inference is protected behind Synalux as a mandatory router. Models are never accessible directly — all traffic goes through Synalux for auth, billing, and rate limiting.
```
┌─────────────────────────────────────────────────────────────┐
│                        CLIENT LAYER                         │
│          prism-aac (iOS/web)     │     Synalux Portal       │
└──────────────┬──────────────────────────────────────────────┘
               │  POST /api/v1/prism-aac/inference
               │  Authorization: Bearer <user-JWT>
               ▼
┌─────────────────────────────────────────────────────────────┐
│                       SYNALUX ROUTER                        │
│  1. Verify JWT (no anonymous access)                        │
│  2. Check subscription tier                                 │
│  3. Enforce rate limit (50–2000 req/day by plan)            │
│  4. Route to model tier by complexity                       │
│  5. Proxy → RunPod with SECRET key (never sent to client)   │
│  6. Log → aac_inference_log (billing audit trail)           │
└──────────┬─────────────────────────────────────┬────────────┘
           │ tier=fast                           │ tier=reason
           ▼                                     ▼
  ┌──────────────────┐                   ┌───────────────────────┐
  │ Qwen3-14B        │                   │ QwQ-32B               │
  │ RunPod A100 40G  │                   │ RunPod A100 80G       │
  │ ~200ms           │                   │ ~3–5s (reasoning)     │
  │ standard/pro     │                   │ pro/enterprise only   │
  └─────────┬────────┘                   └──────────┬────────────┘
            │                                       │
            └──────────────────┬────────────────────┘
                               ▼
         HuggingFace dcostenco/prism-coder-* (private)
         RunPod pulls at pod start with server-side token
```

On-device (free, zero latency, offline): Qwen3-1.7B GGUF Q4_K_M → iOS CoreML / Android ONNX

| Plan | Cloud model | Daily limit | On-device |
|---|---|---|---|
| Free | — | unlimited local | Qwen3-1.7B |
| Standard $5/mo | Qwen3-14B | 200 req | + cloud |
| Pro $15/mo | QwQ-32B | 2,000 req | + reasoning |
| Enterprise | QwQ-32B priority | unlimited | full stack |
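The router's plan gate can be sketched as below; the return labels mirror the `tier=fast` / `tier=reason` branches, while the lookup shape and the `routeRequest` name are illustrative, with daily limits taken from the plan table:

```typescript
// Illustrative tier gate: plan determines the daily cloud quota and
// whether the reasoning tier is reachable. Free plans stay on-device.
type Plan = "free" | "standard" | "pro" | "enterprise";
type Route = "on-device" | "fast" | "reason" | "rate_limited";

const DAILY_LIMITS: Record<Plan, number> = {
  free: 0, // no cloud quota; on-device use is unlimited
  standard: 200,
  pro: 2000,
  enterprise: Infinity,
};

function routeRequest(plan: Plan, usedToday: number, needsReasoning: boolean): Route {
  if (plan === "free") return "on-device";
  if (usedToday >= DAILY_LIMITS[plan]) return "rate_limited";
  if (needsReasoning && (plan === "pro" || plan === "enterprise")) return "reason";
  return "fast"; // standard plans fall back to the fast tier
}
```

The real router also verifies the JWT, logs to the billing audit trail, and proxies to RunPod with the server-side secret, as the diagram above shows.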
See docs/WOW_FEATURES.md for the algorithm catalogue. Release notes in docs/releases/v14.0.0-prism-as-foundation.md.
Detailed docs in this repo:
- `docs/ARCHITECTURE.md` — system architecture, memory routing, HRR
- `docs/COMPACTION.md` — how Prism handles LLM context compaction and ledger compaction
- `docs/SETUP_GEMINI.md` — Gemini configuration
- `docs/self-improving-agent.md` — adversarial eval / anti-sycophancy
- `docs/rfcs/` — design RFCs
- `docs/releases/` — per-version release notes
- `CHANGELOG.md` — version history (v12.5 Unified Billing, v11.6 Hivemind, v11.5.1 Auto-Scholar, etc.)
- `CONTRIBUTING.md` — contributor guide
The original 1933-line README is preserved in git history. To browse the prior version (full feature catalog, Cognitive Architecture v7.8, Autonomous Cognitive OS v9.0, HRR Zero-Search, Adversarial Evaluation walkthroughs, Universal Import patterns, competitive analysis vs LangMem/MemGPT/Letta/Zep, v12.5 Unified Billing details, v11.6 Hivemind, v11.5.1 Auto-Scholar): git show HEAD~1:README.md.
License
BUSL-1.1 — Business Source License. Free for non-production use. Production use requires a Synalux subscription or commercial license. After 2 years, converts to MIT.
