suma-mcp-proxy
v1.4.1
SUMA Memory — stop re-explaining your project to Claude. Persistent K-WIL knowledge graph for Claude Code, Cursor, and any MCP client.
SUMA MCP Proxy
Persistent AI memory for Claude Code, Cursor, and any MCP-compatible IDE
Your AI forgets everything when a session ends. SUMA gives it a permanent knowledge graph — so it remembers who you are, what you're building, and every decision you've made.
┌─────────────────────────────────────────────────────────────────────┐
│ Your IDE (Claude Code / Cursor / Windsurf) │
│ ┌───────────────┐ stdio ┌────────────────────────────┐ │
│ │ AI Assistant │◄──────────────►│ suma-mcp-proxy (local) │ │
│ └───────────────┘ (stable) └─────────────┬──────────────┘ │
└─────────────────────────────────────────────────┼───────────────────┘
│ HTTPS
▼
sumapro.quadframe.work
┌────────────────────────────────────┐
│ K-WIL Gravity Engine │
│ PostgreSQL + pgvector │
│ Gemini embeddings │
└────────────────────────────────────┘
Quick Start
1. Get your API key
Sign up free at sumapro.quadframe.work
2. Verify it works (optional but recommended)
npx suma-mcp-proxy@latest --key=sk_live_YOUR_KEY
You should see:
SUMA MCP Proxy v1.3.5 started
Org ID: your_org | Tier: free
Ready.
Press Ctrl+C to stop. If this works, installation will work.
3. Add to your project
Create .mcp.json in your project root:
{
"mcpServers": {
"suma-memory": {
"command": "npx",
"args": ["suma-mcp-proxy", "--key=sk_live_YOUR_KEY"]
}
}
}
No global install needed. Claude Code reads .mcp.json on startup and runs the proxy automatically via npx: every session, zero maintenance.
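If you set this up in several projects, the file above can be generated instead of hand-edited. The following is a minimal sketch, not something shipped with the package: `buildMcpConfig` is a hypothetical helper, and the `sk_live_` prefix check is an assumption based on the key format shown in this README.

```javascript
// Hypothetical helper (not part of suma-mcp-proxy): builds the same
// .mcp.json structure shown above for a given API key.
function buildMcpConfig(apiKey) {
  // Assumption: keys start with "sk_live_", as in the examples in this README.
  if (typeof apiKey !== "string" || !apiKey.startsWith("sk_live_")) {
    throw new Error("Expected an API key starting with sk_live_");
  }
  return {
    mcpServers: {
      "suma-memory": {
        command: "npx",
        args: ["suma-mcp-proxy", `--key=${apiKey}`],
      },
    },
  };
}

// Serialize with two-space indentation, ready to be saved as .mcp.json.
const mcpJson = JSON.stringify(buildMcpConfig("sk_live_YOUR_KEY"), null, 2);
console.log(mcpJson);
```

Saving the printed JSON as .mcp.json in the project root gives the same configuration as step 3.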
4. Restart Claude Code or Cursor
SUMA starts learning automatically from your first session.
Available Tools
| Tool | Description |
|------|-------------|
| suma_ping | Verify connection — call once at session start |
| suma_ingest | Store knowledge in the graph (auto entity extraction + embedding) |
| suma_search | Search with K-WIL Gravity Well Algorithm — returns ranked nodes + synthesized answer |
| suma_talk | Bidirectional — search AND learn in one call |
| suma_correct | Fix an incorrect node (soft delete, preserves audit trail) |
| suma_stats | Graph statistics + K-WIL token economics — the ROI receipt (node count, compression ratio, tokens saved) |
| suma_clean | Wipe all data for your org (requires confirmation) |
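
MCP clients invoke tools through a JSON-RPC 2.0 `tools/call` request, so your assistant normally builds these messages for you. As a rough illustration of what travels over stdio, the sketch below builds such a request for one of the tools in the table. `buildToolCall` is a hypothetical helper, and the argument names follow the examples in this README rather than a published schema.

```javascript
// Hypothetical helper: wraps a tool name and its arguments in the JSON-RPC 2.0
// envelope that MCP defines for tool invocation ("tools/call").
function buildToolCall(id, toolName, args = {}) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Example: a search request. The "query" and "limit" fields mirror the
// suma_search example in this README.
const request = buildToolCall(1, "suma_search", {
  query: "why did we choose PostgreSQL over MongoDB",
  limit: 5,
});

// The stdio transport carries one JSON message per line.
console.log(JSON.stringify(request));
```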
Tool Examples
suma_ingest
{
text: "Decided to use cosine² instead of (1+cosine) in K-WIL — 20x stronger signal separation",
sphere: "architecture" // optional — auto-classified if omitted
}
// Returns: { status: "ok", node_id: "ARCHITECTURE_abc123", compression: "94%" }
suma_search
{
query: "why did we choose PostgreSQL over MongoDB",
limit: 5
}
// Returns: { answer: "...", results: [...nodes...], entities: [...], token_economics: {...} }
suma_talk
{
message: "We just decided to use PostgreSQL with pgvector instead of Pinecone"
}
// Returns: { answer: "...", nodes_learned: 2 }
// Searches graph for context AND ingests the new decision in one call.
suma_stats
{} // no arguments
// Returns: { node_count: 521, compression_ratio: "97.6%", tokens_saved_lifetime: 4263729,
// tier: "enterprise", spheres: { architecture: 120, work: 62, ... } }
// Show this to the user as the ROI receipt.
suma_correct
{
node_id: "FAMILY_783e4d5623e0",
reason: "Chinni is wife's nickname, not mother",
replacement_text: "Chinni is Suman's nickname for his wife Madhuri"
}
How It Works
Why a local proxy?
MCP clients usually talk to servers over stdio, a transport designed for local processes, not cloud APIs. Direct cloud connections drop when:
- Service scales to zero between calls
- New version deploys mid-session
- Network hiccup breaks the stdio pipe
The proxy runs locally, maintaining a stable stdio connection to your IDE while making stateless HTTPS calls to the cloud. Your IDE never knows the difference.
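The pattern can be illustrated with a small sketch. This is not the proxy's actual source: `createForwarder` and the injected `send` function are hypothetical, standing in for whatever HTTPS client the proxy uses. It shows the property described above: each incoming line becomes one independent upstream call, and an upstream failure turns into a JSON-RPC error reply instead of a broken session.

```javascript
// Hypothetical sketch of the forwarding pattern, not the real implementation.
// `send` is any async function that delivers one JSON-RPC message upstream
// (for example over HTTPS) and resolves with the parsed response.
function createForwarder(send) {
  return async function handleLine(line) {
    let message;
    try {
      message = JSON.parse(line);
    } catch {
      // -32700 is the JSON-RPC 2.0 "Parse error" code.
      return { jsonrpc: "2.0", id: null, error: { code: -32700, message: "Parse error" } };
    }
    try {
      // Stateless: every message is its own upstream call.
      return await send(message);
    } catch (err) {
      // Upstream trouble is reported to the IDE as an ordinary error reply,
      // so the local stdio session stays alive. -32603 is "Internal error".
      // A fuller version would skip replies for notifications (no id).
      return {
        jsonrpc: "2.0",
        id: message.id ?? null,
        error: { code: -32603, message: `Upstream unavailable: ${err.message}` },
      };
    }
  };
}
```

In a real proxy, `handleLine` would be fed from `process.stdin` one line at a time, and each result would be written back to `process.stdout` as a single line of JSON.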
K-WIL Algorithm
Every search runs the K-WIL Gravity Well Algorithm across your knowledge graph:
Gravity = V × H × M × L × T

| Factor | What It Does |
|--------|--------------|
| V (Vector Hit) | Cosine similarity between your query and node embedding — semantic match |
| H (Entity Group) | Harmonic mean weight of entity pairs linked to this node — relational signal |
| M (Node Bridge) | Confidence weight of entity-node links — extraction quality signal |
| L (Dedup Boost) | 1 + log10(1 + hit_count) — nodes seen many times rank higher |
| T (Time Decay) | 1 / (1 + days_old) — recent memories rank higher; permanent facts bypass decay |
Path 2 safety: if no entity data exists yet (new node), H and M default to 1.0 — search falls back to pure vector similarity gracefully. Full 5-factor precision kicks in the moment entities are extracted.
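To make the formula concrete, here is a minimal sketch of the score as the table describes it. It is an illustration only: `gravity` and its parameter names are hypothetical, and the server-side implementation may differ in details such as clamping or normalization.

```javascript
// Illustrative only: combines the five factors as described in the table.
function gravity({ v, h = 1.0, m = 1.0, hitCount = 0, daysOld = 0, permanent = false }) {
  // L (Dedup Boost): 1 + log10(1 + hit_count)
  const l = 1 + Math.log10(1 + hitCount);
  // T (Time Decay): 1 / (1 + days_old); permanent facts bypass decay.
  const t = permanent ? 1 : 1 / (1 + daysOld);
  // H and M default to 1.0 when no entity data exists yet,
  // which reduces the score to V × L × T.
  return v * h * m * l * t;
}

// A node with cosine similarity 0.8, seen 9 times before, one day old:
// L = 1 + log10(10) = 2 and T = 1 / 2 = 0.5, so Gravity = 0.8 × 2 × 0.5 = 0.8.
console.log(gravity({ v: 0.8, hitCount: 9, daysOld: 1 }));
```

Because the factors multiply, a weak value in any one of them pulls the whole score down, while the neutral defaults for H and M leave a new node's ranking to similarity, repetition, and age.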
Result: Retrieval Precision — your graph may have 180K tokens of knowledge. A single search retrieves the exact 800 tokens Claude needs. No more, no less.
Ambient Auto-Onboarding
On first run in a new project, the proxy silently reads:
- git config user.name and user.email
- Workspace type (detected from package.json, pubspec.yaml, etc.)
- First 500 chars of your README.md
It ingests a lightweight project seed so Claude immediately knows your context. Use --no-scan to disable.
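As a rough picture of what such a scan involves, the sketch below gathers the same three pieces of information with Node's standard library. It is not the proxy's code: the function names, the shape of the returned object, and the manifest-to-type mapping are assumptions made for illustration.

```javascript
const { execFileSync } = require("node:child_process");
const fs = require("node:fs");
const path = require("node:path");

// Assumed mapping from manifest file to workspace type (illustrative only).
const MANIFESTS = { "package.json": "node", "pubspec.yaml": "dart" };

function detectWorkspaceType(fileNames) {
  const hit = Object.keys(MANIFESTS).find((name) => fileNames.includes(name));
  return hit ? MANIFESTS[hit] : "unknown";
}

// Reads one git config value; returns null if git or the value is missing.
function gitConfig(key, cwd) {
  try {
    const out = execFileSync("git", ["config", key], {
      cwd,
      encoding: "utf8",
      stdio: ["ignore", "pipe", "ignore"], // keep git's stderr out of the terminal
    });
    return out.trim() || null;
  } catch {
    return null;
  }
}

function collectProjectSeed(projectDir) {
  const readmePath = path.join(projectDir, "README.md");
  const readme = fs.existsSync(readmePath) ? fs.readFileSync(readmePath, "utf8") : "";
  return {
    userName: gitConfig("user.name", projectDir),
    userEmail: gitConfig("user.email", projectDir),
    workspaceType: detectWorkspaceType(fs.readdirSync(projectDir)),
    readmeExcerpt: readme.slice(0, 500), // first 500 characters only
  };
}
```

Calling `collectProjectSeed(process.cwd())` returns a small object of this kind; with `--no-scan`, the real proxy skips the scan entirely.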
Spheres (Knowledge Categories)
Nodes are automatically classified into spheres that shape retrieval ranking:
| Sphere | What goes here |
|--------|---------------|
| architecture | System design, API contracts, architectural decisions |
| work | Tasks, deployments, code decisions |
| technology | Stack choices, tools, integrations |
| vision | Goals, product strategy, business direction |
| family | Personal relationships |
| health | Medical, wellness |
| personal | Personal notes and preferences |
Pass sphere explicitly to override auto-classification.
Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| SUMA_API_KEY | Your API key (sk_live_...) — required | — |
| SUMA_API_URL | Override API endpoint | https://sumapro.quadframe.work |
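
The table lists environment variables, while the Quick Start passes the key with `--key`. A minimal sketch of how the two sources could be combined is shown below. `resolveConfig` is hypothetical, and the precedence (flag first, then environment) is an assumption, not documented behavior.

```javascript
// Hypothetical resolver: the flag-over-environment precedence is an assumption.
const DEFAULT_API_URL = "https://sumapro.quadframe.work";

function resolveConfig(argv, env) {
  // Look for a --key=... argument like the one used in the Quick Start.
  const keyArg = argv.find((arg) => arg.startsWith("--key="));
  const apiKey = keyArg ? keyArg.slice("--key=".length) : env.SUMA_API_KEY;
  if (!apiKey) {
    throw new Error("No API key: pass --key=... or set SUMA_API_KEY");
  }
  // SUMA_API_URL overrides the default endpoint from the table above.
  return { apiKey, apiUrl: env.SUMA_API_URL || DEFAULT_API_URL };
}

// Typical call site in a Node CLI:
// const config = resolveConfig(process.argv.slice(2), process.env);
```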
Privacy
- Your data is isolated by org_id; no other user can access your graph
- The proxy reads git identity and README on first boot only (use --no-scan to disable)
- No telemetry sent beyond your own SUMA server
- You can wipe all data at any time: suma_clean({ confirm: true })
Changelog
v1.4.0 (April 17, 2026)
- Omni-Brain Architecture: org_types TEXT[] allows one org to simultaneously span multiple industry profiles (personal + enterprise). All sphere vocabularies union automatically.
- Stackable Personas: selected_personas TEXT[] replaces single is_active persona. Stack forensics + analyst simultaneously; extraction engine unions their vocabularies.
- Dual-Axis Ontology: Axis 1 is the sphere floor (content-determined, hard gate). Axis 2 is persona extras (intent-determined, additive union). Entity extraction constrained to sphere.allowed ∪ persona.extra ∪ persona.learned.
- Active Sleep Phase 5: Neuroplastic vocabulary promotion. Entity types appearing 3+ times are promoted to the active persona's learned_entity_types. Targeted scoring prevents persona vocabulary collapse.
- suma_stats omni_brain block: suma_stats now returns org_types, selected_personas, sphere_distribution, entity_counts, edge_count for full graph telemetry.
- K-WIL updated to 2-Path Convergence (V×H×M×L×T); see formula above.
v1.3.5 (April 9, 2026)
- Mock OAuth port: default moved to 5556 (5555 reserved by Android emulator ADB)
- suma_stats tool: added to proxy; returns node count, sphere breakdown, compression ratio, tokens saved lifetime. The ROI receipt.
- Block 3 E2E tests: full SSO pipeline test suite added (test_companion_memory_e2e.spec.js)
v1.3.4 (April 9, 2026)
- K-WIL fidelity baseline sealed — 96.3% (26/27 facts recoverable) on fresh 5-node graph
- Temporally-anchored search — proxy now supports explicit keyword anchoring for historical queries
v1.3.3 (April 9, 2026)
- Life story seed: scripts/seed_life_story.py, 58 temporally-anchored nodes (Apr 2025–Apr 2026)
- 58-test Playwright E2E suite: production_hardening + kwil_fidelity + dashboard_session all passing
v1.3.2 (April 9, 2026)
- Semantic edge weight taxonomy: Gemini scores each extracted relationship using a 5-tier gravity model (action=0.90, work=0.65, spatial=0.40). recompute_node_harmonic_weight() carries true semantic variance. CREATED edges contribute 2.25× more gravitational mass than LIVES_IN edges.
- Gate 2 response now includes similarity, harmonic_weight, reinforcement fields, so AI clients know when near-duplicate content strengthened an existing node vs created a new one.
v1.3.0 (April 6, 2026)
- Ambient auto-onboarding on first boot per project
- --no-scan flag for privacy-conscious developers
- Client-side persona weight injection
v1.2.0 (April 5, 2026)
- Production URL: sumapro.quadframe.work
v1.1.0 (April 5, 2026)
- Added suma_correct tool
v1.0.0
- Initial release: ping, ingest, search, talk, clean
Support
- Dashboard: sumapro.quadframe.work
- Email: [email protected]
License
MIT — Suman Addanke / A2 Vibe Creators LLC
