suma-mcp-proxy

v1.4.1

Published

a month ago

SUMA Memory — stop re-explaining your project to Claude. Persistent K-WIL knowledge graph for Claude Code, Cursor, and any MCP client.


sumapro

suma mcp memory ai knowledge-graph claude cursor llm

SUMA MCP Proxy

Persistent AI memory for Claude Code, Cursor, and any MCP-compatible IDE

Your AI forgets everything when a session ends. SUMA gives it a permanent knowledge graph — so it remembers who you are, what you're building, and every decision you've made.

┌─────────────────────────────────────────────────────────────────────┐
│  Your IDE (Claude Code / Cursor / Windsurf)                         │
│  ┌───────────────┐     stdio      ┌────────────────────────────┐   │
│  │ AI Assistant  │◄──────────────►│ suma-mcp-proxy (local)     │   │
│  └───────────────┘   (stable)     └─────────────┬──────────────┘   │
└─────────────────────────────────────────────────┼───────────────────┘
                                                  │ HTTPS
                                                  ▼
                              sumapro.quadframe.work
                              ┌────────────────────────────────────┐
                              │  K-WIL Gravity Engine              │
                              │  PostgreSQL + pgvector             │
                              │  Gemini embeddings                 │
                              └────────────────────────────────────┘

Quick Start

1. Get your API key

2. Verify it works (optional but recommended)

npx suma-mcp-proxy@latest --key=sk_live_YOUR_KEY

You should see output similar to the following (the version line reflects the release that npx resolved):

SUMA MCP Proxy v1.3.5 started
Org ID: your_org | Tier: free
Ready.

Press Ctrl+C to stop. If this works, installation will work.

3. Add to your project

Create .mcp.json in your project root:

{
  "mcpServers": {
    "suma-memory": {
      "command": "npx",
      "args": ["suma-mcp-proxy", "--key=sk_live_YOUR_KEY"]
    }
  }
}

No global install needed. Claude Code reads .mcp.json on startup and runs the proxy automatically via npx — every session, zero maintenance.
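If the project already has an .mcp.json listing other servers, the entry above can be merged in rather than overwriting the file. The following is a minimal TypeScript sketch that assumes only the file shape shown above; `withSumaServer` and the `other-server` entry are hypothetical names used for illustration and are not part of suma-mcp-proxy.

```typescript
// Shape of .mcp.json as shown above; fields a client may additionally support are omitted.
interface McpServer {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers: Record<string, McpServer>;
}

// Hypothetical helper: returns a copy of `config` with the suma-memory entry
// added or replaced, leaving every other configured server untouched.
function withSumaServer(config: McpConfig, apiKey: string): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...config.mcpServers,
      "suma-memory": {
        command: "npx",
        args: ["suma-mcp-proxy", `--key=${apiKey}`],
      },
    },
  };
}

// Example: an existing config with one unrelated, made-up server.
const existing: McpConfig = {
  mcpServers: { "other-server": { command: "node", args: ["server.js"] } },
};
const merged = withSumaServer(existing, "sk_live_YOUR_KEY");

// Serialized contents that would be written back to .mcp.json.
const mcpJson = JSON.stringify(merged, null, 2);
```

The helper copies instead of mutating, so the original object can still be compared against the result before anything is written to disk.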

4. Restart Claude Code or Cursor

SUMA starts learning automatically from your first session.

Available Tools

| Tool | Description |
|------|-------------|
| suma_ping | Verify connection; call once at session start |
| suma_ingest | Store knowledge in the graph (auto entity extraction + embedding) |
| suma_search | Search with the K-WIL Gravity Well Algorithm; returns ranked nodes + synthesized answer |
| suma_talk | Bidirectional: search AND learn in one call |
| suma_correct | Fix an incorrect node (soft delete, preserves audit trail) |
| suma_stats | Graph statistics + K-WIL token economics, the ROI receipt (node count, compression ratio, tokens saved) |
| suma_clean | Wipe all data for your org (requires confirmation) |

Tool Examples

suma_ingest

{
  text: "Decided to use cosine² instead of (1+cosine) in K-WIL — 20x stronger signal separation",
  sphere: "architecture"  // optional — auto-classified if omitted
}
// Returns: { status: "ok", node_id: "ARCHITECTURE_abc123", compression: "94%" }

suma_search

{
  query: "why did we choose PostgreSQL over MongoDB",
  limit: 5
}
// Returns: { answer: "...", results: [...nodes...], entities: [...], token_economics: {...} }

suma_talk

{
  message: "We just decided to use PostgreSQL with pgvector instead of Pinecone"
}
// Returns: { answer: "...", nodes_learned: 2 }
// Searches graph for context AND ingests the new decision in one call.

suma_stats

{}  // no arguments
// Returns: { node_count: 521, compression_ratio: "97.6%", tokens_saved_lifetime: 4263729,
//            tier: "enterprise", spheres: { architecture: 120, work: 62, ... } }
// Show this to the user as the ROI receipt.

suma_correct

{
  node_id: "FAMILY_783e4d5623e0",
  reason: "Chinni is wife's nickname, not mother",
  replacement_text: "Chinni is Suman's nickname for his wife Madhuri"
}
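
Each example above is the arguments object of an MCP tool call. On the wire, an MCP client wraps those arguments in a JSON-RPC 2.0 `tools/call` request. The sketch below shows that envelope for suma_search; `buildToolCall` is a hypothetical helper, and in normal use the IDE's MCP client builds this message for you.

```typescript
// JSON-RPC 2.0 request envelope that MCP uses for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in the envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// The suma_search example from above, expressed as a full request.
const searchRequest = buildToolCall(1, "suma_search", {
  query: "why did we choose PostgreSQL over MongoDB",
  limit: 5,
});

// One line of JSON, as it would travel over the stdio transport.
const wireMessage = JSON.stringify(searchRequest);
```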

How It Works

Why a local proxy?

MCP's stdio transport is designed for local processes, not cloud APIs. A direct connection from the IDE to a cloud service drops when:

The service scales to zero between calls
A new version deploys mid-session
A network hiccup breaks the connection

The proxy runs locally, maintaining a stable stdio connection to your IDE while making stateless HTTPS calls to the cloud. Your IDE never knows the difference.
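
On the IDE side, the MCP stdio transport exchanges newline-delimited JSON-RPC messages. The sketch below shows one way a local process could reassemble complete messages from arbitrary input chunks before forwarding each one as its own HTTPS request. It illustrates the transport framing only and is not taken from the proxy's source; `MessageFramer` is a hypothetical name.

```typescript
// Reassembles newline-delimited JSON messages from arbitrarily split chunks.
class MessageFramer {
  private buffer = "";

  // Feed a raw chunk; returns every complete message that the chunk finished.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an unfinished line (or "" if the chunk ended on a newline).
    this.buffer = lines.pop() ?? "";
    const messages: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed.length > 0) {
        messages.push(JSON.parse(trimmed));
      }
    }
    return messages;
  }
}

// A request split across two chunks, followed by a second request.
const framer = new MessageFramer();
const first = framer.push('{"jsonrpc":"2.0","id":1,"method":"pi');
const second = framer.push('ng"}\n{"jsonrpc":"2.0","id":2,"method":"tools/list"}\n');
```

Because each completed message can be forwarded independently, a failed HTTPS call affects only that one request and leaves the stdio session with the IDE open.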

K-WIL Algorithm

Every search runs the K-WIL Gravity Well Algorithm across your knowledge graph:

Gravity = V × H × M × L × T

| Factor | What It Does |
|--------|--------------|
| V (Vector Hit) | Cosine similarity between your query and node embedding (semantic match) |
| H (Entity Group) | Harmonic mean weight of entity pairs linked to this node (relational signal) |
| M (Node Bridge) | Confidence weight of entity-node links (extraction quality signal) |
| L (Dedup Boost) | 1 + log10(1 + hit_count); nodes seen many times rank higher |
| T (Time Decay) | 1 / (1 + days_old); recent memories rank higher, permanent facts bypass decay |

Path 2 safety: if a node has no entity data yet (for example, a newly created node), H and M default to 1.0, so ranking falls back to vector similarity, still scaled by L and T. All five factors apply as soon as entities are extracted.
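
The formula and the fallback can be sketched as a single scoring function. This is an illustration built from the factor definitions above, not the engine's actual code: the field names are hypothetical, and modelling "permanent facts bypass decay" as T = 1 is an assumption.

```typescript
interface NodeSignals {
  vectorSimilarity: number;  // V: cosine similarity between query and node embedding
  harmonicWeight?: number;   // H: undefined until entities are extracted
  bridgeConfidence?: number; // M: undefined until entities are extracted
  hitCount: number;          // number of times the node has been seen (feeds L)
  daysOld: number;           // node age in days (feeds T)
  permanent?: boolean;       // assumption: permanent facts are scored with T = 1
}

// Gravity = V × H × M × L × T, with H and M defaulting to 1.0 when absent.
function gravity(n: NodeSignals): number {
  const v = n.vectorSimilarity;
  const h = n.harmonicWeight ?? 1.0;
  const m = n.bridgeConfidence ?? 1.0;
  const l = 1 + Math.log(1 + n.hitCount) / Math.LN10; // 1 + log10(1 + hit_count)
  const t = n.permanent ? 1 : 1 / (1 + n.daysOld);    // 1 / (1 + days_old)
  return v * h * m * l * t;
}

// Same vector match, no entity data: only age and the permanent flag differ.
const freshNode = gravity({ vectorSimilarity: 0.8, hitCount: 0, daysOld: 0 });
const oldNode = gravity({ vectorSimilarity: 0.8, hitCount: 0, daysOld: 9 });
const oldPermanent = gravity({ vectorSimilarity: 0.8, hitCount: 0, daysOld: 9, permanent: true });
```

With no entity data, H and M drop out of the product, so `freshNode` scores exactly its vector similarity, `oldNode` is reduced tenfold by decay, and `oldPermanent` keeps the undecayed score.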

Result: retrieval precision. Your graph may hold 180K tokens of knowledge, but a single search returns only the roughly 800 tokens Claude needs for the query instead of the whole graph.

Ambient Auto-Onboarding

On first run in a new project, the proxy silently reads:

git config user.name and user.email
Workspace type (detected from package.json, pubspec.yaml, etc.)
First 500 chars of your README.md

It ingests a lightweight project seed so Claude immediately knows your context. Use --no-scan to disable.
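
The seed described above can be pictured as a small record assembled from those three inputs. The sketch below is an assumption-heavy illustration: the workspace labels, the `ProjectSeed` shape, and both function names are hypothetical, and only the two marker files named above are mapped.

```typescript
// Assumed marker-file mapping. The text above names only package.json and
// pubspec.yaml; the labels on the right are hypothetical.
const WORKSPACE_MARKERS: Array<[string, string]> = [
  ["package.json", "node"],
  ["pubspec.yaml", "dart"],
];

// Returns the label of the first marker file present, or "unknown".
function detectWorkspaceType(fileNames: string[]): string {
  for (const [marker, label] of WORKSPACE_MARKERS) {
    if (fileNames.indexOf(marker) !== -1) {
      return label;
    }
  }
  return "unknown";
}

// Hypothetical shape of the project seed.
interface ProjectSeed {
  userName: string;
  userEmail: string;
  workspaceType: string;
  readmeExcerpt: string;
}

function buildProjectSeed(
  userName: string,
  userEmail: string,
  fileNames: string[],
  readme: string
): ProjectSeed {
  return {
    userName,
    userEmail,
    workspaceType: detectWorkspaceType(fileNames),
    readmeExcerpt: readme.slice(0, 500), // first 500 characters only
  };
}

// Example with made-up identity values and a short README.
const seed = buildProjectSeed(
  "Ada",
  "ada@example.com",
  ["package.json", "README.md"],
  "# Demo\nA sample project."
);
```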

Spheres (Knowledge Categories)

Nodes are automatically classified into spheres that shape retrieval ranking:

| Sphere | What goes here |
|--------|----------------|
| architecture | System design, API contracts, architectural decisions |
| work | Tasks, deployments, code decisions |
| technology | Stack choices, tools, integrations |
| vision | Goals, product strategy, business direction |
| family | Personal relationships |
| health | Medical, wellness |
| personal | Personal notes and preferences |

Pass sphere explicitly to override auto-classification.
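
A client can guard against typos in the sphere name before calling suma_ingest. The sketch below assumes the seven spheres listed above are the full set for your org, which may not hold if other sphere vocabularies are enabled; `isSphere` and `ingestArgs` are hypothetical helpers, and whether the server itself rejects unknown spheres is not stated here.

```typescript
// Sphere names from the table above (assumed to be the complete set).
const SPHERES = [
  "architecture",
  "work",
  "technology",
  "vision",
  "family",
  "health",
  "personal",
] as const;

type Sphere = (typeof SPHERES)[number];

function isSphere(value: string): value is Sphere {
  return (SPHERES as readonly string[]).indexOf(value) !== -1;
}

interface IngestArgs {
  text: string;
  sphere?: Sphere;
}

// Builds suma_ingest arguments; omitting `sphere` leaves classification to the server.
function ingestArgs(text: string, sphere?: string): IngestArgs {
  if (sphere === undefined) {
    return { text };
  }
  if (!isSphere(sphere)) {
    throw new Error(`Unknown sphere: ${sphere}`);
  }
  return { text, sphere };
}

const autoClassified = ingestArgs("Switched CI to GitHub Actions");
const explicit = ingestArgs("Switched CI to GitHub Actions", "work");
```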

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| SUMA_API_KEY | Your API key (sk_live_...), required | (none) |
| SUMA_API_URL | Override API endpoint | https://sumapro.quadframe.work |
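
The sketch below shows one plausible way these variables could combine with the --key flag from the Quick Start. The precedence (flag first, then SUMA_API_KEY) is an assumption, not documented behaviour, and `resolveConfig` is a hypothetical function name.

```typescript
interface ProxyConfig {
  apiKey: string;
  apiUrl: string;
}

// Default endpoint from the table above.
const DEFAULT_API_URL = "https://sumapro.quadframe.work";

// Assumed precedence: --key=... on the command line, then SUMA_API_KEY.
function resolveConfig(
  argv: string[],
  env: Record<string, string | undefined>
): ProxyConfig {
  const prefix = "--key=";
  let keyFromFlag: string | undefined;
  for (const arg of argv) {
    if (arg.indexOf(prefix) === 0) {
      keyFromFlag = arg.slice(prefix.length);
    }
  }
  const apiKey = keyFromFlag ?? env["SUMA_API_KEY"];
  if (!apiKey) {
    throw new Error("Missing API key: pass --key=... or set SUMA_API_KEY");
  }
  return { apiKey, apiUrl: env["SUMA_API_URL"] ?? DEFAULT_API_URL };
}

// Key supplied by flag, no environment overrides.
const config = resolveConfig(["--key=sk_live_YOUR_KEY"], {});
```

Taking `argv` and `env` as parameters instead of reading globals keeps the function easy to test without touching the real process environment.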

Privacy

Your data is isolated by org_id — no other user can access your graph
The proxy reads git identity and README on first boot only (use --no-scan to disable)
No telemetry sent beyond your own SUMA server
You can wipe all data at any time: suma_clean({ confirm: true })

Changelog

v1.4.0 (April 17, 2026)

Omni-Brain Architecture — org_types TEXT[] allows one org to simultaneously span multiple industry profiles (personal + enterprise). All sphere vocabularies union automatically.
Stackable Personas — selected_personas TEXT[] replaces single is_active persona. Stack forensics + analyst simultaneously; extraction engine unions their vocabularies.
Dual-Axis Ontology — Axis 1: sphere floor (content-determined, hard gate). Axis 2: persona extras (intent-determined, additive union). Entity extraction constrained to sphere.allowed ∪ persona.extra ∪ persona.learned.
Active Sleep Phase 5 — Neuroplastic vocabulary promotion. Entity types appearing 3+ times are promoted to the active persona's learned_entity_types. Targeted scoring prevents persona vocabulary collapse.
suma_stats omni_brain block — suma_stats now returns org_types, selected_personas, sphere_distribution, entity_counts, edge_count for full graph telemetry.
K-WIL updated to 2-Path Convergence (V×H×M×L×T) — see formula above.

v1.3.5 (April 9, 2026)

Mock OAuth port — default moved to 5556 (5555 reserved by Android emulator ADB)
suma_stats tool — added to proxy; returns node count, sphere breakdown, compression ratio, tokens saved lifetime. The ROI receipt.
Block 3 E2E tests — full SSO pipeline test suite added (test_companion_memory_e2e.spec.js)

v1.3.4 (April 9, 2026)

K-WIL fidelity baseline sealed — 96.3% (26/27 facts recoverable) on fresh 5-node graph
Temporally-anchored search — proxy now supports explicit keyword anchoring for historical queries

v1.3.3 (April 9, 2026)

Life story seed — scripts/seed_life_story.py — 58 temporally-anchored nodes (Apr 2025–Apr 2026)
58-test Playwright E2E suite — production_hardening + kwil_fidelity + dashboard_session all passing

v1.3.2 (April 9, 2026)

Semantic edge weight taxonomy — Gemini scores each extracted relationship using a 5-tier gravity model (action=0.90, work=0.65, spatial=0.40). recompute_node_harmonic_weight() carries true semantic variance. CREATED edges contribute 2.25× more gravitational mass than LIVES_IN edges.
Gate 2 response now includes similarity, harmonic_weight, reinforcement fields — AI clients know when near-duplicate content strengthened an existing node vs created a new one.

v1.3.0 (April 6, 2026)

Ambient auto-onboarding on first boot per project
--no-scan flag for privacy-conscious developers
Client-side persona weight injection

v1.2.0 (April 5, 2026)

Production URL: sumapro.quadframe.work

v1.1.0 (April 5, 2026)

Added suma_correct tool

v1.0.0

Initial release: ping, ingest, search, talk, clean

Support

Dashboard: sumapro.quadframe.work
Email: [email protected]

License

MIT — Suman Addanke / A2 Vibe Creators LLC