@stackmemoryai/cli
v1.0.0
Zero-configuration memory persistence for AI coding tools
StackMemory
Lossless, project-scoped memory for AI tools
StackMemory is a memory runtime for AI coding and writing tools that preserves full project context across:
- chat thread resets
- model switching
- editor restarts
- long-running repos with thousands of interactions
Instead of a linear chat log, StackMemory organizes memory as a call stack of scoped work (frames), allowing context to naturally unwind without lossy compaction.
Memory is storage. Context is a compiled view.
Why StackMemory exists
Modern AI tools forget:
- why decisions were made
- which constraints still apply
- what changed earlier in the repo
- what tools already ran and why
StackMemory fixes this by:
- storing everything losslessly (events, tool calls, decisions)
- injecting only the relevant working set into model context
- keeping memory project-scoped, not chat-scoped
Core concepts (quick mental model)
| Concept    | Meaning                                           |
| ---------- | ------------------------------------------------- |
| Project    | One GitHub repo (initial scope)                   |
| Frame      | A scoped unit of work (like a function call)      |
| Call Stack | Nested frames; only the active path is "hot"      |
| Event      | Append-only record (message, tool call, decision) |
| Digest     | Structured return value when a frame closes       |
| Anchor     | Pinned fact (DECISION, CONSTRAINT, INTERFACE)     |
Frames can span:
- multiple chat turns
- multiple tool calls
- multiple sessions
Hosted vs Open Source
Hosted (default)
- Cloud-backed memory runtime
- Fast indexing + retrieval
- Durable storage
- Per-project pricing
- Works out-of-the-box
Open-source local mirror
- SQLite-based
- Fully inspectable
- Offline / air-gapped
- Intentionally N versions behind
- No sync, no org features
OSS is for trust and inspection. Hosted is for scale, performance, and teams.
How it integrates
StackMemory integrates as an MCP tool and is invoked on every interaction in:
- Claude Code
- compatible editors
- future MCP-enabled tools
The editor never manages memory directly — it simply asks StackMemory for the context bundle.
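From the editor's side, consuming the bundle is trivial: it splices in whatever StackMemory returns. A hypothetical sketch (the field names mirror the example response later in this README, but are not a documented protocol surface):

```python
# Hypothetical editor-side handling of a StackMemory context bundle.
# Field names are illustrative, not the real MCP protocol surface.

def handle_bundle(bundle: dict) -> str:
    """The editor never manages memory; it just splices in what it receives."""
    parts = [f["frame"] for f in bundle.get("hot_stack", [])]
    parts += [a["text"] for a in bundle.get("anchors", [])]
    parts += [d.get("summary", "") for d in bundle.get("relevant_digests", [])]
    return "\n".join(parts)
```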
QuickStart
1. Hosted (Recommended)
Step 1: Create a project
```bash
stackmemory projects create \
  --repo https://github.com/org/repo
```

This creates a project-scoped memory space tied to the repo.
Step 2: Install MCP client
```bash
npm install -g stackmemory-mcp
```

or via binary:

```bash
curl -fsSL https://stackmemory.dev/install | sh
```

Step 3: Configure Claude Code / editor
Add StackMemory as an MCP tool:
```json
{
  "tools": {
    "stackmemory": {
      "command": "stackmemory-mcp",
      "args": ["--project", "github:org/repo"]
    }
  }
}
```

That's it.
Every message now:
- Gets logged losslessly
- Updates the call stack
- Retrieves the correct context automatically
No prompts to manage. No summaries to babysit.
2. Open-Source Local Mode
Step 1: Clone
```bash
git clone https://github.com/stackmemory/stackmemory
cd stackmemory
```

Step 2: Run local MCP server
```bash
cargo run --bin stackmemory-mcp
# or
npm run dev
```

This creates:

```
.memory/
└── memory.db   # SQLite
```

All project memory lives locally.
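Because the mirror is plain SQLite, it is fully inspectable with nothing but the standard library. A minimal sketch — the mirror's table names are not documented here, so this helper just lists whatever tables the database actually contains:

```python
import sqlite3

def list_tables(conn: sqlite3.Connection) -> list[str]:
    """List every table in a StackMemory local mirror (or any SQLite file)."""
    return [name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]

# e.g.: list_tables(sqlite3.connect(".memory/memory.db"))
```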
Step 3: Point your editor to local MCP
```json
{
  "tools": {
    "stackmemory": {
      "command": "stackmemory-mcp",
      "args": ["--local"]
    }
  }
}
```

What happens on each interaction
On every message/tool call:
Ingest
- New message delta is appended as events
Index
- Anchors updated
- Digests generated when frames close
Retrieve
- Active call stack (hot)
- Relevant digests (warm)
- Pointers to raw data (cold)
Return context bundle
- Sized to token budget
- No global compaction
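The ingest → index → retrieve loop above can be sketched as a toy budgeted retrieval: hot frames first, then warm digests until the budget runs out, with cold data referenced but never inlined. All names and the token estimate are assumptions for illustration:

```python
# Toy model of budgeted retrieval: hot, then warm, never inline cold data.
# Store layout and the chars-per-token heuristic are illustrative assumptions.

def retrieve(store: dict, token_budget: int) -> dict:
    bundle = {"hot_stack": [], "relevant_digests": [], "pointers": []}
    used = 0
    # Hot: the active call stack goes in first.
    for frame in store["active_stack"]:
        cost = len(frame) // 4            # rough token estimate
        if used + cost > token_budget:
            break
        bundle["hot_stack"].append(frame)
        used += cost
    # Warm: digests of closed frames, until the budget is exhausted.
    for digest in store["digests"]:
        cost = len(digest) // 4
        if used + cost > token_budget:
            break
        bundle["relevant_digests"].append(digest)
        used += cost
    # Cold: raw data is only ever referenced by pointer, never compacted.
    bundle["pointers"] = list(store["raw_pointers"])
    return bundle
```

Note that shrinking the budget trims what gets inlined, but the pointers to raw data always survive — nothing is destroyed to fit the window.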
Example MCP response (simplified)
```json
{
  "hot_stack": [
    { "frame": "Debug auth redirect", "constraints": [...] }
  ],
  "anchors": [
    { "type": "DECISION", "text": "Use SameSite=Lax cookies" }
  ],
  "relevant_digests": [
    { "frame": "Initial auth refactor", "summary": "..." }
  ],
  "pointers": [
    "s3://logs/auth-test-0421"
  ]
}
```

Storage & limits
Free tier (hosted)
- 1 project
- Up to X MB stored
- Up to Y MB retrieval egress / month
Paid tiers
- Per-project pricing
- Higher storage + retrieval
- Team sharing
- Org controls
No seat-based pricing.
Guarantees
- ✅ Lossless storage (no destructive compaction)
- ✅ Project-scoped isolation
- ✅ Survives new chat threads
- ✅ Survives model switching
- ✅ Inspectable local mirror
Non-goals
- ❌ Chat UI
- ❌ Vector DB replacement
- ❌ Tool execution runtime
- ❌ Prompt engineering framework
Philosophy
Frames instead of transcripts. Return values instead of summaries. Storage separate from context.
Status
- Hosted: Private beta
- OSS mirror: Early preview
- MCP integration: Stable
Roadmap (high level)
- Team / org projects
- Cross-repo memory
- Background project compilers
- Fine-grained retention policies
- Editor UX surfacing frame boundaries
License
- Hosted service: Proprietary
- Open-source mirror: Apache 2.0 / MIT (TBD)
Additional Resources
ML System Design
- ML System Insights - Comprehensive analysis of 300+ production ML systems
- Agent Instructions - Specific guidance for AI agents working with ML systems
Documentation
- Product Requirements - Detailed product specifications
- Technical Architecture - System design and database schemas
- Beads Integration - Git-native memory patterns from Beads ecosystem
