@bd7pil/opencode-deep-memory
v0.8.7
Published
Persistent context intelligence for OpenCode — cross-session memory, compaction checkpoints, context reconstruction.
Maintainers
Readme
opencode-deep-memory
Persistent memory, checkpoint resilience, and deterministic context compression for OpenCode — zero runtime dependencies.
What it does
OpenCode sessions are stateless. Every restart is a cold start. Native compaction destroys conversation content. deep-memory adds three layers:
| Layer | What survives | How |
|-------|--------------|-----|
| Remember | Decisions, constraints, gotchas | memory_search / memory_store — BM25 + CJK search across sessions |
| Recover | Full conversation context | Checkpoint captures before compaction; resume injection on new session |
| Compress | Token budget | Deterministic stripping + pressure-triggered deep compression — no LLM calls |
Quick start
// opencode.json
{
"plugin": [
"oh-my-openagent",
"@bd7pil/opencode-deep-memory"
]
}OpenCode auto-installs on startup. Memory appears at .deep-memory/ in your project root.
How it works
┌─────────────────────────────────────────────────────────────────┐
│ messages.transform (every turn) │
│ ├─ Strip reasoning/thinking parts (physical removal) │
│ ├─ Remove system-injected messages (physical removal) │
│ ├─ Truncate old tool errors │
│ └─ Deep compress: dedup / tool output / JSON / assistant text │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ system.transform (every turn) │
│ ├─ Inject stable: MEMORY.md constraints + tool hint (cache hit)│
│ └─ Inject volatile: BM25 search results + repo map symbols │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ compacting (before OpenCode destroys messages) │
│ ├─ Capture raw messages → checkpoint.raw.json │
│ ├─ Extract knowledge → checkpoint.md │
│ └─ Inject structured handoff prompt for LLM │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ events │
│ ├─ session.created → resume + dream schedule │
│ ├─ session.idle → enrichment │
│ └─ session.compacted → pressure calibration │
└─────────────────────────────────────────────────────────────────┘Context compression
Three layers, fully automatic, no LLM calls.
Layer 1: Deterministic stripping (always active)
| Target | Action |
|--------|--------|
| Old reasoning/thinking parts | Physical removal |
| System injections (<system-reminder>, etc.) | Physical removal |
| Tool errors >100 chars (older than 4 turns) | Truncate |
| Inline <thinking> tags | Regex strip |
No marker pollution — old content is physically removed, not replaced with [cleared] or [stripped]. This prevents context confusion.
Layer 2: Deep compression (pressure-triggered)
| Pressure | Threshold | Actions |
|----------|-----------|---------|
| always | every turn | tool dedup + error purge + tool output compress + JSON crush + assistant text compress |
| medium | ≥ 50K tokens | + memory nudge (prompts LLM to use memory_store) |
| high | ≥ 150K tokens | + pressure nudge (prompts LLM to summarize old tasks) |
Thresholds are absolute, not percentage-based — they work consistently across 200K and 1M+ context windows. Based on Focus Agent research.
| Target | Strategy | Source | |--------|----------|--------| | Duplicate tool calls | Signature matching | DCP | | Old error inputs | Purge after 4 turns | DCP | | File reads | Keep head + key lines + tail | Edgee | | Command outputs | Keep errors + tail | Edgee | | Search results | Keep top-20, group by file | Edgee | | JSON arrays | Head + dedup middle + tail | Headroom | | Subagent output | Headers + key lines + tail with [ccr:] preservation | [Claude Code][] | | Skill output | Frontmatter + MUST rules + structure headers | [Claude Code][] | | Nested JSON objects | Compress child arrays >30 items | This project | | Old assistant text | Preserve structure, compress prose | LLMLingua |
All compressed content is reversible via CCR (Compress-Cache-Retrieve) — originals cached for 30 minutes with SHA-256 hash, retrievable via deep_expand tool.
No compression on protected tools: question, edit, write, todowrite, memory_*, deep_expand, task, skill. These tools' outputs contain verification data (LSP diagnostics, subagent decisions) essential for the agent to function correctly.
Post-compression re-read: after compression modifies content, recent modified files are listed in a <dm-nudge> so the agent can re-verify if needed — inspired by Claude Code's onCompact callback.
Memory nudge
Detects decisions, constraints, and fixes in conversation — nudges the LLM to persist them.
| Pattern | Example | Nudge |
|---------|---------|-------|
| Decision | "我决定用 PostgreSQL" / "I'll use PostgreSQL" | memory_store(type="decision") |
| Constraint | "不能用 eval()" / "must not use eval()" | memory_store(type="constraint") |
| Error fix | "修复了权限问题" / "fixed the permission error" | memory_store(type="gotcha") |
English + Chinese. Pressure nudge and memory nudge have independent cooldowns.
Tools
| Tool | Purpose |
|------|---------|
| memory_search | Search persistent memory (BM25 + CJK bigram) |
| memory_store | Store decisions, constraints, gotchas, facts, notes |
| memory_forget | Remove stale memory entries |
| memory_expand | Retrieve original content of a compressed message |
| deep_expand | Retrieve original content via CCR hash |
Compaction
When OpenCode compacts a session:
- Capture raw messages to
checkpoint.raw.json - Extract knowledge via 5 heuristic extractors
- Write structured
checkpoint.md - Inject Hermes-8 structured prompt + Codex-style handoff prefix
The LLM produces: Task Overview → Progress → Key Decisions → Constraints → Files Modified → Errors → Next Steps → Critical Context
Memory consolidation
| Cycle | Trigger | Action | |-------|---------|--------| | Auto-dream | 7 days or notes.md >20 lines | Consolidate notes + checkpoints → MEMORY.md | | Auto-distill | 30 days | Package recurring workflows → skill candidates | | Enrichment | Session idle after compaction | LLM enriches checkpoint with cross-references |
New projects: MEMORY.md auto-bootstraps from notes.md. Both agents have memory_forget enabled.
Configuration
| Variable | Default | Purpose |
|----------|---------|---------|
| DEEP_MEMORY_DEBUG | off | 1 = debug log, trace = +hook I/O |
| DEEP_MEMORY_PROJECT_SUBDIR | .deep-memory | Memory directory name |
| DEEP_MEMORY_GLOBAL_ROOT | ~/.local/share/opencode/deep-memory | Cross-project memory |
Storage
<project>/.deep-memory/
├── MEMORY.md persistent decisions/constraints/gotchas
├── notes.md keyword captures
├── checkpoint.md last compaction extraction
├── checkpoint.raw.json raw messages dump
├── .schedule.json dream/distill state
├── .compaction-log.jsonl compaction audit trail
└── sessions/<sid>/ per-session archiveCommands
/checkpoint— manually capture session state/dream— consolidate notes into persistent memory/distill— package recurring workflows into skills
Development
npm install
npm run verify # typecheck + test (363) + build + smoke (49)Acknowledgments
DCP — Dynamic Context Pruning for OpenCode. Tool dedup, error purge, and nudge system.
Headroom — JSON array crush and CCR (Compress-Cache-Retrieve).
Edgee — Per-tool compression strategies (read, bash, grep, glob).
Contextomizer — Content type detection pipeline.
Focus Agent — Absolute token thresholds and assistant text compression research.
LLMLingua — Selective compression: preserve structure, compress prose.
Codex CLI — Handoff prefix pattern for compaction continuity.
Google ADK — Append-only event compaction architecture.
Hermes — 8-section structured compaction prompt design.
MiMo-Code — Terminal-native AI coding assistant with persistent memory.
Magic Context — Unbounded context for coding agents.
Aider — AI pair programming in your terminal.
Roo Code — A whole dev team of AI agents in your code editor.
Continue — Pioneering open-source coding agent.
OpenHands — Code Less, Make More.
Plandex — AI coding agent for large tasks and real world projects.
License
MIT
