@solvedbydev/opencode-observational-memory
v0.1.0
OpenCode plugin implementing Mastra-style observational memory with observer/reflector append-log compaction
# OpenCode Observational Memory Plugin
OpenCode plugin implementing Mastra-inspired observational memory. Who doesn't love limitless context windows?
Observational memory compresses raw conversation history into dense observations using an observer/reflector pattern, so your agent maintains continuity across compaction boundaries and long-running sessions with zero vector DBs or knowledge graphs required.
## Quick Start

### Install
```bash
npm install @solvedbydev/opencode-observational-memory
```

### Add to OpenCode
In your project's `opencode.json`:
```json
{
  "plugin": ["@solvedbydev/opencode-observational-memory"]
}
```

### Set an API Key
The plugin calls an OpenAI-compatible LLM to extract and consolidate observations. Provide a key via environment variable:
```bash
export OM_API_KEY="your-api-key"
# or fall back to:
export OPENAI_API_KEY="your-api-key"
```

That's it. The plugin runs automatically with no code changes required.
## How It Works
The plugin hooks into OpenCode's event and chat lifecycle to maintain a persistent memory layer:
- **Observe** - When unobserved message tokens exceed a threshold (default: 30k), an observer agent extracts key facts, decisions, and context from new messages.
- **Reflect** - When accumulated observation tokens exceed a second threshold (default: 40k), a reflector agent consolidates and compresses observations, removing redundancy while preserving meaning.
- **Inject** - Observations are injected into the system prompt and message context on every turn, giving the LLM continuity even after compaction discards raw history.
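As a sketch, the two thresholds drive the cycle roughly like this. All names here (`SessionState`, `plannedSteps`) are illustrative, not the plugin's actual internals:

```typescript
// Hypothetical sketch of the observe/reflect threshold logic described above.
interface SessionState {
  unobservedTokens: number;   // tokens of messages not yet observed
  observationTokens: number;  // tokens in the accumulated observation log
}

const OBSERVE_THRESHOLD = 30_000;  // default observation.messageTokens
const REFLECT_THRESHOLD = 40_000;  // default reflection.observationTokens

// Returns which agents would fire for a given session state.
function plannedSteps(state: SessionState): string[] {
  const steps: string[] = [];
  if (state.unobservedTokens >= OBSERVE_THRESHOLD) {
    steps.push("observe"); // observer extracts facts from new messages
    if (state.observationTokens >= REFLECT_THRESHOLD) {
      steps.push("reflect"); // reflector consolidates the observation log
    }
  }
  return steps;
}
```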
```
session.idle / messages.transform
  |
  +-- unobserved tokens >= threshold?
  |     yes -> Observer extracts observations
  |     |
  |     +-- observation tokens >= threshold?
  |           yes -> Reflector consolidates
  |
  +-- observations injected into system prompt & messages
```

### Observer / Reflector Pattern
Two specialized LLM agents coordinate memory:
- **Observer**: Reads new message history plus existing observations. Extracts discrete facts, decisions, preferences, and working context. Also tracks `currentTask` and `suggestedResponse` for session continuity.
- **Reflector**: Takes the full observation log when it grows too large. Compresses it by removing duplication, merging related items, and prioritizing recent and actionable information.
Both agents use configurable models (default: `google/gemini-2.5-flash`) and support custom instructions for domain-specific extraction.
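To illustrate, an observer-style extraction call against an OpenAI-compatible `/chat/completions` endpoint might look like the following. The prompt wording and function names are assumptions for illustration, not the plugin's actual prompts:

```typescript
// Hypothetical observer prompt builder; the real prompt lives in the
// plugin's prompts.ts and differs in detail.
function observerPrompt(customInstruction: string): string {
  return [
    "You are an observer agent. From the new messages, extract discrete",
    "facts, decisions, preferences, and working context as bullet points.",
    "Also report a currentTask and a suggestedResponse.",
    customInstruction, // observation.customInstruction, may be empty
  ].filter(Boolean).join("\n");
}

// Minimal call sketch using fetch (the plugin itself uses the openai client).
async function observe(baseURL: string, apiKey: string, model: string,
                       history: string, custom = ""): Promise<string> {
  const res = await fetch(`${baseURL}/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
    body: JSON.stringify({
      model,
      messages: [
        { role: "system", content: observerPrompt(custom) },
        { role: "user", content: history },
      ],
    }),
  });
  const data = await res.json() as { choices: { message: { content: string } }[] };
  return data.choices[0].message.content;
}
```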
## Configuration
Configuration merges from multiple sources (highest precedence first):
- Environment variables: `OM_API_KEY`, `OM_OBSERVATION_MODEL`, `OM_OBSERVATION_MESSAGE_TOKENS`, `OM_REFLECTION_MODEL`, `OM_REFLECTION_OBSERVATION_TOKENS`, `OM_API_BASE_URL`
- Project config: `<worktree>/.opencode/om-config.json`
- Global config: `~/.config/opencode/om-config.json`
- Built-in defaults
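A minimal sketch of that layered merge, where later layers override earlier ones per section. The type and merge semantics here are illustrative assumptions; the plugin's actual loader may differ:

```typescript
// Hypothetical sketch of layered config merging (defaults < global < project < env).
type OMConfig = {
  observation: { messageTokens: number; model: string; customInstruction: string };
  reflection: { observationTokens: number; model: string; customInstruction: string };
};

type Layer = { [K in keyof OMConfig]?: Partial<OMConfig[K]> };

const defaults: OMConfig = {
  observation: { messageTokens: 30000, model: "google/gemini-2.5-flash", customInstruction: "" },
  reflection: { observationTokens: 40000, model: "google/gemini-2.5-flash", customInstruction: "" },
};

// Call as merge(globalConfig, projectConfig, envOverrides): later layers win.
function merge(...layers: Layer[]): OMConfig {
  const out: OMConfig = structuredClone(defaults);
  for (const layer of layers) {
    if (layer.observation) Object.assign(out.observation, layer.observation);
    if (layer.reflection) Object.assign(out.reflection, layer.reflection);
  }
  return out;
}
```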
### Example Config
```json
{
  "observation": {
    "messageTokens": 30000,
    "model": "google/gemini-2.5-flash",
    "customInstruction": "Focus on architectural decisions and rejected alternatives."
  },
  "reflection": {
    "observationTokens": 40000,
    "model": "google/gemini-2.5-flash",
    "customInstruction": ""
  },
  "api": {
    "baseURL": "https://openrouter.ai/api/v1",
    "apiKey": "sk-..."
  },
  "storage": {
    "stateDir": ".opencode/om-state"
  }
}
```

### Config Reference
| Key | Default | Env Override | Description |
|-----|---------|-------------|-------------|
| `observation.messageTokens` | `30000` | `OM_OBSERVATION_MESSAGE_TOKENS` | Token threshold to trigger observation |
| `observation.model` | `google/gemini-2.5-flash` | `OM_OBSERVATION_MODEL` | Model for the observer agent |
| `observation.customInstruction` | -- | -- | Additional instruction injected into the observer prompt |
| `reflection.observationTokens` | `40000` | `OM_REFLECTION_OBSERVATION_TOKENS` | Token threshold to trigger reflection |
| `reflection.model` | `google/gemini-2.5-flash` | `OM_REFLECTION_MODEL` | Model for the reflector agent |
| `reflection.customInstruction` | -- | -- | Additional instruction injected into the reflector prompt |
| `api.baseURL` | -- | `OM_API_BASE_URL` | OpenAI-compatible base URL |
| `api.apiKey` | -- | `OM_API_KEY` > `OPENAI_API_KEY` | API key for LLM calls |
| `storage.stateDir` | `<worktree>/.opencode/om-state` | -- | Directory for session state JSON files |
### Debug Mode

Set `OM_DEBUG=1` to enable verbose logging to stderr for observation cycles, token counts, and state persistence.
## Tools

The plugin registers two tools available to the LLM during sessions:

### om_status

Returns current session memory metrics: observation token counts, thresholds, cursor mode, unobserved message count, cycle history, current task, and suggested response.

### om_observations

Returns the stored observation block for the session, including `<observations>`, `<current-task>`, and `<suggested-response>` sections.
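As an illustration, the stored block has roughly this shape; the contents below are invented examples, not real plugin output:

```xml
<observations>
- User prefers TypeScript strict mode across the repo.
- Decided to persist session state as JSON under .opencode/om-state.
</observations>
<current-task>Migrate config loading to the layered merge.</current-task>
<suggested-response>Continue implementing the project-config loader.</suggested-response>
```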
## Why Observational Memory?
Even models with large context windows degrade as the window fills. More raw history means more noise, worse adherence to instructions, and wasted tokens on content the agent no longer needs. Mastra calls these *context rot* and *context waste*, and its observational memory system addresses both.
The idea mirrors how human memory works: you don't remember every word of every conversation. You observe what happened, then your brain reflects by reorganizing, combining, and condensing into long-term memory. OM works the same way, compressing raw context into dense observations (typically 5-40x compression) that keep the agent on task over arbitrarily long sessions.
The result is a context window with three tiers:
- **Recent messages** - exact conversation history for the current task
- **Observations** - a dense log of what the observer has extracted
- **Reflections** - condensed observations when the log itself grows too large
For deeper background, see Mastra's observational memory docs and their announcement post covering the design rationale and LongMemEval benchmark results.
This plugin adapts the pattern for OpenCode's plugin hook system:
- Runs as an OpenCode plugin (event hooks + chat transforms) rather than framework middleware
- Text-based append-log with token-aware compaction, no vector DB or external storage needed
- Tracks `currentTask` and `suggestedResponse` for richer session continuity across compaction
- Supports independent custom instructions for observer and reflector agents
## Plugin Lifecycle
The plugin registers handlers at these OpenCode extension points:
| Hook | Trigger | Action |
|------|---------|--------|
| `session.created` | New session | Initialize/load session state from disk |
| `session.idle` | No active processing | Run observation cycle if thresholds met |
| `experimental.chat.messages.transform` | Before LLM call | Prune messages to unobserved window; run observation cycle |
| `experimental.chat.system.transform` | Before LLM call | Inject observations + continuation reminder into system prompt |
| `experimental.session.compacting` | Context compaction | Force observation cycle; inject observations into compaction context |
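A rough sketch of how a plugin might wire up the hooks in the table above. The handler signatures are simplified placeholders (and the real entry point is async and typed via `@opencode-ai/plugin`), so treat this as shape only:

```typescript
// Hypothetical, simplified plugin shape; not the plugin's actual entry point.
function observationalMemoryPlugin() {
  return {
    // Generic event handler covers session.created / session.idle.
    event: async ({ event }: { event: { type: string } }) => {
      if (event.type === "session.created") { /* initialize/load session state */ }
      if (event.type === "session.idle") { /* run observation cycle if thresholds met */ }
    },
    "experimental.chat.messages.transform": async () => {
      /* prune messages to unobserved window; run observation cycle */
    },
    "experimental.chat.system.transform": async () => {
      /* inject observations + continuation reminder into system prompt */
    },
    "experimental.session.compacting": async () => {
      /* force observation cycle; inject observations into compaction context */
    },
  };
}
```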
## Building from Source

```bash
npm install
npm run build      # tsc -> dist/
npm run typecheck  # type-check without emitting
npm run clean      # rm -rf dist
```

### Source Layout
```
src/
  index.ts    Plugin entry point, event/hook handlers, tools
  agents.ts   Observer and reflector LLM agent logic
  config.ts   Configuration loading and merging
  prompts.ts  Prompt templates for observer/reflector
  state.ts    Session state persistence (JSON files)
  tokens.ts   Token counting (js-tiktoken, o200k_base)
  types.ts    TypeScript interfaces
```

### Dependencies
- Runtime: `js-tiktoken` (token counting), `openai` (OpenAI-compatible API client)
- Dev: `@opencode-ai/plugin`, `@opencode-ai/sdk`, `typescript`
## License
MIT
