opencode-acm

v0.5.57

Published

9 days ago

Active Context Management plugin for OpenCode — pin, prune, scan, compact, and manage context with surgical precision

rickross

opencode plugin context memory acm

opencode-acm

Active Context Management for OpenCode. ACM helps you and the agent manage the working context more deliberately: keep important material available, compact old history on purpose, and remove large messages that are no longer useful.

Quick Start

Install the plugin in ~/.config/opencode/opencode.json:

{
  "plugin": ["opencode-acm"]
}

Or point at a local checkout while developing:

{
  "plugin": ["file:///path/to/opencode-acm"]
}

Then restart OpenCode and try:

acm_info
acm_scan
acm_pin with message_id=abc123
acm_compact with keep_active_minutes=30

OpenCode 1.3.x or later is required. Tested on 1.3.11.

To see ACM tool output inline in the TUI, enable the "Show generic tool output" option in OpenCode's settings.

Why Use ACM

When a session gets long, OpenCode eventually has to compact older history. Most of the time that is fine. Sometimes an important detail gets flattened into a summary or pushed too far out of view.

ACM is for the moments when a session contains something you do not want to lose track of, such as requirements, reproduction steps, schemas, or reference material.

It helps by making a few things easier to see and manage:

what is taking up space
what should stay easy to reach
what can be compacted
what can be removed from the active working set

Agents can use ACM directly once the plugin is installed, so context management can become part of normal tool use instead of a separate cleanup step.

Core Concepts

Active context

The portion of the session that OpenCode currently sends to the model.

Compaction boundary

A marker inserted into the session to tell OpenCode and ACM where active context begins.

Pinned messages

Messages you want ACM to treat as especially important.

Knowledge packages

Named file or inline-content loads created with acm_load. These are useful for API docs, requirements, schemas, or other reference material that should stay available until you explicitly unload them.

Tool Overview

| Category | Tools | Purpose |
| --- | --- | --- |
| Status | acm_info | Show ACM version, session, model, token usage, and runtime telemetry status |
| Pinning | acm_pin, acm_unpin, acm_mark | Mark messages as important and manage pin state |
| Pruning | acm_scan, acm_prune | Find large messages and compact specific ones |
| Loading | acm_load, acm_unload | Load and unload named knowledge packages |
| Inspection | acm_map, acm_scan, acm_search, acm_fetch | Understand context usage and find specific messages |
| Compaction | acm_compact | Move the active-context boundary forward |
| Housekeeping | acm_snapshot, acm_diagnose, acm_repair | Capture state, inspect corruption, and repair broken sessions |

Common Workflows

Pin something important

acm_pin
acm_pin with message_id=abc123

Find and remove bloat

acm_scan
acm_prune with targets=[abc123, def456]

Load a knowledge package

acm_load with name="API Docs" file="~/project/openapi.json"
acm_unload with name="API Docs"

Understand context usage

acm_info
acm_map
acm_scan with show_compacted=true

Runtime Telemetry

ACM injects a small <runtime-telemetry> block into each turn. This block is not user-authored. It is a runtime hint for the model that includes the current time and context usage.

Example:

<runtime-telemetry>
  <!-- Auto-injected by ACM — not from the user -->
  <time>Tue, Apr 21, 2026 at 04:08 PM CDT</time>
  <context-status tokens="121,822" percent="44%" limit="275,000" date="2026-04-21" time="16:08 CDT" />
</runtime-telemetry>

This is useful when you want the model to stay aware of:

current local time
approximate context usage
the current working limit for the active model

acm_info reports the same status in tool form, along with ACM version, session information, message counts, and whether runtime telemetry is enabled.

On the first turn after a restart, the telemetry may not yet have a resolved context limit. That is expected. ACM cannot read the runtime model-limit data until after a full turn has completed. On the next turn, it should show the resolved limit and percentage normally.
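
As a rough illustration of how the percentage in the example relates to the token count and limit, here is a minimal TypeScript sketch. The function name contextPercent and the rounding rule are assumptions made for illustration, not ACM's actual code. The null branch mirrors the first-turn case in which the limit has not been resolved yet.

```typescript
// Hypothetical helper, not part of ACM's API.
// Returns context usage as a whole-number percentage, or null when the
// model limit is not resolved yet (for example, first turn after a restart).
function contextPercent(tokens: number, limit: number | undefined): number | null {
  if (limit === undefined || limit <= 0) {
    return null;
  }
  // Assumption: the displayed value is rounded to the nearest integer.
  return Math.round((tokens / limit) * 100);
}

// Values taken from the <context-status> example above.
const resolvedPercent = contextPercent(121822, 275000);
const unresolvedPercent = contextPercent(121822, undefined);
console.log(resolvedPercent, unresolvedPercent);
```

With the example's numbers, 121,822 of 275,000 tokens works out to roughly 44.3%, which matches the 44% shown in the block.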

Disabling runtime telemetry

Runtime telemetry is enabled by default.

You can disable it in two ways:

Plugin option in opencode.json (per agent):

{
  "plugin": [
    ["opencode-acm@latest", {
      "runtimeTelemetry": false
    }]
  ]
}

Environment variable in the agent environment:

OPENCODE_ACM_RUNTIME_TELEMETRY=0

The legacy OPENCODE_ACM_SYSTEM_REMINDER environment variable is still accepted for backward compatibility.

If both are present, the environment variable takes precedence.
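
The precedence rule can be sketched as follows. This is an illustration only: the function name, the Env type, and the reading that any value other than "0" enables telemetry are assumptions. The legacy OPENCODE_ACM_SYSTEM_REMINDER variable is left out because its exact interaction is not described here.

```typescript
// Hypothetical resolver that illustrates precedence only; ACM's real parsing may differ.
type Env = Record<string, string | undefined>;

function runtimeTelemetryEnabled(pluginOption: boolean | undefined, env: Env): boolean {
  const raw = env["OPENCODE_ACM_RUNTIME_TELEMETRY"];
  if (raw !== undefined) {
    // Assumption: "0" disables and any other value enables.
    return raw !== "0";
  }
  // No environment override: use the plugin option, defaulting to enabled.
  return pluginOption ?? true;
}

// The plugin option asks for telemetry, but the environment variable wins.
const telemetryOn = runtimeTelemetryEnabled(true, { OPENCODE_ACM_RUNTIME_TELEMETRY: "0" });
console.log(telemetryOn);
```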

Compact heartbeat

For fast-moving sessions where full runtime telemetry is too noisy, ACM also supports a compact heartbeat line appended to the last user message.

Heartbeat injection is opt-in. Set heartbeat to true to enable it.

Example:

{
  "plugin": [
    ["opencode-acm@latest", {
      "runtimeTelemetry": false,
      "heartbeat": true,
      "heartbeat_format": "[submitted at: {time} | {context_pct}% | {model} | msgs:{messages}]",
      "heartbeatTz": "America/Chicago"
    }]
  ]
}

See HEARTBEAT.md for the full variable reference.

The heartbeat format can also include literal text, such as "not typed by Rick". ACM only substitutes recognized {variables} and leaves the rest of the string alone. Set heartbeat to false or omit it to disable injection; heartbeat_format only controls the rendered text.
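
To make the substitution rule concrete, here is a small sketch of a format renderer. The function renderHeartbeat is hypothetical, the variable names are the ones used in the example format above, and leaving unknown placeholders untouched is an assumption about how "only substitutes recognized variables" behaves.

```typescript
// Hypothetical renderer; the authoritative variable list is in HEARTBEAT.md.
function renderHeartbeat(format: string, vars: Record<string, string>): string {
  // Replace {name} only when name is a known variable; keep everything else as written.
  return format.replace(/\{(\w+)\}/g, (match: string, name: string): string => {
    const known = Object.prototype.hasOwnProperty.call(vars, name);
    const value = known ? vars[name] : undefined;
    return value !== undefined ? value : match;
  });
}

const heartbeatLine = renderHeartbeat(
  "[submitted at: {time} | {context_pct}% | msgs:{messages}] not typed by Rick {unknown}",
  { time: "04:08 PM", context_pct: "44", messages: "128" }
);
console.log(heartbeatLine);
```

Literal text passes through unchanged, so a note such as "not typed by Rick" can sit alongside the substituted values.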

Context Guard

ACM can inject a separate tail advisory when context usage approaches a configured effective limit. Unlike runtimeTelemetry and heartbeat, the guard stays quiet until a threshold is crossed. It then adds a one-turn synthetic advisory at the end of the message stream, which keeps the stable prompt prefix cache-friendly.

The guard is disabled by default. Enable it per agent in .opencode/opencode-acm.json:

{
  "contextGuard": {
    "enabled": true,
    "effectiveLimitTokens": 300000,
    "softPercent": 70,
    "warnPercent": 82,
    "criticalPercent": 90,
    "cooldownTurns": 2,
    "preferredRole": "system",
    "alternateRole": "user",
    "includeActions": true
  }
}

Config resolution order (later sources override earlier ones):

~/.config/opencode/opencode-acm.json
<agent-or-workspace>/.opencode/opencode-acm.json
explicit config file from OPENCODE_ACM_CONFIG
plugin tuple options
environment variables
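
One way to picture this layering is a simple key-by-key merge. The sketch below assumes that each later source overrides an earlier one for every key it sets, which is consistent with environment variables taking precedence elsewhere in this README. The GuardConfig type and resolveConfig function are hypothetical, not ACM's loader.

```typescript
// Hypothetical types and merge; ACM's real loader may differ.
type GuardConfig = {
  enabled?: boolean;
  effectiveLimitTokens?: number;
  softPercent?: number;
  warnPercent?: number;
  criticalPercent?: number;
};

// Layers are passed in resolution order. A later layer overrides an earlier
// one for each key it defines (assumption).
function resolveConfig(layers: GuardConfig[]): GuardConfig {
  return layers.reduce<GuardConfig>((merged, layer) => ({ ...merged, ...layer }), {});
}

const mergedConfig = resolveConfig([
  { enabled: false, softPercent: 70 },             // ~/.config/opencode/opencode-acm.json
  { enabled: true, effectiveLimitTokens: 300000 }, // <agent-or-workspace>/.opencode/opencode-acm.json
  {},                                              // file named by OPENCODE_ACM_CONFIG (unset here)
  { warnPercent: 82 },                             // plugin tuple options
  { effectiveLimitTokens: 360000 }                 // environment variables
]);
console.log(mergedConfig);
```

In this example the workspace file re-enables the guard that the global file disabled, and the environment layer has the final say on the token limit.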

Use effectiveLimitTokens for the operational budget, not the advertised model maximum. For example, a model may advertise a 1M token context but become expensive or less reliable above 360K; set effectiveLimitTokens to 360000 and let the percentages apply to that cap.

If effectiveLimitTokens is omitted, ACM uses the resolved OpenCode model context limit when available, then OPENCODE_CONTEXT_STATUS_LIMIT if set.
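
The threshold arithmetic and the limit fallback can be sketched together. All names here (GuardLevel, resolveLimit, guardLevel) are hypothetical, and treating a threshold as crossed once usage reaches it is an assumption. The cooldownTurns setting and the advisory text are not modeled.

```typescript
// Hypothetical names throughout: a sketch of the arithmetic, not ACM's code.
type GuardLevel = "ok" | "soft" | "warn" | "critical";

interface GuardThresholds {
  softPercent: number;
  warnPercent: number;
  criticalPercent: number;
}

// Pick the limit in the documented fallback order: configured value,
// then the resolved model limit, then OPENCODE_CONTEXT_STATUS_LIMIT.
function resolveLimit(
  effectiveLimitTokens: number | undefined,
  modelLimit: number | undefined,
  envLimit: string | undefined
): number | undefined {
  if (effectiveLimitTokens !== undefined) return effectiveLimitTokens;
  if (modelLimit !== undefined) return modelLimit;
  const parsed = envLimit === undefined ? NaN : parseInt(envLimit, 10);
  return isFinite(parsed) && parsed > 0 ? parsed : undefined;
}

function guardLevel(tokens: number, limit: number, t: GuardThresholds): GuardLevel {
  // Compare tokens * 100 against limit * percent to stay in integer arithmetic.
  const scaled = tokens * 100;
  if (scaled >= limit * t.criticalPercent) return "critical";
  if (scaled >= limit * t.warnPercent) return "warn";
  if (scaled >= limit * t.softPercent) return "soft";
  return "ok";
}

const thresholds: GuardThresholds = { softPercent: 70, warnPercent: 82, criticalPercent: 90 };
const guardLimit = resolveLimit(360000, 1000000, undefined);
if (guardLimit !== undefined) {
  console.log(guardLevel(300000, guardLimit, thresholds));
}
```

With a 360,000-token cap and the percentages from the config above, the soft, warn, and critical thresholds fall at 252,000, 295,200, and 324,000 tokens respectively.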

Environment variables:

OPENCODE_ACM_CONTEXT_GUARD=1
OPENCODE_ACM_CONTEXT_GUARD_LIMIT=300000
OPENCODE_ACM_CONTEXT_GUARD_SOFT_PERCENT=70
OPENCODE_ACM_CONTEXT_GUARD_WARN_PERCENT=82
OPENCODE_ACM_CONTEXT_GUARD_CRITICAL_PERCENT=90
OPENCODE_ACM_CONTEXT_GUARD_COOLDOWN_TURNS=2
OPENCODE_ACM_CONTEXT_GUARD_ROLE=system

Tail system messages are model/provider/harness dependent. If a target model does not preserve or accept tail system messages, configure preferredRole as user; the advisory text explicitly says it is internal and should not be answered directly. ACM never injects the advisory as an assistant message.

Compact to the last 30 active minutes

acm_compact with keep_active_minutes=30

Swap Pattern

If you need to work in a smaller context window while keeping reference material available, ACM supports a simple manual swap pattern:

acm_load important files or notes as named knowledge packages
acm_compact to move the active boundary forward
work in the leaner active window
acm_unload and acm_load packages as your task changes

This is still a manual workflow. The value is that it gives you a predictable way to keep reference material around while keeping the active window smaller.

Caveats

ACM uses OpenCode's native compaction marker format, so both systems agree on the active boundary.
acm_prune and acm_compact affect what the model sees on subsequent turns, not retroactively.
acm_scan and acm_map are size-oriented inspection tools. They are useful for finding bloat, but they should be treated as rough guidance rather than exact token accounting.
Knowledge packages remain loaded until explicitly unloaded.

How It Works

The plugin registers four hooks:

tool: registers the ACM tools
experimental.chat.messages.transform: replaces compacted message content with stubs before the model sees them
experimental.chat.system.transform: strips stale reminders and caches model limits for reminder injection
event: listens for session.updated to finalize MKP pinning after acm_load
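
The stub replacement done in the messages transform hook might look roughly like the sketch below. The ChatMessage shape, the stubCompacted function, and the stub text are all assumptions for illustration; OpenCode's real message types carry more fields and ACM's actual stub wording is not shown here.

```typescript
// Hypothetical message shape and stub text; OpenCode's real types are richer.
interface ChatMessage {
  id: string;
  role: "user" | "assistant" | "system";
  content: string;
}

// Replace the content of every compacted message with a short stub.
// Other messages are passed through and the input array is not modified.
function stubCompacted(messages: ChatMessage[], compactedIds: string[]): ChatMessage[] {
  return messages.map((message) =>
    compactedIds.indexOf(message.id) !== -1
      ? { ...message, content: "[compacted by ACM: " + message.id + "]" }
      : message
  );
}

const sampleMessages: ChatMessage[] = [
  { id: "abc123", role: "assistant", content: "a very large tool result" },
  { id: "def456", role: "user", content: "keep this requirement" }
];
const stubbedMessages = stubCompacted(sampleMessages, ["abc123"]);
console.log(stubbedMessages.map((m) => m.content));
```

Because the transform runs before the model sees the messages, the stored history can stay intact while the outgoing payload shrinks.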

ACM state is stored in acm.db alongside OpenCode's own database. It does not require schema changes to OpenCode itself.

Compaction boundaries use OpenCode's native marker format: a user message with a compaction part paired with a summary assistant message.
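
Locating the active boundary from that marker pair could be sketched as below. The message shape, the "compaction" part type string, and the summary flag are simplifying assumptions, as is the choice to treat the marker message itself as the start of active context.

```typescript
// Hypothetical, simplified shapes; real OpenCode messages carry more fields.
interface Part {
  type: string;
}

interface SessionMessage {
  role: "user" | "assistant";
  parts: Part[];
  summary?: boolean; // assumption: flags the summary assistant message
}

// Return the index of the most recent compaction boundary: a user message
// with a "compaction" part followed by a summary assistant message.
// Returns 0 when the session has no boundary.
function activeContextStart(messages: SessionMessage[]): number {
  for (let i = messages.length - 2; i >= 0; i--) {
    const marker = messages[i];
    const next = messages[i + 1];
    if (marker === undefined || next === undefined) continue;
    const hasCompactionPart = marker.parts.some((p) => p.type === "compaction");
    if (marker.role === "user" && hasCompactionPart && next.role === "assistant" && next.summary === true) {
      return i;
    }
  }
  return 0;
}

const sessionMessages: SessionMessage[] = [
  { role: "user", parts: [{ type: "text" }] },
  { role: "assistant", parts: [{ type: "text" }] },
  { role: "user", parts: [{ type: "compaction" }] },
  { role: "assistant", parts: [{ type: "text" }], summary: true },
  { role: "user", parts: [{ type: "text" }] }
];
console.log(activeContextStart(sessionMessages));
```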

Credits

Built by Rick Ross and a team of AI agents while working on iRelate, after repeatedly running into context-management problems in long sessions.

MIT License.
