@vectorize-io/hindsight-openclaw

v0.8.0

Published

a month ago

Hindsight memory plugin for OpenClaw - biomimetic long-term memory with fact extraction

nicoloboschi

dk09876

benfrank241

openclaw memory ai agent hindsight long-term-memory

Hindsight Memory Plugin for OpenClaw

Biomimetic long-term memory for OpenClaw using Hindsight. Automatically captures conversations and intelligently recalls relevant context.

Quick Start

# 1. Install the plugin
openclaw plugins install @vectorize-io/hindsight-openclaw

# 2. Run the interactive setup wizard
npx --package @vectorize-io/hindsight-openclaw hindsight-openclaw-setup

# 3. Start OpenClaw
openclaw gateway

hindsight-openclaw-setup walks you through picking one of three modes:

Cloud — managed Hindsight. Paste your cloud API token, done.
External API — your own running Hindsight deployment. Prompts for the URL and optional token.
Embedded daemon — spawns a local hindsight-embed daemon on this machine. Prompts for the LLM provider (OpenAI / Anthropic / Gemini / Groq / Claude Code / Codex / Ollama) and its API key.

The interactive wizard stores credentials inline in openclaw.json for simplicity — the value is masked as you paste it. For CI / production you can store credentials as a SecretRef (resolved from an env var, file, or exec source at startup) by using the non-interactive flags with --token-env / --api-key-env, or by switching an existing field afterwards with openclaw config set ... --ref-source env --ref-id ….

Manual configuration (without the wizard)

The wizard is a convenience wrapper — all of the same fields can be set directly with openclaw config set:

# Embedded daemon with OpenAI
openclaw config set plugins.entries.hindsight-openclaw.config.llmProvider openai
openclaw config set plugins.entries.hindsight-openclaw.config.llmApiKey \
    --ref-source env --ref-provider default --ref-id OPENAI_API_KEY

# Or: Claude Code (no API key needed)
openclaw config set plugins.entries.hindsight-openclaw.config.llmProvider claude-code

# Or: point at an external Hindsight API
openclaw config set plugins.entries.hindsight-openclaw.config.hindsightApiUrl https://mcp.hindsight.example.com
openclaw config set plugins.entries.hindsight-openclaw.config.hindsightApiToken \
    --ref-source env --ref-id HINDSIGHT_API_TOKEN

Migrating from 0.5.x

0.6.0 removes all process-environment reads from the plugin. Configuration that previously came from shell env vars must now go through OpenClaw's plugin config (with SecretRef for credentials). Concrete mappings:

| Old (0.5.x) | New (0.6.0) |
| --- | --- |
| OPENAI_API_KEY=… (auto-detected) | openclaw config set plugins.entries.hindsight-openclaw.config.llmProvider openai; openclaw config set plugins.entries.hindsight-openclaw.config.llmApiKey --ref-source env --ref-id OPENAI_API_KEY |
| HINDSIGHT_API_LLM_PROVIDER=… | openclaw config set plugins.entries.hindsight-openclaw.config.llmProvider … |
| HINDSIGHT_API_LLM_MODEL=… | openclaw config set plugins.entries.hindsight-openclaw.config.llmModel … |
| HINDSIGHT_API_LLM_API_KEY=… | openclaw config set plugins.entries.hindsight-openclaw.config.llmApiKey --ref-source env --ref-id … |
| HINDSIGHT_API_LLM_BASE_URL=… | openclaw config set plugins.entries.hindsight-openclaw.config.llmBaseUrl … |
| HINDSIGHT_EMBED_API_URL=… | openclaw config set plugins.entries.hindsight-openclaw.config.hindsightApiUrl … |
| HINDSIGHT_EMBED_API_TOKEN=… | openclaw config set plugins.entries.hindsight-openclaw.config.hindsightApiToken --ref-source env --ref-id … |
| HINDSIGHT_BANK_ID=… | openclaw config set plugins.entries.hindsight-openclaw.config.bankId … |
| llmApiKeyEnv: "MY_KEY" (plugin config) | llmApiKey configured as a SecretRef with --ref-id MY_KEY |

If your shell already exports OPENAI_API_KEY, the SecretRef config above resolves to the same value at startup — no need to change your shell setup, just point the plugin at the variable explicitly. Run openclaw config validate after migrating to confirm the new shape parses cleanly.
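The mapping table can also be applied mechanically. The following is a minimal sketch, not part of the package: `migrationCommands` is a hypothetical helper that turns the old `HINDSIGHT_*` variables into `openclaw config set` commands. It assumes the SecretRef should reuse the old variable name as its `--ref-id` (the table leaves that value elided), it does not shell-quote values, and it leaves out the auto-detected `OPENAI_API_KEY` case.

```typescript
const CONFIG_PATH = "plugins.entries.hindsight-openclaw.config";

// Old env var -> new config field, copied from the mapping table.
const PLAIN_FIELDS: Record<string, string> = {
  HINDSIGHT_API_LLM_PROVIDER: "llmProvider",
  HINDSIGHT_API_LLM_MODEL: "llmModel",
  HINDSIGHT_API_LLM_BASE_URL: "llmBaseUrl",
  HINDSIGHT_EMBED_API_URL: "hindsightApiUrl",
  HINDSIGHT_BANK_ID: "bankId",
};

// Credentials become SecretRefs instead of inline values.
const SECRET_FIELDS: Record<string, string> = {
  HINDSIGHT_API_LLM_API_KEY: "llmApiKey",
  HINDSIGHT_EMBED_API_TOKEN: "hindsightApiToken",
};

// Hypothetical helper: returns the commands to run, in table order.
function migrationCommands(env: Record<string, string | undefined>): string[] {
  const commands: string[] = [];
  for (const name of Object.keys(PLAIN_FIELDS)) {
    const value = env[name];
    if (value) {
      commands.push(`openclaw config set ${CONFIG_PATH}.${PLAIN_FIELDS[name]} ${value}`);
    }
  }
  for (const name of Object.keys(SECRET_FIELDS)) {
    if (env[name]) {
      // Assumption: keep the old variable name as the SecretRef id.
      commands.push(
        `openclaw config set ${CONFIG_PATH}.${SECRET_FIELDS[name]} --ref-source env --ref-id ${name}`
      );
    }
  }
  return commands;
}
```

Note that the secret value itself never appears in the generated commands; only the variable name does, which matches the intent of the SecretRef approach.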

Features

Auto-capture and auto-recall of memories each turn, injected into system prompt space so recalled memories stay out of the visible chat transcript
Memory isolation — configurable per agent, channel, user, or provider via dynamicBankGranularity
Historical backfill CLI — import prior OpenClaw session history into Hindsight using the active plugin bank-routing config by default
Retention controls — choose which message roles to retain, toggle auto-retain on/off, and stamp retained documents with consistent tags/source metadata
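To make the memory isolation feature more concrete, here is a minimal sketch of how a bank key could be derived from the fields named in dynamicBankGranularity. `dynamicBankKey` is a hypothetical function, and both the `::` separator and the `-` prefix join are assumptions for illustration; the plugin's real bank ID format is not documented here and may differ.

```typescript
type BankField = "agent" | "channel" | "user" | "provider";
type MessageContext = Record<BankField, string>;

// Hypothetical derivation: join the selected context fields in the configured order.
// The "::" separator and "-" prefix join are assumptions, not the plugin's format.
function dynamicBankKey(
  granularity: BankField[],
  ctx: MessageContext,
  prefix?: string
): string {
  const base = granularity.map((field) => ctx[field]).join("::");
  return prefix ? `${prefix}-${base}` : base;
}
```

The point of the sketch is the isolation behaviour: with a granularity of agent and user only, the same user talking to the same agent in two channels maps to one bank, while adding channel to the list splits them into separate banks.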

Configuration

Optional settings in ~/.openclaw/openclaw.json under plugins.entries.hindsight-openclaw.config:

| Option | Default | Description |
| --- | --- | --- |
| apiPort | 9077 | Port for the local Hindsight daemon |
| daemonIdleTimeout | 0 | Seconds before daemon shuts down from inactivity (0 = never) |
| embedPort | 0 | Port for hindsight-embed server (0 = auto-assign) |
| embedVersion | "latest" | hindsight-embed version |
| embedPackagePath | (none) | Local path to hindsight-embed package for development |
| bankMission | (none) | Mission stamped onto the bank's reflect_mission column on first use. Only affects the reflect operation; it does not steer retain or recall. Leave unset (or empty) to manage missions out-of-band via PATCH /banks/{id}. |
| retainMission | (none) | Mission stamped onto the bank's retain_mission column on first use. Steers what gets extracted as facts during retain. Leave unset to use built-in extraction rules. |
| observationsMission | (none) | Mission stamped onto the bank's observations_mission column on first use. Controls what gets synthesised into observations during consolidation. |
| llmProvider | (none) | LLM provider for memory extraction (openai, anthropic, gemini, groq, ollama, openai-codex, claude-code). Required unless hindsightApiUrl is set. |
| llmModel | provider default | LLM model used with llmProvider |
| llmApiKey | (none) | API key for the LLM provider. Sensitive: set via openclaw config set ... --ref-source env --ref-id OPENAI_API_KEY to reference an env var (or --ref-source file/exec for mounted-secret/Vault sources). |
| llmBaseUrl | (none) | Optional base URL override for OpenAI-compatible providers (e.g. https://openrouter.ai/api/v1) |
| dynamicBankId | true | Enable per-context memory banks |
| bankId | (none) | Static bank ID used when dynamicBankId is false. |
| bankIdPrefix | (none) | Prefix for bank IDs (e.g. "prod") |
| retainTags | [] | Tags applied to every retained document, useful for cross-agent/source labeling (e.g. source_system:openclaw, agent:agentname). Auto-retain also merges inline per-message tags from <retain_tags>...</retain_tags> or <hindsight_retain_tags>...</hindsight_retain_tags> blocks in user messages. |
| retainSource | "openclaw" | source value written into retained document metadata |
| dynamicBankGranularity | ["agent", "channel", "user"] | Fields used to derive bank ID. Options: agent, channel, user, provider |
| excludeProviders | ["heartbeat"] | Message providers to skip for recall/retain (e.g. heartbeat, slack, telegram, discord) |
| autoRecall | true | Auto-inject memories before each turn. Set to false when the agent has its own recall tool. |
| autoRetain | true | Auto-retain conversations after each turn |
| retainRoles | ["user", "assistant"] | Which message roles to retain. Options: user, assistant, system, tool |
| retainFormat | "json" | Serialization format for retained conversation content. "json" emits a structured array of {role, content} messages (matches Claude Code). "text" emits legacy [role: x] … [x:end] markers. |
| retainToolCalls | true | With retainFormat: "json", each message's content is an Anthropic-shaped block array (text / tool_use / tool_result). Tool results are truncated at 2000 chars. Hindsight's own MCP tools (recall/retain/search/…) are filtered to prevent feedback loops. Set false to retain text-only content. |
| retainEveryNTurns | 1 | Retain every Nth turn. 1 = every turn (default). Values > 1 enable chunked retention with a sliding window. |
| retainOverlapTurns | 0 | Extra prior turns included when chunked retention fires. Window = retainEveryNTurns + retainOverlapTurns. Only applies when retainEveryNTurns > 1. |
| recallBudget | "mid" | Recall effort: low, mid, or high. Higher budgets use more retrieval strategies. |
| recallMaxTokens | 1024 | Max tokens for recall response. Controls how much memory context is injected per turn. |
| recallTypes | ["observation"] | Memory types to recall. Options: world, experience, observation. Defaults to observations (the consolidated, deduplicated view) to avoid surfacing the same answer multiple times when many raw memories say the same thing. |
| recallRoles | ["user", "assistant"] | Roles included when building prior context for recall query composition. Options: user, assistant, system, tool. |
| recallTopK | (none) | Max number of memories to inject per turn. Applied after API response as a hard cap. |
| recallContextTurns | 1 | Number of user turns to include when composing recall query context. 1 keeps latest-message-only behavior. |
| recallMaxQueryChars | 800 | Maximum character length for the composed recall query before calling recall. |
| recallPromptPreamble | built-in string | Prompt text placed above recalled memories in the injected <hindsight_memories> system-context block. |
| hindsightApiUrl | (none) | External Hindsight API URL (skips local daemon) |
| hindsightApiToken | (none) | Auth token for external API. Sensitive: set via openclaw config set ... --ref-source env --ref-id HINDSIGHT_API_TOKEN. |
| ignoreSessionPatterns | [] | Session key glob patterns to skip entirely: no recall, no retain (e.g. ["agent:*:cron:**"]) |
| statelessSessionPatterns | [] | Session key glob patterns for read-only sessions: retain is always skipped; recall is skipped when skipStatelessSessions is true (e.g. ["agent:*:subagent:**", "agent:*:heartbeat:**"]) |
| skipStatelessSessions | true | When true, sessions matching statelessSessionPatterns also skip recall. Set to false to allow recall but still skip retain. |
| debugPerfTiming | false | Emit one info-level perf line per before_prompt_build (recall path) and agent_end (retain path) so you can spot whether latency is in the plugin or upstream. Off by default. Format: perf: <hook> hook_total=Xms <hook-specific fields>. Safe in production, since it uses the existing logger. |
| enableKnowledgeTools | false | Register agent_knowledge_* tools for explicit agent-driven lookup, reflection, ingest, and knowledge-page management. Set automatically by the self-driving-agents CLI. |
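The interaction between retainEveryNTurns and retainOverlapTurns is easier to see with numbers. The sketch below is illustrative only: `retentionWindow` is a hypothetical function, turn numbers are assumed to be 1-based, and it assumes retention fires whenever the turn count is a multiple of retainEveryNTurns. Only the window size formula (retainEveryNTurns + retainOverlapTurns) comes from the table.

```typescript
// Hypothetical sketch of chunked retention. Returns the turn numbers that would be
// retained together when retention fires on `turn`, or null when it does not fire.
function retentionWindow(
  turn: number,
  everyN: number,
  overlap: number
): number[] | null {
  if (everyN <= 1) return [turn]; // every turn is retained; overlap does not apply
  if (turn % everyN !== 0) return null; // assumption: fires on multiples of everyN
  const size = everyN + overlap; // Window = retainEveryNTurns + retainOverlapTurns
  const first = Math.max(1, turn - size + 1); // clip at the start of the session
  const turns: number[] = [];
  for (let t = first; t <= turn; t++) turns.push(t);
  return turns;
}
```

Under these assumptions, retainEveryNTurns: 3 with retainOverlapTurns: 1 retains turns 3 to 6 together at turn 6, so turn 3 appears in two consecutive windows and gives the extractor some continuity between chunks.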

Manual Knowledge Tools

When enableKnowledgeTools is enabled, the plugin registers explicit agent_knowledge_* tools in addition to automatic recall. Use agent_knowledge_recall for ordinary memory lookup. Use agent_knowledge_reflect only for deliberate synthesis, retrospectives, or long-term preference/pattern questions; it retrieves memories and then calls the configured Reflect LLM to generate an answer.

agent_knowledge_reflect uses conservative defaults: budget: "low", max_tokens: 1024, and fact_types: ["world", "experience", "observation"]. Production deployments should also set a finite bank-level reflect_source_facts_max_tokens value, such as 4096 or 8192, rather than leaving reflection source facts unlimited.

Session pattern filtering

ignoreSessionPatterns and statelessSessionPatterns accept glob patterns matched against the session key (format: agent:<agentId>:<type>:<uuid>).

Glob syntax:

* — matches any characters except : (single segment)
** — matches anything including : (multiple segments)

| Pattern | Matches |
| --- | --- |
| agent:*:cron:** | All cron sessions for any agent |
| agent:*:subagent:** | All subagent sessions for any agent |
| agent:main:** | All sessions under the main agent |
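The two wildcard rules above can be expressed as a small glob-to-regex translation. This is a minimal sketch of the documented semantics, not the plugin's own matcher; `sessionGlobToRegExp` and `matchesSessionPattern` are hypothetical names.

```typescript
// Hypothetical helper: translate a session-key glob into an anchored RegExp.
function sessionGlobToRegExp(pattern: string): RegExp {
  let source = "";
  let i = 0;
  while (i < pattern.length) {
    if (pattern.slice(i, i + 2) === "**") {
      source += ".*"; // "**" matches anything, including ":"
      i += 2;
    } else if (pattern.charAt(i) === "*") {
      source += "[^:]*"; // "*" stays within a single ":"-delimited segment
      i += 1;
    } else {
      // Escape regex metacharacters so literal text matches literally.
      source += pattern.charAt(i).replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp("^" + source + "$");
}

// True when the session key matches at least one pattern in the list.
function matchesSessionPattern(sessionKey: string, patterns: string[]): boolean {
  return patterns.some((p) => sessionGlobToRegExp(p).test(sessionKey));
}
```

For example, `matchesSessionPattern("agent:main:cron:1234", ["agent:*:cron:**"])` is true, while the single-segment pattern `agent:*` does not match `agent:main:cron` because `*` cannot cross a colon.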

Difference between the two options:

| Operation | ignoreSessionPatterns | statelessSessionPatterns |
| --- | --- | --- |
| Retain | Skipped | Always skipped |
| Recall | Skipped | Skipped only when skipStatelessSessions: true |

Example config — exclude cron jobs from memory entirely, allow subagents to read but not write memories:

{
  "ignoreSessionPatterns": ["agent:*:cron:**"],
  "statelessSessionPatterns": ["agent:*:subagent:**"],
  "skipStatelessSessions": false
}
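The comparison between the two options amounts to a small decision function. The sketch below is a hypothetical restatement of that table: it takes the pattern-match results as booleans, assumes that ignoreSessionPatterns wins when a session matches both lists, and ignores the separate autoRecall / autoRetain switches.

```typescript
interface SessionFilterInput {
  matchesIgnore: boolean; // session key matched ignoreSessionPatterns
  matchesStateless: boolean; // session key matched statelessSessionPatterns
  skipStatelessSessions: boolean;
}

interface MemoryActions {
  recall: boolean;
  retain: boolean;
}

// Hypothetical decision function mirroring the comparison table.
function memoryActionsFor(input: SessionFilterInput): MemoryActions {
  if (input.matchesIgnore) {
    return { recall: false, retain: false }; // ignored: no recall, no retain
  }
  if (input.matchesStateless) {
    // Stateless: never retain; recall only when skipStatelessSessions is false.
    return { recall: !input.skipStatelessSessions, retain: false };
  }
  return { recall: true, retain: true };
}
```

With the example config above, a subagent session yields `{ recall: true, retain: false }` and a cron session yields `{ recall: false, retain: false }`.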

Retention details

Retained documents use stable session-scoped IDs derived from the OpenClaw sessionKey. By default (retainDocumentScope: 'session') every retain in a session shares one document ID like openclaw:agent:agentname:discord:channel:123, so all turns of the conversation accumulate under a single Hindsight document. Set retainDocumentScope: 'turn' to fall back to per-retain IDs (...:turn:000001, ...:window:000002 for chunked retention). Either way, retained documents include metadata such as session_key, agent_id, provider, channel_id, thread_id, sender_id, turn_index, and retention_scope. Each message in the retained JSON also carries a structured timestamp field (ISO 8601) lifted from OpenClaw's per-message time, so facts are not polluted by inline weekday/date prefixes.
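The ID shapes quoted above can be reproduced with a short function. This is a sketch inferred from the examples only: `retainDocumentId` is a hypothetical name, the `openclaw:` prefix followed by the session key is an assumption based on the sample ID, and how the plugin actually numbers turns and windows is not specified here.

```typescript
type RetainDocumentScope = "session" | "turn";

// Hypothetical reconstruction of the document ID shapes shown in the examples.
function retainDocumentId(
  sessionKey: string,
  scope: RetainDocumentScope,
  seq: number = 1,
  chunked: boolean = false
): string {
  const base = "openclaw:" + sessionKey; // assumption: prefix + session key
  if (scope === "session") return base; // one document for the whole session
  const kind = chunked ? "window" : "turn"; // chunked retention uses "window" ids
  const padded = ("000000" + seq).slice(-6); // zero-pad to six digits, e.g. 000001
  return base + ":" + kind + ":" + padded;
}
```

With session scope, repeated retains return the same ID, which is why every turn accumulates under one document; with turn scope each retain gets its own numbered ID.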

Documentation

For full documentation, configuration options, troubleshooting, and development guide, see:

OpenClaw Integration Documentation

Development

To test local changes to the Hindsight package before publishing:

Add embedPackagePath to your plugin config in ~/.openclaw/openclaw.json:

{
  "plugins": {
    "entries": {
      "hindsight-openclaw": {
        "enabled": true,
        "config": {
          "embedPackagePath": "/path/to/hindsight-wt3/hindsight-embed"
        }
      }
    }
  }
}

The plugin will then run uv run --directory <path> hindsight-embed instead of uvx hindsight-embed@latest.

To inspect the daemon under a specific profile while testing (here, the openclaw profile):

# Check daemon status
uvx hindsight-embed@latest -p openclaw daemon status

# View logs
tail -f ~/.hindsight/profiles/openclaw.log

# List profiles
uvx hindsight-embed@latest profile list

Backfilling Existing OpenClaw History

The package includes a config-aware backfill CLI for importing historical OpenClaw sessions into Hindsight.

By default it mirrors the active plugin settings for:

dynamicBankId
dynamicBankGranularity
bankIdPrefix
local daemon vs external hindsightApiUrl

Dry-run example:

npx --package @vectorize-io/hindsight-openclaw hindsight-openclaw-backfill \
  --openclaw-root ~/.openclaw \
  --dry-run

Direct invocation from a built checkout:

node dist/backfill.js --openclaw-root ~/.openclaw --dry-run

Migration-oriented overrides are explicit:

node dist/backfill.js \
  --openclaw-root ~/.openclaw \
  --bank-strategy agent \
  --agent proj-run \
  --resume \
  --max-pending-operations 10

Useful options:

--agent <id>: limit import to selected agents
--exclude-archive: ignore sessions-archive-from-migration_backup
--bank-strategy mirror-config|agent|fixed
--resume: skip only entries already finalized as completed
--checkpoint <path>: store progress outside the default location
--wait-until-drained: block until the touched bank queues have finished and checkpoint state can be finalized

License

MIT