experience-replay

v0.1.1

Published

4 months ago

OpenClaw plugin for contextual experience replay and self-improvement.

tirpitzia

openclaw-plugin openclaw experience-replay agent-memory

openclaw-experience-replay

Contextual experience replay plugin for OpenClaw. It stores successful task trajectories, retrieves similar past wins before a run, and injects them as concise in-context guidance — no fine-tuning required.

Features

SQLite-backed local memory with no training step
Offline-first lexical embeddings by default
Ollama support — use nomic-embed-text or any local model via the Ollama API
Optional OpenAI embeddings (text-embedding-3-small, etc.)
Hybrid retrieval — combines neural + lexical similarity for better recall when using Ollama/OpenAI
Prompt injection through before_prompt_build
Success capture through after_tool_call, llm_output, and agent_end
Recent-candidate retrieval window to keep replay fast as memory grows
Run-aware trace registry that separates concurrent runs
Configurable scoring weights — tune what matters for your use case
Bilingual prompts — language: "zh" or "en", or "auto" to detect from env
CLI tool — list, delete, and reset stored experiences
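
To illustrate what an offline lexical similarity can look like, here is a minimal sketch that compares two texts by the cosine of their token-count vectors. The names tokenize and lexicalSimilarity are hypothetical, and the tokenizer is an assumption that only handles ASCII words; the plugin's actual lexical embedding may work differently, in particular for Chinese text.

```typescript
// Hypothetical sketch: bag-of-words cosine similarity, no network or model needed.
// Assumption: ASCII-only tokenization; CJK text would need a different tokenizer.
function tokenize(text: string): Map<string, number> {
  const counts = new Map<string, number>();
  text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .forEach((token) => {
      if (token) counts.set(token, (counts.get(token) ?? 0) + 1);
    });
  return counts;
}

function lexicalSimilarity(a: string, b: string): number {
  const ta = tokenize(a);
  const tb = tokenize(b);
  let dot = 0;
  let normA = 0;
  let normB = 0;
  ta.forEach((n, token) => {
    dot += n * (tb.get(token) ?? 0); // shared tokens contribute to the dot product
    normA += n * n;
  });
  tb.forEach((n) => {
    normB += n * n;
  });
  if (normA === 0 || normB === 0) return 0; // an empty text matches nothing
  return dot / Math.sqrt(normA * normB);
}
```

Texts with identical token counts yield a similarity of 1, and texts with no tokens in common yield 0.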

How It Works

before_prompt_build retrieves the top matching past successes.
The plugin injects a short <experience_replay> block ahead of the prompt.
after_tool_call and llm_output accumulate the run trace.
agent_end scores the run and stores only high-quality, non-failure trajectories.
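
The four steps above can be sketched roughly as follows. This is an illustration only: the types, function names, and the exact layout of the injected block are assumptions, and the real OpenClaw hook signatures are not shown in this README.

```typescript
// Hypothetical shapes; the plugin's real types and hook signatures may differ.
interface Experience {
  task: string;
  summary: string;
}

interface RunTrace {
  toolCalls: string[];
  outputs: string[];
}

// One trace per run id, so concurrent runs do not mix their events.
const traces = new Map<string, RunTrace>();

function traceFor(runId: string): RunTrace {
  let trace = traces.get(runId);
  if (!trace) {
    trace = { toolCalls: [], outputs: [] };
    traces.set(runId, trace);
  }
  return trace;
}

// Steps 1-2: render retrieved experiences as a short block placed ahead of the prompt.
function renderReplayBlock(examples: Experience[]): string {
  if (examples.length === 0) return "";
  const lines = examples.map(
    (e, i) => `${i + 1}. Task: ${e.task}\n   What worked: ${e.summary}`
  );
  return `<experience_replay>\n${lines.join("\n")}\n</experience_replay>\n`;
}

// Step 3: accumulate events as the run proceeds.
function onToolCall(runId: string, toolName: string): void {
  traceFor(runId).toolCalls.push(toolName);
}

function onLlmOutput(runId: string, text: string): void {
  traceFor(runId).outputs.push(text);
}

// Step 4: hand the finished trace back for scoring, then forget it.
function onAgentEnd(runId: string): RunTrace | undefined {
  const trace = traces.get(runId);
  traces.delete(runId);
  return trace;
}
```

Keying the registry by run id is one simple way to keep concurrent runs separate; the trace is removed at the end of the run so the map does not grow without bound.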

Install

npm install

Then load the plugin from your openclaw.json:

{
  "plugins": {
    "entries": {
      "experience-replay": { "enabled": true }
    },
    "load": { "paths": ["./openclaw-experience-replay"] }
  }
}

Config

All fields are optional. Defaults shown below.

{
  "storePath": "~/.openclaw/experience-replay.db",
  "maxExamples": 3,
  "maxCandidates": 250,
  "similarityThreshold": 0.32,
  "language": "auto",
  "embedding": {
    "provider": "lexical"
  },
  "success": {
    "minScore": 0.65
  }
}

Ollama embeddings (local, no API key)

{
  "embedding": {
    "provider": "ollama",
    "ollamaModel": "nomic-embed-text",
    "ollamaBaseUrl": "http://localhost:11434",
    "hybridWeight": 0.7
  }
}

hybridWeight controls the blend between neural (Ollama/OpenAI) and lexical similarity:

1.0 = pure neural
0.0 = pure lexical
0.7 = default (70% neural + 30% lexical)
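
Read as a linear blend, the three cases above correspond to the following sketch. The function name hybridSimilarity is hypothetical, and clamping the weight to the range 0 to 1 is an assumption rather than documented behavior.

```typescript
// Hypothetical sketch of the neural/lexical blend described above.
// Assumption: out-of-range weights are clamped to [0, 1].
function hybridSimilarity(
  neural: number,
  lexical: number,
  hybridWeight: number = 0.7
): number {
  const w = Math.min(1, Math.max(0, hybridWeight));
  return w * neural + (1 - w) * lexical;
}
```

For example, with a neural similarity of 0.8 and a lexical similarity of 0.4, the default weight gives roughly 0.7 × 0.8 + 0.3 × 0.4 = 0.68.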

OpenAI embeddings

{
  "embedding": {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "openaiApiKey": "sk-...",
    "hybridWeight": 0.7
  }
}

Custom scoring weights

The success score determines whether a run is worth storing. Tune the weights to reflect what matters for your agent.

{
  "success": {
    "minScore": 0.65,
    "scoreWeights": {
      "success":             0.55,
      "finalAnswer":         0.20,
      "toolUse":             0.15,
      "directAnswer":        0.10,
      "noNegativeFeedback":  0.15
    }
  }
}

| Weight | Awarded when… |
|--------|--------------|
| success | The run is flagged as succeeded |
| finalAnswer | A non-empty, non-error answer is present |
| toolUse | At least one tool call was made |
| directAnswer | No tool calls (direct answer only) |
| noNegativeFeedback | Prompt contains no configured negative-feedback patterns |
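
One plausible reading of the weights is a plain additive score, sketched below. The RunSignals shape and the scoreRun function are hypothetical, and whether the plugin normalizes or caps the total is not stated here, so treat the arithmetic as an assumption.

```typescript
// Hypothetical sketch: sum the weight of every condition the run satisfies.
interface ScoreWeights {
  success: number;
  finalAnswer: number;
  toolUse: number;
  directAnswer: number;
  noNegativeFeedback: number;
}

// Assumed summary of a finished run; not the plugin's real trace type.
interface RunSignals {
  succeeded: boolean;
  hasFinalAnswer: boolean;
  toolCallCount: number;
  hasNegativeFeedback: boolean;
}

const defaultWeights: ScoreWeights = {
  success: 0.55,
  finalAnswer: 0.2,
  toolUse: 0.15,
  directAnswer: 0.1,
  noNegativeFeedback: 0.15,
};

function scoreRun(
  signals: RunSignals,
  weights: ScoreWeights = defaultWeights
): number {
  let score = 0;
  if (signals.succeeded) score += weights.success;
  if (signals.hasFinalAnswer) score += weights.finalAnswer;
  // toolUse and directAnswer are mutually exclusive: a run either called tools or it did not.
  score += signals.toolCallCount > 0 ? weights.toolUse : weights.directAnswer;
  if (!signals.hasNegativeFeedback) score += weights.noNegativeFeedback;
  return score;
}

function shouldStoreRun(score: number, minScore: number = 0.65): boolean {
  return score >= minScore;
}
```

Under this reading, a run that has a final answer and no negative feedback but was not flagged as succeeded reaches at most 0.20 + 0.15 + 0.15 = 0.50 and falls below the default minScore of 0.65.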

Key Options

| Option | Default | Description |
|--------|---------|-------------|
| maxExamples | 3 | Number of past experiences to retrieve and inject |
| maxCandidates | 250 | Recent experiences to rank before selecting top matches |
| similarityThreshold | 0.32 | Minimum similarity score for retrieval |
| language | "auto" | Language for injected prompts: "zh", "en", or "auto" |
| success.minScore | 0.65 | Minimum score required to store a run |
| embedding.hybridWeight | 0.7 | Neural vs. lexical blend (Ollama/OpenAI only) |
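
The three retrieval options interact roughly as in the sketch below. The Candidate shape and the selectExamples function are hypothetical; the sketch assumes candidates arrive newest first with a similarity already computed, and that the threshold is inclusive.

```typescript
// Hypothetical sketch of how the retrieval options could combine.
interface Candidate {
  id: string;
  similarity: number;
}

interface RetrievalOptions {
  maxExamples: number;
  maxCandidates: number;
  similarityThreshold: number;
}

const defaultRetrieval: RetrievalOptions = {
  maxExamples: 3,
  maxCandidates: 250,
  similarityThreshold: 0.32,
};

// Assumption: `candidates` is ordered newest first.
function selectExamples(
  candidates: Candidate[],
  opts: RetrievalOptions = defaultRetrieval
): Candidate[] {
  return candidates
    .slice(0, opts.maxCandidates) // only rank the most recent window
    .filter((c) => c.similarity >= opts.similarityThreshold) // drop weak matches
    .sort((a, b) => b.similarity - a.similarity) // best match first
    .slice(0, opts.maxExamples); // keep the top few for injection
}
```

Limiting the ranking step to a recent window keeps retrieval cost bounded as the store grows, at the price of never replaying experiences older than the window.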

CLI

Manage stored experiences from the command line:

# List recent experiences
npx experience-replay list --limit 20

# Delete a specific experience (id prefix works)
npx experience-replay delete a1b2c3d4

# Reset all stored experiences
npx experience-replay reset --yes

# Show DB statistics
npx experience-replay stats

# Point at a custom DB path
npx experience-replay list --db /path/to/experience-replay.db

Quality Notes

Duplicate experiences are ignored via a content fingerprint.
Failure-shaped outputs (HTTP errors, auth errors) are not persisted.
Incomplete runs with no final answer are skipped.
Negative-feedback patterns (configurable) cause the run to score below minScore and be dropped.
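
The first three checks above can be sketched as a single gate in front of the store. Everything here is illustrative: the FNV-1a-style hash stands in for whatever fingerprint the plugin actually uses, the whitespace and case normalization is an assumption, and the failure patterns are hypothetical examples rather than the plugin's real list.

```typescript
// Hypothetical sketch: a 32-bit FNV-1a-style hash over UTF-16 code units.
// Assumption: text is normalized (trimmed, lowercased, whitespace collapsed) first.
function fingerprint(text: string): string {
  const normalized = text.trim().toLowerCase().replace(/\s+/g, " ");
  let hash = 0x811c9dc5;
  for (let i = 0; i < normalized.length; i++) {
    hash ^= normalized.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return ("00000000" + (hash >>> 0).toString(16)).slice(-8);
}

// Hypothetical failure patterns; the plugin's real list is not documented here.
const failurePatterns: RegExp[] = [
  /\bHTTP\s+[45]\d\d\b/i,
  /\b(unauthorized|forbidden|invalid api key)\b/i,
];

// Returns true and records the fingerprint only when the answer is worth storing.
function shouldStore(finalAnswer: string, seen: Set<string>): boolean {
  if (finalAnswer.trim() === "") return false; // incomplete run
  if (failurePatterns.some((p) => p.test(finalAnswer))) return false; // failure-shaped output
  const fp = fingerprint(finalAnswer);
  if (seen.has(fp)) return false; // duplicate content
  seen.add(fp);
  return true;
}
```

A 32-bit hash is enough to show the idea but can collide on a large store; a real implementation would more likely use a longer cryptographic digest.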

Development

npm test
npm run typecheck