@mnemoclaw/immune

v5.2.1

Published

3 days ago

Hybrid adaptive memory system for AI agents — antibodies (negative patterns) + cheatsheet (positive patterns), with hybrid search (local embeddings + FTS4 + RRF).

smart_rabbit_fitness

claude-code adaptive-memory embeddings cli npm

Immune System v5.2.0 — Hybrid Adaptive Memory for AI Agents

A self-improving memory system that makes AI outputs better over time through two complementary memories:

Immune (antibodies) — Detects and prevents known errors (negative patterns)
Cheatsheet (strategies) — Injects proven best practices before generation (positive patterns)

v5.2 — Hybrid Search: Local embeddings (bi-encoder) + FTS4 keyword search, fused via Reciprocal Rank Fusion (RRF). Everything runs in-process via WASM — no server, no daemon, no API keys for search/dedup.

Provider-agnostic: Built around the Anthropic Messages API shape (originally for Claude Code), but compatible with any provider that exposes a Messages-API-compatible endpoint. Set ANTHROPIC_BASE_URL to point at your provider (OpenRouter, Mistral, local llama.cpp, Ollama, vLLM, LM Studio, etc.) and ANTHROPIC_DEFAULT_HAIKU_MODEL to your provider's fast/cheap model. See Provider Configuration below.

Requirements

| Component | Minimum | Notes |
|---|---|---|
| Node.js | 18+ | Tested on 20.x and 22.x. The installer refuses to run on older versions. |
| Disk | ~70 MB | Dependencies (~50 MB) + embedding model (~22 MB, downloaded on first use) |
| RAM | 256 MB free | Embedding model uses ~150 MB resident |
| API access | Any Anthropic-compatible endpoint | Only needed for the scan step (LLM). Search, dedup, and strategy injection work fully offline. |

No GPU required. Embeddings run on CPU via WASM. The first run downloads the model (~22 MB); subsequent runs use the cache.

No API key needed for retrieval. Only the scan phase (where an LLM checks your output for known errors) calls a model. Everything else — embedding search, dedup, FTS4 keyword search, strategy injection, scoring, housekeeping — runs locally.

Quick Start

1. Install

npm install -g @mnemoclaw/immune
immune init

That's it — immune init copies the skill into ~/.claude/skills/immune/, installs dependencies, and verifies the install. Re-run immune init after every npm update to upgrade in place (your memory is preserved).

No npm? Use Manual install below.

2. Use it

In Claude Code:

/immune Check this function for common pitfalls
/immune domain=fitness Check this workout program
/immune domains=fitness,code Check this workout API

First invocation will trigger the embedding model download (~22 MB, one-time).

Manual install (alternative)

If you prefer git clone over npm, or want to hack on the source:

git clone https://github.com/Mnemoclaw/immune.git
cd immune
npm install

Then copy only the runtime files into your Claude Code skills directory:

mkdir -p ~/.claude/skills/immune
cp immune-adapter.js immune-inject.js sanitizer.js config.yaml skill.md package.json \
   ~/.claude/skills/immune/
cp -r agents ~/.claude/skills/immune/
cd ~/.claude/skills/immune/
npm install --omit=dev
node immune-adapter.js stats

Prefer a filtered copy over cp -r * — cp -r * copies node_modules/, dev artifacts, and lockfiles alongside the files you actually need.

Provider Configuration

The immune scan (LLM-based detection) needs a model. The "haiku" alias in the code is a logical name — Claude Code resolves it through environment variables. Any provider works as long as it speaks the Messages API shape.

Examples

Anthropic (default):

export ANTHROPIC_API_KEY=sk-ant-...
# haiku alias already points to claude-haiku on first-party

OpenRouter:

export ANTHROPIC_BASE_URL=https://openrouter.ai/api/v1
export ANTHROPIC_API_KEY=sk-or-...
export ANTHROPIC_DEFAULT_HAIKU_MODEL=mistralai/ministral-8b    # cheap fast tier
export ANTHROPIC_DEFAULT_SONNET_MODEL=anthropic/claude-sonnet  # balanced tier

Local (Ollama, llama.cpp, LM Studio, vLLM):

export ANTHROPIC_BASE_URL=http://localhost:11434/v1   # Ollama example
export ANTHROPIC_API_KEY=local                        # any non-empty string
export ANTHROPIC_DEFAULT_HAIKU_MODEL=qwen2.5:7b
export ANTHROPIC_DEFAULT_SONNET_MODEL=qwen2.5:14b

GLM, Mistral, Together, Fireworks, Groq, DeepSeek: same pattern. Only the base URL, API key, and model names change from one vendor to the next.

Without these variables, the scan step will fail. Search, dedup, strategy injection, and scoring all keep working — they operate independently of any model.

How It Works

[User Request]
  --> Keyword domain detection (no LLM)
  --> Hybrid search: local embeddings + FTS4 via Reciprocal Rank Fusion
  --> Inject cheatsheet strategies (positive patterns) into prompt
  --> Generate output (with strategy context)
  --> Immune scan via cheap LLM (detect known + new errors)
  --> Fix errors + learn new antibodies
  --> Local embedding dedup (before adding new patterns)
  --> Score (0-100, domain-normalized via Welford's algorithm)
  --> Session log (for future context recall)
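
The scoring step mentions Welford's algorithm, which keeps a running mean and variance without storing past scores. The sketch below shows that technique for per-domain statistics. The function names and the final mapping to a 0-100 scale (centered at 50, 15 points per standard deviation) are illustrative assumptions, not the package's actual formula.

```javascript
// Welford's online algorithm: one pass, constant memory per domain.
const statsByDomain = new Map(); // hypothetical in-memory store

function createStats() {
  return { count: 0, mean: 0, m2: 0 };
}

function update(stats, x) {
  stats.count += 1;
  const delta = x - stats.mean;
  stats.mean += delta / stats.count;
  stats.m2 += delta * (x - stats.mean); // uses the updated mean
  return stats;
}

function sampleVariance(stats) {
  return stats.count > 1 ? stats.m2 / (stats.count - 1) : 0;
}

// Record a raw score for a domain and return that domain's running stats.
function recordScore(domain, raw) {
  if (!statsByDomain.has(domain)) statsByDomain.set(domain, createStats());
  return update(statsByDomain.get(domain), raw);
}

// ASSUMPTION: map the z-score onto 0-100; the real mapping may differ.
function normalize(stats, raw) {
  const sd = Math.sqrt(sampleVariance(stats));
  if (sd === 0) return 50; // not enough data to normalize
  const z = (raw - stats.mean) / sd;
  return Math.max(0, Math.min(100, Math.round(50 + 15 * z)));
}
```

Because each domain keeps its own running statistics, a raw score is judged against the history of that domain rather than against a global average.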

Key Features

Hybrid Search (v5.2)

Embeddings (primary) — Xenova/all-MiniLM-L6-v2 (384 dims, ~22 MB, WASM) for semantic matching
FTS4 (secondary) — SQLite full-text search for keyword recall
RRF Fusion — Reciprocal Rank Fusion (k=60, Cormack et al. SIGIR 2009) merges both engines using ranks, not raw scores
TF-IDF + Trigrams — Fallback when embeddings unavailable
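
As a rough illustration of the fusion step, here is a minimal Reciprocal Rank Fusion sketch. It assumes each engine returns an ordered array of pattern IDs, best match first; the function name and the example IDs are hypothetical, and only the formula (score = sum of 1 / (k + rank), with k = 60) comes from the description above.

```javascript
// Fuse several ranked lists of IDs using Reciprocal Rank Fusion.
// Only ranks matter, so engines with incomparable raw scores can be merged.
function rrfFuse(rankings, k = 60) {
  const scores = new Map();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) || 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Hypothetical results from the two engines.
const fromEmbeddings = ['AB-003', 'AB-001', 'AB-007'];
const fromFts = ['AB-001', 'AB-009', 'AB-003'];
const fused = rrfFuse([fromEmbeddings, fromFts]);
console.log(fused.map((entry) => entry.id));
```

An ID that appears near the top of both lists outranks one that tops a single list, which is why AB-001 comes first in this example.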

Hot/Cold Tiering

Keeps context lean for optimal performance:

Hot — Active patterns: critical severity, seen ≥ 3 times, or recent (<30 days)
Cold — Dormant patterns: sent as one-line summaries, auto-reactivated on match
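
The tiering rule can be read as a simple predicate. The sketch below assumes each pattern carries `severity`, `seen_count`, and `last_seen` (ISO date) fields; those field names are hypothetical, and only the three criteria come from the list above.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// A pattern is HOT if any one criterion holds; otherwise it is COLD.
function classifyTier(pattern, now = Date.now()) {
  const ageDays = (now - Date.parse(pattern.last_seen)) / DAY_MS;
  const isHot =
    pattern.severity === 'critical' || // always kept active
    pattern.seen_count >= 3 ||         // seen often enough
    ageDays < 30;                      // seen recently
  return isHot ? 'hot' : 'cold';
}
```

Passing `now` explicitly keeps the function deterministic, which makes it easy to test.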

Dual Storage

JSON (immune_memory.json / cheatsheet_memory.json) — Primary, portable, human-readable
SQLite (immune.sqlite) — FTS4 full-text search + embedding cache

Deduplication

Local embedding cosine similarity (threshold: 0.7)
Jaccard + longest common subsequence fallback (threshold: 0.55)
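
To make the two thresholds concrete, here is a minimal sketch of a duplicate check. It covers cosine similarity on embeddings and word-level Jaccard similarity as the fallback; the longest-common-subsequence part of the fallback is left out because the source does not say how the two are combined. The function names, the `embedding` and `pattern` fields, and the use of `>=` at the threshold are assumptions.

```javascript
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Split on anything that is not a letter or digit (keeps accented words whole).
function tokenize(text) {
  return new Set(text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean));
}

function jaccard(textA, textB) {
  const a = tokenize(textA);
  const b = tokenize(textB);
  let shared = 0;
  for (const token of a) if (b.has(token)) shared += 1;
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Use embeddings when both sides have one, otherwise fall back to Jaccard.
function isDuplicate(candidate, existing) {
  if (candidate.embedding && existing.embedding) {
    return cosineSimilarity(candidate.embedding, existing.embedding) >= 0.7;
  }
  return jaccard(candidate.pattern, existing.pattern) >= 0.55;
}
```

For example, "docker container crash loop" and "docker crash loop" share 3 of 4 distinct words, a Jaccard similarity of 0.75, so the fallback would treat them as duplicates.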

Quality Gates

housekeep only archives patterns that are COLD + low-seen + old + non-critical
flush-pending runs check-duplicate before any write — duplicates reactivate the original instead of creating new entries
freeze / unfreeze pause and resume aging clocks (e.g. during vacations) without losing history
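
The housekeep gate combines four conditions, all of which must hold. The sketch below expresses that as a predicate. The field names (`tier`, `seen_count`, `last_seen`, `severity`) and both thresholds are hypothetical placeholders; the actual archival rules are set in config.yaml.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// ASSUMPTION: placeholder thresholds, not the package's real defaults.
const ARCHIVE_RULES = { maxSeen: 1, minAgeDays: 180 };

// Archive only when the pattern is cold AND rarely seen AND old AND non-critical.
function canArchive(pattern, now = Date.now(), rules = ARCHIVE_RULES) {
  const ageDays = (now - Date.parse(pattern.last_seen)) / DAY_MS;
  return (
    pattern.tier === 'cold' &&
    pattern.seen_count <= rules.maxSeen &&
    ageDays >= rules.minAgeDays &&
    pattern.severity !== 'critical'
  );
}
```

Requiring every condition at once is the conservative choice: a single signal of value (critical severity, repeated sightings, recent use) is enough to keep a pattern.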

Automatic Pre-Generation Injection

Inject relevant strategies into every Claude response automatically:

# Test the inject script manually
echo '{"prompt":"Write a Node.js API endpoint"}' | node ~/.claude/skills/immune/immune-inject.js

Add to ~/.claude/settings.json:

{
  "hooks": {
    "UserPromptSubmit": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "node /absolute/path/to/.claude/skills/immune/immune-inject.js",
            "timeout": 10
          }
        ]
      }
    ]
  }
}

The inject hook detects domains from your prompt via keyword matching and injects up to 5 HOT strategies as compact XML. Zero injection on unrelated prompts. ~50 ms overhead. No LLM call — entirely local.
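
The selection and rendering step might look roughly like the sketch below. It assumes strategies are objects with `id`, `domains`, `tier`, and `text` fields and renders them inside `<cheatsheet>` and `<strategy>` tags; the field names and tag names are hypothetical, since the source only states "up to 5 HOT strategies as compact XML" and "zero injection on unrelated prompts".

```javascript
// Escape characters that are special in XML text and attribute values.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

const MAX_INJECTED = 5; // "up to 5 HOT strategies"

// Pick HOT strategies tagged with a detected domain and render compact XML.
function buildInjection(strategies, detectedDomains) {
  const wanted = new Set(detectedDomains);
  const selected = strategies
    .filter((s) => s.tier === 'hot' && s.domains.some((d) => wanted.has(d)))
    .slice(0, MAX_INJECTED);
  if (selected.length === 0) return ''; // unrelated prompt: inject nothing
  const items = selected
    .map((s) => `<strategy id="${escapeXml(s.id)}">${escapeXml(s.text)}</strategy>`)
    .join('');
  return `<cheatsheet>${items}</cheatsheet>`;
}
```

Returning an empty string for unrelated prompts keeps the hook from adding any tokens to the context when it has nothing useful to say.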

File Structure

immune/
  immune-adapter.js        # CLI adapter — all operations go through this
  immune-inject.js         # Pre-generation hook (local keyword detection)
  sanitizer.js             # Input sanitization (strips secrets before storage)
  config.yaml              # Full configuration
  skill.md                 # Claude Code skill definition
  agents/
    immune-scan.md         # Scan agent instructions
  benchmark/
    run-blind.js           # Blind retrieval + generalization + learning benchmarks
    run-learning.js        # Standalone learning curve benchmark
    sample-queries.json    # Example benchmark queries (20 cases)
    cases-learning-blind.json
    BENCHMARKS.md          # Published benchmark results
  tools/viz/               # Optional D3 dashboard for inspecting your memory
    README.md
    immune-viz.js          # Generates immune-viz.html from memory files

Files generated at runtime (gitignored): immune_memory.json, cheatsheet_memory.json, immune.sqlite, migration_state.json, archived_*.json, context/.

CLI Commands

# Search (hybrid embeddings + FTS4 via RRF)
node immune-adapter.js search --query "docker crash loop" --type antibody
node immune-adapter.js get-context --query "fitness programme" --days 90
node immune-adapter.js check-duplicate --pattern "..." --type antibody

# Retrieval
node immune-adapter.js get-antibodies --domains '["code"]' --tier hot --limit 15
node immune-adapter.js get-strategies --domains '["code"]' --query "security" --limit 10

# Add / Update
node immune-adapter.js add-antibody --json '{"id":"AB-001","pattern":"...","severity":"critical","correction":"..."}'
node immune-adapter.js update-antibody --id AB-001 --increment_seen

# Bulk
node immune-adapter.js flush-pending --json '{"antibodies":[...],"strategies":[...]}'
node immune-adapter.js import --file export.immune.json

# Maintenance
node immune-adapter.js index              # Rebuild FTS4 index
node immune-adapter.js stats              # Show counts and migration state
node immune-adapter.js housekeep          # Archive useless patterns
node immune-adapter.js integrity-check    # SQLite integrity check
node immune-adapter.js freeze             # Pause aging clocks
node immune-adapter.js unfreeze           # Resume aging clocks

# Testing
node immune-adapter.js similarity-test    # Run dedup test suite
node immune-adapter.js retrieval-test     # Run semantic retrieval tests
node immune-adapter.js embed --text "..." # Get raw embedding vector

Domains

Patterns are tagged with domains for targeted retrieval. Edit config.yaml:domain_keywords to add your own.

| Domain | Example Keywords |
|--------|-----------------|
| code | function, docker, API, script, deployment |
| fitness | muscu, exercice, programme, séance |
| writing | article, SEO, blog, rédaction |
| research | source, étude, analyse, hypothèse |
| strategy | marché, compétiteur, ROI |
| webdesign | CSS, HTML, responsive, UI |
| travel | voyage, hôtel, billet, itinéraire |
| _global | Cross-domain patterns |
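
Keyword-based domain detection can be sketched in a few lines. The example below copies the sample keywords from the table and matches them against whole words in the prompt, case-insensitively. The function name, the whole-word matching rule, and the in-code keyword map (standing in for config.yaml:domain_keywords) are assumptions.

```javascript
// Sample keywords from the table above; a real config.yaml may list more.
const DOMAIN_KEYWORDS = {
  code: ['function', 'docker', 'api', 'script', 'deployment'],
  fitness: ['muscu', 'exercice', 'programme', 'séance'],
  writing: ['article', 'seo', 'blog', 'rédaction'],
  research: ['source', 'étude', 'analyse', 'hypothèse'],
  strategy: ['marché', 'compétiteur', 'roi'],
  webdesign: ['css', 'html', 'responsive', 'ui'],
  travel: ['voyage', 'hôtel', 'billet', 'itinéraire'],
};

// Return every domain with at least one keyword present in the prompt.
function detectDomains(prompt, keywords = DOMAIN_KEYWORDS) {
  const tokens = new Set(
    prompt
      .normalize('NFC')
      .toLowerCase()
      .split(/[^\p{L}\p{N}]+/u) // split on non-letters/digits, keeping accents
      .filter(Boolean)
  );
  return Object.keys(keywords).filter((domain) =>
    keywords[domain].some((keyword) => tokens.has(keyword))
  );
}

console.log(detectDomains('Write a Docker deployment script'));
```

A prompt with no matching keyword yields an empty list, which corresponds to the "zero injection on unrelated prompts" behavior of the hook.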

Benchmarks

See benchmark/BENCHMARKS.md for the full methodology. Headline results (blind test cases written by independent AI agents kept fully blind to the memory data):

| Benchmark | Score |
|---|---|
| Retrieval accuracy | 70 % (14/20) |
| Cross-domain generalization | 53 % (4 strong + 8 partial / 15) |
| Improvement after 1 learning pass | +74 pts (0 % → 74 %) |
| Housekeep safety | 0 pts lost |

Reproduce:

node benchmark/run-blind.js
node benchmark/run-learning.js --cases benchmark/cases-learning-blind.json

The retrieval benchmark reads from benchmark/sample-queries.json by default. Point IMMUNE_BENCH_QUERIES at your own file to evaluate against your own memory.

Configuration

All tunable parameters live in config.yaml:

Deduplication thresholds (embedding: 0.7, Jaccard: 0.55)
Hot/Cold criteria
Housekeeping limits and archival rules (max_antibodies: 500, max_strategies: 300, max_sqlite_mb: 50)
Domain keywords for auto-detection (edit freely to match your content)

Dependencies

@xenova/transformers ^2.17.2 — Local embedding model (auto-cached on first use)
sql.js ^1.14.1 — SQLite in WASM for FTS4 search
proper-lockfile ^4.1.2 — Concurrency safety
protobufjs ^7.5.8 (override) — Forces patched version to silence npm audit warnings

License

MIT — Jacques Chauvin. See LICENSE.