@nogataka/claw-memory

v0.3.0

Published

6 days ago

Independent, in-process semantic memory MCP server (node:sqlite + sqlite-vec + local Xenova e5) with a lightweight web viewer. No daemon, no Python, no native ABI. Installable as a Claude Code plugin and a Codex MCP server.


nogataka

mcp memory claude-code codex sqlite-vec embeddings rag

claw-memory


Local, in-process long-term memory for AI coding agents (Claude Code & Codex). Your agent remembers past sessions, your preferences, and prior decisions — and can search every raw transcript you've ever recorded. No daemon, no Python, no external vector database, no data leaving your machine (except the LLM call that summarizes a session, which you control).

npm install -g @nogataka/claw-memory

Storage: node:sqlite (built into Node >= 24, no native ABI) + sqlite-vec — vectors live inside one SQLite file
Embeddings: local Xenova/multilingual-e5-small (384-dim, multilingual, offline)
Two memory sources: a distilled semantic DB and full-text search over raw Claude Code + Codex transcripts
Auto-capture: lifecycle hooks distill finished sessions and inject relevant memory back into new ones
Pluggable LLM: distill via your Claude or Codex subscription (no API key), or any Anthropic / OpenAI-compatible endpoint

Features

1. Two independent memory sources

| Source | What it is | Tooling |
|--------|-----------|---------|
| Distilled DB | LLM-summarized sessions → summaries, preferences, and embedded conversation chunks with structured metadata. Semantically searchable. | memory_recall, memory_search, memory_get |
| Raw transcript search | Full-text grep over your actual Claude Code (~/.claude/projects), Codex (~/.codex/sessions), and ChatGPT web export (conversations.json) logs, including sessions that were never distilled. | memory_search_logs |

The distilled DB is curated and fast to recall; raw search is a safety net that finds anything you ever discussed, even before claw-memory was installed.

ChatGPT web conversations live on OpenAI's servers, not on disk, so claw-memory reads them from the official export: ChatGPT → Settings → Data controls → Export data; unzip and drop conversations.json into ~/.claw-memory/chatgpt/ (or point CLAW_MEMORY_CHATGPT_EXPORT at the file/folder). Then it's searchable like any other log, and claw-memory distill-chatgpt adds it to the semantic DB under a dedicated chatgpt project.

2. Automatic capture (distill)

When a session ends, claw-memory distills the transcript into:

a structured summary with the sections 依頼 (request) / 調査・判明 (investigation and findings) / 完了 (completed) / 次の一手 (next step),
user preferences (language, response style, frameworks, tone, …) applied as always-on context,
conversation chunks embedded for semantic search, each tagged with an observation type (discovery / bugfix / feature / decision / change), concepts, and files read / modified.

Distillation is incremental (a watermark skips sessions with no new content) and idempotent (re-distilling a session replaces, never duplicates). Cross-session duplicate chunks are dropped.
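A minimal sketch of how incremental and idempotent distillation could behave, assuming the watermark is simply the number of transcript messages already distilled. The MemoryStore shape, the distillSession function, and the summarize callback are hypothetical names for illustration, not claw-memory's internals.

```typescript
// Hypothetical in-memory stand-in for the distilled DB (not claw-memory's schema).
interface SessionRecord {
  watermark: number; // transcript messages already covered by the last distill
  chunks: string[];  // chunks produced by that distill
}

type MemoryStore = Record<string, SessionRecord>;

// Returns true when a distill run happened, false when the watermark skipped it.
function distillSession(
  store: MemoryStore,
  sessionId: string,
  messages: string[],
  summarize: (messages: string[]) => string[],
): boolean {
  const previous = store[sessionId];
  // Incremental: nothing new since the last run, so skip the LLM call.
  if (previous !== undefined && previous.watermark >= messages.length) {
    return false;
  }
  // Idempotent: overwrite the session's record instead of appending to it.
  store[sessionId] = { watermark: messages.length, chunks: summarize(messages) };
  return true;
}
```

Running the function twice on an unchanged transcript leaves the store untouched, and running it after new messages arrive replaces the earlier record rather than adding a second one.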

3. Automatic recall injection

At the start of a session, claw-memory injects a memory block (per-prompt injection was removed in v0.3.0):

Preferences as instruction="always-apply" — the agent follows them.
Recent summaries + semantically similar past conversations as instruction="reference-only" — used as background, not parroted back.

This means the agent picks up where you left off without you re-explaining context.
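To make the two instruction levels concrete, here is a rough sketch of assembling such a block. The tag names are taken from the How it works diagram; the bullet layout, the RecallInput type, and the sketchMemoryBlock function are assumptions made for illustration, and the real block may be formatted differently.

```typescript
interface RecallInput {
  preferences: string[];
  summaries: string[];
  similar: string[];
}

// Wraps lines in a tag that carries the instruction attribute; empty sections vanish.
function section(tag: string, instruction: string, lines: string[]): string {
  if (lines.length === 0) return "";
  const body = lines.map((line) => "- " + line).join("\n");
  return "<" + tag + ' instruction="' + instruction + '">\n' + body + "\n</" + tag + ">";
}

// Preferences are always applied; summaries and similar chunks are background only.
function sketchMemoryBlock(input: RecallInput): string {
  return [
    section("user-preferences", "always-apply", input.preferences),
    section("previous-session-summaries", "reference-only", input.summaries),
    section("relevant-past-conversations", "reference-only", input.similar),
  ]
    .filter((part) => part.length > 0)
    .join("\n\n");
}

console.log(
  sketchMemoryBlock({
    preferences: ["Reply in English", "Prefer small commits"],
    summaries: ["Migrated storage to node:sqlite"],
    similar: [],
  }),
);
```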

4. Structured, filterable search

memory_search returns a token-light index (id + title + date + type). Filter by type, concept, file, or date range, then pull full bodies with memory_get only for what you need — keeping context usage minimal.

5. Privacy & safety, by design

Fully local: storage and embeddings never leave your machine. Only distill calls an LLM, and you choose which one.
<private>…</private> spans are stripped before anything is persisted or sent to the LLM.
CLAW_MEMORY_EXCLUDED_PROJECTS: never record or recall listed paths.
memory_forget: soft-delete chunks; they vanish from search, recall, and the viewer.
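The first two safeguards can be sketched in a few lines. This is an illustration under assumptions: the function names are hypothetical, an unclosed <private> tag is left untouched, and exclusion is modelled as plain substring matching on the project path.

```typescript
// Removes <private>…</private> spans, including ones that cross line breaks.
function stripPrivate(text: string): string {
  return text.replace(/<private>[\s\S]*?<\/private>/g, "");
}

// Parses a comma/colon-separated list of path substrings.
function parseExcluded(value: string | undefined): string[] {
  if (value === undefined) return [];
  return value
    .split(/[,:]/)
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// A project is excluded when its path contains any listed substring.
function isExcluded(projectPath: string, excluded: string[]): boolean {
  return excluded.some((needle) => projectPath.indexOf(needle) !== -1);
}

const excluded = parseExcluded("/work/secret, /tmp/scratch:/clients/acme");
console.log(isExcluded("/home/me/clients/acme/api", excluded));
console.log(stripPrivate("deploy key is <private>abc123</private>, rotate monthly"));
```

Stripping before both persistence and the LLM call means a private span never reaches either the database or the distill backend.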

6. Pluggable LLM backend (distill only)

Use a subscription login (no API key) or any HTTP endpoint — see Configuration. Tier routing lets cheap models handle the high-frequency distill work.

7. On-demand web viewer

A zero-build, read-only viewer (claw-memory ui) for browsing projects, summaries, chunks (with their metadata), and preferences, and for running raw-log search, with live updates via SSE. It runs only when you start it. The viewer also has a Lessons tab for reviewing, approving, and editing extracted lessons.

8. Reusable lessons

Beyond retrieving past logs, claw-memory distills AI coding sessions into reusable lessons — actionable, abstracted knowledge such as project-specific constraints, debugging patterns, design decisions and user preferences. Lessons are extracted alongside the normal summary (no extra LLM call), stored locally, embedded with the same local model, and surfaced (only after approval) when a similar task appears later. Each lesson carries scope, confidence, applies_when / avoid_when and a lifecycle (candidate → approved → archived / superseded), with duplicate / conflict detection and confidence decay over time. Raw logs remain available as evidence; day-to-day recall focuses on concise, reusable lessons.
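The lifecycle and confidence decay could be modelled roughly as follows. This is a sketch under assumptions: only the transitions named above are allowed (rejection is left out because its resulting status is not described here), decay is a single multiplication applied to approved lessons that have been idle for the staleness threshold, and the lastUsedAt field is hypothetical. The defaults 0.9 and 30 come from the configuration table.

```typescript
type LessonStatus = "candidate" | "approved" | "archived" | "superseded";

interface Lesson {
  id: string;
  status: LessonStatus;
  confidence: number; // 0..1
  lastUsedAt: number; // epoch milliseconds (hypothetical field)
}

// Transitions taken from the lifecycle above; rejection is left out on purpose.
const TRANSITIONS: Record<LessonStatus, LessonStatus[]> = {
  candidate: ["approved"],
  approved: ["archived", "superseded"],
  archived: [],
  superseded: [],
};

function transition(lesson: Lesson, next: LessonStatus): Lesson {
  if (TRANSITIONS[lesson.status].indexOf(next) === -1) {
    throw new Error("illegal transition: " + lesson.status + " -> " + next);
  }
  return { id: lesson.id, status: next, confidence: lesson.confidence, lastUsedAt: lesson.lastUsedAt };
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Multiplies confidence by `factor` once an approved lesson has been idle for `staleDays`.
function decay(lesson: Lesson, now: number, factor = 0.9, staleDays = 30): Lesson {
  const idleDays = (now - lesson.lastUsedAt) / DAY_MS;
  if (lesson.status !== "approved" || idleDays < staleDays) return lesson;
  return {
    id: lesson.id,
    status: lesson.status,
    confidence: lesson.confidence * factor,
    lastUsedAt: lesson.lastUsedAt,
  };
}
```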

Installation guide

Prerequisites

Node.js ≥ 24 (required for the built-in node:sqlite storage)
For the subscription LLM backends: Claude Code CLI (logged in) and/or Codex CLI (logged in)
First distill downloads the embedding model (~100 MB, cached under ~/.cache)

Step 1 — install the package globally

npm install -g @nogataka/claw-memory

Installing globally lets hooks and the MCP server start without an npx download step. (Without it, the plugin falls back to npx -y @nogataka/claw-memory@latest, which is slower on first run.)

Step 2a — Claude Code (plugin, recommended)

/plugin marketplace add nogataka/claw-memory
/plugin install claw-memory

Restart Claude Code. This auto-registers:

the MCP server (8 memory tools), and
the hooks: SessionStart → compact recall injection (pull model), Stop → auto-distill. Hook errors are logged to ~/.claw-memory/logs/hook-error.log instead of being swallowed.

No manual config. To verify, run /mcp and look for claw-memory.

Not using the plugin? Run claw-memory install --claude-code to merge the MCP server and hooks into ~/.claude/settings.json (idempotent, backs up the file).

Step 2b — Codex (plugin)

Codex supports the same plugin format as Claude Code. claw-memory ships a .codex-plugin/plugin.json manifest, so installing it as a Codex plugin wires up the MCP server and the lifecycle hooks — full parity with Claude Code:

codex
/plugins

The plugin registers, via Codex's ${CLAUDE_PLUGIN_ROOT} (provided for compatibility):

the claw-memory MCP server (.mcp.json),
SessionStart → compact recall injection (preferences + one-line summaries; details are pulled on demand via memory_recall),
Stop → auto-distill of recent Codex sessions (watermark-deduped, async), and
the memory-recall skill.

Step 2b (alt) — Codex (installer, no marketplace)

If you install from npm instead of the plugin marketplace, register via the CLI:

claw-memory install --codex

This idempotently:

adds [mcp_servers.claw-memory] to ~/.codex/config.toml (backed up to config.toml.bak),
merges recall/distill hooks into ~/.codex/hooks.json (backed up; your own hooks are preserved),
installs the memory-recall skill, and
appends an AGENTS.md instruction telling the agent to call memory_recall at session start.

Restart Codex. Recall injection and auto-distill now run via hooks — no manual step needed. You can still backfill on demand:

claw-memory distill-codex --recent     # distill recent Codex sessions (watermark-deduped)
claw-memory distill-codex --all        # backfill everything

Step 2c — from source (development)

git clone https://github.com/nogataka/claw-memory
cd claw-memory
npm install          # no native builds — storage is Node's built-in node:sqlite
npm run build        # tsc -> dist/
npm link             # optional: expose the `claw-memory` binary

MCP tools

| Tool | Purpose |
|------|---------|
| memory_recall(query, cwd?, topK?) | Ready-to-read context block: preferences + recent summaries + similar past conversations. Call at the start of a task. |
| memory_search(query, cwd?, limit?, type?, concept?, file?, dateFrom?, dateTo?) | Token-light hit index (id + title + date + type), with metadata filters. |
| memory_get(ids) | Full text + metadata for given chunk ids. |
| memory_remember(text, cwd?, sessionId?) | Store a durable free-text note. |
| memory_distill(cwd, sessionId? \| transcriptPath?) | Summarize a session into memory (needs an LLM backend). |
| memory_get_preferences(cwd?) | List stored preferences for the project. |
| memory_search_logs(query, sources?, projectPath?, startDate?, endDate?, limit?, offset?) | Full-text search over RAW Claude Code + Codex + ChatGPT-web transcripts (sources: claude-code / codex / chatgpt-web). |
| memory_forget(ids) | Soft-delete chunks (hidden from search / recall / viewer). |
| lesson_search(query, cwd?, limit?) | Search approved reusable lessons, ranked by relevance + scope + confidence. |
| lesson_inject(query, cwd?, limit?) | Same, returned as a ready-to-read <relevant-lessons> block. |
| lesson_get(lesson_id) | Full detail of one lesson (fields + history + links). |
| lesson_extract(cwd, sessionId? \| transcriptPath?) | Dedicated lesson-extraction pass over a session (needs an LLM backend). |
| lesson_approve / lesson_reject / lesson_archive(lesson_id, reason?) | Lifecycle transitions. |
| lesson_supersede(old_lesson_id, new_lesson_id) | Replace an old lesson with a newer one. |

All tools run locally. Only memory_distill and lesson_extract call an LLM, and memory_search_logs bypasses the database to read the raw transcripts (~/.claude/projects, ~/.codex/sessions, and the ChatGPT export) directly.
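Agents normally issue these calls for you, but it can help to see what a request looks like on the wire. The sketch below builds MCP tools/call messages (JSON-RPC 2.0) for the search-then-get pattern. The argument names come from the table above; the query text and the chunk ids are placeholder values, and the toolCall helper is hypothetical.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper that wraps a tool name and its arguments in a request.
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name: name, arguments: args },
  };
}

// Step 1: fetch the token-light index, filtered by observation type.
const search = toolCall("memory_search", { query: "sqlite migration", type: "bugfix", limit: 5 });
// Step 2: pull full bodies only for the hits worth reading (placeholder ids).
const get = toolCall("memory_get", { ids: ["chunk-12", "chunk-17"] });

console.log(JSON.stringify(search));
console.log(JSON.stringify(get));
```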

Configuration (environment variables)

| Variable | Default | Purpose |
|----------|---------|---------|
| CLAW_MEMORY_DIR | ~/.claw-memory | Data directory (holds memory.db and logs/). |
| CLAW_MEMORY_LLM_BACKEND | agent-sdk | One of agent-sdk, codex-sdk, anthropic, openai-compatible. |
| CLAW_MEMORY_MODEL / AGENT_SDK_MODEL | claude-sonnet-4-5 | Default distill model (agent-sdk / anthropic). |
| CLAW_MEMORY_TIER_SMART / _SUMMARY / _SIMPLE | (none) | Per-tier model override (route cheap models to simple work). |
| CLAW_MEMORY_CODEX_MODEL | Codex default | Model for the codex-sdk backend. |
| CLAW_MEMORY_CODEX_API_KEY | (none) | Optional; otherwise the Codex CLI login is used. |
| ANTHROPIC_API_KEY / ANTHROPIC_BASE_URL | (none) | For the anthropic backend. |
| CLAW_MEMORY_OPENAI_API_KEY / CLAW_MEMORY_OPENAI_BASE_URL | (none) | For the openai-compatible backend (Gemini / OpenRouter / LM Studio). |
| CLAW_MEMORY_EXCLUDED_PROJECTS | (none) | Comma/colon-separated path substrings to never record or recall. |
| MEMORY_SIMILARITY_MAX_DISTANCE | 0.6 | Max cosine distance for a semantic hit (lower = stricter). |
| CLAW_MEMORY_UI_PORT | 4319 | Viewer port. |
| LESSON_RECALL_LIMIT | 3 | Approved lessons injected into the recall block (0 disables). |
| CLAW_MEMORY_LESSON_DEDICATED | (none) | 1 = run a separate, higher-quality lesson-extraction pass (extra LLM call). |
| CLAW_MEMORY_LESSON_CONFLICT_LLM | (none) | 1 = use the LLM to detect conflicting lessons during extraction. |
| LESSON_DECAY_FACTOR / LESSON_STALE_DAYS | 0.9 / 30 | Confidence-decay factor and staleness threshold (in days) for lesson decay. |
| CLAW_MEMORY_CHATGPT_EXPORT | ~/.claw-memory/chatgpt | ChatGPT conversations.json export: a file or a folder of *.json. |
| CLAW_MEMORY_CHATGPT_MAX_BYTES | 209715200 (200 MB) | Skip ChatGPT export files larger than this (parsed in memory). |
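As an illustration of how a handful of these variables and their defaults fit together, the sketch below resolves them from an environment-like object. The variable names and default values come from the table; the ClawMemoryConfig type, the loadConfig function, and the fallback behaviour for unparsable numbers are assumptions, not the package's actual loader.

```typescript
interface ClawMemoryConfig {
  llmBackend: string;
  maxDistance: number;
  uiPort: number;
  lessonRecallLimit: number;
  excludedProjects: string[];
}

type Env = Record<string, string | undefined>;

// Falls back to the default when the variable is unset, blank, or not numeric.
function numberOr(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === "") return fallback;
  const parsed = Number(value);
  return isNaN(parsed) ? fallback : parsed;
}

function loadConfig(env: Env): ClawMemoryConfig {
  return {
    llmBackend: env["CLAW_MEMORY_LLM_BACKEND"] || "agent-sdk",
    maxDistance: numberOr(env["MEMORY_SIMILARITY_MAX_DISTANCE"], 0.6),
    uiPort: numberOr(env["CLAW_MEMORY_UI_PORT"], 4319),
    lessonRecallLimit: numberOr(env["LESSON_RECALL_LIMIT"], 3),
    excludedProjects: (env["CLAW_MEMORY_EXCLUDED_PROJECTS"] || "")
      .split(/[,:]/)
      .map((part) => part.trim())
      .filter((part) => part.length > 0),
  };
}

console.log(loadConfig({ CLAW_MEMORY_LLM_BACKEND: "codex-sdk", LESSON_RECALL_LIMIT: "0" }));
```

In Node, passing process.env to loadConfig would apply the same resolution to the real environment.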

LLM backends

| Backend | Auth | Notes |
|---------|------|-------|
| agent-sdk (default) | Claude CLI login (Pro/Max/Team/Enterprise) | zero-config, no API key |
| codex-sdk | Codex CLI login (ChatGPT/Codex plan) | @openai/codex-sdk; runs read-only, no tools |
| anthropic | ANTHROPIC_API_KEY | plain Messages API over fetch |
| openai-compatible | CLAW_MEMORY_OPENAI_API_KEY + base URL + CLAW_MEMORY_MODEL | Gemini / OpenRouter / LM Studio |

export CLAW_MEMORY_LLM_BACKEND=codex-sdk   # distill using the Codex subscription

CLI reference

claw-memory mcp                                  # stdio MCP server (what agents spawn)
claw-memory ui [--port N] [--open]               # read-only web viewer
claw-memory distill --cwd P --session ID [--path FILE] [--if-stale]
claw-memory distill-codex [--recent] [--limit N] [--all]
claw-memory distill-chatgpt [--limit N] [--all]  # distill ChatGPT web export conversations
claw-memory remember --cwd P "a note"
claw-memory lessons list [--status candidate|approved|...] [--cwd P]
claw-memory lessons search "query" [--cwd P] [--limit N]
claw-memory lessons inject "query" [--cwd P] [--limit N]
claw-memory lessons extract --session ID [--cwd P] [--path FILE]
claw-memory lessons approve|reject|archive <lesson_id> [--reason R]
claw-memory lessons supersede <old_id> <new_id>
claw-memory lessons decay [--days N] [--factor F] [--dry]
claw-memory lessons export [--status S] [--cwd P] > bundle.json
claw-memory lessons import bundle.json [--status S] [--cwd P]
claw-memory search-logs "query" [--source claude-code,codex,chatgpt-web] [--project P]
                                 [--start ISO] [--end ISO] [--limit N] [--offset N]
claw-memory hook <recall|distill>               # lifecycle hook (reads JSON on stdin)
claw-memory install   [--codex | --claude-code] # register MCP + hooks (default: codex)
claw-memory uninstall [--codex | --claude-code]

How it works

[write path]                              [read path]
session ends (Stop hook / distill-codex)   session starts (SessionStart hook / memory_recall)
   └ distill                                   └ buildMemoryBlock
       ├ summary  ───────────► session_summaries ──► <previous-session-summaries>
       ├ preferences ────────► user_preferences ───► <user-preferences> (always-apply)
       └ chunks (embed+meta) ─► vec_chunks + ────────► <relevant-past-conversations>
                                conversation_chunks    (cosine KNN, per-project, filtered)

[separate source] raw logs (~/.claude/projects, ~/.codex/sessions) ──► memory_search_logs

One SQLite file at ~/.claw-memory/memory.db. sqlite-vec stores 384-dim vectors inside it; metadata lives in a parallel table; FTS5 provides a keyword fallback.
Embeddings run locally via Xenova/multilingual-e5-small (multilingual, offline, e5 query:/passage: prefixing). The model loads once per MCP process.
Search is hybrid: cosine KNN (filtered by project + metadata) augmented with FTS5 keyword hits, de-duplicated and distance-sorted.
Daily structured logs are written to ~/.claw-memory/logs/.
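The hybrid search step can be sketched without SQLite at all. In this illustration a brute-force scan stands in for sqlite-vec KNN and a case-insensitive substring match stands in for FTS5; keyword-only hits are scored by their cosine distance so that both lists can be sorted together, which is an assumption about how the merge works. All type and function names are hypothetical.

```typescript
interface Chunk { id: number; text: string; vector: number[]; }
interface Hit { id: number; distance: number; }

// Cosine distance = 1 - cosine similarity; 0 means same direction.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 1;
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force stand-in for the vector KNN query, capped by maxDistance.
function knn(chunks: Chunk[], query: number[], topK: number, maxDistance: number): Hit[] {
  return chunks
    .map((c): Hit => ({ id: c.id, distance: cosineDistance(query, c.vector) }))
    .filter((h) => h.distance <= maxDistance)
    .sort((x, y) => x.distance - y.distance)
    .slice(0, topK);
}

// Substring stand-in for the FTS5 keyword fallback.
function keywordHits(chunks: Chunk[], query: number[], keyword: string): Hit[] {
  const needle = keyword.toLowerCase();
  return chunks
    .filter((c) => c.text.toLowerCase().indexOf(needle) !== -1)
    .map((c): Hit => ({ id: c.id, distance: cosineDistance(query, c.vector) }));
}

// Merge both lists, drop duplicate ids, and sort by distance.
function hybridSearch(chunks: Chunk[], query: number[], keyword: string, topK: number, maxDistance = 0.6): Hit[] {
  const seen: Record<number, boolean> = {};
  const merged: Hit[] = [];
  for (const hit of knn(chunks, query, topK, maxDistance).concat(keywordHits(chunks, query, keyword))) {
    if (seen[hit.id]) continue;
    seen[hit.id] = true;
    merged.push(hit);
  }
  return merged.sort((x, y) => x.distance - y.distance);
}
```

The keyword pass is what lets a chunk surface even when its embedding falls outside MEMORY_SIMILARITY_MAX_DISTANCE, for example when the query is an exact identifier.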

Memory viewer

claw-memory ui --open        # http://localhost:4319

If you installed via the Claude Code plugin only (no global npm install), the claw-memory binary isn't on your PATH. Run the viewer through npx instead:

npx @nogataka/claw-memory ui --open       # http://localhost:4319
npx @nogataka/claw-memory ui --port 5000 --open

Read-only. Browse projects, session summaries, conversation chunks (with type / concepts / files), and preferences; toggle 🔎 ログ検索 (log search) to run a full-text search over raw Claude Code + Codex transcripts. The view updates live via SSE while open. Nothing runs in the background otherwise; start it only when you want to inspect.

Uninstall

claw-memory uninstall --codex          # remove config.toml block + hooks + skill + AGENTS note
claw-memory uninstall --claude-code    # remove mcp + hooks from settings.json
# Claude Code plugin: /plugin uninstall claw-memory
npm uninstall -g @nogataka/claw-memory

Your memory database is left untouched; delete ~/.claw-memory to wipe it.

Notes

Storage uses Node's built-in node:sqlite (Node >= 24 required), so Node upgrades can't break a native ABI. The sqlite-vec extension ships prebuilt and is version-independent.
Per-prompt injection was removed in v0.3.0 (context-bloat fix): hooks inject a compact block at SessionStart only; pull details on demand with memory_recall / memory_search or the memory-recall skill.
claw-memory cleanse finds (and with --apply tombstones) legacy chunks polluted by raw JSON payloads.
The MCP server is long-lived per agent session, so the embedding model loads once.
Viewer + MCP can run simultaneously — SQLite WAL handles concurrent read/write.
On install, dependencies resolve with legacy-peer-deps=true (a zod peer-range overlap between bundled SDKs); this is configured in .npmrc and is harmless.