agent-recall-cli
v3.4.0
CLI for AgentRecall memory system
Already Using Another Memory System? /arbootstrap
[!NOTE] You don't start from zero. If you've been using Claude's built-in memory, Mem0, or just working in git repos — AgentRecall can discover and import your existing context automatically.
Most users installing AgentRecall aren't starting fresh. They already have:
- Git repos with months of commit history and project structure
- Claude AutoMemory (`~/.claude/projects/`) with user profiles, feedback, and project memories
- CLAUDE.md files with project conventions and architecture decisions
/arbootstrap scans your machine and imports everything in one shot:
/arbootstrap
──────────────────────────────────────────────────────────────
AgentRecall Bootstrap Scan 2026-04-26
──────────────────────────────────────────────────────────────
Found on your machine:
24 git repos
92 Claude memory files
3 CLAUDE.md files
Projects:
18 new (not yet in AgentRecall)
10 already imported
Scan time: 141ms
──────────────────────────────────────────────────────────────
What gets imported per project:
- Identity: project name, language, description → `palace/identity.md`
- Architecture: CLAUDE.md conventions → `palace/rooms/architecture/`
- Memory: Claude AutoMemory files → `palace/rooms/knowledge/`
- Trajectory: recent git history → initial journal entry
Safety guarantees:
- Scan is read-only — never writes to your machine
- Import only writes to `~/.agent-recall/`; it never modifies source files
- Skips `.env`, credentials, `.pem`, `.key`; it never reads secrets
- Projects already in AgentRecall are skipped (no double-import)
For MCP-only environments (Codex, Cursor, VS Code Copilot):
bootstrap_scan() # discover what's on the machine
bootstrap_import({ scan_result }) # import selected projects
For CLI:
ar bootstrap # scan and show results
ar bootstrap --dry-run # preview what would be imported
ar bootstrap --import # import all new projects
ar bootstrap --import --project X # import one project
After bootstrap, run /arstatus. Your projects are ready.
/arstatus, /arsave, /arstart, /arsaveall, and /arbootstrap
[!TIP] New to AgentRecall? Read the → Command Reference — full instructions, all example outputs, installation, and troubleshooting in one place.
[!IMPORTANT] Run `/arstatus` at the start of every session. It shows all your projects, what's pending, and what's blocked, and lets you pick what to work on by number rather than by remembering project names. Without it, a fresh agent has no idea where to begin.
| Command | When | What it does |
|---------|------|-------------|
| ⭐ /arstatus | Every session — run this first | Status board across ALL projects: pending work, blockers, numbered pick list. The true cold start. |
| /arstart | After picking a project | Load deep context for one project: palace rooms, corrections, task-specific recall |
| /arsave | End of session | Write journal + consolidate to palace + update awareness |
| /arsaveall | End of day (multi-session) | Batch save all parallel sessions at once — scan, merge, deduplicate, done |
| /arbootstrap | First install / switching from another memory system | Scan your machine for existing projects and import them into AgentRecall in seconds |
The session flow: /arstatus → pick a number → /arstart <project> → work → /arsave.
Running 5 agents in parallel? Don't /arsave five times. Type /arsaveall once — it scans all of today's sessions across all projects, merges them into consolidated journals, deduplicates insights, and updates awareness in one shot. Each session writes to its own file (session-ID scoped), so no conflicts, no data loss, no matter how many windows you have open.
What You'll See
Type /arstatus → see everything in flight across all projects, pick by number:
──────────────────────────────────────────────────────────────
AgentRecall Status Board 2026-04-21 5 projects
──────────────────────────────────────────────────────────────
1 ⚠ novada-site 2026-04-21 BLOCKED
Blocked: .env.local missing — Phase 1 cannot proceed
2 ● novada-mcp 2026-04-21
Next: fix novada_search POST /request → publish v0.8.0
3 ● prismma-scraper 2026-04-17
Next: UI upgrade Option A — light mode + 3D visuals
4 ✓ AgentRecall 2026-04-21 complete
Collecting real production data
──────────────────────────────────────────────────────────────
Enter a number, or:
N New project (with memory — agent knows your full history)
X New project (clean slate — no prior context, pure objectivity)
──────────────────────────────────────────────────────────────
Type /arsave → the system saves everything and renders a card with exact file paths and counts:
──────────────────────────────────────────────────────────────
AgentRecall ✓ Saved my-project 2026-04-20 #12
──────────────────────────────────────────────────────────────
Journal ~/.agent-recall/projects/my-project/journal/
└─ 2026-04-20--arsave--15L--review-feedback.md [written]
Awareness 2 insights added (8 total)
Palace ~/.agent-recall/projects/my-project/palace/
├─ rooms/Architecture [updated]
└─ rooms/Goals [updated]
Corrections 3 stored (always loaded at session start)
⚡ Similar entries found — consider merging:
2026-04-19 (review, feedback, architecture)
──────────────────────────────────────────────────────────────
Type /arstart → loads all context from memory in one shot:
──────────────────────────────────────────────────────────────
AgentRecall ↻ Loaded my-project 2026-04-21
──────────────────────────────────────────────────────────────
Project my-project — SaaS platform for AI agents
Last session 2026-04-20 — review + feedback loop shipped
Insights (top 3):
[5×] Server-rendered cards beat agent templates
[3×] Per-message dedup beats per-session dedup
[2×] Stemming + synonyms improve keyword recall
⚠ Past corrections — watch out:
- "No dark backgrounds" (corrected 3×)
- "Use bb-browser, not Playwright" (corrected 2×)
Cross-project: 2 related insights from novada-mcp
──────────────────────────────────────────────────────────────
Type /arsaveall → batch-saves all parallel sessions at once:
──────────────────────────────────────────────────────────────
AgentRecall ✓ Batch Saved 2026-04-20
──────────────────────────────────────────────────────────────
Sessions scanned 5
Projects saved my-project, novada-mcp, prismma-scraper
Insights merged 4 (deduplicated from 7)
Corrections 2 new (auto-captured via hooks)
──────────────────────────────────────────────────────────────
Type /arbootstrap → discover and import projects from your existing tools:
──────────────────────────────────────────────────────────────
AgentRecall Bootstrap Complete 2026-04-26
──────────────────────────────────────────────────────────────
12 projects created
87 items imported (identity, memory, architecture, trajectory)
3 items skipped
0 errors
Run /arstatus to see your projects.
──────────────────────────────────────────────────────────────
The cards are rendered server-side: they are computed from actual operation results, not from the agent's interpretation, so what you see reflects what was actually written.
# Install commands (one-time, Claude Code only)
mkdir -p ~/.claude/commands
curl -o ~/.claude/commands/arstatus.md https://raw.githubusercontent.com/Goldentrii/AgentRecall/main/commands/arstatus.md
curl -o ~/.claude/commands/arstart.md https://raw.githubusercontent.com/Goldentrii/AgentRecall/main/commands/arstart.md
curl -o ~/.claude/commands/arsave.md https://raw.githubusercontent.com/Goldentrii/AgentRecall/main/commands/arsave.md
curl -o ~/.claude/commands/arsaveall.md https://raw.githubusercontent.com/Goldentrii/AgentRecall/main/commands/arsaveall.md
curl -o ~/.claude/commands/arbootstrap.md https://raw.githubusercontent.com/Goldentrii/AgentRecall/main/commands/arbootstrap.md
The Difference
WITHOUT AgentRecall WITH AgentRecall
────────────────── ────────────────
Day 1: Build monorepo Day 1: /arstart → /arsave
Day 2: "What monorepo?" Day 2: /arstart
→ 20 min re-explaining → 2 sec: loads all decisions
→ Agent repeats same mistakes → Knows "no version inflation"
→ Forgets your priorities → Knows "arsave = hero section"
→ Misses half the tasks → Pushes to both repos
WITHOUT AgentRecall (5 parallel agents) WITH AgentRecall (5 parallel agents)
────────────────────────────────────── ────────────────────────────────────
Agent 1 finishes: you /arsave Agent 1-5 finish: you type /arsaveall once
Agent 2 finishes: you /arsave again → Scans all 5 sessions automatically
Agent 3 finishes: you /arsave again → Merges into consolidated journals
Agent 4 finishes: you /arsave again → Deduplicates insights across sessions
Agent 5 finishes: you /arsave again → Zero conflicts (session-ID scoped files)
→ 5x the work, corrections lost → One command, everything saved
→ Agent 3's correction unknown to Agent 5 → All agents share the same memory
Three Layers of Value
Layer 1 (5 seconds): It makes your AI agent remember what happened last session.
Layer 2 (30 seconds): Every time you correct your agent — "no, not that version", "ask me first" — that correction is stored permanently and recalled before the agent makes the same mistake again. After 10 sessions, your agent understands your priorities, your communication style, your non-negotiables.
Layer 3 (2 minutes): The Intelligent Distance Protocol. The structural gap between human thinking and AI action can't be closed — but it can be navigated better every session. Corrections are training data. The 200-line awareness cap forces quality over quantity. Cross-project insights mean lessons learned once apply everywhere.
Why Choose AgentRecall
AgentRecall is not a memory tool. It's a learning loop.
Memory is the mechanism. Understanding is the goal. Every time you correct your agent — "no, not that version", "put this section first", "ask me before you assume" — that correction is stored, weighted, and recalled next time. After 10 sessions, your agent doesn't just remember your project. It understands how you think: your priorities, your communication style, your non-negotiables.
Your agent learns how you think. Humans are inconsistent — we skip from A to E, forget what we said yesterday, change priorities mid-sentence. AgentRecall captures every correction and surfaces it before the next mistake. The gap between what you mean and what your agent does shrinks with every session.
Compounding awareness, not infinite logs. Memory is capped at 200 lines. New insights either merge with existing ones (strengthening them) or replace the weakest. After 100 sessions, your awareness file is still 200 lines — but each line carries the weight of cross-validated, confirmed observations.
Cross-project recall. Lessons learned in one project apply everywhere. Built a rate limiter last month? That lesson surfaces when you're building one today — in a different repo, through a different agent.
Near-universal compatibility. MCP server for any MCP-compatible agent (Claude Code, Cursor, Windsurf, VS Code, Codex). SDK for any JS/TS framework (LangChain, CrewAI, Vercel AI SDK, custom agents). CLI for terminal and CI workflows. One memory system, every surface.
Zero cloud, zero telemetry, all local. Everything is markdown on disk. Browse it in Obsidian, grep it in the terminal, version it in git. No accounts, no API keys, no lock-in.
Three Ways to Use It
MCP for AI agents (Claude Code, Cursor, Windsurf, VS Code, Codex):
claude mcp add --scope user agent-recall -- npx -y agent-recall-mcp
SDK for any JS/TS application (LangChain, CrewAI, Vercel AI SDK, custom):
import { AgentRecall } from "agent-recall-sdk";
const memory = new AgentRecall({ project: "my-app" });
await memory.capture("What stack?", "Next.js + Postgres");
CLI for terminal workflows and CI:
npx agent-recall-cli capture "What stack?" "Next.js + Postgres"
npx agent-recall-cli palace walk --depth active
What Is AgentRecall?
A learning system that bridges the gap between how humans think and how AI agents work. Not a log. Not a database. A compounding loop where every correction, decision, and insight makes the next session better than the last.
| Without AgentRecall | With AgentRecall |
|---------------------|------------------|
| Agent forgets yesterday's decisions | Decisions live in palace rooms, loaded on cold start |
| Same mistake repeated across sessions | recall_insight surfaces past lessons before work starts |
| 5 min context recovery on each session start | 2 second cold start from palace (~200 tokens) |
| Flat memory files that grow forever | 200-line awareness cap forces merge-or-replace |
| Knowledge trapped in one project | Cross-project insights match by keyword |
| Agent misunderstands, you correct, it forgets | alignment_check records corrections permanently |
Quick Start
MCP Server (for AI agents)
# Claude Code
claude mcp add --scope user agent-recall -- npx -y agent-recall-mcp
# Cursor — .cursor/mcp.json
{ "mcpServers": { "agent-recall": { "command": "npx", "args": ["-y", "agent-recall-mcp"] } } }
# VS Code — .vscode/mcp.json
{ "servers": { "agent-recall": { "command": "npx", "args": ["-y", "agent-recall-mcp"] } } }
# Windsurf — ~/.codeium/windsurf/mcp_config.json
{ "mcpServers": { "agent-recall": { "command": "npx", "args": ["-y", "agent-recall-mcp"] } } }
# Codex
codex mcp add agent-recall -- npx -y agent-recall-mcp
Skill (Claude Code only):
mkdir -p ~/.claude/skills/agent-recall
curl -o ~/.claude/skills/agent-recall/SKILL.md \
https://raw.githubusercontent.com/Goldentrii/AgentRecall/main/SKILL.md
SDK (for JS/TS applications)
npm install agent-recall-sdk
import { AgentRecall } from "agent-recall-sdk";
const memory = new AgentRecall({ project: "my-app" });
await memory.coldStart(); // load context (~200 tokens)
await memory.capture("Q", "A"); // quick capture
await memory.palaceWrite("architecture", "Stack: Next.js + Drizzle");
CLI (for terminal and CI)
npm install -g agent-recall-cli
ar capture "What ORM?" "Drizzle" --project my-app
ar palace walk --depth active
ar search "rate limiting" --include-palace
Semantic Recall: pgvector Backend (v3.4.0)
[!NOTE] New in v3.4.0. Default keyword recall works without any configuration. Upgrade to Supabase + pgvector when keyword search hits its ceiling: synonyms, paraphrased queries, multi-language recall.
Keyword search matches tokens. Semantic search matches meaning. After upgrading:
- `recall("session expiry")` also surfaces entries about "token refresh" and "auth timeout"
- No hand-crafted synonyms needed: embeddings handle the gap
- Local files remain the source of truth; Supabase is a derived read index
Setup (3 steps)
# Step 1 — interactive setup wizard
ar setup supabase
# Step 2 — apply pgvector migration to your Supabase project
ar setup supabase --migrate
# Step 3: done. Run /arstart and the backfill happens automatically.
The wizard prompts for your Supabase project URL and service role key, then writes them to ~/.agent-recall/config.json. This file is never committed to git.
How it works
remember() / session_end()
→ writes to local ~/.agent-recall/ as always ← source of truth, unchanged
→ async: syncs to Supabase memories table
→ OpenAI text-embedding-3-small (1536 dims)
or Voyage voyage-3-lite (512 dims, zero-padded to 1536)
→ pgvector stores the embedding
recall(query)
→ Supabase configured?
YES → cosine similarity search via pgvector, reranked with local RRF
NO → local keyword search (existing behavior, unchanged)
Graceful degradation
If Supabase is unreachable (network error, quota exceeded, not configured), recall() falls back to local keyword search silently. Zero behavior change if the feature is never configured.
Semantic recall requires SUPABASE_URL + SUPABASE_SERVICE_ROLE_KEY + either OPENAI_API_KEY or VOYAGE_API_KEY. All of these are optional for AgentRecall itself, which works fully without them.
Rebuild index: ar setup supabase --backfill — re-embeds all local memories into Supabase.
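The fallback and zero-padding behavior described in this section can be sketched roughly as below. This is an illustration under stated assumptions, not AgentRecall's actual code: `zeroPad` and `recallWithFallback` are hypothetical names, and the real recall path is asynchronous, while this sketch is synchronous to stay short.

```typescript
// Hypothetical helper: pad a shorter embedding (e.g. 512 dims) with zeros
// so it fits a fixed-width vector column (e.g. 1536 dims).
function zeroPad(embedding: number[], targetDims = 1536): number[] {
  if (embedding.length > targetDims) {
    throw new Error("embedding is longer than the target dimension");
  }
  const padding = new Array<number>(targetDims - embedding.length).fill(0);
  return [...embedding, ...padding];
}

// Hypothetical helper: try the semantic backend first, and fall back to
// local keyword search when it is not configured (null) or throws.
function recallWithFallback<T>(
  semantic: (() => T[]) | null,
  keyword: () => T[],
): T[] {
  if (semantic === null) return keyword();
  try {
    return semantic();
  } catch {
    // Network error, quota exceeded, and similar failures end up here.
    return keyword();
  }
}
```

Zero-padding leaves cosine similarity between two padded vectors unchanged, because the extra zeros contribute nothing to either the dot product or the norms.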
How an Agent Uses AgentRecall
Automatic (Zero Discipline — Hooks)
Wire once in ~/.claude/settings.json. Every session is captured automatically, even without /arsave:
{
"hooks": {
"SessionStart": [{
"command": "node ~/.local/share/npm/lib/node_modules/agent-recall-cli/dist/index.js hook-start 2>/dev/null || true"
}],
"UserPromptSubmit": [{
"command": "node ~/.local/share/npm/lib/node_modules/agent-recall-cli/dist/index.js hook-correction 2>/dev/null || true"
}],
"Stop": [{
"command": "node ~/.local/share/npm/lib/node_modules/agent-recall-cli/dist/index.js hook-end 2>/dev/null || true"
}]
}
}
- hook-start: on every session open, prints identity + top insights + watch_for warnings
- hook-correction: on every prompt, detects corrections (regex) and captures them silently
- hook-end: on every session close, appends a lightweight end-of-session log entry
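To make the regex-based correction detection concrete, here is a minimal sketch. The patterns and the `looksLikeCorrection` name are assumptions chosen for illustration; the patterns that hook-correction actually uses are not listed in this README.

```typescript
// Hypothetical patterns: phrases that often signal the human is correcting
// the agent. None of these are taken from the real hook implementation.
const CORRECTION_PATTERNS: RegExp[] = [
  /^\s*no\b/i,                  // prompt opens with "no"
  /\bnot that\b/i,              // "not that version"
  /\binstead of\b/i,            // "use X instead of Y"
  /\bask me (first|before)\b/i, // "ask me first"
  /\bdon'?t\b/i,                // "don't" or "dont"
];

// Returns true when any pattern matches the prompt text.
function looksLikeCorrection(prompt: string): boolean {
  return CORRECTION_PATTERNS.some((pattern) => pattern.test(prompt));
}
```

A pattern list like this is cheap to run on every prompt, at the cost of false positives and misses that a real implementation would need to tune.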
Session Start (/arstart)
session_start() → identity, insights, active rooms, cross-project matches,
recent journal briefs, watch_for warnings — all in one call
recall(query) → surface task-specific past knowledge from all stores
During Work
remember("We decided to use GraphQL instead of REST") → auto-routes to the right store
recall("authentication design") → searches all stores, ranked results
check(goal="build auth", confidence="medium") → verify understanding, get warnings
Session End (/arsave)
session_end(summary="...", insights=[...], trajectory="...") → journal + awareness + consolidation
10 MCP Tools
AgentRecall exposes 10 tools to agents. Each tool composes multiple subsystems internally — the agent doesn't need to know about the plumbing.
| Tool | What it does |
|------|-------------|
| session_start | Load project context for a new session. Returns identity, top insights, active rooms, cross-project matches, recent activity, and predictive watch_for warnings from past corrections. One call, ~400 tokens. |
| remember | Save a memory. Auto-classifies content (bug fix, architecture decision, insight, session note) and routes to the right store (journal, palace, knowledge, or awareness). Auto-generates semantic names for future retrieval. |
| recall | Search all memory stores at once using Reciprocal Rank Fusion (RRF) — each source ranks internally, then positions are merged so no source dominates by default. Returns ranked results with stable IDs. Accepts feedback to rate previous results: positive boosts future ranking, negative penalizes. Query-aware — feedback from one search doesn't bleed into unrelated queries. |
| session_end | Save everything in one call. Writes journal, updates awareness with new insights, consolidates to palace rooms, archives demoted insights (not deleted — preserved with resurrection support). |
| check | Record what you think the human wants. Returns watch_for patterns from past correction history ("You've been corrected on X 3 times — ask about it"). Accepts human_correction and delta after the human responds. Auto-promotes strong patterns (3+) to awareness. |
| digest | Context cache — store pre-computed analysis results (codebase audits, subagent explorations) and recall them instead of recomputing. Actions: store, recall, read, invalidate. Scoring uses Ebbinghaus decay with Zipf-adjusted half-life. Benchmarked: 83% token savings on repeated analysis vs. recompute. |
| project_board | Status board across all AgentRecall projects — same data as /arstatus. Returns numbered project list with pending work, blockers, and last activity. Use at the start of any multi-project session. |
| project_status | Deep status for a single project — next actions, blockers, recent journal summary, palace room health. Use after project_board to pick and focus. |
| bootstrap_scan | Scan the machine for existing projects (git repos, Claude AutoMemory, CLAUDE.md files). Read-only — no writes. Returns scan results for review before import. |
| bootstrap_import | Import projects discovered by bootstrap_scan into AgentRecall. Writes identity, architecture, memory, and trajectory to ~/.agent-recall/. Safe: never modifies source files. |
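The recall row above mentions Reciprocal Rank Fusion. A minimal sketch of generic RRF follows, assuming each store returns an ordered list of memory IDs. The function name and the constant `k = 60` (a common default for RRF) are assumptions, not values confirmed for AgentRecall.

```typescript
// Generic Reciprocal Rank Fusion: each source contributes 1 / (k + rank)
// for every item it ranks, and the contributions are summed per item.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60, // assumed smoothing constant
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Example: three hypothetical stores rank overlapping memory IDs.
const fused = reciprocalRankFusion([
  ["a", "b", "c"],
  ["b", "a", "d"],
  ["b", "c"],
]);
```

Because only positions are used, a store with unusually large raw scores cannot dominate the merged list.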
Legacy tools
The original 22 subsystem tools (palace_write, journal_capture, awareness_update, etc.) remain available via the SDK and CLI for backward compatibility and advanced use cases. They are not registered in the MCP server by default.
How Memory Compounds
Memory is not a list. It's a compounding system where 1+1+1 > 3. Each subsystem feeds the next — naming enables retrieval, retrieval enables feedback, feedback enables ranking, ranking surfaces the right memory at the right time.
1. Auto-Naming
The agent knows content best at the moment of saving. AgentRecall captures that understanding in a semantic slug — not "mcp-verified" but "verified-agentrecall-mcp-22tools-functional". Good naming IS the first layer of retrieval. A well-named memory is 80% findable without any search algorithm.
File naming: {date}--{saveType}--{lines}L--{slug}.md — parseable by agents (split("--") → [date, type, size, topic]), readable by humans. Line count tells the agent the token cost before opening the file.
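As a sketch of how an agent might parse this naming scheme, the snippet below splits a file name on the `--` separator. The `ParsedEntryName` type and `parseEntryName` function are hypothetical names for illustration, and the sketch assumes the slug itself never contains a double dash.

```typescript
// Hypothetical shape for a parsed journal file name.
interface ParsedEntryName {
  date: string;     // e.g. "2026-04-20"
  saveType: string; // e.g. "arsave"
  lines: number;    // line count, a rough proxy for token cost
  slug: string;     // semantic topic slug
}

function parseEntryName(fileName: string): ParsedEntryName | null {
  // Drop the ".md" extension, then split on the double-dash separator.
  const parts = fileName.replace(/\.md$/, "").split("--");
  if (parts.length !== 4) return null;
  const [date, saveType, size, slug] = parts;
  // The size segment looks like "15L": digits followed by a literal "L".
  const match = /^(\d+)L$/.exec(size);
  if (!match) return null;
  return { date, saveType, lines: Number(match[1]), slug };
}

const parsedExample = parseEntryName("2026-04-20--arsave--15L--review-feedback.md");
```

Returning `null` for names that do not fit the pattern lets a caller skip unrelated files in the journal directory.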
2. Indexes
| Index | What it tracks | Token cost |
|-------|---------------|------------|
| Palace index | Room catalog + salience scores | ~50 tokens to scan |
| Insights index | Cross-project lessons + keyword matching | ~30 tokens to query |
| Awareness | 200-line compounding document (forced merge) | ~200 tokens, each line cross-validated |
3. Relativity
Memories that relate to each other are connected automatically — no wikilinks needed. When you recall("session security"), the system surfaces keyword-matched memories across connected rooms. Edges are stored in graph.json — relativity turns isolated memories into a knowledge graph.
4. Weight + Decay
Not all memories are equal. Salience scoring: recency(0.30) + access(0.25) + connections(0.20) + urgency(0.15) + importance(0.10)
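The salience formula is a plain weighted sum. The sketch below uses the weights stated above and assumes, as a simplification, that each signal has already been normalized to the range [0, 1]; how AgentRecall derives the individual signals is not specified here, and the type and function names are hypothetical.

```typescript
// One value per signal named in the salience formula, each assumed in [0, 1].
interface SalienceSignals {
  recency: number;
  access: number;
  connections: number;
  urgency: number;
  importance: number;
}

// Weights copied from the formula in the text; they sum to 1.
const SALIENCE_WEIGHTS: SalienceSignals = {
  recency: 0.30,
  access: 0.25,
  connections: 0.20,
  urgency: 0.15,
  importance: 0.10,
};

function salience(signals: SalienceSignals): number {
  return (
    SALIENCE_WEIGHTS.recency * signals.recency +
    SALIENCE_WEIGHTS.access * signals.access +
    SALIENCE_WEIGHTS.connections * signals.connections +
    SALIENCE_WEIGHTS.urgency * signals.urgency +
    SALIENCE_WEIGHTS.importance * signals.importance
  );
}
```

With normalized inputs the score also stays in [0, 1], which makes a fixed archive threshold straightforward to apply.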
recall applies the Ebbinghaus forgetting curve R(t) = e^(−t/S) with memory-type-specific strength values:
| Memory type | S (days) | 1-day retention | 1-week retention |
|-------------|----------|-----------------|------------------|
| Journal (episodic) | 2 | 60% | ~3% |
| Knowledge / bug fix (procedural) | 180 | 99% | 96% |
| Palace / decisions (semantic) | 9999 | ≈100% | ≈100% |
Old journal noise fades in days. Architecture decisions persist indefinitely. Hot-window boost: Items from the last 6 hours get a 3× score multiplier, last 24 hours get 2×, last 72 hours get 1.3×.
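The decay curve and the hot-window boost can be sketched as below. The strength values and multipliers come from the text; the function names, the inclusive window boundaries, and the choice to multiply retention by the boost are assumptions made for illustration.

```typescript
// Strength S in days per memory type, as listed in the table above.
const STRENGTH_DAYS = { journal: 2, knowledge: 180, palace: 9999 } as const;
type MemoryType = keyof typeof STRENGTH_DAYS;

// Ebbinghaus forgetting curve: R(t) = e^(-t / S), with t and S in days.
function retention(ageDays: number, type: MemoryType): number {
  return Math.exp(-ageDays / STRENGTH_DAYS[type]);
}

// Hot-window multiplier for very recent items (boundaries assumed inclusive).
function hotWindowBoost(ageHours: number): number {
  if (ageHours <= 6) return 3;
  if (ageHours <= 24) return 2;
  if (ageHours <= 72) return 1.3;
  return 1;
}

// Assumed combination: base relevance scaled by retention and by the boost.
function decayedScore(base: number, ageHours: number, type: MemoryType): number {
  return base * retention(ageHours / 24, type) * hotWindowBoost(ageHours);
}
```

For a journal entry, `retention(1, "journal")` evaluates e^(-1/2), which is about 0.61.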
5. Feedback Loop
The system models each memory's usefulness with a Bayesian Beta distribution over binary feedback. With pos positive and neg negative ratings and a uniform prior, the expected usefulness is E[Beta(α,β)] = α/(α+β) = (pos+1)/(pos+neg+2), where α = pos+1 and β = neg+1. Feedback is query-aware, not global: rating a result "useless" for one query doesn't penalize it for unrelated queries.
Feedback is now automatic. The ambient recall hook tracks which memories were surfaced. If the human's next message is a correction, those memories receive negative feedback; otherwise they receive positive feedback. No agent action is required.
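A minimal sketch of query-aware Beta feedback follows. The `FeedbackLog` class and its key format are hypothetical; only the (pos+1)/(pos+neg+2) estimate is taken from the text.

```typescript
// Hypothetical in-memory feedback store keyed by (query, memory id), so a
// rating given under one query never affects the same memory under another.
class FeedbackLog {
  private counts = new Map<string, { pos: number; neg: number }>();

  private key(query: string, memoryId: string): string {
    return `${query.trim().toLowerCase()}::${memoryId}`;
  }

  record(query: string, memoryId: string, useful: boolean): void {
    const k = this.key(query, memoryId);
    const current = this.counts.get(k) ?? { pos: 0, neg: 0 };
    this.counts.set(k, {
      pos: current.pos + (useful ? 1 : 0),
      neg: current.neg + (useful ? 0 : 1),
    });
  }

  // Posterior mean of Beta(pos + 1, neg + 1): (pos + 1) / (pos + neg + 2).
  usefulness(query: string, memoryId: string): number {
    const c = this.counts.get(this.key(query, memoryId)) ?? { pos: 0, neg: 0 };
    return (c.pos + 1) / (c.pos + c.neg + 2);
  }
}
```

With no feedback the estimate is 0.5, so unrated memories are neither boosted nor penalized.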
The Compounding Effect
Session 1: Save 3 memories (auto-named, indexed, edges created)
Session 5: Recall surfaces memories from sessions 1-4, feedback refines ranking
Session 10: watch_for warns agent about past mistakes before they repeat
Session 20: Awareness contains 10 cross-validated insights (merged from 40+ raw observations)
Session 50: The agent knows your priorities, blind spots, and communication style
(not because it was told, but because every correction compounded)
Stemming + synonyms: "deploying" matches "deployment," "ship," and "release." A 19-rule suffix stemmer plus a 100-pair synonym table handle this in keyword mode, with no vector DB and zero external dependencies.
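The stemming and synonym idea can be illustrated with a toy version. The four suffix rules and two synonym pairs below are made up for this example and are far smaller than the 19 rules and 100 pairs the text describes; the function names are hypothetical.

```typescript
// Toy suffix rules, checked in order; the first matching rule wins.
const TOY_SUFFIXES = ["ment", "ing", "ed", "s"];

// Toy synonym table mapping a stem to a canonical stem.
const TOY_SYNONYMS = new Map<string, string>([
  ["ship", "deploy"],
  ["release", "deploy"],
]);

function toyStem(word: string): string {
  const lower = word.toLowerCase();
  for (const suffix of TOY_SUFFIXES) {
    // Keep at least three characters so short words are not over-stripped.
    if (lower.endsWith(suffix) && lower.length - suffix.length >= 3) {
      return lower.slice(0, -suffix.length);
    }
  }
  return lower;
}

// Stem first, then map synonyms onto one canonical form.
function canonicalTerm(word: string): string {
  const stemmed = toyStem(word);
  return TOY_SYNONYMS.get(stemmed) ?? stemmed;
}

function sameConcept(a: string, b: string): boolean {
  return canonicalTerm(a) === canonicalTerm(b);
}
```

Here "deploying" and "deployment" both stem to "deploy", while "ship" and "release" reach it through the synonym table.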
SDK API
The agent-recall-sdk package exposes the AgentRecall class — a programmatic interface to the full memory system. Use it to add persistent, compounding memory to any JS/TS agent framework.
import { AgentRecall } from "agent-recall-sdk";
const ar = new AgentRecall({ project: "my-project" });
Core Methods
| Method | Returns | Description |
|--------|---------|-------------|
| capture(question, answer, opts?) | JournalCaptureResult | Quick Q&A capture (L1 memory) |
| journalWrite(content, opts?) | JournalWriteResult | Write daily journal entry |
| journalRead(opts?) | JournalReadResult | Read journal by date or "latest" |
| journalSearch(query, opts?) | JournalSearchResult | Full-text search across journals |
| coldStart() | JournalColdStartResult | Palace-first context loading (~200 tokens) |
Palace Methods
| Method | Returns | Description |
|--------|---------|-------------|
| palaceWrite(room, content, opts?) | PalaceWriteResult | Write to a room with fan-out cross-refs |
| palaceRead(room?, topic?) | PalaceReadResult | Read room content or list all rooms |
| walk(depth?, focus?) | PalaceWalkResult | Progressive walk: identity → active → relevant → full |
| palaceSearch(query, room?) | PalaceSearchResult | Search rooms by content |
| lint(fix?) | PalaceLintResult | Health check and auto-archive |
Awareness & Insight Methods
| Method | Returns | Description |
|--------|---------|-------------|
| awarenessUpdate(insights, opts?) | AwarenessUpdateResult | Compound new insights into awareness |
| readAwareness() | string | Read the 200-line awareness document |
| recallInsight(context, opts?) | RecallInsightResult | Cross-project insight recall |
Alignment Methods
| Method | Returns | Description |
|--------|---------|-------------|
| alignmentCheck(input) | AlignmentCheckResult | Record confidence + assumptions |
| nudge(input) | NudgeResult | Detect contradictions with past decisions |
| synthesize(opts?) | ContextSynthesizeResult | L3 synthesis, optional palace consolidation |
CLI Commands
The agent-recall-cli package provides the ar command for terminal workflows and CI pipelines.
ar v3.4.0 — AgentRecall CLI
JOURNAL:
ar read [--date YYYY-MM-DD] [--section <name>]
ar write <content> [--section <name>]
ar capture <question> <answer> [--tags tag1,tag2]
ar list [--limit N]
ar search <query> [--include-palace]
ar state read|write [data]
ar cold-start
ar archive [--older-than-days N]
ar rollup [--min-age-days N] [--dry-run]
PALACE:
ar palace read [<room>] [--topic <name>]
ar palace write <room> <content> [--importance high|medium|low]
ar palace walk [--depth identity|active|relevant|full]
ar palace search <query>
ar palace lint [--fix]
AWARENESS:
ar awareness read
ar awareness update --insight "title" --evidence "ev" --applies-when kw1,kw2
INSIGHT:
ar insight <context> [--limit N]
SETUP:
ar setup supabase # interactive Supabase setup wizard
ar setup supabase --migrate # apply pgvector migration
ar setup supabase --backfill # re-embed all local memories
META:
ar projects
ar synthesize [--entries N]
ar knowledge write --category <cat> --title "t" --what "w" --cause "c" --fix "f"
ar knowledge read [--category <cat>]
ar bootstrap [--dry-run] [--import] [--project X]
HOOKS (auto-wired via settings.json — zero discipline required):
ar hook-start # SessionStart: prints identity + insights + watch_for
ar hook-correction # UserPromptSubmit: silently captures corrections from prompt
ar hook-end # Stop: appends end-of-session log entry
GLOBAL FLAGS:
--root <path> Storage root (default: ~/.agent-recall)
--project <slug> Project override
Architecture
Five-Layer Memory Pyramid
L1: Working Memory journal_capture "what happened"
L2: Episodic Memory journal_write "what it means"
L3: Memory Palace palace_write / walk "knowledge across sessions"
L4: Awareness awareness_update "compounding insights"
L5: Insight Index recall_insight "cross-project experience"
Key Mechanisms
Fan-out writes — Write to one room, cross-references auto-update in related rooms via [[wikilinks]]. Mechanical, zero LLM cost.
Salience scoring — Every room has a salience score: recency(0.30) + access(0.25) + connections(0.20) + urgency(0.15) + importance(0.10). High-salience rooms surface first. Below threshold → auto-archive.
Compounding awareness — awareness.md is capped at 200 lines. When new insights are added, similar existing ones merge (strengthen), dissimilar ones compete (lowest-confirmation gets replaced). The constraint creates compression. Compression creates compounding.
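The merge-or-replace rule can be sketched as follows. This is a simplified model, not the actual implementation: the cap counts insights rather than lines, "similar" is reduced to an exact match on the normalized title, and the `Insight` type and `addInsight` function are hypothetical.

```typescript
interface Insight {
  title: string;
  confirmations: number; // how many times this insight has been confirmed
}

// Returns a new awareness list; the input array is not mutated.
function addInsight(awareness: Insight[], title: string, cap: number): Insight[] {
  const normalize = (t: string) => t.trim().toLowerCase();
  const existing = awareness.find((i) => normalize(i.title) === normalize(title));

  // Merge: a similar insight exists, so strengthen it.
  if (existing) {
    return awareness.map((i) =>
      i === existing ? { ...i, confirmations: i.confirmations + 1 } : i,
    );
  }

  const fresh: Insight = { title, confirmations: 1 };
  if (awareness.length < cap) return [...awareness, fresh];

  // Replace: at the cap, the lowest-confirmation insight gives way.
  let weakest = 0;
  for (let idx = 1; idx < awareness.length; idx++) {
    if (awareness[idx].confirmations < awareness[weakest].confirmations) {
      weakest = idx;
    }
  }
  return awareness.map((i, idx) => (idx === weakest ? fresh : i));
}
```

Under this rule the list never grows past the cap, and an insight survives only by being confirmed more often than its competitors.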
Cross-project insight recall — insights-index.json maps insights to situations via keywords. recall_insight("building quality gates") returns relevant lessons from any project, ranked by severity x confirmation count.
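A rough sketch of keyword-matched recall ranked by severity × confirmation count is shown below. The `IndexedInsight` fields and the `recallInsights` function are assumptions for illustration; the actual schema of insights-index.json is not documented in this README.

```typescript
// Hypothetical index entry; field names are illustrative only.
interface IndexedInsight {
  project: string;
  lesson: string;
  keywords: string[];
  severity: number;      // assumed numeric, higher means more serious
  confirmations: number; // how often the lesson was confirmed
}

function recallInsights(
  index: IndexedInsight[],
  context: string,
  limit = 3,
): IndexedInsight[] {
  // Tokenize the context into lowercase words.
  const words = new Set(context.toLowerCase().split(/\W+/).filter(Boolean));
  return index
    // Keep insights that share at least one keyword with the context.
    .filter((i) => i.keywords.some((k) => words.has(k.toLowerCase())))
    // Rank by severity multiplied by confirmation count, highest first.
    .sort((a, b) => b.severity * b.confirmations - a.severity * a.confirmations)
    .slice(0, limit);
}
```

Because matching is project-agnostic, a lesson recorded in one project can surface in any other project whose context shares a keyword.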
Obsidian-compatible — Every palace file has YAML frontmatter + [[wikilinks]]. Open palace/ as an Obsidian vault → graph view shows room connections. Zero Obsidian dependency.
Storage Layout
~/.agent-recall/
awareness.md # 200-line compounding document (global)
awareness-state.json # Structured awareness data
awareness-archive.json # Demoted insights (preserved, not deleted)
insights-index.json # Cross-project insight matching
config.json # Optional: Supabase URL + keys (never git-committed)
projects/
<project>/
journal/
YYYY-MM-DD.md # Daily journal
YYYY-MM-DD-log.md # L1 captures (hook-start/hook-end entries)
YYYY-MM-DD.state.json # JSON state
index.jsonl # Fast machine-scannable index of all entries
palace/
identity.md # ~50 token project identity card
palace-index.json # Room catalog + salience scores
graph.json # Cross-reference edges (relativity)
feedback-log.json # Per-query feedback scores (recall learning)
alignment-log.json # Past corrections for watch_for patterns
rooms/
goals/ # Active goals, evolution
architecture/ # Technical decisions, patterns
decisions/ # Decision trails with prior/posterior tracking
blockers/ # Current and resolved
alignment/ # Human corrections
knowledge/ # Learned lessons by category
<custom>/ # Agents create rooms on demand
Platform Compatibility
| Platform | MCP | SDK | CLI | Notes |
|----------|:---:|:---:|:---:|-------|
| Claude Code | ✅ | ✅ | ✅ | Full support — MCP + SKILL.md + commands |
| Cursor | ✅ | ✅ | ✅ | MCP via .cursor/mcp.json |
| VS Code (Copilot) | ✅ | ✅ | ✅ | MCP via .vscode/mcp.json |
| Windsurf | ✅ | ✅ | ✅ | MCP via mcp_config.json |
| OpenAI Codex | ✅ | ✅ | ✅ | codex mcp add — config.toml |
| Claude Desktop | ✅ | — | — | MCP server |
| LangChain / LangGraph | — | ✅ | — | new AgentRecall() in your chain |
| CrewAI | — | ✅ | — | SDK in tool definitions |
| Vercel AI SDK | — | ✅ | — | SDK in server actions |
| Custom JS/TS agents | — | ✅ | ✅ | SDK + CLI for any agent framework |
| CI / GitHub Actions | — | — | ✅ | npx agent-recall-cli in workflows |
| Any MCP agent | ✅ | — | — | Standard MCP protocol |
Benchmarked Token Savings
We ran two controlled benchmarks: a 5-round A/B test (Next.js + Drizzle + Stripe project) and a 10-round v3.3.16 benchmark validating digest cache, arsaveall, and cross-project recall. Read this table honestly: for simple throwaway tasks, AR is pure overhead. For anything with 3+ sessions, corrections, or multiple agents, it pays for itself — and the savings compound.
| Scenario | Without AR | With AR | Saved |
|----------|:---------:|:------:|:--------:|
| A: Simple (2 sessions, 0 corrections) | 567 | 1,131 | +99% overhead |
| B: Medium (5 sessions, 1 correction) | 6,220 | 4,382 | -30% |
| C: Complex (20 sessions, 5 corrections) | 40,910 | 17,520 | -57% |
| D: Multi-agent (3 agents × 5 sessions) | 20,781 | 13,140 | -37% |
| E: Digest cache (repeated analysis, 1 recall hit) | ~2,400 | ~400 | -83% |
Break-even: ~3-4 sessions. After that, every session with AR is cheaper than without.
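The percentages in the scenario table follow from the token counts with simple arithmetic. The helper below, a hypothetical `tokenDeltaPercent`, shows the calculation using the figures reported above.

```typescript
// Relative change in tokens when using AR, as a rounded percentage.
// Positive means overhead, negative means savings.
function tokenDeltaPercent(withoutAr: number, withAr: number): number {
  return Math.round(((withAr - withoutAr) / withoutAr) * 100);
}

// Token counts copied from the benchmark table.
const scenarioDeltas = {
  simple: tokenDeltaPercent(567, 1131),
  medium: tokenDeltaPercent(6220, 4382),
  complex: tokenDeltaPercent(40910, 17520),
  multiAgent: tokenDeltaPercent(20781, 13140),
  digestCache: tokenDeltaPercent(2400, 400),
};
```

The same function applied to your own session logs gives a quick check of whether a project has passed the break-even point.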
Where the Savings Come From
| Source | Without AR cost | With AR cost | Why |
|--------|:-:|:-:|-----|
| Context rebuild | Up to ~1,100+ tokens/session | Fixed ~385 tokens (cold start) | AR loads palace context in one call |
| Correction retention | ~800 tokens per repeat | 0 (stored once, never repeated) | Biggest single savings driver in long projects |
| Clarification avoidance | ~400 tokens/session | 0 (already loaded) | Steady per-session savings |
| Cross-project recall | ~500 tokens per insight | ~350 tokens (automatic recall) | Compounds across projects |
| Digest cache | ~2,400 tokens (full re-analysis) | ~400 tokens (recall stored digest) | 83% savings on repeated heavy analysis |
All benchmark code: benchmark/run.mjs, benchmark/ab-comparison.mjs, and benchmark/v3316-benchmark.mjs. Run them yourself: node benchmark/run.mjs && node benchmark/ab-comparison.mjs.
Docs
| Document | Description |
|----------|-------------|
| → Command Reference | Full guide to /arstatus, /arstart, /arsave, /arsaveall — example outputs, modes, palace rules, troubleshooting |
| Intelligent Distance Protocol | The foundational theory — why the gap between human and AI is structural, and how to navigate it |
| Scoring Design Rationale | Why the scoring system works this way — RRF, Ebbinghaus, Beta distribution, and the bugs they fix |
| MCP Adapter Spec | Technical spec for building adapters on top of AgentRecall |
| SDK Design | Design doc for the SDK architecture |
| Upgrade v3.4 | Changelog: semantic recall, pgvector backend, 10 MCP tools, bootstrap, palace decisions room |
| MCP Server README | Focused guide for Claude Code / Cursor / Windsurf users |
| Core SDK README | SDK API reference for building with AgentRecall programmatically |
Contributing
- Issues & feedback: GitHub Issues
- Email: [email protected]
- Website: novada.com
MIT License.
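AgentRecall (Chinese documentation, translated)
Does your agent forget things? Misunderstand what you say? Make a mess of every project?
AgentRecall teaches it to understand how you think.
It gives agents long-term memory, lets them learn from and correct their mistakes, and evolves with time and project difficulty, so the agent becomes steadily better at understanding how both the user and the agent think.
Persistent compounding memory + Intelligent Distance Protocol. MCP server + SDK + CLI.
/arstatus, /arsave, /arstart, and /arsaveall
[!IMPORTANT] Run /arstatus first in every new session. It shows the status of all your projects, pending work, and blockers, and lets you pick the next step by number, with no need to remember project names. Without it, a fresh agent has no idea where to start.
| Command | When | What it does |
|------|------|------|
| ⭐ /arstatus | Every session, run this first | Status board across all projects: pending work, blockers, numbered pick list. The true cold start. |
| /arstart | After picking a project | Loads deep context for a single project: palace rooms, corrections, task-relevant recall |
| /arsave | End of session | Writes the journal, consolidates into the memory palace, updates awareness |
| /arsaveall | End of day (multiple sessions) | Batch-saves all parallel sessions in one pass: scan, merge, deduplicate, finish |
Session flow: /arstatus → type a number → /arstart <project> → work → /arsave.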
What You'll See
Type /arstatus to see the progress of every project at a glance:
──────────────────────────────────────────────────────────────
AgentRecall Status Board 2026-04-21 5 projects
──────────────────────────────────────────────────────────────
1 ⚠ novada-site 2026-04-21 BLOCKED
Blocked: .env.local is missing, Phase 1 cannot proceed
2 ● novada-mcp 2026-04-21
Next: fix novada_search POST /request → publish v0.8.0
3 ● prismma-scraper 2026-04-17
Next: UI upgrade Option A, light mode + 3D visuals
4 ✓ AgentRecall 2026-04-21 done
Collecting real production data
──────────────────────────────────────────────────────────────
Type a number, or:
N New project (with memory: the agent knows your full history)
X New project (blank slate: no historical context, purely objective mode)
──────────────────────────────────────────────────────────────
Before and After
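| Without AgentRecall | With AgentRecall |
|-----------------|---------------|
| The agent forgets yesterday's decisions | Decisions live in palace rooms and load at cold start |
| The same mistakes repeat across sessions | recall_insight surfaces past lessons automatically before work begins |
| Every start costs 5 minutes of rebuilding context | 2-second cold start, loaded from the palace (~200 tokens) |
| Flat memory files grow without bound | 200-line awareness cap forces merge or replace |
| Knowledge is locked inside a single project | Cross-project insights are matched by keyword |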
# Install the commands (one-time, Claude Code only)
mkdir -p ~/.claude/commands
curl -o ~/.claude/commands/arstatus.md https://raw.githubusercontent.com/Goldentrii/AgentRecall/main/commands/arstatus.md
curl -o ~/.claude/commands/arstart.md https://raw.githubusercontent.com/Goldentrii/AgentRecall/main/commands/arstart.md
curl -o ~/.claude/commands/arsave.md https://raw.githubusercontent.com/Goldentrii/AgentRecall/main/commands/arsave.md
curl -o ~/.claude/commands/arsaveall.md https://raw.githubusercontent.com/Goldentrii/AgentRecall/main/commands/arsaveall.md
Quick Start
# Claude Code
claude mcp add --scope user agent-recall -- npx -y agent-recall-mcp
# Cursor — .cursor/mcp.json
{ "mcpServers": { "agent-recall": { "command": "npx", "args": ["-y", "agent-recall-mcp"] } } }
# VS Code — .vscode/mcp.json
{ "servers": { "agent-recall": { "command": "npx", "args": ["-y", "agent-recall-mcp"] } } }
# Codex
codex mcp add agent-recall -- npx -y agent-recall-mcp
Claude Code skill install:
mkdir -p ~/.claude/skills/agent-recall
curl -o ~/.claude/skills/agent-recall/SKILL.md \
https://raw.githubusercontent.com/Goldentrii/AgentRecall/main/SKILL.md
Semantic Recall: pgvector (v3.4.0)
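[!NOTE] New in v3.4.0. Default keyword recall needs no configuration. When keyword search hits its ceiling (synonyms, rephrased queries, multiple languages), upgrade to the Supabase pgvector backend.
Keyword search matches words; semantic search matches meaning. After upgrading, recall("session expiry") also finds entries about "token refresh" and "auth timeout", with no need to add synonyms by hand.
# Step 1: interactive setup wizard
ar setup supabase
# Step 2: apply the pgvector migration to your Supabase project
ar setup supabase --migrate
# Step 3: done. Run /arstart; the next session backfills automatically.
Local files remain the source of truth. Supabase is a derived read index: you can delete it at any time and rebuild it with ar setup supabase --backfill.
Environment variables for semantic recall: SUPABASE_URL + SUPABASE_SERVICE_ROLE_KEY + OPENAI_API_KEY (or VOYAGE_API_KEY). All are optional; without them AgentRecall still runs normally.
Graceful degradation: when Supabase is unreachable, recall() silently falls back to local keyword search, with no change in behavior.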
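The fallback behavior described above can be pictured as a thin wrapper around two search functions. The following is a minimal sketch of that pattern, not AgentRecall's actual code: `Hit`, `Search`, `withFallback`, and both stand-in backends are hypothetical names chosen for illustration.

```typescript
type Hit = { id: string; text: string };
type Search = (query: string) => Promise<Hit[]>;

// Hypothetical wrapper: prefer the semantic backend, but treat any failure
// (network error, timeout, missing credentials) as a signal to use keyword search.
function withFallback(semantic: Search, keyword: Search): Search {
  return async (query: string): Promise<Hit[]> => {
    try {
      return await semantic(query);
    } catch {
      return keyword(query);
    }
  };
}

// Stand-in for an unreachable remote index.
const unreachableSemantic: Search = async () => {
  throw new Error("connection refused");
};

// Stand-in for local keyword search over an in-memory list.
const notes: Hit[] = [
  { id: "j1", text: "session expiry bug traced to cookie settings" },
  { id: "j2", text: "stripe webhook retries" },
];
const localKeyword: Search = async (query) =>
  notes.filter((n) => n.text.includes(query.toLowerCase()));

const recallWithFallback = withFallback(unreachableSemantic, localKeyword);
recallWithFallback("session expiry").then((hits) =>
  console.log(hits.map((h) => h.id))
);
```

Because the caller only ever sees the wrapped function, a backend outage changes which results come back but not how the call is made.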
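10 MCP Tools
AgentRecall currently exposes 10 tools to agents. Each tool composes several subsystems internally, so the agent does not need to know the internal pipeline.
| Tool | What it does |
|------|------|
| session_start | Loads project context. Returns identity, insights, active rooms, cross-project matches, and watch_for warnings. About 400 tokens. |
| remember | Saves a memory. Auto-classifies it and routes it to the right store (journal, palace, knowledge base, or awareness). |
| recall | Searches all memory in one call via RRF. Accepts feedback ratings: positive raises rank, negative lowers it. Query-aware. |
| session_end | Saves everything in one call: writes the journal, updates awareness, consolidates the palace, archives superseded insights. |
| check | Records the agent's understanding of the human's intent. Returns watch_for warnings. Strong patterns seen 3+ times are auto-promoted to awareness insights. |
| digest | Context cache. Stores the results of expensive analysis (codebase exploration, API audits). Measured 83% token savings. |
| project_board | Status board across all projects, equivalent to /arstatus. Returns a numbered project list, pending work, and blockers. |
| project_status | Deep status for a single project: next actions, blockers, recent journal summary, palace health. |
| bootstrap_scan | Scans the machine for existing projects (git repos, Claude AutoMemory, CLAUDE.md). Read-only; writes nothing. |
| bootstrap_import | Imports projects found by bootstrap_scan into AgentRecall. Safe: does not modify source files. |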
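The recall tool is described as merging results through RRF (Reciprocal Rank Fusion). For readers unfamiliar with the technique, here is a minimal sketch of standard RRF over two ranked lists. The constant k = 60 is the conventional default from the RRF literature, and the memory IDs are made up; how AgentRecall tunes or extends the formula is not shown here.

```typescript
// Reciprocal Rank Fusion: each ranked list contributes 1 / (k + rank) per item,
// and items are ordered by the sum of their contributions.
function rrf(rankings: string[][], k = 60): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}

// Hypothetical result lists from two stores, best match first.
const journalHits = ["auth-timeout", "env-missing", "stripe-webhook"];
const palaceHits = ["stripe-webhook", "auth-timeout"];

const fused = rrf([journalHits, palaceHits]);
console.log(fused.map(([id]) => id));
```

Items that rank well in several lists rise to the top even when no single list puts them first, which is what makes RRF a reasonable way to search several stores in one call.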
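How Memory Compounds
Not all memories are equal. Five subsystems feed one another: auto-naming makes the index meaningful, the index makes associations possible, associations make retrieval precise, precise retrieval produces meaningful feedback, and feedback improves the next retrieval.
Salience score: recency (0.30) + access frequency (0.25) + connection count (0.20) + urgency (0.15) + importance (0.10)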
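The salience score is a weighted sum of five signals whose weights add up to 1.00. A minimal sketch follows, assuming each signal has already been normalized to the range 0 to 1; that normalization, the `SalienceInputs` type, and the `salience` function are assumptions for illustration only.

```typescript
// Weights as listed in the salience formula.
const WEIGHTS = {
  recency: 0.30,
  accessFrequency: 0.25,
  connections: 0.20,
  urgency: 0.15,
  importance: 0.10,
};

// Assumption: every input is pre-normalized to [0, 1].
interface SalienceInputs {
  recency: number;
  accessFrequency: number;
  connections: number;
  urgency: number;
  importance: number;
}

function salience(m: SalienceInputs): number {
  return (
    WEIGHTS.recency * m.recency +
    WEIGHTS.accessFrequency * m.accessFrequency +
    WEIGHTS.connections * m.connections +
    WEIGHTS.urgency * m.urgency +
    WEIGHTS.importance * m.importance
  );
}

// A fresh, rarely accessed, weakly connected memory.
const freshNote: SalienceInputs = {
  recency: 1,
  accessFrequency: 0.1,
  connections: 0,
  urgency: 0.5,
  importance: 0.5,
};
console.log(salience(freshNote).toFixed(2));
```

With these weights, recency and access frequency together account for more than half of the score, so a memory that is never touched again loses standing quickly unless it is well connected.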
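Ebbinghaus decay R(t) = e^(−t/S), where t is the elapsed time in days and S is the stability in days:
| Memory type | S (days) | After 1 day | After 1 week |
|----------|---------|-------|-------|
| Journal (episodic) | 2 | 60% | ~7% |
| Knowledge / bug fixes (procedural) | 180 | 99% | 96% |
| Palace / architecture decisions (semantic) | 9999 | ≈100% | ≈100% |
Old journal noise fades within days; architecture decisions are kept permanently.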
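To see how the stability value S shapes retention, the sketch below evaluates R(t) = e^(−t/S) for the three S values in the table. The `retention` function is an illustrative helper, not an AgentRecall API. Evaluated exactly as written, the formula gives roughly 3% for the journal row after one week, a little below the table's ~7%, so the project's implementation may differ in detail.

```typescript
// Ebbinghaus-style retention: fraction of a memory's weight left after tDays,
// given a stability of stabilityDays.
function retention(tDays: number, stabilityDays: number): number {
  return Math.exp(-tDays / stabilityDays);
}

// Stability values taken from the decay table.
const stabilities: Array<[string, number]> = [
  ["journal", 2],
  ["knowledge", 180],
  ["palace", 9999],
];

for (const [label, s] of stabilities) {
  const day = (retention(1, s) * 100).toFixed(1);
  const week = (retention(7, s) * 100).toFixed(1);
  console.log(`${label}: ${day}% after 1 day, ${week}% after 1 week`);
}
```

The three orders of magnitude between the smallest and largest S are what separate short-lived journal entries from effectively permanent architecture decisions.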
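Compounding effect:
Session 1: saves 3 memories (auto-named, indexed, edges created)
Session 10: watch_for warns the agent before a mistake repeats
Session 20: awareness holds 10 cross-validated insights (merged from 40+ raw observations)
Session 50: the agent knows your priorities, blind spots, and communication style
SDK API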
import { AgentRecall } from "agent-recall-sdk";
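const ar = new AgentRecall({ project: "my-project" });
| Method | Description |
|------|------|
| capture(question, answer, opts?) | Quick Q&A capture (L1 memory) |
| journalWrite(content, opts?) | Writes to the daily journal |
| coldStart() | Palace-first context load (~200 tokens) |
| palaceWrite(room, content, opts?) | Writes to a room and fans out cross-references automatically |
| palaceRead(room?, topic?) | Reads room contents |
| walk(depth?, focus?) | Progressive palace walk |
| awarenessUpdate(insights, opts?) | Compounds new insights into the awareness system |
| recallInsight(context, opts?) | Cross-project insight recall |
| alignmentCheck(input) | Records confidence and assumptions |
| synthesize(opts?) | L3 synthesis, with optional palace consolidation |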
CLI Commands
# Journal
ar capture <question> <answer> [--tags tag1,tag2]
ar read [--date YYYY-MM-DD]
ar search <query> [--include-palace]
ar rollup [--min-age-days N] [--dry-run]
# Palace
ar palace write <room> <content> [--importance high|medium|low]
ar palace walk [--depth identity|active|relevant|full]
ar palace search <query>
# Awareness and insights
ar awareness update --insight "title" --evidence "evidence" --applies-when kw1,kw2
ar insight <context> [--limit N]
# Semantic recall setup
ar setup supabase [--migrate] [--backfill]
# Global options
--root <path> Storage root (default: ~/.agent-recall)
--project <slug> Project override
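Five-Layer Memory Model
L1: Working memory journal_capture "what happened"
L2: Episodic memory journal_write "what it means"
L3: Memory palace palace_write / walk "knowledge across sessions"
L4: Awareness awareness_update "compounded insights"
L5: Insight index recall_insight "experience across projects"
Fan-out writes: write to one room, and related rooms update their cross-references automatically through [[wikilinks]]. Zero LLM cost.
Obsidian-compatible: YAML frontmatter + [[wikilinks]]. Open palace/ as an Obsidian vault and it works as-is.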
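Cross-referencing through [[wikilinks]] can stay free of LLM calls because finding link targets is plain text processing. The sketch below extracts the distinct targets from a note, including the Obsidian forms [[target|alias]] and [[target#heading]]. It only illustrates the idea: `extractWikilinks` is a hypothetical helper, and how AgentRecall resolves and updates related rooms is not described here.

```typescript
// Collect the distinct targets of [[wikilinks]] in a note, in order of first appearance.
// Handles [[target]], [[target|alias]], and [[target#heading]].
function extractWikilinks(markdown: string): string[] {
  const pattern = /\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]/g;
  const targets = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    targets.add(match[1].trim());
  }
  return Array.from(targets);
}

const note =
  "Auth flow lives in [[architecture]]. See [[decisions|the decision log]] " +
  "and [[architecture#Sessions]] for the session design.";
console.log(extractWikilinks(note));
```

A writer could use the returned targets to decide which other rooms need a back-reference, all with ordinary string handling.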
Platform Compatibility
| Platform | MCP | SDK | CLI | Notes |
|------|:---:|:---:|:---:|------|
| Claude Code | ✅ | ✅ | ✅ | Full support: MCP + skills + commands |
| Cursor | ✅ | ✅ | ✅ | MCP via .cursor/mcp.json |
| VS Code (Copilot) | ✅ | ✅ | ✅ | MCP via .vscode/mcp.json |
| Windsurf | ✅ | ✅ | ✅ | MCP via mcp_config.json |
| OpenAI Codex | ✅ | ✅ | ✅ | codex mcp add |
| LangChain / CrewAI | — | ✅ | — | Integrate the SDK into your chain |
| Vercel AI SDK | — | ✅ | — | Use the SDK in server actions |
| CI / GitHub Actions | — | — | ✅ | npx agent-recall-cli |
| Any MCP agent | ✅ | — | — | Standard MCP protocol |
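Measured Token Savings
| Scenario | Without AR (tokens) | With AR (tokens) | Saved |
|------|:----:|:----:|:------:|
| A: Simple (2 sessions, 0 corrections) | 567 | 1,131 | +99% pure overhead |
| B: Medium (5 sessions, 1 correction) | 6,220 | 4,382 | -30% |
| C: Complex (20 sessions, 5 corrections) | 40,910 | 17,520 | -57% |
| D: Multi-agent (3 agents × 5 sessions) | 20,781 | 13,140 | -37% |
| E: Digest cache (repeated analysis, 1 hit) | ~2,400 | ~400 | -83% |
Break-even: ~3-4 sessions. For simple one-off tasks, AR is pure overhead. For work with 3+ sessions, corrections, or multiple agents, AR pays for itself.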
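Docs
| Document | Description |
|------|------|
| → Command Reference | Complete guide to /arstatus, /arstart, /arsave, /arsaveall |
| Intelligent Distance Protocol | The foundational theory: the gap between humans and AI is structural, and how to reduce information loss |
| Scoring Design Rationale | RRF, Ebbinghaus, Beta distribution, and the bugs they fix |
| Upgrade v3.4 | Semantic recall, pgvector, 10 tools, bootstrap, decisions room |
Contributing
- Issues & feedback: GitHub Issues
- Email: [email protected]
MIT License.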
