memspec

v0.2.0


Structured memory for AI agents - spec plus CLI


0xtherin


Memspec

Portable, file-canonical memory for AI agents.

The Problem

AI coding agents wake up with amnesia. Every session starts cold — no memory of what was decided, what failed, what the project's tribal knowledge says. Teams solve this with scattered markdown files, prompt stuffing, or bespoke database-backed memory services that create vendor lock-in and operational overhead.

The result: agents re-discover the same facts, repeat the same mistakes, and can't build on prior work. Humans waste time re-explaining context that should persist.

Existing solutions fall into two camps:

File-based (MEMORY.md, daily logs): Human-readable and git-friendly, but retrieval is little more than grep. No lifecycle, no decay, no self-correction. Knowledge rots silently.
Service-based (hosted memory APIs, vector DBs): Better retrieval, but require infrastructure, accounts, API keys, and trust in a third-party data store. Not portable across tools.

What Memspec Does

Memspec is a specification and CLI for managing living project knowledge. It keeps markdown files under .memspec/ as the canonical source of truth, then layers structured lifecycle management, full-text search, and an MCP server on top.

For a potential user, the point is simple:

Keep memory inside the repo, not trapped in a vendor backend
Make it agent-operated with human oversight, not human-curated with agent access
Improve retrieval and hygiene without introducing a service to babysit
Let any tool speak to the same memory through files, CLI, or MCP

Any agent (CLI, MCP, or direct file I/O)
    │
    │  write observations
    │  query for context
    │  signal corrections
    ▼
┌──────────────────────────────────┐
│       Memspec Convention         │
│   Types · Lifecycle · Retrieval  │
├──────────────────────────────────┤
│      Markdown Files (git)        │  ← canonical source of truth
├──────────────────────────────────┤
│   SQLite FTS5 Derived Index      │  ← rebuildable, one-directional
│   + Optional Embeddings          │
└──────────────────────────────────┘

No daemon. No backend service. No database-owned state.

Why This Shape

Most agent memory systems ask you to adopt their runtime, their database, and their API surface before you get value. That is fine for a product. It is bad for a standard.

Memspec takes the opposite path:

The repo owns memory. Your project can outlive any model, SDK, MCP server, or wrapper.
Derived state stays disposable. Search indexes and embeddings can be rebuilt; the markdown files remain the contract.
Interop is optional, not mandatory. A shell script can read the files, an agent can use the CLI, and Claude/Cursor/Codex can use MCP. Same store, different access paths.
Adoption stays low-friction. memspec init gives you working search immediately, and better retrieval is an additive upgrade rather than a platform migration.

That architecture exists for one reason: long-lived project memory should be portable across tools and boring to operate.

Design Principles

Files are truth. The derived index is disposable and rebuildable from files. If the index disappears, you lose speed, not data.
Three memory types. Facts, decisions, and procedures. That is the whole core vocabulary, small enough for any agent to apply consistently. Richer categorization belongs in the extension model (planned), not in new core types.
Self-correction over curation. Agents write, correct, and expire memories autonomously. When knowledge goes stale, the agent supersedes it — the old memory links to the replacement, and the evolution is traceable in git history. No human review queue.
Decay is a feature. Every memory has a TTL. Facts go stale as code changes. Procedures drift as tooling evolves. Forcing re-verification keeps memory honest.
Zero-infrastructure default. npm install and memspec init — that's it. No accounts, no API keys, no hosted services. Works offline, works on any platform Node runs on.

Memory Types

| Type | What it captures | Examples |
|------|-----------------|---------|
| fact | Verified project state | "Auth uses JWT with 15min expiry", "DB is Postgres 16" |
| decision | A choice with context and rationale | "Chose REST over GraphQL for client simplicity" |
| procedure | A reusable workflow or process | "Deploy: run tests, build, push, verify health" |

Observations that don't fit these types stay as raw observations until they do, or they decay.

Memory Lifecycle

observe → classify → [active] → decay → archive
                        ↑            │
                        │ correction  │
                        └─────────────┘

captured → raw observation, not yet classified
active → classified, available for retrieval, ranked by confidence
corrected → superseded by newer knowledge, pointer to replacement preserved
decayed → TTL expired, removed from active retrieval
archived → retained in git history only

No transition requires human approval. Corrections create new memories and link back to what they replaced. The full evolution is in git.
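The lifecycle above can be read as a small state machine. The following is a minimal sketch under that reading, not Memspec's actual implementation: the `State` enum, the `ALLOWED` table, and the `transition` helper are hypothetical names, and the exact set of permitted transitions is inferred from the diagram and state list rather than confirmed by the spec.

```python
from enum import Enum


class State(Enum):
    CAPTURED = "captured"
    ACTIVE = "active"
    CORRECTED = "corrected"
    DECAYED = "decayed"
    ARCHIVED = "archived"


# Assumed transition table, inferred from the diagram and state list.
ALLOWED = {
    State.CAPTURED: {State.ACTIVE, State.DECAYED},   # classified, or TTL expired
    State.ACTIVE: {State.CORRECTED, State.DECAYED},  # superseded, or TTL expired
    State.CORRECTED: {State.ARCHIVED},
    State.DECAYED: {State.ARCHIVED},
    State.ARCHIVED: set(),                           # terminal state
}


def transition(current: State, target: State) -> State:
    """Return the new state, or raise if the move is not in the table."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Because no transition needs human approval, a check like this is the only gate: an agent either makes a legal move or gets an error it can act on.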

Default TTLs

| Type | Default TTL |
|------|-------------|
| fact | 90 days |
| decision | 180 days |
| procedure | 90 days |
| observation | 7 days |

Override the TTL per item with decay_after in frontmatter. Use never for genuinely permanent knowledge.
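To make the decay rule concrete, here is a rough sketch of how an expiry check could be computed from the defaults in the table. The function names are hypothetical. The only decay_after value the README documents is never; treating any other value as an ISO 8601 timestamp is an assumption made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Defaults copied from the table above.
DEFAULT_TTL_DAYS = {"fact": 90, "decision": 180, "procedure": 90, "observation": 7}


def expires_at(mem_type: str, created: datetime,
               decay_after: Optional[str] = None) -> Optional[datetime]:
    """Return the expiry time, or None if the memory never decays."""
    if decay_after == "never":
        return None
    if decay_after is not None:
        # Assumption: an explicit override is an ISO 8601 timestamp.
        return datetime.fromisoformat(decay_after)
    return created + timedelta(days=DEFAULT_TTL_DAYS[mem_type])


def is_expired(mem_type: str, created: datetime, now: datetime,
               decay_after: Optional[str] = None) -> bool:
    """True once `now` has reached the expiry time."""
    deadline = expires_at(mem_type, created, decay_after)
    return deadline is not None and now >= deadline


created = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(expires_at("fact", created))  # 90 days after creation
```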

Install

npm install -g memspec

Or clone and link locally:

git clone https://github.com/siimvene/memspec.git
cd memspec
npm install && npm run build
npm link

How It Works

Memspec is agent-operated. The human runs one command — memspec init — to set up the store. After that, the agent reads, writes, corrects, and maintains memory autonomously through CLI commands, MCP tools, or direct file I/O.

The human's role is oversight, not curation: review what the agent captured in git diffs, override when needed, and trust the lifecycle to handle staleness.

Human: one-time setup

memspec init
# Interactive prompt: choose FTS5 (default) or hybrid with embeddings
# Creates .memspec/ in your project root
# Detects brownfield memory sources like MEMORY.md and memory/
# Patches AGENTS.md or CLAUDE.md so the agent knows to use Memspec
# Done. The agent takes it from here.

init is intentionally more than scaffolding. On a brownfield repo, it should leave you with a usable memory store, not an empty directory and more setup work.

Brownfield categorization

When init imports existing memory, the categorization is conservative:

current project state, architecture, and configuration become fact
choices with rationale become decision
reusable workflows and runbooks become procedure
ambiguous notes stay as observations until the agent can classify them safely

The goal is not perfect first-pass extraction. The goal is a correct-enough starting store that the agent can improve through normal work.

Agent: ongoing operation

The agent uses these commands (via shell, MCP, or programmatic access) as part of its normal workflow:

# Learn something → write it down
memspec add fact "API uses JWT" \
  --body "JWT with 15min expiry and refresh tokens" \
  --source agent --tags auth,api

memspec add decision "REST over GraphQL" \
  --body "REST for simplicity — most consumers are CLI tools" \
  --source agent --decay-after never

memspec add procedure "Deploy bot" \
  --body "SSH to server, git pull, pm2 restart" \
  --source agent --tags deploy

# Need context → search for it
memspec search "auth"
memspec search "deploy" --type procedure --json

# Knowledge is stale → correct it
memspec correct ms_01HXK... --reason "Migrated to OAuth" \
  --replace "Now uses OAuth2 with PKCE"

# Housekeeping (agent runs periodically)
memspec status
memspec validate
memspec decay --dry-run

No human in the loop for day-to-day memory operations. The agent decides what to remember, when to search, and when knowledge has gone stale. The human sees the results in git.

Who It Is For

Use Memspec if you want:

project memory that survives model swaps and tool churn
git-visible knowledge instead of hidden prompts or opaque vector stores
better retrieval than grep, without standing up a memory service
agent-operated memory where you review in git, not curate by hand

It is a bad fit if you want:

a hosted multi-tenant product with user accounts, dashboards, and org-level administration
a central memory service that owns the canonical state
fully automatic long-term memory with no review of what gets captured

Search

Two modes, configured at memspec init:

FTS5 (default) — SQLite full-text search with BM25 ranking, porter stemming, and phrase bonus. Zero setup, zero dependencies beyond better-sqlite3.
Hybrid — FTS5 candidate retrieval plus embeddings reranking. Supports OpenAI-compatible endpoints and Ollama. Configured interactively or via CLI flags.

Search always operates over the derived index, which is rebuilt from files on demand. If the index is missing, it rebuilds automatically.
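The default mode can be approximated in a few lines to show why the index is disposable. This is a sketch, not Memspec's code: the table name `mem`, its columns, and the `ms_a`/`ms_b` ids are made up, it uses Python's built-in sqlite3 module instead of better-sqlite3, it assumes that module was compiled with FTS5 (true of most builds), and it leaves out the phrase bonus.

```python
import sqlite3


def build_index(memories: list[dict]) -> sqlite3.Connection:
    """Build a throwaway in-memory FTS5 index from already-parsed memories."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE mem USING fts5("
        "id UNINDEXED, title, body, tokenize='porter')"
    )
    conn.executemany(
        "INSERT INTO mem (id, title, body) VALUES (:id, :title, :body)", memories
    )
    return conn


def search(conn: sqlite3.Connection, query: str, limit: int = 5) -> list[str]:
    """Return matching ids, best first (bm25() scores lower for better matches)."""
    rows = conn.execute(
        "SELECT id FROM mem WHERE mem MATCH ? ORDER BY bm25(mem) LIMIT ?",
        (query, limit),
    ).fetchall()
    return [row[0] for row in rows]


memories = [
    {"id": "ms_a", "title": "Deploy bot", "body": "SSH to server, git pull, pm2 restart"},
    {"id": "ms_b", "title": "API uses JWT", "body": "JWT with 15min expiry and refresh tokens"},
]
conn = build_index(memories)
print(search(conn, "deploying"))  # porter stemming lets "deploying" match "Deploy"
```

Nothing in the index is authoritative: dropping the connection and calling `build_index` again from the parsed files restores it.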

MCP Server

Memspec ships an MCP server for integration with Claude Code, Cursor, Codex, and any MCP-compatible tool:

# Start stdio MCP server in current project
memspec-mcp

# Pin to a specific project root
memspec-mcp --cwd /path/to/project

Exposed tools: memspec_search, memspec_get, memspec_add, memspec_promote, memspec_correct, memspec_status, memspec_validate, memspec_decay, memspec_init, memspec_stores

Host Registration

memspec init auto-creates a .mcp.json file in the project root for host tool discovery. MCP-compatible tools (Claude Code, Cursor, etc.) read this file to find available servers.

For manual setup or other tools, add to .mcp.json:

{
  "mcpServers": {
    "memspec": {
      "command": "memspec-mcp",
      "args": ["--cwd", "/absolute/path/to/project"]
    }
  }
}

If .mcp.json already exists with other servers, memspec init merges its entry without overwriting them.
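The non-destructive merge described above might look roughly like this. `merge_mcp_server` is a hypothetical helper, not part of Memspec, and replacing an existing memspec entry in place is an assumption: the README only states that other servers are left untouched.

```python
import copy
import json


def merge_mcp_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with mcpServers[name] set and other servers kept."""
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers[name] = entry  # assumption: an existing entry of the same name is replaced
    return merged


existing = json.loads('{"mcpServers": {"other": {"command": "other-mcp"}}}')
merged = merge_mcp_server(
    existing,
    "memspec",
    {"command": "memspec-mcp", "args": ["--cwd", "/absolute/path/to/project"]},
)
print(json.dumps(merged, indent=2))
```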

Global Store

Memspec supports a global store at ~/.memspec/ for cross-project memory (personal preferences, common patterns, infrastructure knowledge). When both a project store and a global store exist, the project store takes priority and the global store is merged as a lower-priority layer.

Create the global store with:

memspec init --cwd ~/.memspec
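One way to picture the layering is as a merge of two result lists. The README does not say how priority is applied, so the sketch below simply assumes that project hits come first and that a global hit is dropped when a project hit has the same id. `layered_results` and the ids are hypothetical.

```python
def layered_results(project_hits: list[dict], global_hits: list[dict]) -> list[dict]:
    """Project hits first, then global hits whose id is not already present."""
    seen = {hit["id"] for hit in project_hits}
    merged = list(project_hits)
    for hit in global_hits:
        if hit["id"] not in seen:
            seen.add(hit["id"])
            merged.append(hit)
    return merged


project = [{"id": "ms_p1", "title": "DB is Postgres 16"}]
shared = [{"id": "ms_g1", "title": "Prefer small commits"}]
print([hit["id"] for hit in layered_results(project, shared)])
```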

Store Layout

.memspec/
  observations/        # Raw, unclassified
  memory/
    facts/             # Active facts
    decisions/         # Active decisions
    procedures/        # Active procedures
  archive/             # Corrected, decayed, archived items
  config.yaml          # Search engine, embeddings, decay rules

Each memory is a markdown file with YAML frontmatter (id, type, state, confidence, source, tags, timestamps, decay rules). Human-readable, git-diffable, greppable.
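Because each memory is plain markdown with frontmatter, any tool can read it without the CLI. The sketch below splits such a file into metadata and body. The key names in the sample (`id`, `type`, `state`, `decay_after`) follow the fields listed above, but their exact spelling in real files is an assumption. The parser handles only flat `key: value` lines, so a real YAML parser would be needed for lists such as tags.

```python
def split_frontmatter(text: str) -> tuple[dict, str]:
    """Split a markdown document into (frontmatter dict, body).

    Handles only flat `key: value` lines between the two `---` markers.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = lines.index("---", 1)  # closing marker
    except ValueError:
        return {}, text
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")  # split at the first colon only
        if sep:
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


SAMPLE = """---
id: ms_example
type: fact
state: active
decay_after: never
---
# API uses JWT

JWT with 15min expiry and refresh tokens
"""

meta, body = split_frontmatter(SAMPLE)
print(meta["type"], meta["decay_after"])
```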

CLI Reference

| Command | Description |
|---------|-------------|
| memspec init | Create .memspec/ and configure search engine |
| memspec add <type> <title> | Add a fact, decision, or procedure |
| memspec search <query> | Search active memories |
| memspec correct <id> --reason "..." | Correct or invalidate a memory |
| memspec status | Store summary (counts, decay warnings, recent items) |
| memspec decay [--dry-run] [--archive] | Apply TTL rules to expired items |
| memspec validate | Check all memory files against schema |

Memory Consolidation

Agents need triggers to write and maintain memories. Memspec uses two mechanisms:

1. Behavioral triggers (agent instructions)

memspec init patches your CLAUDE.md or AGENTS.md with instructions that tie memory writes to observable events:

Fixed a bug → write/correct the relevant fact
Changed architecture or configuration → correct stale memories, write new ones
Established a workflow → write a procedure
Discovered something non-obvious → write a fact
Made a design choice → write a decision with rationale

These work regardless of git discipline. The agent writes memories as part of the task, not as a deferred chore.

2. Commit hook (Claude Code)

For Claude Code users, a PostToolUse hook triggers a consolidation prompt when the agent commits code. The agent still has full conversation context, so it can write meaningful memories about what it just committed and why.

Install the hook by copying hooks/memspec-consolidate.js to ~/.claude/hooks/ and adding the following to ~/.claude/settings.json:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "node \"$HOME/.claude/hooks/memspec-consolidate.js\"",
            "timeout": 5
          }
        ]
      }
    ]
  }
}

The trigger is configurable in .memspec/config.yaml:

consolidation:
  trigger: commit    # commit | manual | none
  frequency: once    # once (per session) | always

commit + once (default): first commit in a session triggers consolidation
commit + always: every commit triggers (thorough but noisy)
manual: agent instructions only, no hook enforcement
none: disabled entirely
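The four combinations above reduce to a small decision function. This is an illustrative sketch only: `should_consolidate` and its parameters are hypothetical, and how the real hook detects a commit or tracks whether it has already run in a session is not specified here.

```python
def should_consolidate(trigger: str, frequency: str,
                       is_commit: bool, already_ran: bool) -> bool:
    """Decide whether the hook should emit a consolidation prompt."""
    if trigger != "commit":
        return False  # manual and none: the hook never prompts
    if not is_commit:
        return False  # only commits trigger consolidation
    if frequency == "always":
        return True   # every commit triggers
    return not already_ran  # once: only the first commit in a session


print(should_consolidate("commit", "once", is_commit=True, already_ran=False))
```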

For Agent Authors

If init cannot patch your repo instructions automatically, copy the block from AGENTS-ADDON.md into AGENTS.md or CLAUDE.md.

What's Planned

Retrieval profiles with token budgeting and context-aware ranking
Automatic observation classification (rule-based + optional LLM)
Extension model for domain-specific metadata without breaking core types

License

MIT