@swarmclawai/local-memory
v0.1.0
local-memory
100% local, OpenAI-compatible memory layer for AI agents. On-device embeddings via Ollama, SQLite + `sqlite-vec` for vector search, zero cloud round-trips. Drop-in for any code that already speaks `/v1/embeddings`.
Why this exists
Every AI-agent memory repo today — Supermemory, claude-mem, mem0, Graphiti — assumes you'll call OpenAI for embeddings and often a hosted vector DB on top. That means three things for a solo dev or privacy-conscious user:
- Every agent interaction has a network round-trip.
- A paid API bill scales with your agent's usage.
- Your memory lives off-device, off-machine, in someone else's cloud.
local-memory is the opposite: a single Node process, one SQLite file on disk, embeddings via a local Ollama instance running on your own laptop. It exposes an OpenAI-compatible /v1/embeddings endpoint so any existing client that accepts a baseURL override just works — no SDK swap, no code rewrite. It also exposes a small first-class memory API (POST /memory, /memory/search, namespaces, export, import) for when you want more than just embeddings.
30-second demo
```bash
# Prereq: Ollama running with an embedding model pulled
ollama pull nomic-embed-text
ollama serve &   # listens on 127.0.0.1:11434

# Start the memory server
npx -y @swarmclawai/local-memory start

# Point your OpenAI client at it
export OPENAI_BASE_URL=http://localhost:3456/v1
export OPENAI_API_KEY=not-used

# From another shell, store and search memories
npx -y @swarmclawai/local-memory add "User prefers kebab-case for slugs"
npx -y @swarmclawai/local-memory search "casing conventions"
```
Install
```bash
pnpm add -g @swarmclawai/local-memory
# or
npm i -g @swarmclawai/local-memory
# or run on demand
npx -y @swarmclawai/local-memory --help
```
Providers
| Provider | Default model | Dim | When to use |
|---|---|---|---|
| ollama (default) | nomic-embed-text | 768 | You have ollama serve running locally |
| mock | deterministic hash | 64 | Unit tests, offline demos, zero-dep fallback |
Override the embedder with `--provider`, `--model`, `--dim`, and `--embed-base-url`.
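For example, the flags above can be combined like this. A sketch, not verified against the released CLI; the alternate Ollama port is invented for illustration:

```bash
# Run fully offline with the zero-dependency mock embedder from the table above
npx -y @swarmclawai/local-memory start --provider mock --dim 64

# Or point the default Ollama provider at a non-standard Ollama address
# (http://127.0.0.1:11500 is a hypothetical example, not a documented default)
npx -y @swarmclawai/local-memory start --embed-base-url http://127.0.0.1:11500
```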
Commands
| Command | Purpose |
|---|---|
| local-memory start | Start the HTTP server on port 3456 |
| local-memory add <text> | Embed and store a memory (-n <namespace>, --metadata '{...}') |
| local-memory search <query> | Semantic search (-n <namespace>, -k <limit>) |
| local-memory namespaces | List namespaces + counts |
| local-memory export | Dump memories as JSONL on stdout |
| local-memory import <file> | Re-embed every line from a JSONL file |
| local-memory help-agents | Print the full machine-readable catalog |
Every command accepts --json and returns a one-line JSON envelope. Exit codes: 0 success, 1 user error, 2 internal error.
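The envelope contract makes scripting straightforward. A minimal sketch of how a wrapper might branch on the exit codes; `fake_cmd` stands in for a real `local-memory <command> --json` call, and the envelope fields (`ok`, `result`) are placeholders, not the documented schema:

```bash
#!/bin/sh
# Exit-code contract: 0 success, 1 user error, 2 internal error.
# `fake_cmd` stands in for `local-memory <command> --json`; the envelope
# fields ("ok", "result") are placeholders, not the real schema.
fake_cmd() { printf '{"ok":true,"result":"stored"}\n'; }

out=$(fake_cmd)
status=$?
case $status in
  0) printf 'data: %s\n' "$out" ;;   # stdout carries data
  1) echo 'user error' >&2 ;;        # stderr carries diagnostics
  2) echo 'internal error' >&2 ;;
esac
```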
HTTP endpoints (when started)
| Endpoint | Purpose |
|---|---|
| GET / | Service info + dim + embedder id |
| POST /v1/embeddings | OpenAI-compatible drop-in — body {input: string \| string[]} |
| POST /memory | Store a memory ({text, namespace?, metadata?}) |
| GET /memory/:id | Fetch a memory |
| DELETE /memory/:id | Remove a memory |
| POST /memory/search | Semantic search ({query, namespace?, limit?}) |
| GET /namespaces | List namespaces |
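With the server running (`local-memory start`), the endpoints above can be exercised with plain curl. A sketch: request field names follow the table, but the response shapes are not shown here and a live server on port 3456 is assumed:

```bash
# OpenAI-compatible embeddings (curl -d implies POST)
curl -s http://localhost:3456/v1/embeddings \
  -H 'Content-Type: application/json' \
  -d '{"input": "the quick brown fox"}'

# Store a memory in a namespace
curl -s -X POST http://localhost:3456/memory \
  -H 'Content-Type: application/json' \
  -d '{"text": "User prefers kebab-case for slugs", "namespace": "myapp"}'

# Search it back
curl -s -X POST http://localhost:3456/memory/search \
  -H 'Content-Type: application/json' \
  -d '{"query": "casing conventions", "namespace": "myapp", "limit": 5}'
```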
Drop-in for OpenAI clients
Any OpenAI SDK that supports baseURL works out of the box. Example with the Node SDK:
```ts
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:3456/v1",
  apiKey: "not-used",
});

const res = await client.embeddings.create({
  model: "anything",
  input: "the quick brown fox",
});
```
How it works
- Embeddings come from a local Ollama process — nothing leaves your machine.
- Vectors are stored in a plain SQLite file via `sqlite-vec`. Default path: `~/.local-memory/memory.db`.
- Search uses `sqlite-vec`'s `MATCH` operator for k-nearest-neighbor retrieval. Results come back with a normalized similarity score in `(0, 1]` — higher is better.
- Namespaces are just a column — free per-app or per-agent memory isolation.
How it compares
| | local-memory | Supermemory | claude-mem | mem0 | Graphiti |
|---|---|---|---|---|---|
| Runs 100% offline | ✅ | ❌ | partial | ❌ | ❌ |
| OpenAI-compatible drop-in | ✅ | ❌ | ❌ | ❌ | ❌ |
| Single SQLite file | ✅ | ❌ | ❌ | ❌ | ❌ |
| Pluggable embedder | ✅ | ❌ | ❌ | partial | partial |
| Agent-driven CLI | ✅ | — | partial | partial | — |
Built for coding agents
Every swarmclawai CLI follows the same agent conventions:
- `--json` everywhere, one-line envelope on stdout
- Stderr for logs, stdout for data
- Stable exit codes: `0`/`1`/`2`
- Non-interactive by default
- `local-memory help-agents` returns the entire command catalog as JSON
See AGENTS.md for the full machine-readable reference.
Roadmap
- MCP adapter — expose memory tools to any MCP-compatible agent (Claude Code, Cursor, Cline, Aider, etc.)
- Apple Intelligence + MLX providers for macOS
- Gemini Nano provider for Chrome/Pixel
- Incremental compaction (TTL + importance-weighted pruning)
- Hybrid BM25 + vector retrieval
- Encryption-at-rest flag for the DB
Contributing
See CONTRIBUTING.md.
