@suocommerce/memory-openclaw
v2.0.0
The only OpenClaw memory plugin that works completely offline. Surprise-gated learning, briefing cards, importance scoring, local embeddings, fleet sync. No API keys required.
Generation Timestamp: 2026-04-08T19:45:00Z
Astral Core Memory — OpenClaw Plugin
Offline-first persistent memory for OpenClaw agents. No API keys required.
Your agent remembers across sessions, learns what matters, and works entirely on your machine.
Get the memory server — orbitalfortress.com · €19 one-time · macOS · Windows · Linux
What's New in v2.0.0
- Briefing cards — every session starts with a ≤200 token summary of who the user is and what they're working on. The agent knows you from the first message.
- Enrichment hints — the memory system flags ambiguous memories and the agent can ask targeted follow-up questions to improve quality.
- Importance scoring — high-value memories resist dormancy. Frequently accessed, starred, or high-utility memories stay active longer.
- Dormant cold storage — inactive memories move to cold storage to reduce RAM usage. They reactivate automatically when semantically relevant.
- Configurable similarity threshold — minSimilarity default raised from 0.3 to 0.45, reducing noisy recall at scale (7k+ memories).
Quick Start
1. Install the plugin
npm install @astralcore/memory-openclaw
2. Download and run the memory server
The plugin talks to the Astral Core memory server running on your machine.
# macOS (Apple Silicon)
curl -L https://orbitalfortress.com/download/macos -o astral-memory-server
chmod +x astral-memory-server
# Linux x86_64
curl -L https://orbitalfortress.com/download/linux -o astral-memory-server
chmod +x astral-memory-server
# Windows — download from https://orbitalfortress.com/download/windows
Activate with your license key (one-time):
./astral-memory-server --activate SOUL-XXXX-XXXX-XXXX-XXXX
Start the server:
./astral-memory-server
The server runs on http://localhost:8090. Verify:
curl http://localhost:8090/health
3. Add the plugin to your OpenClaw config
{
  "plugins": {
    "memory": {
      "provider": "@astralcore/memory-openclaw",
      "config": {
        "serverUrl": "http://localhost:8090",
        "autoCapture": true,
        "autoRecall": true,
        "maxRecallMemories": 5,
        "briefingCardOnStart": true
      }
    }
  }
}
That's it. Your agent now has persistent memory.
How It Works
Briefing Cards
At the start of every session, the plugin fetches a briefing card from the memory server. The card is a ≤200 token summary containing:
- Identity facts — the user's name, role, preferences
- Active context — current projects, recent decisions, open tasks
- Category health — which knowledge areas are strong or sparse
The agent sees this card before the first turn, so it begins every conversation with awareness of who it's talking to. No "remind me what we were working on" needed.
The briefing card is generated by the memory engine from stored memories — it's not a static profile. As the user's context evolves, so does the card.
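As a rough illustration of the session-start step, the sketch below fetches a card and prepends it to a system prompt. The `/v1/memory/briefing` endpoint is the one shown later in this README; the `card` response field, the helper names, and the whitespace-based token estimate are assumptions made for illustration, not the plugin's actual code.

```python
import json
import urllib.request

MEMORY = "http://localhost:8090"  # default server URL from this README


def fetch_briefing(base_url: str = MEMORY) -> str:
    """Fetch the briefing card. The "card" field name is an assumption."""
    with urllib.request.urlopen(f"{base_url}/v1/memory/briefing", timeout=5) as resp:
        body = json.load(resp)
    return body.get("card", "")


def truncate_tokens(text: str, max_tokens: int = 200) -> str:
    """Crude budget: treats whitespace-separated words as tokens."""
    words = text.split()
    return " ".join(words[:max_tokens])


def build_system_prompt(base_prompt: str, card: str, max_tokens: int = 200) -> str:
    """Prepend the (truncated) card so the agent sees it before the first turn."""
    card = truncate_tokens(card, max_tokens)
    if not card:
        return base_prompt
    return f"[Briefing card]\n{card}\n\n{base_prompt}"
```

A real tokenizer would count differently from a word split, so treat the 200 limit here as approximate.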
Surprise-Gated Learning
Not everything gets stored. The memory engine uses a Delta Rule matrix to predict incoming information against what it already knows. Only genuinely novel content passes the surprise gate. This means:
- Repeated information is automatically filtered
- The memory store grows with quality, not just volume
- No manual deduplication needed
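The idea of a delta-rule surprise gate can be sketched with a toy associative memory. This is not the engine's implementation: the class name, the threshold, the learning rate, and the use of small unit-length key vectors are all assumptions chosen to keep the example short. The matrix predicts a value from a key; the prediction error is the "surprise", and only items above the threshold are stored and learned.

```python
import math


class SurpriseGate:
    """Toy delta-rule associative memory (illustrative, hypothetical)."""

    def __init__(self, dim_in: int, dim_out: int, lr: float = 1.0, threshold: float = 0.5):
        self.w = [[0.0] * dim_in for _ in range(dim_out)]
        self.lr = lr
        self.threshold = threshold

    def predict(self, key):
        """Matrix-vector product W @ key."""
        return [sum(w_ij * k_j for w_ij, k_j in zip(row, key)) for row in self.w]

    def surprise(self, key, value) -> float:
        """Euclidean norm of the prediction error."""
        pred = self.predict(key)
        return math.sqrt(sum((v - p) ** 2 for v, p in zip(value, pred)))

    def observe(self, key, value) -> bool:
        """Return True if the item was novel enough to store."""
        pred = self.predict(key)
        error = [v - p for v, p in zip(value, pred)]
        if math.sqrt(sum(e * e for e in error)) <= self.threshold:
            return False  # predicted well enough: filtered as redundant
        # Delta rule: W <- W + lr * error * key^T
        for i, e in enumerate(error):
            for j, k in enumerate(key):
                self.w[i][j] += self.lr * e * k
        return True
```

With unit-length keys and `lr=1.0`, one update makes the matrix reproduce the stored value exactly, so presenting the same pair again yields zero surprise and is filtered.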
Three-Tier Memory Lifecycle
Memories progress through tiers based on how useful they prove to be:
Fast (today) → Medium (multi-session) → Slow (long-term)
↓
Dormant (cold storage)
↓
Reactivated when needed
The importance scoring system (v2.0.0) protects high-value memories from going dormant. Memories with high access counts, strong utility scores, or starred metadata stay active longer.
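One way to picture how importance could delay dormancy is sketched below. The README names the inputs (access count, utility, starred) but not the formula, so the weights, the 30-day base period, and every name in this block are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    access_count: int
    utility: float      # assumed to lie in [0, 1]
    starred: bool
    idle_days: float    # days since last access


def importance(m: Memory) -> float:
    """Hypothetical score in [0, 1]; the weights are illustrative only."""
    access = min(math.log1p(m.access_count) / math.log1p(100), 1.0)
    score = 0.4 * access + 0.4 * m.utility + (0.2 if m.starred else 0.0)
    return min(score, 1.0)


def should_go_dormant(m: Memory, base_idle_days: float = 30.0) -> bool:
    """Higher importance stretches the idle period allowed before dormancy."""
    allowed = base_idle_days * (1.0 + 3.0 * importance(m))
    return m.idle_days > allowed
```

Under these made-up weights, an untouched, unstarred memory goes dormant after 30 idle days, while a frequently used starred one can sit idle several times longer.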
Enrichment Hints
The Cognitive Shell flags memories where the stored information is ambiguous or could benefit from clarification. The agent can surface these as natural follow-up questions:
- "You mentioned framework X — did you mean React or the testing library?"
- "Last time you said the deadline was March — has that changed?"
Enrichment is gentle — the agent asks at most one clarification per conversation, woven into the natural flow.
Fleet Sync (Optional)
With an Orbital Fortress server, memories sync across devices:
Your laptop → Orbital Fortress → Your desktop
↓
Briefings from fleet
Memories are uploaded as derivative representations (not raw text), and briefings from other devices are pulled back. Your laptop learns what your desktop discovered.
Learn more at orbitalfortress.com.
Setting Up the Embedding Model
The memory server needs a local embedding model to convert text into searchable vectors. No external API keys are needed — everything runs on your machine.
Option A — Use the bundled model (recommended)
If you have llama.cpp installed, start the embedding server with
the nomic-embed model:
# Download the model (~300MB)
curl -L https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/resolve/main/nomic-embed-text-v1.5.Q5_K_M.gguf \
-o nomic-embed-text-v1.5.Q5_K_M.gguf
# Start the embedding server on port 8081
llama-server \
--model nomic-embed-text-v1.5.Q5_K_M.gguf \
--port 8081 \
--embedding \
--ctx-size 2048
The memory server connects to localhost:8081 by default.
Option B — Use a different embedding model
Any OpenAI-compatible embedding endpoint works. Pass the URL when starting the memory server:
./astral-memory-server --embedding-url http://localhost:11434/api/embeddings
This works with Ollama, LM Studio, or any service that serves embeddings over HTTP.
Option C — Use a remote embedding API
If you prefer a hosted embedding service:
./astral-memory-server --embedding-url https://api.openai.com/v1/embeddings \
--embedding-api-key sk-your-key
Your stored memories still stay on your machine. Note, however, that the text being embedded has to be sent to the hosted service so it can compute the vectors, so this option is not fully offline.
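To check an embedding endpoint by hand, a request in the OpenAI embeddings format can be sketched as follows. It assumes the endpoint accepts a JSON body with `input` and `model` and returns vectors under `data[0].embedding`; the helper names and the default model string are illustrative. Note that the text to embed travels in the request body.

```python
import json
import urllib.request


def build_embedding_request(url, text, model="nomic-embed-text-v1.5", api_key=None):
    """Build an OpenAI-style embeddings request; nothing is sent yet."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    payload = json.dumps({"input": text, "model": model}).encode("utf-8")
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")


def parse_embedding(response_body: dict) -> list:
    """Extract the first vector from an OpenAI-style response."""
    return response_body["data"][0]["embedding"]


def embed(url, text, **kwargs):
    """Send the request and return the embedding vector."""
    request = build_embedding_request(url, text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return parse_embedding(json.load(resp))
```

For example, `len(embed("http://localhost:8081/v1/embeddings", "hello"))` would report the vector dimension, assuming the local server exposes that OpenAI-style route.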
Agent Tools
The plugin registers these tools with the OpenClaw agent:
| Tool | What it does |
|------|-------------|
| astral_recall | Semantic search across all memory tiers |
| astral_store | Explicitly store a fact, preference, or decision |
| astral_forget | Delete all memories from a specific source |
| astral_briefing | Get a briefing card summary on demand |
| astral_enrich | Check for memories needing user clarification |
| astral_stats | Full memory system statistics and health |
| astral_sync | Sync with Orbital Fortress (when configured) |
Auto-recall and auto-capture
In addition to the manual tools, two hooks run automatically:
Auto-recall (before each turn) — searches memory for content relevant to the user's latest message and injects it into the agent's system prompt. The agent sees past context without needing to call astral_recall explicitly.
Auto-capture (after each turn) — sends the conversation through the surprise-gated pipeline. Only novel information is stored. The agent doesn't need to call astral_store for routine content.
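The auto-recall step can be pictured as "filter, rank, inject". The sketch below applies the documented `minSimilarity` and `maxRecallMemories` defaults to a list of search hits; the `score` and `text` field names and both function names are assumptions, since the search response format is not shown in this README.

```python
def select_memories(results, min_similarity=0.45, max_memories=5):
    """Keep the best-scoring hits at or above the similarity threshold."""
    kept = [r for r in results if r["score"] >= min_similarity]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return kept[:max_memories]


def inject_memories(system_prompt, memories):
    """Append selected memories to the system prompt as a bullet list."""
    if not memories:
        return system_prompt
    lines = "\n".join(f"- {m['text']}" for m in memories)
    return f"{system_prompt}\n\nRelevant memories:\n{lines}"
```

When nothing clears the threshold the prompt is returned unchanged, so a turn with no relevant history costs no extra tokens.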
Ingesting Existing Data
If you have existing notes, documents, or conversation history you want the memory server to learn from, use the ingest endpoint.
Ingest conversation turns
curl -X POST http://localhost:8090/v1/memory/ingest \
-H "Content-Type: application/json" \
-d '{
"turns": [
{
"user": "We use Kubernetes for deployment with ArgoCD for GitOps",
"assistant": "Noted — Kubernetes with ArgoCD for your deployment pipeline."
},
{
"user": "Our database is PostgreSQL 16 with pgvector for embeddings",
"assistant": "Got it — PostgreSQL 16 with the pgvector extension."
}
],
"source": "initial-import"
}'
The server evaluates each turn and stores only what it considers novel. If you ingest overlapping information, duplicates are automatically filtered out.
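Because the request body above carries a list of turns, several turns can be grouped into one request. The sketch below builds such bodies in batches using only the standard library. The batch size and helper names are assumptions, and the README does not state a per-request limit, so keep batches small until you have confirmed what your server accepts.

```python
import json
import urllib.request
from itertools import islice

MEMORY = "http://localhost:8090"


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def build_payload(turns, source):
    """Assemble the JSON body in the shape used by the curl example."""
    return {"turns": list(turns), "source": source}


def post_ingest(payload, base_url=MEMORY):
    """POST one payload to the ingest endpoint and return the HTTP status."""
    request = urllib.request.Request(
        f"{base_url}/v1/memory/ingest",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return resp.status
```

A typical loop would be `for chunk in batched(all_turns, 20): post_ingest(build_payload(chunk, "initial-import"))`.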
Bulk ingest from a file
For larger imports, use a simple script:
import json
import requests

MEMORY = "http://localhost:8090"

# Load your data (any format) and convert each record to a turn
with open("my_notes.jsonl", encoding="utf-8") as f:
    for line in f:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        resp = requests.post(f"{MEMORY}/v1/memory/ingest", json={
            "turns": [{
                "user": record["question"],
                "assistant": record["answer"]
            }],
            "source": "bulk-import"
        }, timeout=30)
        resp.raise_for_status()  # stop on the first failed request
print("Import complete")
Ingest plain text (without conversation structure)
If you have standalone notes or documents:
curl -X POST http://localhost:8090/v1/memory/ingest \
-H "Content-Type: application/json" \
-d '{
"turns": [
{
"user": "Remember this: Our SLA requires 99.9% uptime for the payments service",
"assistant": "Stored."
}
],
"source": "manual-notes"
}'
Ingest an entire folder
Point the server at a directory and it will walk all matching files:
curl -X POST http://localhost:8090/v1/memory/ingest/folder \
-H "Content-Type: application/json" \
-d '{
"path": "/home/user/project/docs",
"extensions": ["md", "txt", "py"],
"since_days": 60,
"recursive": true
}'
Check what was stored
# Total memory count
curl http://localhost:8090/v1/memory/stats
# Search for specific memories
curl -X POST http://localhost:8090/v1/memory/search \
-H "Content-Type: application/json" \
-d '{"query": "deployment", "limit": 5}'
# Get the briefing card
curl http://localhost:8090/v1/memory/briefing
Delete imported memories if needed
# Delete all memories from a specific source
curl -X DELETE http://localhost:8090/v1/memory/source/bulk-import
Re-embedding Memories
If you switch to a different embedding model (different dimensions or better quality), your existing memories need to be re-embedded to match the new vector space.
When do I need to re-embed?
- You changed the embedding model (e.g. from nomic 768d to a 1024d model)
- You upgraded to a newer version of the same model
- Searches are returning poor results after a model change
How to re-embed
Stop the memory server, swap the embedding model, and restart with the re-embed flag:
# 1. Stop the running server (Ctrl+C or kill the process)
# 2. Start the new embedding model on port 8081
llama-server \
--model your-new-embedding-model.gguf \
--port 8081 \
--embedding \
--ctx-size 2048
# 3. Restart the memory server with re-embed flag
./astral-memory-server --rebuild-embeddings
The server will re-process all existing memories through the new embedding model. This may take a few minutes depending on how many memories you have. Progress is shown in the terminal.
Your memories (the actual text) are never modified — only the vector representations are regenerated.
Checking embedding status
curl http://localhost:8090/health
The response includes embedding dimension and model information so you can verify the new model is active.
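A scripted version of this check might compare the health response against the model you expect. The field names `embedding_dim` and `embedding_model` below are hypothetical, since the README does not list the actual keys; inspect your server's `/health` output and adjust them.

```python
from typing import Optional


def embedding_matches(health: dict, expected_dim: int,
                      expected_model: Optional[str] = None) -> bool:
    """Return True if the health payload reports the expected embedding setup."""
    dim = health.get("embedding_dim")        # hypothetical field name
    model = health.get("embedding_model")    # hypothetical field name
    if dim != expected_dim:
        return False
    return expected_model is None or model == expected_model
```

A missing field counts as a mismatch, which is the safer default after a model swap.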
Plugin Configuration
| Option | Default | Description |
|--------|---------|-------------|
| serverUrl | http://localhost:8090 | Memory server URL |
| autoCapture | true | Store memories after each conversation turn |
| autoRecall | true | Inject relevant memories into prompts |
| maxRecallMemories | 5 | Max memories injected per prompt |
| minSimilarity | 0.45 | Minimum cosine similarity for recall (0.0-1.0) |
| briefingCardOnStart | true | Inject briefing card at session start |
| briefingMaxTokens | 200 | Max tokens for the briefing card (50-500) |
| captureMinMessages | 2 | Minimum messages before capturing |
| captureMaxChars | 8000 | Max chars per message sent for capture |
| fortressUrl | (empty) | Orbital Fortress URL for cross-device sync |
| healthCheckOnStart | true | Check server health on plugin load |
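The table above translates naturally into a defaults-plus-validation step. The sketch below merges user settings over the documented defaults and enforces the two ranges the table states (`minSimilarity` 0.0-1.0, `briefingMaxTokens` 50-500). The function name and the choice to reject unknown keys are assumptions, not the plugin's documented behavior.

```python
DEFAULTS = {
    "serverUrl": "http://localhost:8090",
    "autoCapture": True,
    "autoRecall": True,
    "maxRecallMemories": 5,
    "minSimilarity": 0.45,
    "briefingCardOnStart": True,
    "briefingMaxTokens": 200,
    "captureMinMessages": 2,
    "captureMaxChars": 8000,
    "fortressUrl": "",
    "healthCheckOnStart": True,
}

# Ranges taken from the option descriptions in the table.
RANGES = {"minSimilarity": (0.0, 1.0), "briefingMaxTokens": (50, 500)}


def resolve_config(user_config: dict) -> dict:
    """Merge user settings over the defaults and validate documented ranges."""
    unknown = set(user_config) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    config = {**DEFAULTS, **user_config}
    for key, (low, high) in RANGES.items():
        if not low <= config[key] <= high:
            raise ValueError(f"{key} must be between {low} and {high}")
    return config
```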
Memory Server Configuration
| Flag | Default | Description |
|------|---------|-------------|
| --port | 8090 | Memory server port |
| --embedding-url | http://localhost:8081 | Embedding server URL |
| --embedding-api-key | (none) | API key for remote embedding services |
| --data-dir | ./data | Where memories are stored on disk |
| --rebuild-embeddings | (off) | Re-embed all memories on startup |
| --activate KEY | (none) | Activate with license key (first run only) |
Competitive Comparison
| Feature | memory-core | memory-lancedb | lancedb-pro | Astral Core |
|---------|------------|----------------|-------------|----------------|
| Works offline | Yes | No (needs OpenAI key) | No (needs API key) | Yes |
| Embedding model | SQLite FTS | text-embedding-3-small | text-embedding-3-small | nomic-embed (local) |
| Write intelligence | Store everything | Store everything | Store + importance | Surprise-gated |
| Memory lifecycle | None | None | Time decay | 3-tier + dormancy |
| Briefing cards | No | No | No | Yes |
| Importance scoring | No | No | No | Yes |
| Enrichment hints | No | No | No | Yes |
| Cross-device sync | No | No | No | Yes (Fortress) |
| Cost after install | $0 | ~$5-15/mo (OpenAI) | ~$5-15/mo | $0 |
Troubleshooting
"Connection refused" on localhost:8090
The memory server isn't running. Start it:
./astral-memory-server
"Embedding server not available"
The embedding model server isn't running on port 8081. Start it:
llama-server --model nomic-embed-text-v1.5.Q5_K_M.gguf --port 8081 --embedding
Memories aren't being stored
The server filters out information it already knows or considers redundant. Check what's stored:
curl http://localhost:8090/v1/memory/stats
If the count is zero, verify the embedding server is running, because memories can't be stored without embeddings.
Search returns irrelevant results
This usually means the embedding model changed since memories were stored. Re-embed:
./astral-memory-server --rebuild-embeddings
Too many dormant reactivations (slow search)
If you have thousands of memories and search feels slow, the minSimilarity threshold may be too low. Raise it in your plugin config:
{
  "config": {
    "minSimilarity": 0.5
  }
}
The default was raised from 0.3 to 0.45 in v2.0.0 to address this. Values of 0.45-0.55 work well for most use cases.
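To see why a higher threshold trims the candidate set, it helps to compute cosine similarity directly. The sketch below uses tiny 2-D vectors instead of real embeddings, and the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def count_matches(query, candidates, threshold):
    """Number of candidates whose similarity to the query meets the threshold."""
    return sum(1 for c in candidates if cosine_similarity(query, c) >= threshold)
```

With query `[1, 0]` and candidates `[1, 0]`, `[1, 1]`, `[1, 2]`, `[0, 1]`, the similarities are roughly 1.0, 0.71, 0.447 and 0.0, so a threshold of 0.3 admits three candidates while 0.45 admits two.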
Briefing card not appearing
The briefing card requires memory server v2.5.0 or later. Check your server version:
curl http://localhost:8090/health | jq .version
If the version is older, download the latest binary from orbitalfortress.com.
Also verify that briefingCardOnStart is true in your plugin config (it is by default).
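If you script this check, compare versions numerically instead of as strings (as a string, "2.10.0" sorts before "2.5.0"). The sketch below assumes the `version` value is a dotted string such as `2.5.0`, optionally prefixed with `v` or followed by a `-suffix`; both helper names are illustrative.

```python
def parse_version(version: str) -> tuple:
    """Turn "v2.5.1-beta" into (2, 5, 1); any pre-release suffix is ignored."""
    core = version.strip().lstrip("v").split("-")[0]
    return tuple(int(part) for part in core.split(".")[:3])


def meets_minimum(version: str, minimum: str = "2.5.0") -> bool:
    """True if `version` is at least `minimum`, compared component by component."""
    return parse_version(version) >= parse_version(minimum)
```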
Importance scoring not showing in stats
Importance scoring requires the B2 update (memory server with
importance_scorer.py). The stats endpoint will include an
importance_scoring section when available. Older servers
simply omit it — the plugin handles this gracefully.
Architecture
Astral Core Memory is part of the Astral Core project — an open-source AI memory engine built for privacy-first, offline-capable AI assistants.
The full stack:
| Component | License | What it does |
|-----------|---------|-------------|
| Memory engine (MASK/HOPE) | MIT | Surprise-gated write pipeline |
| Memory API server | MIT | REST endpoints on :8090 |
| This OpenClaw plugin | MIT | Bridge to OpenClaw Gateway |
| Orbital Fortress | AGPL-3.0 | Fleet sync server (self-hostable) |
Contributing
Issues and PRs welcome at github.com/suocommerce/astral-core.
If you're building a memory plugin for another platform (Cursor, Continue, VS Code), the Memory API server is platform-agnostic — any HTTP client can talk to it.
Links
- Get a license — €19, one-time, no subscription
- How Astral Core works — for non-technical users
- For Developers — API reference and integration guide
- Report an issue
License
MIT — see LICENSE for details.
The plugin is open source. The Astral Core memory server binary requires a license (€19 one-time).
Your AI should remember you without phoning home.
