@suocommerce/memory-openclaw

v2.0.0

Published

2 months ago

The only OpenClaw memory plugin that works completely offline. Surprise-gated learning, briefing cards, importance scoring, local embeddings, fleet sync. No API keys required.

Generation Timestamp: 2026-04-08T19:45:00Z

Astral Core Memory — OpenClaw Plugin

Offline-first persistent memory for OpenClaw agents. No API keys required.

Your agent remembers across sessions, learns what matters, and works entirely on your machine.

Get the memory server — orbitalfortress.com · €19 one-time · macOS · Windows · Linux

What's New in v2.0.0

Briefing cards — every session starts with a ≤200 token summary of who the user is and what they're working on. The agent knows you from the first message.
Enrichment hints — the memory system flags ambiguous memories and the agent can ask targeted follow-up questions to improve quality.
Importance scoring — high-value memories resist dormancy. Frequently accessed, starred, or high-utility memories stay active longer.
Dormant cold storage — inactive memories move to cold storage to reduce RAM usage. They reactivate automatically when semantically relevant.
Configurable similarity threshold — minSimilarity default raised from 0.3 to 0.45, reducing noisy recall at scale (7k+ memories).

Quick Start

1. Install the plugin

npm install @astralcore/memory-openclaw

2. Download and run the memory server

The plugin talks to the Astral Core memory server running on your machine.

# macOS (Apple Silicon)
curl -L https://orbitalfortress.com/download/macos -o astral-memory-server
chmod +x astral-memory-server

# Linux x86_64
curl -L https://orbitalfortress.com/download/linux -o astral-memory-server
chmod +x astral-memory-server

# Windows — download from https://orbitalfortress.com/download/windows

Activate with your license key (one-time):

./astral-memory-server --activate SOUL-XXXX-XXXX-XXXX-XXXX

Start the server:

./astral-memory-server

The server runs on http://localhost:8090. Verify:

curl http://localhost:8090/health

3. Add the plugin to your OpenClaw config

{
  "plugins": {
    "memory": {
      "provider": "@astralcore/memory-openclaw",
      "config": {
        "serverUrl": "http://localhost:8090",
        "autoCapture": true,
        "autoRecall": true,
        "maxRecallMemories": 5,
        "briefingCardOnStart": true
      }
    }
  }
}

That's it. Your agent now has persistent memory.

How It Works

Briefing Cards

At the start of every session, the plugin fetches a briefing card from the memory server. The card is a summary of at most 200 tokens by default (configurable with briefingMaxTokens) containing:

Identity facts — the user's name, role, preferences
Active context — current projects, recent decisions, open tasks
Category health — which knowledge areas are strong or sparse

The agent sees this card before the first turn, so it begins every conversation with awareness of who it's talking to. No "remind me what we were working on" needed.

The briefing card is generated by the memory engine from stored memories — it's not a static profile. As the user's context evolves, so does the card.
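To make the token budget concrete, here is a minimal sketch of packing candidate facts into a card under a limit. It is not the memory engine's actual algorithm: the `build_card` helper, the numeric priorities, and the whitespace-based token estimate are all assumptions made for illustration, and real tokenizers count differently.

```python
def build_card(facts, max_tokens=200):
    """Greedily pack (priority, text) facts into a card under a token budget.

    Hypothetical helper: tokens are approximated by whitespace-separated words.
    """
    lines, used = [], 0
    # Highest priority first; skip any fact that would overflow the budget.
    for _, text in sorted(facts, key=lambda f: f[0], reverse=True):
        cost = len(text.split())
        if used + cost <= max_tokens:
            lines.append(text)
            used += cost
    return "\n".join(lines), used


facts = [
    (3, "User is a backend engineer named Sam"),
    (2, "Current project: migrating billing to PostgreSQL 16"),
    (1, "Prefers concise answers"),
]
# A deliberately tiny budget so the trimming is visible.
card, used = build_card(facts, max_tokens=12)
```

With the small budget above, the second fact does not fit and is skipped, while the shorter third fact still makes it onto the card.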

Surprise-Gated Learning

Not everything gets stored. The memory engine uses a delta-rule matrix to predict each incoming piece of information from what it already knows. Only content the engine could not predict, meaning genuinely novel content, passes the surprise gate. This means:

Repeated information is automatically filtered
The memory store grows with quality, not just volume
No manual deduplication needed
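The engine's internals are not documented here, so the following is only a rough illustration of the general idea: a textbook delta-rule associative memory with a surprise gate. The class name, the threshold of 0.5, the learning rate, and the use of tiny hand-written vectors in place of real embeddings are all assumptions.

```python
import math


class DeltaRuleMemory:
    """Toy associative memory: W maps a key vector to a predicted value vector."""

    def __init__(self, dim, lr=1.0, threshold=0.5):
        self.dim = dim
        self.lr = lr                # with unit-length keys, lr=1.0 learns in one step
        self.threshold = threshold  # minimum prediction error that counts as novel
        self.W = [[0.0] * dim for _ in range(dim)]

    def predict(self, key):
        return [sum(w * k for w, k in zip(row, key)) for row in self.W]

    def observe(self, key, value):
        """Return True if the pair was surprising enough to be written."""
        error = [v - p for v, p in zip(value, self.predict(key))]
        surprise = math.sqrt(sum(e * e for e in error))
        if surprise <= self.threshold:
            return False  # already predicted well enough: filtered out
        # Delta rule: W += lr * error (outer product) key
        for i in range(self.dim):
            for j in range(self.dim):
                self.W[i][j] += self.lr * error[i] * key[j]
        return True


memory = DeltaRuleMemory(dim=3)
first = memory.observe([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])   # novel: written
repeat = memory.observe([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])  # now predicted: skipped
```

The second call is rejected because the matrix already reproduces the value from the key, which is the sense in which repeated information is filtered without an explicit deduplication pass.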

Three-Tier Memory Lifecycle

Memories progress through tiers based on how useful they prove to be:

Fast (today) → Medium (multi-session) → Slow (long-term)
                                              ↓
                                        Dormant (cold storage)
                                              ↓
                                        Reactivated when needed

The importance scoring system (v2.0.0) protects high-value memories from going dormant. Memories with high access counts, strong utility scores, or starred metadata stay active longer.
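As a sketch of how importance could delay dormancy, the example below combines the three signals named above into one score and stretches the allowed idle period accordingly. The weights, the 30-day base period, the `Memory` fields, and both function names are hypothetical; the real scoring formula is not described in this README.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    access_count: int = 0
    utility: float = 0.0   # assumed to lie in [0, 1]
    starred: bool = False
    idle_days: int = 0


def importance(m):
    """Hypothetical score in [0, 1]; the weights are illustrative only."""
    access = min(m.access_count, 10) / 10
    return min(1.0, 0.4 * access + 0.4 * m.utility + (0.2 if m.starred else 0.0))


def should_go_dormant(m, base_idle_days=30):
    """Important memories tolerate a longer idle period before going dormant."""
    allowed = base_idle_days * (1 + 2 * importance(m))
    return m.idle_days > allowed


routine = Memory("Asked about the weather once", idle_days=45)
valued = Memory("Payments SLA is 99.9% uptime", access_count=10,
                utility=0.5, starred=True, idle_days=45)
```

Both memories have been idle for 45 days, but only the routine one crosses its dormancy limit; the starred, frequently accessed one stays active.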

Enrichment Hints

The Cognitive Shell flags memories where the stored information is ambiguous or could benefit from clarification. The agent can surface these as natural follow-up questions:

"You mentioned framework X — did you mean React or the testing library?"
"Last time you said the deadline was March — has that changed?"

Enrichment is gentle — the agent asks at most one clarification per conversation, woven into the natural flow.

Fleet Sync (Optional)

With an Orbital Fortress server, memories sync across devices:

Your laptop → Orbital Fortress → Your desktop
                    ↓
            Briefings from fleet

Memories are uploaded as derivative representations (not raw text) and briefings from other devices are pulled back. Your laptop learns what your desktop discovered.

Learn more at orbitalfortress.com.

Setting Up the Embedding Model

The memory server needs a local embedding model to convert text into searchable vectors. No external API keys are needed — everything runs on your machine.

Option A — Use the bundled model (recommended)

If you have llama.cpp installed, start the embedding server with the nomic-embed model:

# Download the model (~300MB)
curl -L https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/resolve/main/nomic-embed-text-v1.5.Q5_K_M.gguf \
  -o nomic-embed-text-v1.5.Q5_K_M.gguf

# Start the embedding server on port 8081
llama-server \
  --model nomic-embed-text-v1.5.Q5_K_M.gguf \
  --port 8081 \
  --embedding \
  --ctx-size 2048

The memory server connects to localhost:8081 by default.
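To check the embedding server independently of the memory server, you can request a vector directly. The sketch below assumes llama-server's OpenAI-style `/v1/embeddings` route and response shape; which route the memory server itself calls is not stated here, and the helper names are made up for this example. Hosted OpenAI-style services additionally expect a `model` field in the request body.

```python
import json
import urllib.request


def build_embedding_request(text, base_url="http://localhost:8081"):
    """Build a POST request for an OpenAI-style /v1/embeddings endpoint."""
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embedding(payload):
    """Extract the first vector from an OpenAI-style embeddings response."""
    return payload["data"][0]["embedding"]


def embed(text, base_url="http://localhost:8081"):
    """Send the request; needs the embedding server to be running."""
    request = build_embedding_request(text, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_embedding(json.load(response))
```

Calling `len(embed("hello"))` against a running server reports the embedding dimension, which is handy when comparing models.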

Option B — Use a different embedding model

Any OpenAI-compatible embedding endpoint works. Pass the URL when starting the memory server:

./astral-memory-server --embedding-url http://localhost:11434/api/embeddings

This works with Ollama, LM Studio, or any service that serves embeddings over HTTP.

Option C — Use a remote embedding API

If you prefer a hosted embedding service:

./astral-memory-server --embedding-url https://api.openai.com/v1/embeddings \
  --embedding-api-key sk-your-key

Your memory store still stays on your machine. Note, however, that the text of each memory has to be sent to the hosted service so it can compute the embedding, so this option is not fully offline.

Agent Tools

The plugin registers these tools with the OpenClaw agent:

| Tool | What it does |
|------|-------------|
| astral_recall | Semantic search across all memory tiers |
| astral_store | Explicitly store a fact, preference, or decision |
| astral_forget | Delete all memories from a specific source |
| astral_briefing | Get a briefing card summary on demand |
| astral_enrich | Check for memories needing user clarification |
| astral_stats | Full memory system statistics and health |
| astral_sync | Sync with Orbital Fortress (when configured) |

Auto-recall and auto-capture

In addition to the manual tools, two hooks run automatically:

Auto-recall (before each turn) — searches memory for content relevant to the user's latest message and injects it into the agent's system prompt. The agent sees past context without needing to call astral_recall explicitly.
Auto-capture (after each turn) — sends the conversation through the surprise-gated pipeline. Only novel information is stored. The agent doesn't need to call astral_store for routine content.

Ingesting Existing Data

If you have existing notes, documents, or conversation history you want the memory server to learn from, use the ingest endpoint.

Ingest conversation turns

curl -X POST http://localhost:8090/v1/memory/ingest \
  -H "Content-Type: application/json" \
  -d '{
    "turns": [
      {
        "user": "We use Kubernetes for deployment with ArgoCD for GitOps",
        "assistant": "Noted — Kubernetes with ArgoCD for your deployment pipeline."
      },
      {
        "user": "Our database is PostgreSQL 16 with pgvector for embeddings",
        "assistant": "Got it — PostgreSQL 16 with the pgvector extension."
      }
    ],
    "source": "initial-import"
  }'

The server evaluates each turn and stores only what it considers novel. If you ingest overlapping information, duplicates are automatically filtered out.

Bulk ingest from a file

For larger imports, use a simple script:

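import json
import requests

MEMORY = "http://localhost:8090"

# Load your data (here: one JSON object per line) and convert each record to a turn
with open("my_notes.jsonl", encoding="utf-8") as f:
    for line in f:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        response = requests.post(
            f"{MEMORY}/v1/memory/ingest",
            json={
                "turns": [{
                    "user": record["question"],
                    "assistant": record["answer"],
                }],
                "source": "bulk-import",
            },
            timeout=30,
        )
        response.raise_for_status()  # stop on the first failed request

print("Import complete")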
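Because the ingest endpoint accepts a list of turns, a large import can send several turns per request instead of one. The helper below is a sketch of that batching step only. The `batch_turns` name and the batch size of 20 are arbitrary choices, and whether the server limits the number of turns per request is not documented here.

```python
def batch_turns(records, batch_size=20, source="bulk-import"):
    """Yield ingest payloads carrying up to `batch_size` turns each."""
    turns = [{"user": r["question"], "assistant": r["answer"]} for r in records]
    for start in range(0, len(turns), batch_size):
        yield {"turns": turns[start:start + batch_size], "source": source}


records = [{"question": f"Q{i}", "answer": f"A{i}"} for i in range(5)]
payloads = list(batch_turns(records, batch_size=2))
```

Each payload can then be posted to `/v1/memory/ingest` exactly like the single-turn payload in the script above.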

Ingest plain text (without conversation structure)

If you have standalone notes or documents:

curl -X POST http://localhost:8090/v1/memory/ingest \
  -H "Content-Type: application/json" \
  -d '{
    "turns": [
      {
        "user": "Remember this: Our SLA requires 99.9% uptime for the payments service",
        "assistant": "Stored."
      }
    ],
    "source": "manual-notes"
  }'
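For a longer document, the same wrapping can be applied paragraph by paragraph. This is a sketch under assumptions: splitting on blank lines is just one reasonable way to chunk notes, and `notes_to_turns` is a hypothetical helper, not part of the plugin.

```python
import re


def notes_to_turns(text):
    """Split plain text on blank lines and wrap each paragraph as a turn."""
    turns = []
    for paragraph in re.split(r"\n\s*\n", text):
        note = " ".join(paragraph.split())  # collapse line breaks and extra spaces
        if note:
            turns.append({"user": f"Remember this: {note}", "assistant": "Stored."})
    return turns


document = "SLA is 99.9% uptime\nfor the payments service\n\n\nOn-call rotates weekly\n"
turns = notes_to_turns(document)
```

The resulting list can be used directly as the "turns" value of an ingest request.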

Ingest an entire folder

Point the server at a directory and it will walk all matching files:

curl -X POST http://localhost:8090/v1/memory/ingest/folder \
  -H "Content-Type: application/json" \
  -d '{
    "path": "/home/user/project/docs",
    "extensions": ["md", "txt", "py"],
    "since_days": 60,
    "recursive": true
  }'
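Before sending a folder request, it can help to preview locally which files the filters would select. The sketch below mirrors the request's extensions, since_days, and recursive fields, but the matching rules are assumptions (case-insensitive extensions, modification time for since_days); the server may interpret them differently.

```python
import os
import time


def preview_folder(path, extensions, since_days=None, recursive=True):
    """List files a folder ingest might pick up (local dry run, assumed rules)."""
    wanted = {"." + e.lower().lstrip(".") for e in extensions}
    cutoff = None if since_days is None else time.time() - since_days * 86400
    matches = []
    for root, dirs, files in os.walk(path):
        if not recursive:
            dirs.clear()  # stop os.walk from descending into subdirectories
        for name in files:
            full = os.path.join(root, name)
            if os.path.splitext(name)[1].lower() not in wanted:
                continue  # extension not requested
            if cutoff is not None and os.path.getmtime(full) < cutoff:
                continue  # last modified before the since_days window
            matches.append(full)
    return sorted(matches)
```

For example, `preview_folder("/home/user/project/docs", ["md", "txt", "py"], since_days=60)` lists candidates for the request shown above.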

Check what was stored

# Total memory count
curl http://localhost:8090/v1/memory/stats

# Search for specific memories
curl -X POST http://localhost:8090/v1/memory/search \
  -H "Content-Type: application/json" \
  -d '{"query": "deployment", "limit": 5}'

# Get the briefing card
curl http://localhost:8090/v1/memory/briefing

Delete imported memories if needed

# Delete all memories from a specific source
curl -X DELETE http://localhost:8090/v1/memory/source/bulk-import

Re-embedding Memories

If you switch to a different embedding model (different dimensions or better quality), your existing memories need to be re-embedded to match the new vector space.

When do I need to re-embed?

You changed the embedding model (e.g. from nomic 768d to a 1024d model)
You upgraded to a newer version of the same model
Searches are returning poor results after a model change

How to re-embed

Stop the memory server, swap the embedding model, and restart with the re-embed flag:

# 1. Stop the running server (Ctrl+C or kill the process)

# 2. Start the new embedding model on port 8081
llama-server \
  --model your-new-embedding-model.gguf \
  --port 8081 \
  --embedding \
  --ctx-size 2048

# 3. Restart the memory server with re-embed flag
./astral-memory-server --rebuild-embeddings

The server will re-process all existing memories through the new embedding model. This may take a few minutes depending on how many memories you have. Progress is shown in the terminal.

Your memories (the actual text) are never modified — only the vector representations are regenerated.

Checking embedding status

curl http://localhost:8090/health

The response includes embedding dimension and model information so you can verify the new model is active.

Plugin Configuration

| Option | Default | Description |
|--------|---------|-------------|
| serverUrl | http://localhost:8090 | Memory server URL |
| autoCapture | true | Store memories after each conversation turn |
| autoRecall | true | Inject relevant memories into prompts |
| maxRecallMemories | 5 | Max memories injected per prompt |
| minSimilarity | 0.45 | Minimum cosine similarity for recall (0.0-1.0) |
| briefingCardOnStart | true | Inject briefing card at session start |
| briefingMaxTokens | 200 | Max tokens for the briefing card (50-500) |
| captureMinMessages | 2 | Minimum messages before capturing |
| captureMaxChars | 8000 | Max chars per message sent for capture |
| fortressUrl | (empty) | Orbital Fortress URL for cross-device sync |
| healthCheckOnStart | true | Check server health on plugin load |
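If you generate the plugin config programmatically, a small check against the documented defaults and ranges can catch typos early. This sketch takes the option names, defaults, and ranges from the table above. Rejecting unknown keys and out-of-range values is this example's own policy; how the plugin itself reacts to them is not documented here.

```python
DEFAULTS = {
    "serverUrl": "http://localhost:8090",
    "autoCapture": True,
    "autoRecall": True,
    "maxRecallMemories": 5,
    "minSimilarity": 0.45,
    "briefingCardOnStart": True,
    "briefingMaxTokens": 200,
    "captureMinMessages": 2,
    "captureMaxChars": 8000,
    "fortressUrl": "",
    "healthCheckOnStart": True,
}


def resolve_config(overrides):
    """Merge overrides onto the documented defaults and check documented ranges."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    config = {**DEFAULTS, **overrides}
    if not 0.0 <= config["minSimilarity"] <= 1.0:
        raise ValueError("minSimilarity must be between 0.0 and 1.0")
    if not 50 <= config["briefingMaxTokens"] <= 500:
        raise ValueError("briefingMaxTokens must be between 50 and 500")
    return config


config = resolve_config({"minSimilarity": 0.5})
```

A misspelled key such as `minSimilarty` raises immediately instead of being silently ignored.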

Memory Server Configuration

| Flag | Default | Description |
|------|---------|-------------|
| --port | 8090 | Memory server port |
| --embedding-url | http://localhost:8081 | Embedding server URL |
| --embedding-api-key | (none) | API key for remote embedding services |
| --data-dir | ./data | Where memories are stored on disk |
| --rebuild-embeddings | (off) | Re-embed all memories on startup |
| --activate KEY | (none) | Activate with license key (first run only) |
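When launching the server from a script, building the argument list explicitly avoids shell-quoting mistakes. The helper below is a sketch that uses only the flags from the table above. Its name and the binary path are placeholders, and it assumes these flags can be freely combined, which this README does not confirm.

```python
def build_server_command(binary="./astral-memory-server", port=None,
                         embedding_url=None, embedding_api_key=None,
                         data_dir=None, rebuild_embeddings=False):
    """Assemble an argument list; flags left unset fall back to server defaults."""
    args = [binary]
    for flag, value in (
        ("--port", port),
        ("--embedding-url", embedding_url),
        ("--embedding-api-key", embedding_api_key),
        ("--data-dir", data_dir),
    ):
        if value is not None:
            args += [flag, str(value)]
    if rebuild_embeddings:
        args.append("--rebuild-embeddings")
    return args


command = build_server_command(port=8090, data_dir="./data", rebuild_embeddings=True)
```

The resulting list can be passed to `subprocess.run(command)` or `subprocess.Popen(command)` without going through a shell.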

Competitive Comparison

| Feature | memory-core | memory-lancedb | lancedb-pro | Astral Core |
|---------|------------|----------------|-------------|----------------|
| Works offline | Yes | No (needs OpenAI key) | No (needs API key) | Yes |
| Embedding model | SQLite FTS | text-embedding-3-small | text-embedding-3-small | nomic-embed (local) |
| Write intelligence | Store everything | Store everything | Store + importance | Surprise-gated |
| Memory lifecycle | None | None | Time decay | 3-tier + dormancy |
| Briefing cards | No | No | No | Yes |
| Importance scoring | No | No | No | Yes |
| Enrichment hints | No | No | No | Yes |
| Cross-device sync | No | No | No | Yes (Fortress) |
| Cost after install | $0 | ~$5-15/mo (OpenAI) | ~$5-15/mo | $0 |

Troubleshooting

"Connection refused" on localhost:8090

The memory server isn't running. Start it:

./astral-memory-server

"Embedding server not available"

The embedding model server isn't running on port 8081. Start it:

llama-server --model nomic-embed-text-v1.5.Q5_K_M.gguf --port 8081 --embedding

Memories aren't being stored

The server filters out information it already knows or considers redundant. Check what's stored:

curl http://localhost:8090/v1/memory/stats

If the count is zero, verify the embedding server is running — memories can't be stored without embeddings.

Search returns irrelevant results

This usually means the embedding model changed since memories were stored. Re-embed:

./astral-memory-server --rebuild-embeddings

Too many dormant reactivations (slow search)

If you have thousands of memories and search feels slow, the minSimilarity threshold may be too low. Raise it in your plugin config:

{
  "config": {
    "minSimilarity": 0.5
  }
}

The default was raised from 0.3 to 0.45 in v2.0.0 to address this. Values of 0.45-0.55 work well for most use cases.
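To see why a small change in the threshold matters, the sketch below computes cosine similarity by hand and counts how many candidates pass at each setting. The two-dimensional vectors are toy data chosen for readability; real embeddings have hundreds of dimensions, and the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def count_passing(query, vectors, min_similarity):
    """Count candidates whose similarity to the query meets the threshold."""
    return sum(1 for v in vectors if cosine_similarity(query, v) >= min_similarity)


query = [1.0, 0.0]
candidates = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [0.0, 1.0]]
at_old_default = count_passing(query, candidates, 0.3)
at_new_default = count_passing(query, candidates, 0.45)
```

The third candidate has a similarity of about 0.447, so it is recalled at 0.3 but filtered at 0.45. Across thousands of memories, borderline matches like this one are what add up to noisy recall.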

Briefing card not appearing

The briefing card requires memory server v2.5.0 or later. Check your server version:

curl http://localhost:8090/health | jq .version

If the version is older, download the latest binary from orbitalfortress.com.

Also verify that briefingCardOnStart is true in your plugin config (it is by default).
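The same version check can be scripted. The sketch below assumes the `version` field returned by /health is a plain dotted string such as "2.5.0"; the helper names are hypothetical and pre-release suffixes are not handled. Versions are compared as tuples of integers because a plain string comparison would rank "2.10.0" below "2.5.0".

```python
def parse_version(version):
    """Turn '2.5.0' or 'v2.5.0' into (2, 5, 0)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def supports_briefing(health, minimum="2.5.0"):
    """`health` is the decoded JSON from /health; only `version` is read."""
    return parse_version(health["version"]) >= parse_version(minimum)


new_enough = supports_briefing({"version": "2.10.1"})
too_old = supports_briefing({"version": "2.4.9"})
```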

Importance scoring not showing in stats

Importance scoring requires the B2 update (memory server with importance_scorer.py). The stats endpoint will include an importance_scoring section when available. Older servers simply omit it — the plugin handles this gracefully.

Architecture

Astral Core Memory is part of the Astral Core project — an open-source AI memory engine built for privacy-first, offline-capable AI assistants.

The full stack:

| Component | License | What it does |
|-----------|---------|-------------|
| Memory engine (MASK/HOPE) | MIT | Surprise-gated write pipeline |
| Memory API server | MIT | REST endpoints on :8090 |
| This OpenClaw plugin | MIT | Bridge to OpenClaw Gateway |
| Orbital Fortress | AGPL-3.0 | Fleet sync server (self-hostable) |

Contributing

Issues and PRs welcome at github.com/suocommerce/astral-core.

If you're building a memory plugin for another platform (Cursor, Continue, VS Code), the Memory API server is platform-agnostic — any HTTP client can talk to it.

License

MIT — see LICENSE for details.

The plugin is open source. The Astral Core memory server binary requires a license (€19 one-time).

Your AI should remember you without phoning home.

Get Started · Orbital Fortress · Report Issue