@zibby/memory
v0.1.3
Experimental — This feature is under active development. APIs, schema, and branching behavior may change between releases. Use in production at your own discretion.
A version-controlled test memory database powered by Dolt. Every test run enriches a shared knowledge base that the AI agent queries on future runs — making it smarter over time.
Why this exists
Without memory, every zibby run starts from scratch. The AI has no idea which
selectors are reliable, which pages exist in your app, or what failed last time.
With memory enabled:
- Selectors that worked 50 times get preferred over fragile alternatives.
- Flaky selectors get flagged so the AI avoids them.
- Page model (URLs, key elements, transitions) is pre-loaded — the AI already knows your app's structure before it starts.
- Failure history means the AI can avoid repeating the same mistakes.
- Everything is versioned. You can diff what changed between runs, roll back bad data, and branch for experiments.
Architecture
your-repo/
├── .zibby/
│   └── memory/        ← Dolt database (local, gitignored)
│       ├── .dolt/     ← Dolt internals (like .git/)
│       └── (data files)
├── .gitignore         ← includes .zibby/memory/
└── ...

Important: The Dolt database is not committed to git. Binary DB files cause merge conflicts that git cannot resolve. Team sync uses Dolt's native remote protocol instead (see Team sync below).
Database schema (4 tables)
| Table | Purpose |
|-------|---------|
| test_runs | One row per test execution — pass/fail, duration, assertion counts, timestamps. |
| selector_history | Every UI selector the AI has used, with success/failure counts per page URL. |
| page_model | Discovered pages (URL patterns), visit frequency, and key interactive elements. |
| page_transitions | How pages connect — from → to URL, trigger action, frequency. |
How the AI accesses memory
Memory is exposed as an MCP skill with 4 read-only tools:
| Tool | What it returns |
|------|-----------------|
| memory_get_test_history | Recent runs for a spec — pass/fail trend, timing, failure reasons. |
| memory_get_selectors | Known selectors for a page — stability score, last seen. |
| memory_get_page_model | Page structure — URLs, key elements, roles. |
| memory_get_navigation | Navigation map — page-to-page transitions. |
The AI calls these tools via SQL queries against Dolt during the execute_live
node. It decides what to query based on the test spec it's running.
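To illustrate how the AI might use a memory_get_selectors response, here is a hypothetical sketch. The row shape matches the selector_history columns described above, but the scoring formula is an assumption for illustration, not the package's actual algorithm:

```javascript
// Hypothetical sketch: rank rows shaped like the selector_history
// columns (stable_id, success_count, failure_count) by a simple
// stability score. The formula is an illustration, not the package's
// actual scoring.
function rankSelectors(rows) {
  // Laplace-smoothed success rate, so rarely-seen selectors aren't over-trusted
  const score = (r) =>
    (r.success_count + 1) / (r.success_count + r.failure_count + 2);
  return [...rows].sort((a, b) => score(b) - score(a));
}

const ranked = rankSelectors([
  { stable_id: '[data-test=login]', success_count: 50, failure_count: 1 },
  { stable_id: 'div:nth-child(3) > a', success_count: 4, failure_count: 6 },
]);
console.log(ranked[0].stable_id); // the stabler selector sorts first
```

With data like this, a selector that has succeeded 50 times wins over a fragile positional one even if both matched on the last run.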
Per-node persistence with branching
Memory uses Dolt's git-like branching for isolation:
main ────────────────────────────────────────── main (enriched)
 │                                                    ↑
 └── run/1773400000000                                │ merge
      ├── commit: "node execute_live: login.txt" ─────┘
      └── (branch deleted after merge)

 └── run/1773400099999          ← kept (test failed)
      └── commit: "node execute_live: signup.txt"

The lifecycle for every test run:
- Start — memoryStartRun creates branch run/<sessionId> from main.
- Per-node — After each node completes, memoryPersistNode writes data and commits on the run branch.
- End (pass) — memoryEndRun merges the branch into main and deletes it.
- End (fail) — Branch is left intact for debugging. main stays clean.
This means a crashed or failed run never pollutes your main knowledge base.
You can inspect failed branches later with dolt diff, dolt log, or SQL
queries to understand what went wrong.
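The pass/fail branch policy can be sketched with the Dolt operations injected as callbacks. The function and method names here are illustrative, not the package's internals:

```javascript
// Illustrative sketch of the end-of-run branch policy described above.
// `dolt` is an injected object with branch operations; the real package
// talks to an actual Dolt database instead.
function endRun(dolt, { sessionId, passed }) {
  const branch = `run/${sessionId}`;
  if (passed) {
    dolt.merge('main', branch); // fold the run's commits into main
    dolt.deleteBranch(branch);  // the run branch is no longer needed
    return { merged: true, kept: false };
  }
  // Failed run: keep the branch for `dolt diff` / `dolt log` debugging.
  return { merged: false, kept: true };
}
```

On failure nothing touches main, which is exactly why a crashed run cannot pollute the shared knowledge base.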
Getting started
Prerequisites
Install Dolt (one-time):
# macOS
brew install dolt
# Linux
sudo bash -c 'curl -L https://github.com/dolthub/dolt/releases/latest/download/install.sh | bash'
# Verify
dolt version

Initialize memory in your project
zibby init --mem

This does two things:
- Creates the Dolt database at .zibby/memory/
- Includes SKILLS.MEMORY in your workflow graph
Run tests with memory
# Memory is always enabled if you used --mem during init
zibby run test-specs/auth/login.txt
# To disable memory, edit .zibby/nodes/execute-live.mjs:
# Change: skills: [SKILLS.BROWSER, SKILLS.MEMORY],
# To:     skills: [SKILLS.BROWSER],

Check memory stats
zibby memory stats

Reset the database
zibby memory reset

Team sync (Dolt remotes)
The Dolt database is gitignored and lives only on your local machine by default. To share memory across a team, use Dolt's native remote protocol — it merges at the row/cell level (not binary files), so conflicts are extremely rare in practice.
Why not git?
Dolt stores data as binary chunk files in .dolt/noms/. If two developers
both run tests and commit to git, you get binary merge conflicts that git
cannot resolve. You'd have to pick one side and lose the other's data.
Dolt's own sync protocol understands SQL schema, primary keys, and row identity. When two people add different test runs, selectors, or insights, Dolt merges them cleanly — like git merging two files where different lines were edited.
Setup (one-time)
# Option 1: S3 backend (recommended for AWS teams)
zibby memory remote add aws://your-bucket/zibby-memory/your-project
# Option 2: DoltHub (hosted, no infra needed)
zibby memory remote add https://doltremoteapi.dolthub.com/your-org/your-db
# Option 3: Self-hosted (dolt remotesrv on EC2/ECS)
zibby memory remote add https://your-dolt-server.example.com/memory

How sync works
Sync is automatic when a remote is configured:
Developer A Developer B
────────────── ──────────────
zibby run login.txt --mem zibby run signup.txt --mem
◆ Synced from remote ◆ Synced from remote
◆ Memory loaded: 5 runs ◆ Memory loaded: 5 runs
→ learns login selectors → learns signup selectors
→ auto-push to remote → auto-push to remote
↓ Dolt merges row-level
Both runs now in shared DB

Before each run: middleware calls dolt pull to grab latest from remote.
After each run: onComplete calls dolt push to share back.
No manual steps. No git add/commit/push for the DB.
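The automatic flow above can be sketched as a small wrapper, with the Dolt calls injected as plain callbacks (synchronous here for clarity; the real operations are asynchronous shell calls). Names are illustrative, not the middleware's actual API:

```javascript
// Illustrative sketch of the pull -> run -> push flow described above.
// `pull`, `run`, and `push` stand in for `dolt pull`, the test run,
// and `dolt push`; the real middleware works asynchronously.
function runWithSync({ pull, run, push }, remoteConfigured) {
  if (remoteConfigured) pull(); // before the run: grab teammates' latest memory
  const result = run();         // the test run itself, enriching local memory
  if (remoteConfigured) push(); // after the run: share the enriched memory back
  return result;
}
```

With no remote configured, only the run itself executes and memory stays local.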
Manual sync commands
# Check remote status
zibby memory stats # shows remote info
# Manual push/pull (rarely needed)
cd .zibby/memory
dolt pull origin main
dolt push origin main

Conflict resolution
In practice, conflicts are extremely rare because:
- Test runs are always new rows with unique session IDs.
- Selectors use hash-based IDs — two people using the same selector just increment the same counter (Dolt handles this).
- Insights are always new rows, never edited.
- Pages/transitions use upsert — last writer wins for visit counts.
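As a simplified illustration of why row-level merging makes concurrent counter updates safe, a three-way merge can apply both sides' deltas relative to the common ancestor. This is a sketch of the idea, not Dolt's actual merge code:

```javascript
// Simplified three-way merge of a numeric counter cell: apply both
// sides' deltas relative to the common ancestor. A sketch of the idea
// behind row-level merging, not Dolt's implementation.
function mergeCounter(base, ours, theirs) {
  return base + (ours - base) + (theirs - base);
}

// Developer A saw the selector 3 more times, Developer B 2 more times:
mergeCounter(50, 53, 52); // → 55, both increments survive
```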
If a true conflict occurs (same cell edited), Dolt shows it and you can resolve with SQL:
cd .zibby/memory
dolt conflicts resolve --theirs selector_history
dolt commit -m "resolve selector conflict"

AWS architecture for Zibby platform
For teams already on the Zibby platform, memory sync can use the existing AWS infrastructure:
┌─────────────────┐ dolt push/pull ┌──────────────────────────┐
│ Developer A │ ◄────────────────────► │ S3 Bucket │
│ .zibby/memory/ │ │ zibby-memory/ │
└─────────────────┘ │ {projectId}/ │
│ .dolt/ │
┌─────────────────┐ dolt push/pull │ (row-level chunks) │
│ Developer B │ ◄────────────────────► │ │
│ .zibby/memory/ │ └──────────────────────────┘
└─────────────────┘ │
IAM per project via
┌─────────────────┐ dolt push/pull ZIBBY_API_KEY → presigned
│ CI/CD │ ◄────────────────────► credentials
│ .zibby/memory/ │
└─────────────────┘

- Storage: S3 bucket with per-project prefixes (aws://bucket/{projectId})
- Auth: Existing ZIBBY_API_KEY maps to IAM credentials via the Zibby API
- Multi-tenant: Each project gets an isolated S3 prefix
- Cost: Just S3 storage — pennies per GB, scales to zero when idle
- No servers: Dolt talks directly to S3, no running service needed
CI/CD
In CI, tests automatically use memory if your workflow includes SKILLS.MEMORY:
# GitHub Actions example
- name: Run tests with memory
run: zibby run test-specs/

To have CI persist back (enriching memory from CI runs), just configure the remote — sync is automatic:
- name: Setup memory remote
run: zibby memory remote add aws://your-bucket/zibby-memory/${{ secrets.ZIBBY_PROJECT_ID }}
- name: Run tests (auto-sync)
run: zibby run test-specs/auth/login.txt

Inspecting the database
Since Dolt is MySQL-compatible, you can query it directly:
cd .zibby/memory
# Interactive SQL shell
dolt sql
# Quick queries
dolt sql -q "SELECT spec_path, passed, run_at FROM test_runs ORDER BY run_at DESC LIMIT 10"
dolt sql -q "SELECT stable_id, success_count, failure_count FROM selector_history ORDER BY failure_count DESC"
dolt sql -q "SELECT url_pattern, visit_count FROM page_model ORDER BY visit_count DESC"
dolt sql -q "SELECT from_url, to_url, frequency FROM page_transitions ORDER BY frequency DESC"
# Version control
dolt log # commit history
dolt diff HEAD~1 # what changed in last commit
dolt branch # list branches (failed runs stay as branches)
dolt diff main run/17734000...   # compare failed run to main

How data flows through the system
zibby run login.txt --mem
│
├─ preflight node
│ └─ (no memory data persisted — planning only)
│
├─ execute_live node ← AI queries memory via MCP tools
│ │ AI performs test actions in browser
│ │ MCP recorder writes events.json
│ │
│ └─ middleware persists after node completes:
│ ├─ test_runs (pass/fail, duration, counts)
│ ├─ selector_history (every selector used + success/failure)
│ ├─ page_model (pages visited + key elements)
│ └─ page_transitions (navigation paths)
│
├─ generate_script node
│ └─ (no memory data persisted — code generation only)
│
└─ onComplete
├─ memoryEndRun: merge run branch → main (if passed)
├─ auto-compact (every 25 runs, configurable)
└─ memorySyncPush (if remote configured)

Optionality
Memory is fully optional:
- No --mem flag during init? SKILLS.MEMORY is not included in the workflow graph.
- @zibby/memory not installed? Dynamic import() catches the error.
- Database doesn't exist? memorySkill.resolve() throws a clear error with setup instructions.
- Dolt not installed? resolve() throws an error when trying to query the database.
The feature is controlled by:
- Init flag: zibby init --mem includes SKILLS.MEMORY in the generated workflow
- Code: Edit .zibby/nodes/execute-live.mjs to add/remove SKILLS.MEMORY
- Middleware: memoryMiddleware() only activates when included in the graph config
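A minimal sketch of what that code-level toggle might look like. The exact shape of .zibby/nodes/execute-live.mjs is assumed here, based only on the skills array shown earlier under "Run tests with memory":

```javascript
// Hypothetical excerpt in the spirit of .zibby/nodes/execute-live.mjs
// (the real file's shape may differ): removing SKILLS.MEMORY from the
// skills array is the code-level off switch.
const SKILLS = { BROWSER: 'browser', MEMORY: 'memory' }; // stand-in constants

const memoryEnabled = true; // set to false to run without memory

const executeLive = {
  name: 'execute_live',
  skills: memoryEnabled
    ? [SKILLS.BROWSER, SKILLS.MEMORY]
    : [SKILLS.BROWSER],
};
```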
API reference
CLI
| Command | Description |
|---------|-------------|
| zibby init --mem | Initialize memory database + include SKILLS.MEMORY in workflow |
| zibby run <spec> | Run test (uses memory if SKILLS.MEMORY in workflow) |
| zibby memory stats | Show database statistics |
| zibby memory compact | Manual compaction (--max-runs N, --max-age N) |
| zibby memory remote add <url> | Configure Dolt remote for team sync |
| zibby memory reset | Delete the memory database |
Environment variables
| Variable | Default | Description |
|----------|---------|-------------|
| ZIBBY_MEMORY_MAX_RUNS | 3000 | Max test runs to keep per spec |
| ZIBBY_MEMORY_MAX_AGE | 1095 | Max age in days for stale data (~3 years) |
| ZIBBY_MEMORY_COMPACT_EVERY | 1500 | Auto-compact every N runs (0 to disable) |
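A sketch of how these knobs could be read with their documented defaults. The parsing details are assumptions for illustration, not the package's actual code:

```javascript
// Illustrative sketch: read the tuning knobs above with their
// documented defaults. Parsing details are assumed, not the package's code.
function memoryConfig(env = process.env) {
  const num = (name, fallback) => {
    const v = Number.parseInt(env[name] ?? '', 10);
    return Number.isNaN(v) ? fallback : v;
  };
  return {
    maxRuns: num('ZIBBY_MEMORY_MAX_RUNS', 3000),
    maxAgeDays: num('ZIBBY_MEMORY_MAX_AGE', 1095),
    compactEvery: num('ZIBBY_MEMORY_COMPACT_EVERY', 1500), // 0 disables auto-compact
  };
}
```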
Programmatic (from @zibby/memory)
import {
initMemory, // (projectDir) → { created, available }
memoryStartRun, // (projectDir, { sessionId }) → boolean
memoryPersistNode, // (projectDir, { nodeName, sessionPath, specPath, nodeOutput, sessionId }) → boolean
memoryEndRun, // (projectDir, { sessionId, passed }) → boolean (also triggers auto-compact)
memoryPersistRun, // (projectDir, { sessionPath, specPath, result }) → boolean (legacy, full-run)
getStats, // (projectDir) → { available, initialized, counts, ... }
compactMemory, // (projectDir, { maxRuns?, maxAgeDays? }) → { pruned }
resetMemory, // (projectDir) → boolean
// Sync
memoryRemoteAdd, // (projectDir, url) → boolean
memoryRemoteRemove, // (projectDir) → boolean
memoryRemoteInfo, // (projectDir) → { name, url } | null
memorySyncPull, // (projectDir) → { pulled, error? }
memorySyncPush, // (projectDir) → { pushed, error? }
memorySyncInit, // (projectDir, url) → { ok, action?, error? }
// Middleware
memoryMiddleware, // (options?) → function | null (env-aware, recommended)
createMemoryMiddleware, // (options?) → function (always returns middleware)
DoltDB, // Low-level Dolt wrapper (escape hatch)
} from '@zibby/memory';

File structure
packages/memory/
├── src/
│ ├── index.js Public API (init, persist, lifecycle, sync, stats, reset)
│ ├── middleware.js WorkflowGraph middleware (env-aware + low-level)
│ ├── dolt.js DoltDB class (SQL, commits, branching, remotes)
│ ├── schema.js CREATE TABLE statements
│ ├── persister.js Writes session data into SQL rows
│ ├── context-builder.js Builds markdown context from DB for AI prompts
│ └── utils.js Hashing, SQL escaping, URL normalization
├── test/
│ ├── memory.test.js Unit + integration + E2E tests (61 tests)
│ └── middleware.test.js Middleware unit + integration tests (14 tests)
└── package.json
packages/mcps/memory/
├── index.js MCP server (4 read + 1 write tool for AI agents)
└── package.json
packages/skills/src/
└── memory.js Skill definition (tells framework how to launch MCP server)