@monoes/memory

v1.0.0

Published

20 days ago

Memory module - AgentDB unification, HNSW indexing, vector search, hybrid SQLite+AgentDB backend (ADR-009)


@monomind/memory

High-performance memory module for Monomind V1 - AgentDB unification, HNSW indexing, vector search, self-learning knowledge graph, and hybrid SQLite+AgentDB backend (ADR-009).

Features

150x-12,500x Faster Search - HNSW (Hierarchical Navigable Small World) vector index for ultra-fast similarity search
Hybrid Backend - SQLite for structured data + AgentDB for vectors (ADR-009)
Auto Memory Bridge - Bidirectional sync between Claude Code auto memory and AgentDB (ADR-048)
Self-Learning - LearningBridge connects insights to SONA/ReasoningBank neural pipeline (ADR-049)
Knowledge Graph - PageRank + label propagation community detection + HippoRAG PPR re-ranking (ADR-049)
Agent-Scoped Memory - 3-scope agent memory (project/local/user) with cross-agent knowledge transfer (ADR-049)
Vector Quantization - Binary, scalar, and product quantization for 4-32x memory reduction
Multiple Distance Metrics - Cosine, Euclidean, dot product, and Manhattan distance
Query Builder - Fluent API for building complex memory queries
Cache Manager - LRU caching with configurable size and TTL
Migration Tools - Seamless migration from V2 memory systems
DiskANN Backend - SSD-resident Vamana ANN graph for million-scale entry search (arXiv:2305.04359)
A-MEM Auto-Linking - Bidirectional reference edges auto-created on store (arXiv:2409.11987)
GraphRAG Community Retrieval - Community-level summaries annotate semantic search results (arXiv:2404.16130)
HippoRAG PPR Re-ranking - Personalised PageRank re-ranks semantic results via knowledge graph (arXiv:2405.14831)
Collaborative Memory Promotion - Entries auto-promoted to team scope after 3+ agent reads/24 h (arXiv:2505.18279)
Temporal Knowledge Graph - Causal/temporal edge typing inspired by Zep/Graphiti (arXiv:2501.13956)
Injection Filter - Structural prompt-injection detection on semantic search results (arXiv:2302.12173, arXiv:2310.12815)

Installation

npm install @monomind/memory

Quick Start

import { HNSWIndex, AgentDBAdapter, CacheManager } from '@monomind/memory';

// Create HNSW index for vector search
const index = new HNSWIndex({
  dimensions: 1536,  // OpenAI embedding size
  M: 16,             // Max connections per node
  efConstruction: 200,
  metric: 'cosine'
});

// Add vectors
await index.addPoint('memory-1', new Float32Array(embedding));
await index.addPoint('memory-2', new Float32Array(embedding2));

// Search for similar vectors
const results = await index.search(queryVector, 10);
// [{ id: 'memory-1', distance: 0.05 }, { id: 'memory-2', distance: 0.12 }]

API Reference

HNSW Index

import { HNSWIndex } from '@monomind/memory';

const index = new HNSWIndex({
  dimensions: 1536,
  M: 16,                    // Max connections per layer
  efConstruction: 200,      // Construction-time search depth
  maxElements: 1000000,     // Max vectors
  metric: 'cosine',         // 'cosine' | 'euclidean' | 'dot' | 'manhattan'
  quantization: {           // Optional quantization
    type: 'scalar',         // 'binary' | 'scalar' | 'product'
    bits: 8
  }
});

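// Add vectors: addPoint(id: string, vector: Float32Array)
// `vector` and `query` below are Float32Array values with `dimensions` elements
await index.addPoint('memory-1', vector);

// Search: search(query: Float32Array, k: number, ef?: number)
const k = 10;
const results = await index.search(
  query,
  k,
  100  // ef: search-time depth (higher = more accurate)
);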

// Search with filters
const filtered = await index.searchWithFilters(
  query,
  k,
  (id) => id.startsWith('session-')
);

// Remove vectors
await index.removePoint(id);

// Get statistics
const stats = index.getStats();
// { vectorCount, memoryUsage, avgSearchTime, compressionRatio }

AgentDB Adapter

import { AgentDBAdapter } from '@monomind/memory';

const adapter = new AgentDBAdapter({
  dimensions: 1536,           // vector dimensions
  hnswM: 16,                  // max connections per HNSW layer
  hnswEfConstruction: 200,    // construction-time search depth
  cacheEnabled: true,         // LRU result cache
  cacheSize: 10000,           // max cached entries
  cacheTtl: 300000,           // 5 minutes
  defaultNamespace: 'default',
  embeddingGenerator: async (text) => myEmbedder.embed(text),
});

await adapter.initialize();

// Store memory — namespace is normalised to defaultNamespace when empty
await adapter.store({
  id: 'mem-123',
  key: 'user-preference',
  content: 'User prefers dark mode',
  type: 'semantic',
  namespace: 'preferences',
  tags: ['ui'],
  metadata: {},
  accessLevel: 'private',
  createdAt: Date.now(),
  updatedAt: Date.now(),
  version: 1,
  references: [],
  accessCount: 0,
  lastAccessedAt: Date.now(),
});

// Semantic search
const results = await adapter.semanticSearch('dark mode preference', 10, 0.7);
// [{ entry: MemoryEntry, score: number, distance: number }, ...]

// Convenience: store from plain input (auto-generates id, timestamps)
const entry = await adapter.storeEntry({
  key: 'my-fact',
  content: 'TypeScript 5+ required',
  namespace: 'learnings',
  type: 'semantic',
  tags: ['setup'],
  metadata: {},
});

Cache Manager

import { CacheManager } from '@monomind/memory';

const cache = new CacheManager<MemoryEntry>({
  maxSize: 1000,       // max entries before LRU eviction
  ttl: 3_600_000,     // entry TTL in ms (1 hour)
  maxMemory: 50 * 1024 * 1024, // optional memory cap (50 MB)
  lruEnabled: true,   // enable LRU ordering (default: true)
});

// Cache operations
cache.set('key', entry);
const cached = cache.get('key');    // null if missing or expired
const exists = cache.has('key');
cache.delete('key');
cache.clear();                      // emits 'cache:cleared' with correct previousSize

// Batch prefetch
await cache.prefetch(['k1', 'k2'], async (keys) => loadFromDB(keys));

// Statistics
const stats = cache.getStats();
// { size, hits, misses, hitRate, evictions, memoryUsage }

// Cleanup — call on process exit to stop the internal cleanup timer
cache.shutdown();
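
To make the LRU and TTL options above concrete, here is a minimal sketch of how the two policies can interact. It is not the CacheManager implementation: the class name LruTtlSketch and the injected clock are hypothetical, and it assumes that a read refreshes recency without extending the TTL.

```typescript
// Hypothetical sketch: LRU eviction plus a per-entry TTL on top of a Map.
class LruTtlSketch<V> {
  // A Map iterates in insertion order, so the first key is the least recently used.
  private readonly items = new Map<string, { value: V; expiresAt: number }>();
  private readonly maxSize: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(maxSize: number, ttlMs: number, now: () => number = () => Date.now()) {
    this.maxSize = maxSize;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.items.delete(key); // re-inserting moves the key to the most-recent end
    this.items.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.items.size > this.maxSize) {
      // Evict the least recently used entry.
      const oldest = this.items.keys().next().value;
      if (oldest !== undefined) this.items.delete(oldest);
    }
  }

  get(key: string): V | null {
    const hit = this.items.get(key);
    if (!hit) return null;
    if (hit.expiresAt <= this.now()) {
      this.items.delete(key); // expired entries behave like misses
      return null;
    }
    // Refresh recency only; the expiry time is left unchanged.
    this.items.delete(key);
    this.items.set(key, hit);
    return hit.value;
  }

  get size(): number {
    return this.items.size;
  }
}
```

Injecting the clock keeps the sketch testable without timers; a real cache would also need the periodic cleanup that cache.shutdown() stops.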

Query Builder

import { query, QueryTemplates } from '@monomind/memory';

// Fluent query construction — produces a MemoryQuery object
const q = query()
  .semantic('authentication patterns')   // type = semantic, embeds content
  .inNamespace('security')
  .withTags(['auth', 'patterns'])
  .ofType('semantic')
  .threshold(0.7)
  .limit(20)
  .sortByNewest()                         // sortField: createdAt, sortDirection: desc
  .build();

// Exact-key lookup
const exact = query().exact('my-key', 'my-namespace').build();

// Prefix search
const prefix = query().prefix('session-').inNamespace('sessions').build();

// Sorting options
query().semantic('...').sortBy('accessCount', 'desc').build();  // most-accessed first
query().semantic('...').oldestFirst().build();                   // createdAt asc
query().semantic('...').recentlyAccessed().build();              // lastAccessedAt desc

// sortField and sortDirection are passed through to all backends that support them
// (agentdb-adapter, rvf-backend, sqlite-backend, sqljs-backend, JsonBackend)

// Predefined templates
const recent = QueryTemplates.recentInNamespace('learnings', 10);
const stale = QueryTemplates.staleEntries('session-cache', 10);

Migration

import { MemoryMigrator, createMigrator } from '@monomind/memory';

// Migrate from a legacy SQLite, JSON, or Markdown source
const migrator = createMigrator(targetAdapter, {
  source: 'sqlite',          // 'sqlite' | 'json' | 'markdown' | 'memory-manager' | 'swarm-memory'
  sourcePath: './data/v2-memory.db',
  batchSize: 100,
  generateEmbeddings: true,  // generate vector embeddings for each entry
  continueOnError: true,     // skip bad entries instead of aborting
  validateData: true,
});

const result = await migrator.migrate();
console.log(`Migrated ${result.progress.migrated} entries`);
console.log(`Failed: ${result.progress.failed}`);

// RVF ↔ JSON bidirectional migration
import { RvfMigrator } from '@monomind/memory';

await RvfMigrator.fromSqlite('./old.db', './new.rvf');
await RvfMigrator.fromJsonFile('./export.json', './new.rvf');
await RvfMigrator.toJsonFile('./store.rvf', './export.json');

Quantization Options

// Binary quantization (32x compression)
const binaryIndex = new HNSWIndex({
  dimensions: 1536,
  quantization: { type: 'binary' }
});

// Scalar quantization (4x compression)
const scalarIndex = new HNSWIndex({
  dimensions: 1536,
  quantization: { type: 'scalar', bits: 8 }
});

// Product quantization (8x compression)
const productIndex = new HNSWIndex({
  dimensions: 1536,
  quantization: { type: 'product', subquantizers: 8 }
});
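
The compression ratios in the comments above follow from bits per dimension: a Float32Array spends 32 bits per dimension, 8-bit scalar codes spend 8 (4x), and binary codes spend 1 (32x). The sketch below shows one simple way to produce such codes. The function names are hypothetical and the min/max scaling and sign-bit rules are assumptions, not the package's quantizer.

```typescript
// Illustrative quantizers only; not the @monomind/memory implementation.
interface ScalarQuantized {
  codes: Uint8Array; // one byte per dimension
  min: number;       // value mapped to code 0
  scale: number;     // width of one code step
}

function scalarQuantize(vector: Float32Array): ScalarQuantized {
  let min = Infinity;
  let max = -Infinity;
  for (let i = 0; i < vector.length; i++) {
    if (vector[i] < min) min = vector[i];
    if (vector[i] > max) max = vector[i];
  }
  // Guard against constant (or empty) vectors, where max - min is not positive.
  const scale = max > min ? (max - min) / 255 : 1;
  const codes = new Uint8Array(vector.length);
  for (let i = 0; i < vector.length; i++) {
    codes[i] = Math.round((vector[i] - min) / scale);
  }
  return { codes, min, scale };
}

function scalarDequantize(q: ScalarQuantized): Float32Array {
  const out = new Float32Array(q.codes.length);
  for (let i = 0; i < out.length; i++) out[i] = q.min + q.codes[i] * q.scale;
  return out;
}

// One bit per dimension: set when the component is positive.
function binaryQuantize(vector: Float32Array): Uint8Array {
  const bits = new Uint8Array(Math.ceil(vector.length / 8));
  for (let i = 0; i < vector.length; i++) {
    if (vector[i] > 0) bits[i >> 3] |= 1 << (i & 7);
  }
  return bits;
}
```

For a 1536-dimension vector this gives 6144 bytes at full precision, 1536 bytes of scalar codes, and 192 bytes of binary codes. The achieved ratio for product quantization depends on the codebook layout, which this sketch does not cover.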

Auto Memory Bridge (ADR-048)

Bidirectional sync between Claude Code's auto memory files and AgentDB. Auto memory is a persistent directory (~/.claude/projects/<project>/memory/) where Claude writes learnings as markdown. MEMORY.md (first 200 lines) is loaded into the system prompt; topic files are read on demand.

Quick Start

import { AutoMemoryBridge } from '@monomind/memory';

const bridge = new AutoMemoryBridge(memoryBackend, {
  workingDir: '/workspaces/my-project',
  syncMode: 'on-session-end', // 'on-write' | 'on-session-end' | 'periodic'
  pruneStrategy: 'confidence-weighted', // 'confidence-weighted' | 'fifo' | 'lru'
});

// Record an insight (stores in AgentDB + optionally writes to files)
await bridge.recordInsight({
  category: 'debugging',
  summary: 'HNSW index requires initialization before search',
  source: 'agent:tester',
  confidence: 0.95,
});

// Sync buffered insights to auto memory files
const syncResult = await bridge.syncToAutoMemory();

// Import existing auto memory files into AgentDB (on session start)
const importResult = await bridge.importFromAutoMemory();

// Curate MEMORY.md index (stays under 200-line limit)
await bridge.curateIndex();

// Check status
const status = bridge.getStatus();

Sync Modes

| Mode | Behavior |
|------|----------|
| on-write | Writes to files immediately on recordInsight() |
| on-session-end | Buffers insights, flushes on syncToAutoMemory() |
| periodic | Auto-syncs on a configurable interval |

Insight Categories

| Category | Topic File | Description |
|----------|-----------|-------------|
| project-patterns | patterns.md | Code patterns and conventions |
| debugging | debugging.md | Bug fixes and debugging insights |
| architecture | architecture.md | Design decisions and module relationships |
| performance | performance.md | Benchmarks and optimization results |
| security | security.md | Security findings and CVE notes |
| preferences | preferences.md | User and project preferences |
| swarm-results | swarm-results.md | Multi-agent swarm outcomes |

Key Optimizations

Batch import - bulkInsert() instead of individual store() calls
Pre-fetched hashes - Single query for content-hash dedup during import
Async I/O - node:fs/promises for non-blocking writes
Exact dedup - hasSummaryLine() uses bullet-prefix matching, not substring
O(1) sync tracking - syncedInsightKeys Set prevents double-write race
Prune-before-build - Avoids O(n^2) index rebuild loop

Utility Functions

import {
  resolveAutoMemoryDir,  // Derive auto memory path from working dir
  findGitRoot,           // Walk up to find .git root
  parseMarkdownEntries,  // Parse ## headings into structured entries
  extractSummaries,      // Extract bullet summaries, strip metadata
  formatInsightLine,     // Format insight as markdown bullet
  hashContent,           // SHA-256 truncated to 16 hex chars
  pruneTopicFile,        // Keep topic files under line limit
  hasSummaryLine,        // Exact bullet-prefix dedup check
} from '@monomind/memory';
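
Two of the helpers listed above are described precisely enough to sketch: hashContent (SHA-256 truncated to 16 hex characters) and hasSummaryLine (bullet-prefix matching rather than substring matching). The versions below are illustrative reconstructions from those one-line descriptions, with hypothetical names. The real signatures may differ, and the assumption that bullets start with "- " is mine.

```typescript
import { createHash } from 'node:crypto';

// SHA-256 of the content, keeping only the first 16 hex characters.
function hashContentSketch(content: string): string {
  return createHash('sha256').update(content, 'utf8').digest('hex').slice(0, 16);
}

// True when some line, after trimming, starts with "- " followed by the summary.
// A plain substring test would also match the summary in the middle of another bullet.
function hasSummaryLineSketch(markdown: string, summary: string): boolean {
  const bullet = `- ${summary}`;
  return markdown.split('\n').some((line) => line.trim().startsWith(bullet));
}
```

Truncating to 16 hex characters keeps 64 bits of the digest, which is compact for dedup keys while leaving accidental collisions unlikely at the scale of a memory directory.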

Types

import type {
  AutoMemoryBridgeConfig,
  MemoryInsight,
  InsightCategory,
  SyncDirection,
  SyncMode,
  PruneStrategy,
  SyncResult,
  ImportResult,
} from '@monomind/memory';

Self-Learning Bridge (ADR-049)

Connects insights to the @monomind/neural learning pipeline. When @monomind/neural is unavailable, all operations degrade to no-ops.

Quick Start

import { AutoMemoryBridge, LearningBridge } from '@monomind/memory';

const bridge = new AutoMemoryBridge(backend, {
  workingDir: '/workspaces/my-project',
  learning: {
    sonaMode: 'balanced',
    confidenceDecayRate: 0.005,   // Per-hour decay
    accessBoostAmount: 0.03,      // Boost per access
    consolidationThreshold: 10,   // Min insights before consolidation
  },
});

// Insights now trigger learning trajectories automatically
await bridge.recordInsight({
  category: 'debugging',
  summary: 'Connection pool exhaustion on high load',
  source: 'agent:tester',
  confidence: 0.9,
});

// Consolidation runs JUDGE/DISTILL/CONSOLIDATE pipeline
await bridge.syncToAutoMemory(); // Calls consolidate() first

Standalone Usage

import { LearningBridge } from '@monomind/memory';

const lb = new LearningBridge(backend, {
  // Optional: inject neural loader for custom setups
  neuralLoader: async () => {
    const { NeuralLearningSystem } = await import('@monomind/neural');
    return new NeuralLearningSystem();
  },
});

// Boost confidence when insight is accessed
await lb.onInsightAccessed('entry-123'); // +0.03 confidence

// Apply time-based decay
const decayed = await lb.decayConfidences('default'); // -0.005/hour

// Find similar patterns via ReasoningBank
const patterns = await lb.findSimilarPatterns('connection pooling');

// Get learning statistics
const stats = lb.getStats();
// { totalTrajectories, activeTrajectories, completedTrajectories,
//   totalConsolidations, accessBoosts, ... }

Confidence Lifecycle

| Event | Effect | Range |
|-------|--------|-------|
| Insight recorded | Initial confidence from source | 0.1 - 1.0 |
| Insight accessed | +0.03 per access | Capped at 1.0 |
| Time decay | -0.005 per hour since last access | Floored at 0.1 |
| Consolidation | Neural pipeline may adjust | 0.1 - 1.0 |
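
The boost and decay rules in the table reduce to clamped arithmetic. The sketch below works through them under the assumption that decay is linear in hours since the last access. The function names are hypothetical, and the neural consolidation step is not modelled.

```typescript
// Illustrative confidence arithmetic; defaults mirror the table above.
const MIN_CONFIDENCE = 0.1;
const MAX_CONFIDENCE = 1.0;
const MS_PER_HOUR = 3_600_000;

function clampConfidence(value: number): number {
  return Math.min(MAX_CONFIDENCE, Math.max(MIN_CONFIDENCE, value));
}

// Each access adds a fixed boost, capped at 1.0.
function boostOnAccess(confidence: number, boost = 0.03): number {
  return clampConfidence(confidence + boost);
}

// Linear decay per hour since the last access, floored at 0.1.
function decayOverTime(
  confidence: number,
  lastAccessedAt: number,
  now: number,
  ratePerHour = 0.005,
): number {
  const hours = Math.max(0, now - lastAccessedAt) / MS_PER_HOUR;
  return clampConfidence(confidence - ratePerHour * hours);
}
```

With these defaults, an insight at 0.9 that goes unread for 10 hours drops to about 0.85, and six accesses would be enough to offset roughly 36 hours of decay.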

Knowledge Graph (ADR-049)

Pure TypeScript knowledge graph with PageRank and community detection. No external graph libraries required.

Quick Start

import { AutoMemoryBridge, MemoryGraph } from '@monomind/memory';

const bridge = new AutoMemoryBridge(backend, {
  workingDir: '/workspaces/my-project',
  graph: {
    similarityThreshold: 0.8,
    pageRankDamping: 0.85,
    maxNodes: 5000,
  },
});

// Graph builds automatically on import
await bridge.importFromAutoMemory();

// Curation uses PageRank to prioritize influential insights
await bridge.curateIndex();

Standalone Usage

import { MemoryGraph } from '@monomind/memory';

const graph = new MemoryGraph({
  pageRankDamping: 0.85,
  pageRankIterations: 50,
  pageRankConvergence: 1e-6,
  maxNodes: 5000,
});

// Build from backend entries
await graph.buildFromBackend(backend, 'my-namespace');

// Or build manually
graph.addNode(entry);
graph.addEdge('entry-1', 'entry-2', 'reference', 1.0);
graph.addEdge('entry-1', 'entry-3', 'similar', 0.9);

// Compute PageRank (power iteration)
const ranks = graph.computePageRank();

// Detect communities (label propagation)
const communities = graph.detectCommunities();

// Graph-aware ranking: blend vector score + PageRank
const ranked = graph.rankWithGraph(searchResults, 0.7);
// alpha=0.7 means 70% vector score + 30% PageRank

// Get most influential insights for MEMORY.md
const topNodes = graph.getTopNodes(20);

// BFS traversal for related insights
const neighbors = graph.getNeighbors('entry-1', 2); // depth=2
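
For readers unfamiliar with the power-iteration method named in the comments above, here is a self-contained sketch of weighted PageRank plus the alpha blend that rankWithGraph is described as performing. It is not MemoryGraph's code: the function names and the edge-tuple shape are hypothetical, edges are assumed to reference listed nodes with positive weights, and rank held by nodes without outgoing edges is assumed to be redistributed uniformly.

```typescript
// Illustrative PageRank by power iteration over [from, to, weight] edges.
type WeightedEdge = [string, string, number];

function pageRankSketch(
  nodes: string[],
  edges: WeightedEdge[],
  damping = 0.85,
  iterations = 50,
  convergence = 1e-6,
): Map<string, number> {
  const ranks = new Map<string, number>();
  const n = nodes.length;
  if (n === 0) return ranks;

  const outWeight = new Map<string, number>();
  for (const id of nodes) ranks.set(id, 1 / n);
  for (const [from, , w] of edges) outWeight.set(from, (outWeight.get(from) ?? 0) + w);

  for (let it = 0; it < iterations; it++) {
    // Rank sitting on nodes with no outgoing edges is spread over all nodes.
    let dangling = 0;
    for (const id of nodes) {
      if (!outWeight.has(id)) dangling += ranks.get(id) ?? 0;
    }
    const base = (1 - damping) / n + (damping * dangling) / n;
    const next = new Map<string, number>();
    for (const id of nodes) next.set(id, base);
    for (const [from, to, w] of edges) {
      const share = ((ranks.get(from) ?? 0) * w) / (outWeight.get(from) ?? 1);
      next.set(to, (next.get(to) ?? 0) + damping * share);
    }
    // Stop early once the total change falls below the convergence threshold.
    let delta = 0;
    for (const id of nodes) {
      delta += Math.abs((next.get(id) ?? 0) - (ranks.get(id) ?? 0));
      ranks.set(id, next.get(id) ?? 0);
    }
    if (delta < convergence) break;
  }
  return ranks;
}

// alpha = 0.7 weights the vector score at 70% and PageRank at 30%.
function blendScores(vectorScore: number, pageRank: number, alpha = 0.7): number {
  return alpha * vectorScore + (1 - alpha) * pageRank;
}
```

Raw PageRank values sum to 1 across the graph, so they are small compared with cosine scores; whether the library normalises them before blending is not stated here.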

Edge Types

| Type | Source | Description |
|------|--------|-------------|
| reference | MemoryEntry.references | Explicit cross-references between entries |
| similar | HNSW search | Auto-created when similarity > threshold |
| temporal | Timestamps | Entries created in same time window |
| co-accessed | Access patterns | Entries frequently accessed together |
| causal | Learning pipeline | Cause-effect relationships |

Performance

| Operation | Result | Target |
|-----------|--------|--------|
| Graph build (1k nodes) | 2.78 ms | <200 ms |
| PageRank (1k nodes) | 12.21 ms | <100 ms |
| Community detection (1k) | 19.62 ms | n/a |
| rankWithGraph(10) | 0.006 ms | n/a |
| getTopNodes(20) | 0.308 ms | n/a |
| getNeighbors(d=2) | 0.005 ms | n/a |

Agent-Scoped Memory (ADR-049)

Maps Claude Code's 3-scope agent memory directories for per-agent knowledge isolation and cross-agent transfer.

Quick Start

import { createAgentBridge, transferKnowledge } from '@monomind/memory';

// Create a bridge for a specific agent scope
const agentBridge = createAgentBridge(backend, {
  agentName: 'my-coder',
  scope: 'project', // 'project' | 'local' | 'user'
  workingDir: '/workspaces/my-project',
});

// Record insights scoped to this agent
await agentBridge.recordInsight({
  category: 'debugging',
  summary: 'Use connection pooling for DB calls',
  source: 'agent:my-coder',
  confidence: 0.95,
});

// Transfer high-confidence insights between agents
const result = await transferKnowledge(sourceBackend, targetBridge, {
  sourceNamespace: 'learnings',
  minConfidence: 0.8,   // Only transfer confident insights
  maxEntries: 20,
  categories: ['debugging', 'architecture'],
});
// { transferred: 15, skipped: 5 }

Scope Paths

| Scope | Directory | Use Case |
|-------|-----------|----------|
| project | <gitRoot>/.claude/agent-memory/<agent>/ | Project-specific learnings |
| local | <gitRoot>/.claude/agent-memory-local/<agent>/ | Machine-local data |
| user | ~/.claude/agent-memory/<agent>/ | Cross-project user knowledge |

Utilities

import {
  resolveAgentMemoryDir,  // Get scope directory path
  createAgentBridge,       // Create scoped AutoMemoryBridge
  transferKnowledge,       // Cross-agent knowledge sharing
  listAgentScopes,         // Discover existing agent scopes
} from '@monomind/memory';

// Resolve path for an agent scope
const dir = resolveAgentMemoryDir('my-agent', 'project');
// → /workspaces/my-project/.claude/agent-memory/my-agent/

// List all agent scopes in a directory
const scopes = await listAgentScopes('/workspaces/my-project');
// [{ agentName: 'coder', scope: 'project', path: '...' }, ...]

A-MEM Auto-Linking (arXiv:2409.11987)

When HybridBackend is configured with an embeddingGenerator, every stored entry automatically discovers its top-3 semantic neighbors and creates bidirectional reference edges to them, implementing the Zettelkasten note-linking structure from A-MEM.

const backend = new HybridBackend({
  embeddingGenerator: async (text) => myEmbeddingModel.embed(text),
  // A-MEM auto-linking is automatically active when embeddingGenerator is set
});

// Subscribe before storing: linking is asynchronous and best-effort,
// so a listener attached later could miss the event
backend.on('amem:linked', ({ id, linkedTo }) =>
  console.log(`Linked ${id} to ${linkedTo.join(', ')}`));

// Store any entry; references are linked in the background
await backend.store(entry);

// After linking, querySemantic PPR re-ranking propagates scores through the
// newly created reference graph, improving recall for connected knowledge.
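
The linking step itself can be pictured as "find the three nearest stored entries, then add a reference in both directions". The sketch below does that with a brute-force cosine scan instead of HNSW, purely for illustration. The LinkableNote shape and function names are hypothetical and are not HybridBackend internals.

```typescript
// Illustrative top-k bidirectional linking with a brute-force neighbour scan.
interface LinkableNote {
  id: string;
  embedding: number[];
  references: string[];
}

function cosineSim(a: number[], b: number[]): number {
  const n = Math.min(a.length, b.length);
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < n; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Stores `incoming` and links it to its topK most similar existing notes.
function linkTopNeighbors(
  store: Map<string, LinkableNote>,
  incoming: LinkableNote,
  topK = 3,
): string[] {
  const neighbors = [...store.values()]
    .filter((note) => note.id !== incoming.id)
    .map((note) => ({ id: note.id, score: cosineSim(incoming.embedding, note.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);

  store.set(incoming.id, incoming);
  for (const { id } of neighbors) {
    const other = store.get(id);
    if (!other) continue;
    // Add the edge in both directions, skipping duplicates.
    if (!incoming.references.includes(id)) incoming.references.push(id);
    if (!other.references.includes(incoming.id)) other.references.push(incoming.id);
  }
  return neighbors.map((nb) => nb.id);
}
```

Because both endpoints carry the reference, a later graph traversal can reach the new entry from any of its neighbours as well as the other way round.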

Injection-Safe Semantic Search

Set filterInjection: true to remove entries containing prompt-injection patterns from semantic search results before they reach the agent context:

const backend = new HybridBackend({
  embeddingGenerator: myEmbedder,
  filterInjection: true,    // Screen RAG results for indirect injection
});

// Filtered result — entries matching injection patterns are silently dropped
const entries = await backend.querySemantic({ content: 'OAuth patterns', k: 10 });

// Blocked entries are observable via event
backend.on('injection:blocked', ({ id, namespace }) =>
  securityLogger.warn(`Injection blocked from entry ${id}`));

Source: arXiv:2302.12173, arXiv:2310.12815 — indirect prompt injection in RAG pipelines.

GraphRAG Community Retrieval (arXiv:2404.16130)

querySemantic() now captures community summaries from MemoryGraph.getCommunitySummaries() and annotates each returned entry with its GraphRAG community metadata. This implements the community-level summarisation strategy from Microsoft GraphRAG.

const entries = await backend.querySemantic({ content: 'authentication patterns', k: 10 });
// Each entry now carries:
//   entry.community             — community ID string
//   entry.communityNodeCount    — number of nodes in that community
//   entry.communityAvgPageRank  — mean PageRank of community members

PPR re-ranking is handled by HippoRAG-style personalised PageRank (arXiv:2405.14831), which propagates query-node scores through the knowledge graph before returning results.

Collaborative Memory Promotion (arXiv:2505.18279)

HybridBackend.get(id, agentId?) accepts an optional agentId parameter. When provided, it fires a read-tracking call to the SQLite backend, which promotes the entry's accessLevel from 'private' to 'team' once 3 or more distinct agents have read it within a 24-hour window.

// Each agent reads with its own ID — no other change required
const entry = await backend.get('entry-id-123', 'coder-agent');

// After the third distinct agent reads it within 24 h:
// entry.accessLevel === 'team'
// (auto-promoted, visible to peer agents in the same namespace)

Collaborative promotion is transparent to callers that don't pass agentId — get(id) continues to work exactly as before (backwards-compatible).
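
The promotion rule (3 or more distinct agents within 24 hours) can be expressed as a small pure function over a read log. This is a sketch of the rule as described, not the SQLite backend's implementation. The ReadEvent shape and function name are hypothetical, and it assumes the window is measured back from the current time.

```typescript
// Illustrative promotion check: distinct readers inside a sliding 24-hour window.
type AccessLevelSketch = 'private' | 'team';

interface ReadEvent {
  agentId: string;
  at: number; // epoch milliseconds
}

const PROMOTION_WINDOW_MS = 24 * 60 * 60 * 1000;

function levelAfterReads(
  current: AccessLevelSketch,
  reads: ReadEvent[],
  now: number,
  minAgents = 3,
): AccessLevelSketch {
  if (current !== 'private') return current; // only private entries are promoted
  const recentAgents = new Set<string>();
  for (const read of reads) {
    const age = now - read.at;
    if (age >= 0 && age <= PROMOTION_WINDOW_MS) recentAgents.add(read.agentId);
  }
  return recentAgents.size >= minAgents ? 'team' : 'private';
}
```

Counting distinct agent IDs rather than raw reads means one agent polling the same entry repeatedly never triggers promotion on its own.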

Knowledge Graph & Temporal Edges (arXiv:2501.13956)

MemoryGraph models causal and temporal relationships between entries as typed edges, inspired by the Zep/Graphiti episodic knowledge graph (arXiv:2501.13956). Edge types include REFERENCES, CAUSES, PRECEDED_BY, RELATED_TO, and CONTRADICTS, enabling episodic reasoning over the agent's memory history.

// EdgeType is used as a value below, so it must not be a type-only import
import { MemoryGraph, EdgeType } from '@monomind/memory';

const graph = new MemoryGraph();
graph.addEdge('plan-123', 'code-456', EdgeType.CAUSES, 0.9);
graph.addEdge('code-456', 'test-789', EdgeType.PRECEDED_BY, 1.0);

const ranked = graph.pprRerank(['plan-123'], candidates, 0.85);
// Entries causally downstream of 'plan-123' score higher in PPR
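
Personalised PageRank differs from global PageRank only in where the restart mass goes: back to the seed nodes instead of uniformly to every node. The sketch below illustrates that idea on a plain edge list. It is not the pprRerank implementation, the names are hypothetical, and it assumes that mass reaching a node with no outgoing edges also returns to the seeds.

```typescript
// Illustrative personalised PageRank over [from, to, weight] edges.
function pprSketch(
  edges: Array<[string, string, number]>,
  seeds: string[],
  damping = 0.85,
  iterations = 30,
): Map<string, number> {
  const nodes = new Set<string>(seeds);
  const outWeight = new Map<string, number>();
  for (const [from, to, w] of edges) {
    nodes.add(from);
    nodes.add(to);
    outWeight.set(from, (outWeight.get(from) ?? 0) + w);
  }

  // Restart distribution: uniform over the seed nodes.
  const restart = new Map<string, number>();
  for (const s of seeds) restart.set(s, (restart.get(s) ?? 0) + 1 / seeds.length);

  let scores = new Map<string, number>(restart);
  for (let it = 0; it < iterations; it++) {
    const next = new Map<string, number>();
    let dangling = 0;
    for (const id of nodes) {
      next.set(id, 0);
      if (!outWeight.has(id)) dangling += scores.get(id) ?? 0;
    }
    // Follow outgoing edges with probability `damping`.
    for (const [from, to, w] of edges) {
      const share = ((scores.get(from) ?? 0) * w) / (outWeight.get(from) ?? 1);
      next.set(to, (next.get(to) ?? 0) + damping * share);
    }
    // Restart mass, plus mass stranded on dangling nodes, returns to the seeds.
    for (const [s, p] of restart) {
      next.set(s, (next.get(s) ?? 0) + (1 - damping + damping * dangling) * p);
    }
    scores = next;
  }
  return scores;
}

// Orders candidates by their personalised score, highest first.
function rerankByPpr(candidates: string[], scores: Map<string, number>): string[] {
  return [...candidates].sort((a, b) => (scores.get(b) ?? 0) - (scores.get(a) ?? 0));
}
```

Nodes that cannot be reached from the seeds keep a score of zero, which is why entries downstream of the seed outrank unrelated ones.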

μACP Learning-Bridge Integration (arXiv:2601.03938)

LearningBridge integrates with the μACP coordination substrate: when consolidation detects a pattern conflict between agents, it initiates a μACP round to resolve which variant to promote. The result is stored as a causal edge in MemoryGraph.

// Conflict resolution is automatic during consolidation:
await learningBridge.consolidate();
// Internally calls MuACP.coordinate() when divergent patterns are detected,
// then records the winning pattern as a CAUSES edge.

Source: arXiv:2601.03938.

Bi-Temporal Query Filtering (arXiv:2501.13956)

MemoryQuery now supports eventAfter and eventBefore filters that operate on the eventAt field — the timestamp of when the event occurred (T), as opposed to createdAt which records when the entry was ingested (T'). This is the bi-temporal model from Zep/Graphiti that prevents retrieval failures when data arrives out-of-order or is backdated.

const entries = await backend.query({
  type: 'hybrid',
  namespace: 'incidents',
  limit: 50,
  // Filter by WHEN THE INCIDENT HAPPENED — not when it was logged
  eventAfter:  new Date('2026-01-01').getTime(),
  eventBefore: new Date('2026-04-01').getTime(),
});

// Store with explicit event time (e.g. a past incident being recorded now)
await backend.store({
  key: 'outage-2026-02-14',
  content: 'DB connection pool exhausted during peak traffic',
  type: 'episodic',
  namespace: 'incidents',
  eventAt: new Date('2026-02-14T03:22:00Z').getTime(),  // event time
  // createdAt is auto-set to Date.now() (ingestion time)
  // ...
});

Source: arXiv:2501.13956 — Zep/Graphiti bi-temporal knowledge graph.
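
The distinction between event time and ingestion time is easiest to see with an entry that is logged long after it happened. The sketch below filters an in-memory list the way eventAfter and eventBefore are described. The TimedEntry shape and function name are hypothetical, and two details are assumptions: bounds are treated as inclusive, and entries without eventAt fall back to createdAt.

```typescript
// Illustrative bi-temporal filter: range bounds apply to event time, not ingestion time.
interface TimedEntry {
  key: string;
  createdAt: number; // when the entry was ingested (T')
  eventAt?: number;  // when the event occurred (T)
}

function filterByEventTime(
  entries: TimedEntry[],
  eventAfter?: number,
  eventBefore?: number,
): TimedEntry[] {
  return entries.filter((entry) => {
    // Assumption: entries without an explicit event time use their ingestion time.
    const t = entry.eventAt ?? entry.createdAt;
    if (eventAfter !== undefined && t < eventAfter) return false;
    if (eventBefore !== undefined && t > eventBefore) return false;
    return true;
  });
}
```

An outage from February 2026 that is only written up in May 2026 still matches a January to April query here, whereas a createdAt filter over the same range would miss it.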

MemoRAG Query Rewriting (arXiv:2409.05591)

HybridBackend supports a memoragRewriter configuration option that adds a "draft clue" query-expansion stage before HNSW search. When configured, querySemantic() calls the rewriter to generate 2-3 reformulated sub-queries, searches HNSW independently for each, then fuses all ranked result lists using Reciprocal Rank Fusion (RRF) before continuing with HippoRAG PPR re-ranking and GraphRAG community annotation.

This addresses the MemoRAG insight that naive RAG fails when the user query does not directly match any retrievable chunk — paraphrased sub-queries dramatically improve recall.

import { HybridBackend } from '@monomind/memory';

const backend = new HybridBackend({
  embeddingGenerator: myEmbedder,
  memoragRewriter: async (query) => {
    // Use a cheap LLM (Haiku) or deterministic rules to produce sub-queries
    const reformulated = await callClaude({
      model: 'claude-haiku-4-5',
      prompt: `Generate 3 alternative search queries for: "${query}"\nRespond with a JSON array of strings.`,
    });
    return JSON.parse(reformulated); // e.g. ["...", "...", "..."]
  },
});

// querySemantic() now automatically expands + fuses results
const results = await backend.querySemantic({ content: 'memory leak in production' });
// RRF-fused results from sub-queries: "heap usage spike", "GC pressure", "OOM error"

Source: arXiv:2409.05591 — MemoRAG (TheWebConf 2025).
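
Reciprocal Rank Fusion, the step that merges the per-sub-query result lists, is short enough to show in full. The sketch below scores each ID by the sum of 1 / (k + rank) over every list it appears in. The function name is hypothetical, and k = 60 is the constant conventionally used for RRF, not a value confirmed for this package.

```typescript
// Illustrative Reciprocal Rank Fusion over ranked lists of entry IDs.
function reciprocalRankFusion(
  rankedLists: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      // Ranks are 1-based, so the top hit of a list contributes 1 / (k + 1).
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

RRF only uses rank positions, so the sub-query result lists can be fused even when their raw similarity scores are not comparable with each other.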

DiskANN Backend — Large-Scale ANN at Disk Scale (arXiv:2305.04359)

DiskAnnBackend is an IMemoryBackend decorator that activates SSD-resident Vamana ANN search once the entry count exceeds a threshold. It wraps any existing backend (typically the long-term SQLite/AgentDB backend in TierManager).

Architecture

Disk-persisted adjacency list — Vamana graph written to graphPath as JSON
In-memory Int8-quantised vectors — Map<string, Int8Array> for fast beam search
Beam search — BFS traversal using Int8 dot-product as the candidate scorer
Full-precision cosine re-ranking — fetches raw embeddings from the delegate backend
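
The four pieces above fit together as a two-stage search: a cheap Int8 dot product steers a beam over the graph, then the surviving candidates are re-scored with full-precision cosine. The sketch below shows that flow on in-memory maps. It is not DiskAnnBackend's code: all names are hypothetical, per-vector max-abs scaling is an assumed quantisation scheme, and graph construction and disk persistence are left out.

```typescript
// Illustrative two-stage search: Int8 beam traversal, then exact cosine re-ranking.
function quantizeInt8(vector: number[]): Int8Array {
  let maxAbs = 0;
  for (const v of vector) maxAbs = Math.max(maxAbs, Math.abs(v));
  const scale = maxAbs === 0 ? 1 : 127 / maxAbs; // assumed per-vector scaling
  const out = new Int8Array(vector.length);
  for (let i = 0; i < vector.length; i++) out[i] = Math.round(vector[i] * scale);
  return out;
}

function int8Dot(a: Int8Array, b: Int8Array): number {
  let sum = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) sum += a[i] * b[i];
  return sum;
}

function cosineExact(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

function beamSearchSketch(
  graph: Map<string, string[]>,   // adjacency list
  codes: Map<string, Int8Array>,  // quantised vectors used for traversal
  full: Map<string, number[]>,    // full-precision vectors used for re-ranking
  query: number[],
  entryId: string,
  beamWidth: number,
  k: number,
): string[] {
  const q = quantizeInt8(query);
  const approx = (id: string): number => {
    const code = codes.get(id);
    return code ? int8Dot(q, code) : -Infinity;
  };
  let beam = [{ id: entryId, score: approx(entryId), expanded: false }];
  const seen = new Set<string>([entryId]);
  for (;;) {
    // The beam stays sorted, so this is the best candidate not yet expanded.
    const current = beam.find((c) => !c.expanded);
    if (!current) break;
    current.expanded = true;
    for (const neighbor of graph.get(current.id) ?? []) {
      if (seen.has(neighbor)) continue;
      seen.add(neighbor);
      beam.push({ id: neighbor, score: approx(neighbor), expanded: false });
    }
    beam.sort((a, b) => b.score - a.score);
    beam = beam.slice(0, beamWidth); // keep only the best beamWidth candidates
  }
  // Second stage: exact cosine over the surviving candidates.
  return beam
    .map((c) => ({ id: c.id, exact: cosineExact(query, full.get(c.id) ?? []) }))
    .sort((a, b) => b.exact - a.exact)
    .slice(0, k)
    .map((c) => c.id);
}
```

Keeping only Int8 codes in memory is what lets the traversal run without loading every full-precision vector; the exact vectors are fetched only for the final candidates.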

Quick Start

import { DiskAnnBackend, type DiskAnnBackendConfig } from '@monomind/memory';

// Wrap any IMemoryBackend
const diskann = new DiskAnnBackend(existingBackend, {
  graphPath: './data/diskann.graph.json',
  R: 32,          // Max graph degree (default: 32)
  L: 64,          // Beam width during construction (default: 64)
  beamWidth: 10,  // Beam width during search (default: 10)
  dimensions: 128, // Vector dimensions (default: 128)
});

// All CRUD proxies through to the wrapped backend
await diskann.store(entry);
await diskann.get(id);

// ANN search uses beam traversal + cosine re-ranking
const results = await diskann.search(queryVector, { k: 5 });
// [{ entry: MemoryEntry, score: number }, ...]

TierManager Integration

Pass diskAnnConfig to activate DiskANN on the long-term backend:

import { TierManager } from '@monomind/memory';

const tier = new TierManager(
  longTermBackend,
  { shortTermCapacity: 1000 },
  {}, // PartitionedHNSW config
  {   // DiskAnnBackendConfig — activates DiskANN
    graphPath: './data/diskann.graph.json',
    R: 32,
    beamWidth: 12,
  },
);

// tier.diskann is now populated — search() includes DiskANN results
const results = await tier.search('authentication patterns', 10);

DiskAnnBackendConfig

| Field | Default | Description |
|-------|---------|-------------|
| graphPath | './diskann.graph.json' | Path for persisted Vamana adjacency list |
| R | 32 | Max out-degree per node |
| L | 64 | Beam width during construction |
| beamWidth | 10 | Beam width during search |
| dimensions | 128 | Vector dimensions |

Performance Benchmarks

| Operation | V2 Performance | V1 Performance | Improvement |
|-----------|---------------|----------------|-------------|
| Vector Search | 150ms | <1ms | 150x |
| Bulk Insert | 500ms | 5ms | 100x |
| Memory Write | 50ms | <5ms | 10x |
| Cache Hit | 5ms | <0.1ms | 50x |
| Index Build | 10s | 800ms | 12.5x |

ADR-049 Benchmarks

| Operation | Actual | Target | Headroom |
|-----------|--------|--------|----------|
| Graph build (1k nodes) | 2.78 ms | <200 ms | 71.9x |
| PageRank (1k nodes) | 12.21 ms | <100 ms | 8.2x |
| Insight recording | 0.12 ms/each | <5 ms/each | 41.0x |
| Consolidation | 0.26 ms | <500 ms | 1,955x |
| Confidence decay (1k) | 0.23 ms | <50 ms | 215x |
| Knowledge transfer | 1.25 ms | <100 ms | 80.0x |

TypeScript Types

import type {
  // Core entry and query types
  MemoryEntry, MemoryEntryInput, MemoryEntryUpdate,
  MemoryQuery, MemoryType, AccessLevel,
  SearchResult, SearchOptions,

  // Query builder sort fields
  SortField,        // 'createdAt' | 'updatedAt' | 'lastAccessedAt' | 'accessCount' | 'key' | 'score'
  SortDirection,    // 'asc' | 'desc'

  // HNSW
  HNSWConfig, HNSWStats, QuantizationConfig, DistanceMetric,

  // Backend interfaces
  IMemoryBackend, BackendStats, HealthCheckResult,

  // Cache
  CacheConfig, CacheStats,

  // Auto Memory Bridge (ADR-048)
  AutoMemoryBridgeConfig, MemoryInsight, InsightCategory,
  SyncMode, SyncResult, ImportResult,

  // Agent Scope (ADR-049)
  AgentMemoryScope, AgentScopedConfig,
  TransferOptions, TransferResult,

  // Learning Bridge (ADR-049)
  LearningBridgeConfig, ConsolidateResult, PatternMatch,

  // Knowledge Graph (ADR-049)
  MemoryGraphConfig, GraphNode, GraphEdge,
  GraphStats, RankedResult,

  // Tiers
  MemoryTier, EntityFact, SessionSummary,

  // Migration
  MigrationSource, MigrationConfig, MigrationResult,

  // Checkpointing
  AgentState, SwarmCheckpoint, CheckpointMeta,
} from '@monomind/memory';

Dependencies

agentdb - Vector database engine
better-sqlite3 - SQLite driver (native)
sql.js - SQLite driver (WASM fallback)
@monomind/neural - Optional peer dependency for self-learning (graceful fallback when unavailable)

Related Packages

@monomind/neural - Neural learning integration (SONA, ReasoningBank, EWC++)
@monomind/shared - Shared types and utilities
@monomind/hooks - Session lifecycle hooks for auto memory sync

License

MIT
