npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

@backloghq/agentdb

v1.2.1

Published

AI-first embedded database for LLM agents. Library-first architecture with framework-agnostic tool definitions and MCP adapter.

Downloads

581

Readme

AgentDB

CI License: MIT

AI-first embedded database for LLM agents. Zero native dependencies, pure TypeScript.

Install

npm install @backloghq/agentdb

Quick Start

import { AgentDB } from "@backloghq/agentdb";

const db = new AgentDB("./data");
await db.init();

const tasks = await db.collection("tasks");

// Insert
const id = await tasks.insert(
  { title: "Ship v1", status: "active", priority: 1 },
  { agent: "planner", reason: "Sprint kickoff" },
);

// Find
const result = tasks.find({ filter: { status: "active" } });
// → { records: [...], total: 1, truncated: false }

// Update
await tasks.update(
  { _id: id },
  { $set: { status: "done" } },
  { agent: "planner", reason: "Completed" },
);

// Clean up
await db.close();

Declarative Schemas

Define typed, validated collections in one place:

import { AgentDB, defineSchema } from "@backloghq/agentdb";

const db = new AgentDB("./data");
await db.init();

const tasks = await db.collection(defineSchema({
  name: "tasks",
  fields: {
    title: { type: "string", required: true, maxLength: 200 },
    status: { type: "enum", values: ["pending", "done"], default: "pending" },
    priority: { type: "enum", values: ["H", "M", "L"], default: "M" },
    score: { type: "number", min: 0, max: 100 },
    tags: { type: "string[]" },
  },
  indexes: ["status", "priority"],
  arrayIndexes: ["tags"],           // O(1) $contains lookups
  computed: {
    isUrgent: (r) => r.priority === "H" && r.status === "pending",
  },
  virtualFilters: {
    "+URGENT": (r) => r.priority === "H" && r.status === "pending",
  },
  hooks: {
    beforeInsert: (record) => ({ ...record, createdAt: new Date().toISOString() }),
  },
}));

await tasks.insert({ title: "Fix critical bug", priority: "H" });
// → status defaults to "pending", priority validated, createdAt auto-set

const urgent = tasks.find({ filter: { "+URGENT": true } });

Fields support: string, number, boolean, date, enum, string[], number[], object, autoIncrement. Constraints: required, maxLength, min, max, pattern, default, resolve.

Field resolve — transform values before validation (e.g. parse natural language dates):

fields: {
  due: { type: "date", resolve: (v) => v === "tomorrow" ? nextDay() : v },
  score: { type: "number", resolve: (v) => typeof v === "string" ? parseInt(v) : v },
}

Custom tag field+tag/-tag syntax queries "tags" by default, configurable via tagField:

defineSchema({ tagField: "labels", fields: { labels: { type: "string[]" } } })
// +bug → { labels: { $contains: "bug" } }

Three Ways to Use It

1. Direct Import

import { AgentDB } from "@backloghq/agentdb";

Full programmatic access. Use AgentDB to manage collections, Collection for CRUD.

2. Tool Definitions

import { AgentDB } from "@backloghq/agentdb";
import { getTools } from "@backloghq/agentdb/tools";

const db = new AgentDB("./data");
await db.init();

const tools = getTools(db);
// → Array of { name, description, schema, annotations, execute }

Framework-agnostic. Each tool has a zod schema and an execute function that returns { content: [...] }. Works with Vercel AI SDK, LangChain, or any framework that accepts tool definitions.

3. MCP Server

npx agentdb --path ./data              # stdio (single client)
npx agentdb --path ./data --http       # HTTP (multiple clients)

All 30 tools exposed as MCP tools (32 on HTTP with db_subscribe/db_unsubscribe). Claude Code config (~/.claude/settings.json):

{
  "mcpServers": {
    "agentdb": {
      "command": "npx",
      "args": ["agentdb", "--path", "/absolute/path/to/data"]
    }
  }
}

Disk-Backed Storage

For large collections that exceed available RAM, enable disk-backed mode. Collections are compacted to Parquet files with persistent indexes.

// Global: all collections use disk mode
const db = new AgentDB("./data", {
  storageMode: "disk",   // "memory" (default) | "disk" | "auto"
  cacheSize: 10_000,     // LRU cache size (records)
  rowGroupSize: 5000,    // Parquet row group size
});

// Per-collection via schema
const events = await db.collection(defineSchema({
  name: "events",
  storageMode: "disk",
  fields: { ... },
  indexes: ["type", "timestamp"],
  arrayIndexes: ["tags"],
}));

// Auto mode: switches to disk when collection exceeds threshold
const db = new AgentDB("./data", {
  storageMode: "auto",
  diskThreshold: 10_000,  // default
});

Disk mode opens with skipLoad — records are NOT loaded into memory. On close, compaction writes two artifacts:

  • Parquet_id + extracted columns only. For count(), column scans, and skip-scanning. No full records stored.
  • JSONL record store — full records, one per line. For findOne() and find(limit:N) via byte-range seeks.

Point lookups use readBlobRange to seek directly to a record's byte offset in the JSONL file — O(1) per record on filesystem, single HTTP Range request on S3. No row group parsing, no full-file reads.

Compaction is incremental — close writes only new records, not the full dataset. Auto-merges after 10 incremental files. Indexes are lazy-loaded on first query.

All disk I/O goes through StorageBackend — works identically on filesystem and S3. Zero native dependencies.

S3 Backend

Store data in Amazon S3 instead of the local filesystem. Zero code changes — just configure via CLI flags or environment variables.

CLI flags

npx agentdb --backend s3 --bucket my-bucket --region us-east-1
npx agentdb --backend s3 --bucket my-bucket --prefix prod/agentdb --http --port 3000
npx agentdb --backend s3 --bucket my-bucket --agent-id agent-1  # multi-writer

Environment variables

AGENTDB_BACKEND=s3
AGENTDB_S3_BUCKET=my-bucket
AGENTDB_S3_PREFIX=agentdb        # optional key prefix
AWS_REGION=us-east-1
AGENTDB_AGENT_ID=agent-1         # optional multi-writer
npx agentdb

Library usage

import { AgentDB, loadS3Backend } from "@backloghq/agentdb";

const { S3Backend } = await loadS3Backend(); // optional — requires @backloghq/opslog-s3
const db = new AgentDB("mydb", {
  backend: new S3Backend({
    bucket: "my-bucket",
    prefix: "agentdb",
    region: "us-east-1",
  }),
  agentId: "agent-1",  // optional: enables multi-writer
});
await db.init();

AWS credentials use the standard SDK chain (env vars, IAM role, ~/.aws/config). The AWS SDK is only loaded when S3 is configured — filesystem users never pay the cost.

Filter Syntax

Two syntaxes. JSON is primary, compact string is secondary.

JSON Filters

// Equality (implicit)
tasks.find({ filter: { status: "active" } });

// Comparison operators
tasks.find({ filter: { priority: { $gt: 3 } } });

// Dot-notation for nested fields
tasks.find({ filter: { "metadata.tags": { $contains: "urgent" } } });

// Logical operators
tasks.find({
  filter: {
    $or: [{ status: "active" }, { priority: { $gte: 5 } }],
  },
});

Operators: $eq, $ne, $gt, $gte, $lt, $lte, $in, $nin, $contains, $startsWith, $endsWith, $exists, $regex, $not

Top-level keys are implicitly ANDed.

Compact String Filters

Shorthand for tool calls and quick queries:

status:active                          → { status: "active" }
status:active priority.gt:3           → { $and: [{ status: "active" }, { priority: { $gt: 3 } }] }
name.contains:alice                    → { name: { $contains: "alice" } }
(role:admin or role:mod)               → { $or: [{ role: "admin" }, { role: "mod" }] }
tags.in:bug,feature                    → { tags: { $in: ["bug", "feature"] } }
+bug                                   → { tags: { $contains: "bug" } }
-old                                   → { tags: { $not: { $contains: "old" } } }
auth error                             → { $text: "auth error" }
status:active auth                     → { $and: [{ status: "active" }, { $text: "auth" }] }

Modifier aliases: gt, gte, lt, lte, ne, contains, has, startsWith, starts, endsWith, ends, in, nin, exists, regex, match, eq, is, not, after, before, above, below, over, under

Collection API

v1.2 breaking change: findOne, find, findAll, count, search, queryView are now async and return Promises.

const col = await db.collection("tasks");

// Insert
const id = await col.insert(doc, opts?);
const ids = await col.insertMany(docs, opts?);

// Read (async)
const record = await col.findOne(id);
const result = await col.find({ filter?, limit?, offset?, summary?, sort?, maxTokens? });
const n = await col.count(filter?);

// Update
const modified = await col.update(filter, { $set?, $unset?, $inc?, $push? }, opts?);
const { id, action } = await col.upsert(id, doc, opts?);
const results = await col.upsertMany([{ _id, ...doc }, ...], opts?);

// Delete
const deleted = await col.remove(filter, opts?);

// History
const undone = await col.undo();
const ops = col.history(id);

// Inspect
const shape = col.schema(sampleSize?);
const uniq = col.distinct(field);

All mutation methods accept opts?: { agent?: string; reason?: string }.

Tool Definitions

getTools(db) returns 30 tools:

| Tool | Description | |------|-------------| | db_collections | List all collections with record counts | | db_create | Create a collection (idempotent) | | db_drop | Soft-delete a collection | | db_purge | Permanently delete a dropped collection | | db_insert | Insert one or more records | | db_find | Query with filter, pagination, summary mode, token budget | | db_find_one | Get a single record by ID | | db_update | Update matching records ($set, $unset, $inc, $push) | | db_upsert | Insert or update by ID | | db_delete | Delete matching records | | db_count | Count matching records | | db_batch | Execute multiple mutations atomically | | db_undo | Undo last mutation | | db_history | Mutation history for a record | | db_schema | Inspect record shape (fields, types, examples) | | db_distinct | Unique values for a field | | db_stats | Database-level statistics | | db_archive | Move records to cold storage | | db_archive_list | List archive segments | | db_archive_load | View archived records | | db_semantic_search | Search by meaning (requires embedding provider) | | db_embed | Manually trigger embedding | | db_vector_upsert | Store a pre-computed vector with metadata | | db_vector_search | Search by raw vector (no embedding provider needed) | | db_blob_write | Attach a file (base64) to a record | | db_blob_read | Read an attached file | | db_blob_list | List files attached to a record | | db_blob_delete | Delete an attached file | | db_export | Export collections as JSON backup | | db_import | Import from a JSON backup |

Each tool returns { content: [{ type: "text", text: "..." }] }. Errors return { isError: true, content: [...] } — they never throw across the tool boundary.

Agent Identity

Every mutation accepts agent and reason. These are stored internally and visible in history, but stripped from query results.

await col.insert(
  { title: "Fix login bug" },
  { agent: "triage-bot", reason: "Auto-filed from error spike" },
);

// History shows who did what and why
col.history(id);
// → [{ type: "set", key: "...", value: { ..., _agent: "triage-bot", _reason: "..." }, ... }]

Authentication

Bearer token (simplest)

npx agentdb --http --auth-token my-secret-token

# Agents send: Authorization: Bearer my-secret-token

Or via environment variable:

AGENTDB_AUTH_TOKEN=my-secret-token npx agentdb --http

No token configured = open access (backward compatible). Health check at /health always works.

Multi-agent tokens

Map different tokens to different agent identities and permissions:

startHttp(dir, {
  authTokens: {
    "token-reader": { agentId: "reader", permissions: { read: true, write: false, admin: false } },
    "token-writer": { agentId: "writer", permissions: { read: true, write: true, admin: false } },
  },
});

JWT (production)

Validate JWTs from any OAuth provider (Auth0, WorkOS, etc.):

import { startHttp, createJwtAuth } from "@backloghq/agentdb/mcp";

startHttp(dir, {
  authFn: createJwtAuth({
    jwksUrl: "https://your-domain.auth0.com/.well-known/jwks.json",
    audience: "agentdb",
    issuer: "https://your-domain.auth0.com",
  }),
});

Group commit (faster writes)

Buffer writes in memory and flush as a single disk write. ~12x faster for sustained writes. Single-writer only — auto-disabled when agentId is set.

npx agentdb --http --group-commit

# Or via env var
AGENTDB_WRITE_MODE=group npx agentdb --http
const db = new AgentDB("./data", { writeMode: "group" });

Tradeoff: A crash can lose buffered ops (up to 100ms of data). Default "immediate" mode is safe — every write survives a crash.

Read-only mode

Open a read-only instance alongside a running writer — no write locks, safe for dashboards and monitoring:

const reader = new AgentDB("./data", { readOnly: true });
await reader.init();
const col = await reader.collection("tasks");
await col.tail(); // pick up latest writes

Blob storage

Attach files to records — images, PDFs, code, any binary. Stored outside the WAL via the StorageBackend (works on filesystem and S3).

const col = await db.collection("tasks");
await col.insert({ _id: "task-1", title: "Fix auth" });

// Attach files
await col.writeBlob("task-1", "spec.md", "# Spec\n\nDetails...");
await col.writeBlob("task-1", "screenshot.png", imageBuffer);

// Read back
const spec = await col.readBlob("task-1", "spec.md");
const blobs = await col.listBlobs("task-1"); // → ["spec.md", "screenshot.png"]

// Delete
await col.deleteBlob("task-1", "spec.md");

Blobs are automatically cleaned up when their parent record is deleted.

Embeddings and vector search

AgentDB supports semantic search via embedding providers and explicit vector storage.

Embedding providers (for automatic text embedding):

# Local via Ollama (no API key)
npx agentdb --http --embeddings ollama

# OpenAI
OPENAI_API_KEY=sk-... npx agentdb --http --embeddings openai:text-embedding-3-small

# Gemini (free tier available)
GEMINI_API_KEY=... npx agentdb --http --embeddings gemini

# Voyage AI / Cohere
AGENTDB_EMBEDDINGS_API_KEY=... npx agentdb --http --embeddings voyage
AGENTDB_EMBEDDINGS_API_KEY=... npx agentdb --http --embeddings cohere

Explicit vector API (no provider needed):

const col = await db.collection("docs");

// Store pre-computed vectors
await col.insertVector("doc1", [0.1, 0.2, ...], { title: "My Document" });

// Search by vector
const results = col.searchByVector([0.1, 0.2, ...], { limit: 10, filter: { status: "active" } });
// → { records: [...], scores: [0.98, 0.91, ...] }

MCP tools: db_vector_upsert, db_vector_search, db_semantic_search, db_embed.

Rate limiting and CORS

npx agentdb --http --auth-token secret --rate-limit 100 --cors https://app.example.com

Real-time notifications

Subscribe to collection changes via db_subscribe / db_unsubscribe on the HTTP MCP transport. Agents receive push notifications via SSE when records are inserted, updated, or deleted — no polling needed. See examples/multi-agent/ for a working demo.

Docker

docker build -t agentdb .
docker run -p 3000:3000 -v ./data:/data agentdb --path /data --http --host 0.0.0.0

# With auth:
docker run -p 3000:3000 -e AGENTDB_AUTH_TOKEN=secret -v ./data:/data agentdb --path /data --http --host 0.0.0.0

# With S3:
docker run -p 3000:3000 \
  -e AGENTDB_BACKEND=s3 \
  -e AGENTDB_S3_BUCKET=my-bucket \
  -e AWS_REGION=us-east-1 \
  agentdb --http --host 0.0.0.0

Sorting

col.find({ filter: { status: "active" }, sort: "name" });     // ascending
col.find({ filter: { status: "active" }, sort: "-score" });    // descending
col.find({ sort: "-metadata.priority" });                       // nested field

Progressive Disclosure

Use summary: true on find to get compact results. Omits long text fields (>200 chars), nested objects, and large arrays (>10 items). Useful for agents scanning many records before drilling into one.

col.find({ filter: { status: "active" }, summary: true });

Deployment Patterns

| Scenario | Pattern | Storage Mode | Latency | |----------|---------|-------------|---------| | Small datasets (<10K records) | Direct import / stdio MCP | memory (default) | <1ms | | Large datasets (10K-1M+) | Direct import / HTTP MCP | disk | <1ms findOne, ~10ms find | | Auto-scaling | Any | auto (switches at threshold) | varies | | Multiple agents, same machine | HTTP MCP server | memory or disk | ~1-5ms | | Multiple agents, distributed | HTTP MCP + S3 backend | disk | ~50ms | | Decentralized, no server | Multi-writer S3 | memory | ~50ms |

Storage mode guide:

  • memory — all records in RAM. Fastest queries. Use for <10K records.
  • disk — records in JSONL + Parquet on disk/S3. Handles 1M+ records. Lazy index loading for fast cold open.
  • auto — starts in memory, switches to disk when collection exceeds diskThreshold.

Default recommendation: Use memory for small datasets, disk or auto for anything that might grow.

Examples

See examples/ for runnable demos powered by Ollama:

  • Multi-Agent Task Board — Agents collaborate on a shared task board. Event-driven via NOTIFY/LISTEN.
  • RAG Knowledge Base — Ingest docs, embed with Ollama, answer questions via semantic search.
  • Research Pipeline — 3-stage AI pipeline: Researcher → Analyst → Writer. Each stage triggers the next.
  • Multi-Model Code Review — Gemini generates code, Ollama reviews locally, Gemini writes tests. Multi-provider orchestration.
  • Live Dashboard — Real-time CLI view of any running demo's collections.

Development

npm run build          # tsc
npm run lint           # eslint src/ tests/
npm test               # vitest run
npm run test:coverage  # vitest coverage

Built on @backloghq/opslog -- every mutation is an operation in an append-only log. You get crash safety, undo, and audit trails for free.

License

MIT