yutia.db

v1.3.2

Published

22 days ago

Ultra-lightweight embedded database for Realtime Logs and AI Datasets.


embedded-database lightweight-db realtime-logs ai-datasets nedb-alternative json-database local-database typescript yutia-db nodejs

YutiaDB (YDB3)

Ultra-lightweight embedded database optimized for realtime logs, event streams, and AI datasets — with crash-safe append-only storage and fast by-id reads.

Storage format: YDB3 = binary framing + JSON payload
("YDB3" + version + [u32 len][json bytes]...)

Why YutiaDB?

YutiaDB is designed for workloads with heavy, continuous writes, where you need:

Very fast ingestion (append-only)
Crash safety (safe against power loss; a partial last record is ignored on recovery)
Small memory overhead (streaming reads, small cache, optional pointer index)
Easy export to JSON/JSONL for analytics / AI training

This is not a replacement for full-featured query databases such as MongoDB or Postgres.
It is a purpose-built embedded DB for append-heavy use cases.

Features

✅ Append-only realtime ingestion (framed JSON)
✅ Crash-safe recovery (truncated tail-safe)
✅ Fast findOne({_id}) via pointer lookup (O(1) per read)
✅ Streaming scan for analytics (no full file load)
✅ Tombstone deletes (_deleted: true)
✅ Lightweight LRU-ish cache for hot records
✅ Works great for: logs, telemetry, dataset storage, user DB (by-id)
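
The by-id pointer lookup and tombstone deletes listed above can be pictured with a small in-memory model. This is a sketch of the general technique, not YutiaDB's actual internals: `PointerIndex`, `apply`, and `lookup` are hypothetical names, and the byte offsets are made-up values supplied by the caller.

```typescript
// Illustrative model only: names and structure are assumptions, not YutiaDB internals.
type Doc = { _id: string; _deleted?: boolean; [key: string]: unknown };

class PointerIndex {
  // Maps each live _id to the byte offset of its most recent record.
  private offsets = new Map<string, number>();

  // Call once per record, in file order, while scanning or appending.
  apply(doc: Doc, offset: number): void {
    if (doc._deleted) {
      this.offsets.delete(doc._id); // a tombstone hides every earlier version
    } else {
      this.offsets.set(doc._id, offset); // a later record supersedes earlier ones
    }
  }

  // Average O(1) lookup: returns the offset to seek to, or undefined if absent or deleted.
  lookup(id: string): number | undefined {
    return this.offsets.get(id);
  }

  get size(): number {
    return this.offsets.size;
  }
}

const index = new PointerIndex();
index.apply({ _id: "u1", name: "Ada" }, 5);
index.apply({ _id: "u2", name: "Lin" }, 40);
index.apply({ _id: "u1", name: "Ada L." }, 75); // update appended later in the file
index.apply({ _id: "u2", _deleted: true }, 112); // tombstone for u2

console.log(index.lookup("u1")); // 75
console.log(index.lookup("u2")); // undefined
```

Because the file is append-only, an update or delete never rewrites old bytes; only the in-memory pointer changes, which is why a by-id read needs a single seek.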

Install (npm)

npm i yutiadb
# or
bun add yutiadb
# or
pnpm add yutiadb

Quick Start

import { Datastore } from "yutiadb";

const db = new Datastore({
  filename: "./data/app.ydb",
  autoload: true,

  // performance/durability tuning
  durability: "batched",    // "none" | "batched" | "immediate"
  batchBytes: 4 * 1024 * 1024,
  autoFlushMs: 50,
  fsyncEveryFlush: 1,

  maxCacheEntries: 1000,
  maxPendingDocs: 500_000,
} as any);

// write (realtime)
await db.insert({ type: "req", path: "/api/v1/items", ts: Date.now() });

// read by _id (fast)
const one = await db.findOne({ _id: "..." });

// scan query (stream scan)
const items = await db.find({ type: "req" });

// ensure durability (optional)
await db.flush();  // flush buffered writes
await db.drain();  // wait until all pending buffered writes are written

await db.close();

Durability Modes

durability: "none" - Fastest. Data may still be in OS buffers, so a power loss can lose the most recent writes.
durability: "batched" (recommended) - Writes are buffered, and fsync runs periodically (controlled by fsyncEveryFlush).
durability: "immediate" - Safest (fsync on every flush), but slowest.

Tip for logs: use "batched" and compact/export offline if needed.
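
To make the trade-off concrete, here is a minimal sketch of a batched write policy. It is not YutiaDB's implementation: `BatchedWriter` and `Sink` are hypothetical names, and the reading of `batchBytes` (flush once this many bytes are pending) and `fsyncEveryFlush` (fsync on every Nth flush) is an assumption based only on the option names in Quick Start. The timer behind `autoFlushMs` is left out.

```typescript
// Sketch of a "batched" durability policy. The meaning given to batchBytes and
// fsyncEveryFlush here is an assumption; YutiaDB's real behavior may differ.
interface Sink {
  write(chunk: Uint8Array): void; // in Node this could wrap fs.writeSync(fd, chunk)
  fsync(): void; // in Node this could wrap fs.fsyncSync(fd)
}

class BatchedWriter {
  private sink: Sink;
  private batchBytes: number;
  private fsyncEveryFlush: number; // 0 disables fsync in this sketch
  private pending: Uint8Array[] = [];
  private pendingBytes = 0;
  private flushCount = 0;

  constructor(sink: Sink, batchBytes: number, fsyncEveryFlush: number) {
    this.sink = sink;
    this.batchBytes = batchBytes;
    this.fsyncEveryFlush = fsyncEveryFlush;
  }

  // Buffer a record; flush automatically once enough bytes are pending.
  append(record: Uint8Array): void {
    this.pending.push(record);
    this.pendingBytes += record.length;
    if (this.pendingBytes >= this.batchBytes) this.flush();
  }

  // Hand buffered records to the sink, then fsync on every Nth flush.
  flush(): void {
    if (this.pending.length === 0) return;
    for (const chunk of this.pending) this.sink.write(chunk);
    this.pending = [];
    this.pendingBytes = 0;
    this.flushCount += 1;
    if (this.fsyncEveryFlush > 0 && this.flushCount % this.fsyncEveryFlush === 0) {
      this.sink.fsync();
    }
  }
}

// A counting sink stands in for a real file descriptor.
const stats = { writes: 0, fsyncs: 0 };
const countingSink: Sink = {
  write: () => { stats.writes += 1; },
  fsync: () => { stats.fsyncs += 1; },
};

const writer = new BatchedWriter(countingSink, 8, 2);
writer.append(new Uint8Array(5)); // 5 bytes pending, below the threshold
writer.append(new Uint8Array(5)); // 10 bytes pending: flush #1, no fsync yet
writer.append(new Uint8Array(8)); // flush #2, followed by fsync
console.log(stats);
```

In this model, "immediate" corresponds to flushing and fsyncing after every append, and "none" to never calling fsync at all.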

Storage Format (YDB3)

File layout:

Header:
- YDB3 (4 bytes)
- version (1 byte)
Records:
- len (UInt32LE, 4 bytes)
- payload (UTF-8 JSON bytes)
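
The layout above can be sketched as a pair of encoders. This is an illustration written from the description, not code taken from the library: `encodeHeader` and `encodeRecord` are hypothetical helpers, and the version value `1` is a placeholder assumption.

```typescript
// Sketch of the YDB3 layout as described: "YDB3" + version byte, then
// [UInt32LE length][UTF-8 JSON] per record. Helper names are hypothetical.
const MAGIC = new TextEncoder().encode("YDB3");

function encodeHeader(version: number): Uint8Array {
  const header = new Uint8Array(5);
  header.set(MAGIC, 0); // bytes 0..3: "YDB3"
  header[4] = version; // byte 4: format version
  return header;
}

function encodeRecord(doc: unknown): Uint8Array {
  const payload = new TextEncoder().encode(JSON.stringify(doc));
  const frame = new Uint8Array(4 + payload.length);
  // Length prefix counts payload bytes only; `true` selects little-endian.
  new DataView(frame.buffer).setUint32(0, payload.length, true);
  frame.set(payload, 4);
  return frame;
}

const header = encodeHeader(1); // version 1 is an assumed placeholder
const frame = encodeRecord({ _id: "a1", type: "req" });
console.log(header.length, frame.length);
```

The length prefix is what lets a reader skip from record to record without parsing the JSON in between.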

Recovery behavior:

If a crash truncates the last record, the reader stops safely at the last valid record.
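
A reader with this recovery behavior might look like the sketch below. It assumes the layout described above and works on an in-memory byte array for simplicity; `readRecords` and the demo helpers are hypothetical names, and stopping on unparseable JSON is an added assumption, not documented behavior.

```typescript
// Sketch: scan a YDB3-style buffer and stop at a truncated tail.
function readRecords(file: Uint8Array): { docs: unknown[]; validEnd: number } {
  const decoder = new TextDecoder();
  if (file.length < 5 || decoder.decode(file.subarray(0, 4)) !== "YDB3") {
    throw new Error("not a YDB3 file");
  }
  const view = new DataView(file.buffer, file.byteOffset, file.byteLength);
  const docs: unknown[] = [];
  let pos = 5; // skip magic (4 bytes) + version (1 byte)

  while (pos + 4 <= file.length) {
    const len = view.getUint32(pos, true); // little-endian length prefix
    const end = pos + 4 + len;
    if (end > file.length) break; // payload cut off by a crash: ignore the tail
    try {
      docs.push(JSON.parse(decoder.decode(file.subarray(pos + 4, end))));
    } catch {
      break; // assumption: treat an unparseable payload as the end of valid data
    }
    pos = end;
  }
  // validEnd is the offset just past the last complete record.
  return { docs, validEnd: pos };
}

// Demo helpers (hypothetical) to build a small file in memory.
function frameOf(doc: unknown): Uint8Array {
  const payload = new TextEncoder().encode(JSON.stringify(doc));
  const out = new Uint8Array(4 + payload.length);
  new DataView(out.buffer).setUint32(0, payload.length, true);
  out.set(payload, 4);
  return out;
}

function concat(parts: Uint8Array[]): Uint8Array {
  let total = 0;
  for (const p of parts) total += p.length;
  const out = new Uint8Array(total);
  let at = 0;
  for (const p of parts) {
    out.set(p, at);
    at += p.length;
  }
  return out;
}

const fileHeader = new Uint8Array([0x59, 0x44, 0x42, 0x33, 1]); // "YDB3" + assumed version 1
const full = concat([fileHeader, frameOf({ n: 1 }), frameOf({ n: 2 })]);
const truncated = full.subarray(0, full.length - 3); // simulate a crash mid-write
const result = readRecords(truncated);
console.log(result.docs.length, result.validEnd);
```

A writer reopening the file could truncate it to `validEnd` before appending, so that new records never follow a half-written one.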

Use Cases

Realtime Logs (Web/API)

request logs
audit logs
webhook logs
notification logs

Dataset for AI

store training samples (JSON)
export to JSONL later
sequential scan for training
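
As a rough sketch of the export step, the function below turns an array of documents into JSONL text, one JSON object per line. `toJsonl` is a hypothetical helper, not part of YutiaDB, and skipping `_deleted` records is an assumption about what an export should contain.

```typescript
// Sketch: serialize documents as JSONL, skipping tombstoned records.
type Doc = { _id?: string; _deleted?: boolean; [key: string]: unknown };

function toJsonl(docs: Doc[]): string {
  const lines: string[] = [];
  for (const doc of docs) {
    if (doc._deleted) continue; // assumption: deleted samples are not exported
    lines.push(JSON.stringify(doc));
  }
  // Each record ends with a newline, including the last one.
  return lines.length > 0 ? lines.join("\n") + "\n" : "";
}

const samples: Doc[] = [
  { _id: "s1", prompt: "hi", label: 1 },
  { _id: "s2", _deleted: true },
  { _id: "s3", prompt: "yo", label: 0 },
];

const jsonl = toJsonl(samples);
console.log(jsonl);
```

With YutiaDB, the input would presumably come from a scan such as `await db.find({ ... })` as shown in Quick Start, assuming it resolves to an array of documents.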

User DB (by-id)

fast findOne({_id})
tombstone deletes
optional compaction
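
Compaction for this use case can be thought of as keeping only the latest live version of each `_id`. The sketch below shows that idea on an in-memory array; `compact` is a hypothetical function, and YutiaDB's own compaction, if and when it is used, may work differently.

```typescript
// Sketch: last write wins per _id, and tombstoned ids are dropped entirely.
type Doc = { _id: string; _deleted?: boolean; [key: string]: unknown };

function compact(records: Doc[]): Doc[] {
  const latest = new Map<string, Doc>();
  for (const doc of records) {
    // Delete first so a re-inserted id moves to the end of the Map's order.
    latest.delete(doc._id);
    if (!doc._deleted) latest.set(doc._id, doc);
  }
  return Array.from(latest.values());
}

const log: Doc[] = [
  { _id: "u1", name: "Ada" },
  { _id: "u2", name: "Lin" },
  { _id: "u1", name: "Ada L." }, // update
  { _id: "u2", _deleted: true }, // tombstone
  { _id: "u3", name: "Sam" },
];

const compacted = compact(log);
console.log(compacted);
```

Writing the compacted records to a fresh file and swapping it in would reclaim the space held by superseded versions and tombstones.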

Limitations

Query engine is intentionally minimal (stream scan + by-id pointer)
Single-process embedded DB (not a network DB)
Multi-field secondary indexing is not included (yet)

Roadmap (optional)

[ ] Time-window compaction for logs (keep last N days)
[ ] Index snapshot .idx for faster startup
[ ] Tail / live stream tool (tail -f for YDB)
[ ] Export JSONL tool

License

MIT