yutia.db
v1.3.2
Ultra-lightweight embedded database for Realtime Logs and AI Datasets.
YutiaDB (YDB3)
Ultra-lightweight embedded database optimized for realtime logs, event streams, and AI datasets — with crash-safe append-only storage and fast by-id reads.
Storage format: YDB3 = binary framing + JSON payload
("YDB3" + version + [u32 len][json bytes]...)
Why YutiaDB?
YutiaDB is designed for workloads where writes are massive & continuous, and you need:
- Very fast ingestion (append-only)
- Crash-safety (power loss safe; last partial record is ignored on recovery)
- Small memory overhead (streaming reads, small cache, optional pointer index)
- Easy export to JSON/JSONL for analytics / AI training
This is not a replacement for full-featured query databases (MongoDB/Postgres).
It’s a purpose-built embedded DB for append-heavy use cases.
Features
- ✅ Append-only realtime ingestion (framed JSON)
- ✅ Crash-safe recovery (truncated tail-safe)
- ✅ Fast findOne({_id}) via pointer lookup (O(1) per read)
- ✅ Streaming scan for analytics (no full file load)
- ✅ Tombstone deletes (_deleted: true)
- ✅ Lightweight LRU-ish cache for hot records
- ✅ Works great for: logs, telemetry, dataset storage, user DB (by-id)
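The by-id pointer lookup and tombstone deletes listed above can be pictured with a small in-memory model. This is only a sketch under assumptions, not YutiaDB's internals: the class name PointerIndexedLog, its methods, and the use of an array position in place of a file offset are all hypothetical.

```typescript
// Hypothetical model of an append-only log with a pointer index.
// Not YutiaDB's code: names and structure are illustrative assumptions.
type Doc = { _id: string; _deleted?: boolean; [key: string]: unknown };

class PointerIndexedLog {
  // Stands in for the append-only file; each entry is one serialized record.
  private log: string[] = [];
  // _id -> position of the latest live record (a file offset in a real store).
  private pointers = new Map<string, number>();

  append(doc: Doc): void {
    const pos = this.log.length;
    this.log.push(JSON.stringify(doc));
    if (doc._deleted) {
      this.pointers.delete(doc._id); // tombstone: forget the pointer
    } else {
      this.pointers.set(doc._id, pos); // newest record wins
    }
  }

  // A delete never rewrites old data; it appends a tombstone record.
  remove(id: string): void {
    this.append({ _id: id, _deleted: true });
  }

  // One map lookup plus one record read, independent of log size.
  findOne(id: string): Doc | null {
    const pos = this.pointers.get(id);
    if (pos === undefined) return null;
    const raw = this.log[pos];
    if (raw === undefined) return null;
    return JSON.parse(raw) as Doc;
  }

  get recordCount(): number {
    return this.log.length;
  }
}
```

In this model the log only grows: an overwritten or deleted record stays in place until something rewrites the log, which is why a separate compaction step is useful for by-id workloads.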
Install (npm)
npm i yutiadb
# or
bun add yutiadb
# or
pnpm add yutiadb
Quick Start
import { Datastore } from "yutiadb";
const db = new Datastore({
  filename: "./data/app.ydb",
  autoload: true,
  // performance/durability tuning
  durability: "batched", // "none" | "batched" | "immediate"
  batchBytes: 4 * 1024 * 1024,
  autoFlushMs: 50,
  fsyncEveryFlush: 1,
  maxCacheEntries: 1000,
  maxPendingDocs: 500_000,
} as any);
// write (realtime)
await db.insert({ type: "req", path: "/api/v1/items", ts: Date.now() });
// read by _id (fast)
const one = await db.findOne({ _id: "..." });
// scan query (stream scan)
const items = await db.find({ type: "req" });
// ensure durability (optional)
await db.flush(); // flush buffered writes
await db.drain(); // wait until all pending buffered writes are written
await db.close();
Durability Modes
- durability: "none": Fastest. Data may be in OS buffers; power loss might lose last writes.
- durability: "batched" (recommended): Writes are buffered and fsync happens periodically (fsyncEveryFlush).
- durability: "immediate": Safest (fsync every flush). Slowest.
Tip for logs: use "batched" and compact/export offline if needed.
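To make the batched mode more concrete, here is a minimal sketch of a write buffer that flushes once a byte threshold is reached and syncs on every Nth flush. It rests on assumptions: the Sink interface and BatchedWriter class are hypothetical, reading fsyncEveryFlush as "sync once per N flushes" is a guess at the option's meaning, and the timer-driven flush (autoFlushMs) is left out.

```typescript
// Hypothetical sink; a real one would wrap file writes and an fsync call.
interface Sink {
  write(chunk: Uint8Array): void;
  sync(): void;
}

// Sketch of batched durability. Not YutiaDB's implementation.
class BatchedWriter {
  private sink: Sink;
  private batchBytes: number;
  private fsyncEveryFlush: number; // assumed meaning: sync once per this many flushes
  private pending: Uint8Array[] = [];
  private pendingBytes = 0;
  private flushCount = 0;

  constructor(sink: Sink, batchBytes: number, fsyncEveryFlush: number) {
    this.sink = sink;
    this.batchBytes = batchBytes;
    this.fsyncEveryFlush = fsyncEveryFlush;
  }

  // Buffer the record; flush when the buffered size reaches the threshold.
  append(record: Uint8Array): void {
    this.pending.push(record);
    this.pendingBytes += record.length;
    if (this.pendingBytes >= this.batchBytes) this.flush();
  }

  // Hand all buffered records to the sink, then sync on every Nth flush.
  flush(): void {
    if (this.pending.length === 0) return;
    for (const chunk of this.pending) this.sink.write(chunk);
    this.pending = [];
    this.pendingBytes = 0;
    this.flushCount++;
    if (this.flushCount % this.fsyncEveryFlush === 0) this.sink.sync();
  }
}
```

Under this reading, a sync interval of 1 makes every flush durable, while larger values trade a wider loss window after power failure for fewer sync calls.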
Storage Format (YDB3)
File layout:
Header:
- YDB3 (4 bytes)
- version (1 byte)
Records:
- len (UInt32LE, 4 bytes)
- payload (UTF-8 JSON bytes)
Recovery behavior:
- If a crash truncates the last record, reader stops safely at the last valid record.
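The layout and recovery rule above can be sketched as a pair of functions over a byte array. This is an illustration under assumptions, not the library's reader: encodeFile and decodeFile are hypothetical names, the version byte value of 1 is assumed, and a full-length record holding invalid JSON is not handled.

```typescript
// Hypothetical helpers for the YDB3 layout: "YDB3" + version + [u32 LE len][json]...
const MAGIC = new TextEncoder().encode("YDB3"); // 4-byte magic
const HEADER_LEN = 5; // magic (4 bytes) + version (1 byte)

function encodeFile(docs: unknown[], version = 1): Uint8Array {
  const enc = new TextEncoder();
  const payloads = docs.map((d) => enc.encode(JSON.stringify(d)));
  const total = HEADER_LEN + payloads.reduce((n, p) => n + 4 + p.length, 0);
  const out = new Uint8Array(total);
  const view = new DataView(out.buffer);
  out.set(MAGIC, 0);
  out[4] = version; // assumed version value
  let off = HEADER_LEN;
  for (const p of payloads) {
    view.setUint32(off, p.length, true); // little-endian length prefix
    out.set(p, off + 4);
    off += 4 + p.length;
  }
  return out;
}

function decodeFile(bytes: Uint8Array): unknown[] {
  if (bytes.length < HEADER_LEN) throw new Error("file too short for YDB3 header");
  for (let i = 0; i < 4; i++) {
    if (bytes[i] !== MAGIC[i]) throw new Error("bad magic");
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const dec = new TextDecoder();
  const docs: unknown[] = [];
  let off = HEADER_LEN;
  // Need at least a complete length prefix to continue.
  while (off + 4 <= bytes.length) {
    const len = view.getUint32(off, true);
    const end = off + 4 + len;
    if (end > bytes.length) break; // truncated tail: stop at the last valid record
    docs.push(JSON.parse(dec.decode(bytes.subarray(off + 4, end))));
    off = end;
  }
  return docs;
}
```

Because every record carries its own length, a reader can tell an incomplete final record from a complete one without any trailer or checksum, which is what lets it stop cleanly after a crash mid-write.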
Use Cases
Realtime Logs (Web/API)
- request logs
- audit logs
- webhook logs
- notification logs
Dataset for AI
- store training samples (JSON)
- export to JSONL later
- sequential scan for training
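Exporting to JSONL amounts to writing one JSON document per line. The sketch below assumes the records have already been read from a scan; toJsonl is a hypothetical helper, not part of the package, and skipping tombstoned records is an assumption about what an export should contain.

```typescript
// Hypothetical helper: serialize records as JSON Lines (one JSON object per line).
function toJsonl(docs: Iterable<Record<string, unknown>>): string {
  const lines: string[] = [];
  for (const doc of docs) {
    if (doc._deleted === true) continue; // assumption: tombstones are not exported
    lines.push(JSON.stringify(doc));
  }
  // Terminate with a newline so files can be concatenated safely.
  return lines.length > 0 ? lines.join("\n") + "\n" : "";
}
```

JSON.stringify escapes newlines inside string values, so each record stays on a single line even when a field contains multi-line text.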
User DB (by-id)
- fast findOne({_id})
- tombstone deletes
- optional compaction
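For a by-id workload, compaction can be thought of as keeping only the newest record for each _id and dropping ids whose newest record is a tombstone. The following is a minimal sketch of that idea, not the library's compaction routine: the compact function is hypothetical, and it assumes updates are appended as new records that reuse the same _id.

```typescript
type Rec = { _id: string; _deleted?: boolean; [key: string]: unknown };

// Hypothetical compaction over records given in append order.
function compact(records: Rec[]): Rec[] {
  const latest = new Map<string, Rec>();
  for (const rec of records) {
    // Later records win. Delete-then-set moves the key to the end of the map,
    // so the output follows the order of each id's last write.
    latest.delete(rec._id);
    latest.set(rec._id, rec);
  }
  // Drop ids whose most recent record is a tombstone.
  return Array.from(latest.values()).filter((r) => !r._deleted);
}
```

The result can then be written to a fresh file and swapped in for the old one, leaving a log with one live record per id.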
Limitations
- Query engine is intentionally minimal (stream scan + by-id pointer)
- Single-process embedded DB (not a network DB)
- Multi-field secondary indexing is not included (yet)
Roadmap (optional)
- [ ] Time-window compaction for logs (keep last N days)
- [ ] Index snapshot .idx for faster startup
- [ ] Tail / live stream tool (tail -f for YDB)
- [ ] Export JSONL tool
License
MIT
