@colql/colql
v0.5.0
Memory-efficient in-memory columnar query engine for TypeScript
ColQL is a zero-dependency, in-memory columnar query engine for TypeScript apps that need compact process-local storage, typed schemas, explicit indexes, and safe mutations.
It is not a SQL database or persistence layer. ColQL is for data you already want to keep inside a Node.js process.
Why ColQL?
- Compact columnar storage backed by typed arrays, dictionaries, and bit-packed booleans
- Lazy queries with filtering, projection, aggregation, streaming, limit, and offset
- Object predicates plus tuple-style where(column, operator, value)
- Explicit equality indexes and sorted numeric indexes for hot predicates
- Unique indexes for stable ID lookups and duplicate-key protection
- Public query.explain() diagnostics for planner visibility without executing queries
- JS Array migration helpers such as fromRows, firstWhere, countWhere, and exists
- Mutable tables with updateMany and deleteMany
- Runtime validation with structured ColQLError codes
- Binary serialization for table data
- TypeScript inference for rows, predicates, projections, and mutation payloads
- Zero runtime dependencies
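The storage ideas behind the first bullet can be sketched in plain TypeScript. This is a simplified illustration of dictionary encoding and bit-packing in general, not ColQL's actual internals:

```typescript
// Dictionary encoding and bit-packing, the two compaction techniques the
// feature list mentions, shown on plain typed arrays.

const statuses = ["active", "passive", "archived"] as const;
type Status = (typeof statuses)[number];

// Dictionary encoding: store a 1-byte code per row instead of a string.
function encodeStatuses(rows: Status[]): Uint8Array {
  const codes = new Uint8Array(rows.length);
  rows.forEach((s, i) => {
    codes[i] = statuses.indexOf(s); // 0, 1, or 2
  });
  return codes;
}

// Bit-packing: store 8 booleans per byte instead of one slot each.
function packBooleans(flags: boolean[]): Uint8Array {
  const packed = new Uint8Array(Math.ceil(flags.length / 8));
  flags.forEach((flag, i) => {
    if (flag) packed[i >> 3] |= 1 << (i & 7);
  });
  return packed;
}

function readBit(packed: Uint8Array, i: number): boolean {
  return (packed[i >> 3] & (1 << (i & 7))) !== 0;
}

const codes = encodeStatuses(["active", "archived", "active"]);
const packed = packBooleans([true, false, true]);
console.log(codes, readBit(packed, 2)); // Uint8Array [0, 2, 0], true
```

Either representation decodes back to the original values, which is why a columnar engine can expose normal strings and booleans while storing far less per row.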
Install
npm install @colql/colql
Quick Example
import { column, table } from "@colql/colql";
const users = table({
id: column.uint32(),
age: column.uint8(),
status: column.dictionary(["active", "passive", "archived"] as const),
score: column.float64(),
verified: column.boolean(),
});
users.insertMany([
{ id: 1, age: 29, status: "active", score: 91.5, verified: true },
{ id: 2, age: 17, status: "passive", score: 72.0, verified: false },
{ id: 3, age: 44, status: "active", score: 88.2, verified: true },
]);
users.createIndex("status");
users.createSortedIndex("age");
users.createUniqueIndex("id");
const activeAdults = users
.where({
status: "active",
age: { gte: 18 },
})
.select(["id", "age", "score"])
.toArray();
console.log(
users.where({ status: "active", age: { gte: 18 } }).select(["id"]).explain(),
);
const result = users.updateMany(
{ status: "passive", age: { lt: 18 } },
{ status: "archived" },
);
console.log(activeAdults);
console.log(result.affectedRows);
console.log(users.findBy("id", 1));
Performance Snapshot
ColQL includes a Fastify example that can boot with 1M deterministic rows and exercise indexed, range, scan, callback-filter, mutation, stress, and memory paths.
These are local reference numbers from examples/fastify-api, captured after the example's mutation validation path; they are not guarantees. Actual numbers vary by Node.js version, CPU, memory pressure, data distribution, query selectivity, projection size, and mutation frequency.
| 1M-row workload | Local avg | Expected shape |
|---|---:|---|
| Selective equality query with createIndex() | 2.06ms | Fastest path when candidate sets are small |
| Numeric range query with createSortedIndex() | 27.69ms | Helps selective ranges; broad ranges may resemble scans |
| Broad structured predicate | 25.31ms | May intentionally scan when an index is not selective |
| filter(fn) callback predicate | 218.93ms | Full-scan escape hatch; not index-aware |
ColQL also includes a JS Array comparison benchmark. It is meant to show tradeoffs, not to prove that ColQL is always faster. In local 1M-row runs, JS arrays are often faster for simple broad scans and callback predicates, while ColQL is strongest when compact storage, projection with limits, structured predicates, or explicit indexed lookups matter.
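The scan side of that tradeoff can be reproduced in isolation with plain TypeScript, independent of ColQL. This sketch only shows that both layouts must visit every row on a broad scan and produce identical results; the cost difference comes from how much memory each layout touches per row, and timings vary by machine:

```typescript
// Summing one numeric column from an object array vs a typed-array column.
// Same answer either way; the columnar layout reads 1 byte per row while
// the object array chases a pointer per row.

const N = 100_000;
const objectRows: { id: number; age: number }[] = [];
const ageColumn = new Uint8Array(N);

for (let i = 0; i < N; i++) {
  const age = i % 90;
  objectRows.push({ id: i, age });
  ageColumn[i] = age;
}

let objectSum = 0;
for (const row of objectRows) objectSum += row.age;

let columnSum = 0;
for (let i = 0; i < N; i++) columnSum += ageColumn[i];

console.log(objectSum === columnSum); // true: identical results either way
```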
Run the example locally:
cd examples/fastify-api
npm install
npm run test:large
For benchmark scripts and interpretation notes, see Performance and Benchmarks.
For PR-level regression checks, run npm run bench:codspeed; CodSpeed uses smaller deterministic datasets as regression signals, not absolute throughput claims. Setup is excluded where possible, while destructive mutation benchmarks that need fresh tables use setup-inclusive names. Larger 1M-row comparisons remain covered by the local/manual benchmark scripts.
For JS Array comparisons, run npm run benchmark:array-comparison; results are local guidance, not universal promises or CI requirements.
For a scenario-style local workload, run npm run benchmark:session-analytics.
When To Use ColQL
Use ColQL when:
- you need to keep thousands to millions of records in memory
- JavaScript object arrays use too much memory
- filters and aggregations should avoid intermediate arrays
- a TypeScript schema can describe your columns
- explicit indexes are acceptable for hot equality or range predicates
- stable identity can be modeled with an explicit ID column and unique index
- runtime validation matters because data may come from untyped sources
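The memory bullet above can be made concrete with a back-of-envelope estimate for the Quick Example schema. Column widths follow from the underlying typed arrays; the assumption that a dictionary code fits in one byte (for up to 256 distinct values) is ours, not a documented ColQL guarantee:

```typescript
// Rough columnar footprint per row for the Quick Example schema.

const bytesPerRow =
  4 +     // id: uint32
  1 +     // age: uint8
  1 +     // status: dictionary code (assumed 1 byte for <= 256 entries)
  8 +     // score: float64
  1 / 8;  // verified: one bit, bit-packed

const rowCount = 1_000_000;
const columnarMB = (bytesPerRow * rowCount) / (1024 * 1024);
console.log(bytesPerRow, columnarMB.toFixed(1)); // 14.125 13.5
```

The same million rows as plain JS objects cost substantially more, since each row pays object headers, property storage, and heap-allocated strings; the exact multiple depends on the V8 version and shapes involved.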
Avoid ColQL when:
- you need durable storage, transactions, joins, or SQL
- data must be shared across pods, workers, processes, or machines
- writes dominate the workload and broad indexes are frequently dirtied
- row indexes must be stable external identifiers
- a small/simple JavaScript array is already clear and fast enough
- every query requires arbitrary sorting or grouping
- you need concurrent writers or multi-process coordination
- you want automatic indexes, compound indexes, or query planning across tables
- you want analytical SQL over files or large columnar datasets, where DuckDB may be a better fit
Row indexes are physical positions and can change after deletes. Use an explicit id column for stable identity.
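The hazard behind that rule can be shown with a plain array, independent of ColQL: after a delete, a saved physical position points at the wrong row (or past the end), while a lookup by an explicit id column stays correct:

```typescript
// Physical positions are unstable across deletes; explicit ids are not.

const rows = [
  { id: 101, status: "active" },
  { id: 102, status: "archived" },
  { id: 103, status: "active" },
];

const savedPosition = 2;  // physical position of id 103 before the delete
rows.splice(1, 1);        // delete the row at physical index 1

console.log(rows[savedPosition]);            // undefined: stale position
console.log(rows.find((r) => r.id === 103)); // stable lookup by explicit id
```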
Decision Guide
| Tool | Good fit | Not a good fit |
|---|---|---|
| ColQL | Process-local TypeScript data that benefits from compact in-memory columns, explicit indexes, validation, and inspectable query plans | Persistence, SQL, joins, transactions, shared state, or distributed coordination |
| JavaScript arrays | Small or simple datasets, ad hoc transforms, or write-heavy logic where object arrays are already clear and fast enough | Memory-sensitive data, repeated projections, structured indexed lookups, or runtime schema validation |
| SQLite | Durable embedded relational storage with SQL, transactions, and indexes | Pure process-local ephemeral caches where a database file and SQL layer are unnecessary |
| DuckDB | Analytical SQL, file-based analytics, large columnar datasets, and ad hoc aggregations | TypeScript-first mutable process-local tables with explicit in-memory indexes |
Examples
The Fastify example demonstrates HTTP query params mapped to object predicates, range queries, filter(fn), updateMany, deleteMany, query diagnostics, index stats, and memory counters.
Documentation
Detailed documentation is available under docs/doc.
Recommended reading:
- Overview
- Installation
- Schema and Columns
- Querying
- Equality Indexes
- Sorted Indexes
- Unique Indexes
- Mutations
- Serialization
- Memory Model
- Limitations and Design Decisions
- API Reference
Common APIs
users.insert(row);
users.insertMany(rows);
users.where({ status: "active", age: { gte: 18 } }).toArray();
users.where("age", ">=", 18).select(["id"]).toArray();
users.whereIn("status", ["active", "passive"]);
users.filter((row) => row.score > 90);
users.where({ status: "active" }).explain();
users.count();
users.avg("age");
users.top(10, "score");
users.update(0, { status: "active" });
users.updateMany({ status: "passive" }, { status: "active" });
users.deleteMany({ status: "archived" });
users.createIndex("id");
users.createSortedIndex("age");
users.createUniqueIndex("id");
users.findBy("id", 123);
const buffer = users.serialize();
const restored = table.deserialize(buffer);
restored.createIndex("status");
filter(fn) is intentionally a full-scan escape hatch. Prefer structured predicates when you want index planning.
query.explain() returns structured diagnostics without executing the query, scanning rows, materializing rows, calling onQuery, or rebuilding dirty indexes.
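The distinction between filter(fn) and structured predicates comes down to whether an index can answer the predicate. A minimal sketch of an equality index over a plain array (an illustration of the technique, not ColQL's internals):

```typescript
type Row = { id: number; status: string };

const rows: Row[] = [
  { id: 1, status: "active" },
  { id: 2, status: "passive" },
  { id: 3, status: "active" },
];

// Equality index: value -> physical row positions. Built once, then
// consulted per query instead of scanning every row.
function buildIndex(data: Row[], key: keyof Row): Map<unknown, number[]> {
  const index = new Map<unknown, number[]>();
  data.forEach((row, pos) => {
    const bucket = index.get(row[key]) ?? [];
    bucket.push(pos);
    index.set(row[key], bucket);
  });
  return index;
}

const statusIndex = buildIndex(rows, "status");
const hits = (statusIndex.get("active") ?? []).map((pos) => rows[pos].id);
console.log(hits); // [1, 3]

// A callback predicate is opaque: the only correct plan is to run it
// against every row, which is why filter(fn) cannot use an index.
const scanned = rows.filter((r) => r.status === "active").map((r) => r.id);
console.log(scanned); // [1, 3], same result via full scan
```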
Error Handling
ColQL validates schemas, inserted rows, query predicates, mutation payloads, indexes, and serialized input at runtime. Failures throw ColQLError with a stable code, a message, and optional details.
import { ColQLError } from "@colql/colql";
try {
users.insert({ id: 4, age: 300, status: "active", score: 1, verified: true });
} catch (error) {
if (error instanceof ColQLError) {
console.log(error.code); // COLQL_OUT_OF_RANGE
}
}
Development
npm install
npm run check
npm test
npm run test:types
npm run build
npm run bench:codspeed
npm run benchmark:memory
npm run benchmark:query
npm run benchmark:indexed
npm run benchmark:range
npm run benchmark:optimizer
npm run benchmark:serialization
npm run benchmark:delete
npm run benchmark:array-comparison
npm run benchmark:session-analytics
Status
ColQL v0.5.x focuses on trust and stability hardening: narrower index invalidation after updates, stricter snapshot deserialization, a type-test gate for public TypeScript behavior, and clearer diagnostics. Breaking changes may still happen before 1.0.0, but the project is moving toward API stabilization.
Limitations
ColQL intentionally does not include SQL parsing, joins, transactions, concurrency control, automatic indexes, compound indexes, or durable storage. Equality and sorted indexes are derived performance structures; query results must be the same whether ColQL uses an index or a full scan. Unique indexes are derived too, but they also enforce uniqueness while present and are not serialized.
