lakeql
v0.1.7
lakeql: lightweight TypeScript query engine for Iceberg + Parquet on object storage
Query Parquet and Iceberg tables directly from object storage, in TypeScript.
lakeql is small and dependency-light enough to run in Cloudflare Workers and other edge/serverless runtimes, where DuckDB-WASM or a JVM engine is too heavy. It streams with HTTP range reads and bounded memory, and either reads a table correctly or rejects it with a typed error — it won't return quietly-wrong rows.
▶ Try it live in your browser: https://lakeql.com/
```sh
npm install lakeql
```

Quick start
Read a Parquet file over HTTP — no credentials, runs in Node or on the edge:
```ts
import { createLake, httpStore, gt } from "lakeql/node";

const lake = createLake({ store: httpStore({ baseUrl: "https://example.com/data" }) });

const rows = await lake
  .path("sales.parquet")
  .select(["store_id", "amount"])
  .where(gt("amount", 100))
  .limit(100)
  .toArray();
```

Inside a Cloudflare Worker, reading from R2:
```ts
import { createLake, r2Store } from "lakeql/cloudflare";

export default {
  async fetch(_req: Request, env: { DATA: R2Bucket }) {
    const lake = createLake({
      store: r2Store(env.DATA),
      budget: { maxOutputRows: 1000, maxConcurrentReads: 4 },
    });
    const rows = await lake.path("sales.parquet").limit(100).toArray();
    return Response.json(rows);
  },
};
```

Plan an Iceberg table (snapshot selection, partition- and delete-aware pruning):
```ts
// httpStore is used below, so it is imported alongside loadIcebergTable and eq.
import { loadIcebergTable, httpStore, eq } from "lakeql/node";

const table = await loadIcebergTable({
  store: httpStore({ baseUrl: "https://example.com/warehouse" }),
  metadataPath: "places/metadata/v2.metadata.json",
});

const plan = table.planFiles({ ref: "main", where: eq("country", "US") });
```

Entry points
| Import | Adds |
| --- | --- |
| lakeql | Core query engine, Parquet, Iceberg, and the unified loadTable / planFiles / scanRows / scanBatches helpers. |
| lakeql/node | Everything in lakeql, plus httpStore and s3Store. |
| lakeql/cloudflare | Everything in lakeql, plus r2Store. |
CLI
A global install adds a lakeql command for quick local queries:
```sh
npm install -g lakeql
lakeql query --path sales.parquet --sql "select region, sum(amount) as revenue from input group by region order by revenue desc"
```

What it supports
lakeql aims to read supported Parquet and Iceberg features correctly and reject
unsupported table semantics explicitly. See the
compatibility matrix
and unsupported-but-detected.
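
To make "reject explicitly" concrete, here is a minimal sketch of the general pattern: check a table's declared features up front and throw an error carrying a machine-readable code, rather than scanning and skipping what the reader cannot apply. Everything in it is an assumption for illustration. `UnsupportedTableError`, `assertReadable`, the two codes, and the feature flags are hypothetical names, not lakeql's API; the real codes are in the error-code docs and the real feature coverage is in the compatibility matrix.

```typescript
// Hypothetical codes for illustration only; not lakeql's actual error codes.
type RejectCode = "UNSUPPORTED_FORMAT_VERSION" | "UNSUPPORTED_DELETE_FILES";

// An error type callers can branch on by `code` instead of parsing messages.
class UnsupportedTableError extends Error {
  readonly code: RejectCode;
  constructor(code: RejectCode, message: string) {
    super(message);
    this.name = "UnsupportedTableError";
    this.code = code;
  }
}

// Hypothetical summary of what a table's metadata declares.
interface TableFeatures {
  formatVersion: number;
  hasEqualityDeletes: boolean;
}

// Hypothetical summary of what a reader is able to apply.
interface ReaderSupport {
  maxFormatVersion: number;
  equalityDeletes: boolean;
}

// Throws before any rows are read if the table needs something the reader lacks.
// Ignoring delete files would make deleted rows reappear in the output, which is
// exactly the silently wrong result this check exists to prevent.
function assertReadable(table: TableFeatures, reader: ReaderSupport): void {
  if (table.formatVersion > reader.maxFormatVersion) {
    throw new UnsupportedTableError(
      "UNSUPPORTED_FORMAT_VERSION",
      `table format v${table.formatVersion} exceeds supported v${reader.maxFormatVersion}`,
    );
  }
  if (table.hasEqualityDeletes && !reader.equalityDeletes) {
    throw new UnsupportedTableError(
      "UNSUPPORTED_DELETE_FILES",
      "table has equality deletes, which this reader cannot apply",
    );
  }
}
```

A caller would wrap the read in `try`/`catch` and switch on `err.code`, for example to return a 4xx response for an unsupported table instead of a partial result.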
Object-store adapters (httpStore, s3Store, r2Store) use HTTP range reads by
default; Iceberg writes are append-only.
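
For readers unfamiliar with why range reads matter for Parquet, the sketch below shows the idea independently of lakeql. A Parquet file ends with its footer metadata, then a 4-byte little-endian footer length, then the magic bytes `PAR1`, so a reader can fetch the last 8 bytes, compute where the footer sits, and request only that range. The helper names (`locateParquetFooter`, `rangeHeader`, `fetchTrailer`) are hypothetical and this is not how lakeql's adapters are necessarily implemented.

```typescript
// Trailing layout of a Parquet file: [footer metadata][4-byte LE length]["PAR1"].
const PARQUET_MAGIC = [0x50, 0x41, 0x52, 0x31]; // "PAR1"
const TRAILER_BYTES = 8;

interface ByteRange {
  offset: number;
  length: number;
}

// Given the file's last 8 bytes and its total size, locate the footer metadata.
function locateParquetFooter(trailer: Uint8Array, fileSize: number): ByteRange {
  if (trailer.length !== TRAILER_BYTES) {
    throw new Error(`expected ${TRAILER_BYTES} trailer bytes, got ${trailer.length}`);
  }
  if (!PARQUET_MAGIC.every((b, i) => trailer[4 + i] === b)) {
    throw new Error("not a Parquet file: trailing magic is not PAR1");
  }
  const view = new DataView(trailer.buffer, trailer.byteOffset, trailer.byteLength);
  const length = view.getUint32(0, true); // little-endian footer length
  const offset = fileSize - TRAILER_BYTES - length;
  // The file also starts with a 4-byte magic, so the footer cannot begin before byte 4.
  if (length === 0 || offset < 4) {
    throw new Error("corrupt Parquet footer length");
  }
  return { offset, length };
}

// Format an HTTP Range header value; HTTP byte ranges are inclusive at both ends.
function rangeHeader(range: ByteRange): string {
  return `bytes=${range.offset}-${range.offset + range.length - 1}`;
}

// Fetch the last 8 bytes with a suffix range and read the total size from
// Content-Range (e.g. "bytes 992-999/1000"). Assumes the server honors Range.
async function fetchTrailer(url: string): Promise<{ trailer: Uint8Array; fileSize: number }> {
  const res = await fetch(url, { headers: { Range: `bytes=-${TRAILER_BYTES}` } });
  if (res.status !== 206) {
    throw new Error(`server did not return partial content (status ${res.status})`);
  }
  const fileSize = Number(res.headers.get("Content-Range")?.split("/")[1]);
  if (!Number.isFinite(fileSize)) {
    throw new Error("missing or unknown total size in Content-Range");
  }
  return { trailer: new Uint8Array(await res.arrayBuffer()), fileSize };
}
```

With the footer located, a second request using `rangeHeader(...)` retrieves just the metadata, and later requests can target individual column chunks. This is what keeps memory bounded regardless of file size.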
Documentation
Full docs, recipes, and the engine contract live in the repository: querying Parquet · querying Iceberg · Cloudflare Workers · error codes · why not DuckDB-WASM?
License
MIT
