lakeql

v0.1.7

Published

8 hours ago

lakeql: lightweight TypeScript query engine for Iceberg + Parquet on object storage


earonesty

lakehouse parquet iceberg query-engine object-storage

lakeql

Query Parquet and Iceberg tables directly from object storage, in TypeScript.

lakeql is small and dependency-light enough to run in Cloudflare Workers and other edge/serverless runtimes, where DuckDB-WASM or a JVM engine is too heavy. It streams data with HTTP range reads in bounded memory. It either reads a table correctly or rejects it with a typed error; it never returns silently wrong rows.
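To make the range-read idea concrete, here is a minimal sketch of how any reader can locate a Parquet footer without downloading the whole file. It relies only on the Parquet file layout (the file ends with a 4-byte little-endian footer length followed by the magic bytes `PAR1`). The names `suffixRange` and `locateFooter` are hypothetical and are not part of lakeql's API; this is not a description of lakeql's internals.

```typescript
// Illustrative sketch only: hypothetical helpers, not lakeql's API or internals.

// "PAR1" read as a little-endian uint32: 0x50 0x41 0x52 0x31.
const PARQUET_MAGIC_LE = 0x31524150;

interface FooterLocation {
  footerStart: number;  // absolute byte offset where the footer metadata begins
  footerLength: number; // footer metadata length in bytes
}

// HTTP Range header value requesting the last `n` bytes of an object.
function suffixRange(n: number): string {
  return `bytes=-${n}`;
}

// Parse the 8-byte file tail: 4-byte little-endian footer length, then "PAR1".
function locateFooter(tail: Uint8Array, fileSize: number): FooterLocation {
  if (tail.length !== 8) throw new Error("expected exactly 8 tail bytes");
  const view = new DataView(tail.buffer, tail.byteOffset, 8);
  if (view.getUint32(4, true) !== PARQUET_MAGIC_LE) {
    throw new Error("not a Parquet file (bad trailing magic)");
  }
  const footerLength = view.getUint32(0, true);
  const footerStart = fileSize - 8 - footerLength;
  // The file also starts with a 4-byte magic, so the footer cannot begin before offset 4.
  if (footerStart < 4) throw new Error("footer length exceeds file size");
  return { footerStart, footerLength };
}
```

A reader would request `suffixRange(8)` first, then issue a second range request for `footerStart` through `footerStart + footerLength - 1`. After that it fetches only the column chunks a query needs, so memory use tracks the size of the requested ranges rather than the size of the file.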

▶ Try it live in your browser: https://lakeql.com/

npm install lakeql

Quick start

Read a Parquet file over HTTP — no credentials, runs in Node or on the edge:

import { createLake, httpStore, gt } from "lakeql/node";

const lake = createLake({ store: httpStore({ baseUrl: "https://example.com/data" }) });

const rows = await lake
  .path("sales.parquet")
  .select(["store_id", "amount"])
  .where(gt("amount", 100))
  .limit(100)
  .toArray();

Inside a Cloudflare Worker, reading from R2:

import { createLake, r2Store } from "lakeql/cloudflare";

export default {
  async fetch(_req: Request, env: { DATA: R2Bucket }) {
    const lake = createLake({
      store: r2Store(env.DATA),
      budget: { maxOutputRows: 1000, maxConcurrentReads: 4 },
    });
    const rows = await lake.path("sales.parquet").limit(100).toArray();
    return Response.json(rows);
  },
};

Plan an Iceberg table (snapshot selection, partition- and delete-aware pruning):

import { loadIcebergTable, httpStore, eq } from "lakeql/node";

const table = await loadIcebergTable({
  store: httpStore({ baseUrl: "https://example.com/warehouse" }),
  metadataPath: "places/metadata/v2.metadata.json",
});

const plan = table.planFiles({ ref: "main", where: eq("country", "US") });
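As a rough illustration of what partition-aware pruning means for a filter like `eq("country", "US")`, the sketch below drops data files whose recorded partition value cannot match the predicate. The `DataFileEntry` type and `pruneByPartition` function are hypothetical and are not lakeql's plan format or API; real Iceberg planning also handles partition transforms, column statistics, and delete files.

```typescript
// Illustrative sketch only: hypothetical types, not lakeql's plan output.
interface DataFileEntry {
  path: string;
  // Partition values recorded for the file in the manifest (identity transforms assumed).
  partition: Record<string, string>;
}

// Keep only files that could contain rows where `column` equals `value`.
function pruneByPartition(
  files: DataFileEntry[],
  column: string,
  value: string,
): DataFileEntry[] {
  return files.filter((file) => {
    const partitionValue = file.partition[column];
    // If the table is not partitioned by this column, the file cannot be ruled out.
    return partitionValue === undefined || partitionValue === value;
  });
}

const candidates: DataFileEntry[] = [
  { path: "data/country=US/a.parquet", partition: { country: "US" } },
  { path: "data/country=CA/b.parquet", partition: { country: "CA" } },
  { path: "data/unpartitioned/c.parquet", partition: {} },
];

const kept = pruneByPartition(candidates, "country", "US").map((f) => f.path);
```

The rule is deliberately conservative: a file is skipped only when its metadata proves that no row can match, so pruning reduces I/O without changing query results.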

Entry points

| Import | Adds |
| --- | --- |
| lakeql | Core query engine, Parquet, Iceberg, and the unified loadTable / planFiles / scanRows / scanBatches helpers. |
| lakeql/node | Everything in lakeql, plus httpStore and s3Store. |
| lakeql/cloudflare | Everything in lakeql, plus r2Store. |

CLI

A global install adds a lakeql command for quick local queries:

npm install -g lakeql
lakeql query --path sales.parquet --sql "select region, sum(amount) as revenue from input group by region order by revenue desc"

What it supports

lakeql aims to read supported Parquet and Iceberg features correctly and reject unsupported table semantics explicitly. See the compatibility matrix and unsupported-but-detected. Object-store adapters (httpStore, s3Store, r2Store) use HTTP range reads by default; Iceberg writes are append-only.

Documentation

Full docs, recipes, and the engine contract live in the repository: querying Parquet · querying Iceberg · Cloudflare Workers · error codes · why not DuckDB-WASM?

License

MIT
