
xlsx-stream-rows

v1.1.0

Zero-dependency streaming XLSX/CSV/XLS reader for the browser. Bounded memory regardless of file size.

xlsx-stream-rows

Streaming, chunked, low-memory spreadsheet reader for the browser. Reads XLSX, CSV, and XLS files row-by-row without loading the entire file into memory — a 1 GB workbook reads in roughly the same memory envelope as a 1 MB one.

Zero runtime dependencies. TypeScript. ESM + CJS. Works in browsers and Web Workers.

npm install xlsx-stream-rows

Why

Most browser-side spreadsheet libraries materialise the entire file before yielding a single row. A 300 MB XLSX usually peaks at 1–2 GB of JS heap and crashes the tab. xlsx-stream-rows treats File as a handle, not a byte array: it reads only the ZIP Central Directory at the end of the archive, then pipes the target sheet through DecompressionStream into an incremental SAX parser, emitting rows lazily, on demand.

If you've been searching for terms like streaming xlsx parser, chunked spreadsheet reader, lazy xlsx, partial xlsx parsing, incremental xlsx reader, async iterator over xlsx rows, row-by-row xlsx, on-demand spreadsheet loading, memory-bounded xlsx, or read large xlsx in browser without OOM — that's what this library does.


Features

  • True streaming. AsyncIterable<Row> — pull rows on demand, stop whenever. No batch parse, no buffered intermediate result.
  • Bounded memory. Peak heap is proportional to the data you consume, not the file size. Measured on a 1,000,000-row XLSX (51 MiB uncompressed sheet, 7.2 MiB on disk): 1.5 MiB peak heap growth, ~34× smaller than the sheet (tests/memory.smoke.test.ts).
  • Three stop mechanisms, all equivalent. maxRows, break out of for await, or AbortSignal. Pipeline tears down promptly: no further bytes fetched, decompressed, or parsed.
  • Format coverage. XLSX (streaming), CSV (streaming), XLS (delegated to optional xlsx peer dep, bounded by xlsMaxBytes).
  • Auto-detect. ZIP / OLE2 magic-byte sniff with filename extension as fallback.
  • Standards-grounded. Every byte offset and XML path traces to PKWARE APPNOTE.TXT (ZIP) or ECMA-376 (OOXML), not reverse-engineered from other libraries.
  • OPC-correct. Resolves package parts via _rels/.rels indirection rather than hardcoded paths, so files from LibreOffice, Google Sheets, or custom generators work too.
  • Strict + transitional schemas. Both SpreadsheetML namespaces accepted. All ST_CellType values handled (n, s, str, inlineStr, b, e, d).
  • Worker-safe. No DOM APIs. Runs inside a Web Worker.
  • Zero dependencies for XLSX/CSV. Web Platform APIs only (File, Blob, DecompressionStream, TextDecoderStream).
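
The auto-detect sniff above can be sketched in a few lines (sniffFormat is a hypothetical helper for illustration, not an export of this library):

```typescript
// Sketch of the magic-byte sniff from the "Auto-detect" bullet: ZIP
// archives begin with "PK" (0x50 0x4b); OLE2/CFB containers begin with
// the 8-byte signature d0 cf 11 e0 a1 b1 1a e1. Extension is the fallback.
const OLE2_MAGIC = [0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1];

function sniffFormat(head: Uint8Array, filename: string): 'xlsx' | 'xls' | 'csv' {
  if (head[0] === 0x50 && head[1] === 0x4b) return 'xlsx';
  if (OLE2_MAGIC.every((b, i) => head[i] === b)) return 'xls';
  const name = filename.toLowerCase();
  if (name.endsWith('.xls')) return 'xls';
  if (name.endsWith('.xlsx') || name.endsWith('.xlsm')) return 'xlsx';
  return 'csv';
}
```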

Quick start

import { openWorkbook, streamRows, readRows } from 'xlsx-stream-rows';

// 1. List sheets without reading row data (≈ 100 KiB read regardless of file size).
const info = await openWorkbook(file);
console.log(info.sheetNames, info.format); // ['Sheet1', 'Data'], 'xlsx'

// 2. Stream rows one at a time. Memory stays bounded for files of any size.
for await (const row of streamRows(file, { maxRows: 100 })) {
  console.log(row); // (string | number | boolean | Date | null)[]
}

// 3. Convenience: collect into an array.
const rows = await readRows(file, { sheetName: 'Data', maxRows: 1000 });

Bounded read — preview the first N rows of a huge file

// Reads only the bytes needed for the first 50 rows. The decompression
// stream is cancelled as soon as the limit is hit; nothing past that point
// is fetched from the file.
const preview = await readRows(file, { maxRows: 50 });

Stop on a predicate

// `break` cancels the underlying pipeline — no further bytes are read.
for await (const row of streamRows(file)) {
  if (row[0] === 'END') break;
  process(row);
}

Cancel from the outside

const ac = new AbortController();
cancelButton.onclick = () => ac.abort();

try {
  for await (const row of streamRows(file, { signal: ac.signal })) {
    insertIntoTable(row);
  }
} catch (e) {
  if (e === ac.signal.reason) console.log('user cancelled');
  else throw e;
}

AbortSignal cancels in-flight metadata fetches too — you can abort before the first row is yielded (e.g. while sharedStrings.xml is still downloading).

Progress reporting

The pull-based design makes progress trivial — count rows yourself, or report file-position progress via your UI:

let count = 0;
const total = info.estimatedRowCount; // your own estimate, e.g. file.size / 100
for await (const row of streamRows(file)) {
  count++;
  if (count % 1000 === 0) updateProgress(count, total);
  await processRow(row);
}

For byte-level progress you can wrap the input file with a Proxy over slice() that reports cumulative bytes; ask in an issue and we'll add an example.

Run inside a Web Worker

The library has no DOM dependencies, so move heavy spreadsheet imports off the main thread:

// worker.ts
import { streamRows } from 'xlsx-stream-rows';

self.onmessage = async (e: MessageEvent<File>) => {
  const file = e.data;
  for await (const row of streamRows(file, { maxRows: 10_000 })) {
    self.postMessage({ type: 'row', row });
  }
  self.postMessage({ type: 'done' });
};
// main.ts
const worker = new Worker(new URL('./worker.ts', import.meta.url), { type: 'module' });
worker.onmessage = (e) => { /* … */ };
worker.postMessage(file); // File is structured-cloneable

Pick a sheet by name

const info = await openWorkbook(file);
console.log(info.sheetNames); // ['Summary', 'Q1', 'Q2', 'Q3', 'Q4']

for await (const row of streamRows(file, { sheetName: 'Q3' })) {
  // …
}

Date detection (XLSX)

XLSX stores dates as numbers styled with a date format. By default, numeric cells whose style references a date numFmtId (built-in 14–22, 27–36, 45–47, 50–58, plus custom <numFmt> containing date tokens) are returned as Date. Disable this if you want raw serial numbers:

const rows = await readRows(file, { parseDates: false }); // 44197 instead of new Date(2021, 0, 1)
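
For reference, the rules above as plain functions: hypothetical helpers that mirror the quoted numFmtId ranges and the standard serial-date epoch (the conversion here is UTC-based; the library's own conversion may differ in time-zone handling):

```typescript
// Built-in date numFmtId ranges listed above (custom <numFmt> token
// matching omitted for brevity).
const isBuiltInDateFormat = (id: number): boolean =>
  (id >= 14 && id <= 22) || (id >= 27 && id <= 36) ||
  (id >= 45 && id <= 47) || (id >= 50 && id <= 58);

// With parseDates: false you get raw serials: days since 1899-12-30,
// the epoch that absorbs Excel's fictitious 1900-02-29.
const excelSerialToDate = (serial: number): Date =>
  new Date(Date.UTC(1899, 11, 30) + serial * 86_400_000);
```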

CSV with non-UTF encoding

UTF-8, UTF-16 LE, and UTF-16 BE BOMs are auto-detected. For other encodings (e.g. Windows-1251 / cp1251), pass csvEncoding:

const rows = await readRows(file, { csvEncoding: 'windows-1251' });

Format support

| Format | Streaming | Memory peak | Optional dependency |
|--------|-----------|-------------|---------------------|
| XLSX / XLSM | yes | ≈ sharedStrings size + few MiB | none |
| CSV | yes | ≈ one row + decoder window | none |
| XLS | no — full file load | ≤ xlsMaxBytes (default 50 MiB) | xlsx peer dep |

Install xlsx only if you need XLS support:

npm install xlsx

API

type CellValue = string | number | boolean | Date | null;
type Row = CellValue[];

interface WorkbookInfo {
  filename: string;
  sheetNames: string[];
  format: 'xlsx' | 'xls' | 'csv';
}

interface ReadOptions {
  /** Stop after yielding this many rows. */
  maxRows?: number;
  /** Sheet to read. Default: first sheet. Ignored for CSV. */
  sheetName?: string;
  /** XLSX/XLS only: convert numeric date-styled cells to `Date`. Default true. */
  parseDates?: boolean;
  /** XLSX only: cap on `xl/sharedStrings.xml` uncompressed size. Default 64 MiB. */
  sharedStringsMaxBytes?: number;
  /** XLS only: cap on the file size loaded into memory. Default 50 MiB. */
  xlsMaxBytes?: number;
  /** CSV only: text encoding. UTF-8/16 BOMs are auto-detected. Default 'utf-8'. */
  csvEncoding?: string;
  /** Cancel the read at any point — including before the first row is yielded. */
  signal?: AbortSignal;
}

function openWorkbook(file: File): Promise<WorkbookInfo>;
function streamRows(file: File, options?: ReadOptions): AsyncIterable<Row>;
function readRows(file: File, options?: ReadOptions): Promise<Row[]>;

Per-format adapters (openXlsxWorkbook, streamXlsxRows, openCsvWorkbook, streamCsvRows, openXlsWorkbook, streamXlsRows) and lower-level building blocks (readZipEntries, createRowParser, createCsvParser, resolvePackagePaths, …) are also exported for power users.

Errors

All errors inherit from XlsxStreamError so callers can catch the family with one instanceof check.

| Error | When |
|-------|------|
| NotAZipError | EOCD signature not found within 65,557 bytes of EOF |
| Zip64NotSupportedError | File uses ZIP64 extensions (>4 GiB / >65,535 entries) |
| InvalidLocalHeaderError | LFH signature mismatch — file corrupted |
| UnsupportedCompressionError | ZIP method is not 0 (stored) or 8 (deflate) |
| InvalidOpcPackageError | Missing _rels/.rels or officeDocument relationship |
| SharedStringsTooLargeError | sharedStrings.xml exceeds sharedStringsMaxBytes |
| SheetNotFoundError | Requested sheetName not in workbook |
| EntryTooLargeError | Bounded entry read exceeded its uncompressed cap |
| XlsFileTooLargeError | XLS file exceeds xlsMaxBytes |
| XlsxPackageMissingError | XLS read attempted without the xlsx peer dep installed |
| FormatNotSupportedError | Magic + extension both unrecognised |

Pipeline cancellation (maxRows hit, iterator return(), AbortSignal) is not an error — the iterator simply ends, or rejects with signal.reason.
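
The single-instanceof pattern looks like this; the error classes below are local stand-ins mirroring the hierarchy described above, so the snippet is self-contained (in real code you would import them from the package, assuming it exports them):

```typescript
// Local stand-ins for the hierarchy: every error extends XlsxStreamError.
class XlsxStreamError extends Error {}
class SheetNotFoundError extends XlsxStreamError {}
class NotAZipError extends XlsxStreamError {}

// One instanceof check catches the whole family; anything else rethrows.
function describeReadError(e: unknown): string {
  if (e instanceof SheetNotFoundError) return 'sheet not found';
  if (e instanceof XlsxStreamError) return `spreadsheet error: ${e.message}`;
  throw e;
}
```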


How it works

XLSX files are ZIP archives of XML parts. The Central Directory at the end of the archive lists every entry's offset and size, so:

  1. Fetch the trailing 64 KiB of the file → locate the EOCD record → read the Central Directory.
  2. Resolve the workbook part via OPC relationships (_rels/.rels → xl/_rels/workbook.xml.rels).
  3. Open a ReadableStream over the target sheet's compressed bytes (Blob.slice().stream()), pipe through DecompressionStream('deflate-raw') and TextDecoderStream, feed the result to a hand-rolled SAX state machine that emits rows as </row> closes.
  4. Cancelling the iterator (maxRows, break, AbortSignal) cancels the reader, which propagates up the pipeline — no more bytes are fetched.
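
Step 1 can be made concrete with a minimal EOCD scan, a sketch based on the APPNOTE layout rather than the library's actual code:

```typescript
// Scan the file's trailing bytes backwards for the End of Central
// Directory signature 0x06054b50. The record is 22 bytes plus a comment
// of at most 65,535 bytes, so it must start within the last 65,557 bytes.
function findEocdOffset(tail: Uint8Array): number {
  for (let i = tail.length - 22; i >= 0; i--) {
    if (tail[i] === 0x50 && tail[i + 1] === 0x4b &&
        tail[i + 2] === 0x05 && tail[i + 3] === 0x06) {
      // Sanity check: the comment-length field must reach exactly to EOF,
      // which filters out signature bytes occurring inside the comment.
      const commentLen = tail[i + 20] | (tail[i + 21] << 8);
      if (i + 22 + commentLen === tail.length) return i;
    }
  }
  return -1; // caller would raise NotAZipError here
}
```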

For CSV: same shape, simpler — file.stream() → TextDecoderStream → RFC-4180 parser.
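
A toy version of such an incremental parser (quoted fields, doubled-quote escapes, CRLF), illustrative only and not the library's implementation:

```typescript
type CsvState = 'field' | 'quoted' | 'afterQuote';

// Feed decoded text chunks; completed rows come back, partial rows
// survive across chunk boundaries. Call flush() once at EOF.
function createCsvRowParser() {
  let state: CsvState = 'field';
  let field = '';
  let row: string[] = [];
  return {
    write(chunk: string): string[][] {
      const rows: string[][] = [];
      const endField = () => { row.push(field); field = ''; };
      const endRow = () => { endField(); rows.push(row); row = []; };
      for (const ch of chunk) {
        if (state === 'quoted') {
          if (ch === '"') state = 'afterQuote';
          else field += ch;
        } else if (state === 'afterQuote') {
          if (ch === '"') { field += '"'; state = 'quoted'; } // "" escape
          else if (ch === ',') { endField(); state = 'field'; }
          else if (ch === '\n') { endRow(); state = 'field'; }
          else if (ch !== '\r') { field += ch; state = 'field'; } // lenient
        } else {
          if (ch === '"' && field === '') state = 'quoted';
          else if (ch === ',') endField();
          else if (ch === '\n') endRow();
          else if (ch !== '\r') field += ch;
        }
      }
      return rows;
    },
    flush(): string[][] {
      if (field === '' && row.length === 0) return [];
      const last = [...row, field];
      field = ''; row = [];
      return [last];
    },
  };
}
```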

For XLS: no streaming primitive exists in the BIFF/OLE2 format, so we delegate to the xlsx package and bound the file size to keep memory predictable.


Limitations

  • No ZIP64. Files with the Central Directory or any individual entry over 4 GiB, or with more than 65,535 entries, are rejected with Zip64NotSupportedError. In practice you can comfortably read 5–10 GB of sheet data; multi-GB compressed XLSX files with monster sharedStrings tables are the edge case to watch.
  • No formula evaluation. Formula cells return the cached <v> written by the producer; if absent, null. We do not re-evaluate.
  • No merged-cell expansion. Cells appear exactly where the XML places them.
  • CSV is comma-only. Semicolon and tab dialects are out of scope — preprocess if needed.
  • No password-protected files.

Standards

  • PKWARE APPNOTE.TXT (ZIP format: EOCD record, Central Directory, compression methods)
  • ECMA-376 (Office Open XML: SpreadsheetML, OPC packaging)
  • RFC 4180 (CSV)

Try it in your browser

The repo includes a self-contained playground:

git clone https://github.com/gudoshnikovn/xlsx-stream-rows
cd xlsx-stream-rows
npm install && npm run build
npx serve .   # or any static server
# open http://localhost:3000/examples/playground.html

Drop a real spreadsheet in, set maxRows, watch it stream.

Development

npm install
npm test            # Node test suite (≈ 150 tests, including fuzz)
npm run test:browser  # same suite in real Chromium / Firefox / WebKit via Playwright
npm run test:memory   # 1M-row memory smoke (set --pool=forks for forced GC)
npm run build         # tsup → dist/ (ESM + CJS + d.ts)
npm run typecheck

Requires Node 20+ for File, Blob.stream(), and DecompressionStream globals. CI runs on Node 20 / 22 plus all three browsers.


License

MIT