npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

csv-super

v1.0.0

Published

Read a 10GB CSV file without crashing RAM — Streaming CSV parser with fixed 50MB memory footprint

Readme

csv-super ⚡

Read a 10GB CSV file without crashing RAM.
Fixed ~50MB memory footprint. RFC 4180 compliant. Zero dependencies.

npm version npm downloads GitHub stars TypeScript License: MIT Node.js Zero deps


The Real Problem

// ❌ This CRASHES on a 10GB file — allocates 16+ GB of RAM
const results = [];
fs.createReadStream('data.csv')
  .pipe(csvParser())
  .on('data', row => results.push(row));   // ← ALL rows land in heap

Why it crashes: csv-parser, fast-csv, and papaparse emit rows via EventEmitter 'data' events. With no backpressure applied, the stream reads at disk speed (~500 MB/s) and piles all rows into an array. By the time a 10GB file finishes, the V8 heap holds 15–30GB of objects (JSON has 1.5–3× overhead).

The Solution

import { csvSuper } from 'csv-super';

// ✅ RAM stays at ~50MB regardless of file size
for await (const { rows, totalSoFar } of csvSuper('data.csv', { batch: 1000 })) {
  await db.insertMany(rows);                // process 1000 rows at a time
  console.log(`Processed: ${totalSoFar}`);  // cumulative counter built-in
}

Why it works: csv-super uses an async function* (Async Generator). Each yield suspends the generator until the consumer's await resolves. While suspended, fs.createReadStream is automatically paused — no data accumulates in the heap. This is native JavaScript backpressure.


Performance Comparison

| Metric | csv-parser (traditional) | csv-super | |-------------------|--------------------------|------------------------| | 1GB file | ~2GB RAM | ~50MB RAM | | 10GB file | 💀 OOM crash | ~50MB RAM | | First row delay | After full file read | Within seconds | | TypeScript | Partial | 100% typed | | RFC 4180 | Partial | Full compliance | | Dependencies | 2 | 0 | | Node.js ≥ 18 | ✅ | ✅ |


Installation

npm install csv-super

Requirements: Node.js ≥ 18.0.0 (uses native fs.createReadStream + Async Generators)


Quick Start

import { csvSuper } from 'csv-super';

// Basic: 1000 rows per batch (default)
for await (const batch of csvSuper('sales.csv')) {
  console.log(batch.rows);      // CsvRow[] = Record<string, string>[]
  console.log(batch.batchIndex); // 0, 1, 2, ...
  console.log(batch.count);      // rows in this batch (≤ 1000)
  console.log(batch.totalSoFar); // cumulative rows processed
}

API Reference

csvSuper(filePath, options?)AsyncGenerator<BatchResult>

import { csvSuper } from 'csv-super';
import type { CsvSuperOptions, BatchResult } from 'csv-super';

Options

| Option | Type | Default | Description | |-----------------|-----------------------------------------|--------------|------------------------------------------| | batch | number | 1000 | Rows per yielded batch (1–100000) | | delimiter | string | ',' | Field separator (single char) | | quote | string | '"' | Quote character | | escape | string | '"' | Escape char (RFC 4180: same as quote) | | headers | boolean | true | First row = column names | | skipEmptyLines| boolean | true | Skip blank lines | | encoding | 'utf8' \| 'utf16le' \| 'latin1' \| 'auto' | 'auto' | File encoding (auto = BOM detection) | | chunkSize | number | 65536 | Read buffer size in bytes (≥ 1024) | | onProgress | (info: ProgressInfo) => void | null | Progress callback |

BatchResult

interface BatchResult {
  rows:       CsvRow[];    // The parsed rows
  batchIndex: number;      // 0-based batch counter
  count:      number;      // rows.length (convenience)
  totalSoFar: number;      // cumulative rows across all batches
}

ProgressInfo

interface ProgressInfo {
  bytesRead:             number;  // bytes consumed so far
  totalBytes:            number;  // total file size
  percentage:            number;  // 0.0–100.0
  speedMBps:             number;  // current read speed (sliding window)
  estimatedSecondsLeft:  number;  // ETA in seconds
  rowsProcessed:         number;  // rows parsed so far
}

Examples

Insert into Database

import { csvSuper } from 'csv-super';

for await (const batch of csvSuper('customers.csv', { batch: 5000 })) {
  await db.customers.insertMany(batch.rows);
  console.log(`✅ Inserted ${batch.totalSoFar} customers`);
}

Progress Bar

for await (const batch of csvSuper('large.csv', {
  batch: 5000,
  onProgress: ({ percentage, speedMBps, estimatedSecondsLeft }) => {
    const bar = '█'.repeat(Math.floor(percentage / 5)).padEnd(20, '░');
    process.stdout.write(
      `\r[${bar}] ${percentage.toFixed(1)}% @ ${speedMBps.toFixed(1)} MB/s`
    );
  },
})) {
  await processRows(batch.rows);
}

TSV Files

for await (const batch of csvSuper('data.tsv', { delimiter: '\t' })) {
  // ...
}

Early Termination (stream closes automatically, no leak)

for await (const batch of csvSuper('events.csv')) {
  for (const row of batch.rows) {
    if (row.severity === 'CRITICAL') {
      console.log('Critical event found:', row);
      return; // or break — stream closes cleanly
    }
  }
}

No Headers (index-based access)

for await (const batch of csvSuper('raw.csv', { headers: false })) {
  // rows: { '0': 'value1', '1': 'value2', ... }
  console.log(batch.rows[0]?.['0']);
}

Pro Features ($17/month)

For enterprise workloads that process CSV files daily, csv-super Pro adds:

Multi-Thread Processing (Worker Threads)

import { csvSuperPro } from 'csv-super';

for await (const batch of csvSuperPro('huge-file.csv', {
  licenseKey: process.env.CSV_SUPER_KEY,
  threads: 8,         // Use 8 CPU cores in parallel
  batch: 10_000,
})) {
  await db.insertMany(batch.rows);
}

Speed: ~N× faster on N-core machines. A file that takes 60s on 1 core takes ~8s on 8 cores.

Transform Pipeline

import { csvSuperPro, TransformPipeline } from 'csv-super';

const pipeline = new TransformPipeline()
  .filter(row => row.status === 'active')          // Filter rows
  .trim()                                           // Trim all fields
  .select(['id', 'name', 'email', 'salary'])        // Select columns
  .rename({ 'id': 'employee_id' })                  // Rename columns
  .mapField('salary', v => String(parseInt(v, 10))) // Type coerce
  .pipe(async row => {                              // Async enrichment
    const dept = await getDept(row.employee_id);
    return { ...row, department: dept };
  });

for await (const batch of csvSuperPro('employees.csv', {
  licenseKey: process.env.CSV_SUPER_KEY,
  transform: pipeline.toFn(),
})) {
  await db.employees.insertMany(batch.rows);
}

Pro Pricing

| Plan | Price | Features | |------------|------------|------------------------------------------------------| | Free | $0 forever | Streaming + Batch + TypeScript + RFC 4180 + Progress | | Pro | $17/month | + Multi-thread + Transform + Priority support | | Enterprise | Custom | + SLA + Custom seat count + Dedicated support |

Get Pro →


Architecture

I/O Layer           Parser Layer           Delivery Layer
──────────          ──────────────         ───────────────
fs.createReadStream → CsvParser (FSM)  →  BatchController
     ↓                    ↓                     ↓
  64KB chunks      RFC 4180 State Machine   Async Generator
  (backpressure)   (incremental, chunked)   (yield + await)

Key insight: When the consumer awaits inside for await...of, the Async Generator suspends at the yield. While suspended, getNextChunk() is not called, so readStream.resume() is never called — the stream stays paused. This is mechanical, guaranteed backpressure.

Memory formula:

heap ≈ chunkSize (64KB) + batch_size × avg_row_size
     ≈ 0.064MB  +  1000  ×  0.05MB
     ≈ ~50MB (constant regardless of file size)

RFC 4180 Compliance

Full support for all CSV edge cases:

✅ Fields with commas:    "Smith, John"
✅ Fields with newlines:  "line1\nline2"
✅ Escaped quotes (x2):  "He said ""hi"""
✅ Empty fields:          a,,b
✅ CRLF line endings:     \r\n
✅ No trailing newline    (handled by finalize())
✅ Unicode content:       UTF-8, UTF-16 LE/BE
✅ Custom delimiters:     TSV (\t), PSV (|), SSV (;)

Error Handling

import { csvSuper, CsvSuperError, ParseError } from 'csv-super';

try {
  for await (const batch of csvSuper('data.csv')) {
    await processRows(batch.rows);
  }
} catch (err) {
  if (err instanceof ParseError) {
    console.error(`Parse error at line ${err.lineNumber}: ${err.message}`);
  } else if (err instanceof CsvSuperError) {
    console.error(`csv-super error [${err.code}]: ${err.message}`);
  } else {
    throw err; // re-throw unexpected errors
  }
}

TypeScript

Fully typed out of the box. No @types package needed.

import type {
  CsvRow,           // Record<string, string>
  BatchResult,      // { rows, batchIndex, count, totalSoFar }
  ProgressInfo,     // { percentage, speedMBps, ... }
  CsvSuperOptions,  // full options type
  TransformFn,      // (row: CsvRow) => CsvRow | null | Promise<CsvRow | null>
} from 'csv-super';

Contributing

Issues and PRs are welcome at github.com/csv-super/csv-super.

git clone https://github.com/Brah-Timo/csv-super
cd csv-super
npm install
npm test
npm run bench

License

Core (free tier): MIT License
Pro (multi-thread + transform): Commercial License — see csv-super.dev/pro


Built with ❤️ for data engineers who have felt the pain of a 10GB CSV on a 16GB server.