rawtype

v1.0.3

Fast heuristic detection of text vs binary data using byte-level analysis.

rawtype

Rawtype is a lightweight, zero-dependency library for Node.js and browsers that performs a fundamental job: distinguishing binary data from text by analyzing raw bytes. It uses pragmatic heuristics—like scanning for null bytes and checking for common binary headers—to make a fast, informed guess. This is a heuristic, not a guaranteed fact, so results include a confidence score to help you assess reliability.


Quick Start

npm install rawtype

import fs from 'node:fs';
import { detect, isText } from 'rawtype';

// Detect with detailed results
const result = detect(fs.readFileSync('file.dat'));
console.log(result);
// { kind: 'binary', confidence: 0.99, sampleSize: 4096 }

// Simple boolean check
if (isText(userInput)) {
  console.log('Safe to process as a string.');
}

Why Use Rawtype?

It solves specific, common problems where simpler checks fail:

| Problem | Typical Solution | Why It Fails | Rawtype's Approach |
| :--- | :--- | :--- | :--- |
| Null bytes in text | Buffer.isUtf8() | Fails on UTF-16 or corrupted text. | Heavy weighting for null bytes, but allows for edge cases. |
| Unknown file type | MIME type from extension | Easily spoofed; unreliable for raw data. | Scans initial bytes for known binary signatures (magic numbers). |
| Large files | Read entire file | Memory intensive and slow. | Samples only the first 4KB by default (configurable). |
| Need for certainty | True/False guess | Lacks nuance for ambiguous data. | Provides a confidence score (0.0-1.0) with each result. |
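
To make the approaches in the table concrete, here is a minimal sketch of a byte-level heuristic that combines sampling, a magic-number check, and weighted null-byte counting. It is not rawtype's actual algorithm: the function name guessKind, the signature list, and the weights (50 and 5) are assumptions chosen for illustration.

```typescript
interface Guess {
  kind: 'text' | 'binary';
  confidence: number; // 0.0 to 1.0
  sampleSize: number;
}

// A few well-known file signatures (magic numbers); illustrative, not exhaustive.
const SIGNATURES: number[][] = [
  [0x89, 0x50, 0x4e, 0x47], // PNG
  [0xff, 0xd8, 0xff],       // JPEG
  [0x50, 0x4b, 0x03, 0x04], // ZIP
  [0x1f, 0x8b],             // gzip
];

function startsWith(bytes: Uint8Array, sig: number[]): boolean {
  return sig.length <= bytes.length && sig.every((b, i) => bytes[i] === b);
}

// Hypothetical helper: classify a byte sequence by inspecting only its prefix.
function guessKind(input: Uint8Array, maxSample = 4096): Guess {
  const sample = input.subarray(0, maxSample);
  const sampleSize = sample.length;

  // Assumption: empty input carries no evidence of binary content.
  if (sampleSize === 0) return { kind: 'text', confidence: 1, sampleSize };

  // A known signature at offset 0 is strong evidence of a binary format.
  if (SIGNATURES.some((sig) => startsWith(sample, sig))) {
    return { kind: 'binary', confidence: 0.99, sampleSize };
  }

  let nulls = 0;
  let controls = 0;
  for (let i = 0; i < sampleSize; i++) {
    const b = sample[i];
    if (b === 0) nulls++;
    // Control bytes other than tab, line feed, and carriage return.
    else if (b < 0x20 && b !== 0x09 && b !== 0x0a && b !== 0x0d) controls++;
  }

  // Null bytes weigh far more than other control bytes (assumed weights).
  const suspicion = Math.min(1, (nulls * 50 + controls * 5) / sampleSize);
  return suspicion >= 0.5
    ? { kind: 'binary', confidence: suspicion, sampleSize }
    : { kind: 'text', confidence: 1 - suspicion, sampleSize };
}
```

With these weights, a single stray null byte in a 4096-byte sample lowers the text confidence only slightly, while a sample made mostly of null or control bytes is classified as binary.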


API Reference

Core Function: detect(input, options?)

The primary function analyzes input and returns a result object.

Signature:

function detect(
  input: string | Buffer | Uint8Array | ArrayBuffer | number[],
  options?: {
    maxSample?: number;     // Bytes to sample (default: 4096)
    nullByteWeight?: number; // Penalty for null bytes (default: 50)
    textThreshold?: number;  // Confidence needed for "text" (default: 0.85)
  }
): DetectionResult

Result Object (DetectionResult):

{
  kind: 'text' | 'binary'; // The best-guess classification
  confidence: number;       // Certainty of the guess (0.0 to 1.0)
  sampleSize: number;       // How many bytes were actually analyzed
}

Understanding Confidence: The confidence score represents the algorithm's certainty in its kind classification. A high score (e.g., 0.98) indicates strong evidence, while a score near your textThreshold (default 0.85) suggests the data was ambiguous.
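
One way to act on this is to treat results close to the threshold as a third, "ambiguous" outcome. The sketch below assumes only the DetectionResult shape shown above; the triage function and its margin parameter are hypothetical and not part of rawtype.

```typescript
interface DetectionResult {
  kind: 'text' | 'binary';
  confidence: number;
  sampleSize: number;
}

type Triage = 'text' | 'binary' | 'ambiguous';

// Hypothetical helper: flag results whose confidence sits within `margin`
// of the text threshold instead of trusting the classification outright.
function triage(
  result: DetectionResult,
  textThreshold = 0.85,
  margin = 0.05,
): Triage {
  if (Math.abs(result.confidence - textThreshold) < margin) {
    return 'ambiguous';
  }
  return result.kind;
}
```

A caller could route 'ambiguous' inputs to a slower check, or sample more bytes via maxSample, rather than guessing.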

Helper Functions

  • isText(input, options?): boolean – Returns true if detect() returns kind: 'text' with confidence >= threshold.
  • isBinary(input, options?): boolean – Inverse of isText.

File Scanner (scanFile, scanDirectory)

Note: These are Node.js-only utilities.

import { scanFile, scanDirectory } from 'rawtype/file-scanner';

// Scan a single file
const fileResult = await scanFile('./data.bin');

// Recursively scan a directory
const dirResult = await scanDirectory('./user-uploads', {
  recursive: true,
  extensions: ['.dat', '.txt'],
  exclude: ['**/node_modules/**']
});

The scanDirectory function returns a DirectoryScanResult containing a summary and an array of FileScanResult objects for each file.
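
The fields of DirectoryScanResult and FileScanResult are not listed here, so the following is only a sketch of how per-file results might be rolled up into a summary. The FileScanEntry and ScanSummary shapes and the summarizeScan function are hypothetical stand-ins, not rawtype's types.

```typescript
// Hypothetical per-file shape; the real FileScanResult may differ.
interface FileScanEntry {
  path: string;
  kind: 'text' | 'binary';
  confidence: number;
}

// Hypothetical summary shape; the real DirectoryScanResult may differ.
interface ScanSummary {
  total: number;
  text: number;
  binary: number;
  needsReview: string[]; // paths whose confidence fell below the cutoff
}

function summarizeScan(entries: FileScanEntry[], minConfidence = 0.9): ScanSummary {
  const summary: ScanSummary = {
    total: entries.length,
    text: 0,
    binary: 0,
    needsReview: [],
  };
  for (const entry of entries) {
    summary[entry.kind] += 1; // count per classification
    if (entry.confidence < minConfidence) {
      summary.needsReview.push(entry.path);
    }
  }
  return summary;
}
```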


Integration Examples

1. HTTP API (Edge/Browser)

Use the built-in handler for HTTP detection endpoints.

// Example for Vercel/Cloudflare
import { rawtypeHandler } from 'rawtype/api-handler';
export default rawtypeHandler.fetch;

Endpoint: POST /detect
Accepts: text/plain, application/json, application/octet-stream
Returns: JSON with the DetectionResult.
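
A client call against this endpoint could look roughly like the sketch below. It assumes the handler is mounted at the root of some base URL; the buildDetectRequest helper and the https://example.com address are hypothetical. Only the POST /detect path and the content types come from the description above.

```typescript
interface DetectRequest {
  url: string;
  init: {
    method: 'POST';
    headers: Record<string, string>;
    body: Uint8Array | string;
  };
}

// Hypothetical helper: build the arguments for a fetch() call to POST /detect.
function buildDetectRequest(baseUrl: string, payload: Uint8Array | string): DetectRequest {
  const url = baseUrl.replace(/\/+$/, '') + '/detect'; // avoid a double slash
  const contentType =
    typeof payload === 'string' ? 'text/plain' : 'application/octet-stream';
  return {
    url,
    init: { method: 'POST', headers: { 'Content-Type': contentType }, body: payload },
  };
}

// Usage (requires a deployed endpoint, so it is left commented out):
// const { url, init } = buildDetectRequest('https://example.com', bytes);
// const response = await fetch(url, init);
// const result = await response.json(); // expected to be a DetectionResult
```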

2. Stream Processing

Process data from streams without buffering everything.

import { sampleStream } from 'rawtype/stream-helper';
import { detect } from 'rawtype';

// Sample from a fetch response stream
const response = await fetch('https://example.com/data');
const sample = await sampleStream(response.body, 8192);
const result = detect(sample);
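
How sampleStream works internally is not documented here. As a rough sketch of the general idea, the code below collects at most a fixed number of bytes from an async iterable of chunks and stops reading once it has enough. PrefixCollector and samplePrefix are hypothetical names, and the sketch assumes the source yields Uint8Array chunks (as Node.js readable streams and, in runtimes that support async iteration, fetch response bodies do).

```typescript
// Accumulates chunks until a byte limit is reached.
class PrefixCollector {
  private chunks: Uint8Array[] = [];
  private size = 0;
  private maxBytes: number;

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  // Adds a chunk (truncated if needed); returns true once the limit is reached.
  push(chunk: Uint8Array): boolean {
    const room = this.maxBytes - this.size;
    if (room > 0) {
      const part = chunk.slice(0, room); // copy, in case the source reuses buffers
      this.chunks.push(part);
      this.size += part.length;
    }
    return this.size >= this.maxBytes;
  }

  // Concatenates the collected chunks into one array.
  result(): Uint8Array {
    const out = new Uint8Array(this.size);
    let offset = 0;
    for (let i = 0; i < this.chunks.length; i++) {
      out.set(this.chunks[i], offset);
      offset += this.chunks[i].length;
    }
    return out;
  }
}

// Reads from the source only until maxBytes have been collected.
async function samplePrefix(
  source: AsyncIterable<Uint8Array>,
  maxBytes: number,
): Promise<Uint8Array> {
  const collector = new PrefixCollector(maxBytes);
  for await (const chunk of source) {
    if (collector.push(chunk)) break; // enough bytes in hand, stop early
  }
  return collector.result();
}
```

Breaking out of the for await loop signals the iterator to finish, which lets the underlying stream release its resources instead of downloading or reading the rest of the data.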

3. Binary Data Parsing Pipeline

You can integrate rawtype as a first step before detailed parsing.

import { createReadStream } from 'fs';
import { detect } from 'rawtype';
import { sampleStream } from 'rawtype/stream-helper';
import { Parser } from 'my-binary-parser'; // Your parser

async function processFile(path: string) {
  const stream = createReadStream(path);
  const sample = await sampleStream(stream, 4096);
  const { kind, confidence } = detect(sample);

  if (kind === 'binary' && confidence > 0.9) {
    const parser = new Parser();
    // ... safe to proceed with binary parsing
  } else {
    // Process as text or log ambiguity
  }
}

License and Compliance

Important: Rawtype is licensed under the GNU General Public License v3.0 (GPLv3).

What This Means for You:

  • You can use, modify, and distribute rawtype freely.
  • If you distribute a modified version of rawtype, you must license it under the GPLv3 and make the corresponding source code available to everyone who receives it.
  • If you distribute a larger application that includes rawtype as part of it, the entire combined work may need to be licensed under GPLv3. This is a requirement of the license's "copyleft" provision.
  • The software is provided without any warranty.

Considerations:

  • For internal use where software is not distributed, GPLv3 requirements generally do not apply.
  • If you intend to use rawtype in a proprietary, closed-source product, the GPLv3 license may not be compatible with your goals. You may need to seek a different license from the copyright holder or choose an alternative library with a more permissive license (like MIT or Apache 2.0).

For full legal details, please read the GPLv3 license text and consult with a legal professional if you have specific questions.


Design Philosophy

Rawtype adheres to a strict, minimalistic design:

  • Zero Dependencies: To keep it lightweight and secure.
  • Single Responsibility: It detects binary vs. text—it does not decode encodings, validate MIME types, or parse formats.
  • Practical Heuristics: Uses byte inspection, null-byte detection, and magic number checks for a balance of speed and accuracy.
  • Honest Results: Provides a confidence score to communicate uncertainty, not just a binary guess.