scout-text-chunker

v0.1.0

Scout Text Chunker provides text chunking strategies for RAG pipelines.

Scout Text Chunker SDK

Scout Text Chunker is a unified TypeScript toolkit for splitting long-form content into semantically coherent "chunks" that are easy to embed, store, and retrieve in Retrieval-Augmented Generation (RAG) pipelines. It exposes a consistent interface across fixed, recursive (structural), semantic, hybrid, topic-aware, and sliding-window chunkers, along with pluggable embedding providers.

Key features

  • One interface, many strategies – switch between fixed, recursive, semantic, hybrid, topic, and sliding-window chunkers with a single factory function.
  • Bring your own embeddings – use the built-in embedders or provide a custom implementation that matches your infrastructure.
  • Built for RAG systems – every chunk includes token counts, parent/child relationships, and metadata hooks for downstream retrieval engines.
  • Framework agnostic – works in Node.js, serverless runtimes, and modern build systems thanks to the ESM/CJS bundles included in the package.

Installation

npm install scout-text-chunker

Quick start

import { createChunker, OpenAIEmbedder } from "scout-text-chunker";

type Input = {
  title: string;
  body: string;
};

const article: Input = {
  title: "Scaling Support Documentation",
  body: "Scout Text Chunker helps teams split knowledge base articles into reusable units..."
};

const chunker = createChunker({
  type: "semantic",
  embedder: new OpenAIEmbedder({ apiKey: process.env.OPENAI_API_KEY! })
});

const chunks = await chunker.chunk(article.body);
console.log(chunks[0]);

Each chunk contains the original text, token counts, and optional metadata describing its origin. You can persist this output in your vector store of choice.
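
As a rough illustration of persisting chunks, the sketch below pairs each chunk with an embedding vector and builds records in a generic shape that many vector stores accept in some form. The ChunkLike and VectorRecord interfaces, the toVectorRecords helper, and the field names text, tokenCount, and metadata are assumptions made for this example; check the package's published type declarations for the actual chunk shape.

```typescript
// Hypothetical chunk shape; the real field names in scout-text-chunker may differ.
interface ChunkLike {
  text: string;
  tokenCount: number;
  metadata?: Record<string, unknown>;
}

// Generic record shape (id, vector, payload); adapt it to your vector store.
interface VectorRecord {
  id: string;
  vector: number[];
  payload: Record<string, unknown>;
}

// Pair each chunk with its embedding and give it a stable, position-based id.
function toVectorRecords(
  documentId: string,
  chunks: ChunkLike[],
  embeddings: number[][]
): VectorRecord[] {
  if (chunks.length !== embeddings.length) {
    throw new Error("Expected exactly one embedding per chunk");
  }
  return chunks.map((chunk, index) => ({
    id: `${documentId}#${index}`,
    vector: embeddings[index],
    // Chunk metadata is copied first so text and tokenCount always win.
    payload: { ...chunk.metadata, text: chunk.text, tokenCount: chunk.tokenCount },
  }));
}

const exampleRecords = toVectorRecords(
  "support-doc-42",
  [
    { text: "First chunk.", tokenCount: 3, metadata: { source: "Scaling Support Documentation" } },
    { text: "Second chunk.", tokenCount: 3 },
  ],
  [
    [0.1, 0.2],
    [0.3, 0.4],
  ]
);
console.log(exampleRecords.map((record) => record.id));
```

Position-based ids keep re-ingestion idempotent as long as a document is chunked with the same settings; if settings change, the old records for that document should be deleted first.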

Choosing a chunker

| Chunker   | When to use it                                                                  |
|-----------|---------------------------------------------------------------------------------|
| fixed     | Documents with uniform length sentences or when token budgets are strict.       |
| recursive | Structured content like Markdown or HTML that benefits from heading hierarchy.  |
| semantic  | Narrative text where topic shifts are subtle and require embedding similarity.  |
| hybrid    | Semi-structured data where headings exist but need semantic boundaries within.  |
| topic     | Knowledge bases or FAQs that should be grouped by conceptual clusters.          |
| sliding   | Streaming or chat transcripts that require overlapping context windows.         |

Switching chunkers is as simple as changing the type passed to createChunker.

import { createChunker } from "scout-text-chunker";

const chunker = createChunker({
  type: "recursive",
  options: {
    maxTokens: 600,
    overlap: 80
  }
});
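
To give a feel for how a token limit and an overlap usually interact, here is a minimal, self-contained sketch of window-based chunking. It is not the library's recursive strategy: the chunkWithOverlap helper is hypothetical, and tokens are approximated by whitespace-separated words, whereas a real tokenizer would count differently.

```typescript
// Assumption: a "token" is approximated by a whitespace-separated word.
function chunkWithOverlap(text: string, maxTokens: number, overlap: number): string[] {
  if (maxTokens <= 0 || overlap < 0 || overlap >= maxTokens) {
    throw new Error("Require maxTokens > 0 and 0 <= overlap < maxTokens");
  }
  const tokens = text.split(/\s+/).filter((token) => token.length > 0);
  // Each window starts this many tokens after the previous one.
  const step = maxTokens - overlap;
  const chunks: string[] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + maxTokens).join(" "));
    // Stop once a window reaches the end, so no trailing chunk is pure overlap.
    if (start + maxTokens >= tokens.length) break;
  }
  return chunks;
}

const sampleText = "one two three four five six seven eight nine ten";
console.log(chunkWithOverlap(sampleText, 4, 1));
```

With a limit of 4 and an overlap of 1, each window repeats the last word of the previous one, which is the same idea as maxTokens: 600 with overlap: 80 at a larger scale.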

Embedding strategies

The SDK ships with embedders that wrap popular providers.

import { createChunker, OpenAIEmbedder, CohereEmbedder } from "scout-text-chunker";

const openAIChunker = createChunker({
  type: "semantic",
  embedder: new OpenAIEmbedder({ apiKey: process.env.OPENAI_API_KEY! })
});

const cohereChunker = createChunker({
  type: "topic",
  embedder: new CohereEmbedder({ apiKey: process.env.COHERE_API_KEY! })
});

If you already run your own embedding model or service, implement the Embedder interface and pass an instance to the factory.

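import { createChunker, type Embedder } from "scout-text-chunker";

// Toy embedder for illustration only. It folds character codes into a
// fixed-size vector so every text maps to the same number of dimensions,
// which similarity comparisons between vectors require.
class LocalEmbedder implements Embedder {
  private readonly dimensions = 64;

  async embed(texts: string[]): Promise<number[][]> {
    return texts.map((text) => {
      const vector = new Array<number>(this.dimensions).fill(0);
      for (let i = 0; i < text.length; i++) {
        vector[i % this.dimensions] += text.charCodeAt(i) / 255;
      }
      return vector;
    });
  }
}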

const chunker = createChunker({
  type: "semantic",
  embedder: new LocalEmbedder()
});
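
The sketch below shows one common way embedding vectors can drive chunk boundaries: compare each sentence's vector with the previous one and start a new chunk when cosine similarity drops below a threshold. It is a simplified illustration under stated assumptions, not the package's semantic chunker; cosineSimilarity, groupBySimilarity, the threshold value, and the hand-written vectors are all hypothetical stand-ins for what an embedder would return.

```typescript
// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Start a new chunk whenever adjacent sentences fall below the threshold.
function groupBySimilarity(
  sentences: string[],
  sentenceVectors: number[][],
  threshold: number
): string[] {
  if (sentences.length !== sentenceVectors.length) {
    throw new Error("Expected exactly one vector per sentence");
  }
  const chunks: string[] = [];
  let current: string[] = [];
  sentences.forEach((sentence, index) => {
    const isBoundary =
      index > 0 &&
      cosineSimilarity(sentenceVectors[index - 1], sentenceVectors[index]) < threshold;
    if (isBoundary) {
      chunks.push(current.join(" "));
      current = [];
    }
    current.push(sentence);
  });
  if (current.length > 0) chunks.push(current.join(" "));
  return chunks;
}

const exampleSentences = [
  "Reset your password from the login page.",
  "Password resets expire after one hour.",
  "Invoices are emailed monthly.",
];
// Hand-written 2D vectors standing in for real embeddings.
const exampleVectors = [
  [1, 0],
  [0.9, 0.1],
  [0, 1],
];
console.log(groupBySimilarity(exampleSentences, exampleVectors, 0.5));
```

In practice the vectors would come from an embedder's embed call, and the threshold would need tuning per embedding model, since similarity scores are not comparable across models.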

Attaching metadata

Every chunker.chunk call accepts optional metadata so you can preserve context.

const chunks = await chunker.chunk(article.body, {
  documentId: "support-doc-42",
  source: article.title
});

Metadata travels with each chunk, making it easier to trace responses back to the original source.
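
The idea of metadata travelling with each chunk can be sketched as follows. The TaggedChunk shape and the attachMetadata and describeSource helpers are hypothetical names for this example and do not reflect the package's internal types; the point is only that every chunk carries its own copy of the document-level metadata, so a retrieved chunk can be traced back without a separate lookup.

```typescript
type Metadata = Record<string, unknown>;

// Hypothetical chunk shape used only in this sketch.
interface TaggedChunk {
  text: string;
  index: number;
  metadata: Metadata;
}

// Give every chunk its own shallow copy of the document-level metadata.
function attachMetadata(texts: string[], metadata: Metadata = {}): TaggedChunk[] {
  return texts.map((text, index) => ({
    text,
    index,
    // Copy so that editing one chunk's metadata does not affect the others.
    metadata: { ...metadata },
  }));
}

// Build a human-readable citation for a retrieved chunk.
function describeSource(chunk: TaggedChunk): string {
  const documentId = String(chunk.metadata.documentId ?? "unknown document");
  return `${documentId} (chunk ${chunk.index})`;
}

const taggedChunks = attachMetadata(["Intro paragraph.", "Troubleshooting steps."], {
  documentId: "support-doc-42",
  source: "Scaling Support Documentation",
});
console.log(describeSource(taggedChunks[1]));
```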

CLI & automation

Scout Text Chunker fits into build pipelines. Run a preprocessing script with ts-node, or bundle one with tsup, to chunk documents before deployment, or bundle the library into the serverless functions that power your search endpoints.

Testing & type safety

npm test

The test suite runs on Vitest and checks that chunkers behave consistently across text types. TypeScript declarations are published alongside the package for first-class IDE support.

Contributing

  1. Clone the repository and install dependencies with npm install.
  2. Run npm run build before submitting changes.
  3. Ensure tests pass with npm test.

License

MIT © Scout