@nlptools/splitter
v0.0.2
Text splitting utilities - LangChain.js text splitters wrapper for NLPTools
This package provides convenient access to LangChain.js text splitting utilities through the NLPTools ecosystem. It includes various text splitters for chunking documents and processing large texts.
Installation
# Install with npm
npm install @nlptools/splitter
# Install with yarn
yarn add @nlptools/splitter
# Install with pnpm
pnpm add @nlptools/splitter
Usage
Basic Setup
import {
  RecursiveCharacterTextSplitter,
  CharacterTextSplitter,
  MarkdownTextSplitter,
  TokenTextSplitter,
} from "@nlptools/splitter";
Available Splitters
- RecursiveCharacterTextSplitter - Splits text recursively, trying a list of separators in order until the chunks fit the configured size
- CharacterTextSplitter - Splits text on a single separator and measures chunk size in characters
- MarkdownTextSplitter - Specialized splitter for Markdown documents
- TokenTextSplitter - Splits text by token count
- LatexTextSplitter - Specialized splitter for LaTeX documents
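To make the idea of recursive splitting concrete, here is a minimal, self-contained sketch. It is not the LangChain.js implementation: the function name `recursiveSplit` and the separator list are assumptions chosen for illustration, and the sketch leaves out the merging of small pieces and the chunk overlap that the real splitters provide.

```typescript
// Simplified illustration only: NOT the LangChain.js implementation.
// `recursiveSplit` is a hypothetical helper; it omits piece merging and overlap.
function recursiveSplit(
  text: string,
  chunkSize: number,
  separators: string[] = ["\n\n", "\n", " "],
): string[] {
  // Small enough already: keep the text as a single chunk.
  if (text.length <= chunkSize) {
    return text.length > 0 ? [text] : [];
  }
  // No separators left: fall back to fixed-width slices.
  if (separators.length === 0) {
    const slices: string[] = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      slices.push(text.slice(i, i + chunkSize));
    }
    return slices;
  }
  // Split on the coarsest separator, then recurse with the finer ones.
  const separator = separators[0]!;
  const rest = separators.slice(1);
  return text
    .split(separator)
    .flatMap((piece) => recursiveSplit(piece, chunkSize, rest));
}

const pieces = recursiveSplit("alpha beta\n\ngamma delta epsilon", 12);
console.log(pieces);
```

The first paragraph fits within 12 characters and stays whole, while the second is too long and gets broken down further at spaces. The real splitter would then merge adjacent small pieces back together up to the chunk size.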
Example Usage
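import { RecursiveCharacterTextSplitter } from "@nlptools/splitter";
// chunkSize and chunkOverlap are measured in characters by default
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const text = "Your long text content here...";
// splitText returns a Promise that resolves to an array of string chunks
const chunks = await splitter.splitText(text);
console.log(chunks);
Features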
- 📝 Multiple Splitting Strategies: Character, token, and format-aware splitting
- 🔧 Configurable: Customizable chunk size and overlap
- 📦 TypeScript First: Full type safety
- 🚀 Based on LangChain.js: Reliable and well-tested implementations
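To see how chunk size and overlap interact, the following sketch uses a plain sliding window over characters. The helper `slidingWindowChunks` is hypothetical and not part of this package. The real splitters build their overlap from separator-delimited pieces, so their chunk boundaries will generally differ from these fixed windows.

```typescript
// Hypothetical helper for illustration; not part of @nlptools/splitter.
// Each chunk starts (chunkSize - chunkOverlap) characters after the previous one.
function slidingWindowChunks(
  text: string,
  chunkSize: number,
  chunkOverlap: number,
): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be at least 0 and less than chunkSize");
  }
  const step = chunkSize - chunkOverlap;
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

const windows = slidingWindowChunks("abcdefghij", 4, 2);
console.log(windows);
```

With a chunk size of 4 and an overlap of 2, each window repeats the last two characters of the one before it. A larger overlap preserves more context across chunk boundaries at the cost of producing more chunks.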
References
This package incorporates and builds upon the following excellent open source projects:
- LangChain.js Text Splitters - Core text splitting implementations via @langchain/textsplitters
