ubc-genai-toolkit-chunking
v0.1.0
UBC GenAI Toolkit - Chunking Module
This module provides a standardized interface for splitting text documents into smaller, more manageable chunks. It supports various chunking strategies and serves as a facade over underlying chunking libraries.
Core Concepts
- ChunkingModule: The main entry point for accessing chunking functionality.
- Providers: Concrete implementations of different chunking strategies (e.g., SimpleProvider, RecursiveCharacterProvider, TokenProvider).
- Configuration: The module is initialized with a ChunkingConfig object that specifies the provider and its settings.
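To make the facade idea concrete, here is a minimal, self-contained sketch of how a module might map a config object to a provider and delegate to it. Every name in it (`SketchProvider`, `SketchConfig`, `SketchChunker`, `splitFixed`, `splitParagraphs`, the `'fixed'` and `'paragraph'` provider keys, and the `chunkSize` option) is hypothetical and chosen for illustration; it does not reproduce this package's actual types or internals.

```typescript
// Hypothetical names throughout; not the package's actual types.
interface SketchProvider {
  chunk(text: string): Promise<string[]>;
}

// Assumed config shape: a provider key plus provider-specific settings.
type SketchConfig =
  | { provider: 'fixed'; chunkSize: number }
  | { provider: 'paragraph' };

// Split into consecutive pieces of at most `chunkSize` characters.
function splitFixed(text: string, chunkSize: number): string[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new Error('chunkSize must be a positive integer');
  }
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}

// Split on blank lines and drop empty pieces.
function splitParagraphs(text: string): string[] {
  return text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
}

// Map the config to a concrete provider.
function createProvider(config: SketchConfig): SketchProvider {
  if (config.provider === 'fixed') {
    const size = config.chunkSize;
    return { chunk: async (text) => splitFixed(text, size) };
  }
  return { chunk: async (text) => splitParagraphs(text) };
}

// The facade: callers see one chunk() method regardless of strategy.
class SketchChunker {
  private readonly provider: SketchProvider;

  constructor(config: SketchConfig) {
    this.provider = createProvider(config);
  }

  chunk(text: string): Promise<string[]> {
    return this.provider.chunk(text);
  }
}
```

In this sketch, switching strategies only changes the config passed to the constructor; the calling code keeps using the same `chunk()` method.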
Getting Started
To use the Chunking Module, instantiate ChunkingModule with a desired provider and configuration:
```typescript
import { ChunkingModule } from 'ubc-genai-toolkit-chunking';

const chunker = new ChunkingModule({
  provider: 'simple', // or 'recursive-character', 'token'
  // Provider-specific config here
});

const text = "Your long text document...";
const chunks = await chunker.chunk(text);

console.log(chunks);
```
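For readers unfamiliar with the idea behind a 'recursive-character' strategy, the sketch below shows one common way such splitting is done: try coarse separators first, pack the resulting pieces into chunks up to a size limit, and retry oversized pieces with finer separators. This is a generic illustration, not this package's implementation; the function name `recursiveSplit`, the `maxLen` parameter, and the default separator list are all assumptions.

```typescript
// Hypothetical helper; not part of ubc-genai-toolkit-chunking.
function recursiveSplit(
  text: string,
  maxLen: number,
  separators: string[] = ['\n\n', '\n', ' '],
): string[] {
  if (!Number.isInteger(maxLen) || maxLen <= 0) {
    throw new Error('maxLen must be a positive integer');
  }
  if (text.length <= maxLen) {
    return text.length > 0 ? [text] : [];
  }
  if (separators.length === 0) {
    // No separators left: fall back to a hard cut every maxLen characters.
    const pieces: string[] = [];
    for (let i = 0; i < text.length; i += maxLen) {
      pieces.push(text.slice(i, i + maxLen));
    }
    return pieces;
  }

  const [sep, ...rest] = separators;
  const chunks: string[] = [];
  let current = '';

  for (const part of text.split(sep)) {
    if (part.length === 0) continue;
    const candidate = current.length > 0 ? current + sep + part : part;
    if (candidate.length <= maxLen) {
      current = candidate; // still fits: keep growing the current chunk
      continue;
    }
    if (current.length > 0) {
      chunks.push(current);
      current = '';
    }
    if (part.length <= maxLen) {
      current = part;
    } else {
      // The part alone is too long: retry with finer separators.
      chunks.push(...recursiveSplit(part, maxLen, rest));
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

console.log(recursiveSplit('aaa bbb ccc\n\nddd eee', 7));
```

Every chunk returned by this sketch is at most `maxLen` characters long, and paragraph and word boundaries are preferred over hard cuts whenever a piece fits.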
