ai-token-chunker
Safely split prompts across multiple AI providers without breaking token or byte limits
A boring, reliable, zero-magic npm package that preflights and chunks inputs safely before sending them to AI providers. No network calls, no tokenizers bundled, just deterministic chunking based on provider limits.
🧠 Why This Exists
Different AI providers enforce different limits:
- Tokens (varies by model)
- Characters (some providers use character limits)
- Bytes (UTF-8 encoding matters)
- Image payload size (per image and total)
- Message structure constraints
Developers routinely hit:
- `context_length_exceeded` errors
- Silent truncation
- Partial image loss
- Unpredictable failures
This library preflights and chunks inputs safely before you hit the API.
📦 Features
- ✅ Zero runtime dependencies
- ✅ Works in Node 18+
- ✅ ESM first, CommonJS supported
- ✅ Deterministic output
- ✅ No network calls
- ✅ No tokenizers bundled (approximation only)
- ✅ Supports 10+ providers with default limits
- ✅ Image validation and handling
- ✅ Custom limit overrides
- ✅ Developer-friendly error messages
🚀 Installation
```bash
npm install ai-token-chunker
```

📖 Usage
Basic Example
```js
import { chunkPrompt } from 'ai-token-chunker';
const result = chunkPrompt({
provider: 'openai',
model: 'gpt-4o',
input: 'Your very long text here...',
});
console.log(`Total chunks: ${result.metadata.totalChunks}`);
console.log(`Estimated tokens: ${result.metadata.estimatedTokens}`);
for (const chunk of result.chunks) {
// Send each chunk to your AI provider
console.log(`Chunk ${chunk.index}: ${chunk.text.slice(0, 50)}...`);
}
```

With Images
```js
import { chunkPrompt } from 'ai-token-chunker';
import fs from 'fs';
const imageBuffer = fs.readFileSync('image.png');
const result = chunkPrompt({
provider: 'anthropic',
model: 'claude-3-5-sonnet-20241022',
input: 'Describe this image...',
images: [imageBuffer], // Buffer, base64 string, or {buffer, mime}
});
```

Custom Limits
```js
const result = chunkPrompt({
provider: 'openai',
model: 'gpt-4o',
input: 'Long text...',
options: {
customLimits: {
maxBytes: 1000,
maxChars: 500,
maxTokens: 250,
},
},
});
```

Chunk Overlap
```js
const result = chunkPrompt({
provider: 'openai',
model: 'gpt-4o',
input: 'Long text...',
options: {
chunkOverlap: 100, // Overlap 100 characters between chunks
},
});
```

🧩 API Reference
chunkPrompt(params)
Main function to chunk prompts.
Parameters:
- `provider` (string, required): Provider name (e.g., `'openai'`, `'anthropic'`, `'gemini'`)
- `model` (string, required): Model name (e.g., `'gpt-4o'`, `'claude-3-5-sonnet-20241022'`)
- `input` (string, required): Input text to chunk
- `images` (Array, optional): Array of images as:
  - `Buffer`
  - Base64 string
  - `{buffer: Buffer, mime: string}`
- `options` (Object, optional):
  - `chunkOverlap` (number): Characters to overlap between chunks (default: 0)
  - `respectWordBoundaries` (boolean): Try to split at word boundaries (default: true)
  - `customLimits` (Object): Override provider limits
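For example, to split at exact character limits instead of preferring word boundaries:

```js
import { chunkPrompt } from 'ai-token-chunker';

const result = chunkPrompt({
  provider: 'openai',
  model: 'gpt-4o',
  input: 'Long text...',
  options: {
    respectWordBoundaries: false, // split at exact limits, even mid-word
  },
});
```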
Returns:
```js
{
chunks: [
{
text: string,
images: Array,
index: number
}
],
metadata: {
provider: string,
model: string,
totalChunks: number,
estimatedTokens: number,
estimatedBytes: number
}
}
```

Throws:
- `ProviderNotSupportedError`: Provider not in supported list
- `LimitExceededError`: Input exceeds limits (when not chunking)
- `ImageLimitError`: Images exceed limits
- `InvalidInputError`: Invalid input parameters
🏢 Supported Providers
Default limits are configured for:
| Provider | Models | Max Tokens | Max Images |
|----------|--------|------------|------------|
| OpenAI | gpt-4o, gpt-4-turbo, gpt-4, gpt-3.5-turbo | 128K (gpt-4o) | 10 |
| Anthropic | claude-3-5-sonnet, claude-3-opus, claude-3-sonnet, claude-3-haiku | 200K | 20 |
| Google Gemini | gemini-1.5-pro, gemini-1.5-flash, gemini-pro | 2M (1.5-pro) | 16 |
| Mistral | mistral-large-latest, mistral-medium-latest, mistral-small-latest | 128K | 0 |
| Cohere | command-r-plus, command-r | 128K | 0 |
| Groq | llama-3.1-70b, llama-3.1-8b, mixtral-8x7b | 131K | 0 |
| Azure OpenAI | Same as OpenAI | 128K | 10 |
| AWS Bedrock | Various (Claude, Llama, etc.) | Varies | Varies |
| Together AI | Various | 8K | 0 |
| Ollama | llama3, mistral, etc. | 8K | 0 |
Note: Limits are conservative estimates based on public documentation. Use `customLimits` to override them for your specific use case.
🧮 Token Estimation
This library uses heuristics, not exact tokenizers:
- Default: 1 token ≈ 4 characters
- Images counted separately
- Byte limit always wins over token limit
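As a rough illustration (this is an assumption about the internals, not the library's actual code), the documented heuristic amounts to something like:

```js
// Illustration only — one plausible reading of the documented heuristic,
// not the library's actual implementation.
function approximateTokens(text) {
  return Math.round(text.length / 4); // 1 token ≈ 4 characters
}

// Byte limits are checked against the UTF-8 encoding, which can be larger
// than the character count for non-ASCII text.
console.log(approximateTokens('Hello, world!')); // 3
console.log(Buffer.byteLength('héllo', 'utf8')); // 6 bytes for 5 characters
```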
Why Approximations?
Exact tokenization requires:
- Provider-specific tokenizers (different for each provider)
- Large dependency bundles
- Network calls or local models
This library prioritizes:
- ✅ Zero dependencies
- ✅ Fast, deterministic chunking
- ✅ Works offline
- ✅ Safety margin built-in
Important: This is a safety layer, not a tokenizer. For exact token counts, use provider-specific tokenizers (e.g., tiktoken for OpenAI).
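If you need exact counts for OpenAI-family models, one option is to pair this library with js-tiktoken (a separate install, not a dependency of this package). A sketch, assuming js-tiktoken's `getEncoding` API:

```js
// Sketch: verify chunks against a real tokenizer. cl100k_base matches
// gpt-4 / gpt-3.5; newer models may use a different encoding.
import { chunkPrompt } from 'ai-token-chunker';
import { getEncoding } from 'js-tiktoken';

const enc = getEncoding('cl100k_base');
const longText = 'Your very long text here...';

const result = chunkPrompt({ provider: 'openai', model: 'gpt-4', input: longText });

for (const chunk of result.chunks) {
  console.log(`Chunk ${chunk.index}: ${enc.encode(chunk.text).length} exact tokens`);
}
```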
🖼 Image Handling
Images are accepted as:
- `Buffer` objects
- Base64 strings
- `{buffer: Buffer, mime: string}` objects
The library validates:
- ✅ Maximum image count per provider
- ✅ Per-image byte size limits
- ✅ Total payload size
If an image exceeds limits:
- ❌ Throws a descriptive error
- ❌ Does NOT auto-resize
- ❌ Does NOT auto-compress
You must handle image preprocessing before chunking.
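For example, a sketch using the third-party sharp package (not part of this library; the resize width below is an arbitrary example, not a documented provider limit):

```js
// Sketch only: shrink an image before handing it to chunkPrompt.
import fs from 'fs';
import sharp from 'sharp';
import { chunkPrompt } from 'ai-token-chunker';

const original = fs.readFileSync('image.png');

const processed = await sharp(original)
  .resize({ width: 1568, withoutEnlargement: true }) // cap the width
  .jpeg({ quality: 80 })                             // recompress to shrink bytes
  .toBuffer();

const result = chunkPrompt({
  provider: 'anthropic',
  model: 'claude-3-5-sonnet-20241022',
  input: 'Describe this image...',
  images: [{ buffer: processed, mime: 'image/jpeg' }],
});
```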
🚨 Error Handling
All errors include context:
```js
import {
chunkPrompt,
ProviderNotSupportedError,
LimitExceededError,
ImageLimitError,
InvalidInputError,
} from 'ai-token-chunker';
try {
const result = chunkPrompt({ /* ... */ });
} catch (error) {
if (error instanceof ProviderNotSupportedError) {
console.error(`Provider not supported: ${error.provider}`);
} else if (error instanceof LimitExceededError) {
console.error(`Limit exceeded: ${error.limit}`);
console.error(`Actual: ${error.actual}, Allowed: ${error.allowed}`);
} else if (error instanceof ImageLimitError) {
console.error(`Image error: ${error.reason}`);
console.error(`Image index: ${error.imageIndex}`);
}
}
```

🧪 Testing
```bash
npm test
```

📚 Examples
See examples/basic.js for more usage examples.
🎯 Design Philosophy
- Predictable > clever: Deterministic chunking, no magic
- Explicit limits > magic: You control the limits
- Fail early: Errors before API calls
- Developer-friendly: Clear error messages with context
- Suitable for backend pipelines: Node services, serverless workers, Edge functions
⚠️ Common Failure Scenarios
1. Silent Truncation
Problem: Provider silently truncates your input.
Solution: Chunk before sending:
```js
const result = chunkPrompt({ provider, model, input });
// Send each chunk separately
```

2. Multi-byte Characters
Problem: UTF-8 encoding means character count ≠ byte count.
Solution: This library uses byte limits, not just character limits.
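A quick demonstration in plain Node:

```js
// Character count and UTF-8 byte count diverge for non-ASCII text.
const text = 'naïve café 🚀';

console.log(text.length);                     // 13 (UTF-16 code units)
console.log(Buffer.byteLength(text, 'utf8')); // 17 bytes
```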
3. Image Payload Too Large
Problem: Single image exceeds provider limit.
Solution: Preprocess images before chunking:
```js
// Resize/compress images first
const processedImage = await resizeImage(originalImage);
const result = chunkPrompt({ provider, model, input, images: [processedImage] });
```

4. Token Count Mismatch
Problem: Estimated tokens don't match provider's count.
Solution: This is expected. Use provider-specific tokenizers for exact counts, but this library provides a safety margin.
🔧 Advanced Usage
Access Provider Limits
```js
import { getProviderLimits } from 'ai-token-chunker';
const limits = getProviderLimits('openai', 'gpt-4o');
console.log(limits);
// {
// maxTokens: 128000,
// maxChars: 512000,
// maxBytes: 512000,
// maxImages: 10,
// imageByteLimit: 20000000
// }
```

Estimate Tokens
```js
import { estimateTokens } from 'ai-token-chunker';
const tokens = estimateTokens('Hello, world!');
console.log(tokens); // ~3 tokens
```

📝 License
MIT License - see LICENSE file for details.
🤝 Contributing
Contributions welcome! Please ensure:
- Zero runtime dependencies
- Node 18+ compatibility
- Tests pass
- Documentation updated
⚡ Performance
- Fast: No network calls, no heavy dependencies
- Deterministic: Same input = same output
- Memory efficient: Streams-friendly chunking
🎓 When to Use This Library
✅ Use when:
- You need to chunk prompts before sending to AI providers
- You want safety margins built-in
- You need zero dependencies
- You work with multiple providers
❌ Don't use when:
- You need exact token counts (use provider tokenizers)
- You need automatic image resizing
- You need network-based tokenization
Remember: This is a safety layer, not a tokenizer. It prevents you from hitting limits, but doesn't provide exact token counts.
