tokenfill
v0.0.7
Generate deterministic filler text with exact token counts.
tokenfill is available as:
- A CLI: tokenfill <count>
- A library: tokenfill(count, options)
- A tokenizer utility wrapper: createTokenizer(options)

Install
npm install tokenfill
Run with npx:
npx tokenfill 256

CLI
tokenfill <count> [--json] [--tokenizer <encoding>]
Examples:
tokenfill 512 > sample.txt
tokenfill 128 --json
tokenfill 256 --tokenizer o200k_base --json
Behavior:
- <count> must be a non-negative integer.
- Default tokenizer encoding is cl100k_base.
- Without --json, generated text is written to stdout and stats to stderr.
- With --json, output is:
{
  "text": "…",
  "stats": {
    "requestedTokens": 128,
    "actualTokens": 128,
    "encoding": "cl100k_base"
  }
}

Library Usage
import { tokenfill } from "tokenfill";
const result = tokenfill(1024);
console.log(result.actualTokens); // 1024
console.log(result.text.length > 0); // true
With an explicit encoding:
import { tokenfill } from "tokenfill";
const result = tokenfill(256, { encoding: "o200k_base" });

Tokenizer Utility
import { createTokenizer } from "tokenfill";
const tokenizer = createTokenizer({ encoding: "cl100k_base" });
const tokens = tokenizer.encode("hello world"); // text -> tokens
const text = tokenizer.decode(tokens); // tokens -> text
const count = tokenizer.count(text); // number of tokens in text
const truncated = tokenizer.truncate(text, 1); // limit text to 1 token
tokenizer.free(); // release the tokenizer when done

Notes
- Output is deterministic for the same token count and encoding.
- Requests larger than the built-in corpus size throw an error.
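
The two notes above suggest a defensive calling pattern. Below is a minimal sketch, assuming only that the library call returns an object with text and actualTokens, as in the Library Usage example. The names FillFn, tryFill, and isDeterministic are hypothetical helpers and are not part of tokenfill. The README does not say which error type is thrown for oversized requests, so the sketch catches any error.

```typescript
// Result fields used in the Library Usage example (only these two are assumed).
interface FillResult {
  text: string;
  actualTokens: number;
}

// Hypothetical type: any function with the same call shape as tokenfill(count).
type FillFn = (count: number) => FillResult;

// Returns the result, or null if the call throws, for example when the
// request is larger than the built-in corpus. Note that this also hides
// other errors, such as an invalid count.
function tryFill(fill: FillFn, count: number): FillResult | null {
  try {
    return fill(count);
  } catch {
    return null;
  }
}

// Calls fill twice with the same count and checks that both texts match.
function isDeterministic(fill: FillFn, count: number): boolean {
  return fill(count).text === fill(count).text;
}

// Usage with the real library (assumes tokenfill is installed):
//   import { tokenfill } from "tokenfill";
//   const result = tryFill(tokenfill, 1024);
//   if (result === null) console.error("request could not be filled");
```

Passing the fill function as a parameter keeps the helpers independent of the package, which makes them easy to exercise with a stub in tests.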

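Scripts that drive the CLI instead of importing the library can validate the --json output before using it. This is a sketch that assumes only the output shape documented in the CLI section. TokenfillJson and parseTokenfillJson are hypothetical names, not part of tokenfill.

```typescript
// Shape of the --json output as documented in the CLI section.
interface TokenfillJson {
  text: string;
  stats: {
    requestedTokens: number;
    actualTokens: number;
    encoding: string;
  };
}

// Parses raw CLI output and checks that the documented fields are present.
// Throws if the input is not valid JSON or does not match the shape.
function parseTokenfillJson(raw: string): TokenfillJson {
  const data = JSON.parse(raw);
  const stats = data?.stats;
  if (
    typeof data?.text !== "string" ||
    typeof stats?.requestedTokens !== "number" ||
    typeof stats?.actualTokens !== "number" ||
    typeof stats?.encoding !== "string"
  ) {
    throw new Error("unexpected tokenfill --json output");
  }
  return data as TokenfillJson;
}

// Usage with the CLI (assumes Node.js and npx are available and that the
// JSON is written to stdout):
//   import { execFileSync } from "node:child_process";
//   const raw = execFileSync("npx", ["tokenfill", "128", "--json"], { encoding: "utf8" });
//   const { stats } = parseTokenfillJson(raw);
//   console.log(stats.requestedTokens === stats.actualTokens);
```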