@tlibnx/tokenizer-deepseek_v4
v4.0.0
Published
DeepSeek V4 tokenizer for NodeJS/Browser
Downloads
131
Maintainers
Readme
@lenml/tokenizer-deepseek_v4
DeepSeek V4 tokenizer for NodeJS/Browser, powered by @lenml/tokenizers.
Installation
npm install @lenml/tokenizer-deepseek_v4
# or
pnpm add @lenml/tokenizer-deepseek_v4Usage
import { fromPreTrained } from "@lenml/tokenizer-deepseek_v4";
// Initialize tokenizer
const tokenizer = await fromPreTrained();
// Encode text
const encoding = tokenizer.encode("Hello, world!");
console.log(encoding.ids);
// Decode tokens
const text = tokenizer.decode(encoding.ids);
console.log(text);Features
- DeepSeek V4 tokenizer with 128,256 vocabulary size
- Support for Node.js and browser environments
- Built-in special tokens (BOS, EOS, PAD, 256 placeholders)
- Full Hugging Face tokenizer compatibility
API
Complete API parameters and usage can be found in transformer.js tokenizers documentation
License
Apache-2.0
