@cyberlangke/tokkit-upstage
v1.11.0
Upstage tokenizer families for tokkit.
The tokkit subpackage for Upstage's official text tokenizers.
Currently built-in families:
- solar
  - Covers upstage/SOLAR-10.7B-v1.0
  - Also aggregates upstage/TinySolar-248m-4k, which is currently confirmed to fully reuse the same tokenizer
- solar-pro
  - Covers upstage/solar-pro-preview-instruct
Currently not included:
- upstage/SOLAR-10.7B-Instruct-v1.0
- upstage/Solar-Open-100B
- upstage/solar-pro-preview-pretrained
- upstage/TinySolar-248m-4k-py
- upstage/TinySolar-248m-4k-py-instruct
- upstage/TinySolar-248m-4k-code-instruct
- upstage/llama-*
- upstage/Llama-2-70b-instruct
- upstage/SOLAR-0-70b-*
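The coverage lists above amount to a mapping from model IDs to families. The sketch below shows that mapping as a plain lookup, only to make the relationship concrete. The names `FAMILY_BY_MODEL`, `FAMILIES`, and `resolveFamily` are hypothetical and are not part of the package's API, and the package's real resolution logic may differ.

```javascript
// Hypothetical lookup table built from the coverage lists in this README.
// It is NOT the package's internal data structure.
const FAMILY_BY_MODEL = new Map([
  ["upstage/SOLAR-10.7B-v1.0", "solar"],
  ["upstage/TinySolar-248m-4k", "solar"],
  ["upstage/solar-pro-preview-instruct", "solar-pro"],
])

// The set of family names that appear in the table above.
const FAMILIES = new Set(FAMILY_BY_MODEL.values())

// Returns the family for a family name or a covered model ID,
// or null when the input is not covered (an assumption of this sketch).
function resolveFamily(nameOrModelId) {
  if (FAMILIES.has(nameOrModelId)) return nameOrModelId
  return FAMILY_BY_MODEL.get(nameOrModelId) ?? null
}

console.log(resolveFamily("solar"))                     // solar
console.log(resolveFamily("upstage/TinySolar-248m-4k")) // solar
console.log(resolveFamily("upstage/Solar-Open-100B"))   // null
```

Returning null for models that are not covered is a choice made for this sketch. How the package itself reacts to an unsupported name is not stated here.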
Usage
npm install @cyberlangke/tokkit-upstage

import { getTokenizer } from "@cyberlangke/tokkit-upstage"
const solar = await getTokenizer("solar")
const solarPro = await getTokenizer("upstage/solar-pro-preview-instruct")
console.log(solar.encode("Hello, Upstage!"))
console.log(solarPro.encode("Hello, Upstage!"))