# @love-rox/tcy-core

v0.3.1

Framework-agnostic tokenizer for Japanese tate-chu-yoko (縦中横) span wrapping.
The tokenizer returns a `Segment[]` that splits a string into plain text chunks and target chunks (typically half-width alphanumerics). Framework wrappers build on top of this:
- `@love-rox/tcy-react` — React `<Tcy>` component
- `@love-rox/tcy-vue` — Vue 3 `<Tcy>` component
- `@love-rox/tcy-rehype` — rehype plugin for HAST
- `@love-rox/tcy-astro` — Astro integration + `<Tcy>` component
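To illustrate how a wrapper can consume the tokenizer's output, here is a minimal sketch. The `toHtml` helper and the `tcy` class name are illustrative assumptions, not part of any of the wrapper packages:

```typescript
type Segment = { type: 'text'; value: string } | { type: 'tcy'; value: string };

// Hypothetical helper: wrap each tcy segment in a <span> that CSS can then
// style with `text-combine-upright: all` in a vertical writing mode.
function toHtml(segments: Segment[]): string {
  return segments
    .map((s) => (s.type === 'tcy' ? `<span class="tcy">${s.value}</span>` : s.value))
    .join('');
}
```

A real wrapper would emit framework-native nodes (React elements, HAST nodes) instead of an HTML string, but the shape of the traversal is the same.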
## Install

```shell
pnpm add @love-rox/tcy-core
```

## Usage
```typescript
import { tokenize } from '@love-rox/tcy-core';

tokenize('第1章 2026年4月');
// [
//   { type: 'text', value: '第' },
//   { type: 'tcy', value: '1' },
//   { type: 'text', value: '章 ' },
//   { type: 'tcy', value: '2026' },
//   { type: 'text', value: '年' },
//   { type: 'tcy', value: '4' },
//   { type: 'text', value: '月' },
// ]
```

## API
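The default behavior shown above can be sketched with a plain regex split. This is only an illustration of the default (`target: 'alphanumeric'`, `combine: true`) segmentation, not the library's actual implementation:

```typescript
type Segment = { type: 'text'; value: string } | { type: 'tcy'; value: string };

// Split on runs of ASCII alphanumerics; the capture group keeps the matched
// runs in the result so they can be emitted as 'tcy' segments.
function tokenizeSketch(input: string): Segment[] {
  return input
    .split(/([0-9A-Za-z]+)/)
    .filter((part) => part !== '')
    .map((part): Segment =>
      /^[0-9A-Za-z]+$/.test(part)
        ? { type: 'tcy', value: part }
        : { type: 'text', value: part },
    );
}
```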
```typescript
type Segment = { type: 'text'; value: string } | { type: 'tcy'; value: string };

function tokenize(input: string, options?: TcyOptions): Segment[];

interface TcyOptions {
  target?: 'alphanumeric' | 'alpha' | 'digit' | 'ascii' | RegExp; // default: 'alphanumeric'
  combine?: boolean; // default: true
  include?: string | string[];
  exclude?: string | string[];
  maxLength?: number;
  excludeWords?: string[];
}
```

- `target`: preset or custom `RegExp` for the characters to wrap. `alphanumeric` = `[0-9A-Za-z]`; `ascii` = full printable ASCII.
- `combine`: if `true`, consecutive target characters become one `tcy` segment; if `false`, each character becomes its own segment.
- `include` / `exclude`: per-character overrides. `exclude` wins over `include`.
- `maxLength`: maximum length for a `tcy` segment. Segments longer than this are demoted to plain text.
- `excludeWords`: exact words to exclude from tcy wrapping. Matched against the combined segment value.
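As a rough sketch of how the `maxLength` and `excludeWords` rules act on segments, the following post-processing step demotes matching `tcy` segments back to plain text. The `applyLimits` name and its logic are assumptions inferred from the option descriptions above, not the library's code:

```typescript
type Segment = { type: 'text'; value: string } | { type: 'tcy'; value: string };

// Hypothetical post-processing pass: demote tcy segments that exceed
// maxLength or exactly match an entry in excludeWords.
function applyLimits(
  segments: Segment[],
  opts: { maxLength?: number; excludeWords?: string[] } = {},
): Segment[] {
  return segments.map((seg): Segment => {
    if (seg.type !== 'tcy') return seg;
    const tooLong = opts.maxLength !== undefined && seg.value.length > opts.maxLength;
    const excluded = (opts.excludeWords ?? []).includes(seg.value);
    // Demote to a plain text segment rather than dropping the value.
    return tooLong || excluded ? { type: 'text', value: seg.value } : seg;
  });
}
```

For example, with `maxLength: 2`, a `'2026'` segment would be rendered horizontally-in-vertical no longer, while `'4'` stays a `tcy` segment.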
## Links

- Full documentation: Love-Rox/tate-chu-yoko (in Japanese)
- Issues: https://github.com/Love-Rox/tate-chu-yoko/issues
## License
MIT
