@bidiscope/core
v0.1.0
Unicode BiDi algorithm engine optimized for mixed code + RTL text
The Unicode Bidirectional Algorithm engine for developer tools — with code-context awareness.
Features
- Full UAX #9 — Embedding levels, explicit overrides, isolates, bracket pairing (BD16)
- Arabic Contextual Shaping — 42 Arabic + 14 Persian/Urdu characters with Lam-Alef ligatures
- Code-Context Awareness — Detects string literals, comments, paths, URLs — keeps code structure intact
- ANSI Escape Preservation — Terminal colors survive visual reordering
- Hebrew, Syriac, NKo Support — Complete RTL script coverage
- LTR Fast Path — Zero overhead when no RTL characters are present
- Zero Dependencies — Pure TypeScript, 13.8KB minified
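The LTR fast path in the list above can be pictured as a cheap pre-check that skips the full algorithm when a line contains nothing that could change its display order. The sketch below is an illustration under stated assumptions, not the package's implementation: `needsBidi`, `displayLine`, and the chosen character ranges are hypothetical, and the pattern only covers Basic Multilingual Plane scripts plus the explicit directional controls.

```typescript
// Hypothetical pre-check, not part of the @bidiscope/core API.
// Matches BMP right-to-left scripts (Hebrew through Arabic Extended-A),
// the Hebrew/Arabic presentation forms, and explicit bidi control characters.
const BIDI_TRIGGER =
  /[\u0590-\u08FF\uFB1D-\uFDFF\uFE70-\uFEFC\u200F\u202A-\u202E\u2066-\u2069]/;

function needsBidi(text: string): boolean {
  return BIDI_TRIGGER.test(text);
}

// Runs the expensive resolver only when the pre-check fires;
// otherwise the input is returned untouched.
function displayLine(text: string, resolve: (t: string) => string): string {
  return needsBidi(text) ? resolve(text) : text;
}
```

A caller could pass any resolver as the second argument; for a pure ASCII line the resolver is never invoked.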
Install
```sh
npm install @bidiscope/core
```
Quick Start
```typescript
import { resolveBidi } from '@bidiscope/core';

const result = resolveBidi('console.log("مرحباً بالعالم")', {
  baseDirection: 'auto',
  codeContext: true,
  shaping: true,
});

console.log(result.visual);             // Correctly ordered for display
console.log(result.paragraphDirection); // 'ltr' or 'rtl'
console.log(result.levels);             // Embedding levels per character
```
API
resolveBidi(text, options?)
Main entry point. Resolves bidirectional text for visual display.
Parameters:
| Option | Type | Default | Description |
|:---|:---|:---|:---|
| baseDirection | 'auto' \| 'ltr' \| 'rtl' | 'auto' | Paragraph base direction |
| codeContext | boolean | false | Enable code-aware BiDi (preserves code structure) |
| shaping | boolean | true | Enable Arabic contextual shaping |
| shapingMode | 'presentation-forms' \| 'logical' | 'presentation-forms' | Shaping output mode |
| preserveAnsi | boolean | true | Preserve ANSI escape sequences |
Returns: BidiResult
```typescript
interface BidiResult {
  visual: string;                    // Visually reordered text
  levels: number[];                  // Embedding level per character
  paragraphDirection: 'ltr' | 'rtl'; // Resolved paragraph direction
  runs: BidiRun[];                   // Directional runs
}
```
shapeArabic(text)
Applies Arabic contextual shaping to a string, converting base Arabic codepoints to their positional Presentation Forms.
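To make "positional Presentation Forms" concrete, here is a minimal sketch of the idea for four letters only. It is not the package's shaping table or algorithm: `shapeSketch` and `TABLE` are hypothetical names, and Lam-Alef ligatures, diacritics, and every other letter are deliberately left out.

```typescript
// Illustrative only: a four-letter table, not the package's shaping data.
// Each entry lists [isolated, final, initial, medial] presentation forms.
// ALEF joins only to the letter before it, so it has no initial or medial form.
type Forms = { forms: number[]; dual: boolean };

const TABLE: Record<number, Forms> = {
  0x0627: { forms: [0xfe8d, 0xfe8e], dual: false },                // ALEF
  0x0628: { forms: [0xfe8f, 0xfe90, 0xfe91, 0xfe92], dual: true }, // BEH
  0x0644: { forms: [0xfedd, 0xfede, 0xfedf, 0xfee0], dual: true }, // LAM
  0x0645: { forms: [0xfee1, 0xfee2, 0xfee3, 0xfee4], dual: true }, // MEEM
};

function shapeSketch(text: string): string {
  let out = '';
  for (let i = 0; i < text.length; i++) {
    const cur = TABLE[text.charCodeAt(i)];
    if (!cur) {
      out += text[i]; // anything outside the table passes through unchanged
      continue;
    }
    const prev = i > 0 ? TABLE[text.charCodeAt(i - 1)] : undefined;
    const next = i + 1 < text.length ? TABLE[text.charCodeAt(i + 1)] : undefined;
    const joinsPrev = prev !== undefined && prev.dual; // previous letter connects forward
    const joinsNext = cur.dual && next !== undefined;  // this letter connects forward
    const index = joinsPrev && joinsNext ? 3 : joinsNext ? 2 : joinsPrev ? 1 : 0;
    out += String.fromCharCode(cur.forms[index]);
  }
  return out;
}
```

For the word BEH, ALEF, BEH (`'\u0628\u0627\u0628'`), the sketch picks the initial form of the first BEH, the final form of ALEF, and the isolated form of the last BEH, because ALEF does not connect to the letter that follows it.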
classifyBidiType(codepoint)
Returns the BiDi character type for a Unicode codepoint. O(1) for ASCII/Arabic/Hebrew (fast table), O(log n) for extended ranges.
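The two lookup costs mentioned above can be illustrated with a direct-indexed table for the fast path and a binary search over sorted ranges for everything else. This is a rough sketch, not the package's data or code: `classifySketch`, the coarse `BidiClass` set, and the three ranges are assumptions made for the example, and only ASCII sits in the fast table here.

```typescript
// Coarse classes only; UAX #9 defines many more (ES, ET, CS, NSM, ...).
type BidiClass = 'L' | 'R' | 'AL' | 'EN' | 'WS' | 'ON';

// Fast path: one array lookup for ASCII.
// Simplification: everything that is not a letter, digit, or space becomes ON.
const ASCII: BidiClass[] = [];
for (let cp = 0; cp < 128; cp++) {
  const isLetter = (cp >= 0x41 && cp <= 0x5a) || (cp >= 0x61 && cp <= 0x7a);
  const isDigit = cp >= 0x30 && cp <= 0x39;
  ASCII.push(isLetter ? 'L' : isDigit ? 'EN' : cp === 0x20 ? 'WS' : 'ON');
}

// Slow path: sorted, non-overlapping [start, end, class] ranges.
const RANGES: [number, number, BidiClass][] = [
  [0x05d0, 0x05ea, 'R'],  // Hebrew letters
  [0x0620, 0x064a, 'AL'], // Arabic letters
  [0x07ca, 0x07ea, 'R'],  // NKo letters
];

function classifySketch(cp: number): BidiClass {
  if (cp >= 0 && cp < 128) return ASCII[cp];
  let lo = 0;
  let hi = RANGES.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const [start, end, cls] = RANGES[mid];
    if (cp < start) hi = mid - 1;
    else if (cp > end) lo = mid + 1;
    else return cls;
  }
  return 'L'; // simplification: treat everything unlisted as left-to-right
}
```

With a real range table the binary search takes a number of steps logarithmic in the number of ranges, while the ASCII lookup stays constant.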
Performance
| Input | Time per line |
|:---|:---|
| Pure LTR (English) | 0.016 ms |
| Pure Arabic | 0.090 ms |
| Mixed Arabic + English | 0.096 ms |
| Complex (code + Arabic + ANSI) | 0.126 ms |
All measurements on a standard desktop. Target: < 1ms per line ✅
Unicode Conformance
Tested against the official BidiCharacterTest.txt (91,707 test cases):
| Metric | Result |
|:---|:---|
| Level Resolution | 99.91% (91,620 / 91,707) |
| Visual Reorder | 99.90% (91,619 / 91,707) |
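The two metrics in this section correspond to separate stages: first an embedding level is resolved for each character, then the characters are reordered from those levels. As a rough illustration of the second stage, the sketch below follows the reversal step of UAX #9 rule L2. It is not the package's code: `reorderSketch` is a hypothetical name, the levels are supplied by hand, and whitespace level resets, combining marks, and mirroring are ignored. It also assumes one level per UTF-16 code unit.

```typescript
// Reverses, from the highest level down to the lowest odd level, every
// maximal run of characters whose level is at least the current level.
function reorderSketch(text: string, levels: number[]): string {
  const chars = text.split('');
  if (levels.length !== chars.length) {
    throw new Error('expected one level per code unit');
  }
  let max = 0;
  let minOdd = Infinity;
  for (let i = 0; i < levels.length; i++) {
    if (levels[i] > max) max = levels[i];
    if (levels[i] % 2 === 1 && levels[i] < minOdd) minOdd = levels[i];
  }
  // When no odd level exists, minOdd stays Infinity and nothing is reversed.
  for (let level = max; level >= minOdd; level--) {
    let i = 0;
    while (i < chars.length) {
      if (levels[i] < level) {
        i++;
        continue;
      }
      let j = i;
      while (j < chars.length && levels[j] >= level) j++;
      // Reverse chars[i..j) in place.
      for (let a = i, b = j - 1; a < b; a++, b--) {
        const tmp = chars[a];
        chars[a] = chars[b];
        chars[b] = tmp;
      }
      i = j;
    }
  }
  return chars.join('');
}
```

Using capital letters to stand in for right-to-left characters, `reorderSketch('ABC12D', [1, 1, 1, 2, 2, 1])` gives `'D12CBA'`: the digits keep their left-to-right order while the surrounding run is reversed.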
Shaping Modes
| Mode | Use Case | Description |
|:---|:---|:---|
| presentation-forms | Canvas / WebGL terminals | Converts to Unicode Presentation Forms (U+FE70–U+FEFF) |
| logical | Native terminals with HarfBuzz | Keeps base codepoints, relies on font engine for joining |
License
MIT
