pinyin-to-zhuyin
v1.1.0
Bidirectional converter between Pinyin (Romanized Chinese) and Zhuyin (Bopomofo, ㄅㄆㄇㄈ) phonetic systems with tone handling and erhua support.
Pinyin <-> Zhuyin Converter
A Node.js library and command-line tool for converting between the Pinyin (Romanized Chinese) and Zhuyin (Bopomofo, ㄅㄆㄇㄈ) Mandarin phonetic systems. It converts in both directions and supports tone handling, erhua (儿化), and several output formats. See online demo.
Features
- Bidirectional Conversion: Convert from Pinyin to Zhuyin (p2z) and Zhuyin to Pinyin (z2p)
- Command Line Tools: Basic CLI interface
- Comprehensive Test Suite: Extensive test coverage for edge cases
- Erhua Support: Handles northern dialect erhua (儿化) with flexible tone placement
- Tone Support: Handles all 5 tones including neutral tone (5th tone)
- Tone Mark Options: Output with tone marks (ā, á, ǎ, à) or tone numbers (1, 2, 3, 4, 5)
- Handles rare syllables: Optionally collapse n/l + üan → uan (see: https://www.moedict.tw/攣)
- Syllable Segmentation: Intelligent syllable boundary detection
- Umlaut Handling: Supports both ü and v for the same sound
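The last two features can be illustrated with a minimal sketch. It is not the library's tokenizer: `segmentNumberedPinyin` is a hypothetical helper that assumes every toned syllable ends in a tone number, treats a syllable without a number as neutral (tone 5), and rewrites `v` as `ü` before matching.

```javascript
// Hypothetical helper for illustration; not part of the pinyin-to-zhuyin API.
// Splits numbered pinyin into { syllable, tone } tokens.
function segmentNumberedPinyin(text) {
  // Treat "v" as an alternative spelling of "ü" (U+00FC).
  const normalized = text.replace(/v/g, '\u00FC').replace(/V/g, '\u00DC');
  const tokens = [];
  // A run of letters, optionally followed by a tone digit 1-5.
  for (const match of normalized.matchAll(/([a-z\u00FC]+)([1-5])?/gi)) {
    tokens.push({
      syllable: match[1],
      tone: match[2] ? Number(match[2]) : 5, // no digit: assume neutral tone
    });
  }
  return tokens;
}

console.log(segmentNumberedPinyin('ni3hao3'));
console.log(segmentNumberedPinyin('nv3ren2'));
```

Because the tone digit acts as the syllable delimiter here, two adjacent syllables that both lack a number would come back as one token. Real boundary detection needs a syllable table.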
Installation
From Source
git clone https://github.com/logicmason/pinyin-to-zhuyin.git
cd pinyin-to-zhuyin
npm install
npm install -g .
As a Dependency
npm install pinyin-to-zhuyin
Usage
Command Line Tools
p2z - Pinyin to Zhuyin Converter
Convert pinyin text to zhuyin:
# Convert file
p2z input.txt
# Convert from stdin
echo 'ni3hao3' | p2z
# Show help
p2z --help
Options:
- --tonemarks: Use tone marks instead of tone numbers
- --no-neutral: Do not mark neutral tones
- --convert-punctuation: Convert punctuation (Chinese ↔ English)
- --help: Show help message
z2p - Zhuyin to Pinyin Converter
Convert zhuyin text to pinyin:
# Convert file
z2p input.txt
# Convert from stdin
echo 'ㄋㄧˇㄏㄠˇ' | z2p
# Show help
z2p --help
Options:
- --tonemarks: Use tone marks instead of tone numbers
- --convert-punctuation: Convert punctuation (Chinese ↔ English)
- --help: Show help message
Programmatic API
Pinyin to Zhuyin (p2z)
import { p2z } from 'pinyin-to-zhuyin';
// Basic conversion
console.log(p2z('ni3hao3')); // ㄋㄧˇㄏㄠˇ
// With options
const options = {
tonemarks: true, // Use tone marks instead of numbers
inputHasToneMarks: true, // Handle input with tone marks (instead of numbers)
convertPunctuation: false // Convert Chinese punctuation
};
console.log(p2z('Wo3 de ke4ben3', options)); // ㄨㄛˇ ˙ㄉㄜ ㄎㄜˋ ㄅㄣˇ
Zhuyin to Pinyin (z2p)
import { z2p } from 'pinyin-to-zhuyin';
// Basic conversion
console.log(z2p('ㄋㄧˇㄏㄠˇ')); // nǐhǎo
// With options
const options = {
erhuaTone: "after-r", // Place pinyin tone after 'r' in erhua
nlUmlautU: "preserveUmlaut", // Optionally preserve or collapse n/l + üan → uan, see: https://www.moedict.tw/攣
tonemarks: true, // Use tone marks (instead of numbers)
markNeutralTone: false, // Hide neutral tone numbers
apostrophes: 'auto' // Add apostrophes where required at syllable boundaries
};
console.log(z2p('ㄏㄨㄚㄦ', options)); // huār
Tone utilities (tone-tool.js)
- toToneMarks(input: string) – Convert Pinyin with tone numbers to tone marks.
- toToneNumbers(input: string) – Convert Pinyin with tone marks to tone numbers.
- Tone data/helpers live in tone-tool.js and are reused by the converters:
  - toneMarkTable – tone placement mapping.
  - vowels, consonants – character-class fragments used to build tokenizer regexes in p2z.
This tool handles mixed language inputs gracefully where possible. Dedicated repo here.
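For readers unfamiliar with tone placement, the sketch below shows the standard Hanyu Pinyin rule for a single syllable: the mark goes on a or e if present, on the o in ou, and otherwise on the last vowel. It is an illustration only and does not reproduce tone-tool.js. `toneVowelIndex` and `markSyllable` are hypothetical names, and input is assumed to be NFC-normalized.

```javascript
// Hypothetical helpers for illustration; not the tone-tool.js implementation.
// Combining macron, acute, caron, grave for tones 1-4.
const COMBINING_TONES = ['\u0304', '\u0301', '\u030C', '\u0300'];

// Find the vowel that carries the tone mark in one syllable.
function toneVowelIndex(syllable) {
  const lower = syllable.toLowerCase();
  const ae = lower.search(/[ae]/);
  if (ae !== -1) return ae;          // a or e always takes the mark
  const ou = lower.indexOf('ou');
  if (ou !== -1) return ou;          // in "ou" the mark goes on the o
  for (let i = lower.length - 1; i >= 0; i--) {
    if ('iou\u00FC'.includes(lower[i])) return i; // otherwise the last vowel
  }
  return -1;
}

// Apply tone 1-4 as a diacritic; tone 5 (neutral) stays unmarked.
function markSyllable(syllable, tone) {
  const i = toneVowelIndex(syllable);
  if (i === -1 || tone < 1 || tone > 4) return syllable;
  const marked = (syllable[i] + COMBINING_TONES[tone - 1]).normalize('NFC');
  return syllable.slice(0, i) + marked + syllable.slice(i + 1);
}

console.log(markSyllable('hao', 3)); // hǎo
console.log(markSyllable('guo', 2)); // guó
console.log(markSyllable('de', 5));  // de
```

Composing the mark through Unicode normalization avoids a hand-written lookup table and also handles ü and capital letters.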
Examples:
numbersToMarks('hai3ou1')
// hǎi'ōu
numbersToMarks('Na4wei4 xian1sheng1 jiao4 Max, ta1 lai2zi4 De1guo2.')
// Nàwèi xiānshēng jiào Max, tā láizì Dēguó.
marksToNumbers('The northeastern region of China has three provinces—Jílín, Hēilóngjiāng, and Liáoníng.')
// The northeastern region of China has three provinces—Ji2lin2, Hei1long2jiang1, and Liao2ning2.
API Reference
p2z(pinyin, options)
Converts Pinyin to Zhuyin.
Parameters:
- pinyin (string): Input pinyin text
- options (object, optional):
  - tonemarks (boolean, default: true): Use tone marks instead of numbers
  - convertPunctuation (boolean, default: false): Convert ,.?!;: to ,,。?!;:
Returns: (string) Zhuyin text
z2p(zhuyin, options)
Converts Zhuyin to Pinyin.
Parameters:
- zhuyin (string): Input zhuyin text
- options (object, optional):
  - erhuaTone (string, default: 'after-r'): Tone placement for erhua ('before-r' or 'after-r')
  - nlUmlautU (string, default: 'collapse-nl-uan'): Handle n/l + üan → uan
  - tonemarks (boolean, default: true): Use tone marks instead of numbers
  - markNeutralTone (boolean, default: false): Show neutral tone as number 5
  - apostrophes (string, default: 'auto'): Apostrophe behavior ('auto', true, false)
Returns: (string) Pinyin text
Examples
Basic Conversions
import { p2z, z2p } from 'pinyin-to-zhuyin';
// Pinyin to Zhuyin
p2z('ni3hao3') // ㄋㄧˇ ㄏㄠˇ
p2z('zhong1guo2') // ㄓㄨㄥ ㄍㄨㄛˊ
p2z('Wo3 de ke4ben3') // ㄨㄛˇ ˙ㄉㄜ ㄎㄜˋ ㄅㄣˇ
p2z('hua1r') // ㄏㄨㄚㄦ
// Zhuyin to Pinyin
z2p('ㄋㄧˇㄏㄠˇ') // nǐhǎo
z2p('ㄓㄨㄥ ㄍㄨㄛˊ') // zhōng guó
z2p('ㄨㄛˇ ˙ㄉㄜ ㄎㄜˋ ㄅㄣˇ') // wo3 de ke4ben3
z2p('ㄏㄨㄚㄦ') // huār
Advanced Features
p2z('“tā shuō: ‘nǐhǎo’, rán hòu jiù zǒu lē.”', { convertPunctuation: true })
// Output: 「ㄊㄚ ㄕㄨㄛ:『ㄋㄧˇ ㄏㄠˇ』,ㄖㄢˊ ㄏㄡˋ ㄐㄧㄡˋ ㄗㄡˇ ㄌㄜ。」
// Long sentences with punctuation conversion
z2p('「ㄊㄚ ㄕㄨㄛ:『ㄋㄧˇㄏㄠˇ』,ㄖㄢˊ ㄏㄡˋ ㄐㄧㄡˋ ㄗㄡˇ ˙ㄌㄜ。」', { convertPunctuation: true })
// Output: “tā shuō: ‘nǐhǎo’, rán hòu jiù zǒu lē.”
// Chinese: 俗話說:「江太公釣魚,願者上鉤」
z2p('ㄙㄨˊㄏㄨㄚˋ ㄕㄨㄛ:「ㄐㄧㄤㄊㄞˋㄍㄨㄥ ㄉㄧㄠˋㄩˊ, ㄩㄢˋㄓㄜˇ ㄕㄤˋㄍㄡˇ」', { convertPunctuation: true })
// Output: súhuà shuō: jiāngtàigōng diàoyú, yuànzhě shànggǒu
// Erhua handling
p2z('hua1r') // ㄏㄨㄚㄦ
z2p('ㄏㄨㄚㄦ', { erhuaTone: 'before-r' }) // hua1r
z2p('ㄏㄨㄚㄦ', { erhuaTone: 'after-r' }) // huār1
// Tone mark vs numbers
p2z('ni3hao3', { tonemarks: true }) // ㄋㄧˇ ㄏㄠˇ
p2z('ni3hao3', { tonemarks: false }) // ㄋㄧ3 ㄏㄠ3
z2p('ㄋㄧˇㄏㄠˇ', { tonemarks: true }) // nǐhǎo
z2p('ㄋㄧˇㄏㄠˇ', { tonemarks: false }) // ni3hao3
// Neutral tone handling
p2z('Wo3 de ke4ben3') // ㄨㄛˇ ˙ㄉㄜ ㄎㄜˋ ㄅㄣˇ
z2p('ㄨㄛˇ ˙ㄉㄜ ㄎㄜˋ ㄅㄣˇ', { markNeutralTone: false }) // wo3 de ke4ben3
Supported Features
Tone System
- Tone 1: High level (unmarked in zhuyin, ā in pinyin)
- Tone 2: Rising (ˊ in zhuyin, á in pinyin)
- Tone 3: Falling-rising (ˇ in zhuyin, ǎ in pinyin)
- Tone 4: Falling (ˋ in zhuyin, à in pinyin)
- Tone 5: Neutral (˙ in zhuyin, unmarked in pinyin)
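The zhuyin side of the table above can be sketched as a small lookup. This is an illustration under assumptions, not the library's code: `withZhuyinTone` is a hypothetical helper that follows the placement seen in this README's examples, where tones 2 to 4 are written after the syllable and the neutral-tone dot is written before it.

```javascript
// Hypothetical helper for illustration; not part of the pinyin-to-zhuyin API.
const ZHUYIN_TONE_MARKS = {
  1: '',        // tone 1 is unmarked in zhuyin
  2: '\u02CA',  // ˊ
  3: '\u02C7',  // ˇ
  4: '\u02CB',  // ˋ
  5: '\u02D9',  // ˙ (neutral)
};

// Attach a tone mark to a bare zhuyin syllable.
function withZhuyinTone(syllable, tone) {
  if (!(tone in ZHUYIN_TONE_MARKS)) {
    throw new RangeError(`Unsupported tone: ${tone}`);
  }
  const mark = ZHUYIN_TONE_MARKS[tone];
  // The neutral-tone dot precedes the syllable; other marks follow it.
  return tone === 5 ? mark + syllable : syllable + mark;
}

console.log(withZhuyinTone('ㄋㄧ', 3)); // ㄋㄧˇ
console.log(withZhuyinTone('ㄉㄜ', 5)); // ˙ㄉㄜ
```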
Special Cases
- Erhua (儿化): Handles northern dialect erhua with flexible tone placement
- Umlaut: Supports both ü and v for the same sound
- Syllable Boundaries: Intelligent detection and apostrophe insertion
- Disambiguation: Handles ambiguous cases like xi'an vs xiang
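Apostrophe insertion can be sketched with the standard Hanyu Pinyin convention: a syllable that begins with a, e, or o gets an apostrophe when it follows another syllable in the same word. The helpers below are hypothetical and assume the input is already split into syllables; how the library detects those boundaries is not shown here.

```javascript
// Hypothetical helpers for illustration; not part of the pinyin-to-zhuyin API.

// True when a non-initial syllable must be preceded by an apostrophe.
function needsApostrophe(syllable) {
  // NFD splits a toned vowel such as "ō" into "o" plus a combining mark,
  // so the first code unit is the bare base letter.
  const first = syllable.normalize('NFD')[0].toLowerCase();
  return first === 'a' || first === 'e' || first === 'o';
}

// Join the syllables of one word, inserting apostrophes where required.
function joinSyllables(syllables) {
  return syllables.reduce(
    (word, syl, i) => (i > 0 && needsApostrophe(syl) ? `${word}'${syl}` : word + syl),
    ''
  );
}

console.log(joinSyllables(['xi', 'an']));  // xi'an
console.log(joinSyllables(['ni', 'hao'])); // nihao
```

With this rule the two-syllable word `xi'an` stays distinct from a one-syllable reading of the same letters.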
Input Formats
- Tone numbers: ni3hao3
- Tone marks: nǐhǎo
- Mixed formats supported
- Chinese punctuation conversion available
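Punctuation conversion amounts to a character-for-character substitution. The sketch below is an assumption about the general approach, not the library's mapping table: the pairs chosen here (ASCII marks to their full-width counterparts, and the period to 。) and the helper names are illustrative, and quotation marks are left out.

```javascript
// Hypothetical mapping for illustration; the library's actual table may differ.
const TO_CHINESE = new Map([
  [',', '\uFF0C'], // full-width comma
  ['.', '\u3002'], // ideographic full stop
  ['?', '\uFF1F'], // full-width question mark
  ['!', '\uFF01'], // full-width exclamation mark
  [';', '\uFF1B'], // full-width semicolon
  [':', '\uFF1A'], // full-width colon
]);

// Reverse table for the Chinese-to-English direction.
const TO_ENGLISH = new Map(Array.from(TO_CHINESE, ([en, zh]) => [zh, en]));

// Replace each character that has an entry in the table; keep the rest.
function convertWith(table, text) {
  return Array.from(text, (ch) => table.get(ch) ?? ch).join('');
}

const toChinesePunctuation = (text) => convertWith(TO_CHINESE, text);
const toEnglishPunctuation = (text) => convertWith(TO_ENGLISH, text);

console.log(toChinesePunctuation('ni3hao3, zai4jian4.'));
console.log(toEnglishPunctuation(toChinesePunctuation('ni3hao3!')));
```

Keeping the two directions as inverse tables means a round trip returns the original text for every character covered by the mapping.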
Testing
Run the test suite:
npm test
The test suite includes comprehensive coverage for:
- Basic conversions
- Tone handling
- Erhua support
- Umlaut variations
- Edge cases and special characters
- File-based conversions
Contributing
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
License
MIT License
Repository Structure
pinyin-to-zhuyin/
├── pinyin-zhuyin.js # Main library
├── p2z.js # Command-line pinyin to zhuyin converter
├── z2p.js # Command-line zhuyin to pinyin converter
├── package.json # Package configuration
├── test/
│ ├── zhuyin-converter.spec.js # Main test suite
│ └── fixtures/ # Test data files
└── README.md # This file