@mandel59/mojidata
v1.8.1
Mojidata Character Database
This package provides a SQLite database with CJKV character data.
The following data are included:
- Adobe, Adobe-Japan1 CMap Resources (aj1)
- Unicode, Unicode Character Database
  - CJK Radicals (radicals)
  - Standardized Variants (svs)
  - Equivalent Unified Ideograph (radeqv)
  - U-Source Data (usource)
  - Unicode Han Database (unihan)
- Unicode, Ideographic Variation Database (ivs)
- CITPC, List of MJ Characters (mji)
- CITPC, MJ Map (mjsm)
- BabelStone, Ideographic Description Sequences (IDS) for CJK Unified Ideographs (ids)
- Kanji Database Project
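- Agency for Cultural Affairs (文化庁), Jōyō Kanji Table (常用漢字表) (joyo)
- Agency for Cultural Affairs (文化庁), Rewriting with Homophonous Kanji (同音の漢字による書きかえ) (doon)
- Immigration Services Agency (出入国在留管理庁), Notice on the Notation of Kanji Names on Residence Cards, etc., Appended Table 4 (在留カード等に係る漢字氏名の表記等に関する告示 別表第四) (nyukan)
- State Council (国务院), Table of General Standard Chinese Characters (通用规范汉字表) (tghb)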
The generated SQLite database also includes unihan_value_ref, a compact
reverse lookup table for Unihan values that reference another character either
directly or with a U+XXXX token. This table is derived from the Unihan data and
is used by mojidata-api to avoid scanning the full Unihan table for
unihan_fts lookups.
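To make the idea of a reverse lookup table concrete, here is a minimal sketch in TypeScript. It is not the code Mojidata uses to build unihan_value_ref. The row shape (`source`, `property`, `target`), the function names, and the `kExampleProperty` value are hypothetical and chosen only for illustration. The actual column names and derivation rules may differ. The sketch collects characters referenced by a Unihan value, whether written as a U+XXXX token or directly as a Han character, and indexes them by the referenced character.

```typescript
// Hypothetical row shape for illustration only; the real unihan_value_ref
// table may use different columns.
interface ValueRef {
  source: string;   // character whose Unihan record holds the value
  property: string; // Unihan property name
  target: string;   // character referenced by the value
}

// A U+XXXX token with 4 to 6 uppercase hexadecimal digits.
const CODE_POINT_TOKEN = /U\+([0-9A-F]{4,6})/g;
// A Han character written directly in the value.
const HAN_CHARACTER = /\p{Script=Han}/gu;

// Collect every character that a single Unihan value refers to.
function extractRefs(source: string, property: string, value: string): ValueRef[] {
  const targets = new Set<string>();
  for (const match of value.matchAll(CODE_POINT_TOKEN)) {
    const codePoint = parseInt(match[1], 16);
    // Ignore tokens outside the Unicode code space.
    if (codePoint <= 0x10ffff) targets.add(String.fromCodePoint(codePoint));
  }
  for (const match of value.matchAll(HAN_CHARACTER)) {
    targets.add(match[0]);
  }
  return Array.from(targets, (target) => ({ source, property, target }));
}

// Group rows by referenced character so a lookup does not scan every value.
function buildReverseIndex(rows: ValueRef[]): Map<string, ValueRef[]> {
  const index = new Map<string, ValueRef[]>();
  for (const row of rows) {
    const bucket = index.get(row.target) ?? [];
    bucket.push(row);
    index.set(row.target, bucket);
  }
  return index;
}

const rows = [
  // Reference by U+XXXX token (U+570B is 國).
  ...extractRefs("国", "kTraditionalVariant", "U+570B"),
  // Reference by a directly written character; property and value are made up.
  ...extractRefs("囯", "kExampleProperty", "same as 國"),
];
const index = buildReverseIndex(rows);
console.log(index.get("國")?.map((row) => row.source));
```

Looking up 國 in the index returns the records that mention it without scanning every Unihan value, which is the kind of work the paragraph above says unihan_value_ref saves.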
License
The source code of Mojidata is available under the MIT license.
Some external resources are bundled with the package. See download.txt for the list of bundled resources and their source URLs.
Each of these resources is available under its own license.
See LICENSE.md for details.
