@1-/mdtrim
v0.1.3
Extract translatable text from Markdown while preserving structure for translation. Restore translated content into original Markdown format.
@1-/mdtrim: Extract and restore Markdown text for translation
Functionality
mdtrim extracts the translatable text from a Markdown document and keeps the structural prefixes (headers, list markers, quotes) separate from it. The text can then be translated on its own, and the original formatting is restored around the translated strings afterwards.
Usage demonstration
Install:
npm install @1-/mdtrim
Extract text for translation:
import mdE from '@1-/mdtrim/src/mdE.js';
const markdown = '# Header\n\nParagraph with **bold** text.\n\n- List item\n- Another item';
const [textToTranslate, restoreConfig] = mdE(markdown);
// textToTranslate = ['Header', 'Paragraph with **bold** text.', 'List item', 'Another item']
// restoreConfig = [templateTextList, positionMappingList]
Restore translated text:
import mdD from '@1-/mdtrim/src/mdD.js';
const translatedText = ['标题', '带有**粗体**文本的段落。', '列表项', '另一项'];
const restoredMarkdown = mdD(translatedText, restoreConfig);
Design rationale
mdtrim separates content from structure by recognizing line prefixes. For each line it parses the structural prefix (# headers, - lists, > quotes, 1. numbered lists, [x] checkboxes), extracts the remaining translatable content, and records where that content came from, so that translated text can be put back in exactly the same position.
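The idea can be illustrated with a minimal, self-contained sketch. It is not mdtrim's actual implementation: the names splitPrefix, extractSketch and restoreSketch, the regular expression, and the shape of the restore data are all assumptions made for illustration, and the sketch ignores cases such as fenced code blocks and horizontal rules.

```javascript
// Hypothetical prefix pattern (an assumption, not mdtrim's rule set):
// leading whitespace, any number of ">" quote markers, then optionally
// a "#" heading marker, or a bullet / numbered marker with an optional checkbox.
const PREFIX = /^\s*(?:>\s*)*(?:#{1,6}\s+|(?:[-*+]|\d+\.)\s+(?:\[[ xX]\]\s+)?)?/;

// Split one line into [structural prefix, translatable content].
// Every part of PREFIX is optional, so match() never returns null here.
function splitPrefix(line) {
  const prefix = line.match(PREFIX)[0];
  return [prefix, line.slice(prefix.length)];
}

// Collect the translatable strings plus the data needed to rebuild the document.
function extractSketch(markdown) {
  const texts = [];     // translatable strings, in document order
  const templates = []; // per line: the prefix, or the whole line if nothing is translatable
  const positions = []; // positions[i] is the line index that texts[i] came from
  markdown.split('\n').forEach((line, lineNo) => {
    const [prefix, content] = splitPrefix(line);
    if (content.trim() === '') {
      templates.push(line); // blank or prefix-only line: keep it verbatim
    } else {
      templates.push(prefix);
      positions.push(lineNo);
      texts.push(content);
    }
  });
  return [texts, [templates, positions]];
}

// Append each translated string to the prefix of the line it was taken from.
function restoreSketch(translated, [templates, positions]) {
  const lines = templates.slice();
  positions.forEach((lineNo, i) => {
    lines[lineNo] += translated[i];
  });
  return lines.join('\n');
}

// Usage with the sample document from this README.
const sampleMarkdown = '# Header\n\nParagraph with **bold** text.\n\n- List item\n- Another item';
const [sampleTexts, sampleConfig] = extractSketch(sampleMarkdown);
console.log(sampleTexts);
// [ 'Header', 'Paragraph with **bold** text.', 'List item', 'Another item' ]
console.log(restoreSketch(['标题', '带有**粗体**文本的段落。', '列表项', '另一项'], sampleConfig));
// # 标题
//
// 带有**粗体**文本的段落。
//
// - 列表项
// - 另一项
```

Keeping one template entry per line and a separate list of line indexes means the translator only ever sees plain strings, while blank lines and prefixes are restored byte for byte. Inline markup such as **bold** stays inside the translatable string, as in the README example above.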
Technology stack
- JavaScript (ES modules)
- Node.js runtime
- Dependency: @1-/md for Markdown line splitting
Code structure
src/
├── mdE.js               # Markdown extractor - separates content from structural prefixes
├── mdD.js               # Markdown restorer - reconstructs format using position mapping
└── E/                   # Extraction logic
    ├── rule.js          # Line prefix parsing (headers, lists, quotes, numbered lists, checkboxes)
    └── lib/             # Helper utilities
        ├── ascii.js     # ASCII character detection
        ├── spaceSkip.js # Skip whitespace characters
        ├── isSpace.js   # Space character detection
        ├── charSkip.js  # Skip specific characters
        └── digitsSkip.js # Skip digit sequences
Historical context
Markdown was created by John Gruber in 2004 as a plain-text format that converts easily to HTML. As content came to be distributed globally, tools emerged to translate documents into multiple languages without damaging their structure. mdtrim addresses this problem with prefix recognition: structural markers are set aside before translation and reattached afterwards.
About
This library is developed by WebC.site.
WebC.site: A new paradigm of web development for AI
