docx-edit

v0.3.1

A JS library that parses DOCX into a virtual component tree and writes paragraph-level text changes back to split OOXML runs.

Downloads

659

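A JavaScript library for parsing and modifying .docx files. Compared with traditional .docx parsing libraries, it is friendlier to dynamic modification of Word documents and supports operations such as high-precision full-text matching and replacement.

The current version implements:

  • Virtual document tree diff / patch
  • Style-level modeling of paragraphs and runs
  • Adding, modifying, and clearing styles
  • Style migration between components
  • The legacy controller API alongside the new virtual tree API

All write operations ultimately converge on a virtual tree patch, which is then synced back to the underlying OOXML.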

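Features

  • Parses the body, headers, footers, comments, footnotes, and endnotes
  • Recognizes paragraph, run, text, table, table-row, table-cell, hyperlink, text-box, image, math, footnoteReference, and endnoteReference nodes
  • Reads and writes back whole-paragraph text that spans multiple w:t elements (preserving footnote reference and math formula placeholders)
  • Supports true virtual tree diff / patch
  • Supports modeling, modifying, and migrating paragraph and run styles
  • Stays compatible with the legacy controller API; legacy write methods are converted to virtual tree patches internally
  • Reads and writes superscript and subscript styles
  • Reads, writes, and creates footnote references (footnoteReference)
  • Reads and writes endnote references (endnoteReference)
  • Saves the modified .docx
  • Extracts the full document content as HTML (heading levels, math formulas, tables)
  • Resolves heading levels (Chinese and English style IDs)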

Installation

npm install docx-edit

The library currently uses CommonJS exports; Node.js >= 18 is recommended.

Quick start

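const { loadDocx } = require("docx-edit");

async function main() {
  // Load the document, replace text everywhere, and save a copy.
  const doc = await loadDocx("./sample.docx");

  doc.replaceAll("old term", "new term");
  await doc.saveAs("./sample.modified.docx");
}

// Report load or save failures instead of leaving the rejection unhandled.
main().catch(console.error);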

Virtual tree model

The document is parsed into a virtual tree. A typical structure looks like this:

document
  body
    paragraph
      run
        text
    table
      table-row
        table-cell
          paragraph
  header
    paragraph
  footer
    paragraph
  comments
    comment
      paragraph

Currently supported node types:

  • document
  • body
  • header
  • footer
  • footnotes
  • endnotes
  • comments
  • paragraph
  • run
  • text
  • table
  • table-row
  • table-cell
  • hyperlink
  • tab
  • break
  • text-box
  • comment
  • footnote
  • endnote
  • footnoteReference
  • endnoteReference
  • footnoteRef
  • endnoteRef
  • image
  • math
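
To make the shape of the tree concrete, here is a minimal sketch that walks a tree and tallies node types. It assumes, as the examples in this README suggest, that every node is a plain object with a type string and a children array. The countNodeTypes helper and the hand-built sample tree are hypothetical and not part of docx-edit.

```javascript
// Hypothetical helper, not part of docx-edit: counts nodes per type.
// Assumes each node looks like { type, props, children }.
function countNodeTypes(root) {
  const counts = {};
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    counts[node.type] = (counts[node.type] || 0) + 1;
    for (const child of node.children || []) {
      stack.push(child);
    }
  }
  return counts;
}

// Hand-built sample tree mirroring the typical structure shown earlier.
const sampleTree = {
  type: "document",
  children: [
    {
      type: "body",
      children: [
        {
          type: "paragraph",
          children: [{ type: "run", children: [{ type: "text", children: [] }] }],
        },
        { type: "paragraph", children: [] },
      ],
    },
    { type: "header", children: [{ type: "paragraph", children: [] }] },
  ],
};

console.log(countNodeTypes(sampleTree));
```

With a real document, the same helper could presumably be pointed at the result of doc.toComponentTree().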

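Internal write flow

Whether you call the legacy controller methods or use doc.patch(nextTree) directly, the internal flow is the same:

  1. Generate a new copy of the virtual tree from the current document
  2. Modify the target nodes on the copy
  3. Call doc.patch(nextTree)
  4. The patch engine executes INSERT / REMOVE / REPLACE / MOVE / PROPS/TEXT_UPDATE
  5. Sync the result back to the underlying OOXML
  6. Rebuild the tree and the index from the XML

Paragraph text changes still follow the current ParagraphTextModel strategy:

  • Preserve the original w:r / w:t elements where possible
  • Preserve tab / break where possible
  • Only redistribute the new text back into the existing text nodes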
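
One way to picture the redistribution step is to keep the number of text nodes fixed and hand each node a slice of the new text, with the last node absorbing whatever remains. The sketch below illustrates only that idea. The actual ParagraphTextModel may split text differently, and redistributeText is a hypothetical name.

```javascript
// Hypothetical sketch, not the library's ParagraphTextModel.
// Splits newText across the same number of slots as oldSegments,
// reusing each original segment length where possible.
function redistributeText(oldSegments, newText) {
  // With no existing text node there is nothing to reuse.
  if (oldSegments.length === 0) return [newText];

  const result = [];
  let offset = 0;
  for (let i = 0; i < oldSegments.length; i++) {
    const isLast = i === oldSegments.length - 1;
    // The last slot takes everything that is left.
    const end = isLast ? newText.length : offset + oldSegments[i].length;
    result.push(newText.slice(offset, end));
    offset = Math.min(end, newText.length);
  }
  return result;
}

// Three original w:t segments; the new text is longer than the old one.
console.log(redistributeText(["Hello ", "big ", "world"], "Hello small world!"));
// → [ 'Hello ', 'smal', 'l world!' ]
```

Because the slices always concatenate back to the new text, the run boundaries (and therefore the run formatting) stay where they were while the content changes.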

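Special placeholders in paragraph text

When paragraph text is read, footnote references and math formulas appear in the text as placeholders:

  • [[FOOTNOTE_REF:id]]: footnote reference
  • [[ENDNOTE_REF:id]]: endnote reference
  • [[MATH:text]]: math formula

When paragraph text is modified, these placeholders are preserved automatically and are not overwritten.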
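
If you need to tell placeholders apart from ordinary text in your own code, a regular expression over the three documented forms is usually enough. The following is a minimal sketch: tokenizeParagraphText is a hypothetical helper, and it assumes the placeholder payload never contains the closing "]]" sequence.

```javascript
// Hypothetical helper, not part of docx-edit.
// Splits paragraph text into plain-text tokens and placeholder tokens.
const PLACEHOLDER = /\[\[(FOOTNOTE_REF|ENDNOTE_REF|MATH):(.*?)\]\]/g;

function tokenizeParagraphText(text) {
  const tokens = [];
  let lastIndex = 0;
  for (const match of text.matchAll(PLACEHOLDER)) {
    // Emit any plain text that precedes this placeholder.
    if (match.index > lastIndex) {
      tokens.push({ kind: "text", value: text.slice(lastIndex, match.index) });
    }
    tokens.push({ kind: match[1], value: match[2] });
    lastIndex = match.index + match[0].length;
  }
  // Emit the trailing plain text, if any.
  if (lastIndex < text.length) {
    tokens.push({ kind: "text", value: text.slice(lastIndex) });
  }
  return tokens;
}

console.log(tokenizeParagraphText("See note[[FOOTNOTE_REF:2]] and [[MATH:x+y]]."));
```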

Style model

Two levels of style modeling are currently supported:

  • paragraph.props.style corresponds to w:pPr
  • run.props.style corresponds to w:rPr

Supported paragraph style fields

{
  styleId: "BodyText",
  alignment: "center",
  keepNext: true,
  keepLines: true,
  pageBreakBefore: false,
  spacing: {
    before: "120",
    after: "240",
    line: "360",
    lineRule: "auto",
  },
  indent: {
    left: "240",
    right: "120",
    firstLine: "240",
    hanging: "240",
  },
}

Supported run style fields

{
  styleId: "Emphasis",
  bold: true,
  italic: true,
  underline: "single",
  color: "FF0000",
  highlight: "yellow",
  fontSize: "28",
  vertAlign: "superscript",  // "superscript" | "subscript" | null
  fontFamily: {
    ascii: "Calibri",
    asciiTheme: "minorHAnsi",
    hAnsi: "Calibri",
    hAnsiTheme: "minorHAnsi",
    eastAsia: "宋体",
    eastAsiaTheme: "minorEastAsia",
    cs: "Arial",
    cstheme: "minorBidi",
  },
}

Theme font attributes

fontFamily captures and writes back theme font references:

  • asciiTheme / hAnsiTheme / eastAsiaTheme / cstheme

These attributes are parsed and synced correctly in both word/styles.xml and document.xml.

Exported API

The entry point is defined in src/index.js:

const {
  loadDocx,
  VirtualWordDocument,
  VNode,
  createVNode,
  cloneVNode,
  DocumentPartController,
  ParagraphController,
  RunController,
  TableController,
  TableRowController,
  TableCellController,
  TextBoxController,
  StructuredEntryController,
} = require("docx-edit");

Document API

loadDocx(input)

Loads a .docx file.

  • input: string | Buffer
  • Returns: Promise<VirtualWordDocument>

const doc = await loadDocx("./sample.docx");

doc.toComponentTree()

Returns a copy of the document's current virtual tree. You can modify this tree and then pass it to doc.patch().

const tree = doc.toComponentTree();
console.log(tree.type); // document

doc.patch(nextTree)

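Patches the complete virtual tree and syncs the result to the underlying XML.

  • The root node type must be document
  • Supports text updates and structural insertion, deletion, replacement, and reordering
  • Supports paragraph style and run style changes
  • Returns the patch result, including the list of operations performed

const tree = doc.toComponentTree();
const body = tree.children.find((node) => node.type === "body");
body.children[0].props.text = "New first paragraph";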

const result = doc.patch(tree);
console.log(result.operations);

doc.toBuffer()

Returns the binary content of the modified .docx.

const buffer = await doc.toBuffer();

doc.saveAs(outputPath)

Saves the document to the given path.

await doc.saveAs("./sample.modified.docx");

doc.addFootnote(text)

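Creates a new footnote and returns its ID, which can then be used to insert a footnote reference into a paragraph.

  • text: string, the footnote text
  • Returns: number, the footnote ID

If the document has no footnotes.xml, one is created automatically.

const { loadDocx, createVNode } = require("docx-edit");

const doc = await loadDocx("./sample.docx");

// Create the footnote and get its ID
const fnId = doc.addFootnote("This is the footnote text");

// Reference the footnote from a paragraph
const tree = doc.toComponentTree();
const body = tree.children.find((node) => node.type === "body");

body.children.push(
  createVNode({
    type: "paragraph",
    props: { text: "Body text" },
    children: [
      createVNode({
        type: "run",
        props: { text: "Body text" },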
        children: [],
      }),
      createVNode({
        type: "run",
        props: {
          style: { vertAlign: "superscript" },
        },
        children: [
          createVNode({
            type: "footnoteReference",
            props: { id: String(fnId) },
            children: [],
          }),
        ],
      }),
    ],
  }),
);

await doc.patch(tree);
await doc.saveAs("./sample.modified.docx");

Accessors for document parts and common nodes:

doc.getParts();
doc.getBody();
doc.getHeaders();
doc.getFooters();
doc.getParagraphs();
doc.getParagraph(0);
doc.getTables();
doc.getTextBoxes();
doc.getFootnotes();
doc.getEndnotes();
doc.getComments();

doc.replaceAll(searchValue, replacement, options?)

Replaces paragraph text across the whole document.

  • searchValue: string | RegExp
  • replacement: string | Function
  • options.partTypes?: string[]

doc.replaceAll("event", "themed event");
doc.replaceAll(/2025/g, "2026");
doc.replaceAll("header text", "new header text", { partTypes: ["header"] });

doc.extractHtml(options?)

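Extracts the full document content as simple HTML.

  • options.partTypes?: string[]: which parts to extract; defaults to ["body"]; accepted values are "body", "headers", "footers"

Output format:

  • Headings: <h1> to <h6> (detected automatically via resolveHeadingLevel)
  • Body text: <p>
  • Bold / italic / underline / strikethrough: <strong> / <em> / <u> / <s>
  • Inline formulas: <span class="math">...</span>
  • Block formulas: <div class="math-block">...</div>
  • Tables: <table> / <tr> / <td> (colspan supported)
  • Images: <img alt="..." />
  • Line breaks: <br>

const fs = require("fs");

const html = doc.extractHtml();
fs.writeFileSync("output.html", html);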
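
To illustrate the mapping in the output format list, here is a rough sketch of how run style flags could translate into inline tags. It is not the library's implementation. escapeHtml, runToHtml, and paragraphToHtml are hypothetical names, and clamping heading levels above 6 to h6 is an assumption made for this sketch only.

```javascript
// Hypothetical sketch, not the library's extractHtml implementation.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Wraps run text in inline tags based on the documented run style fields.
function runToHtml(text, style = {}) {
  let html = escapeHtml(text);
  if (style.underline) html = `<u>${html}</u>`;
  if (style.italic) html = `<em>${html}</em>`;
  if (style.bold) html = `<strong>${html}</strong>`;
  return html;
}

// Renders a paragraph as a heading when it has a heading level, else as <p>.
function paragraphToHtml(runs, headingLevel = null) {
  const inner = runs.map((run) => runToHtml(run.text, run.style)).join("");
  // HTML only defines h1 to h6, so deeper levels are clamped here (assumption).
  const tag = headingLevel ? `h${Math.min(headingLevel, 6)}` : "p";
  return `<${tag}>${inner}</${tag}>`;
}

console.log(
  paragraphToHtml([{ text: "a < b", style: { bold: true } }, { text: " ok" }], 1)
);
// → <h1><strong>a &lt; b</strong> ok</h1>
```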

doc.resolveHeadingLevel(styleId)

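Resolves the heading level from a styleId.

  • styleId: string, the styleId of the paragraph style
  • Returns: 1 to 9 (the heading level), or null

Resolution order:

  1. outlineLevel in the style inheritance chain (most reliable)
  2. Style name match (supports "标题 1" to "标题 9" and "heading 1" to "heading 9")
  3. Direct styleId match (Heading1 to Heading9)

doc.resolveHeadingLevel("Heading1"); // 1
doc.resolveHeadingLevel("1");        // 1 (Chinese-language Word)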
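
The three-step resolution order can be sketched in plain JavaScript as follows. This is an illustration under stated assumptions, not the library's code: resolveHeadingLevelSketch is a hypothetical name, the styles map shape is invented for the example, and outlineLevel is assumed to hold the 0-based w:outlineLvl value (0 meaning heading 1).

```javascript
// Hypothetical sketch, not the library's resolveHeadingLevel.
// styles maps styleId -> { name, basedOn, outlineLevel }.
function resolveHeadingLevelSketch(styles, styleId) {
  // 1. Walk the basedOn chain looking for an outline level.
  const seen = new Set();
  let id = styleId;
  while (id && styles[id] && !seen.has(id)) {
    seen.add(id); // guards against cyclic basedOn references
    const level = styles[id].outlineLevel;
    if (Number.isInteger(level) && level >= 0 && level <= 8) return level + 1;
    id = styles[id].basedOn;
  }

  // 2. Match the style name: "heading N" or "标题 N".
  const name = styles[styleId] ? styles[styleId].name || "" : "";
  const byName = /^(?:heading|标题)\s*([1-9])$/i.exec(name.trim());
  if (byName) return Number(byName[1]);

  // 3. Match the styleId itself: Heading1 to Heading9.
  const byId = /^Heading([1-9])$/i.exec(String(styleId));
  if (byId) return Number(byId[1]);

  return null;
}

const sampleStyles = {
  a: { name: "Normal", basedOn: null },
  "1": { name: "heading 1", basedOn: "a", outlineLevel: 0 },
  "2": { name: "标题 2", basedOn: "a" },
};

console.log(resolveHeadingLevelSketch(sampleStyles, "1")); // → 1
console.log(resolveHeadingLevelSketch(sampleStyles, "2")); // → 2
console.log(resolveHeadingLevelSketch(sampleStyles, "a")); // → null
```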

paragraph.getHeadingLevel()

Convenience method that returns the paragraph's heading level directly.

  • Returns: 1 to 9, or null

for (const p of doc.getBody().getParagraphs()) {
  const level = p.getHeadingLevel();
  if (level) {
    console.log(`Heading ${level}: ${p.getText()}`);
  } else {
    console.log(`Body: ${p.getText()}`);
  }
}

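Controller API

The legacy controller API is still available, but internally it has been migrated to virtual tree patches.

DocumentPartController

Common ways to obtain one: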

const body = doc.getBody();
const header = doc.getHeaders()[0];
const footer = doc.getFooters()[0];

Available methods:

body.toComponentTree();
body.getParagraphs();
body.getParagraph(0);
body.getTables();
body.getTable(0);
body.getTextBoxes();
body.replaceAll("old term", "new term");

For comments / footnotes / endnotes parts, you can also:

const commentsPart = doc.getParts().find((part) => part.type === "comments");
commentsPart.getEntries();
commentsPart.getEntries({ includeSpecial: true });

ParagraphController

const paragraph = doc.getBody().getParagraph(0);

paragraph.getText();
paragraph.setText("New paragraph content");
paragraph.replace("old term", "new term");
paragraph.replaceAll("youth", "young students");
paragraph.getStyle();
paragraph.setStyle({ alignment: "center" });
paragraph.patchStyle({ spacing: { after: "240" } });
paragraph.getRuns();
paragraph.getRun(0);

RunController

const run = doc.getBody().getParagraph(0).getRun(0);

run.getText();
run.getStyle();
run.setStyle({
  bold: true,
  color: "FF0000",
  fontSize: "28",
});
run.patchStyle({
  italic: true,
  underline: "single",
});

Style migration

const paragraphA = doc.getBody().getParagraph(0);
const paragraphB = doc.getBody().getParagraph(1);

paragraphB.copyStyleFrom(paragraphA);

const runA = paragraphA.getRun(0);
const runB = paragraphB.getRun(0);
runB.copyStyleFrom(runA);

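Style profile

A style profile extracts all named style definitions from a document (from word/styles.xml) as JSON. The same JSON format can be written back to a document to modify its style definitions.

Extracting a style profile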

const profile = doc.getStyleProfile();

console.log(profile.defaults);
// { paragraphStyle: { spacing: { after: "160", line: "278", lineRule: "auto" } }, runStyle: { fontSize: "22" } }

console.log(profile.styles["1"]);
// { name: "heading 1", type: "paragraph", basedOn: "a", paragraphStyle: {...}, runStyle: { fontSize: "48", color: "2F5496" } }

JSON format

{
  "defaults": {
    "paragraphStyle": { "spacing": { "after": "160", "line": "278", "lineRule": "auto" } },
    "runStyle": { "fontSize": "22", "fontFamily": { "asciiTheme": "minorHAnsi" } }
  },
  "styles": {
    "a": {
      "name": "Normal",
      "type": "paragraph",
      "basedOn": null,
      "paragraphStyle": {},
      "runStyle": {}
    },
    "1": {
      "name": "heading 1",
      "type": "paragraph",
      "basedOn": "a",
      "paragraphStyle": { "keepNext": true, "spacing": { "before": "480", "after": "80" } },
      "runStyle": { "fontSize": "48", "color": "2F5496" }
    }
  }
}

Applying a style profile

// Extract from document A
const profileA = docA.getStyleProfile();

// Apply to document B
docB.applyStyleProfile(profileA);
await docB.saveAs("output.docx");

You can also modify only some of the styles:

doc.applyStyleProfile({
  styles: {
    "1": {
      name: "heading 1",
      type: "paragraph",
      basedOn: "a",
      runStyle: { fontSize: "56", color: "FF0000" },
    },
  },
});

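Cross-document format migration

Style IDs can differ between documents. For example, heading 1 may be "1" in document A and "2" in document B. In that case, match styles by name rather than applying them directly by styleId: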

const srcProfile = docA.getStyleProfile();
const dstProfile = docB.getStyleProfile();

const mappedProfile = { defaults: srcProfile.defaults, styles: {} };

for (const [srcId, srcStyle] of Object.entries(srcProfile.styles)) {
  // Find the style with the same name in the target document
  for (const [dstId, dstStyle] of Object.entries(dstProfile.styles)) {
    if (dstStyle.name === srcStyle.name && dstStyle.type === srcStyle.type) {
      mappedProfile.styles[dstId] = {
        name: srcStyle.name,
        type: srcStyle.type,
        basedOn: dstStyle.basedOn,   // keep the target document's inheritance
        paragraphStyle: srcStyle.paragraphStyle,
        runStyle: srcStyle.runStyle,
      };
      break;
    }
  }
}

docB.applyStyleProfile(mappedProfile);
await docB.saveAs("output.docx");

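Key points:

  • basedOn uses the target document's value, preserving the target document's own inheritance chain
  • paragraphStyle and runStyle use the source document's values, which is what migrates the formatting
  • Styles that exist in the source document but not in the target document are ignored (they are not created automatically)

Resolving effective styles

Resolves the final result of a named style (merging docDefaults → basedOn chain → the style's own properties):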

const effective = doc.resolveEffectiveStyle("1");
// { paragraphStyle: { keepNext: true, spacing: { before: "480", after: "80", ... } },
//   runStyle: { fontSize: "48", color: "2F5496", ... } }
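
The merge order (docDefaults first, then the basedOn chain from the root ancestor down, then the style itself) can be sketched over the style profile JSON format documented earlier. This is a hedged illustration, not the library's implementation. isPlainObject, deepMerge, and resolveEffectiveStyleSketch are hypothetical names.

```javascript
// Hypothetical sketch, not the library's resolveEffectiveStyle.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Recursively merges plain objects; values from override win.
function deepMerge(base, override) {
  const out = { ...base };
  for (const [key, value] of Object.entries(override || {})) {
    out[key] = isPlainObject(value)
      ? deepMerge(isPlainObject(out[key]) ? out[key] : {}, value)
      : value;
  }
  return out;
}

function resolveEffectiveStyleSketch(profile, styleId) {
  // Collect the chain from the style itself up to its root ancestor.
  const chain = [];
  const seen = new Set();
  let id = styleId;
  while (id && profile.styles[id] && !seen.has(id)) {
    seen.add(id); // guards against cyclic basedOn references
    chain.push(profile.styles[id]);
    id = profile.styles[id].basedOn;
  }

  // Start from the defaults, apply ancestors first and the style itself last.
  const defaults = profile.defaults || {};
  let paragraphStyle = deepMerge({}, defaults.paragraphStyle);
  let runStyle = deepMerge({}, defaults.runStyle);
  for (const style of chain.reverse()) {
    paragraphStyle = deepMerge(paragraphStyle, style.paragraphStyle);
    runStyle = deepMerge(runStyle, style.runStyle);
  }
  return { paragraphStyle, runStyle };
}

const sampleProfile = {
  defaults: {
    paragraphStyle: { spacing: { after: "160", line: "278", lineRule: "auto" } },
    runStyle: { fontSize: "22", fontFamily: { asciiTheme: "minorHAnsi" } },
  },
  styles: {
    a: { name: "Normal", type: "paragraph", basedOn: null, paragraphStyle: {}, runStyle: {} },
    "1": {
      name: "heading 1",
      type: "paragraph",
      basedOn: "a",
      paragraphStyle: { keepNext: true, spacing: { before: "480", after: "80" } },
      runStyle: { fontSize: "48", color: "2F5496" },
    },
  },
};

console.log(JSON.stringify(resolveEffectiveStyleSketch(sampleProfile, "1"), null, 2));
```

Nested objects such as spacing are merged field by field, so the heading keeps line and lineRule from the defaults while overriding after.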

Listing all named styles

const named = doc.getNamedStyles();
// [ { styleId: "a", name: "Normal", type: "paragraph", basedOn: null }, ... ]

TableController

const table = doc.getTables()[0];

table.getRows();
table.getRow(0);
table.getCell(1, 2);

table.fill(
  [
    ["Event name", "Date", "Owner", "Notes"],
    ["Sharing session", "2026-03-24", "Zhang San", "Confirmed"],
  ],
  { startRow: 0 },
);

TableRowController

const row = doc.getTables()[0].getRow(0);

row.getCells();
row.getCell(0);

TableCellController

const cell = doc.getTables()[0].getCell(1, 0);

cell.getParagraphs();
cell.getParagraph(0);
cell.getText();
cell.setText("New cell content");

TextBoxController

const textBox = doc.getTextBoxes()[0];

textBox.getParagraphs();
textBox.getText();

StructuredEntryController

Used for comment / footnote / endnote entries.

const comment = doc.getComments()[0];

comment.getParagraphs();
comment.getText();
comment.replaceAll("original text", "new text");

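Using the virtual tree API

The snippets below use await, so they need to run inside an async function, as main() does in the first example.

1. Modify existing paragraph text

const { loadDocx } = require("docx-edit");

const doc = await loadDocx("./sample.docx");
const tree = doc.toComponentTree();
const body = tree.children.find((node) => node.type === "body");

body.children[0].props.text = "This is the updated first paragraph";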

await doc.patch(tree);
await doc.saveAs("./sample.modified.docx");

2. Insert a new paragraph

const { createVNode, loadDocx } = require("docx-edit");

const doc = await loadDocx("./sample.docx");
const tree = doc.toComponentTree();
const body = tree.children.find((node) => node.type === "body");

body.children.splice(
  1,
  0,
  createVNode({
    type: "paragraph",
    props: { text: "This is the newly inserted paragraph" },
    children: [],
  }),
);

await doc.patch(tree);
await doc.saveAs("./sample.modified.docx");

3. Delete a paragraph

const doc = await loadDocx("./sample.docx");
const tree = doc.toComponentTree();
const body = tree.children.find((node) => node.type === "body");

body.children.splice(0, 1);

await doc.patch(tree);

4. Use key for stable reordering

If you frequently reorder sibling nodes, setting key is recommended:

const tree = doc.toComponentTree();
const body = tree.children.find((node) => node.type === "body");

body.children[0].key = "first";
body.children[1].key = "second";
body.children[2].key = "third";

await doc.patch(tree);

const nextTree = doc.toComponentTree();
const nextBody = nextTree.children.find((node) => node.type === "body");
nextBody.children = [nextBody.children[2], nextBody.children[0], nextBody.children[1]];

await doc.patch(nextTree);
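
To show why keys help, the sketch below reconciles two ordered lists of sibling keys into REMOVE / INSERT / MOVE operations. It is a simplified illustration, not the library's patch engine: diffKeyedChildren is a hypothetical name, keys are assumed to be unique among siblings, and the move list it produces is correct but not necessarily minimal.

```javascript
// Hypothetical sketch of keyed child reconciliation, not the library's engine.
// Returns the operations that turn oldKeys into newKeys.
function diffKeyedChildren(oldKeys, newKeys) {
  const ops = [];
  const wanted = new Set(newKeys);

  // Drop children whose key no longer appears in the target list.
  const working = [];
  for (const key of oldKeys) {
    if (wanted.has(key)) working.push(key);
    else ops.push({ type: "REMOVE", key });
  }

  // Walk the target order, inserting or moving until each slot matches.
  newKeys.forEach((key, index) => {
    if (working[index] === key) return;
    const from = working.indexOf(key);
    if (from === -1) {
      working.splice(index, 0, key);
      ops.push({ type: "INSERT", key, index });
    } else {
      working.splice(from, 1);
      working.splice(index, 0, key);
      ops.push({ type: "MOVE", key, index });
    }
  });
  return ops;
}

// The same reorder as in the example above: third moves to the front.
console.log(diffKeyedChildren(["first", "second", "third"], ["third", "first", "second"]));
// → [ { type: 'MOVE', key: 'third', index: 0 } ]
```

Without keys, a positional comparison would see three changed slots here; with keys, the same reorder reduces to a single move, which is presumably why the README recommends them.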

5. Modify paragraph style

const tree = doc.toComponentTree();
const body = tree.children.find((node) => node.type === "body");

body.children[0].props.style = {
  styleId: "BodyText",
  alignment: "center",
  spacing: {
    before: "120",
    after: "240",
  },
};

await doc.patch(tree);

6. Modify run style

const tree = doc.toComponentTree();
const body = tree.children.find((node) => node.type === "body");
const firstRun = body.children[0].children[0];

firstRun.props.style = {
  bold: true,
  italic: true,
  color: "FF0000",
  underline: "single",
};

await doc.patch(tree);

7. Migrate styles between components

const tree = doc.toComponentTree();
const body = tree.children.find((node) => node.type === "body");

const sourceParagraphStyle = body.children[0].props.style;
body.children[1].props.style = sourceParagraphStyle;

const sourceRunStyle = body.children[0].children[0].props.style;
body.children[1].children[0].props.style = sourceRunStyle;

await doc.patch(tree);

8. Modify headers, footers, comments, and text boxes

const tree = doc.toComponentTree();

const header = tree.children.find((node) => node.type === "header");
const footer = tree.children.find((node) => node.type === "footer");
const comments = tree.children.find((node) => node.type === "comments");

header.children[0].props.text = "New header";
footer.children[0].props.text = "New footer";
comments.children[0].children[0].props.text = "New comment text";

await doc.patch(tree);

9. Create a paragraph with a footnote reference

const { createVNode, loadDocx } = require("docx-edit");

const doc = await loadDocx("./sample.docx");

// Create the footnote
const fnId = doc.addFootnote("Footnote explanation text");

// Build a paragraph with a footnote reference
const tree = doc.toComponentTree();
const body = tree.children.find((node) => node.type === "body");

body.children.push(
  createVNode({
    type: "paragraph",
    props: { text: "This sentence has a footnote[[FOOTNOTE_REF:" + fnId + "]]." },
    children: [
      createVNode({
        type: "run",
        props: { text: "This sentence has a footnote" },
        children: [],
      }),
      createVNode({
        type: "run",
        props: { style: { vertAlign: "superscript" } },
        children: [
          createVNode({
            type: "footnoteReference",
            props: { id: String(fnId) },
            children: [],
          }),
        ],
      }),
      createVNode({
        type: "run",
        props: { text: "." },
        children: [],
      }),
    ],
  }),
);

await doc.patch(tree);
await doc.saveAs("./sample.modified.docx");

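Note: paragraph.props.text includes the placeholder so that text matching (such as replaceAll) works, but the actual XML structure is determined by the run nodes in children. A footnoteReference node must be a child of a run.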

About createVNode()

createVNode() creates new nodes manually.

const node = createVNode({
  type: "paragraph",
  key: "intro",
  props: { text: "Intro paragraph" },
  children: [],
});

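Parameters:

  • type: the node type
  • key: optional; recommended when siblings need stable reordering
  • props: the node's properties
  • children: array of child nodes

Notes:

  • The root node must be document
  • Existing parts must stay in place when patching; do not remove part roots such as body / header / footer / comments
  • New nodes must follow the currently supported parent-child relationships
  • For style changes, write directly to paragraph.props.style and run.props.style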
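
The first two notes can be checked before calling doc.patch(). The helper below is a hypothetical pre-flight check, not part of docx-edit. It assumes part roots are the direct children of the document node, as in the tree examples in this README.

```javascript
// Hypothetical pre-flight check, not part of docx-edit.
// Compares the part roots of the current tree with the tree about to be patched
// and returns a list of problems (empty when nothing looks wrong).
function checkPatchTarget(currentTree, nextTree) {
  const problems = [];
  if (!nextTree || nextTree.type !== "document") {
    problems.push("root node must be of type document");
    return problems;
  }

  // Count direct children of the root by type (body, header, footer, ...).
  const countParts = (tree) => {
    const counts = new Map();
    for (const child of tree.children || []) {
      counts.set(child.type, (counts.get(child.type) || 0) + 1);
    }
    return counts;
  };

  const before = countParts(currentTree);
  const after = countParts(nextTree);
  for (const [type, count] of before) {
    if ((after.get(type) || 0) < count) {
      problems.push(`part root removed: ${type}`);
    }
  }
  return problems;
}

const current = {
  type: "document",
  children: [
    { type: "body", children: [] },
    { type: "header", children: [] },
  ],
};
const next = { type: "document", children: [{ type: "body", children: [] }] };

console.log(checkPatchTarget(current, next));
// → [ 'part root removed: header' ]
```

In practice, currentTree could be a fresh doc.toComponentTree() taken just before patching, and nextTree the modified copy.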

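Testing

Run the example script:

npm run example

Run the tests:

npm test

Current test coverage:

  • Whole-paragraph reading and write-back
  • tab / break preservation
  • Virtual tree text patch
  • Virtual tree structural insertion, deletion, and reordering
  • Mixing table patch with fill()
  • header / footer / comment / text-box persistence
  • Paragraph and run style parsing
  • Adding, modifying, and clearing styles
  • Style migration between components
  • word/styles.xml parsing (docDefaults, named styles, theme fonts)
  • Style inheritance chain resolution (basedOn chain + docDefaults merge)
  • Style profile JSON export / import / save and write-back
  • Regression tests on real sample documents
  • Full-document HTML extraction (heading levels, math formulas, tables, images)
  • Heading level resolution (Chinese and English style IDs)
  • Superscript / subscript style reading and writing
  • Footnote reference (footnoteReference) reading, writing, and round-trip
  • Endnote reference (endnoteReference) reading and writing
  • Math formula (math) reading, writing, and round-trip
  • Creating footnotes (addFootnote), for documents with and without existing footnotes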

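Known limitations

  • Only common text-related OOXML nodes are covered; this is not a complete Word OOXML implementation
  • Style modeling mainly covers commonly used paragraph and run properties
  • Unknown nodes are preserved where possible rather than understood and edited at a fine-grained level
  • doc.patch(nextTree) expects the target tree to have evolved from the current tree; arbitrary invalid structures are not guaranteed to work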