@markitdownjs/docx
v0.2.0
DOCX converter for MarkItDownJS
DOCX (Word) to AST converter for MarkItDownJS. Parses headings, paragraphs, lists, tables, inline formatting, and document metadata from .docx files.
Install
npm install @markitdownjs/docx
Usage
import { MarkItDown } from "@markitdownjs/core";
import { DocxConverter } from "@markitdownjs/docx";
const parser = new MarkItDown();
parser.registerConverter(new DocxConverter());
const result = await parser.convert({
  source: docxBuffer,
  mimeType: "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
});
console.log(result.markdown);
Key Exports
| Export | Description |
|---|---|
| DocxConverter | Converter plugin — register with MarkItDown |
What Gets Extracted
- Headings (mapped from Word heading styles h1–h6)
- Paragraphs with bold, italic, underline, and strikethrough inline formatting
- Ordered and unordered lists (including nested)
- Tables with header row detection
- Document core properties: title, author, description, created/modified dates
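To make the heading item above concrete, here is a minimal sketch of how a Word heading style could be mapped to a Markdown heading level. It is not the converter's actual implementation: the function names `headingLevelFromStyle` and `renderParagraph` are hypothetical, and the sketch assumes Word's built-in English style IDs (`Heading1` through `Heading6`).

```typescript
// Illustrative only: NOT the internals of @markitdownjs/docx.
// Assumption: paragraphs carry Word's built-in style IDs "Heading1".."Heading6".
type HeadingLevel = 1 | 2 | 3 | 4 | 5 | 6;

/** Returns the heading level for a style ID, or null for non-heading styles. */
function headingLevelFromStyle(styleId: string): HeadingLevel | null {
  // Accept "Heading1" as well as the display-name form "Heading 1".
  const match = /^Heading\s?([1-6])$/i.exec(styleId.trim());
  return match ? (Number(match[1]) as HeadingLevel) : null;
}

/** Renders one paragraph as a Markdown heading or as plain text. */
function renderParagraph(styleId: string, text: string): string {
  const level = headingLevelFromStyle(styleId);
  return level === null ? text : `${"#".repeat(level)} ${text}`;
}
```

Styles outside the h1–h6 range (for example `Heading7` or `Normal`) fall through to plain paragraph text in this sketch.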
Accepted MIME Types
- application/vnd.openxmlformats-officedocument.wordprocessingml.document
- application/msword (.doc files are not supported; convert to .docx first)
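Because legacy .doc input is not supported, it can help to check the file signature before calling the converter. The sketch below is an optional pre-check and not part of @markitdownjs/docx: `sniffWordFormat`, `toDocxInput`, and `DOCX_MIME` are hypothetical names. It relies only on two well-known facts: a .docx file is a ZIP archive (signature `PK\x03\x04`), and a binary .doc file is an OLE2 compound file. A ZIP signature does not prove the file is a valid .docx, so treat this as a cheap early filter.

```typescript
// Illustrative pre-check: not an API of @markitdownjs/docx.
const DOCX_MIME =
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

type WordFormat = "zip-container" | "legacy-doc" | "unknown";

// ZIP local file header signature ("PK\x03\x04"); .docx files are ZIP archives.
const ZIP_MAGIC = [0x50, 0x4b, 0x03, 0x04];
// OLE2 compound file signature used by legacy binary .doc files.
const OLE2_MAGIC = [0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1];

function startsWith(bytes: Uint8Array, magic: number[]): boolean {
  return bytes.length >= magic.length && magic.every((b, i) => bytes[i] === b);
}

/** Classifies raw bytes by their leading file signature. */
function sniffWordFormat(bytes: Uint8Array): WordFormat {
  if (startsWith(bytes, ZIP_MAGIC)) return "zip-container";
  if (startsWith(bytes, OLE2_MAGIC)) return "legacy-doc";
  return "unknown";
}

/** Builds a { source, mimeType } input object, rejecting non-ZIP input early. */
function toDocxInput(bytes: Uint8Array): { source: Uint8Array; mimeType: string } {
  const format = sniffWordFormat(bytes);
  if (format === "legacy-doc") {
    throw new Error("Legacy .doc input is not supported; convert it to .docx first.");
  }
  if (format === "unknown") {
    throw new Error("Input does not look like a .docx (ZIP) file.");
  }
  return { source: bytes, mimeType: DOCX_MIME };
}
```

Assuming `convert` accepts raw bytes as its `source`, as the `docxBuffer` name in Usage suggests, the result could be passed along as `await parser.convert(toDocxInput(bytes))`.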
Part of the MarkItDownJS monorepo.
