dot-pages-document-parser
v1.0.1
Simple Apple Pages document parser powered by LlamaParse v2
Parse Apple Pages files to markdown or text. Powered by LlamaParse v2. Zero dependencies, Node 18+.
Handles .pages files out of the box, plus PDF, DOCX, PPTX, XLSX, and other common formats.
Install
npm install dot-pages-document-parser
Quick Start
Set your API key:
export LLAMA_CLOUD_API_KEY=llx-...
Parse a Pages file:
import { parse } from "dot-pages-document-parser";
const result = await parse("./document.pages");
console.log(result.markdown);
Advanced Usage
import fs from "node:fs";
import { dot-pages-document-parser } from "dot-pages-document-parser";
const parser = new dot-pages-document-parser({ apiKey: "llx-..." });
// Parse a Pages file with options
const result = await parser.parse("./report.pages", {
  tier: "agentic",
  processing_options: { language: "fr" },
});
// Parse a buffer (e.g. from an upload)
const buffer = fs.readFileSync("./presentation.pages");
const bufferResult = await parser.parse(buffer, {
  fileName: "presentation.pages",
});
Supported Formats
| Extension | Format |
|-----------|--------|
| .pages | Apple Pages |
| .pdf | PDF |
| .docx / .doc | Microsoft Word |
| .pptx / .ppt | Microsoft PowerPoint |
| .xlsx / .xls | Microsoft Excel |
| .html | HTML |
| .csv | CSV |
| .txt | Plain text |
| .png / .jpg / .tiff / .bmp / .webp / .gif | Images (OCR) |
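A minimal sketch of a pre-flight check built from the table above, useful for rejecting unsupported files before spending an API call. `isSupportedFile` and `SUPPORTED_EXTENSIONS` are hypothetical names, not exports of this package, and the service may accept formats beyond the ones listed here.

```typescript
// Extensions copied from the Supported Formats table.
// Assumption: the table is treated as the full allow-list for this check.
const SUPPORTED_EXTENSIONS = new Set([
  ".pages", ".pdf", ".docx", ".doc", ".pptx", ".ppt", ".xlsx", ".xls",
  ".html", ".csv", ".txt", ".png", ".jpg", ".tiff", ".bmp", ".webp", ".gif",
]);

// Hypothetical helper: returns true when the path ends in a listed extension.
function isSupportedFile(filePath: string): boolean {
  // Strip any directory part, handling both "/" and "\" separators.
  const lastSeparator = Math.max(
    filePath.lastIndexOf("/"),
    filePath.lastIndexOf("\\"),
  );
  const base = filePath.slice(lastSeparator + 1);
  const dot = base.lastIndexOf(".");
  // No extension, or a dotfile such as ".gitignore".
  if (dot <= 0) return false;
  // Compare case-insensitively so "NOTES.PDF" matches ".pdf".
  return SUPPORTED_EXTENSIONS.has(base.slice(dot).toLowerCase());
}
```

Calling `isSupportedFile("./report.pages")` before `parse` lets a caller fail fast with a local error message.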
API
parse(input, options?)
Uploads a document, waits for parsing to complete, and returns the result.
Input: file path (string) or file contents (Buffer | Uint8Array)
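The two input shapes can be told apart at runtime with a `typeof` check, since a Node `Buffer` is a subclass of `Uint8Array`. The sketch below only illustrates that distinction; `normalizeInput` and its return type are hypothetical and say nothing about how the package handles input internally. The `"document.pages"` fallback mirrors the documented `fileName` default.

```typescript
// Accepted input shapes, as documented: a path string or raw bytes.
// Node's Buffer extends Uint8Array, so it is covered by the second member.
type ParseInput = string | Uint8Array;

// Hypothetical normalized form used only in this sketch.
type NormalizedInput =
  | { kind: "path"; path: string }
  | { kind: "bytes"; bytes: Uint8Array; fileName: string };

// Hypothetical helper: tag the input so later code can branch on `kind`.
function normalizeInput(
  input: ParseInput,
  fileName: string = "document.pages",
): NormalizedInput {
  if (typeof input === "string") {
    // A string is treated as a file path.
    return { kind: "path", path: input };
  }
  // Raw bytes carry no name, so a filename hint travels with them.
  return { kind: "bytes", bytes: input, fileName };
}
```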
Options:
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| tier | string | "fast" | Parsing tier: fast, cost_effective, agentic, agentic_plus |
| version | string | "latest" | API version |
| apiKey | string | env var | Override API key |
| expand | string[] | ["markdown_full", "text_full"] | Fields to expand |
| pollIntervalMs | number | 1000 | Polling interval in ms |
| timeoutMs | number | 300000 | Max wait time in ms |
| fileName | string | "document.pages" | Filename hint for buffer input |
| mimeType | string | auto-detected | MIME type for buffer input |
| signal | AbortSignal | — | Cancellation signal |
| processing_options | object | — | LlamaParse processing options (language, disable_ocr, etc.) |
| agentic_options | object | — | Agentic options (custom_prompt) |
| page_ranges | object | — | Page range options (max_pages, target_pages) |
| disable_cache | boolean | — | Disable document caching |
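To show how `pollIntervalMs`, `timeoutMs`, and `signal` might interact, here is a generic polling loop. It is a sketch under stated assumptions, not the package's implementation: `pollUntilDone`, `JobState`, and the `check` callback are hypothetical, and the defaults are taken from the table above.

```typescript
// Hypothetical job state: either finished with a value, or still pending.
type JobState<T> = { done: true; value: T } | { done: false };

interface PollOptions {
  pollIntervalMs?: number; // default 1000, as in the options table
  timeoutMs?: number; // default 300000, as in the options table
  signal?: AbortSignal; // optional cancellation signal
}

// Hypothetical helper: call `check` until it reports done, the timeout
// elapses, or the signal is aborted.
async function pollUntilDone<T>(
  check: () => Promise<JobState<T>>,
  { pollIntervalMs = 1000, timeoutMs = 300000, signal }: PollOptions = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (signal?.aborted) throw new Error("Polling aborted");
    const state = await check();
    if (state.done) return state.value;
    // Give up if the next sleep would run past the deadline.
    if (Date.now() + pollIntervalMs > deadline) {
      throw new Error(`Timed out after ${timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, pollIntervalMs));
  }
}
```

In a real client, `check` would query the job status for the uploaded document. A shorter `pollIntervalMs` returns results sooner at the cost of more requests.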
Returns: ParseResult
interface ParseResult {
  markdown: string; // Full markdown output
  text: string; // Full text output
  job: JobResponse; // Job metadata (id, status, etc.)
  _raw: object; // Raw API response
}
new dot-pages-document-parser(config?)
Create an instance with explicit configuration.
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| apiKey | string | LLAMA_CLOUD_API_KEY | API key |
| baseUrl | string | https://api.cloud.llamaindex.ai | API base URL |
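The defaults in this table suggest a resolution order: an explicit option wins, then the environment, then a built-in default. A minimal sketch of that order follows, assuming a missing key should raise an error. `resolveConfig` and both interfaces are hypothetical, not part of the package.

```typescript
interface ParserConfig {
  apiKey?: string;
  baseUrl?: string;
}

interface ResolvedConfig {
  apiKey: string;
  baseUrl: string;
}

// Default base URL as listed in the config table.
const DEFAULT_BASE_URL = "https://api.cloud.llamaindex.ai";

// Hypothetical helper: explicit config first, then the environment.
// `env` is passed in (for example process.env) so the function stays pure.
function resolveConfig(
  config: ParserConfig = {},
  env: Record<string, string | undefined> = {},
): ResolvedConfig {
  const apiKey = config.apiKey ?? env["LLAMA_CLOUD_API_KEY"];
  if (!apiKey) {
    // Assumption: failing early is preferable to sending an unauthenticated request.
    throw new Error("Missing API key: pass apiKey or set LLAMA_CLOUD_API_KEY");
  }
  return { apiKey, baseUrl: config.baseUrl ?? DEFAULT_BASE_URL };
}
```

Passing the environment as an argument keeps the function easy to test and avoids reading global state at import time.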
License
MIT
