odtparser

v1.0.0, published 4 months ago

Simple ODT (OpenDocument Text) parser powered by LlamaParse v2

Maintainer: hexapode

Keywords: odt, opendocument, libreoffice, openoffice, parser, document, llamaparse, ocr, markdown, converter

Parse ODT (OpenDocument Text) files to markdown or text. Powered by LlamaParse v2. Zero dependencies, Node 18+.

Handles .odt, .ott, .odp, .ods, and other OpenDocument formats out of the box, as well as PDF, DOCX, PPTX, XLSX, and other common formats.

Install

npm install odtparser

Quick Start

Set your API key:

export LLAMA_CLOUD_API_KEY=llx-...

Parse an ODT file:

import { parse } from "odtparser";

const result = await parse("./document.odt");
console.log(result.markdown);

Advanced Usage

import fs from "node:fs";
import { ODTParser } from "odtparser";

const parser = new ODTParser({ apiKey: "llx-..." });

// Parse an ODT file with options
const result = await parser.parse("./report.odt", {
  tier: "agentic",
  processing_options: { language: "fr" },
});

// Parse a buffer (e.g. from an upload)
const buffer = fs.readFileSync("./document.odt");
const bufferResult = await parser.parse(buffer, {
  fileName: "document.odt",
});
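
When several documents need parsing, the calls above can be wrapped in a small batch helper. The sketch below is not part of odtparser: `parseAll` and its `limit` parameter are hypothetical names, and the only assumption about the parser is that it exposes `parse(input, options)` and returns an object with a `markdown` field, as shown in this README.

```javascript
// Hypothetical helper (not provided by odtparser): parse several files with
// at most `limit` requests in flight. `parser` is any object exposing
// parse(input, options), for example an ODTParser instance.
async function parseAll(parser, paths, options = {}, limit = 3) {
  const results = new Array(paths.length);
  let next = 0;

  // Each worker claims the next unprocessed index until the list is exhausted.
  async function worker() {
    while (next < paths.length) {
      const i = next++;
      try {
        const result = await parser.parse(paths[i], options);
        results[i] = { path: paths[i], ok: true, markdown: result.markdown };
      } catch (error) {
        // Record the failure and continue so one bad file does not stop the batch.
        results[i] = { path: paths[i], ok: false, error };
      }
    }
  }

  const workers = Array.from({ length: Math.min(limit, paths.length) }, worker);
  await Promise.all(workers);
  return results;
}
```

Results come back in the same order as `paths`, so a caller can check `ok` on each entry and retry only the failures.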

Supported Formats

| Extension | Format |
|-----------|--------|
| .odt | OpenDocument Text |
| .ott | OpenDocument Text Template |
| .odp | OpenDocument Presentation |
| .otp | OpenDocument Presentation Template |
| .ods | OpenDocument Spreadsheet |
| .ots | OpenDocument Spreadsheet Template |
| .odg | OpenDocument Graphics |
| .pdf | PDF |
| .docx / .doc | Microsoft Word |
| .pptx / .ppt | Microsoft PowerPoint |
| .xlsx / .xls | Microsoft Excel |
| .html | HTML |
| .csv | CSV |
| .txt | Plain text |
| .png / .jpg / .tiff / .bmp / .webp / .gif | Images (OCR) |
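
Since every parse call uploads the file, it can be worth rejecting unsupported files locally first. The following is a minimal sketch of such a check, assuming the extension list in the table is the complete set; `isSupportedFile` and `SUPPORTED_EXTENSIONS` are hypothetical names, not exports of odtparser.

```javascript
// Extensions copied from the Supported Formats table in this README.
// Assumption: the table is exhaustive; the service may accept more.
const SUPPORTED_EXTENSIONS = new Set([
  ".odt", ".ott", ".odp", ".otp", ".ods", ".ots", ".odg",
  ".pdf", ".docx", ".doc", ".pptx", ".ppt", ".xlsx", ".xls",
  ".html", ".csv", ".txt",
  ".png", ".jpg", ".tiff", ".bmp", ".webp", ".gif",
]);

// Hypothetical helper: true when the file name ends in a listed extension.
function isSupportedFile(fileName) {
  // Look only at the last path segment so dots in directory names are ignored.
  const base = fileName.split(/[\\/]/).pop();
  const dot = base.lastIndexOf(".");
  if (dot <= 0) return false; // no extension, or a dotfile such as ".odt"
  return SUPPORTED_EXTENSIONS.has(base.slice(dot).toLowerCase());
}
```

The comparison is case-insensitive, so `REPORT.ODT` and `report.odt` are treated the same way.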

API

`parse(input, options?)`

Uploads a document, waits for parsing to complete, and returns the result.

Input: file path (string) or file contents (Buffer | Uint8Array)

Options:

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| tier | string | "fast" | Parsing tier: fast, cost_effective, agentic, agentic_plus |
| version | string | "latest" | API version |
| apiKey | string | env var | Override API key |
| expand | string[] | ["markdown_full", "text_full"] | Fields to expand |
| pollIntervalMs | number | 1000 | Polling interval in ms |
| timeoutMs | number | 300000 | Max wait time in ms |
| fileName | string | "document.odt" | Filename hint for buffer input |
| mimeType | string | auto-detected | MIME type for buffer input |
| signal | AbortSignal | none | Cancellation signal |
| processing_options | object | none | LlamaParse processing options (language, disable_ocr, etc.) |
| agentic_options | object | none | Agentic options (custom_prompt) |
| page_ranges | object | none | Page range options (max_pages, target_pages) |
| disable_cache | boolean | none | Disable document caching |
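
The `signal` option allows caller-driven cancellation on top of `timeoutMs`. As an illustration, the sketch below wires an `AbortController` to a timer and passes its signal through the options object. `parseWithDeadline` is a hypothetical wrapper; `parseFn` stands in for `parse` or a bound `parser.parse`, and the error the library raises on abort is not documented here, so the caller should not rely on a specific error type.

```javascript
// Hypothetical wrapper (not part of odtparser): call parseFn(input, options)
// and abort it through the `signal` option if it runs longer than `ms`.
async function parseWithDeadline(parseFn, input, ms, options = {}) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    // Pass the caller's options through and attach the abort signal.
    return await parseFn(input, { ...options, signal: controller.signal });
  } finally {
    // Always clear the timer so it does not keep the process alive.
    clearTimeout(timer);
  }
}
```

The same pattern works for user-initiated cancellation: keep a reference to the controller and call `abort()` from a button handler or a shutdown hook instead of a timer.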

Returns: ParseResult

interface ParseResult {
  markdown: string;     // Full markdown output
  text: string;         // Full text output
  job: JobResponse;     // Job metadata (id, status, etc.)
  _raw: object;         // Raw API response
}
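
To show how the fields of a `ParseResult` might be consumed, here is a small sketch that builds a log-friendly summary. `summarize` is a hypothetical helper; the `job.id` and `job.status` names are taken from the interface comment above, and the full shape of `JobResponse` is not documented in this README. Heading detection assumes the markdown output uses `#`-style headings.

```javascript
// Hypothetical helper: reduce a ParseResult to a few numbers for logging.
function summarize(result) {
  // Count whitespace-separated words in the plain-text output.
  const words = result.text.trim().split(/\s+/).filter(Boolean).length;
  // Count lines that look like "#"-style markdown headings (assumption).
  const headings = result.markdown
    .split("\n")
    .filter((line) => /^#{1,6}\s/.test(line)).length;
  return {
    jobId: result.job.id,       // field name taken from the interface comment
    status: result.job.status,  // field name taken from the interface comment
    words,
    headings,
  };
}
```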

`new ODTParser(config?)`

Create an instance with explicit configuration.

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| apiKey | string | LLAMA_CLOUD_API_KEY | API key |
| baseUrl | string | https://api.cloud.llamaindex.ai | API base URL |
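
One way to build this configuration is to read it from the environment and fail early when the key is missing. This is a sketch under assumptions: `buildConfig` is a hypothetical helper, and `LLAMA_CLOUD_BASE_URL` is an illustrative variable name that odtparser itself is not documented to read. Only `LLAMA_CLOUD_API_KEY` and the default base URL come from this README.

```javascript
// Default taken from the config table in this README.
const DEFAULT_BASE_URL = "https://api.cloud.llamaindex.ai";

// Hypothetical helper: derive an ODTParser config object from environment
// variables, throwing a clear error when no API key is available.
function buildConfig(env = process.env) {
  const apiKey = env.LLAMA_CLOUD_API_KEY;
  if (!apiKey) {
    throw new Error("LLAMA_CLOUD_API_KEY is not set");
  }
  return {
    apiKey,
    // LLAMA_CLOUD_BASE_URL is an assumed name used for illustration only.
    baseUrl: env.LLAMA_CLOUD_BASE_URL || DEFAULT_BASE_URL,
  };
}

// Intended use: const parser = new ODTParser(buildConfig());
```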

License

MIT
