@zzwz/liteparse-vllm

v1.5.3-custom.1

Open-source PDF parsing with spatial text extraction and OCR processing, with custom Codex OCR and GLM-OCR servers

LiteParse OCR vLLM

This repository is an independent custom OCR fork of upstream run-llama/liteparse. The upstream project remains the base LiteParse implementation and source reference; this repo carries local custom work for GLM-OCR, vLLM offline packaging, LM Studio diagnostics, Codex OCR diagnostics, agent skills, and release packaging under a separate package name.

Repository identity:

  • Fork repo: https://github.com/lwyBZss8924d/liteparse-ocr-vllm.git
  • Upstream repo: https://github.com/run-llama/liteparse.git
  • Custom branch: custom/vllm-ocr-main
  • Upstream mirror branch: main
  • npm package: @zzwz/liteparse-vllm
  • Current custom version: 1.5.3-custom.1, based on upstream v1.5.3

Do not publish custom OCR releases from main. Keep upstream syncs on main, merge them into custom/vllm-ocr-main, and publish this fork from the custom branch with custom tags such as v1.5.3-custom.1.

Overview

LiteParse OCR vLLM keeps LiteParse's local-first parser and standard OCR HTTP contract, then adds custom advanced OCR packaging for local VLM workflows.

  • Fast Text Parsing: Spatial text parsing using PDF.js
  • Flexible OCR System:
    • Built-in: Tesseract.js for the zero-setup local path
    • Baseline HTTP Servers: EasyOCR, PaddleOCR, or any custom /ocr service
    • GLM-OCR SDK Pipeline: PP-DocLayout-backed layout boxes normalized into LiteParse OCR results
    • vLLM Offline Image: optional GPU-accelerated Docker image tar for air-gapped GLM-OCR model serving; the GLM-OCR SDK pipeline itself can run without this image
    • LM Studio Direct Diagnostics: lightweight local model smoke tests with degraded fallback boxes
    • Codex OCR Diagnostics: online/authenticated multimodal page-understanding artifacts
    • Standard API: unchanged multipart POST /ocr contract with results[].text, results[].bbox, and results[].confidence
  • Screenshot Generation: Generate high-quality page screenshots for LLM agents
  • Multiple Output Formats: JSON and Text
  • Bounding Boxes: Precise text positioning information
  • Standalone CLI: Baseline parsing runs locally; Codex OCR remains online/authenticated only
  • Multi-platform: Linux, macOS (Intel/ARM), Windows

Installation

CLI Tool

Option 1: Global Install (Recommended)

Install globally via npm to use the lit command anywhere:

npm i -g @zzwz/liteparse-vllm

Then use it:

lit parse document.pdf
lit screenshot document.pdf

For macOS and Linux users who want the upstream package instead of this custom OCR fork, liteparse can also be installed via brew:

brew tap run-llama/liteparse
brew install llamaindex-liteparse

Option 2: Install from Source

You can clone the repo and install the CLI globally from source:

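git clone https://github.com/lwyBZss8924d/liteparse-ocr-vllm.git
cd liteparse-ocr-vllm
git switch custom/vllm-ocr-main
# Install dependencies before building
npm install
npm run build
npm pack
# npm pack names scoped tarballs <scope>-<name>-<version>.tgz
npm install -g ./zzwz-liteparse-vllm-*.tgz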

For a release-grade offline npm tarball, build on Linux x64 so native runtime dependencies match the target host:

npm ci
npm run build
npm prune --omit=dev
npm pack --dry-run --json
npm run smoke:offline-npm-tgz
npm pack

Agent Skill

This fork keeps its custom agent skill source in the repository so OCR commands, package names, and vLLM/GLM-OCR workflows stay aligned with this custom build:

skills/liteparse-cli-tools-custom-collection/

Use npm run validate:agent-skills before publishing changes, then npm run sync:agent-skills:dry-run and npm run sync:agent-skills to refresh the installed runtime projection under /Users/arthur/.agents/skills/liteparse-cli-tools-custom-collection. Do not edit the installed projection directly.

Usage

Parse Files

# Basic parsing
lit parse document.pdf

# Parse with specific format
lit parse document.pdf --format json -o output.json

# Parse specific pages
lit parse document.pdf --target-pages "1-5,10,15-20"

# Parse without OCR
lit parse document.pdf --no-ocr

# Parse a remote PDF
curl -sL https://example.com/report.pdf | lit parse -

# Parse with official GLM-OCR SDK layout pipeline as a LiteParse OCR server
lit glmocr-ocr-server
lit parse document.pdf --ocr-server-url http://127.0.0.1:8831/ocr --format json

# Parse with Codex OCR server for multimodal page understanding
lit codex-ocr-server
lit parse document.pdf --ocr-server-url http://127.0.0.1:8833/ocr --format json
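
The --target-pages value used above is a comma-separated list of single pages and inclusive ranges. As a rough illustration of how such a spec expands, here is a minimal sketch; expandPageSpec is a hypothetical helper written for this example, not LiteParse's own parser, and its error handling is an assumption:

```typescript
// Hypothetical helper (not part of LiteParse): expand a page spec such as
// "1-5,10,15-20" into a sorted list of unique 1-based page numbers.
function expandPageSpec(spec: string): number[] {
  const pages = new Set<number>();
  for (const part of spec.split(",")) {
    const token = part.trim();
    if (token === "") continue; // tolerate stray commas
    const match = /^(\d+)(?:-(\d+))?$/.exec(token);
    if (!match) throw new Error(`Invalid page token: "${token}"`);
    const start = Number(match[1]);
    const end = match[2] === undefined ? start : Number(match[2]);
    if (start < 1 || end < start) {
      throw new Error(`Invalid page range: "${token}"`);
    }
    for (let page = start; page <= end; page++) pages.add(page);
  }
  return Array.from(pages).sort((a, b) => a - b);
}

console.log(expandPageSpec("1-3,10")); // [ 1, 2, 3, 10 ]
```

Overlapping ranges collapse to a single entry per page because the pages are collected in a Set before sorting.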

Batch Parsing

You can also parse an entire directory of documents:

lit batch-parse ./input-directory ./output-directory

Generate Screenshots

Screenshots are essential for LLM agents to extract visual information that text alone cannot capture.

# Screenshot all pages
lit screenshot document.pdf -o ./screenshots

# Screenshot specific pages
lit screenshot document.pdf --target-pages "1,3,5" -o ./screenshots

# Custom DPI
lit screenshot document.pdf --dpi 300 -o ./screenshots

# Screenshot page range
lit screenshot document.pdf --target-pages "1-10" -o ./screenshots
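
To get a feel for what the --dpi value means for output size: PDF page dimensions are expressed in points (1 point = 1/72 inch), so pixel dimensions scale linearly with DPI. The sketch below is an illustration only; renderedSize is a hypothetical helper, and rounding to whole pixels is an assumption about the renderer rather than documented behaviour:

```typescript
// Hypothetical helper (not part of LiteParse): estimate the pixel size of a
// rendered page from its size in PDF points and the requested DPI.
function renderedSize(
  widthPt: number,
  heightPt: number,
  dpi: number,
): { width: number; height: number } {
  const scale = dpi / 72; // 72 points per inch
  return {
    width: Math.round(widthPt * scale), // rounding is an assumption
    height: Math.round(heightPt * scale),
  };
}

// US Letter (612 x 792 pt) at the default 150 DPI and at 300 DPI.
console.log(renderedSize(612, 792, 150)); // { width: 1275, height: 1650 }
console.log(renderedSize(612, 792, 300)); // { width: 2550, height: 3300 }
```

Doubling the DPI quadruples the pixel count, which is worth keeping in mind when screenshots are sent to an LLM agent.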

Library Usage

Install as a dependency in your project:

npm install @zzwz/liteparse-vllm
# or
pnpm add @zzwz/liteparse-vllm

Then import and use it:

import { LiteParse } from '@zzwz/liteparse-vllm';

const parser = new LiteParse({ ocrEnabled: true });
const result = await parser.parse('document.pdf');
console.log(result.text);

Buffer / Uint8Array Input

You can pass raw bytes directly instead of a file path, which is useful for remote files:

import { LiteParse } from '@zzwz/liteparse-vllm';
import { readFile } from 'fs/promises';

const parser = new LiteParse();

// From a file read
const pdfBytes = await readFile('document.pdf');
const result = await parser.parse(pdfBytes);

// From an HTTP response
const response = await fetch('https://example.com/document.pdf');
const buffer = Buffer.from(await response.arrayBuffer());
const result2 = await parser.parse(buffer);

Non-PDF buffers (images, Office documents) are written to a temp directory for format conversion. Screenshots also work with buffer input:

const screenshots = await parser.screenshot(pdfBytes, [1, 2, 3]);
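
Because PDF and non-PDF buffers take different paths, it can be handy to check up front which kind of bytes you are holding. The following is a minimal sketch that tests for the standard "%PDF-" file signature; looksLikePdf is a hypothetical helper, and this is not necessarily how LiteParse itself detects formats:

```typescript
// Hypothetical helper (not part of LiteParse): a PDF file begins with the
// ASCII signature "%PDF-".
const PDF_SIGNATURE = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"

function looksLikePdf(bytes: Uint8Array): boolean {
  if (bytes.length < PDF_SIGNATURE.length) return false;
  return PDF_SIGNATURE.every((byte, i) => bytes[i] === byte);
}

const pdfHeader = new TextEncoder().encode("%PDF-1.7\n");
const pngHeader = new Uint8Array([0x89, 0x50, 0x4e, 0x47]);
console.log(looksLikePdf(pdfHeader)); // true
console.log(looksLikePdf(pngHeader)); // false
```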

Browser Usage

LiteParse's core parsing engine (PDF.js text extraction, grid projection, OCR via Tesseract.js) can run in the browser. Since the library has Node-only dependencies (sharp, fs, child_process), you'll need a bundler like Vite to swap those out with browser stubs.

Vite Configuration

The key is a Vite plugin that redirects Node-only source files to browser-safe replacements, plus resolve.alias entries that stub out Node built-in modules:

// vite.config.ts
import { defineConfig, type Plugin } from "vite";
import { resolve, dirname } from "node:path";

// Node-only files → browser stubs (you write these)
const FILE_REDIRECTS = [
  { match: /\/engines\/pdf\/pdfium-renderer(\.js|\.ts)?$/, target: "stubs/pdfium-renderer.ts" },
  { match: /\/engines\/pdf\/pdfjsImporter(\.js|\.ts)?$/,   target: "stubs/pdfjsImporter.ts" },
  { match: /\/engines\/ocr\/http-simple(\.js|\.ts)?$/,     target: "stubs/http-simple.ts" },
  { match: /\/conversion\/convertToPdf(\.js|\.ts)?$/,      target: "stubs/convertToPdf.ts" },
  { match: /\/processing\/gridDebugLogger(\.js|\.ts)?$/,   target: "stubs/gridDebugLogger.ts" },
  { match: /\/processing\/gridVisualizer(\.js|\.ts)?$/,    target: "stubs/gridVisualizer.ts" },
];

function liteparseNodeRedirects(): Plugin {
  return {
    name: "liteparse-node-redirects",
    enforce: "pre",
    async resolveId(source, importer) {
      if (!importer) return null;
      const abs = source.startsWith(".") ? resolve(dirname(importer), source) : source;
      for (const { match, target } of FILE_REDIRECTS) {
        if (match.test(abs) || match.test(source)) return resolve(target);
      }
      return null;
    },
  };
}

// Vite uses alias replacements as-is, so file-system targets must be absolute paths.
const EMPTY_STUB = resolve("stubs/empty.ts");
const FILE_TYPE_STUB = resolve("stubs/file-type.ts");

export default defineConfig({
  plugins: [liteparseNodeRedirects()],
  optimizeDeps: { include: ["tesseract.js"] },
  resolve: {
    alias: [
      { find: "node:fs/promises", replacement: EMPTY_STUB },
      { find: "node:fs",          replacement: EMPTY_STUB },
      { find: "node:url",         replacement: EMPTY_STUB },
      { find: "node:path",        replacement: EMPTY_STUB },
      { find: "node:os",          replacement: EMPTY_STUB },
      { find: "node:child_process", replacement: EMPTY_STUB },
      { find: /^fs$/,             replacement: EMPTY_STUB },
      { find: /^path$/,           replacement: EMPTY_STUB },
      { find: /^os$/,             replacement: EMPTY_STUB },
      { find: /^child_process$/,  replacement: EMPTY_STUB },
      { find: "form-data",        replacement: EMPTY_STUB },
      { find: "axios",            replacement: EMPTY_STUB },
      { find: "file-type",        replacement: FILE_TYPE_STUB },
    ],
  },
});

See scripts/browser-compat/ for a complete working example with all the stub files.

What works in the browser

  • PDF parsing from Uint8Array input (use file.arrayBuffer() to get bytes from a <input type="file">)
  • OCR via Tesseract.js (runs in Web Workers, fetches language data from CDN on first use)
  • Text and JSON output formats

What doesn't work

  • File path input (pass Uint8Array instead)
  • DOCX/XLSX/PPTX/image conversion (requires LibreOffice/ImageMagick)
  • HTTP OCR server backend
  • Screenshots (these use PDFium + sharp, which are native Node addons)

CLI Options

Parse Command

$ lit parse --help
Usage: lit parse [options] <file>

Parse a document file (PDF, DOCX, XLSX, PPTX, images, etc.)

Options:
  -o, --output <file>     Output file path
  --format <format>       Output format: json|text (default: "text")
  --ocr-server-url <url>  HTTP OCR server URL (uses Tesseract if not provided)
  --no-ocr                Disable OCR
  --ocr-language <lang>   OCR language(s) (default: "en")
  --num-workers <n>       Number of pages to OCR in parallel (default: CPU cores - 1)
  --max-pages <n>         Max pages to parse (default: "10000")
  --target-pages <pages>  Target pages (e.g., "1-5,10,15-20")
  --dpi <dpi>             DPI for rendering (default: "150")
  --no-precise-bbox       Disable precise bounding boxes
  --preserve-small-text   Preserve very small text
  --password <password>   Password for encrypted/protected documents
  --config <file>         Config file (JSON)
  -q, --quiet             Suppress progress output
  -h, --help              display help for command

Batch Parse Command

$ lit batch-parse --help
Usage: lit batch-parse [options] <input-dir> <output-dir>

Parse multiple documents in batch mode (reuses PDF engine for efficiency)

Options:
  --format <format>       Output format: json|text (default: "text")
  --ocr-server-url <url>  HTTP OCR server URL (uses Tesseract if not provided)
  --no-ocr                Disable OCR
  --ocr-language <lang>   OCR language(s) (default: "en")
  --num-workers <n>       Number of pages to OCR in parallel (default: CPU cores - 1)
  --max-pages <n>         Max pages to parse per file (default: "10000")
  --dpi <dpi>             DPI for rendering (default: "150")
  --no-precise-bbox       Disable precise bounding boxes
  --recursive             Recursively search input directory
  --extension <ext>       Only process files with this extension (e.g., ".pdf")
  --password <password>   Password for encrypted/protected documents (applied to all files)
  --config <file>         Config file (JSON)
  -q, --quiet             Suppress progress output
  -h, --help              display help for command

Screenshot Command

$ lit screenshot --help
Usage: lit screenshot [options] <file>

Generate screenshots of PDF pages

Options:
  -o, --output-dir <dir>  Output directory for screenshots (default: "./screenshots")
  --target-pages <pages>  Page numbers to screenshot (e.g., "1,3,5" or "1-5")
  --dpi <dpi>             DPI for rendering (default: "150")
  --format <format>       Image format: png|jpg (default: "png")
  --password <password>   Password for encrypted/protected documents
  --config <file>         Config file (JSON)
  -q, --quiet             Suppress progress output
  -h, --help              display help for command

OCR Setup

Default: Tesseract.js

# Tesseract is enabled by default
lit parse document.pdf

# Specify language
lit parse document.pdf --ocr-language fra

# Disable OCR
lit parse document.pdf --no-ocr

By default, Tesseract.js downloads language data from the internet on first use. For offline or air-gapped environments, set the TESSDATA_PREFIX environment variable to a directory containing pre-downloaded .traineddata files:

export TESSDATA_PREFIX=/path/to/tessdata
lit parse document.pdf --ocr-language eng

You can also pass tessdataPath in the library config:

const parser = new LiteParse({ tessdataPath: '/path/to/tessdata' });

Optional: HTTP OCR Servers

For higher accuracy or better performance, you can use an HTTP OCR server. We provide ready-to-use example wrappers for two popular OCR engines, EasyOCR (ocr/easyocr/) and PaddleOCR (ocr/paddleocr/).

You can integrate any OCR service by implementing the simple LiteParse OCR API specification (see OCR_API_SPEC.md).

The API requires:

  • POST /ocr endpoint
  • Accepts file and language parameters
  • Returns JSON: { results: [{ text, bbox: [x1,y1,x2,y2], confidence }] }

See the example servers in ocr/easyocr/ and ocr/paddleocr/ as templates.
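
When writing or consuming a custom /ocr service, it can help to pin down the response shape in types. The sketch below mirrors the { results: [{ text, bbox, confidence }] } contract listed above; the type names and the parseOcrResponse validator are hypothetical and not exported by LiteParse, and OCR_API_SPEC.md remains the authoritative definition:

```typescript
// Shape taken from the contract above; names and validator are hypothetical.
interface OcrResult {
  text: string;
  bbox: [number, number, number, number]; // [x1, y1, x2, y2]
  confidence: number;
}

interface OcrResponse {
  results: OcrResult[];
}

function isOcrResult(value: unknown): value is OcrResult {
  if (typeof value !== "object" || value === null) return false;
  const r = value as Record<string, unknown>;
  return (
    typeof r.text === "string" &&
    Array.isArray(r.bbox) &&
    r.bbox.length === 4 &&
    r.bbox.every((n) => typeof n === "number" && Number.isFinite(n)) &&
    typeof r.confidence === "number"
  );
}

// Parse a JSON body and reject anything that does not match the contract.
function parseOcrResponse(json: string): OcrResponse {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("OCR response must be a JSON object");
  }
  const results = (data as Record<string, unknown>).results;
  if (!Array.isArray(results) || !results.every(isOcrResult)) {
    throw new Error("OCR response must contain results[] of { text, bbox, confidence }");
  }
  return { results };
}
```

Validating at the boundary like this makes a misbehaving custom server fail loudly instead of producing silently misplaced text.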

For the complete OCR API specification, see OCR_API_SPEC.md.

Optional: GLM-OCR SDK Pipeline

For layout/table/formula-heavy documents, LiteParse can expose the official GLM-OCR SDK self-hosted pipeline as a Custom HTTP OCR server. This path uses PP-DocLayout for layout boxes, then calls a model runtime such as LM Studio for crop OCR:

# Python service path, matching the EasyOCR/PaddleOCR adapter style:
cd ocr/glmocr
uv run server.py

# Or Node-managed wrapper:
# Starts http://127.0.0.1:8831/ocr
# If the model is installed but not loaded, this runs:
# lms load glm-ocr-g32-mixed_4_8-mlx --identifier glm-ocr-g32-mixed_4_8-mlx -y
lit glmocr-ocr-server

lit parse document.pdf \
  --ocr-server-url http://127.0.0.1:8831/ocr \
  --format json

Advanced document-pipeline tooling writes page images, raw GLM-OCR SDK artifacts, LiteParse /ocr result JSON, and final Markdown/JSON:

lit glmocr-pipeline \
  --path document.pdf \
  --output ./glmocr-output \
  --target-pages "1-3"

Use --no-auto-load when you want LiteParse to fail fast instead of calling lms load. Use --model-runtime openai-compatible --ocr-api-url <url> or --model-runtime ollama --ocr-api-url <url> when the GLM-OCR model is hosted outside LM Studio.

Docker: Default Codex OCR Server and Optional vLLM GLM-OCR

The GLM-OCR SDK development path does not require this Docker image and is not GPU-only: cd ocr/glmocr && uv run server.py can run with CPU layout detection and a local LM Studio or other OpenAI-compatible model runtime. The Docker target is an optional vLLM serving package for air-gapped deployment, where a Linux x64 NVIDIA GPU host is expected for practical GLM-OCR model inference.

The image also contains codex-ocr-server, and the default Docker profile is codex. With no profile argument, the container starts a LiteParse-compatible OCR server on 0.0.0.0:8833 using LITEPARSE_CODEX_HOME=/codex-home. The mounted Codex home must provide either Codex auth/config or a custom model_provider config for a local/proxy model endpoint.

The image contains the LiteParse custom CLI, Node runtime dependencies, @openai/codex-sdk, the pinned GLM-OCR SDK, vLLM runtime, zai-org/GLM-OCR, and PaddlePaddle/PP-DocLayoutV3_safetensors.

docker build -f Dockerfile.glmocr-offline \
  -t liteparse-glmocr-vllm-offline:1.5.3-custom.1 \
  --build-arg VLLM_BASE_IMAGE=vllm/vllm-openai@sha256:9eff9734a30b6713a8566217d36f8277630fd2d31cec7f0a0292835901a23aa4 \
  --build-arg GLM_OCR_SDK_REF=cef4d0ea120d1741f5cefe8985eee45f6c8eff1d \
  --build-arg GLM_OCR_MODEL_REVISION=cb34f33832c51008c86436a3b2217bbe4adbe0b8 \
  --build-arg PP_DOCLAYOUT_MODEL_REVISION=3ec586e86ed9245a567bb13395a3db64d5c077cc \
  .

docker save \
  -o liteparse-glmocr-vllm-offline-1.5.3-custom.1.tar \
  liteparse-glmocr-vllm-offline:1.5.3-custom.1

On the deployment host:

docker load -i liteparse-glmocr-vllm-offline-1.5.3-custom.1.tar

# Default profile: codex-ocr-server on :8833.
docker run --rm -p 8833:8833 \
  -e LITEPARSE_CODEX_HOME=/codex-home \
  -v "$HOME/.codex:/codex-home" \
  liteparse-glmocr-vllm-offline:1.5.3-custom.1

# Optional vLLM GLM-OCR profile.
docker run --rm --gpus all --ipc=host --network=none \
  liteparse-glmocr-vllm-offline:1.5.3-custom.1 smoke

docker run --rm --gpus all --ipc=host -p 8831:8831 \
  -e LITEPARSE_OCR_PROFILE=glmocr-vllm \
  liteparse-glmocr-vllm-offline:1.5.3-custom.1

The codex profile starts lit codex-ocr-server on port 8833. The glmocr-vllm profile starts vllm serve /opt/models/glm-ocr on port 8000, waits for /v1/models, then starts lit glmocr-ocr-server on port 8831 with --layout-model-dir /opt/models/pp-doclayout. The image sets HF_HUB_OFFLINE=1 and TRANSFORMERS_OFFLINE=1 at runtime; build the image online once, then distribute the saved tar.
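
The "wait for /v1/models, then start the OCR server" ordering described above is a plain readiness poll. As an illustration of the pattern, here is a minimal sketch; waitUntilReady and vllmModelsProbe are hypothetical names, the attempt count and delay are arbitrary, and the image's actual entrypoint logic is not shown in this README:

```typescript
// Hypothetical readiness loop: call `probe` until it reports success or the
// attempt budget is exhausted. Probe errors count as "not ready yet".
async function waitUntilReady(
  probe: () => Promise<boolean>,
  attempts: number,
  delayMs: number,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      if (await probe()) return true;
    } catch {
      // e.g. connection refused while the server is still starting: retry
    }
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}

// Probe for the vLLM endpoint named above (port 8000, /v1/models).
const vllmModelsProbe = async (): Promise<boolean> => {
  const response = await fetch("http://127.0.0.1:8000/v1/models");
  return response.ok;
};
```

A caller would run something like waitUntilReady(vllmModelsProbe, 60, 2000) and only start the OCR server once it resolves to true.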

On a Linux x64 NVIDIA GPU host, run the release gate script after copying the tar:

scripts/validate-glmocr-offline-gpu.sh \
  liteparse-glmocr-vllm-offline-1.5.3-custom.1.tar

This script loads the tar, checks image metadata, verifies Docker GPU runtime availability, runs the in-image offline smoke under --network=none, then validates container-internal /health, POST /ocr, and lit parse --ocr-server-url http://127.0.0.1:8831/ocr. On local hosts without NVIDIA GPU support, keep this as an explicit unverified gate and rerun it on the GPU deployment host.

Codex OCR deployment options:

  • Mount a trusted Codex home: -v "$HOME/.codex:/codex-home" -e LITEPARSE_CODEX_HOME=/codex-home. This may include auth.json from codex login and config.toml; treat auth.json as a secret.
  • Use a custom Codex model provider in /codex-home/config.toml, then set model_provider to that provider id. Codex custom providers define base_url, wire_api, auth, and optional headers under [model_providers.<id>].
  • Current official Codex config schema documents wire_api = "responses" for custom providers. For an OpenAI Chat Completions-compatible local endpoint, put an adapter/proxy in front of it that exposes a Responses/Open Responses-compatible API before using it as the Codex provider, unless your pinned Codex version documents another supported wire_api.

Example local Open Responses-compatible Codex config:

# /codex-home/config.toml
#:schema https://developers.openai.com/codex/config-schema.json

model = "local-vision-model"
model_provider = "local-open-responses"
model_reasoning_effort = "medium"

[model_providers.local-open-responses]
name = "Local Open Responses provider"
base_url = "http://host.docker.internal:1234/v1"
wire_api = "responses"
# env_key = "LOCAL_RESPONSES_API_KEY"

References: Codex custom model providers, Codex alternative provider auth, Codex config reference, Codex config schema, OpenAI Responses API, and AI SDK Open Responses provider.

Optional: LM Studio GLM-OCR Direct Wrapper

The legacy direct wrapper remains available for quick single-image or OCR/text smoke tests:

lit lmstudio-ocr page.png --mode text --json
lit lmstudio-ocr-server

Direct mode sends the page or crop straight to LM Studio and may produce fallback line boxes when the model output has no reliable bbox_2d. Use glmocr-ocr-server or glmocr-pipeline when official GLM-OCR layout bboxes are required.

Optional: Codex OCR Server and Pipeline

For agentic multimodal OCR, LiteParse can expose OpenAI Codex as a Custom HTTP OCR server while preserving the standard /ocr response shape:

# Uses @openai/codex-sdk by default.
# Live tests should set HOME to a temp dir containing .codex/auth.json.
lit codex-ocr-server

lit parse document.pdf \
  --ocr-server-url http://127.0.0.1:8833/ocr \
  --format json

The Codex server also exposes POST /ocr/analyze for a full advanced artifact with page Markdown, page metadata, layout regions, segmented assets, annotations, conversion results, model metadata, and provenance. Use --backend app-server to try the experimental codex app-server JSON-RPC wrapper instead of the default SDK path.

Advanced document-pipeline tooling renders supported documents and images into page PNGs, runs Codex OCR per page, and writes page artifacts plus final Markdown/JSON:

lit codex-ocr-pipeline \
  --path document.pdf \
  --output ./codex-ocr-output \
  --target-pages "1-3" \
  --json

The artifact tree includes pages/, codex/, liteparse/, assets/<type>/, annotations/, final/document.md, final/document.json, and manifest.json. Final Markdown includes a LiteParse structured OCR context section that promotes page metadata, selected layout regions, and segmented asset details for downstream QA. Codex bounding boxes are model-inferred visual localization evidence and include codex_bboxes_are_model_inferred warnings; use --strict-bbox to drop regions without usable boxes.

Multi-Format Input Support

LiteParse supports automatic conversion of various document formats to PDF before parsing. Unlike PDF-only parsing tools, it can therefore take Office documents and images as input directly.

Supported Input Formats

Office Documents (via LibreOffice)

  • Word: .doc, .docx, .docm, .odt, .rtf
  • PowerPoint: .ppt, .pptx, .pptm, .odp
  • Spreadsheets: .xls, .xlsx, .xlsm, .ods, .csv, .tsv

Just install the dependency and LiteParse will automatically convert these formats to PDF for parsing:

# macOS
brew install --cask libreoffice

# Ubuntu/Debian
apt-get install libreoffice

# Windows
choco install libreoffice-fresh # might require admin permissions

On Windows, you might need to add the directory containing the LibreOffice CLI executable (generally C:\Program Files\LibreOffice\program) to the PATH environment variable and restart the machine.

Images (via ImageMagick)

  • Formats: .jpg, .jpeg, .png, .gif, .bmp, .tiff, .webp, .svg

Just install ImageMagick and LiteParse will convert images to PDF for parsing (with OCR):

# macOS
brew install imagemagick

# Ubuntu/Debian
apt-get install imagemagick

# Windows
choco install imagemagick.app # might require admin permissions

Environment Variables

| Variable | Description |
|----------|-------------|
| TESSDATA_PREFIX | Path to a directory containing Tesseract .traineddata files. Used for offline/air-gapped environments where Tesseract.js cannot download language data from the internet. |
| LITEPARSE_TMPDIR | Override the temp directory used for format conversion and intermediate files. Defaults to the OS temp directory (os.tmpdir()). Useful in containerized or read-only filesystem environments. |
| LITEPARSE_LMSTUDIO_BASE_URL | Base URL for LM Studio GLM-OCR tooling. Defaults to http://localhost:1234. |
| LITEPARSE_GLM_OCR_MODEL | LM Studio model identifier. Defaults to glm-ocr-g32-mixed_4_8-mlx. |
| LITEPARSE_LMSTUDIO_API_KEY | Optional bearer token for LM Studio-compatible deployments. |
| LITEPARSE_LMSTUDIO_AUTO_LOAD | Set to 0 or false to disable automatic lms load for local LM Studio models. |
| LITEPARSE_GLMOCR_ROOT | GLM-OCR SDK root used by lit glmocr-ocr-server. Docker defaults to /opt/glm-ocr-sdk; local installs may omit it when glmocr is importable. |
| LITEPARSE_GLMOCR_LAYOUT_MODEL_DIR | PP-DocLayout model directory or Hub identifier. Docker defaults to /opt/models/pp-doclayout. |
| HF_HUB_OFFLINE / TRANSFORMERS_OFFLINE | Set to 1 in the offline Docker image so Hugging Face and Transformers use only bundled model artifacts. |
| LITEPARSE_CODEX_HOME | Codex state directory for Codex OCR. Use $HOME/.codex for live development/testing so OAuth tokens and config remain separate from normal Codex state. |
| LITEPARSE_CODEX_OCR_MODEL | Default Codex OCR model. Defaults to gpt-5.5; use gpt-5.4-mini for cheaper smoke tests. |
| LITEPARSE_CODEX_OCR_REASONING | Default Codex OCR reasoning effort. Defaults to medium; the pipeline command defaults to high. |
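
To make the documented defaults concrete, the sketch below resolves the LM Studio settings from an environment map. resolveLmStudioSettings and the LmStudioSettings type are hypothetical and written only for this example; the defaults come from the table above, while case-insensitive matching of "false" is an assumption, and LiteParse's own loading code may differ:

```typescript
// Hypothetical resolver illustrating the documented defaults.
interface LmStudioSettings {
  baseUrl: string;
  model: string;
  apiKey?: string;
  autoLoad: boolean;
}

function resolveLmStudioSettings(
  env: Record<string, string | undefined>,
): LmStudioSettings {
  // Assumption: the value is compared case-insensitively after trimming.
  const autoLoadRaw = (env.LITEPARSE_LMSTUDIO_AUTO_LOAD ?? "").trim().toLowerCase();
  return {
    baseUrl: env.LITEPARSE_LMSTUDIO_BASE_URL ?? "http://localhost:1234",
    model: env.LITEPARSE_GLM_OCR_MODEL ?? "glm-ocr-g32-mixed_4_8-mlx",
    apiKey: env.LITEPARSE_LMSTUDIO_API_KEY,
    autoLoad: !(autoLoadRaw === "0" || autoLoadRaw === "false"),
  };
}

console.log(resolveLmStudioSettings({ LITEPARSE_LMSTUDIO_AUTO_LOAD: "0" }).autoLoad); // false
```

In a Node process the function would typically be called with process.env; taking the map as a parameter keeps it easy to test.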

Configuration

You can configure parsing options via CLI flags or a JSON config file. The config file allows you to set sensible defaults and override as needed.

Config File Example

Create a liteparse.config.json file:

{
  "ocrLanguage": "en",
  "ocrEnabled": true,
  "maxPages": 1000,
  "dpi": 150,
  "outputFormat": "json",
  "preciseBoundingBox": true,
  "preserveVerySmallText": false,
  "password": "optional_password"
}

For HTTP OCR servers, just add ocrServerUrl:

{
  "ocrServerUrl": "http://localhost:8828/ocr",
  "ocrLanguage": "en",
  "outputFormat": "json"
}

Use with:

lit parse document.pdf --config liteparse.config.json
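
One way to picture how a config file and CLI flags combine is a layered merge. The sketch below takes field names from the config example and default values from the lit parse --help output; the ParseConfig type, the mergeConfig helper, and the precedence order (built-in defaults, then config file, then CLI flags) are assumptions for illustration, not a description of LiteParse's internals:

```typescript
// Field names follow the config example; precedence order is an assumption.
interface ParseConfig {
  ocrLanguage: string;
  ocrEnabled: boolean;
  maxPages: number;
  dpi: number;
  outputFormat: "json" | "text";
  ocrServerUrl?: string;
}

// Defaults as listed in the `lit parse --help` output.
const DEFAULTS: ParseConfig = {
  ocrLanguage: "en",
  ocrEnabled: true,
  maxPages: 10000,
  dpi: 150,
  outputFormat: "text",
};

function mergeConfig(
  fileConfig: Partial<ParseConfig>,
  cliFlags: Partial<ParseConfig>,
): ParseConfig {
  const merged: ParseConfig = { ...DEFAULTS };
  for (const layer of [fileConfig, cliFlags]) {
    for (const [key, value] of Object.entries(layer)) {
      // Skip undefined so an unset CLI flag never masks a config-file value.
      if (value !== undefined) Object.assign(merged, { [key]: value });
    }
  }
  return merged;
}

const effective = mergeConfig({ outputFormat: "json", dpi: 200 }, { dpi: 300 });
console.log(effective.dpi, effective.outputFormat, effective.maxPages); // 300 json 10000
```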

Development

The repository includes a detailed AGENTS.md/CLAUDE.md; we recommend using it when developing with coding agents.

# Install dependencies
npm install

# Build TypeScript (Linux/macOS)
npm run build

# Build TypeScript (Windows)
npm run build:windows

# Watch mode
npm run dev

# Test parsing
npm test

License

This custom LiteParse fork and its npm package are licensed under Apache-2.0; see this repository's LICENSE file.

Third-party model and runtime notices for optional GLM-OCR deployments:

  • The GLM-OCR SDK repository code is Apache-2.0.
  • The GLM-OCR model zai-org/GLM-OCR is MIT licensed according to its model card.
  • The GLM-OCR pipeline uses PP-DocLayoutV3 for document layout analysis; the PaddlePaddle/PP-DocLayoutV3_safetensors component is Apache-2.0 licensed according to the GLM-OCR model card.

If you build or distribute the optional offline Docker image tar, retain the required notices for LiteParse, GLM-OCR, the GLM-OCR model, PP-DocLayoutV3, vLLM, Node runtime dependencies, and Python runtime dependencies included in that image.

Credits

Built on top of: