@kaenova/document-intelligence-mcp

v0.0.0-20260520.h1051

MCP server for Azure AI Document Intelligence with Read and Layout models

document-intelligence-mcp

MCP server for Azure AI Document Intelligence. Exposes a single analyze_document tool that lets agents choose between:

  • read — OCR-only text extraction (fast, lightweight)
  • layout — Rich document understanding with tables, selection marks, and structure (recommended for complex documents)

Results are automatically cached using SQLite via Bun's built-in bun:sqlite module for fast repeated analysis of the same document.


Quick Start

1. Prerequisites

  • Bun runtime (required)
  • Azure Document Intelligence resource (create one in the Azure Portal if you don't have one)

If you don't have Bun installed yet, install it first:

curl -fsSL https://bun.sh/install | bash

2. Install Dependencies

bun install

3. Configure Environment

Copy the example env file and fill in your Azure credentials:

cp .env.example .env

Edit .env with your Azure endpoint and key:

AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT=https://your-resource.cognitiveservices.azure.com/
AZURE_DOCUMENT_INTELLIGENCE_KEY=your-api-key-here
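
As a rough illustration of how these two variables might be validated at startup, here is a minimal sketch. The `loadConfig` helper and the `AzureConfig` type are hypothetical names introduced for this example; the README does not show how the server actually reads its configuration.

```typescript
// Hypothetical helper, not taken from the package source.
interface AzureConfig {
  endpoint: string;
  key: string;
}

function loadConfig(env: Record<string, string | undefined>): AzureConfig {
  const endpoint = env["AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT"]?.trim();
  const key = env["AZURE_DOCUMENT_INTELLIGENCE_KEY"]?.trim();

  // Fail early with a clear message instead of a confusing Azure error later.
  if (!endpoint) throw new Error("AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT is not set");
  if (!key) throw new Error("AZURE_DOCUMENT_INTELLIGENCE_KEY is not set");

  // new URL() throws on malformed input; this sketch additionally requires HTTPS.
  if (new URL(endpoint).protocol !== "https:") {
    throw new Error("AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT must be an https:// URL");
  }
  return { endpoint, key };
}
```

In a Bun process this would be called as `loadConfig(process.env)`. Bun reads `.env` files automatically, so no separate dotenv step should be needed.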

4. Run the Server

bun run dev

The server runs in stdio mode and is ready to be used by any MCP-compatible client.


Configuration in Agent Harness (Pi)

This MCP server is Bun-only. Use bunx or bun run; npx/Node.js execution is not supported.

Add this MCP server to your Pi agent by editing the configuration file:

~/.pi/agent/mcp.json

Option 1: Using the Published Package (Recommended)

{
  "mcpServers": {
    "document-intelligence": {
      "command": "bunx",
      "args": ["@kaenova/document-intelligence-mcp"],
      "env": {
        "AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT": "https://your-resource.cognitiveservices.azure.com/",
        "AZURE_DOCUMENT_INTELLIGENCE_KEY": "your-api-key"
      }
    }
  }
}

Option 2: Local Development (Cloned Repository)

{
  "mcpServers": {
    "document-intelligence": {
      "command": "bun",
      "args": ["run", "src/index.ts"],
      "cwd": "/absolute/path/to/your/document-intelligence-mcp",
      "env": {
        "AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT": "https://your-resource.cognitiveservices.azure.com/",
        "AZURE_DOCUMENT_INTELLIGENCE_KEY": "your-api-key"
      }
    }
  }
}

Notes

  • After editing mcp.json, restart your Pi agent or reload the MCP configuration.
  • For local development, make sure to replace the cwd path with the actual location of your cloned repository.
  • You can also load credentials from a .env file instead of hardcoding them in mcp.json.

Installation via bunx (Recommended for most users)

You can use the published package without cloning the repository.

Note: This package is intended to run with Bun only. Install Bun first, then use bunx or bun run.

npm requires semver, so published versions use a semver-compatible date stamp in GMT+7, for example: 0.0.0-20260520.h1530
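
The stamp format can be reproduced with a few lines of TypeScript. This is a sketch under assumptions: `versionStamp` is a hypothetical name, and the package's actual release script is not shown in this README. The `h` prefix on the time part is probably there because semver forbids leading zeros in purely numeric prerelease identifiers, so a time such as 0930 would be invalid without it.

```typescript
// Hypothetical helper; the real publish script may differ.
function versionStamp(now: Date): string {
  // Shift the instant by +7 hours, then read the UTC fields to get GMT+7 wall-clock time.
  const shifted = new Date(now.getTime() + 7 * 60 * 60 * 1000);
  const pad = (n: number): string => String(n).padStart(2, "0");

  const date =
    String(shifted.getUTCFullYear()) +
    pad(shifted.getUTCMonth() + 1) +
    pad(shifted.getUTCDate());
  const time = pad(shifted.getUTCHours()) + pad(shifted.getUTCMinutes());

  return `0.0.0-${date}.h${time}`;
}

// 08:30 UTC is 15:30 in GMT+7.
console.log(versionStamp(new Date("2026-05-20T08:30:00Z"))); // 0.0.0-20260520.h1530
```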

Using with bunx

{
  "mcpServers": {
    "document-intelligence": {
      "command": "bunx",
      "args": ["@kaenova/document-intelligence-mcp"],
      "env": {
        "AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT": "https://your-resource.cognitiveservices.azure.com/",
        "AZURE_DOCUMENT_INTELLIGENCE_KEY": "your-api-key-here"
      }
    }
  }
}

Usage

The analyze_document Tool

This MCP server exposes a single powerful tool:

analyze_document(model, source)

Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| model | "read" \| "layout" | The analysis model to use |
| source | string | Local file path or public HTTPS URL (automatically detected) |
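
One plausible way to detect whether `source` is a URL or a local path is sketched below. `classifySource` and `SourceKind` are hypothetical names, and the server's real detection logic may differ.

```typescript
type SourceKind = "url" | "file";

// Hypothetical helper: only https:// inputs count as URLs in this sketch,
// matching the "public HTTPS URL" wording in the parameter table.
function classifySource(source: string): SourceKind {
  try {
    // Absolute and relative file paths make new URL() throw. A Windows path
    // such as C:\docs\a.pdf parses with protocol "c:" and so is not https.
    return new URL(source).protocol === "https:" ? "url" : "file";
  } catch {
    return "file";
  }
}
```

Checking the parsed protocol rather than using a string prefix test keeps drive-letter paths from being mistaken for URLs.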

Choosing the Right Model

| Model | Best For | Output Highlights | Speed |
|-------|----------|-------------------|-------|
| read | Simple text extraction, language detection | Raw text, pages, detected languages | Fast |
| layout | Documents with tables, forms, structure | Tables, selection marks, rich layout | Slightly slower |

Recommendation: Use layout unless you need only raw OCR text.

Examples

Local PDF file with layout analysis:

{
  "model": "layout",
  "source": "/Users/me/invoices/Q2-report.pdf"
}

Public URL with read-only OCR:

{
  "model": "read",
  "source": "https://example.com/annual-report-2025.pdf"
}

Image file:

{
  "model": "layout",
  "source": "./screenshots/contract-page1.png"
}
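
MCP clients normally send these arguments for you. For reference, a tool call travels as a JSON-RPC 2.0 request with method `tools/call`. The sketch below builds such a request by hand; `buildAnalyzeRequest` and `ToolCallRequest` are hypothetical names used only for illustration.

```typescript
type Model = "read" | "layout";

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: { model: Model; source: string };
  };
}

// Wraps the tool arguments shown in the examples above in an MCP tools/call envelope.
function buildAnalyzeRequest(id: number, model: Model, source: string): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "analyze_document", arguments: { model, source } },
  };
}

console.log(JSON.stringify(buildAnalyzeRequest(1, "layout", "/Users/me/invoices/Q2-report.pdf"), null, 2));
```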

Caching

  • Results are automatically cached based on the file content hash + model.
  • If you analyze the same file again with the same model, you get the cached result instantly.
  • Cache is stored in SQLite via Bun (DI_CACHE_PATH, default: .cache/di-cache.sqlite).
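
The keying scheme described above can be sketched as follows. This is an illustration, not the package's implementation. SHA-256 is an assumption (the README does not name the hash), `cacheKey` and `ResultCache` are hypothetical names, and an in-memory Map stands in for the `bun:sqlite` store so the example runs anywhere.

```typescript
import { createHash } from "node:crypto";

type Model = "read" | "layout";

// Key = model + hash of the file bytes, so renaming a file still hits the cache
// while switching models does not.
function cacheKey(content: Uint8Array, model: Model): string {
  const digest = createHash("sha256").update(content).digest("hex");
  return `${model}:${digest}`;
}

// In-memory stand-in for the SQLite-backed cache.
class ResultCache {
  private store = new Map<string, string>();

  getOrCompute(content: Uint8Array, model: Model, analyze: () => string): string {
    const key = cacheKey(content, model);
    const hit = this.store.get(key);
    if (hit !== undefined) return hit; // cached result, no Azure call needed

    const result = analyze();
    this.store.set(key, result);
    return result;
  }
}
```

Hashing the content rather than the path means an edited file at the same path produces a new key and is analyzed again.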

Supported File Types

  • PDF
  • Images: JPG, JPEG, PNG, BMP, TIFF, HEIF
  • Office Documents: DOCX, PPTX, XLSX
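
A caller could pre-check a path against this list before invoking the tool. The sketch below assumes a plain extension check. `isSupportedFile` is a hypothetical helper, and whether the server itself validates by extension is not stated in this README.

```typescript
// Extensions taken verbatim from the list above; no variants are added.
const SUPPORTED_EXTENSIONS = new Set([
  "pdf",
  "jpg", "jpeg", "png", "bmp", "tiff", "heif",
  "docx", "pptx", "xlsx",
]);

function isSupportedFile(path: string): boolean {
  // Take the last path segment, accepting both / and \ separators.
  const name = path.split(/[\\/]/).pop() ?? "";
  const dot = name.lastIndexOf(".");
  if (dot <= 0) return false; // no extension, or a dotfile such as ".pdf"
  return SUPPORTED_EXTENSIONS.has(name.slice(dot + 1).toLowerCase());
}
```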

The tool returns a Markdown document with the extracted content, tables (when using layout), pages, and language information.


Development

# Watch mode
bun run dev

# Type checking
bun run typecheck

# Build for production
bun run build

License

MIT


Built with FastMCP and the official Azure AI Form Recognizer SDK.