okrapdf

v0.14.0

Published

a month ago

OkraPDF — upload a PDF, get an API. Runtime client, React hooks, and CLI.


steventsao713

pdf ocr document extraction api sdk structured-output

okrapdf

Upload a PDF, get an OpenAI-compatible endpoint.

npm install okrapdf

Get your API key at app.okrapdf.com/settings.

Quick Start

import { OkraClient } from 'okrapdf';

const okra = new OkraClient({ apiKey: process.env.OKRA_API_KEY });
const session = await okra.sessions.create('./invoice.pdf');

// Every document gets its own chat/completions URL
console.log(session.modelEndpoint);

That prints a URL like:

https://api.okrapdf.com/v1/documents/doc-441a8a0be0e94914b982

This is a full OpenAI-compatible base URL. Plug it into any client.

Choose Your Entry Point

| I want to... | Start here |
|---|---|
| Use TypeScript/JS | okra.sessions.create() + session.prompt() (see SDK Methods) |
| Use the CLI | okra auth login + okra upload (see CLI) |
| Use Claude Code or AI agents | MCP server (see Setup guide) |
| Use any LLM client | OpenAI-compatible endpoint (see OpenAI SDK) |
| Build a React app | okrapdf/react hooks (see Sub-path exports) |
| Query across documents | Collections fan-out (see Collections) |

What You Get

Upload a PDF and OkraPDF gives you predictable URLs for everything:

Document:   doc-441a8a0be0e94914b982

Completion: https://api.okrapdf.com/document/doc-441a8a0be0e94914b982/chat/completions
Status:     https://api.okrapdf.com/document/doc-441a8a0be0e94914b982/status
Pages:      https://api.okrapdf.com/document/doc-441a8a0be0e94914b982/pages
Entities:   https://api.okrapdf.com/document/doc-441a8a0be0e94914b982/nodes
Download:   https://api.okrapdf.com/document/doc-441a8a0be0e94914b982/download

Page images:
  pg 1:     https://api.okrapdf.com/v1/documents/doc-441a8a0be0e94914b982/pg_1.png
  resized:  https://api.okrapdf.com/v1/documents/doc-441a8a0be0e94914b982/w_200,h_300/pg_1.png
  shimmer:  https://api.okrapdf.com/v1/documents/doc-441a8a0be0e94914b982/d_shimmer/pg_1.png

All URLs are deterministic. Build them from the document ID without calling the API first.
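
As a rough illustration of what "deterministic" means here, the sketch below rebuilds the URLs listed above with plain string templates. The helper names `documentUrls` and `pageImageUrl` are hypothetical and not part of okrapdf; the package ships its own `doc()` builder for this, and the paths are simply copied from the listing above.

```typescript
// Hypothetical helpers (not part of okrapdf) that rebuild the URLs listed above.
const API_ORIGIN = 'https://api.okrapdf.com';

interface ImageSize {
  w: number;
  h: number;
}

// Document-level endpoints, following the /document/<id>/<resource> pattern.
function documentUrls(docId: string) {
  const base = `${API_ORIGIN}/document/${docId}`;
  return {
    completion: `${base}/chat/completions`,
    status: `${base}/status`,
    pages: `${base}/pages`,
    entities: `${base}/nodes`,
    download: `${base}/download`,
  };
}

// Page images, following the /v1/documents/<id>/[transform/]pg_<n>.png pattern.
function pageImageUrl(docId: string, page: number, size?: ImageSize): string {
  const base = `${API_ORIGIN}/v1/documents/${docId}`;
  const transform = size ? `w_${size.w},h_${size.h}/` : '';
  return `${base}/${transform}pg_${page}.png`;
}

const exampleId = 'doc-441a8a0be0e94914b982';
console.log(documentUrls(exampleId).status);
// → https://api.okrapdf.com/document/doc-441a8a0be0e94914b982/status
console.log(pageImageUrl(exampleId, 1, { w: 200, h: 300 }));
// → https://api.okrapdf.com/v1/documents/doc-441a8a0be0e94914b982/w_200,h_300/pg_1.png
```

Because nothing here touches the network, URLs like these can be generated at build time or embedded in static markup before a document has finished processing.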

Use with OpenAI SDK

import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OKRA_API_KEY,
  baseURL: session.modelEndpoint,  // https://api.okrapdf.com/v1/documents/doc-...
});

const res = await openai.chat.completions.create({
  model: 'okra',
  messages: [{ role: 'user', content: 'What form is this?' }],
});

console.log(res.choices[0].message.content);
// → "This is Form W-9 (Request for Taxpayer Identification Number and
//    Certification), used by entities to collect a taxpayer's TIN..."

Use with AI SDK

import { createOpenAICompatible } from '@ai-sdk/openai-compatible';
import { generateText } from 'ai';

const provider = createOpenAICompatible({
  name: 'okra',
  apiKey: process.env.OKRA_API_KEY,
  baseURL: session.modelEndpoint,
});

const { text } = await generateText({
  model: provider('okra'),
  prompt: 'Summarize this document in 3 bullet points',
});

Use with curl

# Upload
curl -X POST https://api.okrapdf.com/document/doc-my-w9/upload-url \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://www.irs.gov/pub/irs-pdf/fw9.pdf"}'

# Ask a question
curl https://api.okrapdf.com/document/doc-my-w9/chat/completions \
  -H "Authorization: Bearer $OKRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "List all parts of this form."}]}'

Response:

{
  "id": "chatcmpl-18g5qhmmrm",
  "object": "chat.completion",
  "model": "accounts/fireworks/models/kimi-k2p5",
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Based on the Form W-9 document, there are two numbered parts:\n\n| Part | Title |\n|------|-------|\n| Part I | Taxpayer Identification Number (TIN) |\n| Part II | Certification |"
    },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 227, "completion_tokens": 404, "total_tokens": 631 }
}
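
If you call the endpoint without an SDK, you will need to pull the answer out of that JSON yourself. The sketch below types only the fields visible in the response above and reads the first choice. The `ChatCompletion` interface and `firstAnswer` helper are illustrative names, not okrapdf exports, and the sample payload is a shortened stand-in for the real response.

```typescript
// Minimal shape of the chat-completion response shown above (only fields used here).
interface ChatCompletion {
  choices: { message: { role: string; content: string }; finish_reason: string }[];
  usage?: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

// Hypothetical helper: return the first choice's text, or throw if there is none.
function firstAnswer(res: ChatCompletion): string {
  const choice = res.choices[0];
  if (!choice) throw new Error('response contained no choices');
  return choice.message.content;
}

// Shortened stand-in for the response above (content abbreviated for the example).
const sampleResponse: ChatCompletion = JSON.parse(`{
  "choices": [{
    "message": { "role": "assistant", "content": "Part I and Part II" },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 227, "completion_tokens": 404, "total_tokens": 631 }
}`);

console.log(firstAnswer(sampleResponse));        // prints "Part I and Part II"
console.log(sampleResponse.usage?.total_tokens); // prints 631
```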

SDK Methods

The SDK wraps all of this so you don't need a separate client:

// Ask a question (non-streaming)
const { answer } = await session.prompt('What is the total amount due?');

// Stream
for await (const event of session.stream('Summarize this document')) {
  if (event.type === 'text_delta') process.stdout.write(event.text);
}

// Structured output with Zod
import { z } from 'zod';

const Invoice = z.object({
  vendor: z.string(),
  total: z.number(),
  lineItems: z.array(z.object({
    description: z.string(),
    amount: z.number(),
  })),
});

const { data } = await session.prompt('Extract the invoice', { schema: Invoice });
// data: { vendor: "Acme Corp", total: 1250.00, lineItems: [...] }

Pages & Entities

const pages = await session.pages();           // { pageCount: 6, pages: [...] }
const { nodes } = await session.entities();    // extracted text, tables, etc.
const { nodes: tables } = await session.entities({ type: 'table' }); // tables only

Upload

Accepts file paths, URLs, Blob, ArrayBuffer, or Uint8Array:

// URL
const fromUrl = await okra.sessions.create('https://example.com/report.pdf');

// Bytes (pdfBytes is a Blob, ArrayBuffer, or Uint8Array you already hold)
const fromBytes = await okra.sessions.create(pdfBytes, {
  upload: { fileName: 'report.pdf' },
});

// Attach to existing document (no upload, no wait)
const existing = okra.sessions.from('doc-441a8a0be0e94914b982');

Deterministic URLs

Build page image and export URLs from a document ID — no API call needed:

import { doc } from 'okrapdf/doc';

const d = doc('doc-441a8a0be0e94914b982');

d.pages(1).image();                    // .../pg_1.png
d.pages(1).image({ w: 200, h: 300 }); // .../w_200,h_300/pg_1.png
d.export('markdown');                  // .../export.md

Collections — Fan-Out Query to CSV

Ask the same question across every document in a collection. Each document answers independently and in parallel, and results stream back as NDJSON.

import { OkraClient } from 'okrapdf';
import { z } from 'zod';
import { writeFileSync } from 'fs';

const okra = new OkraClient({ apiKey: process.env.OKRA_API_KEY });

// Fan-out: ask every doc in the collection the same question
const stream = okra.collections.query(
  'col-40da068481cf4f248853507cba6be611',
  'Who are the top 3 people mentioned in this document?',
);

// Gather all results
const result = await stream.gather();

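// Write CSV (header matches the three fields emitted per row)
const header = 'doc_id,answer,cost_usd';
const rows = [...result.answers.values()].map(a =>
  // Double any embedded quotes so the answer stays a single CSV field
  `"${a.docId}","${a.answer.slice(0, 200).replace(/"/g, '""')}",${a.costUsd}`
);
writeFileSync('results.csv', [header, ...rows].join('\n'));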

console.log(`${result.completed} docs, $${result.totalCostUsd.toFixed(4)} total`);

Real output from a 10-K earnings collection (from a different question than the one above, with an extra file_name column):

doc_id,file_name,answer,cost_usd
"doc-9a3f21...","NVDA-10K-2025.pdf","Revenue: $130.5B, Net Income: $72.9B, YoY Growth: 114%",0.0048
"doc-b7e810...","AAPL-10K-2025.pdf","Revenue: $391.0B, Net Income: $101.2B, YoY Growth: 5%",0.0039
"doc-c4d562...","MSFT-10K-2025.pdf","Revenue: $254.2B, Net Income: $97.1B, YoY Growth: 16%",0.0051

Experimental: structured collection fan-out is still available via a schema, but it is not part of the stable v0.14 surface:

const FinancialReport = z.object({
  company: z.string(),
  revenue: z.number(),
  netIncome: z.number(),
  quarter: z.string(),
});

const stream = okra.collections.query(
  'col-financials',
  'Extract the financial summary',
  { schema: FinancialReport },
);

const result = await stream.gather();
for (const [docId, answer] of result.answers) {
  console.log(answer.data); // { company: "NVIDIA", revenue: 35082, ... }
}

Or stream per-doc events in real time:

for await (const event of stream) {
  if (event.type === 'result') {
    console.log(`${event.doc_id}: ${event.answer.slice(0, 80)}...`);
  }
}
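
Since results arrive as NDJSON, a consumer that reads the raw response body rather than the SDK iterator would parse one JSON object per line. The sketch below shows that parsing step only. The `ResultEvent` shape is an assumption based on the three fields used in the loop above (`type`, `doc_id`, `answer`), `parseNdjson` is a hypothetical helper, and the sample lines are made up for illustration.

```typescript
// Assumed event shape, based only on the fields used in the loop above.
interface ResultEvent {
  type: string;
  doc_id?: string;
  answer?: string;
}

// Parse a block of NDJSON text: one JSON object per non-empty line.
function parseNdjson(text: string): ResultEvent[] {
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as ResultEvent);
}

// Made-up sample lines for illustration only.
const rawNdjson = [
  '{"type":"result","doc_id":"doc-a","answer":"First answer"}',
  '',
  '{"type":"result","doc_id":"doc-b","answer":"Second answer"}',
].join('\n');

const resultEvents = parseNdjson(rawNdjson).filter((e) => e.type === 'result');
console.log(resultEvents.length); // prints 2
```

When reading from a live stream, a chunk can end in the middle of a line, so a real consumer would buffer the trailing partial line and parse it only once the next newline arrives.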

Sub-path Exports

| Import | Use |
|--------|-----|
| okrapdf | OkraClient, types, errors |
| okrapdf/doc | doc() URL builder |
| okrapdf/browser | Browser-safe client (no Node.js deps) |
| okrapdf/worker | Cloudflare Worker adapter |
| okrapdf/react | React hooks (useSession, usePages) |

CLI

npm install -g okrapdf
okra auth login

okra upload ./invoice.pdf
okra chat "What is the total?" --doc doc-abc123
okra extract ./invoice.pdf --schema ./invoice.schema.json
okra collection query earnings "What changed quarter over quarter?" -o earnings.csv

Use npx okrapdf ... for one-off runs if you do not want a global install.

The default CLI help focuses on the promoted workflows: auth, upload, extract, chat, document read/list/delete, and collection query. Advanced inspection and local-only commands still exist but are hidden from default help during the clean-house release candidate. Experimental structured collection extraction remains available via okra collection extract ... and okra collection query ... --schema ..., but it is not part of the stable v0.14 surface.

License

MIT
