@thedriveai/sdk

v1.4.0

TypeScript SDK for The Drive AI API — structured data extraction from any file or URL

The Drive AI SDKs

Python and TypeScript clients for The Drive AI developer API — a set of file intelligence endpoints for extracting structured data, analyzing documents, converting to markdown, and generating thumbnails from any file or URL. Built for AI agents and developers who need to process documents programmatically.

This is not the SDK for The Drive AI's core product. These are standalone developer APIs available at dev.thedrive.ai.

Website | API Docs | Get API Key

Installation

Python

pip install thedriveai

TypeScript

npm install @thedriveai/sdk

Quick start

Python

from thedriveai import TheDriveAI

client = TheDriveAI(api_key="tda_live_...")

result = client.extract(
    file=open("invoice.pdf", "rb"),
    schema={
        "vendor": {"type": "string", "description": "Company name"},
        "total": {"type": "number", "description": "Total amount due"},
    },
)
print(result.data["vendor"])  # "Acme Corp"
print(result.data["total"])   # 1234.56

TypeScript

import { TheDriveAI } from "@thedriveai/sdk";
import { readFileSync } from "fs";

const client = new TheDriveAI({ apiKey: "tda_live_..." });

const result = await client.extract({
  file: readFileSync("invoice.pdf"),
  schema: {
    vendor: { type: "string", description: "Company name" },
    total: { type: "number", description: "Total amount due" },
  },
});
console.log(result.data.vendor); // "Acme Corp"

Extract

Pull structured data from any file or URL. Define the fields you want and get typed results back.

result = client.extract(
    url="https://example.com/receipt.pdf",
    schema={
        "merchant": {"type": "string", "description": "Store name"},
        "date": {"type": "string", "description": "Purchase date"},
        "items": {
            "type": "array",
            "description": "Line items",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "price": {"type": "number"},
                },
            },
        },
        "total": {"type": "number", "description": "Total amount", "required": True},
    },
    model="accurate",
)

print(result.data)
print(result.confidence)       # {"merchant": "high", "date": "high", ...}
print(result.field_status)     # per-field found/not_found status
print(result.credits_used)
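
The schema dictionaries above follow a regular shape: each field maps to a `type`, usually a `description`, and optionally `required`, `items`, or `properties`. As a minimal sketch, a small helper can cut down on the repetition. The name `field` is hypothetical and not part of the SDK; only the dictionary shape is taken from the example above.

```python
from typing import Any, Dict


def field(type_: str, description: str = "", required: bool = False, **extra: Any) -> Dict[str, Any]:
    """Build one field definition in the dictionary shape used above.

    Hypothetical helper, not part of the SDK.
    """
    spec: Dict[str, Any] = {"type": type_}
    if description:
        spec["description"] = description
    if required:
        spec["required"] = True
    spec.update(extra)  # e.g. items=... for arrays, properties=... for objects
    return spec


# Rebuilds part of the receipt schema from the example above.
receipt_schema = {
    "merchant": field("string", "Store name"),
    "items": field(
        "array",
        "Line items",
        items=field("object", properties={"name": field("string"), "price": field("number")}),
    ),
    "total": field("number", "Total amount", required=True),
}
```

The resulting `receipt_schema` is a plain dictionary, so it can be passed wherever the examples pass a `schema=` literal.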

Using Pydantic models (Python)

from pydantic import BaseModel, Field

class Invoice(BaseModel):
    vendor: str = Field(description="Company name")
    total: float = Field(description="Total amount due")
    is_paid: bool = Field(description="Whether the invoice is paid")

result = client.extract(file="invoice.pdf", schema=Invoice)

Using Zod schemas (TypeScript)

import { readFileSync } from "fs";
import { fromZod } from "@thedriveai/sdk";
import { z } from "zod";

const Invoice = z.object({
  vendor: z.string().describe("Company name"),
  total: z.number().describe("Total amount due"),
  is_paid: z.boolean().describe("Whether the invoice is paid"),
});

const result = await client.extract({
  file: readFileSync("invoice.pdf"),
  schema: fromZod(Invoice),
});

Analyze

Compute, reason, and derive answers from documents. Unlike extract (which finds existing data), analyze can perform calculations and make inferences.

result = client.analyze(
    file="financial_report.pdf",
    schema={
        "total_revenue": {"type": "number", "description": "Sum of all revenue line items"},
        "is_profitable": {"type": "boolean", "description": "Whether net income is positive"},
        "risk_factors": {"type": "array", "items": {"type": "string"}, "description": "Key risks mentioned"},
    },
    include_steps=True,  # see the agent's reasoning trace
)

for name in result.data:
    print(f"{name}: {result.data[name]}")
    print(f"  reasoning: {result.reasoning[name]}")
    print(f"  confidence: {result.confidence[name]}")

TypeScript

const result = await client.analyze({
  file: readFileSync("financial_report.pdf"),
  schema: {
    total_revenue: { type: "number", description: "Sum of all revenue line items" },
    is_profitable: { type: "boolean", description: "Whether net income is positive" },
  },
  includeSteps: true,
});

console.log(result.data.total_revenue);        // 45200000
console.log(result.reasoning.total_revenue);   // "Summed line items from pages 3-5..."
console.log(result.confidence.total_revenue);  // 0.97

Cross-document analysis

Validate and reason across multiple documents simultaneously — e.g. check an invoice against a contract.

result = client.analyze_cross(
    files=["invoice.pdf", "contract.pdf"],
    document_labels=["invoice", "contract"],
    schema={
        "rates_match": {"type": "boolean", "description": "Do invoice rates match the contract?"},
        "total_within_budget": {"type": "boolean", "description": "Is the invoice total within the contract budget?"},
    },
    include_steps=True,
)

for name in result.data:
    print(f"{name}: {result.data[name]}")
    print(f"  sources: {result.sources[name]}")  # ["[invoice] ...", "[contract] ..."]

for doc in result.documents:
    print(f"  {doc.label}: {doc.total_pages} pages")

TypeScript

const result = await client.analyzeCross({
  files: [readFileSync("invoice.pdf"), readFileSync("contract.pdf")],
  documentLabels: ["invoice", "contract"],
  schema: {
    rates_match: { type: "boolean", description: "Do invoice rates match the contract?" },
    total_within_budget: { type: "boolean", description: "Is the invoice total within the contract budget?" },
  },
  includeSteps: true,
});

console.log(result.data.rates_match);        // false
console.log(result.sources.rates_match);     // ['[invoice] "Rate: $175/hr"', '[contract] "Rate: $150/hr"']
console.log(result.documents[0].label);      // "invoice"
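
The example comments above show each source string starting with the document label in square brackets. Assuming that format holds (it is inferred from the comments, not from a documented contract), the snippets can be grouped per document with a short helper. `group_sources_by_label` is a hypothetical name, not an SDK function.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Assumed format, inferred from the example comments above: '[label] "quoted text"'.
_SOURCE_RE = re.compile(r"^\[(?P<label>[^\]]+)\]\s*(?P<text>.*)$", re.DOTALL)


def group_sources_by_label(sources: List[str]) -> Dict[str, List[str]]:
    """Group source snippets by the document label in their "[label]" prefix.

    Strings without a recognizable prefix are collected under the key ""
    so nothing is silently dropped.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for source in sources:
        match = _SOURCE_RE.match(source)
        if match:
            # Drop surrounding whitespace and the outer double quotes, if any.
            grouped[match["label"]].append(match["text"].strip().strip('"'))
        else:
            grouped[""].append(source)
    return dict(grouped)


example = ['[invoice] "Rate: $175/hr"', '[contract] "Rate: $150/hr"']
print(group_sources_by_label(example))
# {'invoice': ['Rate: $175/hr'], 'contract': ['Rate: $150/hr']}
```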

Markdown

Convert any file or URL to clean markdown.

# From a file
result = client.markdown(file="document.docx")
print(result.markdown)

# From a URL (shorthand)
md = client.markdown_url("https://example.com/about")
print(md)

TypeScript

const result = await client.markdown({ url: "https://example.com/about" });
console.log(result.markdown);

Thumbnails

Generate thumbnail images from files or URLs.

result = client.thumbnail(
    url="https://example.com/report.pdf",
    width=800,
    height=600,
    quality=90,
    response_type="url",  # or "base64"
)
print(result.thumbnail_url)
print(result.metadata.file_type)  # "pdf"

Screenshots

Capture a screenshot of any URL. Returns raw JPEG bytes.

jpeg_bytes = client.screenshot("https://stripe.com", width=1280, height=800)
with open("screenshot.jpg", "wb") as f:
    f.write(jpeg_bytes)

TypeScript

const buffer = await client.screenshot("https://stripe.com", {
  width: 1280,
  height: 800,
});
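
Because the call returns raw bytes, a cheap sanity check before writing to disk can catch cases where something other than an image came back. The sketch below relies only on the JPEG format itself, where every stream begins with the bytes `FF D8 FF`. `looks_like_jpeg` is a hypothetical helper, not part of the SDK.

```python
JPEG_MAGIC = b"\xff\xd8\xff"  # Start Of Image marker followed by the first segment marker


def looks_like_jpeg(data: bytes) -> bool:
    """Return True if the bytes start with the JPEG signature.

    This inspects only the first three bytes; it does not decode the image.
    """
    return data.startswith(JPEG_MAGIC)


print(looks_like_jpeg(b"\xff\xd8\xff\xe0" + b"\x00" * 16))  # True
print(looks_like_jpeg(b"<html>error</html>"))               # False
```

In the Python example above, this could guard the `open(...)` call, for instance by raising an error when `looks_like_jpeg(jpeg_bytes)` is false.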

Async and batch processing

For large files or high-volume workloads, use async mode or batch processing.

Async (single file)

Submit a task and poll for results, or provide a webhook URL to get notified.

# Submit
task = client.extract_async(
    file="large-document.pdf",
    schema={
        "title": {"type": "string", "description": "Document title"},
        "author": {"type": "string", "description": "Author name"},
    },
)
print(task.task_id)    # "task_abc123"
print(task.poll_url)   # "/api/v1/jobs/task_abc123"

# Wait for completion
result = client.wait_for_task(task.task_id, poll_interval=2.0, timeout=300.0)
print(result.result)   # the extraction data

# Or use a webhook instead of polling
task = client.extract_async(
    file="large-document.pdf",
    schema={"title": {"type": "string", "description": "Document title"}},
    webhook_url="https://your-app.com/webhooks/driveai",
)
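
To make the `poll_interval` and `timeout` parameters concrete, here is a generic polling loop of the kind `wait_for_task` presumably wraps. This is a sketch, not the SDK's implementation: the `"status"` key and the `"completed"` / `"failed"` values are assumptions, and `poll_until_done` is a hypothetical name. In practice `fetch_status` might wrap a call to `poll_task()`, whose exact signature is not shown here.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status() until it reports a terminal state or the timeout passes.

    The "status" key and its "completed"/"failed" values are assumptions
    made for this sketch; check the API docs for the real field names.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status.get("status") in ("completed", "failed"):
            return status
        # Give up if the next sleep would take us past the deadline.
        if time.monotonic() + poll_interval > deadline:
            raise TimeoutError(f"task not finished after {timeout} seconds")
        sleep(poll_interval)


# Simulated task that finishes on the third check (stand-in for a real API call).
_responses = iter([{"status": "pending"}, {"status": "processing"}, {"status": "completed"}])
final = poll_until_done(lambda: next(_responses), poll_interval=0.01, timeout=1.0)
print(final["status"])  # completed
```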

Batch (multiple files)

Process up to 20 files/URLs with a single schema.

batch = client.extract_batch(
    files=["invoice1.pdf", "invoice2.pdf", "invoice3.pdf"],
    urls=["https://example.com/invoice4.pdf"],
    schema={
        "vendor": {"type": "string", "description": "Company name"},
        "total": {"type": "number", "description": "Total amount"},
    },
)
print(batch.batch_id)  # "batch_xyz789"
print(batch.total)     # 4

# Wait for all to complete
result = client.wait_for_batch(batch.batch_id)
print(result.completed)     # 4
print(result.failed)        # 0
print(result.credits_used)  # 4
for task in result.results:
    print(task["result"]["data"])

TypeScript

const batch = await client.extractBatch({
  files: [readFileSync("invoice1.pdf"), readFileSync("invoice2.pdf")],
  urls: ["https://example.com/invoice3.pdf"],
  schema: {
    vendor: { type: "string", description: "Company name" },
    total: { type: "number", description: "Total amount" },
  },
});

const result = await client.waitForBatch(batch.batch_id);
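
With more than 20 inputs, the list has to be split before submission. A minimal sketch of that split follows, assuming the limit of 20 counts files and URLs together. `chunked` is a hypothetical helper and the file names are made up for illustration.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

MAX_BATCH_SIZE = 20  # limit stated above: up to 20 files/URLs per batch


def chunked(items: Sequence[T], size: int = MAX_BATCH_SIZE) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


paths = [f"invoice{i}.pdf" for i in range(1, 46)]  # 45 hypothetical file names
batches = list(chunked(paths))
print([len(b) for b in batches])  # [20, 20, 5]
```

Each element of `batches` can then be passed as the `files` argument of a separate `extract_batch` call.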

Error handling

from thedriveai import TheDriveAI, TheDriveAIError

client = TheDriveAI(api_key="tda_live_...")

try:
    result = client.extract(file="doc.pdf", schema={"title": {"type": "string", "description": "Document title"}})
except TheDriveAIError as e:
    print(e.status_code)  # 400, 413, 503, etc.
    print(e.detail)       # error detail from the API

TypeScript

import { TheDriveAI, TheDriveAIError } from "@thedriveai/sdk";

try {
  const result = await client.extract({ ... });
} catch (e) {
  if (e instanceof TheDriveAIError) {
    console.log(e.statusCode);
    console.log(e.detail);
  }
}
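
Since the error carries a status code, one common pattern is to retry only on codes that usually signal a temporary problem. The sketch below uses a stand-in exception class so it runs without the SDK installed. `FakeAPIError` and `call_with_retry` are hypothetical names, and treating 503 as retryable is an assumption rather than documented API behavior.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


class FakeAPIError(Exception):
    """Stand-in for TheDriveAIError so the sketch runs without the SDK installed."""

    def __init__(self, status_code: int, detail: str = "") -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[int, ...] = (503,),
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run func(), retrying with exponential backoff on selected status codes."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except FakeAPIError as exc:
            is_last_attempt = attempt == attempts - 1
            if exc.status_code not in retry_on or is_last_attempt:
                raise  # not retryable, or out of attempts
            sleep(base_delay * 2 ** attempt)  # 1x, 2x, 4x, ... the base delay
    raise AssertionError("unreachable")


calls = {"n": 0}


def flaky() -> str:
    """Fail twice with 503, then succeed (simulates a temporary outage)."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise FakeAPIError(503, "temporarily unavailable")
    return "ok"


print(call_with_retry(flaky, base_delay=0.01))  # ok
```

With the real SDK, the `except` clause would catch `TheDriveAIError` instead of the stand-in class.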

Configuration

client = TheDriveAI(
    api_key="tda_live_...",
    base_url="https://dev.thedrive.ai",  # default
    timeout=120.0,                        # seconds, default
)

# Use as context manager to auto-close
with TheDriveAI(api_key="tda_live_...") as client:
    result = client.extract(...)

TypeScript

const client = new TheDriveAI({
  apiKey: "tda_live_...",
  baseUrl: "https://dev.thedrive.ai",  // default
  timeout: 120_000,                     // ms, default
});
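
The examples above pass the API key as a string literal. A common alternative is to read it from an environment variable so the key stays out of source control. This is a sketch only: the variable name `THEDRIVEAI_API_KEY` and the helper `load_api_key` are hypothetical, and nothing here implies the SDK reads that variable by itself.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "THEDRIVEAI_API_KEY"  # hypothetical name chosen for this sketch


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from the environment, failing loudly if it is missing."""
    env = os.environ if env is None else env
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"set {ENV_VAR} before creating the client")
    return key


# Usage, assuming the SDK is installed:
#   client = TheDriveAI(api_key=load_api_key())
print(load_api_key({ENV_VAR: " placeholder-key "}))  # placeholder-key
```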

API reference

| Method | Description | Returns |
|--------|-------------|---------|
| extract() | Extract structured data from a file or URL | ExtractionResult |
| extract_async() | Submit extraction for async processing | TaskStatus |
| extract_batch() | Batch extract from multiple files/URLs | BatchStatus |
| analyze() | Analyze a document (compute, reason, derive) | AnalysisResult |
| analyze_async() | Submit analysis for async processing | TaskStatus |
| analyze_batch() | Batch analyze multiple files/URLs | BatchStatus |
| analyze_cross() | Cross-reference and validate across 2-5 documents | CrossAnalysisResult |
| markdown() | Convert a file or URL to markdown | MarkdownResult |
| markdown_url() | Convert URL to markdown (shorthand) | str |
| thumbnail() | Generate a thumbnail from file or URL | ThumbnailResult |
| screenshot() | Screenshot a URL | bytes / ArrayBuffer |
| poll_task() | Check async task status | TaskStatus |
| poll_batch() | Check batch status | BatchStatus |
| wait_for_task() | Poll until task completes | TaskStatus |
| wait_for_batch() | Poll until batch completes | BatchStatus |

Links