@thedriveai/sdk

v1.4.0

Published

12 days ago

TypeScript SDK for The Drive AI API — structured data extraction from any file or URL


bigyankarki

thedriveai extraction document ai structured-data

The Drive AI SDKs

Python and TypeScript clients for The Drive AI developer API — a set of file intelligence endpoints for extracting structured data, analyzing documents, converting to markdown, and generating thumbnails from any file or URL. Built for AI agents and developers who need to process documents programmatically.

This is not the SDK for The Drive AI's core product. These are standalone developer APIs available at dev.thedrive.ai.

Website | API Docs | Get API Key

Installation

Python

pip install thedriveai

TypeScript

npm install @thedriveai/sdk

Quick start

Python

from thedriveai import TheDriveAI

client = TheDriveAI(api_key="tda_live_...")

with open("invoice.pdf", "rb") as f:  # close the file handle once the upload finishes
    result = client.extract(
        file=f,
        schema={
            "vendor": {"type": "string", "description": "Company name"},
            "total": {"type": "number", "description": "Total amount due"},
        },
    )
print(result.data["vendor"])  # "Acme Corp"
print(result.data["total"])   # 1234.56

TypeScript

import { TheDriveAI } from "@thedriveai/sdk";
import { readFileSync } from "fs";

const client = new TheDriveAI({ apiKey: "tda_live_..." });

const result = await client.extract({
  file: readFileSync("invoice.pdf"),
  schema: {
    vendor: { type: "string", description: "Company name" },
    total: { type: "number", description: "Total amount due" },
  },
});
console.log(result.data.vendor); // "Acme Corp"

Extract

Pull structured data from any file or URL. Define the fields you want and get typed results back.

result = client.extract(
    url="https://example.com/receipt.pdf",
    schema={
        "merchant": {"type": "string", "description": "Store name"},
        "date": {"type": "string", "description": "Purchase date"},
        "items": {
            "type": "array",
            "description": "Line items",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "price": {"type": "number"},
                },
            },
        },
        "total": {"type": "number", "description": "Total amount", "required": True},
    },
    model="accurate",
)

print(result.data)
print(result.confidence)       # {"merchant": "high", "date": "high", ...}
print(result.field_status)     # per-field found/not_found status
print(result.credits_used)
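
One way to act on the per-field metadata is to route uncertain fields to manual review. The sketch below works on plain dictionaries shaped like the `confidence` and `field_status` values printed above. The helper name `fields_needing_review` is hypothetical, and the label strings ("high", "found", "not_found") are assumptions inferred from the comments in the example, not documented SDK behavior.

```python
def fields_needing_review(confidence, field_status, trusted=("high",)):
    """Return field names that were not found or lack a trusted confidence label."""
    flagged = []
    for name in sorted(set(confidence) | set(field_status)):
        found = field_status.get(name) == "found"       # assumed status string
        confident = confidence.get(name) in trusted     # assumed confidence labels
        if not (found and confident):
            flagged.append(name)
    return flagged


# Sample values standing in for result.confidence and result.field_status.
confidence = {"merchant": "high", "date": "low", "total": "high"}
field_status = {"merchant": "found", "date": "found", "total": "not_found"}
print(fields_needing_review(confidence, field_status))  # ['date', 'total']
```

With a real response, the two arguments would presumably be `result.confidence` and `result.field_status`.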

Using Pydantic models (Python)

from pydantic import BaseModel, Field

class Invoice(BaseModel):
    vendor: str = Field(description="Company name")
    total: float = Field(description="Total amount due")
    is_paid: bool = Field(description="Whether the invoice is paid")

result = client.extract(file="invoice.pdf", schema=Invoice)
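
For projects that do not use Pydantic, the same idea can be approximated with the standard library. The sketch below builds a dict schema in the format shown earlier from a dataclass. It is not how the SDK converts Pydantic models; the helper `schema_from_dataclass` is hypothetical, and mapping `int` to "number" is an assumption, since this README only shows the type names "string", "number", "boolean", "array", and "object".

```python
from dataclasses import dataclass, field, fields

# Python type -> schema type string, limited to type names that appear in this README.
_TYPE_NAMES = {str: "string", float: "number", int: "number", bool: "boolean"}


def schema_from_dataclass(cls):
    """Build a dict schema from a dataclass whose fields carry a description."""
    schema = {}
    for f in fields(cls):
        if f.type not in _TYPE_NAMES:
            raise TypeError(f"unsupported field type for {f.name}: {f.type!r}")
        schema[f.name] = {
            "type": _TYPE_NAMES[f.type],
            "description": f.metadata.get("description", ""),
        }
    return schema


@dataclass
class Invoice:
    vendor: str = field(metadata={"description": "Company name"})
    total: float = field(metadata={"description": "Total amount due"})
    is_paid: bool = field(metadata={"description": "Whether the invoice is paid"})


print(schema_from_dataclass(Invoice)["total"])
# {'type': 'number', 'description': 'Total amount due'}
```

The resulting dictionary could then be passed as the `schema` argument in the same way as the hand-written schemas above.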

Using Zod schemas (TypeScript)

import { fromZod } from "@thedriveai/sdk";
import { readFileSync } from "fs";
import { z } from "zod";

const Invoice = z.object({
  vendor: z.string().describe("Company name"),
  total: z.number().describe("Total amount due"),
  is_paid: z.boolean().describe("Whether the invoice is paid"),
});

const result = await client.extract({
  file: readFileSync("invoice.pdf"),
  schema: fromZod(Invoice),
});

Analyze

Compute, reason, and derive answers from documents. Unlike extract (which finds existing data), analyze can perform calculations and make inferences.

Python

result = client.analyze(
    file="financial_report.pdf",
    schema={
        "total_revenue": {"type": "number", "description": "Sum of all revenue line items"},
        "is_profitable": {"type": "boolean", "description": "Whether net income is positive"},
        "risk_factors": {"type": "array", "items": {"type": "string"}, "description": "Key risks mentioned"},
    },
    include_steps=True,  # see the agent's reasoning trace
)

for name in result.data:
    print(f"{name}: {result.data[name]}")
    print(f"  reasoning: {result.reasoning[name]}")
    print(f"  confidence: {result.confidence[name]}")

TypeScript

const result = await client.analyze({
  file: readFileSync("financial_report.pdf"),
  schema: {
    total_revenue: { type: "number", description: "Sum of all revenue line items" },
    is_profitable: { type: "boolean", description: "Whether net income is positive" },
  },
  includeSteps: true,
});

console.log(result.data.total_revenue);        // 45200000
console.log(result.reasoning.total_revenue);   // "Summed line items from pages 3-5..."
console.log(result.confidence.total_revenue);  // 0.97

Cross-document analysis

Validate and reason across multiple documents simultaneously — e.g. check an invoice against a contract.

Python

result = client.analyze_cross(
    files=["invoice.pdf", "contract.pdf"],
    document_labels=["invoice", "contract"],
    schema={
        "rates_match": {"type": "boolean", "description": "Do invoice rates match the contract?"},
        "total_within_budget": {"type": "boolean", "description": "Is the invoice total within the contract budget?"},
    },
    include_steps=True,
)

for name in result.data:
    print(f"{name}: {result.data[name]}")
    print(f"  sources: {result.sources[name]}")  # ["[invoice] ...", "[contract] ..."]

for doc in result.documents:
    print(f"  {doc.label}: {doc.total_pages} pages")

TypeScript

const result = await client.analyzeCross({
  files: [readFileSync("invoice.pdf"), readFileSync("contract.pdf")],
  documentLabels: ["invoice", "contract"],
  schema: {
    rates_match: { type: "boolean", description: "Do invoice rates match the contract?" },
    total_within_budget: { type: "boolean", description: "Is the invoice total within the contract budget?" },
  },
  includeSteps: true,
});

console.log(result.data.rates_match);        // false
console.log(result.sources.rates_match);     // ['[invoice] "Rate: $175/hr"', '[contract] "Rate: $150/hr"']
console.log(result.documents[0].label);      // "invoice"
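
The `sources` entries in the examples above appear to follow a `[label] quote` pattern. Assuming that format holds (it is inferred from the example comments, not from documented behavior), a small helper can group the quotes by document label, which makes mismatches easier to report. The name `group_sources_by_label` is hypothetical and not part of the SDK.

```python
import re
from collections import defaultdict

# Matches "[label] rest of the entry"; the format is an assumption based on the examples.
_SOURCE = re.compile(r"^\[(?P<label>[^\]]+)\]\s*(?P<quote>.*)$", re.DOTALL)


def group_sources_by_label(sources):
    """Group '[label] quote' strings into {label: [quote, ...]}."""
    grouped = defaultdict(list)
    for entry in sources:
        match = _SOURCE.match(entry)
        if match is None:
            grouped["unlabeled"].append(entry)  # keep entries that do not fit the pattern
            continue
        grouped[match["label"]].append(match["quote"].strip().strip('"'))
    return dict(grouped)


sources = ['[invoice] "Rate: $175/hr"', '[contract] "Rate: $150/hr"']
print(group_sources_by_label(sources))
# {'invoice': ['Rate: $175/hr'], 'contract': ['Rate: $150/hr']}
```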

Markdown

Convert any file or URL to clean markdown.

Python

# From a file
result = client.markdown(file="document.docx")
print(result.markdown)

# From a URL (shorthand)
md = client.markdown_url("https://example.com/about")
print(md)

TypeScript

const result = await client.markdown({ url: "https://example.com/about" });
console.log(result.markdown);

Thumbnails

Generate thumbnail images from files or URLs.

result = client.thumbnail(
    url="https://example.com/report.pdf",
    width=800,
    height=600,
    quality=90,
    response_type="url",  # or "base64"
)
print(result.thumbnail_url)
print(result.metadata.file_type)  # "pdf"
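
When `response_type="base64"` is used, the image data has to be decoded before it can be written to disk. This README does not show which attribute of the result carries the base64 string, so the sketch below works on a plain string and leaves that lookup to the caller. The helper name `decode_base64_image` is hypothetical, and handling a `data:` URI prefix is a defensive assumption rather than documented behavior.

```python
import base64


def decode_base64_image(payload):
    """Decode a base64 image string, tolerating an optional data-URI prefix."""
    if payload.startswith("data:"):
        # Drop a "data:image/...;base64," header if one is present.
        payload = payload.split(",", 1)[1]
    return base64.b64decode(payload, validate=True)


# Self-contained demo: encode a few bytes, then decode them again.
sample = base64.b64encode(b"\xff\xd8\xff\xe0 demo").decode("ascii")
image_bytes = decode_base64_image("data:image/jpeg;base64," + sample)
print(len(image_bytes))  # 9
```

The decoded bytes can then be written with `open(path, "wb")`, as in the screenshot example.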

Screenshots

Capture a screenshot of any URL. Returns raw JPEG bytes.

Python

jpeg_bytes = client.screenshot("https://stripe.com", width=1280, height=800)
with open("screenshot.jpg", "wb") as f:
    f.write(jpeg_bytes)

TypeScript

const buffer = await client.screenshot("https://stripe.com", {
  width: 1280,
  height: 800,
});

Async and batch processing

For large files or high-volume workloads, use async mode or batch processing.

Async (single file)

Submit a task and poll for results, or provide a webhook URL to get notified.

# Submit
task = client.extract_async(
    file="large-document.pdf",
    schema={
        "title": {"type": "string", "description": "Document title"},
        "author": {"type": "string", "description": "Author name"},
    },
)
print(task.task_id)    # "task_abc123"
print(task.poll_url)   # "/api/v1/jobs/task_abc123"

# Wait for completion
result = client.wait_for_task(task.task_id, poll_interval=2.0, timeout=300.0)
print(result.result)   # the extraction data

# Or use a webhook instead of polling
task = client.extract_async(
    file="large-document.pdf",
    schema={"title": {"type": "string", "description": "Document title"}},
    webhook_url="https://your-app.com/webhooks/driveai",
)
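
To make the `poll_interval` and `timeout` parameters concrete, here is a minimal sketch of a polling loop of the kind `wait_for_task` presumably runs. It is not the SDK's implementation. The function `wait_until_done`, the `"status"` key, and the state names ("pending", "processing", "completed", "failed") are all assumptions for illustration, and the demo uses canned responses instead of network calls.

```python
import time


def wait_until_done(fetch_status, poll_interval=2.0, timeout=300.0,
                    done_states=("completed", "failed"),
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it reports a terminal state or the timeout passes."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status["status"] in done_states:
            return status
        if clock() + poll_interval > deadline:
            # Another sleep would overshoot the deadline, so give up now.
            raise TimeoutError(f"task still {status['status']!r} after {timeout}s")
        sleep(poll_interval)


# Simulated task that finishes on the third poll.
responses = iter([
    {"status": "pending"},
    {"status": "processing"},
    {"status": "completed", "result": {"title": "Q3 Report"}},
])
final = wait_until_done(lambda: next(responses), poll_interval=0.01, timeout=1.0)
print(final["result"])  # {'title': 'Q3 Report'}
```

A webhook avoids this loop entirely, which is usually preferable when tasks run for minutes rather than seconds.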

Batch (multiple files)

Process up to 20 files/URLs with a single schema.

Python

batch = client.extract_batch(
    files=["invoice1.pdf", "invoice2.pdf", "invoice3.pdf"],
    urls=["https://example.com/invoice4.pdf"],
    schema={
        "vendor": {"type": "string", "description": "Company name"},
        "total": {"type": "number", "description": "Total amount"},
    },
)
print(batch.batch_id)  # "batch_xyz789"
print(batch.total)     # 4

# Wait for all to complete
result = client.wait_for_batch(batch.batch_id)
print(result.completed)     # 4
print(result.failed)        # 0
print(result.credits_used)  # 4
for task in result.results:
    print(task["result"]["data"])

TypeScript

const batch = await client.extractBatch({
  files: [readFileSync("invoice1.pdf"), readFileSync("invoice2.pdf")],
  urls: ["https://example.com/invoice3.pdf"],
  schema: {
    vendor: { type: "string", description: "Company name" },
    total: { type: "number", description: "Total amount" },
  },
});

const result = await client.waitForBatch(batch.batch_id);
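
Because a batch accepts at most 20 inputs, larger workloads need to be split before submission. The helper below is a minimal sketch of that step in Python. The name `chunk_inputs` is hypothetical, and it assumes the limit of 20 applies to files and URLs combined, which the sentence above suggests but does not state explicitly.

```python
def chunk_inputs(items, max_per_batch=20):
    """Split a list of file paths or URLs into consecutive groups of at most max_per_batch."""
    if max_per_batch < 1:
        raise ValueError("max_per_batch must be at least 1")
    return [items[i:i + max_per_batch] for i in range(0, len(items), max_per_batch)]


paths = [f"invoice{n}.pdf" for n in range(1, 46)]  # 45 hypothetical files
batches = chunk_inputs(paths)
print([len(b) for b in batches])  # [20, 20, 5]
```

Each chunk could then be passed as `files` to a separate `extract_batch` call, with the returned batch IDs collected for later polling.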

Error handling

Python

from thedriveai import TheDriveAI, TheDriveAIError

client = TheDriveAI(api_key="tda_live_...")

try:
    result = client.extract(file="doc.pdf", schema={"title": {"type": "string", "description": "Document title"}})
except TheDriveAIError as e:
    print(e.status_code)  # 400, 413, 503, etc.
    print(e.detail)       # error detail from the API

TypeScript

import { TheDriveAI, TheDriveAIError } from "@thedriveai/sdk";

try {
  const result = await client.extract({ ... });
} catch (e) {
  if (e instanceof TheDriveAIError) {
    console.log(e.statusCode);
    console.log(e.detail);
  }
}
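
Status codes such as 503 are often transient, so callers may want to retry them while letting errors like 400 or 413 fail immediately. The sketch below shows one way to do that in Python with exponential backoff. It is not part of the SDK: `call_with_retry` is a hypothetical helper, `FakeAPIError` is a stand-in for `TheDriveAIError` so the example runs without the package installed, and treating only 503 as retryable is an assumption.

```python
import time


def call_with_retry(call, retryable=(503,), attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run call(); retry with exponential backoff when the error has a retryable status_code."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            if status not in retryable or attempt == attempts - 1:
                raise  # non-retryable error, or out of attempts
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...


class FakeAPIError(Exception):
    """Stand-in for TheDriveAIError so the sketch runs without the SDK."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


# Two simulated 503 failures followed by a successful response.
outcomes = iter([FakeAPIError(503), FakeAPIError(503), {"title": "Annual Report"}])


def flaky_extract():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


print(call_with_retry(flaky_extract, sleep=lambda seconds: None))  # {'title': 'Annual Report'}
```

In real use, the callable would wrap the SDK call, for example `lambda: client.extract(...)`.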

Configuration

Python

client = TheDriveAI(
    api_key="tda_live_...",
    base_url="https://dev.thedrive.ai",  # default
    timeout=120.0,                        # seconds, default
)

# Use as context manager to auto-close
with TheDriveAI(api_key="tda_live_...") as client:
    result = client.extract(...)

TypeScript

const client = new TheDriveAI({
  apiKey: "tda_live_...",
  baseUrl: "https://dev.thedrive.ai",  // default
  timeout: 120_000,                     // ms, default
});
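
The examples above inline the API key for brevity. In practice it is safer to read it from the environment so the key stays out of source control. The sketch below shows one way to do that in Python. The variable name `THEDRIVEAI_API_KEY` and the helper `load_api_key` are hypothetical; nothing in this README indicates that the SDK reads any environment variable on its own.

```python
import os


def load_api_key(env=os.environ, var_name="THEDRIVEAI_API_KEY"):
    """Read the API key from an environment variable; fail fast if it is missing."""
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"set {var_name} before creating the client")
    return key


# Demo with an explicit mapping instead of the real environment.
api_key = load_api_key({"THEDRIVEAI_API_KEY": "tda_live_example"})
print(api_key)  # tda_live_example
```

The returned value would then be passed as `api_key` when constructing the client.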

API reference

| Method | Description | Returns |
|--------|-------------|---------|
| extract() | Extract structured data from a file or URL | ExtractionResult |
| extract_async() | Submit extraction for async processing | TaskStatus |
| extract_batch() | Batch extract from multiple files/URLs | BatchStatus |
| analyze() | Analyze a document (compute, reason, derive) | AnalysisResult |
| analyze_async() | Submit analysis for async processing | TaskStatus |
| analyze_batch() | Batch analyze multiple files/URLs | BatchStatus |
| analyze_cross() | Cross-reference and validate across 2-5 documents | CrossAnalysisResult |
| markdown() | Convert a file or URL to markdown | MarkdownResult |
| markdown_url() | Convert URL to markdown (shorthand) | str |
| thumbnail() | Generate a thumbnail from file or URL | ThumbnailResult |
| screenshot() | Screenshot a URL | bytes / ArrayBuffer |
| poll_task() | Check async task status | TaskStatus |
| poll_batch() | Check batch status | BatchStatus |
| wait_for_task() | Poll until task completes | TaskStatus |
| wait_for_batch() | Poll until batch completes | BatchStatus |