@meistrari/document-sdk

v1.9.0

Published

20 days ago

SDK for the Document Processing API, with support for PDF extraction, templates, image conversion, and more.

Document Processing SDK

A TypeScript SDK for the Document Processing API that provides methods for PDF extraction, splitting, template generation, PDF-to-image conversion, merging, cropping, and document conversion operations.

Installation

npm install @meistrari/document-sdk
# or
pnpm add @meistrari/document-sdk
# or
yarn add @meistrari/document-sdk

Quick Start

Using DataToken Authentication

import { docClient } from '@meistrari/document-sdk'

const client = docClient({
  apiUrl: 'https://your-api-url.com',
  dataToken: 'your-data-token'
})

Using API Key Authentication

import { docClient } from '@meistrari/document-sdk'

const client = docClient({
  apiUrl: 'https://your-api-url.com',
  apiKey: 'Bearer your-api-key'
})

Using DocClient Class Directly

import { DocClient } from '@meistrari/document-sdk'

const client = new DocClient({
  apiUrl: 'https://your-api-url.com',
  dataToken: 'your-data-token'
})
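In practice, credentials are usually read from the environment rather than hard-coded. The sketch below builds a configuration object shaped like the ones above. The helper name configFromEnv and the variable names DOCUMENT_API_URL, DOCUMENT_DATA_TOKEN and DOCUMENT_API_KEY are hypothetical, not SDK conventions, and preferring the data token when both credentials are set is an arbitrary choice.

```typescript
// Hypothetical helper: the environment variable names are assumptions, not SDK conventions.
interface ClientConfigSketch {
  apiUrl: string
  dataToken?: string
  apiKey?: string
}

function configFromEnv(env: Record<string, string | undefined>): ClientConfigSketch {
  const apiUrl = env.DOCUMENT_API_URL
  if (!apiUrl) {
    throw new Error('DOCUMENT_API_URL is not set')
  }
  // Prefer the data token when both credentials are present (an arbitrary choice).
  if (env.DOCUMENT_DATA_TOKEN) {
    return { apiUrl, dataToken: env.DOCUMENT_DATA_TOKEN }
  }
  if (env.DOCUMENT_API_KEY) {
    // Passed through unchanged, so include any prefix the API expects (such as 'Bearer ').
    return { apiUrl, apiKey: env.DOCUMENT_API_KEY }
  }
  throw new Error('Set DOCUMENT_DATA_TOKEN or DOCUMENT_API_KEY')
}
```

The result could then be passed to the client factory, for example docClient(configFromEnv(process.env)).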

Operations

Extract Pages from PDF

Extract specific pages from PDF documents. Pass a flat list of indexes to get a single merged PDF, or a list of index groups to get one PDF per group.

// Single output (all pages merged into one PDF)
const result = await client.extract({
  type: 'page',
  indexes: [1, 3, -1, '2-4'], // Supports negative indexes and string ranges
  files: [
    {
      file_url: 'vault://document-123',
      filename: 'document.pdf'
    }
  ]
})

// Multiple outputs (each group creates a separate PDF)
const groupedResult = await client.extract({
  type: 'page',
  indexes: [[1, 2], [3, 4], [5, '6-10']], // Each array creates a separate output file
  files: [
    {
      file_url: 'vault://document-123',
      filename: 'document.pdf'
    }
  ]
})

Index formats supported:

Positive integers: 1, 2, 3 (1-based indexing)
Negative integers: -1, -2 (last page, second to last, etc.)
Ranges as strings: '2-4', '1-3'
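To make the index formats concrete, here is a minimal client-side sketch of how such a list might resolve to page numbers for a document of known length. It is not the API's implementation. The names PageIndex, resolveIndex and resolveIndexes are hypothetical, and it assumes that ranges are inclusive, that input order is preserved, and that duplicates are kept.

```typescript
// Illustrative only: assumes inclusive ranges and that order and duplicates are preserved.
type PageIndex = number | string

function resolveIndex(index: PageIndex, pageCount: number): number[] {
  if (typeof index === 'number') {
    // Negative indexes count from the end: -1 is the last page.
    const page = index < 0 ? pageCount + index + 1 : index
    if (!Number.isInteger(page) || page < 1 || page > pageCount) {
      throw new RangeError(`Page index ${index} is out of range for ${pageCount} pages`)
    }
    return [page]
  }
  const match = /^(\d+)-(\d+)$/.exec(index.trim())
  if (!match) {
    throw new Error(`Unsupported index format: ${index}`)
  }
  const start = Number(match[1])
  const end = Number(match[2])
  if (start < 1 || start > end || end > pageCount) {
    throw new RangeError(`Range ${index} is out of range for ${pageCount} pages`)
  }
  const pages: number[] = []
  for (let page = start; page <= end; page++) {
    pages.push(page)
  }
  return pages
}

function resolveIndexes(indexes: PageIndex[], pageCount: number): number[] {
  const pages: number[] = []
  for (const index of indexes) {
    pages.push(...resolveIndex(index, pageCount))
  }
  return pages
}

// For a 10-page document, [1, 3, -1, '2-4'] resolves to [1, 3, 10, 2, 3, 4].
console.log(resolveIndexes([1, 3, -1, '2-4'], 10))
```

Running a check like this before calling extract can surface an out-of-range index without a round trip to the API.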

Split PDF

Split PDF documents into smaller chunks by page count.

const result = await client.split({
  files: [
    {
      file_url: 'vault://document-123',
      filename: 'large-document.pdf'
    }
  ],
  chunk_size: 5, // Pages per chunk (default: 1)
  type: 'page'
})
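As a rough illustration of what chunk_size means, the sketch below computes the page ranges that a split by page count would produce. The function chunkRanges is hypothetical, and it assumes the final chunk simply holds the remaining pages. The API's actual handling of the remainder is not documented here.

```typescript
// Illustrative only: assumes the last chunk contains whatever pages remain.
function chunkRanges(pageCount: number, chunkSize = 1): Array<[number, number]> {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError('pageCount must be a non-negative integer')
  }
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError('chunkSize must be a positive integer')
  }
  const ranges: Array<[number, number]> = []
  for (let start = 1; start <= pageCount; start += chunkSize) {
    // Each range is [firstPage, lastPage], 1-based and inclusive.
    ranges.push([start, Math.min(start + chunkSize - 1, pageCount)])
  }
  return ranges
}

// A 12-page document with chunkSize 5 yields [[1, 5], [6, 10], [11, 12]].
console.log(chunkRanges(12, 5))
```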

Generate Document from Template

Generate documents from .docx templates with dynamic data substitution. Supports Mustache templating with loops and conditionals.

const result = await client.template({
  template: {
    content: 'base64-encoded-docx-content',
    filename: 'template.docx'
  },
  data: {
    name: 'John Doe',
    company: 'Acme Corp',
    user: {
      email: 'john.doe@example.com'
    },
    // Arrays for loops
    items: [
      { name: 'Item 1', price: 100 },
      { name: 'Item 2', price: 200 }
    ]
  },
  options: {
    outputFormat: 'vault_url', // 'base64' or 'vault_url'
    enableMustache: true, // Enable loops and conditionals
    createPermalink: true, // Create public download URL
    permalinkExpiresAt: '2025-12-31T23:59:59Z', // Optional expiration
    outputFilename: 'contract.docx', // Custom output filename
    validateData: true // Validate placeholders
  }
})
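The template content is passed as a base64 string. A minimal Node.js sketch for producing it from a .docx file on disk follows. The helpers toTemplateInput and loadTemplate are hypothetical, and the sketch assumes the API expects plain base64 without a data URI prefix, as the placeholder above suggests.

```typescript
import { readFile } from 'node:fs/promises'
import { basename } from 'node:path'

// Hypothetical shape mirroring the `template` field in the example above.
interface TemplateInputSketch {
  content: string
  filename: string
}

// Encode raw file bytes as plain base64 (no data URI prefix, which is an assumption).
function toTemplateInput(bytes: Uint8Array, filename: string): TemplateInputSketch {
  return {
    content: Buffer.from(bytes).toString('base64'),
    filename
  }
}

// Read a .docx file from disk and wrap it for use as the `template` field.
async function loadTemplate(path: string): Promise<TemplateInputSketch> {
  const bytes = await readFile(path)
  return toTemplateInput(bytes, basename(path))
}
```

The result of loadTemplate('./template.docx') could then be supplied as the template value in the request.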

Convert PDF to Images

Convert specific PDF pages to high-quality JPEG images.

const result = await client.pdfToImage({
  file_url: 'vault://document-123',
  filename: 'document.pdf',
  pages: [1, 2, 5], // Pages to convert (1-indexed)
  quality: 'auto'   // optional: 'auto' (default) | 'standard' | 'high'
})

// Response includes image metadata
// result.images[0].metadata = { width, height, format }
//
// quality:
//   'auto'     — hybrid: extracts the original embedded image when a page is
//                dominated by one, otherwise rasterizes at an adaptive resolution
//                that preserves the native pixels of any downscaled XObjects
//                (capped at 6× / 4000px). Output dimensions vary per page —
//                always read metadata.width/height for downstream cropping.
//   'standard' — legacy fixed scale (≈192 DPI). Output dimensions match the
//                pre-quality-flag behavior.
//   'high'     — forces adaptive rasterization up to the cap, skipping the
//                direct-extraction gate.
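Because output dimensions can differ per page, crop regions are easier to keep as fractions of the page and convert to pixels only once the image metadata is known. The sketch below shows one way to do that. NormalizedBox and toPixelCoordinates are hypothetical names, and the returned shape mirrors the x1/y1/x2/y2 coordinates used by the crop operation.

```typescript
interface ImageSize {
  width: number
  height: number
}

// Hypothetical region expressed as fractions of the page, each between 0 and 1.
interface NormalizedBox {
  left: number
  top: number
  right: number
  bottom: number
}

interface CropCoordinatesSketch {
  x1: number
  y1: number
  x2: number
  y2: number
}

// Scale a fractional box to pixel coordinates for an image of the given size.
function toPixelCoordinates(box: NormalizedBox, size: ImageSize): CropCoordinatesSketch {
  const clamp = (value: number) => Math.min(1, Math.max(0, value))
  return {
    x1: Math.round(clamp(box.left) * size.width),
    y1: Math.round(clamp(box.top) * size.height),
    x2: Math.round(clamp(box.right) * size.width),
    y2: Math.round(clamp(box.bottom) * size.height)
  }
}
```

With a response from pdfToImage, the size argument would come from result.images[0].metadata, so the same fractional box works whichever quality mode produced the image.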

Merge Files

Merge multiple PDFs or audio files into a single file.

// Merge PDFs
const result = await client.merge({
  files: [
    { file_url: 'vault://doc1', filename: 'chapter1.pdf' },
    { file_url: 'vault://doc2', filename: 'chapter2.pdf' }
  ],
  outputFilename: 'complete-book.pdf'
})

// Merge audio files (mp3, wav, m4a, aac, ogg, flac, wma, opus, webm)
const audioResult = await client.merge({
  files: [
    { file_url: 'vault://audio1', filename: 'intro.mp3' },
    { file_url: 'vault://audio2', filename: 'main.mp3' }
  ],
  outputFilename: 'complete-audio.mp3'
})

Crop Image

Crop an image to the rectangle defined by its top-left (x1, y1) and bottom-right (x2, y2) corners.

const result = await client.crop({
  file_url: 'vault://image-123',
  filename: 'image.jpg',
  coordinates: {
    x1: 10, // Left
    y1: 10, // Top
    x2: 200, // Right
    y2: 200 // Bottom
  }
})

// Response includes dimensions
// result.dimensions = { width, height }

Convert Markdown to PDF

Convert Markdown content to PDF with customizable styling.

const result = await client.markdownToPdf({
  markdownContent: '# My Document\n\nThis is **bold** text.',
  options: {
    theme: 'tela-default', // 'tela-default', 'legal-document', 'invoice'
    content: {
      logo: 'data:image/png;base64,...', // Optional logo
      title: 'Report Title',
      footerText: 'Confidential'
    },
    align: {
      headerLogo: 'right', // 'left', 'center', 'right', 'none'
      headerTitle: 'center',
      pageNumber: 'center',
      footerText: 'left'
    },
    customCss: 'h1 { color: #003366; }',
    pdfOptions: {
      format: 'A4',
      margin: {
        top: '20mm',
        right: '15mm',
        bottom: '20mm',
        left: '15mm'
      }
    }
  }
})

Convert JSON to PDF

Convert JSON data to a formatted PDF document with syntax highlighting.

const result = await client.jsonToPdf({
  jsonData: {
    request: {
      method: 'POST',
      url: '/api/users'
    },
    response: {
      status: 200,
      data: { id: 123 }
    }
  },
  options: {
    content: {
      title: 'API Request Log',
      footerText: 'Generated in DEV'
    },
    align: {
      headerTitle: 'center',
      pageNumber: 'center'
    },
    jsonFormatting: {
      spacing: 2,
      highlightTheme: 'github', // 'github', 'stackoverflow-light', 'atom-one-light', 'googlecode'
      fontFamily: 'Fira Code, monospace',
      fontSize: '12px',
      lineHeight: '1.4'
    },
    pdfOptions: {
      format: 'A4',
      printBackground: true,
      displayHeaderFooter: true
    }
  }
})

Error Handling

The SDK provides typed error classes for different scenarios:

import {
  AuthenticationError,
  NetworkError,
  NotFoundError,
  ServerError,
  ValidationError
} from '@meistrari/document-sdk'

try {
  const result = await client.extract(request)
}
catch (error) {
  if (error instanceof ValidationError) {
    console.log('Validation failed:', error.message)
  }
  else if (error instanceof AuthenticationError) {
    console.log('Authentication failed:', error.message)
  }
  else if (error instanceof NetworkError) {
    console.log('Network error:', error.message)
  }
  else if (error instanceof ServerError) {
    console.log('Server error:', error.message, error.statusCode)
  }
  else if (error instanceof NotFoundError) {
    console.log('Not found:', error.message)
  }
}
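Typed errors make it possible to retry only the failures that may be transient. The following is a generic sketch rather than an SDK feature: withRetry and backoffDelay are hypothetical helpers, and which error classes are worth retrying is left to the caller, since the SDK documentation above does not say which failures are temporary.

```typescript
// Exponential backoff: 250 ms, 500 ms, 1000 ms, ... capped at maxMs.
function backoffDelay(attempt: number, baseMs = 250, maxMs = 5000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt)
}

// Run `operation`, retrying while `isRetryable` accepts the thrown error.
async function withRetry<T>(
  operation: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  maxAttempts = 3
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation()
    }
    catch (error) {
      // Give up on the last attempt or on errors the caller considers permanent.
      if (attempt + 1 >= maxAttempts || !isRetryable(error)) {
        throw error
      }
      await new Promise(resolve => setTimeout(resolve, backoffDelay(attempt)))
    }
  }
}
```

A call might look like withRetry(() => client.extract(request), error => error instanceof NetworkError), on the assumption that network failures are the ones worth repeating.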

TypeScript Support

The SDK is written in TypeScript and provides comprehensive type definitions:

import type {
  // Config
  DocConfig,
  // Common
  FileInput,
  IndexesInput,
  SingleIndex,
  // Extract
  ExtractRequest,
  ExtractResponse,
  // Split
  SplitRequest,
  SplitResponse,
  // Template
  TemplateInput,
  TemplateOptions,
  TemplateRequest,
  TemplateResponse,
  // PDF to Image
  PdfToImageRequest,
  PdfToImageResponse,
  // Merge
  MergeRequest,
  MergeResponse,
  // Crop
  CropRequest,
  CropResponse,
  // Markdown to PDF
  MarkdownToPdfOptions,
  MarkdownToPdfRequest,
  MarkdownToPdfResponse,
  // JSON to PDF
  JsonToPdfOptions,
  JsonToPdfRequest,
  JsonToPdfResponse,
  // Other exported types
  Align,
  Content,
  ElementPosition,
  PdfOptionsMargin,
  Theme,
} from '@meistrari/document-sdk'

File URL Formats

The SDK supports multiple file URL formats:

Vault URLs: vault://file-id
External URLs: https://example.com/file.pdf
Base64: inline base64-encoded data, used for template content
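When file references come from user input, it can help to check their form before sending a request. The sketch below distinguishes the two URL styles listed above. The function classifyFileUrl is hypothetical, and accepting plain http alongside https is an assumption, since only an https example is documented.

```typescript
type FileUrlKind = 'vault' | 'external'

const VAULT_PREFIX = 'vault://'

// Classify a file reference as a vault URL or an external web URL.
function classifyFileUrl(fileUrl: string): FileUrlKind {
  if (fileUrl.startsWith(VAULT_PREFIX) && fileUrl.length > VAULT_PREFIX.length) {
    return 'vault'
  }
  let parsed: URL
  try {
    parsed = new URL(fileUrl)
  }
  catch {
    throw new Error(`Not a valid file URL: ${fileUrl}`)
  }
  // Accepting http as well as https is an assumption of this sketch.
  if (parsed.protocol === 'https:' || parsed.protocol === 'http:') {
    return 'external'
  }
  throw new Error(`Unsupported URL scheme: ${parsed.protocol}`)
}
```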

Build & Development

# Install dependencies
pnpm install

# Build the SDK
pnpm build

# Run tests
pnpm test

# Lint code
pnpm lint

License

UNLICENSED