@gabsterai/ocr-client

v1.2.1

Published

13 days ago

TypeScript/JavaScript client SDK for OCR Service API

gabsterai

ocr client sdk api document-processing deepseek

@gabsterai/ocr-client

Simple, powerful OCR SDK for document processing. Extract text from images and PDFs with one line of code.

Installation

npm install @gabsterai/ocr-client

Quick Start

import { ocr } from '@gabsterai/ocr-client';

// Extract text from a document
const result = await ocr('document.pdf');
console.log(result.text);

That's it! No further configuration is needed once the OCR_API_URL and OCR_API_KEY environment variables are set (see Configuration).

Usage Examples

Simple Text Extraction

import { ocr, extractText } from '@gabsterai/ocr-client';

// Get full result object
const result = await ocr('invoice.pdf');
console.log(result.text);

// Or just get the text
const text = await extractText('invoice.pdf');

Multiple Files

const result = await ocr(['page1.jpg', 'page2.jpg', 'page3.jpg']);
console.log(result.text); // Combined text from all pages

Grouped Files (e.g., School Enrollment)

const result = await ocr({
  birth_certificate: 'documents/birth.pdf',
  photo: 'documents/student.jpg',
  grades: ['transcripts/grade9.pdf', 'transcripts/grade10.pdf'],
});

// Access by group
console.log(result.groups.birth_certificate);
console.log(result.groups.grades);

// Or iterate
for (const [name, text] of Object.entries(result.groups)) {
  console.log(`${name}:`, text);
}
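When every group is mandatory, as in an enrollment workflow, it can help to confirm that each expected group came back with text before using it. The helper below is a minimal sketch and not part of the SDK: `findMissingGroups` is a hypothetical name, and it assumes only that `result.groups` has the `Record<string, string>` shape listed under Types and that an empty or whitespace-only string means no text was extracted.

```typescript
// Hypothetical helper (not part of @gabsterai/ocr-client).
// Returns the names of required groups that are absent or contain no text.
function findMissingGroups(
  groups: Record<string, string>,
  required: readonly string[],
): string[] {
  return required.filter((groupName) => {
    const text = groups[groupName];
    return text === undefined || text.trim() === '';
  });
}

// Stand-in for `result.groups` returned by a grouped ocr() call.
const sampleGroups: Record<string, string> = {
  birth_certificate: 'Name: Jane Doe',
  grades: '   ',
};

const missing = findMissingGroups(sampleGroups, ['birth_certificate', 'photo', 'grades']);
console.log(missing); // [ 'photo', 'grades' ]
```

In real use, `sampleGroups` would be replaced by `result.groups`.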

Processing Modes

// Markdown format (default) - preserves structure
const report = await ocr('report.pdf', { mode: 'markdown' });

// Table extraction - optimized for spreadsheets
const table = await ocr('spreadsheet.png', { mode: 'table' });

// Receipt parsing - extracts structured data
const receipt = await ocr('receipt.jpg', { mode: 'receipt' });

// Handwriting recognition
const notes = await ocr('notes.jpg', { mode: 'handwriting' });

// Free-form text
const freeText = await ocr('document.pdf', { mode: 'free' });

With Buffer (In-Memory Files)

import { readFile } from 'node:fs/promises';
import { ocr } from '@gabsterai/ocr-client';

const fileBuffer = await readFile('document.pdf');
const result = await ocr(fileBuffer);

// Or pass a filename alongside the buffer for better MIME type detection
const named = await ocr({ file: fileBuffer, filename: 'document.pdf' });

Progress Tracking

const result = await ocr('large-document.pdf', {
  onProgress: (progress, job) => {
    console.log(`Progress: ${progress}%`);
  }
});
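For long jobs, logging on every callback can be noisy. The sketch below wraps a reporter so it only fires when progress has advanced by at least a given step, plus once on completion. `makeStepReporter` is a hypothetical helper, not part of the SDK, and it assumes `progress` is a percentage from 0 to 100, as the example above suggests; how often the SDK invokes the callback is not documented here.

```typescript
// Hypothetical helper (not part of the SDK): forwards a progress value only
// when it has advanced by at least `step` points since the last report,
// and always forwards the first value at or above 100.
function makeStepReporter(
  step: number,
  report: (progress: number) => void,
): (progress: number) => void {
  let lastReported = -Infinity;
  return (progress: number): void => {
    const reachedEnd = progress >= 100 && lastReported < 100;
    if (reachedEnd || progress - lastReported >= step) {
      lastReported = progress;
      report(progress);
    }
  };
}

// Simulated callback sequence standing in for the SDK's onProgress calls.
const seen: number[] = [];
const onProgress = makeStepReporter(10, (p) => seen.push(p));
for (const p of [0, 3, 10, 12, 25, 99, 100, 100]) onProgress(p);
console.log(seen); // [ 0, 10, 25, 99, 100 ]
```

Because the returned function takes only the progress value, it should be usable directly as `onProgress` in the options object, for example `{ onProgress: makeStepReporter(10, (p) => console.log(`Progress: ${p}%`)) }`.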

Async Processing with Webhooks

import { OCRClient } from '@gabsterai/ocr-client';

const client = new OCRClient();

// Submit without waiting
const jobId = await client.submit('document.pdf', {
  webhook: 'https://your-server.com/webhooks/ocr'
});

console.log('Job submitted:', jobId);
// Your webhook will receive the result when complete
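On the receiving side, the webhook endpoint has to parse and validate whatever the service posts. The README does not document the payload, so the sketch below rests on a loud assumption: that the body is JSON carrying at least a `jobId` and a `status`, with an optional `text`. The `WebhookPayload` shape and both function names are hypothetical; check them against the payload your OCR service actually sends.

```typescript
// ASSUMPTION: hypothetical payload shape, not taken from the SDK documentation.
interface WebhookPayload {
  jobId: string;
  status: string;
  text?: string;
}

// Parses a raw request body; returns null when it does not match the assumed shape.
function parseWebhookBody(rawBody: string): WebhookPayload | null {
  let data: unknown;
  try {
    data = JSON.parse(rawBody);
  } catch {
    return null; // body is not valid JSON
  }
  if (typeof data !== 'object' || data === null) return null;

  const { jobId, status, text } = data as {
    jobId?: unknown;
    status?: unknown;
    text?: unknown;
  };
  if (typeof jobId !== 'string' || typeof status !== 'string') return null;

  const payload: WebhookPayload = { jobId, status };
  if (typeof text === 'string') payload.text = text;
  return payload;
}

// Framework-agnostic handler: maps a raw body to the HTTP status code to return.
function handleWebhook(rawBody: string): number {
  const payload = parseWebhookBody(rawBody);
  if (payload === null) return 400; // reject malformed deliveries
  console.log(`Job ${payload.jobId} reported status ${payload.status}`);
  return 200;
}
```

Keeping the handler independent of any HTTP framework means it can be called from whichever server receives the POST at the `webhook` URL.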

Configuration

Environment Variables (Recommended)

# Required
export OCR_API_URL="https://your-ocr-api.com"  # Your OCR API endpoint
export OCR_API_KEY="your-api-key"              # Your API key

# Optional
export OCR_MODE="markdown"                      # Default mode
export OCR_TIMEOUT="120000"                     # Timeout in ms

Programmatic Configuration

import { configure } from '@gabsterai/ocr-client';

// Configure once at app startup
configure({
  apiUrl: 'https://your-ocr-api.com',
  apiKey: 'your-api-key',
  defaultMode: 'markdown',
  timeout: 120000,
});

Per-Request Options

const result = await ocr('document.pdf', {
  apiUrl: 'http://custom-server:8080',
  apiKey: 'different-key',
  mode: 'table',
  timeout: 60000,
});
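The three configuration layers above suggest a precedence order: per-request options over `configure()` defaults over environment variables. The README does not state this order explicitly, so the sketch below is an assumption about how such layering is commonly resolved, not the SDK's actual implementation. `ConfigLayer`, `fromEnv`, and `resolveConfig` are hypothetical names; the environment variable names are the ones documented above.

```typescript
// Hypothetical config shape used only for this illustration.
interface ConfigLayer {
  apiUrl?: string;
  apiKey?: string;
  mode?: string;
  timeout?: number;
}

// Reads the documented environment variables into a config layer.
// Takes the environment as a parameter so it can be exercised without process.env.
function fromEnv(env: Record<string, string | undefined>): ConfigLayer {
  const timeout = Number(env.OCR_TIMEOUT);
  return {
    apiUrl: env.OCR_API_URL,
    apiKey: env.OCR_API_KEY,
    mode: env.OCR_MODE,
    // Ignore a missing, empty, or non-numeric OCR_TIMEOUT.
    timeout: env.OCR_TIMEOUT && Number.isFinite(timeout) ? timeout : undefined,
  };
}

// ASSUMED precedence: per-request options, then configured defaults, then environment.
function resolveConfig(
  env: ConfigLayer,
  defaults: ConfigLayer,
  request: ConfigLayer,
): ConfigLayer {
  return {
    apiUrl: request.apiUrl ?? defaults.apiUrl ?? env.apiUrl,
    apiKey: request.apiKey ?? defaults.apiKey ?? env.apiKey,
    mode: request.mode ?? defaults.mode ?? env.mode,
    timeout: request.timeout ?? defaults.timeout ?? env.timeout,
  };
}

const resolved = resolveConfig(
  fromEnv({
    OCR_API_URL: 'https://your-ocr-api.com',
    OCR_API_KEY: 'your-api-key',
    OCR_TIMEOUT: '120000',
  }),
  { mode: 'markdown' },
  { apiKey: 'different-key', timeout: 60000 },
);
console.log(resolved);
```

Here the URL comes from the environment, the mode from the configured defaults, and the key and timeout from the per-request layer.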

API Reference

Functions

| Function | Description |
|----------|-------------|
| ocr(input, options?) | Process files and return full result |
| extractText(input, options?) | Process files and return just the text |
| isAvailable(options?) | Check if OCR service is available |
| health(options?) | Get service health status |
| configure(options) | Set SDK defaults |

OCRClient Class

For advanced use cases:

import { OCRClient } from '@gabsterai/ocr-client';

const client = new OCRClient({ apiKey: 'your-key' });

// Methods
await client.process(file, options);        // Process single file
await client.processMany(files, options);   // Process multiple files
await client.processGrouped(groups, options); // Process grouped files
await client.submit(input, options);        // Submit without waiting
await client.getJob(jobId);                 // Get job status
await client.waitFor(jobId, options);       // Wait for completion
await client.listJobs(options);             // List recent jobs
await client.deleteJob(jobId);              // Delete a job
await client.health();                      // Check health
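The `submit` and `getJob` methods can be combined into a manual polling loop when finer control is needed than `waitFor` offers. The sketch below shows one way to do that; it is not how `waitFor` is implemented. To stay self-contained it takes the job lookup as a function argument, and it assumes a job object exposes a `status` string. Only `'completed'` and `'failed'` are named in this README, so those two are treated as terminal. `JobLike`, `isTerminal`, and `pollUntilDone` are hypothetical names.

```typescript
// ASSUMPTION: a job exposes a `status` string; other fields are unknown here.
interface JobLike {
  status: string;
}

// Only the two statuses named in the README are treated as terminal.
function isTerminal(status: string): boolean {
  return status === 'completed' || status === 'failed';
}

// Calls `getJob` repeatedly until the job is terminal or attempts run out.
async function pollUntilDone<J extends JobLike>(
  getJob: () => Promise<J>,
  intervalMs: number,
  maxAttempts: number,
): Promise<J> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await getJob();
    if (isTerminal(job.status)) return job;
    // Wait before asking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job did not finish after ${maxAttempts} attempts`);
}
```

With a real client this might be called as `await pollUntilDone(() => client.getJob(jobId), 2000, 30)`, assuming `getJob` resolves to an object with a `status` field. For most cases `client.waitFor(jobId)` is likely the simpler choice.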

Types

type OCRMode = 'free' | 'markdown' | 'table' | 'receipt' | 'handwriting';

interface OCRResult {
  text: string;                              // Combined extracted text
  jobId: string;                             // Job ID for reference
  status: JobStatus;                         // 'completed' | 'failed' | etc.
  groups: Record<string, string>;            // Results by group name
  files: Array<{ filename: string; text: string }>;  // Per-file results
  job: OCRJob;                               // Full job object
}

interface OCROptions {
  mode?: OCRMode;                            // Processing mode
  webhook?: string;                          // Webhook URL
  apiUrl?: string;                           // Custom API URL
  apiKey?: string;                           // Custom API key
  timeout?: number;                          // Timeout in ms
  onProgress?: (progress: number, job: OCRJob) => void;
}
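Because `OCRMode` is a closed union, values arriving as plain strings (for example from `OCR_MODE` or user input) can be validated before they are passed as `mode`. The following is a minimal sketch using a type guard; `OCR_MODES`, `isOCRMode`, and `parseMode` are hypothetical helpers, not SDK exports. The `'markdown'` fallback follows the README's note that markdown is the default mode.

```typescript
// Local copy of the union from the Types section, so the sketch is self-contained.
type OCRMode = 'free' | 'markdown' | 'table' | 'receipt' | 'handwriting';

const OCR_MODES: readonly OCRMode[] = ['free', 'markdown', 'table', 'receipt', 'handwriting'];

// Type guard: narrows an arbitrary value to OCRMode.
function isOCRMode(value: unknown): value is OCRMode {
  return typeof value === 'string' && (OCR_MODES as readonly string[]).includes(value);
}

// Returns the value when it is a known mode, otherwise the fallback.
function parseMode(value: string | undefined, fallback: OCRMode = 'markdown'): OCRMode {
  return isOCRMode(value) ? value : fallback;
}

console.log(parseMode('table'));   // table
console.log(parseMode('invoice')); // markdown
```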

Error Handling

import { ocr, OCRError } from '@gabsterai/ocr-client';

try {
  const result = await ocr('document.pdf');
  console.log(result.text);
} catch (error) {
  if (error instanceof OCRError) {
    console.error('OCR Error:', error.message);
    console.error('Code:', error.code);  // 'FILE_NOT_FOUND', 'API_ERROR', 'TIMEOUT', etc.
  } else {
    throw error;  // Not an SDK error; let it propagate
  }
}
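Error codes make it possible to retry selectively. The sketch below retries only when an error carries the code `'TIMEOUT'` and rethrows everything else immediately. Treating `'TIMEOUT'` as retryable is a policy choice made for this illustration, not documented SDK behavior, and `hasCode`, `isRetryable`, and `withRetry` are hypothetical helpers. The error check is duck-typed on a string `code` property so the sketch does not depend on the `OCRError` class itself.

```typescript
// Duck-typed check: does this value carry a string `code`, as OCRError does above?
function hasCode(error: unknown): error is { code: string } {
  return (
    typeof error === 'object' &&
    error !== null &&
    typeof (error as { code?: unknown }).code === 'string'
  );
}

// ASSUMPTION: only timeouts are worth retrying in this sketch.
function isRetryable(error: unknown): boolean {
  return hasCode(error) && error.code === 'TIMEOUT';
}

// Runs `task` up to `maxAttempts` times, retrying only on retryable errors.
async function withRetry<T>(task: () => Promise<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown = new Error('withRetry: maxAttempts must be at least 1');
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      if (!isRetryable(error)) throw error; // e.g. FILE_NOT_FOUND: retrying will not help
      lastError = error;
    }
  }
  throw lastError;
}
```

A call might look like `const result = await withRetry(() => ocr('document.pdf'));`. A production version would probably add a delay between attempts.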

License

MIT
