@nakamura196/ndl-koten-ocr-web (v1.0.7)

Web-based OCR library for ancient Japanese text recognition using ONNX models. This is a web port of NDL Koten OCR.
Features
- 🎯 Automatic layout detection for historical Japanese documents
- 📝 High-accuracy text recognition for classical Japanese characters
- 🚀 Runs entirely in the browser using WebAssembly
- 📦 Includes pre-trained ONNX models
- 🔧 Simple API with TypeScript support
Installation
npm install @nakamura196/ndl-koten-ocr-web

The package includes all necessary ONNX models (approximately 78 MB), so installation may take a moment.
Model Files
Model files are included in the package and loaded automatically from node_modules. No manual setup required!
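If your bundler or host does not serve files out of node_modules directly, one option is to copy the bundled models into your public directory at build time. A minimal sketch, assuming a Node build script; the helper name and destination path are illustrative, and the source path follows the package layout shown later in this README:

```typescript
// Build-time sketch: copy the bundled models into a directory your
// server actually exposes. The helper name and destination are
// assumptions, not part of the library API.
import { cpSync } from 'node:fs';

function copyModels(from: string, to: string): void {
  // Recursively copy the whole models directory.
  cpSync(from, to, { recursive: true });
}

// e.g. copyModels('node_modules/@nakamura196/ndl-koten-ocr-web/models', 'public/models');
```

After copying, point the engine at the copied directory with the `modelPath` option described under Custom Model Path below.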
Quick Start
import { NDLKotenOCR } from '@nakamura196/ndl-koten-ocr-web';
// Initialize OCR engine
const ocr = new NDLKotenOCR();
await ocr.init(); // Simple initialization with defaults
// Process an image
const image = document.getElementById('myImage') as HTMLImageElement;
const result = await ocr.process(image);
// Access results
console.log(result.text); // Extracted text
console.log(result.json); // Structured data with bounding boxes
console.log(result.xml); // XML format output

Advanced Usage
Custom Model Path
If you're serving models from a different location:
await ocr.init({
modelPath: '/static/models/', // Custom model directory
progressCallback: (progress, message) => {
console.log(`${progress}% - ${message}`);
}
});

Model Size
Currently includes small models only:
// Small models (default)
await ocr.init({ modelSize: 'small' });

Note: Large models are defined in the code but not included in the current package.
Processing Options
const result = await ocr.process(image, {
imageName: 'page_001', // Optional: name for the image
onProgress: (progress, message) => { // Optional: progress callback during processing
console.log(`Processing: ${progress * 100}% - ${message}`);
}
});

Manual Initialization (Advanced)
For complete control over model loading:
await ocr.initialize(
'/models/rtmdet-s-1280x1280.onnx', // Layout detection model
{}, // Layout config
'/models/ndl.yaml', // Layout config file
'/models/parseq-ndl-32x384-tiny-10.onnx', // Text recognition model
{}, // Recognition config
'/models/NDLmoji.yaml', // Character list config
(progress, message) => { // Progress callback
console.log(`${progress}% - ${message}`);
}
);

Integration Examples
Next.js / Vercel
// components/OCRComponent.tsx
import { NDLKotenOCR } from '@nakamura196/ndl-koten-ocr-web';
const ocr = new NDLKotenOCR();
// Initialize with default settings (models from node_modules)
await ocr.init();
// Or specify custom path if needed
await ocr.init({
modelPath: '/node_modules/@nakamura196/ndl-koten-ocr-web/models/'
});

React Component
import { useState, useEffect } from 'react';
import { NDLKotenOCR } from '@nakamura196/ndl-koten-ocr-web';
function OCRComponent() {
const [ocr, setOcr] = useState<NDLKotenOCR | null>(null);
const [isLoading, setIsLoading] = useState(true);
useEffect(() => {
const initOCR = async () => {
const ocrInstance = new NDLKotenOCR();
await ocrInstance.init({
progressCallback: (progress, message) => {
console.log(`Loading: ${progress}% - ${message}`);
}
});
setOcr(ocrInstance);
setIsLoading(false);
};
initOCR();
}, []);
const processImage = async (file: File) => {
if (!ocr) return;
const img = new Image();
img.src = URL.createObjectURL(file);
await img.decode();
const result = await ocr.process(img);
console.log('OCR Result:', result.text);
};
return (
<div>
{isLoading ? (
<p>Loading OCR engine...</p>
) : (
<input type="file" onChange={(e) => {
if (e.target.files?.[0]) {
processImage(e.target.files[0]);
}
}} />
)}
</div>
);
}

Output Formats
The OCR results are available in multiple formats:
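The structured output can also be post-processed in application code. As an illustrative sketch (the box shape mirrors the JSON Format example below; the helper name and the column heuristic are assumptions, not library API), boxes can be joined into reading order for vertical classical Japanese text, which runs top-to-bottom within a column and right-to-left across columns:

```typescript
// Sketch: flatten OCR boxes into reading order for vertical classical
// Japanese (columns right-to-left, characters top-to-bottom within a
// column). The box shape mirrors the JSON Format example in this
// README; toReadingOrder itself is illustrative, not library API.
interface OCRBox {
  x: number;
  y: number;
  width: number;
  height: number;
  text: string;
}

function toReadingOrder(boxes: OCRBox[]): string {
  return [...boxes]
    // Larger x (further right) first, then smaller y (higher) first.
    .sort((a, b) => (b.x - a.x) || (a.y - b.y))
    .map((box) => box.text)
    .join('');
}
```

Real layouts may need a more robust column-grouping heuristic than a plain sort on x.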
Text Format
result.text // Plain text output

JSON Format
result.json // Structured data with coordinates
// {
// document: {
// image: {
// text: [
// { x: 100, y: 200, width: 50, height: 30, text: "文字" }
// ]
// }
// }
// }

XML Format
result.xml // XML formatted output

Advanced Features
TEI/XML Conversion
Convert OCR results to TEI (Text Encoding Initiative) format:
import { NDLKotenOCR, TEIConverter, TEIConversionData } from '@nakamura196/ndl-koten-ocr-web';
const ocr = new NDLKotenOCR();
await ocr.init();
// Process multiple images
const results = [];
for (const image of images) {
const result = await ocr.process(image);
results.push({
...result,
imageName: image.name,
imageWidth: image.width,
imageHeight: image.height
});
}
// Convert to TEI/XML
const teiConverter = new TEIConverter();
const teiData: TEIConversionData = {
title: 'My Document',
sourceUrl: 'https://example.com/manifest.json',
results: results
};
const teiXml = teiConverter.convertOCRResults(teiData);
console.log(teiXml);

IIIF Manifest Processing
Process images directly from IIIF manifests:
import { NDLKotenOCR, IIIFProcessor } from '@nakamura196/ndl-koten-ocr-web';
// Initialize OCR engine
const ocr = new NDLKotenOCR();
await ocr.init();
// Create IIIF processor
const iiifProcessor = new IIIFProcessor(ocr);
// Process a IIIF manifest
const manifestUrl = 'https://example.com/manifest.json';
const { results, teiXml, manifest } = await iiifProcessor.processManifestUrl(
manifestUrl,
{
maxImages: 10, // Process only first 10 images
onImageProgress: (index, progress, message) => {
console.log(`Image ${index + 1}: ${progress}% - ${message}`);
}
}
);
// results: Array of OCR results for each image
// teiXml: Complete TEI/XML document
// manifest: The parsed IIIF manifest
console.log('Processed', results.length, 'images');
console.log('TEI/XML:', teiXml);

Processing Local Files with TEI Export
import { NDLKotenOCR, TEIConverter } from '@nakamura196/ndl-koten-ocr-web';
const ocr = new NDLKotenOCR();
await ocr.init();
// Process files and generate TEI
async function processFilesWithTEI(files: File[]) {
const results = [];
for (const file of files) {
const img = new Image();
img.src = URL.createObjectURL(file);
await img.decode();
const result = await ocr.process(img, {
imageName: file.name
});
results.push({
...result,
imageName: file.name,
imageWidth: img.naturalWidth,
imageHeight: img.naturalHeight
});
}
// Convert to TEI/XML
const teiConverter = new TEIConverter();
const teiXml = teiConverter.convertOCRResults({
title: 'Batch OCR Results',
results: results
});
// Download as file
const blob = new Blob([teiXml], { type: 'text/xml' });
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = 'ocr-results.xml';
a.click();
}

Model Information
This package includes the following pre-trained models:
- Layout Detection: RTMDet-S (1280x1280) - Detects text regions in document images
- Text Recognition: PARSEQ-NDL (32x384-tiny-10) - Recognizes classical Japanese characters
- Character Set: NDLmoji - Comprehensive classical Japanese character mappings
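For reference, the bundled file names (as they appear in the Manual Initialization example above) can be gathered in one place; the constant name here is illustrative, not part of the library API:

```typescript
// File names taken from the Manual Initialization example in this
// README; MODEL_FILES itself is an illustrative constant, not
// something the library exports.
const MODEL_FILES = {
  layoutModel: 'rtmdet-s-1280x1280.onnx',
  layoutConfig: 'ndl.yaml',
  recognitionModel: 'parseq-ndl-32x384-tiny-10.onnx',
  characterList: 'NDLmoji.yaml',
} as const;
```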
Models are based on the work by:
- National Diet Library (NDL)
- Yuta Hashimoto (@yuta1984)
Browser Compatibility
- Chrome 90+
- Firefox 89+
- Safari 15.4+
- Edge 90+
Requires WebAssembly and Web Workers support.
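Before initializing the engine, you can feature-detect these requirements and show a fallback UI when they are missing. A minimal sketch; the function names are assumptions, not part of the library API:

```typescript
// Illustrative feature checks for the requirements listed above.
function hasWasm(): boolean {
  return typeof WebAssembly === 'object' &&
    typeof WebAssembly.instantiate === 'function';
}

function hasWorkers(): boolean {
  return typeof Worker !== 'undefined';
}

function canRunOCR(): boolean {
  return hasWasm() && hasWorkers();
}
```

Call `canRunOCR()` before `ocr.init()` and render a fallback message when it returns false.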
Development
Building from Source
git clone https://github.com/yuta1984/ndlkotenocr-lite-web
cd ndlkotenocr-lite-web/packages/ndl-koten-ocr-core
npm install
npm run build

Running Tests
npm test

Troubleshooting
Models Not Loading
Ensure your web server is configured to serve .onnx files with the correct MIME type: application/octet-stream.

CORS Issues
If serving models from a CDN, ensure CORS headers are properly configured.
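If you write your own static handler, the MIME type can be set explicitly per file extension. A hedged sketch: only the .onnx mapping is stated above; the .yaml mapping and the helper name are assumptions:

```typescript
// Illustrative helper for a custom static handler: pick a Content-Type
// for model assets. Only the .onnx mapping is stated in this README;
// the .yaml mapping and contentTypeFor itself are assumptions.
function contentTypeFor(filePath: string): string {
  if (filePath.endsWith('.onnx')) return 'application/octet-stream';
  if (filePath.endsWith('.yaml')) return 'text/yaml';
  return 'application/octet-stream';
}
```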
Memory Issues
For large images, consider resizing before processing:
const MAX_DIM = 2000;
const scale = Math.min(1, MAX_DIM / Math.max(image.width, image.height));
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d')!;
canvas.width = Math.round(image.width * scale);
canvas.height = Math.round(image.height * scale);
ctx.drawImage(image, 0, 0, canvas.width, canvas.height);
const result = await ocr.process(canvas);

License
MIT
Credits
This is a web port of NDL Koten OCR developed by:
- Original implementation: National Diet Library (NDL Lab)
- Web port: Yuta Hashimoto (@yuta1984)
- npm package: Satoru Nakamura (@nakamura196)
Support
For issues and questions, please use the issue tracker of the GitHub repository listed under Development above.