@nakamura196/ndl-koten-ocr-web (v1.0.7)

Web-based OCR library for ancient Japanese text recognition using ONNX models. This is a web port of NDL Koten OCR.
Features
- 🎯 Automatic layout detection for historical Japanese documents
- 📝 High-accuracy text recognition for classical Japanese characters
- 🚀 Runs entirely in the browser using WebAssembly
- 📦 Includes pre-trained ONNX models
- 🔧 Simple API with TypeScript support
Installation
npm install @nakamura196/ndl-koten-ocr-web

The package includes all necessary ONNX models (approximately 78 MB), so installation may take a moment.
Model Files
Model files are included in the package and loaded automatically from node_modules. No manual setup required!
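If your bundler or host does not serve files out of node_modules directly, one option is to copy the bundled models into your public directory at build time. A minimal sketch, assuming a Node build script; the helper name and destination path are illustrative, and the source path follows the package layout shown later in this README:

```typescript
// Build-time sketch: copy the bundled models into a directory your
// server actually exposes. The helper name and destination are
// assumptions, not part of the library API.
import { cpSync } from 'node:fs';

function copyModels(from: string, to: string): void {
  // Recursively copy the whole models directory.
  cpSync(from, to, { recursive: true });
}

// e.g. copyModels('node_modules/@nakamura196/ndl-koten-ocr-web/models', 'public/models');
```

After copying, point the engine at the copied directory with the `modelPath` option described under Custom Model Path below.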
Quick Start
import { NDLKotenOCR } from '@nakamura196/ndl-koten-ocr-web';
// Initialize OCR engine
const ocr = new NDLKotenOCR();
await ocr.init(); // Simple initialization with defaults
// Process an image
const image = document.getElementById('myImage') as HTMLImageElement;
const result = await ocr.process(image);
// Access results
console.log(result.text); // Extracted text
console.log(result.json); // Structured data with bounding boxes
console.log(result.xml); // XML format output

Advanced Usage
Custom Model Path
If you're serving models from a different location:
await ocr.init({
modelPath: '/static/models/', // Custom model directory
progressCallback: (progress, message) => {
console.log(`${progress}% - ${message}`);
}
});

Model Size
Currently includes small models only:
// Small models (default)
await ocr.init({ modelSize: 'small' });

Note: Large models are defined in the code but not included in the current package.
Processing Options
const result = await ocr.process(image, {
imageName: 'page_001', // Optional: name for the image
onProgress: (progress, message) => { // Optional: progress callback during processing
console.log(`Processing: ${progress * 100}% - ${message}`);
}
});

Manual Initialization (Advanced)
For complete control over model loading:
await ocr.initialize(
'/models/rtmdet-s-1280x1280.onnx', // Layout detection model
{}, // Layout config
'/models/ndl.yaml', // Layout config file
'/models/parseq-ndl-32x384-tiny-10.onnx', // Text recognition model
{}, // Recognition config
'/models/NDLmoji.yaml', // Character list config
(progress, message) => { // Progress callback
console.log(`${progress}% - ${message}`);
}
);

Integration Examples
Next.js / Vercel
// components/OCRComponent.tsx
import { NDLKotenOCR } from '@nakamura196/ndl-koten-ocr-web';
const ocr = new NDLKotenOCR();
// Initialize with default settings (models from node_modules)
await ocr.init();
// Or specify custom path if needed
await ocr.init({
modelPath: '/node_modules/@nakamura196/ndl-koten-ocr-web/models/'
});

React Component
import { useState, useEffect } from 'react';
import { NDLKotenOCR } from '@nakamura196/ndl-koten-ocr-web';
function OCRComponent() {
const [ocr, setOcr] = useState<NDLKotenOCR | null>(null);
const [isLoading, setIsLoading] = useState(true);
useEffect(() => {
const initOCR = async () => {
const ocrInstance = new NDLKotenOCR();
await ocrInstance.init({
progressCallback: (progress, message) => {
console.log(`Loading: ${progress}% - ${message}`);
}
});
setOcr(ocrInstance);
setIsLoading(false);
};
initOCR();
}, []);
const processImage = async (file: File) => {
if (!ocr) return;
const img = new Image();
img.src = URL.createObjectURL(file);
await img.decode();
const result = await ocr.process(img);
console.log('OCR Result:', result.text);
};
return (
<div>
{isLoading ? (
<p>Loading OCR engine...</p>
) : (
<input type="file" onChange={(e) => {
if (e.target.files?.[0]) {
processImage(e.target.files[0]);
}
}} />
)}
</div>
);
}

Output Formats
The OCR results are available in multiple formats:
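The structured output can also be post-processed in application code. As an illustrative sketch (the box shape mirrors the JSON Format example below; the helper name and the column heuristic are assumptions, not library API), boxes can be joined into reading order for vertical classical Japanese text, which runs top-to-bottom within a column and right-to-left across columns:

```typescript
// Sketch: flatten OCR boxes into reading order for vertical classical
// Japanese (columns right-to-left, characters top-to-bottom within a
// column). The box shape mirrors the JSON Format example in this
// README; toReadingOrder itself is illustrative, not library API.
interface OCRBox {
  x: number;
  y: number;
  width: number;
  height: number;
  text: string;
}

function toReadingOrder(boxes: OCRBox[]): string {
  return [...boxes]
    // Larger x (further right) first, then smaller y (higher) first.
    .sort((a, b) => (b.x - a.x) || (a.y - b.y))
    .map((box) => box.text)
    .join('');
}
```

Real layouts may need a more robust column-grouping heuristic than a plain sort on x.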
Text Format
result.text // Plain text output

JSON Format
result.json // Structured data with coordinates
// {
// document: {
// image: {
// text: [
// { x: 100, y: 200, width: 50, height: 30, text: "文字" }
// ]
// }
// }
// }

XML Format
result.xml // XML formatted output

Advanced Features
TEI/XML Conversion
Convert OCR results to TEI (Text Encoding Initiative) format:
import { NDLKotenOCR, TEIConverter, TEIConversionData } from '@nakamura196/ndl-koten-ocr-web';
const ocr = new NDLKotenOCR();
await ocr.init();
// Process multiple images
const results = [];
for (const image of images) {
const result = await ocr.process(image);
results.push({
...result,
imageName: image.name,
imageWidth: image.width,
imageHeight: image.height
});
}
// Convert to TEI/XML
const teiConverter = new TEIConverter();
const teiData: TEIConversionData = {
title: 'My Document',
sourceUrl: 'https://example.com/manifest.json',
results: results
};
const teiXml = teiConverter.convertOCRResults(teiData);
console.log(teiXml);

IIIF Manifest Processing
Process images directly from IIIF manifests:
import { NDLKotenOCR, IIIFProcessor } from '@nakamura196/ndl-koten-ocr-web';
// Initialize OCR engine
const ocr = new NDLKotenOCR();
await ocr.init();
// Create IIIF processor
const iiifProcessor = new IIIFProcessor(ocr);
// Process a IIIF manifest
const manifestUrl = 'https://example.com/manifest.json';
const { results, teiXml, manifest } = await iiifProcessor.processManifestUrl(
manifestUrl,
{
maxImages: 10, // Process only first 10 images
onImageProgress: (index, progress, message) => {
console.log(`Image ${index + 1}: ${progress}% - ${message}`);
}
}
);
// results: Array of OCR results for each image
// teiXml: Complete TEI/XML document
// manifest: The parsed IIIF manifest
console.log('Processed', results.length, 'images');
console.log('TEI/XML:', teiXml);

Processing Local Files with TEI Export
import { NDLKotenOCR, TEIConverter } from '@nakamura196/ndl-koten-ocr-web';
const ocr = new NDLKotenOCR();
await ocr.init();
// Process files and generate TEI
async function processFilesWithTEI(files: File[]) {
const results = [];
for (const file of files) {
const img = new Image();
img.src = URL.createObjectURL(file);
await img.decode();
const result = await ocr.process(img, {
imageName: file.name
});
results.push({
...result,
imageName: file.name,
imageWidth: img.naturalWidth,
imageHeight: img.naturalHeight
});
}
// Convert to TEI/XML
const teiConverter = new TEIConverter();
const teiXml = teiConverter.convertOCRResults({
title: 'Batch OCR Results',
results: results
});
// Download as file
const blob = new Blob([teiXml], { type: 'text/xml' });
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = 'ocr-results.xml';
a.click();
}

Model Information
This package includes the following pre-trained models:
- Layout Detection: RTMDet-S (1280x1280) - Detects text regions in document images
- Text Recognition: PARSEQ-NDL (32x384-tiny-10) - Recognizes classical Japanese characters
- Character Set: NDLmoji - Comprehensive classical Japanese character mappings
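For reference, the bundled file names (as they appear in the Manual Initialization example above) can be gathered in one place; the constant name here is illustrative, not part of the library API:

```typescript
// File names taken from the Manual Initialization example in this
// README; MODEL_FILES itself is an illustrative constant, not
// something the library exports.
const MODEL_FILES = {
  layoutModel: 'rtmdet-s-1280x1280.onnx',
  layoutConfig: 'ndl.yaml',
  recognitionModel: 'parseq-ndl-32x384-tiny-10.onnx',
  characterList: 'NDLmoji.yaml',
} as const;
```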
Models are based on the work by:
- National Diet Library (NDL)
- Yuta Hashimoto (@yuta1984)
Browser Compatibility
- Chrome 90+
- Firefox 89+
- Safari 15.4+
- Edge 90+
Requires WebAssembly and Web Workers support.
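Before initializing the engine, you can feature-detect these requirements and show a fallback UI when they are missing. A minimal sketch; the function names are assumptions, not part of the library API:

```typescript
// Illustrative feature checks for the requirements listed above.
function hasWasm(): boolean {
  return typeof WebAssembly === 'object' &&
    typeof WebAssembly.instantiate === 'function';
}

function hasWorkers(): boolean {
  return typeof Worker !== 'undefined';
}

function canRunOCR(): boolean {
  return hasWasm() && hasWorkers();
}
```

Call `canRunOCR()` before `ocr.init()` and render a fallback message when it returns false.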
Development
Building from Source
git clone https://github.com/yuta1984/ndlkotenocr-lite-web
cd ndlkotenocr-lite-web/packages/ndl-koten-ocr-core
npm install
npm run build

Running Tests
npm test

Troubleshooting
Models Not Loading
Ensure your web server is configured to serve .onnx files with the correct MIME type: application/octet-stream.

CORS Issues
If serving models from a CDN, ensure CORS headers are properly configured.
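If you write your own static handler, the MIME type can be set explicitly per file extension. A hedged sketch: only the .onnx mapping is stated above; the .yaml mapping and the helper name are assumptions:

```typescript
// Illustrative helper for a custom static handler: pick a Content-Type
// for model assets. Only the .onnx mapping is stated in this README;
// the .yaml mapping and contentTypeFor itself are assumptions.
function contentTypeFor(filePath: string): string {
  if (filePath.endsWith('.onnx')) return 'application/octet-stream';
  if (filePath.endsWith('.yaml')) return 'text/yaml';
  return 'application/octet-stream';
}
```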
Memory Issues
For large images, consider resizing before processing:
const MAX_DIM = 2000;
const scale = Math.min(1, MAX_DIM / Math.max(image.width, image.height));
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d')!;
canvas.width = Math.round(image.width * scale);
canvas.height = Math.round(image.height * scale);
ctx.drawImage(image, 0, 0, canvas.width, canvas.height);
const result = await ocr.process(canvas);

License
MIT
Credits
This is a web port of NDL Koten OCR developed by:
- Original implementation: National Diet Library (NDL Lab)
- Web port: Yuta Hashimoto (@yuta1984)
- npm package: Satoru Nakamura (@nakamura196)
Support
For issues and questions, please use the issue tracker of the GitHub repository listed under Development above.