@luii/node-tesseract-ocr

v2.4.0

node-tesseract-ocr

Native C++ addon for Node.js that exposes Tesseract OCR (libtesseract-dev) to JavaScript/TypeScript.

Features

  • Native bindings to Tesseract (prebuilds via pkg-prebuilds)
  • Access to Tesseract enums and configuration from TypeScript
  • Progress callback and multiple output formats
  • Lazy download of missing traineddata (configurable)

Prerequisites

  • Node.js
  • node-addon-api
  • C++ build toolchain (e.g. build-essential)
  • libtesseract-dev (exactly 5.5.2)
  • libleptonica-dev
  • Tesseract training data (eng, deu, ...), or let the library download it on demand

See Install

Install

sudo apt update
sudo apt install -y nodejs npm build-essential pkg-config libtesseract-dev libleptonica-dev tesseract-ocr-eng

Verify the required Tesseract version:

pkg-config --modversion tesseract
# expected: 5.5.2

If your distro ships another version, install/build tesseract 5.5.2 and ensure pkg-config resolves that installation.

git clone [email protected]:luii/node-tesseract-ocr.git
cd node-tesseract-ocr
npm install

Install additional training data

On Debian/Ubuntu, language data is provided as packages named tesseract-ocr-<lang>. Install additional languages as needed, for example:

sudo apt install -y tesseract-ocr-deu tesseract-ocr-eng tesseract-ocr-jpn

If you install traineddata files manually, make sure TESSDATA_PREFIX points to the directory that contains them (for example /usr/share/tessdata).

If traineddata is missing, this package will download it lazily during init by default. You can control this behavior via ensureTraineddata, cachePath, and dataPath.

Build

# Debug build (native addon + TS outputs)
npm run build:debug

# Release build
npm run build:release

Start

Set TESSDATA_PREFIX to your traineddata directory (usually /usr/share/tesseract-ocr/5/tessdata or /usr/share/tessdata).

env TESSDATA_PREFIX=/usr/share/tessdata node path/to/your/app.js

If you prefer automatic downloads, you can skip setting TESSDATA_PREFIX and let the default cache directory handle traineddata on first use.

Scripts

# Build native addon + TS outputs (debug / release)
npm run build:debug
npm run build:release

# Run the JS example (builds debug first)
npm run example:recognize

# Tests
npm run test:cpp
npm run test:js
npm run test:js:watch

Examples

Run Included Example

env TESSDATA_PREFIX=/usr/share/tessdata npm run example:recognize

Basic OCR (Local Traineddata)

You can find a similar example in the examples/ folder of the project.

import fs from "node:fs";
import Tesseract from "node-tesseract-ocr";

// Point Tesseract at locally installed traineddata.
process.env.TESSDATA_PREFIX = "/usr/share/tessdata/";

async function main() {
  const tesseract = new Tesseract();
  // Load the English model.
  await tesseract.init({
    langs: ["eng"],
  });

  // Hand the encoded image to the engine, then run OCR with a progress callback.
  const buffer = fs.readFileSync("example1.png");
  await tesseract.setImage(buffer);
  await tesseract.recognize((info) => {
    console.log(`Progress: ${info.percent}%`);
  });

  const text = await tesseract.getUTF8Text();
  console.log(text);

  // Release native resources.
  await tesseract.end();
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});

Lazy Traineddata Download (Default)

import fs from "node:fs";
import Tesseract from "node-tesseract-ocr";

async function main() {
  const tesseract = new Tesseract();
  await tesseract.init({
    langs: ["eng"],
    // Download missing traineddata on first use (this is the default).
    ensureTraineddata: true,
    // Directory Tesseract reads traineddata from.
    dataPath: "./tessdata-local",
  });

  const buffer = fs.readFileSync("example1.png");
  await tesseract.setImage(buffer);
  await tesseract.recognize();
  const text = await tesseract.getUTF8Text();
  console.log(text);

  // Release native resources.
  await tesseract.end();
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});

Public API

Enums

Language

Mapping of available Tesseract language codes. Most are 3-letter ISO 639-2/T style codes (e.g. eng, deu, jpn), with Tesseract-specific variants such as chi_sim, deu_latf, or osd. Each value must match the installed traineddata filename (without the .traineddata suffix). Pass one or more codes as an array via TesseractInitOptions.langs.

> [!IMPORTANT]
> If you join codes with a plus sign (e.g. deu+eng), Tesseract will look for multiple languages in the same image (here: German and English).
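To illustrate the plus-sign convention, the sketch below joins an array of codes (the shape accepted by `langs`) into the single string form that Tesseract uses. `joinLanguages` is a hypothetical helper, not part of this package, and whether the binding performs the join in exactly this way is an assumption.

```typescript
// Hypothetical helper (not part of the package): turns ["deu", "eng"] into "deu+eng".
function joinLanguages(langs: readonly string[]): string {
  // Trim whitespace, drop empty entries, and remove duplicates while keeping order.
  const cleaned = langs.map((code) => code.trim()).filter((code) => code.length > 0);
  const unique = Array.from(new Set(cleaned));
  if (unique.length === 0) {
    throw new Error("At least one language code is required");
  }
  return unique.join("+");
}

console.log(joinLanguages(["deu", "eng"])); // deu+eng
```

The same string form is what getInitLanguages() is documented to return.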

OcrEngineMode

Full list of OCR engine modes from Tesseract.

| Name | Value | Deprecated | Description |
| --- | --- | --- | --- |
| OEM_TESSERACT_ONLY | 0 | Yes | Run Tesseract only (fastest). |
| OEM_LSTM_ONLY | 1 | No | Run only the LSTM line recognizer. |
| OEM_TESSERACT_LSTM_COMBINED | 2 | Yes | Run LSTM with fallback to Tesseract. |
| OEM_DEFAULT | 3 | No | Infer engine mode from configs; default is Tesseract-only. |

PageSegmentationMode

Full list of page segmentation modes from Tesseract.

| Name | Value | Deprecated | Description |
| --- | --- | --- | --- |
| PSM_OSD_ONLY | 0 | No | Orientation and script detection only. |
| PSM_AUTO_OSD | 1 | No | Automatic page segmentation with OSD. |
| PSM_AUTO_ONLY | 2 | No | Automatic page segmentation, no OSD or OCR. |
| PSM_AUTO | 3 | No | Fully automatic page segmentation, no OSD. |
| PSM_SINGLE_COLUMN | 4 | No | Assume a single column of text of variable sizes. |
| PSM_SINGLE_BLOCK_VERT_TEXT | 5 | No | Assume a single uniform block of vertically aligned text. |
| PSM_SINGLE_BLOCK | 6 | No | Assume a single uniform block of text (default). |
| PSM_SINGLE_LINE | 7 | No | Treat the image as a single text line. |
| PSM_SINGLE_WORD | 8 | No | Treat the image as a single word. |
| PSM_CIRCLE_WORD | 9 | No | Treat the image as a single word in a circle. |
| PSM_SINGLE_CHAR | 10 | No | Treat the image as a single character. |
| PSM_SPARSE_TEXT | 11 | No | Find as much text as possible in no particular order. |
| PSM_SPARSE_TEXT_OSD | 12 | No | Sparse text with orientation and script detection. |
| PSM_RAW_LINE | 13 | No | Single text line, bypassing Tesseract-specific hacks. |

Types

TesseractInitOptions

| Field | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| langs | Language[] | Yes | undefined | Languages to load as an array. |
| oem | OcrEngineMode | Yes | undefined | OCR engine mode. |
| vars | Partial<Record<keyof ConfigurationVariables, ConfigurationVariables[keyof ConfigurationVariables]>> | Yes | undefined | Variables to set. |
| configs | Array<string> | Yes | undefined | Tesseract config files to apply. |
| setOnlyNonDebugParams | boolean | Yes | undefined | If true, only non-debug params are set. |
| ensureTraineddata | boolean | Yes | true | Download missing traineddata lazily. |
| cachePath | string | Yes | ~/.cache/node-tesseract-ocr/tessdata | Cache directory for downloads. |
| dataPath | string | Yes | TESSDATA_PREFIX or cachePath | Directory used by Tesseract for data. |
| progressCallback | (info: TrainingDataDownloadProgress) => void | Yes | undefined | Download progress callback. |
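The defaults for cachePath and dataPath in the table imply a fallback order. The sketch below spells that order out as a pure function. `resolveDataPath` and `PathOptions` are hypothetical names used for illustration; this is a reading of the documented defaults, not the package's actual implementation, and treating an empty TESSDATA_PREFIX as unset is an assumption.

```typescript
// Subset of TesseractInitOptions relevant to path resolution (local mirror for the sketch).
interface PathOptions {
  cachePath?: string;
  dataPath?: string;
}

// Hypothetical helper mirroring the documented defaults.
function resolveDataPath(
  options: PathOptions,
  env: Record<string, string | undefined>,
  homeDir: string,
): string {
  // Documented default cache location: ~/.cache/node-tesseract-ocr/tessdata
  const cachePath = options.cachePath ?? `${homeDir}/.cache/node-tesseract-ocr/tessdata`;
  // Assumption: an empty TESSDATA_PREFIX counts as "not set".
  const prefix = env["TESSDATA_PREFIX"];
  const fromEnv = prefix !== undefined && prefix.length > 0 ? prefix : undefined;
  // An explicit dataPath wins, then TESSDATA_PREFIX, then the cache directory.
  return options.dataPath ?? fromEnv ?? cachePath;
}

console.log(resolveDataPath({}, { TESSDATA_PREFIX: "/usr/share/tessdata" }, "/home/user"));
// /usr/share/tessdata
```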

TesseractSetRectangleOptions

| Field | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| top | number | No | n/a | Top coordinate. |
| left | number | No | n/a | Left coordinate. |
| width | number | No | n/a | Rectangle width. |
| height | number | No | n/a | Rectangle height. |
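A region that extends past the image edge is an easy mistake when the rectangle comes from user input. The sketch below clips a rectangle to the image bounds before it is handed to setRectangle(...). `clampRectangle` is a hypothetical helper, and the pixel units with a top-left origin are assumptions, since the table does not state them.

```typescript
// Local mirror of the documented TesseractSetRectangleOptions fields.
interface Rectangle {
  top: number;
  left: number;
  width: number;
  height: number;
}

// Hypothetical helper: clip a rectangle so it lies inside an imageWidth x imageHeight image.
function clampRectangle(rect: Rectangle, imageWidth: number, imageHeight: number): Rectangle {
  const left = Math.min(Math.max(0, Math.floor(rect.left)), imageWidth);
  const top = Math.min(Math.max(0, Math.floor(rect.top)), imageHeight);
  // The far edges never cross the near edges, so width and height stay non-negative.
  const right = Math.min(imageWidth, Math.max(left, Math.ceil(rect.left + rect.width)));
  const bottom = Math.min(imageHeight, Math.max(top, Math.ceil(rect.top + rect.height)));
  return { top, left, width: right - left, height: bottom - top };
}

console.log(clampRectangle({ top: -10, left: 20, width: 500, height: 50 }, 300, 200));
// { top: 0, left: 20, width: 280, height: 40 }
```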

ProgressChangedInfo

| Field | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| progress | number | No | n/a | Chars in the current buffer. |
| percent | number | No | n/a | Percent complete (0-100). |
| ocrAlive | number | No | n/a | Non-zero if worker is alive. |
| top | number | No | n/a | Top coordinate of current element bbox. |
| right | number | No | n/a | Right coordinate of current element bbox. |
| bottom | number | No | n/a | Bottom coordinate of current element bbox. |
| left | number | No | n/a | Left coordinate of current element bbox. |
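As a small illustration of how these fields might be consumed, the sketch below turns one progress event into a log line. `ProgressInfo` is a local mirror of the documented fields and `formatProgress` is a hypothetical helper. Absolute differences are used for the box size because the coordinate orientation is not stated in this README.

```typescript
// Local mirror of the documented ProgressChangedInfo fields.
interface ProgressInfo {
  progress: number;
  percent: number;
  ocrAlive: number;
  top: number;
  right: number;
  bottom: number;
  left: number;
}

// Hypothetical helper: format one progress event as a single log line.
function formatProgress(info: ProgressInfo): string {
  // Keep the percentage inside the documented 0-100 range.
  const percent = Math.min(100, Math.max(0, Math.round(info.percent)));
  const width = Math.abs(info.right - info.left);
  const height = Math.abs(info.bottom - info.top);
  const state = info.ocrAlive !== 0 ? "alive" : "stopped";
  return `${percent}% (${state}) bbox ${width}x${height} at left=${info.left}, top=${info.top}`;
}

console.log(
  formatProgress({ progress: 12, percent: 41.6, ocrAlive: 1, top: 10, right: 110, bottom: 40, left: 10 }),
);
// 42% (alive) bbox 100x30 at left=10, top=10
```

A function like this can be passed to recognize(...) as `(info) => console.log(formatProgress(info))`.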

TesseractProcessPagesStatus

| Field | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| active | boolean | No | n/a | Whether a multipage session is currently active. |
| healthy | boolean | No | n/a | Whether the renderer is healthy. |
| processedPages | number | No | n/a | Number of pages already processed in this session. |
| nextPageIndex | number | No | n/a | Zero-based index that will be used for the next page. |
| outputBase | string | No | n/a | Effective output base used by the PDF renderer. |
| timeoutMillisec | number | No | n/a | Timeout per page in milliseconds (0 = unlimited). |
| textonly | boolean | No | n/a | Whether text-only PDF mode is enabled. |

DetectOrientationScriptResult

| Field | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| orientationDegrees | number | No | n/a | Orientation of the source image (0, 90, 180, 270). |
| orientationConfidence | number | No | n/a | Confidence for the orientation. |
| scriptName | string | No | n/a | Detected script name. |
| scriptConfidence | number | No | n/a | Confidence for the script. |

Tesseract API

Constructor

new Tesseract();

Creates a new Tesseract instance.

Initialization Requirements

Call init(...) once before using OCR/engine-dependent methods.

Methods that do not require init(...):

  • version()
  • isInitialized()
  • setInputName(...)
  • getInputName()
  • abortProcessPages()
  • getProcessPagesStatus()
  • document.abort()
  • document.status()
  • init(...)
  • end()

Methods that require init(...):

  • setInputImage(...)
  • getInputImage()
  • getSourceYResolution()
  • getDataPath()
  • setOutputName(...)
  • clearPersistentCache()
  • clearAdaptiveClassifier()
  • setImage(...)
  • getThresholdedImage()
  • getThresholdedImageScaleFactor()
  • setPageMode(...)
  • setRectangle(...)
  • setSourceResolution(...)
  • recognize(...)
  • detectOrientationScript()
  • meanTextConf()
  • allWordConfidences()
  • getPAGEText(...)
  • getLSTMBoxText(...)
  • getBoxText(...)
  • getWordStrBoxText(...)
  • getOSDText(...)
  • getUTF8Text()
  • getHOCRText(...)
  • getTSVText(...)
  • getUNLVText()
  • getALTOText(...)
  • getInitLanguages()
  • getLoadedLanguages()
  • getAvailableLanguages()
  • setDebugVariable(...)
  • setVariable(...)
  • getIntVariable(...)
  • getBoolVariable(...)
  • getDoubleVariable(...)
  • getStringVariable(...)
  • clear()
  • beginProcessPages(...)
  • addProcessPage(...)
  • finishProcessPages()
  • document.begin(...)
  • document.addPage(...)
  • document.finish()

version

Returns the currently loaded libtesseract version string.

version(): Promise<string>

isInitialized

Returns whether init(...) has already completed successfully and has not been reset via end().

isInitialized(): Promise<boolean>

init

Initializes the OCR engine with language, OEM, configs, and variables.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| options | TesseractInitOptions | No | n/a | Initialization options. |

init(options: TesseractInitOptions): Promise<void>

initForAnalysePage

Initializes the engine in analysis-only mode.

initForAnalysePage(): Promise<void>

analyseLayout

Runs page layout analysis on the current image.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| mergeSimilarWords | boolean | No | n/a | Merge similar words during layout analysis. |

analyseLayout(mergeSimilarWords: boolean): Promise<void>

setInputName

Sets the source/input name used by renderer/training APIs.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| inputName | string | No | n/a | Input name used by renderer/training APIs. |

setInputName(inputName: string): Promise<void>

getInputName

Returns the current input name from engine state.

getInputName(): Promise<string>

setInputImage

Sets the encoded source image buffer.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| buffer | Buffer | No | n/a | Encoded source image buffer. |

setInputImage(buffer: Buffer): Promise<void>

getInputImage

Returns the current input image bytes.

getInputImage(): Promise<Buffer>

getSourceYResolution

Returns source image Y resolution (DPI).

getSourceYResolution(): Promise<number>

getDataPath

Returns the active tessdata path from the engine.

getDataPath(): Promise<string>

setOutputName

Sets the output base name for renderer-based outputs.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| outputName | string | No | n/a | Output base name for renderer outputs. |

setOutputName(outputName: string): Promise<void>

clearPersistentCache

Clears global library-level caches (for example dictionaries).

clearPersistentCache(): Promise<void>

clearAdaptiveClassifier

Cleans adaptive classifier state between pages/documents.

clearAdaptiveClassifier(): Promise<void>

setImage

Sets the image used by OCR recognition.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| buffer | Buffer | No | n/a | Image data used for OCR. |

setImage(buffer: Buffer): Promise<void>

getThresholdedImage

Returns thresholded image bytes from Tesseract internals.

getThresholdedImage(): Promise<Buffer>

getThresholdedImageScaleFactor

Returns scale factor for thresholded/component images.

getThresholdedImageScaleFactor(): Promise<number>

setPageMode

Sets the page segmentation mode (PSM).

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| psm | PageSegmentationMode | No | n/a | Page segmentation mode. |

setPageMode(psm: PageSegmentationMode): Promise<void>

setRectangle

Restricts recognition to the given rectangle.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| options | TesseractSetRectangleOptions | No | n/a | Region definition. |

setRectangle(options: TesseractSetRectangleOptions): Promise<void>

setSourceResolution

Sets the source resolution in PPI.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| ppi | number | No | n/a | Source resolution in PPI. |

setSourceResolution(ppi: number): Promise<void>

document

Facade for multipage PDF/document processing lifecycle.

document: {
  begin(options: TesseractBeginProcessPagesOptions): Promise<void>;
  addPage(buffer: Buffer, filename?: string): Promise<void>;
  finish(): Promise<string>;
  abort(): Promise<void>;
  status(): Promise<TesseractProcessPagesStatus>;
}
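The facade suggests a begin, add pages, finish lifecycle, with abort as the cleanup path. The sketch below shows one way to wrap that lifecycle so a failed page does not leave a session open. `DocumentLike` and `renderPages` are hypothetical names. The options type is left generic because the fields of TesseractBeginProcessPagesOptions are not listed in this README, and pages are typed as Uint8Array (which Node's Buffer extends) to keep the sketch free of Node-specific types.

```typescript
// Minimal structural view of the documented `document` facade.
// `Options` stands in for TesseractBeginProcessPagesOptions.
interface DocumentLike<Options> {
  begin(options: Options): Promise<void>;
  addPage(buffer: Uint8Array, filename?: string): Promise<void>;
  finish(): Promise<string>;
  abort(): Promise<void>;
}

// Hypothetical helper: render all pages in order, aborting the session if any step fails.
async function renderPages<Options>(
  doc: DocumentLike<Options>,
  options: Options,
  pages: readonly Uint8Array[],
): Promise<string> {
  await doc.begin(options);
  try {
    for (const page of pages) {
      await doc.addPage(page);
    }
    // Resolves to the output PDF path reported by the facade.
    return await doc.finish();
  } catch (error) {
    // Reset the session so the instance can start a new one, then surface the error.
    await doc.abort();
    throw error;
  }
}
```

With a real instance this would be called as `await renderPages(tesseract.document, options, buffers)`, after init(...) has completed.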

document.begin

Starts a multipage processing session.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| options | TesseractBeginProcessPagesOptions | No | n/a | Multipage renderer options. |

document.begin(options: TesseractBeginProcessPagesOptions): Promise<void>

document.addPage

Adds an encoded page to the active session.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| buffer | Buffer | No | n/a | Encoded page image buffer. |
| filename | string | Yes | undefined | Optional source filename/path passed to Tesseract ProcessPage for this page. Tesseract/Leptonica may open this file internally and use it as the source image for parts of PDF rendering. If output pages look wrong (for example inverted or visually corrupted), pass a real image path here to force a stable source image path for that page. |

document.addPage(buffer: Buffer, filename?: string): Promise<void>

document.finish

Finalizes the active session and returns output PDF path.

document.finish(): Promise<string>

document.abort

Aborts and resets the active multipage session.

document.abort(): Promise<void>

document.status

Returns the current multipage session status (active flag, page counters, and effective renderer settings).

document.status(): Promise<TesseractProcessPagesStatus>

getProcessPagesStatus

Returns the current multipage session status from the instance API.

getProcessPagesStatus(): Promise<TesseractProcessPagesStatus>

setDebugVariable

Sets a debug configuration variable.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| name | keyof SetVariableConfigVariables | No | n/a | Variable name. |
| value | SetVariableConfigVariables[keyof SetVariableConfigVariables] | No | n/a | Variable value. |

setDebugVariable(
  name: keyof SetVariableConfigVariables,
  value: SetVariableConfigVariables[keyof SetVariableConfigVariables],
): Promise<boolean>

setVariable

Sets a regular configuration variable.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| name | keyof SetVariableConfigVariables | No | n/a | Variable name. |
| value | SetVariableConfigVariables[keyof SetVariableConfigVariables] | No | n/a | Variable value. |

setVariable(
  name: keyof SetVariableConfigVariables,
  value: SetVariableConfigVariables[keyof SetVariableConfigVariables],
): Promise<boolean>

getIntVariable

Reads a configuration variable as integer.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| name | keyof SetVariableConfigVariables | No | n/a | Variable name. |

getIntVariable(name: keyof SetVariableConfigVariables): Promise<number>

getBoolVariable

Reads a configuration variable as boolean (0/1).

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| name | keyof SetVariableConfigVariables | No | n/a | Variable name. |

getBoolVariable(name: keyof SetVariableConfigVariables): Promise<number>

getDoubleVariable

Reads a configuration variable as double.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| name | keyof SetVariableConfigVariables | No | n/a | Variable name. |

getDoubleVariable(name: keyof SetVariableConfigVariables): Promise<number>

getStringVariable

Reads a configuration variable as string.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| name | keyof SetVariableConfigVariables | No | n/a | Variable name. |

getStringVariable(name: keyof SetVariableConfigVariables): Promise<string>

recognize

Runs OCR recognition (optionally with progress callback).

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| progressCallback | (info: ProgressChangedInfo) => void | Yes | undefined | OCR progress callback. |

recognize(progressCallback?: (info: ProgressChangedInfo) => void): Promise<void>

detectOrientationScript

Detects orientation and script with confidence values.

detectOrientationScript(): Promise<DetectOrientationScriptResult>

meanTextConf

Returns mean text confidence.

meanTextConf(): Promise<number>

allWordConfidences

Returns all word confidences for current recognition result.

allWordConfidences(): Promise<number[]>
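The per-word values are often easier to act on once summarized. The sketch below computes the mean and minimum and lists the positions of words below a cutoff. `summarizeConfidences` is a hypothetical helper. The 0-100 scale is Tesseract's usual convention and is assumed here, and the default cutoff of 60 is an arbitrary illustration value.

```typescript
interface ConfidenceSummary {
  count: number;
  mean: number;
  min: number;
  lowIndexes: number[];
}

// Hypothetical helper: summarize the array resolved by allWordConfidences().
function summarizeConfidences(
  confidences: readonly number[],
  threshold = 60,
): ConfidenceSummary {
  if (confidences.length === 0) {
    return { count: 0, mean: 0, min: 0, lowIndexes: [] };
  }
  let sum = 0;
  let min = Number.POSITIVE_INFINITY;
  const lowIndexes: number[] = [];
  confidences.forEach((value, index) => {
    sum += value;
    if (value < min) min = value;
    // Record positions of words that fall below the cutoff.
    if (value < threshold) lowIndexes.push(index);
  });
  return { count: confidences.length, mean: sum / confidences.length, min, lowIndexes };
}

console.log(summarizeConfidences([90, 40, 80]));
// { count: 3, mean: 70, min: 40, lowIndexes: [ 1 ] }
```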

getPAGEText

Returns PAGE XML output.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| progressCallback | (info: ProgressChangedInfo) => void | Yes | undefined | PAGE generation progress callback. |
| pageNumber | number | Yes | undefined | 0-based page number. |

getPAGEText(
  progressCallback?: (info: ProgressChangedInfo) => void,
  pageNumber?: number,
): Promise<string>

getLSTMBoxText

Returns LSTM box output.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| pageNumber | number | Yes | undefined | 0-based page number. |

getLSTMBoxText(pageNumber?: number): Promise<string>

getBoxText

Returns classic box output.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| pageNumber | number | Yes | undefined | 0-based page number. |

getBoxText(pageNumber?: number): Promise<string>

getWordStrBoxText

Returns WordStr box output.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| pageNumber | number | Yes | undefined | 0-based page number. |

getWordStrBoxText(pageNumber?: number): Promise<string>

getOSDText

Returns OSD text output.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| pageNumber | number | Yes | undefined | 0-based page number. |

getOSDText(pageNumber?: number): Promise<string>

getUTF8Text

Returns recognized UTF-8 text.

getUTF8Text(): Promise<string>

getHOCRText

Returns hOCR output.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| progressCallback | (info: ProgressChangedInfo) => void | Yes | undefined | hOCR generation progress callback. |
| pageNumber | number | Yes | undefined | 0-based page number. |

getHOCRText(
  progressCallback?: (info: ProgressChangedInfo) => void,
  pageNumber?: number,
): Promise<string>

getTSVText

Returns TSV output.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| pageNumber | number | Yes | undefined | 0-based page number. |

getTSVText(pageNumber?: number): Promise<string>
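The TSV string usually needs parsing before it is useful. The sketch below extracts word rows, assuming the standard Tesseract TSV layout of twelve tab-separated columns (level, page_num, block_num, par_num, line_num, word_num, left, top, width, height, conf, text), where level 5 marks a word. That getTSVText() returns exactly this layout is an assumption. A header row is skipped if present, since it may or may not be included. `parseTsvWords` and `TsvWord` are hypothetical names.

```typescript
interface TsvWord {
  left: number;
  top: number;
  width: number;
  height: number;
  conf: number;
  text: string;
}

// Hypothetical helper: pull word-level rows out of Tesseract TSV output.
function parseTsvWords(tsv: string): TsvWord[] {
  const words: TsvWord[] = [];
  for (const line of tsv.split(/\r?\n/)) {
    const cols = line.split("\t");
    // Skip blank or short lines and an optional header row.
    if (cols.length < 12 || cols[0] === "level") continue;
    // Level 5 rows describe words; lower levels are page/block/paragraph/line containers.
    if (cols[0] !== "5") continue;
    words.push({
      left: Number(cols[6]),
      top: Number(cols[7]),
      width: Number(cols[8]),
      height: Number(cols[9]),
      conf: Number(cols[10]),
      // Rejoin in case the recognized text itself contained a tab.
      text: cols.slice(11).join("\t"),
    });
  }
  return words;
}
```

A typical call would be `parseTsvWords(await tesseract.getTSVText())` after recognition has run.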

getUNLVText

Returns UNLV output.

getUNLVText(): Promise<string>

getALTOText

Returns ALTO XML output.

| Name | Type | Optional | Default | Description |
| --- | --- | --- | --- | --- |
| pageNumber | number | Yes | undefined | 0-based page number. |

getALTOText(pageNumber?: number): Promise<string>

getInitLanguages

Returns languages used during initialization (for example deu+eng).

getInitLanguages(): Promise<string>

getLoadedLanguages

Returns languages currently loaded in the engine.

getLoadedLanguages(): Promise<Language[]>

getAvailableLanguages

Returns languages available from tessdata.

getAvailableLanguages(): Promise<Language[]>

clear

Clears internal recognition state/results.

clear(): Promise<void>

end

Releases native resources and ends the instance.

end(): Promise<void>

License

Apache-2.0. See LICENSE.md for full terms.