@luii/node-tesseract-ocr
v2.1.0
Native C++ addon for Node.js that exposes Tesseract OCR (`libtesseract-dev`) to JavaScript/TypeScript.
node-tesseract-ocr
Features
- Native bindings to Tesseract (prebuilds via pkg-prebuilds)
- Access to Tesseract enums and configuration from TypeScript
- Progress callback and multiple output formats
- Lazy download of missing traineddata (configurable)
Prerequisites
- nodejs
- node-addon-api
- C++ build toolchain (e.g. build-essential)
- libtesseract-dev
- libleptonica-dev
- Tesseract training data (eng, deu, ...) or let the library handle that
See Install
Install
sudo apt update
sudo apt install -y nodejs npm build-essential pkg-config libtesseract-dev libleptonica-dev tesseract-ocr-eng
git clone [email protected]:luii/node-tesseract-ocr.git
cd node-tesseract-ocr
npm install
Install additional training data
On Debian/Ubuntu, language data is provided as packages named tesseract-ocr-<lang>.
Install additional languages as needed, for example:
sudo apt install -y tesseract-ocr-deu tesseract-ocr-eng tesseract-ocr-jpn
If you install traineddata files manually, make sure TESSDATA_PREFIX points to the directory that contains them (for example /usr/share/tessdata).
If traineddata is missing, this package will download it lazily during init by default. You can control this behavior via ensureTraineddata, cachePath, and dataPath.
Build
# Debug build (native addon + TS outputs)
npm run build:debug
# Release build
npm run build:release
Start
Set TESSDATA_PREFIX to your traineddata directory (usually /usr/share/tesseract-ocr/5/tessdata or /usr/share/tessdata).
env TESSDATA_PREFIX=/usr/share/tessdata node path/to/your/app.js
If you prefer automatic downloads, you can skip setting TESSDATA_PREFIX and let the default cache directory handle traineddata on first use.
Scripts
# Build native addon + TS outputs (debug / release)
npm run build:debug
npm run build:release
# Run the JS example (builds debug first)
npm run example:recognize
# Tests
npm run test:cpp
npm run test:js
npm run test:js:watch
Examples
Run Included Example
env TESSDATA_PREFIX=/usr/share/tessdata npm run example:recognize
Basic OCR (Local Traineddata)
You can find a similar example in the examples/ folder of the project.
import fs from "node:fs";
import Tesseract from "@luii/node-tesseract-ocr";
// Point Tesseract at locally installed traineddata.
process.env.TESSDATA_PREFIX = "/usr/share/tessdata/";
async function main() {
  const tesseract = new Tesseract();
  await tesseract.init({
    langs: ["eng"],
  });
  const buffer = fs.readFileSync("example1.png");
  await tesseract.setImage(buffer);
  // Log progress while recognition runs.
  await tesseract.recognize((info) => {
    console.log(`Progress: ${info.percent}%`);
  });
  const text = await tesseract.getUTF8Text();
  console.log(text);
  await tesseract.end();
}
main().catch((err) => {
  console.error(err);
  process.exit(1);
});
Lazy Traineddata Download (Default)
import fs from "node:fs";
import Tesseract from "@luii/node-tesseract-ocr";
async function main() {
  const tesseract = new Tesseract();
  await tesseract.init({
    langs: ["eng"],
    // Download missing traineddata on first use (this is the default).
    ensureTraineddata: true,
    dataPath: "./tessdata-local",
  });
  const buffer = fs.readFileSync("example1.png");
  await tesseract.setImage(buffer);
  await tesseract.recognize();
  const text = await tesseract.getUTF8Text();
  console.log(text);
  await tesseract.end();
}
main().catch((err) => {
  console.error(err);
  process.exit(1);
});
Public API
Enums
Language
Mapping of available Tesseract language codes. Most are 3-letter ISO 639-2/T style (e.g. eng, deu, jpn), with Tesseract-specific variants such as chi_sim, deu_latf, or osd. The value must match the installed traineddata filename (without the .traineddata suffix). Pass one or more codes as an array via TesseractInitOptions.langs.
[!IMPORTANT] If you join codes with a plus sign (e.g. deu+eng), Tesseract will look for multiple languages in the same image (here: German and English).
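The plus-joined form is also the raw format that getInitLanguages reports. As a minimal sketch, the two helpers below convert between an array of codes and that raw string. `joinLanguages` and `splitLanguages` are hypothetical names for illustration and are not part of this package.

```typescript
// Build Tesseract's raw "a+b" language string from an array of codes.
// Trims whitespace, drops empty entries, and removes duplicates while
// keeping the original order.
function joinLanguages(langs: readonly string[]): string {
  const unique = new Set<string>();
  for (const lang of langs) {
    const code = lang.trim();
    if (code.length > 0) unique.add(code);
  }
  return Array.from(unique).join("+");
}

// Split a raw "a+b" language string back into individual codes.
function splitLanguages(raw: string): string[] {
  return raw
    .split("+")
    .map((code) => code.trim())
    .filter((code) => code.length > 0);
}

console.log(joinLanguages(["deu", "eng"])); // deu+eng
console.log(splitLanguages("deu+eng").length); // 2
```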
OcrEngineMode
Full list of OCR engine modes from Tesseract.
| Name | Value | Deprecated | Description |
| ----------------------------- | ----- | ---------- | ---------------------------------------------------------- |
| OEM_TESSERACT_ONLY | 0 | Yes | Run Tesseract only (fastest). |
| OEM_LSTM_ONLY | 1 | No | Run only the LSTM line recognizer. |
| OEM_TESSERACT_LSTM_COMBINED | 2 | Yes | Run LSTM with fallback to Tesseract. |
| OEM_DEFAULT | 3 | No | Infer engine mode from configs; default is Tesseract-only. |
PageSegmentationMode
Full list of page segmentation modes from Tesseract.
| Name | Value | Deprecated | Description |
| ---------------------------- | ----- | ---------- | --------------------------------------------------------- |
| PSM_OSD_ONLY | 0 | No | Orientation and script detection only. |
| PSM_AUTO_OSD | 1 | No | Automatic page segmentation with OSD. |
| PSM_AUTO_ONLY | 2 | No | Automatic page segmentation, no OSD or OCR. |
| PSM_AUTO | 3 | No | Fully automatic page segmentation, no OSD. |
| PSM_SINGLE_COLUMN | 4 | No | Assume a single column of text of variable sizes. |
| PSM_SINGLE_BLOCK_VERT_TEXT | 5 | No | Assume a single uniform block of vertically aligned text. |
| PSM_SINGLE_BLOCK | 6 | No | Assume a single uniform block of text (default). |
| PSM_SINGLE_LINE | 7 | No | Treat the image as a single text line. |
| PSM_SINGLE_WORD | 8 | No | Treat the image as a single word. |
| PSM_CIRCLE_WORD | 9 | No | Treat the image as a single word in a circle. |
| PSM_SINGLE_CHAR | 10 | No | Treat the image as a single character. |
| PSM_SPARSE_TEXT | 11 | No | Find as much text as possible in no particular order. |
| PSM_SPARSE_TEXT_OSD | 12 | No | Sparse text with orientation and script detection. |
| PSM_RAW_LINE | 13 | No | Single text line, bypassing Tesseract-specific hacks. |
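Choosing a mode usually comes down to what the input image contains. The sketch below maps a rough description of the input to a value from the table. The `InputKind` categories and the `pickPsm` helper are illustrative assumptions, and the numeric constants are copied from the table rather than imported from the package; in real code you would pass the corresponding PageSegmentationMode member to setPageMode.

```typescript
// Numeric values copied from the PageSegmentationMode table.
const PSM_VALUES = {
  AUTO: 3,
  SINGLE_COLUMN: 4,
  SINGLE_BLOCK: 6,
  SINGLE_LINE: 7,
  SINGLE_WORD: 8,
  SINGLE_CHAR: 10,
  SPARSE_TEXT: 11,
} as const;

// Hypothetical categories describing what the image contains.
type InputKind = "page" | "column" | "paragraph" | "line" | "word" | "char" | "scattered";

// Map an input category to a page segmentation mode value.
function pickPsm(kind: InputKind): number {
  switch (kind) {
    case "page":
      return PSM_VALUES.AUTO;
    case "column":
      return PSM_VALUES.SINGLE_COLUMN;
    case "paragraph":
      return PSM_VALUES.SINGLE_BLOCK;
    case "line":
      return PSM_VALUES.SINGLE_LINE;
    case "word":
      return PSM_VALUES.SINGLE_WORD;
    case "char":
      return PSM_VALUES.SINGLE_CHAR;
    case "scattered":
      return PSM_VALUES.SPARSE_TEXT;
  }
}

console.log(pickPsm("line")); // 7
```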
Types
TesseractInitOptions
| Field | Type | Optional | Default | Description |
| ----------------------- | ----------------------------------------------------------------------------------------------------- | -------- | -------------------------------------- | --------------------------------------- |
| langs | Language[] | Yes | undefined | Languages to load as an array. |
| oem | OcrEngineMode | Yes | undefined | OCR engine mode. |
| vars | Partial<Record<keyof ConfigurationVariables, ConfigurationVariables[keyof ConfigurationVariables]>> | Yes | undefined | Variables to set. |
| configs | Array<string> | Yes | undefined | Tesseract config files to apply. |
| setOnlyNonDebugParams | boolean | Yes | undefined | If true, only non-debug params are set. |
| ensureTraineddata | boolean | Yes | true | Download missing traineddata lazily. |
| cachePath | string | Yes | ~/.cache/node-tesseract-ocr/tessdata | Cache directory for downloads. |
| dataPath | string | Yes | TESSDATA_PREFIX or cachePath | Directory used by Tesseract for data. |
| progressCallback | (info: TrainingDataDownloadProgress) => void | Yes | undefined | Download progress callback. |
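The defaults for cachePath and dataPath interact, so it can help to see the fallback chain spelled out. The following sketch mirrors the defaults listed in the table. The exact precedence (explicit dataPath, then TESSDATA_PREFIX, then cachePath) is an assumption based on the Default column, and `resolveDataPath` is a hypothetical helper, not the library's own resolution code. The environment and home directory are passed in as arguments so the function stays pure.

```typescript
interface DataPathOptions {
  cachePath?: string;
  dataPath?: string;
}

// Resolve the directory Tesseract would read traineddata from,
// assuming: explicit dataPath > TESSDATA_PREFIX > cachePath.
function resolveDataPath(
  options: DataPathOptions,
  env: Record<string, string | undefined>,
  homeDir: string,
): string {
  // Default cache location as documented in the table.
  const defaultCache = `${homeDir}/.cache/node-tesseract-ocr/tessdata`;
  const cachePath = options.cachePath ?? defaultCache;
  const prefix = env["TESSDATA_PREFIX"];
  if (options.dataPath) return options.dataPath;
  if (prefix) return prefix;
  return cachePath;
}

console.log(resolveDataPath({}, {}, "/home/user"));
// /home/user/.cache/node-tesseract-ocr/tessdata
```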
TesseractSetRectangleOptions
| Field | Type | Optional | Default | Description |
| -------- | -------- | -------- | ------- | ----------------- |
| top | number | No | n/a | Top coordinate. |
| left | number | No | n/a | Left coordinate. |
| width | number | No | n/a | Rectangle width. |
| height | number | No | n/a | Rectangle height. |
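Regions derived from user input or from a detector can extend past the image edges. A minimal sketch of clamping a region to the image bounds before handing it to setRectangle is shown here. `clampRectangle` is a hypothetical helper, and whether the library clamps on its own is not stated in this document.

```typescript
interface Rect {
  top: number;
  left: number;
  width: number;
  height: number;
}

// Clamp a rectangle to [0, imageWidth] x [0, imageHeight].
// Returns null when nothing of the rectangle lies inside the image.
function clampRectangle(rect: Rect, imageWidth: number, imageHeight: number): Rect | null {
  const left = Math.max(0, Math.floor(rect.left));
  const top = Math.max(0, Math.floor(rect.top));
  const right = Math.min(imageWidth, Math.ceil(rect.left + rect.width));
  const bottom = Math.min(imageHeight, Math.ceil(rect.top + rect.height));
  if (right <= left || bottom <= top) return null;
  return { top, left, width: right - left, height: bottom - top };
}

console.log(clampRectangle({ top: -10, left: 5, width: 200, height: 50 }, 100, 100));
// { top: 0, left: 5, width: 95, height: 40 }
```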
ProgressChangedInfo
| Field | Type | Optional | Default | Description |
| ---------- | -------- | -------- | ------- | ------------------------------------------ |
| progress | number | No | n/a | Chars in the current buffer. |
| percent | number | No | n/a | Percent complete (0-100). |
| ocrAlive | number | No | n/a | Non-zero if worker is alive. |
| top | number | No | n/a | Top coordinate of current element bbox. |
| right | number | No | n/a | Right coordinate of current element bbox. |
| bottom | number | No | n/a | Bottom coordinate of current element bbox. |
| left | number | No | n/a | Left coordinate of current element bbox. |
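Progress callbacks can fire often, which makes naive logging noisy. As a sketch, the wrapper below forwards an update only when percent has advanced by a given step or has reached 100. `throttleProgress` is a hypothetical helper that relies only on the percent field from the table.

```typescript
interface HasPercent {
  percent: number;
}

// Wrap a progress callback so it is invoked only when percent has grown
// by at least stepPercent since the last forwarded update, plus once on
// reaching 100.
function throttleProgress<T extends HasPercent>(
  onProgress: (info: T) => void,
  stepPercent = 10,
): (info: T) => void {
  let lastReported = -Infinity;
  return (info: T): void => {
    const steppedForward = info.percent - lastReported >= stepPercent;
    const justFinished = info.percent >= 100 && lastReported < 100;
    if (steppedForward || justFinished) {
      lastReported = info.percent;
      onProgress(info);
    }
  };
}

// Usage with this package (not executed here):
//   await tesseract.recognize(throttleProgress((info) => console.log(info.percent), 25));
```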
DetectOrientationScriptResult
| Field | Type | Optional | Default | Description |
| ----------------------- | -------- | -------- | ------- | -------------------------------------------------- |
| orientationDegrees | number | No | n/a | Orientation of the source image (0, 90, 180, 270). |
| orientationConfidence | number | No | n/a | Confidence for the orientation. |
| scriptName | string | No | n/a | Detected script name. |
| scriptConfidence | number | No | n/a | Confidence for the script. |
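A common use of this result is deciding how to rotate a scan before OCR. The sketch below assumes that orientationDegrees reports how far the page content is rotated clockwise, so the correcting clockwise rotation is its complement to 360. That direction convention, the `correctionRotation` helper, and the default confidence threshold are assumptions for illustration; tune the threshold against your own data.

```typescript
interface OrientationInfo {
  orientationDegrees: number;
  orientationConfidence: number;
}

// Return the clockwise rotation (in degrees) that would bring the image
// upright, or null if the detection is below the confidence threshold.
function correctionRotation(result: OrientationInfo, minConfidence = 2): number | null {
  if (result.orientationConfidence < minConfidence) return null;
  // Normalize to [0, 360) before computing the complement.
  const normalized = ((result.orientationDegrees % 360) + 360) % 360;
  return (360 - normalized) % 360;
}

console.log(correctionRotation({ orientationDegrees: 90, orientationConfidence: 10 })); // 270
```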
Tesseract API
Constructor
new Tesseract();
Creates a new Tesseract instance.
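Because each instance has to be ended explicitly, it can be convenient to wrap the lifecycle so that end() runs even when the work in between throws. The sketch below is one way to do that. `withEngine` and the `Endable` interface are hypothetical and only assume the documented end() method; end() is called only after setup has succeeded.

```typescript
interface Endable {
  end(): Promise<void>;
}

// Run setup, then work, and always call end() once setup has succeeded.
async function withEngine<E extends Endable, R>(
  engine: E,
  setup: (engine: E) => Promise<void>,
  work: (engine: E) => Promise<R>,
): Promise<R> {
  await setup(engine);
  try {
    return await work(engine);
  } finally {
    await engine.end();
  }
}

// Usage with this package (not executed here):
//   const text = await withEngine(
//     new Tesseract(),
//     (t) => t.init({ langs: ["eng"] }),
//     async (t) => {
//       await t.setImage(buffer);
//       await t.recognize();
//       return t.getUTF8Text();
//     },
//   );
```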
init
Initializes Tesseract with language, engine mode, configs, and variables.
| Name | Type | Optional | Default | Description |
| ------- | ----------------------------------------------- | -------- | ------- | ----------------------- |
| options | TesseractInitOptions | No | n/a | Initialization options. |
init(options: TesseractInitOptions): Promise<void>
initForAnalysePage
Initializes for layout analysis only.
initForAnalysePage(): Promise<void>
analysePage
Runs the layout analysis.
| Name | Type | Optional | Default | Description |
| ----------------- | ------- | -------- | ------- | ------------------------------- |
| mergeSimilarWords | boolean | No | n/a | Whether to merge similar words. |
analysePage(mergeSimilarWords: boolean): Promise<void>
setPageMode
Sets the page segmentation mode.
| Name | Type | Optional | Default | Description |
| ---- | ------------------------------------------------ | -------- | ------- | ----------------------- |
| psm | PageSegmentationMode | No | n/a | Page segmentation mode. |
setPageMode(psm: PageSegmentationMode): Promise<void>
setVariable
Sets a Tesseract variable. Returns false if the lookup failed.
| Name | Type | Optional | Default | Description |
| ----- | -------------------------------------------------------------- | -------- | ------- | --------------- |
| name | keyof SetVariableConfigVariables | No | n/a | Variable name. |
| value | SetVariableConfigVariables[keyof SetVariableConfigVariables] | No | n/a | Variable value. |
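Since setVariable resolves to false when the lookup fails, a batch of settings can be applied while collecting the names that were rejected. The sketch below assumes string values and plain string names; `applyVariables` and the `VariableSetter` interface are hypothetical and may need adjusting to the package's typed variable names.

```typescript
interface VariableSetter {
  setVariable(name: string, value: string): Promise<boolean>;
}

// Apply each variable in order and return the names that were rejected.
async function applyVariables(
  engine: VariableSetter,
  vars: Record<string, string>,
): Promise<string[]> {
  const rejected: string[] = [];
  for (const name of Object.keys(vars)) {
    const value = vars[name];
    if (value === undefined) continue;
    const accepted = await engine.setVariable(name, value);
    if (!accepted) rejected.push(name);
  }
  return rejected;
}

// Usage with this package (not executed here):
//   const rejected = await applyVariables(tesseract, {
//     tessedit_char_whitelist: "0123456789",
//     preserve_interword_spaces: "1",
//   });
//   if (rejected.length > 0) console.warn("Rejected variables:", rejected);
```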
setVariable(name: keyof SetVariableConfigVariables, value: SetVariableConfigVariables[keyof SetVariableConfigVariables]): Promise<boolean>
getIntVariable
Reads an integer variable from Tesseract.
| Name | Type | Optional | Default | Description |
| ---- | -------------------------------- | -------- | ------- | -------------- |
| name | keyof SetVariableConfigVariables | No | n/a | Variable name. |
getIntVariable(name: keyof SetVariableConfigVariables): Promise<number>
getBoolVariable
Reads a boolean variable from Tesseract. Returns 0 or 1.
| Name | Type | Optional | Default | Description |
| ---- | -------------------------------- | -------- | ------- | -------------- |
| name | keyof SetVariableConfigVariables | No | n/a | Variable name. |
getBoolVariable(name: keyof SetVariableConfigVariables): Promise<number>
getDoubleVariable
Reads a double variable from Tesseract.
| Name | Type | Optional | Default | Description |
| ---- | -------------------------------- | -------- | ------- | -------------- |
| name | keyof SetVariableConfigVariables | No | n/a | Variable name. |
getDoubleVariable(name: keyof SetVariableConfigVariables): Promise<number>
getStringVariable
Reads a string variable from Tesseract.
| Name | Type | Optional | Default | Description |
| ---- | -------------------------------- | -------- | ------- | -------------- |
| name | keyof SetVariableConfigVariables | No | n/a | Variable name. |
getStringVariable(name: keyof SetVariableConfigVariables): Promise<string>
setImage
Sets the image from a Buffer.
| Name | Type | Optional | Default | Description |
| ------ | ------ | -------- | ------- | ----------- |
| buffer | Buffer | No | n/a | Image data. |
setImage(buffer: Buffer): Promise<void>
setRectangle
Sets the image region using coordinates and size.
| Name | Type | Optional | Default | Description |
| ------- | --------------------------------------------------------------- | -------- | ------- | ------------------ |
| options | TesseractSetRectangleOptions | No | n/a | Region definition. |
setRectangle(options: TesseractSetRectangleOptions): Promise<void>
setSourceResolution
Sets the source resolution in PPI.
| Name | Type | Optional | Default | Description |
| ---- | ------ | -------- | ------- | ---------------- |
| ppi | number | No | n/a | Pixels per inch. |
setSourceResolution(ppi: number): Promise<void>
recognize
Starts OCR and calls the callback with progress info.
| Name | Type | Optional | Default | Description |
| ---------------- | ------------------------------------------------------------- | -------- | ------- | ------------------ |
| progressCallback | (info: ProgressChangedInfo) => void | No | n/a | Progress callback. |
recognize(progressCallback: (info: ProgressChangedInfo) => void): Promise<void>
getUTF8Text
Returns recognized text as UTF-8.
getUTF8Text(): Promise<string>
getHOCRText
Returns HOCR output. Optional progress callback and page number.
| Name | Type | Optional | Default | Description |
| ---------------- | ------------------------------------------------------------- | -------- | --------- | ---------------------- |
| progressCallback | (info: ProgressChangedInfo) => void | Yes | undefined | Progress callback. |
| pageNumber | number | Yes | undefined | Page number (0-based). |
getHOCRText(
  progressCallback?: (info: ProgressChangedInfo) => void,
  pageNumber?: number,
): Promise<string>
getTSVText
Returns TSV output.
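Tesseract's TSV format uses twelve tab-separated columns (level, page_num, block_num, par_num, line_num, word_num, left, top, width, height, conf, text), with word rows at level 5. As a sketch under the assumption that this method returns that standard layout, the hypothetical `parseTsvWords` helper below extracts word boxes and skips any header or layout rows.

```typescript
interface TsvWord {
  text: string;
  confidence: number;
  left: number;
  top: number;
  width: number;
  height: number;
}

// Extract word-level rows (level 5) from Tesseract TSV output.
function parseTsvWords(tsv: string): TsvWord[] {
  const words: TsvWord[] = [];
  for (const line of tsv.split(/\r?\n/)) {
    const cols = line.split("\t");
    // Skip the header row, layout rows, and malformed lines.
    if (cols.length < 12 || cols[0] !== "5") continue;
    // Re-join in case the recognized text itself contained a tab.
    const text = cols.slice(11).join("\t");
    if (text.trim().length === 0) continue;
    words.push({
      text,
      confidence: Number(cols[10]),
      left: Number(cols[6]),
      top: Number(cols[7]),
      width: Number(cols[8]),
      height: Number(cols[9]),
    });
  }
  return words;
}

// Usage with this package (not executed here):
//   const words = parseTsvWords(await tesseract.getTSVText());
```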
getTSVText(): Promise<string>
getUNLVText
Returns UNLV output.
getUNLVText(): Promise<string>
getALTOText
Returns ALTO output. Optional progress callback and page number.
| Name | Type | Optional | Default | Description |
| ---------------- | ------------------------------------------------------------- | -------- | --------- | ---------------------- |
| progressCallback | (info: ProgressChangedInfo) => void | Yes | undefined | Progress callback. |
| pageNumber | number | Yes | undefined | Page number (0-based). |
getALTOText(
  progressCallback?: (info: ProgressChangedInfo) => void,
  pageNumber?: number,
): Promise<string>
detectOrientationScript
Detects orientation and script with confidences. Returns DetectOrientationScriptResult.
detectOrientationScript(): Promise<DetectOrientationScriptResult>
meanTextConf
Mean text confidence (0-100).
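The mean confidence is a convenient gate for rejecting poor scans. The sketch below reads the text only when the mean confidence clears a threshold, and is meant to be called after recognition has run. `readTextIfConfident`, the `TextSource` interface, and the default threshold of 60 are illustrative assumptions.

```typescript
interface TextSource {
  getUTF8Text(): Promise<string>;
  meanTextConf(): Promise<number>;
}

interface ConfidentText {
  text: string;
  confidence: number;
}

// Return the recognized text only if the mean confidence (0-100)
// reaches minConfidence; otherwise return null.
async function readTextIfConfident(
  engine: TextSource,
  minConfidence = 60,
): Promise<ConfidentText | null> {
  const confidence = await engine.meanTextConf();
  if (confidence < minConfidence) return null;
  const text = await engine.getUTF8Text();
  return { text, confidence };
}

// Usage with this package (not executed here):
//   await tesseract.recognize();
//   const result = await readTextIfConfident(tesseract, 75);
```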
meanTextConf(): Promise<number>
getInitLanguages
Returns the languages used for the last initialization as a raw Tesseract string (e.g. "deu+eng").
getInitLanguages(): Promise<string>
getLoadedLanguages
Returns the languages currently loaded into the engine as Language[] in raw Tesseract format.
getLoadedLanguages(): Promise<Language[]>
getAvailableLanguages
Returns the languages available in the data directory as Language[] in raw Tesseract format.
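One use for this list is checking up front whether everything you intend to load is installed, for example when ensureTraineddata is disabled. A minimal sketch follows; `missingLanguages` is a hypothetical helper that works on plain string codes.

```typescript
// Return the requested codes that are not present in the available list,
// preserving the order in which they were requested.
function missingLanguages(
  requested: readonly string[],
  available: readonly string[],
): string[] {
  const installed = new Set(available);
  return requested.filter((lang) => !installed.has(lang));
}

console.log(missingLanguages(["eng", "deu", "jpn"], ["eng", "osd"]).join(",")); // deu,jpn

// Usage with this package (not executed here):
//   const missing = missingLanguages(["eng", "deu"], await tesseract.getAvailableLanguages());
```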
getAvailableLanguages(): Promise<Language[]>
clear
Clears internal state.
clear(): Promise<void>
end
Ends the instance.
end(): Promise<void>
License
Apache-2.0. See LICENSE.md for full terms.
Special Thanks
- Stunt3000
