@yusseiin/node-red-gpc-ocr-line-segmentation

v0.0.6

Performs line segmentation based on character polygon coordinates for data extraction for gpc ocr in node-red.

Downloads

17

node-red-gpc-ocr-line-segmentation

Based on: https://github.com/sshniro/line-segmentation-algorithm-to-gcp-vision

Introduction

Google Vision provides two options for optical character recognition (OCR):

- Option 1: TEXT_DETECTION - Words with coordinates
- Option 2: DOCUMENT_TEXT_DETECTION - OCR on dense text to extract lines and paragraph information
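
As a rough illustration of how one of the two options is selected, the sketch below builds a request body for the Vision REST `images:annotate` endpoint. The helper name `buildAnnotateRequest` and the placeholder image string are assumptions made for this example; they are not part of this package.

```javascript
// Hypothetical helper (not part of this package): builds the JSON body for a
// POST to https://vision.googleapis.com/v1/images:annotate.
function buildAnnotateRequest(base64Image, featureType) {
  const allowed = ["TEXT_DETECTION", "DOCUMENT_TEXT_DETECTION"];
  if (!allowed.includes(featureType)) {
    throw new Error(`Unsupported feature type: ${featureType}`);
  }
  return {
    requests: [
      {
        image: { content: base64Image },
        features: [{ type: featureType }],
      },
    ],
  };
}

// Option 1: words with coordinates.
const sparseRequest = buildAnnotateRequest("<base64-image-bytes>", "TEXT_DETECTION");
// Option 2: dense-text OCR with line and paragraph information.
const denseRequest = buildAnnotateRequest("<base64-image-bytes>", "DOCUMENT_TEXT_DETECTION");
```

Only the `type` field differs between the two requests; everything else in the body stays the same.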

The second option is suitable for data extraction from dense text such as newspapers and books. It uses an intelligent segmentation method that merges nearby words into lines and paragraphs.

This feature is undesirable for images with sparse text, such as retail invoices, where data belonging to the same line sits at opposite edges of the page (a large gap of whitespace between the product name and the price). For these images the OCR segments the lines in a different order: if two words on the same line are too far apart, Google Vision identifies them as separate paragraphs or lines.

The images below show sample Google Vision output for a typical invoice.

This behaviour creates a problem in information extraction scenarios. For example, to extract the price of a product from a retail invoice, the system needs a way to match the words on the same line. The algorithm proposed below performs line segmentation based on character polygon coordinates for data extraction.

Usage Guide

Usage instructions for each programming language are located in the README files inside the relevant folders.

Proposed Algorithm

The implemented algorithm runs in two stages:

  • Stage 1 - Groups nearby words into longer line strips
  • Stage 2 - Connects words which are far apart using the bounding polygon approach

Explanation.

Stage 1 reduces the computation needed for Stage 2. In this stage the algorithm merges words and characters that are very close together. It is needed because Google Vision can split price-related text such as $3.40 into two words (word 1: $3. word 2: 40). Concatenating nearby characters into a single text block or word leaves fewer items for Stage 2 to compare.
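
The first stage could be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the word shape, the function names, and both ratio parameters (`maxGapRatio`, `spaceGapRatio`) are assumptions chosen for the example.

```javascript
// Assumed word shape: { text, vertices: [topLeft, topRight, bottomRight, bottomLeft] },
// each vertex being { x, y } in pixels, similar to Vision's boundingPoly.vertices.

// Axis-aligned extent of a word's polygon.
function extent(word) {
  const xs = word.vertices.map((v) => v.x);
  const ys = word.vertices.map((v) => v.y);
  return {
    left: Math.min(...xs),
    right: Math.max(...xs),
    top: Math.min(...ys),
    bottom: Math.max(...ys),
  };
}

// Stage 1 sketch: merge a word into an existing block when their vertical
// ranges overlap and the horizontal gap is small relative to the text height.
// Both ratios are assumed tuning values, not taken from the original algorithm.
function mergeNearbyWords(words, maxGapRatio = 0.6, spaceGapRatio = 0.15) {
  const sorted = [...words].sort((a, b) => extent(a).left - extent(b).left);
  const blocks = [];
  for (const word of sorted) {
    const box = extent(word);
    let gap = 0;
    let height = 0;
    const target = blocks.find((block) => {
      const b = extent(block);
      const overlap = Math.min(b.bottom, box.bottom) - Math.max(b.top, box.top);
      height = Math.min(b.bottom - b.top, box.bottom - box.top);
      gap = box.left - b.right;
      return overlap > 0.5 * height && gap <= maxGapRatio * height;
    });
    if (!target) {
      blocks.push({ text: word.text, vertices: word.vertices.map((v) => ({ ...v })) });
      continue;
    }
    // Tiny gaps are treated as a split token ("$3." + "40"); wider ones get a space.
    target.text += (gap > spaceGapRatio * height ? " " : "") + word.text;
    if (box.right > extent(target).right) {
      // Keep the block's left edge and take the right edge from the new word.
      target.vertices[1] = { ...word.vertices[1] };
      target.vertices[2] = { ...word.vertices[2] };
    }
  }
  return blocks;
}

// Helper for the example only: an upright rectangular word.
function rect(text, left, top, right, bottom) {
  return {
    text,
    vertices: [
      { x: left, y: top },
      { x: right, y: top },
      { x: right, y: bottom },
      { x: left, y: bottom },
    ],
  };
}

const stageOneBlocks = mergeNearbyWords([
  rect("Coffee", 10, 10, 70, 30),
  rect("$3.", 300, 10, 330, 30),
  rect("40", 332, 10, 352, 30),
]);
console.log(stageOneBlocks.map((b) => b.text)); // [ 'Coffee', '$3.40' ]
```

The product name stays a separate block because the gap to the price is far larger than the text height; linking the two is left to the second stage.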

The Stage 2 algorithm draws an imaginary bounding polygon (with a threshold) over the words and computes which words belong to each line.
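
One way to picture the bounding polygon step is the sketch below. It extends the top and bottom edges of a seed word across the page, widens that band by a threshold, and assigns every word whose centre falls inside the band to the same line. The function names, the word shape, and the default `threshold` value are assumptions for illustration, and the original implementation may differ in detail.

```javascript
// Assumed word shape: { text, vertices: [topLeft, topRight, bottomRight, bottomLeft] }.

// y-coordinate at horizontal position x on the infinite line through p and q.
function yOnLine(p, q, x) {
  if (q.x === p.x) return (p.y + q.y) / 2; // degenerate (vertical) edge
  const slope = (q.y - p.y) / (q.x - p.x);
  return p.y + slope * (x - p.x);
}

// Centre point of a word's polygon.
function centroid(word) {
  const n = word.vertices.length;
  return {
    x: word.vertices.reduce((sum, v) => sum + v.x, 0) / n,
    y: word.vertices.reduce((sum, v) => sum + v.y, 0) / n,
  };
}

// Stage 2 sketch: threshold is the extra margin as a fraction of the seed
// word's height (an assumed default, not a value from the original algorithm).
function groupIntoLines(words, threshold = 0.25) {
  const remaining = [...words].sort((a, b) => centroid(a).y - centroid(b).y);
  const lines = [];
  while (remaining.length > 0) {
    const seed = remaining.shift(); // topmost unassigned word
    const [tl, tr, br, bl] = seed.vertices;
    const height = (bl.y - tl.y + (br.y - tr.y)) / 2;
    const margin = threshold * height;
    const line = [seed];
    // Walk backwards so splice() does not skip elements.
    for (let i = remaining.length - 1; i >= 0; i--) {
      const c = centroid(remaining[i]);
      const top = yOnLine(tl, tr, c.x) - margin;
      const bottom = yOnLine(bl, br, c.x) + margin;
      if (c.y >= top && c.y <= bottom) {
        line.push(remaining[i]);
        remaining.splice(i, 1);
      }
    }
    line.sort((a, b) => centroid(a).x - centroid(b).x);
    lines.push(line.map((w) => w.text).join(" "));
  }
  return lines;
}

// Slightly slanted two-line invoice: names on the left, prices far to the right.
const invoiceWords = [
  { text: "Coffee", vertices: [{ x: 10, y: 10 }, { x: 70, y: 12 }, { x: 70, y: 32 }, { x: 10, y: 30 }] },
  { text: "$3.40", vertices: [{ x: 300, y: 20 }, { x: 352, y: 22 }, { x: 352, y: 42 }, { x: 300, y: 40 }] },
  { text: "Tea", vertices: [{ x: 10, y: 40 }, { x: 40, y: 41 }, { x: 40, y: 61 }, { x: 10, y: 60 }] },
  { text: "$2.00", vertices: [{ x: 300, y: 50 }, { x: 352, y: 52 }, { x: 352, y: 72 }, { x: 300, y: 70 }] },
];
const invoiceLines = groupIntoLines(invoiceWords);
console.log(invoiceLines); // [ 'Coffee $3.40', 'Tea $2.00' ]
```

Because the band follows the slope of the seed word's edges rather than a fixed horizontal strip, a mild slant does not break the grouping. A larger threshold tolerates more distortion but risks pulling in words from adjacent lines.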

Issues.

The algorithm works for most slanted and slightly crumpled images, but it fails on highly crumpled or folded images.

Future Work

Implement the water-flow algorithm for line segmentation and compare its accuracy with the bounding polygon approach.