@ming822/ocr-reciept-helper
v1.0.2
Published
Extract all the pairs of an item and a price of the reciept object from google ocr api response
Maintainers
Readme
Ocr Reciept Helper
Ocr Reciept Helper is a javascript library for extracting all the pairs of an item and a price of a reciept object from google ocr api response. Notice
- Specialized in the forms of reciepts in Korea
- Using Linear equation when detect a line of an item and a price
- Work well with Rotated or sligtly curved reciepts
- Poor work with curved or folded reciepts
Installation
Use npm to install ocr reciept helper.
npm install @ming822/ocr-reciept-helperUsage
response.jsonbelow is a response from your [google ocr api request] (https://cloud.google.com/vision/docs/ocr?)- parameters when declare new instance
resJson: required. the object of google api response. It must have textAnnotations.placeName: the name of the place of receipt.sumKeywords- the words which refer sum.
- they play a role of startline to detect lines of reciepts.
- recommend to use common words at the bottom parts of receipt as it detects from bottom to top.
dateKeywords: the words which refer date.skipwords: all the unnecessary words
- variables of an instance
whitespace- the minimum distance of phrase
- an average width of all the character * 2
- return value of
calcSpacemethod
dateInfo- lines which contains dateKeywords.
- also check lines of yyyy-mm-dd or yyyy-m-d through
findDateingetDateInfo
priceTable- all the pairs of an item and a price of reciepts
- pairs which are expected to be below the skipwords and above the sumKeywords are extracted through
filterSkipwords
const recieptHelper = require('@ming822/ocr-reciept-helper')
const res = fs.readFileSync('response.json', {encoding:'utf-8', flag:'r'})
const resJson = JSON.parse(res)
const reciept = new recieptHelper(resJson)
console.log(reciept.priceTable)Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
