boa-statement-parser
v2.0.0
Production-ready Bank of America statement PDF parser with transaction categorization
A production-ready Node.js library and CLI for parsing Bank of America bank statement PDFs into clean, normalized, categorized JSON with full JSON Schema validation.
Features
- Multi-format support — Checking, savings, and credit card statements
- Batch processing — Process entire directories with smart deduplication
- 70+ categorization rules — Priority-ordered with confidence tiers
- ML categorization — Optional TensorFlow.js hybrid approach
- Channel detection — CHECKCARD, ATM, Zelle, Online Banking, etc.
- Multiple export formats — JSON, CSV, OFX
- Schema validation — AJV (Draft 2020-12) + Zod
- Unified sync pipeline — PDF + Plaid + Supabase with gap-fill and DB as source of truth
- Integrations — Supabase persistence, Plaid live banking
- TypeScript-first — Full type safety with strict mode
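As a rough illustration of what "priority-ordered with confidence tiers" can mean in practice, here is a minimal sketch of a rule-based categorizer. The `Rule` shape, the `categorize` function, the example patterns, and the category names are all hypothetical and are not taken from boa-statement-parser; the library's actual rules and types may look quite different.

```typescript
// Hypothetical sketch: none of these names come from boa-statement-parser.
type Confidence = 'high' | 'medium' | 'low';

interface Rule {
  pattern: RegExp;        // matched against the raw transaction description
  category: string;
  priority: number;       // higher priority is checked first
  confidence: Confidence; // how much to trust a match from this rule
}

interface CategoryMatch {
  category: string;
  confidence: Confidence;
}

// Illustrative rules only. Specific merchants outrank generic channel keywords.
const rules: Rule[] = [
  { pattern: /\bZELLE\b/i, category: 'Transfer', priority: 90, confidence: 'high' },
  { pattern: /\bSTARBUCKS\b/i, category: 'Coffee', priority: 80, confidence: 'high' },
  { pattern: /\bCHECKCARD\b/i, category: 'Shopping', priority: 10, confidence: 'low' },
];

function categorize(description: string, ruleSet: Rule[] = rules): CategoryMatch {
  // Sort a copy so the caller's array is left untouched.
  const ordered = [...ruleSet].sort((a, b) => b.priority - a.priority);
  // The first matching rule in priority order wins.
  const hit = ordered.find((rule) => rule.pattern.test(description));
  return hit
    ? { category: hit.category, confidence: hit.confidence }
    : { category: 'Uncategorized', confidence: 'low' };
}

console.log(categorize('CHECKCARD 0412 STARBUCKS STORE 123'));
// → { category: 'Coffee', confidence: 'high' }
```

In this sketch a description that matches both a merchant rule and the generic CHECKCARD rule resolves to the merchant, because the merchant rule has the higher priority.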
Installation
npm install -g boa-statement-parser # global
npm install boa-statement-parser # local
Quick Start
# Initialize project (creates .env, ML model, statements dir)
parse-boa init
# Parse a single PDF
parse-boa ./statement.pdf --out result.json
# Batch process a directory
parse-boa --inputDir ./statements --out result.json --verbose

import { parseStatementFile } from 'boa-statement-parser';
const result = await parseStatementFile('./statement.pdf', {
  strict: true,
  verbose: false,
});
console.log(result.statement.transactions);
Documentation
| Guide | Description |
|-------|-------------|
| CLI Reference | All commands, options, and usage examples |
| Programmatic Usage | Library API, advanced usage, and code examples |
| Output Schema | JSON schema structure, v1/v2 versioning, deduplication |
| Categorization | Rule-based and ML categorization, training |
| Channels & References | Transaction channel types and bank reference extraction |
| Recurring Transactions | Subscription and recurring payment detection |
| Export Formats | CSV and OFX export details |
| Supabase Integration | Database storage, analytics views, RLS |
| Plaid Integration | Live banking sync, reconciliation, webhooks |
| Environment Variables | All configuration options |
| Architecture | Project structure, parsing pipeline, extensibility |
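Batch processing with deduplication matters when overlapping statements contain the same transaction. The sketch below shows one simple way such deduplication could work, by building a normalized key from the date, amount, and description. The `Txn` shape, `dedupeKey`, and `dedupe` are assumptions made for illustration; this README does not describe the library's real strategy or field names, so treat this as a conceptual example only.

```typescript
// Hypothetical sketch: the Txn shape and helpers are assumptions, not the library's API.
interface Txn {
  date: string;        // ISO date such as "2024-03-01" (assumed format)
  description: string;
  amount: number;      // negative for debits (assumed convention)
}

// Build a key that ignores case and whitespace differences in the description.
function dedupeKey(t: Txn): string {
  const desc = t.description.trim().replace(/\s+/g, ' ').toUpperCase();
  return `${t.date}|${t.amount.toFixed(2)}|${desc}`;
}

// Keep the first occurrence of each key and preserve input order.
function dedupe(txns: Txn[]): Txn[] {
  const seen = new Set<string>();
  const unique: Txn[] = [];
  for (const t of txns) {
    const key = dedupeKey(t);
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(t);
    }
  }
  return unique;
}

const merged = dedupe([
  { date: '2024-03-01', description: 'CHECKCARD 0301 ACME', amount: -12.5 },
  { date: '2024-03-01', description: 'checkcard 0301   acme ', amount: -12.5 },
  { date: '2024-03-02', description: 'ATM WITHDRAWAL', amount: -40 },
]);
console.log(merged.length); // → 2
```

A key this simple would also collapse two genuinely separate purchases with the same date, amount, and description, so a real implementation would likely need an extra discriminator such as a bank reference number.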
Development
pnpm build # Build
pnpm test # Run all tests
pnpm test:watch # Watch mode
pnpm test:coverage # With coverage
pnpm lint # Check for issues
pnpm lint:fix # Auto-fix
pnpm format # Format with Prettier
License
MIT
Contributing
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Run pnpm lint && pnpm test
- Submit a pull request
