findata-kit

v2.2.0

Published

a month ago

Extensible financial data toolkit — parse bank statement PDFs, sync live data via Plaid, persist to Supabase. Ships with Bank of America integration; add Chime, Capital One, or any institution.


ruben.avetisyan

findata financial-data bank-statement-parser pdf-parser plaid supabase transaction-categorization bank-of-america capital-one chime ofx csv

findata

An extensible financial data toolkit for Node.js — parse bank statement PDFs, sync live transactions via Plaid, persist to Supabase, and export to JSON/CSV/OFX. Ships with Bank of America as the first institution integration; designed so you can add Chime, Capital One, Self.inc, or any other institution.

Why findata?

Most financial tools are locked to one bank or one data source. findata gives you a single pipeline that combines offline PDFs, live Plaid data, and a Supabase database — with pluggable institution parsers so you're never locked in.

Features

Core Platform

Pluggable institution parsers — Add any bank's PDF format as a parser module
Unified sync pipeline — PDF + Plaid + Supabase with automatic gap-fill; database as source of truth
Plaid integration — Live transaction sync, cursor-based incremental updates, reconciliation
Supabase persistence — Normalized schema, analytics views, RLS, human corrections
70+ categorization rules — Priority-ordered with confidence tiers
ML categorization — Optional TensorFlow.js hybrid approach (rules + neural network)
Multiple export formats — JSON (v1/v2 schema), CSV, OFX 2.2
Schema validation — AJV (Draft 2020-12) + Zod runtime validation
Recurring detection — Automatic subscription and recurring payment identification
TypeScript-first — Full type safety with strict mode
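As a rough illustration of what "priority-ordered with confidence tiers" can mean in practice, the sketch below evaluates rules in priority order and returns the first match. The `Rule` shape, the tier names, the sample patterns, and the "lower number wins" convention are all assumptions made for this example; they are not taken from findata's actual rule set.

```typescript
// Hypothetical types for illustration; findata's real rule shape may differ.
type Confidence = 'high' | 'medium' | 'low';

interface Rule {
  priority: number;       // assumed convention: lower number is evaluated first
  pattern: RegExp;        // matched against the transaction description
  category: string;
  confidence: Confidence;
}

interface Categorized {
  category: string;
  confidence: Confidence;
}

// Sample rules, deliberately listed out of order.
const rules: Rule[] = [
  { priority: 20, pattern: /AMAZON/i, category: 'Shopping', confidence: 'medium' },
  { priority: 10, pattern: /AMAZON PRIME/i, category: 'Subscriptions', confidence: 'high' },
  { priority: 30, pattern: /PAYROLL|DIRECT DEP/i, category: 'Income', confidence: 'high' },
];

// Sort once so the first match is always the highest-priority rule.
const orderedRules = [...rules].sort((a, b) => a.priority - b.priority);

function categorize(description: string): Categorized {
  for (const rule of orderedRules) {
    if (rule.pattern.test(description)) {
      return { category: rule.category, confidence: rule.confidence };
    }
  }
  // No rule matched: fall back to a low-confidence default.
  return { category: 'Uncategorized', confidence: 'low' };
}
```

With these sample rules, a description containing "AMAZON PRIME" is caught by the more specific rule before the generic "AMAZON" rule gets a chance to match, which is the main reason to order rules by priority.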

Supported Institutions

| Institution | Status | Parser Module |
|-------------|--------|---------------|
| Bank of America | ✅ Shipped | src/parsers/boa/ |
| Chime | 🔜 Planned | src/parsers/chime/ |
| Capital One | 🔜 Planned | src/parsers/capitalone/ |
| Self.inc | 🔜 Planned | src/parsers/self/ |
| Your bank | Contribute! | src/parsers/<bank>/ |

The Bank of America parser supports checking, savings, and credit card statements plus "Print Transaction Details" PDFs from online banking.

Installation

npm install -g findata-kit   # global CLI
npm install findata-kit      # library

Quick Start

CLI

# Initialize project (creates .env, ML model, statements dir)
findata init

# Parse a single PDF
findata ./statement.pdf --out result.json

# Batch process a directory
findata --inputDir ./statements --out result.json --verbose

# Unified build: PDF + Plaid live data + Supabase → v2 output
findata plaid build --inputDir ./statements --out result.json

# Plaid-only (no local PDFs, database as source of truth)
findata plaid build --start-date 2025-01-01 --out result.json
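The unified build treats the database as the source of truth and fills gaps from other sources. The sketch below shows one way a gap-fill step could work: records already in the store are kept, and candidate records are added only when no matching record exists. The `Txn` shape, the matching key, and the `gapFill` function are assumptions for illustration, not findata's reconciliation logic; real bank descriptions often differ between a PDF and Plaid, so a production matcher would likely need fuzzier comparison.

```typescript
// Hypothetical transaction shape for this sketch.
interface Txn {
  date: string;        // ISO date, e.g. "2025-01-15"
  amountCents: number; // signed integer cents avoids float comparison issues
  description: string;
}

// Key used to decide whether two records describe the same transaction.
const keyOf = (t: Txn): string =>
  `${t.date}|${t.amountCents}|${t.description.trim().toLowerCase()}`;

function gapFill(existing: Txn[], candidates: Txn[]): Txn[] {
  // Count existing records per key so genuine same-day duplicates
  // (two identical purchases) are not collapsed into one.
  const remaining = new Map<string, number>();
  for (const t of existing) {
    const k = keyOf(t);
    remaining.set(k, (remaining.get(k) ?? 0) + 1);
  }

  const merged = [...existing];
  for (const t of candidates) {
    const k = keyOf(t);
    const count = remaining.get(k) ?? 0;
    if (count > 0) {
      remaining.set(k, count - 1); // already covered by an existing record
    } else {
      merged.push(t);              // gap: present only in the candidate source
    }
  }
  // ISO dates sort correctly as plain strings.
  return merged.sort((a, b) => a.date.localeCompare(b.date));
}
```

Counting matches per key, rather than using a plain set, means a second identical charge on the same day is still added when the store only holds one of them.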

Library

import { parseStatementFile } from 'findata-kit';

const result = await parseStatementFile('./statement.pdf', {
  strict: true,
  verbose: false,
});
console.log(result.statement.transactions);

Sub-path imports for tree-shaking:

import { reconcileTransactions } from 'findata-kit/plaid';
import { importV2Result } from 'findata-kit/supabase';
import { exportCsv } from 'findata-kit/output';
import { groupByRows } from 'findata-kit/layout';
import { validateOutput } from 'findata-kit/validation';
import { HybridCategorizer } from 'findata-kit/categorization';

Adding a New Institution

The architecture is designed for pluggable bank parsers. To add support for a new institution:

1. Create a parser directory: src/parsers/<bank>/
2. Implement a detection function that identifies the institution from PDF text
3. Create account-type-specific parsers (checking, savings, credit)
4. Add bank-specific categorization rules if needed
5. Register the parser in src/parsers/index.ts

src/parsers/
  boa/                  # Bank of America (shipped)
  chime/                # Chime (example)
    index.ts            # Detection + main parser
    checking-parser.ts  # Checking account logic
    types.ts            # Internal types
  capitalone/           # Capital One (example)
    ...

See Architecture for the full parsing pipeline details.

Documentation

| Guide | Description |
|-------|-------------|
| CLI Reference | All commands, options, and usage examples |
| Programmatic Usage | Library API, advanced usage, and code examples |
| Output Schema | JSON schema structure, v1/v2 versioning |
| Categorization | Rule-based and ML categorization, training |
| Channels & References | Transaction channel types and bank references |
| Recurring Transactions | Subscription and recurring payment detection |
| Export Formats | CSV and OFX export details |
| Supabase Integration | Database storage, analytics views, RLS |
| Plaid Integration | Live banking sync, reconciliation, webhooks |
| Environment Variables | All configuration options |
| Architecture | Parsing pipeline, project structure, extensibility |

Development

pnpm build              # Build
pnpm test               # Run all tests (523 tests)
pnpm test:watch         # Watch mode
pnpm test:coverage      # With coverage
pnpm lint               # Check for issues
pnpm lint:fix           # Auto-fix
pnpm format             # Format with Prettier

License

MIT

Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes with tests
4. Run pnpm lint && pnpm test
5. Submit a pull request

Adding a new bank parser? See Adding a New Institution above and the Architecture guide.