@temboplus/crdb-narration-helper
v0.1.2
Extract sender names from CRDB bank transaction narrations with OCR corruption handling
A TypeScript library for extracting sender names from CRDB (Cooperative and Rural Development Bank) transaction narrations, with built-in OCR corruption handling and support for the payment method formats used by CRDB systems.
Features
- ✅ CRDB-Specific Payment Methods: Supports SIMAPP FT, IB FT, AGENCY FT, SWIFT, ESB TIPS formats used by CRDB
- ✅ OCR Corruption Handling: Automatically fixes common text fragmentation issues in CRDB statements
- ✅ Name Sanitization: Reconstructs fragmented names and filters invalid entries
- ✅ Tanzanian Banking Support: Handles English and French banking terminology used in Tanzania
- ✅ TypeScript: Full type safety and IntelliSense support
- ✅ Zero Dependencies: Lightweight with no external dependencies
- ✅ Comprehensive Tests: Well-tested with real-world CRDB transaction examples
Installation
npm install @temboplus/crdb-narration-helper
Quick Start
import { NarrationHelper } from '@temboplus/crdb-narration-helper';
const helper = new NarrationHelper();
// Extract sender name from a transaction narration
const narration = "REF:123456 SIMAPP FT FROM JOHN DOE SMITH TO TEMBOPLUS XTVT00012345TZS";
const senderName = helper.extractSenderName(narration);
console.log(senderName); // "JOHN DOE SMITH"
Supported CRDB Transaction Formats
The library supports extraction from the following CRDB payment method formats:
1. SIMAPP FT (Mobile App Transfers)
REF:123456 SIMAPP FT FROM JOHN DOE TO TEMBOPLUS XTVT00012345TZS
2. IB FT (Internet Banking Transfers)
REF:123456 IB FT FROM MARY SMITH TO TEMBOPLUS XTVT00012345TZS
3. AGENCY FT (Agency Banking Transfers)
REF:123456 AGENCY FT FROM PETER JONES TO TEMBOPLUS XTVT00012345TZS
4. SWIFT International Transfers
TZ#226IBOT252450501#WENMING CHEN#ROC/XTVT00012820TZS
INWS SWIFT TZ ABC123XYZ MICHAEL BROWN USD BEING PAYMENT
5. ESB TIPS (Electronic Settlement System)
REF:123456 ESB TIPS STANBIC 240215-TXN123 123456789 DAVID WILSON TO TEMBOPLUS
6. Cash Deposits and French Banking
Cash Deposit EMMA WATSON TZ123456789
Sortie Caisse DAVID MILLER TZ987654321
CRDB OCR Corruption Handling
The library automatically handles common OCR corruption issues found in CRDB statements:
// Fragmented words are automatically fixed
const corrupted = "REF:123 SIMAPP FT FRO M JOHN DOE TO TEMB OPLUS";
const result = helper.extractSenderName(corrupted);
console.log(result); // "JOHN DOE"
// Fragmented names are reconstructed
const fragmented = "REF:123 SIMAPP FT FROM FR IDA SMITH TO TEMBOPLUS";
const result2 = helper.extractSenderName(fragmented);
console.log(result2); // "FRIDA SMITH"
API Reference
NarrationHelper
extractSenderName(narration: string): string | null
Extracts and returns the sender name from a bank transaction narration.
Parameters:
narration - The bank transaction narration string
Returns:
string - The extracted sender name, or null if not found or invalid
Example:
const helper = new NarrationHelper();
const result = helper.extractSenderName("REF:123 IB FT FROM ALICE WONDER TO TEMBOPLUS");
console.log(result); // "ALICE WONDER"
sanitizeText(text: string): string
Sanitizes text by fixing common OCR corruption and fragmentation issues.
Parameters:
text - The raw text to sanitize
Returns:
string - Sanitized text with common corruptions fixed
Example:
const helper = new NarrationHelper();
const result = helper.sanitizeText("FRO M JOHN TO TEMB OPLUS");
console.log(result); // "FROM JOHN TO TEMBOPLUS"
sanitizeName(name: string): string | null
Sanitizes and validates extracted names by reconstructing fragmented words and filtering invalid entries.
Parameters:
name - The raw extracted name to sanitize
Returns:
string - Sanitized name, or null if invalid
Example:
const helper = new NarrationHelper();
const result = helper.sanitizeName("M ARY SMITH");
console.log(result); // "MARY SMITH"
Validation Rules
The library applies the following validation rules to extracted names:
- ❌ Names containing numbers are rejected
- ❌ Names containing "TEMBOPLUS" (system reference) are rejected
- ✅ Fragmented single/double character words are joined with the next word
- ✅ Abbreviations ending with punctuation (like "A.") are preserved as separate words
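The rules above can be sketched in a few lines of TypeScript. This is a minimal illustration under stated assumptions, not the library's actual code: `sanitizeNameSketch` and `SYSTEM_REFERENCE` are hypothetical names, only a trailing period is treated as abbreviation punctuation, and a trailing fragment with no following word is kept as its own word.

```typescript
// Hypothetical sketch of the validation rules listed above.
// NOT the library's implementation; names and edge-case choices are assumptions.
const SYSTEM_REFERENCE = "TEMBOPLUS";

function sanitizeNameSketch(rawName: string): string | null {
  const words = rawName.trim().split(/\s+/).filter((w) => w.length > 0);
  if (words.length === 0) return null;

  const rebuilt: string[] = [];
  let pending = ""; // short fragments waiting to be joined to the next word

  for (const word of words) {
    if (/\.$/.test(word)) {
      // Abbreviation such as "A.": keep it as a separate word.
      // Assumption: any pending fragment is flushed as its own word first.
      if (pending) rebuilt.push(pending);
      pending = "";
      rebuilt.push(word);
    } else if (word.length <= 2) {
      // One- or two-character fragment: hold it for the next word.
      pending += word;
    } else {
      rebuilt.push(pending + word);
      pending = "";
    }
  }
  // Assumption: a trailing fragment with nothing to join to is kept as-is.
  if (pending) rebuilt.push(pending);

  const name = rebuilt.join(" ");
  if (/\d/.test(name)) return null; // names containing numbers are rejected
  if (name.toUpperCase().includes(SYSTEM_REFERENCE)) return null; // system reference
  return name;
}

console.log(sanitizeNameSketch("M ARY SMITH")); // "MARY SMITH"
console.log(sanitizeNameSketch("JOHN A. DOE")); // "JOHN A. DOE"
console.log(sanitizeNameSketch("JOHN123")); // null
```

One consequence of the joining rule as stated is that genuinely short name parts (for example a two-letter given name) would also be merged with the following word, so a real implementation may need extra heuristics.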
Error Handling
The library gracefully handles various edge cases:
const helper = new NarrationHelper();
// Returns null for invalid inputs
console.log(helper.extractSenderName("")); // null
console.log(helper.extractSenderName(null)); // null
// Returns null for unrecognized patterns
console.log(helper.extractSenderName("Random text")); // null
// Returns null for names with numbers
console.log(helper.extractSenderName("REF:123 SIMAPP FT FROM JOHN123 TO TEMBOPLUS")); // null
Development
# Install dependencies
npm install
# Run tests
npm test
# Run tests in watch mode
npm run test:watch
# Build the project
npm run build
# Clean build files
npm run clean
Testing
The library includes comprehensive tests covering:
- All supported transaction formats
- OCR corruption handling
- Name sanitization and validation
- Edge cases and error conditions
Run tests with:
npm test
License
MIT License - see LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit issues and pull requests.
Changelog
0.1.0
- Initial CRDB-specific release
- Support for 13+ CRDB transaction patterns
- OCR corruption handling for CRDB statements
- Name sanitization and validation
- Comprehensive test suite with CRDB examples
- TypeScript support
- Tanzanian banking terminology support
