@origints/xlsx
v0.2.0
Published
XLSX parsing for Origins with full traceability to sheets, ranges, and cells.
Downloads
188
Maintainers
Readme
@origints/xlsx
XLSX parsing for Origins with full traceability to sheets, ranges, and cells.
Features
- Parse XLSX files from streams, buffers, or file paths
- Navigate workbooks, sheets, ranges, and cells
- Cursor-based iteration for sequential processing
- Cell predicates for conditional extraction
- Convert ranges to JSON, arrays, or objects
- Export to CSV format
- Full source location tracking for every cell
Installation
npm install @origints/xlsx @origints/coreUsage with Planner
Extract cell values from a spreadsheet
import { Planner, loadFile, run } from '@origints/core'
import { parseXlsx } from '@origints/xlsx'
const plan = new Planner()
.in(loadFile('data.xlsx'))
.mapIn(parseXlsx())
.emit((out, $) =>
out
.add('title', $.firstSheet().cell('A1').string())
.add('revenue', $.firstSheet().cell('B2').number())
)
.compile()
const result = await run(plan, { readFile, registry })
// result.value: { title: 'Q4 Report', revenue: 150000 }Extract from specific sheets
const plan = new Planner()
.in(loadFile('report.xlsx'))
.mapIn(parseXlsx())
.emit((out, $) =>
out
.add('totalSales', $.sheet('Sales').cell('B10').number())
.add('totalExpenses', $.sheet('Expenses').cell('B10').number())
)
.compile()Extract rows from a range
Use range().rows() to iterate over rows in a range and extract structured data:
// Spreadsheet has headers in row 1: Name | Age | Department
// Data rows in A2:C10
const plan = new Planner()
.in(loadFile('employees.xlsx'))
.mapIn(parseXlsx())
.emit((out, $) =>
out.add(
'employees',
$.firstSheet()
.range('A2:C4')
.rows(row => ({
kind: 'object',
properties: {
name: row.col(1).string(),
age: row.col(2).number(),
dept: row.col(3).string(),
},
}))
)
)
.compile()
const result = await run(plan, { readFile, registry })
// result.value: {
// employees: [
// { name: 'Alice', age: 30, dept: 'Engineering' },
// { name: 'Bob', age: 25, dept: 'Marketing' },
// ...
// ]
// }Range from two corners
range() accepts two addresses or two dynamic cell builders:
// Two string addresses — equivalent to range("A2:C10")
$.firstSheet().range("A2", "C10").rows(...)
// Dynamic corners — cells resolved at runtime
const topLeft = $.firstSheet().find(cell.equals("Name"));
const bottomRight = $.firstSheet().find(cell.equals("Total")).left();
$.firstSheet().range(topLeft, bottomRight).rows((row) => ({
kind: "object",
properties: {
name: row.col(1).string(),
value: row.col(2).number(),
},
}));Collect rows with predicates
Use eachSlice() to iterate rows from a starting cell while a predicate holds, with header-relative column access:
import { cell, rowCol } from '@origints/xlsx'
const hasData = rowCol(0, cell.isNotEmpty()).and(
rowCol(0, cell.startsWith('Total').not())
)
const plan = new Planner()
.in(loadFile('report.xlsx'))
.mapIn(parseXlsx())
.emit((out, $) => {
const header = $.firstSheet().find(cell.equals('Name'))
return out.add(
'people',
header.down().eachSlice('down', hasData, row => ({
kind: 'object',
properties: {
name: row.colWhere(header, cell.equals('Name')).string(),
role: row.colWhere(header, cell.equals('Role')).string(),
},
}))
)
})
.compile()Collect cell values in a direction
Use eachCell() to gather cells in a direction while a predicate matches:
import { cell } from '@origints/xlsx'
const plan = new Planner()
.in(loadFile('data.xlsx'))
.mapIn(parseXlsx())
.emit((out, $) =>
out.add(
'values',
$.firstSheet()
.cell('B2')
.eachCell('down', cell.isNotEmpty(), c => c.number())
)
)
.compile()
const result = await run(plan, { readFile, registry })
// result.value: { values: [100, 200, 300, 400] }Optional and fallback extraction
Handle missing or invalid cell values using optional(), tryExtract(), mapSpec(), and guard():
import { literal, optional, tryExtract, mapSpec, guard } from '@origints/core'
import { cell, rowCol } from '@origints/xlsx'
const plan = new Planner()
.in(loadFile('portfolio.xlsx'))
.mapIn(parseXlsx())
.emit((out, $) => {
const header = $.firstSheet().find(cell.equals('Company'))
const hasData = rowCol(0, cell.isNotEmpty()).and(
rowCol(0, cell.startsWith('Total').not())
)
return out.add(
'companies',
header.down().eachSlice('down', hasData, row => ({
kind: 'object',
properties: {
// Required field
name: row.colWhere(header, cell.equals('Company')).string(),
// Optional: returns null when cell is empty or has wrong type
ownership: optional(
row.colWhere(header, cell.equals('Ownership %')).number(),
null
),
// Fallback: try number first, then parse string, then null
revenue: tryExtract(
row.colWhere(header, cell.equals('Revenue')).number(),
mapSpec(
row.colWhere(header, cell.equals('Revenue')).string(),
v => parseFloat((v as string).replace(/[,$]/g, '')),
'parseFloat'
),
literal(null)
),
// Guard: ensure investment amount is positive
investment: guard(
row.colWhere(header, cell.equals('Investment')).number(),
v => (v as number) > 0,
'Investment must be positive'
),
},
}))
)
})
.compile()Combine with other data sources
const plan = new Planner()
.in(loadFile('budget.xlsx'))
.mapIn(parseXlsx())
.emit((out, $) => out.add('budget', $.firstSheet().cell('B2').number()))
.in(loadFile('config.json'))
.mapIn(parseJson())
.emit((out, $) => out.add('department', $.get('department').string()))
.compile()Standalone usage (without Planner)
For direct workbook navigation:
import { parseXlsxAsyncImpl, XlsxWorkbook } from '@origints/xlsx'
const workbook = (await parseXlsxAsyncImpl.execute(buffer)) as XlsxWorkbook
// Navigate sheets
const sheetResult = workbook.sheet('Sheet1')
if (sheetResult.ok) {
const sheet = sheetResult.value
// Read a cell
const cellResult = sheet.cell('A1')
if (cellResult.ok) {
console.log(cellResult.value.string().value)
}
// Work with ranges
const rangeResult = sheet.range('A1:D10')
if (rangeResult.ok) {
const range = rangeResult.value
// Convert to array of arrays
const rows = range.toArray()
// Convert to objects using first row as headers
const records = range.toObjects()
}
}Cursor-based iteration
import { XlsxCursor } from '@origints/xlsx'
const sheetResult = workbook.sheet('Data')
if (sheetResult.ok) {
const cursor = new XlsxCursor(sheetResult.value, 'A1')
// Move and grab values
cursor.move('right', 2)
const value = cursor.grab()
// Iterate rows
while (cursor.hasMore('down')) {
const row = cursor.grabRow(4)
cursor.move('down')
}
}API
| Export | Description |
| ---------------------------------- | ----------------------------------------------------- |
| parseXlsx(options?) | Create a transform AST for use with Planner.mapIn() |
| parseXlsxImpl | Sync transform implementation |
| parseXlsxAsyncImpl | Async transform implementation (stream/buffer) |
| registerXlsxTransforms(registry) | Register all XLSX transforms with a registry |
| XlsxWorkbook | Workbook navigation |
| XlsxSheet | Sheet navigation with cell/range access |
| XlsxRange | Range operations and conversion |
| XlsxCell | Cell value extraction |
| XlsxCursor | Sequential cursor-based iteration |
License
MIT
