@kodexa-ai/document-wasm-ts
v2026.2.0-23158979244
TypeScript WASM wrapper for high-performance Kodexa Document processing using Go backend
Kodexa Document Models - TypeScript WASM Wrapper
High-performance TypeScript wrapper for the Kodexa Go library using WebAssembly. This provides fast document processing capabilities for both Node.js and browser environments.
🚀 Features
- High Performance: Direct access to Go library performance through WebAssembly
- Cross-Platform: Works in both Node.js and browsers
- Type Safe: Full TypeScript support with comprehensive type definitions
- Memory Efficient: Proper memory management with automatic cleanup
- Complete API: All Go library functions available through TypeScript interface
📦 Installation
npm install @kodexa-ai/document-wasm-ts
🏗️ Building from Source
Prerequisites
- Node.js 16+
- Go 1.22+
- TypeScript 5.8+
Build Steps
# Install dependencies
npm install
# Build WASM module only (from Go source)
npm run build:wasm
# Build TypeScript library only
npm run build
# Build everything (WASM + TypeScript)
npm run build:all
# Run tests
npm test
Build Scripts
- npm run build:all - Build both WASM and TypeScript
- npm run build:wasm - Build Go WASM module only (runs make wasm wasm-support in lib/go)
- npm run build - Build TypeScript library only
- npm test - Run test suite
- npm run clean - Clean dist artifacts
🎯 Quick Start
Node.js
import { Kodexa } from '@kodexa-ai/document-wasm-ts';
async function main() {
  // Initialize WASM module
  await Kodexa.init();
  // Create document from text
  const document = await Kodexa.fromText('Hello, world!');
  // Get root node
  const root = await document.getRoot();
  console.log(await root?.getContent()); // "Hello, world!"
  // Cleanup
  document.dispose();
  Kodexa.cleanup();
}
main().catch(console.error);
Browser
<!DOCTYPE html>
<html>
<head>
  <!-- sql.js for in-browser SQLite -->
  <script src="https://cdnjs.cloudflare.com/ajax/libs/sql.js/1.11.0/sql-wasm.js"></script>
  <!-- Bridge script and Go WASM runtime -->
  <script src="node_modules/@kodexa-ai/document-wasm-ts/dist/sqljs-bridge.bundle.js"></script>
  <script src="node_modules/@kodexa-ai/document-wasm-ts/dist/wasm_exec.js"></script>
</head>
<body>
  <script type="module">
    import { Kodexa } from './node_modules/@kodexa-ai/document-wasm-ts/dist/index.js';
    async function run() {
      await Kodexa.init();
      const doc = await Kodexa.fromText('Browser document');
      console.log('Document created!');
      doc.dispose();
    }
    run().catch(console.error);
  </script>
</body>
</html>
📚 API Reference
Kodexa Class
Main entry point for the library:
// Initialize WASM module (required before use)
await Kodexa.init();
// Create documents
const doc1 = await Kodexa.createDocument();
const doc2 = await Kodexa.fromText('text content');
const doc3 = await Kodexa.fromJson('{"data": "json"}');
const doc4 = await Kodexa.fromKddb('/path/to/file.kddb');
// Check if WASM is loaded
const loaded = Kodexa.isLoaded();
// Cleanup resources
Kodexa.cleanup();
GoDocument Class
High-level document operations:
// Create documents
const doc = await GoDocument.create();
const textDoc = await GoDocument.fromText('content');
const jsonDoc = await GoDocument.fromJson('{}');
// Document operations
const root = await doc.getRoot();
const json = await doc.toJson();
const kddbBytes = await doc.toKddb();
// Node management
const node = await doc.createNode('paragraph');
await doc.setContentNode(node);
// Selection
const nodes = await doc.select('paragraph');
const firstNode = await doc.selectFirst('heading');
// Metadata
await doc.setMetadataValue('key', 'value');
const value = await doc.getMetadataValue('key');
// Cleanup
doc.dispose();
GoContentNode Class
Node manipulation and traversal:
// Basic properties
const nodeType = await node.getNodeType();
await node.setNodeType('heading');
const content = await node.getContent();
await node.setContent('New content');
const index = await node.getIndex();
await node.setIndex(0);
// Hierarchy
const parent = await node.getParent();
const children = await node.getChildren();
const childCount = await node.getChildCount();
const child = await node.getChild(0);
await node.addChild(childNode);
// Navigation
const next = await node.nextNode();
const prev = await node.previousNode();
const isFirst = await node.isFirstChild();
const isLast = await node.isLastChild();
// Tagging
await node.tag('important');
await node.tagWithOptions('label', { confidence: 0.95 });
const hasTag = await node.hasTag('important');
await node.removeTag('important');
const tags = await node.getTags();
// Features
await node.setFeature('style', 'color', ['blue']);
const feature = await node.getFeature('style', 'color');
const value = await node.getFeatureValue('style', 'color');
const hasFeature = await node.hasFeature('style', 'color');
const features = await node.getFeatures();
const styleFeatures = await node.getFeaturesOfType('style');
// Spatial data
await node.setBBox(10, 20, 300, 50);
const bbox = await node.getBBox();
const x = await node.getX();
const y = await node.getY();
await node.setRotate(45);
// Selection
const selected = await node.select('span');
const first = await node.selectFirst('span');
// Cleanup
node.dispose();
🎨 Examples
Document Creation and Manipulation
import { Kodexa } from '@kodexa-ai/document-wasm-ts';
async function documentExample() {
  await Kodexa.init();
  // Create document
  const doc = await Kodexa.fromText('Sample document');
  const root = await doc.getRoot();
  // Create nodes
  const heading = await doc.createNode('heading');
  await heading.setContent('Main Title');
  const paragraph = await doc.createNode('paragraph');
  await paragraph.setContent('This is content.');
  // Build hierarchy
  if (root) {
    await root.addChild(heading);
    await root.addChild(paragraph);
  }
  // Tag and style
  await heading.tag('title');
  await paragraph.setFeature('style', 'font-size', ['14px']);
  // Serialize
  const json = await doc.toJson();
  console.log('Document JSON:', json);
  // Cleanup
  doc.dispose();
  Kodexa.cleanup();
}
Advanced Node Operations
async function nodeExample() {
  await Kodexa.init();
  const doc = await Kodexa.createDocument();
  const node = await doc.createNode('paragraph');
  // Spatial positioning
  await node.setBBox(100, 200, 400, 50);
  const bbox = await node.getBBox();
  console.log(`Position: ${bbox?.x},${bbox?.y}`);
  // Multiple features
  await node.setFeature('style', 'color', ['red']);
  await node.setFeature('style', 'weight', ['bold']);
  await node.setFeature('layout', 'margin', ['10px']);
  // Get all style features
  const styleFeatures = await node.getFeaturesOfType('style');
  console.log('Style features:', styleFeatures);
  // Navigation example
  const parent = await node.getParent();
  const siblings = parent ? await parent.getChildren() : [];
  const isLast = await node.isLastChild();
  doc.dispose();
  Kodexa.cleanup();
}
Performance Example
async function performanceExample() {
  await Kodexa.init();
  const start = Date.now();
  const documents = [];
  // Create 1000 documents
  for (let i = 0; i < 1000; i++) {
    const doc = await Kodexa.fromText(`Document ${i}`);
    documents.push(doc);
  }
  const duration = Date.now() - start;
  console.log(`Created 1000 documents in ${duration}ms`);
  // Cleanup
  documents.forEach(doc => doc.dispose());
  Kodexa.cleanup();
}
🧪 Testing
# Run all tests
npm test
# Run with coverage
npm run test:coverage
# Run specific test
npm test -- wasm-document.test.ts
# Run integration tests (requires WASM build)
WASM_INTEGRATION_TEST=true npm test
HTML Test Files
The library includes HTML test files for interactive browser testing. These files must be served via HTTP (not opened directly with file://) due to CORS and ES module requirements.
Available test files:
- test-extraction.html - Test extraction engine functionality
- test-queries.html - Test document query functions (getLines, getNodeTypes, etc.)
- test-minimal.html - Minimal WASM loading and basic functionality test
- kddb-compare.html - Compare kddb file processing between implementations
Serving the test files:
cd lib/typescript
# Option 1: Python (built-in)
python3 -m http.server 8080
# Option 2: Node.js http-server
npx http-server -p 8080
# Option 3: Node.js serve
npx serve -p 8080
Then open http://localhost:8080/test-queries.html or http://localhost:8080/test-extraction.html in your browser.
Note: Make sure you've built the WASM module first with npm run build:all.
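As a rough pre-flight check before serving the test pages, a small Node.js script can confirm that build output is present. The sketch below is an assumption-laden illustration: `missingArtifacts` is a hypothetical helper, and the file names are the ones the Browser quick start loads from `dist/`, not a complete list of what the build produces.

```typescript
import { existsSync } from 'node:fs';
import { join } from 'node:path';

// Hypothetical helper (not part of the library): returns the names from
// `expected` that do not exist inside `distDir`.
function missingArtifacts(distDir: string, expected: string[]): string[] {
  return expected.filter((name) => !existsSync(join(distDir, name)));
}

// File names taken from the Browser quick start; the real build may emit more.
const expectedFiles = ['index.js', 'wasm_exec.js', 'sqljs-bridge.bundle.js'];
const missing = missingArtifacts('dist', expectedFiles);
if (missing.length > 0) {
  console.warn(`Missing build output: ${missing.join(', ')}. Try npm run build:all.`);
}
```

If the warning appears, rebuilding with npm run build:all before starting the HTTP server is the likely fix.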
🔧 Configuration
TypeScript Configuration
The library includes TypeScript definitions. Configure your tsconfig.json:
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "ESNext",
    "moduleResolution": "node",
    "allowSyntheticDefaultImports": true,
    "esModuleInterop": true,
    "strict": true
  }
}
Webpack Configuration
For browser usage with Webpack:
module.exports = {
  resolve: {
    fallback: {
      "fs": false,
      "path": false
    }
  },
  experiments: {
    asyncWebAssembly: true
  }
};
⚡ Performance
The WASM wrapper provides significant performance benefits:
- Document Creation: ~0.1ms per document
- Node Operations: ~0.01ms per operation
- Memory Usage: ~50% less than pure JS implementations
- File I/O: Native Go performance for KDDB files
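Figures like these depend on hardware, runtime, and document size, so it may be worth measuring in your own environment. The following is a minimal sketch of a timing helper. `measure` is a hypothetical name, not part of the library, and it relies only on the standard `performance.now()` clock.

```typescript
// Hypothetical helper (not part of the library): runs an async operation
// `iterations` times in sequence and reports total and mean duration.
async function measure(
  label: string,
  iterations: number,
  operation: (i: number) => Promise<unknown>,
): Promise<{ label: string; totalMs: number; perCallMs: number }> {
  if (!Number.isInteger(iterations) || iterations <= 0) {
    throw new RangeError('iterations must be a positive integer');
  }
  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    await operation(i);
  }
  const totalMs = performance.now() - start;
  return { label, totalMs, perCallMs: totalMs / iterations };
}
```

One way to use it, after Kodexa.init(), is to pass a callback that calls Kodexa.fromText and then dispose() on the result, so that each iteration releases its WASM memory before the next one starts.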
Benchmarks
Operation                | Pure JS  | WASM    | Improvement
-------------------------|----------|---------|------------
Create 1000 documents    | 500ms    | 100ms   | 5x faster
Process large document   | 2000ms   | 400ms   | 5x faster
Memory usage (1MB doc)   | 5MB      | 2.5MB   | 50% less
🔒 Memory Management
Proper memory management is crucial for WASM applications:
// Always dispose of documents and nodes
const doc = await Kodexa.fromText('content');
try {
  // Use document...
} finally {
  doc.dispose(); // Free WASM memory
}
// Cleanup at application end (browser only; window is not defined in Node.js)
window.addEventListener('beforeunload', () => {
  Kodexa.cleanup();
});
🐛 Troubleshooting
Common Issues
WASM module not loading:
// Check if WASM is supported (typeof avoids a ReferenceError when the global is missing)
if (typeof WebAssembly === 'undefined') {
  console.error('WebAssembly not supported');
}
// Check loading
try {
  await Kodexa.init();
} catch (error) {
  console.error('WASM init failed:', error);
}
Memory leaks:
// Always dispose resources
const doc = await Kodexa.fromText('content');
// ... use document
doc.dispose(); // Required!
// Check for undisposed objects
// Use browser dev tools to monitor memory
Performance issues:
// Sequential: each createNode call is awaited before the next one starts
const nodes = [];
for (let i = 0; i < 1000; i++) {
  nodes.push(await doc.createNode('item'));
}
// Better: start the calls together and await them as one batch
const batch = await Promise.all(
  Array.from({ length: 1000 }, () => doc.createNode('item'))
);
📄 License
This project is licensed under the same terms as the main Kodexa project.
📦 Release Process
This package is automatically published to npm with provenance attestation via GitHub Actions.
Automatic Publishing
The package is automatically published on every push to main or develop branches that modifies files in kodexa-document/lib/typescript/.
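The trigger rule can be restated as a small predicate. This is only an illustration of the rule as described here. The real trigger lives in the GitHub Actions workflow, which is not shown in this README, and `shouldPublish` is a hypothetical name.

```typescript
// Branches and path prefix as stated in this README.
const PUBLISH_BRANCHES = ['main', 'develop'];
const PACKAGE_PATH = 'kodexa-document/lib/typescript/';

// Hypothetical predicate: true when a push to `branch` touching
// `changedFiles` would be expected to publish a new version.
function shouldPublish(branch: string, changedFiles: string[]): boolean {
  return (
    PUBLISH_BRANCHES.includes(branch) &&
    changedFiles.some((file) => file.startsWith(PACKAGE_PATH))
  );
}
```

Under this reading, a push to develop that only edits files outside kodexa-document/lib/typescript/ would not publish anything.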
Version Format
Versions follow the format: MAJOR.MINOR.PATCH-BUILDID
- Example: 8.0.0-20484605521
- The build ID is the GitHub Actions run ID, ensuring unique versions
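A version string in this format can be split into its parts with a short helper. The sketch below assumes the build ID is always a run of digits, as in the example. `parseBuildVersion` and `formatBuildVersion` are hypothetical names, not part of the package.

```typescript
interface BuildVersion {
  major: number;
  minor: number;
  patch: number;
  buildId: string; // kept as a string so the run ID is preserved exactly
}

// Assumes MAJOR.MINOR.PATCH-BUILDID with a purely numeric build ID.
const VERSION_PATTERN = /^(\d+)\.(\d+)\.(\d+)-(\d+)$/;

// Returns the parsed parts, or null when the string does not match the format.
function parseBuildVersion(version: string): BuildVersion | null {
  const match = VERSION_PATTERN.exec(version.trim());
  if (!match) return null;
  const [, major, minor, patch, buildId] = match;
  return {
    major: Number(major),
    minor: Number(minor),
    patch: Number(patch),
    buildId,
  };
}

// Joins a base version (e.g. "9.0.0") and a run ID into the published form.
function formatBuildVersion(base: string, runId: string | number): string {
  return `${base}-${runId}`;
}
```

Returning null for anything that does not match keeps callers from treating a plain base version such as 8.0.0 as a published build.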
How to Release
Simply push your changes to develop or main:
# Make your changes to the TypeScript package
git add .
git commit -m "feat: add new feature to document API"
git push origin develop
The workflow will:
- Build the TypeScript package
- Generate a unique version using the GitHub run ID
- Publish to npm with provenance
Bumping Major/Minor Version
To change the base version (e.g., from 8.0.0 to 9.0.0):
- Update the version in package.json: "version": "9.0.0"
- Commit and push:
git add package.json
git commit -m "chore: bump base version to 9.0.0"
git push origin develop
The next build will publish as 9.0.0-<run-id>.
Manual Publish (Dry Run)
To test publishing without actually releasing:
- Go to Actions > Publish TypeScript Package
- Click "Run workflow"
- Check "Dry run" option
- Click "Run workflow"
Build Traceability
Each published version includes the GitHub Actions run ID, allowing you to trace any version back to its exact build and commit.
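To follow that trace, the run ID can be turned into a link to the workflow run. The sketch below uses GitHub's standard run URL layout (`/actions/runs/<run-id>`). The repository slug is a placeholder you must supply, since this README does not name it, and `runUrlForVersion` is a hypothetical helper.

```typescript
// Hypothetical helper: extracts the run ID after the first hyphen and builds
// the GitHub Actions run URL for the given "owner/repo" slug.
// Returns null when the version has no numeric build ID.
function runUrlForVersion(version: string, repository: string): string | null {
  const hyphen = version.indexOf('-');
  if (hyphen === -1) return null;
  const runId = version.slice(hyphen + 1);
  if (!/^\d+$/.test(runId)) return null;
  return `https://github.com/${repository}/actions/runs/${runId}`;
}

// 'example-org/example-repo' is a placeholder, not the real repository.
console.log(runUrlForVersion('8.0.0-20484605521', 'example-org/example-repo'));
```

The run page lists the commit that triggered the build, which gives the link from a published version back to its source.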
🤝 Contributing
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
📞 Support
- Documentation: https://docs.kodexa.com
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Made with ❤️ by the Kodexa team
