@kodexa/kodexa-document
v8.0.19857160443
TypeScript models for Kodexa Document
Kodexa Document TypeScript SDK
A TypeScript implementation of the Kodexa Document model for working with structured documents.
Installation
npm install @kodexa/kodexa-document
Overview
The Kodexa Document TypeScript SDK provides a comprehensive framework for working with structured documents. It enables developers to create, load, manipulate, and query documents with a hierarchical node structure. The SDK offers a powerful selector language (similar to XPath) for extracting specific content from documents based on complex criteria.
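To make the idea of a hierarchical node structure queried by an XPath-like selector concrete, here is a minimal, self-contained sketch. It does not use the SDK: `ToyNode` and `selectDescendants` are hypothetical names, and the behaviour (collect every node of a given type at or below a starting node, in document order) is an assumption modelled on XPath's `//name` form, not a description of how the SDK's selector engine is implemented.

```typescript
// Toy model only: ToyNode and selectDescendants are hypothetical names,
// not part of @kodexa/kodexa-document.
interface ToyNode {
  nodeType: string;
  content?: string;
  children: ToyNode[];
}

// Return every node of the given type at or below `root`, in document order.
function selectDescendants(root: ToyNode, nodeType: string): ToyNode[] {
  const matches: ToyNode[] = [];
  const visit = (node: ToyNode): void => {
    if (node.nodeType === nodeType) matches.push(node);
    node.children.forEach((child) => visit(child));
  };
  visit(root);
  return matches;
}

// A small tree: one root with two paragraph children.
const toyRoot: ToyNode = {
  nodeType: 'root',
  children: [
    { nodeType: 'paragraph', content: 'This is a paragraph', children: [] },
    { nodeType: 'paragraph', content: 'This is another paragraph', children: [] },
  ],
};

const toyParagraphs = selectDescendants(toyRoot, 'paragraph');
console.log(toyParagraphs.map((n) => n.content));
```

The real selector language supports more than a single node-type match; this sketch only shows the tree-walking idea behind a descendant query.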
Key Features
- Create and manipulate hierarchical document structures
- Add, update, and remove content nodes and features
- Query documents using a powerful selector language
- Tag content for classification and extraction
- Track document processing steps
- Store and retrieve external data
Usage Examples
Creating a Document
import { Document, DocumentMetadata } from '@kodexa/kodexa-document';
// Create a new document
const document = new Document(new DocumentMetadata());
// Create a root node
const rootNode = document.createNode('root', 'Root content');
document.contentNode = rootNode;
// Add child nodes
rootNode.addChild(document.createNode('paragraph', 'This is a paragraph'));
rootNode.addChild(document.createNode('paragraph', 'This is another paragraph'));
Creating a Document from Text
import { Document } from '@kodexa/kodexa-document';
// Create a document from text
const document = Document.fromText('Hello World');
Querying Documents
import { Document } from '@kodexa/kodexa-document';
// Create a document with some content
const document = Document.fromText('Hello World');
// Select nodes using selectors
const nodes = document.select('//text');
// Select the first matching node
const firstNode = document.selectFirst('//text');
Adding Features to Nodes
import { Document } from '@kodexa/kodexa-document';
// Create a document with some content
const document = Document.fromText('Hello World');
// Add a feature to the root node
document.contentNode?.addFeature('metadata', 'language', 'en');
// Get features
const features = document.contentNode?.getFeatures();
Tagging Content
import { Document } from '@kodexa/kodexa-document';
// Create a document with some content
const document = Document.fromText('Hello World');
// Tag the content
document.contentNode?.tag('important', { confidence: 0.95 });
// Get tags
const tags = document.contentNode?.getTags();
API Reference
Document
The main class for working with documents.
- constructor(metadata?: DocumentMetadata, source?: SourceMetadata, ref?: string): Create a new document
- static fromText(text: string): Create a document from text
- createNode(nodeType: string, content?: string, virtual?: boolean): Create a new content node
- select(selector: string, params?: Record<string, any>): Select nodes using a selector
- selectFirst(selector: string, params?: Record<string, any>): Select the first matching node
- getRoot(): Get the root node of the document
- getSteps(): Get the processing steps
- setSteps(steps: Array<ProcessingStep>): Set the processing steps
- getExternalData(): Get external data
- setExternalData(externalData: Record<string, any>): Set external data
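As a sketch of how `getExternalData()` and `setExternalData()` might be combined, the helper below shallow-merges new keys into whatever is already stored. It is written against a small structural interface rather than the SDK's `Document` class, so it runs on its own. The return type of `getExternalData()` (a plain object, returned synchronously) is an assumption, and `ExternalDataHolder`, `mergeExternalData`, and `StubHolder` are hypothetical names introduced for illustration.

```typescript
// Structural type covering only the two external-data methods.
// Assumption: getExternalData() returns a plain object synchronously.
interface ExternalDataHolder {
  getExternalData(): Record<string, any>;
  setExternalData(externalData: Record<string, any>): void;
}

// Hypothetical helper: shallow-merge `patch` over the existing external data.
function mergeExternalData(
  holder: ExternalDataHolder,
  patch: Record<string, any>,
): Record<string, any> {
  const merged = { ...holder.getExternalData(), ...patch };
  holder.setExternalData(merged);
  return merged;
}

// In-memory stand-in used only to exercise the helper; not the SDK's Document.
class StubHolder implements ExternalDataHolder {
  private data: Record<string, any> = {};
  getExternalData(): Record<string, any> {
    return this.data;
  }
  setExternalData(externalData: Record<string, any>): void {
    this.data = externalData;
  }
}

const stubHolder = new StubHolder();
mergeExternalData(stubHolder, { source: 'upload' });
const mergedData = mergeExternalData(stubHolder, { reviewed: true });
console.log(mergedData);
```

If the assumed signatures hold, a `Document` instance could be passed in place of `StubHolder` without changing the helper.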
ContentNode
Represents a node in the document hierarchy.
- constructor(document: Document, nodeType: string, id?: number, content?: string): Create a new content node
- getParent(): Get the parent node
- getChildren(): Get child nodes
- addChild(child: ContentNode, index?: number): Add a child node
- removeChild(contentNode: ContentNode): Remove a child node
- addFeature(featureType: string, name: string, value: any): Add a feature to the node
- getFeatures(): Get all features
- getFeature(featureType: string, name: string): Get a specific feature
- tag(name: string, options?: any): Add a tag to the node
- getTags(): Get all tags
- getTag(name: string): Get tags by name
- removeTag(name: string): Remove a tag
- select(selector: string, params?: Record<string, any>): Select nodes using a selector
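Because `getChildren()` exposes the hierarchy directly, a tree can also be walked without selectors. The sketch below counts the nodes in a subtree using recursion. It assumes `getChildren()` returns an array synchronously; `NodeLike`, `countNodes`, and the stub objects are hypothetical and stand in for real `ContentNode` instances so that the example is self-contained.

```typescript
// Structural type covering only getChildren().
// Assumption: it returns an array of child nodes synchronously.
interface NodeLike {
  getChildren(): NodeLike[];
}

// Count a node plus all of its descendants, depth first.
function countNodes(node: NodeLike): number {
  return node
    .getChildren()
    .reduce((sum, child) => sum + countNodes(child), 1);
}

// Stub tree for illustration: a root with one leaf child and one
// child that itself has a single leaf child (four nodes in total).
const makeLeaf = (): NodeLike => ({ getChildren: () => [] });
const stubTree: NodeLike = {
  getChildren: () => [makeLeaf(), { getChildren: () => [makeLeaf()] }],
};

const totalNodes = countNodes(stubTree);
console.log(totalNodes);
```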
ContentFeatureClass
Represents a feature associated with a content node.
- constructor(featureType: string, name: string, value: any): Create a new feature
- getValue(): Get the feature value
- toString(): Get a string representation of the feature
- toDict(): Convert the feature to a dictionary
Tag
Represents a tag applied to a content node.
- constructor(start?: number, end?: number, value?: string, uuid?: string, data?: any): Create a new tag
- toDict(): Convert the tag to a dictionary
Running Tests
To run the tests:
# From the lib/typescript directory
npm install
npm test
Building the Package
To build the package:
# From the lib/typescript directory
npm run build
License
ISC
