@knowhere-ai/sdk
v0.1.4
Official Node.js SDK for Knowhere document parsing API
Knowhere Node.js SDK
Official Node.js/TypeScript SDK for the Knowhere document parsing API.
Features
- 🚀 TypeScript-first - Full type safety with comprehensive type definitions
- 📦 Stream-based uploads - Efficient handling of large files
- 🔄 Automatic retries - Exponential backoff for transient failures
- 📊 Adaptive polling - Smart waiting for job completion
- 🎯 Progressive API - High-level convenience methods + low-level control
- ⚡ Modern JavaScript - ESM and CommonJS support
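To make the "automatic retries" feature concrete, here is a minimal sketch of exponential backoff in general. It is not the SDK's internal code: the names `backoffDelay` and `withRetries`, the 500 ms base delay, and the 30 s cap are all assumptions chosen for illustration.

```typescript
// Illustrative only: a generic exponential backoff, not the SDK's implementation.
// The delay doubles with each attempt and is capped at maxDelayMs.
function backoffDelay(attempt: number, baseDelayMs = 500, maxDelayMs = 30_000): number {
  return Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
}

// Runs `operation`, retrying up to `maxRetries` times while `isTransient` returns true.
async function withRetries<T>(
  operation: () => Promise<T>,
  maxRetries = 5,
  isTransient: (error: unknown) => boolean = () => true,
): Promise<T> {
  let attempt = 0;
  while (true) {
    try {
      return await operation();
    } catch (error) {
      // Give up once the retry budget is spent or the error is not transient.
      if (attempt >= maxRetries || !isTransient(error)) throw error;
      const delay = backoffDelay(attempt);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
      attempt++;
    }
  }
}
```

With these assumed defaults the waits grow as 500 ms, 1 s, 2 s, 4 s, and so on, which keeps a briefly unavailable service from being hit with immediate repeat requests.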
Installation
npm install @knowhere-ai/sdk
Requirements:
- Node.js >= 20.19.0
- npm >= 10.0.0
- TypeScript >= 5.0 (optional, for type checking)
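If an application wants to check the Node.js requirement at startup, a small guard along these lines could be used. This is a sketch under assumptions: `meetsMinimum` is a hypothetical helper, not part of the SDK, and it compares only the numeric parts of a version string.

```typescript
// Hypothetical helper: compares dotted numeric versions such as "20.19.0".
// A leading "v" and any pre-release suffix ("-rc.1") are ignored.
function meetsMinimum(actual: string, minimum: string): boolean {
  const parse = (version: string): number[] =>
    version
      .replace(/^v/, '')
      .replace(/-.*$/, '')
      .split('.')
      .map((part) => Number.parseInt(part, 10) || 0);
  const a = parse(actual);
  const m = parse(minimum);
  for (let i = 0; i < Math.max(a.length, m.length); i++) {
    const diff = (a[i] ?? 0) - (m[i] ?? 0);
    if (diff !== 0) return diff > 0;
  }
  return true; // equal versions satisfy the minimum
}

// Warn rather than throw, so the check never blocks startup by itself.
if (!meetsMinimum(process.versions.node, '20.19.0')) {
  console.warn(`Node.js >= 20.19.0 is required, found ${process.versions.node}`);
}
```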
Quick Start
import Knowhere from '@knowhere-ai/sdk';
// Initialize client
const client = new Knowhere({
  apiKey: process.env.KNOWHERE_API_KEY,
});
// Parse a document from URL
const result = await client.parse({
  url: 'https://example.com/document.pdf',
});
// Access parsed content
console.log(`Found ${result.textChunks.length} text chunks`);
console.log(`Found ${result.imageChunks.length} images`);
console.log(`Found ${result.tableChunks.length} tables`);
// Work with chunks
result.textChunks.forEach((chunk) => {
  console.log(chunk.content);
  console.log(chunk.keywords);
  console.log(chunk.summary);
});
// Save results to disk
await result.save('./output/');
Configuration
Environment Variables
KNOWHERE_API_KEY=sk_... # Required
KNOWHERE_BASE_URL=https://api.knowhereto.ai # Optional
Client Options
const client = new Knowhere({
  apiKey: 'sk_...', // API authentication key
  baseURL: 'https://...', // API base URL
  timeout: 60000, // Request timeout (ms)
  uploadTimeout: 600000, // Upload timeout (ms)
  maxRetries: 5, // Max retry attempts
});
Usage Examples
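As a first example, the environment variables and client options described under Configuration can be tied together with a small helper. This is only a sketch: `optionsFromEnv` is a hypothetical function, and treating `https://api.knowhereto.ai` as the fallback when `KNOWHERE_BASE_URL` is unset is an assumption based on the sample value shown there.

```typescript
interface ClientOptions {
  apiKey: string;
  baseURL: string;
}

// Assumed fallback, taken from the sample KNOWHERE_BASE_URL value in this README.
const DEFAULT_BASE_URL = 'https://api.knowhereto.ai';

// Hypothetical helper: builds client options from an environment-like object.
function optionsFromEnv(env: Record<string, string | undefined>): ClientOptions {
  const apiKey = env['KNOWHERE_API_KEY']?.trim();
  if (!apiKey) {
    // Fail early with a clear message instead of sending unauthenticated requests.
    throw new Error('KNOWHERE_API_KEY is required');
  }
  const baseURL = env['KNOWHERE_BASE_URL']?.trim() || DEFAULT_BASE_URL;
  return { apiKey, baseURL };
}

// Usage (not executed here): new Knowhere(optionsFromEnv(process.env));
```

Validating the key before constructing the client surfaces a missing variable at startup rather than on the first request.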
Parse from File
import { createReadStream } from 'node:fs';
import { readFile } from 'node:fs/promises';
// From file path (recommended)
const resultFromPath = await client.parse({
  file: './document.pdf',
});
// From Buffer
const buffer = await readFile('./document.pdf');
const resultFromBuffer = await client.parse({
  file: buffer,
  fileName: 'document.pdf',
});
// From Stream
const stream = createReadStream('./document.pdf');
const resultFromStream = await client.parse({
  file: stream,
  fileName: 'document.pdf',
});
fileName is inferred automatically when file is a local file path; it must be provided explicitly when file is a Buffer, a Uint8Array, or a stream that carries no path information.
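The fileName rule can be sketched as a small resolver. This illustrates the rule only and is not the SDK's code: `resolveFileName` and the `FileInput` type are hypothetical, and letting an explicit fileName take precedence over an inferred one is an assumption.

```typescript
import { basename } from 'node:path';

// Hypothetical input shape for this sketch; the SDK's real types may differ.
type FileInput = string | Uint8Array | { path?: unknown };

function resolveFileName(file: FileInput, fileName?: string): string {
  if (fileName) return fileName; // assumption: an explicit name wins
  if (typeof file === 'string') return basename(file); // local file path
  // Buffer is a Uint8Array subclass, so this check covers both.
  if (!(file instanceof Uint8Array) && typeof file.path === 'string') {
    return basename(file.path); // stream that exposes its path
  }
  throw new Error('fileName is required for Buffer, Uint8Array, or path-less streams');
}
```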
Advanced Options
const result = await client.parse({
  url: 'https://example.com/doc.pdf',
  model: 'advanced', // 'base' | 'advanced'
  ocr: true, // Enable OCR
  docType: 'pdf', // Document type hint
  smartTitleParse: true, // Smart title detection
  summaryImage: true, // Generate image summaries
  summaryTable: true, // Generate table summaries
  summaryText: true, // Generate text summaries
  addFragDesc: 'Custom context', // Additional fragment description
  kbDir: 'project_docs', // Knowledge base directory
  pollInterval: 10000, // Polling interval (ms)
  pollTimeout: 1800000, // Max wait time (ms)
  verifyChecksum: true, // Verify ZIP checksum (default: true)
  webhook: { // Webhook for completion
    url: 'https://...',
  },
  onUploadProgress: (progress) => {
    console.log(`Upload: ${progress.percent}%`);
  },
  onPollProgress: (status) => {
    console.log(`Status: ${status.status}`);
  },
});
Low-Level API
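Waiting for a job is bounded by the pollInterval and pollTimeout options. A minimal sketch of such a polling loop follows. It is a generic pattern, not the SDK's implementation: `pollUntilSettled`, the `JobStatus` values, and the injectable `sleep` and `now` hooks are all assumptions made so the example is self-contained.

```typescript
// Hypothetical status values; the API's real status names may differ.
type JobStatus = 'pending' | 'running' | 'done' | 'failed';

interface PollOptions {
  pollInterval: number; // ms between status checks
  pollTimeout: number; // ms before giving up
  sleep?: (ms: number) => Promise<void>; // injectable for tests
  now?: () => number; // injectable clock for tests
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function pollUntilSettled(
  fetchStatus: () => Promise<JobStatus>,
  { pollInterval, pollTimeout, sleep = defaultSleep, now = Date.now }: PollOptions,
): Promise<JobStatus> {
  const deadline = now() + pollTimeout;
  while (true) {
    const status = await fetchStatus();
    if (status === 'done' || status === 'failed') return status;
    // Stop if the next check would land after the deadline.
    if (now() + pollInterval > deadline) {
      throw new Error(`Job still ${status} after ${pollTimeout} ms`);
    }
    await sleep(pollInterval);
  }
}
```

Injecting `sleep` and `now` keeps the loop testable without real waiting; production code would rely on the defaults.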
For granular control over the job lifecycle:
// 1. Create job
const job = await client.jobs.create({
  sourceType: 'file',
  fileName: 'document.pdf',
  parsingParams: { model: 'advanced', ocrEnabled: true },
});
// 2. Upload file
await client.jobs.upload(job, {
  file: './document.pdf',
  onProgress: ({ percent }) => console.log(`${percent}%`),
});
// 3. Wait for completion
const jobResult = await client.jobs.wait(job.jobId, {
  pollInterval: 10000,
});
// 4. Load results
const result = await client.jobs.load(jobResult);
Error Handling
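When a request is rate limited, one option is to wait for the advertised delay and try again. The sketch below shows that pattern in isolation. `StubRateLimitError` is a stand-in defined here so the code runs on its own, and `retryOnRateLimit` is a hypothetical wrapper; the SDK's own error classes appear in the example that follows. The `retryAfter` field is assumed to be in seconds, matching how that example multiplies it by 1000.

```typescript
// Stand-in for the SDK's RateLimitError so this sketch is self-contained.
class StubRateLimitError extends Error {
  readonly retryAfter: number; // assumed to be seconds

  constructor(retryAfter: number) {
    super(`Rate limited, retry after ${retryAfter}s`);
    this.name = 'StubRateLimitError';
    this.retryAfter = retryAfter;
  }
}

// Hypothetical wrapper: retries `call` when it is rate limited, up to maxAttempts calls.
async function retryOnRateLimit<T>(
  call: () => Promise<T>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let attempt = 1;
  while (true) {
    try {
      return await call();
    } catch (error) {
      // Any other error, or an exhausted budget, is passed on to the caller.
      if (!(error instanceof StubRateLimitError) || attempt >= maxAttempts) throw error;
      await sleep(error.retryAfter * 1000); // seconds -> milliseconds
      attempt++;
    }
  }
}
```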
import {
  BadRequestError,
  AuthenticationError,
  RateLimitError,
  PollingTimeoutError,
  JobFailedError,
  ValidationError,
  InvalidStateError,
} from '@knowhere-ai/sdk';
import { setTimeout as sleep } from 'node:timers/promises';
try {
  const result = await client.parse({ url: '...' });
} catch (error) {
  if (error instanceof ValidationError) {
    console.error('Invalid parameters:', error.message);
  } else if (error instanceof RateLimitError) {
    // Wait and retry
    await sleep(error.retryAfter * 1000);
  } else if (error instanceof AuthenticationError) {
    console.error('Invalid API key');
  } else if (error instanceof PollingTimeoutError) {
    console.error('Processing timeout');
  } else if (error instanceof JobFailedError) {
    console.error('Job failed:', error.jobResult.error);
  } else if (error instanceof InvalidStateError) {
    console.error('Invalid state:', error.message);
  }
}
Documentation
For complete documentation, visit https://docs.knowhereto.ai
Examples
Check out the examples directory for more usage examples.
Development
# Install dependencies
npm install
# Run tests
npm test
# Run tests with coverage
npm run test:ci
# Lint code
npm run lint
# Format code
npm run format
# Type check
npm run typecheck
# Build
npm run build
License
Support
- 📧 Email: [email protected]
- 🐛 Issues: GitHub Issues
- 📚 Documentation: https://docs.knowhereto.ai
Changelog
See CHANGELOG.md for release history.
