@mixpeek/spark
v1.0.0
Published
Apache Spark integration for Mixpeek — UDF transformers, batch processing, and schema mapping
Installation
npm install @mixpeek/spark
Quick Start
import sparkTransformer from '@mixpeek/spark';
const instance = sparkTransformer({
  apiKey: process.env.MIXPEEK_API_KEY
});
Modules
SparkTransformer
Spark UDF/transformer that applies Mixpeek enrichment to DataFrame columns
import { createSparkTransformer } from '@mixpeek/spark';
const sparkTransformer = createSparkTransformer({
  apiKey: process.env.MIXPEEK_API_KEY
});
BatchProcessor
Batch processes Spark DataFrames through Mixpeek with rate limiting and retries
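The README does not show how rate limiting and retries are configured, so the following is only a generic sketch of the pattern, not the package's implementation. The names `withRetries`, `processInBatches`, `sendBatch`, `batchSize`, `pauseMs`, `maxAttempts`, and `baseDelayMs` are all hypothetical and do not come from @mixpeek/spark.

```javascript
// Hypothetical sketch: none of these names are part of @mixpeek/spark.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call an async function, retrying with exponential backoff on failure.
async function withRetries(fn, { maxAttempts = 3, baseDelayMs = 100 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait baseDelayMs, then 2x, 4x, ... before the next attempt.
      if (attempt < maxAttempts) {
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  throw lastError;
}

// Split rows into fixed-size batches and send them one at a time.
// sendBatch is assumed to return an array of results for its batch.
// pauseMs adds a fixed delay between batches as a crude rate limit.
async function processInBatches(
  rows,
  sendBatch,
  { batchSize = 100, pauseMs = 0, ...retryOptions } = {}
) {
  const results = [];
  for (let i = 0; i < rows.length; i += batchSize) {
    const batch = rows.slice(i, i + batchSize);
    const batchResult = await withRetries(() => sendBatch(batch), retryOptions);
    results.push(...batchResult);
    const hasMore = i + batchSize < rows.length;
    if (pauseMs > 0 && hasMore) await sleep(pauseMs);
  }
  return results;
}
```

Sending batches sequentially keeps the request rate predictable at the cost of throughput. A real processor would probably also distinguish retryable errors (timeouts, HTTP 429) from permanent ones.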
import { createBatchProcessor } from '@mixpeek/spark';
const batchProcessor = createBatchProcessor({
  apiKey: process.env.MIXPEEK_API_KEY
});
SchemaMapper
Maps Spark schemas to/from Mixpeek document schemas for seamless data flow
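To illustrate what such a mapping might involve, here is a minimal sketch that walks the JSON form of a Spark schema (the structure produced by `df.schema.json()` in Spark) and converts it to a simple field map. The target format, the type table, and the function names `mapSparkType` and `mapSparkSchema` are assumptions for illustration only. The README does not describe the actual Mixpeek document schema or the mapper's methods.

```javascript
// Hypothetical sketch: the output format below is assumed, not Mixpeek's.
// Spark primitive type names mapped to illustrative target types.
const PRIMITIVE_TYPES = new Map([
  ['string', 'string'],
  ['boolean', 'boolean'],
  ['byte', 'number'],
  ['short', 'number'],
  ['integer', 'number'],
  ['long', 'number'],
  ['float', 'number'],
  ['double', 'number'],
  ['date', 'string'],
  ['timestamp', 'string'],
  ['binary', 'bytes'],
]);

// Convert one Spark type (a name string or a nested type object).
function mapSparkType(type) {
  if (typeof type === 'string') {
    if (!PRIMITIVE_TYPES.has(type)) {
      // Types outside the table, e.g. decimal(10,2), are not handled here.
      throw new Error(`Unsupported Spark type: ${type}`);
    }
    return PRIMITIVE_TYPES.get(type);
  }
  if (type.type === 'array') return { array: mapSparkType(type.elementType) };
  if (type.type === 'struct') return { object: mapSparkSchema(type) };
  throw new Error(`Unsupported Spark type: ${type.type}`);
}

// Convert a Spark struct schema into { fieldName: { type, required } }.
function mapSparkSchema(struct) {
  const fields = {};
  for (const field of struct.fields) {
    fields[field.name] = {
      type: mapSparkType(field.type),
      required: !field.nullable,
    };
  }
  return fields;
}

// Usage with a schema shaped like Spark's JSON representation.
const exampleSchema = {
  type: 'struct',
  fields: [
    { name: 'id', type: 'long', nullable: false, metadata: {} },
    {
      name: 'tags',
      type: { type: 'array', elementType: 'string', containsNull: true },
      nullable: true,
      metadata: {},
    },
  ],
};
const exampleMapped = mapSparkSchema(exampleSchema);
```

Failing loudly on unknown types is a deliberate choice in this sketch: silently coercing an unsupported column to a string would hide schema drift.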
import { createSchemaMapper } from '@mixpeek/spark';
const schemaMapper = createSchemaMapper({
  apiKey: process.env.MIXPEEK_API_KEY
});
Testing
npm test # Unit tests
npm run test:e2e # End-to-end tests
npm run test:live # Live API tests (requires MIXPEEK_API_KEY)
npm run test:coverage # Coverage report
License
MIT
