@schemaloom/cli
v1.0.0
Enterprise-grade CLI for SchemaLoom data pipeline management and deployment automation
SchemaLoom CLI
Enterprise-grade command-line interface for SchemaLoom data pipeline management and deployment automation.
Overview
The SchemaLoom CLI provides a comprehensive suite of tools for creating, managing, and deploying AI-powered data extraction pipelines. Built with enterprise requirements in mind, it enables rapid development and deployment of production-ready data processing workflows.
Features
- Pipeline Scaffolding: Generate production-ready pipeline projects with industry-standard structure
- Multi-Schema Support: Built-in templates for events, products, articles, contacts, and invoices
- Docker Integration: Automated containerization with optimized Docker configurations
- Environment Management: Secure configuration handling with environment variable templates
- Health Monitoring: Built-in health checks and monitoring endpoints
- Production Ready: TypeScript compilation, dependency management, and build optimization
Installation
Prerequisites
- Node.js 18.0.0 or higher
- npm 8.0.0 or higher
- Docker (for containerization features)
Build from Source
git clone <repository-url>
cd schemaloom/cli
npm install
npm run build
Global Installation (Optional)
npm install -g .
Usage
Command Structure
schemaloom <command> [options]
Available Commands
create-pipeline
Creates a new SchemaLoom data pipeline project with configurable templates and deployment options.
Syntax:
schemaloom create-pipeline <name> [options]
Options:
- --template <template>: Schema template (event, product, article, contact, invoice)
- --directory <path>: Target directory for project creation
- --with-docker: Include Docker configuration files
- --with-examples: Include sample data and usage examples
Examples:
# Create basic event extraction pipeline
schemaloom create-pipeline event-processor --template event
# Create product pipeline with Docker support
schemaloom create-pipeline product-analyzer --template product --with-docker
# Create comprehensive pipeline with all features
schemaloom create-pipeline invoice-processor --template invoice --with-docker --with-examples
docker-build
Builds Docker images for existing SchemaLoom pipelines with configurable build options.
Syntax:
schemaloom docker-build <pipeline-path> [options]
Options:
- --tag <tag>: Docker image tag (default: latest)
- --file <file>: Dockerfile path (default: Dockerfile)
- --no-cache: Build without using cache
- --push: Push image to registry after building
- --registry <registry>: Target Docker registry
Examples:
# Build production image
schemaloom docker-build my-pipeline --tag v1.0.0
# Build and push to registry
schemaloom docker-build my-pipeline --tag v2.0.0 --push --registry my-registry.com
info
Displays CLI version and system information.
schemaloom info
Pipeline Architecture
Generated Project Structure
pipeline-name/
├── src/
│   └── index.ts          # Main pipeline logic
├── examples/             # Sample data and usage scripts
├── package.json          # Dependencies and build scripts
├── tsconfig.json         # TypeScript configuration
├── .env                  # Environment configuration
├── README.md             # Project documentation
├── Dockerfile            # Docker configuration
├── .dockerignore         # Docker ignore patterns
└── docker-compose.yml    # Docker Compose configuration
Supported Schema Templates
| Template | Schema | Use Case |
|----------|--------|----------|
| event | EventListSchema | Event extraction and processing |
| product | ProductListSchema | Product catalog management |
| article | ArticleListSchema | Content analysis and extraction |
| contact | ContactListSchema | Contact information processing |
| invoice | InvoiceListSchema | Financial document processing |
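The template-to-schema mapping in the table can be expressed as a typed lookup, which is handy when a script needs to validate a template name before invoking the CLI. This is a minimal sketch: the template and schema names come from the table, while `TEMPLATE_SCHEMAS`, `isTemplateName`, and `schemaForTemplate` are hypothetical helpers, not part of the SchemaLoom CLI.

```typescript
// Template names and schema identifiers are copied from the table above.
// The helpers below are illustrative only and are not exported by the CLI.
const TEMPLATE_SCHEMAS = {
  event: "EventListSchema",
  product: "ProductListSchema",
  article: "ArticleListSchema",
  contact: "ContactListSchema",
  invoice: "InvoiceListSchema",
} as const;

type TemplateName = keyof typeof TEMPLATE_SCHEMAS;

// Type guard: narrows an arbitrary string to a known template name.
function isTemplateName(value: string): value is TemplateName {
  return Object.prototype.hasOwnProperty.call(TEMPLATE_SCHEMAS, value);
}

// Returns the schema identifier for a template, or throws listing the valid choices.
function schemaForTemplate(value: string): string {
  if (!isTemplateName(value)) {
    const valid = Object.keys(TEMPLATE_SCHEMAS).join(", ");
    throw new Error(`Unknown template "${value}". Expected one of: ${valid}`);
  }
  return TEMPLATE_SCHEMAS[value];
}

console.log(schemaForTemplate("invoice")); // InvoiceListSchema
```

Using `as const` keeps the template names as a literal union type, so a misspelled template is caught at compile time wherever `TemplateName` is used.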
Configuration
Environment Variables
The CLI generates environment templates with the following configuration options:
# Server Configuration
PORT=3000
HOST=0.0.0.0
# Google API Configuration
GOOGLE_API_KEY=your_api_key_here
# Extraction Configuration
EXTRACTION_TIMEOUT=30000
MAX_CHUNK_SIZE=100000
CHUNK_OVERLAP=0
TEMPERATURE=0
MODEL=gemini-2.5-flash
# Logging
LOG_LEVEL=info
Security Considerations
- API Key Management: Store sensitive credentials in environment variables
- Environment Isolation: Use separate configurations for development, staging, and production
- Access Control: Implement appropriate network security for production deployments
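One way to apply the credential guidance is to validate the environment at startup and fail fast when the API key is missing. The sketch below assumes the variable names and defaults from the generated template. The `PipelineConfig` shape, `readNumber`, and `loadConfig` are hypothetical and may not match what the generated `src/index.ts` actually does.

```typescript
// Hypothetical loader: variable names and defaults come from the environment
// template above; the validation rules are assumptions for illustration.
interface PipelineConfig {
  port: number;
  host: string;
  googleApiKey: string;
  extractionTimeout: number;
  maxChunkSize: number;
  chunkOverlap: number;
  temperature: number;
  model: string;
  logLevel: string;
}

type Env = Record<string, string | undefined>;

// Parses a numeric variable, falling back to the template default when unset.
function readNumber(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isFinite(value)) {
    throw new Error(`${name} must be a number, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): PipelineConfig {
  const googleApiKey = env["GOOGLE_API_KEY"];
  // Reject a missing key and the unedited placeholder from the template.
  if (!googleApiKey || googleApiKey === "your_api_key_here") {
    throw new Error("GOOGLE_API_KEY is not set");
  }
  return {
    port: readNumber(env, "PORT", 3000),
    host: env["HOST"] ?? "0.0.0.0",
    googleApiKey,
    extractionTimeout: readNumber(env, "EXTRACTION_TIMEOUT", 30000),
    maxChunkSize: readNumber(env, "MAX_CHUNK_SIZE", 100000),
    chunkOverlap: readNumber(env, "CHUNK_OVERLAP", 0),
    temperature: readNumber(env, "TEMPERATURE", 0),
    model: env["MODEL"] ?? "gemini-2.5-flash",
    logLevel: env["LOG_LEVEL"] ?? "info",
  };
}

// Typical use at process start: const config = loadConfig(process.env);
```

Passing the environment in as an argument, rather than reading `process.env` inside the function, makes it straightforward to test separate development, staging, and production configurations.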
Deployment
Docker Deployment
# Build image
docker build -t my-pipeline .
# Run container
docker run -p 3000:3000 -e GOOGLE_API_KEY=your_key my-pipeline
# Use docker-compose
docker-compose up --build
Production Considerations
- Resource Limits: Configure appropriate CPU and memory limits
- Health Checks: Monitor pipeline health with built-in endpoints
- Logging: Implement centralized logging for production environments
- Scaling: Use container orchestration for high-availability deployments
Development
Local Development
# Install dependencies
npm install
# Development mode with hot reload
npm run dev
# Build for production
npm run build
# Start production server
npm start
Testing
# Run health checks
curl http://localhost:3000/health
# Test extraction endpoints
curl -X POST http://localhost:3000/extract \
  -F "file=@<path-to-file>" \
  -F "schema=event"
API Reference
Health Endpoint
GET /health
Returns service health status.
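A deployment script might wait for this endpoint before sending traffic. The following is a minimal sketch that checks only the HTTP status, since the response body format is not described here. `waitForHealthy` and its parameters are hypothetical, and the injectable `fetchFn` exists only to make the function testable.

```typescript
// Minimal shape of the fetch call used here, so a fake can be injected in tests.
type HealthFetch = (url: URL) => Promise<{ ok: boolean }>;

// Polls GET /health until it answers with a 2xx status or the attempts run out.
// Only the status is inspected; the body is ignored because its format is unknown.
async function waitForHealthy(
  baseUrl: string,
  attempts = 10,
  delayMs = 1000,
  fetchFn: HealthFetch = fetch,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetchFn(new URL("/health", baseUrl));
      if (res.ok) return true;
    } catch {
      // Connection refused or similar: the service is probably still starting.
    }
    // Sleep between attempts, but not after the final one.
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}

// Example: const ready = await waitForHealthy("http://localhost:3000");
```

The global `fetch` used as the default is available in Node.js 18 and later, which matches the prerequisites listed earlier.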
Extraction Endpoint
POST /extract
Processes file uploads for data extraction using predefined schemas.
Parameters:
- file: File to process
- schema: Schema template to use
- chunkSize: Text chunk size (1000-1000000)
- temperature: AI model creativity (0-2)
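A client for this endpoint could check the documented ranges before uploading. This sketch assumes all four parameters are sent as multipart form fields, as the `file` and `schema` fields are in the curl example under Testing, and that the response is JSON. Neither assumption is confirmed above, and `buildExtractForm` and `extract` are hypothetical helpers.

```typescript
interface ExtractOptions {
  schema: string;
  chunkSize?: number;
  temperature?: number;
}

// Builds the multipart form for POST /extract, enforcing the documented ranges.
function buildExtractForm(file: Blob, fileName: string, options: ExtractOptions): FormData {
  const { schema, chunkSize, temperature } = options;
  // The negated range checks also reject NaN.
  if (chunkSize !== undefined && !(chunkSize >= 1000 && chunkSize <= 1_000_000)) {
    throw new RangeError("chunkSize must be between 1000 and 1000000");
  }
  if (temperature !== undefined && !(temperature >= 0 && temperature <= 2)) {
    throw new RangeError("temperature must be between 0 and 2");
  }
  const form = new FormData();
  form.append("file", file, fileName);
  form.append("schema", schema);
  if (chunkSize !== undefined) form.append("chunkSize", String(chunkSize));
  if (temperature !== undefined) form.append("temperature", String(temperature));
  return form;
}

// Sends the form to the service. The JSON response type is an assumption,
// so the result is returned as `unknown` for the caller to validate.
async function extract(baseUrl: string, form: FormData): Promise<unknown> {
  const res = await fetch(new URL("/extract", baseUrl), { method: "POST", body: form });
  if (!res.ok) throw new Error(`Extraction failed with HTTP ${res.status}`);
  return res.json();
}

// Example:
// const form = buildExtractForm(new Blob(["..."]), "notes.txt", { schema: "event" });
// const result = await extract("http://localhost:3000", form);
```

Validating on the client avoids a round trip for requests the server would presumably reject anyway.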
Custom Schema Endpoint
POST /extract/custom
Processes data using custom schema definitions.
Body:
{
  "content": "Text content to process",
  "schemaDefinition": {
    "type": "object",
    "properties": {
      "name": {"type": "string"},
      "value": {"type": "number"}
    }
  }
}
Troubleshooting
Common Issues
Docker Build Failures
- Ensure Docker daemon is running
- Verify sufficient disk space for image building
- Check network connectivity for dependency downloads
Environment Variable Issues
- Confirm the .env file exists and contains required variables
- Verify API key validity and permissions
- Check Docker environment variable passing
TypeScript Compilation Errors
- Ensure all dependencies are installed
- Verify TypeScript configuration
- Check source code syntax
Debug Mode
Enable verbose logging for troubleshooting:
LOG_LEVEL=debug npm start
Contributing
Development Setup
- Fork the repository
- Create a feature branch
- Implement changes with appropriate tests
- Submit a pull request with detailed description
Code Standards
- Follow TypeScript best practices
- Include comprehensive error handling
- Maintain backward compatibility
- Document all public APIs
License
This project is licensed under the MIT License. See the LICENSE file for details.
Support
Documentation
Community
Enterprise Support
For enterprise customers requiring dedicated support, please contact our sales team at [email protected].
Version History
- v1.0.0: Initial release with basic pipeline creation and Docker support
- v1.1.0: Added multi-schema templates and environment management
- v1.2.0: Enhanced Docker integration and production optimizations
