@uluops/core
v0.23.0
Published
The foundational execution engine for UluOps. Orchestrates AI-powered code analysis through a 4-layer execution hierarchy (Agent > Command > Workflow > Pipeline), manages LLM tool loops via Vercel AI SDK, and integrates with UluOps Registry and Validation
Downloads
2,524
Maintainers
Readme
UluOps · Operating Intelligence as Infrastructure
@uluops/core
The foundational execution engine for UluOps. Orchestrates AI-powered code analysis through a 4-layer execution hierarchy (Agent > Command > Workflow > Pipeline), manages LLM tool loops via Vercel AI SDK, and integrates with UluOps Registry and Validation services.
Prerequisites
- Node.js 18+ with an ESM project (
"type": "module"in package.json) - tsx for running TypeScript examples:
npm install -D tsx - UluOps API key — get one at app.uluops.ai, or use bundled starter agents offline (see below)
- AI provider key — at least one of
ANTHROPIC_API_KEY,OPENAI_API_KEY, orGOOGLE_API_KEY
npm install @uluops/core
export ULUOPS_API_KEY=ulr_your_key_here
export ANTHROPIC_API_KEY=your_anthropic_key # or OPENAI_API_KEYQuick Start
Create validate.ts:
import { UluOpsClient } from '@uluops/core';
const client = new UluOpsClient({
apiKey: process.env.ULUOPS_API_KEY,
});
// Run a single agent
const result = await client.runAgent('code-validator', './src', {
model: 'sonnet',
thresholds: { pass: 80 },
});
console.log(`Score: ${result.score} | Decision: ${result.decision}`);
// Or run a saved command configuration (model, thresholds, and aggregation
// come from the command definition — ideal for CI):
const cmd = await client.runCommand('validate', { target: './src' });
console.log(`Score: ${cmd.score} | Decision: ${cmd.decision}`);npx tsx validate.tsOffline Quick Start (No UluOps API Key)
Use the bundled starter agents without registry access. An AI provider key (e.g. ANTHROPIC_API_KEY) is still required:
import { UluOpsClient, STARTER_DEFINITIONS_DIR } from '@uluops/core';
const client = new UluOpsClient({
localDefinitions: STARTER_DEFINITIONS_DIR,
trackingEnabled: false,
});
const result = await client.runAgent('code-validator', './src');
console.log(`Score: ${result.score} | Decision: ${result.decision}`);This still requires an AI provider key but no UluOps API key or network access to the registry.
Note:
runAgent/resolvereturn immediately from local definitions when found.client.list(), however, always queries the registry first and only falls back to the bundled definitions after the request fails — in a genuinely offline environment that means a few seconds of retry/backoff before the local list returns.
Table of Contents
- Overview
- Installation
- Authentication
- Usage
- Architecture
- Execution Hierarchy
- Advanced Exports
- Configuration
- TypeScript Support
- Error Handling
- Security
- Dependencies
- Development
Overview
The @uluops/core SDK provides:
- 4-Layer Execution Hierarchy - Agent > Command > Workflow > Pipeline orchestration
- AI SDK v6 Integration - Vercel AI SDK for LLM communication with automatic tool loops (
maxSteps) and built-in retry - Registry-Backed Model Resolution - Model aliases resolved via UluOps Registry with provider metadata
- Multi-Provider AI - Anthropic-first with deepest optimization (caching, context management, bash tools); OpenAI + Google bundled; Mistral, Cohere, and 10+ others via dynamic
@ai-sdk/*import. See SCOPE.md for provider strategy. - Filesystem Sandboxing - ToolHandler restricts LLM file access to the target directory with symlink-aware path validation
- Content-Addressed Integrity Verification - Registry-resolved definitions carry a SHA-256 YAML content hash (
sha256:…) and, for agents/commands, apromptHashover the frozen rendered prompt. Hashing uses the shared@uluops/sdk-coreimplementation, so the client and registry hash identically. Remote resolution executes the frozenruntimeMdthepromptHashcertifies (not a live re-render). Callers can pinexpectedHash/expectedPromptHash(from a trusted channel) onresolve()/ExecutionOptions; pins are verified fail-closed on every resolve path (cache/local/remote) and a mismatch throwsIntegrityError. Verification is opt-in — unpinned resolves behave as before. See Integrity Verification. - Universal Agent Output - Single
agentOutputSchemawith categories + artifacts for all 6 agent types (validator, executor, analyst, generator, explorer, forecaster) - Structured Output Extraction - 4-strategy fallback: AI SDK structured output > JSON code fence > inline JSON > regex text parsing
- Validation Tracking - Automatic result submission with issue correlation, regression detection, per-agent execution recording, and analytics
- Analysis Summary Extraction - Automatic extraction of category scores, system metrics, epistemic assessments, and exploration maps from agent results at submission time
- Local Development Support - Load definitions from local YAML files with registry fallback
- Bundled Starter Agents - 5 built-in agents for immediate use without registry access
Installation
npm install @uluops/coreBundled starter agents (no registry needed): code-validator, docs-validator, public-interface-validator, security-analyst, test-architect.
Authentication
UluOps API Key
Required for registry and validation service access:
# Environment variable (recommended)
export ULUOPS_API_KEY=ulr_your_api_key_hereOr pass directly in config:
const client = new UluOpsClient({ apiKey: 'ulr_your-key' });The SDK checks for ULUOPS_API_KEY then ULU_API_KEY environment variables. Keys use a ulr_ prefix. Generate one at app.uluops.ai.
AI Provider Keys
Set environment variables for the providers you want to use. The SDK auto-detects configured providers:
export ANTHROPIC_API_KEY=your_anthropic_key
export OPENAI_API_KEY=your_openai_key # optional
export GOOGLE_API_KEY=your_google_key # optional (also accepts GOOGLE_GENERATIVE_AI_API_KEY)Or configure explicitly:
const client = new UluOpsClient({
apiKey: process.env.ULUOPS_API_KEY,
ai: {
providers: {
anthropic: { apiKey: process.env.ANTHROPIC_API_KEY },
openai: { apiKey: process.env.OPENAI_API_KEY },
google: { apiKey: process.env.GOOGLE_API_KEY },
},
defaultProvider: 'anthropic',
},
});Usage
Agent Execution
Direct agent execution with call-time options. Best for interactive/ad-hoc validation:
const result = await client.runAgent('code-validator', './src', {
model: 'sonnet',
thresholds: { pass: 80, warn: 60 },
trackResults: true,
project: 'my-project',
});
console.log(`Score: ${result.score} | Decision: ${result.decision}`);
console.log(`Recommendations: ${result.recommendations.length}`);Operator Prompt
Pass a prompt to give the agent a directive or focus. Especially useful for generators and executors that need to know what to create:
// Generator: tell it what to create
const generated = await client.runAgent('aristotle-generator', {
target: './src',
prompt: 'Create a health check endpoint for the Express API',
}, { model: 'opus' });
// Validator: provide focus context
const focused = await client.runAgent('security-analyst', {
target: './src',
prompt: 'Focus on the authentication middleware and JWT handling',
});The prompt appears as a prominent Directive: section in the initial message, before project context. When omitted, behavior is identical to previous versions.
Run Completeness & Degradation Markers
Every agent run carries a completeness signal — distinct from the agent's decision — describing whether the run actually finished its work:
const result = await client.runAgent('security-analyst', './src');
// Decision = what the agent concluded; completeness = whether the run finished its work.
console.log(`${result.decision} · ${result.completeness ?? 'complete'}`);
if (result.completeness !== 'complete') {
for (const m of result.degradationMarkers ?? []) {
console.log(`[${m.severity}] ${m.code}${m.detail ? ` — ${m.detail}` : ''}`);
}
}completeness:'complete' | 'partial' | 'failed', derived from degradation markers (anycritical⇒failed; anydegraded⇒partial; elsecomplete). Absent ⇒ treat ascomplete. APASS+partialresult is a positive verdict reached on incomplete evidence — worth surfacing.degradationMarkers: typed{ code, phase, severity, detail? }[].codeis the stable contract (e.g.budget.forced-wrap-up,steps.near-exhaustion,extraction.low-confidence,render.raw-yaml-fallback);detailis human-only — never match on it.phaseis'resolution' | 'execution'.- The engine observes completeness from how the run actually executed; agents never self-report it.
deriveCompleteness(markers)is exported if you want to recompute it. degradations: string[]is the deprecated legacy alias (resolution-phase strings only), retained for compatibility — preferdegradationMarkers.
Empty-output step exhaustion is a thrown
MaxStepsExhaustedError, not a marker; near-exhaustion with output is thesteps.near-exhaustionmarker.
Command Execution
Execute saved command configurations. Uses model, thresholds, and aggregation from the command definition. Ideal for CI/CD:
const result = await client.runCommand('validate', { target: './src' });
// Override the definition's default model at runtime (e.g., for CI cost control)
const fast = await client.runCommand('validate', { target: './src' }, { model: 'haiku' });
console.log(`Score: ${result.score}`);
console.log(`Categories:`, result.categories);Workflow Execution
DAG-based multi-phase orchestration with quality gates. Independent phases execute in parallel; dependent phases wait for their dependencies:
const result = await client.runWorkflow('ship', { target: './src' });
console.log(`Overall: ${result.decision} (score: ${result.score})`);
for (const phase of result.phases) {
// decision: 'passed' | 'warned' | 'blocked' | 'skipped' | 'aborted'
console.log(` ${phase.name}: ${phase.decision} (${phase.score})`);
}
// Metrics break down phase outcomes
const { phasesExecuted, phasesPassed, phasesBlocked, phasesAborted } = result.metrics;Workflows define phase dependencies and failure behaviors in their WDL definition:
orchestration:
on_failure: stop # stop | abort | continue | warn
max_parallel: 3 # optional concurrency limit
phases:
- id: lint
commands: [lint-validator@latest]
- id: test
commands: [test-architect@latest]
- id: security
commands: [security-analyst@latest]
depends_on: [lint, test] # DAG dependency — waits for lint + test
gate:
threshold: 85
on_fail: abortAuto-Routing
Universal execution — auto-detects definition type and routes to the correct executor:
// Routes automatically based on whether name resolves to agent, command, workflow, or pipeline
const result = await client.run('code-validator', { target: './src' });
console.log(`Decision: ${result.decision}`);Pipeline Execution
Synchronous execution (blocks until complete):
const result = await client.runPipeline('foundations', { target: './src' });
console.log(`Overall: ${result.decision} (score: ${result.score})`);
for (const stage of result.stages) {
console.log(` ${stage.name}: ${stage.status}`);
}Async execution with handle-based control:
// Start async pipeline
const handle = await client.startPipeline('full-validation', {
target: './src',
});
// Monitor progress
const status = await handle.status();
console.log(`Stage ${status.stages.length} of pipeline`);
// Wait for completion
const result = await handle.wait();
// Or cancel
await handle.cancel();Convenience Methods
Shorthand methods for common workflows:
// Run the built-in validate command
const result = await client.validate('./src');
// Security audit
const secResult = await client.security('./src');
// Code optimization analysis
const optResult = await client.optimize('./src');
// Ship workflow (full pre-release validation)
const shipResult = await client.ship('./src');
// Post-implementation validation
const postResult = await client.postImplementation('./src');Discovery
// List available definitions
const definitions = await client.list({ type: 'agent', domain: 'software' });
// Inspect a definition
const info = await client.describe('code-validator');
console.log(info.name, info.version, info.interface);Cache Management
// Clear the definition resolution cache — call after registry updates in long-lived processes
client.clearCache();Validation Tracking
Submit execution results, preview submissions, and query run history:
// Automatic tracking: results are submitted automatically when trackingEnabled is true
const result = await client.runAgent('code-validator', './src', {
trackResults: true,
project: 'my-project',
});
// Manual submission: submit results from a custom execution
const response = await client.submitResults('my-project', 'post-implementation', result);
console.log(`Run #${response.runNumber}: ${response.dashboardUrl}`);
console.log(`New issues: ${response.correlation.newIssues}, Regressions: ${response.correlation.regressions}`);
// Dry run: preview what a submission would do without saving
const preview = await client.previewSubmission('my-project', 'post-implementation', result);
if (preview.validationErrors.length > 0) {
console.error('Validation errors:', preview.validationErrors);
}
// Query history: list past runs for a project
const history = await client.getHistory('my-project');
for (const entry of history) {
console.log(`Run #${entry.runNumber} — ${entry.workflowType} — Score: ${entry.averageScore}`);
}
// Run details: fetch full details for a specific run
const run = await client.getRun('run-uuid');Usage Metrics
All execution results include token usage metrics. Provider-specific token fields are mapped to a unified format:
const result = await client.runAgent('code-validator', './src');
const { metrics } = result;
console.log(`Input: ${metrics.inputTokens}, Output: ${metrics.outputTokens}`);
console.log(`Cache: ${metrics.cacheCreationTokens ?? 0} created, ${metrics.cacheReadTokens ?? 0} read`);
// Google Gemini 2.5+ models report thinking tokens separately from output tokens.
// thinking_tokens are included in totalEffectiveTokens (unlike OpenAI reasoning_tokens
// which are already counted within outputTokens).
if (metrics.thinkingTokens) {
console.log(`Thinking: ${metrics.thinkingTokens}`);
}
console.log(`Effective total: ${metrics.totalEffectiveTokens}`);Integrity Verification
Pin a definition's expected hashes so execution is refused if the resolved content doesn't match. Pins come from a trusted, independent channel (a lockfile, a reviewed value) — recomputing against the registry's own returned hash only catches an internally-inconsistent registry, not a compromised one.
import { IntegrityError } from '@uluops/core';
try {
const result = await client.runAgent('code-validator', './src', {
expectedHash: 'sha256:…', // pins the YAML (source + config)
expectedPromptHash: 'sha256:…', // pins the rendered prompt (agents/commands)
});
} catch (err) {
if (err instanceof IntegrityError) {
// err.kind: 'yaml' | 'prompt' | 'unavailable'; err.expected / err.actual
// err.definitionName / err.definitionVersion identify which definition failed
console.error(`Execution refused (${err.kind}) for ${err.definitionName}@${err.definitionVersion}: ${err.message}`);
}
}- Both pins are optional. Unpinned resolves are unverified and behave exactly as before.
expectedHashverifiescomputeHash(resolved.yaml)— covers source and execution config. For WDL/PDL the YAML is the runtime, so the YAML pin alone fully covers execution.expectedPromptHashverifies the frozen rendered prompt and is required (withexpectedHash) for full agent/command executed-prompt integrity. Supplying it for a definition with no rendered prompt (workflow/pipeline, local, content-gated, schema-stale) throwsIntegrityError(kind: 'unavailable')— never a silent pass.- Verification runs on every resolve path, including cache hits — a prior unpinned resolve cannot let a later pinned one through unchecked.
ResolvedDefinitionalso surfacespromptHashandtranslatorVersionso callers can detect a retranslation restamp.
Trust bootstrap. This ships the verification mechanism and explicit pin inputs, not pin provenance. A pin seeded from a first unpinned
resolve()against an already-compromised registry is trust-on-first-use. A pin manifest (lockfile) is the natural completion.
Architecture
For a detailed, hop-by-hop trace of every execution chain (resolution → LLM generation → persistence), see ARCHITECTURE.md.
UluOpsClient (facade)
|
+-- AgentExecutor (single-agent LLM execution)
| +-- AIProvider (AI SDK v6 wrapper, provider registry, context management)
| +-- ToolHandler (sandboxed filesystem tools with symlink protection)
| +-- ToolAdapter (converts tools to AI SDK format)
| +-- OutputExtractor (4-strategy: structured output > JSON fence > inline JSON > regex)
|
+-- CommandExecutor (single/multi-agent aggregation via Promise.allSettled)
| +-- AgentExecutor
| +-- preflight (prerequisite checks with path traversal protection)
|
+-- WorkflowExecutor (DAG-based parallel phase orchestration with quality gates)
| +-- CommandExecutor
|
+-- PipelineExecutor (multi-stage async pipelines)
| +-- WorkflowExecutor
| +-- CommandExecutor
|
+-- RegistryClient (definition resolution + content hash)
+-- SubmissionClient (result submission + history)
| +-- AnalysisSummaryExtractor (auto-extract analysis from AgentResult)
+-- ModelCatalog (registry-backed model alias resolution)Execution Hierarchy
| Level | Definition | Description |
|-------|-----------|-------------|
| Agent | ADL | Atomic unit: single LLM with filesystem tools. 6 types: validator, executor, analyst, generator, explorer, forecaster — all produce a universal AgentResult (score/maxScore are number \| null: null for generators/executors that produce artifacts not scores), with categories and optional artifacts |
| Command | CDL | Wraps 1+ agents with preflight checks and aggregation |
| Workflow | WDL | Sequences commands into phases with quality gates (DAG-based parallel execution) |
| Pipeline | PDL | Orchestrates workflows/commands across stages |
Advanced Exports
All internal components are exported for direct use when UluOpsClient is too opinionated. Import from the package root (reference listing — import only what you need):
import {
// Executors — run definitions at any level without the UluOpsClient facade
AgentExecutor, // Single-agent LLM execution with tool loop
CommandExecutor, // Multi-agent aggregation with preflight checks
WorkflowExecutor, // DAG-based parallel phase orchestration with quality gates
PipelineExecutor, // Multi-stage async pipelines
// Service clients — talk to UluOps APIs directly
RegistryClient, // Definition resolution, local/remote (server-side normalization via API, local via @uluops/definition-factory)
SubmissionClient, // Run submission, history queries, regression detection
// AI layer — provider management and model resolution
AIProvider, // AI SDK v6 wrapper with provider registry and error mapping
ModelCatalog, // Registry-backed model alias → provider/model resolution
ToolAdapter, // Converts ToolHandler tools to AI SDK ToolSet format
TokenBudgetTracker, // Tracks token consumption against configurable budgets
// Analysis
AnalysisSummaryExtractor, // Auto-extract analysisSummary + analysisRecords from agent results
// Utilities
OutputExtractor, // 4-strategy LLM output parser (structured > JSON fence > inline > regex)
ToolHandler, // Sandboxed filesystem tools (read_file, list_files, search_content, get_file_info, get_directory_tree, get_symbols)
parseRef, // Parse "name@version" reference strings
classifyDecision, // Classify decision strings into positive/negative/conditional/neutral
buildVocabularyMap, // Build custom decision vocabulary from agent definitions
deriveCompleteness, // Recompute completeness from a DegradationMarker[]
resolutionMarkersFromLegacy, // Convert the deprecated degradations[] to DegradationMarker[]
} from '@uluops/core';Exported Constants
The default thresholds and limits used by the executors are exported for custom threshold logic, diagnostics, and tests:
import {
DEFAULT_PASS_THRESHOLD, // 75 — default validator pass threshold
DEFAULT_WARN_THRESHOLD, // 50 — default warn threshold
DEFAULT_GATE_THRESHOLD, // 70 — default workflow phase gate
DEFAULT_MAX_STEPS, // 50 — default tool-loop step ceiling
DEFAULT_MAX_TOKENS, // 16384 — default output tokens per generation call
STARTER_DEFINITIONS_DIR, // bundled starter definitions directory (offline quick start)
} from '@uluops/core';Direct Executor Usage
import { AgentExecutor, AIProvider, RegistryClient } from '@uluops/core';
// Wire up dependencies manually
const ai = new AIProvider(config, catalog, logger);
const executor = new AgentExecutor(config, ai, logger);
// Execute with full control over options
const result = await executor.execute(resolvedDefinition, {
target: '/path/to/project',
prompt: 'Create a database migration for the users table', // optional operator directive
}, {
model: 'opus',
maxTokens: 16384,
timeoutMs: 60_000,
});Model Resolution
import { ModelCatalog } from '@uluops/core';
const catalog = new ModelCatalog(registrySdk);
const resolved = await catalog.resolve('sonnet', {
requiredCapabilities: ['tools', 'extendedThinking'],
});
// → { provider: 'anthropic', modelId: 'claude-sonnet-4-...', providerModelId: 'claude-sonnet-4-...' }
// Enumerate available models and aliases
const aliases = await catalog.listAliases();
const premiumModels = await catalog.listModels({ tier: 'premium' });
// Clear in-memory cache after registry admin syncs models
catalog.refresh();Decision Classification
import { classifyDecision, buildVocabularyMap } from '@uluops/core';
// Core vocabularies — covers all execution layers
classifyDecision('PASS'); // → 'positive'
classifyDecision('COMPLETE'); // → 'positive'
classifyDecision('FAIL'); // → 'negative'
classifyDecision('WARN'); // → 'conditional'
classifyDecision('PARTIAL'); // → 'conditional' (progress, not failure)
classifyDecision('MAYBE'); // → 'neutral' (unknown)
// Custom vocabulary from agent definition — cognitive lens agents use these
const vocab = buildVocabularyMap(agentDefinition);
classifyDecision('EXAMINED', vocab); // → 'positive' (Socrates)
classifyDecision('VITAL', vocab); // → 'positive' (Nietzsche)Configuration
const client = new UluOpsClient({
// Required
apiKey: 'your-api-key', // or ULUOPS_API_KEY env var
// AI Configuration
ai: {
providers: { // Provider API keys (env var fallback)
anthropic: { apiKey: '...' },
},
defaultProvider: 'anthropic', // Default AI provider
modelOverride: 'sonnet', // Override model for all executions
additionalProviders: ['groq', 'xai'], // Enable extra @ai-sdk/* providers (must be installed)
},
// Service URLs
registryUrl: 'https://...', // Registry API (or ULUOPS_REGISTRY_URL)
submissionUrl: 'https://...', // Submission API (or ULUOPS_SUBMISSION_URL)
// Behavior
trackingEnabled: true, // Auto-submit results to validation service
timeout: 300000, // Request timeout in ms
defaultProject: 'my-project', // Default project for result submission
debug: false, // Detailed execution logging (or ULUOPS_DEBUG)
defaultThinkingBudget: 10000, // Extended thinking budget (Anthropic + Google models)
contextBudget: 200000, // Optional cap on the context budget (forces wrap-up at 80%, Anthropic eviction at 50%).
// When unset, the engine uses the resolved model's real context window
// (registry `limits.context`) — e.g. 1M for Opus 4.6+, 128k for many GPT/Gemini —
// falling back to 200k only when the window is unknown. When set, it caps the
// budget at min(this, modelWindow). Set it to control cost on large-window models.
maxRetries: 2, // Retries for transient LLM errors (429/5xx); exponential backoff via AI SDK
maxConcurrency: 8, // Global ceiling on concurrent in-flight LLM calls across the whole engine
// (or ULUOPS_MAX_CONCURRENCY). Bounds total requests regardless of how many
// workflow phases, parallel steps, or inline pipeline agents fan out at once —
// the global throttle that stops fan-out × retry from amplifying a rate limit.
// Distinct from a workflow's per-level `max_parallel`, which caps one layer only.
dashboardUrl: 'https://app.uluops.ai', // Dashboard link prefix for run URLs
// Security
allowedTools: ['bash'], // Operator tool allowlist (or ULUOPS_ALLOWED_TOOLS)
// Default: all tools except 'bash' are allowed
// Local Development
localDefinitions: './definitions', // Load YAML definitions from local dir
});Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| ULUOPS_API_KEY | Platform API key | (required) |
| ULU_API_KEY | Platform API key (legacy alias; used only if ULUOPS_API_KEY is unset) | - |
| ANTHROPIC_API_KEY | Anthropic provider key | - |
| OPENAI_API_KEY | OpenAI provider key | - |
| GOOGLE_API_KEY | Google/Gemini provider key | - |
| GOOGLE_GENERATIVE_AI_API_KEY | Google provider key (alternative) | - |
| ULUOPS_REGISTRY_URL | Registry API URL | https://api.uluops.ai/api/v1/registry |
| ULUOPS_SUBMISSION_URL | Submission API URL | https://api.uluops.ai/api/v1 |
| ULUOPS_TRACKING_ENABLED | Auto-submit results | true |
| ULUOPS_PROJECT | Default project name | - |
| ULUOPS_LOCAL_DEFINITIONS | Local definitions path | - |
| ULUOPS_DASHBOARD_URL | Dashboard base URL for run links | https://app.uluops.ai |
| ULUOPS_ALLOWED_TOOLS | Comma-separated tool allowlist (e.g., bash) | all except bash |
| ULUOPS_MAX_CONCURRENCY | Global ceiling on concurrent in-flight LLM calls | 8 |
| ULUOPS_DEBUG | Enable detailed execution logging | false |
TypeScript Support
Full TypeScript support with exported types for all parameters and results:
import {
UluOpsClient,
// Result types — AgentResult is universal for all 6 agent types
type AgentResult,
type CommandResult,
type WorkflowResult,
type PipelineResult,
// Definition types
type AgentDefinition,
type CommandDefinition,
type WorkflowDefinition,
type PipelineDefinition,
// Config types
type UluOpsConfig,
type ExecutionInput,
type ExecutionOptions,
// Async pipeline handle (return type of startPipeline)
type PipelineHandle,
// Decision classification
classifyDecision,
type DecisionCategory,
type DecisionVocabularyMap, // return type of buildVocabularyMap
// Analysis & AI layer companion types (for direct Advanced Exports usage)
type AnalysisExtractionResult, // return type of AnalysisSummaryExtractor
type ResolvedModel, // return type of ModelCatalog.resolve
// Completeness & degradation markers
deriveCompleteness,
resolutionMarkersFromLegacy, // migrate the deprecated degradations[] field
type Completeness,
type DegradationMarker,
type DegradationPhase, // 'resolution' | 'execution'
type DegradationSeverity, // 'info' | 'degraded' | 'critical'
// Usage metrics
type ExecutionMetrics, // type of result.metrics on every result type (camelCase)
type UsageMetrics, // raw AI-provider usage on AIGenerateResult.usage (advanced; snake_case)
type AIGenerateResult, // return type of AIProvider.generate() (advanced)
// Error classes
ExecutionError,
MaxStepsExhaustedError,
ConfigurationError,
ModelNotFoundError,
// Error code narrowing
SubmissionErrorCodes,
UluOpsErrorCodes,
} from '@uluops/core';Subpath Exports
For tree-shaking or importing just types/errors without pulling in the full client:
// Import only types (zero runtime cost)
import type { AgentResult, ExecutionInput } from '@uluops/core/types';
// Import only error classes
import { ExecutionError, ConfigurationError } from '@uluops/core/errors';Note: The
/typessubpath exports consumer-facing types only. Internal registry configuration types (CategoryConfig,CriteriaConfig,PhaseConfig, etc.) are not part of the public API — use the YAML schema definitions as the authoritative reference for these structures.
Error Handling
The SDK provides a structured error hierarchy:
Core SDK Errors
| Error | Thrown by | Description |
|-------|----------|-------------|
| UluOpsError | (base class) | Base error class for all SDK errors. Use UluOpsErrorCodes for exhaustive code narrowing |
| ConfigurationError | UluOpsClient constructor, RegistryClient.resolve(), AIProvider.ensureProvider() | Missing API key, invalid provider config, definition not found in registry, invalid definition format |
| ModelNotFoundError | ModelCatalog.resolve() | Model alias not found in registry catalog |
| CapabilityError | ModelCatalog.resolve() | Resolved model lacks a required capability (e.g. tools, vision, extendedThinking) |
| PreflightError | CommandExecutor (preflight phase) | Preflight check failed — missing env var, file not found, command unavailable |
| ExecutionError | AgentExecutor.execute(), CommandExecutor.execute() | Agent execution failure or definition type mismatch. Check error.partialResult for partial output |
| MaxStepsExhaustedError | AgentExecutor.execute() | The tool loop hit the maxSteps ceiling while the model was still calling tools, leaving empty output. Subclass of ExecutionError (code MAX_STEPS_EXHAUSTED); carries error.steps and error.finishReason. Raise maxSteps, narrow the target, or lower the context budget so wrap-up triggers earlier |
| ParseError | OutputExtractor.extractWithMetadata() | LLM output could not be parsed as structured JSON. Check error.contentPreview for raw output |
| SubmissionError | SubmissionClient methods | Validation service rejected a submission. Use SubmissionErrorCodes to narrow by code |
| WorkflowError | WorkflowExecutor.execute() | Phase gate failure. Check error.context.partialResult for completed phase results |
| PipelineError | PipelineExecutor.execute() | Pipeline stage failure. Check error.context for stage name/index |
| SubscriptionRequiredError | RegistryClient.resolve() | Definition requires a higher subscription tier. Check error.requiredTier, error.currentTier, and error.upgradeUrl for upgrade guidance |
| IntegrityError | RegistryClient.resolve() (caller-pinned) | A pinned expectedHash/expectedPromptHash did not match the resolved content, or a prompt pin was supplied for a definition with no rendered prompt. Check error.kind ('yaml'/'prompt'/'unavailable'), error.expected, error.actual, and error.definitionName/error.definitionVersion (which definition failed). Fail-closed — execution is refused |
import { ConfigurationError, ModelNotFoundError, CapabilityError, ExecutionError, MaxStepsExhaustedError, WorkflowError, SubscriptionRequiredError } from '@uluops/core';
try {
const result = await client.runAgent('code-validator', './src');
} catch (error) {
if (error instanceof ConfigurationError) {
console.error('Check your config:', error.message);
} else if (error instanceof ModelNotFoundError) {
console.error('Unknown model alias:', error.message);
} else if (error instanceof CapabilityError) {
console.error('Model lacks a required capability:', error.message);
} else if (error instanceof MaxStepsExhaustedError) {
// Check this BEFORE ExecutionError — it is a subclass.
console.error(`Hit the step ceiling (${error.steps} steps) — raise maxSteps or narrow the target.`);
} else if (error instanceof ExecutionError) {
console.error('Execution failed:', error.message);
console.log('Partial result:', error.partialResult);
} else if (error instanceof WorkflowError) {
// Phase gate failure — completed phases are in error.context.partialResult
console.error('Workflow gate failed:', error.message);
} else if (error instanceof SubscriptionRequiredError) {
console.error(`Upgrade required: needs "${error.requiredTier}" (you have "${error.currentTier}"). ${error.upgradeUrl}`);
}
}Re-exported from @uluops/sdk-core
| Error | Description |
|-------|-------------|
| SdkApiError | Base API error |
| RateLimitError | 429 rate limit exceeded |
| UnauthorizedError | 401 authentication failure |
| ForbiddenError | 403 access denied |
| NotFoundError | 404 resource not found |
| ServiceUnavailableError | 503 service unavailable |
| NetworkError | Connection failures |
| TimeoutError | Request timeout |
Network Error Recovery
import { NetworkError, TimeoutError, RateLimitError } from '@uluops/core';
try {
const result = await client.runAgent('code-validator', './src');
} catch (error) {
if (error instanceof TimeoutError) {
// Increase timeout: default is 300s, some large repos need more
const result = await client.runAgent('code-validator', './src', {
timeoutMs: 600_000,
});
} else if (error instanceof RateLimitError) {
// Back off and retry — the SDK does not auto-retry rate limits
await new Promise(r => setTimeout(r, 5000));
} else if (error instanceof NetworkError) {
// Check ULUOPS_REGISTRY_URL and ULUOPS_SUBMISSION_URL environment variables
console.error('Connection failed. Verify API URLs and network access.');
}
}Security
Tool Allowlist
Agent definitions can request tools (e.g., tools: ['bash'] in YAML), but the operator must explicitly permit them. This separates the trust boundary: definition authors declare what they need, operators decide what they permit.
By default, all tools except bash are allowed. The bash tool passes LLM-generated command strings to sh -c, granting full host OS access scoped to the working directory. Only enable it in sandboxed environments (containers, CI).
// Default: bash blocked even if definition requests it
const client = new UluOpsClient({});
// Explicit opt-in for containerized environments
const client = new UluOpsClient({
allowedTools: ['bash'],
});Or via environment variable:
ULUOPS_ALLOWED_TOOLS=bashFilesystem Sandboxing
The ToolHandler restricts LLM file operations to the target directory:
- Path traversal prevention with separator-aware prefix matching
- Symlink resolution via
fs.realpath()to detect escape attempts - Fail-closed on filesystem errors (dangling symlinks, race conditions)
- macOS
/tmp→/private/tmpsymlink handling
Preflight Checks
CDL command definitions can declare prerequisite checks (file existence, git state, tool availability) that run before agent execution. Preflight command checks:
- Execute in the target directory (
cwd = input.target), matching the execution context offile_existsandgit_cleanchecks - Are restricted to a read-only allowlist:
test,git,grep,find,ls,cat,head,tail,wc,which,command, and shell built-ins ([,true,false,echo) - Reject shell metacharacters (
;,|,&&,`,$()), interpreter eval flags (-e,-c), and chaining operators - Quote
$ARGUMENTSsubstitutions viashellQuote()to prevent CWE-78 injection
Package managers (npm, pip), orchestrators (docker, kubectl), build tools (make, cargo), and interpreters (node, python) are not permitted in preflight — they have broad side-effect authority that doesn't belong in prerequisite checks. The security boundary for preflight commands is supply-chain trust in the definition author, not runtime effect confinement.
Dependencies
| Package | Purpose |
|---------|---------|
| @uluops/sdk-core | Shared HTTP infrastructure (HttpClient, errors, auth) |
| @uluops/registry-sdk | Registry API client for definitions, models, and server-side normalization (?normalize=true) |
| @uluops/ops-sdk | Validation tracking API client |
| ai | Vercel AI SDK v6 - LLM communication and tool loops |
| @ai-sdk/anthropic | Anthropic provider for AI SDK |
| @ai-sdk/openai | OpenAI provider for AI SDK |
| @ai-sdk/google | Google/Gemini provider for AI SDK |
| yaml | YAML parsing for local definitions |
| glob | File globbing for ToolHandler |
| zod | Schema validation for AI SDK tools |
Development
# Install dependencies
npm install
# Type check
npm run typecheck
# Run tests
npm test
# Run tests in watch mode
npm run test:watch
# Run tests with coverage
npm run test:coverage
# Lint
npm run lint
# Build
npm run buildMaintainers
- Alex Self (@aself101) — architecture, execution engine, AI integration
- Claude (Anthropic) — implementation, validation, documentation
License
MIT — Copyright (c) 2026 Uluops. See LICENSE for details.
