@galileodev/verify

v0.38.4

Published

19 hours ago

**Grounding layer — multi-plugin verification, automated remediation, and metric-driven code optimization.**

d13gomarono13

Overview

@galileodev/verify closes the feedback loop between code generation and real-world correctness. It runs generated code through a battery of verification plugins (TypeScript compiler, ESLint, Semgrep, test runners), orchestrates automated remediation when verification fails, and runs metric-driven optimization loops.

The package implements Galileo's "AC/DC" (Assemble Context / Deliver Code) verification cycle: generate → verify → remediate → re-verify, with a configurable budget and cycle limit. It also provides Karpathy-style tree search for parallel code optimization against arbitrary metrics.

Architecture

┌──────────────────────────────────────────────────────┐
│                  @galileodev/verify                    │
│                                                       │
│  ┌────────────────────────────────────────────────┐  │
│  │              ACDCOrchestrator                    │  │
│  │                                                  │  │
│  │  1. Guide ──→ 2. Generate ──→ 3. Verify         │  │
│  │       ↑                           │              │  │
│  │       │         ┌─── pass ←───────┤              │  │
│  │       │         │                 │              │  │
│  │       │         │          4. Solve (fail)       │  │
│  │       │         │                 │              │  │
│  │       │         │         Re-verify              │  │
│  │       │         │                 │              │  │
│  │       └─────────┴── next cycle ←──┘              │  │
│  └────────────────────────────────────────────────┘  │
│                                                       │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  │
│  │  Verifiers  │  │   Solve     │  │  Karpathy   │  │
│  │  (4 plugins)│  │   Agent     │  │  Tree Search│  │
│  └─────────────┘  └─────────────┘  └─────────────┘  │
└──────────────────────────────────────────────────────┘

Core Modules

ACDCOrchestrator

The top-level orchestrator that wires together the complete verification cycle. Each cycle consists of four phases:

1. Guide: GuideComposer assembles context about the project (constraints, file tree, playbook entries)
2. Generate: Core Pipeline runs the G→R→C loop (delegated to @galileodev/core)
3. Verify: VerifierRunner executes all registered verification plugins
4. Solve: On failure, SolveAgent attempts automated remediation

The orchestrator manages cycle limits (maxCycles), token budgets (TokenBudget), selection mode propagation for A/B evaluation, checkpoint pausing (configurable per-stage via CheckpointHandler), and feedback recording (wiring verification outcomes back to the playbook via FeedbackRecorder).

const result = await orchestrator.run({
  taskId: 'task-1',
  instruction: 'Add rate limiting',
  projectDir: '/path/to/project',
  llm,
  maxCycles: 3,
  budget: new TokenBudget(100000),
  selectionMode: 'aligned',
});

result.passed;       // boolean — did final verification pass?
result.cycles;       // number — how many cycles were needed
result.generation;   // GenerationResult from core pipeline
result.verification; // VerificationReport with findings
result.remediation;  // SolveResult (if remediation was attempted)
result.feedback;     // FeedbackResult from the recorder
result.metrics;      // MetricRecord for the eval harness
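
The control flow described in this section can be sketched as a plain loop. This is a minimal illustration under stated assumptions, not the package's implementation: `Phases`, `SimpleBudget`, and `runCycles` are hypothetical names, and the flat token cost per phase is invented for the example.

```typescript
// Hypothetical sketch of a generate → verify → remediate → re-verify loop.
// None of these names come from @galileodev/verify; they only illustrate the flow.

interface Phases {
  generate(cycle: number): string; // returns candidate code
  verify(code: string): boolean;   // true when every check passes
  remediate(code: string): string; // returns patched code
}

class SimpleBudget {
  private remaining: number;

  constructor(tokens: number) {
    this.remaining = tokens;
  }

  // Spend tokens if enough are left; report whether the spend succeeded.
  trySpend(tokens: number): boolean {
    if (tokens > this.remaining) return false;
    this.remaining -= tokens;
    return true;
  }
}

interface CycleOutcome {
  passed: boolean;
  cycles: number;
  remediated: boolean;
}

function runCycles(
  phases: Phases,
  maxCycles: number,
  budget: SimpleBudget,
  costPerCall = 1000, // assumed flat token cost per LLM-backed phase
): CycleOutcome {
  let cycles = 0;
  let remediated = false;
  while (cycles < maxCycles) {
    if (!budget.trySpend(costPerCall)) break; // budget exhausted before generating
    cycles += 1;
    let code = phases.generate(cycles);
    if (phases.verify(code)) return { passed: true, cycles, remediated };

    // Verification failed: attempt remediation, then re-verify.
    if (!budget.trySpend(costPerCall)) break;
    code = phases.remediate(code);
    remediated = true;
    if (phases.verify(code)) return { passed: true, cycles, remediated };
  }
  return { passed: false, cycles, remediated };
}
```

The loop stops on whichever comes first: a passing verification, the cycle limit, or an exhausted budget.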

Verifier Plugins

The package ships four verification plugins, all of which implement the VerifierPlugin interface:

| Plugin | What It Checks | Severity |
|--------|----------------|----------|
| TscVerifier | TypeScript type errors via tsc --noEmit | error |
| EslintVerifier | Lint violations via ESLint | error / warning |
| SemgrepVerifier | Security vulnerabilities via Semgrep static analysis | error / warning |
| TestRunnerVerifier | Test suite execution (detects test failures) | error |

Each plugin reports VerificationFinding[] with: verifierId, severity, ruleId, message, file, and line.
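
To make the field list concrete, here is one possible shape for a finding. The interface below is reconstructed from the field names above and is only a sketch: `FindingSketch`, `formatFinding`, the exact field types, and the sample values (`'tsc'`, `'TS2322'`) are assumptions, and the real VerificationFinding type may differ (for example, some fields may be optional).

```typescript
// Assumed shape, reconstructed from the documented field names.
type FindingSeverity = 'error' | 'warning';

interface FindingSketch {
  verifierId: string;
  severity: FindingSeverity;
  ruleId: string;
  message: string;
  file: string;
  line: number;
}

// Render a finding in a compact "file:line severity [verifier/rule] message" form.
function formatFinding(f: FindingSketch): string {
  return `${f.file}:${f.line} ${f.severity} [${f.verifierId}/${f.ruleId}] ${f.message}`;
}

// Illustrative values only; the identifiers a real plugin emits are not documented here.
const exampleFinding: FindingSketch = {
  verifierId: 'tsc',
  severity: 'error',
  ruleId: 'TS2322',
  message: "Type 'string' is not assignable to type 'number'.",
  file: 'src/index.ts',
  line: 12,
};

console.log(formatFinding(exampleFinding));
```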

VerifierRunner is the execution engine:

- Registers plugins via .register(plugin)
- Checks plugin availability before execution (gracefully skips unavailable tools)
- Supports failFast mode (stop on first plugin failure) or full-suite mode
- Aggregates findings into a VerificationReport with passed flag and summary counts
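
The runner behaviour listed above can be approximated in a few lines. This sketch is not the real VerifierRunner: `PluginSketch`, `RunnerSketch`, `ReportSketch`, the synchronous `run()` signature, and the rule that only error-severity findings fail a run are all assumptions made for illustration.

```typescript
type Level = 'error' | 'warning';

interface Finding {
  verifierId: string;
  severity: Level;
  message: string;
}

// Hypothetical plugin contract; the real VerifierPlugin interface may differ.
interface PluginSketch {
  id: string;
  isAvailable(): boolean;
  run(): Finding[];
}

interface ReportSketch {
  passed: boolean;
  findings: Finding[];
  skipped: string[]; // ids of plugins whose tool was unavailable
  summary: { errors: number; warnings: number };
}

class RunnerSketch {
  private plugins: PluginSketch[] = [];
  private failFast: boolean;

  constructor(options: { failFast: boolean }) {
    this.failFast = options.failFast;
  }

  register(plugin: PluginSketch): this {
    this.plugins.push(plugin);
    return this;
  }

  runAll(): ReportSketch {
    const findings: Finding[] = [];
    const skipped: string[] = [];
    for (const plugin of this.plugins) {
      if (!plugin.isAvailable()) {
        skipped.push(plugin.id); // skip instead of failing the whole run
        continue;
      }
      const result = plugin.run();
      findings.push(...result);
      const failed = result.some((f) => f.severity === 'error');
      if (failed && this.failFast) break; // stop at the first failing plugin
    }
    const errors = findings.filter((f) => f.severity === 'error').length;
    const warnings = findings.length - errors;
    // Assumption: warnings alone do not fail the report.
    return { passed: errors === 0, findings, skipped, summary: { errors, warnings } };
  }
}
```

With `failFast: true`, plugins registered after the first failing one never run, so the report may undercount findings; full-suite mode trades extra run time for a complete picture.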

GuideComposer

Assembles project context for the generation step:

- Project constraints: Loaded from .galileo/constraints.json (project-specific rules, patterns, restrictions)
- File tree scanning: scanFileTree() produces a compact project structure representation
- Playbook entries: Relevant memories from the store
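
As an illustration of what a "compact project structure representation" might look like, the sketch below renders a list of relative paths as an indented tree. It is a hypothetical stand-in: `renderTree` is not part of the package, the output format of scanFileTree() is not documented here, and the real function presumably reads the file system itself rather than taking a path list.

```typescript
// Hypothetical tree renderer; not the package's scanFileTree().
interface TreeNode {
  children: Map<string, TreeNode>;
}

function renderTree(paths: string[]): string {
  const root: TreeNode = { children: new Map() };

  // Build a nested structure from slash-separated relative paths.
  for (const path of paths) {
    let node = root;
    for (const part of path.split('/').filter((s) => s.length > 0)) {
      let child = node.children.get(part);
      if (!child) {
        child = { children: new Map() };
        node.children.set(part, child);
      }
      node = child;
    }
  }

  // Emit one line per entry, indented by depth; directories get a trailing slash.
  const lines: string[] = [];
  const walk = (node: TreeNode, depth: number): void => {
    const names = Array.from(node.children.keys()).sort();
    for (const name of names) {
      const child = node.children.get(name)!;
      const suffix = child.children.size > 0 ? '/' : '';
      lines.push('  '.repeat(depth) + name + suffix);
      walk(child, depth + 1);
    }
  };
  walk(root, 0);
  return lines.join('\n');
}

console.log(renderTree(['src/index.ts', 'src/utils/a.ts', 'package.json']));
```

A representation like this keeps the context small because each directory name appears once instead of being repeated in every path.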

SolveAgent

SolveAgent performs automated remediation when verification fails. The solve agent:

- Receives verification findings and project context
- Generates hypotheses and patches via LLM
- Applies patches and re-verifies
- Supports configurable retry limits per finding (maxRetriesPerFinding)
- Operates within the shared TokenBudget
- Returns SolveResult with resolved and unresolved findings, plus RemediationAttempt[] history
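
The retry and budget behaviour above can be sketched as follows. Everything here is illustrative: `solveAll`, `FindingRef`, `Attempt`, and `SolveOutcome` are hypothetical names, `tryPatch` stands in for the "generate patch, apply, re-verify" step, and treating maxRetriesPerFinding as a cap on total attempts per finding is an assumption.

```typescript
// Hypothetical sketch of per-finding retries under a shared budget.
interface FindingRef {
  id: string;
}

interface Attempt {
  findingId: string;
  attempt: number;
  succeeded: boolean;
}

interface SolveOutcome {
  resolved: string[];
  unresolved: string[];
  attempts: Attempt[];
}

function solveAll(
  findings: FindingRef[],
  tryPatch: (finding: FindingRef, attempt: number) => boolean,
  maxRetriesPerFinding: number,
  budget: { remaining: number }, // shared, mutated in place
  costPerAttempt: number,        // assumed flat cost per attempt
): SolveOutcome {
  const outcome: SolveOutcome = { resolved: [], unresolved: [], attempts: [] };
  for (const finding of findings) {
    let fixed = false;
    for (let attempt = 1; attempt <= maxRetriesPerFinding && !fixed; attempt++) {
      if (budget.remaining < costPerAttempt) break; // shared budget exhausted
      budget.remaining -= costPerAttempt;
      fixed = tryPatch(finding, attempt);
      outcome.attempts.push({ findingId: finding.id, attempt, succeeded: fixed });
    }
    (fixed ? outcome.resolved : outcome.unresolved).push(finding.id);
  }
  return outcome;
}
```

Because the budget is shared, a finding that burns all its retries early in the list reduces what is left for the findings after it.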

Karpathy Tree Search

KarpathyLoop performs parallel UCB (Upper Confidence Bound) exploration for metric-driven code optimization, inspired by Andrej Karpathy's approach to test-time compute.

Use case: Optimize a specific file against a measurable metric (e.g., bundle size, test coverage, performance benchmark).

const loop = new KarpathyLoop();
const result = await loop.run({
  metric: 'bundle-size',
  command: 'du -sb dist | cut -f1',
  target: 'src/index.ts',
  direction: 'minimize',
  maxIterations: 10,
  parallelBranches: 3,
});

Each iteration generates multiple candidate modifications in parallel, evaluates them against the metric, and selects the best-performing branch using UCB scoring. The tree grows by exploring promising branches while balancing exploration vs. exploitation.
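
For readers unfamiliar with UCB, the sketch below shows the textbook UCB1 rule, which scores a branch as its mean reward plus an exploration bonus of c·sqrt(ln N / n), where N is the total number of evaluations and n the number for that branch. This is the standard formula, not necessarily the exact scoring KarpathyLoop uses; `Branch`, `toReward`, `ucb1`, and `selectBranch` are hypothetical names.

```typescript
// Textbook UCB1 sketch; the package's actual scoring rule may differ.
interface Branch {
  id: string;
  visits: number;      // how many times this branch has been evaluated
  totalReward: number; // sum of rewards, where higher is better
}

// Convert a raw metric into a reward so that "higher is better" always holds.
function toReward(metric: number, direction: 'minimize' | 'maximize'): number {
  return direction === 'maximize' ? metric : -metric;
}

function ucb1(branch: Branch, totalVisits: number, c: number = Math.SQRT2): number {
  if (branch.visits === 0) return Infinity; // always try unvisited branches first
  const mean = branch.totalReward / branch.visits;
  return mean + c * Math.sqrt(Math.log(totalVisits) / branch.visits);
}

function selectBranch(branches: Branch[]): Branch {
  if (branches.length === 0) throw new Error('no branches to select from');
  const totalVisits = branches.reduce((sum, b) => sum + b.visits, 0);
  let best = branches[0];
  for (const b of branches) {
    if (ucb1(b, totalVisits) > ucb1(best, totalVisits)) best = b;
  }
  return best;
}
```

UCB1 assumes rewards on a comparable scale (classically between 0 and 1), so a practical implementation would likely normalize raw metrics such as byte counts before scoring. The bonus term shrinks as a branch accumulates visits, which is what shifts the search from exploration toward exploitation.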

API Surface

// Orchestrator
export { ACDCOrchestrator };

// Guide
export { GuideComposer, loadConstraints, scanFileTree };

// Verifiers
export { VerifierRunner, TscVerifier, TestRunnerVerifier, EslintVerifier, SemgrepVerifier };

// Solve
export { SolveAgent };

// Karpathy Tree Search
export { KarpathyLoop };

// Types
export type {
  GuideContext, GuideOptions, ProjectConstraint, RenderedTemplate,
  VerificationFinding, VerificationReport, VerifyTarget, VerifierPlugin,
  SolveContext, SolveResult, RemediationAttempt,
  KarpathyMetric, KarpathyConfig, KarpathyResult, KarpathyExperiment,
  ACDCInput, ACDCResult,
};

Dependencies

| Dependency | Purpose |
|------------|---------|
| @galileodev/core | Pipeline, playbook store, LLM types, token budget, execution sandbox, event bus |
| zod | Schema validation for configuration and inputs |

External tools (not npm dependencies; they must be installed in the project):

- tsc (TypeScript compiler)
- eslint
- semgrep (optional; Semgrep CLI for security analysis)
- A test runner (vitest, jest, mocha, etc.)

Usage

Standalone verification

import {
  VerifierRunner, TscVerifier, TestRunnerVerifier,
  EslintVerifier, SemgrepVerifier,
} from '@galileodev/verify';
import { ExecutionSandbox } from '@galileodev/core';

const verifier = new VerifierRunner({ failFast: false });
verifier.register(new TscVerifier());
verifier.register(new TestRunnerVerifier());
verifier.register(new EslintVerifier());
verifier.register(new SemgrepVerifier());

const report = await verifier.runAll({
  workingDir: '/path/to/project',
  sandbox: new ExecutionSandbox(),
});

console.log(report.passed);          // boolean
console.log(report.summary.errors);  // number
console.log(report.findings);        // VerificationFinding[]

Full AC/DC cycle

import { ACDCOrchestrator, GuideComposer, VerifierRunner, SolveAgent } from '@galileodev/verify';
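// TokenBudget is assumed to be exported by @galileodev/core, which the Dependencies table lists as the home of the token budget.
import { Pipeline, Generator, Reflector, Curator, TokenBudget, /* ... */ } from '@galileodev/core';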

const pipeline = new Pipeline({ store, generator, reflector, curator, eventBus, embeddings, llm });
const composer = new GuideComposer(store, projectDir);
const verifier = new VerifierRunner({ failFast: false });
// ... register verifiers ...
const solver = new SolveAgent();

const orchestrator = new ACDCOrchestrator(
  pipeline, composer, verifier, solver, feedbackRecorder, eventBus,
);

const result = await orchestrator.run({
  taskId: 'build-001',
  instruction: 'Add input validation to the user endpoint',
  projectDir: '/path/to/project',
  llm,
  maxCycles: 3,
  budget: new TokenBudget(100000),
  selectionMode: 'aligned',
});

Metric-driven optimization

import { KarpathyLoop } from '@galileodev/verify';

const loop = new KarpathyLoop();
const result = await loop.run({
  metric: 'test-coverage',
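  // Assumes Vitest's json-summary coverage reporter, which writes coverage/coverage-summary.json by default.
  command: 'npx vitest run --coverage --coverage.reporter=json-summary >/dev/null && jq .total.lines.pct coverage/coverage-summary.json',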
  target: 'src/utils.ts',
  direction: 'maximize',
  maxIterations: 8,
  parallelBranches: 3,
});

console.log(result.bestScore);       // Best metric value achieved
console.log(result.experiments);     // Full exploration tree

Testing

npm test -w packages/verify

Tests include unit tests for each verifier plugin, integration tests for the full AC/DC cycle (with mock verifiers), feedback recording integration tests, checkpoint handler tests, and Karpathy tree search tests.

License

See the root LICENSE file.
