@galileodev/meta

v0.9.4

**Cognitive strategy layer — structured prompt composition, self-consistency validation, and autonomous prompt evolution.**

Overview
@galileodev/meta is the meta-prompting engine for the Galileo ecosystem. While @galileodev/core handles what the pipeline does (generate, reflect, curate), meta handles how prompts are composed and improved over time. It replaces ad-hoc string concatenation with a structured template system, validates that LLM outputs respect declared constraints, and autonomously evolves templates to improve pipeline performance.
The package implements three middleware components: PromptBuilder for template-driven prompt composition, ConsistencyValidator for two-tier post-generation checking, and RatchetOptimizer for autonomous prompt evolution through controlled experiments.
Architecture
```
┌─────────────────────────────────────────────────┐
│                @galileodev/meta                 │
│                                                 │
│  ┌──────────────────────────────────────────┐   │
│  │              PromptBuilder               │   │
│  │  ┌──────────┐ ┌──────────┐ ┌──────────┐  │   │
│  │  │ Template │ │ Registry │ │Tokenizer │  │   │
│  │  │ Defaults │ │ (JSONL)  │ │(tiktoken)│  │   │
│  │  └──────────┘ └──────────┘ └──────────┘  │   │
│  └──────────────────────────────────────────┘   │
│                                                 │
│  ┌──────────────────────────────────────────┐   │
│  │           ConsistencyValidator           │   │
│  │  ┌──────────────┐ ┌──────────────────┐   │   │
│  │  │  Rule-Based  │ │   LLM-Judged     │   │   │
│  │  │  Constraints │ │   Constraints    │   │   │
│  │  └──────────────┘ └──────────────────┘   │   │
│  └──────────────────────────────────────────┘   │
│                                                 │
│  ┌──────────────────────────────────────────┐   │
│  │             RatchetOptimizer             │   │
│  │  ┌───────────┐ ┌──────────┐ ┌───────┐    │   │
│  │  │ Experiment│ │ Evaluator│ │Ratchet│    │   │
│  │  │  Runner   │ │ (Metrics)│ │ Guard │    │   │
│  │  └───────────┘ └──────────┘ └───────┘    │   │
│  └──────────────────────────────────────────┘   │
└─────────────────────────────────────────────────┘
```

Core Modules
PromptBuilder
Replaces ad-hoc prompt string building with a declarative template system. Templates use {{slot}} placeholders, declare their own constraints, are versioned, and track lineage via parentId.
PromptTemplate — The core data structure:
- `id` / `version` — Unique identifier and version number
- `stage` — Which pipeline stage this template serves (generator, reflector, curator, decomposer, project-planner)
- `slots` — Typed slot definitions (`string`, `entries`, `artifacts`, `lessons`) with required/optional flags
- `sections` — Ordered content blocks with `{{slot}}` placeholders
- `constraints` — Declared validation rules (`output-format`, `content-rule`, `language`, `consistency`)
- `metadata.parentId` — Lineage tracking for template evolution
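The fields above can be pictured with a minimal sketch. The interface and literal below are illustrative assumptions based on the field descriptions, not the package's actual types, and the naive `fill` helper is invented here to show how `{{slot}}` placeholders get substituted:

```typescript
// Hypothetical shape of a PromptTemplate — field details are assumptions.
interface SketchTemplate {
  id: string;
  version: number;
  stage: string;
  slots: { name: string; type: string; required: boolean }[];
  sections: { title: string; body: string }[]; // bodies contain {{slot}} placeholders
  constraints: { kind: string; value: string }[];
  metadata: { parentId: string | null };
}

const template: SketchTemplate = {
  id: 'tpl_generator_demo',
  version: 1,
  stage: 'generator',
  slots: [{ name: 'instruction', type: 'string', required: true }],
  sections: [{ title: 'Task', body: 'Implement the following: {{instruction}}' }],
  constraints: [{ kind: 'language', value: 'typescript' }],
  metadata: { parentId: null }, // a derived variant would point at its parent here
};

// Naive slot filling for illustration: replace each {{name}} with its value.
function fill(body: string, slots: Record<string, string>): string {
  return body.replace(/\{\{(\w+)\}\}/g, (_, name) => slots[name] ?? '');
}

console.log(fill(template.sections[0].body, { instruction: 'add rate limiting' }));
// → Implement the following: add rate limiting
```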
PromptBuilder — The rendering engine:
- `build(stage, slots)` → `RenderedPrompt` — Fills all slots, validates required fields, assembles sections in order
- `getTemplate(stage)` → Active template for a stage
- `registerTemplate(template)` — Used by the optimizer to install new template versions
TemplateRegistry — JSONL-backed persistent storage:
- Append-only `.galileo/templates.jsonl` for full history
- `.galileo/templates-active.json` maps stage → active template ID
- Supports listing all templates, querying by stage, and activating specific versions
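The append-only semantics can be sketched in memory. This is not the package's `TemplateRegistry` (which persists JSONL to disk); the class and method names below are invented for illustration:

```typescript
// In-memory sketch of append-only registry semantics: history is never
// rewritten, and a separate map tracks the active template per stage.
type Tpl = { id: string; stage: string; version: number };

class SketchRegistry {
  private history: Tpl[] = [];                 // append-only, full version history
  private active = new Map<string, string>();  // stage → active template id

  register(t: Tpl): void {
    this.history.push(t);                      // never overwrite earlier versions
  }
  activate(stage: string, id: string): void {
    this.active.set(stage, id);
  }
  getActive(stage: string): Tpl | undefined {
    const id = this.active.get(stage);
    return this.history.find((t) => t.id === id);
  }
  listByStage(stage: string): Tpl[] {
    return this.history.filter((t) => t.stage === stage);
  }
}

const reg = new SketchRegistry();
reg.register({ id: 'gen_v1', stage: 'generator', version: 1 });
reg.register({ id: 'gen_v2', stage: 'generator', version: 2 });
reg.activate('generator', 'gen_v2');
console.log(reg.getActive('generator')?.id); // gen_v2 — but gen_v1 stays in history
```

Keeping superseded versions in the history file is what makes `parentId` lineage queries possible later.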
Default Templates
Five built-in templates ship with the package, covering all pipeline stages:
| Template | Stage | Purpose |
|----------|-------|---------|
| DEFAULT_GENERATOR_TEMPLATE | generator | XML-structured prompt for reasoning trajectories + code artifacts |
| DEFAULT_REFLECTOR_TEMPLATE | reflector | Lesson extraction with confidence scoring |
| DEFAULT_CURATOR_TEMPLATE | curator | Utility/harmfulness scoring and delta production |
| DEFAULT_DECOMPOSER_TEMPLATE | decomposer | Breaking user requests into independent, testable steps |
| DEFAULT_PROJECT_PLANNER_TEMPLATE | project-planner | Multi-phase project planning from user goals |
All templates are accessible via ALL_DEFAULT_TEMPLATES for bulk registration.
Token Counting
Local token estimation via js-tiktoken (GPT-4o tokenizer). Used for pre-call budget enforcement so the system doesn't need an API call just to measure cost.
```typescript
import { countTokens } from '@galileodev/meta';

const estimate = countTokens(promptText); // number of tokens
```

ConsistencyValidator
Two-tier post-generation validation that checks LLM outputs against the template's declared constraints:
Tier 1 — Rule-based checks (fast, no LLM call):
- `checkLanguageConstraint` — Verifies output uses the expected programming language
- `checkRangeConstraint` — Validates numeric scores fall within declared bounds (e.g., utility ∈ [0, 1])
- Zod schema validation — Parses structured output against declared schemas (e.g., `zod:ReflectionResponseSchema`)
Tier 2 — LLM-judged checks (deeper, requires LLM call):
- `consistency` constraints — An LLM judge verifies that the output doesn't contradict itself or the prompt's intent
- Only triggered when rule-based checks pass, to avoid wasting tokens on obviously invalid outputs
RatchetOptimizer
Autonomous prompt evolution through controlled experiments. The optimizer generates template variants, evaluates them against historical pipeline inputs, and keeps only improvements — a monotonic ratchet that never regresses.
Optimization flow:
1. Load the current active template for the target stage
2. Cold-start guard — requires ≥3 playbook entries (aborts with a helpful error if not met)
3. Ask an LLM to generate a variant using a configurable directive
4. Run both templates against a test suite (past inputs from the playbook)
5. Evaluate the composite metric with the accuracy gatekeeper
6. If the variant scores higher → commit the new template and register it as active
7. If equal or lower → discard (no commit)
8. Feed a sliding window of the last 3 failures to the next variant generation
9. Repeat up to `maxExperiments`, respecting `tokenBudget`
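The loop above can be sketched generically. This is a simplified stand-in, not the real optimizer: the `proposeVariant` and `score` callbacks, and the representation of templates as strings, are all assumptions made for illustration:

```typescript
// Sketch of the monotonic ratchet: a variant replaces the champion only
// when it scores strictly higher, so the result never regresses.
async function ratchet(
  champion: string,
  proposeVariant: (current: string, recentFailures: string[]) => Promise<string>,
  score: (template: string) => Promise<number>,
  maxExperiments: number,
): Promise<string> {
  let best = champion;
  let bestScore = await score(best);
  const failures: string[] = [];
  for (let i = 0; i < maxExperiments; i++) {
    const variant = await proposeVariant(best, failures.slice(-3)); // last-3 failure window
    const s = await score(variant);
    if (s > bestScore) {
      best = variant;         // commit: variant becomes the new active template
      bestScore = s;
    } else {
      failures.push(variant); // discard, but remember it to steer the next attempt
    }
  }
  return best; // never worse than the starting champion
}
```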
Composite scoring:

```
score = 0                                             if accuracy < threshold
score = accuracy.weight × a + efficiency.weight × e   if accuracy ≥ threshold
```

Evaluators:

- `computeScore` — Composite metric calculation with accuracy gatekeeper
- `computeEfficiency` — Token efficiency relative to a baseline
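A worked sketch of the gatekeeper formula (the real `computeScore` signature may differ; the function name and defaults here are illustrative, using the example weights and threshold from the Usage section below):

```typescript
// Gatekeeper scoring sketch: accuracy below the threshold zeroes the score
// regardless of efficiency, so a fast-but-wrong template can never win.
function compositeScore(
  accuracy: number,
  efficiency: number,
  weights = { accuracy: 0.7, efficiency: 0.3 },
  threshold = 0.7,
): number {
  if (accuracy < threshold) return 0; // gatekeeper: inaccurate templates never win
  return weights.accuracy * accuracy + weights.efficiency * efficiency;
}

console.log(compositeScore(0.6, 1.0)); // gated to 0 despite perfect efficiency
console.log(compositeScore(0.8, 0.5)); // 0.7 × 0.8 + 0.3 × 0.5 ≈ 0.71
```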
Experiment tracking:
- `.galileo/experiments.jsonl` — Full audit trail of every variant tried and scored
- Each `ExperimentResult` records: `templateId`, `parentId`, `score`, `accuracy`, `efficiency`, `tokensUsed`, `commitSha`, `reverted`
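For a sense of what one audit-trail entry looks like, here is a made-up record with the fields listed above (the values are invented for illustration, not real output):

```typescript
// Illustrative ExperimentResult record; every value here is fabricated.
const record = {
  templateId: 'tpl_generator_v2',
  parentId: 'tpl_generator_v1',
  score: 0.71,
  accuracy: 0.8,
  efficiency: 0.5,
  tokensUsed: 1840,
  commitSha: null as string | null, // set when a winning variant is committed
  reverted: true,                   // this variant lost and was discarded
};

// JSONL = one JSON object per line, appended to the experiments file.
const jsonlLine = JSON.stringify(record);
console.log(jsonlLine.includes('\n')); // false — a record never spans lines
```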
API Surface
```typescript
// Builder
export { PromptBuilder, TemplateRegistry, countTokens };
export { DEFAULT_GENERATOR_TEMPLATE, DEFAULT_REFLECTOR_TEMPLATE,
         DEFAULT_CURATOR_TEMPLATE, DEFAULT_DECOMPOSER_TEMPLATE,
         DEFAULT_PROJECT_PLANNER_TEMPLATE, ALL_DEFAULT_TEMPLATES };

// Validator
export { ConsistencyValidator, checkLanguageConstraint, checkRangeConstraint };

// Optimizer
export { RatchetOptimizer, runExperiment, computeScore, computeEfficiency, validateMetric };

// Types & Schemas
export type { PromptTemplate, SlotDefinition, TemplateSection, Constraint, RenderedPrompt };
export type { ValidationResult, ConstraintViolation, ValidationOptions };
export type { ExperimentConfig, ExperimentResult, MetricDefinition };
export { PromptTemplateSchema, SlotDefinitionSchema, TemplateSectionSchema, ConstraintSchema };
```

Dependencies
| Dependency | Purpose |
|------------|---------|
| @galileodev/core | Pipeline types, LLMProvider interface, playbook types |
| zod | Schema validation for templates and experiment configs |
| js-tiktoken | Local token counting (GPT-4o tokenizer) |
| ulid | Unique ID generation for templates and experiments |
Usage
Building prompts
```typescript
import { PromptBuilder, TemplateRegistry, DEFAULT_GENERATOR_TEMPLATE } from '@galileodev/meta';

const registry = new TemplateRegistry('.galileo');
await registry.register(DEFAULT_GENERATOR_TEMPLATE);

const builder = new PromptBuilder(registry);
const rendered = builder.build('generator', {
  instruction: 'Add rate limiting to the API',
  playbookContext: selectedEntries,
});

console.log(rendered.text);          // The assembled prompt string
console.log(rendered.tokenEstimate); // Pre-call token count
console.log(rendered.constraints);   // Constraints for validation
```

Validating outputs
```typescript
import { ConsistencyValidator } from '@galileodev/meta';

const validator = new ConsistencyValidator(llm);
const result = await validator.validate(llmOutput, rendered.constraints);

if (!result.valid) {
  console.log(result.violations); // Array of ConstraintViolation
}
```

Evolving templates
```typescript
import { RatchetOptimizer } from '@galileodev/meta';

const optimizer = new RatchetOptimizer(registry, llm, '.galileo');
const results = await optimizer.run(
  {
    targetStage: 'generator',
    tokenBudget: 50000,
    maxExperiments: 5,
    directive: 'Improve code quality and reduce verbosity',
    accuracyThreshold: 0.7,
    concurrency: 4,
    metric: {
      accuracy: { weight: 0.7, evaluator: 'schema-pass-rate' },
      efficiency: { weight: 0.3, baseline: 2000 },
    },
  },
  historicalInputs,
);

const wins = results.filter(r => !r.reverted).length;
console.log(`${wins}/${results.length} experiments improved the template`);
```

Testing
```
npm test -w packages/meta
```

Tests cover template rendering, slot validation, registry persistence, token counting, constraint checking (both rule-based and LLM-judged), experiment execution, and metric evaluation.
License
See the root LICENSE file.
