@x12i/optimixer

v3.5.2

Published

11 days ago

Resilient scenario-based runtime optimization for AI workloads; ai.max_tokens.v1 with bundled ai-profiles model resolution, warmup tolerance, and Activix learning

x12i

optimixer activix max-tokens prediction ai ai-profiles resilience

@x12i/optimixer

Version 3.3.0 — resilience release

Part of the Activitix monorepo.

Predict-time normalization for modelProfile: concrete vendor models, gateway wire ids, and @x12i/ai-profiles profile+choice keys — without host pre-resolution. Warmup tolerates messy historical Activix rows; predict and init stay up.

Dependencies: @x12i/ai-profiles ^3.0.0 · @x12i/activix-contracts · optional peer @x12i/activix ^8.3.1 for learning mode.

Resilience guarantees (3.3.0)

| Guarantee | Behavior |
|-----------|----------|
| Single modelProfile entry point | Pass concrete { provider, model }, wire id, or profile+choice (cheap/default); Optimixer resolves via bundled resolveBundledInput |
| No host pre-resolve | xynthesis / ai-tasks PRE / studio pass keys as stored; Optimixer normalizes before caps + reasoning checks |
| CR-14 effort reconcile | Historical reasoningEffort: not-applicable + reasoning-capable catalog model → promotes to low (or modelProfile.reasoning.effort) |
| Warmup never fails init on one bad row | Invalid rows skipped with optimixer.warmup.row.skipped debug logs; valid rows still load |
| Cold-start never throws | Missing history → conservative budget + labeled confidence |
| Unknown models | UNKNOWN_MODEL_DEFAULTS fallback; caller overrides still apply |
| Complete is best-effort | Missing requestId or persistence errors do not throw |
| Gateway boundary | MAIN HTTP invoke stays concrete-only; Optimixer is predict-only |

Package boundaries

@x12i/ai-profiles   profile+choice + catalog caps (sync bundled, text lane)
@x12i/optimixer     predict budget, reasoning reconcile, warmup, learning buckets
gateway / ai-skills MAIN HTTP   concrete provider + model only

See .docs/ai-profiles-boundary.md for ownership detail.

What Optimixer is

Optimixer is a scenario-based runtime optimization engine for AI workloads.

It predicts execution decisions before the call, observes what really happened after the call, and improves future executions across templates, models, reasoning modes, and usage profiles.

It is not a “max tokens utility.” Max-token optimization is the first implemented scenario.

Every scenario follows the same lifecycle:

Before execution:  predict the best runtime decision
During execution:    apply the decision
After execution:     observe actual outcome
Over time:           learn better decisions for similar future executions

Optimixer decides. Activix remembers. The AI wrapper applies.

Optimixer owns prediction and learning. Activix stores durable evidence. Your gateway or task runner orchestrates the flow.

First scenario: `ai.max_tokens.v1`

Today's shipped pipeline is ai.max_tokens.v1 (algorithm 3.0.0): learn the right max completion budget for each AI call — from historical evidence, model/reasoning profile, declared output intent, and acceptable risk.

Future scenarios may include model selection, reasoning effort, retry policy, context compression, provider fallback, cost/latency prediction, and output repair. They reuse the same predict → execute → observe → learn pattern.

Why max tokens is a good first scenario

Max tokens is a strong first use case because it is frequent, measurable, easy to observe from provider usage, easy to learn from, and directly tied to failures (truncation) and waste (over-allocation). Every LLM call has a completion budget; every response reports usage.

Why not “just set max tokens very high”?

A huge static budget feels safe but at enterprise scale it is the wrong operating model:

- Hides execution design: you stop learning that classification needs ~120 tokens, extraction ~900, a reasoning finalizer ~5,000. Everything becomes “give it a lot and hope.”
- Increases waste: models ramble, JSON bloats, latency grows, and retry behavior gets harder to reason about across millions of calls.
- Does not solve context-window pressure: output budget competes with system prompt, retrieval, tools, and memory. Oversized output leaves less safe room for input and context.
- Makes failures harder to diagnose: when everything uses 16k, you cannot tell whether failure was input size, model change, reasoning burn, or template drift. Optimixer records what was predicted, why, what was used, and whether the call was under- or over-allocated.
- Reasoning models need a split budget: completion may cover hidden reasoning plus visible output; one large number does not express that.

The right budget is the smallest safe budget for the specific task — not the largest number that fits in a form field.

Max-token optimization is how Optimixer removes manual runtime guessing from enterprise AI execution:

Manual guess:     max_tokens = 4000 everywhere

Optimixer:        template A → 300
                  template B → 900
                  template C → 1,800
                  template D + reasoning → 5,000
                  sparse history → related evidence + labeled confidence

At enterprise scale, execution settings should be learned from historical evidence, not copied from a spreadsheet.

Technical overview

Optimixer uses @x12i/activix-contracts for record shapes and client interfaces, and persists prediction and learning evidence through an Activix-compatible client (embedded or standalone).

Deeper architecture: explained.md · Consumer migration: ../../.docs/MIGRATION-CONSUMERS.md

Install

Embedded mode (recommended — you already run Activix):

npm install @x12i/optimixer@^3.3.0 @x12i/activix@^8.0.0

@x12i/activix-contracts is installed automatically as a dependency of Optimixer.

Standalone mode (Optimixer owns Activix wiring):

npm install @x12i/optimixer@^3.3.0 @x12i/activix@^8.0.0

Peer dependency: @x12i/activix ^8.3.1. It is optional at runtime: required when using embedded or standalone persistence, not for predict-only.

Migrating to 3.0.0 (breaking)

Every predictAiMaxTokens call must include:

reasoningEffort: not-applicable | none | low | medium | high
outputIntent: { mode: 'fixed' | 'relative', expectedVisibleTokens: number, outputToInputRatio?: number }

Optimixer no longer guesses output size from input alone. Input tokens affect context-window fit and cost; output budget is driven by your declared intent plus reasoning reserve.

See Token budget model and ../../.docs/MIGRATION-CONSUMERS.md.

3.3.0 — modelProfile resilience

Optimixer is the only place that should resolve profile+choice for prediction. It uses resolveBundledInput from @x12i/ai-profiles (3.0.0+, sync bundled, catalogLane: 'text'), the same path as ai-tasks PRE.

Accepted `modelProfile` shapes

| Shape | Example | Resolved by |
|-------|---------|-------------|
| Concrete vendor + model | { provider: 'openrouter', model: 'google/gemini-2.5-flash-lite' } | Bundled models catalog |
| Gateway wire id | { model: 'openrouter/openai/gpt-5.5' } | Catalog (gateway prefix split) |
| Profile+choice | { profileChoice: 'cheap/default' } or { model: 'cyber/deep_forensics' } | @x12i/ai-profiles registry (strict profile/choice) → catalog caps |

Legacy field modelProfile.alias is accepted as an alias for profileChoice (Optimixer field name, not ai-profiles registry alias). Bare profile keys (cheap) and registry shortcuts are not supported — use explicit profile/choice (ai-profiles 3.0.0).

Reasoning + historical rows

When the bundled catalog marks the resolved model as reasoning-capable, Optimixer sets modelProfile.reasoning.enabled: true. If the caller (or an Activix warmup row) still has reasoningEffort: 'not-applicable', normalization promotes the effort to low (or to modelProfile.reasoning.effort), so xynthesis PRE init succeeds on old rows recorded before reasoning metadata existed.

Live predict with an active effort on a non-reasoning model still throws OPTIMIXER_REASONING_PROFILE_INCONSISTENT (CR-12 inverse case).
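The two rules above (promotion on reasoning-capable models, rejection of an active effort on non-reasoning models) can be sketched as a small pure function. This is only an illustration: the function, interface, and class names are hypothetical, and the assumption that modelProfile.reasoning.effort takes precedence over low when both apply is ours, not something the package confirms. Only the effort values and the error code come from this README.

```typescript
// Hypothetical names; only the effort values and the error code come from this README.
type ReasoningEffortSketch = 'not-applicable' | 'none' | 'low' | 'medium' | 'high';

interface ReconcileInputSketch {
  requestedEffort: ReasoningEffortSketch;
  catalogReasoningCapable: boolean;          // what the catalog says about the resolved model
  profileEffort?: 'low' | 'medium' | 'high'; // modelProfile.reasoning.effort, if set
}

class ReasoningProfileErrorSketch extends Error {
  readonly code = 'OPTIMIXER_REASONING_PROFILE_INCONSISTENT';
}

function reconcileEffortSketch(input: ReconcileInputSketch): ReasoningEffortSketch {
  const { requestedEffort, catalogReasoningCapable, profileEffort } = input;
  const active =
    requestedEffort === 'low' || requestedEffort === 'medium' || requestedEffort === 'high';

  if (catalogReasoningCapable && requestedEffort === 'not-applicable') {
    // CR-14 style promotion: legacy input is upgraded instead of failing.
    // Assumption: an explicit profile effort wins over the 'low' default.
    return profileEffort ?? 'low';
  }
  if (!catalogReasoningCapable && active) {
    // Inverse case: an active effort on a non-reasoning model is rejected.
    throw new ReasoningProfileErrorSketch(
      `effort "${requestedEffort}" requested on a non-reasoning model`,
    );
  }
  return requestedEffort;
}
```

Note that 'none' on a non-reasoning model passes through unchanged here, which matches the error table's wording that only low, medium, and high trigger the inconsistency code.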

Example — profile+choice (PRE / xynthesis)

const prediction = await optimixer.predictAiMaxTokens({
  templateId: 'pre-synthesis',
  inputSize: 1200,
  reasoningEffort: 'low', // legacy callers may still pass 'not-applicable'; CR-14 reconcile promotes it
  outputIntent: { mode: 'fixed', expectedVisibleTokens: 512 },
  modelProfile: { profileChoice: 'cheap/default' },
});
// modelProfile normalized to openrouter + concrete slug + catalog caps before budget

Do not pre-resolve profile+choice in hosts before Optimixer. Gateway / funcx / ai-skills MAIN HTTP must still receive concrete provider + model.

Mismatch on live predict throws OptimixerError — use isOptimixerError(err) / err.code / err.remediationHint.

Functional requirements (FR-OPT)

| FR | Behavior |
|----|----------|
| FR-OPT-1 | Predict works without DB; cold-start never throws; init failures throw Optimixer initialization failed: … |
| FR-OPT-2 | Min 256 tokens, +256 safety margin, structured JSON floor 6144 (outputMode: json + schemaComplexity != none) |
| FR-OPT-3 | Optimixer.create({ persistence: 'predict-only' }): no Activix/Mongo required |
| FR-OPT-4 | Predict returns recommendedMaxTokens, requestId, confidence, bucketKey |
| FR-OPT-5 | completeAiMaxTokensPrediction is best-effort; no-op without requestId; persistence errors do not throw |
| FR-OPT-6 | Model caps from modelProfile; unknown models use UNKNOWN_MODEL_DEFAULTS (see exports) |
| FR-OPT-7 | 3.3.0: modelProfile accepts concrete + profile+choice; bundled @x12i/ai-profiles resolution |
| FR-OPT-8 | 3.3.0: Warmup skips unrecoverable historical rows; init loads all valid evidence |
| FR-OPT-9 | 3.3.0: Reasoning effort reconciled after catalog resolve (CR-14); no warmup init throw on legacy rows |

Token budget model

input tokens     = prompt + messages + tools + context (context fit, cost, latency)
visible output   = caller-declared expected visible tokens (fixed or relative)
reasoning budget = reserve by reasoningEffort (low/medium/high)
max_tokens       = min(generation budget, model.maxOutputTokens, contextWindow - input - margin)

Non-reasoning (reasoningEffort: 'not-applicable'): 5k-word input + yes/no answer → ~10 visible tokens → ~512 max (not thousands).

Reasoning (reasoningEffort: 'medium'): same task → visible ~10 + reasoning reserve ~3000 + margins → ~3500+ max.

Context overflow: 100k input on 128k window clamps output to available headroom regardless of requested budget.
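The max_tokens line of the budget model can be worked through with a few numbers. The sketch below is a minimal illustration, not the package's implementation: the helper and field names are hypothetical, the context margin is assumed to be precomputed by the caller, and flooring the result at zero when the input already overflows the window is our assumption.

```typescript
// Hypothetical helper; the formula is the max_tokens line of the budget model above.
interface ClampInputSketch {
  generationBudget: number; // visible output + reasoning reserve + safety margin
  maxOutputTokens: number;  // model output cap
  contextWindow: number;    // model context window
  inputTokens: number;      // prompt + messages + tools + context
  contextMargin: number;    // reserve kept free in the window (assumed precomputed)
}

function clampMaxTokensSketch(c: ClampInputSketch): number {
  const headroom = c.contextWindow - c.inputTokens - c.contextMargin;
  // Assumption: never return a negative budget when the input overflows the window.
  return Math.max(0, Math.min(c.generationBudget, c.maxOutputTokens, headroom));
}

// Small request on a large window: the generation budget is the binding limit.
const roomyBudget = clampMaxTokensSketch({
  generationBudget: 512, maxOutputTokens: 16384,
  contextWindow: 128000, inputTokens: 7500, contextMargin: 2560,
}); // 512

// 100k input on a 128k window: headroom (128000 - 100000 - 2560) is the binding limit.
const overflowBudget = clampMaxTokensSketch({
  generationBudget: 30000, maxOutputTokens: 32768,
  contextWindow: 128000, inputTokens: 100000, contextMargin: 2560,
}); // 25440

console.log(roomyBudget, overflowBudget);
```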

Quick start

import { Activix } from '@x12i/activix';
import { Optimixer } from '@x12i/optimixer';

const activix = await Activix.create({
  collection: 'ai-gateway-activities',
  mongoUri: process.env.MONGO_URI,
});

const optimixer = await Optimixer.create({
  activixClient: activix,
  activixCollection: 'ai-gateway-activities', // must match Activix collection name
  pipelines: { aiMaxTokens: { enabled: true } },
});

const prediction = await optimixer.predictAiMaxTokens({
  templateId: 'summarize',
  inputSize: 1200,
  contextSize: 800,
  acceptableRisk: 'medium',
  reasoningEffort: 'not-applicable',
  outputIntent: {
    mode: 'relative',
    expectedVisibleTokens: 900,
    outputToInputRatio: 0.05,
  },
  taskType: 'summarization',
  outputMode: 'markdown',
  modelProfile: {
    provider: 'openai',
    model: 'gpt-4o-mini',
    outputTokenParam: 'max_tokens',
    contextWindow: 128000,
    maxOutputTokens: 16384,
  },
  runContext: { sessionId: 'sess-1', jobId: 'job-1' },
});

// aiClient and request stand for your own LLM client and request payload
await aiClient.call({ ...request, ...prediction.providerParams });

const result = await optimixer.completeAiMaxTokensPrediction({
  requestId: prediction.requestId,
  actual: {
    promptTokens: 900,
    completionTokens: 420,
    totalTokens: 1320,
    finishReason: 'stop',
    latencyMs: 800,
  },
});

if (result.retryPrediction) {
  await aiClient.call({ ...request, ...result.retryPrediction.providerParams });
}

Lifecycle

predictAiMaxTokens
  → Activix startRecord (prediction evidence)
  → caller uses providerParams on LLM request
completeAiMaxTokensPrediction
  → update Activix record + in-memory buckets (+ optional profile summaries)

requestId is the Activix activityId from startRecord.

Do not call Activix separately for each predict/complete — Optimixer owns that lifecycle.

Request fields

| Field | Required | Role |
|-------|----------|------|
| templateId | yes | Stable learning identity (action/template) |
| inputSize | yes | Estimated input tokens (context fit, not the generation ceiling) |
| reasoningEffort | yes | not-applicable, none, low, medium, or high; may be promoted from not-applicable when catalog infers reasoning (CR-14) |
| outputIntent | yes | Fixed or relative visible output guess (see below) |
| contextSize | no | Estimated context tokens (default 0) |
| acceptableRisk | no | very-low, low, medium, or high (default medium) → percentile |
| modelProfile | no | provider, model, profileChoice (ai-profiles profile/choice), legacy alias, caps, reasoning; see 3.3.0 resilience |
| taskType | no | e.g. summarization, extraction, code-generation |
| outputMode | no | text, json, markdown, code, schema |
| schemaComplexity | no | none, small, medium, large; with outputMode: json triggers structured floor |
| expectedVerbosity | no | Metadata only (does not drive budget in 3.0) |
| plannedTools | no | Tool names for bucket identity |
| constraints | no | Caller min/max/preferred caps, retry flags |
| runContext | no | Same envelope as Activix (sessionId, jobId, …) |
| dryRun | no | Predict without Activix write (requestId is '') |

`outputIntent`

| Mode | Fields | Cold-start behavior |
|------|--------|---------------------|
| fixed | expectedVisibleTokens | Uses caller guess; input size does not inflate visible output |
| relative | expectedVisibleTokens, optional outputToInputRatio | max(guess, inputTokens × ratio) until history learns |
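The cold-start rule for the two modes fits in a few lines. The following is a sketch under stated assumptions rather than the shipped logic: the type and function names are hypothetical, rounding the ratio product to a whole token is our choice, and treating a missing outputToInputRatio as "use the caller guess" is an assumption.

```typescript
// Hypothetical types mirroring the two outputIntent modes described above.
type OutputIntentSketch =
  | { mode: 'fixed'; expectedVisibleTokens: number }
  | { mode: 'relative'; expectedVisibleTokens: number; outputToInputRatio?: number };

function coldStartVisibleTokensSketch(intent: OutputIntentSketch, inputTokens: number): number {
  if (intent.mode === 'fixed') {
    // Fixed: input size never inflates the visible output estimate.
    return intent.expectedVisibleTokens;
  }
  // Assumption: without a ratio, relative mode falls back to the caller guess.
  const ratio = intent.outputToInputRatio ?? 0;
  return Math.max(intent.expectedVisibleTokens, Math.round(inputTokens * ratio));
}

// A yes/no classifier stays at 10 visible tokens even with a long input.
const fixedEstimate = coldStartVisibleTokensSketch(
  { mode: 'fixed', expectedVisibleTokens: 10 }, 7500,
); // 10

// Relative mode: the larger of the guess (100) and 4000 * 0.25 wins.
const relativeEstimate = coldStartVisibleTokensSketch(
  { mode: 'relative', expectedVisibleTokens: 100, outputToInputRatio: 0.25 }, 4000,
); // 1000

console.log(fixedEstimate, relativeEstimate);
```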

Use modelProfile for provider/model or @x12i/ai-profiles profile+choice — see accepted shapes above.

When the bundled catalog marks the resolved model as reasoning-capable, Optimixer sets modelProfile.reasoning.enabled: true automatically.

Stable error codes (throw OptimixerError; check error.code):

| Code | When |
|------|------|
| OPTIMIXER_REASONING_PROFILE_INCONSISTENT | reasoningEffort is low/medium/high but the resolved model is not reasoning-capable |
| OPTIMIXER_PREDICTION_INPUT_INVALID | Missing or invalid required predict inputs |

Both codes include remediationHint where applicable.
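A host can branch on these stable codes rather than on message text. The sketch below uses local stand-ins so it runs on its own: the interface, the type guard, and the policy function are all hypothetical and only mimic the shape described here (code plus optional remediationHint). In a real host, the package's own isOptimixerError would take the place of the guard.

```typescript
// Stand-in shape; a real host would rely on the package's OptimixerError instead.
interface OptimixerLikeErrorSketch extends Error {
  code: string;
  remediationHint?: string;
}

// Hypothetical guard that only checks for a string `code` on an Error.
function looksLikeOptimixerError(err: unknown): err is OptimixerLikeErrorSketch {
  return err instanceof Error && typeof (err as { code?: unknown }).code === 'string';
}

// Hypothetical host policy: map each stable code to an operator-facing action.
function describePredictFailure(err: unknown): string {
  if (!looksLikeOptimixerError(err)) {
    return 'unexpected error: rethrow';
  }
  const hint = err.remediationHint ?? 'no hint provided';
  switch (err.code) {
    case 'OPTIMIXER_REASONING_PROFILE_INCONSISTENT':
      return `fix reasoningEffort or model: ${hint}`;
    case 'OPTIMIXER_PREDICTION_INPUT_INVALID':
      return `fix request inputs: ${hint}`;
    default:
      return `unhandled code ${err.code}`;
  }
}
```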

Prediction output (required consumer contract)

| Field | Meaning |
|-------|---------|
| recommendedMaxTokens | Final clamped budget (primary field) |
| requestId | Activix activityId; '' when not persisted (predict-only / dry-run / record failure) |
| confidence | none, low, medium, or high |
| bucketKey | Bucket used for evidence / learning identity |

Additional fields:

| Field | Meaning |
|-------|---------|
| recommendedMaxCompletionTokens | Completion budget after reasoning split |
| providerParams | Spread into LLM request (max_tokens, max_completion_tokens, …) |
| dataState | exact-history, fallback-history, task-shape-history, global-history, cold-start |
| learningPhase | bootstrap, early, or mature; explains why the budget is conservative or tight |
| templateSampleCount | Completed samples for this templateId at predict time |
| bootstrapMultiplierApplied | Set during bootstrap when a high safety multiplier was applied to outputIntent |
| evidenceConfidence | exact, related, prior, or cold-start |
| signature / signatureKey | Canonical prediction identity |
| evidence, explanation, riskPolicy | Debug / Studio explainability |
| reasoningBudget, contextBudget, clamping | Budget breakdown when applicable |

Predict-only mode (no Activix)

const optimixer = await Optimixer.create({ persistence: 'predict-only' });
const prediction = await optimixer.predictAiMaxTokens({
  templateId: 'classify',
  inputSize: 7500,
  reasoningEffort: 'not-applicable',
  outputIntent: { mode: 'fixed', expectedVisibleTokens: 10 },
});
// prediction.requestId === ''

Or use the pure function with an empty bucket store:

import { predictAiMaxTokensV1, BucketStore } from '@x12i/optimixer';

Unknown model defaults (FR-OPT-6)

import { UNKNOWN_MODEL_DEFAULTS } from '@x12i/optimixer';
// { contextWindow: 128_000, maxOutputTokens: 16_384, outputTokenParam: 'max_tokens' }

Pass explicit modelProfile.contextWindow / maxOutputTokens to override catalog defaults. Known models resolve from bundled @x12i/ai-profiles ^3.0.0 via a single resolveBundledInput call; reasoning capability uses isReasoningModel for CR-14 effort reconciliation when reasoning.enabled is unset.

Configuration

Optimixer.create({
  activixClient,
  activixCollection: 'ai-gateway-activities',
  pipelines: {
    aiMaxTokens: {
      enabled: true,
      defaultMaxTokens: 4096,
      stats: { minSamplesExact: 20, minSamplesFallback: 10 },
      bootstrap: {
        minSamplesBeforePredict: 2,  // predict from history on 3rd call per templateId
        safetyMultiplier: 4,           // high budget during bootstrap (tune up e.g. 6)
        earlyPercentile: 'p99',        // conservative until exact bucket reaches minSamplesExact
      },
      contextReserve: { minTokens: 1000, percentOfContext: 0.02, maxTokens: 8000 },
      evidenceStrategy: { mode: 'best_available', minSamplesExact: 20 },
      warmup: { enabled: true, profileFirst: true, rawRows: { maxRecords: 500 } },
      summaryPersistence: { enabled: true, mode: 'batched', batchIntervalMs: 30000 },
      templates: [/* local task/output budgets */],
      policies: [/* risk → percentile overrides */],
    },
  },
});

Dry-run

const preview = await optimixer.predictAiMaxTokens({
  templateId: 'summarize',
  inputSize: 1200,
  contextSize: 800,
  acceptableRisk: 'medium',
  reasoningEffort: 'not-applicable',
  outputIntent: { mode: 'relative', expectedVisibleTokens: 900 },
  modelProfile: { provider: 'openai', model: 'gpt-4o-mini' },
  dryRun: true,
});
// preview.requestId === ''
// complete without requestId is a no-op (does not throw)

Stats API

const stats = optimixer.getAiMaxTokensStats({ templateId: 'summarize', model: 'gpt-4o-mini' });
// stats.buckets[].sampleCount, completionTokens percentiles, rates

Standalone mode

const optimixer = await Optimixer.create({
  activix: {
    collection: 'optimixer-activities',
    mongoUri: process.env.MONGO_URI,
    storageMode: 'automatic',
  },
});

Requires @x12i/activix as a peer dependency.

Activix records

Optimixer has no dedicated Mongo collection. In learning mode it writes standard Activix activity rows (default collection ai-actions, database activitix).

Find prediction rows:

db.getCollection('ai-actions').find({
  'outer.metadata.optimizer': 'optimixer',
  'outer.metadata.kind': 'optimixer:prediction',
  'outer.metadata.pipelineId': 'ai.max_tokens.v1',
  status: 'completed',
}).sort({ startTime: -1 }).limit(20)

Find profile summary rows:

db.getCollection('ai-actions').find({
  'outer.metadata.kind': 'optimixer:profile',
  'outer.metadata.pipelineId': 'ai.max_tokens.v1',
})

| Location | Content |
|----------|---------|
| outer.input | Normalized prediction request (outputIntent, sizes, model profile, …) |
| outer.output | Full decision; on complete adds actual, learning, optional event |
| outer.optimixer | Telemetry: phase, bootstrap multiplier, prediction vs actual fit, token breakdown |
| outer.metadata.kind | optimixer:prediction (ACTIVIX_METADATA_KIND_OPTIMIXER_PREDICTION) |
| outer.metadata.pipelineId | ai.max_tokens.v1 |

`outer.optimixer` schema

Written at predict; enriched at complete:

| Field | Meaning |
|-------|---------|
| phase | bootstrap, early, or mature |
| templateSampleCount | Template-level samples at predict time |
| bootstrap.applied / bootstrap.multiplier | Whether high-safety bootstrap multiplier was used |
| prediction | Predicted visible/completion tokens, recommendedMaxTokens, outputIntent |
| actual | Observed prompt, reasoning, visible, completion, total tokens |
| fit.status | within_budget, exceeded_prediction, or truncated |
| fit.exceededByTokens | How far completion exceeded recommendedMaxTokens when truncated |
| fit.predictedVsActualDelta | Predicted visible minus actual visible |
| fit.intentVsActualRatio | Actual visible / input tokens (relative mode) |
| fit.tokenBreakdown | inputShare, reasoningShare, visibleShare of total tokens |
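Two of the fit fields are simple arithmetic and can be illustrated directly. This is a hedged sketch: the function and interface names are hypothetical, and computing the total as prompt + reasoning + visible is our assumption (the stored record carries its own total).

```typescript
// Hypothetical shape for the observed token counts.
interface ActualTokensSketch {
  promptTokens: number;
  reasoningTokens: number;
  visibleTokens: number;
}

// Shares of total tokens, in the spirit of fit.tokenBreakdown.
function tokenBreakdownSketch(a: ActualTokensSketch) {
  // Assumption: total is the sum of the three parts.
  const total = a.promptTokens + a.reasoningTokens + a.visibleTokens;
  if (total === 0) {
    return { inputShare: 0, reasoningShare: 0, visibleShare: 0 };
  }
  return {
    inputShare: a.promptTokens / total,
    reasoningShare: a.reasoningTokens / total,
    visibleShare: a.visibleTokens / total,
  };
}

// Predicted visible minus actual visible, as in fit.predictedVsActualDelta.
// A positive value means the prediction left unused room.
function predictedVsActualDeltaSketch(predictedVisible: number, actualVisible: number): number {
  return predictedVisible - actualVisible;
}

const shares = tokenBreakdownSketch({ promptTokens: 500, reasoningTokens: 250, visibleTokens: 250 });
console.log(shares, predictedVsActualDeltaSketch(900, 420));
```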

Profile aggregate rows use outer.metadata.kind = optimixer:profile.

Bootstrap learning phases

| Phase | Trigger | Budget strategy |
|-------|---------|-----------------|
| bootstrap | Fewer than bootstrap.minSamplesBeforePredict (default 2) completed samples for this templateId | outputIntent estimate × bootstrap.safetyMultiplier (default 4); global/fallback history ignored |
| early | Template bootstrapped but exact bucket < minSamplesExact (20) | History at bootstrap.earlyPercentile (default p99) + headroom |
| mature | Exact bucket ≥ 20 samples | Normal risk → percentile mapping |
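The phase triggers in the table reduce to two threshold checks. The sketch below shows one way to express them, assuming the defaults listed above; the function and interface names are hypothetical, and the early and mature budget strategies (percentile lookups over history) are deliberately left out.

```typescript
type LearningPhaseSketch = 'bootstrap' | 'early' | 'mature';

// Hypothetical config shape using the defaults from the table above.
interface PhaseConfigSketch {
  minSamplesBeforePredict: number; // bootstrap.minSamplesBeforePredict, default 2
  minSamplesExact: number;         // stats.minSamplesExact, 20 in the table
  safetyMultiplier: number;        // bootstrap.safetyMultiplier, default 4
}

const defaultPhaseConfig: PhaseConfigSketch = {
  minSamplesBeforePredict: 2,
  minSamplesExact: 20,
  safetyMultiplier: 4,
};

function pickPhaseSketch(
  templateSamples: number,
  exactBucketSamples: number,
  cfg: PhaseConfigSketch = defaultPhaseConfig,
): LearningPhaseSketch {
  if (templateSamples < cfg.minSamplesBeforePredict) return 'bootstrap';
  if (exactBucketSamples < cfg.minSamplesExact) return 'early';
  return 'mature';
}

// Bootstrap budget: the declared intent scaled by the safety multiplier.
function bootstrapVisibleBudgetSketch(
  intentEstimate: number,
  cfg: PhaseConfigSketch = defaultPhaseConfig,
): number {
  return intentEstimate * cfg.safetyMultiplier;
}

console.log(pickPhaseSketch(1, 1), pickPhaseSketch(5, 5), pickPhaseSketch(40, 25));
```

With the defaults, the first two calls for a templateId run in bootstrap and the third is the first to use history, which matches the comment on minSamplesBeforePredict in the configuration example.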

Max-token-limit failures

Always call completeAiMaxTokensPrediction when the provider stops at the token limit (a finishReason such as length). Optimixer then:

- Records outer.output.event.type = 'max_tokens_too_low'
- Learns a lower bound (not “required = truncated count”)
- May return retryPrediction with bumped providerParams
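The distinction between a lower bound and an exact sample is the key idea here: a truncated response tells you the task needed more than the budget, not how much. The class below is a minimal sketch of that bookkeeping. All names are hypothetical, and the fixed retry bump factor is an assumption for illustration, not the package's retry policy.

```typescript
interface CompletionObservationSketch {
  completionTokens: number;
  finishReason: string; // e.g. 'stop' or 'length'
}

// Hypothetical evidence store separating exact samples from truncation floors.
class BudgetEvidenceSketch {
  private exactSamples: number[] = [];
  private lowerBound = 0;

  observe(o: CompletionObservationSketch): void {
    if (o.finishReason === 'length') {
      // Truncated: the true requirement is above this count, so only raise the floor.
      this.lowerBound = Math.max(this.lowerBound, o.completionTokens);
    } else {
      // Completed normally: this is a usable sample of the real requirement.
      this.exactSamples.push(o.completionTokens);
    }
  }

  getLowerBound(): number {
    return this.lowerBound;
  }

  getExactSamples(): number[] {
    return this.exactSamples.slice();
  }

  // Assumed policy: retry with a budget comfortably above the known floor.
  retryBudget(bumpFactor = 2): number {
    return Math.ceil(this.lowerBound * bumpFactor);
  }
}

const evidence = new BudgetEvidenceSketch();
evidence.observe({ completionTokens: 420, finishReason: 'stop' });
evidence.observe({ completionTokens: 512, finishReason: 'length' });
console.log(evidence.getLowerBound(), evidence.retryBudget());
```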

Warmup

On Optimixer.create() in learning mode, Optimixer:

- Loads optimixer:profile summaries when present (warmup.profileFirst)
- Replays recent completed optimixer:prediction rows via findRecords
- Rebuilds multi-level in-memory bucket stats

3.3.0 resilience: each raw row is restored and normalized through the same path as live predict (resolveModelProfile + CR-14 effort reconcile). Rows that cannot be normalized (missing required fields, unrecoverable input) are skipped — logged at optimixer.warmup.row.skipped — without failing init. Mixed alias/concrete historical modelProfile values (e.g. cheap/default, openrouter + gemini-2.5-flash-lite) are supported.
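The skip-and-continue behavior described above follows a common pattern: normalize each row inside its own try/catch so one bad row cannot abort the loop. The sketch below illustrates that pattern only. The row shapes, the normalizer, and the debug callback are hypothetical; the only name taken from this README is the optimixer.warmup.row.skipped event.

```typescript
// Hypothetical row shapes; real Activix rows carry far more fields.
interface RawWarmupRowSketch {
  templateId?: string;
  completionTokens?: number;
}

interface NormalizedRowSketch {
  templateId: string;
  completionTokens: number;
}

// Hypothetical normalizer that throws on unrecoverable rows.
function normalizeRowSketch(row: RawWarmupRowSketch): NormalizedRowSketch {
  if (!row.templateId || typeof row.completionTokens !== 'number') {
    throw new Error('missing required fields');
  }
  return { templateId: row.templateId, completionTokens: row.completionTokens };
}

function warmupSketch(
  rows: RawWarmupRowSketch[],
  debug: (event: string, detail: string) => void,
): { loaded: NormalizedRowSketch[]; skipped: number } {
  const loaded: NormalizedRowSketch[] = [];
  let skipped = 0;
  for (const row of rows) {
    try {
      loaded.push(normalizeRowSketch(row));
    } catch (err) {
      // One bad row is logged and skipped; init continues with the rest.
      skipped += 1;
      debug('optimixer.warmup.row.skipped', err instanceof Error ? err.message : String(err));
    }
  }
  return { loaded, skipped };
}

const warmupEvents: string[] = [];
const warmupResult = warmupSketch(
  [
    { templateId: 'summarize', completionTokens: 420 },
    { completionTokens: 100 }, // missing templateId: skipped
    { templateId: 'classify', completionTokens: 8 },
  ],
  (event) => { warmupEvents.push(event); },
);
console.log(warmupResult.loaded.length, warmupResult.skipped);
```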

Diagnostics (@x12i/logxer)

Optimixer is the only Activitix monorepo package that depends on @x12i/logxer (pinned 4.5.0). Activix itself uses lightweight console diagnostics.

- ENABLE_OPTIMIXER_LOGXER=true: enable full logxer output (errors-only by default)
- OPTIMIXER_LOGS_LEVEL: debug, info, warn, error, verbose, or off / none / silent (default warn when enabled and unset)
- LOGXER_PACKAGE_LEVELS: bulk levels, e.g. OPTIMIXER:debug,OTHER_PKG:off (see logxer docs/package-log-levels-stack.md)
- logging?: StackLoggingOptions (on Optimixer.create and logger helpers): pass host levels in code

import {
  createOptimixerLogxer,
  isOptimixerDiagnosticLoggingEnabled,
  resolveOptimixerInternalLogger,
  type StackLoggingOptions,
} from '@x12i/optimixer';

const log = createOptimixerLogxer({
  logging: { packageLevels: { OPTIMIXER: 'debug' } },
});

Environment configuration

Operators configure Activix and Optimixer through one shared set of environment variables:

# Master enable switch (Default: true)
OPTIMIXER_ENABLED=true

# MongoDB URI ladder (Resolves OPTIMIXER_MONGO_URI ?? MONGO_LOGS_URI ?? MONGO_URI)
OPTIMIXER_MONGO_URI=mongodb://localhost:27017

# MongoDB Database Name (Resolves ACTIVIX_DB_NAME ?? MONGO_AI_LOGS_DB ?? MONGO_LOGS_DB ?? MONGO_DB ?? 'activitix')
MONGO_LOGS_DB=activitix

# MongoDB Collection Target (Resolves OPTIMIXER_ACTIVIX_COLLECTION ?? ACTIVIX_EXTRA_ACTIVITY_COLLECTIONS with 'ai-actions' ?? 'ai-actions')
OPTIMIXER_ACTIVIX_COLLECTION=ai-actions

# Dev warmup sample cap limit (Default: 500)
OPTIMIXER_WARMUP_MAX_RECORDS=500

# Optional cold-start default max tokens pipeline fallback
OPTIMIXER_DEFAULT_MAX_TOKENS=2000

# Optional fixed buffer added to every recommended max tokens result
OPTIMIXER_MAX_TOKENS_BUFFER=256

Resolve these values in Node.js runtimes or server scripts with the built-in helper:

import { resolveOptimixerActivixConfigFromEnv } from '@x12i/optimixer';

const config = resolveOptimixerActivixConfigFromEnv(process.env);
console.log(config.activixCollection); // e.g. "ai-actions"
console.log(config.dbName);            // e.g. "activitix"

Dashboards, Vite BFF routes, and React UI for operator tooling live in the client app (Studio), not in this package. The client calls Optimixer.create, predictAiMaxTokens, getAiMaxTokensStats, etc. over its own HTTP layer.

Monorepo

| Topic | Doc |
|-------|-----|
| Architecture deep-dive | explained.md |
| Consumer migration | ../../.docs/MIGRATION-CONSUMERS.md |
| Publishing | ../../.docs/PUBLISHING.md |
| Changelog | CHANGELOG.md |

Scripts

npm run build
npm run test:all

From repo root: npm run build:optimixer · npm run test:optimixer

License

Athenix License
