@axiastudio/aioc
v0.2.7
Published
Governance-first agent SDK with deterministic policy gates, auditable run records, and IoC-oriented orchestration.
AIOC is a deterministic governance kernel for LLM agents: models can propose actions, while application-owned policies and runtime controls decide what is actually allowed to happen.
It is built around default-deny gates for tools and handoffs, portable audit artifacts (RunRecord, prompt snapshots, request fingerprints), and replay/compare workflows for verifiable iteration on prompts and policies.
AIOC is designed for enterprise and public-sector contexts where governance cannot be outsourced to an observability dashboard, hosted trace store, or framework-owned approval workflow.
Project home and documentation: https://axiastudio.github.io/aioc
Positioning
AIOC does not try to be a full agent platform, hosted tracing product, evaluation suite, or human-review workflow engine. Those systems are useful, but they make governance-sensitive decisions about storage, retention, access, escalation, and audit semantics.
AIOC keeps the enforcement boundary inside the application instead:
- the model proposes; deterministic policy code authorizes or denies
- execution-impacting capabilities are deny-by-default
- audit evidence is emitted as portable records rather than owned by a required control plane
- persistence, approvals, retention, access control, monitoring, and deployment remain host-application responsibilities
In short: AIOC treats observability as evidence, not as governance itself. Its core job is to make agent execution boundaries explicit, testable, auditable, and hard to bypass.
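The "model proposes, policy decides" boundary can be pictured with a small standalone sketch. It does not use the AIOC API: `Proposal`, `Decision`, `Policy`, and `createGate` are hypothetical names chosen only to show default-deny with exact-name dispatch.

```typescript
// Illustrative sketch only: these types and names are hypothetical and are
// not the @axiastudio/aioc API.
type Proposal = { tool: string; args: Record<string, unknown> };
type Decision = { outcome: "allow" | "deny"; reason: string };
type Policy = (proposal: Proposal) => Decision;

// Build a gate that dispatches on the exact tool name and denies anything
// that has no registered policy (deny-by-default).
function createGate(policies: Record<string, Policy>): Policy {
  return (proposal) => {
    const policy = Object.prototype.hasOwnProperty.call(policies, proposal.tool)
      ? policies[proposal.tool]
      : undefined;
    if (!policy) {
      return { outcome: "deny", reason: "deny_no_policy_configured" };
    }
    return policy(proposal);
  };
}

const gate = createGate({
  read_report: (p) =>
    p.args["quarter"] === "Q1"
      ? { outcome: "allow", reason: "allow_q1_report" }
      : { outcome: "deny", reason: "deny_unsupported_quarter" },
});

// The model may propose anything; only registered, approved calls pass.
const allowed = gate({ tool: "read_report", args: { quarter: "Q1" } });
const denied = gate({ tool: "delete_report", args: {} });
// allowed.outcome is "allow"; denied.reason is "deny_no_policy_configured"
```

The point of the sketch is that the decision is ordinary, testable code: the same proposal always yields the same decision, and an unknown tool never reaches execution.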
Release Status
Current stable release: 0.2.5.
The stable line started with 0.1.0.
The core runtime surface is compatibility-managed. Breaking changes to the stable surface should ship only with explicit migration guidance and release notes.
Stable Scope
AIOC 0.1.0 stabilized the core runtime surface, public documentation aligned to the exported contract, RunRecord and replay/compare workflows, and the governance-first runtime model validated in real applications beyond toy examples.
AIOC 0.1.1 adds thread-history utilities, the run-output stream adapter, and provider-specific instruction-role documentation without changing the stable governance model.
AIOC 0.1.2 adds approval evidence helpers for application-owned approval workflows.
AIOC 0.2.0 adds implemented runtime utilities and the Agent Harness Descriptor API. The core runtime remains compatibility-managed; descriptor shape and loader helpers are now part of the supported 0.2.x surface and may evolve across 0.x minor releases only with explicit migration guidance.
AIOC 0.2.1 realigns the 0.2.x package line with the current main history after 0.2.0 was published from the descriptor release branch.
AIOC 0.2.2 completes the descriptor instruction-composition surface with reusable instruction_parts, ordered instructions_sequence, and boolean where gates.
AIOC 0.2.3 adds RFC-0010 policy composition helpers for exact-name tool and handoff policy dispatch without changing runtime enforcement semantics.
AIOC 0.2.4 adds experimental governance-event packages and an OpenTelemetry Logs exporter for reduced, operational observability events derived from RunRecord values.
AIOC 0.2.5 makes replay history-faithful by recording the initial input scope in RunRecord and replaying from it by default.
AIOC 0.2.6 adds descriptor-level conditional agent handoffs with boolean where gates that filter handoff tools before provider requests.
Experimental Packages
@axiastudio/aioc-regression-judge implements the RFC-0012 companion judge
helpers outside the core runtime package. It builds bounded judge inputs for
run-regression suites and parses structured RunJudgeResult outputs while the
host application owns the model invocation.
@axiastudio/aioc-governance-events implements the RFC-0009 governance-event
mapper outside the core runtime package. It derives reduced, redacted,
event-shaped records from RunRecord values.
@axiastudio/aioc-export-otel maps those governance events to OpenTelemetry
Logs. Both packages remain experimental while real exporter usage validates the
schema and operational shape.
- Release notes: CHANGELOG.md
- Historical beta contract snapshot: docs/BETA-CONTRACT.md
- Historical alpha contract snapshot: docs/ALPHA-CONTRACT.md
- Privacy baseline: docs/PRIVACY-BASELINE.md
Documentation Site
This repository also contains aioc-docs, a Starlight documentation app:
- app path: apps/aioc-docs
- source of truth for normative documents: docs/
- GitHub Pages target: https://axiastudio.github.io/aioc/
- start locally: npm run docs:dev
- build statically: npm run docs:build
The root docs:* commands intentionally invoke the app from inside apps/aioc-docs so it can use its own Node toolchain.
Contact
If you want to collaborate or provide feedback, write to [email protected].
Install
npm install @axiastudio/aioc

Quickstart
OpenAI
import "dotenv/config";
import { Agent, run, setupOpenAI } from "@axiastudio/aioc";

setupOpenAI(); // reads OPENAI_API_KEY from env

const agent = new Agent({
  name: "Hello Agent",
  model: "gpt-4.1-mini",
  instructions: "Answer in 2 short sentences.",
});

const result = await run(
  agent,
  "In one sentence, what is a deterministic policy gate in an agent SDK?",
);
console.log(result.finalOutput);

Mistral
import "dotenv/config";
import { Agent, run, setupMistral } from "@axiastudio/aioc";

setupMistral(); // reads MISTRAL_API_KEY from env

const agent = new Agent({
  name: "Hello Agent",
  model: "mistral-small-latest",
  instructions: "Answer in 2 short sentences.",
});

const result = await run(
  agent,
  "In one sentence, what is a deterministic policy gate in an agent SDK?",
);
console.log(result.finalOutput);

Core API
- Agent, RunContext
- optional Agent.promptVersion to version resolved instructions
- Tool, tool(...)
- handoffs via Agent({ handoffs: [...] })
- run(...) with streaming support (stream defaults to false)
- policy helpers: allow(...) / deny(...) / requireApproval(...)
- approval evidence helpers: createApprovalRequestSeed(...), toApprovedProposalHashes(...), toActiveApprovalGrantMap(...)
- provider setup helpers: setupMistral(...), setupOpenAI(...), setupProvider(...)
- run logger hook: run(..., { logger })
- run record hook: run(..., { record })
- run output adapter: toRunOutputEvents(...)
- run-record utilities: extractToolCalls(...), compareRunRecords(...), replayFromRunRecord(...)
- thread history utilities: toThreadHistory(...), appendUserMessage(...), replaceThreadHistory(...), applyRunResultHistory(...)
- JSON helper: toJsonValue(...)
Agent Harness Descriptor
- buildAgentHarness(...)
- hashAgentHarnessDescriptor(...)
- loadAgentHarnessDescriptor(...)
- loadAgentHarnessDescriptorFromFile(...)
- AgentHarnessDescriptor and related descriptor types
The Agent Harness Descriptor is included in the supported 0.2.x API surface as the aioc.agent_graph.v0 descriptor contract. Use it for controlled configuration, examples, and evaluation harnesses; keep executable tools, policies, providers, persistence, approvals, and deployment configuration in application code. Descriptor handoffs may use boolean where gates to hide unavailable handoff tools before the provider call; HandoffPolicy still owns allow/deny/approval decisions for exposed handoffs.
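As a rough mental model of where gates, the following standalone sketch filters handoff targets before anything would be sent to a provider. The `HandoffEntry` shape and the use of predicate functions are assumptions made for illustration; the real descriptor schema is defined by the aioc.agent_graph.v0 contract and may express gates differently.

```typescript
// Hypothetical shapes for illustration; not the aioc.agent_graph.v0 schema.
type HandoffEntry<C> = { target: string; where?: (context: C) => boolean };

// Keep only handoffs whose gate passes; entries without a gate stay visible.
function visibleHandoffs<C>(handoffs: HandoffEntry<C>[], context: C): string[] {
  return handoffs
    .filter((h) => h.where === undefined || h.where(context))
    .map((h) => h.target);
}

type Ctx = { actor: { groups: string[] } };

const handoffs: HandoffEntry<Ctx>[] = [
  { target: "general_support" },
  {
    target: "finance_agent",
    where: (c) => c.actor.groups.some((g) => g === "finance"),
  },
];

const forGuest = visibleHandoffs(handoffs, { actor: { groups: [] } });
// forGuest is ["general_support"]
const forFinance = visibleHandoffs(handoffs, { actor: { groups: ["finance"] } });
// forFinance is ["general_support", "finance_agent"]
```

Filtering only controls what the model can see. An exposed handoff would still need a HandoffPolicy decision before it runs, as described above.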
Policy Gate (Minimal)
import { Agent, allow, deny, run, tool, type ToolPolicy } from "@axiastudio/aioc";

const toolPolicy: ToolPolicy<{ actor: { groups: string[] } }> = ({ runContext }) => {
  if (!runContext.context.actor.groups.includes("finance")) {
    return deny("deny_missing_finance_group", {
      resultMode: "tool_result",
      publicReason: "You are not authorized to access this report.",
    });
  }
  return allow("allow_finance_group_access");
};

// `agent` is assumed to be an Agent whose tools are defined elsewhere.
await run(agent, "Summarize report Q1.", {
  context: { actor: { groups: ["finance"] } },
  policies: { toolPolicy },
});

Default behavior is deny when no policy is configured.
resultMode is the canonical non-allow delivery mode ("throw" or "tool_result"). denyMode is no longer supported.
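The difference between the two delivery modes can be sketched without the library. This is one plausible reading, not the AIOC implementation: it assumes "throw" aborts the run with an error, while "tool_result" hands the public reason back to the model as the tool's output. `Denial` and `deliverDenial` are hypothetical names.

```typescript
// Hypothetical types for illustration; not the @axiastudio/aioc implementation.
type ResultMode = "throw" | "tool_result";
type Denial = { reason: string; publicReason: string; resultMode: ResultMode };

function deliverDenial(denial: Denial): { role: "tool"; content: string } {
  if (denial.resultMode === "throw") {
    // Abort: the host application handles the error outside the model loop.
    throw new Error("policy_denied:" + denial.reason);
  }
  // Continue: the model sees only the public reason, not the internal code.
  return { role: "tool", content: denial.publicReason };
}

const softDenial = deliverDenial({
  reason: "deny_missing_finance_group",
  publicReason: "You are not authorized to access this report.",
  resultMode: "tool_result",
});
// softDenial.content is "You are not authorized to access this report."

let thrownMessage = "";
try {
  deliverDenial({
    reason: "deny_missing_finance_group",
    publicReason: "You are not authorized to access this report.",
    resultMode: "throw",
  });
} catch (error) {
  thrownMessage = error instanceof Error ? error.message : String(error);
}
// thrownMessage is "policy_denied:deny_missing_finance_group"
```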
Run Record (Minimal)
const records = [] as RunRecord<MyContext>[];

await run(agent, "question", {
  context,
  policies: { toolPolicy },
  record: {
    includePromptText: true,
    contextRedactor: (ctx) => ({
      contextSnapshot: { ...ctx, userId: "[redacted]" },
      contextRedacted: true,
    }),
    sink: (record) => records.push(record),
  },
});

Privacy notes:
- record.includePromptText defaults to false; keep it disabled unless needed.
- record.contextRedactor should be considered mandatory for production persistence.
- sink adapters should implement encryption, access controls, retention, and deletion.
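A reusable redactor can keep the list of sensitive fields in one place. The helper below is a hypothetical sketch, not part of the package: `redactKeys` and the `Redaction` type are invented here, and only the returned `contextSnapshot` / `contextRedacted` pair mirrors the example above. It masks top-level keys only; nested data would need its own handling.

```typescript
// Hypothetical helper for illustration; not part of @axiastudio/aioc.
type Context = Record<string, unknown>;
type Redaction = { contextSnapshot: Context; contextRedacted: boolean };

// Returns a redactor that masks the given top-level keys on a shallow copy.
function redactKeys(keys: string[]): (ctx: Context) => Redaction {
  return (ctx) => {
    const contextSnapshot: Context = { ...ctx };
    let contextRedacted = false;
    for (const key of keys) {
      if (key in contextSnapshot) {
        contextSnapshot[key] = "[redacted]";
        contextRedacted = true;
      }
    }
    return { contextSnapshot, contextRedacted };
  };
}

const redactor = redactKeys(["userId", "email"]);
const original = { userId: "u-123", tenant: "acme" };
const redaction = redactor(original);
// redaction.contextSnapshot is { userId: "[redacted]", tenant: "acme" }
// redaction.contextRedacted is true; `original` is left unchanged.
```

Under these assumptions, a function like `redactor` could be supplied wherever the example above passes an inline `contextRedactor`.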
RunRecord Utilities (Minimal)
extractToolCalls(...)
import { extractToolCalls } from "@axiastudio/aioc";

const calls = extractToolCalls(runRecord);
console.log(calls[0]?.name, calls[0]?.argsHash, calls[0]?.hasOutput);

compareRunRecords(...)
import { compareRunRecords } from "@axiastudio/aioc";

const comparison = compareRunRecords(runRecordA, runRecordB, {
  includeSections: ["response", "toolCalls", "policy", "guardrails", "metadata"],
  responseMatchMode: "exact",
});
console.log(comparison.summary);
console.log(comparison.metrics);
console.log(comparison.differences);

replayFromRunRecord(...)
import { allow, replayFromRunRecord } from "@axiastudio/aioc";

const replay = await replayFromRunRecord({
  sourceRunRecord,
  agent,
  mode: "strict", // live | strict | hybrid
  runOptions: {
    policies: {
      toolPolicy: () => allow("allow_replay"),
    },
  },
});
console.log(replay.result.finalOutput);
console.log(replay.replayStats);

replayFromRunRecord(...) does not bypass policy enforcement: in strict and hybrid modes, provide runOptions.policies when tool/handoff execution must be authorized.
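Why a replay still needs policies can be shown with a toy model. The sketch assumes that a strict-style replay serves recorded tool outputs instead of executing tools, and that each recorded call is still checked against the current policy. `RecordedCall`, `ReplayPolicy`, and `replayStrict` are hypothetical; the real RunRecord shape and replay internals may differ.

```typescript
// Hypothetical shapes for illustration; not the real RunRecord or replay API.
type RecordedCall = { name: string; output: string };
type ReplayPolicy = (toolName: string) => "allow" | "deny";

// Reuse recorded outputs instead of executing tools, but still consult the
// policy before each recorded call is replayed.
function replayStrict(calls: RecordedCall[], policy?: ReplayPolicy) {
  const replayed: string[] = [];
  const denied: string[] = [];
  for (const call of calls) {
    // No policy configured means deny, mirroring the default-deny rule.
    const decision = policy ? policy(call.name) : "deny";
    if (decision === "allow") {
      replayed.push(call.output);
    } else {
      denied.push(call.name);
    }
  }
  return { replayed, denied };
}

const recorded: RecordedCall[] = [
  { name: "get_report", output: "Q1 revenue: 42" },
  { name: "send_email", output: "sent" },
];

const withoutPolicy = replayStrict(recorded);
// withoutPolicy.denied is ["get_report", "send_email"]

const withPolicy = replayStrict(recorded, (toolName) =>
  toolName === "get_report" ? "allow" : "deny",
);
// withPolicy.replayed is ["Q1 revenue: 42"]; withPolicy.denied is ["send_email"]
```

This also suggests why replay is useful for policy iteration: running the same recorded calls against a changed policy shows which calls it would now block.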
Reference UI Example
This repository also contains aioc-inspect, a private reference example UI for visual RunRecord analysis:
- path: apps/aioc-inspect
- public sample files: apps/aioc-inspect/public/samples
- regenerate samples: npm run inspect:samples
- purpose: show one possible way to inspect, navigate, and compare RunRecord artifacts visually
- scope: experimental, stateless, session-only
- positioning: example application for implementors, not a hosted service or production console
aioc-inspect exists to demonstrate the value of the RunRecord contract. It should be read as one possible interpretation of the data model, not as the only intended UI for aioc.

Examples
Suggested reading path: start with example:hello, then policy, approval,
tool-policy, and policy-composition. Use run-record and harness examples after
the basics; example:non-regression, example:run-regression, and
example:run-regression-judge are advanced workflows.
| Command | Purpose | Needs API key |
|---|---|---|
| npm run example:hello | Minimal single-agent run | Yes (AIOC_EXAMPLE_PROVIDER + matching provider API key) |
| npm run example:policy | Minimal denied tool + policy flow | Yes (AIOC_EXAMPLE_PROVIDER + matching provider API key) |
| npm run example:approval-required | Minimal approval-required tool + policy flow | Yes (AIOC_EXAMPLE_PROVIDER + matching provider API key) |
| npm run example:approval-evidence | Approval evidence passed through context and reevaluated by policy | Yes (AIOC_EXAMPLE_PROVIDER + matching provider API key) |
| npm run example:tool-policy | Straight tool + policy flow with allowed execution | Yes (AIOC_EXAMPLE_PROVIDER + matching provider API key) |
| npm run example:policy-composition | Exact-name tool policy dispatch with fallback deny | Yes (AIOC_EXAMPLE_PROVIDER + matching provider API key) |
| npm run example:harness-rerun | Replay against a modified harness with a mocked new tool output | Yes (OPENAI_API_KEY) |
| npm run example:run-record | Run-record persistence with redaction + audit | Yes (AIOC_EXAMPLE_PROVIDER + matching provider API key) |
| npm run example:rru:01-extract | Minimal extractToolCalls(...) | No |
| npm run example:rru:02-compare | Minimal compareRunRecords(...) | No |
| npm run example:rru:03-replay-strict | Minimal strict replay | No |
| npm run example:rru:04-replay-hybrid | Minimal hybrid replay | No |
| npm run example:non-regression | Advanced v1/v2 run-record diff | Yes (AIOC_EXAMPLE_PROVIDER + matching provider API key) |
| npm run example:run-regression | Local runRegressionSuite(...) over a modified harness | Yes (OPENAI_API_KEY) |
| npm run example:run-regression-judge | Local runRegressionSuite(...) with an LLM judge | Yes (OPENAI_API_KEY) |
Notes:
- for live-provider examples, set AIOC_EXAMPLE_PROVIDER to openai or mistral; the matching API key must also be available
- example:harness-rerun, example:run-regression, and example:run-regression-judge configure OpenAI directly from OPENAI_API_KEY; harness models are declared in YAML descriptors
- run-record utility examples are deterministic and do not need a provider
- example:non-regression is educational and can be non-deterministic because it uses a live provider
- canonical examples guide: docs/CANONICAL-EXAMPLES.md
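The provider rule in the notes above can be expressed as a small check. This is a hypothetical helper written for illustration; the examples' actual selection logic is not shown in this README and may differ. The environment variable names come from the Quickstart and the notes; `resolveExampleProvider` is an invented name.

```typescript
// Hypothetical helper for illustration; the examples' real logic may differ.
type Provider = "openai" | "mistral";
type Env = Record<string, string | undefined>;

const API_KEY_VARS: Record<Provider, string> = {
  openai: "OPENAI_API_KEY",
  mistral: "MISTRAL_API_KEY",
};

// Validate the provider choice and make sure its matching API key is present.
function resolveExampleProvider(env: Env): Provider {
  const provider = env["AIOC_EXAMPLE_PROVIDER"];
  if (provider !== "openai" && provider !== "mistral") {
    throw new Error('AIOC_EXAMPLE_PROVIDER must be "openai" or "mistral"');
  }
  const keyVar = API_KEY_VARS[provider];
  if (!env[keyVar]) {
    throw new Error("Missing " + keyVar + " for provider " + provider);
  }
  return provider;
}

const selected = resolveExampleProvider({
  AIOC_EXAMPLE_PROVIDER: "mistral",
  MISTRAL_API_KEY: "test-key",
});
// selected is "mistral"
```

In a Node script the same function could be called with `process.env`; passing the environment as a parameter keeps the check easy to test.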
Optional LangChain Examples
Optional LangChain interoperability examples live in examples/langchain.
They use their own package.json so LangChain dependencies do not become
dependencies of the core @axiastudio/aioc runtime package.
They demonstrate two composition patterns:
- aioc-first, LangChain-extended: aioc owns the governed agent run while LangChain provides OSS components behind aioc tools, such as retrieval.
- LangGraph-orchestrated, aioc-governed: LangGraph owns workflow orchestration while selected graph nodes call aioc for policy-gated execution and portable audit evidence.
Test Commands
- npm run test:unit
- npm run test:integration
- npm run test:regression
- npm run test:ci
Project Principles
AIOC adopts the following non-negotiable principles:
- LLM outside the control plane: critical decisions remain in deterministic components; the LLM supports but does not govern.
- End-to-end transparency: each decision is traceable (inputs, context, prompt/policy version, output).
- Verifiable corrigibility: prompts, policies, and materials are versioned, editable, and comparable before/after changes.
- Non-degeneration validation: each correction must pass regression tests and quality checks.
- Bias and misalignment control: continuous monitoring, dedicated tests, and clear mitigation/escalation mechanisms.
- Privacy by design and data minimization: collect and process only what is strictly necessary, protect sensitive data by default (redaction, encryption, retention limits), and provide auditable controls for access and deletion.
Current Governance Documents
- docs/RFC-0001-governance-first-runtime.md (Accepted)
- docs/RFC-0002-policy-gates-for-tools-and-handoffs.md (Accepted)
- docs/RFC-0003-run-record-audit-trail-and-persistence.md (Accepted)
- docs/RFC-0004-policy-outcomes-and-approval-model.md (Accepted)
- docs/RFC-0005-suspended-proposals-and-approval-lifecycle.md (Accepted)
- docs/RFC-0006-approval-evidence-helpers.md (Accepted)
- docs/RFC-0007-thread-state-utilities.md (Accepted)
- docs/RFC-0008-run-stream-consumer-utilities.md (Accepted)
- docs/RFC-0009-governance-events-and-exporters.md (Experimental)
- docs/RFC-0010-policy-composition-helpers.md (Accepted)
- docs/RFC-0011-agent-harness-descriptor.md (Accepted)
- docs/PRIVACY-BASELINE.md
Historical Snapshots
- docs/ALPHA-CONTRACT.md
- docs/BETA-CONTRACT.md
- docs/BETA-CONTRACT-AUDIT.md
- docs/P0-TRIAGE.md
- docs/PRIVACY-ADOPTION.md
License
- Project license: MIT (LICENSE)
- Third-party notices: THIRD_PARTY_NOTICES.md
