ai-capability-framework
v1.0.0
Provider-agnostic AI capability contracts, runtime controls, evals, and model/provider adapters.
AI Capability Framework (AICF)
AI Capability Framework (AICF) is a provider-agnostic framework that helps application teams describe what an AI system is allowed to do, expose only the right tools to a model, validate every model tool call, and prove the behavior with deterministic evals.
AICF is not an agent framework. It is a governed capability layer for AI-accessible application functionality.
Models propose; applications validate, authorize, execute, and audit.
AICF supports OpenAI, Anthropic Claude, Google Gemini, Vercel AI SDK, Model Context Protocol, LangChain/LangGraph, and Semantic Kernel-compatible MCP/OpenAPI workflows. OpenAI is one adapter, not the architecture.
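To illustrate why one capability description can serve several providers, here is a minimal sketch that maps a single record to two provider tool shapes and keeps a binding map for resolving tool calls later. The `Capability` interface and the `toolName`, `toOpenAITool`, `toAnthropicTool`, and `buildBindings` helpers are hypothetical names for illustration, not AICF's actual API or manifest schema. The field layouts follow the publicly documented OpenAI Responses function-tool and Anthropic Messages tool shapes.

```typescript
// Hypothetical, simplified capability record; not AICF's actual manifest schema.
interface Capability {
  id: string; // e.g. "support.ticket.read"
  description: string;
  inputSchema: Record<string, unknown>; // JSON Schema for the arguments
}

// Provider tool names commonly disallow dots, so derive a provider-safe name.
function toolName(capabilityId: string): string {
  return capabilityId.replace(/[^a-zA-Z0-9_-]/g, "_");
}

// OpenAI Responses API function-tool shape.
function toOpenAITool(cap: Capability) {
  return {
    type: "function" as const,
    name: toolName(cap.id),
    description: cap.description,
    parameters: cap.inputSchema,
  };
}

// Anthropic Messages API tool shape.
function toAnthropicTool(cap: Capability) {
  return {
    name: toolName(cap.id),
    description: cap.description,
    input_schema: cap.inputSchema,
  };
}

// Binding map: provider tool name -> capability ID.
function buildBindings(caps: Capability[]): Map<string, string> {
  const bindings = new Map<string, string>();
  for (const cap of caps) {
    const name = toolName(cap.id);
    if (bindings.has(name)) throw new Error(`tool name collision: ${name}`);
    bindings.set(name, cap.id);
  }
  return bindings;
}

const readTicket: Capability = {
  id: "support.ticket.read",
  description: "Read a support ticket by ID.",
  inputSchema: {
    type: "object",
    properties: { ticketId: { type: "string" } },
    required: ["ticketId"],
    additionalProperties: false,
  },
};

const openaiTools = [readTicket].map(toOpenAITool);
const anthropicTools = [readTicket].map(toAnthropicTool);
const bindings = buildBindings([readTicket]);
```

Only the adapter functions know a provider's wire format; the capability record and the binding map stay provider-neutral.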
How AICF Works
manifests
-> validated registry
-> routed capability slice
-> optional runtime controls
-> provider tools
-> runtime validation
-> read/prepare execution
-> approval/commit lifecycle
-> optional governance control plane
-> evals
-> sanitized replay traces
-> evidence export
-> optional content provenance sidecars
In plain terms:
- You write public-safe manifests that describe capabilities, entities, and eval cases.
- AICF validates those manifests and builds a registry.
- Runtime routing picks the smallest safe slice of capabilities for a user request.
- Optional controls can deny, force approval, make matching capabilities read-only, or enforce per-run budgets.
- Provider adapters turn that slice into tool definitions for OpenAI or another provider.
- Tool calls map back to AICF capability IDs and are validated against the original schema.
- The runtime can execute host-registered read and prepare handlers.
- Commit remains host-controlled through prepared actions, approvals, idempotency, and optional audit ledger records.
- An optional self-hosted control plane can review capabilities, evidence, approvals, controls, and redacted replay metadata.
- Eval fixtures prove selection, arguments, refusal, approval, and no-commit boundaries without calling a model.
- Sanitized replay traces can be rerun or converted into review-required regression eval drafts.
- Evidence packs summarize public-safe governance, eval, conformance, approval, retention, and coverage status for review.
- Optional provenance hooks attach refs-and-hashes metadata to generated customer-facing content through host-owned publishing or signing pipelines.
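The control and validation steps in the list above can be sketched as a single decision function. This is a minimal illustration under stated assumptions: `decide`, `Controls`, `Decision`, and the flat `required` type map are hypothetical stand-ins, and the type check is far simpler than real schema validation. It is not AICF's runtime API.

```typescript
type ArgType = "string" | "number" | "boolean";

// Hypothetical capability record with a simplified argument contract.
interface Capability {
  id: string;
  mode: "read" | "prepare";
  required: Record<string, ArgType>;
}

// Hypothetical runtime controls, keyed by capability ID.
interface Controls {
  denied: Set<string>;
  readOnly: Set<string>;
  forceApproval: Set<string>;
  maxCallsPerRun: number;
}

type Decision =
  | { status: "ok" | "approval_required"; capabilityId: string; args: Record<string, unknown> }
  | { status: "denied" | "validation_error"; reason: string };

function decide(
  call: { name: string; arguments: string },
  bindings: Map<string, Capability>, // provider tool name -> capability
  controls: Controls,
  callsSoFar: number,
): Decision {
  // Map the provider tool name back to a known capability.
  const cap = bindings.get(call.name);
  if (!cap) return { status: "denied", reason: "unknown tool" };

  // Apply controls before touching the arguments.
  if (callsSoFar >= controls.maxCallsPerRun) return { status: "denied", reason: "run budget exhausted" };
  if (controls.denied.has(cap.id)) return { status: "denied", reason: "capability disabled" };
  if (controls.readOnly.has(cap.id) && cap.mode !== "read") {
    return { status: "denied", reason: "capability is read-only" };
  }

  // Validate the model-proposed arguments against the original contract.
  let parsed: unknown;
  try {
    parsed = JSON.parse(call.arguments);
  } catch {
    return { status: "validation_error", reason: "arguments are not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { status: "validation_error", reason: "arguments must be a JSON object" };
  }
  const args = parsed as Record<string, unknown>;
  for (const [key, type] of Object.entries(cap.required)) {
    if (typeof args[key] !== type) {
      return { status: "validation_error", reason: `${key} must be a ${type}` };
    }
  }

  const status = controls.forceApproval.has(cap.id) ? "approval_required" : "ok";
  return { status, capabilityId: cap.id, args };
}

const refundBindings = new Map<string, Capability>([
  ["support_refund_prepare", {
    id: "support.refund.prepare",
    mode: "prepare",
    required: { ticketId: "string", amount: "number" },
  }],
]);
const controls: Controls = {
  denied: new Set<string>(),
  readOnly: new Set<string>(),
  forceApproval: new Set<string>(["support.refund.prepare"]),
  maxCallsPerRun: 5,
};
const decision = decide(
  { name: "support_refund_prepare", arguments: '{"ticketId":"T-1","amount":20}' },
  refundBindings,
  controls,
  0,
);
```

In this sketch the model only proposes a call. Every branch that can reject it runs in application code before any handler executes.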
Start Here
- Supported Node.js versions: 20.x, 22.x, and 24.x. Package metadata uses engines.node >=20.
- New to AICF: Start here
- Provider-neutral quickstart: no-key path
- Choose a runtime: provider/runtime guide
- Optional provider quickstarts: OpenAI, Anthropic, and Gemini
- Documentation index: Docs
- Concrete OpenAI flow: OpenAI walkthrough
- Main terms: Glossary
- Full API reference: API
- Public API policy: root and subpath exports
- Agent skills package: AICF Agent Skills
- Security policy: SECURITY.md
- Release/certification: Final v1 certification
- Final certification matrix: local release gate
- npm release preflight: package ownership and tags
- License decision: MIT for v1
- Dependency license exceptions: reviewed exception register
What AICF Does / What Your App Does
| AICF does | Your app does |
| --- | --- |
| Validates capability, entity, eval, context, decision, and result contracts. | Owns production auth, account state, entitlements, and tenant boundaries. |
| Routes a small model-facing capability slice. | Decides which user request and host context to pass into AICF. |
| Exports provider tool definitions and binding maps. | Calls OpenAI or another provider with a caller-provided client. |
| Parses provider tool calls back to capability IDs. | Provides real data access and business logic handlers. |
| Validates tool arguments with AICF schemas before execution. | Performs durable storage, payment, email, ticketing, or other side effects. |
| Returns model-safe envelopes for success, validation errors, denials, approval-required actions, and failures. | Shows approval UI, collects approvals, and commits side effects through host-controlled lifecycle APIs. |
| Can write optional canonical audit ledger records with redacted refs and hashes. | Stores production audit evidence, retention policy, and compliance workflows in your own systems. |
| Can evaluate optional kill switches, circuit breakers, and per-run budgets. | Owns production control stores, authenticated operator workflows, and incident response. |
| Can govern whether host-supplied memory summaries may become model context. | Owns memory storage, consent, deletion, recall, identity resolution, and tenant scoping. |
| Provides an optional self-hostable control-plane API and reference UI for public-safe review. | Owns production access control, durable control-plane storage, approval identity, and evidence retention. |
| Can export public-safe evidence packs with gaps and disclaimers. | Owns compliance decisions, audit engagement, legal review, and production evidence systems. |
| Can create public-safe generated-content provenance sidecars and adapter-hook inputs. | Owns real content signing, document/media embedding, CMS integration, and authenticity claims. |
| Scores deterministic eval fixtures and provider conformance cases. | Produces candidate results from tests, mocks, or optional live runs. |
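The split in the table between preparing an action and committing its side effect can be sketched with a small in-memory lifecycle. This is a hedged illustration only: `ActionLifecycle`, `PreparedAction`, and the idempotency-key handling are hypothetical names and simplifications, not AICF's lifecycle API, and a production host would back them with durable storage.

```typescript
// Hypothetical prepared-action record held by the host application.
interface PreparedAction {
  id: string;
  capabilityId: string;
  args: Record<string, unknown>;
  approved: boolean;
}

class ActionLifecycle {
  private nextId = 0;
  private prepared = new Map<string, PreparedAction>();
  private committed = new Map<string, string>(); // idempotency key -> result

  // Prepare records intent only; no side effect runs here.
  prepare(capabilityId: string, args: Record<string, unknown>): PreparedAction {
    const action: PreparedAction = {
      id: `prep_${++this.nextId}`,
      capabilityId,
      args,
      approved: false,
    };
    this.prepared.set(action.id, action);
    return action;
  }

  // Approval is collected by the host (for example, from an approval UI).
  approve(actionId: string): void {
    const action = this.prepared.get(actionId);
    if (!action) throw new Error(`unknown prepared action: ${actionId}`);
    action.approved = true;
  }

  // Commit runs the host-owned side effect at most once per idempotency key.
  commit(
    actionId: string,
    idempotencyKey: string,
    sideEffect: (action: PreparedAction) => string,
  ): string {
    const prior = this.committed.get(idempotencyKey);
    if (prior !== undefined) return prior;
    const action = this.prepared.get(actionId);
    if (!action) throw new Error(`unknown prepared action: ${actionId}`);
    if (!action.approved) throw new Error("approval required before commit");
    const result = sideEffect(action);
    this.committed.set(idempotencyKey, result);
    return result;
  }
}

const lifecycle = new ActionLifecycle();
const refund = lifecycle.prepare("support.refund.prepare", { ticketId: "T-1", amount: 20 });

let sideEffectRuns = 0;
const issueRefund = (action: PreparedAction): string => {
  sideEffectRuns += 1;
  return `refund issued for ${String(action.args["ticketId"])}`;
};

lifecycle.approve(refund.id);
const first = lifecycle.commit(refund.id, "key-1", issueRefund);
const retry = lifecycle.commit(refund.id, "key-1", issueRefund); // replays the stored result
```

The retry returns the stored result instead of issuing a second refund, which is the property an idempotency key is meant to give a host that may see duplicate commit requests.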
Try It
Install and validate the public examples:
npm install
npm run validate
Build the CLI and inspect the example registry:
npm run build
node dist/cli.js inspect examples
Export OpenAI Responses function tools without calling a model:
node dist/cli.js openai-tools examples --context examples/support/openai/context.support_agent.json
Run the public mock runtime flow:
node examples/runtime-support-billing/run-mock.mjs
Run deterministic evals:
node dist/cli.js eval examples --results examples/eval-results/public.results.passing.json
For expected output excerpts and what each command proves, use Start here.
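As a rough picture of what scoring deterministic evals without a model can mean, the sketch below compares recorded candidate results against expected selections, arguments, refusals, and a no-commit boundary. The `EvalCase` and `CandidateResult` shapes and the `scoreCase` and `scoreAll` helpers are assumptions made for illustration. They do not reflect the actual format of AICF eval manifests or results files.

```typescript
// Hypothetical eval case; null means the case expects a refusal.
interface EvalCase {
  id: string;
  expectedCapabilityId: string | null;
  expectedArgs?: Record<string, unknown>;
}

// Hypothetical candidate result, produced by tests, mocks, or a recorded run.
interface CandidateResult {
  caseId: string;
  capabilityId: string | null;
  args?: Record<string, unknown>;
  committed?: boolean;
}

interface Score {
  caseId: string;
  pass: boolean;
  reason?: string;
}

function scoreCase(evalCase: EvalCase, result?: CandidateResult): Score {
  const caseId = evalCase.id;
  if (!result) return { caseId, pass: false, reason: "missing result" };
  // An eval run must never cross the commit boundary.
  if (result.committed) return { caseId, pass: false, reason: "commit crossed the eval boundary" };
  if (result.capabilityId !== evalCase.expectedCapabilityId) {
    return { caseId, pass: false, reason: "wrong selection" };
  }
  // Shallow comparison of expected argument values only.
  for (const [key, expected] of Object.entries(evalCase.expectedArgs ?? {})) {
    if (result.args?.[key] !== expected) {
      return { caseId, pass: false, reason: `argument mismatch: ${key}` };
    }
  }
  return { caseId, pass: true };
}

function scoreAll(cases: EvalCase[], results: CandidateResult[]): Score[] {
  const byCase = new Map(results.map((r) => [r.caseId, r] as const));
  return cases.map((c) => scoreCase(c, byCase.get(c.id)));
}

const cases: EvalCase[] = [
  { id: "read-ticket", expectedCapabilityId: "support.ticket.read", expectedArgs: { ticketId: "T-1" } },
  { id: "refuse-delete", expectedCapabilityId: null },
];
const results: CandidateResult[] = [
  { caseId: "read-ticket", capabilityId: "support.ticket.read", args: { ticketId: "T-1" } },
  { caseId: "refuse-delete", capabilityId: "support.ticket.read" },
];
const scores = scoreAll(cases, results);
```

Because the scorer is a pure function of the case and the recorded result, the same inputs always produce the same scores, with no model call involved.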
Documentation Map
Start and concepts:
- Documentation index
- Start here
- Getting started checklist
- Installation
- Provider-neutral quickstart
- Quickstart
- Concepts
- OpenAI quickstart
- Anthropic quickstart
- Gemini quickstart
- OpenAI walkthrough
- Glossary
- Public API policy
- 1.0 spec
- Host responsibilities
Runtime and policy:
- Governance lifecycle, risk, compatibility, and impact
- Governance CI gate
- Audit ledger records
- Trust, taint, redaction, and retention
- Governed memory and preferences
- Capability-aware security packs
- Runtime controls
- Runtime contracts
- Action lifecycle
- Policy broker
- Control plane
- OpenAI Responses runtime
- OpenAI Responses descriptor adapter
Evals and providers:
- Eval runner
- Eval manifests
- Security pack eval templates
- Replay and trace-to-golden
- Live evals
- EvalOps export interfaces
- Evidence export
- Content provenance hooks
- Provider foundation
- Choose a provider/runtime
- Provider conformance
- Anthropic Claude runtime
- Google Gemini runtime
- Vercel AI SDK bridge
- LangChain/LangGraph bridge
- MCP server runtime
- Semantic Kernel compatibility
- AWS reference integration
- AWS production reference adapters
- Observability runtime
Reference and release:
- API reference
- Capability manifests
- Interoperability
- Adapter roadmap
- Migration 0.1 to 1.0
- Release checklist
- Final v1.0 certification
- npm release preflight
- CHANGELOG
- CONTRIBUTING
- SECURITY
Examples
The public examples are synthetic:
- examples/01-basic-read-capability/ through examples/11-control-plane/ provide numbered README-first tutorials.
- examples/support/ describes a support ticket and refund workflow.
- examples/scheduling/ describes scheduling capabilities.
- examples/runtime-support-billing/ runs a mock route, read, prepare, approval, and commit flow without credentials.
- examples/control-plane/ runs a local governance control-plane reference app with synthetic seed state and ignored local mutations.
- examples/aws/ documents credential-free AWS adapter wiring and production host responsibilities.
- examples/providers/ contains README-only provider examples.
Private drafts, raw prompts, traces, provider payloads, generated local docs, and
local-only artifacts are excluded from tracked files. See AGENTS.md for the workspace
boundary.
Development Checks
Final v1.0 certification:
npm run check:certification
The normal development gate is shorter:
npm run lint
npm run build
npm run typecheck
npm run docs:build
npm test
npm run validate
npm run conformance
npm run gate:examples
npm run check
Provider live tests are opt-in and require explicit environment variables. Normal checks use mock clients, descriptor exports, synthetic fixtures, and no live model calls.
For artifact review:
npm run check:package
npm run skills:ci
npm run skills:check
npm run skills:pack:dry
npm pack
npm run release:preflight:npm
npm run release:publish:dry
npm run archive:source
npm run check:source-archive
Release tags publish two npm artifacts from the same commit and version:
ai-capability-framework and @aicf/agent-skills. For example, package version 1.0.0
uses tag v1.0.0 and the latest dist tag for both packages.
Use npm pack for npm package review and npm run archive:source for public source
review. Do not zip the working directory manually; raw workspace archives can include
.git/, dependencies, generated output, private notes, traces, logs, prompts, or
provider payloads.
CI also runs dedicated docs, security, release dry-run, package hygiene, conformance, and governance-gate workflows. See Release process and Release checklist. npm ownership and dist-tag checks are documented in npm release preflight. Final v1.0 certification is documented in Final v1.0 certification.
License
MIT
