@netlify/axis

v1.17.0

Published

a day ago

Open source tooling and scoring framework to measure how well services work for AI agents.

0High
0Medium
0Low

AXIS is an open source tooling and a scoring framework to measure how well services work for AI agents. Think Lighthouse, but for agent experience.

Give AXIS a scenario, an agent, and a prompt. It runs the agent, captures a full transcript, and produces a graded score across four independent dimensions: Goal Achievement, Environment, Service, and Agent.

Why AXIS

The web has Lighthouse. APIs have contract testing. Performance has k6. But there's no standardized way to answer: "How well does my system work when an AI agent tries to use it?".

As agents become a primary interface for interacting with sites, APIs, and developer platforms, the systems they interact with need to be measured and optimized for that experience — just like we optimize for page load time or accessibility. AXIS is that measurement.

Quick start

npm install @netlify/axis

axis.config.json:

{
  "scenarios": "./scenarios",
  "agents": ["claude-code"]
}

scenarios/hello-world.json:

{
  "name": "Hello world",
  "prompt": "Navigate to https://example.com and describe what you see on the page.",
  "judge": [
    { "check": "Agent visited the target URL", "weight": 0.5 },
    { "check": "Agent provided a description of the page content", "weight": 0.5 }
  ]
}

axis run

AXIS executes the scenario, scores the result, and writes a report to .axis/reports/.

Documentation

Full documentation lives at axis.run:

Quick start - install through your first scored run
Configuration - axis.config.json, scenarios, MCP servers, skills
CLI reference - axis run, axis reports, axis baseline
Running tests - execution model, workspace isolation, custom adapters, CI integration
Scoring framework - the four dimensions, signals, calibration

Programmatic API

Use the programmatic API when you want to integrate AXIS into an existing test runner, build tool, or CI pipeline rather than calling the CLI directly.

Roadmap

Delivered: scenario runner, four-dimension scoring pipeline, baselines with regression detection, MCP/skills wiring, custom adapter API, built-in adapters for Claude Code, Codex, and Gemini.

Planned:

Historical trending - score regression detection over time
AXIS badge - embeddable score badge for READMEs
Configurable judge - separate adapter/model for scoring, independent of the agent under test
Score thresholds - CI gating with configurable pass/fail thresholds
Human interruption detection - penalize agent requests for human intervention

Contributing

AXIS is built in the open. Contributions are welcome. New scenarios, agent adapters, bug fixes, and documentation improvements all help.

AXIS is open source under the MIT license, created by Netlify and developed with founding contributors including Auth0 and Resend.

Full docs: axis.run

Published

Vulnerabilities

Links

Maintainers

Keywords