@useanvil/sdk

v0.1.6

Anvil: execution correctness layer for AI-triggered actions

Anvil

Anvil is an execution correctness layer for risky actions.

Put it in front of the calls that can move money, mutate a system, trigger an irreversible workflow step, or cause an expensive side effect if they happen twice.

Anvil helps teams make sure those actions:

  • do not run twice by accident
  • stay inside policy before they run
  • leave behind durable receipts operators can inspect later
  • help verify whether the downstream side effect actually happened
  • can be observed safely in ghost before enforce
  • stay under centralized rollout control
  • leave an operational record teams can trust

The point is not “more infrastructure.” The point is fewer bad outcomes, safer automation rollout, less hand-rolled defensive code, and clearer proof that AI or software is operating correctly in production.

Production Truth

For design-partner production deployments, the durable model is:

  • Postgres is the source of truth for receipts, receipt events, audit records, durable commands, workflow recovery state, and control history.
  • Redis is coordination only: locks, leases, counters, and short-lived in-flight coordination.
  • Local JSONL is optional debug/export output only.

Practical consequence:

  • if Postgres is unavailable before a risky side effect starts, Anvil fails closed
  • if a process crashes after the provider may have accepted the action, the durable attempt record is still in Postgres
  • if Redis is flushed, recovery still works because the durable truth is not in Redis
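The ordering behind those consequences can be sketched as follows. This is a minimal illustration, not Anvil's implementation: `AttemptStore` and `runFailClosed` are hypothetical names standing in for the Postgres-backed store and the guarded call.

```typescript
// Hypothetical durable store interface; stands in for Postgres in this sketch.
interface AttemptStore {
  recordAttempt(key: string): Promise<void>;
  recordOutcome(key: string, outcome: 'succeeded' | 'failed'): Promise<void>;
}

// Write the durable attempt record first; only then run the risky side effect.
async function runFailClosed<T>(
  store: AttemptStore,
  key: string,
  execute: () => Promise<T>
): Promise<T> {
  // If this throws (store unavailable), execute() is never called: fail closed.
  await store.recordAttempt(key);

  let result: T;
  try {
    result = await execute();
  } catch (error) {
    await store.recordOutcome(key, 'failed');
    throw error;
  }

  // A crash before this line still leaves the attempt record behind for recovery.
  await store.recordOutcome(key, 'succeeded');
  return result;
}
```

Because the attempt is recorded before `execute()` runs, a crash mid-call leaves an attempt with no outcome, which is the signal a recovery job would need to treat the action as unknown rather than as never started.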

Runtime boot rule:

  • if ANVIL_POSTGRES_URL is set, Anvil now auto-registers the built-in PostgresPersistenceAdapter at startup
  • you only need to call registerPersistenceAdapter() yourself when you want a custom adapter or test double

Lock lease invariant:

  • ANVIL_LOCK_TTL_SECONDS defaults to 30 and must be longer than the slowest real execute() call.
  • ANVIL_LOCK_KEEPALIVE_ENABLED=true renews the Redis lock every half-TTL while the action is still running.
  • Example: if a Stripe charge can take 25s, a 30s lease leaves only 5s of margin. That is risky in production. Increase the TTL rather than relying on luck.
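The arithmetic in that example is easy to check before rollout. The helper below is a hypothetical sketch (it is not part of the SDK) that takes the configured TTL and the slowest observed execute() duration and reports the margin, the half-TTL keepalive interval, and whether the invariant holds.

```typescript
// Hypothetical helper for reasoning about the lock lease; not an Anvil API.
interface LeaseCheck {
  marginSeconds: number;         // TTL minus the slowest observed execute() duration
  keepaliveEverySeconds: number; // renewal interval when keepalive is enabled (half-TTL)
  ttlIsLongEnough: boolean;      // the invariant: TTL must exceed the slowest call
}

function checkLockLease(ttlSeconds: number, slowestExecuteSeconds: number): LeaseCheck {
  return {
    marginSeconds: ttlSeconds - slowestExecuteSeconds,
    keepaliveEverySeconds: ttlSeconds / 2,
    ttlIsLongEnough: ttlSeconds > slowestExecuteSeconds
  };
}

// The Stripe case from the text: default 30s TTL, 25s worst-case charge.
console.log(checkLockLease(30, 25));
// { marginSeconds: 5, keepaliveEverySeconds: 15, ttlIsLongEnough: true }
```

The invariant technically holds at 5s of margin, which is why the guidance is to raise the TTL: passing the check is not the same as having headroom.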

Repo Layout

The repo is organized around a few clear layers:

  • src/: runtime SDK, CLI, control plane, receipts, policies, and UI generators
  • src/ui/: generated local product surfaces like Start, Ghost Report, Policy Studio, and Control Center
  • src/sinks/: optional output adapters for observability and local metrics
  • policies/: bundled starter and production-ready policy files
  • demo/: runnable demos
  • eval/fixtures/: bundled Ghost evaluation datasets
  • scripts/: verification, benchmarking, evidence, and ops drills
  • tests/: unit and integration coverage
  • docs/: product docs, PRDs, and operations notes
  • preview/: static prototype and marketing previews
  • artifacts/: generated local output only, not source of truth
  • dist/: build output only

Ghost Modes

There are two different Ghost stories in this repo, and they are not the same thing:

  • Runtime ghost mode is the SDK/CLI/app rollout mode for live traffic. It still runs the real provider call and returns the real result or thrown error to the app. It only records what Anvil would have done as a separate shadow decision: would_allow, would_dedup, would_block, or would_unknown.
  • Ghost replay reconstructs execution truth from messy logs after the fact.

Ghost replay works like this:

  • parse raw logs into normalized events
  • group normalized events into entities
  • run deterministic inference for attempts, final status, retries, and duplicates
  • optionally attach AI only for narrow low-confidence unknown cases

The deterministic layer is the source of truth. AI is optional enrichment only and never gets to rewrite a deterministic success or failure.
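To make the deterministic step concrete, here is a rough sketch of grouping and inference over already-normalized events. The event shape, the status names, and the rules are assumptions chosen for illustration; the real Ghost schema and inference rules may differ.

```typescript
// Hypothetical normalized event shape; the real Ghost schema may differ.
type EventKind = 'attempt' | 'success' | 'failure';

interface NormalizedEvent {
  entityId: string; // e.g. an order or payment identifier
  kind: EventKind;
  at: number;       // timestamp in milliseconds
}

interface EntityTruth {
  attempts: number;
  retries: number;
  duplicateSuccesses: number;
  finalStatus: 'succeeded' | 'failed' | 'unknown';
}

function inferTruth(events: NormalizedEvent[]): Map<string, EntityTruth> {
  // Group events by entity, in time order.
  const byEntity = new Map<string, NormalizedEvent[]>();
  for (const event of [...events].sort((a, b) => a.at - b.at)) {
    const group = byEntity.get(event.entityId) ?? [];
    group.push(event);
    byEntity.set(event.entityId, group);
  }

  // Apply fixed rules per entity; no model is involved at this layer.
  const truth = new Map<string, EntityTruth>();
  for (const [entityId, group] of byEntity) {
    const attempts = group.filter((e) => e.kind === 'attempt').length;
    const successes = group.filter((e) => e.kind === 'success').length;
    const failures = group.filter((e) => e.kind === 'failure').length;

    let finalStatus: EntityTruth['finalStatus'] = 'unknown';
    if (successes > 0) finalStatus = 'succeeded';
    else if (failures > 0 && failures >= attempts) finalStatus = 'failed';

    truth.set(entityId, {
      attempts,
      retries: Math.max(0, attempts - 1),
      duplicateSuccesses: Math.max(0, successes - 1),
      finalStatus
    });
  }
  return truth;
}
```

In this sketch an attempt with no terminal event stays `unknown`; those are the only entities an optional AI pass would ever look at, and it could not change a `succeeded` or `failed` verdict.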

Run the Ghost pipeline directly:

npm run eval -- --logs ./eval/fixtures/adversarial_logs.ndjson --truth ./eval/fixtures/adversarial_ground_truth.json --out ./artifacts/ghost-eval

That writes:

  • artifacts/.../summary.json
  • artifacts/.../comparisons.json
  • artifacts/.../operations.json

Practical rule:

  • If it defines behavior, it belongs in src/, policies/, tests/, or docs/.
  • If it is generated by a command, it belongs in artifacts/ or dist/.

Runtime Ghost Mode v2 Migration

Old runtime ghost behavior skipped the protected execute() callback and returned synthetic runtime statuses such as blocked, duplicate, or executed with result = null. That was wrong for live traffic because enabling ghost could suppress real side effects or break caller-visible return contracts.

Ghost Mode v2 fixes that contract:

  • execute() still runs in runtime ghost mode.
  • The app still gets the real return value or the real thrown error.
  • The top-level runtime status is the real execution outcome.
  • The shadow decision now lives in response.ghost and audit record.ghost as would_allow, would_dedup, would_block, or would_unknown.
  • Ghost observation state is kept separate from enforce-mode idempotency state.

For the runtime hot-path map, Redis round-trip counts, benchmark command, and provider-facing idempotency audit, see docs/GHOST_MODE_V2_GA_READINESS.md.
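The v2 contract can be summarized in a few lines of code. The sketch below is an assumption-laden illustration rather than the SDK's internals: `runInGhost`, the in-memory `ghostSeenKeys` set, and the single `maxAmount` policy check are all hypothetical, and the sketch never produces `would_unknown`.

```typescript
type ShadowDecision = 'would_allow' | 'would_dedup' | 'would_block' | 'would_unknown';

interface GhostResult<T> {
  result: T;             // the real return value from execute()
  ghost: ShadowDecision; // what enforce mode would have decided
}

// Hypothetical observation state, kept apart from any enforce-mode idempotency state.
const ghostSeenKeys = new Set<string>();

async function runInGhost<T>(
  key: string,
  amount: number,
  maxAmount: number,
  execute: () => Promise<T>
): Promise<GhostResult<T>> {
  // Compute the shadow decision without letting it affect execution.
  let ghost: ShadowDecision = 'would_allow';
  if (ghostSeenKeys.has(key)) ghost = 'would_dedup';
  else if (amount > maxAmount) ghost = 'would_block';
  ghostSeenKeys.add(key);

  // execute() always runs; a thrown error propagates to the caller unchanged.
  const result = await execute();
  return { result, ghost };
}
```

The key property is that no branch skips `execute()` or substitutes a synthetic result, so turning ghost on cannot change what the caller sees.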

Quick Start

  1. Install dependencies:
npm install
  2. Start with the guided front door:
anvil start

That command now writes the Start page, runs a protected local execution, blocks the duplicate replay, and prints a real receipt path so the first experience ends with proof instead of setup trivia.

  3. Create your local config if you want a .env scaffold:
npm run init
  4. Check your environment:
npm run doctor

By default, runtime installs now start in ANVIL_MODE=enforce. Switch to ANVIL_MODE=ghost explicitly when you want observe-only rollout on real traffic.

If you are bringing up production durability:

npm run db:migrate
  5. Re-run the dead-simple local demo directly any time:
npm run demo
  6. See value immediately with the built-in ghost sample:
npm run ghost:sample
  7. Measure runtime overhead on your own Redis before rollout changes:
npm run build --silent
npm run bench:guard

That benchmark compares enforce and runtime ghost in separate child processes and reports the delta in throughput plus average and p95 latency.

  8. Run the duplicate-prevention demo with a Stripe test key:
npm run demo:stripe -- --stripe-key sk_test_...
  9. For existing product code, start with the default adoption namespace:
import { safe } from '@useanvil/sdk';

There is one mental model: point Anvil at the risky thing and protect it. You do not have to know Anvil's primitive taxonomy first.

  • safe.protect(...) — protect one inline risky side effect.
  • safe.wrap(...) — protect an existing function without changing its contract.

Describe the risky call in plain business terms and Anvil infers the surface (payment / subscription / external mutation / agent step) and applies the right keying, metadata, reporting vocabulary, and policy defaults automatically:

import { safe } from '@useanvil/sdk';

// Inferred as a payment charge — keyed on orderId, reported as payments.charge.
await safe.protect({
  orderId,
  amount: cents,
  execute: async () =>
    stripe.charges.create({ amount: cents, currency: 'usd', customer: customerId })
});

// Inferred as a refund (the `reason` field is the high-confidence refund signal).
await safe.protect({ orderId, amount: cents, reason: 'customer_request', execute });

// Inferred as an external mutation — keyed on stable business identity.
await safe.protect({
  provider: 'salesforce', resourceType: 'contact', resourceId, operation: 'update-email',
  method: 'PATCH', execute
});

// Inferred as an agent/workflow step — enforces workflow budgets.
await safe.protect({ workflow, stepId, toolName, execute });

// Inferred as inbound event processing — dedups a retried webhook on its eventId,
// so your handler runs exactly once. (Outbound delivery has a deliveryId; inbound never does.)
await safe.protect({ eventId: evt.id, provider: 'stripe', execute: () => handleEvent(evt) });

// Inferred as a payout / transfer — keyed on the durable payout/transfer id.
await safe.protect({ payoutId, amount: cents, execute });
await safe.protect({ transferId, amount: cents, destinationAccount, execute });

// Inferred as an OUTBOUND webhook delivery — keyed on eventId + deliveryId.
// (An eventId WITHOUT a deliveryId is inferred as inbound webhook.received instead.)
await safe.protect({ endpoint, eventId, deliveryId, provider: 'stripe', execute });

// Inferred as an email send — keyed on messageId + recipient.
await safe.protect({ messageId, recipient, execute });

// Inferred as a support ticket create — keyed on provider + externalCaseId.
await safe.protect({ provider: 'zendesk', externalCaseId, execute });

execute() receives the same computed context the explicit surface would hand it, so migrating loses nothing. Most importantly, that context carries the idempotencyKey to forward to your provider for provider-level idempotency, along with the resolved surface, the business metadata, the provider-ready providerMetadata, and any workflow:

await safe.protect({
  paymentIntentId,
  amount: cents,
  confirm: true,
  execute: async ({ idempotencyKey, providerMetadata }) =>
    stripe.paymentIntents.confirm(paymentIntentId, { idempotencyKey, metadata: providerMetadata })
});

Anvil leads with inference but never guesses an idempotency key. When the data is ambiguous (e.g. a paymentIntentId that could be a confirm or a capture), it throws AnvilAmbiguousProtectionError, which names the candidate surfaces and the field (or the as override) that resolves it. For the confirm/capture case you just add the intent boolean; no taxonomy is needed:

// Disambiguate a payment intent with a plain business word:
await safe.protect({ paymentIntentId, amount: cents, confirm: true, execute }); // → payment.confirm
await safe.protect({ paymentIntentId, amount: cents, capture: true, execute }); // → payment.capture

// Or, if you prefer, pin the surface explicitly — same key, same result:
await safe.protect({ as: 'payment.capture', paymentIntentId, amount: cents, execute });
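For intuition, the refuse-rather-than-guess behavior for the confirm/capture case might look roughly like this. It is a sketch under assumptions: `AmbiguousProtectionError` and `classifyPaymentIntent` are hypothetical stand-ins, not the SDK's actual error class or classifier.

```typescript
// Hypothetical stand-in for the SDK's AnvilAmbiguousProtectionError.
class AmbiguousProtectionError extends Error {
  readonly candidates: string[];
  readonly resolveWith: string[];

  constructor(candidates: string[], resolveWith: string[]) {
    super(`Ambiguous between ${candidates.join(', ')}; add one of: ${resolveWith.join(', ')}`);
    this.name = 'AmbiguousProtectionError';
    this.candidates = candidates;
    this.resolveWith = resolveWith;
  }
}

type PaymentIntentSurface = 'payment.confirm' | 'payment.capture';

interface PaymentIntentInput {
  paymentIntentId: string;
  confirm?: boolean;
  capture?: boolean;
  as?: PaymentIntentSurface;
}

function classifyPaymentIntent(input: PaymentIntentInput): PaymentIntentSurface {
  if (input.as) return input.as; // an explicit pin always wins over inference
  if (input.confirm && !input.capture) return 'payment.confirm';
  if (input.capture && !input.confirm) return 'payment.capture';
  // Neither flag (or both) is set: refuse instead of guessing a surface and key.
  throw new AmbiguousProtectionError(
    ['payment.confirm', 'payment.capture'],
    ['confirm', 'capture', 'as']
  );
}
```

The error carries both the candidate surfaces and the fields that would resolve them, so the caller can fix the call site without learning the full taxonomy.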

safe.wrap(...) works the same way; business fields may be plain values or factories of the call context so keys can be derived from arguments:

const createCharge = safe.wrap({
  fn: stripe.charges.create,
  orderId: ({ args }) => args[0].metadata.orderId,
  amount: ({ args }) => args[0].amount
}); // → inferred payment.charge

Fully explicit / advanced path. The productized surfaces are still first-class and remain the right choice when you want zero inference or provider-shaped naming. Use them directly, or pin any universal call with as:

  • safe.payments.charge|confirm|capture|refund(...)
  • safe.subscriptions.create|change|cancel|resume(...)
  • safe.external.mutation(...)
  • safe.agent.toolCall(...) / safe.agent.step(...)

You can also stay fully explicit on safe.protect/safe.wrap by passing key and action yourself — that is the original contract and is unchanged.

// Explicit: you supply the key and canonical action yourself.
const charge = await safe.protect({
  key: safe.key.payment(orderId),
  action: 'payment.charge',
  amount: cents,
  execute: async () =>
    stripe.charges.create({ amount: cents, currency: 'usd', customer: customerId })
});
  10. Use guard() directly only when you explicitly want Anvil's structured response object in your own code:
import { guard, key } from '@useanvil/sdk';

await guard({
  key: key.payment('order-8821'),
  action: 'payment.charge',
  amount: 4999,
  execute: async () => ({ ok: true })
});

policy defaults to stripe-v1, so you only need to pass it when using a different policy.

Practical rule:

  • Start with safe.protect() (inline) or safe.wrap() (existing function) and describe the risky thing in business terms. Let Anvil infer the surface.
  • Reach for safe.payments.*, safe.subscriptions.*, safe.external.mutation(), or safe.agent.toolCall() when you want to be fully explicit, or pin a universal call with as.
  • Pass key + action to safe.protect()/safe.wrap() when you want to control the idempotency key yourself.
  • Use guard() when you want the full structured AnvilResponse.

Safe adoption rule:

  • If a team is unsure which surface to choose, that is exactly the case the universal path is for — safe.protect({ ...business fields, execute }) and let inference decide.
  • If Anvil throws AnvilAmbiguousProtectionError, it is telling you the data is genuinely ambiguous. Add the field it names or pin the surface with as — do not work around it, because a wrong key means a wrong dedup decision.
  • Move to direct guard() handling only when you actually want downstream code to consume Anvil statuses like duplicate, blocked, or unknown.

Universal protection — design decisions:

  • Smart defaults, safe fallbacks — never magic. Inference keys off stable, structured business fields (provider/resourceId, subscriptionId, paymentIntentId, workflow+stepId+toolName), never on function or variable names. Callee-name heuristics are fine for CLI suggestions but too weak for runtime keying.
  • Refuse over guess. When two surfaces are plausible from the same data (confirm vs capture, or a bare subscriptionId), Anvil throws instead of guessing. A wrong classification is a wrong idempotency key, which in a dedup engine can mean a double-charge or a swallowed write.
  • Fail loud on bad input. Malformed input is rejected up front with AnvilInvalidProtectionInputError before anything is classified or executed, so a poisoned value can never reach the key, the payload hash, or a budget comparison. A non-finite or negative amount (NaN/Infinity/-5) is refused outright — it would otherwise silently defeat budget caps (NaN > limit is always false) — and a missing/non-function execute fails with a clear message instead of a deep crash. Empty or whitespace-only identity fields are treated as absent, so a typo can never key on " ".
  • Name the missing field. When you are one field short of a valid surface (e.g. an external mutation without method, or a customerId with no priceId), the error names exactly what to add rather than a generic "could not infer" — because a confusing error is what pushes people toward hand-rolling keys, the footgun this layer exists to remove.
  • Single source of truth. The universal path does not re-implement keying or metadata. It classifies, then dispatches to the existing product surfaces, so Ghost/audit/receipts keep speaking the correct business language for free.
  • Zero-loss migration (semantic fidelity). The universal path dispatches to the same surfaces, so the result, the thrown errors (provider errors, duplicate replay, blocked), the receipts, and the metadata are byte-for-byte what the explicit surface produces. The execute() callback receives the same computed context too — idempotencyKey, surface, metadata, providerMetadata, workflow — so moving an explicit call to the universal form never drops information you were relying on.
  • Override precedence. as pins the surface; an explicit key pins the idempotency key; explicit action (on safe.protect/safe.wrap) keeps the original low-level contract. Explicit always wins over inference.
  • Backward compatible. Existing callers that pass action (and key) are routed down the exact original path, including wrap's lightweight hosted lock/complete protocol. The universal path is purely additive. Hosted mode still works: universal protect/wrap dispatch through the product surfaces into guard(), which has its own full hosted-API path (ANVIL_API_KEY), so no local Redis is required.
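The "fail loud on bad input" decision is worth seeing in code, because the NaN case is easy to miss. The validators below are a hypothetical sketch (the names are not the SDK's) of rejecting bad amounts and treating blank identity fields as absent.

```typescript
// Hypothetical stand-in for AnvilInvalidProtectionInputError.
class InvalidProtectionInputError extends Error {}

// Empty or whitespace-only identity values count as absent, so " " can never become a key.
function normalizeIdentity(value: string | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// Reject NaN, Infinity, and negative amounts before any classification or budget check.
function assertValidAmount(amount: number): void {
  if (!Number.isFinite(amount) || amount < 0) {
    throw new InvalidProtectionInputError(
      `amount must be a finite, non-negative number; got ${amount}`
    );
  }
}

// Why NaN must be rejected up front: a naive budget cap never trips on it.
const limit = 5000;
console.log(Number.NaN > limit); // false
```

Validating before classification means a bad value never reaches the key, the payload hash, or the budget comparison, which is the ordering the list above describes.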

Integration proof:

  • See docs/INTEGRATION_PROOF.md for four production-style examples: existing backend function, API handler, worker/job, and webhook handler.

Production deployment:

What Anvil Is

Anvil is a runtime execution control layer for side-effectful actions.

Use it when an agent, workflow, or service can:

  • move money
  • mutate an external system
  • trigger a step that should not run twice

In practice, Anvil sits between “request received” and “real-world side effect happens.”

Anvil is strongest on:

  • payments and refunds
  • external API calls
  • workflow steps with real-world side effects

What Anvil Is Not

Anvil is not:

  • a general workflow engine
  • a broad authorization platform
  • a replacement for business logic or provider-native idempotency

If a path is read-only or low-risk, Anvil is probably unnecessary.

The strongest setup is usually:

  • provider-native idempotency where available
  • Anvil around the full side-effectful workflow

That combination is stronger than either layer alone.

Common Commands

npm run init
npm run doctor
npm run start
npm run demo
npm run cc
npm run policy:studio
npm run ghost:sample
npm run demo:stripe -- --stripe-key sk_test_...
npm run verify

Claude Code / Agent Quick Start

Born protected: the agent pack

The highest-leverage protection is to never make it a separate step. init-agents teaches your AI coding tools to write risky calls already wrapped, in ghost mode, at generation time:

npx @useanvil/sdk init-agents

It detects which agents your repo uses and drops one drop-in guidance file for each:

  • Claude Code → .claude/skills/anvil-protect/SKILL.md
  • Cursor → .cursor/rules/anvil.mdc
  • Codex / others → an AGENTS.md section

All three are compiled from a single source (agent-pack/anvil-wrap-guidance.md), so every agent writes the byte-identical wrap({ fn, key, action }) shape. After this, asking any of them to "add a Stripe charge" yields code that compiles, runs, sits in ghost mode, and shows up in your dashboard — with a durable idempotency key or an honest TODO(anvil) when one can't be inferred. Reads are never wrapped.

npx @useanvil/sdk init-agents --all      # install for every agent regardless of detection
npx @useanvil/sdk init-agents --check    # CI-friendly: exit 1 if a dropped file is stale

The pack is regenerated on every publish and guarded by a drift test, so the files you install always match the SDK you installed.

MCP server: protect & introspect from your agent

The agent pack makes new code born-protected. The MCP server lets the agent retrofit existing calls and verify protection on demand — same codemod, so the wraps are byte-identical. It runs locally over stdio and inherits "which repo / which file" from the agent's workspace, so there's nothing to point at.

Five small tools: find_action, protect_action (wraps the real call in ghost mode, returns a unified diff + the durable key it chose or an honest TODO, reversible via .anvil-bak), check_protection, list_observations, and set_mode (the only way to enable enforcement).

Configure once — no global install (npx fetches it):

claude mcp add anvil -- npx -y @useanvil/sdk mcp

Or add to Claude Code's .mcp.json / Cursor's .cursor/mcp.json:

{
  "mcpServers": {
    "anvil": { "command": "npx", "args": ["-y", "@useanvil/sdk", "mcp"] }
  }
}

Run anvil mcp --config to print this any time. Then just ask: "protect the payment.charge in this repo" — the agent wraps it, shows the diff, done.

Inline protection

For AI-triggered payments or API calls, the safest copy-paste path for an existing system is to describe the risky thing and let Anvil infer the surface:

import { safe } from '@useanvil/sdk';

await safe.protect({
  orderId,
  amount: cents,
  execute: async () => stripe.charges.create({
    amount: cents,
    currency: 'usd',
    customer: customerId
  })
});

Prefer to pin the key and action yourself? That contract is unchanged:

await safe.protect({
  key: safe.key.payment(orderId),
  action: 'payment.charge',
  amount: cents,
  execute: async () => stripe.charges.create({ amount: cents, currency: 'usd', customer: customerId })
});

If you need to persist the key in a queue or workflow step, round-trip it safely:

import { key, safe } from '@useanvil/sdk';

const storedKey = safe.key.payment(orderId).toString();

await safe.protect({
  key: key.from(storedKey),
  action: 'payment.charge',
  amount: cents,
  execute: async () => doWork()
});

Product surface example:

import { safe } from '@useanvil/sdk';

await safe.payments.confirm({
  paymentIntentId: 'pi_123',
  amount: 4999,
  execute: async ({ paymentIntentId, idempotencyKey }) =>
    stripe.paymentIntents.confirm(paymentIntentId, {
      idempotencyKey
    })
});

Stripe money movement rule:

  • Use safe.payments.charge(...), safe.payments.confirm(...), safe.payments.capture(...), or safe.payments.refund(...) as the default path.
  • Key charges and refunds from durable business identity like orderId, refund business identity, or the payment intent being acted on. Do not key from request IDs, retry attempts, or webhook delivery IDs.
  • Keep Stripe idempotency turned on with the idempotencyKey Anvil gives you, but do not confuse that with full execution correctness.

What Stripe idempotency alone does not solve:

  • it does not enforce your retry or amount policy before the call fires
  • it does not give operators one durable receipt across retries, duplicates, and unknown outcomes
  • it does not tell you whether a provider timeout is safe to retry without reconciliation
  • it does not protect non-Stripe work around the same money movement path

Replay safety rule:

  • replay is safe only when Anvil captured a replayable original provider result, so duplicate callers can receive the same value without re-running the charge or refund
  • replay is not safe when the original result was not serializable or otherwise not replayable; in that case Anvil preserves the protection decision and throws instead of fabricating a fake provider result
  • when runtime truth is unknown, hold retries until Stripe reconciliation or manual operator review says the next attempt is safe
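One way to picture the replayable versus non-replayable split is a JSON round trip. This is a simplified sketch with hypothetical names (`captureResult`, `replayForDuplicate`); how Anvil actually stores and judges results is not shown in this README, and a JSON round trip ignores details such as Date objects.

```typescript
// Hypothetical stored-result shape for illustration only.
type StoredResult =
  | { replayable: true; json: string }
  | { replayable: false };

function captureResult(result: unknown): StoredResult {
  try {
    const json = JSON.stringify(result);
    // At runtime JSON.stringify yields undefined for undefined and for functions.
    if (typeof json !== 'string') return { replayable: false };
    return { replayable: true, json };
  } catch {
    // Circular structures and BigInt values make JSON.stringify throw.
    return { replayable: false };
  }
}

function replayForDuplicate(stored: StoredResult): unknown {
  if (!stored.replayable) {
    // Keep the dedup decision, but refuse to fabricate a provider result.
    throw new Error('Duplicate detected, but the original result is not replayable');
  }
  return JSON.parse(stored.json);
}
```

In both branches the duplicate never re-runs the charge or refund; the only difference is whether the caller gets the original value back or an error to handle.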

Built-In Adapters

Plain English: built-in adapters mean you do not have to hand-build the Anvil wrapper for the most common risky calls.

Your app still performs the real provider call inside execute, but Anvil now gives you the recommended key shape, metadata, receipt context, and verification posture by default.

Most application code should prefer safe.payments.*, safe.external.*, safe.subscriptions.*, or safe.agent.toolCall(...) first. Reach for safe.adapters.* only when the provider-shaped surface is the more natural fit than the product-shaped one.

Adapters return an AnvilResponse on purpose. If you need a non-breaking call surface for existing code paths, prefer wrap(), protect(), or adapters.protected.* instead of swapping call sites straight to guard() or an adapter response object.

Built in today:

  • Stripe charge
  • Stripe refund
  • Stripe payout
  • Stripe transfer
  • Stripe subscription cancel
  • Webhook delivery
  • Email send
  • Support ticket create
  • Generic external API mutation

Example: Stripe charge

import Stripe from 'stripe';
import { adapters } from '@useanvil/sdk';

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);

await adapters.stripe.charge({
  orderId: 'order-8821',
  amount: 4999,
  execute: async (context) =>
    stripe.charges.create({
      amount: context.amount,
      currency: context.currency,
      metadata: context.stripeMetadata
    }, {
      idempotencyKey: context.idempotencyKey
    })
});

Same adapter, production-safe return contract:

const charge = await adapters.protected.stripe.charge({
  orderId: 'order-8821',
  amount: 4999,
  execute: async (context) =>
    stripe.charges.create({
      amount: context.amount,
      currency: context.currency,
      metadata: context.stripeMetadata
    }, {
      idempotencyKey: context.idempotencyKey
    })
});

Example: email send

import { adapters } from '@useanvil/sdk';

await adapters.comms.emailSend({
  recipient: 'user@example.com', // placeholder address
  messageId: 'welcome-8821',
  provider: 'resend',
  template: 'welcome',
  execute: async () => sendEmailSomehow()
});

Example: support ticket create

import { adapters } from '@useanvil/sdk';

await adapters.support.ticketCreate({
  provider: 'zendesk',
  externalCaseId: 'case-8821',
  customerId: 'cust-42',
  execute: async () => createTicketSomehow()
});

If the risky call is a plain outbound HTTP write, the default path should be safe.external.mutation(...), not a custom wrapper.

Minimum fields for a safe generic mutation:

  • method: the write verb that will fire
  • provider: the external or internal system being changed
  • resourceType: what business object is being changed
  • resourceId: the stable business identifier for that object
  • operation: the business mutation being attempted
  • target: optional request path or URL for operator context
  • actorId / workflow: optional governance and runtime context when the mutation sits inside a job or agent workflow

Keying rule:

  • Prefer the same stable business identity you would use to explain the mutation to an operator: provider + resourceType + resourceId + operation.
  • Do not key by request IDs, trace IDs, queue attempt IDs, retry counters, timestamps, webhook delivery attempts, or process-local UUIDs.
  • If the same business mutation can be sent to multiple URLs over time, keep the same business key and record the URL in target.
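As a rough illustration of the keying rule, the sketch below derives a key from the four business fields and deliberately leaves `target` out. The `businessKey` helper and its colon-joined format are assumptions for this example; Anvil's real key encoding is not documented here, and the v59.0 path is made up to show a changed URL.

```typescript
interface MutationIdentity {
  provider: string;
  resourceType: string;
  resourceId: string;
  operation: string;
  target?: string; // recorded for operators, deliberately excluded from the key
}

// Hypothetical key format built only from stable business identity.
function businessKey(m: MutationIdentity): string {
  return [m.provider, m.resourceType, m.resourceId, m.operation]
    .map((part) => encodeURIComponent(part.trim()))
    .join(':');
}

const base = {
  provider: 'salesforce',
  resourceType: 'contact',
  resourceId: '0038A00000F1ABC',
  operation: 'update-email'
};

// Same business mutation sent to two different URLs over time.
const viaOldUrl = businessKey({ ...base, target: '/services/data/v59.0/sobjects/Contact/0038A00000F1ABC' });
const viaNewUrl = businessKey({ ...base, target: '/services/data/v60.0/sobjects/Contact/0038A00000F1ABC' });

console.log(viaOldUrl);               // salesforce:contact:0038A00000F1ABC:update-email
console.log(viaOldUrl === viaNewUrl); // true
```

Nothing request-scoped (trace IDs, attempt counters, timestamps) enters the function, so a retry of the same mutation always lands on the same key.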

Examples:

Update Salesforce contact

import { safe } from '@useanvil/sdk';

await safe.external.mutation({
  provider: 'salesforce',
  resourceType: 'contact',
  resourceId: '0038A00000F1ABC',
  operation: 'update-email',
  method: 'PATCH',
  target: '/services/data/v60.0/sobjects/Contact/0038A00000F1ABC',
  actorId: 'ops-user-7',
  execute: async () =>
    salesforce.patch('/services/data/v60.0/sobjects/Contact/0038A00000F1ABC', {
      Email: 'user@example.com' // placeholder address
    })
});

Patch HubSpot record

await safe.external.mutation({
  provider: 'hubspot',
  resourceType: 'company',
  resourceId: '9482211',
  operation: 'patch-lifecycle-stage',
  method: 'PATCH',
  target: '/crm/v3/objects/companies/9482211',
  workflow: {
    workflowId: 'nightly-crm-sync'
  },
  execute: async () =>
    hubspot.patch('/crm/v3/objects/companies/9482211', {
      properties: {
        lifecyclestage: 'customer'
      }
    })
});

Cancel account in internal billing API

await safe.external.mutation({
  provider: 'internal-billing-api',
  resourceType: 'account',
  resourceId: account.id,
  operation: 'cancel-account',
  method: 'POST',
  target: `/v1/accounts/${account.id}/cancel`,
  actorId: req.user.id,
  execute: async () =>
    billingClient.post(`/v1/accounts/${account.id}/cancel`, {
      reason: 'fraud-review'
    })
});

Create support artifact in a third-party tool

await safe.external.mutation({
  provider: 'zendesk',
  resourceType: 'ticket',
  resourceId: caseId,
  operation: 'create-followup-ticket',
  method: 'POST',
  target: '/api/v2/tickets',
  reconciliationMode: 'custom',
  execute: async () =>
    zendesk.tickets.create({
      external_id: caseId,
      subject: 'Follow-up required'
    })
});

If you explicitly need the lower-level AnvilResponse surface or a workflow-local api.call shape, use adapters.external.mutation(...). If you need downstream verification for a generic mutation, register a custom reconciliation adapter.

Agent Workflow Limits

Anvil now supports first-class workflow limits for agent systems:

  • max tool calls per workflow
  • max retries per workflow
  • max spend budget per workflow
  • max recursion / step depth per workflow

These limits apply when you pass a workflowId into the guarded call. Anvil now auto-tracks tool call count, run-level retry pressure, and guarded spend for the run by default, then enforces those limits before the side effect runs. stepDepth remains optional caller-supplied context when the workflow runtime already knows nesting.

Example policy:

{
  "name": "agent-runtime-v1",
  "allowed_actions": ["api.call"],
  "workflow_limits": {
    "max_tool_calls": 20,
    "max_retries": 6,
    "max_spend": 20000,
    "max_step_depth": 12
  },
  "limits": {
    "api.call": {
      "max_retries": 5,
      "execute_timeout_ms": 8000
    }
  }
}

Plain English:

  • max_tool_calls: stop the workflow before tool call number 21
  • max_retries: stop the workflow before retry count 7
  • max_spend: stop the workflow before cumulative guarded amount goes above $200.00 in the default bundled policies
  • max_step_depth: stop runaway recursion or planner depth before step 13
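The boundaries above can be sketched as a pre-execution check. The function and counter shapes here are hypothetical and only mirror the policy field names; how Anvil tracks counters and orders its checks internally is not specified in this README.

```typescript
interface WorkflowLimits {
  max_tool_calls: number;
  max_retries: number;
  max_spend: number;      // same unit as amount (cents in the bundled policies)
  max_step_depth: number;
}

interface RunCounters {
  toolCalls: number; // guarded calls already made in this run
  retries: number;   // retries already counted in this run
  spend: number;     // cumulative guarded amount so far
}

interface NextCall {
  amount?: number;
  isRetry?: boolean;
  stepDepth?: number;
}

// Returns the limit the next call would violate, or null if it may proceed.
function checkWorkflowLimits(
  limits: WorkflowLimits,
  counters: RunCounters,
  next: NextCall
): keyof WorkflowLimits | null {
  if (counters.toolCalls + 1 > limits.max_tool_calls) return 'max_tool_calls';
  if (next.isRetry && counters.retries + 1 > limits.max_retries) return 'max_retries';
  if (counters.spend + (next.amount ?? 0) > limits.max_spend) return 'max_spend';
  if ((next.stepDepth ?? 0) > limits.max_step_depth) return 'max_step_depth';
  return null;
}

const agentLimits: WorkflowLimits = {
  max_tool_calls: 20,
  max_retries: 6,
  max_spend: 20000,
  max_step_depth: 12
};

// Tool call 20 is allowed; tool call 21 is stopped before the provider sees it.
console.log(checkWorkflowLimits(agentLimits, { toolCalls: 19, retries: 0, spend: 0 }, {})); // null
console.log(checkWorkflowLimits(agentLimits, { toolCalls: 20, retries: 0, spend: 0 }, {})); // max_tool_calls
```

The check runs against the counters plus the call about to happen, which is what lets the limit stop call 21 rather than notice it afterwards.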

Zero-Config Agent Runtime

The intended system is simple:

  • Anvil is the runtime control point in front of a risky tool call
  • agent and workflow teams use it when a step can mutate a real system, move money, or create an expensive side effect
  • they use it at the exact point where the tool would normally run
  • their workflow changes from “call the provider directly” to “call the provider through Anvil once”

Concrete example:

  • an agent tries to verify a customer five times in the same run after a planner loop starts drifting
  • the developer passes workflowId, stepId, and toolName
  • Anvil derives the business key, blocks duplicate step execution, increments the run counter, and blocks the fifth tool call before the provider sees it

What now works out of the box with the bundled policies:

  • idempotent tool steps from workflowId + stepId
  • automatic per-run tool call counting
  • automatic per-run retry counting when the same guarded work is reclaimed after failure
  • automatic per-run guarded spend accumulation when amount is provided
  • explicit status, safeToRetry, and nextAction
  • destructive-action flags with destructive and requiresConfirmation
  • durable receipts through anvil.receipts

Agent Starter

For an existing agent or workflow, the lowest-risk path is safe.agent.toolCall(...) because it preserves the normal tool return contract while still enforcing Anvil. A full starter lives in demo/agent-starter.ts.

import { safe } from '@useanvil/sdk';

await safe.agent.toolCall({
  workflow: {
    workflowId: run.id
  },
  stepId: `verify-customer-${customerId}`,
  toolName: 'kyc.verify',
  action: 'api.call',
  amount: 80,
  metadata: {
    provider: 'persona',
    customer_id: customerId
  },
  execute: async () =>
    persona.verifications.create({
      customerId
    })
});

Why this matters:

  • the workflow limits now live in policy, not scattered in agent code
  • Anvil now owns the tool-count, retry-count, and guarded-spend bookkeeping for the guarded steps
  • receipts now capture workflow context cleanly
  • the same runtime can govern both ordinary risky API calls and agent tool execution
  • if the tool succeeds, your workflow still gets the tool result instead of a new wrapper object

Direct OpenAI Tool Execution

If your team is using the OpenAI Responses API directly, the minimal safe path is:

  • generate one stable workflowId when the agent run starts
  • use the provider's function_call.call_id as the Anvil stepId
  • hand the model's function_call items to runOpenAIToolCalls(...)
  • send the returned function_call_output items back to client.responses.create(...)

This stays faithful to the normal OpenAI loop. Anvil only governs whether the risky tool is allowed to run and whether a retry should replay or stop.

import OpenAI from 'openai';
import { runOpenAIToolCalls, safe } from '@useanvil/sdk';

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const workflowId = crypto.randomUUID(); // create once per agent run

const response = await client.responses.create({
  model: 'gpt-5.2',
  input: 'Refund order ord_8821 and then cancel the internal billing account if the refund succeeds.',
  tools: [
    {
      type: 'function',
      name: 'refund_order',
      description: 'Refund a completed order exactly once.',
      parameters: {
        type: 'object',
        properties: {
          orderId: { type: 'string' },
          paymentIntentId: { type: 'string' },
          amountCents: { type: 'integer' }
        },
        required: ['orderId', 'paymentIntentId', 'amountCents']
      }
    },
    {
      type: 'function',
      name: 'cancel_billing_account',
      description: 'Cancel an account in the internal billing API exactly once.',
      parameters: {
        type: 'object',
        properties: {
          accountId: { type: 'string' },
          reason: { type: 'string' }
        },
        required: ['accountId', 'reason']
      }
    }
  ]
});

const toolCalls = response.output.filter((item) => item.type === 'function_call');

const toolOutputs = await runOpenAIToolCalls({
  calls: toolCalls,
  config: { workflowId, policy: 'agent-runtime-v1' },
  tools: {
    refund_order: {
      action: 'payment.refund',
      amount: ({ parsedArguments }) => parsedArguments.amountCents,
      metadata: ({ parsedArguments }) => ({
        provider: 'stripe',
        order_id: parsedArguments.orderId,
        payment_intent_id: parsedArguments.paymentIntentId
      }),
      execute: async ({ orderId, paymentIntentId, amountCents }) =>
        safe.payments.refund({
          provider: 'stripe',
          paymentIntentId,
          amount: amountCents,
          businessId: orderId,
          execute: async () =>
            stripe.refunds.create({
              payment_intent: paymentIntentId,
              amount: amountCents
            })
        })
    },
    cancel_billing_account: {
      action: 'api.call',
      metadata: ({ parsedArguments }) => ({
        provider: 'internal_billing_api',
        account_id: parsedArguments.accountId,
        operation: 'account.cancel'
      }),
      execute: async ({ accountId, reason }) =>
        safe.external.mutation({
          system: 'internal_billing_api',
          operation: 'account.cancel',
          method: 'POST',
          target: `/accounts/${accountId}/cancel`,
          businessId: accountId,
          execute: async () =>
            billingClient.post(`/accounts/${accountId}/cancel`, { reason })
        })
    }
  }
});

await client.responses.create({
  model: 'gpt-5.2',
  previous_response_id: response.id,
  input: toolOutputs
});

Practical rules:

  • workflowId comes from your runtime. It must stay stable across retries of the same agent run.
  • stepId comes from OpenAI's call_id. Do not replace it with a request-local UUID.
  • If Anvil blocks or cannot confirm execution, runOpenAIToolCalls(...) returns a structured function_call_output payload the model can reason about.
  • If the exact same tool call is replayed later with the same call_id, Anvil replays the original result instead of running the side effect again when replay is safe.
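
For orientation, this is roughly what a blocked outcome could look like once it is packaged for the model. The item shape (`type`, `call_id`, string `output`) follows the Responses API; the payload inside `output` is an assumption that borrows the field names from the Runtime Response Contract section of this README, and `reason` and `toFunctionCallOutput` are hypothetical names, not SDK exports.

```typescript
// Item shape the OpenAI Responses API expects for a tool result.
interface FunctionCallOutput {
  type: 'function_call_output';
  call_id: string;
  output: string; // JSON-encoded payload
}

// Assumed payload: field names borrowed from the runtime fields in this README.
interface StopPayload {
  status: 'blocked' | 'unknown';
  safeToRetry: boolean;
  nextAction: 'none' | 'retry' | 'do_not_retry' | 'reconcile' | 'review';
  reason: string; // hypothetical human-readable explanation
}

// Wraps a stop condition so it can be sent back with the same call_id.
function toFunctionCallOutput(callId: string, payload: StopPayload): FunctionCallOutput {
  return {
    type: 'function_call_output',
    call_id: callId,
    output: JSON.stringify(payload)
  };
}

const blockedItem = toFunctionCallOutput('call_abc123', {
  status: 'blocked',
  safeToRetry: false,
  nextAction: 'do_not_retry',
  reason: 'max_spend would be exceeded'
});
```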

Direct Anthropic Tool Execution

If your team is using the Anthropic Messages API directly, the minimal safe path is the same:

  • generate one stable workflowId per agent run
  • use the provider's tool_use.id as the Anvil stepId
  • hand the tool_use blocks to runAnthropicToolUses(...)
  • append the returned tool_result blocks immediately after the assistant tool-use message

import Anthropic from '@anthropic-ai/sdk';
import { runAnthropicToolUses, safe } from '@useanvil/sdk';

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const workflowId = crypto.randomUUID(); // create once per agent run

const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  tools: [
    {
      name: 'create_refund_case',
      description: 'Create a support artifact exactly once for a risky refund incident.',
      input_schema: {
        type: 'object',
        properties: {
          orderId: { type: 'string' },
          customerId: { type: 'string' },
          summary: { type: 'string' }
        },
        required: ['orderId', 'customerId', 'summary']
      }
    }
  ],
  messages: [{ role: 'user', content: 'Create a support case for the customer who saw a duplicate refund attempt.' }]
});

const toolUses = message.content.filter((block) => block.type === 'tool_use');

const toolResults = await runAnthropicToolUses({
  toolUses,
  config: { workflowId, policy: 'agent-runtime-v1' },
  tools: {
    create_refund_case: {
      action: 'api.call',
      metadata: ({ input }) => ({
        provider: 'zendesk',
        order_id: input.orderId,
        customer_id: input.customerId,
        operation: 'ticket.create'
      }),
      execute: async ({ orderId, customerId, summary }) =>
        safe.external.mutation({
          system: 'zendesk',
          operation: 'ticket.create',
          method: 'POST',
          target: '/api/v2/tickets',
          businessId: `refund-case:${orderId}`,
          execute: async () =>
            zendesk.tickets.create({
              subject: `Refund incident for ${orderId}`,
              comment: { body: summary },
              requester_id: customerId
            })
        })
    }
  }
});

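// Send the tool results back with the full conversation so far. The history
// has to start with the original user turn, and the same `tools` definitions
// from the first request should be passed again (omitted here for brevity).
await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Create a support case for the customer who saw a duplicate refund attempt.' },
    { role: 'assistant', content: message.content },
    { role: 'user', content: toolResults }
  ]
});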

Practical rules:

  • workflowId is your run identity, not Anthropic's request id.
  • tool_use.id is the step identity. That is what makes duplicate retries collapse onto the same guarded execution.
  • Blocked or unknown outcomes come back as tool_result blocks with is_error: true, so Claude can reason about the stop condition instead of blindly retrying.
  • If the tool use is retried after a successful completion, Anvil returns the prior result and does not run the side effect again.
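
The third rule is easier to picture with the block shape in front of you. The sketch below maps a guarded outcome to an Anthropic tool_result block. The block fields (`type`, `tool_use_id`, `content`, `is_error`) are the Messages API's; the `GuardedOutcome` type and `toToolResult` helper are hypothetical and only approximate what runAnthropicToolUses(...) is described as returning.

```typescript
// tool_result content block as accepted by the Anthropic Messages API.
interface ToolResultBlock {
  type: 'tool_result';
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

// Hypothetical summary of what a guarded execution produced.
type GuardedOutcome =
  | { status: 'executed' | 'duplicate'; result: unknown }
  | { status: 'blocked' | 'unknown'; reason: string };

function toToolResult(toolUseId: string, outcome: GuardedOutcome): ToolResultBlock {
  if ('result' in outcome) {
    // Executed now, or replayed from an earlier successful run of the same tool_use.id.
    return {
      type: 'tool_result',
      tool_use_id: toolUseId,
      content: JSON.stringify(outcome.result)
    };
  }
  // Blocked or unknown: flag the block as an error so the model sees a stop condition.
  return {
    type: 'tool_result',
    tool_use_id: toolUseId,
    content: JSON.stringify({ status: outcome.status, reason: outcome.reason }),
    is_error: true
  };
}
```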

Framework Integrations

Framework wrappers are the right choice when your team already lives inside that framework's tool loop and wants to keep the framework ergonomics intact.

Recommended path:

  • use the direct OpenAI and Anthropic integrations when you are already calling those provider SDKs directly
  • use withAnvilGuard(...) for Vercel AI SDK when your tool execution already runs through tool()
  • use anvilToolkit(...) for LangChain or LangGraph when your tools already run through invoke()

The trust boundary is simple:

  • workflowId must come from your runtime and stay stable across retries of the same run
  • stepId must come from a stable per-tool-call identity if the framework can provide one
  • if the framework cannot provide stable step identity, Anvil can still enforce workflow budgets, but durable per-call replay semantics become weaker

Vercel AI SDK

withAnvilGuard(...) is the production path when your tools execute through the Vercel AI SDK.

What it guarantees:

  • if the SDK provides toolCallId, Anvil uses it as the guarded stepId
  • repeated execution of the same toolCallId replays the original result when replay is safe
  • blocked or unknown outcomes throw readable tool errors so the model can adapt

What it does not guarantee unless the SDK provides toolCallId:

  • durable call-level deduplication across retries outside the current process

import { generateText, tool } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';
import { withAnvilGuard } from '@useanvil/sdk/integrations/vercel-ai';

const workflowId = request.id; // stable for this agent run

const tools = withAnvilGuard(
  {
    refund_order: tool({
      description: 'Refund an order exactly once.',
      parameters: z.object({
        orderId: z.string(),
        paymentIntentId: z.string(),
        amountCents: z.number().int().positive()
      }),
      execute: async ({ orderId, paymentIntentId, amountCents }) =>
        stripe.refunds.create({
          payment_intent: paymentIntentId,
          amount: amountCents,
          metadata: { orderId }
        })
    }),
    cancel_billing_account: tool({
      description: 'Cancel an account in the internal billing API exactly once.',
      parameters: z.object({
        accountId: z.string(),
        reason: z.string()
      }),
      execute: async ({ accountId, reason }) =>
        billingClient.post(`/accounts/${accountId}/cancel`, { reason })
    })
  },
  {
    workflowId,
    actions: {
      refund_order: 'payment.refund',
      cancel_billing_account: 'api.call'
    },
    requireToolCallId: true
  }
);

await generateText({
  model: openai('gpt-5.2'),
  tools,
  prompt: 'Refund order ord_8821 and cancel the internal billing account.'
});

Operational rule:

  • set requireToolCallId: true if you want the integration to fail fast instead of silently falling back to a generated local step id
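
The difference between failing fast and falling back can be sketched in a few lines. This is a guess at the shape of the decision, not the integration's source: `resolveStepId`, the `durable` flag, and the `local:` prefix are hypothetical names, and only `requireToolCallId` and `toolCallId` come from the text above.

```typescript
interface StepIdOptions {
  requireToolCallId?: boolean;
}

interface ResolvedStepId {
  stepId: string;
  durable: boolean; // true only when the id came from the framework
}

// Process-local counter: ids built from it do not survive a restart.
let localStepCounter = 0;

function resolveStepId(
  toolName: string,
  toolCallId: string | undefined,
  options: StepIdOptions = {}
): ResolvedStepId {
  if (toolCallId) {
    return { stepId: toolCallId, durable: true };
  }
  if (options.requireToolCallId) {
    // Fail fast rather than silently weakening call-level deduplication.
    throw new Error(`No toolCallId provided for "${toolName}"`);
  }
  localStepCounter += 1;
  return { stepId: `local:${toolName}:${localStepCounter}`, durable: false };
}
```

A locally generated id is different on every attempt, which is why it cannot collapse retries onto one guarded execution.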

LangChain And LangGraph

anvilToolkit(...) is the right wrapper when your tools already execute through LangChain.

The honest default:

  • generic LangChain does not expose a stable per-tool-call id through this wrapper surface
  • Anvil therefore generates a fresh local stepId by default
  • that still enforces workflow-level tool counts, retry budgets, and spend budgets
  • it does not give durable per-call deduplication across process restarts unless you provide a stable getStepId(...)

import { DynamicStructuredTool } from '@langchain/core/tools';
import { AgentExecutor, createOpenAIFunctionsAgent } from 'langchain/agents';
import { anvilToolkit } from '@useanvil/sdk/integrations/langchain';

const workflowId = request.id; // stable for this agent run

const tools = anvilToolkit(
  [
    new DynamicStructuredTool({
      name: 'refund_order',
      description: 'Refund an order exactly once.',
      schema: refundSchema,
      func: async ({ orderId, paymentIntentId, amountCents }) =>
        stripe.refunds.create({
          payment_intent: paymentIntentId,
          amount: amountCents,
          metadata: { orderId }
        })
    }),
    new DynamicStructuredTool({
      name: 'patch_salesforce_contact',
      description: 'Patch a Salesforce contact exactly once.',
      schema: salesforceSchema,
      func: async ({ contactId, email }) =>
        salesforce.patch(`/services/data/v60.0/sobjects/Contact/${contactId}`, {
          Email: email
        })
    })
  ],
  {
    workflowId,
    actions: {
      refund_order: 'payment.refund',
      patch_salesforce_contact: 'api.call'
    }
  }
);

If you are in LangGraph and your runtime already has durable step identity, pass it in:

const tools = anvilToolkit(myTools, {
  workflowId: graphThreadId,
  actions: {
    refund_order: 'payment.refund'
  },
  getStepId: ({ toolName, options }) => {
    const configurable = options?.configurable as { checkpoint_id?: string; task_id?: string } | undefined;
    if (!configurable?.task_id) return undefined;
    return `langgraph:${toolName}:${configurable.task_id}`;
  }
});

Operational rule:

  • if you cannot provide a durable getStepId(...), trust anvilToolkit(...) for workflow budgets and policy enforcement, not for provider-style per-call replay across restarts

Framework retry semantics:

  • replayable duplicate: framework wrapper returns the original tool result
  • non-replayable duplicate: framework wrapper throws a clear Anvil error instead of re-running the side effect
  • blocked or unknown outcome: framework wrapper throws a readable error so the framework can surface it back into the loop
  • conflict: if the winning result is replayable, the wrapper returns it; otherwise it throws and tells the caller to wait and inspect the receipt
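
The four cases above can be written as one small decision function. This is a sketch of the described behavior under assumed names (`Situation`, `GuardError`, `settle`); the wrappers' real error classes and messages are not documented here.

```typescript
// Hypothetical description of what the guard found for this tool call.
type Situation =
  | { kind: 'replayable_duplicate'; original: unknown }
  | { kind: 'non_replayable_duplicate' }
  | { kind: 'blocked' | 'unknown'; reason: string }
  | { kind: 'conflict'; winner: { replayable: boolean; result: unknown } };

class GuardError extends Error {}

// Either returns a tool result or throws; it never re-runs the side effect.
function settle(situation: Situation): unknown {
  switch (situation.kind) {
    case 'replayable_duplicate':
      return situation.original;
    case 'non_replayable_duplicate':
      throw new GuardError('Duplicate call: the side effect was not re-run.');
    case 'blocked':
    case 'unknown':
      throw new GuardError(`Tool call ${situation.kind}: ${situation.reason}`);
    case 'conflict':
      if (situation.winner.replayable) return situation.winner.result;
      throw new GuardError('Another attempt owns this step: wait, then inspect the receipt.');
  }
}
```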

Runtime Response Contract

Every guarded call now returns or persists the same high-signal runtime fields:

  • status: executed, duplicate, blocked, or unknown
  • safeToRetry: whether a retry is currently safe
  • nextAction: none, retry, do_not_retry, reconcile, or review
  • destructive: whether the action looks irreversible by default
  • requiresConfirmation: whether the caller should force an explicit confirmation step

Example unknown outcome:

const response = await agent.toolCall({
  workflow: { workflowId: run.id },
  stepId: 'issue-refund',
  toolName: 'stripe.refunds.create',
  action: 'payment.refund',
  amount: 15000,
  metadata: {
    operation: 'refund'
  },
  execute: async () => stripe.refunds.create({ payment_intent: paymentIntentId, amount: 15000 })
});

if (response.status === 'unknown') {
  console.log(response.safeToRetry, response.nextAction);
}
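
One way to consume these fields is to route on nextAction rather than on status alone. The types below mirror the five fields listed above; the `CallerMove` values and the `decide` mapping are assumptions about what a caller might reasonably do, not behavior the SDK prescribes.

```typescript
type Status = 'executed' | 'duplicate' | 'blocked' | 'unknown';
type NextAction = 'none' | 'retry' | 'do_not_retry' | 'reconcile' | 'review';

// Mirrors the runtime fields listed in this section.
interface RuntimeFields {
  status: Status;
  safeToRetry: boolean;
  nextAction: NextAction;
  destructive: boolean;
  requiresConfirmation: boolean;
}

// Hypothetical caller-side moves.
type CallerMove = 'continue' | 'retry' | 'stop' | 'reconcile' | 'escalate';

function decide(fields: RuntimeFields): CallerMove {
  switch (fields.nextAction) {
    case 'none':
      return 'continue';
    case 'retry':
      // Only retry when the runtime also says a retry is currently safe.
      return fields.safeToRetry ? 'retry' : 'stop';
    case 'do_not_retry':
      return 'stop';
    case 'reconcile':
      return 'reconcile';
    case 'review':
      return 'escalate';
  }
}
```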

Receipts API

The simplest operator surface is now:

import { anvil } from '@useanvil/sdk';

const recent = anvil.receipts.list({ action: 'payment.charge', limit: 10 });
const receipt = anvil.receipts.show(recent[0]!.id);

If you need explicit lifecycle control for long-lived runs:

const current = await anvil.workflowState.read(run.id);
await anvil.workflowState.reset(run.id);

Open the visual policy editor:

anvil policy studio

This writes a self-contained HTML policy editor you can open in a browser, tune visually, and download as valid Anvil policy JSON.

Start Here If You Are New

If you only want the easiest path to understanding the product, run:

anvil start

That writes ./artifacts/anvil-start.html, a simple guided home page that explains:

  • what Anvil does in plain English
  • which command to run first
  • how ghost mode, policies, and enforce mode fit together
  • the fastest path for business users and the fastest path for engineers

Why Teams Use It Instead Of Hand-Rolling

Teams usually do not adopt Anvil just for a Redis lock.

They adopt it because it combines:

  • duplicate prevention
  • policy enforcement before risky execution
  • action-specific required business context when a path needs stronger guardrails
  • durable execution receipts and unknown-outcome recovery
  • downstream reconciliation when teams need to verify real provider truth
  • explicit unknown-outcome handling
  • audit visibility and ghost-to-enforce rollout
  • centralized rollout control across services
  • per-action defaults like retries, amount caps, and timeouts

That is the part most in-house versions either skip or only discover after incidents.

The simple reason to wrap the call is:

  • the expensive mistake is in execution, not in the request object
  • the painful incident is the charge, refund, workflow step, or external mutation happening twice or outside policy
  • the thing teams need in production is one product that covers prevention, policy, receipts, recovery, rollout safety, and operational visibility together

That is what Anvil is for.

First Production Path

The best first path to protect is usually:

  • high-value enough to matter
  • common enough to produce real signal
  • simple enough that unknowns can be investigated quickly

Good examples:

  • payment.charge
  • payment.refund
  • one external API step with clear ownership

CLI

anvil start
anvil init
anvil init-agents
anvil mcp --config
anvil doctor
anvil policy studio
anvil ghost --sample
anvil ghost --logs ./your-logs.ndjson
anvil ghost --logs ./your-logs.ndjson --report ./artifacts/anvil-ghost-report.html
anvil control show
anvil receipts list --current-status unknown
anvil receipts inspect <receipt-id>
anvil receipts resolve <receipt-id> manual_review_required --note "Investigate provider outcome"
anvil receipts reconcile-auto <receipt-id>
anvil control resolve checkout-service payment.refund
anvil control set-mode enforce
anvil control kill-switch on
anvil control set-service-policy checkout-service payments-prod-v3
anvil control set-action-mode checkout-service payment.refund ghost
anvil audit incident --action payment.charge
anvil demo
anvil demo stripe --stripe-key sk_test_...

anvil ghost --report writes a polished HTML report you can open in a browser, hand to engineering or business, and export as JSON or CSV using the built-in download buttons.

anvil policy studio writes a visual policy editor at ./artifacts/anvil-policy-studio.html by default. Use it to add actions, set max amounts and retries, then download a policy JSON file and point ANVIL_POLICY_PATH at it.

Make A Policy By Hand

You do not need Policy Studio.

An Anvil policy is just JSON. Put a file in ./policies/ like this:

{
  "name": "my-payments-policy",
  "allowed_actions": ["payment.charge", "payment.refund", "api.call"],
  "limits": {
    "payment.charge": {
      "max_amount": 50000,
      "max_retries": 2,
      "execute_timeout_ms": 15000
    },
    "payment.refund": {
      "max_amount": 25000,
      "max_retries": 1,
      "execute_timeout_ms": 15000
    },
    "api.call": {
      "max_retries": 3,
      "execute_timeout_ms": 8000
    }
  }
}

Then point Anvil at it:

export ANVIL_POLICY_PATH=./policies/my-payments-policy.json

Then use it in code:

import { guard, key } from '@useanvil/sdk';

await guard({
  key: key.payment(orderId),
  action: 'payment.charge',
  policy: 'my-payments-policy',
  amount: 4999,
  execute: async () => stripe.charges.create({ amount: 4999, currency: 'usd' })
});

Rules of thumb:

  • allowed_actions is the allowlist.
  • max_amount is in cents.
  • max_retries is how many tries Anvil will allow.
  • allowed_services and blocked_services let you scope a policy to specific services.
  • required_metadata lets you require stable facts like actor_id, request_id, or approval_ticket.
  • custom_rule points to a local JS/MJS/CJS module when the JSON model is not enough.
  • custom_rule.run_in_ghost lets that custom code run during ghost replay when your logs contain enough context for it to be meaningful.
  • If you set max_amount, you must pass amount to guard().
  • If you set allowed_services, set ANVIL_SERVICE in the running service.
  • If you set required_metadata or use custom_rule, pass metadata to guard().
  • Every action in allowed_actions should have a matching row in limits.
  • If several policy files live in the same directory, Anvil can load them by policy name.
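
Several of these rules can be checked mechanically before a policy ships. The sketch below is a standalone lint under stated assumptions: `lintPolicy`, `checkCall`, and the `GuardCall` shape are hypothetical helpers, not part of the SDK, and they cover only the rules about limits rows, amount, and required_metadata.

```typescript
interface ActionLimits {
  max_amount?: number; // cents
  max_retries?: number;
  execute_timeout_ms?: number;
}

interface Policy {
  name: string;
  allowed_actions: string[];
  limits: Record<string, ActionLimits>;
  required_metadata?: string[];
}

// Hypothetical view of the arguments a guarded call passes.
interface GuardCall {
  action: string;
  amount?: number;
  metadata?: Record<string, unknown>;
}

// Rule: every action in allowed_actions should have a matching row in limits.
function lintPolicy(policy: Policy): string[] {
  return policy.allowed_actions
    .filter((action) => policy.limits[action] === undefined)
    .map((action) => `allowed action "${action}" has no row in limits`);
}

// Rules: the action must be allowlisted, amount is needed when max_amount is set,
// and every required_metadata key must be present.
function checkCall(policy: Policy, call: GuardCall): string[] {
  const problems: string[] = [];
  if (policy.allowed_actions.indexOf(call.action) === -1) {
    problems.push(`action "${call.action}" is not in allowed_actions`);
  }
  const row = policy.limits[call.action];
  if (row !== undefined && row.max_amount !== undefined) {
    if (call.amount === undefined) {
      problems.push('amount is required because max_amount is set');
    } else if (call.amount > row.max_amount) {
      problems.push(`amount ${call.amount} exceeds max_amount ${row.max_amount}`);
    }
  }
  for (const key of policy.required_metadata ?? []) {
    if (call.metadata === undefined || call.metadata[key] === undefined) {
      problems.push(`missing required metadata "${key}"`);
    }
  }
  return problems;
}
```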

Advanced example:

{
  "name": "anvil-payments-governed",
  "allowed_actions": ["payment.charge", "payment.refund", "api.call"],
  "allowed_services": ["checkout-service", "payments-worker"],
  "blocked_services": ["admin-dashboard"],
  "required_metadata": ["actor_id", "request_id"],
  "custom_rule": {
    "module": "./custom/anvil-payments-governed-rule.mjs",
    "export": "default",
    "timeout_ms": 500,
    "run_in_ghost": false,
    "config": {
      "approvalAmountCents": 100000,
      "requiredApprovalKey": "approval_ticket"
    }
  },
  "limits": {
    "payment.charge": { "max_amount": 200000, "max_retries": 1, "execute_timeout_ms": 12000 },
    "payment.refund": { "max_amount": 100000, "max_retries": 1, "execute_timeout_ms": 12000 },
    "api.call": { "max_retries": 2, "execute_timeout_ms": 6000 }
  }
}

And the runtime call:

await guard({
  key: key.payment(orderId),
  action: 'payment.charge',
  policy: 'anvil-payments-governed',
  amount: 150000,
  metadata: {
    actor_id: 'user-42',
    request_id: 'req-991',
    approval_ticket: 'ap-7'
  },
  execute: async () => doCharge()
});

Bundled examples:

  • ./policies/stripe-v1.json
  • ./policies/anvil-payments-starter.json
  • ./policies/anvil-payments-prod.json
  • ./policies/anvil-payments-governed.json

Central Control Plane

Phase 1 adds a minimal Redis-backed control layer for multi-service rollout consistency.

  • Set ANVIL_SERVICE=<service-name> in each service.
  • Store the central JSON document in Redis at anvil:control:config by default.
  • Runtime resolves mode in this order: kill switch, per-action override, per-service default, global default.
  • Runtime resolves policy from the service assignment first, then falls back to the local policy configuration.
  • If central control cannot be read, Anvil falls back to the existing local env and policy behavior.
  • anvil control resolve <service> <action> previews the exact resolved mode and policy before you roll anything forward.
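
The mode resolution order in the list above can be sketched as a simple precedence chain. The `ControlConfig` shape below is hypothetical (the README does not show the actual JSON stored at anvil:control:config), and because the text does not say which mode the kill switch forces, the sketch only reports that the kill switch won.

```typescript
type Mode = 'ghost' | 'enforce';

// Hypothetical shape of the central control document.
interface ControlConfig {
  killSwitch: boolean;
  globalMode: Mode;
  services?: Record<string, { mode?: Mode; actions?: Record<string, Mode> }>;
}

type ResolvedMode =
  | { source: 'kill_switch' }
  | { source: 'action_override' | 'service_default' | 'global_default'; mode: Mode };

// Precedence: kill switch, per-action override, per-service default, global default.
function resolveMode(config: ControlConfig, service: string, action: string): ResolvedMode {
  if (config.killSwitch) return { source: 'kill_switch' };
  const svc = config.services ? config.services[service] : undefined;
  const actionMode = svc && svc.actions ? svc.actions[action] : undefined;
  if (actionMode !== undefined) return { source: 'action_override', mode: actionMode };
  if (svc && svc.mode !== undefined) return { source: 'service_default', mode: svc.mode };
  return { source: 'global_default', mode: config.globalMode };
}
```

For the real resolved values, anvil control resolve remains the source of truth; a sketch like this is only useful for reasoning about which setting wins.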

ANVIL_POLICY_PATH can still point at a single file. If you place multiple policy JSON files in that same directory, Anvil can now load sibling files by policy name for per-service assignments.

If an action has a max_amount, Anvil now requires you to pass amount at runtime. That keeps money-moving actions from silently bypassing the cap.

If a policy row has execute_timeout_ms, Anvil will use that as the default timeout for the action unless the caller overrides it.

Reliability And Metrics

  • Lock TTL is automatically extended to cover the action timeout plus a safety buffer, so slow provider calls do not reopen the key early.
  • Redis Sentinel is supported through ANVIL_REDIS_SENTINELS and ANVIL_REDIS_SENTINEL_NAME if you want failover without changing the guard path.
  • If you explicitly set ANVIL_REDIS_UNAVAILABLE_BEHAVIOR=passthrough, Anvil can execute without idempotency protection when Redis is unreachable. Use this only for low-risk paths.
  • You can hook Redis failures directly with hooks.onRedisError.
  • Built-in in-process counters are available through @useanvil/sdk/sinks/metrics for Prometheus-style scraping, including latency histogram buckets, Redis error counters, and audit write failure counters.
  • anvil doctor --json now emits a machine-readable operational health report you can wire into deployment checks.
  • anvil prove idempotency is the product-facing proof command. It hammers one key under concurrency, writes ./artifacts/anvil-idempotency-proof.json, and lets anvil doctor report the last verified proof.
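
The first bullet, combined with the execute_timeout_ms default described earlier, comes down to a little arithmetic. The sketch below shows the relationship only; the 10 second fallback and 5 second buffer are made-up values for illustration, and `effectiveTimeoutMs` and `lockTtlMs` are hypothetical names.

```typescript
// Assumed values for illustration; the README does not state Anvil's real defaults.
const ASSUMED_FALLBACK_TIMEOUT_MS = 10_000;
const ASSUMED_SAFETY_BUFFER_MS = 5_000;

// A caller override wins over the policy row, which wins over the fallback.
function effectiveTimeoutMs(policyTimeoutMs?: number, callerTimeoutMs?: number): number {
  return callerTimeoutMs ?? policyTimeoutMs ?? ASSUMED_FALLBACK_TIMEOUT_MS;
}

// The lock must outlive the slowest allowed execution, hence timeout plus buffer.
function lockTtlMs(timeoutMs: number, bufferMs: number = ASSUMED_SAFETY_BUFFER_MS): number {
  return timeoutMs + bufferMs;
}

// A payment.charge row with execute_timeout_ms of 15000 and no caller override.
const chargeLockTtl = lockTtlMs(effectiveTimeoutMs(15_000));
```

If the TTL were shorter than the timeout, a slow provider call could still be in flight when the key reopens, which is the duplicate window the bullet is guarding against.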

Performance Check

Use the local benchmark harness before making a public throughput claim:

npm run build
BENCH_REQUESTS=10000 BENCH_CONCURRENCY=200 npm run bench:guard

This measures the enforced happy path against your configured Redis and prints throughput plus p50/p95/p99 latency. Treat the result as environment-specific, not a universal ceiling.

For the production-facing idempotency verification your design partner can actually run:

npm run build
ANVIL_REDIS_URL=redis://127.0.0.1:6379 anvil prove idempotency --requests 1000 --concurrency 50

That writes ./artifacts/anvil-idempotency-proof.json and prints a pass/fail line like:

PASS — 1 executed, 999 protected duplicates, 0 unknown, Redis RTT p95 3ms — idempotency held across 1000 concurrent attempts.

anvil doctor will then show the last proved date, request count, and proof result.

npm run prove:guard still exists as the older developer proof harness. Use anvil prove idempotency when you want the product surface a design partner can run and understand.

For a fuller release artifact that bundles doctor output, proof runs, and optional drills:

npm run build
ANVIL_REDIS_URL=redis://127.0.0.1:6379 npm run evidence:release

Set RELEASE_EVIDENCE_INCLUDE_DRILL=1 to include a Redis recovery drill in the output report.

Use npm run drill:redis-recovery for a local degraded-mode and recovery drill. It manages its own temporary Redis instance by default so you can validate fail-closed, passthrough, and recovery behavior without touching a shared environment.

Production Shape

Recommended baseline:

  • Run at least two app instances behind your normal service load balancer.
  • Keep Redis in the same region and low-latency network boundary as the app.
  • Use Redis Sentinel or your managed Redis failover equivalent for production deployments.
  • Expose anvil doctor --json or healthCheckDetailed() in deployment checks.
  • Expose metrics from createMetricsAggregator() to your existing telemetry stack.
  • Keep proof reports as release artifacts before making scale claims.

Repo Layout

  • src/ contains the runtime and CLI.
  • src/ui/ contains the generated product surfaces: Start, Policy Studio, and Ghost Report.
  • policies/ contains bundled starter policies.
  • demo/ contains live demo entrypoints.
  • artifacts/ is the default home for generated HTML reports and local audit output.
  • docs/product/ contains the product and spec documents.

Operations

For the short operational model and failure semantics, see docs/OPERATIONS.md.

Verification

npm run verify