# aifunctions-js

v2.5.8. Minimal `ask()` API for remote (OpenRouter) and local (llama.cpp / Transformers.js) LLMs.
Turn prompts into real functions — typed I/O, JSON-safe execution, evaluation, and instruction optimization.
- Use it as an npm library (you're in full control; no proxy required).
- Or run the included stateless REST server to expose functions over HTTP and manage the full function lifecycle.
## Terminology

- Function: canonical public term for an ability.
- Skill: internal/content-store alias for the same ability. Content is stored under `functions/<id>/...` (one folder per function). `/skills/*` endpoints are kept for backward compatibility and act as aliases for function operations.
## Why this exists
LLM "functions" start easy and get messy:
- output drifts
- JSON breaks
- edge cases show up
- you add retries, parsing, validation, model picking… and end up with glue code
aifunctions-js standardizes that work:
- Typed contracts (input/output schemas)
- JSON-only execution with retries + repair
- Evaluation (judge/rules)
- Optimization loop (run → judge → fix → repeat)
- Skill packs stored as files (weak/strong/ultra), so prompts aren't buried in code
- Functions lifecycle (create → validate → release) with score-gated promotion
## Install
Node 18+ required (see engines in package.json).
```bash
npm i aifunctions-js
```

Optional local CPU backends:

```bash
npm i node-llama-cpp            # GGUF via llama.cpp
npm i @huggingface/transformers # Transformers.js
```

## Quick start
### 1) Use prebuilt functions

```js
import { classify, summarize, matchLists } from "aifunctions-js/functions";

const { categories } = await classify({
  text: "I was charged twice this month.",
  categories: ["Billing", "Auth", "Support"],
});

const { summary } = await summarize({ text: longDoc, length: "brief" });

const r = await matchLists({
  list1: sourceFields,
  list2: targetFields,
  guidance: "Match by semantic meaning.",
});
```

### 2) Run any function by name
```js
import { run } from "aifunctions-js/functions";

const result = await run("extract-invoice-lines", { text: invoiceText });
// → { lines: [{ description: "...", amount: 4500 }, ...] }
```

Full list (manual mode): All library functions are available for programmatic use with rules and optimization. See Library functions — manual mode for the complete list and how to run them by name with a content provider or resolver.
## Core concepts

### Function packs (file-based prompts)
Functions live as files in a content store (git repo or local folder). Canonical paths use the `functions/` prefix:
```
functions/<id>/weak             # local/cheap instructions
functions/<id>/strong           # cloud/high-quality instructions (mode "normal" uses this)
functions/<id>/ultra            # optional highest-tier instructions
functions/<id>/rules            # optional judge rules (JSON)
functions/<id>/meta.json        # status: draft | released, version, scoreGate
functions/<id>/test-cases.json  # stored test cases for validate/optimize
functions/<id>/race-config.json # race defaults + winner profiles (best/cheapest/fastest/balanced)
functions/<id>/races.json       # race history (append-only, capped)
```

Content can be loaded from a git-backed resolver (e.g. a `.content` repo) or via a content provider (shared-store or inline). The library exports `createFunctionContentProvider` and `ResolverBackedContentProvider` for use with `run(skillName, request, { contentProvider, scopeId?, profile?, validateOutput? })`.
Docs: `COLLECTIONS_MAPPING.md` and `DATA_MAPPING.md` (when present in `docs/`) describe collections, schema, and data; see also `.docs/` for in-repo design docs.
Prompts are:
- reviewable in PRs
- shareable across projects
- versionable like code
### Modes
| Mode | Typical backend | Use |
| ------------------- | --------------- | ------------------------ |
| weak | local (CPU) | dev/offline/cheap |
| normal / strong | cloud | production default |
| ultra | cloud | strictest / highest-tier |
### The safety layer (JSON + validation + retries)
For any JSON-producing call, the library applies:
- extract-first JSON (prefers `` ```json `` fences, then the first balanced object/array)
- safe parsing (guards against prototype poisoning)
- optional schema validation (Ajv)
- deterministic retries (normal → JSON-only guard → fix-to-schema with errors)
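The extract-first and safe-parsing steps can be pictured with a small sketch (simplified and assumed; `extractFirstJson` and `firstBalanced` are illustrative names, not library exports): prefer a fenced JSON block, fall back to the first balanced object or array, and drop `__proto__`/`constructor` keys while parsing.

```javascript
// Sketch (assumed behavior, simplified): prefer a ```json fence, else the
// first balanced {...} or [...]; drop prototype-poisoning keys on parse.
function extractFirstJson(text) {
  const fence = text.match(/```json\s*([\s\S]*?)```/);
  const candidate = fence ? fence[1] : firstBalanced(text);
  if (candidate === null) return null;
  // Reviver returning undefined deletes the key, so "__proto__" and
  // "constructor" never survive into the parsed object.
  return JSON.parse(candidate, (key, value) =>
    key === "__proto__" || key === "constructor" ? undefined : value
  );
}

// Naive balanced-delimiter scan (ignores braces inside strings; sketch only).
function firstBalanced(text) {
  const start = text.search(/[{[]/);
  if (start === -1) return null;
  const open = text[start];
  const close = open === "{" ? "}" : "]";
  let depth = 0;
  for (let i = start; i < text.length; i++) {
    if (text[i] === open) depth++;
    else if (text[i] === close && --depth === 0) return text.slice(start, i + 1);
  }
  return null;
}
```

The real implementation also layers schema validation and retries on top; this only shows the extraction idea.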
```js
import { runJsonCompletion } from "aifunctions-js/functions";

const { parsed, text, usage } = await runJsonCompletion({
  instruction: "Extract line items from this invoice. Return JSON only.",
  options: { model: "openai/gpt-4o", maxTokens: 800 },
});
```

## Evaluation & optimization
### Judge a response
```js
import { judge } from "aifunctions-js/functions";

const verdict = await judge({
  instructions: "...",
  response: "...",
  rules: [
    { rule: "Must output valid JSON only", weight: 3 },
    { rule: "Field names must match the schema exactly", weight: 2 },
  ],
  threshold: 0.8,
  mode: "strong",
});
// verdict.pass, verdict.scoreNormalized, verdict.ruleResults
```

### Generate rules from instructions
```js
import { generateJudgeRules } from "aifunctions-js/functions";

const { rules } = await generateJudgeRules({ instructions: myPrompt });
```

Methodology: When providing good/bad examples (e.g. `POST /optimize/rules` or `optimizeJudgeRules`), include a brief rationale (why it's good or bad) when possible; it improves rule quality.
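For intuition about how weighted rules feed the verdict, `scoreNormalized` behaves like a weight-averaged per-rule score compared against the threshold. The formula below is a plausible sketch, not necessarily the library's exact implementation:

```javascript
// Sketch (assumed formula): each rule result carries a 0..1 score and the
// weight it was declared with; scoreNormalized is the weighted average.
function summarizeVerdict(ruleResults, threshold) {
  const totalWeight = ruleResults.reduce((sum, r) => sum + r.weight, 0);
  const weighted = ruleResults.reduce((sum, r) => sum + r.score * r.weight, 0);
  const scoreNormalized = totalWeight > 0 ? weighted / totalWeight : 0;
  return { scoreNormalized, pass: scoreNormalized >= threshold };
}
```

This is why heavily weighted rules (like "valid JSON only" with weight 3) dominate the pass/fail outcome.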
### Generate / improve instructions until they pass
```js
import { generateInstructions } from "aifunctions-js/functions";

const best = await generateInstructions({
  seedInstructions: myPrompt,
  testCases: [{ id: "t1", inputMd: "Invoice #1234\nConsulting: $4,500" }],
  call: "ask",
  targetModel: { model: "openai/gpt-4o-mini", class: "normal" },
  judgeThreshold: 0.8,
  targetAverageThreshold: 0.85,
  loop: { maxCycles: 5 },
  optimizer: { mode: "strong" },
});
// best.achieved, best.best.instructions, best.best.avgScoreNormalized
```

### Fix instructions from judge feedback
```js
import { fixInstructions } from "aifunctions-js/functions";

const { fixedInstructions, changes } = await fixInstructions({
  instructions: myPrompt,
  judgeFeedback: verdict,
});
```

### Compare two instruction versions
```js
import { compare } from "aifunctions-js/functions";

const result = await compare({
  instructions: baseInstructions,
  responses: [
    { id: "v1", text: responseFromVersionA },
    { id: "v2", text: responseFromVersionB },
  ],
  threshold: 0.8,
});
// result.bestId, result.ranking
```

### Benchmark models
```js
import { raceModels } from "aifunctions-js/functions";

const ranking = await raceModels({
  taskName: "invoice-lines",
  call: "askJson",
  testCases: [{ id: "t1", inputMd: "..." }],
  threshold: 0.8,
  models: [
    { id: "gpt4o", model: "openai/gpt-4o", vendor: "openai", class: "strong" },
    { id: "claude", model: "anthropic/claude-3-5-haiku", vendor: "anthropic", class: "strong" },
  ],
});
```

## Client (one API across providers)
```js
import { createClient } from "aifunctions-js";

const ai = createClient({ backend: "openrouter" });

const res = await ai.ask("Write a product tagline.", {
  model: "openai/gpt-5-nano",
  maxTokens: 200,
  temperature: 0.7,
});
// res.text, res.usage, res.model

// Or use mode and let the client resolve the model from config/env/preset:
await ai.ask("...", { mode: "strong", maxTokens: 500, temperature: 0.7 });
```

You can set the strong/normal model once via env (`LLM_MODEL_STRONG`, `LLM_MODEL_NORMAL`) or `createClient({ models: { normal, strong } })`; then `ask(..., { mode: "strong" })` uses that model without passing it every time.
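As a sketch of that resolution order (an assumption: explicit `model` option wins, then client `models` config, then the env var; `resolveModel` is illustrative, not an export of this package):

```javascript
// Sketch of mode -> model resolution. Precedence here is assumed:
// 1. explicit options.model  2. createClient models config  3. env var.
function resolveModel(mode, options = {}, clientModels = {}, env = process.env) {
  if (options.model) return options.model;
  if (clientModels[mode]) return clientModels[mode];
  const envKey = mode === "strong" ? "LLM_MODEL_STRONG" : "LLM_MODEL_NORMAL";
  return env[envKey] ?? null;
}
```

Whatever the exact precedence inside the client, the point stands: set the model once and pass only `mode` per call.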
### Backends
```js
createClient({ backend: "openrouter", models?: { normal?, strong? }, openrouter?: { apiKey?, baseUrl?, appName?, appUrl? } })
createClient({ backend: "llama-cpp", llamaCpp: { modelPath, contextSize?, threads? } })
createClient({ backend: "transformersjs", transformersjs: { modelId?, cacheDir?, device?: "cpu" } })
```

### Configuration
`.env` (all optional unless you use that backend):
```bash
OPENROUTER_API_KEY=sk-or-...
LLM_MODEL_NORMAL=gpt-5-nano   # optional; used when ask(..., { mode: "normal" })
LLM_MODEL_STRONG=gpt-5.2      # optional; used when ask(..., { mode: "strong" })
LLAMA_CPP_MODEL_PATH=./models/model.gguf
LLAMA_CPP_THREADS=6
TRANSFORMERS_JS_MODEL_ID=Xenova/distilbart-cnn-6-6
```

## REST API (optional, stateless)
Expose functions and the full lifecycle over HTTP. Request/response shapes are described in this README and in the API contract when present (e.g. `docs/API_CONTRACT.md`); server–contract sync: `docs/CONTRACT_SYNC.md`. If `docs/` is not in your clone, the REST API section below is the endpoint reference.
```bash
npm run build && npm run serve
# PORT=3780 by default
```

### Authentication
| Header | Purpose |
|---|---|
| x-api-key | Authenticates to the server — validated against LIGHT_SKILLS_API_KEY env |
| x-openrouter-key | BYOK — passed through to OpenRouter so each user can use their own key and billing |
If neither `LIGHT_SKILLS_API_KEY` nor `AIFUNCTIONS_API_KEY` is set, all requests are allowed.
### Run and health
- `GET /health` health check → `{ version, uptime, skills, hasOpenrouterKey, backends }`
- `GET /config/modes` server mode→model mapping → `{ weak, normal, strong, ultra }` (each `{ model, description }`)
- `POST /run` `{ skill, input, options }` → `{ result, usage }`
- `POST /skills/:name/run` `{ input, options }` → `{ result, usage }`
- `POST /functions/:id/run` `{ input, options }` → `{ result, usage, requestId, draft?, trace? }`
- `GET /skills` list functions + metadata (legacy alias route)
- `GET /skills/:name` function detail (legacy alias route)
- `GET /functions` list functions
- `GET /functions/:id` function detail with status, version, last validation, `currentInstructions`, `currentRules`, `currentRulesCount`

The run request body may include `options.scopeId` and `options.profile` (`best` | `cheapest` | `fastest` | `balanced`) for scope-specific model selection; `options.validateOutput: true` returns `{ result, validation }` against the library index schema. Run responses include `requestId` (the same as the attribution `traceId` when provided). When the function is in draft status, the response includes `draft: true`. Pass `options.trace: true` in the run body to receive a `trace` object with the full prompt(s), model selection, and the model used per call (for that request only; not stored). Run endpoints are rate-limited per key (see `RATE_LIMIT_PER_MINUTE`) and return `X-RateLimit-Remaining` and `X-RateLimit-Reset` headers.

Run `mode` may be `weak`, `normal`, `strong`, `ultra`, or the profile modes `best`, `cheapest`, `fastest`, `balanced`. Profile modes require a race to have been run first; otherwise the server returns `422 NO_RACE_PROFILE`.
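From any HTTP client, a run request looks roughly like this sketch (`buildRunRequest` is a hypothetical helper; the endpoint path, body shape, and auth headers follow the Authentication table and run endpoints above):

```javascript
// Sketch: assemble a POST /functions/:id/run request. x-api-key authenticates
// to the server; x-openrouter-key is optional BYOK passthrough.
function buildRunRequest(baseUrl, functionId, input, { apiKey, openrouterKey, options } = {}) {
  return {
    url: `${baseUrl}/functions/${encodeURIComponent(functionId)}/run`,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        ...(apiKey ? { "x-api-key": apiKey } : {}),
        ...(openrouterKey ? { "x-openrouter-key": openrouterKey } : {}),
      },
      body: JSON.stringify({ input, options }),
    },
  };
}
```

Then `fetch(url, init)` returns the `{ result, usage, requestId, ... }` shape described above on success.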
### Functions lifecycle
Create a function, iterate on it, validate quality, then release it to a stable versioned endpoint.
- `POST /functions` create: `{ id, seedInstructions, scoreGate?, rules? }`
- `POST /functions/:id:validate` run schema + semantic scoring → `{ passed, scoreNormalized, cases }`
- `POST /functions/:id:release` promote to released (blocked if score < scoreGate). Body may include `scopeId` for a scope-specific release.
- `POST /functions/:id:rollback` set current instructions/rules to a previous version (body: `{ version: gitRef }`; requires version APIs)
- `POST /functions/:id:optimize` rewrite instructions in-place
- `POST /functions/:id:push` push to a remote git repo (requires `SKILLS_LOCAL_PATH`)
- `GET /functions/:id/versions` instruction version history (git SHAs)
- `POST /functions/:id/versions/:version/run` run at a pinned version (ref = git SHA from the versions list)
- `GET /functions/:id/scopes` query: `scopeId` required → `{ functionId, scopeId, releases }`
- `POST /functions/:id/apply` apply an evaluation result to a scope (body: `{ scopeId?, evaluationSessionId, appliedBy? }`) → `{ appliedProfileSet, scopeId, functionId }`
- `GET /functions/:id/test-cases`
- `PUT /functions/:id/test-cases` `{ testCases: [{ id, input, expectedOutput? }] }`
- `POST /functions/:id/save-optimization` persist instructions, rules, and examples from the optimization wizard
- `POST /functions/generate-examples` `{ description, count?, mode? }` → `{ examples, usage? }`

### Race / benchmark
- `POST /race/models` race models, temperatures, or tokens (async job) — body: `skillName|prompt`, `testCases?`, `candidates|models`, `functionKey?`, `applyDefaults?`, `raceLabel?`, `notes?`, `type?` (`model` | `temperature` | `tokens`), `model?`, `temperatures?`, `tokenValues?`
- `GET /functions/:id/profiles` race winner profiles and defaults → `{ defaults, profiles: { best, cheapest, fastest, balanced } }`
- `GET /functions/:id/race-report` race history — query: `last`, `since`, `raceId` → `{ races }`

The job result for a race includes `ranked`, `raw`, `winners`, and `usage`. Runs with `mode: best|cheapest|fastest|balanced` use the stored profile for that function.
The cheapest winner is selected by pricing rate using the bundled pricing table (see below).
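A minimal sketch of that selection (the pricing-table shape and the `cheapestWinner` helper are assumptions for illustration; the real table is the bundled one described under Cost estimation):

```javascript
// Sketch: pick the candidate with the lowest combined per-million-token
// pricing rate. Models missing from the pricing table never win "cheapest".
function cheapestWinner(candidates, pricing) {
  const rate = (c) => {
    const p = pricing[c.model];
    return p ? p.inputPerMTok + p.outputPerMTok : Infinity; // unpriced loses
  };
  return candidates.reduce((best, c) => (rate(c) < rate(best) ? c : best));
}
```

The `best`, `fastest`, and `balanced` profiles are chosen from the same race results by score and latency instead of rate.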
### Optimization endpoints
- `POST /optimize/generate` generate instructions from test cases (async job)
- `POST /optimize/judge` score a response against rules → `{ pass, score, ruleResults }`
- `POST /optimize/rules` generate rules from labeled examples or instructions
- `POST /optimize/rules-optimize` optimize existing rules from examples with rationale (append/replace)
- `POST /optimize/fix` fix instructions from judge feedback → `{ fixedInstructions, changes, summary, usage, optional addedRuleBullets }`
- `POST /optimize/compare` rank 2+ responses by quality → `{ ranking, bestId, candidates }`
- `POST /optimize/instructions` one-shot instruction rewrite
- `POST /optimize/skill` rewrite one skill's instructions in-place
- `POST /optimize/batch` batch rewrite (async job)

### Jobs (for async operations)
- `GET /jobs` list recent jobs
- `GET /jobs/:id` status, progress, result
- `GET /jobs/:id/logs` streaming log lines

### Content workflows
- `POST /content/sync` sync instructions to the content store (and optionally push)
- `POST /content/index` build the library index (body: `{ prefix?: "functions/" }`) → `{ indexed, skills, errors }`
- `POST /content/index/full` build and return the full embedded library snapshot (body: `prefix?`, `staticOnly?`, `writeDocsFallback?`) → `{ fullSnapshot, ... }`; writes `.docs/library-index.full.fallback.json` by default
- `POST /content/fixtures` validate examples vs `io.output` schemas
- `POST /content/layout-lint` enforce the folder-based layout under `functions/`

### Cost estimation
All responses include `usage.costEstimate` with machine-readable cost status, confidence, reason, and source metadata. `usage.estimatedCost` is still returned for backward compatibility and mirrors `usage.costEstimate.amountUsd` when available.

- OpenRouter exact cost: when the provider response includes `usage.cost`, `costEstimate.status = "available"` and `source = "provider-response"`.
- Bundled pricing fallback: OpenAI models can be estimated from `data/openai-cost.json` (`priceVersion: "openai-cost@2026-03-05"`), with `status = "estimated"` and `source = "provider-pricing-registry"`.
- Unavailable pricing: when the cost cannot be computed, `amountUsd` is `null` and `costEstimate` includes `status = "unavailable"` with a deterministic `reasonCode` (`BACKEND_NO_PRICING`, `MODEL_PRICING_MISSING`, `PRICE_LOOKUP_FAILED`, `USAGE_INCOMPLETE`, or `NOT_COMPUTED`).
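As a sketch of the fallback arithmetic (the per-million-token pricing-entry shape here is an assumption; the real entries live in `data/openai-cost.json`):

```javascript
// Sketch: estimate cost from token usage and a per-model pricing entry
// (USD per million tokens). When pricing is missing, return the
// "unavailable" shape described above with a deterministic reasonCode.
function estimateCost(usage, pricingEntry) {
  if (!pricingEntry) {
    return { amountUsd: null, status: "unavailable", reasonCode: "MODEL_PRICING_MISSING" };
  }
  const amountUsd =
    (usage.promptTokens / 1e6) * pricingEntry.inputPerMTok +
    (usage.completionTokens / 1e6) * pricingEntry.outputPerMTok;
  return { amountUsd, status: "estimated", source: "provider-pricing-registry" };
}
```

An exact provider-reported cost (OpenRouter `usage.cost`) would bypass this path entirely with `status = "available"`.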
### Function-level cost & activity tracking
Every LLM call is automatically tagged with the function that originated it, so usage data can be attributed precisely — not just at the model or account level.
How it works:
- The server injects `functionId` automatically (e.g. `extract.requirements`, `optimize.judge`).
- Callers may optionally pass `projectId`, `traceId`, and `tags` in any POST body.
- The server embeds these as metadata in the outgoing provider request (the `user` field for OpenRouter).
- Every response returns extended attribution fields in the `usage` object.
Optional request fields (any POST endpoint that calls an LLM):
| Field | Type | Description |
|---|---|---|
| projectId | string | Logical project or tenant (e.g. "cognni-prod") |
| traceId | string | Correlation ID for distributed tracing. Auto-generated UUID if omitted. |
| tags | object | Free-form key-value metadata (string values) |
Example request:

```
POST /functions/extract.requirements/run
{
  "input": { "text": "..." },
  "projectId": "cognni-prod",
  "traceId": "req-983741",
  "tags": { "workflow": "classification", "environment": "production" }
}
```

Extended usage response:
```json
{
  "promptTokens": 240,
  "completionTokens": 82,
  "totalTokens": 322,
  "model": "openai/gpt-5-nano",
  "latencyMs": 1430,
  "estimatedCost": 0.000147,
  "costEstimate": {
    "amountUsd": 0.000147,
    "status": "estimated",
    "confidence": "medium",
    "source": "provider-pricing-registry",
    "priceVersion": "openai-cost@2026-03-05",
    "reason": "Estimated using bundled OpenAI pricing table."
  },
  "functionId": "extract.requirements",
  "projectId": "cognni-prod",
  "traceId": "req-983741",
  "tags": { "workflow": "classification", "environment": "production" }
}
```

`functionId` is always present. `projectId`, `traceId`, and `tags` appear only when provided.
The package itself remains stateless — it does not store usage history. The usage object is returned to the caller for forwarding to any external logging pipeline, analytics service, or dashboard.
### Analytics APIs
Proxy endpoints that fetch usage and cost data directly from the upstream provider. No data is stored by this server.
- `GET /models/available` list models available via OpenRouter (`x-openrouter-key` or `OPENROUTER_API_KEY`)
- `GET /activity` server-side activity log (query: `from`, `to`, `functionId`, `projectId`, `model`, `limit`) → `{ activities, summary }`
- `GET /analytics/openrouter/credits` account balance and total usage
- `GET /analytics/openrouter/generations` generation records (query: `dateMin`, `dateMax`, `model`, `userTag`, `limit`)
- `GET /analytics/openai/usage` org usage buckets, requires `OPENAI_ADMIN_KEY` (query: `startTime`, `endTime`, `groupBy`, `projectIds`, `models`, `limit`)
- `GET /analytics/openai/costs` org cost buckets, requires `OPENAI_ADMIN_KEY` (query: `startTime`, `endTime`, `groupBy`, `projectIds`, `limit`)

OpenRouter: uses the `x-openrouter-key` header (BYOK) or the `OPENROUTER_API_KEY` env var.
OpenAI: requires OPENAI_ADMIN_KEY env var (an admin-scoped key from your OpenAI org settings). Standard project keys do not have access to organization-level analytics.
Filter generations by function or project:

```
GET /analytics/openrouter/generations?userTag=cognni-prod:extract.requirements&dateMin=2026-03-01
```

The `userTag` filter matches the attribution tag this package injects into the OpenRouter `user` field, so you can pull all generations for a specific function or project:function pair.
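A sketch of how that tag is formed (the `projectId:functionId` pairing matches the `cognni-prod:extract.requirements` example above; `buildUserTag` is a hypothetical helper name):

```javascript
// Sketch: the attribution tag injected into the OpenRouter `user` field
// pairs projectId and functionId; with no projectId, the functionId stands
// alone (an assumption for the no-project case).
function buildUserTag(projectId, functionId) {
  return projectId ? `${projectId}:${functionId}` : functionId;
}
```

Using the same tag string when querying `userTag` on the generations endpoint keeps filtering exact.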
### Server env vars
| Var | Default | Description |
|---|---|---|
| PORT | 3780 | Server port |
| LIGHT_SKILLS_API_KEY | — | If set, requires x-api-key header (legacy: AIFUNCTIONS_API_KEY) |
| OPENROUTER_API_KEY | — | Default OpenRouter key (overridden per-request by x-openrouter-key) |
| MAX_CONCURRENCY | 10 | Max parallel LLM calls |
| RATE_LIMIT_PER_MINUTE | 60 | Max run requests per minute per key (BYOK or server); responses include X-RateLimit-Remaining, X-RateLimit-Reset |
| JOB_TTL | 3600 | Seconds before completed jobs are cleaned up |
| VALIDATE_SKILL_OUTPUT | 0 | If 1, all runs include schema validation |
| SKILLS_LOCAL_PATH | — | Local git path, required for :push endpoint |
| OPENAI_ADMIN_KEY | — | Admin-scoped OpenAI key, required for GET /analytics/openai/* |
Run endpoints enforce a 100 KB maximum request body (requests over the limit get a 413). The default backend timeout per LLM call is 60s; if you need a longer run timeout (e.g. 120s for the aifunction.dev free tier), configure it on your backend or client.
## Content (functions repo) workflow
```bash
npm run content:sync                      # sync instructions to .content and push
npm run content:sync:optimize             # optimize instructions (requires OPENROUTER_API_KEY)
npm run content:index                     # build library index with LLM (schemas/examples; needs OPENROUTER_API_KEY)
npm run content:index:static              # build library index from content only (no LLM)
npm run content:index:copy-fallback       # copy built index to .docs/library-index.fallback.json
npm run content:index:copy-full-fallback  # build full snapshot and write .docs/library-index.full.fallback.json
npm run content:index:real                # content:index + copy-fallback
npm run content:index:real:full           # content:index + copy-fallback + copy-full-fallback
npm run content:fixtures                  # validate examples vs io.output schemas
npm run content:layout-lint               # enforce folder-based layout under functions/
```

## Privacy & data handling
- Library usage (npm): everything runs in your environment.
- REST server: stateless by design — does not persist request/response bodies or API keys.
## Security notes
- Never commit `.env`
- Don't log provider keys
- Add `.content/` to `.gitignore`
## Testing
```bash
npm run test:unit      # unit tests only (no API key required)
npm test               # full suite
npm run typecheck      # TypeScript check
```

## Documentation
- Library functions — manual mode — full list of built-in functions and programmatic use with rules/optimization
- Library index: JSON and API — index format, update commands, and HTTP API for the catalog
- API contract and Contract sync — when present in `docs/`; otherwise the REST API section above is the source of truth for endpoints
## Links
- GitHub: nx-intelligence/light-skills (repository and homepage)
- npm package name: `aifunctions-js`
