# aifunctions-js

v2.5.8. Minimal `ask()` API for remote (OpenRouter) and local (llama.cpp / Transformers.js) LLMs.
Turn prompts into real functions — typed I/O, JSON-safe execution, evaluation, and instruction optimization.
- Use it as an npm library (you're in full control; no proxy required).
- Or run the included stateless REST server to expose functions over HTTP and manage the full function lifecycle.
## Terminology

- Function: canonical public term for an ability.
- Skill: internal/content-store alias for the same ability. Content is stored under `functions/<id>/...` (one folder per function). `/skills/*` endpoints are kept for backward compatibility and act as aliases for function operations.
## Why this exists
LLM "functions" start easy and get messy:
- output drifts
- JSON breaks
- edge cases show up
- you add retries, parsing, validation, model picking… and end up with glue code
aifunctions-js standardizes that work:
- Typed contracts (input/output schemas)
- JSON-only execution with retries + repair
- Evaluation (judge/rules)
- Optimization loop (run → judge → fix → repeat)
- Skill packs stored as files (weak/strong/ultra), so prompts aren't buried in code
- Functions lifecycle (create → validate → release) with score-gated promotion
## Install
Node 18+ required (see engines in package.json).
```bash
npm i aifunctions-js
```

Optional local CPU backends:

```bash
npm i node-llama-cpp            # GGUF via llama.cpp
npm i @huggingface/transformers # Transformers.js
```

## Quick start
### 1) Use prebuilt functions

```js
import { classify, summarize, matchLists } from "aifunctions-js/functions";

const { categories } = await classify({
  text: "I was charged twice this month.",
  categories: ["Billing", "Auth", "Support"],
});

const { summary } = await summarize({ text: longDoc, length: "brief" });

const r = await matchLists({
  list1: sourceFields,
  list2: targetFields,
  guidance: "Match by semantic meaning.",
});
```

### 2) Run any function by name
```js
import { run } from "aifunctions-js/functions";

const result = await run("extract-invoice-lines", { text: invoiceText });
// → { lines: [{ description: "...", amount: 4500 }, ...] }
```

Full list (manual mode): All library functions are available for programmatic use with rules and optimization. See Library functions — manual mode for the complete list and how to run them by name with a content provider or resolver.
## Core concepts

### Function packs (file-based prompts)
Functions live as files in a content store (git repo or local folder). Canonical paths use the `functions/` prefix:
```
functions/<id>/weak             # local/cheap instructions
functions/<id>/strong           # cloud/high-quality instructions (mode "normal" uses this)
functions/<id>/ultra            # optional highest-tier instructions
functions/<id>/rules            # optional judge rules (JSON)
functions/<id>/meta.json        # status: draft | released, version, scoreGate
functions/<id>/test-cases.json  # stored test cases for validate/optimize
functions/<id>/race-config.json # race defaults + winner profiles (best/cheapest/fastest/balanced)
functions/<id>/races.json       # race history (append-only, capped)
```

Content can be loaded from a git-backed resolver (e.g. a `.content` repo) or via a content provider (shared-store or inline). The library exports `createFunctionContentProvider` and `ResolverBackedContentProvider` for use with `run(skillName, request, { contentProvider, scopeId?, profile?, validateOutput? })`.
Docs: `COLLECTIONS_MAPPING.md` and `DATA_MAPPING.md` (when present in `docs/`) describe collections, schema, and data; see also `.docs/` for in-repo design docs.
Prompts are:
- reviewable in PRs
- shareable across projects
- versionable like code
### Modes
| Mode | Typical backend | Use |
| ------------------- | --------------- | ------------------------ |
| weak | local (CPU) | dev/offline/cheap |
| normal / strong | cloud | production default |
| ultra | cloud | strictest / highest-tier |
### The safety layer (JSON + validation + retries)
For any JSON-producing call, the library applies:
- extract-first JSON (prefers `` ```json `` fences, then the first balanced object/array)
- safe parsing (guards against prototype poisoning)
- optional schema validation (Ajv)
- deterministic retries (normal → JSON-only guard → fix-to-schema with errors)
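The extract-first and safe-parsing steps can be pictured with a small sketch (simplified and assumed; `extractFirstJson` and `firstBalanced` are illustrative names, not library exports): prefer a fenced JSON block, fall back to the first balanced object or array, and drop `__proto__`/`constructor` keys while parsing.

```javascript
// Sketch (assumed behavior, simplified): prefer a ```json fence, else the
// first balanced {...} or [...]; drop prototype-poisoning keys on parse.
function extractFirstJson(text) {
  const fence = text.match(/```json\s*([\s\S]*?)```/);
  const candidate = fence ? fence[1] : firstBalanced(text);
  if (candidate === null) return null;
  // Reviver returning undefined deletes the key, so "__proto__" and
  // "constructor" never survive into the parsed object.
  return JSON.parse(candidate, (key, value) =>
    key === "__proto__" || key === "constructor" ? undefined : value
  );
}

// Naive balanced-delimiter scan (ignores braces inside strings; sketch only).
function firstBalanced(text) {
  const start = text.search(/[{[]/);
  if (start === -1) return null;
  const open = text[start];
  const close = open === "{" ? "}" : "]";
  let depth = 0;
  for (let i = start; i < text.length; i++) {
    if (text[i] === open) depth++;
    else if (text[i] === close && --depth === 0) return text.slice(start, i + 1);
  }
  return null;
}
```

The real implementation also layers schema validation and retries on top; this only shows the extraction idea.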
```js
import { runJsonCompletion } from "aifunctions-js/functions";

const { parsed, text, usage } = await runJsonCompletion({
  instruction: "Extract line items from this invoice. Return JSON only.",
  options: { model: "openai/gpt-4o", maxTokens: 800 },
});
```

## Evaluation & optimization
### Judge a response
```js
import { judge } from "aifunctions-js/functions";

const verdict = await judge({
  instructions: "...",
  response: "...",
  rules: [
    { rule: "Must output valid JSON only", weight: 3 },
    { rule: "Field names must match the schema exactly", weight: 2 },
  ],
  threshold: 0.8,
  mode: "strong",
});
// verdict.pass, verdict.scoreNormalized, verdict.ruleResults
```

### Generate rules from instructions
```js
import { generateJudgeRules } from "aifunctions-js/functions";

const { rules } = await generateJudgeRules({ instructions: myPrompt });
```

Methodology: When providing good/bad examples (e.g. `POST /optimize/rules` or `optimizeJudgeRules`), include a brief rationale (why it's good or bad) when possible; it improves rule quality.
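For intuition about how weighted rules feed the verdict, `scoreNormalized` behaves like a weight-averaged per-rule score compared against the threshold. The formula below is a plausible sketch, not necessarily the library's exact implementation:

```javascript
// Sketch (assumed formula): each rule result carries a 0..1 score and the
// weight it was declared with; scoreNormalized is the weighted average.
function summarizeVerdict(ruleResults, threshold) {
  const totalWeight = ruleResults.reduce((sum, r) => sum + r.weight, 0);
  const weighted = ruleResults.reduce((sum, r) => sum + r.score * r.weight, 0);
  const scoreNormalized = totalWeight > 0 ? weighted / totalWeight : 0;
  return { scoreNormalized, pass: scoreNormalized >= threshold };
}
```

This is why heavily weighted rules (like "valid JSON only" with weight 3) dominate the pass/fail outcome.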
### Generate / improve instructions until they pass
```js
import { generateInstructions } from "aifunctions-js/functions";

const best = await generateInstructions({
  seedInstructions: myPrompt,
  testCases: [{ id: "t1", inputMd: "Invoice #1234\nConsulting: $4,500" }],
  call: "ask",
  targetModel: { model: "openai/gpt-4o-mini", class: "normal" },
  judgeThreshold: 0.8,
  targetAverageThreshold: 0.85,
  loop: { maxCycles: 5 },
  optimizer: { mode: "strong" },
});
// best.achieved, best.best.instructions, best.best.avgScoreNormalized
```

### Fix instructions from judge feedback
```js
import { fixInstructions } from "aifunctions-js/functions";

const { fixedInstructions, changes } = await fixInstructions({
  instructions: myPrompt,
  judgeFeedback: verdict,
});
```

### Compare two instruction versions
```js
import { compare } from "aifunctions-js/functions";

const result = await compare({
  instructions: baseInstructions,
  responses: [
    { id: "v1", text: responseFromVersionA },
    { id: "v2", text: responseFromVersionB },
  ],
  threshold: 0.8,
});
// result.bestId, result.ranking
```

### Benchmark models
```js
import { raceModels } from "aifunctions-js/functions";

const ranking = await raceModels({
  taskName: "invoice-lines",
  call: "askJson",
  testCases: [{ id: "t1", inputMd: "..." }],
  threshold: 0.8,
  models: [
    { id: "gpt4o", model: "openai/gpt-4o", vendor: "openai", class: "strong" },
    { id: "claude", model: "anthropic/claude-3-5-haiku", vendor: "anthropic", class: "strong" },
  ],
});
```

## Client (one API across providers)
```js
import { createClient } from "aifunctions-js";

const ai = createClient({ backend: "openrouter" });

const res = await ai.ask("Write a product tagline.", {
  model: "openai/gpt-5-nano",
  maxTokens: 200,
  temperature: 0.7,
});
// res.text, res.usage, res.model

// Or use mode and let the client resolve the model from config/env/preset:
await ai.ask("...", { mode: "strong", maxTokens: 500, temperature: 0.7 });
```

You can set the strong/normal model once via env (`LLM_MODEL_STRONG`, `LLM_MODEL_NORMAL`) or `createClient({ models: { normal, strong } })`; then `ask(..., { mode: "strong" })` uses that model without passing it every time.
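As a sketch of that resolution order (an assumption: explicit `model` option wins, then client `models` config, then the env var; `resolveModel` is illustrative, not an export of this package):

```javascript
// Sketch of mode -> model resolution. Precedence here is assumed:
// 1. explicit options.model  2. createClient models config  3. env var.
function resolveModel(mode, options = {}, clientModels = {}, env = process.env) {
  if (options.model) return options.model;
  if (clientModels[mode]) return clientModels[mode];
  const envKey = mode === "strong" ? "LLM_MODEL_STRONG" : "LLM_MODEL_NORMAL";
  return env[envKey] ?? null;
}
```

Whatever the exact precedence inside the client, the point stands: set the model once and pass only `mode` per call.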
### Backends
```js
createClient({ backend: "openrouter", models?: { normal?, strong? }, openrouter?: { apiKey?, baseUrl?, appName?, appUrl? } })
createClient({ backend: "llama-cpp", llamaCpp: { modelPath, contextSize?, threads? } })
createClient({ backend: "transformersjs", transformersjs: { modelId?, cacheDir?, device?: "cpu" } })
```

### Configuration
`.env` (all optional unless you use that backend):
```bash
OPENROUTER_API_KEY=sk-or-...
LLM_MODEL_NORMAL=gpt-5-nano   # optional; used when ask(..., { mode: "normal" })
LLM_MODEL_STRONG=gpt-5.2      # optional; used when ask(..., { mode: "strong" })
LLAMA_CPP_MODEL_PATH=./models/model.gguf
LLAMA_CPP_THREADS=6
TRANSFORMERS_JS_MODEL_ID=Xenova/distilbart-cnn-6-6
```

## REST API (optional, stateless)
Expose functions and the full lifecycle over HTTP. Request/response shapes are described in this README and in the API contract when present (e.g. `docs/API_CONTRACT.md`); server–contract sync: `docs/CONTRACT_SYNC.md`. If `docs/` is not in your clone, the REST API section below is the endpoint reference.
```bash
npm run build && npm run serve
# PORT=3780 by default
```

### Authentication
| Header | Purpose |
|---|---|
| x-api-key | Authenticates to the server — validated against LIGHT_SKILLS_API_KEY env |
| x-openrouter-key | BYOK — passed through to OpenRouter so each user can use their own key and billing |
If neither `LIGHT_SKILLS_API_KEY` nor `AIFUNCTIONS_API_KEY` is set, all requests are allowed.
### Run and health
- `GET /health` health check → `{ version, uptime, skills, hasOpenrouterKey, backends }`
- `GET /config/modes` server mode→model mapping → `{ weak, normal, strong, ultra }` (each `{ model, description }`)
- `POST /run` `{ skill, input, options }` → `{ result, usage }`
- `POST /skills/:name/run` `{ input, options }` → `{ result, usage }`
- `POST /functions/:id/run` `{ input, options }` → `{ result, usage, requestId, draft?, trace? }`
- `GET /skills` list functions + metadata (legacy alias route)
- `GET /skills/:name` function detail (legacy alias route)
- `GET /functions` list functions
- `GET /functions/:id` function detail with status, version, last validation, `currentInstructions`, `currentRules`, `currentRulesCount`

The run request body may include `options.scopeId` and `options.profile` (`best` | `cheapest` | `fastest` | `balanced`) for scope-specific model selection; `options.validateOutput: true` returns `{ result, validation }` against the library index schema. Run responses include `requestId` (the same as the attribution `traceId` when provided). When the function is in draft status, the response includes `draft: true`. Pass `options.trace: true` in the run body to receive a `trace` object with the full prompt(s), model selection, and the model used per call (for that request only; not stored). Run endpoints are rate-limited per key (see `RATE_LIMIT_PER_MINUTE`) and return `X-RateLimit-Remaining` and `X-RateLimit-Reset` headers.

Run `mode` may be `weak`, `normal`, `strong`, `ultra`, or the profile modes `best`, `cheapest`, `fastest`, `balanced`. Profile modes require a race to have been run first; otherwise the server returns `422 NO_RACE_PROFILE`.
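From any HTTP client, a run request looks roughly like this sketch (`buildRunRequest` is a hypothetical helper; the endpoint path, body shape, and auth headers follow the Authentication table and run endpoints above):

```javascript
// Sketch: assemble a POST /functions/:id/run request. x-api-key authenticates
// to the server; x-openrouter-key is optional BYOK passthrough.
function buildRunRequest(baseUrl, functionId, input, { apiKey, openrouterKey, options } = {}) {
  return {
    url: `${baseUrl}/functions/${encodeURIComponent(functionId)}/run`,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        ...(apiKey ? { "x-api-key": apiKey } : {}),
        ...(openrouterKey ? { "x-openrouter-key": openrouterKey } : {}),
      },
      body: JSON.stringify({ input, options }),
    },
  };
}
```

Then `fetch(url, init)` returns the `{ result, usage, requestId, ... }` shape described above on success.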
### Functions lifecycle
Create a function, iterate on it, validate quality, then release it to a stable versioned endpoint.
- `POST /functions` create: `{ id, seedInstructions, scoreGate?, rules? }`
- `POST /functions/:id:validate` run schema + semantic scoring → `{ passed, scoreNormalized, cases }`
- `POST /functions/:id:release` promote to released (blocked if score < scoreGate). Body may include `scopeId` for a scope-specific release.
- `POST /functions/:id:rollback` set current instructions/rules to a previous version (body: `{ version: gitRef }`; requires version APIs)
- `POST /functions/:id:optimize` rewrite instructions in-place
- `POST /functions/:id:push` push to a remote git repo (requires `SKILLS_LOCAL_PATH`)
- `GET /functions/:id/versions` instruction version history (git SHAs)
- `POST /functions/:id/versions/:version/run` run at a pinned version (ref = git SHA from the versions list)
- `GET /functions/:id/scopes` query: `scopeId` required → `{ functionId, scopeId, releases }`
- `POST /functions/:id/apply` apply an evaluation result to a scope (body: `{ scopeId?, evaluationSessionId, appliedBy? }`) → `{ appliedProfileSet, scopeId, functionId }`
- `GET /functions/:id/test-cases`
- `PUT /functions/:id/test-cases` `{ testCases: [{ id, input, expectedOutput? }] }`
- `POST /functions/:id/save-optimization` persist instructions, rules, and examples from the optimization wizard
- `POST /functions/generate-examples` `{ description, count?, mode? }` → `{ examples, usage? }`

### Race / benchmark
- `POST /race/models` race models, temperatures, or tokens (async job) — body: `skillName|prompt`, `testCases?`, `candidates|models`, `functionKey?`, `applyDefaults?`, `raceLabel?`, `notes?`, `type?` (`model` | `temperature` | `tokens`), `model?`, `temperatures?`, `tokenValues?`
- `GET /functions/:id/profiles` race winner profiles and defaults → `{ defaults, profiles: { best, cheapest, fastest, balanced } }`
- `GET /functions/:id/race-report` race history — query: `last`, `since`, `raceId` → `{ races }`

The job result for a race includes `ranked`, `raw`, `winners`, and `usage`. Runs with `mode: best|cheapest|fastest|balanced` use the stored profile for that function.
The cheapest winner is selected by pricing rate using the bundled pricing table (see below).
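A minimal sketch of that selection (the pricing-table shape and the `cheapestWinner` helper are assumptions for illustration; the real table is the bundled one described under Cost estimation):

```javascript
// Sketch: pick the candidate with the lowest combined per-million-token
// pricing rate. Models missing from the pricing table never win "cheapest".
function cheapestWinner(candidates, pricing) {
  const rate = (c) => {
    const p = pricing[c.model];
    return p ? p.inputPerMTok + p.outputPerMTok : Infinity; // unpriced loses
  };
  return candidates.reduce((best, c) => (rate(c) < rate(best) ? c : best));
}
```

The `best`, `fastest`, and `balanced` profiles are chosen from the same race results by score and latency instead of rate.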
### Optimization endpoints
- `POST /optimize/generate` generate instructions from test cases (async job)
- `POST /optimize/judge` score a response against rules → `{ pass, score, ruleResults }`
- `POST /optimize/rules` generate rules from labeled examples or instructions
- `POST /optimize/rules-optimize` optimize existing rules from examples with rationale (append/replace)
- `POST /optimize/fix` fix instructions from judge feedback → `{ fixedInstructions, changes, summary, usage, optional addedRuleBullets }`
- `POST /optimize/compare` rank 2+ responses by quality → `{ ranking, bestId, candidates }`
- `POST /optimize/instructions` one-shot instruction rewrite
- `POST /optimize/skill` rewrite one skill's instructions in-place
- `POST /optimize/batch` batch rewrite (async job)

### Jobs (for async operations)
- `GET /jobs` list recent jobs
- `GET /jobs/:id` status, progress, result
- `GET /jobs/:id/logs` streaming log lines

### Content workflows
- `POST /content/sync` sync instructions to the content store (and optionally push)
- `POST /content/index` build the library index (body: `{ prefix?: "functions/" }`) → `{ indexed, skills, errors }`
- `POST /content/index/full` build and return the full embedded library snapshot (body: `prefix?`, `staticOnly?`, `writeDocsFallback?`) → `{ fullSnapshot, ... }`; writes `.docs/library-index.full.fallback.json` by default
- `POST /content/fixtures` validate examples vs `io.output` schemas
- `POST /content/layout-lint` enforce the folder-based layout under `functions/`

### Cost estimation
All responses include `usage.costEstimate` with machine-readable cost status, confidence, reason, and source metadata. `usage.estimatedCost` is still returned for backward compatibility and mirrors `usage.costEstimate.amountUsd` when available.

- OpenRouter exact cost: when the provider response includes `usage.cost`, `costEstimate.status = "available"` and `source = "provider-response"`.
- Bundled pricing fallback: OpenAI models can be estimated from `data/openai-cost.json` (`priceVersion: "openai-cost@2026-03-05"`), with `status = "estimated"` and `source = "provider-pricing-registry"`.
- Unavailable pricing: when the cost cannot be computed, `amountUsd` is `null` and `costEstimate` includes `status = "unavailable"` with a deterministic `reasonCode` (`BACKEND_NO_PRICING`, `MODEL_PRICING_MISSING`, `PRICE_LOOKUP_FAILED`, `USAGE_INCOMPLETE`, or `NOT_COMPUTED`).
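As a sketch of the fallback arithmetic (the per-million-token pricing-entry shape here is an assumption; the real entries live in `data/openai-cost.json`):

```javascript
// Sketch: estimate cost from token usage and a per-model pricing entry
// (USD per million tokens). When pricing is missing, return the
// "unavailable" shape described above with a deterministic reasonCode.
function estimateCost(usage, pricingEntry) {
  if (!pricingEntry) {
    return { amountUsd: null, status: "unavailable", reasonCode: "MODEL_PRICING_MISSING" };
  }
  const amountUsd =
    (usage.promptTokens / 1e6) * pricingEntry.inputPerMTok +
    (usage.completionTokens / 1e6) * pricingEntry.outputPerMTok;
  return { amountUsd, status: "estimated", source: "provider-pricing-registry" };
}
```

An exact provider-reported cost (OpenRouter `usage.cost`) would bypass this path entirely with `status = "available"`.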
### Function-level cost & activity tracking
Every LLM call is automatically tagged with the function that originated it, so usage data can be attributed precisely — not just at the model or account level.
How it works:
- The server injects `functionId` automatically (e.g. `extract.requirements`, `optimize.judge`).
- Callers may optionally pass `projectId`, `traceId`, and `tags` in any POST body.
- The server embeds these as metadata in the outgoing provider request (the `user` field for OpenRouter).
- Every response returns extended attribution fields in the `usage` object.
Optional request fields (any POST endpoint that calls an LLM):
| Field | Type | Description |
|---|---|---|
| projectId | string | Logical project or tenant (e.g. "cognni-prod") |
| traceId | string | Correlation ID for distributed tracing. Auto-generated UUID if omitted. |
| tags | object | Free-form key-value metadata (string values) |
Example request:

```
POST /functions/extract.requirements/run
{
  "input": { "text": "..." },
  "projectId": "cognni-prod",
  "traceId": "req-983741",
  "tags": { "workflow": "classification", "environment": "production" }
}
```

Extended usage response:
```json
{
  "promptTokens": 240,
  "completionTokens": 82,
  "totalTokens": 322,
  "model": "openai/gpt-5-nano",
  "latencyMs": 1430,
  "estimatedCost": 0.000147,
  "costEstimate": {
    "amountUsd": 0.000147,
    "status": "estimated",
    "confidence": "medium",
    "source": "provider-pricing-registry",
    "priceVersion": "openai-cost@2026-03-05",
    "reason": "Estimated using bundled OpenAI pricing table."
  },
  "functionId": "extract.requirements",
  "projectId": "cognni-prod",
  "traceId": "req-983741",
  "tags": { "workflow": "classification", "environment": "production" }
}
```

`functionId` is always present. `projectId`, `traceId`, and `tags` appear only when provided.
The package itself remains stateless — it does not store usage history. The usage object is returned to the caller for forwarding to any external logging pipeline, analytics service, or dashboard.
### Analytics APIs
Proxy endpoints that fetch usage and cost data directly from the upstream provider. No data is stored by this server.
- `GET /models/available` list models available via OpenRouter (`x-openrouter-key` or `OPENROUTER_API_KEY`)
- `GET /activity` server-side activity log (query: `from`, `to`, `functionId`, `projectId`, `model`, `limit`) → `{ activities, summary }`
- `GET /analytics/openrouter/credits` account balance and total usage
- `GET /analytics/openrouter/generations` generation records (query: `dateMin`, `dateMax`, `model`, `userTag`, `limit`)
- `GET /analytics/openai/usage` org usage buckets, requires `OPENAI_ADMIN_KEY` (query: `startTime`, `endTime`, `groupBy`, `projectIds`, `models`, `limit`)
- `GET /analytics/openai/costs` org cost buckets, requires `OPENAI_ADMIN_KEY` (query: `startTime`, `endTime`, `groupBy`, `projectIds`, `limit`)

OpenRouter: uses the `x-openrouter-key` header (BYOK) or the `OPENROUTER_API_KEY` env var.
OpenAI: requires OPENAI_ADMIN_KEY env var (an admin-scoped key from your OpenAI org settings). Standard project keys do not have access to organization-level analytics.
Filter generations by function or project:

```
GET /analytics/openrouter/generations?userTag=cognni-prod:extract.requirements&dateMin=2026-03-01
```

The `userTag` filter matches the attribution tag this package injects into the OpenRouter `user` field, so you can pull all generations for a specific function or project:function pair.
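A sketch of how that tag is formed (the `projectId:functionId` pairing matches the `cognni-prod:extract.requirements` example above; `buildUserTag` is a hypothetical helper name):

```javascript
// Sketch: the attribution tag injected into the OpenRouter `user` field
// pairs projectId and functionId; with no projectId, the functionId stands
// alone (an assumption for the no-project case).
function buildUserTag(projectId, functionId) {
  return projectId ? `${projectId}:${functionId}` : functionId;
}
```

Using the same tag string when querying `userTag` on the generations endpoint keeps filtering exact.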
### Server env vars
| Var | Default | Description |
|---|---|---|
| PORT | 3780 | Server port |
| LIGHT_SKILLS_API_KEY | — | If set, requires x-api-key header (legacy: AIFUNCTIONS_API_KEY) |
| OPENROUTER_API_KEY | — | Default OpenRouter key (overridden per-request by x-openrouter-key) |
| MAX_CONCURRENCY | 10 | Max parallel LLM calls |
| RATE_LIMIT_PER_MINUTE | 60 | Max run requests per minute per key (BYOK or server); responses include X-RateLimit-Remaining, X-RateLimit-Reset |
| JOB_TTL | 3600 | Seconds before completed jobs are cleaned up |
| VALIDATE_SKILL_OUTPUT | 0 | If 1, all runs include schema validation |
| SKILLS_LOCAL_PATH | — | Local git path, required for :push endpoint |
| OPENAI_ADMIN_KEY | — | Admin-scoped OpenAI key, required for GET /analytics/openai/* |
Run endpoints enforce a 100 KB maximum request body (requests over the limit get a 413). The default backend timeout per LLM call is 60s; if you need a longer run timeout (e.g. 120s for the aifunction.dev free tier), configure it on your backend or client.
## Content (functions repo) workflow
```bash
npm run content:sync                      # sync instructions to .content and push
npm run content:sync:optimize             # optimize instructions (requires OPENROUTER_API_KEY)
npm run content:index                     # build library index with LLM (schemas/examples; needs OPENROUTER_API_KEY)
npm run content:index:static              # build library index from content only (no LLM)
npm run content:index:copy-fallback       # copy built index to .docs/library-index.fallback.json
npm run content:index:copy-full-fallback  # build full snapshot and write .docs/library-index.full.fallback.json
npm run content:index:real                # content:index + copy-fallback
npm run content:index:real:full           # content:index + copy-fallback + copy-full-fallback
npm run content:fixtures                  # validate examples vs io.output schemas
npm run content:layout-lint               # enforce folder-based layout under functions/
```

## Privacy & data handling
- Library usage (npm): everything runs in your environment.
- REST server: stateless by design — does not persist request/response bodies or API keys.
## Security notes
- Never commit `.env`
- Don't log provider keys
- Add `.content/` to `.gitignore`
## Testing
```bash
npm run test:unit      # unit tests only (no API key required)
npm test               # full suite
npm run typecheck      # TypeScript check
```

## Documentation
- Library functions — manual mode — full list of built-in functions and programmatic use with rules/optimization
- Library index: JSON and API — index format, update commands, and HTTP API for the catalog
- API contract and Contract sync — when present in `docs/`; otherwise the REST API section above is the source of truth for endpoints
## Links
- GitHub: nx-intelligence/light-skills (repository and homepage)
- npm package name: `aifunctions-js`
