@ageflow/runner-api

v0.4.1

Published

a month ago

OpenAI-compatible HTTP runner for ageflow (OpenAI, Groq, Together, Ollama, vLLM, LM Studio, Azure).


neftedollar

ai agents workflow llm dsl typescript openai groq ollama multi-agent api-runner

OpenAI-compatible HTTP runner for ageflow. Talks to any /chat/completions endpoint via fetch(). Supports multi-round tool calling internally, pluggable session storage, and returns ToolCallRecord[] for observability. Zero external dependencies.

Install

bun add @ageflow/runner-api

Quick start

import { registerRunner } from "@ageflow/core";
import { ApiRunner } from "@ageflow/runner-api";

registerRunner(
  "api",
  new ApiRunner({
    baseUrl: "https://api.openai.com/v1",
    apiKey: process.env.OPENAI_API_KEY!,
    defaultModel: "gpt-4o-mini",
  }),
);

Then use runner: "api" in any defineAgent call:

import { defineAgent } from "@ageflow/core";
import { z } from "zod";

const summarize = defineAgent({
  runner: "api",
  model: "gpt-4o-mini",
  input: z.object({ text: z.string() }),
  output: z.object({ summary: z.string() }),
  prompt: (i) =>
    `Summarize in one sentence as JSON {"summary": string}:\n\n${i.text}`,
});

Provider compatibility

| Provider | baseUrl |
|---------------|------------------------------------------------------------------------|
| OpenAI | https://api.openai.com/v1 |
| Groq | https://api.groq.com/openai/v1 |
| Together AI | https://api.together.xyz/v1 |
| Ollama | http://localhost:11434/v1 |
| vLLM | http://localhost:8000/v1 |
| LM Studio | http://localhost:1234/v1 |
| Azure OpenAI | https://<resource>.openai.azure.com/openai/deployments/<model> |

For Azure you must include ?api-version=... directly in baseUrl — the runner appends /chat/completions to baseUrl as a path segment and does not merge query parameters separately. Do not pass api-version via headers; Azure rejects requests where it appears only as a header.

Example: baseUrl: "https://<resource>.openai.azure.com/openai/deployments/<model>?api-version=2024-02-01"
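
Because the query string has to live inside baseUrl, it can help to build the Azure value in one place. A minimal sketch, assuming a hypothetical helper named azureBaseUrl (not part of @ageflow/runner-api), a placeholder resource name, and the URL shape from the table above:

```typescript
// Hypothetical helper for illustration; not exported by @ageflow/runner-api.
function azureBaseUrl(resource: string, deployment: string, apiVersion: string): string {
  // Deployment names and versions are escaped so unusual characters cannot break the URL.
  const path = `https://${resource}.openai.azure.com/openai/deployments/${encodeURIComponent(deployment)}`;
  return `${path}?api-version=${encodeURIComponent(apiVersion)}`;
}

// "my-resource" is a placeholder, not a real Azure resource.
const azureUrl = azureBaseUrl("my-resource", "gpt-4o-mini", "2024-02-01");
console.log(azureUrl);
```

The resulting string can then be passed as baseUrl when constructing the runner.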

Configuration

new ApiRunner({
  // Required
  baseUrl: "https://api.openai.com/v1",  // trailing slash is stripped automatically
  apiKey: "sk-...",

  // Optional
  defaultModel: "gpt-4o-mini",   // used when spawn() args.model is not set
  tools: {                       // tool registry — see Tool calling below
    readFile: { description: "...", parameters: { ... }, execute: async (args) => ... },
  },
  sessionStore: myStore,         // custom SessionStore — see Session persistence below
  maxToolRounds: 10,             // max tool-call loops before MaxToolRoundsError (default 10)
  requestTimeout: 120_000,       // ms before AbortController fires (default 120 000)
  headers: {                     // extra headers forwarded on every request
    "x-custom-header": "value",  // e.g. custom tracing headers
  },
  fetch: myFetchImpl,            // injectable fetch (default: globalThis.fetch)
})
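
The requestTimeout option is described above as the delay before an AbortController fires. The general pattern looks roughly like the sketch below. The names withTimeout and hangUntilAborted are hypothetical and used only for illustration; the runner's internal implementation may differ.

```typescript
// Illustrative only: a generic AbortController timeout, not the runner's source.
async function withTimeout<T>(
  ms: number,
  task: (signal: AbortSignal) => Promise<T>,
): Promise<T> {
  const controller = new AbortController();
  // Abort the in-flight work once the deadline passes.
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await task(controller.signal);
  } finally {
    // Always clear the timer so a finished task does not leave it pending.
    clearTimeout(timer);
  }
}

// Stand-in for a slow request: it settles only when the signal aborts.
function hangUntilAborted(signal: AbortSignal): Promise<string> {
  return new Promise((_resolve, reject) => {
    signal.addEventListener("abort", () => reject(new Error("request timed out")), {
      once: true,
    });
  });
}
```

With a real request, the task would typically forward the signal, for example `withTimeout(120_000, (signal) => fetch(url, { signal }))`.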

Tool calling

Register tools that the model may invoke. The runner loops internally until the model stops requesting tool calls or maxToolRounds is reached.

import { ApiRunner } from "@ageflow/runner-api";
import * as fs from "node:fs/promises";

const runner = new ApiRunner({
  baseUrl: "https://api.openai.com/v1",
  apiKey: process.env.OPENAI_API_KEY!,
  tools: {
    readFile: {
      description: "Read the contents of a file from disk",
      parameters: {
        type: "object",
        properties: { path: { type: "string", description: "Absolute file path" } },
        required: ["path"],
      },
      execute: async ({ path }) => {
        return await fs.readFile(String(path), "utf-8");
      },
    },
    writeFile: {
      description: "Write content to a file",
      parameters: {
        type: "object",
        properties: {
          path: { type: "string" },
          content: { type: "string" },
        },
        required: ["path", "content"],
      },
      execute: async ({ path, content }) => {
        await fs.writeFile(String(path), String(content), "utf-8");
        return "ok";
      },
    },
  },
});

const result = await runner.spawn({
  prompt: "Read ./README.md and summarize it in one sentence.",
  tools: ["readFile"],            // subset of registered tools exposed to model
});

console.log(result.stdout);      // final model reply
console.log(result.toolCalls);   // ToolCallRecord[] — every tool invocation
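
The loop described at the top of this section can be pictured with a small self-contained simulation. This is a sketch under stated assumptions, not the runner's code: every name here (runToolLoop, scriptedModel, TooManyRoundsError, the ModelTurn shape) is hypothetical, the model is faked, and the exact round-counting rule is a guess. Unknown tools are reported back as a result string, in line with the soft-error behaviour noted under Error types.

```typescript
type ToolArgs = Record<string, unknown>;
type ToolCall = { name: string; args: ToolArgs };
type ModelTurn = { text: string; toolCalls: ToolCall[] };
type ToolFn = (args: ToolArgs) => Promise<string>;
type CallRecord = { name: string; args: ToolArgs; result: string; durationMs: number };

class TooManyRoundsError extends Error {
  constructor(limit: number) {
    super(`model still requested tools after ${limit} rounds`);
    this.name = "TooManyRoundsError";
  }
}

async function runToolLoop(
  askModel: (toolResults: string[]) => Promise<ModelTurn>,
  tools: Map<string, ToolFn>,
  maxToolRounds = 10,
): Promise<{ stdout: string; toolCalls: CallRecord[] }> {
  const toolResults: string[] = [];
  const toolCalls: CallRecord[] = [];
  for (let round = 0; ; round++) {
    const turn = await askModel(toolResults);
    // No tool requests means the model has produced its final reply.
    if (turn.toolCalls.length === 0) return { stdout: turn.text, toolCalls };
    // Assumed rule: at most maxToolRounds rounds of tool execution.
    if (round >= maxToolRounds) throw new TooManyRoundsError(maxToolRounds);
    for (const call of turn.toolCalls) {
      const started = Date.now();
      const tool = tools.get(call.name);
      // Unknown tools are reported back to the model instead of throwing.
      const result = tool ? await tool(call.args) : `error: unknown tool "${call.name}"`;
      toolCalls.push({ name: call.name, args: call.args, result, durationMs: Date.now() - started });
      toolResults.push(result);
    }
  }
}

// Fake model: replays scripted turns in order and repeats the last one forever.
function scriptedModel(turns: ModelTurn[]): () => Promise<ModelTurn> {
  let index = 0;
  return async () => {
    const turn = turns[Math.min(index, turns.length - 1)];
    index++;
    if (!turn) throw new Error("scriptedModel needs at least one turn");
    return turn;
  };
}
```

Feeding it a script that requests readFile once and then answers yields a result with one record in toolCalls; a script that never stops requesting tools ends in TooManyRoundsError.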

Session persistence

By default each spawn() call gets a fresh UUID session handle and messages are stored in an InMemorySessionStore (lives for the lifetime of the ApiRunner instance). Pass a sessionHandle to resume a conversation:

const first = await runner.spawn({ prompt: "My name is Alice." });
// first.sessionHandle === "some-uuid"

const second = await runner.spawn({
  prompt: "What is my name?",
  sessionHandle: first.sessionHandle,
});
// second.stdout === "Your name is Alice."

Custom `SessionStore` (e.g. Redis)

import type { ChatMessage, SessionStore } from "@ageflow/runner-api";
import { createClient } from "redis";

const redis = createClient();
await redis.connect();

const redisStore: SessionStore = {
  async get(handle) {
    const raw = await redis.get(`session:${handle}`);
    return raw ? (JSON.parse(raw) as ChatMessage[]) : undefined;
  },
  async set(handle, messages) {
    await redis.set(`session:${handle}`, JSON.stringify(messages), { EX: 3600 });
  },
};

const runner = new ApiRunner({
  baseUrl: "https://api.openai.com/v1",
  apiKey: process.env.OPENAI_API_KEY!,
  sessionStore: redisStore,
});
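
For tests or single-process deployments, the same two-method shape can be backed by a Map with expiry instead of Redis. A minimal sketch follows. StoredMessage and SketchSessionStore are local stand-ins for the package's ChatMessage and SessionStore types, so the message field names are assumptions, and createExpiringStore is a hypothetical name.

```typescript
// Local stand-ins for the package types; the message field names are assumptions.
type StoredMessage = { role: string; content: string };

interface SketchSessionStore {
  get(handle: string): Promise<StoredMessage[] | undefined>;
  set(handle: string, messages: StoredMessage[]): Promise<void>;
}

// `now` is injectable so expiry can be tested without waiting.
function createExpiringStore(ttlMs: number, now: () => number = Date.now): SketchSessionStore {
  const entries = new Map<string, { messages: StoredMessage[]; expiresAt: number }>();
  return {
    async get(handle) {
      const entry = entries.get(handle);
      if (!entry) return undefined;
      if (entry.expiresAt <= now()) {
        entries.delete(handle); // lazily evict expired sessions
        return undefined;
      }
      // Return copies so callers cannot mutate the stored history.
      return entry.messages.map((message) => ({ ...message }));
    },
    async set(handle, messages) {
      entries.set(handle, {
        messages: messages.map((message) => ({ ...message })),
        expiresAt: now() + ttlMs,
      });
    },
  };
}
```

Each set() resets the expiry, mirroring the EX: 3600 option in the Redis example.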

Observability

RunnerSpawnResult.toolCalls is a ToolCallRecord[] containing every tool invocation made during the session:

const result = await runner.spawn({ prompt: "...", tools: ["readFile"] });

for (const call of result.toolCalls ?? []) {
  console.log(call.name);       // "readFile"
  console.log(call.args);       // { path: "./foo.ts" }
  console.log(call.result);     // "export const ..."
  console.log(call.durationMs); // 12
}

The executor passes toolCalls through to TaskMetrics / ExecutionTrace when present, enabling end-to-end observability without extra instrumentation.
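
When a run makes many tool calls, a per-tool summary is often easier to read than the raw list. The sketch below aggregates call counts and durations. ToolCallLike mirrors only the four fields shown in the loop above (the real ToolCallRecord may carry more), and summarizeToolCalls is a hypothetical helper, not part of the package.

```typescript
// Mirrors the four fields used above; the real ToolCallRecord may have more.
type ToolCallLike = { name: string; args: unknown; result: unknown; durationMs: number };
type ToolStats = { calls: number; totalMs: number; maxMs: number };

function summarizeToolCalls(calls: readonly ToolCallLike[]): Map<string, ToolStats> {
  const stats = new Map<string, ToolStats>();
  for (const call of calls) {
    const entry = stats.get(call.name) ?? { calls: 0, totalMs: 0, maxMs: 0 };
    entry.calls += 1;
    entry.totalMs += call.durationMs;
    entry.maxMs = Math.max(entry.maxMs, call.durationMs);
    stats.set(call.name, entry);
  }
  return stats;
}

// Made-up sample data for demonstration.
const sampleCalls: ToolCallLike[] = [
  { name: "readFile", args: { path: "./a.ts" }, result: "...", durationMs: 12 },
  { name: "readFile", args: { path: "./b.ts" }, result: "...", durationMs: 18 },
  { name: "writeFile", args: { path: "./c.ts", content: "" }, result: "ok", durationMs: 5 },
];
const summary = summarizeToolCalls(sampleCalls);
console.log(summary.get("readFile")); // { calls: 2, totalMs: 30, maxMs: 18 }
```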

Validation

runner.validate() sends GET /models and returns { ok, version?, error? }. On success, version carries the first model reported by the endpoint. Useful for health checks and pre-flight guards:

const { ok, version, error } = await runner.validate();
if (!ok) throw new Error(`API runner not reachable: ${error}`);
console.log("First available model:", version);
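
Local servers such as Ollama or vLLM may need a moment to come up, so a pre-flight guard can retry before giving up. A minimal sketch, assuming a hypothetical helper named waitUntilReady and relying on the documented behaviour that validate() reports failure through ok: false instead of throwing:

```typescript
type ValidateResult = { ok: boolean; version?: string; error?: string };

// Hypothetical helper; retries validate() with exponential backoff.
async function waitUntilReady(
  validate: () => Promise<ValidateResult>,
  attempts = 5,
  baseDelayMs = 250,
): Promise<ValidateResult> {
  let last: ValidateResult = { ok: false, error: "validate() was never called" };
  for (let attempt = 0; attempt < attempts; attempt++) {
    last = await validate();
    if (last.ok) return last;
    if (attempt < attempts - 1) {
      // Wait baseDelayMs, then 2x, 4x, ... between attempts.
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
  // Every attempt failed: hand back the last result so the caller can inspect error.
  return last;
}
```

With a real runner this would be called as `await waitUntilReady(() => runner.validate())`.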

Error types

| Error class | When thrown |
|---------------------|-------------------------------------------------------------------|
| MaxToolRoundsError | Tool-call loop exceeded maxToolRounds |
| ApiRequestError | HTTP response was non-2xx |
| ToolNotFoundError | Reserved for the executor pre-flight; the runner itself soft-errors unknown tools |

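import { MaxToolRoundsError, ApiRequestError } from "@ageflow/runner-api";

try {
  await runner.spawn({ prompt: "loop forever", tools: ["infiniteTool"] });
} catch (err) {
  if (err instanceof MaxToolRoundsError) {
    // The model kept requesting tools past maxToolRounds
    console.error("Too many tool rounds:", err.message);
  } else if (err instanceof ApiRequestError) {
    // The endpoint answered with a non-2xx status
    console.error("API request failed:", err.message);
  } else {
    throw err; // unrelated failure: do not swallow it
  }
}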

Using MCP servers

Pass MCP server configuration via mcp.servers on any defineAgent call. The API runner spawns each server as a stdio subprocess via @modelcontextprotocol/sdk. Tools are discovered at spawn time and registered in the tool-loop under the fully-qualified name mcp__<server>__<tool>.

import { defineAgent, safePath } from "@ageflow/core";
import { z } from "zod";

const fileAgent = defineAgent({
  runner: "api",
  model: "gpt-4o-mini",
  input: z.object({ query: z.string() }),
  output: z.object({ result: z.string() }),
  prompt: ({ query }) => query,
  mcp: {
    servers: [
      {
        name: "filesystem",
        command: "npx",
        args: ["-y", "@modelcontextprotocol/server-filesystem", "/workspace"],
        // Allowlist — only these tools are exposed to the model
        tools: ["read_file", "list_directory"],
        // Refine — validate path args before forwarding to the server
        refine: {
          read_file: z.object({ path: safePath({ allowAbsolute: false }) }),
        },
        // ${env:VAR} is resolved at launch time by the executor
        env: { NODE_ENV: "${env:NODE_ENV}" },
        // Keep this server alive across spawn() calls on the same runner instance
        reusePerRunner: true,
      },
    ],
  },
});

Allowlist (tools): when set, only the listed tools are added to the tool-loop registry. Unlisted tools never reach the model, and a post-dispatch guard rejects unexpected call attempts.

Refine (refine): a map of tool name → Zod schema. Arguments are validated against the schema before the call is dispatched. Use safePath() to prevent path traversal.

Environment expansion (env): values of the form ${env:VAR} are replaced with the corresponding process environment variable at launch time.
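
The naming, allowlist, and environment rules above are simple enough to sketch as three small functions. None of these names are exported by @ageflow/core or @ageflow/runner-api. Passing every discovered tool when no allowlist is set and throwing on an unset variable are assumptions made for this sketch, since the text above does not specify either case.

```typescript
// Illustrative helpers only; the actual implementation may differ.

/** Builds the fully-qualified tool name, e.g. mcp__filesystem__read_file. */
function qualifiedToolName(server: string, tool: string): string {
  return `mcp__${server}__${tool}`;
}

/** Keeps only allowlisted tools; without an allowlist every discovered tool passes (assumption). */
function applyAllowlist(discovered: string[], allowlist?: string[]): string[] {
  if (!allowlist) return discovered;
  const allowed = new Set(allowlist);
  return discovered.filter((tool) => allowed.has(tool));
}

/** Replaces each ${env:VAR} placeholder with its value from `env`; throws if VAR is unset (assumption). */
function expandEnv(
  values: Record<string, string>,
  env: Record<string, string | undefined>,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(values)) {
    out[key] = value.replace(/\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match, name: string) => {
      const resolved = env[name];
      if (resolved === undefined) throw new Error(`environment variable ${name} is not set`);
      return resolved;
    });
  }
  return out;
}

const exposed = applyAllowlist(
  ["read_file", "write_file", "list_directory"],
  ["read_file", "list_directory"],
).map((tool) => qualifiedToolName("filesystem", tool));
console.log(exposed); // [ 'mcp__filesystem__read_file', 'mcp__filesystem__list_directory' ]
```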

`reusePerRunner` — server lifecycle pooling

By default each spawn() call starts its own MCP server subprocesses and stops them when the call completes. Set reusePerRunner: true on a server to keep it alive in a per-runner pool and reuse it across all spawn() calls on the same ApiRunner instance. This avoids repeated cold-start overhead for servers that are expensive to initialize.

// Server stays up across calls — warm on every spawn()
{ name: "filesystem", command: "npx", args: [...], reusePerRunner: true }
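
Conceptually, a per-runner pool is a map from server name to a lazily started handle. The sketch below illustrates that idea with hypothetical names (ServerPool, ServerHandle) and a caller-supplied start function. It is not the runner's implementation, and error handling for failed starts is left out for brevity.

```typescript
// Stand-in for a running MCP server subprocess.
interface ServerHandle {
  stop(): Promise<void>;
}

class ServerPool {
  // Promises are stored so that concurrent acquire() calls share one start.
  private readonly servers = new Map<string, Promise<ServerHandle>>();

  /** Starts the named server on first use and returns the same handle afterwards. */
  acquire(name: string, start: () => Promise<ServerHandle>): Promise<ServerHandle> {
    let pending = this.servers.get(name);
    if (!pending) {
      pending = start();
      this.servers.set(name, pending);
    }
    return pending;
  }

  /** Stops every pooled server and empties the pool; a second call finds nothing to stop. */
  async shutdown(): Promise<void> {
    const pending = Array.from(this.servers.values());
    this.servers.clear();
    const handles = await Promise.all(pending);
    await Promise.all(handles.map((handle) => handle.stop()));
  }
}
```

Two acquire() calls with the same name trigger a single start, which is where the cold-start saving comes from.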

`runner.shutdown()` — draining the pool

runner.shutdown() is process-scoped — it is called automatically by the AgentFlow CLI (agentwf run) and the server's close() method at process exit. You do not need to call it manually when using those entry points.

If you are using ApiRunner directly (outside the CLI or server), call shutdownAllRunners() from @ageflow/core when your process exits:

import { shutdownAllRunners } from "@ageflow/core";

process.on("SIGTERM", async () => {
  await shutdownAllRunners();
  process.exit(0);
});

API reference

`new ApiRunner(config: ApiRunnerConfig)`

Creates a new runner instance. All config fields except baseUrl and apiKey are optional.

`runner.validate(): Promise<{ ok: boolean; version?: string; error?: string }>`

Checks connectivity by calling GET /models. Returns ok: false on any error (network, 4xx, 5xx) — never throws.

`runner.spawn(args: RunnerSpawnArgs): Promise<RunnerSpawnResult>`

Executes a prompt, optionally resuming a session, and loops until the model produces a non-tool-call response. Returns stdout (final text), sessionHandle, tokensIn, tokensOut, and toolCalls.

`runner.shutdown(): Promise<void>`

Stops all pooled MCP server subprocesses (reusePerRunner: true) and clears the pool. Per-spawn servers are already stopped by spawn() itself — only the pool requires an explicit shutdown() call. Safe to call more than once.

License

MIT
