llm-runtime

v0.6.6

Runtime layer for application-owned LLM workflows with tool orchestration, MCP integration, and skill loading.

llm-runtime is a TypeScript runtime layer for application-owned LLM workflows. It gives a host app one package boundary for provider calls, tool execution, MCP discovery, skills, and bounded agentic completion.

The package is intentionally not a full agent product. Your app still owns UI, persistence, permissions, transcript storage, workspace lifetime, and business policy. llm-runtime owns the provider/tool loop mechanics that should not be reimplemented in every harness.

For a human-oriented walkthrough of the codebase, start with the local project wiki: .wiki/index.md.

Installation

npm install llm-runtime

The package is ESM-only, targets Node.js 18+, and exposes a single root entrypoint.

Public Surface

The root entrypoint exports four runtime functions:

  • generate(...)
  • complete(...)
  • streamComplete(...)
  • createRuntime(...)

It also exports the public type set for providers, messages, tools, runtime options, completion results, stream events, MCP config, and provider config.

Lower-level loop internals, direct provider clients, recovery prompts, validation helpers, and tool-resolution helpers are not part of the root public API. If an app needs stable reusable dependencies and tool execution helpers, create a runtime and use the methods on that runtime instance.

Providers

Supported provider names:

  • openai
  • anthropic
  • google
  • azure
  • xai
  • openai-compatible
  • ollama

Provider configuration can be passed per call, through a provider map, or through a reusable runtime:

import { generate } from 'llm-runtime';

const response = await generate({
  provider: 'openai',
  model: 'gpt-5',
  providers: {
    openai: {
      apiKey: process.env.OPENAI_API_KEY!,
    },
  },
  messages: [
    { role: 'user', content: 'Summarize this in one paragraph.' },
  ],
});

console.log(response.content);

Runtime Model

Use createRuntime(...) when your app wants stable provider config, MCP config, skill roots, defaults, and registry state across many calls.

Put stable harness state in the runtime:

  • provider config
  • MCP config or MCP registry
  • skill roots or skill registry
  • default reasoningEffort
  • default toolPermission

Keep request-local state per call:

  • provider
  • model
  • messages
  • temperature
  • maxTokens
  • context.workingDirectory
  • context.abortSignal
  • webSearch
  • per-call builtIns, extraTools, or tools

Single-Turn Generation

generate(...) performs one provider call. It resolves the requested tool surface and passes it to the provider, but it does not execute returned tool calls or continue the conversation. Use it when the host wants to own the loop.

The returned LLMResponse is either:

  • type: 'text' with content
  • type: 'tool_calls' with tool_calls

Use complete(...) when the runtime should own repeated model calls, tool execution, and terminal control-tool handling.

In short: generate(...) asks the model once; complete(...) keeps working until the runtime reaches a completion, user-input, blocked, or bounded-stop condition.
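Because generate(...) leaves returned tool calls unexecuted, a host that owns the loop has to branch on the response type. The following is a minimal sketch of that branch, using local stand-in types rather than the package's exported ones. The tool-call fields (id, name, arguments) and the HostAction shape are assumptions made for illustration.

```typescript
// Local stand-ins for the package's types; the tool-call field names are assumed.
interface SketchToolCall {
  id: string;
  name: string;
  arguments: Record<string, unknown>;
}

// Mirrors the two response variants described above.
type SketchLLMResponse =
  | { type: 'text'; content: string }
  | { type: 'tool_calls'; tool_calls: SketchToolCall[] };

// Hypothetical host-side decision type, not part of llm-runtime.
type HostAction =
  | { kind: 'show_text'; text: string }
  | { kind: 'run_tools'; toolNames: string[] };

// Decide what the host should do next with a single generate(...) result.
function planHostAction(response: SketchLLMResponse): HostAction {
  if (response.type === 'text') {
    return { kind: 'show_text', text: response.content };
  }
  // generate(...) does not execute tools, so the host must run these itself.
  return {
    kind: 'run_tools',
    toolNames: response.tool_calls.map((call) => call.name),
  };
}
```

A host that does not want to write this branch and the follow-up model calls itself is the case complete(...) is meant for.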

Example:

import { createRuntime } from 'llm-runtime';

const runtime = createRuntime({
  providers: {
    openai: {
      apiKey: process.env.OPENAI_API_KEY!,
    },
  },
  skillRoots: ['/app/skills', '/workspace/.codex/skills'],
  defaults: {
    reasoningEffort: 'medium',
    toolPermission: 'auto',
  },
  mcpConfig: {
    servers: {
      docs: {
        command: 'node',
        args: ['docs-server.js'],
        transport: 'stdio',
      },
    },
  },
});

const response = await runtime.generate({
  provider: 'openai',
  model: 'gpt-5',
  messages: [
    { role: 'user', content: 'Read the project and identify the main runtime boundary.' },
  ],
  context: {
    workingDirectory: process.cwd(),
  },
  builtIns: {
    read_file: true,
    list_files: true,
    search_files: true,
  },
});

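if (response.type === 'text') {
  console.log(response.content);
} else {
  // generate(...) does not execute tools; the host must handle these calls.
  console.log(response.tool_calls);
}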

await runtime.dispose();

Completion

complete(...) owns a bounded model/tool loop. It retries weak non-progressing responses, executes known tools, injects the runtime completion contract, and terminates through internal control tools:

  • final_answer
  • blocked

Those control tools are runtime-reserved. Do not define app tools with those names.

The loop continues under these rules:

  • normal tool call: execute the tool, append the tool result, and call the model again
  • final_answer: stop with status: 'completed'
  • ask_user_input: stop with status: 'tool_calls' so the host can ask/resume
  • known custom tool without an executor: stop with status: 'tool_calls' so the host can run it and resume
  • blocked: stop with status: 'failed'
  • plain narration or intent text: keep going; narration is not completion
  • empty text: retry according to emptyTextRetryLimit
  • missing required action evidence: reject premature final text or final_answer and continue with recovery guidance
  • host mutating tool exposed: require a host mutating tool result before accepting final completion
  • repeated identical tool calls or maxIterations: stop with the corresponding bounded failure
  • host cancellation through context.abortSignal: abort the active model/tool path when the host decides the task should stop

builtIns only changes which package-owned tools are available. It does not decide whether the loop itself runs. builtIns: false with host-supplied extraTools or tools still gives complete(...) a valid loop: host tools remain executable, and the runtime still injects final_answer and blocked.
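The loop rules above can be read as a decision function over one model turn. The sketch below is a simplified model of those rules, not the package's implementation: the ModelTurn and LoopDecision types are hypothetical, and evidence checks, repeated-call detection, maxIterations, and abort handling are left out. Treating an exhausted empty-text retry budget as a failure is also an assumption.

```typescript
// Simplified view of one model turn; these shapes are illustrative only.
type ModelTurn =
  | { kind: 'tool_call'; toolName: string; hasExecutor: boolean }
  | { kind: 'text'; text: string };

type LoopDecision =
  | { action: 'execute_and_continue' }
  | { action: 'continue' }
  | { action: 'retry_empty' }
  | { action: 'stop'; status: 'completed' | 'tool_calls' | 'failed' };

function decideNextStep(
  turn: ModelTurn,
  emptyRetriesUsed: number,
  emptyTextRetryLimit: number,
): LoopDecision {
  if (turn.kind === 'text') {
    if (turn.text.trim() === '') {
      // Empty text is retried up to the limit; failing afterwards is assumed.
      return emptyRetriesUsed < emptyTextRetryLimit
        ? { action: 'retry_empty' }
        : { action: 'stop', status: 'failed' };
    }
    // Plain narration is not completion, so the loop keeps going.
    return { action: 'continue' };
  }

  switch (turn.toolName) {
    case 'final_answer':
      return { action: 'stop', status: 'completed' };
    case 'blocked':
      return { action: 'stop', status: 'failed' };
    case 'ask_user_input':
      // The host asks the user and resumes later.
      return { action: 'stop', status: 'tool_calls' };
    default:
      // A known tool without an executor is handed back to the host.
      return turn.hasExecutor
        ? { action: 'execute_and_continue' }
        : { action: 'stop', status: 'tool_calls' };
  }
}
```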

complete(...) returns:

  • status: 'completed' with output when the model reaches final_answer
  • status: 'tool_calls' when the host must handle a tool call, most commonly a request for user input
  • status: 'failed' when the run is blocked, invalid, or otherwise cannot complete
  • status: 'max_iterations' when loop bounds stop the run
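A host can handle all four statuses exhaustively so that a new status cannot be silently ignored. The sketch below uses a local result type. The output and toolCalls fields follow the names used in this README, while the string type of output, the reason field, and the tool-call shape are assumptions.

```typescript
// Local stand-in for the completion result; `reason` is a hypothetical field.
type SketchCompletionResult =
  | { status: 'completed'; output: string }
  | { status: 'tool_calls'; toolCalls: Array<{ id: string; name: string }> }
  | { status: 'failed'; reason?: string }
  | { status: 'max_iterations' };

// Turn a result into a one-line summary the host could log or display.
function summarizeResult(result: SketchCompletionResult): string {
  switch (result.status) {
    case 'completed':
      return `done: ${result.output}`;
    case 'tool_calls':
      return `host must handle: ${result.toolCalls.map((call) => call.name).join(', ')}`;
    case 'failed':
      return `failed: ${result.reason ?? 'unknown reason'}`;
    case 'max_iterations':
      return 'stopped: loop bound reached';
    default: {
      // Compile-time check that every status is covered.
      const unreachable: never = result;
      return unreachable;
    }
  }
}
```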

Example:

import { createRuntime } from 'llm-runtime';

const runtime = createRuntime({
  providers: {
    openai: {
      apiKey: process.env.OPENAI_API_KEY!,
    },
  },
});

const result = await runtime.complete({
  provider: 'openai',
  model: 'gpt-5',
  messages: [
    { role: 'user', content: 'Inspect the workspace and tell me where runtime completion is implemented.' },
  ],
  context: {
    workingDirectory: process.cwd(),
  },
  builtIns: {
    read_file: true,
    search_files: true,
    path_exists: true,
  },
  maxIterations: 12,
});

if (result.status === 'completed') {
  console.log(result.output);
}

if (result.status === 'tool_calls') {
  console.log(result.toolCalls);
}

streamComplete(...) runs the same completion path and yields lifecycle events. It emits model/tool events plus provider text, reasoning, tool-call argument, and final answer deltas when the provider adapter supplies them:

  • model_start
  • text_delta
  • reasoning_delta
  • tool_call_delta
  • answer_delta
  • assistant_message
  • tool_start
  • tool_result
  • tool_error
  • tool_calls
  • completed
  • failed
  • raw

for await (const event of runtime.streamComplete({
  provider: 'openai',
  model: 'gpt-5',
  messages: [
    { role: 'user', content: 'Use tools if needed, then give the final answer.' },
  ],
  builtIns: {
    read_file: true,
    search_files: true,
    path_exists: true,
  },
})) {
  if (event.type === 'text_delta' || event.type === 'answer_delta') {
    process.stdout.write(event.delta);
  }

  if (event.type === 'completed') {
    console.log(event.result.output);
  }
}
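If the host needs more than live printing, the events can be folded into a summary as they arrive. The reducer below is a sketch over a subset of the event types listed above. The delta field matches the example above, while toolName on tool_start and the StreamSummary shape are assumptions.

```typescript
// Subset of the stream events; payload fields other than `delta` are assumed.
type SketchStreamEvent =
  | { type: 'text_delta'; delta: string }
  | { type: 'answer_delta'; delta: string }
  | { type: 'tool_start'; toolName: string }
  | { type: 'completed' }
  | { type: 'failed' }
  | { type: 'raw' };

interface StreamSummary {
  text: string;
  answer: string;
  toolsStarted: string[];
  finished: 'completed' | 'failed' | null;
}

const emptySummary: StreamSummary = {
  text: '',
  answer: '',
  toolsStarted: [],
  finished: null,
};

// Pure reducer: returns a new summary instead of mutating the old one.
function applyEvent(summary: StreamSummary, event: SketchStreamEvent): StreamSummary {
  switch (event.type) {
    case 'text_delta':
      return { ...summary, text: summary.text + event.delta };
    case 'answer_delta':
      return { ...summary, answer: summary.answer + event.delta };
    case 'tool_start':
      return { ...summary, toolsStarted: [...summary.toolsStarted, event.toolName] };
    case 'completed':
    case 'failed':
      return { ...summary, finished: event.type };
    default:
      // Events this sketch does not track (here: raw) leave the summary unchanged.
      return summary;
  }
}
```

Inside the for await loop, the host would keep a summary variable and reassign it with applyEvent(summary, event) on each iteration.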

Tools

Tool sources are merged into one model-facing surface:

  • built-in runtime tools
  • app-provided extraTools
  • app-provided tools
  • MCP tools discovered from configured servers

Built-in tool names are reserved:

  • shell_cmd
  • load_skill
  • ask_user_input
  • web_fetch
  • read_file
  • write_file
  • list_files
  • search_files
  • create_directory
  • path_exists

Built-ins default to all package-owned tools for host convenience. Pass false to disable them, or pass a narrow map when the task should expose less:

  • omitting builtIns enables every built-in tool
  • builtIns: false enables no built-in tools
  • builtIns: true enables every built-in tool
  • pass an explicit per-tool map such as { read_file: true, search_files: true }
  • string shorthand modes such as builtIns: 'all' and builtIns: 'read-only' are not supported
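The resolution rules above can be sketched as a small function. This is an illustration of the documented behavior, not the package's code. It assumes that a tool left out of an explicit map is disabled, which the phrase "narrow map" suggests but does not state outright.

```typescript
// The reserved built-in names listed above.
const BUILT_IN_NAMES = [
  'shell_cmd',
  'load_skill',
  'ask_user_input',
  'web_fetch',
  'read_file',
  'write_file',
  'list_files',
  'search_files',
  'create_directory',
  'path_exists',
] as const;

type BuiltInName = (typeof BUILT_IN_NAMES)[number];
type BuiltInsOption = boolean | Partial<Record<BuiltInName, boolean>> | undefined;

// Resolve a builtIns option to the list of enabled built-in tool names.
function enabledBuiltIns(option: BuiltInsOption): BuiltInName[] {
  // Omitted or true: every built-in is enabled.
  if (option === undefined || option === true) {
    return [...BUILT_IN_NAMES];
  }
  // false: no built-ins at all.
  if (option === false) {
    return [];
  }
  // Explicit map: only tools set to true (unlisted tools assumed disabled).
  const map = option;
  return BUILT_IN_NAMES.filter((name) => map[name] === true);
}
```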

Use small, task-specific maps:

const readOnlyBuiltIns = {
  load_skill: true,
  list_files: true,
  search_files: true,
  read_file: true,
  path_exists: true,
};

const writeFileBuiltIns = {
  ...readOnlyBuiltIns,
  create_directory: true,
  write_file: true,
};

const commandBuiltIns = {
  ...writeFileBuiltIns,
  shell_cmd: true,
};

Do not use a broad preset for ordinary file inspection. Opt into write or command tools only when the task needs them, as in this request that has to run a test command:

const result = await runtime.complete({
  provider: 'openai',
  model: 'gpt-5',
  messages: [
    { role: 'user', content: 'Run the project test command and summarize the result.' },
  ],
  context: {
    workingDirectory: process.cwd(),
  },
  builtIns: {
    read_file: true,
    search_files: true,
    path_exists: true,
    create_directory: true,
    write_file: true,
    shell_cmd: true,
  },
});

toolPermission: 'read' is a hard read-only boundary for package-owned mutating tools. It blocks write_file, create_directory, and shell_cmd even if those built-ins are exposed.
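The read-only boundary can be pictured as a check applied at execution time, after exposure has been decided. The sketch below is hypothetical and only covers the two permission values that appear in this README ('read' and 'auto'). The package may define others, and the real check may live elsewhere.

```typescript
// Package-owned mutating tools named in the paragraph above.
const MUTATING_BUILT_INS: ReadonlySet<string> = new Set([
  'write_file',
  'create_directory',
  'shell_cmd',
]);

// Only the values seen in this README; other values may exist.
type SketchToolPermission = 'read' | 'auto';

// Returns false when a read-only permission should block the call,
// even if the tool was exposed through builtIns.
function isBuiltInCallAllowed(toolName: string, permission: SketchToolPermission): boolean {
  if (permission === 'read' && MUTATING_BUILT_INS.has(toolName)) {
    return false;
  }
  return true;
}
```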

Prefer structured workspace tools over shell_cmd for routine file work:

  • list_files for directory listing
  • search_files for glob-like discovery
  • read_file for paginated file reads
  • path_exists for file or directory checks
  • create_directory for recursive directory creation when enabled

App tools can be passed as extraTools or tools. They are additive; they cannot override reserved built-ins or completion control tools.

const result = await runtime.complete({
  provider: 'openai',
  model: 'gpt-5',
  messages: [
    { role: 'user', content: 'Look up customer c_123 and summarize the account state.' },
  ],
  extraTools: [
    {
      name: 'lookup_customer',
      description: 'Look up a customer by id.',
      evidenceKind: 'read',
      parameters: {
        type: 'object',
        properties: {
          customerId: { type: 'string' },
        },
        required: ['customerId'],
        additionalProperties: false,
      },
      execute: async ({ customerId }) => {
        return { customerId, plan: 'enterprise', status: 'active' };
      },
    },
  ],
});
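Since app tools cannot override reserved names, a host may want to catch collisions before building a request. The helper below is a sketch of such a pre-flight check. It is not part of llm-runtime, and this README does not say how the runtime itself reacts to a collision.

```typescript
// Built-in tool names plus the two completion control tools.
const RESERVED_TOOL_NAMES: ReadonlySet<string> = new Set([
  'shell_cmd',
  'load_skill',
  'ask_user_input',
  'web_fetch',
  'read_file',
  'write_file',
  'list_files',
  'search_files',
  'create_directory',
  'path_exists',
  'final_answer',
  'blocked',
]);

// Return the names of app tools that collide with reserved names.
function findReservedNameConflicts(tools: ReadonlyArray<{ name: string }>): string[] {
  return tools.map((tool) => tool.name).filter((name) => RESERVED_TOOL_NAMES.has(name));
}
```

An empty result means the extraTools or tools array is safe to pass as far as naming goes.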

Runtime instances also expose resolveTools(...), executeToolCall(...), and executeToolCalls(...) for hosts that need to inspect or run the effective tool surface outside complete(...).

Human Input

ask_user_input is the public human-input tool contract. It uses a structured questions[] payload:

{
  type?: "single-select" | "multiple-select";
  allowSkip?: boolean;
  questions: Array<{
    header: string;
    id: string;
    question: string;
    options: Array<{
      id: string;
      label: string;
      description?: string;
    }>;
  }>;
}

If you use a narrow builtIns map, include ask_user_input in it when the model is allowed to ask the host for a human decision:

builtIns: {
  ask_user_input: true,
}

When completion needs host-owned user input, it returns status: 'tool_calls'. The host should surface the question, then resume by appending a normal tool-result message for the pending tool call and calling complete(...) again with the updated message list.

const resumedMessages = [
  ...result.messages,
  {
    role: 'tool',
    tool_call_id: result.toolCalls![0].id,
    content: JSON.stringify({
      answers: {
        scope: 'all',
      },
    }),
  },
];
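Before resuming, a host can check the collected answers against the questions it was asked. The sketch below assumes answers are keyed by question id and hold a single option id, which matches the answers: { scope: 'all' } example above but is not confirmed for multiple-select or allowSkip. The helper and its types are hypothetical.

```typescript
// Question shape taken from the payload above.
interface SketchQuestion {
  id: string;
  header: string;
  question: string;
  options: Array<{ id: string; label: string; description?: string }>;
}

interface SketchToolMessage {
  role: 'tool';
  tool_call_id: string;
  content: string;
}

// Validate single-select answers and build the tool-result message.
function buildAnswerMessage(
  toolCallId: string,
  questions: SketchQuestion[],
  answers: Record<string, string>,
): SketchToolMessage {
  for (const question of questions) {
    const chosen = answers[question.id];
    if (chosen === undefined) {
      throw new Error(`Missing answer for question "${question.id}"`);
    }
    if (!question.options.some((option) => option.id === chosen)) {
      throw new Error(`Unknown option "${chosen}" for question "${question.id}"`);
    }
  }
  // Same content shape as the resume example above.
  return {
    role: 'tool',
    tool_call_id: toolCallId,
    content: JSON.stringify({ answers }),
  };
}
```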

MCP And Skills

MCP servers are configured through mcpConfig. Both servers and legacy mcpServers shapes are accepted. URL-based servers default to streamable-http; stdio servers require a command.

const runtime = createRuntime({
  mcpConfig: {
    servers: {
      search: {
        url: 'https://example.com/mcp',
        headers: {
          Authorization: `Bearer ${process.env.MCP_TOKEN}`,
        },
      },
    },
  },
});
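The transport defaults described above can be sketched as follows. This is an illustration, not the package's resolver. It assumes that an explicit transport wins over inference and that an entry with neither url nor transport is treated as stdio.

```typescript
// Server entry shape inferred from the examples in this README.
interface SketchMcpServer {
  url?: string;
  command?: string;
  args?: string[];
  transport?: string;
  headers?: Record<string, string>;
}

// Resolve the transport for one server entry.
function resolveTransport(server: SketchMcpServer): string {
  // Explicit transport wins (assumed); URL-based servers default to streamable-http.
  const transport =
    server.transport ?? (server.url !== undefined ? 'streamable-http' : 'stdio');
  // stdio servers require a command.
  if (transport === 'stdio' && !server.command) {
    throw new Error('stdio servers require a command');
  }
  return transport;
}
```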

Skills are discovered from configured skillRoots and loaded through the load_skill built-in. Later skill roots have higher precedence when duplicate skill ids are found.

Skills add instruction context. They are not executable tools.
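The precedence rule for duplicate skill ids amounts to "last root wins". The sketch below shows that merge over already-discovered skills. The SketchSkill shape and the mergeSkills helper are hypothetical, and discovery from disk is out of scope.

```typescript
// Minimal skill record for illustration; the real skill shape is not shown in this README.
interface SketchSkill {
  id: string;
  root: string;
}

// Merge skills discovered per root, in skillRoots order.
// Later roots overwrite earlier ones when ids collide.
function mergeSkills(skillsByRoot: SketchSkill[][]): Map<string, SketchSkill> {
  const merged = new Map<string, SketchSkill>();
  for (const rootSkills of skillsByRoot) {
    for (const skill of rootSkills) {
      merged.set(skill.id, skill);
    }
  }
  return merged;
}
```

With skillRoots: ['/app/skills', '/workspace/.codex/skills'], a skill id present in both roots would resolve to the copy under /workspace/.codex/skills.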

Web Search

Pass webSearch per call:

const response = await generate({
  provider: 'openai',
  model: 'gpt-5',
  providers: {
    openai: {
      apiKey: process.env.OPENAI_API_KEY!,
    },
  },
  messages: [
    { role: 'user', content: 'Use current public information to answer.' },
  ],
  webSearch: {
    searchContextSize: 'medium',
  },
});

Provider behavior:

  • openai, anthropic, and google receive provider-native web search options
  • azure, openai-compatible, xai, and ollama ignore unsupported web search on the current chat path and return a web_search_ignored warning
  • Gemini Google Search grounding is not combined with function calling; when both tools and webSearch are requested for google, tools win and web search is ignored with a warning
  • searchContextSize is forwarded for OpenAI-style requests and ignored by Anthropic and Gemini
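The provider behavior above can be summarized as a lookup. The sketch below is a reading of those four bullets, not the package's logic. The WebSearchPlan shape is hypothetical, and the warning text for the google-with-tools case is a placeholder because this README does not name that warning.

```typescript
type SketchProvider =
  | 'openai'
  | 'anthropic'
  | 'google'
  | 'azure'
  | 'xai'
  | 'openai-compatible'
  | 'ollama';

// Hypothetical summary of what happens to a webSearch request.
interface WebSearchPlan {
  applied: boolean;
  forwardsSearchContextSize: boolean;
  warning?: string;
}

function planWebSearch(provider: SketchProvider, hasTools: boolean): WebSearchPlan {
  switch (provider) {
    case 'openai':
      return { applied: true, forwardsSearchContextSize: true };
    case 'anthropic':
      return { applied: true, forwardsSearchContextSize: false };
    case 'google':
      // Grounding is not combined with function calling: tools win.
      return hasTools
        ? {
            applied: false,
            forwardsSearchContextSize: false,
            warning: 'web search ignored because tools were requested (placeholder text)',
          }
        : { applied: true, forwardsSearchContextSize: false };
    default:
      // azure, xai, openai-compatible, and ollama ignore web search with a warning.
      return { applied: false, forwardsSearchContextSize: false, warning: 'web_search_ignored' };
  }
}
```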

Cleanup

Call runtime.dispose() when a runtime owns MCP clients:

const runtime = createRuntime({ mcpConfig });

try {
  await runtime.complete(request);
} finally {
  await runtime.dispose();
}

The host still owns temporary workspaces, transcript persistence, app-specific registries, and any resources it injected into the runtime.

Local Development

npm run build
npm run check
npm test

Useful scripts:

  • npm run build compiles src/ into dist/
  • npm run check runs TypeScript without emitting files
  • npm test runs the Vitest suite in tests/llm
  • npm run test:watch runs Vitest in watch mode
  • npm run test:e2e runs the live provider showcase
  • npm run test:e2e:dry-run validates showcase wiring without live provider calls
  • npm run test:e2e:azure runs Azure live-provider coverage
  • npm run test:e2e:azure:dry-run validates Azure showcase wiring
  • npm run test:e2e:gemini runs Gemini live-provider coverage
  • npm run test:e2e:gemini:dry-run validates Gemini showcase wiring
  • npm run test:e2e:turn-loop runs runtime completion showcase coverage
  • npm run test:e2e:turn-loop:dry-run validates turn-loop showcase wiring
  • npm run test:e2e:hardening runs deterministic hardening coverage without a live provider
  • npm run test:e2e:host-owned runs deterministic host-owned tool-call coverage without a live provider
  • npm run test:e2e:host-owned:gemini runs the host-owned tool-call coverage against Gemini 2.5 Flash by default

Live showcase runners expect a repo-local .env with the relevant provider credentials.