@inferencesh/sdk

v0.6.10

Official JavaScript/TypeScript SDK for inference.sh - Run AI models with a simple API

@inferencesh/sdk — ai inference api for javascript & typescript

official javascript/typescript sdk for inference.sh — the ai agent runtime for serverless ai inference.

run ai models, build ai agents, and deploy generative ai applications with a simple api. access 250+ models including flux, stable diffusion, llms (claude, gpt, gemini), video generation (veo, seedance), and more.

Installation

npm install @inferencesh/sdk
# or
yarn add @inferencesh/sdk
# or
pnpm add @inferencesh/sdk

Getting an API Key

Get your API key from the inference.sh dashboard.

Quick Start

import { inference, TaskStatusCompleted } from '@inferencesh/sdk';

const client = inference({ apiKey: 'your-api-key' });

// Run a task and wait for the result
const result = await client.tasks.run({
  app: 'your-app',
  input: {
    prompt: 'Hello, world!'
  }
});

if (result.status === TaskStatusCompleted) {
  console.log(result.output);
}

Usage

Basic Usage

import { inference, TaskStatusCompleted } from '@inferencesh/sdk';

const client = inference({ apiKey: 'your-api-key' });

// Wait for result (default behavior)
const result = await client.tasks.run({
  app: 'my-app',
  input: { prompt: 'Generate something amazing' }
});

if (result.status === TaskStatusCompleted) {
  console.log('Output:', result.output);
}

With Setup Parameters

Setup parameters configure the app instance (e.g., model selection). Workers that are already running with matching setup parameters are "warm" and skip the setup step:

const result = await client.tasks.run({
  app: 'my-app',
  setup: { model: 'schnell' },  // Setup parameters
  input: { prompt: 'hello' }
});

Fire and Forget

// Get task info immediately without waiting
const task = await client.tasks.run(
  { app: 'my-app', input: { prompt: 'hello' } },
  { wait: false }
);

console.log('Task ID:', task.id);
console.log('Status:', task.status);

Real-time Status Updates

By default, the client streams task progress over NDJSON (/tasks/{id}/stream) and invokes onUpdate as the task changes. Use onPartialUpdate when you only need specific fields from a partial stream payload:

const result = await client.run(
  { app: 'my-app', input: { prompt: 'hello' } },
  {
    onUpdate: (update) => {
      console.log('Status:', update.status);
      console.log('Progress:', update.logs);
    },
    onPartialUpdate: (update, fields) => {
      console.log('Changed fields:', fields, update.status);
    },
  }
);
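
NDJSON is newline-delimited JSON: each line of the stream is one complete JSON document. The sketch below shows one way such a stream can be split into updates when chunks arrive at arbitrary boundaries. It is a generic illustration, not the SDK's internal implementation; `NdjsonBuffer` is a hypothetical helper and is not exported by @inferencesh/sdk.

```typescript
// Line-buffered NDJSON decoder: feed it text chunks, get back parsed records.
// Hypothetical helper for illustration; not part of @inferencesh/sdk.
class NdjsonBuffer {
  private pending = '';

  // Accepts a chunk that may end mid-line and returns every complete record.
  push(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split('\n');
    // The last element is an incomplete line (or '' if the chunk ended on '\n').
    this.pending = lines.pop() ?? '';
    return lines
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
      .map((line) => JSON.parse(line) as unknown);
  }

  // Call once the stream closes to parse a final line with no trailing newline.
  flush(): unknown[] {
    const rest = this.pending.trim();
    this.pending = '';
    return rest.length > 0 ? [JSON.parse(rest) as unknown] : [];
  }
}
```

Each record returned by `push` would correspond to one update delivered to a callback such as onUpdate; a record split across two chunks is held back until its closing newline arrives.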

Streaming vs Polling

SSE/NDJSON streaming is the default. For edge runtimes that cannot keep long-lived connections open (Convex actions, Cloudflare Workers, etc.), disable streaming and use lightweight status polling instead:

const client = inference({
  apiKey: 'your-api-key',
  stream: false,           // poll /tasks/{id}/status instead of streaming
  pollIntervalMs: 2000,    // default: 2000
});

// Per-call override
const result = await client.run(
  { app: 'my-app', input: { prompt: 'hello' } },
  { stream: false, onUpdate: (u) => console.log(u.status) }
);

In polling mode, the SDK checks /tasks/{id}/status and fetches the full task when the status changes. If that fetch fails after a status transition, run() rejects with the underlying error.
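
The polling behaviour described above can be pictured as a loop that calls a cheap status endpoint and fetches the full task only on a transition. This is a sketch under assumptions, not the SDK's source: `pollTask`, `TaskLike`, the `isTerminal` predicate, and the injected `getStatus` and `getTask` functions are hypothetical stand-ins for the /tasks/{id}/status request and the full task fetch.

```typescript
// Minimal shape assumed for this sketch; the real Task type has more fields.
interface TaskLike {
  id: string;
  status: number;
}

interface PollDeps {
  getStatus: (taskId: string) => Promise<number>; // cheap status check
  getTask: (taskId: string) => Promise<TaskLike>; // full task fetch
  isTerminal: (status: number) => boolean;        // e.g. completed, failed, cancelled
  sleep?: (ms: number) => Promise<void>;          // injectable for tests
}

async function pollTask(
  taskId: string,
  deps: PollDeps,
  pollIntervalMs = 2000,
  onUpdate?: (task: TaskLike) => void,
): Promise<TaskLike> {
  const sleep =
    deps.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  let lastStatus: number | undefined;

  for (;;) {
    const status = await deps.getStatus(taskId);
    if (status !== lastStatus) {
      lastStatus = status;
      // Fetch the full task only when the status changed. A failure here
      // propagates to the caller, matching the rejection described above.
      const task = await deps.getTask(taskId);
      onUpdate?.(task);
      if (deps.isTerminal(task.status)) return task;
    }
    await sleep(pollIntervalMs);
  }
}
```

Keeping the status check separate from the full fetch means most poll cycles transfer only a small payload, which suits runtimes with tight execution limits.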

Batch Processing

async function processImages(images: string[]) {
  const results = [];

  for (const image of images) {
    const result = await client.tasks.run({
      app: 'image-processor',
      input: { image }
    }, {
      onUpdate: (update) => console.log(`Processing: ${update.status}`)
    });

    results.push(result);
  }

  return results;
}
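
The loop above runs one task at a time. When the inputs are independent, a bounded-concurrency helper can shorten wall-clock time without submitting everything at once. The sketch below is generic TypeScript; `mapWithConcurrency` is a hypothetical helper, not an SDK function, and any rate limits the service applies are not considered here.

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in input order. Hypothetical helper, not part of the SDK.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be a positive integer');
  }
  const results = new Array<R>(items.length);
  let next = 0;

  // Each runner claims the next unprocessed index until the list is exhausted.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results;
}
```

For example, `mapWithConcurrency(images, 3, (image) => client.tasks.run({ app: 'image-processor', input: { image } }))` would keep at most three tasks in flight; the limit of 3 is an arbitrary choice. If any call rejects, the returned promise rejects with that error.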

File Upload

// Upload from base64
const file = await client.files.upload('data:image/png;base64,...', {
  filename: 'image.png',
  contentType: 'image/png'
});

// Use the uploaded file in a task
const result = await client.tasks.run({
  app: 'image-app',
  input: { image: file.uri }
});

Cancel a Task

const task = await client.tasks.run(
  { app: 'long-running-app', input: {} },
  { wait: false }
);

// Cancel if needed
await client.tasks.cancel(task.id);

Sessions (Stateful Execution)

Sessions allow you to maintain state across multiple task invocations. The worker stays warm between calls, preserving loaded models and in-memory state.

// Start a new session
const result = await client.tasks.run({
  app: 'my-stateful-app',
  input: { prompt: 'hello' },
  session: 'new'
});

const sessionId = result.session_id;
console.log('Session ID:', sessionId);

// Continue the session with another call
const result2 = await client.tasks.run({
  app: 'my-stateful-app',
  input: { prompt: 'remember what I said?' },
  session: sessionId
});

Custom Session Timeout

By default, sessions expire after 60 seconds of inactivity. You can customize this with session_timeout (1-3600 seconds):

// Create a session with 5-minute idle timeout
const result = await client.tasks.run({
  app: 'my-stateful-app',
  input: { prompt: 'hello' },
  session: 'new',
  session_timeout: 300  // 5 minutes
});

// Session stays alive for 5 minutes after each call

Notes:

  • session_timeout is only valid when session is 'new'
  • Minimum timeout: 1 second
  • Maximum timeout: 3600 seconds (1 hour)
  • Each successful call resets the idle timer
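
The constraints in the notes above can be checked before a request is sent. The following is a small sketch; `validateSessionParams` and `SessionParams` are hypothetical names, and the API may perform its own validation with different error messages.

```typescript
interface SessionParams {
  session?: string;         // 'new' or an existing session ID
  session_timeout?: number; // idle timeout in seconds
}

// Returns a list of problems; an empty list means the parameters look valid.
function validateSessionParams(params: SessionParams): string[] {
  const problems: string[] = [];
  const { session, session_timeout: timeout } = params;

  // Nothing to check when no timeout is requested.
  if (timeout === undefined) return problems;

  if (session !== 'new') {
    problems.push("session_timeout is only valid when session is 'new'");
  }
  // Written as a negated range check so that NaN is rejected as well.
  if (!(timeout >= 1 && timeout <= 3600)) {
    problems.push('session_timeout must be between 1 and 3600 seconds');
  }
  return problems;
}
```

Calling `validateSessionParams({ session: 'new', session_timeout: 300 })` returns an empty array, while passing a timeout together with an existing session ID reports the first problem.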

Session management API

Manage sessions directly without running a task:

// Inspect a session
const info = await client.sessions.get(sessionId);
console.log(info.status, info.expires_at, info.call_count);

// List active sessions
const sessions = await client.sessions.list();

// Extend idle timeout without a task call (sliding window)
await client.sessions.keepalive(sessionId);

// Release the worker immediately
await client.sessions.end(sessionId);

Session errors

import {
  SessionNotFoundError,
  SessionExpiredError,
  SessionEndedError,
} from '@inferencesh/sdk';

try {
  await client.tasks.run({
    app: 'my-stateful-app',
    input: { prompt: 'hello' },
    session: sessionId,
  });
} catch (error) {
  if (
    error instanceof SessionNotFoundError ||
    error instanceof SessionExpiredError ||
    error instanceof SessionEndedError
  ) {
    // Start a new session and retry
    const result = await client.tasks.run({
      app: 'my-stateful-app',
      input: { prompt: 'hello' },
      session: 'new',
    });
  } else {
    throw error;
  }
}

For complete session documentation including error handling, best practices, and advanced patterns, see the Sessions Developer Guide.

Agent Chat

Chat with AI agents using client.agents.create().

Using a Template Agent

Use an existing agent from your workspace by its namespace/name@shortid:

import { inference } from '@inferencesh/sdk';

const client = inference({ apiKey: 'your-api-key' });

// Create agent from template
const agent = client.agents.create('my-org/assistant@abc123');

// Send a message with streaming
await agent.sendMessage('Hello!', {
  onMessage: (msg) => {
    if (msg.content) {
      for (const c of msg.content) {
        if (c.type === 'text' && c.text) {
          process.stdout.write(c.text);
        }
      }
    }
  }
});

// Clean up
agent.disconnect();

Creating an Ad-Hoc Agent

Create agents on-the-fly without saving to your workspace:

import { inference, tool, string } from '@inferencesh/sdk';

const client = inference({ apiKey: 'your-api-key' });

// Create ad-hoc agent (config uses API field names: core_app, system_prompt)
const weatherTool = tool('get_weather')
  .describe('Get current weather')
  .param('city', string('City name'))
  .build();

const agent = client.agents.create({
  core_app: { ref: 'infsh/claude-sonnet-4@abc123' },
  system_prompt: 'You are a helpful assistant.',
  tools: [weatherTool], // only schemas are sent to the API; handlers stay client-side
});

await agent.sendMessage('What is the weather in Paris?', {
  onMessage: (msg) => console.log(msg),
  onToolCall: async (call) => {
    // runMyClientTool is a placeholder for your own client-side tool dispatcher
    const result = await runMyClientTool(call.name, call.args);
    await agent.submitToolResult(call.id, result);
  },
});
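
`runMyClientTool` in the example above stands for your own code. One way to write it is a lookup table from tool name to handler. This is a sketch under assumptions: the handler map, the shape of `args`, and the choice to return a JSON string are illustrative, and the `get_weather` handler returns canned data rather than calling a real weather service.

```typescript
type ToolHandler = (args: Record<string, unknown>) => Promise<string> | string;

// Hypothetical registry of client-side tool implementations, keyed by tool name.
const clientToolHandlers = new Map<string, ToolHandler>([
  [
    'get_weather',
    (args) => {
      const city = typeof args['city'] === 'string' ? args['city'] : 'unknown';
      // Canned response for illustration only.
      return JSON.stringify({ city, forecast: 'sunny', temperature_c: 21 });
    },
  ],
]);

async function runMyClientTool(name: string, args: Record<string, unknown>): Promise<string> {
  const handler = clientToolHandlers.get(name);
  if (handler === undefined) {
    // Returning an error payload lets the agent see the failure and respond to it.
    return JSON.stringify({ error: `No client handler registered for tool "${name}"` });
  }
  try {
    return await handler(args);
  } catch (err) {
    return JSON.stringify({ error: err instanceof Error ? err.message : String(err) });
  }
}
```

Reporting errors as a result, instead of throwing, means a tool call always receives a response, so the chat is not left waiting on a result that never arrives.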

For multi-turn chats, the SDK opens the chat stream before sending the next message so updates are not missed. Use stopChat() to cancel in-flight generation (POST /chats/{id}/stop), and reset() to clear the current chat and start fresh.

Tool builder

Use the fluent builders to define AgentTool schemas. Client tools (tool) run in your app via onToolCall; server-side tools run on inference.sh.

| Builder | Runs on | Description |
|---------|---------|-------------|
| tool(name) | Client | Local handler; only the schema is sent to the API |
| appTool(name, appRef) | Server | Invoke another inference app |
| agentTool(name, agentRef) | Server | Delegate to a sub-agent |
| httpTool(name, url) / callTool(name, url) | Server | HTTP request with credential injection (preferred over webhookTool) |
| webhookTool(name, url) | Server | Unsigned webhook (legacy; use httpTool for new tools) |
| mcpTool(name, integrationId, toolName) | Server | Call a tool on a connected MCP integration |
| internalTools() | Server | Built-in plan, memory, and widget tools |

import {
  inference,
  tool,
  appTool,
  httpTool,
  mcpTool,
  internalTools,
  string,
  IntegrationProviderGoogle,
} from '@inferencesh/sdk';

const client = inference({ apiKey: 'your-api-key' });

const clientTool = tool('get_weather')
  .describe('Get current weather')
  .param('city', string('City name'))
  .build();

// HTTP tool with OAuth integration credentials (injected server-side)
const gmailSend = httpTool('gmail_send', 'https://gmail.googleapis.com/gmail/v1/users/me/messages/send')
  .describe('Send an email via Gmail')
  .method('POST')
  .auth({ integration: IntegrationProviderGoogle, integrationId: 'your-integration-id' })
  .build();

// API key or bearer auth
const fetchData = httpTool('fetch', 'https://api.example.com/data')
  .method('GET')
  .auth({ apiKey: 'YOUR_KEY', header: 'X-API-Key' }) // default header: X-API-Key
  .header('Accept', 'application/json')
  .build();

const bearerFetch = httpTool('bearer_fetch', 'https://api.example.com')
  .auth({ bearer: 'YOUR_TOKEN' })
  .build();

const imageGen = appTool('generate_image', 'infsh/flux-schnell@abc123')
  .param('prompt', string('Image description'))
  .requireApproval()
  .build();

const mcpSearch = mcpTool('notion_search', 'your-mcp-integration-id', 'search')
  .describe('Search Notion pages')
  .param('query', string('Search query'))
  .build();

const agent = client.agents.create({
  core_app: { ref: 'infsh/claude-sonnet-4@latest' },
  system_prompt: 'You are helpful.',
  tools: [clientTool, gmailSend, imageGen, mcpSearch],
  internal_tools: internalTools().memory().build(),
});

callTool is an alias for httpTool. Run npx tsx examples/tool-builder.ts for more schema examples (no API key required).

File attachments

Pass files in sendMessage options. Blob values are uploaded first; objects with a uri (already uploaded via client.files.upload) are attached as-is:

const uploaded = await client.files.upload(imageBlob, {
  filename: 'photo.png',
  contentType: 'image/png',
});

await agent.sendMessage('Describe this image', {
  files: [imageBlob, uploaded], // Blob uploads; FileDTO reuses uri
  onMessage: (msg) => console.log(msg),
});
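
The rule above (upload raw values first, pass through anything that already has a uri) can be pictured as a small normalisation step. This is an illustrative sketch rather than the SDK's code: `resolveAttachments`, `UploadedFile`, and the injected `upload` function are hypothetical, with `upload` standing in for client.files.upload.

```typescript
interface UploadedFile {
  uri: string;
}

// True for objects that already carry a string `uri`, i.e. were uploaded earlier.
function isUploaded(value: unknown): value is UploadedFile {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as { uri?: unknown }).uri === 'string'
  );
}

// Uploads raw values and reuses already-uploaded files, preserving input order.
async function resolveAttachments<Raw>(
  files: ReadonlyArray<Raw | UploadedFile>,
  upload: (raw: Raw) => Promise<UploadedFile>,
): Promise<UploadedFile[]> {
  return Promise.all(
    files.map(async (file): Promise<UploadedFile> => {
      if (isUploaded(file)) return file; // attached as-is
      return upload(file as Raw);        // uploaded first
    }),
  );
}
```

Uploading once and reusing the returned object is useful when the same file is attached to several messages, since only raw values trigger a new upload.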

Structured Output

Use output_schema to get structured JSON responses:

const agent = client.agents.create({
  core_app: { ref: 'infsh/claude-sonnet-4@latest' },
  output_schema: {
    type: 'object',
    properties: {
      summary: { type: 'string' },
      sentiment: { type: 'string', enum: ['positive', 'negative', 'neutral'] },
      confidence: { type: 'number' },
    },
    required: ['summary', 'sentiment', 'confidence'],
  },
  internal_tools: { finish: true },
});

const output = await agent.run('Analyze: Great product!');

agent.run() sends a message with polling (no SSE), waits until the chat is idle, and returns chat.output (parsed finish-tool result, or null if none).

Agent Methods

| Method | Description |
|--------|-------------|
| sendMessage(text, options?) | Send a message; streams or polls until idle when callbacks are provided or stream: false is set |
| run(text, options?) | Send and return structured chat.output (always uses polling) |
| getChat(chatId?) | Get the current or specified chat (chat_messages on the returned chat) |
| stopChat() | Stop generation for the current chat (no-op if no active chat) |
| submitToolResult(toolId, resultOrAction) | Submit result for a client tool (string or {action, form_data}) |
| startStreaming(options?) | Manually attach to /chats/{id}/stream for the current chat |
| disconnect() | Stop active stream/poll connections |
| reset() | Disconnect and clear chat state so the next message starts a new chat |

API Reference

inference(config)

Creates a new inference client.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| config.apiKey | string | Yes | Your inference.sh API key |
| config.baseUrl | string | No | Custom API URL (default: https://api.inference.sh) |
| config.stream | boolean | No | Use NDJSON streaming (true, default) or status polling (false) |
| config.pollIntervalMs | number | No | Poll interval when stream: false (default: 2000) |
| config.proxyUrl | string | No | Proxy base URL for frontend apps (keeps API keys server-side) |

client.tasks.run(params, options?)

Runs a task on inference.sh.

Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| params.app | string | Yes | App identifier (e.g., 'username/app-name') |
| params.input | object | Yes | Input parameters for the app |
| params.setup | object | No | Setup parameters (affects worker warmth/scheduling) |
| params.infra | string | No | Infrastructure: 'cloud' or 'private' |
| params.variant | string | No | App variant to use |
| params.session | string | No | Session ID or 'new' to start a new session |
| params.session_timeout | number | No | Session timeout in seconds (1-3600, only with session: 'new') |

Options:

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| wait | boolean | true | Wait for task completion |
| stream | boolean | client default | Use NDJSON streaming or status polling |
| pollIntervalMs | number | client default | Poll interval when stream: false |
| onUpdate | function | - | Callback for task updates (full fetch on status change when polling) |
| onPartialUpdate | function | - | Callback for partial NDJSON stream updates (task, fields) |
| maxReconnects | number | 5 | Max poll retries when stream: false |

client.tasks.get(taskId)

Gets a task by ID.

client.tasks.cancel(taskId)

Cancels a running task.

client.files.upload(data, options?)

Uploads a file to inference.sh.

Parameters:

| Parameter | Type | Description |
|-----------|------|-------------|
| data | string \| Blob | Base64 string, data URI, or Blob |
| options.filename | string | Filename |
| options.contentType | string | MIME type |
| options.public | boolean | Make file publicly accessible |

client.agents.create(templateOrConfig) / client.agent(...)

Creates an agent instance from a template or ad-hoc configuration. client.agent(...) is an alias for client.agents.create(...).

sendMessage options: onMessage, onChat, onToolCall, files, stream, pollIntervalMs. Client tools with status awaiting_input are dispatched once per invocation ID via onToolCall.
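
Dispatching once per invocation ID matters because the same awaiting_input tool call can show up in several consecutive chat updates. A sketch of that de-duplication follows, assuming each invocation carries a stable `id`; the `ToolInvocation` shape and `createOnceDispatcher` are hypothetical and are not SDK exports.

```typescript
// Minimal shape assumed for this sketch; real invocations carry more fields.
interface ToolInvocation {
  id: string;
  name: string;
  status: string;
}

// Wraps a handler so that each invocation ID triggers it at most once,
// no matter how many updates repeat the same pending call.
function createOnceDispatcher(handler: (call: ToolInvocation) => void) {
  const seen = new Set<string>();
  return (calls: readonly ToolInvocation[]): number => {
    let dispatched = 0;
    for (const call of calls) {
      if (call.status !== 'awaiting_input' || seen.has(call.id)) continue;
      seen.add(call.id);
      handler(call);
      dispatched++;
    }
    return dispatched; // number of calls handed to the handler in this update
  };
}
```

Without this guard, a handler with side effects (sending an email, charging a card) could run again every time an update repeats the pending call.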

Template mode:

const agent = client.agents.create('namespace/name@version');

Ad-hoc mode:

const agent = client.agents.create({
  core_app: { ref: 'infsh/claude-sonnet-4@abc123' },
  system_prompt: 'You are helpful.',
  tools: [...],
});

client.sessions

| Method | HTTP | Description |
|--------|------|-------------|
| get(sessionId) | GET /sessions/{id} | Session metadata (status, expires_at, call_count, …) |
| list() | GET /sessions | All sessions (empty array if none) |
| keepalive(sessionId) | POST /sessions/{id}/keepalive | Reset idle expiration |
| end(sessionId) | DELETE /sessions/{id} | End session and release worker |

Task Status Constants

import {
  TaskStatusQueued,
  TaskStatusRunning,
  TaskStatusCompleted,
  TaskStatusFailed,
  TaskStatusCancelled
} from '@inferencesh/sdk';

if (task.status === TaskStatusCompleted) {
  console.log('Done!');
}

Integration Constants

IntegrationDTO fields (provider, type, auth, status) use typed string unions exported as constants:

import type { IntegrationDTO } from '@inferencesh/sdk';
import {
  IntegrationProviderGoogle,
  IntegrationAuthTypeOAuth,
  IntegrationStatusConnected,
  IntegrationStatusDisconnected,
  IntegrationStatusExpired,
  IntegrationStatusError,
  isRequirementsNotMetException,
} from '@inferencesh/sdk';

function isGoogleConnected(integration: IntegrationDTO): boolean {
  return (
    integration.provider === IntegrationProviderGoogle &&
    integration.status === IntegrationStatusConnected
  );
}

// HTTP 412 when an app requires a missing secret, integration, or scope
try {
  await client.run({ app: 'my-app', input: {} });
} catch (error) {
  if (isRequirementsNotMetException(error)) {
    for (const req of error.errors) {
      if (req.type === 'integration' && req.action?.provider === IntegrationProviderGoogle) {
        // User must connect Google — see https://inference.sh/docs/extend/integrations
      }
    }
  }
}

| Constant group | Values |
|----------------|--------|
| IntegrationProvider* | google, slack, notion, github, x, microsoft, salesforce, discord, gcp, mcp, reddit |
| IntegrationAuthType* | service_account, oauth, api_key, wif, mcp |
| IntegrationStatus* | connected, disconnected, expired, error |

Instance Status Constants

Use when working with engine instance APIs (InstanceDTO.status):

import {
  InstanceStatusCreating,
  InstanceStatusPendingProvider,
  InstanceStatusPending,
  InstanceStatusActive,
  InstanceStatusError,
  InstanceStatusDeleting,
  InstanceStatusDeleted,
} from '@inferencesh/sdk';

Tool Parameter Types

When building AgentTool schemas manually (outside the tool builder), use ToolParamType* for JSON Schema type fields:

import {
  ToolParamTypeObject,
  ToolParamTypeString,
  ToolParamTypeInteger,
  ToolParamTypeNumber,
  ToolParamTypeBoolean,
  ToolParamTypeArray,
  ToolParamTypeNull,
} from '@inferencesh/sdk';

const schema = {
  type: ToolParamTypeObject,
  properties: {
    city: { type: ToolParamTypeString, description: 'City name' },
  },
  required: ['city'],
};

The fluent tool builder (string(), number(), object(), …) infers these types automatically.
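
A schema of this shape can also be used on the client to sanity-check arguments before a tool handler runs. The sketch below is a deliberately small validator, not a full JSON Schema implementation. It assumes the ToolParamType* constants resolve to the standard JSON Schema type names ('object', 'string', and so on), which is not spelled out here, so plain string literals are used instead of the SDK constants. `checkArgs`, `matchesType`, and `SimpleSchema` are hypothetical names.

```typescript
type JsonType = 'object' | 'string' | 'integer' | 'number' | 'boolean' | 'array' | 'null';

interface SimpleSchema {
  type: 'object';
  properties: Record<string, { type: JsonType; description?: string }>;
  required?: string[];
}

// Checks a single value against one JSON Schema primitive type name.
function matchesType(value: unknown, type: JsonType): boolean {
  switch (type) {
    case 'string': return typeof value === 'string';
    case 'boolean': return typeof value === 'boolean';
    case 'number': return typeof value === 'number' && Number.isFinite(value);
    case 'integer': return Number.isInteger(value);
    case 'array': return Array.isArray(value);
    case 'null': return value === null;
    case 'object': return typeof value === 'object' && value !== null && !Array.isArray(value);
  }
}

// Returns human-readable problems; an empty array means the arguments pass.
function checkArgs(schema: SimpleSchema, args: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const key of schema.required ?? []) {
    if (!(key in args)) errors.push(`missing required parameter "${key}"`);
  }
  for (const [key, value] of Object.entries(args)) {
    const spec = schema.properties[key];
    if (spec === undefined) continue; // unknown keys are ignored in this sketch
    if (!matchesType(value, spec.type)) {
      errors.push(`"${key}" should be of type ${spec.type}`);
    }
  }
  return errors;
}
```

Nested objects, array item types, and enums are out of scope for this sketch; a dedicated JSON Schema validator is a better fit when those are needed.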

TypeScript Support

This SDK is written in TypeScript and includes full type definitions. All types are exported:

import type {
  Task,
  ApiAppRunRequest,
  RunOptions,
  IntegrationDTO,
  AgentTool,
} from '@inferencesh/sdk';

Requirements

  • Node.js 18.0.0 or higher
  • Modern browsers with fetch support

resources

  • documentation — getting started guides and api reference
  • blog — tutorials on ai agents, image generation, and more
  • app store — browse 250+ ai models
  • discord — community support
  • github — open source projects

license

MIT © inference.sh