stillrunning-vercel-ai-sdk

v0.1.0

Published

20 days ago

One-line monitoring for the Vercel AI SDK. Auto-pings StillRunning on every generateText / streamText / generateObject run with duration, tokens, cost, model, and tool-call counts.

stillrunning.ai

vercel ai-sdk ai monitoring observability stillrunning agent llm cost heartbeat anomaly

Monitoring for the Vercel AI SDK, in one line.

Wrap your generateText / streamText / generateObject calls and every run reports its duration, token usage, estimated cost, model, and tool-call count to a StillRunning workflow. Get alerted the moment an agent fails, runs too long, or costs too much, without writing any ping plumbing.

```
npm install stillrunning-vercel-ai-sdk
```

30-second quickstart

1. Create a workflow at stillrunning.ai/app/new and copy its ping token.
2. Set it as an env var:
   ```
   STILLRUNNING_TOKEN=your_token_here
   ```
3. Swap your ai import for the StillRunning client:

```
import { stillrunning } from 'stillrunning-vercel-ai-sdk'
import { openai } from '@ai-sdk/openai'

const { generateText } = stillrunning() // reads STILLRUNNING_TOKEN

const { text } = await generateText({
  model: openai('gpt-4o'),
  prompt: 'Summarize today’s standup notes.',
})
```

That’s it. Every call now shows up in StillRunning with cost, tokens, and timing, and you get an alert if a run fails, stalls, or spikes in cost.

What gets captured

On each run the SDK sends a ping with:

| Field | Source |
| --- | --- |
| durationMs | wall-clock time of the call |
| tokensIn | result.totalUsage.inputTokens (aggregated across all steps) |
| tokensOut | result.totalUsage.outputTokens |
| costUsd | estimated from a built-in pricing table (override-able) |
| model | result.response.modelId |
| toolCalls | total tool calls across every step |
| traceId | groups one logical run (auto-generated, or set via withTrace) |
| metadata | { finishReason, steps } |

A failed call sends a fail ping with the error message, then rethrows the original error unchanged. Monitoring never alters your control flow, and a ping that fails to send never throws into your code.
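The failure behavior described above can be sketched as a small wrapper. This is a minimal illustration, not the package's actual internals: `monitored`, `Ping`, and `SendPing` are hypothetical names, and the ping payload is reduced to a status and a duration.

```typescript
// Hypothetical shapes for illustration; not the package's real types.
type Ping =
  | { status: "success"; durationMs: number }
  | { status: "fail"; durationMs: number; error: string };

type SendPing = (ping: Ping) => Promise<void>;

// Wraps any async call so monitoring observes it without changing its outcome.
async function monitored<T>(call: () => Promise<T>, sendPing: SendPing): Promise<T> {
  const startedAt = Date.now();

  // Delivery problems are swallowed here so they can never reach caller code.
  const safePing = async (ping: Ping): Promise<void> => {
    try {
      await sendPing(ping);
    } catch {
      // Intentionally ignored: a lost ping must not break the monitored call.
    }
  };

  try {
    const result = await call();
    await safePing({ status: "success", durationMs: Date.now() - startedAt });
    return result;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    await safePing({ status: "fail", durationMs: Date.now() - startedAt, error: message });
    throw err; // rethrow the original error object unchanged
  }
}
```

Because `safePing` never rejects, the outer `catch` only ever sees errors from the wrapped call itself, which is what keeps the caller's control flow untouched.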

Streaming

streamText is handled too. The success ping fires when the stream finishes, and your own onFinish / onError callbacks are preserved:

```
import { stillrunning } from 'stillrunning-vercel-ai-sdk'
import { openai } from '@ai-sdk/openai'

const { streamText } = stillrunning()

const result = streamText({
  model: openai('gpt-4o'),
  prompt: 'Write a haiku about uptime.',
  onFinish: ({ text }) => console.log('done:', text), // still called
})
for await (const chunk of result.textStream) process.stdout.write(chunk)
```
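One way a wrapper could preserve a caller's `onFinish` while still sending a ping is to chain the two callbacks. The sketch below is an assumption about the general technique, not the package's implementation; `chainOnFinish` and `FinishHandler` are hypothetical names.

```typescript
// Hypothetical helper; illustrative only, not part of the package's API.
type FinishHandler<E> = (event: E) => void | Promise<void>;

function chainOnFinish<E>(
  userOnFinish: FinishHandler<E> | undefined,
  sendPing: (event: E) => Promise<void>,
): (event: E) => Promise<void> {
  return async (event: E) => {
    // The caller's callback runs first, before any monitoring work starts.
    if (userOnFinish) await userOnFinish(event);
    try {
      await sendPing(event);
    } catch {
      // A failed ping is dropped rather than surfaced to the caller.
    }
  };
}
```

In this simplified version an exception thrown by the caller's own callback propagates and skips the ping; a fuller implementation would have to decide how to report that case.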

Grouping multi-step agent runs with `withTrace`

By default each call is its own run (one traceId). When an agent makes several model calls that are really one logical execution, wrap them so they share a trace, and StillRunning stitches them into a single outcome chain:

```
import { stillrunning, withTrace } from 'stillrunning-vercel-ai-sdk'
import { openai } from '@ai-sdk/openai'

const sr = stillrunning()
const model = openai('gpt-4o')

await withTrace(async () => {
  await sr.generateText({ model, prompt: 'plan the task' })
  await sr.generateText({ model, prompt: 'execute step 1' })
  await sr.generateText({ model, prompt: 'execute step 2' })
}) // all three pings share one traceId
```

You can pass an explicit traceId / parentRunId for nested agents: withTrace(fn, { traceId, parentRunId }).
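The requirements list AsyncLocalStorage, which suggests trace sharing relies on async context. The sketch below shows how that general technique works in Node; it is an assumption about the mechanism, and `withTraceSketch`, `currentTraceId`, and `TraceContext` are hypothetical names rather than the package's own.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { randomUUID } from "node:crypto";

// Hypothetical context shape; field names mirror traceId / parentRunId above.
interface TraceContext {
  traceId: string;
  parentRunId?: string;
}

const traceStore = new AsyncLocalStorage<TraceContext>();

// Runs fn inside a trace context that nested calls can read, including across awaits.
function withTraceSketch<T>(fn: () => T, opts: Partial<TraceContext> = {}): T {
  const context: TraceContext = {
    traceId: opts.traceId ?? randomUUID(),
    parentRunId: opts.parentRunId,
  };
  return traceStore.run(context, fn);
}

// What a wrapped model call might do: reuse the ambient trace, or start a fresh one.
function currentTraceId(): string {
  return traceStore.getStore()?.traceId ?? randomUUID();
}
```

Calls made outside any `withTraceSketch` scope find no ambient context and fall back to a fresh id, which matches the default of one traceId per call.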

Cost estimation

Cost is estimated from token counts and a built-in pricing table covering current Claude, GPT, and Gemini models. It’s intentionally approximate: it powers relative cost-anomaly detection (a 5x spike is a 5x spike regardless of the exact rate) and a ballpark spend figure. For exact accounting, supply your own pricing:

```
// Full control:
const sr = stillrunning({
  computeCost: ({ model, inputTokens, outputTokens }) => myExactPricing(model, inputTokens, outputTokens),
})

// Or extend / override the built-in table:
import { registerModelPricing } from 'stillrunning-vercel-ai-sdk'
registerModelPricing([[/my-custom-model/, { input: 1.5, output: 6 }]]) // USD per 1M tokens
```

Unknown models simply send no cost rather than a wrong one.
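The arithmetic behind a per-1M-token pricing table can be sketched as follows. This is an illustration under stated assumptions: `estimateCostUsd` and `pricingTable` are hypothetical names, and the single entry reuses the placeholder rate from the example above, not a real price.

```typescript
// Rates are USD per 1M tokens, the same unit as the registerModelPricing example.
interface Rate {
  input: number;
  output: number;
}

// Hypothetical table; the pattern and rates are placeholders, not real prices.
const pricingTable: Array<[RegExp, Rate]> = [
  [/my-custom-model/, { input: 1.5, output: 6 }],
];

function estimateCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number | undefined {
  const entry = pricingTable.find(([pattern]) => pattern.test(model));
  if (!entry) return undefined; // unknown model: report no cost rather than a wrong one
  const [, rate] = entry;
  return (inputTokens * rate.input + outputTokens * rate.output) / 1_000_000;
}
```

With the placeholder rate, 2M input tokens and 0.5M output tokens come to (2 × 1.5) + (0.5 × 6) = 6 USD. For a fixed token mix, a wrong rate scales every estimate by the same factor, which is why ratios between runs stay meaningful even when the absolute figure is off.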

Configuration

```
stillrunning({
  token,            // ping token; defaults to process.env.STILLRUNNING_TOKEN
  baseUrl,          // defaults to https://stillrunning.ai
  computeCost,      // (input) => number | undefined; overrides cost estimation
  awaitPing,        // default true; set false for lowest latency (fire-and-forget)
  pingTimeoutMs,    // default 3000
  onError,          // (err) => void; observes ping delivery failures
  fetch,            // custom fetch (testing / non-global-fetch runtimes)
})
```

By default the ping is awaited so that it is delivered reliably on serverless platforms. This adds the ping's round trip (a single small POST, hard-bounded by pingTimeoutMs) to the return time of a non-streaming call. A slow or unavailable StillRunning can therefore add up to pingTimeoutMs to a call, but it never hangs your agent. Set awaitPing: false for zero added latency (fire-and-forget). For streamText, your own onFinish always runs before the ping, so streaming consumers are never gated on StillRunning.
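The trade-off between awaited and fire-and-forget delivery can be sketched with a timeout race. This is a sketch of the general pattern, not the package's code: `deliverBounded` and `afterCall` are hypothetical helpers, and only the option names `awaitPing` and `pingTimeoutMs` come from the configuration above.

```typescript
type Delivery = "sent" | "failed" | "timed-out";

// Resolves when the ping settles or the timeout elapses, whichever comes first.
// It never rejects, so callers cannot be broken by a delivery problem.
async function deliverBounded(ping: () => Promise<void>, timeoutMs: number): Promise<Delivery> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<Delivery>((resolve) => {
    timer = setTimeout(() => resolve("timed-out"), timeoutMs);
  });
  const attempt: Promise<Delivery> = Promise.resolve()
    .then(ping)
    .then(
      (): Delivery => "sent",
      (): Delivery => "failed",
    );
  try {
    return await Promise.race([attempt, timeout]);
  } finally {
    clearTimeout(timer); // avoid keeping the process alive after an early result
  }
}

// Hypothetical post-call hook showing what the awaitPing switch changes.
async function afterCall(
  ping: () => Promise<void>,
  opts: { awaitPing: boolean; pingTimeoutMs: number },
): Promise<void> {
  const delivery = deliverBounded(ping, opts.pingTimeoutMs);
  if (opts.awaitPing) await delivery; // caller waits at most about pingTimeoutMs
  // Otherwise delivery continues in the background (fire-and-forget).
}
```

On serverless platforms the runtime may freeze or stop the process once the handler returns, which is why awaiting the bounded delivery is the more reliable default there.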

Requirements

- Node 18+ (or any runtime with fetch and AsyncLocalStorage)
- ai (Vercel AI SDK) v5 or later, as a peer dependency

License

MIT
